ANALYSING THREE-WAY PROFILE DATA USING THE PARAFAC AND TUCKER3 MODELS ILLUSTRATED WITH VIEWS ON PARENTING

In this paper two major models for three-way profile data, i.e., the Parafac model and the Tucker3 model, are discussed from the point of view of application. Topics treated are handling the data before analysis, model choice, choice of dimensionality, model fit, algorithmic hazards during the analyses, and interpretation and validation of the results. These issues are discussed in some detail so that prospective users can take guidance for analysing their own data. The data provided by Japanese girls and their parents about the parenting style in their family are the major vehicle for demonstrating the issues touched upon. The general results from these data are that parental styles consisted of three groups of behaviours: Acceptance, Control and Rejection, and Discipline. Within families the parenting behaviours of fathers and mothers are seen as parallel rather than at cross purposes, both by the daughters and the parents themselves. Moreover, daughters and parents largely agree about the parenting style itself. Nevertheless, there are also families in which daughters and parents disagree about the parenting style, in particular about Acceptance and Control, but not about Discipline.


INTRODUCTION
The focus of this paper is the analysis of three-way profile data using the two most common models for such data, i.e., the Parafac model and the Tucker3 model. Before we can discuss the models we have to look at three-way data themselves and how they have to be handled before analysing them with one of the three-mode models. After having considered these data preliminaries, we present the two analysis models in a rather conceptual fashion, referring the reader elsewhere for technical details. How these models work in practice is the central theme of this paper, and this is illustrated with a detailed application to the Kojima data on judgements of parental behaviour towards their daughters (Kojima, 1975). Not only is the working of the models presented, but a comparison is also made between them. Along the way we have formulated a number of recommendations which should allow applied researchers to start on a three-mode analysis of their own data and to delve further into the literature.

Two-Way Profile Data: Setting the Scene
A common research goal in non-experimental psychology is to improve the understanding of a conceptual domain by studying a number of variables that measure aspects of that (multidimensional) domain. Often subjects are considered to be random samples from some population, but in reality they seldom are. Moreover, it is an empirical question whether subgroups with different characteristics exist in the sample. Therefore, one should not only analyse the structures derived from the variable correlations, but also pay attention to individual differences or group differences during an analysis. A balanced interest in subjects and variables is called for.
Assume for the moment that we have drawn a random sample and that we want to assess the structure of the variables in a two-way (i.e., two-subscripted) data set consisting of individuals (rows) by variables (columns). Such data are an example of two-way profile data, with the subjects having profiles across the variables. The variables or items could form a questionnaire constructed to cover a number of different conceptual domains, and the aim of the investigation is to verify in a non-confirmatory way whether the items indeed group into their hypothesised domains. In such investigations, the prime interest is in the variables, and the subjects are exchangeable: one person is as acceptable as another as long as they come from the same population. This implies that individual differences are not of interest, because the subjects are not treated as individuals but as members of a particular population.
In the case that the variables do not group into their domains, what is one to make of the variable grouping that one finds? If the researcher is primarily interested in refining the measurement device for the hypothesised domains, items that do not group as desired are deleted. Alternatively, wordings of items are modified in the hope that this will put them within the intended domain, or new, better items are constructed. With the thus amended measurement instrument, new data are gathered, drawing again from the appropriate population, and so on until the instrument works as intended.
A different situation arises if the ultimate aim of the researcher is not the construction of a measurement instrument according to preconceived ideas, but the interest is in the psychological properties, processes, and structure in the domain from which the items are drawn. In other words, the data set at hand needs to be analysed in order to discover the underlying principles which guided the responses of the subjects. At the same time, because it is the data themselves which are of interest, there is a necessity to investigate the organisation of the subjects as well, especially if they cannot be guaranteed to be a truly random sample from an unstructured population. The purpose of such an analysis is not merely descriptive and interpretive; the aim of the researcher is to make an inductive, or theoretically principled, use of the analysis. In such a case, the investigator wants to learn more about the substantive domain and wants to investigate multi-item properties, such as "cognitive-affective dimensions" or other characteristics that span multiple items and account for their statistical correlations.
Multivariate methods in many textbooks are couched in terms of algebra and distributions of statistics, but there is also a geometric side to these methods (see, for instance, Wickens, 1994), especially in exploratory analyses such as principal component analysis, multidimensional scaling, and correspondence analysis. An excellent book discussing multivariate methods along these lines is Legendre and Legendre (1998). In the extensions of component analysis discussed in this paper this geometric view is paramount.
The basic elements of the geometric view of multivariate statistics are that subjects can be seen as points in the space defined (or spanned) by the variables, as is well known from scatterplots for two variables. In the same way, the variables can be seen as points (usually represented by arrows, vectors or axes) in the space defined by the subjects. This geometric symmetry is especially exploited in multiway analysis.
Suppose one wishes to reduce the multidimensional space spanned by many variables to a lower-dimensional one with the principal aim of investigating the relationships of the variables. In such a case one does not necessarily have to assume that important (psychological) principles underlie the data; one simply wants to investigate the major relationships between variables and subjects in a lower-dimensional space without explicitly seeking to interpret the axes in the space. Naturally, one wants the low-dimensional subspace to explain as much variance as possible, so as to capture as much structural variance as possible in a few dimensions.
Principal component analysis and 'exploratory factor analysis' have been the major analysis vehicles for investigating low-dimensional subspaces to disentangle relationships between variables, subjects and their interrelationship. However, neither the technique nor the term factor analysis will feature in this paper. For expository purposes, the somewhat controversial position is taken here that there is a clear distinction between (exploratory) component analysis and (confirmatory) factor analysis, and the present paper deals exclusively with component techniques.

Three-Way Profile Data
Often data are more complex than individuals by variables, for instance when the individuals are measured under several conditions or at more than one time point. The resulting data have a three-way format of, for instance, subjects having scores on variables under several conditions (three-way profile data) or of situations which are judged on a number of scales by several individuals (three-way rating data). To investigate the structure in such data, it seems natural to look for generalisations of two-mode component analysis. These generalisations or three-mode component models are the subject of this paper, in particular two of the most commonly used models for this purpose, the Parafac model and the Tucker3 model. Several earlier didactically oriented papers exist which explain three-mode analysis in detail, among others Bro (1997), Harshman (1994b), Kroonenberg (1994), Van Mechelen and Kiers (1999), and Kiers and Van Mechelen (2001), and many practical issues are treated in detail in Kroonenberg (2008). The present paper treats and illustrates the two major three-mode models in a single paper, which we hope will provide insights into each method as well as an overall understanding of their relationship.
We will restrict ourselves here to three-way profile data to keep the discussion manageable; for a discussion of three-way rating data see Kroonenberg (2008, chap. 14). Subjects, variables, and conditions will be the labels used for the three modes, but obviously these labels are generic, and in other sciences other types of three-way data occur. Even though many categorical variables are collected, especially in the social and behavioural sciences, and three-mode methods exist to deal with them (see, for instance, Kroonenberg, 2008, chap. 17 and 18), in this paper we will limit ourselves to numeric variables. We will also not discuss data in which there exists a specific time structure or order relationship between the measurement conditions. The introduction of order information into an analysis introduces specific technical problems as well as interpretational opportunities, but treating them would take us too far afield.
Three-way profile data correspond to a multivariate repeated-measures design. The major difference between the standard multivariate analysis and our treatment with three-mode models is our focus on individual differences between the subjects as well as on the underlying correlational structure for the variables. If individuals are a random sample from one or more populations, i.e., the subject mode is a stochastic mode, only means and covariances are of interest, so that one may restrict oneself to their analysis without paying attention to the individuals themselves; see Bentler and Lee (1979), Oort (1999), and their references for stochastic three-mode models. However, often the subjects are analysed as if they are a population rather than a (random) sample from a well-defined population. Moreover, there often is an interest in changes in the correlational structure between the variables across conditions. In such cases, three-mode analysis is particularly useful. The value of a three-mode analysis is enhanced if additional information is available about the subjects. If it is not, one can still fruitfully describe the differences found in terms of the data contained in the three-way array, but not in terms of other characteristics of the subjects.
If a data set is approached as a population or if one sees the three-mode analysis as a purely descriptive technique for the sample at hand, then sample size is not necessarily a serious concern. However, in small data sets, the description of the majority of the data can easily be distorted by outlying data points, by a specific level of a mode, or by both.
In three-way profile data, the subject mode often can be considered both as an individual differences mode and as a stochastic one, which allows for calling upon techniques which need stochastic modes for their application. For instance, given a stochastic mode with unknown distributional properties, the (statistical) stability of (parts of) a three-mode solution can nevertheless be assessed via a bootstrap analysis (see Kiers, 2004). Split-half procedures for assessing stability also assume a stochastic mode, as do imputation procedures for generating multiple data sets in the case of missing data (Kroonenberg, 2008, chap. 7).

Preprocessing Three-Way Profile Data
Because numeric variables often have different measurement scales and ranges, their values are not necessarily comparable, and nearly always some form of preprocessing is required, i.e., means have to be removed (centring) and the values have to be divided by a scale factor (normalisation). The standard way to handle three-way profile data before the analysis proper is to centre per variable-condition combination $(j,k)$, i.e., per column across subjects, and to normalise per variable slice $j$, so that

$$z_{ijk} = \frac{x_{ijk} - \bar{x}_{\cdot jk}}{\sqrt{\frac{1}{IK}\sum_{i=1}^{I}\sum_{k=1}^{K}\left(x_{ijk} - \bar{x}_{\cdot jk}\right)^{2}}},$$

where the $x_{ijk}$ are the raw data and the $z_{ijk}$ the preprocessed data. A dot in place of an index indicates that the scores have been summed (here, averaged) over that index. This centring is recommended because it eliminates the average score on each variable for each occasion, so that all scores are expressed as deviations from the scores of the 'average person'. This is the person whose scores are exactly at the mean of each variable at each occasion. Thus all $z_{ijk}$ are deviations from the variable means and are normalised per variable $j$. In longitudinal studies, this type of centring has the effect of removing the time trends over occasions, so that these trends have to be investigated independently of the variation around them. The suggested centring has the additional advantage that if there is a real, meaningful, but unknown zero point of the scale, it is automatically removed from the scores (see Harshman & Lundy, 1984b, for a detailed discussion of this point). A further effect of this centring is that the component coefficients of the subjects are in deviation from their means.
Normalising over all values of each variable means that the normalised deviation scores carry the same meaning for all occasions. Thus a standard deviation has the same size in the original scale for each occasion. A similar argument is used in multi-group structural equation modelling for using covariance rather than correlation matrices; Kroonenberg (2008, chap. 6) discusses the recommendation for normalisation in greater depth.
The effect of the recommended preprocessing is that the variable coefficients on the components can be adjusted in such a way that they can be interpreted as variable-component correlations, as shown in Table 1. The recommended preprocessing is the only way to get this interpretation.
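The recommended preprocessing can be sketched in a few lines of code. The sketch below assumes that the data are held in a NumPy array ordered as subjects by variables by conditions and that the scale factor is the root mean square of the centred scores per variable; the function name `preprocess_profile_data` and the simulated scores are purely illustrative and are not taken from the analyses reported here.

```python
import numpy as np

def preprocess_profile_data(x):
    """Centre per variable-condition combination (j, k) across subjects,
    then normalise per variable slice j over subjects and conditions."""
    x = np.asarray(x, dtype=float)                # shape (I, J, K)
    centred = x - x.mean(axis=0, keepdims=True)   # remove the means over subjects
    # root mean square per variable j, taken over subjects i and conditions k
    scale = np.sqrt((centred ** 2).mean(axis=(0, 2), keepdims=True))
    return centred / scale

# Hypothetical data: 6 subjects, 3 variables, 2 conditions
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(6, 3, 2))
z = preprocess_profile_data(x)

print(z.mean(axis=0))               # approximately zero for every (j, k)
print((z ** 2).mean(axis=(0, 2)))   # approximately one for every variable j
```

After this step every variable-condition column has mean zero across subjects, and every variable has unit mean square over all subjects and conditions, so that differences in spread between conditions within a variable are retained.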

Objectives of Three-Mode Analysis
The major aim of applying three-mode analysis to three-way profile data is to unravel complex patterns of dependencies between the observations. All models have component matrices for each of the modes, thus one set of components for the subjects, one for the variables, and one for the conditions. Moreover, the models also contain weights which provide the size of the links between components of the different modes. In the Parafac model these weights are such that a component of one mode is linked only with one component of another mode, but in the Tucker3 model links between all components of all modes are allowed. This makes the latter model inherently more complex, but also more flexible in fitting it to data. Details of this are worked out when we discuss the models themselves.

Model and Dimensionality Selection
Even though we have referred to the three-mode models as the Parafac model and the Tucker3 model, it is better to think of classes of three-mode models, because each model can occur with different numbers of components. In practice we will therefore not only have to choose a type or class of model, but within those classes we will also have to decide how many components we need for analysis, just as we have to choose the number of components in ordinary two-mode principal component analysis.

Interpretation
There are essentially four different ways in which the basic results from a three-mode analysis can be presented, i.e., (1) tables of the coefficients or loadings for each mode, rotated or not; (2) separate pair-wise graphs of the components per mode; (3) all-components plots showing all components of a single mode in a plot with the levels of the mode on the horizontal axis; (4) per-component plots showing one component of each of the three modes in the same plot. As we shall see, the last type of plot conforms most closely to the spirit of the Parafac model, because it allows for the simultaneous inspection of the coefficients of the three modes per component. The pair-wise component plots make more sense when the spatial characteristics of the solution are to be examined, and this is more in line with the characteristics of the Tucker3 model. Further details will be discussed in later sections.

Validation
In validating three-mode results, one may look internally towards the analysis itself and investigate the statistical stability of the solution via bootstrap analyses; see Kroonenberg (2008, chap. 8). Furthermore, one may look at the residuals to see which parts of the data have not been fitted well and to what extent systematic information is left. In Tucker models nearly every part of the model can be evaluated in terms of fit to the data. Another approach to validation is to look at information external to the data set that is available on the entities which make up the data set, to see whether it can help to shed further light on the patterns found. Examples are design information for the variables and background information on the subjects.

THREE-MODE MODELS
In this section we discuss the two most common three-mode models, the Parafac model and the Tucker3 model. With these models the same type of data can be analysed, but they differ in the assumed underlying structure in the data, the ease of fitting the models to data, and the number and type of choices one has to make to find an appropriate model for the data at hand. We will take up each model in turn and discuss some of its most salient properties. For full details one should consult the references mentioned earlier. However, before entering into the discussion of three-mode models, a brief recapitulation of principal component analysis and the singular value decomposition will be given to set the scene for the three-mode discussion.

Singular Value Decomposition: Basis for PCA
Two-way profile data are usually collected in a two-way data matrix $\mathbf{X} = (x_{ij})$ of $I$ subjects by $J$ variables. The variables do not necessarily have the same scales, and thus some type of normalisation will generally be required. We will refer to the entities within a mode as levels; thus the variables and subjects are the levels of their modes.
In standard (two-mode) component analysis, we attempt to fit a two-mode model to two-way profile data of subjects by variables as follows:

$$x_{ij} = \sum_{s=1}^{S} a_{is} b_{js} g_{ss} + e_{ij}, \tag{1}$$

where $S$ is the number of components, the $e_{ij}$ are the errors of approximation, and the $a_{is}$ are the coefficients or scores for the subjects; per component they can be collected into a column vector $\mathbf{a}_s$. Such columns are also known as the 'subject' (or left) singular vectors. Similarly, the $b_{js}$ are the coefficients for the variables, and per component they can be collected into the column vectors $\mathbf{b}_s$, which are the 'variable' (or right) singular vectors. The $g_{ss}$ are the square roots of the eigenvalues (i.e., the singular values), which represent the standard deviations of the components given the standard centring and normalisations. The model is known in the statistical literature as the singular value decomposition. As Figure 1 shows, the $g_{ss}$ constitute the diagonal of the matrix of singular values, $\mathbf{G}$, and in the light of the coming generalisation to three modes, this matrix is called the two-mode core matrix. If the $g_{ss}$ are absorbed in the variable coefficients, the $a_{is}$ are generally referred to as (standardised) component scores, while the $f_{js} = b_{js} g_{ss}$ are then referred to as loadings (see left-hand panel of Figure 1). The column vectors or components $\mathbf{a}_s$ together form the matrix $\mathbf{A}$ of component scores. The column vectors $\mathbf{f}_s$ form the matrix of component loadings $\mathbf{F}$.
If the $g_{ss}$ are absorbed in the subject coefficients, the $\mathbf{q}_s$, with elements $q_{is} = a_{is} g_{ss}$, are vectors of scores whose variance is equal to the eigenvalues (see right-hand panel of Figure 1). These two different ways of presenting the principal component model were
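
The two ways of absorbing the singular values can be illustrated numerically. The following is a minimal sketch, assuming a small simulated, column-centred data matrix and using NumPy's `numpy.linalg.svd`; the variable names merely mirror the symbols of Equation 1 and the data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(10, 4))      # hypothetical data: I = 10 subjects, J = 4 variables
x = x - x.mean(axis=0)            # centre each variable across subjects

# numpy returns the factors of x = a @ diag(g) @ b.T (b is returned transposed)
a, g, b_t = np.linalg.svd(x, full_matrices=False)
b = b_t.T

f = b * g    # singular values absorbed in the variable coefficients (loadings)
q = a * g    # singular values absorbed in the subject coefficients (scores)

# Both presentations reproduce the same data matrix
print(np.allclose(a @ f.T, x), np.allclose(q @ b.T, x))

# The sum of squares of each column of q equals the corresponding eigenvalue g**2
print((q ** 2).sum(axis=0), g ** 2)
```

In the first form the subject vectors have unit length and the variable coefficients carry the size of the components; in the second form the roles are reversed, while the fitted values are identical.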

Parallel Proportional Profiles
The fundamental idea underlying the Parafac model was first formulated by Cattell (1944) in the form of the principle of parallel proportional profiles, which he explained in Cattell and Cattell (1955) as follows: "The basic assumption is that, if a factor corresponds to some real organic unity, then from one study to another it will retain its pattern, simultaneously raising or lowering all its loadings according to the magnitude of the role of that factor under the different experimental conditions of the second study. No inorganic factor, a mere mathematical abstraction, would behave this way [...]. This principle suggests that every factor analytic investigation should be carried out on at least two samples, under conditions differing in the extent to which the same psychological factors might be expected to be involved. We could then anticipate finding 'true' factors by locating the unique rotational position (simultaneously in both studies) in which each factor in the first study is found to have loadings which are proportional to (or some simple function of) those in the second: that is to say, a position should be discoverable in which the factor in the second study will have a pattern which is the same as the first, but stepped up or down." (Cattell & Cattell, 1955, p. 84; their italics).
The parallel proportional profile principle has great intuitive appeal, exactly because it determines the orientation of component axes without recourse to assumptions about the desired pattern of loadings. In particular, there is no attempt to maximise simplicity or any other structural property within a given occasion. Instead, the proportional profiles criterion is based on a simple and plausible conception of how loadings of "true factors" would vary across occasions.
A further strength of the parallel proportional profile principle is that it is based on patterns in the data that could not easily have arisen by chance, or as an artefact of the way that the study was conducted. When it can be empirically confirmed that such proportional shifts are present and reliable, one has established an important fact about the data, one that is in accordance with the existence of underlying factors which have a particular axis orientation (see Harshman & Lundy, 1984a, pp. 147-152, 163-168, for a more detailed presentation of the argument).
A disadvantage of the Parafac model compared with models or rotations with specific restrictions on the components is that no specific theories can be tested, nor can very simple, easily interpretable components be searched for. One way of attempting to get the best of both worlds is to use constrained Parafac models (see, for instance, Krijnen, 1993, and Bro, 1998, for several ways in which this can be done). Furthermore, in practice data sets very often support only a limited number of components with parallel profiles, while other three-mode models can describe additional patterns which do not conform to the parallel proportional profile principle. This is one of the important differences between the three-mode models discussed here.

The Basic Parafac Model
Three-way profile data can be collected in an $I$ by $J$ by $K$ three-way data array $\mathbf{X} = (x_{ijk})$ of $I$ subjects by $J$ variables by $K$ conditions. The variables do not necessarily have the same scales, and thus some type of normalisation will generally be required. Again, we will refer to the entities within a mode as levels; thus the variables, subjects, and conditions are the levels of their modes.
The basic Parafac model for a three-way array $\mathbf{X}$ with elements $x_{ijk}$ consisting of the scores of $I$ subjects on $J$ variables under $K$ conditions is a direct extension of standard two-mode PCA (Equation 1), and has the form

$$x_{ijk} = \sum_{s=1}^{S} a_{is} b_{js} c_{ks} g_{sss} + e_{ijk}, \tag{2}$$

where the subject scores $a_{is}$, the variable loadings $b_{js}$, and the condition weights $c_{ks}$ are the elements of the component matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$, respectively, and the $e_{ijk}$ are the errors of approximation. Note that the model shows great similarity to the two-mode case (Equation 1), but it has an additional set of coefficients for the conditions. In the Parafac model, all modes take identical roles, so that there is no a priori designation of variable coefficients as loadings in the sense of variable-component correlations and of subject coefficients as standardised scores, as is common in two-mode analysis. However, we often use these terms in this paper for ease of presentation without assuming the specific interpretations attached to them in two-mode analysis.
The weight of the $s$th component, $g_{sss}$, indicates its importance. These weights can be interpreted in the same way as their two-mode counterparts, the singular values $g_{ss}$, except that they are not necessarily standard deviations. Moreover, their squared values $g_{sss}^{2}$ in general do not add up to the total variance explained unless the components are uncorrelated. The squared values do, however, indicate the variability accounted for by their components and indicate their importance in reconstructing the data. In standard two-mode component analysis the $g_{ss}$ fit onto the diagonal of a square matrix (see Fig. 1), and analogously the $g_{sss}$ can be collected on the diagonal of a cube (see Figure 2; top). Such a diagonal is called a superdiagonal, and the cube (i.e., $g_{pqr} = 0$ unless $p = q = r$) is referred to as a core cube. The papers by Harshman and Lundy (Harshman & Lundy, 1984a, 1984b; Harshman, 1984) together are the most complete treatment of the model and many related issues to date. More recent treatments are to be found in the following books: Krijnen (1983), Bro (1997), Smilde, Geladi, and Bro (2004), and Kroonenberg (2008).
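
The structural part of the Parafac model and the behaviour of the squared weights can be illustrated with a small numerical sketch. It assumes unit-length components and made-up weights; the helper names `parafac_reconstruct` and `unit_columns` are hypothetical. With orthonormal subject components the sum of squares of the structural part equals the sum of the squared weights, whereas with correlated components cross-product terms generally enter.

```python
import numpy as np

def parafac_reconstruct(a, b, c, g):
    """Structural part of the model: sum over s of a[i,s] * b[j,s] * c[k,s] * g[s]."""
    return np.einsum("is,js,ks,s->ijk", a, b, c, g)

def unit_columns(m):
    """Scale every column of m to unit length."""
    return m / np.linalg.norm(m, axis=0)

rng = np.random.default_rng(2)
S = 2
g = np.array([3.0, 1.5])                        # made-up component weights
b = unit_columns(rng.normal(size=(5, S)))       # variable components
c = unit_columns(rng.normal(size=(3, S)))       # condition components

# Orthonormal (uncorrelated) subject components: squared weights add up
a_orth, _ = np.linalg.qr(rng.normal(size=(8, S)))
x_hat = parafac_reconstruct(a_orth, b, c, g)
total_ss = (x_hat ** 2).sum()
print(total_ss, (g ** 2).sum())

# Correlated components in all modes: the two quantities generally differ
a_corr = unit_columns(rng.normal(size=(8, S)))
print((parafac_reconstruct(a_corr, b, c, g) ** 2).sum())
```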

Uniqueness and explicit models
In contrast to the Tucker3 model to be discussed, the Parafac model has, given the number of components, a unique solution apart from trivial rescalings and reorderings of the components, provided there is "adequate" variation across all three modes, to use the term originally used in Harshman (1970). The easiest set of conditions which define adequacy is that in at least one of the modes the components are nonproportional, i.e., $\mathbf{c}_s \neq k\,\mathbf{c}_{s'}$ for any constant $k$ and $s \neq s'$. Adequate variation in all three modes is central to the uniqueness of the model and the presence of systematic variation in the data; full details can be found in Kruskal (1984), Carroll and Chang (1970), and Harshman (1972b).
The uniqueness makes the Parafac model very attractive if it is known or hypothesised that it gives an adequate representation of the structural patterns in the data. Harshman uses this uniqueness property (or 'intrinsic axis property', as he also calls it) to search for "real" psychological factors, basing himself on Cattell's parallel proportional profiles as was indicated in the quote in the opening paragraph of this section. Each parameter in the model depends on only one of the indices $i$, $j$, $k$, so that the parameters may be seen as proportionality constants for entire components. Thus, the conditions weight each $a_{is} b_{js}$ with the same weight, $c_{ks}$, irrespective of the variable $j$ or the subject $i$. For example, the component $\mathbf{a}_s$ is proportionally enlarged or decreased by an amount $c_{ks}$ for a level $k$ of the third mode. Thus across conditions the components $\mathbf{a}_s$ are parallel; hence the name of the model.
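
The proportionality across conditions can be made concrete with a short sketch for a single component. The coefficients below are simulated and the function name `component_contribution` is hypothetical; the point is only that the contribution of a component in condition $k$ is the same subject-by-variable pattern multiplied by the condition weight.

```python
import numpy as np

def component_contribution(a_s, b_s, c_s):
    """Contribution of one component: a_s[i] * b_s[j] * c_s[k]."""
    return np.einsum("i,j,k->ijk", a_s, b_s, c_s)

rng = np.random.default_rng(3)
a_s = rng.normal(size=6)            # subject coefficients of one component
b_s = rng.normal(size=4)            # variable coefficients of the same component
c_s = np.array([1.0, 0.5, 2.0])     # condition weights (made up)

contribution = component_contribution(a_s, b_s, c_s)
pattern = np.outer(a_s, b_s)        # subject-by-variable pattern shared by all conditions

# Every condition slice is the shared pattern stepped up or down by c_s[k]
for k, weight in enumerate(c_s):
    print(k, np.allclose(contribution[:, :, k], weight * pattern))
```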

Tucker Models
The Tucker models form the other class of models which are generalisations of two-mode principal component analysis, and they are named after their proposer, Ledyard R Tucker. Tucker (1966) was the first to propose workable three-mode models, to present algorithms for fitting them, and to show a number of real-data examples. There are several variants of the basic model proposed by Tucker, such as the Tucker3 and Tucker2 models. In this paper we will only discuss the Tucker3 model, which features components for all three modes. The central difference between the Tucker3 model and the Parafac model is that in the Tucker3 model each mode has its own components, rather than there being one set of components to which all modes refer. In two-mode analysis, the components in the subject and variable spaces are uniquely linked to each other and can therefore be considered to be the same (all $g_{ss'}$ with $s \neq s'$ are zero). However, this is not so in the Tucker3 model, where any component of one mode can be linked to any component of the other modes. On the other hand, the exclusive link between components of different modes is retained in the Parafac model.
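
The difference in linkage between the two model classes can be sketched numerically. In the illustration below the weights linking the components of the three modes are stored in a three-way core array; all component matrices and weights are simulated, and the function name `tucker3_reconstruct` is hypothetical. A full core array links every component of one mode to every component of the others, while a core with non-zero entries only on its superdiagonal reproduces the exclusive links of the Parafac model.

```python
import numpy as np

def tucker3_reconstruct(a, b, c, core):
    """Structural part: sum over p, q, r of a[i,p] * b[j,q] * c[k,r] * core[p,q,r]."""
    return np.einsum("ip,jq,kr,pqr->ijk", a, b, c, core)

rng = np.random.default_rng(4)

# Tucker3: each mode has its own number of components (here 3, 2 and 2)
a = rng.normal(size=(8, 3))
b = rng.normal(size=(5, 2))
c = rng.normal(size=(4, 2))
core = rng.normal(size=(3, 2, 2))            # all links between components allowed
x_tucker = tucker3_reconstruct(a, b, c, core)

# Parafac as a special case: equal numbers of components, superdiagonal core
S = 2
a2, b2, c2 = a[:, :S], b, c
g = np.array([2.0, 0.7])
core_sd = np.zeros((S, S, S))
idx = np.arange(S)
core_sd[idx, idx, idx] = g                   # only the links (s, s, s) are non-zero
x_special = tucker3_reconstruct(a2, b2, c2, core_sd)
x_parafac = np.einsum("is,js,ks,s->ijk", a2, b2, c2, g)
print(np.allclose(x_special, x_parafac))
```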
The Tucker3 model can be used to decompose a data array into all its components, as is true for PCA in the two-mode case, so that one can always find a complete solution for any three-mode data set. The model applied to a three-way array $\mathbf{X}$ with elements $x_{ijk}$ has the form

$$x_{ijk} = \sum_{p=1}^{P}\sum_{q=1}^{Q}\sum_{r=1}^{R} a_{ip} b_{jq} c_{kr} g_{pqr} + e_{ijk}, \tag{3}$$

where the scores $a_{ip}$, the loadings $b_{jq}$, and the occasion coefficients $c_{kr}$ are the elements of the component matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$, respectively. The $g_{pqr}$ indicate the weight of the link between the $p$th component of the first mode, the $q$th component of the second mode, and the $r$th component of the third mode. The $g_{pqr}$ can be collected in a rectangular 'box' called the core array $\mathbf{G}$, and the $e_{ijk}$ are the errors of approximation (see Fig. 3). In this presentation we will not go into the question of whether the $e_{ijk}$ are really random disturbance terms or whether they represent information which may be decomposed further. The symmetry with respect to the modes (see Equation 3) means that all modes take identical roles, so that there is no a priori need to designate variable coefficients as loadings and subject coefficients as scores. However, also for the Tucker3 model we often use these terms for ease of presentation without assuming the specific interpretations attached to them in two-mode analysis. The Tucker3 model can also be written in matrix form, but we refrain from this in this paper; for the technical details see Kroonenberg (2008, chap. 4).

The perception of parental behaviour by parents and their children was the central concern of the study from which the illustrative data in this paper were drawn (Kojima, 1975). Kojima wanted to validate the component (or factorial) structure of the questionnaires used to measure parental behaviour. He argued that this could be done by comparing the reactions of parents and their children on the same (type of) instrument, and in separate analyses he found great similarity. This formed the basis for presuming that the perceptions or perceptual dimensions of parents and children are sufficiently similar to be included in a single analysis such as the one presented here.
The instrument used in his research was the Japanese version of the Child's Report of Parent Behaviour Inventory, CRPBI (Schaefer, 1965). In order to have parents judge their own behaviour, Kojima (1975) developed a strictly parallel parental version of this inventory (PR-PBI). The CRPBI is a three-point Likert-type questionnaire designed to assess children's perceptions with respect to parental acceptance, permitted psychological autonomy, and level of parental control. In English, nearly identical forms are used for indicating behaviours of mothers and fathers, but owing to the structure of the Japanese language, it was possible to make a single version suitable for both parents. The substantive questions to be addressed in our analysis are (1) to what extent the structures of the questionnaire subscales are independent of who judges the parent behaviour and (2) whether individual differences exist between judges, and how they can be modelled and presented.
The data in this section are ratings expressing the judgements of parents with respect to their own behaviour towards their daughters and those of their daughters with respect to their parents. Thus, there are four conditions: both parents assessing their own behaviour with respect to their daughters, i.e., Father-Own behaviour (F-F) and Mother-Own behaviour (M-M), and the daughters' judgements of their parents' behaviour, i.e., Daughter-Father (D-F) and Daughter-Mother (D-M). Collectively we will refer to conditions or to judges depending on the context. The judgements concerned the parents of 150 middle-class Japanese eighth-grade girls and were made on the 18 subscales of the inventory (see Table 1). Thus the three-way profile data consist of a 150 (girls) × 18 (scales) × 4 (judgement combinations) data array. Rather than only speaking of girls, we will at times refer to 'families' as well.

Objectives of the Analysis
Kojima performed separate component analyses for each of the judgement conditions, and evaluated the similarities of the components using congruence coefficients. He also used Tucker's interbattery procedure (Tucker, 1958). With the Parafac model we will search for a single set of parallel proportional components for the scales valid in all conditions simultaneously.
The three-component Parafac solution presented in the next section should be seen as an example of the kind of answers that can be obtained with the model. We do not claim that this model solution is necessarily the best or most detailed Parafac analysis that can be obtained from the data. In later sections, we will discuss such issues in more detail.

Dimensionality Selection
Before the Parafac analysis itself, the data were preprocessed in one of the ways recommended for three-way profile data. The condition means per scale, $\bar{x}_{.jk}$, were removed and the scales were normalised. That is, per scale the centred data, $z_{ijk} = x_{ijk} - \bar{x}_{.jk}$, were divided by the square root of the average sum of squares, $\sqrt{\frac{1}{IK}\sum_{i}\sum_{k} z_{ijk}^{2}}$.
The first objective of a Parafac analysis is to establish how many reliable parallel proportional components can be sustained by the data. This issue is dealt with in detail in the section on the Parafac model, where it is investigated to what extent there exist inappropriate or degenerate solutions, and to what extent the components cross-validate in split-half samples.
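The preprocessing described above (centring per scale within each condition, then normalising per scale) can be sketched in a few lines. This is a minimal illustration in Python with NumPy, not the program used for the analyses reported here. The function name preprocess_profile_data and the random array standing in for the 150 × 18 × 4 data are assumptions made for the example, as is the reading that the average sum of squares per scale is taken over all subjects and conditions.

```python
import numpy as np

def preprocess_profile_data(X):
    """Centre and normalise a subjects x scales x conditions array.

    Hypothetical helper: removes the mean of every scale under every
    condition, then divides each scale by the root of its average
    sum of squares (averaged over subjects and conditions).
    """
    # z_ijk = x_ijk - mean over subjects i, per (scale j, condition k)
    Z = X - X.mean(axis=0, keepdims=True)
    # average sum of squares per scale j, over subjects and conditions
    mean_ss = (Z ** 2).mean(axis=(0, 2), keepdims=True)
    return Z / np.sqrt(mean_ss)

# Synthetic stand-in with the same shape as the data array (not the Kojima data).
rng = np.random.default_rng(0)
X = rng.normal(loc=2.0, scale=0.5, size=(150, 18, 4))
Z_tilde = preprocess_profile_data(X)
```

After this step every scale has mean zero under each condition and an average sum of squares of one, so all scales carry equal weight in the analysis.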
As shown in detail in the section on the Kojima girls' data, the fitted sums of squares for the one- to four-component solutions were 16%, 30%, 39%, and 46%, respectively, but the reliability of the three- and four-component solutions is questionable. Requiring the first (subject) mode to be orthogonal improves the solutions at all dimensionalities. Their proportional fitted sums of squares are only fractions of one percent less than those of the original solutions (for more details see below). For this reason, and for comparability with the solution of Kojima's boys' data set reported in Kroonenberg (2008, chap. 13), we chose to present the orthogonal three-component solution with a fit of 38%. The amount of fit is probably reasonable for this type of three-mode data. It should, however, be pointed out that, as in two-mode analysis, it would be a mistake to look for a specific amount of fit. The purpose of component analyses is to find an adequate representation of the (major) structural patterns in data which may contain considerable amounts of noise. Whether these structures explain a large part of the total variability is also related to other aspects of the data, such as their reliability and the homogeneity of the population from which the sample was drawn. From an inspection of the fit of each subject, scale, and condition, it could be seen that there were no levels fitting much better than other levels, and there were no large groups of levels which did not fit at all. This gave sufficient confidence to proceed with this solution.

Interpretation
As the Parafac model specifies that per component s the same source of variation underlies the variability in each mode, a single interpretation for a component is appropriate, as it is in two-mode component analysis. To understand this interpretation, two aspects are involved in each set of components of the Parafac model: (1) the components themselves, $\mathbf{a}_s = (a_{is})$, $\mathbf{b}_s = (b_{js})$, $\mathbf{c}_s = (c_{ks})$, and (2) how they combine to generate the estimated observed score (or structural image of the observed score): $\sum_s g_{sss} a_{is} b_{js} c_{ks}$. Each of the components s corresponds to a source of variation. In particular, this source is in the system generating the measurements, but varies in impact across the levels of each mode. If parental love is the source in question, then items measuring parental love should have high scores on the variable component, children who are judged to receive much parental love should have high scores on the child component, and conditions in which parental love is particularly evident should have high scores on the condition component. The effect of the source is then expressed equally by the variation across levels of every mode, i.e., parental love increases the score for all children in the same manner and also for all conditions in the same manner. The interpretation of the source is made by considering the patterns of variation in its relative impact across levels of each of the modes. Basically, one interprets all modes because the component is an entity that is the same source of variation being moderated both by the levels of one mode and by the levels of another mode. The interpretation of the Parafac model for three-way profile data usually starts with the variables (scales in the present example), but the interpretation of the component or the source of the variation in the data should be in line with the observed variability in the other modes.
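How the components of the three modes combine into the structural image can be made concrete with a short sketch. The code below is only an illustration in Python with NumPy; the function name parafac_structural_image and the small random component matrices are assumptions made for the example and have no relation to the estimates in Table 1.

```python
import numpy as np

def parafac_structural_image(g, A, B, C):
    """Return the I x J x K array with entries sum_s g[s]*A[i,s]*B[j,s]*C[k,s]."""
    return np.einsum('s,is,js,ks->ijk', g, A, B, C)

# Hypothetical sizes and values, chosen only to show the mechanics.
rng = np.random.default_rng(1)
I, J, K, S = 5, 4, 3, 2
g = np.array([2.0, 1.0])            # component weights g_sss
A = rng.normal(size=(I, S))         # subject components
B = rng.normal(size=(J, S))         # scale components
C = rng.normal(size=(K, S))         # condition components
X_hat = parafac_structural_image(g, A, B, C)
```

Because each term uses the same index s in all three modes, a component contributes to a score only through the product of its own subject, scale, and condition coefficients.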
The top left-hand part of Table 1 gives the component-scale correlations, i.e., the structure matrix; the right-hand part will be discussed in the section on the Tucker3 model. The scales have been arranged to supply as coherent a picture as possible, and components 2 and 3 have been interchanged to correspond as well as possible with the Tucker3 solution to be discussed later. The commonly found dimensions underlying the CRPBI are Psychological Control, Firm Control, and Acceptance (see Table 1 for their constituent items). By and large the items of each dimension group together on the Parafac components, but the Parafac components themselves do not concur with the "official" dimensions. For instance, there is no separate Firm Control dimension. In the bottom part of the table we find the normalised coordinates for the conditions. Noteworthy is that all values are positive. The first component shows the consensus between the daughters' judgements and those of their parents, although the girls have slightly higher coefficients than their parents, indicating that their judgements are more extreme; the second component is primarily valid for the parents, and the third for the daughters. Note that it is not a question of contrasts, as all coefficients are positive. In other words, the second and third components carry comparatively much less weight for the daughters and the parents, respectively.
Figure 4 shows the per-component plot for Component 1; similar plots can be made for the other two components. As Kojima's interest centred on the similarity in structure of the scale space, we take this as our starting point for the interpretation.
The first scale component is largely the Acceptance-Rejection dimension often found for the CRPBI, with Lax discipline siding with Acceptance, but it also contains three scales which are supposed to belong to the Psychological Control dimension (Inconsistent discipline, Control through guilt, and Withdrawal of relations). All judging conditions have positive coefficients, indicating that all judges had similar views on parental behaviour. Thus overall, parents and their daughters gave similar judgements. Whether the parents were characterised by the judges as Accepting or Rejecting varied enormously per family.
The term of the Parafac model displayed in Figure 4 is the product of all first components, $g_{111} a_{i1} b_{j1} c_{k1}$. We observe that the judges' coefficients $c_{k1}$ are all positive. The signs of $a_{i1}$ and $b_{j1}$ in the product $a_{i1} b_{j1}$ determine whether the parents of girl i are judged as rejecting or accepting. High positive scores for $a_{i1}$, such as those of daughters 29, 60, and 81, combined with the positive coefficients of the accepting scales $b_{j1}$ indicate that the parents are judged particularly accepting and not rejecting, and vice versa for the girls at the other end of the component, i.e., all judges see the parents of girls 57, 13, and 138 not as accepting, but as rejecting parents. If the small differences between judges are to be believed, girls judge their parents to be slightly more extreme than the parents judge themselves.
To illustrate the above argument numerically, suppose that Acceptance (scale j) has a value of $b_{j1} = .40$ on the first component. In addition, the judging condition k, Daughter judges parent, has a value of $c_{k1} = .43$. If, furthermore, the weight of the first component is $g_{111} = 2.1$, then the combined value is $g_{111} b_{j1} c_{k1} = 2.1 × .40 × .43 = .36$. For Girl 29, who has a component score of $a_{29,1} = .20$, the first component's contribution to her Acceptance score in the daughter-judges-parent condition is $.36 × .20 = .07$. However, for Girl 138, with a component score of -.20, the contribution is -.07. Thus on the basis of this component Girl 29 judged her parents as more accepting (rather than more rejecting), while Girl 138 judged her parents as more rejecting (rather than more accepting). As the zero point represents the mean due to the centring, Girl 29 scores above and Girl 138 below average on Accepting in the daughter-judges-parent condition.
All scales have positive and comparable coefficients on the second component. In addition, the coefficients of the judges are all positive, but those of the parents as judges are much higher than those of their daughters. Given the positive coefficients of the parents, this component seems to represent something like a response style. Some judges, especially parents, give more extreme answers on all scales than others. For parents whose daughters are on the positive side of the girls' axis the response style serves to make the parents' judgements more positive, while for the girls on the negative side the parents' judgements are more negative on all scales.
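The arithmetic of this illustration can be checked with a few lines of Python. The numbers are the illustrative values used in the text, not estimates read from Table 1, and the variable names are chosen here only for the example.

```python
# Illustrative values from the numerical example in the text.
g_1 = 2.1      # weight of the first component
b_j1 = 0.40    # Acceptance scale on the first component
c_k1 = 0.43    # daughter-judges-parent condition on the first component
a_29 = 0.20    # component score of Girl 29
a_138 = -0.20  # component score of Girl 138

combined = g_1 * b_j1 * c_k1          # scale-by-condition part of the term
contribution_29 = combined * a_29     # first component's contribution for Girl 29
contribution_138 = combined * a_138   # same for Girl 138, with opposite sign

print(round(combined, 2), round(contribution_29, 2), round(contribution_138, 2))
```

The two contributions differ only in sign, because the two girls have component scores of equal size on opposite sides of the mean.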

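[Figure 4. Per-component plot for the first components of the three modes; condition labels in the plot: G-F, F-F, G-M, M-M.]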
The third scale component is (positively) dominated by Behavioural and Psychological control, although there is a bit of a similar response-style flavour as with the parents in the second component. Only Lax discipline is exempt. From the condition components we see that this component reflects primarily the daughters' judgements of their parents. The interpretation proceeds along the same lines as for the parents on the second component.

Summary of the Analysis
There is substantial agreement between parents and their daughters about the parents' behaviour, but some differences between them could be observed. The differences between families were, however, much larger than the differences within families. This is evident from the considerable differences between the girls (between-family differences), and the comparatively small differences between the judging conditions (within-family differences). Parallel proportional differences were primarily evident in the contrast between Acceptance and Rejection. It is not easy to compare these results with those of most component analyses of similar data, due to the use of a different model. When discussing analyses using the Tucker3 model, we will come back to comparisons of these results.
An interesting question arises from the analysis using the Parafac model. If we follow Cattell and Harshman in holding that components derived with Parafac have more "real" meaning than (rotated) maximum-variance components, one may wonder what the meanings are of the Parafac components in the present study. The analysis gives the impression that the scales exhibit one real contrast (Acceptance-Rejection) and that the other components capture primarily response styles. Of course, these findings need to be confirmed by analysing other similar data sets in the same way, preferably some of the defining data sets of Schaefer. It is clear that the variable-component correlations (see Table 1) only partially follow the "official" grouping of the originators of the instrument.

THREE-MODE ANALYSIS: PRACTICAL ISSUES
In this section we will discuss how to carry out three-mode analyses with the major three-mode models. We will discuss such issues as choosing an analysis within each class, comparing solutions, plotting, properties of the solutions, and what to do in case things do not go according to plan. Various points are illustrated with the Kojima girls' data.

Parafac Model
The primary aim of the analyses in this section is to discuss the practical issues in connection with carrying out a Parafac analysis on three-way profile data. For this purpose, we will lean heavily on the magnum opus of Parafac's godfather, Richard Harshman (Harshman & Lundy, 1984a, 1984b; Harshman & DeSarbo, 1984).

Objectives
The aim of most analyses with the Parafac model is to uncover the existence of components which show parallel proportional profiles, and if possible to identify these components as "real" ones which carry true substantive meaning. In many chemical applications this is not a problem, as the components often correspond to physical or chemical properties of the substances under investigation. Harshman, Bro and co-workers have shown several analyses where the uncovered components could be given such a status; e.g., Harshman (1994a), Harshman, Ladefoged, & Goldstein (1977), Bro (1998), Smilde et al. (2004a). However, in the example of the previous section, the correspondence between the components and the theory of parental behaviour is far more difficult to establish, especially because no non-orthogonal solutions could be found. If stable solutions are present, this will give valuable information to substantive researchers about possible underlying parallel proportional profiles. Thus, once stable Parafac components have emerged, this should spur researchers on to make sense of these components using parallel proportional profiles.

Data and Design: Types of variability.
Because the Parafac model is based on parallel proportional profiles (PPP) and is therefore sensitive to violation of this principle in the data, it is important to pay attention to the possibility that such profiles might not be present. However, often the only way to check this is via an analysis with the model itself. The proportional profile requirement is also known as trilinearity, which refers to the property that the model is linear in the components of each mode given the values of the other two modes (see Equation 2).
In Harshman & Lundy (1984a, p. 130) the PPP principle is explained in terms of models or components underlying the data. In the system variation model, the components "reside in the system under study and through the system affect the particular objects; the [component] influences exhibited by particular objects would thus vary in a synchronous manner across a third mode." In "the object variation model, separate instances of the [component] can be found in each of the objects, and these within-object [components] would not be expected to show synchronous variation across levels of a third mode." 4 Typically, object variation is not in accordance with the Parafac model. Harshman & Lundy (1984a) suggest that if the object variation is in one mode only, the fully-crossed data can be converted to cross-product matrices such that the object-variation mode 'disappears', in the same way as individuals are no longer visible in a correlation matrix but only the variables. Then these cross-product matrices can be investigated for parallel proportional profiles in the other modes with the standard Parafac model.

Model and Dimensionality Selection
The choice of the 'best' or most appropriate solution of the Parafac model is not an easy one, and the procedure to arrive at such a solution requires considerable attention to detail. Because the model is identified or unique given the data and the number of components, it may not fit due to the lack of parallel profiles in the data. Often several analyses at each plausible number of components are necessary to find an appropriate solution, if it exists at all. Again the Kojima girls' data, but not the boys' data, are a case in point.
Uniqueness. Harshman (1970) provided the first discussion of the uniqueness of the Parafac model. Krijnen (1993, chap. 2) provided a further discussion of the concept of uniqueness. The major thrust of his argument was that one should distinguish between weak and strong uniqueness. In the former case, the unique solution is surrounded by a number of non-unique solutions which fit almost as well as the unique one, and thus Harshman's claim that the unique solution should be given preference over all other solutions is not quite as strong. Krijnen argues that in such a case it might be advantageous to take a solution that is nearly as good if it is easier to interpret, for example because via rotations the components can have a simple structure. A solution is strongly unique if there are no non-unique models which have almost as good a fit as the Parafac model itself. Harshman (1972a) (see also Krijnen, 1993, pp. 28, 29) showed that a necessary condition for the uniqueness of a solution is that no component matrix has proportional columns. Krijnen suggested checking for weak uniqueness by comparing the fit of a regular solution of the Parafac model with that of a model with two proportional columns in any of the component matrices. If the difference is small the solution is considered weakly unique; for an example, see Kroonenberg (2008, pp. 323ff.). A similar approach may be taken by comparing the fit of the regular unique Parafac solution with that of any other Parafac model, say with orthogonality constraints on one of the component matrices.
The accepted way to compare two components, say $\mathbf{x}$ and $\mathbf{y}$, is via Tucker's congruence coefficient,
$$\phi(\mathbf{x}, \mathbf{y}) = \frac{\sum_i x_i y_i}{\sqrt{\sum_i x_i^2 \, \sum_i y_i^2}}, \qquad (4)$$
which is similar to a correlation coefficient except that the components are not in deviation of their means (Tucker, 1951; Ten Berge, 1986). If the components are already in deviation of their means, as is generally the case for the subject mode in profile data, the congruence coefficient is equal to the correlation coefficient. For the Kojima girls' data, Table 2 shows the congruence coefficients for the one- through three-component analyses with orthogonal subject components. The reason for the orthogonality is explained below.
For the Kojima girls' data, the orthogonal subject component spaces seem to consist of nearly unrelated components, as most congruence coefficients are nowhere near one (Table 2). However, regression analyses show that both the one-dimensional and the two-dimensional solutions are contained in the three-dimensional solution, and the three-dimensional solution itself is again embedded in the four-dimensional one. This shows that for the subjects the lower-dimensional spaces are nested in the higher-dimensional ones, but that the orientation of the axes in each of the component spaces is different from that of the lower-dimensional one. Therefore, it seems that the parallel proportional profiles in these data are not very strong, and one may call their existence into question.
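Tucker's congruence coefficient is simple to compute. The sketch below, in Python with NumPy, is an illustration only; the function names congruence and congruence_matrix are chosen here for the example and do not refer to any particular package.

```python
import numpy as np

def congruence(x, y):
    """Tucker's congruence coefficient between two components (vectors)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))

def congruence_matrix(U, V):
    """Congruence coefficients between all columns of U and all columns of V."""
    U = np.asarray(U, dtype=float)
    V = np.asarray(V, dtype=float)
    Un = U / np.linalg.norm(U, axis=0)   # scale every column to unit length
    Vn = V / np.linalg.norm(V, axis=0)
    return Un.T @ Vn

# Proportional components have congruence 1, whatever their means are.
phi = congruence([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Applied to the component matrices of two solutions, congruence_matrix gives a table of the same kind as Table 2, with one row per component of the first solution and one column per component of the second.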

Multiple Solutions.
Given that the Parafac model is a restrictive one, i.e., not all data sets have an acceptable Parafac solution, one has to run analyses with different numbers of components to determine whether the data contain parallel proportional profiles. In other words, when solving for the Parafac model we are really concerned with model fitting and not so much with data approximation. This also means that both using too few and using too many components can lead to unsatisfactory solutions. In general, it turns out that specifying too many components is more problematic than specifying too few, because violation of the parallel proportional profile principle is more likely for later components. When two components are close together in terms of their contributions to the fit of a model that is too small, different analyses may pick up different components, and only when the number of components is increased can stable solutions be found. Whether this occurs is very much data dependent; see Murakami and Kroonenberg (2003) for an example.

Split-Half Strategy.
One way to gain insight into the stability of the components is to split the data in half and perform a separate analysis on both parts. If there is a true underlying solution, it should show up in both analyses. An important restriction of this procedure is that there must be a stochastic mode for which splitting makes sense. However, such a mode almost always exists in three-way profile data, in particular the subject mode. When there is no stochastic framework, splitting might not be possible. For instance, in some typical experiments in analytical chemistry splitting the data set in halves does not make sense. The other caveat is that there must be sufficient 'subjects' in the mode that is being split. In other words, both splits must be large enough to minimise the influence of the idiosyncrasies of specific individuals. How much is large enough is difficult to say in general, because much depends on the noise level of the data and the clarity of the underlying structure. Harshman and DeSarbo (1984) discuss the split-half procedure in great detail with illustrative examples (see also Kiers & Van Mechelen, 2001); see Kroonenberg (2008, pp. 326ff.) for split-half analyses of the Kojima boys' data.
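The splitting step itself is straightforward. The following Python sketch randomly divides the subject mode into two halves; it is an illustration under assumptions, not the procedure of Harshman and DeSarbo (1984) in full. The function name split_half and the synthetic array are hypothetical, and fitting a model to each half and comparing the resulting scale and condition components (for instance with congruence coefficients) is left out.

```python
import numpy as np

def split_half(X, seed=0):
    """Randomly split the first (subject) mode of X into two halves."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(X.shape[0])   # random order of the subjects
    half = X.shape[0] // 2
    idx1 = np.sort(perm[:half])          # subjects in the first half
    idx2 = np.sort(perm[half:])          # subjects in the second half
    return X[idx1], X[idx2], idx1, idx2

# Synthetic stand-in with the shape of the data array (not the Kojima data).
rng = np.random.default_rng(3)
X = rng.normal(size=(150, 18, 4))
X_s1, X_s2, idx1, idx2 = split_half(X, seed=1)
```

Only the scale and condition components can be compared directly across the two halves, because each half contains different subjects.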
Degeneracy. Harshman (1970) was the first to note the problem of non-converging solutions in which two components tend to become perfectly negatively correlated, and in which the $g_{sss}$ increase without bound. Such solutions from a Parafac analysis are called degenerate, and this phenomenon has been a topic of intensive research ever since; for a recent paper, see Stegeman (2006). The basic problem causing the degeneracy is that algorithms to compute the model parameters cannot cope with data that do not conform to the Parafac model, and therefore produce degenerate, uninterpretable solutions. Krijnen and Kroonenberg (2000) (see also Krijnen, 1993, pp. 13ff.) discussed a number of measures to assess whether an algorithm is tending towards such a degenerate solution, and they suggest a number of heuristic values for these measures. Their approach was inspired by, and is an improvement of, earlier work of Harshman and Lundy (1984a, p. 272). To assess degeneracy, we need the cosines between two components in a single mode, which is the same as calculating the congruence coefficient between the components. As an example, the cosine between the sth and s′th component of the first mode is $\cos(\alpha_{s,s'}) = \mathbf{a}_s'\mathbf{a}_{s'}$ (for components of unit length). If we define $\mathbf{f}_s$ as the I×J×K (column) vector consisting of the terms $a_{is} b_{js} c_{ks}$ (see Equation 2), then the cosine $\theta_{s,s'}$ between $\mathbf{f}_s$ and $\mathbf{f}_{s'}$ is the triple cosine product:
$$\cos(\theta_{s,s'}) = \cos(\alpha_{s,s'}) \cos(\beta_{s,s'}) \cos(\gamma_{s,s'}). \qquad (5)$$
If $\cos(\theta_{s,s'})$ is approaching -1, there is almost certainly a degenerate solution. It signifies that the two components $\mathbf{f}_s$ and $\mathbf{f}_{s'}$ have become proportional, and this is explicitly 'forbidden' in the Parafac model. The conclusion can be further supported by creating an S×S matrix of the $\cos(\theta_{s,s'})$ and inspecting its smallest eigenvalue. If it gets, say, below .50, degeneracy might be present. In addition, one should assess the condition number of the triple-cosine matrix (i.e., the largest eigenvalue divided by the smallest one). If the condition number is getting large, say, somewhat arbitrarily, larger than 5, a degeneracy is likely. Both the smallest eigenvalue and the condition number are indicators of whether the triple-cosine matrix is of full rank, as it should be for a Parafac solution. The best way to confirm degeneracy is to run the same analysis again but with an increased number of iterations and possibly with a more stringent convergence criterion. If the indicators get worse, degeneracy is getting more and more likely.
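The three indicators just described (triple cosine products, smallest eigenvalue, and condition number of the triple-cosine matrix) can be computed from the component matrices of a solution as sketched below. The code is a Python illustration with hypothetical function names and a small made-up example. It assumes at least two components, and the thresholds mentioned in the text remain heuristics.

```python
import numpy as np

def cosine_matrix(M):
    """Cosines between all pairs of columns of a component matrix."""
    Mn = M / np.linalg.norm(M, axis=0)
    return Mn.T @ Mn

def degeneracy_indicators(A, B, C):
    """Triple-cosine matrix with its smallest eigenvalue and condition number."""
    # Elementwise product of the three cosine matrices gives cos(theta_{s,s'}).
    T = cosine_matrix(A) * cosine_matrix(B) * cosine_matrix(C)
    eigenvalues = np.linalg.eigvalsh(T)            # ascending order
    off_diagonal = T[~np.eye(T.shape[0], dtype=bool)]
    return {
        "triple_cosines": T,
        "min_triple_cosine": float(off_diagonal.min()),
        "smallest_eigenvalue": float(eigenvalues[0]),
        "condition_number": float(eigenvalues[-1] / eigenvalues[0]),
    }

# Made-up two-component example in which the components are nearly
# proportional with opposite sign in the first mode.
A = np.array([[1.0, -1.0], [0.0, 0.1]])
B = np.array([[1.0, 1.0], [0.0, 0.1]])
C = np.array([[1.0, 1.0], [0.0, 0.1]])
indicators = degeneracy_indicators(A, B, C)
```

In this example the triple cosine product is close to -1, the smallest eigenvalue is far below .50, and the condition number is far above 5, so all three indicators point to degeneracy.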
For the Kojima girls' data, in an unconstrained three-component solution the triple-cosine product between the first and second component was -.95, the smallest eigenvalue of the triple-cosine matrix was .06, and the condition number of that matrix was 23.4. Finally, the standardised component weights, i.e., the $g_{sss}^2$ divided by the total sum of squares, were 3.9, 3.6, and .3, while they should be smaller than 1.0. Clearly, the three-component solution is a degenerate one. In fact, the two- and the four-component solutions are degenerate as well. When degeneracy is suspected, it is absolutely necessary to use several starting positions for the algorithm, because ending up with one degenerate solution does not necessarily mean that there is no properly convergent solution; see especially Paatero (2000) for an enlightening discussion of this point. To circumvent degeneracy, one may impose constraints on the solution, such as requiring one of the modes to be orthogonal or imposing nonnegativity on the components. Not uncommonly, abandoning the Parafac model and exploring the data with a Tucker3 model might be a good option, and this was the approach taken with the Kojima girls' data.
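The advice to use several starting positions can be illustrated with a bare-bones alternating least squares routine. The sketch below, in Python with NumPy, is a plain unconstrained algorithm written for this illustration; it is not the program used for the analyses in this paper, it has no orthogonality or nonnegativity constraints, and it absorbs the weights $g_{sss}$ into the component matrices. All function names and the noise-free synthetic data are assumptions.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product; row (u, v) holds U[u, s] * V[v, s]."""
    return np.einsum('us,vs->uvs', U, V).reshape(U.shape[0] * V.shape[0], -1)

def parafac_als(X, S, n_iter=1000, tol=1e-10, seed=0):
    """Plain alternating least squares for an S-component Parafac model."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    B = rng.standard_normal((J, S))               # random starting position
    C = rng.standard_normal((K, S))
    X1 = X.reshape(I, J * K)                      # subjects by (scale, condition)
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)   # scales by (subject, condition)
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)   # conditions by (subject, scale)
    total_ss, prev_fit = (X ** 2).sum(), -np.inf
    for _ in range(n_iter):
        # Update each mode by least squares with the other two held fixed.
        A = X1 @ np.linalg.pinv(khatri_rao(B, C)).T
        B = X2 @ np.linalg.pinv(khatri_rao(A, C)).T
        C = X3 @ np.linalg.pinv(khatri_rao(A, B)).T
        X_hat = np.einsum('is,js,ks->ijk', A, B, C)
        fit = 1.0 - ((X - X_hat) ** 2).sum() / total_ss
        if fit - prev_fit < tol:                  # stop when fit no longer improves
            break
        prev_fit = fit
    return A, B, C, fit

def best_of_starts(X, S, n_starts=5):
    """Run the algorithm from several random starts and keep the best fit."""
    runs = [parafac_als(X, S, seed=seed) for seed in range(n_starts)]
    return max(runs, key=lambda run: run[3])

# Noise-free synthetic data with exactly two Parafac components.
rng = np.random.default_rng(42)
A0 = rng.standard_normal((20, 2))
B0 = rng.standard_normal((6, 2))
C0 = rng.standard_normal((4, 2))
X = np.einsum('is,js,ks->ijk', A0, B0, C0)
A, B, C, fit = best_of_starts(X, S=2)
```

In practice one would also compare the component matrices of the different runs, not only their fit, because equal fit does not guarantee that the same components were found.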

Searching for Convergent Solutions.
When searching for a satisfactory number of components, analyses with different numbers of components have to be run and, if necessary, restrictions have to be placed on one or more of the component matrices. Parafac analyses for Kojima's girls' data were run with 1 through 4 components, both with and without orthogonality restrictions on the subjects' components, and, for purposes of assessing stability, the data set was randomly split in halves (S1 and S2) and analysed with orthogonal solutions.
The first objective in getting acceptable solutions is to have convergent and non-degenerate solutions. None of the non-restricted analyses came up with an acceptable solution, as all of them were degenerate. Because of this, only results of solutions with orthogonality restrictions are discussed. Note that because there are only four levels in the third mode, going beyond four components does not make much practical sense. It should be noted that having convergent solutions does not necessarily mean that the solutions are the same, a matter taken up in the next section. The proportional fitted sums of squares for the four solutions of the full data set were .16, .30, .38, and .46, showing that at least 46% of the variability in the data could be modelled with a parallel proportional profiles model. The two split-half samples show similar explained variability. Also on the other measures the split-half samples show results comparable to those of the complete sample. Note that the standardised component weights of most solutions show at least two nearly equal values, which might explain why the components of the successive solutions did not align, as was evident in Table 2.

Examining Convergent Solutions.
Detailed analyses of convergent solutions via component comparisons using the congruence coefficients are necessary to establish their stability. These comparisons may also be used as an additional way to assess the uniqueness of the solution, because, assuming that there is a unique solution in the population, the same components should recur in the overall solution and in the split-half solutions, barring sampling errors and possible instability due to insufficient subjects. It is an empirical question whether 76 subjects in the split-half samples are enough to establish stability. When comparing solutions we could take our lead from Lorenzo-Seva and Ten Berge (2006), who found in an empirical study that subjects judged components to have a fair similarity when the congruence coefficient was in the range .85-.94, while two components with a congruence coefficient higher than .95 were considered to be identical for interpretation.
Detailed analyses showed that there is no stability across the overall orthogonal analyses and the split-half ones. This suggests either that there is little stability or that there are serious violations of the underlying model. This is in contrast with the boys' data, where we could conclude that the same components were present in the overall solutions and the split-halves, although there were considerable differences between the split-halves in that analysis.
Assessing the Parafac Core Array. Bro (1998) (see also Bro & Kiers, 2003) suggested using the core consistency as a measure for evaluating Parafac models. This measure is based on assessing how far away the core array derived from the Parafac components is from a superdiagonal core array, i.e., the cube of size S×S×S in which only the $g_{111}, g_{222}, \ldots, g_{SSS}$ have sizeable values, and all other core elements are near zero. Bro (1998, pp. 113-122) also proposed to construct a core consistency plot which has the core values on the vertical axis and the core elements on the horizontal one, with the superdiagonal elements plotted first. For the Kojima girls' data, core consistency plots for the three-dimensional orthogonal solution and for the second split-half set (S2) are presented in Figure 5.
[Figure 5. Core consistency plots for the three-component orthogonal solution and for split-half sample S2; annotation recovered from the figure: normalised core consistency = .667 (66.7%).]
All two-component solutions had a core consistency of just about 100%, indicating a perfect superdiagonal core array (see Table 3). The three-component orthogonal solution (Figure 5), with a core consistency of 20%, shows that there are quite a few larger-than-zero elements in the core array, and the three superdiagonal elements are not even the largest ones. However, the three-component solution (Figure 5) of split-half sample S2 showed a far better core consistency (67%), in contrast with the other split-half sample (9%) (Table 3).
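The idea behind the core consistency can be sketched as follows. The Python code computes the least-squares core array for given component matrices and compares it with a superdiagonal array of ones, following the general idea described in Bro and Kiers (2003). It assumes that the component weights are absorbed in the component matrices and that these matrices have full column rank. The function names are hypothetical, and the normalised variant reported in Table 3 and Figure 5 may be defined differently.

```python
import numpy as np

def parafac_core(X, A, B, C):
    """Least-squares core array for fixed component matrices A, B, C."""
    return np.einsum('pi,qj,rk,ijk->pqr',
                     np.linalg.pinv(A), np.linalg.pinv(B), np.linalg.pinv(C), X)

def core_consistency(X, A, B, C):
    """Percentage agreement of the core array with a superdiagonal of ones."""
    S = A.shape[1]
    G = parafac_core(X, A, B, C)
    T = np.zeros((S, S, S))
    T[np.arange(S), np.arange(S), np.arange(S)] = 1.0   # ideal Parafac core
    return 100.0 * (1.0 - ((G - T) ** 2).sum() / S)

# Synthetic example: pure Parafac data versus data with one extra
# 'Tucker' term linking component 1 of mode A to component 2 of B and C.
rng = np.random.default_rng(7)
A = rng.standard_normal((10, 2))
B = rng.standard_normal((6, 2))
C = rng.standard_normal((4, 2))
X_parafac = np.einsum('is,js,ks->ijk', A, B, C)
X_tucker = X_parafac + np.einsum('i,j,k->ijk', A[:, 0], B[:, 1], C[:, 1])
cc_parafac = core_consistency(X_parafac, A, B, C)
cc_tucker = core_consistency(X_tucker, A, B, C)
```

For the pure Parafac data the core consistency is 100%; the single off-superdiagonal term in the second data set lowers it to 50%, which shows how Tucker structure in the data shows up in this measure.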

Interpretation
Above we have illustrated the presentation of the results via a table with the scale coefficients (Table 1) and also via the per-component plot of the first components of each of the three modes (Figure 4). If one wants to make pair-wise component plots, it is not the components themselves which should form the coordinate axes, as they are correlated, but a set of orthogonal axes which span the same space. This is necessary to obtain the correct interpretation of distances in such plots (Kiers, 2000). Given the problematic solution we will not show such a plot in this paper, but examples can be found in Kroonenberg (2008).

Parafac with Constraints.
In a previous section we introduced the idea of constraining the components of a Parafac solution to evaluate uniqueness, but constraints have a far wider use. For instance, in intelligence tests all subtests have to correlate positively with each other; therefore a nonnegative first component is obligatory and even two nonnegative components are desirable (Krijnen and Ten Berge, 1992). Thus substantive issues may require restrictions on the components.

Parafac Core Arrays.
As mentioned above, one of the major reasons for degeneracies is that the data contain so-called Tucker structure (Harshman & Lundy, 1984b). Under the Parafac model, each component of each mode is exclusively linked to one single component of each of the other modes. Thus in the Parafac model only terms such as $a_{is} b_{js} c_{ks}$ exist, in which the s is the same for all modes. However, in some data the term $a_{is'} b_{js} c_{ks}$ with $s' \neq s$ also explains a substantial part of the variability in the data. In other words, there are two components, $\mathbf{a}_s$ and $\mathbf{a}_{s'}$, which both have links with the components $\mathbf{b}_s$ and $\mathbf{c}_s$. This is the standard situation in the Tucker3 model, but it is prohibited in the Parafac model, and therefore solutions including such terms are said to have Tucker structure. The core consistency plot (Figure 5) showed that there were indeed many non-zero elements in the core array computed from the three-component Parafac model for the Kojima girls' data.

Conclusion from the Parafac Analysis
The scale and condition spaces of the Kojima girls' data showed clear interpretability, but a closer examination of the various solutions brought a considerable number of problems to light. That this is particular to these data was clear from the analyses of the parallel data of the boys, which were presented in Kroonenberg (2008, chap. 13). By looking in detail at the quality of the solutions via fit measures, comparing the original solution with split-half samples, inspecting congruence coefficients between components, and examining core consistency, it became clear that the Parafac model is not the most appropriate, or the easiest to interpret, model for these data. In the next section we will examine the same data with a Tucker3 model to establish whether this model provides better insight into the judgements of parental behaviour.

Objectives
The Tucker3 model is primarily used to find a limited set of components with which the most important part of the variability in the data can be described. The model is thus especially useful for data reduction and for exploration of variability. The main reason that the Tucker3 model is not as directly useful as the Parafac model for developing or identifying general principles or rules that underlie patterns in the variables is its rotational freedom. The basic result of applying the Tucker3 model to three-way data is a set of component spaces in which any orientation of the axes is as good as any other in terms of the fit of the solution to the data. However, if descriptions of the patterns are desired in terms of the original variables, subjects, and conditions, rather than as latent entities, the rotational freedom can be extremely helpful in unravelling these patterns.
In three-way profile data the variability analysed mostly takes the form of squared normalised deviations from the variable means at each condition (see the section on preprocessing). Thus, a large part of the variability in the original data, in particular the variability between the means per condition, is not contained in the three-mode analysis itself. When interpreting results, this should always be borne in mind. However, what facilitates the interpretation is that the removed means represent the scores of the average subject, i.e., the person who has an average score on each variable under each condition. In many applications, the variable×condition matrix of means contains important information in its own right and should be carefully analysed, for instance via a biplot (Gabriel, 1971). An important caveat is that such a matrix of means is most effectively interpreted if the sample over which the means are calculated is in some way representative of an identifiable population. In the present example this means that we should satisfy ourselves that the girls represent some meaningful sample of a population of Japanese girls.
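As an illustration of what "normalised deviations from the variable means at each condition" amounts to, the sketch below centres a subjects × variables × conditions array per variable-condition combination and then normalises per variable. The function name, the random data, and the choice to scale each variable to unit mean square are assumptions made for this example; the removed variable×condition means are returned so that they can be analysed separately.

```python
import numpy as np

def preprocess_profile(x):
    """Centre per variable-condition combination, normalise per variable.

    x has shape (subjects, variables, conditions).
    Returns the preprocessed array and the variables x conditions matrix of means.
    """
    means = x.mean(axis=0, keepdims=True)            # means over subjects
    centred = x - means
    # root mean square per variable, taken over all subjects and conditions
    scale = np.sqrt((centred ** 2).mean(axis=(0, 2), keepdims=True))
    return centred / scale, means[0]

rng = np.random.default_rng(1)
x = rng.normal(loc=3.0, scale=2.0, size=(20, 6, 4))  # hypothetical raw scores
z, means = preprocess_profile(x)
ss_per_variable = (z ** 2).sum(axis=(0, 2))          # equal for all variables
```

After this step every variable has the same total sum of squares, so that differences in fit between variables cannot be attributed to differences in scale.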

Types of Variability.
In contrast with the Parafac model, the Tucker3 model can handle both system and object variation, and therefore can be used to fit data in which the correlations between the components change over time. It is especially the simultaneous handling of these types of variation which makes the Tucker3 model extremely useful for investigating the patterns within and between modes in situations where little is known a priori about them. Alternatively, the model can be used when one does not want to prejudice the analysis or cannot find a satisfactory Parafac analysis.

Model and Dimensionality Selection
When confronted with three-way data, choosing the most appropriate Tucker3 model is not always easy, because the numbers of components can vary independently for each mode. In this section we will try to find a reasonably fitting Tucker3 model for the Kojima girls' data by first evaluating a series of models with varying numbers of components in the three modes. There are two types of plots useful for model selection: (1) the deviance plot (Figure 6), displaying the deviance or residual sum of squares against the degrees of freedom, and (2) the three-mode scree plot (Figure 7), showing the deviance against the total number of components, i.e., N = P + Q + R (Timmerman & Kiers, 2000; Kiers & Der Kinderen, 2003). The models with the best SS(Residual)/df ratios and those with the best SS(Residual)/N ratios lie on what is called the convex hull, which is the curve drawn in each of these figures. To assist in choosing a model, one may use the Ceulemans-Kiers (2006) st-criterion; see also Kroonenberg (2008, p. 182ff.). This criterion looks for the sharpest angle in the convex hull in the deviance plot. For the present data this occurs for the 4×3×2-Tucker3 model, but it should be noted that the convex hull does not show much curvature. The preferred model indicated by Cattell's "elbow" in the three-mode scree plot is the same 4×3×2-Tucker3 model. This model contains a large number of components for the girls, which may cause difficulties in interpretation. An interesting alternative could be the more parsimonious 3×3×2-model, as it lies on the convex hull in the deviance plot and only just misses it in the three-mode scree plot. However, it has the disadvantage of only two scale components which may be simplifying the scale space too much.
Unlike the Parafac model, the Tucker3 model has no serious problems with multiple solutions, non-convergence, and similar algorithmic difficulties. Timmerman and Kiers (2000) conducted an extensive simulation study and found problems only when the numbers of components deviated much from the real numbers of components, and even then primarily when random starts were chosen rather than rational ones.
It is instructive to compare the fit of the Tucker3 model with those of comparable orthogonal Parafac ones. For the two-component models the proportional fitted sums of squares are .3151 (Parafac) and .3016 (Tucker3, 2×2×2), and for the three-component models they are .4046 (Parafac) and .3914 (Tucker3, 3×3×3). Clearly, on the basis of fit alone there is not much to choose between comparable models.
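The convex hull used in both plots can be found with a standard lower-hull construction. The sketch below is a minimal illustration with made-up deviance values and model labels (they are not the Kojima results); it keeps the best model for each value on the horizontal axis and then retains only the models on the lower boundary.

```python
def lower_convex_hull(points):
    """Return the models on the lower convex hull, ordered by x.

    points: iterable of (x, deviance, label), with x for instance the
    total number of components N = P + Q + R.
    """
    # keep only the lowest-deviance model for each x value
    best = {}
    for x, dev, label in points:
        if x not in best or dev < best[x][1]:
            best[x] = (x, dev, label)
    hull = []
    for p in sorted(best.values()):
        # drop the last hull point while it lies on or above the line
        # from the point before it to the new point p
        while len(hull) >= 2:
            (x1, y1, _), (x2, y2, _) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

# hypothetical (N, proportional deviance, model) triples
models = [
    (3, 0.85, "1x1x1"), (5, 0.72, "2x2x1"), (6, 0.70, "2x2x2"),
    (7, 0.66, "3x2x2"), (8, 0.61, "3x3x2"), (9, 0.59, "4x3x2"),
    (10, 0.587, "4x3x3"), (11, 0.58, "4x4x3"),
]
hull_labels = [label for _, _, label in lower_convex_hull(models)]
```

Models that are not on the hull are dominated by a combination of neighbouring hull models; the choice among the hull models themselves still requires a criterion such as the sharpest angle or the elbow mentioned above.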
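For readers who want to see how a proportional fitted sum of squares arises, the following is a minimal alternating least-squares sketch for a Tucker3 model with orthonormal components. It is not the program used for the analyses in this paper; the function names, the start from the singular vectors of the unfolded data (a rational start), and the random test array are assumptions made for the illustration.

```python
import numpy as np

def unfold(x, mode):
    """Matricise a three-way array with the given mode as rows."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def tucker3_als(x, ranks, n_iter=100, tol=1e-10):
    """Alternating least squares for Tucker3 with orthonormal components."""
    # rational start: leading left singular vectors of each unfolding
    comps = [np.linalg.svd(unfold(x, m), full_matrices=False)[0][:, :r]
             for m, r in enumerate(ranks)]
    ss_total = (x ** 2).sum()
    fit_old = 0.0
    for _ in range(n_iter):
        for m in range(3):
            # project the data on the current components of the other two modes
            y = x
            for n in range(3):
                if n != m:
                    y = np.moveaxis(np.tensordot(y, comps[n], axes=(n, 0)), -1, n)
            u = np.linalg.svd(unfold(y, m), full_matrices=False)[0]
            comps[m] = u[:, :ranks[m]]
        core = np.einsum("ijk,ip,jq,kr->pqr", x, *comps)
        fit = (core ** 2).sum() / ss_total   # proportional fitted sum of squares
        if fit - fit_old < tol:
            break
        fit_old = fit
    return comps, core, fit

rng = np.random.default_rng(4)
x = rng.normal(size=(10, 6, 4))              # hypothetical preprocessed data
comps, core, fit = tucker3_als(x, (3, 2, 2))
```

Because the components are orthonormal, the fitted sum of squares equals the sum of the squared core elements, which is why `fit` can be computed from the core array alone.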

Evaluating Fit.
The fit of a model can be partitioned in several ways to assess how well different parts of the data are fitted. In particular, one can look at the fit of the components of each mode, at the fit of combinations of components from different modes, at the levels within each mode, and at the fit of each data point. In this way it is possible to make a detailed evaluation of the quality of the solution, and search for ways to improve the analysis by expanding or restricting the model.
Components. The overall fit of the 4×3×2-Tucker3 model is 41%, which is a fairly common value, considering the size of the data and the fact that the fit refers to individual data points rather than to covariances.
Girls. Due to the three-way orthogonality of the Tucker3 model, it is possible to assess the fit or lack of it for each level of each mode separately (Ten Berge, De Leeuw, & Kroonenberg, 1987). The sums-of-squares plot (Figure 8) shows that there are several girls who have large total sums of squares coupled with a good fit (e.g., G57 and G138), and some girls who have large total sums of squares coupled with a low fit (e.g., G104). Girls located near the origin have both small residual and small fitted sums of squares (e.g., G44). Because the scales are centred, this means that the latter have scores close to the means of the variables. The large dot on the dotted line indicates the point of average fit and average residual sum of squares.
Scales. Figure 8 (right-hand panel) shows that Nonenforcement is the worst fitting scale in the 4×3×2-solution, while Acceptance, Positive involvement, and Acceptance of individuation fit best. Whether one scale fits better than another is not related to the type of disciplining involved, but has something to do with the consistency and variability of the answers. The normalisations ensured that all variables had equal total sums of squares, so that the scales are aligned on the slanted line of equal total sum of squares in the graph.
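The quantities plotted in a sums-of-squares plot can be computed per level of a mode as sketched below. The data are hypothetical, and the fitted values are a stand-in obtained by projecting the data on arbitrary orthonormal bases for the second and third modes, so that the additive partition SS(Total) = SS(Fit) + SS(Residual) holds per level of the first mode; with a converged Tucker3 solution the fitted array of that solution would be used instead.

```python
import numpy as np

def ss_per_level(z, z_hat, mode):
    """Total, fitted, and residual sums of squares for each level of a mode."""
    axes = tuple(a for a in range(z.ndim) if a != mode)
    ss_total = (z ** 2).sum(axis=axes)
    ss_fit = (z_hat ** 2).sum(axis=axes)
    ss_res = ((z - z_hat) ** 2).sum(axis=axes)
    return ss_total, ss_fit, ss_res

rng = np.random.default_rng(2)
z = rng.normal(size=(12, 6, 4))                   # hypothetical preprocessed data
# stand-in for a least-squares fit (assumption, see text above)
B = np.linalg.qr(rng.normal(size=(6, 3)))[0]      # orthonormal, second mode
C = np.linalg.qr(rng.normal(size=(4, 2)))[0]      # orthonormal, third mode
z_hat = np.einsum("ijk,jq,lq,kr,mr->ilm", z, B, B, C, C)

ss_total, ss_fit, ss_res = ss_per_level(z, z_hat, mode=0)
relative_fit = ss_fit / ss_total                  # proportion fitted per subject
```

Plotting `ss_res` against `ss_fit` for each level gives the sums-of-squares plot; levels with the same total sum of squares lie on a line with slope minus one.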
Judgement conditions. As there are only four conditions, there is no need to construct a sums-of-squares plot. The Girl-Father and Girl-Mother conditions have a proportional fit of .49 and .56, respectively. The Father-Father and Mother-Mother conditions have a better fit, i.e., .62 and .73, respectively. The (modest) difference is probably due to more inconsistency in the girls' judgements over the scales.
Fit of combinations of components. Apart from determining an appropriate number of components and their contributions to the overall fit, one may also investigate the contributions of the component combinations to the fit of the model to the data. As long as the components are orthogonal within the modes, the core array can be partitioned into the explained variabilities by the combinations of components by squaring the core elements and dividing them by the total sum of squares after preprocessing (see the section on preprocessing). Thus, the proportion explained variability of combination (p,q,r) is $g_{pqr}^2 / SS(\text{Total})$, where $SS(\text{Total}) = \sum_{i,j,k} z_{ijk}^2$, with $z_{ijk}$ the preprocessed data. For details on the Kojima girls' data see below.
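The partition of the fit over combinations of components can be sketched as follows. The data and the orthonormal component matrices are hypothetical random stand-ins, not an optimised solution; the point is only that, with orthonormal components, the squared core elements divided by the total sum of squares add up to the overall proportional fit.

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.normal(size=(15, 6, 4))                     # hypothetical preprocessed data
# hypothetical orthonormal component matrices with (P, Q, R) = (3, 2, 2)
A = np.linalg.qr(rng.normal(size=(15, 3)))[0]
B = np.linalg.qr(rng.normal(size=(6, 2)))[0]
C = np.linalg.qr(rng.normal(size=(4, 2)))[0]

core = np.einsum("ijk,ip,jq,kr->pqr", z, A, B, C)   # core elements given A, B, C
ss_total = (z ** 2).sum()
explained = core ** 2 / ss_total                    # squared core element / SS(Total)

z_hat = np.einsum("ip,jq,kr,pqr->ijk", A, B, C, core)
overall_fit = (z_hat ** 2).sum() / ss_total
```

Each entry of `explained` is the proportion of variability accounted for by one combination (p,q,r), and the entries sum to `overall_fit`.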

Interpretation
In the Parafac analysis, we interpreted the results from the scales in a straightforward manner, i.e., we described the patterns of the components of the scale and condition modes. On the whole, these patterns were relatively clear without additional manipulation of the output itself. Moreover, Parafac components have the proportional profile property which gives them a special status, and it is primarily the patterns per component which need to be interpreted rather than the patterns across components. However, from a 'simple structure' point of view, i.e., looking at patterns across components, many scales in the Parafac solution have sizeable coefficients on more than one component, so that the components are anything but simple. Because of the rotational freedom, the results from a Tucker3 analysis can be investigated for simple structure, and that is what we will proceed to do in this section. In particular, we will examine plots of components per mode, rotations of both the component spaces and the core array, joint representations of the modes, and the use of constraints.

Displaying Components and Core Array.
As mentioned above, there are essentially four different ways in which the components may be presented: (1) Tables of the coefficients or loadings for each mode, rotated or not (Table 1); (2) Separate pair-wise graphs of the components per mode (not shown); (3) All-components plots showing all components of a single mode in a plot with the levels of the mode on the horizontal axis (not shown); (4) Per-component plots showing one component of each of the three modes in the same plot (Figure 4). The third option is especially useful when there is a natural order for the levels of a mode, but this was not the case here (for an example see Figure 11.10 in Kroonenberg, 2008, p. 268).
Scale components. The solution for the scales after rotation was already presented in Table 1. The rotational procedure, called the Harris-Kaiser independent cluster rotation, is described and evaluated in detail in Kiers and Ten Berge (1994). In essence it is an oblique rotation realised via a varimax rotation on an orthonormalised component matrix, and its aim is to find clusters of variables that are as clearly separated as possible. After rotation the normalised scale components indeed show a much clearer cluster structure than in the Parafac case. The components can now be designated as Acceptance, Rejection and Control, and (Lax) Discipline, which again does not completely concur with the official grouping of the scales. However, the components no longer have the parallel proportional profile property.
Judgement components. The mode of the judgement conditions is more conveniently presented in a table (Table 5), because there are so few levels. From this table, it follows that the girls judge their parents' behaviour as being rather similar, and also that the parents give similar judgements about their own behaviour. The model selection graphs showed that adding condition components will not really increase the fit, so that no further important distinctions between the judgements can be found. In other words, it makes sense to talk about parental behaviour rather than father or mother behaviour, albeit that the daughters mostly but not always agree with their parents about its nature.
Core array. The core array contains the weights for the links between the components from different modes (Table 6). In the unrotated case, we are always dealing with principal components, so that in almost all cases the first element of the core array, here $g_{111} = 1.57$, is the largest one.
The second largest element in the unrotated core array, $g_{221} = 1.46$, is a combination of the second girl component, the second scale component, and the first condition component, which is an impossible combination for a Parafac model. Note that the adjusted core array after rotation of the condition and scale components has many more mid-sized elements than the original core array. This general effect, i.e., distribution of variability, is the price to pay for having a simpler component structure. Kiers (1998b) has devised procedures to simultaneously simplify the components and the core array.
A detailed examination of the rotated core array proceeds via a comparison of the sizes of the links between the components. The rotated core array contains a fair number of medium-sized elements, which complicates its interpretation. However, due to the clear meaning of the rotated Tucker3 components (see Table 1) it is easier to investigate the relationships between the components from the three modes. The substantive interpretation of the rotated core array makes use of the fact that after rotation we have separate components for the daughters and the parents (see Table 1). This translates itself into separate panels for the daughters (left-hand panel) and the parents (right-hand panel). Looking at the first two rows G1 and G2, corresponding to the first two girl or rather family components, we see that the patterns are similar for the judges and that on the whole the core values are very similar as well. This indicates that, for families having highly positive or negative values on these components, the daughters and their parents have a similar view on parenting style. Families with highly positive values on the first family component show an accepting style of parenting ($g_{111} = 1.21$ and $g_{112} = 1.05$) and a tendency towards lax discipline ($g_{131} = .27$ and $g_{132} = .40$); for families with highly negative values this is the reverse. Families with highly positive values on the second family component show primarily a controlling and rejecting style of parenting ($g_{221} = 1.28$ and $g_{222} = .90$) with small values for the other core elements. The third family component (G3) shows a contrast in views between the daughters and parents. Daughters: a negative value for Acceptance, near zero for Control, and a moderately positive value for Lax discipline (-.76, .12, .40); Parents: a moderately positive value for Acceptance, a moderately negative value for Control, and a positive value for Lax discipline (.40, -.37, and .68). The fourth family component signifies a contrast between daughters and parents in judging parental style, especially in the control and rejection scales, but the judges agree in their view of the discipline exerted by the parents. Obviously, actual families can have scores on several components and thus are combinations of the patterns described above. It would have been insightful to have additional information on the families to make a further in-depth analysis of the parenting style of Japanese families. Unfortunately such information is not available, as Kojima's primary interest centred on the usability of the CRPBI in Japan, rather than on a full description of the perceptions of parental behaviour by Japanese families.

Note to Table 6: The numbers in bold indicate the larger elements of the core array. C stands for condition components; S for scale components; G for girls components. The rotated core is the result from a counter-rotation with respect to the varimax rotation of the condition components (see right-hand panel of Table 5). Accept = Acceptance; Control = Control & Rejection; Lax D = Lax discipline.

Joint Biplots.
A large part of the analysis of three-way data concentrates on the links between the components of the three modes, the size of which is contained in the core array. The interpretation of these links can be hard when no clear meaning can be attached to the (principal) components themselves. The principal components represent, after all, directions in the component space, which are not necessarily the directions of maximal interpretability. Not being able to assign substantive meaning to components puts restrictions on the possibility of interpreting combinations of such components. This should be evident from the above interpretation of the Kojima core array, which leaned very heavily on the meaning of the components in the scale and condition modes. To get around this, one may construct a joint biplot of the components of two modes (the display modes) given a component of the third (the reference mode). Detailed expositions of these plots can be found in Kroonenberg (2008), which also contains a number of examples and a joint biplot for the boys' data (Kroonenberg, 2008, p. 341). In most analyses, they form the most powerful representation of the results of a Tucker3 analysis.
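One common way to obtain the coordinates for such a joint biplot is via a singular value decomposition of the core slice belonging to the chosen reference-mode component, with the singular values divided symmetrically over the two display modes. The sketch below follows that idea; the function name, the random inputs, and in particular the fourth-root scaling constant used to balance the two clouds of points are assumptions for this illustration and should be checked against Kroonenberg (2008) before use.

```python
import numpy as np

def joint_biplot(A, B, core_slice):
    """Coordinates for a joint biplot of two display modes.

    A: (I, P) and B: (J, Q) orthonormal components of the display modes.
    core_slice: (P, Q) slice of the core array for one reference component.
    """
    U, lam, Vt = np.linalg.svd(core_slice, full_matrices=False)
    I, J = A.shape[0], B.shape[0]
    balance = (I / J) ** 0.25                 # assumed scaling constant
    A_star = balance * (A @ U) * np.sqrt(lam)
    B_star = (1.0 / balance) * (B @ Vt.T) * np.sqrt(lam)
    return A_star, B_star

rng = np.random.default_rng(7)
A = np.linalg.qr(rng.normal(size=(30, 4)))[0]   # hypothetical subject components
B = np.linalg.qr(rng.normal(size=(8, 3)))[0]    # hypothetical scale components
G1 = rng.normal(size=(4, 3))                    # hypothetical core slice
A_star, B_star = joint_biplot(A, B, G1)
```

The inner products of the rows of `A_star` and `B_star` reproduce the part of the fitted data associated with the reference component, which is what gives the plot its biplot interpretation.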

Rotations of Components and/or Core Arrays.
The basic solution of the Tucker3 model is based on the orthonormality of the components in each mode. This is not a restriction due to the model, but one that has to be made in order to obtain a solution. However, once a solution is obtained it may be transformed without changing the fit to the data. This is in strong contrast with the Parafac model, which due to its parallel proportional profiles puts strong restrictions on the solution, so that rotating the component matrices automatically leads to a loss of fit. Each component matrix may be rotated both with orthogonal transformations, such as varimax, and with oblique rotations or transformations, such as oblimin and the Harris-Kaiser independent cluster rotation (see Kiers & Ten Berge, 1994). The transformational freedom is a nuisance in the sense that there exists no unique solution for the model; at the same time it may be a blessing, as it may be used to find more easily interpretable directions, which turned out to be the case in the present example. Most proposed criteria for transformations in the Tucker3 model are orthogonal, as is evident in the work by Kiers, summarised for instance in Kiers (1998b). This paper also treats the case of searching for overall simplicity in the model by simultaneously transforming the components and the core array. Applications where the combined procedure has been used are, for instance, Van Mechelen and Kiers (1999), Murakami and Kroonenberg (2003), and Kroonenberg (2008, p. 249ff.).
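That a transformation of a component matrix leaves the fit unchanged, provided the core array is counter-transformed, can be verified numerically. The sketch below uses hypothetical random components and an arbitrary nonsingular (and thus in general oblique) transformation of the scale components; none of the numbers relate to the Kojima solution.

```python
import numpy as np

rng = np.random.default_rng(5)
A = np.linalg.qr(rng.normal(size=(10, 4)))[0]   # hypothetical subject components
B = np.linalg.qr(rng.normal(size=(6, 3)))[0]    # hypothetical scale components
C = np.linalg.qr(rng.normal(size=(4, 2)))[0]    # hypothetical condition components
G = rng.normal(size=(4, 3, 2))                  # hypothetical core array

def reconstruct(A, B, C, G):
    """Structural image of a Tucker3 model."""
    return np.einsum("ip,jq,kr,pqr->ijk", A, B, C, G)

T = rng.normal(size=(3, 3))                     # arbitrary nonsingular transformation
B_rot = B @ T                                   # transformed scale components
# counter-transform the second mode of the core array with the inverse of T
G_rot = np.einsum("pqr,sq->psr", G, np.linalg.inv(T))

x_before = reconstruct(A, B, C, G)
x_after = reconstruct(A, B_rot, C, G_rot)
```

The two reconstructions are identical, so the fit is unaffected; what changes is how the variability is distributed over the elements of the core array.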

Tucker3 Model with Constraints.
Given that the Tucker3 model is primarily used in an exploratory sense, most applications use models without constraints on the parameters. However, theoretical developments have been extensive in the field of examining core arrays with as many zero elements as possible; see Kiers (1998a) for an overview. With respect to putting constraints on the components, several authors have made proposals in this area and shown examples of their use, such as Klapper (1998), Bro and Sidiropoulos (1998), Bro (1998), Timmerman (2001), and Timmerman and Kiers (2002).

Internal validation of the model
Earlier we looked at the relative fit of the components and the combinations of the components via the core array, but we have not yet investigated the residuals from a model at the level of the individual data points. Similarly to other least-squares procedures, we may construct a residual plot of the standardised residuals versus the (standardised) predicted values. From such a plot (not shown) for the three-component Parafac model, for instance, one can see that there are some larger standardised residuals, but given that we have 10,800 residuals that is only to be expected. The general conclusion from that residual plot is that the residuals are well-behaved. A further view of the adequacy of the solution can be had by looking at the distribution of the standardised residuals, which show, as expected, a well-behaved normal distribution. There are only a limited number (13) of outlying observations, and there is no reason to be seriously worried about them.
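A residual check of this kind can be sketched in a few lines. The fitted values and data below are simulated, the array shape is merely one hypothetical layout with 10,800 cells, and the cut-off for calling a standardised residual outlying is an assumption for the example rather than the rule used in this paper.

```python
import numpy as np

rng = np.random.default_rng(6)
z_hat = rng.normal(size=(150, 18, 4))                 # hypothetical fitted values
z = z_hat + rng.normal(scale=0.5, size=z_hat.shape)   # hypothetical data = fit + noise

residuals = z - z_hat
std_res = (residuals - residuals.mean()) / residuals.std(ddof=1)
std_pred = (z_hat - z_hat.mean()) / z_hat.std(ddof=1)

# the pairs (std_pred, std_res) are the coordinates of the residual plot
threshold = 3.5                                       # assumed cut-off
n_outlying = int((np.abs(std_res) > threshold).sum())
```

With many thousands of residuals a handful of values beyond such a cut-off is to be expected even when the residuals are normally distributed, so the count should be judged against the number of data points.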

CONCLUSION
In this paper an extensive overview is given of several practical issues in connection with applying three-mode models to real empirical data. It is hoped that this paper will make it easier for relatively uninitiated researchers to apply these techniques to unravelling complex patterns which exist in some three-way data sets. By placing the two major three-mode models side by side, an overview has been given of the strengths and weaknesses of these models. No real comprehen-

Figure 1. Singular value decomposition and principal component analysis.
already present in the early days of component analysis (see Ten Berge & Kiers, 1997, for a complete discussion).

Figure 2. The Parafac model: Sum of combinations of components; one from each mode for each component.

Figure 4. Kojima Data: First components of the Girls' mode, the Scale mode, and the Condition mode. The solid arrows on the left-hand side refer to the scales, the dashed arrows on the right-hand side to the daughters, and the four bold marks on the vertical line to the conditions, i.e., D-F: Daughter judges her father's behaviour towards her; D-M: same with respect to her mother; F-F: the father judges his own behaviour towards his daughter; M-M: same for the mother.

Figure 5. Core consistency plots for the Kojima girls' data. Left: Three-component Parafac solution for all data; Right: Three-component solution for subset S2.

Figure 6. Deviance plot for all Tucker3 models with P ≤ 4 and Q ≤ 4 and R ≤ 4. The solid line connects optimal choices based on the deviance convex hull.

Figure 7. Kojima data: Three-mode scree plot for all Tucker3 models with P ≤ 4 and Q ≤ 4 and R ≤ 4. The horizontal line indicates that the models at the right of the line should probably not be considered.

Figure 8. Kojima girls' data: Sums-of-squares plots showing how well the model fits the data of each girl (left) and each scale (right).The dashed line starting from the origin indicates the locus of all girls (scales) with the same average residual/fit ratio.Girls (Scales) above the average line have a worse residual/fit ratio than average while the girls (scales) below the line have a better residual/fit ratio.The scales lie on a line of equal total sum of squares because they were normalised.

Table 5. Unrotated and rotated condition components and their fit
Note: Unit length components; Varimax rotation on the orthonormal components.