Multivariate analysis

MVA allows analysis of two or more variables simultaneously. Why is this important? See Simpson’s paradox: collapsing over a related variable can give misleading results. EDA (exploratory data analysis) is usually very worthwhile; it will highlight any problems with the data. Important to think about missing values.
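
A small numeric sketch of Simpson's paradox may help here. The counts below are illustrative and not from these notes; the names `counts`, `within` and `pooled` are made up for the example. Option A does better inside each subgroup, yet looks worse once the subgroup variable is collapsed.

```python
# Illustrative counts: (successes, trials) for options A and B within two subgroups.
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}

# Success rate of each option within each subgroup.
within = {
    group: {opt: s / n for opt, (s, n) in options.items()}
    for group, options in counts.items()
}

# Success rate of each option after collapsing over the subgroup variable.
pooled = {}
for opt in ("A", "B"):
    successes = sum(counts[g][opt][0] for g in counts)
    trials = sum(counts[g][opt][1] for g in counts)
    pooled[opt] = successes / trials

for group, rates in within.items():
    print(group, {k: round(v, 3) for k, v in rates.items()})
print("pooled", {k: round(v, 3) for k, v in pooled.items()})
```

The reversal arises because the two options are spread very unevenly across the subgroups, so the pooled rates mostly reflect subgroup mix rather than the options themselves.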

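Often have many variables in market research, especially from surveys. MVA can help summarise the data and reduce the chance of obtaining spurious results. Two general families of techniques: analysis of dependence (one or more variables are treated as responses to be explained by the others) and analysis of interdependence (no variable is singled out as a response).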

Principal components: identifies the underlying dimensions of a distribution, i.e. the orthogonal directions of greatest variance. Probably the most commonly used method of deriving factors.
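
As a rough sketch of what "underlying dimension" means, the first principal component can be found as the dominant eigenvector of the sample covariance matrix. The example below uses power iteration in plain Python; the data and the function names (`covariance_matrix`, `first_principal_component`) are assumptions for illustration, and in practice a statistics library would be used instead.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_principal_component(cov, iterations=200):
    """Dominant eigenvector and eigenvalue of cov, by power iteration."""
    p = len(cov)
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance along the component.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue

# Hypothetical data: two strongly correlated variables.
rows = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
        (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]

cov = covariance_matrix(rows)
direction, variance = first_principal_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = variance / total_variance
print("direction:", [round(x, 3) for x in direction])
print("share of variance explained:", round(explained, 3))
```

With two highly correlated variables, a single component captures most of the variance, which is the sense in which PCA summarises the data.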

Cluster analysis: identifies separate groups of similar cases. Also used to summarise data by defining segments. Two main techniques: hierarchical and iterative (e.g. k-means). Sometimes tandem segmentation is done by clustering on factors; this loses information but makes interpretation easier. The distance measure is as important as the clustering technique.
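
The iterative approach can be sketched with a bare-bones k-means loop. This is a minimal illustration with squared Euclidean distance and fixed starting centres, not a production implementation; the data points and the names `k_means` and `squared_euclidean` are assumptions for the example.

```python
def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, initial_centres, max_iter=100):
    """Alternate between assigning points to the nearest centre and
    moving each centre to the mean of its members, until nothing changes."""
    centres = [tuple(c) for c in initial_centres]
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centre for every point.
        labels = [min(range(len(centres)),
                      key=lambda k: squared_euclidean(p, centres[k]))
                  for p in points]
        # Update step: mean of each cluster (an empty cluster keeps its centre).
        new_centres = []
        for k in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centres.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centres.append(centres[k])
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres

# Hypothetical cases measured on two variables.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centres = k_means(points, initial_centres=[points[0], points[3]])
print(labels)
print(centres)
```

Rescaling one variable (for example, recording it in different units) changes every distance and can change the clusters, which is why the distance measure deserves as much attention as the algorithm.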

Structural equation modelling: extracts latent variables with specified causal structures (confirmatory).

Partial least squares: multivariate generalisation of regression. Extracts underlying factors that explain both the variation in the response and the variation among the predictors.
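
A rough sketch of the first extracted factor for a single response, assuming centred data: the weight vector is proportional to the covariance of each predictor with the response, the factor scores are the weighted sum of predictors, and the response is then regressed on those scores. The data and variable names below are hypothetical, and a full PLS fit would go on to deflate the predictors and extract further factors.

```python
import math

def centre(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

# Hypothetical data: two predictors (as columns) and one response, all centred.
x_columns = [centre([1.0, 2.0, 3.0, 4.0, 5.0]),
             centre([2.0, 1.0, 4.0, 3.0, 5.0])]
y = centre([1.1, 1.9, 3.2, 3.9, 5.1])

# Weights: proportional to the cross-product of each predictor with the response.
cross = [sum(x * yi for x, yi in zip(col, y)) for col in x_columns]
norm = math.sqrt(sum(c * c for c in cross))
weights = [c / norm for c in cross]

# Factor scores: one value per case, a weighted combination of the predictors.
scores = [sum(w * col[i] for w, col in zip(weights, x_columns))
          for i in range(len(y))]

# Regress the response on the single factor.
slope = sum(t * yi for t, yi in zip(scores, y)) / sum(t * t for t in scores)
fitted = [slope * t for t in scores]

rss = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
tss = sum(yi ** 2 for yi in y)
print("weights:", [round(w, 3) for w in weights])
print("share of response variation explained:", round(1 - rss / tss, 3))
```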

Discriminant analysis: how best to classify observations into (known) groups.
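
One common form is Fisher's linear discriminant for two groups. The sketch below assumes two variables, equal priors and a shared within-group covariance; the training data and the function names are made up for illustration.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, mean):
    """2x2 matrix of summed cross-products of deviations from the group mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_discriminant(group_a, group_b):
    """Direction w = Sw^-1 (mean_b - mean_a) and a midpoint threshold."""
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    sa, sb = scatter(group_a, ma), scatter(group_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    w = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    midpoint = [(ma[i] + mb[i]) / 2 for i in range(2)]
    threshold = w[0] * midpoint[0] + w[1] * midpoint[1]
    return w, threshold

def classify(x, w, threshold):
    # Group B has the larger projected mean by construction of w.
    return "B" if w[0] * x[0] + w[1] * x[1] > threshold else "A"

# Hypothetical training cases with known group membership.
group_a = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
group_b = [(6.0, 7.0), (7.0, 6.0), (7.0, 8.0), (8.0, 7.0)]

w, threshold = fisher_discriminant(group_a, group_b)
print(classify((2.5, 2.5), w, threshold))
print(classify((6.5, 7.5), w, threshold))
```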

Chi-squared automatic interaction detection (CHAID): for a discrete response with many discrete predictors; produces a tree structure.
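
The core of each split can be sketched as follows: cross-tabulate every predictor against the response, compute the chi-squared statistic, and split on the predictor with the strongest association. This is only the first step, simplified; CHAID proper compares adjusted p-values and merges similar categories, which the sketch omits. The survey records and names are hypothetical, and because both predictors have two categories the statistics are directly comparable.

```python
from collections import Counter

def chi_squared(xs, ys):
    """Pearson chi-squared statistic for the cross-tabulation of xs by ys."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    stat = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / n
            observed = joint.get((r, c), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical survey records: (region, age_band, buys).
records = [
    ("N", "young", "yes"), ("S", "young", "yes"),
    ("N", "young", "yes"), ("S", "young", "no"),
    ("N", "old", "no"), ("S", "old", "no"),
    ("N", "old", "no"), ("S", "old", "yes"),
]

response = [r[2] for r in records]
predictors = {
    "region": [r[0] for r in records],
    "age_band": [r[1] for r in records],
}

stats = {name: chi_squared(values, response) for name, values in predictors.items()}
best = max(stats, key=stats.get)
print(stats)
print("split first on:", best)
```

Repeating the same step within each resulting subgroup is what produces the tree.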

Correspondence analysis: visual summary of the relationships in contingency tables. Vectors in similar directions are positively related, vectors in opposite directions are negatively related; distance from the origin represents strength of relationship.