R vs SPSS output for princomp
On Mon, 5 May 2003, James Howison wrote:
I am using R to do a principal components analysis for a class that generally uses SPSS, so some of my question relates to SPSS output (and this might not be the right place). I have scoured the mailing list and the web but can't get a feel for this. It is annoying because they will be marking to the SPSS output. Basically I'm getting different values for the component loadings in SPSS and in R. I suspect that there is some normalization or scaling going on that I don't understand (and there is plenty I don't understand). The scree plots (and thus the eigenvalues for each component) and the Proportion of Variance figures are identical, but the loadings are an order of magnitude different: the SPSS loadings are much higher than those shown by R. Should the loadings returned by the R princomp function and the SPSS "Component Matrix" be the same?
Only if they are defined the same. The length of a PCA loading is arbitrary. R's are of length (sum of squares of coefficients) one: how are SPSS's defined?
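To make the point about length concrete, here is a minimal sketch in R. It assumes, and nothing in this thread confirms it, that the SPSS "Component Matrix" scales each unit-length loading vector by the square root of its eigenvalue. The built-in USArrests data and cor = TRUE are used purely for illustration.

```r
# Illustration only: USArrests is a built-in data set, not the poster's data.
pc <- princomp(USArrests, cor = TRUE)

# princomp's loadings: each column has sum of squares one.
L <- unclass(loadings(pc))
colSums(L^2)

# Assumed alternative definition: multiply column j by the j-th component
# standard deviation, so its sum of squares becomes the j-th eigenvalue.
scaled <- sweep(L, 2, pc$sdev, `*`)
colSums(scaled^2)   # matches pc$sdev^2

# With cor = TRUE the rescaled loadings are the correlations between
# the original variables and the component scores.
cor(USArrests, pc$scores)
```

If the assumption holds, `scaled` is the matrix to compare against the SPSS output. Whole columns may still differ in sign, since the sign of a component is as arbitrary as its length.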
A subsidiary question: how does one approximate the "Kaiser's little jiffy" test for extracting the components (SPSS by default eliminates components with eigenvalues below 1)? I've been doing this with loadings(DV.prcomped)[,1:x] after inspecting the scree plot (to set x), but is there another way?
Eigenvalues of what exactly? The sdev component holds the square roots of the eigenvalues of the (possibly scaled) covariance matrix: maybe you intend this only for a correlation matrix? In R you have the source code, so if you know what you want you can find the pieces.
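One way to put those pieces together, sketched under the assumption that the eigenvalue-greater-than-one rule is meant for a correlation matrix (hence cor = TRUE). USArrests again stands in for the real data, and the names `ev` and `keep` are made up for this example.

```r
# Illustration only: USArrests stands in for the poster's data.
pc <- princomp(USArrests, cor = TRUE)

# Squared component standard deviations are the eigenvalues of the
# correlation matrix; they sum to the number of variables.
ev <- pc$sdev^2

# Keep the components whose eigenvalue exceeds 1.
keep <- which(ev > 1)

# drop = FALSE keeps a matrix even when only one component survives.
unclass(loadings(pc))[, keep, drop = FALSE]
```

This replaces picking x by eye from the scree plot with an explicit threshold. On a covariance matrix (cor = FALSE) the cutoff of 1 has no particular meaning.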
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595