How to tell if it's better to standardize your data matrix first when you do principal
So in which cases is it better to standardize the data matrix first? Also, is PCA generally used to predict the response variable? Should I keep that variable in my data matrix?
Uwe Ligges-3 wrote:
masterinex wrote:
Hi guys, I'm trying to do principal component analysis in R. There are two ways of doing it, I believe. One is doing principal component analysis right away; the other is standardizing the matrix first using s = scale(m) and then applying principal component analysis. How do I tell which result is better? What values in particular should I look at? I have already managed to find the eigenvalues and eigenvectors, and the proportion of variance for each eigenvector, using both methods.
Generally, it is better to standardize. But in some cases, e.g. when all variables are measured in the same units and their variances also reflect their importance, it might make sense not to do so. You should think about the analysis: you cannot know which result is 'better' unless you have an interpretation.
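To make the comparison concrete, here is a minimal sketch of both analyses side by side. The thread does not say which function or data were used, so `prcomp()` and the built-in `USArrests` data set are assumptions chosen only for illustration, and `prop_var()` is a hypothetical helper. In this data set one variable has a much larger variance than the others, which is the typical situation where the unstandardized first component picks up a larger share of the variance. That pattern is not guaranteed for every data set.

```r
# Built-in example data (an assumption; not the poster's data).
# The variables are measured on very different scales.
data(USArrests)

# PCA on the covariance matrix: variables are centred but not scaled.
pca_raw <- prcomp(USArrests, center = TRUE, scale. = FALSE)

# PCA on the correlation matrix: variables are centred and scaled to unit variance.
pca_std <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Hypothetical helper: proportion of total variance carried by each component.
prop_var <- function(p) p$sdev^2 / sum(p$sdev^2)

round(prop_var(pca_raw), 3)
round(prop_var(pca_std), 3)

# Per-variable variances show which variable drives the unscaled analysis.
sapply(USArrests, var)

# Standardizing by hand with scale() first is equivalent to scale. = TRUE.
pca_manual <- prcomp(scale(USArrests), center = FALSE, scale. = FALSE)
all.equal(pca_std$sdev, pca_manual$sdev)
```

Comparing the loadings in `pca_raw$rotation` and `pca_std$rotation` is usually more informative than comparing the variance proportions alone, since the loadings are what you interpret.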
I noticed that the proportion of variance for the first principal component was larger without standardizing. Is there a meaning to it? Isn't this always the case? Lastly, if I am supposed to predict a variable, e.g. weight, should I drop that variable from my data matrix when I do principal component analysis?
This sounds a bit like homework. If that is the case, please ask your teacher rather than this list. Anyway, it does not make sense to predict weight using a linear combination (principal component) that contains weight, does it? Uwe Ligges
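One way to act on the point about weight is to keep the response out of the matrix passed to the PCA and then regress it on the component scores. The following is only a sketch under assumptions: the built-in `mtcars` data set stands in for the poster's data, its `wt` column plays the role of weight, and keeping two components is an arbitrary choice.

```r
# Built-in example data (an assumption; wt is the car's weight).
data(mtcars)

# Separate the response from the predictors so wt never enters the PCA.
response   <- mtcars$wt
predictors <- mtcars[, setdiff(names(mtcars), "wt")]

# PCA on the standardized predictors only.
pca <- prcomp(predictors, center = TRUE, scale. = TRUE)

# Scores on the first two components (two is arbitrary here).
scores <- as.data.frame(pca$x[, 1:2])
scores$wt <- response

# Regress the response on the component scores.
fit <- lm(wt ~ PC1 + PC2, data = scores)
summary(fit)$r.squared
```

Because `wt` is removed before calling `prcomp()`, none of the components is a linear combination that already contains the variable being predicted.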