Message-ID: <1124030772.42ff59348c387@zeppo.wmin.ac.uk>
Date: 2005-08-14T14:46:12Z
From: R.P.Clement@westminster.ac.uk
Subject: PCA problem in R
In-Reply-To: <Pine.LNX.4.61.0508140714210.22738@gannet.stats>
Hi. I have two comments on this.
Quoting Prof Brian Ripley <ripley at stats.ox.ac.uk>:
> On Sat, 13 Aug 2005, Alan Zhao wrote:
>
> > When I have more variables than units, say a 195 x 10896 matrix with
> > 10896 variables and 195 samples, prcomp gives only 195 principal
> > components. I checked the help, but it does not explain why this
> > happens.
>
> There is not even a definition of a PC in the help. Did you read the
> references? This is what they are given for!
I don't know if it's too simple and introductory for the OP, but I quite like
Lindsay Smith's intro to PCA.
http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
> > Can we get more than 195 PCs for this case? Thank you very
> > much.
>
> Check out the theory in the references. You can, but all the remaining
> ones are constant across samples and not uniquely defined. You are likely
> to have trouble storing the coefficients (10701x10896 doubles is roughly 900 MB).
> It would be better to do whatever you intend to do with them without
> explicitly computing them.
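To make the point above concrete, here is a minimal sketch on a toy matrix. The sizes (5 samples, 20 variables), the seed and the simulated rnorm() data are my own assumptions, not the OP's data. prcomp() returns min(n, p) components, and because the columns are centred the last one has zero variance up to rounding. The final lines just do the storage arithmetic for the 10701 x 10896 extra coefficients, assuming 8-byte doubles.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
n <- 5; p <- 20                    # toy sizes standing in for 195 x 10896
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

pc <- prcomp(X)                    # centres the columns by default
dim(pc$rotation)                   # p rows, min(n, p) columns
pc$sdev                            # last value is zero up to rounding error

# Storage needed for the remaining coefficients in the original problem,
# assuming each one is held as an 8-byte double
extra_bytes <- (10896 - 195) * 10896 * 8
extra_bytes / 2^20                 # size in MiB
```

So with centred data only n - 1 components carry any variance, and anything beyond that is an arbitrary basis for the leftover directions.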
I've been using prcomp on data with 50 samples and 8000 variables. That
completes in acceptable time on a very modest (XP2000+/512M/rh9) machine.
Note, though, that I have only about a quarter as many samples as the OP
(50 versus 195).
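For anyone who wants to try this on their own machine, a rough sketch of the timing check follows. It uses simulated Gaussian data of the same shape as mine (50 x 8000), which is an assumption: real data may behave differently, and the elapsed time will depend entirely on the hardware.

```r
set.seed(42)                           # arbitrary seed
X <- matrix(rnorm(50 * 8000), nrow = 50, ncol = 8000)

# Time a single prcomp() call; the result is kept in pc for inspection
timing <- system.time(pc <- prcomp(X))
timing["elapsed"]                      # wall-clock seconds, machine dependent
dim(pc$x)                              # scores: one row per sample
```

Scaling the number of rows up towards 195 in the same script gives a feel for how the OP's case would compare.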
Cheers,
Ross-c