Message-ID: <1124030772.42ff59348c387@zeppo.wmin.ac.uk>
Date: 2005-08-14T14:46:12Z
From: R.P.Clement@westminster.ac.uk
Subject: PCA problem in R
In-Reply-To: <Pine.LNX.4.61.0508140714210.22738@gannet.stats>

Hi. I have two comments on this.

Quoting Prof Brian Ripley <ripley at stats.ox.ac.uk>:

> On Sat, 13 Aug 2005, Alan Zhao wrote:
>
> > When I have more variables than units, say a 195*10896 matrix which has
> > 10896 variables and 195 samples, prcomp gives only 195 principal
> > components. I checked the help, but it does not explain why this
> > happens.
>
> There is not even a definition of a PC in the help. Did you read the
> references?  This is what they are given for!

I don't know if it's too simple and introductory for the OP, but I quite like
Lindsay Smith's intro to PCA.

http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf

> > Can we get more than 195 PCs for this case? Thank you very
> > much.
>
> Check out the theory in the references.  You can, but all the remaining
> ones are constant across samples and not uniquely defined.  You are likely
> to have trouble storing the coefficients (10701x10896 is 800Mb).
> It would be better to do whatever you intend to do with them without
> explicitly computing them.

I've been using prcomp on data with 50 samples and 8000 variables. That
completes in acceptable time on a very modest machine (Athlon XP 2000+,
512MB RAM, Red Hat 9). I note, though, that I have only about a quarter as
many samples as the OP.
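
For anyone who wants to see the behaviour discussed above, here is a minimal
sketch on simulated data. The dimensions, the seed and the variable names are
made up for illustration; this is not the data from either post. It shows
that prcomp returns min(n, p) components for a wide matrix, that the last of
them has essentially zero standard deviation once the columns are centred,
and roughly what it would cost to store the remaining coefficients as doubles.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
n <- 50                           # samples (rows); illustrative size
p <- 2000                         # variables (columns); illustrative size
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

pc <- prcomp(x)                   # centres the columns by default

n_pcs    <- length(pc$sdev)       # number of components returned: min(n, p)
rot_dim  <- dim(pc$rotation)      # p rows, min(n, p) columns
last_rel <- pc$sdev[n] / pc$sdev[1]   # centring leaves rank n - 1, so this is ~0

# Storage for the remaining (10896 - 195) coefficient vectors in the OP's
# case, assuming 8 bytes per double.
coef_bytes <- (10896 - 195) * 10896 * 8
coef_mib   <- coef_bytes / 2^20   # about 890 MiB

print(n_pcs)
print(rot_dim)
print(last_rel)
print(coef_mib)
```

The storage estimate is of the same order as the figure quoted above, which
is why it is usually better to work with the leading components only.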

Cheers,

Ross-c