Reconstructing Datasets
On Wed, 2005-03-02 at 08:30 +0200, Jari Oksanen wrote:
On Tue, 2005-03-01 at 20:30 +0000, Laura Quinn wrote:
Hi, Is it possible to recreate "smoothed" data sets in R, by performing a PCA and then reconstructing a data set from say the first 2/3 EOFs? I've had a look in the help pages and don't seem to find anything relevant.
It's not in the R help, but in the books about PCA in help references. This can be done, not quite directly. Most of the hassle comes from the centring, and I guess in your case, from scaling of the results. I guess it is best to first scale the results like PCA would do, then make the low-rank approximation, and then de-scale: x <- scale(x, scale = TRUE) pc <- prcomp(x) Full rank will be: xfull <- pc$x %*% pc$rotation
Naturally, I forgot the transposition: xfull <- pc$x %*% t(pc$rotation) and the check: range(x - xfull) which should be something in magnitude 1e-12 or better (6e-15 in the test I run).
The eigenvalues already are incorporated in pc$x, and you don't have to care about them. Then rank=3 approximation will be: x3 <- pc$x[,1:3] %*% pc$rotation[,1:3]
and the same here: x3 <- pc$x[,1:3] %*% t(pc$rotation[,1:3]) The moral: cut-and-paste.
Then you have to "de-scale": x3 <- sweep(x3, 2, attr(x, "scaled:scale", "*") x3 <- sweep(x3, 2, attr(x, "scaled:center", "+")
And here you need to close the parentheses: x3 <- sweep(x3, 2, attr(x, "scaled:scale", "*")) x3 <- sweep(x3, 2, attr(x, "scaled:center", "+")) The moral #1: cut-and-paste. and #2: drink coffee in the morning.
And here you are. I wouldn't call this a smoothing, though. Library 'vegan' can do this automatically for PCA run with function 'rda', but there the scaling of raw results is non-conventional (though "biplot").
cheers, jari oksanen
Jari Oksanen <jarioksa at sun3.oulu.fi>