error with princomp
Faheem Mitha <faheem at email.unc.edu> writes:
I have a data set of 2061 rows and, originally, 99 columns. Now I guess it is going to be 97 columns, since the first column was all zeros (even Splus choked on this and I deleted it earlier) and the second one was all ones. Anyway, the first 64 (was 66) columns are binary data. The last 33 are numeric data. Now, I thought that a reasonable thing to do (in fact, the only thing I could think of) was to treat the first 64 columns as numeric zeros and ones, and then use the cor=TRUE flag (i.e. use the correlation matrix instead of the covariance matrix). This is advertised as a way of handling cases where the data are not all on the same scale. So that is what I did. Any comments/suggestions?
Whatever the software, you're likely to get into trouble trying to interpret the result of a PCA on binary data (using correlations or not). It can be tricky enough with continuous data...

Anyway, I bet some of those 64 binary columns will turn out to be linearly dependent, either overtly, by some of them summing to a constant, or more subtly, because some combinations are absent. A QR decomposition of your data matrix might be enlightening. Look at the rank and the pivoting information.

It does seem that we're handling the singular case suboptimally, though. Ideally, one should apply a fuzz factor before declaring that the matrix isn't non-negative definite (NND), and I don't think there's a problem with factoring out a null space in the PCA.
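
As a rough illustration of the QR check and the fuzz factor mentioned above, here is a minimal sketch on made-up data. The matrix X, its column names, and the tolerance 1e-8 are assumptions for illustration only. They are not the poster's data, and the tolerance is not what princomp uses internally.

```r
# Hypothetical stand-in for the real data: three binary indicator columns
# that sum to 1 in every row (an overt linear dependency), plus two
# numeric columns.
set.seed(1)
n <- 200
g <- sample(1:3, n, replace = TRUE)
X <- cbind(b1 = as.numeric(g == 1),
           b2 = as.numeric(g == 2),
           b3 = as.numeric(g == 3),
           x1 = rnorm(n),
           x2 = rnorm(n))

# Centre the columns first, since PCA works on centred data.
Xc <- scale(X, center = TRUE, scale = FALSE)

# QR decomposition: the rank is the number of linearly independent
# columns; columns judged dependent are pivoted to the end.
q <- qr(Xc)
q$rank
dropped <- colnames(Xc)[q$pivot[seq_len(ncol(Xc)) > q$rank]]
dropped

# Fuzz factor: treat eigenvalues of the correlation matrix as zero when
# they are tiny relative to the largest one (assumed tolerance).
ev <- eigen(cor(X), symmetric = TRUE, only.values = TRUE)$values
fuzz <- 1e-8 * max(ev)
nnd_ok <- all(ev > -fuzz)   # non-negative definite up to rounding error
n_comp <- sum(ev > fuzz)    # components left after dropping the null space
```

On the real data, the columns listed in dropped would be the ones to inspect for redundancy before running the PCA.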
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907