error with princomp

Faheem Mitha <faheem at email.unc.edu> writes:
Whatever the software, you're likely to get in trouble trying to
interpret the result of a PCA on binary data (using correlations or
not). It can be tricky enough with continuous data....

Anyway, I bet some of those 64 binary columns will turn out to be
linearly dependent, either overtly by some of them summing to a
constant or more subtly because of some combinations being absent.

A QR decomposition of your data matrix might be enlightening. Look at
the rank and the pivoting information. 
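As a rough illustration of that check, here is a minimal sketch on made-up data (the matrix X below is hypothetical, not the poster's 64 columns). Three indicator columns sum to one in every row, so after centring, as PCA does, one of them is redundant. qr() reports a rank below the number of columns and its pivot vector moves the redundant column to the end.

```r
# Hypothetical toy data: a, b, c are indicators of a 3-level grouping
# (they sum to 1 in every row); d is an unrelated binary column.
g <- rep(1:3, length.out = 20)
X <- cbind(a = as.numeric(g == 1),
           b = as.numeric(g == 2),
           c = as.numeric(g == 3),
           d = rep(c(0, 1), 10))

# Centre the columns so that "sums to a constant" becomes an exact
# linear dependency among the columns.
Xc <- scale(X, center = TRUE, scale = FALSE)

q <- qr(Xc)
q$rank     # numerical rank, smaller than ncol(Xc) if columns are dependent
q$pivot    # column order used; dependent columns are moved to the end

# Names of the columns that qr() treated as redundant
dropped <- colnames(Xc)[q$pivot[-seq_len(q$rank)]]
dropped
```

Which column ends up flagged depends on the column order, so the pivot tells you that a dependency exists among a set of columns rather than which one is "wrong".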

It does seem that we're handling the singular case suboptimally,
though. Ideally, one should apply a fuzz factor (a small numerical
tolerance) before declaring that the covariance matrix isn't
non-negative definite (NND), and I don't think there's a problem with
factoring out a null space in the PCA.
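To make the idea concrete, here is a sketch of what a fuzz factor plus factoring out the null space could look like when done by hand. This is not what princomp does internally. The tolerance (square root of machine epsilon times the largest eigenvalue) is an assumption chosen for illustration, and the data are the same hypothetical toy matrix as above.

```r
# Hypothetical toy data with one exact linear dependency after centring.
g <- rep(1:3, length.out = 20)
X <- cbind(a = as.numeric(g == 1),
           b = as.numeric(g == 2),
           c = as.numeric(g == 3),
           d = rep(c(0, 1), 10))

S <- cov(X)
e <- eigen(S, symmetric = TRUE)

# Assumed fuzz factor: eigenvalues within tol of zero are treated as zero.
tol <- sqrt(.Machine$double.eps) * max(abs(e$values))

# Only complain if an eigenvalue is negative beyond rounding error.
if (any(e$values < -tol)) {
  stop("covariance matrix is not non-negative definite")
}

# Factor out the null space: keep the components with non-negligible variance.
keep     <- e$values > tol
sdev     <- sqrt(e$values[keep])
loadings <- e$vectors[, keep, drop = FALSE]
scores   <- scale(X, center = TRUE, scale = FALSE) %*% loadings
```

Here three of the four components survive. prcomp(), which works from a singular value decomposition of the centred data instead of an eigendecomposition of the covariance matrix, is an existing alternative that avoids the NND check altogether.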