LDA with previous PCA for dimensionality reduction

Dear Cristoph,

I guess you want to assess the error rate of an LDA that has been fitted to a 
set of currently existing training data, and that in the future you will get 
some new observation(s) for which you want to make a prediction.
Then, I'd say that you want to use the second approach. The PCA step may well 
turn out to be crucial: your whole subsequent LDA is contingent on the PC 
scores you obtain in that step, so the PCA is part of the procedure whose 
error rate you are estimating. Somewhat similar issues have been discussed in 
the microarray literature. Two references are:

@ARTICLE{ambroise-02,
  author = {Ambroise, C. and McLachlan, G. J.},
  title = {Selection bias in gene extraction on the basis of microarray 
gene-expression data},
  journal = {Proc Natl Acad Sci USA},
  year = {2002},
  volume = {99},
  pages = {6562--6566},
  number = {10},
}

@ARTICLE{simon-03,
  author = {Simon, R. and Radmacher, M. D. and Dobbin, K. and McShane, L. M.},
  title = {Pitfalls in the use of DNA microarray data for diagnostic and 
prognostic classification},
  journal = {Journal of the National Cancer Institute},
  year = {2003},
  volume = {95},
  pages = {14--18},
  number = {1},
}

I am not sure, though, why you use PCA followed by LDA. But that's another 
story.
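
For what it's worth, here is a minimal sketch of what I understand the second 
approach to be, assuming it means redoing the PCA inside every 
cross-validation fold, so that the held-out observations never influence the 
PC loadings. It is plain Python (standard library only) rather than a real 
analysis: the function names (first_pc, pc_scores, fit_lda_1d, 
predict_lda_1d, cv_error) and the simulated data are made up for 
illustration, and it keeps a single PC, two classes and equal priors, which 
are simplifying assumptions of mine and not anything from your setup.

```python
import random


def first_pc(X, iters=200):
    """Column means and first principal direction, found by power iteration."""
    n, p = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[x - m for x, m in zip(row, mu)] for row in X]
    v = [1.0 / p ** 0.5] * p  # arbitrary unit-length starting vector
    for _ in range(iters):
        # w = Xc' (Xc v), i.e. proportional to (sample covariance) times v
        s = [sum(a * b for a, b in zip(row, v)) for row in Xc]
        w = [sum(s[i] * Xc[i][j] for i in range(n)) for j in range(p)]
        norm = sum(a * a for a in w) ** 0.5
        if norm == 0.0:
            break
        v = [a / norm for a in w]
    return mu, v


def pc_scores(X, mu, v):
    """Project rows of X on direction v, centring with the TRAINING means mu."""
    return [sum((x - m) * a for x, m, a in zip(row, mu, v)) for row in X]


def fit_lda_1d(z, y):
    """Two-class LDA on one score. Assumption: equal priors, so the rule
    reduces to 'assign to the nearer class mean' and only the means matter."""
    z0 = [s for s, lab in zip(z, y) if lab == 0]
    z1 = [s for s, lab in zip(z, y) if lab == 1]
    return sum(z0) / len(z0), sum(z1) / len(z1)


def predict_lda_1d(z, m0, m1):
    return [0 if abs(s - m0) <= abs(s - m1) else 1 for s in z]


def cv_error(X, y, k=5, seed=0):
    """k-fold CV error with the PCA refitted on the training part of each fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = 0
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        mu, v = first_pc(X_train)  # PCA sees training rows only
        m0, m1 = fit_lda_1d(pc_scores(X_train, mu, v), y_train)
        pred = predict_lda_1d(pc_scores([X[i] for i in test], mu, v), m0, m1)
        errors += sum(p != y[i] for p, i in zip(pred, test))
    return errors / len(X)


# Simulated two-class data (hypothetical): 30 observations per class,
# 5 variables, class 1 shifted by 3 units in every variable.
rng = random.Random(1)
X, y = [], []
for label in (0, 1):
    for _ in range(30):
        X.append([rng.gauss(3.0 * label, 1.0) for _ in range(5)])
        y.append(label)

err = cv_error(X, y)
print("cross-validated error rate:", err)
```

The point is only the placement of the first_pc call inside the loop: 
computing the PC scores once on all the data and then cross-validating just 
the LDA would let the test observations leak into the loadings.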

Best,

R.
