lda() called with data=subset() command

2 messages · Christoph Lehmann, Brian Ripley

#
Hi
I have a data.frame with a grouping variable having the levels 

C, 
mild AD, 
mod AD, 
O and 
S

Since I want to compute an LDA only for the two groups 'C' and 'mod AD', I
call lda with data=subset(mydata.pca, GROUP == 'mod AD' | GROUP == 'C'):


my.lda <- lda(GROUP ~ Comp.1 + Comp.2 + Comp.3 + Comp.4 + Comp.5 +
              Comp.6 + Comp.7 + Comp.8,
              data = subset(mydata.pca, GROUP == 'mod AD' | GROUP == 'C'),
              CV = TRUE)

this results in the warning "group(s) mild AD O S are empty in:
lda.default(x, grouping, ...)" of course...

my.lda$class now shows 

 [1] C       C       C       C       C       C       C       C       C
[10] C       C       C       C       C       C       C       C       C
[19] C       C       C       mild AD mild AD mild AD mild AD mild AD mild AD
[28] mild AD C       mild AD mild AD mild AD C       C       mild AD mild AD
[37] mild AD mild AD
Levels: C mild AD mod AD O S

It seems it just took the second level (mild AD) as the label for the second
class, even though the second level was not used in the lda computation (only
the first level (C) and the third level (mod AD) were).

what shall I do to resolve this (little) problem?

thanks for a hint

christoph
#
I presume this is lda from the uncredited package MASS and you ignored the
advice to ask the maintainer?

The short answer is `don't ignore the warning', and set up a proper data 
frame with just the groups you actually want.
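
One way to set up such a data frame is sketched below. It is only an illustration: the data are simulated stand-ins for mydata.pca (which is not shown in the thread), only two components are used, and the name two.groups is made up for the example. The idea is to subset first and then re-create the factor so that the empty levels disappear before lda sees them.

```r
library(MASS)  # provides lda()

set.seed(1)
# Hypothetical stand-in for mydata.pca: five groups, two components
mydata.pca <- data.frame(
  GROUP  = factor(rep(c("C", "mild AD", "mod AD", "O", "S"), each = 10)),
  Comp.1 = rnorm(50),
  Comp.2 = rnorm(50)
)

# Keep only the two groups of interest
two.groups <- subset(mydata.pca, GROUP %in% c("C", "mod AD"))

# Re-creating the factor drops the levels that are now empty
# (droplevels(two.groups) does the same in R >= 2.12.0)
two.groups$GROUP <- factor(two.groups$GROUP)
levels(two.groups$GROUP)

# No empty groups are left, so lda has nothing to warn about
my.lda <- lda(GROUP ~ Comp.1 + Comp.2, data = two.groups, CV = TRUE)
table(observed = two.groups$GROUP, predicted = my.lda$class)
```

With the unused levels gone, my.lda$class can only carry the labels C and mod AD.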

As a quick fix, look in lda.default, find the line that looks like

        cl <- factor(max.col(dist), levels=seq(along=lev1), labels=lev1)

and alter it to read exactly as shown.  (You will need fixInNamespace to do so.)
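
The mislabelling can be reproduced in isolation. The sketch below assumes that the unpatched line indexed into the full level vector (called lev here, a name assumed for illustration) while the distance matrix only had columns for the non-empty groups (lev1). The vector idx is a made-up example of what max.col might return.

```r
lev  <- c("C", "mild AD", "mod AD", "O", "S")  # all levels, including empty ones
lev1 <- c("C", "mod AD")                       # only the groups actually present

# Made-up column indices, as max.col() would return for a two-column matrix
idx <- c(1, 2, 2, 1)

# Indexing into the full level vector labels column 2 as "mild AD"
wrong <- factor(idx, levels = seq(along = lev), labels = lev)

# Indexing into the non-empty levels labels column 2 as "mod AD"
right <- factor(idx, levels = seq(along = lev1), labels = lev1)

as.character(wrong)
as.character(right)
```

To edit the function interactively, fixInNamespace("lda.default", "MASS") opens it in an editor. The change lasts only for the current session.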
On Mon, 5 Jan 2004, Christoph Lehmann wrote: