Linear Discriminant Analysis error: "Variables appear constant"
- Reduce the model to a reasonable size, with far fewer variables than observations.
- Code factors as factors rather than numerics.
- Don't use variables that are perfectly correlated with others, nor any duplicates.

Best,
Uwe Ligges
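As a rough illustration of the second and third points, here is a minimal sketch in base R. The data frame `toy` and its columns (`site_type`, `Flow`, `FlowCopy`) are made up for the example and are not taken from the poster's data.

```r
# Made-up data: 'site_type' is a category that happens to be stored as numeric codes
toy <- data.frame(
  Cluster   = c(1, 1, 2, 2, 3, 3),
  site_type = c(1, 2, 2, 3, 1, 3),
  Flow      = c(0.4, 1.1, 0.9, 2.3, 0.2, 1.7)
)
toy$FlowCopy <- toy$Flow   # deliberate duplicate column for the demonstration

# Code categories as factors so they are not treated as continuous numbers
toy$Cluster   <- factor(toy$Cluster)
toy$site_type <- factor(toy$site_type)

# duplicated() on the list of columns flags exact repeats of an earlier column
dup_cols  <- names(toy)[duplicated(as.list(toy))]
toy_clean <- toy[, !names(toy) %in% dup_cols]

print(dup_cols)
str(toy_clean)
```

This only catches exact duplicates; columns that are merely rescaled copies of one another need a correlation or rank check instead.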
On 17.05.2011 15:46, Songer, Katherine B - DNR wrote:
Uwe,

Thank you very much for looking at this. I'm attaching the data, in case you have any wisdom on why variables 10, 38, and 42 would appear constant. Meanwhile, I'll remove colinear variables and read up a little more...

Thanks,
Katie

-----Original Message-----
From: Uwe Ligges [mailto:ligges at statistik.tu-dortmund.de]
Sent: Tuesday, May 17, 2011 04:25 AM
To: Songer, Katherine B - DNR
Cc: r-help at r-project.org
Subject: Re: [R] Linear Discriminant Analysis error: "Variables appear constant"

On 16.05.2011 22:07, Songer, Katherine B - DNR wrote:
Hi R experts, I'm attempting to run Linear Discriminant Analysis using the lda function in the MASS package. I've got around 50 predictor variables and one response variable. My response variable has 5 numeric categories that represent different clusters of fish abundance data (clusters were developed using Bray-Curtis and NMDS), and my predictor variables are environmental variables that might influence the fish data. These data all came from 68 sampling locations. I'm getting an error message:
DALogFish<-lda(Cluster~DrainArea+Flow+StrmWidth+Gradient+NatComm+FishIBIUsed
+QHEI+QHEIsub+QHEImwh_h+QHEIcov+QHEIchan+QHEIrip+QHEIpool+QHEIrif+QHEIgrads+
QHEIgradv+QHEImwh+QHEIcovtype+QHEIwwh+QHab+QHabBuff+QHabEros+QhabPool+
QHabWDRatio+QHabRif+QHabFines+QHabCov+QHabRating+QHabSize+TP+TKN+NH3+NH3Min
+NO3NO2N+BOD+TSS+TSSMax+TDS+SSC+SSCMax+Chloride+Sulfate+Ecoli+ChlA+DOper+
DOperMin+DOperMin1_5+DOmgL+DOmgLMean+DOmgLMax+Cond+pH+pHMax+Trans+Temp+
TempMin+Temp4+Crop100+Crop500+CropSub+Dev100+Dev500+DevSub+For100+For500+
ForSub+Pas100+Pas500+PasSub+Wat100+Wat500+WatSub+Wet100+Wet500+WetSub+
Undev100+Undev500+UndevTotal+Undev100NoPas+Undev500NoPas+UndevTotNoPas,
data=AllData1, na.action="na.omit", CV=TRUE)
Error in lda.default(x, grouping, ...) :
variables 10 38 42 appear to be constant within groups
When I look at the variables listed, they don't appear "constant within the groups" to me.
We do not know, since we do not have the data.
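For readers who hit the same message: as far as I can tell from the MASS source, lda() subtracts each group's mean from its members and stops when the standard deviation of what is left falls below `tol` (default 1e-4) for some column. The reported numbers appear to index columns of the model matrix rather than terms of the formula, so they can shift once factors are expanded. The sketch below reproduces that kind of check on made-up data; the helper `within_group_sd` and the data frame `toy` are hypothetical names, not part of MASS.

```r
# Hypothetical helper: standard deviation of each predictor after
# subtracting the mean of the group each observation belongs to.
within_group_sd <- function(x, grouping) {
  x <- as.matrix(x)
  g <- as.factor(grouping)
  # ave() returns the group mean for every observation, column by column
  centered <- apply(x, 2, function(col) col - ave(col, g))
  sqrt(diag(var(centered)))
}

# Made-up data: 'flag' varies overall but is constant inside every cluster
toy <- data.frame(
  Cluster = rep(1:3, each = 4),
  width   = c(2.1, 3.4, 2.8, 3.0, 5.2, 4.9, 6.1, 5.5, 8.3, 7.9, 9.1, 8.8),
  flag    = rep(c(0, 1, 0), each = 4)
)

sds <- within_group_sd(toy[, c("width", "flag")], toy$Cluster)
constant_within <- names(sds)[sds < 1e-4]   # 1e-4 mirrors the default tol of lda()
print(sds)
print(constant_within)
```

A variable can look far from constant when printed as a whole column and still be flagged, because only the variation left inside each cluster counts. Dropping rows with missing values can also leave a small group with a single distinct value.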
I'm new to LDA and am wondering what this error means... Are my data somehow not in the right format? Should I remove colinear variables? (All variables have been normalized.)
Yes, colinear variables should be removed. Note also that you have roughly as many variables in the model as observations (or even more). This won't work either. I think you should read a textbook on the mechanics behind LDA.

Uwe Ligges
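Both points can be checked before calling lda(). The following is a minimal sketch on made-up data, using only base R; the data frame `toy` and the 1e-8 threshold are assumptions chosen for the illustration.

```r
# Made-up data: 6 observations, 4 predictors; 'b2' is an exact multiple of 'b'
toy <- data.frame(
  a = c(1.2, 0.7, 3.1, 2.2, 1.9, 2.8),
  b = c(5, 3, 8, 1, 4, 7),
  c = c(0.3, 0.9, 0.1, 0.5, 0.8, 0.2)
)
toy$b2 <- 2 * toy$b

x <- as.matrix(toy)

# Observations that survive na.omit versus number of predictors
n_obs  <- nrow(na.omit(toy))
n_pred <- ncol(x)

# A rank below the number of columns signals exact linear dependence
rank_x <- qr(scale(x, scale = FALSE))$rank

# List pairs whose absolute correlation is essentially 1
cm <- abs(cor(x))
cm[lower.tri(cm, diag = TRUE)] <- 0          # keep each pair only once
hits <- which(cm > 1 - 1e-8, arr.ind = TRUE)
perfect_pairs <- data.frame(
  var1 = rownames(cm)[hits[, "row"]],
  var2 = colnames(cm)[hits[, "col"]]
)

cat("observations:", n_obs, " predictors:", n_pred, " rank:", rank_x, "\n")
print(perfect_pairs)
```

The pairwise check finds only two-variable dependence; a rank deficit with no perfect pair points to a dependence among three or more variables, for example a set of percentages that sum to 100.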
Thanks very much!
Katie
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.