
Please help me!!!! Error in `[.data.frame`(x, , retained, drop = FALSE) : undefined columns selected

4 messages · Max Kuhn, bbslover

#
I am learning the "caret" package. After I run the "rfe" function, I get
the following error:

Error in `[.data.frame`(x, , retained, drop = FALSE) : 
  undefined columns selected
In addition: Warning message:
In predict.lm(object, x) :
  prediction from a rank-deficient fit may be misleading


I tried the example from the manual and it works fine, so the problem is
with my data. I do not know the reason.

My code is:

  subsets<-c(1:5,10,15,20,25)
  ctrl<-rfeControl(functions=lmFuncs, method  = "cv", 
            verbose=FALSE,returnResamp="final")
  lmProfile<-rfe(trainDescr,trainY,sizes=subsets,rfeControl=ctrl)

Before this, I did some pre-processing. My data is in the attachment.

Please help me. Thank you!

kevin

Attachments:
http://n4.nabble.com/file/n996068/trainDescr.txt (trainDescr.txt)
http://n4.nabble.com/file/n996068/trainY.txt (trainY.txt)
#
Your data set has 217 predictors and 166 samples. If you read the
vignette on feature selection for this package, you'll see that the
default ranking mechanism that it uses for linear models requires a
linear model fit. The warning:

   >  prediction from a rank-deficient fit may be misleading

should tell you something. If it doesn't: with more predictors than
samples the model is underdetermined and there is no unique solution, so
many of the parameter estimates are NA.
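
To make the point concrete, here is a minimal sketch in base R. It uses
simulated data (not the poster's files), and the sizes n = 20 and p = 30 are
arbitrary assumptions chosen only so that predictors outnumber samples:

```r
# Simulated data only: 20 samples, 30 predictors (more predictors than samples)
set.seed(1)
n <- 20
p <- 30
x <- as.data.frame(matrix(rnorm(n * p), nrow = n))
y <- rnorm(n)

fit <- lm(y ~ ., data = cbind(x, y = y))

# lm() can estimate at most n coefficients (intercept included);
# the remaining ones are reported as NA
n_missing <- sum(is.na(coef(fit)))
n_missing
fit$rank

# Names of the predictors whose coefficients could not be estimated
head(names(coef(fit))[is.na(coef(fit))])
```

A ranking step that relies on these coefficients has nothing to work with for
the NA terms, which is probably why the later column selection fails.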

Either create a modified version of lmFuncs that suits your needs or
remove variables prior to modeling (or try some other method that
doesn't require more samples than predictors, such as the lasso or
elasticnet).
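
As a rough sketch of the "remove variables prior to modeling" option, the
following filters highly correlated predictors with caret's findCorrelation()
before calling rfe(). The data are simulated, the 0.90 cutoff and the subset
sizes are arbitrary assumptions, and the names x_filt and high_cor are made up
for this illustration:

```r
library(caret)

# Simulated stand-in: 50 samples, 30 informative-looking columns plus
# 30 near-duplicates, so 60 predictors in total (more than samples)
set.seed(1)
n <- 50
base <- matrix(rnorm(n * 30), nrow = n)
dup  <- base + matrix(rnorm(n * 30, sd = 0.01), nrow = n)
x <- as.data.frame(cbind(base, dup))
y <- base[, 1] - base[, 2] + rnorm(n)

# findCorrelation() returns the column indices it suggests removing;
# guard against an empty result, since x[, -integer(0)] would drop everything
high_cor <- findCorrelation(cor(x), cutoff = 0.90)
x_filt <- if (length(high_cor) > 0) x[, -high_cor, drop = FALSE] else x

# Check that samples now outnumber predictors before using lmFuncs
ncol(x_filt) < n

ctrl <- rfeControl(functions = lmFuncs, method = "cv",
                   verbose = FALSE, returnResamp = "final")
lmProfile <- rfe(x_filt, y, sizes = c(5, 10, 20), rfeControl = ctrl)
```

If predictors still outnumber samples after filtering, a penalized method
such as the lasso is the other route mentioned above.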

Max
On Fri, Jan 1, 2010 at 10:14 PM, bbslover <dluthm at yeah.net> wrote:

  
    
#
Thanks,

I have reduced the number of descriptors, and the error is gone. My field
is QSAR, but what criterion should be used to select descriptors, and how
many descriptors should be selected? That is my problem. I calculate my
descriptors through E-Dragon and apply the wonderful caret package, but my
results are poor. How can I improve the performance?

I think Max is an expert in this field. Can you give me some suggestions on
how to learn QSAR well and build good linear and nonlinear models? I am the
only one here doing QSAR research, and I have no software for calculating
descriptors except free ones. I only know E-Dragon; are there others?

Are there also good tools for doing QSAR? Thank you again.

kevin!
Max Kuhn wrote: