I noticed a post from June 2011 where a user reported this message, and the
ultimate problem was that the importance measure was being conditioned on
too many variables (47). I have only a small number of variables here, so I
assumed that was not the problem.

Another suggestion was that there could be a factor with too many levels.
In my case, all of the variables are continuous. Term 1 (x1 below) is the
day of the year, which does happen to take the integer values 1 ... 366.
But the variable has class numeric, not integer, so I don't believe cforest
would treat it as a factor, although I do not know how to tell whether
cforest is treating something as continuous or as a factor.
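
The closest check I know of is on the R side. Below is a minimal sketch,
using a made-up data frame `toy` (not my real data), that lists each
column's class and the number of levels of any factor columns. It shows how
R stores the columns, which I assume is what cforest goes by, but I have
not confirmed that.

```r
# Hypothetical toy data frame standing in for the real `df`
toy <- data.frame(
  day  = as.numeric(c(1, 100, 366)),  # whole numbers stored as numeric
  site = factor(c("a", "b", "a")),    # a genuine factor with 2 levels
  y    = c(0.0, 1.3, 5.4)
)

# Storage class of every column
col_classes <- sapply(toy, class)

# Names of the columns that are factors, and how many levels each has
factor_cols <- names(toy)[sapply(toy, is.factor)]
n_levels    <- sapply(toy[factor_cols], nlevels)

print(col_classes)
print(n_levels)
```

Running the same three calls on the real data frame would show whether any
column is stored as a factor.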
Thank you for any help you can provide. I am running R 2.13.1 with party
0.9-99994. You can download the data from
http://www.duke.edu/~jjr8/data.rdata (512 KB). Here is the complete
code:
load("\\Temp\\data.rdata")
nrow(df)
               y     x1      x2      x3    x4         x5     x6         x7        x8
Min.       0.000    1.0  0.0000    1.00    52   0.008184  16.71  0.0000000   0.02727
1st Qu.    0.000  105.0  0.0000   30.00  1290   6.747035  23.92  0.0000000   0.11850
Median     1.282  169.0  0.2353   38.00  1857  11.310277  26.35  0.0001569   0.14625
Mean       5.651  178.7  0.2555   55.03  1907  12.889021  26.31  0.0162043   0.20684
3rd Qu.    5.353  262.0  0.4315   47.00  2594  18.427410  28.95  0.0144660   0.20095
Max.     195.238  366.0  1.0000  400.00  3832  29.492380  31.73  0.3157486  11.76877
      x1       x2       x3       x4       x5       x6       x7       x8
1.374583 1.252250 1.021672 1.218801 1.015124 1.439868 1.075546 1.060580
mycontrols <- cforest_unbiased(ntree=50, mtry=3)  # Small forest but requires a few minutes
myforest <- cforest(y ~ ., data=df, controls=mycontrols)
varimp(myforest)
       x1         x2        x3        x4       x5        x6       x7       x8
11.924498 103.180195 16.228864 30.658946 5.053500 12.820551 2.113394 6.911377
varimp(myforest, conditional=TRUE)
Error in model.matrix.default(as.formula(f), data = blocks) :
term 1 would require 9e+12 columns
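
For reference, the error is raised by model.matrix, which creates one
indicator column per factor level, and one per combination of levels for an
interaction term. Here is a toy illustration of that growth in base R. The
factors `a` and `b` are invented for the example, and this is only a sketch
of how column counts can multiply, not a reproduction of what varimp does
internally.

```r
# Made-up factors, for illustration only (not from data.rdata)
toy <- data.frame(
  a = factor(rep(c("a1", "a2", "a3"), times = 4)),      # 3 levels
  b = factor(rep(c("b1", "b2", "b3", "b4"), each = 3))  # 4 levels
)

# One indicator column per level of a single factor (no intercept)
cols_single <- ncol(model.matrix(~ a - 1, data = toy))

# One column per combination of levels for an interaction term
cols_interaction <- ncol(model.matrix(~ a:b - 1, data = toy))

print(c(single = cols_single, interaction = cols_interaction))
```

The interaction count is the product of the level counts, so a term built
from a few variables with many distinct values each can reach a very large
number of columns quickly.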