Message-ID: <D5FA03935F7418419332B61CA255F65F6B52F23A03@USCTMXP51012.merck.com>
Date: 2012-05-29T19:18:30Z
From: Liaw, Andy
Subject: Question about random Forest function in R
In-Reply-To: <1338302810.80531.YahooMailNeo@web121606.mail.ne1.yahoo.com>

Hi Kelly,

The function has a limitation: it cannot handle any column in your "x" that is a categorical variable (a factor) with more than 32 categories. One possibility is to see if you can "bin" some of the categories together so that each such column has 32 categories or fewer.
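
As a rough sketch of the binning idea, using base R only: the helper lump_levels() and the toy "Train" data below are made up for illustration and are not part of the randomForest package. The sketch assumes that merging the rarest categories into one "Other" level is acceptable for your data, and that no existing category is already called "Other".

```r
# Hypothetical helper (not from randomForest): keep the most frequent
# levels of a factor and merge all remaining levels into one "Other" level,
# so that the result has at most max_levels levels.
lump_levels <- function(f, max_levels = 32, other = "Other") {
  f <- droplevels(as.factor(f))
  if (nlevels(f) <= max_levels) return(f)
  counts <- sort(table(f), decreasing = TRUE)
  keep <- names(counts)[seq_len(max_levels - 1)]   # leave room for "Other"
  out <- as.character(f)
  out[!is.na(out) & !(out %in% keep)] <- other
  factor(out, levels = c(keep, other))
}

# Toy stand-in for "Train": one factor with 40 levels, one numeric column.
Train <- data.frame(
  site  = factor(rep(paste0("s", 1:40), times = 1:40)),
  depth = seq_len(820)
)

# Find the factor columns that exceed the 32-category limit.
n_lev <- sapply(Train, function(col) if (is.factor(col)) nlevels(col) else 0L)
too_many <- names(n_lev)[n_lev > 32]

# Bin only those columns; all other columns are left untouched.
Train[too_many] <- lapply(Train[too_many], lump_levels)
```

Whether lumping by frequency makes sense depends on what the categories mean; grouping them by subject-matter knowledge may well be a better choice.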

Andy 

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Kelly Cool
Sent: Tuesday, May 29, 2012 10:47 AM
To: r-help at r-project.org
Subject: [R] Question about random Forest function in R



Hello,

I am trying to run the randomForest function on a data.frame using the following code:

myrf <- randomForest(y = sample_data_metal, x = Train, importance = TRUE, proximity = TRUE)


However, an error occurs saying, "can not handle categorical predictors with more than 32 categories".

My "x=Train" data.frame is quite large and my "y=sample_data_metal" is one column.

I'm not sure how to fix this error, or whether there is even a way around it. Thanks in advance for any help.
