Discretize factors?
I could, but with close to 100 columns, it's messy.
On 5/16/10 11:22 AM, Peter Ehlers wrote:
On 2010-05-16 11:06, Noah Silverman wrote:
Update,
I have it working, but now it's producing really ugly labels. Must be a
small adjustment to the code. Any ideas??
##Create example data.frame
group<- c("A", "B","B","C","C","C")
a<- c(1,4,3,4,5,6)
b<- c(5,4,5,3,4,5)
d<- data.frame(a, b, group)  # no cbind(): it would coerce a and b to character
#create new frame with discretized group
cbind(d[,1:2], model.matrix(~0+d[,3]) )
  a b d[, 3]A d[, 3]B d[, 3]C
1 1 5       1       0       0
2 4 4       0       1       0
3 3 5       0       1       0
4 4 3       0       0       1
5 5 4       0       0       1
6 6 5       0       0       1
So, as you can see, it works, but the labels for the groups don't look right.
I then tried using the column name instead of the number and still got ugly results:
cbind(d[,1:2], model.matrix(~0+d[,"group"]) )
  a b d[, "group"]A d[, "group"]B d[, "group"]C
1 1 5             1             0             0
2 4 4             0             1             0
3 3 5             0             1             0
4 4 3             0             0             1
5 5 4             0             0             1
6 6 5             0             0             1
Any ideas?
Can't you just use names(...) <- c() on your final dataframe?
-Peter Ehlers
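For what it's worth, here is a minimal sketch of two ways to get cleaner labels with the example data from this thread. It assumes the ugly names come from model.matrix() deparsing the expression d[, 3] into the column label. The object names mm, mm2, out1 and out2 are made up for illustration.

```r
# Same example data as in the thread, built with data.frame() directly
# so that a and b stay numeric.
d <- data.frame(
  a = c(1, 4, 3, 4, 5, 6),
  b = c(5, 4, 5, 3, 4, 5),
  group = factor(c("A", "B", "B", "C", "C", "C"))
)

# Option 1: name the column in the formula and pass data =,
# so the labels are built from the variable name "group".
mm <- model.matrix(~ 0 + group, data = d)
out1 <- cbind(d[, c("a", "b")], mm)

# Option 2: keep the original call and replace the deparsed prefix afterwards.
mm2 <- model.matrix(~ 0 + d[, 3])
colnames(mm2) <- sub("d[, 3]", "group", colnames(mm2), fixed = TRUE)
out2 <- cbind(d[, 1:2], mm2)

names(out1)
names(out2)
```

Option 1 avoids any renaming step, which may matter more when there are many columns to handle.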
-N
On 5/15/10 11:02 AM, Noah Silverman wrote:
Hi,
I'm looking for an easy way to discretize factors in R.
I've noticed that the lm function does this automatically with a nice
result.
If I have
group<- c("A", "B","B","C","C","C")
and run:
lm(result ~ x1 + group)
The lm function has split the group into separate binary variables {0,1}
before performing the regression. I now have:
groupA
groupB
groupC
Some of the other models that I want to try won't accept factors, so
they need to be discretized this way.
Is there a command in R for this, or some easy shortcut? (I tried
digging into the lm code, but couldn't find where this is being done.)
Thanks!
-N
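As a rough sketch of an answer to the question above: lm() builds its design matrix with model.matrix(), and that function can be called directly. The example reuses the a, b and group vectors from this thread. The names X_default, X_full and X_all are hypothetical.

```r
d <- data.frame(
  a = c(1, 4, 3, 4, 5, 6),
  b = c(5, 4, 5, 3, 4, 5),
  group = factor(c("A", "B", "B", "C", "C", "C"))
)

# Default coding, as used inside lm(): an intercept plus treatment contrasts,
# so the first level (A) is the baseline and gets no column of its own.
X_default <- model.matrix(~ a + group, data = d)
colnames(X_default)

# Dropping the intercept (~ 0 + ...) gives one 0/1 column per level.
X_full <- model.matrix(~ 0 + a + group, data = d)
colnames(X_full)

# "." expands to every column of d, which saves typing with many columns.
# Caveat: without an intercept only the first factor in the formula gets a
# column for every level; later factors are still coded with contrasts.
X_all <- model.matrix(~ 0 + ., data = d)
colnames(X_all)
```

The result is a numeric matrix, so it can be passed to modelling functions that do not accept factors, or turned back into a data frame with as.data.frame().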