coded to categorical variables in a large dataset
sj wrote:
I am working with a dataset where there are 5 possible outcomes (coded 1:5), and I would like to create 5 categorical variables (event1...event5). I am using a for loop and if statements, but with a large dataset (approx. 100,000 rows) it takes quite a bit of time. Is there a way to speed this up? Here is some sample code of what I am currently doing.
Here is one way you might do it:

    X <- sample(1:5, 100, replace=TRUE)
    # Your 5 event variables in a matrix
    model.matrix(lm(rnorm(length(X)) ~ as.factor(X) - 1))

Also, along the lines of your approach below, the following using ifelse() might be better:

    event3 <- ifelse(test2 == 3, 1, 0)

I'm sure other people will post different solutions, probably more elegant than these.
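The ifelse() idea can be applied to all five codes in one step. The following is only a sketch under some assumptions: the names `events` and `events_mm` are made up for illustration, `test2` is rebuilt here so the snippet runs on its own, and `model.matrix()` is called on a one-sided formula rather than through a fitted `lm()`, which should give the same indicator columns.

```r
# Sketch only: 'events' and 'events_mm' are illustrative names, not from the thread.
test2 <- rep(1:5, 2000)

# Compare every element of test2 against each code 1..5 in one vectorized call.
# outer() returns a length(test2) x 5 logical matrix; adding 0L turns it into 0/1 integers.
events <- outer(test2, 1:5, "==") + 0L
colnames(events) <- paste0("event", 1:5)

# The same indicators from model.matrix() directly, without fitting a throwaway lm().
events_mm <- model.matrix(~ factor(test2, levels = 1:5) - 1)

# Pull out a single indicator as a plain vector if separate variables are needed.
event3 <- events[, "event3"]
```

Keeping the indicators together in one matrix avoids managing five separate vectors, and no explicit loop over rows is involved, so this would likely scale better to 100,000 rows than the element-by-element approach.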
test2 <- rep(1:5, 2000)
# test2 is a vector, so use length(); nrow() returns NULL for a vector
event1 <- rep(0, length(test2))
event2 <- rep(0, length(test2))
event3 <- rep(0, length(test2))
event4 <- rep(0, length(test2))
event5 <- rep(0, length(test2))
for (i in 1:length(event1))
{
    if (test2[i] == 1)
    {
        event1[i] = 1
    }
    if (test2[i] == 2)
    {
        event2[i] = 1
    }
    if (test2[i] == 3)
    {
        event3[i] = 1
    }
    if (test2[i] == 4)
    {
        event4[i] = 1
    }
    if (test2[i] == 5)
    {
        event5[i] = 1
    }
}
thanks,
Spencer
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894