Tim,
From: Tim Howard

Andy,

Thank you for the help. Yes, my question really did make it seem like I was going through a lot of unnecessary steps just to define the levels of a variable. But that was just for the example.

In my application, I bring new datasets into R on a daily basis. While the data differ, the variables are the same, and the categorical variables have the same levels. So I find myself applying the same factor and level definitions every day (by cutting and pasting a large chunk of commands from a text file). It really would be simpler to have it wrapped up in a function. That's why I asked the question about putting this into a function.

Upon reading your answer, I thought maybe I could use your example and use the super-assignment '<<-' in the function. But your method assigns levels without defining the variable as a factor (interesting!).
levels(y$one) <- seq(1, 9, by=2)
y$one
[1] 1 1 3 3 5 7
attr(,"levels")
[1] 1 3 5 7 9
is.factor(y$one)
[1] FALSE
Ouch! "levels<-" is generic, and the default method simply attaches the levels attribute to the object. You need to coerce the object into a factor explicitly.
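A minimal sketch of the explicit coercion, using the sample data frame y from the original post. The call to factor() is the same one Tim uses inside his own function; nothing here goes beyond base R.

```r
# Sample data frame from the original post
y <- data.frame(one = c(1, 1, 3, 3, 5, 7), two = c(2, 2, 6, 6, 8, 8))

# factor() sets both the class and the levels in one step,
# including levels (here 9) that do not occur in the data
y$one <- factor(y$one, levels = seq(1, 9, by = 2))

is.factor(y$one)
levels(y$one)
```

With this, is.factor(y$one) should no longer report FALSE.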
Unfortunately, whenever I try to use <<- with the dataframe as the variable, I get an error message:
fncFact <- function(datfra){
  datfra$one <<- factor(datfra$one, levels=c(1,3,5,7,9))
}
fncFact(y)
Error in fncFact(y) : Object "datfra" not found
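A likely explanation for this error, offered as a sketch rather than something stated in the thread: a replacement such as datfra$one <<- value has to read the existing datfra first, and '<<-' looks for it in the enclosing environments of the function, not in the function's own frame where the argument lives. The snippet below reproduces the failure and captures the message; it assumes no object called datfra exists in the workspace.

```r
y <- data.frame(one = c(1, 1, 3, 3, 5, 7), two = c(2, 2, 6, 6, 8, 8))

fncFact <- function(datfra) {
  # '<<-' searches the enclosing environments for 'datfra',
  # skipping the local argument of the same name
  datfra$one <<- factor(datfra$one, levels = c(1, 3, 5, 7, 9))
}

# Capture the error message instead of stopping
res <- tryCatch(fncFact(y), error = function(e) conditionMessage(e))
res
```

The exact wording of the message differs between R versions, but it names datfra as the object that could not be found.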
I believe the canonical way of doing something like this in R is something along the lines of:
processData <- function(dat) {
dat$f1 <- factor(dat$f1, levels=...)
... ## any other manipulations you want to do
dat
}
Then when you get new data, you just do:
newData <- processData(newData)
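As a concrete, runnable sketch of this pattern, here it is applied to the sample data frame from the original post. The column name one and its levels come from that post; a real processData would list one such line per categorical variable.

```r
processData <- function(dat) {
  # Define the factor and its full set of levels
  dat$one <- factor(dat$one, levels = c(1, 3, 5, 7, 9))
  # Return the modified copy; the caller decides where to store it
  dat
}

y <- data.frame(one = c(1, 1, 3, 3, 5, 7), two = c(2, 2, 6, 6, 8, 8))
y <- processData(y)

is.factor(y$one)
levels(y$one)
```

Because the function returns the data frame instead of modifying anything outside itself, neither '<<-' nor assign() is needed.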
HTH,
Andy
Tim
"Liaw, Andy" <andy_liaw at merck.com> 4/20/2005 4:03:24 PM >>>
Wouldn't it be easier to do this?
levels(y$one) <- seq(1, 9, by=2)
y$one
[1] 1 1 3 3 5 7
attr(,"levels")
[1] 1 3 5 7 9

Andy
From: Tim Howard

R-help,

After cogitating for a while, I finally figured out how to define a data.frame column as factor and assign the levels within a function... BUT I still need to pass the data.frame and its name separately. I can't seem to find any other way to pass the name of the data.frame, rather than the data.frame itself. Any suggestions on how to go about it? Is there something like value(object) or name(object) that I can't find?
# sample dataframe for this example
y <- data.frame( one=c(1,1,3,3,5,7), two=c(2,2,6,6,8,8))
levels(y$one) # check out levels
NULL
# the function I've come up with
fncFact <- function(datfra, datfraNm){
datfra$one <- factor(datfra$one, levels=c(1,3,5,7,9))
assign(datfraNm, datfra, pos=1)
}
fncFact(y, "y")
levels(y$one)
[1] "1" "3" "5" "7" "9"

I suppose only for aesthetics and simplicity, I'd like to pass only the data.frame and get the same result.

Thanks in advance,
Tim Howard
version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.1
year     2004
month    11
day      15
language R
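On the question of recovering the name of the data.frame from the object itself: one base-R idiom, not proposed anywhere in this thread, is deparse(substitute(arg)), which returns the expression the caller typed as a string. The sketch below rewrites fncFact to take a single argument; using parent.frame() as the assignment target is an assumption that the data frame lives where the function is called from. Andy's suggestion of returning the modified data frame avoids this side effect altogether.

```r
fncFact <- function(datfra) {
  # Capture the caller's expression as a string; this must happen
  # before 'datfra' is modified inside the function
  datfraNm <- deparse(substitute(datfra))
  datfra$one <- factor(datfra$one, levels = c(1, 3, 5, 7, 9))
  # Side effect: overwrite the object of that name in the calling frame
  assign(datfraNm, datfra, envir = parent.frame())
  invisible(NULL)
}

y <- data.frame(one = c(1, 1, 3, 3, 5, 7), two = c(2, 2, 6, 6, 8, 8))
fncFact(y)
levels(y$one)
```

This only makes sense when the argument is a plain variable name; passing an expression such as a subset of a data frame would produce a string that is not a usable object name.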
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html