Assign factor and levels inside function
Aha! You've just opened the door to another level for this blundering R user. I even went back to my well-used copy of "An Introduction to R" to see where I missed this standard approach for processing new data. Nothing states it outright, but it is certainly alluded to in many of the function examples. I don't know why I was stuck in that rut.

I'm sure 99.9% of you on this list know this, but... To be clear for anyone searching these archives later: Don't bother to ask your function to make assignments to pos=1 (the global environment), just do the assignment yourself when calling the function.

For example, instead of coding a function call like this:

  processData(dat)

to assign the processed data to pos=1, simply make the assignment when calling the function:

  dat <- processData(dat)

Thanks for being gentle on me, Andy.

Tim
"Liaw, Andy" <andy_liaw at merck.com> 4/21/2005 9:57:22 PM >>>
Tim,
From: Tim Howard

Andy,

Thank you for the help. Yes, my question really did seem like I was going through a lot of unnecessary steps just to define levels of a variable. But that was just for the example. In my application, I bring new datasets into R on a daily basis. While the data differs, the variables are the same, and the categorical variables have the same levels. So I find myself daily applying the same factor and level definitions (by cutting and pasting the large chunk of commands from a text file). It really would be simpler to have it wrapped up in a function. That's why I asked the question about putting this into a function.

Upon reading your answer, I thought maybe I could use your example and use the super-assignment '<<-' in the function. But, your method assigns levels, but does not define the var as a factor (interesting!).
> levels(y$one) <- seq(1, 9, by=2)
> y$one
[1] 1 1 3 3 5 7
attr(,"levels")
[1] 1 3 5 7 9
> is.factor(y$one)
[1] FALSE
Ouch! "levels<-" is generic, and the default method simply attaches the levels attribute to the object. You need to coerce the object into a factor explicitly.
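A minimal sketch of the difference described above, using the sample data frame from the original post. The copy named y2 is made up for this illustration; everything else is base R.

```r
# Sample data frame from the original post
y <- data.frame(one = c(1, 1, 3, 3, 5, 7), two = c(2, 2, 6, 6, 8, 8))

# Assigning levels to a numeric column only attaches a "levels" attribute
y2 <- y                               # y2: hypothetical copy, for comparison only
levels(y2$one) <- seq(1, 9, by = 2)
is.factor(y2$one)                     # FALSE, as reported in the thread

# Explicit coercion with factor() produces a real factor
y$one <- factor(y$one, levels = c(1, 3, 5, 7, 9))
is.factor(y$one)                      # TRUE
levels(y$one)
```

The second form is the one to use inside any wrapper function, since later code that tests is.factor() or tabulates by level depends on the column actually being a factor.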
Unfortunately, whenever I try to use <<- with the dataframe as the variable, I get an error message:
> fncFact <- function(datfra){
+   datfra$one <<- factor(datfra$one, levels=c(1,3,5,7,9))
+ }
> fncFact(y)
Error in fncFact(y) : Object "datfra" not found
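One way to read this error, as I understand R's assignment rules: '<<-' does not look at the function's own arguments. It searches for an existing object named datfra starting in the enclosing environment (here the global environment), and no such object exists there. The sketch below reproduces the failure and captures the message rather than stopping; the variable msg is introduced only for this illustration, and the exact wording of the message differs between R versions.

```r
y <- data.frame(one = c(1, 1, 3, 3, 5, 7), two = c(2, 2, 6, 6, 8, 8))

fncFact <- function(datfra) {
  # `<<-` tries to modify a `datfra` found outside this function;
  # the local argument of the same name is not a candidate
  datfra$one <<- factor(datfra$one, levels = c(1, 3, 5, 7, 9))
}

# Capture the error message instead of halting (msg is a hypothetical name)
msg <- tryCatch(fncFact(y), error = function(e) conditionMessage(e))
msg

# The caller's data frame is left untouched
is.factor(y$one)
```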
I believe the canonical way of doing something like this in R is something along the lines of:
processData <- function(dat) {
    dat$f1 <- factor(dat$f1, levels=...)
    ... ## any other manipulations you want to do
    dat
}
Then when you get new data, you just do:
newData <- processData(newData)
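A runnable sketch of this pattern, assuming the column name 'one' and the levels c(1,3,5,7,9) from the sample data earlier in the thread stand in for the elided 'f1' and 'levels=...'. Those specifics are assumptions for illustration, not part of the reply above.

```r
# Return the modified data frame; the caller decides where to store it
processData <- function(dat) {
  # Column name and levels borrowed from the thread's sample data
  # (assumed stand-ins for the elided specifics)
  dat$one <- factor(dat$one, levels = c(1, 3, 5, 7, 9))
  dat
}

newData <- data.frame(one = c(1, 1, 3, 3, 5, 7), two = c(2, 2, 6, 6, 8, 8))
newData <- processData(newData)   # assignment happens at the call site

is.factor(newData$one)
levels(newData$one)
table(newData$one)                # the unused level 9 is kept, with count 0
```

Because the function has no side effects, the same call works unchanged for each day's data frame, whatever it is named.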
HTH,
Andy
Tim
"Liaw, Andy" <andy_liaw at merck.com> 4/20/2005 4:03:24 PM >>>
Wouldn't it be easier to do this?
> levels(y$one) <- seq(1, 9, by=2)
> y$one
[1] 1 1 3 3 5 7
attr(,"levels")
[1] 1 3 5 7 9

Andy
From: Tim Howard

R-help,

After cogitating for a while, I finally figured out how to define a data.frame column as factor and assign the levels within a function... BUT I still need to pass the data.frame and its name separately. I can't seem to find any other way to pass the name of the data.frame, rather than the data.frame itself. Any suggestions on how to go about it? Is there something like value(object) or name(object) that I can't find?
> # sample dataframe for this example
> y <- data.frame(one=c(1,1,3,3,5,7), two=c(2,2,6,6,8,8))
> levels(y$one)   # check out levels
NULL
> # the function I've come up with
> fncFact <- function(datfra, datfraNm){
+   datfra$one <- factor(datfra$one, levels=c(1,3,5,7,9))
+   assign(datfraNm, datfra, pos=1)
+ }
> fncFact(y, "y")
> levels(y$one)
[1] "1" "3" "5" "7" "9"

I suppose only for aesthetics and simplicity, I'd like to pass only the data.frame and get the same result.

Thanks in advance,
Tim Howard
> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    2
minor    0.1
year     2004
month    11
day      15
language R
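On the literal question above (recovering the name of the data frame from the argument itself), base R does offer deparse(substitute(x)). The sketch below is only an illustration of that idiom applied to the function from this post; the replies in this thread recommend returning the data frame and assigning at the call site instead. Note that substitute() has to be called before the argument is modified inside the function.

```r
y <- data.frame(one = c(1, 1, 3, 3, 5, 7), two = c(2, 2, 6, 6, 8, 8))

fncFact <- function(datfra) {
  # Recover the caller's expression for the argument as a string, e.g. "y"
  datfraNm <- deparse(substitute(datfra))
  datfra$one <- factor(datfra$one, levels = c(1, 3, 5, 7, 9))
  # Same side effect as the two-argument version: overwrite the object in pos = 1
  assign(datfraNm, datfra, pos = 1)
  invisible(datfra)
}

fncFact(y)
levels(y$one)
```

This only behaves sensibly when the argument is a plain variable name; a call such as fncFact(mylist$y) would create an oddly named object rather than update the list element.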
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
------------------------------------------------------------------------------
Notice: This e-mail message, together with any attachments, contains information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station, New Jersey, USA 08889), and/or its affiliates (which may be known outside the United States as Merck Frosst, Merck Sharp & Dohme or MSD and in Japan, as Banyu) that may be confidential, proprietary copyrighted and/or legally privileged. It is intended solely for the use of the individual or entity named on this message. If you are not the intended recipient, and have received this message in error, please notify us immediately by reply e-mail and then delete it from your system.
------------------------------------------------------------------------------