[plyr] Question regarding ddply: use of .(as.name(varname)) and varname in ddply function
7 messages · Sunny Srivastava, Peter Ehlers, jim holtman +1 more
On 2010-12-06 01:58, Sunny Srivastava wrote:
Dear R-Helpers:
I am trying to use *ddply* to extract the min and max of a particular
column in a data.frame. I am using two different forms of the function:
## var_name_to_split is a string -- something like "var1", which is the name
## of a column in the data.frame
ddply(df, .(as.name(var_name_to_split)), function(x) c(min(x[, 3]), max(x[, 3])))
## fails with an error - case 1
ddply(df, var_name_to_split, function(x) c(min(x[, 3]), max(x[, 3])))
## works fine - case 2
I can't understand why I get the error in case 1. Can someone help me please?
Thank you in advance.
S.
[ snip ]
Try it without the .(), i.e. ddply(df, as.name(), ....)
Peter Ehlers
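The suggestion to drop the .() wrapper can be sketched as follows. This is only an illustration, assuming plyr is installed; the helper name min_max is hypothetical, the data frame keeps just the g10 and d1 columns of the sample data posted in this thread, and the column is addressed by name rather than by position.

```r
library(plyr)

# Two columns taken from the sample data posted in this thread.
df <- data.frame(
  g10 = c(1L, 1L, 1L, 10L, 10L, 10L),
  d1  = c(0.917769870675383, 0.959256755797054, 0.772928570498006,
          0.473545787883884, 0.590580940273922, 0.0448629265021484)
)
grp <- "g10"

# Hypothetical helper: min and max of d1 within one group.
min_max <- function(x) c(min = min(x$d1), max = max(x$d1))

# Case 2 from the question: pass the column name as a string.
by_string <- ddply(df, grp, min_max)

# The suggestion above: as.name() without the .() wrapper, so that
# as.name(grp) is evaluated before ddply sees it.
by_name <- ddply(df, as.name(grp), min_max)

print(by_string)
print(by_name)
```

Both calls should split on g10 and return one row per group.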
Here is another approach to try:
require(data.table)
var <- "g10"
df <- data.table(df)
str(df)
Classes 'data.table' and 'data.frame': 6 obs. of 5 variables:
 $ g10: int 1 1 1 10 10 10
 $ l1 : num 0.41 0.607 0.64 -1.478 -1.482 ...
 $ d1 : num 0.918 0.959 0.773 0.474 0.591 ...
 $ l13: num 0.08037 -0.29174 -0.00191 0.29589 0.61538 ...
 $ d13: num -1.408 -1.275 -1.412 0.709 0.276 ...
df[,list(min=min(d1), max = max(d1)), by = eval(var)]
     g10        min       max
[1,]   1 0.77292857 0.9592568
[2,]  10 0.04486293 0.5905809

On Mon, Dec 6, 2010 at 4:58 AM, Sunny Srivastava <research.baba at gmail.com> wrote:
[ snip ]
----------
Here is the reproducible code: https://gist.github.com/730069
Here is sample data:
structure(list(
  g10 = c(1L, 1L, 1L, 10L, 10L, 10L),
  l1 = c(0.410077661080032, 0.607497980054711, 0.640488621149069,
         -1.47837849145189, -1.48199933642397, -1.42815840788069),
  d1 = c(0.917769870675383, 0.959256755797054, 0.772928570498006,
         0.473545787883884, 0.590580940273922, 0.0448629265021484),
  l13 = c(0.0803696045647364, -0.291741079837731, -0.00191015929550312,
          0.295889063381279, 0.615383505686296, 0.71991154637985),
  d13 = c(-1.40821713632015, -1.27501365601403, -1.41150703235157,
          0.708943640186729, 0.276034890463749, 0.663383934998686)),
  .Names = c("g10", "l1", "d1", "l13", "d13"),
  row.names = c(1L, 2L, 3L, 1758L, 1759L, 1760L),
  class = "data.frame")
-----------
If someone doesn't want to open github - here is the code
## Doesn't work
# grp -- name of a column of the data.frame df
# function call is -- getMinMax1(df1, grp = "var1")
getMinMax1 <- function(df, grp) {
  dfret <- ddply(df, .(as.name(grp)),  ## I am using as.name(grp), source of error
    function(x) {
      minmax <- c(min(x[, 3]), max(x[, 3]))
      return(minmax)
    }
  )
  return(dfret)
}
## Works fine
# grp -- name of a column of the data.frame df
# function call is -- getMinMax2(df1, grp = "var1")
getMinMax2 <- function(df, grp) {
  dfret <- ddply(df, grp,  ## using the quoted variable name passed to grp when the fun is called
    function(x) {
      minmax <- c(min(x[, 3]), max(x[, 3]))
      return(minmax)
    }
  )
  return(dfret)
}
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
On Mon, Dec 6, 2010 at 3:58 AM, Sunny Srivastava <research.baba at gmail.com> wrote:
[ snip ]
I can't understand why I get the error in case 1. Can someone help me please?
Why do you expect case 1 to work?

Hadley
--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
It's easiest to see what's going on if you use eval.quoted directly:
eval.quoted(.(cyl), mtcars)
eval.quoted(.("cyl"), mtcars)
eval.quoted(.(as.name("cyl")), mtcars)
But you shouldn't need to do any syntactic hackery because the default
method automatically parses the string for you:
eval.quoted(as.quoted("cyl"), mtcars)
Hadley
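The three eval.quoted calls above can be approximated in base R, without plyr, to see why .(as.name("cyl")) fails. This is a rough sketch under the assumption that .() simply captures its arguments unevaluated and that eval.quoted evaluates each captured expression inside the data frame; the variable names are illustrative only.

```r
# What each form of .() would capture, written with base R's quote().
quoted_col  <- quote(cyl)              # like .(cyl): a bare symbol
quoted_str  <- quote("cyl")            # like .("cyl"): a string constant
quoted_call <- quote(as.name("cyl"))   # like .(as.name("cyl")): an unevaluated call

# Evaluate each captured expression with mtcars as the environment.
r1 <- eval(quoted_col, mtcars)    # the cyl column, one value per row
r2 <- eval(quoted_str, mtcars)    # just the string "cyl"
r3 <- eval(quoted_call, mtcars)   # the symbol cyl, not the column it names

length(r1)        # one value per row of mtcars
is.character(r2)  # TRUE
is.name(r3)       # TRUE: evaluation stops one step short of the column

# Evaluating as.name("cyl") first and only then looking the symbol up
# in the data frame does reach the column.
fixed <- eval(as.name("cyl"), mtcars)
identical(fixed, mtcars$cyl)
```

Only the first form yields a vector usable for splitting. In the third, as.name() runs inside the quoted expression, so its result is a symbol rather than the column that symbol names.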
On Mon, Dec 6, 2010 at 6:22 PM, Sunny Srivastava <research.baba at gmail.com> wrote:
Hi Hadley:
I was trying to use ddply with the format .(var1) for splitting. I thought .(as.name(grp)) would do the same thing, but it does not. I was just trying to understand my mistake. I am sorry if it is a basic question.
Thank you and others for your reply.
Best Regards,
S.