New PLYR issue
Replying to old messages without including context (particularly old ones) is rather bad netiquette. Thank you for at least providing a reproducible example. Now if you can figure out how to read the documentation we will really make some progress. Further responses below.
On Tue, 17 Jan 2012, Gunnar Oehmichen wrote:
Hello everyone, I have got the same problem, with the same error message.
I wasn't able to draw a comparison between the problems, though the error messages were the same.
Using R 2.14.1, plyr 1.7.1, R.Studio 0.94.110, Windows XP
The plyr mailing list has not provided any help so far.
require(plyr)
c(sample(c(1:100), 50, replace=TRUE))->V1
Much better to use " <- " than "->" for clarity of code (spaces and direction of assignment make a difference for readability)
c(rep( 1:5, 10))->f1 #variable to group V1
data.frame(cbind(V1, f1))->DF
str(DF)
ddply(DF$V1, DF$f1, "sd")
ddply(.(DF$V1), .(DF$f1), "sd")
Error in if (empty(.data)) return(.data) :
missing value where TRUE/FALSE needed
Thanks everyone,
If you hand a toothpick to a mechanic you should not be surprised when he tells you he cannot change a tire on your car. You are giving a vector where a data frame is needed, another vector where a name or vector of names is required, and the name of a function where an actual function is needed, and the function is complaining. In the face of such confusion, it is not surprising that people were unable to figure out where to start setting you straight. However, in return for your reproducible example I will give it a go.

A basic unifying concept for the plyr package is that the name of the function tells you something about what needs to go in, and what will come out. "ddply" starts with a "d" so it expects a data frame as input, and because the second letter is also a "d" it will yield a data frame result when it is done.

Argument 1: DF$V1 is a vector. It happens to be the column named V1 in the data frame DF. To specify a data frame, don't apply operators to it, just write the name of the data frame: DF.

Argument 2: This argument tells ddply what the names of the grouping columns are. Do not actually give the grouping columns themselves to ddply (which is what $ does). While the .() function seems cleaner, I find it clearer to use a vector of strings ... in this case, there is only one grouping column, so I would forego the usual c() concatenator and just give it "f1".

Argument 3: This argument is supposed to be a function that will take a data frame (first d) and yield a data frame (second d) for one group of rows. ddply will take care of stacking them as a single data frame for the final result. You have given ddply the name (first error) of a function that takes a vector and returns a scalar (wrong type of function is error two).

The correct documentation for all of these arguments can be found by typing ?ddply at the R command line (after you have loaded plyr).
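To see the mismatches described above for yourself, a quick check along these lines may help. This is only a sketch: the small DF here is a made-up stand-in for the one in the original post, and the variable names holding the results are hypothetical.

```r
# Hypothetical stand-in for the data frame in the original post
DF <- data.frame(V1 = c(10, 20, 30, 40), f1 = c(1, 2, 1, 2))

arg1_ok  <- is.data.frame(DF)      # TRUE: DF itself is a data frame
arg1_bad <- is.data.frame(DF$V1)   # FALSE: $ extracts a plain vector
arg2_ok  <- is.character("f1")     # TRUE: a column *name*, not the column
arg3_bad <- is.function("sd")      # FALSE: a string is not a function
arg3_ok  <- is.function(sd)        # TRUE: the function object itself

print(c(arg1_ok, arg1_bad, arg2_ok, arg3_bad, arg3_ok))
```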
It looks like you have been reading the documentation for ?aggregate or ?summaryBy (doBy package) and trying to use that to inform your use of ddply. So the actual call should be:
ddply(DF,"f1",function(df){data.frame(sdV1=sd(df$V1))})
f1 sdV1
1 1 19.93016
2 2 35.96356
3 3 33.30349
4 4 26.62831
5 5 25.03087
In general, to add more simultaneous calculations, you add more columns to the data frame produced by your function that does the calculations. If you want to give it a function name, don't put it in quotes:
myfunction <- function(df){
  data.frame(sdV1=sd(df$V1),meanV1=mean(df$V1))
}
ddply(DF,"f1",myfunction)
f1 sdV1 meanV1
1 1 19.93016 49.1
2 2 35.96356 45.6
3 3 33.30349 44.7
4 4 26.62831 72.2
5 5 25.03087 30.1
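For anyone who wants to rerun the whole thing from scratch, here is a self-contained sketch of the corrected example using " <- " assignment. The set.seed() call is my addition so that the numbers repeat from run to run (the original post did not set a seed, so the values will not match the output above), and the tapply() cross-check is only there as a sanity test against base R.

```r
library(plyr)

set.seed(42)  # assumption: added for repeatability, not in the original post
V1 <- sample(1:100, 50, replace = TRUE)
f1 <- rep(1:5, 10)            # variable to group V1
DF <- data.frame(V1, f1)

# Takes the data frame for one group, returns a one-row data frame
myfunction <- function(df) {
  data.frame(sdV1 = sd(df$V1), meanV1 = mean(df$V1))
}

res <- ddply(DF, "f1", myfunction)
print(res)

# Cross-check the per-group standard deviations against base R
check <- tapply(DF$V1, DF$f1, sd)
print(all.equal(res$sdV1, as.numeric(check)))
```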
Note that although ddply does a lot for you, it doesn't reproduce all of
your calculations on all of the data columns like summaryBy does... you
have to explicitly create every calculated column in your function.
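If you do want the same calculation on every data column, one way is to have your own function loop over the columns. The following is a rough sketch under a few assumptions: V2 is a hypothetical second data column that I added for illustration, sd_all is a made-up helper name, and the grouping column name "f1" is hard-coded inside the helper.

```r
library(plyr)

set.seed(1)  # assumption: added for repeatability
DF <- data.frame(V1 = sample(1:100, 50, replace = TRUE),
                 V2 = sample(1:100, 50, replace = TRUE),  # hypothetical extra column
                 f1 = rep(1:5, 10))

# Hypothetical helper: sd of every column except the grouping column,
# returned as a one-row data frame for the group
sd_all <- function(df) {
  vals <- df[setdiff(names(df), "f1")]
  as.data.frame(lapply(vals, sd))
}

res_all <- ddply(DF, "f1", sd_all)
print(res_all)
```

The point is the same as above: whatever columns you want in the result, your function has to build them.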
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>      Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k