While putting my R code into functions, I've encountered a ddply function nesting issue and need a bit of advice on the proper way to fix it. I've tried several approaches, but none worked, and I need to be able to use the "cut", "range", and "fullseq" functions within ddply. (For some background on that, refer to http://finzi.psych.upenn.edu/Rhelp08/2009-February/187331.html)

Thus, to preserve that functionality and put my code within functions, I need an architecture similar to the following, where you end up running:

function_nesting()

Unfortunately this produces errors within ddply, which does not appear to recognize variables or functions defined inside the enclosing function. Thank you for any advice about how to proceed.

determine_counts<-function()
{
        min_range<-1
        max_range<-30
        bin_range_size<-5

        Me_df<-data.frame(Data = c(1:15), Person = "Me")
        You_df<-data.frame(Data = c(10:20), Person = "You")
        Them_df<-data.frame(Data = c(15:25), Person = "Them")

        Group_df_tmp<-rbind(Me_df,You_df)
        Group_df<-rbind(Group_df_tmp,Them_df)
        Group_df$Person <- factor(Group_df$Person, levels = c("Them", "You", "Me"))

        #counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range(Data), 5)), Person), nrow)

        # Approach 1
        counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range(c(Group_df$Data, min_range, max_range)), bin_range_size)), Person), nrow)

        # Approach 2
        range_tmp<-range(c(Group_df$Data, min_range, max_range))
        counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range_tmp, bin_range_size)), Person), nrow)

        names(counts) <- c("Bin", "Person", "Frequency")
        qplot(Person, Frequency, data = counts, fill = Person, geom="bar", stat="identity", width = 0.9, xlab="Person") + facet_grid(. ~ Bin)
}

function_nesting<-function()
{
        determine_counts()
}

However, if the code is just run straight through without being nested, it works fine:

        min_range<-1
        max_range<-30
        bin_range_size<-5

        Me_df<-data.frame(Data = c(1:15), Person = "Me")
        You_df<-data.frame(Data = c(10:20), Person = "You")
        Them_df<-data.frame(Data = c(15:25), Person = "Them")

        Group_df_tmp<-rbind(Me_df,You_df)
        Group_df<-rbind(Group_df_tmp,Them_df)
        Group_df$Person <- factor(Group_df$Person, levels = c("Them", "You", "Me"))

        #counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range(Data), 5)), Person), nrow)
        counts <- ddply(Group_df, .(cut(Data, breaks=fullseq(range(c(Group_df$Data, min_range, max_range)), bin_range_size)), Person), nrow)

Unfortunately this is not within a function, so thanks again for any advice on how to approach this issue.
ddply function nesting problems
3 messages · Baptiste Auguie, Jason Rupert
Hi,
I think your ddply call with a calculation inside ".( )" is the
problem. Are you sure you need to do this? Performing the cut outside
ddply seems to work fine,
library(plyr)     # ddply() and .()
library(ggplot2)  # qplot() and facet_grid(); fullseq() is assumed to come from
                  # ggplot2 here (recent releases provide it in the scales package)

determine_counts<-function()
{
        min_range<-1
        max_range<-30
        bin_range_size<-5
        Me_df<-data.frame(Data = c(1:15), Person = "Me")
        You_df<-data.frame(Data = c(10:20), Person = "You")
        Them_df<-data.frame(Data = c(15:25), Person = "Them")
        Group_df_tmp<-rbind(Me_df,You_df)
        Group_df<-rbind(Group_df_tmp,Them_df)
        Group_df$Person <- factor(Group_df$Person,
                                  levels = c("Them", "You", "Me"))
        # Compute the bins first, as an ordinary column of the data frame ...
        Group_df <- transform(Group_df,
                              cut=cut(Data,
                                      breaks=fullseq(range(c(Data, min_range, max_range)),
                                                     bin_range_size)))
        # ... so that ddply only has to group by existing columns.
        counts <- ddply(Group_df, .(cut, Person), nrow)
        names(counts) <- c("Bin", "Person", "Frequency")
        qplot(Person, Frequency, data = counts,
              fill = Person, geom="bar", stat="identity", width = 0.9,
              xlab="Person") +
                facet_grid(. ~ Bin)
}
function_nesting()   # defined in the original post; it simply calls determine_counts()
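For comparison, the same "bin first, then count" idea can be sketched in base R alone, which avoids evaluating any expression inside ".( )". This is only an illustration, not the code from the thread: full_breaks() and count_by_bin() are hypothetical names, and full_breaks() merely approximates fullseq() by rounding the range outward to multiples of the bin size.

```r
# Hypothetical helper: round a range outward to multiples of `size`,
# approximating what fullseq() is used for in this thread.
full_breaks <- function(x, size) {
  r <- range(x)
  seq(floor(r[1] / size) * size, ceiling(r[2] / size) * size, by = size)
}

# Hypothetical wrapper: bin the data first, then count rows per (Bin, Person).
count_by_bin <- function(df, min_range = 1, max_range = 30, bin_size = 5) {
  # The limits are plain function arguments, so they are found by ordinary
  # lexical scoping rather than by non-standard evaluation.
  breaks <- full_breaks(c(df$Data, min_range, max_range), bin_size)
  df$Bin <- cut(df$Data, breaks = breaks)
  counts <- as.data.frame(table(Bin = df$Bin, Person = df$Person))
  names(counts)[3] <- "Frequency"
  # Drop empty (Bin, Person) combinations, as counting rows per group would.
  counts[counts$Frequency > 0, ]
}

Group_df <- rbind(
  data.frame(Data = 1:15,  Person = "Me"),
  data.frame(Data = 10:20, Person = "You"),
  data.frame(Data = 15:25, Person = "Them")
)
counts <- count_by_bin(Group_df)
print(counts)
```

Because the breaks are computed as an ordinary local variable before any grouping happens, the function behaves the same whether it is called directly or from inside another function.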
HTH,
baptiste
2009/11/19 Jason Rupert <jasonkrupert at yahoo.com>:
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Awesome!
Thanks a ton!
I guess I had overlooked how it was really working.
I will still have to reflect on why it worked when run straight through, but not when nested.
That is kind of a mystery. Oh well...
Thanks again.
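One plausible explanation for the mystery, which the thread itself does not confirm, is ordinary R scoping: a helper that captures an expression and evaluates it later can only see variables in the environment it evaluates in. At top level every variable is global, so the lookup succeeds; inside a function the variables are local and may not be visible. The base-R sketch below only illustrates that mechanism. eval_in_data() is a hypothetical stand-in and is not plyr's actual implementation.

```r
# Hypothetical stand-in for a helper that evaluates a captured expression
# against the columns of `data`, falling back to `env` for any other names.
eval_in_data <- function(expr, data, env) eval(expr, data, env)

f <- function() {
  local_offset <- 100                  # exists only inside f()
  df <- data.frame(Data = 1:3)
  expr <- quote(Data + local_offset)   # captured, not yet evaluated
  list(
    # Falling back to the calling function's environment finds local_offset.
    caller = eval_in_data(expr, df, environment()),
    # Falling back to the global environment does not, so an error is raised;
    # its message is returned here instead of stopping the script.
    global = tryCatch(eval_in_data(expr, df, globalenv()),
                      error = function(e) conditionMessage(e))
  )
}

res <- f()
print(res)
```

If this is what happened in the original code, it would also explain why computing the bins with transform() before calling ddply works: by the time ddply runs, the grouping variables are plain columns and no local variable needs to be looked up.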
----- Original Message ----
From: baptiste auguie <baptiste.auguie at googlemail.com>
To: Jason Rupert <jasonkrupert at yahoo.com>
Cc: R-help at r-project.org
Sent: Thu, November 19, 2009 9:24:29 AM
Subject: Re: [R] ddply function nesting problems