grouping function
4 messages · Geoffrey Smith, Sarah Goslee, arun
Hi,
On Tue, May 8, 2012 at 2:17 PM, Geoffrey Smith <gps at asu.edu> wrote:
Hello, I would like to write a function that makes a grouping variable for
some panel data. The grouping variable is made conditional on the begin
year and the end year. Here is the code I have written so far.
name <- c(rep('Frank',5), rep('Tony',5), rep('Edward',5));
begin <- c(seq(1990,1994), seq(1991,1995), seq(1992,1996));
end <- c(seq(1995,1999), seq(1995,1999), seq(1996,2000));
df <- data.frame(name, begin, end);
df;
Thanks for providing reproducible data. Two minor points: you don't need ; at the end of lines, and calling your data frame df is confusing because there's a df() function.
#This is the part I am stuck on;
makegroup <- function(x,y) {
  group <- 0
  if (x <= 1990 & y > 1990) {group==1}
  if (x <= 1991 & y > 1991) {group==2}
  if (x <= 1992 & y > 1992) {group==3}
  return(x,y)
}
makegroup(df$begin,df$end);
#I am looking for output where each observation belongs to a group
conditional on the begin year and end year. I would also like to use a for
loop for programming accuracy as well;
This isn't a clear specification:
1990, 1994 for instance fits into all three groups. Do you want to
extend this to more start years, or are you only interested in those
three? Assuming end is always >= start, you don't even need to
consider the end years in your grouping.
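To make the overlap concrete, here is a small sketch that counts how many of the three conditions each row satisfies. It rebuilds the begin and end vectors from the original post; the names cutoffs, matches and n_groups are only illustrative and are not part of the posted code.

```r
# Data from the original post
begin <- c(seq(1990, 1994), seq(1991, 1995), seq(1992, 1996))
end   <- c(seq(1995, 1999), seq(1995, 1999), seq(1996, 2000))

# Hypothetical helper objects, introduced only for this illustration
cutoffs <- c(1990, 1991, 1992)

# matches[i, j] is TRUE when row i satisfies the condition for group j
matches <- sapply(cutoffs, function(yr) begin <= yr & end > yr)

# Number of groups each row fits into; any value above 1 is ambiguous
n_groups <- rowSums(matches)
n_groups

# In this particular data set every end year is later than the last cutoff,
# so the end-year half of each condition never changes the outcome
all(end > max(cutoffs))
```

Any row with a count above 1 needs a tie-breaking rule, which is where the two methods below differ.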
Here are two methods, one that "looks like" your pseudocode, and one
that is more R-ish. They give different results because of different
handling of cases that fit all three groups. Rearranging the
statements in makegroup1() from broadest to most restrictive would
make it give the same result as makegroup2().
makegroup1 <- function(x,y) {
  group <- numeric(length(x))
  group[x <= 1990 & y > 1990] <- 1
  group[x <= 1991 & y > 1991] <- 2
  group[x <= 1992 & y > 1992] <- 3
  group
}
makegroup2 <- function(x, y) {
  ifelse(x <= 1990 & y > 1990, 1,
    ifelse(x <= 1991 & y > 1991, 2,
      ifelse(x <= 1992 & y > 1992, 3, 0)))
}
makegroup1(df$begin,df$end)
[1] 3 3 3 0 0 3 3 0 0 0 3 0 0 0 0
makegroup2(df$begin,df$end)
[1] 1 2 3 NA NA 2 3 NA NA NA 3 NA NA NA NA
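The rearrangement described above can be sketched as follows, assuming the 0 default shown in the makegroup2() code. The name makegroup1_reordered is made up for this illustration. Because later assignments overwrite earlier ones, putting the broadest condition first lets the most restrictive one win, which is the same as the first-match behaviour of the nested ifelse().

```r
# Data from the original post
begin <- c(seq(1990, 1994), seq(1991, 1995), seq(1992, 1996))
end   <- c(seq(1995, 1999), seq(1995, 1999), seq(1996, 2000))

# Same assignments as makegroup1(), but ordered from broadest to most
# restrictive, so the most restrictive matching condition is applied last
makegroup1_reordered <- function(x, y) {
  group <- numeric(length(x))
  group[x <= 1992 & y > 1992] <- 3
  group[x <= 1991 & y > 1991] <- 2
  group[x <= 1990 & y > 1990] <- 1
  group
}

# makegroup2() as posted: the first condition that matches wins
makegroup2 <- function(x, y) {
  ifelse(x <= 1990 & y > 1990, 1,
    ifelse(x <= 1991 & y > 1991, 2,
      ifelse(x <= 1992 & y > 1992, 3, 0)))
}

# Compare the two approaches on the posted data
identical(makegroup1_reordered(begin, end), makegroup2(begin, end))
```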
df
But really, it's a better idea to develop an unambiguous statement of
your desired output.
Sarah
Sarah Goslee
http://www.functionaldiversity.org
Hi Sarah,
I ran the same code from your reply email. For makegroup2, the results are 0 in place of NA.
makegroup1 <- function(x,y) {
  group <- numeric(length(x))
  group[x <= 1990 & y > 1990] <- 1
  group[x <= 1991 & y > 1991] <- 2
  group[x <= 1992 & y > 1992] <- 3
  group
}
makegroup2 <- function(x, y) {
  ifelse(x <= 1990 & y > 1990, 1,
    ifelse(x <= 1991 & y > 1991, 2,
      ifelse(x <= 1992 & y > 1992, 3, 0)))
}
makegroup1(df$begin,df$end)
 [1] 3 3 3 0 0 3 3 0 0 0 3 0 0 0 0
makegroup2(df$begin,df$end)
 [1] 1 2 3 0 0 2 3 0 0 0 3 0 0 0 0
A. K.
Sorry, yes: I changed it before posting to more closely match the default value in the pseudocode. That's a very minor issue: the very last value in the nested ifelse() statements is what's used by default.
Sarah
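A minimal sketch of that point: the innermost "no" value of the nested ifelse() is what every unmatched row receives. The default argument and the name makegroup2_default are additions for illustration only; they are not in the posted code.

```r
# Data from the original post
begin <- c(seq(1990, 1994), seq(1991, 1995), seq(1992, 1996))
end   <- c(seq(1995, 1999), seq(1995, 1999), seq(1996, 2000))

# Hypothetical variant of makegroup2() with the fallback value exposed
# as an argument instead of being hard-coded
makegroup2_default <- function(x, y, default = 0) {
  ifelse(x <= 1990 & y > 1990, 1,
    ifelse(x <= 1991 & y > 1991, 2,
      ifelse(x <= 1992 & y > 1992, 3, default)))
}

with_zero <- makegroup2_default(begin, end)                # unmatched rows get 0
with_na   <- makegroup2_default(begin, end, default = NA)  # unmatched rows get NA

with_zero
with_na
```

Rows that match one of the three conditions get the same group either way; only the unmatched rows differ.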
On Tue, May 8, 2012 at 2:46 PM, arun <smartpink111 at yahoo.com> wrote:
Hi Sarah, I ran the same code from your reply email. For makegroup2, the results are 0 in place of NA.