subsetting data frame using by() or tapply() or other
3 messages · Brian S Cade, Marc Schwartz (via MN)
On Thu, 2005-10-13 at 14:28 -0600, Brian S Cade wrote:
Ok so I see the problem that I'm having creating a new variable (LAG1DBC)
in the example data transformation below is that tapply() is creating a
list that is not dimensionally consistent with the data frame (data). So
how do I go from the list output of tapply() to create a dimensionally
consistent vector that can create the new variable in my original data
frame? I've been trying to use a function like
data$LAG1DBC <- tapply(data$DBC, data$LOCID, function(x) c(NA,
x[-length(x)]))
which creates a list of dimension much smaller than the nrows in data. And
I've tried things like using as.data.frame.array() or as.data.frame.list()
in front of tapply() and still have the same problem. I know this can't
be that unusual of a data manipulation and that someone has to have done
similar things before.
I want to go from something like this:
LOCID POPULATION YEAR DBC
1 algb-1 A 1992 0.70451575
2 algb-1 A 1993 0.59506851
3 algb-1 A 1997 0.84837544
4 algb-1 A 1998 0.50283182
5 algb-1 A 2000 0.91242707
6 algb-2 A 1992 0.09747155
7 algb-2 A 1993 0.84772253
8 algb-2 A 1997 0.43974081
9 algb-2 A 1998 0.83108544
10 algb-2 A 2000 0.22291192
11 algb-3 A 1992 0.44234175
12 algb-3 A 1993 0.54089534
5680 taylr-73 B 2001 0.43918082
5681 taylr-73 B 2002 0.34694427
5682 taylr-73 B 2003 3.35619190
5683 taylr-73 B 2004 0.71575815
5684 taylr-73 B 2005 0.42038506
5685 taylr-74 B 1992 3.88410354
5686 taylr-74 B 1993 3.32472557
5687 taylr-74 B 1994 3.29861501
5688 taylr-74 B 1996 0.48153827
5689 taylr-74 B 1997 3.63570636
5690 taylr-74 B 1998 1.94630194
to something like this:
LOCID POPULATION YEAR DBC LAG1DBC
1 algb-1 A 1992 0.70451575 NA
2 algb-1 A 1993 0.59506851 0.70451575
3 algb-1 A 1997 0.84837544 0.59506851
4 algb-1 A 1998 0.50283182 0.84837544
5 algb-1 A 2000 0.91242707 0.50283182
6 algb-2 A 1992 0.09747155 NA
7 algb-2 A 1993 0.84772253 0.09747155
8 algb-2 A 1997 0.43974081 0.84772253
9 algb-2 A 1998 0.83108544 0.43974081
10 algb-2 A 2000 0.22291192 0.83108544
11 algb-3 A 1992 0.44234175 NA
12 algb-3 A 1993 0.54089534 0.44234175
5680 taylr-73 B 2001 0.43918082 NA
5681 taylr-73 B 2002 0.34694427 0.43918082
5682 taylr-73 B 2003 3.35619190 0.34694427
5683 taylr-73 B 2004 0.71575815 3.35619190
5684 taylr-73 B 2005 0.42038506 0.71575815
5685 taylr-74 B 1992 3.88410354 NA
5686 taylr-74 B 1993 3.32472557 3.88410354
5687 taylr-74 B 1994 3.29861501 3.32472557
5688 taylr-74 B 1996 0.48153827 3.29861501
5689 taylr-74 B 1997 3.63570636 0.48153827
5690 taylr-74 B 1998 1.94630194 3.63570636
Brian
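The mismatch described above can be reproduced on a small made-up data frame. This is only a sketch: the data frame toy and its values are invented for illustration and are not part of the original data. It shows that tapply() returns one list element per LOCID, so the result has as many elements as there are groups, not as many as there are rows.

```r
# Toy data, invented for illustration (not the poster's data)
toy <- data.frame(
  LOCID = c("a", "a", "a", "b", "b", "c"),
  DBC   = c(0.7, 0.6, 0.8, 0.1, 0.9, 0.4)
)

# Same lag function as in the question: shift each group down by one
lagged <- tapply(toy$DBC, toy$LOCID, function(x) c(NA, x[-length(x)]))

length(lagged)        # number of groups (distinct LOCID values)
nrow(toy)             # number of rows in the data frame
sum(lengths(lagged))  # total lagged values across all groups
```

The total number of lagged values matches the number of rows; they are just spread over separate list elements, which is why flattening the list is the missing step.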
Brian,

Use unlist():
data$LAG1DBC <- unlist(tapply(data$DBC, data$LOCID,
function(x) c(NA, x[-length(x)])))
data
LOCID POPULATION YEAR DBC LAG1DBC
1 algb-1 A 1992 0.70451575 NA
2 algb-1 A 1993 0.59506851 0.70451575
3 algb-1 A 1997 0.84837544 0.59506851
4 algb-1 A 1998 0.50283182 0.84837544
5 algb-1 A 2000 0.91242707 0.50283182
6 algb-2 A 1992 0.09747155 NA
7 algb-2 A 1993 0.84772253 0.09747155
8 algb-2 A 1997 0.43974081 0.84772253
9 algb-2 A 1998 0.83108544 0.43974081
10 algb-2 A 2000 0.22291192 0.83108544
11 algb-3 A 1992 0.44234175 NA
12 algb-3 A 1993 0.54089534 0.44234175
5680 taylr-73 B 2001 0.43918082 NA
5681 taylr-73 B 2002 0.34694427 0.43918082
5682 taylr-73 B 2003 3.35619190 0.34694427
5683 taylr-73 B 2004 0.71575815 3.35619190
5684 taylr-73 B 2005 0.42038506 0.71575815
5685 taylr-74 B 1992 3.88410354 NA
5686 taylr-74 B 1993 3.32472557 3.88410354
5687 taylr-74 B 1994 3.29861501 3.32472557
5688 taylr-74 B 1996 0.48153827 3.29861501
5689 taylr-74 B 1997 3.63570636 0.48153827
5690 taylr-74 B 1998 1.94630194 3.63570636

HTH,

Marc Schwartz
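One caveat worth noting about the unlist(tapply()) approach: tapply() returns the groups in factor-level order, so the flattened vector only lines up with the data frame if the rows are already grouped by LOCID in that same order. A possible alternative is base R's ave(), which returns a vector aligned with the original rows whatever their order. The sketch below uses a made-up data frame (toy) and a helper name (lag1) that are assumptions for illustration, not part of the original thread.

```r
# Toy data, invented for illustration; rows are deliberately NOT grouped by LOCID
toy <- data.frame(
  LOCID = c("b", "a", "b", "a", "b"),
  YEAR  = c(1992, 1992, 1993, 1993, 1994),
  DBC   = c(0.1, 0.5, 0.2, 0.6, 0.3)
)

# Hypothetical helper: previous value within a group, NA for the first row
lag1 <- function(x) c(NA, x[-length(x)])

# ave() applies lag1 within each LOCID and puts results back in row order
toy$LAG1DBC <- ave(toy$DBC, toy$LOCID, FUN = lag1)

# For comparison: unlist(tapply()) concatenates groups in level order ("a", then "b"),
# so on unsorted rows the values land on the wrong rows
toy$LAG1_UNLIST <- unlist(tapply(toy$DBC, toy$LOCID, lag1))

toy
```

Either way, the lag follows the row order within each LOCID, so the data should be sorted by YEAR within LOCID first if that is the intended meaning of "previous" value.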