using complete.cases() with nested factors
4 messages · Andrew Barr, Ken Knoblauch, Hadley Wickham +1 more
Andrew Barr <wabarr <at> gmail.com> writes:
This may be a newbie question. I have a dataframe that looks like the sample
at the bottom of the email. I have monthly precipitation data from several
sites over several years. For each site, I need to extract years that have
a complete series of 12 monthly precipitation values, while excluding that
year for sites with incomplete data. I can't figure out how to do this
gracefully (i.e. without a silly for loop). Any help will be appreciated,
thanks!
SiteID year month precip(mm)
670090 1941 jan 2998
670090 1941 feb 1299
670090 1941 mar 1007
670090 1941 apr 354
670090 1941 may 88
670090 1941 jun 156
670090 1941 jul 8
670090 1941 aug 4
670090 1941 sep 8
670090 1941 oct 58
670090 1941 nov 397
670090 1941 dec 248
670090 1942 jan NA
670090 1942 feb 380
670090 1942 mar 797
670090 1942 apr 142
670090 1942 may 43
670090 1942 jun 14
670090 1942 jul 70
670090 1942 aug 51
670090 1942 sep 0
670090 1942 oct 10
670090 1942 nov 235
670090 1942 dec 405
There are likely more elegant solutions, but this seems to work.
If the data frame is in a variable named dd:

lapply(unique(dd$year), function(x) {
  s <- subset(dd, year == x)
  if (nrow(s) == 12) s
})
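
One thing worth noting about the row-count test: in the sample data, 1942 has all 12 rows but an NA for January, so nrow(s) == 12 is still true for that year. Below is a minimal sketch of an NA-aware variant. It assumes the precipitation column is named precip (the post labels it precip(mm)) and a single site, as in the sample. complete.cases() is the base R function named in the subject line.

```r
# Sample data from the question; the column name `precip` is an assumption.
dd <- data.frame(
  SiteID = 670090,
  year   = rep(c(1941, 1942), each = 12),
  month  = rep(tolower(month.abb), times = 2),
  precip = c(2998, 1299, 1007, 354, 88, 156, 8, 4, 8, 58, 397, 248,
             NA, 380, 797, 142, 43, 14, 70, 51, 0, 10, 235, 405)
)

# Row count alone: both years have 12 rows, so both are returned.
by_rows <- lapply(unique(dd$year), function(x) {
  s <- subset(dd, year == x)
  if (nrow(s) == 12) s
})

# Also require every row of the year to be free of NA.
by_values <- lapply(unique(dd$year), function(x) {
  s <- subset(dd, year == x)
  if (nrow(s) == 12 && all(complete.cases(s))) s
})

# Years that fail the test come back as NULL; drop them from the list.
by_values <- Filter(Negate(is.null), by_values)
```

With data from several sites, the subset would also need to match on SiteID, since a year can be complete at one site and incomplete at another.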
On Thu, Sep 4, 2008 at 4:19 PM, Ken Knoblauch <ken.knoblauch at inserm.fr> wrote:
[...]
I think this is slightly more elegant, and follows the
split-apply-combine strategy:
years <- split(dd, dd$year)
full_years <- Filter(function(df) nrow(df) == 12, years)
do.call("rbind", full_years)
Hadley
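
The same split-apply-combine idea can be sketched per site and year, which is what the original question asks for. This is only an illustration under stated assumptions: the column name precip, the second site 670091, and its values are all made up for the example.

```r
# Sample data plus a hypothetical second site (670091) that is
# missing December 1941; its precipitation values are invented.
dd <- data.frame(
  SiteID = rep(c(670090, 670091), times = c(24, 11)),
  year   = c(rep(c(1941, 1942), each = 12), rep(1941, 11)),
  month  = c(rep(tolower(month.abb), 2), tolower(month.abb)[-12]),
  precip = c(2998, 1299, 1007, 354, 88, 156, 8, 4, 8, 58, 397, 248,
             NA, 380, 797, 142, 43, 14, 70, 51, 0, 10, 235, 405,
             seq(10, 110, by = 10))
)

# Split: one piece per site-year combination that occurs in the data.
pieces <- split(dd, list(dd$SiteID, dd$year), drop = TRUE)

# Apply: keep pieces with 12 non-missing precipitation values.
full <- Filter(function(df) sum(!is.na(df$precip)) == 12, pieces)

# Combine: stack the surviving pieces back into one data frame.
result <- do.call(rbind, full)
```

Counting non-missing values rather than rows drops both the year with an NA and the site-year that is short a month. Here only site 670090 in 1941 survives.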
See ?subset and ?ave and try this:

subset(DF, ave(year, year, FUN = length) == 12)
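
As written, the ave() call counts rows per year across all sites. A hedged variant that groups by both SiteID and year and counts non-missing values might look like the sketch below. The column name precip, the second site 670091, and its values are assumptions for illustration only.

```r
# Sample data plus a hypothetical second site (670091) with only
# 11 months in 1941; its precipitation values are invented.
DF <- data.frame(
  SiteID = rep(c(670090, 670091), times = c(24, 11)),
  year   = c(rep(c(1941, 1942), each = 12), rep(1941, 11)),
  month  = c(rep(tolower(month.abb), 2), tolower(month.abb)[-12]),
  precip = c(2998, 1299, 1007, 354, 88, 156, 8, 4, 8, 58, 397, 248,
             NA, 380, 797, 142, 43, 14, 70, 51, 0, 10, 235, 405,
             seq(10, 110, by = 10))
)

# ave() returns, for every row, the number of non-NA precipitation
# values in that row's SiteID/year group; keep rows whose group has 12.
complete <- subset(
  DF,
  ave(precip, SiteID, year, FUN = function(v) sum(!is.na(v))) == 12
)
```

Because ave() returns a vector as long as the input, the result can be used directly as a row filter without an explicit split or loop.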
On Thu, Sep 4, 2008 at 5:04 PM, Andrew Barr <wabarr at gmail.com> wrote:
Hello,
This may be a newbie question. I have a dataframe that looks like the sample
at the bottom of the email. I have monthly precipitation data from several
sites over several years. For each site, I need to extract years that have
a complete series of 12 monthly precipitation values, while excluding that
year for sites with incomplete data. I can't figure out how to do this
gracefully (i.e. without a silly for loop). Any help will be appreciated,
thanks!
Andrew
SiteID year month precip(mm)
670090 1941 jan 2998
670090 1941 feb 1299
670090 1941 mar 1007
670090 1941 apr 354
670090 1941 may 88
670090 1941 jun 156
670090 1941 jul 8
670090 1941 aug 4
670090 1941 sep 8
670090 1941 oct 58
670090 1941 nov 397
670090 1941 dec 248
670090 1942 jan NA
670090 1942 feb 380
670090 1942 mar 797
670090 1942 apr 142
670090 1942 may 43
670090 1942 jun 14
670090 1942 jul 70
670090 1942 aug 51
670090 1942 sep 0
670090 1942 oct 10
670090 1942 nov 235
670090 1942 dec 405
--
Andrew