creating conditional means
The error message says you have duplicate row names, which is not allowed. Make sure each line of data has the same number of elements as the header. If each data line has one more element than the header, the first item on each line is treated as the row name. See ?count.fields. The rest of your message is not clear.
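A minimal sketch of the check suggested above, using made-up input rather than the poster's actual file: the header here is assumed to be missing one name, so every data line has one more field than the header, and count.fields makes the mismatch visible before read.table fails.

```r
# Hypothetical input: the header has 3 names but each data line has 4 fields,
# so read.table() would use the first field (the year) as row names.
Lines <- "Month Hour co2
2005 1 0 386.16
2005 1 1 386.82
2005 2 0 386.68
"

# Number of fields on each line; the first entry is the header line
n <- count.fields(textConnection(Lines))
print(n)

# The repeated year values become duplicate row names, so this call fails;
# try() captures the error instead of stopping the script
res <- try(read.table(textConnection(Lines), header = TRUE), silent = TRUE)
print(inherits(res, "try-error"))
```

If the counts for the header and the data lines differ by one, adding the missing column name to the header (or passing row.names = NULL) is the usual way out.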
On Dec 6, 2007 11:52 AM, Sherri Heck <sheck at ucar.edu> wrote:
hi gabor,
i was able to get your suggestion to work. i have been going through
the R help tools to figure out what each step actually does because i
have something similar but hours 2,5,8,11,14,17 and 20 are missing. i
haven't had any luck. each "mean value" that is calculated is the
same. i keep getting the following error:
"> DF<- read.table(textConnection(Lines), header = TRUE)
Error in read.table(textConnection(Lines), header = TRUE) :
duplicate 'row.names' are not allowed
> aggregate(DF[2:4],
+ with(DF, data.frame(Year, Qtr = (Month - 3) %/% 3 + 1, Hour)),
+ mean) #skip=hour[2,5,8,11,14]
Error in data.frame(Year, Qtr = (Month - 3)%/%3 + 1, Hour) :
object "Year" not found
"
i am not clear why in "aggregate(DF[#:#]" that we are subsetting other
variables besides co2. i have been trying to just subset co2 without
success though.
your original suggestion is below and a snippet of my data set is below
that. if you have any ideas or if you know of a help page that i may
not have found yet that would be great (i've been using the "aggregate"
help pages mostly).
thanks for your help-
s.heck
Lines <- "Year Month Hour co2 num1 num2
2006 11 0 383.3709 28 28
2006 11 1 383.3709 28 28
2006 11 2 383.3709 28 28
2006 11 3 383.3709 28 28
2006 11 4 383.3709 28 28
2006 11 5 383.3709 28 28
2006 11 6 383.3709 28 28
2006 11 7 383.3709 28 28
2006 11 8 383.3709 28 28
2006 11 9 383.3709 27 27
2006 11 10 383.3709 28 28
"
DF <- read.table(textConnection(Lines), header = TRUE)
aggregate(DF[4:6],
with(DF, data.frame(Year, Qtr = (Month - 1) %/% 3 + 1, Hour)),
mean) #skip=hour[2,5,8,11,14,17,20]???
Year Month Hour co2
2005 1 0 386.1600708
2005 1 1 386.823056
2005 1 3 387.1335939
2005 1 4 387.0681103
2005 1 6 387.4750983
2005 1 7 388.3398313
2005 1 9 388.7545317
2005 1 10 388.0844451
2005 1 12 386.7929627
2005 1 13 385.5569521
2005 1 15 384.5523752
2005 1 16 385.0246721
2005 1 18 385.8646669
2005 1 19 386.2182493
2005 1 21 386.4820756
2005 1 22 386.6606276
2005 2 0 386.6791667
2005 2 1 386.6597544
2005 2 3 386.5725303
2005 2 4 387.0638611
2005 2 6 387.9293508
2005 2 7 388.3778991
2005 2 9 388.3721947
2005 2 10 387.8324642
2005 2 12 386.8404892
2005 2 13 385.6770345
2005 2 15 384.4798484
2005 2 16 384.6214677
2005 2 18 384.3044105
2005 2 19 383.3018709
2005 2 21 382.5837339
2005 2 22 382.2658036
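A minimal sketch of averaging only co2, assuming the data really has the four columns shown in the snippet above (Year, Month, Hour, co2). Only the first rows for a few hours are copied here. DF["co2"] selects just that column, and hours that are absent from the data simply produce no group, so they need no special handling.

```r
# A few rows copied from the snippet above (hour 2 is absent)
Lines <- "Year Month Hour co2
2005 1 0 386.1600708
2005 1 1 386.823056
2005 1 3 387.1335939
2005 2 0 386.6791667
2005 2 1 386.6597544
2005 2 3 386.5725303
"
DF <- read.table(textConnection(Lines), header = TRUE)

# DF["co2"] keeps only the co2 column (still a data frame) for averaging;
# the grouping columns are Year, calendar quarter and Hour
res <- aggregate(DF["co2"],
                 with(DF, data.frame(Year, Qtr = (Month - 1) %/% 3 + 1, Hour)),
                 mean)
print(res)
```

In the original suggestion, DF[4:6] selected co2, num1 and num2 because the example data had six columns; with a four-column file the column positions change, which is why selecting by name is safer.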
Gabor Grothendieck wrote:
Just adjust the formula for Qtr appropriately if your quarters are not Jan/Feb/Mar, Apr/May/Jun, Jul/Aug/Sep, Oct/Nov/Dec as I assumed.
On Dec 1, 2007 5:21 PM, Sherri Heck <sheck at ucar.edu> wrote:
Hi Gabor,
Thank you for your help. I think I need to clarify a bit more. I am trying to say average all 2pms for months march + april + may (for example). I hope this is clearer. here's a larger subset of my data set:
year, month, hour, co2(ppm), num1,num2
2006 1 0 384.2055 14 14
2006 1 1 384.0304 14 14
2006 1 2 383.9672 14 14
2006 1 3 383.8452 14 14
2006 1 4 383.8594 14 14
2006 1 5 383.7318 14 14
2006 1 6 383.6439 14 14
2006 1 7 383.7019 14 14
2006 1 8 383.7487 14 14
2006 1 9 383.8376 14 14
2006 1 10 383.8684 14 14
2006 1 11 383.8301 14 14
2006 1 12 383.8058 14 14
2006 1 13 383.9419 14 14
2006 1 14 383.7876 14 14
2006 1 15 383.7744 14 14
2006 1 16 383.8566 14 14
2006 1 17 384.1014 14 14
2006 1 18 384.1312 14 14
2006 1 19 384.1551 14 14
2006 1 20 384.099 14 14
2006 1 21 384.1408 14 14
2006 1 22 384.3637 14 14
2006 1 23 384.1491 14 14
2006 2 0 384.7082 27 27
2006 2 1 384.6139 27 27
2006 2 2 384.7453 26 26
2006 2 3 384.9224 28 28
2006 2 4 384.8581 28 28
2006 2 5 384.9208 28 28
2006 2 6 384.9086 28 28
2006 2 7 384.837 28 28
2006 2 8 384.6163 27 27
2006 2 9 384.7406 28 28
2006 2 10 384.7468 28 28
2006 2 11 384.6992 28 28
2006 2 12 384.6388 28 28
2006 2 13 384.6346 28 28
2006 2 14 384.6037 28 28
2006 2 15 384.5295 28 28
2006 2 16 384.5654 28 28
2006 2 17 384.6466 28 28
2006 2 18 384.6344 28 28
2006 2 19 384.5911 28 28
2006 2 20 384.6084 28 28
2006 2 21 384.6318 28 28
2006 2 22 384.6181 27 27
2006 2 23 384.6087 27 27
thank you again for your assistance-
s.heck
Gabor Grothendieck wrote:
Try aggregate:
Lines <- "Year Month Hour co2 num1 num2
2006 11 0 383.3709 28 28
2006 11 1 383.3709 28 28
2006 11 2 383.3709 28 28
2006 11 3 383.3709 28 28
2006 11 4 383.3709 28 28
2006 11 5 383.3709 28 28
2006 11 6 383.3709 28 28
2006 11 7 383.3709 28 28
2006 11 8 383.3709 28 28
2006 11 9 383.3709 27 27
2006 11 10 383.3709 28 28
"
DF <- read.table(textConnection(Lines), header = TRUE)
aggregate(DF[4:6],
with(DF, data.frame(Year, Qtr = (Month - 1) %/% 3 + 1, Hour)),
mean)
On Dec 1, 2007 3:57 PM, Sherri Heck <sheck at ucar.edu> wrote:
Hi all-
I have a dataset (year, month, hour, co2(ppm), num1,num2)
[49,] 2006 11 0 383.3709 28 28
[50,] 2006 11 1 383.3709 28 28
[51,] 2006 11 2 383.3709 28 28
[52,] 2006 11 3 383.3709 28 28
[53,] 2006 11 4 383.3709 28 28
[54,] 2006 11 5 383.3709 28 28
[55,] 2006 11 6 383.3709 28 28
[56,] 2006 11 7 383.3709 28 28
[57,] 2006 11 8 383.3709 28 28
[58,] 2006 11 9 383.3709 27 27
[59,] 2006 11 10 383.3709 28 28
that repeats in this style for each month. I would like to compute the mean for each hour in three month intervals. i.e. average all 2pms for each day for months march, april and may. and then do this for each hour interval. i have been messing around with 'for loops' but can't seem to get the output I want.
thanks in advance for any help-
s.heck
CU, Boulder
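The three-month grouping asked for here (March, April and May together) could be sketched as follows. The season() helper and the toy data are assumptions for illustration only and do not come from the thread. The modulo step wraps December, January and February into one group; Year is left out of the grouping because that season spans two calendar years.

```r
# Hypothetical helper (not from the thread): map month numbers to seasons
# 1 = Mar-May, 2 = Jun-Aug, 3 = Sep-Nov, 4 = Dec-Feb
season <- function(month) ((month - 3) %% 12) %/% 3 + 1

# Made-up data: one value per month at hour 14 (2 pm)
DF <- data.frame(Year = 2006, Month = 1:12, Hour = 14,
                 co2 = 380 + 1:12)

# Mean co2 for each combination of season and hour
res <- aggregate(DF["co2"],
                 with(DF, data.frame(Season = season(Month), Hour)),
                 mean)
print(res)
```

With real data containing all 24 hours, the same call returns one row per season and hour, so the 2 pm average for March to May is the row with Season 1 and Hour 14.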