Simple time series question with zoo
On Thu, Oct 27, 2011 at 4:18 PM, Vinny Moriarty <vwmoriarty at gmail.com> wrote:
New user here. My goal is to pull daily averages from a long dataset. I've been working with some code I got from this list, from
https://stat.ethz.ch/pipermail/r-help/2009-March/191302.html

The code as I have been using it is as follows:

library(zoo)
library(chron)
LTER6 <- read.table("/Users/me/Desktop/R/data.csv", sep=",", header=TRUE, as.is=TRUE)
z <- zoo(LTER6$temp, chron(LTER6$Date, LTER6$Time))
z.day <- aggregate(z, trunc, mean)  # this last line gives me daily averages for my data

Simple and elegant, and it works. Thanks to the author, the hard part is over. But I plan to tweak it, so I have some questions about why this works.

1. The data I have has the date and time as a single string, like this: "2006-04-09 10:20:00". But the code was set up to read the data in two columns, i.e. "2006-04-09" & "10:20:00". Is this how the chron package expects to have the data, or is there a way I can change the code to read the data as a single column? For now I am chopping up my date and time data manually before I run R.

2. I've read the help on "as.is", and I'm not sure why I need that argument in the first line of code. This is what my original data looks like (with header), if this helps answer this question:

line.site,time_local,time_utc,reef_type_code,sensor_type,sensor_depth_m,temp
06,2006-04-09 10:20:00,2006-04-09 20:20:00,BAK,sb39, 2, 29.63
06,2006-04-09 10:40:00,2006-04-09 20:40:00,BAK,sb39, 2, 29.56

3. Finally, how does the function "trunc" know to aggregate the data by day? If I wanted to do monthly averages I would need to specify that with "as.yearmon", but I don't seem to need to specify "day" anywhere in the code.
That link is several years old. Since then the zoo package has gained
additional capabilities. Assuming the 2nd field is the desired
date/time and the last field on each line is the value you want, try the
read.zoo statements below. See ?read.zoo and also try:
vignette("zoo-read")
library(zoo)
library(chron)
# create test file
Lines <- "line.site,time_local,time_utc,reef_type_code,sensor_type,sensor_depth_m,temp
06,2006-04-09 10:20:00,2006-04-09 20:20:00,BAK,sb39, 2, 29.63
06,2006-04-09 10:40:00,2006-04-09 20:40:00,BAK,sb39, 2, 29.56"
cat(Lines, "\n", file = "data.txt")
# NULL fields are removed
temp <- read.zoo("data.txt", FUN = as.chron, header = TRUE, sep = ",",
colClasses = c("NULL", NA, "NULL", "NULL", "NULL", "NULL", NA))
# daily
temp.day <- read.zoo("data.txt", FUN = as.Date, header = TRUE, sep = ",",
aggregate = mean,
colClasses = c("NULL", NA, "NULL", "NULL", "NULL", "NULL", NA))
# monthly
temp.ym <- read.zoo("data.txt", FUN = as.yearmon, header = TRUE, sep = ",",
aggregate = mean,
colClasses = c("NULL", NA, "NULL", "NULL", "NULL", "NULL", NA))
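As a rough illustration of what colClasses is doing in the calls above: read.zoo hands it on to read.table, where "NULL" drops a column and NA lets R guess the type. The sketch below uses plain read.table on the same two sample lines; the name DF is only for illustration. It also touches on the as.is question, since as.is = TRUE keeps strings as character rather than converting them to factors.

```r
# Same two sample lines as in the test file above
Lines <- "line.site,time_local,time_utc,reef_type_code,sensor_type,sensor_depth_m,temp
06,2006-04-09 10:20:00,2006-04-09 20:20:00,BAK,sb39, 2, 29.63
06,2006-04-09 10:40:00,2006-04-09 20:40:00,BAK,sb39, 2, 29.56"

# "NULL" drops a column, NA means "let read.table work out the type";
# as.is = TRUE keeps the date/time strings as character, not factor
DF <- read.table(text = Lines, header = TRUE, sep = ",", as.is = TRUE,
  colClasses = c("NULL", NA, "NULL", "NULL", "NULL", "NULL", NA))

str(DF)  # only time_local and temp remain
```

Only the date/time column and the temperature column survive, which is the two-column shape read.zoo needs to build the series.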
chron stores a date/time internally as the number of days since the
Epoch (1970-01-01 by default) plus the fraction of the day for the time.
Truncating to an integer therefore removes the fractional part (i.e. the
time), leaving the day, which is why trunc aggregates by day. See R News 4/1.
Alternatively, we could just use the Date class in base R, as shown
above.
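To make the trunc point concrete, here is a small sketch using one of the timestamps from the sample data. The variable name x is only for illustration, and the date and time are passed to chron() as two separate strings with an explicit format, just to keep the example unambiguous.

```r
library(chron)

# One timestamp from the sample data, given as separate date and time parts
x <- chron("2006-04-09", "10:20:00",
  format = c(dates = "y-m-d", times = "h:m:s"))

as.numeric(x)         # whole days since the Epoch plus a fraction for 10:20
as.numeric(trunc(x))  # the fraction is gone, so only the day is left

# The whole-day part agrees with the Date class in base R
as.numeric(as.Date("2006-04-09"))
```

Every observation on the same day therefore truncates to the same value, and aggregate groups on that value.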
If we had already read in temp and wanted to aggregate it afterwards,
rather than reading it straight into an aggregated form, here are some
possibilities:
aggregate(temp, trunc, mean) # daily
aggregate(temp, as.Date, mean) # daily with Date class
aggregate(temp, as.yearmon, mean) # monthly
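A self-contained sketch of the same idea, in case it helps to see the grouping at work. It assumes a POSIXct index rather than chron, and the third observation (on 2006-04-10) is made up purely so that there are two days to aggregate over; the names tt, z, daily and monthly are only for illustration.

```r
library(zoo)

# Two readings from the sample data plus one made-up reading on the next day
tt <- as.POSIXct(c("2006-04-09 10:20:00", "2006-04-09 10:40:00",
  "2006-04-10 10:20:00"), tz = "GMT")
z <- zoo(c(29.63, 29.56, 29.40), tt)

# The second argument is applied to the index; observations whose index
# maps to the same value are grouped and passed to mean
daily <- aggregate(z, as.Date, mean)       # one value per day
monthly <- aggregate(z, as.yearmon, mean)  # one value per month

daily
monthly
```

Any function that maps the index to a coarser value can be used in place of as.Date or as.yearmon.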
Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com