Date_Time detected as Duplicated (but they are not!)
Dear Agustin:
What are the duplicated times? It looks like they really do occur twice or more in your original data: perhaps two stamps closer together than the resolution of your clock?
delme[duplicated(delme)]
aur2009[duplicated(delme), 1]
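A minimal sketch of that inspection, on a small made-up index rather than the poster's data (the vector `idx` below is hypothetical). Note that `duplicated()` flags only the later copy of each repeated value, so OR-ing it with `fromLast = TRUE` is one way to see every row involved:

```r
# Hypothetical stand-in for the parsed time index (not the poster's data)
idx <- as.POSIXct(c("2009-03-29 00:30", "2009-03-29 01:00",
                    "2009-03-29 01:30", "2009-03-29 01:00"), tz = "UTC")

# duplicated() alone marks only the second and later copies
later <- duplicated(idx)

# Combining with fromLast = TRUE marks the first copy as well
involved <- later | duplicated(idx, fromLast = TRUE)

which(later)     # positions of the repeats
which(involved)  # positions of every row that shares a time stamp
```

Indexing the original data frame with `involved` then shows the colliding rows side by side.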
On 18 May 2011, at 8:49 AM, Agustin Lobo wrote:
And is it not possible to ignore daylight saving time? My data are in UTC, with no daylight saving changes.
delme = strptime(aur2009[,1], "%m/%d/%Y %H:%M", tz="UTC")
any(duplicated(delme))
[1] TRUE
delme = as.POSIXct(aur2009[,1], format="%m/%d/%Y %H:%M", tz="UTC")
any(duplicated(delme))
[1] TRUE
Agus
On Wed, May 18, 2011 at 8:55 AM, Michael Sumner <mdsumner at gmail.com> wrote:
See under "Note" in ?strptime:
Remember that in most timezones some times do not occur and some
occur twice because of transitions to/from summer time.
'strptime' does not validate such times (it does not assume a
specific timezone), but conversion by 'as.POSIXct' will do so.
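A minimal sketch of what that note means for the stamps in this thread. It assumes the raw strings are in the CSV's month/day/year format and uses "Europe/Madrid" as a stand-in for the poster's local zone, which the thread never states. Parsed as UTC, the four stamps around 29 March 2009 stay distinct and 30 minutes apart; parsed in a zone that springs forward that night, 02:00 and 02:30 do not exist, and what `as.POSIXct` returns for them depends on the platform:

```r
# Raw stamps around the 2009-03-29 spring-forward, in the CSV's format
stamps <- c("3/29/2009 1:00", "3/29/2009 1:30",
            "3/29/2009 2:00", "3/29/2009 2:30")
fmt <- "%m/%d/%Y %H:%M"

# Parsed as UTC: no daylight saving transitions, so every stamp is valid
utc <- as.POSIXct(stamps, format = fmt, tz = "UTC")
any(duplicated(utc))   # no collisions expected
diff(as.numeric(utc))  # spacing in seconds between consecutive stamps

# ASSUMPTION: "Europe/Madrid" is only a guess at the local zone.
# 02:00 and 02:30 fall in the skipped hour there; the result for such
# nonexistent times is platform-dependent, so inspect rather than rely on it.
loc <- as.POSIXct(stamps, format = fmt, tz = "Europe/Madrid")
as.numeric(loc)
```

The key point is that `tz = "UTC"` has to be applied when parsing the original character strings, not after they have already been converted in the local zone.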
On Wed, May 18, 2011 at 3:53 PM, Agustin Lobo <Agustin.Lobo at ictja.csic.es>
wrote:
I have a problem with duplicated date_time stamps that I do not see as duplicated. I read a file with observations taken every 30 minutes:
aur2009 = read.csv(paste(datadir,"AUR_ECPP_2009.csv",sep="/"), sep=";", stringsAsFactors=FALSE)
aur2009[1:3,1:5]
      Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
1 1/1/2009 0:00        0           NaN      5.86            NaN
2 1/1/2009 0:30        0           NaN      5.05            NaN
3 1/1/2009 1:00        0           NaN      5.56            NaN
delme = strptime(aur2009[,1], "%m/%d/%Y %H:%M")
aur2009[,1] = as.POSIXct(delme)
            Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
1 2009-01-01 00:00:00        0           NaN      5.86            NaN
2 2009-01-01 00:30:00        0           NaN      5.05            NaN
3 2009-01-01 01:00:00        0           NaN      5.56            NaN
aur2009ts = ts(aur2009)
row.names(aur2009ts) = as.character(delme)
aur2009ts[1:3,1:5]
                     Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
2009-01-01 00:00:00 1230764400        0           NaN      5.86            NaN
2009-01-01 00:30:00 1230766200        0           NaN      5.05            NaN
2009-01-01 01:00:00 1230768000        0           NaN      5.56            NaN
Then:
aur2009z = zoo(aur2009[,2:12],as.POSIXct(delme))
Warning message:
In zoo(aur2009[, 2:12], as.POSIXct(delme)) :
  some methods for 'zoo' objects do not work if the index entries in 'order.by' are not unique
So I investigate:
any(duplicated(aur2009ts[,1]))
[1] TRUE
aur2009ts[(duplicated(aur2009ts[,1])),1:5]
                     Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
2009-03-29 02:00:00 1238284800        0           NaN       1.2            NaN
2009-03-29 02:30:00 1238286600        0           NaN       1.2            NaN
But note the surprise:
aur2009ts[aur2009ts[,1]==1238284800,1:5]
                     Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
2009-03-29 01:00:00 1238284800        0           NaN     -0.58            NaN
2009-03-29 02:00:00 1238284800        0           NaN      1.20            NaN
aur2009ts[aur2009ts[,1]==1238286600,1:5]
                     Date.Time E_filled E_filled_flag LE_filled LE_filled_flag
2009-03-29 01:30:00 1238286600        0           NaN     -0.34            NaN
2009-03-29 02:30:00 1238286600        0           NaN      1.20            NaN
The dates detected as duplicated are actually different times that got the same value in the ts version of the object!
What am I doing wrong? They are all observations every 30 min, so why are these two encoded as the same time?
Any help appreciated.
Agus
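One way to locate such collisions without relying on `duplicated()` is to check the spacing of the numeric time values directly. The sketch below uses only the four epoch values printed in the output above, in row order 01:00, 01:30, 02:00, 02:30; the variable names are illustrative:

```r
# Epoch seconds copied from the printed Date.Time column, in row order
# (labels 01:00, 01:30, 02:00, 02:30 on 2009-03-29)
secs <- c(1238284800, 1238286600, 1238284800, 1238286600)

# A regular half-hourly series should advance by 1800 s at every step
step <- diff(secs)
step

# Rows where the series stops advancing by exactly 30 minutes
which(step != 1800) + 1

# The same values shown as UTC clock times
format(as.POSIXct(secs, origin = "1970-01-01", tz = "UTC"), "%H:%M")
```

Here the third row steps backwards by 30 minutes, which points at the stamps labelled 02:00 and 02:30 as the ones that were mapped onto earlier instants.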
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
--
Michael Sumner
Institute for Marine and Antarctic Studies, University of Tasmania
Hobart, Australia
e-mail: mdsumner at gmail.com