cut POSIX results in NA - bug?
Dear Prof. Ripley,

Thank you very much for the explanation (without it I would not have guessed that include.lowest had anything to do with my observation). I changed my code to get rid of single trailing POSIX dates.

BTW, the cut.POSIXt help page does not mention include.lowest, and I think that for dates its effect is not so *understandable* (61 minutes in one hour).

datum <- seq(ISOdate(2004,8,31), ISOdate(2004,9,1), "min")

# part of the datum variable (the time zone label is Czech for Central Europe, summer time)
datum[1379:1381]
[1] "2004-09-01 12:58:00 Střední Evropa (letní čas)" "2004-09-01 12:59:00 Střední Evropa (letní čas)"
[3] "2004-09-01 13:00:00 Střední Evropa (letní čas)"

# the last item seems to me to belong to the interval from 13:00:00 to 13:59:00, i.e. it is part of the 13:00 hour of the day

cut(datum[1370:1381], "hour", include.lowest=TRUE)
# this puts it into the previous hour
 [1] 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00
 [7] 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00
Levels: 2004-09-01 12:00:00

cut(datum[1370:1381], "hour")
# this drops it from the result, which is correct but unfortunate
 [1] 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00
 [7] 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00 2004-09-01 12:00:00 <NA>
Levels: 2004-09-01 12:00:00

# so as a result an hour can have 61 minutes
levels(cut(datum[1321:1381], "hour", include.lowest=TRUE))
[1] "2004-09-01 12:00:00"
length(cut(datum[1321:1381], "hour", include.lowest=TRUE)) #???
[1] 61

Is this correct?

Thank you again.

Best regards
Petr Pikal
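If the goal is an hourly average, one way to avoid the 61-minute hour is to label each timestamp by its own hour rather than relying on cut(). The following is only a sketch under stated assumptions: the time zone is fixed to GMT so the result does not depend on locale, and `value` is a hypothetical stand-in for the real measurements.

```r
# Same timestamps as in the post; ISOdate() defaults to 12:00 GMT.
datum <- seq(ISOdate(2004, 8, 31), ISOdate(2004, 9, 1), by = "min")
x <- datum[1321:1381]  # 61 one-minute steps: a full hour plus the first minute of the next

# Label every timestamp with the hour it falls in (10:00 and 11:00 in GMT).
hour_label <- format(x, "%Y-%m-%d %H:00", tz = "GMT")

# Hypothetical measurements, used only to illustrate the averaging step.
value <- seq_along(x)

counts <- table(hour_label)                      # observations per hour
hourly_mean <- tapply(value, hour_label, mean)   # mean per hour

print(counts)
print(hourly_mean)
```

With this labelling the first hour holds 60 observations and the trailing timestamp forms its own group, so no value is dropped or merged into the preceding hour.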
On 3 Nov 2004 at 11:20, Prof Brian Ripley wrote:
On Wed, 3 Nov 2004, Petr Pikal wrote:
Dear all,

I am trying to compute hourly averages with the cut() function, which almost works as *I* expected. What puzzled me is that if there is only one item at the end of the data, it results in NA. An example will explain what I mean:

datum <- seq(ISOdate(2004,8,31), ISOdate(2004,9,1), "min")

cut(datum[1370:1381], "hour", labels=FALSE)
 [1]  1  1  1  1  1  1  1  1  1  1  1 NA

cut(datum[1370:1382], "hour", labels=FALSE)
 [1] 1 1 1 1 1 1 1 1 1 1 1 2 2

I do not understand why the last item in the first call is NA. I only noticed it at the switch from DST to standard time, when it caused trouble in one of my functions and I found an NA value where I did not expect one.
cut(datum[1370:1381],"hour", labels=F, include.lowest=T)
is what you need. See ?cut, in the See Also, which says
include.lowest: logical, indicating if an 'x[i]' equal to the lowest
(or highest, for 'right = FALSE') 'breaks' value should be
included.
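The effect of include.lowest described in that help text can be seen in isolation with plain numeric breaks. This is a minimal sketch, assuming hypothetical data (minutes elapsed since 12:00) and a single interval whose upper break coincides with the last observation; it imitates the situation in the question and is not the internal code of cut.POSIXt.

```r
# Hypothetical data: minutes elapsed since 12:00; the last value sits on the final break.
minutes <- c(50, 58, 59, 60)
breaks <- c(0, 60)  # one interval, [0, 60) when right = FALSE

# Left-closed, right-open interval: 60 lies outside [0, 60) and becomes NA.
open_right <- cut(minutes, breaks, right = FALSE, labels = FALSE)

# With right = FALSE, include.lowest = TRUE also admits the highest break: [0, 60].
closed_both <- cut(minutes, breaks, right = FALSE, include.lowest = TRUE,
                   labels = FALSE)

print(open_right)
print(closed_both)
```

The second call is why the 13:00:00 timestamp is counted in the 12:00 hour: the value on the upper break is folded into the last interval rather than starting a new one.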
I can write a workaround, but could you please explain why the first call results in an NA value at the end of the vector, and whether this is *intended* behaviour?
It is the documented behaviour, for better or for worse.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Petr Pikal petr.pikal at precheza.cz