Date handling in R is hard to understand
6 messages · Alemu Tadesse, Bert Gunter, Jim Lemon +2 more
Have a look at the "lubridate" package. It claims to try to make dealing with dates easier. -- Bert
On Fri, Nov 8, 2013 at 11:41 AM, Alemu Tadesse <alemu.tadesse at gmail.com> wrote:
Dear All,
I usually work with time series data. The data may come in AM/PM (12-hour)
format or on a 24-hour basis. R cannot tell the two apart automatically,
at least for me; I have to tell R explicitly which time format the data is
in. It seems that Pandas knows how to handle dates without being told the
format. The problem arises when I try to shift the time by a fixed amount,
say adding 3600 seconds to move it forward by one hour. In that case I have
to use something like:
Measured_data$Date <- as.POSIXct(as.character(Measured_data$Date), tz = "", format = "%m/%d/%Y %I:%M %p") + 3600
or
Measured_data$Date <- as.POSIXct(as.character(Measured_data$Date), tz = "", format = "%m/%d/%Y %H:%M") + 3600
depending on the format. The date also gets a time zone label attached
(MDT, MST and so on). Merging two data frames whose dates are in different
formats may create a problem (I think). When I get data from Excel it could
be in any format, and I need to convert the dates to one of the above
formats before using them in R. Any tips for automatic processing, with no
need to specify the date format explicitly?
Another problem I saw is with rbind: when binding data frames, if a column
in one of the data frames is character data (say, for example, "none"
coming from MySQL), R doesn't know how to concatenate the numeric column
from the other data frame to it. I had to change the numeric column to
character and, after binding, convert it back to numeric. But this causes
problems in an automated environment. Any suggestions?
Thanks
Mihretu
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374
Hi Mihretu,
Can you grep for "AM" or "PM"? If so, build your format string depending upon whether one of these exists in the date string.
Jim
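The grep idea above could be sketched roughly as follows. This is only an illustration in base R, not code from the thread: the helper name `parse_mixed_datetime` and the sample timestamps are made up, the two format strings are the ones from the original question, and a fixed `tz = "UTC"` is assumed to keep local time zone labels out of the picture. It also assumes an English-style locale where `%p` matches "AM"/"PM".

```r
# Hypothetical helper (not from the thread): parse month/day/year timestamps
# that may be either 12-hour with AM/PM or 24-hour.
parse_mixed_datetime <- function(x, tz = "UTC") {
  x <- as.character(x)

  # Jim's suggestion: look for an AM/PM marker in each string.
  has_ampm <- grepl("[AP]M", x)

  # Start from an all-NA POSIXct vector in the requested time zone.
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = tz)

  # Pick the format string per element, depending on the marker.
  out[has_ampm]  <- as.POSIXct(x[has_ampm],  tz = tz, format = "%m/%d/%Y %I:%M %p")
  out[!has_ampm] <- as.POSIXct(x[!has_ampm], tz = tz, format = "%m/%d/%Y %H:%M")
  out
}

# Made-up sample input mixing both styles.
stamps <- c("11/08/2013 01:30 PM", "11/08/2013 13:30", "11/09/2013 12:05 AM")

# Shift everything forward by one hour (3600 seconds).
shifted <- parse_mixed_datetime(stamps) + 3600
print(format(shifted, "%Y-%m-%d %H:%M"))
```

Strings that match neither format come back as NA, so it is probably worth checking `anyNA()` on the result before merging.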
I agree w/ lubridate. I also would like to mention that "date handling" is amazingly difficult in ALL computer languages, not just R. Take a stroll through sites like thedailywtf.com to see how quickly people get into tarpits full of thorns when trying to deal with leap years, weeks vs month ends, etc.
1 day later
Hi

rbind/cbind use the data.frame method when one of the arguments is a data frame, and that method keeps each column's own type. With the "normal" (default) method, however, the result is a matrix, which must have one common type of data in all columns (a matrix is actually only a vector with dimensions).

str(cbind(airquality, 1:153))
'data.frame': 153 obs. of 7 variables:
 $ ozone  : int 41 36 12 18 NA 28 23 19 8 NA ...
 $ solar.r: int 190 118 149 313 NA NA 299 99 19 194 ...
 $ wind   : num 7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
 $ temp   : int 67 72 74 62 56 66 65 59 61 69 ...
 $ month  : int 5 5 5 5 5 5 5 5 5 5 ...
 $ day    : int 1 2 3 4 5 6 7 8 9 10 ...
 $ 1:153  : int 1 2 3 4 5 6 7 8 9 10 ...

Regards
Petr
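To make the matrix versus data frame distinction concrete, and as one possible way around the convert-and-reconvert step from the original question, here is a small sketch. The data frames `from_db` and `measured` are invented for illustration, and turning non-numeric placeholders such as "none" into NA is an assumption about what the data should mean, not something stated in the thread.

```r
# Made-up example data: one frame carries the column as character
# (e.g. with a "none" placeholder), the other as numeric.
from_db  <- data.frame(id = 1:2, value = c("1.5", "none"), stringsAsFactors = FALSE)
measured <- data.frame(id = 3:4, value = c(2.5, 3.5))

# cbind() on plain vectors uses the default method and builds a matrix,
# so all elements are coerced to a single common type (character here).
m <- cbind(1:2, c("a", "b"))
print(typeof(m))

# Convert the character column to numeric once, before binding.
# Entries that are not numbers (such as "none") become NA; the coercion
# warning is suppressed because that is the intended outcome in this sketch.
from_db$value <- suppressWarnings(as.numeric(from_db$value))

# With matching column types, the data.frame method of rbind() keeps
# the column numeric.
combined <- rbind(from_db, measured)
str(combined)
```

Doing the conversion at the point where the data is read keeps the rest of an automated pipeline free of type juggling.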