
Parsing a Date

6 messages · Philip, Eric Berger, Jeff Newmiller +2 more

#
Below is some Weather Service data.  I would like to parse the forecast date field into four different columns:

    Year
    Month
    Day
    Hour

I would like to drop the final four zeros.  Any suggestions?

        forecast.date  levels      lon     lat     HGT      RH     TMP    UGRD    VGRD
1 2020-08-01 12:00:00 1000 mb -113.130 33.6335 75.5519 49.6484 305.495 1.40155 2.23264
2 2020-08-01 12:00:00 1000 mb -113.111 33.5142 75.9582 51.0234 305.245 1.65155 2.23264
3 2020-08-01 12:00:00 1000 mb -113.092 33.3948 76.3957 52.7734 305.057 1.90155 2.23264
4 2020-08-01 12:00:00 1000 mb -112.987 33.6495 75.9269 49.1484 305.745 1.90155 2.04514
5 2020-08-01 12:00:00 1000 mb -112.968 33.5301 76.3019 50.2734 305.495 2.08905 1.98264

Philip Heinrich
#
If the forecast.date column is of type character, you can use lubridate to
do this:
# [1] 2020
# [1] 8

etc
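
A minimal sketch of what a lubridate-based approach could look like, assuming the lubridate package is installed and the column holds character strings such as "2020-08-01 12:00:00". The names `x`, `when`, and `parts` are illustrative and do not come from the thread.

```r
# Assumes lubridate is installed; `x` stands in for dta$forecast.date.
library(lubridate)

x <- c("2020-08-01 12:00:00", "2020-08-02 06:00:00")

# Parse the character strings as date-times (ymd_hms() uses UTC by default)
when <- ymd_hms(x)

# Pull out one component per column; minutes and seconds are simply not used
parts <- data.frame(
  Year  = year(when),
  Month = month(when),
  Day   = day(when),
  Hour  = hour(when)
)
parts
```

Parsing once with `ymd_hms()` and then calling the accessors on the parsed vector avoids re-parsing the strings for each component.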
On Sun, Aug 2, 2020 at 7:24 PM Philip <herd_dog at cox.net> wrote:

            

  
  
#
Learn to post plain text and use dput to include data:

dta <- structure(list(
  forecast.date = c("2020-08-01 12:00:00", "2020-08-01 12:00:00",
    "2020-08-01 12:00:00", "2020-08-01 12:00:00", "2020-08-01 12:00:00"),
  levels = c("1000 mb", "1000 mb", "1000 mb", "1000 mb", "1000 mb"),
  lon = c(-113.13, -113.111, -113.092, -112.987, -112.968),
  lat = c(33.6335, 33.5142, 33.3948, 33.6495, 33.5301),
  HGT = c(75.5519, 75.9582, 76.3957, 75.9269, 76.3019),
  RH = c(49.6484, 51.0234, 52.7734, 49.1484, 50.2734),
  TMP = c(305.495, 305.245, 305.057, 305.745, 305.495),
  UGRD = c(1.40155, 1.65155, 1.90155, 1.90155, 2.08905),
  VGRD = c(2.23264, 2.23264, 2.23264, 2.04514, 1.98264)),
  .Names = c("forecast.date", "levels", "lon", "lat", "HGT", "RH",
    "TMP", "UGRD", "VGRD"),
  class = "data.frame", row.names = c(NA, -5L))

dta$Year <- as.integer( substr( dta$forecast.date, 1, 4 ) )
dta$Month <- as.integer( substr( dta$forecast.date, 6, 7 ) )
dta$Day <- as.integer( substr( dta$forecast.date, 9, 10 ) )
dta$Hour <- as.integer( substr( dta$forecast.date, 12, 13 ) )
dta
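
The substr() calls above depend on every string having the same fixed-width layout. Where that is not guaranteed, one possible alternative (a sketch only, not part of the original reply) is to parse the strings with base R and read the fields from the resulting POSIXlt object. The vector `x` and the choice of UTC are assumptions made for illustration.

```r
# Base R only; `x` stands in for dta$forecast.date.
x <- c("2020-08-01 12:00:00", "2020-12-31 06:00:00")

# POSIXlt stores each date-time as a list of calendar fields
lt <- as.POSIXlt(x, tz = "UTC", format = "%Y-%m-%d %H:%M:%S")

parts <- data.frame(
  Year  = lt$year + 1900L,  # year is stored as years since 1900
  Month = lt$mon + 1L,      # month is stored as 0-11
  Day   = lt$mday,
  Hour  = lt$hour
)
parts
```

Strings that do not match the format come back as NA rather than as silently wrong substrings, which can make bad input easier to spot.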
On August 2, 2020 9:24:24 AM PDT, Philip <herd_dog at cox.net> wrote:

  
    
#
On 2020-08-02 09:24 -0700, Philip wrote:
| Below is some Weather Service data.  I 
| would like to parse the forecast date 
| field into four different columns: 
| Year, Month, Day, Hour

Dear Philip,

I'm largely re-iterating Eric and Jeff's 
excellent solutions:

	> dat <- structure(list(forecast.date =
	+ c("2020-08-01 12:00:00",
	+ "2020-08-01 12:00:00",
	+ "2020-08-01 12:00:00",
	+ "2020-08-01 12:00:00",
	+ "2020-08-01 12:00:00"
	+ ), TMP = c("305.495", "305.245",
	+ "305.057", "305.745", "305.495"
	+ )), row.names = c(NA, 5L),
	+ class = "data.frame")
	> t(apply(simplify2array(
	+   strsplit(dat$forecast.date, "-| |:")),
	+   2, as.numeric))
	     [,1] [,2] [,3] [,4] [,5] [,6]
	[1,] 2020    8    1   12    0    0
	[2,] 2020    8    1   12    0    0
	[3,] 2020    8    1   12    0    0
	[4,] 2020    8    1   12    0    0
	[5,] 2020    8    1   12    0    0
	> simplify2array(parallel::mclapply(c(
	+   lubridate::year,
	+   lubridate::month,
	+   lubridate::day,
	+   lubridate::hour), function(FUN, x) {
	+     FUN(x)
	+   }, x=dat$forecast.date))
	     [,1] [,2] [,3] [,4]
	[1,] 2020    8    1   12
	[2,] 2020    8    1   12
	[3,] 2020    8    1   12
	[4,] 2020    8    1   12
	[5,] 2020    8    1   12
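
The strsplit() result above is an unnamed matrix that still carries the minute and second fields. A minimal sketch of one way to name the pieces and attach only the first four to the data frame follows; the column names Minute and Second and the two-row `dat` are assumptions for illustration.

```r
dat <- data.frame(
  forecast.date = c("2020-08-01 12:00:00", "2020-08-01 12:00:00"),
  TMP = c(305.495, 305.245),
  stringsAsFactors = FALSE
)

# Split on "-", " " or ":" and stack the pieces, one row per record
parts <- do.call(rbind, strsplit(dat$forecast.date, "[- :]"))
storage.mode(parts) <- "integer"
colnames(parts) <- c("Year", "Month", "Day", "Hour", "Minute", "Second")

# Keep only the four wanted fields; minutes and seconds are dropped
dat <- cbind(dat, parts[, c("Year", "Month", "Day", "Hour"), drop = FALSE])
dat
```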

V

r

#
Hello,

And another solution, taking advantage of Rasmus' one:



simplify2array(parallel::mclapply(c(
  "%Y",
  "%m",
  "%d",
  "%H"), function(fmt, x) {
    as.integer(format(as.POSIXct(x), format = fmt))
}, x = dta$forecast.date))
#     [,1] [,2] [,3] [,4]
#[1,] 2020    8    1   12
#[2,] 2020    8    1   12
#[3,] 2020    8    1   12
#[4,] 2020    8    1   12
#[5,] 2020    8    1   12


The data set dta is Jeff's; it's in dput format.

Hope this helps,

Rui Barradas

At 18:26 on 02/08/2020, Rasmus Liland wrote:

  
    
#
Hello,

I'm reposting, I sent the previous in HTML format.
My apologies, I'm not at my computers.

And another solution, taking advantage of Rasmus' one:


simplify2array(parallel::mclapply(c(
  "%Y",
  "%m",
  "%d",
  "%H"), function(fmt, x) {
    as.integer(format(as.POSIXct(x), format = fmt))
}, x = dta$forecast.date))
#     [,1] [,2] [,3] [,4]
#[1,] 2020    8    1   12
#[2,] 2020    8    1   12
#[3,] 2020    8    1   12
#[4,] 2020    8    1   12
#[5,] 2020    8    1   12



The data set dta is Jeff's; it's in dput format.
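
A possible variant of the same format() idea, sketched here with sapply() in place of parallel::mclapply(), which relies on forking and so is not available on Windows. Naming the format strings gives the result labelled columns. The vector `x`, the names in `fmts`, and the UTC time zone are assumptions for illustration.

```r
# Base R only; `x` stands in for dta$forecast.date.
x <- c("2020-08-01 12:00:00", "2020-08-02 06:00:00")

# Parse once, then format the parsed values once per component
when <- as.POSIXct(x, tz = "UTC")

fmts <- c(Year = "%Y", Month = "%m", Day = "%d", Hour = "%H")
parts <- sapply(fmts, function(fmt) as.integer(format(when, fmt)))
parts
```

The result is an integer matrix whose column names come from `fmts`, so it can be passed to cbind() together with the original data frame.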

Hope this helps,

Rui Barradas



At 22:54 on 02/08/2020, Rui Barradas wrote: