Efficiency: as.POSIXct() or ISOdate()

1 message · Don MacQueen

I am using awk to pre-process several data files that have date-time 
information in a non-standard format. The files can run to many tens 
of thousands of lines. At present, I plan to use scan(pipe()) or 
perhaps read.table(pipe()) with whatever other arguments are 
necessary.
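
A minimal sketch of the read.table(pipe()) route, for illustration only. The input layout ("dd-mm-yyyy/HH:MM:SS value"), the awk one-liner, and the column names are all made up for this example, and it assumes awk is on the PATH of a Unix-like system.

```r
# Hypothetical input file; the timestamp layout is invented for this sketch.
tmp <- tempfile(fileext = ".txt")
writeLines(c("22-06-2001/14:30:00 1.5",
             "23-06-2001/09:05:10 2.5"), tmp)

# awk splits each line on '-', '/', ':' and space, then prints
# year month day hour minute second value, separated by spaces.
cmd <- paste("awk -F'[-/: ]' '{print $3, $2, $1, $4, $5, $6, $7}'",
             shQuote(tmp))

# read.table() opens the pipe, reads the awk output, and closes it again.
dat <- read.table(pipe(cmd),
                  col.names = c("yr", "mo", "dy", "hr", "mi", "se", "value"))
print(dat)
```

Printing the six date-time fields separately keeps both options open: they can be used as numeric columns directly, or pasted back into one string later.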

It occurs to me that using awk I have the opportunity to read the 
datetime information either as a single character vector of datetimes 
to pass to as.POSIXct(), or as six numeric vectors to pass to ISOdate().
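
One way to compare the two routes is simply to time them on synthetic data of about the right size. The sketch below is written against a current R rather than the 1.3.0 release listed at the end of this post; the vector length, the starting date, and the variable names are arbitrary choices for illustration. Timings depend on platform and R version, so none are quoted here.

```r
# Synthetic stand-in for the awk output (all values are made up).
n    <- 20000
base <- as.POSIXct("2001-06-22 00:00:00", tz = "GMT")
secs <- seq(0, by = 61, length.out = n)

# Route 1: one character vector, parsed with as.POSIXct().
stamps <- format(base + secs, "%Y-%m-%d %H:%M:%S", tz = "GMT")
t1 <- system.time(
  a <- as.POSIXct(stamps, format = "%Y-%m-%d %H:%M:%S", tz = "GMT")
)

# Route 2: six numeric vectors, combined with ISOdate().
lt <- as.POSIXlt(base + secs, tz = "GMT")
yr <- lt$year + 1900   # POSIXlt stores years since 1900
mo <- lt$mon + 1       # POSIXlt stores months as 0-11
dy <- lt$mday
hr <- lt$hour
mi <- lt$min
se <- lt$sec
t2 <- system.time(
  b <- ISOdate(yr, mo, dy, hr, mi, se, tz = "GMT")
)

# Both routes should give the same instants; compare elapsed times.
same <- all(as.numeric(a) == as.numeric(b))
print(same)
print(rbind(as.POSIXct = t1, ISOdate = t2))
```

Passing an explicit format to as.POSIXct() avoids having R try several default formats in turn, which seems the fairer comparison when the layout is known in advance.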

I would appreciate advice about the relative merits. I would tend to 
prefer the faster one, if there is much difference.

Thanks
-Don

p.s.
I haven't quite sorted out the options, i.e., when to use strftime(), 
when to use format.POSIXct(), when to use as.POSIXct(). At the 
moment, though, I believe I would use as.POSIXct(). I don't at this 
point see any reason to use the "lt" structure in my current 
application.
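
As a rough guide to how these functions divide up, sketched against a current R (behaviour in 1.3.0 may differ in detail): as.POSIXct() goes from text to a date-time object, while format() and strftime() go from a date-time object back to text. The date and format strings are arbitrary examples.

```r
# Text -> date-time: as.POSIXct() parses a string into seconds since 1970.
x <- as.POSIXct("2001-06-22 14:30:00", tz = "GMT")
print(class(x))        # "POSIXct" "POSIXt"
print(unclass(x))      # the underlying number of seconds, plus a tzone attribute

# Date-time -> text: format() dispatches to format.POSIXct() for this object.
s1 <- format(x, "%d/%m/%Y %H:%M")

# strftime() converts to POSIXlt first and then formats, so with the same
# time zone it gives the same string.
s2 <- strftime(x, "%d/%m/%Y %H:%M", tz = "GMT")
print(c(s1, s2))

# The "lt" structure is only needed to pull out individual fields.
h <- as.POSIXlt(x, tz = "GMT")$hour
print(h)
```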
_
platform powerpc-apple-darwin1.3.7
arch     powerpc
os       darwin1.3.7
system   powerpc, darwin1.3.7
status
major    1
minor    3.0
year     2001
month    06
day      22
language R

(and I will probably also use a Solaris version at some point)