
A problem with string handling to make a time duration

4 messages · Gavin Rudge, John Laing, Franklin Bretschneider

#
I have a character string that represents a time duration. It has a rough
hours-minutes-seconds structure, with letters denoting the units (H, M or S),
no leading zeros, and no placeholder at all where one or other of the units
is not required.

It looks like this:

t<-c("10H20M33S","1H1M","1M","21M9S","2H55S")
df<-data.frame(t)
df

#ideally should look like:
t2<-c("10:20:33","01:01:00","00:01:00","00:21:09","02:00:55")
df2<-data.frame(t2)
df2

I need to get it into hours minutes and seconds either in time format or as
a string with leading zeros and all three time units represented in each
one, as in df2.  The data, part of a very large dataset, are for onward use
and processing in a GIS application.  I've messed about with string handling
statements in SQL to no avail, but wondered if R would be a better bet?
I've had a look at some of the commands in stringr, but am unsure how to
operationalise a solution using this package.  Any advice is welcome.




--
View this message in context: http://r.789695.n4.nabble.com/A-problem-with-string-handling-to-make-a-time-duration-tp4706795.html
Sent from the R help mailing list archive at Nabble.com.
#
Regular expressions are the tool for this problem. This pattern
matches your input data:

t <- c("10H20M33S", "1H1M", "1M", "21M9S", "2H55S")
patt <- "^(([0-9]+)H)?(([0-9]+)M)?(([0-9]+)S)?$"
all(grepl(patt, t)) # TRUE

We can use the pattern to extract the hour/minute/second components:

hms <- lapply(c(h="\\2", m="\\4", s="\\6"), function(r) sub(patt, r, t))

And then just plug those components back into the desired format:

formatted <- gsub(" ", "0", sprintf("%2s:%2s:%2s", hms$h, hms$m, hms$s))

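In the last line we pad with spaces and then swap them for zeros with
gsub, because zero-padding a string with %02s is platform-dependent
(?sprintf notes that it zero-pads on some platforms and is ignored on
others).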
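
A possible variation, in case the gsub step is unwanted: convert each
component to an integer first, so that the numeric %02d conversion can do
the zero-padding. This is only a sketch under the same pattern; the
function name to_hms is made up for illustration and is not from any
package.

```r
t <- c("10H20M33S", "1H1M", "1M", "21M9S", "2H55S")
patt <- "^(([0-9]+)H)?(([0-9]+)M)?(([0-9]+)S)?$"

# to_hms is a hypothetical helper name, not part of base R or stringr.
to_hms <- function(x) {
  # Fail early if any value does not fit the expected shape.
  stopifnot(all(grepl(patt, x)))
  # Pull out one capture group; an absent unit comes back as "" and is
  # treated as zero.
  part <- function(ref) {
    v <- sub(patt, ref, x)
    v[v == ""] <- "0"
    as.integer(v)
  }
  # %02d zero-pads integers, so no gsub is needed afterwards.
  sprintf("%02d:%02d:%02d", part("\\2"), part("\\4"), part("\\6"))
}

to_hms(t)
# "10:20:33" "01:01:00" "00:01:00" "00:21:09" "02:00:55"
```

Both versions are vectorised, so they take the whole column in one call.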

JL
On Mon, May 4, 2015 at 3:59 PM, gavinr <g.rudge at bham.ac.uk> wrote:
#
Hello gavinr,
This can be done easily with the substring function, e.g.

# say:

string <- "12H15M45S"

# then pick the fields by position (this assumes two digits per unit):

h <- substr(string, 1, 2)
m <- substr(string, 4, 5)

# and join again:

newstr <- paste(h, m, sep = ":")
# etcetera

Success and
Best regards,

Frank
--


Franklin Bretschneider
Dept of Biology
Utrecht University
bretschr at xs4all.nl
#
Thanks guys. The first solution with the gsub / lapply works perfectly. The
solution using substrings would work if the times were in a consistent
format, but without the leading zeros and with some parts of the string
absent completely it would need some extra logic to apply. I need something
to automate over a data set with a million or so time points in it.
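
With a million or so rows it may be worth flagging malformed entries
rather than letting them through, since sub returns non-matching input
unchanged. A minimal sketch that reuses the pattern from the first reply
and returns the duration as total seconds, with NA for anything that does
not fit. The name to_seconds is hypothetical, and total seconds is only
one guess at what a usable "time format" means downstream.

```r
patt <- "^(([0-9]+)H)?(([0-9]+)M)?(([0-9]+)S)?$"

# to_seconds is a hypothetical helper name used for illustration only.
to_seconds <- function(x) {
  # Valid entries match the pattern and are not empty strings.
  ok <- grepl(patt, x) & nzchar(x)
  # Pull out one capture group as a number; absent units and invalid
  # entries are set to zero here, and invalid entries become NA below.
  part <- function(ref) {
    v <- sub(patt, ref, x)
    v[!ok | v == ""] <- "0"
    as.numeric(v)
  }
  secs <- 3600 * part("\\2") + 60 * part("\\4") + part("\\6")
  secs[!ok] <- NA
  secs
}

to_seconds(c("10H20M33S", "1H1M", "bad", NA))
# 37233  3660    NA    NA
```

Applied to the data frame from the question this would be something like
df$secs <- to_seconds(as.character(df$t)).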


