printing difftime summary

18 messages · Sam Steingold, arun, David Winsemius +2 more

#
Hi,
I have a vector of difftime objects and I want to see its summary.
Alas:
--8<---------------cut here---------------start------------->8---
  Length    Class     Mode 
 9008386 difftime  numeric 
--8<---------------cut here---------------end--------------->8---
this is almost completely useless.
I can use as.numeric:
--8<---------------cut here---------------start------------->8---
structure(c(0.5, 1027, 5969, 29870, 28970, 603100), .Names = c("Min.", 
"1st Qu.", "Median", "Mean", "3rd Qu.", "Max."), class = c("summaryDefault", 
"table"))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
     0.5   1027.0   5969.0  29870.0  28970.0 603100.0 
--8<---------------cut here---------------end--------------->8---
but the printed representation is very unreadable: the fact that
603100.0 is almost exactly 7 days is not obvious.
Okay, maybe as.difftime will help?
--8<---------------cut here---------------start------------->8---
Time differences in secs
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
     0.5   1027.0   5969.0  29870.0  28970.0 603100.0
Time differences in hours
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
1.388889e-04 2.852778e-01 1.658056e+00 8.297222e+00 8.047222e+00 1.675278e+02 
--8<---------------cut here---------------end--------------->8---
nope; still unreadable.

What I really want to see _printed_ is something like this:
--8<---------------cut here---------------start------------->8---
       Min.     1st Qu.      Median        Mean     3rd Qu.        Max. 
"500.00 ms" "17.12 min" "99.48 min"  "8.30 hrs"  "8.05 hrs" "6.98 days" 
--8<---------------cut here---------------end--------------->8---
except that the quotes are not needed in the printed output.
Here I wrote:
--8<---------------cut here---------------start------------->8---
difftime2string <- function (x) {
  if (x < 1) return(sprintf("%.2f ms",x*1000))
  if (x < 100) return(sprintf("%.2f sec",x))
  if (x < 6000) return(sprintf("%.2f min",x/60))
  if (x < 108000) return(sprintf("%.2f hrs",x/3600))
  if (x < 400*24*3600) return(sprintf("%.2f days",x/(24*3600)))
  sprintf("%.2f years",x/(365.25*24*3600))
}
--8<---------------cut here---------------end--------------->8---

So, what is "The Right R Way" to print a summary of difftime objects?
Thanks!
#
Hello,

Just a doubt. Are you looking for some other function (difftime2string) or just to remove the quotes from the printed output?

If it is the latter, then this should do it.
res<-do.call(data.frame,lapply(s,difftime2string))
names(res)<-names(s)
res
#       Min.   1st Qu.    Median     Mean  3rd Qu.      Max.
#1 500.00 ms 17.12 min 99.48 min 8.30 hrs 8.05 hrs 6.98 days
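
A minimal sketch of the same idea without the data frame, assuming the six summary values from the first message are in seconds. The vector s is typed in by hand here as a stand-in for the summary of the real data; the quotes disappear once the character vector is printed with quote = FALSE.

```r
# difftime2string as posted in the first message: takes ONE number of seconds.
difftime2string <- function (x) {
  if (x < 1) return(sprintf("%.2f ms",x*1000))
  if (x < 100) return(sprintf("%.2f sec",x))
  if (x < 6000) return(sprintf("%.2f min",x/60))
  if (x < 108000) return(sprintf("%.2f hrs",x/3600))
  if (x < 400*24*3600) return(sprintf("%.2f days",x/(24*3600)))
  sprintf("%.2f years",x/(365.25*24*3600))
}

# Stand-in for the summary shown in the first message (values copied from it).
s <- c(Min. = 0.5, "1st Qu." = 1027, Median = 5969,
       Mean = 29870, "3rd Qu." = 28970, Max. = 603100)

# vapply keeps the names of s and insists on exactly one string per element.
out <- vapply(s, difftime2string, character(1))

# quote = FALSE drops the quotes; print(noquote(out)) is equivalent.
print(out, quote = FALSE)
```

Since difftime2string uses plain if(), it has to be applied element by element, which is why vapply (or sapply) is needed.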

A.K.




----- Original Message -----
From: Sam Steingold <sds at gnu.org>
To: r-help at r-project.org
Cc: 
Sent: Wednesday, November 21, 2012 2:22 PM
Subject: [R] printing difftime summary

#
Hi,
I am wondering what others do when they want to see a summary of difftime.
cool, thanks.
I now think that what I want is
--8<---------------cut here---------------start------------->8---
difftime.summary <- function (v) {
  s <- summary(as.numeric(v))
  r <- as.data.frame(sapply(s,difftime2string),stringsAsFactors=FALSE)
  names(r) <- c("string")
  r[[units(v)]] <- s
  r
}
           string     secs
Min.    500.00 ms      0.5
1st Qu. 17.12 min   1027.0
Median  99.48 min   5969.0
Mean     8.30 hrs  29870.0
3rd Qu.  8.05 hrs  28970.0
Max.    6.98 days 603100.0
--8<---------------cut here---------------end--------------->8---
#
On Thu, Nov 22, 2012 at 4:01 AM, Sam Steingold <sds at gnu.org> wrote:
Any reason not to call it summary.difftime to get S3 dispatch?

MW
#
I hoped that someone would ask this :-)

1. because its argument has type "vector of difftime", not "difftime"
(coming from CLOS, I do not expect summary(vector of difftime) to
dispatch to summary.difftime, but to summary.vector.of.difftime or something)

2. because difftime.summary returns a data.frame and not a
"Classes 'summaryDefault', 'table'" as I assume summary must return.

if these are not valid issues, then I wonder why my function should not
be the system default method.
#
On Thu, Nov 22, 2012 at 5:49 PM, Sam Steingold <sds at gnu.org> wrote:
I'm not sure that's a suitable distinction in R. (Almost) All objects
are vectors (either generic or atomic) and all that....
See http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-should-I-write-summary-methods_003f
#
what are the requirements on the class summary.foo?
does it have to inherit from some other class?
how do I define a class?
#
On Nov 23, 2012, at 12:35 PM, Sam Steingold wrote:

            
I'm not sure it makes sense to frame the question this way.
summary.foo would not be a class but rather a 'summary' method/function
that applies to items of class 'foo'.
There is implicit inheritance from the vector "class" and sometimes
default methods will assume a numeric vector. But if you define an
object to be of a particular class, it would not need to have an
explicit inheritance defined.
That's pretty easy. Read:

?class         # and the pages to which it links.

(And also read ?methods, and the pages to which it links. Then read  
Sect 5 Object-oriented programming in the "R Language Definition".)
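
To make the method-versus-class distinction concrete, here is a small sketch with a made-up class "foo" (the class name and the fields n and range are purely illustrative, not anything from base R). The summary method returns an object whose class is "summary.foo", and a separate print method handles how that object is displayed.

```r
# Hypothetical class "foo": a plain numeric vector with a class attribute.
x <- structure(c(3, 1, 2), class = "foo")

# 'summary' METHOD for class "foo"; summary(x) finds it through S3 dispatch.
summary.foo <- function(object, ...) {
  v <- unclass(object)
  # The returned object gets its own class so that it can be printed specially.
  structure(list(n = length(v), range = range(v)), class = "summary.foo")
}

# 'print' METHOD for class "summary.foo".
print.summary.foo <- function(x, ...) {
  cat("foo object:", x$n, "values between", x$range[1], "and", x$range[2], "\n")
  invisible(x)
}

s <- summary(x)   # dispatches on class(x), so summary.foo is called
print(s)          # dispatches on class(s), so print.summary.foo is called
```

No class definition step is needed for S3: setting the class attribute is all there is to it.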

--

David Winsemius, MD
Alameda, CA, USA
1 day later
#
--8<---------------cut here---------------start------------->8---
summary.difftime <- function (v) {
  s <- summary(as.numeric(v))
  r <- as.data.frame(sapply(s,difftime2string),stringsAsFactors=FALSE)
  names(r) <- c("string")
  r[[units(v)]] <- s
  class(r) <- c("data.frame","summary.difftime")
  r
}
print.summary.difftime <- function (sd) print.data.frame(sd)
--8<---------------cut here---------------end--------------->8---

it appears to work for a single vector:

--8<---------------cut here---------------start------------->8---
           string     secs
Min.    492.00 ms      0.5
1st Qu. 18.08 min   1085.0
Median   1.77 hrs   6370.0
Mean     8.20 hrs  29530.0
3rd Qu.  8.12 hrs  29250.0
Max.    6.98 days 602900.0
Classes 'summary.difftime' and 'data.frame':	6 obs. of  2 variables:
 $ string: chr  "492.00 ms" "18.08 min" "1.77 hrs" "8.20 hrs" ...
 $ secs  :Classes 'summaryDefault', 'table'  num [1:6] 4.92e-01 1.08e+03 6.37e+03 2.95e+04 2.92e+04 ...
--8<---------------cut here---------------end--------------->8---

but not as a part of data frame:

--8<---------------cut here---------------start------------->8---
Error in summary.difftime(X[[22L]], ...) : 
  unused argument(s) (maxsum = 7, digits = 12)
--8<---------------cut here---------------end--------------->8---

I guess I should somehow accept a list of options in summary.difftime()
and pass them on to the inner call to summary() (or should it be
explicitly summary.numeric()?)

how do I do that?
#
On Nov 24, 2012, at 7:48 PM, Sam Steingold wrote:

            
In the usual way. If you know that the function will be called with  
arguments from the summary.data.frame function then you should allow  
the argument list to accept them. You can ignore them or provide  
provisions for them. You just can't define your function to have only  
one argument if you expect (as you should since you passed summary a  
dataframe object) that it might be called within summary.data.frame.

This is the argument list for summary.data.frame:

 >   summary.data.frame
function (object, maxsum = 7, digits = max(3, getOption("digits") -
     3), ...)
summary.difftime <- function (v, ... ) { ................

There are many asked and answered questions on rhelp about how to deal  
with the "dots" arguments.
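
A small sketch of the difference, using a made-up class "delay" (hypothetical, so that nothing in base R is overridden). The one-argument method fails as soon as a caller supplies maxsum and digits, as in the error message quoted earlier in the thread; the version with dots accepts them and hands them on.

```r
# Hypothetical class "delay": a numeric vector of seconds.
x <- structure(c(0.5, 1027, 5969), class = "delay")

# One-argument method: any extra argument from the caller is an error.
summary.delay <- function(object) summary(unclass(object))
err <- tryCatch(summary(x, maxsum = 7, digits = 12),
                error = function(e) conditionMessage(e))
print(err)

# Same method with dots: extra arguments are accepted and passed on
# to the inner summary() call.
summary.delay <- function(object, ...) summary(unclass(object), ...)
ok <- summary(x, maxsum = 7, digits = 12)
print(ok)
```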
1 day later
#
this overcomes the summary generation, but not printing:

--8<---------------cut here---------------start------------->8---
summary.difftime <- function (v, ...) {
  s <- summary(as.numeric(v), ...)
  r <- as.data.frame(sapply(s,difftime2string),stringsAsFactors=FALSE)
  names(r) <- c("string")
  r[[units(v)]] <- s
  class(r) <- c("data.frame","summary.difftime")
  r
}
print.summary.difftime <- function (sd) print.data.frame(sd)
--8<---------------cut here---------------end--------------->8---

summary(infl), where infl$delay is a difftime vector, prints

...
                   
    delay                                                                             
 string:c("492.00 ms", "18.08 min", "1.77 hrs", "8.20 hrs", "8.13 hrs", "6.98 days")  
 secs  :c("     0.5", "  1085.1", "  6370.2", " 29534.4", " 29254.0", "602949.7")     
                                                                                      
                                                                                      

instead of something like

   delay
   Min.:    492 ms
   1st Qu.: 18.08 min

&c

so, how do I arrange for a proper printing of difftime summary as a part
of the data frame summary?

  
    
#
On Nov 26, 2012, at 7:14 AM, Sam Steingold wrote:

            
If you like a particular format from an existing print method then why  
not look it up and copy the code?

methods(print)
#
On Mon, Nov 26, 2012 at 4:46 PM, David Winsemius <dwinsemius at comcast.net> wrote:
Surely reversed no? summary.difftime inherits from data.frame I would
have assumed.
What is this supposed to do exactly? If you have inheritance why have
the subclass method do nothing other than call the parent method?
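
A short sketch of why the order of the class vector matters, using a made-up generic describe() and a made-up class "summary.demo" (both hypothetical). S3 dispatch tries the classes from left to right, so the more specific class has to come first.

```r
# Hypothetical generic with one method per class.
describe <- function(x, ...) UseMethod("describe")
describe.data.frame   <- function(x, ...) "data.frame method"
describe.summary.demo <- function(x, ...) "summary.demo method"

df <- data.frame(a = 1)

# "data.frame" first: the data.frame method wins, the other is never reached.
wrong <- structure(df, class = c("data.frame", "summary.demo"))
# Specific class first: its method is used, with data.frame as the fallback.
right <- structure(df, class = c("summary.demo", "data.frame"))

print(describe(wrong))
print(describe(right))
```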

Michael
#
the problem is that I cannot figure out which function prints this:
I added cat()s to print.summary.difftime and I do not see them, so it
appears that I have no direct control over how a summary.difftime is
printed as a part of a summary of a data.frame.


--8<---------------cut here---------------start------------->8---
summary.difftime <- function (v, ...) {
  s <- summary(as.numeric(v), ...)
  r <- as.data.frame(sapply(s,difftime2string),stringsAsFactors=FALSE)
  names(r) <- c("string")
  r[[units(v)]] <- s
  class(r) <- c("summary.difftime","data.frame")
  invisible(r)
}
print.summary.difftime <- function (sd, ...) {
  cat("[[[print.summary.difftime]]]\n")
  print(list(...))
  print.data.frame(sd, ...)
}
--8<---------------cut here---------------end--------------->8---
#
It looks like summary.data.frame(d) calls format(d[[i]]) for i in seq_len(ncol(d))
and pastes the results together into a "table" object for printing.  Hence, write
a format.summary.difftime if you want objects of class "summary.difftime" (which
I assume summary.difftime produces) to be formatted as you wish when a
difftime object is in a data.frame.  Once you've written it, have your print.summary.difftime
call it too.

E.g., with the following methods
summary.difftime <- function(x, ...) {
         ret <- quantile(x, p=(0:2)/2, na.rm=TRUE)
         class(ret) <- c("summary.difftime", class(ret))
         ret
}
format.summary.difftime <- function(x, ...) c(Min.Med.Max = paste(collapse="...", NextMethod("format")))
print.summary.difftime <- function(x, ...){ print(format(x), quote=FALSE) ; invisible(x) }

I get
Num         Date                    Delta
 Min.   :1   Min.   :2012-11-26   Min.Med.Max: 1 days... 4 days...16 days
 1st Qu.:2   1st Qu.:2012-11-27
 Median :3   Median :2012-11-28
 Mean   :3   Mean   :2012-11-28
 3rd Qu.:4   3rd Qu.:2012-11-29
 Max.   :5   Max.   :2012-11-30
Min.Med.Max
 1 days... 4 days...16 days

My summary.difftime inherits from difftime so the format method is not really
needed, as format.difftime does a reasonable job (except that it does not copy
the input names to its output).  I put it in to show how it gets called.


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
Thanks a lot - almost there!

--8<---------------cut here---------------start------------->8---
format.summary.difftime <- function(sd, ...) {
  t <- matrix(sd$string)
  rownames(t) <- rownames(sd)
  print(t)
  format(as.table(t))
}
print.summary.difftime <- function (sd, ...) {
  print(format(sd), quote=FALSE)
  invisible(sd)
}
--8<---------------cut here---------------end--------------->8---

this almost works:

--8<---------------cut here---------------start------------->8---
share.id         min              max           
 12cf12372b87cce9:      1   NULL:492.00 ms   NULL:492.00 ms  
 12cf36060bdb9581:      1   NULL:3.70 min    NULL:21.80 min  
 12d2665c906bb232:      1   NULL:20.32 min   NULL:3.26 hrs   
 12d2802f1435b4cd:      1   NULL:5.52 hrs    NULL:13.78 hrs  
 12d292988f5f8422:      1   NULL:2.81 hrs    NULL:16.20 hrs  
 12d29dd2894e2790:      1   NULL:6.95 days   NULL:6.98 days  
--8<---------------cut here---------------end--------------->8---

why do I see NULLs?!

--8<---------------cut here---------------start------------->8---
[,1]       
Min.    "492.00 ms"
1st Qu. "3.70 min" 
Median  "20.32 min"
Mean    "5.52 hrs" 
3rd Qu. "2.81 hrs" 
Max.    "6.95 days"
A        
Min.    492.00 ms
1st Qu. 3.70 min 
Median  20.32 min
Mean    5.52 hrs 
3rd Qu. 2.81 hrs 
Max.    6.95 days
A          
Min.    "492.00 ms"
1st Qu. "3.70 min "
Median  "20.32 min"
Mean    "5.52 hrs "
3rd Qu. "2.81 hrs "
Max.    "6.95 days"

  
    
#
because
Replace your call of the form
  format(difftimeObject)
with
  structure(format(difftimeObject), names=names(difftimeObject))
to work around this.
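
A minimal sketch of that workaround on a small hand-made difftime vector (the three values are made up for illustration). Whether format() on a difftime keeps the names may depend on the R version, which is an assumption here; re-attaching them is harmless either way.

```r
# Hand-made named difftime vector, in days.
d <- structure(c(Min. = 1, Median = 4, Max. = 16),
               class = "difftime", units = "days")

# Format, then put the names of the input back on the result.
out <- structure(format(d), names = names(d))

print(out, quote = FALSE)
```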


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
Looks like
format.summary.difftime <- function(sd, ...) structure(sd$string,
names=rownames(sd))
does the job.
any reason not to use it?
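
For reference, a sketch of how the three pieces discussed in this thread fit together, written against a made-up class "delay" (a numeric vector of seconds) rather than difftime itself, so that no base R method is replaced. Unlike the data-frame-based versions above, the summary object here is just a named numeric vector with a class, which is an assumption made to keep the example short.

```r
# difftime2string as posted in the first message: takes ONE number of seconds.
difftime2string <- function (x) {
  if (x < 1) return(sprintf("%.2f ms",x*1000))
  if (x < 100) return(sprintf("%.2f sec",x))
  if (x < 6000) return(sprintf("%.2f min",x/60))
  if (x < 108000) return(sprintf("%.2f hrs",x/3600))
  if (x < 400*24*3600) return(sprintf("%.2f days",x/(24*3600)))
  sprintf("%.2f years",x/(365.25*24*3600))
}

# summary method: named numeric vector of seconds, tagged with its own class.
summary.delay <- function(object, ...) {
  s <- summary(unclass(object), ...)
  structure(setNames(as.numeric(s), names(s)), class = "summary.delay")
}

# format method: NAMED character vector, one readable string per statistic.
format.summary.delay <- function(x, ...) {
  structure(vapply(unclass(x), difftime2string, character(1)),
            names = names(x))
}

# print method: reuses the format method, without quotes.
print.summary.delay <- function(x, ...) {
  print(format(x, ...), quote = FALSE)
  invisible(x)
}

delay <- structure(c(0.5, 1027, 5969, 603100), class = "delay")
s <- summary(delay)
print(s)
```

The point of routing print through format is the one made above: code that formats each column's summary then gets the same strings that print shows.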
On Mon, Nov 26, 2012 at 7:36 PM, William Dunlap <wdunlap at tibco.com> wrote: