Skip to content
Prev 1563 / 15274 Next

Aggregating Statistics By Time Interval

Here is one more solution.  Using the tmp from your post, this one uses
SQLite via the sqldf package.  It produces a similar output as z from
our prior solution and then
we can use xtabs, reshape or the reshape package as before to get
the final layout.  The first subselect within the main select sums within
hour and spread and the second sums within hour.  We join the subselects
and take the ratio of the two the sums to get the answer.

In the solutions before we used hour relative to GMT rather than local time.
+   100 * aa.timediff / bb.timediff timediff from
+   (select
+     strftime('%H',a.time__1,'unixepoch') ahour,
+     strftime('%H',b.time__1,'unixepoch') bhour,
+     a.spread spread,
+     sum(b.time__1 - a.time__1) timediff
+    from tmp a, tmp b
+    where a.row_names = b.row_names-1 and ahour = bhour
+    group by ahour, a.spread) aa join
+    (select strftime('%H',c.time__1,'unixepoch') chour,
+       strftime('%H',d.time__1,'unixepoch') dhour,
+       sum(d.time__1 - c.time__1) timediff
+       from tmp c, tmp d
+       where c.row_names = d.row_names-1 and chour = dhour
+       group by chour) bb
+     where ahour = chour
+     group by spread, ahour",
+  row.names = TRUE)
  ahour spread  timediff
1    11  1e-04 91.358025
2    12  1e-04 92.613636
3    11  2e-04  8.641975
4    12  2e-04  5.681818
5    12  3e-04  1.704545
+ y <- data.frame(timediff = diff(x$time), head(x, -1))
+ aggregate(100 * y[1]/sum(y[1]), y[c("hour", "spread")], sum)
+ }
hour spread  timediff
11.1   11  1e-04 91.358025
11.2   11  2e-04  8.641975
12.1   12  1e-04 92.613636
12.2   12  2e-04  5.681818
12.3   12  3e-04  1.704545

Here is input:

library(sqldf)
sqldf("select ahour, spread,
	100 * aa.timediff / bb.timediff timediff from
	(select
	strftime('%H',a.time__1,'unixepoch') ahour,
	strftime('%H',b.time__1,'unixepoch') bhour,
	a.spread spread,
	sum(b.time__1 - a.time__1) timediff
	from tmp a, tmp b
	where a.row_names = b.row_names-1 and ahour = bhour
	group by ahour, a.spread) aa join
	(select strftime('%H',c.time__1,'unixepoch') chour,
		strftime('%H',d.time__1,'unixepoch') dhour,
		sum(d.time__1 - c.time__1) timediff
		from tmp c, tmp d
		where c.row_names = d.row_names-1 and chour = dhour
		group by chour) bb
	where ahour = chour
	group by spread, ahour",
	row.names = TRUE)
On 8/3/07, Gabor Grothendieck <ggrothendieck at gmail.com> wrote: