Date: Wed, 27 Apr 2011 14:40:23 +0200
From: jonathan at k-m-p.nl
To: r-help at r-project.org
Subject: Re: [R] Speed up plotting to MSWindows graphics window
On 27/04/2011 13:18, Mike Marchywka wrote:
Date: Wed, 27 Apr 2011 11:16:26 +0200
From: jonathan at k-m-p.nl
To: r-help at r-project.org
Subject: [R] Speed up plotting to MSWindows graphics window
Hello,
I am working on a project analysing the performance of motor-vehicles
through messages logged over a CAN bus.
I am currently plotting the data in R, overlaying 5 or more plots of
data logged at 1 kHz (using plot.ts() and par(new = TRUE)).
The aim is to be able to pan, zoom in and out and get values from the
plotted graph using a custom Qt interface that is used as a front end to
R.exe (all this works).
The plot is drawn by R directly to the Windows graphics device.
The data is imported from a .csv file (typically around 100 MB) into a matrix
with columns (timestamp, message ID, byte0, byte1, ..., byte7).
I then split this matrix into several by message ID (dimensions are
on the order of 8 columns by 10^6 rows).
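
For readers following along, the split by message ID described above might look roughly like the sketch below. The object names (log_mat, by_id), the three message IDs and the random data are made up for illustration; the real log would come from the .csv import.

```r
# Hypothetical stand-in for the imported log: timestamp, message ID, byte0..byte7.
set.seed(1)
n <- 1000
log_mat <- cbind(
  timestamp = seq_len(n),                              # e.g. milliseconds at 1 kHz
  id        = sample(c(256, 512, 768), n, replace = TRUE),
  matrix(sample(0:255, n * 8, replace = TRUE), ncol = 8,
         dimnames = list(NULL, paste0("byte", 0:7)))
)

# Group the row indices by message ID, then extract one matrix per ID.
# drop = FALSE keeps the result a matrix even if an ID has a single row.
rows_by_id <- split(seq_len(nrow(log_mat)), log_mat[, "id"])
by_id <- lapply(rows_by_id, function(i) log_mat[i, , drop = FALSE])
```

Each element of by_id is named after its message ID (for example by_id[["256"]]) and keeps the rows in their original time order.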
The panning is done by redrawing the plots, shifted by a small amount,
so as to view a window of data from a second to a minute long that can
travel the length of the logged data.
My problem is that redrawing the plots whilst panning is too
slow when dealing with this much data,
i.e. I can see the last graphs being drawn to the screen in the
half-second following the view change.
I need a fluid change from one view to the next.
My question is this:
Are there ways to speed up the plotting on the MSWindows display?
By reducing plotted point densities to *sensible* values?
Well, hard to know, but it would help to know where all the time is going.
Usually people start complaining when VM thrashing is common, but if you are
CPU limited you could try restricting the range of data you want to plot
rather than relying on the plot to just clip the largely irrelevant points
when you are zoomed in. It should not be too expensive to find the window
limits, either incrementally or with a binary search on the ordered time series.
Presumably subsetting with foo[a:b,] is fast.
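
A minimal sketch of that idea, assuming the timestamps are sorted in non-decreasing order. The helper name window_rows and the toy data are hypothetical; findInterval() is base R, and its left.open argument needs R 3.3.0 or later.

```r
# Return the row indices whose timestamps fall in [a, b].
# ts must be sorted in non-decreasing order (findInterval() requires this).
window_rows <- function(ts, a, b) {
  lo <- findInterval(a, ts, left.open = TRUE) + 1L  # first index with ts >= a
  hi <- findInterval(b, ts)                         # last index with ts <= b
  if (lo > hi) integer(0) else lo:hi
}

# Toy data: timestamps in milliseconds and one signal column.
ts  <- 0:10000
foo <- cbind(timestamp = ts, value = sin(ts / 100))

idx <- window_rows(foo[, "timestamp"], 2000, 3000)
win <- foo[idx, , drop = FALSE]   # only these rows are handed to the plot call
# plot(win[, "timestamp"], win[, "value"], type = "l", xlim = c(2000, 3000))
```

Only the rows inside the view reach the graphics device, so the cost of a redraw should scale with the window length rather than with the whole log.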
One thing you may want to try for change of scale is wavelet or
multi-resolution analysis. You can make a tree (increasing memory usage,
but even VM here may not be a big penalty if coherence is high) and
display the resolution appropriate for the current scale.
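
As a rough illustration of the tree idea, here is a sketch that swaps in a plain min/max pyramid rather than a true wavelet transform: each level halves the resolution and keeps the lower and upper envelope, so spikes stay visible when zoomed out. The function names build_pyramid and pick_level are made up for this example.

```r
# Build a list of progressively coarser min/max envelopes of x.
# Level 1 is the raw signal; each further level merges adjacent pairs.
# A trailing unpaired sample is dropped at each step (acceptable for a sketch).
build_pyramid <- function(x, min_len = 2L) {
  levels <- list(list(lo = x, hi = x))
  while (length(levels[[length(levels)]]$lo) >= 2L * min_len) {
    prev <- levels[[length(levels)]]
    n <- length(prev$lo) %/% 2L
    i <- seq.int(1L, by = 2L, length.out = n)   # first index of each pair
    levels[[length(levels) + 1L]] <- list(
      lo = pmin(prev$lo[i], prev$lo[i + 1L]),
      hi = pmax(prev$hi[i], prev$hi[i + 1L])
    )
  }
  levels
}

# Choose the coarsest level that still gives at least `px` points in view.
# n_visible: raw samples inside the current window; px: horizontal pixels.
pick_level <- function(pyr, n_visible, px) {
  k <- max(0, floor(log2(n_visible / px)))
  min(length(pyr), k + 1L)
}

x   <- c(1, 5, 2, 8, 3, 3, 9, 0)
pyr <- build_pyramid(x)
lvl <- pick_level(pyr, n_visible = length(x), px = 2)
```

The pyramid costs roughly twice the memory of the raw signal per envelope, and it is built once; panning and zooming then only index into the level chosen for the current scale.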
I forgot to add, for plotting I use a command similar to:
plot.ts(timestampVector, dataVector, xlim=c(a,b))
a and b are timestamps from timestampVector
Is the xlim parameter sufficient for limiting the scope of the plots?
Or should I subset the timeseries each time I do a plot?