
Scatterplot Showing All Points

On 18/12/2007 10:01 AM, Antony Unwin wrote:
That plot is better than jittering, but there's the problem in the 
mosaic plot of understanding the scale of the rectangles:  is it area or 
diameter that encodes the count?  With a jittered plot, you lose 
resolution when the number of points gets too high because you just see 
a mess of ink, but at least you only require the viewer to count in 
order to get a close numerical reading from the plot.

I could also claim that while imperfect, at least jittering is widely 
applicable.  For example, if the data were not on a regular grid, 
perhaps because they had been generated like this:

xloc <- rnorm(50)
yloc <- rnorm(50)
index <- sample(1:50, 5000, replace = TRUE, prob = abs(xloc))
x <- xloc[index]
y <- yloc[index]

then jittering still works as well (or as poorly), but the imosaic would 
not work at all.  There are better plots than jittering available, but 
jittering is easy.

(Actually, with this dataset, plot(jitter(x), jitter(y)) is really poor, 
because jitter() chooses a bad amount of jittering.  But with manual 
tuning, e.g. plot(jitter(x, amount=0.1), jitter(y, amount=0.1), pch="."), 
it's not too bad.  So I'd say jittering worked, but the R implementation 
of it may need improvement.)
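
A rough sketch of that comparison, for anyone who wants to try it.  The 
set.seed() call and the two *_shift variables are my additions for 
illustration only.  As far as I can tell from ?jitter, the default amount 
is derived from the smallest gap between distinct values, which is tiny 
for irregularly placed points like these, so the default barely moves 
anything:

```r
set.seed(1)  # assumption: added only so the example is reproducible

# Data as generated above: 50 random locations, sampled 5000 times
xloc <- rnorm(50)
yloc <- rnorm(50)
index <- sample(1:50, 5000, replace = TRUE, prob = abs(xloc))
x <- xloc[index]
y <- yloc[index]

# Largest displacement applied by the default and by amount = 0.1
default_shift <- max(abs(jitter(x) - x))
manual_shift  <- max(abs(jitter(x, amount = 0.1) - x))

# Side-by-side plots: default jitter vs. manually tuned jitter
op <- par(mfrow = c(1, 2))
plot(jitter(x), jitter(y), main = "default jitter")
plot(jitter(x, amount = 0.1), jitter(y, amount = 0.1), pch = ".",
     main = "amount = 0.1")
par(op)
```

Comparing default_shift with manual_shift should show how much smaller 
the default displacement is on this kind of data.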
Yes, I probably wouldn't recommend jittering if there were more than a 
few hundred replications at any point, or more than a few hundred unique 
points.
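
One quick way to check a dataset against those rough limits is to count 
the unique points and the largest number of replications before 
plotting.  This is only a sketch: overplot_summary() is a made-up helper 
name, not something from base R or iplots, and the toy vectors are just 
for illustration.

```r
# Hypothetical helper: summarise how crowded a scatterplot would be.
overplot_summary <- function(x, y) {
  # Count the replications of each distinct (x, y) pair
  counts <- table(paste(x, y))
  c(unique_points = length(counts), max_replications = max(counts))
}

# Toy data: three distinct points, the first one repeated three times
x <- c(1, 1, 1, 2, 2, 3)
y <- c(1, 1, 1, 5, 5, 7)
overplot_summary(x, y)
```

If either number runs past a few hundred, that is where I would start 
looking at something other than jittering.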

Duncan Murdoch

P.S. iplots 1.1-1 may have an init problem in Windows: in my first 
attempt, the plot made the boxes too large to fit in their cells, but it 
fixed itself when I resized the window, and the bug doesn't seem to be 
repeatable.