scatterplot of 100000 points and pdf file format

4 messages · Liaw, Andy, Hadley Wickham, (Ted Harding) +1 more

#
Do you know if the majority of that time is spent in unique() itself?  If so,
which method?  What I see is:
[1] 0.25 0.01 0.26   NA   NA
[1] 101.80   0.34 104.61     NA     NA
[1] 10.17  0.00 10.24    NA    NA
[1] 23.94  0.11 24.15    NA    NA

Andy
#
Another possibility might be to use a 2d kernel density estimate (e.g.
kde2d from library(MASS)).  Then plot the density contours for the
high-density areas and the individual points for the low-density
areas.
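
A minimal sketch of that idea, under several assumptions that are not
from the thread: the data are simulated, the grid is 50 x 50, and the
5% of points with the lowest estimated density are the ones drawn
individually. The density at each point is approximated by the value
at a nearby grid node, which is crude but cheap.

```r
library(MASS)  # provides kde2d

set.seed(1)
n <- 100000
x <- rnorm(n)  # simulated data, for illustration only
y <- rnorm(n)

# 2d kernel density estimate on a 50 x 50 grid (grid size is an assumption)
dens <- kde2d(x, y, n = 50)

# Approximate the density at each point by the grid node at the
# lower corner of the cell the point falls in
ix <- findInterval(x, dens$x, all.inside = TRUE)
iy <- findInterval(y, dens$y, all.inside = TRUE)
d  <- dens$z[cbind(ix, iy)]

# Assumed cut-off: the 5% of points with the lowest density count as sparse
cutoff <- quantile(d, 0.05, names = FALSE)
sparse <- d < cutoff

# Individual points only where the density is low ...
plot(x[sparse], y[sparse], pch = ".",
     xlim = range(x), ylim = range(y), xlab = "x", ylab = "y")

# ... and contours from the cut-off level upwards elsewhere
lev <- pretty(range(dens$z), 10)
lev <- c(cutoff, lev[lev > cutoff])
contour(dens$x, dens$y, dens$z, levels = lev, add = TRUE)
```

Only the points flagged as sparse are drawn one by one, so a pdf
made from this plot holds far fewer objects than one containing all
100000 points.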

Hadley
#
Hi Andy,
On 25-Nov-04 Liaw, Andy wrote:
I want to look into this a bit more systematically (I have
an idea why 'unique' may be taking longer on the array from
'cbind' than on the dataframe), but I will be doing this on
a much faster machine than I immediately have to hand, so
will report results (if interesting) later.

Meanwhile, I'm not sure what you mean by "which method?",
and I'm also wondering what "gcFirst" is about.

Thanks,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 25-Nov-04                                       Time: 14:30:39
------------------------------ XFMail ------------------------------
#
(Ted Harding) <Ted.Harding at nessie.mcc.ac.uk> writes:
Just look inside the functions. One is pasting columns together, the
other is using a paste() construct inside an apply() function. So with
two columns by 1e6 rows, one is doing one large paste and the other a
million small ones.
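
A rough sketch of the difference between the two strategies. The
helper names key_cols and key_rows are made up for illustration and
are not the code inside unique() (which may also differ between R
versions), and the data are simulated with fewer rows than the 1e6
mentioned above so that it runs quickly.

```r
set.seed(1)
n <- 1e5  # fewer rows than the 1e6 discussed, to keep the run short
x <- sample(1000, n, replace = TRUE)
y <- sample(1000, n, replace = TRUE)
df <- data.frame(x, y)
m  <- cbind(x, y)

# Strategy 1: one large vectorised paste over whole columns
key_cols <- function(d) {
  do.call(paste, c(unname(as.list(d)), sep = "\r"))
}

# Strategy 2: one small paste per row, called via apply()
key_rows <- function(mat) {
  apply(mat, 1, function(r) paste(r, collapse = "\r"))
}

t_cols <- system.time(k1 <- key_cols(df))
t_rows <- system.time(k2 <- key_rows(m))

print(t_cols)
print(t_rows)

# Both give the same row keys, so duplicated() on either one
# finds the same unique rows
print(all(k1 == k2))
```

The timings depend on the machine and the R version, so none are
quoted here; the point is only that the second form calls paste()
once per row.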