On Wed, 3 Aug 2005, Nestor Fernandez wrote:
Dear all,
I'm trying to estimate clusters from a very large dataset using clara, but
the program stops with a memory error. The (very simple) code and the error:
library(foreign)  # provides read.dbf()
library(cluster)  # provides clara()
mydata <- read.dbf(file = "fnorsel_4px.dbf")
my.clara.7k <- clara(mydata, k = 7)
Error: cannot allocate vector of size 465108 Kb
The dataset contains >3,000,000 rows and 15 columns. I'm using a Windows
computer with 1.5G RAM; I also tried changing the memory limit to the
maximum possible (4000M).
Actually, the limit is probably 2048M: see the rw-FAQ entry on memory limits.
Is there a way to calculate clara clusters from such large datasets?
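
For a rough sense of scale, the size in the error message can be turned back
into a row count and compared with the 2048M figure. This is only a sketch: it
assumes the failed allocation is a single numeric (double) matrix with 15
columns at 8 bytes per value, which the message itself does not confirm, and
the variable names are illustrative.

```r
# Size of the failed allocation, copied from the error message (in Kb).
failed_kb <- 465108
failed_bytes <- failed_kb * 1024

# Assumption: one double-precision matrix, 8 bytes per value, 15 columns.
bytes_per_double <- 8
n_cols <- 15

# Rows that a 15-column double matrix of that size would hold.
implied_rows <- failed_bytes / (bytes_per_double * n_cols)

# The same allocation in Mb, and how many such copies fit under 2048M.
size_mb <- failed_kb / 1024
copies_under_limit <- floor(2048 / size_mb)

cat("implied rows:", round(implied_rows), "\n")
cat("size of one copy (Mb):", round(size_mb, 1), "\n")
cat("copies that fit under 2048M:", copies_under_limit, "\n")
```

Under those assumptions the request corresponds to just under 4 million rows,
which agrees with ">3,000,000 rows", and one copy of the data is about 454 Mb,
so a handful of copies would already reach the probable limit.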
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595