Fanny Clustering

Ok,

How can I increase the memory of your computer available to R?
Well, if you would like to increase the memory of MY computer... you are
welcome to do so... but I doubt it would be of any use to you ;-)

You don't tell us how much RAM you currently have, which platform you
use, etc. The general approach is to use a computer with more RAM, up
to the limit that a 32-bit build of R can address, and then to switch to
a 64-bit version under Linux if you need even more RAM.
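
A back-of-the-envelope calculation may help to judge how much memory is
at stake. This is only a sketch under an assumption that is not stated in
the thread: that the clustering needs the full set of pairwise
dissimilarities, one double per pair, as dist() would produce. The
variable names are illustrative.

```r
# Rough memory estimate for the pairwise dissimilarities of n cases.
# Assumption: one double (8 bytes) per pair, n * (n - 1) / 2 pairs.
n <- 70000                      # number of rows reported in the question
n_pairs <- n * (n - 1) / 2      # double precision, so no integer overflow
gib <- n_pairs * 8 / 2^30       # size in GiB

n_pairs                         # 2449965000
n_pairs > .Machine$integer.max  # TRUE: exceeds 2^31 - 1 elements
round(gib, 1)                   # 18.3
```

If the assumption holds, the vector would hold more elements than a
32-bit index can address, which is consistent with the error reported,
and it suggests that subsampling is the more practical route here.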

The other proposed solution is not stupid. With 70,000 cases, you have a
fairly large dataset. You don't tell us how many groups you expect from
your clustering, but it is often better to use a few dozen or a few
hundred representative cases for each group, no more. In supervised
classification, it is easier to build such a training set with a
relatively balanced number of items in each group, because the target
classes are known a priori from the manual classification provided.

With unsupervised classification, you could either try a pure random
subsampling, or select your subsample based on similarity according to a
given distance measurement. I did something like that using a
Mahalanobis distance, MDS, and then stratified subsampling inside a
regular grid placed on top of the MDS plot.
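
The two subsampling options above could be sketched roughly as follows.
This is not the code I used at the time, only a minimal illustration on
made-up data: the dataset, the 10 x 10 grid, the per_cell limit and
k = 3 are all arbitrary assumptions, and whitening the data before
dist() is just one way to obtain Mahalanobis distances. Note that the
MDS step itself needs all pairwise distances, so on 70,000 rows it would
probably have to start from a preliminary random subsample.

```r
library(cluster)  # provides fanny(); a recommended package shipped with R

set.seed(1)
# Hypothetical stand-in data: 3 groups of 200 cases, 4 numeric variables
x <- matrix(rnorm(600 * 4), ncol = 4)
x[, 1] <- x[, 1] + rep(c(0, 6, 0), each = 200)
x[, 2] <- x[, 2] + rep(c(0, 0, 6), each = 200)

# Option 1: pure random subsample of 150 cases
idx_random <- sample(nrow(x), 150)

# Option 2: grid-stratified subsample on an MDS map.
# Whitening with the Cholesky factor of the covariance matrix makes
# Euclidean distances on z equal to Mahalanobis distances on x.
z <- x %*% solve(chol(cov(x)))
mds <- cmdscale(dist(z), k = 2)   # classical MDS in 2 dimensions

# Lay a 10 x 10 regular grid over the map and label each case by its cell
cell <- interaction(cut(mds[, 1], 10), cut(mds[, 2], 10), drop = TRUE)

# Keep at most per_cell cases from every occupied cell
per_cell <- 5
idx_grid <- unlist(lapply(split(seq_len(nrow(x)), cell), function(i) {
  i[sample.int(length(i), min(length(i), per_cell))]
}), use.names = FALSE)

# Fuzzy clustering on the subsample only; k = 3 is arbitrary here
fit <- fanny(x[idx_grid, ], k = 3)
head(round(fit$membership, 2))
```

The grid version keeps sparse regions of the map represented while
thinning out the dense ones, which is the point of stratifying rather
than sampling purely at random.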

Otherwise, I am not a specialist in unsupervised classification, and
other people here may have better suggestions.

Best,

Philippe Grosjean
2007/3/29, Philippe Grosjean <phgrosjean at sciviews.org>:
1) Reduce the size of your sample (random or stratified subsampling),

2) Increase the memory of your computer available to R.

Best,

Philippe Grosjean

..............................................<?}))><........
) ) ) ) )
( ( ( ( (    Prof. Philippe Grosjean
) ) ) ) )
( ( ( ( (    Numerical Ecology of Aquatic Systems
) ) ) ) )   Mons-Hainaut University, Belgium
( ( ( ( (
..............................................................

Sergio Della Franca wrote:
Dear R-Helpers,

I'd like to run a fanny clustering on my data set (70,000 rows), but
when I run the procedure I obtain this error:

error in vector("double", length): too big dimension for
the selected vector.

How can I solve this problem?

Thank you in advance.

Sergio Della Franca.


______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

