sparse matrix
On Mon, 8 Jan 2001, Martin Gotz wrote:
On Mon, 8 Jan 2001, Prof Brian Ripley wrote:
A matrix in R is a vector plus a dim attribute. I presume you mean a numeric matrix (there are other types). Then it needs 8 bytes of storage per entry.
Yes. Numeric matrix.
There are other ways to handle matrices: look at package Matrix, for example. One obvious representation is to store the non-zero elements and their locations. Just how large is this matrix, and does it have a pattern to its sparsity?
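The "store the non-zero elements and their locations" idea can be sketched in a few lines. This is a minimal illustration only, not how package Matrix is implemented; the class name `DefaultSparseMatrix` and its methods are hypothetical. The `default` parameter is an added assumption so that the common value need not be zero.

```python
class DefaultSparseMatrix:
    """Hypothetical sketch: keep only the entries that differ from a default value."""

    def __init__(self, n_rows, n_cols, default=0.0):
        self.shape = (n_rows, n_cols)
        self.default = default
        self.entries = {}  # maps (row, col) -> value for non-default entries only

    def _check(self, row, col):
        if not (0 <= row < self.shape[0] and 0 <= col < self.shape[1]):
            raise IndexError("index out of range")

    def __setitem__(self, key, value):
        row, col = key
        self._check(row, col)
        if value == self.default:
            # Writing the default value frees the slot instead of storing it.
            self.entries.pop((row, col), None)
        else:
            self.entries[(row, col)] = value

    def __getitem__(self, key):
        row, col = key
        self._check(row, col)
        return self.entries.get((row, col), self.default)

    def stored(self):
        """Number of entries actually held in memory."""
        return len(self.entries)


m = DefaultSparseMatrix(10000, 10000)
m[3, 7] = 2.5
```

Storage grows with the number of non-default entries rather than with the product of the dimensions, which is the whole point of the representation.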
10000 x 10000 elements and 50 000 x 50 000 elements. There is no pattern to the sparsity. The values in the matrix are distances between words (for 10000 and 50 000 words). 98% of the elements in the matrix are the same (with value 7500).
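To put numbers on these sizes, the arithmetic can be written out as below. The 8 bytes per numeric entry comes from the thread; the 16 bytes per stored sparse entry (one 8-byte value plus two 4-byte indices) is an assumption made for illustration, and the function names are hypothetical.

```python
BYTES_PER_DOUBLE = 8          # from the thread: 8 bytes per numeric entry
BYTES_PER_SPARSE_ENTRY = 16   # assumption: 8-byte value + two 4-byte indices


def dense_bytes(n):
    """Storage for a full n x n numeric matrix."""
    return n * n * BYTES_PER_DOUBLE


def lower_triangle_bytes(n):
    """Storage for the n*(n-1)/2 distances below the diagonal."""
    return n * (n - 1) // 2 * BYTES_PER_DOUBLE


def sparse_bytes(n, percent_stored=2):
    """Storage if only the entries differing from the common value are kept."""
    return n * n * percent_stored // 100 * BYTES_PER_SPARSE_ENTRY


for n in (10000, 50000):
    print(n, dense_bytes(n), lower_triangle_bytes(n), sparse_bytes(n))
```

Under these assumptions the dense matrices need 0.8 GB and 20 GB, the lower triangles about 0.4 GB and 10 GB, and the 2% of entries that differ from 7500 about 32 MB and 800 MB.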
And what do you want to do with it? Doing things with sparse matrices has a tendency to make them less sparse.
Hierarchical clustering. The input matrix is the distance matrix for hierarchical clustering with the function hclust.
No chance. You need to find an algorithm that does not store the distance matrix. I think *any* clustering algorithm on 50 000 elements is going to be pretty pointless, but other low-storage algorithms do exist.
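As one illustration of a clustering method that never stores the distance matrix, single-linkage merge heights can be obtained from a minimum spanning tree built with Prim's algorithm, asking for each distance on demand. This is only a sketch of that one technique, not the algorithm recommended in the thread: the function name is hypothetical, it covers single linkage only, and it still evaluates on the order of n^2 distances, so it saves memory rather than time.

```python
def single_linkage_heights(n, dist):
    """Return the n-1 minimum-spanning-tree edges (i, j, d), sorted by d.

    dist(i, j) is called on demand; only O(n) working memory is used.
    The sorted edge lengths are the single-linkage merge heights.
    """
    in_tree = [False] * n
    best = [float("inf")] * n  # shortest known distance from the tree to each point
    parent = [-1] * n          # tree point that achieves that distance
    in_tree[0] = True
    current = 0
    edges = []
    for _ in range(n - 1):
        nxt = -1
        for j in range(n):
            if in_tree[j]:
                continue
            d = dist(current, j)
            if d < best[j]:
                best[j] = d
                parent[j] = current
            if nxt == -1 or best[j] < best[nxt]:
                nxt = j
        edges.append((parent[nxt], nxt, best[nxt]))
        in_tree[nxt] = True
        current = nxt
    return sorted(edges, key=lambda e: e[2])


# Toy example: four points on a line, distance computed when requested.
points = [0, 1, 3, 7]
merges = single_linkage_heights(len(points), lambda i, j: abs(points[i] - points[j]))
```

For data like that described above, `dist` could look up a stored exception and otherwise return the common value 7500, so that only the 2% of distinct distances are ever held in memory.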
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595