
How to create a high-dimensional matrix

4 messages · R. Michael Weylandt, lrl

lrl
#
Hi, everyone

I need to create a 429497 x 429497 matrix. 
When I use
*matrix(0,429497,429497)*
I get the error: Error in matrix(0, 429497, 429497) : too many
elements specified

Then I tried the "ff" package to store the matrix on disk:
*x <- ff(0, dim = c(429497, 429497))*
and got this error:
Error in if (length < 0 || length > .Machine$integer.max) stop("length must
be between 1 and .Machine$integer.max") : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
In ff(0, dim = c(429497, 429497)) : NAs introduced by coercion

I am using Unix. The free memory is about 33G, and .Machine$integer.max is
[1] 2147483647

What can I do to create such a high-dimensional matrix?

Many thanks!

Ruiling Liu




--
View this message in context: http://r.789695.n4.nabble.com/How-to-create-a-high-dimensional-matrix-tp4646396.html
Sent from the R help mailing list archive at Nabble.com.
#
On Tue, Oct 16, 2012 at 8:46 PM, lrl <liurl1221 at gmail.com> wrote:
You'll note that 429497 ^2 - .Machine$integer.max is still a very
large positive number.
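
To make the arithmetic concrete, here is a short sketch of the element count and the memory a dense double-precision matrix of that size would need. The variable names are only illustrative; the 8 bytes per element is the size of an R double.

```r
n <- 429497                      # numeric (double), so n^2 does not overflow

n_elements <- n^2                # total number of cells in an n x n matrix
n_elements                       # about 1.8e11
n_elements - .Machine$integer.max   # still a very large positive number

# Memory for a dense double matrix, 8 bytes per element, in TiB
bytes_needed <- n_elements * 8
bytes_needed / 2^40              # roughly 1.3 TiB, far more than 33G of free memory
```

Even if the element-count limit were lifted, the matrix would not fit in the memory described in the original post.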

Is your matrix perhaps sparse and you don't actually have to store
quite so many values?
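
If the matrix did turn out to be sparse, a minimal sketch with the Matrix package (a recommended package shipped with R) might look like the following. The three non-zero entries are made-up values purely for illustration.

```r
library(Matrix)

n <- 429497L

# Hypothetical non-zero entries; only these are stored, not all n^2 cells.
# With symmetric = TRUE the entries must all lie in one triangle (here i <= j).
S <- sparseMatrix(i = c(1L, 5L, n),
                  j = c(2L, 5L, n),
                  x = c(0.3, 1, 2),
                  dims = c(n, n),
                  symmetric = TRUE)

dim(S)                              # 429497 429497
S[2, 1]                             # mirrored from the (1, 2) entry
print(object.size(S), units = "MB") # a few MB rather than more than a TiB
```

The storage cost grows with the number of non-zero entries (plus one column pointer per column), so this only helps when most cells really are zero.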

Michael
lrl
#
Should it be .Machine$integer.max - 429497^2?

The matrix is not a sparse matrix, but a symmetric matrix. 
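
As a rough check on how much symmetry helps: storing only one triangle plus the diagonal needs n(n + 1)/2 values, which roughly halves the count but still exceeds .Machine$integer.max. A short sketch (variable names are illustrative):

```r
n <- 429497

# Number of values in the upper triangle including the diagonal
packed_elements <- n * (n + 1) / 2
packed_elements                          # about 9.2e10

packed_elements > .Machine$integer.max   # TRUE: still too many for one vector

# Memory at 8 bytes per double, in GiB
packed_elements * 8 / 2^30               # several hundred GiB
```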

It's a process of feature selection. I am choosing the most important
variables from 429497 variables. Now I am considering dividing the whole
dataset into several parts and processing a small part first.
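
One way the "process it in parts" idea could look is sketched below. This is only an illustration under loud assumptions: the thread does not say what the symmetric matrix contains, so the sketch assumes it is a correlation matrix of the columns of a data matrix X, and the per-variable summary (largest absolute correlation with any other variable) is a hypothetical stand-in for whatever the selection procedure actually needs. The data are simulated and tiny.

```r
# Hypothetical example data: 50 observations of 200 variables
set.seed(1)
X <- matrix(rnorm(50 * 200), nrow = 50)

block_size <- 40                      # assumed block width; tune to available memory
p <- ncol(X)
starts <- seq(1, p, by = block_size)

# Per-variable summary kept in memory instead of the full p x p matrix
max_abs_cor <- numeric(p)

for (i in starts) {
  rows <- i:min(i + block_size - 1, p)
  # Symmetry: only blocks on or above the block diagonal are computed
  for (j in starts[starts >= i]) {
    cols <- j:min(j + block_size - 1, p)
    block <- abs(cor(X[, rows, drop = FALSE], X[, cols, drop = FALSE]))
    if (i == j) diag(block) <- 0      # ignore each variable's correlation with itself
    max_abs_cor[rows] <- pmax(max_abs_cor[rows], apply(block, 1, max))
    max_abs_cor[cols] <- pmax(max_abs_cor[cols], apply(block, 2, max))
  }
}

head(order(max_abs_cor, decreasing = TRUE))   # variables with the strongest pairwise correlation
```

Only one block is held in memory at a time, so the peak memory depends on block_size rather than on p^2. The number of blocks still grows quadratically with p, so this trades memory for run time.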



#
On Tue, Oct 16, 2012 at 10:56 PM, lrl <liurl1221 at gmail.com> wrote:
Six of one, half a dozen of the other. Your matrix is still absolutely
massive: the development version of R does allow for larger matrices, so
I think you can store a 429497 by 429497 matrix in memory then, but it won't
be hugely performant.
The use of a training set is generally good practice, but I'm terribly
skeptical of any automated feature selection from a pool of ~400k
selectors. To wit,

library(fortunes)
fortune("fancy random")

Cheers,
Michael