Hello,
I'm (eventually) attempting a singular value decomposition of a 3200 x
527829 matrix in R version 2.10.1. The script is as follows:
###---------Begin Script here-------###
library(Matrix)
snps <- 527829 ## Number of SNPs
N <- 3200 ## Sample size
y <- rnorm(N, 100,1) ## simulated phenotype
system.time(
## read in matrix 3200 x 527829
x <- scan("gedi7.raw", what=rep(0,snps), nmax=N*snps, skip=1))
system.time(x <- matrix(x,nrow=N,ncol=snps, byrow=TRUE))
print(object.size(x), units="Mb")
###--------End Script----------------####
The scan function finishes without a problem. "x" is in double precision
floating point format and takes up 12886.5Mb of memory at the first
object.size() statement.
When I convert it to a matrix I get an error stating that I cannot allocate
a vector of size 12.6Gb. I have requested 31Gb of memory on the server.
12.6 + 12.8 = 25.4Gb of used memory. Is R using considerable
memory for operations not directly related to storing the matrix objects
here? Or is this perhaps a problem of contiguous memory?
Any help is greatly appreciated.
-Scott
--
View this message in context: http://r.789695.n4.nabble.com/Working-with-massive-matrices-in-R-tp3458561p3458561.html
Sent from the R help mailing list archive at Nabble.com.
Working with massive matrices in R
3 messages · svrieze, jim holtman, ancienthart
It is probably a contiguous-memory problem. I always suggest having 3-4 times as much memory as your largest object, so that there is room for any copies that might be made. So request about 50GB of memory.
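To make the copy concrete, here is a minimal sketch of the size arithmetic and of one way to sidestep the second allocation. The small dimensions (N = 4, snps = 6) and the names m, xt, and same are made up for illustration. matrix(x, byrow=TRUE) has to build a second, rearranged object, whereas assigning dim() to the scanned vector only reshapes it, giving the transpose (snps x N). Whether R avoids a copy entirely on that assignment depends on the R version, so treat that part as an assumption to verify with gc() on your build.

```r
## Size of the full matrix in doubles (8 bytes each), in Gb (2^30 bytes)
full_gb <- 3200 * 527829 * 8 / 2^30   # roughly 12.6

## Small stand-in for the real data (dimensions are illustrative only)
N <- 4
snps <- 6
x <- as.numeric(seq_len(N * snps))    # values in the row-wise order scan() returns

## Original approach: builds a second, rearranged object
m <- matrix(x, nrow = N, ncol = snps, byrow = TRUE)

## Alternative: reshape without reordering; the result is the transpose, snps x N
xt <- x
dim(xt) <- c(snps, N)

## Same data, just transposed
same <- identical(t(xt), m)
```

For the SVD this transposition is harmless: the transpose has the same singular values, with the roles of the left and right singular vectors swapped.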
On Mon, Apr 18, 2011 at 4:10 PM, svrieze <vrie0006 at umn.edu> wrote:
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Are your matrices sparse? This package (or one of its reverse dependencies/suggests) may help keep the memory down.
http://cran.r-project.org/web/packages/SparseM/
Joal Heagney
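Whether the SNP matrix is actually sparse is not stated in the thread, so the following is only a sketch of how to check what a sparse representation would save. It uses the Matrix package (already loaded in the original script) rather than SparseM, and the toy data, its dimensions, and the names dense, sp, and frac_nonzero are all made up for illustration.

```r
library(Matrix)

## Toy matrix, mostly zeros (made-up data: 1000 x 200 with 1% non-zero entries)
set.seed(1)
dense <- matrix(0, nrow = 1000, ncol = 200)
dense[sample(length(dense), 2000)] <- 1

## Convert to a compressed sparse representation
sp <- Matrix(dense, sparse = TRUE)

## Compare storage, and check how sparse the data really is
dense_bytes  <- as.numeric(object.size(dense))
sparse_bytes <- as.numeric(object.size(sp))
frac_nonzero <- mean(dense != 0)
```

Running the same comparison on a subset of the real data (a few hundred rows, say) would show whether the fraction of non-zero entries is low enough for a sparse format to pay off.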