Memory problem on a linux cluster using a large data set [Broadcast]

An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/r-help/attachments/20061221/7b8ab520/attachment.pl

Thank you all for your help!

So with all your suggestions we will try to run it on a computer with a 
64 bits proccesor. But i've been told that the new R versions all work 
on a 32bits processor. I read in other posts that only the old R 
versions were capable of larger data sets and were running under 64 bit 
proccesors. I also read that they are adapting the new R version for 64 
bits proccesors again so does anyone now if there is a version available 
that we could use?
Huh?  R 2.4.x runs perfectly happily accessing large memory under Linux on 
64bit processors (and Solaris, and probably others). I think it even works 
on Mac OS X now.

For example:
x<-rnorm(1e9)
gc()
used   (Mb) gc trigger   (Mb)   max used   (Mb)
Ncells     222881   12.0     467875   25.0     350000   18.7
Vcells 1000115046 7630.3 1000475743 7633.1 1000115558 7630.3

         -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
Section 8 of the Installation and Administration guide says that on
64-bit architectures the 'size of a block of memory allocated is
limited to 2^32-1 (8 GB) bytes'.

The wording 'a block of memory' here is important, because this sets a
limit on a single allocation rather than the memory consumed by an R
session. The size of the allocation of the original poster was
something like 300,000 SNPs x 1000 individuals x 8 bytes (depending on
representation, I guess) = about 2.3 GB so there is still some room
for even larger data.

Obviously it's important to think carefully about how the statistical
analysis of such a large volume of data will proceed, and be
interpreted.

Martin

Thomas Lumley <tlumley at u.washington.edu> writes:
On Thu, 21 Dec 2006, Iris Kolder wrote:

Thank you all for your help!

So with all your suggestions we will try to run it on a computer with a 
64 bits proccesor. But i've been told that the new R versions all work 
on a 32bits processor. I read in other posts that only the old R 
versions were capable of larger data sets and were running under 64 bit 
proccesors. I also read that they are adapting the new R version for 64 
bits proccesors again so does anyone now if there is a version available 
that we could use?
Huh?  R 2.4.x runs perfectly happily accessing large memory under Linux on 
64bit processors (and Solaris, and probably others). I think it even works 
on Mac OS X now.

For example:
x<-rnorm(1e9)
gc()
              used   (Mb) gc trigger   (Mb)   max used   (Mb)
Ncells     222881   12.0     467875   25.0     350000   18.7
Vcells 1000115046 7630.3 1000475743 7633.1 1000115558 7630.3

         -thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Martin T. Morgan
Bioconductor / Computational Biology
http://bioconductor.org