Memory problem on a linux cluster using a large data set [Broadcast]
3 messages · Iris Kolder, Thomas Lumley, Martin Morgan
On Thu, 21 Dec 2006, Iris Kolder wrote:
Thank you all for your help! Following your suggestions, we will try to run it on a computer with a 64-bit processor. But I've been told that the new R versions all work on a 32-bit processor. I read in other posts that only the old R versions were capable of handling larger data sets and ran on 64-bit processors. I also read that the new R version is being adapted for 64-bit processors again, so does anyone know if there is a version available that we could use?
Huh? R 2.4.x runs perfectly happily accessing large memory under Linux on 64-bit processors (and Solaris, and probably others). I think it even works on Mac OS X now. For example:
> x <- rnorm(1e9)
> gc()
             used   (Mb) gc trigger   (Mb)   max used   (Mb)
Ncells     222881   12.0     467875   25.0     350000   18.7
Vcells 1000115046 7630.3 1000475743 7633.1 1000115558 7630.3
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle
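As a rough cross-check of the gc() output above, the expected size of a 1e9-element numeric vector can be worked out by hand. This is only a sketch: it assumes 8 bytes per double and that the Mb columns are in units of 2^20 bytes, and the variable names are illustrative rather than taken from the original post.

```r
# Expected payload of a numeric (double) vector: 8 bytes per element.
n <- 1e9
bytes <- n * 8
mb <- bytes / 2^20          # assumption: gc() Mb columns use 2^20 bytes
print(round(mb, 1))         # 7629.4

# The same rule checked on a vector small enough for any machine.
y <- numeric(1e6)
y_bytes <- as.numeric(object.size(y))
print(y_bytes)              # 8e6 bytes of data plus a small object header
```

The hand-computed figure is close to the 7630.3 Mb reported for Vcells; the remainder is the other vector data already held by the session.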
Section 8 of the Installation and Administration guide says that on 64-bit architectures the 'size of a block of memory allocated is limited to 2^32-1 (8 GB) bytes'. The wording 'a block of memory' is important here, because it sets a limit on a single allocation rather than on the total memory consumed by an R session. The original poster's allocation was something like 300,000 SNPs x 1000 individuals x 8 bytes (depending on representation, I guess) = about 2.4 GB (roughly 2.2 GiB), so there is still some room for even larger data. Obviously it's important to think carefully about how the statistical analysis of such a large volume of data will proceed, and how it will be interpreted.
Martin
Martin T. Morgan Bioconductor / Computational Biology http://bioconductor.org
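The size estimate in the message above depends on how each genotype is stored. A minimal sketch of that arithmetic follows, assuming the quoted dimensions (300,000 SNPs by 1000 individuals) and R's usual element sizes of 8 bytes for double, 4 for integer and 1 for raw. The variable names are hypothetical, and the figures cover the data payload only, not any copies made during analysis.

```r
# Dimensions quoted in the thread.
n_snps <- 300000
n_individuals <- 1000
n_elements <- n_snps * n_individuals      # 3e8 elements

# Bytes per element for three possible storage modes.
bytes_per_element <- c(double = 8, integer = 4, raw = 1)

size_bytes <- n_elements * bytes_per_element
size_gib <- size_bytes / 2^30             # convert bytes to GiB

print(round(size_gib, 2))
```

Storing genotypes as integer or raw instead of double would shrink the single large allocation by a factor of two or eight respectively.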