I can't understand why people still send things like this to R-core...

------- start of forwarded message -------
From: Tineke Casneuf <ticas at psb.ugent.be>
Sender: r-core-bounces at stat.math.ethz.ch
To: R-core at r-project.org
Subject: help(Memory)
Date: Wed, 04 Feb 2004 13:39:32 +0100

Dear,

I am trying to find an appropriate package to analyse gene expression data from DNA microarray experiments. My data are already normalized, so for the clustering of my data I used the 'mva' package. All I actually need is to calculate Euclidean, Manhattan, ... distances and various kinds of correlation coefficients. I am an R beginner, and to me it's not clear which package I should use (there are so many of them!!). I have looked at the Bioconductor website, but it looks as if those packages are meant for fancier analyses of smaller datasets (hundreds of genes), such as ANOVA, identification of differentially expressed genes, ... All I want is to calculate distances and correlation coefficients for all the genes on the microarray (up to 22 000 genes).

I have already tried to do some calculations with the mva package, but the process kills itself and returns a warning: 'heap memory exhausted'. So I read in the manual how to increase the heap memory: I put it up to --vsize=2000M, but R still reports the same error (it needed 83 Kb or so more). I have tried to increase the heap memory to 2200M, but R won't let me do it (too large and ignored). I used a dataset of 7 000 rows. The commands I used are:
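# Read the gene names, then the signal table with those names as row labels
genenames <- scan("list_genes", what = "list")
data <- read.table(file = "list_signals", row.names = genenames)
library(mva)
# Pairwise Euclidean distances between rows, expanded to a full square matrix
matrix <- as.matrix(dist(data, method = "euclidean", diag = TRUE))
write.table(matrix, file = "euclidmartix")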
So here's my problem: maybe I can't use R (or this package) for datasets this big (it needs to calculate a 7000 by 7000 matrix), or there's something wrong with my commands, since R is given 2 GB and it still crashes. Is there maybe a better package for me to use? Or is this amount of heap memory not unusual for a dataset this big, and do I need to add more? Can somebody please help me with this?

Thanks in advance,
T.Casneuf

--
==================================================================
Tineke Casneuf                          Tel: 32 (0)9 3313692
DEPARTMENT OF PLANT SYSTEMS BIOLOGY     Fax: 32 (0)9 3313809
GHENT UNIVERSITY/VIB, Technology Park 927, B-9052 Gent, Belgium
Vlaams Interuniversitair Instituut voor Biotechnologie VIB
e-mail: ticas at psb.ugent.be
http://www.psb.ugent.be/bioinformatics/
==================================================================
------- end of forwarded message -------
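
For what it's worth, the memory question in the message can be put in rough numbers. The following is only a back-of-the-envelope sketch: it assumes 8-byte doubles and counts just the stored values, ignoring object overhead and any temporary copies. The helper name dist_memory_mb is made up for illustration and is not part of any package.

```r
# Hypothetical helper: approximate storage (in MB) for the distances
# between n rows, assuming 8 bytes per double and no overhead.
dist_memory_mb <- function(n) {
  bytes_per_double <- 8
  # dist() keeps only the lower triangle: n * (n - 1) / 2 values
  dist_mb <- n * (n - 1) / 2 * bytes_per_double / 2^20
  # as.matrix() expands that to a full n x n matrix
  full_mb <- n^2 * bytes_per_double / 2^20
  c(dist = dist_mb, full = full_mb)
}

round(dist_memory_mb(7000))
round(dist_memory_mb(22000))
```

For n = 7000 this comes to roughly 187 MB for the dist object and 374 MB for the full matrix; for n = 22000 it is about 1.8 GB and 3.6 GB. These figures cover one copy of each object only. Steps such as as.matrix() and write.table() may create further temporary copies, so the real peak usage can be several times higher.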