
can we manage memory usage to increase speed?

5 messages · Tuszynski, Jaroslaw W., Spencer Graves, Douglas Bates +2 more

#
If you have code that takes 2 weeks to run, then it might be a case of
inefficient algorithm design. I was able to go from overnight runs (SELDI
data analysis) to 20-minute runs by identifying the single inefficient
function that took most of the time and rewriting it in C.
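As an illustration only (not the actual SELDI code), moving a hot inner loop to C is usually done through R's .C interface; the `cumsq` function, file name, and workload here are all hypothetical:

```r
## Hypothetical example of calling a compiled C routine from R via .C.
## C source (cumsq.c), compiled with: R CMD SHLIB cumsq.c
##   void cumsq(double *x, int *n, double *out) {
##       double s = 0.0;
##       for (int i = 0; i < *n; i++) { s += x[i] * x[i]; out[i] = s; }
##   }
dyn.load("cumsq.so")                      # "cumsq.dll" on Windows
x <- rnorm(1e6)
res <- .C("cumsq", as.double(x), as.integer(length(x)),
          out = double(length(x)))$out    # running sum of squares
```

The same loop written in pure R would allocate and branch on every iteration; in C it compiles to a tight loop, which is where order-of-magnitude speedups like the one above typically come from.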

Jarek
============================================================

 Jarek Tuszynski, PhD.                           o / \ 
 Science Applications International Corporation  <\__,|  
 (703) 676-4192                                   ">   \
 Jaroslaw.W.Tuszynski at saic.com                     `    \


-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Zhilin Liu
Sent: Monday, August 01, 2005 8:28 PM
To: r-help at stat.math.ethz.ch
Subject: [R] can we manage memory usage to increase speed?

Hi,
 
Thanks for reading.
 
I am running a process in R for microarray data analysis on Red Hat
Enterprise Linux 4 with dual AMD CPUs and 6G of memory. However, the R
process uses only <200M of memory in total, and combined CPU usage is only
~110% across the two CPUs. The program takes at least 2 weeks to run at the
current speed. Is there some way we can increase the usage of CPUs and
memory and speed it up? Any suggestion is appreciated.
 
Thanks again.
 
Zhilin 


______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
#
And you can identify inefficient code fairly easily by taking snapshots 
with "proc.time" and computing the elapsed time for sections of your code.
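A minimal sketch of that idea (the matrix workload is an arbitrary stand-in for a real section of code):

```r
## Time a section of code by snapshotting proc.time() before and after.
t0 <- proc.time()
x <- replicate(100, solve(matrix(rnorm(100 * 100), 100)))  # example workload
cat("section took", (proc.time() - t0)["elapsed"], "seconds\n")
## system.time(expr) wraps the same snapshot-and-subtract pattern in one call.
```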

	  spencer graves
Tuszynski, Jaroslaw W. wrote:

#
On 8/2/05, Spencer Graves <spencer.graves at pdf.com> wrote:
Using Rprof may be a better choice.  See

?Rprof
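In outline (the svd loop is an arbitrary stand-in for the real computation):

```r
## Profile a block of code with Rprof(), then summarize time per function.
Rprof("profile.out")                  # start writing samples to a file
for (i in 1:20) svd(matrix(rnorm(200 * 200), 200))  # example workload
Rprof(NULL)                           # stop profiling
summaryRprof("profile.out")$by.self   # self time per function, sorted
```

summaryRprof() reads the sample file back in and is usually more convenient than running R CMD Rprof on it from the shell.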
#
On Tue, 2 Aug 2005, Spencer Graves wrote:

Or use the profiler, which makes it much easier. There was a Programmers' 
Niche article about it in one of the first issues of R News.

 	-thomas
Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
#
Hi, 
Thank you all for the kind reply.

I recompiled R, as the previous build had profiling turned
off. 

I am using package MAANOVA, running the matest
function which is a permutation test. The author did
warn that it takes a long time to run.

Here is one of the test results:
[ Rdata]# ./R CMD Rprof maanovatest.out

Each sample represents 0.02 seconds.
Total run time: 905.959999999466 seconds.

Total seconds: time spent in function and callees.
Self seconds: time spent in function alone.

   %       total       %       self
 total    seconds     self    seconds    name
100.00    905.94      0.00      0.00     "matest"
 80.18    726.40      0.25      2.30     "fitmaanova"
 79.37    719.04      0.19      1.72     "mixed"
 68.34    619.16      1.05      9.50     "pinv"
 64.33    582.78      0.88      7.96     "La.svd"
 55.51    502.90     55.51    502.90     ".Call"
 38.47    348.54      1.58     14.28     "makeHq"
 34.50    312.60      0.13      1.18     "solveMME"
 19.80    179.42      0.18      1.60     "matest.engine"
 10.19     92.30     10.19     92.30     "%*%"
......

The other parts are not pasted, as they are almost the
same every time we check the profiling; only the parts
above change. For example, here is another output:
100.00   1411.02      0.00      0.02     "matest"
 82.88   1169.40      0.24      3.40     "fitmaanova"
 82.22   1160.18      0.19      2.74     "mixed"
 68.82    971.02      1.06     14.90     "pinv"
 64.84    914.94      0.85     11.98     "La.svd"
 56.10    791.64     56.10    791.64     ".Call"
 39.13    552.10      1.55     21.84     "makeHq"
 36.77    518.82      0.13      1.88     "solveMME"
 31.40    443.04      0.00      0.00     "matest.perm"
 17.10    241.32      0.16      2.28     "matest.engine"
 10.15    143.24     10.15    143.24     "%*%"

I ran this with only 2 permutations and it is still
running, so it is not possible to run 1000
permutations at this kind of speed.
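Since the permutations are independent, one way to put both CPUs to work is to split them across processes. A hedged sketch using the parallel package shipped with later versions of R (at the time of this thread, the snow package offered similar cluster functions); run_one_perm is a hypothetical stand-in for a single matest permutation:

```r
library(parallel)  # bundled with later R versions; snow is the era-appropriate analogue

run_one_perm <- function(i) {
  ## hypothetical stand-in for one permutation of the real test statistic
  max(abs(colMeans(matrix(rnorm(1000), 100))))
}

## fork one worker per CPU; each worker handles a share of the permutations
stats <- mclapply(1:1000, run_one_perm, mc.cores = 2)
```

This only helps wall-clock time, of course; the per-permutation cost (dominated by .Call and La.svd in the profiles above) is unchanged.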

And here is the output of TOP for R:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15250 liuz       0 -20  218m 136m 2556 R 72.3  2.5  29:05.20 R

Any suggestion to improve the performance is highly
appreciated.

Thanks a lot.

Zhilin
--- Douglas Bates <dmbates at gmail.com> wrote:
