[ExternalEmail] Pearson Correlation Speed

On Tue, 16 Dec 2008, Nathan S. Watson-Haigh wrote:

            
The original object is
[1] 2.567358
Gigabytes, and so is the result. I added them together.

Because nobody ever really needed it?

Seriously, optimizing something like this is machine-dependent, and R-core 
probably has higher priorities.

cor() provides lots of options - it handles NAs, for example - and it is 
probably not worth the trouble to try to optimize over all of them. The 
calculation sans NAs is a simple one and can be done using the built-in 
BLAS (as crossprod() does), and the BLAS can in turn be tuned to the machine 
used. So, if your environment has a tuned or multithreaded BLAS, you might 
be better off using crossprod() and scaling the result.

Well, in that case the path of least resistance is to start the process 
when you leave for the night and pick up the results the next morning.
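
For what it's worth, the crossprod() route mentioned above might look 
roughly like the sketch below. It assumes a numeric matrix with no NAs and 
no constant columns. The function name cor_crossprod is made up for 
illustration, and I have used cov2cor() from stats for the scaling step, 
which is one way to do it rather than the only one.

```r
# Sketch: Pearson correlation between the columns of x via crossprod().
# Assumes x is a numeric matrix with no NAs and no constant columns.
cor_crossprod <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x), !anyNA(x))
  # Center each column (sweep() subtracts by default).
  xc <- sweep(x, 2, colMeans(x))
  # t(xc) %*% xc, computed by the BLAS: the matrix of centered
  # sums of squares and cross-products.
  cp <- crossprod(xc)
  # Scale the result to unit diagonal, which yields the correlations.
  cov2cor(cp)
}

# Small check against cor() on simulated data.
set.seed(1)
x <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
all.equal(cor_crossprod(x), cor(x))
```

Whether this beats cor() depends on the BLAS R is linked against, so time 
it on a subset of the columns before committing to the full run. Note that 
the centered copy xc needs as much memory as x itself.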


HTH,

Chuck

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901