Large loops in R

11 messages · R. Michael Weylandt, Peter Langfelder, arun +3 more

#
Dear all,

I need to access data from a large matrix (48000 x 48000), and to do so
I am trying to run two nested "for" loops. Unsurprisingly, it has been a
very slow job.

I heard that "for" is not the best option to perform large loops in R,
but I don't really know what would be the best (fast) option. sapply?
vapply? Could anyone help me with this issue, please?

Thank you very much for your attention and for any help!

Best regards,

Charles
On Dec 4, 2012, at 6:47 PM, Charles Novaes de Santana <charles.santana at gmail.com> wrote:

            
What exactly are you trying to do? It's likely doable with a few vectorized operations. 

Michael
#
Dear Michael,

Thank you for your answer.

I have 2 matrices. Each position of the matrices is a weight. And I
need to calculate the following sum of differences:

Considering:
mat1 and mat2 - two matrices (each of them 48000 x 48000).
d1 and d2 - two constant values.

sum<-0;
for(i in 1:nrows1){
    for(j in 1:nrows2){
        sum<-sum+ ( ( (mat1[i,j]/d1) - (mat2[i,j]/d2) )^2 )
    }
}

I was wondering if there is a better way to do this sum.

Thank you for your attention!

Best,

Charles

On Tue, Dec 4, 2012 at 7:54 PM, R. Michael Weylandt
<michael.weylandt at gmail.com> wrote:

  
    
#
Without a reproducible example it's hard to tell for certain, but what
about simply (assuming nrows2 is actually columns):

sum((mat1/d1 - mat2/d2)^2)

R is smart enough to understand elementwise manipulation of a matrix:
you shouldn't need a loop at all.

Sarah

On Tue, Dec 4, 2012 at 2:27 PM, Charles Novaes de Santana
<charles.santana at gmail.com> wrote:
--
Sarah Goslee
http://www.functionaldiversity.org
#
On Tue, Dec 4, 2012 at 11:27 AM, Charles Novaes de Santana
<charles.santana at gmail.com> wrote:
sum( (mat1/d1-mat2/d2)^2)

Correct me if I'm wrong though - aren't matrices of 48k times 48k
larger than what R can handle at present?
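
A rough check of the sizes involved, as a sketch only. It assumes double-precision entries (8 bytes each) and the 2^31 - 1 element cap on a single vector that applies to R releases without long-vector support (added in R 3.0.0). The variable names are illustrative.

```r
# Number of elements in a 48000 x 48000 matrix, computed in double
# precision so the product does not overflow R's integer type.
n <- 48000
n_elements <- as.numeric(n) * n

# Largest vector length in R versions without long-vector support.
max_len <- .Machine$integer.max   # 2^31 - 1

exceeds_limit <- n_elements > max_len

# Memory needed for one such matrix of doubles, in GiB.
size_gib <- n_elements * 8 / 1024^3

cat("elements:      ", format(n_elements, big.mark = ",", scientific = FALSE), "\n")
cat("vector limit:  ", format(max_len, big.mark = ","), "\n")
cat("exceeds limit: ", exceeds_limit, "\n")
cat("size (GiB):    ", round(size_gib, 1), "\n")
```

Under these assumptions the element count is above the cap, and a single matrix would need roughly 17 GiB before counting the second matrix or any temporaries created by the arithmetic.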

HTH

Peter
#
Hi,

I just wonder whether your code worked or not.
set.seed(8)
mat1<-matrix(sample(1:80,40,replace=TRUE),ncol=8)
set.seed(25)
mat2<-matrix(sample(1:160,40,replace=TRUE),ncol=8)
#Since the dimensions are the same:
#(d1 and d2 must already be defined; their values are not given in this message)
m<-1:5
n<-1:8
sum1<-0

for(i in 1:length(m)){
  for(j in 1:length(n)){
    sum1<-sum1+(((mat1[i,j]/d1)-(mat2[i,j]/d2))^2)
  }
}
sum1
#[1] 15192.89

#Sarah's code:
sum((mat1/d1 - mat2/d2)^2)
#[1] 15192.89
A.K.




#
On Tue, Dec 4, 2012 at 8:43 PM, Peter Langfelder
<peter.langfelder at gmail.com> wrote:
Hmm, I didn't know that R's limit was below this size. I
got this error message:

"Error in matrix(0, 48000, 48000) : too many elements specified"

but I thought it was a machine limitation (and I was asking for access
to a better machine in my labs...). Thanks for clarifying it.

Well, just as Sarah gave me the answer to my problem, I got a new one :)
Thank you, Sarah and Peter.

So, is there any other way to "trick R" and allocate such large matrices?

Best,

Charles
#
Thank you, Sarah! That is wonderful news!!! :)

Now I need to solve the other question, hehe: how to allocate such a large matrix :)

best,

Charles
On Tue, Dec 4, 2012 at 8:39 PM, Sarah Goslee <sarah.goslee at gmail.com> wrote:

  
    
#
On Tue, Dec 4, 2012 at 8:14 PM, Charles Novaes de Santana
<charles.santana at gmail.com> wrote:
Either

1) Use the development version of R which has large-vector support
(matrices are vectors under the hood)

or

2) A large matrix is usually sparse in structure: use one of the many
sparse representation packages (e.g., Matrix) available.
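
A minimal sketch of the sparse option, assuming the weight matrices really are mostly zero (the thread does not say whether they are). The size `n`, the number of non-zero entries `nnz`, the helper `make_sparse`, and the values of `d1` and `d2` are all made up for illustration; the size is kept small so the dense cross-check fits in memory.

```r
library(Matrix)  # recommended package shipped with standard R installations

set.seed(1)
n   <- 2000   # hypothetical dimension, far smaller than 48000
nnz <- 500    # hypothetical number of non-zero weights per matrix

# Build a sparse n x n matrix from (row, column, value) triplets.
# Hypothetical helper, not part of the Matrix package.
make_sparse <- function(n, nnz) {
  sparseMatrix(i = sample(n, nnz, replace = TRUE),
               j = sample(n, nnz, replace = TRUE),
               x = runif(nnz),
               dims = c(n, n))
}

mat1 <- make_sparse(n, nnz)
mat2 <- make_sparse(n, nnz)
d1 <- 2   # illustrative constants
d2 <- 3

# Same expression as in the dense case; only non-zero entries are stored.
sparse_result <- sum((mat1 / d1 - mat2 / d2)^2)

# Dense cross-check, feasible only because this example is small.
dense_result <- sum((as.matrix(mat1) / d1 - as.matrix(mat2) / d2)^2)
print(all.equal(sparse_result, dense_result))
```

Dividing by a non-zero constant, subtracting, and squaring all map zero to zero, so the intermediate results can stay sparse as well.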

MW
#
I don't think there's any reason for the calculation you're doing that
you must have the whole matrix in memory, is there?

Unless there's something more than what you've shown us, you're just
taking the sum of elementwise operations. You can read the matrix in
manageable chunks, take the sum of each chunk, and save that single
value. Repeat, then add them all up at the end.
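
The chunked approach could look something like the sketch below. `chunked_sum`, `get_block1` and `get_block2` are hypothetical names: the two loaders stand in for whatever code reads a block of rows from disk, and here they simply index small in-memory matrices so the result can be checked against the one-line version.

```r
# Sum of squared scaled differences, computed one block of rows at a time.
# get_block1 / get_block2 are hypothetical loader functions: given a vector
# of row indices, each returns those rows as an ordinary matrix.
chunked_sum <- function(get_block1, get_block2, n_rows, chunk_size, d1, d2) {
  starts  <- seq(1, n_rows, by = chunk_size)
  partial <- numeric(length(starts))   # one saved value per chunk
  for (k in seq_along(starts)) {
    rows <- starts[k]:min(starts[k] + chunk_size - 1, n_rows)
    partial[k] <- sum((get_block1(rows) / d1 - get_block2(rows) / d2)^2)
  }
  sum(partial)                         # add the chunk sums up at the end
}

# Small in-memory stand-ins for data that would really live on disk.
set.seed(42)
mat1 <- matrix(runif(50 * 8), nrow = 50)
mat2 <- matrix(runif(50 * 8), nrow = 50)
d1 <- 2   # illustrative constants
d2 <- 3

result <- chunked_sum(function(r) mat1[r, , drop = FALSE],
                      function(r) mat2[r, , drop = FALSE],
                      n_rows = nrow(mat1), chunk_size = 7,
                      d1 = d1, d2 = d2)

# Compare against the fully vectorized expression.
print(all.equal(result, sum((mat1 / d1 - mat2 / d2)^2)))
```

Only one block from each matrix is in memory at a time, so the chunk size can be chosen to fit the machine rather than the data.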

Sarah

On Tue, Dec 4, 2012 at 3:14 PM, Charles Novaes de Santana
<charles.santana at gmail.com> wrote:
--
Sarah Goslee
http://www.functionaldiversity.org
1 day later
#
Thank you all for your help!

I used the function "big.matrix", from a package named "bigmemory", to
allocate a large matrix 50k x 50k
(http://www.inside-r.org/packages/cran/bigmemory/docs/bigmemory).

And I followed Sarah's suggestions to do the calculations quickly!

Thank you very much!

Best!

Charles
On Tue, Dec 4, 2012 at 9:33 PM, Sarah Goslee <sarah.goslee at gmail.com> wrote: