Matrix max by row

9 messages · Ana M Aparicio Carrasco, jim holtman, Rolf Turner +3 more

#
Ana M Aparicio Carrasco wrote:
matrix(apply(m, 1, max), nrow(m))

vQ
#
?apply
[1] 5 8 8
On Sat, Mar 28, 2009 at 7:54 PM, Ana M Aparicio Carrasco
<ana.aparicio at upr.edu> wrote:

  
    
1 day later
#
If speed is a consideration, availing yourself of the built-in pmax()
function via

do.call(pmax,data.frame(yourMatrix)) 

will be considerably faster for large matrices.

If you are puzzled by why this works, it is a useful exercise in R to figure
it out. 

Hint: The man page for ?data.frame says:
"A data frame is a list of variables of the same length with unique row
names, given class 'data.frame'."
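
The hint can be unpacked with a small sketch. The 3 x 3 matrix below is made up for illustration and is not taken from the thread. Because a data frame is a list of its columns, do.call() hands each column to pmax() as a separate argument, and pmax() then returns the element-wise maximum across those arguments, which is the maximum of each row:

```r
# Hypothetical example matrix, not from the thread
m <- matrix(c(1, 5, 3,
              8, 2, 4,
              6, 8, 7), nrow = 3, byrow = TRUE)

d <- data.frame(m)   # a list of three columns named X1, X2, X3
is.list(d)           # a data frame is a list ...
length(d)            # ... with one element per column

# do.call(pmax, d) builds the call pmax(X1 = d$X1, X2 = d$X2, X3 = d$X3)
via_pmax  <- do.call(pmax, d)
via_apply <- apply(m, 1, max)

via_pmax                         # 5 8 8
all.equal(via_pmax, via_apply)   # both approaches agree
```

Note that the data.frame() conversion is itself part of the cost of this approach, which matters for the timings discussed later in the thread.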

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Statistics

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Wacek Kusnierczyk
Sent: Saturday, March 28, 2009 5:22 PM
To: Ana M Aparicio Carrasco
Cc: r-help at r-project.org
Subject: Re: [R] Matrix max by row
Ana M Aparicio Carrasco wrote:
matrix(apply(m, 1, max), nrow(m))

vQ

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
I tried the following:

m <- matrix(runif(100000),1000,100)
junk <- gc()
print(system.time(for(i in 1:100) X1 <- do.call(pmax,data.frame(m))))
junk <- gc()
print(system.time(for(i in 1:100) X2 <- apply(m,1,max)))

and got

    user  system elapsed
   2.704   0.110   2.819
    user  system elapsed
   1.938   0.098   2.040

so unless there's something that I am misunderstanding (always a serious
consideration) Wacek's apply method looks to be about 1.4 times *faster*
than the do.call/pmax method.

	cheers,

		Rolf Turner
On 30/03/2009, at 3:55 PM, Bert Gunter wrote:

            
#
It seems to be very system dependent.  Here's another take:
   user  system elapsed
   1.53    0.01    1.57
   user  system elapsed
   1.81    0.00    1.83
Now what happens if you work with data frames rather than matrices:
   user  system elapsed
   0.31    0.00    0.31
   user  system elapsed
   3.22    0.03    3.34
Go figure!  
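
The code behind these timings did not survive in the archive. A sketch of the kind of comparison described might look like the following. It assumes the same 1000 x 100 setup as Rolf's test, and it assumes that "working with data frames" means converting once up front; both are guesses, not Bill's actual code. The object names are made up for illustration.

```r
m <- matrix(runif(100000), 1000, 100)
d <- data.frame(m)   # assumption: convert once, outside the timed loops

# Matrix input: the data.frame() conversion is paid on every pmax call
t_pmax_m  <- system.time(for (i in 1:100) X1 <- do.call(pmax, data.frame(m)))
t_apply_m <- system.time(for (i in 1:100) X2 <- apply(m, 1, max))

# Data frame input: pmax gets its list directly, whereas apply() first
# coerces the data frame to a matrix with as.matrix()
t_pmax_d  <- system.time(for (i in 1:100) X3 <- do.call(pmax, d))
t_apply_d <- system.time(for (i in 1:100) X4 <- apply(d, 1, max))

# Elapsed seconds for the four cases; values will vary by system
c(pmax_matrix  = t_pmax_m[["elapsed"]],
  apply_matrix = t_apply_m[["elapsed"]],
  pmax_df      = t_pmax_d[["elapsed"]],
  apply_df     = t_apply_d[["elapsed"]])
```

If the data really do start out as a data frame, the conversion cost shifts from the pmax route to the apply route, which is one plausible reason the ranking flips.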


Bill Venables
http://www.cmis.csiro.au/bill.venables/ 


-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Rolf Turner
Sent: Monday, 30 March 2009 1:39 PM
To: Bert Gunter
Cc: 'Wacek Kusnierczyk'; r-help at r-project.org
Subject: Re: [R] Matrix max by row

#
Rolf Turner wrote:
hmm, since i was called by name (i'm grateful, rolf), i feel obliged to
check the matters myself:

    # dummy data, presumably a 'large matrix'?
    n = 5e3
    m = matrix(rnorm(n^2), n, n)

    # what is to be benchmarked...
    waku = expression(matrix(apply(m, 1, max), nrow(m)))
    bert = expression(do.call(pmax,data.frame(m)))

    # to be benchmarked
    library(rbenchmark)
    benchmark(replications=10, order='elapsed', columns=c('test', 'elapsed'),
       waku=matrix(apply(m, 1, max), nrow(m)),
       bert=do.call(pmax,data.frame(m)))

takes quite a while, but here you go:

    #   test elapsed
    # 1 waku  11.838
    # 2 bert  20.833

where bert's solution seems to require a wonder to 'be considerably
faster for large matrices'.  to have it fair, i also did

    # to be benchmarked
    library(rbenchmark)
    benchmark(replications=10, order='elapsed', columns=c('test', 'elapsed'),
       bert=do.call(pmax,data.frame(m)),
       waku=matrix(apply(m, 1, max), nrow(m)))

    #  test elapsed
    # 2 waku  11.695
    # 1 bert  20.912
   
take home point: a good product sells itself, a bad product may not sell
despite aggressive marketing.

rolf, thanks for pointing this out.

cheers,
vQ
#
Serves me right, I suppose. Timing seems also very dependent on the
dimensions of the matrix. Here's what I got with my inadequate test:
## via apply
   user  system elapsed
   2.09    0.02    2.10

## via pmax
   user  system elapsed
   0.10    0.02    0.11
Draw your own conclusions!

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
650-467-7374

-----Original Message-----
From: Wacek Kusnierczyk [mailto:Waclaw.Marcin.Kusnierczyk at idi.ntnu.no] 
Sent: Monday, March 30, 2009 2:33 AM
To: Rolf Turner
Cc: Bert Gunter; 'Ana M Aparicio Carrasco'; r-help at r-project.org
Subject: Re: [R] Matrix max by row
#
Bert Gunter wrote:
yes, similar to what i got.  but with the transpose, the ratio is way
more than inverted:

    waku = expression(matrix(apply(m, 1, max), nrow(m)))
    bert = expression(do.call(pmax, data.frame(m)))

    library(rbenchmark)

    m = matrix(rnorm(1e6), ncol=10)
    benchmark(replications=10, columns=c('test', 'elapsed'), order='elapsed',
       waku=waku,
       bert=bert)
    #   test elapsed
    # 2 bert   1.633
    # 1 waku   9.974

    m = t(m)
    benchmark(replications=10, columns=c('test', 'elapsed'), order='elapsed',
       waku=waku,
       bert=bert)
    #   test elapsed
    # 1 waku   0.507
    # 2 bert  27.261
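
The shape dependence can also be checked without rbenchmark. The sketch below uses only base R. The helper name time_both and the matrix sizes are made up for illustration, and the sizes are smaller than those above so that it runs quickly. A plausible reading of the flip is that apply() does one R-level call to max() per row, while the pmax route does one vectorised pass per column plus a data.frame() conversion whose cost grows with the number of columns.

```r
# Hypothetical helper: check that both approaches agree on m, then
# return elapsed seconds for `reps` repetitions of each
time_both <- function(m, reps = 10) {
  stopifnot(isTRUE(all.equal(do.call(pmax, data.frame(m)),
                             apply(m, 1, max))))
  c(apply = system.time(
              for (i in seq_len(reps)) apply(m, 1, max)
            )[["elapsed"]],
    pmax  = system.time(
              for (i in seq_len(reps)) do.call(pmax, data.frame(m))
            )[["elapsed"]])
}

set.seed(1)
tall <- matrix(rnorm(1e5), ncol = 10)   # 10000 rows, 10 columns
wide <- t(tall)                         # 10 rows, 10000 columns

# One row of timings per shape; values will vary by system
timings <- rbind(tall = time_both(tall), wide = time_both(wide))
timings
```
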
my favourite:  you should have specified what 'large matrices' means.

vQ