using lapply
To add to William's remarks, another advantage of the
apply family of functions is that they avoid growing an
object inside a loop, which is very inefficient in R.
In other words, without the *apply functions, users
might do something like this:
answer = NULL
for(i in 1:nrows)
    answer = rbind(answer,calculateanewrow(i))
or
answer = NULL
for(i in 1:n)
    answer = c(answer,newcalculation(i))
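
For comparison, here is a sketch of how the two loops above might look without growing `answer`. The post does not define calculateanewrow() or newcalculation(), so the versions below are hypothetical stand-ins, and nrows and n are set to small arbitrary values.

```r
# Hypothetical stand-ins; the post does not define these functions.
calculateanewrow <- function(i) c(i, i^2, i^3)   # returns one row of length 3
newcalculation   <- function(i) sqrt(i)          # returns a single number

nrows <- 5
n <- 5

# Row-building loop replaced by sapply(): each result becomes a column,
# so transpose to get one row per i.
answer_rows <- t(sapply(seq_len(nrows), calculateanewrow))

# Vector-building loop replaced by vapply(), which also checks that every
# result is a single numeric value.
answer_vec <- vapply(seq_len(n), newcalculation, FUN.VALUE = numeric(1))
```

In both cases the result is built in one call, so nothing is copied and re-extended on each iteration.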
So in addition to making your program easier to understand
(which is a huge advantage in and of itself), they
also help you avoid a programming paradigm that's
very inefficient in R:
> mat = matrix(abs(rnorm(10000)),1000,10)
> system.time({answer=NULL;for(i in 1:nrow(mat))answer = rbind(answer,log(mat[i,]))})
   user  system elapsed
  0.052   0.020   0.072
> system.time({answer1 = t(apply(mat,1,log))})
   user  system elapsed
  0.012   0.000   0.012
> all.equal(answer,answer1)
[1] TRUE

That's a speedup of a factor of 6, which gets even bigger
as the size of the object increases:
> mat = matrix(abs(rnorm(100000)),10000,10)
> system.time({answer=NULL;for(i in 1:nrow(mat))answer = rbind(answer,log(mat[i,]))})
   user  system elapsed
  5.960   1.524   7.505
> system.time({answer1 = t(apply(mat,1,log))})
   user  system elapsed
  0.120   0.004   0.123
> all.equal(answer,answer1)
[1] TRUE

Now it's a speedup of 60 -- essentially an O(n^2) algorithm
competing with an O(n) algorithm. The lack of scalability
of this paradigm often leads new users to believe that R
can't handle large problems. Learning to use the apply family
of functions from the start avoids this misconception.

                                       - Phil Spector
                                         Statistical Computing Facility
                                         Department of Statistics
                                         UC Berkeley
                                         spector at stat.berkeley.edu
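
The quadratic cost in the timings above comes from rbind() copying the whole accumulated matrix on every iteration, not from the for loop as such. As a sketch (not part of the original timings), a loop that allocates the result once and fills it in place does linear work and gives the same answer. The seed and the names grown, filled, applied and vectorized are illustrative choices, not from the post.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable
mat <- matrix(abs(rnorm(10000)), 1000, 10)

# Growing version from the post: every rbind() copies all earlier rows.
grown <- NULL
for (i in seq_len(nrow(mat))) grown <- rbind(grown, log(mat[i, ]))

# Preallocated version: allocate once, then assign each row in place.
filled <- matrix(NA_real_, nrow(mat), ncol(mat))
for (i in seq_len(nrow(mat))) filled[i, ] <- log(mat[i, ])

# apply() version from the post, and a single vectorized call to log().
applied <- t(apply(mat, 1, log))
vectorized <- log(mat)

# All four approaches should agree.
stopifnot(all.equal(grown, filled),
          all.equal(filled, applied),
          all.equal(applied, vectorized))
```

Wrapping each variant in system.time() on your own machine shows how the gap changes as the number of rows grows.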
On Thu, 10 Mar 2011, William Dunlap wrote:
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of rex.dwyer at syngenta.com
Sent: Thursday, March 10, 2011 8:47 AM
To: ligges at statistik.tu-dortmund.de; arun.kumar.saha at gmail.com
Cc: r-help at r-project.org
Subject: Re: [R] using lapply

But no one answered Kushan's question about performance implications of for-loop vs lapply. With apologies to George Orwell: "for-loops BAAAAAAD, no loops GOOOOOOD."
While using no loops is faster, lapply has a loop in it and isn't much different in speed from the equivalent for loop. The big advantage of the *apply functions is that they can make your code easier to understand.

Here are some times for various ways of computing log(1:1000000). This example is probably close to a worst-case scenario for the for loop, since the time is dominated by the [<- operation. Using the various *apply functions can get you a speed-up of c. 4x, which is nice, but the vectorized log gives a speed-up of c. 15x over the fastest of the loops. I think the for-loop method is ungainly because it obscures the flow of the data, but there is no accounting for taste.
> system.time({ val.for <- numeric(1e6); for(i in seq_len(1e6)) val.for[i] <- log(i) })
user system elapsed
7.03 0.02 7.19
> system.time({ val.sapply <- sapply(seq_len(1e6), log) })
user system elapsed
6.59 0.03 6.80
> system.time({ val.lapply <- unlist(lapply(seq_len(1e6), log)) })
user system elapsed
2.48 0.00 2.52
> system.time({ val.vapply <- vapply(seq_len(1e6), log, FUN.VALUE=0) })
user system elapsed
1.74 0.00 1.76
> system.time({ val.log <- log(seq_len(1e6)) })
user system elapsed
0.12 0.00 0.12
> identical(val.vapply,val.sapply) && identical(val.vapply,val.for) &&
+   identical(val.vapply,val.lapply) && identical(val.vapply,val.log)
[1] TRUE

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
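
One practical difference between the variants timed above: vapply() requires the type and length of each result to be declared through FUN.VALUE and stops with an error on a mismatch, whereas sapply() returns whatever shape the results happen to have. A small sketch with made-up inputs; the helper ragged() is hypothetical and exists only to produce results of varying length.

```r
x <- 1:5

# All of these give the same numeric vector for a well-behaved function.
v_sapply <- sapply(x, log)
v_lapply <- unlist(lapply(x, log))
v_vapply <- vapply(x, log, FUN.VALUE = numeric(1))
v_vector <- log(x)
stopifnot(identical(v_vapply, v_sapply),
          identical(v_vapply, v_lapply),
          identical(v_vapply, v_vector))

# A function whose result length varies: sapply() falls back to a list,
# while vapply() refuses to continue.
ragged <- function(i) seq_len(i)
s_out <- sapply(x, ragged)                       # a list of length 5
v_err <- tryCatch(vapply(x, ragged, FUN.VALUE = numeric(1)),
                  error = function(e) "error")   # vapply() signals an error
```

So besides being the fastest of the *apply variants in the timings above, vapply() also documents what each call is expected to return.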
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Uwe Ligges
Sent: Thursday, March 10, 2011 4:38 AM
To: Arun Kumar Saha
Cc: r-help at r-project.org
Subject: Re: [R] using lapply

On 10.03.2011 08:30, Arun Kumar Saha wrote:
In reply to the post http://r.789695.n4.nabble.com/using-lapply-td3345268.html
Hmmm, can you please reply to the original post and quote it? Your mail was not recognized as belonging to the same thread as the original poster's message (and hence I wasted time answering it again).

Thanks,
Uwe Ligges
Dear Kushan, this may be a good start:

## assuming 'instr.list' is your list object and you are applying
## the my.strat() function to each element of that list, you can
## use lapply as
lapply(instr.list, function(x) return(my.strat(x)))

Here the result will again be a list, with the same length as your
original list 'instr.list'. If instead my.strat() returns a single
number, you might want a vector rather than a list; in that case
just use 'sapply'.

HTH
Arun
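
To make the difference concrete, here is a minimal sketch. The thread shows neither instr.list nor my.strat(), so both are hypothetical stand-ins below: a small named list and a function that returns one number per element.

```r
# Hypothetical stand-ins: the thread does not show instr.list or my.strat().
instr.list <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)
my.strat <- function(x) mean(x)   # returns a single number per element

res_list <- lapply(instr.list, my.strat)   # list, same length and names as instr.list
res_vec  <- sapply(instr.list, my.strat)   # named numeric vector

# The anonymous wrapper function(x) return(my.strat(x)) gives the same result.
res_wrapped <- lapply(instr.list, function(x) return(my.strat(x)))
stopifnot(identical(res_list, res_wrapped))
```

Passing my.strat directly, as in the first call, is equivalent to the wrapped form and a little easier to read.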
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.