Hi everyone, I'm translating into R some programs I worked through in Matlab to calculate the angle between two vectors (very large--like 6200 rows in each vector). In Matlab, I used a series of nested for loops, because I was calculating the angles between many pairs of vectors. I know for loops are not desirable in R code, so I was wondering if anyone could recommend a faster way to complete this task. Also, I have NAs in my vectors--I've had trouble performing various operations on my vectors in R because of these NAs. Any advice on this would be greatly appreciated. Thanks! Evan Macosko -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
vector angle
5 messages · Evan Zane Macosko, Brian Ripley, Laurent Gautier
Evan Zane Macosko wrote:
Hi everyone, I'm translating into R some programs I worked through in Matlab to calculate the angle between two vectors (very large--like 6200 rows in each vector). In Matlab, I used a series of nested for loops, because I was calculating the angles between many pairs of vectors. I know for loops are not desirable in R code, so I was wondering if anyone could recommend a faster way to complete this task. Also, I have NAs in my vectors--I've had trouble performing various operations on my vectors in R because of these NAs. Any advice on this would be greatly appreciated.
As far as I know, the use of apply (sapply and lapply) would make things run faster than 'for' loops. As for the NAs, you may want to ignore the vectors that have an NA coordinate, or maybe do something else. To get started, try the help pages for the functions 'is.na' and 'na.action'.

I hope it helps,

Laurent

-- Laurent Gautier CBS, Building 208, DTU PhD. Student D-2800 Lyngby,Denmark tel: +45 45 25 24 85 http://www.cbs.dtu.dk/laurent
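To make the NA suggestion concrete, here is a minimal sketch of one possible policy for a single pair of vectors: keep only the positions observed in both, then compute the angle from the usual cosine formula. The vectors x and y are made-up data, and pairwise deletion is an assumption about what is wanted, not something the original poster specified.

```r
# Hypothetical example vectors with missing values.
x <- c(1, 2, NA, 4, 5)
y <- c(2, NA, 1, 3, 7)

# Keep only the coordinates that are observed in both vectors.
ok <- !is.na(x) & !is.na(y)
xs <- x[ok]
ys <- y[ok]

# cos(th) = x.y / (||x|| ||y||), computed on the shared coordinates only.
cos_th <- sum(xs * ys) / sqrt(sum(xs^2) * sum(ys^2))
angle <- acos(cos_th)   # angle in radians
```

Note that dropping coordinates changes the space the angle is measured in, so angles for different pairs are only comparable if the same coordinates are dropped for all of them.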
On Tue, 17 Jul 2001, Laurent Gautier wrote:
As far as I know, the use of apply (sapply and lapply) would make things run faster than 'for' loops.
Not very much faster in R (and apply itself is basically a for loop). Because for loops were slow in S3, the message seems to have got transferred to S4 and R. Often the best approach is to see if a loop is fast enough, first. (In S-PLUS 5.0 lapply was actually slower than a for loop.)

The key to speed is usually vectorization (often at the expense of memory, but these are not very large vectors). Here it looks as if outer() might be the key.

I am not clear what is really required here, but if you want the angles between all pairs of p vectors of length m, say, I would try

- normalizing them to unit length
- using dist() to compute the Euclidean distances between the pairs, since for unit-length vectors x and y

    ||x - y||^2 = 2 - 2x.y = 2(1 - cos th)

  where th is the angle between them.

That just makes use of for loops written in the C internals of dist, and doing the computations oneself in C is often a fairly simple option.
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595
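The normalize-then-dist() recipe from the message above might look roughly like the sketch below. The matrix X (one vector per column), its size, and the choice to drop every row containing an NA are all assumptions made for illustration; the real data would have about 6200 rows.

```r
# Hypothetical data: p = 4 vectors of length m = 6, stored as columns.
set.seed(1)
X <- matrix(rnorm(24), nrow = 6, ncol = 4)
X[2, 3] <- NA   # inject a missing value

# Assumed NA policy: drop any row (coordinate) missing in at least one vector.
Xc <- X[complete.cases(X), , drop = FALSE]

# Normalize each column to unit length.
U <- sweep(Xc, 2, sqrt(colSums(Xc^2)), "/")

# dist() computes distances between rows, so transpose first.
d <- as.matrix(dist(t(U)))

# ||x - y||^2 = 2(1 - cos th)  =>  cos th = 1 - d^2 / 2
cos_th <- 1 - d^2 / 2
cos_th <- pmin(pmax(cos_th, -1), 1)   # clamp rounding error into [-1, 1]

# p x p symmetric matrix of angles in radians, zeros on the diagonal.
angles <- acos(cos_th)
```

For unit-length columns, crossprod(U) gives the same cosines directly, so either route avoids an explicit loop in R code; which is faster is best checked with system.time() on the real data.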
Prof Brian Ripley wrote:
Not very much faster in R (and apply itself is basically a for loop). Because for loops were slow in S3, the message seems to have got transferred to S4 and R. Often the best approach is to see if a loop is fast enough, first. (In S-PLUS 5.0 lapply was actually slower than a for loop.)
And I lived all these years in ignorance, thinking I was doing good while I was making things worse.....

I haven't looked at the R introduction manuals for a while now, so maybe the following remark is already stated in them. By the time I started with R, Modern Applied Statistics with S-PLUS was the main reference, and being a poor student at that time, I learned things through the internet. That proved to be a bad thing, since there was a rumour then about these functions being faster than loops (just as the 'map' function is said to be faster than a for loop in Python, for example).

I just checked what a web search would turn up, and the rumour seems to be still alive (see http://www.math.yorku.ca/Who/Faculty/Monette/S-news/2531.html , at the bottom of the page, or http://www.usc.edu/isd/doc/statistics/splus/faq/v5/newinv5.shtml ), but had I followed more closely what has been said on this list, I would have known it was not the case (see http://www.ens.gu.edu.au/robertk/R/help/00a/1999.html).

Thanks for pointing out the mistake I made,

Laurent
On Tue, 17 Jul 2001, Laurent Gautier wrote:
[...] Thanks for pointing out the mistake I made,
Not a mistake, but a misconception perhaps. Bill Venables and I deliberately illustrated a number of approaches in Chapter 7 of S Programming using S+3.4, S+2000, S+5.1 and R 0.90.1 to show that (p. 152)

Since our aim is not to compare systems, the timings here using different engines were done on different systems, all of which had ample RAM. It is worth noting that the different S implementations used here do differ, sometimes radically, in their ordering of approaches, and the ordering might be different again on machines with less RAM available.

One example (using a for loop) shows a 17x speed-up in S+5.1, and a 50% speed-up in R. Also, later versions of R show some differences, especially due to the new memory management system (and lapply has been altered since those timings too).
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595