Hi,
I have R installed from the Ubuntu PPA and a local build of R (more
details below). I will refer to these as "R" and "R-devel",
respectively. I've reproduced the following on Ubuntu 13.10 and 14.04.
Below is an example (which requires the bootstrap package) that takes
10 seconds for me to run with R-devel and 5 seconds with R:
library(bootstrap)
str(tooth)
theta <- function(ind) {
  easy  <- lm(strength ~ E1+E2, data=tooth, subset=ind)
  diffi <- lm(strength ~ D1+D2, data=tooth, subset=ind)
  (sum(resid(easy)^2) - sum(resid(diffi)^2))/13
}
tooth.boot <- bootstrap(1:13, 2000, theta)
I'm wondering if this is due to different compiler flags. For R, when
installing the bootstrap package, I see
gcc -std=gnu99 -shared -Wl,-Bsymbolic-functions -Wl,-z,relro -o
bootstrap.so boott.o -lgfortran -lm -lquadmath -L/usr/lib/R/lib -lR
For R-devel I see:
ccache gcc -shared -L/usr/local/lib -o bootstrap.so boott.o -lgfortran
-lm -lquadmath -L/usr/local/lib/R-devel/lib/R/lib -lR
My install script for the local build is based on Dirk's script [1].
In particular, my configure command is:
R_PAPERSIZE=letter R_BATCHSAVE="--no-save --no-restore"
R_BROWSER=xdg-open PAGER=/usr/bin/pager PERL=/usr/bin/perl
R_UNZIPCMD=/usr/bin/unzip R_ZIPCMD=/usr/bin/zip
R_PRINTCMD=/usr/bin/lpr LIBnn=lib AWK=/usr/bin/awk CC="gcc"
CFLAGS="-ggdb -pipe -std=gnu99 -Wall -pedantic" CXX="g++"
CXXFLAGS="-ggdb -pipe -Wall -pedantic" FC="gfortran" F77="gfortran"
MAKE="make -j$NJOBS" "${repoDir}/configure"
--prefix=/usr/local/lib/R-devel --enable-R-shlib --with-blas
--with-lapack --with-readline --without-recommended-packages >
../build-logs/configure 2>&1
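
The CFLAGS in the configure command above contain no -O option, and gcc compiles without optimization when none is given. Setting CFLAGS explicitly should also replace configure's own default flags. A minimal sketch for checking which optimization level each build was configured with, assuming both launchers (R and R-devel) are on the PATH. The helper name opt_level is hypothetical; "R CMD config CFLAGS" prints the C flags a build was configured with.

```shell
#!/bin/sh
# Print the optimization option (-O0, -O2, -O3, -Os, ...) found in a flags
# string, or "none" when no -O option is present.
opt_level() {
    level="none"
    for flag in $1; do
        case "$flag" in
            -O*) level="$flag" ;;   # the last -O option wins, as with gcc
        esac
    done
    echo "$level"
}

# Report the level for each R launcher that is actually installed.
for r in R R-devel; do
    if command -v "$r" >/dev/null 2>&1; then
        echo "$r: $(opt_level "$("$r" CMD config CFLAGS)")"
    fi
done
```

For the CFLAGS shown in the configure command this would report "none", which is one candidate explanation for a large across-the-board slowdown.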
I'm using R-devel updated to today's revision, but I also compiled a version
from a year ago and saw the same performance, which is why I suspect that
my installation script accounts for the difference.
Any advice would be appreciated and please let me know if any other
information would be helpful.
Best,
Scott
[1]
http://www.personal.psu.edu/mar36/blogs/the_ubuntu_r_blog/2012/08/installing-the-development-version-of-r-on-ubuntu-alongside-the-current-version-of-r.html
--
Scott Kostyshak
Economics PhD Candidate
Princeton University
50% performance of custom R build compared to PPA R for a command
5 messages · Scott Kostyshak, Dirk Eddelbuettel
Scott,

My first quick hunches are a) 50% is too much for compiler switches, b) your
example shows R code, and c) are you sure you are using the same BLAS?

What happens when you profile?

Dirk
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
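
One way to act on the profiling question is to run the same script under Rprof with each build and compare where the self time goes. This is only a sketch. It assumes the example from the first message is saved as mwe.R in the current directory and that both launchers (R and R-devel) are on the PATH. The helper write_profile_script and the output file names are hypothetical.

```shell
#!/bin/sh
# Write an R script to $2 that profiles the R file $1 and prints the
# functions with the most self time; profiling samples go to the file $3.
write_profile_script() {
    cat > "$2" <<EOF
Rprof("$3")
source("$1")
Rprof(NULL)
print(head(summaryRprof("$3")\$by.self, 15))
EOF
}

# Profile mwe.R under every launcher that is installed.
if [ -f mwe.R ]; then
    for r in R R-devel; do
        command -v "$r" >/dev/null 2>&1 || continue
        write_profile_script mwe.R "profile-$r.R" "$r.Rprof"
        "$r" --vanilla -q -f "profile-$r.R" > "profile-$r.txt"
    done
fi
```

Comparing the two profile-*.txt files side by side shows whether the extra time is concentrated in a few functions or spread evenly.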
On Thu, Apr 24, 2014 at 4:32 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
> Scott,
>
> My first quick hunches are a) 50% is too much for compiler switches, b) your
> example shows R code, and c) are you sure you are using the same BLAS?
Thanks for the quick reply Dirk and for the suggestions.

As for BLAS, yes I believe I'm using the same BLAS. The output of the
following two commands is the same (except for the memory addresses of
course):
$ ldd /usr/local/lib/R-devel/lib/R/bin/exec/R
$ ldd /usr/lib/R/bin/exec/R

And executing
$ lsof -p <PID> | grep 'blas\|lapack'
also returns the same output for both Rs:
R 13017 scott mem REG 8,1 9142768 2097161
/usr/lib/atlas-base/atlas/liblapack.so.3.0
R 13017 scott mem REG 8,1 3776592 2097162
/usr/lib/atlas-base/atlas/libblas.so.3.0

I profiled and it seems that all of the R functions are slow (I can
post the output if anyone is interested). I rebuilt with -O3 in CFLAGS
and this improved things a lot. Time went down from 10 seconds to 5.7
or so. I reprofiled and again the R functions of R-devel seem just a
tad slower across the board (I can send output if interested).

Below are some timings comparing the optimized R-devel to R.

$ time R-devel CMD BATCH mwe.R

real 0m5.755s
user 0m5.678s
sys 0m0.079s

$ time R CMD BATCH mwe.R

real 0m5.453s
user 0m5.371s
sys 0m0.054s

Rerunning the above commands multiple times gives about the same output.

There's still a .3 second difference and I'm curious to know why. Any ideas?

Scott

--
Scott Kostyshak
Economics PhD Candidate
Princeton University
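
To put the remaining gap in relative terms, the reported timings can be turned into a percentage. A small sketch using only the numbers quoted above; the helper name pct_slower is hypothetical.

```shell
#!/bin/sh
# Percentage by which the first time (seconds) exceeds the second,
# printed to one decimal place.
pct_slower() {
    awk -v slow="$1" -v fast="$2" \
        'BEGIN { printf "%.1f\n", (slow - fast) / fast * 100 }'
}

pct_slower 5.755 5.453   # "real" times, R-devel vs R: prints 5.5
pct_slower 5.678 5.371   # "user" times, R-devel vs R: prints 5.7
```

So after rebuilding with -O3 the custom build is roughly 5 to 6 percent slower on this script, compared with about twice as slow before.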
On 25 April 2014 at 11:38, Scott Kostyshak wrote:
| On Thu, Apr 24, 2014 at 4:32 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
| >
| > Scott,
| >
| > My first quick hunches are a) 50% is too much for compiler switches, b) your
| > example shows R code, and c) are you sure you are using the same BLAS?
|
| Thanks for the quick reply Dirk and for the suggestions.
|
| As for BLAS, yes I believe I'm using the same BLAS. The output of the
| following two commands is the same (except for the memory addresses of
| course):
| $ ldd /usr/local/lib/R-devel/lib/R/bin/exec/R
| $ ldd /usr/lib/R/bin/exec/R
|
| And executing
| $ lsof -p <PID> | grep 'blas\|lapack'
| also returns the same output for both Rs:
| R 13017 scott mem REG 8,1 9142768 2097161
| /usr/lib/atlas-base/atlas/liblapack.so.3.0
| R 13017 scott mem REG 8,1 3776592 2097162
| /usr/lib/atlas-base/atlas/libblas.so.3.0
|
| I profiled and it seems that all of the R functions are slow (I can
| post the output if anyone is interested). I rebuilt with -O3 in CFLAGS
| and this improved things a lot. Time went down from 10 seconds to 5.7

That is surprisingly large. In my mail yesterday I basically bet against it.

| or so. I reprofiled and again the R functions of R-devel seem just a
| tad slower across the board (I can send output if interested).
|
| Below are some timings comparing the optimized R-devel to R.
|
| $ time R-devel CMD BATCH mwe.R
|
| real 0m5.755s
| user 0m5.678s
| sys 0m0.079s
|
| $ time R CMD BATCH mwe.R
|
| real 0m5.453s
| user 0m5.371s
| sys 0m0.054s
|
| Rerunning the above commands multiple times gives about the same output.
|
| There's still a .3 second difference and I'm curious to know why. Any ideas?

Different code base? If you want _identical_ outcomes you need identical
_input_: code, compiler, settings, hardware, ...

Dirk
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
On Fri, Apr 25, 2014 at 11:59 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
> Different code base? If you want _identical_ outcomes you need identical
> _input_: code, compiler, settings, hardware, ...
Not looking for identical, just looking to squeeze out something to learn
from about other possibilities for differences, e.g. libraries that I'm not
linking against at compile time, or differences with byte compiling R. But it
doesn't seem like there are any obvious candidates so I'll stop here for now.

Thanks for the help, Dirk.

Scott

--
Scott Kostyshak
Economics PhD Candidate
Princeton University
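
For the "libraries that I'm not linking against" possibility, the ldd comparison described earlier in the thread can be automated by stripping the load addresses before diffing. A minimal sketch, assuming the two binary paths given in the thread; the helper names strip_addrs and diff_libs are hypothetical.

```shell
#!/bin/sh
# Remove the load address, e.g. " (0x00007f...)", that ldd appends to each line.
strip_addrs() {
    sed -e 's/ *(0x[0-9a-fA-F]*)$//'
}

# Print the lines on which the shared-library lists of two binaries differ.
# Returns diff's exit status: 0 when the lists are identical.
diff_libs() {
    tmp_a=$(mktemp) && tmp_b=$(mktemp) || return 2
    ldd "$1" | strip_addrs | sort > "$tmp_a"
    ldd "$2" | strip_addrs | sort > "$tmp_b"
    diff "$tmp_a" "$tmp_b"
    status=$?
    rm -f "$tmp_a" "$tmp_b"
    return "$status"
}

# Binary locations as given in the thread; skipped when either is missing.
r_ppa=/usr/lib/R/bin/exec/R
r_devel=/usr/local/lib/R-devel/lib/R/bin/exec/R
if [ -x "$r_ppa" ] && [ -x "$r_devel" ]; then
    diff_libs "$r_ppa" "$r_devel" && echo "same shared libraries"
fi
```

This only covers libraries resolved at link time. Libraries loaded at run time still need a check on the running process, such as the lsof command used earlier.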