Matrix oriented computing
Prof. Ripley,

Excellent point. Neither sapply() nor outer() is the "elephant in the room" in this situation.
On Fri, 2005-08-26 at 16:55 +0100, Prof Brian Ripley wrote:
Try profiling. Doing this many times to get an overview, e.g. for sapply
with df=1:1000:
   %       self       %      total
  self    seconds   total   seconds   name
 98.26      6.78    98.26     6.78    "FUN"
  0.58      0.04     0.58     0.04    "unlist"
  0.29      0.02     0.87     0.06    "as.vector"
  0.29      0.02     0.58     0.04    "names<-"
  0.29      0.02     0.29     0.02    "names<-.default"
  0.29      0.02     0.29     0.02    "names"
so almost all the time is in qchisq (which sapply calls under the name "FUN").
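A profile of this kind can be collected with Rprof() and summaryRprof(). The following is only a minimal sketch, not the exact commands used above: the scratch file `prof_file` and the repetition count of 50 are assumptions chosen for illustration, and the timings will differ by machine.

```r
# Probabilities and degrees of freedom as used elsewhere in this thread
x  <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9,
        0.95, 0.975, 0.99, 0.995)
df <- 1:1000

prof_file <- tempfile(fileext = ".out")   # hypothetical scratch file for the samples

Rprof(prof_file)                          # start the sampling profiler
for (i in 1:50) {                         # repeat so enough samples accumulate
  mat <- sapply(x, qchisq, df)
}
Rprof(NULL)                               # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.self)                        # functions ranked by self time
```

Because sapply() passes qchisq along as its FUN argument, the time spent in qchisq is reported under "FUN" in the summary.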
On Fri, 26 Aug 2005, Marc Schwartz (via MN) wrote:
On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote:
Marc Schwartz <MSchwartz at mn.rr.com> writes:
x <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9,
0.95, 0.975, 0.99, 0.995)
df <- c(1:100)
mat <- sapply(x, qchisq, df)
dim(mat)
[1] 100 11
str(mat)
num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ...
outer() is perhaps a more natural first try... It does give the transpose of the sapply approach, though. round(t(outer(x,df,qchisq)),2) should be close. You should likely add dimnames.
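A minimal sketch of the outer() approach with dimnames added might look like this. The object name `tab` and the dimnames labels `df` and `p` are illustrative choices, not something specified in the thread.

```r
x  <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9,
        0.95, 0.975, 0.99, 0.995)
df <- 1:100

# outer(x, df, qchisq) evaluates qchisq(p, df) for every (p, df) pair,
# with probabilities in rows; t() puts df in rows, as sapply() does.
tab <- t(outer(x, df, qchisq))

# Label rows by degrees of freedom and columns by probability
dimnames(tab) <- list(df = df, p = x)

round(tab[1:3, ], 2)   # first three rows of the table
```

With the labels in place, a single entry can be looked up by name, e.g. tab["10", "0.95"].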
What I find interesting is that I would have intuitively expected outer() to be faster than sapply(). However:
system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE)
[1] 0.01 0.00 0.01 0.00 0.00
system.time(mat1 <- round(t(outer(x, df, qchisq)), 2),
            gcFirst = TRUE)
[1] 0.01 0.00 0.01 0.00 0.00

# No round() or t() to test for overhead
system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE)
[1] 0.01 0.00 0.02 0.00 0.00

# Bear in mind the round() on mat1 above
all.equal(mat, mat1)
[1] "Mean relative difference: 4.905485e-05"
all.equal(mat, t(mat2))
[1] TRUE

Even when increasing the size of 'df' to 1:1000:
system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE)
[1] 0.16 0.01 0.16 0.00 0.00
system.time(mat1 <- round(t(outer(x, df, qchisq)), 2),
            gcFirst = TRUE)
[1] 0.16 0.00 0.18 0.00 0.00

# No round() or t() to test for overhead
system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE)
[1] 0.16 0.01 0.17 0.00 0.00

It also seems that, at least in this case, t() and round() do not add much overhead.
Definitely not for such small matrices.
True, and both are C functions, which of course helps as well.

Best regards,

Marc
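Since single calls here run close to the resolution of system.time(), one way to compare the three variants is to repeat each call in a loop. This is only a sketch under assumptions: the repetition count `reps` and the result names `m1`, `m2`, `m3` are arbitrary, and the numbers depend on the machine.

```r
x  <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9,
        0.95, 0.975, 0.99, 0.995)
df <- 1:1000
reps <- 20   # arbitrary repetition count, chosen only to lengthen the timings

# Time each variant over the same number of repetitions
t_sapply <- system.time(for (i in seq_len(reps)) m1 <- sapply(x, qchisq, df))
t_outer  <- system.time(for (i in seq_len(reps)) m2 <- outer(x, df, qchisq))
t_full   <- system.time(for (i in seq_len(reps)) m3 <- round(t(outer(x, df, qchisq)), 2))

# User, system and elapsed seconds side by side
rbind(sapply = t_sapply, outer = t_outer, outer_t_round = t_full)[, 1:3]
```

If the three rows come out nearly equal, that is consistent with the observation above that the cost is dominated by qchisq itself rather than by the looping construct or by t() and round().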