An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20091110/d486e6fb/attachment-0001.pl>
standardGeneric seems slow; any way to get around it?
2 messages · John Tillinghast, Martin Morgan
John Tillinghast wrote:
Hi, I'm running some routines with standard matrix operations like solve() and diag(). When I do a profile, the lead item under total time is standardGeneric(). Furthermore, solve() and diag() have much greater total time than self time. ???
Hi John -- are you interpreting the output of summaryRprof() correctly? Here's an example
f = function() { g(); h() }
g = function() { t0 = Sys.time(); while (Sys.time() - t0 < 5) {} }
h = function() { t0 = Sys.time(); while (Sys.time() - t0 < 2) {} }
We'd expect all the time to be spent in 'g' and 'h'; 'f' is just a wrapper. So...
Rprof(); f(); Rprof(NULL); head(summaryRprof()$by.total, 8)
total.time total.pct self.time self.pct "f" 6.98 100.0 0.00 0.0 "g" 4.98 71.3 0.06 0.9 "-" 4.36 62.5 0.18 2.6 "-.POSIXt" 4.24 60.7 0.06 0.9 "difftime" 4.00 57.3 0.22 3.2 "structure" 2.74 39.3 1.16 16.6 "match.arg" 2.04 29.2 0.20 2.9 "h" 2.00 28.7 0.02 0.3 This says 100% of the total time (about 7s) is spent in 'f' or functions called by f, and 0% (well, I guess that's an estimate) is actually spent in 'f' itself. Likewise, the total time in g or the functions it calls is about 5s, and h or the functions it calls about 2s. In all of these cases, little time is spent exactly _in_ the function, so self.time is small. Note also -, -.POSIX, difftime, structure etc; these functions are common to both g and h, and the time spent in them reflects the contributions from both g and h. So the small self.time of solve and diag just say that the code is moving quickly through these, and the large total time for standardGeneric likely means that it's at the top of your call stack, or that it's called from several different places in your code. Here's another example, after library(Matrix); example(solve)
Rprof(); for (i in 1:1000) solve(h8); Rprof(NULL);
head(summaryRprof()$by.tot)
total.time total.pct self.time self.pct
"solve" 0.22 100.0 0.04 18.2
"standardGeneric" 0.22 100.0 0.04 18.2
"solve.default" 0.14 63.6 0.00 0.0
"diag" 0.04 18.2 0.02 9.1
"*" 0.02 9.1 0.02 9.1
".Call" 0.02 9.1 0.02 9.1
'solve' and 'standardGeneric' are at the top of the call stack, so get
~100% of the total time. In this particular case they also account for a
substantial portion of the self.time, and it (might) be worth asking
whether there's a different algorithm that would reduce the number of
calls to solve (e.g., by not iterating over 1:1000 ;).
If you were being aggressive and really were calling solve repeatedly on
the same type of object, you might
Rprof(); f = selectMethod(solve, class(h8)); for (i in 1:1000) f(h8);
+ Rprof(NULL); head(summaryRprof()$by.total)
total.time total.pct self.time self.pct
"f" 0.18 100.0 0.04 22.2
"solve.default" 0.14 77.8 0.02 11.1
".Call" 0.04 22.2 0.04 22.2
"nrow" 0.04 22.2 0.04 22.2
"diag" 0.04 22.2 0.00 0.0
"array" 0.02 11.1 0.02 11.1
Hope that sheds some light on your problem.
Martin
I assume there is some time-consuming decision going on in the usual functions; is there any way to avoid that and go straight to the calculaions? Thanks John Tillinghast [[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Martin Morgan Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793