What algorithm is R using to calculate mean?
This was also asked on StackOverflow: http://stackoverflow.com/q/17866149/271616. Here is the answer I posted: This appears to be the updating method of West, 1979 [1] and it was implemented in R-2.3.0 in response to PR#1228 [2]. I'm not positive this is the correct algorithm, since it was suggested by Martin Maechler but implemented by Brian Ripley. I couldn't find a reference in the source code or version control logs that listed the actual algorithm used. It was implemented in cov.c in revision 37389 and in summary.c in revision 37393. [1] http://dl.acm.org/citation.cfm?doid=359146.359153 [2] https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=1228 Best, -- Joshua Ulrich | about.me/joshuaulrich FOSS Trading | www.fosstrading.com On Thu, Jul 25, 2013 at 2:44 PM, Zach Harrington
<zach.harrington at gmail.com> wrote:
I am curious to know what algorithm R's mean function uses. Is there some
reference to the numerical properties of this algorithm?
I found the following C code in summary.c:do_summary():
case REALSXP:
PROTECT(ans = allocVector(REALSXP, 1));
for (i = 0; i < n; i++) s += REAL(x)[i];
s /= n;
if(R_FINITE((double)s)) {
for (i = 0; i < n; i++) t += (REAL(x)[i] - s);
s += t/n;
}
REAL(ans)[0] = s;
break;
It seems to do a straight up mean:
for (i = 0; i < n; i++) s += REAL(x)[i];
s /= n;
Then it adds what i assume is a numerical correction which seems to be the
mean difference from the mean of the data:
for (i = 0; i < n; i++) t += (REAL(x)[i] - s);
s += t/n;
I haven't been able to track this algorithm down anywhere (mean is not a
great search term).
Any help would be much appreciated,
Zach Harrington
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel