different results from summarization by loop and sum or rowMeans function

3 messages · Markus Schmidberger, jim holtman, Brian Ripley

#
Hi,

I get different results when calculating row means with the function 
rowMeans() and with a simple for-loop. The differences are very small, but 
after this calculation I start some optimization algorithms (BFGS or CG), 
and there I get huge differences (caused only by the small changes in the 
start values; I changed nothing else in the code).
How can I avoid these differences between sum-loops and sum-functions?

Attached is a small test script using data from Bioconductor.

Best
Markus


library(affy)
data(affybatch.example)
mat <- exprs(affybatch.example)[1:100,1:3]
mat <- exp(1)*mat
mat <- asinh(mat)

rowM1<- rowMeans(mat)

t=rep(0,100) # vector of zeros
for(i in 1:100){
   for(j in 1:3)
       t[i] <- t[i] + mat[i,j]
}
rowM2 <- t/3

m1 <- mat - rowM1
m2 <- mat -rowM2

print(m1-m2)

sessionInfo()
R version 2.7.1 (2008-06-23)
i386-pc-mingw32

locale:
LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods 
[8] base    

other attached packages:
[1] affy_1.18.2          preprocessCore_1.2.0 affyio_1.8.0       
[4] Biobase_2.0.1
#
How low is "very low"?  This is probably answered by FAQ 7.31
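
One way to put a number on "very low" is to look at the largest absolute difference and to compare the two results with a tolerance rather than exactly. The following is only a sketch: it uses simulated data as a stand-in for the affy matrix (the seed and value range are assumptions, not the poster's data), and the names `max_abs_diff` and `same_within_tol` are made up for illustration.

```r
# Simulated stand-in for the 100 x 3 expression matrix (assumption, not affy data)
set.seed(1)
mat <- asinh(exp(1) * matrix(runif(300, 50, 5000), nrow = 100, ncol = 3))

# Row means from the built-in function
rowM1 <- rowMeans(mat)

# Row means from an explicit loop, accumulating in double precision
rowM2 <- numeric(nrow(mat))
for (i in seq_len(nrow(mat))) {
  for (j in seq_len(ncol(mat))) {
    rowM2[i] <- rowM2[i] + mat[i, j]
  }
}
rowM2 <- rowM2 / ncol(mat)

# Size of the discrepancy, and a tolerance-based comparison
max_abs_diff <- max(abs(rowM1 - rowM2))
same_within_tol <- isTRUE(all.equal(rowM1, rowM2))
print(max_abs_diff)
print(same_within_tol)
```

If the largest difference is on the order of machine precision relative to the values, the two results agree as closely as floating-point arithmetic allows.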

On Thu, Sep 11, 2008 at 9:49 AM, Markus Schmidberger
<schmidb at ibe.med.uni-muenchen.de> wrote:

  
    
#
On Thu, 11 Sep 2008, Markus Schmidberger wrote:

            
Indeed, but the C code (rowMeans) is likely to be more accurate as it uses 
an extended-precision accumulator.
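
As a rough illustration of the accumulator effect, the sketch below sums the same vector once with an explicit R loop (plain double-precision accumulation) and once with the built-in sum(), which runs in compiled code. The vector of 0.1 values is an arbitrary example. Whether the compiled code actually accumulates in extended precision depends on the platform and on how R was built, so on some machines the two errors may be identical.

```r
# Arbitrary test vector; 0.1 is not exactly representable in binary
x <- rep(0.1, 1e6)

# Plain double-precision accumulation in an R-level loop
loop_sum <- 0
for (v in x) loop_sum <- loop_sum + v

# Built-in sum: compiled code that may use an extended-precision accumulator
builtin_sum <- sum(x)

# Compare both against 1e5, the sum in exact decimal arithmetic
err_loop    <- abs(loop_sum - 1e5)
err_builtin <- abs(builtin_sum - 1e5)
print(c(loop = err_loop, builtin = err_builtin), digits = 3)
```
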

You cannot. What you can do is work on making what you do with these 
inputs numerically stable: unless you do so your end results will have 
very little value.  (For example, are you finding different local minima, 
in which case you need to decide how to treat that possibility?)
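
A simple check along these lines is to rerun the optimizer from a start value perturbed at rounding-error level and see how far the results move. The sketch below is an assumption-laden illustration: `objective` is a hypothetical, well-behaved quadratic standing in for the real objective function, and the perturbation size 1e-12 is an arbitrary choice.

```r
# Hypothetical objective with its minimum at c(1, 2); replace with the real one
objective <- function(p) sum((p - c(1, 2))^2)

start <- c(0, 0)
perturbed_start <- start + 1e-12   # mimics a rounding-level change in the inputs

fit_a <- optim(start, objective, method = "BFGS")
fit_b <- optim(perturbed_start, objective, method = "BFGS")

# Large gaps here would suggest instability or different local minima
par_gap   <- max(abs(fit_a$par - fit_b$par))
value_gap <- abs(fit_a$value - fit_b$value)
print(c(par_gap = par_gap, value_gap = value_gap))
```

For a stable problem both gaps should be tiny. If the real objective gives large gaps under such a perturbation, that points to the optimization being ill-conditioned or to several local minima, not to a fault in rowMeans() or in the loop.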

I suggest reading an introductory book on Numerical Analysis, or

Monahan, J. F. (2001) Numerical Methods of Statistics. Cambridge: 
Cambridge University Press. Chapter 2.

or

Press, W. H., Teukolsky, S. A., Vetterling, W. T. and Flannery, B. P. 
(2007) Numerical Recipes: The Art of Scientific Computing. Third 
Edition. Cambridge: Cambridge University Press. Section 1.1 (I think).