weighted.mean uses zero when na.rm=TRUE (PR#14032)

3 messages · Arni Magnusson, Peter Dalgaard, Henrik Bengtsson

#
The weighted.mean() function replaces NA values with 0.0 when the user 
specifies na.rm=TRUE:

   x <- c(101, 102, NA)
   mean(x, na.rm=TRUE)                         # 101.5, correct
   weighted.mean(x, na.rm=TRUE)                # 67.66667, wrong
   weighted.mean(x, w=c(1,1,1), na.rm=TRUE)    # 67.66667, wrong
   weighted.mean(x, w=c(1,1,1)/3, na.rm=TRUE)  # 67.66667, wrong

The weights are normalized (w <- w/sum(w)) before the NA values are
removed, so the weights of the remaining elements no longer sum to 1.
The effect is the same as setting x[is.na(x)] <- 0. This bug was
introduced between versions 2.9.2 and 2.10.0.
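
A minimal sketch of the order-of-operations problem, assuming the
behaviour is as described above. The two helper functions are
hypothetical illustrations written for this report, not the actual
source of weighted.mean():

```r
# Hypothetical helper: normalizes the weights over ALL elements first,
# then drops the NA positions (the order described in the report).
wm_normalize_first <- function(x, w = rep(1, length(x))) {
  w <- w / sum(w)            # NA positions still contribute to sum(w)
  keep <- !is.na(x)
  sum(x[keep] * w[keep])     # remaining weights no longer sum to 1
}

# Hypothetical helper: drops the NA positions first, then normalizes.
wm_drop_na_first <- function(x, w = rep(1, length(x))) {
  keep <- !is.na(x)
  x <- x[keep]
  w <- w[keep]
  sum(x * w) / sum(w)
}

x <- c(101, 102, NA)
wm_normalize_first(x)   # 67.66667, the value reported above
wm_drop_na_first(x)     # 101.5, agrees with mean(x, na.rm=TRUE)
```

With equal weights the first helper computes (101 + 102)/3 and the
second computes (101 + 102)/2, which is where the two numbers come from.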

Thanks,

Arni
#
arnima at hafro.is wrote:
Yes,

r48644 on May 27, specifically.
#
Here are some redundancy tests that may be useful (I use similar ones
for aroma.light::weightedMedian):

n <- 10
x <- 1:n

# No weights
m1 <- mean(x)
m2 <- weighted.mean(x)
stopifnot(all.equal(m1, m2))

# Equal weights on different scales
w1 <- rep(1, n)
m1 <- weighted.mean(x, w1)
w2 <- rep(100, n)
m2 <- weighted.mean(x, w2)
stopifnot(all.equal(m1,m2))

# Pull the mean towards first value
w1[1] <- 5000
m1 <- weighted.mean(x, w1)
w2[1] <- 500000
m2 <- weighted.mean(x, w2)
stopifnot(all.equal(m1,m2))

# Zero weights
x <- 1:n
w <- rep(1, n)
w[8:n] <- 0
m1 <- weighted.mean(x, w)
m2 <- mean(x[1:7])
stopifnot(all.equal(m1,m2))

# All weights set to zero (both sides are expected to be NaN)
x <- 1:n
w <- rep(0, n)
m1 <- weighted.mean(x, w)
m2 <- mean(x[w > 0])
stopifnot(all.equal(m1,m2))

# Missing values
x <- 1:n
w <- rep(1, n)
x[4:5] <- NA
m1 <- weighted.mean(x, w, na.rm=TRUE)
m2 <- mean(x, na.rm=TRUE)
stopifnot(all.equal(m1,m2))

/Henrik


On Fri, Oct 30, 2009 at 8:07 AM, Peter Dalgaard
<p.dalgaard at biostat.ku.dk> wrote: