sapply puzzlement
On Jan 27, 2011, at 7:16 PM, Ernest Adrogué i Calveras wrote:
> Hi, I have this data.frame with two variables in it,
>
>> z
>   V1 V2
> 1 10  8
> 2 NA 18
> 3  9  7
> 4  3 NA
> 5 NA 10
> 6 11 12
> 7 13  9
> 8 12 11
>
> and a vector of means,
>
>> means <- apply(z, 2, function (col) mean(na.omit(col)))
>> means
>        V1        V2
>  9.666667 10.714286

Two methods:
A) use sweep (which by default takes the difference)
> sweep(z, 2, means)
          V1         V2
1  0.3333333 -2.7142857
2         NA  7.2857143
3 -0.6666667 -3.7142857
4 -6.6666667         NA
5         NA -0.7142857
6  1.3333333  1.2857143
7  3.3333333 -1.7142857
8  2.3333333  0.2857143
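
For reference, the sweep() call above can be spelled out with its argument names. This is only a sketch: the data frame is retyped from the original post, and colMeans(z, na.rm = TRUE) is swapped in as a shorter way to get the same means as the apply() call.

```r
# Data retyped from the original post.
z <- data.frame(V1 = c(10, NA, 9, 3, NA, 11, 13, 12),
                V2 = c(8, 18, 7, NA, 10, 12, 9, 11))

# Column means ignoring NAs (same values as the apply() version).
means <- colMeans(z, na.rm = TRUE)

# MARGIN = 2 lines STATS up with the columns; FUN = "-" is the default,
# written out here only to make the operation explicit.
centered <- sweep(z, MARGIN = 2, STATS = means, FUN = "-")
print(centered)
```

Changing FUN (for example to "/") applies a different operation column by column with the same alignment.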
B) use the scale function (whose "whole purpose in life" is to
subtract the column means and, optionally, divide by the column standard
deviations; the division is suppressed here with the scale=FALSE argument)
> scale(z, scale=FALSE)
          V1         V2
1  0.3333333 -2.7142857
2         NA  7.2857143
3 -0.6666667 -3.7142857
4 -6.6666667         NA
5         NA -0.7142857
6  1.3333333  1.2857143
7  3.3333333 -1.7142857
8  2.3333333  0.2857143
attr(,"scaled:center")
       V1        V2
 9.666667 10.714286
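
A small sketch of what scale() hands back, since it differs from sweep() in two ways: the result is a matrix rather than a data.frame, and the means it used are stored in the "scaled:center" attribute. The data frame is retyped from the original post, and the name centered_df is made up for illustration.

```r
# Data retyped from the original post.
z <- data.frame(V1 = c(10, NA, 9, 3, NA, 11, 13, 12),
                V2 = c(8, 18, 7, NA, 10, 12, 9, 11))

# Center only; scale = FALSE skips the division by the standard deviation.
centered <- scale(z, center = TRUE, scale = FALSE)

# The column means that were subtracted (NAs are ignored when computing them).
used_means <- attr(centered, "scaled:center")

# Convert back if a data.frame is needed downstream.
centered_df <- as.data.frame(centered)
print(used_means)
```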
David.

>
> My intention was subtracting means from z, so instinctively I tried
>
>> z-means
>           V1         V2
> 1  0.3333333 -1.6666667
> 2         NA  7.2857143
> 3 -0.6666667 -2.6666667
> 4 -7.7142857         NA
> 5         NA  0.3333333
> 6  0.2857143  1.2857143
> 7  3.3333333 -0.6666667
> 8  1.2857143  0.2857143
>
> But this is completely wrong. sapply() gives the same result:
>
>> sapply(z, function(row) row - means)
>              V1         V2
> [1,]  0.3333333 -1.6666667
> [2,]         NA  7.2857143
> [3,] -0.6666667 -2.6666667
> [4,] -7.7142857         NA
> [5,]         NA  0.3333333
> [6,]  0.2857143  1.2857143
> [7,]  3.3333333 -0.6666667
> [8,]  1.2857143  0.2857143
>
> So, what is going on here?
> The following appears to work
>
>> z-matrix(means,ncol=2)[rep(1, dim(z)[1]),]
>           V1         V2
> 1  0.3333333 -2.7142857
> 2         NA  7.2857143
> 3 -0.6666667 -3.7142857
> 4 -6.6666667         NA
> 5         NA -0.7142857
> 6  1.3333333  1.2857143
> 7  3.3333333 -1.7142857
> 8  2.3333333  0.2857143
>
> but I think it's rather cumbersome, surely there must be a cleaner way
> to do it.
>
> --
> Ernest

David Winsemius, MD
West Hartford, CT
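
On the quoted question "what is going on here?": the z-means output is consistent with R's recycling rule. The length-2 vector is repeated down each column, so odd rows have mean(V1) subtracted and even rows have mean(V2) subtracted, in both columns. sapply() on a data.frame iterates over columns (not rows, despite the argument name used in the post), so each 8-element column minus the 2-element vector recycles the same way. The sketch below reproduces this under those assumptions; the names recycled, wrong and right are made up for illustration, and mapply() is one possible way to pair each column with its own mean, not something proposed in the thread.

```r
# Data retyped from the original post.
z <- data.frame(V1 = c(10, NA, 9, 3, NA, 11, 13, 12),
                V2 = c(8, 18, 7, NA, 10, 12, 9, 11))
means <- colMeans(z, na.rm = TRUE)

# Lay the means out the way recycling does: repeated in order and filled
# column by column, so they alternate down the rows of every column.
recycled <- matrix(rep_len(means, nrow(z) * ncol(z)), nrow = nrow(z))
wrong <- as.matrix(z) - recycled       # same values as z - means

# Pair each column with its own mean instead.
right <- mapply("-", z, means)         # matrix with columns V1 and V2
print(right)
```

With a single mean per column, sweep() or scale() as shown earlier in the reply does the same pairing without the manual bookkeeping.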