hiccup in apply?

2 messages · bogdan romocea, Gavin Simpson

#
Hello, I don't understand the behavior of apply() on the data frame below.

test <-
structure(list(Date = structure(c(13361, 13361, 13361, 13361,
13361, 13361, 13361, 13361, 13362, 13362, 13362, 13362, 13362,
13362, 13362, 13362, 13363, 13363, 13363, 13363, 13363, 13363,
13363, 13363, 13364, 13364, 13364, 13364, 13364, 13364, 13364,
13364, 13365, 13365, 13365, 13365, 13365, 13365, 13365, 13365,
13366, 13366, 13366, 13366, 13366, 13366, 13366, 13366, 13367,
13367), class = "Date"), RANK = as.integer(c(19, 7, 5, 4, 6,
3, 3, 4, 18, 7, 6, 4, 6, 3, 3, 4, 19, 7, 6, 4, 6, 3, 3, 4, 18,
6, 7, 4, 6, 3, 3, 4, 18, 6, 7, 4, 6, 3, 3, 4, 18, 6, 7, 4, 6,
3, 3, 4, 18, 6))), .Names = c("Date", "RANK"), row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24",
"25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35",
"36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46",
"47", "48", "49", "50"), class = "data.frame")

#---fine
summary(test)
Date                 RANK
 Min.   :2006-08-01   Min.   : 3.00
 1st Qu.:2006-08-02   1st Qu.: 4.00
 Median :2006-08-04   Median : 5.50
 Mean   :2006-08-03   Mean   : 6.62
 3rd Qu.:2006-08-05   3rd Qu.: 6.75
 Max.   :2006-08-07   Max.   :19.00

#---isn't this supposed to work?
apply(test, 2, mean)
Date RANK
  NA   NA
Warning messages:
1: argument is not numeric or logical: returning NA in:
mean.default(newX[, i], ...)
2: argument is not numeric or logical: returning NA in:
mean.default(newX[, i], ...)

Thank you,
b.

platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          4.0
year           2006
month          10
day            03
svn rev        39566
language       R
version.string R version 2.4.0 (2006-10-03)
#
On Fri, 2007-01-19 at 11:36 -0500, bogdan romocea wrote:
Look at ?apply, in particular the Details section.

The argument X of apply() is supposed to be an array. Details says:

     If 'X' is not an array but has a dimension attribute, 'apply'
     attempts to coerce it to an array via 'as.matrix' if it is
     two-dimensional (e.g., data frames) or via 'as.array'.

So you should look at what is happening with as.matrix():

str(as.matrix(test))
 chr [1:50, 1:2] "2006-08-01" "2006-08-01" "2006-08-01" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:50] "1" "2" "3" "4" ...
  ..$ : chr [1:2] "Date" "RANK"

Notice this is now a character matrix and not what you thought it was.
So look at ?as.matrix and we see:

     'as.matrix' is a generic function. The method for data frames will
     convert any non-numeric/complex column into a character vector
     using 'format' and so return a character matrix, except that
     all-logical data frames will be coerced to a logical matrix.  When
     coercing a vector, it produces a one-column matrix, and promotes
     the names (if any) of the vector to the rownames of the matrix.

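This explains what is happening: the Date column forces as.matrix() to
return a character matrix, so mean() is called on character vectors and
returns NA with a warning.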
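
To see the coercion in isolation, here is a minimal sketch on a three-row stand-in for the posted data frame. The object names (small, m, col_means, num_only, rank_mean) are made up for illustration and do not appear in the original post. It also shows one possible way around the problem, assuming you only need the numeric columns: drop the non-numeric ones before calling apply().

```r
# Hypothetical miniature of the 'test' data frame from the original post.
small <- data.frame(
  Date = as.Date(c("2006-08-01", "2006-08-02", "2006-08-03")),
  RANK = c(19L, 7L, 5L)
)

# The Date column makes as.matrix() return a character matrix.
m <- as.matrix(small)
typeof(m)

# apply() works on that character matrix, so mean() gives NA for every column
# (warnings are suppressed here only to keep the output quiet).
col_means <- suppressWarnings(apply(small, 2, mean))

# Keeping only the numeric columns leaves a numeric matrix, so apply() works.
num_only <- small[sapply(small, is.numeric)]
rank_mean <- apply(num_only, 2, mean)
```

Note that filtering this way silently drops the Date column, so it only helps when the dates are not needed in the result.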

Workaround:

lapply(test, mean)
sapply(test, mean)

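Both work, because a data frame is a list of columns and each column is
passed to mean() with its class intact.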
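
One difference between the two is worth knowing about. The sketch below uses a small stand-in data frame; the names small, by_list, by_vec and mean_date are illustrative only. lapply() returns a list, so the mean of the Date column is still a Date. sapply() simplifies to a plain numeric vector, so the same mean comes back as days since 1970-01-01 and has to be converted if you want a date again.

```r
# Hypothetical miniature of the 'test' data frame from the original post.
small <- data.frame(
  Date = as.Date(c("2006-08-01", "2006-08-02", "2006-08-03")),
  RANK = c(19L, 7L, 5L)
)

# lapply() keeps each result as-is: the Date mean is still of class "Date".
by_list <- lapply(small, mean)
class(by_list$Date)

# sapply() simplifies to a numeric vector, dropping the Date class.
by_vec <- sapply(small, mean)

# Convert the numeric day count back to a Date when needed.
mean_date <- as.Date(by_vec[["Date"]], origin = "1970-01-01")
```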

HTH,

G