I have an application where I have a function to calculate results for
a 2-way table or matrix, which
returns a matrix with one less row and column. To keep this short, the
function below captures the structure:
fun2way <- function(f){
if (!length(dim(f)) ==2) stop("only for 2-way arrays")
R <- dim(f)[1]
C <- dim(f)[2]
f[1:(R-1), 1:(C-1)]
}
Now, I want to extend this to higher-way arrays, using apply-like
methods over the strata (all but the first two dimensions),
and returning an array in which the last dimensions correspond to
strata. That is, I want to define something like the
following using an a*ply method, but aaply gives a result in which the
applied .margin(s) do not appear last in the
result, contrary to the documentation for ?aaply. I think this is a
bug, either in the function or the documentation,
but perhaps there's something I misunderstand for this case.
fun <- function(f, stratum=NULL) {
L <- length(dim(f))
if (L > 2 && is.null(stratum))
stratum <- 3:L
if (is.null(stratum)) {
result <- fun2way(f)
}
else {
require(plyr)
result <- aaply(f, stratum, fun2way) ## order of dimensions
screwed up!
}
result
}
For example, by hand (or with a loop) I can calculate the
pieces and combine them as I want using abind():
> # apply separately to strata
> t1<-fun2way(HairEyeColor[,,1])
> t2<-fun2way(HairEyeColor[,,2])
>
> library(abind)
> abind(t1, t2, along=3)
, , 1
Brown Blue Hazel
Black 32 11 10
Brown 53 50 25
Red 10 10 7
, , 2
Brown Blue Hazel
Black 36 9 5
Brown 66 34 29
Red 16 7 7
alply() gives me what I want, but with the strata as list elements,
rather than an array
> library(plyr)
> # strata define separate list elements
> alply(HairEyeColor, 3, fun2way)
$`1`
Eye
Hair Brown Blue Hazel
Black 32 11 10
Brown 53 50 25
Red 10 10 7
$`2`
Eye
Hair Brown Blue Hazel
Black 36 9 5
Brown 66 34 29
Red 16 7 7
attr(,"split_type")
[1] "array"
attr(,"split_labels")
Sex
1 Male
2 Female
>
However, with aaply(), dim[3] ends up as first dimension, not last
> # dim[3] ends up as first dimension, not last
> aaply(HairEyeColor, 3, fun2way)
, , Eye = Brown
Hair
Sex Black Brown Red
Female 36 66 16
Male 32 53 10
, , Eye = Blue
Hair
Sex Black Brown Red
Female 9 34 7
Male 11 50 10
, , Eye = Hazel
Hair
Sex Black Brown Red
Female 5 29 7
Male 10 25 7
> str(aaply(as.array(HairEyeColor), 3, fun2way))
num [1:2, 1:3, 1:3] 36 32 66 53 16 10 9 11 34 50 ...
- attr(*, "dimnames")=List of 3
..$ Sex : chr [1:2] "Female" "Male"
..$ Hair: chr [1:3] "Black" "Brown" "Red"
..$ Eye : chr [1:3] "Brown" "Blue" "Hazel"
>
> ## aaply should return this....
> aperm(aaply(HairEyeColor, 3, fun2way), c(2,3,1))
, , Sex = Female
Eye
Hair Brown Blue Hazel
Black 36 9 5
Brown 66 34 29
Red 16 7 7
, , Sex = Male
Eye
Hair Brown Blue Hazel
Black 32 11 10
Brown 53 50 25
Red 10 10 7
>
On the other hand, aaply() does work as I expect, with an array of size
2 x C x strata
> library(vcd)
> fun2way(Employment[,,1])
<1Mo 1-3Mo 3-12Mo 1-2Yr 2-5Yr
8 35 70 62 56
> fun2way(Employment[,,2])
<1Mo 1-3Mo 3-12Mo 1-2Yr 2-5Yr
40 85 181 85 118
>
> aaply(Employment, 3, fun2way)
LayoffCause <1Mo 1-3Mo 3-12Mo 1-2Yr 2-5Yr
Closure 8 35 70 62 56
Replaced 40 85 181 85 118
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA
?That is, I want to define something like the
following using an a*ply method, but aaply gives a result in which the
applied .margin(s) do not appear last in the
result, contrary to the documentation for ?aaply. ?I think this is a bug,
either in the function or the documentation,
but perhaps there's something I misunderstand for this case.
Maybe the documentation isn't clear but I think this is behaving as expected:
* the margin you split on comes first in the output
* followed by the dimensions created by the applied function.
Hadley
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
That is, I want to define something like the
following using an a*ply method, but aaply gives a result in which the
applied .margin(s) do not appear last in the
result, contrary to the documentation for ?aaply. I think this is a bug,
either in the function or the documentation,
but perhaps there's something I misunderstand for this case.
Maybe the documentation isn't clear but I think this is behaving as expected:
* the margin you split on comes first in the output
* followed by the dimensions created by the applied function.
Hadley
OK, this is clear (and more explicit than the .Rd doc description of
Value).
What is still not clear is why alply() returns the split levels in their
original order, while aaply() returns them in sorted order when the
level names are character. My application is for
contingency tables, where, e.g., a dimension may have levels
c("Lo", "Med", "Hi") and I need to preserve this order in the result.
> alply(HairEyeColor, 3, fun2way) # gives c("Male", "Female")
$`1`
Eye
Hair Brown Blue Hazel
Black 32 11 10
Brown 53 50 25
Red 10 10 7
$`2`
Eye
Hair Brown Blue Hazel
Black 36 9 5
Brown 66 34 29
Red 16 7 7
attr(,"split_type")
[1] "array"
attr(,"split_labels")
Sex
1 Male
2 Female
>
> aaply(HairEyeColor, 3, fun2way) # gives c("Female", "Male")
, , Eye = Brown
Hair
Sex Black Brown Red
Female 36 66 16
Male 32 53 10
, , Eye = Blue
Hair
Sex Black Brown Red
Female 9 34 7
Male 11 50 10
, , Eye = Hazel
Hair
Sex Black Brown Red
Female 5 29 7
Male 10 25 7
>
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street Web: http://www.datavis.ca
Toronto, ONT M3J 1P3 CANADA