
summary statistics for lists of matrices or dataframes

4 messages · David Kane, Dimitris Rizopoulos, Patrick Burns

#
Is there a simple way to calculate summary statistics for all the
matrices or dataframes in a list? For example:
[[1]]
     [,1] [,2]
[1,]    2    2
[2,]    2    2

[[2]]
     [,1] [,2]
[1,]    4    4
[2,]    4    4
I would like to calculate, for example, the mean value for each
cell. I can do that the hard way as:
     [,1] [,2]
[1,]    3    3
[2,]    3    3
But there must be an easier way. I am also interested in other
statistics (like median and sd). Since all my matrices have the same
attributes (especially row and column names), I would like to preserve
those in the answer.

Thanks,

Dave Kane

In case it matters:
_                
platform i686-pc-linux-gnu
arch     i686             
os       linux-gnu        
system   i686, linux-gnu  
status                    
major    2                
minor    1.0              
year     2005             
month    04               
day      18               
language R
#
Hi Dave,

maybe you can find these functions useful:


matSums <- function(lis){
    out <- array(data=0., dim=dim(lis[[1]]))
    for(i in seq(along=lis)) out <- out + lis[[i]]
    out
}
##
matMeans <- function(lis) matSums(lis) / length(lis)
##
matFun <- function(lis, FUN, ...){
    if(!is.list(lis) || !all(sapply(lis, is.matrix)))
        stop("'lis' must be a list containing 2-dimensional arrays")
    dims <- sapply(lis, dim)
    n <- dims[1, 1]
    p <- dims[2, 1]
    if(!all(n==dims[1,]) || !all(p==dims[2,]))
        stop("the matrices must have the same dimensions")
    out <- apply(matrix(unlist(lis), n * p, length(lis)), 1, FUN, ...)
    dim(out) <- c(n, p)
    out
}
# The first two are faster than "matFun(lis, sum)" or "matFun(lis, mean)"
# for large and many matrices
#############

lis <- list(matrix(c(2,2,2,2), ncol = 2),
            matrix(c(4,4,4,4), ncol = 2),
            matrix(c(5,5,5,5), ncol = 2))

matSums(lis)
matFun(lis, sum)

matMeans(lis)
matFun(lis, mean)

matFun(lis, median)
matFun(lis, sd)
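
As a hedged aside that is not part of the functions above: assuming a version of R that provides Reduce() (the 2005 release listed in the question may predate it), the elementwise sum and mean can be sketched in one line each. The names listSum, listMean, nm and lis2 are hypothetical. Arithmetic on matrices keeps their dimnames, so row and column names should carry through:

```r
# Hypothetical helpers, not from the original post.
# Reduce() folds "+" over the list, adding the matrices cell by cell.
listSum  <- function(lis) Reduce(`+`, lis)
listMean <- function(lis) listSum(lis) / length(lis)

# Two small named matrices, mirroring the example in the question.
nm   <- list(c("r1", "r2"), c("c1", "c2"))
lis2 <- list(matrix(2, 2, 2, dimnames = nm),
             matrix(4, 2, 2, dimnames = nm))

res <- listMean(lis2)
print(res)
```

This only covers statistics that can be built from sums; median or sd still need a per-cell approach such as matFun.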


Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat/
     http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm




----- Original Message ----- 
From: "David Kane" <dave at kanecap.com>
To: <r-help at stat.math.ethz.ch>
Sent: Tuesday, May 10, 2005 4:03 PM
Subject: [R] summary statistics for lists of matrices or dataframes
#
You could use 'do.call' with 'bind.array' (from S Poetry) or 'abind'
to convert your list of matrices into a three-dimensional array.
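
A minimal base-R sketch of the three-dimensional-array idea, assuming all matrices share the same dimensions. It uses array() and unlist() in place of 'abind', and the helper name cellStat is hypothetical. The dimnames of the first matrix are copied back onto the result, on the assumption that every matrix in the list carries the same names:

```r
# Hypothetical helper, not from the original thread.
# Stacks the matrices along a third dimension, then applies FUN to each
# (row, column) cell across that dimension. FUN should return one value.
cellStat <- function(lis, FUN, ...) {
    stopifnot(is.list(lis), length(lis) > 0, all(sapply(lis, is.matrix)))
    d <- dim(lis[[1]])
    same <- vapply(lis, function(m) identical(dim(m), d), logical(1))
    if (!all(same)) stop("the matrices must have the same dimensions")
    arr <- array(unlist(lis), dim = c(d, length(lis)))
    out <- apply(arr, c(1, 2), FUN, ...)
    # Restore the row and column names taken from the first matrix.
    dimnames(out) <- dimnames(lis[[1]])
    out
}

nm <- list(c("r1", "r2"), c("c1", "c2"))
ms <- list(matrix(2, 2, 2, dimnames = nm),
           matrix(4, 2, 2, dimnames = nm),
           matrix(5, 2, 2, dimnames = nm))

cellStat(ms, median)
cellStat(ms, sd)
```

Data frames would first need converting, for example with as.matrix(), before being passed to a helper like this.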

Patrick Burns

Burns Statistics
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
#
No, you can't, because 'bind.array' doesn't take an arbitrary number
of arguments.  Robert's solution does what I had in mind.