number of pairwise present data in matrix with missings

6 messages · Andreas Wolf, Gabor Grothendieck, Brian Ripley +2 more

#
is there a smart way of determining the number of pairwise present data
in a data matrix with missings (maybe as a by-product of some
statistical function?)

so far, i used several loops like:

# assumes pairs was initialised beforehand, e.g. to a 100 x 100 matrix of zeros
for (column1 in 1:99) {
  for (column2 in 2:100) {
    for (row in 1:500) {
      if (!is.na(matrix[row,column1]) && !is.na(matrix[row,column2])) {
        pairs[column1,column2] <- pairs[column1,column2]+1
      }
    }
  }
}

but this seems neither the most elegant nor an utterly fast solution.

thanks for suggestions.
andreas wolf
#
This is just matrix multiplication of the !is.na(x) matrix with its own
transpose, which crossprod computes directly:

R> x <- matrix(1:12,4,3)
R> x[c(1,10)] <- NA
R> x
     [,1] [,2] [,3]
[1,]   NA    5    9
[2,]    2    6   NA
[3,]    3    7   11
[4,]    4    8   12
R> crossprod(!is.na(x))
     [,1] [,2] [,3]
[1,]    3    3    2
[2,]    3    4    3
[3,]    2    3    3
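
A quick check of why this works (a sketch added for illustration, not part of the original reply): entry [i,j] of crossprod(M) is the sum over rows of M[,i] * M[,j], and with M = !is.na(x) that product is 1 exactly when both values are present in a row. The helper name count_pairs_loop is made up here.

```r
# Toy matrix from the example above
x <- matrix(1:12, 4, 3)
x[c(1, 10)] <- NA

# Hypothetical brute-force helper: for each pair of columns,
# count the rows in which both entries are non-missing
count_pairs_loop <- function(m) {
  p <- ncol(m)
  out <- matrix(0, p, p)
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      out[i, j] <- sum(!is.na(m[, i]) & !is.na(m[, j]))
    }
  }
  out
}

fast <- crossprod(!is.na(x))   # t(M) %*% M on the indicator matrix
slow <- count_pairs_loop(x)

# Both approaches should give the same counts
stopifnot(all(fast == slow))
```

The diagonal of the result holds the number of non-missing values in each column, which is a handy sanity check.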
#
Suppose your matrix is called A (`matrix' is not a good name). Then 
crossprod(!is.na(A)) is pretty efficient.  Test:

> A
     [,1] [,2] [,3]
[1,]   NA    1    1
[2,]    1   NA    1
[3,]   NA    1    1
[4,]    1    1    1
[5,]    1    1    1
[6,]    1    1    1
> crossprod(!is.na(A))
     [,1] [,2] [,3]
[1,]    4    3    4
[2,]    3    5    5
[3,]    4    5    6

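
To get a feel for the efficiency claim on a matrix of the size in the original question (500 rows, 100 columns), something like the following could be tried. This is only a sketch: the random data, the seed, and the 10% missingness rate are assumptions made up for illustration, and timings will vary by machine.

```r
set.seed(1)                        # arbitrary seed, only for reproducibility
A <- matrix(rnorm(500 * 100), 500, 100)
A[sample(length(A), 5000)] <- NA   # set 10% of the cells to NA at random

present <- !is.na(A)
n_pairs <- crossprod(present)      # 100 x 100 matrix of pairwise counts

# Sanity checks: the diagonal equals the per-column counts of
# non-missing values, and the result is symmetric
stopifnot(all(diag(n_pairs) == colSums(present)))
stopifnot(isSymmetric(n_pairs))

# Time the one-line version
print(system.time(crossprod(!is.na(A))))
```

Dividing n_pairs by nrow(A) gives the fraction of rows in which both columns are present, if proportions are more useful than counts.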
On Tue, 23 Nov 2004, Andreas Wolf wrote:
#
Andreas Wolf wrote:
library(Hmisc)
n <- naclus(mydataframe)
plot(n)   # show pairwise missingness in a dendrogram
naplot(n) # show more details in multiple plots

Frank
#
Hi Andreas,

maybe something like this could do it:

mat <- sample(0:3, 20*2, TRUE); dim(mat) <- c(20,2)
mat[sample(1:20, 4),] <- NA
########
mat
sum(rowMeans(mat)==mat[,1], na.rm=TRUE)


I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat
     http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm


----- Original Message ----- 
From: "Andreas Wolf" <andreas.wolf at uni-jena.de>
To: <r-help at stat.math.ethz.ch>
Sent: Tuesday, November 23, 2004 2:42 PM
Subject: [R] number of pairwise present data in matrix with missings
#
Sorry, my first reply was not relevant; I misunderstood the question.

Dimitris