boot() with glm/gnm on a contingency table

On Wednesday, 12 September 2012 at 07:08 -0700, Tim Hesterberg wrote:
If anybody is interested, I finally took this approach and implemented
the function described above as shown below. The idea is to assign an
index to each observation, and to identify which cell an observation
comes from using the cumulative sum of the cell counts. Instead of going
over all indices and incrementing the corresponding cell count for each,
I decided to start from the original table, decrementing the count for
each missing index and incrementing it for each duplicate. There are
probably better implementations, but performance-wise it seems good
enough.

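# tab is a table object; indices are the resampled observation numbers,
# each between 1 and sum(tab)
f <- function(tab, indices) {
  # cs[k] is the index of the last observation falling in cell k
  cs <- cumsum(tab)

  # Remove missing observations
  for(i in setdiff(seq_len(sum(tab)), indices)) {
      index <- min(which(i <= cs))
      tab[index] <- tab[index] - 1
  }

  # Add duplicate observations
  for(i in indices[duplicated(indices)]) {
      index <- min(which(i <= cs))
      tab[index] <- tab[index] + 1
  }

  # Return the resampled table (a for loop by itself evaluates to NULL)
  tab
}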
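
For comparison, the same resampled table can probably be built without
explicit loops. The sketch below is not from the original post: the name
f_vec is hypothetical, and it assumes, like f above, that indices are
integers between 1 and sum(tab) which number the observations cell by
cell. It expands the table into one cell label per observation, picks
the resampled labels, and counts them again.

```r
# Hypothetical vectorized variant of f(); f_vec is an illustrative name
f_vec <- function(tab, indices) {
  # One entry per observation, giving the position of its cell in tab
  cell <- rep(seq_along(tab), times = as.vector(tab))

  # Count how many resampled observations fall in each cell;
  # tab[] <- ... keeps the dimensions and dimnames of the table
  tab[] <- tabulate(cell[indices], nbins = length(tab))
  tab
}

# Small illustration with a made-up one-way table
tab <- table(c("a", "a", "b", "b", "b"))
set.seed(1)
idx <- sample(sum(tab), replace = TRUE)
res <- f_vec(tab, idx)
```

Whatever the indices drawn, the total count stays at sum(tab), and
passing seq_len(sum(tab)) gives back the original counts.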


Thanks for the pointers!