boot() with glm/gnm on a contingency table
On Wednesday 12 September 2012 at 07:08 -0700, Tim Hesterberg wrote:
One approach is to bootstrap the vector 1:n, where n is the number
of individuals, with a function that does:
f <- function(vectorOfIndices, theTable) {
    # (1) Create a new table with the same dimensions, but with the counts
    #     in the table based on vectorOfIndices.
    # (2) Calculate the statistics of interest on the new table.
}
When f is called with 1:n, the table it creates should be the same
as the original table. When called with a bootstrap sample of
values from 1:n, it should create a table corresponding to the
bootstrap sample.
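A minimal sketch of step (1) as described above, assuming the table holds non-negative integer counts. The helper name table_from_indices and the example table are made up for illustration; they are not from the original message. Observation i is mapped to its cell by expanding the cell numbers according to the counts, and the sampled cells are then counted again.

```r
# Hypothetical helper: rebuild a table from a vector of observation indices.
table_from_indices <- function(vectorOfIndices, theTable) {
  # cell_of[i] is the cell (in column-major order) holding observation i
  cell_of <- rep(seq_along(theTable), times = as.vector(theTable))
  newTable <- theTable
  # Count how many sampled observations fall in each cell; `[]` keeps
  # the dimensions and dimnames of the original table
  newTable[] <- tabulate(cell_of[vectorOfIndices], nbins = length(theTable))
  newTable
}

# Made-up 2 x 3 table, for illustration only
tab <- as.table(matrix(c(3, 1, 4, 1, 5, 9), nrow = 2,
                       dimnames = list(A = c("a1", "a2"),
                                       B = c("b1", "b2", "b3"))))
n <- sum(tab)

# Called with 1:n, the helper reproduces the original counts
same <- table_from_indices(seq_len(n), tab)

# Called with a bootstrap sample of 1:n, it gives the resampled table
set.seed(1)
resampled <- table_from_indices(sample.int(n, replace = TRUE), tab)
```

Step (2) would then be whatever statistic is computed on the returned table.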
If anybody is interested, I ended up taking this route; the function
described above is implemented below. The idea is to assign an index to
each observation, and to identify which cell the observation comes from
using the cumulative sum of the counts. Instead of going over all indices
and incrementing the corresponding cell count for each, I decided to
start with the original data, decrementing the counts for indices missing
from the sample, and incrementing them for duplicates. There are probably
better implementations, but performance-wise it seems good enough.
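# tab is a table object; indices is a vector of observation numbers
# drawn from 1:sum(tab)
f <- function(tab, indices) {
    # Observation i belongs to the first cell whose cumulative count is >= i
    cs <- cumsum(tab)

    # Remove missing observations
    for(i in setdiff(seq_len(sum(tab)), indices)) {
        index <- min(which(i <= cs))
        tab[index] <- tab[index] - 1
    }

    # Add duplicate observations
    for(i in indices[duplicated(indices)]) {
        index <- min(which(i <= cs))
        tab[index] <- tab[index] + 1
    }

    # Return the resampled table (a for loop on its own evaluates to NULL)
    tab
}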
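For completeness, here is a hedged sketch of how a resampler with this (tab, indices) signature might be wired into boot(). As far as I know, boot() takes the number of observations from NROW(data), so passing the table itself as data would resample rows of the table rather than individuals; the sketch passes the vector 1:n instead, as suggested in the quoted message. The names resample_tab and indep_deviance, the example table, and the choice of statistic (the deviance of the independence model fitted with glm()) are assumptions made for illustration only.

```r
library(boot)  # recommended package, normally installed alongside R

# Made-up 2 x 2 table, for illustration only
tab <- as.table(matrix(c(20, 15, 10, 30), nrow = 2,
                       dimnames = list(A = c("a1", "a2"),
                                       B = c("b1", "b2"))))
n <- sum(tab)

# Hypothetical stand-in with the same (tab, indices) signature as f
resample_tab <- function(tab, indices) {
  cell_of <- rep(seq_along(tab), times = as.vector(tab))
  tab[] <- tabulate(cell_of[indices], nbins = length(tab))
  tab
}

# Illustrative statistic: deviance of the independence log-linear model
indep_deviance <- function(tab) {
  deviance(glm(Freq ~ A + B, family = poisson, data = as.data.frame(tab)))
}

# boot() calls the statistic with the data (here 1:n) and a vector of
# positions into it; obs[i] is therefore the bootstrap sample of indices
statistic <- function(obs, i) indep_deviance(resample_tab(tab, obs[i]))

set.seed(42)
res <- boot(data = seq_len(n), statistic = statistic, R = 200)
```

res$t0 should then hold the statistic for the original table and res$t the bootstrap replicates; the function f can be swapped in for resample_tab once it returns the table.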
Thanks for the pointers!