Multistage Sampling

Thomas Lumley · 2006-07-07T21:34:36Z

On Fri, 7 Jul 2006, Mark Hempelmann wrote:

> library(survey)
> multi3 1,2,3, 1,2),
> nl=c(4,4,4,4, 3,3,3, 2,2), Nl=c(100,100,100,100, 50,50,50, 75,75),
> M=rep(23,9),
> y=c(23,33,77,25, 35,74,27, 37,72) )
>
> dmulti3 svymean (~y, dmulti3)
>   mean     SE
> y 45.796 5.5483
>
> svytotal(~y, dmulti3)
>   total    SE
> y 78999 13643
>
> and I estimate the population total as N=M/m su

I don't have any of my reference books here today, but if you use
   var.yT <- 23^2*( 20/23*1/6*sum(
    (yT1-ybarT)^2,(yT2-ybarT)^2,(yT3-ybarT)^2 ) +
    1/69 * sum(100*96*s1/4, 50*47*s2/3, 75*73*s3/2) ) # 242 101 517
the result agrees with svytotal(), with Stata, and with formulas in a 
couple of sets of lecture notes I found by Googling.
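
For reference, the expression above appears to be an instance of the usual two-stage variance estimator for a total, written out here as a sketch. The symbols are my own notation rather than the original poster's: M = 23 clusters in the population, m = 3 sampled clusters, N_i and n_i the population and sample sizes within sampled cluster i, and s_i^2 the within-cluster sample variance. I am assuming that yT1, yT2, yT3 are the estimated cluster totals and ybarT their mean.

```latex
\widehat{V}(\hat T)
  = M^2 \left(1 - \frac{m}{M}\right) \frac{s_T^2}{m}
  + \frac{M}{m} \sum_{i=1}^{m} N_i \,(N_i - n_i)\, \frac{s_i^2}{n_i},
\qquad
s_T^2 = \frac{1}{m-1} \sum_{i=1}^{m} \bigl(\hat T_i - \bar T\bigr)^2 .
```

With M = 23 and m = 3 the first coefficient is 23^2 * (20/23) * 1/(3*2), and M/m = 23/3 = 23^2/69, which matches the constants in the code.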

This calculation is not correct for the mean, since it ignores the 
uncertainty in the estimated population total.  The correct standard error 
comes from treating the mean as a ratio of estimated total to estimated 
population size. In this case you have to do it that way since you don't 
know the population size, but R always does it this way. Because the 
estimated population size and total are correlated, taking into account 
the uncertainty in the denominator actually reduces the standard error.
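
A sketch of why the denominator matters, using the standard delta-method approximation for a ratio (the notation is mine: T-hat is the estimated total and N-hat the estimated population size):

```latex
\bar y = \frac{\hat T}{\hat N},
\qquad
\widehat{V}(\bar y) \approx \frac{1}{\hat N^{2}}
  \Bigl[ \widehat{V}(\hat T)
       - 2\,\bar y\,\widehat{\mathrm{Cov}}(\hat T, \hat N)
       + \bar y^{2}\,\widehat{V}(\hat N) \Bigr].
```

Treating the population size as known keeps only the first term. When the covariance is positive and large enough that the middle term outweighs the last one, the full expression is smaller than the first term alone.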

The easiest way to reproduce the result that R gets is to do it the same 
way that R does: compute the standard error of the mean as the standard 
error of the total of a suitable set of estimating functions. If you 
define a new variable (y-45.796*1)/1725 and estimate the standard error of 
the total of this variable it will give:

I((y - 45.796)/1725) 0.0002963 5.5482

which is what svymean() gives for the standard error of the mean of y. 
Using your formula for the variance of the total (with the corrections 
above) on this variable also gives

[1] 5.54824
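
The hand calculation can be sketched in base R without the survey package. This is only an illustration under stated assumptions: the grouping of the nine observations into three sampled clusters is inferred from the nl values in the quoted post, and the names two_stage_se, cl, Ni, total_hat, N_hat and mean_hat are hypothetical helpers, not anything from the thread or from the package.

```r
# Values quoted in the original post.
y  <- c(23, 33, 77, 25,  35, 74, 27,  37, 72)
# Assumption: cluster membership inferred from nl = 4,4,4,4, 3,3,3, 2,2.
cl <- rep(1:3, times = c(4, 3, 2))
Ni <- c(100, 50, 75)  # population size of each sampled cluster (Nl)
M  <- 23              # clusters in the population
m  <- 3               # clusters in the sample

# Two-stage standard error of an estimated total (hypothetical helper).
two_stage_se <- function(v) {
  groups <- split(v, cl)
  ni <- vapply(groups, length, integer(1))
  Ti <- Ni * vapply(groups, mean, numeric(1))  # estimated cluster totals
  s2 <- vapply(groups, var, numeric(1))        # within-cluster variances
  between <- M^2 * (1 - m / M) * var(Ti) / m
  within  <- (M / m) * sum(Ni * (Ni - ni) * s2 / ni)
  sqrt(between + within)
}

total_hat <- (M / m) * sum(Ni * vapply(split(y, cl), mean, numeric(1)))
N_hat     <- (M / m) * sum(Ni)
mean_hat  <- total_hat / N_hat

# Standard error of the total.
se_total <- two_stage_se(y)
# Standard error of the mean: apply the same formula to the linearized variable.
se_mean  <- two_stage_se((y - mean_hat) / N_hat)

print(c(total = total_hat, se_total = se_total))
print(c(mean = mean_hat, se_mean = se_mean))
```

If the cluster grouping assumed here is right, the two standard errors should come out close to the 13643 and 5.548 reported in the thread.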


 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
