
Survey and Stratification

On Thu, 26 May 2005, Mark Hempelmann wrote:

            
Ok.  Assuming that Nh are the population sizes in each stratum, you have 
5/15 sampled in stratum 1 and 3/12 in stratum 2.

This can be specified in a number of ways. You can use
   sampling weights of 15/5 and 12/3, or
   sampling probabilities of 5/15 and 3/12,
with or without specifying the finite population correction. The 
finite population correction can be given as population sizes (15 and 12) 
or as sampling fractions (5/15 and 3/12). If the finite population 
correction is specified, the weights are optional.
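
As a quick check of the arithmetic above, here is a minimal base R sketch. The 
vector names Nh, nh, prob and weight are just illustrative; only the numbers 
come from the example.

```r
# Population and sample sizes per stratum, as given above
Nh <- c(15, 12)
nh <- c(5, 3)

prob   <- nh / Nh   # sampling probabilities: 5/15 and 3/12
weight <- Nh / nh   # sampling weights: 15/5 = 3 and 12/3 = 4

# Weights are the reciprocals of the sampling probabilities
stopifnot(isTRUE(all.equal(weight, 1 / prob)))

# The weights of the sampled units add up to the population size
sum(nh * weight)    # 27
```

Either form carries the same information, which is why the weights become 
optional once the finite population correction is supplied.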

So
   d1<-svydesign(ids=~id, strata=~stratum, weight=~I(Nh/nh), data=age)
   d2<-svydesign(ids=~id, strata=~stratum, prob=~I(nh/Nh), data=age)
give the with-replacement design (agreeing with your age.des3) and
   d3<-svydesign(ids=~id, strata=~stratum, weight=~I(Nh/nh), fpc=~Nh,data=age)
   d4<-svydesign(ids=~id, strata=~stratum, prob=~I(nh/Nh), fpc=~Nh,data=age)
   d5<-svydesign(ids=~id, strata=~stratum, weight=~I(Nh/nh), fpc=~I(nh/Nh),data=age)
   d6<-svydesign(ids=~id, strata=~stratum, prob=~I(nh/Nh), fpc=~I(nh/Nh),data=age)
   d7<-svydesign(ids=~id, strata=~stratum, fpc=~Nh,data=age)
   d8<-svydesign(ids=~id, strata=~stratum, fpc=~I(nh/Nh),data=age)
all give the without-replacement design. We get
          mean     SE
   d1 y 26.296 0.9862
   d2 y 26.296 0.9862
   d3 y 26.296 0.8364
   d4 y 26.296 0.8364
   d5 y 26.296 0.8364
   d6 y 26.296 0.8364
   d7 y 26.296 0.8364
   d8 y 26.296 0.8364

Now, looking at your examples:

This is wrong: the sampling weight is Nh/nh, not Nh.

This is wrong: the sampling weight is Nh/nh. You need prob=~I(nh/Nh) to 
specify sampling fractions.

This is correct and agrees with d1 and d2.

This is a stratified, unweighted mean, i.e. mean(age$y).

No, it does not.  A weight of 3 is not the same as a weight of 1/3.  With 
the finite population correction it is safe to assume that numbers less 
than 1 are sampling fractions and numbers greater than 1 are population 
sizes, but this isn't safe when it comes to weights.  It is possible that 
someone could want to use sampling weights less than 1.
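
The rule of thumb for the finite population correction can be sketched as a 
small function. This is only an illustration of the idea: fpc_to_fraction is a 
hypothetical helper, not the survey package's actual code.

```r
# Hypothetical helper (NOT part of the survey package): read an fpc value
# below 1 as a sampling fraction, otherwise as a population size.
fpc_to_fraction <- function(fpc, n) {
  ifelse(fpc < 1, fpc, n / fpc)
}

nh <- c(5, 3)

# Both ways of giving the fpc lead to the same sampling fractions
f_from_size <- fpc_to_fraction(c(15, 12), nh)       # 1/3 and 1/4
f_from_frac <- fpc_to_fraction(c(5/15, 3/12), nh)   # 1/3 and 1/4
stopifnot(isTRUE(all.equal(f_from_size, f_from_frac)))
```

No such rule is possible for weights: 3 and 1/3 are both legitimate weights 
with different meanings, so the number alone cannot say which was intended.
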

Since this gives a mean of 7.01 for numbers around 25, it can't be right. 
You have divided by sample size twice. You should have
   y1.total<-3*118
   y2.total<-4*89
You will then get (y1.total+y2.total)/27 to be 26.29630, in agreement 
with svymean().
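
For comparison, a small base R sketch of both calculations, reading 118 and 89 
as the sample sums of y in each stratum. The "wrong" version is an assumption: 
the original code is not shown here, but dividing each stratum sum by its 
sample size once more is one calculation that gives a value near 7.01.

```r
Nh   <- c(15, 12)    # population sizes
nh   <- c(5, 3)      # sample sizes
ysum <- c(118, 89)   # sample sums of y per stratum (from the totals above)
w    <- Nh / nh      # sampling weights 3 and 4

# Correct: weight the sample sums, then divide by the population size
right <- sum(w * ysum) / sum(Nh)       # 710/27, about 26.2963

# Assumed form of the error: an extra division by the sample size
wrong <- sum(w * ysum / nh) / sum(Nh)  # about 7.017
```

Equivalently, the correct value is the population-weighted average of the 
stratum means, sum(Nh * ysum/nh) / sum(Nh).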


 	-thomas