longitudinal survey data - R-help

Thu, May 26, 2005 11:20 AM #

Dear R-Users!

Is it possible in R to analyze longitudinal survey data (repeated
measures in a survey)? I know that for longitudinal data I can use lme() to
incorporate the correlation structure within individuals, and I know that there is
the survey package for analyzing survey data. How can I combine the two? I am
trying to calculate design-based estimates. However, if I use svyglm() from the
survey package, I would be ignoring the correlation structure of the repeated measures.

Thanks!

Dassy

Thomas Lumley

Thu, May 26, 2005 1:11 PM #

On Thu, 26 May 2005 h.brunschwig at utoronto.ca wrote:

You *can* fit regression models to these data with svyglm(). Remember that 
from a design-based point of view there is no such thing as a correlation 
structure of repeated measures -- only the sampling is random, not the 
population data.
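
A minimal sketch of the svyglm() approach described above, assuming long-format data with one row per person per wave. The data are simulated, and the column names (id, wave, w, y) are hypothetical. The person is declared as the sampling unit, so the standard errors account for the repeated rows from the same person without any explicit correlation model.

```r
library(survey)

set.seed(1)
n_people <- 200
n_waves  <- 4

# Hypothetical long-format data: one row per person per wave
long <- data.frame(
  id   = rep(seq_len(n_people), each = n_waves),
  wave = rep(seq_len(n_waves), times = n_people),
  w    = rep(runif(n_people, 50, 150), each = n_waves)  # person-level weight
)
person_effect <- rep(rnorm(n_people), each = n_waves)
long$y <- 2 + 0.5 * long$wave + person_effect + rnorm(nrow(long))

# The person is the sampling unit (PSU); all of a person's rows share one weight
des <- svydesign(ids = ~id, weights = ~w, data = long)

# Design-based regression of the outcome on wave
fit <- svyglm(y ~ wave, design = des)
summary(fit)
```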


If you *want* to fit mixed models (e.g. because you are interested in 
estimating variance components, or perhaps to gain efficiency) then it's 
quite a bit trickier. You can't just use the sampling weights in lme(). 
You can correct for the biased sampling if you put the variables that 
affect the weights in as predictors in the model.  Cluster sampling could 
perhaps then be modelled as another level of random effect.


 	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle

Koen Pelleriaux

Fri, May 27, 2005 6:48 AM #

On 5/26/05, Thomas Lumley <tlumley at u.washington.edu> wrote:

I've been struggling with case weights (in the case of unequal
selection probabilities) in mixed effects models. Case weights are not
supported in lme(). Is it possible, however, to use them in
glmmPQL() from MASS?

Koen Pelleriaux
Sociologist
University of Antwerp

Hadassa Brunschwig

Fri, May 27, 2005 7:06 AM #

Thank you for your reply.

Does that mean that, in order to take the repeated measures into account, I
specify them as another level of clustering in R?

Dassy



Thomas Lumley

Fri, May 27, 2005 9:31 AM #

On Fri, 27 May 2005 h.brunschwig at utoronto.ca wrote:

Yes, but unless you have multistage finite population corrections to put 
in the design object only the first stage of clustering affects the 
results, so you may not need to bother.
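
A small illustration of that point with simulated data, assuming a hypothetical two-stage sample (clusters called psu, then people called id) and no finite population corrections. The two design objects differ only in whether the person is declared as a second stage, so their standard errors can be compared directly.

```r
library(survey)

set.seed(3)
n_psu <- 30
people_per_psu <- 10
n_waves <- 4
n_people <- n_psu * people_per_psu

# Hypothetical data: clusters, people within clusters, waves within people
long <- data.frame(
  psu  = rep(seq_len(n_psu), each = people_per_psu * n_waves),
  id   = rep(seq_len(n_people), each = n_waves),
  wave = rep(seq_len(n_waves), times = n_people),
  w    = 100
)
long$y <- rep(rnorm(n_psu), each = people_per_psu * n_waves) +
  rep(rnorm(n_people), each = n_waves) + rnorm(nrow(long))

# Person declared as a second stage, or left out; no fpc in either design
des_one <- svydesign(ids = ~psu, weights = ~w, data = long)
des_two <- svydesign(ids = ~psu + id, weights = ~w, data = long)

SE(svymean(~y, des_one))
SE(svymean(~y, des_two))
```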

 	-thomas

Hadassa Brunschwig

Fri, May 27, 2005 12:26 PM #

Sorry, I am still confused. If I don't have fpc's ready in my dataset (should I
calculate them myself?), that means that R will use an individual's weight for each
of his or her repeated observations. But is that still correct? The "cluster" individual
is ignored and each observation of an individual has the same weight.

Thanks a lot.

Dassy


Thomas Lumley

Fri, May 27, 2005 4:35 PM #

On Fri, 27 May 2005 h.brunschwig at utoronto.ca wrote:

Well, it depends to some extent on what inferences you are making, but 
yes, you probably do want each observation to have the same weight.

Suppose you have 4 measurements on each person, and you are working with a 
simple random sample of 1000 people from a population of 1,000,000. If you 
had done these 4 measurements on the whole population you would have 
4,000,000 measurements, so the 4000 measurements you have are 1/1000 of 
the population.  This is the same weighting as if you had a single 
measurement per person, giving 1000 measurements in the sample and 
1,000,000 in the population.
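
The arithmetic in this example can be checked in a few lines of base R. The object names are made up for illustration.

```r
N_pop    <- 1e6   # population size
n_people <- 1000  # simple random sample of people
n_waves  <- 4     # measurements per person

# Each sampled person represents N_pop / n_people = 1000 people
person_weight <- N_pop / n_people

# One row per person, and one row per person per measurement
wide_weights <- rep(person_weight, n_people)
long_weights <- rep(wide_weights, each = n_waves)

sum(wide_weights)  # estimated number of people in the population
sum(long_weights)  # estimated number of measurements in the population

# The sampling fraction is 1/1000 either way
length(wide_weights) / sum(wide_weights)
length(long_weights) / sum(long_weights)
```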

If different individuals have different numbers of measurements then 
things get a bit trickier. It depends then on why there are different 
numbers of measurements. If they are the result of non-response you might 
want to rescale the weights at later time points to give the right 
population totals.  If they are part of the sampling design then the 
design will specify what to do with them.
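
One simple way to do such a rescaling is a ratio adjustment within each wave, sketched below with simulated attrition. This assumes that non-response within a wave is unrelated to the outcome, which is a strong assumption, and all names are hypothetical. A real survey may call for response-propensity or calibration adjustments instead.

```r
set.seed(5)
n_people <- 1000
n_waves  <- 4
N_pop    <- 1e6

long <- data.frame(
  id   = rep(seq_len(n_people), each = n_waves),
  wave = rep(seq_len(n_waves), times = n_people),
  w    = N_pop / n_people
)

# Hypothetical attrition: everyone responds at wave 1, fewer at later waves
p_respond <- c(1, 0.9, 0.8, 0.7)[long$wave]
observed  <- long[runif(nrow(long)) < p_respond, ]

# Sum of the original weights among respondents, by wave
wave_total <- tapply(observed$w, observed$wave, sum)

# Scale the weights so that each wave again sums to the population total
scale_factor   <- N_pop / as.numeric(wave_total[as.character(observed$wave)])
observed$w_adj <- observed$w * scale_factor

tapply(observed$w_adj, observed$wave, sum)
```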


 	-thomas