Dear R users,

I have data from a long-term study that has collected samples opportunistically over the past 10 years. Because of the opportunistic sampling, my data set is highly unbalanced: I have a single sample from 19 individuals, 2 samples from 4 individuals, and 3 samples from 2 individuals. I know that lmer can accommodate unbalanced data sets, but I am unsure whether my data set is too unbalanced.

I am testing whether social rank, reproductive status, and age affect my response variables. I also need to determine whether sample collection parameters, such as sample date and the time from anesthetizing the animal to collecting the sample, affect the response variables.

Here is what I see as potential options:

1) Use a mixed model with subject as a random intercept and sample date as a random slope to account for potential temporal autocorrelation within the repeat samples.

   lmer(y ~ 1 + x1 + x2 + x3 + (1 + date | subject))

2) Use a mixed model with subject as a random intercept only. Initial data exploration does not show any obvious temporal autocorrelation.

   lmer(y ~ 1 + x1 + x2 + x3 + (1 | subject))

3) Use a GEE and specify an autoregressive correlation structure. I think this would be a good option, but from what I have found in the literature, my sample size is too small for it.

4) Use the mean for each individual in a standard linear model. This option is not good because reproductive status changes between samples, so I could not include it as a predictor.

5) Use only a single sample from each individual in a standard linear model. This option is not good because it would further reduce my already limited sample size.

Please let me know which of the above options would be best, or whether you can suggest a better one. Any advice or literature references would be sincerely appreciated.

Thanks,
Andy
Andy Flies - Ph.D. Candidate
Zoology Department
Ecology, Evolutionary Biology, and Behavior (EEBB) Program
Michigan State University
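
P.S. In case it helps to make options 1 and 2 concrete, here is a minimal sketch of how I would set them up with lme4. The data frame below is NOT my real data: it is random placeholder numbers arranged to match the sampling structure described above, and the names dat, x1, x2, x3, date, and y are stand-ins assumed for illustration only.

```r
library(lme4)

set.seed(1)

# Placeholder data only: 19 individuals with 1 sample, 4 with 2, 2 with 3.
n_samples <- c(rep(1, 19), rep(2, 4), rep(3, 2))
dat <- data.frame(
  subject = factor(rep(seq_along(n_samples), times = n_samples))
)
n <- nrow(dat)  # 33 samples from 25 individuals

dat$x1   <- rnorm(n)                                        # stand-in for social rank
dat$x2   <- factor(sample(c("a", "b"), n, replace = TRUE))  # stand-in for reproductive status
dat$x3   <- rnorm(n)                                        # stand-in for age
dat$date <- runif(n, 0, 10)                                 # stand-in for sample date (years)
dat$y    <- rnorm(n)                                        # stand-in for a response variable

# How many individuals contribute 1, 2, or 3 samples
print(table(table(dat$subject)))

# Option 2: random intercept for subject only
m2 <- lmer(y ~ 1 + x1 + x2 + x3 + (1 | subject), data = dat)

# Option 1: random intercept plus random slope for date.
# With only 6 individuals sampled more than once, lme4 may refuse this
# model or return a singular fit, so the call is wrapped in try().
m1 <- try(
  lmer(y ~ 1 + x1 + x2 + x3 + (1 + date | subject), data = dat),
  silent = TRUE
)
if (inherits(m1, "try-error")) {
  message("Option 1 could not be fit on the placeholder data")
}
```

Because the placeholder values are pure noise, the estimates themselves mean nothing. The sketch is only meant to show the two model specifications side by side on a data set with the same imbalance as mine.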