seeking some advice on fixed vs random specification
David,

Have you considered that if TIME*REGION is crossed as fixed effects, you should also treat them as crossed (and not nested) if they are random effects? Thus:

    lmer(log(CONT) ~ WB_TYPE + PORT*len + (1|TIME) + (1|REGION/wb_id))

Are TIME and REGION considered to be two independent sources of random variation, which is what this model implies? If you want to model variation across time differently for each region, then perhaps (TIME | REGION/wb_id) may be more appropriate.

I would interpret (1|TIME/REGION), by analogy to (1 | BLOCK/PLOT), to mean that the REGION identified as "1" in TIME "A" is not in any way related to REGION "1" in TIME "B"; that is, the region identifier only has meaning within the context of the time identifier.

Peter Claussen
Gylling Data Management
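The difference between the crossed and nested specifications discussed above shows up in the number of grouping levels that lme4 constructs. The following is only a minimal sketch on simulated data: the data frame `d`, its dimensions, the response `y`, and all effect sizes are invented for illustration, and only the variable names TIME, REGION and wb_id are borrowed from the thread.

```r
library(lme4)

set.seed(1)
# Simulated stand-in data (an assumption, not the contaminant dataset):
# 3 time blocks x 6 regions x 5 sites per region x 4 fish per cell.
d <- expand.grid(TIME = c("A", "B", "C"),
                 REGION = factor(1:6),
                 site = 1:5,
                 rep = 1:4)
d$wb_id <- interaction(d$REGION, d$site, drop = TRUE)  # unique site id

# Invented random effects; indexing by a factor uses its integer codes.
time_eff   <- rnorm(nlevels(d$TIME))
region_eff <- rnorm(nlevels(d$REGION))
site_eff   <- rnorm(nlevels(d$wb_id), sd = 0.5)
d$y <- time_eff[d$TIME] + region_eff[d$REGION] + site_eff[d$wb_id] +
  rnorm(nrow(d), sd = 0.3)

# Crossed: TIME and REGION are separate sources of variation,
# with sites nested in regions.
crossed <- lmer(y ~ 1 + (1 | TIME) + (1 | REGION/wb_id), data = d)

# Nested: region labels only have meaning within a time block,
# and site labels only within a region-by-time cell.
nested <- lmer(y ~ 1 + (1 | TIME/REGION/wb_id), data = d)

# Number of levels per grouping factor in each fit.
print(ngrps(crossed))
print(ngrps(nested))
```

In the crossed fit the same 6 regions and 30 sites are shared across all time blocks, whereas the nested fit treats every region-by-time cell and every site within it as a new level. With only three time blocks, the TIME variance is estimated from very little information, so a singular-fit message on data like these would not be surprising.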
On Oct 31, 2011, at 9:27 AM, david depew wrote:
Dear list,

I am seeking some thoughts/advice on whether my approach to this problem (below) makes sense.

We have compiled a rather large dataset (n > 25,000 for most species of interest) on the levels of a contaminant in fish, covering 40 years and a continental scale. We would like to investigate broad temporal changes across a large geographic region. Because the data come from a variety of sources, with different resources and mandates for sampling fish, we do not consider this dataset to be a "true random sample", but in the absence of one, it is the best available approximation. Sites that are sampled over time are generally not sampled frequently enough, or with sufficient constraints (sample sizes, sizes of fish), to support a more focused analysis of temporal trends.

Having spent some time perusing the resources available on mixed models, I think they offer the best option for making some sense of this messy dataset. I'm less inclined to try to estimate site-specific slopes (regressed over year) for sites that have low sampling effort. Rather, I split the dataset into three time periods (A, B and C) of ~15 years each (note: the levels of this particular contaminant are known to change very slowly over time) and assigned each site to an ecoregion based on geographic location. Thus, I am aiming to assess (if possible) whether levels of contaminant in each ecoregion change over the time blocks (A, B and C), where sites are assumed to represent a random selection of possible locations within an ecoregion.

The variables of interest are as follows:

    CONT    = contaminant concentration
    WB_TYPE = waterbody type (lake, river)
    PORT    = portion (fillet, whole fish)
    len     = mean-centered length of fish
    REGION  = ecoregion (37 unique types)
    TIME    = time block (A, B or C)
    wb_id   = unique id of site

My initial thought was to specify the model with time and region as fixed effects:
    lmer(log(CONT) ~ TIME*REGION + WB_TYPE + PORT*len + (1|wb_id))

Comparing this model with one that has only additive time and region terms suggests that the interaction improves the model fit and is probably important. I can test TIME and REGION interaction contrasts specifically using the multcomp package, and the results do suggest that some regions have significant changes between time blocks.

Or would it make more sense to specify the time and region effects as part of the random terms, with site nested within region, nested within time period?

    lmer(log(CONT) ~ WB_TYPE + PORT*len + (1|TIME/REGION/wb_id))

I'm assuming (perhaps wrongly) that the conditional means and 95% CIs could then be extracted and compared to assess changes within a region.

I'm aware that arguments can be made for treating TIME and REGION as either fixed or random, depending on the objective of the analysis. I'm mainly seeking some clarification on whether a) my interpretation of the specified model is correct, and b) this makes sense with respect to the initial problem.

Any thoughts or advice would be much appreciated.

thanks

--
David Depew
Postdoctoral Fellow
School of Environmental Studies
Queen's University
Kingston, Ontario K7L 3N6
david.depew at queensu.ca
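The two steps described in the message above (comparing additive and interaction fixed-effects models, and pulling out conditional modes with approximate intervals) might be sketched as follows. Everything here is an assumption made for illustration: the data are simulated, the response `logCONT` and all effect sizes are invented, the covariates WB_TYPE, PORT and len are left out, and the random-effects model `rand` is a simplified stand-in rather than either specification from the thread.

```r
library(lme4)

set.seed(42)
# Simulated stand-in data: 3 time blocks, 6 regions, 5 sites per region.
d <- expand.grid(TIME = c("A", "B", "C"), REGION = factor(1:6),
                 site = 1:5, rep = 1:4)
d$wb_id <- interaction(d$REGION, d$site, drop = TRUE)
region_eff <- rnorm(nlevels(d$REGION))
site_eff   <- rnorm(nlevels(d$wb_id), sd = 0.5)
# Built-in interaction: only region 1 shifts in time block C.
shift <- ifelse(d$REGION == "1" & d$TIME == "C", 1, 0)
d$logCONT <- shift + region_eff[d$REGION] + site_eff[d$wb_id] +
  rnorm(nrow(d), sd = 0.3)

# Fixed-effects route: fit with ML so the likelihoods are comparable.
additive <- lmer(logCONT ~ TIME + REGION + (1 | wb_id),
                 data = d, REML = FALSE)
interact <- lmer(logCONT ~ TIME * REGION + (1 | wb_id),
                 data = d, REML = FALSE)
lrt <- anova(additive, interact)  # likelihood-ratio test of TIME:REGION
print(lrt)

# Random-effects route (hypothetical simplified model): conditional modes
# of the TIME-by-REGION cells with rough +/- 1.96 * SD intervals.
rand <- lmer(logCONT ~ 1 + (1 | REGION) + (1 | TIME:REGION) + (1 | wb_id),
             data = d)
cm <- as.data.frame(ranef(rand, condVar = TRUE))
cm$lower <- cm$condval - 1.96 * cm$condsd
cm$upper <- cm$condval + 1.96 * cm$condsd
print(head(cm[cm$grpvar == "TIME:REGION", ]))
```

The intervals in `cm` are conditional on the estimated variance components and are shrunk toward zero, so they are not directly comparable to the fixed-effect contrasts from the interaction model and should be read as a rough guide only.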
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models