Hi Andrew,
The reason I did not include year in the random effects part of the
model was since in terms of the actual data those are the individual
observations. Putting year into the model made the groups messed up. The
model statement "FenceEnd/FEsection/MIT_UNMIT" thus had the proper
number of groups to match the physical chatacteristics of the design
setup,
[number of obs: 1230, groups: MIT_UNMIT:(FEsection:FenceEnd), 93;
FEsection:FenceEnd, 58; FenceEnd, 7]
and then there are the count data from each year within each of these
groups. It made sense to me when I was writing the formula but maybe
this is not right?
As well, I have the treatment effects in the random part of the formula
since without doing this there would not be the proper structure to fit
the data, and would I not then have pseudoreplication? It is not
entirely clear to me what should be done with fixed/random effects in
this case where the treatmetn effects need to be used to define the
grouping of the data. Maybe as you say I could define a new variable
with the same characterisitces as 'FEsection' and 'MIT_UNMIT' in order
to have one for the random section to account for data structure, and
another in the fixed section to account for the fixed effects. Wouldn't
that just produce the same results by a different name?
In terms of the study layout:
-Parts of the highway were fenced at different intervals
(treatment=MIT_UNMIT), creating fence ends (location=FenceEnd).
Subsequent fencing extended the fence, removing some fence ends but
creating new ends. Generally, fence ends are far apart from the previous
fence end since most phases of fencing are ~20km long.
-roadkill count data were collected, and summed on a yearly basis
(year).
-roadkill location data were then processed and categorized by UTM
coordinates into five 1km segments inside each fence end and five 1km
segments outdside, centered on the fence end (subsite=FEsection). This
sort of like having 10 different levels of the 2 treatments at each
fence end. The treatment effects being accounted for by the MIT_UNMIT
variable that defines whether the sample had fencing or not, and the
FEsection variable defining how far from the fence end the mortalities
occurred. Of course there could be many ways to look at this, I tried to
keep things as simple as possible. Feel free to make suggestions if you
can think of alternatives...
A variety of models were fit to the data with the intention of
determining how fencing affected the distribution of mortality at a
fence end. Questions asked were - Does mitigation cause a shift in
mortality locations to the unfenced area beyond the fence end (segments
1-5->outside versus 6-10->inside)? This would imply the fence end should
perhaps be relocated elsewhere. Is there a notable difference in
mortality among the segments inside and the segments outside the fence
end? Higher mortality inside the fence could show whether animals are
getting inside at the fence end and then being killed, in which case
deterrents to keep them out and mitigations to allow trapped animals to
escape should be considered. If there is a concentration of mortality in
the 1km FEsections nearest the fence end this would support the idea
that substantial numbers of animals are rounding the fence end and being
killed at that location, implying the fence is not located properly
relative to animal migrations in the area and needs to be extended,
and/or perhaps more crossing structures are needed. Mainly elk are the
animal killed in this area.
The model I pasted into the last email was best of the the 3 best-fit
models. The others were similar in structure but with different form,
such as:
u0 <- lmer(ung ~ 1 + (1|FenceEnd/FEsection/MIT_UNMIT),
family=quasipoisson(link ="log")
u01<- lmer(ung ~ year + (1|FenceEnd/FEsection/MIT_UNMIT),
family=quasipoisson(link ="log")
u1 <- lmer(ung ~ FEsection + (1|FenceEnd/FEsection/MIT_UNMIT),
family=quasipoisson(link ="log"))
u2 <- lmer(ung ~ MIT_UNMIT + (1|FenceEnd/FEsection/MIT_UNMIT),
family=quasipoisson(link ="log"))
.
.
.
u5 <- lmer(ung ~ FEsection + MIT_UNMIT +
(1|FenceEnd/FEsection/MIT_UNMIT), family=quasipoisson(link ="log"))
Anyway I really appreciate the help since it is a new and confusing
issue to me. This is for some work I do on my own time, with data from a
pproject I worked on in the past. The data are for one of the larger
highway wildlife fencing studies out there, and since roadkill is an
issue growing with the growing traffic volumes, these results will
probably be used as reference for how to design other fencing projects.
Wayne
-----Original Message-----
From: Andrew Robinson [mailto:A.Robinson at ms.unimelb.edu.au]
Sent: Saturday, March 03, 2007 1:26 PM
To: Hallstrom, Wayne (Calgary)
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] models with fixed effets nested in random
effects
Hi Wayne,
your question points to a broader problem with the specification of
fixed and random effects. It is often true that fixed effects are
aliased with elements of the underlying experimental design - such as in
split plot designs - and then it makes sense to declare the effect as
both fixed and random. (Actually I think that it makes more sense to
create a new variable identical to the first and use one for fixed and
the other for random, but that is neither here nor there.) I'm not sure
whether this is true in your case.
Anyway, I'm confused that you write "The treatment was applied:
Location/Subsite/TREATMENT/year" and then in your model statement you
have only three levels: FenceEnd/FEsection/MIT_UNMIT. Also, it seems
that two treatment effects appear in the random statement. I can't
reconcile that with your earlier description. Can you try to clarify
these - perhaps by giving a more complete description of the design?
Cheers,
Andrew
On Fri, Mar 02, 2007 at 11:44:01AM -0700, Hallstrom, Wayne (Calgary)
wrote:
I am having trouble defining a model that accounts for the data
structure but also allows the treatment effect to be estimated as a
fixed effect. I think I have it figured out but I would like to see
what some other opinions are regarding placement of a fixed effect in
both the random and fixed sections of a model formula.
There is 20 years of records at a variety of locations and subsites
within each location, with treatments for 1/2 of the years of records
at each location and subsite. The general structure to the data is:
Location/Subsite/year
The treatment was applied:
Location/Subsite/TREATMENT/year
The model format is:
u5 <- lmer(ung ~ FEsection + MIT_UNMIT +
(1|FenceEnd/FEsection/MIT_UNMIT), family=quasipoisson(link =
"log"))
Model output is:
> summary(u5)
Generalized linear mixed model fit using Laplace
Formula: ung ~ FEsection + MIT_UNMIT + (1 |
FenceEnd/FEsection/MIT_UNMIT)
Family: quasipoisson(log link)
AIC BIC logLik deviance
1661 1733 -816.7 1633
Random effects:
Groups Name Variance Std.Dev.
MIT_UNMIT:(FEsection:FenceEnd) (Intercept) 0.268713 0.51838
FEsection:FenceEnd (Intercept) 0.066643 0.25815
FenceEnd (Intercept) 0.213373 0.46192
Residual 1.371364 1.17105
number of obs: 1230, groups: MIT_UNMIT:(FEsection:FenceEnd), 93;
FEsection:FenceEnd, 58; FenceEnd, 7
Fixed effects:
Estimate Std. Error t value
(Intercept) -1.808774 0.340329 -5.315
FEsection2 -0.148425 0.349608 -0.425
FEsection3 -0.066525 0.344542 -0.193
FEsection4 -0.021216 0.346025 -0.061
FEsection5 0.470120 0.333193 1.411
FEsection6 0.459920 0.329556 1.396
FEsection7 0.455736 0.381993 1.193
FEsection8 0.006034 0.399353 0.015
FEsection9 0.423509 0.383930 1.103
FEsection10 0.062848 0.393176 0.160
MIT_UNMITunmit 1.572822 0.188382 8.349
This seems fine to me, there are the propoer number of groups in the
data structure. The problem I have yet to wrap my head around is the
fact that the fixed effects are still in the random effects category
as well. Is that a problem? Should I be adding all terms with fixed
effect variable, including interaction terms, into the 'fixed effects'
part of the model equation? Since I am new to this LMER routine it is
a bit confusing to write.
Wayne Hallstrom
*** WORLEYPARSONS GROUP NOTICE ***
"This email is confidential. If you are not the intended recipient,
you must not disclose or use the information contained in it. If
you have received this email in error, please notify us immediately by
return email and delete the email and any attachments. Any personal
views or opinions expressed by the writer may not necessarily reflect
the views or opinions of any company in the WorleyParsons Group of
Companies."
--
Andrew Robinson
Department of Mathematics and Statistics Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewprhttp://blogs.mbs.edu/fishing-in-the-bay/
*** WORLEYPARSONS GROUP NOTICE ***
"This email is confidential. If you are not the intended recipient, you must not disclose or use the information contained in it. If you have received this email in error, please notify us immediately by return email and delete the email and any attachments. Any personal views or opinions expressed by the writer may not necessarily reflect the views or opinions of any company in the WorleyParsons Group of Companies."
On Mon, Mar 05, 2007 at 12:10:16PM -0700, Hallstrom, Wayne (Calgary) wrote:
Hi Andrew,
The reason I did not include year in the random effects part of the
model was since in terms of the actual data those are the individual
observations. Putting year into the model made the groups messed up. The
model statement "FenceEnd/FEsection/MIT_UNMIT" thus had the proper
number of groups to match the physical chatacteristics of the design
setup,
[number of obs: 1230, groups: MIT_UNMIT:(FEsection:FenceEnd), 93;
FEsection:FenceEnd, 58; FenceEnd, 7]
and then there are the count data from each year within each of these
groups. It made sense to me when I was writing the formula but maybe
this is not right?
The decision about year looks good to me, based on your description
above.
There may well be a temporal correlation structure to watch out for
within the lowest- level random effects. If that be the case then you
might want to move to using lme(), which has well-configured helper
functions for fitting the more complicated models.
As well, I have the treatment effects in the random part of the formula
since without doing this there would not be the proper structure to fit
the data, and would I not then have pseudoreplication? It is not
entirely clear to me what should be done with fixed/random effects in
this case where the treatmetn effects need to be used to define the
grouping of the data. Maybe as you say I could define a new variable
with the same characterisitces as 'FEsection' and 'MIT_UNMIT' in order
to have one for the random section to account for data structure, and
another in the fixed section to account for the fixed effects. Wouldn't
that just produce the same results by a different name?
Yes, it would. I hope that my addition of that text wasn't confusing.
I merely meant to include it as a possiblity to deal with concerns
about the same effects being fixed and random. I do think that it
would be clearler in some cases.
In terms of the study layout:
-Parts of the highway were fenced at different intervals
(treatment=MIT_UNMIT), creating fence ends (location=FenceEnd).
Ok, this implies to me that MIT_UNMIT represents the intervals
somehow, and that each interval has one or more FenceEnd associated
with it. What exactly does MIT_UNMIT mean? I assume that it's
mitigation vs no mitigation.
Subsequent fencing extended the fence, removing some fence ends but
creating new ends. Generally, fence ends are far apart from the previous
fence end since most phases of fencing are ~20km long.
-roadkill count data were collected, and summed on a yearly basis
(year).
-roadkill location data were then processed and categorized by UTM
coordinates into five 1km segments inside each fence end and five 1km
segments outdside, centered on the fence end (subsite=FEsection). This
sort of like having 10 different levels of the 2 treatments at each
fence end. The treatment effects being accounted for by the MIT_UNMIT
variable that defines whether the sample had fencing or not, and the
FEsection variable defining how far from the fence end the mortalities
occurred. Of course there could be many ways to look at this, I tried to
keep things as simple as possible. Feel free to make suggestions if you
can think of alternatives...
I'm a bit confused by the nesting of MIT_UNMIT inside FEsection. Your
earlier text implies to me that the MIT_UNMIT treatment differed at
the FenceEnd level. Nesting it inside FEsection implies that every
level of FEsection has all the MIT_UNMIT levels nested within them.
That seems like a contradiction to me.
Also, based on your description I wonder if FEsection should be
recoded to have a continuous basis as well as a categorical one - for
the purposes of representing an underlying distance? I note from your
earlier email that FEsection appears as a 10-level factor, and that
seems (prima facie) more difficult to interpret.
Cheers,
Andrew
A variety of models were fit to the data with the intention of
determining how fencing affected the distribution of mortality at a
fence end. Questions asked were - Does mitigation cause a shift in
mortality locations to the unfenced area beyond the fence end (segments
1-5->outside versus 6-10->inside)? This would imply the fence end should
perhaps be relocated elsewhere. Is there a notable difference in
mortality among the segments inside and the segments outside the fence
end? Higher mortality inside the fence could show whether animals are
getting inside at the fence end and then being killed, in which case
deterrents to keep them out and mitigations to allow trapped animals to
escape should be considered. If there is a concentration of mortality in
the 1km FEsections nearest the fence end this would support the idea
that substantial numbers of animals are rounding the fence end and being
killed at that location, implying the fence is not located properly
relative to animal migrations in the area and needs to be extended,
and/or perhaps more crossing structures are needed. Mainly elk are the
animal killed in this area.
The model I pasted into the last email was best of the the 3 best-fit
models. The others were similar in structure but with different form,
such as:
u0 <- lmer(ung ~ 1 + (1|FenceEnd/FEsection/MIT_UNMIT),
family=quasipoisson(link ="log")
u01<- lmer(ung ~ year + (1|FenceEnd/FEsection/MIT_UNMIT),
family=quasipoisson(link ="log")
u1 <- lmer(ung ~ FEsection + (1|FenceEnd/FEsection/MIT_UNMIT),
family=quasipoisson(link ="log"))
u2 <- lmer(ung ~ MIT_UNMIT + (1|FenceEnd/FEsection/MIT_UNMIT),
family=quasipoisson(link ="log"))
.
.
.
u5 <- lmer(ung ~ FEsection + MIT_UNMIT +
(1|FenceEnd/FEsection/MIT_UNMIT), family=quasipoisson(link ="log"))
Anyway I really appreciate the help since it is a new and confusing
issue to me. This is for some work I do on my own time, with data from a
pproject I worked on in the past. The data are for one of the larger
highway wildlife fencing studies out there, and since roadkill is an
issue growing with the growing traffic volumes, these results will
probably be used as reference for how to design other fencing projects.
Wayne