I am analyzing data from 3 field experiments (farms=3) for a citrus flower disease: response variable is binomial because the flower can only be diseased or healthy. I have particular interest in comparing 5 fungicide spraying systems (trt=5). Each farm had 4 blocks (bk=4) including 2 trees as subsamples (tree=2) in which I assessed 100 flowers each one. This is a quick look of the data: farm trt bk tree dis tot <fctr> <fctr> <fctr> <fctr> <int> <int> iaras cal 1 1 0 100 iaras cal 1 2 1 100 iaras cal 2 1 1 100 iaras cal 2 2 3 100 iaras cal 3 1 0 100 iaras cal 3 2 5 100... The model I considered was: resp <- with(df, cbind(dis, tot-dis)) m1 = glmer(resp ~ trt + (1|farm/bk) , family = binomial, data=df) I tested the overdispersion with the overdisp_fun() from GLMM page <http://glmm.wikidot.com/faq> chisq ratio p logp 4.191645e+02 3.742540e+00 4.804126e-37 -8.362617e+01 As ratio (residual dev/residual df) > 1, and the p-value < 0.05, I considered to add the observation level random effect (link <http://r.789695.n4.nabble.com/Question-on-overdispersion-td3049898.html>) to deal with the overdispersion. farm trt bk tree dis tot tree_id <fctr> <fctr> <fctr> <fctr> <int> <int> <fctr> iaras cal 1 1 0 100 1 iaras cal 1 2 1 100 2 iaras cal 2 1 1 100 3... so now was added a random effect for each row (tree_id) to the model, but I am not sure of how to include it. This is my approach: m2 = glmer(resp ~ trt + (1|farm/bk) + (1|tree_id), family = binomial, data=df) I also wonder if farm should be a fixed effect, since it has only 3 levels... m3 = glmer(resp ~ trt * farm + (1|farm:bk) + (1|tree_id), family = binomial, data=df) I really appreciate your suggestions about my model specifications... *Juan? Edwards- - - - - - - - - - - - - - - - - - - - - - - -# PhD student - ESALQ-USP/Brazil*
GLMM for Combined experiments and overdispersed data
10 messages · Thierry Onkelinx, Peter Claussen, Juan Pablo Edwards Molina
2 days later
Dear Juan, Use unique id's for random effects variables. So each bk should only be present in one farm. And each tree_id should be present in only one bk. In case each block has different treatments then each tree_id should be unique to one combination of bk and trt. Farm has too few levels to be a random effects. So either model is as a fixed effect or drop it. In case you drop it, the information will be picked up by bk. Note that trt + (1|farm) is less complex than trt * farm. Assuming that you are not interested in the effect of a specific farm, you could use sum, polynomial or helmert contrasts for the farms. Unlike the default treatment contrast, these type of contrasts sum to zero. Thus the effect of trt will be that for the average farm instead of the reference farm. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey 2017-04-21 22:32 GMT+02:00 Juan Pablo Edwards Molina < edwardsmolina at gmail.com>:
I am analyzing data from 3 field experiments (farms=3) for a citrus flower disease: response variable is binomial because the flower can only be diseased or healthy. I have particular interest in comparing 5 fungicide spraying systems (trt=5). Each farm had 4 blocks (bk=4) including 2 trees as subsamples (tree=2) in which I assessed 100 flowers each one. This is a quick look of the data: farm trt bk tree dis tot <fctr> <fctr> <fctr> <fctr> <int> <int> iaras cal 1 1 0 100 iaras cal 1 2 1 100 iaras cal 2 1 1 100 iaras cal 2 2 3 100 iaras cal 3 1 0 100 iaras cal 3 2 5 100... The model I considered was: resp <- with(df, cbind(dis, tot-dis)) m1 = glmer(resp ~ trt + (1|farm/bk) , family = binomial, data=df) I tested the overdispersion with the overdisp_fun() from GLMM page <http://glmm.wikidot.com/faq> chisq ratio p logp 4.191645e+02 3.742540e+00 4.804126e-37 -8.362617e+01 As ratio (residual dev/residual df) > 1, and the p-value < 0.05, I considered to add the observation level random effect (link <http://r.789695.n4.nabble.com/Question-on-overdispersion-td3049898.html>) to deal with the overdispersion. farm trt bk tree dis tot tree_id <fctr> <fctr> <fctr> <fctr> <int> <int> <fctr> iaras cal 1 1 0 100 1 iaras cal 1 2 1 100 2 iaras cal 2 1 1 100 3... so now was added a random effect for each row (tree_id) to the model, but I am not sure of how to include it. This is my approach: m2 = glmer(resp ~ trt + (1|farm/bk) + (1|tree_id), family = binomial, data=df) I also wonder if farm should be a fixed effect, since it has only 3 levels... m3 = glmer(resp ~ trt * farm + (1|farm:bk) + (1|tree_id), family = binomial, data=df) I really appreciate your suggestions about my model specifications... *Juan? Edwards- - - - - - - - - - - - - - - - - - - - - - - -# PhD student - ESALQ-USP/Brazil* [[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
I?m sorry... I?m new in the list, and when I figured out that the question would suit best in the mixed model list I had already post it in general R-help. I don?t know if there?s a way to "cancel a question"... I will take care of it from now on. Dear Thierry, thanks for your answer. Yes, I am not interested in the effect of a specific farm, they simply represent the total of farms from the region where I want to suggest the best treatments. I Followed your suggestions, but still have a couple of doubts, 1- May "farm" be include as a simple fixed effect or interacting with the treatment? m3 = glmer(resp ~ trt * farm + (1|tree_id), family = binomial, data=df) m4 = glmer(resp ~ trt + farm + (1|tree_id), family = binomial, data=df) ?2 - ? In case of significant ?[ trt * farm ], should I report the results for each farm?? Thanks again Thierry, Juan Edwards *Juan* On Mon, Apr 24, 2017 at 4:29 AM, Thierry Onkelinx <thierry.onkelinx at inbo.be> wrote:
Dear Juan, Use unique id's for random effects variables. So each bk should only be present in one farm. And each tree_id should be present in only one bk. In case each block has different treatments then each tree_id should be unique to one combination of bk and trt. Farm has too few levels to be a random effects. So either model is as a fixed effect or drop it. In case you drop it, the information will be picked up by bk. Note that trt + (1|farm) is less complex than trt * farm. Assuming that you are not interested in the effect of a specific farm, you could use sum, polynomial or helmert contrasts for the farms. Unlike the default treatment contrast, these type of contrasts sum to zero. Thus the effect of trt will be that for the average farm instead of the reference farm. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey 2017-04-21 22:32 GMT+02:00 Juan Pablo Edwards Molina < edwardsmolina at gmail.com>:
I am analyzing data from 3 field experiments (farms=3) for a citrus flower disease: response variable is binomial because the flower can only be diseased or healthy. I have particular interest in comparing 5 fungicide spraying systems (trt=5). Each farm had 4 blocks (bk=4) including 2 trees as subsamples (tree=2) in which I assessed 100 flowers each one. This is a quick look of the data: farm trt bk tree dis tot <fctr> <fctr> <fctr> <fctr> <int> <int> iaras cal 1 1 0 100 iaras cal 1 2 1 100 iaras cal 2 1 1 100 iaras cal 2 2 3 100 iaras cal 3 1 0 100 iaras cal 3 2 5 100... The model I considered was: resp <- with(df, cbind(dis, tot-dis)) m1 = glmer(resp ~ trt + (1|farm/bk) , family = binomial, data=df) I tested the overdispersion with the overdisp_fun() from GLMM page <http://glmm.wikidot.com/faq> chisq ratio p logp 4.191645e+02 3.742540e+00 4.804126e-37 -8.362617e+01 As ratio (residual dev/residual df) > 1, and the p-value < 0.05, I considered to add the observation level random effect (link <http://r.789695.n4.nabble.com/Question-on-overdispersion-td3049898.html
)
to deal with the overdispersion.
farm trt bk tree dis tot tree_id <fctr> <fctr>
<fctr> <fctr> <int> <int> <fctr>
iaras cal 1 1 0 100 1
iaras cal 1 2 1 100 2
iaras cal 2 1 1 100 3...
so now was added a random effect for each row (tree_id) to the model, but
I
am not sure of how to include it. This is my approach:
m2 = glmer(resp ~ trt + (1|farm/bk) + (1|tree_id), family = binomial,
data=df)
I also wonder if farm should be a fixed effect, since it has only 3
levels...
m3 = glmer(resp ~ trt * farm + (1|farm:bk) + (1|tree_id), family =
binomial, data=df)
I really appreciate your suggestions about my model specifications...
*Juan? Edwards- - - - - - - - - - - - - - - - - - - - - - - -# PhD student
- ESALQ-USP/Brazil*
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Juan, I would model this as m3 = glmer(resp ~ trt * farm + (1| bk/tree), family = binomial, data=df) or m3 = glmer(resp ~ trt * farm + (1| bk) + (1| tree_id), family = binomial, data=df) (I can?t say off the top of my head if what the difference would be if you?re dealing with over-dispersion). 1. I?m assuming that block is a somewhat uniform grouping of trees, so that including block in the model gives you an estimate of spatial variability in the response, and if that is important relative to tree-to-tree variation. 2. You will most certainly want to include trt*farm to test for treatment-by-environment interaction. If interaction is not significant, you may choose to exclude interaction from the model. If there is interaction, then you will want to examine each farm to determine if cross-over interaction present. If your experiment is to determine the ?best? fungicide spraying system, and cross-over interaction is present, then you have no ?best? system. You might have cross-over arising because, say, system 1 ranks ?best? on farm 1, but system 2 ranks ?best? on farm 2. There is extensive literature on the topic, mostly from the plant breeding genotype-by-environment interaction side. Some of the associated statistics implemented in the agricolae package, i.e. AMMI. Peter
On Apr 24, 2017, at 6:56 AM, Juan Pablo Edwards Molina <edwardsmolina at gmail.com> wrote: I?m sorry... I?m new in the list, and when I figured out that the question would suit best in the mixed model list I had already post it in general R-help. I don?t know if there?s a way to "cancel a question"... I will take care of it from now on. Dear Thierry, thanks for your answer. Yes, I am not interested in the effect of a specific farm, they simply represent the total of farms from the region where I want to suggest the best treatments. I Followed your suggestions, but still have a couple of doubts, 1- May "farm" be include as a simple fixed effect or interacting with the treatment? m3 = glmer(resp ~ trt * farm + (1|tree_id), family = binomial, data=df) m4 = glmer(resp ~ trt + farm + (1|tree_id), family = binomial, data=df) ?2 - ? In case of significant ?[ trt * farm ], should I report the results for each farm?? Thanks again Thierry, Juan Edwards *Juan* On Mon, Apr 24, 2017 at 4:29 AM, Thierry Onkelinx <thierry.onkelinx at inbo.be> wrote:
Dear Juan, Use unique id's for random effects variables. So each bk should only be present in one farm. And each tree_id should be present in only one bk. In case each block has different treatments then each tree_id should be unique to one combination of bk and trt. Farm has too few levels to be a random effects. So either model is as a fixed effect or drop it. In case you drop it, the information will be picked up by bk. Note that trt + (1|farm) is less complex than trt * farm. Assuming that you are not interested in the effect of a specific farm, you could use sum, polynomial or helmert contrasts for the farms. Unlike the default treatment contrast, these type of contrasts sum to zero. Thus the effect of trt will be that for the average farm instead of the reference farm. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey 2017-04-21 22:32 GMT+02:00 Juan Pablo Edwards Molina < edwardsmolina at gmail.com>:
I am analyzing data from 3 field experiments (farms=3) for a citrus flower disease: response variable is binomial because the flower can only be diseased or healthy. I have particular interest in comparing 5 fungicide spraying systems (trt=5). Each farm had 4 blocks (bk=4) including 2 trees as subsamples (tree=2) in which I assessed 100 flowers each one. This is a quick look of the data: farm trt bk tree dis tot <fctr> <fctr> <fctr> <fctr> <int> <int> iaras cal 1 1 0 100 iaras cal 1 2 1 100 iaras cal 2 1 1 100 iaras cal 2 2 3 100 iaras cal 3 1 0 100 iaras cal 3 2 5 100... The model I considered was: resp <- with(df, cbind(dis, tot-dis)) m1 = glmer(resp ~ trt + (1|farm/bk) , family = binomial, data=df) I tested the overdispersion with the overdisp_fun() from GLMM page <http://glmm.wikidot.com/faq> chisq ratio p logp 4.191645e+02 3.742540e+00 4.804126e-37 -8.362617e+01 As ratio (residual dev/residual df) > 1, and the p-value < 0.05, I considered to add the observation level random effect (link <http://r.789695.n4.nabble.com/Question-on-overdispersion-td3049898.html
)
to deal with the overdispersion.
farm trt bk tree dis tot tree_id <fctr> <fctr>
<fctr> <fctr> <int> <int> <fctr>
iaras cal 1 1 0 100 1
iaras cal 1 2 1 100 2
iaras cal 2 1 1 100 3...
so now was added a random effect for each row (tree_id) to the model, but
I
am not sure of how to include it. This is my approach:
m2 = glmer(resp ~ trt + (1|farm/bk) + (1|tree_id), family = binomial,
data=df)
I also wonder if farm should be a fixed effect, since it has only 3
levels...
m3 = glmer(resp ~ trt * farm + (1|farm:bk) + (1|tree_id), family =
binomial, data=df)
I really appreciate your suggestions about my model specifications...
*Juan? Edwards- - - - - - - - - - - - - - - - - - - - - - - -# PhD student
- ESALQ-USP/Brazil*
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Dear Peter, Both models will yield identical results in case tree_id uses unique codes over the blocks. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey 2017-04-24 15:55 GMT+02:00 Peter Claussen <dakotajudo at mac.com>:
Juan, I would model this as m3 = glmer(resp ~ trt * farm + (1| bk/tree), family = binomial, data=df) or m3 = glmer(resp ~ trt * farm + (1| bk) + (1| tree_id), family = binomial, data=df) (I can?t say off the top of my head if what the difference would be if you?re dealing with over-dispersion). 1. I?m assuming that block is a somewhat uniform grouping of trees, so that including block in the model gives you an estimate of spatial variability in the response, and if that is important relative to tree-to-tree variation. 2. You will most certainly want to include trt*farm to test for treatment-by-environment interaction. If interaction is not significant, you may choose to exclude interaction from the model. If there is interaction, then you will want to examine each farm to determine if cross-over interaction present. If your experiment is to determine the ?best? fungicide spraying system, and cross-over interaction is present, then you have no ?best? system. You might have cross-over arising because, say, system 1 ranks ?best? on farm 1, but system 2 ranks ?best? on farm 2. There is extensive literature on the topic, mostly from the plant breeding genotype-by-environment interaction side. Some of the associated statistics implemented in the agricolae package, i.e. AMMI. Peter
On Apr 24, 2017, at 6:56 AM, Juan Pablo Edwards Molina <
edwardsmolina at gmail.com> wrote:
I?m sorry... I?m new in the list, and when I figured out that the
question
would suit best in the mixed model list I had already post it in general R-help. I don?t know if there?s a way to "cancel a question"... I will
take
care of it from now on. Dear Thierry, thanks for your answer. Yes, I am not interested in the effect of a specific farm, they simply represent the total of farms from the region where I want to suggest the best treatments. I Followed your suggestions, but still have a couple of doubts, 1- May "farm" be include as a simple fixed effect or interacting with the treatment? m3 = glmer(resp ~ trt * farm + (1|tree_id), family = binomial, data=df) m4 = glmer(resp ~ trt + farm + (1|tree_id), family = binomial, data=df) ?2 - ? In case of significant ?[ trt * farm ], should I report the results for each farm?? Thanks again Thierry, Juan Edwards *Juan* On Mon, Apr 24, 2017 at 4:29 AM, Thierry Onkelinx <
thierry.onkelinx at inbo.be>
wrote:
Dear Juan, Use unique id's for random effects variables. So each bk should only be present in one farm. And each tree_id should be present in only one bk.
In
case each block has different treatments then each tree_id should be
unique
to one combination of bk and trt. Farm has too few levels to be a random effects. So either model is as a fixed effect or drop it. In case you drop it, the information will be picked up by bk. Note that trt + (1|farm) is less complex than trt *
farm.
Assuming that you are not interested in the effect of a specific farm,
you
could use sum, polynomial or helmert contrasts for the farms. Unlike the default treatment contrast, these type of contrasts sum to zero. Thus
the
effect of trt will be that for the average farm instead of the reference farm. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and
Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to
say
what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey 2017-04-21 22:32 GMT+02:00 Juan Pablo Edwards Molina < edwardsmolina at gmail.com>:
I am analyzing data from 3 field experiments (farms=3) for a citrus
flower
disease: response variable is binomial because the flower can only be diseased or healthy. I have particular interest in comparing 5 fungicide spraying systems (trt=5). Each farm had 4 blocks (bk=4) including 2 trees as subsamples (tree=2)
in
which I assessed 100 flowers each one. This is a quick look of the
data:
farm trt bk tree dis tot <fctr> <fctr> <fctr> <fctr> <int> <int> iaras cal 1 1 0 100 iaras cal 1 2 1 100 iaras cal 2 1 1 100 iaras cal 2 2 3 100 iaras cal 3 1 0 100 iaras cal 3 2 5 100... The model I considered was: resp <- with(df, cbind(dis, tot-dis)) m1 = glmer(resp ~ trt + (1|farm/bk) , family = binomial, data=df) I tested the overdispersion with the overdisp_fun() from GLMM page <http://glmm.wikidot.com/faq> chisq ratio p logp 4.191645e+02 3.742540e+00 4.804126e-37 -8.362617e+01 As ratio (residual dev/residual df) > 1, and the p-value < 0.05, I considered to add the observation level random effect (link <http://r.789695.n4.nabble.com/Question-on-
overdispersion-td3049898.html
)
to deal with the overdispersion. farm trt bk tree dis tot tree_id <fctr> <fctr> <fctr> <fctr> <int> <int> <fctr> iaras cal 1 1 0 100 1 iaras cal 1 2 1 100 2 iaras cal 2 1 1 100 3... so now was added a random effect for each row (tree_id) to the model,
but
I am not sure of how to include it. This is my approach: m2 = glmer(resp ~ trt + (1|farm/bk) + (1|tree_id), family = binomial, data=df) I also wonder if farm should be a fixed effect, since it has only 3 levels... m3 = glmer(resp ~ trt * farm + (1|farm:bk) + (1|tree_id), family = binomial, data=df) I really appreciate your suggestions about my model specifications... *Juan? Edwards- - - - - - - - - - - - - - - - - - - - - - - -# PhD
student
- ESALQ-USP/Brazil*
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Yes, tree_id uses unique code over blocks and farms: ?each bk is only present in one farm and each tree is present in only one bk. (tree_id structure: farm_bk_tree) By this way the model can handle the overdispersion (If I'm not mistaken) Thank you all, *Juan* On Mon, Apr 24, 2017 at 11:46 AM, Thierry Onkelinx <thierry.onkelinx at inbo.be
wrote:
Dear Peter, Both models will yield identical results in case tree_id uses unique codes over the blocks. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey 2017-04-24 15:55 GMT+02:00 Peter Claussen <dakotajudo at mac.com>:
Juan, I would model this as m3 = glmer(resp ~ trt * farm + (1| bk/tree), family = binomial, data=df) or m3 = glmer(resp ~ trt * farm + (1| bk) + (1| tree_id), family = binomial, data=df) (I can?t say off the top of my head if what the difference would be if you?re dealing with over-dispersion). 1. I?m assuming that block is a somewhat uniform grouping of trees, so that including block in the model gives you an estimate of spatial variability in the response, and if that is important relative to tree-to-tree variation. 2. You will most certainly want to include trt*farm to test for treatment-by-environment interaction. If interaction is not significant, you may choose to exclude interaction from the model. If there is interaction, then you will want to examine each farm to determine if cross-over interaction present. If your experiment is to determine the ?best? fungicide spraying system, and cross-over interaction is present, then you have no ?best? system. You might have cross-over arising because, say, system 1 ranks ?best? on farm 1, but system 2 ranks ?best? on farm 2. There is extensive literature on the topic, mostly from the plant breeding genotype-by-environment interaction side. Some of the associated statistics implemented in the agricolae package, i.e. AMMI. Peter
On Apr 24, 2017, at 6:56 AM, Juan Pablo Edwards Molina <
edwardsmolina at gmail.com> wrote:
I?m sorry... I?m new in the list, and when I figured out that the
question
would suit best in the mixed model list I had already post it in general R-help. I don?t know if there?s a way to "cancel a question"... I will
take
care of it from now on. Dear Thierry, thanks for your answer. Yes, I am not interested in the effect of a specific farm, they simply represent the total of farms from the region where I want to suggest the best treatments. I Followed your suggestions, but still have a couple of doubts, 1- May "farm" be include as a simple fixed effect or interacting with
the
treatment? m3 = glmer(resp ~ trt * farm + (1|tree_id), family = binomial, data=df) m4 = glmer(resp ~ trt + farm + (1|tree_id), family = binomial, data=df) ?2 - ? In case of significant ?[ trt * farm ], should I report the results for each farm?? Thanks again Thierry, Juan Edwards *Juan* On Mon, Apr 24, 2017 at 4:29 AM, Thierry Onkelinx <
thierry.onkelinx at inbo.be>
wrote:
Dear Juan, Use unique id's for random effects variables.
?? So each bk should only be
present in one farm. And each tree_id should be present in only one
bk. In
case each block has different treatments then each tree_id should be
unique
to one combination of bk and trt. Farm has too few levels to be a random effects. So either model is as a fixed effect or drop it. In case you drop it, the information will be picked up by bk. Note that trt + (1|farm) is less complex than trt *
farm.
Assuming that you are not interested in the effect of a specific farm,
you
could use sum, polynomial or helmert contrasts for the farms. Unlike
the
default treatment contrast, these type of contrasts sum to zero. Thus
the
effect of trt will be that for the average farm instead of the
reference
farm. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and
Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able
to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does
not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey 2017-04-21 22:32 GMT+02:00 Juan Pablo Edwards Molina < edwardsmolina at gmail.com>:
I am analyzing data from 3 field experiments (farms=3) for a citrus
flower
disease: response variable is binomial because the flower can only be diseased or healthy. I have particular interest in comparing 5 fungicide spraying systems (trt=5). Each farm had 4 blocks (bk=4) including 2 trees as subsamples
(tree=2) in
which I assessed 100 flowers each one. This is a quick look of the
data:
farm trt bk tree dis tot <fctr> <fctr> <fctr> <fctr> <int> <int> iaras cal 1 1 0 100 iaras cal 1 2 1 100 iaras cal 2 1 1 100 iaras cal 2 2 3 100 iaras cal 3 1 0 100 iaras cal 3 2 5 100... The model I considered was: resp <- with(df, cbind(dis, tot-dis)) m1 = glmer(resp ~ trt + (1|farm/bk) , family = binomial, data=df) I tested the overdispersion with the overdisp_fun() from GLMM page <http://glmm.wikidot.com/faq> chisq ratio p logp 4.191645e+02 3.742540e+00 4.804126e-37 -8.362617e+01 As ratio (residual dev/residual df) > 1, and the p-value < 0.05, I considered to add the observation level random effect (link <http://r.789695.n4.nabble.com/Question-on-overdispersion-td
3049898.html
)
to deal with the overdispersion. farm trt bk tree dis tot tree_id <fctr> <fctr> <fctr> <fctr> <int> <int> <fctr> iaras cal 1 1 0 100 1 iaras cal 1 2 1 100 2 iaras cal 2 1 1 100 3... so now was added a random effect for each row (tree_id) to the model,
but
I am not sure of how to include it. This is my approach: m2 = glmer(resp ~ trt + (1|farm/bk) + (1|tree_id), family = binomial, data=df) I also wonder if farm should be a fixed effect, since it has only 3 levels... m3 = glmer(resp ~ trt * farm + (1|farm:bk) + (1|tree_id), family = binomial, data=df) I really appreciate your suggestions about my model specifications... *Juan? Edwards- - - - - - - - - - - - - - - - - - - - - - - -# PhD
student
- ESALQ-USP/Brazil*
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Thierry, sorry to bother you again...
I tried to follow your suggestion and I did the herlmert contrasts with
lsmeans package.
dinc <- within(dinc, { tree_id <- as.factor(interaction(farm, trt, bk,
tree)) })
resp1 <- with(dinc, cbind(dis, tot-dis))
m0 = glmer(resp1 ~ trt + farm + (1|tree_id), family = binomial, data=dinc)
summary(m0)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [
glmerMod]
Family: binomial ( logit )
Formula: resp1 ~ trt + farm + (1 | tree_id)
Data: dinc
AIC BIC logLik deviance df.resid
521.5 543.8 -252.7 505.5 112
Scaled residuals:
Min 1Q Median 3Q Max
-0.90445 -0.51114 -0.00572 0.31456 1.04667
Random effects:
Groups Name Variance Std.Dev.
tree_id (Intercept) 1.028 1.014
Number of obs: 120, groups: tree_id, 120
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.87786 0.37604 -12.972 < 2e-16 ***
trtG10 -0.06738 0.49125 -0.137 0.89090
trtG15 0.90620 0.44435 2.039 0.04141 *
trtG20 1.13733 0.43920 2.590 0.00961 **
trtControl 5.10202 0.41215 12.379 < 2e-16 ***
farmstacruz -0.80155 0.30294 -2.646 0.00815 **
farmtaqua -0.84738 0.30659 -2.764 0.00571 **
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Correlation of Fixed Effects:
(Intr) trtG10 trtG15 trtG20 trtCnt frmstc
trtG10 -0.635
trtG15 -0.710 0.538
trtG20 -0.720 0.546 0.604
trtControl -0.763 0.571 0.650 0.651
farmstacruz -0.300 0.022 -0.015 0.010 -0.081
farmtaqua -0.301 0.012 0.004 0.017 -0.083 0.441
### Setting up a custom contrast function
helmert.lsmc <- function(levs, ...) {
M <- as.data.frame(contr.helmert(levs))
names(M) <- paste(levs[-1],"vs calendar")
attr(M, "desc") <- "Helmert contrasts"
M
}
lsmeans(m0, helmert ~ trt, type = "response")
$lsmeans trt prob SE df asymp.LCL asymp.UCL Calendar 0.004374833 0.001541432 NA 0.002191201 0.008715549 G10 0.004090935 0.001498922 NA 0.001993307 0.008377443 G15 0.010757825 0.003091538 NA 0.006116214 0.018855141 G20 0.013517346 0.003832360 NA 0.007740919 0.023502164 Control 0.419339074 0.051564118 NA 0.322885163 0.522377507 Results are averaged over the levels of: farm Confidence level used: 0.95 Intervals are back-transformed from the logit scale $contrasts contrast odds.ratio SE df z.ratio p.value G10 vs calendar 9.348400e-01 4.592373e-01 NA -0.137 0.8909 G15 vs calendar 6.552019e+00 4.909975e+00 NA 2.508 0.0121 G20 vs calendar 1.310737e+01 1.307704e+01 NA 2.579 0.0099 Control vs calendar 1.011296e+08 1.124089e+08 NA 16.582 <.0001 Results are averaged over the levels of: farm Tests are performed on the log odds ratio scale ## Do you think it's correct, if I consider trt calendar as the reference to test my other treatments? Thanks! Juan *Juan* On Mon, Apr 24, 2017 at 11:46 AM, Thierry Onkelinx <thierry.onkelinx at inbo.be
wrote:
Dear Peter, Both models will yield identical results in case tree_id uses unique codes over the blocks. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey 2017-04-24 15:55 GMT+02:00 Peter Claussen <dakotajudo at mac.com>:
Juan, I would model this as m3 = glmer(resp ~ trt * farm + (1| bk/tree), family = binomial, data=df) or m3 = glmer(resp ~ trt * farm + (1| bk) + (1| tree_id), family = binomial, data=df) (I can?t say off the top of my head if what the difference would be if you?re dealing with over-dispersion). 1. I?m assuming that block is a somewhat uniform grouping of trees, so that including block in the model gives you an estimate of spatial variability in the response, and if that is important relative to tree-to-tree variation. 2. You will most certainly want to include trt*farm to test for treatment-by-environment interaction. If interaction is not significant, you may choose to exclude interaction from the model. If there is interaction, then you will want to examine each farm to determine if cross-over interaction present. If your experiment is to determine the ?best? fungicide spraying system, and cross-over interaction is present, then you have no ?best? system. You might have cross-over arising because, say, system 1 ranks ?best? on farm 1, but system 2 ranks ?best? on farm 2. There is extensive literature on the topic, mostly from the plant breeding genotype-by-environment interaction side. Some of the associated statistics implemented in the agricolae package, i.e. AMMI. Peter
On Apr 24, 2017, at 6:56 AM, Juan Pablo Edwards Molina <
edwardsmolina at gmail.com> wrote:
I?m sorry... I?m new in the list, and when I figured out that the
question
would suit best in the mixed model list I had already post it in general R-help. I don?t know if there?s a way to "cancel a question"... I will
take
care of it from now on. Dear Thierry, thanks for your answer. Yes, I am not interested in the effect of a specific farm, they simply represent the total of farms from the region where I want to suggest the best treatments. I Followed your suggestions, but still have a couple of doubts, 1- May "farm" be include as a simple fixed effect or interacting with
the
treatment? m3 = glmer(resp ~ trt * farm + (1|tree_id), family = binomial, data=df) m4 = glmer(resp ~ trt + farm + (1|tree_id), family = binomial, data=df) ?2 - ? In case of significant ?[ trt * farm ], should I report the results for each farm?? Thanks again Thierry, Juan Edwards *Juan* On Mon, Apr 24, 2017 at 4:29 AM, Thierry Onkelinx <
thierry.onkelinx at inbo.be>
wrote:
Dear Juan, Use unique id's for random effects variables. So each bk should only be present in one farm. And each tree_id should be present in only one
bk. In
case each block has different treatments then each tree_id should be
unique
to one combination of bk and trt. Farm has too few levels to be a random effects. So either model is as a fixed effect or drop it. In case you drop it, the information will be picked up by bk. Note that trt + (1|farm) is less complex than trt *
farm.
Assuming that you are not interested in the effect of a specific farm,
you
could use sum, polynomial or helmert contrasts for the farms. Unlike
the
default treatment contrast, these type of contrasts sum to zero. Thus
the
effect of trt will be that for the average farm instead of the
reference
farm. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and
Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able
to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does
not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey 2017-04-21 22:32 GMT+02:00 Juan Pablo Edwards Molina < edwardsmolina at gmail.com>:
I am analyzing data from 3 field experiments (farms=3) for a citrus
flower
disease: response variable is binomial because the flower can only be diseased or healthy. I have particular interest in comparing 5 fungicide spraying systems (trt=5). Each farm had 4 blocks (bk=4) including 2 trees as subsamples
(tree=2) in
which I assessed 100 flowers each one. This is a quick look of the
data:
farm trt bk tree dis tot <fctr> <fctr> <fctr> <fctr> <int> <int> iaras cal 1 1 0 100 iaras cal 1 2 1 100 iaras cal 2 1 1 100 iaras cal 2 2 3 100 iaras cal 3 1 0 100 iaras cal 3 2 5 100... The model I considered was: resp <- with(df, cbind(dis, tot-dis)) m1 = glmer(resp ~ trt + (1|farm/bk) , family = binomial, data=df) I tested the overdispersion with the overdisp_fun() from GLMM page <http://glmm.wikidot.com/faq> chisq ratio p logp 4.191645e+02 3.742540e+00 4.804126e-37 -8.362617e+01 As ratio (residual dev/residual df) > 1, and the p-value < 0.05, I considered to add the observation level random effect (link <http://r.789695.n4.nabble.com/Question-on-overdispersion-
td3049898.html
)
to deal with the overdispersion. farm trt bk tree dis tot tree_id <fctr> <fctr> <fctr> <fctr> <int> <int> <fctr> iaras cal 1 1 0 100 1 iaras cal 1 2 1 100 2 iaras cal 2 1 1 100 3... so now was added a random effect for each row (tree_id) to the model,
but
I am not sure of how to include it. This is my approach: m2 = glmer(resp ~ trt + (1|farm/bk) + (1|tree_id), family = binomial, data=df) I also wonder if farm should be a fixed effect, since it has only 3 levels... m3 = glmer(resp ~ trt * farm + (1|farm:bk) + (1|tree_id), family = binomial, data=df) I really appreciate your suggestions about my model specifications... *Juan? Edwards- - - - - - - - - - - - - - - - - - - - - - - -# PhD
student
- ESALQ-USP/Brazil*
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
The effects of the baseline and control are very large. Do you have quasi
complete separation? All failures at the baseline and all successes at the
control?
Op 25 apr. 2017 3:40 p.m. schreef "Juan Pablo Edwards Molina" <
edwardsmolina at gmail.com>:
Thierry, sorry to bother you again...
I tried to follow your suggestion and I did the herlmert contrasts with
lsmeans package.
dinc <- within(dinc, { tree_id <- as.factor(interaction(farm, trt, bk,
tree)) })
resp1 <- with(dinc, cbind(dis, tot-dis))
m0 = glmer(resp1 ~ trt + farm + (1|tree_id), family = binomial, data=dinc)
summary(m0)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [
glmerMod]
Family: binomial ( logit )
Formula: resp1 ~ trt + farm + (1 | tree_id)
Data: dinc
AIC BIC logLik deviance df.resid
521.5 543.8 -252.7 505.5 112
Scaled residuals:
Min 1Q Median 3Q Max
-0.90445 -0.51114 -0.00572 0.31456 1.04667
Random effects:
Groups Name Variance Std.Dev.
tree_id (Intercept) 1.028 1.014
Number of obs: 120, groups: tree_id, 120
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.87786 0.37604 -12.972 < 2e-16 ***
trtG10 -0.06738 0.49125 -0.137 0.89090
trtG15 0.90620 0.44435 2.039 0.04141 *
trtG20 1.13733 0.43920 2.590 0.00961 **
trtControl 5.10202 0.41215 12.379 < 2e-16 ***
farmstacruz -0.80155 0.30294 -2.646 0.00815 **
farmtaqua -0.84738 0.30659 -2.764 0.00571 **
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Correlation of Fixed Effects:
(Intr) trtG10 trtG15 trtG20 trtCnt frmstc
trtG10 -0.635
trtG15 -0.710 0.538
trtG20 -0.720 0.546 0.604
trtControl -0.763 0.571 0.650 0.651
farmstacruz -0.300 0.022 -0.015 0.010 -0.081
farmtaqua -0.301 0.012 0.004 0.017 -0.083 0.441
### Setting up a custom contrast function
helmert.lsmc <- function(levs, ...) {
M <- as.data.frame(contr.helmert(levs))
names(M) <- paste(levs[-1],"vs calendar")
attr(M, "desc") <- "Helmert contrasts"
M
}
lsmeans(m0, helmert ~ trt, type = "response")
$lsmeans trt prob SE df asymp.LCL asymp.UCL Calendar 0.004374833 0.001541432 NA 0.002191201 0.008715549 G10 0.004090935 0.001498922 NA 0.001993307 0.008377443 G15 0.010757825 0.003091538 NA 0.006116214 0.018855141 G20 0.013517346 0.003832360 NA 0.007740919 0.023502164 Control 0.419339074 0.051564118 NA 0.322885163 0.522377507 Results are averaged over the levels of: farm Confidence level used: 0.95 Intervals are back-transformed from the logit scale $contrasts contrast odds.ratio SE df z.ratio p.value G10 vs calendar 9.348400e-01 4.592373e-01 NA -0.137 0.8909 G15 vs calendar 6.552019e+00 4.909975e+00 NA 2.508 0.0121 G20 vs calendar 1.310737e+01 1.307704e+01 NA 2.579 0.0099 Control vs calendar 1.011296e+08 1.124089e+08 NA 16.582 <.0001 Results are averaged over the levels of: farm Tests are performed on the log odds ratio scale ## Do you think it's correct, if I consider trt calendar as the reference to test my other treatments? Thanks! Juan *Juan* On Mon, Apr 24, 2017 at 11:46 AM, Thierry Onkelinx <thierry.onkelinx at inbo.be
wrote:
Dear Peter, Both models will yield identical results in case tree_id uses unique codes over the blocks. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey 2017-04-24 15:55 GMT+02:00 Peter Claussen <dakotajudo at mac.com>:
Juan, I would model this as m3 = glmer(resp ~ trt * farm + (1| bk/tree), family = binomial, data=df) or m3 = glmer(resp ~ trt * farm + (1| bk) + (1| tree_id), family = binomial, data=df) (I can?t say off the top of my head if what the difference would be if you?re dealing with over-dispersion). 1. I?m assuming that block is a somewhat uniform grouping of trees, so that including block in the model gives you an estimate of spatial variability in the response, and if that is important relative to tree-to-tree variation. 2. You will most certainly want to include trt*farm to test for treatment-by-environment interaction. If interaction is not significant, you may choose to exclude interaction from the model. If there is interaction, then you will want to examine each farm to determine if cross-over interaction present. If your experiment is to determine the ?best? fungicide spraying system, and cross-over interaction is present, then you have no ?best? system. You might have cross-over arising because, say, system 1 ranks ?best? on farm 1, but system 2 ranks ?best? on farm 2. There is extensive literature on the topic, mostly from the plant breeding genotype-by-environment interaction side. Some of the associated statistics implemented in the agricolae package, i.e. AMMI. Peter
On Apr 24, 2017, at 6:56 AM, Juan Pablo Edwards Molina <
edwardsmolina at gmail.com> wrote:
I?m sorry... I?m new in the list, and when I figured out that the
question
would suit best in the mixed model list I had already post it in general R-help. I don?t know if there?s a way to "cancel a question"... I will
take
care of it from now on. Dear Thierry, thanks for your answer. Yes, I am not interested in the effect of a specific farm, they simply represent the total of farms from the region where I want to suggest the best treatments. I Followed your suggestions, but still have a couple of doubts, 1- May "farm" be include as a simple fixed effect or interacting with
the
treatment? m3 = glmer(resp ~ trt * farm + (1|tree_id), family = binomial, data=df) m4 = glmer(resp ~ trt + farm + (1|tree_id), family = binomial, data=df) ?2 - ? In case of significant ?[ trt * farm ], should I report the results for each farm?? Thanks again Thierry, Juan Edwards *Juan* On Mon, Apr 24, 2017 at 4:29 AM, Thierry Onkelinx <
thierry.onkelinx at inbo.be>
wrote:
Dear Juan, Use unique id's for random effects variables. So each bk should only be present in one farm. And each tree_id should be present in only one
bk. In
case each block has different treatments then each tree_id should be
unique
to one combination of bk and trt. Farm has too few levels to be a random effects. So either model is as a fixed effect or drop it. In case you drop it, the information will be picked up by bk. Note that trt + (1|farm) is less complex than trt *
farm.
Assuming that you are not interested in the effect of a specific farm,
you
could use sum, polynomial or helmert contrasts for the farms. Unlike
the
default treatment contrast, these type of contrasts sum to zero. Thus
the
effect of trt will be that for the average farm instead of the
reference
farm. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and
Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able
to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does
not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey 2017-04-21 22:32 GMT+02:00 Juan Pablo Edwards Molina < edwardsmolina at gmail.com>:
I am analyzing data from 3 field experiments (farms=3) for a citrus
flower
disease: response variable is binomial because the flower can only be diseased or healthy. I have particular interest in comparing 5 fungicide spraying systems (trt=5). Each farm had 4 blocks (bk=4) including 2 trees as subsamples
(tree=2) in
which I assessed 100 flowers each one. This is a quick look of the
data:
farm trt bk tree dis tot <fctr> <fctr> <fctr> <fctr> <int> <int> iaras cal 1 1 0 100 iaras cal 1 2 1 100 iaras cal 2 1 1 100 iaras cal 2 2 3 100 iaras cal 3 1 0 100 iaras cal 3 2 5 100... The model I considered was: resp <- with(df, cbind(dis, tot-dis)) m1 = glmer(resp ~ trt + (1|farm/bk) , family = binomial, data=df) I tested the overdispersion with the overdisp_fun() from GLMM page <http://glmm.wikidot.com/faq> chisq ratio p logp 4.191645e+02 3.742540e+00 4.804126e-37 -8.362617e+01 As ratio (residual dev/residual df) > 1, and the p-value < 0.05, I considered to add the observation level random effect (link <http://r.789695.n4.nabble.com/Question-on-overdispersion-td
3049898.html
)
to deal with the overdispersion. farm trt bk tree dis tot tree_id <fctr> <fctr> <fctr> <fctr> <int> <int> <fctr> iaras cal 1 1 0 100 1 iaras cal 1 2 1 100 2 iaras cal 2 1 1 100 3... so now was added a random effect for each row (tree_id) to the model,
but
I am not sure of how to include it. This is my approach: m2 = glmer(resp ~ trt + (1|farm/bk) + (1|tree_id), family = binomial, data=df) I also wonder if farm should be a fixed effect, since it has only 3 levels... m3 = glmer(resp ~ trt * farm + (1|farm:bk) + (1|tree_id), family = binomial, data=df) I really appreciate your suggestions about my model specifications... *Juan? Edwards- - - - - - - - - - - - - - - - - - - - - - - -# PhD
student
- ESALQ-USP/Brazil*
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Perhaps I?m missing something. Are the individual trees the experimental units, and are you only taking one count per tree, or are there multiple counts? That makes me think that that tree_id variance is effectively residual variance and that there is no random effect other than residual. You have 112 df.resid, and 120 groups : tree_id. So what do you gain by using glmer as opposed to gem with a quasibinomial family? Peter
On Apr 25, 2017, at 8:40 AM, Juan Pablo Edwards Molina <edwardsmolina at gmail.com> wrote:
Thierry, sorry to bother you again...
I tried to follow your suggestion and I did the herlmert contrasts with lsmeans package.
dinc <- within(dinc, { tree_id <- as.factor(interaction(farm, trt, bk, tree)) })
resp1 <- with(dinc, cbind(dis, tot-dis))
m0 = glmer(resp1 ~ trt + farm + (1|tree_id), family = binomial, data=dinc)
summary(m0)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [
glmerMod]
Family: binomial ( logit )
Formula: resp1 ~ trt + farm + (1 | tree_id)
Data: dinc
AIC BIC logLik deviance df.resid
521.5 543.8 -252.7 505.5 112
Scaled residuals:
Min 1Q Median 3Q Max
-0.90445 -0.51114 -0.00572 0.31456 1.04667
Random effects:
Groups Name Variance Std.Dev.
tree_id (Intercept) 1.028 1.014
Number of obs: 120, groups: tree_id, 120
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.87786 0.37604 -12.972 < 2e-16 ***
trtG10 -0.06738 0.49125 -0.137 0.89090
trtG15 0.90620 0.44435 2.039 0.04141 *
trtG20 1.13733 0.43920 2.590 0.00961 **
trtControl 5.10202 0.41215 12.379 < 2e-16 ***
farmstacruz -0.80155 0.30294 -2.646 0.00815 **
farmtaqua -0.84738 0.30659 -2.764 0.00571 **
---
Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
Correlation of Fixed Effects:
(Intr) trtG10 trtG15 trtG20 trtCnt frmstc
trtG10 -0.635
trtG15 -0.710 0.538
trtG20 -0.720 0.546 0.604
trtControl -0.763 0.571 0.650 0.651
farmstacruz -0.300 0.022 -0.015 0.010 -0.081
farmtaqua -0.301 0.012 0.004 0.017 -0.083 0.441
### Setting up a custom contrast function
helmert.lsmc <- function(levs, ...) {
M <- as.data.frame(contr.helmert(levs))
names(M) <- paste(levs[-1],"vs calendar")
attr(M, "desc") <- "Helmert contrasts"
M
}
lsmeans(m0, helmert ~ trt, type = "response")
$lsmeans trt prob SE df asymp.LCL asymp.UCL Calendar 0.004374833 0.001541432 NA 0.002191201 0.008715549 G10 0.004090935 0.001498922 NA 0.001993307 0.008377443 G15 0.010757825 0.003091538 NA 0.006116214 0.018855141 G20 0.013517346 0.003832360 NA 0.007740919 0.023502164 Control 0.419339074 0.051564118 NA 0.322885163 0.522377507 Results are averaged over the levels of: farm Confidence level used: 0.95 Intervals are back-transformed from the logit scale $contrasts contrast odds.ratio SE df z.ratio p.value G10 vs calendar 9.348400e-01 4.592373e-01 NA -0.137 0.8909 G15 vs calendar 6.552019e+00 4.909975e+00 NA 2.508 0.0121 G20 vs calendar 1.310737e+01 1.307704e+01 NA 2.579 0.0099 Control vs calendar 1.011296e+08 1.124089e+08 NA 16.582 <.0001 Results are averaged over the levels of: farm Tests are performed on the log odds ratio scale ## Do you think it's correct, if I consider trt calendar as the reference to test my other treatments? Thanks! Juan Juan On Mon, Apr 24, 2017 at 11:46 AM, Thierry Onkelinx <thierry.onkelinx at inbo.be <mailto:thierry.onkelinx at inbo.be>> wrote: Dear Peter, Both models will yield identical results in case tree_id uses unique codes over the blocks. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey 2017-04-24 15:55 GMT+02:00 Peter Claussen <dakotajudo at mac.com <mailto:dakotajudo at mac.com>>: Juan, I would model this as m3 = glmer(resp ~ trt * farm + (1| bk/tree), family = binomial, data=df) or m3 = glmer(resp ~ trt * farm + (1| bk) + (1| tree_id), family = binomial, data=df) (I can?t say off the top of my head if what the difference would be if you?re dealing with over-dispersion). 1. I?m assuming that block is a somewhat uniform grouping of trees, so that including block in the model gives you an estimate of spatial variability in the response, and if that is important relative to tree-to-tree variation. 2. You will most certainly want to include trt*farm to test for treatment-by-environment interaction. If interaction is not significant, you may choose to exclude interaction from the model. If there is interaction, then you will want to examine each farm to determine if cross-over interaction present. If your experiment is to determine the ?best? fungicide spraying system, and cross-over interaction is present, then you have no ?best? system. You might have cross-over arising because, say, system 1 ranks ?best? on farm 1, but system 2 ranks ?best? on farm 2. There is extensive literature on the topic, mostly from the plant breeding genotype-by-environment interaction side. Some of the associated statistics implemented in the agricolae package, i.e. AMMI. Peter
On Apr 24, 2017, at 6:56 AM, Juan Pablo Edwards Molina <edwardsmolina at gmail.com <mailto:edwardsmolina at gmail.com>> wrote: I?m sorry... I?m new in the list, and when I figured out that the question would suit best in the mixed model list I had already post it in general R-help. I don?t know if there?s a way to "cancel a question"... I will take care of it from now on. Dear Thierry, thanks for your answer. Yes, I am not interested in the effect of a specific farm, they simply represent the total of farms from the region where I want to suggest the best treatments. I Followed your suggestions, but still have a couple of doubts, 1- May "farm" be include as a simple fixed effect or interacting with the treatment? m3 = glmer(resp ~ trt * farm + (1|tree_id), family = binomial, data=df) m4 = glmer(resp ~ trt + farm + (1|tree_id), family = binomial, data=df) ?2 - ? In case of significant ?[ trt * farm ], should I report the results for each farm?? Thanks again Thierry, Juan Edwards *Juan* On Mon, Apr 24, 2017 at 4:29 AM, Thierry Onkelinx <thierry.onkelinx at inbo.be <mailto:thierry.onkelinx at inbo.be>> wrote:
Dear Juan, Use unique id's for random effects variables. So each bk should only be present in one farm. And each tree_id should be present in only one bk. In case each block has different treatments then each tree_id should be unique to one combination of bk and trt. Farm has too few levels to be a random effects. So either model is as a fixed effect or drop it. In case you drop it, the information will be picked up by bk. Note that trt + (1|farm) is less complex than trt * farm. Assuming that you are not interested in the effect of a specific farm, you could use sum, polynomial or helmert contrasts for the farms. Unlike the default treatment contrast, these type of contrasts sum to zero. Thus the effect of trt will be that for the average farm instead of the reference farm. Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey 2017-04-21 22:32 GMT+02:00 Juan Pablo Edwards Molina < edwardsmolina at gmail.com <mailto:edwardsmolina at gmail.com>>:
I am analyzing data from 3 field experiments (farms=3) for a citrus flower disease: response variable is binomial because the flower can only be diseased or healthy. I have particular interest in comparing 5 fungicide spraying systems (trt=5). Each farm had 4 blocks (bk=4) including 2 trees as subsamples (tree=2) in which I assessed 100 flowers each one. This is a quick look of the data: farm trt bk tree dis tot <fctr> <fctr> <fctr> <fctr> <int> <int> iaras cal 1 1 0 100 iaras cal 1 2 1 100 iaras cal 2 1 1 100 iaras cal 2 2 3 100 iaras cal 3 1 0 100 iaras cal 3 2 5 100... The model I considered was: resp <- with(df, cbind(dis, tot-dis)) m1 = glmer(resp ~ trt + (1|farm/bk) , family = binomial, data=df) I tested the overdispersion with the overdisp_fun() from GLMM page <http://glmm.wikidot.com/faq <http://glmm.wikidot.com/faq>> chisq ratio p logp 4.191645e+02 3.742540e+00 4.804126e-37 -8.362617e+01 As ratio (residual dev/residual df) > 1, and the p-value < 0.05, I considered to add the observation level random effect (link <http://r.789695.n4.nabble.com/Question-on-overdispersion-td3049898.html <http://r.789695.n4.nabble.com/Question-on-overdispersion-td3049898.html>
)
to deal with the overdispersion.
farm trt bk tree dis tot tree_id <fctr> <fctr>
<fctr> <fctr> <int> <int> <fctr>
iaras cal 1 1 0 100 1
iaras cal 1 2 1 100 2
iaras cal 2 1 1 100 3...
so now was added a random effect for each row (tree_id) to the model, but
I
am not sure of how to include it. This is my approach:
m2 = glmer(resp ~ trt + (1|farm/bk) + (1|tree_id), family = binomial,
data=df)
I also wonder if farm should be a fixed effect, since it has only 3
levels...
m3 = glmer(resp ~ trt * farm + (1|farm:bk) + (1|tree_id), family =
binomial, data=df)
I really appreciate your suggestions about my model specifications...
*Juan? Edwards- - - - - - - - - - - - - - - - - - - - - - - -# PhD student
- ESALQ-USP/Brazil*
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org <mailto:R-sig-mixed-models at r-project.org> mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org <mailto:R-sig-mixed-models at r-project.org> mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
_______________________________________________ R-sig-mixed-models at r-project.org <mailto:R-sig-mixed-models at r-project.org> mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models <https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
Should read ?glm?, un-autocorrected.
On Apr 25, 2017, at 9:29 AM, Peter Claussen <dakotajudo at mac.com> wrote: So what do you gain by using glmer as opposed to gem with a quasibinomial family?