Hello,

I have an experiment with six streams in two groups (regulated and control). In each stream there were five sites (Transect). At each site there were unreplicated nutrient treatments (N, P, N+P, C). Light was measured at each site.

Stream     Regulated  Transect  Nitrogen  Phosphorus  R             Light
Cranberry  Regulated  30        C         C           -0.102512563  2042.266667
Cranberry  Regulated  30        C         P           -0.08877551   2042.266667
Cranberry  Regulated  50        C         C           -0.107142857  1283.3
Cranberry  Regulated  50        N         C           -0.059375     1283.3
Cranberry  Regulated  70        C         C           -0.067346939  1336.6
Cranberry  Regulated  70        N         C           -0.063636364  1336.6
...

I would like to know if the response differs among groups (regulated vs control) or is related to light or nutrient treatment. I have two separate analyses, N = 107 and N = 66, with different numbers of missing values (N = 120 before missing values).

I think the appropriate model structure is:

    lme(Response ~ Regulated + Light + Nitrogen + Phosphorus + Nitrogen:Phosphorus,
        random = ~1|Stream/Transect, data = data, method = "ML")

However, I'm concerned that the model is far too complex for my sample size. Any advice would be appreciated!

Thanks!

John P. Ludlam, Ph.D. - Fitchburg State University
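The sample sizes quoted above follow from the design as described. A minimal sketch of that bookkeeping, assuming a fully crossed layout: every stream name except "Cranberry" and the transect positions 10 and 90 are hypothetical placeholders, not values from the post.

```r
# Hypothetical design grid; only "Cranberry" and transects 30, 50, 70
# appear in the post, the remaining labels are placeholders.
streams <- c("Cranberry", "StreamB", "StreamC", "StreamD", "StreamE", "StreamF")

design <- expand.grid(
  Phosphorus = c("C", "P"),
  Nitrogen   = c("C", "N"),
  Transect   = c(10, 30, 50, 70, 90),
  Stream     = streams,
  stringsAsFactors = FALSE
)

# 6 streams x 5 transects x 4 nutrient treatments = 120 planned observations
nrow(design)

# Cross-tabulate site (Stream x Transect) against treatment (N x P):
# each of the 30 sites carries each of the 4 treatments exactly once.
cells <- table(
  site      = interaction(design$Stream, design$Transect),
  treatment = interaction(design$Nitrogen, design$Phosphorus)
)
all(cells == 1)

# Share of the planned observations retained in each of the two analyses
retained <- c(N107 = 107, N66 = 66) / nrow(design)
round(retained, 2)
```

The cross-tabulation makes the "unreplicated" wording concrete: a treatment occurs once per site, but each site still contributes up to four observations.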
Rules of thumb for model complexity with small sample size in lme()
3 messages · John Ludlam, Thierry Onkelinx, Ben Bolker
Dear John,

Since you don't have replication at the transect level, you should omit that from the random effects structure.

I tend to strive for at least 10 observations per parameter. More is better of course. Assuming that Nitrogen and Phosphorus are factors with two levels, then Regulated + Light + Nitrogen + Phosphorus + Nitrogen:Phosphorus requires 5 parameters. Add 1 for the random effect and you have 6 parameters or at least 60 observations. So this model might work with N = 66. However you will need to carefully check the model.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician
Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////

2018-04-04 15:55 GMT+02:00 John Ludlam <jludlam at fitchburgstate.edu>:
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
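The parameter tally in Thierry's reply can be checked against the fixed-effects design matrix. This is a rough sketch on a hypothetical design grid: the stream labels, the 3/3 split between regulated and control streams, and the light values are all assumptions made for illustration. Note that `model.matrix()` also counts the intercept, which the tally of 5 appears to leave out.

```r
set.seed(1)

# Hypothetical grid: 6 streams x 5 transects x 2 x 2 nutrient treatments
design <- expand.grid(
  Phosphorus = factor(c("C", "P")),
  Nitrogen   = factor(c("C", "N")),
  Transect   = factor(1:5),
  Stream     = factor(paste0("S", 1:6))
)

# Assumption: three regulated and three control streams
design$Regulated <- factor(
  ifelse(design$Stream %in% c("S1", "S2", "S3"), "Regulated", "Control")
)

# One (made-up) light value per site, shared by all treatments at that site
site <- interaction(design$Stream, design$Transect, drop = TRUE)
design$Light <- runif(nlevels(site), 500, 2500)[as.integer(site)]

X <- model.matrix(
  ~ Regulated + Light + Nitrogen + Phosphorus + Nitrogen:Phosphorus,
  data = design
)
colnames(X)

n_coef   <- ncol(X)       # fixed-effect coefficients including the intercept
n_slopes <- n_coef - 1    # the 5 terms counted in the reply
n_random <- 1             # one random-intercept variance for Stream

# Rule of thumb from the reply: at least 10 observations per parameter
min_n_reply <- 10 * (n_slopes + n_random)

# Stricter tally that also counts the intercept and the residual variance
min_n_strict <- 10 * (n_coef + n_random + 1)

c(reply = min_n_reply, strict = min_n_strict)
```

Under the reply's count the N = 66 analysis clears the threshold; under the stricter count it does not, which is one more reason to check the fitted model carefully, as the reply advises.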
As I interpret the description, there is actually replication at the transect level; ~ 2 samples per transect (30 transects, total N=66). In principle one could even ask if there are variations in the effect of N and P across transects (this is essentially a very unbalanced randomized-block design), but I agree that would be unrealistically optimistic. Otherwise I agree with Thierry.
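Whether transects are replicated can be checked by counting observations per Stream/Transect combination, and the two candidate random-effects structures can then be compared directly. The sketch below uses simulated data only: the stream labels, the 3/3 regulated/control split, the light values, the variance components, and which rows go missing are all assumptions, so the output says nothing about the real experiment.

```r
library(nlme)

set.seed(42)

# Simulated stand-in for the real data (all values hypothetical)
d <- expand.grid(
  Phosphorus = factor(c("C", "P")),
  Nitrogen   = factor(c("C", "N")),
  Transect   = factor(1:5),
  Stream     = factor(paste0("S", 1:6))
)
d$Regulated <- factor(
  ifelse(d$Stream %in% c("S1", "S2", "S3"), "Regulated", "Control")
)
site <- interaction(d$Stream, d$Transect, drop = TRUE)
d$Light <- runif(nlevels(site), 500, 2500)[as.integer(site)]
d$Response <- -0.08 +
  rnorm(nlevels(d$Stream), sd = 0.05)[as.integer(d$Stream)] +  # stream effect
  rnorm(nlevels(site), sd = 0.03)[as.integer(site)] +          # transect effect
  rnorm(nrow(d), sd = 0.01)                                    # residual

# Keep 66 of the 120 rows to mimic the smaller analysis
d66 <- d[sort(sample(nrow(d), 66)), ]

# Observations per transect: 66 / 30 = 2.2 on average
per_site <- table(interaction(d66$Stream, d66$Transect))
mean(per_site)

# Transect nested in Stream, versus Stream only
fit_nested <- lme(
  Response ~ Regulated + Light + Nitrogen + Phosphorus + Nitrogen:Phosphorus,
  random = ~ 1 | Stream/Transect, data = d66, method = "ML"
)
fit_stream <- lme(
  Response ~ Regulated + Light + Nitrogen + Phosphorus + Nitrogen:Phosphorus,
  random = ~ 1 | Stream, data = d66, method = "ML"
)

# Likelihood-ratio comparison of the two random-effects structures
anova(fit_stream, fit_nested)
```

With real data, `table()` on the non-missing rows shows how many transects have more than one observation, which is what makes the Transect variance estimable at all.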
On 18-04-04 11:10 AM, Thierry Onkelinx wrote: