lmer analysis of identical twins data
3 messages · Oghabian, Ali, David Duffy
2 days later
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> on behalf of Oghabian, Ali <ali.oghabian at helsinki.fi>
Sent: Friday, 18 December 2020 8:59:56 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] lmer analysis of identical twins data
Hi all!
I have a question about using the lmer() function from the lme4 package in studies of identical twins, and I would appreciate your help with it.
We have a dataset of measured PFAS pollutant levels from ~50 pairs of monozygotic (identical) twins (n=100 individuals). The goal is to detect PFAS pollutants whose levels differ significantly between the leaner individuals (L) and the individuals with obesity (F):
1. As one solution, I was planning to use lmer() to test for differential PFAS levels while adjusting for sex, age (young/old) and the sample extraction year. As a random effect I was thinking of using the family IDs (i.e. to capture the extreme similarity between individuals that is caused by 'twinship'). The model formula is therefore 'pfasLogStandardized ~ LF + sex + youngOrOld + yearClass + (1 | familyID)'. However, I am wondering whether this is the best approach, since treating 'twinship' as a random effect means that each level of the random effect contains only 2 samples (the two twins of one family). It seems that with such small groups the fitted regressions will have high variance. Is this the best approach in your opinion? Note that family ID is not independent of sex and age, since identical twins also share the same sex and age.
2. The alternative that comes to mind is not to adjust for familyID (or twinship) at all, but to run an ANOVA or Student's t-test with the model 'pfasLogStandardized ~ LF + sex + youngOrOld + yearClass'. The problem here is that the analysis will not account for the extreme similarity between the twins.
3. Another approach is to swap the roles in the lmer formula: adjust for familyID as a fixed covariate and use a factor combining the age, sex and year information as the random effect, e.g. something like 'pfasLogStandardized ~ LF + familyID + (1 | RandFact)', where RandFact = as.factor(paste(sex, youngOrOld, yearClass)).
I would really appreciate your opinion on this issue and, more generally, on the best way to run this kind of analysis on identical-twin data while accounting for the extreme similarity of the twins.
Cheers,
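For readers following along, here is a minimal sketch of the model from option 1, fitted to simulated data. Everything about the data is an assumption made for illustration: the effect sizes, the factor levels, and in particular the choice that every pair is discordant (one L twin and one F twin). Only the column names and the formula come from the post.

```r
library(lme4)

set.seed(1)
n_pairs <- 50

# Pair-level variables: identical twins share sex, age class, sampling year
# and a common family effect (all values here are simulated assumptions).
pairs <- data.frame(
  familyID   = factor(seq_len(n_pairs)),
  sex        = factor(sample(c("female", "male"), n_pairs, replace = TRUE)),
  youngOrOld = factor(sample(c("young", "old"), n_pairs, replace = TRUE)),
  yearClass  = factor(sample(c("early", "late"), n_pairs, replace = TRUE)),
  famEffect  = rnorm(n_pairs, sd = 1)
)

# Two rows per pair, ordered L then F; every pair is assumed discordant.
twins <- pairs[rep(seq_len(n_pairs), each = 2), ]
twins$LF <- factor(rep(c("L", "F"), times = n_pairs), levels = c("L", "F"))
twins$pfasLogStandardized <- 0.3 * (twins$LF == "F") +
  twins$famEffect + rnorm(nrow(twins), sd = 0.5)

# Random intercept per twin pair, as in option 1 of the post.
fit <- lmer(pfasLogStandardized ~ LF + sex + youngOrOld + yearClass +
              (1 | familyID), data = twins)
lf_lmer <- unname(fixef(fit)["LFF"])

# Within-pair contrast: F twin minus L twin, one difference per family.
pair_diff <- twins$pfasLogStandardized[twins$LF == "F"] -
  twins$pfasLogStandardized[twins$LF == "L"]
lf_paired <- mean(pair_diff)

c(lmer = lf_lmer, paired = lf_paired)
```

In this fully discordant setup the two estimates should agree closely, because sex, age class and year do not vary within a pair, so the LF effect is estimated from the 50 within-pair contrasts. The family variance is likewise estimated across the 50 pairs rather than within any single pair of size 2.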
We have PFAS measured pollution dataset constructed of ~50 (n=100) monozygotic (identical) twins. The goal is to detect the significantly differential PFAS pollutants between the leaner individuals (L) and those individuals with obesity (F):
'pfasLogStandardized ~ LF + sex + youngOrOld + yearClass + (1 | familyID)'.
This is the correct approach, given that this is a co-twin control design. You might first check the magnitude of the twin intraclass correlation, but I think it very unlikely that you can ignore genetics and shared environment. How has zygosity been diagnosed? The model can be extended if you have dizygotic twins present as well.
Cheers,
David Duffy.
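One possible way to check the twin intraclass correlation mentioned above is to fit an intercept-only random-intercept model and take the share of the total variance attributed to familyID. The sketch below uses simulated data with assumed variances; only the column names come from the thread, and this is not necessarily the calculation the reply had in mind.

```r
library(lme4)

set.seed(2)
n_pairs <- 50

# Simulated pairs: a shared family effect plus individual noise (assumed values).
fam <- rnorm(n_pairs, sd = 1)
twins <- data.frame(
  familyID = factor(rep(seq_len(n_pairs), each = 2)),
  pfasLogStandardized = rep(fam, each = 2) + rnorm(2 * n_pairs, sd = 0.5)
)

# Intercept-only model: variance is split into between-pair and residual parts.
null_fit <- lmer(pfasLogStandardized ~ 1 + (1 | familyID), data = twins)
vc <- as.data.frame(VarCorr(null_fit))

var_family <- vc$vcov[vc$grp == "familyID"]
var_resid  <- vc$vcov[vc$grp == "Residual"]

# Intraclass correlation: proportion of variance shared within a twin pair.
icc <- var_family / (var_family + var_resid)
icc
```

An ICC near zero would suggest that the pairing adds little, while a large ICC supports keeping the (1 | familyID) term.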