Dear list,

I have a dichotomous outcome (child mortality) with a very high mean (0.9946) in a large dataset (3.5m). The "Error: pwrssUpdate did not converge in (maxit) iterations" occurs in most cases.

I've tried using blme to combat complete separation, with fixef.priors with SDs from 1 to 10, without success. The variance explained by the random family effect is numerically very small (0.000698), though I suppose that still amounts to ca. 7%. There are few members per family (~ 2 on average). Fitting a glm without the family intercepts gives fairly different results (which I expect), judging by the few models that ran. Using less data sometimes leads to convergence, depending on the sample I draw, I suppose. I'm using bobyqa.

I thought maybe the problem still is complete separation and I'm just being too timid with the blme prior. Oddly (maybe not), the only model where I do get convergence is one where I accidentally mis-specified my sample, so my outcome was censored (hence the mean but not the intercept was lower). I'm attaching the model.

Best regards,
Ruben Arslan

## Cov prior  : idParents ~ wishart(df = 3.5, scale = Inf, posterior.scale = cov, common.scale = TRUE)
## Fixef prior: normal(sd = c(9, 9, ...), corr = c(0 ...), common.scale = FALSE)
## Prior dev  : 143
##
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [bglmerMod]
##  Family: binomial ( logit )
## Formula: surviveR ~ maternalage.factor + paternalloss + maternalloss +
##     center(nr.siblings) + birth.cohort + male + paternalage.mean +
##     paternalage.factor + (1 | idParents)
##    Data: swed.2
## Control: control_defaults
##  Subset: survive1y == TRUE & byear < 2000
##
##      AIC      BIC   logLik deviance df.resid
##   938507   938795  -469231   938463  3691460
##
## Scaled residuals:
##     Min      1Q  Median      3Q     Max
## -134.50    0.04    0.05    0.06    3.07
##
## Random effects:
##  Groups    Name        Variance Std.Dev.
##  idParents (Intercept) 0.000698 0.0264
## Number of obs: 3691482, groups: idParents, 1907489
##
## Fixed effects:
##                           Estimate Std. Error z value Pr(>|z|)
## (Intercept)                6.84279    0.03471   197.2  < 2e-16 ***
## maternalage.factor(14,20]  0.16356    0.01600    10.2  < 2e-16 ***
## maternalage.factor(35,61] -0.18822    0.00871   -21.6  < 2e-16 ***
## paternallossTRUE          -0.41957    0.04694    -8.9  < 2e-16 ***
## paternallossNA            -0.30693    0.01819   -16.9  < 2e-16 ***
## maternallossTRUE          -0.67635    0.08228    -8.2  < 2e-16 ***
## maternallossNA            -0.11658    0.02607    -4.5  7.8e-06 ***
## center(nr.siblings)        0.27749    0.00288    96.2  < 2e-16 ***
## birth.cohort(1970,1977]    0.35761    0.02833    12.6  < 2e-16 ***
## birth.cohort(1977,1984]    0.72394    0.03203    22.6  < 2e-16 ***
## birth.cohort(1984,1991]    0.86295    0.03173    27.2  < 2e-16 ***
## birth.cohort(1991,1999]   -5.95342    0.01933  -308.0  < 2e-16 ***
## male                      -0.01946    0.00512    -3.8  0.00015 ***
## paternalage.mean           0.88269    0.01168    75.5  < 2e-16 ***
## paternalage.factor(25,30] -0.53984    0.01068   -50.5  < 2e-16 ***
## paternalage.factor(30,35] -1.18842    0.01360   -87.4  < 2e-16 ***
## paternalage.factor(35,40] -1.59243    0.01815   -87.7  < 2e-16 ***
## paternalage.factor(40,45] -2.02418    0.02429   -83.3  < 2e-16 ***
## paternalage.factor(45,50] -2.46269    0.03266   -75.4  < 2e-16 ***
## paternalage.factor(50,55] -3.11201    0.04679   -66.5  < 2e-16 ***
## paternalage.factor(55,90] -3.67437    0.06747   -54.5  < 2e-16 ***

## R version 3.1.0 (2014-04-10)
## Platform: x86_64-redhat-linux-gnu (64-bit)
##
## other attached packages:
## [1] mgcv_1.8-4       nlme_3.1-119     stringr_0.6.2    pander_0.5.1
## [5] blme_1.0-2       formr_0.1.11     lme4_1.1-7       Rcpp_0.11.4
## [9] Matrix_1.1-5     ggplot2_1.0.0    data.table_1.9.5 knitr_1.9
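For readers who want to put the numbers in the post and the model summary on a common scale, here is a minimal arithmetic sketch in plain Python. It is not part of the original analysis. The variable names are made up for illustration, and the intraclass correlation uses the latent-variable convention (level-1 variance fixed at pi^2/3 for a logit link), which is an assumption and may not be the calculation the poster had in mind.

```python
import math

def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Map a probability to log-odds."""
    return math.log(p / (1.0 - p))

# Values copied from the post and from the model summary.
outcome_mean = 0.9946    # reported mean of the outcome
intercept = 6.84279      # fixed-effect intercept of the (mis-specified) model
re_variance = 0.000698   # variance of the idParents random intercept

# Log-odds implied by the raw outcome mean.
logit_of_mean = logit(outcome_mean)

# Probability implied by the intercept alone, i.e. with every covariate
# at its reference level or zero and the random effect at zero.
p_at_intercept = inv_logit(intercept)

# Latent-scale intraclass correlation (assumed convention: the logistic
# residual variance is pi^2 / 3).
latent_icc = re_variance / (re_variance + math.pi ** 2 / 3)

print(f"logit of outcome mean:      {logit_of_mean:.3f}")
print(f"probability at intercept:   {p_at_intercept:.4f}")
print(f"latent-scale ICC:           {latent_icc:.6f}")
```

Under these assumptions the outcome mean corresponds to a log-odds of about 5.2, the intercept to a probability of about 0.9989, and the latent-scale ICC comes out near 0.0002, which is far below the ca. 7% guessed in the post; other definitions of explained variance would give other numbers.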
Little variability in outcome; "pwrssUpdate did not converge"
3 messages · Ruben Arslan, David Duffy
1 day later
On Mon, 23 Mar 2015, Ruben Arslan wrote:
I have a dichotomous outcome (child mortality) with a very high mean (0.9946) in a large dataset (3.5m). I thought maybe the problem still is complete separation and I'm just being too timid with the blme prior. Oddly (maybe not), the only model where I do get convergence is one where I accidentally mis-specified my sample, so my outcome was censored (hence the mean but not the intercept was lower). I'm attaching the model.
The misspecified model?

Maybe you should be doing something else, such as bivariate logistic (dropping extra offspring) or marginal models? If you are interested just in familial aggregation, you can do the conditional analysis using just the ~18000 odd families with one or more events, using the other families just to estimate offsets.

A few random thoughts ;)

| David Duffy (MBBS PhD)
| email: David.Duffy at qimrberghofer.edu.au ph: INT+61+7+3362-0217 fax: -0101
| Genetic Epidemiology, QIMR Berghofer Institute of Medical Research
| 300 Herston Rd, Brisbane, Queensland 4006, Australia GPG 4D0B994A
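As a rough illustration of why a conditional analysis can concentrate on families with events, here is a minimal Python sketch of the per-family contribution in conditional logistic regression, one standard form of conditional analysis and not necessarily the exact one suggested above. The function name `conditional_likelihood`, the single covariate `x`, and the slope `beta` are hypothetical; nothing here comes from the poster's data.

```python
import math
from itertools import combinations

def conditional_likelihood(x, y, beta):
    """Likelihood of one family's 0/1 outcomes given its number of events.

    x    : list with one covariate value per sibling (single covariate)
    y    : list of 0/1 outcomes, same length as x
    beta : slope on the logit scale

    The family-specific intercept cancels out of this ratio, which is
    why it does not appear as an argument.
    """
    k = sum(y)  # number of events observed in the family
    # Numerator: the arrangement of events that was actually observed.
    num = math.exp(beta * sum(xi for xi, yi in zip(x, y) if yi == 1))
    # Denominator: every way of assigning k events to the siblings.
    den = sum(
        math.exp(beta * sum(x[i] for i in subset))
        for subset in combinations(range(len(y)), k)
    )
    return num / den

# A family in which no sibling (or every sibling) has the event
# contributes a constant 1, whatever beta is.
print(conditional_likelihood([0.0, 1.0], [0, 0], beta=0.5))
# A discordant pair carries information about beta.
print(conditional_likelihood([0.0, 1.0], [0, 1], beta=0.5))
```

In this sketch, families whose members all share the same outcome add nothing to the conditional likelihood for `beta`, so only families with a mix of outcomes are informative about within-family effects.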
Thanks for your response!

I'd prefer to model this the same way I did in three other populations (with lower means and sample sizes) for the sake of presentation and comparability. The basic idea (sorry that wasn't clear) is a sibling control design, examining the effect of paternal age within families (i.e. no marginal models for me).

I'm not sure I understand how I could estimate offsets separately from the conditional analysis. I've tried including only families with at least two sibs (nope), but wouldn't selecting based on the outcome introduce bias? How would I remedy that?

My previous mail contained a mis-specified model, since that happened to give any output and I thought it might be informative. It also had an odd prior specification. The default specification is c(10,2.5). Unthinkingly, I set a very high SD on the slopes, i.e. c(9,9). That's not a good idea, since such high SDs on the normal prior put a lot of weight on probabilities near 0 and 1 once the logit is transformed back (there's a section on this in 2.6 of the MCMCglmm course notes).

Unfortunately, even though I do get improved results with small subsamples (30k) using the default prior spec (as opposed to vanilla glmer), the models still do not converge with the 3.5m dataset. I was thinking that I might get closer by simply splitting my sample? I'm of course still hoping there's some control I've missed.
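The point about wide normal priors on the logit scale can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not code from blme or MCMCglmm: it treats a single logit-scale quantity as Normal(0, sd) and asks how much prior mass lands on probabilities below 0.01 or above 0.99. The helper names `normal_cdf` and `extreme_mass` and the 0.99 cut-off are illustrative assumptions.

```python
import math

def normal_cdf(x, sd):
    """CDF of a Normal(0, sd) distribution evaluated at x."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))

def extreme_mass(sd, p_cut=0.99):
    """Prior mass on probabilities outside [1 - p_cut, p_cut].

    Assumes the logit-scale quantity is Normal(0, sd); the mass is the
    two-sided tail beyond logit(p_cut).
    """
    z = math.log(p_cut / (1.0 - p_cut))  # logit of the cut-off
    return 2.0 * (1.0 - normal_cdf(z, sd))

# Compare the SDs mentioned in the thread: 2.5 and 10 (defaults), 9 (used).
for sd in (2.5, 9.0, 10.0):
    print(f"sd = {sd:>4}: mass outside [0.01, 0.99] = {extreme_mass(sd):.3f}")
```

Under these assumptions an SD of 2.5 leaves roughly 7% of the prior mass on such extreme probabilities, while an SD of 9 puts about 60% there, which is the sense in which a "vague" prior on the logit scale is quite informative on the probability scale.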