Logistic regression with spatial autocorrelation structure

Mon, Feb 7, 2011 10:56 AM

On Mon, Feb 7, 2011 at 10:58 AM, <Dale.W.Steele at gmail.com> wrote:

Didn't I answer 2 in my previous message in the part about

(Intercept)   priorprob  genderMale         wbc
 -8.2886183   0.0502540   1.1870539   0.3412998

[,1]
[1,] -2.299365

[,1]
[1,] 0.09117557

It is certainly possible to automate the process a bit more with a
predict method but, for the time being, I think that should be the way
to go about it.

If you have a simple random effect for doc, as in model m2, then to
get a prediction for a specific physician you add the random effect
for, say, doc == "18", to the value of eta before transforming it.
You can get the conditional modes of the random effects with the ranef
extractor.

Thanks. --Dale

On Feb 1, 2011 1:08pm, Douglas Bates <bates at stat.wisc.edu> wrote:

On Tue, Feb 1, 2011 at 12:05 PM, Douglas Bates bates at stat.wisc.edu> wrote:

On Mon, Jan 31, 2011 at 11:45 AM, Dale Steele dale.w.steele at gmail.com>
wrote:

Dear mixed-modeling ?experts,

I'm interested in modeling the probability of appendicitis in patients

with abdominal pain.

The R binary data file 'http://www.ped-em.org/appy.rda' contains

the following variables from a pilot study of 138 children with

abdominal pain.

Thank you for providing the data.

'dx' ? ? ? ?eventual diagnosis: ? 0=no appendicitis, 1=appendicitis

'gender' ? ? ? Male/Female

'wbc' ? ? ? total white blood cell count

'priorprob' ? ?Clinical predicted probability of appendicitis

'doc' ? ? ? doctor who assigned 'priorprob'

After taking a history and performing a physical examination, the ER
doctor

was asked to make a vertical mark on a 100 mm horizontal line to
represent

her estimate of the (percent) probability that the patient had
appendicitis.

My initial thought was to fit a multiple logistic regression model:

m1

However, it seems likely that each doctor interpreted the probability
scale

differently. ?The 23 doctors evaluated from 1 to 17 patients each. I'm

not primarily interest in predictions by a specific clinician. ?Thus,

it seems to make sense to fit a generalized linear mixed model.

At this point I get muddled. Have I correctly specified a random

intercept model (m2) and a random intercept/random slope model (m3)?

Are there other sensible models?

Yes, you have correctly specified them. ?However, if you check the

output from the fits

(m2

Generalized linear mixed model fit by the Laplace approximation

Formula: dx ~ priorprob + gender + wbc + (1 | doc)

? Data: appy

? AIC ? BIC logLik deviance

?108.1 122.7 -49.03 ? ?98.07

Random effects:

?Groups Name ? ? ? ?Variance Std.Dev.

?doc ? ?(Intercept) 0.82922 ?0.91062

Number of obs: 138, groups: doc, 23

Fixed effects:

? ? ? ? ? ?Estimate Std. Error z value Pr(>|z|)

(Intercept) -8.28862 ? ?1.45666 ?-5.690 1.27e-08

priorprob ? ?0.05025 ? ?0.01291 ? 3.892 9.96e-05

genderMale ? 1.18705 ? ?0.56752 ? 2.092 ? 0.0365

wbc ? ? ? ? ?0.34130 ? ?0.07090 ? 4.814 1.48e-06

Correlation of Fixed Effects:

? ? ? ? ? (Intr) prrprb gndrMl

priorprob ?-0.775

genderMale -0.293 ?0.130

wbc ? ? ? ?-0.790 ?0.353 ?0.040

(m3

Generalized linear mixed model fit by the Laplace approximation

Formula: dx ~ priorprob + gender + wbc + (priorprob | doc)

? Data: appy

? AIC BIC logLik deviance

?111.5 132 -48.76 ? ?97.52

Random effects:

?Groups Name ? ? ? ?Variance ? Std.Dev. Corr

?doc ? ?(Intercept) 0.02897010 0.170206

? ? ? ?priorprob ? 0.00016799 0.012961 1.000

Number of obs: 138, groups: doc, 23

Fixed effects:

? ? ? ? ? ?Estimate Std. Error z value Pr(>|z|)

(Intercept) -8.23977 ? ?1.44408 ?-5.706 1.16e-08

priorprob ? ?0.04957 ? ?0.01328 ? 3.733 0.000189

genderMale ? 1.16593 ? ?0.57481 ? 2.028 0.042521

wbc ? ? ? ? ?0.34485 ? ?0.07056 ? 4.887 1.02e-06

Correlation of Fixed Effects:

? ? ? ? ? (Intr) prrprb gndrMl

priorprob ?-0.757

genderMale -0.299 ?0.141

wbc ? ? ? ?-0.800 ?0.350 ?0.030

anova(m2, m3)

Data: appy

Models:

m2: dx ~ priorprob + gender + wbc + (1 | doc)

m3: dx ~ priorprob + gender + wbc + (priorprob | doc)

? Df ? ?AIC ? ?BIC ?logLik ?Chisq Chi Df Pr(>Chisq)

m2 ?5 108.07 122.70 -49.034

m3 ?7 111.52 132.01 -48.759 0.5501 ? ? ?2 ? ? 0.7595

(m4

Generalized linear mixed model fit by the Laplace approximation

Formula: dx ~ priorprob + wbc + (1 | doc)

? Data: appy

? AIC ? BIC logLik deviance

?110.5 122.2 -51.24 ? ?102.5

Random effects:

?Groups Name ? ? ? ?Variance Std.Dev.

?doc ? ?(Intercept) 0.63458 ?0.7966

Number of obs: 138, groups: doc, 23

Fixed effects:

? ? ? ? ? ?Estimate Std. Error z value Pr(>|z|)

(Intercept) -7.62797 ? ?1.33760 ?-5.703 1.18e-08

priorprob ? ?0.04834 ? ?0.01229 ? 3.934 8.35e-05

wbc ? ? ? ? ?0.34395 ? ?0.06771 ? 5.080 3.77e-07

Correlation of Fixed Effects:

? ? ? ? ?(Intr) prrprb

priorprob -0.786

wbc ? ? ? -0.813 ?0.357

anova(m4, m3)

Data: appy

Models:

m4: dx ~ priorprob + wbc + (1 | doc)

m3: dx ~ priorprob + gender + wbc + (priorprob | doc)

? Df ? ?AIC ? ?BIC ?logLik ?Chisq Chi Df Pr(>Chisq)

m4 ?4 110.48 122.19 -51.240

m3 ?7 111.52 132.01 -48.759 4.9639 ? ? ?3 ? ? 0.1745



Sorry, that test should have been

anova(m4, m2)

Data: appy

Models:

m4: dx ~ priorprob + wbc + (1 | doc)

m2: dx ~ priorprob + gender + wbc + (1 | doc)

? Df ? ?AIC ? ?BIC ?logLik ?Chisq Chi Df Pr(>Chisq)

m4 ?4 110.48 122.19 -51.240

m2 ?5 108.07 122.70 -49.034 4.4138 ? ? ?1 ? ?0.03565

you will see that the random intercept/random slope model produces a

degenerate fit (estimated correlation of the within-doctor random

effects is -1.000), which is not significantly better than the random

intercept fit. ?I would therefore use m2. (Because the fixed-effect

for gender had a z-value close to 2 I performed the more reliable

likelihood-ratio test, just to check.)

library(lme4)

m2
? ? ? ? ? ?family=binomial, data=appy)

m3
? ? ? ? ? ?family=binomial, data=appy)

My ultimate goal is to estimate the probability of appendicitis

(and a prediction interval), given a specific 'gender', 'wbc' and

'priorprob' assigned by a doctor with similar diagnostic ability to

those who participated in our pilot study. I'm stuck on how to code
this

prediction.

The prediction will be based on the fixed-effects only. ?Because it

applies to a doctor not in this study, we assign the random effect for

that doctor to be zero, which is the expected value in the absence of

any information on that doctor.

The fixef extractor returns the estimates of the fixed-effects

parameters. ?For a female patient with median prior probability (55%)

and median white blood cell count (9.45) the estimated linear

predictor (eta = -2.299) corresponds to a probability of 9.1%

fixef(m2)

(Intercept) ? priorprob ?genderMale ? ? ? ? wbc

?-8.2886183 ? 0.0502540 ? 1.1870539 ? 0.3412998

eta
eta

? ? ? ? ?[,1]

[1,] -2.299365

binomial()$linkinv(eta)

? ? ? ? ? [,1]

[1,] 0.09117557

Logistic regression with spatial autocorrelation structure

Thread (2 messages)