Hi All,
I'm doing a simple logistic regression with one fixed and one random
effect, and I'm comparing the results I get from lmer() and
glmmPQL(). I'm finding that lmer gives me a "better" p-value for my
fixed effects. Because I'm a paranoid old man I'd go for the glmmPQL
results, but my collaborators are less paranoid and I'm sure
they'd prefer the results from lmer. Am I too conservative? (I realise
it looks like I'm asking for counselling more than advice, but there
you go...)
Best,
Federico
My models:
mod1 = glmmPQL(y ~ genotype, random = ~1|block, family = binomial, data)
mod2 = lmer(y ~ genotype + (1|block), family = binomial, data)
my data:
> data
genotype block y.1 y.2
1 A a 16 29
2 B a 19 26
3 C a 23 23
4 A c 6 24
5 B c 11 11
6 C c 13 14
7 A b 4 17
8 B b 10 8
9 C b 12 6
> data[[1]]
[1] A B C A B C A B C
attr(,"contrasts")
[,1] [,2]
B 1 -1
A -2 0
C 1 1
Levels: B A C
my results:
> summary(mod1)
Linear mixed-effects model fit by maximum likelihood
Data: dat
AIC BIC logLik
NA NA NA
Random effects:
Formula: ~1 | block
(Intercept) Residual
StdDev: 1.285532e-06 0.8077838
Variance function:
Structure: fixed weights
Formula: ~invwt
Fixed effects: y ~ genotype
Value Std.Error DF t-value p-value
(Intercept) -0.3327269 0.12516679 4 -2.658269 0.0565
genotype1 0.3288359 0.09065856 4 3.627190 0.0222
genotype2 0.1138920 0.14947659 4 0.761938 0.4885
Correlation:
(Intr) gntyp1
genotype1 -0.068
genotype2 -0.027 -0.019
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-1.0807836 -0.8047002 -0.4620287 0.8940787 1.5832300
Number of Observations: 9
Number of Groups: 3
> summary(mod2)
Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ genotype + (1 | block)
Data: dat
AIC BIC logLik deviance
13.92 14.71 -2.960 5.919
Random effects:
Groups Name Variance Std.Dev.
block (Intercept) 0 0
Number of obs: 9, groups: block, 3
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.33273 0.12652 -2.630 0.008541 **
genotype1 0.32884 0.09164 3.588 0.000333 ***
genotype2 0.11389 0.15109 0.754 0.450965
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) gntyp1
genotype1 -0.068
genotype2 -0.027 -0.019
--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG
Tel +44 (0)20 75941602 Fax +44 (0)20 75943193
f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com
lmer vs glmmPQL
10 messages · Ken Beath, Federico Calboli, Daniel Ezra Johnson +3 more
On 24/06/2009, at 2:18 AM, Federico Calboli wrote:
This seems to result from the use of a t-test with few degrees of freedom in glmmPQL versus a z-test in lmer. The z-test seems fine to me. What is more of a problem is that your random-effects variance is effectively 0. There are only 3 blocks, so fitting a random-effects model will be difficult, and here it appears unnecessary. Ken
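Ken's diagnosis can be checked arithmetically: referring glmmPQL's t statistic for genotype1 (3.627, 4 df) to a t distribution, and lmer's z statistic (3.588) to a standard normal, reproduces the two printed p-values almost exactly, so the gap is essentially the reference distribution, not the estimates. A quick sketch (Python here purely for illustration; the t CDF with 4 df has a closed form, so no stats library is needed):

```python
import math

def p_two_sided_t4(t):
    # exact CDF of Student's t with 4 df: F(t) = 1/2 + (3/4)*(s - s**3/3),
    # where s = t / sqrt(t**2 + 4); returns the two-sided p-value
    s = abs(t) / math.sqrt(t * t + 4.0)
    return 2.0 * (0.5 - 0.75 * (s - s ** 3 / 3.0))

def p_two_sided_z(z):
    # two-sided p-value against a standard normal, via the complementary error function
    return math.erfc(abs(z) / math.sqrt(2.0))

p_t = p_two_sided_t4(3.627190)  # glmmPQL's t-value for genotype1, 4 df
p_z = p_two_sided_z(3.588)      # lmer's z-value for genotype1

print(p_t, p_z)  # ~0.0222 and ~0.000333, matching the two summaries
```

Same coefficient, nearly the same standard error; the two-orders-of-magnitude difference in the p-values comes almost entirely from t with 4 df versus z.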
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
On 23 Jun 2009, at 22:46, Ken Beath wrote:
That was a sample dataset so I could see what kind of data I had to deal with; the 'real' thing should have a variance > 0 for the random effect. My philosophical issue was: given such a relatively straightforward model, should I be more (glmmPQL) or less (lmer) conservative? Best, Federico
Federico Calboli wrote:
My take would be to pick lmer over glmmPQL every time, provided it can handle your problem -- in general it should be more accurate. Picking on the basis of more or less conservative for any given problem feels biased. Ben Bolker
Regarding this thread, what about the method of fitting nested models and using anova() to estimate a p-value? For example:

mod2 = lmer(y ~ genotype + (1|block), family = binomial, data)
mod0 = lmer(y ~ (1|block), family = binomial, data)
anova(mod0, mod2)

How does that p-value (which is one value for the whole term "genotype") relate to the individual coefficient p-values derived from the Z-scores inside summary(mod2)? Thanks, Dan
Daniel Ezra Johnson wrote:
The Z-scores are Wald tests. The anova results are likelihood ratio tests. The latter are in general more reliable but are known to be *un*reliable for small-sample (i.e. small-number-of-random-effects-levels) LMMs (Pinheiro and Bates). Wald tests are not *known* to be unreliable in the small-sample case, but I believe that is a statement of ignorance rather than a statement that they're OK ... Ben Bolker
Ben Bolker Associate professor, Biology Dep't, Univ. of Florida bolker at ufl.edu / www.zoology.ufl.edu/bolker GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc
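The Wald-versus-LRT distinction can be made concrete even without a mixed model. The following sketch (Python purely for illustration; the 16-successes-in-45-trials count is borrowed from the first row of the example data, and the intercept-only binomial model is a deliberate simplification, not one of the models discussed in this thread) computes both p-values for the same null hypothesis and shows they need not agree:

```python
import math

k, n = 16, 45   # successes / trials (first row of the example data)
p_hat = k / n
p0 = 0.5        # null hypothesis: success probability 1/2 (logit = 0)

# Wald test: estimate / standard error on the logit scale, referred to N(0,1)
beta_hat = math.log(p_hat / (1.0 - p_hat))
se = math.sqrt(1.0 / (n * p_hat * (1.0 - p_hat)))
z = beta_hat / se
p_wald = math.erfc(abs(z) / math.sqrt(2.0))

# Likelihood-ratio test: twice the log-likelihood gap, referred to chi-squared(1)
def loglik(p):
    return k * math.log(p) + (n - k) * math.log(1.0 - p)

lrt = 2.0 * (loglik(p_hat) - loglik(p0))
p_lrt = math.erfc(math.sqrt(lrt / 2.0))

print(p_wald, p_lrt)  # ~0.056 vs ~0.051: same data, same H0, different p-values
```

The two tests agree asymptotically but differ in finite samples because the Wald test uses only the curvature at the estimate, while the LRT uses the actual likelihood gap between the two fits.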
Ben Bolker wrote:
This is an interesting issue -- as Ben points out, likelihood-ratio
tests are known to be anti-conservative for tests between linear mixed
models differing only in fixed-effects structure. I think the
best-known demonstration of this can be found in Pinheiro & Bates, 2000,
pages 87-92. F-tests are less problematic.
For logit mixed models, presumably the results will depend a bit on how
fitting is done because the likelihood has to be approximated. For the
Laplace approximation currently in use in lme4, I've done some
simulations and generally found that likelihood-ratio tests are still
anti-conservative; tests based on Wald Z are not as anti-conservative,
though they don't seem quite nominal. You can do simulations along
these lines:
set.seed(2)
invlogit <- function(x) {
  temp <- exp(x)
  return(temp/(1+temp))
}
L <- 10      # number of random-effect levels
REP <- 4     # observations per level
N <- L*REP   # 40 observations in total
fac <- gl(L, REP)
f <- function(ignore) {
  x <- runif(N)                         # covariate with no true effect
  r <- rnorm(L)                         # random intercepts
  y <- rbinom(N, 1, invlogit(r[fac]))   # responses simulated under the null
  m0 <- lmer(y ~ (1 | fac), family="binomial")
  m1 <- lmer(y ~ x + (1 | fac), family="binomial")
  # two-sided Wald Z p-value for the coefficient on x
  temp <- pnorm(fixef(m1)[2]/sqrt(vcov(m1)[2,2]))
  p.z <- 2*min(temp, 1-temp)
  # LRT p-value against chi-squared with 1 df
  p.lr <- 1-pchisq(as.numeric(2*(logLik(m1)-logLik(m0))), 1)
  return(c(p.z, p.lr))
}
K <- 5000
p.vals <- sapply(1:K, f)
# empirical Type I error rates at the nominal 0.05 level
apply(p.vals, 1, function(x) sum(x<0.05))/K
# this gave me 0.0562 for Wald Z, 0.0608 for LRT
# Is Wald Z anti-conservative? The below says "yes" with p=0.044
2*(1-pbinom(sum(p.vals[1,]<0.05), K, 0.05))
Unlike what Ben states above, however, it's my intuition (and informal
observation on the basis of simulations) that what determines how
anti-conservative the LRT is is not the number of random-effect levels
but rather the ratio of the degrees of freedom involved in the LRT
to the residual degrees of freedom in the more general model. The more
residual degrees of freedom (basically, the more observations), the less
anti-conservative the LRT is. This can be assessed for a specific
dataset by fitting a null-hypothesis model, repeatedly generating new,
artificial responses from it, and looking at the distribution of LRTs on
the artificial responses (along the lines of P&B 2000).
Roger
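Roger's suggested check -- fit the null model, simulate artificial responses from it, and compare the observed LRT to the simulated LRT distribution -- can be sketched in miniature. The toy below (Python purely for illustration; a single binomial proportion rather than a GLMM, with the 16-in-45 count borrowed from the example data) follows exactly that recipe, and in this discrete case the simulation-based p-value comes out noticeably larger than the chi-squared approximation:

```python
import math
import random

random.seed(42)

k_obs, n = 16, 45   # observed successes / trials (borrowed from the example data)
p0 = 0.5            # the null-hypothesis model: a fixed success probability

def loglik(k, p):
    # binomial log-likelihood (dropping the constant binomial coefficient)
    return k * math.log(p) + (n - k) * math.log(1 - p)

def lrt(k):
    # likelihood-ratio statistic: 2 * (loglik at the MLE - loglik under the null)
    if k == 0 or k == n:
        return 2.0 * n * math.log(2.0)  # MLE on the boundary; limit of the formula
    return 2.0 * (loglik(k, k / n) - loglik(k, p0))

lrt_obs = lrt(k_obs)
p_chi2 = math.erfc(math.sqrt(lrt_obs / 2.0))  # chi-squared(1) approximation

# parametric bootstrap: regenerate responses under the fitted null model
B = 4000
count = 0
for _ in range(B):
    k_star = sum(1 for _ in range(n) if random.random() < p0)
    if lrt(k_star) >= lrt_obs - 1e-9:
        count += 1
p_boot = count / B

print(p_chi2, p_boot)  # chi-squared approx ~0.05; simulated p ~0.07 here
```

In a GLMM the same recipe applies with the simulation step replaced by simulating new responses from the fitted null model and refitting both models each time, which is why these checks can be so slow.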
Roger Levy wrote:
This is an interesting issue -- as Ben points out, likelihood-ratio tests are known to be anti-conservative for tests between linear mixed models differing only in fixed-effects structure. I think the best-known demonstration of this can be found in Pinheiro & Bates, 2000, pages 87-92. F-tests are less problematic.
But note that F tests may not be available or well-defined for GLMMs ... (Distinguishing here between conditional F tests (e.g. p. 90 of Pinheiro & Bates 2000) and Wald F tests of parameter combinations -- as far as I know the former are not applicable to GLMMs. Wald F tests are always available but don't necessarily (???) share the same properties/reliability as the conditional F tests in the LMM case ...)
For logit mixed models, presumably the results will depend a bit on how fitting is done because the likelihood has to be approximated. For the Laplace approximation currently in use in lme4, I've done some simulations and generally found that likelihood-ratio tests are still anti-conservative; tests based on Wald Z are not as anti-conservative, though they don't seem quite nominal.
Hmmm. Can't we separate estimation (for which Laplace and AGQ are probably adequate) from inference? Would there be any point in considering inference on estimates from a less accurate (PQL) approximation scheme as long as we have more accurate schemes available? I worry whether the Wald Z tests are consistently better or just "differently wrong" ...
Your conclusion agrees with what Pinheiro and Bates say (p. 88:
"as the number of parameters being removed from the fixed effects
becomes large, compared to the total number of observations, this
inaccuracy in the reported p-values can be substantial").
I don't really know how to decide what is more relevant to
"asymptopia" -- if N is the total number of obs., n the number
of blocks (say m = number per block so N=mn), and p the number of
parameters (fixed? random? variances counting as 1, or n-1, or
something in between?), is it 1-p/N [as PB suggest], or (N-p) [suggested
above], or n? Demidenko p. 145 shows that for LMMs N \to \infty is
not sufficient for consistency, that n \to \infty is also required ...
I don't know if that helps in practice.
I don't know where my intuition comes from -- perhaps it's just
conservatism/pessimism, since it's often easier to get N>>p than to get
n large ...
I do agree that the gold standard is simulations as suggested above.
There are examples in the "worked example" section of glmm.wikidot.com .
These simulations can be very slow -- hopefully (??) Doug Bates will be
able to work out an mcmcsamp() scheme for GLMMs soon, and we can see
if that (much faster) approach generally agrees with null simulations ...
@book{demidenko_mixed_2004,
edition = {1},
title = {Mixed Models: Theory and Applications},
isbn = {0471601616},
publisher = {{Wiley-Interscience}},
author = {Eugene Demidenko},
month = jul,
year = {2004}
}
On Wed, Jun 24, 2009 at 4:51 PM, Ben Bolker<bolker at ufl.edu> wrote:
If you look at Doug's schedule for July (www.stat.wisc.edu/~bates) you will understand why he has been missing in action recently. In my "spare time" - when not working on a major report finished last week and a grant proposal due this week and in preparing these various courses - I am working on the draft of a book about lme4 because if I don't get some part of that manuscript to John Kimmel soon he is going to be very rude to me in Rennes.
Just to clarify, I wasn't trying to nag ... very excited about the prospect of the book! (How would you, and Springer/John Kimmel, feel about drafts of the book being accessible prior to publication?) Ben Bolker