How does R estimate the intercept term \alpha in a loglinear
model with Poisson model and log link for a contingency table of counts?
(E.g., for a 2-by-2 table {n_{ij}) with \log(\mu) = \alpha + \beta_{i} +
\gamma_{j})
I fitted such a model and checked the calculations by hand. I agreed
with the main effect terms but not the intercept. Interestingly, I
agreed with the fitted value provided by R for the first cell {11} in
the table.
If my estimate of intercept = \hat{\alpha}, my estimate of the fitted
value for the first cell = exp(\hat{\alpha}) but R seems to be doing
something else for the estimate of the intercept.
However if I check the R $fitted_value for n_{11} it agrees with my
exp(\hat{\alpha}).
I would expect that with the corner-point parametrization, the
estimates for a 2 x 2 table would correspond to expected frequencies
exp(\alpha), exp(\alpha + \beta), exp(\alpha + \gamma), exp(\alpha +
\beta + \gamma). The MLE of \alpha appears to be log(n_{.1} *
n_{1.}/n_{..}), but this is not equal to the intercept given by R in the
example I tried.
With thanks in anticipation,
Colin Aitken
Professor Colin Aitken,
Professor of Forensic Statistics,
School of Mathematics, King?s Buildings, University of Edinburgh,
Mayfield Road, Edinburgh, EH9 3JZ.
Tel: 0131 650 4877
E-mail: c.g.g.aitken at ed.ac.uk
Fax : 0131 650 6553
http://www.maths.ed.ac.uk/~cgga
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
How does R estimate the intercept term \alpha in a loglinear
model with Poisson model and log link for a contingency table of
counts?
(E.g., for a 2-by-2 table {n_{ij}) with \log(\mu) = \alpha +
\beta_{i} + \gamma_{j})
I fitted such a model and checked the calculations by hand. I
agreed with the main effect terms but not the intercept.
Interestingly, I agreed with the fitted value provided by R for the
first cell {11} in the table.
If my estimate of intercept = \hat{\alpha}, my estimate of the
fitted value for the first cell = exp(\hat{\alpha}) but R seems to
be doing something else for the estimate of the intercept.
However if I check the R $fitted_value for n_{11} it agrees with my
exp(\hat{\alpha}).
I would expect that with the corner-point parametrization, the
estimates for a 2 x 2 table would correspond to expected frequencies
exp(\alpha), exp(\alpha + \beta), exp(\alpha + \gamma), exp(\alpha +
\beta + \gamma). The MLE of \alpha appears to be log(n_{.1} * n_{1.}/
n_{..}), but this is not equal to the intercept given by R in the
example I tried.
With thanks in anticipation,
Colin Aitken
--
Professor Colin Aitken,
Professor of Forensic Statistics,
Do you suppose you could provide a data-corpse for us to dissect?
Noting the tag line for every posting ....
and provide commented, minimal, self-contained, reproducible code.
How does R estimate the intercept term \alpha in a loglinear
model with Poisson model and log link for a contingency table of counts?
Colin,
If you fitted this using a GLM then the default in R is to use so-called
treatment contrasts (i.e. Dunnett contrasts). See ?contr.treatment. Take the
first example on the ?glm help page
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
anova(glm.D93)
summary(glm.D93)
< snip >
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 3.045e+00 1.709e-01 17.815 <2e-16 ***
outcome2 -4.543e-01 2.022e-01 -2.247 0.0246 *
outcome3 -2.930e-01 1.927e-01 -1.520 0.1285
treatment2 1.338e-15 2.000e-01 0.000 1.0000
treatment3 1.421e-15 2.000e-01 0.000 1.0000
< snip >
levels(outcome)
[1] "1" "2" "3"
levels(treatment)
[1] "1" "2" "3"
So here the intercept represents the estimated counts at the first level of
"outcome" (i.e. outcome = 1) and the first level of "treatment" (i.e.
treatment = 1).
So here the intercept represents the estimated counts...
Perhaps I should have added (though surely unnecessary in your case) that
exponentiation gives the predicted/estimated counts, viz 21 (compared to 18
for the saturated model).
##
Professor Colin Aitken,
Professor of Forensic Statistics,
!!!)
<SNIP>
Do you suppose you could provide a data-corpse for us to dissect?
Fortune nomination!!!
I think Sherlock would have said, "But it's elementary, my dear Watson.
Oftentimes a corpse is not necessary, as here."
Regards, Mark.
-----
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4015131.html
Sent from the R help mailing list archive at Nabble.com.
Sorry about that. However I have solved the problem by declaring the
explanatory variables as factors.
An unresolved problem is: what does R do when the explanatory factors
are not defined as factors when it obtains a different value for the
intercept but the correct value for the fitted value?
A description of the data and the R code and output is attached for
anyone interested.
Best wishes,
Colin Aitken
-------------------
David Winsemius wrote:
On Nov 7, 2011, at 12:59 PM, Colin Aitken wrote:
How does R estimate the intercept term \alpha in a loglinear
model with Poisson model and log link for a contingency table of counts?
(E.g., for a 2-by-2 table {n_{ij}) with \log(\mu) = \alpha + \beta_{i}
+ \gamma_{j})
I fitted such a model and checked the calculations by hand. I agreed
with the main effect terms but not the intercept. Interestingly, I
agreed with the fitted value provided by R for the first cell {11} in
the table.
If my estimate of intercept = \hat{\alpha}, my estimate of the fitted
value for the first cell = exp(\hat{\alpha}) but R seems to be doing
something else for the estimate of the intercept.
However if I check the R $fitted_value for n_{11} it agrees with my
exp(\hat{\alpha}).
I would expect that with the corner-point parametrization, the
estimates for a 2 x 2 table would correspond to expected frequencies
exp(\alpha), exp(\alpha + \beta), exp(\alpha + \gamma), exp(\alpha +
\beta + \gamma). The MLE of \alpha appears to be log(n_{.1} *
n_{1.}/n_{..}), but this is not equal to the intercept given by R in
the example I tried.
With thanks in anticipation,
Colin Aitken
--
Professor Colin Aitken,
Professor of Forensic Statistics,
Do you suppose you could provide a data-corpse for us to dissect?
Noting the tag line for every posting ....
and provide commented, minimal, self-contained, reproducible code.
Professor Colin Aitken,
Professor of Forensic Statistics,
School of Mathematics, King?s Buildings, University of Edinburgh,
Mayfield Road, Edinburgh, EH9 3JZ.
Tel: 0131 650 4877
E-mail: c.g.g.aitken at ed.ac.uk
Fax : 0131 650 6553
http://www.maths.ed.ac.uk/~cgga
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
An unresolved problem is: what does R do when the explanatory factors
are not defined as factors when it obtains a different value for the
intercept but the correct value for the fitted value?
Colin,
I don't think that happens (that the fitted values are identical if
predictors are cast as numerical), but the following could (really is
answered by my initial answer). Once again, using the example I gave above,
but using the second level of "outcome" as a reference level for a new fit,
called glm.D93R. (For this part of the question a corpse would have been
nice, though not really needed---"yours" was unfortunately buried too deeply
for me to find it,)
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
glm.D93R <- glm(counts ~ C(outcome, base=2) + treatment, family=poisson())
## treat predictor as numeric
glm.D93N <- glm(counts ~ as.numeric(as.character(outcome)) +
as.numeric(as.character (treatment)), family=poisson())
coef(glm.D93)
(Intercept) outcome2 outcome3 treatment2 treatment3
3.044522e+00 -4.542553e-01 -2.929871e-01 1.337909e-15 1.421085e-15
## Different value for the Intercept but same fitted values (see below) as
the earlier fit (above)
##
summary(glm.D93R)
< snipped and edited for clarity>
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.590e+00 1.958e-01 13.230 <2e-16 ***
outcome1 4.543e-01 2.022e-01 2.247 0.0246 *
outcome3 1.613e-01 2.151e-01 0.750 0.4535
treatment2 -3.349e-16 2.000e-01 0.000 1.0000
treatment3 -6.217e-16 2.000e-01 0.000 1.0000
< snip >
1 2 3 4 5 6 7 8
9
19.40460 16.52414 14.07126 19.40460 16.52414 14.07126 19.40460 16.52414
14.07126
Regards, Mark.
-----
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4017091.html
Sent from the R help mailing list archive at Nabble.com.