Estimate of intercept in loglinear model - R-help

Mon, Nov 7, 2011 9:59 AM #

How does R estimate the intercept term \alpha in a loglinear
  model with Poisson model and log link for a contingency table of counts?

(E.g., for a 2-by-2 table {n_{ij}) with \log(\mu) = \alpha + \beta_{i} + 
\gamma_{j})

I fitted such a model and checked the calculations by hand. I  agreed 
with the main effect terms but not the intercept. Interestingly,  I 
agreed with the fitted value provided by R for the first cell {11} in 
the table.

If my estimate of intercept = \hat{\alpha}, my estimate of the fitted 
value for the first cell = exp(\hat{\alpha}) but R seems to be doing 
something else for the estimate of the intercept.

However if I check the  R $fitted_value for n_{11} it agrees with my 
exp(\hat{\alpha}).

	I would expect that with the corner-point parametrization, the 
estimates for a 2 x 2 table would correspond to expected frequencies 
exp(\alpha), exp(\alpha + \beta), exp(\alpha + \gamma), exp(\alpha + 
\beta + \gamma). The MLE of \alpha appears to be log(n_{.1} * 
n_{1.}/n_{..}), but this is not equal to the intercept given by R in the 
example I tried.

With thanks in anticipation,

Colin Aitken

Professor Colin Aitken,
Professor of Forensic Statistics,
School of Mathematics, King?s Buildings, University of Edinburgh,
Mayfield Road, Edinburgh, EH9 3JZ.

Tel:    0131 650 4877
E-mail:  c.g.g.aitken at ed.ac.uk
Fax :  0131 650 6553
http://www.maths.ed.ac.uk/~cgga


The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

David Winsemius

Mon, Nov 7, 2011 10:11 AM #

On Nov 7, 2011, at 12:59 PM, Colin Aitken wrote:

Do you suppose you could provide a data-corpse for us to dissect?

Noting the tag line for every posting ....

and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

Mark Difford

Mon, Nov 7, 2011 11:04 AM #

On Nov 07, 2011 at 7:59pm Colin Aitken wrote:

Colin,

If you fitted this using a GLM then the default in R is to use so-called
treatment contrasts (i.e. Dunnett contrasts). See ?contr.treatment. Take the
first example on the ?glm help page

## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
anova(glm.D93)
summary(glm.D93)

< snip >
Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)  3.045e+00  1.709e-01  17.815   <2e-16 ***
outcome2    -4.543e-01  2.022e-01  -2.247   0.0246 *  
outcome3    -2.930e-01  1.927e-01  -1.520   0.1285    
treatment2   1.338e-15  2.000e-01   0.000   1.0000    
treatment3   1.421e-15  2.000e-01   0.000   1.0000
< snip >

[1] "1" "2" "3"

[1] "1" "2" "3"

So here the intercept represents the estimated counts at the first level of
"outcome" (i.e. outcome = 1) and the first level of "treatment" (i.e.
treatment = 1).

1 
3.044522

Regards, Mark.

-----
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4012346.html
Sent from the R help mailing list archive at Nabble.com.

Mark Difford

Mon, Nov 7, 2011 11:12 AM #

On Nov 07, 2011 at 9:04pm Mark Difford wrote:

Perhaps I should have added (though surely unnecessary in your case) that
exponentiation gives the predicted/estimated counts, viz 21 (compared to 18
for the saturated model).

##

[1] 20.99999

Regards, Mark.

-----
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4012723.html
Sent from the R help mailing list archive at Nabble.com.

Rolf Turner

Mon, Nov 7, 2011 6:58 PM #

On 08/11/11 07:11, David Winsemius wrote:

(in response to

!!!)

<SNIP>

Fortune nomination!!!

     cheers,

         Rolf Turner

Mark Difford

Mon, Nov 7, 2011 11:22 PM #

Nov 08, 2011; 4:58am Rolf Turner wrote:

I think Sherlock would have said, "But it's elementary, my dear Watson.
Oftentimes a corpse is not necessary, as here."

Regards, Mark.

-----
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4015131.html
Sent from the R help mailing list archive at Nabble.com.

Colin Aitken

Tue, Nov 8, 2011 1:16 AM #

Sorry about that.  However I have solved the problem by declaring the 
explanatory variables as factors.

An unresolved problem is:  what does R do when the explanatory factors 
are not defined as factors when it obtains a different value for the 
intercept but the correct value for the fitted value?

A description of the data and the R code and output is attached for 
anyone interested.

Best wishes,

Colin Aitken

-------------------

David Winsemius wrote:

and provide commented, minimal, self-contained, reproducible code.

Professor Colin Aitken,
Professor of Forensic Statistics,
School of Mathematics, King?s Buildings, University of Edinburgh,
Mayfield Road, Edinburgh, EH9 3JZ.

Tel:    0131 650 4877
E-mail:  c.g.g.aitken at ed.ac.uk
Fax :  0131 650 6553
http://www.maths.ed.ac.uk/~cgga


The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

Mark Difford

Tue, Nov 8, 2011 10:58 AM #

On Nov 08, 2011 at 11:16am Colin Aitken wrote:

Colin,

I don't think that happens (that the fitted values are identical if
predictors are cast as numerical), but the following could (really is
answered by my initial answer). Once again, using the example I gave above,
but using the second level of "outcome" as a reference level for a new fit,
called glm.D93R. (For this part of the question a corpse would have been
nice, though not really needed---"yours" was unfortunately buried too deeply
for me to find it,)

## Dobson (1990) Page 93: Randomized Controlled Trial : 
counts <- c(18,17,15,20,10,20,25,13,12) 
outcome <- gl(3,1,9) 
treatment <- gl(3,3) 
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
glm.D93R <- glm(counts ~ C(outcome, base=2) + treatment, family=poisson())

## treat predictor as numeric
glm.D93N <- glm(counts ~ as.numeric(as.character(outcome)) +
as.numeric(as.character    (treatment)), family=poisson())

(Intercept)      outcome2      outcome3    treatment2    treatment3 
 3.044522e+00 -4.542553e-01 -2.929871e-01  1.337909e-15  1.421085e-15

## Different value for the Intercept but same fitted values (see below) as
the earlier fit (above)
##
summary(glm.D93R)
< snipped and edited for clarity> 
Coefficients:
                        Estimate Std. Error z value Pr(>|z|)    
(Intercept)             2.590e+00  1.958e-01  13.230   <2e-16 ***
outcome1               4.543e-01  2.022e-01   2.247   0.0246 *  
outcome3               1.613e-01  2.151e-01   0.750   0.4535    
treatment2            -3.349e-16  2.000e-01   0.000   1.0000    
treatment3            -6.217e-16  2.000e-01   0.000   1.0000
< snip >

1        2        3        4        5        6        7        8       
9 
21.00000 13.33333 15.66667 21.00000 13.33333 15.66667 21.00000 13.33333
15.66667

1        2        3        4        5        6        7        8       
9 
21.00000 13.33333 15.66667 21.00000 13.33333 15.66667 21.00000 13.33333
15.66667

## if predictors treated as numeric---check summary(glm.D93N) yourself

1        2        3        4        5        6        7        8       
9 
19.40460 16.52414 14.07126 19.40460 16.52414 14.07126 19.40460 16.52414
14.07126

Regards, Mark.

-----
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
--
View this message in context: http://r.789695.n4.nabble.com/Estimate-of-intercept-in-loglinear-model-tp4009905p4017091.html
Sent from the R help mailing list archive at Nabble.com.