reporting results from binomial glm with categorical variable
2 messages · Matthew Forister, Scott Foster
Hi Matt,

This is only obliquely an R question, but here is an answer nonetheless.

If you have G levels of the categorical factor then there are exactly G means to estimate (irrespective of the outcome type). This means that you cannot estimate an overall grand mean *and* the individual level means, as there would then be G+1 parameters for G means and the estimates would be non-unique. I suspect that you already knew this though.

The way around this is to impose some sort of constraint on the overall mean and the level means. Commonly this is done by setting one of the level 'deviations' to zero; this is called a corner-point constraint, and it is what R's default (treatment) contrasts do. Another type is sum-to-zero, where there is a grand mean (actually the unweighted mean of the level means, on the link scale) and G deviations that are constrained to sum to zero. This is the constraint that you mentioned. There are others, of course, but they are less common. One that I find very useful is to omit the overall mean and just estimate the G factor-level means. Generally though, the choice of constraint is not all that important, although corner-point constraints can sometimes be easier to interpret.

If you do want to use sum-to-zero constraints then all you need to do is alter the 'contrasts' attribute of your categorical variable. This is done in R using the C() function (note capitalisation). Your glm() call would use a formula like cbind(nsuccess, nfailure) ~ 1 + C(myFac, sum). Note that sum is unquoted here: C() translates the bare name to contr.sum, whereas the quoted string "sum" would be looked up as the ordinary sum() function and fail. C(myFac, "contr.sum") also works.

How to report the results? Good question. For me, it depends strongly on what information I want to convey. Typically, for this kind of analysis, that would be the means of the factor levels (unless there is more to this than we are seeing). This is most easily done using R's inbuilt prediction functions (see ?predict.glm for example). A call to this function would have a newdata argument given as a G-row data frame, with one row for each level of the factor. Note that it will not matter which contrasts you used to fit the model: the predicted means are the same under all of them, because the parameterisations are all equally valid.
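The points above can be sketched in a few lines of R. This is only an illustration on simulated data: the data frame dat, the level names, the sample sizes and the true probabilities are all made up, and the 1.96 multiplier assumes a normal approximation on the logit scale. It fits the same model under corner-point, sum-to-zero and no-intercept parameterisations, then uses predict() to get one estimated mean per factor level.

```r
# Simulated data; every name and number here is made up for illustration.
set.seed(1)
dat <- data.frame(
  myFac = factor(rep(c("a", "b", "c"), each = 5)),
  n     = 20
)
true_p <- c(a = 0.2, b = 0.5, c = 0.7)
dat$nsuccess <- rbinom(nrow(dat), size = dat$n,
                       prob = true_p[as.character(dat$myFac)])
dat$nfailure <- dat$n - dat$nsuccess

# The same model under three parameterisations.
# Corner-point (R's default treatment contrasts):
fit_corner <- glm(cbind(nsuccess, nfailure) ~ 1 + myFac,
                  family = binomial, data = dat)
# Sum-to-zero; the unquoted name sum is mapped to contr.sum by C():
fit_sum <- glm(cbind(nsuccess, nfailure) ~ 1 + C(myFac, sum),
               family = binomial, data = dat)
# No overall mean, just the G level means (on the logit scale):
fit_means <- glm(cbind(nsuccess, nfailure) ~ 0 + myFac,
                 family = binomial, data = dat)

# newdata: a G-row data frame with one row per factor level.
newdat <- data.frame(
  myFac = factor(levels(dat$myFac), levels = levels(dat$myFac))
)

# Predicted probabilities for each level under each parameterisation.
p_corner <- predict(fit_corner, newdata = newdat, type = "response")
p_sum    <- predict(fit_sum,    newdata = newdat, type = "response")
p_means  <- predict(fit_means,  newdata = newdat, type = "response")

# Level means with approximate 95% intervals, built on the link scale
# and back-transformed with the inverse logit.
pr <- predict(fit_sum, newdata = newdat, type = "link", se.fit = TRUE)
report <- data.frame(
  level = newdat$myFac,
  prob  = plogis(pr$fit),
  lower = plogis(pr$fit - 1.96 * pr$se.fit),
  upper = plogis(pr$fit + 1.96 * pr$se.fit)
)
print(report)
```

The three vectors p_corner, p_sum and p_means should agree to within numerical tolerance, which is the sense in which the choice of contrasts does not matter for reporting level means. A table like report is one possible way to present the results; it is not the only one.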
I hope this helped (it is certainly long enough),

Scott

PS A couple of good references (oldies but goodies) for topics related to this are:

Lane and Nelder (1982) Analysis of covariance and standardisation as instances of prediction. Biometrics, 38, 613-621.

Nelder (1994) The statistics of linear models: back to basics. Statistics and Computing, 4, 221-234.
On 02/12/11 09:54, Matthew Forister wrote:
Dear All,

I have two questions about reporting results from a binomial GLM (logit link) that includes a categorical variable. I understand how dummy coding works. My two questions are about interpretation and presentation:

1) The default in R seems to be to use the first level of a categorical variable as the reference. It makes more sense to me to use the grand mean as the reference. I found a webpage that describes this as "deviation coding". This seems so commonsensical that I'm surprised I don't see more people using it instead of the default comparison to the first level. Am I missing something here? Is deviation coding a reasonable way to go? This is the website where I found it: http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm

2) Whether I use the default in R or switch to deviation coding, I will get multiple coefficients associated with the different levels of my predictor variable. What is the convention for reporting the information associated with the "dummy coded" levels of a categorical variable? I had assumed that I would report details associated with each of the dummy coded levels, but I can't seem to find an example where someone has done that.

Thanks for your help,
Matt
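For reference, the two coding schemes in question 1 can be inspected directly in R. A small sketch, with made-up level names for a three-level factor:

```r
# Level names are made up for illustration.
lev <- c("a", "b", "c")

# R's default for unordered factors: the first level is the reference,
# so its row is all zeros (corner-point / treatment coding).
ct <- contr.treatment(lev)
print(ct)

# Sum-to-zero ("deviation") coding: each column sums to zero, and the
# last level's deviation is minus the sum of the other deviations.
cs <- contr.sum(lev)
print(cs)
```

One practical consequence of the second scheme is that the deviation for the last level does not appear among the fitted coefficients; it has to be recovered as minus the sum of the others.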
Scott Foster CSIRO Mathematics, Informatics and Statistics GPO Box 1538 Castray Esplanade Hobart 7001 Tasmania Australia Phone: (03) 6232 5178 Fax: (03) 6232 5000 Email: scott.foster at csiro.au