
log-linear

4 messages · orkun, Thomas W Blackwell, Spencer Graves +1 more

#
Hello,

I have spatial data containing, for each combination of landslide
predictors,
  the number of landslide presence cells and
  the number of landslide absence cells.

The predictors are essentially categorical.

I tried logistic regression, but because I want to model interactions
among the predictors, I would like to use a log-linear model instead.
I am unsure how to use the landslide counts as the response variable.
Should only the landslide presence counts be treated as the response,
or should both the presence and the absence counts be included?

I would appreciate any information.

Thanks in advance,

Ahmet Temiz
General Directorate of Disaster Affairs

TURKEY


#
The presence/absence nature of the outcome variable strongly supports
using logistic regression and nothing else.  I strongly encourage you
to stick with logistic regression.  The model formula and interaction
term capabilities in R are just the same for logistic regression as for
log-linear models.  (In some textbooks, log-linear models are used as
the motivation and example for introducing the ideas of interaction
terms, but once introduced, the ideas apply very generally.)
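On the question of whether the absence counts belong in the response: a minimal sketch on simulated data suggests they do in either formulation. It fits a binomial model to (present, absent) pairs and a Poisson log-linear model to the same counts stacked in long form, then compares the coefficients. The data frame `cells`, the predictors `a` and `b`, and the simulated probabilities are all made up for illustration.

```r
set.seed(42)

# Hypothetical data: one row per combination of two categorical predictors
cells <- expand.grid(a = factor(1:3), b = factor(1:2))
total <- 200
p <- plogis(-1 + 0.5 * (cells$a == "2") + 0.8 * (cells$b == "2"))
cells$present <- rbinom(nrow(cells), size = total, prob = p)
cells$absent  <- total - cells$present

# Logistic regression on the (present, absent) pairs
fit_logit <- glm(cbind(present, absent) ~ a + b,
                 family = binomial, data = cells)

# The same counts in long form: one row per cell and outcome
long <- rbind(
  data.frame(cells[c("a", "b")], outcome = "present", count = cells$present),
  data.frame(cells[c("a", "b")], outcome = "absent",  count = cells$absent)
)
long$outcome <- factor(long$outcome, levels = c("absent", "present"))

# Log-linear model: a*b fixes the cell totals, and the terms that
# involve 'outcome' play the role of the logistic coefficients
fit_pois <- glm(count ~ a * b + outcome * (a + b),
                family = poisson, data = long)

# Coefficients involving 'outcome', to compare with coef(fit_logit)
pois_outcome <- coef(fit_pois)[grep("outcome", names(coef(fit_pois)))]
cbind(logistic = coef(fit_logit), loglinear = unname(pois_outcome))
```

In this sketch the two columns agree up to numerical tolerance, and the log-linear fit needs both the presence and the absence counts to get there.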

I would set up the data as you have, as a data frame or a matrix with
columns representing the number of landslide presence cells, the number
of landslide absence cells, and then one column for each predictor.

Then use  glm() with a call something like:

result <- glm(cbind(present, absent) ~ (a+b+c+d)^3,  family=binomial,
                 data = name.of.data.frame)

In  help("glm"), there's a sentence under "Details" which describes
the cbind() syntax I've used above, and  help("formula")  explains
the (.)^3 syntax.
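A minimal runnable sketch of the call above, assuming simulated data: the data frame `landslide`, the factor levels, and the simulated probabilities are hypothetical stand-ins for the real predictors.

```r
set.seed(1)

# Hypothetical grid: one row per combination of four categorical predictors
landslide <- expand.grid(a = factor(1:3), b = factor(1:2),
                         c = factor(1:2), d = factor(1:2))

# Simulated counts of presence and absence cells (200 cells per row)
total <- 200
p <- plogis(-1 + 0.5 * (landslide$a == "2") + 0.8 * (landslide$b == "2"))
landslide$present <- rbinom(nrow(landslide), size = total, prob = p)
landslide$absent  <- total - landslide$present

# Main effects only
fit_main <- glm(cbind(present, absent) ~ a + b + c + d,
                family = binomial, data = landslide)

# Main effects plus all two-way and three-way interactions
fit_int <- glm(cbind(present, absent) ~ (a + b + c + d)^3,
               family = binomial, data = landslide)

# Likelihood-ratio test: do the interaction terms improve the fit?
anova(fit_main, fit_int, test = "Chisq")
```

With only 24 rows, the three-way model uses 22 coefficients and leaves 2 residual degrees of freedom, so on real data a smaller power such as `^2` may be a more practical starting point.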

-  tom blackwell  -  u michigan medical school  -  ann arbor  -
On Mon, 7 Apr 2003, orkun wrote:

            
#
1.  What did you use for logistic regression?  "glm"?  If your 
response variable is "number of landslides", I would think that "glm" 
with "family = poisson" might be appropriate.  Have you checked the R 
help for "?glm" and "?family" and the R search site at 
"http://www.r-project.org/" -> search -> "R search site"?  In 
particular, if you don't have "Modern Applied Statistics with S" by 
Venables and Ripley (2002), I suggest you get a copy.  This is the best 
reference I know on R.  If you've digested Venables and Ripley, at least 
on "glm", the next best book I know for your issues may be  McCullagh P. 
and Nelder, J. A. (1989) Generalized Linear Models (London: Chapman and 
Hall).

2.  You can use interactions with logistic regression, as you could 
with Poisson regression, "glm(..., family = poisson)".  If your 
explanatory variables are all categorical, then you might have a problem 
with estimating too many parameters:  If you have 5 categories in one 
variable and 7 in another, the main effects will estimate 4=(5-1) and 
6=(7-1) parameters, and the interaction will involve 4*6 = 24 
parameters.  Moreover, if you do NOT have data on at least 24 
sufficiently different combinations out of the 5*7 = 35 possible, you 
won't be able to estimate all the parameters in the interaction.  I 
suggest you try to construct at least ordinal scales, code the 
categories as numbers wherever that might be done plausibly, then look 
for linear terms, parabolics, etc., and linear*linear interactions, 
etc., THEN look for large residuals from the fitted model.
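The parameter counts above can be checked with model.matrix(). This is a sketch only: the factors `x` and `z`, the dropped rows, and the numeric recoding are hypothetical and simply mirror the 5-by-7 example.

```r
# Hypothetical predictors with 5 and 7 categories
grid <- expand.grid(x = factor(1:5), z = factor(1:7))

# Fully categorical model with interaction
X_cat <- model.matrix(~ x * z, data = grid)
ncol(X_cat)                        # 35 = 1 + 4 + 6 + 24
sum(attr(X_cat, "assign") == 3)    # 24 interaction columns

# If 5 of the 35 combinations are never observed, the design matrix
# has fewer independent rows than columns
observed   <- grid[-c(3, 11, 19, 27, 35), ]
X_observed <- model.matrix(~ x * z, data = observed)
qr(X_observed)$rank                # 30, fewer than the 35 columns

# Coding the categories as numbers (assumes the ordering is meaningful)
grid$x_num <- as.numeric(grid$x)
grid$z_num <- as.numeric(grid$z)
X_num <- model.matrix(~ x_num * z_num, data = grid)
ncol(X_num)                        # 4: intercept, two slopes, linear*linear
```

glm() reports coefficients it cannot estimate as NA, so a rank-deficient interaction shows up directly in the fitted coefficient table.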

Hope this helps,
Spencer Graves
orkun wrote:
#
With this type of binary spatial data it would be nice to use
autologistic regression. I recently asked the list if there was an R
autologit function and nobody thought that there was. However, Jennifer
Hoeting has an S+ (and C++) version at her website:

http://www.stat.colostate.edu/~jah/software/


Mantel's test has also been used successfully on binary spatially
autocorrelated data:

Schick, R.S., Urban, D.L., 2000, Spatial components of bowhead
whale (Balaena mysticetus) distribution in the Alaskan Beaufort Sea.
Canadian Journal of Fisheries and Aquatic Sciences, 57, 2193-2200.

I have a highly confounded binary dataset that has first and second
order spatial effects (live vs. dead trees and a host of environmental
correlates) for which I'm looking for a tidy analytical framework. I
haven't really found the right thing just yet.

Cheers, Andy