Hello,

I have spatial data containing the number of landslide presence cells with respect to a set of landslide predictors, and the number of landslide absence cells with respect to the same predictors. The predictors are essentially categorical. I tried logistic regression, but because I want to model interactions between the predictors, I would like to use a log-linear method. I am unsure how to use the landslide counts as the response variable. Should only the landslide presence data be used, or should both the presence and absence data be treated as the response variable?

I would appreciate any information. Thanks in advance.

Ahmet Temiz
Gen Dir of Disaster of Affairs
TURKEY
______________________________________
The views and opinions expressed in this e-mail message are the sender's own and do not necessarily represent the views and the opinions of Earthquake Research Dept. of General Directorate of Disaster Affairs.
log-linear
4 messages · orkun, Thomas W Blackwell, Spencer Graves +1 more
The presence/absence nature of the outcome variable strongly supports
using logistic regression and nothing else. I strongly encourage you
to stick with logistic regression. The model formula and interaction
term capabilities in R are just the same for logistic regression as for
log-linear models. (In some textbooks, log-linear models are used as
the motivation and example for introducing the ideas of interaction
terms, but once introduced, the ideas apply very generally.)
I would set up the data as you have, as a data frame or a matrix with
columns representing the number of landslide presence cells, the number
of landslide absence cells, and then one column for each predictor.
Then use glm() with a call something like:
    result <- glm(cbind(present, absent) ~ (a + b + c + d)^3,
                  family = binomial, data = name.of.data.frame)
In help("glm"), there's a sentence under "Details" which describes
the cbind() syntax I've used above, and help("formula") explains
the (.)^3 syntax.
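As a minimal sketch of the call above, and of how it relates to a log-linear fit, the following uses simulated data. The predictor names `slope` and `rock`, the cell counts, and the effect sizes are all hypothetical and only serve the illustration. In the log-linear (Poisson) version, both the presence and the absence counts go into a single count column, and landslide status becomes one more factor in the model.

```r
# Hypothetical grouped data: one row per combination of two categorical
# predictors, with counts of landslide-present and landslide-absent cells.
set.seed(1)
d <- expand.grid(slope = factor(c("low", "mid", "high"),
                                levels = c("low", "mid", "high")),
                 rock  = factor(c("A", "B")))
n <- rep(200, nrow(d))
p <- plogis(-2 + 0.5 * (as.integer(d$slope) - 1) + 0.4 * (d$rock == "B"))
d$present <- rbinom(nrow(d), n, p)
d$absent  <- n - d$present

# Logistic regression on grouped counts: both columns form the response.
fit_logit <- glm(cbind(present, absent) ~ slope + rock,
                 family = binomial, data = d)

# Log-linear (Poisson) counterpart: stack presence and absence counts into
# one count column and add landslide status (1 = present, 0 = absent).
long <- rbind(
  data.frame(d[c("slope", "rock")], status = 1, count = d$present),
  data.frame(d[c("slope", "rock")], status = 0, count = d$absent)
)
# slope * rock reproduces the total number of cells in each combination;
# the terms involving status play the role of the logistic coefficients.
fit_loglin <- glm(count ~ slope * rock + status + status:slope + status:rock,
                  family = poisson, data = long)

# Coefficients of the terms involving status, for comparison with fit_logit.
b_loglin <- coef(fit_loglin)[grep("status", names(coef(fit_loglin)))]
print(cbind(logit = coef(fit_logit), loglin = b_loglin))
```

The two columns printed at the end should agree up to numerical tolerance, which is one way to see that the log-linear formulation adds no interaction capability beyond what the logistic model already offers.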
- tom blackwell - u michigan medical school - ann arbor -
On Mon, 7 Apr 2003, orkun wrote:
> I tried logistic regression. But because of providing interaction
> capability of predictors, I want to use log-linear method.
> I hesitate the way I should use landslide count as response variable.
> only landslide presence data should be regarded ? or both landslide
> presence and absent data should be regarded as response variable ?
1. What did you use for logistic regression? "glm"? If your response variable is "number of landslides", I would think that "glm" with "family = poisson" might be appropriate. Have you checked the R help for "?glm" and "?family" and the R search site at "http://www.r-project.org/" -> search -> "R search site"?

In particular, if you don't have "Modern Applied Statistics with S" by Venables and Ripley (2002), I suggest you get a copy. This is the best reference I know on R. If you've digested Venables and Ripley, at least on "glm", the next best book I know for your issues may be McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models (London: Chapman and Hall).

2. You can use interactions with logistic regression, as you could with Poisson regression, "glm(..., family = poisson)". If your explanatory variables are all categorical, then you might have a problem with estimating too many parameters: if you have 5 categories in one variable and 7 in another, the main effects will estimate 4 = (5-1) and 6 = (7-1) parameters, and the interaction will involve 4*6 = 24 parameters. Moreover, the full model then has 1 + 4 + 6 + 24 = 35 parameters, so if you do NOT have data on all of the 5*7 = 35 possible combinations, you won't be able to estimate all the parameters in the interaction.

I suggest you try to construct at least ordinal scales, code the categories as numbers wherever that might be done plausibly, then look for linear terms, parabolics, etc., and linear*linear interactions, etc., THEN look for large residuals from the fitted model.

Hope this helps,
Spencer Graves
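The parameter counts discussed above can be checked directly with model.matrix(). This is only a sketch: the factors `a` and `b` and the numeric scores `a_num` and `b_num` are hypothetical stand-ins for real predictors, and whether numeric coding is plausible depends entirely on what the categories mean.

```r
# Hypothetical factors with 5 and 7 categories, fully crossed (35 combinations).
d <- expand.grid(a = factor(1:5), b = factor(1:7))

X_main <- model.matrix(~ a + b, data = d)   # intercept + 4 + 6 columns
X_full <- model.matrix(~ a * b, data = d)   # adds the a:b interaction columns
n_main  <- ncol(X_main)
n_inter <- ncol(X_full) - ncol(X_main)      # number of interaction parameters

# Drop three of the 35 combinations (here a = 3, 4, 5 with b = 7).
# The design matrix keeps all its columns but loses rank, so some
# interaction coefficients can no longer be estimated (glm reports NA).
d_miss    <- d[-(33:35), ]
X_miss    <- model.matrix(~ a * b, data = d_miss)
rank_miss <- qr(X_miss)$rank

# Coding the categories as ordered scores, where that is plausible,
# needs far fewer parameters: intercept, two slopes, one linear*linear term.
d$a_num <- as.integer(d$a)
d$b_num <- as.integer(d$b)
X_lin <- model.matrix(~ a_num * b_num, data = d)

print(c(main = n_main, interaction = n_inter,
        columns_missing = ncol(X_miss), rank_missing = rank_miss,
        linear = ncol(X_lin)))
```

A rank smaller than the number of columns is the signal that the data cannot support every interaction term.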
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
With this type of binary spatial data it would be nice to use autologistic regression. I recently asked the list if there was an R autologit function and nobody thought that there was. However, Jennifer Hoeting has an S+ (and C++) version at her website: http://www.stat.colostate.edu/~jah/software/

Mantel's test has also been used successfully on binary spatially autocorrelated data:

Schick, R.S., Urban, D.L., 2000, Spatial Components of Bowhead Whale (Balaena mysticetus) Distribution in the Alaskan Beaufort Sea. Canadian Journal of Fisheries and Aquatic Sciences, 57, 2193-2200.

I have a highly confounded binary dataset that has first and second order spatial effects (live vs. dead trees and a host of environmental correlates) for which I'm looking for a tidy analytical framework. I haven't really found the right thing just yet.

Cheers,
Andy
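For readers unfamiliar with the idea, here is a rough sketch of one common way to approximate an autologistic fit in base R: add an autocovariate (the number of neighbouring cells where the outcome is present) to an ordinary logistic regression, which amounts to a pseudo-likelihood fit. This is not the software linked above. The grid size, the covariate `x`, and the four-neighbour definition are assumptions chosen for illustration, and the simulated outcome has no real spatial dependence, so the example shows the mechanics only.

```r
# Hypothetical 30 x 30 grid with one covariate and a 0/1 outcome.
set.seed(42)
nr <- 30
nc <- 30
x <- matrix(rnorm(nr * nc), nr, nc)
y <- matrix(rbinom(nr * nc, 1, plogis(-1 + x)), nr, nc)

# Autocovariate: count of "present" cells among the four rook neighbours
# (up, down, left, right). Padding with zeros treats cells outside the
# grid as absent, which is an assumption about edge handling.
pad <- matrix(0, nr + 2, nc + 2)
pad[2:(nr + 1), 2:(nc + 1)] <- y
autocov <- pad[1:nr,       2:(nc + 1)] +   # neighbour above
           pad[3:(nr + 2), 2:(nc + 1)] +   # neighbour below
           pad[2:(nr + 1), 1:nc] +         # neighbour to the left
           pad[2:(nr + 1), 3:(nc + 2)]     # neighbour to the right

d <- data.frame(y = as.vector(y), x = as.vector(x),
                autocov = as.vector(autocov))

# Pseudo-likelihood fit: ordinary logistic regression with the autocovariate.
fit <- glm(y ~ x + autocov, family = binomial, data = d)
print(coef(fit))
```

The standard errors reported by glm() ignore the dependence between neighbouring cells, so they should be treated with caution for this kind of fit.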