Hello,

I have spatial data containing the number of landslide-presence cells
and the number of landslide-absence cells with respect to the same
landslide predictors. The predictors are essentially categorical.

I tried logistic regression, but because I want to model interactions
between the predictors, I would like to use a log-linear model.
I am unsure how to use the landslide counts as the response variable:
should only the landslide-presence counts be used, or should both the
presence and the absence counts be treated as the response?

I would appreciate any information.

Thanks in advance,

Ahmet Temiz
Gen Dir of Disaster of Affairs
TURKEY

______________________________________
The views and opinions expressed in this e-mail message are the sender's
own and do not necessarily represent the views and the opinions of
Earthquake Research Dept. of General Directorate of Disaster Affairs.

The ideas and opinions in this e-mail belong to the sender personally
and are not legally binding on T.C. B.I.B. Afet Isleri Gn.Mud. Deprem
Arastirma Dairesi.
The presence/absence nature of the outcome variable strongly supports
using logistic regression and nothing else. I strongly encourage you
to stick with logistic regression. The model formula and interaction
term capabilities in R are just the same for logistic regression as for
log-linear models. (In some textbooks, log-linear models are used as
the motivation and example for introducing the ideas of interaction
terms, but once introduced, the ideas apply very generally.)
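
As a minimal sketch of why little is lost by staying with logistic
regression: for grouped categorical data, a logit model fitted to
cbind(present, absent) and a Poisson log-linear model fitted to the
stacked presence and absence counts give the same estimates, provided
the log-linear model contains the full predictor-by-predictor margin.
Everything below is an assumption for illustration: the data are
simulated, and the names d, long, a, b and y are hypothetical.

```r
set.seed(1)

# Hypothetical grouped data: one row per combination of two categorical predictors
d <- expand.grid(a = factor(c("a1", "a2", "a3")), b = factor(c("b1", "b2")))
d$total <- 200
p <- plogis(-1 + 0.5 * (d$a == "a2") + 1.0 * (d$a == "a3") - 0.7 * (d$b == "b2"))
d$present <- rbinom(nrow(d), size = d$total, prob = p)
d$absent  <- d$total - d$present

# Logistic regression: BOTH counts form the two-column response
fit_logit <- glm(cbind(present, absent) ~ a + b, family = binomial, data = d)

# Log-linear version: stack the counts and add the outcome y as a factor
long <- rbind(
  data.frame(a = d$a, b = d$b, y = "absent",  count = d$absent),
  data.frame(a = d$a, b = d$b, y = "present", count = d$present)
)
long$y <- factor(long$y, levels = c("absent", "present"))

# a * b reproduces the predictor margin; the y terms carry the logit model
fit_loglin <- glm(count ~ a * b + y * (a + b), family = poisson, data = long)

# The coefficients involving y correspond to the logit coefficients
idx <- grep("ypresent", names(coef(fit_loglin)))
print(cbind(logit = coef(fit_logit), loglinear = coef(fit_loglin)[idx]))
```

In this sketch the log-linear fit needs both the presence and the
absence counts as well, so it answers the same question with more
nuisance parameters.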
I would set up the data as you have, as a data frame or a matrix with
columns representing the number of landslide presence cells, the number
of landslide absence cells, and then one column for each predictor.
Then use glm() with a call something like:
result <- glm(cbind(present, absent) ~ (a+b+c+d)^3, family=binomial,
data = name.of.data.frame)
In help("glm"), there's a sentence under "Details" which describes
the cbind() syntax I've used above, and help("formula") explains
the (.)^3 syntax.
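
A small sketch of what the two pieces of syntax do, assuming four
hypothetical two-level predictors named a, b, c and d as in the call
above, with simulated counts standing in for real landslide data:

```r
# (a + b + c + d)^3 expands to all main effects plus all two-way and
# three-way interactions; the four-way interaction is left out.
labs <- attr(terms(~ (a + b + c + d)^3), "term.labels")
print(labs)
length(labs)   # 4 main effects + 6 two-way + 4 three-way = 14 terms

# Simulated grouped data: one row per combination of the predictors
set.seed(42)
cells <- expand.grid(a = factor(c("no", "yes")), b = factor(c("no", "yes")),
                     c = factor(c("no", "yes")), d = factor(c("no", "yes")))
cells$present <- rbinom(nrow(cells), size = 100, prob = plogis(rnorm(nrow(cells))))
cells$absent  <- 100 - cells$present

# cbind(successes, failures) is the two-column binomial response
result <- glm(cbind(present, absent) ~ (a + b + c + d)^3,
              family = binomial, data = cells)

# 16 cells minus 15 coefficients leaves 1 residual degree of freedom
df.residual(result)
```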
- tom blackwell - u michigan medical school - ann arbor -
On Mon, 7 Apr 2003, orkun wrote:
> [...]
> I hesitate the way I should use landslide count as response variable.
> only landslide presence data should be regarded ? or both landslide
> presence and absent data should be regarded as response variable ?
1. What did you use for logistic regression? "glm"? If your response
variable is "number of landslides", I would think that "glm" with
"family = poisson" might be appropriate. Have you checked the R help
for "?glm" and "?family" and the R search site at
"http://www.r-project.org/" -> search -> "R search site"?

In particular, if you don't have "Modern Applied Statistics with S" by
Venables and Ripley (2002), I suggest you get a copy. This is the best
reference I know on R. If you've digested Venables and Ripley, at least
on "glm", the next best book I know for your issues may be
McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models
(London: Chapman and Hall).

2. You can use interactions with logistic regression, as you could
with Poisson regression, "glm(..., family = poisson)". If your
explanatory variables are all categorical, then you might have a
problem with estimating too many parameters: If you have 5 categories
in one variable and 7 in another, the main effects will estimate
4 = (5-1) and 6 = (7-1) parameters, and the interaction will involve
4*6 = 24 parameters. Moreover, if you do NOT have data on at least 24
sufficiently different combinations out of the 5*7 = 35 possible, you
won't be able to estimate all the parameters in the interaction.

I suggest you try to construct at least ordinal scales, code the
categories as numbers wherever that might be done plausibly, then look
for linear terms, parabolics, etc., and linear*linear interactions,
etc., THEN look for large residuals from the fitted model.

Hope this helps,
Spencer Graves

orkun wrote:
> [...]
> I hesitate the way I should use landslide count as response variable.
> only landslide presence data should be regarded ? or both landslide
> presence and absent data should be regarded as response variable ?
> [...]
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
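
The parameter counts in Spencer Graves's reply can be checked with
model.matrix(). This is only a sketch: the factors f5 and f7, the
choice of which cells are missing, and the numeric scores s5 and s7 are
all hypothetical, and treating the category order as meaningful is an
assumption that has to be justified for real predictors.

```r
# Two hypothetical categorical predictors with 5 and 7 levels
d <- expand.grid(f5 = factor(1:5), f7 = factor(1:7))

# Design matrix for main effects plus interaction
X <- model.matrix(~ f5 * f7, data = d)
table(attr(X, "assign"))   # columns per term: intercept 1, f5 4, f7 6, f5:f7 24
ncol(X)                    # 35 coefficients in total
qr(X)$rank                 # 35: estimable when every combination is observed

# Suppose three of the 35 combinations never occur in the data
keep  <- !(d$f5 == "5" & d$f7 %in% c("5", "6", "7"))
X_obs <- X[keep, ]
qr(X_obs)$rank             # 32: three coefficients can no longer be estimated

# Coding the categories as numeric scores needs far fewer parameters:
# linear and quadratic terms plus a linear-by-linear interaction
d$s5 <- as.numeric(d$f5)
d$s7 <- as.numeric(d$f7)
X_sc <- model.matrix(~ poly(s5, 2) + poly(s7, 2) + s5:s7, data = d)
ncol(X_sc)                 # 6 coefficients
```

With rank-deficient designs like X_obs, glm() reports the coefficients
it cannot estimate as NA.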