thr3ads.net - R help - [R] evaluation methods for logistic regression with proportion data [Dec 2005]

ahimsa campos arceiz

2005-Dec-26 10:28 UTC

[R] evaluation methods for logistic regression with proportion data

Dear list-members,

I have carried out a logistic regression analysis of the spatial
distribution of an ecological phenomenon (wildlife-caused crop damage). I
divided the region into 5x5 km grid cells, and in each cell I conducted a
number of questionnaires to assess the presence of crop damage in
particular houses. As a result, my dependent variable is not simple
presence/absence data but a proportion of positive responses (number of
positive responses / number of questionnaires) per grid cell.

I used glm to fit a suitable model (the variable and model selection 
process went fine).
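
For readers unfamiliar with how proportion data can be passed to glm, here is a minimal sketch. The data are simulated, and the column names (n_pos, n_quest, forest) are assumptions for illustration, not the variables of the actual study. It shows the two usual ways of specifying a binomial response, which should give the same coefficients.

```r
# Hypothetical data: one row per grid cell (all names and values are made up).
set.seed(1)
grids <- data.frame(
  n_quest = sample(5:10, 30, replace = TRUE),  # questionnaires per cell
  forest  = runif(30)                          # illustrative predictor
)
grids$n_pos <- rbinom(30, size = grids$n_quest,
                      prob = plogis(-1 + 2 * grids$forest))

# Form 1: two-column response (successes, failures).
fit1 <- glm(cbind(n_pos, n_quest - n_pos) ~ forest,
            family = binomial, data = grids)

# Form 2: proportion as the response, number of trials as prior weights.
fit2 <- glm(n_pos / n_quest ~ forest,
            family = binomial, weights = n_quest, data = grids)

# The two specifications should agree on the estimated coefficients.
same_coef <- isTRUE(all.equal(coef(fit1), coef(fit2), tolerance = 1e-6))
print(same_coef)
```

Either form keeps the number of questionnaires per cell in the model, so cells with more questionnaires carry more weight than they would if the bare proportion were modelled without weights.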

The problem comes once I have the final model. Since my observed data 
have no true positives or negatives (only proportions of positives), I 
cannot calculate specificity or sensitivity, and therefore (I think) cannot 
use statistics like kappa or the area under the ROC curve (AUC) to evaluate 
the performance of my model.
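
One possible way around this, sketched here as an assumption rather than a recommendation from the thread: if the counts behind each proportion are available, each questionnaire can be treated as an individual 0/1 outcome that shares its cell's fitted probability, and a rank-based (Mann-Whitney) AUC can then be computed. The counts and fitted values below are invented for illustration. Responses within a cell are unlikely to be independent, so the result should be read with caution.

```r
# Hypothetical per-cell values (not taken from the post):
n_pos   <- c(1, 5, 3, 6)              # positive responses per cell
n_quest <- c(5, 9, 6, 7)              # questionnaires per cell
p_hat   <- c(0.30, 0.60, 0.45, 0.85)  # fitted probability per cell

# Expand to one entry per questionnaire: 1 = damage reported, 0 = not.
y <- unlist(mapply(function(pos, n) rep(c(1, 0), c(pos, n - pos)),
                   n_pos, n_quest, SIMPLIFY = FALSE))
score <- rep(p_hat, times = n_quest)

# Rank-based AUC (Mann-Whitney statistic); average ranks handle ties.
auc_rank <- function(y, score) {
  r  <- rank(score)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc <- auc_rank(y, score)
print(auc)
```

Because every questionnaire in a cell has the same score, many positive/negative pairs are tied. The average-rank formula counts each such tie as one half.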

Can anybody suggest a suitable method to evaluate the performance of this 
kind of "proportion data" model that can be implemented in R? Does anybody 
know of an alternative as elegant as the ROC AUC for this case?

Besides the internal evaluation, I am planning to use bootstrap resampling 
in order to produce "pseudo-independent" data to evaluate the performance 
of the model.
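
One common way to use bootstrap resampling for this purpose is an optimism correction: refit the model on each resample of grid cells, then compare its error on the resample with its error on the original data. The sketch below assumes that this is the kind of evaluation meant. The data are simulated, and the column names and the weighted RMSE error measure are illustrative choices, not part of the original analysis.

```r
set.seed(42)
# Hypothetical grid-cell data; names and predictor are assumptions.
grids <- data.frame(n_quest = sample(5:10, 40, replace = TRUE),
                    forest  = runif(40))
grids$n_pos <- rbinom(40, grids$n_quest, plogis(-1 + 2 * grids$forest))

fit_model <- function(d) {
  glm(cbind(n_pos, n_quest - n_pos) ~ forest, family = binomial, data = d)
}

# Error measure: RMSE between observed and predicted proportions,
# weighted by the number of questionnaires in each cell.
wrmse <- function(fit, d) {
  p <- predict(fit, newdata = d, type = "response")
  sqrt(weighted.mean((d$n_pos / d$n_quest - p)^2, d$n_quest))
}

# Apparent error: model fitted and evaluated on the same data.
apparent <- wrmse(fit_model(grids), grids)

# Optimism: error of each bootstrap model on the original data
# minus its error on the bootstrap sample it was fitted to.
B <- 200
optimism <- replicate(B, {
  boot <- grids[sample(nrow(grids), replace = TRUE), ]
  fb   <- fit_model(boot)
  wrmse(fb, grids) - wrmse(fb, boot)
})

corrected <- apparent + mean(optimism)
print(c(apparent = apparent, corrected = corrected))
```

Resampling whole grid cells (rather than individual questionnaires) keeps the responses within a cell together, which seems the more natural unit here. If model selection was part of the fitting process, it would need to be repeated inside each bootstrap replicate for the correction to be honest.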

An example of what my data look like:
       observed   predicted
         0.200   0.4079725
         0.556   0.5987730
         0.500   0.9140571
         0.857   0.8878290
         0.875   0.7845368
         1.000   0.8575587
         1.000   0.9406087
         0.778   0.5861066
         0.600   0.6204616
         1.000   0.8585725
         0.000   0.2949169
         0.100   0.1291246
         0.444   0.7627612

observed = proportion of positive responses to crop damage questionnaires 
in a 25 km^2 grid cell
predicted = values produced by the final glm(binomial) model on the same 
dataset that was used to develop the model
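
As a simple starting point, the observed and predicted columns above can be compared directly with a few summary statistics. This is only a sketch: these unweighted measures ignore how many questionnaires lie behind each proportion (not shown in the example), and they describe fit on the training data rather than predictive performance.

```r
# Observed and predicted proportions copied from the example table above.
obs  <- c(0.200, 0.556, 0.500, 0.857, 0.875, 1.000, 1.000,
          0.778, 0.600, 1.000, 0.000, 0.100, 0.444)
pred <- c(0.4079725, 0.5987730, 0.9140571, 0.8878290, 0.7845368,
          0.8575587, 0.9406087, 0.5861066, 0.6204616, 0.8585725,
          0.2949169, 0.1291246, 0.7627612)

rmse <- sqrt(mean((obs - pred)^2))           # root mean squared error
mae  <- mean(abs(obs - pred))                # mean absolute error
r    <- cor(obs, pred)                       # Pearson correlation
rho  <- cor(obs, pred, method = "spearman")  # rank correlation

print(round(c(rmse = rmse, mae = mae, pearson = r, spearman = rho), 3))

# Quick visual check against the 1:1 line.
plot(pred, obs, xlim = c(0, 1), ylim = c(0, 1),
     xlab = "predicted", ylab = "observed")
abline(0, 1, lty = 2)
```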

Thanks a lot in advance for any suggestion

Ahimsa


Ahimsa Campos Arceiz
The University Museum,
The University of Tokyo
Hongo 7-3-1, Bunkyo-ku,
Tokyo 113-0033
phone +81-(0)3-5841-2824
cell +81-(0)80-5402-7702 
