Patrick Giraudoux
2011-Jun-12 09:06 UTC
[R] logistic regression where the independent variable is a ratio
Dear Listers,

I have collected data in 6 geographical areas on the prevalence of a parasite in humans and in foxes. The results are expressed as the number of positive and negative cases in humans and foxes in the following data.frame:

Pvtab <- structure(list(posHum = c(3, 5, 3, 17, 0, 4),
    negHum = c(32631, 16293, 27988, 231282, 53215, 51046),
    posFox = c(18, 23, 18, 191, 12, 55),
    negFox = c(14, 24, 62, 105, 55, 43)),
    .Names = c("posHum", "negHum", "posFox", "negFox"),
    row.names = c("zone 1", "zone 2", "zone 3", "zone 4", "zone 5", "zone 6"),
    class = "data.frame")

I want to check for a possible link between prevalence in humans (the response variable) and prevalence in foxes (the independent variable). I thought about a logistic regression of the form:

# compute the prevalence in foxes for each area
pvFox <- Pvtab$posFox / (Pvtab$posFox + Pvtab$negFox)
mod0 <- glm(cbind(Pvtab$posHum, Pvtab$negHum) ~ pvFox, family = binomial)

But in this case the number of foxes used to compute the prevalence estimate in foxes (pvFox) is simply not taken into account in the model, so a prevalence estimated from a few dozen foxes counts as much as one estimated from several hundred. I can hardly figure out how to account for it (weighting the model by the square root of the number of foxes in each area?).

Any advice on how best to model one prevalence as a response to another would be appreciated.

Patrick
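
P.S. To make the concern concrete, here is a minimal sketch that tabulates how many foxes stand behind each prevalence estimate, together with a rough standard error. The names nFox and sePvFox are mine, and the Wald (normal-approximation) standard error is only an assumption used to illustrate the unequal precision of pvFox across zones, not a proposed solution.

```r
# Same data as in the post, rebuilt with data.frame() for readability
Pvtab <- data.frame(
  posHum = c(3, 5, 3, 17, 0, 4),
  negHum = c(32631, 16293, 27988, 231282, 53215, 51046),
  posFox = c(18, 23, 18, 191, 12, 55),
  negFox = c(14, 24, 62, 105, 55, 43),
  row.names = paste("zone", 1:6)
)

# Number of foxes examined in each zone (hypothetical helper name)
nFox <- Pvtab$posFox + Pvtab$negFox

# Prevalence in foxes for each zone, as in the model above
pvFox <- Pvtab$posFox / nFox

# Rough Wald standard error of each prevalence estimate (an assumption,
# used here only to show that precision differs between zones)
sePvFox <- sqrt(pvFox * (1 - pvFox) / nFox)

# One row per zone: sample size, prevalence estimate, approximate SE
print(data.frame(nFox = nFox,
                 pvFox = round(pvFox, 3),
                 sePvFox = round(sePvFox, 3),
                 row.names = rownames(Pvtab)))
```

The fox sample sizes range from 32 (zone 1) to 296 (zone 4), so the predictor is measured with very different precision from one zone to the next, which mod0 as written cannot see.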