joris meys
2009-Mar-24 19:30 UTC
[R] modelling probabilities instead of binary data with logistic regression
Dear all,

I have a dataset where I reduced the dimensionality, and now I have a response variable with probabilities/proportions between 0 and 1. I wanted to do a logistic regression on those, but the function glm refuses to do that with non-integer values in the response. I also tried lrm, but that one interprets the probabilities as different levels and gives a different intercept for every level. Not exactly what I want...

Is there a way to specify that the response variable should be interpreted as a probability?

Kind regards
Joris
joris meys
2009-Mar-24 19:59 UTC
[R] modelling probabilities instead of binary data with logistic regression
Thank you all for the very fast answers.

My proportions come from a factor analysis on a number of binary variables, in order to avoid having to fit 12 logistic regressions on the same dataset. By scaling the obtained scores to lie between 0 and 1, I get weighted averages of the response combinations I'm interested in.

I tried the betareg function, but unfortunately that one can't deal with probabilities of exactly 0 and 1. I'll have to do the logit transformation manually, I'm afraid.

Thanks for the help.

Kind regards
Joris

On Tue, Mar 24, 2009 at 8:48 PM, Kjetil Halvorsen <kjetilbrinchmannhalvorsen@gmail.com> wrote:
> You didn't say how your proportions have arisen! If each corresponds to one
> observation, you could simply simulate indicator variables with those
> proportions as probabilities, fit glm, repeat many times, and average the
> results!
>
> More seriously, you could transform the proportions to logits
> logit <- log(p/(1-p))
> and fit a linear regression.
>
> Kjetil
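The logit-transform suggestion quoted in this message can be sketched as follows. This is only a minimal illustration on simulated data, not Joris's dataset; the names `x`, `p`, `eps` and `p_adj` are hypothetical. Because log(p/(1-p)) is infinite at p = 0 and p = 1, the sketch first pulls the proportions slightly inside the open interval. That clamping step is an assumption added here as one common workaround; it was not part of the original suggestion, and the choice of `eps` is arbitrary.

```r
# Simulated example data (hypothetical; not the data from the thread)
set.seed(1)
n <- 100
x <- rnorm(n)
p <- plogis(0.5 + 1.2 * x + rnorm(n, sd = 0.5))  # proportions strictly inside (0, 1)
p[1:3] <- c(0, 1, 0)                              # add a few exact 0s and 1s

# log(p / (1 - p)) is -Inf at 0 and Inf at 1, so clamp first.
# eps is an arbitrary choice (assumption); results can be sensitive to it.
eps <- 0.001
p_adj <- pmin(pmax(p, eps), 1 - eps)

# Logit transform: qlogis(p) computes log(p / (1 - p))
logit_p <- qlogis(p_adj)

# Ordinary linear regression on the logit scale
fit_lm <- lm(logit_p ~ x)
coef(fit_lm)

# Back-transform fitted values to the probability scale
p_hat <- plogis(fitted(fit_lm))
```

Note that this models the mean of the logits, which is not the same as modelling the mean proportion on the logit scale, so the coefficients need not match those of a binomial-type GLM on the same data.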
ONKELINX, Thierry
2009-Mar-25 09:32 UTC
[R] modelling probabilities instead of binary data with logistic regression
Hi Joris,

glm() handles proportions but will give you a warning (and not an error) about non-integer values. So if you get an error, then there should be something wrong with the syntax, the model or the data. Can you provide us with a reproducible example?

Cheers,
Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
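Thierry's point can be checked on simulated data. The following is a minimal sketch under stated assumptions: the data are made up and the names (`x`, `p`, `warn_msgs`, `fit_bin`, `fit_quasi`) are hypothetical. It fits `glm()` with `family = binomial` on non-integer proportions and collects any warnings instead of printing them. It then fits the same model with `family = quasibinomial`, which uses the same mean model but estimates a dispersion parameter. The quasibinomial fit is an addition for illustration, not something proposed in the thread.

```r
# Simulated example data (hypothetical; not the data from the thread)
set.seed(2)
n <- 100
x <- rnorm(n)
p <- plogis(-0.3 + 0.8 * x + rnorm(n, sd = 0.4))  # proportions in (0, 1)
p[1:2] <- c(0, 1)                                  # exact 0 and 1 are allowed here

# Fit a binomial GLM on the proportions, collecting warnings in a vector
warn_msgs <- character(0)
fit_bin <- withCallingHandlers(
  glm(p ~ x, family = binomial),
  warning = function(w) {
    warn_msgs <<- c(warn_msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
warn_msgs  # the fit succeeds; only warnings are raised

# Same mean model with a quasi-likelihood family
fit_quasi <- glm(p ~ x, family = quasibinomial)

# Point estimates agree; standard errors differ by the estimated dispersion
cbind(binomial = coef(fit_bin), quasibinomial = coef(fit_quasi))
summary(fit_quasi)$dispersion
```

If the proportions are based on known numbers of trials, passing those counts through the `weights` argument of `glm()` is the usual way to give each observation its proper weight.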