Patrick (Malone Quantitative)
2020-Aug-01 18:15 UTC
[R] Dependent Variable in Logistic Regression
No, R does not. glm() does in order to do logistic regression.

On Sat, Aug 1, 2020 at 2:11 PM Paul Bernal <paulbernal07 at gmail.com> wrote:
> Hi Bert,
>
> Thank you for the kind reply.
>
> But what if I don't turn the variable into a factor? Let's say that in
> Excel I just coded the variable as 1s and 0s, imported the dataset
> into R, and fitted the logistic regression without turning any categorical
> variable or dummy variable into a factor.
>
> Does R require every dummy variable to be treated as a factor?
>
> Best regards,
>
> Paul
>
> On Sat, Aug 1, 2020 at 12:59 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>
>> x <- factor(0:1)
>> x <- factor(c("yes", "no"))
>>
>> will produce identical results up to labeling.
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along and
>> sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>>
>> On Sat, Aug 1, 2020 at 10:40 AM Paul Bernal <paulbernal07 at gmail.com> wrote:
>>
>>> Dear friends,
>>>
>>> Hope you are doing great. I want to fit a logistic regression in R, where
>>> the dependent variable is the covid status (I used 1 for covid positives,
>>> and 0 for covid negatives), but when I ran the glm, R complained that I
>>> should make the dependent variable a factor.
>>>
>>> What would be more advisable: to keep the dependent variable with 1s and
>>> 0s, or to code it as yes/no and then make it a factor?
>>>
>>> Any guidance will be greatly appreciated,
>>>
>>> Best regards,
>>>
>>> Paul
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.

--
Patrick S. Malone, Ph.D., Malone Quantitative
NEW Service Models: http://malonequantitative.com
He/Him/His
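A minimal sketch of the "identical results up to labeling" point, using simulated data (the variable names and the simulation itself are illustrative assumptions, not from the thread). For a factor response, R's binomial family treats the first level as failure and every other level as success, so a yes/no factor with default alphabetical levels should match the 0/1 coding, while reversing the levels should flip the sign of the coefficients:

```r
# Simulated data; all names here are illustrative assumptions.
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(0.5 * x))   # numeric 0/1 response

# Levels sort alphabetically: "no" is the first level (failure),
# "yes" is the second level (success).
yf <- factor(ifelse(y == 1, "yes", "no"))
levels(yf)

fit_num <- glm(y ~ x, family = binomial)
fit_fac <- glm(yf ~ x, family = binomial)
all.equal(coef(fit_num), coef(fit_fac))

# Reversing the level order makes "no" the modelled outcome,
# which negates the intercept and the slope.
yr <- factor(yf, levels = c("yes", "no"))
fit_rev <- glm(yr ~ x, family = binomial)
all.equal(coef(fit_rev), -coef(fit_num))
```

Checking levels() before fitting is a cheap way to confirm which outcome the model treats as "success".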
Dear friend,

I am aware that I have a binomial dependent variable, which is covid status (1 if covid positive, and 0 otherwise).

My question was whether R requires a binomial response variable to be turned into a factor or not, that's all.

Cheers,

Paul

On Sat, Aug 1, 2020 at 1:22 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> ... yes, but so does lm() for a categorical **INdependent** variable with
> more than 2 numerically labeled levels. n levels = (n-1) df for a
> categorical covariate, but 1 for a continuous one (unless more complex
> models are explicitly specified of course). As I said, the OP seems
> confused about whether he is referring to the response or covariates. Or
> maybe he just made the same typo I did.
>
> Bert Gunter
>
> On Sat, Aug 1, 2020 at 11:15 AM Patrick (Malone Quantitative)
> <malone at malonequantitative.com> wrote:
>
>> No, R does not. glm() does in order to do logistic regression.
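The degrees-of-freedom remark about covariates can be sketched as follows, assuming a made-up covariate `g` coded 1 to 4 and a simulated response (none of these names or numbers come from the thread). Left numeric, `g` gets a single slope; wrapped in factor(), it gets one dummy per non-reference level:

```r
# Hypothetical data: a 4-level grouping variable stored as numbers 1..4
set.seed(2)
g <- rep(1:4, each = 25)
y <- rnorm(100, mean = c(0, 2, 1, 3)[g])   # group means are not linear in g

fit_numeric <- lm(y ~ g)           # treated as continuous: 1 df
fit_factor  <- lm(y ~ factor(g))   # treated as categorical: 4 - 1 = 3 df

length(coef(fit_numeric))   # 2: intercept + one slope
length(coef(fit_factor))    # 4: intercept + three dummies
anova(fit_numeric, fit_factor)

# With only two levels the distinction disappears, since 2 - 1 = 1:
d <- rep(0:1, 50)
all.equal(unname(coef(lm(y ~ d))), unname(coef(lm(y ~ factor(d)))))
```

This is why a 0/1 dummy covariate can be left numeric, whereas a covariate with more than two numerically labeled levels generally needs factor() if it is meant to be categorical.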
Patrick (Malone Quantitative)
2020-Aug-01 18:38 UTC
[R] Dependent Variable in Logistic Regression
I didn't mean to imply that was the only time that it was required, only that it's not universal in R.

On Sat, Aug 1, 2020 at 2:22 PM Bert Gunter <bgunter.4567 at gmail.com> wrote:
> ... yes, but so does lm() for a categorical **INdependent** variable with
> more than 2 numerically labeled levels. n levels = (n-1) df for a
> categorical covariate, but 1 for a continuous one (unless more complex
> models are explicitly specified of course). As I said, the OP seems
> confused about whether he is referring to the response or covariates. Or
> maybe he just made the same typo I did.
Dear Paul,

I think that this thread has gotten unnecessarily complicated. The answer, as is easily demonstrated, is that a binary response for a binomial GLM in glm() may be a factor, a numeric variable, or a logical variable, with identical results; for example:

--------------- snip -------------

> set.seed(123)
> head(x <- rnorm(100))
[1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774  1.71506499
> head(y <- rbinom(100, 1, 1/(1 + exp(-x))))
[1] 0 1 1 1 1 0
> head(yf <- as.factor(y))
[1] 0 1 1 1 1 0
Levels: 0 1
> head(yl <- y == 1)
[1] FALSE  TRUE  TRUE  TRUE  TRUE FALSE

> glm(y ~ x, family=binomial)

Call:  glm(formula = y ~ x, family = binomial)

Coefficients:
(Intercept)            x
     0.3995       1.1670

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:     134.6
Residual Deviance: 114.9     AIC: 118.9

> glm(yf ~ x, family=binomial)

Call:  glm(formula = yf ~ x, family = binomial)

Coefficients:
(Intercept)            x
     0.3995       1.1670

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:     134.6
Residual Deviance: 114.9     AIC: 118.9

> glm(yl ~ x, family=binomial)

Call:  glm(formula = yl ~ x, family = binomial)

Coefficients:
(Intercept)            x
     0.3995       1.1670

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:     134.6
Residual Deviance: 114.9     AIC: 118.9

--------------- snip -------------

The original poster claimed to have encountered an error with a 0/1 numeric response, but didn't show any data or even a command. I suspect that the response was a character variable, but of course can't really know that.

Best,
John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/
Hello,

Inline.

At 20:01 on 01/08/2020, John Fox wrote:
> [...]
> The original poster claimed to have encountered an error with a 0/1
> numeric response, but didn't show any data or even a command. I
> suspect that the response was a character variable, but of course
> can't really know that.

So continuing with your example:

  > head(yc <- as.character(y))
  [1] "0" "1" "1" "1" "1" "0"
  > glm(yc ~ x, family=binomial)
  Error in weights * y : non-numeric argument to binary operator

But the OP says that

  [...] R complains that I should make the dependent variable a factor.

That is not what the error message says; it "asks" for a numeric argument to the '*' operator. We haven't seen the exact R message yet, so, like others have said, the OP should post it along with the code.

Hope this helps,

Rui Barradas
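If the response really did arrive as text, as suspected above, one way to check and repair it before fitting might look like the sketch below. The data frame, its column names, and its values are invented for illustration; the OP's actual data was never posted.

```r
# Hypothetical data frame in which the 0/1 column was read in as text
d <- data.frame(covid = c("1", "0", "1", "1", "0", "0", "1", "0"),
                age   = c(34, 51, 47, 62, 29, 45, 58, 38),
                stringsAsFactors = FALSE)

class(d$covid)                     # "character" is the thing to look for
table(d$covid, useNA = "ifany")    # check for stray codes or blanks

# Convert once, explicitly, and confirm that only 0 and 1 remain
d$covid01 <- as.integer(d$covid)
stopifnot(all(d$covid01 %in% c(0L, 1L)))

fit <- glm(covid01 ~ age, family = binomial, data = d)
coef(fit)
```

Running str() on the imported data frame right after reading it in shows the type of every column at once.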
I like using a logical response in cases like this, but put its construction in the formula so it is unambiguous when I look at the results later.

  > d <- data.frame(Covid=c("Pos","Pos","Neg","Pos","Neg","Neg"), Age=41:46)
  > glm(family=binomial, data=d, Covid=="Pos"~Age)

  Call:  glm(formula = Covid == "Pos" ~ Age, family = binomial, data = d)

  Coefficients:
  (Intercept)          Age
       52.810       -1.214

  Degrees of Freedom: 5 Total (i.e. Null);  4 Residual
  Null Deviance:     8.318
  Residual Deviance: 4.956     AIC: 8.956

Bill Dunlap
TIBCO Software
wdunlap tibco.com
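As a small follow-up sketch to the logical-response idiom, the same data can be fitted with an explicitly ordered factor to confirm that the two codings agree. The column name `CovidF` and the ages passed to predict() are assumptions added for illustration:

```r
d <- data.frame(Covid = c("Pos", "Pos", "Neg", "Pos", "Neg", "Neg"), Age = 41:46)

# Logical response built inside the formula: TRUE is the modelled outcome
fit_logical <- glm(Covid == "Pos" ~ Age, family = binomial, data = d)

# Equivalent factor coding: "Neg" first (failure), "Pos" second (success)
d$CovidF <- factor(d$Covid, levels = c("Neg", "Pos"))
fit_factor <- glm(CovidF ~ Age, family = binomial, data = d)

all.equal(coef(fit_logical), coef(fit_factor))

# Predicted probability of a positive at two illustrative ages
p <- predict(fit_logical, newdata = data.frame(Age = c(42, 45)),
             type = "response")
p
```

The advantage of the formula version is that the printed Call records exactly which value counts as success, so nothing depends on remembering how a factor's levels were ordered.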