y <- c(0,1,1,0,0,1,0,0,1,1,1,0,1,1,1,0,0,0,0,1)
age <- c(45,23,56,67,23,23,28,56,45,47,36,37,33,35,38,39,43,28,39,41)
smoking <- c(0,1,1,1,0,0,0,0,0,1,1,0,0,1,0,1,1,1,0,1)
hypertension <- c(1,1,0,1,0,1,0,1,1,0,1,1,1,1,1,1,0,0,1,0)
data <- data.frame(y, age, smoking, hypertension)
data
model <- glm(y ~ age + factor(smoking) + factor(hypertension), data,
             family = binomial(link = "logit"), na.action = na.omit)
summary(model)

From the sample data above, I want to analyse a case-control study of male individuals. The response variable y is disease status (1 = Case, 0 = Control), and the covariates are age, smoking status (1 = Yes, 0 = No), and hypertension (1 = Yes, 0 = No).

My objective is to fit models that predict disease status using at least two different methods, and then to compare those models. I think logistic regression will be the best fit for this case-control study. Are there other options in addition to logistic regression?

Kind regards,
Hana
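One possible reading of "at least two different methods" is to keep the same linear predictor and change only the link function. The sketch below is an assumption about what such a comparison might look like, not an answer given in this thread: it fits a logit and a probit model to the sample data from the post and compares them by AIC. The names fit_logit and fit_probit are hypothetical.

```r
# Sample data copied from the original post
y <- c(0,1,1,0,0,1,0,0,1,1,1,0,1,1,1,0,0,0,0,1)
age <- c(45,23,56,67,23,23,28,56,45,47,36,37,33,35,38,39,43,28,39,41)
smoking <- c(0,1,1,1,0,0,0,0,0,1,1,0,0,1,0,1,1,1,0,1)
hypertension <- c(1,1,0,1,0,1,0,1,1,0,1,1,1,1,1,1,0,0,1,0)
data <- data.frame(y, age, smoking, hypertension)

# Model 1: logistic regression (logit link), as in the post
fit_logit <- glm(y ~ age + factor(smoking) + factor(hypertension),
                 data = data, family = binomial(link = "logit"))

# Model 2 (assumed alternative): same predictors with a probit link
fit_probit <- glm(y ~ age + factor(smoking) + factor(hypertension),
                  data = data, family = binomial(link = "probit"))

# Compare the two fits; a lower AIC indicates the preferred model
aic_table <- AIC(fit_logit, fit_probit)
print(aic_table)
```

With only 20 observations, any difference in AIC between the two links is likely to be small, so this comparison is illustrative rather than conclusive.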
Ebert,Timothy Aaron
2022-Jun-15 13:46 UTC
[R] Model Comparision for case control studies in R
Disease status is missing from the sample data. Are age, disease, smoking, and/or hypertension correlated in any way, or are they independent (correlation = 0)? Are the correlations large enough to adversely influence your model?

Tim
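The correlation question raised in the reply can be checked directly on the posted sample data. The following is a minimal sketch, assuming that pairwise Pearson correlations and a cross-tabulation of the two binary covariates are an adequate first look. The names cor_mat and smoke_by_htn are hypothetical.

```r
# Sample data copied from the original post
y <- c(0,1,1,0,0,1,0,0,1,1,1,0,1,1,1,0,0,0,0,1)
age <- c(45,23,56,67,23,23,28,56,45,47,36,37,33,35,38,39,43,28,39,41)
smoking <- c(0,1,1,1,0,0,0,0,0,1,1,0,0,1,0,1,1,1,0,1)
hypertension <- c(1,1,0,1,0,1,0,1,1,0,1,1,1,1,1,1,0,0,1,0)
data <- data.frame(y, age, smoking, hypertension)

# Pairwise Pearson correlations among the response and the covariates
cor_mat <- cor(data)
print(round(cor_mat, 2))

# Cross-tabulation of the two binary covariates
smoke_by_htn <- table(smoking = data$smoking,
                      hypertension = data$hypertension)
print(smoke_by_htn)
```

Large off-diagonal values among age, smoking, and hypertension would suggest that the covariates are not independent, which is the concern the reply raises.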