I am trying to develop a prognostic model using logistic regression. I built a full model and approximate models using penalization with the Design package. I also tried chi-square criteria and step-down techniques, and used the bootstrap (BS) for model validation.

The main purpose is to develop a predictive model for a future patient population. One of the strong predictors pertains to the study design and would not mean much to a clinician/investigator in a real clinical situation, and I have been asked to remove it.

Can I propose a model and nomogram without that strong but irrelevant predictor? If yes, do I need to redo model calibration, discrimination, validation, etc., or can I just have 5 predictors instead of 6 in the prognostic model?

Thanks for your help,
Al
Al, I'd redo everything and report in the paper that your peculiar predictor was contributing strongly to the models that were built without excluding it. This is important information: your models get "confused" by the predictor (I'd consider this a lack of a certain kind of robustness, but I'm not a statistician).

HTH
Claudia

On 26.05.2011 14:42, El-Tahtawy, Ahmed wrote:
> [snip]
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Claudia Beleites
Spectroscopy/Imaging
Institute of Photonic Technology
Albert-Einstein-Str. 9
07745 Jena
Germany
email: claudia.beleites at ipht-jena.de
phone: +49 3641 206-133
fax: +49 2641 206-399
On May 26, 2011, at 7:42 AM, El-Tahtawy, Ahmed wrote:
> [snip]

Is it that the study design characteristic would not make sense to a clinician but is relevant to future samples, or that the study design characteristic is unique to the sample upon which the model was developed and is not relevant to future samples because they will not be in the same or a similar study?

Is the study design characteristic a surrogate for other factors that would be relevant to future samples? If so, you might engage in a conversation with the clinicians to gain some insights into other variables to consider for inclusion in the model that might, in turn, help to explain the effect of the study design variable.

Either way, if the covariate is removed, you of course need to engage in fully re-evaluating the model. You cannot just drop the covariate and continue to use model fit assessments made on the full model.

HTH,
Marc Schwartz
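A minimal sketch of what "fully re-evaluating" the reduced model could look like: refit without the design variable, then estimate bootstrap optimism for discrimination on the refitted model rather than reusing numbers from the full model. Everything here is an assumption for illustration: the data are simulated, and the names `x1`..`x5`, `region`, and `auc` are hypothetical. The Design package (and its successor rms) offers `validate()` and `calibrate()` for this kind of work; the sketch uses only base R so that it runs without extra packages.

```r
# Hypothetical simulated data: x1..x5 stand in for clinical predictors and
# region for the study-design variable. None of this is the poster's data.
set.seed(1)
n <- 500
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n),
                x4 = rnorm(n), x5 = rnorm(n),
                region = factor(sample(c("A", "B", "C"), n, replace = TRUE)))
lp <- -0.5 + 0.8 * d$x1 + 0.5 * d$x2 + 0.3 * d$x3 + 1.2 * (d$region == "B")
d$y <- rbinom(n, 1, plogis(lp))

# c-statistic (area under the ROC curve) via the rank-sum formula
auc <- function(p, y) {
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Refit the reduced model from scratch; region is simply not offered
reduced <- y ~ x1 + x2 + x3 + x4 + x5
fit <- glm(reduced, family = binomial, data = d)
apparent <- auc(fitted(fit), d$y)

# Bootstrap optimism: refit on each resample, then compare performance on
# the resample with performance of that same fit on the original data.
# If variable selection was part of model building, it would have to be
# repeated inside this loop as well.
B <- 200
optimism <- replicate(B, {
  b <- d[sample(n, replace = TRUE), ]
  fb <- glm(reduced, family = binomial, data = b)
  auc(fitted(fb), b$y) -
    auc(predict(fb, newdata = d, type = "response"), d$y)
})
corrected <- apparent - mean(optimism)
print(c(apparent = apparent, corrected = corrected))
```

The same loop could be extended to calibration (for example, the slope of the outcome regressed on the linear predictor), which is the other quantity that needs to be recomputed once a covariate is dropped.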
The strong predictor is the country/region where the study was conducted, so it is not important/useful for a clinician to use it (as long as he/she is in the USA or Europe).

Excluding that predictor makes another 2 previously insignificant predictors become significant! Can the new model have reliable predictive accuracy? I thought of excluding all patients from other countries and developing the model accordingly. Is excluding a lot of patients, and so compromising the power, more acceptable?

Thanks for your help...
Al

-----Original Message-----
From: Marc Schwartz [mailto:marc_schwartz at me.com]
Sent: Thursday, May 26, 2011 10:54 AM
To: El-Tahtawy, Ahmed
Cc: r-help at r-project.org
Subject: Re: [R] predictive accuracy

> [snip]
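One possible reading of why two predictors become significant once region is dropped is that they differ by region and pick up part of its effect. This is only a sketch under that assumption, using simulated data with hypothetical names (`region`, `x4`); it is not a diagnosis of the actual study.

```r
# Hypothetical simulation: x4 has no direct effect on the outcome, but its
# distribution differs by region, and region drives the event rate.
set.seed(2)
n <- 1000
region <- rbinom(n, 1, 0.5)              # 1 = region with the higher event rate
x4 <- rnorm(n, mean = 1.0 * region)      # x4 is shifted in that region
y <- rbinom(n, 1, plogis(-1 + 1.5 * region))

full <- glm(y ~ x4 + region, family = binomial)
reduced <- glm(y ~ x4, family = binomial)

# Wald p-values for x4 with and without region in the model
p_full <- coef(summary(full))["x4", "Pr(>|z|)"]
p_reduced <- coef(summary(reduced))["x4", "Pr(>|z|)"]
print(c(full = p_full, reduced = p_reduced))
```

If something like this is going on, the reduced model's coefficients partly describe the regional mix of the development sample, which is one reason its performance has to be assessed again rather than assumed.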
1. This is not about R, and should be taken off list.

2. You are wading in an alligator-infested swamp. Get help from (other) statisticians at Pfizer (there are many good ones there).

Best,
Bert

P.S. The answer to all your questions is "no" (imho).

On Thu, May 26, 2011 at 1:35 PM, El-Tahtawy, Ahmed <Ahmed.El-Tahtawy at pfizer.com> wrote:
> [snip]

-- 
"Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions."
-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics