I would appreciate any comments on the following question.

I am trying to build a model for survival based on 155 patients and 70 covariates using lasso. Lasso picks three variables only, say X1, X2, X3, and omits the others. I wanted to check why a particular (clinically important) variable, say X4, is omitted by lasso. One of the things I did was to run lasso on X1, X2, X3 and X4 only. The results (coefficients) I get are different from those obtained when running all 70 variables, and in fact X4 is now not omitted.

Why is that? Should it not be the case that the global optimum (among all 70 variables), which includes X1, X2, X3 and not X4, is also the local optimum (among the four only)?

Thank you for your consideration,

Constantine Frangakis, PhD
Professor
Departments of Biostatistics, Psychiatry, and Radiology
Johns Hopkins University
Wrong list. This list is about R programming, not statistics. Try stats.stackexchange.com for statistics questions.

However, I will make a suggestion: find local statistical experts with whom to consult. Your understanding appears to fall short of the necessary background to use such complex statistical procedures reliably. I am not confident that online lists will suffice.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Sat, Nov 5, 2016 at 11:37 AM, Constantine Frangakis <cfranga1 at jhu.edu> wrote:
> I would appreciate any comments to the following question. [...]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Hi Dr. Frangakis,

This can be explained by collinearity and suppressor variables in multiple regression models. In the first scenario, you have both correlated variables and suppressor variables; in the second scenario, you do not have this problem. I do wonder why you do not use the scale elastic net for this particular problem.

Good luck,
Oslo

On Saturday, November 5, 2016 4:29 PM, Constantine Frangakis <cfranga1 at jhu.edu> wrote:
> I would appreciate any comments to the following question. [...]
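
One point that may help with the original question can be checked numerically. For a fixed penalty λ the lasso objective is convex, so if the fit on all covariates sets the other coefficients to zero, those same coefficients also solve the problem restricted to the selected variables plus X4. A different answer on the four-variable run therefore suggests that a different λ was used, for example because λ was re-chosen by cross-validation in each run (as cv.glmnet would do in R). The thread does not say how λ was chosen, so that part is a guess.

The sketch below is an illustration only. It uses an ordinary least-squares lasso rather than a Cox model (the same convexity argument should carry over to the Cox partial likelihood), simulated data, no intercept, and a hand-written coordinate descent. Every name in it (`lasso_cd`, `lam_max`, the simulated coefficients) is hypothetical and not taken from the thread.

```python
import random


def soft_threshold(z, g):
    """Soft-thresholding operator: sign(z) * max(|z| - g, 0)."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def lasso_cd(X, y, lam, max_sweeps=5000, tol=1e-10):
    """Minimise (1/(2n))*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # current residual y - X b (b starts at zero)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(max_sweeps):
        max_change = 0.0
        for j in range(p):
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                b[j] = new_bj
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return b


# Simulated data (an assumption, not the poster's data): 80 rows, 12 covariates.
# Column index 3 plays the role of "X4" and is correlated with column 0 ("X1").
random.seed(1)
n, p = 80, 12
X = []
for _ in range(n):
    row = [random.gauss(0, 1) for _ in range(p)]
    row[3] = 0.8 * row[0] + 0.6 * random.gauss(0, 1)
    X.append(row)
y = [2.0 * r[0] + 1.5 * r[1] - 1.0 * r[2] + 0.2 * r[3] + random.gauss(0, 1)
     for r in X]

# Smallest penalty at which every coefficient is zero; use half of it.
lam_max = max(abs(sum(X[i][j] * y[i] for i in range(n))) / n for j in range(p))
lam = 0.5 * lam_max

# Fit on all covariates, then refit on the selected ones plus "X4".
b_full = lasso_cd(X, y, lam)
support = [j for j in range(p) if b_full[j] != 0.0]
subset = sorted(set(support) | {3})
X_sub = [[row[j] for j in subset] for row in X]

b_sub_same = lasso_cd(X_sub, y, lam)              # same penalty as the full fit
b_sub_small = lasso_cd(X_sub, y, 0.01 * lam_max)  # much smaller penalty

print("columns in subset:      ", subset)
print("full fit, same columns: ", [round(b_full[j], 4) for j in subset])
print("subset fit, same lambda:", [round(b, 4) for b in b_sub_same])
print("subset fit, small lambda:", [round(b, 4) for b in b_sub_small])
```

With the same λ, the second and third printed rows should agree up to convergence error; the last row shows how the coefficients move once the penalty is relaxed. To run the analogous check in R with glmnet, one would extract both fits at one common value of the penalty rather than letting each run pick its own.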