Maximilian Lklweryc
2012-Sep-25 15:03 UTC
[R] Plotting of regsubsets adjr2 values not correct
Hi,

I want to do model selection with regsubsets. My code is:

a <- regsubsets(Gesamt ~ CommunistSocialist + CountrySize + GNI + Lifeexp +
                Schoolyears + ExpMilitary + Mortality +
                PopPoverty + PopTotal + ExpEdu + ExpHealth,
                data=olympiadaten, nbest=2)
summary(a)
plot(a,scale="adjr2")

(output attached)

The problem is that when I fit the best model again "manually" to have a look at it, the adjusted R squared is not the same as in the regsubsets output. This is also the case for the other models, e.g. when I fit the simplest model in the graphic:

summary(lm(Gesamt~ExpHealth))

I get an adj. R squared of 0.009202, but the plot says something about 0.14, so it is not correct? I don't know how to solve this problem; any help would be nice, thanks.

Also, I do not understand which models are shown there. For example, the simple model with just an intercept and the variable GNI is not shown in the plot. Why?

-------------- next part --------------
A non-text attachment was scrubbed...
Name: regsubsets.png
Type: image/png
Size: 8954 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20120925/6c2a42c8/attachment-0002.png>
1. You failed to tell us that you are using the leaps package.

2. You are lost statistically. I strongly recommend that you seek out local statistical help. At the very least, post on a statistical Help list, which this is _not_.

3. FWIW: What you are trying to do is quite unwise. That is why I suggested that you seek local help.

Cheers,
Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics
On Wed, Sep 26, 2012 at 3:03 AM, Maximilian Lklweryc <maxlklweryc at gmail.com> wrote:
> The problem is now, that I want to fit the best model again "manually" and
> have a look at it, but the value of the adjusted R squared is not the same
> as in the regsubsets output?

Hard to tell: you haven't given us any way to reproduce what you did. For the data example in the package, the adjusted r2 values from individual models match up with the ones on the graph. I've checked another couple of data sets and they also agree.

> Also I do not understand, which models are shown there, e.g. the simple
> model just with an intercept and the variable GNI is not shown in the plot,
> why?

You asked for the two best models of each size, so you get the two best models of each size.

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
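The check described above can be sketched roughly as follows. This is an illustration, not the code that was actually run: it assumes the leaps package is installed and uses the built-in `swiss` data set as a stand-in, since the original `olympiadaten` data were never posted. The names `fit`, `s`, `sizes` and `refit_adjr2` are made up for this example.

```r
library(leaps)  # assumed to be installed; provides regsubsets()

# Stand-in data: the built-in 'swiss' data set, not the poster's data.
fit <- regsubsets(Fertility ~ ., data = swiss, nbest = 2)
s   <- summary(fit)

# Each row of s$which flags the terms in one reported model.
# Subtract 1 for the intercept to get the number of predictors.
sizes <- rowSums(s$which) - 1
table(sizes)  # with nbest = 2, at most two models are kept per size

# Refit every reported model with lm() and collect its adjusted R squared.
refit_adjr2 <- apply(s$which, 1, function(keep) {
  vars <- names(keep)[keep & names(keep) != "(Intercept)"]
  f <- reformulate(vars, response = "Fertility")
  summary(lm(f, data = swiss))$adj.r.squared
})

# Compare the refitted values with the ones regsubsets reports.
all.equal(unname(refit_adjr2), s$adjr2)
```

Because only the `nbest` best models of each size are kept, a one-variable model such as the GNI-only one is missing from the plot whenever two other single predictors fit better.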
Does your dataset have any missing data? (Without a reproducible example we can only guess.) If it does, then you may be fitting the same model to different subsets of the data with the two methods.

--
Gregory (Greg) L. Snow Ph.D.
538280 at gmail.com
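A minimal base-R sketch of the missing-data effect described above, using simulated data (the variables `y`, `x1`, `x2` and the data frame `d` are invented for illustration). It assumes that a formula-based fitter such as regsubsets builds its model frame from the full formula and drops every row with an NA in any listed variable, while a smaller `lm()` call only drops rows with NAs in the variables it actually uses.

```r
set.seed(1)
n <- 60
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 0.5 * d$x1 + rnorm(n)
d$x2[1:20] <- NA  # x2 is missing in 20 rows; y and x1 are complete

# Fitting y ~ x1 directly uses all 60 rows, because x2 is not in the model.
fit_all <- lm(y ~ x1, data = d)

# Building the model frame from the full formula drops the 20 incomplete rows.
cc <- model.frame(y ~ x1 + x2, data = d, na.action = na.omit)

# The same y ~ x1 model, restricted to the complete cases.
fit_cc <- lm(y ~ x1, data = cc)

# Different numbers of observations, hence different adjusted R squared.
c(all_rows = nobs(fit_all), complete_cases = nobs(fit_cc))
c(all_rows = summary(fit_all)$adj.r.squared,
  complete_cases = summary(fit_cc)$adj.r.squared)
```

To compare like with like, the manual refit would need to use the same complete-case rows. Separately, the posted call `summary(lm(Gesamt~ExpHealth))` has no `data=` argument, so it may also be picking up variables from somewhere other than `olympiadaten`; without the data this is only a guess.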