Hi there,

In rpart, one can get a graph of R-squared (using rsq.rpart(fit)), in which the x axis is the number of splits and which contains two lines: an "apparent" R-squared and an R-squared based on the x error.

I would like to calculate these R-squared values, but cannot work out from the output how it is done. Is there any way to access the values that underpin this graph? Alternatively, is there any way to calculate them from the summary data?

Thanks in advance,

Andy Park
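A minimal sketch of where the plotted values appear to come from. As far as the rpart source shows, rsq.rpart() plots 1 - "rel error" (the "Apparent" line) and 1 - "xerror" (the "X Relative" line) against nsplit, and all three are columns of fit$cptable. The model below and the car.test.frame data set (shipped with rpart) are only an illustration, not the poster's data; the seed is set because xerror comes from random cross-validation folds.

```r
library(rpart)

# Fix the cross-validation folds so xerror is reproducible.
set.seed(1)

# Illustrative regression tree; method = "anova" is the case rsq.rpart() is meant for.
fit <- rpart(Mileage ~ Weight, data = car.test.frame, method = "anova")

# The complexity table holds the columns CP, nsplit, rel error, xerror, xstd.
cp <- fit$cptable

# Rebuild the two curves drawn by rsq.rpart(fit).
rsq <- data.frame(
  nsplit     = cp[, "nsplit"],
  apparent   = 1 - cp[, "rel error"],  # R-squared on the training data
  x.relative = 1 - cp[, "xerror"]      # R-squared based on cross-validated error
)
print(rsq)
```

The same table is printed by printcp(fit), so the values can also be read off the summary output by subtracting the "rel error" and "xerror" columns from 1.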
Andrew Park wrote:
> I would like to calculate these R-squared values, but cannot work out
> from the output how it is done. Is there any way to access the values
> that underpin this graph? Alternatively, is there any way to calculate
> them from the summary data?

Beware. Yi, in his JASA paper about generalized degrees of freedom, showed that to get an unbiased estimate of R^2 from recursive partitioning you have to use the formula for adjusted R^2 with a number of parameters far exceeding the number of final splits. He showed how to estimate the d.f. Recursive partitioning seems to result in simple prediction models, but this is mainly an illusion.

Frank Harrell

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University
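To make the adjustment concrete, here is a small sketch using the textbook adjusted R^2 formula, 1 - (1 - R^2)(n - 1)/(n - p - 1), with p replaced by an effective number of parameters. The function name adj_rsq, the argument df_eff, and the numbers are all hypothetical. Treating the effective d.f. as excluding the intercept is an assumption made here, and estimating that d.f. is the subject of the JASA paper and is not shown.

```r
# Textbook adjusted R^2 with p replaced by an effective d.f. (df_eff).
# Assumption: df_eff does not count the intercept.
adj_rsq <- function(rsq, n, df_eff) {
  stopifnot(n - df_eff - 1 > 0)
  1 - (1 - rsq) * (n - 1) / (n - df_eff - 1)
}

# Hypothetical tree with 3 final splits and an apparent R^2 of 0.60 on n = 60 cases.
naive    <- adj_rsq(rsq = 0.60, n = 60, df_eff = 3)   # d.f. taken as the number of splits
adjusted <- adj_rsq(rsq = 0.60, n = 60, df_eff = 15)  # a much larger effective d.f.
print(c(naive = naive, adjusted = adjusted))
```

The larger the effective d.f. relative to n, the more the apparent R^2 shrinks, which is the point of the warning above.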
On Thursday 13 December 2007, Frank E Harrell Jr wrote:
> Beware. Yi in his JASA paper about generalized degrees of freedom
> showed that to get an unbiased estimate of R^2 from recursive
> partitioning you have to use the formula for adjusted R^2 with number of
> parameters far exceeding the number of final splits. He showed how to
> estimate the d.f.

Hi Frank and others,

Do you happen to have a link / citation for that paper?

Thanks!

-- 
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341
The author is actually "Ye", not "Yi". The paper is "On Measuring and Correcting the Effects of Data Mining and Model Selection" by Jianming Ye, JASA (1998).

Here is a link from JSTOR:

http://www.jstor.org/view/01621459/di015668/01p00145/0?currentResult=01621459%2bdi015668%2b01p00145%2b0%2cFF15&searchUrl=http%3A%2F%2Fwww.jstor.org%2Fsearch%2FAdvancedResults%3Fhp%3D25%26si%3D1%26q0%3DYe%2Bdata%2Bmining%26f0%3D%26c0%3DAND%26wc%3Don%26sd%3D%26ed%3D%26la%3D%26dc%3DStatistics

Ravi.

-----------------------------------------------------------------------------------
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvaradhan at jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
-----------------------------------------------------------------------------------

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Dylan Beaudette
Sent: Thursday, December 13, 2007 5:51 PM
To: r-help at r-project.org
Cc: Andrew Park
Subject: Re: [R] Calculating Rsquared values in rpart

Hi Frank and others, hapen to have a link / citation for that paper? thanks!

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.