Michael Eisenring
2015-Sep-30 00:08 UTC
[R] Calculate Total CORRECTED SS for non linear regression
Hello I started to work on nonlinear regressions recently, I want to compare a set of models and would like to use an equivalent of R2 for nonlinear regression models. I learned that a good alternative is to calculate the following corrected formula: 1 - (Residual SS/Total Corrected SS) This provides a measure of how well the model fits versus just the overall mean for the response variable. With some help I could program the formula for my model and , but I am not quite sure if it is correct and what exactly the total CORRECTED SS is (compared to the Total SS) Can anyone tell me how to calculate the Total Corrected SS in R and how it can be implemented in my code? CODE and INPUT: # REGRESSION NON-LINEAR REGRESSION MODEL FIT------------------------------------ dta<-read.csv("input.csv",header=T, sep = ",") dput(dta) head(dta) # loading packages: analysis of mixed effect models library(nls2)#model #EQUATION 1: y~yo+a*(1-b^x) #Aim: fit equation to data: y~yo+a*(1-b^x) # y =Compound (from my data set) x= Damage (from my data set) #The other 3 parameters are unknown: yo=Intercept, a= assymptote ans b=slope plot(Compound~Damage, dta) # Looking at the plot, 0 is a plausible estimate for y0: # a+y0 is the asymptote, so estimate about 4000; # b is between 0 and 1, so estimate .5 dta.nls <- nls(Compound~y0+a*(1-b^Damage), dta, start=list(y0=0, a=4000, b=.5)) xval <- seq(0, 10, 0.1) lines(xval, predict(dta.nls, data.frame(Damage=xval))) profile(dta.nls, alpha= .05) summary(dta.nls) #Calculation R2 Variation expl: total Variation--> Is deviance(dta.nls) the same as Total Corrected SS??? CompoundSS <- sum((dta$Compound - mean(dta$Compound))^2) R2 <- deviance(dta.nls)/CompoundSS R2 #R2=0.5723023 structure(list(Treatment = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 4L, 1L, 2L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 4L), .Label = c("1_2d", "3_2d", "9_2d", "C"), class = "factor"), Compound = c(1036.331811, 4171.427741, 6039.995102, 4140.242559, 4854.985845, 6982.035521, 948.2418407, 3618.448997, 3130.376482, 1180.171957, 1500.863038, 4576.787021, 5629.979049, 3589.187889, 2508.417927, 1989.576826, 5972.926124, 450.7205451, 1120.955, 3470.09352, 3575.043632, 349.0864019, 1013.807628, 910.8879471, 3743.331903, 592.3403778, 1517.045807, 1504.491931, 3736.144027, 723.885643, 1782.864308, 1414.161257, 3723.629772, 2005.919344, 4198.569251, 2228.522959, 3322.115942, 720.9785449, 2874.651764, 2287.228752, 5654.858696, 1247.806111, 2547.326207, 2608.716056, 1079.846532), Damage = c(0.4955, 1.516, 4.409, 0.491, 2.3035, 3.51, 0, 0.4435, 1.573, 0, 0.142, 2.171, 4.023, 0, 0.6925, 1.989, 5.683, 0, 0.756, 2.129, 9.437, 0, 0.578, 2.966, 4.7245, 0, 1.0475, 1.62, 5.568, 0, 0.8295, 2.411, 7.272, 0, 0.4035, 2.974, 8.043, 0, 0.6965, 1.313, 5.681, 0, 0.5895, 2.559, 0)), .Names = c("Treatment", "Compound", "Damage"), class "data.frame", row.names = c(NA, -45L)) [[alternative HTML version deleted]]
peter dalgaard
2015-Sep-30 08:09 UTC
[R] Calculate Total CORRECTED SS for non linear regression
> On 30 Sep 2015, at 02:08 , Michael Eisenring <michael.eisenring at gmx.ch> wrote: > > Can anyone tell me how to calculate the Total Corrected SS in R and how it > can be implemented in my code?It is just sum((y-mean(y))^2). Beware that a fair amount of (somewhat silly) contention is going on in this area, though. In particular, the formula can give negative values, which is unfortunate if you try calling it "R^2". -pd -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
ProfJCNash
2015-Sep-30 11:56 UTC
[R] Calculate Total CORRECTED SS for non linear regression
Some workers consider it bad practise to compute what is called R-squared for a nonlinear model. I find it useful for nonlinear models as a signpost of how good a fit has been found. But just as signposts can be turned around by vandals, nonlinear models can give a misleading indication. With linear models, sum((y - model(y))^2) must be smaller than sum((y - mean(y))^2). That is not necessary for nonlinear models. Also for linear models there are many equivalent formulas due to the many identities that go with the linear algebra. Those are not in force for nonlinear models. So what in linear models is "R-squared" is just a comparison to the model that is the mean of the predicted variable, i.e., the best single number model. JN On 15-09-30 04:09 AM, peter dalgaard wrote:> >> On 30 Sep 2015, at 02:08 , Michael Eisenring <michael.eisenring at gmx.ch> wrote: >> >> Can anyone tell me how to calculate the Total Corrected SS in R and how it >> can be implemented in my code? > > It is just sum((y-mean(y))^2). > > Beware that a fair amount of (somewhat silly) contention is going on in this area, though. In particular, the formula can give negative values, which is unfortunate if you try calling it "R^2". > > -pd >
Michael Eisenring
2015-Sep-30 11:57 UTC
[R] Calculate Total CORRECTED SS for non linear regression
Hi Peter, Thanks for your answer. I am still a bit confused- So that means that the calculation of the " R2" for my non-linear model is in fact the formula I used in my eample: CompoundSS <- sum((dta$Compound - mean(dta$Compound))^2) R2 <- deviance(dta.nls)/CompoundSS R2 What's then the difference to the normally calculated R2 used for linear regression? To me it looks the same. -----Urspr?ngliche Nachricht----- Von: peter dalgaard [mailto:pdalgd at gmail.com] Gesendet: Mittwoch, 30. September 2015 01:09 An: Michael Eisenring <michael.eisenring at gmx.ch> Cc: r-help at r-project.org Betreff: Re: [R] Calculate Total CORRECTED SS for non linear regression> On 30 Sep 2015, at 02:08 , Michael Eisenring <michael.eisenring at gmx.ch>wrote:> > Can anyone tell me how to calculate the Total Corrected SS in R and > how it can be implemented in my code?It is just sum((y-mean(y))^2). Beware that a fair amount of (somewhat silly) contention is going on in this area, though. In particular, the formula can give negative values, which is unfortunate if you try calling it "R^2". -pd -- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com