Michael Eisenring
2015-Sep-26 05:08 UTC
[R] How to calculate standard error of estimate (S) for my non-linear regression model?
Hi all, I am looking for something that indicates the goodness of fit for my non linear regression model (since R2 is not very reliable). I read that the standard error of estimate (also known as standard error of the regression) is a good alternative. The standard error of estimate is described on this page (including the formula) http://onlinestatbook.com/2/regression/accuracy.html <https://3c.gmx.net/mail/client/dereferrer?redirectUrl=http%3A%2F%2Fonlinest atbook.com%2F2%2Fregression%2Faccuracy.html> Unfortunately however, I have no clue how to programm it in R. Does anyone know and could help me? Thank you very much. I added an example of my model and a dput() of my data #CODE dta<-read.csv("Regression_exp2.csv",header=T, sep = ",") attach(dta) # tells R to do the following analyses on this dataset head(dta) # loading packages: analysis of mixed effect models library(nls2)#model #Aim: fit equation to data: y~yo+a*(1-b^x) : Two parameter exp. single rise to the maximum # y =Gossypol (from my data set) x= Damage_cm (from my data set) #The other 3 parameters are unknown: yo=Intercept, a= assymptote ans b=slope plot(Gossypol~Damage_cm, dta) # Looking at the plot, 0 is a plausible estimate for y0: # a+y0 is the asymptote, so estimate about 4000; # b is between 0 and 1, so estimate .5 dta.nls <- nls(Gossypol~y0+a*(1-b^Damage_cm), dta, start=list(y0=0, a=4000, b=.5)) xval <- seq(0, 10, 0.1) lines(xval, predict(dta.nls, data.frame(Damage_cm=xval))) profile(dta.nls, alpha= .05) summary(dta.nls) #INPUT structure(list(Gossypol = c(948.2418407, 1180.171957, 3589.187889, 450.7205451, 349.0864019, 592.3403778, 723.885643, 2005.919344, 720.9785449, 1247.806111, 1079.846532, 1500.863038, 4198.569251, 3618.448997, 4140.242559, 1036.331811, 1013.807628, 2547.326207, 2508.417927, 2874.651764, 1120.955, 1782.864308, 1517.045807, 2287.228752, 4171.427741, 3130.376482, 1504.491931, 6132.876396, 3350.203452, 5113.942098, 1989.576826, 3470.09352, 4576.787021, 4854.985845, 1414.161257, 2608.716056, 910.8879471, 2228.522959, 2952.931863, 5909.068158, 1247.806111, 6982.035521, 2867.610671, 5629.979049, 6039.995102, 3747.076592, 3743.331903, 4274.324792, 3378.151945, 3736.144027, 5654.858696, 5972.926124, 3723.629772, 3322.115942, 3575.043632, 2818.419785), Treatment = structure(c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 4L, 2L, 4L, 4L, 2L, 4L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 2L), .Label = c("1c_2d", "1c_7d", "3c_2d", "9c_2d", "C"), class = "factor"), Damage_cm = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.142, 0.4035, 0.4435, 0.491, 0.4955, 0.578, 0.5895, 0.6925, 0.6965, 0.756, 0.8295, 1.0475, 1.313, 1.516, 1.573, 1.62, 1.8115, 1.8185, 1.8595, 1.989, 2.129, 2.171, 2.3035, 2.411, 2.559, 2.966, 2.974, 3.211, 3.2665, 3.474, 3.51, 3.547, 4.023, 4.409, 4.516, 4.7245, 4.809, 4.9835, 5.568, 5.681, 5.683, 7.272, 8.043, 9.437, 9.7455), Damage_groups = c(0.278, 1.616, 2.501, 3.401, 4.577, 5.644, 7.272, 8.043, 9.591, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Gossypol_Averaged c(1783.211, 3244.129, 2866.307, 3991.809, 4468.809, 5121.309, 3723.629772, 3322.115942, 3196.731, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Groups = c(42006L, 42038L, 42067L, 42099L, 42130L, 42162L, 42193L, 42225L, 42257L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("Gossypol", "Treatment", "Damage_cm", "Damage_groups", "Gossypol_Averaged", "Groups"), class = "data.frame", row.names = c(NA, -56L)) [[alternative HTML version deleted]]
peter dalgaard
2015-Sep-26 08:42 UTC
[R] How to calculate standard error of estimate (S) for my non-linear regression model?
This is one area in which terminology in (computational) statistics has gone a bit crazy. The thing some call "standard error of estimate" is actually the residual standard deviation in the regression model, not to be confused with the standard errors that are associated with parameter estimates. In summary(nls(...)) (and summary(lm()) for that matter), you'll find it as "residual standard error", and even that is a bit of a misnomer. -pd> On 26 Sep 2015, at 07:08 , Michael Eisenring <michael.eisenring at gmx.ch> wrote: > > Hi all, > > I am looking for something that indicates the goodness of fit for my non > linear regression model (since R2 is not very reliable). > > I read that the standard error of estimate (also known as standard error of > the regression) is a good alternative. > > > > The standard error of estimate is described on this page (including the > formula) http://onlinestatbook.com/2/regression/accuracy.html > <https://3c.gmx.net/mail/client/dereferrer?redirectUrl=http%3A%2F%2Fonlinest > atbook.com%2F2%2Fregression%2Faccuracy.html> > > Unfortunately however, I have no clue how to programm it in R. Does anyone > know and could help me? > > Thank you very much. > > > > I added an example of my model and a dput() of my data > > #CODE > > dta<-read.csv("Regression_exp2.csv",header=T, sep = ",") > attach(dta) # tells R to do the following analyses on this dataset > head(dta) > > > > # loading packages: analysis of mixed effect models > library(nls2)#model > > #Aim: fit equation to data: y~yo+a*(1-b^x) : Two parameter exp. single rise > to the maximum > # y =Gossypol (from my data set) x= Damage_cm (from my data set) > #The other 3 parameters are unknown: yo=Intercept, a= assymptote ans b=slope > > plot(Gossypol~Damage_cm, dta) > # Looking at the plot, 0 is a plausible estimate for y0: > # a+y0 is the asymptote, so estimate about 4000; > # b is between 0 and 1, so estimate .5 > dta.nls <- nls(Gossypol~y0+a*(1-b^Damage_cm), dta, > start=list(y0=0, a=4000, b=.5)) > > xval <- seq(0, 10, 0.1) > lines(xval, predict(dta.nls, data.frame(Damage_cm=xval))) > profile(dta.nls, alpha= .05) > > > summary(dta.nls) > > > > > > > > #INPUT > > structure(list(Gossypol = c(948.2418407, 1180.171957, 3589.187889, > 450.7205451, 349.0864019, 592.3403778, 723.885643, 2005.919344, > 720.9785449, 1247.806111, 1079.846532, 1500.863038, 4198.569251, > 3618.448997, 4140.242559, 1036.331811, 1013.807628, 2547.326207, > 2508.417927, 2874.651764, 1120.955, 1782.864308, 1517.045807, > 2287.228752, 4171.427741, 3130.376482, 1504.491931, 6132.876396, > 3350.203452, 5113.942098, 1989.576826, 3470.09352, 4576.787021, > 4854.985845, 1414.161257, 2608.716056, 910.8879471, 2228.522959, > 2952.931863, 5909.068158, 1247.806111, 6982.035521, 2867.610671, > 5629.979049, 6039.995102, 3747.076592, 3743.331903, 4274.324792, > 3378.151945, 3736.144027, 5654.858696, 5972.926124, 3723.629772, > 3322.115942, 3575.043632, 2818.419785), Treatment = structure(c(5L, > 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 3L, 3L, 3L, > 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 4L, 2L, 4L, 4L, 2L, 4L, 2L, 2L, > 4L, 4L, 4L, 4L, 4L, 4L, 2L), .Label = c("1c_2d", "1c_7d", "3c_2d", > "9c_2d", "C"), class = "factor"), Damage_cm = c(0, 0, 0, 0, 0, > 0, 0, 0, 0, 0, 0, 0.142, 0.4035, 0.4435, 0.491, 0.4955, 0.578, > 0.5895, 0.6925, 0.6965, 0.756, 0.8295, 1.0475, 1.313, 1.516, > 1.573, 1.62, 1.8115, 1.8185, 1.8595, 1.989, 2.129, 2.171, 2.3035, > 2.411, 2.559, 2.966, 2.974, 3.211, 3.2665, 3.474, 3.51, 3.547, > 4.023, 4.409, 4.516, 4.7245, 4.809, 4.9835, 5.568, 5.681, 5.683, > 7.272, 8.043, 9.437, 9.7455), Damage_groups = c(0.278, 1.616, > 2.501, 3.401, 4.577, 5.644, 7.272, 8.043, 9.591, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Gossypol_Averaged > c(1783.211, > 3244.129, 2866.307, 3991.809, 4468.809, 5121.309, 3723.629772, > 3322.115942, 3196.731, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA), Groups = c(42006L, 42038L, 42067L, 42099L, > 42130L, 42162L, 42193L, 42225L, 42257L, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("Gossypol", > "Treatment", "Damage_cm", "Damage_groups", "Gossypol_Averaged", > "Groups"), class = "data.frame", row.names = c(NA, -56L)) > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
Michael Eisenring
2015-Sep-26 14:46 UTC
[R] How to calculate standard error of estimate (S) for my non-linear regression model?
Dear Peter, Thank you for your answer. If I look at my summary I see there a Residual standard error: 1394 on 53 degrees of freedom. This number is very high (the fit of the curve is pretty bad I know but still...). Are you sure the residual standard error given in the summary is the same as the one described on this page: http://onlinestatbook.com/2/regression/accuracy.html I am basically just looking for a value that describes the goodness of fit for my non-linear regression model. This is probably a pretty obvious question, but I am not a statistician and as you said the terminology is sometimes pretty confusing. Thanks mike -----Urspr?ngliche Nachricht----- Von: peter dalgaard [mailto:pdalgd at gmail.com] Gesendet: Samstag, 26. September 2015 01:43 An: Michael Eisenring <michael.eisenring at gmx.ch> Cc: r-help at r-project.org Betreff: Re: [R] How to calculate standard error of estimate (S) for my non-linear regression model? This is one area in which terminology in (computational) statistics has gone a bit crazy. The thing some call "standard error of estimate" is actually the residual standard deviation in the regression model, not to be confused with the standard errors that are associated with parameter estimates. In summary(nls(...)) (and summary(lm()) for that matter), you'll find it as "residual standard error", and even that is a bit of a misnomer. -pd> On 26 Sep 2015, at 07:08 , Michael Eisenring <michael.eisenring at gmx.ch>wrote:> > Hi all, > > I am looking for something that indicates the goodness of fit for my > non linear regression model (since R2 is not very reliable). > > I read that the standard error of estimate (also known as standard > error of the regression) is a good alternative. > > > > The standard error of estimate is described on this page (including > the > formula) http://onlinestatbook.com/2/regression/accuracy.html > <https://3c.gmx.net/mail/client/dereferrer?redirectUrl=http%3A%2F%2Fon > linest atbook.com%2F2%2Fregression%2Faccuracy.html> > > Unfortunately however, I have no clue how to programm it in R. Does > anyone know and could help me? > > Thank you very much. > > > > I added an example of my model and a dput() of my data > > #CODE > > dta<-read.csv("Regression_exp2.csv",header=T, sep = ",") > attach(dta) # tells R to do the following analyses on this dataset > head(dta) > > > > # loading packages: analysis of mixed effect models > library(nls2)#model > > #Aim: fit equation to data: y~yo+a*(1-b^x) : Two parameter exp. single > rise to the maximum # y =Gossypol (from my data set) x= Damage_cm > (from my data set) #The other 3 parameters are unknown: yo=Intercept, > a= assymptote ans b=slope > > plot(Gossypol~Damage_cm, dta) > # Looking at the plot, 0 is a plausible estimate for y0: > # a+y0 is the asymptote, so estimate about 4000; # b is between 0 and > 1, so estimate .5 dta.nls <- nls(Gossypol~y0+a*(1-b^Damage_cm), dta, > start=list(y0=0, a=4000, b=.5)) > > xval <- seq(0, 10, 0.1) > lines(xval, predict(dta.nls, data.frame(Damage_cm=xval))) > profile(dta.nls, alpha= .05) > > > summary(dta.nls) > > > > > > > > #INPUT > > structure(list(Gossypol = c(948.2418407, 1180.171957, 3589.187889, > 450.7205451, 349.0864019, 592.3403778, 723.885643, 2005.919344, > 720.9785449, 1247.806111, 1079.846532, 1500.863038, 4198.569251, > 3618.448997, 4140.242559, 1036.331811, 1013.807628, 2547.326207, > 2508.417927, 2874.651764, 1120.955, 1782.864308, 1517.045807, > 2287.228752, 4171.427741, 3130.376482, 1504.491931, 6132.876396, > 3350.203452, 5113.942098, 1989.576826, 3470.09352, 4576.787021, > 4854.985845, 1414.161257, 2608.716056, 910.8879471, 2228.522959, > 2952.931863, 5909.068158, 1247.806111, 6982.035521, 2867.610671, > 5629.979049, 6039.995102, 3747.076592, 3743.331903, 4274.324792, > 3378.151945, 3736.144027, 5654.858696, 5972.926124, 3723.629772, > 3322.115942, 3575.043632, 2818.419785), Treatment = structure(c(5L, > 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, > 3L, 3L, 3L, 2L, 2L, 2L, 4L, 2L, 4L, 4L, 2L, 4L, 2L, 2L, 4L, 4L, 4L, > 4L, 4L, 4L, 2L), .Label = c("1c_2d", "1c_7d", "3c_2d", "9c_2d", "C"), > class = "factor"), Damage_cm = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > 0.142, 0.4035, 0.4435, 0.491, 0.4955, 0.578, 0.5895, 0.6925, 0.6965, > 0.756, 0.8295, 1.0475, 1.313, 1.516, 1.573, 1.62, 1.8115, 1.8185, > 1.8595, 1.989, 2.129, 2.171, 2.3035, 2.411, 2.559, 2.966, 2.974, > 3.211, 3.2665, 3.474, 3.51, 3.547, 4.023, 4.409, 4.516, 4.7245, 4.809, > 4.9835, 5.568, 5.681, 5.683, 7.272, 8.043, 9.437, 9.7455), > Damage_groups = c(0.278, 1.616, 2.501, 3.401, 4.577, 5.644, 7.272, > 8.043, 9.591, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), > Gossypol_Averaged = c(1783.211, 3244.129, 2866.307, 3991.809, > 4468.809, 5121.309, 3723.629772, 3322.115942, 3196.731, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Groups = c(42006L, 42038L, > 42067L, 42099L, 42130L, 42162L, 42193L, 42225L, 42257L, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("Gossypol", > "Treatment", "Damage_cm", "Damage_groups", "Gossypol_Averaged", > "Groups"), class = "data.frame", row.names = c(NA, -56L)) > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com