Michael Eisenring
2015-Aug-01 15:17 UTC
[R] Using R to fit a curve to a dataset using a specific equation
Hi there,

I would like to use a specific equation to fit a curve to one of my data sets (attached).

> dput(data)
structure(list(Gossypol = c(1036.331811, 4171.427741, 6039.995102, 5909.068158, 4140.242559, 4854.985845, 6982.035521, 6132.876396, 948.2418407, 3618.448997, 3130.376482, 5113.942098, 1180.171957, 1500.863038, 4576.787021, 5629.979049, 3378.151945, 3589.187889, 2508.417927, 1989.576826, 5972.926124, 2867.610671, 450.7205451, 1120.955, 3470.09352, 3575.043632, 2952.931863, 349.0864019, 1013.807628, 910.8879471, 3743.331903, 3350.203452, 592.3403778, 1517.045807, 1504.491931, 3736.144027, 2818.419785, 723.885643, 1782.864308, 1414.161257, 3723.629772, 3747.076592, 2005.919344, 4198.569251, 2228.522959, 3322.115942, 4274.324792, 720.9785449, 2874.651764, 2287.228752, 5654.858696, 1247.806111, 1247.806111, 2547.326207, 2608.716056, 1079.846532),
Treatment = structure(c(2L, 3L, 4L, 5L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 1L), .Label = c("C", "1c_2d", "3c_2d", "9c_2d", "1c_7d"), class = "factor"),
Damage_cm = c(0.4955, 1.516, 4.409, 3.2665, 0.491, 2.3035, 3.51, 1.8115, 0, 0.4435, 1.573, 1.8595, 0, 0.142, 2.171, 4.023, 4.9835, 0, 0.6925, 1.989, 5.683, 3.547, 0, 0.756, 2.129, 9.437, 3.211, 0, 0.578, 2.966, 4.7245, 1.8185, 0, 1.0475, 1.62, 5.568, 9.7455, 0, 0.8295, 2.411, 7.272, 4.516, 0, 0.4035, 2.974, 8.043, 4.809, 0, 0.6965, 1.313, 5.681, 3.474, 0, 0.5895, 2.559, 0)),
.Names = c("Gossypol", "Treatment", "Damage_cm"), row.names = c(NA, -56L), class = "data.frame")

The equation is: y ~ yo + a*(1 - b^x)

Where:
y = Gossypol (from my data set)
x = Damage_cm (from my data set)

The other 3 parameters are unknown: yo = intercept, a = asymptote and b = slope.

In the end I would like to use the equation to plot a curve (with SE interval; I usually use ggplot2). Furthermore, I would like to know the R2 and p value. I would also be interested in the parameters yo, a and b.

I have never done this before and would be extremely grateful if anyone could help me. I suppose I have to use a non-linear approach (glm(...)). I found out that the mosaic package might be helpful.

Thanks a lot,
Mike
David L Carlson
2015-Aug-01 20:49 UTC
[R] Using R to fit a curve to a dataset using a specific equation
I can get you started, but you should really read up on non-linear least squares. Calling your data frame dta (since data is a function):

plot(Gossypol~Damage_cm, dta)
# Looking at the plot, 0 is a plausible estimate for y0:
# a+y0 is the asymptote, so estimate about 4000;
# b is between 0 and 1, so estimate .5
dta.nls <- nls(Gossypol~y0+a*(1-b^Damage_cm), dta,
    start=list(y0=0, a=4000, b=.5))
xval <- seq(0, 10, length.out=200)
lines(xval, predict(dta.nls, data.frame(Damage_cm=xval)))
profile(dta.nls, alpha= .05)
==========================================
Number of iterations to convergence: 3
Achieved convergence tolerance: 1.750586e-06

attr(,"summary")

Formula: Gossypol ~ y0 + a * (1 - b^Damage_cm)

Parameters:
       Estimate   Std. Error  t value   Pr(>|t|)
y0 1303.4529432  386.1515684  3.37550  0.0013853 **
a  2796.0464520  530.4140959  5.27144 2.5359e-06 ***
b     0.4939111    0.1809687  2.72926  0.0085950 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1394.375 on 53 degrees of freedom

Number of iterations to convergence: 3
Achieved convergence tolerance: 1.750586e-06

David Carlson
Dept of Anthropology
Texas A&M
College Station, TX 77843

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
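The original question also asked for an interval around the fitted curve. predict() for nls fits does not return standard errors, so one common workaround is a delta-method approximation built from vcov(). The following is only a sketch under stated assumptions: the data frame is synthetic (simulated from the same model form, not the poster's measurements), and the names fit, grad and band are made up for illustration. The band is a first-order approximation and can be poor for strongly non-linear fits or small samples.

```r
# Synthetic stand-in for the thread's data frame `dta` (assumption: NOT the real data).
set.seed(1)
dta <- data.frame(Damage_cm = runif(56, 0, 10))
dta$Gossypol <- 1300 + 2800 * (1 - 0.5^dta$Damage_cm) + rnorm(56, sd = 400)

# Same model and starting values as in the reply above.
fit <- nls(Gossypol ~ y0 + a * (1 - b^Damage_cm), data = dta,
           start = list(y0 = 0, a = 4000, b = 0.5))

# Gradient of the mean function with respect to (y0, a, b) at each x.
xval <- seq(0, 10, length.out = 200)
est  <- coef(fit)
grad <- cbind(y0 = 1,
              a  = 1 - est[["b"]]^xval,
              b  = -est[["a"]] * xval * est[["b"]]^(xval - 1))

# Delta method: Var(f(x)) is approximated by g(x)' V g(x),
# where V is the estimated covariance matrix of the parameters.
yhat  <- predict(fit, data.frame(Damage_cm = xval))
se    <- sqrt(rowSums((grad %*% vcov(fit)) * grad))
tcrit <- qt(0.975, df.residual(fit))
band  <- data.frame(x = xval, fit = yhat,
                    lwr = yhat - tcrit * se, upr = yhat + tcrit * se)

# Base graphics: points, fitted curve, approximate 95% band.
plot(Gossypol ~ Damage_cm, dta)
lines(band$x, band$fit)
lines(band$x, band$lwr, lty = 2)
lines(band$x, band$upr, lty = 2)
```

With ggplot2, the same band data frame could be passed to geom_ribbon() using ymin = lwr and ymax = upr.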
David L Carlson
2015-Aug-03 14:40 UTC
[R] Using R to fit a curve to a dataset using a specific equation
Use Reply-All to keep the discussion on the list.

I suggested reading about nls (not just how to do it in R) because you requested R2. It was not clear that you were aware that there are strong reasons to suspect that R2 is misleading when applied to nls results. That is why nls() does not provide it automatically. But R2 is easily computed from the model results:

GossSS <- sum((dta$Gossypol - mean(dta$Gossypol))^2)
# deviance() of an nls fit is the residual sum of squares,
# so the explained fraction is 1 minus the ratio
R2 <- 1 - deviance(dta.nls)/GossSS
R2
[1] 0.3681134

As for ggplot, just add the line we created before to the points plot:

library(ggplot2)
xval <- seq(0, 10, length.out=200)
yval <- predict(dta.nls, data.frame(Damage_cm=xval))
ggplot() + geom_point(data=dta, aes(x=Damage_cm, y=Gossypol)) +
    geom_line(aes(x=xval, y=yval))

David Carlson

From: Michael Eisenring [mailto:Michael.Eisenring at gmx.ch]
Sent: Saturday, August 1, 2015 5:33 PM
To: David L Carlson
Subject: Aw: RE: [R] Using R to fit a curve to a dataset using a specific equation

Hello and thank you very much for your help!

I just started to read up on non-linear least squares in The R Book. (I am totally new to the topic, so I didn't even know where to look in the book.)

I have three last questions:

The R Book explains how to describe a model. In my case it would be something like: "The model y ~ y0 + a * (1 - b^x) had y0 = 1303.45 (386.15 standard error), a = .... and b = .... The model explained ??% of the total variation in y."

My question is: where do I find the percentage of total variation the model explains? The book does not say. Is there something similar to an R^2 value or a p-value?

My last question: is it possible to use ggplot2 for plotting the whole model?

Thanks a lot.
Mike
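For readers who want the "fraction of variation explained" spelled out: deviance() of an nls fit returns the residual sum of squares (RSS), so the conventional quantity is 1 - RSS/TSS. The sketch below uses synthetic data (an assumption, not the poster's measurements), and the names fit, rss, tss and pseudo_R2 are illustrative. As noted in the message above, this number should be read with caution for non-linear models; the residual standard error is often reported alongside it.

```r
# Synthetic stand-in for the thread's data frame `dta` (assumption: NOT the real data).
set.seed(1)
dta <- data.frame(Damage_cm = runif(56, 0, 10))
dta$Gossypol <- 1300 + 2800 * (1 - 0.5^dta$Damage_cm) + rnorm(56, sd = 400)

fit <- nls(Gossypol ~ y0 + a * (1 - b^Damage_cm), data = dta,
           start = list(y0 = 0, a = 4000, b = 0.5))

rss <- deviance(fit)                                # residual sum of squares
tss <- sum((dta$Gossypol - mean(dta$Gossypol))^2)   # total sum of squares about the mean
pseudo_R2 <- 1 - rss / tss                          # descriptive only for nls fits

# Residual standard error, the same value summary(fit) prints.
res_se <- sqrt(rss / df.residual(fit))

cat("pseudo-R2:", round(pseudo_R2, 3), "\n")
cat("residual SE:", round(res_se, 1), "on", df.residual(fit), "df\n")
```

Applied to the fit in this thread, the same arithmetic gives 1 - 0.6318866 = 0.3681134, since 0.6318866 is the ratio RSS/TSS.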
David L Carlson
2015-Aug-03 16:23 UTC
[R] Using R to fit a curve to a dataset using a specific equation
Your question is more statistics than R and I'm not qualified to offer an opinion. You should be able to find someone locally to help you. The Cross Validated website is also a useful resource.

David

From: Michael Eisenring [mailto:Michael.Eisenring at gmx.ch]
Sent: Monday, August 3, 2015 10:37 AM
To: David L Carlson
Cc: r-help
Subject: Aw: RE: RE: [R] Using R to fit a curve to a dataset using a specific equation

Hi David,

Thank you for your help. It makes sense to me that the R2 is very misleading in a non-linear regression. The same is true for the p-values. My question then is: how can I present the results of my curve and quantify its goodness of fit if R2 and p-values are misleading?

Thanks a lot,
Mike