Hello,

I'm new here, but will try to be as specific and complete as possible. I'm trying to use "lm" to first estimate parameter values from a set of calibration measurements, and then later to use those estimates to calculate another set of values with "predict.lm".

First I have a calibration dataset of absorbance values measured from standard solutions with known concentration of Bromide:

> stds
      abs conc
1 -0.0021    0
2  0.1003  200
3  0.2395  500
4  0.3293  800

On this small calibration series, I perform a linear regression to find the parameter estimates of the relationship between absorbance (abs) and concentration (conc):

> linear1 <- lm(abs~conc, data=stds)
> summary(linear1)

Call:
lm(formula = abs ~ conc, data = stds)

Residuals:
        1         2         3         4
-0.012600  0.006467  0.020667 -0.014533

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.050e-02  1.629e-02   0.645  0.58527
conc        4.167e-04  3.378e-05  12.333  0.00651 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02048 on 2 degrees of freedom
Multiple R-squared: 0.987,  Adjusted R-squared: 0.9805
F-statistic: 152.1 on 1 and 2 DF,  p-value: 0.00651

Now I come with another dataset, which contains measured absorbance values of Bromide in solution:

> brom
    hours     abs
1    -1.0  0.0633
2     1.0  0.2686
3     5.0  0.2446
4    18.0  0.2274
5    29.0  0.2091
6    42.0  0.1961
7    53.0  0.1310
8    76.0  0.1504
9    91.0  0.1317
10   95.5  0.1169
11  101.0  0.0977
12  115.0  0.1023
13  123.5  0.0879
14  138.5  0.0724
15  147.5  0.0564
16  163.0  0.0495
17  171.0  0.0325
18  189.0  0.0182
19  211.0  0.0047
20  212.5      NA
21  815.5 -0.2112
22  816.5 -0.1896
23  817.5 -0.0783
24  818.5  0.2963
25  819.5  0.1448
26  839.5  0.0936
27  864.0  0.0560
28  888.0  0.0310
29  960.5  0.0056
30 1009.0 -0.0163

The values in column brom$abs, measured at 30 successive points in time, need to be converted to Bromide concentrations, using the previously established relationship "linear1".

At first, I thought it could be done by:

> predict.lm(linear1, brom$abs)
Error in eval(predvars, data, env) :
  numeric 'envir' arg not of length one

But R gives the above error message. Then, after some searching around on different fora and R communities (including this one), I learned that the "newdata" in "predict.lm" actually needs to be coerced into a separate data frame. Thus:

> mabs <- data.frame(Abs = brom$abs)
> predict.lm(linear1, mabs)
Error in eval(expr, envir, enclos) : object 'conc' not found

Again, R gives an error... probably because I made an error, but I truly fail to see where. I hope somebody can explain to me clearly what I'm doing wrong and what I should do instead.

Any help is greatly appreciated, thanks!

--
View this message in context: http://r.789695.n4.nabble.com/Help-on-predict-lm-tp4509586p4509586.html
Sent from the R help mailing list archive at Nabble.com.
On 27-03-2012, at 19:24, Nederjaard wrote:

> Then, after some searching around on different fora and R-communities
> (including this one), I learned that the "newdata" in "predict.lm"
> actually needs to be coerced into a separate dataframe. Thus:
>
>> mabs <- data.frame(Abs = brom$abs)
>> predict.lm(linear1, mabs)
> Error in eval(expr, envir, enclos) : object 'conc' not found

There is no column with name "conc" in your dataframe mabs.
You regressed abs on conc. For prediction you need data for conc and not abs.
So provide data for conc. Or change the regression around: lm(conc ~ abs, data=stds) if that makes any sense.

What you did with mabs wouldn't have worked anyway because Abs is not the same as abs. And it wasn't necessary.

Berend
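The two options in the reply above can be sketched as follows. This is only an illustration under stated assumptions: the standards are copied from the original post, while the names `new_conc`, `reversed1` and `some_abs` and the concentrations 100 and 400 are made up for the example. Whether the reversed regression is the statistically appropriate way to convert absorbance into concentration is a separate question that this sketch does not settle.

```r
# Calibration standards copied from the original post.
stds <- data.frame(
  abs  = c(-0.0021, 0.1003, 0.2395, 0.3293),
  conc = c(0, 200, 500, 800)
)

# The fit from the thread: absorbance modelled as a function of concentration.
linear1 <- lm(abs ~ conc, data = stds)

# Option 1: "provide data for conc". newdata needs a column named 'conc',
# because that is the predictor in the formula. The values are hypothetical.
new_conc <- data.frame(conc = c(100, 400))
abs_hat <- predict(linear1, newdata = new_conc)

# Option 2: "change the regression around". Now 'abs' is the predictor,
# so newdata needs a column named 'abs' (lower case, as in the formula).
reversed1 <- lm(conc ~ abs, data = stds)
some_abs <- data.frame(abs = c(0.0633, 0.2686, NA))  # first rows of brom plus an NA
conc_hat <- predict(reversed1, newdata = some_abs)

print(abs_hat)   # predicted absorbance at the two hypothetical concentrations
print(conc_hat)  # predicted concentration; the NA row stays NA
```

Calling the generic `predict()` rather than `predict.lm()` directly is the usual idiom; R dispatches to the "lm" method on its own.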
R tries hard to keep you from committing scientific abuse.
As stated, your problem seems to me akin to

 1. Given that a man's age can be modelled as a function of the grayness of his hair,
 2. predict a man's age from the temperature in Barcelona.

Your calibration relates 'abs' and 'conc'. Now you want to predict 'abs' from _'hours'_ (I think). I suspect that concentration is actually related to time and this is the missing link that you'll have to provide.

BTW, I'm surprised that you didn't find the requirement for 'newdata' to be a data frame on the predict.lm help page - it's pretty clearly stated there.

Peter Ehlers
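The remark about 'newdata' can be checked with a small sketch. It assumes the standards from the original post and a fresh R session in which no object called `conc` exists in the workspace; the names `bad`, `good`, `res` and `refit` are hypothetical. It shows that a data frame alone is not enough: the column has to be named after the predictor in the formula.

```r
# Calibration standards copied from the original post.
stds <- data.frame(
  abs  = c(-0.0021, 0.1003, 0.2395, 0.3293),
  conc = c(0, 200, 500, 800)
)
linear1 <- lm(abs ~ conc, data = stds)

# A data frame whose only column is 'Abs' has nothing called 'conc' in it,
# so the prediction fails. tryCatch() captures the error message as a string.
bad <- data.frame(Abs = c(0.0633, 0.2686))
res <- tryCatch(
  predict(linear1, newdata = bad),
  error = function(e) conditionMessage(e)
)
print(res)

# A data frame with a 'conc' column works. Feeding the calibration
# concentrations back in reproduces the fitted values of the model.
good <- data.frame(conc = stds$conc)
refit <- predict(linear1, newdata = good)
print(all.equal(unname(refit), unname(fitted(linear1))))
```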
Hello all,

Thanks for all your replies. I have studied it some more in the meantime, and indeed found out that what I was trying to do was not correct to begin with. Sorry to have wasted your time, but thanks for the comments.