Alan Aspuru-Guzik
2000-Aug-29 00:44 UTC
[R] Newbie question: Linear regression with error bars.
Hello guys, I am a total newbie on R, having downloaded it, read the documentation and started playing with it right now. My general question is what 'lr' model can be used for doing a linear regression on points that have a variance associated with them (ie. Monte Carlo simulation results). Actually my Data sets look like: Timestep Energy Variance_of_the_Energy 0.0005 -14.876840 0.000670 0.001 -14.883960 0.000670 0.002 -14.887360 0.000700 0.05 -14.888730 0.000430 0.1 nan nan And what I want to do is to obtain the intercept of the best possible linear fit (weeding out outliers if necessary) but having the intercept with an associated error bar. What I managed to do today is read the files from a table and learn how to access the columns of the data frame . And managed to do a linear regression of the type x ~ y with normal vectors. I will eventually will like to use something like the leaps package (regression subset selection) and then do a linear regression including the error bars with the best points obtained by the 'leaps' package. The specific questions are: x. How can I Convert a data frame's column into a vector? x. Now that I have the vectors, how can I specify a linear model of a linear regression with associated variances? Thanks a lot in advance!! Alan Aspuru-Guzik -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
ben@zoo.ufl.edu
2000-Aug-29 13:05 UTC
[R] Newbie question: Linear regression with error bars.
Here are a few ideas to get you going. The answers to your two basic questions are (1) you can access the columns of a data frame with the column number as x[,1], as an element in a list as x[[1]], and with the column name as x$whatever; the column will be treated as a vector in any case; (2) use the "weights" option to lm() [using 1/variance as the weights] to do a weighted linear regression. I've included a confidence-interval plotter, originally written by Bill Venables and hacked a bit, for your convenience. I don't have the context, but I would be careful with linear regression in this context (there are obvious problems with the naive regression line shown below). It makes me nervous that the estimated error of the intercept is so much smaller than the errors associated with the points ... Ben Bolker ## assuming that the column labels are the first line of your data file x <- read.table("R.tmp",header=TRUE) plotCI <- function (x, y = NULL, uiw, liw = uiw, ylim=NULL, sfrac = 0.01, add=FALSE, col=par("col"), lwd=par("lwd"), slty=par("lty"), xlab=deparse(substitute(x)), ylab=deparse(substitute(y)), ...) { # from Bill Venables, R-list if (is.list(x)) { y <- x$y x <- x$x } if (is.null(y)) { if (is.null(x)) stop("both x and y NULL") y <- as.numeric(x) x <- seq(along = x) } ui <- y + uiw li <- y - liw if (is.null(ylim)) ylim <- range(c(y, ui, li), na.rm=TRUE) if (add) { points(x, y, col=col, lwd=lwd, ...) } else { plot(x, y, ylim = ylim, col=col, lwd=lwd, xlab=xlab, ylab=ylab, ...) } smidge <- diff(par("usr")[1:2]) * sfrac segments(x, li, x, ui, col=col, lwd=lwd, lty=slty) x2 <- c(x, x) ul <- c(li, ui) segments(x2 - smidge, ul, x2 + smidge, ul, col=col, lwd=lwd) invisible(list(x = x, y = y)) } ## linear regr. of Energy as a function of Timestep lm0 <- lm(x$Energy ~ x$Timestep ,weights=1/x$Var) summary(lm0) plotCI(x$Timestep,x$Energy,sqrt(x$Var)) abline(lm0) ## add a regression line to the plot interc <- summary(lm0)$coef[1,] ## estimate, std. error, etc. plotCI(0,interc["Estimate"],interc["Std. Error"],col="red",add=TRUE) On Mon, 28 Aug 2000, Alan Aspuru-Guzik wrote:> > Hello guys, > I am a total newbie on R, having downloaded it, read the documentation and > started playing with it right now. > > My general question is what 'lr' model can be used for doing a linear > regression on points that have a variance associated with them (ie. Monte > Carlo simulation results). > > Actually my Data sets look like: > > Timestep Energy Variance_of_the_Energy > 0.0005 -14.876840 0.000670 > 0.001 -14.883960 0.000670 > 0.002 -14.887360 0.000700 > 0.05 -14.888730 0.000430 > 0.1 nan nan > > And what I want to do is to obtain the intercept of the best possible > linear fit (weeding out outliers if necessary) but having the intercept > with an associated error bar. > > What I managed to do today is read the files from a table and learn how to > access the columns of the data frame . And managed to do a linear > regression of the type x ~ y with normal vectors. > > I will eventually will like to use something like the leaps package > (regression subset selection) and then do a linear regression including > the error bars with the best points obtained by the 'leaps' package. > > The specific questions are: > x. How can I Convert a data frame's column into a vector? > x. Now that I have the vectors, how can I specify a linear model of a > linear regression with associated variances? > > Thanks a lot in advance!! > > Alan Aspuru-Guzik > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ > >-- 318 Carr Hall bolker at zoo.ufl.edu Zoology Department, University of Florida http://www.zoo.ufl.edu/bolker Box 118525 (ph) 352-392-5697 Gainesville, FL 32611-8525 (fax) 352-392-3704 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._