Dear list, I have a series of data points which I want to approximate with exactly two linear functions. I would like to choose the intervals so that the total deviation from my fitted lines is minimal. How do I best do this? Thanks! Kamila The information transmitted in this electronic communication...{{dropped}}
On 30/08/2007, at 12:45 PM, Naxerova, Kamila wrote:> Dear list, > > I have a series of data points which I want to approximate with > exactly two > linear functions. I would like to choose the intervals so that the > total > deviation from my fitted lines is minimal. How do I best do this? > > Thanks! > KamilaIf you want to minimize the total ***squared*** deviations, then the code that I enclose below should do the trick for you. If you really want to minimize the total (presumably absolute) deviation, then you'll have to adapt the code a bit. The argument ``b'' to the bsr() function is a vector of candidate values for the ``join point'', e.g. b <- seq(lo,hi,length=100), where lo and hi are suitably chosen lower and upper bounds on the location of the join point. cheers, Rolf Turner bsr <- function(x,y,b) { # # Broken stick regression. # if(length(b)==1) { x1 <- ifelse(x<=b,0,x-b) temp <- lm(y~x+x1) ss <- sum(resid(temp)^2) rslt <- list(fit=temp,ss=ss,b=b) class(rslt) <- "bsr" return(rslt) } temp <- list() for(v in b) { x1 <- ifelse(x<=v,0,x-v) fff <- lm(y~x+x1) temp[[paste(v)]] <- sum(resid(fff)^2) } ss <- unname(unlist(temp)) v <- b[which.min(ss)] x1 <- ifelse(x<=v,0,x-v) rslt <- list(fit=lm(y~x+x1),ss=unname(unlist(temp)), b=b,ss.min=min(ss),b.min=v) class(rslt) <- "bsr" rslt } ===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+===+=== +===+== plot.bsr <- function(x,cont=TRUE,add=FALSE,rr=NULL,npts=101,...) { # # Plot method for broken stick regressions. # fit <- x if(cont) { if(is.null(rr)) { x <- fit$fit$model$x rr <- range(x) } xx <- seq(rr[1],rr[2],length=npts) b <- fit$b.min if(is.null(b)) b <- fit$b x1 <- ifelse(xx<=b,0,xx-b) yh <- predict(fit$fit,newdata=data.frame(x=xx,x1=x1)) } else { xx <- fit$fit$model[,2] yh <- predict(fit$fit) } if(add) lines(x=xx,y=yh,...) else plot(x=xx,y=yh,...) } ###################################################################### Attention:\ This e-mail message is privileged and confidenti...{{dropped}}
On Wed, 29 Aug 2007, Naxerova, Kamila wrote:> Dear list, > > I have a series of data points which I want to approximate with exactly two > linear functions. I would like to choose the intervals so that the total > deviation from my fitted lines is minimal. How do I best do this?>From the information you give it seems that you want to partition a modellike lm(y ~ x) along a certain ordering of the observations. Without any further restrictions you can do that with the function breakpoints() in package "strucchange". If there are continuity restrictions or something like that, you want to look at the "segmented" package. hth, Z> Thanks! > Kamila > > > The information transmitted in this electronic communication...{{dropped}} > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > >
If you want to minimize absolute error for this, then you can try the rqss fitting in the quantreg package and tune lambda to get one break in the fitted function. url: www.econ.uiuc.edu/~roger Roger Koenker email rkoenker at uiuc.edu Department of Economics vox: 217-333-4558 University of Illinois fax: 217-244-6678 Champaign, IL 61820 On Aug 29, 2007, at 8:05 PM, Achim Zeileis wrote:> On Wed, 29 Aug 2007, Naxerova, Kamila wrote: > >> Dear list, >> >> I have a series of data points which I want to approximate with >> exactly two >> linear functions. I would like to choose the intervals so that the >> total >> deviation from my fitted lines is minimal. How do I best do this? > >> From the information you give it seems that you want to partition >> a model > like > lm(y ~ x) > along a certain ordering of the observations. Without any further > restrictions you can do that with the function breakpoints() in > package > "strucchange". If there are continuity restrictions or something like > that, you want to look at the "segmented" package. > > hth, > Z > >> Thanks! >> Kamila >> >> >> The information transmitted in this electronic communication... >> {{dropped}} >> >> ______________________________________________ >> R-help at stat.math.ethz.ch mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting- >> guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code.