On Sep 20, 2011, at 4:50 AM, mael wrote:
> Hi,
>
> I am trying to add a function in a linear quantile regresion to find a
> breakpoint. The function I want to add is:
>
> y=(k+ax)(x<B)+(k+(a-d)B+dx)(x>B)
>
> How do I write it in the rq() function? Do I need to define the
> parameters
> in any way and how do I do that? I'm a biologist new to R.
You have not offered a test dataset, so no tested example from me but
why not use an indicator term in a formula:
(You also have not defined your model very clearly. What are the roles
of the different terms? There appear to be some superfluous constants.
The (a-d) expression would not fit into regression formula conventions
very well. (Either that or a linear model is not what you are asking
about.) This is just a wild guess:
You will get an intercept term by default which I take to be the "k"
term in your expression. I doubt the final form will be exactly as you
expected, but it may be able to recast it in you preferred form (once
you make clear what those terms mean). The estimate for the first term
should come out as an interaction with an x-term and an I(B<x) term.
The estimate for the second term should likewise yield two
coefficients. So (I thought) you should get 5 coefficients ...but
When I did it on a simple linear regression model I got 6, so I guess
my understanding of how many are linearly dependent was wrong.
> lm(y ~ x*I(x < 1) + dx*I(x >1), data=dat)
Call:
lm(formula = y ~ x * I(x < 1) + dx * I(x > 1), data = dat)
Coefficients:
(Intercept) x I(x < 1)TRUE dx
2.920 2.185 -25.026 49.833
I(x > 1)TRUE x:I(x < 1)TRUE dx:I(x > 1)TRUE
NA -33.605 -50.870
If that "dx" term is from a differential equation, then scratch all of
the above and instead use software appropriate to the task or perhaps
use the diff() function to create a delta-x. Or if you are trying to
do the very non-linear operation of estimating the B coefficient,
which is not "linear quantile regression". "Linear" in the
regression
context means linear in the coefficients and the expression (x<B)
would be non-linear when B is a coefficient to be estimated. In his
book "Quantile regression Koenker says changepoints can be estimated
with the nonparametric methods offered in `rqss`
I did try the above formula in package quantreg, but I get an error
from a singular design matrix:
require(quantreg)
rq( y ~ x*I(x < 1) + dx*I(x > 1), data=dat)
Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
So `rq` seems to be less capable of identifying and discarding
dependencies in the design matrix than is `lm`.
I think some exploratory analysis might be your best first choice,
using lowess. To see many types of regression methods illustrated you
might want to look at Vincent Zoonekynd pages:
http://zoonek2.free.fr/UNIX/48_R/10.html
(it's too bad Vincent still has quantreg examples on his TODO list,
but he does illustrate from package segmented and uses base `lowess`
and a couple of other to show the exploratory methods in action.)
And finishing back at rq with e advice to use exploratory methods as
in Koenker's example on p 16 of
http://cran.r-project.org/web/packages/quantreg/vignettes/rq.pdf
... where he used a 15-knot spline function inside an rq call.
--
David Winsemius, MD
West Hartford, CT