Dear List-members,
I found a strange behaviour in the lqs function.
Suppose I have the following data:
y <- c(7.6, 7.7, 4.3, 5.9, 5.0, 6.5, 8.3, 8.2, 13.2, 12.6, 10.4, 10.8,
13.1, 12.3, 10.4, 10.5, 7.7, 9.5, 12.0, 12.6, 13.6, 14.1, 13.5, 11.5,
12.0, 13.0, 14.1, 15.1)
x1 <- c(8.2, 7.6,, 4.6, 4.3, 5.9, 5.0, 6.5, 8.3, 10.1, 13.2, 12.6, 10.4,
10.8, 13.1, 13.3, 10.4, 10.5, 7.7, 10.0, 12.0, 12.1, 13.6, 15.0, 13.5,
11.5, 12.0, 13.0, 14.1)
x2 <- c(23.005, 23.873, 26.417, 24.868, 29.895, 24.200, 23.215, 21.862,
22.274, 23.830, 25.144, 22.430, 21.785, 22.380, 23.927, 33.443, 24.859,
22.686, 21.789, 22.041, 21.033, 21.005, 25.865, 26.290, 22.932, 21.313,
20.769, 21.393)
and I would like to fit the following model:
mod1 <- lqs(y ~ x1 + x2, method="lms", nsamp="exact")
mod1$coefficients
(Intercept) x1 x2
35.5293489 0.4422742 -1.2944534
mod1$bestone
[1] 12 17 27
Now, instead of using the formula, I want to provide the design matrix and
the response, then:
X <- cbind(1, x1, x2)
mod2 <- lqs.default(X, y, intercept=F, method="lms",
nsamp="exact")
mod2$coefficients
x1 x2
35.4217275 0.4276641 -1.2834731
mod2$bestone
[1] 6 14 15
The results are not the same (?!). Furthermore, if I create the design
matrix without the column of 1's for the intercept:
X <- cbind(x1, x2)
mod3 <- lqs.default(X, y, intercept=T, method="lms",
nsamp="exact")> mod3$coefficients
(Intercept) x1 x2
35.5293489 0.4422742 -1.2944534> mod3$bestone
[1] 12 17 27
I get the same result I had using the formula (see mod1).
This is confusing me!
Another small problem appears if I re-fit the first model using the
formula and asking to return the x-matrix and the response:
lqs(y ~ x1 + x2, method="lms", nsamp="exact", x.ret=T,
y.ret=T)
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames, :
variable lengths differ
Does anyone know what's going on?
Regards, with thanks in advance,
Luca Scrucca
+-----------------------------------------------------------------------+
| Dr. Luca Scrucca |
| Dipartimento di Scienze Statistiche tel. +39 - 075 - 5855278 |
| Universita' degli Studi di Perugia fax. +39 - 075 - 43242 |
| Via Pascoli - C.P. 1315 Succ. 1 |
| 06100 PERUGIA (ITALY) |
| |
| E-mail: luca at stat.unipg.it |
| Web page: http://www.stat.unipg.it/luca |
+-----------------------------------------------------------------------+
On Mon, 10 Feb 2003, Luca Scrucca wrote:> I found a strange behaviour in the lqs function.Why is this strange? It *is* covered in the Note on the help page: please read it and the reference there.> Suppose I have the following data:[...]> Now, instead of using the formula, I want to provide the design matrix and > the response, then: > > X <- cbind(1, x1, x2) > mod2 <- lqs.default(X, y, intercept=F, method="lms", nsamp="exact") > mod2$coefficients > x1 x2 > 35.4217275 0.4276641 -1.2834731 > mod2$bestone > [1] 6 14 15 > > The results are not the same (?!). Furthermore, if I create the design > matrix without the column of 1's for the intercept:Right, it's not the same algorithm.> X <- cbind(x1, x2) > mod3 <- lqs.default(X, y, intercept=T, method="lms", nsamp="exact")> I get the same result I had using the formula (see mod1). > This is confusing me!That *is* the same algorithm.> Another small problem appears if I re-fit the first model using the > formula and asking to return the x-matrix and the response: > > lqs(y ~ x1 + x2, method="lms", nsamp="exact", x.ret=T, y.ret=T) > > Error in model.frame(formula, rownames, variables, varnames, extras, > extranames, : > variable lengths differ > > Does anyone know what's going on?No, but R has debugging tools for you to find out .... -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
On 10 Feb 2003 at 15:40, Luca Scrucca wrote: I am not sure about this, but lqs have an extra argument `adjust'. from the help page: adjust: should the intercept be optimized for each sample? Defaults to TRUE When you use formula, the estimation algorithm knows the columns of 1's is the intercept, and can traet it accordingly. When yo make the matrix X with on regressor column being all1's, the estimation algorithm does'nt know it represents the intercept (you did'nt give it a name). Could this be the reason why the different results? Testing this conjecture: mod4 <- lqs(y ~ x1 + x2, method="lms", nsamp="exact", adjust=FALSE)> coef(mod4)(Intercept) x1 x2 35.4217275 0.4276641 -1.2834731> coef(mod1)(Intercept) x1 x2 35.5293489 0.4422742 -1.2944534> coef(mod2)x1 x2 35.4217275 0.4276641 -1.2834731 so this might be the explanation. By the way, for the last question, using debug() and looking at the output gives: Browse[1]> n debug: subset <- eval(substitute(subset), data, env) Browse[1]> n debug: data <- .Internal(model.frame(formula, rownames, variables, varnames, extras, extranames, subset, na.action)) Browse[1]> n Error in model.frame(formula, rownames, variables, varnames, extras, extranames, : variable lengths differ so the error occurs in model.frame. You could continue the debugging yourself. (don't have time now). Kjetil Halvorsen> Dear List-members, > > I found a strange behaviour in the lqs function. > Suppose I have the following data: > > y <- c(7.6, 7.7, 4.3, 5.9, 5.0, 6.5, 8.3, 8.2, 13.2, 12.6, 10.4, 10.8, > 13.1, 12.3, 10.4, 10.5, 7.7, 9.5, 12.0, 12.6, 13.6, 14.1, 13.5, 11.5, > 12.0, 13.0, 14.1, 15.1) > x1 <- c(8.2, 7.6,, 4.6, 4.3, 5.9, 5.0, 6.5, 8.3, 10.1, 13.2, 12.6, 10.4, > 10.8, 13.1, 13.3, 10.4, 10.5, 7.7, 10.0, 12.0, 12.1, 13.6, 15.0, 13.5, > 11.5, 12.0, 13.0, 14.1) > x2 <- c(23.005, 23.873, 26.417, 24.868, 29.895, 24.200, 23.215, 21.862, > 22.274, 23.830, 25.144, 22.430, 21.785, 22.380, 23.927, 33.443, 24.859, > 22.686, 21.789, 22.041, 21.033, 21.005, 25.865, 26.290, 22.932, 21.313, > 20.769, 21.393) > > and I would like to fit the following model: > > mod1 <- lqs(y ~ x1 + x2, method="lms", nsamp="exact") > mod1$coefficients > (Intercept) x1 x2 > 35.5293489 0.4422742 -1.2944534 > mod1$bestone > [1] 12 17 27 > > Now, instead of using the formula, I want to provide the design matrix and > the response, then: > > X <- cbind(1, x1, x2) > mod2 <- lqs.default(X, y, intercept=F, method="lms", nsamp="exact") > mod2$coefficients > x1 x2 > 35.4217275 0.4276641 -1.2834731 > mod2$bestone > [1] 6 14 15 > > The results are not the same (?!). Furthermore, if I create the design > matrix without the column of 1's for the intercept: > > X <- cbind(x1, x2) > mod3 <- lqs.default(X, y, intercept=T, method="lms", nsamp="exact") > > mod3$coefficients > (Intercept) x1 x2 > 35.5293489 0.4422742 -1.2944534 > > mod3$bestone > [1] 12 17 27 > > I get the same result I had using the formula (see mod1). > This is confusing me! > > Another small problem appears if I re-fit the first model using the > formula and asking to return the x-matrix and the response: > > lqs(y ~ x1 + x2, method="lms", nsamp="exact", x.ret=T, y.ret=T) > > Error in model.frame(formula, rownames, variables, varnames, extras, > extranames, : > variable lengths differ > > Does anyone know what's going on? > > Regards, with thanks in advance, > > Luca Scrucca > > > +-----------------------------------------------------------------------+ > | Dr. Luca Scrucca | > | Dipartimento di Scienze Statistiche tel. +39 - 075 - 5855278 | > | Universita' degli Studi di Perugia fax. +39 - 075 - 43242 | > | Via Pascoli - C.P. 1315 Succ. 1 | > | 06100 PERUGIA (ITALY) | > | | > | E-mail: luca at stat.unipg.it | > | Web page: http://www.stat.unipg.it/luca | > +-----------------------------------------------------------------------+ > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > http://www.stat.math.ethz.ch/mailman/listinfo/r-help