Hi Tim,
you have two "problems" at the same time:
1.) The warning you get means that your predictor (e.g. predictor1) has
a different range in the training set than in the test set. In this case
your test set contains values that lie outside the range of the training
set (for predictor1), so the spline basis is evaluated beyond its
boundary knots. This is only a problem if the ranges are REALLY
different. However, this doesn't lead to your second problem! So I think
you can just ignore the warning (especially as you write that both
training and test set have the same range).
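If you want to check the ranges rather than rely on the warning, a minimal
sketch in base R could look like the following. The data frames "train" and
"test" and their contents are made up for illustration; replace them with
your own data.

```r
# Hypothetical data: 'train' and 'test' stand in for the real data frames.
train <- data.frame(predictor1 = seq(0, 10, length.out = 50),
                    predictor2 = seq(5, 6, length.out = 50))
test  <- data.frame(predictor1 = seq(-1, 11, length.out = 25),
                    predictor2 = seq(5.2, 5.8, length.out = 25))

# For each predictor, count the test values outside the training range.
outside <- sapply(names(train), function(v) {
  r <- range(train[[v]])
  sum(test[[v]] < r[1] | test[[v]] > r[2])
})
outside
```

A count of zero for every predictor means the test data stay within the
training range; larger counts show which predictors trigger the warning.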
2.) The second problem you describe (negative predictions for a positive
outcome) has nothing to do with boosting or mboost. It results from
the fact that you estimate a model for a positive outcome, but nothing
in the model restricts the predictions to be positive: a prediction
can be ANY real number.
You can avoid this by, for example, modelling the log-transformed outcome
and / or using another family (depending on the type of your outcome).
Please consult the literature on generalized linear models (GLMs) for
further help.
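To illustrate the effect of the log transformation, here is a small sketch.
It uses lm() from base R instead of mboost, and the data are simulated and
noise-free, so it only demonstrates the principle and is not your model.

```r
# Illustration with lm() rather than mboost: the point is the outcome scale.
# Simulated, noise-free data with a strictly positive outcome (an assumption).
x <- seq(0, 10, length.out = 50)
y <- exp(2 - 0.4 * x)
new <- data.frame(x = c(12, 15))

fit_raw <- lm(y ~ x)        # model on the original scale
fit_log <- lm(log(y) ~ x)   # model on the log scale

pred_raw <- predict(fit_raw, newdata = new)       # can be any real number
pred_log <- exp(predict(fit_log, newdata = new))  # positive by construction
```

Here pred_raw is negative for the new x values, whereas pred_log is
positive, because exp() maps every real number to a positive one.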
Hope that helps
Benjamin
On 20.10.2010 12:00, r-help-request at r-project.org wrote:
> Message: 129
> Date: Wed, 20 Oct 2010 11:08:44 +0200
> From: Häring, Tim (LWF)<Tim.Haering at lwf.bayern.de>
> To:<r-help at r-project.org>
> Subject: [R] problem with predict(mboost,...)
> Message-ID: <70FC67C4A585D1489E66225A4E0238BAB3600C at RZS-EXC-CL06.rz-sued.bayern.de>
>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi,
>
> I use a mboost model to predict my dependent variable on new data. I get
> the following warning message:
> In bs(mf[[i]], knots = args$knots[[i]]$knots, degree = args$degree, :
>   some 'x' values beyond boundary knots may cause ill-conditioned bases
>
> The new predicted values are partly negative although the variable in the
> training data ranges from 3 to 8 on a numeric scale. In order to restrict the
> predicted values to the value range from 3 to 8 I limit the feature space of the
> prediction data on the minima and maxima of the training data for every
> predictor variable before applying the model on the new data.
> As baselearner in mboost I use splines ("bbs"):
>
> mod <- mboost(MF ~ bbs(predictor1) + bbs(predictor2) + bbs(...), data = train)
>
> I wonder why there are negative values when applying the model on new data,
> because both, training and prediction data have the same value ranges in the
> predictor variables.
>
> Did somebody get the same warning message? Can someone help me please?
>
> TIM
>
> ------------------------------------------
> Tim Häring
> Bavarian State Institute of Forestry
> Department of Forest Ecology
> Hans-Carl-von-Carlowitz-Platz 1
> D-85354 Freising
>
> E-Mail:tim.haering at lwf.bayern.de
> http://www.lwf.bayern.de
--
******************************************************************************
Dipl.-Stat. Benjamin Hofner
Institut für Medizininformatik, Biometrie und Epidemiologie
Friedrich-Alexander-Universität Erlangen-Nürnberg
Waldstr. 6 - 91054 Erlangen - Germany
Tel: +49-9131-85-22707
Fax: +49-9131-85-25740
Office:
Room 3.036
Universitätsstraße 22
(Entrance at the left side of the building)
benjamin.hofner at imbe.med.uni-erlangen.de
http://www.imbe.med.uni-erlangen.de/~hofnerb/
http://www.benjaminhofner.de