Hi Tim,

you have two "problems" at the same time:

1.) The warning means that a predictor (e.g. predictor1) covers a different range in the training set than in the test set. In this case, your test set contains values that lie outside the range of the training set (for predictor1). This is only a problem if the ranges are REALLY different. However, it does not cause your second problem! So I think you can just ignore the warning (especially as you write that both training and test set have the same range).

2.) The second problem you describe (negative predictions for a positive outcome) has nothing to do with boosting or mboost. It arises because you estimate a model for a positive outcome, but the model itself does not know about that restriction, so a prediction can be ANY real number. You can avoid this by, for example, modelling the log-transformed outcome and/or using another family (depending on the type of your outcome). Please consult the literature on generalized linear models (GLMs) for further help.

Hope that helps,
Benjamin
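P.S. A minimal sketch of both points in base R, in case it helps. It uses simulated data and a plain lm() rather than mboost, so the data frames `train` and `newdat`, the simulated values and the linear model are assumptions for illustration only; they are not your data or your model. The first part counts new observations outside the training range of predictor1, the second part fits the outcome on the log scale and back-transforms the predictions.

```r
set.seed(1)

# Hypothetical training data: outcome MF restricted to the range 3 to 8
train <- data.frame(predictor1 = runif(200, 0, 10))
train$MF <- pmin(pmax(3 + 0.5 * train$predictor1 + rnorm(200, sd = 0.5), 3), 8)

# Hypothetical new data, deliberately drawn from a wider range
newdat <- data.frame(predictor1 = runif(50, -2, 12))

# 1.) Count new observations outside the training range of predictor1;
#     these are the values that trigger the boundary-knot warning
outside <- newdat$predictor1 < min(train$predictor1) |
           newdat$predictor1 > max(train$predictor1)
n_outside <- sum(outside)

# 2.) Fit on the log scale and back-transform:
#     exp() of any finite number is positive, so predictions cannot be negative
fit_log <- lm(log(MF) ~ predictor1, data = train)
pred <- exp(predict(fit_log, newdata = newdat))

all(pred > 0)
```

Note that the log transformation only guarantees positive predictions; it does not force them into the interval from 3 to 8. The same transformation could in principle be applied to the outcome before fitting with mboost.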
On 20.10.2010 12:00, r-help-request at r-project.org wrote:
Message: 129
Date: Wed, 20 Oct 2010 11:08:44 +0200
From: Häring, Tim (LWF) <Tim.Haering at lwf.bayern.de>
To: <r-help at r-project.org>
Subject: [R] problem with predict(mboost,...)
Hi,
I use an mboost model to predict my dependent variable on new data. I get the following warning message:
In bs(mf[[i]], knots = args$knots[[i]]$knots, degree = args$degree, :
some 'x' values beyond boundary knots may cause ill-conditioned bases
The new predicted values are partly negative, although the variable in the training data ranges from 3 to 8 on a numeric scale. In order to restrict the predicted values to the range from 3 to 8, I limit the feature space of the prediction data to the minima and maxima of the training data for every predictor variable before applying the model to the new data.
As baselearner in mboost I use splines ("bbs"):
mod <- mboost(MF ~ bbs(predictor1) + bbs(predictor2) + bbs(...), data = train)
I wonder why there are negative values when applying the model to new data, because both the training and the prediction data have the same value ranges in the predictor variables.
Did somebody get the same warning message? Can someone help me please?
TIM
------------------------------------------
Tim Häring
Bavarian State Institute of Forestry
Department of Forest Ecology
Hans-Carl-von-Carlowitz-Platz 1
D-85354 Freising
E-Mail:tim.haering at lwf.bayern.de
http://www.lwf.bayern.de
******************************************************************************
Dipl.-Stat. Benjamin Hofner
Institut für Medizininformatik, Biometrie und Epidemiologie
Friedrich-Alexander-Universität Erlangen-Nürnberg
Waldstr. 6 - 91054 Erlangen - Germany
Tel: +49-9131-85-22707
Fax: +49-9131-85-25740
Office: Room 3.036, Universitätsstraße 22 (Entrance at the left side of the building)
benjamin.hofner at imbe.med.uni-erlangen.de
http://www.imbe.med.uni-erlangen.de/~hofnerb/
http://www.benjaminhofner.de