negative r-squares
This raises the question, though, of whether one should use the mean of the training data or the mean of the test data when calculating the total sum of squares. I believe the former is fairer for answering whether a given model predicts the response any better than the null model. When sst is based on the mean of the test data, you are essentially comparing your model to a null model that was fitted to different data (which I think isn't fair), and that is probably the reason why ss.err > sst, and hence R2 < 0.

Caspar

On Thu, Sep 9, 2010 at 9:15 PM, Edzer Pebesma <edzer.pebesma at uni-muenster.de> wrote:
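To make the train-mean versus test-mean question concrete, here is a minimal sketch in R. The helper name `r2_vs_baseline` and all numbers are made up for illustration and do not come from the thread; the "predictions" are simply typed in rather than produced by a fitted model.

```r
# Hypothetical helper: R2 of predictions relative to a constant baseline.
r2_vs_baseline <- function(obs, pred, baseline) {
  sse <- sum((obs - pred)^2)      # model sum of squared errors
  sst <- sum((obs - baseline)^2)  # sum of squares around the baseline (null model)
  1 - sse / sst
}

# Made-up values, for illustration only.
train_y   <- c(10, 11, 10, 11, 10, 11)
test_y    <- c(10, 11, 30)          # the last test value is extreme
test_pred <- c(10.4, 10.5, 10.7)    # pretend model predictions at the test points

# Null model = mean of the test values
r2_test_mean <- r2_vs_baseline(test_y, test_pred, mean(test_y))

# Null model = mean of the training values
r2_train_mean <- r2_vs_baseline(test_y, test_pred, mean(train_y))

print(c(test_mean = r2_test_mean, train_mean = r2_train_mean))
```

With these numbers the test-mean version is negative while the train-mean version is slightly positive. For the same predictions, the train-mean version can never be lower than the test-mean version, because the test mean is the constant that minimises the sum of squares over the test values.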
Pinar, Jason,

From the script below it seems no adjustment for degrees of freedom is being made. In this case R2 can become negative because you use different test and training sets. Suppose the test set contains a single extreme value that is not present in the training set. In that case the mean of the test values is, in terms of sum of squares, a better predictor than your regression model, which didn't know about this outlier. Don't forget that the mean of the test set does contain this outlier. Hence, R2 can easily become negative when evaluated over a different data set than the one the regression model was derived from.

On 09/09/2010 06:25 PM, Jason Gasper wrote:
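The outlier argument can be reproduced with a tiny example. This is only a sketch under assumed, made-up data (a weak linear trend in the training set, and a test set with and without one extreme value); the names `train`, `test_clean`, `test_extreme` and `r2_test` are hypothetical.

```r
# Made-up training data with a weak trend; not taken from the thread.
train <- data.frame(x = 1:6, y = c(10, 11, 10, 11, 10, 11))
fit <- lm(y ~ x, data = train)

# R2 on held-out data, with the total sum of squares taken around
# the mean of the test values.
r2_test <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}

test_clean   <- data.frame(x = c(2, 4, 6), y = c(10, 11, 11))
test_extreme <- data.frame(x = c(2, 4, 6), y = c(10, 11, 30))  # one extreme value

r2_clean   <- r2_test(test_clean$y,   predict(fit, newdata = test_clean))
r2_extreme <- r2_test(test_extreme$y, predict(fit, newdata = test_extreme))

print(c(clean = r2_clean, extreme = r2_extreme))
```

Here the same fitted model gives a positive R2 on the clean test set and a negative one once a single extreme value is present, because the test mean already "knows" about that value while the model does not.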
Hello Pinar,

I don't know for sure what your calculation is, but R2 values can range from -inf to 1 if an adjusted R2 is being used. In other words, one possibility is that you're adjusting for degrees of freedom using some variation of 1 - ((n-1)/(n-k))(1-R2), where the adjusted R2 reduces to the unadjusted R2 when k=1. So when the estimated R2 is less than or equal to 0, the model forecast is no better than the mean (a really poor fit). Another way of looking at a negative R2 is that the fit is worse than a horizontal line, so the sum of squares from the model is larger than the sum of squares from a horizontal line. Again, a poor fit.

Cheers,
Jason

Pinar Aslantas Bostan wrote:
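As a rough illustration of how a degrees-of-freedom adjustment alone can push R2 below zero, the sketch below reads the adjustment as the usual adjusted R2, 1 - ((n-1)/(n-k))(1-R2). Taking k as the number of estimated coefficients including the intercept is an assumption (the thread does not define k), and the function name `adjusted_r2` and the input values are made up.

```r
# Hypothetical helper: adjusted R2 for n observations and k estimated
# coefficients (assumed here to include the intercept).
adjusted_r2 <- function(r2, n, k) {
  1 - ((n - 1) / (n - k)) * (1 - r2)
}

# A weak fit with many coefficients relative to n: the adjustment
# turns a small positive R2 into a negative value.
weak_fit <- adjusted_r2(r2 = 0.05, n = 20, k = 8)

# With k = 1 the factor (n-1)/(n-k) is 1, so nothing changes.
same_as_r2 <- adjusted_r2(r2 = 0.05, n = 20, k = 1)

print(c(weak_fit = weak_fit, same_as_r2 = same_as_r2))
```

Under this reading, the result for a fitted `lm` object should agree with the `adj.r.squared` component of `summary()`, which is a convenient cross-check.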
Hi all,

I am working on a comparison of kriging and regression methods. I have one dependent variable (PREC) and seven independent variables. I created 10 different test and training datasets. I use the training datasets to build the models and the test datasets to calculate the error (RMSE) and r-squares. Once I have obtained prediction values for the grid, I use overlay() to get predictions for the test dataset. For example:

# regression kriging
# dem is the grid (I want to get predictions for each pixel of dem) and dem$rk.pred1 contains regression kriging predictions
test1$rk.predicted = dem$rk.pred1[overlay(dem, test1)]
# calculating r-square values based on test values
ss <- (test1$PREC - mean(test1$PREC))^2
sst1 <- sum(ss)
e <- (test1$PREC - test1$rk.predicted)^2
sse.rk <- sum(e)
rk1.r.square <- 1 - (sse.rk / sst1)
My problem is that, for some datasets, the methods can produce negative r-squares. I have given regression kriging as an example here, but the same problem can also occur for linear regression. I checked the dependent and independent variables and there is no problem with them. Does anyone know of another function that could be used instead of overlay() for the same purpose? (I thought the problem might be caused by the overlay function.) Or do you have any idea what causes the negative r-square values?

Best regards,
Pinar

********************************************************************************
Pinar Aslantas Bostan
Research Assistant
Department of Geodetic and Geographic Information Technologies (GGIT)
Middle East Technical University
06531 Ankara/TURKEY
aslantas at metu.edu.tr
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
--
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster
Weseler Straße 253, 48151 Münster, Germany
Phone: +49 251 8333081, Fax: +49 251 8339763
http://ifgi.uni-muenster.de
http://www.52north.org/geostatistics
e.pebesma at wwu.de