gls-crossvalidation
2 messages · Claudia liliana Ballesteros Mejia, Pinaud David
Dear Liliana,

What is "records"? Count data? It seems that you have a zero-inflated distribution (a lot of zeros) with overdispersion. We had the same problem and tried to solve it by using a zero-inflated (ZI) Poisson distribution rather than log(counts + 1). The catch is that this kind of distribution has a variance that increases with the mean, so you can get very large values in prediction... Maybe look at the packages "pscl" and "ZIGP".

HTH
David

On 26/08/2010 11:29, Claudia liliana Ballesteros Mejia wrote:
Dear list,
I'm trying to fit a gls model with a spatial component, and I want to validate my results with cross-validation. I wrote the code so that it returns the mean square error at the end. Supposedly it should work, but I'm getting huge numbers as results (ranging from 3 to 2000). Perhaps I should mention that my data have lots of zeros.
Can somebody tell me what's wrong?
This is the code I'm using for the crossvalidation.
m_err2.vect <- vector()
for (j in 1:10)
{
  print(j)
  select.rec <- sample(1:nrow(data.dmi), 0.9*nrow(data.dmi))
  train.rec <- data.rec[select.rec,]  # Selecting 90% of the data for training purpose
  test.rec <- data.rec[-select.rec,]  # Selecting 10% (remaining) for testing purpose
  gls.rec <- gls(log(records+1) ~ roads+pop+conflict+airport+rails+PA+pristine+tur_plac,
                 data = train.rec,
                 correlation = corSpher(form = ~X_Mol + Y_Mol, nugget = TRUE),
                 na.action = na.omit)
  # Create fitted values using test.dmi data
  rec_pred <- predict(gls.rec, test.rec)
  rec_obs <- test.rec[,"records"]
  # Get the prediction error = Mean Square Error (MSE)= 1/n
  m_err2 <- t(rec_pred - rec_obs) %*% (rec_pred - rec_obs) / nrow(test.rec)
  m_err2.vect <- c(m_err2.vect, m_err2)
}
m_err2.vect
[1] 155.68777 380.22485 121.41826 1188.19114 292.95930 40.00558 253.04283 13.58491 1239.02019 149.31290
mean(m_err2.vect)
[1] 383.3448
Thanks a lot in advance, any suggestion would be very much appreciated.
Cheers,
Liliana.
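
A note on the loop in the message above: gls.rec is fitted to log(records+1), so predict() returns values on that log scale, whereas rec_obs holds the raw counts. The squared differences therefore mix two scales, which by itself can inflate the MSE whenever some counts are large. The sketch below illustrates the effect under stated assumptions: the data are simulated, lm() stands in for gls() (predict() returns values on the scale of the modelled response in both cases), and the names dat, train, test and mse_* are hypothetical.

```r
# Simulated stand-in for the thread's data: 'records' and 'roads' reuse names
# from the post, but the values are made up for illustration.
set.seed(1)
n <- 500
dat <- data.frame(roads = runif(n))
# Poisson counts, with roughly 40% of them forced to zero
dat$records <- rpois(n, lambda = exp(0.5 + 2 * dat$roads)) * rbinom(n, 1, 0.6)

# 90% / 10% split, as in the post
test_idx <- sample(seq_len(n), size = round(0.1 * n))
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

# lm() is an assumption standing in for gls(); the response is log(records + 1)
fit <- lm(log(records + 1) ~ roads, data = train)
pred_log <- predict(fit, newdata = test)   # predictions on the log(records + 1) scale

obs_raw <- test$records
obs_log <- log(obs_raw + 1)

mse_mixed <- mean((pred_log - obs_raw)^2)           # log-scale prediction vs raw counts
mse_log   <- mean((pred_log - obs_log)^2)           # both on the log(records + 1) scale
mse_count <- mean((exp(pred_log) - 1 - obs_raw)^2)  # naive back-transform to counts

c(mixed = mse_mixed, log_scale = mse_log, count_scale = mse_count)
```

Comparing on the log scale is the most direct check of the fitted model. The naive back-transform exp(pred) - 1 tends to underestimate the mean count, so errors on the count scale should be read with that in mind.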
_______________________________________________
R-sig-ecology mailing list
R-sig-ecology at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
***************************************************
David PINAUD
Ingénieur de Recherche "Analyses spatiales"
Centre d'Etudes Biologiques de Chizé - CNRS UPR1934
79360 Villiers-en-Bois, France
poste 485
Tel: +33 (0)5.49.09.35.58
Fax: +33 (0)5.49.09.65.26
http://www.cebc.cnrs.fr/
***************************************************
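
The zero-inflated Poisson route suggested in the reply could look roughly like the sketch below. It assumes the "pscl" package is installed and uses its zeroinfl() function on simulated data. The single predictor, the intercept-only zero-inflation part, and the names dat, zip_fit, pred and mse are illustrative assumptions, and the spatial correlation of the original gls model is not handled here.

```r
library(pscl)  # assumed to be installed; provides zeroinfl()

# Simulated zero-inflated counts; the variable names reuse those in the post,
# but the values are made up for illustration.
set.seed(42)
n <- 400
dat <- data.frame(roads = runif(n))
# Structural zeros with probability 0.4, Poisson counts otherwise
dat$records <- ifelse(rbinom(n, 1, 0.4) == 1, 0L,
                      rpois(n, lambda = exp(0.3 + 1.5 * dat$roads)))

# 90% / 10% split, as in the post
test_idx <- sample(seq_len(n), size = round(0.1 * n))
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

# Count part uses 'roads'; the zero-inflation part is intercept-only ("| 1")
zip_fit <- zeroinfl(records ~ roads | 1, data = train, dist = "poisson")

# type = "response" returns expected counts, directly comparable to 'records'
pred <- predict(zip_fit, newdata = test, type = "response")
mse  <- mean((pred - test$records)^2)
mse
```

Because the counts are modelled directly, predictions and observations are on the same scale and no back-transformation is needed before computing the MSE.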