
Pred function - miss understanding?

1 message · Aitor Gastón

An AUC of 0.72 means that, ignoring ties, your model predicts a higher 
probability for the unthreatened species than for the threatened species in 
28% of all possible threatened/unthreatened species pairs in the training 
data. This performance is likely to be poorer when evaluated on another 
sample, because performance estimates computed on the training data are 
considerably optimistic. Try to evaluate the model with an independent 
sample or, if that is not possible, try internal validation with the 
bootstrap using the validate function from the Design package. 
Bootstrap-based validation performs better than other approaches (see 
Steyerberg et al., 2001. Internal validation of predictive models: 
Efficiency of some procedures for logistic regression analysis. Journal of 
Clinical Epidemiology, 54 (8): 774-781).

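The pairwise reading of AUC at the start of this message can be checked by hand. The following is a minimal sketch in Python, used here only for illustration since the thread itself concerns R. The function name `pairwise_auc` and the probabilities are made up and are not taken from the original model. Ties are counted as half a correct pair, which is the usual convention.

```python
from itertools import product


def pairwise_auc(threatened_probs, unthreatened_probs):
    """Share of (threatened, unthreatened) pairs ranked correctly.

    A pair is ranked correctly when the threatened species gets the
    higher predicted probability. Ties count as half a correct pair.
    """
    pairs = list(product(threatened_probs, unthreatened_probs))
    score = sum(1.0 if t > u else 0.5 if t == u else 0.0 for t, u in pairs)
    return score / len(pairs)


# Hypothetical predicted probabilities of being threatened.
threatened = [0.9, 0.6, 0.4]
unthreatened = [0.7, 0.3, 0.2]

auc = pairwise_auc(threatened, unthreatened)
misranked = 1.0 - auc  # share of pairs where the unthreatened species scores higher

print(f"AUC = {auc:.3f}, misranked pairs = {misranked:.3f}")
```

With these invented values, 7 of the 9 pairs are ranked correctly, so the AUC is 7/9 and the remaining 2/9 of the pairs are misranked, in the same way that an AUC of 0.72 leaves 28% of pairs misranked.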

If you want to transform predictions from probabilities to binary data, you 
have to choose a probability threshold. A simple approach is to use the 
prevalence in the training data as the threshold. There are other 
approaches, but I haven't used them (see Liu et al., 2005. Selecting 
thresholds of occurrence in the prediction of species distributions. 
Ecography, 28: 385-393, or Jimenez-Valverde & Lobo, 2007. Threshold 
criteria for conversion of probability of species presence to either-or 
presence-absence. Acta Oecologica, 31 (3): 361-369).
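
The prevalence-based threshold described above could look roughly like the following minimal sketch in Python (for illustration only; the thread itself concerns R). The helper names `prevalence_threshold` and `to_binary` and all data values are hypothetical. Whether a prediction exactly equal to the threshold counts as positive is an assumption made here, not something stated in the message.

```python
def prevalence_threshold(observed):
    """Prevalence: share of training observations in the positive class (1 = threatened)."""
    return sum(observed) / len(observed)


def to_binary(probs, threshold):
    """Classify as 1 when the predicted probability reaches the threshold.

    Using >= rather than > at the boundary is an arbitrary choice for this sketch.
    """
    return [1 if p >= threshold else 0 for p in probs]


# Hypothetical training labels and predicted probabilities.
observed = [1, 0, 0, 1, 0]
predicted = [0.8, 0.3, 0.45, 0.35, 0.1]

threshold = prevalence_threshold(observed)  # 2 threatened out of 5 species
binary = to_binary(predicted, threshold)

print(threshold, binary)
```

Here two of the five made-up species are threatened, so the threshold is 0.4, and the second threatened species (predicted 0.35) falls below it and is classified as unthreatened.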

Hope this helps,


Aitor


--------------------------------------------------
From: "Chris Mcowen" <chrismcowen at gmail.com>
Sent: Friday, August 27, 2010 3:47 PM
To: "Aitor Gastón González" <aitor.gaston at upm.es>
Subject: Re: [R-sig-eco] Pred function - miss understanding?