Hello R-sig-Geo,
my name is Edoardo and this is the first time I am asking for help here.
I am working through the "sdm" vignette of the "dismo" package. When I use
the logistic regression model fitted to the presence/absence (background)
data for Bradypus (p. 53) for spatial prediction, I get values outside the
range [0,1]. I think this should not happen, because the logistic cumulative
distribution function should restrict the predictions to the [0,1] range.
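To make that expectation concrete, here is a minimal sketch of the function I have in mind. The helper name inv_logit and the eta values are made up for illustration and do not come from the vignette or from my data; plogis() is the base R equivalent.

```r
# Logistic CDF (inverse logit): maps any real-valued linear predictor
# eta to a probability between 0 and 1.
inv_logit <- function(eta) 1 / (1 + exp(-eta))

# Illustrative linear-predictor values (hypothetical, not from my data)
eta <- c(-25, -5, 0, 5, 25)
p <- inv_logit(eta)
print(p)

# Base R ships the same function as plogis()
print(all.equal(p, plogis(eta)))
```

Whatever the size of eta, the result stays inside [0,1], which is why I expected the raster values to do the same.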
The data and the code follow:
## This is the dataset I use (envtrain0). It is just a random sample of the
## original envtrain dataset of the "sdm" vignette
    pa bio1 bio12 bio16 bio17 bio5 bio6 bio7 bio8 biome
735  0  265  2618   954   391  314  224   90  264     1
634  0  264  2590  1007   203  327  200  127  263     1
861  0  243  1783   916    29  335  132  202  247     1
70   1  261  2791  1077   148  310  218   92  254     1
48   1  257  3575  1295   359  316  202  114  258     1
511  0  220  1583   457   281  339  114  225  226     7
344  0  230  3352  1436   247  294  167  127  228     1
453  0  242   240   140    14  308  186  121  260    13
325  0  240  1541   447   302  291  186  104  249     1
193  0  232  1130   506    87  311  123  188  256     1
568  0  257  2269   774   246  317  188  128  259     1
664  0  190  1512   544   251  282   85  196  219     1
82   1  268  2403   723   407  327  207  120  268     1
271  0  180  1078   336   195  320   65  255  238     7
873  0  258  4238  1378   814  301  222   79  260     1
105  0   92   719   290    98  150   36  114   92     1
461  0  271  2112   849   254  330  224  106  268     1
37   1  276   372   215    25  330  222  108  278    13
822  0  110  1427   707   115  232   31  201   79     4
59   1  252  2471  1094   172  319  194  124  254     1
26   1  199  2005   657   256  262  143  119  196     1
115  0  215  1247   518   178  329   79  251  271     5
714  0  257  1903   714   162  325  182  143  260     1
781  0  265  2048   909   172  329  212  117  259     1
84   1  263  2921   868   607  318  205  113  262     1
604  0  119  1251   595   117  237   41  196   87     4
449  0  243  1307   540    86  310  148  162  259     7
384  0  233  2257   955   237  300  159  140  241     1
187  0  125  1282   616   115  230   50  180  101     4
278  0  254   819   521    12  314  199  115  255    13
766  0  178  1220   333   269  316   64  251  201     7
426  0  255  2207   984    52  350  159  191  253     1
153  0  253  1874   775   101  320  172  148  256     1
174  0  220  1410   492   168  304  113  191  241     7
214  0  238  1667   722   101  320  159  161  245     1
626  0  258  2116   959   203  333  206  127  252     7
779  0  224  1834   907    28  316  107  210  235     7
429  0  247   569   226    58  315  180  136  237    13
580  0  261  2507  1085    94  346  171  175  258     1
388  0  252  2408   886   278  319  181  138  255     1
## I then estimate a Logit model and use it for prediction
## with new data (predictors)
library(dismo)  # also attaches the raster package (stack, extent, predict)
logit = glm(pa ~ bio1 + bio5 + bio6 + bio7 + bio8 + bio12 + bio16 + bio17,
            family = binomial(link = 'logit'), data = envtrain0)
files = list.files(paste(system.file(package = 'dismo'), '/ex', sep = ''),
                   pattern = 'grd', full.names = TRUE)
predictors = stack(files)
ext = extent(-90, -32, -33, 23)
pred = predict(predictors, logit, ext = ext)
pred
## The output says that the predicted values are in the range
## (-22.41863, 5.837521)
class : RasterLayer
dimensions : 112, 116, 12992 (nrow, ncol, ncell)
resolution : 0.5, 0.5 (x, y)
extent : -90, -32, -33, 23 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : layer
values : -22.41863, 5.837521 (min, max)
I was expecting values within the range [0,1], as a logit model should
give. Indeed, the fitted values of the model all lie between 0 and 1.
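A check of this kind on the fitted values can be sketched as follows. It uses simulated toy data rather than the envtrain0 sample above, and the names toy, m and fv are made up for illustration.

```r
# Toy presence/absence data (simulated, NOT the envtrain0 sample)
set.seed(1)
toy <- data.frame(x = rnorm(200))
toy$pa <- rbinom(200, 1, plogis(0.5 * toy$x))

# Same model family as in my code: binomial GLM with a logit link
m <- glm(pa ~ x, family = binomial(link = "logit"), data = toy)

# fitted() returns values on the probability scale
fv <- fitted(m)
print(range(fv))
```

With my real model the equivalent call would be range(fitted(logit)).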
Why are the predicted values in the range (-22.41863, 5.837521) here?
Can you tell me where I am making a mistake?
Thanks
Regards,
Edoardo