Bootstrapping

Sam

Sat, Aug 28, 2010 11:31 PM

Dear list,
My model is a binary model with a TRUE/FALSE response and a series of predictors, A to G.

My question concerns the predict() function and bootstrapping. Sorry for cross-posting; I was unsure which forum was relevant, and I apologise if this is the wrong one.


I have used predict() with type="response" to predict the outcome of new data based on my model.

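I have then used the boot() function for bootstrapping:

model1.fun <- function(dat, inds, i.pred, fit.pred, x.pred)
{
  # make the resampled indices visible while the model is refitted
  assign(".inds", inds, envir=.GlobalEnv)
  # refit to the fitted values plus resampled residuals
  lm.b <- glm(fit+resid[.inds] ~ A+B+C+D+E+F+G, data=test)
  pred.b <- predict(lm.b, x.pred)
  remove(".inds", envir=.GlobalEnv)
  # coefficients first, then the prediction errors
  c(coef(lm.b), pred.b-(fit.pred+dat$resid[i.pred]))
}

model1.boot <- boot(data=traits, statistic=model1.fun, R=999, m=1,
                    fit.pred=new.fit, x.pred=newData)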
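
A minimal, self-contained sketch of this kind of set-up on simulated data may make the structure easier to follow. Everything named `sim`, `sim.fun`, `new.x` and the two predictors is hypothetical and is not the data from this post. It also assumes a continuous response with additive errors (the default Gaussian family of glm()), which is what "fitted value plus resampled residual" presupposes; it is not meant as a recipe for a TRUE/FALSE response.

```r
library(boot)

set.seed(1)
# Simulated stand-in data (hypothetical; continuous response)
n <- 50
sim <- data.frame(A = rnorm(n), B = rnorm(n))
sim$y <- 1 + 2 * sim$A - sim$B + rnorm(n)

fit0 <- glm(y ~ A + B, data = sim)   # Gaussian family by default
sim$fit   <- fitted(fit0)
sim$resid <- resid(fit0)

new.x   <- data.frame(A = c(0, 1), B = c(0, 1))  # two new cases
new.fit <- predict(fit0, new.x)

sim.fun <- function(dat, inds, i.pred, fit.pred, x.pred) {
  # Refit to the fitted values plus resampled residuals
  dat$y.star <- dat$fit + dat$resid[inds]
  fit.b  <- glm(y.star ~ A + B, data = dat)
  pred.b <- predict(fit.b, x.pred)
  # Coefficients first, then one prediction error per new case
  c(coef(fit.b), pred.b - (fit.pred + dat$resid[i.pred]))
}

# m = 1 makes boot() pass one extra resampled index (i.pred) per replicate
sim.boot <- boot(sim, sim.fun, R = 199, m = 1,
                 fit.pred = new.fit, x.pred = new.x)
dim(sim.boot$t)   # R rows; 3 coefficient columns + 2 prediction-error columns
```

In this sketch the first `length(coef(fit0))` columns of `sim.boot$t` are bootstrap coefficients and the remaining columns are prediction errors, one per row of `new.x`.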

I am relatively new to bootstrapping and pieced together this code from various online sources. My questions are

1 - This is the output. How are these values used?

Call:
boot(data = traits, statistic = model1.fun, R = 999, m = 1, fit.pred = new.fit, 
 x.pred = newData)


Bootstrap Statistics :
      original      bias    std. error
t1*   0.007575497  0.18230191  0.19196660
t2*  -0.643937336  0.22086625  0.55583873
t3*  -0.366131418  0.12643852  0.31807412
t4*  -0.019436074 -0.01320383  0.49683325
t5*  -0.260817757  0.09496542  0.08263721
t6*   0.098142998 -0.04146995  0.11082210
t7*   0.291285409 -0.10313141  0.11456760
t8*   0.324143279 -0.10723659  0.11159225
t9*   0.400566802 -0.15227392  0.08602217
t10*  0.631600069 -0.20098945  0.78996314
t11*  0.423935018 -0.14632695  0.66234901
t12*  0.628394814 -0.20632632  0.56252950
t13*  0.466345278 -0.17405490  0.34344214

2 - I called mean(model1.boot$t[,8]^2) to calculate the bootstrap prediction error - is this correct and does it apply to all samples within the data set?
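
As a rough illustration of that calculation, the sketch below computes a mean squared prediction error for each prediction-error column of a bootstrap result. The matrix `t.mat` is simulated and only stands in for the `$t` component of a boot object; the number of coefficients (`n.coef`) and the assumption that the statistic returns coefficients first and prediction errors afterwards are hypothetical.

```r
set.seed(3)
R <- 999
n.coef <- 3   # hypothetical number of model coefficients

# Stand-in for the $t matrix: coefficient columns, then prediction-error columns
t.mat <- cbind(matrix(rnorm(R * n.coef), R, n.coef),
               matrix(rnorm(R * 2, sd = 0.3), R, 2))

# Columns that hold prediction errors (everything after the coefficients)
err.cols <- (n.coef + 1):ncol(t.mat)

# Mean squared prediction error, one value per prediction-error column
mspe <- colMeans(t.mat[, err.cols, drop = FALSE]^2)
mspe
```

Under these assumptions, which column index holds a prediction error depends on how many coefficients the model has, so it is worth checking against `length(coef(...))` of the fitted model.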

3 - I called

new.fit - sort(model1.boot$t[,8])[c(975,25)]

to get the bootstrap prediction limits.
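
The arithmetic behind the indices can be sketched as follows: with R = 999 replicates, the order statistics (R + 1) * 0.975 = 975 and (R + 1) * 0.025 = 25 of the prediction errors are subtracted from the point prediction. The vector `pred.err` and the value `new.fit1` are simulated stand-ins, not values from my model.

```r
set.seed(2)
R <- 999
pred.err <- rnorm(R, sd = 0.3)   # stand-in for one prediction-error column of $t
new.fit1 <- 0.4                  # hypothetical point prediction for one new case

# Positions of the upper and lower 2.5% order statistics
idx <- (R + 1) * c(0.975, 0.025)   # 975 and 25

# Subtracting the large error gives the lower limit, the small error the upper
limits <- new.fit1 - sort(pred.err)[idx]
limits   # lower limit first, then upper limit
```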

How do I use this data with the output from predict(model1, type="response")?

Sample output
      1            2            3            4            5            6            7 
0.176089986  0.613674752  0.128182584  0.432106503  0.157871072  0.491160896  0.337954702 
        8            9           10           11           12           13           14 
0.714518456  0.040612536  0.669099532  0.218551172  0.728698781 -0.050813284  0.728698781 

etc etc

Are these the same as the output from predict(), i.e. is this saying sample 1 has a 17.6% probability of being TRUE?


Thanks for any help

Sam