Bootstrapping issues
Hi
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Clive Nicholas
Sent: Monday, November 12, 2012 8:06 AM
To: r-help at r-project.org
Subject: [R] Bootstrapping issues
sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i686-pc-linux-gnu (32-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8
[4] LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] boot_1.3-7
loaded via a namespace (and not attached):
[1] tools_2.15.2
Hello. I have a very straightforward question. Here's some simulated data (N=500):
test <- data.frame(A=rnorm(500,mean=2.72,sd=5.36),
                   B=sample(c(12,20,24,28,32),size=500,prob=c(0.333,0.026,0.026,0.436,0.179),replace=TRUE),
                   C=sample(c(0,1),size=500,replace=TRUE),
                   D=sample(c(0,1),size=500,replace=TRUE))
head(test)
          A  B C D
1  1.181804 28 1 0
2 -5.602307 12 1 1
3  2.925090 24 1 1
4  3.437408 28 1 0
5 -6.503531 32 0 0
6 11.013888 12 1 1
which I then bootstrap using
library(boot)
bs <- function(formula, data, indices) {
  test <- data[indices,]
  fit <- lm(formula, data=test)
  return(coef(fit))
}
The following works
results <- boot(data=test, statistic=bs, R=1000, A~B+C+D+C*D)
Actually it does not work either
results <- boot(data=test, statistic=bs, R=1000, A~B+C+D+C*D)
Error in data[indices, ] : incorrect number of dimensions
I am not sure, but I suspect your bs function expects an indices vector and what it receives is somehow not in accordance with your data.

Regards
Petr
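One possible explanation, offered as a sketch and not a confirmed diagnosis: boot() calls the statistic as statistic(data, indices, ...), and extra arguments reach it only through "...". In the calls above the formula is passed unnamed, so it may be matched to one of boot()'s own arguments instead of being forwarded to bs (that part is an assumption about how the unnamed argument is matched). bs would then receive the data frame as its formula and the index vector as its data, which would be consistent with the "incorrect number of dimensions" error. Naming the argument formula= avoids this. The set.seed() call and the name d are additions for illustration only.

```r
library(boot)

set.seed(1)  # assumption: added only to make this sketch reproducible

# Simulated data as in the original post
test <- data.frame(
  A = rnorm(500, mean = 2.72, sd = 5.36),
  B = sample(c(12, 20, 24, 28, 32), size = 500,
             prob = c(0.333, 0.026, 0.026, 0.436, 0.179), replace = TRUE),
  C = sample(c(0, 1), size = 500, replace = TRUE),
  D = sample(c(0, 1), size = 500, prob = c(0.564, 0.436), replace = TRUE)
)

# Statistic function: refit the model on the resampled rows
bs <- function(formula, data, indices) {
  d <- data[indices, ]          # rows selected for this bootstrap replicate
  fit <- lm(formula, data = d)
  coef(fit)
}

# formula is passed BY NAME, so it travels through "..." to bs();
# boot() then fills bs's remaining arguments (data, indices) positionally
results <- boot(data = test, statistic = bs, R = 1000,
                formula = A ~ B + C + D + C * D)
results
```

If this is indeed the cause, the same named-argument call should run regardless of whether D is sampled with equal or fixed proportions, since the error would not depend on the data at all.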
results

But when I then amend the dataset by changing the D variable to simulate fixed proportions

D=sample(c(0,1),size=500,prob=c(0.564,0.436),replace=TRUE)

head(test)
            A  B C D
1  5.73771963 28 0 1
2 -0.19040750 12 1 0
3  2.22515982 12 0 1
4 -0.02905223 32 1 0
5  4.68314112 28 0 1
6  5.10711732 12 1 0

the same bootstrapping routine chokes with an error

results <- boot(data=test, statistic=bs, R=1000, A~B+C+C*D)
Error in data[indices, ] : incorrect number of dimensions

despite the fact that the B variable also has simulated fixed proportions and yet the original code ran without any errors.

I have two general observations to make about this: (1) this does not make sense; and (2) I don't understand this. How best to make these two observations go away and run the code to my satisfaction?

Many thanks.

--
Clive Nicholas (clivenicholas.posterous.com)

[Please DO NOT mail me personally here, but at <clivenicholas at hotmail.com>. Please respond to contributions I make in a list thread here. Thanks!]

"My colleagues in the social sciences talk a great deal about methodology. I prefer to call it style." -- Freeman J. Dyson