"average" regression - bootstrap?
Hi Johannes,

I think your approach is reasonable. As you pointed out, however, generating parameter distributions the way you did is not strictly the same as bootstrapping. The bootstrap simulates repeated sampling from the original target population by assuming that the available sample is representative of that population and resampling the sample instead. It is therefore an indirect way of gathering information about the original population, used because resampling the population of interest is often impossible in practice. In your case, you are simulating the target population directly (from a uniform distribution with known limits), and thus bootstrapping is not needed.

However, by directly simulating the target population and its samples, your results will mainly reflect properties of this abstract, infinite population. If you are also interested in a more "real world" setting, you could first simulate a large but finite population and then sample from it. At the same time, you could take a single random sample from this finite population and apply the bootstrap to it, as people would usually be able to do with their own data. Then you could compare the results.

It is also useful to check the shape of the resulting distributions before choosing appropriate measures to summarize them. For instance, the R2 sampling distribution is likely to be skewed, so using the mean will emphasize the tail values; the median could be more representative of the central tendency of the distribution in this case.

Regards

2011/8/29, Johannes Radinger <JRadinger at gmx.at>:
Hello,
I have a somewhat tricky statistical problem. First of all, I want to fit a
standard linear regression. My model is:
X <- function() runif(length(Xa), Xa, Xb)
model <- lm(Y ~ X())
so X is a function drawing a random number between Xa and Xb (that is
necessary in my case). What I did so far is:
example1 <- list()
n <- 1000
for (i in 1:n) {
  model <- lm(Y ~ X())
  example1[[paste("run", i, sep = "")]] <- model
}
So I ran the regression 1000 times and created a list with the regression
parameters for each run.
How can I analyse these results now? I can get nice mean values for p,
R-squared etc. but is that the right way?
So I thought a bootstrap approach might help in this case: instead of
repeating the regression "manually", I could use the bootstrap. But does the
boot function allow the use of runif for the X variable, so that a new number
is drawn in each bootstrap run? If so, that would be nice, because I would
then get summarized results, which is what I want. On the other hand, I don't
necessarily need the subsampling of the bootstrap, so in my case the subsample
would be all cases. Does that make sense?
Hopefully you can give me some input.
best regards
Johannes
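
One possible way to analyse the repeated fits from the loop in this message is sketched below. Rather than storing whole model objects, it keeps only the quantities of interest from each run and summarizes them with the median and quantiles, which is less sensitive to skewed distributions than the mean. The data (Xa, Xb, Y) are invented purely for illustration, and the names fits and run_summary are hypothetical, not part of the original post.

```r
set.seed(1)  # for reproducibility of this illustration only

# Hypothetical data (assumption): interval limits Xa < Xb and a response Y
n_obs <- 50
Xa <- runif(n_obs, 0, 10)
Xb <- Xa + runif(n_obs, 0.5, 2)
Y  <- 2 + 0.5 * (Xa + Xb) / 2 + rnorm(n_obs)

n_runs <- 1000

# One row per run: draw X within [Xa, Xb], refit the model,
# and keep the slope, its p-value and R-squared
fits <- t(replicate(n_runs, {
  X <- runif(length(Xa), Xa, Xb)
  s <- summary(lm(Y ~ X))
  c(slope = unname(coef(s)["X", "Estimate"]),
    p     = unname(coef(s)["X", "Pr(>|t|)"]),
    r2    = s$r.squared)
}))

# Inspect the shape of a distribution before choosing a summary, e.g.:
# hist(fits[, "r2"], main = "R-squared across runs", xlab = "R-squared")

# Median and 2.5% / 97.5% quantiles of each quantity across runs
run_summary <- apply(fits, 2, quantile, probs = c(0.025, 0.5, 0.975))
print(run_summary)
```

Each column of run_summary then describes how much one quantity varies purely because of the random draw of X between Xa and Xb.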
--
_______________________________________________ R-sig-ecology mailing list R-sig-ecology at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Pedro A. C. Lima Pequeno
Programa de Pós-graduação em Ecologia
Instituto Nacional de Pesquisas da Amazônia
Manaus, AM, Brasil
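
The comparison suggested in the reply at the top of this thread (repeated sampling from a large finite population versus bootstrapping a single sample) could be sketched roughly as follows. This is only an illustration in base R: the population model, the sizes N, n and R, and the helper r2() are all assumptions made for the example, not something specified in the thread.

```r
set.seed(42)  # for reproducibility of this illustration only

# Hypothetical finite population (assumption): 10,000 units, linear relation
N <- 10000
pop <- data.frame(X = runif(N, 0, 10))
pop$Y <- 1 + 0.3 * pop$X + rnorm(N, sd = 2)

# Helper (hypothetical name): R-squared of a simple regression on a data frame
r2 <- function(d) summary(lm(Y ~ X, data = d))$r.squared

n <- 50    # sample size
R <- 1000  # number of replicates

# (a) Repeated sampling from the finite population itself,
#     which is rarely possible with real data
r2_sampling <- replicate(R, r2(pop[sample(N, n), ]))

# (b) A single observed sample, resampled with replacement (the bootstrap)
obs <- pop[sample(N, n), ]
r2_boot <- replicate(R, r2(obs[sample(n, n, replace = TRUE), ]))

# Compare mean, median and 2.5% / 97.5% quantiles of both distributions
compare <- rbind(
  sampling  = c(mean = mean(r2_sampling),
                quantile(r2_sampling, c(0.025, 0.5, 0.975))),
  bootstrap = c(mean = mean(r2_boot),
                quantile(r2_boot, c(0.025, 0.5, 0.975)))
)
print(round(compare, 3))
```

Because the bootstrap distribution is built from one particular sample, it is centred on that sample's R2 rather than on the population value, so the two rows need not agree closely. Comparing the mean and median within each row also gives a quick indication of how skewed each distribution is.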