Date: Fri, 18 Nov 2011 09:08:52 +0100
From: Lars Westerberg <lawes at ifm.liu.se>
To: r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] Bootstrapping with pseudo-replicates
I mainly use two ways to collect results in a for-loop. The first is to
start from an empty result variable:
out <- NA    # ugly, but makes things really easy:
out[1] <- 1  # even out[3] <- 1 works
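A small sketch of why the "ugly" version works: R silently extends a
vector when you assign past its end and fills the skipped positions
with NA. The variable name is the one used above; nothing else is assumed.

```r
out <- NA      # a length-1 vector holding a single NA
out[3] <- 1    # assigning past the end grows the vector
out            # now NA, NA, 1 (position 2 was filled with NA)
length(out)    # 3
```

The price is that the vector may be copied each time it grows, which is
what makes this approach slow in long loops.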
The second way is better, and faster: allocate memory for the result
variable before the loop using e.g. 'array' or 'matrix'. In the example
below, the number of columns has to match the length of the result vector:
out <- matrix(NA, nrow=R, ncol=4) #define out before for-loop
#for(i in 1:R){
#...
out[i,] <- c(coef(model),... #store results
#...
#}
apply(out,2,mean,na.rm=TRUE) #Calc mean of matrix columns/reg. param.
apply(out,2,sd,na.rm=TRUE) #Calc sd of matrix columns/reg. param.
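For completeness, here is a minimal runnable sketch of the preallocated
version applied to the toy data from the question quoted below. Some
assumptions on my part: the result vector in that loop has five elements
(intercept, slope, slope p-value, overall p-value, R-squared), so ncol is
5 here; the column names are made up for illustration; and the one-row-
per-group draw is done in base R with split() rather than with ddply(),
only so that the example has no package dependencies.

```r
# Toy data from the original post
y     <- c(1, 5, 6, 2, 5, 10)         # response
x     <- c(2, 12, 8, 1, 16, 17)       # predictor
group <- factor(c(1, 2, 2, 3, 4, 4))  # group
df    <- data.frame(y, x, group)

set.seed(1)  # only to make the sketch reproducible
R   <- 50    # number of replicates
out <- matrix(NA_real_, nrow = R, ncol = 5,
              dimnames = list(NULL, c("intercept", "slope", "p_slope",
                                      "p_model", "r2")))  # hypothetical names

for (i in seq_len(R)) {
  # pick one row index per group (base-R stand-in for the ddply() call)
  rows <- sapply(split(seq_len(nrow(df)), df$group),
                 function(idx) idx[sample.int(length(idx), 1)])
  s <- summary(lm(y ~ x, data = df[rows, ]))
  f <- s$fstatistic
  out[i, ] <- c(s$coefficients[, 1],                       # intercept, slope
                s$coefficients[-1, 4],                     # p-value of slope
                pf(f[1], f[2], f[3], lower.tail = FALSE),  # overall p-value
                s$r.squared)
}

colMeans(out, na.rm = TRUE)       # same as apply(out, 2, mean, na.rm = TRUE)
apply(out, 2, sd, na.rm = TRUE)
```

Each row of out is one replicate and each column one statistic, so the
column-wise summaries give the mean and sd of every regression parameter.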
HTH
/Lars
On 2011-11-17 16:02, Johannes Radinger wrote:
Hello Dixon,
As there is no predefined function that does the resampling and
regression I want, I tried to write my own code. So far I have code that
runs in a for-loop. There is still a problem, because I don't know how to
collect the results vector from each loop step into a data frame, list
etc.
Maybe someone can help me. So far I have the following:
library(plyr)
y <- c(1,5,6,2,5,10)              # response
x <- c(2,12,8,1,16,17)            # predictor
group <- factor(c(1,2,2,3,4,4))   # group
df <- data.frame(y,x,group)
R = 50                            # the number of replicates
out = numeric(R)                  # storage for the results
for (i in 1:R) {
  subsample <- ddply(df, .(group), function(x){
    x[sample(nrow(x), 1), ]})
  model <- lm(y~x,data=subsample)
  out[i] <- c(coef(model),                # vector of coefficients
    summary(model)$coefficients[-1,4],    # p-values for all except Intercept
    pf(summary(model)$fstatistic[1], summary(model)$fstatistic[2],
       summary(model)$fstatistic[3], lower.tail = FALSE), # overall p-value
    summary(model)$r.squared)
}
The problem is the object out. It needs to be a data frame or a list that
collects all the resulting out[i] vectors, in a form that lets me easily
calculate the mean/variance of the individual regression parameters etc.
Thank you very much!
Johannes
-------- Original Message --------
Date: Wed, 16 Nov 2011 08:25:11 -0600
From: "Dixon, Philip M [STAT]"<pdixon at iastate.edu>
To: "r-sig-ecology at r-project.org"<r-sig-ecology at r-project.org>
Subject: [R-sig-eco] Bootstrapping with pseudo-replicates
Johannes,
A very good question to ask, but you can't use a bootstrap, or boot(), to
investigate it.
You can define strata and then bootstrap observations within strata, but
all bootstrap data sets will have the same structure as the original
data. That's the point of the bootstrap. In your example, you have
observations from 4 sites: 1 obs from site 1, 2 from site 2, 1 from
site 3, and 2 from site 4. Every stratified bootstrap sample will have 1
from site 1, 2 from site 2, 1 from site 3 and 2 from site 4.
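The point about structure can be checked with a small base-R sketch. It
resamples with replacement within each site by hand rather than calling
boot(), and the names site and boot_idx are only illustrative:

```r
site <- factor(c(1, 2, 2, 3, 4, 4))   # site labels from the example

set.seed(7)  # only to make the sketch reproducible
# resample row indices with replacement, separately within each site
boot_idx <- unlist(lapply(split(seq_along(site), site),
                          function(i) i[sample.int(length(i), length(i),
                                                   replace = TRUE)]))

table(site)             # original structure: 1, 2, 1, 2 obs per site
table(site[boot_idx])   # the stratified resample has the same counts
```

Whatever the seed, the per-site counts never change, so such a resample
never reduces a site to a single observation.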
I believe you have to construct your own code, probably along the lines
of defining a vector for one obs per site, then for each site: extracting
the set of pseudoreplicates for that site, using sample() to grab one
value from that set, then storing it in the vector.
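A minimal sketch of those steps, using the response values and site
labels from the example. The names picked and candidates are only
illustrative. Sampling an index with sample.int() rather than calling
sample() on the values avoids the case where sample(x, 1) on a single
number x draws from 1:x instead of returning x.

```r
y    <- c(1, 5, 6, 2, 5, 10)           # response values
site <- factor(c(1, 2, 2, 3, 4, 4))    # site of each observation

set.seed(42)  # only to make the sketch reproducible
picked <- numeric(nlevels(site))       # vector for one obs per site
names(picked) <- levels(site)

for (s in levels(site)) {
  candidates <- y[site == s]           # pseudoreplicates for this site
  picked[s]  <- candidates[sample.int(length(candidates), 1)]  # grab one
}

picked   # one value per site
```

Repeating this inside a loop over replicates, and fitting the model to
each draw, gives the resampling scheme discussed in the thread.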
Best wishes,
Philip Dixon