Bootstrapping a repeated measures ANOVA
3 messages · Charles C. Berry, Fischer, Felix
On Fri, 16 Apr 2010, Fischer, Felix wrote:
Hello everyone, I have a question regarding the sampling process in boot().
"PLEASE ... provide commented, minimal, self-contained, reproducible
code." Which means something a correspondent could actually run.
But before that, a careful reading of
?boot
should get you started. Note these bits:
Arguments:
data: The data as a vector, ...
statistic: A function which when applied to data returns a vector
containing the statistic(s) of interest. When
sim="parametric", [snip]
In all other cases
statistic must take at least two arguments. The first
argument passed will always be the original data. The second
will be a vector of indices, frequencies or weights which
define the bootstrap sample. ...
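As a minimal sketch of the two-argument contract quoted above: the data `x` and the function name `mean_stat` are made up for illustration and do not come from this thread.

```r
library(boot)

set.seed(1)                          # arbitrary seed, only for reproducibility
x <- rnorm(50, mean = 10, sd = 2)    # made-up data, not from the thread

# boot() passes the original data first and a vector of indices second;
# the statistic must build the bootstrap sample from those indices itself.
mean_stat <- function(data, indices) {
  mean(data[indices])
}

res <- boot(data = x, statistic = mean_stat, R = 200)
res$t0        # statistic evaluated on the original data, i.e. mean(x)
dim(res$t)    # one row per bootstrap replicate
```

Whatever is passed as `data` is the unit that gets resampled, so passing a data frame with one row per observation resamples single rows.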
HTH,
Chuck
I am trying to bootstrap F-values for a repeated measures ANOVA to get a
confidence interval for them. Unfortunately, while aov() works fine on its
own, it fails inside boot(). I think the problem might be that the
resampling does not select both rows of data representing the two
measurement times for one subject, so I end up with missing cases.
The data is organised like this:
subject  ort  mz  PHQ
1        1    1   x
1        1    2   y
2        1    1   z
2        1    2   zz
...
Is there any way to specify that both rows need to be selected?
Thanks a lot!
Felix Fischer
P.S. In case you need to have a look at my code:
F_values <- function(formula, data, indices) {
  d <- data[indices, ]            # allows boot to select sample
  fit <- aov(formula, data = d)   # fit model
  # return F-values
  return(c(summary(fit)[1][[1]][[1]]$`F value`,
           summary(fit)[2][[1]][[1]]$`F value`))
}
results <- boot(data = anova.daten, statistic = F_values,
                R = 10, formula = PHQ_Sum_score ~ mz * ort + Error(subject/mz))
Dipl. Psych. Felix Fischer
Medizinische Klinik mit Schwerpunkt Psychosomatik
Charité -- Universitätsmedizin Berlin
Luisenstr. 13a
10117 Berlin
Tel.: 030 - 450 553575
Email: felix.fischer at charite.de
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
Thank you for your answer, and sorry for the missing example.
In fact, I think I solved the issue with some data manipulation inside the function: I split the data into one set per measurement time, selected cases at random, and then combined the two measurement times again. The results look promising to me, but if anyone is aware of problems, please let me know.
This code should run:
library(boot)
# generate data
anova.daten <- data.frame(subject = sort(rep(1:10, 2)),
                          mz = rep(1:2, 10),
                          ort = sort(rep(1:2, 10)),
                          PHQ_Sum_score = rnorm(20, 10, 2))
summary(aov(PHQ_Sum_score ~ mz * ort + Error(subject/mz), data = anova.daten))
F_values <- function(formula, data1, indices) {
  data2 <- subset(data1, data1$mz == 2)   # subsetting data for each measuring time
  data3 <- subset(data1, data1$mz == 1)
  data4 <- data3[indices, ]               # allows boot to select sample
  subjekte <- na.omit(data4$subject)
  data5 <- rbind(data3[subjekte, ], data2[subjekte, ])   # combine data
  # convert repeated subjects to unique subjects
  data5$subject <- factor(rep(1:length(subjekte), 2))
  fit <- aov(formula, data = data5)       # fit model
  # return F-values
  return(c(summary(fit)[1][[1]][[1]]$`F value`,
           summary(fit)[2][[1]][[1]]$`F value`))
}
results <- boot(data = anova.daten, statistic = F_values,
                R = 10, formula = PHQ_Sum_score ~ mz * ort + Error(subject/mz))   # bootstrap
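One thing to watch in the function above: boot() draws indices from 1:20 (the rows of anova.daten), but data3 has only 10 rows, so roughly half of the draws turn into NA rows and are dropped, and the number of subjects differs between replicates. A possible alternative, sketched here under assumptions, is to hand boot() the vector of subject IDs so that whole subjects are resampled. The names `F_by_subject`, `full_data`, `ids` and `subj_ort` are hypothetical, and resampling within each `ort` group via `strata` is a design choice of this sketch, not something taken from the thread.

```r
library(boot)

set.seed(42)  # arbitrary seed, only for reproducibility
anova.daten <- data.frame(subject = rep(1:10, each = 2),
                          mz = rep(1:2, times = 10),
                          ort = rep(1:2, each = 10),
                          PHQ_Sum_score = rnorm(20, 10, 2))

# Statistic: first argument is the vector of subject IDs, second the indices
# chosen by boot(); the full data frame and formula arrive via '...'.
F_by_subject <- function(ids, indices, full_data, formula) {
  picked <- ids[indices]
  # Take both rows of every drawn subject; relabel each draw so that a
  # subject drawn twice is treated as two distinct subjects.
  rows <- lapply(seq_along(picked), function(k) {
    d <- full_data[full_data$subject == picked[k], ]
    d$subject <- k
    d
  })
  d <- do.call(rbind, rows)
  d$subject <- factor(d$subject)
  fit <- aov(formula, data = d)
  s <- summary(fit)
  c(ort    = s[[1]][[1]][["F value"]][1],   # between-subject stratum
    mz     = s[[2]][[1]][["F value"]][1],   # within-subject stratum
    mz_ort = s[[2]][[1]][["F value"]][2])
}

ids <- unique(anova.daten$subject)
subj_ort <- anova.daten$ort[match(ids, anova.daten$subject)]  # one ort per subject

results <- boot(data = ids, statistic = F_by_subject, R = 20,
                strata = subj_ort,   # assumption: resample within each ort group
                full_data = anova.daten,
                formula = PHQ_Sum_score ~ mz * ort + Error(subject/mz))
results$t0   # F-values on the original data
```

Stratifying keeps both `ort` groups at their original size in every replicate, which avoids replicates in which the `ort` effect cannot be estimated. Whether that matches the study design is for the analyst to judge.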
Thanks a lot,
Felix Fischer