Thank you Rui,
This is great. How about the following?
SimilatedData<-boot.array(b, indices=T)
seems it is giving the rows ID which are used in the calculation, isn't it?
On Thu, Jan 28, 2021 at 12:21 PM Rui Barradas <ruipbarradas at sapo.pt
<mailto:ruipbarradas at sapo.pt>> wrote:
Hello,
I don't know why you would need to see the indices but rewrite the
function bootprop as
bootprop_ind <- function(data, index){
? ?d <- data[index, ]
? ?#sum(d[["BothTimes"]], na.rm = TRUE)/sum(d[["Time1"]], na.rm = TRUE)
? ?index
}
and call in the same way. It will now return a matrix of indices with R
= 1000 rows and 19 columns.
Hope this helps,
Rui Barradas
?s 19:29 de 28/01/21, Marna Wagley escreveu:
> Hi Rui,
> I am sorry for asking you several questions.
>
> In the given example, randomizations (reshuffle) were done 1000
> and its 1000 proportion values (results) are stored and it can be
> using b$t; but I was wondering how the table was randomized
> have been missed/or repeated in each randomizing procedure?).
>
> Is there any way we can see the randomized table and its associated
> results? Here in this example, I randomized (or bootstrapped) the
> into three times (R=3) so I would like to store these three
> look at them later to know which rows were repeated/missed. Is
> possibility?
> The example data and the code is given below.
>
> Thank you for your help.
>
> ####
> library(boot)
> dat<-structure(list(Sample = structure(c(1L, 12L, 13L, 14L, 15L, 16L,
> 17L, 18L, 19L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L), .Label
> "id10", "id11", "id12", "id13", "id14", "id15", "id16", "id17",
> "id18", "id19", "Id2", "id3", "id4", "id5", "id6", "id7", "id8",
> "id9"), class = "factor"), Time1 = c(0L, 1L, 1L, 1L, 0L, 0L,
> 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L), Time2 = c(1L,
> 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L,
> 1L, 1L)), .Names = c("Sample", "Time1", "Time2"), class =
> row.names = c(NA,
> -19L))
> daT<-data.frame(dat %>%
>? ? mutate(Time1.but.not.in.Time2 = case_when(
>? ? ? ? ? ? ? Time1 %in% "1" & Time2 %in% "0" ?~ "1"),
> Time2.but.not.in.Time1 = case_when(
>? ? ? ? ? ? ? Time1 %in% "0" & Time2 %in% "1" ?~ "1"),
>? ?BothTimes = case_when(
>? ? ? ? ? ? ? Time1 %in% "1" & Time2 %in% "1" ?~ "1")))
> cols.num <- c("Time1.but.not.in.Time2","Time2.but.not.in.Time1",
> "BothTimes")
> daT[cols.num] <- sapply(daT[cols.num],as.numeric)
> summary(daT)
>
> bootprop <- function(data, index){
>? ? ?d <- data[index, ]
>? ? ?sum(d[["BothTimes"]], na.rm = TRUE)/sum(d[["Time1"]], na.rm =
> }
>
> R <- 3
> set.seed(2020)
> b <- boot(daT, bootprop, R)
> b
> b$t0? ? ?# original
> b$t
> sd(b$t)? # bootstrapped estimate of the SE of the sample prop.
> hist(b$t, freq = FALSE)
>
> str(b)
> b$data
> b$seed
> b$sim
> b$strata
> ################
>
>
> On Sat, Jan 23, 2021 at 12:36 AM Marna Wagley
<marna.wagley at gmail.com <mailto:marna.wagley at gmail.com>
> <mailto:marna.wagley at gmail.com <mailto:marna.wagley at gmail.com>>>
>
>? ? ?Yes Rui, I can see we don't need to divide by square root of
>? ? ?size. The example is great to understand it.
>? ? ?Thank you.
>? ? ?Marna
>
>
>? ? ?On Sat, Jan 23, 2021 at 12:28 AM Rui Barradas
<ruipbarradas at sapo.pt <mailto:ruipbarradas at sapo.pt>
>? ? ?<mailto:ruipbarradas at sapo.pt <mailto:ruipbarradas at sapo.pt>>>
>
>? ? ? ? ?Hello,
>
>? ? ? ? ?Inline.
>
>? ? ? ? ??s 07:47 de 23/01/21, Marna Wagley escreveu:
>? ? ? ? ? > Dear Rui,
>? ? ? ? ? > I was wondering whether we have to square root of SD
>? ? ? ? ?SE, right?
>
>? ? ? ? ?No, we don't. var already divides by n, don't divide again.
>? ? ? ? ?This is the code, that can be seen by running the
>? ? ? ? ?at a
>? ? ? ? ?command line.
>
>
>? ? ? ? ?sd
>? ? ? ? ?#function (x, na.rm = FALSE)
>? ? ? ? ?#sqrt(var(if (is.vector(x) || is.factor(x)) x else
>? ? ? ? ?#? ? na.rm = na.rm))
>? ? ? ? ?#<bytecode: 0x55f3ce900848>
>? ? ? ? ?#<environment: namespace:stats>
>
>
>
>? ? ? ? ? >
>? ? ? ? ? > bootprop <- function(data, index){
>? ? ? ? ? >? ? ?d <- data[index, ]
>? ? ? ? ? >? ? ?sum(d[["BothTimes"]], na.rm = TRUE)/sum(d[["Time1"]],
>? ? ? ? ?na.rm = TRUE)
>? ? ? ? ? > }
>? ? ? ? ? >
>? ? ? ? ? > R <- 1e3
>? ? ? ? ? > set.seed(2020)
>? ? ? ? ? > b <- boot(daT, bootprop, R)
>? ? ? ? ? > b
>? ? ? ? ? > b$t0? ? ?# original
>? ? ? ? ? > sd(b$t)? # bootstrapped estimate of the SE of the
>? ? ? ? ? > sd(b$t)/sqrt(1000)
>? ? ? ? ? > pandit*(1-pandit)
>? ? ? ? ? >
>? ? ? ? ? > hist(b$t, freq = FALSE)
>
>
>? ? ? ? ?Try plotting the normal densities for both cases, the red
>? ? ? ? ?clearly wrong.
>
>
>? ? ? ? ?f <- function(x, xbar, s){
>? ? ? ? ? ? ?dnorm(x, mean = xbar, sd = s)
>? ? ? ? ?}
>
>? ? ? ? ?hist(b$t, freq = FALSE)
>? ? ? ? ?curve(f(x, xbar = b$t0, s = sd(b$t)), from = 0, to = 1, col =
>? ? ? ? ?"blue",
>? ? ? ? ?add = TRUE)
>? ? ? ? ?curve(f(x, xbar = b$t0, s = sd(b$t)/sqrt(R)), from = 0,
>? ? ? ? ?col =
>? ? ? ? ?"red", add = TRUE)
>
>
>? ? ? ? ?Hope this helps,
>
>? ? ? ? ?Rui Barradas
>
>? ? ? ? ? >
>? ? ? ? ? >
>? ? ? ? ? >
>? ? ? ? ? >
>? ? ? ? ? > On Fri, Jan 22, 2021 at 3:07 PM Rui Barradas
>? ? ? ? ?<ruipbarradas at sapo.pt <mailto:ruipbarradas at sapo.pt>
<mailto:ruipbarradas at sapo.pt <mailto:ruipbarradas at sapo.pt>>
>? ? ? ? ? > <mailto:ruipbarradas at sapo.pt
<mailto:ruipbarradas at sapo.pt> <mailto:ruipbarradas at sapo.pt
<mailto:ruipbarradas at sapo.pt>>>>
>? ? ? ? ?wrote:
>? ? ? ? ? >
>? ? ? ? ? >? ? ?Hello,
>? ? ? ? ? >
>? ? ? ? ? >? ? ?Something like this, using base package boot?
>? ? ? ? ? >
>? ? ? ? ? >
>? ? ? ? ? >? ? ?library(boot)
>? ? ? ? ? >
>? ? ? ? ? >? ? ?bootprop <- function(data, index){
>? ? ? ? ? >? ? ? ? ?d <- data[index, ]
>? ? ? ? ? >? ? ? ? ?sum(d[["BothTimes"]], na.rm =
>? ? ? ? ?na.rm = TRUE)
>? ? ? ? ? >? ? ?}
>? ? ? ? ? >
>? ? ? ? ? >? ? ?R <- 1e3
>? ? ? ? ? >? ? ?set.seed(2020)
>? ? ? ? ? >? ? ?b <- boot(daT, bootprop, R)
>? ? ? ? ? >? ? ?b
>? ? ? ? ? >? ? ?b$t0? ? ?# original
>? ? ? ? ? >? ? ?sd(b$t)? # bootstrapped estimate of the SE of the
>? ? ? ? ?prop.
>? ? ? ? ? >? ? ?hist(b$t, freq = FALSE)
>? ? ? ? ? >
>? ? ? ? ? >
>? ? ? ? ? >? ? ?Hope this helps,
>? ? ? ? ? >
>? ? ? ? ? >? ? ?Rui Barradas
>? ? ? ? ? >
>? ? ? ? ? >? ? ??s 21:57 de 22/01/21, Marna Wagley escreveu:
>? ? ? ? ? >? ? ? > Hi All,
>? ? ? ? ? >? ? ? > I was trying to estimate standard error (SE)
>? ? ? ? ?proportion
>? ? ? ? ? >? ? ?value using
>? ? ? ? ? >? ? ? > some kind of randomization process
>? ? ? ? ?jackknifing)
>? ? ? ? ? >? ? ?in R, but
>? ? ? ? ? >? ? ? > I could not figure it out.
>? ? ? ? ? >? ? ? >
>? ? ? ? ? >? ? ? > Is there any way to generate SE for the proportion?
>? ? ? ? ? >? ? ? >
>? ? ? ? ? >? ? ? > The example of the data and the code I am using is
>? ? ? ? ?attached for your
>? ? ? ? ? >? ? ? > reference. I would like to generate the value of
>? ? ? ? ?proportion with
>? ? ? ? ? >? ? ?a SE using
>? ? ? ? ? >? ? ? > a 1000 times randomization.
>? ? ? ? ? >? ? ? >
>? ? ? ? ? >? ? ? > dat<-structure(list(Sample = structure(c(1L,
>? ? ? ? ?14L, 15L, 16L,
>? ? ? ? ? >? ? ? > 17L, 18L, 19L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L,
>? ? ? ? ?11L), .Label
>? ? ? ? ? >? ? ?= c("id1",
>? ? ? ? ? >? ? ? > "id10", "id11", "id12", "id13", "id14", "id15",
>? ? ? ? ?"id16", "id17",
>? ? ? ? ? >? ? ? > "id18", "id19", "Id2", "id3", "id4", "id5", "id6",
>? ? ? ? ?"id7", "id8",
>? ? ? ? ? >? ? ? > "id9"), class = "factor"), Time1 = c(0L, 1L,
>? ? ? ? ?0L, 0L,
>? ? ? ? ? >? ? ? > 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 1L,
>? ? ? ? ?Time2 = c(1L,
>? ? ? ? ? >? ? ? > 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L,
>? ? ? ? ?0L, 1L, 0L,
>? ? ? ? ? >? ? ? > 1L, 1L)), .Names = c("Sample", "Time1",
>? ? ? ? ? >? ? ?"data.frame",
>? ? ? ? ? >? ? ? > row.names = c(NA,
>? ? ? ? ? >? ? ? > -19L))
>? ? ? ? ? >? ? ? > daT<-data.frame(dat %>%
>? ? ? ? ? >? ? ? >? ? mutate(Time1.but.not.in.Time2 = case_when(
>? ? ? ? ? >? ? ? >? ? ? ? ? ? ? Time1 %in% "1" & Time2 %in% "0"? ~
>? ? ? ? ? >? ? ? > Time2.but.not.in.Time1 = case_when(
>? ? ? ? ? >? ? ? >? ? ? ? ? ? ? Time1 %in% "0" & Time2 %in% "1"? ~
>? ? ? ? ? >? ? ? >? ?BothTimes = case_when(
>? ? ? ? ? >? ? ? >? ? ? ? ? ? ? Time1 %in% "1" & Time2 %in% "1"? ~
>? ? ? ? ? >? ? ? >? ?daT
>? ? ? ? ? >? ? ? >? ?summary(daT)
>? ? ? ? ? >? ? ? >
>? ? ? ? ? >? ? ? > cols.num <-
>? ? ? ? ?c("Time1.but.not.in.Time2","Time2.but.not.in.Time1",
>? ? ? ? ? >? ? ? > "BothTimes")
>? ? ? ? ? >? ? ? > daT[cols.num] <- sapply(daT[cols.num],as.numeric)
>? ? ? ? ? >? ? ? > summary(daT)
>? ? ? ? ? >? ? ? > ProportionValue<-sum(daT$BothTimes,
>? ? ? ? ?na.rm=T)/sum(daT$Time1, na.rm=T)
>? ? ? ? ? >? ? ? > ProportionValue
>? ? ? ? ? >? ? ? > standard error??
>? ? ? ? ? >? ? ? >
>? ? ? ? ? >? ? ? >? ? ? ?[[alternative HTML version deleted]]
>? ? ? ? ? >? ? ? >
>? ? ? ? ? >? ? ? > ______________________________________________
>? ? ? ? ? >? ? ? > R-help at r-project.org
<mailto:R-help at r-project.org> <mailto:R-help at r-project.org
<mailto:R-help at r-project.org>>
>? ? ? ? ?<mailto:R-help at r-project.org
<mailto:R-help at r-project.org> <mailto:R-help at r-project.org
<mailto:R-help at r-project.org>>>