Bootstrap encounter histories data

Hello,

It doesn't seem very complicated.
First of all, for the function fun below to work, you need the data not
as strings of states followed by a space followed by a group number, but
as two column vectors: a character vector of states and a vector of
groups. The vector of groups can be of class character, factor or
numeric. I've written a function to simulate such data.

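# n: number of individuals, g: number of groups, size: number of occasions
makeData <- function(n, g = 3, size = 10){
	res <- matrix(0, nrow = n, ncol = size)
	# assign each individual to one of the g groups at random
	gr <- sample(g, n, replace = TRUE)
	# draw one state (0, 1 or 2) per occasion; use 'size', not a hard-coded 10
	for(i in seq_len(n))
		res[i, ] <- sample(0:2, size, replace = TRUE, prob = c(0.5, 0.25, 0.25))
	# collapse each row into a single encounter-history string
	res <- apply(res, 1, paste0, collapse = "")
	data.frame(states = res, group = gr, stringsAsFactors = FALSE)
}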

dat <- makeData(10)

# Now to sample from 'dat', by group.
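fun <- function(x){
	# resample the rows of one group, with replacement, keeping its size
	f <- function(y){
		idx <- sample(nrow(y), nrow(y), replace = TRUE)
		y[idx, ]
	}
	# split by the group column (column 2), resample each piece, recombine
	res <- do.call(rbind, lapply(split(x, x[, 2]), f))
	rownames(res) <- seq_len(nrow(res))
	res
}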

fun(dat)
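
A quick sanity check may be useful here. The sketch below repeats the two functions above so that it runs on its own, then verifies that the group sizes are unchanged after resampling. The seed, the sample size of 50 and the names boot, B and boots are arbitrary choices for illustration, not part of the original question.

```r
# Copies of makeData() and fun() from above, so this block is self-contained.
makeData <- function(n, g = 3, size = 10){
	res <- matrix(0, nrow = n, ncol = size)
	gr <- sample(g, n, replace = TRUE)
	for(i in seq_len(n))
		res[i, ] <- sample(0:2, size, replace = TRUE, prob = c(0.5, 0.25, 0.25))
	res <- apply(res, 1, paste0, collapse = "")
	data.frame(states = res, group = gr, stringsAsFactors = FALSE)
}

fun <- function(x){
	f <- function(y){
		idx <- sample(nrow(y), nrow(y), replace = TRUE)
		y[idx, ]
	}
	res <- do.call(rbind, lapply(split(x, x[, 2]), f))
	rownames(res) <- seq_len(nrow(res))
	res
}

set.seed(123)          # arbitrary seed, only for reproducibility
dat  <- makeData(50)
boot <- fun(dat)

# Same total number of rows, and the same number of rows in every group.
stopifnot(nrow(boot) == nrow(dat))
stopifnot(identical(as.vector(table(boot$group)),
                    as.vector(table(dat$group))))

# Several bootstrap replicates, kept as a list of data frames.
B <- 5                 # hypothetical number of replicates
boots <- replicate(B, fun(dat), simplify = FALSE)
```

Each element of boots is then one resampled data set with the original group sizes.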

Hope this helps,

Rui Barradas

On 14-03-2013 10:25, Simone Santoro wrote:
Hi all,

I am working on a capture-recapture analysis and my data set consists of a
typical set of encounter histories.
Thus, for each individual I have a string (same length for all the
individuals) consisting of 0 (not seen) and other numbers (seen in state
"1", seen in state "2", etc., where state may refer to breeding, nesting,
feeding, etc.).
At the end of each string I have a last value that refers to the group
(according to sex, age, sex*age, whatever). State and group refer to
different classifications.
Hence my original data set would be (by the way, I can modify it to make
things easier):
0001002002 1; (individual of group 1, first captured in state "1" at the
4th occasion, not captured at the 5th and 6th occasions, captured at the
7th in state 2, etc.)
1100222101 1;
0000020010 3;
0010101022 2;
...

Suppose I have 5000 strings divided into x individuals of group 1, y
individuals of group 2, ... z individuals of group "n".
I need to bootstrap this data set to get a new data set of the same length
(resampling with replacement) in which the number of individuals of each
group stays the same.

Does anyone have an idea on how to do it?
Thanks in advance for any help

Simone

--
View this message in context: http://r.789695.n4.nabble.com/Bootstrap-encounter-histories-data-tp4661300.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Hello,

One more thing: if you have the data as strings of states/space/group,
you can split them into state and group columns with

# This is your data example
x <- c(
"1100222101 1",
"0000020010 3",
"0010101022 2"
)

mat <- do.call(rbind, strsplit(x, " "))

But this creates a character matrix, so you'll need to revise the function as follows:

fun <- function(x){
	f <- function(y){
		idx <- sample(NROW(y), NROW(y), replace = TRUE)
		y[idx, ]
	}
	sp <- split(as.data.frame(x), x[, 2])
	res <- do.call(rbind, lapply(sp, f))
	rownames(res) <- seq_len(nrow(res))
	res
}

fun(mat)
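
If the raw lines end with a semicolon, as in the original post, something like the following might serve as a starting point. It is only a sketch of an alternative route, not the function above: it resamples row indices within each group and then writes the result back in the original "states group;" layout. The names clean, parts, rows_by_group, idx, boot and out are made up for this example, and the seed is arbitrary.

```r
# Raw lines in the format shown in the original post (trailing semicolon).
x <- c(
"0001002002 1;",
"1100222101 1;",
"0000020010 3;",
"0010101022 2;"
)

# Strip the trailing semicolon, then split into states and group.
clean <- sub(";\\s*$", "", x)
parts <- do.call(rbind, strsplit(clean, " ", fixed = TRUE))
dat <- data.frame(states = parts[, 1],
                  group  = as.integer(parts[, 2]),
                  stringsAsFactors = FALSE)

# Row indices of each group.
rows_by_group <- split(seq_len(nrow(dat)), dat$group)

set.seed(42)   # arbitrary seed, only for reproducibility

# Resample indices within each group. Indexing with sample.int() avoids
# the sample() pitfall when a group has a single row: sample(5, 1) draws
# from 1:5 rather than returning 5.
idx <- unlist(lapply(rows_by_group, function(i)
	i[sample.int(length(i), replace = TRUE)]), use.names = FALSE)

boot <- dat[idx, ]
rownames(boot) <- NULL

# Back to the original "states group;" layout.
out <- paste0(boot$states, " ", boot$group, ";")
```

The result comes out ordered by group, just as with fun; shuffle the rows afterwards if the order matters for your analysis.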

Hope this helps,

Rui Barradas
