Probably a good use for apply
Yes, you are correct. I need to change my sample size specification to the number of elements in the vector, so the sampleWorker function should be:

sampleWorker <- function(x) return(sample(c(TRUE,FALSE), length(x), replace = TRUE, prob = c(x, 1-x)))

This is where I get a little confused with using the apply functions. Isn't x each element of each vector? In the sample data I provide there are 4 x's, and each would be put into the sampleWorker function using lapply.

#sample data
test_ <- list(a=c(.85,.10), b=c(.99,.05))

To show what I want using just a single vector instead of a list of vectors, see below:

IsWorker.Hh_ <- lapply(c(.9,.1), sampleWorker)
#Returns:
[[1]]
[1] TRUE

[[2]]
[1] FALSE

Now I just need to run through each vector of the list I specify, in this case test_. Then I need to sum the TRUEs for each vector. So again, if we assume the test_ data would result in a single TRUE for each vector (because of the .85 and .99 probabilities), the result would be
IsWorker_
$a
[1] 1

$b
[1] 1

Perhaps lapply isn't the right tool? I have seen a couple of comments on the list saying the plyr package is easy to figure out but that you lose out on speed, and speed is my issue right now. I can do what I need to do using some for loops, but it's way too slow. Any guidance is appreciated.

Thanks guys
Josh

-----Original Message-----
From: Sarah Goslee [mailto:sarah.goslee at gmail.com]
Sent: Thursday, May 31, 2012 1:35 PM
To: ROLL Josh F
Cc: r-help at r-project.org
Subject: Re: [R] Probably a good use for apply

Hi,
On Thu, May 31, 2012 at 1:08 PM, LCOG1 <jroll at lcog.org> wrote:
This is great, thank you. I think I am getting the hang of some of the apply functions. I am stuck again, however. I have the list test_ below and would like to apply the sample function using each element of each vector as the probability, returning a TRUE or FALSE; ultimately I will sum the TRUEs by vector.

test_ <- list(a=c(.85,.10), b=c(.99,.05))

#Write a function to sample based on labor force participation rates to determine presence of workers in household
sampleWorker <- function(x) return(sample(c(TRUE,FALSE), x, replace = TRUE, prob = c(x, 1-x)))
Your first problem is that sampleWorker() doesn't run with a single component of test_, so it can't possibly run in an apply statement.

Please reread ?sample: the second argument is the size of the desired sample, but what you are passing is a non-integer vector of length 2. What do you actually want this to be?

Then for prob, you're passing c(x, 1-x), but x is again a non-integer vector of length 2, so that results in a vector of length 4, which is longer than the number of options sample() is choosing from.

Do you perhaps want to pass only a single probability at a time? But even then you need to resolve the size problem.

Sarah
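To make the point about the prob argument concrete, here is a minimal sketch. The names x, res, p and draw are only for illustration, and the tryCatch wrapper is there just so the script keeps running after the failing call.

```r
x <- c(0.85, 0.10)            # one component of test_

# Two outcomes (TRUE, FALSE) but four probabilities
n_probs <- length(c(x, 1 - x))

# sample() rejects a prob vector whose length differs from the number of
# outcomes; tryCatch turns the error into a marker value
res <- tryCatch(
  sample(c(TRUE, FALSE), size = 1, replace = TRUE, prob = c(x, 1 - x)),
  error = function(e) "error"
)

# Passing a single probability at a time, with an explicit size of 1, runs
p <- 0.85
draw <- sample(c(TRUE, FALSE), size = 1, prob = c(p, 1 - p))
```

With a single probability, c(p, 1 - p) has exactly one weight per outcome, which is what sample() expects.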
IsWorker.Hh_ <- lapply(test_, sampleWorker)

I am doing something wrong with the setup, because I am getting an error about specifying probabilities incorrectly. The result I am looking for is for IsWorker_ to be the following (assuming the .85 and .99 probabilities 'win' from each vector and the lower values do not):
IsWorker_
$a
[1] TRUE

$b
[1] TRUE

but ultimately I will need to sum the TRUEs for each vector

IsWorker_
$a
[1] 1

$b
[1] 1

Thanks
Josh
--
Sarah Goslee
http://www.functionaldiversity.org
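For the per-vector count Josh describes, one possible sketch follows. It is not taken from the thread: it swaps sample() for rbinom(), which accepts a vector of probabilities, and countWorkers is a hypothetical helper name. Note that lapply() over a list hands each whole vector to the function, not each individual element, so the function has to cope with a vector of probabilities.

```r
test_ <- list(a = c(0.85, 0.10), b = c(0.99, 0.05))

# Hypothetical helper: one Bernoulli draw per probability in p,
# then count the successes (the "TRUEs")
countWorkers <- function(p) sum(rbinom(length(p), size = 1, prob = p))

set.seed(1)  # only so that repeated runs give the same draws
IsWorker_ <- lapply(test_, countWorkers)
```

The result is a list with one count per vector, in the same $a / $b shape shown above. Because the draws are random, the counts will vary from run to run unless a seed is set. If a plain named vector is preferred over a list, sapply(test_, countWorkers) should give that instead.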