Power sampling

3 messages · Uwe Ligges, Thomas Friedrichsmeier

#
Hello,

after an unsuccessful search of the list's mail archive I wonder how I could 
estimate the power for a given sample size drawn from an unknown population.

Given the following (fake) situation:

I have a database containing about 5 million observations on 70 
variables.
I would like to compute (as epidemiologists are used to) the required 
size of a sample for running some tests on a test sample (test data), and 
later analyse a new sample to build a prediction model.

The help facilities of R show some entries regarding power, but none of 
them seemed appropriate for my purpose (maybe I am wrong, but I 
sometimes have difficulty deciphering those terse hints about 
available packages).

I would appreciate somebody's effort to point me to the right 
package/function to achieve this task!


regards

Thomas
#
Thomas Schönhoff wrote:

            
You have to tell us for which test you are going to calculate the power 
... (and there might be nothing, since calculating the power precisely 
is not always that easy).

Uwe Ligges
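
When no closed-form power calculation exists for the chosen test, one common workaround is to estimate power by simulation. The following is only a minimal sketch: the helper name `sim_power`, the exponential data model, the shift of 0.5, the group size of 50 and the use of `wilcox.test()` are all illustrative assumptions, not anything specified in this thread. Only the alpha of 0.01 is the poster's own.

```r
# Simulation-based power estimate for a two-sided Wilcoxon rank-sum test.
# sim_power() is a hypothetical helper; the data model is an assumption.
set.seed(1)

sim_power <- function(n, shift, alpha = 0.01, n_sim = 1000) {
  rejections <- replicate(n_sim, {
    x <- rexp(n)            # ASSUMPTION: skewed data in group 1
    y <- rexp(n) + shift    # ASSUMPTION: group 2 is group 1 shifted by `shift`
    wilcox.test(x, y)$p.value < alpha
  })
  mean(rejections)          # share of simulated data sets where H0 is rejected
}

# Estimated power with 50 observations per group and a shift of 0.5
power_est <- sim_power(n = 50, shift = 0.5)
print(power_est)
```

Repeating the call over a grid of `n` values gives a rough power curve from which a sample size can be read off.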
#
Hello Uwe,

Uwe Ligges wrote:
Given my example from the first message, I asked for a function that 
lets me calculate a reasonable sample size.

I don't know the true mean or standard deviation of the population, I 
only know:

n = 5,000,000 observations over 70 variables

determined alpha = 0.01

I want to know how to draw a sufficiently large sample, based on the 
facts mentioned above, to make some valid predictions regarding my 
population.

All I can think of for now is that a two-tailed test is required 
to find out whether the H0 = random effect or the H1 = no random effect 
hypothesis is accepted/rejected.
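
For the special case of a two-sided t-test, `power.t.test()` from the standard stats package can solve for the sample size once an effect size and a standard deviation are supplied. This is a sketch under stated assumptions: the difference of 0.5, the standard deviation of 1, the target power of 0.90 and the two-sample design are placeholders, since the thread gives only alpha = 0.01.

```r
# Sample size for a two-sided, two-sample t-test via power.t.test() (stats).
delta <- 0.5   # ASSUMPTION: smallest difference in means worth detecting
sigma <- 1     # ASSUMPTION: standard deviation, e.g. guessed from a pilot sample

# n is left unspecified, so power.t.test() solves for it.
res <- power.t.test(delta = delta, sd = sigma,
                    sig.level = 0.01,          # alpha as stated in the thread
                    power = 0.90,              # ASSUMPTION: target power
                    type = "two.sample",       # ASSUMPTION: two-group comparison
                    alternative = "two.sided")

n_per_group <- ceiling(res$n)   # round up to whole observations
print(res)
print(n_per_group)
```

Without some guess at `delta` and `sigma` the calculation cannot be carried out, which is why the test and the effect of interest have to be fixed first.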

In epidemiological studies this situation is described like this:

How many cases do I have to include in my sample(s) to obtain 
representative results for an unknown population (= unknown true mean, 
standard deviation etc.)?
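
If the aim is to estimate a mean to within a given margin of error rather than to test a hypothesis, the textbook formula n0 = (z * sigma / E)^2 with a finite population correction is one possible starting point. In this sketch the population size and alpha come from the thread, while the standard deviation `sigma` and the margin of error `E` are assumed values that would have to come from a pilot sample or subject knowledge.

```r
# Sample size needed to estimate a mean to within +/- E.
N     <- 5e6    # population size stated in the thread
alpha <- 0.01   # level stated in the thread (99% confidence)
sigma <- 10     # ASSUMPTION: guessed standard deviation of the variable
E     <- 1      # ASSUMPTION: acceptable margin of error, same units as sigma

z  <- qnorm(1 - alpha / 2)         # two-sided normal quantile
n0 <- (z * sigma / E)^2            # sample size for an infinite population
n  <- n0 / (1 + (n0 - 1) / N)      # finite population correction
n_required <- ceiling(n)
print(n_required)
```

With a population of 5 million the correction barely changes the result, so the required size is driven almost entirely by `sigma` and `E`, not by N.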

How can I approach a situation like this in R?



Regards

Thomas



platform i386-pc-linux-gnu
arch     i386
os       linux-gnu
system   i386, linux-gnu
status
major    2
minor    0.1
year     2004
month    11
day      15
language R