
error propagation - hope it is correct subject

4 messages · Thomas W Blackwell, PIKAL Petr

#
Dear all

Please, can you advise me how to compute an error, standard deviation, or 
another measure of variability of a computed value?

I would like to do something like:

var(y) = some.function(var(x1),var(x2),var(x3))

for level F1 (2,3,...)

Let's say I have some variables x1, x2, x3 (two computed for levels of factor F 
and one which is the same for all levels) and I want to compute

y = f(x1,x2,x3)

for some levels of factor F

I can compute the variation of the variables for levels of F, and I know the 
variation of one variable, but I am not sure how to transfer it to the variation 
of y within the respective levels.

I found some methods which I can use (Manly B.F., Biom. J. 28, 949 (1986), and 
some local statistical books available to me), but I wonder if there is some 
method implemented in R.

I have a feeling I could use a bootstrap method for this, but I am not sure how.

Thank you and merry Christmas to all

Petr Pikal
petr.pikal at precheza.cz
#
Petr  -

Very briefly, I think of three ways to approximate the standard
deviation of  y = f(x1,x2,x3).

  (1) linearise f() and use the covariance matrix of [x1,x2,x3].
  (2) simulate draws from the joint distribution of [x1,x2,x3],
	then compute the sample std dev of resulting f()s.
  (3) go back to the original data set from which [x1,x2,x3] were
	estimated as parameters, re-sample rows with replacement,
	estimate [x1,x2,x3] and compute f, then take sample std dev.

Other names for these three would be (1) the "delta method" or
Taylor series expansion, (2) parametric bootstrap, (3) bootstrap.

Different choices are appropriate in different situations.

If the std devs of x1,x2,x3 are small relative to the curvature
(2nd derivative) in f(), then use (1) and compute by matrix algebra:

Var(f(x1,x2,x3))  approx  t(grad f) %*% Cov(x1,x2,x3) %*% grad f.
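
A minimal sketch of the matrix formula above, for readers who find it cryptic. The function f, the point estimates x_hat, the covariance matrix V and the helper num_grad() are all made up for illustration; none of them come from the original question. The gradient is approximated numerically by central differences, so no calculus by hand is needed.

```r
# Hypothetical f and inputs, chosen only for illustration
f <- function(x) x[[1]] * x[[2]] / x[[3]]

x_hat <- c(10, 2, 4)              # assumed point estimates of x1, x2, x3
V <- diag(c(0.5, 0.1, 0.2)^2)     # assumed covariance matrix (independent case)

# Central-difference approximation to the gradient of f at x
num_grad <- function(f, x, h = 1e-6) {
  sapply(seq_along(x), function(i) {
    step <- replace(numeric(length(x)), i, h * max(1, abs(x[i])))
    (f(x + step) - f(x - step)) / (2 * step[i])
  })
}

g <- num_grad(f, x_hat)
var_f <- drop(t(g) %*% V %*% g)   # t(grad f) %*% Cov %*% grad f
sd_f <- sqrt(var_f)
sd_f
```

If x1,x2,x3 are correlated, the off-diagonal entries of V carry the covariances; the rest of the calculation is unchanged.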

If the curvature in f() is an issue, use (2) with draws of x1,x2,x3
from some parametric distribution (eg, rnorm()) with each component
properly conditioned on the ones already drawn.
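
One possible way to make correlated draws, sketched under the assumption that [x1,x2,x3] is jointly normal. The means, std devs and the 0.6 correlation between x1 and x2 are invented for illustration, as is f. Multiplying independent normal draws by the Cholesky factor of the covariance matrix gives the same result as drawing each component conditionally on the ones already drawn.

```r
set.seed(1)
n <- 10000

mu <- c(10, 2, 4)                       # assumed means of x1, x2, x3
s  <- c(0.5, 0.1, 0.2)                  # assumed std devs
R  <- diag(3)
R[1, 2] <- R[2, 1] <- 0.6               # assumed correlation between x1 and x2
V  <- diag(s) %*% R %*% diag(s)         # covariance matrix

# chol(V) is upper triangular U with t(U) %*% U == V,
# so rows of Z %*% U have covariance V
Z <- matrix(rnorm(n * 3), nrow = n, ncol = 3)
X <- sweep(Z %*% chol(V), 2, mu, "+")   # add the means column by column

f <- function(x1, x2, x3) x1 * x2 / x3  # hypothetical f
y <- f(X[, 1], X[, 2], X[, 3])
sd(y)
```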

Only if there were no set of intermediate parameters [x1,x2,x3]
would I use (3) to get the precision of f directly.  I'm sure
Brad Efron would say something different.  (3) is the only one
that is canned in R, simply because the other two are practically
one-liners.
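
A rough sketch of (3) in base R, assuming the raw data sit in a data frame with one row per observation. The data frame dat, its columns a, b, c, the use of column means as the estimates of x1,x2,x3, and f itself are all hypothetical. The recommended boot package offers a packaged version of the same idea.

```r
set.seed(1)

# Made-up raw data, for illustration only
dat <- data.frame(a = rnorm(50, 10, 2),
                  b = rnorm(50, 2, 0.5),
                  c = rnorm(50, 4, 1))

f <- function(x1, x2, x3) x1 * x2 / x3   # hypothetical f

# Estimate x1, x2, x3 from a data set, then compute f
estimate_f <- function(d) f(mean(d$a), mean(d$b), mean(d$c))

B <- 2000                                # number of bootstrap resamples
boot_f <- replicate(B, {
  idx <- sample(nrow(dat), replace = TRUE)   # re-sample rows with replacement
  estimate_f(dat[idx, ])
})

sd(boot_f)                               # bootstrap std dev of f
```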

-  tom blackwell  -  u michigan medical school  -  ann arbor  -
On Mon, 22 Dec 2003, Petr Pikal wrote:

            
#
Hallo Thomas

Thank you for your answer, although I am not sure how to do it in R (or maybe at 
all). My mathematics background is only faint, so I will drop the first 
possibility, which is rather cryptic to me. 

Does your second suggestion mean:

1:	compute random variable  y <- f(rnorm(n,mymeanx1,mysdx1), 
rnorm(n,mymeanx2, ...), ...)

according to my function f (based on the assumption that the values of the x 
variables can be considered normally distributed and independent)

2:	sd(y)

can be considered as variation of y?
Or is it necessary to do something like

vysled <- numeric(300)   # "vysled" = result; preallocated
for (i in 1:300) vysled[i] <- sd(sample(y, 100))
mean(vysled)

to get a bootstrapped estimate of sd(y)?

My actual data have some missing values and some outliers, which I can either 
remove or handle with robust statistics for the mean and variation estimates.

Thank you and have a nice Christmas

Petr
On 22 Dec 2003 at 8:56, Thomas W Blackwell wrote:

            
Petr Pikal
petr.pikal at precheza.cz
#
Petr  -

Yes, you are interpreting the second suggestion exactly correctly,
apart from concern for possible correlations among x1,x2,x3.
If one can treat them as independent, I would do exactly as you
show:  generate a vector of, say, n = 10000 simulated draws from
x1, another vector of the same length for x2, and another for x3,
then calculate  sd(f(x1,x2,x3))  as an approximation to the std
dev of f().
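
Spelled out in R, the recipe might look like the sketch below. The function f and the means and std devs passed to rnorm() are placeholders for illustration, not values from this thread, and the draws are treated as independent.

```r
set.seed(42)
n <- 10000

f <- function(x1, x2, x3) x1 * x2 / x3   # hypothetical f

# Independent simulated draws; means and sds are assumed values
x1 <- rnorm(n, mean = 10, sd = 0.5)
x2 <- rnorm(n, mean = 2,  sd = 0.1)
x3 <- rnorm(n, mean = 4,  sd = 0.2)

y <- f(x1, x2, x3)
sd_y <- sd(y)                            # approximate std dev of f()
sd_y
```

With these made-up inputs the linearised formula from (1) gives a std dev of sqrt(0.1875), about 0.433, so the two approaches can be checked against each other.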

-  tom blackwell  -  u michigan medical school  -  ann arbor  -
On Tue, 23 Dec 2003, Petr Pikal wrote: