Skip to content

Normal Distribution Quantiles

9 messages · Rainer Schuermann, David Winsemius, Charles C. Berry +3 more

#
This is probably embarrassingly basic, but I have spent quite a few hours in Google and RSeek without getting a clue - probably I'm asking the wrong questions...

There is this guy who has decided to walk through Australia, a total distance of 4000 km. His daily portion (mean) is 40km with an sd of 10 km. I want to calculate the number of days it takes to arrive with 80, 90, 95, 99% probability.
I know how to do this manually, eg. for 95%
$\Phi \left( \frac{4000-40n}{10 \sqrt{n}}  \right) \leq 0.05$
find the z score...

but how would I do this in R? Not qnorm(), but what is it?

Thanks in advance,
and apologies for the level of question...
Rainer
#
On Jan 8, 2011, at 6:56 AM, Rainer Schuermann wrote:

            
Sounds like homework, which is not an encouraged use of the Rhelp  
list. You can either do it in theory or you can simulate it. Here's a  
small step toward a simulation approach.

 > cumsum(rnorm(100, mean=40, sd=10))
   [1]   41.90617   71.09148  120.55569  159.56063  229.73167   
255.35290  300.74655
snipped
  [92] 3627.25753 3683.24696 3714.11421 3729.41203 3764.54192  
3809.15159 3881.71016
  [99] 3917.16512 3932.00861
 > cumsum(rnorm(100, mean=40, sd=10))
   [1]   38.59288   53.82815  111.30052  156.58190  188.15454   
207.90584  240.64078
snipped
  [92] 3776.25476 3821.90626 3876.64512 3921.16797 3958.83472  
3992.33155 4045.96649
  [99] 4091.66277 4134.45867

The first realization did not make it in the expected 100 days so  
further efforts should extend the simulation runs to maybe 120 days.  
The second realization had him making it on the 98th day. There is an  
R replicate() function available once you get a function running that  
will return a specific value for an instance. This one might work:
 > min(which(cumsum(rnorm(120, mean=40, sd=10)) >= 4000) )
[1] 97

If you wanted a forum that does not explicitly discourage homework and  
would be a better place to ask theory and probability questions, there  
is CrossValidated:
http://stats.stackexchange.com/faq
#
It is _from_ a homework but I have the solution already (explicitly got that done first!) - this was the pasted Latex code (apologies for that, but in plain text it looks unreadable[1], and I thought everybody here has his / her favorite Latrex editor open all the time anyway...). I'm just looking, for my own advancement and programming training, for a way of doing that in R - which, from your and Dennis' reply, doesn't seem to exist.

I would _not_ misuse the list for getting homework done easily, I will not ask "learning statistics" questions here, and I will always try to find the solution myself before posting something here, I promise!

Thanks anyway for the simulation advice,
Rainer


    (4000 - (40*n))   -329                                                                                                                                                                                                                          
[1] --------------- = ----                                                                                                                                                                                                                          
              1        200                                                                                                                                                                                                                           
       (10*(n^-))                                                                                                                                                                                                                                    
              2
On Saturday 08 January 2011 14:56:20 you wrote:
#
On Jan 8, 2011, at 9:25 AM, Rainer Schuermann wrote:

            
It would seem to be a straightforward application of the central limit  
theorem. Sum of normally distributed IID variables with common  
mean ... that sort of thing.  When I simulated it (n=10,000), I got:

50% 80% 90% 95% 99%
100 103 104 105 106
I'm not sure how that expression relates to the problem at hand. But  
even at my advanced age, I'm open to education.
#
On Sat, 8 Jan 2011, Rainer Schuermann wrote:

            
Your question was 'how do I find the smallest integer $n$ such that...', 
right?

Using uniroot and pnorm, you could solve for real $n$ and then round up.

Doing this, I find that in greater than 95% of trials, your bushwalker 
would be done in 105 days or less.

Or you could use findInterval, sapply, and pnorm to get all of the $n$s in 
one expression.

HTH,

Chuck
Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at tajo.ucsd.edu			    UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
#
On Sat, Jan 8, 2011 at 6:25 AM, Rainer Schuermann
<Rainer.Schuermann at gmx.net> wrote:
open all the time anyway...). I'm just looking, for my own advancement
and programming training, for a way of doing that in R - which, from
your and Dennis' reply, doesn't seem to exist.

We do ;-)  Is this what you are after?  It uses uniroot to find a
value of 'x' (days) such that the difference between the given and
obtained ps is 0 (over a sensible interval).

foo <- function(x, p, mean, sd) {
  mu <- mean * x
  sigma <- sqrt((sd^2) * x)
  p - pnorm(4000, mu, sigma, lower.tail = FALSE)
}

uniroot(foo, interval = c(90, 110), p = .8, mean = 40, sd = 10)

Cheers,

Josh

  
    
#
If I understand what you have said below, it looks like you do NOT
have the problem solved manually. You CAN use qnorm , and when you do
so, your equation yields a simple quadratic which, of course, has an
exact solution that you can calculate in R.

Of course, one can use uniroot or whatever to solve the quadratic; or
simulation or interpolation using pnorm. But other than the R
practice, these are unnecessary and, in this case, a bit silly.

Cheers,
Bert

On Sat, Jan 8, 2011 at 6:25 AM, Rainer Schuermann
<Rainer.Schuermann at gmx.net> wrote:

  
    
1 day later
#
Just to add to the silly solutions, here's how I would have done it...

mu <- 40
sdev <- 10
days <- 100:120 # range to explore
p <- 0.8
days[ match(TRUE, qnorm(0.2, mu*days, sqrt(sdev * sdev * days)) >= 4000) ]

Michael
On 9 January 2011 08:48, Bert Gunter <gunter.berton at gene.com> wrote:
#
Altogether I got five more or less silly solutions (not my judgment!), some of them further discussed in private mail, for a problem where my expectation was to get a simple one-liner back: "Check ?clt" or so...

Fortunately, with all of them I seem to arrive at a result that is consistent with what my expressions evaluates to (104.25) which gives me a great opportunity to play around with the various approaches - "brain fodder" for quite a few days.

Great experience,
Thanks to all,
Rainer