Using R to illustrate the Central Limit Theorem

8 messages · Paul Smith, Francisco J. Zagmutt, Vincent Goulet +5 more

#
Dear All

I am totally new to R and I would like to know whether R is able and
appropriate to illustrate to my students the Central Limit Theorem,
using for instance 100 independent variables with uniform distribution
and showing that their sum is a variable with an approximated normal
distribution.

Thanks in advance,

Paul
#
Hi Paul

This is one of many ways to do it:

hist(runif(30))  # histogram of 30 random draws from a uniform(0,1)
mm <- numeric(100)  # vector to hold the 100 sample means
for (i in 1:100) mm[i] <- mean(runif(30))  # 100 sample means, each from 30 uniform(0,1) draws
hist(mm)  # the distribution of the sample mean looks normal!

You can even nest this loop within another loop so your students can see
several histograms showing a "normal" behaviour.
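
A minimal sketch of that nested loop, assuming sample sizes of 1, 2, 5 and 30 and 1000 sample means per size. These values, and the names `sizes` and `sample_means`, are illustrative choices and not part of the original suggestion.

```r
set.seed(1)                      # fixed seed so the demo is reproducible
sizes <- c(1, 2, 5, 30)          # assumed sample sizes, chosen for illustration
sample_means <- list()

op <- par(mfrow = c(2, 2))       # one panel per sample size
for (n in sizes) {
    # 1000 sample means, each from n draws of a uniform(0,1)
    mm <- numeric(1000)
    for (i in seq_along(mm)) mm[i] <- mean(runif(n))
    sample_means[[as.character(n)]] <- mm
    hist(mm, xlim = c(0, 1), main = sprintf("n = %d", n), xlab = "sample mean")
}
par(op)                          # restore the previous plot layout
```

With a common x-axis, the panels should show the histogram going from flat (n = 1) to bell-shaped and increasingly narrow as n grows.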

I hope this helps

Francisco
#
Hi,

Not exactly what you asked for, but related.

I wrote the following little function to emulate a quincunx (a good 
illustration of the CLT, in my opinion):

quincunx <- function(nb.bins, nb.rows=nb.bins-1, nb.balls=2^nb.bins)
{
    x <- sample(c(0, 1), nb.balls * nb.rows, replace=TRUE)
    dim(x) <- c(nb.rows, nb.balls)
    # breaks centred on the integers so that bins 0 and 1 are not merged
    hist(colSums(x), breaks=(0:(nb.rows + 1)) - 0.5,
         main="Number of balls per bin")
}

Idea: drop nb.balls balls into a quincunx with nb.bins bins at the bottom. The
bin in which a ball ends up is the sum of nb.rows Bernoulli trials (where 0
stands for "left" and 1 for "right").
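
Because each ball's bin is a sum of nb.rows Bernoulli(1/2) trials, the proportion of balls per bin should be close to the Binomial(nb.rows, 1/2) probabilities. Here is a small sketch that checks this numerically instead of plotting it. The values of `nb.rows` and `nb.balls` and the names `bin`, `observed` and `expected` are assumptions made for this illustration.

```r
set.seed(42)
nb.rows  <- 10       # assumed number of rows of pins
nb.balls <- 20000    # assumed number of balls

# One column per ball, one row per pin: 0 = left, 1 = right
x <- matrix(sample(c(0, 1), nb.balls * nb.rows, replace = TRUE),
            nrow = nb.rows, ncol = nb.balls)
bin <- colSums(x)    # final bin of each ball, between 0 and nb.rows

# Observed proportion of balls per bin versus the binomial probabilities
observed <- as.numeric(table(factor(bin, levels = 0:nb.rows))) / nb.balls
expected <- dbinom(0:nb.rows, size = nb.rows, prob = 0.5)
print(round(cbind(bin = 0:nb.rows, observed, expected), 4))
```

The binomial probabilities are themselves approximately normal for large nb.rows, which is why the quincunx works as a picture of the CLT.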

Hope this helps!

On 21 April 2005 at 13:06, Paul Smith wrote:
#
On 21-Apr-05 Paul Smith wrote:
Similar to Francisco's suggestion:

  m <- numeric(10000)
  for (k in 1:20) {
    for (i in 1:10000) m[i] <- (mean(runif(k)) - 0.5) * sqrt(12 * k)
    # |m| cannot exceed sqrt(3*k) <= sqrt(60) < 9, so these breaks always cover m
    hist(m, breaks=0.3*(-30:30), xlim=c(-4,4), main=sprintf("%d",k))
  }

(On my slowish laptop, this ticks over at a satisfactory rate,
about 1 plot per second. If your machine is much faster, then
simply increase 10000 to a larger number.)

The real problem with demos like this, starting with the
uniform distribution, is that the result is, to the eye,
already approximately normal when k=3, and it's only out
in the tails that the improvement shows for larger values
of k.
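
To put rough numbers on the tail behaviour, here is a sketch that estimates P(|Z| > 2.5) for the standardised mean of k uniforms and compares it with the normal value. The cut-off 2.5, the values of k, the replication count and the helper name `std_mean` are all assumptions made for this illustration. For k=1 the standardised value cannot exceed sqrt(3), about 1.73, so that tail probability is exactly zero.

```r
set.seed(2005)
n_rep <- 100000                      # number of simulated means per k (assumed)

# Standardised mean of k uniform(0,1) draws: mean 0.5, variance 1/(12*k)
std_mean <- function(k, n = n_rep) {
    u <- matrix(runif(n * k), nrow = n, ncol = k)
    (rowMeans(u) - 0.5) * sqrt(12 * k)
}

ks <- c(1, 3, 12)
tail_prob <- sapply(ks, function(k) mean(abs(std_mean(k)) > 2.5))
names(tail_prob) <- paste0("k=", ks)

print(round(tail_prob, 4))           # simulated P(|Z| > 2.5) for each k
print(round(2 * pnorm(-2.5), 4))     # the same probability under N(0,1)
```

The simulated tail probability should rise towards the normal value as k increases, even though the histograms look much the same to the eye.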

This was in fact the way we used to simulate a normal
distribution in the old days: look up 3 numbers in
Kendall & Babington-Smith's "Tables of Random Sampling
Numbers", which are in effect pages full of integers
uniform on 00-99, and take their mean.

It's the one book I ever encountered which contained
absolutely no information -- at least, none that I ever
spotted.

A more dramatic illustration of the CLT effect might be
obtained if, instead of runif(k), you used rbinom(k,1,p)
for p > 0.5, say:

  m<-numeric(10000);
  p<-0.75; for(j in (1:50)){ k<-j*j
    for(i in(1:10000)){m[i]<-(mean(rbinom(k,1,p))-p)/sqrt(p*(1-p)/k)}
    hist(m,breaks=41,xlim=c(-4,4),main=sprintf("%d",k))
  }
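
The slower convergence in this case comes from skewness: a Bernoulli(p) variable has skewness (1-2p)/sqrt(p(1-p)), and the mean of k of them has that value divided by sqrt(k). Here is a sketch comparing simulated and theoretical skewness. The values of k, the replication count and the helper `skewness` are assumptions made for this illustration (base R has no skewness function, so one is defined here).

```r
set.seed(75)
p     <- 0.75
n_rep <- 100000                         # simulated means per k (assumed)

# Sample skewness: third central moment over the cubed standard deviation
skewness <- function(z) mean((z - mean(z))^3) / mean((z - mean(z))^2)^1.5

ks <- c(1, 4, 25, 100)
simulated <- sapply(ks, function(k) {
    # rbinom(n, k, p) / k is the mean of k Bernoulli(p) trials, n times over
    m <- rbinom(n_rep, size = k, prob = p) / k
    skewness((m - p) / sqrt(p * (1 - p) / k))
})
theoretical <- (1 - 2 * p) / sqrt(ks * p * (1 - p))

print(round(cbind(k = ks, simulated, theoretical), 3))
```

The skewness shrinks only like 1/sqrt(k), so the asymmetry stays visible for much longer than it does with the uniform distribution, which is symmetric to begin with.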

Cheers,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 21-Apr-05                                       Time: 19:48:05
------------------------------ XFMail ------------------------------
#
(Ted Harding) <Ted.Harding at nessie.mcc.ac.uk> writes:
Ted, you and several others in this thread may want to notice the
existence of a little function called replicate():

...
    m <- replicate(10000, (mean(runif(k))-0.5)*sqrt(12*k)) 
...
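
For example, Ted's first loop might be rewritten along the following lines. This is a sketch, not the code elided above. The fixed seed and the break range (wide enough to cover every possible value, since |m| cannot exceed sqrt(3*k)) are choices made for this illustration.

```r
set.seed(123)
for (k in 1:20) {
    # 10000 standardised means of k uniform(0,1) draws, with no explicit inner loop
    m <- replicate(10000, (mean(runif(k)) - 0.5) * sqrt(12 * k))
    hist(m, breaks = 0.3 * (-30:30), xlim = c(-4, 4), main = sprintf("%d", k))
}
```

replicate() evaluates its expression afresh on each repetition and collects the results in a vector, so there is no need to preallocate m or index into it.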
#
This won't help teach R, but it might illuminate the CLT.  Here is a series
of animated GIFs that begin with different densities, including one that has
a "U" shape, and plot the density of Xbar for n=2,3,4,8,16,32.

http://www.StatisticalEngineering.com/central_limit_theorem.htm

I've also included an explanation of what is happening at each iteration.

Charles Annis, P.E.

Charles.Annis at StatisticalEngineering.com
phone: 561-352-9699
eFax:  614-455-3265
http://www.StatisticalEngineering.com
 
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Paul Smith
Sent: Thursday, April 21, 2005 1:07 PM
To: r-help at stat.math.ethz.ch
Subject: [R] Using R to illustrate the Central Limit Theorem


______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
#
Hi R users and developers:

I have just installed the new R version 2.1.0 on a
Linux platform.

I get this error when I call the function:
Making links in per-session dir ...
Error in gsub(pattern, replacement, x, ignore.case, extended, fixed) :
         input string 28 is invalid in this locale

What am I missing? (It works fine with version 2.0.1)
5 days later
#
The problem appears to be that you are in a UTF-8 locale and one of your 
packages is not written in UTF-8.

This exact problem is solved in R-patched, but some parts of help.start() 
will not work correctly with those packages.
On Fri, 22 Apr 2005, Kenneth Roy Cabrera Torres wrote:

            
Well, UTF-8 locales did not `work fine' in 2.0.1.