Skip to content

Simulation with R

8 messages · Ethan Johnsons, John Sorkin, Peter Dalgaard +2 more

#
An apparatus exists whereby a collection of balls is displaced to the
top of a stack by suction. A top level (Level 1) each ball is shifted
1 unit to the left or 1 unit to the right at random with equal
probability. The ball then drops down to level 2. At Level 2, each
ball is again shifted 1 unit to the left or 1 unit to the right at
random. The process continues for 15 levels and the balls are
collected at the bottom for a short time until being collected by
suction to the top.

Can we do a simulation of the process using R with 100 balls, and plot
the frequency distribution of the position of the balls at the bottom
with respect to the entry position?

thx in advance

ej
#
Ethan Johnsons <ethan.johnsons <at> gmail.com> writes:
Can you make a plausible case that this isn't a homework problem?

  Ben Bolker
#
R 2.3.1
Windows XP 

I am surprised at the lack of precision in R, as noted below. I would
expect the values to be closer to zero, particularly the later examples
where the sample size is quite large.
[1] -0.03209727
[1] -0.005991322
[1] -0.0006160524
[1] -0.001254309
[1] 0.0004633778
[1] -0.0001325764

I would appreciate any thoughts. Is the lack of precision due to
running on a 32-bit system?

Thanks,
John


John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
jsorkin at grecc.umaryland.edu

Confidentiality Statement:
This email message, including any attachments, is for the so...{{dropped}}
#
John Sorkin wrote:
Looks OK to me. The theoretical SEM in the latter case is 
sqrt(1/5e7)==0.0001414214.
#
Indeed, you are correct! In never pays to do stats too early in the
morning!
John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
jsorkin at grecc.umaryland.edu
John Sorkin wrote:
would
examples
Looks OK to me. The theoretical SEM in the latter case is 
sqrt(1/5e7)==0.0001414214.

Confidentiality Statement:
This email message, including any attachments, is for the so...{{dropped}}
#
On 12/9/2006 9:26 AM, John Sorkin wrote:
These are all cases where the theoretical distributions are easy to 
calculate, and the results you are getting are not particularly unusual:
> pnorm(-0.03209727, sd=1/sqrt(500))
[1] 0.2364660
> pnorm(-0.005991322, sd=1/sqrt(5000))
[1] 0.3359104

[skipped some]
> pnorm(-0.0001325764, sd=1/sqrt(50000000))
[1] 0.1742618
No, the lack of precision seems to be due to the distributions you are 
sampling from.

Duncan Murdoch
#
Ducan,
First, thank you for your reply. 
As I previously noted in a reply to Peter Dalgaard, you, and he, are
correct. I missed something quite obvious. Sometimes it does not pay to
do statistics early in the morning! Again, many thanks,
John

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
jsorkin at grecc.umaryland.edu
On 12/9/2006 9:26 AM, John Sorkin wrote:
would
examples
These are all cases where the theoretical distributions are easy to 
calculate, and the results you are getting are not particularly
unusual:
> pnorm(-0.03209727, sd=1/sqrt(500))
[1] 0.2364660
> pnorm(-0.005991322, sd=1/sqrt(5000))
[1] 0.3359104

[skipped some]
> pnorm(-0.0001325764, sd=1/sqrt(50000000))
[1] 0.1742618
No, the lack of precision seems to be due to the distributions you are

sampling from.

Duncan Murdoch
1 day later
#
Ethan Johnsons <ethan.johnsons <at> gmail.com> writes:
After poking around a bit, I see from your Google groups profile
that you are (or at least claim to be) self-teaching yourself
biostats -- which explains your really heavy posting record
here and in other stats forums and means this isn't 
necessarily homework.
(I know, I could just keep my mouth shut).  It would
probably be helpful if it were possible for you to find someone
local to help with some of these questions, as many of them are 
basic stats questions ...

  As for this question -- this process generates a binomial distribution.
Some ways of simulating the frequency distribution are
(1) use rbinom() to sample a number of left- moves for
each of 100 balls -- then the number of right-moves is known,
and you can calculate the ending position
(2) construct a string of 1 and -1 values
for each ball
(either use sample(c(-1,1),size=100,replace=TRUE),
or something involving sign(runif(100),-1,1)) and
use cumsum() to calculate the final position
(3) use a for() loop to sample values and calculate
the position one step at a time [move <- sample(c(-1,1));
cpos <- cpos+move; etc.]

#1 is by far the most efficient, #3 the least, but
might be the most transparent.

  Would you consider "giving back to the community" by writing
up some of the most useful advice you've gotten
for the R wiki?

  Ben Bolker