have to point it out again: a distribution question

2 messages · Huntsinger, Reid, Weiwei Shi

#
Stock returns and other financial data have often been found to be heavy-tailed.
Even Cauchy distributions (without even a first absolute moment) have been
entertained as models.

Your qq function subtracts numbers on the scale of a normal(0,1)
distribution from the input data. When the variation in the input data is
insignificant compared to 1, say, then l$y - l$x gives you back essentially
the "theoretical quantiles", i.e. the "x" component of the list, up to sign
and a constant shift. l$x is basically a sample from a normal(0,1)
distribution, so the points do line up perfectly in the second qqnorm(). Is
that what's happening?
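
A minimal sketch of this effect, for illustration only. The real data are not available, so kk_sim is a hypothetical stand-in built from the fitted values quoted further down in the thread (centre near 1, scale about 0.0077, roughly 4 degrees of freedom); the seed and the names kk_sim, rho and line_fit are assumptions of the sketch.

```r
set.seed(1)                                # arbitrary seed, for reproducibility only
kk_sim <- 1 + 0.0077 * rt(10000, df = 4)   # hypothetical stand-in: centred near 1, tiny spread

l <- qqnorm(kk_sim, plot.it = FALSE)       # l$x: theoretical quantiles, l$y: the data
d <- l$y - l$x                             # the same difference the qq function takes

# l$y barely moves compared with l$x, so d is close to 1 - l$x.
rho <- cor(d, l$x)
rho                                        # expected to be close to -1

# A normal QQ plot of d is therefore close to a normal QQ plot of normal quantiles.
l2 <- qqnorm(d, plot.it = FALSE)
line_fit <- cor(l2$x, l2$y)
line_fit                                   # expected to be close to 1
```

If the same sample is rescaled so that its spread is comparable to 1, the difference no longer collapses onto the theoretical quantiles and the straight line should disappear.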

Reid Huntsinger



-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of WeiWei Shi
Sent: Thursday, April 28, 2005 1:38 PM
To: Vincent ZOONEKYND
Cc: R-help at stat.math.ethz.ch
Subject: [R] have to point it out again: a distribution question


Dear R-helpers:
I raised this question last time, but it was only partially solved.
So I would like to raise it again, since I think it is very
interesting, at least to me.
It is not a question about how to use R; rather, it is a
theoretical and practical question, illustrated with R.

I came across this question when I built a model for some stock returns.
That's the reason I cannot post the complete data here. But I would
like to attach some plots (I zipped them since the original ones
are too big).

The first plot, qq1, is the qqnorm plot of my sample, which shows an
"S" shape. Since I am not very experienced, I am not sure what kind of
distribution my sample follows.

The second plot, qq2, was obtained via
qqnorm(rt(10000, 4)), since I ran
fitdistr(kk, 't') and got
        m              s              df
  9.998789e-01   7.663799e-03   3.759726e+00
 (5.332631e-05) (5.411400e-05) (8.684956e-02)

The second plot seems to say that my sample follows a t-distribution (I am
not sure of this).

BTW, what are the commands for simulating other distributions such as
log-normal, exponential, and so on?
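
For reference, base R's stats package follows an r<name> naming pattern for random generators. A short sketch; the parameter values and variable names below are arbitrary choices for illustration, not values taken from the data.

```r
n <- 10000                                         # sample size, arbitrary

x_lnorm   <- rlnorm(n, meanlog = 0, sdlog = 1)     # log-normal
x_exp     <- rexp(n, rate = 1)                     # exponential
x_gamma   <- rgamma(n, shape = 2, rate = 1)        # gamma
x_weibull <- rweibull(n, shape = 1.5, scale = 1)   # Weibull

# Each can be passed to qqnorm() in the same way as rt(10000, 4) above, e.g.
# qqnorm(x_lnorm)
```

help(Distributions) lists the remaining families, each with matching d/p/q/r functions.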

The third one was obtained by running the following R code:

Suppose my data is read into dataset k from file "f392.txt":
k<-read.table("f392.txt", header=FALSE)    # read into k
kk<-k[[1]]
qq(kk)


qq function is defined as below:
qq<-function(dataset){
  l<-qqnorm(dataset, plot.it=FALSE)  # l$x: theoretical quantiles, l$y: the sample
  diff<-l$y-l$x  # difference between the sample and its theoretical quantiles
  qqnorm(diff)
}


The most interesting thing is that (if there is no silly trick here, and
if my sample follows some kind of distribution, whether or not that
distribution is a known one) my qq function seems to be a way to
evaluate it. What worries me is that the line is too "perfect",
which indicates there is something goofy here that could probably be
explained by some mathematical argument. However, I used
qq(rnorm(10000))
qq(rt(10000, 3.7))
qq(rf(....))

None of them gave me this perfect line!

Sorry for the long question, but I wanted to make my question clear to
everybody. I tried my best :)

Thanks for reading,

Weiwei (Ed) Shi, Ph.D
On 4/23/05, Vincent ZOONEKYND <zoonek at gmail.com> wrote:
#
Here is a summary of l$y, l$x and diff from

l<-qqnorm(kk) # kk is my sample
diff<-l$y-l$x

l$y (which is my sample):
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.9007  0.9942  0.9998  0.9999  1.0060  1.1070

l$x (which is the theoretical quantile):
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-4.145e+00 -6.745e-01  0.000e+00  2.383e-17  6.745e-01  4.145e+00

diff:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-3.0380  0.3311  0.9998  0.9999  1.6690  5.0460

Comparing diff with l$x: although the 1st Qu. and 3rd Qu. are different,
diff and l$x look similar to each other, which is confirmed by
qqnorm(l$x) and qqnorm(diff).


Running the following code:

r<-rnorm(1000)+1 # since my sample shift from zero to 1
qq(r[r>0.9 & r<1.2])  # select the central part

This gives me a straight line now.

Thanks for the good explanation of the phenomenon.

Then, Reid, or other R gurus, is there a good way to discretize the
sample into 3 categories: the 2 tails and the body?
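
One possible approach, sketched under assumptions: split at empirical quantiles with cut(). The 5% and 95% cutoffs are an arbitrary choice rather than a recommendation, and x is a hypothetical stand-in for the real sample, which is not available here.

```r
set.seed(2)                               # arbitrary seed, for reproducibility only
x <- 1 + 0.0077 * rt(10000, df = 4)       # hypothetical stand-in for the sample kk

# Assumed cutoffs: lowest 5% and highest 5% count as tails.
cuts <- unname(quantile(x, probs = c(0.05, 0.95)))

part <- cut(x,
            breaks = c(-Inf, cuts, Inf),
            labels = c("lower tail", "body", "upper tail"))

table(part)                               # counts per category
```

Where the body ends and a tail begins is a modelling decision; the quantile levels above only illustrate the mechanics.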

Thanks again,

Weiwei
On 4/28/05, Huntsinger, Reid <reid_huntsinger at merck.com> wrote: