Dear all,
I'm working with the geometric distribution for the time being, and I'm
confused. This may have more to do with statistics than R itself, but
since I'm getting results from R I find counterintuitive (well, yeah, my
statistical intuition has not been properly sharpened), I feel like
asking.
The point first:
If I do
rgeom(1,prob=1)
I get:
[1] NaN
Warning message:
NAs produced in: rgeom(n, prob)
And if I do:
rgeom(1,prob=0)
I get:
[1] NaN
Warning message:
NAs produced in: rgeom(n, prob)
I was expecting to get 0 and Inf respectively.... Should I expect that?
Going back to my textbooks (Dudewicz and Mishra 1988, primarily), they
describe the geometric distribution as the distribution of a random
variable that gives the number of the trial on which an event first
occurs, with a constant probability p (or prob) per trial. Contrary to R,
my textbooks have x = 1, 2, 3, ...
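To make the two conventions concrete, here is a small sketch. It assumes
R's dgeom() counts the failures before the first success (x = 0, 1, 2, ...),
while the textbook variable k = x + 1 counts the trial on which the success
occurs. The value p = 0.3 and the variable names are chosen only for
illustration.

```r
p <- 0.3                       # illustrative success probability
x <- 0:5                       # R's convention: failures before the first success
k <- x + 1                     # textbook convention: trial of the first success

pmf_r        <- dgeom(x, prob = p)    # P(X = x) = p * (1 - p)^x
pmf_textbook <- p * (1 - p)^(k - 1)   # P(K = k) = p * (1 - p)^(k - 1)

# The two mass functions agree once the support is shifted by one.
all.equal(pmf_r, pmf_textbook)
```

So a textbook value of 1 (success on the first trial) corresponds to 0 in
R's parametrisation.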
My intuition says that with this interpretation, a probability of 0 means
that the event just won't happen, and consequently rgeom(1,prob=0) should
always return Inf. qgeom() seems to agree:
qgeom(1,1)
[1] Inf
rgeom(1,prob=1) says to me that the event must occur on the first trial,
and with x = 0, 1, 2, ..., that means it should return 0. Here,
qgeom(0,0)
[1] NaN
Warning message:
NaNs produced in: qgeom(p, prob, lower.tail, log.p)
Is it my intuition that is playing tricks with me (again), or is it a bug?
Yours Confusedly,
Kjetil
Kjetil Kjernsmo
Graduate astronomy-student
University of Oslo, Norway
E-mail: kjetikj at astro.uio.no
"Problems worthy of attack
Prove their worth by hitting back" - Piet Hein
Homepage <URL:http://www.astro.uio.no/~kjetikj/>
Webmaster at skepsis.no
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> Dear all,
> I'm working with the geometric distribution for the time being, and I'm
> confused. This may have more to do with statistics than R itself, but
> since I'm getting results from R I find counterintuitive (well, yeah, my
> statistical intuition has not been properly sharpened), I feel like
> asking.
> The point first:
> If I do
> rgeom(1,prob=1)
> I get:
> [1] NaN
> Warning message:
> NAs produced in: rgeom(n, prob)
> And if I do:
> rgeom(1,prob=0)
> I get:
> [1] NaN
> Warning message:
> NAs produced in: rgeom(n, prob)
> I was expecting to get 0 and Inf respectively.... Should I expect that?
Yes, and probably No.
As far as I know, none of the rxxx functions produce Inf results -- we
have stuck to real-valued random variables.
We don't have entirely consistent handling of limiting behaviour in
mathematical functions. That is, if a quantity is undefined at a
particular parameter value but has a unique limit we often but not always
provide that limit.
I think rgeom(prob=1) is a case where there is a clearly right answer: 0.
Given that we have recently got upset at the glibc maintainers for not
providing correct limiting behaviour for exp, we should probably fix it.
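A minimal user-level sketch of what "providing the limit" at prob = 1 could
look like, assuming a scalar prob. The name rgeom_limit is hypothetical and
not part of R; this is only an illustration, not how a fix inside rgeom()
itself would necessarily be written.

```r
# Hypothetical wrapper (not part of R): return the limiting value 0
# when prob is exactly 1, otherwise defer to rgeom().
rgeom_limit <- function(n, prob) {
  if (isTRUE(prob == 1)) {
    # With prob = 1 the event occurs on the first trial, so there are
    # zero failures before the first success.
    return(rep(0, n))
  }
  rgeom(n, prob)
}

rgeom_limit(3, prob = 1)
rgeom_limit(3, prob = 0.5)
```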
On the other hand rgeom(prob=0)=Inf is less clearly correct. In the
development version we give NA rather than NaN: that is "we're not going
to answer this" rather than "the answer is undefined".
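The distinction between the two can be checked directly; a short sketch
(the vector x is just an illustration):

```r
x <- c(NA, NaN)

is.na(x)    # TRUE for both: NaN also counts as missing
is.nan(x)   # TRUE only for NaN: NA is not "undefined", just unanswered
```

So code that tests with is.na() treats both results alike, while is.nan()
separates them.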
-thomas