
pbinom with size argument 0 (PR#8560)

7 messages · uht@dfu.min.dk, (Ted Harding), Peter Dalgaard +2 more

#
Full_Name: Uffe Høgsbro Thygesen
Version: 2.2.0
OS: linux
Submission from: (NULL) (130.226.135.250)


Hello all.

  pbinom(q=0,size=0,prob=0.5)

returns the value NaN. I had expected the result 1. In fact any value for q
seems to give an NaN. Note that

  dbinom(x=0,size=0,prob=0.5)

returns the value 1.

Cheers,

Uffe
#
On 03-Feb-06 uht at dfu.min.dk wrote:
Well, "NaN" can make sense since "q=0" refers to a single sampled
value, and there is no value which you can sample from "size=0";
i.e. sampling from "size=0" is a non-event. I think the probability
of a non-event should be NaN, not 1! (But maybe others might argue
that if you try to sample from an empty urn you necessarily get
zero "successes", so p should be 1; but I would counter that you
also necessarily get zero "failures" so q should be 1. I suppose
it may be a matter of whether you regard the "r" of the binomial
distribution as referring to the "identities" of the outcomes
rather than to how many you get of a particular type. Hmmm.)
That is probably because the .Internal code for pbinom may do
a preliminary test for "x >= size". This also makes sense, for
the cumulative p<dist> for any <dist> with a finite range,
since the answer must then be 1 and a lot of computation would
be saved (likewise returning 0 when x < 0). However, it would
make even more sense to have a preceding test for "size<=0"
and return NaN in that case since, for the same reasons as
above, the result is the probability of a non-event.

(But it depends on your point of view, as above ... However,
surely the two should be consistent with each other.)

Best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 03-Feb-06                                       Time: 14:34:28
------------------------------ XFMail ------------------------------
#
(Ted Harding) <Ted.Harding at nessie.mcc.ac.uk> writes:
Once you get your coffee, you'll likely realize that you got your p's
and d's mixed up...

I think Uffe is perfectly right: The result of zero experiments will
be zero successes (and zero failures) with probability 1, so the
cumulative distribution function is a step function with one step at
zero ( == as.numeric(x>=0) ).
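
The step-function description can be checked numerically. The following is only a sketch: it assumes an R version in which pbinom accepts size=0 (the version in the original report returned NaN), and the names q_vals, step_cdf and binom_cdf are made up for illustration.

```r
# Illustrative grid of quantiles on both sides of zero (q_vals is a made-up name)
q_vals <- c(-2, -0.5, 0, 0.5, 3)

# Step function with a single step at zero, as described above
step_cdf <- as.numeric(q_vals >= 0)

# Assumption: pbinom accepts size = 0; the reported version gave NaN here
binom_cdf <- pbinom(q_vals, size = 0, prob = 0.5)

# Show the two side by side
print(cbind(q = q_vals, step = step_cdf, pbinom = binom_cdf))
```

If the two columns agree, the size=0 case behaves as a one-point distribution at zero.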

  
    
#
On 03-Feb-06 Peter Dalgaard wrote:
You're right about the mix-up! (I must mend the pipeline.)
I'm perfectly happy with this argument so long as it leads to
dbinom(x=0,size=0,prob=p)=1 and also pbinom(q=0,size=0,prob=p)=1
(which seems to be what you are arguing too). And I think there
are no traps if p=0 or p=1.
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 03-Feb-06                                       Time: 15:07:49
------------------------------ XFMail ------------------------------
1 day later
#
(Ted Harding) wrote:
I prefer a (consistent) NaN. What happens to our notion of a
Binomial RV as a sequence of Bernoulli RVs if we permit n=0?
I have never seen (nor contemplated, I confess) the definition
of a Bernoulli RV as anything other than some dichotomous-outcome
one-trial random experiment. Not n trials, where n might equal zero,
but _one_ trial. I can't see what would be gained by permitting a
zero-trial experiment. If we assign probability 1 to each outcome,
we have a problem with the sum of the probabilities.

Peter Ehlers
#
P Ehlers <ehlers at math.ucalgary.ca> writes:
What's the problem?

An n=0 binomial is the sum of an empty set of Bernoulli RV's, and the
sum over an empty set is identically 0.
Consistency is what you gain. E.g. 

 binom(.,n=n1+n2,p) == binom(.,n=n1,p) * binom(.,n=n2,p)

where * denotes convolution. This will also hold for n1=0 or n2=0 if
the binomial in that case is defined as a one-point distribution at
zero. Same thing as any(logical(0)) etc., really.
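
The convolution identity above can be illustrated with a short check. This is a sketch under stated assumptions: conv_pmf is a hypothetical helper written for this example (it is not part of R), and the dbinom call with size=0 assumes an R version that treats that case as a point mass at zero.

```r
# Hypothetical helper: convolve two pmfs stored on supports 0:(length - 1)
conv_pmf <- function(a, b) {
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a)) {
    idx <- i:(i + length(b) - 1)
    out[idx] <- out[idx] + a[i] * b   # shift b by (i - 1) and weight by a[i]
  }
  out
}

p  <- 0.3
n1 <- 0   # the degenerate case under discussion
n2 <- 4

# Left-hand side: binomial with n1 + n2 trials
lhs <- dbinom(0:(n1 + n2), size = n1 + n2, prob = p)

# Right-hand side: convolution of the two component pmfs
# (assumes dbinom(0, size = 0, prob = p) is 1)
rhs <- conv_pmf(dbinom(0:n1, size = n1, prob = p),
                dbinom(0:n2, size = n2, prob = p))

print(all.equal(lhs, rhs))

# Empty-set conventions that follow the same logic
print(sum(numeric(0)))   # empty sum
print(any(logical(0)))   # empty "or"
```

Convolving with a point mass at zero leaves the other pmf unchanged, which is why the identity extends to n1=0.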
1 day later
#
On Sun, 5 Feb 2006, Peter Dalgaard wrote:

            
Consistency is a Good Thing, and I had already altered the codebase to 
consistently allow size=0 as a discrete distribution concentrated at 0.

There were other inconsistencies, e.g. whether the geometric/negative 
binomial functions allow prob=0 or prob=1.  I have no problem with prob=1 
(it is a discrete distribution concentrated on one point) and this was 
addressed for rnbinom before (PR#1218) but subsequently broken (which is 
why we like regression tests ...).  However prob=0 does not correspond to 
a proper distribution unless Inf is allowed as a value, and it was not so 
documented (nor implemented).  Indeed we had
[1] 0
[1] 0
[1] 0

and in fact dgeom gave zero for every allowed value.  So I cannot accept 
that as being right (and we even have a d-p-q-r test with prob=0).
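
Why prob=0 cannot give a proper geometric distribution can be seen from the pmf written out by hand. This is only a sketch: geom_pmf is a hypothetical helper that evaluates p * (1 - p)^x directly, and it is not a statement about what dgeom returns in any particular R version.

```r
# Geometric pmf by hand: P(X = x) = p * (1 - p)^x for x = 0, 1, 2, ...
# (geom_pmf is a made-up helper, not R's dgeom)
geom_pmf <- function(x, p) p * (1 - p)^x

x <- 0:1000

# prob = 1: all mass sits at x = 0 (0^0 evaluates to 1 in R)
pmf_one <- geom_pmf(x, 1)

# prob = 0: every term is zero, so the values cannot sum to 1
pmf_zero <- geom_pmf(x, 0)

print(sum(pmf_one))
print(sum(pmf_zero))
```

The prob=1 case is a one-point distribution, while the prob=0 case puts no mass on any finite value.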