Skip to content

Beta binomial and Beta negative binomial

5 messages · Christophe Dutang, Tim Triche, Jr., Joan Maspons

#
Hi,
I need Beta binomial and Beta negative binomial functions but in R there is
only SuppDists package which provide this distributions using a limited
parameter space of the generalized hypergeometric distribution (dghyper & Co.)
which provide a limited parameter space for Beta binomial and Beta negative
binomial functions (e.g. alpha + beta <1 in the Beta negative binomial).

I've done a checkout to R svn to seek the code for the implementation of
distribution functions (dbinom et al.) but I haven't succeed. I don't know how
difficult could be to implement Beta binomial and Beta negative binomial
functions having the PDF  and CDF functions but I'm interested in implementing
it. I have programming skills but I don't know much about R internals.

Where is the code? Can I implement these new functions inside stats
package following the
same patterns as other probability distributions?

Yours,
#
Hi,

Please look at the distribution task view (http://cran.r-project.org/web/views/Distributions.html) and the package gamlss.dist.

By the way, distributions in R are implemented in <source>/src/nmath directory and not <source>/src/library/stats

Regards

Christophe

--
Christophe Dutang
Ph.D. student at ISFA, Lyon, France
website: http://dutangc.free.fr

Le 16 mars 2012 ? 18:41, Joan Maspons a ?crit :
#
Hello,


El 16 de mar? de 2012 20:34, Christophe Dutang <dutangc at gmail.com> ha escrit:
Thanks for the tip. There are Beta binomial functions but they don't
have the number of trials parameter so I supose it's a Beta Bernoulli
distribution.
I have implemented a prototype of the beta negative binomial:

FindParamBetaDist<- function(mu, sigma){ # return(data.frame(a=shape1,b=shape2))
# mu<- a/(a+b)          [mean]
# sigma<- ab/((a+b)^2 (a+b+1))  [variance]
# Maxima: solve([mu= a/(a+b) , sigma= a*b/((a+b)^2 * (a+b+1))], [a,b]);
  a<-  -(mu * sigma + mu^3 - mu^2) / sigma
  b<- ((mu-1) * sigma + mu^3 - 2 * mu^2 + mu) / sigma
  if (a <= 0 | b <= 0) return (NA)
  return (data.frame(a,b))
}

#Rmpfr::pochMpfr()?
pochhammer<- function (x, n){
    return (gamma(x+n)/gamma(x))
}

# PMF:
# P (X = x) = ((alpha)_n (n)_x (beta)_x)/(x! (alpha+beta)_n
(n+alpha+beta)_x) |  for  | x>=0
# (a)_b Pochhammer symbol
dbetanbinom<- function(x, size, mu, sigma){
    param<- FindParamBetaDist(mu, sigma)
    if (is.na(sum(param))) return (NA) #invalid Beta parameters
    if (length(which(x<0))) res<- 0
    else
        res<- (pochhammer(param$a, size) * pochhammer(size, x) *
pochhammer(param$b, x)
            / (factorial(x) * pochhammer(param$a + param$b, size)
            * pochhammer(size + param$a + param$b, x)))
    return (res)
}

curve(dbetanbinom(x, size=12, mu=0.75, sigma=.1), from=0, to=24, n=25, type="p")

# CDF:
# P (X<=x) = 1-(Gamma(n+floor(x)+1) beta(n+alpha, beta+floor(x)+1)
#            genhypergeo(1, n+floor(x)+1, beta+floor(x)+1;floor(x)+2,
n+alpha+beta+floor(x)+1;1))
#            /(Gamma(n) beta(alpha, beta) Gamma(floor(x)+2)) |  for  | x>=0
pbetanbinom<- function(q, size, mu, sigma){
    require(hypergeo)
    param<- FindParamBetaDist(mu, sigma)
    if (is.na(sum(param))) return (NA) #invalid Beta parameters
    res<- numeric(length(q))
    for (i in 1:length(q)){
        if (q[i]<0) res[i]<- 0
        else res[i]<- (1-(gamma(size+floor(q[i])+1) *
beta(size+param$a, param$b+floor(q[i])+1)
        * genhypergeo(c(1, 1+size+floor(q[i]), 1+param$b+floor(q[i])),
c(2+floor(q[i]),1+size+param$a+param$b+floor(q[i])), 1))
        / (beta(param$a, param$b) * gamma(size) * gamma(2+floor(q[i]))))
    }
    return (res)
}

## genhypergeo not converge. Increase iterations or tolerance?
pbetanbinom(0:10x, size=20, mu=0.75, sigma=0.03)

I have to investigate
http://mathworld.wolfram.com/GeneralizedHypergeometricFunction.html
Any tip on how to solve the problem?
#
El 18 de mar? de 2012 2:46, Tim Triche, Jr. <tim.triche at gmail.com> ha escrit:
I looks nice but I'm a little bit lost. Gsl have 10 hypergeometric functions:

hyperg_0F1(c, x, give=FALSE, strict=TRUE)
*hyperg_1F1_int(m, n, x, give=FALSE, strict=TRUE)
*hyperg_1F1(a, b, x, give=FALSE, strict=TRUE)
**hyperg_U_int(m, n, x, give=FALSE, strict=TRUE)
*hyperg_U(a, b, x, give=FALSE, strict=TRUE)
hyperg_2F1(a, b, c, x, give=FALSE, strict=TRUE)
hyperg_2F1_conj(aR, aI, c, x, give=FALSE, strict=TRUE)
hyperg_2F1_renorm(a, b, c, x, give=FALSE, strict=TRUE)
hyperg_2F1_conj_renorm(aR, aI, c, x, give=FALSE, strict=TRUE)
*hyperg_2F0(a, b, x, give=FALSE, strict=TRUE)

* functions with the same number of parameters
** functions with the swame number of parameters and types (a,b,c,x<-
integer, m,n<- real)
genhypergeo(c(1, 1+size+floor(q[i]), 1+param$b+floor(q[i])),
c(2+floor(q[i]),1+size+param$a+param$b+floor(q[i])), 1)
genhypergeo(c(int, int, real), c(int,real), 1)

I picked the hyperg_U_int(c(1,2,2.5),c(2,3.1),1) but this give a
vector of three numbers and a warning. I'm not mathematician and this
seems to much for me. Which function is the equivalent to
Mathematica's HypergeometricPQF [1,2]?
I would like to but with the dependences I don't think it will fit
R-base. Is there any package who want to include these distributions
(BB and BNB)?
[1] http://mathworld.wolfram.com/GeneralizedHypergeometricFunction.html
[2] http://reference.wolfram.com/mathematica/ref/BetaNegativeBinomialDistribution.html