Skip to content

credible interval for empirical Bayesian estimates of rates

5 messages · Dexter Locke, Roger Bivand

#
Dear esteemed list,

I'm using spdep::EBest with family = 'binomial' for counts of events within
polygons that have an 'at risk' population. The resultant "estmm" is
'shrunk' compared to the raw rate (both given by EBest and calculated "by
hand" rate. All good there.

Using GeoDa version 1.14.0 24 August 2019 produces identical results for
its Empirical Bayesian rate. This was confirmed by plotting the EBest
output against GeoDa's rate and finding a perfect correlation along the 1
to 1 line. All good there.

Two questions:
1. How can credible intervals around these smoothed rate estimates be
calculated?
2. The spdep documentation calls this a binomial family, but the identical
results are obtained from GeoDa calls this "Poisson-Gamma" model here:
https://geodacenter.github.io/workbook/3b_rates/lab3b.html#fnref11 , so
what is actually being calculated? This question may help me answer the
first question..

Possibly the answers are addressed in the literature cited which I cannot
access right now at home without institutional library access.

Thanks for your consideration,
Dexter
http://dexterlocke.com/
#
On Fri, 24 Apr 2020, Dexter Locke wrote:

            
Please provide a reproducible example, as this may help with answers.
No, the default family is "poisson", with "binomial" available for 
non-rare conditions following Martuzzi, implemented by Olaf 
Berke, ?spdep::EBest.

The code in spdep is easily accessible, so can be read directly. Please 
also compare with code for the EB Moran test, and with analogous code in 
the DCluster package, empbaysmooth(). Cf. ASDAR 2nd ed., ch. 10, section 
10.2, pp. 322-328. The epitools::pois.exact() function is used for CIs. 
For code and data see https://asdar-book.org/bundles2ed/dismap_bundle.zip.
Most institutions do have proxy or VPN access, but the code will be as 
useful. In PySAL, the code would also guide you, but even though GeoDa is 
open source, the C++ is fairly dense.

Hope this helps,

Roger

  
    
#
Thanks Roger and list.

I didn't think a repex was needed because a question was: why does
spdep::EBest(counts, population, family = 'binomial') give the same
results at GeoDa's, while EBest(.. binomial) is "binomial" while GeoDa
calls that "Poisson-Gamma". GeoDa can't give use a repex (GUI) and think
this is a question about terminology. The same results were achieved with
the packages while naming the model differently - why?

Yes ?spdep::EBest directed me to the literature I'm struggling to access.
And Yes, I've been looking at the raw code and understand how the estmm is
generated.

I've been using the epitools::pois.exact() and spdep::EBest. I can compare
the point estimates from pois.exact to those provided by EBest, but I'd
like to graph side by side their credible / confidence intervals.

Its this last part on the credible intervals I'm interested in. How to get
credible intervals around estmm? This is my main question.

ASDAR is a reference I'm using all the time. Thanks for that gem.

DCluster::empbaysmooth also does not provide a credible interval, either.

-Dexter
http://dexterlocke.com/
On Fri, Apr 24, 2020 at 10:23 AM Roger Bivand <Roger.Bivand at nhh.no> wrote:

            

  
  
#
On Fri, 24 Apr 2020, Dexter Locke wrote:

            
Reproducible example:

auckland <- st_read(system.file("shapes/auckland.shp",
   package="spData")[1], quiet=TRUE)
res <- EBest(auckland$M77_85, 9*auckland$Und5_81)
res0 <- EBest(auckland$M77_85, 9*auckland$Und5_81, family="binomial")
Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
0.001487 0.002237 0.002549 0.002648 0.002968 0.004531
Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
0.001484 0.002235 0.002549 0.002648 0.002968 0.004536

After calculating Und5_81 * 9 as a new column, running GeoDa -> Map -> 
Rates-Caluculated Map -> Empirical Bayes and exporting a shapefile, in R I 
see:
Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
0.001487 0.002237 0.002549 0.002648 0.002968 0.004531

which is the same as the Poisson, and:
[1] TRUE

GeoDa is providing the same EB Poisson as EBest, isn't it? If yours 
differ, are both implementations seeing the same data?
EBest() was written to implement Bailey & Gatrell's textbook, which did 
not provide CI, and just used the Marshall Auckland dataset. If you'd like 
to implement them, I'd welcome a contribution.
As you see from ch. 10, CI are described for the epitools implementation. 
My feeling is that the literature moved away from simple EB rates towards 
IID RE and spatially structured RE, with relevant covariates, say like the 
classic Scottish Lip Cancer dataset, and currently CARBayes is a solid 
package among many others. Simply using base population becomes too 
unsatisfactory. PHE uses funnel plots which do have CI of a kind, to draw 
attention from small base populations: 
https://nhsrcommunity.com/blog/introduction-to-funnel-plots/ which can be 
used to adjust class intervals for mapping.

Roger

  
    
#
Wonderful thank you.

GeoDA and the spdep::EBest are providing the same estimates for my
datasets. Thanks for the reprex example with the GUI via ">". That is a
helpful convention I'll use in the future..

Ok great, there is not yet a CI implementation. If I can manage one I'd be
delighted to share and contribute it.

Julie Silge: https://juliasilge.com/blog/bayesian-blues/

and David Robinson
http://varianceexplained.org/r/credible_intervals_baseball/

have some suggestions that are similar to each other's (no surprise there,
they work together a lot).

Thanks for the suggestions about alternative model specifications and
including covariates.

This has been very helpful, I appreciate this list immensely,
Dexter
On Fri, Apr 24, 2020 at 3:05 PM Roger Bivand <Roger.Bivand at nhh.no> wrote: