Tests for the need of cluster analysis

3 messages · Tal Galili, MARY A. WEISS, Ben Bolker

MARY A. WEISS <mweiss <at> temple.edu> writes:
You might want to forward this question to the r-sig-mixed-models
list.   I think you are fairly far off base in comparing 'prabclus'
(spatial clustering) to what Stata means by "clustered standard errors"
(e.g. <http://www.stata.com/support/faqs/stat/cluster.html>).
Cluster _analysis_ has to do with finding clusters in data; prabclus
uses spatial information to do cluster analysis; robust cluster
variances or standard errors have to do with adjusting variance/SE
to account for predetermined grouping variables ("clusters" in the
data, e.g. states).
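
  To make the distinction concrete, here is a minimal sketch in plain
Python (no packages, so it runs anywhere; the same arithmetic carries
over to R). It is not what Stata or any R package does internally: it
assumes a one-regressor model with no intercept, uses made-up data with
three hypothetical "states", and omits the finite-sample corrections
that real implementations usually apply. The function names are mine.
The only difference between the two estimators is whether the
per-observation scores x*e are squared one at a time (robust) or summed
within each predetermined group first (robust cluster).

```python
from collections import defaultdict


def fit_no_intercept(x, y):
    """OLS slope for y = b*x + e (one regressor, no intercept)."""
    sxx = sum(xi * xi for xi in x)
    b = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    resid = [yi - b * xi for xi, yi in zip(x, y)]
    return b, resid, sxx


def robust_se(x, resid, sxx):
    """Robust (HC0-style) SE: every observation contributes its own squared score."""
    meat = sum((xi * ei) ** 2 for xi, ei in zip(x, resid))
    return (meat / sxx ** 2) ** 0.5


def cluster_robust_se(x, resid, sxx, groups):
    """Robust cluster SE: scores are summed within each group, then squared."""
    score = defaultdict(float)
    for xi, ei, g in zip(x, resid, groups):
        score[g] += xi * ei
    meat = sum(s * s for s in score.values())
    return (meat / sxx ** 2) ** 0.5


# Made-up data: the errors are shared within each "state" (+1, 0, -1).
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 1.0, 3.0, 5.0]
groups = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]

b, resid, sxx = fit_no_intercept(x, y)
se_robust = robust_se(x, resid, sxx)
se_cluster = cluster_robust_se(x, resid, sxx, groups)

print("slope:", b)
print("robust SE:", se_robust)
print("robust cluster SE:", se_cluster)
```

  Note that the grouping variable is supplied by the analyst; nothing
here searches for clusters, which is what cluster analysis would do. If
every observation is put in its own group, the two functions return the
same number.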

  I don't know offhand whether there are packages in R that implement
the "robust cluster variance" estimator; the geeglm function in the
geepack package, and especially the "sandwich" package, are definitely
worth looking at (as far as I can tell they implement the equivalent of
robust, but not robust cluster, variance estimators), as well as
the Econometrics Task View and the book "R for Stata Users" by
Muenchen and Hilbe.

  A final philosophical note: I don't think you should be
testing _based on your data_ whether robust or robust cluster
variance estimators are more appropriate; there's a fairly
dangerous data snooping issue here.  Rather, you should try to
decide _a priori_, based on how your data were collected and
grouped, what's most appropriate.

  Ben Bolker
[snip]