Tests for the need of cluster analysis - R-help

Tal Galili · 2011-05-02T17:02:01Z

Mon, May 2, 2011 12:51 PM

MARY A. WEISS <mweiss <at> temple.edu> writes:

You might want to forward this question to the r-sig-mixed-models
list.   I think you are fairly far off base in comparing 'prabclus'
(spatial clustering) to what Stata means by "clustered standard errors"
(e.g. <http://www.stata.com/support/faqs/stat/cluster.html>).
Cluster _analysis_ has to do with finding clusters in data; prabclus
uses spatial information to do cluster analysis; robust cluster
variances or standard errors have to do with adjusting variance/SE
to account for predetermined grouping variables ("clusters" in the
data, e.g. states).

  I don't know offhand whether there are packages in R that implement
the "robust cluster variance" estimator; the geepack package (in
particular its geeglm function) and especially the "sandwich" package
are definitely worth looking at (as far as I can tell, they implement
the equivalent of robust, but not robust cluster, variance
estimators), as well as the Econometrics Task View and the book
"R for Stata Users" by Muenchen and Hilbe.
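
  To make the distinction above concrete, here is a minimal sketch of
what a "robust cluster" variance estimator computes for a
one-predictor least-squares fit. It is written in plain Python (not
R) so that every step of the arithmetic is visible, and it is not
taken from any package. The function name ols_with_cluster_se, the
simulated data, and the finite-sample factor (G/(G-1))*((n-1)/(n-k)),
which I believe matches the convention Stata documents for
regression, are all assumptions made for illustration. Note that the
grouping variable is supplied in advance; nothing here searches for
clusters.

```python
import math
import random


def ols_with_cluster_se(x, y, groups):
    """Fit y = a + b*x by OLS (hypothetical helper for illustration).

    Returns (coef, classical_se, cluster_se). Needs at least 2 groups.
    """
    n, k = len(x), 2
    # "Bread": inverse of X'X for the design matrix with columns [1, x].
    sx, sxx = sum(x), sum(v * v for v in x)
    det = n * sxx - sx * sx
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]
    # OLS coefficients: (X'X)^{-1} X'y.
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    coef = [bread[0][0] * sy + bread[0][1] * sxy,
            bread[1][0] * sy + bread[1][1] * sxy]
    resid = [yi - coef[0] - coef[1] * xi for xi, yi in zip(x, y)]
    # Classical variance: sigma^2 * (X'X)^{-1}.
    sigma2 = sum(e * e for e in resid) / (n - k)
    classical_se = [math.sqrt(sigma2 * bread[j][j]) for j in range(k)]
    # "Meat": sum over clusters g of (X_g' e_g)(X_g' e_g)'.
    scores = {}
    for xi, ei, g in zip(x, resid, groups):
        s = scores.setdefault(g, [0.0, 0.0])
        s[0] += ei
        s[1] += xi * ei
    meat = [[sum(s[i] * s[j] for s in scores.values()) for j in range(k)]
            for i in range(k)]
    # Sandwich: bread * meat * bread, times the assumed finite-sample factor.
    n_groups = len(scores)
    adj = (n_groups / (n_groups - 1)) * ((n - 1) / (n - k))
    bm = [[sum(bread[i][m] * meat[m][j] for m in range(k)) for j in range(k)]
          for i in range(k)]
    vc = [[adj * sum(bm[i][m] * bread[m][j] for m in range(k))
           for j in range(k)] for i in range(k)]
    cluster_se = [math.sqrt(vc[j][j]) for j in range(k)]
    return coef, classical_se, cluster_se


# Simulated (made-up) data: 20 predetermined groups of 10 observations,
# each group sharing a common random shock.
random.seed(1)
groups = [g for g in range(20) for _ in range(10)]
shock = [random.gauss(0, 1) for _ in range(20)]
x = [random.gauss(0, 1) for _ in groups]
y = [1 + 2 * xi + shock[g] + random.gauss(0, 1) for xi, g in zip(x, groups)]

coef, classical_se, cluster_se = ols_with_cluster_se(x, y, groups)
print("coefficients (intercept, slope):", coef)
print("classical SEs:", classical_se)
print("cluster-robust SEs:", cluster_se)
```

  If every observation is placed in its own group, the same code
reduces to a heteroskedasticity-robust ("robust", not "robust
cluster") estimator with a degrees-of-freedom correction, which is
the sense in which the two estimators are related.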

  A final philosophical note: I don't think you should be
testing _based on your data_ whether robust or robust cluster
variance estimators are more appropriate; there's a fairly
dangerous data snooping issue here.  Rather, you should try to
decide _a priori_, based on how your data were collected and
grouped, what's most appropriate.

  Ben Bolker

[snip]
