Jeremy--
Just to chime in here, I think the devil is almost always in the details
of the specific data and model. Thus, "Stata can 'do' AGQ for multiple
random-effects" really does not mean too much outside of a specific set
of data and model.
I recently assisted with an Epidemiology colleague who was trying to run
a cross-classified GLMM with logit outcome and ~45K participants. After
24 hours Stata had made it through 4-5 iterations. lmer() fit the model
in 5 minutes.
At the same time, that example does not mean lmer() is universally
superior for all models and datasets. However, if you're talking
cross-classified data... it probably is. ;)
As for MCMC, I would strongly recommend taking a look at MCMCglmm, which
is a fully Bayesian package for generalized linear mixed models and very
good.