Hi Jon,
In this instance you are actually fitting 3 group-level random effects; 1
in the random part of the model, one in residual part of the model and one
in the argument to mev.
The random effects and the residual effects are confounded and are not
identifiable in the likelihood. You would be better just fitting the model:
m <- MCMCglmm::MCMCglmm(outcome ~ 1, rcov=~group, mev=dat$obsVar,data=dat,
prior=list(R=list(V=1, nu=0)))
it is virtually the same as your original model (since the residual
variance was set close to zero) but it will iterate faster and mix better.
In this model you estimate the between-observation (in your case
between-group) variance after accounting for the sampling variance. There
are two additional, equivalent, ways of writing this model:
m.a <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~us(sqrt(obsVar)):group,
rcov=~group,data=dat, prior=list(R=list(V=1, nu=0), G=list(G1=list(V=1,
fix=1))))
and
Cinv_sparse<-as(diag(1/dat$obsVar), "dgCMatrix")
rownames(Cinv_sparse)<-dat$group
m.b <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~group, rcov=~group,data=dat,
ginverse=list(group=Cinv_sparse), prior=list(R=list(V=1, nu=0), G=list(G1=list(V=1,
fix=1))))
It is this latter equivalence that I suggested you exploit in the
bivariate case. Lets say there are now two rows per group (one for spend
and one for performance) and you have a column in the data-frame indicating
response type (lets call it sORp)
I don't know how your 2x2 sampling covariance matrices for each group are
stored, but lets say they are in list form. As an example, with two groups:
C<-list(S1=matrix(c(1,0.5, 0.5,1),2,2), S2=matrix(c(2,0, 0,1),2,2))
Cinv_sparse<-as(list2bdiag(lapply(C,solve)), "dgCMatrix")
Assuming the data are sorted response type with group (i.e. in the same
order as C) then you can fit
dat$observation<-1:nrow(dat)
rownames(Cinv_sparse)<-1:nrow(dat)
m.biv <- MCMCglmm::MCMCglmm(outcome ~ sORp-1, random=~observation,
rcov=~us(sORp):group,data=dat, ginverse=list(observation=Cinv_sparse),
prior=list(R=list(V=diag(2), nu=0), G=list(G1=list(V=1, fix=1))))
The rcov term models between-observation (ie. between group) (co)variation
not due to sampling error. These variances could be zero, but you would
have to have a pretty effective way of randomising the original
(unsummarised) observations between groups.
Cheers,
Jarrod
On 20/10/2016 20:31, Jon Bischof wrote:
Hi Jarrod,
You are making an excellent point, but I am working in a big data setting
where only group aggregates, not individual outcomes, will fit in memory.
In brief: this model estimates the log ratio of advertiser spend and
performance between a control and experiment condition. These two outcomes
are measured at a click level, but we have far too many clicks to store
individually in memory. Instead we want to fit a model to advertiser
sufficient statistics, which are the log ratio mean and within variance.
For either of these outcomes individually, I was able to fit the following
model and recover the true hyperparameters from fake data:
# Prior distribution for model
kMCMCglmmPrior <- list(R=list(fix=1, V=1e-6), G=list(G1=list(V=1e-1,
nu=-2)))
# dat: A data.frame with columns 'group', 'outcome', and 'obsVar'
m <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~group, mev=dat$obsVar,
data=dat, prior=kMCMCglmmPrior)
However, for many GLMM implementations (including LMER), the within and
between variance are not separately identifiable when aggregate data are
provided. This is not true statistically, but the aggregate likelihood
requires special handling and cannot be fit from a program that expects one
row for each y_ij.
Can MCMCglmm fit my one-dimensional model when R is not fixed and there is
only one row per group?
Thanks!
Jon
On Thu, Oct 20, 2016 at 11:29 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk>
wrote:
Hi Jon,
MCMCglmm fits random-effect meta-analysis (I think this is what it is
caused) which assumes that even after correcting for sampling errors there
will be some 'real' between-observation variance (and in your case
covariance). I'm not sure what you are modelling, but at least in the types
of data I work with I can't really believe there isn't some 'real'
between-observation variance.
Cheers,
Jarrod
On 20/10/2016 19:09, Jon Bischof wrote:
Jarrod,
Thanks for your detailed response! Your understanding of my model is
correct: it's just a single grouping metaanalysis in two dimensions:
(y_i,1, y_i,2) ~ Normal ( (theta_i,1, theta_i,2), V_i )
(theta_i,1, theta_i,2) ~ Normal ( (mu_1, mu_2), S )
where V_i is a known covariance matrix of measurement error.
As a new user of mcmcglmm I will need some time to experiment with your
idea to confirm that it works. My understanding, however, was that residual
error was specified in the R matrix and not the G matrix. Do we need to fix
R as well? Will we still be able to estimate S?
Thanks!
Jon
On Tue, Oct 18, 2016 at 11:45 PM, Jarrod Hadfield <j.hadfield at ed.ac.uk>
wrote:
Hi Jon,
If you have the covariance matrix for your observations, then take its
inverse and store it in sparse format:
Cinv_sparse<-as(Cinv, "dgCMatrix")
where Cinv is the inverse in dense format. When you say multivariate do
you mean something like an explicit bivariate response such that the fixed
formula is of the form cbind(y_1, y_2)~...? If so you need to organise
your data in long format and pass a single response vector. You can include
a variable that denotes whether the observation is y_1 or y_2 and use it
like "trait", and include a variable that denotes the original row for the
observation and use it like "units". If we call this second variable "row"
then having fit "row" as a random effect, and pass the argument
ginverse=list(row=Cinv_sparse) to MCMCglmm. You will also need to fix the
"row" variance to one in the prior:
G1=list(V=1, fix=1)
Presumably covariances are only non-zero between observations from the
same original row? If so make sure the sparse Matrix also represents this:
numerical issues during inversion may cause zero entries to differ slightly
from zero and hence not be represented as zero.
Cheers,
Jarrod
Then you can fit the term ~trait:units
On 19/10/2016 05:41, Jon Bischof wrote:
I'm interested in fitting a multivariate meta-analysis model with
correlated measurement error. This means fixing the error to a
covariance
matrix per row.
I saw this post
<https://stat.ethz.ch/pipermail/r-sig-mixed-models/2013q2/020180.html>
on
the mailing list about non-correlated outcomes, but the noise
correlation
is too large to ignore in my use case. Professor Hadfield implies in the
post that it is possible but "complicated". Does anyone know how to do
it?
Thanks!
Jon Bischof
[[alternative HTML version deleted]]