MCMCglmm multivariate meta-analysis with covariance
Hi Jon, In this instance you are actually fitting 3 group-level random effects; 1 in the random part of the model, one in residual part of the model and one in the argument to mev. The random effects and the residual effects are confounded and are not identifiable in the likelihood. You would be better just fitting the model: m <- MCMCglmm::MCMCglmm(outcome ~ 1, rcov=~group, mev=dat$obsVar,data=dat, prior=list(R=list(V=1, nu=0))) it is virtually the same as your original model (since the residual variance was set close to zero) but it will iterate faster and mix better. In this model you estimate the between-observation (in your case between-group) variance after accounting for the sampling variance. There are two additional, equivalent, ways of writing this model: m.a <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~us(sqrt(obsVar)):group, rcov=~group,data=dat, prior=list(R=list(V=1, nu=0), G=list(G1=list(V=1, fix=1)))) and Cinv_sparse<-as(diag(1/dat$obsVar), "dgCMatrix") rownames(Cinv_sparse)<-dat$group m.b <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~group, rcov=~group,data=dat, ginverse=list(group=Cinv_sparse), prior=list(R=list(V=1, nu=0), G=list(G1=list(V=1, fix=1)))) It is this latter equivalence that I suggested you exploit in the bivariate case. Lets say there are now two rows per group (one for spend and one for performance) and you have a column in the data-frame indicating response type (lets call it sORp) I don't know how your 2x2 sampling covariance matrices for each group are stored, but lets say they are in list form. As an example, with two groups: C<-list(S1=matrix(c(1,0.5, 0.5,1),2,2), S2=matrix(c(2,0, 0,1),2,2)) Cinv_sparse<-as(list2bdiag(lapply(C,solve)), "dgCMatrix") Assuming the data are sorted response type with group (i.e. in the same order as C) then you can fit dat$observation<-1:nrow(dat) rownames(Cinv_sparse)<-1:nrow(dat) m.biv <- MCMCglmm::MCMCglmm(outcome ~ sORp-1, random=~observation, rcov=~us(sORp):group,data=dat, ginverse=list(observation=Cinv_sparse), prior=list(R=list(V=diag(2), nu=0), G=list(G1=list(V=1, fix=1)))) The rcov term models between-observation (ie. between group) (co)variation not due to sampling error. These variances could be zero, but you would have to have a pretty effective way of randomising the original (unsummarised) observations between groups. Cheers, Jarrod
On 20/10/2016 20:31, Jon Bischof wrote:
Hi Jarrod,
You are making an excellent point, but I am working in a big data
setting where only group aggregates, not individual outcomes, will fit
in memory.
In brief: this model estimates the log ratio of advertiser spend and
performance between a control and experiment condition. These two
outcomes are measured at a click level, but we have far too many
clicks to store individually in memory. Instead we want to fit a model
to advertiser sufficient statistics, which are the log ratio mean and
within variance.
For either of these outcomes individually, I was able to fit the
following model and recover the true hyperparameters from fake data:
# Prior distribution for model
kMCMCglmmPrior <- list(R=list(fix=1, V=1e-6),
G=list(G1=list(V=1e-1, nu=-2)))
# dat: A data.frame with columns 'group', 'outcome', and 'obsVar'
m <- MCMCglmm::MCMCglmm(outcome ~ 1, random=~group, mev=dat$obsVar,
data=dat, prior=kMCMCglmmPrior)
However, for many GLMM implementations (including LMER), the within
and between variance are not separately identifiable when aggregate
data are provided. This is not true statistically, but the aggregate
likelihood requires special handling and cannot be fit from a program
that expects one row for each y_ij.
Can MCMCglmm fit my one-dimensional model when R is not fixed and
there is only one row per group?
Thanks!
Jon
On Thu, Oct 20, 2016 at 11:29 AM, Jarrod Hadfield <j.hadfield at ed.ac.uk
<mailto:j.hadfield at ed.ac.uk>> wrote:
Hi Jon,
MCMCglmm fits random-effect meta-analysis (I think this is what it
is caused) which assumes that even after correcting for sampling
errors there will be some 'real' between-observation variance (and
in your case covariance). I'm not sure what you are modelling, but
at least in the types of data I work with I can't really believe
there isn't some 'real' between-observation variance.
Cheers,
Jarrod
On 20/10/2016 19:09, Jon Bischof wrote:
Jarrod,
Thanks for your detailed response! Your understanding of my model
is correct: it's just a single grouping metaanalysis in two
dimensions:
(y_i,1, y_i,2) ~ Normal ( (theta_i,1, theta_i,2), V_i )
(theta_i,1, theta_i,2) ~ Normal ( (mu_1, mu_2), S )
where V_i is a known covariance matrix of measurement error.
As a new user of mcmcglmm I will need some time to experiment
with your idea to confirm that it works. My understanding,
however, was that residual error was specified in the R matrix
and not the G matrix. Do we need to fix R as well? Will we still
be able to estimate S?
Thanks!
Jon
On Tue, Oct 18, 2016 at 11:45 PM, Jarrod Hadfield
<j.hadfield at ed.ac.uk <mailto:j.hadfield at ed.ac.uk>> wrote:
Hi Jon,
If you have the covariance matrix for your observations, then
take its inverse and store it in sparse format:
Cinv_sparse<-as(Cinv, "dgCMatrix")
where Cinv is the inverse in dense format. When you say
multivariate do you mean something like an explicit bivariate
response such that the fixed formula is of the form
cbind(y_1, y_2)~...? If so you need to organise your data in
long format and pass a single response vector. You can
include a variable that denotes whether the observation is
y_1 or y_2 and use it like "trait", and include a variable
that denotes the original row for the observation and use it
like "units". If we call this second variable "row" then
having fit "row" as a random effect, and pass the argument
ginverse=list(row=Cinv_sparse) to MCMCglmm. You will also
need to fix the "row" variance to one in the prior:
G1=list(V=1, fix=1)
Presumably covariances are only non-zero between observations
from the same original row? If so make sure the sparse Matrix
also represents this: numerical issues during inversion may
cause zero entries to differ slightly from zero and hence not
be represented as zero.
Cheers,
Jarrod
Then you can fit the term ~trait:units
On 19/10/2016 05:41, Jon Bischof wrote:
I'm interested in fitting a multivariate meta-analysis
model with
correlated measurement error. This means fixing the error
to a covariance
matrix per row.
I saw this post
<https://stat.ethz.ch/pipermail/r-sig-mixed-models/2013q2/020180.html
<https://stat.ethz.ch/pipermail/r-sig-mixed-models/2013q2/020180.html>>
on
the mailing list about non-correlated outcomes, but the
noise correlation
is too large to ignore in my use case. Professor Hadfield
implies in the
post that it is possible but "complicated". Does anyone
know how to do it?
Thanks!
Jon Bischof
[[alternative HTML version deleted]]
_______________________________________________
R-sig-mixed-models at r-project.org
<mailto:R-sig-mixed-models at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
<https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
-------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20161020/1671baca/attachment.pl>