estimating variance components for arbitrarily defined var/covar matrices

9 messages · Jarrod Hadfield, Steve Walker, Viechtbauer Wolfgang (STAT) +2 more

#
Matthew,

You should be able to do this in communityPGLMM in {pez}. Also, Steve Walker is currently working on a way to do this in lmer/glmer.


Cheers, Tony
On 02/26/15, Jarrod Hadfield wrote:
#
And just to throw another package into the mix: you can also use the metafor package for this. While this doesn't quite sound like a meta-analysis, in the end, meta-analytic models are just mixed-effects models. And in a phylogenetic meta-analysis, we add random effects with known correlation matrices to the model.

The syntax would be:

id.e <- 1:nrow(dat)
id.r <- 1:nrow(dat)
rma.mv(y ~ <fixed effects>, V = 0, random = list(~ 1 | id.r, ~ 1 | id.e), R = list(id.r = GRM), data=dat)

where 'GRM' is the n x n matrix of similarities. The V = 0 part seems a bit strange, but in meta-analytic models, we usually don't estimate the error variance and instead have known sampling variances (or even a known variance-covariance matrix of the sampling errors). Here, we don't, so we just set that part to 0 (you'll get a warning that 'V appears to be not positive definite', but you can safely ignore this). This will fit the model that you specified below.

Besides the estimates of the fixed effects, the results will include two variance components, one for id.r (this is VG) and one for id.e (this is sigma^2_error). The default is REML estimation (use method="ML" if you want MLEs).

By default, rma.mv() tries to rescale the matrix supplied via the R argument into a correlation matrix. Not sure if your matrix of similarities is really a correlation matrix or not. In case you don't want the function to mess with the matrix, set 'Rscale=FALSE' and it won't touch it.
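To make the recipe concrete, here is a minimal runnable sketch on simulated data. Only the rma.mv() syntax is from the message above; the toy GRM, the variance values, and the predictor x are made up for illustration (note that metafor matches the R matrix to the grouping variable via dimnames, so the toy GRM is given row/column names):

```r
library(metafor)
library(MASS)

set.seed(42)
n <- 50

## Toy 'GRM': a positive-definite similarity matrix scaled to a correlation
## matrix; dimnames must match the levels of the grouping variable id.r.
GRM <- cov2cor(crossprod(matrix(rnorm(n * n), n)))
dimnames(GRM) <- list(1:n, 1:n)

## Simulate y with genetic variance VG = 2 and residual variance 1
u <- MASS::mvrnorm(1, rep(0, n), 2 * GRM)
dat <- data.frame(y = 1 + u + rnorm(n), x = rnorm(n),
                  id.r = 1:n, id.e = 1:n)

res <- rma.mv(y ~ x, V = 0, random = list(~ 1 | id.r, ~ 1 | id.e),
              R = list(id.r = GRM), data = dat)
res$sigma2  # two components: VG (for id.r) and sigma^2_error (for id.e)
```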

Best,
Wolfgang

--   
Wolfgang Viechtbauer, Ph.D., Statistician   
Department of Psychiatry and Psychology   
School for Mental Health and Neuroscience   
Faculty of Health, Medicine, and Life Sciences   
Maastricht University, P.O. Box 616 (VIJV1)   
6200 MD Maastricht, The Netherlands   
+31 (43) 388-4170 | http://www.wvbauer.com
#
Hi,

If the `genetic' matrix is pedigree derived (?) the matrix is a correlation matrix (in the absence of inbreeding) or proportional to the covariance matrix (with inbreeding). However, the matrix has a lot of structure that can be exploited by *not* treating it as just another correlation matrix. Specifically, the inverse of the matrix is all zeros, except for the diagonal elements and those that correspond to mates and/or parent-offspring pairs. ASReml, WOMBAT, MCMCglmm, pedigreemm (?) and others all exploit this structure, leading to faster, more robust results. For small problems this might not be an issue.
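That sparsity pattern is easy to see in a toy example (base R only; the pedigree here is invented for illustration):

```r
## Toy pedigree: two unrelated, non-inbred parents (1, 2) and three
## full-sib offspring (3, 4, 5). A is the additive relationship matrix.
A <- matrix(c(1,   0,   0.5, 0.5, 0.5,
              0,   1,   0.5, 0.5, 0.5,
              0.5, 0.5, 1,   0.5, 0.5,
              0.5, 0.5, 0.5, 1,   0.5,
              0.5, 0.5, 0.5, 0.5, 1), 5, 5)
round(solve(A), 3)
## The inverse is non-zero only on the diagonal, for the mate pair (1, 2),
## and for parent-offspring pairs; the sib-sib entries are exactly zero.
```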

Cheers,

Jarrod



Quoting "Viechtbauer Wolfgang (STAT)" <wolfgang.viechtbauer at maastrichtuniversity.nl> on Thu, 26 Feb 2015 17:24:39 +0100:
#
Good point. Using sparse matrix methods is one way of handling that. Set 'sparse=TRUE' for rma.mv() to do so.

Best,
Wolfgang
#
Hi Wolfgang,

The matrix itself is not necessarily sparse (although it usually is); it's the inverse that is sparse. The pattern of sparsity is also important, because the equations can be permuted so that they are more easily solved (if you use something like minimum-degree ordering it will probably do a good job automatically). In addition, the nice structure is lost if individuals connecting two relatives are omitted. In this situation (the usual case) the random-effect vector is usually augmented with these `missing' individuals, so the matrix is of greater dimension than the number of observations, and the corresponding columns of the design matrix are set to zero. I'm not sure if metafor handles these sorts of issues? The issues are not difficult to solve, but often general-purpose packages don't accommodate them, because for most problems it's not clear why you would ever need to worry about them.
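A toy base-R sketch of that augmentation (dimensions and names are illustrative, not from any package):

```r
## Three observed individuals related only through two unobserved common
## ancestors. The random-effect vector u covers all five individuals, so
## the design matrix Z is 3 x 5, with zero columns for the unobserved ones.
n_obs <- 3; n_all <- 5
Z <- matrix(0, n_obs, n_all)
Z[cbind(1:n_obs, 1:n_obs)] <- 1   # each observation maps to individuals 1..3
## Columns 4 and 5 (the `missing' connecting individuals) stay all-zero;
## the marginal covariance of y is then Z %*% (VG * A) %*% t(Z) + s2e * diag(n_obs),
## where A is the 5 x 5 relationship matrix of the augmented pedigree.
```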

Cheers,

Jarrod

Quoting "Viechtbauer Wolfgang (STAT)" <wolfgang.viechtbauer at maastrichtuniversity.nl> on Thu, 26 Feb 2015 17:50:36 +0100:
#
Thanks Tony.  For those interested, you can check out the development of 
my work on phylogenetic models in lme4 here:

https://github.com/stevencarlislewalker/lme4ord

The section in the README file about phylogenetic models gives an 
example and briefly describes the kinds of models that can be fitted.

Regarding speed, the glmerc (with a 'c' for known covariance over grouping-factor levels) function can fit models similar to the one in the README, but with a 500-tip phylogeny, in about 20 seconds on my pretty standard MacBook Pro. lme4 uses sparse matrices for all random-effect-related matrix computations, so most of the speed comes from this infrastructure. However, it may be possible to speed things up further by exploiting Kronecker-product-type structure, though this would require some substantial additions to lme4, I think.

I should also note that the models fitted by glmerc do _not_ allow for fitting parameters that scale branch lengths (e.g. Ornstein-Uhlenbeck). The fundamental challenge with these models in lme4 is that lme4 is based on a Cholesky-factor parameterization of the random-effects covariance matrix. To parameterize the covariance matrix on the covariance or branch-length scale, one would need to compute a Cholesky factor every time the deviance function is evaluated. This could get costly. The recent methods of Ho and Ané (2014) in the phylolm package might help here. However, in their current form these methods only apply to quadratic terms involving the inverse covariance matrix and to the log determinant of the covariance matrix. These computations are certainly important in mixed modelling in general, but unfortunately not _so_ much for lme4, because of the Cholesky parameterization. However, I bet that their ideas could be modified to be applicable to lme4. In particular, I bet one could compute a series of small Cholesky decompositions at the tips and then iteratively rank-one update them while traversing the tree to the root. My guess is that this procedure would scale linearly, as do the other methods of Ho and Ané (2014).

Cheers,
Steve
On 2015-02-26 7:04 AM, Anthony R Ives wrote:
#
Interesting, thanks for the feedback. I am replying to you and cc-ing r-sig-mixed-models, as this is getting a bit off-topic.

The degree of sparseness of course depends on the application. For phylogenies, both the matrix and its inverse are equally sparse (due to the split at the root node). For pedigrees, this may be different, but I don't know enough about this. The types of applications I have been working on are phylogenetic meta-analyses, so here using sparse matrices works just fine (and in fact, we have been doing comparisons between metafor and MCMCglmm and things match up pretty nicely).

At any rate, one can use rma.mv() just fine to fit the model that Matthew was asking about. Maybe it is not the most efficient way of doing this, but I don't see how it is wrong.

Best,
Wolfgang
#
Steve,

Just a quick follow-up. Even though your work (and communityPGLMM) does not allow transformations of the covariance matrices themselves (like the OU), it does in effect perform Pagel's lambda, in the form sigma2*V + I ~= lambda*V + (1-lambda)*I (up to an overall scale, with lambda = sigma2/(sigma2 + 1)).
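A quick numeric check of this identity with toy values (V and sigma2 are made up; taking lambda = sigma2/(sigma2 + 1), the two sides agree exactly up to the scale factor sigma2 + 1):

```r
set.seed(1)
## Toy 4 x 4 "phylogenetic" correlation matrix
V <- cov2cor(crossprod(matrix(rnorm(16), 4)))
sigma2 <- 2
lambda <- sigma2 / (sigma2 + 1)

lhs <- sigma2 * V + diag(4)
rhs <- (sigma2 + 1) * (lambda * V + (1 - lambda) * diag(4))
max(abs(lhs - rhs))  # zero up to floating point
```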

Cheers, Tony


Anthony Ives
Department of Zoology
459 Birge Hall (4th floor, E end of bldg)
UW-Madison
Madison, WI 53706
608-262-1519
#
Good point Tony.  I'll add that to the README.

Cheers,
Steve

Quoting Anthony Ives <arives at wisc.edu>: