Contribution to lme4Eigen - R-SIG-mixed-models

Thu, Mar 1, 2012 11:11 AM #

Hi lme4 developers,

I hope I can contribute to something positive to the community. I noticed that even in the latest version of lme4Eigen the lmer function only use Nelder-Mead optimizer. I would like to get Bobyqa optimizer back as an option (because I really like Bobyqa optimizer). So, I patched the code a little bit. I used the (hopefully) latest revision from SVN version 1631. I also added control for xst and xt factor multiplier (a FIXME item list). I hope this change is also acceptable. If you feel I am ignorant of your code style, I apologize.

Here is the code (only the lmer function). I clearly marked my changes. If you need a diff file with the current HEAD, I'll be happy to provide you with one. The function seems to be running happily without error or warning on my machine. I hope the changes can make it to the main trunk.

Sincerely,
Roby

Joehanes, Roby (NIH/NHLBI) [F]

Thu, Mar 1, 2012 11:15 AM #

Ouch. Apparently, the mail daemon ate my attachment.

So, here it is (pasted). Sorry for the mail bomb.

-----------------------
lmer <- function(formula, data, REML = TRUE, sparseX = FALSE,
control = list(), start = NULL,
verbose = 0L, subset, weights, na.action, offset,
contrasts = NULL, devFunOnly=FALSE, optimizer=c("NelderMead","bobyqa"), ...)
{
    if (sparseX) warning("sparseX = TRUE has no effect at present")
    mf <- mc <- match.call()
    ## '...' handling up front, safe-guarding against typos ("familiy") :
    if(length(l... <- list(...))) {
		if (!is.null(l...$family)) {  # call glmer if family specified
			mc[[1]] <- as.name("glmer")
			return(eval(mc, parent.frame()) )
		}
		## Check for method argument which is no longer used
		if (!is.null(method <- l...$method)) {
			msg <- paste("Argument", sQuote("method"), "is deprecated.")
			if (match.arg(method, c("Laplace", "AGQ")) == "Laplace") {
				warning(msg)
				l... <- l...[names(l...) != "method"]
			} else stop(msg)
		}
		if(length(l...))
	    warning("extra argument(s) ",
		paste(sQuote(names(l...)), collapse=", "),
		" disregarded")
    }
	
    stopifnot(length(formula <- as.formula(formula)) == 3)
    if (missing(data)) data <- environment(formula)
	# evaluate and install the model frame :
    m <- match(c("data", "subset", "weights", "na.action", "offset"),
	names(mf), 0)
    mf <- mf[c(1, m)]
    mf$drop.unused.levels <- TRUE
    mf[[1]] <- as.name("model.frame")
    fr.form <- subbars(formula) # substituted "|" by "+" -
    environment(fr.form) <- environment(formula)
    mf$formula <- fr.form
    fr <- eval(mf, parent.frame())
	# random effects and terms modules
    reTrms <- mkReTrms(findbars(formula[[3]]), fr)
    if (any(unlist(lapply(reTrms$flist, nlevels)) >= nrow(fr)))
	stop("number of levels of each grouping factor must be less than number of obs")
    ## fixed-effects model matrix X - remove random effects from formula:
    form <- formula
    form[[3]] <- if(is.null(nb <- nobars(form[[3]]))) 1 else nb
    X <- model.matrix(form, fr, contrasts)#, sparse = FALSE, row.names = FALSE) ## sparseX not yet
    p <- ncol(X)
    if ((qrX <- qr(X))$rank < p)
	stop(gettextf("rank of X = %d < ncol(X) = %d", qrX$rank, p))
    rho <- new.env(parent=parent.env(environment()))
    rho$pp <- do.call(merPredD$new, c(reTrms[c("Zt","theta","Lambdat","Lind")], n=nrow(X), list(X=X)))
    rho$resp <- mkRespMod(fr, if(REML) p else 0L)
	
    devfun <- mkdevfun(rho, 0L)
    devfun(reTrms$theta) # one evaluation to ensure all values are set
	
    if (devFunOnly) return(devfun)
	
    lower <- reTrms$lower
	# RJ's changes begin
	opt <- switch(match.arg(optimizer),
		bobyqa = {
			if(!is.numeric(control$rhobeg)) control$rhobeg <- 0.0002
			if(!is.numeric(control$rhoend)) control$rhoend <- 2e-7
			rho$control <- control
			# Delete unused options to prevent warning from showing up
			control$FtolAbs <- NULL
			control$FtolRel <- NULL
			bobyqa(rho$pp$theta, devfun, lower, control=control)
		},
		NelderMead = {
			## FIXME: this code is replicated in lmer/glmer/nlmer ...
			## it seems good to have it in R rather than C++ code but maybe it should go within Nelder_Mead() ??
			control$iprint <- switch(as.character(min(verbose,3L)), "0"=0, "1"=20,"2"=10,"3"=1)
			xst <- rep.int(0.1, length(lower))
			# RJ -- allow user control of xst, xt
			ctl <- control
			if(!is.numeric(control$xstFactor))
				xstFactor <- 0.2
			else
				xstFactor <- control$xstFactor
			if(!is.numeric(control$xtFactor))
				xtFactor <- 0.0001
			else
				xtFactor <- control$xtFactor
			ctl$xstFactor <- NULL
			ctl$xtFactor <- NULL
			Nelder_Mead(devfun, x0=rho$pp$theta, xst=xstFactor*xst, xt=xst*xtFactor, lower=lower, control=ctl)
		})

	if(optimizer=="NelderMead") {
	    if (opt$ierr < 0L) {
			if (opt$ierr > -4L)
			stop("convergence failure, code ", opt$ierr, " in NelderMead")
			else
			warning("failure to converge in", opt$control$maxfun, "evaluations")
		}
	} else if (optimizer=="bobyqa") {
		if (opt$ierr > 0L)
			warning("convergence problem, code ", opt$ierr, " in bobyqa")
	}
	# RJ's changes end
    mkMerMod(environment(devfun), opt, reTrms, fr, mc)
}## { lmer }

On Mar 1, 2012, at 2:11 PM, Joehanes, Roby (NIH/NHLBI) [F] wrote:

Joehanes, Roby (NIH/NHLBI) [F]

Thu, Mar 1, 2012 12:17 PM #

Apologies again. I submitted the changes to the tracker here:
https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1861&group_id=60&atid=300

On Mar 1, 2012, at 2:15 PM, Joehanes, Roby (NIH/NHLBI) [F] wrote:

<snip>

Ben Bolker

Thu, Mar 1, 2012 12:32 PM #

Joehanes, Roby (NIH/NHLBI) [F] <roby.joehanes at ...> writes:

Would you be willing to send this as an SVN diff to
lme4-authors <at> r-forge.wu-wien.ac.at ?  Or post it to the issue tracker
on r-forge ?

  Ben

Joehanes, Roby (NIH/NHLBI) [F]

Thu, Mar 1, 2012 12:44 PM #

Hi Ben,

Thank you for the reply. I've added the diff patch against SVN rev 1631 in the issue tracker:
https://r-forge.r-project.org/tracker/download.php/60/300/1861/147/lmer-svn1631-patch.txt

See the corresponding page here:
https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1861&group_id=60&atid=300

Also attached (cc-ed to the e-mail you mentioned). Apologies about the line ending. I am currently using a Mac.

Roby

On Mar 1, 2012, at 3:32 PM, Ben Bolker wrote:

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: lmer-svn1631-patch.txt
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20120301/385ad7d1/attachment-0001.txt>

Joehanes, Roby (NIH/NHLBI) [F]

Thu, Mar 1, 2012 2:52 PM #

Hi,

I would like to do something like this in pedigreemm:

result <- pedigreemm(y ~ ...<the other factors>... + (1 | id), data=mydata, pedigree=list(id=mypedigree), na.action=na.omit)

To my impression after reading pedigreemm's code, I need several observations with the same ID number and I cannot have each observation to have a different ID. Is this correct? Then, is it possible to have 1 observation having 1 ID? I don't think it is, given how the Z matrix for ID is encoded in lmer (and thereby pedigreemm) and how examples in the paper accompanying the package are constructed. Maybe I am incorrect or maybe I am doing an incorrect encoding.

What I am getting at is genome-wide association study (GWAS)-like analysis that account for familial correlation matrix. Here, each observation comes from one individual, which is associated an individual ID and a pedigree ID. One pedigree ID contains essentially a family tree from great grandparents down to the little ones. The pedigree is somewhat complicated by step-parents (indicated by siblings sharing one parent, but not the other) and a few members of one pedigree intermarrying those of another. I have the inclination to specify the model as:

result <- pedigreemm(y ~ ...<the other factors>... + (1 | ped_id), data=mydata, pedigree=list(ped_id =mypedigree), na.action=na.omit)

To me, it doesn't seem to be correct since pedigreemm will treat observations having the same ped_id as repeat observations. If I understand the code correctly, pedigreemm will tie ped_id with the mypedigree$id. I cannot put in individual IDs into mypedigree$id since it will make pedigreemm associate IDs incorrectly. However, I cannot put in pedigree ID into mypedigree$id either since one pedigree ID consists of the whole family tree.

The questions are: How can I deal with this problem? Should I break up the pedigree into nuclear families? How can I deal with the step-parents and intermarriage complications above? If pedigreemm was not designed for problems of this class, which other packages should I look into?

Thank you,
Roby

David Duffy

Thu, Mar 1, 2012 6:08 PM #

On Thu, 1 Mar 2012, Joehanes, Roby (NIH/NHLBI) [F] wrote:

The pedigree() constructor is set up as animal breeders do things: the 
entire dataset is one big pedigree, even though there may be subcomponents 
that do not connect to one another.  This is why sparse matrix 
representations are used.  So, you just need unique IDs for everybody (eg 
concatenate pedigree with individual ID).  This allows for intermarriage 
etc.  If you think step-parents transmit your phenotype to step-children, 
then you need to add in a family environmental random effect (ie common to 
all members of an extended family, or perhaps just to that "nongenetic" 
nuclear family).  If you have repeat measures on individuals, then you 
include multiple records with the same unique ID.

Cheers, David Duffy

| David Duffy (MBBS PhD)                                         ,-_|\
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101  /     *
| Epidemiology Unit, Queensland Institute of Medical Research   \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v