Hi lme4 developers, I hope I can contribute to something positive to the community. I noticed that even in the latest version of lme4Eigen the lmer function only use Nelder-Mead optimizer. I would like to get Bobyqa optimizer back as an option (because I really like Bobyqa optimizer). So, I patched the code a little bit. I used the (hopefully) latest revision from SVN version 1631. I also added control for xst and xt factor multiplier (a FIXME item list). I hope this change is also acceptable. If you feel I am ignorant of your code style, I apologize. Here is the code (only the lmer function). I clearly marked my changes. If you need a diff file with the current HEAD, I'll be happy to provide you with one. The function seems to be running happily without error or warning on my machine. I hope the changes can make it to the main trunk. Sincerely, Roby
Contribution to lme4Eigen
7 messages · Joehanes, Roby (NIH/NHLBI) [F], Ben Bolker, David Duffy
Ouch. Apparently, the mail daemon ate my attachment.
So, here it is (pasted). Sorry for the mail bomb.
-----------------------
lmer <- function(formula, data, REML = TRUE, sparseX = FALSE,
control = list(), start = NULL,
verbose = 0L, subset, weights, na.action, offset,
contrasts = NULL, devFunOnly=FALSE, optimizer=c("NelderMead","bobyqa"), ...)
{
if (sparseX) warning("sparseX = TRUE has no effect at present")
mf <- mc <- match.call()
## '...' handling up front, safe-guarding against typos ("familiy") :
if(length(l... <- list(...))) {
if (!is.null(l...$family)) { # call glmer if family specified
mc[[1]] <- as.name("glmer")
return(eval(mc, parent.frame()) )
}
## Check for method argument which is no longer used
if (!is.null(method <- l...$method)) {
msg <- paste("Argument", sQuote("method"), "is deprecated.")
if (match.arg(method, c("Laplace", "AGQ")) == "Laplace") {
warning(msg)
l... <- l...[names(l...) != "method"]
} else stop(msg)
}
if(length(l...))
warning("extra argument(s) ",
paste(sQuote(names(l...)), collapse=", "),
" disregarded")
}
stopifnot(length(formula <- as.formula(formula)) == 3)
if (missing(data)) data <- environment(formula)
# evaluate and install the model frame :
m <- match(c("data", "subset", "weights", "na.action", "offset"),
names(mf), 0)
mf <- mf[c(1, m)]
mf$drop.unused.levels <- TRUE
mf[[1]] <- as.name("model.frame")
fr.form <- subbars(formula) # substituted "|" by "+" -
environment(fr.form) <- environment(formula)
mf$formula <- fr.form
fr <- eval(mf, parent.frame())
# random effects and terms modules
reTrms <- mkReTrms(findbars(formula[[3]]), fr)
if (any(unlist(lapply(reTrms$flist, nlevels)) >= nrow(fr)))
stop("number of levels of each grouping factor must be less than number of obs")
## fixed-effects model matrix X - remove random effects from formula:
form <- formula
form[[3]] <- if(is.null(nb <- nobars(form[[3]]))) 1 else nb
X <- model.matrix(form, fr, contrasts)#, sparse = FALSE, row.names = FALSE) ## sparseX not yet
p <- ncol(X)
if ((qrX <- qr(X))$rank < p)
stop(gettextf("rank of X = %d < ncol(X) = %d", qrX$rank, p))
rho <- new.env(parent=parent.env(environment()))
rho$pp <- do.call(merPredD$new, c(reTrms[c("Zt","theta","Lambdat","Lind")], n=nrow(X), list(X=X)))
rho$resp <- mkRespMod(fr, if(REML) p else 0L)
devfun <- mkdevfun(rho, 0L)
devfun(reTrms$theta) # one evaluation to ensure all values are set
if (devFunOnly) return(devfun)
lower <- reTrms$lower
# RJ's changes begin
opt <- switch(match.arg(optimizer),
bobyqa = {
if(!is.numeric(control$rhobeg)) control$rhobeg <- 0.0002
if(!is.numeric(control$rhoend)) control$rhoend <- 2e-7
rho$control <- control
# Delete unused options to prevent warning from showing up
control$FtolAbs <- NULL
control$FtolRel <- NULL
bobyqa(rho$pp$theta, devfun, lower, control=control)
},
NelderMead = {
## FIXME: this code is replicated in lmer/glmer/nlmer ...
## it seems good to have it in R rather than C++ code but maybe it should go within Nelder_Mead() ??
control$iprint <- switch(as.character(min(verbose,3L)), "0"=0, "1"=20,"2"=10,"3"=1)
xst <- rep.int(0.1, length(lower))
# RJ -- allow user control of xst, xt
ctl <- control
if(!is.numeric(control$xstFactor))
xstFactor <- 0.2
else
xstFactor <- control$xstFactor
if(!is.numeric(control$xtFactor))
xtFactor <- 0.0001
else
xtFactor <- control$xtFactor
ctl$xstFactor <- NULL
ctl$xtFactor <- NULL
Nelder_Mead(devfun, x0=rho$pp$theta, xst=xstFactor*xst, xt=xst*xtFactor, lower=lower, control=ctl)
})
if(optimizer=="NelderMead") {
if (opt$ierr < 0L) {
if (opt$ierr > -4L)
stop("convergence failure, code ", opt$ierr, " in NelderMead")
else
warning("failure to converge in", opt$control$maxfun, "evaluations")
}
} else if (optimizer=="bobyqa") {
if (opt$ierr > 0L)
warning("convergence problem, code ", opt$ierr, " in bobyqa")
}
# RJ's changes end
mkMerMod(environment(devfun), opt, reTrms, fr, mc)
}## { lmer }
On Mar 1, 2012, at 2:11 PM, Joehanes, Roby (NIH/NHLBI) [F] wrote:
Hi lme4 developers, I hope I can contribute to something positive to the community. I noticed that even in the latest version of lme4Eigen the lmer function only use Nelder-Mead optimizer. I would like to get Bobyqa optimizer back as an option (because I really like Bobyqa optimizer). So, I patched the code a little bit. I used the (hopefully) latest revision from SVN version 1631. I also added control for xst and xt factor multiplier (a FIXME item list). I hope this change is also acceptable. If you feel I am ignorant of your code style, I apologize. Here is the code (only the lmer function). I clearly marked my changes. If you need a diff file with the current HEAD, I'll be happy to provide you with one. The function seems to be running happily without error or warning on my machine. I hope the changes can make it to the main trunk. Sincerely, Roby
Apologies again. I submitted the changes to the tracker here: https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1861&group_id=60&atid=300
On Mar 1, 2012, at 2:15 PM, Joehanes, Roby (NIH/NHLBI) [F] wrote:
Ouch. Apparently, the mail daemon ate my attachment. So, here it is (pasted). Sorry for the mail bomb.
<snip>
Joehanes, Roby (NIH/NHLBI) [F] <roby.joehanes at ...> writes:
Ouch. Apparently, the mail daemon ate my attachment.
Would you be willing to send this as an SVN diff to lme4-authors <at> r-forge.wu-wien.ac.at ? Or post it to the issue tracker on r-forge ? Ben
Hi Ben, Thank you for the reply. I've added the diff patch against SVN rev 1631 in the issue tracker: https://r-forge.r-project.org/tracker/download.php/60/300/1861/147/lmer-svn1631-patch.txt See the corresponding page here: https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1861&group_id=60&atid=300 Also attached (cc-ed to the e-mail you mentioned). Apologies about the line ending. I am currently using a Mac. Roby
On Mar 1, 2012, at 3:32 PM, Ben Bolker wrote:
Joehanes, Roby (NIH/NHLBI) [F] <roby.joehanes at ...> writes:
Ouch. Apparently, the mail daemon ate my attachment.
Would you be willing to send this as an SVN diff to lme4-authors <at> r-forge.wu-wien.ac.at ? Or post it to the issue tracker on r-forge ? Ben
-------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: lmer-svn1631-patch.txt URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20120301/385ad7d1/attachment-0001.txt>
Hi, I would like to do something like this in pedigreemm: result <- pedigreemm(y ~ ...<the other factors>... + (1 | id), data=mydata, pedigree=list(id=mypedigree), na.action=na.omit) To my impression after reading pedigreemm's code, I need several observations with the same ID number and I cannot have each observation to have a different ID. Is this correct? Then, is it possible to have 1 observation having 1 ID? I don't think it is, given how the Z matrix for ID is encoded in lmer (and thereby pedigreemm) and how examples in the paper accompanying the package are constructed. Maybe I am incorrect or maybe I am doing an incorrect encoding. What I am getting at is genome-wide association study (GWAS)-like analysis that account for familial correlation matrix. Here, each observation comes from one individual, which is associated an individual ID and a pedigree ID. One pedigree ID contains essentially a family tree from great grandparents down to the little ones. The pedigree is somewhat complicated by step-parents (indicated by siblings sharing one parent, but not the other) and a few members of one pedigree intermarrying those of another. I have the inclination to specify the model as: result <- pedigreemm(y ~ ...<the other factors>... + (1 | ped_id), data=mydata, pedigree=list(ped_id =mypedigree), na.action=na.omit) To me, it doesn't seem to be correct since pedigreemm will treat observations having the same ped_id as repeat observations. If I understand the code correctly, pedigreemm will tie ped_id with the mypedigree$id. I cannot put in individual IDs into mypedigree$id since it will make pedigreemm associate IDs incorrectly. However, I cannot put in pedigree ID into mypedigree$id either since one pedigree ID consists of the whole family tree. The questions are: How can I deal with this problem? Should I break up the pedigree into nuclear families? How can I deal with the step-parents and intermarriage complications above? If pedigreemm was not designed for problems of this class, which other packages should I look into? Thank you, Roby
On Thu, 1 Mar 2012, Joehanes, Roby (NIH/NHLBI) [F] wrote:
Hi, I would like to do something like this in pedigreemm: To my impression after reading pedigreemm's code, I need several observations with the same ID number and I cannot have each observation to have a different ID. Is this correct? What I am getting at is genome-wide association study (GWAS)-like analysis that account for familial correlation matrix. Here, each observation comes from one individual, which is associated an individual ID and a pedigree ID. One pedigree ID The questions are: How can I deal with this problem? Should I break up the pedigree into nuclear families? How can I deal with the step-parents and intermarriage complications above? If pedigreemm was not designed for problems of this class, which other packages should I look into?
The pedigree() constructor is set up as animal breeders do things: the entire dataset is one big pedigree, even though there may be subcomponents that do not connect to one another. This is why sparse matrix representations are used. So, you just need unique IDs for everybody (eg concatenate pedigree with individual ID). This allows for intermarriage etc. If you think step-parents transmit your phenotype to step-children, then you need to add in a family environmental random effect (ie common to all members of an extended family, or perhaps just to that "nongenetic" nuclear family). If you have repeat measures on individuals, then you include multiple records with the same unique ID. Cheers, David Duffy
| David Duffy (MBBS PhD) ,-_|\ | email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / * | Epidemiology Unit, Queensland Institute of Medical Research \_,-._/ | 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v