Pulling specific parameters from models to prevent exhausting memory.
Thanks, Cesko. I'll look into BAM. James
From: Voeten, C.C. <c.c.voeten at hum.leidenuniv.nl>
Sent: Sunday, October 18, 2020 1:16 AM
To: Ades, James <jades at health.ucsd.edu>; r-sig-mixed-models at r-project.org <r-sig-mixed-models at r-project.org>
Subject: RE: Pulling specific parameters from models to prevent exhausting memory.
Hi James,

You may have luck using mgcv::bam instead of lme4. It can also fit random-slopes models and is optimized for "big data", both in memory usage and in computational efficiency. The modeling syntax is slightly different, though; the correct translation of lme4 random effects into mgcv's s(..., bs='re') terms depends on whether timepoint.nu is a covariate or a factor.

HTH,
Cesko

> -----Original Message-----
> From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> On Behalf Of Ades, James
> Sent: Sunday, October 18, 2020 2:01 AM
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] Pulling specific parameters from models to prevent exhausting memory.
>
> Hi all,
>
> I'm modeling fMRI imaging data using lme4. There are 4 time points and roughly 550 subjects
> with 27,730 regions of interest (these are the variables). Since I have access to a
> supercomputer, my thought was to create a long dataset with repeated measures of regions of
> interest per time point, and of subjects over the 4 time points; this comes to roughly
> 70 million observations, hence the supercomputer. timepoint is a discrete factor and
> timepoint.nu is the numeric time point. I'm using the model below:
>
> lmer(connectivity ~ roi * timepoint + (timepoint.nu | subjectID) +
>        (timepoint.nu | subjectID:roi),
>      na.action = 'na.exclude',
>      control = lmerControl(optimizer = "nloptwrap", calc.derivs = FALSE),
>      REML = FALSE, data = data)
>
> I received the following error: "cannot allocate vector of size 30206.2 Gb. Execution halted."
>
> So I'm wondering how I can pull only the essential parameters I need (group means vs.
> individual fixed effects) while modeling, so that the supercomputer can finish the job
> without exhausting memory. I say group means because I will eventually be adding covariates.
>
> Also, the supercomputer rules require that the job finish within two days. I'm not sure this
> one would, so I'm wondering whether there is any way to parallelize lme4 so that I could
> make use of multiple cores and nodes.
>
> I've included a slice of the data here:
> https://drive.google.com/file/d/1mhTj6qZZ2nT35fXUuYG_ThQ-QtWbb-8L/view?usp=sharing
>
> Thanks much,
>
> James
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
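[Editor's note: Cesko's suggested lme4-to-bam translation might look roughly as follows. This is a hedged sketch, not a drop-in replacement: it assumes timepoint.nu is numeric and that subjectID, roi, and timepoint are factors, and it is worth noting that bs = "re" terms fit *uncorrelated* random intercepts and slopes, whereas lme4's (timepoint.nu | subjectID) estimates their correlation.]

```r
library(mgcv)

## Hypothetical translation of the lmer() call above into mgcv::bam.
## Each s(..., bs = "re") term is an i.i.d. Gaussian random effect;
## intercepts and slopes are modeled as independent (unlike lme4).
m <- bam(connectivity ~ roi * timepoint +
           s(subjectID, bs = "re") +                   # random intercept per subject
           s(subjectID, timepoint.nu, bs = "re") +     # random time slope per subject
           s(subjectID, roi, bs = "re") +              # random intercept per subject:roi
           s(subjectID, roi, timepoint.nu, bs = "re"), # random time slope per subject:roi
         data = data,
         discrete = TRUE,  # discretize covariates: large memory/speed savings
         nthreads = 8)     # bam can use multiple threads on one node
```

With discrete = TRUE, bam avoids forming the full model matrix in memory, which is exactly the failure mode behind the "cannot allocate vector" error; nthreads addresses the two-day wall-clock limit on a single node, though it does not parallelize across nodes.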