Hi Stuart,
How many (if any) iterations completed before you got the seg fault?
Also, how much memory does your system have?
Currently at 4000 iterations, I have not reproduced the error so far
with this made up example (although I have practically finished an
algorithm to sort any array in O(log log n) while waiting to get this
far ~1 hour per thousand iterations on a 6 core 3.9GHZ machine).
require(MCMCglmm)
require(MASS)
set.seed(10)
rint <- rep(rnorm(14982, 0, 4), each = floor(2.44e6/14982))
ID <- factor(rep(1:14982, each = floor(2.44e6/14982)))
X <- MASS::mvrnorm(length(rint), mu = c(0, 0, 0),
Sigma = matrix(c(1, .3, .3, .3, 1, .3, .3, .3, 1), 3))
b <- matrix(c(1.2, -.5, 2))
ycont <- rnorm(length(ID), mean = 2 + rint + X %*% b)
yord <- cut(ycont, breaks = quantile(ycont, c(0, .2, .4, .6, .8, 1)),
include.lowest=TRUE, ordered_result=TRUE)
testdat <- data.frame(yord, ID, x1 = X[,1], x2 = X[, 2], x3 = X[, 3])
## > gc()
## used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 1117490 59.7 1710298 91.4 1368491 73.1
## Vcells 25878798 197.5 47836772 365.0 47753473 364.4
m <- MCMCglmm(yord ~ x1 + x2 + x3, random = ~ ID,
family = "ordinal", data = testdat,
prior = list(
R = list(V = 1, fix = 1),
G = list(
G1 = list(V = 1, nu = 0))),
nitt = 13000, thin = 10, burnin = 3000)
on a Win 8 pro x64 system with 32GB of memory and
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] MASS_7.3-22 ggplot2_0.9.3 MCMCglmm_2.17
[4] corpcor_1.6.4 ape_3.0-6 coda_0.16-1 Matrix_1.0-10
[8] lattice_0.20-10 tensorA_0.36
On Fri, Jan 18, 2013 at 1:55 PM, Stuart Luppescu
<slu at ccsr.uchicago.edu> wrote:
Hello, I'm having problems running a simple ordinal outcome mixed
effects model, and I'm thinking it may be because of the size of the
dataset (or, it very may well be that I'm not specifying the model
correctly). I must confess to insecurity about how to specify the
priors. Here is the structure of the data frame (with columns not in
this model omitted). Note that there are more than 2.4 million rows. Is
that a problem?
str(all.subj)
'data.frame': 2438922 obs. of 112 variables:
$ gr10 : num 0 0 0 0 0 0 1 0 1 0 ...
$ gr11 : num 0 0 0 1 1 1 0 0 0 0 ...
$ gr12 : num 1 1 1 0 0 0 0 1 0 0 ...
$ tid : Factor w/ 14982 levels
"........","A.D46607",..: 2 2 2 2 2 2 2 2 2 2 ...
$ final.points : Ord.factor w/ 5 levels
"0"<"1"<"2"<"3"<..: 4 4 4 2 3 3 2 2 3 2 ...
Here are two attempts and their results:
glmm.uncond <- MCMCglmm(final.points ~ gr10 + gr11 + gr12,
prior=list(R=list(V=1, fix=1),
G=list(G1=list(V=1, nu=0))),
random = ~tid ,
family = "ordinal",
nitt=100000,
data = all.subj)
Error: segfault from C stack overflow
glmm.uncond <- MCMCglmm(final.points ~ gr10 + gr11 + gr12,
prior=list(R=list(V=1, nu=0),
G=list(G1=list(V=1, nu=0))),
random = ~tid ,
family = "ordinal",
nitt=100000,
data = all.subj)
Process R segmentation fault (core dumped) at Fri Jan 18 12:53:49 2013
Here is my sessionInfo
R version 2.15.1 (2012-06-22)
Platform: x86_64-redhat-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] foreign_0.8-52 MCMCglmm_2.17 corpcor_1.6.4 ape_3.0-6
[5] coda_0.16-1 Matrix_1.0-10 lattice_0.20-13 tensorA_0.36
loaded via a namespace (and not attached):
[1] compiler_2.15.1 gee_4.13-18 grid_2.15.1 nlme_3.1-107
[5] tools_2.15.1
and memory info.
gc()
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 1196029 63.9 1835812 98.1 1710298 91.4
Vcells 678385919 5175.7 1777423119 13560.7 2114537169 16132.7
Any help will be appreciated.
--
Stuart Luppescu <slu at ccsr.uchicago.edu>
University of Chicago