Hi All,

I'm wondering if someone could clarify how to correctly specify the ar1 structure in glmmTMB. I have read the help page and the vignette (https://cran.r-project.org/web/packages/glmmTMB/vignettes/covstruct.html), and understand that the general formula is ar1(time + 0 | grouping variable), where time is a factor. But how does this work when time consists of two separate variables (i.e., date and hour)?

For instance, the dataset I'm working with consists of several ids. Each id contains multiple days of data, and each day contains multiple measurements (one per hour). The structure in its basic form (no additional variables) would be something like:

    library(dplyr)    # mutate(), %>%
    library(tidyr)    # unite()
    library(glmmTMB)

    df <- data.frame(id = rep(seq(1, 5, 1), 3),
                     date = rep(seq(lubridate::ymd("2020-03-01"), lubridate::ymd("2020-03-05"), 1), 3),
                     hour = rep(seq(5, 9, 1), 3),
                     value = rnorm(15, 5, 2))

Now change 'hour' and 'date' to class factor and create a unique grouping variable called 'id_date', consisting of each id paired with each date:

    df <- df %>%
      mutate(hour_factor = as.factor(hour), date_factor = as.factor(date)) %>%
      unite(id_date, id, date_factor, remove = FALSE)

And now the model:

    glmmTMB(value ~ ar1(hour_factor | id_date) + (1 | id), data = df)

Is this the correct specification for the ar1 structure when 'hour' is nested within 'date' and 'id' is a random effect? Or should 'date' and 'hour' be combined into a single variable (e.g. 2020-03-01 05:00:00) and then converted to a factor, with 'id' as the grouping variable?

    glmmTMB(value ~ ar1(date_hour | id) + (1 | id), data = df)

In my own dataset, both models run, but they produce different estimates, p-values and residuals (note that the example here won't run because it has too few observations).

Thanks,
Simon
Correct specification of ar1 structure in glmmTMB
1 message · Simon Tapper
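
For readers who want to try the two specifications from the question on data with the described shape (several ids, several days per id, hourly measurements within each day), here is a minimal sketch. The balanced 5 x 5 x 5 layout, the random values and the object names `stamp`, `m1` and `m2` are assumptions made for illustration; they are not part of the original post. The `+ 0` in the ar1 terms follows the general formula quoted from the vignette. The sketch only shows how each grouping is built and does not settle which model is appropriate.

```r
# Hypothetical balanced toy data: 5 ids x 5 dates x 5 hours = 125 rows.
set.seed(1)
df <- expand.grid(
  hour = 5:9,
  date = seq(as.Date("2020-03-01"), as.Date("2020-03-05"), by = "day"),
  id   = 1:5
)
df$value <- rnorm(nrow(df), mean = 5, sd = 2)

# Specification 1: hour is the ar1 time factor, one series per id-date pair.
df$hour_factor <- factor(df$hour)
df$id_date     <- interaction(df$id, df$date, sep = "_", drop = TRUE)

# Specification 2: one combined timestamp factor, one series per id.
# Hours are zero-padded so that sorted levels are in chronological order.
stamp        <- sprintf("%s %02d:00:00", format(df$date), df$hour)
df$date_hour <- factor(stamp, levels = sort(unique(stamp)))
df$id        <- factor(df$id)

# Fit both models only if glmmTMB is installed.
if (requireNamespace("glmmTMB", quietly = TRUE)) {
  library(glmmTMB)
  m1 <- glmmTMB(value ~ ar1(hour_factor + 0 | id_date) + (1 | id), data = df)
  m2 <- glmmTMB(value ~ ar1(date_hour + 0 | id) + (1 | id), data = df)
}
```

The two forms describe different correlation structures, which is one plausible reason for differing estimates. In the first, the autocorrelation restarts for every id-date series. In the second, it runs across days within an id, and because ar1 treats consecutive factor levels as equally spaced steps, the overnight gap between the last hour of one day and the first hour of the next counts as a single step.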