Adjusting for random recording intervals in glmer/poisson

3 messages · Joshua Wiley, Dieter Menne

#
In a clinical study, events in patients were observed during multiple visits.
On each visit, a continuous predictor variable was also recorded. The
Poisson-distributed number of events is the endpoint of the study.

The following model would be suitable

glmer(nevent~predictor + (1|subj),data=d, family=poisson)

but there is a catch: the recording interval on each day varies randomly from
30 to 60 minutes, unrelated to study parameters. The statistical consultant
at the university recommended the conservative solution of truncating ALL
records to the first 30 minutes and discarding the tails, but the PhD student
who did the study was not too happy to lose all data beyond 30 minutes.

A compromise would be to normalize all data to events/45 minutes (or
median(duration)), assuming that the variance in duration is not too large.

Is there a better way to factor out the nuisance parameter duration?

Dieter
#
Hi Dieter,

I do not think that I understand the question or problem very well.
What is the significance of the recording interval varying?  If the
issue is that with a longer recording time, there are more
opportunities for events to occur, then what about treating duration
as an exposure and including it in the offset?  Essentially you model
rate then rather than counts.

Again, apologies if I am grossly misunderstanding the issue.

Cheers,

Josh


On Wed, Jul 4, 2012 at 10:35 PM, Dieter Menne
<dieter.menne at menne-biomed.de> wrote:
--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
#
Joshua Wiley wrote:

            
Good to hear that you suggest putting it into the offset. I wanted to do this,
but was not sure what exactly to put into the offset term: duration or
log(duration)?

Dieter


Apologies: I forgot to attach the simulated sample data to the original message.

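library(lme4)
nsubj = 10
nvisit = 5
set.seed(100)
d = data.frame(
  # subject ids 1..nsubj, repeated once per visit
  subj = factor(rep(1:nsubj, times = nvisit)),
  duration = runif(nsubj * nvisit, 30, 60),  # recording time in minutes
  predictor = rnorm(nsubj * nvisit, 50, 10))
# expected count is proportional to both predictor and duration
d$nevent = with(d, rpois(nsubj * nvisit, predictor * duration / 500))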

# Proposed solution by university statistician:
# use only the data from the first 30 minutes (not shown here) and do
glmer(nevent ~ predictor + (1 | subj), data = d, family = poisson)
# Note: fitted here to the full (untruncated) counts, so this result is
# not correct

# Proposed by Joshua: duration enters as an exposure through the offset
glmer(nevent ~ predictor + offset(log(duration)) + (1 | subj), data = d,
      family = poisson)
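
On the question of duration versus log(duration): a short sketch of why the log appears, assuming events occur at a constant rate within a recording. The symbols below are notation introduced here for illustration and are not from the thread: \lambda_{ij} is the event rate per minute for subject i at visit j, t_{ij} the recording duration, x_{ij} the predictor, and b_i the subject random intercept.

```latex
\begin{align*}
  \mu_{ij} &= \mathrm{E}[n_{ij}] = \lambda_{ij}\, t_{ij} \\
  \log \mu_{ij} &= \log t_{ij} + \log \lambda_{ij} \\
                &= \log t_{ij} + \beta_0 + \beta_1 x_{ij} + b_i,
  \qquad b_i \sim \mathcal{N}(0, \sigma_b^2)
\end{align*}
```

Under the log link of the Poisson family, log(duration) therefore enters the linear predictor with its coefficient fixed at 1, which is what offset(log(duration)) expresses. Using offset(duration) would instead imply that the expected count grows like exp(duration). In the simulated data above, the Poisson mean predictor*duration/500 is proportional to duration, which is consistent with the log(duration) offset.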