OT - study design advice sought

3 messages · Neil Shephard, Murray Pung, Francesco Cernuto

#
Hi,

Apologies for the slightly off-topic post but I'm seeking advice on
how to analyse a set of data I've encountered.

This is not my choice of study design; it's something I'm being asked
to help a student with (personally I think it's a bit of a
data-dredging exercise, and I have explained this to them, but there
it is).

A group of cases with age-related macular degeneration (AMD) has been
collected from a clinic over a number of years, and because medical
records have been digitized for the past 20-30 years, details on their
hospital admissions and a lot of other aspects of their health are
available from both before and after the onset of AMD, up to the
present date.

A group of controls from the general population without AMD has been
age- and sex-matched to each of the cases, and data on hospital
admissions and so forth have also been extracted for the same
variables, from the first record available up to the present date.

The 'hypothesis' is that there are 'links' between AMD and
atherosclerotic disease, cardiovascular risk factors, systemic
inflammatory disease and neurodegenerative conditions.

The supervisor of this student has suggested that _all_ data should be
used, i.e. the occurrence of some atherosclerotic disease in a case or
control _after_ the onset of AMD in the case should be included.  This
to me does not make any sense at all.  A case-control study assumes a
temporal relationship in which the exposure occurs before the disease
does.  The aim is to assess whether certain diseases/risk factors
(atherosclerosis, classic CVD risk factors etc.) increase the risk of
AMD.  Analysing such events after the onset of disease turns that
question around, and mixing the two just completely messes things up
in my view.

I've read through relevant sections of Breslow & Day's "Statistical
Methods in Cancer Research - Vol 1", and Dawson & Trapp's "Basic &
Clinical Biostatistics" as well as sections on case-control and
similar study designs in the Wiley Encyclopedia of Biostatistics but
can't find anything that deals with analysing data in such a manner.

Is it valid to take events before and after the event of interest
(onset of AMD)?  Personally I don't think so, but if anyone has any
thoughts or insights into this I'd greatly appreciate them.

Apologies again for the off-topic post, and thanks for your time,

Neil
#
Hi,
as I learned from epidemiology, the product of a "case-control" study
is an exposure-odds ratio. Its components are a numerator from the case
series (exposure odds among incident cases of a disease) and a
denominator from the sampled referent series (exposure odds among a
sample from the same population in which the cases arise). After taking
a valid sample from this source population, the main threat to validity
is exposure misclassification, where information on exposure is not
comparable between cases and "controls".
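
To make the exposure-odds ratio concrete, here is a minimal Python
sketch for a plain (unmatched) 2x2 table. The function name and the
counts are hypothetical, chosen purely for illustration; a design with
individually matched controls would normally call for a matched
analysis instead.

```python
def exposure_odds_ratio(cases_exposed, cases_unexposed,
                        controls_exposed, controls_unexposed):
    """Exposure odds among cases divided by exposure odds among referents."""
    case_odds = cases_exposed / cases_unexposed            # numerator: case series
    referent_odds = controls_exposed / controls_unexposed  # denominator: referent series
    return case_odds / referent_odds


# Hypothetical counts, not taken from any real study.
odds_ratio = exposure_odds_ratio(cases_exposed=40, cases_unexposed=60,
                                 controls_exposed=25, controls_unexposed=75)
print(f"exposure odds ratio: {odds_ratio:.2f}")
```

With these made-up counts the exposure odds are 40/60 among cases and
25/75 among referents.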

In your task, you should first ensure that exposure information is
comparable between the two groups, but in any case I would not include
in the analysis information on exposure after the event occurred,
unless AMD is a relapsing disease.
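
As a rough sketch of what excluding post-event exposure could look like
in practice, the Python below counts a subject as exposed only if an
event is dated strictly before an index date. It assumes (my
assumption, not something stated in the thread) that each control is
given the AMD onset date of its matched case as its index date; the
function name and all dates are hypothetical.

```python
from datetime import date


def exposed_before_index(event_dates, index_date):
    """True if at least one event is dated strictly before the index date."""
    return any(d < index_date for d in event_dates)


# Hypothetical matched pair; the control inherits the case's onset date
# as its index date (an assumption made for this illustration).
index_date = date(2005, 6, 1)
case_events = [date(2001, 3, 14), date(2008, 9, 2)]  # one before, one after onset
control_events = [date(2007, 1, 20)]                 # only after the index date

case_exposed = exposed_before_index(case_events, index_date)
control_exposed = exposed_before_index(control_events, index_date)
print(case_exposed, control_exposed)
```

Events after the index date are simply ignored here, so the 2008 and
2007 admissions do not contribute to exposure status.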

For example, with smoking and lung cancer, smoking status in subjects
who have already had the event is not an issue, either for etiology or
for prognosis. The same is not true for ischemic heart disease, where
smoking status is a prognostic indicator of a further heart attack.

For an insight on the topic I suggest:
OS Miettinen. Estimability and estimation in case-referent studies.
Am J Epidemiol 1976;103:226-235

Francesco