I am using the mice package to impute some missing values, and it works nicely. I am facing a tricky strategic question, though.

Basically, I am working on predictors of myocardial infarction, with all patients having baseline features (e.g. age, gender), apart from a few missing values. Some patients have also undergone a stress test, which yields specific continuous details (e.g. stress duration), but others haven't.

What should I do to capture the information associated with the stress test features? A complete case analysis will of course exclude all those without a stress test (roughly 50%). Is it reasonable to use mice to impute the stress features also among those who did not undergo any stress test? Or would it be better to create a factor variable such as stress_status (0 - no stress, 1 - stress with low tolerance, 2 - stress with high tolerance, and so forth)?

Thanks for the help,
Giuseppe
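For concreteness, the second option (a stress_status factor) could be built in base R roughly as follows. This is only a sketch of the coding step, not a recommendation of one strategy over the other: the toy data, the column name stress_duration and the 8-minute cut-off are all invented for illustration. It also assumes that an NA in stress_duration always means "no stress test performed"; a test that was performed but has a missing duration would need a separate flag.

```r
# Hypothetical toy data: stress_duration is NA for patients without a stress test.
dat <- data.frame(
  age = c(54, 61, 47, 70, 66, 58),
  stress_duration = c(NA, 4.5, 11.0, NA, 7.2, 9.8)  # minutes; values invented
)

# Assumed cut-off, for illustration only: under 8 minutes counts as low tolerance.
cutoff <- 8

# Code the three groups; patients with no test get their own level instead of NA.
dat$stress_status <- ifelse(
  is.na(dat$stress_duration), "no stress",
  ifelse(dat$stress_duration < cutoff, "low tolerance", "high tolerance")
)
dat$stress_status <- factor(
  dat$stress_status,
  levels = c("no stress", "low tolerance", "high tolerance")
)

# Tabulate the resulting groups
table(dat$stress_status)
```

With such a factor, no patient is dropped because of a missing stress test, and the remaining gaps in the baseline features can still be handled separately.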
mice package
2 messages · Giuseppe Biondi Zoccai, Bert Gunter
While your queries certainly intersect R, they are mostly about statistical methodology for this special kind of missing data. This list is mostly about R programming. I think you would do better posting to a statistics list, like stats.stackexchange.com. Advice there might bring you back here to ask about R implementation, but that's not your current concern.

Cheers,
Bert

On Monday, March 7, 2016, Giuseppe Biondi Zoccai <gbiondizoccai at gmail.com> wrote:
I am using the mice package to impute some missing values, and it works
nicely.
I am facing a tricky strategic question though.
Basically, I am working on predictors of myocardial infarction, with all
patients having baseline features (eg age, gender), despite a few missing
values.
Some patients have performed also a stress test, with specific continuous
details (eg stress duration), but others haven't.
What should I do to capture the information associated with stress test
features?
A complete case analysis will of course exclude all those without a stress
test (roughly 50%).
Is it reasonable to impute with mice the stress features among also those
who did not undergo any stress test?
Or should I best create a factor variable such as stress_status (0- no
stress, 1-stress with low tolerance, 2-stress with high tolerance, and so
forth)?
Thanks for the help
Giuseppe
Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)