Missing data augmentation

2 messages · Jonck van der Kogel, John Fox

Hi all,
A short while ago I asked a question about multiple imputation and I 
got several helpful replies, thanks! Since then I have tried to use the 
packages mice and norm, but both give me errors.

mice does not even run to start with and gives me the following error 
right away:
iter imp variable
   1   1  Liquidity.ratioError in chol((v + t(v))/2) : the leading minor 
of order 1 is not positive definite

To be honest I have no idea what that error message means, so my 
experiments with mice were short-lived :-)

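I then tried the package "norm". I got some way with it, following the 
help file:
library(norm)
s <- prelim.norm(as.matrix(myDataSet))  # preprocess data, sort missingness patterns
thetahat <- em.norm(s)                  # EM estimates, used as starting values
rngseed(1234567)                        # seed the generator before da.norm
theta <- da.norm(s, thetahat, steps=20, showits=TRUE)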

At this stage however I get the following error:
Steps of Data Augmentation:
1...2...Error: NA/NaN/Inf in foreign function call (arg 2)

This seems strange to me, since the whole purpose of this routine is to 
work with NA values. So why is it complaining about NA values?

After this I got it to work in an unlikely fashion: I first 
standardized my dataset using scale(). After that I was able to run the
"theta <- da.norm(s, thetahat, steps=20, showits=TRUE)" line 
successfully. That seems strange to me, since s still contains NA 
values, so why is it not complaining about them this time? I have 
repeated the process several times with subsets of my original dataset, 
and the same problems arise each time.

However, standardizing, estimating the missing values, imputing them and 
then standardizing again does not seem the correct way to go to me. In 
my opinion the correct way of doing things would be to impute the 
missing values and then standardize the dataset. In other words, the 
way that seems correct to me is not working.

Any helpful comments on the problems described would be much 
appreciated!
Thanks, Jonck
1 day later
Dear Jonck,

I was hoping that someone with more experience with mice and norm would 
pick up this question, but perhaps the following will help:

Without seeing your data, it's hard to determine the source of the problem; 
of course, I wouldn't necessarily be able to do that even with the data.
At 08:25 PM 6/14/2003 +0200, Jonck van der Kogel wrote:
> Error in chol((v + t(v))/2) : the leading minor of order 1 is not
> positive definite

If I remember correctly, leading minors are determinants of square 
submatrices starting at row and column 1; the leading minor of order 1 is 
therefore just the entry in the first row, first column; for it to be "not 
positive definite" suggests that it is 0 or negative. What exactly v is I 
can't say, but using traceback() might help you locate the problem more 
specifically. Addressing questions to the authors of mice might also help.
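
As a minimal sketch of how such a message can arise (an illustration 
only: the toy data and the name toy are assumptions, and nothing here 
shows what v actually is inside mice), a column with zero variance puts 
a 0 in the first diagonal entry of the covariance matrix, and chol() 
then stops at the leading minor of order 1:

```r
# Toy data (hypothetical): the first column is constant, so its variance is 0.
toy <- cbind(const = rep(5, 6),
             x     = c(1.2, 0.7, 3.1, 2.4, 1.9, 0.3))
v <- cov(toy)

# Symmetrize as in the expression from the error message, then attempt the
# Cholesky factorization, capturing the error message instead of stopping.
msg <- tryCatch({
  chol((v + t(v)) / 2)
  "no error"
}, error = function(e) conditionMessage(e))

print(v[1, 1])  # the leading minor of order 1 is just this entry
print(msg)
```

Calling traceback() immediately after the real error would list the 
chain of function calls that led to chol().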

> 1...2...Error: NA/NaN/Inf in foreign function call (arg 2)
> [...] So why is it complaining about NA values?

Actually, the error message is less specific than that and suggests a 
numerical problem in the data augmentation step. Since both programs are 
producing numerical errors, I'd suspect some problem, such as 
ill-conditioning, in the data.
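
One rough way to check for ill-conditioning of this kind is to look at 
the condition number of the covariance matrix before and after 
standardizing. The sketch below uses simulated columns on very 
different scales; the data and the names x1, x2, dat, k_raw and k_std 
are assumptions for illustration, not the poster's dataset:

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n, mean = 1e6, sd = 1e5)    # hypothetical large-scale variable
x2 <- rnorm(n, mean = 0.5, sd = 0.01)   # hypothetical ratio-like variable
dat <- cbind(x1, x2)

# 2-norm condition number of the covariance matrix on the raw scale...
k_raw <- kappa(cov(dat), exact = TRUE)
# ...and after centering and scaling each column with scale().
k_std <- kappa(cov(scale(dat)), exact = TRUE)

print(c(raw = k_raw, standardized = k_std))
```

A very large condition number on the raw scale that drops sharply after 
scale() would be consistent with the behaviour described in the 
question. With missing values present, cov() needs an argument such as 
use = "complete.obs".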
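
> I first standardized my dataset using scale(). After that I was able
> to run the [...] line successfully.

It's odd that scaling the data helps since I believe that norm does this 
itself.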
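
> In my opinion the correct way of doing things would be to impute the
> missing values and then standardize the dataset.

I'm not sure that I follow that. You can always undo the standardization at 
the end, but perhaps I'm missing something.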
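
A minimal sketch of undoing the standardization, assuming the data were 
standardized with scale(), which stores the column means and standard 
deviations as attributes. The toy matrix x and the fill-in step are 
placeholders (assumptions), not a real imputation:

```r
# Toy data (hypothetical) with missing values.
x <- cbind(a = c(10, 20, NA, 40),
           b = c(1.5, NA, 2.5, 3.5))

z   <- scale(x)                   # NAs are omitted when computing means and sds
ctr <- attr(z, "scaled:center")   # column means used by scale()
scl <- attr(z, "scaled:scale")    # column standard deviations used by scale()

# Placeholder for the imputation step: fill each NA with 0, i.e. the column
# mean on the standardized scale. A real analysis would use imputed draws.
z_imp <- z
z_imp[is.na(z_imp)] <- 0

# Back-transform to the original units: multiply by the sd, then add the mean.
x_back <- sweep(sweep(z_imp, 2, scl, "*"), 2, ctr, "+")
print(x_back)
```

The observed entries come back unchanged, and the filled-in entries are 
expressed in the original units, so the imputed dataset can then be 
standardized or analysed in whatever way is wanted.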

I hope that these remarks are of some use,
  John

-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox