Hi all,

A short while ago I asked a question about multiple imputation and got several helpful replies, thanks! So far I have tried the packages mice and norm, but both give me errors.

mice does not even start; it fails right away with the following error:

    iter imp variable
    1 1 Liquidity.ratioError in chol((v + t(v))/2) : the leading minor of order 1 is not positive definite

To be honest I have no idea whatsoever what that error message means, so my experiments with mice were short-lived :-)

I then tried the package "norm". I got some way with the experiment, following the help file:

    s <- prelim.norm(as.matrix(myDataSet))
    thetahat <- em.norm(s)
    rngseed(1234567)
    theta <- da.norm(s, thetahat, steps=20, showits=TRUE)

At this stage, however, I get the following error:

    Steps of Data Augmentation:
    1...2...Error: NA/NaN/Inf in foreign function call (arg 2)

This seems strange to me, since the whole purpose of this routine is to work with NA values. So why is it complaining about NA values?

After this I got it to work in an unlikely fashion: I first standardized my dataset using scale(). After that I was able to run the "theta <- da.norm(s, thetahat, steps=20, showits=TRUE)" line successfully. This seems strange to me, since s still contains NA values, so why is it not complaining about them this time? I have repeated the process several times with subsets of my original dataset, and the same problems arise each time.

Standardizing, calculating the missing values, imputing them and then standardizing again does not seem the correct way to go, however. In my opinion the correct way would be to impute the missing values and then standardize the dataset. In other words, the way that seems correct to me is not working.

Any helpful comments on the problems described would be much appreciated!

Thanks,
Jonck
Missing data augmentation
2 messages · Jonck van der Kogel, John Fox
1 day later
Dear Jonck,

I was hoping that someone with more experience with mice and norm would pick up this question, but perhaps the following will help. Without seeing your data, it's hard to determine the source of the problem; of course, I wouldn't necessarily be able to do that even with the data.
At 08:25 PM 6/14/2003 +0200, Jonck van der Kogel wrote:
Hi all, A short while ago I asked a question about multiple imputation and got several helpful replies, thanks! So far I have tried the packages mice and norm, but both give me errors. mice does not even start; it fails right away with the following error:

    iter imp variable
    1 1 Liquidity.ratioError in chol((v + t(v))/2) : the leading minor of order 1 is not positive definite

To be honest I have no idea whatsoever what that error message means, so my experiments with mice were short-lived :-)
If I remember correctly, leading minors are determinants of square submatrices starting at row and column 1; the leading minor of order 1 is therefore just the entry in the first row, first column; for it to be "not positive definite" suggests that it is 0 or negative. What exactly v is I can't say, but using traceback() might help you locate the problem more specifically. Addressing questions to the authors of mice might also help.
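To make the point about the leading minor of order 1 concrete, here is a minimal base-R sketch. It does not use mice and says nothing about what v actually is in that package; the constant column below is only a hypothetical way for a zero to end up in the first row and column of a covariance matrix.

```r
# Hypothetical data: the first column is constant, so its variance is exactly 0.
x <- cbind(const = rep(5, 10), y = 1:10)
v <- cov(x)

# chol() needs a positive definite matrix; here v[1, 1] is 0, so the
# factorization fails on the very first leading minor.
msg <- tryCatch({
  chol(v)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# For comparison, a positive definite matrix factors without complaint.
good <- matrix(c(2, 1, 1, 2), nrow = 2)
r <- chol(good)
print(r)
```

If something like this is behind the mice error, looking at the variance of the first variable being imputed (among the rows used to fit the imputation model) might be a reasonable first check.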
I then tried the package "norm". I got some way with the experiment, following the help file:

    s <- prelim.norm(as.matrix(myDataSet))
    thetahat <- em.norm(s)
    rngseed(1234567)
    theta <- da.norm(s, thetahat, steps=20, showits=TRUE)

At this stage, however, I get the following error:

    Steps of Data Augmentation:
    1...2...Error: NA/NaN/Inf in foreign function call (arg 2)

This seems strange to me, since the whole purpose of this routine is to work with NA values. So why is it complaining about NA values?
Actually, the error message is less specific than that and suggests a numerical problem in the data augmentation step. Since both programs are producing numerical errors, I'd suspect some problem, such as ill-conditioning, in the data.
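One rough way to look for ill-conditioning is to compare the condition number of the covariance matrix before and after standardizing. The sketch below uses made-up data with one badly scaled column (the names a and b are illustrative, not from the original dataset) and only base R; it is a diagnostic idea, not a description of what mice or norm do internally.

```r
set.seed(1)

# Made-up data: two unrelated variables on wildly different scales,
# with a few missing values in the first one.
x <- cbind(a = rnorm(50), b = rnorm(50) * 1e6)
x[sample(50, 5), "a"] <- NA

# Condition number of the covariance matrix, using the available pairs.
cond_raw <- kappa(cov(x, use = "pairwise.complete.obs"), exact = TRUE)

# scale() ignores NAs when centering and scaling, so it can be applied first.
cond_std <- kappa(cov(scale(x), use = "pairwise.complete.obs"), exact = TRUE)

print(c(raw = cond_raw, standardized = cond_std))
```

A very large condition number on the raw data that drops sharply after scale() would be consistent with the observation that standardizing made the data augmentation run.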
After this I got it to work in an unlikely fashion: I first standardized my dataset using scale(). After that I was able to run the "theta <- da.norm(s, thetahat, steps=20, showits=TRUE)" line successfully. This seems strange to me, since s still contains NA values, so why is it not complaining about them this time? I have repeated the process several times with subsets of my original dataset, and the same problems arise each time.
It's odd that scaling the data helps since I believe that norm does this itself.
Standardizing, calculating the missing values, imputing them and then standardizing again does not seem the correct way to go, however. In my opinion the correct way would be to impute the missing values and then standardize the dataset. In other words, the way that seems correct to me is not working.
I'm not sure that I follow that. You can always undo the standardization at the end, but perhaps I'm missing something.

I hope that these remarks are of some use,
John

-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
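As a sketch of the remark about undoing the standardization: scale() stores the column means and standard deviations as attributes, so imputed values can be mapped back to the original units. The data are made up, and filling the gaps with 0 is only a placeholder for whatever imputation step is actually used (in standardized units it amounts to plain mean imputation, which is not what norm does).

```r
# Made-up data with missing values.
x <- cbind(a = c(1, 2, NA, 4), b = c(10, NA, 30, 40))

# Standardize; NAs are ignored when the means and sds are computed.
z <- scale(x)
ctr <- attr(z, "scaled:center")
scl <- attr(z, "scaled:scale")

# Placeholder imputation (an assumption for illustration only):
# fill missing standardized values with 0, i.e. the column mean.
z_imp <- z
z_imp[is.na(z_imp)] <- 0

# Undo the standardization: multiply by the sd, then add back the mean.
x_back <- sweep(sweep(z_imp, 2, scl, "*"), 2, ctr, "+")
print(x_back)
```

The observed entries come back unchanged, and the filled-in entries end up on the original scale, so the completed data can then be standardized again or analyzed as they are.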