computationally singular error with mice()

7 messages · Joshua Wiley, Fei, Weidong Gu

Fei
#
When running the above command, I got this error:

 iter imp variable
  1   1  medu
Error in solve.default(xtx + diag(pen)) : 
  system is computationally singular: reciprocal condition number =
1.16487e-025/

What does that mean? How can I address this issue? My dataframe has 257
observations for 99 variables. Could this be the origin of the problem?
Could anyone help me out? Thank you!

Fei

--
View this message in context: http://r.789695.n4.nabble.com/computationally-singular-error-with-mice-tp4109583p4109583.html
Sent from the R help mailing list archive at Nabble.com.
#
Hi Fei,
On Fri, Nov 25, 2011 at 7:20 PM, Fei <fayechen0807 at hotmail.com> wrote:
Yes, your data is likely the origin of the problem.
Please see below---you really need to provide a reproducible example.
Right now, I can only suggest that something with variable medu (or
its relation in some model) appears to be problematic.  If you give us
a reproducible example, we can help you come up with ways around it.
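
For readers who have not met this error before, here is a minimal base R sketch of how a redundant predictor leads to it. The data are made up and are not from this thread. The ridge factor of 1e-5 is an assumption chosen for illustration, prompted only by the `xtx + diag(pen)` term visible in the error message.

```r
## Toy design matrix: column b is an exact multiple of column a
x <- 1:5
X <- cbind(a = x, b = 2 * x)
xtx <- crossprod(X)                # same as t(X) %*% X

## Reciprocal condition number; values near 0 indicate a singular matrix
rc <- rcond(xtx)

## solve() refuses to invert xtx; capture the message instead of stopping
msg <- tryCatch({
  solve(xtx)
  "no error"
}, error = function(e) conditionMessage(e))

## A small ridge penalty on the diagonal (factor assumed for illustration)
pen <- 1e-5 * diag(xtx)
xtx_ridge_inv <- solve(xtx + diag(pen))
```

In this toy case the penalty is enough to make the matrix invertible. With real data it may not be, which is why locating the redundant variable is usually the better fix.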

This is purely speculation, but often when I see people simply use an
entire data set in MI, it is a sign they do not really know what they
are doing.  This is risky because a bad imputation model can be worse
than not doing it at all.  Have you carefully examined all 99
variables in the dataset?  Have you considered the class of each?  Do
you know what the different models available for imputation are and
have you considered whether or not you want to simply use the defaults
for all 99 variables?  It is possible to specify a different model
and method for each variable.  If you are imputing 99 variables, you
also should really be checking the results for all 99.  An excellent
book on missing data in general is Statistical Analysis with Missing
Data by Little and Rubin (2002).  It is not the easiest read ever, but
it has very useful information.  If you do not have experience and do
not have interest/time to read and learn about MI, I would strongly
urge you to seek the advice of a local statistician. As I said, bad
imputation can be worse than no imputation.  If you do know what you
are doing and carefully selected those 99 variables from a larger
dataset, checked them, etc. please ignore this paragraph with my
apologies.
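
As a rough sketch of what examining the variables could look like in base R, the following tabulates the class, the number of missing values, and the number of distinct observed values per column. The data frame `dat` and its columns (apart from the name `medu`) are hypothetical stand-ins, not the poster's data.

```r
## Hypothetical stand-in for the real data
dat <- data.frame(
  medu   = c(12, NA, 16, 9, NA, 12),
  status = factor(c("married", "single", NA, "married", "single", "married")),
  site   = rep("A", 6),            # constant column: carries no information
  stringsAsFactors = FALSE
)

## One row per variable: class, missing count, distinct observed values
overview <- data.frame(
  class     = sapply(dat, function(v) class(v)[1]),
  n_missing = colSums(is.na(dat)),
  n_unique  = sapply(dat, function(v) length(unique(v[!is.na(v)]))),
  row.names = names(dat)
)
print(overview)

## Columns with a single observed value cannot help any imputation model
constant_vars <- rownames(overview)[overview$n_unique <= 1]
```

Character columns that should be treated as categorical would need converting to factors before imputation, and constant columns are natural candidates for removal.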

Sincerely,

Josh
^^^^^^^^^^^^^^^^^^^^^ !!!!!this is important!!!!
this too!!! particularly the "reproducible code" part

  
    
Fei
#
Hi Josh,

Thanks for the kind reminder to post the data frame. My data frame
contains lots of categorical variables, which seem to be problematic.  For
instance,

dob        status         edu               mrext
1111      married       highschool   yes, full time

Do you know how to specify the imputation methods and the visitSequence so
that those categorical variables are not involved in the imputation process?
Thank you.

Fei



#
Hi Fei,

I wouldn't worry too much about categorical variables for mice. By default,
mice uses logistic regression for binary variables and polytomous logistic
regression for categorical variables with >2 levels. However, you
should not include factors with a lot of levels, say >30, in
imputation models because they would require a lot of dummy variables.

Another thing: do not exclude from the imputation model any variables that
you will use in the substantive analysis. Otherwise, the estimates would be
biased.
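
A small base R sketch of the level-count check described above, on made-up data. The cutoff of 30 is the rule of thumb from this message, and the variable names are hypothetical.

```r
## Hypothetical data: one binary factor, one factor with 40 levels
dat <- data.frame(
  smoker = factor(rep(c("no", "yes"), length.out = 100)),
  town   = factor(rep(sprintf("town%02d", 1:40), length.out = 100)),
  age    = seq(20, 69.5, by = 0.5)
)

## Number of levels for each factor column
is_fac   <- sapply(dat, is.factor)
n_levels <- sapply(dat[is_fac], nlevels)
too_many <- names(n_levels)[n_levels > 30]

## Predictor columns each variable contributes once dummy-coded:
## a factor with L levels gives L - 1 dummies, a numeric variable gives 1
n_dummies <- ifelse(is_fac, sapply(dat, nlevels) - 1, 1)
total_predictor_columns <- sum(n_dummies)
```

Comparing `total_predictor_columns` with the number of rows gives a quick sense of whether the imputation models are being asked to estimate too much.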

Weidong
On Sat, Nov 26, 2011 at 12:07 PM, Fei <fayechen0807 at hotmail.com> wrote:
Fei
#
Hi Weidong,

Thank you for the clear explanation. You are right: it is not the categorical
variables that are causing the trouble. The problem may instead be the
relatively small sample size given so many variables. I excluded some
variables that are not essential to the analyses I am going to conduct, and
the commands then ran successfully. Thank you.
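
One way to decide which variables to drop, sketched in base R on made-up data: look for pairs of numeric variables that are almost perfectly correlated, since either member of such a pair adds nothing as a predictor. The 0.99 cutoff and the variable names are assumptions for illustration.

```r
## Hypothetical numeric data with one nearly redundant pair
dat <- data.frame(
  height_cm = c(150, 160, 170, 180, 190, 165, 175, NA),
  height_in = c(150, 160, 170, 180, 190, 165, 175, 185) / 2.54,
  score     = c(3, 9, 4, 8, 1, 7, 2, 6)
)

## Pairwise correlations using whatever cases are observed for each pair
r <- cor(dat, use = "pairwise.complete.obs")
r[upper.tri(r, diag = TRUE)] <- NA     # keep each pair only once

## Pairs above the cutoff (NA entries are ignored by which())
high <- which(abs(r) > 0.99, arr.ind = TRUE)
redundant <- data.frame(
  var1 = rownames(r)[high[, "row"]],
  var2 = colnames(r)[high[, "col"]]
)
print(redundant)
```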

#
Hi Fei,
On Sat, Nov 26, 2011 at 9:07 AM, Fei <fayechen0807 at hotmail.com> wrote:
Still not exactly a useable dataset, but here is a snippet of code I used:

##############################################################################
#                         Multiple Imputation Model                          #
##############################################################################

## specify the predictor matrix for the imputation
pred.matrix <- rbind(
  VFQRoleDifficulties1 = c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1),
  MOODVision1 =          c(1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1),
  MOODImpact1 =          c(1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1),
[snip]
  SocialFunctioning1 =   c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1),
  RoleEmotional1 =       c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1),
  MentalHealth1 =        c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0))
## set column names to the column names of the data (this is a square matrix)
colnames(pred.matrix) <- colnames(dat)

## Set the methods used to impute each variable
imp.method <- c(
  VFQRoleDifficulties1 = "pmm",
  MOODVision1 = "pmm",
  MOODImpact1 = "pmm",
[snip]
  SocialFunctioning1 = "pmm",
  RoleEmotional1 = "pmm",
  MentalHealth1 = "pmm"
)

## Create multiply imputed dataset
datimp <- mice(data = dat, m = 500, method = imp.method,
  predictorMatrix = pred.matrix,
  seed = 1, printFlag = FALSE)

Basically you can write a k x k matrix where k is the number of
variables in your dataset.  This can control what variables are used
in the imputation model for each variable (all 0s would mean no
variables).  You can also pass a k length character vector controlling
the method used for each variable.  You can also control the order
mice goes in.
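
For a dataset with many columns, the same k x k matrix can be built programmatically rather than typed by hand. The sketch below uses made-up data and hypothetical variable names. It assumes a reasonably recent mice version in which `""` as a method means "do not impute" and `"monotone"` is accepted as a `visitSequence` keyword. The mice call is skipped if the package is not installed.

```r
## Hypothetical data: id is an identifier, status a factor left untouched
set.seed(1)
n <- 60
dat <- data.frame(
  id     = seq_len(n),
  status = factor(rep(c("married", "single"), length.out = n)),
  x1     = rnorm(n),
  x2     = rnorm(n),
  y      = rnorm(n)
)
dat$x1[c(3, 10, 25)] <- NA
dat$x2[c(5, 10, 40)] <- NA

k <- ncol(dat)
## Rows = variable being imputed, columns = predictors used for it
pred <- matrix(1, nrow = k, ncol = k,
               dimnames = list(names(dat), names(dat)))
diag(pred) <- 0                        # a variable never predicts itself
pred[, c("id", "status")] <- 0         # never use these as predictors

## One method per variable; "" leaves a variable unimputed
meth <- setNames(rep("", k), names(dat))
meth[c("x1", "x2")] <- "pmm"

if (requireNamespace("mice", quietly = TRUE)) {
  imp <- mice::mice(dat, m = 5, method = meth, predictorMatrix = pred,
                    visitSequence = "monotone", seed = 1, printFlag = FALSE)
}
```

Zeroing a column removes that variable as a predictor everywhere, while setting its method to `""` keeps it from being imputed. Whether excluding categorical variables this way is wise depends on the analysis, as discussed earlier in the thread.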

Cheers,

Josh