
polychoric correlation: issue with coefficient sign

Dear Dorothee,
This problem is very ill-conditioned (i.e., there is little information in the data with which to estimate the thresholds and the correlation), and the standard error of the correlation can't be estimated by either the 2-step approach or ML. In fact, if you add 0.5 to each cell to get rid of the sampling zero, you get very different results:
     [,1] [,2]
[1,]   23    0
[2,]  334   27
Polychoric Correlation, ML est. = 0.2629 (0.2281)

  Row Threshold
  Threshold Std.Err.
     -1.537   0.1003


  Column Threshold
  Threshold Std.Err.
      1.457  0.09567
Polychoric Correlation, 2-step est. = 0.2629 (0.2279)
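
For anyone who wants to try the comparison described above, here is a minimal sketch. It assumes the polycor package is installed and that the table is entered exactly as printed; the object names tab and tab_adj are mine, not from the original thread. The calls are wrapped in try() because the fit to the unadjusted table may fail or produce warnings.

```r
# 2 x 2 table as printed above (filled column-wise); cell [1, 2] is the sampling zero
tab <- matrix(c(23, 334, 0, 27), nrow = 2)

# Add 0.5 to every cell to remove the sampling zero
tab_adj <- tab + 0.5

if (requireNamespace("polycor", quietly = TRUE)) {
  # ML estimates with standard errors, unadjusted vs. adjusted table
  print(try(polycor::polychor(tab, ML = TRUE, std.err = TRUE)))
  print(try(polycor::polychor(tab_adj, ML = TRUE, std.err = TRUE)))

  # 2-step estimate (the default, ML = FALSE) for the adjusted table
  print(try(polycor::polychor(tab_adj, std.err = TRUE)))
}
```

If the estimate moves a lot between tab and tab_adj, that is a sign of how little information the table carries about the correlation.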
That's not correct. The computations are done pairwise (although by default, as I explained in my previous message, only complete observations are used).
There are two possible sources of difference, but again since I don't have the data, I can't check. (1) As I mentioned before, if there are missing data, then the subset of cases used by hetcor() and polycor() can differ. (2) As stated in ?hetcor, by default hetcor() coerces the returned correlation matrix to be positive-definite; you can set pd=FALSE in the call to hetcor() to turn this off.
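
Since I don't have the data, the following is only a sketch on simulated data of how both sources of difference could be checked. The data frame dat and its variables a, b, and c are made up for illustration; use = "pairwise.complete.obs" and pd = FALSE are the hetcor() arguments documented in ?hetcor.

```r
set.seed(1)
n <- 200

# Simulated (hypothetical) latent normal variables; columns 1 and 2 are correlated
z <- matrix(rnorm(n * 3), n, 3)
z[, 2] <- 0.5 * z[, 1] + sqrt(0.75) * z[, 2]

# Cut into ordered categories
dat <- data.frame(
  a = cut(z[, 1], c(-Inf, 0, Inf), labels = c("lo", "hi"), ordered_result = TRUE),
  b = cut(z[, 2], c(-Inf, -0.5, 0.5, Inf), labels = 1:3, ordered_result = TRUE),
  c = cut(z[, 3], c(-Inf, 0.3, Inf), labels = c("lo", "hi"), ordered_result = TRUE)
)
dat$c[1:40] <- NA  # missing values in c only

# Cases used under each rule for the (a, b) pair
n_complete <- sum(complete.cases(dat))                 # complete observations
n_pair_ab  <- sum(complete.cases(dat[, c("a", "b")]))  # pairwise for a and b

if (requireNamespace("polycor", quietly = TRUE)) {
  # Default: complete observations only, matrix coerced to positive-definite
  h_default <- polycor::hetcor(dat, std.err = FALSE)
  # Pairwise-complete cases, no positive-definite adjustment
  h_pair <- polycor::hetcor(dat, std.err = FALSE,
                            use = "pairwise.complete.obs", pd = FALSE)
  # Direct pairwise computation for a and b
  r_ab <- polycor::polychor(dat$a, dat$b)
  print(c(hetcor_default = h_default$correlations[1, 2],
          hetcor_pairwise = h_pair$correlations[1, 2],
          polychor = r_ab))
}
```

With missing values confined to c, the default call drops 40 cases from the (a, b) pair as well, so its correlation for that pair can differ from the one polychor() computes on all 200 cases.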

Regards,
 John