computationally singular

7 messages · Christian Hennig, Kjetil Halvorsen, Weiwei Shi

#
Hi,
I have a dataset which has around 138 variables and 30,000 cases. I am
trying to calculate a mahalanobis distance matrix for them and my
procedure is like this:

Suppose my data is stored in mymatrix
Error in solve.default(cov, ...) : system is computationally singular:
reciprocal condition number = 1.09501e-25

I understand the error message, but I don't know how to track down
which variables cause it so that I can "sacrifice" them if there
are not too many. I am also not sure whether it is due to particular
variables, or whether dropping variables is a good idea at all.

Thanks for help,

weiwei
#
Once I had a situation where the reason was that the variables were
scaled to extremely different magnitudes. 1e-25 is a *very* small number,
but there is still some chance that it helps to look up the standard
deviations and to multiply the variable with the smallest st.dev. by
1e20 or something.

Best,
Christian
On Mon, 8 Aug 2005, Weiwei Shi wrote:

            
*** NEW ADDRESS! ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche
#
I think the problem might be caused by two variables being very highly
correlated. Should I check the cov matrix and try to delete some?
But I am not quite sure I understand your reply. Could you detail it
with some steps?

thanks,

weiwei
On 8/8/05, Christian Hennig <chrish at stats.ucl.ac.uk> wrote:

  
    
#
More ideas:

You can also perform an eigenvalue decomposition of the covariance
matrix and see along which directions the singularity occurs and how
strong it is. Consequences could be: rescaling (or omission) of
variables that are strong in these directions, taking principal
components, or a linear transformation of the whole data in order to
attain less extreme ratios between the cov eigenvalues.
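
The eigenvalue check described above could be sketched roughly as follows.
This is only an illustration in Python with NumPy (an assumption; the thread
itself concerns R), and the function name, the tolerance rel_tol and the toy
data are all hypothetical. It lists the directions whose variance is
negligible relative to the largest eigenvalue, and the variables that load
on them.

```python
import numpy as np

def near_singular_directions(data, rel_tol=1e-10):
    """Return (eigenvalue, loadings) pairs for near-zero-variance directions.

    rel_tol is a hypothetical cutoff relative to the largest eigenvalue.
    """
    cov = np.cov(data, rowvar=False)          # variables are in columns
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    cutoff = rel_tol * eigvals[-1]
    return [(eigvals[i], eigvecs[:, i])
            for i in range(len(eigvals)) if eigvals[i] < cutoff]

# Hypothetical toy data: column 4 is an exact linear combination
# of columns 0 and 1, so the covariance matrix is singular.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 4))
data = np.column_stack([x, x[:, 0] + 2.0 * x[:, 1]])

for value, loadings in near_singular_directions(data):
    involved = np.flatnonzero(np.abs(loadings) > 0.1)
    print("eigenvalue:", value, "variables involved:", involved)
```

Variables with large loadings on such a direction are the candidates for
rescaling or omission; whether to drop any of them remains a subject-matter
decision.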

Generally I would say that information reduction (principal components or
leaving out variables) should only be done if "small variance along a
direction" means that "this direction is not important" in terms of the
subject matter problem. Otherwise transformation could help. (Perhaps my
guess in the first mail was wrong: you may not have to multiply something
by 1e20 to repair a 1e-25 condition number, and a more moderate
transformation may suffice.)

Best,
Christian
On Mon, 8 Aug 2005, Weiwei Shi wrote:

            
#
Sorry, our emails crossed...
On Mon, 8 Aug 2005, Weiwei Shi wrote:

            
In this case, taking principal components should do the job.
Variable deletion may help as well. I am not strongly against it, it
depends on your whole project and aim, but I would not start with that
before finding out whether there are more "proper" possibilities.
> Could you detail it with some steps?

Look up the std.devs of all variables.
If the ratio between the largest one and the smallest one is more than,
let's say, 1e5, consider that as "not healthy". Multiply the variables
with the smallest std.devs by constants so that the ratio between the
largest and smallest std.dev is not more than 1e3, say (I am not sure
about the exact size of these numbers... try something...). See whether
the problem vanishes after such rescaling.
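
The rescaling steps above might look something like the following sketch,
written in Python with NumPy (an assumption; the thread itself concerns R).
The function name and toy data are hypothetical, and the thresholds 1e5 and
1e3 are simply the rough guesses from the message above, not established
constants.

```python
import numpy as np

def rescale_small_sd(data, unhealthy_ratio=1e5, target_ratio=1e3):
    """Scale up low-variance columns if the std.dev ratio looks unhealthy.

    Returns the rescaled data and the per-column multipliers.
    """
    sd = data.std(axis=0, ddof=1)
    if np.any(sd == 0):
        raise ValueError("constant column: rescaling cannot fix this")
    factors = np.ones_like(sd)
    if sd.max() / sd.min() > unhealthy_ratio:
        floor = sd.max() / target_ratio       # smallest std.dev we accept
        small = sd < floor
        factors[small] = floor / sd[small]    # lift small columns to the floor
    return data * factors, factors

# Hypothetical toy data with std.devs of roughly 1e4, 1 and 1e-6.
rng = np.random.default_rng(1)
data = rng.normal(size=(1000, 3)) * np.array([1e4, 1.0, 1e-6])

scaled, factors = rescale_small_sd(data)
print("multipliers:", factors)
print("condition number before:", np.linalg.cond(np.cov(data, rowvar=False)))
print("condition number after: ", np.linalg.cond(np.cov(scaled, rowvar=False)))
```

When the covariance matrix is nonsingular, the Mahalanobis distance is
unchanged by such per-variable rescaling in exact arithmetic, so this only
addresses numerical conditioning. It cannot repair a truly exact linear
dependence between variables.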

Don't ask me the same about the second email - I don't have the time to
explain that in detail.

Sorry,
Christian
2 days later
#
Weiwei Shi wrote:

            
Why not do principal component analysis? To identify the zero variance
linear combination(s), look at the zero eigenvalues. Also, it *might*
make sense to calculate a "Mahalanobis" distance replacing the matrix
inverse with a pseudoinverse.
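
A minimal sketch of the pseudoinverse idea, in Python with NumPy (an
assumption; the thread itself concerns R). The function name, the cutoff
rcond and the toy data are hypothetical. It computes squared distances of
each case from the mean, using the Moore-Penrose pseudoinverse of the
covariance matrix in place of the ordinary inverse.

```python
import numpy as np

def mahalanobis_pinv(data, rcond=1e-10):
    """Squared 'Mahalanobis' distances from the mean via a pseudoinverse.

    Eigenvalues below rcond times the largest one are treated as zero.
    """
    centered = data - data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    cov_pinv = np.linalg.pinv(cov, rcond=rcond, hermitian=True)
    # Row-wise quadratic form: centered[i] @ cov_pinv @ centered[i]
    return np.einsum("ij,jk,ik->i", centered, cov_pinv, centered)

# Hypothetical toy data: the last column is an exact linear combination
# of the first two, so the ordinary inverse does not exist.
rng = np.random.default_rng(2)
x = rng.normal(size=(500, 4))
data = np.column_stack([x, x[:, 0] + 2.0 * x[:, 1]])

d2 = mahalanobis_pinv(data)
print(d2[:5])
```

In this toy case the result agrees with the ordinary Mahalanobis distance
computed from the four independent columns alone, which is the same as
working in the space of the principal components with nonzero variance.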

Kjetil
-- 

Kjetil Halvorsen.

Peace is the most effective weapon of mass construction.
               --  Mahdi Elmandjra
#
PCA is definitely worth trying; it was my second thought as well.
Thanks for the help and for the suggestion.
On 8/10/05, Kjetil Brinchmann Halvorsen <kjetil at acelerate.com> wrote: