a correlation matrix subset where the subset avg is a maximum

1 message · Ryan Austin

Thanks for the thought in any case, Mark.  You're right about the brute force.
I'll expand a bit with an example, though, for the sake of clarity.

Given a correlation matrix of 4 covariates A, B, C, D with pairwise values (r) of:
AB=0.2; AC=0.6; AD=0.3; BC=0.9; BD=0.8; CD=0.7

Find the optimal subset of n covariates (with n above a given minimum)
where the mean of r over the subset is a maximum.
Of course, all pairwise values between the chosen subset covariates
need to be considered.

Thus for n>1, the solution would simply be BC = 0.9.
And for n>2, the solution would be BCD, as (BC + CD + BD)/3 = 0.8 is the
maximum mean r value that could be obtained from
any of the subsets with n>2.
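
To make the example concrete, the brute-force enumeration can be sketched as below. This is only a minimal illustration, written in Python rather than R; the names r, mean_r and best_subset are made up for this sketch, and min_size=2 is assumed to correspond to "n>1" above. It visits every subset, so it is only practical for a small number of covariates.

```python
from itertools import combinations

# Pairwise values from the example above, keyed by unordered pair.
r = {
    frozenset("AB"): 0.2, frozenset("AC"): 0.6, frozenset("AD"): 0.3,
    frozenset("BC"): 0.9, frozenset("BD"): 0.8, frozenset("CD"): 0.7,
}

def mean_r(subset):
    """Mean of r over all pairs within the subset."""
    pairs = list(combinations(subset, 2))
    return sum(r[frozenset(p)] for p in pairs) / len(pairs)

def best_subset(names, min_size):
    """Exhaustively search all subsets with at least min_size members."""
    candidates = (
        s
        for k in range(min_size, len(names) + 1)
        for s in combinations(names, k)
    )
    return max(candidates, key=mean_r)

for min_size in (2, 3):
    best = best_subset("ABCD", min_size)
    print(min_size, "".join(best), round(mean_r(best), 3))
```

On the four covariates above this picks BC for n>1 and BCD for n>2, matching the values worked out by hand. The same enumeration should port to R with combn, though I have not tried that here.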

I'd expected that this would be a common problem, but 2 days of googling
has given me little.  I'm expecting a greedy graph traversal
or the like will be my answer, but I'd hoped to whip up a solution in R.
Any help would be greatly appreciated.
Ryan
Leeds, Mark (IED) wrote: