A combinatorial optimization problem: finding the best permutation of a complex vector

Sat, Nov 14, 2009 8:47 PM

Hi,

It was pointed out to me by Hans Borchers that the timing that I reported (0.3 sec) in the previous email for solving the LSAP problem, for N=1000, was too optimistic, because "X" and "Y" were equivalent up to a permutation.  

In order to test this out, I ran a few more experiments with different random variate distributions for X and Y.  In all the experiments, I took N = 500.

The execution times were faster when X and Y had the same or similar distributions.  This is generally around 8 - 9 seconds.  The more different the distributions are, the greater the time.   For example, when I took the real and imaginary parts of X to be from a t-distribution with 3 degrees of freedom, and Y to be from uniform distribution in (0, 1), the execution times were around 80-90 seconds. 

n <- 500

x <- rt(n, df=3) + 1i * rt(n, df=3)

y <- runif(n) + 1i * runif(n)

Cmat <- outer(x, y, FUN=function(x,y) Mod(x - y))

system.time(ans <- solve_LSAP(Cmat, maximum=FALSE))


When I increased N = 1000, the time was about 1400 seconds!  


Ravi.

____________________________________________________________________

Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvaradhan at jhmi.edu


----- Original Message -----
From: Ravi Varadhan <rvaradhan at jhmi.edu>
Date: Saturday, November 14, 2009 10:53 am
Subject: Re: [R] A combinatorial optimization problem: finding the best permutation of a complex vector
To: "Charles C. Berry" <cberry at tajo.ucsd.edu>
Cc: r-help at r-project.org

____________________________________________________________________

Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvaradhan at jhmi.edu


----- Original Message -----
From: "Charles C. Berry" <cberry at tajo.ucsd.edu>
Date: Thursday, November 12, 2009 2:20 pm
Subject: Re: [R] A combinatorial optimization problem: finding the 
best permutation of a complex vector
To: Ravi Varadhan <rvaradhan at jhmi.edu>
Cc: r-help at r-project.org


On Thu, 12 Nov 2009, Ravi Varadhan wrote:


Hi,

I have a complex-valued vector X in C^n.  Given another 
complex-valued 
vector Y in C^n, I want to find a permutation of Y, say, Y*, that 

minimizes ||X - Y*||, the distance between X and Y*.

Note that this problem can be trivially solved for "Real" vectors, 

since 
real numbers possess the ordering property. Complex numbers, 
however, do 
not possess this property.  Hence the challenge.

The obvious solution is to enumerate all the permutations of Y and 

pick 
out the one that yields the smallest distance.  This, however, is 

only 
feasible for small n.  I would like to be able to solve this for n 

as 
large as 100 - 1000, in which cases the permutation approach is 
infeasible.

I am looking for algorithms, possibly iterative, that can provide 
a 

"good" approximate solutions that are not necessarily optimal for 

high-dimensional vectors. I can do random sampling, but this can 
be 
very 
inefficient in high-dimensional problems.  I am looking for 
efficient 
algorithms because this step has to be performed in each iteration 

of an 
"outer" algorithm.

Are there any clever adaptive algorithms out there?


I do not know.

But would you settle for a not-so-clever adaptive heuristic?

If so, see below.



Here is an example illustrating the problem:

require(e1071)

n <- 8
x <- runif(n) + 1i * runif(n)
y <- runif(n) + 1i * runif(n)

dist <- function(x, y) sqrt(sum(Mod(x - y)^2))

perms <- permutations(n)
dim(perms)  # [1] 40320     8
tmp <- apply(perms, 1, function(ord) dist(x, y[ord]))
z <- y[perms[which.min(tmp), ]]  # exact solution
dist(x, z)

# an aproximate random-sampling approach
nsamp <- 10000
perms <- t(replicate(nsamp, sample(1:n, size=n, replace=FALSE)))
tmp <- apply(perms, 1, function(ord) dist(x, y[ord]))
z.app <- y[perms[which.min(tmp), ]]  # approximate solution
dist(x, z.app)


The heuristic is to use a stochastic greedy updates. Here is a very 

simple 
one:

swap.samp <-
  function(index) {
         sub.ind <- sample(seq(along=index),2)
         index[sub.ind]<- rev(sub.ind)
         index
  }


z.app <- y
z.cand <- y

for (i in 1:100) z.cand <-
 	if( dist(x,z.app) < dist(x,z.cand) ) {

         	z.app[swap.samp(1:8)]

 	} else {
 	z.app <- z.cand
 	z.cand[swap.samp(1:8)]
 	}

On your toy example, this usually finds the min(dist(x,z.app))  in < 

100 
trials.

Note that when

 	z.diff <- z.app != z.cand

 	dist(x[ z.diff ],z.app[ z.diff ])^2 - dist(x[ z.diff ],z.cand[ 
z.diff ])^2

equals

 	dist(x,z.app)^2 - dist(x,z.cand)^2

so you could vectorize the above to randomly pair up all the points 

(for n 
even) then update the n%/%2 pairs all at once.


For large problems, you might try to preferentially sample pairs of 

points 
with similar values of Mod ( x - z.app ) to begin with. The 
heuristic 

being that in two pairs of points that are both distant, swapping 
them 

might realize a bigger benefit.


--

Another version is to sample k points, randomly permute, then do a 
greedy 
update:

sub.samp <-
  function(index,n) {
 	sub.ind <- sample(seq(along=index),n)
 	index[sub.ind]<- sample(sub.ind)
 	index
}


# k is 4 here

for (i in 1:100) z.cand <-
 	if( dist(x,z.app) < dist(x,z.cand) ){
         	z.app[sub.samp(1:8,4)]
 	} else {
         	z.app <- z.cand
         	z.cand[sub.samp(1:8,4)]
 	}


On your toy problem, this takes in the low 100's to find the min.

I think you can do the similar vectorization trick here.
--

I suppose another variant would repeatedly sample k points, then 
enumerate 
distances for all permutations of them, then choose the best one to 

update z.app.

Again, in larger problems, one might weight those k points towards 
higher 
values of Mod( x - z.app ) at least to begin with.

HTH,

Chuck


Thanks for any help.

Best,
Ravi.


____________________________________________________________________

Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvaradhan at jhmi.edu

______________________________________________
R-help at r-project.org mailing list

PLEASE do read the posting guide 
and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry                            (858) 534-2098
                                             Dept of 
Family/Preventive 
Medicine
E                     UC San Diego
  La Jolla, San Diego 92093-0901



______________________________________________
R-help at r-project.org mailing list

PLEASE do read the posting guide 
and provide commented, minimal, self-contained, reproducible code.

A combinatorial optimization problem: finding the best permutation of a complex vector

Thread (9 messages)