Lars Simon Zehnder — written Sep 24, 2013 — source
Munkres’ Assignment Algorithm
(Munkres (1957),
also known as hungarian algorithm) is a well known algorithm
in Operations Research solving the problem to optimally
assign N
jobs to N
workers.
I needed to solve the Minimal Assignment Problem for a
relabeling algorithm in MCMC sampling for finite mixture
distributions, where I use a random permutation Gibbs
sampler. For each sample an optimal labeling must be found,
i.e. I have N
parameter vectors and must assign each vector
to one of the N
components of the finite mixture under the restriction
that each component gets a parameter vector. For the assignment
of parameters to components
[Stephens (1997b)] (http://www.isds.duke.edu/~scs/Courses/Stat376/Papers/Mixtures/LabelSwitchingStephensJRSSB.pdf)
suggests an algorithm relying on the Minimal Assignment in regard
to a loss matrix. The labeling with the smallest loss is
then considered as optimal.
After an unsuccessful search for libraries implementing
the algorithm easily for C++ or C, I made the decision to
program it myself using RcppArmadillo
for good performance.
I found some guidance by the websites of
Bob Pilgrim and
[TopCoder] (http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=hungarianAlgorithm).
These websites offer excellent tutorials to understand
the algorithm and to convert it to code. The order of this
implementation of Munkres’ algorithm is of an order N
to the power of 4. There exists
also a version of order N
to the power of 3, but an order of N
to the power of 4 works
very good for me and coding time is as usual a critical factor
for me.
In the following I walk through the different steps of Munkres’ algorithm and explain the main parts and their functionality.
Let’s begin. The first step in Munkres’ algorithm is to subtract the minimal element of each row from each element in this row.
Note, that we use references for all function arguments.
As we have to switch between the steps of the algorithm
continously, we always must be able to determine which
step should be chosen next. Therefore we give a mutable
unsigned integer step
as an argument to each step
function of the algorithm.
Inside the function we can easily access a whole row by
Armadillo’s row()
method for matrices.
In the second step, we then search for a zero in the
modified cost matrix of step one.
Only the first zero in a row is taken. Then, the indicator
matrix indM
indicates this zero by setting the corresponding
element at (r, c)
to 1. A unique zero - the only or first one in
a column and row - is called starred zero. In step 2
we find
such a starred zero.
Note, that we use here Armadillo’s element access via the
method at()
, which makes no bound checks and improves performance.
Note Bene: This code is thoroughly debugged - never do this for fresh written code!
In step 3
we cover each column with a starred zero. If already
N
columns are covered all starred zeros describe a complete
assignment - so, go to step 7
and finish. Otherwise go to
step 4
.
We cover a column by looking for 1s in the indicator
matrix indM
(See step 2
for assuring that these are
indeed only starred zeros).
Step 4
finds noncovered zeros and primes them. If there
are zeros in a row and none of them is starred, prime
them. For this task we program a helper function to keep
the code more readable and reusable. The helper function
searches for noncovered zeros.
We can detect noncovered zeros by checking if the cost matrix
contains at row r and column c a zero and row and column
are not covered yet, i.e. rcov(r) == 0
, ccov(c) == 0
.
This loop breaks, if we have found our first uncovered zero or
no uncovered zero at all.
In step 4
, if no uncovered zero is found we go to step 6
. If
instead an uncovered zero has been found, we set the indicator
matrix at its position to 2. We then have to search for a starred
zero in the row with the uncovered zero, uncover the column with
the starred zero and cover the row with the starred zero. To
indicate a starred zero in a row and to find it we create again
two helper functions.
We know that starred zeros are indicated by the indicator
matrix containing an element equal to 1.
Now, step 4
.
Notice the rpath_0
and cpath_0
variables. These integer
variables store the first vertex for an augmenting path in step 5
.
If zeros could be primed we go further to step 5
.
Step 5
constructs a path beginning at an uncovered primed
zero (this is actually graph theory - alternating and augmenting
paths) and alternating between starred and primed zeros.
This path is continued until a primed zero with no starred
zero in its column is found. Then, all starred zeros in
this path are unstarred and all primed zeros are starred.
All primes in the indicator matrix are erased and all rows
are uncovered. Then return to step 3
to cover again columns.
Step 5
needs several helper functions. First, we need
a function to find starred zeros in columns.
Then we need a function to find a primed zero in a row.
Note, that these tasks are easily performed by searching the
indicator matrix indM
.
In addition we need a function to augment the path, one to
clear the covers from rows and one to erase the primed zeros
from the indicator matrix indM
.
The function to augment the path gets an integer matrix path
of dimension 2 * N x 2. In it all vertices between rows and columns
are stored row-wise.
Now, we can set the complete step 5
:
Recall, if step 4
was successfull in uncovering all columns
and covering all rows with a primed zero, it then calls
step 6
.
Step 6
takes the cover vectors rcov
and ccov
and looks
in the uncovered region of the cost matrix for the smallest
value. It then subtracts this value from each element in an
uncovered column and adds it to each element in a covered row.
After this transformation, the algorithm starts again at step 4
.
Our last helper function searches for the smallest value in
the uncovered region of the cost matrix.
Step 6
looks as follows:
At last, we must create a function that enables us to jump around the different steps of the algorithm. The following code shows the main function of the algorithm. It defines also the important variables to be passed to the different steps.
Note, this function takes the numeric cost matrix as
an argument and returns the integer indicator matrix
indM
.
I chose to set the argument input_cost
constant and copy
it for reasons of reusability of the cost matrix in other
C++ code. When working with rather huge cost matrices, it
makes sense to make the argument mutable. Though, I used
pass-by-reference for all the arguments in functions to
avoid useless copying inside the functions.
To call the main function hungarian
from R and to use
our algorithm we construct an Rcpp Attribute
:
If we want to provide this function also to other users
we should probably ensure, that the matrix is square (There
exists also an extension to rectangular matrices, see
Burgeois and Lasalle (1971)).
This can be done easily with the exceptions provided by
Rcpp
passed over to R:
Note, that it is also possible to use for the attribute
directly an Armadillo
matrix, but following the recent
discussion on the Rcpp-devel list,
a pass-by-reference of arguments is not yet possible. Romain
Francois’ proposals seem promising, so maybe we can expect
in some of the next releases shallow copies allowing
pass-by-reference in Rcpp Attributes
.
Let us begin now with a very easy example that makes clear what is going on.
<Rcpp::exception: Matrix is not square.>
[,1] [,2] [1,] 1 2 [2,] 2 4
[,1] [,2] [1,] 0 1 [2,] 1 0
[1] 4
We can also compute a maximal assignment over a revenue matrix by simply considering the difference between a big value and this matrix as a cost matrix.
[,1] [,2] [,3] [,4] [1,] 1 2 3 4 [2,] 2 4 6 8 [3,] 3 6 9 12 [4,] 4 8 12 16
[,1] [,2] [,3] [,4] [1,] 1 0 0 0 [2,] 0 1 0 0 [3,] 0 0 1 0 [4,] 0 0 0 1
[1] 30
CoCost matrices containing negative values work as well:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [1,] -0.23018 0.3598 -0.2180 -0.29507 -0.20792 -0.02855 -0.5023 [2,] 1.55871 0.4008 -1.0260 0.89513 -1.26540 -0.04287 -0.3332 [3,] 0.07051 0.1107 -0.7289 0.87813 2.16896 1.36860 -1.0186 [4,] 0.12929 -0.5558 -0.6250 0.82158 1.20796 -0.22577 -1.0718 [5,] 1.71506 1.7869 -1.6867 0.68864 -1.12311 1.51647 0.3035 [6,] 0.46092 0.4979 0.8378 0.55392 -0.40288 -1.54875 0.4482 [7,] -1.26506 -1.9666 0.1534 -0.06191 -0.46666 0.58461 0.0530 [8,] -0.68685 0.7014 -1.1381 -0.30596 0.77997 0.12385 0.9223 [9,] -0.44566 -0.4728 1.2538 -0.38047 -0.08337 0.21594 2.0501 [10,] 1.22408 -1.0678 0.4265 -0.69471 0.25332 0.37964 -0.4910 [,8] [,9] [,10] [1,] -2.309169 0.3853 0.5484 [2,] 1.005739 -0.3707 0.2387 [3,] -0.709201 0.6444 -0.6279 [4,] -0.688009 -0.2205 1.3607 [5,] 1.025571 0.3318 -0.6003 [6,] -0.284773 1.0968 2.1873 [7,] -1.220718 0.4352 1.5326 [8,] 0.181303 -0.3259 -0.2357 [9,] -0.138891 1.1488 -1.0264 [10,] 0.005764 0.9935 -0.7104
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [1,] 0 0 0 0 0 0 0 1 0 0 [2,] 0 0 0 0 1 0 0 0 0 0 [3,] 0 0 0 0 0 0 1 0 0 0 [4,] 0 0 0 0 0 0 0 0 1 0 [5,] 0 0 1 0 0 0 0 0 0 0 [6,] 0 0 0 0 0 1 0 0 0 0 [7,] 0 1 0 0 0 0 0 0 0 0 [8,] 1 0 0 0 0 0 0 0 0 0 [9,] 0 0 0 0 0 0 0 0 0 1 [10,] 0 0 0 1 0 0 0 0 0 0
[1] -12.42
Finally let us make some benchmarking. We load the rbenchmark package and take now a more realistic cost matrix.
test replications elapsed user.self sys.self 1 indM <- hungarian_cc(cost) 1000 10.53 10.5 0.004
But we also see, where the limitations of this algorithm lie
user system elapsed 3.656 0.008 3.674
Some last notes on the structure of the code. I prefer to
put the code of the algorithm in an header file, e.g.
hungarian.h
. And use an attributes.cpp
(attributes.cc
)
file to program the Rcpp Attributes
. With this,
I can easily reuse the algorithm in C++ code by simple
inclusion (#include "hungarian.h"
) and have a complete
overview on all the C++ functions I export to R.
tags: armadillo
Tweet