Module munkres
[hide private]
[frames] | no frames]

Source Code for Module munkres

  1  #!/usr/bin/env python 
  2  # -*- coding: iso-8859-1 -*- 
  3   
  4  # Documentation is intended to be processed by Epydoc. 
  5   
  6  """ 
  7  Introduction 
  8  ============ 
  9   
 10  The Munkres module provides an implementation of the Munkres algorithm 
 11  (also called the Hungarian algorithm or the Kuhn-Munkres algorithm), 
 12  useful for solving the Assignment Problem. 
 13   
 14  Assignment Problem 
 15  ================== 
 16   
 17  Let *C* be an *n*\ x\ *n* matrix representing the costs of each of *n* workers 
 18  to perform any of *n* jobs. The assignment problem is to assign jobs to 
 19  workers in a way that minimizes the total cost. Since each worker can perform 
 20  only one job and each job can be assigned to only one worker the assignments 
 21  represent an independent set of the matrix *C*. 
 22   
 23  One way to generate the optimal set is to create all permutations of 
 24  the indexes necessary to traverse the matrix so that no row and column 
 25  are used more than once. For instance, given this matrix (expressed in 
 26  Python):: 
 27   
 28      matrix = [[5, 9, 1], 
 29                [10, 3, 2], 
 30                [8, 7, 4]] 
 31   
 32  You could use this code to generate the traversal indexes:: 
 33   
 34      def permute(a, results): 
 35          if len(a) == 1: 
 36              results.insert(len(results), a) 
 37   
 38          else: 
 39              for i in range(0, len(a)): 
 40                  element = a[i] 
 41                  a_copy = [a[j] for j in range(0, len(a)) if j != i] 
 42                  subresults = [] 
 43                  permute(a_copy, subresults) 
 44                  for subresult in subresults: 
 45                      result = [element] + subresult 
 46                      results.insert(len(results), result) 
 47   
 48      results = [] 
 49      permute(range(len(matrix)), results) # [0, 1, 2] for a 3x3 matrix 
 50   
 51  After the call to permute(), the results matrix would look like this:: 
 52   
 53      [[0, 1, 2], 
 54       [0, 2, 1], 
 55       [1, 0, 2], 
 56       [1, 2, 0], 
 57       [2, 0, 1], 
 58       [2, 1, 0]] 
 59   
 60  You could then use that index matrix to loop over the original cost matrix 
 61  and calculate the smallest cost of the combinations:: 
 62   
 63      n = len(matrix) 
 64      minval = sys.maxsize 
 65      for row in range(n): 
 66          cost = 0 
 67          for col in range(n): 
 68              cost += matrix[row][col] 
 69          minval = min(cost, minval) 
 70   
 71      print minval 
 72   
 73  While this approach works fine for small matrices, it does not scale. It 
 74  executes in O(*n*!) time: Calculating the permutations for an *n*\ x\ *n* 
 75  matrix requires *n*! operations. For a 12x12 matrix, that's 479,001,600 
 76  traversals. Even if you could manage to perform each traversal in just one 
 77  millisecond, it would still take more than 133 hours to perform the entire 
 78  traversal. A 20x20 matrix would take 2,432,902,008,176,640,000 operations. At 
 79  an optimistic millisecond per operation, that's more than 77 million years. 
 80   
 81  The Munkres algorithm runs in O(*n*\ ^3) time, rather than O(*n*!). This 
 82  package provides an implementation of that algorithm. 
 83   
 84  This version is based on 
 85  http://www.public.iastate.edu/~ddoty/HungarianAlgorithm.html. 
 86   
 87  This version was written for Python by Brian Clapper from the (Ada) algorithm 
 88  at the above web site. (The ``Algorithm::Munkres`` Perl version, in CPAN, was 
 89  clearly adapted from the same web site.) 
 90   
 91  Usage 
 92  ===== 
 93   
 94  Construct a Munkres object:: 
 95   
 96      from munkres import Munkres 
 97   
 98      m = Munkres() 
 99   
100  Then use it to compute the lowest cost assignment from a cost matrix. Here's 
101  a sample program:: 
102   
103      from munkres import Munkres, print_matrix 
104   
105      matrix = [[5, 9, 1], 
106                [10, 3, 2], 
107                [8, 7, 4]] 
108      m = Munkres() 
109      indexes = m.compute(matrix) 
110      print_matrix(matrix, msg='Lowest cost through this matrix:') 
111      total = 0 
112      for row, column in indexes: 
113          value = matrix[row][column] 
114          total += value 
115          print '(%d, %d) -> %d' % (row, column, value) 
116      print 'total cost: %d' % total 
117   
118  Running that program produces:: 
119   
120      Lowest cost through this matrix: 
121      [5, 9, 1] 
122      [10, 3, 2] 
123      [8, 7, 4] 
124      (0, 0) -> 5 
125      (1, 1) -> 3 
126      (2, 2) -> 4 
127      total cost=12 
128   
129  The instantiated Munkres object can be used multiple times on different 
130  matrices. 
131   
132  Non-square Cost Matrices 
133  ======================== 
134   
135  The Munkres algorithm assumes that the cost matrix is square. However, it's 
136  possible to use a rectangular matrix if you first pad it with 0 values to make 
137  it square. This module automatically pads rectangular cost matrices to make 
138  them square. 
139   
140  Notes: 
141   
142  - The module operates on a *copy* of the caller's matrix, so any padding will 
143    not be seen by the caller. 
144  - The cost matrix must be rectangular or square. An irregular matrix will 
145    *not* work. 
146   
147  Calculating Profit, Rather than Cost 
148  ==================================== 
149   
150  The cost matrix is just that: A cost matrix. The Munkres algorithm finds 
151  the combination of elements (one from each row and column) that results in 
152  the smallest cost. It's also possible to use the algorithm to maximize 
153  profit. To do that, however, you have to convert your profit matrix to a 
154  cost matrix. The simplest way to do that is to subtract all elements from a 
155  large value. For example:: 
156   
157      from munkres import Munkres, print_matrix 
158   
159      matrix = [[5, 9, 1], 
160                [10, 3, 2], 
161                [8, 7, 4]] 
162      cost_matrix = [] 
163      for row in matrix: 
164          cost_row = [] 
165          for col in row: 
166              cost_row += [sys.maxsize - col] 
167          cost_matrix += [cost_row] 
168   
169      m = Munkres() 
170      indexes = m.compute(cost_matrix) 
171      print_matrix(matrix, msg='Highest profit through this matrix:') 
172      total = 0 
173      for row, column in indexes: 
174          value = matrix[row][column] 
175          total += value 
176          print '(%d, %d) -> %d' % (row, column, value) 
177   
178      print 'total profit=%d' % total 
179   
180  Running that program produces:: 
181   
182      Highest profit through this matrix: 
183      [5, 9, 1] 
184      [10, 3, 2] 
185      [8, 7, 4] 
186      (0, 1) -> 9 
187      (1, 0) -> 10 
188      (2, 2) -> 4 
189      total profit=23 
190   
191  The ``munkres`` module provides a convenience method for creating a cost 
192  matrix from a profit matrix. Since it doesn't know whether the matrix contains 
193  floating point numbers, decimals, or integers, you have to provide the 
194  conversion function; but the convenience method takes care of the actual 
195  creation of the cost matrix:: 
196   
197      import munkres 
198   
199      cost_matrix = munkres.make_cost_matrix(matrix, 
200                                             lambda cost: sys.maxsize - cost) 
201   
202  So, the above profit-calculation program can be recast as:: 
203   
204      from munkres import Munkres, print_matrix, make_cost_matrix 
205   
206      matrix = [[5, 9, 1], 
207                [10, 3, 2], 
208                [8, 7, 4]] 
209      cost_matrix = make_cost_matrix(matrix, lambda cost: sys.maxsize - cost) 
210      m = Munkres() 
211      indexes = m.compute(cost_matrix) 
212      print_matrix(matrix, msg='Lowest cost through this matrix:') 
213      total = 0 
214      for row, column in indexes: 
215          value = matrix[row][column] 
216          total += value 
217          print '(%d, %d) -> %d' % (row, column, value) 
218      print 'total profit=%d' % total 
219   
220  References 
221  ========== 
222   
223  1. http://www.public.iastate.edu/~ddoty/HungarianAlgorithm.html 
224   
225  2. Harold W. Kuhn. The Hungarian Method for the assignment problem. 
226     *Naval Research Logistics Quarterly*, 2:83-97, 1955. 
227   
228  3. Harold W. Kuhn. Variants of the Hungarian method for assignment 
229     problems. *Naval Research Logistics Quarterly*, 3: 253-258, 1956. 
230   
231  4. Munkres, J. Algorithms for the Assignment and Transportation Problems. 
232     *Journal of the Society of Industrial and Applied Mathematics*, 
233     5(1):32-38, March, 1957. 
234   
235  5. http://en.wikipedia.org/wiki/Hungarian_algorithm 
236   
237  Copyright and License 
238  ===================== 
239   
240  Copyright 2008-2016 Brian M. Clapper 
241   
242  Licensed under the Apache License, Version 2.0 (the "License"); 
243  you may not use this file except in compliance with the License. 
244  You may obtain a copy of the License at 
245   
246    http://www.apache.org/licenses/LICENSE-2.0 
247   
248  Unless required by applicable law or agreed to in writing, software 
249  distributed under the License is distributed on an "AS IS" BASIS, 
250  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
251  See the License for the specific language governing permissions and 
252  limitations under the License. 
253  """ 
254   
255  __docformat__ = 'restructuredtext' 
256   
257  # --------------------------------------------------------------------------- 
258  # Imports 
259  # --------------------------------------------------------------------------- 
260   
261  import sys 
262  import copy 
263   
264  # --------------------------------------------------------------------------- 
265  # Exports 
266  # --------------------------------------------------------------------------- 
267   
268  __all__     = ['Munkres', 'make_cost_matrix'] 
269   
270  # --------------------------------------------------------------------------- 
271  # Globals 
272  # --------------------------------------------------------------------------- 
273   
274  # Info about the module 
275  __version__   = "1.0.8" 
276  __author__    = "Brian Clapper, bmc@clapper.org" 
277  __url__       = "http://software.clapper.org/munkres/" 
278  __copyright__ = "(c) 2008 Brian M. Clapper" 
279  __license__   = "Apache Software License" 
280   
281  # --------------------------------------------------------------------------- 
282  # Classes 
283  # --------------------------------------------------------------------------- 
284   
285 -class Munkres:
286 """ 287 Calculate the Munkres solution to the classical assignment problem. 288 See the module documentation for usage. 289 """ 290
291 - def __init__(self):
292 """Create a new instance""" 293 self.C = None 294 self.row_covered = [] 295 self.col_covered = [] 296 self.n = 0 297 self.Z0_r = 0 298 self.Z0_c = 0 299 self.marked = None 300 self.path = None
301
302 - def make_cost_matrix(profit_matrix, inversion_function):
303 """ 304 **DEPRECATED** 305 306 Please use the module function ``make_cost_matrix()``. 307 """ 308 import munkres 309 return munkres.make_cost_matrix(profit_matrix, inversion_function)
310 311 make_cost_matrix = staticmethod(make_cost_matrix) 312
313 - def pad_matrix(self, matrix, pad_value=0):
314 """ 315 Pad a possibly non-square matrix to make it square. 316 317 :Parameters: 318 matrix : list of lists 319 matrix to pad 320 321 pad_value : int 322 value to use to pad the matrix 323 324 :rtype: list of lists 325 :return: a new, possibly padded, matrix 326 """ 327 max_columns = 0 328 total_rows = len(matrix) 329 330 for row in matrix: 331 max_columns = max(max_columns, len(row)) 332 333 total_rows = max(max_columns, total_rows) 334 335 new_matrix = [] 336 for row in matrix: 337 row_len = len(row) 338 new_row = row[:] 339 if total_rows > row_len: 340 # Row too short. Pad it. 341 new_row += [pad_value] * (total_rows - row_len) 342 new_matrix += [new_row] 343 344 while len(new_matrix) < total_rows: 345 new_matrix += [[pad_value] * total_rows] 346 347 return new_matrix
348
349 - def compute(self, cost_matrix):
350 """ 351 Compute the indexes for the lowest-cost pairings between rows and 352 columns in the database. Returns a list of (row, column) tuples 353 that can be used to traverse the matrix. 354 355 :Parameters: 356 cost_matrix : list of lists 357 The cost matrix. If this cost matrix is not square, it 358 will be padded with zeros, via a call to ``pad_matrix()``. 359 (This method does *not* modify the caller's matrix. It 360 operates on a copy of the matrix.) 361 362 **WARNING**: This code handles square and rectangular 363 matrices. It does *not* handle irregular matrices. 364 365 :rtype: list 366 :return: A list of ``(row, column)`` tuples that describe the lowest 367 cost path through the matrix 368 369 """ 370 self.C = self.pad_matrix(cost_matrix) 371 self.n = len(self.C) 372 self.original_length = len(cost_matrix) 373 self.original_width = len(cost_matrix[0]) 374 self.row_covered = [False for i in range(self.n)] 375 self.col_covered = [False for i in range(self.n)] 376 self.Z0_r = 0 377 self.Z0_c = 0 378 self.path = self.__make_matrix(self.n * 2, 0) 379 self.marked = self.__make_matrix(self.n, 0) 380 381 done = False 382 step = 1 383 384 steps = { 1 : self.__step1, 385 2 : self.__step2, 386 3 : self.__step3, 387 4 : self.__step4, 388 5 : self.__step5, 389 6 : self.__step6 } 390 391 while not done: 392 try: 393 func = steps[step] 394 step = func() 395 except KeyError: 396 done = True 397 398 # Look for the starred columns 399 results = [] 400 for i in range(self.original_length): 401 for j in range(self.original_width): 402 if self.marked[i][j] == 1: 403 results += [(i, j)] 404 405 return results
406
407 - def __copy_matrix(self, matrix):
408 """Return an exact copy of the supplied matrix""" 409 return copy.deepcopy(matrix)
410
411 - def __make_matrix(self, n, val):
412 """Create an *n*x*n* matrix, populating it with the specific value.""" 413 matrix = [] 414 for i in range(n): 415 matrix += [[val for j in range(n)]] 416 return matrix
417
418 - def __step1(self):
419 """ 420 For each row of the matrix, find the smallest element and 421 subtract it from every element in its row. Go to Step 2. 422 """ 423 C = self.C 424 n = self.n 425 for i in range(n): 426 minval = min(self.C[i]) 427 # Find the minimum value for this row and subtract that minimum 428 # from every element in the row. 429 for j in range(n): 430 self.C[i][j] -= minval 431 432 return 2
433
434 - def __step2(self):
435 """ 436 Find a zero (Z) in the resulting matrix. If there is no starred 437 zero in its row or column, star Z. Repeat for each element in the 438 matrix. Go to Step 3. 439 """ 440 n = self.n 441 for i in range(n): 442 for j in range(n): 443 if (self.C[i][j] == 0) and \ 444 (not self.col_covered[j]) and \ 445 (not self.row_covered[i]): 446 self.marked[i][j] = 1 447 self.col_covered[j] = True 448 self.row_covered[i] = True 449 450 self.__clear_covers() 451 return 3
452
453 - def __step3(self):
454 """ 455 Cover each column containing a starred zero. If K columns are 456 covered, the starred zeros describe a complete set of unique 457 assignments. In this case, Go to DONE, otherwise, Go to Step 4. 458 """ 459 n = self.n 460 count = 0 461 for i in range(n): 462 for j in range(n): 463 if self.marked[i][j] == 1: 464 self.col_covered[j] = True 465 count += 1 466 467 if count >= n: 468 step = 7 # done 469 else: 470 step = 4 471 472 return step
473
474 - def __step4(self):
475 """ 476 Find a noncovered zero and prime it. If there is no starred zero 477 in the row containing this primed zero, Go to Step 5. Otherwise, 478 cover this row and uncover the column containing the starred 479 zero. Continue in this manner until there are no uncovered zeros 480 left. Save the smallest uncovered value and Go to Step 6. 481 """ 482 step = 0 483 done = False 484 row = -1 485 col = -1 486 star_col = -1 487 while not done: 488 (row, col) = self.__find_a_zero() 489 if row < 0: 490 done = True 491 step = 6 492 else: 493 self.marked[row][col] = 2 494 star_col = self.__find_star_in_row(row) 495 if star_col >= 0: 496 col = star_col 497 self.row_covered[row] = True 498 self.col_covered[col] = False 499 else: 500 done = True 501 self.Z0_r = row 502 self.Z0_c = col 503 step = 5 504 505 return step
506
507 - def __step5(self):
508 """ 509 Construct a series of alternating primed and starred zeros as 510 follows. Let Z0 represent the uncovered primed zero found in Step 4. 511 Let Z1 denote the starred zero in the column of Z0 (if any). 512 Let Z2 denote the primed zero in the row of Z1 (there will always 513 be one). Continue until the series terminates at a primed zero 514 that has no starred zero in its column. Unstar each starred zero 515 of the series, star each primed zero of the series, erase all 516 primes and uncover every line in the matrix. Return to Step 3 517 """ 518 count = 0 519 path = self.path 520 path[count][0] = self.Z0_r 521 path[count][1] = self.Z0_c 522 done = False 523 while not done: 524 row = self.__find_star_in_col(path[count][1]) 525 if row >= 0: 526 count += 1 527 path[count][0] = row 528 path[count][1] = path[count-1][1] 529 else: 530 done = True 531 532 if not done: 533 col = self.__find_prime_in_row(path[count][0]) 534 count += 1 535 path[count][0] = path[count-1][0] 536 path[count][1] = col 537 538 self.__convert_path(path, count) 539 self.__clear_covers() 540 self.__erase_primes() 541 return 3
542
543 - def __step6(self):
544 """ 545 Add the value found in Step 4 to every element of each covered 546 row, and subtract it from every element of each uncovered column. 547 Return to Step 4 without altering any stars, primes, or covered 548 lines. 549 """ 550 minval = self.__find_smallest() 551 for i in range(self.n): 552 for j in range(self.n): 553 if self.row_covered[i]: 554 self.C[i][j] += minval 555 if not self.col_covered[j]: 556 self.C[i][j] -= minval 557 return 4
558
559 - def __find_smallest(self):
560 """Find the smallest uncovered value in the matrix.""" 561 minval = sys.maxsize 562 for i in range(self.n): 563 for j in range(self.n): 564 if (not self.row_covered[i]) and (not self.col_covered[j]): 565 if minval > self.C[i][j]: 566 minval = self.C[i][j] 567 return minval
568
569 - def __find_a_zero(self):
570 """Find the first uncovered element with value 0""" 571 row = -1 572 col = -1 573 i = 0 574 n = self.n 575 done = False 576 577 while not done: 578 j = 0 579 while True: 580 if (self.C[i][j] == 0) and \ 581 (not self.row_covered[i]) and \ 582 (not self.col_covered[j]): 583 row = i 584 col = j 585 done = True 586 j += 1 587 if j >= n: 588 break 589 i += 1 590 if i >= n: 591 done = True 592 593 return (row, col)
594
595 - def __find_star_in_row(self, row):
596 """ 597 Find the first starred element in the specified row. Returns 598 the column index, or -1 if no starred element was found. 599 """ 600 col = -1 601 for j in range(self.n): 602 if self.marked[row][j] == 1: 603 col = j 604 break 605 606 return col
607
608 - def __find_star_in_col(self, col):
609 """ 610 Find the first starred element in the specified row. Returns 611 the row index, or -1 if no starred element was found. 612 """ 613 row = -1 614 for i in range(self.n): 615 if self.marked[i][col] == 1: 616 row = i 617 break 618 619 return row
620
621 - def __find_prime_in_row(self, row):
622 """ 623 Find the first prime element in the specified row. Returns 624 the column index, or -1 if no starred element was found. 625 """ 626 col = -1 627 for j in range(self.n): 628 if self.marked[row][j] == 2: 629 col = j 630 break 631 632 return col
633
634 - def __convert_path(self, path, count):
635 for i in range(count+1): 636 if self.marked[path[i][0]][path[i][1]] == 1: 637 self.marked[path[i][0]][path[i][1]] = 0 638 else: 639 self.marked[path[i][0]][path[i][1]] = 1
640
641 - def __clear_covers(self):
642 """Clear all covered matrix cells""" 643 for i in range(self.n): 644 self.row_covered[i] = False 645 self.col_covered[i] = False
646
647 - def __erase_primes(self):
648 """Erase all prime markings""" 649 for i in range(self.n): 650 for j in range(self.n): 651 if self.marked[i][j] == 2: 652 self.marked[i][j] = 0
653 654 # --------------------------------------------------------------------------- 655 # Functions 656 # --------------------------------------------------------------------------- 657
658 -def make_cost_matrix(profit_matrix, inversion_function):
659 """ 660 Create a cost matrix from a profit matrix by calling 661 'inversion_function' to invert each value. The inversion 662 function must take one numeric argument (of any type) and return 663 another numeric argument which is presumed to be the cost inverse 664 of the original profit. 665 666 This is a static method. Call it like this: 667 668 .. python:: 669 670 cost_matrix = Munkres.make_cost_matrix(matrix, inversion_func) 671 672 For example: 673 674 .. python:: 675 676 cost_matrix = Munkres.make_cost_matrix(matrix, lambda x : sys.maxsize - x) 677 678 :Parameters: 679 profit_matrix : list of lists 680 The matrix to convert from a profit to a cost matrix 681 682 inversion_function : function 683 The function to use to invert each entry in the profit matrix 684 685 :rtype: list of lists 686 :return: The converted matrix 687 """ 688 cost_matrix = [] 689 for row in profit_matrix: 690 cost_matrix.append([inversion_function(value) for value in row]) 691 return cost_matrix
692 725 726 # --------------------------------------------------------------------------- 727 # Main 728 # --------------------------------------------------------------------------- 729 730 if __name__ == '__main__': 731 732 matrices = [ 733 # Square 734 ([[400, 150, 400], 735 [400, 450, 600], 736 [300, 225, 300]], 737 850), # expected cost 738 739 # Rectangular variant 740 ([[400, 150, 400, 1], 741 [400, 450, 600, 2], 742 [300, 225, 300, 3]], 743 452), # expected cost 744 745 746 # Square 747 ([[10, 10, 8], 748 [9, 8, 1], 749 [9, 7, 4]], 750 18), 751 752 # Rectangular variant 753 ([[10, 10, 8, 11], 754 [9, 8, 1, 1], 755 [9, 7, 4, 10]], 756 15)] 757 758 m = Munkres() 759 for cost_matrix, expected_total in matrices: 760 print_matrix(cost_matrix, msg='cost matrix') 761 indexes = m.compute(cost_matrix) 762 total_cost = 0 763 for r, c in indexes: 764 x = cost_matrix[r][c] 765 total_cost += x 766 print('(%d, %d) -> %d' % (r, c, x)) 767 print('lowest cost=%d' % total_cost) 768 assert expected_total == total_cost 769