CMSC 341 Fall 2012
Project 4

Assigned	Monday, Nov 19, 2012
Due	11:59 pm on Sunday Dec 9, 2012

Objectives

In this project you will use a min binary heap to support solving a combinatorial optimization problem, the linear assignment problem, using a method called uniform cost search. This is also an exercise to put together several data structures you have learned in this class to provide a solution for a non-trivial problem.

Description

The Linear Assignment Problem

This problem can be described as follows. There are n agents and n tasks, each agent can perform each of the n tasks at different costs. An assignment is a pairing of all n agents to n distinct tasks. There is a total of n! different assignments (Why?). Given the cost matrix Cost(a_i, t_j), the (optimal) solution to this problem is one of the assignments with the MINIMUM total cost for the n tasks. The total cost is the sum of the costs for all of the n tasks performed by the respective assigned agents.

The linear assignment problem and its more complex variations (e.g., the number of agents is different from the number of tasks, one agent allows to perform more than one tasks, etc.) find many applications in operations research such as transportation scheduling. Many algorithms have been developed for this problem and its variations. Some guarantee to find the exact (or optimal) solution (a minimum cost assignment), others find approximate (or sub-optimal) solutions (with small but not necessarily minimum cost) for problems with large size for which finding the exact solutions will be computationally intractable. The method you are asked to implement, the uniform cost search algorithm, guarantees to find the optimal solution, albeit in an inefficient way.

Uniform Cost Search

This method searches through the solution space and constructs the optimal solution along the way. It starts with an empty partial solution S0 in which no agent is assigned a task. Then all children of S0 are generated, each represents a new partial solution that expands S0with one more agent assigned. Now, these partial solutions represent different directions to search forward, and the question is which direction we should choose? The uniform cost search method chooses from the outstanding partial solutions the one with the smallest total cost. If the chosen partial solution is an assignment (i.e., all agents are assigned a distinct task), then this solution is guaranteed to be an optimal solution and it returned. Otherwise, all children of this partial solution are generated and compared with other outstanding partial solutions.

The algorithm for the uniform cost search is outline below, where openList contains all partial solutions we have generated but not expanded.

openList = {S0=()}; /* initialization */

Forever do {

x <- smallest cost element in openList; /* x is removed from openList */

If (x is an assignment) return x; /* an optimal solution is found */

generate all children of x and put them into openList; /* otherwise */

}

From the computational performance pointer of view, the central requirement for openList is to find and remove the element with minimum cost. Therefore, min binary heap becomes an ideal data structure to implement openList.

Example

The algorithm is illustrated in the following tiny example.

Table 1. The cost matrix

	t1	t2	t3
a1	5	10	15
a2	25	15	5
a3	8	16	20

Table 2. The openList

step	removed element	openList content
0		((), 0)
1	()	((1, 1), 5), ((3, 1), 8), ((2, 1), 25)
2	((1, 1), 5)	((3, 1), 8), ((1, 1), (2, 2), 20), ((1, 1), (3, 2), 21), ((2, 1), 25)
3	((3, 1), 8)	((3, 1), (1, 2), 18), ((1, 1), (2, 2), 20), ((1, 1), (3, 2), 21) , ((3, 1), (2, 2), 23), ((2, 1), 25)
4	((3, 1), (1, 2), 18)	((1, 1), (2, 2), 20), ((1, 1), (3, 2), 21), ((3, 1), (2, 2), 23), ((3, 1), (1, 2), (2, 3), 23), ((2, 1), 25),

As can be seen in Table 2, each element in the openList represents a partial solution (an incomplete assignment). It contains a list of agent-task pairs of this partial solution and its total cost (underlined). For example, ((1, 1), (3, 2), 21) has a1 assigned to t1 and a3 assigned to t2, with the total cost of 5 + 16 = 21. All elements in openList are arranged in the ascending order of their costs for illustration purpose. Elements inserted in each step are in bold; they are children of the element just removed from openList. If continued, openList will keep growing with more partial solutions generated and inserted. The process will stop at step 8, when ((3, 1), (1, 2), (2, 3), 23), is removed from the list and determined to be a complete assignment. This assignment is then returned as the solution.

Note:

1) For illustrative purpose all elements in openList in Table 2 are in ascending order of their costs, not in heap order.

2) Since the last two elements in the heap after step 4 have the same cost (23), which one is placed ahead the other depends on how the tie is broken. To ensure a consistent output, this tie-breaking is left for the author’s code.

Generate children

Table 2 also illustrates how the children of a partial solution are generated: each child contains one more agent-task pair for the next unassigned task. Children of S0 are all possible agent assignments to task t1. Children of a child of S0 are all possible assignment to t2 of those free agents. For example, ((1, 1), 5) has assigned a1 to t1, and a2 and a3 are now free agents. It thus has two children ((1, 1), (2, 2), 20) and ((1, 1), (3, 2), 21), with a2 and a3 assigned to t2, respectively. In general, in a partial solution with k agent-task pairs, k agents are assigned to tasks 1 to k. This partial solution has n - k free agents and thus n - k children, each having one free agent assigned to task k + 1.

A note of the method

The uniform cost search algorithm belongs to a class of algorithms known as best-first search that guarantees exact (optimal) solutions to many combinatorial optimization problems. This algorithm itself is very inefficient. However, it serves as a basis for more efficient best-first search algorithms. For example, if we can (conservatively) estimate the cost to complete an assignment from a partial solution and add it to the total cost, then the uniform cost search becomes the celebrated A* search algorithm of significantly higher time and space performance.

Project Requirements and Notes

1. Project 4 will be invoked with a command line of a single argument: the name of the data file. The first line of the data file contains a positive integer: the number of agents (and tasks). The rest of the file contains the cost matrix, one line per a row of the matrix, as shown in Table 1, for costs of one agent performing each of the tasks. All entries in the matrix are positive integers; they are separated by a sing space within each line. The data file for our tiny example can be found here.

2. Implement the openList using the author’s code for binary heap, which can be found here.

Children of a partial solution are generated and inserted into the heap (openList) by the order of the agent sequence (for example, the three children of S0 are generated in the order of (1, 1), (2, 1), and (3, 1).) If two or more partial solutions have the same cost, the tie is broken by the author’s heap code.

Be careful when you design the class for the elements of openList and develop code for generating children for partial solutions.

3. Output. When the search stops, print out the following

The number of agents
The total number of partial solution generated (counting both S0 and the final solution)
The final size of openList (after the solution is removed).
The solution you found (the sequence of the agent-task pairs, in ascending order of tasks) and the total cost of the solution assignment.

The out put for our tiny example would look like

The number of agents: 3

The total number of partial solutions generated: 12

The final size of openList: 4

The solution found: ((3, 1), (1, 2), (2, 3))

The cost of the solution: 23

Project Grading