Partitioned Graph Coloring
Before going into how to solve the partitioned graph coloring problem using integer linear programming (which is the goal of my thesis), I thought it would be a good idea to actually explain what is the partitioned graph coloring problem (or PCP from now on).
Graphs and partitions
First of all, graphs. Formally, a graph is defined as two sets: a set of nodes which compose the graph, and a set of edges between pairs of nodes. For example, the definition for the so-called diamond graph is the following:

And a representation for it could be the following:

While this is the simplest form of a graph, several additions can be made: weights or attributes can be assigned to either nodes or edges; edges can be directed; multiple edges can be allowed between pairs of nodes; etc. The modification we will be dealing with in this scenario is partitioning the graph’s nodes. As such, the graph is now defined as three sets: besides the nodes and edges, the set of partitions, where each partition is a non-empty set of nodes such that every node belongs to exactly one partition.
A possible partitioning of the diamond graph previously presented could be the following:

Coloring
Having defined the kind of graphs we will be working with, it’s time to define the problem itself. The graph coloring problem consists in defining a function that assigns a label (a “color”) to each node in the graph, with the restriction that every pair of adjacent nodes must have different colors.
For example, a valid coloring for the diamond graph presented earlier would be the following:
The goal here is to find a valid coloring that uses the minimum possible number of different colors. This value is called the chromatic number of a graph.
Determining if there is a valid k-coloring (coloring that uses exactly k colors) for a particular graph is usually difficult to determine, where difficult implies that it requires a huge computational effort; for those of you with background on CS, this problem is known to be NP-complete.
For .NET fans, it is worth checking the posts that Eric Lippert has been uploading recently on making a backtracking algorithm for the coloring problem using C#.
Partitioned Coloring
The coloring problem has several variants. For example, it might be of interest to minimize the not only the number of different colors used, but also the difference between how many nodes are painted with each color. Or allow that adjacent nodes have the same color, incurring in a penalty in an objective function, if it allows the usage of fewer colors.
In this case, the partitioned graph coloring consists in coloring one node per partition, with the restriction that two adjacent nodes may not have the same color. The difference with the standard coloring is that not all nodes must be colored, only one in each partition.
Going back to our diamond example, by cleverly picking which nodes to color in each partition, we can partition-color the whole graph using a single color:
Even though not all nodes are to be colored, this variant of the problem is still computationally difficult (NP-complete) to solve. Most approaches to it are heuristic, this is, they do not find the optimal solution, but offer a reasonable one in a reasonable time, as all algorithms that guarantee finding the best solution take too long to execute for large graphs.
Our goal will be to use a technique called integer linear programming to solve this problem, obtaining the exact solution in a reasonable time, for medium-sized instances.
What is Linear Programming
Despite being able to simply provide a link to wikipedia’s already excellent article on Linear Programming, I wanted to provide a short introduction in order to present what I am doing exactly on my thesis in a future article.
Linear Programming is best described as a technique, in which we want to maximize or minimize an objective function subject to a set of constraints. It is called linear because both the objective function and the constraints are linear inequalities on the variables.
And just to make sure that no one is left behind: a linear function on a set of variables
is a function
, where
are numbers. For example,
is a linear function, whereas
isn’t.
Example: diet problem
A classical example for linear programming is the diet problem. We want to determine a diet that satisfies all of our nutritional requirements while keeping it as cheap as possible. We have a set of different foods that fulfill certain requirements by containing a certain amount of proteins, vitamins, etc; and each food has a certain price per kilogram. Needless to say, the values in the following table are completely fictional.
| Food | Proteins per kg | Vitamins per kg | Carbohydrates per kg | Cost per kg |
|---|---|---|---|---|
| Apples | 2 | 4 | 2 | 5 |
| Beef | 10 | 5 | 4 | 20 |
| Cucumbers | 4 | 3 | 3 | 6 |
| Potatos | 1 | 2 | 8 | 2 |
Suppose also that we need at least 60 units of each nutrient in our diet. Our objective will be to find a solution that specifies how much food of each type we should buy to cover all of our needs while keeping budget to a minimum.
In order to create a linear programming model, we must first define which variables we will be working with. We will define
as how many kg of food
we will be purchasing; therefore, our variables will be
.
Now we must specify our restrictions. We want to consume at least 60 units of each nutrient, and each food provides a certain amount per kg of it. For instance, we may express that we need 60 units of carbs as:

Note that each unit of apples provides 2 units of carbs, therefore the total amount of carbs provided by apples will be the product
; same holds for all foods.
Similar expressions can be derived for vitamins:

And for proteins:

Needless to say, all amounts must be positive, as we are buying food, not selling it. Therefore, we add constraints:

The set of constraints we have built specify a set of valid or feasible solutions for us. For example, we will not accept a solution in which we purchase only 20kg of apples, since although that satisfies our vitamin intake, it does not satisfy carbs and proteins.
What we have to do now is, from every feasible solution, choose one that minimizes how much money we spend. So far, nothing prevents buying a whole supermarket of food from being a valid solution, albeit a very expensive one. In order to prevent that situation, we add an objective function to minimize, which is the cost of all the food we want to buy:

The purpose of the objective function is to measure how good or bad a particular solution is, whereas the constraints restrict what we consider a valid solution for us. Putting all of them together, we have constructed a linear programming model for our diet problem.
Generalizing
A linear programming problem consists in the minimization of a linear objective function
subject to a set of linear constraints, which can be expressed as
, where
is a matrix in which element
represents how much each unit of variable
contributes to satisfying demand
for product
.
Note that by using simple algebraic operations we can include all kind of non-strict linear constraints:
,
and
. Therefore, there are different canonical ways to represent a linear programming problem, one of which is the one we have already seen:

We may also express it as a maximization of the objective function:

Or either of them subject to a set of equalities:

In all canonical forms variables are usually restricted to be positive or zero, which makes sense in most scenarios. In a future post I would like to revisit this subject, using one of the canonical forms to go a little deeper into the economical interpretation of each of the variables and constraints. But for now, we will go straight to the resolution.
Resolution
We have seen how to model an optimization problem using linear programming, but we still need to know how to solve it. Luckily, there are several algorithms that deal with these specific problems, one of the most widely known is simplex.
How simplex works deserves another blog post of its own, but for now lets just say that it solves most models incredibly fast. The problem of solving an LPP (linear programming problem) is known to be polynomial, which means that it can be solved within an acceptable timespan.
Needless to say, linear programming is an invaluable tool for modeling several real life scenarios, and has multiple applications within operations research. In future posts I would like to dwelve deeper into linear programming, which I will do if I have some spare time, but for now I will go on blogging within the scope of my thesis: integer linear programming.
Making a thesis 2.0
Continuing with the tradition started by Beta two years ago here in Manas, I am now dedicating all my efforts to finish my thesis in order to obtain my MSc degree in Computer Science at FCEN-UBA. Having started it a few months ago, Manas generously started sponsoring my work in May, so I can deal with the last steps in the project and give it the finishing touch it deserves.
This work gives Manas Research division a very different direction than the one Beta gave it. Whereas his work was in the field of lambda calculus, my thesis is in operations research, integer linear programming to be more specific, dealing with a generalization of the graph coloring problem.
During these weeks I will be blogging about linear programming in general and the scope of my thesis, but for now I will upload the presentation I gave during the first days of June in the last ALIO/INFORMS joint meeting that took place in Buenos Aires so you can get a preview of what has been done so far.


