Wednesday, June 24, 2015

Efficient Use of Exponential Size Linear Programs

After being sick on and off for a couple of years, I decided not to finish my PhD. I did write a small thesis, though. Thus on March 20, 2015, there were two big cosmic events: a partial solar eclipse and my licentiate defense. It was cloudy in Stockholm, so the eclipse was a disappointment, but my defense wasn't: I passed.

I've tried to make the thesis introduction easy to understand and you can now read it here on my blog.

What is the thesis about?

Computer Science is no more about computers than astronomy is about telescopes. – (probably) Edsger Wybe Dijkstra
When Santa Claus needs to find a route around the Earth to distribute all the gifts to children in the shortest possible time, it might take millions of years to compute even if he used all the supercomputers that are currently on Earth. This is an example of a problem in computer science that is not known to be solvable fast even after decades of research.

Maybe we have just not been intelligent enough to find the right approach for this problem. However, if you ask a complexity theorist, they would most likely tell you that it will never be possible. We cannot be completely sure of this yet, but the evidence is piling up (see the excellent essay The Scientific Case for P ≠ NP by Scott Aaronson).

If it really is impossible for Santa Claus to find the shortest route to distribute all gifts, what other options does he have? Instead of insisting on finding the shortest route, he might try to find a route that is at most 50% longer than the shortest one. Arguably, this should be an easier task and traveling a 50% longer route could still allow him to distribute all presents in time.

First Attempt

The problem that Santa Claus needs to solve is called the Traveling Salesman Problem (TSP). The famous algorithm by Christofides [Chr76] finds a solution to TSP within an error of 50% from the optimal solution. This is an example of an approximation algorithm: it does not solve the problem optimally but only approximately, that is, within a certain error from the optimal solution.

The time complexity of this algorithm is n^2.373 [MS04], where n is the number of stops that Santa makes. In other words, a computer running the algorithm performs at most about n^2.373 operations.

To get the value of n, we need to estimate the number of families with children. According to Gapminder [Gap11], there are about 1.9 billion children on Earth. The average number of children per family has been estimated to be almost three [Fam06], so we can estimate that Santa Claus will need to make at most about 650 million stops. Not all children expect a visit from Santa Claus, but a large fraction do, so this estimate is good enough for our purposes.

If Santa Claus had access to the fastest supercomputer Tianhe-2 (as of November 2014), with a speed of 33.86 petaflops, which is about 3 · 10^16 operations per second, it should take a few hours to finish the computation. We note that the theoretically optimal subroutine taking about n^2.373 operations [MS04] might in this case be slower in practice than a subroutine using Edmonds' algorithm [Edm65], which needs at most n^2.5 operations. In any case, such an algorithm will finish its execution in a couple of days.
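These estimates are easy to reproduce. Here is a quick sanity check in Python, using the rough figures above (these are back-of-envelope constants from the text, not precise benchmarks):

```python
# Back-of-envelope running times on Tianhe-2, using the rough figures
# from the text: about 650 million stops, about 3e16 operations per second.
n = 650e6
ops_per_second = 3e16

hours_matching = n ** 2.373 / ops_per_second / 3600   # fast matching subroutine [MS04]
days_edmonds = n ** 2.5 / ops_per_second / 86400      # Edmonds' algorithm [Edm65]

print(f"n^2.373: about {hours_matching:.0f} hours")
print(f"n^2.5:   about {days_edmonds:.0f} days")
```

The first variant comes out at several hours and the second at a few days, matching the estimates above.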

A possible problem is the high memory consumption of these algorithms, but for the sake of simplicity of this text let us ignore this problem. A bigger problem might be the 50% error, because a tour that is 50% longer than the shortest one might be too long to distribute all the presents in time.

Improving the Error

The celebrated algorithm by Arora and Mitchell [Aro98, Mit99] can find a solution to TSP within an error of, say, 20, 10 or 5% from the optimal solution. We can pick any error margin we like, but the smaller the error, the longer the running time. The algorithm works provided that the problem is situated in a Euclidean space. For the purpose of this text, we only need to know that the well-known 2-dimensional (2D) plane and 3-dimensional (3D) space are both Euclidean.

We could pretend that the Earth is flat by using one of the many map projections. Unfortunately, map projections do not preserve an important geometric property, namely that the space is Euclidean and thus cannot be used by the algorithm.

Let us then try using 3 dimensions. The Arora-Mitchell algorithm assumes that the shortest path between two points is the straight line connecting them, but this is a problem for Santa, since he cannot travel under the ground. As it turns out, it is not such a big problem in the end.

Santa Claus can run the Arora-Mitchell algorithm on all positions of children on Earth as if they were simple points in a 3-dimensional space. Then he follows the route returned by the algorithm with the important change that he flies over the Earth’s surface instead of following the straight line under ground. How much longer is the route when flying over the Earth’s surface?

The most remote inhabited place on Earth, Tristan da Cunha, is about 2000 kilometers from the nearest inhabited land. However, the direct distance through the Earth is about 11.5 kilometers shorter, which is only about 0.5% less. The situation is shown in the picture below.

The distance between Tristan da Cunha and the nearest inhabited place, Saint Helena, is about 2000 kilometers. The direct distance through the Earth is about 11.5 kilometers shorter, which is only about 0.5% less.
Of course, we cannot be sure that 2000 kilometers will be the longest distance between two stops found by the algorithm; however, such long jumps will most likely not happen very often in the optimal tour. Human settlements are fairly spread out over the Earth's surface, and intuitively an optimal tour should not make a lot of long jumps: it is likely more efficient to first distribute all gifts in Asia and then go to Europe rather than flying 2000 kilometers back and forth all the time.
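The chord-versus-arc comparison is a one-liner on a spherical Earth. The sketch below uses the rounded 2000 km figure and a mean Earth radius of 6371 km, so the saving comes out slightly smaller than the 11.5 km quoted above (which was presumably computed from the exact distance), but it is of the same order: well under 1%.

```python
import math

# Chord vs. arc on a spherical Earth (mean radius ~6371 km).
R = 6371.0
arc = 2000.0                          # rounded surface distance from the text
theta = arc / R                       # central angle in radians
chord = 2 * R * math.sin(theta / 2)   # straight line through the Earth
saving = arc - chord

print(f"chord ≈ {chord:.1f} km, saving {saving:.1f} km "
      f"({100 * saving / arc:.2f}% of the arc)")
```

Whatever the exact input distance, the relative saving stays below one percent, which is all the argument needs.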

We thus have a strong reason to be optimistic, so let us hope that the Arora-Mitchell algorithm returns a tour such that Santa travels only about a 0.5% longer route compared to the one following straight lines under ground. The Arora-Mitchell algorithm can return a solution within x% of the optimal tour for any error x we choose. How big should x be if we want a total error of 10%?

If your answer is 9.5%, you are close but not completely right. We want to have 1.005 · (1 + x) = 1.1, so x ≈ 9.45%. Let us now estimate how long the algorithm will run. The number of operations performed by the algorithm in 3 dimensions is at least

n (log n)^((a/x)^2),

where n is the number of families to visit and a is a number that is at least 9 [RS98]. We note that the exact value of a is higher than 9, but for our purposes this information is enough.
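The calculation of x is simple enough to check directly (the constants are the ones from the text):

```python
# Two error sources multiply: flying over the surface costs a factor 1.005,
# and the algorithm's own error x costs a factor (1 + x).
target = 1.10            # total error we are willing to accept: 10%
surface_overhead = 1.005 # ~0.5% from following the Earth's curvature

x = target / surface_overhead - 1
print(f"x ≈ {100 * x:.2f}%")   # about 9.45%, slightly less than the naive 9.5%
```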

Even if Santa Claus had access to the fastest supercomputer Tianhe-2, it would take about 10^1250 years to run the algorithm (10^1250 is a 1 followed by 1250 zeros), or equivalently about 10^1240 times the age of the universe. Even if each person on Earth had such a supercomputer and we ran the algorithm in parallel on all of them, the running time would still be 10^1230 times the age of the universe.

For comparison, the Held-Karp algorithm [HK62], which performs about n^2 · 2^n operations, is the fastest known exact algorithm for finding the optimal route. It would run for about 10^100000000 times the age of the universe. The running time of the Arora-Mitchell algorithm is tiny compared to this one.
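The Held-Karp dynamic program is short enough to sketch. Here is a minimal Python version; it is fine for a dozen stops and hopeless for 650 million, which is exactly the point:

```python
from itertools import combinations

def held_karp(dist):
    """Length of the shortest tour through all cities, starting and ending
    at city 0, via the Held-Karp dynamic program. O(n^2 * 2^n) time."""
    n = len(dist)
    # best[(S, j)]: shortest path from 0 that visits exactly the set S,
    # ending at city j (S always contains j, never 0).
    best = {(frozenset([j]), j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            S = frozenset(subset)
            for j in subset:
                best[(S, j)] = min(best[(S - {j}, k)] + dist[k][j]
                                   for k in subset if k != j)
    whole = frozenset(range(1, n))
    return min(best[(whole, j)] + dist[j][0] for j in range(1, n))

# A toy instance with four stops; the optimal tour 0-1-3-2-0 has length 23.
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 8],
        [10, 4, 8, 0]]
print(held_karp(dist))  # 23
```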

Heuristics

Even though the problem that Santa Claus faces may sound artificial, it appears in real life for companies that distribute goods. Instead of approximation algorithms, they often use heuristics: computer programs that try to solve a problem well in practice, but for which only weak guarantees on the quality of the output are known.

For an approximation algorithm, we always have a guarantee that the output will be of a certain quality, but we might have to pay for this guarantee with slower running time, as in the case of the Arora-Mitchell algorithm. Heuristics, on the other hand, are usually very fast but we might be unlucky and receive a solution that is not good enough.
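To make the trade-off concrete, here is a minimal sketch of one of the simplest TSP heuristics, nearest neighbor (real routing software uses far more sophisticated heuristics such as Lin-Kernighan; this one is just for illustration):

```python
def nearest_neighbor_tour(dist, start=0):
    """Greedy TSP heuristic: always go to the closest unvisited stop.
    O(n^2) time, but with no good worst-case guarantee on tour quality."""
    n = len(dist)
    tour, seen = [start], {start}
    while len(tour) < n:
        last = tour[-1]
        nxt = min((c for c in range(n) if c not in seen),
                  key=lambda c: dist[last][c])
        tour.append(nxt)
        seen.add(nxt)
    return tour

def tour_length(dist, tour):
    # Sum of consecutive legs, including the leg back to the start.
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 8],
        [10, 4, 8, 0]]
tour = nearest_neighbor_tour(dist)
print(tour, tour_length(dist, tour))  # [0, 1, 3, 2] 23
```

On this toy instance the greedy tour happens to be optimal; on adversarial inputs it can be far from optimal, which is exactly the "we might be unlucky" caveat above.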

In 1947, Dantzig [Dan51] developed the simplex method, which became one of the most important algorithms of the 20th century. It is very fast in practice, but for many years its theoretical speed was unknown. In 1970, Klee and Minty [KM72] found examples on which the algorithm is very slow. The algorithm nevertheless continued to be used in practice, and still is today. For a long time it had a status similar to that of heuristics, since we did not know when and why it performs well.

In 1977, Borgwardt [Bor86] found a class of linear programs that are provably solved quickly by the simplex method. Subsequently, others found more cases where the simplex method performs well. Finally, in 2001, five decades after the algorithm was discovered, Spielman and Teng developed a powerful tool called smoothed analysis [ST04] that explains why the simplex method is fast in practice even though there are rare cases where it is very slow.

For many real-world problems, heuristics are both faster and better performing than approximation algorithms, and we do not know why, just as used to be the case for the simplex method. It is not unheard of for heuristics to be very fast and achieve an error of about 1%, while the best-known approximation algorithm only achieves an error on the order of tens of percent (see for example [JM07]). Some people use this fact to argue against the use of complexity theory, but we would argue the exact opposite. Since some heuristics seem to perform well, we should study them more and try to figure out why they perform so well, or find cases where they do not, as we did in the case of the simplex method. If anything, this is an argument for more research in complexity theory, not less.

What Should Complexity Theorists do?

There are several interesting research directions that can help here. For example, we can try to find faster approximation algorithms, for which we do not have to wait 10^1250 years. We can also try to prove that some heuristics used in practice perform very well, hence turning a heuristic into an approximation algorithm. The hardest alternative of all is to prove that there will never be an algorithm that finds good solutions and is fast at the same time. Such negative results are called hardness results, because they prove that a problem is hard to solve fast.

History has shown that it is important to be both optimistic and skeptical at the same time. Some problems were thought to be hard but an algorithm was found in the end; while for other problems, a very simple and naive algorithm turned out to be the best we can hope for. However, for most of the problems, their complexity is still not well established. The best-known algorithms are often very far from the best-known hardness results and the truth can lie anywhere in between.

What is the Contribution of this Thesis?

In the past decades, linear programming (LP) has been successfully used to develop approximation algorithms for various optimization problems. The simplex method mentioned earlier is used to solve linear programs and is considered to be one of the most important algorithms of the twentieth century, which underlines the importance of linear programming in computer science.

The so-called assignment LP led to substantial progress for various allocation problems, where items are allocated to players or machines. An example of such a problem is scheduling on unrelated parallel machines, where computer tasks are to be scheduled on a number of machines so that the last task finishes as soon as possible. Unfortunately, we know that the best-known approximation algorithms for this problem already achieve the best approximation ratio that any algorithm based on the assignment LP can achieve. The bound on the approximation ratio achievable from an LP is called the integrality gap of the LP.
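For concreteness, the assignment LP for this scheduling problem can be written as follows. This is the standard textbook formulation, not the exact notation of the thesis: x_ij is the fraction of task j assigned to machine i, p_ij is the processing time of task j on machine i, and T is the makespan to be minimized.

```latex
\begin{align*}
\text{minimize }\quad & T \\
\text{subject to }\quad & \sum_{i} x_{ij} = 1 && \text{for every task } j, \\
& \sum_{j} p_{ij} \, x_{ij} \le T && \text{for every machine } i, \\
& x_{ij} \ge 0.
\end{align*}
```

The LP has one variable per machine-task pair, so it is of polynomial size; the configuration LP discussed below instead has one variable per machine and *set* of tasks, which is where the exponential size comes from.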

We have thus reached the limits of the assignment LP. The natural question is then whether a different LP formulation can lead to better algorithms. We answer this question positively for variants of two allocation problems: max-min fair allocation and maximum budgeted allocation, which we define in later chapters. This is achieved by using a more powerful LP formulation called the configuration LP, which has exponential size. Exponential size is normally considered inefficient, but the configuration LP can be approximated efficiently. This is where the title of the thesis comes from: we use exponential-size linear programs in an efficient way.

The restricted max-min fair allocation problem, also known as the restricted Santa Claus problem, is one of the few problems that have a better polynomial-time estimation algorithm than approximation algorithm. An estimation algorithm estimates the value of an optimal solution but is not necessarily able to find a solution of that value. The configuration LP can be used to estimate the optimal value within a factor of 1/(4 + ε) for any ε > 0, but it is not known how to efficiently find a solution achieving this value. We give an approximation algorithm with the same approximation guarantee, but improve the running time to n^O(log n). Our techniques also have the interesting property that although we use the rather complex configuration LP in the analysis, we never actually solve it, and therefore the resulting algorithm is purely combinatorial.

For the maximum budgeted allocation (MBA) we prove that the integrality gap of the configuration LP is strictly better than 3/4 and provide corresponding polynomial time rounding algorithms for two variants of the problem: the restricted MBA and the graph MBA. Finally, we improve the best-known upper bound on the integrality gap for the general case from 5/6 ≈ 0.833 to 2√2 − 2 ≈ 0.828 and also show hardness of approximation results for both variants studied.

Further reading

The full thesis is available for download and you can also have a look at my slides, which contain plenty of additional pictures.

Bibliography


[Aro98]    Sanjeev Arora. Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems. J. ACM, 45(5):753–782, 1998.
[Bor86]    Karl Heinz Borgwardt. A Probabilistic Analysis of the Simplex Method. Springer-Verlag New York, Inc., New York, NY, USA, 1986.
[Chr76]    Nicos Christofides. Worst-case analysis of a new heuristic for the traveling salesman problem. Technical report, CMU, 1976.
[Dan51]    George B Dantzig. Maximization of a linear function of variables subject to linear inequalities, in activity analysis of production and allocation. 1951.
[Edm65]    Jack Edmonds. Paths, trees, and flowers. Canadian Journal of mathematics, 17(3):449–467, 1965.
[Fam06]    Number of people in a family. http://web.archive.org/web/20140819085940/http://hypertextbook.com/facts/2006/StaceyJohnson.shtml, 2006. Accessed: 2014-08-19.
[Gap11]    The world has reached peak number of children! http://web.archive.org/web/20141120230405/http://www.gapminder.org/news/world-peak-number-of-children-is-now/, 2011. Accessed: 2014-11-20.
[HK62]    Michael Held and Richard M Karp. A dynamic programming approach to sequencing problems. Journal of the Society for Industrial & Applied Mathematics, 10(1):196–210, 1962.
[JM07]    David S Johnson and Lyle A McGeoch. Experimental analysis of heuristics for the STSP. In The traveling salesman problem and its variations, pages 369–443. Springer, 2007.
[KM72]    Victor Klee and George J Minty. How good is the simplex algorithm? In Inequalities III. Academic Press, 1972.
[KMN+14]   Christos Kalaitzis, Aleksander Madry, Alantha Newman, Lukas Polacek, and Ola Svensson. On the configuration lp for maximum budgeted allocation. In IPCO, pages 333–344, 2014.
[Mit99]    Joseph S. B. Mitchell. Guillotine subdivisions approximate polygonal subdivisions: A simple polynomial-time approximation scheme for geometric tsp, k-mst, and related problems. SIAM J. Comput., 28(4):1298–1309, 1999.
[MS04]    M. Mucha and P. Sankowski. Maximum matchings via Gaussian elimination. In Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 248–255, 2004.
[PS12]    Lukas Polacek and Ola Svensson. Quasi-polynomial local search for restricted max-min fair allocation. In ICALP, pages 726–737, 2012.
[RS98]    Satish B Rao and Warren D Smith. Approximating geometrical graphs via “spanners” and “banyans”. In Proceedings of the thirtieth annual ACM symposium on Theory of computing, pages 540–550. ACM, 1998.
[ST04]    Daniel A. Spielman and Shang-Hua Teng. Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. J. ACM, 51(3):385–463, 2004.