Point Labeling Algorithms


These are the algorithms discribed below:

Approximation Algorithm A


Formann and Wagner showed by reduction from 3-SAT that the decision problem which corresponds to the Label Size Maximization Problem, is NP-complete. The main result of that paper ("A Packing Problem with Applications to Lettering of Maps", Proceedings of the 7th Annual ACM Symposium on Computational Geometry (1991) 281--288) is an approximation algorithm A that finds a valid labeling of at least half the optimal size. In addition, it is shown that no polynomial time approximation algorithm with a quality guaranty better than 50 percent exists. The running time of A is in O(n log n). Wagner also showed that there is a matching lower bound on the running time of Omega(n log n) ("Approximate Map Labeling is in Omega(n log n)", Information Processing Letters 52 (1994) 161--165).

A conceptually works as follows: We start with infinitesimal equally sized squares attached to each point in all four possible positions. Then all squares are expanded uniformly. In order to resolve conflicts between them, we eliminate all those which would contain another point if they were twice as big. It is easy to show that after this process, a point p can not have more than two squares left which overlap other squares.
If we consider p a boolean variable and associate its squares with the values p and not p, we can generate a boolean formula consisting of clauses which encode all conflicts. Suppose the square p was overlapping the square not q of a point q, this would give us the clause not (p and not q) = not p or q meaning that we do not want p and not q to be simultanously in the solution. If we join all such clauses with the and-operator, the satisfiability of the formula tells us exactly whether there is a solution of the current size. Since all clauses consist of two laterals, the formula is of 2-SAT type, and can be evaluated in time proportional to its length (S. Even, A. Itai, A. Shamir: "On the complexity of Timetable and Multicommodity Flow Problems", SIAM J. Comput. 5 (1976) 691--703).

This works only because we make sure that no point has more than two squares left after the elimination phase. On the other hand, we often eliminate both of two conflict partners, where it would have sufficed to delete one to resolve the conflict. This seems to be the reason for the practically very bad behaviour of A. In fact, A usually produces solutions not much better than 50 percent of the optimum, which makes it nearly useless for practical problems. So we developed a heuristical approach that uses strongly the ideas of A, maintains its quality and running time guaranty, and yields very convincing results. Instead of eliminating the squares as early as possible, it eliminates a square just when it is clear that it cannot be in any solution of the current size. The bad side effect of this is, that some points might have three or four squares left after the elimination phase. In order to handle this, we suggest three different heuristics to bring their number down to two.

The Heuristics

The simplest of these heuristics is used by the City of Munich for the application mentioned above, by the PTT Research Labs of the Netherlands to produce on-line maps for mobile radio networks, and in a computer system for the automated search for matching constellations in a star catalogue (G. Weber, L. Knipping, H. Alt: "An Application of Point Pattern Matching in Astronautics", Journal of Symbolic Computation 17 (1994) 321--340) as a tool to label the output on the screen. With a very similar algorithmic approach we were able to solve the so-called METAFONT labeling problem posed by Knuth and Raghunathan ("The Problem of Compatible Representatives", SIAM Journal on Discrete Mathematics 5 (1992) 422--427).

All three heuristics use a common framework:

  1. Find all interesting conflict sizes.
  2. Do a binary search on the interesting conflict sizes, and check for each size you look at, whether there is a solution or not, by going through the following three phases:
  3. Phase I: Preprocessing.
  4. Phase II: Make all decisions which do not destroy a possible solution.
  5. Phase III: For those points which still have two or more candidates left, choose exactly two, and check whether this remaining problem is solvable by 2-SAT.
The heuristics differ in the way in which they choose those two candidates in Phase III.

Heuristic H

We randomly choose two of the possible four candidates left per point, before we hand them over to 2-SAT. To increase the probability of a choice which enables a solution, this process can be repeated in case of a negative answer. Three repetitions yield good results without prolonging the running time too much.

Heuristic I

Here we run through all points twice. In the first run, we only look at those with four candidates left, eliminate the one with most conflicts, and make all decisions of the type we did in Phase II. During the second run, we do the same for points which still have three active candidates. Then the remaining problem (consisting only of points with exactly two candidates) is handed over to 2-SAT.

Heuristic J

For the third variant, we put all active candidates left into a priority queue according to the sum of all intersection areas of a candidate p_i. We then delete the minimum p_i from the queue, and eliminate all candidates q_j which overlap it, and the other active candidates p_k belonging to p. If any of these decisions induces new ones according to the pattern used in Phase II, then these are made as well, before the next minimum is deleted from the queue. Naturally the sizes of the intersection areas, and the data structure, have to be updated accordingly. This process is repeated until either one point runs out of candidates ("no solution"), or no point has more than two of them left, so the remaining problem can be handed over to 2-SAT.

Exact Solver X

This algorithm was implemented by Erik Schwarzenecker from Saarbrücken in C++. It uses some ideas of our Heuristic H but solves the problem in Phase III exactly. Thanks to its fine tuning it handles examples of up to 300 points even slightly faster than the heuristics, but we were forced to introduce a time limit of 5 minutes for larger hard and dense problem sets to be able to perform any test row in reasonable time. It shows exponential behaviour. For small examples it is very fast, for larger ones it is unreliable.

Exact Solver S

Still X is much better in practice than the Exact Solver S with a subexponential time bound suggested in (L. Kucera, K. Mehlhorn, B. Preis, E. Schwarzenecker: "Exact Algorithms for a Geometric Packing Problem" (Extended Abstract), Proceedings of the 10th Annual Symposium on Theoretical Aspects of Computer Science (STACS 93), Lecture Notes in Computer Science 665 (1993) 317--322). It normally runs out of memory for more than 60-80 points, which we could improve to 120-150, when we made it solve only the problem remaining in Phase III. Even splitting this up into its connected regions, and dealing with those seperately, did not help a great deal.

Approximation Algorithm B

In order to achieve A's running time efficiency of O(n log n), we introduce a new way of detecting conflicts between overlapping squares. It is based on an algorithm to find closest neighbours, which was suggested by M. Formann ("Algorithms for Geometric Packing and Scaling Problems", Dissertation, Fachbereich Mathematik und Informatik, Freie Universität Berlin 1992). We also maintain A's quality guarantee of 50 percent. In order to do so, we perform a test whether a solution can still be constructed from a subset of the squares which have not been eliminated so far during Phase II of the algorithm. We improve this elimination phase with the help of an additional detection rule for squares which are not needed in a valid solution. This increases the quality of the results of B significantly, even compared to the best of the heuristics, namely Heuristic I , with which B shares Phase I and III.
Back to the Labeling Home Page.
Alexander Wolff