Point Labeling Algorithms

The algorithms are described below.
Formann and Wagner showed by reduction from 3-SAT that the decision
problem corresponding to the Label Size Maximization Problem is
NP-complete. The main result of
their paper ("A Packing Problem with Applications to Lettering of
Maps", Proceedings of the 7th Annual ACM Symposium on Computational
Geometry (1991) 281--288) is an approximation algorithm A that
finds a valid labeling of at least half the optimal size. In addition,
they show that no polynomial-time approximation algorithm with a
quality guarantee better than 50 percent exists. The running time of
A is O(n log n). Wagner also showed
that there is a matching lower bound on the running time of
Omega(n log n) ("Approximate Map Labeling is in
Omega(n log n)", Information Processing Letters
52 (1994) 161--165).
Conceptually, A works as follows: we start with infinitesimally small,
equally sized squares attached to each point in all four possible
positions. Then all squares are expanded uniformly. To
resolve conflicts between them, we eliminate all those that would
contain another point if they were twice as big. It is easy to show
that after this process, a point p cannot have more than two
squares left that overlap other squares.
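The elimination rule can be sketched as follows (a minimal Python sketch, not the actual implementation; the brute-force pairwise test stands in for the O(n log n) conflict detection, and the function names are our own):

```python
def candidate_square(p, pos, s):
    """Lower-left corner of the square of side s attached to point
    p = (x, y), with p sitting at one of the square's four corners."""
    x, y = p
    dx, dy = {"NE": (0, 0), "NW": (-s, 0), "SE": (0, -s), "SW": (-s, -s)}[pos]
    return (x + dx, y + dy)

def contains(corner, s, q):
    """Does the square with the given lower-left corner and side s contain q?"""
    cx, cy = corner
    qx, qy = q
    return cx <= qx <= cx + s and cy <= qy <= cy + s

def surviving_candidates(points, s):
    """Candidates of size s that survive A's elimination rule: a square is
    eliminated as soon as its doubled version (scaled about its anchor
    point) would contain another point."""
    alive = {}
    for i, p in enumerate(points):
        alive[i] = []
        for pos in ("NE", "NW", "SE", "SW"):
            corner2 = candidate_square(p, pos, 2 * s)
            if not any(contains(corner2, 2 * s, q)
                       for j, q in enumerate(points) if j != i):
                alive[i].append(pos)
    return alive
```

For two points at distance 3 and square size 2, for example, each point keeps exactly the two candidates that face away from the other point.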
If
we consider p a boolean variable and associate its squares with
the values p and not p, we can generate a boolean
formula consisting of clauses that encode all conflicts. Suppose the
square p overlapped the square not q of a point
q; this would give us the clause not (p and not q) = not p
or q, meaning that we do not want p and not q
to be in the solution simultaneously. If we join all such clauses with
the and-operator, the satisfiability of the formula tells us exactly
whether there is a solution of the current size. Since all clauses
consist of two literals, the formula is of 2-SAT type, and
can be evaluated in time proportional to its length (S. Even, A. Itai,
A. Shamir: "On the Complexity of Timetable and Multicommodity Flow
Problems", SIAM J. Comput. 5 (1976) 691--703).
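Such a 2-SAT formula can be checked in linear time via strongly connected components of the implication graph. A sketch (our own Python version, not the original code): variable i encodes the choice between the two remaining squares of point i, and a conflict between square a of p and square b of q yields the clause (not a or not b), e.g. the clause (-1, 2) for "not p or q".

```python
def solve_2sat(n, clauses):
    """Solve 2-SAT over variables 1..n; literals are +i / -i.
    Returns a satisfying assignment (list of booleans) or None.
    Linear in the formula length via Kosaraju's SCC algorithm."""
    def node(lit):                    # 2k: var k+1 true, 2k+1: var k+1 false
        return 2 * (abs(lit) - 1) + (0 if lit > 0 else 1)
    N = 2 * n
    adj = [[] for _ in range(N)]
    radj = [[] for _ in range(N)]
    for a, b in clauses:              # (a or b) == (!a -> b) and (!b -> a)
        adj[node(a) ^ 1].append(node(b))
        adj[node(b) ^ 1].append(node(a))
        radj[node(b)].append(node(a) ^ 1)
        radj[node(a)].append(node(b) ^ 1)
    # pass 1: iterative DFS on adj, record finishing order
    order, seen = [], [False] * N
    for s in range(N):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]
        while stack:
            v, i = stack.pop()
            if i < len(adj[v]):
                stack.append((v, i + 1))
                w = adj[v][i]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(v)
    # pass 2: DFS on radj in reverse finishing order; components get
    # numbered in topological order of the condensation
    comp, c = [-1] * N, 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = c
        stack = [s]
        while stack:
            for w in radj[stack.pop()]:
                if comp[w] == -1:
                    comp[w] = c
                    stack.append(w)
        c += 1
    assign = []
    for i in range(n):
        if comp[2 * i] == comp[2 * i + 1]:
            return None               # x and not x in one SCC: unsatisfiable
        assign.append(comp[2 * i] > comp[2 * i + 1])  # pick the later literal
    return assign
```

For instance, `solve_2sat(2, [(-1, 2)])` finds an assignment satisfying "not p or q", while a formula forcing both x and not x returns None.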
This works only because we make sure that no point has more than two
squares left after the elimination phase. On the other hand, we often
eliminate both of two conflict partners where deleting one would have
sufficed to resolve the conflict. This seems to be the reason for
the very bad practical behaviour of A. In fact, A
usually produces solutions not much better than 50 percent of the
optimum, which makes it nearly useless for practical problems. We
therefore developed a heuristic approach that builds strongly on the ideas of
A, maintains its quality and running-time guarantees, and yields
very convincing results. Instead of eliminating the squares as early
as possible, it eliminates a square only when it is clear that it
cannot be in any solution of the current size. The bad side effect of
this is that some points might have three or four squares left after
the elimination phase. To handle this, we suggest three
different heuristics to bring their number down to two.
The simplest of these heuristics is used by the City of Munich for the
application mentioned above, by the PTT Research Labs of the
Netherlands to produce on-line maps for mobile radio networks, and in
a computer system for the automated search for matching constellations
in a star catalogue (G. Weber, L. Knipping, H. Alt: "An Application of
Point Pattern Matching in Astronautics", Journal of Symbolic
Computation 17 (1994) 321--340) as a tool to label the output on the
screen. With a very similar algorithmic approach we were able to solve
the so-called METAFONT labeling problem posed by Knuth and Raghunathan
("The Problem of Compatible Representatives", SIAM Journal on Discrete
Mathematics 5 (1992) 422--427).
All three heuristics use a common framework:
- Find all interesting conflict sizes.
- Do a binary search on the interesting conflict sizes,
  and check for each size you look at whether there is a
  solution or not, by going through three phases: detect
  the conflicts at the current size (Phase I), eliminate
  all candidates that cannot be in any solution of the
  current size (Phase II), and reduce the number of
  candidates per point to at most two, so that the
  remaining problem can be handed over to 2-SAT (Phase III).
The heuristics differ in the way in which they choose those two candidates in Phase III.
For the first variant, we randomly choose two of the (up to) four candidates left per point before handing them over to 2-SAT. To increase the probability of a choice that admits a solution, this process can be repeated in case of a negative answer. Three repetitions yield good results without prolonging the running time too much.
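The random selection step can be sketched as follows (a minimal Python sketch; the data representation and the `has_solution` callback, which stands for the 2-SAT check, are our own assumptions):

```python
import random

def random_reduction(candidates, has_solution, repetitions=3):
    """candidates: dict point -> list of active candidate positions.
    has_solution: callback (hypothetical 2-SAT check) that takes a dict
    with at most two candidates per point and reports whether a
    conflict-free selection of the chosen candidates exists.
    Retries with fresh random choices on a negative answer."""
    for _ in range(repetitions):
        choice = {p: random.sample(cs, 2) if len(cs) > 2 else list(cs)
                  for p, cs in candidates.items()}
        if has_solution(choice):
            return choice
    return None
```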
For the second variant, we run through all points twice. In the first run, we only look at those with four candidates left, eliminate the one with the most conflicts, and make all decisions of the type we made in Phase II. During the second run, we do the same for points that still have three active candidates. Then the remaining problem (consisting only of points with exactly two candidates) is handed over to 2-SAT.
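The two passes can be sketched as follows (our own Python sketch; the data structures are assumptions, and the Phase II-style propagation of induced decisions is omitted for brevity):

```python
def greedy_reduction(candidates, conflicts):
    """candidates: dict point -> set of active candidate positions
    (modified in place). conflicts: dict (point, pos) -> set of
    conflicting (point, pos) pairs. The first pass shrinks points with
    four candidates, the second pass those with three, each time
    dropping the candidate with the most still-active conflicts."""
    def live_conflicts(p, pos):
        return sum(1 for q, qpos in conflicts.get((p, pos), ())
                   if qpos in candidates.get(q, ()))

    for threshold in (4, 3):
        for p in candidates:
            while len(candidates[p]) >= threshold:
                worst = max(candidates[p],
                            key=lambda pos: live_conflicts(p, pos))
                candidates[p].discard(worst)
    return candidates
```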
For the third variant, we put all remaining active candidates into a priority queue, keyed by the sum of the intersection areas of a candidate p_i with all other active candidates. We then delete the minimum p_i from the queue and eliminate all candidates q_j that overlap it, as well as the other active candidates p_k belonging to p. If any of these decisions induces new ones according to the pattern used in Phase II, these are made as well before the next minimum is deleted from the queue. Naturally, the intersection areas and the data structure have to be updated accordingly. This process is repeated until either some point runs out of candidates ("no solution"), or no point has more than two candidates left, so that the remaining problem can be handed over to 2-SAT.
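The priority-queue variant can be sketched as follows (our own Python sketch with lazy heap updates; the interfaces, and the omission of the Phase II-style propagation, are our simplifications):

```python
import heapq

def pq_reduction(candidates, overlaps, pair_area):
    """candidates: dict point -> set of active positions (modified in place).
    overlaps: dict (point, pos) -> set of overlapping (point, pos) pairs.
    pair_area: dict frozenset({c, d}) -> intersection area of candidates c, d.
    Repeatedly fixes the active candidate with the smallest total overlap,
    eliminating its conflict partners and its siblings, until every point
    has at most two candidates or some point runs out ("no solution")."""
    active = {(p, pos) for p in candidates for pos in candidates[p]}

    def total_area(c):
        return sum(pair_area[frozenset((c, d))]
                   for d in overlaps.get(c, ()) if d in active)

    def kill(c):
        active.discard(c)
        candidates[c[0]].discard(c[1])

    heap = [(total_area(c), c) for c in active]
    heapq.heapify(heap)
    while heap and any(len(s) > 2 for s in candidates.values()):
        area, c = heapq.heappop(heap)
        if c not in active:
            continue                      # stale entry: already eliminated
        if area != total_area(c):         # stale key: overlaps have changed
            heapq.heappush(heap, (total_area(c), c))
            continue
        p, pos = c
        for d in list(overlaps.get(c, ())):
            if d in active:
                kill(d)                   # eliminate conflict partners q_j
        for other in list(candidates[p]):
            if other != pos:
                kill((p, other))          # eliminate siblings p_k
        if any(not s for s in candidates.values()):
            return None                   # a point ran out of candidates
    return candidates
```

Since heaps offer no decrease-key operation, stale entries are simply skipped or re-inserted with the recomputed overlap sum.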
This algorithm, X, was implemented in C++ by Erik Schwarzenecker from Saarbrücken. It uses some ideas of our Heuristic H but solves the problem in Phase III exactly. Thanks to its fine-tuning it handles examples of up to 300 points even slightly faster than the heuristics, but we were forced to introduce a time limit of 5 minutes for larger hard and dense problem sets in order to perform any series of tests in reasonable time. It shows exponential behaviour: for small examples it is very fast, for larger ones it is unreliable.
Still, X is much better in practice than the Exact Solver S with a subexponential time bound suggested in (L. Kucera, K. Mehlhorn, B. Preis, E. Schwarzenecker: "Exact Algorithms for a Geometric Packing Problem" (Extended Abstract), Proceedings of the 10th Annual Symposium on Theoretical Aspects of Computer Science (STACS 93), Lecture Notes in Computer Science 665 (1993) 317--322). S normally runs out of memory for more than 60-80 points, which we could improve to 120-150 when we made it solve only the problem remaining in Phase III. Even splitting this problem into its connected regions and dealing with those separately did not help a great deal.
To achieve A's running-time efficiency of O(n log n), we introduce a new way of detecting conflicts between overlapping squares. It is based on an algorithm for finding closest neighbours suggested by M. Formann ("Algorithms for Geometric Packing and Scaling Problems", Dissertation, Fachbereich Mathematik und Informatik, Freie Universität Berlin 1992). We also maintain A's quality guarantee of 50 percent. To do so, during Phase II of the algorithm we test whether a solution can still be constructed from a subset of the squares that have not been eliminated so far. We improve this elimination phase with the help of an additional detection rule for squares that are not needed in a valid solution. This increases the quality of the results of B significantly, even compared to the best of the heuristics, namely Heuristic I, with which B shares Phases I and III.
Alexander Wolff