General Aims

Graph clustering is a research area, which methods of algorithm engineering can contribute to decisively. Presumably, theoretical insights into formally and rigorously defined optimization problems in this field will only be of limited use, since they typically do not cover all relevant aspects of graph clustering. On the other hand, experimental results on specific applications or data sets are not sufficient for gaining consolidated findings that are valid beyond a fixed application. However, due to the increasing need for methods for the evaluation of huge datasets, which are nowadays collected electronically, more generally applicable algorithms are compulsory. Not the least reason algorithmics is called for to provide better procedures, is the fact that in the past years physics pushed into this field (there called “community detection”) within the scope of the heavily fostered domain of “complex systems”. In our opinion algorithm engineering can set something against the physics' approach, since it provides appropriate methods to arrive at universally valid and conferrable insights.

Concretization

In the realization of these intentions we pursue the following goals in the long term:

  1. A systematic modeling of all algorithmic aspects of the clustering of static and of evolving graphs: systematics of quality indices, formulation of measures for comparing clusterings of different static graphs and of different snapshots of an evolving graph.
  2. The development of software tools that support an experimental inspection of clustering algorithms for graphs. Apart from implementations of various algorithms and quality indices this includes measures and procedures for the direct comparison of clusterings, graph generators and a data structure for an apt representation of time-dependent graphs in particular.
  3. The development of algorithms for clustering static and evolving graphs, for which we can state qualitative assertion that are valid beyond a fixed set of applications. The computation of cuts and separators shall be incorporated both as an auxiliary means and as an independent topic.
  4. The development of clustering and decomposition algorithms against the background of selected applications in the field of network analysis. Potentially this will lead to cases of application that concern very large graphs. In this case we shall also consider models (e.g. a streaming model) for handling huge data sets.
  5. The allocation of benchmarks. Besides synthetic data we shall provide meaningful benchmarks that originate from application data.