Algorithm and code

The algorithm is defined in [Wilschut et al. (2017)]. The link between names of parameters in the algorithm and the code is listed in the Markov clustering function parameters table.

The algorithm finds a bus, and performs bottom-up clustering of both the bus nodes and the non-bus nodes. The bus detection algorithm is coded in the BusComputing class, while a clustering step is handled in the ClusterComputing class.

All these operations work on or produce subsets of nodes. A subset is represented by a BitSet instance in code, over nodes in its associated matrix. The clustering hierarchy is stored in Group instances.

The Markov clustering algorithm however assumes a complete matrix rather than a subset of it, where previously found clusters are compressed into a single node. This conversion is handled by the submatrix.SubMatrixFunctions functions that construct and use a SubNode translation table to convert the overall matrix to a sub-matrix, and convert clustering results in the sub-matrix back to groups in the overall matrix.

After all computing steps are finished, a Group hierarchy describes which nodes and groups belong to which cluster. From this information a shuffle array is constructed, relating result nodes back to originating nodes. That array is then used to construct the resulting labels and the resulting adjacency matrix.