Code organization

For a user, the entry point is the DsmClustering.flowBasedMarkovClustering method, which takes the following parameters:

Table 1. Markov clustering function parameters

| Parameter | Algorithmic name | Empirically useful range | Description |
|---|---|---|---|
| RealMatrix adjacencies | | | Adjacency matrix between the nodes. The non-negative element (r,c) denotes the weight of the connection from node r to node c. |
| Label[] labels | | | Labels of the nodes. |
| double evap | | Between 1.5 and 3.5 | Influence factor of distance between nodes. |
| int stepCount | | Normally 2 | Number of steps taken each iteration in the Markov clustering. |
| double inflation | | Between 1.5 and 3.5 | Speed factor of deciding clusters. |
| double epsilon | | | Convergence limit. |
| BusDetectionAlgorithms busDetectionAlgorithm | | | The bus detection algorithm to apply. |
| double busInclusion | | | Tuning factor for selecting bus nodes. Interpretation depends on the chosen bus detection algorithm. |
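
To give a feel for what stepCount, inflation and epsilon control, the following is a minimal sketch of textbook Markov clustering on plain arrays. It is not the implementation behind flowBasedMarkovClustering: the class MarkovClusteringSketch, its method names and the maxIterations guard are assumptions made for illustration, and evap and bus detection are left out entirely. Rows are normalized so that element (r,c) keeps the meaning "from node r to node c".

```java
/** Textbook Markov clustering on plain arrays; NOT the library's implementation. */
class MarkovClusteringSketch {

    /** Scales each row to sum to 1, so entry (r,c) is the probability of stepping from r to c. */
    static double[][] normalizeRows(double[][] m) {
        double[][] out = new double[m.length][];
        for (int r = 0; r < m.length; r++) {
            double sum = 0.0;
            for (double v : m[r]) {
                sum += v;
            }
            out[r] = new double[m[r].length];
            for (int c = 0; c < m[r].length; c++) {
                out[r][c] = (sum == 0.0) ? 0.0 : m[r][c] / sum;
            }
        }
        return out;
    }

    /** Plain product of two square matrices of equal size. */
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;
        double[][] out = new double[n][n];
        for (int r = 0; r < n; r++) {
            for (int k = 0; k < n; k++) {
                for (int c = 0; c < n; c++) {
                    out[r][c] += a[r][k] * b[k][c];
                }
            }
        }
        return out;
    }

    /** Expansion: take stepCount random-walk steps, i.e. raise the matrix to that power. */
    static double[][] expand(double[][] m, int stepCount) {
        double[][] result = m;
        for (int i = 1; i < stepCount; i++) {
            result = multiply(result, m);
        }
        return result;
    }

    /** Inflation: raise every entry to the given power, then renormalize the rows. */
    static double[][] inflate(double[][] m, double inflation) {
        double[][] powered = new double[m.length][];
        for (int r = 0; r < m.length; r++) {
            powered[r] = new double[m[r].length];
            for (int c = 0; c < m[r].length; c++) {
                powered[r][c] = Math.pow(m[r][c], inflation);
            }
        }
        return normalizeRows(powered);
    }

    /** Largest absolute entry-wise difference between two matrices of equal shape. */
    static double maxAbsDifference(double[][] a, double[][] b) {
        double max = 0.0;
        for (int r = 0; r < a.length; r++) {
            for (int c = 0; c < a[r].length; c++) {
                max = Math.max(max, Math.abs(a[r][c] - b[r][c]));
            }
        }
        return max;
    }

    /** Alternates expansion and inflation until the matrix changes by less than epsilon. */
    static double[][] cluster(double[][] adjacencies, int stepCount, double inflation,
                              double epsilon, int maxIterations) {
        double[][] current = normalizeRows(adjacencies);
        for (int i = 0; i < maxIterations; i++) {
            double[][] next = inflate(expand(current, stepCount), inflation);
            boolean converged = maxAbsDifference(current, next) < epsilon;
            current = next;
            if (converged) {
                break;
            }
        }
        return current;
    }
}
```

In this sketch a larger inflation value sharpens the rows faster, and a smaller epsilon demands more iterations before the loop stops. Whether the library behaves the same way in detail is not something this sketch can confirm.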

The function groups the nodes into a bus and hierarchical clusters by shuffling the nodes into a different order.
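
As an illustration of what shuffling means for the data, the sketch below reorders an adjacency matrix and its labels with a given node order. The class ShuffleSketch and the convention that order[i] holds the original index of the node placed at new position i are assumptions for this example; the library may store the mapping in the opposite direction.

```java
/** Illustrative reordering of a matrix and its labels; names and conventions are hypothetical. */
class ShuffleSketch {

    /** Returns a copy in which new row/column i corresponds to original node order[i]. */
    static double[][] shuffleMatrix(double[][] adjacencies, int[] order) {
        int n = order.length;
        double[][] out = new double[n][n];
        for (int r = 0; r < n; r++) {
            for (int c = 0; c < n; c++) {
                // Rows and columns are permuted together so (r,c) still means "from r to c".
                out[r][c] = adjacencies[order[r]][order[c]];
            }
        }
        return out;
    }

    /** Returns the labels in the same new order as the shuffled matrix. */
    static String[] shuffleLabels(String[] labels, int[] order) {
        String[] out = new String[order.length];
        for (int i = 0; i < order.length; i++) {
            out[i] = labels[order[i]];
        }
        return out;
    }
}
```

Applying the same order to rows, columns and labels keeps every weight attached to the same pair of nodes; only their positions change.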

The result is a Dsm instance. It contains how the original nodes were shuffled to obtain the result, a shuffled version of the provided adjacency matrix and labels, and a tree describing the nested clustering hierarchy. For each group in that tree, the getShuffledBase and getShuffledSize methods provide the node offset and length of the group in the shuffled version.
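
The offset and length pair can be read as a half-open index range into the shuffled labels. The sketch below walks a small hand-built tree to show this. The Group class here is a hypothetical stand-in that only mirrors the getShuffledBase and getShuffledSize method names; the getChildren method, the constructor and the GroupTreeSketch helper are assumptions and do not reflect the real classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative walk over a nested group tree; all types here are hypothetical stand-ins. */
class GroupTreeSketch {

    /** Stand-in for one group: an offset and length in the shuffled order, plus child groups. */
    static final class Group {
        private final int shuffledBase;
        private final int shuffledSize;
        private final List<Group> children = new ArrayList<>();

        Group(int shuffledBase, int shuffledSize, Group... children) {
            this.shuffledBase = shuffledBase;
            this.shuffledSize = shuffledSize;
            this.children.addAll(Arrays.asList(children));
        }

        int getShuffledBase() { return shuffledBase; }
        int getShuffledSize() { return shuffledSize; }
        List<Group> getChildren() { return children; }
    }

    /** Labels covered by a group: positions base (inclusive) to base + size (exclusive). */
    static String[] labelsOf(Group group, String[] shuffledLabels) {
        int base = group.getShuffledBase();
        return Arrays.copyOfRange(shuffledLabels, base, base + group.getShuffledSize());
    }

    /** Appends one indented line per group, parents before their children. */
    static void describe(Group group, String[] shuffledLabels, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        int base = group.getShuffledBase();
        out.append('[').append(base).append(", ").append(base + group.getShuffledSize())
           .append("): ").append(String.join(", ", labelsOf(group, shuffledLabels))).append('\n');
        for (Group child : group.getChildren()) {
            describe(child, shuffledLabels, depth + 1, out);
        }
    }
}
```

Because each child range lies inside its parent's range in this sketch, the nesting of the hierarchy can be read directly from the index ranges.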