PyCCAlg - Correlation-Clustering Algorithms in Python

Python implementation of some algorithms for Correlation Clustering. Specifically:

- Linear-programming + region-growing O(log n)-approximation algorithms for general weighted graphs:
  - round_demaine in src/pyccalg.py: Demaine et al.'s rounding algorithm
  - round_charikar in src/pyccalg.py: Charikar et al.'s rounding algorithm
- kwikcluster in src/pyccalg.py: the randomized, linear-time KwikCluster algorithm (Ailon et al., JACM 2008), which achieves constant-factor approximation guarantees on complete graphs whose weights satisfy certain constraints (e.g., the probability constraint and/or the triangle-inequality constraint)
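
For readers unfamiliar with the pivoting idea behind KwikCluster, here is a minimal sketch. It is not the code in src/pyccalg.py: the function name `kwikcluster_sketch` and the `positive_neighbors` input are hypothetical, and the sketch assumes each edge has already been labelled positive or negative (for weighted input one might, for instance, call an edge positive when its positive weight exceeds its negative weight).

```python
import random


def kwikcluster_sketch(vertices, positive_neighbors, seed=None):
    """Pivot-based clustering sketch; returns a list of clusters (sets of vertices).

    positive_neighbors maps each vertex to the set of vertices it shares a
    positive edge with (an assumed input encoding, chosen for illustration).
    """
    rng = random.Random(seed)
    order = list(vertices)
    # Scanning a random permutation and skipping clustered vertices is the
    # same as repeatedly drawing a uniform random pivot from what remains.
    rng.shuffle(order)
    remaining = set(order)
    clusters = []
    for pivot in order:
        if pivot not in remaining:
            continue  # already absorbed into an earlier pivot's cluster
        # The pivot takes all of its still-unclustered positive neighbors.
        cluster = {pivot} | (positive_neighbors.get(pivot, set()) & remaining)
        clusters.append(cluster)
        remaining -= cluster
    return clusters


# Toy input made up for this example: two disjoint positive cliques.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"e"},
    "e": {"d"},
}
toy_clusters = kwikcluster_sketch(toy_graph.keys(), toy_graph, seed=0)
print(toy_clusters)
```

On this toy input every pivot pulls in its whole clique, so the two cliques are recovered whatever the random order; only the order in which the clusters are listed depends on the seed.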

Usage:

python src/pyccalg.py -d <DATASET_FILE> [-r <LB,UB>] [-a <PROB>] [-s {'pulp','scipy'}] [-m {'charikar','demaine','kwik'}]

Optional arguments:
- -r <LB,UB>, to generate random edge weights from the range [LB,UB]
- -a <PROB>, to randomly add edges with probability PROB
- -m {'charikar','demaine','kwik'}, to choose the algorithm (default: 'charikar'). NOTE: KwikCluster is always run as well, whichever value is chosen
- -s {'pulp','scipy'}, to select the solver to be used (default: 'scipy', which seems faster)

Dataset-file format:
- First line: #VERTICES \t #EDGES
- One line per edge; every line is a quadruple: NODE1 \t NODE2 \t POSITIVE_WEIGHT \t NEGATIVE_WEIGHT (POSITIVE_WEIGHT and NEGATIVE_WEIGHT are ignored if the code is run with the -r option)
- See the data folder for some examples
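
The format above can be read with a few lines of Python. The following is a sketch rather than the loader used by src/pyccalg.py: the function name `parse_dataset` is hypothetical, node identifiers are kept as strings and weights are parsed as floats (both assumptions), and the sample text is invented for illustration instead of being taken from the data folder.

```python
def parse_dataset(text):
    """Parse tab-separated dataset text into (n_vertices, edges).

    edges is a list of (node1, node2, positive_weight, negative_weight).
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # First line: number of vertices and number of edges.
    n_vertices, n_edges = (int(tok.strip()) for tok in lines[0].split("\t"))
    edges = []
    for ln in lines[1:]:
        # Every following line is a quadruple describing one edge.
        node1, node2, w_pos, w_neg = (tok.strip() for tok in ln.split("\t"))
        edges.append((node1, node2, float(w_pos), float(w_neg)))
    if len(edges) != n_edges:
        raise ValueError(f"header declares {n_edges} edges, found {len(edges)}")
    return n_vertices, edges


# Made-up sample: 3 vertices, 2 edges.
sample_text = "3\t2\n0\t1\t0.8\t0.2\n1\t2\t0.1\t0.9\n"
n_vertices, edges = parse_dataset(sample_text)
print(n_vertices, edges)
```

Checking the edge count against the header is a design choice of this sketch; it catches truncated files early, but it is not something the format description requires.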
