Project for the IASD Master's program, run jointly by Paris-Dauphine, École Normale Supérieure, and Mines ParisTech.
Two versions are available:
- PySpark in Jupyter Notebook: ccf-project-pyspark
- Scala in Databricks.
Reference:
- Kardes, H., Agrawal, S., Wang, X., & Sun, A. (2014). CCF: Fast and scalable connected component computation in MapReduce.
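
For readers new to the algorithm, here is a minimal single-machine sketch of the CCF iteration described in the reference, written in plain Python rather than PySpark or Scala. It is not the code from either version of this project. The function names `ccf_iterate` and `connected_components` are hypothetical, and a Python `set` stands in for the deduplication job, assuming the paper's CCF-Iterate and CCF-Dedup steps are applied until no new pair is produced.

```python
from collections import defaultdict


def ccf_iterate(pairs):
    """Run one CCF-Iterate round followed by deduplication (illustrative only)."""
    # "Map" step: emit every pair in both directions, then group by key,
    # so each node sees all of its current neighbours.
    adjacency = defaultdict(set)
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)

    new_pairs = 0   # plays the role of the NewPair counter
    out = set()     # using a set removes duplicate pairs (stand-in for CCF-Dedup)
    # "Reduce" step: attach the node and its neighbours to the smallest
    # neighbour id, but only when that id is smaller than the node itself.
    for key, values in adjacency.items():
        smallest = min(values)
        if smallest < key:
            out.add((key, smallest))
            for v in values:
                if v != smallest:
                    new_pairs += 1
                    out.add((v, smallest))
    return out, new_pairs


def connected_components(edges):
    """Return {node: component_id}, where component_id is the smallest node id."""
    pairs = set(edges)
    while True:
        pairs, new_pairs = ccf_iterate(pairs)
        if new_pairs == 0:  # no new pair was produced: the result is stable
            break
    labels = {node: comp for node, comp in pairs}
    # The smallest node of each component never appears as a key; add it.
    for comp in set(labels.values()):
        labels.setdefault(comp, comp)
    return labels


if __name__ == "__main__":
    print(connected_components([(1, 2), (2, 3), (3, 4), (5, 6)]))
```

In a Spark setting the grouping and deduplication would presumably map onto operations such as `groupByKey` and `distinct`; the sketch above only shows the logic of one iteration and the stopping condition.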