Code Monkey home page Code Monkey logo

link-prediction's Introduction

Link Prediction Experiments

This repository contains a series of machine learning experiments for link prediction within social networks.

We first implement and apply a variety of link prediction methods to each of the ego networks contained within the SNAP Facebook dataset and SNAP Twitter dataset, as well as to various random networks generated using networkx, and then calculate and compare the ROC AUC, Average Precision, and runtime of each method.

If you use this repository in your work, please cite the corresponding DOI:

DOI

Link Prediction Methods Tested:

Requirements

Pre-Use Installation

python setup.py install

Included Files

Network Data

  • facebook/: Original Facebook ego networks dataset, with added .allfeats files (with both ego and alter features)
  • fb-processed/: Pickle dumps of (adjacency_matrix, feature_matrix) tuples for each ego network, and for combined network
  • twitter/: Twitter ego networks dataset (combined), with adjacency matrix pickle dump
  • visualizations/: Visualizations of each network generated by networkx and matplotlib
  • network-statistics/: .txt and .pkl files of pre-calculated network characteristics (with info on connectivity, network size, etc.) for each network
  • train-test-splits/: Pickle dumps of pre-processed train-test splits for Facebook ego networks, with varying degrees of visibility (i.e. how many edges are hidden). Includes: adj_train, train_edges, train_edges_false, val_edges, val_edges_false, test_edges, test_edges_false
  • process-ego-networks.py: Script used to process raw Facebook data and generate pickle dumps
  • process-combined-network.py: Script used to combine Facebook ego networks and generate complete network pickle dump
  • process-twitter-network.py: Script used to process raw Twitter data and generate pickle dumps
  • fb-train-test-splits.py: Script used to generate and store train-test splits for each Facebook ego network
  • twitter-train-test-splits.py: Script used to generate and store train-test-splits for Twitter combined network

Annotated Link Prediction IPython Notebooks

  • link-prediction-baselines.ipynb: Adamic-Adar, Jaccard Coefficient, Preferential Attachment
  • spectral-clustering.ipynb: Using spectral embeddings for link prediction
  • node2vec.ipynb: Skip-gram based representation learning for node/edge embeddings
  • graph-vae.ipynb: (Variational) Graph Autoencoder, learns node embeddings to recreate adjacency matrix

Link Prediction Helper Scripts

  • link_prediction_scores.py: Utility functions for running various link prediction tests

Exploratory Analysis

  • network-visualizations.ipynb: Generate .pdf visualizations for each network, in addition to calculating and storing a variety of network metrics (e.g. transtivity, avg. clustering coefficient, etc.)

Full Link Prediction Experiments

  • nx-graph-experiments.ipynb: Run all link prediction tests on various types of random networks (Erdos-Renyi, Barabasi-Albert, etc.)
  • fb-graph-experiments.ipynb: Run all link prediction tests on each Facebook ego network
  • run-all-experiments.py: Run all link prediction experiments (on both Facebook networks and random networkx networks), save results as pickle dumps in results
  • run-twitter-experiments.py: Run all link prediction experiments on combined (directed) Twitter network. Warning: may take a while to run.

Results

  • results/: Pickle dumps of experiment results, with results (ROC AUC, ROC Curve, Avg. Precision, Runtime) stored for each link prediction method in Python dictionary form
  • investigate-results.ipynb: Generate and save bar plots/ROC curve plots for each method and graph type, save network characteristics to .txt files
  • result-plots-by-graph/: Bar plots for the results of each experiment (ROC AUC, AP, Minimum Runtime) organized by network, in .pdf form
  • result-plots-by-algorithm/: Bar plots for the AUROC results of each experiment organized by link prediction algorithm, in .pdf form

link-prediction's People

Contributors

lucashu1 avatar tkipf avatar gokceneraslan avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.