Code Monkey home page Code Monkey logo

alp-baselines's Introduction

Baselines for Anchor Link Prediction (Network Alignment)

This project contains baselines, including MNA, PALE, IONE, FINAL, FRUI-P, CROSSMNA and REGAL.

MNA

  • Citation: Kong, X., Zhang, J., & Yu, P. S. (2013). Inferring anchor links across multiple heterogeneous social networks. In Proceedings of the 22nd ACM international conference on Conference on information & knowledge management - CIKM ’13 (pp. 179–188).

  • Implementation

python alp_main.py --method mna --graphs data/test.src.net data/test.obj.net --identity-linkage data/test.align --use-net True --neg-ratio 5 --output test.res
  • Evaluation
python eval_mna.py -net-src data/test.src.net -net-end data/test.obj.net -feat-src test.res.epoch5.node_order1 -feat-end ffvm.res.epoch5.node -linkage data/test.align -use-net true -model test.res.pkl -n-cands 1 -eval-type cls -output eval.mna

FINAL

  • Citation: Zhang, S. (2016). FINAL: Fast Attributed Network Alignment. In Proceedings of the 22th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining - KDD '16 (pp. 421–434).

  • Implementation

python alp_main.py --method final --graphs data/test.src.net data/test.obj.net --graph-sizes 20 20 --identity-linkage data/test.align --output test --alpha 0.1 --tol 0.1

FRUI-P

  • Citation: Zhou, X., Liang, X., Member, S., Du, X., & Zhao, J. (2018). Structure Based User Identification across Social Networks. IEEE Transactions on Knowledge and Data Engineering - TKDE, 30(6), 1178–1191.

  • Implementation

Step 1: network embedding (FFVM)

python ne_main.py --input data/test.src.net --output ffvm.res --batch-size 6 --table-size 100 --rep-size 4 --method ffvm --neg-ratio 2 --order 1

Step 2: matching (FRUI-P)

python alp_main.py --method fruip --embeddings test.res.epoch5.node_order1 ffvm.res.epoch5.node --graphs data/test.src.net data/test.obj.net --identity-linkage data/test.align --output test.fruip --epochs 10

PALE

  • Citation: Man, T., Shen, H., Liu, S., Jin, X., & Cheng, X. (2016). Predict anchor links across social networks via an embedding approach. International Joint Conference on Artificial Intelligence - IJCAI (1823–1829).

  • Implementation

Step 1: network embedding (LINE)

python ne_main.py --input data/test.src.net --output test.res --batch-size 6 --table-size 100 --rep-size 4 --method line --neg-ratio 2 --order 1

Step 2: matching (PALE)

python alp_main.py --embeddings test.res.epoch5.node_order1 ffvm.res.epoch5.node --type-model lin --identity-linkage data/test.align --output test.alp --batch-size 4 --input-size 4 --epochs 10 --method pale --neg-ratio 5 --device :/cpu:0
  • Evaluation
python eval_pale.py -feat-src test.res.epoch5.node_order1 -feat-end ffvm.res.epoch5.node -linkage data/test.align -model test.alp -model-type lin -eval-type cls -n-layer 2 -n-cands 1 -output eval.pale

IONE

  • Citation: Liu, L., Cheung, W. K., Li, X., & Liao, L. (2016). Aligning Users Across Social Networks Using Network Embedding. International Joint Conference on Artificial Intelligence - IJCAI (pp. 1774–1780).

  • Implementation

See [Author's implementation][https://github.com/ColaLL/IONE]

  • Evaluation
python eval_ione.py -feat-src xxx -feat-end xxx -linkage data/test.align -eval-type cls -n-cands 1 -output eval.ione

CROSSMNA

  • Citation: Chu, X., Fan, X., Yao, D., Zhu, Z., Huang, J., & Bi, J. (2019). Cross-Network Embedding for Multi-Network Alignment. The World Wide Web Conference - WWW (pp. 273-284).

  • Implementation

python alp_main.py --graphs data/test.src.net data/test.obj.net --identity-linkage data/test.align --log-file test.log --batch-size 4 --epochs 10 --table-size 1000 --method crossmna --lr .01 --nd-rep-size 4 --layer-rep-size 8 --neg-ratio 5 --output test.crossmna.out
  • Evaluation
python eval_crossmna.py -node test.crossmna.out -train-linkage data/test.align -linkage data/test.align -n-cands 1 -eval-type cls -output eval.crossmna

REGAL

  • Citation: Heimann, M., Shen, H., Safavi, T., & Koutra, D. (2018, October). Regal: Representation learning-based graph alignment. Proceedings of the 27th ACM International Conference on Information and Knowledge Management - CIKM (pp. 117-126)

  • Implementation

python regal/regal.py --input data/test.src.net data/test.obj.net --output test.regal.out
  • Evaluation
python eval_regal.py -node test.regal.out -linkage data/test.align -n-cands 1 -eval-type cls -output eval.regal

Notes (Updated 12/8/2021)

Update shell commands for evaluations

Here we provide metrics on ranking and classification issues. For classification issue, we calculate the evaluation results as follows,

src2end
Overall: 17
(1,1),0.439630,1
(1,7),5.488685,0
(2,2),1.018289,1
(2,4),1.463131,0
(3,3),0.769002,1
(3,15),2.228442,0
(4,4),2.119972,1
(4,7),5.188379,0
(5,5),1.434842,1
(5,19),0.474886,0
(6,6),2.483092,1
...

The first line shows the order of calculating anchor links, e.g., 'src2end'. The network which puts back refers that we will sample nodes in that network to compose negative sample in evaluation (We will take example later). The second line shows the overall tasks in the evaluation task. Then we start to record the results of the choosen metric. In classification, each line (from 3rd line to the end) has three columns separated by comma ','. It outputs the decision whether the pair of nodes (e.g.,(1,1)) is anchor node, and the decision score is listed in the second column. The last column records the ground truth provided by test dataset. As we only observe anchor links across networks in test dataset, we need to supplement negative samples. The negative samples are generated by sampling from the target network (e.g., the network listed back in first line). Then you can use any classification metrics by the results, e.g., AUC, F1 score, Accuracy and Recall (recommended scikit learn package in python to caluculate these metrics).

For ranking issue, we calcuate the evaluation results as follows,

src2end
Overall: 17
(1,1):1.000000
(2,2):0.500000
(3,3):0.500000
(4,4):0.250000
(5,5):0.250000
(6,6):0.200000
(7,7):0.166667
(8,8):0.250000
(9,9):0.200000
...
mean_mrr:0.41960784313725485, var:0.08660899653979237

The first and second lines have the same meaning as mentioned above. From the 3rd line to the end, each line has two columns seperated by colon ':', refering the MRR value (1.000000) corresponded to the pair of nodes (e.g., (1,1)). At the last line, we provide statistic results on mean MRR and its variance (calculated by np.mean and np.var).

We also update anchor link prediction methods based on profile, text and trajectory information. Please see the project https://github.com/wsdeprofile/ALP_algorithm, including

  • STUL
Wei Chen, Hongzhi Yin, Weiqing Wang, Lei Zhao, Wen Hua, and Xiaofang Zhou. (2017). Exploiting spatio-temporal user behaviors for user linkage. Proceedings of
the 2017 ACM on Conference on Information and Knowledge Management - CIKM (pp. 517–526)
  • HYDRA
Siyuan Liu, Shuhui Wang, Feida Zhu, Jinbo Zhang, and Ramayya Krishnan. (2014). HYDRA: large-scale social identity linkage via heterogeneous behavior modeling. Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data - SIGMOD. (pp. 51–62).

alp-baselines's People

Contributors

allen517 avatar

Stargazers

Crystal avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.