
makg's Introduction

Copyright 2023 by Southeast University & Nanjing University of Posts and Telecommunications.

Time: 3/9/2023 Authors: Heng Zhou & Weizhuo Li & Buye Zhang

Mail: [email protected] & [email protected] & [email protected]

Description: MAKG is a high-quality mobile application knowledge graph covering millions of applications. It provides an open data resource for researchers in the Semantic Web and cybersecurity communities. You can use its original resources and visit the website to use its services.

1. Introduction:

In this work, we present a mobile application knowledge graph, namely MAKG, which merges comprehensive resources (e.g., application markets, encyclopedias, news) to construct a high-quality knowledge graph about millions of applications.

We present a comprehensive framework for constructing a mobile application knowledge graph for cybersecurity, in which a lightweight ontology of apps is defined and concrete steps (App Crawling, Knowledge Extraction, Knowledge Alignment) are instantiated with promising algorithms. The framework yields more structured triples and correspondences among entities from different resources. In addition, we list use-cases of MAKG that help provide better services for security analysts and users.

2. Usage:

MAKG consists of five resources: Ontology (Knowledge Schema), AppMarket-Triples, Encyclopedias, AppMarket-Alignments, and Extraction-Triples.

A technical report with details of these resources and related evaluations can be downloaded from the same address.

Ontology:

We design a lightweight ontology of apps. It provides a well-defined schema for the collected apps so that they can share more links with each other. It contains 26 basic classes, 11 relations and 45 properties.

We provide two files (appOntology.owl and appSchema.xlsx) for researchers. Opening the former requires Protégé to be installed.
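For readers who prefer to inspect the ontology programmatically rather than in Protégé, the following is a minimal sketch using only the Python standard library. It assumes that appOntology.owl is serialized as RDF/XML and that the "relations" and "properties" above correspond to owl:ObjectProperty and owl:DatatypeProperty declarations; neither assumption is confirmed by this README. The inline sample and its example.org IRIs are hypothetical.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical, tiny RDF/XML ontology standing in for appOntology.owl.
SAMPLE_OWL = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/onto#App"/>
  <owl:Class rdf:about="http://example.org/onto#Developer"/>
  <owl:ObjectProperty rdf:about="http://example.org/onto#developedBy"/>
  <owl:DatatypeProperty rdf:about="http://example.org/onto#version"/>
</rdf:RDF>"""


def count_schema_elements(root):
    """Count top-level class and property declarations under rdf:RDF.

    Only direct children of the root are counted, so class references
    nested inside other elements are not counted twice.
    """
    kinds = ("Class", "ObjectProperty", "DatatypeProperty")
    return {kind: len(root.findall(f"{{{OWL}}}{kind}")) for kind in kinds}


root = ET.fromstring(SAMPLE_OWL)
counts = count_schema_elements(root)
print(counts)
```

To try this on the real file, replace `ET.fromstring(SAMPLE_OWL)` with `ET.parse("appOntology.owl").getroot()`. If the file uses a different OWL serialization, this sketch will not apply as written.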

AppMarket-Triples:

This dataset contains raw triples crawled from Huawei AppGallery, Xiaomi App Store, Google Play, and App Store.

All triple files from the application markets are provided in the .nt (N-Triples) format.
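As an illustration of how .nt files can be consumed, here is a minimal sketch of a simplified N-Triples reader written with the Python standard library. It is not a full parser: it handles IRI subjects and predicates with IRI or quoted-literal objects, and skips anything else (for example blank nodes). The sample triples and the example.org IRIs are hypothetical and do not reflect the actual MAKG vocabulary.

```python
import re
from collections import defaultdict

# Simplified N-Triples pattern: <subject> <predicate> (<object IRI> | "literal")
# with an optional language tag or datatype after the literal.
NT_LINE = re.compile(
    r'^\s*<([^>]*)>\s+<([^>]*)>\s+'
    r'(?:<([^>]*)>|"((?:[^"\\]|\\.)*)"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?)'
    r'\s*\.\s*$'
)


def parse_nt(lines):
    """Yield (subject, predicate, object) tuples from N-Triples lines."""
    for line in lines:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        match = NT_LINE.match(line)
        if match is None:
            continue  # skip lines this simplified pattern cannot handle
        subj, pred, obj_iri, obj_lit = match.groups()
        yield subj, pred, obj_iri if obj_iri is not None else obj_lit


def group_by_subject(triples):
    """Collect (predicate, object) pairs per subject."""
    grouped = defaultdict(list)
    for subj, pred, obj in triples:
        grouped[subj].append((pred, obj))
    return dict(grouped)


# Hypothetical sample lines; real files would be read with open(path).
SAMPLE = [
    '<http://example.org/app/1> <http://example.org/prop/name> "Example App"@en .',
    '<http://example.org/app/1> <http://example.org/rel/developedBy> <http://example.org/dev/9> .',
    '# a comment line',
    '<http://example.org/app/2> <http://example.org/prop/name> "Another App" .',
]

apps = group_by_subject(parse_nt(SAMPLE))
for subject, facts in apps.items():
    print(subject, facts)
```

For production use, a dedicated RDF library is a safer choice than a regular expression, since it covers escapes, blank nodes, and other parts of the N-Triples grammar that this sketch ignores.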

Encyclopedias:

This dataset contains the triples of apps crawled from Baidu Baike, Toutiao Baike, and Wikipedia.

As few apps were found on Wikipedia, we only provide the triples of apps extracted from Baidu Baike and Toutiao Baike.

AppMarket-Alignments:

This dataset contains the alignments of apps across markets. Aligned apps can share and reuse each other's description information, which helps MAKG provide better services for security analysts and users. We utilize two kinds of entity alignment techniques (i.e., a rule miner method and a knowledge graph embedding-based platform) and keep the best results.

We present all the manually labeled alignments among the four mainstream application markets for evaluation.

In addition, we also provide the correspondences that are automatically generated by RuleMiner and KG embedding methods (i.e., MultiKE, RDGCN, NMN).
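One common way to use the manually labeled alignments is to score automatically generated correspondences against them. The sketch below computes precision, recall, and F1 over sets of entity pairs. It is a generic evaluation, not necessarily the protocol used in the technical report, and the identifiers such as `huawei:app1` are hypothetical placeholders rather than real MAKG entity names.

```python
def evaluate_alignments(predicted, gold):
    """Return (precision, recall, f1) of predicted pairs against gold pairs."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical manually labeled alignments between two markets.
gold = {
    ("huawei:app1", "xiaomi:appA"),
    ("huawei:app2", "xiaomi:appB"),
    ("huawei:app3", "xiaomi:appC"),
    ("huawei:app4", "xiaomi:appD"),
}

# Hypothetical output of an automatic alignment method.
predicted = {
    ("huawei:app1", "xiaomi:appA"),
    ("huawei:app2", "xiaomi:appB"),
    ("huawei:app3", "xiaomi:appX"),
}

precision, recall, f1 = evaluate_alignments(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Treating each alignment as an ordered pair keeps the comparison simple; if the provided files list pairs in either direction, the pairs would need to be normalized before comparison.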

Extraction-Triples:

This dataset contains the triples extracted from textual descriptions of apps crawled from application markets. We utilize three strategies (i.e., an infobox-based method, named entity recognition, and relation extraction platforms including OpenNRE, DeepKE, and FewRel) and select the best models to extract basic triples.

We also provide the labeled corpus for training the methods based on named entity recognition and relation extraction.
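The README does not describe the infobox-based method in detail. As a rough illustration of the general idea only, the sketch below maps key-value fields of an app page to triples through a hand-written field-to-predicate table. The field names, predicate names, and the `infobox_to_triples` helper are all hypothetical and are not taken from the MAKG code or ontology.

```python
# Hypothetical mapping from infobox field labels to schema predicates.
FIELD_TO_PREDICATE = {
    "Developer": "developedBy",
    "Version": "version",
    "Category": "category",
}


def infobox_to_triples(app_name, infobox):
    """Turn known, non-empty infobox fields into (app, predicate, value) triples."""
    triples = []
    for field, value in infobox.items():
        predicate = FIELD_TO_PREDICATE.get(field.strip())
        value = value.strip()
        if predicate and value:  # drop unmapped fields and empty values
            triples.append((app_name, predicate, value))
    return triples


# Hypothetical infobox of a crawled app page.
infobox = {
    "Developer": "Example Studio",
    "Version": " 1.2.0 ",
    "Size": "35 MB",   # not in the mapping, so it is ignored
    "Category": "",    # empty value, so it is ignored
}

triples = infobox_to_triples("Example App", infobox)
print(triples)
```

Fields without a mapping are dropped rather than guessed, which keeps the output aligned with a fixed schema at the cost of losing unmapped information.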

3. Use-Cases:

We list the main cybersecurity use-cases of MAKG on our website.

  • MAKG can provide semantic retrieval for users and security analysts. For example, if a user queries an app, MAKG can present more comprehensive information than the application markets do.

  • MAKG can link apps to the textual descriptions in which they appear (e.g., news) with entity linking techniques. Together with semantic retrieval, this lets users fully understand the information about apps and avoid downloading invalid apps.

  • MAKG can help security analysts detect sensitive apps, i.e., apps that are more likely than normal apps to become hotbeds for cybercriminals. With comprehensive relations and properties of apps, analysts can induce more prior rules and employ promising algorithms to evaluate the sensitivity of apps. This makes it possible to lower the risk of sensitive apps in advance and delay their publication in the application markets.

  • MAKG can recommend similar apps to users and security analysts with our hybrid method when they request related services, which can further reduce potential risks and help maintain the security of the mobile internet.

4. Citation:

If you use this dataset, please cite our paper as follows:

###Normal:

Heng Zhou, Weizhuo Li, Buye Zhang, Qiu Ji, Yiming Tan, and Chongning Na. MAKG: A Mobile Application Knowledge Graph for the Research of Cybersecurity. In: Proceedings of China Conference on Knowledge Graph and Semantic Computing, Guangzhou, China, Springer, 2021, pp. 321–328.

###BibTeX:

@inproceedings{MAKG2021,
author = {Heng Zhou and Weizhuo Li and Buye Zhang and Qiu Ji and Yiming Tan and Chongning Na},
title = {MAKG: A Mobile Application Knowledge Graph for the Research of Cybersecurity},
booktitle = {Proceedings of China Conference on Knowledge Graph and Semantic Computing},
address = {Guangzhou, China},
pages = {321--328},
year = {2021},
publisher = {Springer}
}
