Project Title

VAST Challenge 2020 - Mini-Challenge 1: Graph Analysis

Description

The objective of this challenge is to use visual analytics to analyze a large dataset provided by the Center for Global Cyber Strategy (CGCS), which comprises anonymized profiles created from data donated by white hat groups. These profiles capture the behavioral and structural characteristics of various groups, one of which CGCS sociopsychologists hypothesize closely resembles the organization that accidentally caused a major internet outage. Our task is to compare the subgraph template provided by CGCS, which represents the suspect group's structure, against several candidate subgraphs. The aim is to determine which candidate matches the template most closely, and so identify the group that most likely fits the profile associated with the outage.

Getting Started

  • Clone the project to your local machine using git clone <project_url>
  • Install the "Live Server" extension for Visual Studio Code
  • Click the "Go Live" button to start a server on port 5500
  • Alternatively, run the project with Python's built-in HTTP server: python3 -m http.server
  • Open localhost:<port> in a browser to access the project

Dependencies

Components

  1. arc-diagram - Used to identify potential seed node structures that match the template.
  2. bar-graph - Used to identify which subgraph's eType activity correlates most with the template.
  3. heat-map - Used to identify which potential seed graph's spending activity correlates most with the template.
  4. lollipop - Used to identify which potential seed graph's communication activity correlates most with the template.
  5. multiline - Used to identify which subgraph's activity correlates most with the template over time.
  6. node-link - A holistic view of how the template matches the subgraphs; shows the subregion most similar to the template.
  7. scatter-activities-plot - Shows the activity history of each node using a scatter plot.
  8. scatter-travel-history - Shows the travel history of each node using a scatter plot and flag glyphs.

Challenges

  • Memory and computation overhead: Loading the 6 GB dataset was challenging, so we had to do a lot of preprocessing beforehand. Having access to a larger dataset would be
  • In-depth analysis: Cosine similarity was a useful metric that gave near-accurate results, but it compares activity vectors only and gives us little contextual knowledge of the dataset.
  • Seed similarity analysis: Working with the subgraphs and the template was manageable because each holds only about 2000-3000 records. The seed graphs, however, have about 2000 child nodes per node, which would mean evaluating about 5 million node links in the main graph. Those links would have been useful, but we could not evaluate them.
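
The cosine similarity comparison mentioned above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names (eTypeCounts, cosineSimilarity, rankCandidates), the edge shape ({ eType }), and the sample data are all assumptions made for the example.

```javascript
// Hypothetical helper: count how often each edge type (eType) occurs.
// `edges` is assumed to be an array of objects with an `eType` field;
// `eTypes` lists the edge types to count, in a fixed order.
function eTypeCounts(edges, eTypes) {
  const counts = new Map(eTypes.map((t) => [t, 0]));
  for (const edge of edges) {
    if (counts.has(edge.eType)) {
      counts.set(edge.eType, counts.get(edge.eType) + 1);
    }
  }
  return eTypes.map((t) => counts.get(t));
}

// Cosine similarity between two equal-length numeric vectors.
// Returns 0 when either vector is all zeros, to avoid dividing by zero.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score each candidate graph against the template, highest score first.
// `candidates` is assumed to be an array of { name, edges } objects.
function rankCandidates(templateEdges, candidates, eTypes) {
  const templateVector = eTypeCounts(templateEdges, eTypes);
  return candidates
    .map(({ name, edges }) => ({
      name,
      score: cosineSimilarity(templateVector, eTypeCounts(edges, eTypes)),
    }))
    .sort((x, y) => y.score - x.score);
}

// Made-up sample data, for illustration only.
const sampleTemplate = [{ eType: 0 }, { eType: 0 }, { eType: 1 }];
const sampleCandidates = [
  { name: "graphB", edges: [{ eType: 1 }, { eType: 1 }, { eType: 0 }] },
  { name: "graphA", edges: [{ eType: 0 }, { eType: 0 }, { eType: 1 },
                            { eType: 0 }, { eType: 0 }, { eType: 1 }] },
];
console.log(rankCandidates(sampleTemplate, sampleCandidates, [0, 1]));
```

In the sample data, graphA has twice as many edges as the template but the same proportions, so it scores the same as an exact copy would. Cosine similarity measures only the direction of the activity vectors and ignores their size and graph structure, which is one way it can lack context about the underlying data.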

Contribute

  1. Create new components in the components dir, each with its own js and css files
  2. Use index.js and style.css for global js and css
  3. Data can be found in the data dir
  4. The assets dir holds images, fonts, etc.

Authors

  1. Darshan Vipresh
  2. Deep Rodge
  3. Jayati Goyal
  4. Prasad Mahalpure
  5. Kaushal Yadav
  6. Sravya Thummeti
