Leap2Trend

Leap2Trend is a novel approach to the instant detection of research trends. It relies on temporal word embeddings (word2vec) to track the dynamics of similarities between pairs of keywords, their rankings, and their respective uprankings (ascents) over time.

Leap2Trend has been developed in Python and Java and is organized as a pipeline.

#INSTALLATION

  1. Python 2.7
  2. Java 8

#ROAD MAP

Python is used for the embedding phase. Two scripts are provided: the first (FreshEmbedding.py) trains word2vec from scratch, and the second (UpdatedEmbedding.py) updates a pretrained word2vec model with new vocabulary. Note that the Gensim Python library must be installed in order to use its Word2vec package.

*) FreshEmbedding.py: trains a word2vec model from scratch. It takes a text file as input and returns a word2vec model.

*) UpdatedEmbedding.py: updates a trained word2vec model. It takes a pretrained model and a new vocabulary as input, and returns an updated word2vec model with new vector representations of the words.
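The two embedding steps can be sketched as follows. This is a minimal illustration, not the contents of FreshEmbedding.py or UpdatedEmbedding.py. It assumes Python 3, Gensim 4.x (where the vector dimension parameter is named `vector_size`; older releases call it `size`), and a corpus file with one sentence per line. The function names and all hyperparameter values are hypothetical.

```python
def read_sentences(path):
    """Read a text file as one whitespace-tokenized, lowercased sentence per non-empty line."""
    with open(path, encoding="utf-8") as handle:
        return [line.lower().split() for line in handle if line.strip()]


def train_from_scratch(sentences, dim=100):
    """Train a new word2vec model on a list of tokenized sentences."""
    # Imported lazily so that read_sentences works without Gensim installed.
    from gensim.models import Word2Vec
    return Word2Vec(sentences=sentences, vector_size=dim, window=5,
                    min_count=1, workers=1)


def update_model(model, new_sentences):
    """Extend an existing model with new vocabulary and continue training."""
    # update=True adds unseen words instead of rebuilding the vocabulary.
    model.build_vocab(new_sentences, update=True)
    model.train(new_sentences, total_examples=len(new_sentences),
                epochs=model.epochs)
    return model
```

With a model in hand, `model.wv.similarity("word1", "word2")` returns the similarity of two words that are in the vocabulary.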

After each training run, the similarity function similarity(word1, word2) has to be applied to each pair of keywords under study to obtain its similarity score; the results can be saved to a text file.
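Gensim's similarity function returns the cosine similarity between the two word vectors. The sketch below shows that computation in plain Python, together with one possible way to format a result line for the text file. The tab-separated layout and the function names are assumptions made for illustration; the repository does not specify the file format here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_line(word1, word2, score):
    """Format one result as 'word1<TAB>word2<TAB>score' (hypothetical layout)."""
    return "{}\t{}\t{:.6f}".format(word1, word2, score)
```

Identical directions give 1.0, orthogonal vectors give 0.0, and opposite directions give -1.0.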

Java is used for the postprocessing phase. The programs are described below:

*) MatrixFromFile.java: creates a k × k matrix from a text file. The text file is the output of the similarity function above, and the matrix is the top-k similarity matrix, which stores the cosine similarities between the embedding vectors of the keyword pairs.

*) RankingMatrix.java: ranks the entries of the similarity matrix and returns the positions of the ranked keyword pairs.

*) FindingPosition.java: returns the list of ranked keywords for a specific time window.

*) RankExtraction.java: returns the rank of each keyword in every window.

*) Jump.java: computes the jumps of a keyword pair across all windows.

*) Slope: computes the slope of the linear regression fitted to the Google Trends hits.

*) CountGoogleTrends: counts the Google Trends hits per year from the CSV file returned by Google Trends.
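The postprocessing logic described above can be illustrated compactly. The following is a Python sketch of the ideas, not a port of the repository's Java programs. It assumes that each time window is given as a dictionary from keyword pair to similarity, that every pair appears in every window, and that a jump is the previous rank minus the current rank, so a positive value is an ascent. All function names and the sample numbers are hypothetical.

```python
from collections import defaultdict


def rank_pairs(similarities):
    """Map each keyword pair to its rank (1 = most similar) within one window."""
    # Sort by descending similarity; ties are broken alphabetically.
    ordered = sorted(similarities, key=lambda pair: (-similarities[pair], pair))
    return {pair: position for position, pair in enumerate(ordered, start=1)}


def rank_tracks(windows):
    """Collect the rank of every pair across consecutive time windows."""
    tracks = defaultdict(list)
    for similarities in windows:
        for pair, rank in rank_pairs(similarities).items():
            tracks[pair].append(rank)
    return dict(tracks)


def jumps(track):
    """Rank positions gained between consecutive windows (positive = ascent)."""
    return [previous - current for previous, current in zip(track, track[1:])]


def yearly_hits(rows):
    """Sum hits per year from (date, hits) rows such as ('2016-03', 41)."""
    totals = defaultdict(int)
    for date, hits in rows:
        totals[int(date[:4])] += hits
    return dict(totals)


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = float(len(xs))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Made-up similarities for two windows, for illustration only.
windows = [
    {("deep", "learning"): 0.40, ("neural", "network"): 0.70, ("big", "data"): 0.55},
    {("deep", "learning"): 0.80, ("neural", "network"): 0.65, ("big", "data"): 0.50},
]
tracks = rank_tracks(windows)
print(jumps(tracks[("deep", "learning")]))  # [2]: from rank 3 to rank 1

hits = yearly_hits([("2016-01", 10), ("2016-02", 20), ("2017-01", 40), ("2018-01", 50)])
years = sorted(hits)
print(slope(years, [hits[y] for y in years]))  # 10.0
```

Keeping ranks as per-pair lists makes the jump a simple difference between neighbouring entries, and the same list can be reused to look up a pair's position in any single window.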
