Word-Replacement-Algorithms

Implementations of two word replacement algorithms, written to compare their speed and efficiency when processing large data sets.

Contributors

  1. Aminbhavi Suyash
  2. Mishra Satanshu
  3. Ramos Jason

Problem Statement

Keyword replacement in a corpus: In text analytics, it is often necessary to replace a set of keywords in the documents at hand with a given set of replacements. For example, people on Twitter use many abbreviations. To analyze tweets, one must find every abbreviation from a given list (e.g. ASAP, won’t) and replace it with its full phrase (e.g. ASAP -> As soon as possible, or won’t -> will not). Your job is to design an algorithm that finds every keyword from the abbreviation list in each tweet and replaces it with the corresponding keyword or phrase. There can be millions of tweets and hundreds of keywords. A naïve approach checks each tweet against every element of the abbreviation list and replaces any matches. Design a better algorithm than the naïve approach, and apply the required four steps explained on the first page.
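The naïve approach described in the problem statement can be sketched roughly as follows. This is an illustrative sketch, not the repository's code: the function name `replace_naive`, the whole-word matching rule, and the sample abbreviation list are all assumptions.

```python
import re


def replace_naive(tweet: str, abbreviations: dict) -> str:
    """Scan the whole tweet once per abbreviation (hypothetical helper)."""
    for abbr, phrase in abbreviations.items():
        # Assumption: an abbreviation only counts when it is not glued to
        # other word characters, so "ASAP" does not match inside "ASAPs".
        pattern = r"(?<!\w)" + re.escape(abbr) + r"(?!\w)"
        # A callable replacement keeps the phrase literal (no backslash escapes).
        tweet = re.sub(pattern, lambda _match, p=phrase: p, tweet)
    return tweet


# Hypothetical sample data, modelled on the examples in the problem statement.
sample = {"ASAP": "as soon as possible", "won't": "will not"}
result = replace_naive("ASAP pls, I won't wait", sample)
```

Each tweet is scanned once per abbreviation, so the work grows with the product of the tweet length and the size of the abbreviation list, which is what the better algorithms aim to avoid.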

Abstract

This repository implements and compares two search-and-replace algorithms: one built on a hash map and one built on a trie.
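A hash-map-based replacement might look something like the sketch below. It is not taken from the repository: the tokenizing pattern `WORD`, the function name `replace_with_hash_map`, and the sample data are assumptions made for illustration.

```python
import re

# Assumption: a "word" is a run of letters or digits, optionally joined by
# apostrophes (straight or curly), so "won't" stays a single token.
WORD = re.compile(r"[A-Za-z0-9]+(?:['’][A-Za-z0-9]+)*")


def replace_with_hash_map(tweet: str, abbreviations: dict) -> str:
    """Visit each word once and look it up in the dictionary (hypothetical helper)."""
    def lookup(match):
        word = match.group(0)
        # Average O(1) lookup; words that are not abbreviations pass through.
        return abbreviations.get(word, word)

    return WORD.sub(lookup, tweet)


# Hypothetical sample data, modelled on the examples in the problem statement.
sample = {"ASAP": "as soon as possible", "won't": "will not"}
result = replace_with_hash_map("ASAP pls, I won't wait", sample)
```

Because every word is looked up once, the running time depends on the tweet length rather than on how many abbreviations are in the list.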

The resulting graphs and the prior analysis are consistent with both algorithms running in O(n) time in the input size n. Beyond the implementation, the team carried out data processing, environment preparation, and implementation optimizations to address several difficulties, including an insufficient dataset size and inconsistent time complexity measurements.
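To illustrate how a trie can support a single left-to-right pass over the input, here is a minimal sketch. It is an assumption-laden illustration rather than the repository's implementation: `TrieNode`, `build_trie`, `replace_with_trie`, and the word-boundary rule are all hypothetical.

```python
class TrieNode:
    """One trie node: child links plus the replacement phrase, if any."""
    __slots__ = ("children", "phrase")

    def __init__(self):
        self.children = {}
        self.phrase = None


def build_trie(abbreviations: dict) -> TrieNode:
    root = TrieNode()
    for abbr, phrase in abbreviations.items():
        node = root
        for ch in abbr:
            node = node.children.setdefault(ch, TrieNode())
        node.phrase = phrase  # marks the end of a stored abbreviation
    return root


def _is_word_char(ch: str) -> bool:
    # Assumption: apostrophes count as part of a word, so "won't" is one token.
    return ch.isalnum() or ch in "'’"


def replace_with_trie(tweet: str, root: TrieNode) -> str:
    out, i, n = [], 0, len(tweet)
    while i < n:
        match_end, phrase = -1, None
        # Only attempt a match where a word begins.
        if i == 0 or not _is_word_char(tweet[i - 1]):
            node, j = root, i
            while j < n and tweet[j] in node.children:
                node = node.children[tweet[j]]
                j += 1
                # Accept only if the abbreviation ends at a word boundary.
                if node.phrase is not None and (j == n or not _is_word_char(tweet[j])):
                    match_end, phrase = j, node.phrase
        if phrase is not None:
            out.append(phrase)
            i = match_end
        else:
            out.append(tweet[i])
            i += 1
    return "".join(out)


# Hypothetical sample data, modelled on the examples in the problem statement.
trie = build_trie({"ASAP": "as soon as possible", "won't": "will not"})
result = replace_with_trie("ASAP pls, I won't wait", trie)
```

In this sketch the trie is walked only from word starts and never further than the longest stored abbreviation, so the scan stays roughly linear in the length of the tweet.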

Data & Results

Figure 1: Hash map algorithm tested against 100,000 to 1,000,000 character inputs

Figure 2: Trie tree algorithm tested against 100,000 to 1,000,000 character inputs
