Text Analytics on Yelp Reviews

This project analyzes Yelp reviews in the ‘Restaurant’ category, restricted to the state of Arizona because it had the largest number of reviews. I then performed NLP tasks such as POS tagging, named entity recognition (NER), word cloud generation, and sentiment analysis using TextBlob and VADER.

- Data Exploration

The dataset contains reviews of 7,812 restaurants in 55 cities, written by about 400,000 users. The reviews span February 2005 to November 2019.
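
As a rough illustration of how such summary figures can be derived, the sketch below counts distinct restaurants, cities and users and finds the date range over a handful of made-up rows. The field names (`business_id`, `city`, `user_id`, `date`) and the sample values are assumptions for illustration, not taken from the project's code.

```python
from datetime import datetime

# Hypothetical sample rows; field names and values are assumed for illustration.
reviews = [
    {"business_id": "b1", "city": "Phoenix", "user_id": "u1", "date": "2005-02-16"},
    {"business_id": "b1", "city": "Phoenix", "user_id": "u2", "date": "2012-07-04"},
    {"business_id": "b2", "city": "Tempe", "user_id": "u1", "date": "2019-11-30"},
]


def summarize(rows):
    """Count distinct restaurants, cities and users, and find the date range."""
    dates = [datetime.strptime(r["date"], "%Y-%m-%d") for r in rows]
    return {
        "restaurants": len({r["business_id"] for r in rows}),
        "cities": len({r["city"] for r in rows}),
        "users": len({r["user_id"] for r in rows}),
        "first_review": min(dates).date().isoformat(),
        "last_review": max(dates).date().isoformat(),
    }


summary = summarize(reviews)
print(summary)
```

On the full dataset a dataframe library would be the more natural choice; the set-based counting here only shows the idea.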

- Data Pre-processing

Tokenization, stop word removal, lemmatization, and other text cleaning steps were applied to prepare the data for further analysis.
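
The pre-processing steps above can be sketched in a few lines. This is a minimal, standard-library-only illustration: the stop-word list is deliberately tiny, and the `LEMMAS` lookup table is a hypothetical stand-in for a real lemmatizer (for example NLTK's `WordNetLemmatizer`); neither reflects the project's actual implementation.

```python
import re

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "was", "were", "to", "of", "it", "i", "we"}

# Hypothetical lookup table standing in for a real lemmatizer.
LEMMAS = {"tacos": "taco", "loved": "love", "servers": "server", "waiting": "wait"}


def preprocess(text):
    """Lowercase, tokenize, drop stop words, then map tokens to lemmas."""
    tokens = re.findall(r"[a-z']+", text.lower())        # tokenization
    kept = [t for t in tokens if t not in STOP_WORDS]    # stop word removal
    return [LEMMAS.get(t, t) for t in kept]              # lemmatization (stand-in)


tokens = preprocess("We LOVED the tacos, and the servers were friendly!")
print(tokens)
```

The order matters slightly: stop words are removed before lemmatization here, so the stop-word list must contain the surface forms that appear in the text.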

- Sentiment Analysis

Sentiment analysis extracts the sentiment of each review from the processed text and compares it with the rating the user gave that review (which serves as the human label). I implemented two lexicon-based sentiment analyzers, ‘TextBlob’ and ‘VADER’, to identify the sentiment polarity of the reviews.
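
To make the comparison between lexicon scores and user ratings concrete, here is a minimal sketch. The mini-lexicon, the 0.05 cut-off, the star-to-label mapping and the sample reviews are all assumptions for illustration; the `polarity` function is a simplified stand-in, not TextBlob's or VADER's actual scoring.

```python
import re

# Hypothetical mini-lexicon; TextBlob and VADER ship far larger, weighted ones.
LEXICON = {"great": 1.0, "delicious": 1.0, "friendly": 0.5,
           "slow": -0.5, "bland": -1.0, "terrible": -1.0}


def polarity(text):
    """Average lexicon score of the matched words, in [-1, 1]; 0.0 if none match."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = [LEXICON[w] for w in words if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def label_from_polarity(score, threshold=0.05):
    """Map a polarity score to a label; the threshold is an assumed cut-off."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def label_from_stars(stars):
    """Assumed mapping: 4-5 stars positive, 1-2 stars negative, 3 neutral."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return "neutral"


# Made-up (review text, star rating) pairs.
reviews = [
    ("Great tacos and friendly staff", 5),
    ("Bland food, terrible and slow service", 1),
    ("It was okay", 3),
    ("Delicious but slow", 2),
]

matches = [label_from_polarity(polarity(text)) == label_from_stars(stars)
           for text, stars in reviews]
agreement = sum(matches) / len(matches)
print(f"agreement with ratings: {agreement:.2f}")
```

With the real libraries, `polarity` could be swapped for `TextBlob(text).sentiment.polarity` or for the `compound` value returned by VADER's `SentimentIntensityAnalyzer().polarity_scores(text)`; the comparison logic stays the same.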

- Sentiment Polarity Classification

Sentiment polarity classification uses ‘Naïve Bayes’ classifiers to help restaurants classify new Yelp reviews as positive or negative. The models are evaluated on accuracy and, because the classes are imbalanced, on ROC AUC as well, which proved a better metric for picking the best-performing model.
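
The sketch below shows one way the classification and evaluation described above could look: a small multinomial Naïve Bayes with add-one smoothing, scored on both accuracy and ROC AUC. It is written from scratch for illustration, on made-up, deliberately imbalanced toy data. The source does not say which Naïve Bayes variant or library was used, so the function names and the data here are assumptions.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model with add-one (Laplace) smoothing."""
    model = {"vocab": {w for d in docs for w in d}, "prior": {}, "counts": {}, "total": {}}
    for c in set(labels):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        model["prior"][c] = math.log(len(class_docs) / len(docs))
        model["counts"][c] = Counter(w for d in class_docs for w in d)
        model["total"][c] = sum(model["counts"][c].values())
    return model


def positive_score(model, doc):
    """Log-odds of class 1 versus class 0; higher means more likely positive."""
    v = len(model["vocab"])

    def log_joint(c):
        return model["prior"][c] + sum(
            math.log((model["counts"][c][w] + 1) / (model["total"][c] + v))
            for w in doc if w in model["vocab"])

    return log_joint(1) - log_joint(0)


def roc_auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up, imbalanced training data: four positive reviews, two negative.
train_docs = [["great", "food"], ["friendly", "staff"], ["great", "service"],
              ["delicious", "tacos"], ["terrible", "food"], ["slow", "service"]]
train_labels = [1, 1, 1, 1, 0, 0]

test_docs = [["great", "tacos"], ["delicious", "food"], ["friendly", "service"],
             ["terrible", "service"], ["slow", "food"], ["great", "slow"]]
test_labels = [1, 1, 1, 0, 0, 0]

model = train_nb(train_docs, train_labels)
scores = [positive_score(model, d) for d in test_docs]
predictions = [1 if s > 0 else 0 for s in scores]

accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
auc = roc_auc(scores, test_labels)
print(f"accuracy={accuracy:.3f}  roc_auc={auc:.3f}")
```

On this toy data the last test review is misclassified at the default decision threshold, partly because the class prior favors the majority class, yet it still ranks below every positive review. Accuracy depends on the threshold while ROC AUC depends only on the ranking, which is why the two metrics can disagree on imbalanced data. In practice, scikit-learn's `MultinomialNB` and `roc_auc_score` cover the same ground.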

Contributors

shoubhik23
