Sentiment Analysis using Naive Bayes

Introduction

A dataset of sample tweets is taken from the NLTK library and used to create a sentiment analysis model. The model is built using a Naive Bayes Classifier trained on a dataset of positive and negative tweets after preprocessing. The model takes a list of text tokens (that make up a comment) as input and predicts whether the corresponding comment is positive or negative.
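As an illustration of the input format, a comment can be split into a list of tokens before it reaches the classifier. The sketch below uses only the standard library. The `tokenize` and `to_features` helpers are hypothetical and are not the project's actual preprocessing; the `{token: True}` dictionary is one common way to present tokens to an NLTK classifier, assumed here rather than taken from main.py.

```python
import re


def tokenize(comment):
    """Lowercase a comment and split it into word tokens (hypothetical helper)."""
    return re.findall(r"[a-z']+", comment.lower())


def to_features(tokens):
    """Map each token to True, a dictionary shape NLTK classifiers can consume."""
    return {token: True for token in tokens}


tokens = tokenize("Dear Bad Donation!")
features = to_features(tokens)
print(tokens)
print(features)
```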

Here is a screenshot of the app:

Run the app

The main.py file is a Streamlit app and is deployed to Streamlit Share. Visit the following link to run the app and test it:

Naive Bayes Theorem

For example, suppose we have a dataset of 10 good comments and 5 negative comments. First, let's look at the words in the good comments and the negative comments, and the number of times each word occurs:

Good Comment
“Dear” : 20 times
“Precious” : 15 times
“Donation” : 1 time
“Love” : 15 times
“Hangout” : 3 times

Negative Comment
“Dear” : 2 times
“Bad” : 2 times
“Donation” : 15 times
“Worst” : 1 time
“Hangout” : 0 times

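From the counts above, we can compute the likelihood of each word, that is, the probability of the word given the class. It is the word's count divided by the total number of word occurrences in that class (54 for good comments, 20 for negative comments):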

Good Comment
p(Dear|Good) : 20/54 = 0.37
p(Precious|Good) : 15/54 = 0.278
p(Donation|Good) : 1/54 = 0.0185
p(Love|Good) : 15/54 = 0.278
p(Hangout|Good) : 3/54 = 0.056

Negative Comment
p(Dear|Negative): 2/20 = 0.1
p(Bad|Negative): 2/20 = 0.1
p(Donation|Negative): 15/20 = 0.75
p(Worst|Negative): 1/20 = 0.05
p(Hangout|Negative) : 0/20 = 0
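The likelihoods above can be reproduced with a few lines of Python. This is a minimal sketch that hard-codes the example counts; the names `good_counts`, `negative_counts` and `likelihoods` are illustrative and do not come from the project code.

```python
# Word counts copied from the example above.
good_counts = {"Dear": 20, "Precious": 15, "Donation": 1, "Love": 15, "Hangout": 3}
negative_counts = {"Dear": 2, "Bad": 2, "Donation": 15, "Worst": 1, "Hangout": 0}


def likelihoods(counts):
    """Return p(word | class) as count / total word occurrences in that class."""
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


good_likelihood = likelihoods(good_counts)          # denominators are 54
negative_likelihood = likelihoods(negative_counts)  # denominators are 20

for word, p in good_likelihood.items():
    print(f"p({word}|Good) = {p:.4f}")
for word, p in negative_likelihood.items():
    print(f"p({word}|Negative) = {p:.4f}")
```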

We now have the likelihood of each word in the good and negative comments. We also need the overall chance that a comment is good or negative, regardless of the words it contains. We call this the prior probability.

p(G) = (number of good comments)/(number of good comments + number of negative comments)
p(G) = 10/(10+5) = 0.667

p(N) = 5/(5+10) = 0.333
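The same priors in Python, as a small sketch using the comment counts from the example (the variable names are illustrative):

```python
# Number of comments per class in the example dataset.
n_good = 10
n_negative = 5
n_total = n_good + n_negative

prior_good = n_good / n_total          # p(G)
prior_negative = n_negative / n_total  # p(N)

print(f"p(G) = {prior_good:.3f}")
print(f"p(N) = {prior_negative:.3f}")
```

The two priors always sum to 1, since every comment belongs to exactly one of the two classes.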

Case Study

Now suppose we have a comment “Dear Bad Donation”. We will determine whether this comment is good or not.

We multiply the prior probability of a good comment by the likelihoods of the words “Dear”, “Bad” and “Donation” in the good comments. “Bad” never appears in the good comments, so p(Bad|Good) = 0:

p(G) x p(Dear|Good) x p(Bad|Good) x p(Donation|Good) = 0.667 x 0.37 x 0 x 0.0185 = 0

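We compute the same product for the negative class:

p(N) x p(Dear|Negative) x p(Bad|Negative) x p(Donation|Negative) = 0.333 x 0.1 x 0.1 x 0.75 = 0.0025

So we can say “Dear Bad Donation” is a negative comment, because the score for Good (0) is smaller than the score for Negative (0.0025). These products are proportional to p(G|“Dear Bad Donation”) and p(N|“Dear Bad Donation”), so comparing them is enough to pick the class.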
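The case study can be sketched end to end in Python. This is a hand-rolled illustration of the arithmetic, not the NLTK classifier the app uses; the `score` function and the dictionary names are hypothetical. A word that never occurs in a class gets likelihood 0, exactly as in the worked example.

```python
from math import prod

# Word counts and priors from the example above.
counts = {
    "Good": {"Dear": 20, "Precious": 15, "Donation": 1, "Love": 15, "Hangout": 3},
    "Negative": {"Dear": 2, "Bad": 2, "Donation": 15, "Worst": 1, "Hangout": 0},
}
priors = {"Good": 10 / 15, "Negative": 5 / 15}


def score(tokens, label):
    """Prior times the product of word likelihoods (unnormalized posterior)."""
    class_counts = counts[label]
    total = sum(class_counts.values())
    # Words unseen in this class contribute a likelihood of 0.
    return priors[label] * prod(class_counts.get(tok, 0) / total for tok in tokens)


tokens = ["Dear", "Bad", "Donation"]
scores = {label: score(tokens, label) for label in counts}
predicted = max(scores, key=scores.get)

print(scores)
print(predicted)
```

The class with the larger score wins, which here is "Negative".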
