shivamverma26 / fake_news_detection

A fake news detection system built with Python, NLP, and machine learning. It uses scikit-learn for data preprocessing and a Logistic Regression model to classify news statements as true or false, reaching an F1 score in the 70% range.


Fake News Detection

Detecting Fake News with Natural Language Processing and Machine Learning in Python

Overview

This project applies natural language processing techniques and machine learning algorithms to classify news as legitimate or fabricated. It is built with scikit-learn in Python.

Getting Started

To set up and run this project on your local machine for development and testing, follow these steps:

Prerequisites

Ensure you have the following:

  1. Python 3.6: If not already installed, download Python from python.org and set up PATH variables if necessary.
  2. Anaconda: Alternatively, you can download Anaconda from anaconda.com.
  3. Required Packages: After installing Python or Anaconda, install the necessary packages by running the following commands:
    • If using Python 3.6:
      pip install -U scikit-learn
      pip install numpy
      pip install scipy
    • If using Anaconda, run these commands in Anaconda prompt:
      conda install -c anaconda scikit-learn
      conda install -c anaconda numpy
      conda install -c anaconda scipy
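
After installing, it can help to confirm that the three packages are importable before running anything else. The check below is a minimal sketch and not part of the repository; the helper name missing_packages is hypothetical, and the only assumption is the usual mapping from package name to import name (scikit-learn is imported as sklearn).

```python
from importlib.util import find_spec

# Install name -> import name. scikit-learn is imported as "sklearn".
REQUIRED = {"scikit-learn": "sklearn", "numpy": "numpy", "scipy": "scipy"}


def missing_packages(required=REQUIRED):
    """Return the install names of packages that cannot be imported."""
    return [name for name, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```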

Dataset

We use the LIAR dataset, a benchmark designed for fake news detection. In this project, each statement carries one of two labels: "True" or "False".
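
The public LIAR release ships tab-separated files with six fine-grained labels, so a two-class setup needs a mapping step. The sketch below shows one way to do that with the standard library. It assumes the ID, label, and statement are the first three columns. The grouping of the six labels into "True" and "False" is an illustrative assumption, not necessarily the one DataPrep.py applies.

```python
import csv
import io

# Assumed grouping of the six LIAR labels into two classes (illustrative).
BINARY_LABEL = {
    "true": "True",
    "mostly-true": "True",
    "half-true": "True",
    "barely-true": "False",
    "false": "False",
    "pants-fire": "False",
}


def load_statements(tsv_file):
    """Read (statement, binary_label) pairs from an open LIAR-style TSV file."""
    reader = csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        if len(row) < 3:
            continue  # skip malformed lines
        label = BINARY_LABEL.get(row[1].strip().lower())
        if label is None:
            continue  # skip labels outside the six known values
        rows.append((row[2].strip(), label))
    return rows


# Tiny in-memory sample standing in for a real TSV file.
sample = io.StringIO("1.json\tfalse\tStatement A\n2.json\tmostly-true\tStatement B\n")
print(load_statements(sample))
```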

Project Structure

  • DataPrep.py: Preprocesses and analyzes data, including exploratory data analysis and data quality checks.
  • FeatureSelection.py: Implements feature extraction and selection methods, including bag-of-words, n-grams, and term frequency weighting.
  • classifier.py: Builds and evaluates classifiers, including Naive Bayes, Logistic Regression, SVM, Stochastic Gradient Descent, and Random Forest.
  • prediction.py: Utilizes the final Logistic Regression classifier to predict the class of a news headline provided by the user.
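
To make the flow from features to classifier concrete, here is a minimal sketch that chains TF-IDF weighted unigrams and bigrams into a Logistic Regression model with scikit-learn. It does not reproduce the repository's scripts: the function name build_pipeline, the hyperparameters, and the toy statements are all assumptions chosen for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_pipeline():
    """TF-IDF over unigrams and bigrams, followed by Logistic Regression."""
    return Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


# Toy data for illustration only; real training would use the LIAR statements.
statements = [
    "The city council approved the new budget on Monday",
    "Official figures show unemployment fell last quarter",
    "Scientists confirm aliens secretly run the central bank",
    "Miracle pill cures every disease overnight, doctors stunned",
]
labels = ["True", "True", "False", "False"]

model = build_pipeline().fit(statements, labels)
prediction = model.predict(["Council approves budget figures"])[0]
print(prediction)
```

Wrapping both steps in one Pipeline keeps the vectorizer's vocabulary tied to the training data, so the same transformation is applied at prediction time.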

Performance

Our best-performing model, Logistic Regression, achieved an F1 score in the 70% range. The learning curves show how its performance changes as the training set grows.
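
For reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for a binary task in plain Python. The function name f1_score_binary and the example labels are illustrative and are not results from this project.

```python
def f1_score_binary(y_true, y_pred, positive="True"):
    """F1 = 2 * precision * recall / (precision + recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives, so precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: one true positive, one false positive, one false negative.
y_true = ["True", "True", "False", "False"]
y_pred = ["True", "False", "True", "False"]
print(f1_score_binary(y_true, y_pred))  # precision 0.5, recall 0.5 -> 0.5
```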

Next Steps

Future improvements could include richer features such as POS tags, word2vec embeddings, and topic models, as well as a larger training set to improve model accuracy.

How to Run the Software

  1. Clone this repository to your local machine.
  2. Navigate to the project directory.
  3. Run the prediction.py file as follows:
    • Anaconda: python prediction.py
    • Python 3.6: If python is not on your PATH, run the same command with the full path to your Python executable in place of python.

Follow the on-screen instructions to enter a news headline. The program prints the predicted class and the estimated probability that the headline is true.


Made By: Shivam Verma
