
email-spam-detection-using-dl's Introduction

Predictions based on the following approaches

BERT Fine-Tuning

Using the test dataset as the validation dataset, I achieved an accuracy of about 87%.

Email Spam Detection using ML #340

Goal

The main goal of this project is to develop an ML model that can accurately classify emails as spam or ham (not spam).

Dataset

The dataset for this project can be found at the link below.

https://www.kaggle.com/datasets/mayank07thakur/spam-mail-dataset

Approach

  1. Data loading and exploration: Loaded the dataset, examined its structure, and performed initial exploratory data analysis (EDA) to gain insights into the data distribution, missing values, and relationships between variables.

  2. Data preprocessing: Conducted data preprocessing steps such as handling missing values and encoding categorical variables to prepare the data for model training.

  3. Feature extraction: Applied feature selection techniques to identify the most relevant features to use as input for Logistic Regression. This helps reduce model complexity and improve performance.

  4. Model development:

    a. Bidirectional Recurrent Neural Networks (BRNNs): A BRNN processes a sequence in both forward and backward directions. This architecture captures dependencies across the sequence, so each position is modeled with context from both earlier and later tokens. Applied to email text, this lets the model recognize patterns that depend on the words on either side.
    b. Long Short Term Memory (LSTM): A type of recurrent neural network (RNN) architecture designed to handle sequences and time-series data. LSTM networks are particularly well suited to tasks involving sequential data, such as natural language processing, speech recognition, and video analysis.
    c. Logistic Regression: A statistical method used for binary classification tasks, where the goal is to predict which of two classes an input belongs to. Despite its name, logistic regression is a classification algorithm rather than a regression algorithm.

  5. Model evaluation: Evaluated the performance of each model using appropriate metrics such as accuracy and precision.
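
The feature extraction, Logistic Regression, and evaluation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: TF-IDF is an assumed choice of text features (the README does not name the technique used), the six example emails are made up, and the scores are computed on the training data only to keep the sketch short.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the Kaggle dataset.
texts = [
    "Win a free prize now, click here",
    "Limited offer, claim your free cash reward",
    "Free entry: win cash today",
    "Are we still meeting for lunch tomorrow?",
    "Please review the attached project report",
    "Can you send me the meeting notes?",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Encode the categorical label as 0/1 (ham = 0, spam = 1).
y = [1 if label == "spam" else 0 for label in labels]

# TF-IDF (an assumption) turns each email into a sparse feature vector;
# logistic regression then fits a linear decision boundary over those features.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, y)

# Scored on the training data for brevity; a real run would hold out a test split.
predictions = model.predict(texts)
train_accuracy = accuracy_score(y, predictions)
train_precision = precision_score(y, predictions, zero_division=0)
print(f"accuracy={train_accuracy:.2f} precision={train_precision:.2f}")
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary fitted only on the data passed to `fit`, which avoids leaking test-set words into the features once a proper train/test split is added.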

Libraries Needed

  • Pandas
  • Tensorflow
  • Seaborn
  • sklearn
  • pathlib
  • numpy
  • keras

Accuracies

Model                       Accuracy
BRNN                        0.8664000276565552
GRU                         0.8663676977157593
LG (Logistic Regression)    0.9659192825112107
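
For reference, accuracy and precision can be computed from predictions as in the sketch below. The label lists are made-up values for illustration, not results from this project, and the function names are hypothetical helpers rather than part of the repository.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Of the emails predicted as `positive` (spam), the fraction that really are."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    predicted_pos = true_pos + false_pos
    return true_pos / predicted_pos if predicted_pos else 0.0


# Made-up labels: 1 = spam, 0 = ham.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)    # 6 of 8 predictions match
prec = precision(y_true, y_pred)  # 3 of 4 predicted spam are truly spam
print(f"accuracy={acc:.2f} precision={prec:.2f}")
```

Precision is worth reporting alongside accuracy for spam filtering because a false positive means a legitimate email is marked as spam.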

Conclusion

In conclusion, this project aimed to classify emails as spam or ham using DL models and Logistic Regression. Among the models developed, the Logistic Regression binary classification model achieved the highest accuracy, about 97%, compared with about 87% for the recurrent models. Logistic regression does not model temporal dependencies, so this suggests that, on this dataset, the extracted features are informative enough for a simple linear classifier.

Mayank Thakur
