arisdwi666 / klasifikasi-sms-spam-with-gradio

Indonesian-Language SMS Spam Classification Using the Multinomial Naive Bayes Method

Jupyter Notebook 0.52% Python 88.71% PowerShell 0.01% Batchfile 0.01% JavaScript 7.85% Cython 1.61% C 0.49% C++ 0.17% Shell 0.01% Svelte 0.35% TypeScript 0.14% HTML 0.05% CSS 0.04% Roff 0.01% Lua 0.01% Meson 0.01% Fortran 0.05% Forth 0.01% Smarty 0.01% VBScript 0.01%

klasifikasi-sms-spam-with-gradio's Introduction

Indonesian-language SMS spam classification using Multinomial Naive Bayes with Chi-Square feature selection, deployed with Gradio and hosted on Hugging Face.

The dataset is taken from GitHub (https://github.com/ksnugroho/klasifikasi-spam-sms/) and contains 1143 messages: 569 normal SMS, 335 fraud (penipuan) SMS, and 239 promo SMS.

Problem : SMS spam consists of messages the user neither wants nor asked for, which can annoy, deceive, or even cause harm to the user.

Goal : To classify SMS spam and to build an effective, accurate classification system for Indonesian-language SMS spam using the Multinomial Naive Bayes method.

Model : Uses the Multinomial Naive Bayes (MNB) classification algorithm to classify SMS messages as spam or non-spam based on the extracted features.
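
To make the model concrete, here is a minimal pure-Python sketch of a Multinomial Naive Bayes classifier with Laplace (add-one) smoothing. It is not the project's code: the class name `TinyMultinomialNB`, the `alpha` parameter, and the toy tokenized messages are assumptions made up for illustration, and a library implementation would normally be used instead.

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Illustrative Multinomial Naive Bayes over tokenized documents."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (1.0 = Laplace smoothing)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {token for doc in docs for token in doc}
        self.class_counts = Counter(labels)       # documents per class
        self.term_counts = defaultdict(Counter)   # token counts per class
        for doc, label in zip(docs, labels):
            self.term_counts[label].update(doc)
        self.n_docs = len(docs)
        return self

    def log_scores(self, doc):
        """Return log P(class) + sum of log P(token | class) for each class."""
        scores = {}
        vocab_size = len(self.vocab)
        for c in self.classes:
            total = sum(self.term_counts[c].values())
            score = math.log(self.class_counts[c] / self.n_docs)
            for token in doc:
                if token in self.vocab:  # tokens never seen in training are skipped
                    count = self.term_counts[c][token]
                    score += math.log((count + self.alpha) / (total + self.alpha * vocab_size))
            scores[c] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)


# Made-up, already tokenized training messages.
train_docs = [
    ["menang", "hadiah", "klik"],
    ["hadiah", "gratis"],
    ["rapat", "besok"],
    ["besok", "makan", "siang"],
]
train_labels = ["spam", "spam", "non-spam", "non-spam"]

model = TinyMultinomialNB().fit(train_docs, train_labels)
print(model.predict(["hadiah", "gratis"]))
print(model.predict(["rapat", "besok"]))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing term keeps a token that never appeared in one class from forcing that class's probability to zero.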

The solution steps are:

  1. Data Acquisition
  2. Text Pre-processing
    1. Case Folding
    2. Filtering
    3. Stopword Removal
    4. Stemming
  3. Feature Engineering
    1. Feature Extraction - BoW & TF-IDF
    2. Feature Selection - Chi-Square
  4. Modelling (Machine Learning)
  5. Model Evaluation
  6. Deployment
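
The pre-processing and feature-engineering steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names, the tiny `STOPWORDS` set, the sample messages, and the label coding (1 = spam-like, 0 = normal) are all made up here, stemming is left as a pass-through hook (an Indonesian stemmer such as Sastrawi could be plugged in), and the chi-square score uses the textbook 2x2 contingency formula, which may differ from whatever formulation the notebook relies on. TF-IDF weighting is not shown.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); a real run would use a full Indonesian list.
STOPWORDS = {"yang", "di", "ke", "dan", "ini", "itu", "untuk", "anda"}


def preprocess(text, stem=lambda word: word):
    """Case folding -> filtering -> stopword removal -> stemming."""
    text = text.lower()                                   # case folding
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)    # filtering: drop URLs
    text = re.sub(r"[^a-z\s]", " ", text)                 # filtering: drop digits and punctuation
    tokens = [t for t in text.split() if t not in STOPWORDS]  # stopword removal
    return [stem(t) for t in tokens]                      # stemming hook (identity by default)


def bag_of_words(tokens):
    """Bag-of-words representation: token -> number of occurrences."""
    return Counter(tokens)


def chi_square(docs, labels, term, target):
    """Chi-square statistic of the 2x2 table: term present/absent vs. class target/other."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1
        elif has_term:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


# Made-up sample messages with an assumed label coding.
messages = [
    ("Selamat! Anda menang hadiah, klik http://contoh.example", 1),
    ("Hadiah pulsa gratis untuk Anda", 1),
    ("Besok rapat jam 9 di kantor", 0),
    ("Rapat ditunda, makan siang dulu", 0),
]
docs = [preprocess(text) for text, _ in messages]
labels = [label for _, label in messages]

print(docs[0])
print(bag_of_words(docs[0]))
print(chi_square(docs, labels, "hadiah", 1), chi_square(docs, labels, "klik", 1))
```

A term that appears only in one class ("hadiah" here) gets a higher score than one that appears in a single message ("klik"), which is the property feature selection uses to keep the most class-dependent terms.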

Model performance :

Correct predictions : 211

Incorrect predictions : 18

Test accuracy : 92.13973799126637 %

Confusion matrix:

[[106   1   1]
 [  6  64   1]
 [  6   3  41]]
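
As a quick consistency check, the correct and incorrect prediction counts and the test accuracy can be recomputed from the confusion matrix. The sketch below assumes rows are true classes and columns are predicted classes, which agrees with the row sums matching the per-class support (108, 71, 50); the variable names are illustrative.

```python
# Confusion matrix copied from the results above
# (assumed layout: rows = true class, columns = predicted class).
confusion = [
    [106, 1, 1],
    [6, 64, 1],
    [6, 3, 41],
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(confusion)))  # diagonal entries
wrong = total - correct
accuracy = correct / total

print(correct, wrong, total)   # 211 18 229
print(f"{accuracy:.4f}")       # 0.9214
```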

Classification report:

              precision    recall  f1-score   support

           0       0.90      0.98      0.94       108
           1       0.94      0.90      0.92        71
           2       0.95      0.82      0.88        50

    accuracy                           0.92       229
   macro avg       0.93      0.90      0.91       229
weighted avg       0.92      0.92      0.92       229
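
The per-class figures in the report follow directly from the confusion matrix: precision divides each diagonal entry by its column sum, recall divides it by its row sum, and F1 is their harmonic mean. The sketch below recomputes them, again assuming rows are true classes and columns are predicted classes; the variable names are illustrative.

```python
# Confusion matrix from the evaluation (assumed: rows = true, columns = predicted).
confusion = [
    [106, 1, 1],
    [6, 64, 1],
    [6, 3, 41],
]
n = len(confusion)

support = [sum(row) for row in confusion]                                # true count per class
predicted = [sum(confusion[r][c] for r in range(n)) for c in range(n)]   # predicted count per class

precision = [confusion[k][k] / predicted[k] for k in range(n)]
recall = [confusion[k][k] / support[k] for k in range(n)]
f1 = [2 * p * r / (p + r) for p, r in zip(precision, recall)]

# Macro average weights every class equally; weighted average weights by support.
macro_f1 = sum(f1) / n
weighted_f1 = sum(f * s for f, s in zip(f1, support)) / sum(support)

for k in range(n):
    print(k, f"{precision[k]:.2f} {recall[k]:.2f} {f1[k]:.2f}", support[k])
print(f"macro f1 {macro_f1:.2f}, weighted f1 {weighted_f1:.2f}")
```

Class 2 has the lowest recall (41 of 50) because 9 of its messages were assigned to the other two classes, mostly class 0.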

Accuracy per split: [0.91266376 0.89956332 0.930131 0.89956332 0.91266376 0.91266376 0.94759825 0.89519651 0.89519651 0.89082969]

Mean cross-validation accuracy: 0.9096069868995634
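
The mean can be reproduced from the ten per-split values, which also makes the spread between the best and worst split easy to see. The list below is copied from the printed output, so it carries that output's rounding; the variable name is illustrative.

```python
from statistics import mean

# Per-split accuracies as printed above (rounded to 8 decimals in the output).
split_accuracy = [
    0.91266376, 0.89956332, 0.930131, 0.89956332, 0.91266376,
    0.91266376, 0.94759825, 0.89519651, 0.89519651, 0.89082969,
]

print(f"mean = {mean(split_accuracy):.4f}")   # mean = 0.9096
print(f"min  = {min(split_accuracy):.4f}")    # min  = 0.8908
print(f"max  = {max(split_accuracy):.4f}")    # max  = 0.9476
```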

Deployment process:

  • Load the saved model
  • Install Gradio
  • Build the Gradio interface
  • Prepare requirements.txt
  • Upload the required files, such as the notebooks, app.py, requirements.txt, and the dataset.
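
A minimal sketch of what an `app.py` for these steps might look like is given below. It is not the project's actual file: the path `model.pkl`, the helper names, the interface title, and the label mapping (0 = Normal, 1 = Fraud, 2 = Promo) are assumptions, and the saved object is assumed to be a pipeline whose `predict` accepts a list of raw strings. Gradio is imported inside `build_demo` so the prediction helper can be exercised without it.

```python
import pickle

# Assumed mapping from numeric class to a display name.
LABELS = {0: "Normal", 1: "Fraud", 2: "Promo"}


def load_model(path="model.pkl"):
    """Load the saved model; the file name is a hypothetical placeholder."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def make_predict_fn(model):
    """Wrap a fitted model in a text -> label-name function for the interface."""
    def predict(text):
        label = int(model.predict([text])[0])
        return LABELS.get(label, str(label))
    return predict


def build_demo(predict_fn):
    """Build the Gradio interface around the prediction function."""
    import gradio as gr  # imported here so the helpers above work without Gradio installed

    return gr.Interface(
        fn=predict_fn,
        inputs=gr.Textbox(lines=3, label="SMS text"),
        outputs=gr.Textbox(label="Predicted class"),
        title="Indonesian SMS Spam Classifier",
    )


def main():
    demo = build_demo(make_predict_fn(load_model()))
    demo.launch()
```

Calling `main()` at the bottom of the file starts the app locally. For a hosted deployment, `requirements.txt` would then list `gradio` together with whatever library the saved model depends on.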

Web app link : Arisdwi/gradio-sms-classifier
