vaibhavbichave / phishing-url-detection


Phishers use websites that are visually and semantically similar to legitimate ones. This website lets a user check whether a URL is a phishing URL before visiting it. URL - http://phishing-url-detector-api.herokuapp.com/

cybersecurity machine-learning malicious-url-detection password phishing-attacks phishing-detection

phishing-url-detection's Introduction

Phishing URL Detection



Introduction

The Internet has become an indispensable part of our lives. However, it has also provided opportunities to anonymously perform malicious activities such as phishing. Phishers try to deceive their victims through social engineering or by creating mock-up websites to steal information such as account IDs, usernames, and passwords from individuals and organizations. Although many methods have been proposed to detect phishing websites, phishers have evolved their methods to evade them. One of the most successful approaches for detecting these malicious activities is machine learning, because most phishing attacks share common characteristics that machine learning methods can identify. To see the project, click here.

Installation

The code is written in Python 3.6.10. If you don't have Python installed, you can find it here. If you are using an older version of Python, upgrade it first, and make sure you have the latest version of pip. To install the required packages and libraries, run this command in the project directory after cloning the repository:

pip install -r requirements.txt

Directory Tree

├── pickle
│   ├── model.pkl
├── static
│   ├── styles.css
├── templates
│   ├── index.html
├── Phishing URL Detection.ipynb
├── Procfile
├── README.md
├── app.py
├── feature.py
├── phishing.csv
├── requirements.txt


Technologies Used

Result

Accuracy of the various models used for URL detection


     ML Model                      Accuracy  f1_score  Recall  Precision
0    Gradient Boosting Classifier  0.974     0.977     0.994   0.986
1    CatBoost Classifier           0.972     0.975     0.994   0.989
2    XGBoost Classifier            0.969     0.973     0.993   0.984
3    Multi-layer Perceptron        0.969     0.973     0.995   0.981
4    Random Forest                 0.967     0.971     0.993   0.990
5    Support Vector Machine        0.964     0.968     0.980   0.965
6    Decision Tree                 0.960     0.964     0.991   0.993
7    K-Nearest Neighbors           0.956     0.961     0.991   0.989
8    Logistic Regression           0.934     0.941     0.943   0.927
9    Naive Bayes Classifier        0.605     0.454     0.292   0.997
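
For readers unfamiliar with the four column headings, here is a minimal sketch of how accuracy, precision, recall, and F1 are usually defined for a binary classifier. It is not the notebook's code: the function name `classification_metrics`, the label values, and the sample predictions are all made up for illustration, and the choice of which class counts as "positive" is an assumption.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    `positive` names the class treated as the positive one (an assumption
    here; the README does not state which class the notebook uses).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels and predictions, purely for illustration.
y_true = [1, 1, 1, 1, -1, -1, -1, -1]
y_pred = [1, 1, 1, -1, -1, -1, 1, 1]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting all four numbers matters because a model can score very high on one and poorly on another, as the Naive Bayes row (precision 0.997, recall 0.292) shows.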

Feature importance for Phishing URL Detection

image

Conclusion

  1. The main takeaway from this project is the exploration of various machine learning models, Exploratory Data Analysis of the phishing dataset, and an understanding of its features.
  2. Creating this notebook taught me a lot about the features that affect a model's ability to detect whether a URL is safe. I also learned how to tune models and how tuning affects model performance.
  3. The final conclusion on the phishing dataset is that some features, such as "HTTPS", "AnchorURL", and "WebsiteTraffic", are more important than others for classifying whether a URL is a phishing URL.
  4. The Gradient Boosting Classifier correctly classifies URLs into their respective classes with up to 97.4% accuracy, and hence reduces the chance of malicious attachments.
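
The conclusions above rest on training a Gradient Boosting Classifier and ranking its feature importances. The sketch below shows one way that could look with scikit-learn. It is not taken from the notebook: it runs on synthetic data from `make_classification` rather than phishing.csv, the feature names `feature_0` ... `feature_4` are placeholders, and the hyperparameters are library defaults, so the printed numbers say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the phishing dataset (assumption: the real
# notebook loads phishing.csv instead).
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3,
    n_redundant=0, random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholders

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Default hyperparameters; the notebook's tuned values are not known here.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {test_accuracy:.3f}")

# feature_importances_ holds one impurity-based score per feature;
# the scores are normalized to sum to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda item: item[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances like these are quick to obtain but can favor some feature types over others, so they are best read as a rough ranking rather than an exact measure.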

phishing-url-detection's People

Contributors

vaibhavbichave


phishing-url-detection's Issues

version

Which version satisfies the requirement catboost (from versions: 0.1.1)? Which catboost version am I supposed to install?

Phishing.csv

In the Phishing.csv file, can you tell me why and how the values 1, -1, and 0 appear in the dataset, and how I can use that dataset in the project?

app.py

How do I get "from features import generate_data_set" to work?
It throws a module-not-found error.
ImportError Traceback (most recent call last)
C:\Users\VINOVI~1\AppData\Local\Temp/ipykernel_14008/498048315.py in
5 import warnings
6 warnings.filterwarnings('ignore')
----> 7 from features import dataset
8 #Gradient Boosting Classifier Model
9 from sklearn.ensemble import GradientBoostingClassifier

ImportError: cannot import name 'dataset' from 'features'

about dataset

Respected Sir,
Where did you download the dataset of phishing and legitimate sites from? In phishing.csv, which column is the class label, and does -1 mean phishing and 1 mean legitimate?
I also need the feature extraction file for study purposes. Can you provide it, along with details on how to run it?
Please reply.
Thank you, sir.

app.py

image

When I run app.py, the page does not load. When I open the link, it keeps loading indefinitely. Kindly resolve my issue as soon as possible.
