The ai_project from giovannisorice

(EN) Project for Artificial Intelligence course

Project for the teaching of Artificial Intelligence, computer science degree course (Padua, Italy)

The project implements a spam filter. It is written in Python and uses the following libraries:

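Keras: definition and training of the neural network; it is imported from TensorFlow, which bundles it as of version 2.0.0;
Pandas: conversion of the dataset into .csv format;
scikit-learn: data pre-processing and use of the LogisticRegression algorithm;
Kaggle API: download of the dataset in .zip format.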
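
As a rough illustration of the scikit-learn part, the sketch below chains a text pre-processing step with LogisticRegression. It is not the code from the project notebooks: the toy messages, the choice of CountVectorizer as the pre-processing step, and the variable names are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy messages; the real project trains on the Kaggle dataset.
texts = [
    "Win a free prize now",
    "Free entry to win cash",
    "Claim your free prize",
    "Are we meeting for lunch today",
    "See you at home tonight",
    "Can you call me later",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Pre-processing (bag-of-words counts, an assumed choice) followed by the classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Classify two unseen messages.
predictions = model.predict(["free cash prize", "lunch at home later"])
print(list(predictions))
```

Wrapping both steps in one pipeline means new messages go through the same pre-processing as the training data before they reach the classifier.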

The dataset used for the project can be found at the following link: Spam Collection Dataset.

The purpose of the project and the tools we used are explained in the report.

How to run it

To run the project, we suggest using Google Colab, which comes with TensorFlow 2.0.0 already installed.

To import the project into Colab, log in to your Colab account, click Upload notebook, and select the notebook you previously downloaded.

Download dataset

You need a Kaggle account and a personal KAGGLE_KEY to download the dataset; see API Credentials for more information.

Once you have the Kaggle key, do the following in the first cell of each file you want to run:

Save the key in the key field;
Write your Kaggle username in the username field.

If everything is set correctly, the cell automatically downloads the dataset from Kaggle as a .zip archive and then decompresses it into the .csv file used to train the neural network.
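
The steps above could look roughly like the following sketch. It is not the notebook's actual first cell: the helper names set_kaggle_credentials and extract_csv_files are hypothetical, and it assumes the credentials are passed through the KAGGLE_USERNAME and KAGGLE_KEY environment variables that the Kaggle API reads. The download itself is left as a comment because the dataset identifier is not given here.

```python
import os
import zipfile
from pathlib import Path


def set_kaggle_credentials(username: str, key: str) -> None:
    """Expose the credentials through the environment variables read by the Kaggle API."""
    os.environ["KAGGLE_USERNAME"] = username
    os.environ["KAGGLE_KEY"] = key


def extract_csv_files(zip_path: str, out_dir: str) -> list:
    """Decompress only the .csv members of the archive and return their paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        archive.extractall(out, members=csv_names)
    return [str(out / name) for name in csv_names]


# Assumed workflow (placeholders, not values from the project):
#   set_kaggle_credentials("<your-username>", "<your-key>")
#   then, in a Colab cell: !kaggle datasets download -d <owner>/<dataset-name>
#   csv_paths = extract_csv_files("<downloaded-archive>.zip", "data")
```

Keeping the credentials in environment variables rather than in the notebook body makes it easier to avoid committing the key by accident.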

(IT) Artificial Intelligence Project

Project for the Artificial Intelligence course, Computer Science degree programme (Padua, Italy).

The project implements a spam filter. It is written in Python and uses the following libraries:

Keras: definition and training of the neural network; it is imported from TensorFlow, which bundles it as of version 2.0.0;
Pandas: conversion of the dataset into .csv format;
scikit-learn: data pre-processing and use of the LogisticRegression algorithm;
Kaggle API: download of the dataset in .zip format.

The dataset used for the project can be found at the following link: Spam Collection Dataset.

The purpose of the project and the tools we used are explained in the report.

How to run it

To run the project, we suggest using Google Colab, which comes with TensorFlow 2.0.0 already installed.

To import the project into Colab, log in to your Colab account, click Upload notebook, and select the notebook you previously downloaded.

Download dataset

You need a Kaggle account and a personal KAGGLE_KEY to download the dataset; see API Credentials for more information.

Once you have the Kaggle key, do the following in the first cell of each file you want to run:

Save the key in the key field;
Write your Kaggle username in the username field.

If everything is set correctly, the cell automatically downloads the dataset from Kaggle as a .zip archive and then decompresses it into the .csv file used to train the neural network.
