fraud_detection project

JEDHA Bootcamp - Data Science & Engineering - Lead - June 2024

Webliography

Objectives

Use a model in production to predict fraudulent payments in real time and respond appropriately.

This means:

  1. Create a model to predict fraudulent payments in real time

    • At this point, the performance of the model is NOT that important
    • Can we plug in a new model easily?
    • How do we monitor model accuracy over time?
    • What if it drifts? Again, can we plug in a new model easily?
    • Remember to run a mini EDA (2,500 frauds out of 500,000 rows)
  2. Create an infrastructure that ingests real-time payments

    • What if the number of payments increases by 10x?
    • How does the architecture scale?
    • Plan to consume the data in batches of N transactions to check each time
      • Making predictions one at a time is very slow
    • For saving to the database (Store Data below)
      • Save all the data received from the API
      • Plus a "prediction" column holding the model's inference
      • Plus a "true value" column that will be filled in later, after verification (if it takes place)
        • Once the verification is done, the additional row must be sent to the training dataset
        • This enriches the model for the next training run
  3. Classify each payment

  4. Send the prediction in real time to a notification center

    • email ?
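The batching and storage ideas above can be sketched as follows. This is only an illustration under stated assumptions: the `FraudModel` interface, the `AmountThresholdModel` placeholder, the `score_stream` helper and the `amt` field name are all hypothetical and do not come from the project code. The sketch shows how any model exposing a batch `predict` method could be plugged in, and how each stored row carries a "prediction" column plus an empty "true_value" column to be filled in after verification.

```python
from __future__ import annotations

from itertools import islice
from typing import Iterable, Iterator, Protocol


class FraudModel(Protocol):
    """Hypothetical interface: any object with a batch `predict` can be plugged in."""

    def predict(self, rows: list[dict]) -> list[int]: ...


class AmountThresholdModel:
    """Placeholder model (assumption): flags payments above a fixed amount."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def predict(self, rows: list[dict]) -> list[int]:
        # `amt` is an assumed field name for the payment amount.
        return [1 if row["amt"] > self.threshold else 0 for row in rows]


def batched(items: Iterable, size: int) -> Iterator[list]:
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def score_stream(
    transactions: Iterable[dict], model: FraudModel, batch_size: int
) -> Iterator[dict]:
    """Score transactions batch by batch and yield rows ready to be stored."""
    for batch in batched(transactions, batch_size):
        predictions = model.predict(batch)  # one model call per batch
        for row, prediction in zip(batch, predictions):
            # Keep everything received, add the inference, and leave
            # `true_value` empty until a verification takes place.
            yield {**row, "prediction": prediction, "true_value": None}


if __name__ == "__main__":
    amounts = [12.5, 980.0, 43.0, 1500.0, 7.2]
    payments = [{"id": i, "amt": amt} for i, amt in enumerate(amounts)]
    model = AmountThresholdModel(threshold=500.0)
    for record in score_stream(payments, model, batch_size=2):
        print(record)
```

Swapping the model then only means passing a different object to `score_stream`; the ingestion and storage code does not change.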

Deliverables

A PowerPoint slide deck explaining

  • the architecture
  • the choices
  • the use cases
    • performance in terms of scalability
    • Can we swap in a new prediction model easily?
    • drift monitoring?
    • ...
  • Can we demonstrate the project in real time in some way?
    • Video?

drawing

Directories organization

├───00_mlflow_tracking_server
│   └───assets
├───01_images_for_model_trainers
│   └───01_sklearn_trainer
├───02_train_code
│   └───01_sklearn
│       ├───01_minimal
│       │   ├───assets
│       │   └───img
│       └───02_template
│           ├───assets
│           └───img
├───98_EDA
├───99_tooling
│   ├───01_client_predict
│   │   ├───app
│   │   └───assets
│   ├───02_API_test
│   └───03_combine_train_and validated
├───assets
└───data 
  • 00_mlflow_tracking_server : everything needed to build & deploy mlflow tracking server. There is a readme.md
  • 01_images_for_model_trainers : everything needed to build docker images where the model to be trained will run. There is a readme.md
    • 01_sklearn_trainer :
  • 02_train_code
    • 01_sklearn
      • 01_minimal
      • 02_template
  • 98_EDA : quick EDA (jupyter notebook)
  • 99_tooling
    • 01_client_predict : demonstrates how to make a prediction using python
    • 02_API_test : demonstrates how to get simulated transactions with the API
    • 03_combine_train_and validated : demonstrates how to combine two dataframes: the initial training dataset and a dataframe containing additional validated data
  • assets : png, pptx
  • data : local copy of the dataset
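As an illustration of what combining the training set with validated data might look like, here is a minimal pandas sketch. It is not the code from `03_combine_train_and validated`: the function name and the column names `is_fraud`, `prediction` and `true_value` are assumptions made for the example. Only rows whose true value has been filled in are appended to the training data.

```python
import pandas as pd


def combine_train_and_validated(
    train: pd.DataFrame, validated: pd.DataFrame
) -> pd.DataFrame:
    """Append verified rows to the training set (column names are assumed)."""
    # Keep only the rows that were actually verified.
    checked = validated.dropna(subset=["true_value"]).copy()
    # The verified label becomes the training target.
    checked["is_fraud"] = checked["true_value"].astype(int)
    # Drop the bookkeeping columns that the training set does not have.
    checked = checked.drop(columns=["prediction", "true_value"])
    # Align the column order with the training set before concatenating.
    return pd.concat([train, checked[train.columns]], ignore_index=True)


if __name__ == "__main__":
    train = pd.DataFrame({"amt": [10.0, 900.0], "is_fraud": [0, 1]})
    validated = pd.DataFrame(
        {
            "amt": [50.0, 700.0],
            "prediction": [0, 1],
            "true_value": [0.0, None],  # second row not verified yet
        }
    )
    print(combine_train_and_validated(train, validated))
```

Rows that are still waiting for verification stay out of the training set, so the model is only retrained on labels that have been confirmed.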

Blablabla...

  • Demo color : your text
  • Let's try to create a readme.md in every directory
    • In this case, create an assets directory in which to store the .png files of the readme.md
  • Link to the mlflow tracking server : https://fraud-202406-70e02a9739f2.herokuapp.com/
