MLE-Zoomcamp-M12-Car-Damage-Image-Classification-Capstone-Project

Introduction

In the insurance industry, processing claims for vehicle damage is a common task.
With advancements in AI and Computer Vision, settling claims online by uploading damaged car images is now possible.

Dataset

https://www.kaggle.com/datasets/imnandini/analytics-vidya-ripik-ai-hackfest

Training set (train.zip)
Test set (test.zip)
Sample submission (sample_submission.csv)

Training Dataset

The training set contains a diverse set of car images, each labeled with the specific type of damage (e.g., dents, scratches, cracks).
The train.csv file includes the following columns:

  • image_id: Unique identifier of the image
  • filename: Filename of the image
  • label: Type of damage present in the car
    1. Crack
    2. Scratch
    3. Tire Flat
    4. Dent
    5. Glass Shatter
    6. Lamp Broken

Test Dataset

The test set contains only images, and the goal is to predict the type of damage for each image.
The test.csv file includes the following columns:

  • image_id: Unique identifier of the image
  • filename: Filename of the image

Sample Submission

The solution file must contain a prediction for every image_id in the test set. It must contain exactly two columns: image_id and label.
The solution file must follow the format of sample_submission.csv, which contains two columns:

  • image_id: Unique identifier of an image
  • label: Type of damage present in the car {1:crack, 2:scratch, 3:tire flat, 4:dent, 5:glass shatter, 6:lamp broken}
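As a rough illustration of the submission format described above, the sketch below writes a two-column file with Python's standard csv module. The image IDs and predicted classes are made-up placeholders, and write_submission is a hypothetical helper, not part of the project code.

```python
import csv
import io

# Label mapping taken from the dataset description above.
LABELS = {1: "crack", 2: "scratch", 3: "tire flat", 4: "dent",
          5: "glass shatter", 6: "lamp broken"}
NAME_TO_ID = {name: idx for idx, name in LABELS.items()}


def write_submission(predictions, out):
    """Write (image_id, class name) pairs as image_id,label rows."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["image_id", "label"])
    for image_id, class_name in predictions:
        writer.writerow([image_id, NAME_TO_ID[class_name]])


# Hypothetical predictions, for illustration only.
example_predictions = [(1, "dent"), (2, "scratch"), (3, "tire flat")]
buffer = io.StringIO()
write_submission(example_predictions, buffer)
submission_text = buffer.getvalue()
print(submission_text)
```

To produce a real file, pass an open file handle (for example open("submission.csv", "w", newline="")) instead of the in-memory buffer.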

Evaluation Metric

The model will be evaluated based on the macro F1 score.
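For reference, macro F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. A minimal sketch in plain Python follows; the label lists are invented for illustration, and a class with no true or predicted samples is scored as 0 here, which is one common convention rather than the only one.

```python
def macro_f1(y_true, y_pred, labels):
    """Return the unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); score 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical ground truth and predictions over three classes.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
score = macro_f1(y_true, y_pred, labels=[1, 2, 3])
print(round(score, 3))
```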


Project Structure

The project is organized into CRISP-DM phases for effective development and documentation.

Table of Contents

  1. Business Understanding
  2. Data Understanding
  3. Data Preparation
  4. Modeling
  5. Evaluation
  6. Deployment
  7. Conclusion

Business Understanding

Project Name

Problem Statement

Identifying fraudulent claims, especially those exaggerating damage, poses a challenge. The goal is to develop a high-performance model for automatic car damage classification, enabling insurance companies to assess claim legitimacy accurately.

Objective

Develop a model to automatically classify images of damaged cars into different types of damages for efficient claims processing and fraud detection.

Stakeholders

  • Insurance companies
  • Claim processing teams

Data Understanding

Data Collection

  • Description of dataset acquisition.
  • Dataset statistics.

Exploratory Data Analysis (EDA)

  • Visualizations of image samples and their labels.
  • Insights into class distribution.

Data Preparation

Data Preprocessing

  • Image resizing and normalization.
  • Augmentation techniques applied.

Modeling

Model Selection

  • Keras offers pretrained models at keras.io
  • I use the EfficientNetV2B0 model because it offers fairly high Top-1 accuracy without requiring a deep network.
  • EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [0, 255] range.
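A minimal transfer-learning sketch along these lines is shown below. It is not the exact architecture used in this project: the 224x224 input size, the frozen backbone, the single Dropout plus Dense head, the dropout rate, and the learning rate are all assumptions chosen for illustration. TensorFlow is imported inside the function so the snippet can be loaded without it.

```python
NUM_CLASSES = 6  # crack, scratch, tire flat, dent, glass shatter, lamp broken


def build_model(num_classes=NUM_CLASSES, input_shape=(224, 224, 3),
                learning_rate=1e-3, weights="imagenet"):
    """Build an EfficientNetV2B0 classifier with a small softmax head."""
    import tensorflow as tf  # imported lazily; requires TensorFlow 2.8 or later

    # The default include_preprocessing=True rescales inputs internally,
    # so images are passed in as float pixels in the [0, 255] range.
    base = tf.keras.applications.EfficientNetV2B0(
        include_top=False, weights=weights,
        input_shape=input_shape, pooling="avg")
    base.trainable = False  # assumption: train only the new head at first

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = tf.keras.layers.Dropout(0.2)(x)  # assumed rate
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate),
        loss="categorical_crossentropy",
        metrics=["categorical_accuracy"])
    return model


# Usage (downloads ImageNet weights on first call):
# model = build_model()
# model.summary()
```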

Hyperparameter Tuning

  • Changing the learning rate
  • Adding more layers
    • Conv2D
    • AveragePooling2D
    • SpatialDropout2D
    • Dropout
    • BatchNormalization

Evaluation

Performance Metrics

  • Definition of evaluation metrics.
  • Results on validation and test sets.

Training log from one epoch:

675/675 [==============================] - 598s 886ms/step - loss: 1.3841 - categorical_accuracy: 0.3993 - pFbeta: 0.3193 - precision: 0.5266 - recall: 0.1667 - val_loss: 1.5634 - val_categorical_accuracy: 0.3800 - val_pFbeta: 0.2988 - val_precision: 0.4104 - val_recall: 0.1400

Deployment

Model Deployment

Preparing Docker Image

Once the Dockerfile has been created, build the Docker image, which is based on the recommended public image for Lambda:

docker build -t car-insurance-model .

To test, first run the image that was built:

docker run -it --rm -p 8080:8080 car-insurance-model:latest

Docker Hub

# Tag the existing image (format: username/car-insurance-model:new-tag)
docker tag car-insurance-model:latest developerhost/car-insurance-model:latest

# Push the newly tagged image to Docker Hub
docker push developerhost/car-insurance-model:latest

# Pull the image
docker pull developerhost/car-insurance-model:latest

test.py was created per the AWS documentation for testing.
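A test client of this kind might look roughly like the sketch below, which uses only the standard library. It assumes the container is running locally on port 8080 as started above and that it exposes the standard invocation path of the AWS Lambda Runtime Interface Emulator. The helper names build_request and invoke, and the image URL, are placeholders rather than the project's actual code.

```python
import json
import urllib.request

# Standard invocation path exposed by the Lambda Runtime Interface Emulator.
INVOKE_URL = "http://localhost:8080/2015-03-31/functions/function/invocations"


def build_request(image_url, endpoint=INVOKE_URL):
    """Build a POST request whose JSON body carries the 'url' key."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint, data=body,
        headers={"Content-Type": "application/json"}, method="POST")


def invoke(image_url):
    """Send the request to the running container and decode the JSON reply."""
    with urllib.request.urlopen(build_request(image_url)) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires the container to be running; the URL is a placeholder):
# print(invoke("https://example.com/damaged-car.jpg"))
```

The "url" key in the request body matches the event['url'] lookup performed by the Lambda handler.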

Lambda function: a handler function must be added to the lambda_function.py file as below:

def lambda_handler(event, context):
    url = event['url']
    result = predict(url)
    return result

Run the file:

python client_to_docker_test.py

This is the output I received. It shows that the image was predicted as a "dent", which is correct:

{'crack': 0.006185653153806925,
 'scratch': 0.34056955575942993,
 'tire_flat': 0.021280569955706596,
 'dent': 0.5486962795257568,
 'glass_shatter': 0.0674322172999382,
 'lamp_broken': 0.01583569310605526}
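The predicted class is simply the key with the highest probability. A short sketch of reading it from the response above:

```python
# Probabilities copied from the response shown above.
probabilities = {
    "crack": 0.006185653153806925,
    "scratch": 0.34056955575942993,
    "tire_flat": 0.021280569955706596,
    "dent": 0.5486962795257568,
    "glass_shatter": 0.0674322172999382,
    "lamp_broken": 0.01583569310605526,
}

# Pick the class whose probability is largest.
top_class = max(probabilities, key=probabilities.get)
summary = f"{top_class}: {probabilities[top_class]:.3f}"
print(summary)  # dent: 0.549
```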

Testing function for Lambda

output:

# python client_to_docker_test.py
# {"crack": 0.006185653153806925, "scratch": 0.34056955575942993, "tire_flat": 0.021280569955706596, "dent": 0.5486962795257568, "glass_shatter": 0.0674322172999382, "lamp_broken": 0.01583569310605526}

Conclusion
