In the insurance industry, processing claims for vehicle damage is a common task.
With advancements in AI and Computer Vision, settling claims online by uploading damaged car images is now possible.
https://www.kaggle.com/datasets/imnandini/analytics-vidya-ripik-ai-hackfest
- Training set (train.zip)
- Test set (test.zip)
- Sample submission (sample_submission.csv)
The training set contains a diverse dataset of car images with labels indicating the specific type of damage (e.g., dents, scratches, cracks).
The train.csv file includes the following columns:
- image_id: Unique identifier of the image
- filename: Filename of the image
- label: Type of damage present in the car
  - Crack
  - Scratch
  - Tire Flat
  - Dent
  - Glass Shatter
  - Lamp Broken
The test set contains only images, and the goal is to predict the type of damage for each image.
The test.csv file includes the following columns:
- image_id: Unique identifier of the image
- filename: Filename of the image
The solution file must contain predictions for every image_id in the test set. It must contain only two columns: image_id and label.
The solution file format must be similar to that of sample_submission.csv. sample_submission.csv contains 2 variables:
- image_id: Unique identifier of an image
- label: Type of damage present in the car {1: crack, 2: scratch, 3: tire flat, 4: dent, 5: glass shatter, 6: lamp broken}
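As a rough sketch of the submission format described above, the snippet below maps class names to the numeric codes and writes the two required columns. The helper name `write_submission` and the image IDs are hypothetical, and storing the numeric code in the label column is an assumption based on the mapping listed for sample_submission.csv.

```python
import csv
import io

# Numeric label codes as listed for sample_submission.csv.
LABEL_TO_NAME = {
    1: "crack",
    2: "scratch",
    3: "tire flat",
    4: "dent",
    5: "glass shatter",
    6: "lamp broken",
}
NAME_TO_LABEL = {name: code for code, name in LABEL_TO_NAME.items()}


def write_submission(predictions, out_file):
    """Write (image_id, predicted class name) pairs as image_id,label rows."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["image_id", "label"])
    for image_id, class_name in predictions:
        writer.writerow([image_id, NAME_TO_LABEL[class_name]])


# Hypothetical predictions for two test images (IDs are made up).
buffer = io.StringIO()
write_submission([(7201, "dent"), (7202, "scratch")], buffer)
submission_text = buffer.getvalue()
print(submission_text)
```

To produce a real file, pass an open file handle (for example `open("submission.csv", "w", newline="")`) instead of the in-memory buffer.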
The model will be evaluated based on the macro F1 score.
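For reference, macro F1 computes an F1 score per class and averages them with equal weight, so rare damage types count as much as common ones. The following is a minimal pure-Python sketch on toy labels (not taken from the dataset); the function name `macro_f1` is illustrative only.

```python
def macro_f1(y_true, y_pred, labels):
    """Average the per-class F1 scores, weighting every class equally."""
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted samples contributes 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels using only codes 1..3; per-class F1 is 0.5, 0.8 and 2/3.
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
score = macro_f1(y_true, y_pred, labels=[1, 2, 3])
print(round(score, 4))
```

In practice, `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` computes the same quantity.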
The project is organized into CRISP-DM phases for effective development and documentation.
- Business Understanding
- Data Understanding
- Data Preparation
- Modeling
- Evaluation
- Deployment
- Conclusion
- GitHub Repo: Car Damage Image Classification Capstone Project
- Project Notebook: Car_Damage Image_Multi_Class_Classification.ipynb
Identifying fraudulent claims, especially those exaggerating damage, poses a challenge. The goal is to develop a high-performance model for automatic car damage classification, enabling insurance companies to assess claim legitimacy accurately.
Develop a model to automatically classify images of damaged cars into different types of damages for efficient claims processing and fraud detection.
- Insurance companies
- Claim processing teams
- Description of dataset acquisition.
- Dataset statistics.
- Visualizations of image samples and their labels.
- Insights into class distribution.
- Image resizing and normalization.
- Augmentation techniques applied.
- Keras offers pretrained models at keras.io
- I use the EfficientNetV2B0 model because it has a fairly high Top-1 accuracy and does not require a very deep network.
- EfficientNetV2 models expect their inputs to be float tensors of pixels with values in the [0, 255] range.
- Changing the learning rate
- Adding more layers:
  - Conv2D
  - AveragePooling2D
  - SpatialDropout2D
  - Dropout
  - BatchNormalization
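One way the listed layers could be combined on top of EfficientNetV2B0 is sketched below. This is not the notebook's actual architecture: the layer order, filter count, dropout rates, input size, frozen backbone and learning rate are all assumptions chosen for illustration. TensorFlow is imported inside the function so the constants can be used without it installed.

```python
NUM_CLASSES = 6          # crack, scratch, tire flat, dent, glass shatter, lamp broken
IMAGE_SIZE = (224, 224)  # assumed input resolution, not stated in the source


def build_model(learning_rate=1e-4, weights="imagenet"):
    """Return a compiled EfficientNetV2B0 classifier with a small custom head."""
    # Imported lazily so this module loads even without TensorFlow installed.
    import tensorflow as tf
    from tensorflow.keras import layers

    # include_preprocessing defaults to True, so raw [0, 255] pixels are expected.
    base = tf.keras.applications.EfficientNetV2B0(
        include_top=False, weights=weights, input_shape=IMAGE_SIZE + (3,)
    )
    base.trainable = False  # assumption: freeze the pretrained backbone at first

    inputs = tf.keras.Input(shape=IMAGE_SIZE + (3,))
    x = base(inputs, training=False)
    x = layers.Conv2D(256, kernel_size=3, padding="same", activation="relu")(x)
    x = layers.BatchNormalization()(x)
    x = layers.SpatialDropout2D(0.2)(x)
    x = layers.AveragePooling2D(pool_size=2)(x)
    x = layers.Flatten()(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",
        metrics=["categorical_accuracy"],
    )
    return model
```

Calling `build_model(learning_rate=...)` with different values is one simple way to experiment with the learning rate; passing `weights=None` skips the ImageNet download.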
- Definition of evaluation metrics.
- Submissions are evaluated using the probabilistic F1 score (pFbeta)
- Results on validation and test sets.
675/675 [==============================] - 598s 886ms/step - loss: 1.3841 - categorical_accuracy: 0.3993 - pFbeta: 0.3193 - precision: 0.5266 - recall: 0.1667 - val_loss: 1.5634 - val_categorical_accuracy: 0.3800 - val_pFbeta: 0.2988 - val_precision: 0.4104 - val_recall: 0.1400
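The pFbeta values in the log above refer to a probabilistic F-beta score, which sums predicted probabilities instead of counting hard hits. The sketch below shows the usual binary definition in pure Python. How the notebook extends it to six classes is not stated, so treat this as an assumption rather than the exact metric used.

```python
def pfbeta(labels, probabilities, beta=1.0):
    """Probabilistic F-beta for binary labels (0/1) and predicted probabilities."""
    positives = sum(labels)
    # Probabilistic true/false positives: add up probabilities, not hard counts.
    p_tp = sum(p for y, p in zip(labels, probabilities) if y == 1)
    p_fp = sum(p for y, p in zip(labels, probabilities) if y == 0)
    if positives == 0 or p_tp == 0:
        return 0.0
    precision = p_tp / (p_tp + p_fp)
    recall = p_tp / positives
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Toy example: precision and recall are both 0.7, so the score is 0.7.
toy_score = pfbeta([1, 0, 1, 0], [0.8, 0.2, 0.6, 0.4])
print(round(toy_score, 4))
```

For one-hot encoded targets, one possible extension is to flatten the label and probability matrices before calling the function.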
Once the Dockerfile, based on the recommended public image for Lambda, has been created, build the Docker image:
docker build -t car-insurance-model .
To test it, first run the image that was built:
docker run -it --rm -p 8080:8080 car-insurance-model:latest
# Tag the existing image as username/car-insurance-model:new-tag
docker tag car-insurance-model:latest developerhost/car-insurance-model:latest
# Push the newly tagged image to Docker Hub
docker push developerhost/car-insurance-model:latest
# The image can then be pulled
docker pull developerhost/car-insurance-model:latest
The following handler function must be added to the lambda_function.py file:
def lambda_handler(event, context):
    # The event carries the URL of the image to classify.
    url = event['url']
    result = predict(url)
    return result
Run the file:
python client_to_docker_test.py
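The contents of client_to_docker_test.py are not shown here. A minimal sketch of such a client follows, assuming the container exposes the standard invocation path of the AWS Lambda runtime interface emulator on port 8080. The function names and the image URL are hypothetical.

```python
import json
import urllib.request

# Standard invocation path of the Lambda runtime interface emulator;
# port 8080 matches the `docker run -p 8080:8080` command.
INVOKE_URL = "http://localhost:8080/2015-03-31/functions/function/invocations"


def build_request(image_url):
    """Build the POST request whose JSON body becomes `event` in lambda_handler."""
    payload = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        INVOKE_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke(image_url, timeout=60):
    """Send the request to the running container and decode the JSON reply."""
    with urllib.request.urlopen(build_request(image_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical image URL; only the request is built here, nothing is sent.
request = build_request("https://example.com/damaged-car.jpg")
print(request.get_method(), request.full_url)
```

With the container running, `invoke("<image URL>")` would return the dictionary of class probabilities produced by the handler.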
This is the output I received. The highest probability is assigned to "dent", which is the correct class for this image:
{'crack': 0.006185653153806925,
'scratch': 0.34056955575942993,
'tire_flat': 0.021280569955706596,
'dent': 0.5486962795257568,
'glass_shatter': 0.0674322172999382,
'lamp_broken': 0.01583569310605526}
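To turn such a response into a single predicted class, one simple option is to take the key with the highest probability. The snippet below is a small sketch using the values from the output above; the variable names are illustrative.

```python
# Class probabilities copied from the response shown above.
result = {
    "crack": 0.006185653153806925,
    "scratch": 0.34056955575942993,
    "tire_flat": 0.021280569955706596,
    "dent": 0.5486962795257568,
    "glass_shatter": 0.0674322172999382,
    "lamp_broken": 0.01583569310605526,
}

# Pick the class whose probability is largest.
predicted_class = max(result, key=result.get)
confidence = result[predicted_class]
print(f"{predicted_class} ({confidence:.1%})")  # prints "dent (54.9%)"
```

Note that "scratch" still receives about 34% here, so a confidence threshold could be worth considering before a prediction is used to settle a claim automatically.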