Athens University of Economics and Business

School of Business

Department of Management Science & Technology

Master of Science in Business Analytics

Program:	Full-Time
Quarter:	3rd (Spring Quarter)
Course:	Machine Learning & Content Analytics
Assignment:	Image Classification for Outfit Suggestions
Students (Registration No):	Yankee Team (f2822203, f2822215, f2822217, f2822218)

AI Fashion

A project for the Machine Learning and Content Analytics course of AUEB's MSc in Business Analytics, made by the Yankee team of the full-time program.

Project Description

Our project, Image Classification for Outfit Suggestions, aims to build an application named dressmeup that creates outfit combinations from photos the user provides. It addresses two business questions: “What should I wear today (or on a specific given day) from the clothes I have?” and “Can an article (garment) that I like be combined well with the clothes I have?”

Data Collection

Most of our data came from the Fashion Product Images Dataset on Kaggle, which contains 44 thousand products with multiple category labels, descriptions, and high-resolution images. In addition, from thomascamminady's GitHub repository we obtained a digital version of the color combinations collected in Sanzo Wada's book 'A Dictionary of Color Combinations', which we use for combining clothes.

Dataset Overview

The dataset contains 44,446 products, each identified by an ID. The “styles.csv” file maps every product to its features. The features we found most useful for our project were gender, subCategory, articleType, season, and usage. Images are available for 44,441 of the products at ‘images/{ID}.jpg’. The gender attribute has 5 unique values, subCategory has 45, articleType has 143, season has 5, and usage has 9. The images have a resolution of 1800x2400. The color combination repository contains 348 combinations of 157 different colors.

Data Processing Steps

Filtered the dataset to keep only top and bottom-wear clothing items, removing other categories, but retained 'Shorts' from the 'Loungewear and Nightwear' subCategory.
Kept only specific usage categories: casual, formal, sports, smart casual, party, and travel clothes.
Excluded gender categories 'unisex', 'boys', and 'girls'; focused only on 'Men' and 'Women' clothing.
Created a new column for image paths and removed rows without associated images.
Imputed null values in the 'usage' and 'season' columns.
Set the null 'usage' values of three shirt products to 'Formal' and the null 'season' value of one T-shirt product to 'Summer'.
Recategorized 'Smart Casual', 'Party', and 'Travel' clothes as 'Casual' due to low sample sizes.
Updated 'Winter' and 'Spring' clothes to 'Fall/Winter' and 'Spring/Summer' respectively, based on main fashion week seasons.
Augmented the dataset by applying image rotations, scaling, and horizontal flips to achieve 1,000 images per articleType.
Encoded category labels alphabetically, assigning integer values (e.g., Casual=0, Formal=1, Sports=2).
Saved the processed data as a CSV file for distribution among project members for parallel work.
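
A few of the steps above (gender and category filtering, merging the rare usage classes, building image paths, and alphabetical label encoding) can be sketched with pandas. This is a minimal illustration, not the project's actual code: the five rows are invented stand-ins for styles.csv, the 'Topwear' and 'Bottomwear' values are assumed to be how the dataset names top and bottom wear, and the function name preprocess and the column names image_path and usage_label are hypothetical.

```python
import pandas as pd

# Tiny invented stand-in for styles.csv (illustration only, not real rows).
styles = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "gender": ["Men", "Women", "Unisex", "Men", "Women"],
    "subCategory": ["Topwear", "Bottomwear", "Topwear", "Topwear", "Topwear"],
    "articleType": ["Shirts", "Jeans", "Tshirts", "Tshirts", "Tops"],
    "usage": ["Formal", "Party", "Casual", "Sports", "Smart Casual"],
})


def preprocess(df):
    """Filter, recategorize, and label-encode a styles table (hypothetical helper)."""
    # Keep only Men/Women items that are top or bottom wear (assumed value names).
    keep = df["gender"].isin(["Men", "Women"]) & df["subCategory"].isin(
        ["Topwear", "Bottomwear"]
    )
    out = df[keep].copy()

    # Merge the low-sample usage classes into 'Casual'.
    out["usage"] = out["usage"].replace(
        {"Smart Casual": "Casual", "Party": "Casual", "Travel": "Casual"}
    )

    # Build the image path from the product ID.
    out["image_path"] = "images/" + out["id"].astype(str) + ".jpg"

    # Encode labels alphabetically: the sorted category order gives the integers.
    mapping = {name: i for i, name in enumerate(sorted(out["usage"].unique()))}
    out["usage_label"] = out["usage"].map(mapping)
    return out, mapping


processed, mapping = preprocess(styles)
print(mapping)
print(processed[["id", "usage", "usage_label", "image_path"]])
```

Sorting the category names before enumerating them is what makes the encoding alphabetical, so every project member who reruns the step gets the same integers.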

Methodology

We use the following algorithms, each for a specific part of the task:

Stochastic Gradient Descent (SGD) Classifier:
- Utilized as a Support Vector Machine (SVM) classifier.
- Implements regularized linear models with stochastic gradient descent.
- Supports hinge loss and L2 penalty.
- Updates the model with a learning rate that decreases over time (a decreasing strength schedule).
Passive Aggressive Classifier:
- Belongs to a family of algorithms designed for large-scale learning.
- Uses hinge loss for classification.
- Incorporates a regularization parameter, C, to control model behavior.
Convolutional Neural Network (CNN) with ResNet50 Transfer Learning:
- Leverages CNNs for image classification.
- Implements transfer learning from the ResNet50 architecture.
- ResNet50 is a 50-layer CNN known for its residual (skip) connections.
- These residual networks are designed with stacked residual blocks.
Convolutional Neural Network (CNN) with EfficientNetV2-S Transfer Learning:
- Adopts another powerful CNN architecture, EfficientNetV2-S (named EfficientNetV2S in Keras).
- EfficientNet models are created using Neural Architecture Search (NAS).
- EfficientNetV2S is a compact model with faster training and better parameter efficiency.
- It captures intricate image features efficiently, enhancing classification accuracy.
K-Means Clustering:
- Applied to cluster pixels within specified image regions.
- Extracts the most dominant color from the clusters, i.e., the color with the highest pixel count.
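
The K-Means step can be illustrated with scikit-learn. The sketch below is an assumption about how such a step might look, not the project's implementation: the function name dominant_color, the choice of three clusters, and the synthetic 10x10 region are all hypothetical. It clusters the pixels of a region and returns the centre of the cluster that holds the most pixels.

```python
import numpy as np
from sklearn.cluster import KMeans


def dominant_color(region, n_clusters=3, seed=0):
    """Return the RGB centre of the largest K-Means cluster in an H x W x 3 region."""
    # Flatten the region into a list of RGB pixels.
    pixels = region.reshape(-1, 3).astype(float)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(pixels)

    # Count how many pixels fall in each cluster and pick the biggest one.
    counts = np.bincount(km.labels_, minlength=n_clusters)
    centre = km.cluster_centers_[counts.argmax()]
    return tuple(int(round(c)) for c in centre)


# Synthetic region: 70% reddish, 20% greenish, 10% bluish pixels.
region = np.zeros((10, 10, 3), dtype=np.uint8)
region[:7] = (200, 30, 30)
region[7:9] = (30, 200, 30)
region[9:] = (30, 30, 200)

print(dominant_color(region))
```

On real photos the clusters are not this clean, so the returned centre is an average shade of the largest cluster rather than an exact pixel value.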

Together, these algorithms cover image classification, feature extraction, and color analysis. Each one plays a specific role in our fashion item classification model.
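
As a rough illustration of the two linear baselines, the sketch below fits scikit-learn's SGDClassifier (hinge loss with an L2 penalty, which behaves as a linear SVM) and PassiveAggressiveClassifier (hinge loss with regularization parameter C) on synthetic data. The two Gaussian blobs are an invented stand-in for flattened image features, and the hyperparameter values are assumptions, not the settings used in the project.

```python
import numpy as np
from sklearn.linear_model import PassiveAggressiveClassifier, SGDClassifier

# Two well-separated Gaussian blobs stand in for image feature vectors.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-2.0, 0.5, size=(50, 8)),
    rng.normal(2.0, 0.5, size=(50, 8)),
])
y = np.array([0] * 50 + [1] * 50)

# Hinge loss + L2 penalty: a linear SVM trained by stochastic gradient descent.
svm_like = SGDClassifier(loss="hinge", penalty="l2", random_state=0)
svm_like.fit(X, y)

# Passive Aggressive classifier: hinge loss, with C controlling update strength.
pa = PassiveAggressiveClassifier(C=1.0, loss="hinge", random_state=0)
pa.fit(X, y)

print("SGD (hinge) accuracy:", svm_like.score(X, y))
print("Passive Aggressive accuracy:", pa.score(X, y))
```

Both estimators also expose partial_fit, which is one way to train on image batches that do not fit in memory at once.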

Tools

Python (version 3.11) Programming Language.

Scikit-Learn and Tensorflow libraries.

Streamlit open-source app framework, used to create the web application (the deployed version omits the Click & Update option and has a slight UI/performance upgrade).

SQLite Database Engine.

Results

Average Output Accuracy per Model and Dataset

Models	Train Avg Output Accuracy (%)	Evaluation Avg Output Accuracy (%)	Test Avg Output Accuracy (%)
1. SGD Classifier (SVM)	69.0	-	66.6
2. Passive Aggressive Classifier	64.4	-	63.0
3. CNN with pre-trained ResNet50	87.2	82.5	82.1
4. CNN with pre-trained EfficientNetV2-S	96.4	90.0	90.7
