CS260D-project Investigation of SAS for imbalanced data

Linqiao Jiang
Zixian Li
Yihao Qin
Chang Xie

Description

This repository contains the code for our CS260D project, "Investigation of SAS for imbalanced data".

Environments

The experiments were run in the following environment:

Python version: 3.10.13
PyTorch version: 2.1.1+cu118

Jupyter notebook description

Final_Project_balanced.ipynb:

This Jupyter notebook loads the balanced CIFAR100 data, selects a subset of the balanced data using SAS, and reports the training results for both the full set and the subset.

Final_Project_imbalanced.ipynb:

This Jupyter notebook constructs the imbalanced CIFAR100 data, selects a subset of the imbalanced data using SAS, and reports the training results for both the full set and the subset.

Final_Project_solution.ipynb:

This Jupyter notebook contains our two proposed solutions, solution 1 (MDP) and solution 2 (MDA), for upsampling the imbalanced data. It selects a subset of the upsampled data using SAS and reports the training results for both the full set and the subset.
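
For readers unfamiliar with upsampling, the sketch below shows plain random oversampling: minority-class samples are repeated until every class is as large as the biggest one. This is a generic baseline chosen for illustration, not the MDP or MDA method from the notebook, and the helper name `oversample_indices` is hypothetical.

```python
import random
from collections import defaultdict


def oversample_indices(labels, seed=0):
    """Repeat minority-class indices until every class matches the largest one.

    Hypothetical helper for illustration; not part of this repository.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        # Draw the missing samples with replacement from the same class.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Toy labels: 8 samples of class 0 and 2 samples of class 1.
labels = [0] * 8 + [1] * 2
balanced = oversample_indices(labels)
```

The returned indices can be used to index any dataset that exposes integer labels. After oversampling, both toy classes contribute 8 entries each.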

Usage

1. Enter the directory and install the SAS package

$ cd SAS-IMBALANCEDDATA
$ pip install sas-pip/

2. Dataset

We use the CIFAR100 dataset from torchvision because it makes it convenient to construct an imbalanced dataset and train on it. dataset.py contains sample code for writing a custom dataset module, as a reference for readers who have not written one before.
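
As a rough illustration of how an imbalanced subset could be built from class labels, the sketch below keeps an exponentially decreasing fraction of each class (a common long-tailed profile). The exact imbalance scheme used in the notebooks is not described here, so the profile, the `imbalance_ratio` parameter, and the helper name `long_tail_indices` are all assumptions.

```python
import random
from collections import defaultdict


def long_tail_indices(labels, imbalance_ratio=0.1, seed=0):
    """Return sample indices whose class sizes decay exponentially.

    The first class keeps all of its samples; the last class keeps about
    `imbalance_ratio` times as many. Hypothetical helper, not from this repo.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    classes = sorted(by_class)
    num_classes = len(classes)
    kept = []
    for rank, cls in enumerate(classes):
        # Fraction goes from 1.0 (first class) down to imbalance_ratio (last class).
        exponent = rank / max(num_classes - 1, 1)
        fraction = imbalance_ratio ** exponent
        n_keep = max(1, int(len(by_class[cls]) * fraction))
        kept.extend(rng.sample(by_class[cls], n_keep))
    return sorted(kept)


# Synthetic stand-in for the CIFAR100 targets: 10 classes, 100 samples each.
labels = [c for c in range(10) for _ in range(100)]
indices = long_tail_indices(labels, imbalance_ratio=0.1)

# Count how many samples of each class survived.
counts = defaultdict(int)
for i in indices:
    counts[labels[i]] += 1
```

With torchvision, the labels would come from the `targets` attribute of the CIFAR100 dataset, and the selected indices could be wrapped with `torch.utils.data.Subset(dataset, indices)` to obtain a training set.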

3. Run TensorBoard (optional)

Install TensorBoard and launch it on the runs log directory:

$ pip install tensorboard
$ tensorboard --logdir=runs

4. Train the model

The training pipeline is in train_changedata.py, which covers the whole training and testing process. To run training from a Jupyter notebook, the pipeline is wrapped in a function main(), which can be called from Python as follows:

# python
from train_changedata import main
best_acc, confusion_matrix, best_f1, best_recall = main(train_dataset=new_cifar)

main() returns the best accuracy, the confusion matrix, the F1 score, and the recall of the model.

The original train.py is also kept for reference.
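
To show why recall and F1 are reported alongside accuracy for imbalanced data, here is a minimal sketch that derives all three from a confusion matrix. It assumes macro averaging over classes and a matrix indexed as [true class][predicted class]; train_changedata.py may compute these differently, and the function name `macro_metrics` is hypothetical.

```python
def macro_metrics(confusion):
    """Compute accuracy, macro recall and macro F1 from a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    Hypothetical helper for illustration; not part of this repository.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    recalls, f1s = [], []
    for k in range(n):
        tp = confusion[k][k]
        actual = sum(confusion[k])                          # true members of class k
        predicted = sum(confusion[i][k] for i in range(n))  # samples predicted as k
        recall = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f1s.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)

    accuracy = correct / total if total else 0.0
    return accuracy, sum(recalls) / n, sum(f1s) / n


# Toy case: 90 majority samples all correct, 10 minority samples all missed.
toy = [[90, 0],
       [10, 0]]
accuracy, macro_recall, macro_f1 = macro_metrics(toy)
```

In this toy case the accuracy is 0.9 even though the minority class is never predicted, while the macro recall drops to 0.5, which is why the per-class metrics are more informative on imbalanced data.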
