
Efficient ImageNet Classification

🚀 Training ResNet50 on ImageNet in 8 hours.

This repo provides an efficient implementation of ImageNet classification, based on PyTorch, DALI, and Apex.

If you have any questions, please create an issue or contact me at [email protected]

Features

  • Accelerated pre-processing of the input data with DALI
  • Half/mixed-precision training with Apex
  • Real-time logger
  • Extremely simple structure

Getting Started

Installation

1. Download repo

git clone https://github.com/13952522076/Efficient_ImageNet_Classification.git
cd Efficient_ImageNet_Classification

2. Requirements

  • Python 3.6
  • PyTorch 1.3+
  • CUDA 10+
  • GCC 5.0+
pip install -r requirements.txt

3. Install DALI and Apex

DALI Installation:

cd ~
# For CUDA10
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda100
# or
# For CUDA11
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali-cuda110

For more details, please see Nvidia DALI installation.
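
For reference, here is a minimal sketch of a DALI training pipeline using the fn API (the repo itself may use the older ops-class API; data_dir and all parameter values below are placeholders, not values taken from this repo):

from nvidia.dali import pipeline_def, fn, types
from nvidia.dali.plugin.pytorch import DALIClassificationIterator

@pipeline_def
def train_pipe(data_dir):
    # read JPEGs and labels from an ImageNet-style folder tree
    jpegs, labels = fn.readers.file(file_root=data_dir,
                                    random_shuffle=True, name="Reader")
    # decode on the GPU ("mixed" = CPU input, GPU output)
    images = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
    images = fn.random_resized_crop(images, size=(224, 224))
    # normalize with the usual ImageNet statistics, output NCHW float tensors
    images = fn.crop_mirror_normalize(
        images, dtype=types.FLOAT, output_layout="CHW",
        mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
        std=[0.229 * 255, 0.224 * 255, 0.225 * 255])
    return images, labels

pipe = train_pipe(data_dir="/path/to/imagenet/train",
                  batch_size=32, num_threads=4, device_id=0)
pipe.build()
loader = DALIClassificationIterator(pipe, reader_name="Reader")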

Apex Installation:

cd ~
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

For more details, please see Apex or Apex Full API documentation.
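
A minimal sketch of how Apex typically plugs into a training script (this mirrors, but is not copied from, what main_step.py/main_cosine.py do; the opt_level and hyper-parameters are placeholders):

import torch
import torchvision.models as models
from apex import amp
from apex.parallel import DistributedDataParallel as DDP

model = models.resnet50().cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# "O1" patches common ops to run in FP16 while keeping FP32 master weights
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")
model = DDP(model, delay_allreduce=True)

# inside the training loop, scale the loss before backpropagation:
#   with amp.scale_loss(loss, optimizer) as scaled_loss:
#       scaled_loss.backward()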

Training & Testing

We provide two training strategies: a step LR scheduler and a cosine LR scheduler, implemented in main_step.py and main_cosine.py respectively.
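
For intuition, the two schedules look roughly like this (a hedged sketch; the exact step size, decay factor, and epoch count used by the scripts may differ):

import math

def step_lr(base_lr, epoch, step=30, gamma=0.1):
    # decay the learning rate by `gamma` every `step` epochs
    return base_lr * (gamma ** (epoch // step))

def cosine_lr(base_lr, epoch, total_epochs=120):
    # anneal the learning rate from base_lr to 0 along a half cosine
    return 0.5 * base_lr * (1 + math.cos(math.pi * epoch / total_epochs))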

The trained models (the last and the best checkpoints) and the log file are saved to "checkpoints/imagenet/model_name" by default.
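
The usual last/best checkpoint pattern looks like the sketch below (the file names are assumptions, not taken from this repo):

import os
import torch

def save_checkpoint(state, is_best, save_dir="checkpoints/imagenet/model_name"):
    os.makedirs(save_dir, exist_ok=True)
    # always overwrite the latest checkpoint
    torch.save(state, os.path.join(save_dir, "checkpoint.pth.tar"))
    if is_best:
        # keep a separate copy of the best-accuracy model
        torch.save(state, os.path.join(save_dir, "model_best.pth.tar"))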


I personally suggest manually setting the path to the ImageNet dataset in main_step.py (line 49) and main_cosine.py (line 50): replace the default value with your real path.

Alternatively, you can pass the --data argument in the training commands below.

For the step learning rate scheduler, run the following command:

# change the parameters accordingly if necessary
# e.g., if you have 4 GPUs, set nproc_per_node to 4; to train in FP32, remove --fp16.
python3 -m torch.distributed.launch --nproc_per_node=8 main_step.py -a old_resnet50 --fp16 --b 32

For the cosine learning rate scheduler, run the following command:

# change the parameters accordingly if necessary
python3 -m torch.distributed.launch --nproc_per_node=8 main_cosine.py -a old_resnet18 --b 64 --opt-level O0

Add New Models

Please follow the same coding style in models/resnet.py.

  1. Add a new model file in the models folder
  2. Import the model file in the model package, i.e. models/__init__.py (see the sketch below)
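
A sketch of step 2, assuming a hypothetical new file models/my_net.py that defines a constructor my_net50:

# models/__init__.py
from .resnet import *
from .my_net import *   # hypothetical new model file

The new model can then presumably be selected with -a my_net50, mirroring the -a old_resnet50 usage above.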

Calculate Parameters and FLOPs

python3 count_Param.py

๐Ÿ›: It would not consider the forward operations. For example, defining a pooling layer in init function and implementing the pooling operation in forward function will lead to different results.

Acknowledgements

This implementation is built upon the PyTorch ImageNet demo and PytorchInsight.

Many thanks to Xiang Li for his great work.


