This project uses Convolutional Neural Networks (CNNs) and transfer learning to process real-world, user-supplied images. It detects whether an image contains a dog or a human, then predicts the dog's breed or, for a human, the breed that the person most resembles.
This is an educational project intended to provide practical experience with transfer learning.
- Clone the repository and navigate to the downloaded folder.
git clone https://github.com/qiaochen/CNNDogBreedDetector.git
cd dog-project
- Download the dog dataset. Unzip the folder and place it in the repo, at location path/to/dog-project/dogImages.
- Download the human dataset. Unzip the folder and place it in the repo, at location path/to/dog-project/lfw.
- Download the bottleneck features for transfer learning:
  - Xception bottleneck features
  - VGG-19 bottleneck features
  - ResNet-50 bottleneck features
  - Inception bottleneck features
  - VGG-16 bottleneck features

  Place them in the repo, at location path/to/dog-project/bottleneck_features.
- Install Requirements
The core required packages are Keras (on the TensorFlow backend) and OpenCV. The detailed list can be found in the requirements.txt file in the requirements folder. Note that I used a GPU to train the models; please refer to the TensorFlow documentation to make sure the environment for the GPU version is prepared. To install all the dependencies, execute:
pip install -r requirements/requirements-gpu.txt
- Run The Training Code
- Non Transfer Learning model
python train_from_scratch.py # train the network without transfer learning
The trained model is saved in the saved_models directory under the name weights.best.FromScratch.hdf5
- Transfer Learning models
python train_transfer_learning.py # train the transfer learning models
The trained models are saved in the saved_models directory, with one file per base architecture:
weights.best.VGG16.hdf5
weights.best.VGG19.hdf5
weights.best.Resnet50.hdf5
weights.best.InceptionV3.hdf5
weights.best.Xception.hdf5
- Watch Predictions
python breed_predictor.py
This script first randomly selects images from the human face dataset and from the test set of the dog image dataset, and then makes predictions. If neither a dog nor a human face is detected, no breed prediction is returned. Otherwise, the program returns the predicted breed (or, for a human face, the resembling breed) for each input image.
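The decision logic described here can be sketched as follows. This is a minimal illustration and not the code in breed_predictor.py: the function name `describe_image` and the three callables (`detect_dog`, `detect_face`, `predict_breed`) are hypothetical stand-ins for the project's detectors and classifier, and checking for a dog before a face is an assumption.

```python
from typing import Callable, Optional


def describe_image(
    image_path: str,
    detect_dog: Callable[[str], bool],
    detect_face: Callable[[str], bool],
    predict_breed: Callable[[str], str],
) -> Optional[str]:
    """Return a message for the image, or None if nothing is detected.

    All three callables are hypothetical placeholders for the real
    dog detector, face detector, and breed classifier.
    """
    # Assumption: the dog check runs before the face check.
    if detect_dog(image_path):
        return f"Dog detected; predicted breed: {predict_breed(image_path)}"
    if detect_face(image_path):
        return f"Human detected; resembling breed: {predict_breed(image_path)}"
    # Neither a dog nor a human face was found, so no breed is predicted.
    return None


if __name__ == "__main__":
    # Stub detectors keyed on the file name, for demonstration only.
    def is_dog(path: str) -> bool:
        return "dog" in path

    def is_face(path: str) -> bool:
        return "human" in path

    def breed(path: str) -> str:
        return "Labrador_retriever"

    for path in ["dog_01.jpg", "human_01.jpg", "landscape.jpg"]:
        print(path, "->", describe_image(path, is_dog, is_face, breed))
```

Passing the detectors in as arguments keeps the sketch independent of Keras and OpenCV, so the branching can be tried without trained models.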
Architecture | Test Accuracy (%) |
---|---|
From-Scratch CNN (Initial) | 10.53 |
From-Scratch CNN (Deeper) | 16.15 |
VGG16 | 77.99 |
VGG19 | 79.07 |
Resnet50 | 82.30 |
InceptionV3 | 82.42 |
Xception | 85.29 |
- Deeper models achieved better performance, both with and without transfer learning.
- The best-performing model is the transfer learning model based on the Xception architecture.
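The test accuracies in the table can be read as the percentage of test images whose predicted breed matches the true breed. A minimal sketch of that calculation, assuming top-1 accuracy over integer class labels (the function name `accuracy_percent` is hypothetical and not part of this repository):

```python
from typing import Sequence


def accuracy_percent(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Return the share of matching labels as a percentage (0 to 100)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one label is required")
    # Count positions where the predicted class equals the true class.
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


if __name__ == "__main__":
    # Three of the four illustrative predictions match their labels.
    print(accuracy_percent([1, 2, 3, 4], [1, 2, 0, 4]))
```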
dog-project
| ## Folders ##
|---- dogImages # folder for the dog image dataset
|---- haarcascades # folder for trained face detectors by OpenCV
|---- images # folder for image materials used in notebook
|---- lfw # folder for LFW face dataset
|---- requirements # folder for configuring requirements
|---- saved_models # folder for trained models
| ## Files ##
|---- breed_predictor.py # code for predicting using trained models
|---- datautils.py # code for data processing
|---- train_from_scratch.py # code for non-transfer learning
|---- train_transfer_learning.py # code for transfer learning
|---- extract_bottleneck_features.py # code for extracting features for transfer learning
|---- dog_app.ipynb # notebook documenting the initial code and results
|---- Report.md # A detailed project report
|---- README.md
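Before training, it can help to confirm that the data folders named in the setup steps are in place. The following is a minimal sketch, not a script shipped with this repository; the helper name `missing_dirs` is hypothetical, and the folder names are the ones given in the download instructions.

```python
from pathlib import Path
from typing import List, Sequence

# Folder names taken from the download steps in this README.
EXPECTED_DIRS = ("dogImages", "lfw", "bottleneck_features")


def missing_dirs(repo_root: str, expected: Sequence[str] = EXPECTED_DIRS) -> List[str]:
    """Return the expected folder names that do not exist under repo_root."""
    root = Path(repo_root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from the repository root.
    missing = missing_dirs(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected data folders are present.")
```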
Thanks to all the authors and providers of the wonderful deep learning resources and software tools, which enabled me to focus only on the interesting part of the development. Thanks to Udacity for providing the pre-trained features and kind instructions.