Nail classification

A deep learning demo project for classifying manufactured nails.

The data set used for this demo contains 100 images of 'good' nails and 100 images of 'bad' nails, where 'good' and 'bad' refer to the nail being intact or bent, respectively.

As this data set is rather small, it is highly recommended to use a pre-trained model as the building block of the classifier. In this demo, the pre-trained VGG16 model, as implemented in Keras, is used along with a customized top to accomplish this binary classification task.
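
To make this concrete, here is a minimal sketch of such a transfer-learning setup in Keras; the layer sizes, dropout rate, and optimizer are illustrative assumptions, not the exact architecture used in this project.

from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Frozen VGG16 convolutional base plus a small trainable top
# (all sizes below are illustrative, not this project's settings).
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # keep the pre-trained weights fixed

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),  # outputs p_good
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])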

Additionally, a simple CNN is implemented to establish an easy baseline. The simple CNN reached a validation accuracy of 0.625 (62.5 %) after training for 65 epochs. By contrast, the pre-trained VGG16 architecture achieved a validation accuracy of 0.958 (95.8 %) after training for only 10 epochs.
The training was performed on a conventional CPU (Intel® Core™ i7-6500U CPU @ 2.50GHz × 4).
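
For reference, the baseline could be as simple as the following sketch (the layer sizes are assumptions, not the project's actual baseline):

from tensorflow.keras import layers, models

# A deliberately small CNN baseline for comparison with VGG16.
cnn = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
cnn.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])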

To increase the performance of the models, a cropping routine was applied to the images, so that only the region showing the target is considered during training and prediction. This pre-processing step clearly helps: the images are downsampled when fed into the neural network, so in the original-sized images the target would appear even smaller and coarser, and features arising from background structures might confuse the network.
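
As an illustration of this step, a crop-and-resize helper might look like the following; the bounding box and target size are hypothetical values, and the project's actual routine lives in src/data/make_dataset.py.

from PIL import Image

def crop_to_target(path, box=(100, 100, 500, 500), size=(224, 224)):
    """Crop the region containing the nail, then downsample.

    box is a hypothetical (left, upper, right, lower) bounding box in pixels.
    """
    img = Image.open(path)
    return img.crop(box).resize(size)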

To-Do:

  • Implement robust test routines
  • Implement more sophisticated cropping algorithm

1. Easy start - out of the box nail classification:

  1. Clone the repository

    git clone git@github.com:L2Data/nail-classification.git
    cd nail-classification
  2. Place the pre-trained model in the folder models. If this folder does not already exist in the root directory of the project, run

     mkdir models/

    from the root directory and copy the .h5 file of the pre-trained model there.
    If you do not have a pre-trained model or want to train one yourself, section 3 explains how to train your model.

  3. Using docker
    Build the docker image (if docker is not installed yet, see the docker documentation for instructions)

    docker build -t nail-classifier .
    docker run -p 127.0.0.1:5000:5000 nail-classifier    

    This starts the API server with the pre-trained model in the terminal.
    To classify an image, open a new terminal and type

    curl -X POST -F image=@<path-to-your-nail-image.jpeg> 'http://localhost:5000/predict' 

    The classifier now predicts the class of the image and returns the probability that the image belongs to

     a. class 0:  bad nails, p_bad = 1-p_good        
     b. class 1:  good nails, p_good
    

    The output format is JSON.
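
    A purely illustrative response (the exact field names depend on the implementation in src/server/app.py) could look like

     {"class": 1, "label": "good", "p_good": 0.97, "p_bad": 0.03}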

Alternative to docker: local usage
You can also run the server locally, without docker. Execute

make server   

in the root directory of the project to start the server.
Again, open a new terminal and run

curl -X POST -F image=@<path-to-your-nail-image.jpeg> 'http://localhost:5000/predict' 

to classify your nail image.

Remark: In src/server/app.py the model is loaded outside of the prediction routine. This is highly recommended, as the model is then loaded only once, which decreases the run time of each request.
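
A minimal sketch of this load-once pattern, assuming a Flask app and an .h5 model file (the file name and the preprocessing helper are illustrative, not the project's actual code):

import numpy as np
from flask import Flask, request, jsonify
from PIL import Image
from tensorflow.keras.models import load_model

app = Flask(__name__)
model = load_model("models/model.h5")  # loaded once at startup, not per request

def preprocess_image(file_storage, size=(224, 224)):
    """Resize the uploaded image and add a batch dimension (illustrative)."""
    img = Image.open(file_storage).convert("RGB").resize(size)
    return np.asarray(img, dtype="float32")[None] / 255.0

@app.route("/predict", methods=["POST"])
def predict():
    x = preprocess_image(request.files["image"])
    p_good = float(model.predict(x)[0][0])  # sigmoid output = p_good
    return jsonify({"class": int(p_good >= 0.5), "p_good": p_good,
                    "p_bad": 1.0 - p_good})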

2. More information:


The local installation of this project comes with several options.
Execute

make

in the root directory of the project to see what is available:

clean               Delete all compiled Python files 
create_environment  Set up python interpreter environment 
data                Make Dataset 
model_predict       Predict from trained model 
model_train         Train a model 
requirements        Install Python Dependencies 
server              Run API 
test_environment    Test python environment is setup correctly 

REMARK:
Running make <command> always executes the corresponding Python script with default settings.

3. Train a model from scratch:


Creating the data sets - Option 1:

For this task, first copy the images into the data folder. The tree structure starting from the root directory should look like

└── data
    └── raw
        └── nailgun
            ├── good
            └── bad   

and then execute

make data

which creates

└── data
    └── raw
    │   └── nailgun
    │       ├── good
    │       └── bad
    └── processed
        ├── train
        |   ├── good
        |   └── bad
        ├── validate
        |   ├── good
        |   └── bad
        └── test
            ├── good
            └── bad

The subfolders in processed contain the cropped images, split into training, validation, and test sets according to the configured split.
Alternatively, the data sets can be created by running

python src/data/make_dataset.py [OPTIONS]
[OPTIONS]:  --split (default 0.12: validation and test split)
            --seed  (default 42: random seed)
            --clean (default 1 (True): clean processed/<subdirs>)
            --crop  (default 1 (True): apply cropping)
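
For example, to rebuild the data sets with a 15 % validation/test split and a different random seed (the values are illustrative):

python src/data/make_dataset.py --split 0.15 --seed 7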

Creating the data sets - Option 2:

Alternatively, you can adjust the directory settings in src/utils/utils.py to your local setup.

Training the model

After creating the training data sets, the model can be trained. To do so, execute

make model_train

Alternatively, the model can be trained by running

python src/models/train_model.py [OPTIONS]
[OPTIONS]:  --modelname (default 'vgg16': select the model, alternative: 'cnn')
            --ep        (number of training epochs)
            --lr        (learning rate)
            --augment   (apply data augmentation)
            --bs        (batch size)
            --width     (image width)
            --heigth    (image height)

where the default settings of all options except the model name are found in src/utils/utils.py.
The models are defined in src/models/model.py and you can add as many different architectures as you like.
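
For example, to train the baseline CNN for 65 epochs (the batch size here is an illustrative value):

python src/models/train_model.py --modelname cnn --ep 65 --bs 16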

The training can be monitored with TensorBoard by running

tensorboard --logdir=<path-to-project>/nail-classification/logs/

from the root directory of the project and opening

http://localhost:6006

in your browser.

Predicting from the model

To run a prediction with the (pre-)trained model, run

make model_predict

or, alternatively,

python src/models/predict_model.py [OPTIONS]
[OPTIONS]:  --modelname (default 'vgg16': select the model, alternative: 'cnn')

which outputs probability(good), label, and name.

4. Project Organization


├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └── make_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├── predict_model.py
│   │   └── train_model.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
└── tox.ini            <- tox file with settings for running tox; see tox.testrun.org

Project based on the cookiecutter data science project template. #cookiecutterdatascience
