The data folder consists of 2 folders and 3 CSV files:
- train - Contains 18540 images from 102 categories of flowers
- test - Contains 2009 images
- train.csv - Contains 2 columns and 18541 rows (including the header row): the image id and the true label for each image in the train folder
- test.csv - Contains the `image id` for each image in the test folder, for which the true label needs to be predicted
- sample_submission.csv - Specifies the format for the submission file
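As a quick sanity check on the layout described above, the following sketch counts the labelled rows in a two-column CSV and lists ids that have no matching image file. It is only an illustration: the function names are hypothetical, columns are read by position because the README does not give the header names, and it assumes each image file name (without extension) equals its id.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize_labels(csv_path):
    """Return (number of data rows, Counter of labels) for a two-column CSV.

    Assumes the first row is a header and the label is the second column.
    """
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        labels = [row[1] for row in reader if row]
    return len(labels), Counter(labels)


def ids_without_image(csv_path, image_dir):
    """Return ids from the first CSV column that have no file in image_dir.

    Assumption: a file such as `<image_dir>/<id>.<ext>` exists for every id.
    """
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        ids = [row[0] for row in reader if row]
    stems = {p.stem for p in Path(image_dir).iterdir() if p.is_file()}
    return [image_id for image_id in ids if image_id not in stems]
```

Run against train.csv and the train folder, the row count should come out as 18540 with 102 distinct labels if the data matches the description above.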
For more information on the types of flowers provided as input, you can visit here with a valid hackerearth account. The dataset can be downloaded from here.
Here are some of the sample images from the dataset:
The dependencies to run this code are:
Please follow the notebooks for information on the preprocessing applied to the images before training the model.
Transfer learning is a technique that reuses a model trained on one problem for a similar problem. It lets us reach strong, often state-of-the-art, results in a short span of time. Special props to @rednivrug for providing the baseline. The final solution is an ensemble of three architectures from these popular families:
- ResNet
- DenseNet
More information is given in the code.
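To illustrate the ensembling idea, here is a minimal sketch that averages per-class probabilities from several models and takes the most likely class per image. This README does not state how the models are combined, so simple probability averaging is an assumption, and the function names are hypothetical; the notebooks remain the reference for what was actually done.

```python
def average_probabilities(per_model_probs):
    """Average class probabilities across models.

    per_model_probs: one entry per model, each a list of per-image rows,
    where every row holds one probability per class. All models must
    score the same images in the same order.
    """
    n_models = len(per_model_probs)
    n_images = len(per_model_probs[0])
    averaged = []
    for i in range(n_images):
        rows = [model[i] for model in per_model_probs]
        # zip(*rows) groups the probabilities of one class across models
        averaged.append([sum(col) / n_models for col in zip(*rows)])
    return averaged


def predict_labels(probs):
    """Return the index of the highest-probability class for each image."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]
```

Averaging probabilities rather than taking a majority vote lets a confident model outweigh an uncertain one, which is one common reason an ensemble can beat each of its members.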
To train the model, you can run
./src/code_HE_Train.ipynb
Edit the path to load the dataset and save the weights
For just testing the model, run
./src/code_HE_Train.ipynb
Download the weights from the link given in the notebook and edit the path to load them
I achieved a rank of 17 out of 7620 users taking part in this competition, with an accuracy of 89.99390, in just 5 days. This demonstrates the power of ensemble learning. To reproduce the final accuracy, the `learn` and `learn2` models have to be re-trained.
The dataset belongs to hackerearth and dataquest. It is used here to demonstrate model capabilities.