This is a university project based on the Google Landmark Recognition 2020 Kaggle competition.
The goal of this project is to correctly classify images of known sites from around the world, given a large and challenging train set to learn from and a test set that contains mainly out-of-domain images.
To address these challenging properties of the data set, we proposed and implemented two possible solutions using machine learning techniques.
The first solution, a baseline, is a simple, straightforward approach: training a CNN (EfficientNet with the RAdam optimizer) and using it as a classifier. This solution fails to overcome the challenging aspects of the data set and yields poor results.
The second solution is a retrieval-based solution that draws inspiration from other teams' solutions to this competition.
This solution consists of two steps. The first is to clean the test set of out-of-domain images using object detection (we used the YOLO darknet implementation). Object detection examples:
The second is classification with a nearest-neighbor algorithm applied to the images' feature vectors.
The power of using feature vectors and K-NN (the test set image is on the left; next to it are its 5 nearest neighbors from the train set):
This solution is built to address the challenging properties of the data set, and although its results are far from great, they are much better than the baseline's results.
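
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `keep_in_domain` and `knn_predict`, the detection format (a list of `(class_name, confidence)` pairs per image), the choice of "landmark-like" classes, the confidence threshold, and the similarity-weighted vote are all assumptions made for the example. In the project, the detections would come from YOLO and the feature vectors from a trained network.

```python
import numpy as np


def keep_in_domain(detections, landmark_classes, min_conf=0.5):
    """Step 1 (assumed rule): keep an image only if the detector reports at
    least one landmark-like class with confidence >= min_conf.

    detections: dict mapping image id -> list of (class_name, confidence).
    """
    return [
        image_id
        for image_id, dets in detections.items()
        if any(name in landmark_classes and conf >= min_conf for name, conf in dets)
    ]


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def knn_predict(train_feats, train_labels, test_feats, k=5):
    """Step 2: label each test vector by a similarity-weighted vote among
    its k nearest train vectors (cosine similarity).

    Returns the predicted labels and the indices of the k neighbors.
    """
    train = l2_normalize(np.asarray(train_feats, dtype=np.float64))
    test = l2_normalize(np.asarray(test_feats, dtype=np.float64))
    sims = test @ train.T                      # shape: (n_test, n_train)
    k = min(k, train.shape[0])
    nn_idx = np.argsort(-sims, axis=1)[:, :k]  # most similar first

    preds = []
    for row, idx in zip(sims, nn_idx):
        scores = {}
        for j in idx:
            label = train_labels[j]
            scores[label] = scores.get(label, 0.0) + row[j]
        preds.append(max(scores, key=scores.get))
    return preds, nn_idx


if __name__ == "__main__":
    # Toy detections: only img1 has a confident landmark-like detection.
    dets = {
        "img1": [("Building", 0.9)],
        "img2": [("Person", 0.95)],
        "img3": [("Tower", 0.3)],
    }
    print(keep_in_domain(dets, {"Building", "Tower"}))

    # Toy 2-D "feature vectors" for two landmark classes.
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    labels = ["a", "a", "b", "b"]
    print(knn_predict(feats, labels, [[1.0, 0.05], [0.05, 1.0]], k=3)[0])
```

A brute-force similarity matrix like this is only practical for small sets; for a train set of this competition's size, an approximate nearest-neighbor index would likely be needed.
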
The code we wrote for this project is organized in subdirectories, one for each part of the project. Each subdirectory contains the relevant code files (.py or .ipynb) and may also contain csv files or images.
We tried to keep the code organized and well documented.
To run all the code in this project, the following libraries are required (at the specified version or higher):
Library | Version |
---|---|
Python | 3.6 |
torch | 1.8.0 |
torchvision | 0.9.0 |
pandas | 1.25.0 |
numpy | 1.19.0 |
opencv | 4.2.0 |
matplotlib | 3.2.1 |
seaborn | 0.11.0 |
efficientnet_pytorch | 0.7.0 |
torch_optimizer | 0.1.0 |
sklearn | 0.21.3 |
Pillow | 6.1.0 |
tqdm | 4.55.0 |
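
One possible way to check an environment against minimum versions like those in the table is sketched below. It is not part of the project. The helper names (`parse_version`, `meets_minimum`, `report`) are made up for this example, only a few rows of the table are included, and it relies on `importlib.metadata`, which requires Python 3.8 or newer. Note that distribution names can differ from import names (for example, `sklearn` is distributed as `scikit-learn`).

```python
import re
from importlib import metadata


def parse_version(version):
    """Turn '1.8.0+cu111' into (1, 8, 0); missing parts are padded with zeros."""
    public = version.split("+")[0]            # drop local build tags
    parts = [int(p) for p in re.findall(r"\d+", public)[:3]]
    return tuple(parts + [0] * (3 - len(parts)))


def meets_minimum(installed, required):
    """True if the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


def report(requirements):
    """Map each distribution name to 'ok', 'too old', or 'missing'."""
    status = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            status[name] = "missing"
            continue
        status[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return status


if __name__ == "__main__":
    # A subset of the table above, keyed by distribution name (assumed).
    minimums = {
        "torch": "1.8.0",
        "torchvision": "0.9.0",
        "numpy": "1.19.0",
        "tqdm": "4.55.0",
    }
    for name, state in report(minimums).items():
        print(f"{name}: {state}")
```
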
In this project we also used the YOLO darknet implementation as an object detector. We used the version 3 and version 4 networks, pre-trained on the Open Images Dataset and the COCO Dataset, respectively.
Much of the code in this project lives in Jupyter notebooks. Unfortunately, GitHub cannot render all of the notebooks successfully, so you may need to download them and run them locally or via Colab.
Matan Kleiner
Yuval Snir
under the guidance of Ori Linial
[1] T. Weyand, A. Araujo, B. Cao and J. Sim. "Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval", Proc. CVPR'20.
[2] K. Chen et al. "2nd Place and 2nd Place Solution to Kaggle Landmark Recognition and Retrieval Competition 2019", arXiv:1906.03990 [cs.CV], Jun. 2019.
[3] J. Redmon and A. Farhadi. "YOLOv3: An Incremental Improvement", arXiv:1804.02767v1 [cs.CV], Apr. 2018.
[4] A. Krizhevsky, I. Sutskever, and G. E. Hinton. "ImageNet classification with deep convolutional neural networks", In Proceedings of NIPS, pages 1106–1114, 2012.