Identify Known Sites in Photo Album

Introduction:

This is a university project based on the Google Landmark Recognition 2020 Kaggle competition.

The goal of this project is to correctly classify images of known sites from around the world, given a large and challenging train set to learn from and a test set that contains mainly out-of-domain images.

In light of the special and challenging features of the data set, we proposed and implemented two possible solutions using machine learning techniques.

The first solution, a baseline, is a simple, straightforward approach: training a CNN (EfficientNet with the RAdam optimizer) and using it as a classifier. This solution fails to overcome the challenging aspects of the data set and yields poor results.

The second solution is a retrieval-based solution that draws inspiration from other teams' solutions to this competition.

This solution consists of two steps. The first is to clean the test set of out-of-domain images using object detection (we used the YOLO darknet implementation).

[Figure: two object detection examples]
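The filtering step can be sketched roughly as follows. This is only an illustration, not the project's actual code: the detection format (a list of `(label, confidence)` pairs per image), the `LANDMARK_LABELS` set, the `min_conf` threshold and the function names are all assumptions made for the example. The real class names depend on the data set the detector was trained on.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical detector output: image id -> list of (label, confidence) pairs.
Detections = Dict[str, List[Tuple[str, float]]]

# Assumed set of landmark-like classes; adjust to the detector's real class list.
LANDMARK_LABELS: Set[str] = {"Building", "Tower", "Castle", "Skyscraper", "Fountain"}


def is_in_domain(dets: List[Tuple[str, float]],
                 labels: Set[str] = LANDMARK_LABELS,
                 min_conf: float = 0.3) -> bool:
    """Return True if at least one confident detection is landmark-like."""
    return any(label in labels and conf >= min_conf for label, conf in dets)


def filter_test_set(detections: Detections,
                    labels: Set[str] = LANDMARK_LABELS,
                    min_conf: float = 0.3) -> List[str]:
    """Keep only the image ids that look like they contain a known site."""
    return [img for img, dets in detections.items()
            if is_in_domain(dets, labels, min_conf)]


# Toy detections, invented for the example.
detections: Detections = {
    "img_001": [("Building", 0.91), ("Person", 0.40)],
    "img_002": [("Person", 0.88)],
    "img_003": [("Tower", 0.12)],
    "img_004": [],
}

kept = filter_test_set(detections)
print(kept)
```

Images that are filtered out can be treated as "not a landmark", so that only the remaining images are passed on to the classification step.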

The second step is classification with a nearest neighbor algorithm applied to the images' feature vectors.

[Figure: three examples showing the power of feature vectors and k-NN. In each example the test set image is on the left, followed by its 5 nearest neighbors from the train set.]
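A minimal sketch of nearest neighbor classification over feature vectors is shown here. It is written in plain Python to stay self-contained. The choice of cosine similarity, the similarity-weighted vote and the names `cosine_similarity` and `knn_predict` are assumptions for illustration; the project's own notebooks may use a different distance or voting rule.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(query: Sequence[float],
                train_features: List[Sequence[float]],
                train_labels: List[str],
                k: int = 5) -> Tuple[str, float]:
    """Return the label with the largest summed similarity among the k nearest neighbors."""
    if not train_features:
        raise ValueError("train set is empty")
    sims = [(cosine_similarity(query, feat), label)
            for feat, label in zip(train_features, train_labels)]
    sims.sort(key=lambda pair: pair[0], reverse=True)  # most similar first

    votes: Dict[str, float] = defaultdict(float)
    for sim, label in sims[:k]:
        votes[label] += sim  # each neighbor votes with its similarity
    best = max(votes, key=votes.get)
    return best, votes[best]


# Toy 2-D "feature vectors"; real ones would come from a CNN and be much longer.
train_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
train_labels = ["eiffel_tower", "eiffel_tower", "big_ben", "big_ben", "big_ben"]

label, score = knn_predict([0.95, 0.05], train_features, train_labels, k=3)
print(label, round(score, 3))
```

The summed similarity returned alongside the label can serve as a rough confidence score, which is useful when the test set contains many images that do not belong to any class.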

This solution is built to address the challenging features of the data set. Although the results it yields are far from great, they are much better than the baseline's results.

Code

The code we wrote for this project is organized in subdirectories, one for each part of the project. Each subdirectory contains the relevant code files (.py or .ipynb) and may also contain csv files or images.

We tried to keep the code organized and well documented.

Prerequisites

To run the whole code of this project, one needs the following libraries (in the specified version or higher):

Library Version
Python 3.6
torch 1.8.0
torchvision 0.9.0
pandas 1.25.0
numpy 1.19.0
opencv 4.2.0
matplotlib 3.2.1
seaborn 0.11.0
efficientnet_pytorch 0.7.0
torch_optimizer 0.1.0
sklearn 0.21.3
Pillow 6.1.0
tqdm 4.55.0
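One possible way to check an environment against the table above is sketched below. It assumes Python 3.8 or newer (for `importlib.metadata`) and uses pip distribution names as keys, which can differ from import names (for example `opencv-python` or `scikit-learn`). Only a few rows of the table are included, and the helper names are made up for this example.

```python
import re
from importlib import metadata  # in the standard library since Python 3.8
from typing import Dict, Optional, Tuple

# A few minimum versions copied from the table; extend with the remaining rows.
MINIMUM_VERSIONS: Dict[str, str] = {
    "torch": "1.8.0",
    "torchvision": "0.9.0",
    "numpy": "1.19.0",
    "tqdm": "4.55.0",
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.8.0+cu111' into (1, 8, 0), ignoring any non-numeric suffix."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_requirements(minimums: Dict[str, str]) -> Dict[str, str]:
    """Map each package name to 'ok', 'too old' or 'missing'."""
    report: Dict[str, str] = {}
    for name, required in minimums.items():
        found = installed_version(name)
        if found is None:
            report[name] = "missing"
        elif parse_version(found) < parse_version(required):
            report[name] = f"too old ({found} < {required})"
        else:
            report[name] = f"ok ({found})"
    return report


for name, status in check_requirements(MINIMUM_VERSIONS).items():
    print(f"{name}: {status}")
```

The comparison is a simple tuple comparison of the numeric parts, so pre-release suffixes are ignored.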

In this project we also used the YOLO darknet implementation as an object detector. We used version 3 and version 4 networks that were pre-trained on the Open Images Dataset and the COCO Dataset, respectively.

Much of the code in this project is in Jupyter notebooks. Unfortunately, GitHub cannot render all of the notebooks successfully, so one can download them and run them locally or via Colab.

Team:

Matan Kleiner

Yuval Snir

under the guidance of Ori Linial

References

[1] T. Weyand, A. Araujo, B. Cao and J. Sim, "Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval", Proc. CVPR'20.

[2] K. Chen et al., "2nd Place and 2nd Place Solution to Kaggle Landmark Recognition and Retrieval Competition 2019", arXiv:1906.03990 [cs.CV], Jun. 2019.

[3] J. Redmon and A. Farhadi. "YOLOv3: An Incremental Improvement", arXiv:1804.02767v1 [cs.CV] Apr. 2018.

[4] A. Krizhevsky, I. Sutskever, and G. E. Hinton. "ImageNet classification with deep convolutional neural networks", In Proceedings of NIPS, pages 1106-1114, 2012.
