Google Landmark Recognition 2020 was the third edition of the competition sponsored by Google and hosted on Kaggle. It challenged Kagglers to build models that recognize the correct landmark (if any) in each image of a public and a private test set, both made up of real pictures taken by Google operators.
The training data for the competition comes from a cleaned version of the original Google Landmarks Dataset v2 (GLDv2), which consists of over 5M images and over 200k distinct instance labels (landmarks).
The main challenges of this competition, and of the landmark recognition task in general, are captured by the GLDv2 dataset:
- Extremely skewed class distribution. While famous landmarks might have tens of thousands of image samples, 57% of classes have at most 10 images and 38% of classes have at most 5 images.
- Intra-class variability. Images of the same landmark can show different vantage points and different details, and can include both indoor and outdoor views of buildings.
- Out-of-domain query images. Only 1.1% of the query images show landmarks; the remaining 98.9% are out-of-domain images, for which no result is expected.
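The class-imbalance figures above can be checked on any label list with a few lines of Python. The sketch below is only an illustration: the function name `fraction_of_classes_with_at_most` and the toy labels are hypothetical and are not taken from the competition data or from the notebooks.

```python
from collections import Counter


def fraction_of_classes_with_at_most(labels, max_images):
    """Return the fraction of distinct classes that have at most `max_images` samples."""
    counts = Counter(labels)  # class id -> number of images
    if not counts:
        return 0.0
    small = sum(1 for n in counts.values() if n <= max_images)
    return small / len(counts)


# Toy labels (hypothetical): one frequent class and three rare ones.
toy_labels = ["a"] * 20 + ["b"] * 3 + ["c"] * 2 + ["d"]

print(fraction_of_classes_with_at_most(toy_labels, 5))
print(fraction_of_classes_with_at_most(toy_labels, 10))
```

On the real training labels, the same call with `max_images=10` and `max_images=5` would be expected to land near the 57% and 38% quoted above, assuming the labels come from the same version of the dataset.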
More details about the dataset construction, its cleaned version, and other particular challenges related to the Landmark Recognition task are available in the dataset paper.
I wrote a Medium article in which I explain, illustrate, and implement the ideas behind a baseline architecture for landmark image recognition. There, I cover its core concepts from theory to practical results.
This repository contains my implementation of a metric learning solution with cosine similarity search (using an EfficientNet backbone to compute image embeddings), paired with a DELF module that reranks candidates based on local image features. These techniques were adapted from the papers by the 2020 Recognition challenge winner and the 2019 Recognition challenge 2nd-place team. The notebooks contained here illustrate the implementation on a subset of the competition data.
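To make the retrieval step concrete, here is a minimal NumPy sketch of cosine similarity search over precomputed embeddings. It is not the code from the notebooks: the names `l2_normalize` and `cosine_top_k`, the 2-dimensional toy vectors, and the integer labels are all assumptions made for illustration. In the actual pipeline the embeddings would come from the EfficientNet backbone, and the DELF reranking step is not shown.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm (eps guards against zero vectors)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def cosine_top_k(query_emb, index_emb, index_labels, k=3):
    """Return labels and cosine similarities of the k closest index items per query."""
    q = l2_normalize(np.asarray(query_emb, dtype=np.float64))
    g = l2_normalize(np.asarray(index_emb, dtype=np.float64))
    sims = q @ g.T  # shape (n_query, n_index); dot product of unit vectors = cosine
    k = min(k, g.shape[0])
    top_idx = np.argsort(-sims, axis=1)[:, :k]  # most similar first
    top_sims = np.take_along_axis(sims, top_idx, axis=1)
    top_labels = np.asarray(index_labels)[top_idx]
    return top_labels, top_sims


# Toy data (hypothetical): three indexed images with landmark ids 10, 20, 30.
index_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
index_labels = [10, 20, 30]
query_emb = [[2.0, 0.1]]

labels, sims = cosine_top_k(query_emb, index_emb, index_labels, k=3)
print(labels[0], sims[0])
```

Because most query images are out-of-domain, one common option (an assumption here, not something this README specifies) is to return "no landmark" when the best similarity falls below a tuned threshold.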
The notebooks in this repository were split from my original Kaggle notebook so that they render on GitHub. Please consider leaving an upvote on my notebook if you find the content helpful for your applications!