Content-based image retrieval (CBIR) is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases.
Content-based means that the search analyzes the contents of the image rather than metadata such as keywords, tags, or descriptions associated with the image.
It is carried out in four steps:
- extract features from an image database to form a feature database,
- extract the features of the input image,
- find the most similar features in the database,
- return the images associated with the matched features.
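
The steps above can be sketched as follows. This is a minimal illustration, not the benchmark's implementation: `extract_features` is a hypothetical stand-in (a normalized intensity histogram) for a real extractor such as ORB or VGG16, and the function names and the toy images are assumptions.

```python
import numpy as np

def extract_features(image, bins=8):
    """Hypothetical extractor: a normalized grayscale intensity histogram."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def build_feature_database(images):
    """Step 1: extract features from every database image."""
    return np.stack([extract_features(img) for img in images])

def query(feature_db, query_image, top_k=3):
    """Steps 2 to 4: describe the query, rank the database, return the best indices."""
    q = extract_features(query_image)
    distances = np.linalg.norm(feature_db - q, axis=1)  # Euclidean distance
    return np.argsort(distances)[:top_k]

# Toy database: eight flat images, each with a different gray level.
images = [np.full((4, 4), 32 * i) for i in range(8)]
feature_db = build_feature_database(images)
print(query(feature_db, images[5], top_k=1))
```

Querying with an image that is already in the database should rank that image first, since its distance to itself is zero.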
I would like to know which model and similarity measure are the most suitable for finding similar faces. To that end, I compare:
- Cosine similarity
- Manhattan distance
- Euclidean distance
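
For reference, the three measures can be written in a few lines of NumPy. This is a sketch with assumed function names, not the code used in the benchmark. Note that cosine similarity is a similarity (higher means closer), while the Manhattan and Euclidean measures are distances (lower means closer), so the ranking direction differs.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 1.0 means same direction."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 norm)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.abs(a - b).sum())

def euclidean_distance(a, b):
    """Straight-line distance (L2 norm)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))

u, v = [3.0, 4.0], [4.0, 3.0]
print(cosine_similarity(u, v), manhattan_distance(u, v), euclidean_distance(u, v))
```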
The objective is to find the combination (extraction algorithm and similarity measure) that returns the most relevant results.
In my exploration, I used the following dataset:
- Fashion dataset Apparel
A CBIR system retrieves images based on feature similarity. To evaluate my models, I used:
- Mean of Mean Average Precision (MMAP) for the robustness of the system
- Mean Reciprocal Rank (MRR) for the relevance of the first element
- average time per query
The evaluation formulas are described here.
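
As an illustration of the ranking metrics listed above, here is a small sketch of reciprocal rank and average precision for a single query, plus a helper that averages a metric over queries. It assumes each query result is given as a list of 0/1 relevance flags in ranked order, and that average precision is normalized by the number of relevant items retrieved; the formulas actually used in the benchmark may differ, and all names are hypothetical.

```python
def reciprocal_rank(relevance):
    """1 / rank of the first relevant result, or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0

def average_precision(relevance):
    """Mean of precision@k taken at each rank k that holds a relevant result."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

def mean_over_queries(metric, runs):
    """Average a per-query metric over all queries (gives MRR or MAP)."""
    return sum(metric(r) for r in runs) / len(runs)

# Two toy queries: relevance flags of the ranked results.
runs = [[0, 1, 1], [1, 0, 0]]
print(mean_over_queries(reciprocal_rank, runs))
print(mean_over_queries(average_precision, runs))
```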
| Extraction algorithm | Cosine | Manhattan | Euclidean |
|---|---|---|---|
| ORB | 0.04 | 0.19 | 0.18 |
| SURF | 0.03 | 0.22 | 0.17 |
| AKAZE | 0.04 | 0.20 | 0.20 |
| VGG16 | 0.00 | 0.71 | 0.71 |
| VGG19 | 0.00 | 0.71 | 0.71 |
| MobileNet | 0.00 | 0.71 | 0.72 |
| Autoencoder | 0.03 | 0.52 | 0.52 |
Demo available: https://sch-cbir-benchmark.herokuapp.com/