
vectorsearch_image_retrieval's Introduction

MongoDB Atlas VectorSearch Image Retrieval System

Description

This project retrieves images similar to a query image from our dataset, which mostly contains jeans, T-shirts, TVs and sofas. The image similarity search is built on the MongoDB Atlas VectorSearch feature. The dataset images are converted into embeddings and stored on a MongoDB Atlas cluster. At retrieval time, the query image is first converted into an embedding, and MongoDB Atlas VectorSearch then returns the top k most similar images, where k=5 in our case. Cosine similarity is used as the similarity metric. Embeddings are generated with a pretrained Vision Transformer (ViT) model.
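To make the ranking idea concrete, here is a minimal pure-Python sketch of cosine similarity and top-k selection. It is only an illustration: in the actual system the search runs inside MongoDB Atlas, and the function names `cosine_similarity` and `top_k` are hypothetical helpers, not part of this repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query, stored, k=5):
    """Return the k (filename, score) pairs most similar to the query.

    `stored` maps an image filename to its embedding vector.
    """
    scored = [(name, cosine_similarity(query, vec)) for name, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

With toy two-dimensional vectors, `top_k([1.0, 0.0], {"a.jpg": [1.0, 0.0], "b.jpg": [0.0, 1.0]}, k=1)` ranks `a.jpg` first because it points in the same direction as the query.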

Dataset

The dataset used in this work was downloaded from Kaggle and contains 796 images divided into 4 classes: Jean, Tshirt, TV and Sofa.

Dataset Collection Overview

VectorSearch_DB_Overview

Requirements

  • Transformers
  • OS (Python standard library)
  • Pillow
  • Requests
  • Glob (Python standard library)
  • Matplotlib
  • NumPy
  • Dotenv (python-dotenv)
  • PyMongo

Steps

  1. Load all images from your dataset and create their embeddings with a pretrained vision transformer model.
  2. Pair each image_filename with its embedding in a dictionary and store it in a MongoDB Atlas database.
  3. Create a search index in MongoDB Atlas (see the image under Atlas Search Configuration) to be used later for image retrieval.
  4. Load the query image, create its embedding, then retrieve similar images.
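The steps above could be sketched roughly as follows. This is not the repository's code but an illustration under several assumptions: the checkpoint `google/vit-base-patch16-224-in21k`, the `[CLS]` token as the image vector, a PyTorch backend, `.jpg` files, the field name `embedding`, the index name `vector_index`, and the `$vectorSearch` aggregation stage. Only the `image_filename` key and k=5 come from the description above.

```python
import glob
import os


def load_vit(checkpoint="google/vit-base-patch16-224-in21k"):
    """Load a pretrained ViT; the checkpoint name is an assumption."""
    from transformers import ViTImageProcessor, ViTModel  # heavy, imported lazily

    return (ViTImageProcessor.from_pretrained(checkpoint),
            ViTModel.from_pretrained(checkpoint))


def embed_image(path, processor, model):
    """Step 1: return one image's embedding as a plain list of floats."""
    import torch  # assumed backend for transformers
    from PIL import Image

    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Assumption: the [CLS] token of the last hidden layer is the image vector.
    return outputs.last_hidden_state[0, 0].tolist()


def build_documents(paths, embed):
    """Step 2: pair each image filename with its embedding."""
    return [
        {"image_filename": os.path.basename(p), "embedding": embed(p)}
        for p in paths
    ]


def build_query_pipeline(query_vector, k=5, index="vector_index", num_candidates=100):
    """Step 4: aggregation pipeline asking Atlas for the k nearest images."""
    return [
        {"$vectorSearch": {
            "index": index,                 # hypothetical index name
            "path": "embedding",            # field that holds the vectors
            "queryVector": list(query_vector),
            "numCandidates": num_candidates,
            "limit": k,
        }},
        {"$project": {"_id": 0, "image_filename": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]


def index_dataset(image_dir, collection, processor, model):
    """Steps 1-2: embed every .jpg in image_dir and insert the documents."""
    paths = sorted(glob.glob(os.path.join(image_dir, "*.jpg")))
    docs = build_documents(paths, lambda p: embed_image(p, processor, model))
    if docs:  # insert_many rejects an empty list
        collection.insert_many(docs)
    return len(docs)


def find_similar(query_path, collection, processor, model, k=5):
    """Step 4: requires the search index from step 3 to exist already."""
    query_vector = embed_image(query_path, processor, model)
    return list(collection.aggregate(build_query_pipeline(query_vector, k=k)))
```

Here `collection` is expected to be a PyMongo collection on your Atlas cluster. Keeping `build_documents` and `build_query_pipeline` free of model and database calls makes them easy to check without downloading a model or connecting to a cluster.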

System Pipeline

VectorSearch_pipeline

Atlas Search Configuration

VectorSearch
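The screenshot above is the reference for the actual configuration. As a rough textual counterpart, the sketch below builds one possible vector search index definition as a Python dictionary and prints it as JSON. The field path `embedding` and the dimension 768 (the hidden size of a base-sized ViT) are assumptions and must match the stored embeddings; `vector_index_definition` is a hypothetical helper.

```python
import json


def vector_index_definition(path="embedding", num_dimensions=768, similarity="cosine"):
    """Build an Atlas vector search index definition for one vector field."""
    if similarity not in {"cosine", "euclidean", "dotProduct"}:
        raise ValueError(f"unsupported similarity: {similarity}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,                      # document field holding the embedding
                "numDimensions": num_dimensions,   # must equal the embedding length
                "similarity": similarity,
            }
        ]
    }


# Print the definition as JSON, for example to paste into the Atlas index editor.
print(json.dumps(vector_index_definition(), indent=2))
```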

Results

Below is a set of retrieval results. In these tests, the system appears to combine object shape and color when ranking matches, and it retrieves the exact match of the query image.

Query 1

test_result_1

Query 2

test_result_2

Query 3

test_result_3

Query 4

test_result_4

Query 5

test_result_5

Notes

  1. Create a .env file containing your MongoDB Atlas connection string, which holds the account credentials used to connect to your cluster.
  2. Adjust the paths in the code to match your local directory structure.
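As an illustration of the first note, the sketch below reads the connection string from the environment after loading the .env file. The variable name `MONGODB_URI` and both function names are assumptions; use whichever key your own .env file defines.

```python
import os


def read_connection_string(env=None, key="MONGODB_URI"):
    """Return the connection string stored under `key`, or fail loudly."""
    env = os.environ if env is None else env
    uri = env.get(key, "").strip()
    if not uri:
        raise RuntimeError(f"{key} is not set; add it to your .env file")
    return uri


def connect():
    """Load .env, then open a client to the Atlas cluster."""
    from dotenv import load_dotenv   # provided by the python-dotenv package
    from pymongo import MongoClient

    load_dotenv()  # copies KEY=value pairs from .env into os.environ
    return MongoClient(read_connection_string())
```

The .env file would then contain a single line of the form `MONGODB_URI=<your connection string>`. Keep this file out of version control, since it contains credentials.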

vectorsearch_image_retrieval's People

Contributors

wendgoundi
