Faizanuddin Ansari's Projects

neural-image-captioning-with-object-detection-and-attention-mechanism

Image captioning lies at the intersection of object detection and natural language processing. We propose a model that draws on both computer vision (CV) and NLP to automatically generate a caption for a given image. The model mimics the way the human visual system describes image content: rather than attending to the whole image, it focuses on particular regions, such as those where objects are present. It consists of two sub-models. The first, an encoder, is an object detection component that identifies the objects in the image along with their spatial locations and builds annotation vectors from the object features and their spatial features. The second, a decoder, is an LSTM-based RNN with an attention network that produces a context vector from the annotation vectors at each time step; at each step the LSTM takes the attention network's output along with its other inputs to generate the caption. Experimental results on the MSCOCO dataset show that our model outperforms previous benchmark models.
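The attention step described above can be illustrated with a minimal sketch in plain Python. It is not the project's implementation: the helper names `make_annotation` and `attention_context` are hypothetical, the annotation vector is assumed to be a simple concatenation of object features and bounding-box coordinates, and the dot-product scoring function is an assumption, since the description does not say how attention scores are computed.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_annotation(object_features, box):
    """Build one annotation vector for a detected object.

    Assumption: object features and spatial features (here a bounding
    box given as x, y, width, height) are simply concatenated.
    """
    return list(object_features) + list(box)


def attention_context(annotations, hidden):
    """Return (weights, context) for one decoding step.

    Assumption: each annotation is scored by its dot product with the
    decoder hidden state, so both must have the same dimension.
    """
    scores = [sum(a * h for a, h in zip(ann, hidden)) for ann in annotations]
    weights = softmax(scores)
    dim = len(annotations[0])
    # The context vector is the attention-weighted sum of the annotations.
    context = [
        sum(w * ann[d] for w, ann in zip(weights, annotations))
        for d in range(dim)
    ]
    return weights, context


# Two detected objects, each with 2 feature values and a 4-value box.
annotations = [
    make_annotation([1.0, 0.0], (0.1, 0.2, 0.3, 0.4)),
    make_annotation([0.0, 1.0], (0.5, 0.5, 0.2, 0.2)),
]
hidden = [2.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # toy decoder state
weights, context = attention_context(annotations, hidden)
```

In a full decoder, `context` would be fed to the LSTM together with its other inputs at each step, and `hidden` would be the LSTM state from the previous step.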

plds

The Penalized Linear Dynamical System Project

pytorch-gan

PyTorch implementations of Generative Adversarial Networks.

pytorch-grad-cam

Many Class Activation Map methods implemented in PyTorch for CNNs and Vision Transformers, including Grad-CAM, Grad-CAM++, Score-CAM, Ablation-CAM and XGrad-CAM.

ssl4mis

Semi Supervised Learning for Medical Image Segmentation, a collection of literature reviews and code implementations.

tailcalibx

PyTorch implementation of Feature Generation for Long-Tail Classification by Rahul Vigneswaran, Marc T. Law, Vineeth N. Balasubramanian and Makarand Tapaswi.

torch-cam

Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM, IS-CAM, XGrad-CAM)

vision-transformer-pytorch

PyTorch version of Vision Transformer (ViT) with pretrained models. This is part of the CASL (https://casl-project.github.io/) and ASYML projects.

vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch.
