techthiyanes,Thiya,github

vila

Incorporating VIsual LAyout Structures for Scientific Text Classification

vilmedic

ViLMedic (Vision-and-Language medical research) is a modular framework for vision and language multimodal research in the medical field

vilt

Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"

viper

Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"

virtex

[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations

vision-language-modelling-series

Companion Repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations

vision-language-models-are-bows

Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?"

vissl

VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.

visual-clustering

Visual Clustering: Clustering Plotted Data by Image Segmentation

visual-question-answering-for-medical-domain

visual-spatial-reasoning

VSR: A probing benchmark for spatial understanding of vision-language models.

visual_taste_approximator

Visual Taste Approximator (VTA) is a very simple tool that helps anyone create an automatic replica of themselves that can approximate their own personal visual taste

visualkeras

Visualkeras is a Python package to help visualize Keras (either standalone or included in TensorFlow) neural network architectures. It allows easy styling to fit most needs. This module supports layered style architecture generation, which is great for CNNs (Convolutional Neural Networks), and a graph style architecture, which works well for most models including plain feed-forward networks.

visualvoice

Audio-Visual Speech Separation with Cross-Modal Consistency

vit-gpt2

vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

vit-pytorch-1

Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

vit-vqgan

JAX implementation of ViT-VQGAN

vitdet

Unofficial implementation of Exploring Plain Vision Transformer Backbones for Object Detection

vitmem

Image memorability estimation

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

vits-fast-fine-tuning

This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and any-to-any voice conversion

vits_diffusion

vitsinger

Singing Voice Speech modeling test

vizier

Python-based research interface for blackbox and hyperparameter optimization, based on Google's internal Vizier Service.

vizseq

An Analysis Toolkit for Natural Language Generation (Translation, Captioning, Summarization, etc.)

vl-t5

PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
