Tat Trong Vu's Projects
Robust recipes to align language models with human and AI preferences
Instruct-tune LLaMA on consumer hardware
Implementation of AudioLM, a SOTA language modeling approach to audio generation from Google Research, in PyTorch
Code release for "Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting" (NeurIPS 2021), https://arxiv.org/abs/2106.13008
A collection of open-source datasets for training instruction-following LLMs (ChatGPT, LLaMA, Alpaca)
:sparkles::sparkles: Latest Papers and Datasets on Multimodal Large Language Models and Their Evaluation
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
:pencil2: Web-based image segmentation tool for object detection, localization, and keypoints
A Dockerfile used for training deep RL with Gazebo/MuJoCo robot environments
Deep Reinforcement Learning for Robotic Grasping from Octrees
Dense Retrieval and Retrieval-augmented LLMs
Supporting code for my article on video streaming with Flask.
A toolkit for reproducible reinforcement learning research.
ARIAC conveyor belt repurposed for generic conveyor belt usage
A collection of tools and plugins for Gazebo
Generative Grasping CNN from "Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis Approach" (RSS 2018)
Accompanying code for the NeurIPS submission "Goal-conditioned Imitation Learning"
Code for the ICRA 2021 paper "End-to-end Trainable Deep Neural Network for Robotic Grasp Detection and Semantic Segmentation from RGB"
HoloLens 2 Sensor Streaming.
Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of sentence-transformer models and frameworks.
Code for a Kubernetes series
A Flask web app to stream live video from a local webcam or CCTV camera (RTSP link)
Python bindings for llama.cpp
[NeurIPS'23 Oral] Visual Instruction Tuning: LLaVA (Large Language-and-Vision Assistant) built towards GPT-4V level capabilities.
Some simple scripts that I use day-to-day when working with LLMs and the Hugging Face Hub
Simple retrieval from LLMs at various context lengths to measure accuracy