Thomas Coste's Projects
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
Code for the paper "Evaluating Large Language Models Trained on Code"
Train very large language models in JAX.
A repo for RLHF training and best-of-n (BoN) sampling over LLMs, with support for reward model ensembles.
Imperial master's project codebase
Codebase for a replication study of Conditional Neural Processes
[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents