PARL

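Former head of reinforcement learning at Baidu; led RLHF technology R&D in the ERNIE Bot (文心一言) model group.

Beyond supporting business needs inside and outside the company, the team also works on frontier technology. Its lines of work include, but are not limited to:

  • Development of PARL, a high-performance parallel RL framework (https://github.com/PaddlePaddle/PARL, 3.2k stars)
  • Participation in international RL competitions (our team has won first place in NeurIPS RL competitions for three consecutive years)
  • Submission of academic papers
  • Robot control (autonomous driving and drones, quadruped robot dog control)

In-company business support covers very-large-scale LLM alignment, feed recommendation, search, Baidu Maps, ad ranking, and Baidu AI Cloud (energy scheduling, traffic signal control), among others.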

Bo Zhou's Projects

cityflow

A Multi-Agent Reinforcement Learning Environment for Large Scale City Traffic Scenario

gnnpapers

Must-read papers on graph neural networks (GNN)

maskrcnn-benchmark

Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.

matd3

An implementation of multi-agent TD3 with PaddlePaddle and PARL

multiagent-particle-envs

Code for a multi-agent particle environment used in the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"

nni

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

osim-rl

Reinforcement learning environments with musculoskeletal models

paddle

PArallel Distributed Deep LEarning

palm

A fast, flexible, extensible and easy-to-use NLP large-scale pretraining and multi-task learning framework.

parl

A high-performance distributed training framework for Reinforcement Learning

ray

A system for parallel and distributed Python that unifies the ML ecosystem.

tape

Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.

tensorflow

Computation using data flow graphs for scalable machine learning

tensorgo

Using the tensorgo API for TensorFlow Async Model Parallel

transformers

🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.

trpo

Trust Region Policy Optimization with TensorFlow and OpenAI Gym

vim

Configuration of vim
