cby-pku

followers: 17 following: 46 repos: 27 gists: 0

Name: Boyuan Chen

Type: User

Company: Peking University

Bio: Sophomore undergrad at Peking University 📚 Focused on Scalable Oversight / AI Safety / AI Alignment

Location: Beijing

Blog: https://cby-pku.github.io/

Boyuan Chen's Projects

accelerate

🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision

aligner

Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction

awesome-rlhf

A curated list of reinforcement learning with human feedback resources (continually updated)

cby-pku.github.io

Github Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

cookbook

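🎉🎉🎉 Java senior architect technology stack == Any skill can be mastered through "deliberate practice", just like cooking. Here is a Java development handbook; all you need to do is practice more. 🏃🏃🏃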

data_process

Practical data processing python files that may be used in research

deepspeed-chat

fastchat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

gpt4_eval

GPT-4 evaluation prompt, accelerated with ray.

malib

A parallel framework for population-based multi-agent reinforcement learning.

markdown-emoji

Markdown syntax supports adding emoji: entering different shortcodes (characters enclosed by two colons) displays different emoji

marllib

One repository is all that is necessary for Multi-agent Reinforcement Learning (MARL)

oi-wiki

:star2: Wiki of OI / ICPC for everyone. (A strategy guide for a certain large-scale online game, featuring cool arithmetic magic)

omnisafe

OmniSafe is an infrastructural framework for accelerating SafeRL research.

overthinking_the_truth

Reproduction of overthinking_the_truth

pku-mllab2023

PKU 2023 -12 Machine Learning Labs

pku_ai-basis

The basic demo of the ai_basic_learing_2023_spring_pku

pku_ai-socio

Code for PKU AI Social Sciences

pku_dsa

A review of my coding practice from the PKU course: Data Structures and Algorithms

pku_ics

A review of my code labs from the PKU course: ICS

pku_programming

A review of my code from the PKU course: programming-algorithm

pkumodeling

The demo of jiangzehan_modeling

ppoxfamily

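PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: 8 lessons to help you sort out the algorithm theory, straighten out the code logic, and master practical decision-AI applications)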

safe-policy-optimization

This is a benchmark repository for safe reinforcement learning algorithms

safe-rlhf

Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
