Yuhua Jiang's Projects

baby-llama2-chinese

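A repository for pretraining from scratch and then fine-tuning (SFT) a small-parameter Chinese LLaMA2; a single 24 GB GPU is enough to run it and obtain a chat-llama2 with basic Chinese question-answering ability.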

chatreviewer

ChatReviewer: use ChatGPT to review papers; ChatResponse: use ChatGPT to respond to reviewers.

cleanrl

High-quality single-file implementations of deep reinforcement learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG).

cumcm2020b

2020 National College Mathematical Modeling Contest (CUMCM), Problem B: Crossing the Desert.

deeprl_network

Multi-agent deep reinforcement learning for networked system control.

elegantrl

Scalable and Elastic Deep Reinforcement Learning Using PyTorch. Please star. 🔥

gpt_academic

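A graphical interactive interface for GPT/GLM, specially optimized for reading and polishing papers. Its modular design supports custom shortcut buttons and function plugins, code-block and table display, and dual display of TeX formulas. It adds Python and C++ project analysis and self-explanation features, plus PDF/LaTeX paper translation and summarization. It supports querying multiple LLMs in parallel, as well as local models such as Tsinghua's chatglm.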

gym-jsbsim

A reinforcement learning environment for aircraft control using the JSBSim flight dynamics model

ilkit

A clean code base for imitation learning and reinforcement learning, written in PyTorch.

lightzero

LightZero: A lightweight and efficient MCTS/AlphaZero/MuZero algorithm toolkit.

omnisafe

OmniSafe is an infrastructural framework for accelerating SafeRL research.

on-policy

This is the official implementation of Multi-Agent PPO (MAPPO).

ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for accelerating ML workloads.

rpbt

Implementation of RPPO (Risk-sensitive PPO) and RPBT (Population-based self-play with RPPO).

spinningup

An educational resource to help anyone learn deep reinforcement learning.

tcgaiic

Tianchi Artificial Intelligence Technology Innovation Competition, Track 3.

tdmpc2

Code for "TD-MPC2: Scalable, Robust World Models for Continuous Control"

vem

Code accompanying the paper "Offline Reinforcement Learning with Value-Based Episodic Memory" (ICLR 2022, https://arxiv.org/abs/2110.09796).
