Topic: off-policy

👇 Here are 39 public repositories matching this topic...

amirhosein-mesbah / reinforcement_learning

This repository contains implementations of a wide variety of reinforcement learning projects covering bandit algorithms, MDPs, distributed RL, and deep RL. They include university projects and projects undertaken out of interest in reinforcement learning.

User: amirhosein-mesbah

bandit-algorithms deep-reinforcement-learning deeprl distributed-reinforcement-learning mdp multi-agent-reinforcement-learning network-routing off-policy on-policy reinforcement-learning

baturaysaglam / ac-off-poc

Off-Policy Correction for Actor-Critic Algorithms in Deep Reinforcement Learning

User: baturaysaglam

actor-critic deep-reinforcement-learning experience-replay importance-sampling off-policy

baturaysaglam / dase

Safe and Robust Experience Sharing for Deterministic Policy Gradient Algorithms

User: baturaysaglam

actor-critic deep-reinforcement-learning experience-replay multi-agent-reinforcement-learning off-policy

baturaysaglam / la3p

Actor Prioritized Experience Replay

User: baturaysaglam

actor-critic deep-reinforcement-learning prioritized-experience-replay off-policy

baturaysaglam / q-error-exploration

An Optimistic Approach to the Q-Network Error in Actor-Critic Methods

User: baturaysaglam

actor-critic deep-reinforcement-learning experience-replay off-policy exploration-exploitation

baturaysaglam / swtd3

Stochastic Weighted Twin Delayed Deep Deterministic Policy Gradient (SWTD3)

User: baturaysaglam

actor-critic deep-reinforcement-learning off-policy reinforcement-learning-algorithms

bmaxdk / deeprl-nd-continuous-control

DDPG and D4PG Continuous Control

User: bmaxdk

ddpg-algorithm deep-reinforcement-learning openai-gym unity d4pg-algorithm model-free off-policy pytorch

cbanerji / sample_efficient_rl.

Collection of code pertaining to my research on model-free RL algorithms.

User: cbanerji

ddpg model-free-rl off-policy sample-efficient-rl td3 soft-actor-critic

ccnets-team / causal-rl

Causal RL: Reverse-Environment Network Integrated Actor-Critic Algorithm

Organization: ccnets-team

Home Page: https://www.linkedin.com/company/ccnets/

pytorch causal cooperative-network invertible-policy reverse-environment-network actor-critic-algorithm off-policy gpt reinforcement-learning causal-mask

denisyarats / drq

DrQ: Data regularized Q

User: denisyarats

Home Page: https://sites.google.com/view/data-regularized-q

rl reinforcement-learning deep-learning mujoco dm-control gym pixel sac soft-actor-crit pytorch

denisyarats / exorl

ExORL: Exploratory Data for Offline Reinforcement Learning

User: denisyarats

Home Page: https://sites.google.com/view/exorl

python control reinforcement-learning unsupevised offline-rl deep-learning pytorch off-policy mujoco model-free

djazdeck / spg

Sample Policy Gradient

User: djazdeck

deep reinforcement learning policy optimization off-policy model-free continuous action control

fardinabbasi / tabulated_rl

Interactive Learning [ECE 641] - Fall 2023 - University of Tehran - Prof. Nili

User: fardinabbasi

grid-world markov-decision-processes mdp off-policy on-policy q-learning sarsa tree-backup value-iteration

hydesmondliu / rubicon

A novel method to combine an existing policy (rule-based control) with reinforcement learning.

User: hydesmondliu

climate-change deep-learning deep-reinforcement-learning energy-efficiency hvac-control machine-learning optimal-control optimization reinforcement-learning-algorithms thermal-comfort actor-critic-algorithm deterministic-policy-gradients off-policy rule-based-controller reinforcement-learning

instadeepai / flashbax

⚡ Flashbax: Accelerated Replay Buffers in JAX

Organization: instadeepai

Home Page: https://instadeepai.github.io/flashbax/

buffers hpc jax machine-learning off-policy reinforcement-learning rl

kalyani011 / rl-q_learning_implementation

Temporal Difference Method - Q-Learning Implementation for FrozenLake Grid Problem

User: kalyani011

off-policy q-learning reinforcement-learning temporal-differencing-learning value-based

lionelblonde / giwr-pytorch

PyTorch implementation of our work: "Optimality Inductive Biases and Agnostic Guidelines for Offline Reinforcement Learning"

User: lionelblonde

reinforcement-learning offline pytorch imitation-learning off-policy

lionelblonde / giwr-pytorch-complete-history

PyTorch implementation of our work: "Where is the Grass Greener? Revisiting Generalized Policy Iteration for Offline Reinforcement Learning"

User: lionelblonde

reinforcement-learning pytorch offline off-policy imitation-learning

lionelblonde / liayn-pytorch

PyTorch implementation of our work: "Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning"

User: lionelblonde

reinforcement-learning pytorch gan imitation-learning gail off-policy

lionelblonde / liayn-pytorch-complete-history

PyTorch implementation of our work: "Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning"

User: lionelblonde

reinforcement-learning pytorch gan imitation-learning gail off-policy

lionelblonde / sam-pytorch

PyTorch implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"

User: lionelblonde

gail imitation-learning reinforcement-learning pytorch gan off-policy

lionelblonde / sam-pytorch-complete-history

PyTorch implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"

User: lionelblonde

reinforcement-learning pytorch gan imitation-learning gail off-policy

lionelblonde / sam-tf

TensorFlow implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"

User: lionelblonde

gail imitation-learning reinforcement-learning tensorflow gan off-policy

lionelblonde / sam-tf-complete-history

TensorFlow implementation of "Sample-efficient Imitation Learning via Generative Adversarial Nets"

User: lionelblonde

reinfrocement-learning tensorflow gan imitation-learning gail off-policy

mabirck / cs294-deeprl

My content for the CS294 Deep Reinforcement Learning course, conducted by Sergey Levine at UC Berkeley.

User: mabirck

reinforcement-learning cs294 deep-learning neural-networks reinforcement policy-gradient on-policy off-policy deep-reinforcement-learning deep-neural-networks

mishalaskin / curl

CURL: Contrastive Unsupervised Representation Learning for Sample-Efficient Reinforcement Learning

User: mishalaskin

contrastive-learning contrastive-loss contrastive-predictive-coding curl deep-learning deep-learning-algorithms deep-neural-networks deep-q-learning deep-q-network deep-reinforcement-learning deep-rl deeplearning deeplearning-ai gpu model-free-rl off-policy reinforcement-agents reinforcement-learning reinforcement-learning-algorithms sac

mishalaskin / rad

RAD: Reinforcement Learning with Augmented Data

User: mishalaskin

reinforcement-learning rl deep-learning data- mujoc dm-control rad data-augmentations codebase model-free

mohammadasadolahi / reinforcement-learning-solving-a-simple-4by4-gridworld-using-qlearning-in-python

Solving a simple 4x4 Gridworld, similar to OpenAI Gym's FrozenLake, using Q-learning, a temporal-difference reinforcement learning method.

User: mohammadasadolahi

qlearning qlearning-on-gridworld reinforcement-learning off-policy

narjesno / reinforcement-learning

This repository contains all of the Reinforcement Learning-related projects I've worked on. The projects are part of the graduate course at the University of Tehran.

User: narjesno

dynamic-programming off-policy on-policy model-free-rl model-based-rl monte-carlo sarsa n-step-bootstrapping n-step-expected-sarsa n-step-tree-backup

nus-lid / renault

Ensemble and Auxiliary Tasks for Data-Efficient Deep Reinforcement Learning

Organization: nus-lid

Home Page: https://arxiv.org/abs/2107.01904

deep-reinforcement-learning data-efficient-learning deep-q-learning deep-rl deep-learning off-policy model-free-rl ensemble-learning auxiliary-tasks multi-task-learning

pokaxpoka / sunrise

SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning

User: pokaxpoka

reinforcement-learning rl deep-learning mujoco dm-control codebase model-free off-policy deep-reinforcement-learning deep-neural-networks

puneet2000 / agent-doom

An RL agent that learns to play Doom's Deadly Corridor, based on DDQN and PER.

User: puneet2000

reinforcement-learning prioritized-experience-replay dueling-network-architecture fixed-q-targets q-learning deep-q-learning pytorch-implmention off-policy experience-replay

raja-grewal / rlmd

PROJECT MIGRATED TO CODEBERG - Reinforcement Learning in Multiplicative Domains

User: raja-grewal

Home Page: https://codeberg.org/raja-grewal/rlmd

artificial-intelligence deep-reinforcement-learning energy-efficiency extreme-value-statistics gym model-free-rl python risk-management trading-algorithms ergodicity

rosefintech / rosefintech-rosefinaiengine

RosefinAIEngine of Rosefintech

Organization: rosefintech

ai engine off-policy tensorflow

saminyeasar / off_policy_ac

Contains PyTorch implementations of the following off-policy actor-critic algorithms

User: saminyeasar

actor-critic mujoco reinforcement-learning off-policy sac td3 ddpg pytorch

saminyeasar / pytorch-implementation-dice-algorithms

PyTorch implementation of DICE algorithms

User: saminyeasar

pytorch algeadice valuedice off-policy imitation-learning rl

theunsolveddev / reinforcementlearning

Repository containing basic algorithms implemented in Python.

User: theunsolveddev

reinforcement-learning algorithm policy-iteration policy-evaluation bandit-algorithms monte-carlo off-policy on-policy

tianhongdai / hindsight-experience-replay

This is the PyTorch implementation of Hindsight Experience Replay (HER), with experiments on all Fetch robotic environments.

User: tianhongdai

hindsight-experience-replay ddpg reinforcement-learning off-policy exploration pytorch-implmention her

zhihanyang2022 / off-policy-continuous-control

Official PyTorch code for "Recurrent Off-policy Baselines for Memory-based Continuous Control" (DeepRL Workshop, NeurIPS 21)

User: zhihanyang2022

Home Page: https://arxiv.org/abs/2110.12628

pytorch recurrent-neural-network actor-critic off-policy continuous-control reinforcement-learning rdpg rtd3 rsac
