Topic: temporal-differencing-learning

Something interesting about temporal-differencing-learning

👇 Here are 79 public repositories matching this topic...

aadimator / drl-nd

My solution notebooks for the Deep Reinforcement Learning Nanodegree by Udacity

deep-reinforcement-learning udacity udacity-nanodegree udacity-deep-reinforcement-learning temporal-differencing-learning monte-carlo-methods openai-gym openai-gym-solutions reinforcement-learning

aestheticvoyager / temporal-difference-learning

TD-Gammon is a computer backgammon program developed in 1992 by Gerald Tesauro at IBM's Thomas J. Watson Research Center. Its name comes from the fact that it is an artificial neural net trained by a form of temporal-difference learning, specifically TD-lambda.

User: aestheticvoyager

python3 reinforcement-learning temporal-differencing-learning

agrawal-rohit / tic-tac-toe-ai-bots

AI bots playing Tic Tac Toe

User: agrawal-rohit

games minimax-algorithm reinforcement-learning temporal-differencing-learning tic-tac-toe tic-tac-toe-game tictactoe tictactoe-game

aylint / rl-algorithms

Various fundamental reinforcement learning algorithms implemented from scratch

User: aylint

reinforcement-learning reinforcement-learning-algorithms temporal-differencing-learning sarsa expected-sarsa q-learning prioritized-sweeping dyna-q python

bardofcodes / drl_in_cv

A course on Deep Reinforcement Learning in Computer Vision. Visit Website:

User: bardofcodes

Home Page: http://bardofcodes.github.io/DRL_in_CV/

reinforcement computer-vision deep-reinforcement-learning course-materials policy-gradient q-learning temporal-differencing-learning

by571 / munchausen-rl

PyTorch implementation of the Munchausen Reinforcement Learning Algorithms M-DQN and M-IQN

User: by571

reinforcement-learning reinforcement-learning-algorithms temporal-differencing-learning maximum-entropy deep-q-learning deep-reinforcement-learning munchausen-reinforcement-learning

callmespring / rl-short-course

Reinforcement Learning Short Course

User: callmespring

dynamic-programming markov-decision-processes monte-carlo-methods off-policy-evaluation q-learning reinforcement-learning temporal-differencing-learning model-based-rl policy-based-method offline-rl

chetweger / min-max-games

Watch the AI learn to play Meta-Tic-Tac-Toe:

User: chetweger

Home Page: http://chet-weger.herokuapp.com/learn_meta_ttt/

ai python temporal-differencing-learning minmax-algorithm alpha-beta-pruning

coeusmaze / adaptive-temporal-difference-learning

Implemented AdaTD and compared it with other optimization methods in temporal difference learning.

User: coeusmaze

optimization-algorithms reinforcement-learning temporal-differencing-learning

dellalibera / gym-backgammon

Backgammon OpenAI Gym

User: dellalibera

gym gym-env reinforcement-learning self-play game td-gammon openai-gym artificial-intelligence openai-gym-environment backgammon-game

dellalibera / td-gammon

TD-Gammon implementation

User: dellalibera

backgammon artificial-intelligence temporal-differencing-learning reinforcement-learning neural-network pytorch convolutional-neural-networks self-play value-function game

francescotorregrossa / deep-reinforcement-learning-nanodegree

Exercises and projects from Udacity's Nanodegree

User: francescotorregrossa

reinforcement-learning deep-reinforcement-learning dqn dqn-pytorch dqn-variants double-dqn dueling-dqn dueling-ddqn pytorch qlearning temporal-differencing-learning monte-carlo-methods

hritikb / reinforcement-learning-algorithms

User: hritikb

reinforcement-learning multi-armed-bandits greedy-policy epsilon-greedy upper-confidence-bound optimistic-inital-values gradient-bandit dynamic-programming value-iteration policy-iteration

imimali / blackjack

Well I'm gonna build my own theme park

User: imimali

sarsa monte-carlo-control q-learning temporal-differencing-learning blackjack futurama reinforcement-learning

imimali / reinforcement-learning-specialization

Reinforcement Learning Specialization courses solutions

User: imimali

reinforcement-learning sarsa q-learning markov-decision-processes dyna-q neural-network policy-gradient monte-carlo temporal-differencing-learning

jhurricane96 / chessai

A self-learning chess artificial intelligence

User: jhurricane96

python chess artificial-intelligence reinforcement-learning temporal-differencing-learning

john-cyhui / reinforcement-learning-cliff-walking

This repo contains a Python implementation of the cliff walking problem (Example 6.6) from Reinforcement Learning: An Introduction by Sutton & Barto.

User: john-cyhui

expected-sarsa q-learning reinforcement-learning temporal-differencing-learning

kalyani011 / rl-q_learning_implementation

Temporal Difference Method - Q-Learning Implementation for FrozenLake Grid Problem

User: kalyani011

off-policy q-learning reinforcement-learning temporal-differencing-learning value-based

krm58 / reinforcement-learning-models

Various computational models for reinforcement learning

User: krm58

reinforcement-learning rescorla-wagner temporal-differencing-learning credit-assignment

madhu009 / deep-math-machine-learning.ai

A blog about machine learning and deep learning algorithms and the math behind them, with machine learning algorithms written from scratch.

User: madhu009

Home Page: https://medium.com/deep-math-machine-learning-ai

machine-learning linear-regression tensorflow gradient-descent-algorithm logistic-regression support-vector-machines deep-neural-networks word2vec natural-language-processing reinforcement-learning-algorithms

matakshay / deeprl-for-delayed-rewards

Deep RL for Temporal Credit Assignment in decision processes with delayed rewards

User: matakshay

deep-neural-networks deep-reinforcement-learning epsilon-greedy-exploration graph-neural-networks graph-representation-learning markov-decision-processes model-free-rl monte-carlo multi-layer-perceptron node2vec

melodicyb / msc

MSc Course Projects

User: melodicyb

complex-valued-data viterbi-hmm affordance-learning hopfield-network el-farol temporal-differencing-learning word2vec text-summarization author-identification

mohammadasadolahi / reinforcement-learning-solving-a-simple-4-4-gridworld-using-td0-evaluation-method-in-python

Solving a simple 4x4 Gridworld, similar to OpenAI Gym's FrozenLake, using temporal-difference reinforcement learning

User: mohammadasadolahi

temporal-differencing-learning reinforcement-learning frozenlake reinforcement-learning-algorithms rl general-policy-iteration td0

mohammadasadolahi / reinforcement_learning_solving_a_simple_4_4_gridworld_using_sarsa-in-python

Solving a simple 4x4 Gridworld, similar to OpenAI Gym's FrozenLake, using SARSA temporal-difference reinforcement learning

User: mohammadasadolahi

sarsa reinforcement-learning reinforcement-learning-algorithms frozenlake temporal-differencing-learning

moporgic / tdl2048-demo

Temporal Difference Learning for the Game of 2048 (Demo)

User: moporgic

2048 machine-learning temporal-differencing-learning demo n-tuple-networks

mpatacchiola / dissecting-reinforcement-learning

Python code, PDFs and resources for the series of posts on Reinforcement Learning which I published on my personal blog

User: mpatacchiola

Home Page: https://mpatacchiola.github.io/blog/

actor-critic deep-reinforcement-learning dissecting-reinforcement-learning drone-landing genetic-algorithm inverted-pendulum markov-chain mountain-car multi-armed-bandit neural-networks q-learning reinforcement-learning sarsa temporal-differencing-learning

mrgeislinger / udacitymlnd_rl-miniproject_temporaldifference

Temporal difference mini project from the reinforcement learning section of Udacity's Machine Learning Nanodegree (MLND). This mini project wasn't required to be turned in; used as a teaching tool.

User: mrgeislinger

machine-learning udacity-nanodegree udacity-machine-learning-nanodegree reinforcement-learning temporal-differencing-learning

mvrahden / reinforce-js

[INACTIVE] A collection of various machine learning solvers. The library takes an object-oriented approach (built with TypeScript) and tries to deliver simplified interfaces that make the algorithms easy to use.

User: mvrahden

Home Page: https://npmjs.com/package/reinforce-js

reinforcement-learning typescript npm deep-q-network deterministic-policy-gradients neural-network solver temporal-differencing-learning deepmind dqn-solver learning-agents reinforce-js reinforcement deep-reinforcement-learning dqn deep-learning td-solver machine-learning-solver artificial-intelligence ai

pouyan-asg / path-planning-with-rl-algorithms

Path Planning with Reinforcement Learning algorithms in an unknown environment

User: pouyan-asg

double-q-learning global-path-planning path-planning q-learning reinforcement-learning sarsa-learning temporal-differencing-learning

prakhar-ff13 / reinforcement-learning-with-python

Reinforcement Learning Notebooks

User: prakhar-ff13

Home Page: https://www.packt.com

machine-learning deep-learning reinforcement-learning deep-reinforcement-learning deep-q-learning markov-decision-processes monte-carlo temporal-differencing-learning actor-critic policy-gradient policy-iteration policy-evaluation value-iteration cross-entropy-method

purvasingh96 / deep-reinforcement-learning

Various reinforcement learning algorithms implemented using Python. This repo also contains a DQN approach to a credit-card anomaly detection use case.

User: purvasingh96

deep-reinforcement-learning pac-man monte-carlo-simulation openai-gym temporal-differencing-learning q-learning sarsa-learning deep-q-network

quentin18 / gymnasium-2048

Gymnasium environment for the game 2048

User: quentin18

2048 2048-ai 2048-game gymnasium n-tuple-networks pygame temporal-differencing-learning

rhalbersma / doctrina

Exercises in reinforcement learning

User: rhalbersma

dynamic-programming monte-carlo-simulation reinforcement-learning temporal-differencing-learning

ricardodominguez / rl-intro

Introduction to Reinforcement Learning in Python

User: ricardodominguez

reinforcement-learning reinforcement-learning-algorithms sarsa sarsa-lambda q-learning temporal-differencing-learning td2 lstdlambda actor-critic dynaq

ricky-ma / decentralizedrl

Decentralized temporal-difference reinforcement learning over randomly reshuffled topology

User: ricky-ma

reinforcement-learning reinforcement-learning-algorithms decentralized temporal-differencing-learning graph-algorithms

rpegoud / temporal-difference-learning

Implementation of Temporal Difference Learning algorithms; the experiment was featured in Towards Data Science

User: rpegoud

Home Page: https://medium.com/towards-data-science/temporal-difference-learning-and-the-importance-of-exploration-an-illustrated-guide-5f9c3371413a

gridworld-environment numpy python reinforcement-learning temporal-differencing-learning

rrando10 / rl-mc-vs-td

Python implementation and analysis of two reinforcement learning algorithms: Monte Carlo and temporal differencing

User: rrando10

reinforcement-learning reinforcement-learning-algorithms monte-carlo temporal-differencing-learning machine-learning artificial-intelligence

saschaschramm / tiny-chatgpt

Researching the reinforcement learning algorithm of ChatGPT

User: saschaschramm

chatgpt gae ppo rlhf temporal-differencing-learning general-advantage-estimation

scitator / rl-course-experiments

User: scitator

reinforcement-learning deep-learning deep-reinforcement-learning tensorflow neural-network genetic-algorithm monte-carlo temporal-differencing-learning deep-q-network policy-gradient

shehio / reinforcementlearning

Reinforcement Learning algorithms with nothing abstracted away

User: shehio

reinforcement-learning planning-algorithms dynamic-programming value-iteration policy-iteration temporal-differencing-learning policy-gradient markov-decision-processes python monte-carlo-tree-search

suchetaaa / cs747-assignments

Foundations Of Intelligent Learning Agents (FILA) Assignments

User: suchetaaa

reinforcement-learning multi-armed-bandits bellman-equation linear-programming howards-pi bootstrapping monte-carlo sarsa-learning windy-gridworld temporal-differencing-learning

sushant-ctrl / rl-implementations

This repository contains the code and sources for the various RL algorithms that I have implemented.

User: sushant-ctrl

rl dqn multiarmed-bandits tabular-rl montecarlomethod temporal-differencing-learning

thaidat / temporal-difference-learning-to-play-2048

A simple reinforcement learning AI to play 2048 games

User: thaidat

temporal-differencing-learning reinforcement-learning ai 2048 python

thaidat / temporal-difference-learning-to-play-2048-pascal-version

A simple reinforcement learning AI to play 2048 games

User: thaidat

temporal-differencing-learning reinforcement-learning ai pascal 2048

tirthajyoti / rl_basics

Basic Reinforcement Learning algorithms

User: tirthajyoti

reinforcement-learning value-iteration policy-iteration q-learning artificial-intelligence machine-learning temporal-differencing-learning td-learning machine-learning-algorithms

tnmichael309 / 2048ai

My RL Project (2048 World Record + IEEE TCIAIG Journal Source Code)

User: tnmichael309

Home Page: https://ieeexplore.ieee.org/document/7518633/

2048-game machine-learning reinforcement-learning-algorithms temporal-differencing-learning tuple-networks

tybens / rl-easy21

Reinforcement Learning as applied to a simplified blackjack game: Easy21

User: tybens

reinforcement-learning episodes easy21 blackjack ucl monte-carlo temporal-differencing-learning

vansh404 / pathplanning_withrl

Using Q-Learning Control for path planning of mobile agents in an environment.

User: vansh404

artificial-intelligence path-planning qlearning-algorithm reinforcement-learning temporal-differencing-learning

vexlife / accelerated-td

My Implementation of the Accelerated Gradient Temporal Difference Learning algorithm in Python

User: vexlife

td temporal-differencing-learning temporal-difference temporal-difference-algorithms temporal-difference-learning random-walk reinforcement-learning reinforcement-learning-algorithms accelerated-td atd

worenga / nine-mens-morris-challenge

Submission for the it-talents.de/Adesso Code Competition, October 2017 ("Kampf gegen Mühlen"). An ES6 web application based on vue.js, fabric.js, and synaptic for playing Nine Men's Morris (Mühle) in the browser. AIs of varying strength with diverse characteristics are available. The game and the AI run entirely in the browser as WebWorkers.

User: worenga

Home Page: https://morris.benedikt-wolters.de/

ai nine-mens-morris q-learning alpha-beta-pruning board-game boardgame artificial-intelligence temporal-differencing-learning