Introduction2RL_2024

Additional material for the lecture and exercises held at Paris Lodron University Salzburg in 2024 by me (Simon Hirlaender).

This is our little Mars rover from the lecture: [image]

Mars Rover Environment with Gymnasium

The Mars Rover Environment is a reinforcement learning scenario implemented with Gymnasium. This environment simulates a rover navigating across a linear Martian landscape, aiming to reach a designated goal. It's designed to provide a simple yet challenging task for reinforcement learning algorithms, focusing on probabilistic state transitions and reward optimization.

Environment Overview

In this simulated Martian landscape, the rover faces a series of states it can navigate through by moving left or right. The environment is discrete, with terminal states at both ends of the linear space. The rover's goal is to reach the most rewarding terminal state, overcoming the uncertainty introduced by probabilistic movement outcomes.

States

  • The environment is composed of n_states + 2 states, where n_states is the number of non-terminal states. The two additional states represent the terminal points at either end of the rover's path.
  • The rover's position within these states determines its progress and influences the rewards it accumulates.
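As a minimal sketch of this layout, the snippet below assumes that states are numbered 0 to n_states + 1 and that the two terminal states sit at indices 0 and n_states + 1. The numbering and the helper name state_layout are assumptions made for illustration, not taken from the lecture code.

```python
def state_layout(n_states):
    """Return (all_states, terminal_states) for a linear world.

    Assumption: states are numbered 0 .. n_states + 1 and the two
    terminal states are the first and the last index.
    """
    all_states = list(range(n_states + 2))
    terminal_states = {0, n_states + 1}
    return all_states, terminal_states


all_states, terminal_states = state_layout(n_states=5)
non_terminal = [s for s in all_states if s not in terminal_states]
print(len(all_states))  # 7
print(non_terminal)     # [1, 2, 3, 4, 5]
```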

Actions

The rover can perform two actions, each intended to move it one step in the desired direction:

  • Left (0): Move one state to the left.
  • Right (1): Move one state to the right.

The actual outcome of an action is subject to probabilities, adding an element of unpredictability to the rover's movement.

Transition Probabilities

The movement outcomes are influenced by the following probabilities:

  • p_stay: The probability that the rover remains in its current state, despite taking an action.
  • p_backward: The probability that the rover moves in the opposite direction of the intended action.
  • The probability of moving forward, as intended, is 1 - p_stay - p_backward, ensuring all probabilities sum to 1.
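The three outcomes above can be written down as a small probability table. The following is a sketch under stated assumptions: the function name transition_distribution is hypothetical, states are assumed to be integer indices, and the function is only meant for non-terminal states.

```python
LEFT, RIGHT = 0, 1


def transition_distribution(state, action, p_stay, p_backward):
    """Map each possible next state to its probability.

    Assumption: `state` is a non-terminal integer index, so both
    neighbours (state - 1 and state + 1) exist.
    """
    p_forward = 1.0 - p_stay - p_backward
    if p_stay < 0 or p_backward < 0 or p_forward < -1e-12:
        raise ValueError("p_stay and p_backward must be >= 0 and sum to at most 1")
    p_forward = max(p_forward, 0.0)  # clip tiny negative rounding errors

    step = 1 if action == RIGHT else -1
    return {
        state + step: p_forward,   # intended direction
        state: p_stay,             # rover does not move
        state - step: p_backward,  # opposite direction
    }


dist = transition_distribution(state=3, action=RIGHT, p_stay=0.1, p_backward=0.2)
print(dist)
print(sum(dist.values()))  # sums to 1 up to floating-point rounding
```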

Rewards

  • left_side_reward: Reward received upon reaching the left terminal state.
  • right_side_reward: Reward received upon reaching the right terminal state. It is set higher than left_side_reward, which incentivizes the rover to navigate towards this goal.
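A possible reward lookup is sketched below. Two things are assumptions rather than facts from the description: the terminal states are taken to be indices 0 and n_states + 1, and moves that end in a non-terminal state are assumed to yield a reward of 0 (the step_reward parameter is hypothetical). The numeric values in the usage lines are illustrative only.

```python
def reward_for(next_state, n_states, left_side_reward, right_side_reward,
               step_reward=0.0):
    """Reward received when the rover arrives in `next_state`.

    Assumptions: terminals are at indices 0 and n_states + 1, and
    non-terminal arrivals give `step_reward` (0 by default).
    """
    if next_state == 0:
        return left_side_reward
    if next_state == n_states + 1:
        return right_side_reward
    return step_reward


# Illustrative values only; the actual defaults are not stated here.
print(reward_for(0, n_states=5, left_side_reward=1.0, right_side_reward=10.0))  # 1.0
print(reward_for(6, n_states=5, left_side_reward=1.0, right_side_reward=10.0))  # 10.0
print(reward_for(3, n_states=5, left_side_reward=1.0, right_side_reward=10.0))  # 0.0
```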

Terminal States

Reaching a terminal state concludes the current episode. These states represent the rover's successful navigation to an endpoint of its journey, with rewards allocated based on the terminal state reached.
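To illustrate how an episode ends, here is a small simulation sketch that steps the rover until it lands on a terminal state. It rests on several assumptions that are not part of the description above: terminals at indices 0 and n_states + 1, a start in the middle state, zero reward for non-terminal steps, and a hypothetical max_steps safety limit. The function name run_episode and the parameter values are likewise illustrative.

```python
import random


def run_episode(n_states, p_stay, p_backward, left_side_reward,
                right_side_reward, policy, rng, max_steps=10_000):
    """Simulate one episode; return (final_state, reward, steps_taken)."""
    state = (n_states + 1) // 2  # assumption: start in the middle
    for t in range(1, max_steps + 1):
        direction = 1 if policy(state) == 1 else -1
        u = rng.random()
        if u < p_stay:
            move = 0                 # rover stays put
        elif u < p_stay + p_backward:
            move = -direction        # rover slips backward
        else:
            move = direction         # rover moves as intended
        state += move
        if state == 0:               # left terminal ends the episode
            return state, left_side_reward, t
        if state == n_states + 1:    # right terminal ends the episode
            return state, right_side_reward, t
    return state, 0.0, max_steps     # safety limit reached, no terminal


always_right = lambda state: 1
result = run_episode(5, 0.1, 0.1, 1.0, 10.0, always_right, random.Random(0))
print(result)
```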

Integration with Gymnasium

This environment is compatible with the Gymnasium library, facilitating its use in reinforcement learning projects. Gymnasium provides a standardized API for interacting with the environment, including initiating episodes, taking actions, and receiving feedback in the form of state observations, rewards, and termination signals.
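The interaction pattern can be sketched as follows. To keep the example runnable without any installation, it uses a hypothetical stand-in class, MarsRoverSketch, that does not import Gymnasium but mirrors the shape of its API: reset returns (observation, info) and step returns (observation, reward, terminated, truncated, info). The class name, the default parameter values, the middle start state, and the zero reward for non-terminal steps are all assumptions, not the lecture's implementation.

```python
import random


class MarsRoverSketch:
    """Hypothetical stand-in mimicking Gymnasium's reset/step signatures."""

    def __init__(self, n_states=5, p_stay=0.1, p_backward=0.1,
                 left_side_reward=1.0, right_side_reward=10.0):
        self.n_states = n_states
        self.p_stay = p_stay
        self.p_backward = p_backward
        self.left_side_reward = left_side_reward
        self.right_side_reward = right_side_reward
        self.rng = random.Random()
        self.state = None

    def reset(self, seed=None):
        if seed is not None:
            self.rng = random.Random(seed)
        self.state = (self.n_states + 1) // 2  # assumed start: middle state
        return self.state, {}

    def step(self, action):
        direction = 1 if action == 1 else -1
        u = self.rng.random()
        if u < self.p_stay:
            move = 0
        elif u < self.p_stay + self.p_backward:
            move = -direction
        else:
            move = direction
        self.state += move

        reward = 0.0  # assumption: no reward for non-terminal steps
        if self.state == 0:
            reward = self.left_side_reward
        elif self.state == self.n_states + 1:
            reward = self.right_side_reward
        terminated = self.state in (0, self.n_states + 1)
        return self.state, reward, terminated, False, {}


env = MarsRoverSketch()
obs, info = env.reset(seed=42)
terminated = truncated = False
total_reward = 0.0
while not (terminated or truncated):
    action = 1  # always try to move right
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
print(obs, total_reward)
```

With a real Gymnasium environment the loop would look the same; only the construction of env would differ.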

Customization

The Mars Rover Environment supports several customization options, allowing users to adjust the number of states (n_states), the probabilities of action outcomes (p_stay, p_backward), and the rewards for reaching terminal states (left_side_reward, right_side_reward). This flexibility makes it suitable for various experimentation needs, from introductory reinforcement learning tasks to more complex strategic explorations.
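One way to keep these options together and catch inconsistent settings early is a small configuration object. The RoverConfig class below is a hypothetical helper, not part of the lecture code; it only checks the constraint that the outcome probabilities must be non-negative and sum to 1. The values in the usage lines are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoverConfig:
    """Hypothetical container for the customization options."""
    n_states: int
    p_stay: float
    p_backward: float
    left_side_reward: float
    right_side_reward: float

    def __post_init__(self):
        if self.n_states < 1:
            raise ValueError("n_states must be at least 1")
        if self.p_stay < 0 or self.p_backward < 0:
            raise ValueError("probabilities must be non-negative")
        if self.p_stay + self.p_backward > 1:
            raise ValueError("p_stay + p_backward must not exceed 1")

    @property
    def p_forward(self):
        # Remaining probability mass goes to the intended direction.
        return 1.0 - self.p_stay - self.p_backward


# An easy setting (mostly reliable moves) and a harder, noisier one.
easy = RoverConfig(n_states=5, p_stay=0.05, p_backward=0.05,
                   left_side_reward=1.0, right_side_reward=10.0)
noisy = RoverConfig(n_states=9, p_stay=0.3, p_backward=0.3,
                    left_side_reward=1.0, right_side_reward=10.0)
print(easy.p_forward, noisy.p_forward)
```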

Usage Note

To use this environment, install the Gymnasium library in your Python environment (pip install gymnasium). The environment inherits from a DiscreteEnv base class, in the style of the classic Gym toy-text environments, and uses its mechanisms for discrete state and action spaces, probabilistic transitions, and reward definitions.
