mlpro's Introduction

MLPro - The Integrative Middleware Framework for Standardized Machine Learning in Python

MLPro provides complete, standardized, and reusable functionalities to support your scientific research, educational tasks or industrial projects in machine learning.

Key Features

a) Open, modular and extensible architecture

  • Overarching software infrastructure (mathematics, data management and plotting, UI framework, logging, ...)
  • Fundamental ML classes for adaptive models and their training and hyperparameter tuning

b) MLPro-RL: Sub-Package for Reinforcement Learning

  • Powerful Environment templates for simulation, training and real operation
  • Templates ranging from single agents and model-based agents (MBRL) with action planning to multi-agents (MARL)
  • Advanced training/tuning functionalities with separate evaluation and progress detection
  • Growing pool of reusable environments from automation and robotics

c) MLPro-GT: Sub-Package for Native Game Theory and Dynamic Games

  • Templates for native game theory regardless of the number of players and the type of game
  • Templates for multi-players in dynamic games, including game boards, players, and many more
  • Reuse of advanced training/tuning classes and multi-agent environments of sub-package MLPro-RL

d) Numerous executable self-study examples

e) Integration of established 3rd party packages

MLPro provides wrapper classes for:

  • Environments of OpenAI Gym and PettingZoo
  • Policy Algorithms of Stable Baselines 3
  • Hyperparameter tuning with Hyperopt

Documentation

The Documentation is available here: https://mlpro.readthedocs.io/

Development

  • Consistent object-oriented design and programming (OOD/OOP)
  • Quality assurance by test-driven development
  • Hosted and managed on GitHub
  • Agile CI/CD approach with automated test and deployment
  • Clean code paradigm

Project and Team

Project MLPro was started in 2021 by the Group for Automation Technology and Learning Systems at the South Westphalia University of Applied Sciences, Germany.

MLPro is designed and developed by Detlef Arend, Steve Yuwono, M Rizky Diprasetya, and further contributors.

How to contribute

If you want to contribute, please read CONTRIBUTING.md.

mlpro's People

Contributors

budiatmadjajawill, detlefarend, laxmikantbaheti, marlonloeppenberg, mlpro-admin, rizkydiprasetya, steveyuwono, syamrajsatheesh, yehiaibrahim

mlpro's Issues

Wrapper classes for OpenAI Gym/Petting Zoo: New method get_cycle_limit()

The interface of class Environment has been extended by a new method get_cycle_limit() that returns the recommended cycle limit for training episodes. The method can be used to configure a proper training process.

In native env implementations it is a static value in the new internal constant C_CYCLE_LIMIT and has to be defined by the creator of a new env.

OpenAI Gym/Petting Zoo provide this value as well. So the new method get_cycle_limit() of the related wrapper classes can be implemented by returning this value from the env registry.

Tasks:

  • Wrapper OpenAI Gym: implementation of method get_cycle_limit()
  • Wrapper Petting Zoo: implementation of method get_cycle_limit()
  • Adaption of related HowTo-files

Requirements:
Branch agent-and-env-model-improvements and the related issues.

See also:
Gym Registry
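
For illustration, a minimal sketch of how the wrapper-side method could look, assuming the wrapper holds a reference to the wrapped Gym env (class and attribute names are simplified; Gym itself exposes the limit as spec.max_episode_steps):

import gym

class WrEnvGYM2MLPro:
    """Simplified sketch of the MLPro wrapper around an OpenAI Gym env."""

    C_CYCLE_LIMIT = 0  # fallback if the wrapped env provides no limit

    def __init__(self, p_gym_env: gym.Env):
        self._gym_env = p_gym_env

    def get_cycle_limit(self) -> int:
        # Gym registers the recommended episode length as max_episode_steps
        # in the env spec; fall back to the internal constant otherwise
        spec = getattr(self._gym_env, 'spec', None)
        if spec is not None and spec.max_episode_steps is not None:
            return spec.max_episode_steps
        return self.C_CYCLE_LIMIT

For example, WrEnvGYM2MLPro(gym.make('CartPole-v1')).get_cycle_limit() would return 500.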

New Wrapper Class WrEnvMLPro2Gym

A new wrapper class WrEnvMLPro2Gym is to be implemented. It inherits the interface of the OpenAI Gym env and wraps an MLPro single-agent environment so that it can be used as an OpenAI Gym env.
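
A rough sketch of the intended direction, using the classic Gym step/reset API (all method names on the MLPro side are assumptions for illustration, not the final interface):

import gym
import numpy as np

class WrEnvMLPro2Gym(gym.Env):
    """Sketch: expose an MLPro single-agent environment through the Gym API.
    All MLPro-side calls (reset, get_state, process_action, ...) are
    assumptions."""

    def __init__(self, p_mlpro_env):
        self._mlpro_env = p_mlpro_env
        # The mapping of MLPro state/action spaces to gym.spaces objects
        # is omitted here; it would be set up in the constructor
        self.observation_space = None
        self.action_space = None

    def reset(self):
        self._mlpro_env.reset()
        return np.array(self._mlpro_env.get_state().get_values())

    def step(self, action):
        self._mlpro_env.process_action(action)
        state = np.array(self._mlpro_env.get_state().get_values())
        reward = self._mlpro_env.compute_reward()
        done = self._mlpro_env.get_done()
        return state, reward, done, {}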

New Exploration class for off-policy exploration

By default, off-policy algorithms sample actions using a basic exploration strategy called epsilon-greedy. However, many other exploration algorithms exist. We therefore need an Exploration class that provides different types of exploration algorithms.
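
A minimal sketch of what such a class hierarchy could look like, with epsilon-greedy as the reference implementation (class and method names are assumptions):

import random

class Exploration:
    """Hypothetical base class: turns a greedy action into an exploratory one."""

    def explore(self, p_greedy_action, p_action_space):
        raise NotImplementedError


class EpsilonGreedy(Exploration):
    """The basic default strategy: explore with probability epsilon."""

    def __init__(self, p_epsilon: float = 0.1):
        self._epsilon = p_epsilon

    def explore(self, p_greedy_action, p_action_space):
        # With probability epsilon pick a random action, otherwise exploit
        if random.random() < self._epsilon:
            return random.choice(p_action_space)
        return p_greedy_action

Further strategies (e.g. Boltzmann exploration or parameter noise) would then just be additional subclasses.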

SciUI scenario "Reinforcement Learning"

Layout and features to be discussed:

  • Plots of states, actions, reward and overall episode/training statistics
  • Host frames/tabs for agent visualisation and specific parameters
  • Host frames/tabs for environment visualisation and specific parameters

Agent: Adaptation Algorithm

The adaptation algorithm of an agent orchestrates various internal adaptation steps. Involved components are:

  • Policy
  • Optional environment model
  • Optional adaptive preprocessing steps (e.g. normalization)

To be discussed...

Environment model - Dynamic latency/Asynchronous processing

In version 1.0 of the reinforcement learning model pool, the Environment class supports a latency that can be changed dynamically using the set_latency() method. However, there is no algorithm behind it, so it is up to the environment implementation to change the latency.
The environment/action model shall be extended by an optional latency for each action component. The dynamic latency of the environment is then computed as follows:

  1. Compute the maximum latency of all action components for each agent
  2. Determine the minimum agent latency
  3. The environment waits/simulates this dynamic latency duration, determines a new state and returns a reward for the respective agent...

Finally, the described extension enables asynchronous action processing.
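
A small numeric sketch of the computation described above (the latency values are made up):

# Hypothetical latency bookkeeping: per agent, every action component
# may carry its own latency (in seconds)
action_latencies = {
    'agent_1': [0.05, 0.20, 0.10],
    'agent_2': [0.15, 0.08],
}

# Step 1: the latency of an agent is the maximum over its action components
agent_latencies = {agent: max(lat) for agent, lat in action_latencies.items()}
# {'agent_1': 0.20, 'agent_2': 0.15}

# Step 2: the dynamic latency of the environment is the minimum agent latency
dynamic_latency = min(agent_latencies.values())  # 0.15

# Step 3: the environment waits/simulates 0.15 s, determines a new state and
# rewards agent_2, whose latency has elapsed; agent_1 is served later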

Related Issues
#672 : Focuses on the latency at the system level.

On-/Off-Policy

Consequences/changes to be worked out. Current state of discussion:

1. On-Policy

  • SAR-Buffer will be cleared after a training episode
  • The basis for policy adaptation is the entire SAR-Buffer

2. Off-Policy

  • SAR-Buffer will not be cleared after a training episode
  • The basis for policy adaptation is a sample of the SAR-Buffer
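
The current state of discussion translates into roughly the following buffer handling (method names on buffer and policy are assumptions):

def adapt_policy(p_policy, p_sar_buffer, p_off_policy: bool, p_sample_size: int = 64):
    """Sketch of the discussed on-/off-policy buffer handling."""
    if p_off_policy:
        # Off-policy: adapt on a sample, keep the buffer across episodes
        p_policy.adapt(p_sar_buffer.extract_sample(p_sample_size))
    else:
        # On-policy: adapt on the entire buffer and clear it afterwards
        p_policy.adapt(p_sar_buffer.get_all())
        p_sar_buffer.clear()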

Wrapper Gym: Detection/mapping of Box spaces

Gym state spaces of type Box shall be mapped to a one-dimensional space where the (one-dimensional) elements contain just a reference to any kind of data object (and some additional metadata). This issue depends on the extension of the Dimension class in bf-math...

Remodelling SAR-Buffer

As discussed on 10th Sep. 2021, the State/Action/Reward-Buffer shall be extended by a new functionality to extract a sample following a specific algorithm. As a consequence, the related class becomes a pool class...

  • New method extract_sample()
  • New subpool folder 'sarbuffers'
  • Reference implementation 'Random sampling' (further implementations for BER, HER postponed)
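
A minimal sketch of the remodelled buffer with the reference implementation 'Random sampling' (sizes and names are illustrative only):

import random

class SARBuffer:
    """Sketch of the remodelled State/Action/Reward-Buffer."""

    def __init__(self, p_size: int = 10000):
        self._data = []
        self._size = p_size

    def add(self, p_sar_element):
        self._data.append(p_sar_element)
        if len(self._data) > self._size:
            self._data.pop(0)  # drop the oldest element

    def extract_sample(self, p_num: int):
        # Reference implementation 'Random sampling'; BER/HER variants
        # would override this method in the new subpool 'sarbuffers'
        return random.sample(self._data, min(p_num, len(self._data)))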

BF-Math: Data object extension

To deal with big data objects like images, point clouds, ... the classes Dimension and Element have to be extended:

  • 1. Extension of class Dimension

A new base set type 'DO' (= Data Objects) shall be added.

  • 2. New class DataObject

def __init__(self, p_data, *p_meta_data)
def get_data()
def get_meta_data()
... maybe the new universal wrapper for data of any type

  • 3. Extension of class Element

Proposal: internal data storage will be changed from np.array to list
Element components of base set type DO will be stored as type DataObject

Further hints: in own applications the framework should be able to deal with special classes derived from the new class DataObject (see 2.), for example images or point clouds. See also: OpenAI Gym Env, space type Box.
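
Based on the signatures listed under 2., the new class could look like this minimal sketch:

class DataObject:
    """Sketch of the proposed universal wrapper for data of any type
    (images, point clouds, ...)."""

    def __init__(self, p_data, *p_meta_data):
        self._data = p_data
        self._meta_data = p_meta_data

    def get_data(self):
        return self._data

    def get_meta_data(self):
        return self._meta_data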

Automatic PyPI Deployment

Idea:
A new branch 'released' shall be created. If the core team decides to publish the next release of MLPro, the main branch will be merged/pushed/copied to branch 'released'. This triggers an automatic procedure that deploys branch 'released' to the Python Package Index (PyPI)...

  • New branch: 'released'
  • Deployment procedure

BF-Various: Review Logging

Logging should be compatible with established mechanisms in third-party packages, especially PyTorch Tensorboard. Maybe a kind of wrapper mechanism or additional methods for class Log to switch logging to a different destination...
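
As a thought experiment, redirecting numeric log entries to Tensorboard could be as simple as the following sketch (the class and method names here are assumptions; SummaryWriter and add_scalar are actual PyTorch API):

from torch.utils.tensorboard import SummaryWriter

class LogToTensorboard:
    """Hypothetical destination switch for class Log: forwards numeric
    log values to a Tensorboard run directory."""

    def __init__(self, p_log_dir: str = './runs/mlpro'):
        self._writer = SummaryWriter(log_dir=p_log_dir)

    def log_scalar(self, p_tag: str, p_value: float, p_step: int):
        self._writer.add_scalar(p_tag, p_value, p_step)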

New property class "ScientificObject"

This class marks derived classes as "scientific objects" with properties like

  • Title
  • Description/abstract
  • DOI, ISBN, ...
  • Citation
  • ...

Potential classes to be enriched with the above properties are: Environment, Agent, ExpRateAlgo, Stream, GameBoard, ...
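
A minimal sketch of such a property class and its use (attribute names are assumptions):

class ScientificObject:
    """Sketch of the proposed property class for scientific metadata."""

    C_SCIREF_TITLE    = None
    C_SCIREF_ABSTRACT = None
    C_SCIREF_DOI      = None
    C_SCIREF_CITATION = None


class MyBenchmarkEnv(ScientificObject):
    """Hypothetical environment enriched with scientific properties."""

    C_SCIREF_TITLE = 'Title of the related publication'
    C_SCIREF_DOI   = '10.0000/example-doi'  # placeholder value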

New Wrapper Class WrEnvMLPro2PZoo

A new wrapper class WrEnvMLPro2PZoo is to be implemented. It inherits the interface of the PettingZoo env and wraps an MLPro multi-agent environment so that it can be used as a PettingZoo env.

Prepare Docu in ReadTheDocs

  • Check the privacy settings in ReadTheDocs
  • Set up docu project based on GitHub project
  • Schedule an internal appointment (short intro, talk about docu structure and further steps)

Howto xy (RL) - Train UR5 Robot environment with A2C Algorithm

A new sample script with the following parts is to be created in the ./examples/rl folder:

  • New local class ScenarioUR5A2C, inherited from class Scenario
  • Method ScenarioUR5A2C._setup(): set up and wire up the UR5 env and an agent based on Rizky's A2C policy
  • Episodic training using the class Training

Environment "Double Pendulum"

The double pendulum problem shall be added as a benchmark environment. Its mathematics and visualization can be adapted from the Matplotlib example "The double pendulum problem".

Included tasks are:

  • 1 Preparation of new env Double Pendulum
    • Hyperparameters: gravity, pole lengths, masses, initial state
      (with sensible default settings)
    • State Space: angle, velocity, acceleration of both poles and motor torque
    • Kinematics adapted from Matplotlib example
    • Visualization adapted from Matplotlib example
    • Pseudo-implementation of reward function
  • 2 New Howto RL-020 - Run Double Pendulum with random agent (with default hp settings)
    • Additional feature: empirical determination of the boundaries of angular velocities and accelerations
      (by running the RL scenario with data logging for state information)
  • 3 DP Env: Implementation of an internal Min/Max normalization of state values based on the boundaries found in Howto RL-020
    (separate protected method _normalize; see the sketch after this list)
  • 4 DP Env: Planning and implementation of a suitable reward function
    A generic approach based on the distance between the old/new state and the goal state (zero vector) shall be implemented. The metric of the underlying state space (method distance) shall be used, independent of the dimensionality of the state space.
  • 5 Documentation on RTD
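
A sketch of the normalization mentioned in task 3, assuming the boundaries found in Howto RL-020 are stored per state dimension (all numbers below are placeholders, not measured values):

import numpy as np

class DoublePendulum:
    """Excerpt sketch: Min/Max normalization of state values."""

    def __init__(self):
        # Placeholder boundaries per state dimension, to be replaced by the
        # values determined empirically in Howto RL-020
        self._state_min = np.array([-np.pi, -8.0, -50.0, -np.pi, -8.0, -50.0])
        self._state_max = np.array([ np.pi,  8.0,  50.0,  np.pi,  8.0,  50.0])

    def _normalize(self, p_state: np.ndarray) -> np.ndarray:
        # Map each state component linearly to [-1, 1]
        return 2 * (p_state - self._state_min) / (self._state_max - self._state_min) - 1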

Prerequisites:
#319

Review Training Mechanisms on Example Howto 09 (SAC with Gym Env)

Currently Howto 09 does not succeed. That raises the following questions:

  • Can we find a better set of (hyper) parameters so that the example succeeds?
  • Do we have a gap in our training process in general (e.g. automatically storing the best policy)?
  • Is MountainCarContinuous the best env to demonstrate the strength of SAC?
  • Do we have to extend our env model by a method that returns the cycle limit per training episode (I am pretty sure that this makes sense)?
  • ...

Class Scenario: optional permanent data logging

Currently, class Training collects runtime data only. For the supervision of permanent realtime/productive control processes, the class Scenario shall be extended by optional permanent data logging. To limit the amount of data, the logging should be cyclic, e.g. the last n frames of every DataStoring object could be stored in separate files...
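
A minimal sketch of the cyclic logging idea, keeping only the last n frames in memory before flushing them to a file (class and method names are assumptions):

from collections import deque

class CyclicDataLogger:
    """Sketch: keeps only the last n frames of a DataStoring object."""

    def __init__(self, p_num_frames: int = 1000):
        self._frames = deque(maxlen=p_num_frames)  # oldest frames drop out

    def log_frame(self, p_frame):
        self._frames.append(p_frame)

    def flush(self, p_filename: str):
        # Persist the current window, e.g. at the end of each cycle
        with open(p_filename, 'w') as f:
            for frame in self._frames:
                f.write(str(frame) + '\n')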

Assurance of Clean Code

Clean code is an internal and external quality feature. It helps to understand/improve/maintain the code and therefore extends the lifecycle of our project. Furthermore, it is something a scientific reviewer can easily assess. Last but not least, it influences the spread of MLPro: people trust code more when it is understandable/hygienic/unified/standardized...

  • Style sheet / module template
  • Info slot in a team meeting
