reinforcement_learning's People

Contributors

ataymano, bassmang, byronxu99, cheng-tan, cirvine-msft, dwaijam, homezcx, jackgerrits, jakub-szymanski, johnlangford, kumpera, lalo, lokitoth, marco-rossi29, olgavrou, orenmichaely, ormichae, peterychang, quinndamerell-ms, rajan-chari, rajanchari, rajanjedi, schuylergoodman, sheetallahabar, skofsky, tparuchuri, trevorchristman, tyclintw, yannstad, zwd-ms

reinforcement_learning's Issues

CPPRestSDK dependency does not build with GCC 9.3

The specified cpprestsdk dependency version does not build with GCC 9.3

# Checkout 2.10.1 version of cpprestsdk
git checkout e8dda215426172cd348e4d6d455141f40768bf47
g++ (Ubuntu 9.3.0-10ubuntu2) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Build issue:

[  0%] Building CXX object src/CMakeFiles/cpprest.dir/http/client/http_client.cpp.o
In file included from /home/jack/cpprestsdk/Release/src/pch/stdafx.h:105,
                 from /home/jack/cpprestsdk/Release/src/http/client/http_client.cpp:16:
/home/jack/cpprestsdk/Release/include/cpprest/json.h: In member function ‘web::json::value& web::json::array::operator[](web::json::array::size_type)’:
/home/jack/cpprestsdk/Release/include/cpprest/json.h:974:29: error: implicitly-declared ‘constexpr msl::safeint3::SafeInt<long unsigned int>::SafeInt(const msl::safeint3::SafeInt<long unsigned int>&)’ is deprecated [-Werror=deprecated-copy]
  974 |             if (nlastSize < nMinSize)
      |                             ^~~~~~~~
In file included from /home/jack/cpprestsdk/Release/include/cpprest/details/basic_types.h:29,
                 from /home/jack/cpprestsdk/Release/src/pch/stdafx.h:92,
                 from /home/jack/cpprestsdk/Release/src/http/client/http_client.cpp:16:
/home/jack/cpprestsdk/Release/include/cpprest/details/SafeInt3.hpp:5639:22: note: because ‘msl::safeint3::SafeInt<long unsigned int>’ has user-provided ‘msl::safeint3::SafeInt<T, E>& msl::safeint3::SafeInt<T, E>::operator=(const msl::safeint3::SafeInt<T, E>&) [with T = long unsigned int; E = msl::safeint3::SafeIntInternal::SafeIntExceptionHandler<msl::safeint3::SafeIntException>]’
 5639 |     SafeInt< T, E >& operator =( const SafeInt< T, E >& rhs ) SAFEINT_NOTHROW
      |                      ^~~~~~~~
/home/jack/cpprestsdk/Release/include/cpprest/details/SafeInt3.hpp:6443:34: note:   initializing argument 1 of ‘bool msl::safeint3::operator<(msl::safeint3::SafeInt<U, E>, msl::safeint3::SafeInt<T, E>) [with T = long unsigned int; U = long unsigned int; E = msl::safeint3::SafeIntInternal::SafeIntExceptionHandler<msl::safeint3::SafeIntException>]’
 6443 | bool operator <( SafeInt< U, E > lhs, SafeInt< T, E > rhs ) SAFEINT_NOTHROW
      |                  ~~~~~~~~~~~~~~~~^~~
In file included from /home/jack/cpprestsdk/Release/src/pch/stdafx.h:105,
                 from /home/jack/cpprestsdk/Release/src/http/client/http_client.cpp:16:
/home/jack/cpprestsdk/Release/include/cpprest/json.h:974:29: error: implicitly-declared ‘constexpr msl::safeint3::SafeInt<long unsigned int>::SafeInt(const msl::safeint3::SafeInt<long unsigned int>&)’ is deprecated [-Werror=deprecated-copy]
  974 |             if (nlastSize < nMinSize)
      |                             ^~~~~~~~
In file included from /home/jack/cpprestsdk/Release/include/cpprest/details/basic_types.h:29,
                 from /home/jack/cpprestsdk/Release/src/pch/stdafx.h:92,
                 from /home/jack/cpprestsdk/Release/src/http/client/http_client.cpp:16:
/home/jack/cpprestsdk/Release/include/cpprest/details/SafeInt3.hpp:5639:22: note: because ‘msl::safeint3::SafeInt<long unsigned int>’ has user-provided ‘msl::safeint3::SafeInt<T, E>& msl::safeint3::SafeInt<T, E>::operator=(const msl::safeint3::SafeInt<T, E>&) [with T = long unsigned int; E = msl::safeint3::SafeIntInternal::SafeIntExceptionHandler<msl::safeint3::SafeIntException>]’
 5639 |     SafeInt< T, E >& operator =( const SafeInt< T, E >& rhs ) SAFEINT_NOTHROW
      |                      ^~~~~~~~
/home/jack/cpprestsdk/Release/include/cpprest/details/SafeInt3.hpp:6443:55: note:   initializing argument 2 of ‘bool msl::safeint3::operator<(msl::safeint3::SafeInt<U, E>, msl::safeint3::SafeInt<T, E>) [with T = long unsigned int; U = long unsigned int; E = msl::safeint3::SafeIntInternal::SafeIntExceptionHandler<msl::safeint3::SafeIntException>]’
 6443 | bool operator <( SafeInt< U, E > lhs, SafeInt< T, E > rhs ) SAFEINT_NOTHROW
      |                                       ~~~~~~~~~~~~~~~~^~~
In file included from /home/jack/cpprestsdk/Release/include/cpprest/http_msg.h:28,
                 from /home/jack/cpprestsdk/Release/src/pch/stdafx.h:116,
                 from /home/jack/cpprestsdk/Release/src/http/client/http_client.cpp:16:
/home/jack/cpprestsdk/Release/include/cpprest/containerstream.h: In instantiation of ‘size_t Concurrency::streams::details::basic_container_buffer<_CollectionType>::in_avail() const [with _CollectionType = std::vector<unsigned char>; size_t = long unsigned int]’:
/home/jack/cpprestsdk/Release/include/cpprest/containerstream.h:116:24:   required from here
/home/jack/cpprestsdk/Release/include/cpprest/containerstream.h:124:38: error: implicitly-declared ‘constexpr msl::safeint3::SafeInt<long unsigned int>::SafeInt(const msl::safeint3::SafeInt<long unsigned int>&)’ is deprecated [-Werror=deprecated-copy]
  124 |             return (size_t)(writeend - readhead);
      |                            ~~~~~~~~~~^~~~~~~~~~~
In file included from /home/jack/cpprestsdk/Release/include/cpprest/details/basic_types.h:29,
                 from /home/jack/cpprestsdk/Release/src/pch/stdafx.h:92,
                 from /home/jack/cpprestsdk/Release/src/http/client/http_client.cpp:16:
/home/jack/cpprestsdk/Release/include/cpprest/details/SafeInt3.hpp:5639:22: note: because ‘msl::safeint3::SafeInt<long unsigned int>’ has user-provided ‘msl::safeint3::SafeInt<T, E>& msl::safeint3::SafeInt<T, E>::operator=(const msl::safeint3::SafeInt<T, E>&) [with T = long unsigned int; E = msl::safeint3::SafeIntInternal::SafeIntExceptionHandler<msl::safeint3::SafeIntException>]’
 5639 |     SafeInt< T, E >& operator =( const SafeInt< T, E >& rhs ) SAFEINT_NOTHROW
      |                      ^~~~~~~~
/home/jack/cpprestsdk/Release/include/cpprest/details/SafeInt3.hpp:6053:48: note:   initializing argument 1 of ‘msl::safeint3::SafeInt<T, E> msl::safeint3::SafeInt<T, E>::operator-(msl::safeint3::SafeInt<T, E>) const [with T = long unsigned int; E = msl::safeint3::SafeIntInternal::SafeIntExceptionHandler<msl::safeint3::SafeIntException>]’
 6053 |     SafeInt< T, E > operator -(SafeInt< T, E > rhs) const SAFEINT_CPP_THROW
      |                                ~~~~~~~~~~~~~~~~^~~

Events tracking / logging

Right now it is impossible to validate which messages were dropped (because of sampling) or not sent because of network issues.

Suggestion for quick improvement:

Introduce an event sequence identifier of the form "{first_event_id}/{last_event_id}".

  1. Warning message for sampling: 'We dropped X events (Y%) out of "{first_event_id}/{last_event_id}" block'
  2. Warning message for sending: 'We failed to send the block "{first_event_id}/{last_event_id}" because of the following HTTP issue'
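
A minimal sketch of the two messages proposed above, assuming a batch is simply an ordered list of event ids. The function names (block_id, sampling_warning, send_failure_warning) are hypothetical and are not part of rlclientlib.

```python
def block_id(event_ids):
    """Return "{first_event_id}/{last_event_id}" for a non-empty batch."""
    if not event_ids:
        raise ValueError("a block needs at least one event")
    return f"{event_ids[0]}/{event_ids[-1]}"


def sampling_warning(event_ids, dropped):
    """Warning text for events dropped by sampling within one block."""
    block = block_id(event_ids)  # also rejects an empty batch
    percent = 100.0 * dropped / len(event_ids)
    return (f"We dropped {dropped} events ({percent:.1f}%) "
            f'out of "{block}" block')


def send_failure_warning(event_ids, http_error):
    """Warning text for a block that could not be sent."""
    return (f'We failed to send the block "{block_id(event_ids)}" '
            f"because of the following HTTP issue: {http_error}")
```

With both messages keyed on the same identifier, a dropped or unsent block can be matched against what was actually received.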

Create .clang-format for codebase

A .clang-format needs to be created for consistent style. Strongly consider adopting VW's .clang-format for style consistency between codebases.

OpenSSL and windows

Recent versions of cpprestsdk on vcpkg (as of 5/26/20) no longer have openssl as a dependency and this is triggering a failure when building rlclientlib.

This happens because rlclientlib\utility\http_authorization.cc depends on openssl for HMAC.

We should address this by doing the following:

  • update documentation and build scripts to explicitly include openssl.
  • replace the OpenSSL dependency on Windows with the OS crypto library (same goes for OSX).

Non-reproducibility in CCB

Sampling in CCB is currently not deterministic because of seed corruption.
A quick debugging session shows that random junk is appended to the seed string in cb sample.
I haven't had a chance to check whether the issue is on the VW or the RL side yet, so I am creating the issue here.

Improve example gen

  • Change extension to .fb to avoid confusion with .fbs that's used by schema files.
  • Make it possible to produce joinable events.

Add 'what is this' section to readme

I have an intuition that this repo is potentially of immense practical value. But details in the readme are sparse as to what exactly this does.

Would be awesome if an expert were to write a couple paragraphs describing what this does and how.

thank you

[improvement] Remove left overs from CCB

As the CCB design evolved, some bits turned out to be redundant and we never got around to removing them.
They should be removed from the code base.

In no particular order:

Make LiveModel.init robust against random network failures

Right now LiveModel.Init checks the model's Azure location, which can make initialization fail because of transient network issues.
We should probably make it more robust and process model update errors in the same place (the background thread).
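
One way to read this proposal, sketched under assumptions: initialization and later updates share a single refresh path, and a transient network error is reported through the background error callback instead of failing initialization. ModelLoader, fetch_model and on_background_error are hypothetical names, not the LiveModel API.

```python
class ModelLoader:
    """Hypothetical stand-in for the model-loading part of a live model."""

    def __init__(self, fetch_model, on_background_error):
        self._fetch_model = fetch_model  # callable returning model bytes
        self._on_background_error = on_background_error
        self.model = None  # stays None until a fetch succeeds

    def init(self):
        """Start up without failing on a transient fetch error."""
        self.refresh()

    def refresh(self):
        """Fetch the model; a background thread would call this periodically."""
        try:
            self.model = self._fetch_model()
        except ConnectionError as err:
            # Same error path for the first fetch and for later updates.
            self._on_background_error(err)
```

The trade-off is that a caller can hold an initialized loader with no model yet, so it has to tolerate that state until a later refresh succeeds.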

[usability] Strange and/or undocumented config defaults

Config defaults must be intuitive and documented.

For example, the interaction and observation senders default to event hub, for no documented reason.
As another example, the event hub senders default to "localhost:8080" as their URL, again for no documented reason.

I don't quite understand the justification for either, but the latter is actually harmful: it can confuse people who are running other things on such a common development-time port.

Split out example code from RL.Net.Cli

RL.Net.Cli started out as a project to test and debug the RL.Net bindings. Over time we added simulator and replay functionality to it. Long-term, we should probably keep the samples grouped together, and out of the simulation/replay tools.

Python bindings improvements

Currently the Python bindings are generated using SWIG - and they are quite difficult to get right. Perhaps we should consider replatforming at the same time as we do it in VW to have a consistent story here?

Specifications - Enable logging only (C#)

Design:

Expose a Serializer Interface

  • General: Get Interface using a factory
  • General: Buffer Alloc (allocation from pool)
  • General: Buffer Return (return to pool)
  • General: Serialize into buffer using Flatbuffer?

Implement C#/C++ serialization helper for

  • buffer = choose_rank (eventid, context, probability array, action array)
  • buffer = report_outcome(eventid, reward)

Expose a Logger Interface

  • General: Get Interface using factory

C# binding

  • Serializer Interface
  • Logger Interface

Need Sizing
Create Tasks

Model validation will fail for CB cases

In RLClientLib/vw_model/vw_model.cc the default value for MODEL_VW_INITIAL_COMMAND_LINE was changed to CCB, which changes the default behavior for cb. The compatibility check for APS models will always fail.

Change the default command line to cb.

Document configuration parameters for CMake

Right now the CMake setup is only documented by the CMakeLists.txt files themselves. We should have some basic documentation (such as in the wiki) to describe build configuration parameters.

Plans for C API?

Are there plans for a C header file? Would make writing bindings easier among other things.

Would love to contribute here if it would help.

Get rid of copy paste in loggers initialization

Right now we have several sets of parameters required for logger initialization in a single plain list:
"interaction.eventhub.host",
"interaction.eventhub.name",
"interaction.eventhub.keyname",
...
"observation.eventhub.host",
"observation.eventhub.name",
"observation.eventhub.keyname",
...
This introduces a fair amount of copy-paste across all logger implementations.
It would be helpful either to move to a more structured config:
"interaction_config": {"eventhub": {"host": ..., "name": ...}}, ...
or at least to simulate this behavior by accessing groups of config keys by prefix.
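
The prefix-based option could look roughly like the sketch below, which assumes the configuration is a flat mapping of dotted keys. sub_config and eventhub_settings are hypothetical helpers, and the host and name values are placeholders.

```python
def sub_config(config, prefix):
    """Return the entries under `prefix`, with the prefix stripped from each key."""
    dotted = prefix if prefix.endswith(".") else prefix + "."
    return {key[len(dotted):]: value
            for key, value in config.items()
            if key.startswith(dotted)}


def eventhub_settings(config, stream):
    """Settings for one stream ("interaction" or "observation")."""
    return sub_config(config, f"{stream}.eventhub")


# Placeholder values, for illustration only.
flat = {
    "interaction.eventhub.host": "interaction-host.example",
    "interaction.eventhub.name": "interaction",
    "observation.eventhub.host": "observation-host.example",
    "observation.eventhub.name": "observation",
}
```

A logger implementation would then read "host" and "name" from eventhub_settings(flat, "interaction") or eventhub_settings(flat, "observation") without repeating the full key names.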

Add blackhole sender implementation

When writing tests with the C# bindings, it's convenient to disregard all interactions and observations by sending them to nowhere.

We should provide a blackhole sender/receiver to make it easier for end-users writing tests with rlclientlib.

This is particularly important given we can't register providers from the bindings.
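
As a rough illustration of the idea, assuming a sender only needs an init step and a send step, a blackhole implementation reports success and keeps nothing. The interface shown here is hypothetical and does not mirror the actual rlclientlib sender interface.

```python
class BlackholeSender:
    """Sender that accepts every payload and discards it."""

    def init(self):
        # Nothing to connect to, so initialization always succeeds.
        return True

    def send(self, payload):
        # Drop the payload on purpose; report success so callers carry on.
        return True


def flush(sender, payloads):
    """Send each payload and return how many the sender reported as accepted."""
    return sum(1 for payload in payloads if sender.send(payload))
```

A test can pass BlackholeSender() wherever a real sender is expected and never touch the network.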

Flatbuffer dependency resolution does not work when consumed as submodule

set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${CMAKE_SOURCE_DIR}/cmake/Modules/")
find_package(FlatBuffers REQUIRED)

${CMAKE_SOURCE_DIR} will not work when the project is consumed as a submodule. This should probably be CMAKE_CURRENT_LIST_DIR, and the statement should be moved to the top-level file to be with the other dependencies.

[C#] Unsafe usage of safe handles

The current usage of SafeHandle is not safe.

There are two ways to correctly use it.

The first is to use pinvoke marshalling to safely convert it to an IntPtr.

The second is to use the following pattern:

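bool is_valid = false;
try {
  // Take a reference so the handle cannot be released while it is in use.
  DangerousAddRef(ref is_valid);
  DoStuff(DangerousGetHandle());
} finally {
  // Release only if the reference was actually taken.
  if (is_valid)
    DangerousRelease();
}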

Match project and solution files to same version of visual studio

Currently the solution file (rl.sln) uses VS2017 while the project files and toolset are VS2015. We need to keep them consistent:
a) Upgrade all projects to VS2017 -or-
b) Downgrade solution file to VS2015 -or-
c) Provide VS2015 compatible solution file and a VS2017 compatible solution file.

[usability] Avoid out params for construction of disposable objects.

This happens when creating a configuration object. The following piece of code doesn't work:

using(Configuration config = null) {
  Configuration.TryLoadConfigurationFromJson("", out config, null);
}

C# doesn't allow passing a variable by reference if it is declared in a using statement.
This means the user will either fail to dispose it properly or rely on finalization.

Enable Exploration Distribution Generation strategy injection

Right now it is possible to run a VW model (or any i_model) that supports exploration out of the box by relying on the PMF output to represent an "exploration distribution". It is also possible to do manual exploration by specifying the PMF via the "Passthrough" mechanism.

Running a VW (or other model) and then changing the exploration distribution (to increase action diversity) is difficult: It requires setting up two live_model instances, one of which is correctly configured not to log, running the one, grabbing the output PMF, tweaking it, generating a new JSON query and running it through a "Passthrough" live_model.

It would be nice to be able to instantiate a live model with a custom exploration strategy callback which will be used to tweak the output from i_model before sampling.
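
A sketch of what such a callback hook might look like, assuming the model output is a plain list of probabilities. choose_action and flatten_toward_uniform are hypothetical names; this is not the live_model API, only an illustration of applying a strategy to the PMF before sampling.

```python
import random


def flatten_toward_uniform(pmf, mix=0.2):
    """Example strategy: blend the model's PMF with a uniform distribution."""
    n = len(pmf)
    return [(1.0 - mix) * p + mix / n for p in pmf]


def choose_action(pmf, strategy=None, rng=random):
    """Apply the optional strategy to the PMF, then sample one action index."""
    if strategy is not None:
        pmf = strategy(pmf)
    total = sum(pmf)
    if total <= 0:
        raise ValueError("PMF must have positive total mass")
    pmf = [p / total for p in pmf]  # renormalize in case the strategy did not
    action = rng.choices(range(len(pmf)), weights=pmf, k=1)[0]
    # Return the PMF actually sampled from, so the logged probability is correct.
    return action, pmf
```

Returning the adjusted PMF alongside the action matters because the logged probability has to be the one that was actually used for sampling.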

Python documentation

Current Python documentation is generated using doxygen and doesn't seem to really outline the Python API. We should instead use autodoc with Sphinx or something similar.

Check whether workaround for AppVeyor OpenSSL is still needed

A little while back we needed to add a workaround to the AppVeyor build to delete the installed OpenSSL packages to enable vcpkg to properly build gRPC/OpenSSL (for cpprestsdk), as detailed here.

microsoft/vcpkg#4189 (comment)

The issue has been closed because

I am going to close this issue now because it does seem to be resolved, at least with AppVeyor since their patch was made. enigma-dev/RadialGM#34
If other people continue to have issues with vcpkg and OpenSSL, then they'll need to file a new issue because I am not able to provide any more information about other issues I do not have and it seems to be fixed here.

We should check whether we can remove this workaround.

[C#] Crash with misconfigured loop.

using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using Rl.Net;
using System;
using System.IO;
using System.Threading;

namespace rl_lib_repro
{
    class Program
    {
        static void Main(string[] args)
        {
            string modelFilePath = Path.GetTempFileName();
            File.WriteAllBytes(modelFilePath, new byte[0]);

            var configObj = new JObject();
            configObj["AppId"] = "1234";
            configObj["model.backgroundrefresh"] = false;
            configObj["model.source"] = "FILE_MODEL_DATA";
            configObj["model_file_loader.file_name"] = modelFilePath;
            configObj["model_file_loader.file_must_exist"] = true;

            if (!Configuration.TryLoadConfigurationFromJson(JsonConvert.SerializeObject(configObj), out Configuration config))
                throw new Exception("oh no");

            LiveModel lm = new LiveModel(config);
            lm.Init();
            lm.BackgroundError += (a, b) => Console.WriteLine($"got {a} due to {b}");

            string payload = JsonConvert.SerializeObject(JObject.FromObject(new
            {
                Foo = 1,
                _multi = new object[]
                {
                    new
                    {
                        ActionA = 1,
                    },
                    new
                    {
                        ActionB = 1,
                    },
                },
            }));

            using (var ignore = lm.ChooseRank("lalala", payload, ActionFlags.Default))
            {
                // do nothing with the answer
            }

            lm.Dispose();

            Console.WriteLine("done!");
            Console.ReadLine();
        }
    }
}

Run under the debugger and notice that a background Tpp thread will throw an unhandled exception from rl.net.native.

Trace logging support for python

Hi, are you considering adding trace logging support via the python module? A lot of useful error details are currently hidden.
