Hi! First of all, big thanks for this crate. I had some questions re

Agents requiring mutability for evaluation (action selection) about rsrl HOT 8 CLOSED

tspooner commented on June 7, 2024

Agents requiring mutability for evaluation (action selection)

from rsrl.

Comments (8)

tspooner commented on June 7, 2024

Hey,

I agree completely. I think generators would be a great thing to support. My only concern is that I don't want to rely on unstable features in the core crate. I've been desperate for some of the features like impl specialisation and GATs for a long time, but held out in favor of stable Rust.

However, I have a proposal. I have been re-writing a lot of the underlying code in rsrl this weekend and will push something soon. This replaces many of the wet traits with a single trait called Handler. This trait looks like the following:

pub trait Handler<M> {
    type Response;
    type Error;

    fn handle(&mut self, message: M) -> Result<Self::Response, Self::Error>;
}

I think it makes a lot of sense to design the framework to work in an event-driven manner. Furthermore, I'm super keen to integrate first-class support for futures once async traits drop on stable. This will make it much easier to integrate later down the line.
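As a rough illustration of how an agent that mutates state during action selection could sit behind this trait, here is a minimal sketch. Only the `Handler` trait is taken from the comment above; the `SelectAction` and `Reward` messages and the `CountingAgent` type are hypothetical names invented for this example and are not part of rsrl.

```rust
/// The trait as quoted in the comment above.
pub trait Handler<M> {
    type Response;
    type Error;

    fn handle(&mut self, message: M) -> Result<Self::Response, Self::Error>;
}

// Hypothetical message types for this sketch (not taken from rsrl).
pub struct SelectAction;
pub struct Reward {
    pub action: usize,
    pub value: f64,
}

/// Hypothetical bandit-style agent with per-action counts and value estimates.
pub struct CountingAgent {
    counts: Vec<u32>,
    values: Vec<f64>,
}

impl CountingAgent {
    pub fn new(n_actions: usize) -> Self {
        CountingAgent { counts: vec![0; n_actions], values: vec![0.0; n_actions] }
    }
}

impl Handler<SelectAction> for CountingAgent {
    type Response = usize;
    type Error = String;

    fn handle(&mut self, _: SelectAction) -> Result<usize, String> {
        // Try each untried action once, then act greedily on the estimates.
        let action = match self.counts.iter().position(|&c| c == 0) {
            Some(a) => a,
            None => self
                .values
                .iter()
                .enumerate()
                .max_by(|a, b| a.1.total_cmp(b.1))
                .map(|(i, _)| i)
                .ok_or_else(|| "agent has no actions".to_string())?,
        };
        // Action selection itself mutates the agent, hence `&mut self`.
        self.counts[action] += 1;
        Ok(action)
    }
}

impl Handler<Reward> for CountingAgent {
    type Response = ();
    type Error = String;

    fn handle(&mut self, msg: Reward) -> Result<(), String> {
        let n = *self
            .counts
            .get(msg.action)
            .ok_or_else(|| format!("unknown action {}", msg.action))?;
        if n == 0 {
            return Err(format!("action {} was never selected", msg.action));
        }
        // Incremental mean, assuming one reward per selection.
        self.values[msg.action] += (msg.value - self.values[msg.action]) / n as f64;
        Ok(())
    }
}
```

Because each message type gets its own `impl`, the same agent can respond to several kinds of events, and the caller simply writes `agent.handle(SelectAction)` or `agent.handle(Reward { .. })`.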

Now, taking all of this into consideration, here's the proposal: split the crate into a larger family as in the num ecosystem. I'm not sure what would be the best way yet, but I think your suggestion to have a traits crate is the first step. The next would be to have a crate for distributed experiments etc (as with the ray framework in Python). This would be the perfect place for code that you're proposing!

So, please do keep me updated as I will create this crate at some point soon and the more features and syntactic sugar for running experiments the better.

Tom

from rsrl.

tspooner commented on June 7, 2024

Hey!

First off, sorry for the delay in responding (I haven't been getting alerts about issues). I do really appreciate the input and I'm glad that you're getting some use out of the code.

So I totally agree that there are some real limitations with the interfaces at the moment. I've been meaning to take another look into this to try and cover more broad classes of algorithms. I'll need to spend some time looking into the paper you cite to make a proper decision. My only concern is that the crate does not currently support planning. While this would be something worth having in future, I do want to try and avoid making overly complicated traits to make everything integrate right now. My priority is having a set of terse but parsimonious learning abstractions.

Anyway, I'm in the (very slow I realise) process of doing some major changes to the framework which may well change a lot of this anyway. For example, I do think that the Controller trait is actually somewhat ill-defined, and I'm not keen on my choice to have "target" and "behaviour" policies separated here. As part of that I'll prioritise looking into your suggestions. For now, are you OK to fork the crate and implement the traits you need? If you are, please link it here so I can refer to it as I go.

I think the idea of separating some of the core abstractions into a separate crate makes a lot of sense. This would be akin to something like num and num-traits. I will absolutely do this so please do watch out for that as input is always welcome.

Regards,
Tom

from rsrl.

wschella commented on June 7, 2024

Thanks for the response, no sweat for the delay!

Designing abstractions is just a very hard problem to tackle given the broad scope of artificial 'intelligence'. The reason this online controller or planner would be 'necessary' is because of Rust's mutability & ownership rules; in other languages it would not even be a problem at all (until ofc you end up with bugs). I've thought about it some more, and mutability during evaluation actually feels necessary for a very broad class of agents. I've mentioned online solvers and planners, but anything that does 'continual learning' would fall under it, which in fact every RL agent should be able to do, depending on the needs of the application. Unfortunately I can't give many pointers to influential papers, as my overview of the field is quite limited.

Also, the nature of defining handler-type methods already makes it quite difficult to reflect any concept of being called sequentially in the API and types. Currently, this is the responsibility of the evaluation/training code. So I'm not even sure you can cleanly adapt the traits to the 'online learner/planner' paradigm at all without completely revamping the crate, or defining a crate that is completely disconnected from the rest of the crate.

So there's definitely no need to prioritize my use case in any way. It's hard, and I've no clue on how to do it myself in any way that is not a hack. I currently solved my problem by just having a local crate with all the types (since there were some problems with enabling blas here), where I just added a mut to the controller 😆.
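To make the "added a mut" workaround concrete, here is a small sketch. The two traits below are hypothetical, simplified stand-ins and not rsrl's actual Controller signature; `Planner` and `CellPlanner` are likewise invented for illustration. The `Cell` variant is not something described in this thread; it is included only as the usual alternative when a `&self` signature cannot be changed.

```rust
use std::cell::Cell;

/// Hypothetical trait whose action selection takes `&self`.
pub trait ReadOnlyController {
    fn sample_target(&self, state: u32) -> usize;
}

/// The same hypothetical trait with "a mut added".
pub trait MutController {
    fn sample_target(&mut self, state: u32) -> usize;
}

/// Hypothetical online planner that counts the decisions it has made.
pub struct Planner {
    n_actions: usize,
    steps: usize,
}

impl Planner {
    pub fn new(n_actions: usize) -> Self {
        assert!(n_actions > 0, "need at least one action");
        Planner { n_actions, steps: 0 }
    }
}

impl MutController for Planner {
    fn sample_target(&mut self, state: u32) -> usize {
        // Plain mutation: allowed because the method takes `&mut self`.
        self.steps += 1;
        (state as usize + self.steps) % self.n_actions
    }
}

/// Alternative that keeps `&self` by using interior mutability.
pub struct CellPlanner {
    n_actions: usize,
    steps: Cell<usize>,
}

impl CellPlanner {
    pub fn new(n_actions: usize) -> Self {
        assert!(n_actions > 0, "need at least one action");
        CellPlanner { n_actions, steps: Cell::new(0) }
    }
}

impl ReadOnlyController for CellPlanner {
    fn sample_target(&self, state: u32) -> usize {
        // The counter still changes, but the change is hidden from the signature.
        let steps = self.steps.get() + 1;
        self.steps.set(steps);
        (state as usize + steps) % self.n_actions
    }
}
```

The `&mut self` version keeps the mutation visible in the type signature, whereas the `Cell` version hides it and makes the type `!Sync`, which may matter for parallel evaluation.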

I will keep an eye on separating out the types!

Thanks for the effort, there's no hurry.

from rsrl.

wschella commented on June 7, 2024

Concretely I feel like the generator pattern would be a flexible and nice abstraction over agents that would fit a very broad scope of use cases while maintaining a clean interface.

Since generators are an unstable feature (and will likely be so for a while), I'll experiment a bit with a hand-rolled implementation somewhere in the (not super near) future. I'll try to report progress here, which should by no means be interpreted as a request to consider my conclusions for this crate.

Frankly I just get excited about implementing environments and agents as generators and then implementing executors, experiments & evaluation over them.
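One way such a hand-rolled design might look on stable Rust is sketched below. Everything here is hypothetical: the `Resumable` trait, the `Step` enum, the toy `Corridor` environment, the `RightWalker` agent and the `run_episode` executor are invented for illustration and do not come from rsrl or from the experiments mentioned above.

```rust
/// Stable stand-in for a generator's yield/return result.
pub enum Step<Y, R> {
    Yielded(Y),
    Complete(R),
}

/// Hypothetical hand-rolled generator: resumed with an input, yields or completes.
pub trait Resumable<In> {
    type Yield;
    type Return;

    fn resume(&mut self, input: In) -> Step<Self::Yield, Self::Return>;
}

/// Toy environment: walk from `pos` to `goal`, paying a reward of -1 per step.
pub struct Corridor {
    pub pos: i32,
    pub goal: i32,
    pub total: f64,
}

impl Resumable<i32> for Corridor {
    type Yield = (i32, f64); // (observation, reward)
    type Return = f64; // episode return

    fn resume(&mut self, action: i32) -> Step<(i32, f64), f64> {
        self.pos = (self.pos + action.signum()).max(0);
        self.total -= 1.0;
        if self.pos >= self.goal {
            Step::Complete(self.total)
        } else {
            Step::Yielded((self.pos, -1.0))
        }
    }
}

/// Toy agent: always moves right, and mutates itself on every decision.
pub struct RightWalker {
    pub decisions: u32,
}

impl Resumable<i32> for RightWalker {
    type Yield = i32; // action
    type Return = ();

    fn resume(&mut self, _observation: i32) -> Step<i32, ()> {
        self.decisions += 1;
        Step::Yielded(1)
    }
}

/// Executor: alternates agent and environment until one completes or steps run out.
pub fn run_episode<E, A>(env: &mut E, agent: &mut A, first_obs: i32, max_steps: usize) -> Option<E::Return>
where
    E: Resumable<i32, Yield = (i32, f64)>,
    A: Resumable<i32, Yield = i32>,
{
    let mut obs = first_obs;
    for _ in 0..max_steps {
        let action = match agent.resume(obs) {
            Step::Yielded(a) => a,
            Step::Complete(_) => return None,
        };
        match env.resume(action) {
            Step::Yielded((next_obs, _reward)) => obs = next_obs,
            Step::Complete(ret) => return Some(ret),
        }
    }
    None
}
```

Here the sequencing lives in the executor rather than in the types, which matches the earlier observation that ordering is currently the responsibility of the evaluation/training code.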

from rsrl.

wschella commented on June 7, 2024

A generalized handler pattern like this actually comes quite close to a 'hand-rolled' generator, and I think in almost all scenarios it is preferable to the raw Generator trait, which, as you suggest, is usually just relevant as syntactic sugar.

Given that this is the current unstable Generator trait:

pub trait Generator<R = ()> {
    type Yield;
    type Return;
    fn resume(self: Pin<&mut Self>, resume: R) -> GeneratorState<Self::Yield, Self::Return>;
}

I'd say what you're currently doing looks perfect (and a better fit for AI use cases), and I'm excited to check it out :)
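The correspondence can be sketched on stable Rust as follows. This is only one possible reading: it maps the resume argument to the message, `Yield` to `Response`, and `Return` to `Error`, which is an assumption made for this example rather than anything stated in the thread. The `State` enum, the `resume` helper and the `Budgeted` handler are hypothetical names.

```rust
/// The Handler trait as quoted earlier in this thread.
pub trait Handler<M> {
    type Response;
    type Error;

    fn handle(&mut self, message: M) -> Result<Self::Response, Self::Error>;
}

/// Stable stand-in for the unstable GeneratorState enum.
#[derive(Debug, PartialEq)]
pub enum State<Y, R> {
    Yielded(Y),
    Complete(R),
}

/// Hypothetical helper that drives any handler like a generator:
/// `Ok` is treated as a yield, `Err` as the final return.
pub fn resume<H, M>(handler: &mut H, message: M) -> State<H::Response, H::Error>
where
    H: Handler<M>,
{
    match handler.handle(message) {
        Ok(y) => State::Yielded(y),
        Err(r) => State::Complete(r),
    }
}

/// Hypothetical handler: sums its inputs until a fixed budget is spent.
pub struct Budgeted {
    pub remaining: u32,
    pub sum: i64,
}

impl Handler<i64> for Budgeted {
    type Response = i64; // running sum
    type Error = i64; // final sum once the budget is spent

    fn handle(&mut self, x: i64) -> Result<i64, i64> {
        if self.remaining == 0 {
            return Err(self.sum);
        }
        self.remaining -= 1;
        self.sum += x;
        Ok(self.sum)
    }
}
```

Since the state lives in an ordinary struct with no self-references, this version needs neither `Pin` nor any nightly feature.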

from rsrl.

tspooner commented on June 7, 2024

Hey @wschella. Not sure if you've seen the latest version of rsrl. Do let me know whether this fixes your issues and we can close this. If not it would be good to hear what you think so I can try to address things in future versions.

from rsrl.

wschella commented on June 7, 2024

Hi, thanks for the changes! I haven't had the time to actually do some implementations, but as far as I can see the handler trait should make it possible for me to whip something up. So if it's up to me this issue can be closed.

Kind regards and much thanks

from rsrl.

tspooner commented on June 7, 2024

Ok excellent!

from rsrl.
