rocca's Introduction

Rational OpenCog Controlled Agent

Description

Rational OpenCog Controlled Agent, or ROCCA, is a project that aims to create an OpenCog agent that acts rationally in OpenAI Gym environments (including Minecraft via MineRL and Malmo).

At its core it relies on PLN (Probabilistic Logic Networks) for both learning and planning. In practice, however, most of the learning is handled by the pattern miner, which can be seen as a specialized form of PLN reasoning. Planning, the discovery of cognitive schematics, is handled by PLN and its temporal reasoning rule base. Decision is currently a hardwired module, heavily inspired by OpenPsi, with a more rational sampling procedure (Thompson Sampling, for a better exploration vs exploitation tradeoff).
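To make the Thompson Sampling idea concrete, here is a minimal sketch on a plain Bernoulli bandit. It is not ROCCA's decision module, which works over cognitive schematics; the function name `thompson_select` and the success/failure counts are hypothetical and chosen only for illustration.

```python
import random


def thompson_select(stats, rng=random):
    """Pick an action by Thompson Sampling.

    stats maps each action to a (successes, failures) pair. For every
    action we draw one sample from Beta(successes + 1, failures + 1),
    i.e. the posterior under a uniform prior, and return the action
    whose sampled success probability is the highest.
    """
    best_action, best_draw = None, -1.0
    for action, (successes, failures) in stats.items():
        draw = rng.betavariate(successes + 1, failures + 1)
        if draw > best_draw:
            best_action, best_draw = action, draw
    return best_action


# Hypothetical counts: "left" succeeded 8 times out of 10, "right" 1 out of 10.
rng = random.Random(0)
stats = {"left": (8, 2), "right": (1, 9)}
picks = [thompson_select(stats, rng) for _ in range(1000)]
```

Because each choice is driven by a random draw from the posterior, the better-looking action is picked most of the time while the other one is still tried occasionally, which is where the exploration comes from.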

Status

For now learning is able to

  1. Discover temporal patterns based on directly observable events via the pattern miner.
  2. Turn these temporal patterns into plans (cognitive schematics).
  3. Combine these plans to form new plans, possibly composed of new action sequences, via temporal deduction.

The next steps are

  1. Add more sophisticated temporal inference rules (including ones dealing with long lags between cause and effect), and then spatial inference rules.
  2. Integrate ECAN, for Attention Allocation, to dynamically restrict the atomspace to subsets of items to process/pay-attention-to.
  3. Record attention spreading to learn/improve Hebbian links.
  4. Carry out concept creation and schematization (crystallized attention allocation).
  5. Record internal processes, not just attention spreading, as percepta to enable deeper forms of introspective reasoning.
  6. Plan internal actions, not just external, to enable self-growth.

Requirements

OpenCog tools

  • cogutil (tested with revision 555a003)
  • atomspace (tested with revision 396e1e7)
  • unify (tested with revision 1e93141)
  • ure (tested with revision 4e01b02)
  • spacetime (tested with revision 962862c)
  • pln (tested with revision 08c100f)
  • miner (tested with revision 15befc4)

Third party tools

Python 3.10 vs 3.8

Python 3.10 offers a better out-of-the-box type annotation system than Python 3.8 and is thus the default required version. However, you may still use Python 3.8 by checking out the python-3.8-compatible branch. Beware that this Python 3.8 branch may not be as well maintained as master.

Install

In the root folder enter the following command (you might need to be root depending on your system):

pip install -e .

For the tools used for development:

pip install -r requirements-dev.txt

How to use

An OpencogAgent, defined under the rocca/agents folder, is provided and can be used to implement agents for given environments. See the examples under the examples folder.

Jupyter notebooks are provided for experimentation as well. To run them, call jupyter notebook on an ipynb file, such as

jupyter notebook 01_cartpole.ipynb

TensorBoard support

Some experiments, notably the notebooks, use TensorBoard via the tensorboardX library to store event files that show certain metrics over time for training / testing (for now it's just rewards).

By default, event files will be created under the runs/<datetime><comment> directory. You can invoke tensorboard --logdir runs from the project root to start an instance that will see all the files under that directory. Open your browser to http://localhost:6006 to see its interface.

Develop

If you write code in notebooks that is exported (has the #export comment on top of the cell), remember to invoke nbdev_build_lib to update the library. Remember to use black for formatting; you can invoke black . from the project root to format everything.

You can also use the Makefile for convenience: invoking make rocca will do both of the above in sequence.

Development container

The .devcontainer folder has configuration for VS Code devcontainer functionality. You can use it to set up a development environment very quickly, regardless of the OS you use.

  • The container has a JupyterLab instance running on the port 8888.
  • The container has a VNC server running on the port 5901.
  • The password for the VNC server started in the container is vncpassword. You can use any VNC client to see the results of rendering Gym environments this way.

Tests

Static type checking

Using type annotations is highly encouraged. One can type check the entire Python ROCCA code by calling

tests/mypy.sh

from the root folder.

To only type check some subfolder, you may call mypy.sh from that subfolder. For instance to type check the examples subfolder

cd examples
../tests/mypy.sh

Or directly run mypy on files or directories, such as

mypy rocca/rocca/agents/core.py

tests/mypy.sh merely calls mypy on all python files in the directory from which it is called while filtering out some error messages.

Unit tests

Simply run pytest in the root folder.

References

There is no ROCCA paper per se yet. In the meantime here is a list of related references

rocca's People

Contributors

amebel, eman22s, jadeoneill, kasimebrahim, ngeiswei, ntoxeg, tanksha

rocca's Issues

Divide by zero in `_beta_pdf`

/usr/local/lib/python3.10/dist-packages/scipy/stats/_continuous_distns.py:608: RuntimeWarning: divide by zero encountered in _beta_pdf
  return _boost._beta_pdf(x, a, b)

Happens while running the Cartpole notebook, during the learning phase.
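The issue does not identify the cause. One plausible hypothesis, stated here as an assumption only, is that the beta density is evaluated at a boundary point (x = 0 or x = 1) with a shape parameter below 1, where the density is unbounded. The sketch below uses a hand-written `beta_pdf` (not the scipy function) to show how the density blows up as x approaches 0 when a < 1.

```python
import math


def beta_pdf(x, a, b):
    """Beta(a, b) density for 0 < x < 1, computed in log space.

    Hand-written for illustration; this is not scipy's implementation.
    """
    log_pdf = (
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + (a - 1) * math.log(x)
        + (b - 1) * math.log1p(-x)
    )
    return math.exp(log_pdf)


# With a = 0.5 the term x ** (a - 1) diverges as x goes to 0,
# so evaluating exactly at x = 0 amounts to dividing by zero.
values = [beta_pdf(x, 0.5, 2.0) for x in (1e-2, 1e-4, 1e-8)]
```

If this hypothesis holds, checking which shape parameters and evaluation points reach the beta distribution during the learning phase would be a reasonable first debugging step.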

Support behavior trees instead of mere action sequences as plans

Overview

Plans covering more possibilities are more likely to have high probabilities of success, and therefore be better guides for action selection. Using behavior trees instead of mere action sequences could be a way to go.

Rationale

Consider the following world:

  1. In context C₁
    1.1. action A₁ equi-probabilistically leads to context C₂ or C₃.
    1.2. action A₄ leads to goal G with probability 0.6.
    1.3. other actions lead to ¬G.
  2. In context C₂, action A₂ leads to G, other actions lead to ¬G.
  3. In context C₃, action A₃ leads to G, other actions lead to ¬G.

Assuming the agent is limited to action sequence plans, from context C₁, only the following three plans can reach G (≺ stands for SequentialAnd, and ↝ stands for PredictiveImplication):

  1. Plan P₁ "if in C₁ take A₁, then take A₂"
    (C₁∧A₁)≺A₂↝G
    
    has a 0.5 probability of success because A₁ has a 0.5 probability of leading to C₃ where A₂ will be ineffective.
  2. Plan P₂ "if in C₁ take A₁, then take A₃"
    (C₁∧A₁)≺A₃↝G
    
    has a 0.5 probability of success because A₁ has a 0.5 probability of leading to C₂ where A₃ will be ineffective.
  3. Plan P₃ "if in C₁ take A₄"
    C₁∧A₄↝G
    
    has a 0.6 probability of success.

Assuming maximum confidence in these probabilities, the action selector will choose plan P₃, as it has the highest probability of success, while the optimal behavior would be to execute A₁ and then, depending on the resulting context, execute A₂ or A₃.

One way to open the mind of the agent is to allow plans covering these contextual branches. For instance the following plan "if in C₁ take A₁, then if in C₂ take A₂, else if in C₃ take A₃"

(C₁∧A₁)≺((C₂∧A₂)∨(C₃∧A₃))↝G

has a probability of success of 1, and therefore would lead to selecting A₁, instead of A₄ as above, as the best next action.
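The arithmetic behind these numbers can be checked with a small sketch. It assumes, as the plan descriptions state, that A₁ taken in C₁ leads to C₂ or C₃ with probability 0.5 each; the dictionary and function names are hypothetical.

```python
# Toy world model: contexts reached after taking A1 in C1, with probabilities.
next_context = {"C2": 0.5, "C3": 0.5}
# The only action that reaches goal G in each of those contexts.
winning_action = {"C2": "A2", "C3": "A3"}


def sequence_success(second_action):
    """Success probability of the fixed sequence 'A1, then second_action'."""
    return sum(
        p for ctx, p in next_context.items()
        if winning_action[ctx] == second_action
    )


def branching_success(policy):
    """Success probability of 'A1, then policy[context]' (context-dependent)."""
    return sum(
        p for ctx, p in next_context.items()
        if policy.get(ctx) == winning_action[ctx]
    )


p1 = sequence_success("A2")  # plan P1
p2 = sequence_success("A3")  # plan P2
p3 = 0.6                     # plan P3, given directly by the world description
p_tree = branching_success({"C2": "A2", "C3": "A3"})  # branching plan

best_sequence = max({"P1": p1, "P2": p2, "P3": p3}.items(), key=lambda kv: kv[1])
```

Here `best_sequence` names P₃ as the best pure action sequence, while `p_tree` exceeds all three, which is the gap the branching plan is meant to close.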

Existing work

Note that behavior trees are apparently already somewhat supported in OpenCog, see https://wiki.opencog.org/w/Behavior_tree_(2015_Archive). It is unknown as of right now how reusable that work would be for ROCCA, or even whether behavior trees, strictly defined, are the way to go, but if not, it will certainly be something similar.

Alternative

Another way to handle that is to include planning itself in the action space, then the following plan "if in C₁ take A₁, then plan and run the selected action from there"

(C₁∧A₁)≺PLAN_SELECT_RUN↝G

should in principle have a probability of success of 1. However, it seems more difficult to reason about planning that involves planning, even in such a linear fashion, than to reason about planning that involves behavior trees with more primitive actions.

Add unit and integration tests

As more people are getting involved it is of paramount importance that we add a test suite.

All working examples (such as https://github.com/opencog/rocca/blob/master/examples/chase.py) should probably be turned into tests (not sure if the example code should be reused or copied). Important methods and utility functions associated with the OpencogAgent class should also be tested.

It's unclear to me what test framework we want to use. I suppose

https://pypi.org/project/nose/

is an obvious candidate, but other suggestions are welcome.

Type annotate python code

Out-of-the-box type annotation was introduced in Python 3.5

https://www.python.org/dev/peps/pep-0484/

and substantially improved in Python 3.9

https://www.python.org/dev/peps/pep-0585/

Already some functions have been annotated by Adrian

def plan(self, goal, expiry) -> List:

def mk_list(*args) -> ListLink:

It would be very good if we made a habit of annotating all of them.
We could also add, alongside the test suite #25, an easy-to-run
1-liner static checker command (using mypy or such).
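As an illustration of the difference between the two PEPs linked above, here is a small sketch. The function `first_words` and its variants are hypothetical and not part of ROCCA; the PEP 585 form needs Python 3.9 or later.

```python
from typing import List  # only needed for the PEP 484 style


def first_words_pep484(lines: List[str]) -> List[str]:
    """PEP 484 style: generic containers come from the typing module."""
    return [line.split()[0] for line in lines if line.strip()]


def first_words_pep585(lines: list[str]) -> list[str]:
    """PEP 585 style: built-in list is used directly as a generic type."""
    return [line.split()[0] for line in lines if line.strip()]


sample = ["move left", "", "jump", "  turn right  "]
old_style = first_words_pep484(sample)
new_style = first_words_pep585(sample)
```

Both functions behave identically at runtime; the annotations only matter to a static checker such as mypy, which would flag a call like first_words_pep585(42).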

Let us choose light over darkness, truth over lies, type annotation over technological apocalypse.

Crash on Cartpole interaction phase

The Cartpole example crashes when trying to run the learning agent (epochs = 5 in the notebook).

ic| msg: 'Learning phase started. (1/5)'
/usr/local/lib/python3.10/dist-packages/scipy/stats/_continuous_distns.py:608: RuntimeWarning: divide by zero encountered in _beta_pdf
  return _boost._beta_pdf(x, a, b)
ic| msg: 'Interaction phase started. (1/5)'
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/workspace/01_cartpole.ipynb Cell 21' in <module>
     17 log_msg(agent_log, f"Interaction phase started. ({i + 1}/{epochs})")
     18 for j in range(epoch_len):
---> 19     done = agent.control_cycle()
     20     wrapped_env.render()  # uncomment to see the rendered env
     21     time.sleep(0.01)

File /workspace/rocca/agents/core.py:1587, in OpencogAgent.control_cycle(self)
   1584 agent_log.debug("cogscms [count={}]:\n{}".format(len(cogscms), cogscms))
   1586 # Deduce the action distribution
-> 1587 mxmdl = self.deduce(cogscms)
   1588 agent_log.debug("mxmdl:\n{}".format(mxmdl_to_str(mxmdl)))
   1590 # Select the next action

File /workspace/rocca/agents/core.py:1491, in OpencogAgent.deduce(self, cogscms)
   1456 # For each cognitive schematic estimate the probability of its
   1457 # context to be true and multiply it by the truth value of the
   1458 # cognitive schematic, then calculate its weight based on
   (...)
   1486 # result, as they allegedly exert an unknown influence (via
   1487 # their invalid parts).
   1488 ctx_tv = lambda cogscm: get_context_actual_truth(
   1489     self.atomspace, cogscm, self.cycle_count
   1490 )
-> 1491 valid_cogscms = [cogscm for cogscm in cogscms if 0.9 < ctx_tv(cogscm).mean]
   1492 agent_log.fine(
   1493     "valid_cogscms [count={}]:\n{}".format(len(valid_cogscms), valid_cogscms)
   1494 )
   1496 # Size of the complete data set, including all observations
   1497 # used to build the models.
   1498 #
   1499 # Needs to be set before calling self.weight

File /workspace/rocca/agents/core.py:1491, in <listcomp>(.0)
   1456 # For each cognitive schematic estimate the probability of its
   1457 # context to be true and multiply it by the truth value of the
   1458 # cognitive schematic, then calculate its weight based on
   (...)
   1486 # result, as they allegedly exert an unknown influence (via
   1487 # their invalid parts).
   1488 ctx_tv = lambda cogscm: get_context_actual_truth(
   1489     self.atomspace, cogscm, self.cycle_count
   1490 )
-> 1491 valid_cogscms = [cogscm for cogscm in cogscms if 0.9 < ctx_tv(cogscm).mean]
   1492 agent_log.fine(
   1493     "valid_cogscms [count={}]:\n{}".format(len(valid_cogscms), valid_cogscms)
   1494 )
   1496 # Size of the complete data set, including all observations
   1497 # used to build the models.
   1498 #
   1499 # Needs to be set before calling self.weight

File /workspace/rocca/agents/core.py:1488, in OpencogAgent.deduce.<locals>.<lambda>(cogscm)
   1454 agent_log.fine("deduce(cogscms={})".format(cogscms))
   1456 # For each cognitive schematic estimate the probability of its
   1457 # context to be true and multiply it by the truth value of the
   1458 # cognitive schematic, then calculate its weight based on
   (...)
   1486 # result, as they allegedly exert an unknown influence (via
   1487 # their invalid parts).
-> 1488 ctx_tv = lambda cogscm: get_context_actual_truth(
   1489     self.atomspace, cogscm, self.cycle_count
   1490 )
   1491 valid_cogscms = [cogscm for cogscm in cogscms if 0.9 < ctx_tv(cogscm).mean]
   1492 agent_log.fine(
   1493     "valid_cogscms [count={}]:\n{}".format(len(valid_cogscms), valid_cogscms)
   1494 )

File /workspace/rocca/agents/utils.py:832, in get_context_actual_truth(atomspace, cogscm, i)
    825 body = AndLink(
    826     PresentLink(*stamped_present_clauses),
    827     IsClosedLink(*stamped_present_clauses),
    828     IsTrueLink(*stamped_present_clauses),
    829     *virtual_clauses
    830 )
    831 query = SatisfactionLink(vardecl, body)
--> 832 tv = execute_atom(atomspace, query)
    833 return tv

File execute.pyx:14, in opencog.execute.execute_atom()

RuntimeError: BUG! Still have ungrounded clauses!! (/home/opencog/atomspace/opencog/query/NextSearchMixin.cc:154)

Integrate OpencogAgent into the pong example

We have a pong example with a dummy agent

https://github.com/opencog/rocca/blob/master/examples/pong.py

It would be interesting to integrate the OpenCog agent. This could be especially interesting for studying delayed reward, as the agent needs to plan its actions (moving the paddle) long before missing the ball.

To integrate the OpenCog agent into the pong example, one may follow the chase example

https://github.com/opencog/rocca/blob/master/examples/chase.py
