
atomspace-rocks's Introduction

OpenCog


This repo is no longer maintained! Please use the split-off repositories instead.

Obsolete! As of 2021, the most interesting and actively maintained parts of this git repo have been split off into their own distinct git repos. What is left here is a mish-mash of unmaintained stuff that is in the process of bit-rotting. Some unit tests fail. Some unit tests won't run. Some code won't compile. Perhaps there's some good stuff in here. Perhaps it can be brought back to life and used for something or other. However... for the most part, it is obsolete.


This git repository contains the "OpenCog Framework", which has served as a (scientific, technical) laboratory for researching, exploring and learning how to integrate AI algorithms and systems into humanoid robotic systems. Most of the activity within this particular repo has focused on integrating natural language chat, common-sense reasoning, assorted learning algorithms, and motor control of humanoid robots.

A stated goal of the OpenCog project is to develop artificial general intelligence (AGI) systems. This is all well and good; however, what can be found here, in this particular repo, is very far from that. The code here really is ... a laboratory for integrating various types of AI systems. As such, it is a compilation of several decades of work by a large and varying collection of students, researchers, professors and software engineers. As a laboratory, it is filled with all sorts of devices in varying states of working order, from well-polished to mostly-broken.

See also:

  • ROCCA - Rational OpenCog Controlled Agent. This is a different assemblage of assorted OpenCog components, so that they operate within Minecraft, in the OpenAI Gym. The focus is on learning with the pattern miner, and reasoning with PLN.

Overview

Most of the basic components used in OpenCog are distributed across various git repos, (mostly) grouped under https://github.com/opencog.

This git repository contains a crude natural language processing pipeline, several embodied chatbots, and some control/action-selection mechanisms. These include:

  • Ghost, a Chatscript-compatible chatbot with additional capabilities for accepting visual sensory input, and for controlling robot movements.

  • OpenPsi, a model of psychological states. It's currently a mashup of two unrelated ideas: a generic rule-based action-selection and planning system, and a model of human psychological states. An open to-do item is to untangle these two.

  • An assortment of natural language processing subsystems, including:

    • Natural language generation (for expressing thoughts as sentences).
    • Natural language input (for reading and hearing).
    • Relex2logic, converting natural language to logic expressions.
    • Assorted chatbots, some of which are embodied.
    • A Lojban tool.

Prerequisites

To build and run the system here, the packages listed below are required. Users of Ubuntu may use the dependency installer from the opencog/octool repository. Docker containers with OpenCog preconfigured can be found in the opencog/docker repo.

cogutil

Common OpenCog C++ utilities. https://github.com/opencog/cogutil It uses exactly the same build procedure as this package. Be sure to sudo make install at the end.

atomspace

OpenCog Atomspace, a sophisticated (hyper-)graph database. https://github.com/opencog/atomspace It uses exactly the same build procedure as this package. Be sure to sudo make install at the end.

cogserver

OpenCog CogServer Network Server. https://github.com/opencog/cogserver It uses exactly the same build procedure as this package. Be sure to sudo make install at the end.

attention

OpenCog Attention Allocation subsystem. https://github.com/opencog/attention It uses exactly the same build procedure as this package. Be sure to sudo make install at the end.

URE

OpenCog Unified Rule Engine. https://github.com/opencog/ure Required for PLN. It uses exactly the same build procedure as this package. Be sure to sudo make install at the end.

pln

OpenCog Probabilistic Logic Networks reasoning system. https://github.com/opencog/pln It uses exactly the same build procedure as this package. Be sure to sudo make install at the end.

spacetime

OpenCog Spacetime Server - locations of objects in space and time. https://github.com/opencog/spacetime It uses exactly the same build procedure as this package. Be sure to sudo make install at the end.

ros-behavior-scripting

Visual and auditory senses, robot motor control. https://github.com/opencog/ros-behavior-scripting It uses exactly the same build procedure as this package. Be sure to sudo make install at the end.

lg-atomese

Natural Language Parser for English, Russian, and other languages. Required for natural language generation, and for the chatbot. https://github.com/opencog/lg-atomese It uses exactly the same build procedure as this package. Be sure to sudo make install at the end.

Building OpenCog

Perform the following steps at the shell prompt:

    cd <project-root>
    mkdir build
    cd build
    cmake ..
    make

Libraries will be built into subdirectories within build, mirroring the structure of the source directory root.

Unit tests

To build and run the unit tests (after building opencog as above), enter the following from the ./build directory:

    make test


atomspace-rocks's Issues

Bug or User error? when using frames

Here's some weird, unexpected, confusing behavior w.r.t. frames. It's both a bug and a user error. It is exhibited by the following:

(use-modules (opencog) (opencog exec))
(use-modules (opencog persist))
(use-modules (opencog persist-rocks))

(define a (AtomSpace))
(define b (AtomSpace))
(define c (AtomSpace a b))
(cog-set-atomspace! a)
(Concept "foo")
(Concept "bar")
(Concept "baz")
(Concept "I'm in A")

(cog-set-atomspace! c)
(define rsn (RocksStorageNode "rocks:///tmp/foo"))
(cog-open rsn)

; Oh no! Saving atomspace contents without saving frames first!!
(store-atomspace)
(cog-rocks-print rsn "")

; Now store atomspace b (which should be empty!!)
(store-frames b)

; Hmm looks like all atoms got assigned to atomspace b! Oh no!
(cog-rocks-print rsn "")

If store-frames is done first, before store-atomspace, then the correct behavior is seen. This is a user error, since the user should have done the store-frames first. It is also a bug, since store-frames should not have re-assigned the atoms to the wrong atomspace.
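For reference, a sketch of the order that works, using the same setup as above. (This assumes the intent is to persist the whole hierarchy rooted at c; the key point is that the frame hierarchy is recorded before any Atoms are written.)

```scheme
; Sketch of the working order: save the frame (AtomSpace)
; hierarchy first, then store the Atoms.
(cog-set-atomspace! c)
(define rsn (RocksStorageNode "rocks:///tmp/foo"))
(cog-open rsn)

(store-frames c)     ; record the frame hierarchy first ...
(store-atomspace)    ; ... so the Atoms land in the right frames.
(cog-rocks-print rsn "")
(cog-close rsn)
```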

cog-prt-atomspace doesn't work in union of two atomspaces

In the context of #15, while testing the Multiframe-Example, the following happened:
when I tried to connect the union atomspace to a RocksDB and then call cog-prt-atomspace, I got a segfault:

(cog-set-atomspace! c)
(define rsn (RocksStorageNode "rocks:///home/opcwdir/repodock/ForTestingStuff/foo.rdb"))
(cog-open rsn)
(store-atomspace)
(cog-prt-atomspace)

This also happens if I use cog-rocks-open.
Printing the content with cog-rocks-print works fine.

Not sure if this is a realistic use case.

Performance insanity!

On small datasets and on synthetic benchmarks, the RocksDB backend is 2x or 3x faster than the Postgres backend. This is not surprising, given the complexity of mapping the AtomSpace to SQL, together with the overhead of client-server communications. However, in a real-world use-case, RocksDB is 9x slower than Postgres. Why? What is going wrong? Is this related to issues #10 and #9? Will fixes to those also fix this?

From the learning project. RocksDB:

(define wsv (make-shape-vec-api))
(define wss (add-pair-stars wsv))
(define wst (batch-transpose wss))
(wst 'mmt-marginals)
...
Stored 40000 of 510993 left-wilds in 101 secs (396 pairs/sec)
Stored 80000 of 510993 left-wilds in 356 secs (112 pairs/sec)
Stored 120000 of 510993 left-wilds in 607 secs (66 pairs/sec)
Stored 160000 of 510993 left-wilds in 865 secs (46 pairs/sec)
Stored 200000 of 510993 left-wilds in 1075 secs (37 pairs/sec)
Stored 240000 of 510993 left-wilds in 1352 secs (30 pairs/sec)
Stored 280000 of 510993 left-wilds in 1591 secs (25 pairs/sec)
Stored 320000 of 510993 left-wilds in 1841 secs (22 pairs/sec)
Stored 360000 of 510993 left-wilds in 2076 secs (19 pairs/sec)
Stored 400000 of 510993 left-wilds in 2296 secs (17 pairs/sec)
Stored 440000 of 510993 left-wilds in 2584 secs (15 pairs/sec)
Stored 480000 of 510993 left-wilds in 2871 secs (14 pairs/sec)
Done storing 510993 left-wilds in 19969 secs

Freakin disaster. That works out to 5.5 hours, or 26 pairs/second. Compare this to exactly the same dataset on Postgres:

Stored 40000 of 510993 left-wilds in 170 secs (235 pairs/sec)
Stored 80000 of 510993 left-wilds in 168 secs (238 pairs/sec)
Stored 120000 of 510993 left-wilds in 188 secs (213 pairs/sec)
Stored 160000 of 510993 left-wilds in 177 secs (226 pairs/sec)
Stored 200000 of 510993 left-wilds in 171 secs (234 pairs/sec)
Stored 240000 of 510993 left-wilds in 165 secs (242 pairs/sec)
Stored 280000 of 510993 left-wilds in 164 secs (244 pairs/sec)
Stored 320000 of 510993 left-wilds in 176 secs (227 pairs/sec)
Stored 360000 of 510993 left-wilds in 163 secs (245 pairs/sec)
Stored 400000 of 510993 left-wilds in 189 secs (212 pairs/sec)
Stored 440000 of 510993 left-wilds in 187 secs (214 pairs/sec)
Stored 480000 of 510993 left-wilds in 185 secs (216 pairs/sec)
Done storing 510993 left-wilds in 2246 secs

Postgres took 37 minutes to store the same dataset. What's the problem, here? How can we fix this?
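The figures quoted above follow directly from the two "Done storing" totals; as a quick sanity check:

```scheme
; Arithmetic check of the totals reported above.
(/ 19969.0 3600)    ; RocksDB wall time in hours: about 5.5
(/ 510993 19969.0)  ; RocksDB overall rate: about 26 pairs/sec
(/ 2246.0 60)       ; Postgres wall time in minutes: about 37
(/ 510993 2246.0)   ; Postgres overall rate: about 228 pairs/sec
(/ 19969.0 2246)    ; so RocksDB is roughly 9x slower overall
```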

Provide query "is this Atom being held in this DB?"

The (cog-node TYPE NAME) function allows a user to ask "does a node with the given TYPE and NAME exist in the AtomSpace?", without actually creating that Node. A similar function is needed for the RocksStorageNode: to ask if an Atom is being held, without actually creating it.

This needs modifications to the generic StorageNode API, so that the question can be asked with the generic API, which then trickles down to the specific StorageNode.

In the same vein, other useful functions would be modeled on (cog-incoming-size ATOM) and (cog-incoming-size-by-type ATOM TYPE) to report on the incoming set of an ATOM held in storage, without actually fetching that incoming set.
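A sketch of what the proposed interface might look like. None of these storage-side functions exist yet; the names below are placeholders for illustration only:

```scheme
; Hypothetical API sketch -- these StorageNode methods do NOT exist.
; (cog-node 'ConceptNode "foo") already answers "is it in the
; AtomSpace?" without creating it; the proposal is an analogous
; query against storage.
(store-node? rsn 'ConceptNode "foo")       ; placeholder: is it on disk?
(store-incoming-size rsn (Concept "foo"))  ; placeholder: size of the
                                           ; stored incoming set, unfetched
```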

Provide summary report of what is stored

The (cog-report-counts) function provides a summary of the Atom types stored in the AtomSpace. A similar function is needed, to report what is held in a RocksStorageNode (without actually loading those atoms, of course).

The simple version of this would be to modify the existing report function to total these up and print them. The fancy version of this would be to also add the API to the generic StorageNode API, to make it usable by all backends.

Need to automate compaction on close. 40x too big.

Running the learning kit on a tiny dictionary resulted in this:

$ du -s *rdb
21710584        gram-2.rdb   <<<<<<<<<<< wtf 21GB really?

closing and opening gives:

538360  gram-2.rdb  << half a GB. That's more like it.

which is a 40x compaction. That's sick. That's too big.

Do not always hide atoms when using frames.

When multiple frames are being used, the current rocks code will just hide atoms, instead of removing them. This results in functionally correct behavior, but is wasteful of storage if the atom can actually be deleted.

The AtomSpace extract code currently implements the correct (reference) implementation: it will either hide an atom, when needed, or it will delete the atom, if possible. It can be used as the final arbiter of whether to delete, or not. The backend should follow this advice.

The right way to solve this is to implement pre-delete and post-delete calls in the backend. The atomspace calls pre-delete before doing the deletion, so that rocks can gather any info needed for the deletion to happen. Next, the atomspace extracts the atom. Then it calls the post-delete hook. The hook code should look at what the atomspace did: either the absent flag is set on the atom, or the atom is actually gone. If it's only marked absent, then rocks should also hide the atom; else, rocks should delete the atom.
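The proposed handshake, sketched as placeholder Scheme callbacks. The hook names and helpers are illustrative only; the real hooks would live in the C++ backend:

```scheme
; Placeholder sketch of the proposed delete handshake.
(define (pre-delete atom)      ; AtomSpace calls this before extraction:
  (gather-deletion-info atom)) ; rocks records whatever it will need.

(define (post-delete atom)     ; AtomSpace calls this after acting:
  (if (absent-flag-set? atom)  ; AtomSpace only hid the atom ...
      (rocks-hide atom)        ; ... so rocks hides it too;
      (rocks-delete atom)))    ; otherwise delete it for real.
```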

Ram usage insanity.

During learning on a tiny grammar, RAM usage by RocksDB exploded to 90 GBytes. This is insane; it should not be more than a few GBytes for this workload. This is 40x greater RAM usage than expected. The 40x number is just like the one in issue #9, and might be curable in the same way...

cog-rocks-print/print_range not working after cog-atomspace-clear

(store-atomspace)
; Clear the local AtomSpace (the Atoms remain on disk, just not in RAM).
(cog-atomspace-clear)

Before (cog-atomspace-clear), the cog-rocks-print works as expected.
If I try to inspect the RocksDB after (cog-atomspace-clear), I get:

scheme@(guile-user)> (cog-rocks-print rsn  "a@")
ice-9/boot-9.scm:1685:16: In procedure raise-exception:
In procedure cog-rocks-print: Wrong type (expecting opencog atom): #<Invalid handle>

Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]> 

As I understand it, rocks should not be affected by clearing the AtomSpace in RAM.

Support handling of unknown atom types

Per discussion in opencog/atomspace#2787 (comment) if a RocksDB database holds an atom type that the C++ code does not know about, then it should be auto-created in the atomspace. To make this possible, the RocksDB tables will need to store the atom inheritance hierarchy, which they do not currently do.

Perhaps this could be best achieved by formally defining new atom types in atomese!? (and then just storing the atomese!?)

See also opencog/atomspace#2789

store encoded s-expressions

s-expressions are OK, but the database could be smaller if they were encoded, e.g. by using an integer instead of the spelled-out type name. Additional compression is possible, but at some point the cpu-time of compression/decompression outweighs the storage savings. Note also that RocksDB has built-in compression, so this idea is kind-of questionable anyway.
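The proposed encoding amounts to a dictionary from type names to small integers. A minimal sketch of the idea (illustration only, not an actual on-disk format; a real implementation would persist the table in the database alongside the data):

```scheme
; Illustration only: map spelled-out type names to small integers.
(define type-codes
  '(("ConceptNode" . 1) ("ListLink" . 2) ("EvaluationLink" . 3)))

; With such a table, the stored s-expression
;   (ListLink (ConceptNode "a") (ConceptNode "b"))
; might shrink to something like
;   (2 (1 "a") (1 "b"))
; -- smaller on disk, but it must be decoded on every fetch, so the
; cpu-time vs. storage trade-off would need to be measured.
```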

Alpha-renaming is broken.

Two alpha-equivalent Atoms will be stored as distinct entries. This will result in duplicate, corrupted Values associated with the alpha-equivalent Atoms.

The fix for this seems to involve storing the 64-bit hash and using that to detect possible alpha-equivalence. All Atoms with the same hash would have to be fetched, and then tested for possible alpha-equivalence. The getLink() and getNode() callbacks are problematic, since they don't come with a pre-computed hash, and so have to pay the extra cost of computing it.

FWIW, this problem might affect other back-ends too, e.g. the sql backend, and maybe the cogserver backend? This problem has not been explored...
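The proposed fix, sketched in Scheme. The helper names are placeholders that merely name the steps described above; none of them exist:

```scheme
; Placeholder sketch of the proposed hash-then-verify lookup.
(define (fetch-alpha-equivalent store atom)
  ; 1. Key storage by the 64-bit content hash, not the s-expression.
  (define h (atom-hash atom))   ; getLink()/getNode() must compute this.
  ; 2. Fetch every stored Atom sharing that hash (collisions possible).
  (define candidates (fetch-by-hash store h))
  ; 3. Test each candidate for actual alpha-equivalence.
  (filter (lambda (c) (alpha-equivalent? c atom)) candidates))
```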
