
Comments (2)

hmate9 commented on September 7, 2024

Here is what I have tried:

class DensePolicy(object):
    def __init__(self, ob_space, ac_space):
        # The input is the observed state
        self.x = x = tf.placeholder(tf.float32, [None] + list(ob_space))
        # The observation space may be multi-D, so flatten it
        x = flatten(x)
        # Add the output for getting the action
        self.logits = linear(x, ac_space, "action", normalized_columns_initializer(0.01))
        # Now for getting the value
        self.vf = tf.reshape(linear(x, 1, "value", normalized_columns_initializer(1.0)), [-1])
        # Sample an action from the logits; the result is one-hot encoded
        self.sample = categorical_sample(self.logits, ac_space)[0, :]

        # No idea what this does
        self.var_list = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, tf.get_variable_scope().name)

    def get_initial_features(self):
        # TODO: I have no idea what this function should be doing
        return []

    # Run the observation through the network to determine the action
    def act(self, ob):
        sess = tf.get_default_session()
        print("Closed:", sess._closed)
        return sess.run([self.sample, self.logits] + [], 
            feed_dict={self.x: [ob]})

    def value(self, ob):
        sess = tf.get_default_session()
        return sess.run(self.vf, 
            feed_dict={self.x: [ob]})

However, after around 8-10 steps, TensorFlow quits, saying that it is trying to use a session that is already closed:

[2016-12-10 20:22:05,428] Initializing all parameters.
[2016-12-10 20:22:06,140] Resetting environment
[2016-12-10 20:22:06,140] Starting training at step=0
Closed: False
Closed: False
Closed: False
Closed: False
[2016-12-10 20:22:06,153] Episode terminating: episode_reward=-10 episode_length=4
[2016-12-10 20:22:06,155] Resetting environment
Episode finished. Sum of rewards: -10. Length: 4
Closed: False
Closed: False
Closed: False
Closed: False
[2016-12-10 20:22:06,162] Episode terminating: episode_reward=-10 episode_length=4
[2016-12-10 20:22:06,164] Resetting environment
Episode finished. Sum of rewards: -10. Length: 4
Closed: True
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/Users/matehegedus/Downloads/universe-starter-agent/a3c.py", line 92, in run
    self._run()
  File "/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3517, in get_controller
    yield default
  File "/Users/matehegedus/Downloads/universe-starter-agent/a3c.py", line 92, in run
    self._run()
  File "/Users/matehegedus/Downloads/universe-starter-agent/a3c.py", line 101, in _run
    self.queue.put(next(rollout_provider), timeout=600.0)
  File "/Users/matehegedus/Downloads/universe-starter-agent/a3c.py", line 122, in env_runner
    fetched = policy.act(last_state, *last_features)
  File "/Users/matehegedus/Downloads/universe-starter-agent/model.py", line 70, in act
    feed_dict={self.x: [ob]})
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 902, in _run
    raise RuntimeError('Attempted to use a closed Session.')
RuntimeError: Attempted to use a closed Session.

Traceback (most recent call last):
  File "worker.py", line 122, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "worker.py", line 114, in main
    run(args, server)
  File "worker.py", line 61, in run
    trainer.process(sess)
  File "/Users/matehegedus/Downloads/universe-starter-agent/a3c.py", line 278, in process
    fetched = sess.run(fetches, feed_dict=feed_dict)
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 937, in _run
    np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
  File "/usr/local/lib/python3.5/site-packages/numpy/core/numeric.py", line 482, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.

from universe-starter-agent.

hmate9 commented on September 7, 2024

Solved. act has to return self.vf (the value estimate) as its second output, not self.logits. This is the correct def act:

    def act(self, ob):
        sess = tf.get_default_session()
        print("Closed:", sess._closed)
        return sess.run([self.sample, self.vf] + [],
            feed_dict={self.x: [ob]})

