normalized-advantage-functions's People

Contributors

axnedergaard

normalized-advantage-functions's Issues

Failed to run, status?

Hey @axnedergaard, it's great that you started working on this request. What's the status? Is the code runnable at this point? I gave it a try and it failed. I'm guessing your focus is on Python 3.

The code fails because it can't find the tensorflow.python.ops.distributions module:

python3 experiment.py --environment MountainCarContinuous-v0 --learning_rate 0.01 --episodes 2000 --hidden_size 16 32 200 --output results.txt
Traceback (most recent call last):
  File "experiment.py", line 10, in <module>
    import naf
  File "/Users/Victor/continuous-deep-q-learning/naf.py", line 16, in <module>
    from tensorflow.python.ops.distributions.util import fill_lower_triangular
ModuleNotFoundError: No module named 'tensorflow.python.ops.distributions'
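
One way to narrow down an import failure like this is to probe each candidate module path before running the full experiment. The snippet below is only a minimal sketch: `can_import` is a hypothetical helper, not part of this repository or of TensorFlow, and which dotted paths to probe is left to the reader.

```python
import importlib


def can_import(module_name):
    """Return True if `module_name` can be imported in this environment.

    Hypothetical diagnostic helper; ModuleNotFoundError is a subclass of
    ImportError, so both are reported as "not importable".
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # The first path is the one from the traceback above.
    for name in ("tensorflow.python.ops.distributions", "math"):
        print(name, "->", "ok" if can_import(name) else "missing")
```

Running this once under each installed TensorFlow version shows quickly whether the module path exists there.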

I tried applying the following patch (I couldn't find tensorflow.python.ops.distributions.util in any TensorFlow version):

@@ -13,7 +13,21 @@
 import tensorflow as tf
 import numpy as np
 import random
-from tensorflow.python.ops.distributions.util import fill_lower_triangular
+# from tensorflow.python.ops.distributions.util import fill_lower_triangular
+
+def fill_lower_triangular(x):
+    """
+    Numpy implementation of `fill_lower_triangular`.
+        Dumped from https://programtalk.com/vs2/python/13142/deep_recommend_system/java_predict_client/src/main/proto/tensorflow/contrib/distributions/python/kernel_tests/distribution_util_test.py/
+    """
+    x = np.asarray(x)
+    d = x.shape[-1]
+    # d = n(n+1)/2 implies n is:
+    n = int(0.5 * (math.sqrt(1. + 8. * d) - 1.))
+    ids = np.tril_indices(n)
+    y = np.zeros(list(x.shape[:-1]) + [n, n], dtype=x.dtype)
+    y[..., ids[0], ids[1]] = x
+    return y

but it also failed miserably:

python3 experiment.py --environment MountainCarContinuous-v0 --learning_rate 0.01 --episodes 2000 --hidden_size 16 32 200 --output results.txt
[2017-12-19 18:13:39,003] Making new env: MountainCarContinuous-v0
Traceback (most recent call last):
  File "experiment.py", line 149, in <module>
    rewards = recursive_experiment(keys, vals, [])
  File "experiment.py", line 64, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:], vals + [remaining_vals[0]])
  File "experiment.py", line 64, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:], vals + [remaining_vals[0]])
  File "experiment.py", line 64, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:], vals + [remaining_vals[0]])
  File "experiment.py", line 67, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:],  vals + [r])
  File "experiment.py", line 64, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:], vals + [remaining_vals[0]])
  File "experiment.py", line 64, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:], vals + [remaining_vals[0]])
  File "experiment.py", line 64, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:], vals + [remaining_vals[0]])
  [Previous line repeated 1 more times]
  File "experiment.py", line 67, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:],  vals + [r])
  File "experiment.py", line 64, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:], vals + [remaining_vals[0]])
  File "experiment.py", line 64, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:], vals + [remaining_vals[0]])
  File "experiment.py", line 64, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:], vals + [remaining_vals[0]])
  File "experiment.py", line 67, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:],  vals + [r])
  File "experiment.py", line 64, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:], vals + [remaining_vals[0]])
  File "experiment.py", line 64, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:], vals + [remaining_vals[0]])
  File "experiment.py", line 64, in recursive_experiment
    rewards += recursive_experiment(keys, remaining_vals[1:], vals + [remaining_vals[0]])
  [Previous line repeated 3 more times]
  File "experiment.py", line 60, in recursive_experiment
    return [experiment(dict(zip(keys,vals)))]
  File "experiment.py", line 78, in experiment
    agent = naf.Agent(args['v'], env.observation_space, env.action_space, args['learning_rate'], args['batch_normalize'], args['gamma'], args['tau'], args['epsilon'], args['hidden_size'], args['hidden_n'], args['hidden_activation'], args['batch_size'], args['memory_capacity'], args['load_path'], args['covariance'])
  File "/Users/Victor/continuous-deep-q-learning/naf.py", line 144, in __init__
    self.N = fill_lower_triangular(self.M.h)
  File "/Users/Victor/continuous-deep-q-learning/naf.py", line 24, in fill_lower_triangular
    d = x.shape[-1]
IndexError: tuple index out of range

Mind giving me a few pointers?
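
For reference, the packing that the patched `fill_lower_triangular` attempts can be sketched in plain Python. It assumes a flat input of length d = n(n+1)/2 and fills the lower triangle row by row, which is the order `np.tril_indices` produces. The names `triangular_size` and `fill_lower_triangular_rows` are hypothetical and exist only for this illustration. The `IndexError` in the traceback means `x.shape` was an empty tuple, so `np.asarray` received something it could not read as a numeric array. That would be consistent with `self.M.h` being a symbolic TensorFlow tensor, though this is an assumption. If so, a NumPy-only replacement could not work at graph-construction time.

```python
import math


def triangular_size(d):
    """Return n such that d == n * (n + 1) // 2, or raise ValueError."""
    n = (math.isqrt(1 + 8 * d) - 1) // 2
    if n * (n + 1) // 2 != d:
        raise ValueError(f"{d} is not a triangular number")
    return n


def fill_lower_triangular_rows(values):
    """Pack a flat sequence into an n x n lower-triangular list of lists.

    Entries are placed row by row: (0,0), (1,0), (1,1), (2,0), ...
    Everything above the diagonal stays 0.
    """
    values = list(values)
    n = triangular_size(len(values))
    matrix = [[0] * n for _ in range(n)]
    it = iter(values)
    for row in range(n):
        for col in range(row + 1):
            matrix[row][col] = next(it)
    return matrix


if __name__ == "__main__":
    for row in fill_lower_triangular_rows([1, 2, 3, 4, 5, 6]):
        print(row)
```

With six inputs this gives a 3 x 3 matrix whose rows are `[1, 0, 0]`, `[2, 3, 0]`, and `[4, 5, 6]`. An input whose length is not a triangular number raises `ValueError` instead of being silently truncated.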
