autograd's Issues

is there a fragile dependence on garbage collection?

This test works on the current master:

import autograd.numpy as np
from autograd import grad

print grad(np.linalg.det)(np.eye(3))

(It prints an identity matrix.)

But if we just comment out this del statement in grad it no longer prints an identity matrix and instead prints something like

<autograd.numpy.numpy_extra.ArrayNode object at 0x1092184d0>

I think the change in behavior could be because the tapes member of any Node instance is a WeakKeyDictionary, so that a del statement which removes a strong reference to some tape can have side-effects on many Node.tapes members. If that's the case, that behavior is pretty confusing, and it might be fragile and/or very Python-implementation-specific (e.g. is WeakKeyDictionary guaranteed to remove entries immediately, or only on garbage collection passes?).

Why does the behavior apparently change when removing that seemingly-critical del statement, and is that a good thing?
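For what it's worth, here is a minimal sketch, independent of autograd, of the CPython behavior in question: a WeakKeyDictionary entry disappears as soon as the last strong reference to its key is deleted, because reference counting frees the key immediately; other Python implementations may only drop the entry on a later garbage-collection pass.

import weakref

class Tape(object):
    pass

tapes = weakref.WeakKeyDictionary()
t = Tape()
tapes[t] = 'per-tape state'
print(len(tapes))   # 1
del t               # CPython frees the Tape right away via refcounting...
print(len(tapes))   # ...so the entry is already gone: 0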

Differentiation with np.select (piecewise functions)

You're doing some great work here! I've run into a slight issue with computing gradients of piecewise functions using np.select. I've put together a minimal (trivial) example (Python 3.3):

import autograd.numpy as anp
from autograd import grad

def func(x):
    return anp.select([True], [x[0]*(anp.log(x[1]**x[1]) + anp.log(x[2]**x[2]))])

gradient = grad(func)(anp.array([300., 0.5, 0.5]))
print(gradient)
[ 0.  0.  0.]
[...]/autograd/core.py:39: UserWarning: Output seems independent of input. Returning zero gradient.
  warnings.warn("Output seems independent of input. Returning zero gradient.")

select accepts a list of conditions and a list of choices, and it returns the element of the choice list corresponding to the first condition that evaluates to True. The gradient of a piecewise-defined function should just be the gradient of the selected choice, but I haven't yet wrapped my head around how constructing a closure for defgrad would work in this context.
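A possible stopgap, sketched below rather than a fix to np.select itself: express the piecewise function with np.where, which autograd can differentiate, assuming every branch is finite at the points being evaluated.

import autograd.numpy as anp
from autograd import grad

def func_where(x):
    branch = x[0] * (anp.log(x[1] ** x[1]) + anp.log(x[2] ** x[2]))
    # np.where selects the branch value while still tracing its gradient
    return anp.where(x[0] > 0, branch, 0.0)

print(grad(func_where)(anp.array([300., 0.5, 0.5])))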

The incredible nestable egg

$ python
>>> from funkyyak import grad
>>> grad (lambda x: x * grad (lambda y: x*y) (2.0)) (1.0)
0.0

should be 2.0

Recent commit "Added tests and fix for singleton array outputs" breaks higher order derivatives

Hi,

This commit from earlier today, breaks some code I have. Unfortunately I am having a really hard time debugging and finding the location of the error, but here are some clues:

  1. It only happens when I take a second-order derivative -- the first-order derivative of my function works fine.

  2. The error occurs in numpy_extra.py, inside the inner lambda function defined in

untake.defgrad(lambda ans, x, idx, template : lambda g : take(g, idx))

  3. Specifically, the error occurs because g is a 0-d array, array(float), and idx=0, and I get
    IndexError: 0-d arrays can't be indexed

For now I am going to switch back to the previous commit, but will work on debugging it a bit more over the next couple days.

Also, are there any good strategies for debugging? I find it really difficult to trace errors once autograd goes into the backward_pass() function.

Releasing on Anaconda.org

Any chance you guys could maintain autograd on https://anaconda.org ? I would like to be able to reference the package in my conda recipe so the conda users of my library can easily install it. I can build autograd myself and host in my custom channel, but it's more likely to be the latest version if the official maintainers do it. :)

I wrote up some notes on my own experiences uploading a library to Anaconda.org here: https://github.com/richardotis/pycalphad/blob/0fdf613116c22740262660db12696804af5e4553/RELEASING.rst

Here's an example conda recipe file: https://github.com/richardotis/pycalphad/blob/0fdf613116c22740262660db12696804af5e4553/conda_recipe/conda.yaml

I drive all my versioning from Git, but it's also possible to hard-code version numbers or pull them from another source.

Nth derivative

I thought the example looked a little clunky, where we do the following to compute an nth derivative:

>>> grad_tanh_2 = grad(grad_tanh)           # 2nd derivative
>>> grad_tanh_3 = grad(grad_tanh_2)         # 3rd derivative
>>> grad_tanh_4 = grad(grad_tanh_3)         # etc.
>>> grad_tanh_5 = grad(grad_tanh_4)
>>> grad_tanh_6 = grad(grad_tanh_5)

Would it be worth having an optional second argument in grad, say:

>>> grad_tanh_6 = grad(tanh, 6)

Or just a different function?
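A sketch of the "different function" option; the nth_grad helper name is hypothetical, not part of autograd.

import autograd.numpy as np
from autograd import grad

def nth_grad(fun, n):
    # wrap fun with grad n times; n=1 is the ordinary gradient
    for _ in range(n):
        fun = grad(fun)
    return fun

grad_tanh_6 = nth_grad(np.tanh, 6)
print(grad_tanh_6(1.0))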

Dynd support

Dynd is a next-generation array library for Python, with lots of cool features like JIT compilation, heterogeneous data, user-defined data types, missing data support, type checking, etc.: https://speakerdeck.com/izaid/dynd

I wonder if there is interest from either HIPS or the dynd devs @izaid, @insertinterestingnamehere or @mwiebe in integrating this with Dynd (or at least leaving in hooks for future functionality or cooperation later on?)

Both are cool libraries, and it would be a shame for the Python community to have to choose between them for projects.

Elementwise Hessian

I'm super impressed with the progress you've been making! Autograd has the potential to save me and many others a lot of grief in terms of automatically differentiating scalar-valued functions while retaining broadcasting support. So right now I can do something like

from autograd import elementwise_grad
import numpy as np
def wrapper(args):
    return energy(*args)
grad = elementwise_grad(wrapper)
inp = [1000*np.random.rand(10,1), np.random.rand(1,10), np.random.rand(1,10), np.random.rand(1,10), np.random.rand(1,10), np.random.rand(1,10)]
inp = np.broadcast_arrays(*inp)
%time gres = grad(inp)
print([i.shape for i in gres])

CPU times: user 301 ms, sys: 18 ms, total: 319 ms
Wall time: 300 ms
[(10, 10), (10, 10), (10, 10), (10, 10), (10, 10), (10, 10)]

And so each element of the returned list is the gradient of the objective function, with respect to the variable at that argument index, for all 100 combinations of input variables. So what I want now is the Hessian of my objective function, elementwise, meaning if my gradient is a (6, 10, 10) array, I want back a (6, 6, 10, 10) array with all the second derivatives including cross terms. Is this possible to do with Autograd?
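One way to approximate this, assuming elementwise_grad accepts an argument index and the objective really is elementwise after broadcasting, is to nest elementwise_grad over pairs of argument indices; the energy function below is a stand-in for the real objective, not part of the original question.

import autograd.numpy as np
from autograd import elementwise_grad

def energy(a, b):                      # stand-in for the real objective
    return a ** 2 * b + np.sin(a * b)

args = (np.random.rand(10, 10), np.random.rand(10, 10))
n = len(args)

# hess[i, j, k, l] = d^2 energy / d args[i][k, l] d args[j][k, l]
hess = np.array([[elementwise_grad(elementwise_grad(energy, i), j)(*args)
                  for j in range(n)]
                 for i in range(n)])
print(hess.shape)                      # (2, 2, 10, 10)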

Numba support

Hi everyone,

I started using autograd, and it's pretty fantastic. The one drawback is that I've not really found a way to JIT the functions (in nopython mode) using numba when calculating the derivatives.

Is this something very difficult - or is it on the timeline to eventually add? Being able to get really fast forward and backwards evaluation would be pretty awesome.

grad is working, jacobian is not

First of all, autograd is an amazing tool! The grad function works great, but I am not able to make the jacobian function work. I am on Python 3.4.3, numpy 1.10.1 and autograd 1.1.1 (pip).

The call to the jacobian function works, but the function it returns does not accept my argument and throws a TypeError: The first input argument needs to be a sequence

The error seems to occur in the concatenate function, defined at line 44 in autograd/core.py (although I do believe this error is actually an error raised by numpy).

A minimum example to reproduce this:

import autograd.numpy as np
from autograd import grad
from autograd import jacobian

x = np.array([5, 3], dtype=float)

def cost(x):
    return x[0]**2 / x[1] - np.log(x[1])

grad_cost = grad(cost)
jaco_cost = jacobian(cost)

print(grad_cost(x)) # works correctly!
print(jaco_cost(x)) # error

Another unrelated question (I am not sure if this is the correct place for it): suppose I wrote the above cost function in a separate module that imports numpy by itself (not autograd.numpy). Is it possible to somehow import this function from that module directly, instead of copy-pasting the code? Right now I get the following AttributeError: 'FloatNode' object has no attribute 'log'

The import is as follows:

import autograd.numpy as np
from autograd import grad

#This module contains an 'import numpy as np' line
from my_module import cost

x = np.array([5, 3], dtype=float)
#Function from 'my_module' for which I'd like to calculate the gradient
grad_cost = grad(cost)
grad_cost(x) # error

If I remove the np.log call from my cost function the error disappears. So I guess I somehow need to override the 'regular' numpy import with the autograd numpy import, if possible.
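A sketch of the usual way around this: have my_module itself import autograd's NumPy wrapper, so every np call inside cost is traceable.

# my_module.py (sketch)
import autograd.numpy as np      # instead of `import numpy as np`

def cost(x):
    return x[0] ** 2 / x[1] - np.log(x[1])

With that change, from my_module import cost followed by grad(cost) traces through np.log; since autograd.numpy behaves like plain numpy on ordinary arrays, other users of the module should be unaffected.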

numpy.take and duplicate indices

Hi,

First of all, thanks for making autograd! It's been extremely useful.

I noticed though that np.take produces the wrong result if the index array contains duplicate values. For example,
quick_grad_check(lambda x: np.sum(x[[0,0]]), np.array([1.]))
fails with:
AssertionError: Check failed! nd=2.0, ad=1.0

You can fix it by changing a line in primitive_sum_arrays (in numpy_extra.py) from

new_array[array.idx] += array.val

to

numpy.add.at(new_array, array.idx, array.val)

I'm not sure if this error occurs elsewhere, but it seems to at least fix the problem in all of the cases that I've tried.
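The difference between those two lines is plain NumPy behavior, independent of autograd: fancy-index += buffers the update and drops repeated indices, while np.add.at accumulates them. A quick demonstration:

import numpy as np

out = np.zeros(3)
out[[0, 0]] += np.array([1.0, 1.0])
print(out)                               # [1. 0. 0.] -- the duplicate index is lost

out = np.zeros(3)
np.add.at(out, [0, 0], np.array([1.0, 1.0]))
print(out)                               # [2. 0. 0.] -- duplicates accumulate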

Assertion failure in Anaconda under Windows

neural_net.py and convnet.py have assertion errors in Anaconda Python under Windows 7 64bit (tested on latest Anaconda Python 2.2.0).

Traceback (most recent call last):
File "neural_net.py", line 117, in
quick_grad_check(loss_fun, W, (train_images, train_labels))
File "C:\Anaconda\lib\site-packages\autograd\util.py", line 98, in quick_grad_check
analytic_grad = np.sum(grad(fun)(arg0, *extra_args, **kwargs) * random_dir)
File "C:\Anaconda\lib\site-packages\autograd\core.py", line 15, in gradfun
return backward_pass(*forward_pass(fun,args,kwargs,argnum))
File "C:\Anaconda\lib\site-packages\autograd\core.py", line 56, in backward_pass
og = cast_to_node_type(gradfun(cur_outgrad), parent.node_type, parent.node_value)
File "C:\Anaconda\lib\site-packages\autograd\numpy\numpy_grads.py", line 346, in new_fun
assert anp.shape(result) == shape
AssertionError

from __future__ import division

Really nice package, thanks for all the great work.

When I try to use

from __future__ import division

it breaks. Any chance of fixing this? I think it would basically require implementing __truediv__ for a few classes.
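For concreteness, a minimal sketch of that suggestion; the Node class here is a stand-in, not autograd's actual implementation. On Python 2, a class that defines __div__ can support true division simply by aliasing it.

class Node(object):                      # stand-in, not autograd's Node
    def __init__(self, value):
        self.value = value
    def __div__(self, other):
        return Node(self.value / other)
    __truediv__ = __div__                # used under `from __future__ import division`
    def __rdiv__(self, other):
        return Node(other / self.value)
    __rtruediv__ = __rdiv__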

usage question of defgrad and gradmaker from numpy.divide

I'm trying to implement a layer for a conv network; here is my code:

class L2Normalize(Function):

    def forward(self, inputs):
        xp = cuda.get_array_module(*inputs)
        x, = inputs
        if type(x) is cuda.ndarray:
            Xnp = cuda.cupy.asnumpy(x)
            self._norm = xp.asarray(np.linalg.norm(Xnp, ord=2, axis=1), dtype=xp.float32)
        else:
            self._norm = np.linalg.norm(x, ord=2, axis=1)
        z = xp.divide(x, xp.expand_dims(self._norm, axis=1))
        return z,

    def backward(self, inputs, grad_outputs):
        xp = cuda.get_array_module(*inputs)
        x, = inputs
        gz, = grad_outputs
        gx = xp.expand_dims(gz/self._norm, axis=1) * x
        return gx,

I don't get how to calculate the gradient of np.divide. P.S.: read xp as np (numpy).

As I understand it, this code won't use autograd.numpy.divide's implemented gradient, so I should not count on the usual pattern:

>>> grad_tanh = grad(tanh)       # Obtain its gradient function
>>> grad_tanh(1.0)               # Evaluate the gradient at x = 1.0

i.e. something like

>>> divgrad = autograd.grad(np.divide)

would be overkill, no?
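For reference, the vector-Jacobian products of z = x / y are just calculus, independent of any framework. A sketch (divide_vjps is a hypothetical helper, and it ignores the extra summing needed when x and y broadcast to different shapes):

import numpy as np

def divide_vjps(g, x, y):
    # z = x / y;  dz/dx = 1 / y,  dz/dy = -x / y**2
    gx = g / y
    gy = -g * x / y ** 2
    return gx, gy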

Pickling grad functions

Hello! I'm attempting to use autograd gradient functions with Python's multiprocessing package. The package passes data to the processes it creates by pickling all the objects. When I try to pickle a grad func, I get:

PicklingError: Can't pickle <function grad.<locals>.gradfun at 0x10abaf048>: 
     attribute lookup grad_logp_wrt_argnum_0 on autograd.core failed

When trying to pickle the function, pickle needs to know about grad_logp_wrt_argnum_0, so it needs to be in the namespace. But I'm not sure how to get this object from autograd. It's cool if I need to find another way to do my parallel stuff, but I'm wondering if it is possible to make the grad functions pickleable.

Great package, this stuff is amazing.
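A common workaround, sketched here rather than an autograd feature: pickle only a top-level function and rebuild the gradient inside each worker, so the closure returned by grad never crosses the process boundary.

import multiprocessing
import autograd.numpy as np
from autograd import grad

def logp(theta):                 # top-level, so it pickles fine
    return -np.sum(theta ** 2)

def grad_logp(theta):            # reconstructs the grad function in the worker
    return grad(logp)(theta)

if __name__ == '__main__':
    pool = multiprocessing.Pool(2)
    print(pool.map(grad_logp, [np.ones(3), 2 * np.ones(3)]))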

[numpy 1.8.x] Array slices broken?

See the following test case:

import autograd.numpy as np
import autograd

def fn1(x):
    a = np.array([x[0], x[1]])
    b = np.array([x[2], x[3]])
    return np.dot(a, b)

f = autograd.value_and_grad(fn1)
print f(np.ones(4))

produces (2.0, array([ 1., 1., 1., 1.])), naturally.

The identical code using array slices for a and b:

def fn2(x):
    a = x[0:2]
    b = x[2:4]
    return np.dot(a, b)

f = autograd.value_and_grad(fn2)
print f(np.ones(4))

throws an exception in autograd/numpy/numpy_extra.pyc on line 123 in primitive_sum_arrays(*arrays), trying to call np.add.at. I haven't figured out which argument is the slice here.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-15a72619fe84> in <module>()
     16 
     17 f = autograd.value_and_grad(fn2)
---> 18 print f(np.ones(4))

/Library/Python/2.7/site-packages/autograd/convenience_wrappers.pyc in value_and_grad_fun(*args, **kwargs)
     45 
     46     def value_and_grad_fun(*args, **kwargs):
---> 47         gradval, val = gradval_and_val(*args, **kwargs)
     48         return val, gradval
     49 

/Library/Python/2.7/site-packages/autograd/convenience_wrappers.pyc in grad_and_aux_fun(*args, **kwargs)
     31             saved_aux.append(aux)
     32             return val
---> 33         gradval = grad(return_val_save_aux, argnum)(*args, **kwargs)
     34         return gradval, saved_aux[0]
     35 

/Library/Python/2.7/site-packages/autograd/core.pyc in gradfun(*args, **kwargs)
     15     the same type as the argument."""
     16     def gradfun(*args,**kwargs):
---> 17         return backward_pass(*forward_pass(fun,args,kwargs,argnum))
     18 
     19     try:

/Library/Python/2.7/site-packages/autograd/core.pyc in backward_pass(start_node, end_node, tape)
     52         node = tape.pop()
     53         if node.outgrads:
---> 54             cur_outgrad = node.sum_outgrads()
     55             assert type(new_node(getval(cur_outgrad))) == node.node_type, \
     56                 "Types are {0} and {1}".format(type(new_node(getval(cur_outgrad))), node.node_type)

/Library/Python/2.7/site-packages/autograd/core.pyc in sum_outgrads(self)
    154 
    155     def sum_outgrads(self):
--> 156         return self.node_type.sum_outgrads(self.outgrads)
    157 
    158 class Node(object):

/Library/Python/2.7/site-packages/autograd/numpy/numpy_extra.pyc in sum_outgrads(outgrads)
     46             return outgrads[0]
     47         else:
---> 48             return primitive_sum_arrays(*outgrads)
     49 
     50     @staticmethod

/Library/Python/2.7/site-packages/autograd/core.pyc in __call__(self, *args, **kwargs)
    111                         tapes.add(tape)
    112 
--> 113         result = self.fun(*argvals, **kwargs)
    114         if result is NotImplemented: return result
    115         if ops:

/Library/Python/2.7/site-packages/autograd/numpy/numpy_extra.pyc in primitive_sum_arrays(*arrays)
    121     for array in arrays:
    122         if isinstance(array, SparseArray):
--> 123             np.add.at(new_array, array.idx, array.val)
    124         else:
    125             new_array += array

TypeError: long() argument must be a string or a number, not 'slice'

Sherlock Holmes and the Missing Case of Int

$ python
>>> from funkyyak import grad
>>> (lambda x:x*x)(2.0)
4.0
>>> (lambda x:x*x)(2)
4
>>> grad(lambda x:x*x)(2.0)
4.0
>>> grad(lambda x:x*x)(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "funkyyak/core.py", line 9, in gradfun
    start_node = Node(args[argnum], tape)
  File "funkyyak/core.py", line 65, in __new__
    raise TypeError("Can't differentiate wrt {0}".format(type(value)))
TypeError: Can't differentiate wrt <type 'int'>

IndexError: list index out of range

After running lstm.py for a minute or two, this error comes up:

Generating text from LSTM model...
Traceback (most recent call last):
File "lstm.py", line 157, in
logprobs = pred_fun(trained_weights, seqs)[-1].ravel()
IndexError: list index out of range

*=, /= operators don't coerce double to complex

I am not so sure about the underlying cause; the description is my best guess. Came across this designing IIR filters... Not-totally-minimal test case:

import autograd.numpy as np
import autograd


def f1(x):
    z = np.exp(1j*np.linspace(0, 3, 10))
    p1 = np.exp(1j * x - 0.1)
    h = z + 1.0
    h /= (z - p1) * (z - np.conj(p1))
    return np.mean(np.abs(h))


def f2(x):
    z = np.exp(1j*np.linspace(0, 3, 10))
    p1 = np.exp(1j * x - 0.1)
    h = (z + 1.0) / ((z - p1) * (z - np.conj(p1)))
    return np.mean(np.abs(h))


print autograd.value_and_grad(f2)(0.1)
# Works OK, produces (13.965532859262613, -107.22571311244289)
print autograd.value_and_grad(f1)(0.1)
# Crashes with:

Traceback (most recent call last):
  File "cplx.py", line 21, in <module>
    print autograd.value_and_grad(f1)(0.1)
  File "/Users/andy/.py/lib/python2.7/site-packages/autograd/convenience_wrappers.py", line 47, in value_and_grad_fun
    gradval, val = gradval_and_val(*args, **kwargs)
  File "/Users/andy/.py/lib/python2.7/site-packages/autograd/convenience_wrappers.py", line 33, in grad_and_aux_fun
    gradval = grad(return_val_save_aux, argnum)(*args, **kwargs)
  File "/Users/andy/.py/lib/python2.7/site-packages/autograd/core.py", line 21, in gradfun
    return backward_pass(*forward_pass(fun,args,kwargs,argnum))
  File "/Users/andy/.py/lib/python2.7/site-packages/autograd/core.py", line 63, in forward_pass
    except Exception as e: add_extra_error_message(e)
  File "/Users/andy/.py/lib/python2.7/site-packages/autograd/core.py", line 392, in add_extra_error_message
    raise_(etype, value, traceback)
  File "/Users/andy/.py/lib/python2.7/site-packages/autograd/core.py", line 62, in forward_pass
    try: end_node = fun(*args, **kwargs)
  File "/Users/andy/.py/lib/python2.7/site-packages/autograd/convenience_wrappers.py", line 30, in return_val_save_aux
    val, aux = fun(*args, **kwargs)
  File "/Users/andy/.py/lib/python2.7/site-packages/autograd/convenience_wrappers.py", line 42, in double_val_fun
    val = fun(*args, **kwargs)
  File "cplx.py", line 9, in f1
    h /= (z - p1) * (z - np.conj(p1))
TypeError: ufunc 'divide' output (typecode 'O') could not be coerced to provided output parameter (typecode 'D') according to the casting rule ''same_kind''

Same thing happens with *= vs *.

Make this a package

Just needs a setup.py file. Something like this...

from distutils.core import setup
setup(name='FunkyYak',
      version='1.0',
      description='FunkyYak (Functional Kayak): Stateless reverse-mode autodiff implementation that also offers higher-order derivatives.',
      author='Harvard Intelligent Probabilistic Systems Group',
      packages=['funkyyak'])

TypeError: object of type 'ArrayNode' has no len()

Thanks for the extremely fast response to the issue raised yesterday! :D

I have another issue/request. In places where one could normally use len() on an ndarray, it fails on an ArrayNode.

To some extent I can work around this by using ArrayNode.shape[0], but I would like to use the function autograd.numpy.apply_along_axis(), and that function fails due to a call to len().

Einsum Second Argument

Pushing my luck. :-)

NotImplementedError: Gradient of w.r.t. arg number 2 not yet implemented

Bug with `np.diff` gradient for single-element input vectors

I've tried to debug this but to no avail:

The implementation of diff below agrees with np.diff, so it can be used in its place with autograd:

In [1]: from numpy.testing import assert_almost_equal
   ...: 
   ...: from autograd import elementwise_grad
   ...: import autograd.numpy as np

In [2]: def diff(x):
   ...:     D = np.diag(np.ones(x.size)) - np.diag(np.ones(x.size-1), k=-1)
   ...:     return np.dot(D, x)[1:]
   ...: 
   ...: 

In [3]: x = np.array([ 0.0667,  0.944 ,  1.42  ,  1.7165,  2.177 ])

In [4]: for end in range(x.size, 0, -1):
   ...:     assert_almost_equal(np.diff(x[:end]), diff(x[:end]))
   ...: 
In [5]: 

The definitions of the gradients should therefore be the same but differ when x is of size 1:

In [8]: g1 = elementwise_grad(np.diff)
   ...: g2 = elementwise_grad(diff)
   ...: for end in range(x.size, 0, -1):
   ...:     assert_almost_equal(g1(x[:end]), g2(x[:end]))
   ...: 
<snip>
AssertionError: 
Arrays are not almost equal to 7 decimals

(shapes (0,), (1,) mismatch)
 x: array([], dtype=float64)
 y: array([ 0.])

Looking more closely at the arguments:

In [9]: x = x[:1]
   ...: x
Out[9]: array([ 0.0667])

In [10]: np.diff(x)
Out[10]: array([], dtype=float64)

In [11]: g2(x)
Out[11]: array([ 0.])

In [12]: g1(x)
Out[12]: array([], dtype=float64)

So autograd's automatic differentiation of the diff function gives array([ 0.]) for a 1-element input vector, whereas the gradient of np.diff gives an empty array, array([], dtype=float64).

improve docs about complex function conventions

In this example, we would expect to see a non-zero imaginary gradient component; instead, it computes the real-valued gradient only and zeros out the imaginary portion.

import autograd
import autograd.numpy as numpy
import matplotlib.pyplot as plt
def f(x):
    return numpy.exp(1j*x);
fp = autograd.grad(f)
xx = numpy.arange(0,10,0.1)
y = map(lambda x: numpy.real(f(x)), xx)
yp = map(lambda x: numpy.real(fp(x)), xx)
plt.plot(xx,y)
plt.hold(True)
plt.plot(xx,yp,'r')
plt.title('complex gradient real component')

plt.figure()
y = map(lambda x: numpy.imag(f(x)), xx)
yp = map(lambda x: numpy.imag(fp(x)), xx)
plt.plot(xx,y)
plt.hold(True)
plt.plot(xx,yp,'r')
plt.title('complex gradient imag component')
plt.show()

test

Update PyPi

It looks like the PyPI install is a couple of versions behind. Could you push the latest? Thanks!

Error with `w.T.dot(x)`, correct if `np.dot(w, x)` is used

Why does Method 2 below (using np.dot) return the correct gradient, while Method 1 returns an array of zeros? Perhaps an exception should be thrown if the parser detects dot products of the form w.T.dot(x) instead of np.dot(w, x) if the latter format is really necessary.

import numpy as np
from autograd import grad
w = np.arange(5).astype('float')

# Method 1
f = lambda x: w.T.dot(x)
df = grad(f)
df(np.random.randn(5))
# >>> array([0., 0., 0., 0., 0.])

# Method 2
f = lambda x: np.dot(w, x)
df = grad(f)
df(np.random.randn(5))
# >>> array([0., 1., 2., 3., 4.])

Differentiating chart based dynamic programming algorithms

Consider the following code: it sums up the entries in y by keeping a running sum in the chart at x[0, 0] and finally outputs the value at chart position x[0, 0] when the loop finishes. This is a minimal working example; more complicated logic is used in algorithms like inside-outside for CKY parsing or forward-backward for finding the sum of all valid paths in a linear-chain CRF. I was wondering if it is possible to fix the library to get gradients for such chart-based algorithms?

import autograd.numpy as np
from autograd import grad
def assignment(y, x):
    for i in range(3):
        x[0, 0] = np.sum(np.array([y[i], x[0, 0]]))
    return x[0, 0]

def test_assignment():
    x = np.zeros((1,1))
    f = lambda y : assignment(y, x)
    y = np.random.rand(3,)
    print f(y)
    print grad(f)(y)
    return

if __name__ == "__main__":
    test_assignment()

>>> 0.530750805709
Traceback (most recent call last):
  File "test_autograd.py", line 35, in <module>
    test_assignment()
  File "test_autograd.py", line 31, in test_assignment
    print grad(f)(y)
  File "~/site-packages/autograd/core.py", line 16, in gradfun
    end_node = fun(*args, **kwargs)
  File "test_autograd.py", line 28, in <lambda>
    f = lambda y : assignment(y, x)
  File "test_autograd.py", line 23, in assignment
    x[0, 0] = np.max(np.array([y[i], x[0, 0]]))
TypeError: float() argument must be a string or a number
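A workaround sketch for the example above: keep the running sum in an ordinary Python variable, rebinding it each iteration instead of writing into the chart array, so no in-place array assignment is needed. Real chart algorithms can often keep their cells in a Python list or dict of nodes the same way.

import autograd.numpy as np
from autograd import grad

def assignment(y):
    running = 0.0
    for i in range(3):
        running = running + y[i]     # rebind, don't mutate an array in place
    return running

y = np.random.rand(3)
print(grad(assignment)(y))           # [1. 1. 1.]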

assertions in derivatives

Sometimes, if I have a function with assertions inside, and I take the gradient of the function, it breaks. For example, the following breaks for me:

import autograd.numpy as np
from autograd import grad

def foo(x):
    assert np.allclose(x, (x*3.0)/3.0)
    return np.sum(x)

g = grad(foo)
g(np.array([1.0,2.0,3.0]))

The assertion causes a

TypeError: Can't differentiate wrt <type 'numpy.bool_'>

Getting `TypeError: float() argument must be a string or a number`

First of all, great library! I'm finding it very useful for some projects that I am working on.
However, in some instances I am running into a TypeError in models where an array is being sliced or assigned as a block within a larger array (even though the functions are ultimately scalar-valued).

I can reproduce it with a stripped-down example:

https://gist.github.com/thearn/faba933208316d71cdb9

import autograd.numpy as np
from autograd import grad

def f(A):

    B = np.zeros((4,4))

    B[:2, :2] = A

    return B.sum()

A = np.random.randn(2,2)

df = grad(f)

print df(A)

# expected: [[ 1.,  1.],[ 1.,  1.]]

which gives the traceback:

Traceback (most recent call last):
  File "/Users/tristanhearn/Dropbox/code/adcomponent/src/adcomponent/test.py", line 17, in <module>
    print df(A)
  File "/Users/tristanhearn/Documents/thearn_repos/autograd/autograd/core.py", line 20, in gradfun
    return backward_pass(*forward_pass(fun,args,kwargs,argnum))
  File "/Users/tristanhearn/Documents/thearn_repos/autograd/autograd/core.py", line 61, in forward_pass
    end_node = fun(*args, **kwargs)
  File "/Users/tristanhearn/Dropbox/code/adcomponent/src/adcomponent/test.py", line 9, in f
    B[:2, :2] = A
TypeError: float() argument must be a string or a number

Is this expected (i.e. a known and accepted limitation)?
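Assuming in-place writes into array slices are indeed unsupported, a workaround sketch is to build B from concatenations instead of block assignment:

import autograd.numpy as np
from autograd import grad

def f(A):
    top = np.concatenate((A, np.zeros((2, 2))), axis=1)   # shape (2, 4)
    B = np.concatenate((top, np.zeros((2, 4))), axis=0)   # shape (4, 4)
    return B.sum()

A = np.random.randn(2, 2)
print(grad(f)(A))          # [[1. 1.], [1. 1.]]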

is gnumpy supported?

I read in the tutorial that GPU support is on the roadmap. However, there are also traces of GPU-specific code which refer to HIPS/gpu_numpy. Is autograd usable with gpu_numpy as-is?

Loving autograd so far! I've been playing around with porting a neural net library from Theano to autograd and it has been going extremely smoothly 👍

Array dimension expansion causes segfault

import numpy as np
from autograd import grad
def fun(x):
    return x[:, None].sum()
g = grad(fun)
g(np.random.randn(5))

Causes a seg fault. There are other ways to expand the dimensions of an array other than [:, None], but perhaps it should fail more gracefully in this case. I've also verified that it does the same thing with np.newaxis.
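In the meantime, a workaround sketch using autograd.numpy and reshape to add the trailing axis, which avoids the crashing indexing path:

import autograd.numpy as anp
from autograd import grad
import numpy as np

def fun(x):
    return anp.reshape(x, (-1, 1)).sum()

g = grad(fun)
print(g(np.random.randn(5)))    # [1. 1. 1. 1. 1.]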

[Doubt] Would iteratively setting a non-local variable as the answer result in correct gradients

Suppose I had a class method like:

def call(self, X):
    for t in range(X.shape[0]):
        self.ans = some_function(X[t], self.ans)       # some_function is a parameterized operation
        # some more computation with self.ans
        # final step has a scalar loss function

Would I get correct gradients of this whole process? Meaning: will the computation graph that autograd constructs store the value of self.ans per iteration and use it?

Python 3 compatibility

Autograd does not currently support Python 3... Is there any chance of a Py3-compatible version, or is this not a priority right now?

GPU Support through CuPy

Hi,

I think you have an awesome library here, but the lack of GPU support makes it a tough sell. It may be fairly straightforward to add GPU support to autograd using the CuPy library from Chainer (http://chainer.readthedocs.org/en/stable/index.html). CuPy is a numpy clone that runs on the GPU. In Chainer, you can write device-agnostic code by importing and using an 'xp' module, which can refer to either numpy or cupy. Since cupy has the same API as numpy, the code using xp does not have to know about GPUs and CPUs.

Perhaps you can do a similar thing for autograd: replace all 'np' in the autograd code by 'xp' and let the user decide which library to use. Do you see any problems with this approach? The cupy code is MIT licensed, so there should not be an issue there.

Best,
Taco
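For readers unfamiliar with the pattern, this is roughly what the device-agnostic 'xp' idiom looks like; a sketch only, since autograd itself does not provide this today:

try:
    import cupy as xp           # GPU path, if CuPy is installed
except ImportError:
    import numpy as xp          # CPU fallback

def softplus(a):
    # identical source runs on either backend
    return xp.log1p(xp.exp(a))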

scipy and hessian-vector-product

A very minor issue but thought I'd bring it up.

scipy.optimize.minimize expects hessian_vector_product(args, vector).

However, the hessian_vector_product in autograd returns a function like
hessian_vector_product(vector, args).

So for compatibility with scipy.optimize.minimize it would be more convenient to swap the order of the parameters.

But I am not so familiar with other optimizers in Python, and perhaps they expect the hessian_vector_product to be parametrized differently.
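Until then, a thin adapter works; a sketch assuming the (vector, args) ordering described above (the import path and ordering may differ across autograd versions, so treat this as illustrative):

from scipy.optimize import minimize
from autograd import grad, hessian_vector_product
import autograd.numpy as np

def loss(x):
    return np.sum(x ** 4 + x ** 2)

hvp = hessian_vector_product(loss)
res = minimize(loss, x0=np.ones(3), jac=grad(loss), method='Newton-CG',
               hessp=lambda x, p: hvp(p, x))   # scipy passes (x, p); forward as (vector, args)
print(res.x)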

How to compute partial derivative?

Hi All,
I need to compute partial derivatives for a function like the one below:

def func(x, y):
    return np.abs(x - y)

I need the derivative with respect to x and the derivative with respect to y.
Thanks, any suggestions will help.
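grad takes the index of the argument to differentiate with respect to as its second parameter, so the partial derivatives are just grad(func, 0) and grad(func, 1). A small sketch (note that np.abs is not differentiable at x == y):

import autograd.numpy as np
from autograd import grad

def func(x, y):
    return np.abs(x - y)

dfunc_dx = grad(func, 0)   # partial derivative with respect to x
dfunc_dy = grad(func, 1)   # partial derivative with respect to y

print(dfunc_dx(3.0, 1.0))  # 1.0
print(dfunc_dy(3.0, 1.0))  # -1.0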

Casting from float to int for the purpose of indexing can break things

This is a bit of a silly example, but numpy indexing requires integers; however, this ends up breaking autograd.

import numpy as np
import autograd.numpy as anp
from autograd import grad
def fun(W, inds):
    W = anp.concatenate((W, inds), axis=1)
    inds = W[:,-1]
    W = W[:,:-1]
    return W[anp.int64(inds)].sum()

W = np.random.randn(5, 10)
inds = np.ones(5)[:,None]
g = grad(fun)
g(W, inds)

produces TypeError: long() argument must be a string or a number, not 'FloatNode'

Support for `np.diff`

Hi, great project - very interesting! I'm investigating its use for evaluating gradients of Monte Carlo expectations. One issue I've run into is the lack of support for np.diff. Would it be difficult to add this? I've had a poke around but don't think I've got the knowledge/expertise to do so myself. Happy to give it a go if it's beginner-level and I can get some pointers...

On another note, is there a mailing-list or gitter chat room for autograd?

In [13]: def diff(x):
    ...:     return np.concatenate([x[:1], np.diff(x)])

In [14]: # gradient of dummy function which uses `diff`
    ...: g = grad(lambda x: np.sqrt(diff(x)).sum())

In [15]: g(randn(10))
Traceback (most recent call last):

  File "<ipython-input-15-acc7bd9d5211>", line 1, in <module>
    g(randn(10))

  File "C:\Anaconda3\lib\site-packages\autograd\core.py", line 21, in gradfun
    return backward_pass(*forward_pass(fun,args,kwargs,argnum))

  File "C:\Anaconda3\lib\site-packages\autograd\core.py", line 91, in backward_pass
    og = cast_to_node_type(gradfun(cur_outgrad), parent.node_type, parent.node_value)

  File "C:\Anaconda3\lib\site-packages\autograd\core.py", line 136, in error
    raise NotImplementedError(errstr.format(self.fun.__name__, argnum))

NotImplementedError: Gradient of diff not yet implemented.
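In the meantime, a workaround sketch: the same differences can be computed from slicing and subtraction, which autograd already knows how to differentiate.

import autograd.numpy as np
from autograd import grad

def diff(x):
    return x[1:] - x[:-1]        # same values as np.diff(x) for 1-D input

g = grad(lambda x: np.sqrt(np.concatenate([x[:1], diff(x)])).sum())
print(g(np.arange(1.0, 11.0)))   # increasing input keeps sqrt's argument positive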
