Code Monkey home page Code Monkey logo

Comments (10)

mattjj avatar mattjj commented on July 4, 2024 1

I think it works. Here's a cleaner elementwise_hess function for you, as well as a better test. (Of course, don't trust me, please test it yourself!)

from autograd import elementwise_grad
import autograd.numpy as np

def energy(args):
    return 8.314*args[0]*(args[1]*np.log(args[1]) + args[2]*np.log(args[2]))

gradfun = elementwise_grad(energy)
N = 100
M = 50

inp = [1000*np.random.rand(N,1), np.random.rand(1,M), np.random.rand(1,M)]
inp = np.broadcast_arrays(*inp)


gres = gradfun(inp)
print([i.shape for i in gres])

### here's the new stuff!!

from autograd import jacobian

def elementwise_hess(fun, argnum=0):
    sum_latter_dims = lambda x: np.sum(x.reshape(x.shape[0], -1), 1)
    def sum_grad_output(*args, **kwargs):
        return sum_latter_dims(elementwise_grad(fun)(*args, **kwargs))
    return jacobian(sum_grad_output, argnum)

# make inp an array so that jacobian() can handle it
array_inp = np.concatenate([arr[None,...] for arr in inp])

# check shapes on the given energy function
hessfun = elementwise_hess(energy)
print(hessfun(array_inp).shape)

# test on some scalar elements
from autograd import grad
hessval = hessfun(array_inp)
print grad(lambda x: grad(energy)(x)[1])(array_inp[:,8,42])
print hessval[:,1,8,42]

It prints something like this (different every time due to the pseudorandom numbers generated):

[(100, 50), (100, 50), (100, 50)]
(3, 3, 100, 50)
[   7.10510658  920.2143044     0.        ]
[   7.10510658  920.2143044     0.        ]

Please reopen the issue if it's not right!

from autograd.

mattjj avatar mattjj commented on July 4, 2024

Is the objective function scalar-valued? That is, is the call to energy returning a scalar (meaning you could use grad and not elementwise_grad) or is it returning an array of values? (I'm confused by 'the gradient of the objective function with respect to the variable at that argument index', since that sounds more like ∂f/∂xi than ∂fi/∂xi.)

EDIT: I was confused here because I never thought of using elementwise_grad for broadcasting support!

A runnable example (i.e. a definition for energy) along with a fixed input and desired output (maybe with fewer dimensions, or at least smaller ones) would also help!

from autograd.

mattjj avatar mattjj commented on July 4, 2024

My guess is that energy is a six-argument function that broadcasts across arrays (in this case, effectively six 10x10 arrays). Is that right?

from autograd.

richardotis avatar richardotis commented on July 4, 2024

Sorry, here's a minimal working example. energy is a scalar-valued function which accepts some number of arguments, and we would like to broadcast some arguments against each other. For example, imagine we have 100 selected temperatures and 50 selected compositions, and we would like to compute the energy at all 100 x 50 = 5000 combinations, as well as the gradient and Hessian at each point. (This generalizes: we may add a pressure or other axes as well.)

So if I pass L arguments which broadcast to arrays of size (N, M, ...), I want my gradient to be an array of size (L, N, M, ...), and I want my Hessian to be an array of size (L, L, N, M, ...).

from autograd import elementwise_grad
import autograd.numpy as anp
import numpy as np
def energy(args):
    return 8.314*args[0]*(args[1]*anp.log(args[1]) + args[2]*anp.log(args[2]))
grad = elementwise_grad(energy)
N = 100
M = 50
inp = [1000*np.random.rand(N,1), np.random.rand(1,M), np.random.rand(1,M)]
inp = np.broadcast_arrays(*inp)
%time gres = grad(inp)
print([i.shape for i in gres])
CPU times: user 1 ms, sys: 0 ns, total: 1 ms
Wall time: 1.55 ms
[(100, 50), (100, 50), (100, 50)]

from autograd.

mattjj avatar mattjj commented on July 4, 2024

I'm kinda wondering if this works (based on your earlier example, not the most recent version):

from autograd import elementwise_grad, jacobian
import autograd.numpy as np

def wrapper(args):
    return sum(a**2 for a in args)

inp = [1000*np.random.rand(10,1), np.random.rand(1,10), np.random.rand(1,10), np.random.rand(1,10), np.random.rand(1,10), np.random.rand(1,10)]
inp = np.broadcast_arrays(*inp)
array_inp = np.concatenate(map(lambda x: np.expand_dims(x, 0), inp))

print(array_inp.shape)
gres = elementwise_grad(wrapper)(array_inp)
print(gres.shape)
hess = jacobian(lambda x: elementwise_grad(wrapper)(x).sum(2).sum(1))(array_inp)
print(hess.shape)

It attempts to use the same trick as in elementwise_grad to support broadcasting (i.e. summing across dimensions effectively broadcasts the gradient). It has the right shape at least!

I'll try to work out a test case to see if it's actually producing the right values. Let me know if you figure one out first.

from autograd.

richardotis avatar richardotis commented on July 4, 2024

@mattjj Fantastic user support! :) It's my bedtime but I'm pleased to see a possible solution. I'll give it a shot tomorrow and update this issue with what I find.

from autograd.

mattjj avatar mattjj commented on July 4, 2024

Something like "broadcasting_hess" would be a better name.

from autograd.

richardotis avatar richardotis commented on July 4, 2024

So I haven't analytically verified anything yet, but the Hessians I'm computing with your method are symmetric and the correct matrix elements are zero. I'm going to wire it up to my existing minimization routines and see how it works.

I like "elementwise_hess" because it's makes it clear it's the counterpart to the "elementwise_grad" function -- if I was browsing your documentation I'd assume they had similar functions. (At the very least, if you rename one you should rename both.)

from autograd.

richardotis avatar richardotis commented on July 4, 2024

The other thing that pleases me is that both elementwise_grad and elementwise_hess correctly handle the case when inputs are "disconnected", i.e., an input doesn't actually show up in the graph. For example, in principle energy is a function of pressure but usually I don't have pressure-dependent terms because the model is incomplete. For completeness the user still has to specify a pressure no matter what, and it's nice to not have to special-case anything for zeros to get correctly returned and broadcast. This was an annoying case to deal with in sympy/sympy#9725 so I'm pleased I can change less than 10 lines of code to get robust back-propagation working in my use case.

from autograd.

richardotis avatar richardotis commented on July 4, 2024

Actually the above turned out not to be entirely true. There's still a broadcasting hangup but it's pretty easily solved. I'm working on a code writeup which I'll share after I've kicked the tires a bit.

from autograd.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.