Comments (10)
I think it works. Here's a cleaner elementwise_hess function for you, as well as a better test. (Of course, don't trust me, please test it yourself!)
from autograd import elementwise_grad
import autograd.numpy as np
def energy(args):
    return 8.314*args[0]*(args[1]*np.log(args[1]) + args[2]*np.log(args[2]))
gradfun = elementwise_grad(energy)
N = 100
M = 50
inp = [1000*np.random.rand(N,1), np.random.rand(1,M), np.random.rand(1,M)]
inp = np.broadcast_arrays(*inp)
gres = gradfun(inp)
print([i.shape for i in gres])
### here's the new stuff!!
from autograd import jacobian
def elementwise_hess(fun, argnum=0):
    # Summing the per-element gradient over all broadcast dimensions before
    # taking the jacobian recovers the per-element second derivatives: each
    # output element depends only on the matching input elements, so the
    # cross-terms that the sum would otherwise mix in are all zero.
    sum_latter_dims = lambda x: np.sum(x.reshape(x.shape[0], -1), 1)
    def sum_grad_output(*args, **kwargs):
        return sum_latter_dims(elementwise_grad(fun)(*args, **kwargs))
    return jacobian(sum_grad_output, argnum)
# make inp an array so that jacobian() can handle it
array_inp = np.concatenate([arr[None,...] for arr in inp])
# check shapes on the given energy function
hessfun = elementwise_hess(energy)
print(hessfun(array_inp).shape)
# test on some scalar elements
from autograd import grad
hessval = hessfun(array_inp)
print(grad(lambda x: grad(energy)(x)[1])(array_inp[:,8,42]))
print(hessval[:,1,8,42])
It prints something like this (the exact values differ from run to run because the inputs are pseudorandom):
[(100, 50), (100, 50), (100, 50)]
(3, 3, 100, 50)
[ 7.10510658 920.2143044 0. ]
[ 7.10510658 920.2143044 0. ]
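For extra confidence, the second derivatives of this particular energy have closed forms, so the whole Hessian can be checked directly. Here's a quick sketch of my own (not part of the original test), assuming the snippet above has already run:
# analytic Hessian of f = R*T*(x1*log(x1) + x2*log(x2)) with R = 8.314
R = 8.314
T, x1, x2 = array_inp
analytic = np.zeros((3, 3) + T.shape)
analytic[0, 1] = analytic[1, 0] = R*(np.log(x1) + 1)  # d2f/dT dx1
analytic[0, 2] = analytic[2, 0] = R*(np.log(x2) + 1)  # d2f/dT dx2
analytic[1, 1] = R*T/x1                               # d2f/dx1 dx1
analytic[2, 2] = R*T/x2                               # d2f/dx2 dx2
# remaining entries (d2f/dT dT, d2f/dx1 dx2) are identically zero
print(np.allclose(hessval, analytic))                 # should print True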
Please reopen the issue if it's not right!
Is the objective function scalar-valued? That is, is the call to energy returning a scalar (meaning you could use grad and not elementwise_grad) or is it returning an array of values? (I'm confused by 'the gradient of the objective function with respect to the variable at that argument index', since that sounds more like ∂f/∂xᵢ than ∂fᵢ/∂xᵢ.)

EDIT: I was confused here because I never thought of using elementwise_grad for broadcasting support!

A runnable example (i.e. a definition for energy) along with a fixed input and desired output (maybe with fewer dimensions, or at least smaller ones) would also help!
My guess is that energy is a six-argument function that broadcasts across arrays (in this case, effectively six 10x10 arrays). Is that right?
Sorry, here's a minimal working example. energy is a scalar-valued function which accepts some number of arguments, and we would like to broadcast some arguments against each other. For example, imagine we have 100 selected temperatures and 50 selected compositions, and we would like to compute the energy at all 100 x 50 = 5000 combinations, as well as the gradient and Hessian at each point. (This generalizes: we may add a pressure or other axes as well.)
So if I pass L arguments which broadcast to arrays of size (N, M, ...), I want my gradient to be an array of size (L, N, M, ...), and I want my Hessian to be an array of size (L, L, N, M, ...).
from autograd import elementwise_grad
import autograd.numpy as anp
import numpy as np
def energy(args):
    return 8.314*args[0]*(args[1]*anp.log(args[1]) + args[2]*anp.log(args[2]))
grad = elementwise_grad(energy)
N = 100
M = 50
inp = [1000*np.random.rand(N,1), np.random.rand(1,M), np.random.rand(1,M)]
inp = np.broadcast_arrays(*inp)
%time gres = grad(inp)
print([i.shape for i in gres])
CPU times: user 1 ms, sys: 0 ns, total: 1 ms
Wall time: 1.55 ms
[(100, 50), (100, 50), (100, 50)]
I'm kinda wondering if this works (based on your earlier example, not the most recent version):
from autograd import elementwise_grad, jacobian
import autograd.numpy as np
def wrapper(args):
    return sum(a**2 for a in args)
inp = [1000*np.random.rand(10,1), np.random.rand(1,10), np.random.rand(1,10), np.random.rand(1,10), np.random.rand(1,10), np.random.rand(1,10)]
inp = np.broadcast_arrays(*inp)
array_inp = np.concatenate([np.expand_dims(x, 0) for x in inp])  # a list, not map(), so this also works on Python 3
print(array_inp.shape)   # (6, 10, 10)
gres = elementwise_grad(wrapper)(array_inp)
print(gres.shape)        # (6, 10, 10)
hess = jacobian(lambda x: elementwise_grad(wrapper)(x).sum(2).sum(1))(array_inp)
print(hess.shape)        # (6, 6, 10, 10)
It attempts to use the same trick as in elementwise_grad to support broadcasting (i.e. summing across dimensions effectively broadcasts the gradient). It has the right shape at least!
I'll try to work out a test case to see if it's actually producing the right values. Let me know if you figure one out first.
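For intuition on why the summing trick is valid: in an elementwise map each output element depends only on the matching input element, so the cross-terms in the Jacobian vanish and differentiating the summed output recovers the per-element derivative. A tiny sanity check of that claim (my sketch, not from this thread):
from autograd import grad, elementwise_grad
import autograd.numpy as np
x = np.random.rand(5)
f = lambda x: x**3
# the gradient of the scalar sum(f(x)) equals the per-element derivative f'(x)
print(np.allclose(grad(lambda x: np.sum(f(x)))(x), elementwise_grad(f)(x)))  # True
print(np.allclose(elementwise_grad(f)(x), 3*x**2))                           # True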
@mattjj Fantastic user support! :) It's my bedtime but I'm pleased to see a possible solution. I'll give it a shot tomorrow and update this issue with what I find.
Something like "broadcasting_hess" would be a better name.
So I haven't analytically verified anything yet, but the Hessians I'm computing with your method are symmetric and the correct matrix elements are zero. I'm going to wire it up to my existing minimization routines and see how it works.
I like "elementwise_hess" because it's makes it clear it's the counterpart to the "elementwise_grad" function -- if I was browsing your documentation I'd assume they had similar functions. (At the very least, if you rename one you should rename both.)
The other thing that pleases me is that both elementwise_grad and elementwise_hess correctly handle the case when inputs are "disconnected", i.e., when an input doesn't actually show up in the graph. For example, in principle energy is a function of pressure, but usually I don't have pressure-dependent terms because the model is incomplete. For completeness the user still has to specify a pressure no matter what, and it's nice not to have to special-case anything for zeros to get correctly returned and broadcast. This was an annoying case to deal with in sympy/sympy#9725, so I'm pleased I can change fewer than 10 lines of code to get robust back-propagation working in my use case.
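A minimal illustration of that disconnected-input behavior in the simple non-broadcast case (my own sketch; the fourth argument standing in for a hypothetical pressure is not in the original code):
from autograd import elementwise_grad
import autograd.numpy as np
def energy(args):
    # args[3] is a "pressure" the model never touches
    return 8.314*args[0]*(args[1]*np.log(args[1]) + args[2]*np.log(args[2]))
inp = np.random.rand(4, 10) + 0.1  # keep the log arguments positive
g = elementwise_grad(energy)(inp)
print(g.shape)  # (4, 10)
print(g[3])     # all zeros: the unused input gets a correctly shaped zero gradient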
Actually the above turned out not to be entirely true. There's still a broadcasting hangup but it's pretty easily solved. I'm working on a code writeup which I'll share after I've kicked the tires a bit.