raskr / rust-autograd

Tensors and differentiable operations (like TensorFlow) in Rust

License: MIT License

Rust 100.00%

rust tensor automatic-differentiation autograd neural-networks deep-learning machine-learning

rust-autograd's Introduction

autograd

Tensors and differentiable operations backed by ndarray.

Cargo.toml

If you use basic linear algebra operations, especially matrix multiplication, enable the blas feature to speed them up.

[dependencies]
autograd = { version = "<version>", features = ["blas", "<blas-implementation-choice>"] }

<blas-implementation-choice> must be one of the following (see also blas-src):

accelerate macOS only
intel-mkl Intel/AMD CPU only. Includes Vector Mathematics (VM) ops
openblas

Features

Reverse-mode automatic differentiation

Here we are just computing partial derivatives of z = 2x^2 + 3y + 1.

use autograd as ag;
use ag::tensor_ops::*;

ag::run(|ctx: &mut ag::Context<_>| {
   let x = ctx.placeholder("x", &[]);
   let y = ctx.placeholder("y", &[]);
   let z = 2.*x*x + 3.*y + 1.;

   // dz/dy
   let gy = &grad(&[z], &[y])[0];
   println!("{:?}", gy.eval(ctx));   // => Ok(3.)

   // dz/dx (requires feeding the placeholder `x`)
   let gx = &grad(&[z], &[x])[0];
   let feed = ag::ndarray::arr0(2.);
   println!("{:?}", ctx.evaluator().push(gx).feed(x, feed.view()).run()[0]);  // => Ok(8.)

   // d^2z/dx^2 (differentiates `gx` again with respect to `x`)
   let ggx = &grad(&[gx], &[x])[0];
   println!("{:?}", ggx.eval(ctx));  // => Ok(4.)
});
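
For reference, the values printed above follow from differentiating by hand: dz/dx = 4x (8 at x = 2), dz/dy = 3, and d^2z/dx^2 = 4. The following is a minimal standalone sketch, not part of the crate, that cross-checks those numbers with finite differences. The function names and the step size `h` are illustrative assumptions.

```rust
/// The function from the example above: z = 2x^2 + 3y + 1.
fn z(x: f64, y: f64) -> f64 {
    2.0 * x * x + 3.0 * y + 1.0
}

/// Central-difference approximation of dz/dx at (x, y).
fn dz_dx(x: f64, y: f64, h: f64) -> f64 {
    (z(x + h, y) - z(x - h, y)) / (2.0 * h)
}

/// Central-difference approximation of dz/dy at (x, y).
fn dz_dy(x: f64, y: f64, h: f64) -> f64 {
    (z(x, y + h) - z(x, y - h)) / (2.0 * h)
}

/// Second-order central difference approximating d^2z/dx^2 at (x, y).
fn d2z_dx2(x: f64, y: f64, h: f64) -> f64 {
    (z(x + h, y) - 2.0 * z(x, y) + z(x - h, y)) / (h * h)
}
```

Reverse-mode differentiation computes these derivatives exactly rather than by perturbation; the finite differences here only serve as a sanity check.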

Neural networks

This crate has various low-level features inspired by TensorFlow/Theano for training neural networks. Since computation graphs require only a bare minimum of heap allocations, the overhead is small, even for complex networks.

// MNIST digits classification with multi-layer-perceptron
use autograd as ag;
use ag::optimizers::adam::Adam;
use ag::tensor_ops::*;
use ag::prelude::*;

let mut env = ag::VariableEnvironment::new();

let rng = ag::ndarray_ext::ArrayRng::<f32>::default();

// Register variables in this env.
env.name("w").set(rng.glorot_uniform(&[28 * 28, 10]));
env.name("b").set(ag::ndarray_ext::zeros(&[1, 10]));

let adam = Adam::default("my_adam", env.default_namespace().current_var_ids(), &mut env);

for epoch in 0..3 {  // 0.11 sec/epoch on 2.7GHz Intel Core i5
   env.run(|ctx| {
       let x = ctx.placeholder("x", &[-1, 28*28]);
       let y = ctx.placeholder("y", &[-1]);
       let w = ctx.variable("w");
       let b = ctx.variable("b");
       let z = matmul(x, w) + b;
       let mean_loss = reduce_mean(sparse_softmax_cross_entropy(z, &y), &[0], false);
       let grads = &grad(&[mean_loss], &[w, b]);

       // let mut feeder = ag::Feeder::new();
       // feeder.push(x, x_batch).push(y, y_batch);
       // adam.update(&[w, b], grads, ctx, feeder);
   });
}

Abstractions

use autograd as ag;
use ag::tensor_ops::*;
use ag::ndarray;

// `Tensor::map()`
ag::run(|ctx| {
    let x = ones(&[2, 3], ctx);
    // apply ndarray's methods
    let y = x.map(|x| x.fold_axis(ndarray::Axis(0), 0.0, |acc, x| acc + x));
    let z = x.map(|x| ag::ndarray_ext::zeros(x.shape()));
});

// Hooks
ag::run(|ctx| {
    let x: ag::Tensor<f32> = ones(&[2, 3], ctx).show_shape();
    let y: ag::Tensor<f32> = ones(&[2, 3], ctx).raw_hook(|x| println!("{}", x));
});

For details, see the documentation or the examples.

rust-autograd's Issues

Segfaulting on matrix multiplication with MKL

Matrix multiplication can cause a segfault when using MKL. Here is a snippet:

let W = ag::variable(ag::ndarray_ext::glorot_uniform(&[(output_size as usize)*4, input_size as usize]));
let s = ag::slice(&W, &[0, 0], &[output_size, -1]);
let r = ag::matmul(&s, &x);

In the above example, x has dimensions [input_size, 1] and s has dimensions [output_size, input_size], so the matrix multiplication is well-formed. The example runs with no issues without the mkl feature, but with mkl the eval step segfaults. I analyzed it with valgrind, and it seems that a few extra writes are happening somewhere, resulting in the segfault.

`argmax` with multiple maximum values

The library is working really well so far 🎉 I am just seeing some odd behaviour with ag::argmax(). I am passing in an array of 4 elements, yet sometimes argmax returns 4 or 5, which is out of range. I captured some inputs below.

let ref x = ag::placeholder(&[-1, 4]);
let ref a = ag::argmax(&x, 1, false);

let ref x_val = array![[0.262881, 0.2613406, 0.26295722, 0.26295722]].into_dyn();
let ref feed = [(x, x_val)];
println!("{}", a.eval(feed));

Outputs

[5]

And in python

>>> import numpy as np
>>> np.argmax([0.262881, 0.2613406, 0.26295722, 0.26295722])
2
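
For comparison, a tie-safe argmax can be sketched as follows. This is a standalone illustration of the NumPy-style behaviour shown above (the first maximal index wins), not the crate's implementation; the function name `argmax_first` is an assumption.

```rust
/// Returns the index of the first maximal element, or `None` for an empty slice.
/// NaN values are not treated specially in this sketch.
fn argmax_first(xs: &[f32]) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &x) in xs.iter().enumerate() {
        match best {
            // A candidate must be strictly greater to replace the current
            // best, so ties keep the earliest index.
            Some((_, b)) if x <= b => {}
            _ => best = Some((i, x)),
        }
    }
    best.map(|(i, _)| i)
}
```

Because the result is always an index produced by `enumerate`, it can never fall outside the input, regardless of how many times the maximum repeats.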

Usage of unsafe?

Hi,

I am evaluating autograd for our project. I really like the concept, but I'm a bit concerned about the use of unsafe and some issues that come with it.

For example:

lib.rs:243, creates a new typed Vec<T>, and calls set_len. Although the function is marked unsafe, according to the documentation the elements must be initialized before calling this function, so it is borderline UB.
lib.rs:253, reads and casts any pointer as another type and is unsound. In contrast to the documentation it would not panic, but just invoke UB (you could cast_as an &u8 to u32).
mlp_mnist.rs:171, (and others) transmutes a [u8; 4] to a u32. I think this can cause endian issues (although I'm not 100% sure here as you load that from disc and later explicitly force a be conversion)
Then there are a few follow ups, e.g., where uninitialized_vec is used to create a reference to an uninitialized value.

I was wondering if you have an "unsafe roadmap" moving forward, and / or have plans to review the current use of unsafe in the code?
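
As an illustration of the kind of change being asked about, two of the patterns listed above have safe standard-library counterparts. This is a sketch only: the helper names are hypothetical, and whether they fit the crate's performance goals is not something the report establishes.

```rust
/// Decodes a big-endian `u32` from four bytes without `transmute`,
/// independent of the host's endianness.
fn read_be_u32(bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(bytes)
}

/// Allocates an initialised buffer instead of calling `set_len` on
/// uninitialised memory; every element holds `T::default()`.
fn zeroed_vec<T: Default + Clone>(len: usize) -> Vec<T> {
    vec![T::default(); len]
}
```

For example, `read_be_u32([0, 0, 8, 3])` yields 2051 on any platform, whereas a `transmute` of the same bytes depends on the host byte order.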

"unreachable code" panic on certain uses of `grad_with_default`

I ran into this panic during actual usage, then reduced it to these tests (borrowing test code from test_tensor_ops_grad.rs). It seems like maybe it's trying to take the gradient of a variable or placeholder when it shouldn't have to because grad_with_default provides the gradient explicitly...

#[test]
fn grad_with_default_variable() {
    let mut ctx = ag::VariableEnvironment::new();
    let v1 = ctx.slot().set(ndarray::arr1(&[1., 1., 1.]));
    ctx.run(|graph| {
        let v1 = graph.variable(v1);
        let g = T::grad_with_default(&[v1], &[v1], &[v1]);
        ag::test_helper::check_theoretical_grads(
            v1,
            g.as_slice(),
            &[v1],
            ag::Feeder::new(),
            1e-3,
            1e-3,
            graph,
        );
    });
}

#[test]
fn grad_with_default_placeholder() {
    ag::run(|ctx| {
        let mut eval = ctx.evaluator();
        let v1 = ctx.placeholder("v1", &[3]);
        let v1_val = ndarray::arr1(&[1., 1., 1.]);
        eval.feed("v1", v1_val.view());
        let g = T::grad_with_default(&[v1], &[v1], &[v1]);
        eval.extend(&g).run();
    });
}


---- test_tensor_ops_grad::add_n_single_placeholder stdout ----
thread 'test_tensor_ops_grad::add_n_single_placeholder' panicked at 'internal error: entered unreachable code', /n/rust-autograd/src/tensor_ops/basic_source_ops.rs:23:1
stack backtrace:
   0: rust_begin_unwind
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:142:14
   2: core::panicking::panic
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:48:5
   3: <autograd::tensor_ops::basic_source_ops::Placeholder as autograd::op::Op<T>>::grad
             at ./src/tensor_ops/basic_source_ops.rs:15:17
   4: autograd::op::GradientContext<T>::compute_input_grads
             at ./src/op.rs:371:9
   5: autograd::gradient::compute_gradients
             at ./src/gradient.rs:60:23
   6: autograd::tensor_ops::grad_with_default
             at ./src/tensor_ops/mod.rs:142:21
   7: lib::test_tensor_ops_grad::add_n_single_placeholder::{{closure}}
             at ./tests/test_tensor_ops_grad.rs:83:17
   8: autograd::graph::run
             at ./src/graph.rs:99:5
   9: lib::test_tensor_ops_grad::add_n_single_placeholder
             at ./tests/test_tensor_ops_grad.rs:78:5
  10: lib::test_tensor_ops_grad::add_n_single_placeholder::{{closure}}
             at ./tests/test_tensor_ops_grad.rs:77:1
  11: core::ops::function::FnOnce::call_once
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248:5
  12: core::ops::function::FnOnce::call_once
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

---- test_tensor_ops_grad::add_n_single_variable stdout ----
thread 'test_tensor_ops_grad::add_n_single_variable' panicked at 'internal error: entered unreachable code', /n/rust-autograd/src/tensor_ops/basic_source_ops.rs:21:1
stack backtrace:
   0: rust_begin_unwind
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:142:14
   2: core::panicking::panic
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:48:5
   3: <autograd::tensor_ops::basic_source_ops::Variable as autograd::op::Op<T>>::grad
             at ./src/tensor_ops/basic_source_ops.rs:15:17
   4: autograd::op::GradientContext<T>::compute_input_grads
             at ./src/op.rs:371:9
   5: autograd::gradient::compute_gradients
             at ./src/gradient.rs:60:23
   6: autograd::tensor_ops::grad_with_default
             at ./src/tensor_ops/mod.rs:142:21
   7: lib::test_tensor_ops_grad::add_n_single_variable::{{closure}}
             at ./tests/test_tensor_ops_grad.rs:63:17
   8: autograd::variable::VariableEnvironment<F>::run
             at ./src/variable.rs:697:9
   9: lib::test_tensor_ops_grad::add_n_single_variable
             at ./tests/test_tensor_ops_grad.rs:61:5
  10: lib::test_tensor_ops_grad::add_n_single_variable::{{closure}}
             at ./tests/test_tensor_ops_grad.rs:58:1
  11: core::ops::function::FnOnce::call_once
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248:5
  12: core::ops::function::FnOnce::call_once
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Gradient error for tensor of different dimensions

Hi, while developing my neural network algorithm, I found that computing the gradient of an operation between two tensors with different numbers of dimensions always fails. When I try to print the gradient, I get panics such as thread 'main' panicked at 'called Result::unwrap() on an Err value: and thread 'main' panicked at 'called Option::unwrap() on a None value:.
One example:

let t1  = g.constant( array![[1.0,2.0, 3.0, 4.0], [1.0,2.0, 3.0, 4.0]] );
let t2  = g.constant( array![1.0,2.0, 3.0, 4.0] );
let cal = t1/t2;
let grads = g.grad(&[cal], &[t2]);

This affects many of the expressions I use during development.
After some testing, I found that expanding t2's dimensions works around the problem:

...
let t3 = g.tile(g.expand_dims(t2, &[0]), 0, 2);
let cal = t1/t3;
...

However, this workaround takes considerably more effort, since methods like reduce_sum are very commonly used.
Do you have any other idea for solving this problem?
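
For context, the gradient with respect to a broadcast operand is normally obtained by summing the per-element gradients over the broadcast axis, so that it has the operand's own shape. The following is a standalone sketch of that rule for the division above, assuming the output is summed to a scalar; it does not use the crate, and the function name is hypothetical.

```rust
/// Gradient of sum(t1 / t2) with respect to `t2`, where `t1` has shape
/// [rows, cols] and `t2` has shape [cols] and is broadcast across rows.
fn grad_div_wrt_broadcast_rhs(t1: &[Vec<f64>], t2: &[f64]) -> Vec<f64> {
    let mut grad = vec![0.0; t2.len()];
    for row in t1 {
        assert_eq!(row.len(), t2.len(), "each row must match t2's length");
        for (j, (&a, &b)) in row.iter().zip(t2).enumerate() {
            // d(a / b)/db = -a / b^2; contributions from every row are
            // summed because the same t2[j] is reused in each row.
            grad[j] += -a / (b * b);
        }
    }
    grad
}
```

With the values from the example, each column j contributes twice, giving -2 / t2[j], which has the shape of t2 rather than the shape of t1.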

Have a problem with build multilayer perceptron

Hi. I tried to build a simple perceptron with more than one hidden layer, but its accuracy seems to be very low. For the first 10 epochs it shows some progress, but then it gets stuck at an error of ~0.585, which is very large for a task that predicts a value between 0.0 and 1.0.
Could you please take a look and tell me what I did wrong?
Here it is: https://gist.github.com/inferrna/b1b49b9d0e161a5104670ecb7870e19a

Wrong docstrings

https://github.com/raskr/rust-autograd/blob/master/src/optimizers/momentum_sgd.rs

Negative index in `slice` are off by one

When using slice with negative end indices, the result is off by one.

let a: ag::Tensor<f32> = g.zeros(&[4, 4]);
let b = g.slice(a, &[0, 0], &[-2, 2]); // numpy equivalent is a[:-1, 0:2]

assert_eq!(b.eval(&[]).unwrap().shape(), &[3, 2]);

The assert fails because the shape of b is [2, 2].
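
The expectation in this report implies a convention in which an end index of -1 means "through the last element", so -2 stops one element earlier. That reading is inferred from the report and from the `-1` used in the MKL snippet earlier on this page, not from the crate's documentation. Under that assumption, resolving an end index could be sketched as follows (the function name is hypothetical):

```rust
/// Resolves a possibly negative, exclusive end index for an axis of length
/// `len`, assuming -1 means "through the last element" (so -1 -> len).
/// Returns `None` when the resolved index falls outside 0..=len.
fn resolve_end(end: isize, len: usize) -> Option<usize> {
    let len = len as isize;
    let resolved = if end < 0 { len + 1 + end } else { end };
    if (0..=len).contains(&resolved) {
        Some(resolved as usize)
    } else {
        None
    }
}
```

For an axis of length 4, an end of -2 resolves to 3, which matches the expected shape [3, 2] in the assertion above.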

Segmentation fault on multiple calls to conv2d.

When trying to write a predict function involving conv2d (similar to cnn_mnist), I see segmentation faults after around 21 calls.
Is this a bug, or is there a better way to make predictions?

Minimum code to reproduce:

use autograd as ag; 
use ag::ndarray_ext as array;
use ag::tensor::Variable;

fn main() {
    let rng = array::ArrayRng::<f32>::default();
    let w1_arr = array::into_shared(rng.random_normal(&[16, 1, 3, 3], 0., 0.5));
    let b1_arr = array::into_shared(array::zeros(&[1, 16, 8, 8]));

    ag::with(|g| {
        let rng1 = array::ArrayRng::<f32>::default();
        let w1 = g.variable(w1_arr.clone());

        let b1 = g.variable(b1_arr.clone());
        for i in 0..100 {
            println!("Calling pred: {}", i);
            let x = g.variable(rng1.glorot_uniform(&[8, 8]));
            println!("Input value: {:?}", x.eval(&[]));
            let _ = g.conv2d(x, w1, 1, 1) + b1;
        }
    })
}

Dead PDF link in source

rust-autograd/src/ops/mod.rs

Line 1490 in 6c769fb

/// See http://web.stanford.edu/~awni/papers/relu_hybrid_icml2013_final.pdf

Use of 'extern crate' in examples

All the examples in the repo use extern crate autograd or extern crate ndarray, which has been unnecessary and discouraged since the Rust 2018 edition.

It should be a simple fix to update these in all instances.

Bug for `g.argmax`

Hi, I am using the argmax method and ran into a strange situation.

let test = g.constant(array![85.0, 16.0, 0.04, 85.0, 16.0, 85.0]);
let max_dis_index = g.argmax(test, 0, false).show_with("max_dis_index is");
g.eval(&[max_dis_index], &[]);

The output is 8.0, which is obviously wrong because it is not a valid index into a 6-element array.

This happens when the maximum value occurs more than twice.

Support for lgamma function

Hi, I am using autograd to build a neural network with a custom loss function.
The loss requires computing the lgamma function of tensors.
I noticed that TensorFlow provides tf.math.lgamma(), and I find it very hard to define the operation myself with autograd, since I do not fully understand the gradient of the log-gamma function.
Could you add support for this operation, or do you have any advice on how I can achieve it?
Thanks for your help.

Compile failure on example (intel-mkl-src)

When attempting to compile an example application, using the suggested cargo invocation on crates.io (autograd = { version = "0.9.5", features = ["mkl"] }), I get the following error:

error: failed to run custom build command for `intel-mkl-src v0.2.5`

Caused by:
  process didn't exit successfully: `/path/to/project/target/debug/build/intel-mkl-src-2710610bac526c6f/build-script-build` (exit code: 101)
--- stdout
Downlaod archive

--- stderr
thread 'main' panicked at 'check sum of downloaded archive is incorrect: md5sum=d41d8cd98f00b204e9800998ecf8427e', /path/to/.cargo/registry/src/github.com-1ecc6299db9ec823/intel-mkl-src-0.2.5/build.rs:88:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

A workaround is to simply not enable the mkl feature.

access_elem failed

Hi!

I am just starting to explore autograd, and I ran into an issue when using a 2D input:

    autograd::with(|g| {
        use autograd::ndarray;
        let (w, h) = (2, 2);
        let x = g.placeholder(&[w as isize, h as isize]);
        let loss = x.access_elem(0).access_elem(0) * x.access_elem(0).access_elem(0)
            + x.access_elem(0).access_elem(1) * x.access_elem(0).access_elem(1);

        let grad = g.grad(&[loss], &[x])[0]; // dloss/dx

        let x_values = ndarray::Array2::from_shape_vec((w, h), vec![0.0; w * h]).unwrap();
        let _grad = grad
            .eval(&[x.given(x_values.view())])
            .expect("Failed to eval gradient");
    });

This gives me Failed to eval gradient: OpError(OutOfBounds("access_elem failed.")).

Whichever shape I pick (1,2, 2,1, 2,2 etc) I get the same error.

Newbie-friendly documentation would be a huge benefit

I'm currently working on some ML projects where autograd looks like the ideal tool for the job. Unfortunately, I had a bit of trouble figuring out the basics of autograd, because of the somewhat terse documentation.

At the moment, the main content of the docs begins with

Here we are just computing partial derivatives of z = 2x^2 + 3y + 1.

followed by a block of code which does exactly that. This is helpful as an example, but what's missing is a conceptual overview: What's a placeholder? What's a tensor? What does ag::run mean? What's feeding?

I was able to figure these things out by digging through the rest of the docs, but it wasn't easy, and it would be much more difficult for someone who was less familiar with ML concepts than me (I am perhaps in the middle of the spectrum). And considering how much work has gone into the implementation, making it more accessible would be very valuable!

Many of the individual functions could also benefit from more-detailed documentation. As an example, the documentation for tensor_ops::grad() currently says:

Symbolic gradient tensors of xs in the same order as xs’s

ys - Targets of differentiation that are arbitrary shapes.

xs - Tensors with which differentiate ys.

See the more useful helper: crate::optimizers::grad_helper()

and returns a vector of Tensors; but it could be much clearer about what the returned values are, and what sizes they are. (One for each of xs, I assume? but what shape are they if there are more than one of ys?)

(This is only an example; almost every function has a similar level of detail at present.)
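
To make the concepts asked about here concrete, the following toy sketch shows the general idea behind a symbolic graph: an expression is built first without doing any arithmetic, a placeholder is a named input with no value yet, and "feeding" supplies those values at evaluation time. This is a conceptual illustration only; the types and functions are invented for this sketch and are not autograd's API or implementation.

```rust
use std::collections::HashMap;

/// A toy symbolic expression; constructing one performs no arithmetic.
enum Expr {
    Placeholder(&'static str),
    Const(f64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

fn add(a: Expr, b: Expr) -> Expr {
    Expr::Add(Box::new(a), Box::new(b))
}

fn mul(a: Expr, b: Expr) -> Expr {
    Expr::Mul(Box::new(a), Box::new(b))
}

/// Builds z = 2*x*x + 3*y + 1 symbolically.
fn build_z() -> Expr {
    let x2 = mul(Expr::Placeholder("x"), Expr::Placeholder("x"));
    let two_x2 = mul(Expr::Const(2.0), x2);
    let three_y = mul(Expr::Const(3.0), Expr::Placeholder("y"));
    add(add(two_x2, three_y), Expr::Const(1.0))
}

/// Evaluates `expr`, looking placeholders up in `feeds`.
/// Returns `None` when a required placeholder was not fed.
fn eval(expr: &Expr, feeds: &HashMap<&str, f64>) -> Option<f64> {
    match expr {
        Expr::Placeholder(name) => feeds.get(*name).copied(),
        Expr::Const(c) => Some(*c),
        Expr::Add(a, b) => Some(eval(a, feeds)? + eval(b, feeds)?),
        Expr::Mul(a, b) => Some(eval(a, feeds)? * eval(b, feeds)?),
    }
}
```

Feeding x = 2 and y = 1 evaluates z to 12, while omitting y yields `None`, which mirrors why dz/dx in the README example needs `x` to be fed but dz/dy does not.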

Calculate add/sub gradient

I have been tinkering together a neural network. However, I have been running into some panics. I have managed to boil it down to calculating the gradient of ag::add or ag::sub. Here is the boiled down example:

use ag;
use ndarray::prelude::*;

let x_val = array![[0.1]].into_dyn();
let y_val = array![[0.2]].into_dyn();
let ref x = ag::placeholder(&[-1, 1]);
let ref y = ag::placeholder(&[-1, 1]);
let ref feeds = [(x, &x_val), (y, &y_val)];

let ref z = ag::add(x, y);
println!("{}", z.eval(feeds));

let ref grads = ag::grad(&[z], &[y]);
println!("{}", grads[0].eval(feeds)); // panic here

I get the following when I try to run the above:

[[0.3]]
thread 'main' panicked at 'ndarray: could not broadcast array from shape: [0] to: [2]', /home/sgibbs/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.10.14/src/lib.rs:741:13
note: Run with `RUST_BACKTRACE=1` for a backtrace.

Am I doing something wrong?

Thanks

Adam optimizer is broken?

I was trying to use the Adam optimizer for a DQN problem, but the Q values kept diverging regardless of hyperparameters. So I looked into the code, and it seems to me that the optimizer doesn't correctly update its stateful parameters. It calculates new values for m and v, but while doing so it makes copies of the NdArrays. The actual state variables referenced by xs[2] and xs[3] are never updated, so on each iteration the optimizer just uses the zero-initialized values for m and v.
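
For reference, the standard Adam update keeps m and v across steps, which is what the report says is missing. The following is a standalone sketch with the moment estimates mutated in place. It follows the textbook formulation, not the crate's code; `AdamState` and `adam_step` are hypothetical names.

```rust
/// Per-parameter optimizer state that must persist between steps.
struct AdamState {
    m: Vec<f64>, // first-moment (mean) estimates
    v: Vec<f64>, // second-moment (uncentered variance) estimates
    t: u32,      // number of steps taken so far
}

impl AdamState {
    fn new(n: usize) -> Self {
        AdamState { m: vec![0.0; n], v: vec![0.0; n], t: 0 }
    }
}

/// Applies one Adam step, updating `params` and `state` in place.
fn adam_step(
    params: &mut [f64],
    grads: &[f64],
    state: &mut AdamState,
    lr: f64,
    beta1: f64,
    beta2: f64,
    eps: f64,
) {
    assert_eq!(params.len(), grads.len());
    assert_eq!(params.len(), state.m.len());
    state.t += 1;
    let bias1 = 1.0 - beta1.powi(state.t as i32);
    let bias2 = 1.0 - beta2.powi(state.t as i32);
    for i in 0..params.len() {
        let g = grads[i];
        // Write the new moments back into the state, not into a copy.
        state.m[i] = beta1 * state.m[i] + (1.0 - beta1) * g;
        state.v[i] = beta2 * state.v[i] + (1.0 - beta2) * g * g;
        let m_hat = state.m[i] / bias1;
        let v_hat = state.v[i] / bias2;
        params[i] -= lr * m_hat / (v_hat.sqrt() + eps);
    }
}
```

If m and v were reset to zero on every call, the second step would see the same moments as the first; here the first moment grows from 0.1 to 0.19 over two steps with a constant gradient of 1.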

SGD::compute_updates is private

The built-in gradient descent optimizer is not usable at the moment because compute_updates is private; the function needs to be marked pub to fix this.

Workaround: Copy-paste the implementation and use that instead.

"ndarray" and "autograd::ndarray"

Hi, I am new to this interesting Rust deep learning library.
When I try to run several basic examples using ndarray, I always get a compiler error saying I should use autograd::ndarray. The code and the error are shown below:

    ag::with(|g| {
        let x = g.placeholder(&[-1,-1]);
        let value = array![[1., 1.]];  
        println!("{:?}", x.eval(&[x.given(value.view())]).unwrap()); 
    });

error[E0308]: mismatched types
  --> src\main.rs:14:43
   |
14 |         println!("{:?}", x.eval(&[x.given(value.view())]).unwrap());
   |                                           ^^^^^^^^^^^^ expected struct `autograd::ndarray::ArrayBase`, found struct `ndarray::ArrayBase`
   |
   = note: expected struct `autograd::ndarray::ArrayBase<autograd::ndarray::ViewRepr<&_>, _>`
              found struct `ndarray::ArrayBase<ndarray::ViewRepr<&{float}>, ndarray::Dim<[usize; 2]>>`
   = note: perhaps two different versions of crate `ndarray` are being used?

Is it compulsory to use ag::ndarray::array![[1., 1.]] instead of array![[1., 1.]] in cases like this?
Thanks for your help.

f64 support

I'm surveying this crate for possible use in physics code, and can't help but notice that all tensors are confined to single-precision floating point numbers. Basically, physics is chock-full of multivariate functions with insanely complicated derivatives that provide an immense barrier to their implementation; so automatic differentiation could be potentially very useful. However, sometimes these functions may involve sums of signed values that wildly differ in magnitude, making f32 a dangerous choice for implementation.

Making the crate generic over this would of course not be a trivial effort, as it will appear in virtually all traits and types. You would also most likely need to depend on the Float trait from num or similar.

Note this probably isn't the only blocker towards the crate's use in these applications, and I can understand if you don't want to add this support for only the remote chance of attracting people from other disciplines.
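
The precision concern can be shown with a small standalone sketch that is generic over the element type. It uses only `std::ops::Add` rather than the `Float` trait from num, and the function name is illustrative.

```rust
use std::ops::Add;

/// Sums `xs` left to right, starting from `zero`, for any additive type.
fn sum_in_order<T: Copy + Add<Output = T>>(zero: T, xs: &[T]) -> T {
    xs.iter().fold(zero, |acc, &x| acc + x)
}
```

Summing `[1.0e8, 1.0, -1.0e8]` gives 0.0 in f32, because adjacent f32 values near 1e8 are 8 apart and the 1.0 is rounded away, but gives 1.0 in f64.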

dropout with train=false produces error

Version: 2.0.0-rc2

backtrace:

thread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0', /home/penelope/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-2.0.0-rc2/src/evaluation.rs:446:54
stack backtrace:
   0: rust_begin_unwind
             at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:515:5
   1: core::panicking::panic_fmt
             at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/panicking.rs:92:14
   2: core::panicking::panic_bounds_check
             at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/panicking.rs:69:5
   3: <usize as core::slice::index::SliceIndex<[T]>>::index
             at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/slice/index.rs:184:10
   4: core::slice::index::<impl core::ops::index::Index<I> for [T]>::index
             at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/slice/index.rs:15:9
   5: <smallvec::SmallVec<A> as core::ops::index::Index<I>>::index
             at /home/penelope/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.6.1/src/lib.rs:1624:10
   6: autograd::evaluation::<impl autograd::graph::Graph<F>>::eval
             at /home/penelope/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-2.0.0-rc2/src/evaluation.rs:446:54
   7: autograd::evaluation::Evaluator<F>::run
             at /home/penelope/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-2.0.0-rc2/src/evaluation.rs:147:9
   8: battlesnake::neuralscore::train_cogent::{{closure}}
             at ./battlesnake/src/neuralscore.rs:189:16
   9: autograd::variable::VariableEnvironment<F>::run
             at /home/penelope/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-2.0.0-rc2/src/variable.rs:688:9
  10: battlesnake::neuralscore::train_cogent
             at ./battlesnake/src/neuralscore.rs:166:3
  11: battlesnake::main
             at ./battlesnake/src/main.rs:314:16
  12: core::ops::function::FnOnce::call_once
             at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/ops/function.rs:227:5

code:

env.run(|ctx| {
    let train_tensor = arr2(train_frames.iter().map(|x| Frame { values: *x }).collect_vec().as_slice());
    let train_y = arr2(train_you_won.as_slice());

    let test_tensor = arr2(test_frames.iter().map(|x| Frame { values: *x }).collect_vec().as_slice());
    let test_y = arr2(test_you_won.as_slice());

    let x = ctx.placeholder("x", &[-1, (N_SNAKES * DIM_PER_SNAKE) as isize]);
    let y = ctx.placeholder("y", &[-1, 1]);
    let w = ctx.variable("w1");
    let b = ctx.variable("b1");
    let w2 = ctx.variable("w2");
    let b2 = ctx.variable("b2");
    let z = tanh(matmul(x, w) + b);

    let dropout_1_test = dropout(z, 0.5, false);
    let z2_test = sigmoid(matmul(dropout_1_test, w2) + b2);
    let mean_loss_test = reduce_mean(mean_squared_error(z2_test, &y), &[0], false);

    //let dropout_1_train = dropout(z, 0.5, true);
    //let z2_train = sigmoid(matmul(dropout_1_train, w2) + b2);
    //let mean_loss_train = reduce_mean(mean_squared_error(z2_train, &y), &[0], false);

    let loss = ctx.evaluator().push(mean_loss_test)
    .feed(x, train_tensor.view())
    .feed(y, train_y.view() )
    .run();

    eprintln!("initial loss: {:?}", loss);
});

The same code works fine when train is set to true.
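
For reference, dropout is conventionally an identity at inference time, so the train = false path should need no random mask at all. The following is a standalone sketch of that behaviour, assuming "inverted" dropout (survivors scaled by 1 / (1 - p)); whether the crate scales this way is not stated in the report. The `uniform` closure stands in for a random source returning values in [0, 1).

```rust
/// Applies dropout to `xs`. With `train == false` the input is returned
/// unchanged and `uniform` is never called.
fn dropout(
    xs: &[f32],
    drop_prob: f32,
    train: bool,
    mut uniform: impl FnMut() -> f32,
) -> Vec<f32> {
    assert!((0.0..1.0).contains(&drop_prob), "drop_prob must be in [0, 1)");
    if !train {
        return xs.to_vec();
    }
    let scale = 1.0 / (1.0 - drop_prob);
    xs.iter()
        // Drop an element when the sample falls below drop_prob;
        // otherwise keep it and rescale so the expected value is unchanged.
        .map(|&x| if uniform() < drop_prob { 0.0 } else { x * scale })
        .collect()
}
```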

segfault when calling grad() in a loop

On my x86 linux environment with rust 1.47.0 and autograd 1.0.2 the below code has a segmentation fault.

Any idea what's causing this crash?

Given that this segfault happens in a very simple loop and that there is a decent amount of code inside unsafe {} blocks, would it be possible/feasible to use conditional compilation to substitute safe alternatives for the unsafe {} blocks that exist purely for performance, along with safe versions of the data structures involved (e.g. UnsafeCell; I am not sure how trustworthy SmallVec is)?

test code:

==> Cargo.toml <==
[package]
name = "autograd_test"
version = "0.1.0"
edition = "2018"

[dependencies]
autograd = { version = "1.0.2" }


==> src/main.rs <==
extern crate autograd as ag;

fn main() {
    ag::with(|g: &mut ag::Graph<f64>| {
        let mut loop_iter = 1;
        loop {
            let x = g.placeholder(&[3]);
            let z = 2.0 * x;
            eprintln!("about to call grad() {}", loop_iter);
            g.grad(&[z], &[x])[0];
            eprintln!("grad() call completed {}", loop_iter);

            loop_iter += 1;
            if loop_iter >= 1000 { break };
        };
    });
}

The gdb output and backtrace for me looks like this (this crashes in the --release mode as well for me):

about to call grad() 1
grad() call completed 1
about to call grad() 2
grad() call completed 2
...
grad() call completed 25
about to call grad() 26
grad() call completed 26
about to call grad() 27

Program received signal SIGSEGV, Segmentation fault.
smallvec::SmallVec<A>::spilled (self=0x7ffff7fd12c0)
    at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.4.2/src/lib.rs:695
695	        self.capacity > Self::inline_capacity()
(gdb) bt
#0  smallvec::SmallVec<A>::spilled (self=0x7ffff7fd12c0)
    at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.4.2/src/lib.rs:695
#1  0x0000555555585ab3 in smallvec::SmallVec<A>::triple (self=0x7ffff7fd12c0)
    at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.4.2/src/lib.rs:666
#2  0x0000555555584bce in smallvec::SmallVec<A>::len (self=0x7ffff7fd12c0)
    at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.4.2/src/lib.rs:646
#3  0x00005555555d3e1a in autograd::gradient::symbolic_gradients (ys=..., wrt=..., gys=..., g=0x7fffffffda60)
    at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.0.2/src/gradient.rs:166
#4  0x00005555555c5573 in autograd::ops::<impl autograd::graph::Graph<F>>::grad_with_default (
    self=0x7fffffffda60, ys=..., xs=..., ys_grads=...)
    at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.0.2/src/ops/mod.rs:128
#5  0x00005555555c5cbb in autograd::ops::<impl autograd::graph::Graph<F>>::grad (self=0x7fffffffda60, ys_=..., 
    xs=...) at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.0.2/src/ops/mod.rs:96
#6  0x000055555559a42e in autograd_test::main::{{closure}} (g=0x7fffffffda60) at src/main.rs:10
#7  0x00005555555c6906 in autograd::graph::with (f=...)
    at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.0.2/src/graph.rs:91
#8  0x00005555555bf546 in autograd_test::main () at src/main.rs:4

Alternatives for `tf.where()`

Hi, I am working on a simple k-means clustering algorithm using autograd.
I want to achieve the same result as this TF code:

assignments = tf.constant([1, 0, 0])
c = 1
tf.where(tf.equal(assignments, c))

It simply returns the indices where the condition is True.
Do you know of a simple alternative in autograd?
Or should I write my own custom op again?
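
Outside the graph, the same result can be computed on a plain slice. This is a minimal sketch of the operation itself, not an autograd op, and the function name is hypothetical. Note that tf.where returns a 2-D tensor of coordinates, whereas this returns a flat list of indices for the 1-D case.

```rust
/// Returns the indices at which `values` equals `target`.
fn indices_where_eq(values: &[i32], target: i32) -> Vec<usize> {
    values
        .iter()
        .enumerate()
        .filter(|&(_, &v)| v == target)
        .map(|(i, _)| i)
        .collect()
}
```

Since the output length depends on the data, an op like this typically has no meaningful gradient, so computing it on evaluated arrays may be sufficient for k-means assignments.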

Could I get a value "0" when it is not differentiable at any variable ,No panic?

When I differentiate a constant with respect to a variable, I would like to get the value "0" or an Option/error value, but instead the call panics.

Documentation about GradientContext::set_input_grads is misleading

The docs for GradientContext contain some errors.

Upgrade to ndarray 0.15

I'm trying to use autograd in conjunction with another library (nshare) which only supports ndarray 0.15. I recognize that this incompatibility isn't autograd's fault, but it seems like the simplest solution would be for autograd to support the latest version of ndarray.

I forked autograd and made a quick attempt to implement this myself; it looks like most of the issues are straightforward renamings. The only real sticking point was some slicing stuff in array_ops. But unfortunately that is a bit beyond my current understanding of ndarray.

tensordot fails on assert

Hello, I may have found a bug.

I was trying to evaluate a dot product:

use autograd as ag;
use autograd::Tensor;
use autograd::ndarray::arr1;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let x: Tensor<f32> = ag::placeholder(&[4]);
    let y: Tensor<f32> = 2. * ag::tensordot(&x, &x, &[0], &[0]);

    let arr = arr1(&[0f32,1.,2.,3.]).into_dyn();

    println!("{:?}", y.eval(&[ag::Feed(&x, arr.view())]));

    Ok(())
}

which gives me the error:

thread 'main' panicked at 'assertion failed: perm_len >= 2', /home/hwchen/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-0.9.6/src/ops/math_ops.rs:344:9

Removing the assert at

rust-autograd/src/ops/math_ops.rs

Line 344 in 97a5ad0

assert!(perm_len >= 2);

allows the operation to proceed and gives me the answer I expect.

I don't know enough to know whether the assert can just be modified, so I didn't create a pull request.

Thanks!
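
For reference, contracting two 1-D tensors over their only axis is an ordinary dot product, so the expected answer can be checked by hand. A minimal standalone sketch (the function name is an assumption):

```rust
/// Dot product of two equal-length slices.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "inputs must have the same length");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

For x = [0, 1, 2, 3], `2.0 * dot(&x, &x)` is 2 * (0 + 1 + 4 + 9) = 28.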

Inconsistent tensor caching

I was under the impression that tensors are only evaluated once during a single call to eval, and the value is then cached for later uses. It seems to work like that in most cases, but I tried a simple example with some inconsistent results:

let ref a : ag::Tensor<f64> = ag::random_normal(&[1], 0.0, 1.0);
println!("{:?}", ag::eval(&[a - a], &[]));
println!("{:?}", ag::eval(&[a - 1.0 * a], &[]));

The first one produces a value 0 as expected. However the second one is always a nonzero value, so it seems to call the random number generator twice.
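
The behaviour expected here, where each node is computed at most once per evaluation, can be sketched with a cache keyed by node id. This is a toy model of the idea, not the crate's evaluator; the `Node` enum, the `Lcg` generator, and `eval` are invented for illustration.

```rust
use std::collections::HashMap;

/// Nodes refer to their inputs by index into a shared node list.
enum Node {
    Random,            // draws a fresh value each time it is computed
    Scale(f64, usize), // constant * node
    Sub(usize, usize), // node - node
}

/// A tiny linear congruential generator standing in for a real RNG.
struct Lcg(u64);

impl Lcg {
    fn next_f64(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Evaluates node `id`, memoizing every result for the duration of one
/// evaluation so that a shared node is computed only once.
fn eval(nodes: &[Node], id: usize, cache: &mut HashMap<usize, f64>, rng: &mut Lcg) -> f64 {
    if let Some(&v) = cache.get(&id) {
        return v;
    }
    let v = match &nodes[id] {
        Node::Random => rng.next_f64(),
        Node::Scale(k, a) => *k * eval(nodes, *a, cache, rng),
        Node::Sub(a, b) => eval(nodes, *a, cache, rng) - eval(nodes, *b, cache, rng),
    };
    cache.insert(id, v);
    v
}
```

With nodes `[Random, Scale(1.0, 0), Sub(0, 1)]`, evaluating node 2 returns exactly 0, because the random node is drawn once and then served from the cache for its second use.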

eval and run at same time?

Is it possible to execute the updates from an optimizer (run) and return the loss at the same time (eval)? I can't figure out how to do this. The theano.function interface I'm familiar with accepts return parameters and a separate list/dict of updates. Perhaps with autograd the eval has to be in updates, but I don't know how to retrieve the value.

Not differentiable with the given tensors error when trying to train multi-input neural network

@rusbridger and I are trying to train a multi-input siamese neural network with this structure, but we get an error saying that it isn't differentiable.

My inputs x1, x2 and y have shapes [batch, 105, 105, 3], [batch, 105, 105, 3] and [batch, 1] respectively. All are of type f32.

I am running this on Ubuntu WSL.

The error I end up getting is:

thread 'main' panicked at 'Not differentiable with given tensor(s).

These are my layers; I import them into the training function shown after this block of code.

Here is the full repo:
https://github.com/UTMIST/oneshot-rs

/// Compute convolutional layer with max-pooling.
pub fn conv_pool<'g>(x: Tensor<'g>, w: Tensor<'g>, b: Tensor<'g>) -> Tensor<'g> {
    let g = x.graph();
    let y1 = g.conv2d(x, w, 0, 1) + b;
    let y2 = g.relu(y1);
    x.graph().max_pool2d(y2, 2, 0, 2)
}

/// Compute the final convolutional layer.
pub fn conv_final<'g>(x: Tensor<'g>, w: Tensor<'g>, b: Tensor<'g>) -> Tensor<'g> {
    let g = x.graph();
    let y1 = g.conv2d(x, w, 0, 1) + b;
    x.graph().relu(y1)
}

// Load inputs.
pub fn inputs(g: &Graph<f32>) -> (Tensor, Tensor, Tensor) {
    let x1 = g.placeholder(&[-1, 1, 105, 105]);
    let x2 = g.placeholder(&[-1, 1, 105, 105]);
    let y = g.placeholder(&[-1, 1]);
    (x1, x2, y)
}

/// Compute the final sigmoid layer.
pub fn sigmoid_layer<'g>(x: Tensor<'g>, params: &[Tensor<'g>]) -> (Tensor<'g>, Tensor<'g>) {
    let g = x.graph();
    let z1 = conv_pool(x, params[0], params[6]);
    let z2 = conv_pool(z1, params[1], params[7]);
    let z3 = conv_pool(z2, params[2], params[8]);
    let twin_2_z4 = conv_final(z3, params[3], params[9]);
    let flattened = g.reshape(twin_2_z4, &[-1, 256 * 6 * 6]); // flatten
    let dense = flattened.graph().matmul(params[4], flattened) + params[10];
    (dense, g.sigmoid(flattened))
}
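One thing worth checking in `sigmoid_layer` as written: the second return value is `g.sigmoid(flattened)`, so `dense`, and with it `params[4]` and `params[10]`, does not feed into that output. If a loss is built only from that output, those parameters are unreachable from it. Whether that is what triggers the panic here is an assumption, not something confirmed in this report. The sketch below uses plain Rust with a toy graph and hypothetical node ids to show the kind of reachability check involved.

```rust
use std::collections::HashSet;

/// A toy computation graph: `inputs[i]` lists the nodes that node `i` reads.
/// Returns every node that `root` depends on, including `root` itself.
fn dependencies(inputs: &[Vec<usize>], root: usize) -> HashSet<usize> {
    let mut seen = HashSet::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        if seen.insert(node) {
            stack.extend(inputs[node].iter().copied());
        }
    }
    seen
}

/// Parameters that `loss` cannot reach; a reverse-mode pass has no gradient
/// to report for these.
fn unreachable_params(inputs: &[Vec<usize>], loss: usize, params: &[usize]) -> Vec<usize> {
    let reachable = dependencies(inputs, loss);
    params.iter().copied().filter(|p| !reachable.contains(p)).collect()
}

/// Toy version of the layer above. Node ids (all hypothetical):
/// 0 = x, 1 = conv weights, 2 = dense weights, 3 = flattened = f(x, conv),
/// 4 = dense = g(flattened, dense weights), 5 = sigmoid(flattened), 6 = loss(5).
fn toy_layer_graph() -> Vec<Vec<usize>> {
    vec![vec![], vec![], vec![], vec![0, 1], vec![3, 2], vec![3], vec![5]]
}
```

In this toy graph, `unreachable_params(&toy_layer_graph(), 6, &[1, 2])` returns `[2]`: the dense weights never influence the loss.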

This is my training function. I modified the CNN MNIST example slightly by adding some of my own layers. This block of code imports the functions from the code block above.

pub fn train(train_x1: NDArr, train_x2: NDArr, train_y: NDArr) {

    ag::with(|g| {
        let rng = array::ArrayRng::<f32>::default();
        macro_rules! rand_normal {
            ($arr: expr) => {
                g.variable(rng.random_normal(&$arr, 0., 0.1));
            };
        }
        macro_rules! zeroes {
            ($arr: expr) => {
                g.variable(array::zeros(&$arr));
            };
        }

        // Weights/Biases for convolutional layers.
        let w1 = rand_normal!([64, 1, 10, 10]);
        let w2 = rand_normal!([128, 1, 7, 7]);
        let w3 = rand_normal!([128, 1, 4, 4]);
        let w4 = rand_normal!([256, 1, 4, 4]);
        let b1 = zeroes!([1, 64, 105, 105]);
        let b2 = zeroes!([1, 128, 42, 42]);
        let b3 = zeroes!([1, 128, 18, 18]);
        let b4 = zeroes!([1, 256, 6, 6]);

        // Weights/Biases for dense layers.
        let w5 = rand_normal!([4096, 256 * 6 * 6]);
        let w6 = rand_normal!([4096, 1]);
        let b5 = zeroes!([4096, 1]);

        // Collect parameters and add to adam_state.
        let params = &[w1, w2, w3, w4, w5, w6, b1, b2, b3, b4, b5];
        let adam_state = adam::AdamState::new(
            params
                .iter()
                .map(|v| v.get_variable_array().unwrap())
                .collect::<Vec<_>>()
                .as_slice(),
        );

        // Load inputs and compute sigmoid layers.
        let (x1, x2, y) = layers::inputs(g);
        let (_dense_1, sig_1) = layers::sigmoid_layer(x1, params);
        let (_dense_2, sig_2) = layers::sigmoid_layer(x2, params);

        // Siamese Distance
        let pre_weighted_l1_dist = g.abs(sig_2 - sig_1);
        let weighted_l1_dist = g.matmul(w6, pre_weighted_l1_dist);
        let final_prediction = weighted_l1_dist;
        let loss = g.sigmoid_cross_entropy(final_prediction, y);
        let grads = &g.grad(&[&loss], params);
        let _update_ops: &[Tensor] =
            &adam::Adam::default().compute_updates(params, grads, &adam_state, g);

        // println!("{:?}", );
        for epoch in 0..1 {
            let x1_batch = train_x1.slice(s![0..500, .., .., ..]).into_dyn();
            let x2_batch = train_x2.slice(s![0..500, .., .., ..]).into_dyn();
            let y_batch = train_y.slice(s![0..500, ..]).into_dyn();

            println!("{:?}", x1_batch.shape());
            println!("{:?}", x2_batch.shape());
            println!("{:?}", y_batch.shape());

            g.eval(
                _update_ops,
                &[x1.given(x1_batch), x2.given(x2_batch), y.given(y_batch)],
            );
            println!("finish epoch {}", epoch);
        }
    })
}
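The bias shapes above have to match the spatial size of each convolution's output. The sketch below assumes the usual output-size formula applies to `conv2d` and `max_pool2d` here, with kernel sizes taken from the weight shapes and padding and stride taken from the layer code. Under those assumptions the sizes are 96, 42, 18 and 6, so `b2`, `b3` and `b4` agree, while `b1` is declared with 105, which may be worth double-checking. The helper names are hypothetical.

```rust
/// Output length along one spatial axis for a convolution or pooling window,
/// using the usual formula floor((input + 2 * pad - kernel) / stride) + 1.
/// Assumes `input + 2 * pad >= kernel`.
fn out_size(input: usize, kernel: usize, pad: usize, stride: usize) -> usize {
    (input + 2 * pad - kernel) / stride + 1
}

/// Spatial size after each of the four convolutions for a square input,
/// assuming kernels 10, 7, 4, 4 with pad 0 and stride 1, each of the first
/// three followed by 2x2 max-pooling with stride 2 (as in the code above).
fn conv_output_sizes(input: usize) -> [usize; 4] {
    let c1 = out_size(input, 10, 0, 1);
    let c2 = out_size(out_size(c1, 2, 0, 2), 7, 0, 1);
    let c3 = out_size(out_size(c2, 2, 0, 2), 4, 0, 1);
    let c4 = out_size(out_size(c3, 2, 0, 2), 4, 0, 1);
    [c1, c2, c3, c4]
}
```

For a 105 x 105 input, `conv_output_sizes(105)` returns `[96, 42, 18, 6]`.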

How to help?

Hi @raskr

I am very new to Rust and lower-level programming, but not to tensors or differential equations. (I'm a data scientist.)

I'd like to help out in some way as a chance to work on my Rust skills while also doing something useful in a field I know/care about.

I saw that the project had been "asleep" for a while, but that you recently made a commit again. Is there something you have in mind that you'd like to tackle?

Custom rand::Rng

Your library is exactly what I have been looking for 😍!

I have been trying to rig it up to a wasm build and I am hitting a runtime error. When calling autograd::ndarray_ext::array_gen::glorot_uniform I get a stack trace that looks like the following:

std::panicking::rust_panic_with_hook
std::panicking::begin_panic
rand::jitter::get_nstime
rand::jitter::JitterRng::new_with_timer
rand::jitter::JitterRng::new
rand::StdRng::new
rand::thread_rng::THREAD_RNG_KEY::__init
_$LT$std..thread..local..LocalKey$LT$T$GT$$GT$4init
$LT$std..thread..local..LocalKey$LT$T$GT$$GT$8try_with
$LT$std..thread..local..LocalKey$LT$T$GT$$GT$4with
rand::thread_rng
rand::weak_rng
autograd::ndarray_ext::array_gen::glorot_uniform

So we are panicking in the rand::jitter::get_nstime function. This makes sense, because when I look at jitter.rs:703, the implementation for wasm is unreachable!().

Is it possible to pass in a custom &mut rand::Rng to these functions? If so, I could use something like pcg_rand.
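A rough sketch of what such an interface could look like, in plain Rust with no dependencies. The `UniformSource` trait, the `XorShift64` generator and `glorot_uniform_with` are all hypothetical names for illustration; they are not the `rand` crate's trait or autograd's actual signatures. The only fixed part is the Glorot uniform bound, sqrt(6 / (fan_in + fan_out)).

```rust
/// Minimal source of uniform samples in [0, 1). Hypothetical trait standing in
/// for whatever generator type an initializer would accept.
trait UniformSource {
    fn next_unit(&mut self) -> f64;
}

/// Small xorshift64 generator: no OS entropy or timers needed, so it does not
/// depend on platform facilities. The seed must be nonzero.
struct XorShift64(u64);

impl UniformSource for XorShift64 {
    fn next_unit(&mut self) -> f64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        // Keep the top 53 bits so the result fits an f64 mantissa exactly.
        (x >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Glorot (Xavier) uniform initialization for a `rows x cols` matrix, drawing
/// from the caller's generator instead of a thread-local one.
fn glorot_uniform_with<R: UniformSource>(rng: &mut R, rows: usize, cols: usize) -> Vec<f64> {
    let limit = (6.0 / (rows + cols) as f64).sqrt();
    (0..rows * cols)
        .map(|_| (rng.next_unit() * 2.0 - 1.0) * limit)
        .collect()
}
```

Because the generator is passed in by the caller, the same seed reproduces the same weights, and nothing in the call path touches a timer or thread-local state.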

Thanks 👍

`cnn_mnist` and `lstm_lm` examples runtime failure

Hi! I'm having some trouble with running a couple of the examples from your project. My process was the following:

$ git clone https://github.com/raskr/rust-autograd.git
$ cd rust-autograd/examples
$ ./download_mnist.sh
$ RUST_BACKTRACE=1 cargo run --example cnn_mnist

which results in the following error. I've compiled it both with and without the --features mkl flag, and with and without the --release flag; none of these seems to have any effect on the errors. As far as I can tell, the two failures don't trace back to the same problem, but I may be mistaken. I'm running rustc 1.46.0-nightly (feb3536eb 2020-06-09) on Pop!_OS 20.04 LTS, which closely mirrors Ubuntu. If it would be easier for me to open separate issues for each of these, I would be happy to do so.

However, I am able to compile and run the mlp_mnist example without issue.

thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `4`,
 right: `2`', /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libstd/macros.rs:16:9
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
   1: backtrace::backtrace::trace_unsynchronized
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:78
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:59
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1076
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1537
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:62
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:49
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:198
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:218
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:477
  11: rust_begin_unwind
             at src/libstd/panicking.rs:385
  12: std::panicking::begin_panic_fmt
             at src/libstd/panicking.rs:339
  13: ndarray::impl_methods::<impl ndarray::ArrayBase<S,D>>::slice_collapse
             at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libstd/macros.rs:16
  14: ndarray::impl_methods::<impl ndarray::ArrayBase<S,D>>::slice_move
             at /home/chrism/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.12.1/src/impl_methods.rs:325
  15: ndarray::impl_methods::<impl ndarray::ArrayBase<S,D>>::slice
             at /home/chrism/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.12.1/src/impl_methods.rs:289
  16: cnn_mnist::main::{{closure}}
             at examples/cnn_mnist.rs:94
  17: autograd::graph::with
             at /home/chrism/rust-projects/rust-autograd/src/graph.rs:91
  18: cnn_mnist::main
             at examples/cnn_mnist.rs:66
  19: std::rt::lang_start::{{closure}}
             at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libstd/rt.rs:67
  20: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:52
  21: std::panicking::try::do_call
             at src/libstd/panicking.rs:297
  22: std::panicking::try
             at src/libstd/panicking.rs:274
  23: std::panic::catch_unwind
             at src/libstd/panic.rs:394
  24: std::rt::lang_start_internal
             at src/libstd/rt.rs:51
  25: std::rt::lang_start
             at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libstd/rt.rs:67
  26: main
  27: __libc_start_main
  28: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

When running $ RUST_BACKTRACE=1 cargo run --example lstm_lm, I get the following error:

thread 'main' panicked at 'lhs input for MatMul must be 2D: ShapeError/IncompatibleShape: incompatible shapes', /home/chrism/rust-projects/rust-autograd/src/ops/dot_ops.rs:536:21
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
   1: backtrace::backtrace::trace_unsynchronized
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:78
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:59
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1076
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1537
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:62
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:49
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:198
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:218
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:477
  11: rust_begin_unwind
             at src/libstd/panicking.rs:385
  12: core::panicking::panic_fmt
             at src/libcore/panicking.rs:86
  13: core::option::expect_none_failed
             at src/libcore/option.rs:1272
  14: core::result::Result<T,E>::expect
             at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libcore/result.rs:963
  15: <autograd::ops::dot_ops::MatMul as autograd::op::Op<T>>::compute
             at /home/chrism/rust-projects/rust-autograd/src/ops/dot_ops.rs:536
  16: autograd::runtime::<impl autograd::graph::Graph<F>>::eval::{{closure}}
             at /home/chrism/rust-projects/rust-autograd/src/runtime.rs:390
  17: core::result::Result<T,E>::and_then
             at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libcore/result.rs:729
  18: autograd::runtime::<impl autograd::graph::Graph<F>>::eval
             at /home/chrism/rust-projects/rust-autograd/src/runtime.rs:388
  19: autograd::test_helper::check_theoretical_grads
             at /home/chrism/rust-projects/rust-autograd/src/test_helper.rs:21
  20: lstm_lm::main::{{closure}}
             at examples/lstm_lm.rs:94
  21: autograd::graph::with
             at /home/chrism/rust-projects/rust-autograd/src/graph.rs:91
  22: lstm_lm::main
             at examples/lstm_lm.rs:66
  23: std::rt::lang_start::{{closure}}
             at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libstd/rt.rs:67
  24: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:52
  25: std::panicking::try::do_call
             at src/libstd/panicking.rs:297
  26: std::panicking::try
             at src/libstd/panicking.rs:274
  27: std::panic::catch_unwind
             at src/libstd/panic.rs:394
  28: std::rt::lang_start_internal
             at src/libstd/rt.rs:51
  29: std::rt::lang_start
             at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libstd/rt.rs:67
  30: main
  31: __libc_start_main
  32: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Unexpected gradient shape

The following code panics with ndarray: could not broadcast array from shape: [2] to: [].

use autograd::ndarray::{Array, IxDyn};
use autograd::optimizers::adam::{Adam, AdamState};
use autograd::tensor::{Constant, Variable};
use std::sync::Arc;
use std::sync::RwLock;

fn main() {
    let v: Arc<RwLock<Array<f64, IxDyn>>> = autograd::ndarray_ext::into_shared(autograd::array_gen::zeros(&[]));
    let adam_state = AdamState::new(&[&v]);
    let adam = Adam::default();

    autograd::with(|graph| {
        let c = graph.constant(autograd::array_gen::ones(&[2]));
        let v = graph.variable(v.clone());

        let y = graph.reduce_sum_to_scalar(c * v);
        let grads = graph.grad(&[y], &[v]);

        let updates = adam.compute_updates(&[v], &grads, &adam_state, graph);
        graph.eval(&updates, &[]);
    })
}

It seems like grads[0] has shape [2], although y and v are scalars. I tried to find out why but didn't find an answer.

Is this a bug? Is there a workaround?

If c * v is replaced by c + v it doesn't panic, by the way.
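As background for the question: when a scalar is broadcast against a vector in a product, the gradient flowing back first has the vector's shape and must be summed over the broadcast axis before it matches the scalar again. The plain-Rust sketch below shows that reduction for `y = sum(c * v)`. It does not use autograd, the function names are hypothetical, and whether a missing reduction is the cause of the panic here is an assumption.

```rust
/// Forward pass of `y = sum(c * v)` where `c` is a vector and `v` is a scalar
/// that is broadcast across `c`.
fn forward(c: &[f64], v: f64) -> f64 {
    c.iter().map(|ci| ci * v).sum()
}

/// Element-wise gradient of `y` with respect to the broadcast copy of `v`.
/// It has the shape of `c`, not the shape of `v`.
fn grad_wrt_broadcast_v(c: &[f64]) -> Vec<f64> {
    // dy/d(c_i * v) = 1, and d(c_i * v)/dv = c_i.
    c.to_vec()
}

/// Gradient with respect to the scalar `v` itself: the broadcast axis is
/// summed away so the result is a scalar again.
fn grad_wrt_v(c: &[f64]) -> f64 {
    grad_wrt_broadcast_v(c).iter().sum()
}
```

For `c = [1.0, 1.0]`, the unreduced gradient has two elements while `grad_wrt_v` returns the scalar `2.0`, which matches the change in `forward` when `v` increases by 1.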

Thanks in advance for any help you can provide!

softmax_cross_entropy outputs shape [-1], when it should output shape [-1, 1].

I was attempting to optimize something with the softmax_cross_entropy loss function, but I kept getting broadcast shape errors, which confused me. After digging into the code, I realized that the output here doesn't follow the API documentation, which says it should be a rank-2 tensor; it is a rank-1 tensor instead.

fn compute(&self, ctx: &mut crate::op::ComputeContext<T>) -> Result<(), crate::op::OpError> {
    let x = &ctx.input(0);
    let log_x: NdArray<T> = x - &tensor_ops::math_ops::logsumexp_forward(x, 1, true);
    // `t` must be one-hot
    let t = &ctx.input(1);
    assert_eq!(log_x.ndim(), 2, "x must be 2-ranked tensor");
    assert_eq!(t.ndim(), 2, "t must be 2-ranked tensor");
    // - t log x ( =(batch, num_classes))
    let minus_one = T::one().neg();
    ctx.append_output(
        (t * &log_x)
            .sum_axis(ndarray::Axis(1))
            .mapv(move |elem| elem * minus_one),
    );
    ctx.append_output(log_x);
    Ok(())
}

I fixed it in my local copy by just reshaping the array, which seems to work, but I'm not familiar enough to know whether this is how cross entropy calculations are usually handled.

        let minus_one = T::one().neg();
        let result = (t * &log_x)
            .sum_axis(ndarray::Axis(1))
            .mapv(move |elem| elem * minus_one)
            .into_shape(ndarray::IxDyn(&[log_x.shape()[0], 1]))
            .unwrap();

        assert_eq!(result.ndim(), 2, "result must be 2-ranked tensor");

        ctx.append_output(result);
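The shape question can be illustrated independently of ndarray. The sketch below computes row-wise softmax cross-entropy on nested vectors and keeps the reduced class axis with length 1, so the result stays rank 2 (`batch x 1`). It is plain Rust written for illustration, not the library's code, and the function name is only borrowed from the issue title.

```rust
/// Row-wise softmax cross-entropy for logits `x` and one-hot targets `t`, both
/// given as `batch x classes` nested vectors. Returns one loss per row, kept as
/// a `batch x 1` nested vector so the result is still rank 2.
fn softmax_cross_entropy(x: &[Vec<f64>], t: &[Vec<f64>]) -> Vec<Vec<f64>> {
    assert_eq!(x.len(), t.len(), "batch sizes must match");
    x.iter()
        .zip(t)
        .map(|(xi, ti)| {
            assert_eq!(xi.len(), ti.len(), "class counts must match");
            // log-sum-exp with the row maximum subtracted for stability.
            let max = xi.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
            let lse = max + xi.iter().map(|v| (v - max).exp()).sum::<f64>().ln();
            // -sum_j t_j * log_softmax(x)_j
            let loss: f64 = xi.iter().zip(ti).map(|(v, tj)| -tj * (v - lse)).sum();
            vec![loss] // keep the class axis with length 1
        })
        .collect()
}
```

With all-zero logits over two classes, every row's loss is ln 2, and each row of the output holds exactly one element.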

Index out of bounds in DivOp when dividing 2 scalars

Stack trace:

   0: rust_begin_unwind
             at /rustc/08095fc1f875c89e507f17cf6c6a780c8ffa4c01/library/std/src/panicking.rs:515:5
   1: std::panicking::begin_panic_fmt
             at /rustc/08095fc1f875c89e507f17cf6c6a780c8ffa4c01/library/std/src/panicking.rs:457:5
   2: ndarray::arraytraits::<impl core::ops::index::Index<I> for ndarray::ArrayBase<S,D>>::index
             at /home/shui/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.12.1/src/arraytraits.rs:56:9
   3: <autograd::ops::binary_ops::DivOp as autograd::op::Op<T>>::compute
             at /home/shui/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.1.0/src/ops/binary_ops.rs:247:27
   4: autograd::runtime::<impl autograd::graph::Graph<F>>::eval::{{closure}}
             at /home/shui/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.1.0/src/runtime.rs:392:25
   5: core::result::Result<T,E>::and_then
             at /rustc/08095fc1f875c89e507f17cf6c6a780c8ffa4c01/library/core/src/result.rs:948:22
   6: autograd::runtime::<impl autograd::graph::Graph<F>>::eval
             at /home/shui/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.1.0/src/runtime.rs:390:47
...

It looks like x1 has shape [1], i.e. it is one-dimensional, but the index used is zero-dimensional.
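The mismatch can be reproduced in a small model. The sketch below uses a hypothetical `DynArray` type in plain Rust: strict indexing rejects an empty index on a `[1]`-shaped array, while a rank-agnostic read of a single-element array accepts both `[]` and `[1]`. It is not ndarray's implementation, and which of the two behaviors a fix should adopt is left open.

```rust
/// Minimal dynamic-rank array: a shape plus row-major data. Hypothetical type
/// for illustration only.
struct DynArray {
    shape: Vec<usize>,
    data: Vec<f64>,
}

impl DynArray {
    /// Strict indexing: the index must have exactly as many entries as the
    /// shape, which is the kind of check that rejects `[]` on a `[1]` array.
    fn get(&self, index: &[usize]) -> Option<f64> {
        if index.len() != self.shape.len() {
            return None;
        }
        let mut offset = 0;
        for (&i, &dim) in index.iter().zip(&self.shape) {
            if i >= dim {
                return None;
            }
            offset = offset * dim + i;
        }
        self.data.get(offset).copied()
    }

    /// Rank-agnostic read of a single-element array: works for shape `[]`
    /// and for shape `[1]` alike.
    fn single_value(&self) -> Option<f64> {
        if self.data.len() == 1 {
            Some(self.data[0])
        } else {
            None
        }
    }
}
```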