raskr / rust-autograd
Tensors and differentiable operations (like TensorFlow) in Rust
License: MIT License
Hello, I may have found a bug.
I was trying to evaluate a dot product:
use autograd as ag;
use autograd::Tensor;
use autograd::ndarray::arr1;
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let x: Tensor<f32> = ag::placeholder(&[4]);
    let y: Tensor<f32> = 2. * ag::tensordot(&x, &x, &[0], &[0]);
    let arr = arr1(&[0f32, 1., 2., 3.]).into_dyn();
    println!("{:?}", y.eval(&[ag::Feed(&x, arr.view())]));
    Ok(())
}
which gives me the error:
thread 'main' panicked at 'assertion failed: perm_len >= 2', /home/hwchen/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-0.9.6/src/ops/math_ops.rs:344:9
Removing the assert at
rust-autograd/src/ops/math_ops.rs
Line 344 in 97a5ad0
I don't know enough to know whether the assert can just be modified, so I didn't create a pull request.
Thanks!
Hi! I'm having some trouble with running a couple of the examples from your project. My process was the following:
$ git clone https://github.com/raskr/rust-autograd.git
$ cd rust-autograd/examples
$ ./download_mnist.sh
$ RUST_BACKTRACE=1 cargo run --example cnn_mnist
which results in the following error. I've compiled it both with and without the --features mkl flag, and with and without the --release flag, neither of which seems to have any effect on these errors. As far as I can tell, the two errors don't seem to trace back to the same problem, but I may be mistaken. I'm running rustc 1.46.0-nightly (feb3536eb 2020-06-09) on Pop!_OS 20.04 LTS, which closely mirrors Ubuntu. If it would be easier for me to open separate issues for each of these, I would be happy to do so.
However, I am able to compile and run the mlp_mnist example without issue.
thread 'main' panicked at 'assertion failed: `(left == right)`
left: `4`,
right: `2`', /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libstd/macros.rs:16:9
stack backtrace:
0: backtrace::backtrace::libunwind::trace
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
1: backtrace::backtrace::trace_unsynchronized
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
2: std::sys_common::backtrace::_print_fmt
at src/libstd/sys_common/backtrace.rs:78
3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
at src/libstd/sys_common/backtrace.rs:59
4: core::fmt::write
at src/libcore/fmt/mod.rs:1076
5: std::io::Write::write_fmt
at src/libstd/io/mod.rs:1537
6: std::sys_common::backtrace::_print
at src/libstd/sys_common/backtrace.rs:62
7: std::sys_common::backtrace::print
at src/libstd/sys_common/backtrace.rs:49
8: std::panicking::default_hook::{{closure}}
at src/libstd/panicking.rs:198
9: std::panicking::default_hook
at src/libstd/panicking.rs:218
10: std::panicking::rust_panic_with_hook
at src/libstd/panicking.rs:477
11: rust_begin_unwind
at src/libstd/panicking.rs:385
12: std::panicking::begin_panic_fmt
at src/libstd/panicking.rs:339
13: ndarray::impl_methods::<impl ndarray::ArrayBase<S,D>>::slice_collapse
at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libstd/macros.rs:16
14: ndarray::impl_methods::<impl ndarray::ArrayBase<S,D>>::slice_move
at /home/chrism/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.12.1/src/impl_methods.rs:325
15: ndarray::impl_methods::<impl ndarray::ArrayBase<S,D>>::slice
at /home/chrism/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.12.1/src/impl_methods.rs:289
16: cnn_mnist::main::{{closure}}
at examples/cnn_mnist.rs:94
17: autograd::graph::with
at /home/chrism/rust-projects/rust-autograd/src/graph.rs:91
18: cnn_mnist::main
at examples/cnn_mnist.rs:66
19: std::rt::lang_start::{{closure}}
at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libstd/rt.rs:67
20: std::rt::lang_start_internal::{{closure}}
at src/libstd/rt.rs:52
21: std::panicking::try::do_call
at src/libstd/panicking.rs:297
22: std::panicking::try
at src/libstd/panicking.rs:274
23: std::panic::catch_unwind
at src/libstd/panic.rs:394
24: std::rt::lang_start_internal
at src/libstd/rt.rs:51
25: std::rt::lang_start
at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libstd/rt.rs:67
26: main
27: __libc_start_main
28: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
When running $ RUST_BACKTRACE=1 cargo run --example lstm_lm, I get the following error:
thread 'main' panicked at 'lhs input for MatMul must be 2D: ShapeError/IncompatibleShape: incompatible shapes', /home/chrism/rust-projects/rust-autograd/src/ops/dot_ops.rs:536:21
stack backtrace:
0: backtrace::backtrace::libunwind::trace
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
1: backtrace::backtrace::trace_unsynchronized
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
2: std::sys_common::backtrace::_print_fmt
at src/libstd/sys_common/backtrace.rs:78
3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
at src/libstd/sys_common/backtrace.rs:59
4: core::fmt::write
at src/libcore/fmt/mod.rs:1076
5: std::io::Write::write_fmt
at src/libstd/io/mod.rs:1537
6: std::sys_common::backtrace::_print
at src/libstd/sys_common/backtrace.rs:62
7: std::sys_common::backtrace::print
at src/libstd/sys_common/backtrace.rs:49
8: std::panicking::default_hook::{{closure}}
at src/libstd/panicking.rs:198
9: std::panicking::default_hook
at src/libstd/panicking.rs:218
10: std::panicking::rust_panic_with_hook
at src/libstd/panicking.rs:477
11: rust_begin_unwind
at src/libstd/panicking.rs:385
12: core::panicking::panic_fmt
at src/libcore/panicking.rs:86
13: core::option::expect_none_failed
at src/libcore/option.rs:1272
14: core::result::Result<T,E>::expect
at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libcore/result.rs:963
15: <autograd::ops::dot_ops::MatMul as autograd::op::Op<T>>::compute
at /home/chrism/rust-projects/rust-autograd/src/ops/dot_ops.rs:536
16: autograd::runtime::<impl autograd::graph::Graph<F>>::eval::{{closure}}
at /home/chrism/rust-projects/rust-autograd/src/runtime.rs:390
17: core::result::Result<T,E>::and_then
at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libcore/result.rs:729
18: autograd::runtime::<impl autograd::graph::Graph<F>>::eval
at /home/chrism/rust-projects/rust-autograd/src/runtime.rs:388
19: autograd::test_helper::check_theoretical_grads
at /home/chrism/rust-projects/rust-autograd/src/test_helper.rs:21
20: lstm_lm::main::{{closure}}
at examples/lstm_lm.rs:94
21: autograd::graph::with
at /home/chrism/rust-projects/rust-autograd/src/graph.rs:91
22: lstm_lm::main
at examples/lstm_lm.rs:66
23: std::rt::lang_start::{{closure}}
at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libstd/rt.rs:67
24: std::rt::lang_start_internal::{{closure}}
at src/libstd/rt.rs:52
25: std::panicking::try::do_call
at src/libstd/panicking.rs:297
26: std::panicking::try
at src/libstd/panicking.rs:274
27: std::panic::catch_unwind
at src/libstd/panic.rs:394
28: std::rt::lang_start_internal
at src/libstd/rt.rs:51
29: std::rt::lang_start
at /rustc/feb3536eba10c2e4585d066629598f03d5ddc7c6/src/libstd/rt.rs:67
30: main
31: __libc_start_main
32: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
I was under the impression that tensors are only evaluated once during a single call to eval, and that the value is then cached for later uses. It seems to work like that in most cases, but I tried a simple example with some inconsistent results:
let ref a : ag::Tensor<f64> = ag::random_normal(&[1], 0.0, 1.0);
println!("{:?}", ag::eval(&[a - a], &[]));
println!("{:?}", ag::eval(&[a - 1.0 * a], &[]));
The first one produces a value 0 as expected. However the second one is always a nonzero value, so it seems to call the random number generator twice.
I have been tinkering together a neural network. However, I have been running into some panics. I have managed to boil it down to calculating the gradient of ag::add or ag::sub. Here is the boiled-down example:
use ag;
use ndarray::prelude::*;
let x_val = array![[0.1]].into_dyn();
let y_val = array![[0.2]].into_dyn();
let ref x = ag::placeholder(&[-1, 1]);
let ref y = ag::placeholder(&[-1, 1]);
let ref feeds = [(x, &x_val), (y, &y_val)];
let ref z = ag::add(x, y);
println!("{}", z.eval(feeds));
let ref grads = ag::grad(&[z], &[y]);
println!("{}", grads[0].eval(feeds)); // panic here
I get the following when I try to run the above:
[[0.3]]
thread 'main' panicked at 'ndarray: could not broadcast array from shape: [0] to: [2]', /home/sgibbs/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.10.14/src/lib.rs:741:13
note: Run with `RUST_BACKTRACE=1` for a backtrace.
Am I doing something wrong?
Thanks
Version: 2.0.0-rc2
backtrace:
thread 'main' panicked at 'index out of bounds: the len is 0 but the index is 0', /home/penelope/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-2.0.0-rc2/src/evaluation.rs:446:54
stack backtrace:
0: rust_begin_unwind
at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:515:5
1: core::panicking::panic_fmt
at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/panicking.rs:92:14
2: core::panicking::panic_bounds_check
at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/panicking.rs:69:5
3: <usize as core::slice::index::SliceIndex<[T]>>::index
at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/slice/index.rs:184:10
4: core::slice::index::<impl core::ops::index::Index<I> for [T]>::index
at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/slice/index.rs:15:9
5: <smallvec::SmallVec<A> as core::ops::index::Index<I>>::index
at /home/penelope/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.6.1/src/lib.rs:1624:10
6: autograd::evaluation::<impl autograd::graph::Graph<F>>::eval
at /home/penelope/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-2.0.0-rc2/src/evaluation.rs:446:54
7: autograd::evaluation::Evaluator<F>::run
at /home/penelope/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-2.0.0-rc2/src/evaluation.rs:147:9
8: battlesnake::neuralscore::train_cogent::{{closure}}
at ./battlesnake/src/neuralscore.rs:189:16
9: autograd::variable::VariableEnvironment<F>::run
at /home/penelope/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-2.0.0-rc2/src/variable.rs:688:9
10: battlesnake::neuralscore::train_cogent
at ./battlesnake/src/neuralscore.rs:166:3
11: battlesnake::main
at ./battlesnake/src/main.rs:314:16
12: core::ops::function::FnOnce::call_once
at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/ops/function.rs:227:5
code:
env.run(|ctx| {
let train_tensor = arr2(train_frames.iter().map(|x| Frame { values: *x }).collect_vec().as_slice());
let train_y = arr2(train_you_won.as_slice());
let test_tensor = arr2(test_frames.iter().map(|x| Frame { values: *x }).collect_vec().as_slice());
let test_y = arr2(test_you_won.as_slice());
let x = ctx.placeholder("x", &[-1, (N_SNAKES * DIM_PER_SNAKE) as isize]);
let y = ctx.placeholder("y", &[-1, 1]);
let w = ctx.variable("w1");
let b = ctx.variable("b1");
let w2 = ctx.variable("w2");
let b2 = ctx.variable("b2");
let z = tanh(matmul(x, w) + b);
let dropout_1_test = dropout(z, 0.5, false);
let z2_test = sigmoid(matmul(dropout_1_test, w2) + b2);
let mean_loss_test = reduce_mean(mean_squared_error(z2_test, &y), &[0], false);
//let dropout_1_train = dropout(z, 0.5, true);
//let z2_train = sigmoid(matmul(dropout_1_train, w2) + b2);
//let mean_loss_train = reduce_mean(mean_squared_error(z2_train, &y), &[0], false);
let loss = ctx.evaluator().push(mean_loss_test)
.feed(x, train_tensor.view())
.feed(y, train_y.view() )
.run();
eprintln!("initial loss: {:?}", loss);
});
The same code works fine when dropout's train argument is set to true.
I'm trying to use autograd in conjunction with another library (nshare) which only supports ndarray 0.15. I recognize that this incompatibility isn't autograd's fault, but it seems like the simplest solution would be for autograd to support the latest version of ndarray.
I forked autograd and made a quick attempt to implement this myself; it looks like most of the issues are straightforward renamings. The only real sticking point was some slicing stuff in array_ops. But unfortunately that is a bit beyond my current understanding of ndarray.
When attempting to compile an example application using the cargo invocation suggested on crates.io (autograd = { version = "0.9.5", features = ["mkl"] }), I get the following error:
error: failed to run custom build command for `intel-mkl-src v0.2.5`
Caused by:
process didn't exit successfully: `/path/to/project/target/debug/build/intel-mkl-src-2710610bac526c6f/build-script-build` (exit code: 101)
--- stdout
Downlaod archive
--- stderr
thread 'main' panicked at 'check sum of downloaded archive is incorrect: md5sum=d41d8cd98f00b204e9800998ecf8427e', /path/to/.cargo/registry/src/github.com-1ecc6299db9ec823/intel-mkl-src-0.2.5/build.rs:88:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
A workaround is to simply not use the mkl feature.
Hi!
I am just starting to explore autograd, and I ran into an issue when using a 2D input:
autograd::with(|g| {
    use autograd::ndarray;
    let (w, h) = (2, 2);
    let x = g.placeholder(&[w as isize, h as isize]);
    let loss = x.access_elem(0).access_elem(0) * x.access_elem(0).access_elem(0)
        + x.access_elem(0).access_elem(1) * x.access_elem(0).access_elem(1);
    let grad = g.grad(&[loss], &[x])[0]; // dloss/dx
    let x_values = ndarray::Array2::from_shape_vec((w, h), vec![0.0; w * h]).unwrap();
    let _grad = grad
        .eval(&[x.given(x_values.view())])
        .expect("Failed to eval gradient");
});
This gives me Failed to eval gradient: OpError(OutOfBounds("access_elem failed.")).
Whichever shape I pick (1,2, 2,1, 2,2 etc.) I get the same error.
@rusbridger and I are trying to train a multi-input siamese neural network with this structure. However, I get an error saying that it isn't differentiable.
My inputs x1, x2 and y are of shape [batch, 105, 105, 3], [batch, 105, 105, 3] and [batch, 1] respectively. The type is all f32.
I am running this on Ubuntu WSL.
The error I end up getting is:
thread 'main' panicked at 'Not differentiable with given tensor(s).
These are my layers, I import them into the training function below this block of code
Here is the full repo:
https://github.com/UTMIST/oneshot-rs
/// Compute convolutional layer with max-pooling.
pub fn conv_pool<'g>(x: Tensor<'g>, w: Tensor<'g>, b: Tensor<'g>) -> Tensor<'g> {
let g = x.graph();
let y1 = g.conv2d(x, w, 0, 1) + b;
let y2 = g.relu(y1);
x.graph().max_pool2d(y2, 2, 0, 2)
}
/// Compute the final convolutional layer.
pub fn conv_final<'g>(x: Tensor<'g>, w: Tensor<'g>, b: Tensor<'g>) -> Tensor<'g> {
let g = x.graph();
let y1 = g.conv2d(x, w, 0, 1) + b;
x.graph().relu(y1)
}
// Load inputs.
pub fn inputs(g: &Graph<f32>) -> (Tensor, Tensor, Tensor) {
let x1 = g.placeholder(&[-1, 1, 105, 105]);
let x2 = g.placeholder(&[-1, 1, 105, 105]);
let y = g.placeholder(&[-1, 1]);
(x1, x2, y)
}
/// Compute the final sigmoid layer.
pub fn sigmoid_layer<'g>(x: Tensor<'g>, params: &[Tensor<'g>]) -> (Tensor<'g>, Tensor<'g>) {
let g = x.graph();
let z1 = conv_pool(x, params[0], params[6]);
let z2 = conv_pool(z1, params[1], params[7]);
let z3 = conv_pool(z2, params[2], params[8]);
let twin_2_z4 = conv_final(z3, params[3], params[9]);
let flattened = g.reshape(twin_2_z4, &[-1, 256 * 6 * 6]); // flatten
let dense = flattened.graph().matmul(params[4], flattened) + params[10];
(dense, g.sigmoid(flattened))
}
This is my training function, I just modified the CNN mnist example a bit by adding some of my own layers. This block of code imports the functions from the above code block.
pub fn train(train_x1: NDArr, train_x2: NDArr, train_y: NDArr) {
ag::with(|g| {
let rng = array::ArrayRng::<f32>::default();
macro_rules! rand_normal {
($arr: expr) => {
g.variable(rng.random_normal(&$arr, 0., 0.1));
};
}
macro_rules! zeroes {
($arr: expr) => {
g.variable(array::zeros(&$arr));
};
}
// Weights/Biases for convolutional layers.
let w1 = rand_normal!([64, 1, 10, 10]);
let w2 = rand_normal!([128, 1, 7, 7]);
let w3 = rand_normal!([128, 1, 4, 4]);
let w4 = rand_normal!([256, 1, 4, 4]);
let b1 = zeroes!([1, 64, 105, 105]);
let b2 = zeroes!([1, 128, 42, 42]);
let b3 = zeroes!([1, 128, 18, 18]);
let b4 = zeroes!([1, 256, 6, 6]);
// Weights/Biases for dense layers.
let w5 = rand_normal!([4096, 256 * 6 * 6]);
let w6 = rand_normal!([4096, 1]);
let b5 = zeroes!([4096, 1]);
// Collect parameters and add to adam_state.
let params = &[w1, w2, w3, w4, w5, w6, b1, b2, b3, b4, b5];
let adam_state = adam::AdamState::new(
params
.iter()
.map(|v| v.get_variable_array().unwrap())
.collect::<Vec<_>>()
.as_slice(),
);
// Load inputs and compute sigmoid layers.
let (x1, x2, y) = layers::inputs(g);
let (_dense_1, sig_1) = layers::sigmoid_layer(x1, params);
let (_dense_2, sig_2) = layers::sigmoid_layer(x2, params);
// Siamese Distance
let pre_weighted_l1_dist = g.abs(sig_2 - sig_1);
let weighted_l1_dist = g.matmul(w6, pre_weighted_l1_dist);
let final_prediction = weighted_l1_dist;
let loss = g.sigmoid_cross_entropy(final_prediction, y);
let grads = &g.grad(&[&loss], params);
let _update_ops: &[Tensor] =
&adam::Adam::default().compute_updates(params, grads, &adam_state, g);
// println!("{:?}", );
for epoch in 0..1 {
let x1_batch = train_x1.slice(s![0..500, .., .., ..]).into_dyn();
let x2_batch = train_x2.slice(s![0..500, .., .., ..]).into_dyn();
let y_batch = train_y.slice(s![0..500, ..]).into_dyn();
println!("{:?}", x1_batch.shape());
println!("{:?}", x2_batch.shape());
println!("{:?}", y_batch.shape());
g.eval(
_update_ops,
&[x1.given(x1_batch), x2.given(x2_batch), y.given(y_batch)],
);
println!("finish epoch {}", epoch);
}
})
}
Matrix multiplication can cause a segfault when using MKL. Here is a snippet:
let W = ag::variable(ag::ndarray_ext::glorot_uniform(&[(output_size as usize)*4, input_size as usize]));
let s = ag::slice(&W, &[0, 0], &[output_size, -1]);
let r = ag::matmul(&s, &x);
In the above example, x has dimensions [input_size, 1] and s will have dimensions [output_size, input_size], so matrix multiplication works out. The example runs with no issues without the mkl feature, but with mkl the eval step segfaults. I analyzed with valgrind, and it seems like there are a few extra writes happening somewhere, resulting in a segfault.
The following code panics with ndarray: could not broadcast array from shape: [2] to: [].
use autograd::ndarray::{Array, IxDyn};
use autograd::optimizers::adam::{Adam, AdamState};
use autograd::tensor::{Constant, Variable};
use std::sync::Arc;
use std::sync::RwLock;
fn main() {
    let v: Arc<RwLock<Array<f64, IxDyn>>> =
        autograd::ndarray_ext::into_shared(autograd::array_gen::zeros(&[]));
    let adam_state = AdamState::new(&[&v]);
    let adam = Adam::default();
    autograd::with(|graph| {
        let c = graph.constant(autograd::array_gen::ones(&[2]));
        let v = graph.variable(v.clone());
        let y = graph.reduce_sum_to_scalar(c * v);
        let grads = graph.grad(&[y], &[v]);
        let updates = adam.compute_updates(&[v], &grads, &adam_state, graph);
        graph.eval(&updates, &[]);
    })
}
It seems like grads[0] has shape [2], although y and v are scalars. I tried to find out why but didn't find an answer.
Is this a bug? Is there a workaround?
If c * v is replaced by c + v it doesn't panic, by the way.
Thanks in advance for any help you can provide!
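For reference, the expected gradient here can be worked out by hand. When a scalar v is broadcast against a vector c, each element contributes d(c_i * v)/dv = c_i, and those contributions have to be summed back down to v's scalar shape. The following is a minimal plain-Rust sketch of that reduction; the function name is hypothetical and this is not autograd's implementation:

```rust
/// Gradient of y = sum(c * v) with respect to the scalar `v`, where `v` is
/// broadcast across the vector `c`. `upstream` is dL/dy (1.0 when y is the loss).
/// Hypothetical helper for illustration only.
pub fn grad_wrt_broadcast_scalar(c: &[f64], upstream: f64) -> f64 {
    // Each element contributes upstream * c_i; summing the contributions
    // collapses the broadcast axis so the result has the same (scalar) shape as v.
    c.iter().map(|c_i| upstream * c_i).sum()
}
```

With c = [1, 1] as in the report, the gradient with respect to v should be the single value 2, not an array of shape [2].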
Your library is exactly what I have been looking for!
I have been trying to rig it up to a wasm build and I am hitting a runtime error. When calling autograd::ndarray_ext::array_gen::glorot_uniform I get a stack trace that looks like the following:
std::panicking::rust_panic_with_hook
std::panicking::begin_panic
rand::jitter::get_nstime
rand::jitter::JitterRng::new_with_timer
rand::jitter::JitterRng::new
rand::StdRng::new
rand::thread_rng::THREAD_RNG_KEY::__init
_$LT$std..thread..local..LocalKey$LT$T$GT$$GT$4init
$LT$std..thread..local..LocalKey$LT$T$GT$$GT$8try_with
$LT$std..thread..local..LocalKey$LT$T$GT$$GT$4with
rand::thread_rng
rand::weak_rng
autograd::ndarray_ext::array_gen::glorot_uniform
So we are panicking in the rand::jitter::get_nstime function. This makes sense because when I look at jitter.rs:703, the implementation for wasm is unreachable!().
Is it possible to pass in a custom &mut rand::Rng to these functions? If that was the case, I could use something like pcg_rand.
Thanks
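To make the request concrete, here is a rough sketch of what an initializer with an injected random source could look like. Everything in it is an assumption made for illustration: the UniformSource trait, the Lcg generator, and glorot_uniform_with are hypothetical names, not part of autograd or the rand crate. It only uses the standard library, so nothing in it depends on an OS timer:

```rust
/// Minimal source of uniform samples in [0, 1).
/// Hypothetical trait for this sketch; not part of autograd or rand.
pub trait UniformSource {
    fn next_f64(&mut self) -> f64;
}

/// Tiny linear congruential generator that needs no timer or OS entropy.
/// For illustration only; use a proper generator where quality matters.
pub struct Lcg(pub u64);

impl UniformSource for Lcg {
    fn next_f64(&mut self) -> f64 {
        // Multiplier and increment from Knuth's MMIX generator.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the top 53 bits and scale them into [0, 1).
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Glorot (Xavier) uniform initialization for a `rows x cols` matrix in
/// row-major order. Samples lie in [-limit, limit) with
/// limit = sqrt(6 / (rows + cols)).
pub fn glorot_uniform_with<R: UniformSource>(
    rows: usize,
    cols: usize,
    rng: &mut R,
) -> Vec<f64> {
    let limit = (6.0 / (rows + cols) as f64).sqrt();
    (0..rows * cols)
        .map(|_| (rng.next_f64() * 2.0 - 1.0) * limit)
        .collect()
}
```

A caller on wasm could then pass in any generator that implements the trait, for example glorot_uniform_with(3, 5, &mut Lcg(42)).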
Hi @raskr
I am very new to Rust and lower level programming, but not to tensors or differential equations. (I'm a data scientist.)
I'd like to help out in some way as a chance to work on my Rust skills while also doing something useful in a field I know/care about.
I saw that the project had been "asleep" for a while, but that you recently made a commit again. Is there something in your mind that you'd like to tackle?
stacktrace:
0: rust_begin_unwind
at /rustc/08095fc1f875c89e507f17cf6c6a780c8ffa4c01/library/std/src/panicking.rs:515:5
1: std::panicking::begin_panic_fmt
at /rustc/08095fc1f875c89e507f17cf6c6a780c8ffa4c01/library/std/src/panicking.rs:457:5
2: ndarray::arraytraits::<impl core::ops::index::Index<I> for ndarray::ArrayBase<S,D>>::index
at /home/shui/.cargo/registry/src/github.com-1ecc6299db9ec823/ndarray-0.12.1/src/arraytraits.rs:56:9
3: <autograd::ops::binary_ops::DivOp as autograd::op::Op<T>>::compute
at /home/shui/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.1.0/src/ops/binary_ops.rs:247:27
4: autograd::runtime::<impl autograd::graph::Graph<F>>::eval::{{closure}}
at /home/shui/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.1.0/src/runtime.rs:392:25
5: core::result::Result<T,E>::and_then
at /rustc/08095fc1f875c89e507f17cf6c6a780c8ffa4c01/library/core/src/result.rs:948:22
6: autograd::runtime::<impl autograd::graph::Graph<F>>::eval
at /home/shui/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.1.0/src/runtime.rs:390:47
...
It looks like x1 has shape [1], i.e. it is 1-dimensional, but the index is 0-dimensional.
On my x86 Linux environment with Rust 1.47.0 and autograd 1.0.2, the code below segfaults.
Any idea what's causing this crash?
Given that this segfault happened in a very simple loop and that there's a decent amount of code inside unsafe {} blocks, would it be possible/feasible to use conditional compilation to swap in safe alternatives to unsafe {} blocks when they exist purely for performance, along with safe versions of data structures (e.g. UnsafeCell; I'm not sure how trustworthy SmallVec is)?
test code:
==> Cargo.toml <==
[package]
name = "autograd_test"
version = "0.1.0"
edition = "2018"
[dependencies]
autograd = { version = "1.0.2" }
==> src/main.rs <==
extern crate autograd as ag;
fn main() {
    ag::with(|g: &mut ag::Graph<f64>| {
        let mut loop_iter = 1;
        loop {
            let x = g.placeholder(&[3]);
            let z = 2.0 * x;
            eprintln!("about to call grad() {}", loop_iter);
            g.grad(&[z], &[x])[0];
            eprintln!("grad() call completed {}", loop_iter);
            loop_iter += 1;
            if loop_iter >= 1000 { break };
        };
    });
}
The gdb output and backtrace for me looks like this (this crashes in the --release mode as well for me):
about to call grad() 1
grad() call completed 1
about to call grad() 2
grad() call completed 2
...
grad() call completed 25
about to call grad() 26
grad() call completed 26
about to call grad() 27
Program received signal SIGSEGV, Segmentation fault.
smallvec::SmallVec<A>::spilled (self=0x7ffff7fd12c0)
at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.4.2/src/lib.rs:695
695 self.capacity > Self::inline_capacity()
(gdb) bt
#0 smallvec::SmallVec<A>::spilled (self=0x7ffff7fd12c0)
at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.4.2/src/lib.rs:695
#1 0x0000555555585ab3 in smallvec::SmallVec<A>::triple (self=0x7ffff7fd12c0)
at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.4.2/src/lib.rs:666
#2 0x0000555555584bce in smallvec::SmallVec<A>::len (self=0x7ffff7fd12c0)
at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.4.2/src/lib.rs:646
#3 0x00005555555d3e1a in autograd::gradient::symbolic_gradients (ys=..., wrt=..., gys=..., g=0x7fffffffda60)
at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.0.2/src/gradient.rs:166
#4 0x00005555555c5573 in autograd::ops::<impl autograd::graph::Graph<F>>::grad_with_default (
self=0x7fffffffda60, ys=..., xs=..., ys_grads=...)
at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.0.2/src/ops/mod.rs:128
#5 0x00005555555c5cbb in autograd::ops::<impl autograd::graph::Graph<F>>::grad (self=0x7fffffffda60, ys_=...,
xs=...) at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.0.2/src/ops/mod.rs:96
#6 0x000055555559a42e in autograd_test::main::{{closure}} (g=0x7fffffffda60) at src/main.rs:10
#7 0x00005555555c6906 in autograd::graph::with (f=...)
at /home/ktegan/.cargo/registry/src/github.com-1ecc6299db9ec823/autograd-1.0.2/src/graph.rs:91
#8 0x00005555555bf546 in autograd_test::main () at src/main.rs:4
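The conditional-compilation idea raised in this report could look roughly like the sketch below. It is only an illustration under assumptions: the Cargo feature name "unchecked" and both function names are hypothetical, and nothing here is taken from autograd's source. The bounds-checked path is the default, and the unsafe path is compiled in only when the feature is explicitly enabled:

```rust
/// Sums the first `n` elements of `data` (or all of them if `n` is larger).
pub fn sum_prefix(data: &[f64], n: usize) -> f64 {
    // Clamping establishes i < data.len() for every index used below.
    let n = n.min(data.len());
    let mut total = 0.0;
    for i in 0..n {
        total += element(data, i);
    }
    total
}

/// Fast path, compiled only with the hypothetical `unchecked` feature.
#[cfg(feature = "unchecked")]
#[inline]
fn element(data: &[f64], i: usize) -> f64 {
    debug_assert!(i < data.len());
    // SAFETY: the only caller, `sum_prefix`, clamps indices to `data.len()`.
    unsafe { *data.get_unchecked(i) }
}

/// Default path: ordinary indexing, which panics instead of invoking UB.
#[cfg(not(feature = "unchecked"))]
#[inline]
fn element(data: &[f64], i: usize) -> f64 {
    data[i]
}
```

With this layout a crash like the one above would surface as a panic with a message in the default build, which is usually much easier to diagnose than a SIGSEGV.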
I was trying to use the Adam optimizer for a DQN problem, but the Q values always kept diverging regardless of hyperparameters. So I looked into the code and it seems to me like the optimizer doesn't correctly update the stateful parameters. It calculates new values for m and v, but while doing so it makes copies of the NdArrays. The actual state variables referenced by xs[2] and xs[3] are never updated, so on each iteration the optimizer just uses zero-initialized values for m and v.
Line 1490 in 6c769fb
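For comparison, here is a minimal standalone sketch of an Adam step in which the moment estimates m and v are written back into persistent state rather than into temporary copies. It is plain Rust over slices, assuming the usual default hyperparameters (beta1 = 0.9, beta2 = 0.999, eps = 1e-8); the type and function names are hypothetical and this is not the crate's implementation:

```rust
/// Optimizer state for Adam, owned by the caller and mutated on every step.
pub struct AdamState {
    pub m: Vec<f64>, // first-moment (mean) estimates
    pub v: Vec<f64>, // second-moment (uncentered variance) estimates
    pub t: i32,      // number of steps taken so far
}

impl AdamState {
    pub fn new(n: usize) -> Self {
        AdamState { m: vec![0.0; n], v: vec![0.0; n], t: 0 }
    }
}

const BETA1: f64 = 0.9;
const BETA2: f64 = 0.999;
const EPS: f64 = 1e-8;

/// One Adam update. Both `params` and `state` are updated in place, so the
/// next call sees the moments accumulated by this one.
pub fn adam_step(params: &mut [f64], grads: &[f64], state: &mut AdamState, lr: f64) {
    assert_eq!(params.len(), grads.len());
    assert_eq!(params.len(), state.m.len());
    state.t += 1;
    // Bias-correction terms for the zero-initialized moments.
    let bias1 = 1.0 - BETA1.powi(state.t);
    let bias2 = 1.0 - BETA2.powi(state.t);
    for i in 0..params.len() {
        let g = grads[i];
        // Write the new moments back into the persistent state.
        state.m[i] = BETA1 * state.m[i] + (1.0 - BETA1) * g;
        state.v[i] = BETA2 * state.v[i] + (1.0 - BETA2) * g * g;
        let m_hat = state.m[i] / bias1;
        let v_hat = state.v[i] / bias2;
        params[i] -= lr * m_hat / (v_hat.sqrt() + EPS);
    }
}
```

If m and v were reset to zero on every call, as described above, the moving averages would never accumulate and t would no longer match the state being bias-corrected.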
I ran into this panic during actual usage, then reduced it to these tests (borrowing test code from test_tensor_ops_grad.rs). It seems like maybe it's trying to take the gradient of a variable or placeholder when it shouldn't have to because grad_with_default provides the gradient explicitly...
#[test]
fn grad_with_default_variable() {
let mut ctx = ag::VariableEnvironment::new();
let v1 = ctx.slot().set(ndarray::arr1(&[1., 1., 1.]));
ctx.run(|graph| {
let v1 = graph.variable(v1);
let g = T::grad_with_default(&[v1], &[v1], &[v1]);
ag::test_helper::check_theoretical_grads(
v1,
g.as_slice(),
&[v1],
ag::Feeder::new(),
1e-3,
1e-3,
graph,
);
});
}
#[test]
fn grad_with_default_placeholder() {
ag::run(|ctx| {
let mut eval = ctx.evaluator();
let v1 = ctx.placeholder("v1", &[3]);
let v1_val = ndarray::arr1(&[1., 1., 1.]);
eval.feed("v1", v1_val.view());
let g = T::grad_with_default(&[v1], &[v1], &[v1]);
eval.extend(&g).run();
});
}
---- test_tensor_ops_grad::add_n_single_placeholder stdout ----
thread 'test_tensor_ops_grad::add_n_single_placeholder' panicked at 'internal error: entered unreachable code', /n/rust-autograd/src/tensor_ops/basic_source_ops.rs:23:1
stack backtrace:
0: rust_begin_unwind
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:584:5
1: core::panicking::panic_fmt
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:142:14
2: core::panicking::panic
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:48:5
3: <autograd::tensor_ops::basic_source_ops::Placeholder as autograd::op::Op<T>>::grad
at ./src/tensor_ops/basic_source_ops.rs:15:17
4: autograd::op::GradientContext<T>::compute_input_grads
at ./src/op.rs:371:9
5: autograd::gradient::compute_gradients
at ./src/gradient.rs:60:23
6: autograd::tensor_ops::grad_with_default
at ./src/tensor_ops/mod.rs:142:21
7: lib::test_tensor_ops_grad::add_n_single_placeholder::{{closure}}
at ./tests/test_tensor_ops_grad.rs:83:17
8: autograd::graph::run
at ./src/graph.rs:99:5
9: lib::test_tensor_ops_grad::add_n_single_placeholder
at ./tests/test_tensor_ops_grad.rs:78:5
10: lib::test_tensor_ops_grad::add_n_single_placeholder::{{closure}}
at ./tests/test_tensor_ops_grad.rs:77:1
11: core::ops::function::FnOnce::call_once
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248:5
12: core::ops::function::FnOnce::call_once
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
---- test_tensor_ops_grad::add_n_single_variable stdout ----
thread 'test_tensor_ops_grad::add_n_single_variable' panicked at 'internal error: entered unreachable code', /n/rust-autograd/src/tensor_ops/basic_source_ops.rs:21:1
stack backtrace:
0: rust_begin_unwind
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/std/src/panicking.rs:584:5
1: core::panicking::panic_fmt
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:142:14
2: core::panicking::panic
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/panicking.rs:48:5
3: <autograd::tensor_ops::basic_source_ops::Variable as autograd::op::Op<T>>::grad
at ./src/tensor_ops/basic_source_ops.rs:15:17
4: autograd::op::GradientContext<T>::compute_input_grads
at ./src/op.rs:371:9
5: autograd::gradient::compute_gradients
at ./src/gradient.rs:60:23
6: autograd::tensor_ops::grad_with_default
at ./src/tensor_ops/mod.rs:142:21
7: lib::test_tensor_ops_grad::add_n_single_variable::{{closure}}
at ./tests/test_tensor_ops_grad.rs:63:17
8: autograd::variable::VariableEnvironment<F>::run
at ./src/variable.rs:697:9
9: lib::test_tensor_ops_grad::add_n_single_variable
at ./tests/test_tensor_ops_grad.rs:61:5
10: lib::test_tensor_ops_grad::add_n_single_variable::{{closure}}
at ./tests/test_tensor_ops_grad.rs:58:1
11: core::ops::function::FnOnce::call_once
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248:5
12: core::ops::function::FnOnce::call_once
at /rustc/897e37553bba8b42751c67658967889d11ecd120/library/core/src/ops/function.rs:248:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
The built-in gradient descent function is not usable at the moment; the compute_updates function needs a pub prefix to fix this.
Workaround: Copy-paste the implementation and use that instead.
When using slice with negative numbers the result is off by 1.
let a: ag::Tensor<f32> = g.zeros(&[4, 4]);
let b = g.slice(a, &[0, 0], &[-2, 2]); // numpy equivalent is a[:-1, 0:2]
assert_eq!(b.eval(&[]).unwrap().shape(), &[3, 2]);
The assert fails because the shape of b is [2,2].
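The convention implied by the comment in the snippet above, where an end of -1 means "through the last element" and -2 therefore matches numpy's [:-1], can be written down as a small helper. This is a sketch of the expected behaviour only; resolve_end is a hypothetical name and this is not how autograd resolves indices internally:

```rust
/// Converts a possibly negative `end` index into an exclusive end offset for
/// an axis of length `len`. Negative values count from the back, with -1
/// meaning "through the last element". Returns `None` if the index falls
/// outside the axis. Hypothetical helper for illustration.
pub fn resolve_end(end: isize, len: usize) -> Option<usize> {
    let len = len as isize;
    let resolved = if end < 0 { len + end + 1 } else { end };
    if (0..=len).contains(&resolved) {
        Some(resolved as usize)
    } else {
        None
    }
}
```

For the 4x4 example, resolve_end(-2, 4) gives 3 on the first axis and resolve_end(2, 4) gives 2 on the second, which is the [3, 2] shape the assert expects.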
Hi. I tried to build a simple perceptron with more than one hidden layer, but its precision seems to be very low. For the first 10 epochs it shows some progress, but then it gets stuck at an error of ~0.585, which is very large for a task that predicts a value from 0.0 to 1.0.
Could you please take a look and tell me what I did wrong?
Here it is: https://gist.github.com/inferrna/b1b49b9d0e161a5104670ecb7870e19a
Hi,
I am evaluating autograd for our project. I really like the concept, but I'm a bit concerned about the use of unsafe and some issues that come with it.
For example:
- lib.rs:243 creates a new typed Vec<T> and calls set_len. Although the function is marked unsafe, according to the documentation the elements must be initialized before calling this function, so it is borderline UB.
- lib.rs:253 reads and casts any pointer as another type and is unsound. In contrast to the documentation it would not panic, but just invoke UB (you could cast_as an &u8 to u32).
- mlp_mnist.rs:171 (and others) transmutes a [u8; 4] to a u32. I think this can cause endian issues (although I'm not 100% sure here, as you load that from disk and later explicitly force a be conversion).
- uninitialized_vec is used to create a reference to an uninitialized value.
I was wondering if you have an "unsafe roadmap" moving forward, and/or have plans to review the current use of unsafe in the code?
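As a rough illustration of what safe replacements for two of the listed patterns might look like, here is a sketch using only the standard library. The function names are hypothetical and the code is not taken from the crate:

```rust
/// Decodes a big-endian u32 header field without `transmute`.
/// The result is the same on little- and big-endian hosts.
pub fn read_be_u32(bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(bytes)
}

/// Allocates a buffer of `len` elements that are all initialized, instead of
/// calling `Vec::set_len` on uninitialized memory.
pub fn zeroed_vec<T: Default + Clone>(len: usize) -> Vec<T> {
    vec![T::default(); len]
}
```

For example, read_be_u32([0x00, 0x00, 0x08, 0x03]) yields 2051 regardless of the host's byte order. The initialized buffer costs one extra pass over the memory, which is the trade-off against the uninitialized version.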
Hi, I am working on a simple k-means clustering algorithm using autograd.
I want to achieve the same result as this TF code:
assignments = tf.constant([1, 0, 0])
c = 1
tf.where(tf.equal(assignments, c))
It simply returns the indices where the condition is True.
Is there a simple alternative for this in autograd,
or should I define my own custom op again?
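Outside the graph, the same lookup is a short iterator chain. This is a plain-Rust sketch of the 1-D case only; `indices_where_eq` is a hypothetical helper, and it says nothing about whether autograd offers a built-in op for this.

```rust
/// Indices `i` at which `values[i] == target`, a 1-D analogue of
/// `tf.where(tf.equal(values, target))`.
fn indices_where_eq<T: PartialEq>(values: &[T], target: &T) -> Vec<usize> {
    values
        .iter()
        .enumerate()
        .filter(|(_, v)| *v == target)
        .map(|(i, _)| i)
        .collect()
}

fn main() {
    let assignments = [1, 0, 0];
    let c = 1;
    println!("{:?}", indices_where_eq(&assignments, &c)); // [0]
}
```

For k-means, the resulting indices could be used to gather the points that belong to cluster `c` after the assignments have been evaluated to a concrete array.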
I'm currently working on some ML projects where autograd looks like the ideal tool for the job. Unfortunately, I had a bit of trouble figuring out the basics of autograd, because of the somewhat terse documentation.
At the moment, the main content of the docs begins with
Here we are just computing partial derivatives of z = 2x^2 + 3y + 1.
followed by a block of code which does exactly that. This is helpful as an example, but what's missing is a conceptual overview: What's a placeholder? What's a tensor? What does ag::run mean? What's feeding?
I was able to figure these things out by digging through the rest of the docs, but it wasn't easy, and it would be much more difficult for someone who was less familiar with ML concepts than me (I am perhaps in the middle of the spectrum). And considering how much work has gone into the implementation, making it more accessible would be very valuable!
Many of the individual functions could also benefit from more-detailed documentation. As an example, the documentation for tensor_ops::grad() currently says:
Symbolic gradient tensors of xs in the same order as xs's
- ys - Targets of differentiation that are arbitrary shapes.
- xs - Tensors with which differentiate ys.
See the more useful helper: crate::optimizers::grad_helper()
and returns a vector of Tensors; but it could be much clearer about what the returned values are, and what sizes they are. (One for each of xs, I assume? But what shape are they if there is more than one of ys?)
(This is only an example; almost every function has a similar level of detail at present.)
Hi, I am a beginner with this interesting Rust deep learning library.
When I try to run several basic examples using ndarray, I always get a compiler error saying I should use autograd::ndarray. The code and error are shown below:
ag::with(|g| {
let x = g.placeholder(&[-1,-1]);
let value = array![[1., 1.]];
println!("{:?}", x.eval(&[x.given(value.view())]).unwrap());
});
error[E0308]: mismatched types
--> src\main.rs:14:43
|
14 | println!("{:?}", x.eval(&[x.given(value.view())]).unwrap());
| ^^^^^^^^^^^^ expected struct `autograd::ndarray::ArrayBase`, found struct `ndarray::ArrayBase`
|
= note: expected struct `autograd::ndarray::ArrayBase<autograd::ndarray::ViewRepr<&_>, _>`
found struct `ndarray::ArrayBase<ndarray::ViewRepr<&{float}>, ndarray::Dim<[usize; 2]>>`
= note: perhaps two different versions of crate `ndarray` are being used?
Is it compulsory to use ag::ndarray::array![[1., 1.]] instead of array![[1., 1.]] in cases like this?
Thanks for your help.
The library is working really great so far. I am just seeing some interesting behaviour with ag::argmax(). I am passing in an array of 4, and sometimes I am seeing an argmax result of 4 or 5. I captured some inputs below.
let ref x = ag::placeholder(&[-1, 4]);
let ref a = ag::argmax(&x, 1, false);
let ref x_val = array![[0.262881, 0.2613406, 0.26295722, 0.26295722]].into_dyn();
let ref feed = [(x, x_val)];
println!("{}", a.eval(feed));
Outputs
[5]
And in python
>>> import numpy as np
>>> np.argmax([0.262881, 0.2613406, 0.26295722, 0.26295722])
2
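For comparison, a reference argmax with numpy's tie-breaking rule (the first maximum wins) is sketched below in plain Rust. It is not the crate's implementation. Note that the reported output 5 equals 2 + 3, the sum of the two tied indices; that may hint at tied positions being accumulated, but this is only a guess from the numbers.

```rust
/// Index of the first maximum, matching numpy.argmax's tie-breaking rule.
/// Returns None for an empty slice. NaN handling is not addressed here.
fn argmax(values: &[f32]) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &v) in values.iter().enumerate() {
        match best {
            // Only a strictly greater value replaces the current best,
            // so the earliest index is kept on ties.
            Some((_, b)) if !(v > b) => {}
            _ => best = Some((i, v)),
        }
    }
    best.map(|(i, _)| i)
}

fn main() {
    let x = [0.262881, 0.2613406, 0.26295722, 0.26295722];
    println!("{:?}", argmax(&x)); // Some(2)
}
```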
All the examples in the repo use extern crate autograd or extern crate ndarray, which is no longer needed and not recommended since the Rust 2018 edition.
It should be a simple fix to update these in all instances.
Is it possible to execute the updates from an optimizer (run) and return the loss at the same time (eval)? I can't figure out how to do this. The theano.function interface I'm familiar with accepts return parameters and a separate list/dict of updates. Perhaps with autograd the eval has to be in updates, but I don't know how to retrieve the value.
I'm surveying this crate for possible use in physics code, and can't help but notice that all tensors are confined to single-precision floating point numbers. Basically, physics is chock-full of multivariate functions with insanely complicated derivatives that provide an immense barrier to their implementation, so automatic differentiation could potentially be very useful. However, sometimes these functions may involve sums of signed values that wildly differ in magnitude, making f32 a dangerous choice for implementation.
Making the crate generic over this would of course not be a trivial effort, as it will appear in virtually all traits and types. You would also most likely need to depend on the Float trait from num or similar.
Note this probably isn't the only blocker towards the crate's use in these applications, and I can understand if you don't want to add this support for only the remote chance of attracting people from other disciplines.
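The precision concern is easy to demonstrate with a sum of signed values of very different magnitude. The sketch below uses a hypothetical generic `sum` helper bounded only by standard-library traits, to keep it dependency-free; a real generic tensor type would more likely use a trait such as num's Float, as suggested above.

```rust
use std::ops::Add;

/// Left-to-right sum, generic over the element type.
fn sum<T: Copy + Default + Add<Output = T>>(xs: &[T]) -> T {
    xs.iter().fold(T::default(), |acc, &x| acc + x)
}

fn main() {
    // A large positive term, a small term, and a large negative term.
    let single = sum(&[1.0e8_f32, 1.0, -1.0e8]);
    let double = sum(&[1.0e8_f64, 1.0, -1.0e8]);
    // Near 1e8 adjacent f32 values are 8 apart, so the 1.0 is absorbed.
    println!("f32: {}, f64: {}", single, double); // f32: 0, f64: 1
}
```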
When trying to make a predict fn involving a conv2d (similar to cnn_mnist), I see segmentation faults after around 21 calls.
Is this a bug, or is there a better way to make predictions?
use autograd as ag;
use ag::ndarray_ext as array;
use ag::tensor::Variable;
fn main() {
let rng = array::ArrayRng::<f32>::default();
let w1_arr = array::into_shared(rng.random_normal(&[16, 1, 3, 3], 0., 0.5));
let b1_arr = array::into_shared(array::zeros(&[1, 16, 8, 8]));
ag::with(|g| {
let rng1 = array::ArrayRng::<f32>::default();
let w1 = g.variable(w1_arr.clone());
let b1 = g.variable(b1_arr.clone());
for i in 0..100 {
println!("Calling pred: {}", i);
let x = g.variable(rng1.glorot_uniform(&[8, 8]));
println!("Input value: {:?}", x.eval(&[]));
let _ = g.conv2d(x, w1, 1, 1) + b1;
}
})
}
Hi, I am using autograd to build a neural network with a custom loss function.
The loss requires computing the lgamma function of tensors.
I noticed that TensorFlow has tf.math.lgamma(), and I think it would be really complex to define the operation myself in autograd, since the gradient of the logarithm of the fraction is too complex for me to work out.
Do you have any support for that operation, or any advice on how I can achieve it?
Thanks for your help.
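For a custom op, the useful fact is that the derivative of ln Γ(x) is the digamma function ψ(x), so the backward pass is the upstream gradient times ψ(x). The sketch below approximates ψ for x > 0 with the standard recurrence plus asymptotic series. `digamma` and `lgamma_backward` are hypothetical helpers written for illustration, and the accuracy is only what the truncated series provides.

```rust
/// Digamma function psi(x) = d/dx ln(Gamma(x)), intended for x > 0.
fn digamma(mut x: f64) -> f64 {
    let mut acc = 0.0;
    // Recurrence psi(x) = psi(x + 1) - 1/x shifts x into the range
    // where the asymptotic series is accurate.
    while x < 6.0 {
        acc -= 1.0 / x;
        x += 1.0;
    }
    let inv = 1.0 / x;
    let inv2 = inv * inv;
    // ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    acc + x.ln() - 0.5 * inv - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
}

/// Element-wise backward pass for y = lgamma(x): gx = gy * psi(x).
fn lgamma_backward(x: &[f64], gy: &[f64]) -> Vec<f64> {
    x.iter().zip(gy).map(|(&xi, &g)| g * digamma(xi)).collect()
}

fn main() {
    // psi(1) is the negative Euler-Mascheroni constant, about -0.5772.
    println!("{:.4}", digamma(1.0));
    println!("{:?}", lgamma_backward(&[1.0, 2.0], &[1.0, 1.0]));
}
```

The forward pass still needs an lgamma implementation of its own, which this sketch does not provide.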
Hi, while developing my neural network algorithm, I found that computing the gradient of an operation between two tensors with different numbers of dimensions always fails. The errors include thread 'main' panicked at 'called Result::unwrap() on an Err value:
and thread 'main' panicked at 'called Option::unwrap() on a None value:
when I try to print the grad result.
One example is like:
let t1 = g.constant( array![[1.0,2.0, 3.0, 4.0], [1.0,2.0, 3.0, 4.0]] );
let t2 = g.constant( array![1.0,2.0, 3.0, 4.0] );
let cal = t1/t2;
let grads = g.grad(&[cal], &[t2]);
This greatly limits the expressions I can use during development.
After testing, I found that expanding t2's dimensions works around the problem, like this:
...
let t3 = g.tile(g.expand_dims(t2, &[0]), 0, 2);
let cal = t1/t3;
...
However, this workaround requires much more development effort, since methods like reduced_sum are used very commonly.
Do you have any other ideas for solving this problem?
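For reference, the gradient that a broadcast-aware division should give can be written out by hand: each element contributes d(a/b)/db = -a/b², and the contributions are summed over the axis along which t2 was broadcast. The plain-Rust sketch below shows that rule for a (rows, cols) numerator and a (cols,) denominator. `div_grad_rhs` is a hypothetical helper, not autograd's implementation.

```rust
/// Gradient of t1 / t2 with respect to t2, where t1 has shape (rows, cols)
/// and t2 has shape (cols,) and is broadcast over rows.
/// `gy` is the upstream gradient and has the same shape as t1.
fn div_grad_rhs(t1: &[Vec<f64>], t2: &[f64], gy: &[Vec<f64>]) -> Vec<f64> {
    let mut grad = vec![0.0; t2.len()];
    for (row, gy_row) in t1.iter().zip(gy) {
        for (j, (&a, &g)) in row.iter().zip(gy_row).enumerate() {
            // d(a/b)/db = -a / b^2; accumulating over rows undoes the broadcast.
            grad[j] += -g * a / (t2[j] * t2[j]);
        }
    }
    grad
}

fn main() {
    let t1 = vec![vec![1.0, 2.0, 3.0, 4.0]; 2];
    let t2 = [1.0, 2.0, 3.0, 4.0];
    let gy = vec![vec![1.0; 4]; 2]; // upstream gradient of ones
    println!("{:?}", div_grad_rhs(&t1, &t2, &gy));
}
```

With the values from the example, each column of t1 equals t2, so the result is -2/t2 element-wise. The gradient has t2's shape, not t1's, which is the reduction step that the tile workaround makes explicit.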
The docs about GradientContext have some errors.
When I take the derivative of a constant with respect to a variable, I expect to get the value "0" or an Option/error, but it panics instead.
Hi, I am using the ag.argmax method and I ran into a weird situation.
let test = g.constant(array![85.0, 16.0, 0.04, 85.0, 16.0, 85.0]);
let max_dis_index = g.argmax(test, 0, false).show_with("max_dis_index is");
g.eval(&[max_dis_index], &[]);
The output becomes 8.0, which is obviously wrong.
This happens when the maximum value appears multiple times.
I was attempting to optimize something with the softmax_cross_entropy loss function, but I kept getting broadcast shape errors, which confused me. I spent time digging into the code, and I realized that the output here isn't following the API which says it should output a 2-rank tensor, instead outputting a 1-rank tensor.
fn compute(&self, ctx: &mut crate::op::ComputeContext<T>) -> Result<(), crate::op::OpError> {
let x = &ctx.input(0);
let log_x: NdArray<T> = x - &tensor_ops::math_ops::logsumexp_forward(x, 1, true);
// `t` must be one-hot
let t = &ctx.input(1);
assert_eq!(log_x.ndim(), 2, "x must be 2-ranked tensor");
assert_eq!(t.ndim(), 2, "t must be 2-ranked tensor");
// - t log x ( =(batch, num_classes))
let minus_one = T::one().neg();
ctx.append_output(
(t * &log_x)
.sum_axis(ndarray::Axis(1))
.mapv(move |elem| elem * minus_one),
);
ctx.append_output(log_x);
Ok(())
}
I fixed it in my local copy by just reshaping the array, which seems to work, but I'm not sure whether this is how cross entropy outputs are usually shaped.
let minus_one = T::one().neg();
let result = (t * &log_x)
.sum_axis(ndarray::Axis(1))
.mapv(move |elem| elem * minus_one)
.into_shape(ndarray::IxDyn(&[log_x.shape()[0], 1]))
.unwrap();
assert_eq!(result.ndim(), 2, "result must be 2-ranked tensor");
ctx.append_output(result);
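The intended output shape can be checked against a stand-alone reference. The sketch below computes per-example softmax cross entropy for a (batch, num_classes) input in plain Rust and returns one single-element row per example, i.e. shape (batch, 1). `softmax_cross_entropy` here is a hypothetical free function, not the op from the crate.

```rust
/// Per-example softmax cross entropy for logits `x` and one-hot targets `t`,
/// both of shape (batch, num_classes). Returns shape (batch, 1).
fn softmax_cross_entropy(x: &[Vec<f64>], t: &[Vec<f64>]) -> Vec<[f64; 1]> {
    x.iter()
        .zip(t)
        .map(|(xr, tr)| {
            // log-sum-exp with the row maximum subtracted for numerical stability
            let m = xr.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
            let lse = m + xr.iter().map(|v| (v - m).exp()).sum::<f64>().ln();
            // loss = -sum_j t_j * log_softmax(x)_j
            let loss: f64 = xr.iter().zip(tr).map(|(v, tj)| -tj * (v - lse)).sum();
            [loss]
        })
        .collect()
}

fn main() {
    let x = vec![vec![0.0, 0.0], vec![1.0, 3.0]];
    let t = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    // Two rows with one loss value each; the first is ln 2 for uniform logits.
    println!("{:?}", softmax_cross_entropy(&x, &t));
}
```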