Comments (7)
We had a meeting on Thursday to discuss this. The main points:
- I don't want to remove inplace ops from autograd formulas (deoptimizing them)
- I am OK with bunching autograd operators into bigger units (like contiguous as itself, not a call to copy_)
- Functionalization should be a tool in the toolkit, in general
- It should be possible to have multiple composites and pick the one that contextually makes sense
from functorch.
We decided to revert this for now.
Essentially, the "failure mode" for decomposing an op is much worse than for not decomposing it. If we decompose an op and the result is slow or throws an error, the user will see a warning/error like "aten::op_user_doesnt_use can't be vmapped", and they will have no idea where it came from.
So, for now, we think it's better to err on the side of not decomposing an op unless we explicitly do so.
The problem is that some CompositeImplicitAutograd ops decompose to in-place operations that are not compatible with vmap (note here).
Yes, this is trouble. I have two parallel thoughts here:
- We should ensure implicit autograd composites don't ever use mutation (but at cost of efficiency?)
- Maybe we can provide both the non-mutating and mutating versions? Perhaps using @ailzhang's functionalization pass?
Can we solve these problems by just registering an override for the vmap key for those operations?
@zou3519 Well, VMap key has higher precedence than CompositeImplicitAutograd, so yes, that will just work.
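The precedence argument can be sketched with a toy dispatcher (purely illustrative; this is not PyTorch's real dispatcher, and the op name and key names are simplified stand-ins). The point is just that a kernel registered for the batched key shadows the CompositeImplicitAutograd decomposition, so vmap never sees the in-place ops hiding inside it:

```python
# Toy model of dispatch-key precedence (NOT the real PyTorch dispatcher).
# Keys earlier in PRECEDENCE win; a kernel registered for "Batched"
# shadows the CompositeImplicitAutograd decomposition.
PRECEDENCE = ["Batched", "Autograd", "CompositeImplicitAutograd", "CPU"]

registry = {}  # (op_name, dispatch_key) -> kernel


def register(op, key, kernel):
    registry[(op, key)] = kernel


def dispatch(op, active_keys, *args):
    for key in PRECEDENCE:
        if key in active_keys and (op, key) in registry:
            return registry[(op, key)](*args)
    raise RuntimeError(f"no kernel for {op}")


# The composite decomposition routes through an in-place op (copy_),
# which is exactly what breaks vmap in the real issue.
register("some_composite", "CompositeImplicitAutograd",
         lambda x: dispatch("copy_", {"CPU"}, x))
register("copy_", "CPU", lambda x: x)

# Direct override for the batched key: bypasses the mutating decomposition.
register("some_composite", "Batched", lambda x: list(x))

# With Batched active, the override wins and the decomposition never runs:
assert dispatch("some_composite", {"Batched", "CPU"}, (1, 2)) == [1, 2]
# Without it, the composite decomposition is used instead:
assert dispatch("some_composite",
                {"CompositeImplicitAutograd", "CPU"}, (1, 2)) == (1, 2)
```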
If functionalization could take care of this then that would be great. @ailzhang does functionalization handle something like the following?
x = torch.empty_like(y)
x.copy_(y)
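What a functionalization pass could do with this pattern can be sketched as a toy trace rewriter (a sketch only, not PyTorch's actual functionalization; the trace format and op names are made up for illustration). Each mutating op (trailing underscore) is replaced by a functional op producing a fresh value, and later uses of the mutated variable are redirected to it:

```python
# Toy functionalization pass (a sketch, NOT PyTorch's implementation).
# A trace is a list of (output_var, op, args) triples. Mutating ops
# (trailing "_") become functional ops yielding a fresh value, and
# subsequent reads of the mutated variable are rewired to that value.
def functionalize(trace):
    alias = {}          # var -> latest functional value replacing it
    out, fresh = [], 0

    def resolve(v):
        return alias.get(v, v)

    for var, op, args in trace:
        args = tuple(resolve(a) for a in args)
        if op.endswith("_"):                 # mutation, e.g. x.copy_(y)
            fresh += 1
            new = f"%{fresh}"
            out.append((new, op.rstrip("_"), (resolve(var),) + args))
            alias[var] = new                 # future uses of var see new
        else:
            out.append((var, op, args))
    return out


trace = [
    ("x", "empty_like", ("y",)),
    ("x", "copy_", ("y",)),      # x.copy_(y)
    ("z", "add", ("x", "y")),    # reads the mutated x
]
functional = functionalize(trace)
# No mutation survives, and the read of x now sees the copy's result:
assert all(not op.endswith("_") for _, op, _ in functional)
assert functional[2] == ("z", "add", ("%1", "y"))
```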
@ezyang one alternative along the lines of "providing both the non-mutating and mutating versions" could be if we have the ability to define our own set of primitives with respect to autograd.
For example, .contiguous() eventually calls .copy_() -- .copy_ is the primitive with respect to autograd.
Registering an override for the vmap key for contiguous doesn't actually work, because when someone does vmap(grad(blah)), the dispatch for the grad transform is going to break up .contiguous() into its constituents, and then vmap will see the .copy_ and it will be sad (that's what is going on in #55).
I'm not sure it's possible to "define a new primitive with respect to autograd" out of tree, though: autograd functions exist but I'm not sure they're sufficient.
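The failure described above can be sketched as a toy interpreter stack (purely illustrative; not functorch internals, and the decomposition table is a stand-in): the grad level lowers composites to their autograd primitives before handing ops to vmap, so vmap's rule for contiguous is never consulted and it trips over the copy_:

```python
# Toy model of stacked transforms (a sketch of the failure mode, NOT
# functorch internals). grad lowers composite ops into their autograd
# primitives before vmap runs, so even though vmap has a rule for
# "contiguous", it only ever sees the "copy_" that grad produced.
DECOMP = {"contiguous": ["empty_like", "copy_"]}   # autograd-level lowering


def grad_interpret(ops):
    return [prim for op in ops for prim in DECOMP.get(op, [op])]


VMAP_RULES = {"contiguous", "empty_like", "mul"}   # no rule for copy_


def vmap_interpret(ops):
    for op in ops:
        if op not in VMAP_RULES:
            raise RuntimeError(
                f"Batching rule not implemented for aten::{op}")
    return ops


def run(ops):                      # vmap(grad(f)): grad lowers first
    return vmap_interpret(grad_interpret(ops))


# A trace without composites vmaps fine...
assert run(["mul"]) == ["mul"]
# ...but contiguous is lowered to copy_ before vmap sees it:
try:
    run(["contiguous", "mul"])
    raise AssertionError("expected a batching-rule error")
except RuntimeError as e:
    assert "copy_" in str(e)
```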
After some experimenting... it looks like if I want to make a new primitive called functorch::to, then setting up an autograd::Function for it and registering overrides for the Autograd, CPU, and CUDA keys seems to make this work:
TORCH_LIBRARY_IMPL(functorch, Autograd, m) {
  // to_autograd invokes an autograd::Function
  m.impl("to", to_autograd);
}
TORCH_LIBRARY_IMPL(functorch, CPU, m) {
  // to_kernel just calls at::to
  m.impl("to", to_kernel);
}
TORCH_LIBRARY_IMPL(functorch, CUDA, m) {
  m.impl("to", to_kernel);
}
Unfortunately, there's a lot of boilerplate here (e.g. setting up the autograd::Function and registering all of those overrides).
My conception of functionalization is that it is a functional transformation, much like grad/vmap are, which takes traces that have mutations and transforms them into traces without mutation. So in the vmap(grad(...)) case, what you would actually do is vmap(functionalize(grad(...))).
(Don't ask me about UX; I don't think you want users to have to insert the functionalize pass explicitly, so we'd have to figure out how to automatically insert this pass when necessary.)
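This composition can be sketched as a pipeline of toy passes (illustrative only; these are not functorch's real transforms, and the op names and decomposition table are made up): grad lowers a composite into primitives that include a mutation, functionalize rewrites the mutating op into its functional counterpart, and vmap then only sees ops it has rules for:

```python
# Toy sketch of vmap(functionalize(grad(f))) as a pipeline of passes
# over a list of op names (NOT functorch's real machinery).
def grad_pass(ops):
    # grad decomposes composites into their autograd primitives,
    # which here includes a mutating copy_.
    decomp = {"contiguous": ["empty_like", "copy_"]}
    return [prim for op in ops for prim in decomp.get(op, [op])]


def functionalize_pass(ops):
    # rewrite mutating ops (trailing "_") into functional counterparts
    return [op.rstrip("_") for op in ops]


def vmap_pass(ops):
    rules = {"empty_like", "copy", "mul"}   # no rule for mutating copy_
    for op in ops:
        if op not in rules:
            raise RuntimeError(
                f"Batching rule not implemented for aten::{op}")
    return ops


# Without functionalize, vmap chokes on the copy_ that grad produced:
try:
    vmap_pass(grad_pass(["contiguous", "mul"]))
    raise AssertionError("expected a batching-rule error")
except RuntimeError:
    pass

# With functionalize inserted between them, the pipeline succeeds:
assert vmap_pass(functionalize_pass(grad_pass(["contiguous", "mul"]))) == \
    ["empty_like", "copy", "mul"]
```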
one alternative along the lines of "providing both the non-mutating and mutating versions" could be if we have the ability to define our own set of primitives with respect to autograd.
Yes, this is possible. Today we have CPU and we have AutogradCPU; it is possible that given Batched, we should have AutogradBatched (this is a little weird, because Batched isn't a backend, but I'm guessing we probably could make it work). Then you would override the definition of contiguous directly in AutogradBatched to get the better behavior. I'm not sure why you'd want to implement a functorch::to, though...
Done