Comments (2)
Indeed, and this is tricky:
- To my mind, using `meta` is a great way to keep memory in check, but `meta` does not help us when we want to keep the weights' values. Maybe we should offer to replace the original weights with `meta` equivalents as an option?
- At any rate, we would probably want to do a sane thing for loading / saving state dicts.
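To make the trade-off concrete, here is a minimal sketch in plain PyTorch (not Thunder's implementation) of what replacing weights with `meta` equivalents implies: shapes and dtypes survive, values do not, so they have to be saved beforehand and reloaded through a state dict. The variable names are illustrative assumptions.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)

# Keep a copy of the real values; once the parameters live on the meta
# device there is no storage left to read them from.
saved_state = {name: t.detach().clone() for name, t in model.state_dict().items()}

# Swap every parameter for a meta tensor: metadata only, no memory for values.
model.to("meta")
device_after_swap = model.weight.device.type
shape_after_swap = tuple(model.weight.shape)

# Getting values back needs two steps: allocate uninitialized storage,
# then load the saved state dict into it.
model.to_empty(device="cpu")
model.load_state_dict(saved_state)
values_restored = torch.equal(model.weight, saved_state["weight"])
```

The round trip only works because the state dict was captured before the swap, which is why state dict loading and saving needs a deliberate answer if `meta` replacement becomes an option.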
from lightning-thunder.
triage review:
- let's mutate the module we're given, so we don't keep its original tensors alive and use extra memory
- practitioners can preserve the current behavior by copying the module before giving it to jit
- we should be careful that retracing has the information needed (possibly observing original values) to work as expected
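The opt-out described in the second bullet could look roughly like the sketch below. `mutating_transform` is a hypothetical stand-in for a jit that modifies the module it receives; it is not Thunder's API and only exists to make the mutation observable.

```python
import copy

import torch
import torch.nn as nn


def mutating_transform(module: nn.Module) -> nn.Module:
    """Hypothetical stand-in for a jit that mutates the module it is given."""
    with torch.no_grad():
        for param in module.parameters():
            param.add_(1.0)  # arbitrary in-place change, just to be visible
    return module


original = nn.Linear(4, 2)
weight_before = original.weight.detach().clone()

# Preserve the original by handing a deep copy to the transform instead.
transformed = mutating_transform(copy.deepcopy(original))

original_untouched = torch.equal(original.weight, weight_before)
copy_was_mutated = not torch.equal(transformed.weight, weight_before)
```

The copy doubles parameter memory for as long as both modules are alive, which is the cost that mutating by default avoids.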