Comments (10)
I would be really interested in this feature! What is the current status?
from lightning.
huggingface has this built-in at this point. I think it's important to have this feature in lightning as well.
any updates? i still can't find how to apply gradient checkpointing with lightning
@sidhanthholalkere @Borda https://gitter.im/PyTorch-Lightning/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge
I would be really interested in this Gradient Checkpointing (https://github.com/cybertronai/gradient-checkpointing)! What is the current status?
huggingface has this built-in at this point. I think it's important to have this feature in lightning as well.
yeah, would you be interested in implementing it? 🐰
@TianHongZXY and future googlers. This is implemented via FSDP https://lightning.ai/docs/pytorch/stable/advanced/model_parallel/fsdp.html#activation-checkpointing
I'm not sure that this is really necessary (maybe just a link to the tutorial instead of implementing it).
For sequential nets, it's as simple as
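from torch.utils.checkpoint import checkpoint_sequential

def forward(self, input_var, chunks=3):
    # Take the first registered submodule (assumed to be the sequential stack).
    modules = [module for k, module in self._modules.items()][0]
    # Run the stack in `chunks` segments; activations are recomputed in backward.
    input_var = checkpoint_sequential(modules, chunks, input_var)
    # Flatten everything except the batch dimension before the final layer.
    input_var = input_var.view(input_var.size(0), -1)
    input_var = self.fc(input_var)
    return input_var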
which is basically just adding 3 extra lines to the forward().
For other models (RNNs, LSTMs, etc.) there doesn't seem to be a nice generalization, and the approach can vary with the model structure.
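The sequential case described above can be sketched as a complete, runnable module. This is only an illustration: the `TinyNet` class, its layer sizes, and the `features`/`fc` attribute names are made up, and passing `use_reentrant=False` assumes a recent PyTorch 2.x release. Only `checkpoint_sequential` itself comes from PyTorch.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint_sequential


class TinyNet(nn.Module):
    """Hypothetical model: a sequential feature stack followed by a linear head."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Linear(16, 32), nn.ReLU(),
            nn.Linear(32, 32), nn.ReLU(),
            nn.Linear(32, 8), nn.ReLU(),
        )
        self.fc = nn.Linear(8, 2)

    def forward(self, x, chunks=3):
        # Split the sequential stack into `chunks` segments; activations inside
        # each segment are recomputed during backward instead of being stored.
        x = checkpoint_sequential(self.features, chunks, x, use_reentrant=False)
        x = x.view(x.size(0), -1)
        return self.fc(x)


torch.manual_seed(0)
model = TinyNet()
batch = torch.randn(4, 16)
out = model(batch)
out.sum().backward()  # triggers the recomputation of the checkpointed segments
```

The trade-off is extra forward compute during the backward pass in exchange for lower activation memory, so `chunks` is worth tuning per model.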
it's very tempting, but unfortunately I don't think I will be able to commit to that at this point, due to time constraints. it's a shame that I can't because I use lightning and I think it's a good project. if something changes I will let you know.
Is there a reason that the standard PyTorch checkpointing (https://pytorch.org/docs/stable/checkpoint.html) cannot be used with lightning? Why would this need to be reimplemented in lightning?
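As a rough illustration of what the standard PyTorch route looks like, here is a minimal sketch that wraps one sub-network in `torch.utils.checkpoint.checkpoint` inside a plain `nn.Module`. The `Block` and `Net` classes and the `use_checkpoint` flag are hypothetical names for this example. Since a `LightningModule` subclasses `torch.nn.Module`, the same call should in principle work in its `forward`, though that is not tested here.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


class Block(nn.Module):
    """Hypothetical residual block standing in for an expensive sub-network."""

    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.net(x)


class Net(nn.Module):
    def __init__(self, dim=8, use_checkpoint=True):
        super().__init__()
        self.block = Block(dim)
        self.head = nn.Linear(dim, 1)
        self.use_checkpoint = use_checkpoint  # hypothetical toggle for comparison

    def forward(self, x):
        if self.use_checkpoint:
            # Do not keep the block's intermediate activations; recompute them
            # when gradients are needed during the backward pass.
            x = checkpoint(self.block, x, use_reentrant=False)
        else:
            x = self.block(x)
        return self.head(x)


torch.manual_seed(0)
net = Net()
x = torch.randn(4, 8)

# Gradients with checkpointing enabled.
net(x).sum().backward()
grads_ckpt = [p.grad.clone() for p in net.parameters()]

# Gradients from an ordinary forward/backward pass on the same weights.
net.zero_grad()
net.use_checkpoint = False
net(x).sum().backward()
grads_plain = [p.grad.clone() for p in net.parameters()]
```

Both passes should yield matching gradients for a deterministic block like this one; checkpointing only changes when activations are computed, not the result.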