
lightning-ai / tutorials


Collection of PyTorch Lightning tutorials in the form of rich scripts automatically transformed into IPython notebooks.

Home Page: https://lightning-ai.github.io/tutorials

License: Apache License 2.0

Python 99.15% Shell 0.12% Makefile 0.18% Batchfile 0.09% Jinja 0.13% Dockerfile 0.34%
tutorials jupyter-notebook machine-learning deep-learning notebooks python-scripts lightning

tutorials's People

Contributors

akihironitta, ananyahjha93, awaelchli, borda, carsondenison, dependabot[bot], edgarriba, emerald01, ethanwharris, hankyul2, hihunjin, ishandutta0098, kampelmuehler, kaushikb11, krshrimali, nreimers, phlippe, pre-commit-ci[bot], rohitgr7, speediedan, sumanthratna, sunil-s, valahaar



tutorials's Issues

mnist-tpu-training.ipynb tutorial has incorrect dependencies

๐Ÿ› Bug

The MNIST training using TPU tutorial has incorrect dependencies. Running all cells from start to finish in Colab results in an error when trying to import PyTorch Lightning.

To Reproduce

Steps to reproduce the behavior:

  1. Go to https://github.com/PyTorchLightning/lightning-tutorials/blob/publication/.notebooks/lightning_examples/mnist-tpu-training.ipynb
  2. Click "copy raw contents", paste into a text file and save with extension ".ipynb"
  3. Upload to Colab https://colab.research.google.com/
  4. Click "runtime" -> "Change runtime type" and select TPU as the Hardware accelerator
  5. Click "runtime" -> "run all"
  6. An error will occur when running the 3rd cell

Here is the output of the 3rd import cell.

WARNING:root:Waiting for TPU to be start up with version pytorch-1.8...
WARNING:root:Waiting for TPU to be start up with version pytorch-1.8...
WARNING:root:TPU has started up successfully with version pytorch-1.8
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-4-9ad40618d134> in <module>()
      1 import torch
      2 import torch.nn.functional as F
----> 3 from pytorch_lightning import LightningDataModule, LightningModule, Trainer
      4 from torch import nn
      5 from torch.utils.data import DataLoader, random_split

9 frames
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/__init__.py in <module>()
     18 _PROJECT_ROOT = os.path.dirname(_PACKAGE_ROOT)
     19 
---> 20 from pytorch_lightning.callbacks import Callback  # noqa: E402
     21 from pytorch_lightning.core import LightningDataModule, LightningModule  # noqa: E402
     22 from pytorch_lightning.trainer import Trainer  # noqa: E402

/usr/local/lib/python3.7/dist-packages/pytorch_lightning/callbacks/__init__.py in <module>()
     12 # See the License for the specific language governing permissions and
     13 # limitations under the License.
---> 14 from pytorch_lightning.callbacks.base import Callback
     15 from pytorch_lightning.callbacks.device_stats_monitor import DeviceStatsMonitor
     16 from pytorch_lightning.callbacks.early_stopping import EarlyStopping

/usr/local/lib/python3.7/dist-packages/pytorch_lightning/callbacks/base.py in <module>()
     24 
     25 import pytorch_lightning as pl
---> 26 from pytorch_lightning.utilities.types import STEP_OUTPUT
     27 
     28 

/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/__init__.py in <module>()
     16 import numpy
     17 
---> 18 from pytorch_lightning.utilities.apply_func import move_data_to_device  # noqa: F401
     19 from pytorch_lightning.utilities.distributed import AllGatherGrad, rank_zero_info, rank_zero_only  # noqa: F401
     20 from pytorch_lightning.utilities.enums import (  # noqa: F401

/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/apply_func.py in <module>()
     27 
     28 if _TORCHTEXT_AVAILABLE:
---> 29     if _compare_version("torchtext", operator.ge, "0.9.0"):
     30         from torchtext.legacy.data import Batch
     31     else:

/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/imports.py in _compare_version(package, op, version, use_base_version)
     52     """
     53     try:
---> 54         pkg = importlib.import_module(package)
     55     except (ModuleNotFoundError, DistributionNotFound):
     56         return False

/usr/lib/python3.7/importlib/__init__.py in import_module(name, package)
    125                 break
    126             level += 1
--> 127     return _bootstrap._gcd_import(name[level:], package, level)
    128 
    129 

/usr/local/lib/python3.7/dist-packages/torchtext/__init__.py in <module>()
      3 from . import datasets
      4 from . import utils
----> 5 from . import vocab
      6 from . import legacy
      7 from ._extension import _init_extension

/usr/local/lib/python3.7/dist-packages/torchtext/vocab/__init__.py in <module>()
      9 )
     10 
---> 11 from .vocab_factory import (
     12     vocab,
     13     build_vocab_from_iterator,

/usr/local/lib/python3.7/dist-packages/torchtext/vocab/vocab_factory.py in <module>()
      2 from typing import Dict, Iterable, Optional, List
      3 from collections import Counter, OrderedDict
----> 4 from torchtext._torchtext import (
      5     Vocab as VocabPybind,
      6 )

ImportError: /usr/local/lib/python3.7/dist-packages/torchtext/_torchtext.so: undefined symbol: _ZTVN5torch3jit6MethodE

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

Expected behavior

The 3rd cell and the remaining notebook complete without error.

Additional context

This occurs because torchtext now defaults to version 0.11.0 in Colab, but XLA strictly requires that torch, torchaudio, torchtext, and torchvision have the same versions as torch_xla.

The error can be fixed by changing the line
! pip install --quiet "pytorch-lightning>=1.3" "torchmetrics>=0.3" "torch>=1.6, <1.9" "torchvision"
to
! pip install --quiet "pytorch-lightning>=1.3" "torchmetrics>=0.3" "torch==1.8.0" "torchvision==0.9.0" "torchaudio==0.8.0" "torchtext==0.9.0"
(then doing a factory reset of the runtime if you ran the previous buggy code).

This explicitly installs the correct versions of the torch libraries. I'm not certain how to create a pull request for this myself. I converted my notebook using Jupytext, and it seems that adding the following lines

# ## Setup
# This notebook requires some packages besides pytorch-lightning.

# %% colab={"base_uri": "https://localhost:8080/"} id="37f8b49a"
# ! pip install --quiet "pytorch-lightning>=1.3" "torchmetrics>=0.3" "torch==1.8.0" "torchvision==0.9.0" "torchaudio==0.8.0" "torchtext==0.9.0"

to the beginning of https://github.com/PyTorchLightning/lightning-tutorials/blob/main/lightning_examples/mnist-tpu-training/mnist-tpu.py should fix the issue. However, I'm not sure why there isn't a "Setup" section there already; is there something about the CI system that I'm missing?
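Until such pins land, a quick way to check whether a runtime's torch-family packages are version-aligned is a small comparison like the one below; `find_mismatches` is a hypothetical helper, and the pins assume the pytorch-1.8 XLA runtime described above:

```python
# Sketch: flag torch-family packages whose installed version does not match
# the pin expected by the pytorch-1.8 XLA runtime (an assumption from the
# issue above). A mismatch here is what produces the undefined-symbol error.
def find_mismatches(installed, expected):
    """Return {pkg: (found, wanted)} for packages whose version differs."""
    return {
        pkg: (installed.get(pkg), want)
        for pkg, want in expected.items()
        if not (installed.get(pkg) or "").startswith(want)
    }

expected = {"torch": "1.8.0", "torchvision": "0.9.0",
            "torchaudio": "0.8.0", "torchtext": "0.9.0"}

# e.g. the broken Colab default: torchtext 0.11.0 against torch 1.8
installed = {"torch": "1.8.0", "torchvision": "0.9.0",
             "torchaudio": "0.8.0", "torchtext": "0.11.0"}
print(find_mismatches(installed, expected))  # flags torchtext only
```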

Pretrained models

🚀 Feature

The pretrained models are currently stored in separate, external repositories. It would be nice to have releases with the tutorials and upload these models to the release.

Motivation

This point was brought up in #78 (comment)

Additional context

All pretrained models for the UvA DL course are stored here: https://github.com/phlippe/saved_models

add TPU publisher

🚀 Feature

in case some notebooks need a TPU only, we shall be able to generate such notebooks too

Motivation

we shall have all notebooks executed so as to verify that the notebooks are valid

Pitch

Alternatives

Additional context

Allow extra arguments for pip

🚀 Feature

allow adding extra pip arguments such as --find-links

pip_find-link:
  - https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
  - https://pytorch-geometric.com/whl/torch-%(TORCH_VERSION)s+cu101.html
  - https://pytorch-geometric.com/whl/torch-1.8.0+cu%(CUDA_VERSION)s.html

which would add these arguments to the created requirements file;
also, allow dynamically placing the proper Torch or CUDA version
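A sketch of how those templates could be expanded into requirements-file lines; the `render_find_links` helper and the `--find-links` output format are illustrative, not existing code in this repo:

```python
# Sketch: expand the proposed pip_find-link templates when writing the
# generated requirements file. TORCH_VERSION / CUDA_VERSION are the field
# names used in the templates above; values would come from the CI env.
def render_find_links(templates, torch_version, cuda_version):
    ctx = {"TORCH_VERSION": torch_version, "CUDA_VERSION": cuda_version}
    return [f"--find-links {tpl % ctx}" for tpl in templates]

templates = [
    "https://pytorch-geometric.com/whl/torch-%(TORCH_VERSION)s+cu101.html",
    "https://pytorch-geometric.com/whl/torch-1.8.0+cu%(CUDA_VERSION)s.html",
]
for line in render_find_links(templates, "1.8.0", "101"):
    print(line)
```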

Motivation

Some libraries need a specific CUDA build version

Additional context

Outdated intro

MNIST, PL 1.5.0, Windows 10

from pytorch_lightning.metrics.functional import accuracy
ModuleNotFoundError: No module named 'pytorch_lightning.metrics'

using custom docker image for CI/CD

🚀 Feature

consider producing our own docker image with all needed libs installed, which would speed up all publication/testing

Motivation

boost all CI/CD processes on GPU; as they are docker-based nowadays, caching is somewhat harder

Pitch

Alternatives

Additional context

Probably collect all requirements and build a docker image.
Note that some version collisions need to be solved, especially if a notebook marks a max version or pins a specific one 🐰

CI: save generated notebook as artifact for validation

🚀 Feature

Motivation

Often, when we make changes to tutorials, there's no easy way to test whether it works prior to merging. This is particularly true for the Colab version of the notebook. If not well-tested, it could result in an unnecessarily large number of fixes and PRs.

Pitch

Have the CI save the generated notebook, so users can test the notebook run end-to-end before merging the PR.

Alternatives

Provide a notebook link to Colab on the PR's branch

Additional context

N/A

Linking images to notebook

🚀 Feature

Allow placing images in notebooks...
Parse the script looking for Markdown-linked images and replace the relative path with a permanent one referring to the git head or main/publication branch (as main is merged to publication, the images shall be the same)
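A sketch of such a replacement; the raw-URL base, branch, and helper name here are assumptions for illustration, not the repo's actual code:

```python
# Sketch: rewrite relative Markdown image links to a permanent URL on the
# main branch. RAW (raw.githubusercontent.com base) and the folder argument
# are assumptions; real logic would live in .actions/helpers.py.
import re

RAW = "https://raw.githubusercontent.com/lightning-ai/tutorials/main/"

def absolutize_images(markdown, folder):
    def repl(m):
        alt, path = m.group(1), m.group(2)
        if path.startswith(("http://", "https://", "data:")):
            return m.group(0)  # already permanent, leave untouched
        return f"![{alt}]({RAW}{folder}/{path})"
    return re.sub(r"!\[([^\]]*)\]\(([^)]+)\)", repl, markdown)

print(absolutize_images("![graph](imgs/g.png)", "course_UvA-DL/gnn"))
```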

Motivation

as we store notebooks in the form of a script, there is no simple way to store illustrations

Pitch

Allow rich and nice tutorials

Alternatives

Additional context

Enable LFS and define all images to be stored in such a way;
no need to link to a particular commit 🐰

save Env details

🚀 Feature

While publishing a notebook, expand its metadata with environment details: the actual versions of the default and required packages

Motivation

all notebooks shall be reproducible later, so in case one does not work anymore, we shall know the last version with which we tested/published it
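A minimal sketch of collecting such a snapshot with stdlib importlib.metadata; the snapshot shape and key names are assumptions, not the repo's actual metadata schema:

```python
# Sketch: capture Python and package versions at publication time so they
# can be stored in the notebook's metadata block for later reproducibility.
import platform
from importlib import metadata

def environment_snapshot(packages):
    snap = {"python": platform.python_version()}
    for pkg in packages:
        try:
            snap[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            snap[pkg] = None  # requirement listed but not installed
    return snap

snap = environment_snapshot(["pip", "definitely-not-installed"])
print(snap)
```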

Pitch

Alternatives

Additional context

better prune deleted notebooks

๐Ÿ› Bug

comparing the trees may not be robust if you, for example, change a folder to subfolders

- template

to

- template
  |- subA
  '- subB

Additional context

so rather list the existing notebooks in the .notebooks folder on the publication branch; what is there is the ground truth
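A sketch of that pruning rule, with the folder layout and naming assumed for illustration:

```python
# Sketch: treat notebooks under the publication branch's .notebooks folder
# as ground truth and list any published notebook whose source script no
# longer exists, so it can be deleted. Layout/naming are assumptions.
import tempfile
from pathlib import Path

def stale_notebooks(published_root, source_scripts):
    """Published .ipynb files with no matching source script."""
    expected = {Path(p).with_suffix("").as_posix() for p in source_scripts}
    root = Path(published_root)
    return [nb for nb in root.rglob("*.ipynb")
            if nb.relative_to(root).with_suffix("").as_posix() not in expected]

# demo: template/subA/x survives, template/old is pruned (rename case)
root = Path(tempfile.mkdtemp())
(root / "template" / "subA").mkdir(parents=True)
(root / "template" / "subA" / "x.ipynb").write_text("{}")
(root / "template" / "old.ipynb").write_text("{}")
stale = stale_notebooks(root, ["template/subA/x.py"])
print([p.name for p in stale])  # ['old.ipynb']
```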

Update papermill.cli references to papermill per 2.3.4 release

๐Ÿ› Bug

With the latest papermill release (2.3.4, updated 2022-01-22), a PR has been included that necessitates changing our papermill.cli references to papermill

To Reproduce

As part of the make ipynb process, you'll notice python -m papermill.cli {ipynb_file} {pub_ipynb} --kernel python fails. Updating it to python -m papermill {ipynb_file} {pub_ipynb} --kernel python succeeds. In the PR I'm submitting to fix this, I'm also updating requirements to ensure papermill >=2.3.4 as previously, papermill.cli was necessary.

Expected behavior

make ipynb process should generate the relevant executed notebooks

Additional context

The improvement to papermill was originally made by @Borda and now is finally available. Thanks!

Update self.log

๐Ÿ› Bug

Some tutorials still return non-detached tensors in training_step - this is deprecated in 1.6 and may cause memory leaks if people follow those patterns.

e.g.
output = OrderedDict({"loss": g_loss, "progress_bar": tqdm_dict, "log": tqdm_dict})
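For reference, a sketch of the non-deprecated pattern: log scalars via self.log and return only the loss. The base class and loss computation are stubbed here so the snippet runs standalone; `generator_loss` logic is a placeholder, not the tutorial's code:

```python
# Sketch of the current Lightning pattern: log via self.log instead of
# returning a dict with "progress_bar"/"log" keys (the deprecated pattern
# shown above). LightningModule is stubbed so this runs standalone.
class GANModule:  # stands in for pl.LightningModule
    def __init__(self):
        self.logged = {}

    def log(self, name, value, prog_bar=False):
        self.logged[name] = value  # real PL handles detaching/aggregation

    def training_step(self, batch, batch_idx):
        g_loss = sum(batch) / len(batch)  # placeholder for the real loss
        self.log("g_loss", g_loss, prog_bar=True)
        return g_loss  # just the loss; no OrderedDict payload

m = GANModule()
loss = m.training_step([1.0, 3.0], 0)
print(loss, m.logged)  # 2.0 {'g_loss': 2.0}
```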

The error pops up in the autogenerated docs here: https://pytorch-lightning.readthedocs.io/en/stable/notebooks/lightning_examples/basic-gan.html

I raised this in Lightning-Universe/lightning-bolts#793 too.

I see the examples in https://github.com/PyTorchLightning/pytorch-lightning/tree/master/pl_examples already defer to Lightning Bolts for more robust examples; wherever the canonical source of docs/best practices ends up, I think this specific error should be fixed (these examples are also discussed in #71).

Thanks, loving the library btw! :)

`language = None` Warning in Sphinx

๐Ÿ› Bug

As of Sphinx 5.0.x, setting language = None in conf.py triggers a warning.
A pull request will be associated with this issue, with the simple fix of explicitly setting language to "en" by default.
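The corresponding change in conf.py:

```python
# conf.py: Sphinx 5.0 warns on `language = None`, so set it explicitly.
language = "en"
```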

To Reproduce

Attempt to build docs using latest sphinx:
make html --debug SPHINXOPTS="-W --keep-going" -b linkcheck

Expected behavior

No warning.

Additional context

Add a tutorial for distributed inference

🚀 Feature

Users are asking for examples of how to predict with models in a distributed setting.

Motivation

We could link such a tutorial in the PL main docs.

Pitch

Add tutorial page for prediction on single GPU, multiple GPU / multiple nodes.
It should cover the PredictionWriterCallback and how to use it.

Alternatives

Additional context

Related PR #52

downloading Kaggle datasets

🚀 Feature

A contributor can specify a Kaggle dataset, and the CI/CD will download it into a default dataset folder

Motivation

Explore more tutorials on various datasets

Alternatives

we can create a dummy PL user for Kaggle and add their credentials as repo secrets, which will be used in CI/CD;
see how to use such credentials: https://github.com/Kaggle/kaggle-api#api-credentials

export KAGGLE_USERNAME=datadinosaur
export KAGGLE_KEY=xxxxxxxxxxxxxx

Additional context

Need to verify legal/data usage scope @tchaton @aribornstein

Supporting lower level headings

🚀 Feature

The HTML build currently fails if Markdown sections with level 4 or lower are used (e.g. #### Heading 4).

Motivation

Markdown allows arbitrarily deep heading levels, but the current HTML build produces warnings for all headings of level 4 or lower. The precise warning is:

WARNING: Title level inconsistent:

Since level 1 is reserved for the title of the notebook, having only level 2 and 3 headings might be too constraining.
Even if headings of level 3 and 4 used the same font, distinguishing them helps keep a consistent table of contents.

Pitch

It seems to me that this might be a setting somewhere in the build which could be adjusted to allow level 4 headings.

Alternatives

If headings of level 4 or lower cannot be allowed, this should be clarified in the README.

Additional context

This problem occurs in #73.

Support Adding Tutorial Dependencies From Git Repositories

🚀 Feature

Support in .meta.yml tutorial dependency definition and processing for git repositories (URI of the form git+https://gitprovider.com/user/project.git@{version}) would be useful in a variety of circumstances.

Motivation

The current approach to .meta.yml tutorial dependency definition/processing results in a KeyError for the specified "git*" package reference in meta["environment"] = [env[r] for r in require]. Including the desired git repo URI in docs/requirements.txt appears to be a viable temporary workaround but is certainly undesirable as a solution moving forward.

Pitch

Update helpers.py and associated .meta.yml dependency parsing to support "git*" dependency references.
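A sketch of the tolerant resolution the pitch asks for: pass git+ URIs through verbatim instead of looking them up in the frozen-environment mapping (the helper name and data shapes are illustrative, not the actual helpers.py code):

```python
# Sketch: resolve .meta.yml requirements against the frozen environment,
# passing "git+" VCS references through verbatim instead of doing the dict
# lookup that currently raises KeyError.
def resolve_requirements(require, env):
    resolved = []
    for r in require:
        if r.startswith("git+"):
            resolved.append(r)       # keep the VCS reference as-is
        else:
            resolved.append(env[r])  # pinned version from the environment
    return resolved

env = {"torch": "torch==1.8.0"}
reqs = ["torch", "git+https://github.com/user/project.git@v1.0#egg=project"]
print(resolve_requirements(reqs, env))
```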

Alternatives

See docs/requirements.txt above as one possible temporary (undesirable) workaround.

Additional context

Came across this issue when adding a tutorial that should ultimately depend upon a user-registered module (i.e.

git+git://github.com/speediedan/pytorch-lightning.git@24d3e43568814ec381ac5be91627629808d62081#egg=pytorch-lightning

). We may want to consider how dependencies on user-registered modules are handled as well.

inline images in notebooks

🚀 Feature

replace linking images from the repo by embedding the image data inline (base64), making the notebooks fully standalone

Motivation

drop dependency on the repo content, so the notebooks can be shared even offline

Alternatives

update this replacement call https://github.com/PyTorchLightning/lightning-tutorials/blob/645730b8809123dbeccf6265dc50b65a875245b8/.actions/helpers.py#L164 to produce something like

# ![image.png](data:image/png;base64,iVBORw0KGgoA...eExqjNywAAAABJRU5ErkJggg==)

Additional context

import base64

with open("t.png", "rb") as image_file:
    encoded = base64.b64encode(image_file.read()).decode("ascii")
    print(encoded)

see: https://stackoverflow.com/a/22351516/4521646

building notebooks in parallel for tests & publish

🚀 Feature

Dynamically extend the time limit for running CI on PR touching more content

Motivation

do not penalise PRs which touch multiple notebooks

Additional context

Run one tiny job which determines how many notebooks changed, then pass this count N to the test job and, for example, set the testing timeout to N * 30 min
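The gating job described above could be sketched as follows; the 30-minute budget, file filters, and sample paths are assumptions taken from the issue text:

```python
# Sketch: derive the test-job timeout from the number of changed notebook
# sources in a PR diff (N * 30 min, floored at one notebook's budget).
def ci_timeout_minutes(changed_files, per_notebook=30, minimum=30):
    notebooks = [f for f in changed_files
                 if f.endswith((".py", ".ipynb"))
                 and not f.startswith(".actions/")]  # tooling doesn't count
    return max(minimum, len(notebooks) * per_notebook)

changed = ["lightning_examples/mnist/mnist.py",
           "course_UvA-DL/gnn/gnn.py",
           ".actions/helpers.py"]
print(ci_timeout_minutes(changed))  # 2 notebooks -> 60
```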

add badges Lightning.ai / Colab

🚀 Feature

add badges for GridAI and/or Colab which would directly open the notebook

Motivation

easy testing of each notebook on its own...

Pitch

Alternatives

Additional context

define requirements for notebooks

๐Ÿ› Bug

The CIFAR-10 baseline example notebook is failing on importing torchvision, as it is not part of the requirements files.

To Reproduce

Steps to reproduce the behaviour:

  1. Go to https://mybinder.org
  2. mybinder will start a fresh docker container with this repository and all requirements installed
  3. Start the Cifar-10-baseline example.
  4. torchvision fails to be imported

Expected behavior

Add torchvision to the requirements/default.txt

replace tensorboard by CSVLogger

🚀 Feature

Replace inline tensorboard (TB) with a simple chart from the standard CSVLogger

Motivation

The inline tensorboard is a cool interactive tool, but:

  • it requires saved logs that are missing in the notebook
  • TB inside notebooks is a heavy JS object (~30MB)

Pitch

A reader would also see the progress immediately; maybe add plotting to the CSVLogger?

Alternatives

Some other simple logger?

Additional context

recurrent PL docs update

🚀 Feature

add a cron workflow on the PL side, which would frequently check whether the submodule update would commit any change

git submodule update --init --recursive --remote

if yes, there are two options:

  1. directly commit this update to master IF all docs tests pass
  2. create a new PR with this update and verify changes [preferable]

Ad 2) if there is a new update on the tutorials side before the last update PR is merged, let's update the existing/open PR instead of creating a new one...

Motivation

ensure that the published notebooks stay in sync with the PL docs

Alternatives

Manual recurrent checking, or raising the need when a new notebook is added...

Additional context

the notebooks integration is added in Lightning-AI/pytorch-lightning#7752

LaTeX expression in notebooks

🚀 Feature

Enhancing the LaTeX expressions in the HTML-parsed notebooks.

Motivation

LaTeX expressions are currently parsed as small images and added into the text. Sometimes the alignment of text and LaTeX expression is a bit off, especially if we use superscripts. The image resolution of the parsed expressions is low, which makes them seem a bit blurry.

Pitch

Different versions/alternatives to the LaTeX Sphinx extension (currently sphinx.ext.imgmath, link) can be considered. For the uvadlc website, we have used sphinx.ext.mathjax which integrates the expressions quite nicely.
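Assuming a standard Sphinx conf.py, the swap could look like this (the second entry in `extensions` is illustrative, not the repo's actual list):

```python
# conf.py sketch: render math with MathJax instead of image-based imgmath.
extensions = [
    "sphinx.ext.mathjax",  # instead of "sphinx.ext.imgmath"
    "nbsphinx",            # illustrative; keep whatever the build already uses
]
```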

Additional context

Examples of the sub-optimal placement of LaTeX expression in the GNN tutorial are attached below.


delete past notebook even with rename event

๐Ÿ› Bug

when you rename a folder/notebook, the old one won't be deleted, as the git diff does not cover the old name

Expected behaviour

Handle a simple rename as one GHA workflow
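One way to cover renames is to ask git for rename detection explicitly (e.g. `git diff --name-status -M`) and prune the old paths it reports; a sketch of the parsing, with the diff range left to the workflow:

```python
# Sketch: extract old paths from `git diff --name-status -M` output, where
# renames appear as "R<score>\told\tnew", so the past notebook can be
# deleted from the publication branch.
def parse_renames(name_status_output):
    """Return the old paths of renamed files."""
    old = []
    for line in name_status_output.splitlines():
        parts = line.split("\t")
        if parts and parts[0].startswith("R"):
            old.append(parts[1])  # the past name to prune
    return old

sample = "R100\ttemplate/old.py\ttemplate/subA/new.py\nM\tREADME.md"
print(parse_renames(sample))  # ['template/old.py']
```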

Additional context

ensure linear publication

🚀 Feature

Ensure that only one agent is working on publication at a time, eventually using some lock

Motivation

eventual collision between two PRs merged within a short time but with different publication times:

  1. one yields a long rebuild of the publication
  2. the other is just a minor edit, so almost no publication change

The case is that the first starts its sync and produces rendered notebooks; in the meantime the second syncs and, being short, pushes its status to the publication branch 🐰 so when the first finishes and tries to push to the publication branch, it is rejected as out of sync

Pitch

This is a special case, but we should coordinate the PR merges or prevent such a schedule

Alternatives

when this happens, we can rebuild all notebooks from scratch and force-push to publication with make ipynb

Additional context

Support scaling of SVG images

๐Ÿš€ Feature

SVG images currently cannot be properly scaled in the generated HTML version.

Motivation

Embedded SVG images are automatically scaled to 100% of the notebook width. Using Markdown syntax or HTML adds the width and height arguments in the img tag in the generated HTML, which however seem to not work. For example, the following two expressions lead to SVG images with 100% width:

  1. Markdown syntax in notebook: ![Test image](https://github.com/PyTorchLightning/lightning-tutorials/raw/main/course_UvA-DL/graph-neural-networks/example_graph.svg){width="250px"}
    Generated HTML line: <p><img alt="Test image" src="https://github.com/PyTorchLightning/lightning-tutorials/raw/main/course_UvA-DL/graph-neural-networks/example_graph.svg" width="250px" /></p>
  2. HTML syntax in notebook: <center width="100%" style="padding:10px"><img src="https://github.com/PyTorchLightning/lightning-tutorials/raw/main/course_UvA-DL/graph-neural-networks/example_graph.svg" width="250px"></center>
    Generated HTML line: <center width="100%" style="padding:10px"><p><img alt="9463af4f00614b78b36220bf6b55ab3e" src="https://github.com/PyTorchLightning/lightning-tutorials/raw/main/course_UvA-DL/graph-neural-networks/example_graph.svg" width="250px" /></p>

Meanwhile, the syntax works as desired for png images.

A working HTML integration for the correct sizing uses the style argument: <center width="100%" style="padding:10px"><p><img alt="9463af4f00614b78b36220bf6b55ab3e" src="https://github.com/PyTorchLightning/lightning-tutorials/raw/main/course_UvA-DL/graph-neural-networks/example_graph.svg" style="width:250px"/></p></center>. However, the style argument added in the notebook's Markdown or HTML gets removed when parsed to the published HTML file.

Furthermore, the same problem holds for matplotlib figures. The notebooks in the UvA DL course use SVGs in the matplotlib figures, and all of them are scaled to 100% width, independent of the specified figure size.

Pitch

I do not know precisely where the error comes from, so any help is appreciated!

Alternatives

Converting all SVG images into png files is the worst-case alternative, although it seems a bit wasteful in terms of resources and limits the resolution of the figures.

Additional context

Example figures:

Documentation SVG rendering bug/warning in Chrome

๐Ÿ› Bug

It seems that there is a weird SVG rendering bug/warning in Chrome. Maybe the SVG version just needs to be updated.

To Reproduce

If you go to https://pytorch-lightning.readthedocs.io/en/latest/notebooks/course_UvA-DL/01-introduction-to-pytorch.html#Dynamic-Computation-Graph-and-Backpropagation there is a warning about SVG 1.1 support:


Expected behavior

It looks fine in Safari though:


Environment

Chrome Version 97.0.4692.71 (Official Build) (arm64) on macOS

normalize text block in text cells

🚀 Feature

Parse the markdown cells in a script and normalize them to a given line length

Motivation

reduce manual formatting and keep the text homogeneous

Pitch

would be nice to have it as CI/precommit

Alternatives

check this pre-commit hook https://github.com/markdownlint/markdownlint
see: https://www.geeksforgeeks.org/textwrap-text-wrapping-filling-python/
use: https://mdformat.readthedocs.io/en/stable/users/installation_and_usage.html

Additional context

need to recognise and eventually skip some equations, which shall be kept compact
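A sketch with stdlib textwrap that rewraps paragraphs while leaving display equations untouched; the `$$` heuristic and width are assumptions for illustration:

```python
# Sketch: normalise markdown text blocks to a given line length, skipping
# $$-delimited equation paragraphs so they stay compact.
import textwrap

def normalize_markdown(text, width=120):
    out = []
    for para in text.split("\n\n"):
        stripped = para.strip()
        if stripped.startswith("$$") or stripped.endswith("$$"):
            out.append(para)  # keep equations untouched
        else:
            # collapse ragged whitespace, then rewrap to the target width
            out.append(textwrap.fill(" ".join(para.split()), width=width))
    return "\n\n".join(out)

src = "Some   text with\nragged   spacing.\n\n$$ y = f(x) $$"
print(normalize_markdown(src, width=40))
```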

CD: language mutations

🚀 Feature

Democratize the tutorials and make them more accessible to non-English speakers;
coming from Lightning-AI/pytorch-lightning#10239 (comment)

Pitch

Alternatives

the translation can be done on the CD side: when we build notebooks from a script, we can prepare automatic language mutations for the text fields

Additional context

Makefile reference to renamed assistant.py func

๐Ÿ› Bug

A function used by the Makefile for local build testing was renamed in assistant.py. I'm shortly opening a PR to simply update the reference in the Makefile accordingly.

To Reproduce

Steps to reproduce the behavior:

  • use the full makefile flow to test a notebook locally...
cd lightning-tutorials
make ipynb

Expected behavior

Make should follow the expected ipynb build flow

Additional context

Pre-commit failing due to black issue (#2966)

๐Ÿ› Bug

An issue with black (introduced by click) was causing the pre-commit "Format Code" hook to fail. Version 22.3.0 of black was released and needs to be specified with the relevant pre-commit hook we're using. I've verified that updating the black version referenced in the pre-commit hook configuration addresses the issue.

To Reproduce

Steps to reproduce the behavior:
Check latest pre-commit "Code Formatting" logs or run pre-commit run --all-files with a black version < 22.3.0 referenced in the hook

Format code..............................................................Failed
- hook id: black
- exit code: 1

Traceback (most recent call last):
  File "/home/runner/.cache/pre-commit/repok0wbrcuc/py_env-python3.8/bin/black", line 8, in <module>
    sys.exit(patched_main())
  File "/home/runner/.cache/pre-commit/repok0wbrcuc/py_env-python3.8/lib/python3.8/site-packages/black/__init__.py", line 1372, in patched_main
    patch_click()
  File "/home/runner/.cache/pre-commit/repok0wbrcuc/py_env-python3.8/lib/python3.8/site-packages/black/__init__.py", line 1358, in patch_click
    from click import _unicodefun
ImportError: cannot import name '_unicodefun' from 'click' (/home/runner/.cache/pre-commit/repok0wbrcuc/py_env-python3.8/lib/python3.8/site-packages/click/__init__.py)

Expected behavior

The code format check should complete successfully.

make an aggregated package from all notebooks

🚀 Feature

As we generate IPython notebooks from each script, we could also parse all functions and classes and copy them into a package that would have the same structure as the notebooks

Motivation

extend the functionality of this project

Pitch

do not only showcase the classes/modules but also allow a reader to import and use them

Alternatives

Additional context

The package would have a module structure identical to the notebooks/folders
Consider also exporting some constants
no cross imports, as the package will be generated to another separate branch (to keep it light for install)

add HPU engine

🚀 Feature

Add HPU for CI and for CD publishing

Motivation

Extend our accelerator capabilities and show more options to users

Pitch

Additional context

We have recently added HPU support in PL v1.6, and we already have the needed agent pool in Azure pipelines which can be used

Requirements not working in colab

๐Ÿ› Bug

To Reproduce

  1. Open any notebook in colab
  2. The torch>=1.6, <1.9 dependency gives the error "no such file or directory" (the unquoted <1.9 makes the shell treat it as input redirection)


Expected behavior

Running cell should install requirements

Additional context

Putting it in a string seems to work, but then it complains about compatibility.

Allow PR with notebooks

🚀 Feature

Simplify the contributors' work: let them submit notebooks directly, and we would internally convert them to scripts with cleaned formatting

Motivation

Less user frustration and a smoother user experience, e.g. performing review in https://www.reviewnb.com

Alternatives

Add extra workflows to handle these situations

  • these notebooks will go to another branch, staging
  • on a PR to this special branch, we run the standard notebook testing
  • on merge (commit to staging), we trigger conversion to a script and merge to main, or create a new PR? (with a new PR we have an extra level of safety, and most likely no actions will be triggered on the bot's merge event)
  • ISSUE: how to ensure that there is only one notebook in the staging branch - after converting, we re-create a blank staging from main

Additional context

@edgarriba is not happy about converting notebooks to script and fixing issues on his own :P

Find import bugs in "lightning_examples/text-transformers.py"

๐Ÿ› Bug

I cannot import pytorch-lightning in Colab.
The file that has the error is text-transformers.ipynb (can be found in jupyter link, colab link)

To Reproduce

Steps to reproduce the behavior:

  1. Open text-transformers.ipynb in Colab (colab link)
  2. Execute the first and second code blocks
  3. Executing the second code block will show an import error.

image

Expected behavior

Importing pytorch-lightning should succeed without error.

Solution

I tried to solve this problem myself and found that the error occurs because the torchtext version is lower than expected.
So if you add torchtext>=0.9 to the install command, the error can be fixed.

! pip install --quiet "datasets" "scipy" "torchmetrics>=0.3" "transformers" "scikit-learn" "torch>=1.6, <1.9" "torchtext>=0.9" "pytorch_lightning>=1.3"

If you find my solution good, let me know and I will make a PR.

add GPU publisher

🚀 Feature

Add GPU available workflow to generate notebooks

Motivation

faster notebook generation and a speedup of CI testing

Pitch

Alternatives

probably use Azure pipelines with spot instances

Additional context

Mistake in GLUE code example

Hi, I found out there is a mistake in how the total_steps is computed in https://pytorch-lightning.readthedocs.io/en/stable/notebooks/lightning_examples/text-transformers.html

For instance, in the setup() method, we have:

# Calculate total steps
tb_size = self.hparams.train_batch_size * max(1, self.trainer.gpus)
ab_size = self.trainer.accumulate_grad_batches * float(self.trainer.max_epochs)
self.total_steps = (len(train_loader.dataset) // tb_size) // ab_size

If I'm not mistaken, it should be something along the lines of:

# Calculate total steps
tb_size = self.hparams.train_batch_size * max(1, self.trainer.gpus)
ab_size = tb_size * self.trainer.accumulate_grad_batches
self.total_steps = int((len(train_loader.dataset) / ab_size) * float(self.trainer.max_epochs))

In the first version, on MRPC (3668 instances), with 30 epochs, 32 batch size, 1 gpu and 1 batch accumulation total_steps amounts to 3; in the second version, it amounts to 3438.
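A quick numeric check of both formulas with those MRPC numbers:

```python
# Verify the reported totals: MRPC has 3668 instances, 30 epochs,
# batch size 32, 1 GPU, no gradient accumulation.
n, epochs, batch, gpus, accum = 3668, 30.0, 32, 1, 1

tb_size = batch * max(1, gpus)
# original (buggy): epochs folded into the divisor
v1 = (n // tb_size) // (accum * epochs)
# proposed fix: steps per effective batch, times number of epochs
v2 = int((n / (tb_size * accum)) * epochs)
print(v1, v2)  # 3.0 3438
```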

cc @Borda
