
lightning-quick-start's Introduction

Lightning Quick Start App

Install Lightning

pip install lightning[app]

Locally

To run the application locally, run the following commands:

pip install -r requirements.txt
lightning run app app.py

Cloud

To run the application in the cloud, run one of the following commands:

On CPU

lightning run app app.py --cloud

On GPU

USE_GPU=1 lightning run app app.py --cloud
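The GPU command above differs from the CPU one only by the USE_GPU environment variable. How app.py reads that variable is not shown here, so the following is only a minimal sketch of one way such a switch could work. The function name pick_compute and the returned strings "cpu" and "gpu" are hypothetical and chosen for illustration.

```python
import os


def pick_compute(env=None) -> str:
    """Return a compute label based on the USE_GPU environment variable.

    Assumption: the variable is treated as "on" only when it equals "1",
    matching the `USE_GPU=1` form used in the command above.
    """
    env = os.environ if env is None else env
    return "gpu" if env.get("USE_GPU", "0") == "1" else "cpu"


if __name__ == "__main__":
    # Prints the label selected for the current process environment.
    print(pick_compute())
```

Passing the environment as an argument keeps the function easy to check without touching the real process environment.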

Adding HPO support to the Quick Start App

Using Lightning HPO, you can easily convert the training component into a Sweep Component.

pip install lightning-hpo
lightning run app app_hpo.py
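For readers new to the idea, a sweep runs the same training routine several times with different hyperparameters and keeps the best result. The sketch below illustrates that idea in plain Python with a random search. It does not use the Lightning HPO API, and train, sweep, and the learning-rate range are hypothetical stand-ins rather than code from app_hpo.py.

```python
import random


def train(lr: float) -> float:
    """Hypothetical stand-in for a training run; returns a score to maximize."""
    # Toy objective that peaks at lr = 0.01.
    return -((lr - 0.01) ** 2)


def sweep(n_trials: int, seed: int = 0):
    """Run `n_trials` random-search trials and return (best_score, best_lr)."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        # Sample the learning rate log-uniformly between 1e-4 and 1e-1.
        lr = 10 ** rng.uniform(-4, -1)
        trials.append((train(lr), lr))
    return max(trials)


if __name__ == "__main__":
    best_score, best_lr = sweep(n_trials=20)
    print(f"best lr: {best_lr:.5f} (score {best_score:.6f})")
```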

Learn how it works

The components are here and the code is heavily commented.

Once you understand this example well, you are no longer a beginner with Lightning Apps 🔥

lightning-quick-start's People

Contributors

awaelchli, borda, carmocca, dependabot[bot], kaushikb11, manskx, pre-commit-ci[bot], tchaton, thomasyoungson, williamfalcon

lightning-quick-start's Issues

OSError: Request Entity Too Large

🐛 Bug

When train_work is about to stop after finishing training, we get an OSError: [Errno 5] An error occurred (413) when calling the PutObject operation: Request Entity Too Large error.

To Reproduce

Steps to reproduce the behavior:

lightning run app app.py --cloud --name quick-start-3

Code sample

The app.py from this repo.

Error and logs

[root.train_work] Epoch 9: 100% 12/12 [00:00<00:00, 16.72it/s, v_num=0]
[root.train_work] Traceback (most recent call last):
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/s3fs/core.py", line 112, in _error_wrapper
[root.train_work]     return await func(*args, **kwargs)
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/aiobotocore/client.py", line 358, in _make_api_call
[root.train_work]     raise error_class(parsed_response, operation_name)
[root.train_work] botocore.exceptions.ClientError: An error occurred (413) when calling the PutObject operation: Request Entity Too Large
[root.train_work] The above exception was the direct cause of the following exception:
[root.train_work] Traceback (most recent call last):
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/bin/lightning-cloud-launcher", line 8, in <module>
[root.train_work]     sys.exit(main())
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
[root.train_work]     return self.main(*args, **kwargs)
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/click/core.py", line 1055, in main
[root.train_work]     rv = self.invoke(ctx)
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
[root.train_work]     return _process_result(sub_ctx.command.invoke(sub_ctx))
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
[root.train_work]     return _process_result(sub_ctx.command.invoke(sub_ctx))
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
[root.train_work]     return ctx.invoke(self.callback, **ctx.params)
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/click/core.py", line 760, in invoke
[root.train_work]     return __callback(*args, **kwargs)
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/lightning_launcher/cli/__main__.py", line 87, in run_work
[root.train_work]     run_lightning_work(
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/lightning_launcher/utils.py", line 51, in wrapper
[root.train_work]     res = func(*args, **kwargs)
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/lightning_launcher/utils.py", line 77, in wrapper
[root.train_work]     res = func(*args, **kwargs)
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/lightning_launcher/launcher.py", line 181, in run_lightning_work
[root.train_work]     WorkRunner(
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/lightning/app/utilities/proxies.py", line 437, in __call__
[root.train_work]     raise e
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/lightning/app/utilities/proxies.py", line 418, in __call__
[root.train_work]     self.run_once()
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/lightning/app/utilities/proxies.py", line 582, in run_once
[root.train_work]     persist_artifacts(work=self.work)
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/lightning/app/utilities/proxies.py", line 722, in persist_artifacts
[root.train_work]     _copy_files(artifact_path, destination_path)
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/lightning/app/storage/copier.py", line 152, in _copy_files
[root.train_work]     fs.put(str(source_path), str(destination_path))
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/fsspec/asyn.py", line 113, in wrapper
[root.train_work]     return sync(self.loop, func, *args, **kwargs)
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/fsspec/asyn.py", line 98, in sync
[root.train_work]     raise return_result
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/fsspec/asyn.py", line 53, in _runner
[root.train_work]     result[0] = await coro
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/fsspec/asyn.py", line 523, in _put
[root.train_work]     return await _run_coros_in_chunks(
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/fsspec/asyn.py", line 269, in _run_coros_in_chunks
[root.train_work]     await asyncio.gather(*chunk, return_exceptions=return_exceptions),
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/asyncio/tasks.py", line 455, in wait_for
[root.train_work]     return await fut
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/s3fs/core.py", line 1073, in _put_file
[root.train_work]     await self._call_s3(
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/s3fs/core.py", line 339, in _call_s3
[root.train_work]     return await _error_wrapper(
[root.train_work]   File "/home/zeus/miniconda3/envs/cloudspace/lib/python3.8/site-packages/s3fs/core.py", line 139, in _error_wrapper
[root.train_work]     raise err
[root.train_work] OSError: [Errno 5] An error occurred (413) when calling the PutObject operation: Request Entity Too Large

Environment

  • PyTorch Version (e.g., 1.0): 2.0.0
  • OS (e.g., Linux): Linux
  • How you installed PyTorch (conda, pip, source): pip
  • Build command you used (if compiling from source): -
  • Python version: 3.10
  • CUDA/cuDNN version: -
  • GPU models and configuration: -
  • Any other relevant information: Lightning 2.0

Additional context

Found while running #30
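The traceback shows the failure occurring in persist_artifacts while a file is being uploaded with PutObject. One possible way to narrow this down, assuming the 413 response is related to the size of a persisted file, is to list the largest files under the work's artifact directory. This is a minimal diagnostic sketch: the function name large_files, the directory path, and the threshold are hypothetical, and the actual size limit of the storage endpoint is not stated in this report.

```python
from pathlib import Path


def large_files(root, threshold_bytes):
    """Return (size, path) pairs for files under `root` larger than the threshold.

    Results are sorted with the largest file first.
    """
    found = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            if size > threshold_bytes:
                found.append((size, path))
    return sorted(found, reverse=True)


if __name__ == "__main__":
    # Hypothetical directory and threshold; adjust both to the app being debugged.
    for size, path in large_files(".", threshold_bytes=100 * 1024 * 1024):
        print(f"{size / 1e6:.1f} MB  {path}")
```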

ModuleNotFoundError: No module named 'optuna'

🐛 Bug

[orchestrator]   File "app_hpo.py", line 4, in <module>
[orchestrator]     import optuna
[orchestrator] ModuleNotFoundError: No module named 'optuna'
[orchestrator] ERROR: Found an exception when loading your application from app_hpo.py. Please, resolve it to run your app.
[orchestrator] Traceback (most recent call last):
[orchestrator]   File "app_hpo.py", line 4, in <module>
[orchestrator]     import optuna
[orchestrator] ModuleNotFoundError: No module named 'optuna'
[orchestrator] ERROR: Found an exception when loading your application from app_hpo.py. Please, resolve it to run your app.
[orchestrator] Traceback (most recent call last):
[orchestrator]   File "app_hpo.py", line 4, in <module>
[orchestrator]     import optuna
[orchestrator] ModuleNotFoundError: No module named 'optuna'

To Reproduce

From a local machine, run USE_GPU=1 lightning run app app_hpo.py --cloud
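The log shows that optuna cannot be imported in the environment where app_hpo.py is loaded. A quick way to confirm which modules are importable in a given environment is a check like the sketch below. The helper name missing_modules is hypothetical, and the check only reports what the current Python interpreter can see, not what is installed in the cloud environment.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # `optuna` is the module named in the traceback above.
    missing = missing_modules(["optuna"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All modules found.")
```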

Code sample

Expected behavior

Environment

Lightning cloud

Additional context
