
Distributed Inference with pyhf and funcX


Example code for vCHEP 2021 paper "Distributed statistical inference with pyhf enabled through funcX"

Setup

Create a Python 3 virtual environment and then install the pyhf and funcX dependencies in requirements.txt.

(distributed-inference) $ python -m pip install --upgrade pip setuptools wheel
(distributed-inference) $ python -m pip install -r requirements.txt

Reproducible environment

To install a reproducible environment that is consistent down to the hash level, use pip-compile to compile a lock file from requirements.txt and install it following the pip-secure-install recommendations.

(distributed-inference) $ bash compile_dependencies.sh
(distributed-inference) $ bash secure_install.sh

On XSEDE's EXPANSE

On EXPANSE, Conda is required to get a Python 3.7+ runtime, so create a Conda environment from the provided expanse-environment.yml, which pulls its dependencies from the various requirements.txt files.

$ conda env create -f expanse-environment.yml
$ conda activate distributed-inference

Once a GPU session has been entered, source the setup_expanse_funcx_test_env.sh shell script to activate the environment and load all required modules:

(distributed-inference) $ . setup_expanse_funcx_test_env.sh

Machine Configuration

EXPANSE has the following Nvidia drivers and GPUs:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:38_PDT_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0
$ nvidia-smi --list-gpus
GPU 0: Tesla V100-SXM2-32GB (UUID: GPU-XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX)

Run

Create a file named endpoint_id.txt in the top level of this repository and save your funcX endpoint ID into the file.

(distributed-inference) $ touch endpoint_id.txt

This will be read in during the run.
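A minimal sketch of how the run script might read this file (the helper name here is hypothetical; check fit_analysis.py for the actual logic):

```python
from pathlib import Path


def read_endpoint_id(path="endpoint_id.txt"):
    # Read the funcX endpoint UUID saved in the file, stripping
    # surrounding whitespace and the trailing newline.
    return Path(path).read_text().strip()
```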

Pass fit_analysis.py the config JSON file for the analysis you want to run:

(distributed-inference) $ python fit_analysis.py -c config/1Lbb.json -b numpy
$ python fit_analysis.py --help
usage: fit_analysis.py [-h] [-c CONFIG_FILE] [-b BACKEND]

configuration arguments provided at run time from the CLI

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG_FILE, --config-file CONFIG_FILE
                        config file
  -b BACKEND, --backend BACKEND
                        pyhf backend str alias
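The argument parsing shown in the help output above can be reproduced with a short argparse setup along these lines (a sketch only; the actual defaults and wiring in fit_analysis.py may differ):

```python
import argparse


def build_parser():
    # Mirrors the --help output: a config file path and a pyhf backend alias.
    parser = argparse.ArgumentParser(
        description="configuration arguments provided at run time from the CLI"
    )
    parser.add_argument("-c", "--config-file", dest="config_file", help="config file")
    # The "numpy" default is an assumption for illustration.
    parser.add_argument(
        "-b", "--backend", dest="backend", default="numpy", help="pyhf backend str alias"
    )
    return parser
```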

Contributors

bengalewsky, matthewfeickert, pre-commit-ci[bot]

Issues

funcx v0.3.9 causes benign `AttributeError: can't set attribute`

With the addition of funcx v0.3.9

funcx==0.3.9
funcx-endpoint==0.3.9

running the example of

python fit_analysis.py -c config/1Lbb.json -b numpy

on the RIVER deployment generates stderr of

--------------------
Background Workspace Constructed
--------------------
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/3.9.6/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/home/feickert/.pyenv/versions/3.9.6/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/home/feickert/.pyenv/versions/debug-distributed-inference/lib/python3.9/site-packages/funcx/sdk/executor.py", line 316, in event_loop_thread
    eventloop.run_until_complete(self.web_socket_poller())
  File "/home/feickert/.pyenv/versions/3.9.6/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/feickert/.pyenv/versions/debug-distributed-inference/lib/python3.9/site-packages/funcx/sdk/executor.py", line 323, in web_socket_poller
    status = await self.ws_handler.handle_incoming(
  File "/home/feickert/.pyenv/versions/debug-distributed-inference/lib/python3.9/site-packages/funcx/sdk/asynchronous/ws_polling_task.py", line 167, in handle_incoming
    if await self.set_result(task_id, data, pending_futures):
  File "/home/feickert/.pyenv/versions/debug-distributed-inference/lib/python3.9/site-packages/funcx/sdk/asynchronous/ws_polling_task.py", line 231, in set_result
    self.ws = None
AttributeError: can't set attribute

before happily continuing to run. After the run completes, it exits with

Exception in thread Thread-3:
Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/3.9.6/lib/python3.9/threading.py", line 973, in _bootstrap_inner

So while benign, this should probably be addressed when possible.

There is already funcx-faas/funcX#721 open for this, so any follow-up can happen there.
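For context, an `AttributeError: can't set attribute` typically comes from assigning to a property that defines no setter. A minimal illustration of the Python mechanism (the class name is made up; this is not the actual funcX internals):

```python
class PollingTask:
    def __init__(self):
        self._ws = object()

    @property
    def ws(self):
        # Read-only property: no setter is defined, so assignment raises
        # AttributeError (the message varies across Python versions).
        return self._ws


task = PollingTask()
try:
    task.ws = None
except AttributeError as err:
    print(f"AttributeError: {err}")
```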

Live demo feasibility for SciPy 2021 talk

The SciPy 2021 talk will be 25 minutes (which is pretty long) and I love giving live demos in code talks (they strangely make me feel more confident 😛). How feasible would it be to walk through setting up and registering a funcX endpoint during the talk, then switch over to a "user" and execute a workflow using that endpoint?

I can imagine that there might be some permission issues, but it seems attractive.

Create environment lock file for more reproducible deploys

This project fully qualifies as a Python application in my mind, so we should do a better job of fully specifying the runtime environment than the existing requirements.txt files do (@BenGalewsky has brought this up before). Related to #23 (comment): I'm not 100% sure how the deployment to RIVER works, but for the tests on EXPANSE this

dependencies:
  - python>=3.7,<3.9
  - pip
  - pip:
      - setuptools
      - wheel
      - -r file:core-requirements.txt
      - -r file:jax-requirements.txt

isn't truly reproducible. Something I've found to work really well is to use pip-tools to create a lock file from a high-level requirements.txt file and then to use Brett Cannon's "pip-secure-install" recommendations to make things as reproducible as possible with pip.

This works well as you're pinning down to the hash level of the wheel on PyPI, so if it ever gets removed or an additional wheel (maliciously) gets added you would know. However, @astrojuanlu (:wave: hey Juan) recently mentioned on Twitter that pip-compile doesn't seem to work that well when combined with Conda. In the replies, people mentioned that conda-lock (https://github.com/conda-incubator/conda-lock) seems to work well, so for deploys with Conda this might be the most sensible way forward (though that requires building a second lock file, I guess?).
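For illustration, the pip-tools workflow amounts to compiling a high-level requirements.txt into a hash-pinned lock file and then installing with hashes required. The package names, versions, and hashes below are hypothetical placeholders:

```
# requirements.txt (top level, unpinned)
pyhf
funcx

# compile a hash-pinned lock file
$ pip-compile --generate-hashes --output-file requirements.lock requirements.txt

# requirements.lock (excerpt; hash values are placeholders)
pyhf==0.6.2 \
    --hash=sha256:<hash-of-wheel-on-pypi>
funcx==0.2.3 \
    --hash=sha256:<hash-of-wheel-on-pypi>

# secure install: hashes required, no dependency resolution at install time
$ python -m pip install --require-hashes --no-deps --only-binary :all: -r requirements.lock
```

With `--require-hashes`, pip refuses to install any distribution whose hash doesn't match the lock file, which is what catches a removed or maliciously added wheel.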
