Kubeflow

The Kubeflow project is dedicated to making deployment of machine learning on Kubernetes simple, portable and scalable. Our goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow.


Contained in this repository are manifests for creating:

  • A JupyterHub to create and manage interactive Jupyter notebooks. Project Jupyter is a non-profit, open-source project to support interactive data science and scientific computing across all programming languages.
  • A TensorFlow Training Controller that can be configured to use either CPUs or GPUs and can be dynamically adjusted to the size of a cluster with a single setting
  • A TensorFlow Serving container to export trained TensorFlow models to Kubernetes

This document details the steps needed to run the Kubeflow project in any environment in which Kubernetes runs.

The Kubeflow Mission

Our goal is to make scaling machine learning models and deploying them to production as simple as possible, by letting Kubernetes do what it's great at:

  • Easy, repeatable, portable deployments on a diverse infrastructure (laptop <-> ML rig <-> training cluster <-> production cluster)
  • Deploying and managing loosely-coupled microservices
  • Scaling based on demand

Because ML practitioners use so many different types of tools, it's a key goal that you can customize the stack to your requirements (within reason) and let the system take care of the "boring stuff." While we have started with a narrow set of technologies, we are working with many different projects to include additional tooling.

Ultimately, we want to have a set of simple manifests that give you an easy-to-use ML stack anywhere Kubernetes is already running, and that can self-configure based on the cluster it deploys into.

Who should consider using Kubeflow?

Based on the current functionality, you should consider using Kubeflow if:

  • You want to train/serve TensorFlow models in different environments (e.g. local, on-prem, and cloud)
  • You want to use Jupyter notebooks to manage TensorFlow training jobs
  • You want to launch training jobs that use resources -- such as additional CPUs or GPUs -- that aren't available on your personal computer
  • You want to combine TensorFlow with other processes
    • For example, you may want to use tensorflow/agents to run simulations to generate data for training reinforcement learning models.

This list is based ONLY on current capabilities. We are investing significant resources to expand the functionality and actively soliciting help from companies and individuals interested in contributing (see below).

Setup

This documentation assumes you have a Kubernetes cluster already available.

If you need help setting up a Kubernetes cluster please refer to Kubernetes Setup.

If you want to use GPUs, be sure to follow the Kubernetes instructions for enabling GPUs.
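
Before running the Quick Start, it can help to confirm that the command-line tools it calls are installed. The following is a minimal sketch and not part of Kubeflow: `missing_tools` is a hypothetical helper name, and the tool list (kubectl, ks, uuidgen) is simply taken from the commands used in the Quick Start steps.

```shell
# Print each required tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
# missing_tools is a hypothetical helper, not a Kubeflow or ksonnet command.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Report anything the Quick Start calls that is not installed.
missing_tools kubectl ks uuidgen || echo "Install the tools listed above before continuing." >&2
```

The check only confirms that the binaries exist; it does not verify their versions or that kubectl can reach a cluster.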

Quick Start

Requirements

Steps

In order to quickly set up all components, execute the following commands:

# Create a namespace for kubeflow deployment
NAMESPACE=kubeflow
kubectl create namespace ${NAMESPACE}

# Which version of Kubeflow to use
# For a list of releases refer to:
# https://github.com/kubeflow/kubeflow/releases
VERSION=v0.1.0

# Initialize a ksonnet app. Set the namespace for its default environment.
APP_NAME=my-kubeflow
ks init ${APP_NAME}
cd ${APP_NAME}
ks env set default --namespace ${NAMESPACE}

# Install Kubeflow components
ks registry add kubeflow github.com/kubeflow/kubeflow/tree/${VERSION}/kubeflow

ks pkg install kubeflow/core@${VERSION}
ks pkg install kubeflow/tf-serving@${VERSION}
ks pkg install kubeflow/tf-job@${VERSION}

# Create templates for core components
ks generate kubeflow-core kubeflow-core

# If your cluster is running on Azure you will need to set the cloud parameter.
# If the cluster was created with AKS or ACS, choose aks; if it was created
# with acs-engine, choose acsengine.
# PLATFORM=<aks|acsengine>
# ks param set kubeflow-core cloud ${PLATFORM}

# Enable collection of anonymous usage metrics
# Skip this step if you don't want to enable collection.
ks param set kubeflow-core reportUsage true
ks param set kubeflow-core usageId $(uuidgen)

# Deploy Kubeflow
ks apply default -c kubeflow-core

The above commands set up JupyterHub and a custom resource for running TensorFlow training jobs. Furthermore, the ksonnet packages provide prototypes that can be used to configure TensorFlow jobs and deploy TensorFlow models. Used together, these make it easy for a user to go from training to serving with TensorFlow with minimal effort, in a way that is portable between different environments.
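
One way to check what the deployment created is to list the pods and services in the namespace, along with the custom resource definitions registered in the cluster. This is a sketch under stated assumptions: `verify_kubeflow` is a hypothetical helper name, and its default namespace mirrors the `NAMESPACE=kubeflow` value used in the commands above. Only standard `kubectl get` calls are used.

```shell
# List what the deployment created.
# verify_kubeflow is a hypothetical helper; the default namespace matches
# NAMESPACE=kubeflow from the Quick Start.
verify_kubeflow() {
  ns="${1:-kubeflow}"
  # Pods and services created in the namespace (JupyterHub among them).
  kubectl get pods --namespace "$ns"
  kubectl get services --namespace "$ns"
  # Custom resource definitions are cluster-scoped, so no namespace flag.
  kubectl get crd
}
```

Call it as `verify_kubeflow "${NAMESPACE}"` once `ks apply` has finished. Pods may take some time to reach the Running state.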

For more detailed instructions about how to use Kubeflow, please refer to the user guide.

Important: The commands above will enable collection of anonymous user data to help us improve Kubeflow. For more information, including instructions for explicitly disabling it, please refer to the Usage Reporting section of the user guide.
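
If you script the Quick Start, one option is to make the usage-reporting step opt-in, following the comment in the commands that says the step can be skipped. The sketch below assumes a hypothetical `REPORT_USAGE` shell variable and a hypothetical `configure_usage_reporting` helper; neither is a Kubeflow setting. The two `ks param set` lines are the ones from the Quick Start. For removing reporting from an existing deployment, the user guide remains the reference.

```shell
# Run the two usage-reporting commands from the Quick Start only when the
# caller opts in. REPORT_USAGE is a hypothetical variable for this sketch.
configure_usage_reporting() {
  if [ "${REPORT_USAGE:-false}" = "true" ]; then
    ks param set kubeflow-core reportUsage true
    ks param set kubeflow-core usageId "$(uuidgen)"
  else
    echo "Skipping usage reporting parameters."
  fi
}
```

Run it from inside the ksonnet app directory, after `ks generate kubeflow-core kubeflow-core` and before `ks apply default -c kubeflow-core`.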

Troubleshooting

For detailed troubleshooting instructions, please refer to this section of the user guide.

Resources

Get Involved

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

The Kubeflow community is guided by our Code of Conduct, which we encourage everybody to read before participating.

Who should consider contributing to Kubeflow?

  • Folks who want to add support for other ML frameworks (e.g. PyTorch, XGBoost, scikit-learn)
  • Folks who want to bring more Kubernetes magic to ML (e.g. Istio integration for prediction)
  • Folks who want to make Kubeflow a richer ML platform (e.g. support for ML pipelines, hyperparameter tuning)
  • Folks who want to tune Kubeflow for their particular Kubernetes distribution or Cloud
  • Folks who want to write tutorials/blog posts showing how to use Kubeflow to solve ML problems
