
vck's Introduction

DISCONTINUATION OF PROJECT

This project will no longer be maintained by Intel. Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project. Intel no longer accepts patches to this project.

Volume Controller for Kubernetes (VCK)


Overview

This project provides basic volume and data management in Kubernetes v1.9+ using custom resource definitions (CRDs), custom controllers, volumes and volume sources. It also establishes a relationship between volumes and data and provides a way to abstract the details away from the user. When using VCK, users are expected to only interact with custom resources (CRs).

VCK overview figure

Further Reading

vck's People

Contributors

ajay191191, anjalisood, ashahba, balajismaniam, dmsuehir, elsonrodriguez, jlewi, jose5918, mhbuehler, nanliu, scttl, sfblackl-intel, warmchang


vck's Issues

Pachyderm File System Source Type for KVC

As discussed in our recent meeting, kubeflow/kubeflow#151 (comment) requires a way to expose data from Pachyderm to a TFJob. Moreover, this type of data access pattern would be useful for integrating any distributed training framework (e.g., SparkML) or other resource into a Pachyderm pipeline.

In our discussion, we proposed creating a source type for exposing data from the versioned Pachyderm file system (which is backed by an object store). I suggest this format:

apiVersion: kvc.kubeflow.org/v1
kind: VolumeManager
metadata:
  name: kvc-example1
  namespace: <insert-namespace-here>
spec:
  volumeConfigs:
    - id: "vol1"
      replicas: 1
      sourceType: "PFS"
      sourceRepo: <insert input repo name or names here>
      sourceBranch: <insert input repo branch here, e.g., "master">
      accessMode: "ReadWriteOnce"
      capacity: 5Gi
      labels:
        key1: val1
        key2: val2
      options:
        pachSecretName: <insert-secret-name-for-pach-auth-and-host>

This would allow the connector to utilize the Pachyderm client to pull the necessary data into the volume.

Provide options to restrict data placement on nodes

This would be useful in the following cases:

  1. Restrict the nodes where the data can be downloaded
  2. Allow co-placement of CRs when the numbers of replicas are equal

But it raises the following questions:

  1. What do we do when #nodes < #replicas as CRs are created/patched or nodes die?

non-authenticated object store source type support

For instance, public AWS URL-based data.

Investigate whether the minio client is sufficient for this, or whether an alternative is needed.

If minio is sufficient, can this be merged with the existing S3 source type support (perhaps via the addition of a new insecure or similar flag)?
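
As a starting point, if mc/minio turn out not to handle anonymous sources cleanly, a plain HTTP(S) fetch could serve as a fallback for public URL-based data. The sketch below is not VCK's actual downloader; the URL and destination path are placeholders.

package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

// fetchPublicObject downloads an unauthenticated object over plain HTTP(S).
func fetchPublicObject(url, dest string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s for %s", resp.Status, url)
	}
	out, err := os.Create(dest)
	if err != nil {
		return err
	}
	defer out.Close()
	_, err = io.Copy(out, resp.Body)
	return err
}

func main() {
	// Placeholder public URL; replace with a real unauthenticated object.
	if err := fetchPublicObject("https://example-bucket.s3.amazonaws.com/data.bin", "/var/datasets/data.bin"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}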

Error when / is missing at the end of the s3 directory

If the sourceURL points to a directory but is missing a / at the end of the path, we get a minio client error message about needing a --recursive flag, and it's not obvious what the problem is. Either appending the / when the source is a directory or giving a clearer error message would help.

$ kubectl logs kvc-resource-71fb9d16-4d5d-11e8-a4d9-0a580a480f75 
Added `s3` successfully.
mc: <ERROR> To copy a folder requires --recursive flag. Invalid arguments provided, please refer `mc <command> -h` for relevant documentation. 
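
A minimal sketch of the first suggested fix: normalize the sourceURL so that directory-like prefixes always end with "/" before mc is invoked. The isDirectory argument is an assumption about how the handler would know the source is a prefix rather than a single object; this helper does not exist in VCK today.

package main

import (
	"fmt"
	"strings"
)

// normalizeSourceURL appends a trailing "/" to directory-like sources so the
// recursive copy path is taken and the mc --recursive error above is avoided.
func normalizeSourceURL(sourceURL string, isDirectory bool) string {
	if isDirectory && !strings.HasSuffix(sourceURL, "/") {
		return sourceURL + "/"
	}
	return sourceURL
}

func main() {
	fmt.Println(normalizeSourceURL("s3://bucket/path/to/dir", true)) // s3://bucket/path/to/dir/
	fmt.Println(normalizeSourceURL("s3://bucket/file.bin", false))   // s3://bucket/file.bin
}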

Compile failed

main.go:30:2: cannot find package "github.com/IntelAI/vck/pkg/client/clientset/versioned" in any of:
        /usr/local/Cellar/go/1.10.2/libexec/src/github.com/IntelAI/vck/pkg/client/clientset/versioned (from $GOROOT)
        /Users/robscc/workspace/go/src/github.com/IntelAI/vck/pkg/client/clientset/versioned (from $GOPATH)
../../go/src/github.com/IntelAI/vck/pkg/hooks/hooks.go:29:2: cannot find package "github.com/IntelAI/vck/pkg/client/clientset/versioned/typed/vck/v1alpha1" in any of:
        /usr/local/Cellar/go/1.10.2/libexec/src/github.com/IntelAI/vck/pkg/client/clientset/versioned/typed/vck/v1alpha1 (from $GOROOT)
        /Users/robscc/workspace/go/src/github.com/IntelAI/vck/pkg/client/clientset/versioned/typed/vck/v1alpha1 (from $GOPATH)
../../go/src/github.com/IntelAI/vck/pkg/controller/controller.go:26:2: cannot find package "github.com/IntelAI/vck/pkg/client/informers/externalversions" in any of:
        /usr/local/Cellar/go/1.10.2/libexec/src/github.com/IntelAI/vck/pkg/client/informers/externalversions (from $GOROOT)
        /Users/robscc/workspace/go/src/github.com/IntelAI/vck/pkg/client/informers/externalversions (from $GOPATH)

It looks like some generated files (the clientset and informers packages) are missing.

Add missing e2e tests

The tests should cover the following:

  1. The NFS server IP is unreachable
  2. The NFS path is incorrect
  3. Positive and negative tests for the S3-Dev source type

Feature Request: A new handler for cachefilesd

Extend KVC to work with cachefilesd on the nodes. Using KVC, users will be able to get the node affinity and pass it to the pod scheduler, allowing their jobs to be scheduled on nodes which have their datasets cached from a previous run.

  • Deliver documentation containing the best known configs/methods for installing and tuning cachefilesd, along with the requirements and limitations of this approach. Place the documentation on GitHub at: https://github.com/NervanaSystems/dls-infrastructure.
  • Experiment with an NFS-with-cachefilesd setup: analyze the performance gain when data is cached locally versus accessed over the network every time.
  • Figure out how cachefilesd configurations will be surfaced to the objects (e.g., Pods, Deployments) running in Kubernetes.
  • Implement a cachefilesd handler for KVC.

Periodic S3 source data updating

For example, to re-fetch updates to temporal input datasets.

One potential means of supporting this could be via update hooks. It may be worth adding a new mutable flag to distinguish such sources from non-updateable ones.

Possible issues: the additional data may exceed the declared capacity, and dataset provenance becomes harder to track.
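
A rough sketch of what a periodic re-fetch loop could look like for sources flagged as mutable. Both refetch and the refreshInterval option are hypothetical; this is not an existing VCK hook.

package main

import (
	"fmt"
	"time"
)

// refetch stands in for whatever the update hook would actually do, e.g.
// re-running the S3 sync for the volume.
func refetch(volumeID string) {
	fmt.Printf("re-fetching data for volume %s\n", volumeID)
}

func main() {
	interval := 15 * time.Minute // would come from a new CR option such as refreshInterval
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for range ticker.C {
		// Only volumes flagged as mutable would be refreshed; immutable
		// sources would keep today's fetch-once behaviour.
		refetch("vol1")
	}
}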

README should explain why/when you'd want to use the KVC

I think the README should explain why and when a user would want to use KVC, i.e., what problems it solves.

My understanding from past discussions is that KVC caches data on nodes. Is this intended primarily for on-prem deployments?

Fail to fetch a "directory" in s3 with minio/mc

VCK uses minio/mc to download files from S3, but there are some issues that prevent VCK from working:

  1. Without a default --api parameter, mc config host add fails. See #2422.
  2. Even if that works, the minio cp command still prints error output, which makes the mc command return a non-zero exit code; this in turn makes the VolumeManager show a failed status. See #2460.

For the first issue, an additional parameter such as s3Version is necessary if VCK wants to support different S3 API versions; the minio client documentation shows that Google Cloud Storage uses a different S3 API version than other providers.
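
A hedged sketch of how a proposed s3Version option could be mapped onto mc's --api flag (which accepts signature versions such as S3v4 and S3v2) when building the mc config host add arguments. The s3Version field and the helper below are hypothetical, not existing VCK code.

package main

import (
	"fmt"
	"strings"
)

// mcConfigHostArgs builds the argument list for "mc config host add",
// appending --api only when an S3 API version is requested.
func mcConfigHostArgs(alias, endpoint, accessKey, secretKey, s3Version string) []string {
	args := []string{"config", "host", "add", alias, endpoint, accessKey, secretKey}
	if s3Version != "" {
		args = append(args, "--api", s3Version)
	}
	return args
}

func main() {
	// Google Cloud Storage would be configured with the older signature version.
	args := mcConfigHostArgs("gcs", "https://storage.googleapis.com", "ACCESS", "SECRET", "S3v2")
	fmt.Println("mc " + strings.Join(args, " "))
}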

Add a garbage collector.

The responsibilities of the garbage collector might be the following:

  • Evict data, PVs and PVCs in case of disk pressure.
  • Use a simple algorithm such as LRU.
  • Delete orphaned PVs, PVCs and Pods.

This is not an exhaustive list.
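
A minimal sketch of the LRU idea, tracking volume IDs only; in a real implementation the eviction callback would also delete the PV/PVC and the on-disk data. None of these names exist in VCK today.

package main

import (
	"container/list"
	"fmt"
)

type lruTracker struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // volume ID -> list element
	onEvict  func(volumeID string)
}

func newLRUTracker(capacity int, onEvict func(string)) *lruTracker {
	return &lruTracker{
		capacity: capacity,
		order:    list.New(),
		items:    map[string]*list.Element{},
		onEvict:  onEvict,
	}
}

// Touch marks a volume as recently used and evicts the least recently used
// volume once the capacity is exceeded.
func (t *lruTracker) Touch(volumeID string) {
	if el, ok := t.items[volumeID]; ok {
		t.order.MoveToFront(el)
		return
	}
	t.items[volumeID] = t.order.PushFront(volumeID)
	if t.order.Len() > t.capacity {
		oldest := t.order.Back()
		t.order.Remove(oldest)
		id := oldest.Value.(string)
		delete(t.items, id)
		t.onEvict(id)
	}
}

func main() {
	gc := newLRUTracker(2, func(id string) { fmt.Println("evicting", id) })
	gc.Touch("vol1")
	gc.Touch("vol2")
	gc.Touch("vol3") // evicts vol1
}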

Support a data distribution strategy

Currently the data is replicated across the replicas if the number of replicas provided is >1. To support sharding, the following can be implemented (a rough sketch follows the list):

  • Introduce a filterPattern in the options of the CR spec: a glob pattern along with the number of replicas that should hold the data matching that pattern. The pattern can be of the form:
  - {'1*.png': 2, '2*.png': 1, '3*.png': 1}

This would mean that all files matching 1*.png are placed on 2 nodes and, depending on the strategy, they are either replicated or distributed.

  • Introduce an option to provide the replication strategy. The possible strategies are:
    -- Replicate (Implemented first)
    -- Distribute (Implemented later)
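
A rough sketch of how a filterPattern could drive the assignment: files are matched against each glob and mapped to the requested number of replicas (round-robin here). The filterPattern option and this assignment logic are the proposal from this issue, not existing behaviour.

package main

import (
	"fmt"
	"path/filepath"
)

// assignFiles maps each file to the replica indices that should hold it,
// according to the per-pattern replica counts in filterPattern.
func assignFiles(files []string, filterPattern map[string]int, totalReplicas int) map[string][]int {
	assignment := map[string][]int{}
	next := 0
	for _, f := range files {
		for pattern, n := range filterPattern {
			ok, err := filepath.Match(pattern, filepath.Base(f))
			if err != nil || !ok {
				continue
			}
			for i := 0; i < n; i++ {
				assignment[f] = append(assignment[f], (next+i)%totalReplicas)
			}
			next = (next + n) % totalReplicas
			break
		}
	}
	return assignment
}

func main() {
	filterPattern := map[string]int{"1*.png": 2, "2*.png": 1, "3*.png": 1}
	files := []string{"10.png", "11.png", "20.png", "30.png"}
	fmt.Println(assignFiles(files, filterPattern, 3))
}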

Investigate the possibility of preserving logs

There are a few possible ways this can be achieved:

  1. Introduce a source type which uploads all the data from a particular mount path to remote storage and exposes it in a container where logs can be grabbed.
  2. Introduce a source type which grabs logs from pods and uploads them to remote storage (a rough sketch of this option follows).
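
For the second option, a hedged sketch (assuming a recent client-go) of streaming a pod's logs to a local file that a remote-storage uploader could then push; the namespace, pod name and output path are placeholders.

package main

import (
	"context"
	"io"
	"log"
	"os"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// Stream the logs of a (placeholder) pod in the default namespace.
	req := clientset.CoreV1().Pods("default").GetLogs("my-training-pod", &corev1.PodLogOptions{})
	stream, err := req.Stream(context.TODO())
	if err != nil {
		log.Fatal(err)
	}
	defer stream.Close()

	out, err := os.Create("/tmp/my-training-pod.log")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	// A real handler would upload this file to S3/NFS rather than keeping it locally.
	if _, err := io.Copy(out, stream); err != nil {
		log.Fatal(err)
	}
}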

Sync back with S3 data source.

If I have my data downloaded/synced from S3 to a hostPath, is there a way to sync my changes in the hostPath back to my S3 storage?
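
If VCK itself does not support this, a manual sync back could look like the following sketch (assuming the minio-go v7 API): walk the hostPath directory and upload each file to the bucket. The endpoint, bucket and local path below are placeholders.

package main

import (
	"context"
	"log"
	"os"
	"path/filepath"
	"strings"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	client, err := minio.New("s3.amazonaws.com", &minio.Options{
		Creds:  credentials.NewStaticV4(os.Getenv("AWS_ACCESS_KEY_ID"), os.Getenv("AWS_SECRET_ACCESS_KEY"), ""),
		Secure: true,
	})
	if err != nil {
		log.Fatal(err)
	}

	localRoot := "/var/datasets/vck-resource-example" // the hostPath VCK populated
	bucket := "my-bucket"

	// Upload every regular file, mirroring the local layout as object keys.
	err = filepath.Walk(localRoot, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		object := strings.TrimPrefix(path, localRoot+"/")
		_, err = client.FPutObject(context.TODO(), bucket, object, path, minio.PutObjectOptions{})
		return err
	})
	if err != nil {
		log.Fatal(err)
	}
}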

Erroneously empty directory in test pod when trying to mount IBM Cloud Object Storage

I have just tried to mount an S3 bucket from IBM Cloud Object Storage like this:

kubectl create namespace vckns
kubectl config set-context $(kubectl config current-context) --namespace=vckns
git clone https://github.com/IntelAI/vck.git && cd vck
helm init
# Wait until kubectl get pod -n kube-system | grep tiller shows Running state
# Modify helm-charts/kube-volume-controller/values.yaml to use a valid tag from https://hub.docker.com/r/volumecontroller/kube-volume-controller/tags/
# I use tag: "df90277"
helm install helm-charts/kube-volume-controller/ -n vck --wait --set namespace=vckns
kubectl get crd
export AWS_ACCESS_KEY_ID=<aws_access_key>
export AWS_SECRET_ACCESS_KEY=<aws_secret_access_key>
kubectl create secret generic aws-creds --from-literal=awsAccessKeyID=${AWS_ACCESS_KEY_ID} --from-literal=awsSecretAccessKey=${AWS_SECRET_ACCESS_KEY}
# Looked at kubectl get volumemanager vck-example1 -o yaml to see if "state: Pending" changes or more precisely: kubectl get volumemanager vck-example1 -o jsonpath='{.status.state}'
kubectl create -f resources/customresources/s3/one-vc.yaml
# File content:
apiVersion: vck.intelai.org/v1
kind: VolumeManager
metadata:
  name: vck-example1
  namespace: vckns
spec:
  volumeConfigs:
    - id: "vol1"
      replicas: 1
      sourceType: "S3"
      accessMode: "ReadWriteOnce"
      #nodeAffinity:
      #  - <insert-node-affinity-here>
      #tolerations:
      #  - <insert-tolerations-here>
      capacity: 5Gi
      labels:
        key1: val1
        key2: val2
      options:
        endpointURL: https://s3-api.us-geo.objectstorage.softlayer.net
        awsCredentialsSecretName: aws-creds
        sourceURL: "s3://<bucket_name>/"
        # dataPath: <insert-data-path-here-optional>"
        # distributionStrategy: <insert-distributed-strategy-here-optional>


kubectl create -f resources/pods/vck-pod.yaml
# File content:
apiVersion: v1
kind: Pod
metadata:
  name: vck-claim-pod
spec:
  #affinity:
  #  <insert-node-affinity-from-cr-status>
  volumes:
    - name: dataset-claim
      hostPath:
        path: /var/datasets/vck-resource-a2140d72-11c2-11e8-8397-0a580a440340
  containers:
  - image: busybox
    command: ["/bin/sh"]
    args: ["-c", "sleep 1d"]
    name: vck-sleep
    volumeMounts:
    - mountPath: /var/data
      name: dataset-claim

When I display the bucket via

AWS_ACCESS_KEY_ID=<aws_access_key> AWS_SECRET_ACCESS_KEY=<aws_secret_access_key> aws s3 ls --endpoint-url https://s3-api.us-geo.objectstorage.softlayer.net s3://<bucket_name>/

I correctly see the bucket content, but when I exec into the pod via kubectl exec -it vck-claim-pod /bin/sh and look into the mount path with ls /var/data it is empty.

Most artifacts seem to be available except for the fact that the resource pod is in state "Completed":

kubectl get pod,crd,pvc,pv,secret
NAME                                                    READY     STATUS      RESTARTS   AGE
pod/vck-64c4945885-2wcnm                                1/1       Running     0          1h
pod/vck-claim-pod                                       1/1       Running     0          17m
pod/vck-resource-32284724-8171-11e8-8d61-0e39608b01ad   0/1       Completed   0          39m

NAME                                                                           AGE
customresourcedefinition.apiextensions.k8s.io/volumemanagers.vck.intelai.org   1h

NAME                         TYPE                                  DATA      AGE
secret/aws-creds             Opaque                                2         1h
secret/default-token-m588d   kubernetes.io/service-account-token   3         2h
secret/vck-token-tlj9t       kubernetes.io/service-account-token   3         1h

I'm currently quite busy and thus only had 10 minutes to play with vck, so there is a good chance this is not a bug, but a user error. Still: Do you have any idea why I cannot see my data?

Thank you in advance.

Use the k8s scheduler to spawn pods, then reconcile.

Currently KVC spawns pods directly on nodes; this can cause issues because it completely bypasses scheduling constraints (affinities, taints, tolerations, etc.).

Change it so that KVC just defines the pods, lets the scheduler assign them to nodes, and then reconciles as needed until the desired dataset placement is met.
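
A rough sketch of the proposed flow, using the k8s.io/api types: the data-download pod carries the affinity/tolerations from the CR but leaves NodeName empty, so the default scheduler places it while honouring taints and affinities. The label key and image below are illustrative only.

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// downloadPod builds a pod definition without pinning it to a node; the
// scheduler decides placement and the controller reconciles afterwards.
func downloadPod(volumeID string, affinity *corev1.Affinity, tolerations []corev1.Toleration) *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			GenerateName: "vck-resource-",
			Labels:       map[string]string{"vck.volume-id": volumeID}, // illustrative label
		},
		Spec: corev1.PodSpec{
			// NodeName is intentionally left empty: the scheduler picks the node.
			Affinity:      affinity,
			Tolerations:   tolerations,
			RestartPolicy: corev1.RestartPolicyNever,
			Containers: []corev1.Container{{
				Name:    "data-download",
				Image:   "minio/mc",
				Command: []string{"/bin/sh", "-c", "echo download step goes here"},
			}},
		},
	}
}

func main() {
	pod := downloadPod("vol1", nil, nil)
	fmt.Println(pod.GenerateName, pod.Spec.Containers[0].Image)
}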

Replace NodeAffinity with Labels

Currently the mechanism used to keep track of which data is where is a nodeAffinity. This causes a few problems:

  1. If a node in the list goes away, or a node is added, the list must be reconciled.
  2. Pods spawned from templates will not get the updated list, causing an uneven distribution of pods.

A better approach is to manage affinity by tagging nodes with labels reflecting the VCK volumes. That way pods can always be matched to the data using a freshly reconciled node list (see the sketch below).
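
A minimal sketch of the label-based matching: the controller labels every node that holds a volume, and consumer pods select on that label via a nodeSelector instead of carrying a node-name affinity list. The label key prefix shown is hypothetical.

package main

import "fmt"

const volumeLabelPrefix = "vck.intelai.org/"

// nodeLabelFor returns the label the controller would put on every node that
// has the data for the given volume.
func nodeLabelFor(volumeID string) (key, value string) {
	return volumeLabelPrefix + volumeID, "true"
}

// podNodeSelectorFor returns the nodeSelector a consuming pod would use; the
// matching node set is resolved at scheduling time, so nodes joining or
// leaving the cluster no longer require patching pod templates.
func podNodeSelectorFor(volumeID string) map[string]string {
	k, v := nodeLabelFor(volumeID)
	return map[string]string{k: v}
}

func main() {
	fmt.Println(podNodeSelectorFor("vck-example1-vol1"))
}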

Ability to specify which nodes to use

I want to use certain nodes that have more memory, but I don't see a way to specify memory requirements or specific node names in the VolumeManager spec.

Create ksonnet prototype for KVC

Providing a ksonnet prototype would make it easier for users in the Kubeflow ecosystem, and also make it possible to provide more consolidated install instructions.

Multiple KVC resources copying data to the same node

Yesterday we used KVC to copy data from GCS (via S3) to 6 replicas in the cluster. We saw the pods start running, and then noticed that 3 of them were running on the same node. It doesn't seem to make sense for KVC to copy the same data 3 times to the same node, and this caused the node to run out of disk space.
