
kube_sipecam's People

Contributors

palmoreck, caroacostatovany

Watchers

James Cloos, Carlos Alonso, Omar Miranda, Sebastián Cadavid-Sánchez

kube_sipecam's Issues

Update the metadata.annotations section when creating a PVC using aws-efs as the provisioner for the storage class

In the past, the annotation volume.beta.kubernetes.io/storage-class was used instead of the storageClassName attribute. The annotation still works; however, it won't be supported in a future Kubernetes release.

ref:

https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims

https://kubernetes.io/docs/concepts/storage/persistent-volumes/#class

So, when using aws-efs as the provisioner for the storage class, the metadata.annotations section needs to be updated when creating the PVC:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: efs
  namespace: kubeflow
  annotations:
    volume.beta.kubernetes.io/storage-class: "aws-efs"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
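
Per the Kubernetes docs linked above, the updated PVC would drop the annotation in favor of spec.storageClassName; a minimal sketch of how it could look:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: efs
  namespace: kubeflow
spec:
  # replaces the volume.beta.kubernetes.io/storage-class annotation
  storageClassName: aws-efs
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi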

Or substitute another provisioner (or follow the suggestions from eterna2 in the Kubeflow Slack chat; I checked whether I had saved those suggestions and didn't find them, but they were based on a node selector), because

https://github.com/kubernetes-retired/external-storage/tree/master/aws/efs

looks like it will be retired.

Check if shm needs to be mounted as a volume for audio processing

Using the docker run command with the nvcr.io/nvidia/tensorflow:19.03-py3 Docker image,

I got:

The SHMEM allocation limit is set to the default of 64MB. This may be
insufficient for TensorFlow. NVIDIA recommends the use of the following flags:
nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ...

Maybe I need to mount a volume like:

    volumeMounts:
      - name: efs-pvc
        mountPath: "/shared_volume"
      - name: dshm
        mountPath: /dev/shm
  volumes:
    - name: efs-pvc
      persistentVolumeClaim:
        claimName: efs
    - name: dshm
      emptyDir:
        medium: Memory

in

https://github.com/CONABIO/kube_sipecam/blob/master/deployments/audio/kale-jupyterlab-kubeflow_0.4.0_1.14.0_tf_cpu.yaml#L35

??
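
If the default 64MB really is insufficient, NVIDIA's --shm-size=1g flag could be mirrored on the Kubernetes side by capping the memory-backed emptyDir; a sketch (the 1Gi value is an assumption taken from NVIDIA's recommendation):

    - name: dshm
      emptyDir:
        medium: Memory
        sizeLimit: 1Gi   # assumption: mirrors --shm-size=1g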

Create a "standard" Dockerfile in the docs to be integrated into the kube_sipecam framework

It would be useful to give potential developers of processing systems a "standard" Dockerfile so their systems can be integrated into the kube_sipecam framework.

It could be something like:

FROM ubuntu:bionic

ENV TIMEZONE America/Mexico_City
ENV JUPYTERLAB_VERSION 2.1.4
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENV DEBIAN_FRONTEND noninteractive
ENV DEB_BUILD_DEPS="sudo nano less git python3-dev python3-pip python3-setuptools curl wget"
ENV DEB_PACKAGES=""
ENV PIP_PACKAGES_KALE="click==7.0 six==1.12.0 setuptools==41.0.0 urllib3==1.24.2 kubeflow-kale==0.5.0"

RUN apt-get update && export DEBIAN_FRONTEND=noninteractive && \
    echo $TIMEZONE > /etc/timezone && apt-get install -y tzdata

RUN apt-get update && apt-get install -y $DEB_BUILD_DEPS $DEB_PACKAGES && pip3 install --upgrade pip

RUN curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash - && apt-get install -y nodejs

RUN pip3 install jupyter "jupyterlab<2.0.0" --upgrade
RUN jupyter notebook --generate-config && sed -i "s/#c.NotebookApp.password = .*/c.NotebookApp.password = u'sha1:115e429a919f:21911277af52f3e7a8b59380804140d9ef3e2380'/" /root/.jupyter/jupyter_notebook_config.py

RUN pip3 install $PIP_PACKAGES_KALE --upgrade

RUN jupyter labextension install kubeflow-kale-launcher

# install the processing system's package, for example:
RUN pip3 install "git+https://github.com/CONABIO/geonode.git#egg=geonode_conabio&subdirectory=python3_package_for_geonode"

VOLUME ["/shared_volume"]

# create a URL prefix, for example:
ENV NB_PREFIX geonodeurl

# use the URL prefix in:
ENTRYPOINT ["/usr/local/bin/jupyter", "lab", "--ip=0.0.0.0", "--no-browser", "--allow-root", "--LabApp.allow_origin='*'", "--LabApp.base_url=geonodeurl"]
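
For context, an image built from such a Dockerfile would be wired into a kube_sipecam deployment roughly as below; a sketch, assuming the efs PVC from the deployments above (the container name and image tag are hypothetical):

spec:
  containers:
    - name: processing-system                      # hypothetical container name
      image: registry.example/geonode-kale:0.1.0   # hypothetical tag for an image built from the Dockerfile above
      env:
        - name: NB_PREFIX
          value: geonodeurl
      volumeMounts:
        - name: efs-pvc
          mountPath: "/shared_volume"
  volumes:
    - name: efs-pvc
      persistentVolumeClaim:
        claimName: efs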

Docker image wasn't correctly built for tfx usage and kale

Need to check the dependencies in the Dockerfile:

https://github.com/CONABIO/kube_sipecam/blob/master/dockerfiles/audio/tfx_kale/0.4.0_1.14.0_tfx/Dockerfile

There are version errors related to kale 0.3.4 and TensorFlow 1.14.0.

For example:

ERROR: nbclient 0.1.0 has requirement nbformat>=5.0, but you'll have nbformat 4.4.0 which is incompatible.
ERROR: kfp 0.1.40 has requirement click==7.0, but you'll have click 7.1.1 which is incompatible.

Version 0.4.0 of kale also produces errors

It's not necessary to have two deployments of kale-jupyterlab-kubeflow_0.4.0_1.14.0

After running tests, it was seen that it is not necessary to distinguish between having the following line:

https://github.com/CONABIO/kube_sipecam/blob/master/deployments/audio/kale-jupyterlab-kubeflow_0.4.0_1.14.0_tf.yaml#L35

and not having it in the deployment:

https://github.com/CONABIO/kube_sipecam/blob/master/deployments/audio/kale-jupyterlab-kubeflow_0.4.0_1.14.0_tf_cpu.yaml#L32

At least when using the example for torch:

https://github.com/CONABIO/kube_sipecam_playground/tree/issue-1/audio/notebooks/dockerfiles/tf_kale/0.4.0_1.14.0_tf/cifar10

the kubeflow+kale run was successful

So I could either delete the file

https://github.com/CONABIO/kube_sipecam/blob/master/deployments/audio/kale-jupyterlab-kubeflow_0.4.0_1.14.0_tf_cpu.yaml

or use this file to compile the notebook via kale and avoid problems in Kubernetes with not finding nodes with GPUs (because setting the parameter nvidia.com/gpu: 1 inside the limits block causes this message).
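
For reference, the difference between the two files is whether the container spec carries a GPU limit like the snippet below; a pod that declares it stays Pending when no node advertises GPUs (a sketch of the relevant fragment):

        resources:
          limits:
            nvidia.com/gpu: 1   # pod is unschedulable if no node exposes this resource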

Docker image for kale outputs "permission denied"

Check

https://github.com/CONABIO/kube_sipecam/blob/master/dockerfiles/audio/0.4.0/Dockerfile

When the image is deployed, the following output is produced:

Fail to get yarn configuration. {"type":"error","data":"Could not write file "/usr/local/lib/python3.6/dist-packages/jupyterlab/yarn-error.log": "EACCES: permission denied, open '/usr/local/lib/python3.6/dist-packages/jupyterlab/yarn-error.log'""}
{"type":"error","data":"An unexpected error occurred: "EACCES: permission denied, scandir '/home/miuser/.config/yarn/link'"."}
{"type":"info","data":"Visit https://yarnpkg.com/en/docs/cli/config for documentation about this command."}

There have been errors when finding TensorRT libs; use the Docker image from NVIDIA

TensorRT is for high-performance inference; see the blog.

Github:

https://github.com/NVIDIA/TensorRT

Not sure when and how I got errors like:

tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
2020-03-24 13:32:09.746769: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

Maybe when using tfx, or some of tfx's dependencies?... One way to start solving the previous error is to use the Docker image from https://ngc.nvidia.com/catalog/containers/nvidia:tensorrt :

docker pull nvcr.io/nvidia/tensorrt:20.03-py3
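
A sketch of how that image could be referenced in a pod spec, requesting a GPU since TensorRT targets NVIDIA GPUs (the container name is hypothetical):

    containers:
      - name: tensorrt                         # hypothetical container name
        image: nvcr.io/nvidia/tensorrt:20.03-py3
        resources:
          limits:
            nvidia.com/gpu: 1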

Choose tfx docker image as base image for audio processing kubeflow pipelines

If the 0.21.4 Docker image is used as the base image, the following output is obtained:

2020-04-24 17:41:30.028801: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib
2020-04-24 17:41:30.029009: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib
2020-04-24 17:41:30.029035: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
usage: run_executor.py [-h] --executor_class_path EXECUTOR_CLASS_PATH
                       [--temp_directory_path TEMP_DIRECTORY_PATH]
                       (--inputs INPUTS | --inputs-base64 INPUTS_BASE64)
                       (--outputs OUTPUTS | --outputs-base64 OUTPUTS_BASE64)
                       (--exec-properties EXEC_PROPERTIES | --exec-properties-base64 EXEC_PROPERTIES_BASE64)
                       [--write-outputs-stdout]
run_executor.py: error: the following arguments are required: --executor_class_path

If the tfx base Docker image is used as the base image, the following output is obtained:

Extracting Bazel installation...
                                                           [bazel release 3.0.0]
Usage: bazel <command> <options> ...

Available commands:
  analyze-profile     Analyzes build profile data.
  aquery              Analyzes the given targets and queries the action graph.
  build               Builds the specified targets.
  canonicalize-flags  Canonicalizes a list of bazel options.
  clean               Removes output files and optionally stops the server.
  coverage            Generates code coverage report for specified test targets.
...
Getting more help:
  bazel help <command>
                   Prints help and options for <command>.
  bazel help startup_options
                   Options for the JVM hosting bazel.
  bazel help target-syntax
                   Explains the syntax for specifying targets.
  bazel help info-keys
                   Displays a list of keys used by the info command.

Need to choose which tfx base Docker image to use in the audio processing Kubeflow pipelines.

See:

tfx Dockerfile

mnist pipeline native keras

taxi pipeline simple

Error using datacube 1.8.0 regarding invalid projection, Proj Error

Error using datacube 1.8.0:

pyproj.exceptions.CRSError: Invalid projection: PROJCS["unnamed",GEOGCS["WGS 84",DATUM["unknown",SPHEROID["WGS84",6378137,6556752.3141]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433]],PROJECTION["Lambert_Conformal_Conic_2SP"],PARAMETER["standard_parallel_1",17.5],PARAMETER["standard_parallel_2",29.5],PARAMETER["latitude_of_origin",12],PARAMETER["central_meridian",-102],PARAMETER["false_easting",2500000],PARAMETER["false_northing",0]]: (Internal Proj Error: proj_create: buildCS: missing UNIT)

Check:

opendatacube/datacube-core#880

For:

https://github.com/CONABIO/kube_sipecam/blob/master/dockerfiles/MAD_Mex/odc_kale/0.1.0_1.8.3_0.5.0/Dockerfile

Create dir minikube_sipecam

Development has been done in

https://github.com/CONABIO/kube_sipecam/tree/master/deployments/MAD_Mex

using minikube, kubeflow and kale.

Create a minikube_sipecam dir under the root dir of this repo to hold an explanation of this development.

Primarily, this dir will hold all the documentation of the requirements and the instructions to deploy the system. It will help to:

  • be a proof of concept and local deployment of the kube_sipecam processing system.

  • adopt the kube_sipecam processing system and become familiar with its pipelines.
