
script-languages-container-tool's Introduction

Script-Languages-Container-Tool

Overview

The Script-Languages-Container-Tool (exaslct) is the build tool for script language containers. You can build, export and upload script language containers from so-called flavors, which describe how to build a script language container. You can find pre-defined flavors in the script-languages-release repository, which also describes how to customize these flavors to your needs.

In a Nutshell

Prerequisites

For installation

In order to install this tool, your system needs to provide the following prerequisites:

For running

In order to use this tool, your system needs to fulfill the following prerequisites:

  • Software

  • System Setup

    • We recommend at least 50 GB of free disk space on the partition where Docker stores its images. On Linux, Docker typically stores the images at /var/lib/docker.
    • For the partition where the output directory (default: ./.build_output) is located, we recommend at least an additional 10 GB of free disk space.
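
The disk space recommendations above can be checked before a build. The following is a minimal sketch, not part of exaslct: the function names are hypothetical, and it assumes that 1 GB means 10^9 bytes.

```python
import shutil
from pathlib import Path

GB = 10 ** 9  # assumption: decimal gigabytes


def free_gb(path: str) -> float:
    """Return the free space in GB on the partition that contains `path`."""
    probe = Path(path).resolve()
    # Walk up to the nearest existing ancestor, so that an output directory
    # which has not been created yet can still be checked.
    while not probe.exists():
        probe = probe.parent
    return shutil.disk_usage(probe).free / GB


def check_disk_space(requirements: dict[str, float]) -> list[str]:
    """Return one warning per path that has less free space than recommended."""
    warnings = []
    for path, minimum_gb in requirements.items():
        available = free_gb(path)
        if available < minimum_gb:
            warnings.append(
                f"{path}: {available:.1f} GB free, {minimum_gb:.0f} GB recommended"
            )
    return warnings
```

A call such as check_disk_space({"/var/lib/docker": 50, "./.build_output": 10}) returns an empty list if both recommendations are met.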

Further prerequisites might be necessary for specific tasks. These are listed in the corresponding sections.

Installation

You have two options to use this project:

  • as a pure Python project
  • using the start scripts which pull the correct container image from Dockerhub and execute it within the Docker container

Pure Python

Find the wheel package for a specific release under assets.

Install the python package with python3 -m pip install https://github.com/exasol/script-languages-container-tool/releases/download/$VERSION/exasol_script_languages_container_tool-$VERSION-py3-none-any.whl. Replace $VERSION with the latest version or the specific version you are interested in.

Starter scripts

To install the starter scripts, you first need to install the Python package once (see the previous section).

Install the starter scripts, which allow you to run exaslct within a Docker container: python3 -m exasol_script_languages_container_tool.main install-starter-scripts --install-path $YOUR_INSTALL_PATH

This creates a subfolder containing the scripts themselves and a symlink exaslct in $YOUR_INSTALL_PATH, which can be used as the entry point.

Usage

For simplicity the following examples use the starter script version (exaslct). If you want to use the pure Python package, simply replace exaslct with python3 -m exasol_script_languages_container_tool.main in all examples.

How to build an existing flavor?

Create the language container and export it to the local file system

./exaslct export --flavor-path=flavors/<flavor-name> --export-path <export-path>

or upload it directly into the BucketFS (currently http only, https follows soon)

./exaslct upload --flavor-path=flavors/<flavor-name> --database-host <hostname-or-ip> --bucketfs-port <port> \
                   --bucketfs-username w --bucketfs-password <password>  --bucketfs-name <bucketfs-name> \
                   --bucket-name <bucket-name> --path-in-bucket <path/in/bucket>

Once it is successfully uploaded, it will print the ALTER SESSION statement that can be used to activate the script language container in the database.

How to activate a script language container in the database

If you uploaded a container manually, you can generate the language activation statement with

./exaslct generate-language-activation --flavor-path=flavors/<flavor-name> --bucketfs-name <bucketfs-name> \
                                         --bucket-name <bucket-name> --path-in-bucket <path/in/bucket> --container-name <container-name>

where <container-name> is the name of the uploaded archive without its file extension. Execute the generated statement in your database session to activate the container for the current session or system-wide.

This command will print a SQL statement to activate the language similar to the following one:

ALTER SESSION SET SCRIPT_LANGUAGES='<LANGUAGE_ALIAS>=localzmq+protobuf:///<bucketfs-name>/<bucket-name>/<path-in-bucket>/<container-name>?lang=<language>#buckets/<bucketfs-name>/<bucket-name>/<path-in-bucket>/<container-name>/exaudf/exaudfclient[_py3]';
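
To make the structure of this statement easier to follow, here is a minimal sketch that assembles it from its parts. It is not the implementation used by exaslct; the function name and parameters are hypothetical, and the template is taken from the example above.

```python
def language_activation_statement(
    alias: str,
    bucketfs_name: str,
    bucket_name: str,
    path_in_bucket: str,
    container_name: str,
    language: str,
    udf_client: str,
    scope: str = "SESSION",
) -> str:
    """Assemble an ALTER SESSION/SYSTEM statement following the template above."""
    if scope not in ("SESSION", "SYSTEM"):
        raise ValueError(f"unsupported scope: {scope}")
    # The container path appears twice: once as the script-language URL path
    # and once below "buckets/" as the location of the UDF client binary.
    segments = [bucketfs_name, bucket_name]
    segments += [part for part in path_in_bucket.split("/") if part]
    segments.append(container_name)
    container_path = "/".join(segments)
    url = (
        f"localzmq+protobuf:///{container_path}?lang={language}"
        f"#buckets/{container_path}/{udf_client}"
    )
    return f"ALTER {scope} SET SCRIPT_LANGUAGES='{alias}={url}';"
```

Here udf_client would be, for example, exaudf/exaudfclient or exaudf/exaudfclient_py3, matching the [_py3] suffix in the template.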

Please refer to the User Guide for more detailed information on how to use exaslct.

Features

  • Build a script language container as Docker images
  • Export a script language container as an archive which can be used for extending Exasol UDFs
  • Upload a script language container as an archive to the Exasol DB's BucketFS
  • Generate the activation command for a script language container
  • Use Docker registries, such as Docker Hub, as a cache to avoid rebuilding images without changes
  • Push Docker images to Docker registries
  • Run tests for your container against an Exasol DB (docker-db or external db)

Limitations

  • Caution with symbolic links: If you use symbolic links inside any directory passed as a command line argument, they must not point to files or directories outside of that directory. For example, with --flavor-path ./flavors/my_flavor/, no symbolic link inside ./flavors/my_flavor may point anywhere outside of ./flavors/my_flavor. Background: Local directories must be mounted explicitly into the Docker container. We currently mount only the directories given as command line arguments, but we do not analyze their content. We plan to fix this limitation with #35.
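
Whether a directory is affected by this limitation can be checked up front. The following sketch is not part of exaslct and the function name is hypothetical; it lists all symbolic links below a directory whose resolved target lies outside of that directory.

```python
import os


def find_escaping_symlinks(root: str) -> list[tuple[str, str]]:
    """Return (link, resolved target) pairs for links that leave `root`."""
    root_real = os.path.realpath(root)
    escaping = []
    # followlinks=False: links to directories are reported but not descended into.
    for dirpath, dirnames, filenames in os.walk(root_real, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            if os.path.commonpath([root_real, target]) != root_real:
                escaping.append((path, target))
    return sorted(escaping)
```

An empty result for the directory given as --flavor-path means that no link inside it points outside of it.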

MacOsX Limitations

  • On MacOsX, all arguments (flavors path, output directory, etc.) must point to locations within the current directory, because the MacOsX version does not support bind-mounting additional directories.

Table of Contents

Information for Users

Information for Developers

script-languages-container-tool's Issues

Force rebuild of docker image for unit tests.

Background information

When changing unit tests/cli/lib, exaslct keeps using the latest docker image, because the check is based on the git SHA.
Thus, the changes are not actually incorporated when executing tests/commands locally.

Acceptance Criteria

In this case, force rebuild of the docker image.

Improve exaslct starter script to mount all paths which are used by the command line parameters

Background

  • Currently, the exaslct starter script only mounts the current working directory. However, the user can specify paths to other locations in certain command line parameters of exaslct, such as --flavor-path, --save-path, --export-path, --cache-directory, --temporary-base-directory, --output-directory, ...
  • At the moment, if these paths are not relative to the current working directory, the starter script uses paths inside the runner container for them. As a result, they are only available in the container and don't get synced to the host

Acceptance Criteria

  • paths used in the command line parameters get mounted into the runner container
  • the permissions of these mounted directories get fixed after or during the run, because the container and the host might use different uids and gids

Use host network for docker starter

Background

  • In the docker starter exaslct_within_docker_container.sh, we start exaslct in a docker container
  • If we use exaslct upload, customers will most likely assume that the forwarded ports of the docker-db are also the ports to connect to
  • However, if exaslct runs in a docker container that is not on the host network, the port forwards are inaccessible from inside the container
  • To provide a more natural behavior we should start the docker starter container with --network host

Exaslct fails with mount_point_paths[@]: unbound variable

Background

It turned out that #10 is causing issues with bash version < 4.4, for example on CentOS 7.
The check if an element key is within an associative container is not working the same way on bash 4.2 as on bash 4.4 (using -v).

Acceptance

The bash script needs to be compatible with bash 4.2.

Investigate if the build cache for the docker image is used in exaslct

Background

  • exasol/script-languages-release#245 asked whether we keep the docker build cache and whether we can use it
  • When docker builds images, each step in a Dockerfile produces an intermediate image which is kept in the build cache
  • Because exaslct copies the build context to a temporary directory, we probably can't reuse these intermediate images, since Docker doesn't associate the new directory with a previous build
  • As such, keeping the build cache is likely not useful

Tasks

  • Investigate whether the docker build cache can be used.
  • If it can't be used, check if we keep it.
  • If we keep it, change that.

Acceptance Criteria

  • Either the build cache is used and improves the performance of rebuilds of single images, or we remove the build cache after the build

Check release consistency

Background

When creating a release we should verify the correctness of the package version (in pyproject.toml).
It happened in the past that a release was created with the wrong version of the python package.

Acceptance Criteria

Create a GitHub Action which is triggered during release-droid execution and verifies the correctness of the package version.

Upload does not show error when not successful

Background

If the upload of a container to BucketFS fails, exaslct does not show an error message to the user.

Acceptance Criteria

Show an error message if the upload fails (for example, if the BucketFS password is incorrect).

Fix docker image labels for merges to main and for releases

Background

  • We currently update the docker image with the label latest for each merge/push to main, but it would be better to have it point to the last release.

Acceptance Criteria

We need two CI tasks:

  • publish-main, which runs whenever main gets updated (merge, push); it builds the docker image for the main branch and pushes it with the label main.
  • publish-release, which runs whenever we create a release (a release tag gets created); it builds the docker image for the commit of this release and pushes it to Docker Hub with the release name as label. Furthermore, it advances the latest tag to the release tag, builds the image with the label latest as well, and pushes it to Docker Hub, too.

bash starter_scripts/build_docker_runner_image.sh <label>
bash starter_scripts/push_docker_runner_image.sh <label>

Mount binding not working correctly if format is in form of key=value

Background

In #10 we implemented "mount binding" to the docker container for directories which were given in the parameters.
This mechanism is working if the parameter is given in the form of:

exaslct cmd key1 value1 key2 value2

but not when given in the form:

exaslct cmd key1=value1 key2=value2

The second form is documented in the official README.md.

Acceptance Criteria

Change the parameter parsing in exaslct_within_docker_container.sh so that it also collects directories given as parameters in the form:

exaslct cmd key1=value1 key2=value2
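
The parsing logic for both forms can be sketched as follows. The actual starter script is a bash script, so this Python version only illustrates the logic; the function name is hypothetical, and the set of path-valued options has to be supplied by the caller.

```python
def collect_option_values(arguments: list[str], path_options: set[str]) -> list[str]:
    """Collect the values of the given options in both supported forms."""
    values = []
    index = 0
    while index < len(arguments):
        name, separator, value = arguments[index].partition("=")
        if name in path_options:
            if separator:
                # Form "--key=value": the value is part of the same argument.
                values.append(value)
            elif index + 1 < len(arguments):
                # Form "--key value": the value is the next argument.
                index += 1
                values.append(arguments[index])
        index += 1
    return values
```

For example, the arguments export --flavor-path=flavors/a --export-path /tmp/out yield flavors/a and /tmp/out when both options are listed in path_options.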

Forward the docker registry password environment variables to the docker runner

Background:

  • CI processes might use the following environment variables to supply the passwords for the docker registry
    TARGET_DOCKER_PASSWORD
    SOURCE_DOCKER_PASSWORD
    
  • The docker runner currently doesn't forward these

Rationale

  • We could use docker run --env or docker run --env-file to provide the environment variables
  • Because these environment variables contain passwords, the file is the better option: it is less likely that we leak them to stdout from there
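
The env-file option could be sketched as follows. This is an illustration and not the starter script's implementation: the function names and the file prefix are hypothetical, and values containing newlines are not handled, because docker reads one VAR=value entry per line.

```python
import os
import tempfile

PASSWORD_VARIABLES = ("TARGET_DOCKER_PASSWORD", "SOURCE_DOCKER_PASSWORD")


def write_env_file(environment, names=PASSWORD_VARIABLES) -> str:
    """Write the selected variables to a private temporary file and return its path."""
    # mkstemp creates the file readable and writable only by the current user.
    descriptor, path = tempfile.mkstemp(prefix="exaslct_env_")
    with os.fdopen(descriptor, "w") as env_file:
        for name in names:
            if name in environment:
                env_file.write(f"{name}={environment[name]}\n")
    return path


def docker_env_arguments(env_file_path: str) -> list[str]:
    """Return the `docker run` arguments that load the env file."""
    return ["--env-file", env_file_path]
```

The caller would pass os.environ as environment and should delete the file after the container has exited.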

Update integration-test-environment to version 0.3

Background

  • integration-test-environment version 0.3 contains new docker-db versions and new commandline parameters to set db-mem size and db-disk size

Acceptance Criteria

  • update the version in pyproject.toml and poetry.lock
  • add command line arguments for db-mem size and db-disk size to run-db-test

Fix handling of path_in_bucket parameter

Background

Parameter path_in_bucket is instantiated with type luigi.Parameter, but it is optional (compare test test_docker_upload).
This causes the following warning during test execution:

......./usr/local/lib/python3.6/dist-packages/luigi/parameter.py:279: UserWarning: Parameter "path_in_bucket" with value "None" is not of type string.
  warnings.warn('Parameter "{}" with value "{}" is not of type string.'.format(param_name, param_value))

Acceptance Criteria

Handle absence of parameter path_in_bucket gracefully.
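
One part of handling the absence gracefully is to treat a missing value like an empty path wherever the upload path is assembled. The following sketch shows this with a hypothetical helper; it is not the code used in exaslct. On the parameter side, luigi also provides OptionalParameter for values that may be None, which might avoid the warning itself.

```python
from typing import Optional


def bucket_upload_path(
    bucket_name: str, path_in_bucket: Optional[str], file_name: str
) -> str:
    """Join the parts of the upload path and skip a missing path_in_bucket."""
    segments = [bucket_name]
    if path_in_bucket:
        # Ignore leading, trailing and duplicate slashes.
        segments += [part for part in path_in_bucket.split("/") if part]
    segments.append(file_name)
    return "/".join(segments)
```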

Add CLI command to add/remove/update packages

Background:

  • Currently, users need to edit files to add packages. This is often inconvenient and allows no sanity checks.
  • CLI commands could manage the package list files, check for conflicts and incompatibilities, and select the correct file to edit

Example of how this could look

exaslct add-packages --flavor-path flavors/standard-EXASOL-7.0.0 --python3-package "joblib=1.0.1"
exaslct add-packages --flavor-path flavors/standard-EXASOL-7.0.0 --python2-package "joblib=1.0.1"
exaslct add-packages --flavor-path flavors/standard-EXASOL-7.0.0 --cran-package "opitools=1.0.3"
They can also be combined:
exaslct add-packages --flavor-path flavors/standard-EXASOL-7.0.0 --python3-package "joblib=1.0.1" --python2-package "joblib=1.0.1" --cran-package "opitools=1.0.3" --apt-package "parallel"
Several packages of the same kind can be added at once:
exaslct add-packages --flavor-path flavors/standard-EXASOL-7.0.0 --python3-package "joblib=1.0.1" --python3-package "joblib=1.0.1" --python3-package "many_requests"
Or whole package lists can be added:
exaslct add-packages --flavor-path flavors/standard-EXASOL-7.0.0 --python3-package-list "path/to/package/list"
This would work equivalently for the other package types.
For Python, we could also consider allowing requirements files. A main problem with them is their lax specification (i.e. we would probably only support a defined subset):
exaslct add-packages --flavor-path flavors/standard-EXASOL-7.0.0 --python3-requirements-file "path/to/requirements.txt"
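
The conflict check behind such a command could be sketched as follows. Everything here is an assumption for illustration: the function name is hypothetical, and the package list format (one "name|version" entry per line, or just "name" without a version) is assumed rather than taken from the issue.

```python
def add_package(lines: list[str], spec: str) -> list[str]:
    """Return the package list with `spec` ("name=version" or "name") added."""
    name, _, version = spec.partition("=")
    name, version = name.strip(), version.strip()
    if not name:
        raise ValueError(f"invalid package specification: {spec!r}")
    for line in lines:
        existing_name, _, existing_version = line.partition("|")
        if existing_name.strip() != name:
            continue
        if existing_version.strip() == version:
            return list(lines)  # already present, nothing to do
        raise ValueError(
            f"{name} is already listed with version {existing_version.strip() or 'unspecified'}"
        )
    return list(lines) + [f"{name}|{version}" if version else name]
```

Adding the same specification twice leaves the list unchanged, while adding a different version of a listed package is reported as a conflict.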

Prepare release 0.2.0

Background

We want to create a new release with integration-test-docker-environment 0.4.0 integrated.

Acceptance Criteria

  • Changelog updated
  • Ready to create release with release-droid

Re-activate mounting of test folders

Background

The test environment currently has some assumptions about the folder structure of the respective python test files.
At some places the folders are manipulated manually (for example here and here).
This does not work with the mount solution of #10, and hence the mounting for test folders was removed in #26.

Acceptance Criteria

Re-activate the mounting here and adjust the test environment to work correctly.

Symbolic links inside in/out directory not working

Background

In #10 we added volume mount points for each directory given by the arguments.
However, if any of the directories contains symbolic links to paths which are not mounted, the container won't have access to those paths, which might cause the respective docker run to fail.

For example, assume the following scenario:
$HOME
├── flavors
│   └── my_flavor
│       └── flavor_base
│           └── build_deps -> ../../../my_deps/
└── my_deps

running

./exaslct export --flavor-path=flavors/my_flavor --export-path ./out

will not work, as $HOME/my_deps will not be visible from within the container.

Acceptance Criteria

In starter_scripts/exaslct_within_docker_container.sh, find all symbolic links in all mount point directories (given by command line arguments) and add their targets to the list of mount points. As this might be very expensive, declare a new command line parameter --mount_symbolic_link_directories which enables/disables this functionality.

Alternative:

Add a command line parameter --mount-extra-directory which allows the user to specify additional directories which get mounted to the container. This would also solve problems with the get_additional_build_directories_mapping configuration in the flavors, which in theory could point anywhere in the file system.

Full version for MacOsX

Background

In #69 we created a slim version of exaslct with some limitations. These limitations were caused by the old bash version which is pre-installed on MacOsX (license problem).
It turned out that a new version of bash can easily be installed via homebrew with brew install bash.

Acceptance Criteria

  1. Adapt the exaslct scripts to find and use the homebrew version of bash (and exit with an error if it is not found). The best approach is to define an environment variable in ./exaslct which is then used in all subsequent scripts.
  2. Adjust documentation

Setup shellcheck for the starter scripts

Background:

  • Our starter scripts are becoming increasingly complex bash scripts
  • Bash scripts are highly error-prone
  • shellcheck statically checks the scripts for common pitfalls

Acceptance Criteria

  • Add a script to run it locally
  • Set up shellcheck (https://www.shellcheck.net/) for the starter scripts in CI
  • Fix all issues shellcheck recognizes

Inject import for DockerFlavorAnalyzeImageTask into build_steps.py

Background:

  • Currently, each build_steps.py has its own import statement for DockerFlavorAnalyzeImageTask
  • These import statements are fixed to one package path and one name
  • This complicates changes to the structure of script-languages-container-tool, because the flavors break if we change the module path or the name of the class
  • If we inject the import statement during the dynamic load of the build_steps.py file, we can implement compatibility mechanisms. For example, we could rename the class on import, as follows:
from path.to.package import NewName as OldName
  • Changes to the package path would then also only require a change in the injected import statement
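
One possible way to inject a name during the dynamic load is sketched below with the standard importlib machinery. This is an illustration under assumptions, not the current loading code of exaslct: the function name is hypothetical, and it assumes that the file ends in .py.

```python
import importlib.util


def load_build_steps(path: str, injected: dict):
    """Load a build_steps.py file with extra names available at its top level."""
    spec = importlib.util.spec_from_file_location("build_steps", path)
    module = importlib.util.module_from_spec(spec)
    # Names set on the module before execution are visible to the module's
    # top-level code, so the file itself no longer needs the import statement.
    for name, value in injected.items():
        setattr(module, name, value)
    spec.loader.exec_module(module)
    return module
```

The loader would pass {"DockerFlavorAnalyzeImageTask": NewName}, which has the same effect inside build_steps.py as the import with the alias shown in the list above.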

Refactor TestContainer

Background

Today we have one class TestContainer which takes all arguments from the client and then spawns the test environment.
Independent of the kind of test database, it passes all parameters to SpawnTestEnvironment.
SpawnTestEnvironment then distinguishes the kind of database and uses either the internal docker db or an externally provided db, see here.

This causes some problems, as the interface is not well defined, see here.

Acceptance Criteria

The interface published by SpawnTestEnvironment should be clear. One idea is to attach the actual database via dependency injection.

Fix docker repository name in construct_docker_runner_image_name.sh

Background:

  • construct_docker_runner_image_name.sh generates the tag for the docker image
  • currently it generates the following: exasol/script-languages:container-tool-runner-$VERSION
  • however, with this tag the script push_docker_runner_image.sh fails, because this docker repository doesn't exist
  • the correct repository is exasol/script-language-container
Run exaslct without root in the container

Avoid running the build of the script language container as root.
The container needs to run temporarily as root in order to create users and groups.
After that, root access is not needed and may cause problems.
A potential way could be to first create a group with the same gid as is set for the docker socket, then create a user with the same uid as the caller and add it to the created group. After that, we drop root privileges by switching with su to the created user and call exaslct. This way, the user can access the docker socket, but isn't root and writes files with the same uid as the caller. We might need to add the user to additional groups, basically all active groups of the caller, so that it can access files or directories of the respective groups. Note: We can't change the owner or group of the docker socket, because that would change it on the host as well.

Don't create symlinks to absolute path in the exaslct_installer.sh

Background:

  • The exaslct installer creates a directory exaslct_scripts and a symlink to the entry script.
  • At the moment, this symlink seems to use an absolute path, which works if you install exaslct on your machine, but not if you want to commit it to a repository and run it in a CI environment
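
A symlink with a relative target keeps working when the installation directory is moved or checked out elsewhere. The installer is a shell script, so the following Python sketch only illustrates the idea; the function name is hypothetical.

```python
import os


def create_relative_symlink(target: str, link_path: str) -> None:
    """Create `link_path` as a symlink to `target`, stored as a relative path."""
    link_directory = os.path.dirname(os.path.abspath(link_path))
    # A relative link target is resolved relative to the directory that
    # contains the link, so it survives moving both together.
    relative_target = os.path.relpath(os.path.abspath(target), start=link_directory)
    os.symlink(relative_target, link_path)
```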

Add CI checks for other operating systems

Background

We discovered that different versions of bash can cause different behavior of the exaslct command line tools, see #48.
Because of that we need at least a CI check for CentOS 7, which comes with bash 4.2 (the minimum compatible version).

Acceptance Criteria

A new CI check which validates the export functionality on push in a CentOS 7 environment.

Gather and print System Information before running to give more information while debugging

Background

  • we often need to debug setups on user systems
  • we typically need information about the system, e.g.
    • OS
    • OS Version
    • Docker Daemon Version
    • Docker Client Version
    • Host can access Docker Hub
    • Docker container can access internet (Ubuntu)
    • bash version
    • gnu utils Version
    • installed exaslct Version
    • Project name
    • Repository Status + Sub module status
      • modified?
      • commit id history until first commit in main branch history
      • paths to exaslct.log files for the last X builds
      • Docker images with exasol script languages prefix
    • available disk space on /
    • are checksums ok

Acceptance Criteria

  • would be a new command of exaslct (something like "diagnosis")
  • add message like "Please send to Exasol"
  • develop a bash script which collects the listed information and prints it to the terminal and to a file
  • the script needs to work with minimal dependencies
  • the script needs to be as error-resilient as possible and collect as much information as possible
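
The error-resilient collection can be illustrated as follows. The issue asks for a bash script, so this Python sketch only shows the pattern of never letting one failing probe abort the whole report; the function names and the selection of entries are hypothetical, and only a few of the listed items are covered.

```python
import platform
import shutil
import subprocess


def run_quietly(command: list[str]) -> str:
    """Return the first output line of a command, or a note if it fails."""
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError) as error:
        return f"unavailable ({error.__class__.__name__})"
    output = (completed.stdout or completed.stderr).strip()
    if not output:
        return f"no output (exit code {completed.returncode})"
    return output.splitlines()[0]


def collect_system_information() -> dict[str, str]:
    """Collect a subset of the information listed above."""
    return {
        "os": platform.system(),
        "os_version": platform.release(),
        "bash_version": run_quietly(["bash", "--version"]),
        "docker_client_version": run_quietly(
            ["docker", "version", "--format", "{{.Client.Version}}"]
        ),
        "docker_daemon_version": run_quietly(
            ["docker", "version", "--format", "{{.Server.Version}}"]
        ),
        "free_disk_space_on_root_gb": f"{shutil.disk_usage('/').free / 10**9:.1f}",
    }
```

A missing docker or bash binary shows up as "unavailable" in the report instead of raising an exception.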

Add new integration tests for the installer scripts to test the version number.
For that, extract most of https://github.com/exasol/script-languages-container-tool/blob/main/installer/tests/test_exaslct_install_template_with_current_ref.sh into a separate script and include it via source (to keep trap, environment variables, etc.)

Error 'chown: missing operand after' when no additional paths are mounted

Background:

  • When executing 'exaslct spawn-test-environment ...', I don't mount any additional paths
  • In that case, the variable $chown_directories is empty and the first chown in the following line fails
    RUN_COMMAND="/script-languages-container-tool/starter_scripts/exaslct_without_poetry.sh $quoted_arguments; RETURN_CODE=\$?; chown -R $(id -u):$(id -g) $chown_directories; chown -R $(id -u):$(id -g) .build_output &> /dev/null; exit \$RETURN_CODE"
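
One way to avoid the error is to emit the first chown only if there is something to change. The starter script is a bash script, so the following Python sketch only illustrates this guard; the function name and parameters are hypothetical, and the command structure follows the line above.

```python
import shlex


def build_run_command(
    exaslct_command: str, uid: int, gid: int, chown_directories: list[str]
) -> str:
    """Assemble the command line that runs exaslct and fixes the ownership."""
    steps = [exaslct_command, "RETURN_CODE=$?"]
    if chown_directories:
        # Without this guard, an empty list leads to "chown: missing operand".
        quoted = " ".join(shlex.quote(directory) for directory in chown_directories)
        steps.append(f"chown -R {uid}:{gid} {quoted}")
    steps.append(f"chown -R {uid}:{gid} .build_output &> /dev/null")
    steps.append("exit $RETURN_CODE")
    return "; ".join(steps)
```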

EXASLCT: Test output should be streamed to log file

Background

  • We write the standard output and error of the language container tests only after they have finished
  • However, for long-running tests or if the system fails, it would be good to write them as soon as the output is available
  • Basically, return a stream from exec_run at the following position and write it directly to a file

exit_code, output = test_container.exec_run(cmd=bash_cmd,
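
In the Docker SDK for Python, exec_run accepts stream=True. In that case the returned output is a generator of byte chunks, and the returned exit code is None, so the exit code has to be obtained separately. Writing such a stream to the log as it arrives could be sketched as follows; the function name is hypothetical, and the sketch works with any iterable of byte chunks.

```python
from typing import Iterable


def stream_to_log(chunks: Iterable[bytes], log_path: str) -> int:
    """Append each chunk to the log file as soon as it arrives."""
    written = 0
    with open(log_path, "ab") as log_file:
        for chunk in chunks:
            log_file.write(chunk)
            # Flush after every chunk, so that the log is up to date even if
            # the test run hangs or the system fails later.
            log_file.flush()
            written += len(chunk)
    return written
```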

EXASLCT: Improve short help messages

Situation:

  • Currently, the short help messages for the commands of exaslct are not helpful

Usage: exaslct.py [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  build                         This command builds all stages of the
                                script...
  clean-all-images              This command uploads the whole script...
  clean-flavor-images           This command uploads the whole script...
  export                        This command exports the whole script...
  generate-language-activation  This command generates a alter session...
  push                          This command pushes all stages of the
                                script...
  run-db-test                   This command runs the integration tests in...
  save                          This command pushes all stages of the
                                script...
  spawn-test-environment        This command spawn a test environment with
                                a...
  upload                        This command uploads the whole script...

Task:

Add installer scripts which download and install the scripts to run exaslct from a prebuild docker container

Background

  • exaslct will be used in several repositories, such as the script-languages, the legacy-udfclient-* and later also in the repositories for each flavor
  • at the moment exaslct only works on Linux and needs at least a Linux docker container to run on MacOS and likely also on Windows
  • as such we are going to distribute exaslct as a docker container which gets built for each commit to the main branch and published to Docker Hub
  • we need a minimum set of scripts to run the docker container and mount all necessary directories into it
  • these scripts need to be installed and updated in each repository that uses exaslct

Acceptance Criteria

  • we need a single script which bootstraps the whole installation and which can install past, current and future versions of exaslct
    • the best way to do this is probably to have a super simple bootstrap script which loads the actual install script for the requested version

Write stdout/err of exaslct to log file

Background

  • before we moved exaslct to this repo it wrote the stdout/err to exaslct.log
  • we need at least this functionality, better would be to write it to the job directory, because exaslct.log gets overwritten

Run tests with Python3

Background

We are migrating all script language container tests to Python3.

Acceptance Criteria

exaslct run_db_test must be executed in Python3.

Create a Mac OSX version of exaslct

Background

exaslct is not compatible with MacOsX for two reasons:

  1. The readlink command provides only a subset of the features of the Linux version. We need to use greadlink (which comes with GNU coreutils and can be installed with "brew install coreutils")
  2. The pre-installed Bash version is very old (3.2), we need at least 4.2 (due to the usage of associative arrays)

Solution

We have two options here:

  1. Write a slim version of exaslct specifically for MacOsX, which does not support the mount point handling and which uses greadlink instead of readlink
  2. Write exaslct in Go
