airflow-role's Introduction

Apache Airflow Ansible role


This Ansible role installs an Apache Airflow server in a Debian/Ubuntu environment.

Getting Started

These instructions will get you a copy of the role for your Ansible playbook. Once launched, it will install Apache Airflow on a Debian or Ubuntu system.

Prerequisites ☑️

Ansible 2.9.9 installed. The inventory destination should be a Debian (preferably Debian 10 Buster) or Ubuntu environment.

ℹ️ This role should work with older versions of Debian, but due to Airflow's minimum requirements you should check that 🐍 Python 3.6 (or higher) is installed first (👉 See: Airflow prerequisites).

ℹ️ By default this role uses the Python installation that comes with the distro.

For testing purposes, Molecule with Docker as the driver.

Installing 📥

Create or add to your roles dependency file (e.g. requirements.yml), installing from GitHub:

- src: http://github.com/idealista/airflow-role.git
  scm: git
  version: 2.0.0
  name: airflow

or using Ansible Galaxy as origin if you prefer:

- src: idealista.airflow-role
  version: 2.0.0
  name: airflow

Install the role with ansible-galaxy command:

ansible-galaxy install -p roles -r requirements.yml -f

Use in a playbook:

---
- hosts: someserver
  roles:
    - { role: airflow }

Usage 🏃

Look at the defaults properties files to see the possible configuration properties:

👉 Don't forget (see the sketch after this list for an illustrative vars override):

  • 🦸 To set your Admin user.
  • 🔑 To set Fernet key.
  • 🔑 To set webserver secret key.
  • 📝 To set your AIRFLOW_HOME and AIRFLOW_CONFIG at your own discretion.
  • 📝 To set your installation and config skeleton paths at your own discretion.
    • 👉 See airflow_skeleton_paths in main.yml
  • 🐍 Python and pip version.
  • 📦 Extra packages if you need additional operators, hooks, sensors...
  • 📦 Required Python packages, pinned to a specific version where needed (for example SQLAlchemy, to avoid known Airflow bugs❗️) or because they are otherwise necessary; see below
  • ⚠️ With Airflow v1.10.0, the PyPI package pyasn1 v0.4.4 is needed. See the examples below
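
As a rough illustration, overriding those defaults from a playbook vars file might look like the sketch below. The variable names used here (airflow_admin_users, airflow_fernet_key, airflow_webserver_secret_key, airflow_home, airflow_config) are assumptions for the example; check the role's defaults/main.yml for the names your version actually uses.

# Hypothetical group_vars/airflow.yml — variable names are illustrative only,
# verify them against defaults/main.yml before use.
airflow_fernet_key: "{{ vault_airflow_fernet_key }}"               # generate once, keep secret
airflow_webserver_secret_key: "{{ vault_airflow_webserver_key }}"
airflow_admin_users:
  - name: Admin
    username: admin
    password: "{{ vault_airflow_admin_password }}"
    role: Admin
    email: admin@example.com
airflow_home: /etc/airflow
airflow_config: /etc/airflow/airflow.cfg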

📦 Required Python packages

airflow_required_python_packages should be a list following this format:

airflow_required_python_packages:
  - { name: SQLAlchemy, version: 1.3.23 }
  - { name: psycopg2 }
  - { name: pyasn1, version: 0.4.4 }

📦 Extra packages

airflow_extra_packages should be a list following this format:

airflow_extra_packages:
  - apache.atlas
  - celery
  - ssh

👉 For more info about these extra packages see: Airflow extra packages

Testing 🧪

pipenv install -r test-requirements.txt --python 3.7

# Optional
pipenv shell  # if in shell just use `molecule COMMAND`

pipenv run molecule test  # To run role test
# or
pipenv run molecule converge  # To run play with the role

Built With 🏗️

Ansible

Versioning 🗃️

For the versions available, see the tags on this repository.

Additionally, you can see what changed in each version in the CHANGELOG.md file.

Authors 🦸

See also the list of contributors who participated in this project.

License 🗒️

Apache 2.0 License

This project is licensed under the Apache 2.0 license - see the LICENSE file for details.

Contributing 👷

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

airflow-role's People

Contributors

adrimarteau, angelddaz, davestern, dortegau, fhalim, jmonterrubio, lorientedev, plozano94, ultraheroe


airflow-role's Issues

Include TravisCI

Due to complications in making Vagrant and Docker tests with Molecule compatible, the first .travis.yml version will use ansible-playbook commands to test the playbook's syntax, the first run, idempotence, and whether the webserver is properly deployed.
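
A minimal sketch of such a .travis.yml, assuming a tests/test.yml playbook run over a local connection (the paths, playbook name and port are assumptions, not the repository's actual files):

# Hypothetical .travis.yml — illustrative only
language: python
python: "2.7"
install:
  - pip install ansible
script:
  # Syntax check
  - ansible-playbook -i tests/inventory tests/test.yml --syntax-check
  # First run
  - ansible-playbook -i tests/inventory tests/test.yml --connection=local --become
  # Idempotence: a second run must report no changes
  - >
    ansible-playbook -i tests/inventory tests/test.yml --connection=local --become
    | grep -q 'changed=0.*failed=0'
    && (echo 'Idempotence test: pass' && exit 0)
    || (echo 'Idempotence test: fail' && exit 1)
  # Webserver properly deployed?
  - curl -sSf http://localhost:8080/ > /dev/null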

Add python 3.8 and create virtual env to install airflow

Prerequisites

Description

To guarantee support and improve performance for current and future versions of Airflow, I propose adding Python 3.8 and using a virtualenv to avoid modifying the system Python.

Expected behavior: Airflow runs with Python 3.8 inside a virtualenv.

Actual behavior: Airflow runs with the system Python version.

Versions

1.8.0

Additional Information

[BUG] Service PATH environment variable not working as expected

Description

The PATH variable set in the environment file for the service isn't expanding the expression $PATH.

Steps to Reproduce

  1. Run a DAG that uses a BashOperator
  2. Look at the log for an error like 'No such file or directory: 'bash': 'bash''

Expected behavior:
The command of the task should be executed without errors

Actual behavior:
The BashOperator and maybe other operators are broken

Reproduces how often:
Always

Environment

  • The release version/s you are using: 2.0.0
  • OS: Debian 10

[BUG] Default scenario and build broken

Description

The build and the default Molecule scenario are broken due to the psycopg2 package.

Steps to Reproduce

  1. pipenv install test-requirements.txt
  2. molecule converge

Expected behavior:

psycopg2 installs in the scenario without problems

Actual behavior:

The installation of psycopg2 is broken

Reproduces how often:

Always

Environment

  • The release version/s you are using: 2.0.4
  • OS: Debian

Additional Information

Using the psycopg2-binary package fixes the problem.

Molecule using Docker

In order to use Travis and Molecule, it is mandatory to use Docker: Travis doesn't support Vagrant. The problem was that the tests use the Ansible module, which needs an Ansible backend connection, and when the Docker driver is specified, Molecule runs Testinfra with the argument --connection=docker, making the tests fail because they don't find any host.

To make it work, we just have to force Testinfra to use the same arguments as the Vagrant driver. These arguments are: --connection=ansible --ansible-inventory=.molecule/ansible_inventory, so in molecule.yml we specify these arguments in the verifier section:

verifier:
  name: testinfra
  options:
    connection: ansible
    ansible-inventory: .molecule/ansible_inventory

The Docker driver configuration is a bit tricky itself: as our role uses systemd, the default image configuration won't work. To make the Docker container use systemd, we have to add this to the container configuration in molecule.yml:

privileged: True
cap_add:
  - SYS_ADMIN
volume_mounts:
  - '/sys/fs/cgroup:/sys/fs/cgroup:ro'
command: '/lib/systemd/systemd'

Finally, it looks like the default Debian images come with Python 2.7.9, which makes pip break somehow after the Airflow installation, so we have to use Python images. Knowing all of this, the Docker section in molecule.yml looks like this:

docker:
  containers:
    - name: airflow.vm
      ansible_groups:
        - airflow

      image: python
      image_version: 2.7.13-jessie

      port_bindings:
        80: 80
        8080: 8080
        5555: 5555

      privileged: True
      cap_add:
        - SYS_ADMIN
      volume_mounts:
        - '/sys/fs/cgroup:/sys/fs/cgroup:ro'
      command: '/lib/systemd/systemd'

Managing DAGs and plugins updates automatically

A colleague asked for an automatic way to import DAGs and plugins from a Git repo every 5 minutes via cron job. Several changes were made to do so:

  • defaults/main.yml:
    • Added git in airflow_required_libs
    • Added the dags_dependencies dictionary. It is meant to hold the Python dependencies required by the DAGs.
      If empty, it should be left as {}. If set, it should follow the example:
      scrapinghub:
        version: 2.0.1
    • Added the dags_repository dictionary. It is meant to hold the Git repositories containing DAGs or plugins that we want to check.
      If empty, it should be left as {}. If set, it should follow the example:
      dags:
        src: https://github.com/apache/incubator-airflow/
        repo_subfolder: airflow/example_dags
        host_subfolder: "{{ airflow_dags_folder }}"
  • tasks/install.yml:
    • Added a task to install the DAGs' dependencies and restart the Airflow stack if any new packages are installed.
  • tasks/config.yml:
    • Added a task to create the cron jobs that check the repositories every 5 minutes (a simplified sketch follows this list).
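
A simplified sketch of that cron task, using the dags_repository structure shown above (the real tasks/config.yml may handle repo_subfolder and the initial clone differently):

# tasks/config.yml (sketch) — one cron entry per configured repository
- name: Airflow | Cron job to refresh DAGs and plugins repositories
  cron:
    name: "airflow refresh {{ item.key }}"
    minute: "*/5"
    user: "{{ airflow_user }}"
    job: "git -C {{ item.value.host_subfolder }} pull --quiet"
  with_dict: "{{ dags_repository }}"
  when: dags_repository | length > 0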

The "-n" option might not be needed.

Hi, idealista team.
Thanks for this repository; it has helped my work.
However I have one point of concern about the systemd service file.

There is "-n" option in airflow-scheduler.service which set the number of scheduler loops.

ExecStart={{ airflow_executable }} scheduler -n ${SCHEDULER_RUNS} --pid {{ airflow_pidfile_folder }}-scheduler/scheduler.pid

This setting has remained from the days when the Airflow scheduler was unstable. (cf. apache/airflow#698)

Now it no longer seems to be needed. (cf. apache/airflow#19219)

So I suggest that this setting be removed, or that "-1" be set as the default value.

Add Flower service configuration

Need to add it in:

  • defaults/main.yml: Inside the airflow_services dictionary. Default values: enabled: yes and state: started (see the sketch after this list).
  • templates/airflow-flower.service.j2: Service template added.
  • handlers: restart airflow-flower added.
  • tasks/config.yml: Added a notification to the restart airflow-flower handler in every task where other handlers were already notified.
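
A sketch of what that defaults/main.yml entry could look like (the dictionary shape is inferred from the description above, not copied from the role):

airflow_services:
  airflow-webserver:
    enabled: yes
    state: started
  airflow-scheduler:
    enabled: yes
    state: started
  airflow-flower:
    enabled: yes
    state: started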

Add support to 2.0 airflow version

Description

Add support for installing Airflow 2.0

Why is this needed?

It's a major release with new features

Additional Information

A proper airflow.cfg was added for this version

Small bugs in install.yml

There are a few small bugs in the installation playbook:

  • The Celery block condition is over-indented and is only checked in the second task. Also, using block breaks the playbook's style a bit, so from now on the condition will be checked individually in both tasks.
  • The task Airflow | Installing dependencies is buggy on the Debian Docker image because the image ships without an apt update having been run, so the package list is empty. update_cache: yes is added to prevent this and to avoid older package versions (see the sketch after this list).
  • The task Airflow | Installing Airflow Extra Packages will fail if the airflow_extra_packages variable is empty, so this task should only run when that variable is not empty.
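
An illustrative sketch of the two fixes, assuming apt-based dependencies and the task names quoted above (module arguments are assumptions, not the role's exact tasks):

- name: Airflow | Installing dependencies
  apt:
    name: "{{ airflow_required_libs }}"
    state: present
    update_cache: yes     # Debian Docker images ship with an empty apt package list

- name: Airflow | Installing Airflow Extra Packages
  pip:
    name: "apache-airflow[{{ item }}]"
  with_items: "{{ airflow_extra_packages }}"
  when: airflow_extra_packages is defined and airflow_extra_packages | length > 0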

errors downloading airflow at TASK [airflow-role : Airflow | Installing Airflow]

I'm getting this error on Ubuntu 18.04, at TASK [airflow-role : Airflow | Installing Airflow]:

TASK [airflow-role : Airflow | Installing Airflow] ****************************************************************************************************************************************************************
fatal: [192.168.33.13]: FAILED! => {"changed": false, "cmd": ["/usr/bin/pip", "install", "--no-cache-dir", "apache-airflow==1.10.2"], "msg": "stdout: Collecting apache-airflow==1.10.2\n\n:stderr: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f5beffb2f10>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',)': /simple/apache-airflow/\n Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f5beffb2b10>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',)': /simple/apache-airflow/\n Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f5beffb2d50>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',)': /simple/apache-airflow/\n Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f5bf0a655d0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',)': /simple/apache-airflow/\n Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f5bf0a65510>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',)': /simple/apache-airflow/\nException:\nTraceback (most recent call last):\n File "/usr/lib/python2.7/dist-packages/pip/basecommand.py", line 215, in main\n status = self.run(options, args)\n File "/usr/lib/python2.7/dist-packages/pip/commands/install.py", line 342, in run\n requirement_set.prepare_files(finder)\n File "/usr/lib/python2.7/dist-packages/pip/req/req_set.py", line 380, in prepare_files\n ignore_dependencies=self.ignore_dependencies))\n File "/usr/lib/python2.7/dist-packages/pip/req/req_set.py", line 554, in _prepare_file\n require_hashes\n File "/usr/lib/python2.7/dist-packages/pip/req/req_install.py", line 278, in populate_link\n self.link = finder.find_requirement(self, upgrade)\n File "/usr/lib/python2.7/dist-packages/pip/index.py", line 465, in find_requirement\n all_candidates = self.find_all_candidates(req.name)\n File "/usr/lib/python2.7/dist-packages/pip/index.py", line 423, in find_all_candidates\n for page in self._get_pages(url_locations, project_name):\n File "/usr/lib/python2.7/dist-packages/pip/index.py", line 568, in _get_pages\n page = self._get_page(location)\n File "/usr/lib/python2.7/dist-packages/pip/index.py", line 683, in _get_page\n return HTMLPage.get_page(link, session=self.session)\n File "/usr/lib/python2.7/dist-packages/pip/index.py", line 792, in get_page\n "Cache-Control": "max-age=600",\n File "/usr/share/python-wheels/requests-2.18.4-py2.py3-none-any.whl/requests/sessions.py", line 533, in get\n return self.request('GET', url, **kwargs)\n File "/usr/lib/python2.7/dist-packages/pip/download.py", line 386, in request\n return super(PipSession, self).request(method, url, *args, **kwargs)\n File "/usr/share/python-wheels/requests-2.18.4-py2.py3-none-any.whl/requests/sessions.py", line 520, in request\n resp = self.send(prep, 
**send_kwargs)\n File "/usr/share/python-wheels/requests-2.18.4-py2.py3-none-any.whl/requests/sessions.py", line 630, in send\n r = adapter.send(request, **kwargs)\n File "/usr/share/python-wheels/requests-2.18.4-py2.py3-none-any.whl/requests/adapters.py", line 508, in send\n raise ConnectionError(e, request=request)\nConnectionError: HTTPSConnectionPool(host='pypi.python.org', port=443): Max retries exceeded with url: /simple/apache-airflow/ (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f5bf0a65710>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))\n"}

LDAP not working if superuser_filter and/or data_profiler_filter are left blank

In airflow.cfg, data_profiler_filter and superuser_filter cannot be left blank. We can't give these values defaults, so the lines should only be added when they exist. We therefore have to add two if conditionals in airflow.cfg.j2 and add these lines (separately) only when airflow_ldap_superuser_filter or airflow_ldap_data_profiler_filter, respectively, exist and are not empty.

idealista.airflow-role was NOT installed successfully

downloading role from https://github.com/idealista/airflow-role/archive/1.3.1.tar.gz
[ERROR]: failed to download the file: HTTP Error 404: Not Found

Unable to download any version of airflow-role.

$ ansible --version
ansible 2.4.0.0
  config file = /media/asg-airflow-terraform/ansible/ansible.cfg
  configured module search path = [u'/home/osboxes/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python2.7/dist-packages/ansible
  executable location = /usr/local/bin/ansible
  python version = 2.7.12 (default, Nov 19 2016, 06:48:10) [GCC 5.4.0 20160609]

Add path for templates in services

Prerequisites

Description

Can't replace the template service files from the playbook; they are always taken from the role.

Steps to Reproduce

Expected behavior: When templates exist in the playbook and you specify them in a variable, the role should take the template from the path you give.

Actual behavior: The templates are always taken from the role.

Reproduces how often: 100%

Versions

1.8.1

Additional Information

Any additional information, configuration or data that might be necessary to reproduce the issue.

Make role compatible with Airflow 1.10.0

Prerequisites

Description

Setting airflow_version to 1.10.0 and launching the playbook fails in the "Installing Airflow" task with the following output:

fatal: [airflow]: FAILED! => {"changed": false, "cmd": "/usr/bin/pip install --no-cache-dir apache-airflow==1.10.0", "msg": "stdout: Collecting apache-airflow==1.10.0\n  Downloading https://files.pythonhosted.org/packages/da/2a/6e9efcd40193850e2f636c7306eede2ff5607aa9f81ff9f7a151d9b13ff8/apache-airflow-1.10.0.tar.gz (4.3MB)\n    Complete output from command python setup.py egg_info:\n    Traceback (most recent call last):\n      File \"<string>\", line 1, in <module>\n      File \"/root/pip-build-u203T5/apache-airflow/setup.py\", line 393, in <module>\n        do_setup()\n      File \"/root/pip-build-u203T5/apache-airflow/setup.py\", line 258, in do_setup\n        verify_gpl_dependency()\n      File \"/root/pip-build-u203T5/apache-airflow/setup.py\", line 49, in verify_gpl_dependency\n        raise RuntimeError(\"By default one of Airflow's dependencies installs a GPL \"\n    RuntimeError: By default one of Airflow's dependencies installs a GPL dependency (unidecode). To avoid this dependency set SLUGIFY_USES_TEXT_UNIDECODE=yes in your environment when you install or upgrade Airflow. To force installing the GPL version set AIRFLOW_GPL_UNIDECODE\n    \n    ----------------------------------------\n\n:stderr: Command \"python setup.py egg_info\" failed with error code 1 in /root/pip-build-u203T5/apache-airflow/\n"}

[BUG] template gunicorn-logrotate.j2

Hello!
I'm using a part of your great project!
I have noticed a potential issue inside the template located in templates/gunicorn-logrotate.j2.
On the master branch, it's currently written in this way:

{{ airflow_logs_folder }}/gunicorn-*.log {
    daily
    missingok
    rotate 7
    size 500M
    compress
    notifempty
    create 644 {{ airflow_user }} {{ airflow_group }}
    sharedscripts
    postrotate
        [ -f {{ airflow_pidfile_folder }}-webserver/webserver.pid ] && kill -USR1 `cat {{ airflow_pidfile_folder }}-webserver/webserver.pid`
    endscript
}

My issue concerns this piece of code on the line before the last:

 kill -USR1 `cat {{ airflow_pidfile_folder }}/webserver.pid`

I think it needs an "-webserver" right after "{{ airflow_pidfile_folder }}" as it is written at the beginning of the line. Otherwise the logrotate script will fail.

Thanks for your time!

PS: I tried to push a PR but PR access seems to be forbidden for non-contributors

[BUG] Services configuration

Description

Found some problems when you want to install only some of the services rather than all of them.

Steps to Reproduce

Try to configure a node without one of the services, such as airflow-worker

Expected behavior:
Runs smoothly

Actual behavior:
The undesired service is up and running

Reproduces how often:
Always

Environment

  • The release version/s you are using:
  • OS: Debian Bullseye
  • Others:

Additional Information

Install tasks fail when run without escalated privileges

Prerequisites

Description

Some tasks like Airflow | Ensure Airflow group fail when run without escalated privileges.

Steps to Reproduce

  1. Install the module
  2. Run a playbook with the module included

Expected behavior:

TASK [airflow : Airflow | Ensure Airflow group] ********************************************************************************************************************************************************************************
ok: [server-name]

Actual behavior:

TASK [airflow : Airflow | Ensure Airflow group] ********************************************************************************************************************************************************************************
fatal: [1riv-dev-air]: FAILED! => {"changed": false, "msg": "groupadd: Permission denied.\ngroupadd: cannot lock /etc/group; try again later.\n", "name": "airflow"}

Reproduces how often:

Every time

Versions

The version/s you notice the behavior.

idealista.airflow-role 1.7.2
Ubuntu Bionic 18.04

Additional Information

I resolved this issue by adding become: true to the imported task, but this should be documented or added to the tasks.

- hosts: airflow
  tasks:
    - import_role:
        name: airflow
      become: true

[FEATURE] add async / poll params on Airflow | Initializing DB to fix hanging task

First, thanks for making this project open source, it's saved me a bunch of time in getting airflow successfully deployed.

I thought I'd note this down since others may run into this issue.

My environment:

  • ansible version: 2.4.1
  • vagrant box: bento/ubuntu-16.04
  • I installed airflow-role using galaxy, with the default configurations
  • playbook:
---
- hosts: someserver
  roles:
    - idealista/airflow-role

I had an uninstalled import in one of my DAGs, which I expected to cause Ansible to fail on the task "Airflow | Initializing DB", since it has to import the DAGs as part of this process. Instead, the exit signal is not received by Ansible and the task hangs.

I discovered this by adding this to my task definition:

async: 60
poll: 60

I'm fairly new to Ansible, and not suggesting that async is the best way to handle this, but the expected behavior on a bad import is that the Ansible task should fail instead of hanging.
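
For reference, applying that suggestion to the task could look like the sketch below (the task name comes from this issue; the module, command and timeout values are assumptions, not the role's actual task):

- name: Airflow | Initializing DB
  command: "{{ airflow_executable }} initdb"
  become: true
  become_user: "{{ airflow_user }}"
  async: 60    # run asynchronously, allow at most 60 seconds
  poll: 60     # wait and check the result, so a bad DAG import fails the task instead of hanging forever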

[BUG] notify restart airflow services not found when installing DAG dependencies

Description

When a DAG dependency is set in dags_dependencies, the task that installs it notifies an unknown (misspelled) restart handler.
The handler is, e.g., "restart airflow-webserver", while the task notifies "restart airflow_webserver".

Steps to Reproduce

  1. Declare a DAG package dependency
  2. Go for the run

Expected behavior:
The handler restarts the services normally

Actual behavior:
The handler never fires, so the services are not restarted, because of the typo

Reproduces how often:
Whenever a DAG dependency is set

Environment

  • The release version/s you are using:
  • OS: Debian 10
  • airflow-role: 2.0.0 and 2.0.1
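
The fix is to make the notification match the handler's exact name. A sketch of the corrected task, with module arguments inferred from the dags_dependencies example earlier on this page (not the role's exact task):

- name: Airflow | Installing DAGs dependencies
  pip:
    name: "{{ item.key }}"
    version: "{{ item.value.version | default(omit) }}"
  with_dict: "{{ dags_dependencies }}"
  notify:
    - restart airflow-webserver    # was "restart airflow_webserver" (typo)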

Update airflow.cfg template

In the newer versions of Airflow, airflow.cfg has more fields (supporting Kubernetes, for instance). We should add support for these new features.

Scheduler doesn't work

It seems the scheduler process is started after deployment.
However, the jobs can't be scheduled to execute.

Unable to install more than one Airflow Extra Package

Prerequisites

Description

When specifying more than one Airflow extra package, the task Airflow | Installing Airflow Extra Packages fails with the error:

(item=['apache-airflow[celery]', 'apache-airflow[postgres]']) => {"changed": false, "item": ["apache-airflow[celery]", "apache-airflow[postgres]"], "msg": "'version' argument is ambiguous when installing multiple package distributions. Please specify version restrictions next to each package in 'name' argument."}

Steps to Reproduce

  1. Set in a vars file
airflow_version: 1.10.0
airflow_extra_packages: [celery,postgres]
  2. Deploy Airflow using the role

Expected behavior: deployment to work

Actual behavior: deployment fails

Reproduces how often: Every time

Versions

1.7.2

Additional Information

This also generates a warning:

[DEPRECATION WARNING]: Invoking "pip" only once while using a loop via squash_actions is deprecated. Instead of using a loop to supply multiple items and specifying `name: "apache-airflow[{{ item }}]"`, please use `name: '{{
airflow_extra_packages }}'` and remove the loop. This feature will be removed in version 2.11.
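
One way to follow the warning's spirit is to build a single requirement with all extras and call pip once instead of looping (a sketch, not the role's actual task):

# Installs e.g. "apache-airflow[celery,postgres]" in one pip call, so no per-item version ambiguity
- name: Airflow | Installing Airflow Extra Packages
  pip:
    name: "apache-airflow[{{ airflow_extra_packages | join(',') }}]"
    version: "{{ airflow_version }}"
  when: airflow_extra_packages | length > 0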

Airflow not listing DAGs with CeleryExecutor due to an empty variable

In airflow.cfg, the variable dagbag_import_timeout cannot have an empty value when CeleryExecutor is used: it raises a warning, invalid literal for int() with base 10: '', that prevents DAGs from being listed.

The value of this variable is set in the role in defaults/main.yml via the variable airflow_dagbag_import_time and templated afterwards in templates/airflow.cfg.j2. Setting a numeric value instead of leaving it blank fixes this issue.
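
In other words, the default should be a plain integer, for example (the value 30 here is only an illustration):

# defaults/main.yml (sketch)
airflow_dagbag_import_time: 30   # seconds; must never be left as an empty string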

Add Celery installation

We need to modify these files:

  • defaults/main.yml: New variables:
    celery_version: Due to some incompatibilities between Airflow 1.8.x and Celery 4.x, it is useful to be able to specify the Celery version we want to install.
    celery_extra_packages: Dictionary where we can specify if we want to install, for instance, the Redis Celery package.
  • tasks/install.yml:
    • Tasks reorganized. The role will first create the path for the Airflow home and set an environment variable called AIRFLOW_HOME, so that the install path is under control during the Airflow installation and, every time the Airflow stack is restarted, we make sure the proper path is used.
    • Added a block of two tasks to install Celery and its extra packages. Only executed if the airflow_executor variable is CeleryExecutor (a sketch follows this list).
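
A rough sketch of that block (module arguments are assumptions, and celery_extra_packages is treated as a flat list here for simplicity):

- name: Airflow | Installing Celery
  pip:
    name: celery
    version: "{{ celery_version }}"
  when: airflow_executor == "CeleryExecutor"

- name: Airflow | Installing Celery Extra Packages
  pip:
    name: "celery[{{ item }}]"
  with_items: "{{ celery_extra_packages }}"
  when: airflow_executor == "CeleryExecutor" and celery_extra_packages | length > 0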

Upgrade to Molecule v2

The Molecule version is so old that it's a bit of a nightmare to reproduce the tests locally. Upgrade its version.

Better worker restarts

Prerequisites

Description

At the moment, when the workers are restarted, the work they're performing is interrupted. Following this Stack Overflow answer provided by @juanriaza, the workers can be gracefully restarted by sending them the SIGINT kill signal.

Service(airflow_service).is_running fails testing airflow-scheduler

Because the scheduler is not constantly up and running and reboots itself every five seconds, the Service(airflow_service).is_running assertion fails depending on the moment it is checked.

To fix this, we are going to add the @retry decorator from the retrying Python module and make test_airflow_services check whether the services are running for a maximum of 5 seconds:

from retrying import retry

@retry(stop_max_delay=5000)
def test_airflow_services(Service, AnsibleDefaults):
    airflow_services = AnsibleDefaults["airflow_services"]

    for airflow_service in airflow_services:
        if airflow_services[airflow_service]["enabled"]:
            assert Service(airflow_service).is_enabled
            assert Service(airflow_service).is_running

[SUPPORT] Conflict of airflow.cfg files

Description

I'm running the role with almost the default config,
so I have /etc/airflow/airflow.cfg with a MySQL connection parameter.

But during these tasks:
https://github.com/idealista/airflow-role/blob/master/tasks/users.yml#L3-L8

Another Airflow config directory is created in root's home directory, with another airflow.cfg.
Thus the following tasks use this airflow.cfg with the default SQLite.

Additional Information

Note that I'm installing Airflow 2.1.0,
and that I bypassed the installation tasks with my own pre-role tasks:

# See https://airflow.apache.org/docs/apache-airflow/stable/installation.html#installation-tools
- name: Install some dependencies
  package:
    name: "{{ item }}"
  with_items:
    - freetds-bin
    - krb5-user
    - ldap-utils
    - libffi6
    - libsasl2-2
    - libsasl2-modules
    - libssl1.1
    - locales
    - lsb-release
    - sasl2-bin
    - sqlite3
    - unixodbc

# See https://github.com/idealista/airflow-role/blob/master/tasks/install.yml
- name: Airflow | Install pip "{{ airflow_pip_version }}" version
  pip:
    name: pip
    version: "{{ airflow_pip_version }}"
  when: airflow_pip_version is defined

- name: Airflow | Install virtualenv
  pip:
    executable: "{{ airflow_pip_executable }}"
    name: virtualenv

- name: Airflow | Check if exists virtualenv
  stat:
    path: "{{ airflow_app_home }}/pyvenv.cfg"
  register: virtualenv_check

- name: Airflow | Set a virtualenv
  become: true
  become_user: "{{ airflow_user }}"
  command: "virtualenv -p python{{ airflow_python_version | default(omit) }} {{ airflow_app_home }}"
  when: not virtualenv_check.stat.exists

- name: Airflow | Install airflow
  pip:
    name: apache-airflow
    version: "{{ airflow_version }}"
    extra_args: "--constraint '{{ contraint_url }}'"
    virtualenv: "{{ airflow_app_home }}"

ubuntu, airflow db path - var/lib - opt/airflow

Description

ansible playbook installs to opt/airflow/airflow.db

airflow process launched with systemd uses /opt/airflow/airflow.db
ansible/create_user uses /var/lib/airflow/airflow/airflow.db
opt/bin/airflow also uses /var/lib/airflow/airflow/airflow.db

Just looking for ideas on how to solve this.

Additional Information

I changed a few settings in webserver.service and the environment variable files to make Airflow work on Ubuntu Server LTS 20.04,
but this part was broken before that as well.

sudo -u airflow bash -c "/opt/airflow/bin/airflow config get-value core sql_alchemy_conn"
sqlite:////var/lib/airflow/airflow/airflow.db
sudo -u airflow bash -c "export AIRFLOW_CONFIG=/etc/airflow/airflow.cfg; /opt/airflow/bin/airflow config get-value core sql_alchemy_conn"
sqlite:////opt/airflow/airflow.db

Update dependencies and airflow version

Prerequisites

Description

Some dependencies aren't up to date; they were updated:

  • ansible to 2.8.8
  • molecule to 3.0.1
  • docker to 4.1.0
  • ansible-lint to 4.2.0
  • goss to v0.3.10
    The Airflow version is set by default to 1.9.0; apache-airflow now recommends installing 1.10.12.
    To install Airflow 1.10.12, pip3 is needed.

Expected behavior: Install the latest Airflow version with pip3 and its constraints

Actual behavior: A deprecated Airflow version is installed by default with pip

Reproduces how often: 100%

Versions

1.8.0 (latest)

Airflow 1.10 won't work when using LDAP

Prerequisites

Description

When LDAP is enabled in Airflow, it crashes with the same output as this question on Stack Overflow. The problem is the pip package pyasn1, which happens to be at an outdated version. Upgrading it solves the problem.

Steps to Reproduce

  1. Set in a vars file
airflow_version: 1.10.0
airflow_webserver_authenticate: True
airflow_webserver_auth_backend: airflow.contrib.auth.backends.ldap_auth
  2. Deploy Airflow using the role
  3. Check the web UI

Expected behavior: Airflow web UI working.

Actual behavior: Airflow web UI not working.

Reproduces how often: Always

Versions

Since 1.7.0 (first compatible with Airflow 1.10.0)

Private tmp optional in services

Prerequisites

Description

The private tmp option (PrivateTmp) is set to true in the service unit; it would be desirable for it to be an optional parameter.

Steps to Reproduce

  1. [First Step]
  2. [Second Step]
  3. [and so on...]

Expected behavior:

We could choose true or false for private_tmp in the service.

Actual behavior:

It is set to true and can't be changed.

Reproduces how often:

Versions

The version/s you notice the behavior.

Additional Information

Any additional information, configuration or data that might be necessary to reproduce the issue.

[BUG] Tasks Log view is broken

Description

The Log view of tasks is incorrectly configured by default, so it doesn't show any log.

Steps to Reproduce

  1. Run a DAG
  2. Go to Task view and click on "Log"
  3. The log is missing or empty

Expected behavior:
Show the DAG task execution log as normal

Actual behavior:
The log is missing, misconfigured, or fails to load

Reproduces how often:
Always, with the default airflow-role options

Environment

  • The release version/s you are using: 2.0.0
  • OS: Debian 10

Playbook is failing on Check Admin user (> 2.0)

Description


Steps to Reproduce

Just run the playbook on Ubuntu 18.04.

Expected behavior:
I was expecting it to deploy Airflow without any error.

Actual behavior:
TASK [deploy_airflow : Airflow | Check Admin user (> 2.0)] *******************************************************************************************
fatal: [dmpServer]: FAILED! => {"changed": false, "cmd": ["/opt/airflow/bin/airflow", "users", "list"], "delta": "0:00:04.620829", "end": "2021-06-22 01:24:22.929163", "msg": "non-zero return code", "rc": 1, "start": "2021-06-22 01:24:18.308334", "stderr": "/opt/airflow/lib/python3.6/site-packages/flask_appbuilder/models/sqla/interface.py:62 SAWarning: relationship 'DagRun.serialized_dag' will copy column serialized_dag.dag_id to column dag_run.dag_id, which conflicts with relationship(s): 'DagRun.task_instances' (copies task_instance.dag_id to dag_run.dag_id), 'TaskInstance.dag_run' (copies task_instance.dag_id to dag_run.dag_id). If this is not the intention, consider if these relationships should be linked with back_populates, or if viewonly=True should be applied to one or more if they are read-only. For the less common case that foreign key constraints are partially overlapping, the orm.foreign() annotation can be used to isolate the columns that should be written towards. To silence this warning, add the parameter 'overlaps="dag_run,task_instances"' to the 'DagRun.serialized_dag' relationship.\n/opt/airflow/lib/python3.6/site-packages/flask_appbuilder/models/sqla/interface.py:62 SAWarning: relationship 'SerializedDagModel.dag_runs' will copy column serialized_dag.dag_id to column dag_run.dag_id, which conflicts with relationship(s): 'DagRun.task_instances' (copies task_instance.dag_id to dag_run.dag_id), 'TaskInstance.dag_run' (copies task_instance.dag_id to dag_run.dag_id). If this is not the intention, consider if these relationships should be linked with back_populates, or if viewonly=True should be applied to one or more if they are read-only. For the less common case that foreign key constraints are partially overlapping, the orm.foreign() annotation can be used to isolate the columns that should be written towards. 
To silence this warning, add the parameter 'overlaps="dag_run,task_instances"' to the 'SerializedDagModel.dag_runs' relationship.\nTraceback (most recent call last):\n File "/opt/airflow/lib/python3.6/site-packages/connexion/apis/abstract.py", line 209, in add_paths\n self.add_operation(path, method)\n File "/opt/airflow/lib/python3.6/site-packages/connexion/apis/abstract.py", line 173, in add_operation\n pass_context_arg_name=self.pass_context_arg_name\n File "/opt/airflow/lib/python3.6/site-packages/connexion/operations/init.py", line 8, in make_operation\n return spec.operation_cls.from_spec(spec, *args, **kwargs)\n File "/opt/airflow/lib/python3.6/site-packages/connexion/operations/openapi.py", line 138, in from_spec\n **kwargs\n File "/opt/airflow/lib/python3.6/site-packages/connexion/operations/openapi.py", line 89, in init\n pass_context_arg_name=pass_context_arg_name\n File "/opt/airflow/lib/python3.6/site-packages/connexion/operations/abstract.py", line 96, in init\n self._resolution = resolver.resolve(self)\n File "/opt/airflow/lib/python3.6/site-packages/connexion/resolver.py", line 40, in resolve\n return Resolution(self.resolve_function_from_operation_id(operation_id), operation_id)\n File "/opt/airflow/lib/python3.6/site-packages/connexion/resolver.py", line 66, in resolve_function_from_operation_id\n raise ResolverError(str(e), sys.exc_info())\nconnexion.exceptions.ResolverError: <ResolverError: columns>\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File "/opt/airflow/bin/airflow", line 8, in \n sys.exit(main())\n File "/opt/airflow/lib/python3.6/site-packages/airflow/main.py", line 40, in main\n args.func(args)\n File "/opt/airflow/lib/python3.6/site-packages/airflow/cli/cli_parser.py", line 48, in command\n return func(*args, **kwargs)\n File "/usr/lib/python3.6/contextlib.py", line 52, in inner\n return func(*args, **kwds)\n File "/opt/airflow/lib/python3.6/site-packages/airflow/cli/commands/user_command.py", line 35, in users_list\n appbuilder = cached_app().appbuilder # pylint: disable=no-member\n File "/opt/airflow/lib/python3.6/site-packages/airflow/www/app.py", line 135, in cached_app\n app = create_app(config=config, testing=testing)\n File "/opt/airflow/lib/python3.6/site-packages/airflow/www/app.py", line 120, in create_app\n init_api_connexion(flask_app)\n File "/opt/airflow/lib/python3.6/site-packages/airflow/www/extensions/init_views.py", line 172, in init_api_connexion\n specification='v1.yaml', base_path=base_path, validate_responses=True, strict_validation=True\n File "/opt/airflow/lib/python3.6/site-packages/connexion/apps/flask_app.py", line 57, in add_api\n api = super(FlaskApp, self).add_api(specification, **kwargs)\n File "/opt/airflow/lib/python3.6/site-packages/connexion/apps/abstract.py", line 156, in add_api\n options=api_options.as_dict())\n File "/opt/airflow/lib/python3.6/site-packages/connexion/apis/abstract.py", line 111, in init\n self.add_paths()\n File "/opt/airflow/lib/python3.6/site-packages/connexion/apis/abstract.py", line 216, in add_paths\n self._handle_add_operation_error(path, method, err.exc_info)\n File "/opt/airflow/lib/python3.6/site-packages/connexion/apis/abstract.py", line 231, in _handle_add_operation_error\n raise value.with_traceback(traceback)\n File "/opt/airflow/lib/python3.6/site-packages/connexion/resolver.py", line 61, in resolve_function_from_operation_id\n return self.function_resolver(operation_id)\n File 
"/opt/airflow/lib/python3.6/site-packages/connexion/utils.py", line 111, in get_function_from_name\n module = importlib.import_module(module_name)\n File "/usr/lib/python3.6/importlib/init.py", line 126, in import_module\n return _bootstrap._gcd_import(name[level:], package, level)\n File "", line 994, in _gcd_import\n File "", line 971, in _find_and_load\n File "", line 955, in _find_and_load_unlocked\n File "", line 665, in _load_unlocked\n File "", line 678, in exec_module\n File "", line 219, in _call_with_frames_removed\n File "/opt/airflow/lib/python3.6/site-packages/airflow/api_connexion/endpoints/connection_endpoint.py", line 26, in \n from airflow.api_connexion.schemas.connection_schema import (\n File "/opt/airflow/lib/python3.6/site-packages/airflow/api_connexion/schemas/connection_schema.py", line 42, in \n class ConnectionSchema(ConnectionCollectionItemSchema): # pylint: disable=too-many-ancestors\n File "/opt/airflow/lib/python3.6/site-packages/marshmallow/schema.py", line 117, in new\n dict_cls=dict_cls,\n File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/schema/sqlalchemy_schema.py", line 94, in get_declared_fields\n fields.update(mcs.get_auto_fields(fields, converter, opts, dict_cls))\n File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/schema/sqlalchemy_schema.py", line 108, in get_auto_fields\n for field_name, field in fields.items()\n File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/schema/sqlalchemy_schema.py", line 110, in \n and field_name not in opts.exclude\n File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/schema/sqlalchemy_schema.py", line 28, in create_field\n return converter.field_for(model, column_name, **self.field_kwargs)\n File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/convert.py", line 171, in field_for\n return self.property2field(prop, **kwargs)\n File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/convert.py", line 146, in property2field\n field_class = field_class or self._get_field_class_for_property(prop)\n File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/convert.py", line 210, in _get_field_class_for_property\n column = prop.columns[0]\n File "/opt/airflow/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 1240, in getattr\n return self._fallback_getattr(key)\n File "/opt/airflow/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 1214, in _fallback_getattr\n raise AttributeError(key)\nAttributeError: columns", "stderr_lines": ["/opt/airflow/lib/python3.6/site-packages/flask_appbuilder/models/sqla/interface.py:62 SAWarning: relationship 'DagRun.serialized_dag' will copy column serialized_dag.dag_id to column dag_run.dag_id, which conflicts with relationship(s): 'DagRun.task_instances' (copies task_instance.dag_id to dag_run.dag_id), 'TaskInstance.dag_run' (copies task_instance.dag_id to dag_run.dag_id). If this is not the intention, consider if these relationships should be linked with back_populates, or if viewonly=True should be applied to one or more if they are read-only. For the less common case that foreign key constraints are partially overlapping, the orm.foreign() annotation can be used to isolate the columns that should be written towards. 
To silence this warning, add the parameter 'overlaps="dag_run,task_instances"' to the 'DagRun.serialized_dag' relationship.", "/opt/airflow/lib/python3.6/site-packages/flask_appbuilder/models/sqla/interface.py:62 SAWarning: relationship 'SerializedDagModel.dag_runs' will copy column serialized_dag.dag_id to column dag_run.dag_id, which conflicts with relationship(s): 'DagRun.task_instances' (copies task_instance.dag_id to dag_run.dag_id), 'TaskInstance.dag_run' (copies task_instance.dag_id to dag_run.dag_id). If this is not the intention, consider if these relationships should be linked with back_populates, or if viewonly=True should be applied to one or more if they are read-only. For the less common case that foreign key constraints are partially overlapping, the orm.foreign() annotation can be used to isolate the columns that should be written towards. To silence this warning, add the parameter 'overlaps="dag_run,task_instances"' to the 'SerializedDagModel.dag_runs' relationship.", "Traceback (most recent call last):", " File "/opt/airflow/lib/python3.6/site-packages/connexion/apis/abstract.py", line 209, in add_paths", " self.add_operation(path, method)", " File "/opt/airflow/lib/python3.6/site-packages/connexion/apis/abstract.py", line 173, in add_operation", " pass_context_arg_name=self.pass_context_arg_name", " File "/opt/airflow/lib/python3.6/site-packages/connexion/operations/init.py", line 8, in make_operation", " return spec.operation_cls.from_spec(spec, *args, **kwargs)", " File "/opt/airflow/lib/python3.6/site-packages/connexion/operations/openapi.py", line 138, in from_spec", " **kwargs", " File "/opt/airflow/lib/python3.6/site-packages/connexion/operations/openapi.py", line 89, in init", " pass_context_arg_name=pass_context_arg_name", " File "/opt/airflow/lib/python3.6/site-packages/connexion/operations/abstract.py", line 96, in init", " self._resolution = resolver.resolve(self)", " File "/opt/airflow/lib/python3.6/site-packages/connexion/resolver.py", line 40, in resolve", " return Resolution(self.resolve_function_from_operation_id(operation_id), operation_id)", " File "/opt/airflow/lib/python3.6/site-packages/connexion/resolver.py", line 66, in resolve_function_from_operation_id", " raise ResolverError(str(e), sys.exc_info())", "connexion.exceptions.ResolverError: <ResolverError: columns>", "", "During handling of the above exception, another exception occurred:", "", "Traceback (most recent call last):", " File "/opt/airflow/bin/airflow", line 8, in ", " sys.exit(main())", " File "/opt/airflow/lib/python3.6/site-packages/airflow/main.py", line 40, in main", " args.func(args)", " File "/opt/airflow/lib/python3.6/site-packages/airflow/cli/cli_parser.py", line 48, in command", " return func(*args, **kwargs)", " File "/usr/lib/python3.6/contextlib.py", line 52, in inner", " return func(*args, **kwds)", " File "/opt/airflow/lib/python3.6/site-packages/airflow/cli/commands/user_command.py", line 35, in users_list", " appbuilder = cached_app().appbuilder # pylint: disable=no-member", " File "/opt/airflow/lib/python3.6/site-packages/airflow/www/app.py", line 135, in cached_app", " app = create_app(config=config, testing=testing)", " File "/opt/airflow/lib/python3.6/site-packages/airflow/www/app.py", line 120, in create_app", " init_api_connexion(flask_app)", " File "/opt/airflow/lib/python3.6/site-packages/airflow/www/extensions/init_views.py", line 172, in init_api_connexion", " specification='v1.yaml', base_path=base_path, validate_responses=True, strict_validation=True", " 
File "/opt/airflow/lib/python3.6/site-packages/connexion/apps/flask_app.py", line 57, in add_api", " api = super(FlaskApp, self).add_api(specification, **kwargs)", " File "/opt/airflow/lib/python3.6/site-packages/connexion/apps/abstract.py", line 156, in add_api", " options=api_options.as_dict())", " File "/opt/airflow/lib/python3.6/site-packages/connexion/apis/abstract.py", line 111, in init", " self.add_paths()", " File "/opt/airflow/lib/python3.6/site-packages/connexion/apis/abstract.py", line 216, in add_paths", " self._handle_add_operation_error(path, method, err.exc_info)", " File "/opt/airflow/lib/python3.6/site-packages/connexion/apis/abstract.py", line 231, in _handle_add_operation_error", " raise value.with_traceback(traceback)", " File "/opt/airflow/lib/python3.6/site-packages/connexion/resolver.py", line 61, in resolve_function_from_operation_id", " return self.function_resolver(operation_id)", " File "/opt/airflow/lib/python3.6/site-packages/connexion/utils.py", line 111, in get_function_from_name", " module = importlib.import_module(module_name)", " File "/usr/lib/python3.6/importlib/init.py", line 126, in import_module", " return _bootstrap._gcd_import(name[level:], package, level)", " File "", line 994, in _gcd_import", " File "", line 971, in _find_and_load", " File "", line 955, in _find_and_load_unlocked", " File "", line 665, in _load_unlocked", " File "", line 678, in exec_module", " File "", line 219, in _call_with_frames_removed", " File "/opt/airflow/lib/python3.6/site-packages/airflow/api_connexion/endpoints/connection_endpoint.py", line 26, in ", " from airflow.api_connexion.schemas.connection_schema import (", " File "/opt/airflow/lib/python3.6/site-packages/airflow/api_connexion/schemas/connection_schema.py", line 42, in ", " class ConnectionSchema(ConnectionCollectionItemSchema): # pylint: disable=too-many-ancestors", " File "/opt/airflow/lib/python3.6/site-packages/marshmallow/schema.py", line 117, in new", " dict_cls=dict_cls,", " File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/schema/sqlalchemy_schema.py", line 94, in get_declared_fields", " fields.update(mcs.get_auto_fields(fields, converter, opts, dict_cls))", " File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/schema/sqlalchemy_schema.py", line 108, in get_auto_fields", " for field_name, field in fields.items()", " File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/schema/sqlalchemy_schema.py", line 110, in ", " and field_name not in opts.exclude", " File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/schema/sqlalchemy_schema.py", line 28, in create_field", " return converter.field_for(model, column_name, **self.field_kwargs)", " File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/convert.py", line 171, in field_for", " return self.property2field(prop, **kwargs)", " File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/convert.py", line 146, in property2field", " field_class = field_class or self._get_field_class_for_property(prop)", " File "/opt/airflow/lib/python3.6/site-packages/marshmallow_sqlalchemy/convert.py", line 210, in _get_field_class_for_property", " column = prop.columns[0]", " File "/opt/airflow/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 1240, in getattr", " return self._fallback_getattr(key)", " File "/opt/airflow/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 1214, in _fallback_getattr", " raise AttributeError(key)", "AttributeError: 
columns"], "stdout": "", "stdout_lines": []}

Reproduces how often:
100%

Environment

  • The release version/s you are using:
  • OS: Ubuntu 18.04
  • Others:

Additional Information


comments

On my machine I already have MySQL and Tomcat installed, and now I want to install Airflow onto it.

[SUPPORT] SCHEDULER_RUNS discussion

Hello there,
This is more like a discussion than a support request but I did not know where to ask it.
In your role, you set SCHEDULER_RUNS to 1000, which corresponds to airflow_scheduler_num_runs, the total number of runs a scheduler does before shutting down.
I saw on Stack Overflow that this number could be set to 5 in some Docker/Kubernetes environments. I also read that this parameter should be set to -1 in some cases, which implies the scheduler runs indefinitely, and that this behavior should be the norm (link here).
So I would like to know your point of view on this matter and the reason you set this parameter to 1000.
