
vagrant-ansible-flink-cluster's Introduction

Flink clusters with Vagrant and Ansible

About The Project

The goal of this project is to create a fully functional Flink cluster by virtualizing the nodes. Once the cluster is active, Flink jobs can be deployed in the same way as on a real cluster.

Built With

Pre-requisites

  • VirtualBox must be installed.
  • The project uses Vagrant to create the virtual nodes, so Vagrant must be installed.
  • Vagrant builds each virtual node from the Ubuntu 20.04 box, so that box must be added first. Command: vagrant box add bento/ubuntu-20.04
  • The last step is to provision the nodes. Ansible automates this task, so it must be installed as well.
  • While the Flink cluster is being created, the nodes are reached over SSH. The default SSH port is 22, but the host runs more than one virtual node, so each node's SSH port is mapped to a different port on the host machine. The mappings are defined in the Vagrantfile and default to 2220,2221,2221, so these ports must be free. Note: these ports can also be used to connect to the virtual nodes through SSH.
  • Flink also needs the following ports to be free: 6123,8080,8081
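Before bringing the cluster up, it can help to confirm that nothing is already listening on those ports. The following is a minimal sketch, not part of the project: the function names are hypothetical, and the port list only contains the distinct values named above, so extend it with any other forwarded port defined in your Vagrantfile.

```python
import socket

# Ports named in the prerequisites: forwarded SSH ports plus Flink's own ports.
# Assumption: adjust this list to match the mappings in your Vagrantfile.
REQUIRED_PORTS = [2220, 2221, 6123, 8080, 8081]


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something already accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports):
    """Return the subset of ports that already have a listener."""
    return [p for p in ports if port_in_use(p)]


if __name__ == "__main__":
    taken = busy_ports(REQUIRED_PORTS)
    if taken:
        print("Ports already in use:", ", ".join(map(str, taken)))
    else:
        print("All required ports are free.")
```

The check only detects active listeners on the loopback address, which is usually enough to catch a leftover VM or another local service holding a port.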

Getting Started

If you meet all the prerequisites, just run ./cluster-up.sh in your favorite terminal emulator.

Project architecture

Vagrant

The Vagrant folder contains the Vagrant configuration file. This file defines the number of virtual nodes and their configuration. In this case, the file defines three nodes running Ubuntu 20.04 (one as the Flink JobManager and two as Flink TaskManagers).

For each node, Vagrant calls the Ansible pipeline to provision the node.

Ansible

When Vagrant calls Ansible, the entry point is site.yml. This YAML file defines the pipeline per role and host. The hosts (a node group or a single node), together with their hostnames, are defined in the inventory. The hostnames are the same as those in the Vagrantfile. The inventory file also contains the SSH properties, for example: jobmanager ansible_ssh_host=127.0.0.1 ansible_ssh_port=2220 ansible_ssh_user=vagrant ansible_ssh_pass=vagrant
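To make the structure of such an inventory entry concrete, here is a small sketch that splits the example line into its hostname and key=value properties and builds the matching ssh command. The helper name parse_inventory_line is hypothetical and this is not how Ansible itself parses inventories; it only illustrates what each field means.

```python
import shlex


def parse_inventory_line(line: str) -> dict:
    """Split 'hostname key=value ...' into a dict, hostname stored under 'name'."""
    name, *pairs = shlex.split(line)
    host = {"name": name}
    for pair in pairs:
        key, _, value = pair.partition("=")
        host[key] = value
    return host


# The example entry from the inventory description above.
line = ("jobmanager ansible_ssh_host=127.0.0.1 ansible_ssh_port=2220 "
        "ansible_ssh_user=vagrant ansible_ssh_pass=vagrant")
host = parse_inventory_line(line)

# Build the ssh command that reaches the node through its forwarded port.
print(f"ssh -p {host['ansible_ssh_port']} "
      f"{host['ansible_ssh_user']}@{host['ansible_ssh_host']}")
# prints: ssh -p 2220 vagrant@127.0.0.1
```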

The project folder structure follows the best practices proposed by Ansible (directory-layout).

Step-By-Step provisioning

Common role

The common role runs for all nodes:

  • The first part downloads the JDK and adds its environment variable to the path.
  • The second part downloads Flink and adds its environment variable to the path.
  • Finally, the Flink configuration files (flink-conf.yaml, master, slave) are added.

Start-cluster role

The start-cluster role runs only on the JobManager node, although some tasks are delegated to the TaskManager nodes. The TaskManagers and the JobManager are registered in the cluster.

Workaround

If you don't want to download the JDK and Flink, you can add the .zip files to the files folder. Ansible will detect the zip files and skip the downloads.

Health check

Finally, if everything is correct, the Flink GUI is accessible in the browser at localhost:8081
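The same check can be scripted against the REST API that backs the GUI. This is a minimal sketch, assuming the default address localhost:8081 and the two TaskManagers described above; the function names are hypothetical, and it relies on the taskmanagers field returned by Flink's /overview endpoint.

```python
import json
import urllib.request


def cluster_overview(base_url: str = "http://localhost:8081") -> dict:
    """Fetch the cluster summary from Flink's REST endpoint /overview."""
    with urllib.request.urlopen(f"{base_url}/overview", timeout=5) as resp:
        return json.load(resp)


def is_healthy(overview: dict, expected_taskmanagers: int = 2) -> bool:
    """True when at least the expected number of TaskManagers has registered."""
    return overview.get("taskmanagers", 0) >= expected_taskmanagers


if __name__ == "__main__":
    try:
        overview = cluster_overview()
    except OSError as exc:  # URLError and socket timeouts are OSError subclasses
        print(f"Flink REST API not reachable: {exc}")
    else:
        status = "healthy" if is_healthy(overview) else "TaskManagers missing"
        print(status, overview)
```

If the TaskManagers are still starting, the count may be lower than expected for a short time, so rerunning the check after a few seconds is reasonable.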
