Currently the pipelines are validated and tested for Flink local execution modes, valida

Enable the pipelines for Flink non-local execution modes as well (fhir-data-pipes, OPEN)

chandrashekar-s commented on June 22, 2024

Enable the pipelines for Flink non-local execution modes as well

Comments (1)

chandrashekar-s commented on June 22, 2024

The following improvements have been made for the Flink local execution mode:

A Flink configuration file is auto-generated with appropriate values. The parameters are determined on a best-effort basis so that the pipelines do not fail even under high load. Refer here for details.
The number of threads (parallelism) defaults to the number of cores on the machine, but can be overridden here. In local mode, by default only one worker is created per pipeline, and the parallelism is achieved within that worker. In non-local mode, however, the cluster can distribute the load across workers (TaskManagers) to achieve the needed parallelism.
The Parquet row group sizes are now configurable, so that the pipeline does not consume too much heap memory. The changes can be found here.
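
To illustrate the first two points, here is a minimal sketch of defaulting the parallelism to the core count and rendering a Flink configuration file from it. This is not the code used in fhir-data-pipes. The function names `default_parallelism` and `render_flink_conf` are hypothetical, and the network memory value is passed in by the caller rather than derived, since the actual heuristics are not described here. The keys `parallelism.default`, `taskmanager.numberOfTaskSlots` and `taskmanager.memory.network.max` are standard Flink options.

```python
import os


def default_parallelism(override=None):
    """Return the override if given, otherwise the number of cores on this machine."""
    if override is not None:
        if override < 1:
            raise ValueError("parallelism must be at least 1")
        return override
    # os.cpu_count() can return None on unusual platforms; fall back to 1.
    return os.cpu_count() or 1


def render_flink_conf(parallelism, network_memory_mb):
    """Render a small set of Flink options as `key: value` lines.

    Assumption: in local mode a single worker hosts every thread, so the
    number of task slots is set equal to the parallelism.
    """
    settings = {
        "parallelism.default": parallelism,
        "taskmanager.numberOfTaskSlots": parallelism,
        "taskmanager.memory.network.max": f"{network_memory_mb}m",
    }
    return "\n".join(f"{key}: {value}" for key, value in settings.items()) + "\n"


if __name__ == "__main__":
    print(render_flink_conf(default_parallelism(), network_memory_mb=256))
```

Calling `default_parallelism(4)` returns the override unchanged, while calling it with no argument falls back to the core count of the machine.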

Since resources are somewhat more abundant in non-local execution mode, these properties can be fine-tuned for it. A few changes might be needed to suit the needs of the cluster.
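
As a rough way to reason about that tuning, the sketch below estimates the heap taken by Parquet writers on each worker. It assumes that every open writer buffers about one row group in heap before flushing, and that threads are spread evenly across TaskManagers. Both are simplifications, and the helper names and the numbers used are hypothetical rather than taken from the project.

```python
import math


def threads_per_worker(parallelism, workers):
    """Threads hosted by the busiest worker when load is spread evenly."""
    return math.ceil(parallelism / workers)


def writer_heap_mb(row_group_mb, threads, writers_per_thread):
    """Approximate heap (MB) held by Parquet writers on one worker.

    Assumption: each open writer buffers roughly one row group in memory.
    """
    return row_group_mb * threads * writers_per_thread


if __name__ == "__main__":
    parallelism, row_group_mb, writers = 8, 32, 3

    # Local mode: a single worker hosts all threads.
    local = writer_heap_mb(row_group_mb, threads_per_worker(parallelism, 1), writers)

    # Non-local mode: the same parallelism spread over 4 TaskManagers.
    cluster = writer_heap_mb(row_group_mb, threads_per_worker(parallelism, 4), writers)

    print(f"local: {local} MB, per TaskManager in cluster: {cluster} MB")
```

With these example numbers, the single local worker holds about 768 MB of row group buffers, while each of four TaskManagers holds about 192 MB. Under these assumptions a cluster could use larger row groups for the same per-worker heap.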
