dtstack / taier

Taier is a big data development platform for task submission, scheduling, operations and maintenance (O&M), and indicator display.

Home Page: https://dtstack.github.io/Taier/

License: Apache License 2.0

Languages: Shell 0.04%, Java 72.37%, JavaScript 0.28%, Scala 1.14%, TypeScript 16.59%, CSS 5.19%, SCSS 0.56%, Mustache 0.01%, Dockerfile 0.01%, PLpgSQL 3.84%
Topics: distributed-schedule-system, dag, scheduler, job-scheduler, workflow-scheduling-system, task-schedule, cronjob-scheduler, data-schedule, flink, spark

taier's Introduction

Taier

A distributed dispatching system

Official Website | Documentation

中文 | English

Introduction

Taier, written 太阿 in Chinese, is named after one of the celebrated swords of ancient China.

Taier is a distributed dispatching system focused on submitting and scheduling different kinds of tasks. It aims to reduce the cost of ETL development, make complex dependencies between tasks clear, and cut the labor cost of submission, scheduling, and O&M.

With Taier you do not need to worry about complex dependencies between tasks or the underlying architecture of the big data platform, so you can focus on the business.

Taier provides a one-stop big data platform for task submission, task scheduling, O&M, and indicator display.

The core features for Taier are as follows:

  • Easy to scale out in a distributed deployment
  • Visual configuration of DAGs
  • An IDE-style development platform designed for big data users
  • Supports developing your own plugins
  • Multiple task modes, including guide mode and script mode
  • Supports dependencies between upstream and downstream tasks
  • Supports batch and stream tasks
  • Integrates with multiple versions of Hadoop
  • Easy to integrate with Flink Standalone
  • Completely safe and non-intrusive to the cluster environment
  • Isolation by tenant and cluster
  • Supports Kerberos authentication
  • Keeps multiple versions of each task
  • Supports user-defined task parameters
  • Real-time monitoring of cluster resources
  • Real-time display of data indicators
  • Resource limits for tasks
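
The upstream/downstream dependencies in the feature list form a DAG. As an illustration only, and not Taier's actual scheduler, the following Python sketch orders tasks so that each one runs after its upstream tasks (Kahn's algorithm). The task names and the `deps` mapping are hypothetical and exist only for this example.

```python
from collections import deque


def topological_order(deps):
    """Return task names ordered so that each task follows its upstream tasks.

    deps maps a task name to the list of upstream tasks it depends on.
    Raises ValueError if the dependencies contain a cycle.
    """
    tasks = set(deps)
    for ups in deps.values():
        tasks.update(ups)

    indegree = {t: 0 for t in tasks}
    downstream = {t: [] for t in tasks}
    for task, ups in deps.items():
        for up in ups:
            indegree[task] += 1
            downstream[up].append(task)

    # Start with tasks that have no upstream dependency (sorted for determinism).
    ready = deque(sorted(t for t in tasks if indegree[t] == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for down in sorted(downstream[task]):
            indegree[down] -= 1
            if indegree[down] == 0:
                ready.append(down)

    if len(order) != len(tasks):
        raise ValueError("dependency cycle detected")
    return order


# Hypothetical task graph: two loads feed a join, which feeds a report.
example_deps = {
    "load_orders": [],
    "load_users": [],
    "join_orders_users": ["load_orders", "load_users"],
    "daily_report": ["join_orders_users"],
}
order = topological_order(example_deps)
```

A cycle in the dependencies raises `ValueError` instead of producing a partial order, which is usually the safer behavior for a scheduler.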

Architecture

[Figure: Taier architecture diagram]

Quick start

$ wget https://raw.githubusercontent.com/DTStack/Taier/master/docker-compose.yml
$ docker-compose up -d

[Image: main]

Tasks

Tasks Documentation
Work Flow Documentation
Data Sync Documentation
Data Acquisition Documentation
Flink Documentation
Shell Documentation
Python Documentation
Spark SQL Documentation
Hive SQL Documentation
Flink SQL Documentation
OceanBase SQL Documentation
ClickHouse SQL Documentation
Doris SQL Documentation
TiDB SQL Documentation
MySQL SQL Documentation
Vertica SQL Documentation
PostgreSQL Documentation
SQL Server Documentation
Greenplum SQL Documentation
MaxCompute SQL Documentation
GaussDB SQL Documentation
DataX Documentation
User-defined Task Documentation

Questions

FAQ Reference

For questions, bug reports, and support requests, please open an issue; we will reply as soon as we can.

Contribution

Please make sure to read the Contributing Guide before making a pull request.

License

Taier is under the Apache 2.0 license. See the LICENSE file for details.

taier's People

Contributors

ddwolf715, diff-stone, dtdazhi, flechazow, flyysoul, ghm02708, hyperleoon, jiemotongxue, jin-sir, jixiangup, kinoxyz1, kongshan-zhuyu, lousenjay, mortalyoung, peishengsheng, poxiao8, qaq-jun, saltingfish, shiqiwang0, techfight, vainhope, wangchuanpoxiao, weaksloth, wewoor, wjschsg88, yyl4ever, zekai-li, zhangyaodong123, zhaozhenzhi, zwight

taier's Issues

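Phase 1 Roadmap

Milestone 1: Rework the Operations Center, Data Sources, and Console (2021-07-01 to 2021-07-15) #17

Based on DTStack version: v4.2

  • Adjust navigation and export the micro-app entry
  • Remove unneeded tenant/project/user logic

See #17 for details

Milestone 2: Task Development (2021-07-01 to 2021-08-15) #19

  • Support SparkSQL task development
  • Support basic data synchronization (relational databases)

See #19 for details

Milestone 3: Release Ready (2021-08-15 to 2021-08-30)

Global functional testing (tasks / operations / console / data sources)

  • Test data source management
  • Test SparkSQL task support
  • Test data synchronization task support (MySQL, Oracle, SQLServer, PG, HDFS, Hive)
  • Test basic task operations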

Front-end project initialization

Base libraries and frameworks: TypeScript + React + Molecule

  • Integrate ESLint
  • Integrate Prettier
  • Integrate Commitlint
  • Integrate lint-md
  • Jest unit tests

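[Operations Center] Periodic instances

  • Query (name, business date, scheduled time, status, task type, schedule period)
  • Kill tasks in batch
  • Rerun current
  • Rerun current and all downstream
  • Instance dependency view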
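
The two rerun options in this list differ only in which instances they select. As a rough sketch, not Taier's implementation, the Python below collects an instance together with everything reachable downstream of it using a breadth-first traversal. The `children` mapping and the instance names are hypothetical.

```python
from collections import deque


def collect_downstream(start, children):
    """Return start plus every instance reachable downstream of it, in BFS order.

    children maps an instance name to the list of its direct downstream instances.
    """
    seen = {start}
    queue = deque([start])
    result = []
    while queue:
        node = queue.popleft()
        result.append(node)
        for child in children.get(node, []):
            # Skip instances already scheduled for rerun (diamond dependencies).
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return result


# Hypothetical instance graph: A feeds B and C, and both feed D.
children = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}

rerun_current = ["B"]                                   # "rerun current"
rerun_with_downstream = collect_downstream("B", children)  # "rerun current and all downstream"
```

The `seen` set ensures that an instance with several upstream paths, such as `D` here, is selected only once.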

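Milestone 1: Rework the Operations Center, Data Sources, and Console

Mainly removes the legacy applications' tenant, user, and user-management logic, among others

Operations Center @ProfBramble

  • Remove all logic tied to the right-hand navigation: products, users, messages, settings
  • Rework the entry page and remove the Operations Center top navigation
  • Add the micro-app entry
  • UI detail adjustments and testing

Data Sources @ProfBramble

  • Rework the entry page and remove the Data Source Center top navigation
  • Add the micro-app entry
  • UI detail adjustments and testing

Console @ting0130

  • Change the top navigation to a left-side layout
  • Rework the entry page
  • Add the micro-app entry
  • Rework UI routing #22
  • UI detail adjustments and testing #21

Common changes #23 @ProfBramble @ting0130

  • Remove all logic tied to the right-hand navigation: products, users, messages, settings
  • Remove the dt-common dependency
  • Replace the original dt-common logic with dt-react-component
  • Replace the original dt-common logic with dt-utils
  • Clean up unused dependencies under all products
  • Clean up the public/config.js logic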

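Task instance operations

  • Expand upstream and downstream
  • View task logs
  • Refresh task instances
  • Mark as successful and resume scheduling
  • Rerun and resume scheduling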

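Environment variable recognition error

Details

When an environment variable is used in SQL, it fails with the error shown in the screenshot

[Image: screenshot of the error]

The SQL is as follows: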

create table fen_hao_split
    as
    select '202107' as ${tmonth_code},t.project_code,d.ac_code,
    d.disp_code,d.disp_index,
    sum(case when d.isdecr is not null and d.isdecr<>t.isdecr then 0 else m_scale*t.rmb_value end) as ttl_value,
    sum(case when date_format(date_code,'yyyyMM')<concat(substr('202107',1,4),'01') and d.isdecr is not null and d.isdecr<>t.isdecr then 0
            when date_format(date_code,'yyyyMM')<concat(substr('202107',1,4),'01') then m_scale*t.rmb_value end) as jan_value 
    from ods_d_pfa_gl_account_df d 
    left join tmp_ads_fin_pfa_balance_proj_df_00_rt t 
    on substr(t.account_code,1,4) = d.fin_account_code 
    or substr(t.account_code,1,6) = d.fin_account_code 
        or substr(t.account_code,1,8) = d.fin_account_code 
    or substr(t.account_code,1,4)=split(fin_account_code,';')[0]
    where table_type='资产负债表'  and d.year_code=substr('202107',1,4)  and fiscal_period<='202107'
    group by t.project_code,d.ac_code,
            d.disp_code,d.disp_index
    limit 10
;
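
For context, here is a hedged sketch of how `${name}` placeholders in SQL text might be substituted before the statement is executed. It is not Taier's implementation and does not diagnose the error reported above. The function `render_sql` and the parameter values are assumptions made for illustration. The sketch reports any placeholder that has no supplied value, such as `${tmonth_code}` in the query above, instead of passing it through to the SQL engine.

```python
import re

# Matches ${name} where name is made of letters, digits, or underscores.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def render_sql(sql, params):
    """Replace ${name} placeholders in sql with values from params.

    Raises KeyError naming every placeholder that has no value.
    """
    missing = sorted({name for name in PLACEHOLDER.findall(sql) if name not in params})
    if missing:
        raise KeyError("unresolved placeholders: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda m: str(params[m.group(1)]), sql)


# Shortened, hypothetical variant of the statement from the report.
sql = "select '202107' as ${tmonth_code}, t.project_code from t limit 10"
rendered = render_sql(sql, {"tmonth_code": "tmonth_code"})
```

Failing early with the list of unresolved names makes a missing task parameter easier to spot than a parse error from the SQL engine.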

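Adjust the public path for micro-frontend static assets

Background

To work around issues such as cross-origin image loading in micro-apps, we dynamically rewrite image paths into absolute paths that include the host. This in turn means the host variable has to be maintained.

Summary

We need a file or a UI interaction that lets users configure the host prefix for absolute paths more conveniently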
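
One possible shape for this, sketched in Python purely to illustrate the idea: a single configurable prefix is applied to every relative asset path. The `assetHost` key, the example URL, and the function name are all hypothetical and do not come from the project.

```python
def resolve_asset_url(path, host_prefix):
    """Turn a relative asset path into an absolute URL under host_prefix.

    Paths that are already absolute URLs or data URIs are returned unchanged.
    """
    if path.startswith(("http://", "https://", "//", "data:")):
        return path
    # Drop leading "./" segments, then join with exactly one slash.
    while path.startswith("./"):
        path = path[2:]
    return host_prefix.rstrip("/") + "/" + path.lstrip("/")


# Hypothetical configuration, e.g. loaded from a user-editable file.
config = {"assetHost": "http://localhost:8080/console/"}

logo_url = resolve_asset_url("./public/img/logo.png", config["assetHost"])
```

Keeping the prefix in one configuration entry means the host only has to be changed in one place when the deployment address changes.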

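Micro-frontend CSS sandbox hard isolation causes some styles to be lost

The micro-apps currently rely on QianKun's isolation, which automatically prefixes CSS selectors, but this causes some styles to be lost once they enter the micro-frontend environment.
Known issues so far

  • Custom mxGraph styles stop working
  • Custom styles for viewing logs on the Operations Center list detail page stop working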

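Add a simple login implementation

  1. The front end provides a very simple login page;
  2. The username and password are set through configuration;
  3. Return a token containing the required information: userid, projectId, tenantId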
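
The three steps above can be sketched roughly as follows. This is an assumption-laden illustration, not the project's login code: the `CONFIG` values, the signing key, the token layout (base64 payload plus HMAC-SHA256 signature), and all function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical configuration: credentials and signing key set by the operator.
CONFIG = {"username": "admin", "password": "admin123", "secret": b"change-me"}


def _sign(body):
    """Return the hex HMAC-SHA256 signature of the encoded payload."""
    return hmac.new(CONFIG["secret"], body.encode("ascii"), hashlib.sha256).hexdigest()


def login(username, password, userid, project_id, tenant_id):
    """Check the configured credentials and return a signed token."""
    ok_user = hmac.compare_digest(username.encode("utf-8"), CONFIG["username"].encode("utf-8"))
    ok_pass = hmac.compare_digest(password.encode("utf-8"), CONFIG["password"].encode("utf-8"))
    if not (ok_user and ok_pass):
        raise PermissionError("invalid username or password")
    payload = json.dumps(
        {"userid": userid, "projectId": project_id, "tenantId": tenant_id},
        sort_keys=True,
    ).encode("utf-8")
    body = base64.urlsafe_b64encode(payload).decode("ascii")
    return body + "." + _sign(body)


def verify_token(token):
    """Return the payload dict if the signature is valid, else raise ValueError."""
    body, _, signature = token.rpartition(".")
    if not hmac.compare_digest(signature, _sign(body)):
        raise ValueError("invalid token signature")
    return json.loads(base64.urlsafe_b64decode(body.encode("ascii")))


token = login("admin", "admin123", userid=1, project_id=10, tenant_id=100)
claims = verify_token(token)
```

Signing the payload lets the server trust the `userid`, `projectId`, and `tenantId` carried in the token without a session store. A production setup would also need expiry and a secret that is not hard-coded.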

Icon font design

Mainly the icon font glyphs used in DAGScheduleX, specifically:

  • Resource
  • Function
  • Data synchronization task
  • SparkSQL task

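Rework the Operations Center, Data Sources, and Console

Mainly reworks the existing Operations Center, Data Sources, and Console.

Based on DTStack version: v4.2

  1. Adjust navigation and export the micro-app entry
  2. Remove unneeded tenant/project/user logic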

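Milestone 2: Task Development

The core is task development, built on Molecule:

Development architecture

Common task features

  • Submit
  • Publish
  • Run
  • Stop
  • Operations
  • Task properties
  • Schedule dependency configuration
  • Dependency view #9
  • Task parameters
  • Environment parameters
  • Function management #11
  • Resource management (SparkSQL does not seem to need it; do Spark tasks depend on jar resources?) #10

Spark SQL task development

  • Add / delete #3

Data synchronization task

  • Add / delete #4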

[Task Development] Data synchronization

  • Guide mode (source, target, mapping, channel configuration, preview)
  • Support MySQL, Oracle, SQLServer, PG
  • Support HDFS
  • Support Hive

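[Operations Center] Backfill instances

  • Query (name, business date, scheduled time, status, task type, schedule period)
  • Kill tasks in batch
  • Rerun current
  • Rerun current and all downstream
  • Instance dependency view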

Global testing

System testing

Data sources

Task development

Task operations

Console

lerna architecture optimization

  • Optimize the start command
  • Optimize dependencies
  • Optimize husky
  • Optimize template
  • Optimize install
