Comments (12)
- Could you give an example?
- Please also describe what you expect or suggest.
from sqllineage.
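Thank you very much for your reply. I am building column-level lineage for a big data platform and have run into some problems. After reading your article, I would like to understand the implementation details of your approach.
The column-level lineage DAG cannot be independent of table-level lineage. Ideally, only one unified lineage graph is maintained. As for the implementation, there are two options:
- Build the DAG at column granularity; the table-level lineage DAG can then be derived from it through some transformations. To use a relational database analogy, it is like building a detail table first and then aggregating it to get a summary table.
- Model it as a property graph; the JanusGraph documentation is a useful reference. Tables and columns are two types of vertices, and there are two types of edges: one is the membership relationship from a column to its table, the other is the lineage relationship between columns and between tables.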
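The first option above can be sketched roughly as follows. This is only a minimal illustration, not how sqllineage implements it: the table and column names are hypothetical, and a column is assumed to be identified by a `(table, column)` pair. Each column-level edge is projected onto its tables, which is the "aggregate the detail table into a summary table" step.

```python
# Column-level lineage edges: (source_table, source_column) -> (target_table, target_column).
# All table and column names are hypothetical.
column_edges = [
    (("tb1", "c1"), ("tb2", "c1")),
    (("tb1", "c2"), ("tb2", "c1")),
    (("tb1", "c2"), ("tb2", "c2")),
    (("tb2", "c1"), ("tb3", "c1")),
]


def to_table_edges(col_edges):
    """Aggregate column-level edges into a set of table-level edges."""
    table_edges = set()
    for (src_table, _), (dst_table, _) in col_edges:
        # Skip edges that stay inside one table; they add nothing at table level.
        if src_table != dst_table:
            table_edges.add((src_table, dst_table))
    return table_edges


table_edges = to_table_edges(column_edges)
print(sorted(table_edges))
```

One consequence of this design: a table-to-table dependency that has no column edges underneath it would not appear in the derived table-level graph, so every task would need to record at least one column mapping.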
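In the diagram above, both table lineage and column lineage between Hive TB1 and Hive TB2 can be built normally, but the kafka entity produced afterwards by the hive2kafka task may have no column-level lineage. In that case, analyzing all downstream dependencies of a column may break at this entity that has no column lineage.
I am not sure how your option 2 handles this, so I would like to ask.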
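To make the question concrete, here is a small sketch of the property-graph option with the gap described above. It is an assumption-laden illustration: the vertex names (`hive.tb1`, `hive.tb2`, `kafka.topic`, `ck.tb`, `H1_C1`) and the exact shape of the graph are guesses at the diagram, and plain Python dictionaries stand in for a graph database such as JanusGraph.

```python
from collections import deque

# Edge type 1: a column belongs to a table (hypothetical names).
belongs_to = {
    "H1_C1": "hive.tb1",
    "H2_C1": "hive.tb2",
    "H2_C2": "hive.tb2",
    "CK_C1": "ck.tb",
    "CK_C2": "ck.tb",
}

# Edge type 2: lineage, either table -> table or column -> column.
lineage = [
    ("hive.tb1", "hive.tb2"),
    ("hive.tb2", "kafka.topic"),  # sync task: table lineage only
    ("kafka.topic", "ck.tb"),     # sync task: table lineage only
    ("H1_C1", "H2_C1"),           # column lineage exists only between the Hive tables
]


def downstream(start, edges):
    """Return every vertex reachable from start by following lineage edges."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Column-level traversal stops at the Hive table, before the kafka entity.
print(downstream("H1_C1", lineage))
print(downstream("H2_C1", lineage))
# Table-level traversal from the owning table still reaches ClickHouse.
print(downstream(belongs_to["H2_C1"], lineage))
```

In this sketch the column-level walk from `H2_C1` returns nothing, while the table-level walk from its owning table still reaches `ck.tb`. The graph can only report column relationships that were actually recorded as edges.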
What exactly do hive2kafka & kafka2clickhouse in your screenshot do? Are they SQL? If so, why would there be no column lineage?
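They are configuration-driven data synchronization tasks. I am just using these two task types as examples; assume the kafka entity has no column lineage.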
Is there column lineage between the upstream and downstream entities of the kafka entity?
Yes. The Hive entity upstream of kafka has column lineage, and the ClickHouse entity downstream also has column lineage; only kafka has none.
What I am asking is whether there is lineage between H2_C2 and CK_C1.
No, there is none.
Then my understanding is that your lineage should stop upstream of the kafka entity, which is the expected behavior, right?
Column lineage
CK_C1 should be a descendant of H2_C1, and CK_C2 a descendant of H2_C2.
As I understand it, at the column level your diagram does not contain the relationship you describe.
Related Issues (20)
- Column Level Lineage: SELECT * EXCEPT() not showing all columns (BigQuery) HOT 1
- Metadata Masked When Table was in a previous UPDATE statement HOT 1
- Fails to read UTF8-BOM encoded files HOT 2
- No column-level lineage for T-SQL MERGE statement HOT 4
- Tsql table names with square brackets are not resolved correctly HOT 2
- Parse column level lineage incorrect
- Tsql -UPDATE set analyz error HOT 1
- Support Hive/SparkSQL Multi Table Insert Syntax HOT 1
- Column level lineage not drawn properly when metadata is provided HOT 1
- [hive] not support insert into table(field1,fleld2.field3) HOT 3
- when I use merge into ,it reports this error HOT 1
- [hive-dialect] order by and union all together it throw error HOT 1
- [hive-dialect]date/time keyword as fieldName with no `` throw error HOT 1
- [hive-dialect] dbName . tableName can not work failed HOT 3
- [hive-dialect] row_number() over (distribute by fund_account sort by bus_date desc) as sort_id ; not support HOT 1
- Will the informix database be supported in the future? HOT 1
- Subquery Partial Wildcard expansion breaks the column lineage path HOT 3
- Greenplum LATERAL Subquery doesn't work HOT 2
- False negative for Scalar Subquery used in Function HOT 2