
compass's Introduction

CubeFS


Community Meeting
The CubeFS Project holds a bi-weekly online community meeting. To join, or to view notes and recordings from previous meetings, please see the meeting schedule and meeting minutes.

Overview

CubeFS ("储宝" in Chinese) is an open-source cloud-native file storage system, hosted by the Cloud Native Computing Foundation (CNCF) as an incubating project.

What can you build with CubeFS

As an open-source distributed storage system, CubeFS can serve as your datacenter filesystem, data lake storage infrastructure, and private or hybrid cloud storage. In particular, CubeFS enables the separation of storage and compute for databases and AI/ML applications.

Some key features of CubeFS include:

  • Multiple access protocols such as POSIX, HDFS, S3, and its own REST API
  • Highly scalable metadata service with strong consistency
  • Performance optimization of large/small files and sequential/random writes
  • Multi-tenancy support with better resource utilization and tenant isolation
  • Hybrid cloud I/O acceleration through multi-level caching
  • Flexible storage policies: high-performance replication or low-cost erasure coding
CubeFS Architecture

Documents

Community

Partners and Users

For a list of users and success stories, see ADOPTERS.md.

Reference

Haifeng Liu, et al., CFS: A Distributed File System for Large Scale Container Platforms. SIGMOD '19, June 30-July 5, 2019, Amsterdam, Netherlands.

For more information, please refer to https://dl.acm.org/citation.cfm?doid=3299869.3314046 and https://arxiv.org/abs/1911.03001

License

CubeFS is licensed under the Apache License, Version 2.0. For details, see LICENSE and NOTICE.

Note

The master branch may be in an unstable or even broken state during development. Please use releases instead of the master branch in order to get a stable set of binaries.

Star History

Star History Chart


compass's Issues

Failed to execute compass.sql `Invalid default value for 'start_time'`

In MySQL 5.7.*, the default explicit_defaults_for_timestamp=OFF means that TIMESTAMP columns are NOT NULL by default.

MySQL version: 5.7.31

Reference: https://dev.mysql.com/doc/refman/5.7/en/timestamp-initialization.html

If the [explicit_defaults_for_timestamp](https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_explicit_defaults_for_timestamp) system variable is disabled, [TIMESTAMP](https://dev.mysql.com/doc/refman/5.7/en/datetime.html) columns by default are NOT NULL, cannot contain NULL values, and assigning NULL assigns the current timestamp. 
To permit a [TIMESTAMP](https://dev.mysql.com/doc/refman/5.7/en/datetime.html) column to contain NULL, explicitly declare it with the NULL attribute.

We can change

   `start_time` timestamp(6) DEFAULT NULL COMMENT 'task start time',
   `end_time`   timestamp(6) DEFAULT NULL COMMENT 'task end time',

to

   `start_time` timestamp(6) NULL DEFAULT NULL COMMENT 'task start time',
   `end_time`   timestamp(6) NULL DEFAULT NULL COMMENT 'task end time',

or

   `start_time` timestamp(6) DEFAULT current_timestamp(6) COMMENT 'task start time',
   `end_time`   timestamp(6) DEFAULT current_timestamp(6) COMMENT 'task end time',

Flink diagnosis

What is the open-source plan for compass's Flink diagnosis?

Execution failed on Mac

I am using the compiled dist script on Mac and encountered an error. After troubleshooting, I found that the 'ps --pid' syntax is not supported on macOS; the portable form is 'ps -p <pid>'.

ElasticSearch query problem

The Elasticsearch indices of compass are created automatically. By default, Elasticsearch maps strings to both text and keyword types; text and keyword are used in different query scenarios, and compass uses the wrong one.

compass query log in task-detect
2023-04-28 14:53:36,538 INFO 657111 [delay-queue-executor-2] [] : [c.o.c.detect.service.impl.ElasticSearchServiceImp:151] indexes:[compass-job-instance-*], duration:0 ,condition:{"query":{"bool":{"filter":[{"terms":{"taskName":["TEST_TPCDS"],"boost":1.0}},{"terms":{"projectName":["SPARKSQL_TEST_20012523"],"boost":1.0}},{"terms":{"executionDate":["2023-04-28T01:29:58.000"],"boost":1.0}},{"terms":{"flowName":["SPARKSQL_TEST_20012523"],"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}}}

This returns an empty list. When I change the query condition to
{"query":{"bool":{"filter":[{"terms":{"executionDate":["2023-04-28T01:29:58.000"],"boost":1.0}},{"terms":{"taskName.keyword":["TEST_TPCDS"],"boost":1.0}},{"terms":{"projectName.keyword":["SPARKSQL_TEST_20012523"],"boost":1.0}},{"terms":{"flowName.keyword":["SPARKSQL_TEST_20012523"],"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}}}

it works

Maybe we should change the search condition, or use an Elasticsearch index template to control the field mapping.
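
A minimal sketch of the first option, assuming the query is built with the Elasticsearch high-level client's QueryBuilders (as the compass log suggests): point the exact-match terms queries at the .keyword sub-fields.

    import org.elasticsearch.index.query.BoolQueryBuilder;
    import org.elasticsearch.index.query.QueryBuilders;

    class KeywordQuerySketch {
        // Sketch only: target the .keyword sub-fields so the terms queries do
        // exact matching instead of hitting the analyzed text fields.
        static BoolQueryBuilder buildFilter() {
            return QueryBuilders.boolQuery()
                    .filter(QueryBuilders.termsQuery("taskName.keyword", "TEST_TPCDS"))
                    .filter(QueryBuilders.termsQuery("projectName.keyword", "SPARKSQL_TEST_20012523"))
                    .filter(QueryBuilders.termsQuery("flowName.keyword", "SPARKSQL_TEST_20012523"))
                    .filter(QueryBuilders.termsQuery("executionDate", "2023-04-28T01:29:58.000"));
        }
    }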

[Question]: CpuWasteDetector bug verify

In Spark's CpuWasteDetector, when executorWastedPercentOverAll is computed:

float executorWastedPercentOverAll =
(((float) inJobComputeMillisAvailable - inJobComputeMillisUsed) / appComputeMillisAvailable) * 100;

the unit of inJobComputeMillisAvailable is CPU time (cores × milliseconds), while inJobComputeMillisUsed is accumulated from the TaskMetrics of SparkListenerTaskEnd events via

@JsonProperty("Executor Run Time")
private Long executorRunTime;

whose unit is elapsed wall-clock time, not CPU time.

Shouldn't we accumulate the executor CPU time field instead?

@JsonProperty("Executor CPU Time")
private Long executorCpuTime;
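
A minimal sketch of the proposed fix, not compass's actual code. One caveat not stated in the issue: in Spark's TaskMetrics, Executor CPU Time is reported in nanoseconds while Executor Run Time is in milliseconds, so a unit conversion is needed before comparing against the millisecond budgets.

    import java.util.List;

    class CpuWasteSketch {
        // Sum per-task "Executor CPU Time" (nanoseconds) and convert to milliseconds
        // so it is comparable with the detector's core-millisecond budgets.
        static long cpuMillisUsed(List<Long> executorCpuTimeNanos) {
            long nanos = 0L;
            for (Long n : executorCpuTimeNanos) {
                nanos += n;
            }
            return nanos / 1_000_000L; // ns -> ms
        }

        static float wastedPercentOverAll(long inJobComputeMillisAvailable,
                                          long inJobCpuMillisUsed,
                                          long appComputeMillisAvailable) {
            return (((float) inJobComputeMillisAvailable - inJobCpuMillisUsed)
                    / appComputeMillisAvailable) * 100f;
        }
    }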

replace problem with checkLogPath judgment in HDFSUtil

checkLogPath()

private static String checkLogPath(NameNodeConf nameNode, String logPath) {
    // TODO return?
    if (logPath.split(":").length > 2) {
        return logPath;
    }
    return logPath.replace(nameNode.getNameservices(), nameNode.getNameservices() + ":" + nameNode.getPort());
}

The base log path is:

logPath: hdfs://log-hdfs:8020/flume/airflow/dag_id=example_bash_operator/run_id=manual__2023-04-13T02_28_39_356311+00_00/task_id=runme_1/attempt=1

Maybe we should change

logPath.replace(nameNode.getNameservices(), nameNode.getNameservices() + ":" + nameNode.getPort());

to

logPath.replace("log-hdfs:8020", nameNode.getNameservices() + ":" + nameNode.getPort());

[Bug]: Log reading is inflexible: the YARN log directory structure of Hadoop 3.3.0 is not supported, and log files with compressed-format suffixes cannot be read.

Error message

2023-05-09 08:01:10,890 ERROR 53279 [task-thread-5] [] : [c.o.c.p.service.job.parser.SparkExecutorLogParser:80] Exception:
java.io.FileNotFoundException: File hdfs://cluster1/tmp/logs/hadoop/logs/application_1682060914400_0287 does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1282)
at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1256)
at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1201)
at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1197)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1215)
at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2162)
at org.apache.hadoop.fs.FileSystem$5.<init>(FileSystem.java:2288)
at org.apache.hadoop.fs.FileSystem.listFiles(FileSystem.java:2285)
at com.oppo.cloud.parser.utils.HDFSUtil.listFiles(HDFSUtil.java:149)
at com.oppo.cloud.parser.service.reader.HDFSReader.listFiles(HDFSReader.java:56)
at com.oppo.cloud.parser.service.reader.HDFSReader.getReaderObjects(HDFSReader.java:72)
at com.oppo.cloud.parser.service.job.parser.SparkExecutorLogParser.run(SparkExecutorLogParser.java:78)
at com.oppo.cloud.parser.service.job.task.Task.lambda$createFutures$0(Task.java:67)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)

Cause

https://issues.apache.org/jira/browse/YARN-6929
In the new version, the log path structure is {aggregation_log_root}/{user}/bucket_{suffix}/{bucket1}/{appId}.
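
A minimal sketch of how the parser could cope, with the directory names and the bucket function below being assumptions rather than the verified YARN implementation (the exact rules live in YARN's log aggregation file controllers): probe the pre-3.3.0 flat layout first, then fall back to a bucketed candidate.

    import java.io.IOException;

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    class YarnLogPathSketch {
        static Path resolveAppLogDir(FileSystem fs, String root, String user, String appId)
                throws IOException {
            // Old layout: {root}/{user}/logs/{appId}
            Path flat = new Path(root + "/" + user + "/logs/" + appId);
            if (fs.exists(flat)) {
                return flat;
            }
            // New layout (assumed names): {root}/{user}/bucket-{suffix}/{bucket}/{appId}
            long seq = Long.parseLong(appId.substring(appId.lastIndexOf('_') + 1));
            long bucket = seq % 10000; // assumed bucket function
            return new Path(root + "/" + user + "/bucket-logs-tfile/" + bucket + "/" + appId);
        }
    }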

Suggestion: support the YARN 3.3.0 log directory structure (as sketched above).

Suggestion: support reading logs in multiple compressed formats.

The compression formats to read could be exposed in the configuration for users to choose, or read from the relevant configuration files, e.g. the log compression settings in the Spark History Server configuration.

Other suggestions

It would help to provide task-syncer mapping configuration examples for the mainstream scheduler versions, to reduce the setup cost for ordinary users.

Logic problem with checkLogPath judgment in HDFSUtil

custom.rules.logPathJoins.data is

hdfs://log-hdfs:8020/flume/airflow

checkLogPath()

private static String checkLogPath(NameNodeConf nameNode, String logPath) {
    // TODO return?
    if (logPath.split(":").length > 2) {
        return logPath;
    }
    return logPath.replace(nameNode.getNameservices(), nameNode.getNameservices() + ":" + nameNode.getPort());
}

Maybe we should change

 if (logPath.split(":").length > 2) {
        return logPath;
 }

to

 if (logPath.split(":").length != 3) {
        return logPath;
 }

or

modify custom.rules.logPathJoins.data to remove the 'hdfs:/' prefix.

Spark eventlog decompress util

Spark event logs can be compressed; the supported algorithms include snappy, lz4, and zstd.
When the event log is compressed, compass cannot read it correctly.
A utility to decompress the event log is needed to solve the problem.
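
A minimal sketch of such a utility, assuming the codec can be inferred from the suffix Spark appends to event log files, and using the same stream libraries Spark writes with (lz4-java, snappy-java, zstd-jni):

    import java.io.IOException;
    import java.io.InputStream;

    import com.github.luben.zstd.ZstdInputStream;   // zstd-jni
    import net.jpountz.lz4.LZ4BlockInputStream;     // lz4-java
    import org.xerial.snappy.SnappyInputStream;     // snappy-java

    class EventLogDecompressorSketch {
        // Wrap the raw HDFS stream with the matching decompressor, chosen by suffix.
        static InputStream wrap(String fileName, InputStream raw) throws IOException {
            if (fileName.endsWith(".lz4")) {
                return new LZ4BlockInputStream(raw);
            } else if (fileName.endsWith(".snappy")) {
                return new SnappyInputStream(raw);
            } else if (fileName.endsWith(".zstd")) {
                return new ZstdInputStream(raw);
            }
            return raw; // uncompressed event log
        }
    }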

Database initialization error

When I try to run the document->sql->compass.sql file, an error message appears: [2023-04-06 11:51:12] [23000][1062] Duplicate entry 'driver-' for key 'idx_logType_action'.

[Feature]: metrics support

Is your feature request related to a problem? Please describe here.
We use Redis as a queue to consume tasks and parse the logs, so we need to know the size of the queue when the queue is blocked (see the sketch after this list).
Other metrics:

  • when log parsing fails
  • the number of each kind of log parsed
  • the metrics in task-syncer when an exception occurs
  • and so on; the more the better.
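
One way to expose the queue depth, sketched with Micrometer and Spring Data Redis (both assumed here; the metric name and queue key are made up, not compass's real names):

    import io.micrometer.core.instrument.Gauge;
    import io.micrometer.core.instrument.MeterRegistry;
    import org.springframework.data.redis.core.StringRedisTemplate;

    class QueueDepthMetricSketch {
        // Register a gauge that reads the length of the Redis list used as the queue.
        static void register(MeterRegistry registry, StringRedisTemplate redis) {
            Gauge.builder("compass.task.queue.size", redis, r -> {
                        Long size = r.opsForList().size("compass:task:queue"); // assumed key
                        return size == null ? 0.0 : size;
                    })
                    .description("Pending tasks in the Redis parse queue")
                    .register(registry);
        }
    }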

Remove the dependency on zookeeper

Compass depends on many services, which makes it difficult to install quickly. Can we shrink the set of dependent services?

The ZooKeeper service is only used as a distributed lock in task-metadata; maybe we can use Redis instead.
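
A minimal sketch of a Redis replacement for that lock, assuming Spring Data Redis is already on the classpath (the key naming is made up): acquire with SET NX plus a TTL, and release with a compare-and-delete Lua script so one instance cannot delete another's lock.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.UUID;

    import org.springframework.data.redis.core.StringRedisTemplate;
    import org.springframework.data.redis.core.script.DefaultRedisScript;

    class RedisLockSketch {
        // Delete the key only if it still holds our token.
        private static final String RELEASE_LUA =
                "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end";

        private final StringRedisTemplate redis;
        private final String key;   // e.g. "compass:task-metadata:lock" (assumed)
        private final String token = UUID.randomUUID().toString();

        RedisLockSketch(StringRedisTemplate redis, String key) {
            this.redis = redis;
            this.key = key;
        }

        boolean tryLock(Duration ttl) {
            return Boolean.TRUE.equals(redis.opsForValue().setIfAbsent(key, token, ttl));
        }

        void unlock() {
            redis.execute(new DefaultRedisScript<>(RELEASE_LUA, Long.class),
                    Collections.singletonList(key), token);
        }
    }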

task-application error

2023-05-08 16:31:51,702 ERROR 31022 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] [] : [c.o.c.a.service.impl.LogParserServiceImpl:362] filesPattern_error:java.lang.IllegalArgumentException: Wrong FS: hdfs://hacluster:8020/flume/dolphinscheduler/2023-05-08/7748035392416_4/243176, expected: hdfs://hacluster

The deployment guide mentions that the task-application module needs to read the scheduling platform's logs and recommends using Flume to collect them into HDFS, but it does not explain how to configure Flume.
We are actually using DolphinScheduler 2.0 with 4 workers deployed, and the logs are all stored locally.

PS: the application-hadoop configuration file is:

hadoop:
  namenodes:
    - nameservices: hacluster
      namenodesAddr: [ "xxx01", "xxx03" ]
      namenodes: [ "nn1", "nn2" ]
      user: hdfs
      password:
      port: 8020
      # scheduler platform hdfs log path keyword identification, used by task-application
      matchPathKeys: [ "flume" ]
      # kerberos
      enableKerberos: false
      # /etc/krb5.conf
      krb5Conf: ""
      # hdfs/*@EXAMPLE.COM
      principalPattern: ""
      # admin
      loginUser: ""
      # /var/kerberos/krb5kdc/admin.keytab
      keytabPath: ""

yarn:
  - clusterName: "hacluster"
    resourceManager: [ "xxx01:8088", "xxx03:8088" ]
    jobHistoryServer: "xxx02:19888"

spark:
  sparkHistoryServer: [ "xxx02:18081" ]

Table field mapping exception

In order to achieve data abstraction at the scheduler layer, task-syncer provides data mapping templates, but there seems to be a problem with the mapping between airflowDB.task_instance.try_number and compassDB.task_instance.retry_times.

airflowDB.task_instance.try_number: execution count
compassDB.task_instance.retry_times: retry count

try_number=1 means executed once
retry_times=1 means executed once and retried once
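
A minimal sketch of the conversion these semantics imply (the helper name is hypothetical, not compass code):

    class RetryMappingSketch {
        // Airflow's try_number counts executions (first run = 1), while compass's
        // retry_times counts retries, so retry_times = try_number - 1, floored at 0.
        static int retryTimesFromTryNumber(int tryNumber) {
            return Math.max(tryNumber - 1, 0);
        }
    }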

[Question]: Problem of deploying compass

Before starting compass, is it necessary to have Hadoop, Spark, Canal, MySQL, Kafka, Redis, Zookeeper, and Elasticsearch all up and running,
and is it then enough to configure application-hadoop.yml and compass_env.sh and execute ./bin/start_all.sh?

getFileSystem in HDFSUtil sets an incorrect dfs.namenode.rpc-address

We can change

for (int i = 0; i < nameNodeConf.getNamenodes().length; i++) {
    String r = nameNodeConf.getNamenodes()[i];
    conf.set("dfs.namenode.rpc-address." + nameNodeConf.getNameservices() + "." + r,
            nameNodeConf.getNamenodesAddr()[i]);
}

to

for (int i = 0; i < nameNodeConf.getNamenodes().length; i++) {
    String r = nameNodeConf.getNamenodes()[i];
    conf.set("dfs.namenode.rpc-address." + nameNodeConf.getNameservices() + "." + r,
            nameNodeConf.getNamenodesAddr()[i]+":"+nameNodeConf.getPort());
}

Exception details are ignored

When I start the task-portal service and open the UI page, I get an error, but there is no error information in the service log. I guess the error is due to missing data in Elasticsearch.

error api: /compass/api/v1/job/graph
response: {"code":500,"msg":null,"data":null}

We can remove the try-catch blocks in the controllers; the error will then be caught and handled by the exceptionHandler in GlobalExceptionConfig, which records the error log.
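
A minimal sketch of that pattern with Spring's @RestControllerAdvice (the class and response shape are assumptions, not compass's actual GlobalExceptionConfig):

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.http.HttpStatus;
    import org.springframework.web.bind.annotation.ExceptionHandler;
    import org.springframework.web.bind.annotation.ResponseStatus;
    import org.springframework.web.bind.annotation.RestControllerAdvice;

    @RestControllerAdvice
    class GlobalExceptionHandlerSketch {
        private static final Logger log =
                LoggerFactory.getLogger(GlobalExceptionHandlerSketch.class);

        static class ApiResponse {
            final int code;
            final String msg;
            ApiResponse(int code, String msg) { this.code = code; this.msg = msg; }
        }

        // Log the exception once here, so controllers no longer need try-catch.
        @ExceptionHandler(Exception.class)
        @ResponseStatus(HttpStatus.INTERNAL_SERVER_ERROR)
        ApiResponse handle(Exception e) {
            log.error("Unhandled controller exception", e);
            return new ApiResponse(500, e.getMessage());
        }
    }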

compass deployment problem

1. Basic environment confirmation:
Do the Kafka, Redis, and Elasticsearch components need to be configured for compass with DolphinScheduler? (standalone mode)

2. Port 7050 is not found after startup, and no error message is displayed.
The console log information is as follows.

Log information when start_all.sh is executed:

./bin/start_all.sh
/opt/compass/dist/compass/task-application
32066
/opt/compass/dist/compass/task-canal
conf/metrics/
conf/example/
conf/spring/
conf/spring/tsdb/
conf/spring/tsdb/sql-map/
conf/spring/tsdb/sql/
conf/logback.xml
conf/metrics/Canal_instances_tmpl.json
conf/example/instance.properties
conf/canal.properties
conf/canal_local.properties
conf/spring/default-instance.xml
conf/spring/tsdb/mysql-tsdb.xml
conf/spring/tsdb/sql-map/sqlmap_snapshot.xml
conf/spring/tsdb/sql-map/sqlmap-config.xml
conf/spring/tsdb/sql-map/sqlmap_history.xml
conf/spring/tsdb/h2-tsdb.xml
conf/spring/tsdb/sql/create_table.sql
conf/spring/base-instance.xml
conf/spring/group-instance.xml
conf/spring/memory-instance.xml
conf/spring/file-instance.xml

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
plugin/
plugin/connector.rocketmq-1.1.6-jar-with-dependencies.jar

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
cd to /opt/compass/dist/compass/task-canal/bin for workaround relative path
LOG CONFIGURATION : /opt/compass/dist/compass/task-canal/bin/../conf/logback.xml
canal conf : /opt/compass/dist/compass/task-canal/bin/../conf/canal.properties
CLASSPATH :/opt/compass/dist/compass/task-canal/bin/../conf:/opt/compass/dist/compass/task-canal/bin/../lib/*:.:/usr/local/java/lib/dt.jar:/usr/local/java/lib/tools.jar
cd to /opt/compass/dist/compass/task-canal for continue
/opt/compass/dist/compass/task-canal-adapter
cd to /opt/compass/dist/compass/task-canal-adapter/bin for workaround relative path
CLASSPATH :/opt/compass/dist/compass/task-canal-adapter/bin/../conf:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/zookeeper-3.4.5.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/zkclient-0.10.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/validation-api-2.0.1.Final.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/ucp-11.2.0.4.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/tomcat-embed-websocket-8.5.29.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/tomcat-embed-el-8.5.29.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/tomcat-embed-core-8.5.29.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-webmvc-5.0.5.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-web-5.0.5.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-tx-5.0.5.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-security-crypto-5.0.4.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-orm-5.0.5.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-jdbc-5.0.5.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-jcl-5.0.5.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-expression-5.0.5.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-core-5.0.5.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-context-5.0.5.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-cloud-context-2.0.0.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-boot-starter-web-2.0.1.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-boot-starter-tomcat-2.0.1.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-boot-starter-logging-2.0.1.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-boot-starter-json-2.0.1.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-boot-starter-2.0.1.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-boot-autoconfigure-2.0.1.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-boot-2.0.1.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-beans-5.0.5.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/spring-aop-5.0.5.RELEASE.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/snakeyaml-1.19.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/slf4j-api-1.7.12.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/simplefan-11.2.0.4.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/protobuf-java-3.6.1.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/postgresql-42.1.4.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/osdt_core-11.2.0.4.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/osdt_cert-11.2.0.4.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/oro-2.0.8.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/oraclepki-11.2.0.4.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/ons-11.2.0.4.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/ojdbc6-11.2.0.4.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/netty-all-4.1.6.Final.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/netty-3.2.2.Final.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/mysql-connector-java-5.1.48.jar:/o
pt/compass/dist/compass/task-canal-adapter/bin/../lib/mssql-jdbc-7.0.0.jre8.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/logback-core-1.1.3.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/logback-classic-1.1.3.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/log4j-to-slf4j-2.17.0.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/log4j-api-2.17.0.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/log4j-1.2.17.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/jul-to-slf4j-1.7.25.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/jsr305-3.0.2.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/joda-time-2.9.4.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/jcl-over-slf4j-1.7.12.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/jboss-logging-3.3.2.Final.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/javax.annotation-api-1.3.2.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/jackson-module-parameter-names-2.9.5.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/jackson-datatype-jsr310-2.9.5.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/jackson-datatype-jdk8-2.9.5.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/jackson-databind-2.9.5.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/jackson-core-2.9.5.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/jackson-annotations-2.9.0.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/htrace-core-3.1.0-incubating.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/hibernate-validator-6.0.9.Final.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/hbase-shaded-client-1.1.2.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/guava-18.0.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/findbugs-annotations-1.3.9-1.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/fastjson2-2.0.7.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/druid-1.2.11.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/curator-recipes-2.10.0.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/curator-framework-2.10.0.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/curator-client-2.10.0.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/connector.core-1.1.6.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/commons-logging-1.2.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/commons-lang-2.6.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/commons-io-2.4.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/commons-codec-1.9.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/commons-beanutils-1.8.2.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/client-adapter.launcher-1.1.6.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/client-adapter.common-1.1.6.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/classmate-1.3.4.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/canal.protocol-1.1.6.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/canal.common-1.1.6.jar:/opt/compass/dist/compass/task-canal-adapter/bin/../lib/aviator-2.2.1.jar:.:/usr/local/java/lib/dt.jar:/usr/local/java/lib/tools.jar
cd to /opt/compass/dist/compass/task-canal-adapter for continue
/opt/compass/dist/compass/task-detect
nohup: appending output to 'nohup.out'
32222
/opt/compass/dist/compass/task-metadata
32251
/opt/compass/dist/compass/task-parser
32355
/opt/compass/dist/compass/task-portal
32413
/opt/compass/dist/compass/task-syncer
32489

Log information when stop_all.sh is executed:

./bin/stop_all.sh
/opt/compass/dist/compass/task-application
task-application 26076
/opt/compass/dist/compass/task-canal
ambari-35.snowleopard.cn: stopping canal 26154 ...
bin/stop.sh: line 68: kill: (26154) - No such process
Oook! cost:0
/opt/compass/dist/compass/task-canal-adapter
ambari-35.snowleopard.cn: stopping canal 26199 ...
Oook! cost:6
/opt/compass/dist/compass/task-detect
task-detect 26215
/opt/compass/dist/compass/task-metadata
task-metadata 26267
/opt/compass/dist/compass/task-parser
task-parser 26330
/opt/compass/dist/compass/task-portal
task-portal 26403
/opt/compass/dist/compass/task-syncer
task-syncer 26470

The rule for detecting abnormal CPU resource utilization does not seem to support Spark tasks with DRA enabled

For Spark tasks with DRA (Spark Dynamic Resource Allocation) enabled, the available compute time is not equivalent to totalCores * appTotalTime, because the number of live executors changes over the application's lifetime.

long maxExecutors = getMaxConcurrent();
long executorCores = Long.parseLong(application.getSparkExecutorCores());
long totalCores = executorCores * maxExecutors;

long appComputeMillisAvailable = totalCores * appTotalTime;
long inJobComputeMillisAvailable = totalCores * jobTime;
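
A minimal sketch of a DRA-aware alternative, with all types and fields assumed (e.g. executor lifetimes taken from SparkListenerExecutorAdded/Removed events): integrate core-milliseconds over each executor's actual lifetime instead of assuming the maximum executor count for the whole application.

    import java.util.List;

    class DraAwareBudget {
        // Hypothetical executor lifetime record, not a compass type.
        static class ExecutorSpan {
            long addedAtMillis;
            long removedAtMillis; // or the app end time if never removed
            int cores;
        }

        // Sum cores * lifetime per executor rather than maxExecutors * appTotalTime,
        // which overstates the available budget under dynamic allocation.
        static long computeMillisAvailable(List<ExecutorSpan> executors) {
            long total = 0L;
            for (ExecutorSpan e : executors) {
                total += (e.removedAtMillis - e.addedAtMillis) * (long) e.cores;
            }
            return total;
        }
    }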

The update field of the task table synced by the task-syncer module is empty

Where com.oppo.cloud.model.Task#setUpdateTime should be called, the code

task.setCreateTime(DataUtil.parseDate(data.get("create_time")));

needs to be changed to

task.setUpdateTime(DataUtil.parseDate(data.get("update_time")));

[Bug]: Support for the older ZooKeeper version 3.6.2

Error messages from task-metadata; the ZooKeeper version number in the pom has to be changed every time:
2023-05-05 16:11:35,079 ERROR 147541 [main-EventThread] [] : [org.apache.curator.framework.imps.EnsembleTracker:214] Invalid config event received: {server.1=qzcs240:2888:3888:participant, version=0, server.3=qzcs242:2888:3888:participant, server.2=qzcs241:2888:3888:participant}
2023-05-05 16:11:35,082 ERROR 147541 [main-EventThread] [] : [org.apache.curator.framework.imps.EnsembleTracker:214] Invalid config event received: {server.1=qzcs240:2888:3888:participant, version=0, server.3=qzcs242:2888:3888:participant, server.2=qzcs241:2888:3888:participant}
2023-05-05 16:15:33,911 ERROR 155497 [main-EventThread] [] : [org.apache.curator.framework.imps.EnsembleTracker:214] Invalid config event received: {server.1=qzcs240:2888:3888:participant, version=0, server.3=qzcs242:2888:3888:participant, server.2=qzcs241:2888:3888:participant}
2023-05-05 16:15:33,924 ERROR 155497 [main-EventThread] [] : [org.apache.curator.framework.imps.EnsembleTracker:214] Invalid config event received: {server.1=qzcs240:2888:3888:participant, version=0, server.3=qzcs242:2888:3888:participant, server.2=qzcs241:2888:3888:participant}

Flume configuration problem

I ran into the following problems when configuring Flume to collect DolphinScheduler logs into HDFS:

1. DolphinScheduler stores its logs in dynamic folders named after the current date (YYYYMMDD). Flume cannot collect logs from these dynamic folders, only from a static log folder.
(screenshot of the log path omitted)

2. When Flume is configured to prefix the uploaded HDFS file with the original file name, the original name carries the full path, so the file uploaded to HDFS gains many extra directory levels, which interferes with compass's log parsing.
For example, a log located at /home/dolphinscheduler/dolphinscheduler/worker-server/logs/a.log ends up on HDFS at /flume/dolphinscheduler/home/dolphinscheduler/dolphinscheduler/worker-server/logs/a.log
instead of /flume/dolphinscheduler/a.log.
(screenshot of the Flume configuration omitted)

Could you point out what is wrong with this configuration, or provide a template Flume configuration for log collection? Thanks.

[Feature]: support k8s deployment

Is your feature request related to a problem? Please describe as following.
Currently the services cannot run on k8s; the Kubernetes configuration has not been written yet, and the services now need to be deployed on k8s.

Describe the solution you'd like:
Deploy the applications on k8s.

Flink diagnosis plan

Hi, the earlier reply said Flink diagnosis would be open-sourced in July. Will there be a new version supporting it soon?
