This repository has been moved to https://github.com/pingcap/br.
License: Apache License 2.0
Is your feature request related to a problem? Please describe:
In some cases users need to grep Lightning's log, modify a stalled checkpoint, and clean up the importer's data directory, which requires a Lightning developer's help and confirmation before acting on internal data.
Describe the feature you'd like:
tidb-lightning-ctl could show the checkpoint status of each table (with some explanation) and help remove the importer's engine files.
While an import task is running, users could read the checkpoint status and confidently terminate the task themselves (to alter the configuration or to pause).
While no import task is running, tidb-lightning-ctl could output stalled tables and their status in a friendlier way, guiding users to copy-and-paste commands to clean up or continue, further reducing oncall load.
Describe alternatives you've considered:
tidb-lightning-ctl
tidb-lightning-ctl --checkpoint-error-destroy=all
Another way to guide the user would be for Lightning itself to offer guidance when it finds a stalled checkpoint.
Teachability, Documentation, Adoption, Optimization:
Is your feature request related to a problem? Please describe:
I noticed that Kafka can be used as a sink.
Would NATS be welcomed?
I think it's a nice complementary message queue: it's 100% Golang, fast, and HA/durable.
It would also be easy to implement, as NATS has a very simple API and a Golang client.
steps:
1.helm install tidb-lightning --name=tidb-lightning --namespace=tidb --version=1.0
2.kubectl logs -n tidb -l app.kubernetes.io/name=tidb-lightning-tidb-lightning
Error log:
[2020/01/22 03:55:13.266 +00:00] [WARN] [config.go:288] ["currently only per-task configuration can be applied, global configuration changes can only be made on startup"] ["global config changes"="[lightning.level]"]
[2020/01/22 03:55:13.272 +00:00] [INFO] [version.go:48] ["Welcome to lightning"] ["Release Version"=v3.0.8] ["Git Commit Hash"=f6512ee51da72e7b7c1f524c1d0079c24df78868] ["Git Branch"=HEAD] ["UTC Build Time"="2019-12-31 11:13:10"] ["Go Version"="go version go1.12 linux/amd64"]
[2020/01/22 03:55:13.272 +00:00] [INFO] [lightning.go:165] [cfg] [cfg="{\"id\":1579665313272107798,\"lightning\":{\"table-concurrency\":6,\"index-concurrency\":2,\"region-concurrency\":32,\"io-concurrency\":5,\"check-requirements\":true},\"tidb\":{\"host\":\"tidb-tidb.tidb\",\"port\":4000,\"user\":\"etl\",\"status-port\":10080,\"pd-addr\":\"tidb-pd.tidb:2379\",\"sql-mode\":\"ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION\",\"max-allowed-packet\":67108864,\"distsql-scan-concurrency\":100,\"build-stats-concurrency\":20,\"index-serial-scan-concurrency\":20,\"checksum-table-concurrency\":16},\"checkpoint\":{\"enable\":false,\"schema\":\"tidb_lightning_checkpoint\",\"driver\":\"file\",\"keep-after-success\":false},\"mydumper\":{\"read-block-size\":65536,\"batch-size\":107374182400,\"batch-import-ratio\":0,\"data-source-dir\":\"/tmp/tidb\",\"no-schema\":false,\"character-set\":\"auto\",\"csv\":{\"separator\":\",\",\"delimiter\":\"\\\"\",\"header\":false,\"trim-last-separator\":false,\"not-null\":false,\"null\":\"\\\\N\",\"backslash-escape\":true},\"case-sensitive\":false},\"black-white-list\":{\"do-tables\":null,\"do-dbs\":null,\"ignore-tables\":null,\"ignore-dbs\":[\"mysql\",\"information_schema\",\"performance_schema\",\"sys\"]},\"tikv-importer\":{\"addr\":\"tidb-importer.tidb:8287\",\"backend\":\"importer\",\"on-duplicate\":\"replace\"},\"post-restore\":{\"level-1-compact\":false,\"compact\":false,\"checksum\":true,\"analyze\":false},\"cron\":{\"switch-mode\":\"5m0s\",\"log-progress\":\"5m0s\"},\"routes\":null}"]
[2020/01/22 03:55:13.272 +00:00] [INFO] [lightning.go:194] ["load data source start"]
[2020/01/22 03:55:13.272 +00:00] [ERROR] [lightning.go:197] ["load data source failed"] [takeTime=205.803µs] [error="invalid data file, miss host table - /tmp/tidb/mydb.a.csv"]
[2020/01/22 03:55:13.272 +00:00] [ERROR] [main.go:59] ["tidb lightning encountered error"] [error="invalid data file, miss host table - /tmp/tidb/mydb.a.csv"] [errorVerbose="invalid data file, miss host table - /tmp/tidb/mydb.a.csv\ngithub.com/pingcap/tidb-lightning/lightning/mydump.(*mdLoaderSetup).setup\n\t/home/jenkins/agent/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/mydump/loader.go:202\ngithub.com/pingcap/tidb-lightning/lightning/mydump.NewMyDumpLoader\n\t/home/jenkins/agent/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/mydump/loader.go:105\ngithub.com/pingcap/tidb-lightning/lightning.(*Lightning).run\n\t/home/jenkins/agent/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/lightning.go:196\ngithub.com/pingcap/tidb-lightning/lightning.(*Lightning).RunOnce\n\t/home/jenkins/agent/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/lightning.go:138\nmain.main\n\t/home/jenkins/agent/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/cmd/tidb-lightning/main.go:56\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:200\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337"]
sync log failed sync /dev/stdout: invalid argument
mydb.a.csv
1,"East",32
2,"South",\N
3,"West",10
4,"North",39
5,"a",23
mydb.a-schema-create.sql
create table a(
id int,
region varchar(50),
count int)
tidb-lightning.toml
[lightning]
level = "debug"
[mydumper.csv]
separator = ','
delimiter = '"'
header = false
not-null = false
null = '\N'
backslash-escape = true
trim-last-separator = false
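The CSV settings above can be illustrated with a small sketch. This is not Lightning's actual parser, just Python's csv module configured the same way (separator ',', delimiter '"', and the unquoted marker \N treated as NULL):

```python
import csv
from io import StringIO

# Illustrative only: mimic the [mydumper.csv] settings on mydb.a.csv rows.
DATA = '1,"East",32\n2,"South",\\N\n3,"West",10\n'

def parse_rows(text, null_marker="\\N"):
    rows = []
    for fields in csv.reader(StringIO(text), delimiter=",", quotechar='"'):
        # An unquoted \N field becomes SQL NULL (None here).
        rows.append([None if f == null_marker else f for f in fields])
    return rows

print(parse_rows(DATA)[1])  # ['2', 'South', None]
```

Note that because \N is only recognized unquoted, a quoted "\N" would survive as a literal string.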
values.yaml
# Default values for tidb-lightning.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

# timezone is the default system timezone
timezone: UTC

image: pingcap/tidb-lightning:v3.0.8
imagePullPolicy: IfNotPresent

service:
  type: NodePort

# failFast causes the lightning pod to fail when any error happens.
# When disabled, the lightning pod keeps running on error to allow manual
# intervention; users have to check the logs to see the job status.
failFast: true

dataSource:
  local:
    nodeName: ttrms-dn-07
    hostPath: /tmp/tidb
  # The backup data is on a PVC from tidb-backup or scheduled backup, and is
  # not uploaded to cloud storage yet.
  # Note: when using this mode, lightning needs to be deployed in the same
  # namespace as the PVC, and `targetTidbCluster.namespace` needs to be
  # configured explicitly.
  adhoc: {}
  # pvcName: tidb-cluster-scheduled-backup
  # backupName: scheduled-backup-20190822-041004
  remote: {}
  # rcloneImage: tynor88/rclone
  # storageClassName: tidb-block
  # storage: 100Gi
  # secretName: cloud-storage-secret
  # path: s3:bench-data-us/sysbench/sbtest_16_1e7.tar.gz

targetTidbCluster:
  name: tidb
  # namespace is the target tidb cluster namespace; it can be omitted if
  # lightning is deployed in the same namespace as the target tidb cluster.
  namespace: ""
  user: etl

resources: {}
  # limits:
  #   cpu: 16000m
  #   memory: 8Gi
  # requests:
  #   cpu: 16000m
  #   memory: 8Gi

nodeSelector: {}
annotations: {}
tolerations: []
affinity: {}

backend: importer  # importer | tidb

config: |
  [lightning]
  level = "debug"
  [mydumper.csv]
  separator = ','
  delimiter = '"'
  header = false
  not-null = false
  null = '\N'
  backslash-escape = true
  trim-last-separator = false
It worked fine when creating the empty tables directly in TiDB in the first place, and then setting [mydumper] no-schema = true in tidb-lightning.toml.
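The workaround in practice: pre-create the tables through a MySQL client, then tell Lightning to skip the schema files. `no-schema` is a real [mydumper] option; the snippet just restates the step above:

```toml
[mydumper]
no-schema = true
```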
Please answer these questions before submitting your issue. Thanks!
# First setup a cluster with tiflash
# Create table and data in tidb and add tiflash replica
mysql> CREATE DATABASE IF NOT EXISTS `test`;
mysql> CREATE TABLE `test`.`t` (`Year` year(4) DEFAULT 0 NOT NULL) DEFAULT CHARSET=utf8mb4;
mysql> ALTER TABLE t set tiflash replica 1;
mysql> select * from information_schema.tiflash_replica; # Use this to check tiflash replica status
# Create a file with some rows
# Load data with tidb-lightning
What did you expect to see?
TiFlash v3.1 does not support the IngestSST command. I hope tidb-lightning can check whether the target table has a TiFlash replica and, if so, report an error telling the user this is not supported for now.
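The requested pre-check could be sketched as follows. `information_schema.tiflash_replica` and its TABLE_SCHEMA/TABLE_NAME columns are real TiDB names, but the function and error message here are illustrative assumptions, not Lightning code:

```python
# SQL that a pre-check could run against the target TiDB cluster.
CHECK_SQL = (
    "SELECT TABLE_SCHEMA, TABLE_NAME "
    "FROM information_schema.tiflash_replica "
    "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s"
)

def tiflash_import_error(replica_rows):
    """Given rows returned by CHECK_SQL, return an error string or None."""
    if replica_rows:
        schema, table = replica_rows[0]
        return (f"table `{schema}`.`{table}` has a TiFlash replica; "
                "IngestSST is not supported by TiFlash v3.1")
    return None
```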
What did you see instead?
Lightning exits with errors and the data is NOT loaded.
[2020/02/27 13:20:45.637 +08:00] [INFO] [restore.go:473] [progress] [files="1/1 (100.0%)"] [tables="0/1 (0.0%)"] [speed(MiB/s)=0.00000007947365515542592] [state=post-processing] []
[2020/02/27 13:20:52.188 +08:00] [WARN] [backend.go:286] ["import spuriously failed, going to retry again"] [engineTag=`test`.`t`:0] [engineUUID=c32475d3-88af-5442-b950-a0e82e40d337] [retryCnt=2] [error="rpc error: code = Unknown desc = ImportJobFailed(\"retry 5 times still 1 ranges failed\")"]
[2020/02/27 13:20:55.188 +08:00] [ERROR] [restore.go:1404] ["import and cleanup engine failed"] [engineTag=`test`.`t`:0] [engineUUID=c32475d3-88af-5442-b950-a0e82e40d337] [takeTime=3m9.505358837s] [error="[c32475d3-88af-5442-b950-a0e82e40d337] import reach max retry 3 and still failed: rpc error: code = Unknown desc = ImportJobFailed(\"retry 5 times still 1 ranges failed\")"]
[2020/02/27 13:20:55.188 +08:00] [ERROR] [restore.go:795] ["import whole table failed"] [table=`test`.`t`] [takeTime=3m9.526326415s] [error="[c32475d3-88af-5442-b950-a0e82e40d337] import reach max retry 3 and still failed: rpc error: code = Unknown desc = ImportJobFailed(\"retry 5 times still 1 ranges failed\")"]
Versions of the cluster
TiDB-Lightning version (run tidb-lightning -V):
./importer/tidb-lightning -V
Release Version: v3.0.5-2-g605760d
Git Commit Hash: 605760d1b2025d1e1a8b7d0c668c74863d7d1271
Git Branch: master
UTC Build Time: 2019-12-04 07:23:09
Go Version: go version go1.13.1 linux/amd64
TiKV-Importer version (run tikv-importer -V):
./importer/tikv-importer -V
TiKV Importer 4.0.0-alpha
TiKV version (run tikv-server -V):
Release Version: 3.1.0-beta.1
Git Commit Hash: d1f9d486aa8e4ab29652e80a32c6d152db653dec
Git Commit Branch: release-3.1
UTC Build Time: 2020-02-19 03:39:48
TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):
Release Version: v3.1.0-beta.1-5-gf90a7892d
Git Commit Hash: f90a7892d2d69974f17f01fc189937bb1e04c4f1
Git Branch: release-3.1
UTC Build Time: 2020-02-24 03:08:18
Other interesting information (system version, hardware config, etc):
Operation logs
- tidb-lightning.log for TiDB-Lightning if possible
- tikv-importer.log from TiKV-Importer if possible
Configuration of the cluster and the task
- tidb-lightning.toml for TiDB-Lightning if possible
- tikv-importer.toml for TiKV-Importer if possible
- inventory.ini if deployed by Ansible
Screenshot/exported-PDF of Grafana dashboard or metrics' graph in Prometheus for TiDB-Lightning if possible
Is your feature request related to a problem? Please describe:
To remove the importer component, we need to split & scatter regions in Lightning before generating the SSTs, so the sortkv design needs to be confirmed first.
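The split-and-scatter step being proposed can be sketched roughly like this. Everything below is an illustrative assumption: real Lightning would derive split points from size properties of the sorted KV data, not from linear interpolation over the key range:

```python
def split_keys(start: bytes, end: bytes, parts: int):
    """Pick parts-1 byte-string keys dividing [start, end) evenly.

    Illustration only: regions would be split at these keys and then
    scattered across stores before SST generation and ingest.
    """
    width = max(len(start), len(end))
    lo = int.from_bytes(start.ljust(width, b"\x00"), "big")
    hi = int.from_bytes(end.ljust(width, b"\x00"), "big")
    step = (hi - lo) // parts
    return [(lo + i * step).to_bytes(width, "big") for i in range(1, parts)]

keys = split_keys(b"\x00", b"\xff", 4)  # 3 split points inside the range
```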
Please answer these questions before submitting your issue. Thanks!
I imported 1 80GiB table with TiDB lightning. It worked as expected, but there was something in the logs that could be misleading:
2019/04/19 10:57:24.508 [info] progress: 288/314 chunks (91.7%), 0/1 tables (0.0%), speed 59.42 MiB/s, remaining 1m48s
2019/04/19 10:58:15.035 [info] [`ontime`.`ontime`:0] restore chunk #288 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000290.sql:0) takes 1m6.786456701s (read: 16.399068996s, encode: 46.956198114s, deliver: 26.986902394s, size: 365724609, kvs: 587359)
2019/04/19 10:58:15.924 [info] [`ontime`.`ontime`:0] restore chunk #289 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000291.sql:0) takes 1m7.308596365s (read: 16.592581554s, encode: 47.53086694s, deliver: 26.885925253s, size: 365842044, kvs: 588326)
2019/04/19 10:58:17.092 [info] [`ontime`.`ontime`:0] restore chunk #290 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000292.sql:0) takes 1m7.909898107s (read: 16.6468798s, encode: 47.976796893s, deliver: 27.531606735s, size: 366548691, kvs: 589641)
2019/04/19 10:58:18.002 [info] [`ontime`.`ontime`:0] restore chunk #291 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000293.sql:0) takes 1m7.370414962s (read: 16.596332265s, encode: 47.452088895s, deliver: 27.378736932s, size: 366442580, kvs: 589587)
2019/04/19 10:58:18.551 [info] [`ontime`.`ontime`:0] restore chunk #292 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000294.sql:0) takes 1m7.276636804s (read: 16.654740381s, encode: 47.383267823s, deliver: 27.353697896s, size: 366748822, kvs: 590417)
2019/04/19 10:58:19.278 [info] [`ontime`.`ontime`:0] restore chunk #293 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000295.sql:0) takes 1m7.483649827s (read: 16.836955554s, encode: 47.4216458s, deliver: 27.177977163s, size: 366565528, kvs: 590363)
2019/04/19 10:58:19.346 [info] [`ontime`.`ontime`:0] restore chunk #294 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000296.sql:0) takes 1m7.214924554s (read: 16.676707627s, encode: 47.333561636s, deliver: 27.288494748s, size: 365748469, kvs: 587515)
2019/04/19 10:58:19.496 [info] [`ontime`.`ontime`:0] restore chunk #297 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000299.sql:0) takes 1m6.847181108s (read: 16.794947935s, encode: 46.822892583s, deliver: 27.542620217s, size: 364553156, kvs: 583875)
2019/04/19 10:58:19.739 [info] [`ontime`.`ontime`:0] restore chunk #296 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000298.sql:0) takes 1m7.346982713s (read: 16.50881386s, encode: 47.765438811s, deliver: 26.720869217s, size: 363710758, kvs: 581874)
2019/04/19 10:58:19.770 [info] [`ontime`.`ontime`:0] restore chunk #298 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000300.sql:0) takes 1m6.965520634s (read: 16.600162769s, encode: 47.249858277s, deliver: 26.964208405s, size: 363796329, kvs: 581389)
2019/04/19 10:58:19.823 [info] [`ontime`.`ontime`:0] restore chunk #295 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000297.sql:0) takes 1m7.486180866s (read: 16.75539177s, encode: 47.662811344s, deliver: 26.975854933s, size: 364985011, kvs: 586013)
2019/04/19 10:58:20.153 [info] [`ontime`.`ontime`:0] restore chunk #299 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000301.sql:0) takes 1m7.143061626s (read: 16.634920628s, encode: 47.393688964s, deliver: 27.30845416s, size: 364444900, kvs: 583960)
2019/04/19 10:58:20.832 [info] [`ontime`.`ontime`:0] restore chunk #300 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000302.sql:0) takes 1m7.086168968s (read: 16.669206333s, encode: 47.236513242s, deliver: 26.907263486s, size: 364217460, kvs: 583248)
2019/04/19 10:58:20.884 [info] [`ontime`.`ontime`:0] restore chunk #301 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000303.sql:0) takes 1m6.543379601s (read: 16.573656478s, encode: 46.790259167s, deliver: 26.609936692s, size: 364065013, kvs: 582580)
2019/04/19 10:58:23.653 [info] [`ontime`.`ontime`:0] restore chunk #303 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000305.sql:0) takes 1m7.074397327s (read: 16.522003924s, encode: 47.51934977s, deliver: 26.092176823s, size: 364322952, kvs: 584096)
2019/04/19 10:58:23.679 [info] [`ontime`.`ontime`:0] restore chunk #302 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000304.sql:0) takes 1m7.145327601s (read: 16.673329281s, encode: 47.378547845s, deliver: 25.859322236s, size: 364273786, kvs: 583360)
2019/04/19 10:58:44.349 [info] [`ontime`.`ontime`:0] restore chunk #312 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000314.sql:0) takes 24.609683888s (read: 6.566946196s, encode: 17.401519162s, deliver: 3.243461405s, size: 164967753, kvs: 265477)
2019/04/19 10:59:08.546 [info] [`ontime`.`ontime`:0] restore chunk #304 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000306.sql:0) takes 53.510622966s (read: 14.210565472s, encode: 37.813905703s, deliver: 7.562855261s, size: 364593196, kvs: 584383)
2019/04/19 10:59:09.243 [info] [`ontime`.`ontime`:0] restore chunk #305 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000307.sql:0) takes 53.318288989s (read: 14.143962845s, encode: 37.755656104s, deliver: 7.169037636s, size: 364482504, kvs: 583659)
2019/04/19 10:59:10.369 [info] [`ontime`.`ontime`:0] restore chunk #306 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000308.sql:0) takes 53.276026626s (read: 14.063708065s, encode: 37.782428554s, deliver: 6.855892683s, size: 364380421, kvs: 583487)
2019/04/19 10:59:11.140 [info] [`ontime`.`ontime`:0] restore chunk #307 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000309.sql:0) takes 53.137725106s (read: 14.031394057s, encode: 37.755501518s, deliver: 6.446930186s, size: 364286151, kvs: 583787)
2019/04/19 10:59:11.377 [info] [`ontime`.`ontime`:0] restore chunk #308 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000310.sql:0) takes 52.825191191s (read: 14.087666875s, encode: 37.352905982s, deliver: 6.288371376s, size: 364729869, kvs: 584423)
2019/04/19 10:59:11.632 [info] [`ontime`.`ontime`:0] restore chunk #310 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000312.sql:0) takes 52.285204777s (read: 13.954230642s, encode: 36.975168395s, deliver: 6.1849906s, size: 365445973, kvs: 586356)
2019/04/19 10:59:11.653 [info] [`ontime`.`ontime`:0] restore chunk #309 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000311.sql:0) takes 52.374601681s (read: 14.040815033s, encode: 37.009166342s, deliver: 6.12540251s, size: 365396877, kvs: 586466)
2019/04/19 10:59:11.861 [info] [`ontime`.`ontime`:0] restore chunk #311 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.000000313.sql:0) takes 52.36530689s (read: 13.969956401s, encode: 37.046355277s, deliver: 6.077134501s, size: 365602922, kvs: 587464)
2019/04/19 10:59:12.162 [info] [`ontime`.`ontime`:0] restore chunk #313 (/mnt/evo970/data-sets/ontime-data/ontime.ontime.00001.sql:0) takes 52.391710917s (read: 14.034153272s, encode: 36.999025501s, deliver: 5.998670794s, size: 365583895, kvs: 587544)
2019/04/19 10:59:12.162 [info] [`ontime`.`ontime`:0] encode kv data and write takes 21m47.596367835s (read 80539338461, written 114560022511)
2019/04/19 10:59:12.164 [info] [`ontime`.`ontime`:0] [9d2626e1-dca4-5efb-a07a-8336c493a62a] engine close
2019/04/19 10:59:16.075 [info] [`ontime`.`ontime`:0] [9d2626e1-dca4-5efb-a07a-8336c493a62a] engine close takes 3.910884297s
2019/04/19 10:59:16.075 [info] [`ontime`.`ontime`:0] flush kv deliver ...
2019/04/19 10:59:16.075 [info] [`ontime`.`ontime`:0] [9d2626e1-dca4-5efb-a07a-8336c493a62a] import
2019/04/19 11:02:24.499 [info] switch to tikv Import mode takes 4.435986ms
2019/04/19 11:02:24.499 [info] progress: 314/314 chunks (100.0%), 0/1 tables (0.0%), speed 51.21 MiB/s, post-processing
2019/04/19 11:04:31.921 [info] [`ontime`.`ontime`:0] [9d2626e1-dca4-5efb-a07a-8336c493a62a] import takes 5m15.845531805s
2019/04/19 11:04:31.921 [info] [`ontime`.`ontime`:0] [9d2626e1-dca4-5efb-a07a-8336c493a62a] cleanup
2019/04/19 11:04:34.165 [info] [`ontime`.`ontime`:0] [9d2626e1-dca4-5efb-a07a-8336c493a62a] cleanup takes 2.244043243s
2019/04/19 11:04:34.165 [info] [`ontime`.`ontime`:0] kv deliver all flushed, takes 5m18.090368706s
2019/04/19 11:04:34.165 [info] [`ontime`.`ontime`] import whole table takes 27m9.599757383s
2019/04/19 11:04:34.165 [info] [`ontime`.`ontime`:-1] [c5fdba2b-e6e3-5ace-b2f0-60ab8205cdd9] engine close
2019/04/19 11:04:34.175 [info] [`ontime`.`ontime`:-1] [c5fdba2b-e6e3-5ace-b2f0-60ab8205cdd9] engine close takes 9.106235ms
2019/04/19 11:04:34.175 [info] [`ontime`.`ontime`:-1] flush kv deliver ...
2019/04/19 11:04:34.175 [info] [`ontime`.`ontime`:-1] [c5fdba2b-e6e3-5ace-b2f0-60ab8205cdd9] import
2019/04/19 11:04:34.235 [info] [`ontime`.`ontime`:-1] [c5fdba2b-e6e3-5ace-b2f0-60ab8205cdd9] import takes 60.051593ms
2019/04/19 11:04:34.235 [info] [`ontime`.`ontime`:-1] [c5fdba2b-e6e3-5ace-b2f0-60ab8205cdd9] cleanup
2019/04/19 11:04:34.236 [info] [`ontime`.`ontime`:-1] [c5fdba2b-e6e3-5ace-b2f0-60ab8205cdd9] cleanup takes 689.512µs
2019/04/19 11:04:34.236 [info] [`ontime`.`ontime`:-1] kv deliver all flushed, takes 60.87098ms
2019/04/19 11:04:34.244 [info] [ontime.ontime] ALTER TABLE `ontime`.`ontime` AUTO_INCREMENT=725579472
2019/04/19 11:04:34.424 [info] [`ontime`.`ontime`] alter table set auto_id takes 179.78494ms
2019/04/19 11:04:34.424 [info] [`ontime`.`ontime`] local checksum [sum:17109095500510594957, kvs:183953732, size:114560022511]
2019/04/19 11:04:34.444 [info] [`ontime`.`ontime`] doing remote checksum
2019/04/19 11:07:24.501 [info] switch to tikv Import mode takes 6.442484ms
2019/04/19 11:07:24.501 [info] progress: 314/314 chunks (100.0%), 0/1 tables (0.0%), speed 42.67 MiB/s, post-processing
2019/04/19 11:10:05.558 [warning] query ADMIN CHECKSUM TABLE `ontime`.`ontime` [error] Error 9005: Region is unavailable[try again later]
2019/04/19 11:10:05.559 [warning] query ADMIN CHECKSUM TABLE `ontime`.`ontime` retry 1
2019/04/19 11:11:32.870 [info] [`ontime`.`ontime`] do checksum takes 6m58.44575803s
2019/04/19 11:11:32.895 [info] [`ontime`.`ontime`] checksum pass, {bytes:114560022511 kvs:183953732 checksum:17109095500510594957} takes 6m58.469971683s
2019/04/19 11:11:32.895 [info] [`ontime`.`ontime`] analyze
2019/04/19 11:12:24.500 [info] switch to tikv Import mode takes 5.617193ms
2019/04/19 11:12:24.501 [info] progress: 314/314 chunks (100.0%), 0/1 tables (0.0%), speed 36.58 MiB/s, post-processing
2019/04/19 11:17:24.501 [info] switch to tikv Import mode takes 6.383353ms
2019/04/19 11:17:24.501 [info] progress: 314/314 chunks (100.0%), 0/1 tables (0.0%), speed 32.00 MiB/s, post-processing
2019/04/19 11:22:24.501 [info] switch to tikv Import mode takes 6.408816ms
2019/04/19 11:22:24.501 [info] progress: 314/314 chunks (100.0%), 0/1 tables (0.0%), speed 28.45 MiB/s, post-processing
This status line becomes less useful when in post-processing mode:
2019/04/19 11:22:24.501 [info] progress: 314/314 chunks (100.0%), 0/1 tables (0.0%), speed 28.45 MiB/s, post-processing
It looks like the speed is averaged over the whole run, so it appears that things are slowing down. It would be nice to either omit this line during post-processing, or at least remove the speed column.
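A quick sketch of the suspected computation (an assumption, but consistent with the decaying numbers in the log): a cumulative average keeps falling because post-processing adds wall time without delivering new bytes:

```python
def average_speed_mib_s(total_bytes: int, elapsed_seconds: float) -> float:
    """Cumulative average: total delivered bytes over total elapsed time."""
    return total_bytes / elapsed_seconds / (1 << 20)

total = 114560022511  # KV bytes written, from the log above
during_restore = average_speed_mib_s(total, 25 * 60)   # illustrative timings
after_postproc = average_speed_mib_s(total, 45 * 60)
# after_postproc < during_restore, even though the import itself is done
```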
Release Version: v2.1.8-4-gea3b27e
Git Commit Hash: ea3b27e616ee33a74c216cf4d295782a01124ebf
Git Branch: master
UTC Build Time: 2019-04-17 05:22:40
Go Version: go version go1.12 linux/amd64
In #230 the sync error does not cause exit 1. However it still shows in the log, which is confusing.
[2019/12/16 17:12:34.369 +00:00] [INFO] [main.go:61] ["tidb lightning exit"]
sync log failed sync /dev/stdout: invalid argument
In #230 I have example code that would ensure just this known error is discarded; we could apply that before logging.
Is your feature request related to a problem? Please describe:
tidb.password can be set in the configuration file or via command-line options; however, with both methods the password is set in plain text, which is not secure.
Describe the feature you'd like:
To integrate with k8s and dbaas, we usually create a Secret for sensitive info, and we can inject the password as an environment variable into the lightning container.
So this feature requests support for reading the password from the environment variable first, falling back to the config file if it is not set; this way, deployments on non-k8s platforms can stay as they are.
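A minimal sketch of the requested fallback order, assuming a hypothetical TIDB_PASSWORD variable name (not an established Lightning convention):

```python
import os

def resolve_password(config_password: str, env_var: str = "TIDB_PASSWORD"):
    """Prefer the environment variable; fall back to the config file value."""
    env_value = os.environ.get(env_var)
    return env_value if env_value is not None else config_password
```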
Please answer these questions before submitting your issue. Thanks!
What did you do? If possible, provide a recipe for reproducing the error.
I extracted the logic of chunk restore and found two deadlocks and one goroutine leak. (See the detailed logic code.)
Versions of the cluster
TiDB-Lightning version (run tidb-lightning -V
):
master
Minor issue:
tikv-importer uses a capital -C for the config file option, whereas tidb-lightning uses a lower-case -c. Is it possible to make these consistent?
Additional research:
-C
A feature request for your roadmap:
Can it be possible to restore directly from a mydumper backup stored in S3? In most cloud deployments this is where user backups will be stored (the S3 API is implemented by many other object stores).
Support restore to TiDB via S3.
GanttStart: 2020-07-27
GanttDue: 2020-09-04
GanttProgress: 100%
Please answer these questions before submitting your issue. Thanks!
I have a shell script that imports a ~80GiB sample dataset into TiDB using Lightning for testing. It uses an unsupported configuration of 1x TiDB, 1x TiKV, 1x PD:
rm -rf data/pd
rm -rf data/tikv
./bin/tidb-server --path="127.0.0.1:2379" -store tikv &
./bin/pd-server --data-dir=data/pd --log-file=logs/pd.log &
./bin/tikv-server --pd="127.0.0.1:2379" --data-dir=data/tikv -A 127.0.0.1:20165 --log-file=logs/tikv.log &
./bin/tikv-importer --import-dir tmp --log-level info &
./bin/tidb-lightning -d data-sets/ontime-data -importer localhost:20160
It was working without error until recently. Now it repeatedly reports errors about regions not being fully replicated (of course not: it's a single TiKV server). It still manages to complete, though, and all the data is there.
So technically these should be classed as warnings, not errors. But it would be a nice-to-have if it knew from PD that I intentionally have only one copy of the data.
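For a deliberately single-replica test cluster like this one, one possible workaround (not mentioned in the issue) is to tell PD that a single copy is intended, so the replication expectation matches reality. In pd.toml, or at runtime with `pd-ctl config set max-replicas 1`:

```toml
# pd.toml: declare that one replica per region is intentional
[replication]
max-replicas = 1
```

Whether tikv-importer consults this setting for its "not fully replicated" check is not verified here.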
[2020/01/10 13:13:29.352 -07:00] [INFO] [restore.go:1792] ["restore file completed"] [table=`ontime`.`ontime`] [engineNumber=0] [fileIndex=313] [path=/mnt/evo860/data-sets/ontime-data/ontime.ontime.00001.sql:0] [readDur=11.14167766s] [encodeDur=35.341429547s] [deliverDur=8.332825462s] [checksum="{cksum=10892752876034267670,size=365583895,kvs=587544}"] [takeTime=36.233777719s] []
[2020/01/10 13:13:29.352 -07:00] [INFO] [restore.go:928] ["encode kv data and write completed"] [table=`ontime`.`ontime`] [engineNumber=0] [read=80539338461] [written=114560022511] [takeTime=15m55.575505508s] []
[2020/01/10 13:13:29.352 -07:00] [INFO] [backend.go:266] ["engine close start"] [engineTag=`ontime`.`ontime`:0] [engineUUID=9d2626e1-dca4-5efb-a07a-8336c493a62a]
[2020/01/10 13:13:29.467 -07:00] [INFO] [kv_importer.rs:103] ["close engine"] [engine="EngineFile { uuid: 9d2626e1-dca4-5efb-a07a-8336c493a62a, path: EnginePath { save: \"/mnt/evo860/tmp/9d2626e1-dca4-5efb-a07a-8336c493a62a\", temp: \"/mnt/evo860/tmp/.temp/9d2626e1-dca4-5efb-a07a-8336c493a62a\" } }"]
[2020/01/10 13:13:29.467 -07:00] [INFO] [backend.go:268] ["engine close completed"] [engineTag=`ontime`.`ontime`:0] [engineUUID=9d2626e1-dca4-5efb-a07a-8336c493a62a] [takeTime=114.822851ms] []
[2020/01/10 13:13:29.467 -07:00] [INFO] [restore.go:784] ["restore engine completed"] [table=`ontime`.`ontime`] [engineNumber=0] [takeTime=15m55.690475619s] []
[2020/01/10 13:13:29.467 -07:00] [INFO] [restore.go:1403] ["import and cleanup engine start"] [engineTag=`ontime`.`ontime`:0] [engineUUID=9d2626e1-dca4-5efb-a07a-8336c493a62a]
[2020/01/10 13:13:29.467 -07:00] [INFO] [backend.go:280] ["import start"] [engineTag=`ontime`.`ontime`:0] [engineUUID=9d2626e1-dca4-5efb-a07a-8336c493a62a] [retryCnt=0]
[2020/01/10 13:13:29.468 -07:00] [INFO] [util.rs:396] ["connecting to PD endpoint"] [endpoints=127.0.0.1:2379]
[2020/01/10 13:13:29.468 -07:00] [INFO] [<unknown>] ["New connected subchannel at 0x7efb9c84b150 for subchannel 0x7efb9c42f000"]
[2020/01/10 13:13:29.471 -07:00] [INFO] [util.rs:396] ["connecting to PD endpoint"] [endpoints=http://127.0.0.1:2379]
[2020/01/10 13:13:29.471 -07:00] [INFO] [<unknown>] ["New connected subchannel at 0x7efb9c84b120 for subchannel 0x7efb9c42f400"]
[2020/01/10 13:13:29.472 -07:00] [INFO] [util.rs:396] ["connecting to PD endpoint"] [endpoints=http://127.0.0.1:2379]
[2020/01/10 13:13:29.472 -07:00] [INFO] [<unknown>] ["New connected subchannel at 0x7efb9c84b0c0 for subchannel 0x7efb9c42f800"]
[2020/01/10 13:13:29.472 -07:00] [INFO] [util.rs:455] ["connected to PD leader"] [endpoints=http://127.0.0.1:2379]
[2020/01/10 13:13:29.472 -07:00] [INFO] [util.rs:384] ["all PD endpoints are consistent"] [endpoints="[\"127.0.0.1:2379\"]"]
[2020/01/10 13:13:32.195 -07:00] [INFO] [import.rs:56] [start] [tag="[ImportJob 9d2626e1-dca4-5efb-a07a-8336c493a62a]"]
[2020/01/10 13:13:32.195 -07:00] [INFO] [prepare.rs:76] [start] [tag="[PrepareJob 9d2626e1-dca4-5efb-a07a-8336c493a62a]"]
[2020/01/10 13:13:32.206 -07:00] [INFO] [prepare.rs:80] ["get size properties"] [size=117503282223] [tag="[PrepareJob 9d2626e1-dca4-5efb-a07a-8336c493a62a]"]
[2020/01/10 13:13:32.206 -07:00] [INFO] [prepare.rs:172] [start] [range="RangeInfo { range: end: 7480000000000000FF2F5F728000000000FF2B0C2C0000000000FAFFFFFFFFA1E728D2, size: 536942611 }"] [tag="[PrepareRangeJob 9d2626e1-dca4-5efb-a07a-8336c493a62a:0]"]
[2020/01/10 13:13:32.206 -07:00] [INFO] [prepare.rs:190] [prepare] [takes=394.169µs] [tag="[PrepareRangeJob 9d2626e1-dca4-5efb-a07a-8336c493a62a:0]"]
[2020/01/10 13:13:32.206 -07:00] [INFO] [prepare.rs:172] [start] [range="RangeInfo { range: start: 7480000000000000FF2F5F728000000000FF2B0C2C0000000000FAFFFFFFFFA1E728D2 end: 7480000000000000FF2F5F728000000000FF6C0C880000000000FAFFFFFFFFA1E728D2, size: 536909914 }"] [tag="[PrepareRangeJob 9d2626e1-dca4-5efb-a07a-8336c493a62a:1]"]
[2020/01/10 13:13:32.207 -07:00] [INFO] [<unknown>] ["New connected subchannel at 0x7ef4bf44a150 for subchannel 0x7efb9c42fc00"]
[2020/01/10 13:13:32.218 -07:00] [INFO] [prepare.rs:288] [split] [at=7480000000000000FF2F5F728000000000FF6C0C880000000000FAFFFFFFFFA1E728D2] [region="RegionInfo { region: id: 2 start_key: 7480000000000000FF2F00000000000000F8 region_epoch { conf_ver: 1 version: 22 } peers { id: 3 store_id: 1 }, leader: Some(id: 3 store_id: 1) }"] [tag="[PrepareRangeJob 9d2626e1-dca4-5efb-a07a-8336c493a62a:1]"]
[2020/01/10 13:13:32.237 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 46 is not fully replicated\") }))"]
[2020/01/10 13:13:32.237 -07:00] [INFO] [util.rs:396] ["connecting to PD endpoint"] [endpoints=http://127.0.0.1:2379]
[2020/01/10 13:13:32.237 -07:00] [INFO] [util.rs:396] ["connecting to PD endpoint"] [endpoints=http://127.0.0.1:2379]
[2020/01/10 13:13:32.238 -07:00] [INFO] [util.rs:455] ["connected to PD leader"] [endpoints=http://127.0.0.1:2379]
[2020/01/10 13:13:32.238 -07:00] [INFO] [util.rs:175] ["heartbeat sender and receiver are stale, refreshing ..."]
[2020/01/10 13:13:32.238 -07:00] [WARN] [util.rs:194] ["updating PD client done"] [spend=1.235315ms]
[2020/01/10 13:13:32.238 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 46 is not fully replicated\") }))"]
[2020/01/10 13:13:32.238 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 46 is not fully replicated\") }))"]
[2020/01/10 13:13:32.238 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 46 is not fully replicated\") }))"]
[2020/01/10 13:13:32.239 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 46 is not fully replicated\") }))"]
[2020/01/10 13:13:32.239 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 46 is not fully replicated\") }))"]
[2020/01/10 13:13:32.239 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 46 is not fully replicated\") }))"]
[2020/01/10 13:13:32.239 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 46 is not fully replicated\") }))"]
[2020/01/10 13:13:32.239 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 46 is not fully replicated\") }))"]
[2020/01/10 13:13:32.239 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 46 is not fully replicated\") }))"]
[2020/01/10 13:13:32.240 -07:00] [WARN] [prepare.rs:313] ["scatter region failed"] [err="[/rust/git/checkouts/tikv-71ea2042335c4528/2e0379c/components/pd_client/src/util.rs:334]: fail to request"] [region=46] [tag="[PrepareRangeJob 9d2626e1-dca4-5efb-a07a-8336c493a62a:1]"]
[2020/01/10 13:13:33.240 -07:00] [INFO] [prepare.rs:190] [prepare] [takes=1.033672901s] [tag="[PrepareRangeJob 9d2626e1-dca4-5efb-a07a-8336c493a62a:1]"]
[2020/01/10 13:13:33.240 -07:00] [INFO] [prepare.rs:172] [start] [range="RangeInfo { range: start: 7480000000000000FF2F5F728000000000FF6C0C880000000000FAFFFFFFFFA1E728D2 end: 7480000000000000FF2F5F728000000000FF92A2540000000000FAFFFFFFFFA1E728D2, size: 536913135 }"] [tag="[PrepareRangeJob 9d2626e1-dca4-5efb-a07a-8336c493a62a:2]"]
[2020/01/10 13:13:33.253 -07:00] [INFO] [prepare.rs:288] [split] [at=7480000000000000FF2F5F728000000000FF92A2540000000000FAFFFFFFFFA1E728D2] [region="RegionInfo { region: id: 2 start_key: 7480000000000000FF2F5F728000000000FF6C0C880000000000FA region_epoch { conf_ver: 1 version: 23 } peers { id: 3 store_id: 1 }, leader: Some(id: 3 store_id: 1) }"] [tag="[PrepareRangeJob 9d2626e1-dca4-5efb-a07a-8336c493a62a:2]"]
[2020/01/10 13:13:33.270 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 48 is not fully replicated\") }))"]
[2020/01/10 13:13:33.270 -07:00] [INFO] [util.rs:396] ["connecting to PD endpoint"] [endpoints=http://127.0.0.1:2379]
[2020/01/10 13:13:33.271 -07:00] [INFO] [util.rs:396] ["connecting to PD endpoint"] [endpoints=http://127.0.0.1:2379]
[2020/01/10 13:13:33.271 -07:00] [INFO] [util.rs:455] ["connected to PD leader"] [endpoints=http://127.0.0.1:2379]
[2020/01/10 13:13:33.271 -07:00] [INFO] [util.rs:175] ["heartbeat sender and receiver are stale, refreshing ..."]
[2020/01/10 13:13:33.271 -07:00] [WARN] [util.rs:194] ["updating PD client done"] [spend=1.108416ms]
[2020/01/10 13:13:33.271 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 48 is not fully replicated\") }))"]
[2020/01/10 13:13:33.272 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 48 is not fully replicated\") }))"]
[2020/01/10 13:13:33.272 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 48 is not fully replicated\") }))"]
[2020/01/10 13:13:33.272 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 48 is not fully replicated\") }))"]
[2020/01/10 13:13:33.272 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 48 is not fully replicated\") }))"]
[2020/01/10 13:13:33.272 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 48 is not fully replicated\") }))"]
[2020/01/10 13:13:33.272 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 48 is not fully replicated\") }))"]
[2020/01/10 13:13:33.273 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 48 is not fully replicated\") }))"]
[2020/01/10 13:13:33.273 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 48 is not fully replicated\") }))"]
[2020/01/10 13:13:33.273 -07:00] [WARN] [prepare.rs:313] ["scatter region failed"] [err="[/rust/git/checkouts/tikv-71ea2042335c4528/2e0379c/components/pd_client/src/util.rs:334]: fail to request"] [region=48] [tag="[PrepareRangeJob 9d2626e1-dca4-5efb-a07a-8336c493a62a:2]"]
[2020/01/10 13:13:34.273 -07:00] [INFO] [prepare.rs:190] [prepare] [takes=1.033316869s] [tag="[PrepareRangeJob 9d2626e1-dca4-5efb-a07a-8336c493a62a:2]"]
[2020/01/10 13:13:34.273 -07:00] [INFO] [prepare.rs:172] [start] [range="RangeInfo { range: start: 7480000000000000FF2F5F728000000000FF92A2540000000000FAFFFFFFFFA1E728D2 end: 7480000000000000FF2F5F728000000000FFB960C50000000000FAFFFFFFFFA1E728D2, size: 536909786 }"] [tag="[PrepareRangeJob 9d2626e1-dca4-5efb-a07a-8336c493a62a:3]"]
[2020/01/10 13:13:34.284 -07:00] [INFO] [prepare.rs:288] [split] [at=7480000000000000FF2F5F728000000000FFB960C50000000000FAFFFFFFFFA1E728D2] [region="RegionInfo { region: id: 2 start_key: 7480000000000000FF2F5F728000000000FF92A2540000000000FA region_epoch { conf_ver: 1 version: 24 } peers { id: 3 store_id: 1 }, leader: Some(id: 3 store_id: 1) }"] [tag="[PrepareRangeJob 9d2626e1-dca4-5efb-a07a-8336c493a62a:3]"]
[2020/01/10 13:13:34.301 -07:00] [ERROR] [util.rs:326] ["request failed"] [err="Grpc(RpcFailure(RpcStatus { status: RpcStatusCode(2), details: Some(\"region 50 is not fully replicated\") }))"]
TiDB-Lightning version (run `tidb-lightning -V`):
./tidb-lightning -V
Release Version: v3.0.7-4-g42b7585
Git Commit Hash: 42b7585afb40a90b46b60fab203b1e8c99c56a3f
Git Branch: master
UTC Build Time: 2020-01-10 06:17:20
Go Version: go version go1.12 linux/amd64
TiKV-Importer version (run `tikv-importer -V`):
./tikv-importer -V
TiKV Importer 4.0.0-alpha
TiKV version (run `tikv-server -V`):
./tikv-server -V
TiKV
Release Version: 4.0.0-alpha
Git Commit Hash: f61c09bb3943a5e1cd594993fa6886ed661e461e
Git Commit Branch: master
UTC Build Time: 2020-01-08 01:36:23
Rust Version: rustc 1.42.0-nightly (0de96d37f 2019-12-19)
Enable Features: jemalloc portable sse protobuf-codec
Profile: dist_release
TiDB cluster version (execute `SELECT tidb_version();` in a MySQL client):
mysql> select tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: v4.0.0-alpha-1334-g07e642c92
Git Commit Hash: 07e642c9230ccb7c1537b27442f1fe8433e65f8a
Git Branch: master
UTC Build Time: 2020-01-08 08:32:04
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
1 row in set (0.00 sec)
Other interesting information (system version, hardware config, etc):
I can reproduce this on two different servers. Both 16 cores, 1 has an NVMe disk w/64GiB RAM, this one has a SATA disk and 32GiB RAM.
Is your feature request related to a problem? Please describe:
Currently Lightning requires a lot of configuration (including deploying `tikv-importer`). I would like to find a way for it to be used with minimal configuration. This would improve convenience and support for novice users and quick experiments.
Describe the feature you'd like:
I am open to ideas on implementation:
a statement like `LIGHTNING LOAD 's3://path/to/mydumper'` (using the local tidb and learning pd from it)
accepting the data source as a positional argument (`$1`) for `tidb-lightning`?
Describe alternatives you've considered:
This is really only about improving convenience/usability for casual use cases, so there are many alternative implementations. There are a lot of users who don't want to edit configuration files, but rather just want a tool set up and running with no effort.
Teachability, Documentation, Adoption, Optimization:
Maybe this should be an error log, and we should stop running Lightning in this situation?
restore.go:140: [warning] table info not found : LINEITEM
Is your feature request related to a problem? Please describe:
The SQL dump of a table created by mydumper contains the data rows sorted by its primary index. The table can also contain secondary indices, which themselves won't appear in their natural order. When a table is separated into multiple batches, the KV pairs from the secondary indices will be out of order. Therefore, the index parts will overlap and slow down import exponentially.
Describe the feature you'd like:
Separate the KV pairs from the data part and index part into different engines, to reduce the chance of overlapping engines, and increase ingestion speed.
Describe alternatives you've considered:
Lightning, after encoding, will classify the KV pair into “data” and “index” pairs. The original engines per batch (see RFC 3) will only store the data pairs. The index pairs are placed into a separate engine, and will not be batched.
Teachability, Documentation, Adoption, Optimization:
Use the following pseudo code representation
parallel for each table {
    for each index engine of table {    // for now, only 1 engine per table
        open index engine
    }
    parallel for each data engine of table {
        open data engine
        parallel for each data file of data engine {
            parse SQL
            encode SQL to KV
            write data part of KV to data engine
            write index part of KV to index engine #-1
        }
        close data engine
        sequentially import data engine
    }
    for each index engine of table {
        close index engine
        sequentially import index engine
    }
    alter table auto_increment
    checksum
    analyze
}
Index engine UUID
The engine UUID of the engines will be deterministically computed using the Version 5 (SHA-1) algorithm. The data engine UUID is computed using $TableName:$EngineID. The index engine UUID is computed using $TableName:$IndexEngineID, where the index engine ID is counted as -1, -2, -3, …
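The deterministic naming above can be sketched with the standard RFC 4122 Version 5 (SHA-1) construction. This is an illustration, not Lightning's actual code: the name format (`$TableName:$EngineID`) is as described in this proposal, but the concrete namespace UUID used by Lightning is not shown here, so a zero namespace stands in.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// uuidV5 computes an RFC 4122 Version 5 (SHA-1 based) UUID from a
// namespace UUID and a name. The same inputs always yield the same UUID,
// which is what makes the engine IDs deterministic across restarts.
func uuidV5(ns [16]byte, name string) [16]byte {
	h := sha1.New()
	h.Write(ns[:])
	h.Write([]byte(name))
	sum := h.Sum(nil)

	var u [16]byte
	copy(u[:], sum[:16])
	u[6] = (u[6] & 0x0f) | 0x50 // set version nibble to 5
	u[8] = (u[8] & 0x3f) | 0x80 // set RFC 4122 variant bits
	return u
}

func main() {
	// Hypothetical namespace; Lightning's real namespace UUID differs.
	var ns [16]byte
	dataEngine := uuidV5(ns, "`db`.`tbl`:0")   // data engine, ID 0
	indexEngine := uuidV5(ns, "`db`.`tbl`:-1") // index engine, ID -1
	fmt.Printf("%x\n%x\n", dataEngine, indexEngine)
}
```

Because the UUID depends only on the table name and engine ID, a resumed task can re-open exactly the same engines it left behind.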
Is your feature request related to a problem? Please describe:
After using tidb-lightning, there is no way to confirm whether the TiDB cluster is currently in normal or import mode.
Describe the feature you'd like:
tidb-lightning is missing a command to view whether the cluster is in normal or import mode.
What could be causing this situation?
error:/home/jenkins/workspace/build_tidb_lightning/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:323: table info table_XXX not found
The table's data file is named db.table_name.csv.
No matter what I rename the file to, it is never recognized.
Is your feature request related to a problem? Please describe:
Describe the feature you'd like:
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Optimization:
lightning log
[2020/01/30 00:17:49.929 +08:00] [WARN] [util.go:95] ["compute remote checksum failed but going to try again"] [table=`tmp`.`presale_20200227`] [query="ADMIN CHECKSUM TABLE `tmp`.`presale_20200227`"] [retryCnt=0] [error="Error 1105: privilege check fail"]
AWS Aurora exported snapshots are encoded in Parquet format. We should investigate how to restore from this serialization to allow quick Aurora → TiDB data migration.
(TBD)
GanttStart: 2020-08-13
GanttDue: 2020-09-16
GanttProgress: 100%
Is your feature request related to a problem? Please describe:
When PD/TiKV/TiDB are configured with TLS connections, TiDB Lightning may not work, and its own connections are not secured.
Describe the feature you'd like:
Support TLS for connecting with PD/TiKV/TiDB and TiKV Importer.
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Optimization:
Please answer these questions before submitting your issue. Thanks!
I set up a Lightning instance and an Importer instance to sync data from exported CSV to TiDB, and provided a create-table SQL file so that Lightning creates the tables automatically.
But I got errors when the create SQL includes `TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00'`, and Lightning aborted.
Here is some error logs from tidb.log:
[2019/10/15 19:43:13.003 +08:00] [WARN] [conn.go:668] ["dispatch error"] [conn=6180] [connInfo="id:6180, addr:172.22.218.47:41354 status:2, collation:utf8_general_ci, user:root"] [sql="CREATE TABLE IF NOT EXISTS `health_sport_record_detail_dm` (`id` BIGINT(20) NOT NULL AUTO_INCREMENT COMMENT '运动详细记录id',`patient_id` VARCHAR(30) DEFAULT NULL COMMENT '患者id',`pin` VARCHAR(30) NOT NULL COMMENT '用户pin',`date` INT(11) NOT NULL COMMENT '日期',`sport_id` INT(11) NOT NULL COMMENT '运动id',`start_time` TIMESTAMP NOT NULL
DEFAULT CURRENT_TIMESTAMP() COMMENT '开始时间',`cost_time` INT(11) DEFAULT NULL COMMENT '运动时间长度',`end_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP() COMMENT '结束时间',`calorie` DOUBLE DEFAULT NULL COMMENT '此次运动消耗的卡路里',`other_amount` DOUBLE DEFAULT NULL COMMENT '其他量',`other_str` VARCHAR(30) DEFAULT NULL COMMENT '其他记录',
`create_info` VARCHAR(30) DEFAULT NULL COMMENT '创建信息',`modify_info` VARCHAR(30) DEFAULT NULL COMMENT '修改信息',`ts` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00' COMMENT '时间戳',`running_distance` INT(11) DEFAULT NULL COMMENT '运动距离',`fit_crowd` TINYINT(4) DEFAULT '0' COMMENT '0:所有人;1:运动员',PRIMARY KEY(`id`)) ENGINE = InnoDB DEFAULT CHARACTER SET = UTF8 COMMENT = '血糖运动记录详表';"] [err="[types:1067]Invalid default value for 'ts'\ngithub.com/pingcap/errors.AddStack\n\t/home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/github.com/pingcap/[email protected]/errors.go:174\ngithub.com/pingcap/parser/terror.(*Error).GenWithStackByArgs\n\t/home/jenkins/workspace/release_tidb_3.0
/go/pkg/mod/github.com/pingcap/[email protected]/terror/terror.go:233\ngithub.com/pingcap/tidb/ddl.checkColumnDefaultValue\n\t/home/jenkins/workspace/r
elease_tidb_3.0/go/src/github.com/pingcap/tidb/ddl/ddl_api.go:425\ngithub.com/pingcap/tidb/ddl.setDefaultValue\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pi
ngcap/tidb/ddl/ddl_api.go:2396\ngithub.com/pingcap/tidb/ddl.columnDefToCol\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/ddl/ddl_api.go:518\ngithu
b.com/pingcap/tidb/ddl.buildColumnAndConstraint\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/ddl/ddl_api.go:384\ngithub.com/pingcap/tidb/ddl.buil
dColumnsAndConstraints\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/ddl/ddl_api.go:249\ngithub.com/pingcap/tidb/ddl.buildTableInfoWithCheck\n\t/h
ome/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/ddl/ddl_api.go:1244\ngithub.com/pingcap/tidb/ddl.(*ddl).CreateTable\n\t/home/jenkins/workspace/release_ti
db_3.0/go/src/github.com/pingcap/tidb/ddl/ddl_api.go:1320\ngithub.com/pingcap/tidb/executor.(*DDLExec).executeCreateTable\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/gi
thub.com/pingcap/tidb/executor/ddl.go:174\ngithub.com/pingcap/tidb/executor.(*DDLExec).Next\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/executor
/ddl.go:97\ngithub.com/pingcap/tidb/executor.Next\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/executor/executor.go:191\ngithub.com/pingcap/tidb/
executor.(*ExecStmt).handleNoDelayExecutor\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/executor/adapter.go:402\ngithub.com/pingcap/tidb/executor
.(*ExecStmt).Exec\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/executor/adapter.go:266\ngithub.com/pingcap/tidb/session.runStmt\n\t/home/jenkins/
workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/session/tidb.go:219\ngithub.com/pingcap/tidb/session.(*session).executeStatement\n\t/home/jenkins/workspace/release_t
idb_3.0/go/src/github.com/pingcap/tidb/session/session.go:957\ngithub.com/pingcap/tidb/session.(*session).execute\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com
/pingcap/tidb/session/session.go:1066\ngithub.com/pingcap/tidb/session.(*session).Execute\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/session/se
ssion.go:994\ngithub.com/pingcap/tidb/server.(*TiDBContext).Execute\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/server/driver_tidb.go:246\ngithu
b.com/pingcap/tidb/server.(*clientConn).handleQuery\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/server/conn.go:1179\ngithub.com/pingcap/tidb/ser
ver.(*clientConn).dispatch\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/server/conn.go:897\ngithub.com/pingcap/tidb/server.(*clientConn).Run\n\t/
home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb/server/conn.go:652\ngithub.com/pingcap/tidb/server.(*Server).onConn\n\t/home/jenkins/workspace/release_t
idb_3.0/go/src/github.com/pingcap/tidb/server/server.go:440\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337"]
Then I changed the SQL mode according to the MySQL 5.7 docs.
mysql> select @@sql_mode;
+-------------------------------------------------------------------------------------------------------------------------------------------+
| @@sql_mode |
+-------------------------------------------------------------------------------------------------------------------------------------------+
| ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION |
+-------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> set @@global.sql_mode = 'ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION';
Query OK, 0 rows affected (0.02 sec)
After that, I executed the failed SQL from the mysql command line connected to a TiDB instance on port 4000, and it succeeded.
mysql> CREATE TABLE `health_sport_record_detail_dm` ( `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT '运动详细记录id', `patient_id` varchar(30) DEFAULT NULL COMMENT '患者id', `pin` varchar(30) NOT NULL COMMENT '用户pin', `date` int(11) NOT NULL COMMENT '日期', `sport_id` int(11) NOT NULL COMMENT '运动id', `start_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '开始时间', `cost_time` int(11) DEFAULT NULL COMMENT '运动时间长度', `end_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '结束时间', `calorie` double DEFAULT NULL COMMENT '此次运动消耗的卡路里', `other_amount` double DEFAULT NULL COMMENT '其他量', `other_str` varchar(30) DEFAULT NULL COMMENT '其他记录', `create_info` varchar(30) DEFAULT NULL COMMENT '创建信息', `modify_info` varchar(30) DEFAULT NULL COMMENT '修改信息', `ts` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00' COMMENT '时间戳', `running_distance` int(11) DEFAULT NULL COMMENT '运动距离', `fit_crowd` tinyint(4) DEFAULT '0' COMMENT '0:所有人;1:运动员', PRIMARY KEY (`id`)) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='血糖运动记录详表';
Query OK, 0 rows affected (1.02 sec)
But when I restarted Lightning, connected to the same TiDB instance on port 4000, it failed again and reported the same "Invalid default value for ts" error.
Lightning should work like the mysql command line, but it seems Lightning doesn't respect the global sql_mode.
Lightning aborted with an error.
Here are some Lightning logs:
[2019/10/15 19:38:41.002 +08:00] [ERROR] [restore.go:261] ["the whole procedure failed"] [takeTime=8.169121423s] [error="restore table schema health failed: create table faile
d: Error 1067: Invalid default value for 'ts'"]
[2019/10/15 19:38:41.002 +08:00] [ERROR] [main.go:59] ["tidb lightning encountered error"] [error="restore table schema health failed: create table failed: Error 1067: Invalid
default value for 'ts'"] [errorVerbose="Error 1067: Invalid default value for 'ts'\ngithub.com/pingcap/errors.AddStack\n\t/home/jenkins/workspace/release_ti
db_3.0/go/pkg/mod/github.com/pingcap/[email protected]/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/home/jenkins/workspace/release_tidb_3.0/go/pkg/mod/github.com/pingcap/er
[email protected]/juju_adaptor.go:15\ngithub.com/pingcap/tidb-lightning/lightning/common.SQLWithRetry.Exec.func1\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingc
ap/tidb-lightning/lightning/common/util.go:152\ngithub.com/pingcap/tidb-lightning/lightning/common.SQLWithRetry.perform\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/gith
ub.com/pingcap/tidb-lightning/lightning/common/util.go:90\ngithub.com/pingcap/tidb-lightning/lightning/common.SQLWithRetry.Exec\n\t/home/jenkins/workspace/release_tidb_3.0/go/
src/github.com/pingcap/tidb-lightning/lightning/common/util.go:150\ngithub.com/pingcap/tidb-lightning/lightning/restore.(*TiDBManager).InitSchema\n\t/home/jenkins/workspace/re
lease_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/restore/tidb.go:112\ngithub.com/pingcap/tidb-lightning/lightning/restore.(*RestoreController).restoreSchema\n
\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:284\ngithub.com/pingcap/tidb-lightning/lightning/restore.(*Res
toreController).Run\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:245\ngithub.com/pingcap/tidb-lightning/li
ghtning.(*Lightning).run\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/lightning.go:213\ngithub.com/pingcap/tidb-lightning/lig
htning.(*Lightning).RunOnce\n\t/home/jenkins/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/lightning.go:138\nmain.main\n\t/home/jenkins/workspa
ce/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/cmd/tidb-lightning/main.go:56\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:200\nruntime.goexit\n\t/usr/local
/go/src/runtime/asm_amd64.s:1337\ncreate table failed\nrestore table schema health failed"]
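The behavior above suggests Lightning sets its own session SQL mode rather than inheriting the server's global sql_mode. A hedged sketch of a possible workaround in the task configuration; the `sql-mode` key under `[tidb]` is an assumption here, so verify it against the `tidb-lightning.toml` template shipped with your version:

```toml
[tidb]
# Relax the session SQL mode so zero-date TIMESTAMP defaults are accepted.
# Key name assumed; check the tidb-lightning.toml template for your release.
sql-mode = "ONLY_FULL_GROUP_BY,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"
```

Note this only relaxes validation for the import session; the table definitions themselves are unchanged.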
Versions of the cluster
TiDB-Lightning version (run `tidb-lightning -V`):
Release Version: v3.0.2
Git Commit Hash: ae592511b35848eb8283a1074765b9b8e8ab609b
Git Branch: HEAD
UTC Build Time: 2019-08-07 02:30:40
Go Version: go version go1.12 linux/amd64
TiKV-Importer version (run `tikv-importer -V`):
TiKV Importer 3.0.2
TiKV version (run `tikv-server -V`):
TiKV 3.0.2
TiDB cluster version (execute `SELECT tidb_version();` in a MySQL client):
Release Version: v3.0.2
Git Commit Hash: 94498e7d06a244196bb41c3a05dd4c1f6903099a
Git Branch: HEAD
UTC Build Time: 2019-08-07 02:35:52
GoVersion: go version go1.12 linux/amd64
Race Enabled: false
TiKV Min Version: 2.1.0-alpha.1-ff3dd160846b7d1aed9079c389fc188f7f5ea13e
Check Table Before Drop: false
Other interesting information (system version, hardware config, etc):
OS: CentOS Linux release 7.2.1511 (Core)
Operation logs
tidb-lightning.log for TiDB-Lightning if possible
tikv-importer.log from TiKV-Importer if possible
Configuration of the cluster and the task
[lightning]
file = "log/tidb_lightning-healthdb00.log"
index-concurrency = 2
io-concurrency = 5
level = "info"
max-backups = 14
max-days = 28
max-size = 128
pprof-port = 8289
table-concurrency = 6
region-concurrency = 30
[checkpoint]
enable = true
driver = "file"
dsn = "/tmp/tidb_lightning_checkpoint_healthdb00.pb"
#driver = "mysql"
#schema = "tidb_lightning_checkpoint"
#dsn = "root:pwdjdal@tcp(172.22.191.33:3306)/?charset=utf8"
[tikv-importer]
addr = "172.22.218.49:8287"
[mydumper]
data-source-dir = "/home/tidb/deploy/migrate_healthdb.tables-00"
no-schema = false
read-block-size = 65536
[mydumper.csv]
backslash-escape = false
delimiter = ''
header = false
not-null = false
null = "NULL"
separator = "\t"
trim-last-separator = false
[tidb]
build-stats-concurrency = 20
checksum-table-concurrency = 16
distsql-scan-concurrency = 100
host = "172.22.218.11"
index-serial-scan-concurrency = 20
log-level = "error"
password = ""
port = 4000
status-port = 10080
user = "root"
pd-addr = "172.22.218.13:2379"
[post-restore]
analyze = true
checksum = true
[cron]
log-progress = "5m"
switch-mode = "5m"
# importer Configuration
log-file = "log/tikv_importer.log"
log-level = "info"
[server]
addr = "172.22.218.49:8287"
grpc-concurrency = 16
[metric]
address = "172.22.218.11:9091"
interval = "15s"
job = "tikv-importer"
[rocksdb]
max-background-jobs = 32
[rocksdb.defaultcf]
compression-per-level = ["lz4", "no", "no", "no", "no", "no", "lz4"]
max-write-buffer-number = 8
write-buffer-size = "1GB"
[rocksdb.writecf]
compression-per-level = ["lz4", "no", "no", "no", "no", "no", "lz4"]
[import]
import-dir = "/home/tidb/deploy/data.import"
max-open-engines = 8
min-available-ratio = 0.05
num-import-jobs = 24
num-threads = 16
stream-channel-window = 128
inventory.ini if deployed by Ansible
Is your feature request related to a problem? Please describe:
TiDB v4.0 uses the new row format (https://github.com/pingcap/tidb/blob/master/docs/design/2018-07-19-row-format.md); Lightning should also be updated to produce the new row format.
Describe alternatives you've considered:
It is best to use the TiDB library functions to construct row data directly, so that in the future only the TiDB dependency version needs updating.
Request: the fastest possible import from backup with the least amount of resource usage.
One use case for this that I have in mind is to import a very large example dataset as quickly as possible.
Currently Lightning imports from a mydumper backup. The process of converting a mydumper backup to SST files takes time and is resource-intensive. I would like to pay that cost once (or skip mydumper altogether).
In theory we can just save the intermediate SST files. In practice, there are issues to work through:
If you have saved the `data.import` directory after an engine file is written but before it is cleaned up, you could run `tidb-lightning-ctl --import-engine <UUID>` to directly ingest those SSTs again. However, the KV pairs include the table ID, which AFAIK cannot be controlled in a CREATE TABLE statement.
We need some development, such as not cleaning up these SSTs, saving their checksums, and using a specified table ID (which also means you have to control the table version).
version:
tidb-lightning 2.1.9
tidb 2.1.4
When I import data with tidb-lightning, after some time the error is as follows:
2019/05/30 07:41:43.841 [info] [`PAYBASEDIMENSION`.`TRANSLOG`] import whole table takes 13h11m33.936320627s
2019/05/30 07:41:43.841 [error] [`PAYBASEDIMENSION`.`TRANSLOG`] error rpc error: code = Unknown desc = ResourceTemporarilyUnavailable("Too many open engines 04fab71b-13e6-5e2f-ad64-fb4fae56229e: 8")
2019/05/30 07:41:43.841 [info] restore all tables data takes 13h11m34.086283344s
2019/05/30 07:41:43.841 [error] run cause error : rpc error: code = Unknown desc = ResourceTemporarilyUnavailable("Too many open engines 04fab71b-13e6-5e2f-ad64-fb4fae56229e: 8")
2019/05/30 07:41:43.841 [info] the whole procedure takes 13h11m34.119171179s
2019/05/30 07:41:43.841 restore.go:399: [info] Everything imported, stopping periodic actions
2019/05/30 07:41:43.852 main.go:65: [error] tidb lightning encountered error:rpc error: code = Unknown desc = ResourceTemporarilyUnavailable("Too many open engines 04fab71b-13e6-5e2f-ad64-fb4fae56229e: 8")
github.com/pingcap/errors.AddStack
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]/errors.go:174
github.com/pingcap/errors.Trace
/home/jenkins/workspace/release_tidb_2.1-ga/go/pkg/mod/github.com/pingcap/[email protected]/juju_adaptor.go:12
github.com/pingcap/tidb-lightning/lightning/kv.(*Importer).OpenEngine
/home/jenkins/workspace/release_tidb_2.1-ga/go/src/github.com/pingcap/tidb-lightning/lightning/kv/importer.go:178
github.com/pingcap/tidb-lightning/lightning/restore.(*TableRestore).restoreEngine
/home/jenkins/workspace/release_tidb_2.1-ga/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:648
github.com/pingcap/tidb-lightning/lightning/restore.(*TableRestore).restoreTable.func1
/home/jenkins/workspace/release_tidb_2.1-ga/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:588
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1337
When I restarted, it worked fine, but after a while the same error appeared again. How can I solve this problem and ensure that no data is lost?
Thanks for the help.
Please answer these questions before submitting your issue. Thanks!
ANALYZE on table `order_line` of 10k warehouses takes hours.
[2020/03/10 20:44:47.162 +08:00] [INFO] [tidb.go:249] ["alter table auto_increment completed"] [table=`tpcc`.`order_line`] [auto_increment=32650894623] [takeTime=82.342128ms] []
[2020/03/10 20:44:47.162 +08:00] [INFO] [restore.go:1024] ["local checksum"] [table=`tpcc`.`order_line`] [checksum="{cksum=3588732482266855866,size=688778185266,kvs=8601932770}"]
[2020/03/10 20:44:47.169 +08:00] [INFO] [restore.go:1482] ["remote checksum start"] [table=`tpcc`.`order_line`]
[2020/03/10 20:45:52.311 +08:00] [INFO] [restore.go:492] [progress] [files="40/40 (100.0%)"] [tables="0/10 (0.0%)"] [speed(MiB/s)=75.81300532977347] [state=post-processing] []
[2020/03/10 20:50:52.311 +08:00] [INFO] [restore.go:492] [progress] [files="40/40 (100.0%)"] [tables="0/10 (0.0%)"] [speed(MiB/s)=70.75872548328216] [state=post-processing] []
[2020/03/10 20:55:52.311 +08:00] [INFO] [restore.go:492] [progress] [files="40/40 (100.0%)"] [tables="0/10 (0.0%)"] [speed(MiB/s)=66.33624475816293] [state=post-processing] []
[2020/03/10 20:56:23.325 +08:00] [WARN] [util.go:115] ["compute remote checksum failed but going to try again"] [table=`tpcc`.`order_line`] [query="ADMIN CHECKSUM TABLE `tpcc`.`order_line`"] [retryCnt=0] [error="Error 9005: Region is unavailable"]
[2020/03/10 20:56:23.325 +08:00] [WARN] [util.go:106] ["compute remote checksum retry start"] [table=`tpcc`.`order_line`] [query="ADMIN CHECKSUM TABLE `tpcc`.`order_line`"] [retryCnt=1]
[2020/03/10 20:58:07.796 +08:00] [INFO] [restore.go:1496] ["remote checksum completed"] [table=`tpcc`.`order_line`] [takeTime=13m20.626387114s] []
[2020/03/10 20:58:07.802 +08:00] [INFO] [restore.go:1445] ["checksum pass"] [table=`tpcc`.`order_line`] [local="{cksum=3588732482266855866,size=688778185266,kvs=8601932770}"]
[2020/03/10 20:58:07.802 +08:00] [INFO] [restore.go:1450] ["analyze start"] [table=`tpcc`.`order_line`]
[2020/03/10 21:00:52.311 +08:00] [INFO] [restore.go:492] [progress] [files="40/40 (100.0%)"] [tables="0/10 (0.0%)"] [speed(MiB/s)=62.434052438590804] [state=post-processing] []
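Since post-processing here is dominated by checksum and ANALYZE on `order_line`, one mitigation is to disable the automatic ANALYZE in the task configuration and run it manually off-peak. A hedged sketch, using the `[post-restore]` keys as they appear in the config sections elsewhere on this page:

```toml
[post-restore]
checksum = true
# Skip the automatic ANALYZE after import; instead run
#   ANALYZE TABLE `tpcc`.`order_line`;
# manually during off-peak hours.
analyze = false
```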
Versions of the cluster
TiDB-Lightning version (run `tidb-lightning -V`):
(paste TiDB-Lightning version here)
TiKV-Importer version (run `tikv-importer -V`):
4.0.beta
TiKV version (run `tikv-server -V`):
4.0.beta
TiDB cluster version (execute `SELECT tidb_version();` in a MySQL client):
4.0.beta
Other interesting information (system version, hardware config, etc):
Operation logs
tidb-lightning.log for TiDB-Lightning if possible
tikv-importer.log from TiKV-Importer if possible
Configuration of the cluster and the task
tidb-lightning.toml for TiDB-Lightning if possible
tikv-importer.toml for TiKV-Importer if possible
inventory.ini if deployed by Ansible
Screenshot/exported-PDF of Grafana dashboard or metrics' graph in Prometheus for TiDB-Lightning if possible
Is your feature request related to a problem? Please describe:
Currently Lightning will not boot if it detects that the checkpoint recorded an old failure. It is not easy for the user to copy the commands to perform the fix. On the other hand, in #247 Lightning already knows the best possible commands to run when encountering these errors.
Describe the feature you'd like:
We add a new config `[checkpoint] auto-fix = true` (exposed on the command line as `--auto-fix-checkpoint=true`) to automatically perform the destroy/ignore commands (the default is `true`).
Describe alternatives you've considered:
Disable checkpoints by default.
Teachability, Documentation, Adoption, Optimization:
`--checkpoint-error-destroy`.
Is your feature request related to a problem? Please describe:
Lightning can't load compressed files produced by mydumper's `--compress` option.
Backup files:
-rw-r--r--. 1 root root 85 Jan 16 11:42 app-schema-create.sql.gz
-rw-r--r--. 1 root root 213 Jan 16 11:42 app.mytable_tbl-schema.sql.gz
-rw-r--r--. 1 root root 98 Jan 16 11:42 app.mytable_tbl.sql.gz
-rw-r--r--. 1 root root 146 Jan 16 11:42 metadata
-rw-r--r--. 1 root root 87 Jan 16 11:42 mysql-schema-create.sql.gz
-rw-r--r--. 1 root root 167 Jan 16 11:42 mysql.GLOBAL_VARIABLES-schema.sql.gz
-rw-r--r--. 1 root root 5943 Jan 16 11:42 mysql.GLOBAL_VARIABLES.sql.gz
-rw-r--r--. 1 root root 238 Jan 16 11:42 mysql.bind_info-schema.sql.gz
-rw-r--r--. 1 root root 253 Jan 16 11:42 mysql.columns_priv-schema.sql.gz
-rw-r--r--. 1 root root 307 Jan 16 11:42 mysql.db-schema.sql.gz
-rw-r--r--. 1 root root 204 Jan 16 11:42 mysql.default_roles-schema.sql.gz
-rw-r--r--. 1 root root 142 Jan 16 11:42 mysql.expr_pushdown_blacklist-schema.sql.gz
-rw-r--r--. 1 root root 260 Jan 16 11:42 mysql.gc_delete_range-schema.sql.gz
-rw-r--r--. 1 root root 263 Jan 16 11:42 mysql.gc_delete_range_done-schema.sql.gz
-rw-r--r--. 1 root root 244 Jan 16 11:42 mysql.help_topic-schema.sql.gz
-rw-r--r--. 1 root root 136 Jan 16 11:42 mysql.opt_rule_blacklist-schema.sql.gz
-rw-r--r--. 1 root root 243 Jan 16 11:42 mysql.role_edges-schema.sql.gz
-rw-r--r--. 1 root root 232 Jan 16 11:42 mysql.stats_buckets-schema.sql.gz
-rw-r--r--. 1 root root 188 Jan 16 11:42 mysql.stats_feedback-schema.sql.gz
-rw-r--r--. 1 root root 304 Jan 16 11:42 mysql.stats_histograms-schema.sql.gz
-rw-r--r--. 1 root root 141 Jan 16 11:42 mysql.stats_histograms.sql.gz
-rw-r--r--. 1 root root 212 Jan 16 11:42 mysql.stats_meta-schema.sql.gz
-rw-r--r--. 1 root root 81 Jan 16 11:42 mysql.stats_meta.sql.gz
-rw-r--r--. 1 root root 206 Jan 16 11:42 mysql.stats_top_n-schema.sql.gz
-rw-r--r--. 1 root root 311 Jan 16 11:42 mysql.tables_priv-schema.sql.gz
-rw-r--r--. 1 root root 172 Jan 16 11:42 mysql.tidb-schema.sql.gz
-rw-r--r--. 1 root root 617 Jan 16 11:42 mysql.tidb.sql.gz
-rw-r--r--. 1 root root 354 Jan 16 11:42 mysql.user-schema.sql.gz
-rw-r--r--. 1 root root 167 Jan 16 11:42 mysql.user.sql.gz
-rw-r--r--. 1 root root 86 Jan 16 11:42 test-schema-create.sql.gz
Lightning logs:
[2020/01/16 12:41:29.550 +00:00] [INFO] [lightning.go:194] ["load data source start"]
sync log failed sync /dev/stdout: invalid argument
[2020/01/16 12:41:29.551 +00:00] [ERROR] [lightning.go:197] ["load data source failed"] [takeTime=421.42µs] [error="missing {schema}-schema-create.sql"]
[2020/01/16 12:41:29.551 +00:00] [ERROR] [main.go:59] ["tidb lightning encountered error"] [error="missing {schema}-schema-create.sql"] [errorVerbose="missing {schema}-schema-create.sql\ngithub.com/pingcap/tidb-lightning/lightning/mydump.(*mdLoaderSetup).setup\n\t/home/jenkins/agent/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/mydump/loader.go:176\ngithub.com/pingcap/tidb-lightning/lightning/mydump.NewMyDumpLoader\n\t/home/jenkins/agent/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/mydump/loader.go:105\ngithub.com/pingcap/tidb-lightning/lightning.(*Lightning).run\n\t/home/jenkins/agent/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/lightning.go:196\ngithub.com/pingcap/tidb-lightning/lightning.(*Lightning).RunOnce\n\t/home/jenkins/agent/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/lightning/lightning.go:138\nmain.main\n\t/home/jenkins/agent/workspace/release_tidb_3.0/go/src/github.com/pingcap/tidb-lightning/cmd/tidb-lightning/main.go:56\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:200\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337"]
Describe the feature you'd like:
Can load the compressed data files from mydumper.
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Optimization:
Benchmark contained in branch lonng/bench-sql2kv:
- Table s0: 22 columns and 10 indexes (SQL file size: ~90KB, row count: 31)
- Table s11: 55 columns and 18 indexes (SQL file size: ~1.8MB, row count: 223)
pkg: github.com/pingcap/tidb-lightning/lightning/kv
BenchmarkTableKVEncoder_SQL2KV-8 500 2628206 ns/op 1637935 B/op 9806 allocs/op
--- BENCH: BenchmarkTableKVEncoder_SQL2KV-8
sql2kv_test.go:53: Table: s0, column: 22, kvs: 341, affectRows: 31, bytes: 185127
sql2kv_test.go:53: Table: s0, column: 22, kvs: 341, affectRows: 31, bytes: 185127
sql2kv_test.go:53: Table: s0, column: 22, kvs: 341, affectRows: 31, bytes: 185127
BenchmarkTableKVEncoder_SQL2KV2-8 30 44766158 ns/op 29413560 B/op 169115 allocs/op
--- BENCH: BenchmarkTableKVEncoder_SQL2KV2-8
sql2kv_test.go:53: Table: s11, column: 55, kvs: 4237, affectRows: 23, bytes: 2461451
sql2kv_test.go:53: Table: s11, column: 55, kvs: 4237, affectRows: 23, bytes: 2461451
PASS
We need to redesign the SQL2KV interface to improve performance.
After #170, Lightning can recognize the `DEFAULT CURRENT_TIMESTAMP` option, which is always initialized to `time.Now()`. This means restoring from checkpoint is going to be non-deterministic when the timestamp column is indexed.
CREATE TABLE t (
x DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
KEY(x)
);
INSERT INTO t VALUES (), (), (), ();
We need to override the session system variable `timestamp`, and also store it in the checkpoint to ensure the values before and after restart are equivalent.
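A minimal sketch of the idea, using a hypothetical helper (not Lightning's actual code): reuse the timestamp stored in the checkpoint when one is present, so `CURRENT_TIMESTAMP` defaults come out identical on retry.

```python
import time

def session_init_sql(checkpoint_ts=None):
    """Return (ts, sql) for the start of a (re)import session.

    MySQL/TiDB honour the session variable `timestamp`: once it is set,
    CURRENT_TIMESTAMP/NOW() return that fixed value.  Reusing the value
    stored in the checkpoint makes DEFAULT CURRENT_TIMESTAMP columns
    deterministic across restarts; on a fresh run we read the clock once
    and would persist `ts` into the checkpoint.
    """
    ts = checkpoint_ts if checkpoint_ts is not None else int(time.time())
    return ts, "SET SESSION timestamp = %d" % ts
```

On a fresh run the returned `ts` would be written into the checkpoint table alongside the chunk positions; on resume it is passed back in as `checkpoint_ts`.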
Please answer these questions before submitting your issue. Thanks!
Loaded data into TiDB using Lightning.
MySQL [test]> select * from t_access3;
+-------------+
| accessKey |
+-------------+
| @P&FLASHSHA |
+-------------+
MySQL [test]> select * from t_access3;
+-------------+
| accessKey |
+-------------+
| @PFLASHSHA |
+-------------+
Versions of the cluster
TiDB-Lightning version (run `tidb-lightning -V`):
3.0.8
TiKV version (run `tikv-server -V`):
3.0.8
TiDB cluster version (execute `SELECT tidb_version();` in a MySQL client):
3.0.8
Please answer these questions before submitting your issue. Thanks!
Deploying tidb-lightning using k8s via pingcap/tidb-operator#817
git clone https://github.com/tennix/tidb-operator -b tidb-lightning
cd tidb-operator
helm install charts/tidb-cluster -n <cluster-release-name> --namespace <namespace> --set importer.create=true
helm install charts/tidb-lightning -n <lightning-release-name> --namespace <namespace> --set dataSource.local.nodeName=<node-name>,dataSource.local.hostPath=<host-path>,targetTidbCluster.name=<cluster-release-name>
Then view the lightning service
The web page shows the task progress correctly.
The page appears and then disappears immediately.
The version of the cluster is v3.0.1. The corresponding images are pulled from DockerHub.
Is your feature request related to a problem? Please describe:
After #256, metrics configs are maintained in this repo. When a new PR changes the metrics, the reviewer may need to run an integration test and look at the related charts on Grafana to verify correctness. So we need to provide a larger-scale test suite to generate enough metrics.
Describe the feature you'd like:
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Optimization:
Is your feature request related to a problem? Please describe:
Describe the feature you'd like:
- `io.EOF` is an ambiguous error; should we retry it?
- Check whether an `ImportKV` error should prevent saving the checkpoint (and do the importer's engine files remain?).
- In ImportMode, `max-background-jobs` should be specified, in order to deal with multi-instance TiKV deployments.
- The TiDB backend should use `REPLACE INTO` to tolerate repeated inserts with a PK when recovering from a checkpoint.
- Discuss the TiDB backend use case: during incremental importing, if a table has no UNIQUE constraints, duplicated data can't be found. We might need a `count(*)`-based checksum.
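The `REPLACE INTO` point can be sketched as follows (hypothetical helper, not the backend's actual code): build the DML with `REPLACE` instead of `INSERT`, so a chunk re-sent after recovering from a checkpoint is idempotent on tables with a primary key.

```python
def build_write_stmt(table, columns, idempotent=True):
    """Build the row-write statement used by a TiDB-style backend.

    With checkpoints, a chunk may be delivered twice after a crash; a
    plain INSERT would then fail with duplicate-key errors on tables
    with a primary key, while REPLACE INTO overwrites the earlier copy.
    """
    verb = "REPLACE" if idempotent else "INSERT"
    cols = ", ".join("`%s`" % c for c in columns)
    holders = ", ".join(["?"] * len(columns))
    return "%s INTO `%s` (%s) VALUES (%s)" % (verb, table, cols, holders)
```

Note the trade-off: without a PK or UNIQUE index, `REPLACE` degenerates to plain insert and duplicates slip through, which is exactly why the last bullet proposes a `count(*)`-based checksum.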
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Optimization:
Is your feature request related to a problem? Please describe:
https://github.com/pingcap/tidb/projects/33
After we support collations, the KV encoding may differ according to the collation, so we must update Lightning to learn the collation and encode in the right way accordingly.
There might be some compatibility issues too.
Describe the feature you'd like:
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Optimization:
Is your feature request related to a problem? Please describe:
In some cases (e.g. Tencent PCG) users may need to choose whether to 'destroy' or 'ignore' invalid checkpoints. Some of those decisions could be made by code, for example:
- A table in the `CHECKSUMMED` state should continue running.
- Use the `count(*)` difference between the break time and the start time to check correctness.
So we may provide this auto-solve option, at the cost of some time, to reduce the decisions the user has to make.
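The decision logic might look like this sketch (the status names and actions are illustrative, not Lightning's actual checkpoint states):

```python
def auto_solve(status):
    """Map a stalled checkpoint status to a fix instead of asking the user.

    Illustrative rules only: tables already past checksum just continue;
    a failure while writing engine data needs its checkpoint destroyed
    and the table re-imported; a failure during import can be retried by
    ignoring the recorded error; everything else still goes to the user.
    """
    if status in ("checksummed", "analyzed"):
        return "continue"
    if status == "write-failed":
        return "checkpoint-error-destroy"
    if status == "import-failed":
        return "checkpoint-error-ignore"
    return "ask-user"
```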
Describe the feature you'd like:
An auto-solve option for invalid checkpoints in `tidb-lightning-ctl`.
Describe alternatives you've considered:
An auto-solve option in `tidb-lightning`, triggered when invalid checkpoints are detected; or a user guide.
Teachability, Documentation, Adoption, Optimization:
`--checkpoint-error-auto-solve` of `tidb-lightning-ctl`.
To restore a large amount of data, if we cannot stream it, we must wait for the entire backup to be downloaded from object storage. Our restore process should instead look like this, which will make it quicker and reduce disk requirements (cost):
Describe the feature you'd like:
Similar feature request to #69.
Support restoring SQL backup files through the NFS protocol.
Describe alternatives you've considered:
We need to consider the below situations:
Under these situations, we cannot support with the S3 store.
Please answer these questions before submitting your issue. Thanks!
With this schema:
CREATE DATABASE issue205;
CREATE TABLE issue205.a (b VARCHAR(256) CHARACTER SET 'utf8mb4');
and some non-UTF-8 CSV content (`issue205.a.csv`):
head -c 15 /dev/urandom > issue205.a.csv
with configuration
[tikv-importer]
addr = '127.0.0.1:8808'
[tidb]
host = '127.0.0.1'
port = 4000
user = 'root'
status-port = 10080
[mydumper]
data-source-dir = /«path»
no-schema = true
[post-restore]
checksum = true
analyze = true
With the default SQL mode, the INSERT should fail with Error 1366 (incorrect utf8 value).
Lightning completed successfully with zero warnings. The malformed data can be seen from SELECT * FROM issue205.a;
(Everything is master)
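Under the default (strict) SQL mode, TiDB would reject such bytes with error 1366 at INSERT time; a minimal sketch of the validation being skipped here (hypothetical helper, not Lightning's code):

```python
def check_utf8(raw: bytes):
    """Return None if `raw` is valid UTF-8, otherwise an error-1366-style
    message naming the first offending byte sequence, mimicking what a
    strict-mode INSERT into a utf8mb4 column would report."""
    try:
        raw.decode("utf-8")
        return None
    except UnicodeDecodeError as exc:
        bad = raw[exc.start:exc.start + 4]
        return "ERROR 1366: Incorrect string value: %r" % bad
```

Running this over each parsed field before encoding would surface the malformed rows as errors (or at least warnings) instead of silently storing them.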
Operation logs
Configuration of the cluster and the task
Screenshot/exported-PDF of Grafana dashboard or metrics' graph in Prometheus for TiDB-Lightning if possible
Please answer these questions before submitting your issue. Thanks!
I have a script to download `master` daily and then import the ontime database into TiDB. It just started breaking because of pingcap/parser#231, which was backported to 2.1 as pingcap/parser#232.
#!/bin/sh
mysql -e 'drop database if exists ontime'
mysql -e 'create database ontime'
mysql ontime -e "CREATE TABLE ontime (
Year year(4) DEFAULT NULL,
Quarter tinyint(4) DEFAULT NULL,
Month tinyint(4) DEFAULT NULL,
DayofMonth tinyint(4) DEFAULT NULL,
DayOfWeek tinyint(4) DEFAULT NULL,
FlightDate date DEFAULT NULL,
UniqueCarrier char(7) DEFAULT NULL,
AirlineID int(11) DEFAULT NULL,
Carrier char(2) DEFAULT NULL,
TailNum varchar(50) DEFAULT NULL,
FlightNum varchar(10) DEFAULT NULL,
OriginAirportID int(11) DEFAULT NULL,
OriginAirportSeqID int(11) DEFAULT NULL,
OriginCityMarketID int(11) DEFAULT NULL,
Origin char(5) DEFAULT NULL,
OriginCityName varchar(100) DEFAULT NULL,
OriginState char(2) DEFAULT NULL,
OriginStateFips varchar(10) DEFAULT NULL,
OriginStateName varchar(100) DEFAULT NULL,
OriginWac int(11) DEFAULT NULL,
DestAirportID int(11) DEFAULT NULL,
DestAirportSeqID int(11) DEFAULT NULL,
DestCityMarketID int(11) DEFAULT NULL,
Dest char(5) DEFAULT NULL,
DestCityName varchar(100) DEFAULT NULL,
DestState char(2) DEFAULT NULL,
DestStateFips varchar(10) DEFAULT NULL,
DestStateName varchar(100) DEFAULT NULL,
DestWac int(11) DEFAULT NULL,
CRSDepTime int(11) DEFAULT NULL,
DepTime int(11) DEFAULT NULL,
DepDelay int(11) DEFAULT NULL,
DepDelayMinutes int(11) DEFAULT NULL,
DepDel15 int(11) DEFAULT NULL,
DepartureDelayGroups int(11) DEFAULT NULL,
DepTimeBlk varchar(20) DEFAULT NULL,
TaxiOut int(11) DEFAULT NULL,
WheelsOff int(11) DEFAULT NULL,
WheelsOn int(11) DEFAULT NULL,
TaxiIn int(11) DEFAULT NULL,
CRSArrTime int(11) DEFAULT NULL,
ArrTime int(11) DEFAULT NULL,
ArrDelay int(11) DEFAULT NULL,
ArrDelayMinutes int(11) DEFAULT NULL,
ArrDel15 int(11) DEFAULT NULL,
ArrivalDelayGroups int(11) DEFAULT NULL,
ArrTimeBlk varchar(20) DEFAULT NULL,
Cancelled tinyint(4) DEFAULT NULL,
CancellationCode char(1) DEFAULT NULL,
Diverted tinyint(4) DEFAULT NULL,
CRSElapsedTime int(11) DEFAULT NULL,
ActualElapsedTime int(11) DEFAULT NULL,
AirTime int(11) DEFAULT NULL,
Flights int(11) DEFAULT NULL,
Distance int(11) DEFAULT NULL,
DistanceGroup tinyint(4) DEFAULT NULL,
CarrierDelay int(11) DEFAULT NULL,
WeatherDelay int(11) DEFAULT NULL,
NASDelay int(11) DEFAULT NULL,
SecurityDelay int(11) DEFAULT NULL,
LateAircraftDelay int(11) DEFAULT NULL,
FirstDepTime varchar(10) DEFAULT NULL,
TotalAddGTime varchar(10) DEFAULT NULL,
LongestAddGTime varchar(10) DEFAULT NULL,
DivAirportLandings varchar(10) DEFAULT NULL,
DivReachedDest varchar(10) DEFAULT NULL,
DivActualElapsedTime varchar(10) DEFAULT NULL,
DivArrDelay varchar(10) DEFAULT NULL,
DivDistance varchar(10) DEFAULT NULL,
Div1Airport varchar(10) DEFAULT NULL,
Div1AirportID int(11) DEFAULT NULL,
Div1AirportSeqID int(11) DEFAULT NULL,
Div1WheelsOn varchar(10) DEFAULT NULL,
Div1TotalGTime varchar(10) DEFAULT NULL,
Div1LongestGTime varchar(10) DEFAULT NULL,
Div1WheelsOff varchar(10) DEFAULT NULL,
Div1TailNum varchar(10) DEFAULT NULL,
Div2Airport varchar(10) DEFAULT NULL,
Div2AirportID int(11) DEFAULT NULL,
Div2AirportSeqID int(11) DEFAULT NULL,
Div2WheelsOn varchar(10) DEFAULT NULL,
Div2TotalGTime varchar(10) DEFAULT NULL,
Div2LongestGTime varchar(10) DEFAULT NULL,
Div2WheelsOff varchar(10) DEFAULT NULL,
Div2TailNum varchar(10) DEFAULT NULL,
Div3Airport varchar(10) DEFAULT NULL,
Div3AirportID int(11) DEFAULT NULL,
Div3AirportSeqID int(11) DEFAULT NULL,
Div3WheelsOn varchar(10) DEFAULT NULL,
Div3TotalGTime varchar(10) DEFAULT NULL,
Div3LongestGTime varchar(10) DEFAULT NULL,
Div3WheelsOff varchar(10) DEFAULT NULL,
Div3TailNum varchar(10) DEFAULT NULL,
Div4Airport varchar(10) DEFAULT NULL,
Div4AirportID int(11) DEFAULT NULL,
Div4AirportSeqID int(11) DEFAULT NULL,
Div4WheelsOn varchar(10) DEFAULT NULL,
Div4TotalGTime varchar(10) DEFAULT NULL,
Div4LongestGTime varchar(10) DEFAULT NULL,
Div4WheelsOff varchar(10) DEFAULT NULL,
Div4TailNum varchar(10) DEFAULT NULL,
Div5Airport varchar(10) DEFAULT NULL,
Div5AirportID int(11) DEFAULT NULL,
Div5AirportSeqID int(11) DEFAULT NULL,
Div5WheelsOn varchar(10) DEFAULT NULL,
Div5TotalGTime varchar(10) DEFAULT NULL,
Div5LongestGTime varchar(10) DEFAULT NULL,
Div5WheelsOff varchar(10) DEFAULT NULL,
Div5TailNum varchar(10) DEFAULT NULL
) DEFAULT CHARSET=latin1"
killall -9 tikv-importer
killall -9 tidb-lightning
cd /mnt/evo970
./tidb/tidb-latest-linux-amd64/bin/tikv-importer --config tikv-importer.toml &
./tidb/tidb-latest-linux-amd64/bin/tidb-lightning -config tidb-lightning.toml
Success!
morgo@ryzen:~/bin$ time lightning-import-ontime
tikv-importer: no process found
tidb-lightning: no process found
Error: not a valid TiDB version: 5.7.25-TiDB-v3.0.0-beta-211-g09beefbe0-dirty
real 0m0.329s
user 0m0.040s
sys 0m0.037s
Versions of the cluster
- TiDB-Lightning version (run `tidb-lightning -V`):
```
Release Version: v2.1.5-3-g9c6a5f6
Git Commit Hash: 9c6a5f6
Git Branch: master
UTC Build Time: 2019-03-11 03:34:29
Go Version: go version go1.11.2 linux/amd64
```
- TiKV-Importer version (run `tikv-importer -V`)
```
TiKV Importer 3.0.0-beta
```
- TiKV version (run `tikv-server -V`):
```
TiKV 3.0.0-beta
```
- TiDB cluster version (execute `SELECT tidb_version();` in a MySQL client):
```
mysql> select tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: v3.0.0-beta-211-g09beefbe0-dirty
Git Commit Hash: 09beefbe045011e3c77608c9ed33da87c11efa94
Git Branch: master
UTC Build Time: 2019-03-13 05:38:59
GoVersion: go version go1.11.3 linux/amd64
Race Enabled: false
TiKV Min Version: 2.1.0-alpha.1-ff3dd160846b7d1aed9079c389fc188f7f5ea13e
Check Table Before Drop: false
1 row in set (0.00 sec)
```
- Other interesting information (system version, hardware config, etc):
Operation logs
- `tidb-lightning.log` for TiDB-Lightning if possible
- `tikv-importer.log` from TiKV-Importer if possible
Configuration of the cluster and the task
- `tidb-lightning.toml` for TiDB-Lightning if possible
- `tikv-importer.toml` for TiKV-Importer if possible
- `inventory.ini` if deployed by Ansible
Screenshot/exported-PDF of Grafana dashboard or metrics' graph in Prometheus for TiDB-Lightning if possible
Is your feature request related to a problem? Please describe:
Starting tikv-importer fails when `addr = "0.0.0.0:{port}"` is set in `tikv-importer.toml`. The `nohup.out` log file shows the reason:
invalid configuration: Other("[/rust/git/checkouts/tikv-71ea2042335c4528/ac6f026/src/server/config.rs:164]: invalid advertise-addr: \"0.0.0.0:18287\"")
but the `tikv-importer.log` file didn't show the error reason clearly:
[2020/03/04 11:11:21.002 +08:00] [INFO] [tikv-importer.rs:41] ["Welcome to TiKV Importer."]
[2020/03/04 11:11:21.003 +08:00] [INFO] [tikv-importer.rs:43] []
[2020/03/04 11:11:21.003 +08:00] [INFO] [tikv-importer.rs:43] ["Release Version: 3.0.8"]
[2020/03/04 11:11:21.003 +08:00] [INFO] [tikv-importer.rs:43] ["Git Commit Hash: a9f1e2dc6d20284a2c61b57f7bcfd84d161268f2"]
[2020/03/04 11:11:21.003 +08:00] [INFO] [tikv-importer.rs:43] ["Git Commit Branch: release-3.0"]
[2020/03/04 11:11:21.003 +08:00] [INFO] [tikv-importer.rs:43] ["UTC Build Time: 2019-12-31 01:01:07"]
[2020/03/04 11:11:21.003 +08:00] [INFO] [tikv-importer.rs:43] ["Rust Version: rustc 1.37.0-nightly (0e4a56b4b 2019-06-13)"]
[2020/03/04 11:11:21.003 +08:00] [INFO] [tikv-importer.rs:45] []
[2020/03/04 11:11:21.003 +08:00] [WARN] [lib.rs:545] ["environment variable `TZ` is missing, using `/etc/localtime`"]
[2020/03/04 11:11:21.024 +08:00] [FATAL] [lib.rs:499] ["called `Result::unwrap()` on an `Err` value: AddrParseError(())"] [backtrace="stack backtrace:\n 0: 0x55e0b882c3bd - backtrace::backtrace::trace::hcb2647c6d67dfa4f"] [location=src/libcore/result.rs:999] [thread_name=main]
Describe the feature you'd like:
- `tikv-importer.log` should show the error reason clearly.
- The `tikv-importer.toml` docs could note that users should not use '0.0.0.0' or a domain name as the value of `addr`.
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Optimization:
Please answer these questions before submitting your issue. Thanks!
[error="cannot decode settings from TiDB, please manually fill in `tidb.port` and `tidb.pd-addr`: json: cannot unmarshal bool into Go struct field Log.log.enable-slow-log of type uint32"]
Versions of the cluster
TiDB-Lightning version (run `tidb-lightning -V`):
v4.0.0-beta.1
TiDB cluster version (execute `SELECT tidb_version();` in a MySQL client):
v4.0.0-beta.1
log:
[INFO] [restore.go:1253] ["import and cleanup engine completed"] [engineTag=`test`.`favfav`:-1] [engineUUID=9c824252-d150-5902-a2c0-02d2f26ebf6c] [takeTime=8.515710692s] []
[INFO] [tidb.go:232] ["alter table auto_increment start"] [table=`test`.`favfav`] [auto_increment=1567203009]
Question:
Why does Lightning change the auto_increment value before checksum? The new value can be so big that it leaves a long gap between max(primary id) and newly inserted rows, with a risk of column overflow.
Background:
We use TiDB as the storage engine for user profiles. The profiles are recomputed on Spark in the early hours every day, and the results must all be written to TiDB before the morning peak.
The data volume is large: one profile set has about 300 million rows, several hundred columns, and more than 100 GB of data.
Because of this, we are considering the following solution:
Is your feature request related to a problem? Please describe:
If tidb-lightning is started too quickly after tikv-importer, it will lead to a connection error like this one:
[2020/01/25 14:29:54.578 -07:00] [INFO] [lightning.go:194] ["load data source start"]
[2020/01/25 14:29:54.581 -07:00] [INFO] [lightning.go:197] ["load data source completed"] [takeTime=3.600785ms] []
[2020/01/25 14:29:54.582 -07:00] [INFO] [restore.go:251] ["the whole procedure start"]
[2020/01/25 14:29:54.587 -07:00] [INFO] [restore.go:289] ["restore table schema start"] [db=ontime]
[2020/01/25 14:29:54.635 -07:00] [INFO] [tikv-importer.rs:40] ["Welcome to TiKV Importer."]
[2020/01/25 14:29:54.635 -07:00] [INFO] [tikv-importer.rs:42] []
[2020/01/25 14:29:54.635 -07:00] [INFO] [tikv-importer.rs:42] ["Release Version: 4.0.0-beta"]
[2020/01/25 14:29:54.635 -07:00] [INFO] [tikv-importer.rs:42] ["Git Commit Hash: 53734730d7885ea3ade4ee92e433132d8e6c0f02"]
[2020/01/25 14:29:54.635 -07:00] [INFO] [tikv-importer.rs:42] ["Git Commit Branch: release-4.0"]
[2020/01/25 14:29:54.635 -07:00] [INFO] [tikv-importer.rs:42] ["UTC Build Time: 2020-01-17 02:37:07"]
[2020/01/25 14:29:54.635 -07:00] [INFO] [tikv-importer.rs:42] ["Rust Version: rustc 1.42.0-nightly (0de96d37f 2019-12-19)"]
[2020/01/25 14:29:54.636 -07:00] [INFO] [tikv-importer.rs:44] []
[2020/01/25 14:29:54.636 -07:00] [WARN] [lib.rs:528] ["environment variable `TZ` is missing, using `/etc/localtime`"]
[2020/01/25 14:29:54.639 -07:00] [INFO] [tikv-importer.rs:163] ["import server started"]
[2020/01/25 14:29:54.690 -07:00] [INFO] [tidb.go:99] ["create tables start"] [db=ontime]
[2020/01/25 14:29:54.851 -07:00] [INFO] [tidb.go:117] ["create tables completed"] [db=ontime] [takeTime=161.385782ms] []
[2020/01/25 14:29:54.851 -07:00] [INFO] [restore.go:297] ["restore table schema completed"] [db=ontime] [takeTime=264.193394ms] []
[2020/01/25 14:29:54.856 -07:00] [INFO] [restore.go:551] ["restore all tables data start"]
[2020/01/25 14:29:54.856 -07:00] [INFO] [restore.go:572] ["restore table start"] [table=`ontime`.`ontime`]
[2020/01/25 14:29:54.856 -07:00] [INFO] [restore.go:1314] ["load engines and files start"] [table=`ontime`.`ontime`]
[2020/01/25 14:29:54.857 -07:00] [INFO] [restore.go:1343] ["load engines and files completed"] [table=`ontime`.`ontime`] [enginesCnt=2] [filesCnt=314] [takeTime=954.206µs] []
[2020/01/25 14:29:54.858 -07:00] [ERROR] [restore.go:575] ["restore table failed"] [table=`ontime`.`ontime`] [takeTime=1.396899ms] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:8287: connect: connection refused\""]
[2020/01/25 14:29:54.858 -07:00] [ERROR] [restore.go:670] ["restore all tables data failed"] [takeTime=1.688492ms] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:8287: connect: connection refused\""]
[2020/01/25 14:29:54.858 -07:00] [ERROR] [restore.go:266] ["run failed"] [step=2] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:8287: connect: connection refused\""]
Error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:8287: connect: connection refused"
[2020/01/25 14:29:54.858 -07:00] [ERROR] [restore.go:272] ["the whole procedure failed"] [takeTime=275.872115ms] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:8287: connect: connection refused\""]
[2020/01/25 14:29:54.858 -07:00] [WARN] [restore.go:445] ["stopping periodic actions"] [error="context canceled"]
[2020/01/25 14:29:54.858 -07:00] [ERROR] [main.go:59] ["tidb lightning encountered error"] [error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:8287: connect: connection refused\""] [errorVerbose="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:8287: connect: connection refused\"\ngithub.com/pingcap/errors.AddStack\n\t/home/jenkins/agent/workspace/release_tidb_4.0/go/pkg/mod/github.com/pingcap/[email protected]/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/home/jenkins/agent/workspace/release_tidb_4.0/go/pkg/mod/github.com/pingcap/[email protected]/juju_adaptor.go:15\ngithub.com/pingcap/tidb-lightning/lightning/backend.(*importer).OpenEngine\n\t/home/jenkins/agent/workspace/release_tidb_4.0/go/src/github.com/pingcap/tidb-lightning/lightning/backend/importer.go:111\ngithub.com/pingcap/tidb-lightning/lightning/backend.Backend.OpenEngine\n\t/home/jenkins/agent/workspace/release_tidb_4.0/go/src/github.com/pingcap/tidb-lightning/lightning/backend/backend.go:180\ngithub.com/pingcap/tidb-lightning/lightning/restore.(*TableRestore).restoreEngines\n\t/home/jenkins/agent/workspace/release_tidb_4.0/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:740\ngithub.com/pingcap/tidb-lightning/lightning/restore.(*TableRestore).restoreTable\n\t/home/jenkins/agent/workspace/release_tidb_4.0/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:714\ngithub.com/pingcap/tidb-lightning/lightning/restore.(*RestoreController).restoreTables.func1\n\t/home/jenkins/agent/workspace/release_tidb_4.0/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:574\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337"]
Describe the feature you'd like:
My workaround is to start lightning with a sleep call in between launching components:
#!/bin/sh
mysql -e 'drop database if exists ontime'
killall -9 tikv-importer
killall -9 tidb-lightning
BASE=/mnt/evo860
cd $BASE
rm -rf $BASE/tmp && mkdir -p $BASE/tmp
./bin/tikv-importer --import-dir $BASE/tmp --log-level info -A 0.0.0.0:8287 &
sleep 5
./bin/tidb-lightning -d $BASE/data-sets/ontime-data -importer localhost:8287
tidb-server and tikv-server do not require this sleep call. They can be started slightly before a pd-server, and will retry their connectivity. This is useful for distributed systems since it is hard to guarantee exact order of startup.
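A hedged sketch of the workaround without a fixed sleep (hypothetical helper; Lightning itself does not do this yet): poll the importer address until it accepts TCP connections before starting the import.

```python
import socket
import time

def wait_for_importer(addr, timeout=60.0, interval=0.5):
    """Poll `addr` ("host:port") until a TCP connection succeeds.

    Mirrors how tidb-server/tikv-server keep retrying their PD
    connection, so the exact start order of tikv-importer and
    tidb-lightning stops mattering.  Returns True once connected,
    False after `timeout` seconds.
    """
    host, port = addr.rsplit(":", 1)
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, int(port)), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

The cleaner fix, of course, is for Lightning to retry the gRPC dial itself rather than requiring such a wrapper in every startup script.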
Describe alternatives you've considered:
N/A
Teachability, Documentation, Adoption, Optimization:
Same behavior as tidb-server and tikv-server.
Please answer these questions before submitting your issue. Thanks!
Generate test SQL files via dbgen with the template
dbgen -i /dev/stdin -o . -t db1903_baofu.app_cashPlatform -n 1000 -r 100 -k 1 -j 40 --escape-backslash --time-zone Asia/Shanghai <<'SCHEMAEOF'
CREATE TABLE _ (
`id` BIGINT(20) NOT NULL AUTO_INCREMENT {{ rownum }},
`orderNo` VARCHAR(128) DEFAULT NULL COMMENT '交易号' {{ rand.regex('[0-9a-z]{128}', 'i') }},
`memberId` VARCHAR(512) DEFAULT NULL COMMENT '会员ID' {{ rand.regex('[0-9a-z]{512}', 'i') }},
`businessType` VARCHAR(64) DEFAULT NULL COMMENT '业务类型' {{ rand.regex('[0-9a-z]{64}', 'i') }},
`businessName` VARCHAR(512) DEFAULT NULL COMMENT '业务名称' {{ rand.regex('[0-9a-z]{512}', 'i') }},
`productName` VARCHAR(512) DEFAULT NULL COMMENT '产品名称' {{ rand.regex('[0-9a-z]{512}', 'i') }},
`productId` VARCHAR(10) DEFAULT NULL COMMENT '产品ID' {{ rand.regex('[0-9a-z]{10}', 'i') }},
`categoryId` VARCHAR(10) DEFAULT NULL COMMENT '产品类别' {{ rand.regex('[0-9a-z]{10}', 'i') }},
`amount` DECIMAL(18,8) DEFAULT '0.00000000' COMMENT '金额' {{ rand.regex('[0-9]{18}\.[0-9]{8}') }},
`amountEncrypt` VARCHAR(512) DEFAULT NULL COMMENT '金额加密' {{ rand.regex('[0-9a-z]{512}', 'i') }},
`payAmount` DECIMAL(18,8) DEFAULT '0.00000000' COMMENT '实际付款金额' {{ rand.regex('[0-9]{18}\.[0-9]{8}') }},
`orderTime` DATETIME DEFAULT NULL COMMENT '单据创建时间' {{ rand.u31_timestamp() }},
`createTime` DATETIME DEFAULT NULL COMMENT '创建时间' {{ rand.u31_timestamp() }},
`createUserId` VARCHAR(512) DEFAULT NULL COMMENT '创建用户ID' {{ rand.regex('[0-9a-z]{512}', 'i') }},
`state` INT(4) DEFAULT NULL COMMENT '状态 0:待付款|1:已付款|2:已完成|-1:失败' {{ rand.range_inclusive(0, 9999) }},
`stateMsg` VARCHAR(1024) DEFAULT NULL COMMENT '状态原因' {{ rand.regex('[0-9a-z]{1024}', 'i') }},
`updateTime` DATETIME DEFAULT NULL COMMENT '更新时间' {{ rand.u31_timestamp() }},
`updateUserId` VARCHAR(512) DEFAULT NULL COMMENT '更新用户ID' {{ rand.regex('[0-9a-z]{512}', 'i') }},
`isDelete` INT(4) DEFAULT '0' COMMENT '是否删除' {{ rand.range_inclusive(0, 9999) }},
`deleteUserId` VARCHAR(512) DEFAULT NULL COMMENT '删除用户ID' {{ rand.regex('[0-9a-z]{512}', 'i') }},
`deleteTime` DATETIME DEFAULT NULL COMMENT '删除时间' {{ rand.u31_timestamp() }},
`appid` INT(8) DEFAULT NULL COMMENT '来源 0:PC|1:手机|99:出款' {{ rand.range_inclusive(0, 99999999) }},
PRIMARY KEY(`id`),
INDEX `pay_unique`(`orderNo`, `businessType`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 10405 DEFAULT CHARACTER SET = UTF8;
SCHEMAEOF
Lightning imports the table without any errors.
2019/03/14 14:33:22.427 main.go:65: [error] tidb lightning encountered error:table info app_cashplatform not found
github.com/pingcap/tidb-lightning/lightning/restore.(*RestoreController).restoreTables
/home/jenkins/workspace/lightning_ghpr_build/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:452
github.com/pingcap/tidb-lightning/lightning/restore.(*RestoreController).restoreTables-fm
/home/jenkins/workspace/lightning_ghpr_build/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:216
github.com/pingcap/tidb-lightning/lightning/restore.(*RestoreController).Run
/home/jenkins/workspace/lightning_ghpr_build/go/src/github.com/pingcap/tidb-lightning/lightning/restore/restore.go:225
github.com/pingcap/tidb-lightning/lightning.(*Lightning).run
/home/jenkins/workspace/lightning_ghpr_build/go/src/github.com/pingcap/tidb-lightning/lightning/lightning.go:132
github.com/pingcap/tidb-lightning/lightning.(*Lightning).Run.func2
/home/jenkins/workspace/lightning_ghpr_build/go/src/github.com/pingcap/tidb-lightning/lightning/lightning.go:83
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1333
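This looks like a case-sensitivity mismatch: the data files use `app_cashPlatform` while the table-info lookup uses the lower-cased `app_cashplatform`. A sketch of a case-insensitive lookup that would tolerate it (hypothetical helper, not Lightning's actual code):

```python
def find_table_info(table_infos, schema, table):
    """Look up table metadata ignoring case, matching MySQL's
    lower_case_table_names=2-style behaviour where names keep their
    original case but are compared case-insensitively."""
    wanted = (schema.lower(), table.lower())
    for (db, tbl), info in table_infos.items():
        if (db.lower(), tbl.lower()) == wanted:
            return info
    return None
```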
Versions of the cluster
TiDB-Lightning version (run tidb-lightning -V):
(paste TiDB-Lightning version here)
TiKV-Importer version (run tikv-importer -V):
(paste TiKV-Importer version here)
TiKV version (run tikv-server -V):
(paste TiKV version here)
TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):
(paste TiDB cluster version here)
Other interesting information (system version, hardware config, etc):
Operation logs
- tidb-lightning.log for TiDB-Lightning if possible
- tikv-importer.log from TiKV-Importer if possible
Configuration of the cluster and the task
- tidb-lightning.toml for TiDB-Lightning if possible
- tikv-importer.toml for TiKV-Importer if possible
- inventory.ini if deployed by Ansible
Screenshot/exported-PDF of Grafana dashboard or metrics' graph in Prometheus for TiDB-Lightning if possible
Please answer these questions before submitting your issue. Thanks!
kn t1 logs -f lighting-tidb-lightning-qjtfs
flag provided but not defined: -tidb-password
Usage:
-L string
log level: info, debug, warn, error, fatal (default "info")
-V print version of lightning
-backend string
delivery backend ("importer" or "mysql")
-c string
(deprecated alias of -config)
-config string
tidb-lightning configuration file
-d string
Directory of the dump to import
-importer string
address (host:port) to connect to tikv-importer
-log-file string
log file path
-pd-urls string
PD endpoint address
-server-mode
start Lightning in server mode, wait for multiple tasks instead of starting immediately
-status-addr string
the Lightning server address
-tidb-host string
TiDB server host
-tidb-port int
TiDB server port (default 4000)
-tidb-status int
TiDB server status port (default 10080)
-tidb-user string
TiDB user name to connect
Failed to parse command flags: flag provided but not defined: -tidb-password
What did you expect to see?
Can set tidb-password
What did you see instead?
Failed to parse command flags: flag provided but not defined: -tidb-password
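Since this build of tidb-lightning does not accept a -tidb-password flag, a workaround sketch is to put the password in the [tidb] section of the configuration file instead (key names follow the tidb-lightning.toml layout; all values below are placeholders):

```shell
# Sketch of a workaround: supply the TiDB password through
# tidb-lightning.toml rather than a command-line flag.
cat > tidb-lightning.toml <<'EOF'
[tidb]
host = "127.0.0.1"
port = 4000
user = "root"
password = "secret"
status-port = 10080
EOF
# then run with: tidb-lightning -config tidb-lightning.toml -d /path/to/dump
```

This keeps the credential out of the process arguments as well, which is generally preferable for passwords.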
Versions of the cluster
TiDB-Lightning version (run tidb-lightning -V):
v3.0.8
TiKV-Importer version (run tikv-importer -V):
(paste TiKV-Importer version here)
TiKV version (run tikv-server -V):
(paste TiKV version here)
TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):
(paste TiDB cluster version here)
Other interesting information (system version, hardware config, etc):
Operation logs
- tidb-lightning.log for TiDB-Lightning if possible
- tikv-importer.log from TiKV-Importer if possible
Configuration of the cluster and the task
- tidb-lightning.toml for TiDB-Lightning if possible
- tikv-importer.toml for TiKV-Importer if possible
- inventory.ini if deployed by Ansible
Screenshot/exported-PDF of Grafana dashboard or metrics' graph in Prometheus for TiDB-Lightning if possible