Comments (7)
/assign @congqixia
/unassign
from milvus.
Is there an existing issue for this?
- I have searched the existing issues
Environment
- Milvus version: 2.3.14
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): kafka
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS):
- CPU/Memory: 256G
- GPU:
- Others:
Current Behavior
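The application submits data one record at a time, which produced about 440,000 segments in the dropped state and 500,000 segments in the flushed state. When the cluster ran short of resources, the Milvus service crashed. On restart, the datacoord node reports the following error while reading data from etcd. datacoord log: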
[2024/05/24 04:52:56.013 +00:00] [INFO] [datacoord/channel_checker.go:113] ["timer started"] ["watch state"=ToWatch] [nodeID=1318] [channelName=by-dev-rootcoord-dml_6_444366786892873263v0] ["check interval"=15m0s]
{"level":"warn","ts":"2024-05-24T04:52:56.019Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0005c6e00/xxxxxxxx:2379","attempt":0,"error":"rpc error: code = ResourceExhausted desc = trying to send message larger than max (3386659 vs. 2097152)"}
[2024/05/24 04:52:56.271 +00:00] [WARN] [datacoord/channel_manager.go:639] ["fail to update"] [updates="["{type=Delete,nodeID=1055,channels="[by-dev-rootcoord-dml_11_444366786892873263v5, by-dev-rootcoord-dml_13_444366786893627283v2, by-dev-rootcoord-dml_2_444366786893627212v0, by-dev-rootcoord-dml_4_444366786893627212v2, by-dev-rootcoord-dml_6_444366786892873263v0, by-dev-rootcoord-dml_7_444366786892873263v1, by-dev-rootcoord-dml_8_444366786892873263v2, by-dev-rootcoord-dml_9_444366786892873263v3]"}","{type=Add,nodeID=1317,channels="[by-dev-rootcoord-dml_13_444366786893627283v2, by-dev-rootcoord-dml_7_444366786892873263v1]"}","{type=Add,nodeID=1316,channels="[by-dev-rootcoord-dml_2_444366786893627212v0, by-dev-rootcoord-dml_8_444366786892873263v2]"}","{type=Add,nodeID=1214,channels="[by-dev-rootcoord-dml_4_444366786893627212v2, by-dev-rootcoord-dml_9_444366786892873263v3]"}","{type=Add,nodeID=1318,channels="[by-dev-rootcoord-dml_11_444366786892873263v5, by-dev-rootcoord-dml_6_444366786892873263v0]"}"]"] [error="rpc error: code = ResourceExhausted desc = trying to send message larger than max (3386659 vs. 2097152)"]
[2024/05/24 04:52:56.271 +00:00] [INFO] [datacoord/channel_checker.go:155] ["remove timer for channel"] [channel=by-dev-rootcoord-dml_13_444366786893627283v2] [timerCount=12]
[2024/05/24 04:52:56.271 +00:00] [INFO] [datacoord/channel_checker.go:155] ["remove timer for channel"] [channel=by-dev-rootcoord-dml_7_444366786892873263v1] [timerCount=11]
[2024/05/24 04:52:56.271 +00:00] [INFO] [datacoord/channel_checker.go:155] ["remove timer for channel"] [channel=by-dev-rootcoord-dml_2_444366786893627212v0] [timerCount=10]
[2024/05/24 04:52:56.271 +00:00] [INFO] [datacoord/channel_checker.go:155] ["remove timer for channel"] [channel=by-dev-rootcoord-dml_8_444366786892873263v2] [timerCount=9]
[2024/05/24 04:52:56.271 +00:00] [INFO] [datacoord/channel_checker.go:155] ["remove timer for channel"] [channel=by-dev-rootcoord-dml_4_444366786893627212v2] [timerCount=8]
[2024/05/24 04:52:56.271 +00:00] [INFO] [datacoord/channel_checker.go:155] ["remove timer for channel"] [channel=by-dev-rootcoord-dml_9_444366786892873263v3] [timerCount=7]
[2024/05/24 04:52:56.271 +00:00] [INFO] [datacoord/channel_checker.go:155] ["remove timer for channel"] [channel=by-dev-rootcoord-dml_11_444366786892873263v5] [timerCount=6]
[2024/05/24 04:52:56.271 +00:00] [INFO] [datacoord/channel_checker.go:155] ["remove timer for channel"] [channel=by-dev-rootcoord-dml_6_444366786892873263v0] [timerCount=5]
[2024/05/24 04:52:56.271 +00:00] [WARN] [datacoord/server.go:516] ["DataCoord Cluster Manager failed to start up"] [error="rpc error: code = ResourceExhausted desc = trying to send message larger than max (3386659 vs. 2097152)"]
[2024/05/24 04:52:56.271 +00:00] [ERROR] [datacoord/server.go:314] ["DataCoord init failed"] [error="rpc error: code = ResourceExhausted desc = trying to send message larger than max (3386659 vs. 2097152)"] [stack="github.com/milvus-io/milvus/internal/datacoord.(*Server).Init.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:314\ngithub.com/milvus-io/milvus/internal/util/sessionutil.(*Session).ProcessActiveStandBy\n\t/go/src/github.com/milvus-io/milvus/internal/util/sessionutil/session_util.go:1103\ngithub.com/milvus-io/milvus/internal/datacoord.(*Server).Register.func2\n\t/go/src/github.com/milvus-io/milvus/internal/datacoord/server.go:266"]
[2024/05/24 04:52:56.271 +00:00] [INFO] [datacoord/channel_checker.go:134] ["stop timer before timeout"] ["watch state"=ToWatch] [nodeID=1316] [channelName=by-dev-rootcoord-dml_8_444366786892873263v2] ["timeout interval"=15m0s] [runningTimerCount=5]
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
No response
Anything else?
No response
You should not run flush on every insertion.
You are hitting an issue where you have too many segments and
You can tune the RPC size limit to recover the cluster temporarily, but this is not how Milvus is designed to be used.
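To judge how far any limit would have to be raised, the two numbers in the error message can be read straight from the log. The snippet below is only a sketch: `parse_size_error` is a hypothetical helper, not part of Milvus or etcd, and it assumes the error text has the same shape as in the datacoord log above.

```python
import re
from typing import Optional, Tuple

# Matches the size pair in the gRPC error text seen in the datacoord log.
_SIZE_RE = re.compile(r"larger than max \((\d+) vs\. (\d+)\)")


def parse_size_error(message: str) -> Optional[Tuple[int, int]]:
    """Return (attempted_bytes, limit_bytes), or None if the text does not match."""
    match = _SIZE_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


error = (
    "rpc error: code = ResourceExhausted desc = "
    "trying to send message larger than max (3386659 vs. 2097152)"
)
attempted, limit = parse_size_error(error)
# Convert bytes to MiB for readability.
print(f"attempted {attempted / 2**20:.2f} MiB, limit {limit / 2**20:.2f} MiB")
```

Here the request is about 3.23 MiB against a 2 MiB limit, so the message is roughly 1.6 times too large. The message keeps growing with the segment count, which is why raising the limit is only a temporary fix.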
The recommended way is to write all the data and then flush once, or not flush at all.
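A minimal sketch of that pattern follows. The helpers `batched` and `insert_in_batches` are hypothetical names made up for illustration, and the insert and flush callables are passed in so the example runs without a Milvus server. With pymilvus they would typically be `Collection.insert` and `Collection.flush`, assuming the rows are in a format your pymilvus version accepts.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Optional


def batched(rows: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def insert_in_batches(
    rows: Iterable[dict],
    insert_fn: Callable[[List[dict]], object],
    flush_fn: Optional[Callable[[], object]] = None,
    batch_size: int = 1000,
) -> int:
    """Insert rows batch by batch, flushing at most once after all inserts."""
    total = 0
    for batch in batched(rows, batch_size):
        insert_fn(batch)  # one request per batch instead of one per row
        total += len(batch)
    if flush_fn is not None:
        flush_fn()  # a single flush at the end, never inside the loop
    return total


# Stand-ins for a real client, so the call pattern can be inspected.
insert_calls: List[int] = []
flush_calls: List[str] = []
rows = [{"id": i, "vector": [0.0, 0.0]} for i in range(2500)]
inserted = insert_in_batches(
    rows,
    insert_fn=lambda batch: insert_calls.append(len(batch)),
    flush_fn=lambda: flush_calls.append("flush"),
)

# Possible wiring with pymilvus (needs a running Milvus; "demo" is a
# hypothetical collection name):
#   from pymilvus import Collection, connections
#   connections.connect(host="localhost", port="19530")
#   collection = Collection("demo")
#   insert_in_batches(rows, collection.insert, collection.flush)
```

With 2,500 rows and the default batch size this issues three insert requests and one flush, instead of 2,500 inserts each followed by a flush. Passing `flush_fn=None` skips the flush entirely.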
We will change from single inserts to batch inserts.
The key is that you should not call flush during insertion. The data is visible even if it has not been flushed.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.