Comments (3)
Making sure the upgrade goes smoothly is very important.
To simplify the work we need to do, maybe we can keep the delegator at the QueryNode and not move it to the streaming service.
One open question is how many stream nodes need to be upgraded, and how large they are.
To upgrade smoothly, the streaming node needs to assign timestamps and try to merge data from the TTstream and the Proxy inserts.
In 2.5 we can keep the delegator at the QueryNode, and move it to the streaming node in 3.0.
from milvus.
What about naming it the streaming service?
Streaming Service Upgrading In Milvus 2.5
Dependency Specification
Version 2.4 relies on the pub/sub capability of MQ for both reading and writing paths to support data persistence and querying of streaming data respectively.
In version 2.5, the pub/sub API is provided by StreamNode, and MQ reading and writing are encapsulated within StreamNode.
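The encapsulation described above can be pictured with a minimal sketch. It is not the Milvus implementation: `InMemoryMQ`, `StreamNodeSketch`, `produce`, and `consume` are hypothetical names chosen for illustration, and the in-memory list stands in for a real MQ. The only point is that callers talk to the stream node and never touch the MQ directly.

```python
from collections import defaultdict
from typing import Dict, List


class InMemoryMQ:
    """Stand-in for the real message queue (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._topics: Dict[str, List[bytes]] = defaultdict(list)

    def append(self, topic: str, payload: bytes) -> int:
        """Store a message and return its offset within the topic."""
        self._topics[topic].append(payload)
        return len(self._topics[topic]) - 1

    def read_from(self, topic: str, offset: int) -> List[bytes]:
        """Return all messages of the topic starting at the given offset."""
        return list(self._topics[topic][offset:])


class StreamNodeSketch:
    """Hypothetical facade: exposes pub/sub and hides the MQ behind it."""

    def __init__(self, mq: InMemoryMQ) -> None:
        self._mq = mq  # the MQ is private; clients only see produce/consume

    def produce(self, pchannel: str, payload: bytes) -> int:
        return self._mq.append(pchannel, payload)

    def consume(self, pchannel: str, from_offset: int = 0) -> List[bytes]:
        return self._mq.read_from(pchannel, from_offset)


if __name__ == "__main__":
    node = StreamNodeSketch(InMemoryMQ())
    node.produce("pchannel-0", b"insert-1")
    node.produce("pchannel-0", b"insert-2")
    print(node.consume("pchannel-0", from_offset=1))
```

In 2.4 the equivalent of `InMemoryMQ` would be used directly by both the write path and the read path; in this sketch of 2.5 only the stream node holds a reference to it.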
Write path
TimeTick lifetime
Significant changes in dependency order between versions 2.4 and 2.5 imply that the existing upgrade plan cannot meet the requirements.
Upgrade plan
[Plan 1] Upgrade with downtime
- Stop writing on the client side.
- Execute flushAll to flush all data from MQ to disk.
- Stop the version 2.4 cluster.
- Start the version 2.5 cluster.
[Plan 2] Upgrade with no downtime
- Upgrade MixCoord (RootCoord, QueryCoord, DataCoord, and StreamCoord). At this point:
  - In version 2.5, RootCoord still needs to execute the TimeTick logic.
  - After the upgrade, the read and write paths are unchanged compared to version 2.4.
- Stop DataNode. The flush process is terminated, but Proxy can still accept all requests.
- Start StreamNode. Each pchannel is allocated to a stream node and is ready for subscription by the stream node client at this point.
- Upgrade QueryNode. Each new QueryNode subscribes to its vchannels through the stream node client, while each old QueryNode continues to consume streaming data through the MQ client.
- Upgrade Proxy. Once all proxies are upgraded:
  - Stop the TT-sending logic on RootCoord.
  - Enable data insertion through the stream node client on the Proxy.
- Stop IndexNode.
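The ordering constraints of Plan 2 can be sketched as a small checker. This is an illustration under assumptions, not part of any Milvus tooling: the step names, `next_step`, and `rootcoord_sends_timetick` are hypothetical. It encodes two things stated in the plan: the steps run in a fixed order, and RootCoord keeps sending TimeTick until the Proxy upgrade step has completed.

```python
from typing import List, Optional

# Hypothetical step names mirroring the Plan 2 list, in order.
PLAN2_STEPS = [
    "upgrade_mixcoord",
    "stop_datanode",
    "start_streamnode",
    "upgrade_querynode",
    "upgrade_proxy",
    "stop_indexnode",
]


def next_step(completed: List[str]) -> Optional[str]:
    """Return the next step to run, or None when the plan is finished.

    Raises ValueError if the completed steps are not a prefix of the plan,
    i.e. a step was skipped or run out of order.
    """
    if completed != PLAN2_STEPS[: len(completed)]:
        raise ValueError(f"steps out of order: {completed}")
    if len(completed) == len(PLAN2_STEPS):
        return None
    return PLAN2_STEPS[len(completed)]


def rootcoord_sends_timetick(completed: List[str]) -> bool:
    """RootCoord keeps the TT logic until all proxies have been upgraded."""
    return "upgrade_proxy" not in completed


if __name__ == "__main__":
    done: List[str] = []
    while (step := next_step(done)) is not None:
        done.append(step)
        print(step, "| RootCoord sends TT:", rootcoord_sends_timetick(done))
```

A real rollout would also need per-step health checks and a rollback path; those are outside what the plan above specifies, so the sketch leaves them out.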
Pros and cons:
- With Plan 1, we can remove all deprecated code at once, and even use a local WAL implementation directly instead of rocksmq.
- Plan 2 will be smoother than Plan 1. However, if a query node rolling upgrade takes a while and a significant number of write requests are received during this period, the growing segments will consume more memory. Consequently, additional memory might be required for the QueryNode.
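The memory concern for Plan 2 can be made concrete with a back-of-the-envelope estimate. The formula below is an assumption for illustration, not a measured model: it supposes that everything written during the QueryNode rolling upgrade stays in growing segments, and `overhead_factor` is a hypothetical multiplier for in-memory overhead on top of the raw write volume.

```python
def extra_growing_memory_bytes(
    write_bytes_per_sec: float,
    upgrade_seconds: float,
    overhead_factor: float = 1.0,
) -> float:
    """Rough upper bound on extra growing-segment memory during the upgrade.

    Assumes all data written while the rolling upgrade is in progress is
    held in growing segments (an assumption, not a measurement).
    """
    if write_bytes_per_sec < 0 or upgrade_seconds < 0 or overhead_factor < 0:
        raise ValueError("inputs must be non-negative")
    return write_bytes_per_sec * upgrade_seconds * overhead_factor


if __name__ == "__main__":
    # Example: 10 MiB/s of writes during a 10-minute rolling upgrade.
    extra = extra_growing_memory_bytes(10 * 1024 * 1024, 600)
    print(f"{extra / 1024 ** 3:.2f} GiB of additional memory across QueryNodes")
```

An estimate like this is only useful for sizing headroom before the upgrade; actual usage depends on how the data is distributed across QueryNodes and on the segment layout.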
Version 2.5 serves as a transitional version to 3.0. For now we can take Plan 2 to ensure a smooth upgrade from 2.4, and clean up the code after upgrading to 3.0.