Comments (3)
You can refer to TiDB's approach: https://zhuanlan.zhihu.com/p/55397024
from incubator-pegasus.
@hycdong, please record the first draft of the design document here.
The basic idea follows zuoyan's comment. We first create RocksDB SST files on remote storage such as HDFS, and then ingest those files into Pegasus. RocksDB supports this workflow; see Creating-and-Ingesting-SST-files.
Basic design
- Creating SST files on HDFS
  - Use RocksDBJava with Spark or MapReduce to write key-value pairs into SST files on HDFS.
  Tips:
  We plan to support converting files in a specific format, or HBase HFiles, into the Pegasus key structure. The details are still under discussion.
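As a rough illustration of the preparation this step implies, the sketch below groups key-value pairs by partition and sorts each group by key, because RocksDB's SST file writer expects keys in ascending order. The `partition_of` hash (CRC32) and all function names are assumptions made for illustration; this is neither the Pegasus key structure nor its real partition hash, and the actual files would be written with RocksDBJava.

```python
import zlib
from collections import defaultdict


def partition_of(key: bytes, partition_count: int) -> int:
    # Stand-in hash (CRC32), chosen only for this sketch.
    # The real Pegasus partition hash is not reproduced here.
    return zlib.crc32(key) % partition_count


def plan_sst_input(pairs, partition_count):
    """Group (key, value) pairs by partition and sort each group by key.

    Duplicate keys are not resolved here; a real job would have to
    deduplicate them before handing the data to an SST writer.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[partition_of(key, partition_count)].append((key, value))
    # Python compares bytes lexicographically, byte by byte, which matches
    # the ordering of RocksDB's default bytewise comparator.
    return {
        pidx: sorted(items, key=lambda kv: kv[0])
        for pidx, items in groups.items()
    }
```

Each value in the returned mapping is the ordered input for one partition's SST file. Because the grouping depends on the partition count, the files would have to be regenerated if that count changed.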
- Replica servers download SST files from HDFS
  - The client sends a start-bulk-load request (including the app and remote file storage information) to the meta server.
  - The meta server sends a bulk load request to the primary replica of each partition.
  - The primary replica receives the request from the meta server, downloads the SST files, notifies the secondaries to download the files, and reports its own download progress.
  - Each secondary replica downloads the SST files and reports its download progress through group_check.
  - The primary replica reports the download progress to the meta server.
  - The meta server waits until every partition, including its primary and secondaries, has finished downloading.
  Tips:
  - The primary and the secondaries all download files from HDFS directly, which avoids learning and data transfer between replicas that may affect normal read and write operations.
  - The server restricts how many replicas can download files at the same time.
  - The download task code has lower priority than normal write operations.
  In short, we should guarantee that the download stage has minimal effect on normal user operations.
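The download stage above can be sketched roughly as follows. This is a minimal illustration, not the Pegasus implementation: `DownloadLimiter`, `partition_progress`, and `all_partitions_downloaded` are hypothetical names, progress is assumed to be a percentage from 0 to 100, and a semaphore stands in for whatever mechanism the server uses to restrict concurrent downloads.

```python
import threading


class DownloadLimiter:
    """Caps how many replicas download at once (hypothetical helper)."""

    def __init__(self, max_concurrent: int):
        self._sem = threading.Semaphore(max_concurrent)
        self._lock = threading.Lock()
        self.active = 0  # downloads currently running
        self.peak = 0    # highest concurrency observed so far

    def run(self, download_fn, *args):
        # Block until one of the max_concurrent slots is free.
        with self._sem:
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                return download_fn(*args)
            finally:
                with self._lock:
                    self.active -= 1


def partition_progress(replica_progress: dict) -> int:
    # Assumption: a partition is only as far along as its slowest replica.
    return min(replica_progress.values())


def all_partitions_downloaded(progress_by_partition: dict) -> bool:
    # The meta server proceeds only when every replica of every
    # partition has reported 100 percent.
    return all(
        partition_progress(replicas) == 100
        for replicas in progress_by_partition.values()
    )
```

Taking the minimum over replicas means a single slow secondary keeps the whole partition, and therefore the whole bulk load, in the download stage.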
- Replica servers ingest SST files
  - The meta server sends an ingestion request to the primary replica of each partition.
  - The primary replica treats the ingestion request as a special write request and executes the 2PC stage.
  - An empty write request is issued after the primary commits, to ensure that the secondaries commit the ingestion request.
  Tips:
  RocksDB blocks write operations while it ingests files. As a result, a replica in the ingestion stage also rejects all user operations. We should keep this time as short as possible.
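To make the write-blocking behaviour concrete, here is a minimal sketch of a replica that rejects user writes while an ingestion is in progress. The `Replica` class, its method names, and the `IngestStatus` states are hypothetical, and a dictionary merge stands in for the actual RocksDB ingestion; the 2PC exchange between primary and secondaries is not modelled.

```python
from enum import Enum


class IngestStatus(Enum):
    IDLE = "idle"
    INGESTING = "ingesting"


class Replica:
    """Toy replica that refuses user writes during ingestion."""

    def __init__(self):
        self.status = IngestStatus.IDLE
        self.data = {}

    def write(self, key: bytes, value: bytes) -> bool:
        # User writes are rejected for the whole ingestion window.
        if self.status is IngestStatus.INGESTING:
            return False
        self.data[key] = value
        return True

    def begin_ingestion(self) -> None:
        self.status = IngestStatus.INGESTING

    def finish_ingestion(self, sst_contents: dict) -> None:
        # Stand-in for RocksDB ingesting the downloaded SST files.
        try:
            self.data.update(sst_contents)
        finally:
            # Always reopen the replica for writes, even on failure.
            self.status = IngestStatus.IDLE
```

The window between `begin_ingestion` and `finish_ingestion` is the period during which users see rejected requests, which is why it should be kept as short as possible.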
Relation with other functions
- Write operations: the ingestion stage blocks user write operations.
- Partition split: cannot run at the same time as bulk load, because a split changes the partition count, which affects SST file generation.
- Access control: the user should have write permission on the target cluster.
- Duplication, cold backup, and CU calculation: WIP