Comments (3)

qinzuoyan commented on May 24, 2024

We can refer to TiDB's approach: https://zhuanlan.zhihu.com/p/55397024

from incubator-pegasus.

neverchanje commented on May 24, 2024

@hycdong, please record the first draft of the design document here.

hycdong commented on May 24, 2024

The basic idea follows zuoyan's comment. We first create RocksDB SST files on remote storage such as HDFS, then ingest those files into Pegasus. RocksDB supports this through its "Creating and Ingesting SST files" feature.

Basic design

  1. Creating SST files on HDFS
    • Using RocksDBJava and Spark or MapReduce, write key-value pairs into SST files on HDFS

Tips:
We plan to support converting files in a specific format, or HBase HFiles, into the Pegasus key structure. The details are still under discussion.
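As a rough illustration of the preparation work in step 1, the sketch below groups key-value pairs by target partition and sorts each group by key, since an SST writer expects keys in ascending order. This is only a sketch under assumptions: `partition_of` uses CRC32 as a stand-in hash, and `plan_sst_inputs` is a hypothetical helper, not part of Pegasus or RocksDB. The real job would write each sorted group with RocksDBJava from Spark or MapReduce.

```python
import zlib
from collections import defaultdict


def partition_of(key: bytes, partition_count: int) -> int:
    # Stand-in hash (CRC32). The hash Pegasus actually uses may differ.
    return zlib.crc32(key) % partition_count


def plan_sst_inputs(records, partition_count):
    """Group (key, value) pairs by partition and sort each group by key.

    Each sorted group is what one SST file writer would consume.
    """
    buckets = defaultdict(list)
    for key, value in records:
        buckets[partition_of(key, partition_count)].append((key, value))
    # Sort rows by key so they can be appended to an SST file in order.
    return {p: sorted(rows, key=lambda row: row[0]) for p, rows in buckets.items()}
```

Because the grouping depends on `partition_count`, files generated for one partition count cannot be reused after the count changes.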

  2. Replica servers download SST files from HDFS
    • The client sends a start-bulk-load request (including the app and remote file storage information) to the meta server
    • The meta server sends a bulk load request to the primary replica of each partition
    • The primary replica receives the request from the meta server, downloads the SST files, notifies the secondaries to download the files, and reports its own download progress
    • Each secondary replica downloads the SST files and reports its download progress through group_check
    • The primary replica reports the download progress to the meta server
    • The meta server waits until all partitions, including primaries and secondaries, finish downloading
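The progress reporting described above can be sketched as follows. This is a minimal model, not the Pegasus implementation: the `PartitionProgress` class, the percentage scale, and the choice to report the slowest replica's progress for the whole group are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionProgress:
    primary: int = 0  # percent downloaded by the primary (assumed 0-100 scale)
    secondaries: dict = field(default_factory=dict)  # node name -> percent

    def group_progress(self) -> int:
        # Assumption: a partition is only as far along as its slowest replica.
        return min([self.primary, *self.secondaries.values()])


def all_downloaded(partitions) -> bool:
    # The meta server proceeds only when every replica of every partition is done.
    return all(p.group_progress() == 100 for p in partitions)
```

For example, a partition whose primary is at 100 but one secondary is at 60 reports 60, so the meta server keeps waiting.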

Tips:

  1. The primary and the secondaries all download files from HDFS directly. This avoids learning and data transfer between replicas, which could affect normal read and write operations.
  2. The server restricts how many replicas can download files at the same time.
  3. The download task code has lower priority than normal write operations.

In short, we should guarantee that the download stage has minimal effect on normal user operations.
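One way to picture the limit on concurrent downloads is a semaphore around each download task. The sketch below is illustrative only: `DownloadLimiter` is a hypothetical name, and the actual server may enforce the limit differently, for example through task queues and priorities.

```python
import threading


class DownloadLimiter:
    """Allow at most max_concurrent downloads to run at the same time."""

    def __init__(self, max_concurrent: int):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self.active = 0  # downloads currently running
        self.peak = 0    # highest concurrency observed so far

    def run(self, download):
        # Block until a download slot is free, then run the task.
        with self._slots:
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                return download()
            finally:
                with self._lock:
                    self.active -= 1
```

Each replica would call `limiter.run(...)` with its own download function; tasks beyond the limit wait until a slot is released.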

  3. Replica servers ingest SST files
    • The meta server sends an ingestion request to the primary replica of each partition.
    • The primary replica treats the ingestion request as a special write request and runs it through 2PC.
    • After the primary commits, it writes an empty request to ensure the secondaries commit the ingestion request.

Tips:
RocksDB blocks write operations while it is ingesting files. As a result, a replica in the ingestion stage also rejects all user operations. We should keep this window as short as possible.
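The rejection behavior during ingestion can be modeled with a simple flag on the replica. This is a toy sketch, not Pegasus code: the `Replica` class and its method names are hypothetical, only writes are modeled, and 2PC between primary and secondaries is left out.

```python
class Replica:
    def __init__(self):
        self.ingesting = False
        self.data = {}

    def write(self, key, value):
        # User writes are rejected while the ingestion stage is in progress.
        if self.ingesting:
            raise RuntimeError("replica is ingesting; write rejected")
        self.data[key] = value

    def begin_ingestion(self):
        self.ingesting = True

    def finish_ingestion(self, sst_rows):
        # Apply the ingested rows, then accept user writes again.
        self.data.update(sst_rows)
        self.ingesting = False
```

The time between `begin_ingestion` and `finish_ingestion` is the window in which clients see errors, which is why it should stay short.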

Relation with other functions

  • Write operations: the ingestion stage blocks user write operations
  • Partition split: cannot run at the same time as bulk load, because a split changes the partition count, which affects SST file generation
  • Access control: the user must have write permission on the target cluster
  • Duplication, cold backup and CU calculation: WIP
