ChubaoFS

Overview

ChubaoFS (储宝文件系统 in Chinese) is a cloud-native storage platform that provides both POSIX-compliant and S3-compatible interfaces. It is hosted by the Cloud Native Computing Foundation (CNCF) as a sandbox project.

ChubaoFS is commonly used as the underlying storage infrastructure for online applications, database and data processing services, and machine learning jobs orchestrated by Kubernetes. One advantage is that storage is separated from compute: each can be scaled up or down independently according to the workload, so resources can be matched to the storage and compute capacity actually required at any given time.

Some key features of ChubaoFS include:

  • Scale-out metadata management

  • Strong replication consistency

  • Specific performance optimizations for large/small files and sequential/random writes

  • Multi-tenancy

  • POSIX-compatible and mountable

  • S3-compatible object storage interface

We are committed to making ChubaoFS better and more mature. Please stay tuned.

Documentation

English version: https://chubaofs.readthedocs.io/en/latest/

Chinese version: https://chubaofs.readthedocs.io/zh_CN/latest/

Benchmark

Small-file operation performance and scalability, benchmarked with mdtest:

File Size (KB)    1        2        4        8        16       32       64       128
Creation (TPS)    70383    70383    73738    74617    69479    67435    47540    27147
Read (TPS)        108600   118193   118346   122975   116374   110795   90462    62082
Removal (TPS)     87648    84651    83532    79279    85498    86523    80946    84441
Stat (TPS)        231961   263270   264207   252309   240244   244906   273576   242930

Refer to chubaofs.readthedocs.io for I/O and metadata performance and scalability results.

Build ChubaoFS

$ git clone http://github.com/chubaofs/chubaofs.git
$ cd chubaofs
$ make

Yum Tools to Run a ChubaoFS Cluster for CentOS 7+

The RPM package dependencies can be installed with:

$ yum install http://storage.jd.com/chubaofsrpm/latest/cfs-install-latest-el7.x86_64.rpm
$ cd /cfs/install
$ tree -L 2
.
├── install_cfs.yml
├── install.sh
├── iplist
├── src
└── template
    ├── client.json.j2
    ├── create_vol.sh.j2
    ├── datanode.json.j2
    ├── grafana
    ├── master.json.j2
    └── metanode.json.j2

Set parameters of the ChubaoFS cluster in iplist.

  1. The [master], [datanode], [metanode], [monitor] and [client] modules define the IP addresses of each role.

  2. The #datanode config module defines the parameters of DataNodes. datanode_disks defines each path and its reserved space, separated by ":". The path is where data is stored, so make sure it exists and has at least 30 GB of space; the reserved space is the minimum free space (in bytes) kept on that path.

  3. The [cfs:vars] module defines the parameters for the SSH connection. Make sure the SSH port, username and password are the same on all nodes before starting.

  4. The #metanode config module defines the parameters of MetaNodes. metanode_totalMem defines the maximum memory (in bytes) that the MetaNode process can use.

[master]
10.196.0.1
10.196.0.2
10.196.0.3
[datanode]
...
[cfs:vars]
ansible_ssh_port=22
ansible_ssh_user=root
ansible_ssh_pass="password"
...
#datanode config
...
datanode_disks =  '"/data0:10737418240","/data1:10737418240"'
...
#metanode config
...
metanode_totalMem = "28589934592"
...
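
The byte values in the example above are easier to check when derived from a size in GiB. The following is a minimal shell sketch, assuming binary units (1 GiB = 1073741824 bytes). The helper names gib_to_bytes and disk_entry are hypothetical and are not part of the ChubaoFS install tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the ChubaoFS install scripts.

# Convert a whole number of GiB to bytes (1 GiB = 1024^3 bytes).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Build one quoted "path:reserved_bytes" entry in the datanode_disks format.
disk_entry() {
    local path=$1 reserved_gib=$2
    echo "\"${path}:$(gib_to_bytes "$reserved_gib")\""
}

# Join two entries the way the iplist example does.
echo "datanode_disks = '$(disk_entry /data0 10),$(disk_entry /data1 10)'"
```

With 10 GiB reserved per disk, the printed entries carry the same byte value (10737418240) as the iplist example.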

For more configuration options, please refer to the documentation.

Start the ChubaoFS cluster components with the install.sh script. Make sure the Master is started first.

$ bash install.sh -h
Usage: install.sh [-r --role datanode or metanode or master or monitor or client or all ] [-v --version 1.5.1 or latest]
$ bash install.sh -r master
$ bash install.sh -r metanode
$ bash install.sh -r datanode
$ bash install.sh -r monitor
$ bash install.sh -r client
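
The startup order above can also be wrapped in a small loop that stops at the first failing role. This is only a sketch: start_cluster is a hypothetical helper, and by default it prints the commands instead of running them. It assumes it is called from /cfs/install, where install.sh lives.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the install.sh calls shown above.
# The first argument is a "runner" command; the default, echo, only
# prints each command (dry run).
start_cluster() {
    local runner=${1:-echo} role
    # Master comes first, as the instructions require.
    for role in master metanode datanode monitor client; do
        "$runner" bash install.sh -r "$role" || return 1
    done
}

# Dry run: print the five commands in order.
start_cluster
```

Passing a runner that executes its arguments, for example start_cluster env, would run each command in turn instead of printing it.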

Check the mount point at /cfs/mountpoint on the client node defined in iplist.

Open http://10.196.0.1:8500 in a browser to view the monitoring system (its IP address is defined in iplist).

Run a ChubaoFS Cluster within Docker

A helper tool called run_docker.sh (under the docker directory) has been provided to run ChubaoFS with docker-compose.

$ docker/run_docker.sh -r -d /data/disk

Note that /data/disk can be any directory, but make sure it has at least 10 GB of available space.
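
The free-space requirement can be checked before starting the containers. The sketch below uses the POSIX output format of df. has_free_gib is a hypothetical helper, and treating the 10G figure as 10 GiB is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if DIR has at least MIN_GIB GiB available.
has_free_gib() {
    local dir=$1 min_gib=$2 avail_kb
    # df -Pk prints one line per filesystem; column 4 is the available
    # space in 1 KiB blocks. A missing directory yields an empty value.
    avail_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $(( min_gib * 1024 * 1024 )) ]
}

if has_free_gib /data/disk 10; then
    echo "/data/disk has at least 10 GiB available"
else
    echo "/data/disk is missing or has less than 10 GiB available"
fi
```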

To check the mount status, use the mount command in the client docker shell:

$ mount | grep chubaofs

To view Grafana monitoring metrics, open http://127.0.0.1:3000 in a browser and log in with admin/123456.

To run server and client separately, use the following commands:

$ docker/run_docker.sh -b
$ docker/run_docker.sh -s -d /data/disk
$ docker/run_docker.sh -c
$ docker/run_docker.sh -m

For more usage:

$ docker/run_docker.sh -h

Helm chart to Run a ChubaoFS Cluster in Kubernetes

The chubaofs-helm repository helps you quickly deploy a ChubaoFS cluster in containers orchestrated by Kubernetes. Kubernetes 1.12+ and Helm 3 are required. chubaofs-helm already integrates the ChubaoFS CSI plugin.

Download chubaofs-helm

$ git clone https://github.com/chubaofs/chubaofs-helm
$ cd chubaofs-helm

Copy kubeconfig file

The ChubaoFS CSI driver uses client-go to connect to the Kubernetes API server. First, copy the kubeconfig file to the chubaofs-helm/chubaofs/config/ directory and rename it to kubeconfig.

$ cp ~/.kube/config chubaofs/config/kubeconfig

Create configuration yaml file

Create a chubaofs.yaml file and put it in a path of your choice. The example below assumes ~/chubaofs.yaml.

$ cat ~/chubaofs.yaml 
path:
  data: /chubaofs/data
  log: /chubaofs/log

datanode:
  disks:
    - /data0:21474836480
    - /data1:21474836480 

metanode:
  total_mem: "26843545600"

provisioner:
  kubelet_path: /var/lib/kubelet

Note that chubaofs-helm/chubaofs/values.yaml shows all the config parameters of ChubaoFS. The parameters path.data and path.log are used to store server data and logs, respectively.

Add labels to Kubernetes node

Tag each Kubernetes node with the appropriate labels for the ChubaoFS server and CSI node roles.

kubectl label node <nodename> chuabaofs-master=enabled
kubectl label node <nodename> chuabaofs-metanode=enabled
kubectl label node <nodename> chuabaofs-datanode=enabled
kubectl label node <nodename> chubaofs-csi-node=enabled
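
When several nodes share a role, the label commands can be generated in a loop. The sketch below only prints the kubectl commands (dry run). label_commands is a hypothetical helper, node-1 through node-3 are placeholder node names, and the label keys are copied verbatim from the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one "kubectl label node" command per node.
# Usage: label_commands LABEL_KEY NODE...
label_commands() {
    local label=$1 node
    shift
    for node in "$@"; do
        echo "kubectl label node ${node} ${label}=enabled"
    done
}

# Placeholder node names; replace them with names from "kubectl get nodes".
label_commands chuabaofs-master node-1 node-2 node-3
label_commands chubaofs-csi-node node-1 node-2 node-3
```

After reviewing the printed commands, pipe the output to a shell to apply the labels.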

Deploy ChubaoFS cluster

$ helm install chubaofs ./chubaofs -f ~/chubaofs.yaml

Reference

Haifeng Liu, et al., CFS: A Distributed File System for Large Scale Container Platforms. SIGMOD '19, June 30-July 5, 2019, Amsterdam, Netherlands.

For more information, please refer to https://dl.acm.org/citation.cfm?doid=3299869.3314046 and https://arxiv.org/abs/1911.03001

Community

Partners and Users

For a list of users and success stories see ADOPTERS.md.

License

ChubaoFS is licensed under the Apache License, Version 2.0. For details, see LICENSE and NOTICE.
