sidewinder's People

Contributors

ambud


sidewinder's Issues

Extensive QA / Reliability Engineering

Sidewinder needs an extensive test suite to ensure data points are not dropped and data corruption does not occur, even in the face of aggressive failures.

Fix Netty performance issues

Netty for both HTTP and Binary ingestion has performance issues including:

  • Substantial pauses in the case of HTTP
  • Buffer and resource leaks
  • Performance limitations (currently running at only ~1M/s)

Fix broken garbage collector

Summary:
The garbage collector in the new DiskStorageEngine is broken.

Description:
The new Disk Storage Engine garbage collector only clears pointers from memory but doesn't actually delete the underlying files from disk.

PTR file corruption

When the PTR file is resized in the new memory-mapped ptr file module, the file gets corrupted. This causes measurement recovery to fail when Sidewinder is shut down.

Add basic clustering framework

Summary:

Create a basic framework that can be expanded to provide sophisticated clustering for Sidewinder. As part of this feature, create a clustering project and provide the following capabilities:

  • Node discovery mechanism
  • Cluster RPC mechanism
  • Internode protocols
  • Basic data routing system

Description:

Node discovery mechanism
Node discovery allows machines to automatically discover other nodes in a cluster using a seed machine, simplifying maintenance of the system. Multiple implementations will be provided so end users can choose between manual and automated discovery. The Atomix project has been selected as the clustering / discovery engine because it provides Raft-based consensus and leader election in an embedded setup, which can be extended for sharding and scaling features in the future.

Cluster RPC mechanism
This provides a simple framework for nodes to communicate with each other. GRPC is currently selected for this purpose; it uses protocol specs written in Protobuf (see next section) and HTTP as the actual transport layer. GRPC was chosen for the simplicity of its transport mechanism and its ability to support authentication and authorization at the protocol layer without extensive effort. GRPC doesn't provide very high performance compared to other transports, but the reduced engineering effort and the amortized time cost of batch calls justify its current use. A future enhancement would be to evaluate the performance differences between GRPC and other transports to see whether a change is worthwhile.

Internode protocols
Internode protocols are a more efficient mechanism for nodes to communicate with each other than the client-facing interface. They are written in Google Protobuf, which allows simple protocol definitions that can later be used to generate clients in other languages as well.

Basic data routing system
The basic routing system provides a way for data points to be routed to multiple machines that are members of the Sidewinder cluster. The reason for supporting this is to allow simple, lightweight clients to leverage a Sidewinder cluster. The routing engine will give nodes the ability to proxy requests from clients to the appropriate machines in the cluster.

Spark RDD Connector for Sidewinder

Summary:
Create a connector to pull data into Apache Spark for analytics.

Description:
Sidewinder's data should be accessible to analytical tools like Spark for performing ML or batch analytics.

Add compression to cluster WAL

The cluster WAL doesn't write compressed data at the moment. Compression reduces the size of the WAL and will reduce the disk space required to operate it.

This feature should support a pluggable compression algorithm for WAL byte compression. Essentially, it should compress and uncompress the byte[] payload that is written to and read from the WAL.

Note: remote operations shouldn't cause decompression of the data.
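The pluggable codec described above might be sketched as follows. This is a minimal illustration, not the actual Sidewinder API: the `WalCodec` interface name and the GZIP implementation are assumptions, standing in for whatever algorithm the pluggable design ends up supporting.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical pluggable codec: implementations compress and uncompress the
// byte[] payload written to and read from the cluster WAL.
interface WalCodec {
    byte[] compress(byte[] payload) throws IOException;
    byte[] uncompress(byte[] payload) throws IOException;
}

// One possible implementation backed by java.util.zip GZIP streams.
class GzipWalCodec implements WalCodec {
    public byte[] compress(byte[] payload) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(payload);
        }
        return bos.toByteArray();
    }

    public byte[] uncompress(byte[] payload) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(payload))) {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) > 0) {
                bos.write(buf, 0, n);
            }
            return bos.toByteArray();
        }
    }
}

public class WalCodecDemo {
    public static void main(String[] args) throws IOException {
        WalCodec codec = new GzipWalCodec();
        byte[] original = "timeseries wal entry".getBytes();
        byte[] roundTrip = codec.uncompress(codec.compress(original));
        if (!Arrays.equals(original, roundTrip)) {
            throw new AssertionError("round trip failed");
        }
        System.out.println("round trip ok");
    }
}
```

Swapping algorithms would then only require a new `WalCodec` implementation, satisfying the note above: a remote node could ship the compressed byte[] as-is without decompressing it.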

Implement Coordinator based Measurement Shard Clustering

The coordinator implementation (with Atomix) was recently released to the master branch. It needs to be extended to support measurement clustering. This implementation will involve:

  • GRPC Metadata API - DB Create, Get, Delete, List
  • GRPC Metadata API - Measurement Create, Get, Delete, List
  • Metadata API cluster state update (route table etc.)
  • Unit tests for APIs and cluster balancing

Configurable time bucket constant & persistent metadata

Summary:
Make the time bucket constant for series configurable & persist metadata for disk storage

Description:
The time bucket is currently a constant set to 4096 seconds; this makes it difficult to store historical data due to the max open file limit. Until #31 addresses this problem, as a stop-gap measure we need this configurable from external configuration and, if possible, on a database-by-database basis using a REST API.
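Reading the constant from external configuration could be as simple as the sketch below. The property key name is an assumption for illustration; only the 4096-second default comes from the issue itself.

```java
import java.util.Properties;

// Sketch: read the per-series time bucket size (in seconds) from external
// configuration instead of hard-coding 4096. The key name is hypothetical.
public class TimeBucketConfig {
    public static final String KEY = "default.series.timebucket.size";
    public static final int DEFAULT_BUCKET_SECONDS = 4096;

    public static int bucketSize(Properties conf) {
        return Integer.parseInt(conf.getProperty(KEY, String.valueOf(DEFAULT_BUCKET_SECONDS)));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(bucketSize(conf));   // falls back to 4096
        conf.setProperty(KEY, "32768");         // e.g. coarser buckets for historical data
        System.out.println(bucketSize(conf));   // 32768
    }
}
```

A per-database override via REST, as the issue requests, would layer a database-scoped lookup on top of this global default.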

SQL Support with JDBC

Summary:
Add SQL Support with JDBC Driver

Description:

Note: This functionality was temporarily disabled
  • Restore support for ANSI SQL via Apache Calcite
  • Add any optimizations / refactoring if necessary
  • Support JDBC Driver to execute queries

Official Benchmark Harness

Summary:
There's currently no official benchmark harness for Sidewinder; please add one.

Description:
The benchmark harness should provide a standard set of tools to run read/write performance benchmarks, letting users validate that their hardware and software configuration is suitable for their use case.

Need Autocorrelate feature for Grafana working

Use cases require that the auto-correlate feature in Grafana be activated so that series related to the queried series can automatically be pulled from the database for time series correlation.

Build REST compliant API

A REST API is needed to delete tags from the Tag Index so that storage space can be freed up.

  • Validate REST APIs and their form
  • Add missing APIs for metadata and retention at the database, measurement and time series levels
  • Create rename APIs for database and measurement
  • Add unit tests for REST API, refer: QA Tests
  • Create Swagger API docs

Currently the following APIs are supported:

    DELETE  /databases (com.srotya.sidewinder.core.api.DatabaseOpsApi)
    GET     /databases (com.srotya.sidewinder.core.api.DatabaseOpsApi)
    DELETE  /databases/{dbName} (com.srotya.sidewinder.core.api.DatabaseOpsApi)
    GET     /databases/{dbName} (com.srotya.sidewinder.core.api.DatabaseOpsApi)
    PUT     /databases/{dbName} (com.srotya.sidewinder.core.api.DatabaseOpsApi)
    GET     /databases/{dbName}/check (com.srotya.sidewinder.core.api.DatabaseOpsApi)
    DELETE  /databases/{dbName}/measurements/{measurementName} (com.srotya.sidewinder.core.api.MeasurementOpsApi)
    GET     /databases/{dbName}/measurements/{measurementName} (com.srotya.sidewinder.core.api.MeasurementOpsApi)
    PUT     /databases/{dbName}/measurements/{measurementName} (com.srotya.sidewinder.core.api.MeasurementOpsApi)
    GET     /databases/{dbName}/measurements/{measurementName}/check (com.srotya.sidewinder.core.api.MeasurementOpsApi)
    GET     /databases/{dbName}/measurements/{measurementName}/fields (com.srotya.sidewinder.core.api.MeasurementOpsApi)
    GET     /databases/{dbName}/measurements/{measurementName}/fields/{value} (com.srotya.sidewinder.core.api.MeasurementOpsApi)
    PUT     /databases/{dbName}/measurements/{measurementName}/series (com.srotya.sidewinder.core.api.MeasurementOpsApi)
    GET     /databases/{dbName}/measurements/{measurementName}/series/count (com.srotya.sidewinder.core.api.MeasurementOpsApi)
    PUT     /databases/{dbName}/measurements/{measurementName}/series/retention/{retentionPolicy} (com.srotya.sidewinder.core.api.MeasurementOpsApi)
    POST    /databases/{dbName}/query (com.srotya.sidewinder.core.api.DatabaseOpsApi)
    POST    /influx (com.srotya.sidewinder.core.api.InfluxApi)
    POST    /sql/database/{dbName} (com.srotya.sidewinder.core.api.SqlApi)

Linearizability Tests for Storage Engine

Summary:
Add linearizability tests for Storage Engine.

Description:
Sidewinder, just like any other database, supports concurrent access, which involves concurrency support for both reads and writes. It's important for the database to guarantee the linearizability property to ensure there are no concurrency bugs.

Tests should cover the following:

  • Two or more concurrent threads writing to one series (sequence ordering)
  • Two or more concurrent threads performing reads
  • Two or more concurrent threads performing reads and writes simultaneously
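The first case above might be sketched like this. The in-memory series is a stand-in for the real StorageEngine API; the thread and point counts are arbitrary.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch of test case 1: multiple writer threads appending to one series;
// afterwards no data point may have been dropped.
public class ConcurrentWriteTest {
    static final List<Long> series = new CopyOnWriteArrayList<>();

    public static void writePoints(int threads, int pointsPerThread) throws InterruptedException {
        Thread[] writers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            writers[t] = new Thread(() -> {
                for (int i = 0; i < pointsPerThread; i++) {
                    // each write is a unique stand-in "timestamp"
                    series.add((long) id * pointsPerThread + i);
                }
            });
            writers[t].start();
        }
        for (Thread w : writers) {
            w.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        writePoints(2, 1000);
        if (series.size() != 2000) {
            throw new AssertionError("dropped points: " + series.size());
        }
        System.out.println("no points dropped");
    }
}
```

A real linearizability suite would go further, recording per-thread operation histories and checking that the observed sequence ordering is consistent with some legal serial execution.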

Storage Engine Redesign for improving scalability

Summary:

Redesign storage engine to support hundreds of thousands of independent time series.

Description:

The current Sidewinder Disk Storage engine stores one unique time series bucket per file. The size of the bucket is configurable; however, this still poses a restriction on how many unique time series a given server can hold, as the number of open files is limited.

While commit dc4d448 does try to avoid the max open files issue by closing files as soon as the MappedByteBuffer is created, this only pushes the envelope so far and the fundamental issue remains unresolved.

The LRU-based design I proposed earlier can only mitigate the issue when there aren't many concurrent writes across time series. When there are, it would cause a lot of cache evictions and frequent cache swapping, further degrading performance.

Proposal
The new storage engine design proposes to decouple the compression and persistence responsibilities and to combine multiple series into one file, while keeping the concept of time series buckets. The whole design is based on a memory allocator that grants buffers to series buckets on request; these buffers are slices of a memory-mapped file segment. Once the file reaches a certain size, a new file is created and the existing one is closed. This redesign refactors many components of the StorageEngine while preserving the interface as much as possible, so there is minimal impact on the writer and reader components of the database. Additional testing is added as well to help improve the reliability of the system.
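The allocator idea at the heart of the proposal might look like this minimal sketch. The class name, slice size, and segment size are all assumptions for illustration; only the slicing-a-mapped-segment technique comes from the proposal.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Sketch: one memory-mapped file segment is carved into fixed-size slices
// handed out to series buckets on request, so many series share one file.
public class BufferAllocator {
    static final int SLICE_SIZE = 1024;           // bytes granted to one bucket
    static final int SEGMENT_SIZE = 8 * SLICE_SIZE;

    private final MappedByteBuffer segment;
    private int offset = 0;

    public BufferAllocator(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            // The descriptor can be closed immediately; the mapping stays valid,
            // which is what keeps the open-file count low.
            segment = channel.map(FileChannel.MapMode.READ_WRITE, 0, SEGMENT_SIZE);
        }
    }

    // Grant the next free slice to a requesting series bucket.
    public synchronized ByteBuffer requestBuffer() {
        if (offset + SLICE_SIZE > SEGMENT_SIZE) {
            throw new IllegalStateException("segment full, rotate to a new file");
        }
        segment.position(offset);
        ByteBuffer slice = segment.slice();
        slice.limit(SLICE_SIZE);
        offset += SLICE_SIZE;
        return slice;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("sidewinder", ".buf");
        file.deleteOnExit();
        BufferAllocator allocator = new BufferAllocator(file);
        ByteBuffer a = allocator.requestBuffer();
        ByteBuffer b = allocator.requestBuffer();
        a.putLong(42L); // two buckets writing into the same mapped file
        b.putLong(99L);
        System.out.println("two slices allocated from one file");
    }
}
```

When the segment fills, the engine would rotate to a fresh file, which is the "new files are created and the existing one is closed" step in the proposal.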

OLD:

Summary:

Limit maximum number of open files.

Description:

Create a Least Recently Used (LRU) eviction system to automatically close data files that are not being written to or read from. Operating systems have a limit on the maximum number of open files; if a user / system exceeds it, an exception is thrown that can't be recovered from unless files are closed. The LRU-based module will prevent this exception from being thrown by proactively keeping the open-file count under the limit. This feature is especially helpful for series storing historical data.
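The old LRU proposal can be sketched with an access-ordered map. `FileHandle` here is a stand-in for the real data file abstraction, and the budget of two open files is arbitrary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: an access-ordered LinkedHashMap closes the least recently used
// file handle once the open-file budget is exceeded.
public class LruFileCache {
    static class FileHandle {
        final String name;
        boolean open = true;
        FileHandle(String name) { this.name = name; }
        void close() { open = false; }
    }

    private final Map<String, FileHandle> cache;

    public LruFileCache(int maxOpenFiles) {
        // accessOrder=true orders entries least- to most-recently used
        cache = new LinkedHashMap<String, FileHandle>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, FileHandle> eldest) {
                if (size() > maxOpenFiles) {
                    eldest.getValue().close(); // close before hitting the OS limit
                    return true;
                }
                return false;
            }
        };
    }

    public FileHandle get(String name) {
        return cache.computeIfAbsent(name, FileHandle::new);
    }

    public int openCount() { return cache.size(); }

    public static void main(String[] args) {
        LruFileCache files = new LruFileCache(2);
        FileHandle a = files.get("series-a");
        files.get("series-b");
        files.get("series-c"); // evicts and closes series-a
        System.out.println(a.open ? "a still open" : "a closed"); // prints "a closed"
    }
}
```

This also illustrates the weakness noted above: with many concurrently hot series, every `get` would evict something, causing the cache swapping the proposal warns about.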

Performance Fixes

Tag key-value pair separation has caused a database performance impact and a slowdown of about 20%. Additionally, Sidewinder's memory utilization needs to be reduced so that heap is freed to scale to and store a larger number of series.

  • Improve throughput
  • Reduce heap usage

Enhancement to Authentication

Summary:
Support other authentication mechanisms besides basic auth.

Description:
Currently there's support for only basic authentication. This ticket requests support for additional authentication mechanisms for the Sidewinder HTTP APIs (all of them):

  • LDAP Auth
  • AD Auth
  • SPNEGO (optional)

SSL Instructions Needed

Instructions / documentation are needed on how to configure the REST API for SSL encryption. Can SSL also be supported for the GRPC API?

Compaction bugs

Compaction causes data corruption:

  1. When the ptr file is resized, old ptrs are overwritten
  2. When cleaning up buffers, file counters are reset, thereby overwriting data
  3. Size allocation on compaction causes an exception when trying to reload buffers

Simple Master Slave clustering

Summary:

Create a naive master slave clustering system.

Description:

A master-slave clustering system provides data replication and sharding for queries. In this model the master node receives all writes, which are then replicated to one or more slave nodes. The slave nodes can be used for read-only operations like queries, allowing read scaling. Because this initial implementation is naive, the master won't automatically fail over using a leader election process. If the master is not functional, a slave will need to be manually promoted by applying the configuration and restarting.

Disk Storage Engine: support for multiple data directories

Summary:
Support for multiple data directories

Description:
The disk storage engine currently only supports a single data directory via the data.dir configuration. Add support so that data can be sharded across multiple disk drives, removing IO bottlenecks if any.

Disk storage engine

Disk storage engine persists Sidewinder data on disk and provides an option for persistent use cases where write throughput can be traded for durability.

Add GRPC Server for Ingestion and Queries

Summary:
Basic GRPC Server for Ingestion and Queries based on protocol already established

Description:
GRPC is the binary protocol standard in Sidewinder for clustering, so it will be helpful to extend it to support binary protocol writes from clients. Additionally, queries should be supported via this interface as well.

  • Write support
  • Query support
  • Authentication support for both

Full clustering support

Summary:

Sidewinder so far has been a single-instance database with a placeholder for clustering. Without clustering, linear scaling of this TSDB is difficult and has to be manually orchestrated.

Cluster should provide:

  • Dynamic addition and removal of nodes
  • Data replication
  • Data sharding
  • Request balancing

Description

Dynamic addition and removal of nodes
This feature should allow Sidewinder instances to be added and removed on the fly with zero downtime, with the relaxation of partially degraded performance during node addition and removal due to intensive replication / data copying in the background.

Data replication
Each unique time series should be replicable to multiple nodes to ensure fault tolerance, i.e. Availability (CAP). The number of nodes a time series is replicated to should be determined by the global database-level replication policy. At the very least the database must provide a cluster-level replication policy, with optional db-level replication granularity.

Data sharding
Sharding provides linear scaling, especially for a high-throughput database like Sidewinder. Sharding should ideally be performed with a variant of consistent hashing to ensure the AP of CAP.
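A consistent-hashing shard map might look like this minimal sketch; the virtual-node count and the use of `String.hashCode` are assumptions (a real implementation would use a stronger hash such as murmur or md5).

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Sketch: nodes are placed on a hash ring (with virtual nodes) and each
// series key routes to the first node clockwise from its hash, so adding
// or removing a node only remaps a slice of the keyspace.
public class ConsistentHashRing {
    private static final int VIRTUAL_NODES = 16; // arbitrary for illustration
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    public void addNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    public void removeNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.remove(hash(node + "#" + i));
        }
    }

    public String nodeFor(String seriesKey) {
        SortedMap<Integer, String> tail = ring.tailMap(hash(seriesKey));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private static int hash(String s) {
        return s.hashCode() & Integer.MAX_VALUE; // weak hash, illustration only
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing();
        ring.addNode("node-1");
        ring.addNode("node-2");
        ring.addNode("node-3");
        System.out.println("cpu.host42 -> " + ring.nodeFor("cpu.host42"));
        ring.removeNode("node-2"); // only keys owned by node-2 move
        System.out.println("cpu.host42 -> " + ring.nodeFor("cpu.host42"));
    }
}
```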

Request balancing
The database must ensure that clients do not need to be "smart", i.e. they do not need to be aware of the clustering internals of Sidewinder, so that multiple types of clients can easily be plugged in to ingest data. Thus, the instances themselves should be able to proxy requests from clients to the appropriate nodes; this way all machines can be hit evenly by clients, and the HTTP interface in particular can be front-ended by a round-robin load balancer.

Automated Tag Cleanup / GC

Tags are currently not cleaned up even though their related time series may have been garbage collected. An automated way to clean up tags is needed.

Graphite Schema

The Graphite schema for the decoder is not correct.

The Graphite decoder should put the last key as the value field name, the second-to-last key as the measurement, and the rest of the keys as tags.
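The corrected rule can be sketched as follows. The class and field names are illustrative stand-ins, not Sidewinder's actual decoder API.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the corrected decode rule for a dotted Graphite metric path:
// last key -> value field name, second-to-last -> measurement, rest -> tags.
public class GraphiteDecoder {
    static class Decoded {
        final List<String> tags;
        final String measurement;
        final String valueField;
        Decoded(List<String> tags, String measurement, String valueField) {
            this.tags = tags;
            this.measurement = measurement;
            this.valueField = valueField;
        }
    }

    public static Decoded decode(String metricPath) {
        String[] keys = metricPath.split("\\.");
        if (keys.length < 2) {
            throw new IllegalArgumentException("need at least measurement and field: " + metricPath);
        }
        List<String> tags = Arrays.asList(keys).subList(0, keys.length - 2);
        return new Decoded(tags, keys[keys.length - 2], keys[keys.length - 1]);
    }

    public static void main(String[] args) {
        Decoded d = decode("dc1.host42.cpu.load");
        // tags=[dc1, host42] measurement=cpu field=load
        System.out.println("tags=" + d.tags + " measurement=" + d.measurement + " field=" + d.valueField);
    }
}
```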

Add series compaction capability

Problem:
Time series buckets are written in static-size buffer chunks, a standard memory allocation technique. As a result, a single time bucket can become fragmented over time. Reading across different slices of the same bucket may not be sequential in terms of the actual in-memory locality of data. Additionally, the buffers may not be compact enough, wasting substantial disk space.

Compaction:
Compaction, in the case of Sidewinder, is the process of merging these fragmented buffers, optionally using a better compression algorithm, to reduce the amount of disk space used and improve linear read performance.

Build Sidewinder Graphite Proxy

Build a simple Netty-based server for Sidewinder that accepts the Graphite protocol (TCP & UDP) and acts as a proxy, forwarding data to Sidewinder over the GRPC protocol.

Reduce heap memory usage

The disk storage engine is utilizing too much heap for the number of objects that are created. This imposes an upper limit on the number of data points that can be stored, since the number of TimeSeriesBucket objects is limited by heap size.

Support for basic authentication

Summary:
Support for API Authentication (basic)

Description
Provide support for authentication on the REST API to allow / deny access to the databases.

GRPC API Authentication Needed

The GRPC API currently has no authentication layer on it. This will be a request from users trying to run Sidewinder in secure environments.

Access Control Support via Apache Ranger

Summary:
Create a Ranger plugin to support access control via Apache Ranger

Description:
Apache Ranger provides a sophisticated access control layer that integrates well with other big data projects such as Hive, HDFS etc. Adding Ranger plugin integration will provide capabilities to leverage Apache Ranger so that ACLs can be controlled for Sidewinder.

Documentation to creating a Ranger plugin: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=53741207

Note: The Ranger Plugin MUST be created as a separate child project from sidewinder-parent

  • Ranger Plugin
  • Sidewinder ACL - Database
  • Sidewinder ACL - Measurement
  • Sidewinder ACL - Time Series
  • Sidewinder ACL - Time Range
