greenlion / warp

This project forked from mysql/mysql-server

WarpSQL Server is an open source, OLAP-focused distribution of the world's most popular open source database, bundled with OLAP performance-related plugins such as the WARP storage engine.

Home Page: http://warpsql.blog

License: Other

Shell 1.05% CMake 0.93% C++ 76.94% C 15.21% Perl 0.57% Objective-C 0.70% Pascal 0.14% Python 0.02% Makefile 1.47% Java 1.75% HTML 0.57% CSS 0.01% JavaScript 0.21% PHP 0.09% Awk 0.01% PLSQL 0.01% Batchfile 0.01% Assembly 0.03% SourcePawn 0.01% Yacc 0.31%

warp's People

Contributors

alfranio, arnabray21, bjornmu, bkandasa, blaudden, dahlerlend, frazerclement, gkodinov, glebshchepa, gurusami, harinvadodaria, jdduncan, jhauglid, kahatlen, kdjakevin, lkotula, lkshminarayanan, ltangvald, marcalff, nacarvalho, nryeng, phulakun, roylyseng, thayumanavar77, thirunarayanan, trosten, vaintroub, vasild, weigon, zmur

Forkers

caizj starlabsm

warp's Issues

Support UNIQUE_CHECKS = 0

To support faster loading of data, it should be possible to disable UNIQUE and PRIMARY KEY checking. Of course, this could allow duplicate data into tables, because WARP is not index organized and a UNIQUE/PRIMARY KEY is just a hint to check for duplicates at insertion/update time.
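The trade-off can be illustrated with a minimal sketch. It assumes a toy in-memory key list and a hypothetical `unique_checks` flag passed per insert; `ToyKeyTable` and `insert_key` are illustrative names, not WARP or MySQL APIs.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Toy table holding only key values (hypothetical, not WARP's storage layout).
struct ToyKeyTable {
    std::vector<std::uint64_t> keys;
};

// Appends `key`. With unique_checks on, a duplicate is rejected;
// with it off, the scan is skipped and duplicates are stored silently.
bool insert_key(ToyKeyTable& table, std::uint64_t key, bool unique_checks) {
    if (unique_checks &&
        std::find(table.keys.begin(), table.keys.end(), key) != table.keys.end()) {
        return false;
    }
    table.keys.push_back(key);
    return true;
}
```

With the flag off, the per-row scan disappears, which is where the load-time saving would come from, and the caller becomes responsible for keeping the data free of duplicates.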

Support MVCC and ACID w/ REPEATABLE READ

Add a trx_id to each table row, similar to rowid.
Add visibility checks so that each transaction sees only the row versions that are visible to it.
Add row-level locking.
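One common way to express such a visibility check is a snapshot ("read view") taken when the transaction starts. The sketch below assumes that model; `ReadView` and its fields are hypothetical names, and nothing here reflects how WARP actually implements or plans to implement it.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical snapshot taken when a REPEATABLE READ transaction starts.
struct ReadView {
    std::uint64_t own_trx_id;               // id of the reading transaction
    std::uint64_t low_limit_id;             // ids >= this started after the snapshot
    std::vector<std::uint64_t> active_ids;  // sorted ids still uncommitted at snapshot time
};

// Returns true if a row version written by row_trx_id is visible to the view.
bool is_visible(const ReadView& view, std::uint64_t row_trx_id) {
    if (row_trx_id == view.own_trx_id) return true;     // own writes are always visible
    if (row_trx_id >= view.low_limit_id) return false;  // writer started after the snapshot
    // Otherwise visible only if the writer had committed before the snapshot.
    return !std::binary_search(view.active_ids.begin(), view.active_ids.end(), row_trx_id);
}
```

Because the view is fixed for the life of the transaction, repeated reads of the same row return the same version, which is the property REPEATABLE READ asks for.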

Support ACID insert/update/delete semantics

If a statement errors out after inserting rows, the rows do not roll back. All statements should be atomic. The same goes for deletions and updates. An operation cancelled in the middle leaves a partially changed base table. This is the same as MyISAM.
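For the insert case, statement-level atomicity on an append-only table can be sketched as "remember the row count, truncate back on failure". The code below is a toy model under that assumption; `ToyTable` and `run_atomically` are hypothetical and do not cover updates or deletes.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for an append-only base table (hypothetical, not the WARP/FastBit API).
struct ToyTable {
    std::vector<std::string> rows;
};

// Runs `statement` against `table`. If it reports failure or throws,
// every row appended by the statement is discarded again.
bool run_atomically(ToyTable& table, const std::function<bool(ToyTable&)>& statement) {
    const std::size_t savepoint = table.rows.size();  // row count before the statement
    bool ok = false;
    try {
        ok = statement(table);
    } catch (...) {
        ok = false;
    }
    if (!ok) table.rows.resize(savepoint);  // roll back the partial insert
    return ok;
}
```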

Bonus: add transaction support for more than one table.

DROP DATABASE fails

DROP TABLE works fine, but MySQL does not call handler::delete_table for all tables in a database when the database is dropped.

Support case insensitive collations

FastBit is not collation aware. When a case-insensitive (ci) collation is used, strings in the bitmaps need to be compared using that collation's case-insensitive comparison.
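As a rough illustration of the kind of comparison involved, here is an ASCII-only sketch. It is a deliberate simplification: real ci collations also handle accents, multi-byte characters, and padding rules, and `equals_ci_ascii` is a hypothetical helper, not a MySQL or FastBit function.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// ASCII-only case-insensitive equality. A real *_ci collation does much more;
// this only shows where the comparison would differ from a byte-wise one.
bool equals_ci_ascii(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i]))) {
            return false;
        }
    }
    return true;
}
```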

Support special $COLUMN columns in tables

Make the FastBit rowid value available to the MySQL table if the special column name $rowid is used in a table. Insertions should ignore any $rowid value passed in during an insert.

Support clone plugin

MySQL 8.0 can clone a server from another server. Right now the server supports only InnoDB for cloning. Cloning WARP-based tables should be supported as well. This will require modifying the server and won't be pluggable, so it is a partially incompatible change to upstream MySQL. WARP could still be loaded into a non-WarpSQL server, but cloning would clone WARP tables as empty.

Implement background "vacuum" for WARP tables

Because FastBit is an append-only database, WARP does not do in-place updates. This means tables with frequent updates will grow substantially, since old row versions are kept until an OPTIMIZE TABLE operation is run on them. WARP should be able to automatically rebuild tables in the background to remove old versions.
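The core of such a rebuild is a compaction pass that copies only live row versions. The sketch below assumes a toy row layout with a `deleted` marker; `RowVersion` and `vacuum` are hypothetical names, and scheduling, locking, and on-disk details are left out.

```cpp
#include <cstdint>
#include <vector>

// Toy row version (hypothetical layout). `deleted` is set once the row was
// deleted or superseded by a newer version appended later.
struct RowVersion {
    std::uint64_t key;
    int value;
    bool deleted;
};

// Returns a compacted copy containing only live versions, in original order.
std::vector<RowVersion> vacuum(const std::vector<RowVersion>& table) {
    std::vector<RowVersion> live;
    for (const RowVersion& row : table) {
        if (!row.deleted) live.push_back(row);
    }
    return live;
}
```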

Will the query language stay SQL?

Is an MDX transformer on the roadmap? If so, will it follow the Pentaho/Mondrian interpretation of the spec, or something that conforms to the MS/Power BI spec?

If useful there is a pretty simple implementation of the Mondrian tools available at https://github.com/Wondersoft/olaper

I have been a fan of the swanhart-tools for a long time and made good use of "Shard Query" and "Flex Views" over the years. Really glad to see it maturing.

Support pushing down joins in ECP

FastBit has some code to do bitmap index joins. Experiment with pushing down joins to FastBit to see how performance compares with hash and nested-loop joins.

Support Windows

FastBit supports Windows, so the storage engine only needs minor modifications for Windows support, such as using the proper path separators.

Add tablespace support

Having a single file per column is convenient, but many fsync calls are necessary for ACID compliance. Implement a container format (i.e., a tablespace) that stores all columns in a single file.

Support fast alter table

The following should be supported without rebuilding the entire table but still require a table lock:
Adding keys
Dropping keys
Adding columns
Dropping columns
Changing default values

The following should require rebuilding a table:
ADD PRIMARY KEY
ADD UNIQUE KEY
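The two lists above amount to a small classification, which can be written down as a sketch. `AlterOp` and `requires_rebuild` are hypothetical names chosen for illustration; they are not part of the MySQL handler interface.

```cpp
// Hypothetical enumeration of the ALTER TABLE operations listed above.
enum class AlterOp {
    AddKey,
    DropKey,
    AddColumn,
    DropColumn,
    ChangeDefault,
    AddPrimaryKey,
    AddUniqueKey
};

// True for operations that should rebuild the table; everything else is
// expected to run without a rebuild while still holding a table lock.
bool requires_rebuild(AlterOp op) {
    return op == AlterOp::AddPrimaryKey || op == AlterOp::AddUniqueKey;
}
```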

Improve unique index check performance by scanning for duplicate key values in batches of inserts

When a multi-value insert (or LOAD DATA) writes data to a table, the inserts are cached in an ibis::tablex object up to a maximum size. Unfortunately, tablex objects cannot be queried; they are only for appending data, so duplicate keys within an individual batch cannot be detected easily. Find a way around this to improve insertion performance with UNIQUE/PRIMARY keys.
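One possible workaround, offered only as a sketch, is to track the keys of the pending batch in a side hash set, since the append buffer itself cannot be queried. `first_duplicate_in_batch` is a hypothetical helper; it checks only within the batch, so keys would still need to be checked against rows already stored in the table.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Returns the index of the first key in `batch` that repeats an earlier key
// in the same batch, or -1 if all keys in the batch are distinct.
long first_duplicate_in_batch(const std::vector<std::uint64_t>& batch) {
    std::unordered_set<std::uint64_t> seen;
    seen.reserve(batch.size());
    for (std::size_t i = 0; i < batch.size(); ++i) {
        // insert() reports through .second whether the key was newly added.
        if (!seen.insert(batch[i]).second) return static_cast<long>(i);
    }
    return -1;
}
```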

Read entire delete RID bitmap into memory for faster deleted RID detection

Right now the deleted RID bitmap is read in 8-byte chunks. That is a lot of fread() calls (even though the OS should buffer the reads). A table with 64 million rows takes only 8 MB to store the bitmap entirely in RAM, so it makes sense to read the whole thing in and just point ->bits to the proper offset in the buffer.
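At one bit per row, 64,000,000 rows need 64,000,000 / 8 = 8,000,000 bytes, which is the 8 MB figure above. A minimal sketch of the single-read approach follows. It assumes one bit per RID with the least significant bit first in each byte, which may not match the real file layout, and `load_bitmap` and `is_deleted` are hypothetical helpers.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Reads the whole file into memory with a single fread() call.
// Returns an empty vector if the file cannot be opened or fully read.
std::vector<unsigned char> load_bitmap(const char* path) {
    std::vector<unsigned char> buf;
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return buf;
    if (std::fseek(f, 0, SEEK_END) == 0) {
        const long size = std::ftell(f);
        if (size > 0 && std::fseek(f, 0, SEEK_SET) == 0) {
            buf.resize(static_cast<std::size_t>(size));
            if (std::fread(buf.data(), 1, buf.size(), f) != buf.size()) buf.clear();
        }
    }
    std::fclose(f);
    return buf;
}

// Tests bit `rid` in the buffer (assumed LSB-first within each byte).
// RIDs beyond the end of the buffer count as not deleted.
bool is_deleted(const std::vector<unsigned char>& bitmap, std::uint64_t rid) {
    const std::uint64_t byte = rid / 8;
    if (byte >= bitmap.size()) return false;
    return ((bitmap[static_cast<std::size_t>(byte)] >> (rid % 8)) & 1u) != 0;
}
```

After the one-time load, each lookup is a shift and a mask on memory already in the buffer, with no further I/O calls.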

Use smallest column for count(*)

Currently the first column is used for count(*). Instead, find the smallest column (or the first NULL marker) and use that for count(*) to reduce I/O.
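The selection step could look like the sketch below, which assumes the on-disk size of each column file is already known. `ColumnFile` and `pick_count_column` are hypothetical names, and the NULL-marker alternative is not modeled.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one column's data file.
struct ColumnFile {
    std::string name;
    std::uint64_t bytes;  // on-disk size of the column's data file
};

// Returns the index of the column with the smallest file, or -1 if there are
// no columns. Ties resolve to the earliest column.
long pick_count_column(const std::vector<ColumnFile>& cols) {
    long best = -1;
    for (std::size_t i = 0; i < cols.size(); ++i) {
        if (best < 0 || cols[i].bytes < cols[static_cast<std::size_t>(best)].bytes) {
            best = static_cast<long>(i);
        }
    }
    return best;
}
```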

Generate transaction ids using InnoDB

In order to support CDC, transaction id values should use the InnoDB transaction id generator. InnoDB is always available in the server because it is the DD storage engine.
