greenlion / warp

This project forked from mysql/mysql-server

WarpSQL Server, an open source OLAP-focused distribution of the world's most popular open source database, bundled with OLAP performance-related plugins such as the WARP storage engine.

Home Page: http://warpsql.blog

License: Other

Shell 1.05% CMake 0.93% C++ 76.94% C 15.21% Perl 0.57% Objective-C 0.70% Pascal 0.14% Python 0.02% Makefile 1.47% Java 1.75% HTML 0.57% CSS 0.01% JavaScript 0.21% PHP 0.09% Awk 0.01% PLSQL 0.01% Batchfile 0.01% Assembly 0.03% SourcePawn 0.01% Yacc 0.31%

warp's Issues

Improve index lookups so 'ranges to the right rule' is not required

WARP can use all parts of an explicit key, so remove the logic that enforces the b-tree search rules on range scans.
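
As a rough illustration of why key-part order matters less for bitmap indexes: if each key part's predicate yields its own bitmap of matching rows, the bitmaps can simply be intersected. The `Bitmap` alias and `bitmap_and` helper below are hypothetical and use an uncompressed `std::vector<bool>`; they are not FastBit's API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical uncompressed bitmap: bit i is set when row i matches a predicate.
using Bitmap = std::vector<bool>;

// Intersect the matches of two key-part predicates. Either predicate may be
// a range, and the order of the key parts does not matter.
Bitmap bitmap_and(const Bitmap& a, const Bitmap& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    Bitmap out(n);
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] && b[i];
    return out;
}
```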

Push numeric comparisons to strings to MySQL layer

A predicate such as where textcol < 1 cannot be evaluated by FastBit and returns wrong results. Comparisons between strings and integers need to be handled at the MySQL layer.
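
A minimal sketch of the kind of numeric coercion such a comparison needs, assuming the string is compared by its leading numeric prefix and a string with no numeric prefix counts as 0. The helper names are hypothetical, and `std::strtod` is only a stand-in: the server's real conversion rules differ in details (for example, `strtod` also accepts hexadecimal and "inf").

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: parse the longest numeric prefix of a string.
// A string with no numeric prefix is treated as 0 (an assumption).
double numeric_prefix(const std::string& s) {
    const char* begin = s.c_str();
    char* end = nullptr;
    const double value = std::strtod(begin, &end);
    return end == begin ? 0.0 : value;
}

// Evaluate `textcol < rhs` numerically instead of comparing bytes.
bool text_less_than_number(const std::string& textcol, double rhs) {
    return numeric_prefix(textcol) < rhs;
}
```

Under this sketch, "10" is not less than 9, whereas a bytewise comparison of "10" and "9" would say it is.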

Create test cases with mysqltest

Create a basic feature test using mysqltest

Support MySQL native partitioning

Investigate the storage engine (SE) interface for partitioning and implement partitioning for WARP.

Implement fully atomic INSERT and UPDATE operations

n/t

Support UNIQUE_CHECKS = 0

To support faster loading of data, it should be possible to disable UNIQUE and PRIMARY KEY checking. Of course, this could allow duplicate data into tables, because WARP is not index organized and a UNIQUE/PRIMARY KEY is just a hint to check for duplicates at insertion/update time.

Improve DECIMAL support to store and compare packed decimals

This is an on-disk format change, so it should be part of beta 2 so that tables do not need to be repaired/reconstructed in beta 3.

Support explicit indexes on NULLable columns

n/t

Support MVCC and ACID w/ REPEATABLE READ

Need to add trx_id to each table row similar to rowid
Need to add visibility support to transactions so that a transaction sees only the rows written by transactions that are visible to it.
Add row level locking
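
One way to picture the trx_id and visibility items above is a snapshot check, sketched here under assumptions: the `Snapshot` struct and `row_visible` function are hypothetical, and the sketch assumes rows written by rolled-back transactions are removed or tracked elsewhere.

```cpp
#include <cstdint>
#include <unordered_set>

// Hypothetical snapshot taken when a REPEATABLE READ transaction starts.
struct Snapshot {
    std::uint64_t own_trx_id;                  // id of the reading transaction
    std::uint64_t next_trx_id;                 // first id not yet assigned at snapshot time
    std::unordered_set<std::uint64_t> active;  // transactions still open at snapshot time
};

// A row version is visible if the reader wrote it, or if its writer had
// already finished when the snapshot was taken.
bool row_visible(const Snapshot& snap, std::uint64_t row_trx_id) {
    if (row_trx_id == snap.own_trx_id) return true;
    if (row_trx_id >= snap.next_trx_id) return false;  // writer started after the snapshot
    return snap.active.count(row_trx_id) == 0;         // writer was not in flight
}
```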

Support ACID insert/update/delete semantics

If a statement errors out after inserting rows, the rows do not roll back. All statements should be atomic. The same goes for deletions and updates. An operation cancelled in the middle leaves a partially changed base table. This is the same as MyISAM.
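
For the insert case, statement-level atomicity on an append-only table could be sketched as "remember the row count, truncate back on error". `AppendTable` is a hypothetical in-memory stand-in, not WARP's on-disk layout, and it does not cover updates or deletes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical append-only table with a statement-level undo mark.
template <typename Row>
class AppendTable {
public:
    // Record where the statement started so it can be undone.
    std::size_t begin_statement() const { return rows_.size(); }

    void append(const Row& r) { rows_.push_back(r); }

    // Discard every row appended since the mark was taken.
    void rollback_to(std::size_t mark) {
        if (mark < rows_.size()) rows_.erase(rows_.begin() + mark, rows_.end());
    }

    std::size_t size() const { return rows_.size(); }

private:
    std::vector<Row> rows_;
};
```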

Bonus: add transaction support for more than one table.

Switch from upstream MySQL to Percona Server

Since Percona Server supports RocksDB and has other useful enhancements, fork it instead of MySQL.

DROP DATABASE fails

DROP TABLE works fine, but MySQL does not call handler::delete_table for all tables in a database when the database is dropped.

Support case insensitive collations

FastBit is not collation aware. Strings in the bitmaps need to be compared using a case insensitive collation comparison when a ci collation is used.
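
A minimal ASCII-only sketch of such a comparison follows. The function `compare_ci_ascii` is hypothetical; real ci collations also handle multi-byte characters, accents, and padding rules, which this stand-in ignores.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// ASCII-only stand-in for a case insensitive collation comparison.
// Returns a negative value, 0, or a positive value, like strcmp.
int compare_ci_ascii(const std::string& a, const std::string& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        const int ca = std::toupper(static_cast<unsigned char>(a[i]));
        const int cb = std::toupper(static_cast<unsigned char>(b[i]));
        if (ca != cb) return ca < cb ? -1 : 1;
    }
    if (a.size() == b.size()) return 0;
    return a.size() < b.size() ? -1 : 1;  // the shorter string sorts first
}
```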

Support special $COLUMN columns in tables

Make the Fastbit rowid value available to the MySQL table if the special column name $rowid is used in a table. Insertions should ignore any $rowid value passed in during an insert.

Support clone plugin

MySQL 8.0 can clone a server from another server. Right now only InnoDB is supported by the server for cloning. Cloning WARP based tables should be supported as well. This will require modifying the server and won't be pluggable, so it is a semi-non-compatible change to upstream MySQL. WARP could still be loaded into a non-WarpSQL server, but cloning would clone WARP tables as empty.

Investigate why 8.0.20 breaks ECP and other features of WARP

Merging upstream 8.0.20 broke the engine. Merge it again in a branch and investigate why it no longer works.

Implement background "vacuum" for WARP tables

Because FastBit is an append-only database, WARP does not do in-place updates. This means tables with frequent updates will grow substantially, since old row versions are maintained until an OPTIMIZE TABLE operation is run on them. WARP should be able to automatically rebuild tables in the background to remove old versions.
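
The core of such a rebuild can be sketched as copying only the live rows into a fresh table image. The `compact` function and the `deleted` flags are hypothetical in-memory stand-ins; scheduling, locking, and the on-disk format are left out.

```cpp
#include <cstddef>
#include <vector>

// Copy only live rows into a fresh table image.
// deleted[i] marks row i as a superseded or deleted version.
template <typename Row>
std::vector<Row> compact(const std::vector<Row>& rows,
                         const std::vector<bool>& deleted) {
    std::vector<Row> live;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        const bool is_deleted = i < deleted.size() && deleted[i];
        if (!is_deleted) live.push_back(rows[i]);
    }
    return live;
}
```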

Will the query language stay SQL?

Is an MDX transformer on the roadmap? If so, will it follow the Pentaho/Mondrian interpretation of the spec or something that conforms to the MS/Power BI spec?

If useful there is a pretty simple implementation of the Mondrian tools available at https://github.com/Wondersoft/olaper

I have been a fan of the swanhart-tools for a long time and made good use of "Shard Query" and "Flex Views" over the years. Really glad to see it maturing.

Support savepoints

n/t

Improve speed of MySQL aggregate functions

Use vectorized instructions for aggregate functions.
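
As a hedged illustration, a sum can be written with independent accumulator lanes, a loop shape that compilers can often map onto SIMD instructions. Whether vector code is actually emitted depends on the compiler and flags, and `sum_lanes` is a hypothetical helper rather than anything in the server.

```cpp
#include <cstddef>
#include <cstdint>

// Sum with four independent accumulators so that consecutive additions
// do not depend on each other.
std::int64_t sum_lanes(const std::int64_t* data, std::size_t n) {
    std::int64_t lane[4] = {0, 0, 0, 0};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        lane[0] += data[i];
        lane[1] += data[i + 1];
        lane[2] += data[i + 2];
        lane[3] += data[i + 3];
    }
    std::int64_t total = lane[0] + lane[1] + lane[2] + lane[3];
    for (; i < n; ++i) total += data[i];  // leftover elements
    return total;
}
```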

Add warp_data_directory option

Store WARP tables in an alternative data directory by default.

PRIMARY KEY and UNIQUE KEY indexes do not enforce uniqueness

n/t

Support pushing down joins in ECP

FastBit has some code to do bitmap index joins. Experiment with pushing down joins to FastBit to see how performance compares with hash and nested-loop joins.

concurrent LOAD DATA INFILE blocks, but concurrent INSERT works

Figure out why LOAD DATA INFILE cannot load in parallel.

Support HUGEINT (128bit int) and increase decimal storage size to 128 bits internally

Aggregation functions like SUM/STDDEV/etc can overflow on bigint or decimal expressions. Add new data types to prevent this. This is important for scientific computing.
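
A small sketch of the overflow problem and a wider accumulator follows. It relies on `__int128`, which is a GCC/Clang extension rather than standard C++, and `sum_wide` is a hypothetical helper, not the proposed HUGEINT type itself.

```cpp
#include <cstdint>
#include <vector>

// Accumulate 64-bit values in a 128-bit total so that the sum of many
// BIGINT values cannot wrap around a 64-bit range.
// Note: __int128 is a compiler extension (GCC/Clang), an assumption here.
__int128 sum_wide(const std::vector<std::int64_t>& values) {
    __int128 total = 0;
    for (std::int64_t v : values) total += v;
    return total;
}
```

Summing two values of INT64_MAX with a 64-bit accumulator would overflow, while the 128-bit total holds the exact result.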

Support Windows

FastBit supports Windows, so the storage engine just needs some minor modifications for Windows support, including using proper path separators.

Create docker image for WARP server for easy testing by end users

n/t

DELETE FROM table with no WHERE clause corrupts table

TRUNCATE TABLE works correctly, but DELETE w/out WHERE clause does not.

Some ECP comparisons in 8.0.21 are not working properly

where l_shipdate = '1992-01-01' -- not pushed down
where l_shipdate IS NULL -- wrong results!

Support star-schema optimization and bitmap index merge joins

Add tablespace support

Having a single file per column is convenient, but many fsync calls are necessary for ACID compliance. Implement a container format (i.e., a tablespace) that stores all columns in a single file.

Support fast alter table

The following should be supported without rebuilding the entire table but still require a table lock:
Adding keys
Dropping keys
Adding columns
Dropping columns
Changing default values

The following should require rebuilding a table:
ADD PRIMARY KEY
ADD UNIQUE KEY

Parallel query support

Port some shard-query logic into the database directly for parallel query.

Improve unique index check performance by scanning for duplicate key values in batches of inserts

When a multi-value insert (or load data) writes data to a table, the inserts are cached in an ibis::tablex object up to a maximum size. Unfortunately it is not possible to query tablex objects; they are only for appending data, so insertions of duplicate keys within an individual batch cannot be detected easily. Find a way around this to improve insertion performance with UNIQUE/PRIMARY keys.
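
One possible workaround, sketched under assumptions, is to keep a small hash set of the keys in the current batch next to the append-only buffer. `BatchKeyTracker` is hypothetical and handles only single-column integer keys. Duplicates against rows already in the table would still need a separate check.

```cpp
#include <cstdint>
#include <unordered_set>

// Hypothetical side structure kept next to the append-only write buffer.
class BatchKeyTracker {
public:
    // Returns false if the key was already seen in the current batch.
    bool add(std::int64_t key) { return seen_.insert(key).second; }

    // Call after the batch has been flushed to the table.
    void reset() { seen_.clear(); }

private:
    std::unordered_set<std::int64_t> seen_;
};
```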

Read entire delete RID bitmap into memory for faster deleted RID detection

Right now the deleted RID bitmap is read in 8-byte chunks. That is a lot of fread() calls (even though the OS should buffer the reads). At one bit per row, a table with 64 million rows takes only 8MB to store the bitmap entirely in RAM, so it makes sense to read the whole thing in and just point ->bits to the proper offset in the buffer.
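
A minimal sketch of the idea, assuming one bit per row with the least significant bit first in each byte. The actual bit order and file layout of the delete bitmap are not stated here, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read the whole bitmap file in one pass instead of many small reads.
// Returns an empty buffer if the file cannot be opened.
std::vector<unsigned char> load_bitmap(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}

// Test the bit for `rowid` (assumed LSB-first within each byte).
// Rows past the end of the bitmap count as not deleted.
bool is_deleted(const std::vector<unsigned char>& bitmap, std::uint64_t rowid) {
    const std::uint64_t byte = rowid / 8;
    if (byte >= bitmap.size()) return false;
    return ((bitmap[byte] >> (rowid % 8)) & 1u) != 0;
}

// One bit per row, rounded up to whole bytes.
std::uint64_t bitmap_bytes(std::uint64_t rows) { return (rows + 7) / 8; }
```

With these assumptions, bitmap_bytes(64000000) is 8,000,000 bytes, which matches the 8MB estimate above.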