chotchki / feophant
A PostgreSQL inspired SQL database written in Rust.
License: GNU Affero General Public License v3.0
Currently page offsets are generated as part of the insertion into the file system. However this results in a bad assumption in a couple layers of the stack.
My issue is that the code ends up assuming that inserting a new page always produces the logical next page.
If I can avoid the extraction I would prefer it, we'll see.
Right now I have an index manager but it's not part of the actual implementation.
Time to integrate it.
See where I can pass read-only data by reference instead of using Arc everywhere.
Need to make a decision on the future of the project.
Acceptance Criteria:
Will likely need to start tracing columns from analyzing through to later stages.
Right now the main function runs the server from primitives. The Tokio Tower layer will probably do it better.
Implement support for running a fuzzer against the code base to ensure we are keeping the code at a high quality.
Right now pgbench fails on an initial attempt to drop the tables. We should add drop support.
drop table if exists pgbench_accounts, pgbench_branches, pgbench_history, pgbench_tellers
feophant/src/codec/network_frame.rs
Line 96 in a67e8f7
I feel like I parse too early however any work on this should wait until I can really profile this.
At the moment I have the license set to AGPL 3.0 but that IS NOT conducive to getting people interested in this as a library. Once I get to an MVP I need to strongly consider relicensing to something like BSD/Apache/MIT.
As I write this I might make the decision sooner than later.
Did more thinking, I should implement postgres's streams concept so that I don't need to do lookups to find associated metadata on an object.
I thought I was going to get to use uuid + page offset. I think it's now going to be uuid + page offset + type.
struct PageId + enum PageType should do it (done).
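A minimal sketch of how a `PageId` plus `PageType` pair could address pages, so that related streams of an object are found by construction instead of by lookup. The variant names, the `PageAddress` type, the `file_name` helper and the use of `u128` in place of a real UUID type are all assumptions for illustration, not the project's actual definitions.

```rust
/// Kind of data a page stream holds. These variants are assumed for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum PageType {
    Data,
    FreeSpaceMap,
    VisibilityMap,
}

/// Identifies one stream of pages: the owning object plus the kind of pages.
/// `resource_key` stands in for a UUID (u128 keeps the sketch dependency-free).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct PageId {
    pub resource_key: u128,
    pub page_type: PageType,
}

/// Full address of a single page: uuid + type + page offset.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct PageAddress {
    pub page_id: PageId,
    pub offset: usize,
}

/// Hypothetical helper: derives a file name from the id, so the metadata
/// streams of an object sit next to its data without a catalog lookup.
pub fn file_name(id: &PageId) -> String {
    let suffix = match id.page_type {
        PageType::Data => "data",
        PageType::FreeSpaceMap => "fsm",
        PageType::VisibilityMap => "vm",
    };
    format!("{:032x}.{}", id.resource_key, suffix)
}
```

With this shape, the free space map for a table is reached by swapping the `page_type` on the table's own `PageId`.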
I've been using a series of layered providers I've been calling managers and it's gotten overly complex.
I need a better naming convention and package organization structure.
A lot of my low level code is very heavy and sensitive to locks. The Tokio project has a library called loom that seems to offer a reasonable way to test this type of code for correctness.
I suspect I'll have to move away from Github based runners once I do so.
This implies the following functionality:
Based on my reading, this really means implementing B-tree indexes. They don't seem to be that bad to understand/implement.
First and most important question, how should the index layers work?
Are they transactional? (I don't think so until I implement a visibility map)
How should the low level layer function?
Should I have an Index config struct I pass around, or just a table + columns + unique or not + type?
Index Config it is
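A rough sketch of what such an index config struct might carry, based only on the fields listed above (table, columns, unique or not, type). The field names, the single `BTree` variant, and the `key_for` helper are assumptions for illustration.

```rust
/// Index kinds. Only B-tree is discussed in these notes.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum IndexType {
    BTree,
}

/// Everything the index layers need to know about one index, passed around as a unit.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct IndexConfig {
    /// Stand-in for the table's UUID.
    pub table_id: u128,
    /// Indexed columns, in key order.
    pub columns: Vec<String>,
    pub unique: bool,
    pub index_type: IndexType,
}

impl IndexConfig {
    /// Hypothetical helper: builds the index key for a row by picking the
    /// indexed columns in key order. Returns None if a column is missing.
    pub fn key_for(&self, row: &[(&str, i64)]) -> Option<Vec<i64>> {
        self.columns
            .iter()
            .map(|col| {
                row.iter()
                    .find(|(name, _)| *name == col.as_str())
                    .map(|(_, value)| *value)
            })
            .collect()
    }
}
```

Passing one struct keeps function signatures stable when a new index property is added later.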
Index Manager -> for a given table
IO Manager -> Handle Page Load / Store / Update
Implemented the formats but I think I need to add locking to the I/O manager.
At a minimum I need to support a get for update, update and release lock.
I'm not sure I understand how this should work :(. I think I need to commit to another layer.
Back to indexes for now. I need to make a decision on how to handle them hitting the file system.
Postgres uses a series of OIDs to map onto disk.
I've been using uuids, I think I'm going to continue that. That would also solve the postgres fork approach.
Next up implementing the index manager to add entries to the index.
I'm having a hard time figuring this out. I might work on the tree operations before I keep messing with the serialization protocols. I'm just worried they are directly linked.
Got further into the index manager. Unfortunately I need a lock manager to let it even pass the smell test. Time to go on a wild goose chase again! (This project is great for someone with ADHD to have fun on!)
The lock manager design/code is done but I'm not happy with using a rwlock to protect a tag. I really want to have the lock protect the content but that needs a way for me to support writeback. I think I need to build out two more things, a WAL mechanism and a buffer manager.
I guess I need to commit to doing this for reals. However I am worried about reaching a point of partially working for a while like when I did the type fixing. We'll see how this goes.
For now, the index implementation is on hold until I get an integrated I/O subsystem and a stubbed out WAL.
Currently all pages in the system are being handled via Tokio's BytesMut struct and related traits. This struct is awesome for giving me easy read/write methods on an underlying set of pointers.
However I have discovered a critical issue that I am uncertain how to solve effectively.
So right now I have a combination of features in my lock/cache system:
The issue I am facing is as follows:
Postgres has arbitrary precision types that are critical for financial applications.
Implementing them (see https://www.postgresql.org/docs/current/datatype-numeric.html) will require picking up a Rust arbitrary precision library.
Right now the most reasonable library, https://github.com/alkis/decimal, seems to have major limitations and is basically a wrapper around the C library at http://speleotrove.com/decimal/.
The BTreeLeafs have a doubly linked list to scan back and forth once you reach the bottom.
They are stored in multiple pages so if the database crashes they could be corrupted.
To work around this until I figure out a better way I can scan each leaf on database startup.
Will need to determine whether the database shut down cleanly or not.
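The startup scan could look roughly like the sketch below: walk every leaf and check that each sibling pointer is mirrored by the neighbor it points at. The `Leaf` struct and `find_broken_links` function are hypothetical and use an in-memory map keyed by page offset instead of real pages.

```rust
use std::collections::HashMap;

/// Simplified leaf: only the sibling pointers (page offsets) matter here.
pub struct Leaf {
    pub left: Option<usize>,
    pub right: Option<usize>,
}

/// Returns the sorted offsets of leaves whose sibling links are not mirrored
/// by the neighbor, which is what a crash between two page writes could cause.
pub fn find_broken_links(leaves: &HashMap<usize, Leaf>) -> Vec<usize> {
    let mut broken = Vec::new();
    for (&offset, leaf) in leaves {
        // The right neighbor must exist and point back at this leaf.
        let right_ok = match leaf.right {
            None => true,
            Some(r) => leaves.get(&r).map_or(false, |n| n.left == Some(offset)),
        };
        // Same check in the other direction.
        let left_ok = match leaf.left {
            None => true,
            Some(l) => leaves.get(&l).map_or(false, |n| n.right == Some(offset)),
        };
        if !(right_ok && left_ok) {
            broken.push(offset);
        }
    }
    broken.sort_unstable();
    broken
}
```

If a clean-shutdown marker is available, this scan only needs to run when the marker is missing.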
Currently new pages are selected based on appending to the end instead of based on pages being in use or not.
Switch this to using just the free space map.
Row manager will make use of the free/full markings.
Index manager will make use of them as in use/not in use.
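As an illustration of selecting pages from the free space map instead of always appending, here is a small in-memory sketch. `FreeSpaceMap`, `PageStatus`, `claim_page` and `mark` are hypothetical names, and claiming a page marks it full immediately, which is a simplification: the caller would mark it free again while space remains.

```rust
/// Per-page marker. The row manager reads it as free/full,
/// the index manager as not in use/in use.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PageStatus {
    Free,
    Full,
}

/// One status per page offset. A real map would be stored in pages itself.
#[derive(Default)]
pub struct FreeSpaceMap {
    statuses: Vec<PageStatus>,
}

impl FreeSpaceMap {
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns the first free page offset and marks it full.
    /// Only when no page is free does the map grow by one page.
    pub fn claim_page(&mut self) -> usize {
        if let Some(offset) = self.statuses.iter().position(|s| *s == PageStatus::Free) {
            self.statuses[offset] = PageStatus::Full;
            offset
        } else {
            self.statuses.push(PageStatus::Full);
            self.statuses.len() - 1
        }
    }

    /// Updates the marker for an existing page; unknown offsets are ignored.
    pub fn mark(&mut self, offset: usize, status: PageStatus) {
        if let Some(slot) = self.statuses.get_mut(offset) {
            *slot = status;
        }
    }
}
```

Because the returned offset comes from the map, callers can no longer assume that a new page is the logical next page.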
Right now I have licensing issues as identified by Fossa. These are not due to the code but the various support tools around the code.
To fix this I will need to restructure the project to separate testing tools from the core code. They are already separated but it's hard for Fossa to tell. However, since I'm not presenting this code as production ready, meh, I'll fix it later.
So I have a fully ready free space map but I can't avoid the locking issue anymore despite it also being the next item on the todo list.
So I started trying to bake in locks and I've hit my usual question: do I go for a read lock or a write lock approach? I think I may have screwed up the lock manager design by building it in a silo, oops!
So what I really need in a lock manager is this:
I'm debating whether I need to put a check-in / check-out layer on top of the file manager.
So I could expose an api for the following:
Read - Easy
ReadForUpdate - This is the hard part
Ideally
Append - Needs to be serialized
//Ignoring Delete/shrink for now
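One possible shape for this API, sketched with standard library locks over in-memory pages rather than the real file manager. `PageStore` and its methods are hypothetical. Here ReadForUpdate is collapsed into a closure-based `update`, so getting the page for update, updating it and releasing the lock happen in a single call and the lock cannot be leaked.

```rust
use std::sync::{Arc, Mutex, RwLock};

pub type Page = Vec<u8>;

/// In-memory stand-in for a check-in / check-out layer over the file manager.
#[derive(Default)]
pub struct PageStore {
    // The outer mutex serializes appends; each page has its own RwLock.
    pages: Mutex<Vec<Arc<RwLock<Page>>>>,
}

impl PageStore {
    pub fn new() -> Self {
        Self::default()
    }

    /// Append: serialized by the mutex, returns the offset that was assigned.
    pub fn append(&self, page: Page) -> usize {
        let mut pages = self.pages.lock().unwrap();
        pages.push(Arc::new(RwLock::new(page)));
        pages.len() - 1
    }

    /// Clones the handle so the outer mutex is not held while a page is locked.
    fn handle(&self, offset: usize) -> Option<Arc<RwLock<Page>>> {
        self.pages.lock().unwrap().get(offset).cloned()
    }

    /// Read: takes the shared lock and returns a copy of the page.
    pub fn read(&self, offset: usize) -> Option<Page> {
        let handle = self.handle(offset)?;
        let copy = handle.read().unwrap().clone();
        Some(copy)
    }

    /// ReadForUpdate + Update + release: the write lock is held only while
    /// the closure runs and is dropped before this function returns.
    pub fn update<R>(&self, offset: usize, f: impl FnOnce(&mut Page) -> R) -> Option<R> {
        let handle = self.handle(offset)?;
        let mut guard = handle.write().unwrap();
        let result = f(&mut *guard);
        Some(result)
    }
}
```

The trade-off of the closure form is that a caller cannot hold two pages for update across separate calls, which a B-tree split would probably need.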
I should treat this as a lesson for not implementing something outside the greater system and realizing I can't integrate it.