
olegdb's People

Contributors

colby, hamcha, kyleterry, prestonsmith01, qpfiffer, xe, zerozshadow


olegdb's Issues

Fix Hash collisions by implementing Cuckoo Hashing

Currently we just silently destroy existing hashes if there are collisions. That causes memory leaks (can be seen by running the jar test). We probably shouldn't be discarding data on collision anyway.

Verify that the memory is freed with valgrind.
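
For reference, a minimal sketch of cuckoo insertion in C. Everything here is an assumption made for illustration: the `cuckoo` type, the two hash functions (FNV-1a and djb2), the tiny fixed table size, and the borrowed string keys are not taken from the OlegDB source. The point it tries to show is that a collision evicts the old key into its alternate slot rather than destroying it, and that a failed insert hands the displaced key back to the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOTS 8        /* slots per table; tiny on purpose */
#define MAX_KICKS 16   /* give up after this many evictions */

/* Hypothetical table: two arrays of borrowed key pointers, NULL means empty. */
typedef struct {
    const char *slot[2][SLOTS];
} cuckoo;

static uint32_t hash_fnv1a(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s; s++) { h ^= (unsigned char)*s; h *= 16777619u; }
    return h;
}

static uint32_t hash_djb2(const char *s) {
    uint32_t h = 5381u;
    for (; *s; s++) h = h * 33u + (unsigned char)*s;
    return h;
}

/* Each table uses its own hash function. */
static size_t slot_for(int table, const char *key) {
    return (table == 0 ? hash_fnv1a(key) : hash_djb2(key)) % SLOTS;
}

/* A key can only ever live in one of two places, so lookup checks both. */
int cuckoo_contains(const cuckoo *c, const char *key) {
    for (int t = 0; t < 2; t++) {
        const char *k = c->slot[t][slot_for(t, key)];
        if (k != NULL && strcmp(k, key) == 0) return 1;
    }
    return 0;
}

/* Returns 0 on success. Returns -1 if MAX_KICKS evictions did not settle;
 * *homeless then holds the one key that is no longer in the table, so the
 * caller can grow the table and re-insert it instead of losing it. */
int cuckoo_insert(cuckoo *c, const char *key, const char **homeless) {
    if (cuckoo_contains(c, key)) return 0;
    int t = 0;
    for (int i = 0; i < MAX_KICKS; i++) {
        size_t idx = slot_for(t, key);
        const char *evicted = c->slot[t][idx];
        c->slot[t][idx] = key;
        if (evicted == NULL) return 0;
        key = evicted;   /* the displaced key tries its other table next */
        t = 1 - t;
    }
    *homeless = key;
    return -1;
}
```

A real version would own copies of keys and values and would resize or rehash on failure; the sketch leaves both out to stay short.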

make_install lib location bug

Currently the makefile specifies a command-line macro that gets passed to erlc. If the beam files were compiled with make, the macro tells them to look for libraries in ./build/lib. This makes sense for debugging.
However, if make install is run, we need to recompile the beam files with the PREFIX specified by the user so that Erlang knows where to look for the library.
As it stands, programs built in the standard make -> make install fashion fail because they search for libraries in a directory that doesn't exist.

Multiple Database Instances

The functionality in the front end for this is there, but currently everything is just defaulted to an "oleg" db in /tmp. This needs to be configurable and interchangeable.

massacre.sh

This needs to actually do something useful.

Add more information to meta

  • Records inserted (Ever)
  • Records deleted (Ever)
  • Database size (in bytes)
  • Times resized

Anything else we can think of
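
One possible shape for these counters, as a sketch only. The `sketch_meta` struct and its helper functions are hypothetical names, and counting "database size" as the sum of stored value bytes is an assumption; the real meta structure may track size differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical meta block holding the counters listed above. */
typedef struct {
    uint64_t records_inserted;  /* lifetime total, never decremented */
    uint64_t records_deleted;   /* lifetime total, never decremented */
    uint64_t size_bytes;        /* assumed: sum of stored value lengths */
    uint32_t times_resized;
} sketch_meta;

void sketch_meta_on_insert(sketch_meta *m, size_t value_len) {
    m->records_inserted++;
    m->size_bytes += value_len;
}

void sketch_meta_on_delete(sketch_meta *m, size_t value_len) {
    m->records_deleted++;
    m->size_bytes -= value_len;
}

void sketch_meta_on_resize(sketch_meta *m) {
    m->times_resized++;
}
```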

Struct size string thing

All char * instances should be replaced with structures that contain both length and the value. This will let people access the size.
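
A rough sketch of what such a structure could look like. The `sketch_str` name and its two helpers are made up for this example and are not part of the existing API. Carrying the length explicitly also means values with embedded NUL bytes keep their full size.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a byte string that carries its own length. */
typedef struct {
    size_t len;   /* number of bytes in data, not counting the trailing NUL */
    char  *data;  /* heap-allocated copy, NUL-terminated for convenience */
} sketch_str;

/* Copy len bytes from src into out. Returns 0 on success, -1 on failure. */
int sketch_str_init(sketch_str *out, const char *src, size_t len) {
    out->data = malloc(len + 1);
    if (out->data == NULL) {
        out->len = 0;
        return -1;
    }
    memcpy(out->data, src, len);
    out->data[len] = '\0';
    out->len = len;
    return 0;
}

/* Release the copy and reset the struct so a second free is harmless. */
void sketch_str_free(sketch_str *s) {
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```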

Server crashes with large payloads... sometimes.

[ git:master ]
$ ./run_server.sh
[-] Starting server.
[-] Listening on port 8080
[-] Continuing to read.
[-] Continuing to read.
[-] Continuing to read.
[ERROR](src/aol.c:99: errno: Resource temporarily unavailable) Error reading
[ERROR](src/aol.c:140: errno: None) Error reading
Feb 23 21:45:15 [x] Restore failed. Corrupt AOL?
Feb 23 21:45:15 [x] Error during AOL restore...
./run_server.sh: line 5: 26692 Segmentation fault erl -pa ./build/bin -noshell -s olegdb main -s init stop

[ OK kyle@insomnia:~ ]
$ curl -vvvv -X POST -d @kyleterry.com.html http://localhost:8080/kyleterry --header "Content-Type: text/html"

* Hostname was NOT found in DNS cache
*   Trying ::1...
* connect to ::1 port 8080 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> POST /kyleterry HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:8080
> Accept: */*
> Content-Type: text/html
> Content-Length: 4805
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* Server OlegDB/fresh_cuts_n_jams is not blacklisted
< Server: OlegDB/fresh_cuts_n_jams
< Content-Length: 0
* Empty reply from server
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server

@kyleterry.com.html payload is here: https://gist.github.com/kyleterry/9182593

Segfault attempting to free bucket

Seems to happen specifically in the ol_scoop call in ol_content_type after a key has expired. Very odd.

[-] Requesting <<"%00%91f53%E7%5D%CE%9D%E5">>
DEBUG src/port_driver.c:149: Command from server: 2
DEBUG src/port_driver.c:120: Key: %00%91f53%E7%5D%CE%9D%E5
DEBUG src/port_driver.c:121: get_type klen: 24
DEBUG src/port_driver.c:127: Content type: application/octet-stream
DEBUG src/oleg.c:214: New key: %00%91f53%E7%5D%CE%9D%E5 Klen: 24
DEBUG src/oleg.c:326: Made Expiration: 1396509112
DEBUG src/oleg.c:214: New key: %00%91f53%E7%5D%CE%9D%E5 Klen: 24
DEBUG src/oleg.c:214: New key: %00%91f53%E7%5D%CE%9D%E5 Klen: 24
DEBUG src/oleg.c:326: Made Expiration: 1396509112
DEBUG src/oleg.c:214: New key: %00%91f53%E7%5D%CE%9D%E5 Klen: 24
*** Error in `/usr/lib/erlang/erts-5.10.4/bin/beam.smp': double free or corruption (fasttop): 0x00007fd8ec82f0a0 ***
======= Backtrace: =========
/usr/lib/libc.so.6(+0x731ff)[0x7fd91163b1ff]
/usr/lib/libc.so.6(+0x789ae)[0x7fd9116409ae]
/usr/lib/libc.so.6(+0x796b6)[0x7fd9116416b6]
./build/lib/liboleg.so(+0x4ce1)[0x7fd908f34ce1]
./build/lib/liboleg.so(ol_scoop+0x247)[0x7fd908f36203]
./build/lib/liboleg.so(ol_content_type+0xc0)[0x7fd908f36314]
./build/lib/libolegserver.so(+0x2a86)[0x7fd90913aa86]
/usr/lib/erlang/erts-5.10.4/bin/beam.smp(erts_port_output+0x117a)[0x4988fa]
/usr/lib/erlang/erts-5.10.4/bin/beam.smp(erts_port_command+0x4b9)[0x49a6b9]
/usr/lib/erlang/erts-5.10.4/bin/beam.smp(erl_send+0x84e)[0x4874fe]
/usr/lib/erlang/erts-5.10.4/bin/beam.smp(process_main+0x4359)[0x5599b9]
/usr/lib/erlang/erts-5.10.4/bin/beam.smp[0x4a283a]
/usr/lib/erlang/erts-5.10.4/bin/beam.smp[0x5dc595]
/usr/lib/libpthread.so.0(+0x80a2)[0x7fd911b800a2]
/usr/lib/libc.so.6(clone+0x6d)[0x7fd9116add1d]
======= Memory map: ========
00400000-00645000 r-xp 00000000 08:05 950201                             /usr/lib/erlang/erts-5.10.4/bin/beam.smp
00844000-00845000 r--p 00244000 08:05 950201                             /usr/lib/erlang/erts-5.10.4/bin/beam.smp
00845000-00897000 rw-p 00245000 08:05 950201                             /usr/lib/erlang/erts-5.10.4/bin/beam.smp
00897000-008b9000 rw-p 00000000 00:00 0 
02827000-02869000 rw-p 00000000 00:00 0                                  [heap]
7fd8cc000000-7fd8cc009000 rw-p 00000000 00:00 0 
7fd8cc009000-7fd8d0000000 ---p 00000000 00:00 0 
7fd8d0000000-7fd8d0009000 rw-p 00000000 00:00 0 
7fd8d0009000-7fd8d4000000 ---p 00000000 00:00 0 
7fd8d4000000-7fd8d4009000 rw-p 00000000 00:00 0 
7fd8d4009000-7fd8d8000000 ---p 00000000 00:00 0 
7fd8d8000000-7fd8d8009000 rw-p 00000000 00:00 0 
7fd8d8009000-7fd8dc000000 ---p 00000000 00:00 0 
7fd8dc000000-7fd8dc008000 rw-p 00000000 00:00 0 
7fd8dc008000-7fd8e0000000 ---p 00000000 00:00 0 
7fd8e4000000-7fd8e4008000 rw-p 00000000 00:00 0 
7fd8e4008000-7fd8e8000000 ---p 00000000 00:00 0 
7fd8ec000000-7fd8ec838000 rw-p 00000000 00:00 0 
7fd8ec838000-7fd8f0000000 ---p 00000000 00:00 0 
7fd8f0000000-7fd8f0008000 rw-p 00000000 00:00 0 
7fd8f0008000-7fd8f4000000 ---p 00000000 00:00 0 
7fd8f4000000-7fd8f4034000 rw-p 00000000 00:00 0 
7fd8f4034000-7fd8f8000000 ---p 00000000 00:00 0 
7fd8fc000000-7fd8fc008000 rw-p 00000000 00:00 0 
7fd8fc008000-7fd900000000 ---p 00000000 00:00 0 
7fd900000000-7fd900008000 rw-p 00000000 00:00 0 
7fd900008000-7fd904000000 ---p 00000000 00:00 0 
7fd904000000-7fd904008000 rw-p 00000000 00:00 0 
7fd904008000-7fd908000000 ---p 00000000 00:00 0 
7fd908d18000-7fd908d2d000 r-xp 00000000 08:05 919455                     /usr/lib/libgcc_s.so.1
7fd908d2d000-7fd908f2d000 ---p 00015000 08:05 919455                     /usr/lib/libgcc_s.so.1
7fd908f2d000-7fd908f2e000 rw-p 00015000 08:05 919455                     /usr/lib/libgcc_s.so.1
7fd908f30000-7fd908f38000 r-xp 00000000 08:05 790407                     /home/quinlan/src/Project-Oleg/build/lib/liboleg.so
7fd908f38000-7fd909137000 ---p 00008000 08:05 790407                     /home/quinlan/src/Project-Oleg/build/lib/liboleg.so
7fd909137000-7fd909138000 rw-p 00007000 08:05 790407                     /home/quinlan/src/Project-Oleg/build/lib/liboleg.so
7fd909138000-7fd90913f000 r-xp 00000000 08:05 799435                     /home/quinlan/src/Project-Oleg/build/lib/libolegserver.so
7fd90913f000-7fd90933e000 ---p 00007000 08:05 799435                     /home/quinlan/src/Project-Oleg/build/lib/libolegserver.so
7fd90933e000-7fd90933f000 rw-p 00006000 08:05 799435                     /home/quinlan/src/Project-Oleg/build/lib/libolegserver.so
7fd909440000-7fd909540000 rw-p 00000000 00:00 0 
7fd909640000-7fd909780000 rw-p 00000000 00:00 0 
7fd909840000-7fd909980000 rw-p 00000000 00:00 0 
7fd9099c0000-7fd909e80000 rw-p 00000000 00:00 0 
7fd909eb7000-7fd909eb8000 ---p 00000000 00:00 0 
7fd909eb8000-7fd90a6b8000 rw-p 00000000 00:00 0                          [stack:13682]
7fd90a6b8000-7fd90a6b9000 ---p 00000000 00:00 0 
7fd90a6b9000-7fd90aeb9000 rw-p 00000000 00:00 0                          [stack:13681]
7fd90aeb9000-7fd90aeba000 ---p 00000000 00:00 0 
7fd90aeba000-7fd90b6ba000 rw-p 00000000 00:00 0                          [stack:13680]
7fd90b6ba000-7fd90b6bb000 ---p 00000000 00:00 0 
7fd90b6bb000-7fd90bebb000 rw-p 00000000 00:00 0                          [stack:13679]
7fd90bebb000-7fd90bebc000 ---p 00000000 00:00 0 
7fd90bebc000-7fd90c6bc000 rw-p 00000000 00:00 0                          [stack:13678]
7fd90c6bc000-7fd90c6bd000 ---p 00000000 00:00 0 
7fd90c6bd000-7fd90cebd000 rw-p 00000000 00:00 0                          [stack:13677]
7fd90cebd000-7fd90cebe000 ---p 00000000 00:00 0 
7fd90cebe000-7fd90d6be000 rw-p 00000000 00:00 0                          [stack:13676]
7fd90d6be000-7fd90d6bf000 ---p 00000000 00:00 0 
7fd90d6bf000-7fd90debf000 rw-p 00000000 00:00 0                          [stack:13675]
7fd90debf000-7fd90dec0000 ---p 00000000 00:00 0 
7fd90dec0000-7fd90e900000 rw-p 00000000 00:00 0                          [stack:13674]
7fd90e92f000-7fd90e930000 ---p 00000000 00:00 0 
7fd90e930000-7fd90e951000 rw-p 00000000 00:00 0                          [stack:13672]
7fd90e951000-7fd90e952000 ---p 00000000 00:00 0 
7fd90e952000-7fd90e973000 rw-p 00000000 00:00 0                          [stack:13671]
7fd90e973000-7fd90e974000 ---p 00000000 00:00 0 
7fd90e974000-7fd90e995000 rw-p 00000000 00:00 0                          [stack:13670]
7fd90e995000-7fd90e996000 ---p 00000000 00:00 0 
7fd90e996000-7fd90e9b7000 rw-p 00000000 00:00 0                          [stack:13669]
7fd90e9b7000-7fd90e9b8000 ---p 00000000 00:00 0 
7fd90e9b8000-7fd90e9d9000 rw-p 00000000 00:00 0                          [stack:13668]
7fd90e9d9000-7fd90e9da000 ---p 00000000 00:00 0 
7fd90e9da000-7fd90e9fb000 rw-p 00000000 00:00 0                          [stack:13667]
7fd90e9fb000-7fd90e9fc000 ---p 00000000 00:00 0 
7fd90e9fc000-7fd90ea1d000 rw-p 00000000 00:00 0                          [stack:13666]
7fd90ea1d000-7fd90ea1e000 ---p 00000000 00:00 0 
7fd90ea1e000-7fd90ea3f000 rw-p 00000000 00:00 0                          [stack:13665]
7fd90ea3f000-7fd90ea40000 ---p 00000000 00:00 0 
7fd90ea40000-7fd90f680000 rw-p 00000000 00:00 0                          [stack:13662]
7fd90f69d000-7fd90f69e000 ---p 00000000 00:00 0 
7fd90f69e000-7fd90f6bf000 rw-p 00000000 00:00 0                          [stack:13664]
7fd90f6bf000-7fd90f6c0000 ---p 00000000 00:00 0 
7fd90f6c0000-7fd90ff00000 rw-p 00000000 00:00 0                          [stack:13661]
7fd90ff1c000-7fd9115c8000 rw-p 00000000 00:00 0 
7fd9115c8000-7fd911766000 r-xp 00000000 08:05 920809                     /usr/lib/libc-2.19.so
7fd911766000-7fd911966000 ---p 0019e000 08:05 920809                     /usr/lib/libc-2.19.so

Prep for 0.1.1 release

  • Update site to reflect version changes (0.1 and 0.1.1 should be accessible)
  • Document new API (things need to be freed)
  • HACKING file (Coding guidelines, pull request flow)
  • Build up a changelog
  • Push to site
  • Run through valgrind to get everything mostly cleaned up

Implement Splay trees to hold keys in addition to Hash Table

It would be a good idea to store keys in a way that is separate from the hash table. A splay tree gives us amortized O(log n) access, with the added advantage that recently accessed elements are faster to reach again.

In addition to having a tree to store extra keys, this will allow us to iterate through them (in case people want to aggregate multiple records) and cleanly get keys when cleaning up the database. ol_close currently just loops through every slot in the bucket list to see if there is an ol_bucket there.

Alternate data structures are encouraged, but I figured this one could be fun because B+-trees are too boring.

ol_aol_restore command duplication bug

ol_restore reads in commands from the AOL file and then replicates them by calling the corresponding ol_* functions (JAR -> ol_jar, SCOOP -> ol_scoop). These in turn call ol_aol_write_cmd, which results in more data being written to the AOL file. This is unnecessary.

We should track that the database is currently doing a restore, and skip writing commands to the AOL while it is.
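
A small sketch of that idea, assuming a flag on the database handle. The `sketch_db` struct and the `sketch_*` functions are stand-ins invented for this example; they only mimic the call chain described above (restore -> jar -> AOL write) and count writes instead of touching a file.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down database handle. */
typedef struct {
    bool restoring;   /* true while the AOL is being replayed */
    int  aol_writes;  /* stand-in for real file writes, to keep this testable */
} sketch_db;

/* Stand-in for the AOL write step: does nothing during a restore. */
int sketch_aol_write_cmd(sketch_db *db, const char *cmd, const char *key) {
    (void)cmd;
    (void)key;
    if (db->restoring)
        return 0;
    db->aol_writes++;
    return 0;
}

/* Stand-in for a mutating call that normally logs itself to the AOL. */
int sketch_jar(sketch_db *db, const char *key) {
    /* ... store the value here ... */
    return sketch_aol_write_cmd(db, "JAR", key);
}

/* Replays keys through the normal code path with logging suppressed. */
int sketch_restore(sketch_db *db, const char **keys, int nkeys) {
    int rc = 0;
    db->restoring = true;
    for (int i = 0; i < nkeys && rc == 0; i++)
        rc = sketch_jar(db, keys[i]);
    db->restoring = false;  /* always cleared, even if a replay failed */
    return rc;
}
```

Keeping the check inside the write function means none of the individual commands need to know about restores.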

Configuration

Commandline or otherwise. Right now I'm not sure how to interpret commandline things from erlang.

Things we want to be able to configure:

  • Dumps y/n
  • Appendonly y/n
  • IP
  • Port
  • Number of dumps to keep around (last three snapshots)

test_lots_of_deletes failing on *BSD

I'm getting segfaults on "test_lots_of_deletes" under FreeBSD:

Apr 23 13:24:18 [-] ----- test_lots_of_deletes -----

Apr 23 13:24:18 [-] Opened DB: 0x801407380.
Apr 23 13:24:20 [-] Records inserted: 1000000.
Apr 23 13:24:20 [-] Saw 516494 collisions.
./run_tests.sh: line 4: 98102 Segmentation fault      (core dumped) ./build/bin/oleg_test     test
gmake: *** [test] Error 139

OpenBSD seems to be more verbose:

Apr 23 15:29:47 [-] ----- test_lots_of_deletes -----

Apr 23 15:29:47 [-] Opened DB: 0xf48590ee800.
Apr 23 15:30:02 [-] Records inserted: 1000000.
Apr 23 15:30:02 [-] Saw 516494 collisions.
oleg_test(14280) in free(): error: chunk is already free 0xf4843086140
./run_tests.sh: line 4: 14280 Abort trap              (core dumped) ./build/bin/oleg_test         test
Makefile:89: recipe for target 'test' failed
gmake: *** [test] Error 134

Server won't take large things. Or something...

$ curl -v -X POST -d @test_file.txt http://localhost:8080/turtles/test_book
* Hostname was NOT found in DNS cache
*   Trying ::1...
* connect to ::1 port 8080 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> POST /turtles/test_book HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:8080
> Accept: */*
> Content-Length: 1215848
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
> 
< HTTP/1.1 404 Not Found
< Status: 404 Not Found
* Server OlegDB/fresh_cuts_n_jams is not blacklisted
< Server: OlegDB/fresh_cuts_n_jams
< Content-Length: 26
< Connection: close
< Content-Type: text/plain
< 
These aren't your ghosts.
* Closing connection 0

Data is not returned correctly

Data is currently returned from the port driver as some random list of integers. Needs to be sent back to the client.

Refactor ol_unjar

Right now we return a pointer to the data stored in the DB. That's all we do. This has some limitations:

  • The data can be written
  • We can't cleanly check data size or the content_type in a single call
  • We have no idea what the length of the data is

Refactoring this to return an integer (1 if not found, 0 if found, -1 on error or something similar) and then copy the datasize, value, content type and content type size into a passed argument is a much cleaner way to handle this.
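
Roughly what that interface could look like, as a sketch. The `sketch_record` and `sketch_result` types and the `sketch_unjar` function are hypothetical, and the single-record "database" exists only to keep the example self-contained. The return codes follow the ones suggested above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stored record, standing in for a bucket. */
typedef struct {
    const char          *key;
    const unsigned char *value;
    size_t               value_len;
    const char          *content_type;
} sketch_record;

/* Hypothetical result filled in by the lookup; all views are read-only. */
typedef struct {
    const unsigned char *value;
    size_t               value_len;
    const char          *content_type;
    size_t               content_type_len;
} sketch_result;

/* Returns 0 if found, 1 if not found, -1 on bad arguments. */
int sketch_unjar(const sketch_record *rec, const char *key, sketch_result *out) {
    if (rec == NULL || key == NULL || out == NULL)
        return -1;
    if (strcmp(rec->key, key) != 0)
        return 1;
    out->value = rec->value;
    out->value_len = rec->value_len;
    out->content_type = rec->content_type;
    out->content_type_len = strlen(rec->content_type);
    return 0;
}
```

One call now answers "was it found", "how big is it" and "what type is it", and the const pointers stop callers from writing into stored data by accident.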

Bogus bucket->next is breaking shit

Somehow bucket->next is being set, only sometimes, to a bogus value that doesn't equal NULL. This breaks checks when traversing the linked lists and causes Oleg to panic.

Random occasional Segfault

Possibly related to ol_spoil?

Edit: It's probably due to the fact that the stack-allocated struct tm pointers expire when they go out of scope, so Oleg is trying to read bad regions of memory. Will probably be fixed with the new key expiration stuff in feature/HEAD_expiration_time.
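
To illustrate the suspected problem and one way around it, here is a sketch in which the bucket keeps its own heap copy of the expiration time instead of holding the caller's pointer. The `sketch_bucket` type and `sketch_spoil` function are hypothetical and are not the code on the feature branch.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical bucket that owns its expiration time. */
typedef struct {
    struct tm *expiration;  /* heap copy, NULL if the key never expires */
} sketch_bucket;

/* Copies *when onto the heap so the bucket never aliases the caller's stack.
 * Returns 0 on success, -1 if allocation fails. */
int sketch_spoil(sketch_bucket *b, const struct tm *when) {
    struct tm *copy = malloc(sizeof *copy);
    if (copy == NULL)
        return -1;
    *copy = *when;          /* struct assignment copies every field */
    free(b->expiration);    /* drop any previous expiration; free(NULL) is fine */
    b->expiration = copy;
    return 0;
}
```

Storing a plain time_t in the bucket would avoid the allocation altogether; the copy is shown only because the issue mentions struct tm pointers.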

Case-insensitive header matching is slow

parse_header(Data, Record) ->

...
            LowercaseHeader = binary:list_to_bin(
                                string:to_lower(
                                  binary:bin_to_list(Header)
                                 )
                               ),
...

This is slow as hell. Converting from a binary to a list and back again is not efficient. Find some way to convert binaries to lowercase natively without doing this conversion.

Setup website

We need like, a website. With tarballs and SHA256 sums n' stuff.

Content-Type not always being set correctly

I am seeing a lot of buckets with the incorrect content-type. They're just empty strings sometimes, but the ctype_size member on the bucket struct reflects the correct value. Odd.

Debug 32-bit Support

There's no reason OlegDB won't work on 32-bit platforms. Just debug the weird issues.

Documentation

We need docs. Auto generate or something? Man I don't know.

Rearchitect all ol_unjar functions to fill out data param

See SophiaDB's sp_get or our very own ol_unjar_ds for an example of this paradigm. This is an architecture problem that was not foreseen (like we have any foresight...)

Fixing the functions in this way will help three-fold:

  1. It's blocking compression
  2. Functions outside the scope of the API can actually modify data that they get back and mess with the internal state of the database
  3. It leaves memory allocation up to the API caller
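
A sketch of the caller-allocates pattern from point 3, under the assumption that the stored bytes are copied out instead of exposed. `sketch_unjar_into` is a made-up name and signature, not SophiaDB's sp_get and not ol_unjar_ds.

```c
#include <stddef.h>
#include <string.h>

/* Copies stored bytes into a buffer owned by the caller.
 * Returns 0 on success, 1 if buf is missing or too small (*needed still
 * reports the required size), -1 on bad arguments. */
int sketch_unjar_into(const unsigned char *stored, size_t stored_len,
                      unsigned char *buf, size_t buf_len, size_t *needed) {
    if (stored == NULL || needed == NULL)
        return -1;
    *needed = stored_len;
    if (buf == NULL || buf_len < stored_len)
        return 1;
    memcpy(buf, stored, stored_len);
    return 0;
}
```

Because the caller only ever sees a copy, writing to the returned bytes cannot touch the database's internal state, which also covers point 2.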
