
libbitcoin-database's Issues

[master] Build warnings.

  CXX      src/databases/src_libbitcoin_database_la-address_database.lo
src/verify.cpp:112:49: warning: unused parameter 'transactions'
      [-Wunused-parameter]
code verify_missing(const transaction_database& transactions,
                                                ^
src/verify.cpp:113:24: warning: unused parameter 'tx' [-Wunused-parameter]
    const transaction& tx)
                       ^
2 warnings generated.
  CXX      src/databases/src_libbitcoin_database_la-block_database.lo
src/databases/address_database.cpp:127:58: warning: unused parameter 'tx'
      [-Wunused-parameter]
void address_database::catalog(const chain::transaction& tx)
                                                         ^
1 warning generated.
  CXX      src/databases/src_libbitcoin_database_la-transaction_database.lo
src/databases/block_database.cpp:228:16: warning: unused variable 'link'
      [-Wunused-variable]
    const auto link = next.create(header.hash(), writer);
               ^
src/databases/block_database.cpp:478:52: warning: unused parameter 'link'
      [-Wunused-parameter]
void block_database::pop_link(link_type DEBUG_ONLY(link), size_t height,
                                                   ^
2 warnings generated.
  CXX      src/memory/src_libbitcoin_database_la-accessor.lo
src/databases/transaction_database.cpp:478:14: warning: unused variable
      'position' [-Wunused-variable]
    uint32_t position = 0;
             ^
1 warning generated.
  CXX      src/memory/src_libbitcoin_database_la-file_storage.lo
  CC       src/mman-win32/src_libbitcoin_database_la-mman.lo
  CXX      src/result/src_libbitcoin_database_la-address_iterator.lo
src/memory/file_storage.cpp:425:30: warning: comparison of integers of different
      signs: 'const long' and 'const uint64_t' (aka 'const unsigned long')
      [-Wsign-compare]
    BITCOIN_ASSERT(page_size <= max_size_t);
                   ~~~~~~~~~ ^  ~~~~~~~~~~
/home/travis/build/libbitcoin/libbitcoin-database/my-prefix/include/bitcoin/system/utility/assert.hpp:28:47: note: 
      expanded from macro 'BITCOIN_ASSERT'
    #define BITCOIN_ASSERT(expression) assert(expression)
                                              ^~~~~~~~~~
/usr/include/assert.h:89:5: note: expanded from macro 'assert'
  ((expr)                                                               \
    ^~~~
1 warning generated.
  CXX      src/result/src_libbitcoin_database_la-address_result.lo
  CXX      src/result/src_libbitcoin_database_la-block_result.lo
  CXX      src/result/src_libbitcoin_database_la-inpoint_iterator.lo
src/result/block_result.cpp:41:23: warning: unused variable 'tx_start_size'
      [-Wunused-const-variable]
static constexpr auto tx_start_size = sizeof(uint32_t);
                      ^
src/result/block_result.cpp:42:23: warning: unused variable 'tx_count_size'
      [-Wunused-const-variable]
static constexpr auto tx_count_size = sizeof(uint16_t);
                      ^
src/result/block_result.cpp:47:19: warning: unused variable
      'transactions_offset' [-Wunused-const-variable]
static const auto transactions_offset = checksum_offset + checksum_size;
                  ^
3 warnings generated.

MacOS (all): no template named 'unary_function' in namespace 'std'.

In file included from src/settings.cpp:19:
In file included from ./include/bitcoin/system/settings.hpp:22:
In file included from ./include/bitcoin/system/chain/chain.hpp:22:
In file included from ./include/bitcoin/system/chain/block.hpp:24:
In file included from ./include/bitcoin/system/chain/context.hpp:22:
In file included from ./include/bitcoin/system/define.hpp:36:
In file included from ./include/bitcoin/system/constraints.hpp:25:
In file included from ./include/bitcoin/system/typelets.hpp:25:
In file included from ./include/bitcoin/system/funclets.hpp:23:
In file included from ./include/bitcoin/system/literals.hpp:24:
In file included from ./include/bitcoin/system/constants.hpp:24:
In file included from ./include/bitcoin/system/types.hpp:26:
In file included from ./include/bitcoin/system/exceptions.hpp:25:
In file included from ./include/bitcoin/system/boost.hpp:49:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string.hpp:23:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string/split.hpp:16:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string/iter_find.hpp:27:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string/find_iterator.hpp:24:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string/detail/find_iterator.hpp:18:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/function.hpp:30:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/function/detail/prologue.hpp:17:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/function/function_base.hpp:21:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/type_index.hpp:29:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/type_index/stl_type_index.hpp:47:
/Users/runner/work/libbitcoin-database/prefix/include/boost/container_hash/hash.hpp:132:33: error: no template named 'unary_function' in namespace 'std'; did you mean '__unary_function'?
        struct hash_base : std::unary_function<T, std::size_t> {};
                           ~~~~~^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/__functional/unary_function.h:46:1: note: '__unary_function' declared here
using __unary_function = __unary_function_keep_layout_base<_Arg, _Result>;
^
In file included from src/chain/block.cpp:19:
In file included from ./include/bitcoin/system/chain/block.hpp:24:
In file included from ./include/bitcoin/system/chain/context.hpp:22:
In file included from ./include/bitcoin/system/define.hpp:36:
In file included from ./include/bitcoin/system/constraints.hpp:25:
In file included from ./include/bitcoin/system/typelets.hpp:25:
In file included from ./include/bitcoin/system/funclets.hpp:23:
In file included from ./include/bitcoin/system/literals.hpp:24:
In file included from ./include/bitcoin/system/constants.hpp:24:
In file included from ./include/bitcoin/system/types.hpp:26:
In file included from ./include/bitcoin/system/exceptions.hpp:25:
In file included from ./include/bitcoin/system/boost.hpp:49:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string.hpp:23:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string/split.hpp:16:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string/iter_find.hpp:27:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string/find_iterator.hpp:24:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string/detail/find_iterator.hpp:18:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/function.hpp:30:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/function/detail/prologue.hpp:17:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/function/function_base.hpp:21:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/type_index.hpp:29:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/type_index/stl_type_index.hpp:47:
/Users/runner/work/libbitcoin-database/prefix/include/boost/container_hash/hash.hpp:132:33: error: no template named 'unary_function' in namespace 'std'; did you mean '__unary_function'?
        struct hash_base : std::unary_function<T, std::size_t> {};
                           ~~~~~^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/__functional/unary_function.h:46:1: note: '__unary_function' declared here
using __unary_function = __unary_function_keep_layout_base<_Arg, _Result>;
^
In file included from src/define.cpp:19:
In file included from ./include/bitcoin/system/define.hpp:36:
In file included from ./include/bitcoin/system/constraints.hpp:25:
In file included from ./include/bitcoin/system/typelets.hpp:25:
In file included from ./include/bitcoin/system/funclets.hpp:23:
In file included from ./include/bitcoin/system/literals.hpp:24:
In file included from ./include/bitcoin/system/constants.hpp:24:
In file included from ./include/bitcoin/system/types.hpp:26:
In file included from ./include/bitcoin/system/exceptions.hpp:25:
In file included from ./include/bitcoin/system/boost.hpp:49:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string.hpp:23:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string/split.hpp:16:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string/iter_find.hpp:27:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string/find_iterator.hpp:24:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/algorithm/string/detail/find_iterator.hpp:18:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/function.hpp:30:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/function/detail/prologue.hpp:17:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/function/function_base.hpp:21:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/type_index.hpp:29:
In file included from /Users/runner/work/libbitcoin-database/prefix/include/boost/type_index/stl_type_index.hpp:47:
/Users/runner/work/libbitcoin-database/prefix/include/boost/container_hash/hash.hpp:132:33: error: no template named 'unary_function' in namespace 'std'; did you mean '__unary_function'?
        struct hash_base : std::unary_function<T, std::size_t> {};
                           ~~~~~^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/v1/__functional/unary_function.h:46:1: note: '__unary_function' declared here
using __unary_function = __unary_function_keep_layout_base<_Arg, _Result>;
^
1 error generated.
make: *** [src/libbitcoin_system_la-define.lo] Error 1
make: *** Waiting for unfinished jobs....
1 error generated.
make: *** [src/libbitcoin_system_la-settings.lo] Error 1
1 error generated.
make: *** [src/chain/libbitcoin_system_la-block.lo] Error 1

https://github.com/libbitcoin/libbitcoin-database/actions/runs/8865524750/job/24341874060

Add block caching by hash and height.

Similar in concept to the utxo cache.
Useful as a performance optimization in both validation and the server API.
Cannot be easily applied to the address history indexes, since the key-value map changes over time.
Could be applied to the spend index, but that is slightly more complex than tx/block.
Indexes are already efficient (apart from stealth), so the benefit is much smaller.
Transaction queries are already efficient, as the data is typically located in a single page.
Blocks are very expensive even compared to txs, so the benefit is greatest there.
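
A minimal sketch of such a cache, assuming hypothetical Block, Hash and HashFunction placeholder types (eviction policy and locking omitted):

// Sketch only: a dual-index block cache keyed by hash and by height.
// Block, Hash and HashFunction are placeholders for the library's types;
// eviction and thread safety are intentionally omitted.
#include <cstddef>
#include <memory>
#include <unordered_map>

template <typename Block, typename Hash, typename HashFunction>
class block_cache
{
public:
    using block_ptr = std::shared_ptr<const Block>;

    // Insert a block under both of its natural keys.
    void add(const Hash& hash, size_t height, block_ptr block)
    {
        by_hash_[hash] = block;
        by_height_[height] = block;
    }

    // Both lookups return nullptr on a miss, so callers can fall back
    // to the store transparently.
    block_ptr find(const Hash& hash) const
    {
        const auto it = by_hash_.find(hash);
        return it == by_hash_.end() ? nullptr : it->second;
    }

    block_ptr find(size_t height) const
    {
        const auto it = by_height_.find(height);
        return it == by_height_.end() ? nullptr : it->second;
    }

private:
    std::unordered_map<Hash, block_ptr, HashFunction> by_hash_;
    std::unordered_map<size_t, block_ptr> by_height_;
};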

What is the expected behavior of hash_table_multimap::find(Link link) with a non-existing link argument?

Hi @evoskuil

What is the expected behaviour of hash_table_multimap::find(Link link) method if the supplied link doesn't exist?

I expected it to behave the same as find(Key key), i.e. return a list_element with its link set to not_found.

However, when calling find(Link link) with an argument that hasn't been linked yet, as in this test: https://github.com/kulpreet/libbitcoin-database/blob/multimap_find_by_link_test/test/primitives/hash_table_multimap.cpp#L60, I don't get a list_element with its link set to not_found.

This is because the condition on line 82 (https://github.com/libbitcoin/libbitcoin-database/blob/master/include/bitcoin/database/impl/hash_table_multimap.ipp#L82) will never be true, because record_map::find returns a list_element with the supplied link simply copied into it.

From then on, if the memory pointed to by the link has never been written to, we get a list_element with its link set to 0, and thus the test linked above fails.

Am I misunderstanding something here? Should we not care if find is called with a link that has no data stored there? Or should I try to fix this behaviour so that it returns a list_element with a not_found link?

Support unconfirmed tx as output spender.

This was intended to be replaced by storing spender height with each output, scanning the block at the given height for the spender.

This is insufficient if we want to be able to provide this response for the set of unconfirmed txs that may have spent the output. By relying on linked confirmation state, as we now do for payment indexing, we can support this on tx write with no need for updates upon reorg. In other words, the cost is amortized across tx pool construction, avoiding post-validation impact. The downside is the very large spenders table.

FAIL libbitcoin_database_test_runner.sh

Hi, I received the following error while running install.sh

=================================================
   libbitcoin-database 3.0.0: ./test-suite.log
=================================================

# TOTAL: 1
# PASS:  0
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: libbitcoin_database_test_runner.sh
========================================

FAIL libbitcoin_database_test_runner.sh (exit status: 201)
Makefile:2423: recipe for target 'test-suite.log' failed
make[2]: *** [test-suite.log] Error 1
make[2]: Leaving directory '/home/veleiro/Projects/bitcoin/libbitcoin/libbitcoin-database/build-libbitcoin-database/libbitcoin-database'
Makefile:2529: recipe for target 'check-TESTS' failed
make[1]: *** [check-TESTS] Error 2
make[1]: Leaving directory '/home/veleiro/Projects/bitcoin/libbitcoin/libbitcoin-database/build-libbitcoin-database/libbitcoin-database'
Makefile:2735: recipe for target 'check-am' failed
make: *** [check-am] Error 2
Running 37 test cases...
Platform: linux
Compiler: GNU C++ version 6.2.0 20161109
STL     : GNU libstdc++ version 20161109
Boost   : 1.62.0

Host system: Debian Stretch

Any ideas?

Move block pool into block store.

Use the height index to maintain the long chain. Use a height sentinel in pooled blocks, just as with the position sentinel in pooled txs. Allow queries to require "confirmed" blocks, just as with require_confirmed txs.

This eliminates the need for the fork class and turns block validation into a simple single-block operation, just as with tx validation. A new block that descends from an unconfirmed block is tested for its fork point by walking back to the first confirmed ancestor. On a reorg the block height values are updated in the hash table for blocks entering and leaving confirmation, and the height index is updated similarly.

Use tx link instead of tx hash for input_point.

When the tx is stored it has been fully validated, which means previous outputs have been looked up. This requires obtaining the tx.link of the prevout hash. Storing the link instead of the hash saves 32 - 8 = 24 bytes per input, and more (27) once tx slabs are converted to 5-byte storage. This is a material storage saving and is even less costly (faster) to write, which improves net sync and performance (including full block validation).

TODO: measure total number of inputs to determine actual space savings.

Whenever a tx is read for internal computations the link provides a materially faster lookup of the actual tx than the hash (which must pass through the hash table, including any conflicts). The only exception is the case where the objective is to obtain the hash of the point. In this case the link must be traversed to obtain the hash. Apart from paging costs the only additional material cost for this is the read of 8 more bytes (for the link). This would only impact queries (p2p/client-server) for blocks/txs (and possibly others that require the prevout hash).

The reduced net storage size may offset much of the additional paging cost, as the tx table would be significantly reduced in size. Also the paging is within the same table (self-referential), so the paging cost would be insignificant for recent prevouts (the more common scenario).

Given that tx input point links are self-referential the additional reads can be deferred into the tx result and input point iterator. As a result the abstraction remains clean while allowing the hashes to be populated as necessary.
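
A rough sketch of the per-input difference, assuming an 8-byte file-offset link (the 5-byte slab link mentioned above would shrink it further):

// Sketch only: illustrates the storage difference per input point.
// Field names are illustrative, not the library's actual types.
#include <array>
#include <cstdint>

// Current form: 32-byte prevout tx hash plus 4-byte output index.
struct hash_point
{
    std::array<uint8_t, 32> tx_hash;
    uint32_t index;
};

// Proposed form: 8-byte tx table link plus 4-byte output index.
// Saves 32 - 8 = 24 bytes of key material per input (27 with a
// 5-byte slab link).
struct link_point
{
    uint64_t tx_link;
    uint32_t index;
};

static_assert(sizeof(uint64_t) == 8, "link assumed to be 8 bytes");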

Primitives require file_offset and array_index type parameterization.

The following classes require file_offset and array_index template parameterization. This will convert the remaining non-template primitives to templates, ensure type alignment, simplify code and improve readability.

  • slab_manager
  • slab_row<KeyType>
  • record_manager
  • record_row<KeyType>
  • record_list
  • record_multimap<KeyType>
  • record_multimap_iterable
  • record_multimap_iterator

Pertains to: #142
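
A sketch of the general shape of the proposed parameterization, assuming file_offset and array_index remain 64-bit and 32-bit integers respectively; this is illustrative, not the actual headers:

// Sketch only: LinkType would be file_offset for slab primitives and
// array_index for record primitives, making the link width explicit in
// the type rather than implicit in the class.
#include <cstddef>
#include <cstdint>

using file_offset = uint64_t;  // assumed existing typedef
using array_index = uint32_t;  // assumed existing typedef

template <typename LinkType>
class record_manager
{
public:
    using link_type = LinkType;

    // Allocate count records and return the link of the first.
    link_type allocate(size_t count);
};

template <typename KeyType, typename LinkType>
class record_row
{
public:
    // Link of the next element in the conflict list.
    LinkType next() const;
};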

call end_write before return

code data_base::push(const block& block, size_t height)
{
    // Critical Section
    ///////////////////////////////////////////////////////////////////////////
    unique_lock lock(write_mutex_);

    const auto ec = verify_push(block, height);

    if (ec)
        return ec;

    // Begin Flush Lock
    //vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
    if (!begin_write())
        return error::operation_failed;

    const auto median_time_past = block.header().validation.median_time_past;

    if (!push_transactions(block, height, median_time_past) ||
        !push_heights(block, height))
        return error::operation_failed;

    blocks_->store(block, height, true);
    synchronize();

    return end_write() ? error::success : error::operation_failed;
    // End Flush Lock
    //^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ///////////////////////////////////////////////////////////////////////////
}

Why, in the block

    if (!push_transactions(block, height, median_time_past) ||
        !push_heights(block, height))
        return error::operation_failed;

is end_write() not called? Without it, the database cannot be reopened.
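
One way to guarantee the call on every return path would be a small RAII guard; this is a sketch of one possible fix, not the library's actual approach, and it assumes begin_write()/end_write() are accessible to the guard:

// Sketch only: ends the write when the scope exits, including on the
// early-return failure paths above.
template <typename Store>
class write_scope
{
public:
    explicit write_scope(Store& store)
      : store_(store), begun_(store.begin_write())
    {
    }

    ~write_scope()
    {
        if (begun_)
            store_.end_write();
    }

    // False if begin_write() failed; the caller should bail out.
    bool begun() const
    {
        return begun_;
    }

private:
    Store& store_;
    bool begun_;
};

With such a guard in place, the early return after push_transactions/push_heights would no longer leave the flush lock engaged.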

Store spender height with output.

This allows us to avoid the spends table for validation, and therefore treat it similarly to the history and stealth tables, whereby its indexing can be started at any height. This will offset the size increase resulting from +32 bits per input across the entire chain.

This reduces validation to a single query that returns whether an output is spent and at what height it is spent (necessary in the case of a reorg, which may negate the spend), as well as the output itself. Currently three queries are required (get-spender-hash, get-tx-height, get-output). This will significantly reduce paging and cycles when populating utxos.

Optimize memory_map::resize using upgrade lock.

The remap allocator starts with a unique lock and downgrades to a shared lock. In the vast majority of cases there is no need for a unique lock: it is only required when size exceeds file_size_. Otherwise, writes with sufficient file space needlessly block reads.

memory_ptr memory_map::reserve(size_t size, size_t expansion)
{
    // Internally preventing resize during close is not possible because of
    // cross-file integrity. So we must coalesce all threads before closing.

    // Critical Section (internal)
    ///////////////////////////////////////////////////////////////////////////
    const auto memory = REMAP_ALLOCATOR(mutex_);

    // The store should only have been closed after all threads terminated.
    if (closed_)
        throw std::runtime_error("Resize failure, store already closed.");

    if (size > file_size_)
    {
        // TODO: manage overflow (requires ceiling_multiply).
        // Expansion is an integral number that represents a real number factor.
        const size_t target = size * ((expansion + 100.0) / 100.0);

        if (!truncate_mapped(target))
        {
            handle_error("resize", filename_);
            throw std::runtime_error("Resize failure, disk space may be low.");
        }
    }

    logical_size_ = size;
    REMAP_DOWNGRADE(memory, data_);

    return memory;
    ///////////////////////////////////////////////////////////////////////////
}
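
A minimal sketch of the upgrade-lock pattern with Boost.Thread, assuming the mutex were a boost::shared_mutex (which is UpgradeLockable); the REMAP_ALLOCATOR/REMAP_DOWNGRADE macros would need to be adapted accordingly:

// Sketch only: hold shared ownership by default and upgrade to exclusive
// only when the mapping must actually grow. Not the library's macros.
#include <cstddef>
#include <boost/thread/locks.hpp>
#include <boost/thread/shared_mutex.hpp>

void resize_sketch(boost::shared_mutex& mutex, size_t size, size_t& file_size)
{
    // Shared with concurrent readers; only one thread may hold upgrade
    // ownership at a time, so writers serialize here without blocking reads.
    boost::upgrade_lock<boost::shared_mutex> shared(mutex);

    if (size > file_size)
    {
        // Exclusive only for the remap itself; existing readers drain first.
        boost::upgrade_to_unique_lock<boost::shared_mutex> exclusive(shared);
        // ... truncate_mapped(target) and remap would happen here ...
        file_size = size;
    }

    // Ownership reverts to shared/upgrade for the remainder of the write.
}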

History store row allocator requires concurrency guard.

Look at the address 1BrT827NCgxjctnEBdLiuDzukupwWHP1i2.
It is involved in transaction e521a32cc35519164f6fd74e3b682071e3880c1b5099e5ae863ef340eb916097 (both input and output) (https://blockexplorer.com/tx/e521a32cc35519164f6fd74e3b682071e3880c1b5099e5ae863ef340eb916097).
But in the history database there is no output information.
This can be seen, for example, with transaction f26104fed45e7aaedfddcdeb713d12c1d2369330387b316f1797562c58f0be78:
Bitcoin Explorer cannot match the input of this transaction to the output.

./bx fetch-history -c ./bx.cfg 1BrT827NCgxjctnEBdLiuDzukupwWHP1i2

transfer
{
    spent
    {
        hash b0abd44258e15418f4807021f53e462ace308fac23ce1b493f7dcbed79e4b75c
        height 476410
        index 5
    }
    value 18446744073709551615
}
transfer
{
    spent
    {
        hash f26104fed45e7aaedfddcdeb713d12c1d2369330387b316f1797562c58f0be78
        height 473014
        index 7
    }
    value 18446744073709551615
}

Is it possible to solve this problem?

Remove hash from block_database::unindex or check that hash matches

bool block_database::unindex(const hash_digest& hash, size_t height,

The presence of hash as the first parameter is confusing, especially since it is eventually ignored in the method.

Should we make one of the following two changes?

  1. Remove hash from the method, making it explicit that unindex only removes from the top of the index.

  2. Keep hash as the first parameter, but check that the hash matches the link found by read_index.

In the case of 1, why should we even pass the height then? Just unindex(true) or unindex(false) should do the trick. I might be missing how unindex is used in data_base, but as I understand it, hash does seem redundant: the calls to unindex from data_base pass in a hash that is obtained from block_database using the height anyway.

For the moment, I am writing tests with a not_found hash since it is ignored anyway. It should avoid confusion in the future.

Resize files to logical size before closing.

There is a programmed 150% growth for each allocation that exceeds the file size. This results in an average ~125% excess of the necessary file size for each file upon shutdown. It's easy and reasonably fast to trim the files back to logical size. There is a small cost at shutdown and upon first write after startup. For the same reason, do not create files initially with expanded sizes (initchain).
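
On POSIX systems the trim is a single ftruncate of the (already unmapped) file; a sketch, assuming the logical size is tracked at close:

// Sketch only: trim a store file back to its logical size after the
// mapping has been released. POSIX-specific; Windows would need the
// equivalent SetFilePointerEx/SetEndOfFile calls.
#include <cstddef>
#include <string>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

bool trim_to_logical_size(const std::string& path, size_t logical_size)
{
    const int descriptor = ::open(path.c_str(), O_RDWR);
    if (descriptor == -1)
        return false;

    // Discards the growth padding beyond the last logically written byte.
    const bool trimmed =
        ::ftruncate(descriptor, static_cast<off_t>(logical_size)) == 0;

    ::close(descriptor);
    return trimmed;
}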

[windows] tables improperly sized on close.

01:47:23.080196 INFO [node] Please wait while the node is stopping...
01:47:23.086175 DEBUG [network] Manual channel stopped: success
01:47:23.086175 DEBUG [network] Suspended manual connection.
01:47:23.087174 DEBUG [network] Stopped block_in protocol for [127.0.0.1:8333].
01:47:23.087174 DEBUG [network] Stopped block_out protocol for [127.0.0.1:8333].
01:47:23.087174 DEBUG [network] Stopped transaction_in protocol for [127.0.0.1:8333].
01:47:23.087174 DEBUG [network] Stopped transaction_out protocol for [127.0.0.1:8333].
01:47:23.090206 DEBUG [network] Valid block payload from [127.0.0.1:8333] (998068 bytes)
01:47:23.381390 INFO [network] Channel stopped: true
01:47:23.833820 DEBUG [database] Unmapped: "d:\blockchain-in\block_table" [4189020633]
01:47:23.836823 DEBUG [database] Unmapped: "d:\blockchain-in\block_index" [3294780]
01:48:00.980204 DEBUG [database] Unmapped: "d:\blockchain-in\transaction_table" [77878224980]
01:48:00.980705 INFO [node] Node stopped successfully.
01:55:36.472580 DEBUG [node] ================= startup 01/31/17 17:55:36 ==================
01:55:36.472580 INFO [node] ================= startup 01/31/17 17:55:36 ==================
01:55:36.479585 WARNING [node] ================= startup 01/31/17 17:55:36 ==================
01:55:36.488591 ERROR [node] ================= startup 01/31/17 17:55:36 ==================
01:55:36.497599 FATAL [node] ================= startup 01/31/17 17:55:36 ==================
01:55:36.502601 INFO [node] Using config file: "C:\ProgramData\libbitcoin\bn-blocks-in.cfg"
01:55:36.509607 INFO [node] Press CTRL-C to stop the node.
01:55:36.513609 INFO [node] Please wait while the node is starting...
01:55:36.521151 DEBUG [database] Buckets: block [650000], transaction [110000000], spend [250000000], history [107000000]
01:55:36.532669 DEBUG [database] Mapping: "d:\blockchain-in\block_table" [4189020632] (4096)
01:55:36.532669 DEBUG [database] Mapping: "d:\blockchain-in\block_index" [3294780] (4096)
01:55:36.532669 ERROR [node] Failure starting blockchain.
01:55:36.535666 ERROR [node] Node failed to start with error, operation failed.
01:55:36.539669 INFO [node] Please wait while the node is stopping...
01:55:36.765422 DEBUG [database] Unmapped: "d:\blockchain-in\block_table" [4189020632]
01:55:36.767894 DEBUG [database] Unmapped: "d:\blockchain-in\block_index" [3294780]
01:55:36.767894 INFO [node] Node stopped successfully.

4189020632 vs. 4189020633

It appears that the operating system truncates a single byte from the file.

This issue has been seen before but so far only on Windows. The underlying file map is platform-specific.

Hash tables not safe for read while conflict delete.

Record and slab hash tables store conflicts in a linked list. The list's first item is safely linked and delinked from the hash table header. This safety is produced by the hash table header class.

However the delink of any other list element (i.e. in the case of hash table conflicts, which currently make up about 1/3 of all references) is not atomic with a read of the same link. This produces a race to undefined behavior in the case of a block pop while reading history of the same block (i.e. via the client-server API), when there are conflicting history hash table records and the popped block is not first in the conflict list.

The reason for the error is that the critical sections in record_row<> and the corresponding slab_row<> fail to produce the intended atomicity, because these classes are short-lived wrappers for navigation to the next element. The mutex lifetime must span all such reads and writes (with the exception of item creation, which is written before becoming accessible).

array_index is 32 bit, will require expansion.

block/header_index/table, transaction_index, and address_table/rows all use array_index for records. block/header_index/table are safe for a very long time, but the others may require updating to higher domain offsets. This is based on the number of records, not the file/data size. If the record count reaches 2^32 the domain must increase (to 2^40). 2^24 (3 bytes) is too small for any of these record sets.
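
The thresholds behind that choice, spelled out (a sketch of the arithmetic only):

// Sketch only: record-count ceilings for the candidate link widths.
#include <cstdint>

constexpr uint64_t max_3_byte_records = uint64_t{1} << 24;  // ~16.8 million: too small
constexpr uint64_t max_4_byte_records = uint64_t{1} << 32;  // ~4.3 billion: current array_index limit
constexpr uint64_t max_5_byte_records = uint64_t{1} << 40;  // ~1.1 trillion: proposed domain

static_assert(max_5_byte_records > max_4_byte_records, "expansion must widen the domain");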

Running out of disk space ends process and corrupts store.

Because there are multiple independent files, a lack of disk space on one cannot gracefully suspend a write: another file may already have been written, resulting in an inconsistency. As a result, when growth fails the process terminates and the store is left marked as corrupted. Cross-file write atomicity may be a performance hit that we don't want to accept, but this issue should be managed more gracefully.

Add configurable operational file size minimums.

If a zero minimum is configured, set that file's minimum to 1MB (use = 1 for the previous behavior).

[database]
# file_growth_rate = 5
block_table_size = 80000
candidate_index_size = 3000
confirmed_index_size = 3000
transaction_index_size = 3000000
transaction_table_size = 220000000
address_index_size = 100000
address_table_size = 100000000

This optimizes operations and effectively guards against a late disk-full crash.
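
Applying the rule could be as simple as clamping each configured value when the settings are loaded; a sketch, assuming the sizes are expressed in bytes:

// Sketch only: a zero configured minimum becomes 1MB; configuring 1
// effectively preserves the previous behavior. Units assumed to be bytes.
#include <cstddef>

constexpr size_t default_minimum_file_size = 1024 * 1024;  // 1MB

inline size_t effective_minimum(size_t configured)
{
    return configured == 0 ? default_minimum_file_size : configured;
}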

Hot backup and fault tolerance limitations

The blockchain store is intended to support hot backup and to be tolerant of hard shutdown (during write) faults. This is not guaranteed by the implementation.

The two scenarios rely on the store always existing in a consistent state. However there are multiple indexes in distinct files. Writes across these files are not atomic and therefore these scenarios may result in a partial update. The implementation is unable to detect this corruption, as each file individually remains consistent. Furthermore writes are typically not flushed to disk until process close.

Add query to insert message::header[_message].

This will optimize and simplify header sync. The tx count should be used to allocate null tx references, and the block height should be the empty sentinel in order to signal a gap on sync restart.

Add spender tx/input to tx.output storage.

Next to the spender height, as: [height:4][tx:2][input:2]

This eliminates the spend table for a tiny increase in query cost and a small increase in tx table size.

The fetch spend query would be moved into the tx database.
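
A sketch of the packed 8-byte record implied by that layout (field widths follow the text above; the 2-byte tx/input fields are assumed to be block-relative and tx-relative indexes):

// Sketch only: the [height:4][tx:2][input:2] spender record, packed to
// 8 bytes per output. Not the library's actual serialization.
#include <cstdint>

#pragma pack(push, 1)
struct spender_record
{
    uint32_t height;  // confirmation height of the spending block
    uint16_t tx;      // position of the spending tx within that block
    uint16_t input;   // index of the spending input within that tx
};
#pragma pack(pop)

static_assert(sizeof(spender_record) == 8, "expected a packed 8-byte record");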

Mark unspendable outputs as spent (optimization).

Currently we use an unspent sentinel or a spender height. Add a second sentinel for unspendable outputs and short-circuit in validation. Unspendable outputs include data scripts (op_return) and those that exceed the maximum script size.
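
A sketch of the test that the second sentinel would short-circuit; the op_return opcode (0x6a) and the 10,000-byte script size limit are taken from the consensus rules, while the function itself is illustrative:

// Sketch only: detect provably unspendable outputs so they can be
// marked with the proposed sentinel at store time.
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr size_t max_script_size = 10000;  // consensus script size limit
constexpr uint8_t op_return = 0x6a;

inline bool is_unspendable(const std::vector<uint8_t>& script)
{
    return script.size() > max_script_size ||
        (!script.empty() && script.front() == op_return);
}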

Store block header with tx offsets vs hashes.

This would store Merkle tree transaction references as file-relative 64 bit offsets in place of tx hashes.

These values are used to populate any block retrieved from the store. This allows proper block construction in the case where there are duplicate tx hashes in the history. Reconstruction of the block will also be accelerated by avoiding hash table lookups.

This will slow Merkle tree and tx hash list queries for a block, but these are less common for historical blocks than at initial broadcast, which doesn't require a query.

[master] Header storage exceedingly slow.

It takes about 25 seconds to store 1000 headers, in server or node, Windows or Linux, but only on testnet. Nothing unusual logged. Verified in debugger that the time is consumed by:

void block_database::push(const chain::header& header, size_t height,
    uint32_t median_time_past, uint32_t checksum, link_type tx_start,
    size_t tx_count, uint8_t state)
{
    ...
}

Store tx offset vs. point in address row file.

This will reduce the file system storage at the current height from 75GiB to 18GiB, improve write (sync/validation) speeds, and reduce most query costs (less paging) while increasing some (the tx table must be hit to get the tx hash).

Test failures with boost 1.63 package.

make  check-TESTS
make[1]: Entering directory `/home/alice/rpmbuild/BUILD/libbitcoin-database-master'
make[2]: Entering directory `/home/alice/rpmbuild/BUILD/libbitcoin-database-master'
FAIL: libbitcoin_database_test_runner.sh
make[3]: Entering directory `/home/alice/rpmbuild/BUILD/libbitcoin-database-master'
make[3]: Nothing to be done for `all'.
make[3]: Leaving directory `/home/alice/rpmbuild/BUILD/libbitcoin-database-master'
============================================================================
Testsuite summary for libbitcoin-database 4.0.0
============================================================================
# TOTAL: 1
# PASS:  0
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0
============================================================================
See ./test-suite.log
Please report to eric@***
============================================================================

-=-

CentOS 7 x86_64
gcc 4.8.5
boost 1.63.0
libbitcoin 4.0.0 from yesterday (make check passes there)
