
cr8's Introduction

My Blog · Sponsor · My dotfiles · 🐦

Things I created or helped create

🔩

  • CrateDB - A distributed SQL Database
  • cr8 - CLI collection of utilities for working with CrateDB or PostgreSQL. Benchmark queries, insert data.
  • knx - Python KNX/EIB client library
  • mkjson - A command-line tool to generate static or random JSON records

🐛 Debug adapter protocol

  • hprofdap - Debug adapter for inspecting Java heap dumps (.hprof files) via OQL
  • dapconfig-schema - JSON Schema for .vscode/launch.json debug configuration files

Neovim

  • nlua - Neovim as Lua interpreter

Plugins

  • nvim-dap - A Neovim client for debug adapters (implements the Debug Adapter Protocol)
  • nvim-dap-python - Python extension for nvim-dap
  • nvim-jdtls - Extensions for the Neovim built-in language server client for eclipse.jdt.ls
  • nvim-fzy - A fuzzy finder like fzf.vim but for fzy and neovim with Lua API
  • nvim-qwahl - A collection of pickers using vim.ui.select. Complementary to nvim-fzy.
  • nvim-lint - An asynchronous linter plugin for Neovim. Complementary to the built-in Language Server Protocol support.
  • nvim-lsp-compl - A (auto-)completion plugin for Neovim focusing on LSP support.
  • nvim-treehopper - Region selection with hints on the AST nodes of a document powered by treesitter.
  • nvim-ansible - Functions to run Ansible playbooks, filetype patterns, and improved path handling.
  • nvim-snippasta - Copy text/code and paste it transformed into snippets, using treesitter queries for tabstop detection.
  • nvim-overfly - Provides keymaps to quickly fly around your source code.

cr8's People

Contributors

amotl, andreidan, asrivas-reco, azure-pipelines[bot], chaudum, hlcianfagna, lowks, markush, matriv, mfussenegger, mkleen, quodt, rps-v, seut


cr8's Issues

Result checksum / verification option

It would be nice to have an option that adds a result checksum to the output and, when there is more than one iteration, verifies the checksum across iterations.
This would be useful for queries that produce deterministic results, to make sure no regressions are introduced that produce garbage output.
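A minimal sketch of how such a checksum could work, assuming result rows can be serialized to JSON; the function and its integration point are hypothetical:

import hashlib
import json

def result_checksum(rows):
    # Serialize the rows deterministically and hash them; any change in
    # the result set changes the digest.
    payload = json.dumps(rows, sort_keys=True, default=str).encode('utf-8')
    return hashlib.sha256(payload).hexdigest()

# Verification across iterations (hypothetical):
# assert len({result_checksum(r) for r in iteration_results}) == 1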

timeit subcommand does not support line breaks in stmt

Reproducible with:

query.sql

select id,
name,
country
from countries
(.venv) ☁  cr8 [master] ☢ cat query.sql | cr8 timeit --hosts localhost:4200
100%|██████████████████████████████████████████████████████████████| 30/30 [00:05<00:00,  5.79 requests/s]
SqlException: SQLActionException[SQLParseException: line 2:1: no viable alternative at input '<EOF>']

cr8 timeit unable to handle comments after final semicolon

Describe the bug
When using cr8 timeit with complex queries that contain comments, I get failures.

To Reproduce
Pipe the following to cr8 timeit:

SELECT 1;
/* test */

-->

SQLParseException[line 1:11: mismatched input '<EOF>' expecting {'SELECT'

Expected behavior
It runs SELECT 1;

cr8 Version: (cr8 --version)
0.26.1
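
This and the previous line-break issue both come down to how statements piped via stdin are split and cleaned before execution. A minimal sketch using the third-party sqlparse library; whether cr8 parses input this way is an assumption:

import sys
import sqlparse

def read_statements(text):
    # Strip comments first, then split on statement boundaries: a trailing
    # '/* ... */' after the final semicolon disappears, and multi-line
    # statements stay intact.
    cleaned = sqlparse.format(text, strip_comments=True)
    return [s.strip() for s in sqlparse.split(cleaned) if s.strip()]

for stmt in read_statements(sys.stdin.read()):
    print(stmt)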

insert-fake-data exits with Python error

Running with Crate 2.1.6 and cr8 0.10.0.

For this schema:

CREATE TABLE IF NOT EXISTS "ano"."nym" (
   "count" LONG,
   "csp_id" INTEGER,
   "custom1" STRING,
   "custom2" STRING,
   "custom3" STRING,
   "custom4" STRING,
   "custom5" STRING,
   "detecteduser_id" LONG,
   "device_id" INTEGER,
   "dlpbytes" LONG,
   "dlpcount" LONG,
   "downloadedbytes" LONG,
   "eventsummary_id" LONG,
   "fromtime" TIMESTAMP,
   "hash_id" STRING,
   "logprocessortag_id" INTEGER,
   "monitor" INTEGER,
   "protocol" STRING,
   "serviceblocked" INTEGER,
   "tenant_id" INTEGER,
   "timeupdated" TIMESTAMP,
   "totalbytes" LONG,
   "totime" TIMESTAMP,
   "uploadedbytes" LONG,
   "userorip" INTEGER,
   PRIMARY KEY ("detecteduser_id", "fromtime", "eventsummary_id")
);

It exits with the following message:

TypeError: 'float' object cannot be interpreted as an integer

full output

Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/concurrent/futures/process.py", line 175, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/Users/johannes/env/lib/python3.6/site-packages/cr8/insert_fake_data.py", line 137, in _create_bulk_args
    return [row_fun() for i in range(req_size)]
TypeError: 'float' object cannot be interpreted as an integer
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/johannes/env/bin/cr8", line 11, in <module>
    sys.exit(main())
  File "/Users/johannes/env/lib/python3.6/site-packages/cr8/__main__.py", line 60, in main
    p.dispatch()
  File "/Users/johannes/env/lib/python3.6/site-packages/argh/helpers.py", line 55, in dispatch
    return dispatch(self, *args, **kwargs)
  File "/Users/johannes/env/lib/python3.6/site-packages/argh/dispatching.py", line 174, in dispatch
    for line in lines:
  File "/Users/johannes/env/lib/python3.6/site-packages/argh/dispatching.py", line 277, in _execute_command
    for line in result:
  File "/Users/johannes/env/lib/python3.6/site-packages/argh/dispatching.py", line 260, in _call
    result = function(*positional, **keywords)
  File "/Users/johannes/env/lib/python3.6/site-packages/cr8/insert_fake_data.py", line 258, in insert_fake_data
    loop.run_until_complete(tasks)
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 466, in run_until_complete
    return future.result()
  File "/Users/johannes/env/lib/python3.6/site-packages/cr8/aio.py", line 49, in consume
    raise last_error
  File "/Users/johannes/env/lib/python3.6/site-packages/cr8/aio.py", line 44, in consume
    await task
  File "/Users/johannes/env/lib/python3.6/site-packages/cr8/insert_fake_data.py", line 133, in _exec_many
    return await client.execute_many(stmt, await args_coro)
TypeError: 'float' object cannot be interpreted as an integer
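
The traceback shows range() receiving a float for the request size, which points at true division somewhere upstream. A minimal sketch of the failure mode and the fix; the function and variable names are illustrative, not cr8's actual code:

def bulk_batches(row_fun, num_records, bulk_size):
    # In Python 3, '/' always yields a float, so range(num_records / bulk_size)
    # raises "'float' object cannot be interpreted as an integer".
    # Floor division (or an explicit int()) keeps range() happy.
    n_batches = num_records // bulk_size
    return [[row_fun() for _ in range(bulk_size)] for _ in range(n_batches)]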

Unicode run-crate arguments are not passed correctly as settings

Example:

cr8 run-crate 2.1.7 -s node.name="لمهندس مؤيّد النشاشيبي"

results in:

cr> select name from sys.nodes;
+--------------+
| name         |
+--------------+
| ������������ |
+--------------+
SELECT 1 row in set (0.030 sec)

When running CrateDB directly, the node.name setting is correct. Same applies to cluster.name.
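
One candidate workaround, assuming the mangling happens when the forked JVM decodes its arguments with a non-UTF-8 default charset; this is an assumption, not a confirmed fix:

import os
import subprocess

env = dict(os.environ)
# Assumption: forcing a UTF-8 default charset in the JVM keeps non-ASCII
# values for settings like node.name and cluster.name intact.
env['CRATE_JAVA_OPTS'] = (env.get('CRATE_JAVA_OPTS', '') + ' -Dfile.encoding=UTF-8').strip()
subprocess.run(['bin/crate'], env=env)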

FileNotFoundException if `path.data` contains multiple paths

cr8 run-crate 0.57.3 -s path.data="/tmp/data1/,/tmp/data2/"
 File "/home/jordi/workspace/code/cr8/cr8/run_crate.py", line 349, in run_crate
    print('Stopping Crate...')
  File "/home/jordi/workspace/code/cr8/cr8/run_crate.py", line 208, in __exit__
    self.stop()
  File "/home/jordi/workspace/code/cr8/cr8/run_crate.py", line 202, in stop
    shutil.rmtree(self.data_path)
  File "/usr/lib/python3.5/shutil.py", line 465, in rmtree
    onerror(os.lstat, path, sys.exc_info())
  File "/usr/lib/python3.5/shutil.py", line 463, in rmtree
    orig_st = os.lstat(path)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/data1/,/tmp/data2/'
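
A minimal sketch of a fix in the cleanup step shown in the traceback: treat path.data as a comma-separated list and remove each directory individually. Attribute names follow the traceback; the rest is illustrative:

import shutil

def stop(self):
    # path.data may hold several comma-separated directories, so remove each
    # one instead of treating the whole string as a single path.
    for path in self.data_path.split(','):
        shutil.rmtree(path.strip(), ignore_errors=True)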

run-crate Windows support

cr8 tries to unpack the .zip with tarfile:

Downloading https://cdn.crate.io/downloads/releases/cratedb/x64_windows/crate-4.2.2.zip and extracting to C:\Users\runneradmin\.cache\cr8\crates
Traceback (most recent call last):
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\hostedtoolcache\windows\Python\3.8.5\x64\Scripts\cr8.exe\__main__.py", line 7, in <module>
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\site-packages\cr8\__main__.py", line 84, in main
    _run_crate_and_rest(p, args_groups)
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\site-packages\cr8\__main__.py", line 42, in _run_crate_and_rest
    with create_node(version=args.version,
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\site-packages\cr8\run_crate.py", line 700, in create_node
    crate_dir = get_crate(version, crate_root)
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\site-packages\cr8\run_crate.py", line 670, in get_crate
    crate_dir = _download_and_extract(uri, crate_root)
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\site-packages\cr8\run_crate.py", line 485, in _download_and_extract
    with tarfile.open(fileobj=tmpfile) as t:
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\tarfile.py", line 1606, in open
    raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully
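
A minimal sketch of extension-aware extraction, assuming the download is buffered in a temporary file as in the traceback; the function name is illustrative:

import tarfile
import zipfile

def extract(tmpfile, uri, target_dir):
    # Windows releases ship as .zip, the others as .tar.gz, so pick the
    # matching module instead of always using tarfile.
    if uri.endswith('.zip'):
        with zipfile.ZipFile(tmpfile) as z:
            z.extractall(target_dir)
    else:
        with tarfile.open(fileobj=tmpfile) as t:
            t.extractall(target_dir)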

Track should not fail if a single spec of the run fails

It would be great to have an option for run-track that allows single specs to fail without stopping the whole track run.

Suggestion: a --non-strict (short: -S) command-line argument

$ cr8 run-track path/to/track.toml --non-strict
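
A minimal sketch of what --non-strict could do in the track runner; run_spec stands in for the real per-spec entry point:

def run_track(specs, run_spec, strict=True):
    # With strict=False (--non-strict), a failing spec is recorded and the
    # remaining specs still run; with strict=True the old behavior remains.
    failures = []
    for spec in specs:
        try:
            run_spec(spec)
        except Exception as err:
            if strict:
                raise
            failures.append((spec, err))
    return failures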

Shareable setup across spec files in run-track

In order to keep things organized it is useful to split tests into multiple spec files.
This has the disadvantage that the setup needs to be run for each spec file.

run-track could try to group the spec files based on the setup if the tests are read-only.
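
A minimal sketch of grouping spec files by identical setup so each setup runs only once; the spec structure is assumed, not cr8's actual format:

from collections import defaultdict

def group_by_setup(specs):
    # Specs with identical setup statements can share one setup run,
    # provided their queries are read-only.
    groups = defaultdict(list)
    for spec in specs:
        groups[tuple(spec.get('setup', []))].append(spec)
    return groups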

Create benchmark table if it doesn't exist

If someone installs cr8 via pip, the benchmarks.sql file isn't included, so creating the table is rather troublesome. It should probably just be created automatically if it is missing.

It might also be worth providing an option to define the table name; for example, the result hosts could be specified as -r localhost:4200/<tablename>
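
A minimal sketch of creating the results table on demand before inserting; the columns are illustrative, not the actual benchmarks.sql schema:

def ensure_result_table(cursor, table='benchmarks'):
    # Create the results table automatically if benchmarks.sql wasn't shipped
    # with the pip install. Column set is illustrative only.
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS {} (
            statement STRING,
            started TIMESTAMP,
            ended TIMESTAMP,
            runtime_stats OBJECT
        )
    '''.format(table))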

cr8 timeit unable to handle comments with semicolon in individual line

Describe the bug
When using cr8 timeit with complex queries that contain comments, failures depend on whether a semicolon appears on the same line where the comment starts.

To Reproduce
Pipe the following:

/* 
comment 2 ;
*/
select 1;

-->

SQLParseException[line 1:1: mismatched input '/' expecting {'SELECT'

This instead works:

/* comment 2 ; */
select 1;

Expected behavior
Runs select 1; without a SQLParseException

cr8 Version: (cr8 --version)

0.26.1

Support git bisect workflow to find perf regression

To find out which commit introduced a performance regression:

Add a sub-command that can be used with git bisect in a Crate repo to
automatically have it build the current checked out Crate version and then run
a spec file against that build.

After each run, prompt the user to mark the result as good or bad.
Optionally, an average duration threshold could be used to decide automatically.

Current manual steps:

In repo:

  • git clean -xdff
  • git submodule update --init
  • ./gradlew clean distTar (or installDist)
  • launch node
  • cr8 run-spec ...
  • manually determine if runtime is good or bad
  • git bisect [good|bad]
  • repeat

Steps to implement this:

  • add support for cr8 run-crate /path/to/repo; this runs the first 4 steps listed above

  • add a cr8 run-crate-and-spec [version|path] [specpath] sub-command

    • (Maybe call it run-adhoc-track or something like that?)
    • (Maybe add a --fail-if [expression] option. E.g. --fail-if "runtime_stats['mean'] > 0.150")

Another option would be to support chaining run-crate with other commands, like cr8 run-crate latest-nightly -- run-spec ... or cr8 run-crate latest-nightly -- timeit ...
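
A minimal sketch of a script usable with git bisect run, assuming "builds and the spec run succeeds" counts as good; paths, the spec file, and the omitted node launch are all illustrative:

import subprocess
import sys

def main():
    # git bisect run conventions: exit 0 = good, 1 = bad, 125 = skip commit.
    build = subprocess.run(['./gradlew', 'clean', 'installDist'])
    if build.returncode != 0:
        sys.exit(125)  # unbuildable commit: ask git bisect to skip it
    # (launching the freshly built node is omitted here)
    run = subprocess.run(['cr8', 'run-spec', 'spec.toml', 'localhost:4200'])
    sys.exit(0 if run.returncode == 0 else 1)

if __name__ == '__main__':
    main()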

KeyboardInterrupt shouldn't print stacktrace

## Running Query:
   Statement: select e.name as employee, o.name as office from employees e full join
   Concurrency: 1
   Iterations: 500
 16%|█████████████▌                                                                        | 79/500 [00:43<03:43,  1.88 requests/s]^C# Running tearDown
Traceback (most recent call last):
  File "bin/cr8", line 11, in <module>
    load_entry_point('cr8==0.6.1.dev31+ngd46ceee', 'console_scripts', 'cr8')()
  File "lib/python3.5/site-packages/cr8/main.py", line 24, in main
    p.dispatch()
  File "lib/python3.5/site-packages/argh/helpers.py", line 55, in dispatch
    return dispatch(self, *args, **kwargs)
  File "lib/python3.5/site-packages/argh/dispatching.py", line 174, in dispatch
    for line in lines:
  File "lib/python3.5/site-packages/argh/dispatching.py", line 277, in _execute_command
    for line in result:
  File "lib/python3.5/site-packages/argh/dispatching.py", line 260, in _call
    result = function(*positional, **keywords)
  File "lib/python3.5/site-packages/cr8/run_spec.py", line 235, in run_spec
    executor.run_queries(spec.queries, spec.meta)
  File "lib/python3.5/site-packages/cr8/run_spec.py", line 177, in run_queries
    result = runner.run()
  File "lib/python3.5/site-packages/cr8/timeit.py", line 130, in run
    measure, statements, self.concurrency, num_items=self.repeats)
  File "lib/python3.5/site-packages/cr8/aio.py", line 56, in run_many
    return loop.run_until_complete(map(coro, iterable, total=num_items))
  File "/usr/lib/python3.5/asyncio/base_events.py", line 375, in run_until_complete
    self.run_forever()
  File "/usr/lib/python3.5/asyncio/base_events.py", line 345, in run_forever
    self._run_once()
  File "/usr/lib/python3.5/asyncio/base_events.py", line 1276, in _run_once
    event_list = self._selector.select(timeout)
  File "/usr/lib/python3.5/selectors.py", line 441, in select
    fd_event_list = self._epoll.poll(timeout, max_ev)
KeyboardInterrupt
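
A minimal sketch of the usual fix: catch KeyboardInterrupt at the entry point and exit quietly; dispatch() is a placeholder for cr8's real command dispatch:

import sys

def dispatch():
    raise KeyboardInterrupt  # placeholder for the real command dispatch

def main():
    try:
        dispatch()
    except KeyboardInterrupt:
        # 130 = 128 + SIGINT: conventional Ctrl-C exit code, no traceback.
        sys.exit(130)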

JDK path points to the wrong location when running latest-nightly

The JDK path points to the wrong location when starting latest-nightly with cr8 0.18.1.dev5+g74d5c9f.d20200427 on macOS.

> cr8 run-crate latest-nightly
Skipping download, tarball alrady extracted at /Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73
Starting Crate process
CrateDB launching:
    PID: 3913
    Logs: /Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/logs/cr8-crate-run848645912.log
    Data: /var/folders/49/v601s_vs0rqd3lx1q7s7shsr0000gn/T/tmpnxgx8rt5 (removed on stop)
/Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/bin/crate: line 129: /Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/jdk/Contents/Home/bin/java: No such file or directory
/Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/bin/crate: line 129: exec: /Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/jdk/Contents/Home/bin/java: cannot execute: No such file or directory
CrateDB didn't start in time or couldn't form a cluster.

The path:
/Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/jdk/Contents/Home/bin/java
should point to:
/Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/jdk/bin/java
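
A minimal sketch of resolving the bundled JDK path by probing both layouts instead of assuming one; purely illustrative:

import os

def java_binary(crate_dir):
    # Some bundles nest the JDK under Contents/Home (macOS bundle layout),
    # others keep it directly under jdk/; probe both.
    for sub in ('jdk/bin/java', 'jdk/Contents/Home/bin/java'):
        candidate = os.path.join(crate_dir, sub)
        if os.path.exists(candidate):
            return candidate
    return 'java'  # fall back to whatever is on PATH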

GEO_SHAPE fake data provider causes `ColumnValidationException`

Steps to reproduce:

cr> create table t (shape geo_shape);
cr8 insert-fake-data --hosts localhost:4200 --table t -n 100000 -b 1
SqlException: SQLActionException[ColumnValidationException: Validation failed for shape: 'POLYGON (( 26.54999544368754 -82.30341222804802, 23.187737765374873 -90.62529200543769, 21.08304227283339 -91.58445700128993, 11.574181468199995 -86.73945042562916, 17.433520322649688 -76.38310406124984, 26.54999544368754 -82.30341222804802 ))' cannot be cast to type geo_shape]
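
The rejected polygon has second coordinates below -90, which hints at latitude values outside the valid range (or swapped lon/lat). A minimal sketch of a provider that stays inside WGS84 bounds; hypothetical, not cr8's actual faker provider:

import random

def fake_polygon(center_lon=0.0, center_lat=0.0, radius=5.0, points=5):
    # WKT expects 'lon lat' pairs; clamp lon to [-180, 180] and lat to [-90, 90].
    # A real provider would also need to avoid self-intersecting rings.
    coords = []
    for _ in range(points):
        lon = max(-180.0, min(180.0, center_lon + random.uniform(-radius, radius)))
        lat = max(-90.0, min(90.0, center_lat + random.uniform(-radius, radius)))
        coords.append('{} {}'.format(lon, lat))
    coords.append(coords[0])  # close the ring
    return 'POLYGON (( {} ))'.format(', '.join(coords))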

Ensure yellow/green special statement support

Until there is "safe" table creation in Crate, it would be good to have some kind of ensure yellow and ensure green statement support in cr8.

These could then be used in cr8 spec-files after table creation to make sure everything is ready before data ingestion starts.
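
A minimal sketch of what an "ensure green" statement could poll, under the assumption that all shards being in state STARTED is a usable proxy for green:

import time

def ensure_green(cursor, timeout=30.0):
    # Poll sys.shards until every shard is STARTED or the timeout expires.
    deadline = time.time() + timeout
    while time.time() < deadline:
        cursor.execute("SELECT count(*) FROM sys.shards WHERE state != 'STARTED'")
        if cursor.fetchone()[0] == 0:
            return
        time.sleep(0.5)
    raise TimeoutError('cluster did not reach green state in time')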

insert-fake-data fails in Windows Python

Python 3.5.2 😢

Traceback (most recent call last):
  File "C:\Users\mibe\AppData\Local\Programs\Python\Python35-32\Scripts\cr8-script.py", line 9, in <module>
    load_entry_point('cr8', 'console_scripts', 'cr8')()
  File "c:\users\mibe\sandbox\mikethebeer\cr8\cr8\__main__.py", line 29, in main
    p.dispatch()
  File "C:\Users\mibe\AppData\Local\Programs\Python\Python35-32\lib\site-packages\argh\helpers.py", line 55, in dispatch
    return dispatch(self, *args, **kwargs)
  File "C:\Users\mibe\AppData\Local\Programs\Python\Python35-32\lib\site-packages\argh\dispatching.py", line 174, in dis
patch
    for line in lines:
  File "C:\Users\mibe\AppData\Local\Programs\Python\Python35-32\lib\site-packages\argh\dispatching.py", line 277, in _ex
ecute_command
    for line in result:
  File "C:\Users\mibe\AppData\Local\Programs\Python\Python35-32\lib\site-packages\argh\dispatching.py", line 260, in _ca
ll
    result = function(*positional, **keywords)
  File "c:\users\mibe\sandbox\mikethebeer\cr8\cr8\insert_fake_data.py", line 248, in insert_fake_data
    loop.add_signal_handler(signal.SIGINT, stop)
  File "C:\Users\mibe\AppData\Local\Programs\Python\Python35-32\lib\asyncio\events.py", line 475, in add_signal_handler
    raise NotImplementedError
NotImplementedError
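
A minimal sketch of the usual portability fix: fall back to the signal module where the event loop doesn't implement add_signal_handler (the Windows case in the traceback):

import signal

def install_sigint(loop, stop):
    # loop.add_signal_handler raises NotImplementedError on Windows event
    # loops, so fall back to a plain signal handler there.
    try:
        loop.add_signal_handler(signal.SIGINT, stop)
    except NotImplementedError:
        signal.signal(signal.SIGINT, lambda signum, frame: stop())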

json2insert fails if thread concurrency decreases

If the concurrency parameter is decreased below a certain number, the json2insert command throws an exception.

(.venv) cr8 [master +] > cat /Users/mibe/demo_0_.json | cr8 json2insert -c 10 mibe.demo st1.p.fir.io:44200 | jq '.'
Executing requests async bulk_size=1000 concurrency=10
191 requests [00:09, 16.29 requests/s]Traceback (most recent call last):
  File "/Users/mibe/sandbox/cr8/.venv/bin/cr8", line 9, in <module>
    load_entry_point('cr8==0.4.1.dev10+ng61b1deb', 'console_scripts', 'cr8')()
  File "/Users/mibe/sandbox/cr8/cr8/main.py", line 22, in main
    p.dispatch()
  File "/Users/mibe/sandbox/cr8/.venv/lib/python3.5/site-packages/argh/helpers.py", line 55, in dispatch
    return dispatch(self, *args, **kwargs)
  File "/Users/mibe/sandbox/cr8/.venv/lib/python3.5/site-packages/argh/dispatching.py", line 174, in dispatch
    for line in lines:
  File "/Users/mibe/sandbox/cr8/.venv/lib/python3.5/site-packages/argh/dispatching.py", line 277, in _execute_command
    for line in result:
  File "/Users/mibe/sandbox/cr8/.venv/lib/python3.5/site-packages/argh/dispatching.py", line 265, in _call
    for line in result:
  File "/Users/mibe/sandbox/cr8/cr8/json2insert.py", line 72, in json2insert
    aio.run(f, bulk_queries, concurrency, loop)
  File "/Users/mibe/sandbox/cr8/cr8/aio.py", line 58, in run
    consume(q)))
  File "/Users/mibe/.pyenv/versions/3.5.0/lib/python3.5/asyncio/base_events.py", line 342, in run_until_complete
    return future.result()
  File "/Users/mibe/.pyenv/versions/3.5.0/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/Users/mibe/.pyenv/versions/3.5.0/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(value)
  File "/Users/mibe/sandbox/cr8/cr8/aio.py", line 32, in map_async
    await q.put(task)
  File "/Users/mibe/.pyenv/versions/3.5.0/lib/python3.5/asyncio/queues.py", line 140, in put
    'queue non-empty, why are getters waiting?')
AssertionError: queue non-empty, why are getters waiting?
Task was destroyed but it is pending!
task: <Task pending coro=<measure() running at /Users/mibe/sandbox/cr8/cr8/aio.py:24> wait_for=<Future pending cb=[wrap_future.<locals>._check_cancel_other() at /Users/mibe/.pyenv/versions/3.5.0/lib/python3.5/asyncio/futures.py:403, Task._wakeup()]>>
Task was destroyed but it is pending!
task: <Task pending coro=<measure() running at /Users/mibe/sandbox/cr8/cr8/aio.py:24> wait_for=<Future pending cb=[wrap_future.<locals>._check_cancel_other() at /Users/mibe/.pyenv/versions/3.5.0/lib/python3.5/asyncio/futures.py:403, Task._wakeup()]>>
201 requests [00:09, 20.88 requests/s]

If the concurrency parameter is increased instead, it works fine.

(.venv) cr8 [master +] > python -V                                                            16:40:12
Python 3.5.0

min version requirement for spec files

If tracks are created to run many things across a wide range of versions, it is likely that older versions won't support some of the statements in the spec files.

For that case it would make sense to be able to specify a min_version, either per spec or per query, so that the spec or query is skipped when run against an older Crate version.
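
A minimal sketch of such a per-spec check; the min_version key and the version-string format are assumptions:

def should_run(spec, server_version):
    # Skip specs whose min_version is newer than the server under test.
    required = spec.get('min_version')
    if required is None:
        return True
    as_tuple = lambda v: tuple(int(p) for p in v.split('.'))
    return as_tuple(server_version) >= as_tuple(required)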

Unrealistic runtime_stats values for request duration

The runtime_stats of a bulk request show a very low request time (mean ~35 ms, max ~45 ms across 8 bulk requests). However, the reported rate of 3.54 requests/s suggests that each request actually takes about 280 ms.

Running setUp
Running benchmark
9 requests [00:02,  3.54 requests/s]
{'bulk_size': 1000,
 'concurrency': 1,
 'ended': 1465362542607,
 'runtime_stats': {'max': 45.40254,
                   'mean': 34.800436375000004,
                   'median': 36.285183,
                   'min': 16.215288,
                   'n': 8,
                   'percentile': {'50': 35.907986,
                                  '75': 37.408257,
                                  '90': 40.147488,
                                  '95': 45.40254,
                                  '99': 45.40254,
                                  '99_9': 45.40254},
                   'stdev': 8.573975858600406,
                   'variance': 73.51306202386256},
 'started': 1465362540064,
 'statement': 'insert into CAMSYSTEM_01_CA (STATUS, TIMESTAMP_MS, CALCULATION, '
              'VALUE, VARIABLE, TIMESTAMP_S, STRVALUE) values (?, ?, ?, ?, ?, '
              '?, ?)',
 'version_info': {'hash': '4713a5498fadf14b2359e13c641e6734f8189dc5',
                  'number': '0.55.0'}}

support for skipping setup/teardown

When trying to figure out if a change improved performance, it is sometimes necessary to run a spec file multiple times. It would be nice if it were possible to keep the data and re-use it for subsequent runs instead of cleaning it each time.

Consistent CLI

Usage is a bit confusing because the arguments are different:

timeit [-s stmt] [-w warmup] [-r repeat] [-c concurrency] hosts

insert-json [-b bulk-size] [-c concurrency] [--hosts hosts] table

insert-fake-data [-b bulk-size] [-c concurrency] [--mapping-file mapping_file] hosts fqtable num_records

run-spec [-r result-hosts] spec benchmark_hosts

run-track [-r result-hosts] [-c crate-root] track

run-crate [-e env ] [-s setting] version
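
One possible direction, sketched purely to illustrate the point: make hosts a uniform --hosts option across all subcommands, e.g.:

timeit [-s stmt] [-w warmup] [-r repeat] [-c concurrency] --hosts hosts
insert-json [-b bulk-size] [-c concurrency] --hosts hosts table
insert-fake-data [-b bulk-size] [-c concurrency] [--mapping-file mapping_file] --hosts hosts fqtable num_records
run-spec [-r result-hosts] --hosts hosts spec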

Starting clusters with several nodes

I use cr8 to also run tests with CrateDB clusters containing several nodes.
It is always a bit of a hassle to set up the configuration properly.

Would it make sense to be able to start several CrateDB nodes forming a cluster with cr8?
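
A minimal sketch built on the create_node function that appears in the tracebacks above; whether it accepts a settings argument like this is an assumption, not verified against cr8's API:

from cr8.run_crate import create_node

# Assumption: create_node accepts per-node settings; the keys below are
# standard discovery/cluster settings, not confirmed for cr8.
nodes = [
    create_node(
        version='latest-nightly',
        settings={
            'cluster.name': 'cr8-cluster',
            'node.name': 'node-{}'.format(i),
        },
    )
    for i in range(3)
]
# Starting the nodes and wiring up unicast discovery is omitted here.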

Crate 3.0 refuses to start

$ cr8 run-crate latest-nightly
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
Exception in thread "main" java.lang.RuntimeException: Invalid value [1b] for the [cluster.routing.allocation.disk.watermark.low] setting.
at org.elasticsearch.node.internal.CrateSettingsPreparer.validateKnownSettings(CrateSettingsPreparer.java:99)
at org.elasticsearch.node.internal.CrateSettingsPreparer.prepareEnvironment(CrateSettingsPreparer.java:80)
at io.crate.bootstrap.CrateDB.createEnv(CrateDB.java:109)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:85)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
at org.elasticsearch.cli.Command.main(Command.java:90)
at io.crate.bootstrap.CrateDB.main(CrateDB.java:88)
at io.crate.bootstrap.CrateDB.main(CrateDB.java:81)
Caused by: java.lang.IllegalArgumentException: unable to consistently parse [cluster.routing.allocation.disk.watermark.low=1b], [cluster.routing.allocation.disk.watermark.high=1b], and [cluster.routing.allocation.disk.watermark.flood_stage=95%] as percentage or bytes
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.doValidate(DiskThresholdSettings.java:170)
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.access$000(DiskThresholdSettings.java:38)
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings$LowDiskWatermarkValidator.validate(DiskThresholdSettings.java:101)
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings$LowDiskWatermarkValidator.validate(DiskThresholdSettings.java:95)
at org.elasticsearch.common.settings.Setting.get(Setting.java:361)
at org.elasticsearch.common.settings.Setting.get(Setting.java:342)
at org.elasticsearch.node.internal.CrateSettingsPreparer.validateKnownSettings(CrateSettingsPreparer.java:97)
... 7 more
Caused by: ElasticsearchParseException[failed to parse setting [cluster.routing.allocation.disk.watermark.flood_stage] with value [95%] as a size in bytes: unit is missing or unrecognized]
at org.elasticsearch.common.unit.ByteSizeValue.parseBytesSizeValue(ByteSizeValue.java:175)
at org.elasticsearch.common.unit.ByteSizeValue.parseBytesSizeValue(ByteSizeValue.java:133)
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.thresholdBytesFromWatermark(DiskThresholdSettings.java:343)
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.doValidateAsBytes(DiskThresholdSettings.java:194)
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.doValidate(DiskThresholdSettings.java:159)
... 13 more
Exiting because CrateDB didn't start correctly
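
The root cause is visible in the trace: watermark.low and watermark.high are pinned to 1b while flood_stage stays at the 95% default, and CrateDB 3.0 insists that all three parse the same way. A sketch of a consistent set; whether cr8 stores its defaults exactly like this is an assumption:

# All three watermarks must be either percentages or byte sizes; mixing
# '1b' with the default '95%' makes CrateDB 3.0 refuse to start.
WATERMARK_SETTINGS = {
    'cluster.routing.allocation.disk.watermark.low': '1b',
    'cluster.routing.allocation.disk.watermark.high': '1b',
    'cluster.routing.allocation.disk.watermark.flood_stage': '1b',
}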

Getting an error when using a list of JSON objects

Describe the bug
If I have a file containing a list of JSON objects, I get an AttributeError:

AttributeError: 'list' object has no attribute 'items'

To Reproduce
Steps to reproduce the behavior:

1. Create some pandas dataset
2. data_set.to_json('data.json', orient='records')

cr8 insert-json --table somedb --host somehost -c 50 -b 2500 -i data.json


Attached the file
data.json.zip
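
A workaround on the producing side: pandas can emit newline-delimited JSON directly, giving one object per line instead of a single top-level list (assuming insert-json expects one JSON object per line):

import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': ['x', 'y']})
# lines=True writes NDJSON: one JSON object per line, no surrounding list.
df.to_json('data.json', orient='records', lines=True)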

run-crate doesn't apply all settings

cr8 run-crate latest-nightly -s psql.enabled=true -s psql.port=5434
/bin/java [...] -Des.cluster.routing.allocation.disk.watermark.low=1b -Des.cluster.routing.allocation.disk.watermark.high=1b -Des.discovery.initial_state_timeout=0 -Des.discovery.zen.ping.multicast.enabled=False -Des.network.host=127.0.0.1 -Des.udc.enabled=False -Des.cluster.name=cr8-crate-run210601515 -Des.psql.port=5434 -Des.path.data=/tmp/tmplg2b5t_a io.crate.bootstrap.CrateF

Note that -Des.psql.port=5434 appears in the resulting java invocation, but psql.enabled=true does not.

Use https as bug report


timeout for queries in spec files

If, due to a performance regression or bug, a query suddenly becomes a lot slower and takes hours instead of seconds, it stalls the whole benchmark suite and makes running the same specs across different versions troublesome.

It would be great to be able to define a per-query timeout so that such a query is interrupted and skipped if it takes longer to execute.
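
A minimal sketch of a per-query timeout with asyncio, which cr8 already uses for execution; the execute coroutine and the skip behavior are illustrative:

import asyncio

async def run_with_timeout(execute, stmt, timeout):
    # Interrupt and skip a statement that exceeds its per-query timeout
    # instead of letting it stall the whole benchmark suite.
    try:
        return await asyncio.wait_for(execute(stmt), timeout=timeout)
    except asyncio.TimeoutError:
        return None  # recorded as skipped in the results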

Use https or the progress protocol

