
cr8's Introduction

My Blog · Sponsor · My dotfiles · 🐦

Things I created or helped create

🔩

  • CrateDB - A distributed SQL Database
  • cr8 - CLI collection of utilities for working with CrateDB or PostgreSQL. Benchmark queries, insert data.
  • knx - Python KNX/EIB client library
  • mkjson - A command-line tool to generate static or random JSON records

🐛 Debug adapter protocol

  • hprofdap - Debug adapter for inspecting Java heap dumps (.hprof files) via OQL
  • dapconfig-schema - JSON Schema for .vscode/launch.json debug configuration files

Neovim

  • nlua - Neovim as Lua interpreter

Plugins

  • nvim-dap - A Neovim client for debug adapters (implements the Debug Adapter Protocol)
  • nvim-dap-python - Python extension for nvim-dap
  • nvim-jdtls - Extensions for the Neovim built-in language server client for eclipse.jdt.ls
  • nvim-fzy - A fuzzy finder like fzf.vim but for fzy and neovim with Lua API
  • nvim-qwahl - A collection of pickers using vim.ui.select. Complementary to nvim-fzy.
  • nvim-lint - An asynchronous linter plugin for Neovim. Complementary to the built-in Language Server Protocol support.
  • nvim-lsp-compl - A (auto-)completion plugin for Neovim focusing on LSP support.
  • nvim-treehopper - Region selection with hints on the AST nodes of a document powered by treesitter.
  • nvim-ansible - Functions to run Ansible playbooks, filetype patterns, and improved path handling.
  • nvim-snippasta - Copy text/code and paste it transformed into snippets, using treesitter queries for tabstop detection.
  • nvim-overfly - Provides keymaps to quickly fly around your source code.

cr8's People

Contributors

amotl, andreidan, asrivas-reco, azure-pipelines[bot], chaudum, hlcianfagna, lowks, markush, matriv, mfussenegger, mkleen, quodt, rps-v, seut


cr8's Issues

Result checksum / verification option

It would be nice to have an option that adds a result checksum to the output and, when there is more than one iteration, verifies the checksum across iterations.
This would be useful for queries that produce deterministic results, to make sure no regressions are introduced that produce garbage output.
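A minimal sketch of how such a checksum could work, assuming result rows can be serialized to JSON; the function and its integration point are hypothetical:

import hashlib
import json

def result_checksum(rows):
    # Serialize the rows deterministically and hash them; any change in
    # the result set changes the digest.
    payload = json.dumps(rows, sort_keys=True, default=str).encode('utf-8')
    return hashlib.sha256(payload).hexdigest()

# Verification across iterations (hypothetical):
# assert len({result_checksum(r) for r in iteration_results}) == 1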

timeit subcommand does not support line breaks in stmt

Reproducible with:

query.sql

select id,
name,
country
from countries
(.venv) ☁  cr8 [master] ☢ cat query.sql | cr8 timeit --hosts localhost:4200
100%|██████████████████████████████████████████████████████████████| 30/30 [00:05<00:00,  5.79 requests/s]
SqlException: SQLActionException[SQLParseException: line 2:1: no viable alternative at input '<EOF>']

cr8 timeit unable to handle comments after final semicolon

Describe the bug
When using cr8 timeit with complex queries that contain comments, I get failures.

To Reproduce
Pipe the following to cr8 timeit:

SELECT 1;
/* test */

-->

SQLParseException[line 1:11: mismatched input '<EOF>' expecting {'SELECT'

Expected behavior
It runs SELECT 1;

cr8 Version: (cr8 --version)
0.26.1
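
This and the previous line-break issue both come down to how statements piped via stdin are split and cleaned before execution. A minimal sketch using the third-party sqlparse library; whether cr8 parses input this way is an assumption:

import sys
import sqlparse

def read_statements(text):
    # Strip comments first, then split on statement boundaries: a trailing
    # '/* ... */' after the final semicolon disappears, and multi-line
    # statements stay intact.
    cleaned = sqlparse.format(text, strip_comments=True)
    return [s.strip() for s in sqlparse.split(cleaned) if s.strip()]

for stmt in read_statements(sys.stdin.read()):
    print(stmt)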

insert-fake-data exits with Python error

Running with Crate 2.1.6 and cr8 0.10.0.

For this schema:

CREATE TABLE IF NOT EXISTS "ano"."nym" (
   "count" LONG,
   "csp_id" INTEGER,
   "custom1" STRING,
   "custom2" STRING,
   "custom3" STRING,
   "custom4" STRING,
   "custom5" STRING,
   "detecteduser_id" LONG,
   "device_id" INTEGER,
   "dlpbytes" LONG,
   "dlpcount" LONG,
   "downloadedbytes" LONG,
   "eventsummary_id" LONG,
   "fromtime" TIMESTAMP,
   "hash_id" STRING,
   "logprocessortag_id" INTEGER,
   "monitor" INTEGER,
   "protocol" STRING,
   "serviceblocked" INTEGER,
   "tenant_id" INTEGER,
   "timeupdated" TIMESTAMP,
   "totalbytes" LONG,
   "totime" TIMESTAMP,
   "uploadedbytes" LONG,
   "userorip" INTEGER,
   PRIMARY KEY ("detecteduser_id", "fromtime", "eventsummary_id")
);

It exits with the following message:

TypeError: 'float' object cannot be interpreted as an integer

full output

Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/concurrent/futures/process.py", line 175, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/Users/johannes/env/lib/python3.6/site-packages/cr8/insert_fake_data.py", line 137, in _create_bulk_args
    return [row_fun() for i in range(req_size)]
TypeError: 'float' object cannot be interpreted as an integer
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/johannes/env/bin/cr8", line 11, in <module>
    sys.exit(main())
  File "/Users/johannes/env/lib/python3.6/site-packages/cr8/__main__.py", line 60, in main
    p.dispatch()
  File "/Users/johannes/env/lib/python3.6/site-packages/argh/helpers.py", line 55, in dispatch
    return dispatch(self, *args, **kwargs)
  File "/Users/johannes/env/lib/python3.6/site-packages/argh/dispatching.py", line 174, in dispatch
    for line in lines:
  File "/Users/johannes/env/lib/python3.6/site-packages/argh/dispatching.py", line 277, in _execute_command
    for line in result:
  File "/Users/johannes/env/lib/python3.6/site-packages/argh/dispatching.py", line 260, in _call
    result = function(*positional, **keywords)
  File "/Users/johannes/env/lib/python3.6/site-packages/cr8/insert_fake_data.py", line 258, in insert_fake_data
    loop.run_until_complete(tasks)
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 466, in run_until_complete
    return future.result()
  File "/Users/johannes/env/lib/python3.6/site-packages/cr8/aio.py", line 49, in consume
    raise last_error
  File "/Users/johannes/env/lib/python3.6/site-packages/cr8/aio.py", line 44, in consume
    await task
  File "/Users/johannes/env/lib/python3.6/site-packages/cr8/insert_fake_data.py", line 133, in _exec_many
    return await client.execute_many(stmt, await args_coro)
TypeError: 'float' object cannot be interpreted as an integer
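
The traceback shows range() receiving a float for the request size, which points at true division somewhere upstream. A minimal sketch of the failure mode and the fix; the function and variable names are illustrative, not cr8's actual code:

def bulk_batches(row_fun, num_records, bulk_size):
    # In Python 3, '/' always yields a float, so range(num_records / bulk_size)
    # raises "'float' object cannot be interpreted as an integer".
    # Floor division (or an explicit int()) keeps range() happy.
    n_batches = num_records // bulk_size
    return [[row_fun() for _ in range(bulk_size)] for _ in range(n_batches)]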

Unicode run-crate arguments are not passed correctly as settings

Example:

cr8 run-crate 2.1.7 -s node.name="لمهندس مؤيّد النشاشيبي"

results in:

cr> select name from sys.nodes;
+--------------+
| name         |
+--------------+
| ������������ |
+--------------+
SELECT 1 row in set (0.030 sec)

When running CrateDB directly, the node.name setting is correct. Same applies to cluster.name.
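
One candidate workaround, assuming the mangling happens when the forked JVM decodes its arguments with a non-UTF-8 default charset; this is an assumption, not a confirmed fix:

import os
import subprocess

env = dict(os.environ)
# Assumption: forcing a UTF-8 default charset in the JVM keeps non-ASCII
# values for settings like node.name and cluster.name intact.
env['CRATE_JAVA_OPTS'] = (env.get('CRATE_JAVA_OPTS', '') + ' -Dfile.encoding=UTF-8').strip()
subprocess.run(['bin/crate'], env=env)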

FileNotFoundException if `path.data` contains multiple paths

cr8 run-crate 0.57.3 -s path.data="/tmp/data1/,/tmp/data2/"
 File "/home/jordi/workspace/code/cr8/cr8/run_crate.py", line 349, in run_crate
    print('Stopping Crate...')
  File "/home/jordi/workspace/code/cr8/cr8/run_crate.py", line 208, in __exit__
    self.stop()
  File "/home/jordi/workspace/code/cr8/cr8/run_crate.py", line 202, in stop
    shutil.rmtree(self.data_path)
  File "/usr/lib/python3.5/shutil.py", line 465, in rmtree
    onerror(os.lstat, path, sys.exc_info())
  File "/usr/lib/python3.5/shutil.py", line 463, in rmtree
    orig_st = os.lstat(path)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/data1/,/tmp/data2/'
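
A minimal sketch of a fix in the cleanup step shown in the traceback: treat path.data as a comma-separated list and remove each directory individually. Attribute names follow the traceback; the rest is illustrative:

import shutil

def stop(self):
    # path.data may hold several comma-separated directories, so remove each
    # one instead of treating the whole string as a single path.
    for path in self.data_path.split(','):
        shutil.rmtree(path.strip(), ignore_errors=True)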

run-crate Windows support

cr8 tries to unpack the .zip with tarfile:

Downloading https://cdn.crate.io/downloads/releases/cratedb/x64_windows/crate-4.2.2.zip and extracting to C:\Users\runneradmin\.cache\cr8\crates
Traceback (most recent call last):
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\hostedtoolcache\windows\Python\3.8.5\x64\Scripts\cr8.exe\__main__.py", line 7, in <module>
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\site-packages\cr8\__main__.py", line 84, in main
    _run_crate_and_rest(p, args_groups)
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\site-packages\cr8\__main__.py", line 42, in _run_crate_and_rest
    with create_node(version=args.version,
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\site-packages\cr8\run_crate.py", line 700, in create_node
    crate_dir = get_crate(version, crate_root)
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\site-packages\cr8\run_crate.py", line 670, in get_crate
    crate_dir = _download_and_extract(uri, crate_root)
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\site-packages\cr8\run_crate.py", line 485, in _download_and_extract
    with tarfile.open(fileobj=tmpfile) as t:
  File "c:\hostedtoolcache\windows\python\3.8.5\x64\lib\tarfile.py", line 1606, in open
    raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully
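
A minimal sketch of extension-aware extraction, assuming the download is buffered in a temporary file as in the traceback; the function name is illustrative:

import tarfile
import zipfile

def extract(tmpfile, uri, target_dir):
    # Windows releases ship as .zip, the others as .tar.gz, so pick the
    # matching module instead of always using tarfile.
    if uri.endswith('.zip'):
        with zipfile.ZipFile(tmpfile) as z:
            z.extractall(target_dir)
    else:
        with tarfile.open(fileobj=tmpfile) as t:
            t.extractall(target_dir)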

Track should not fail if a single spec of the run fails

It would be great to have an option for run-track that allows single specs to fail without stopping the whole track run.

Suggestion: a --non-strict (short: -S) command-line argument

$ cr8 run-track path/to/track.toml --non-strict
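
A minimal sketch of what --non-strict could do in the track runner; run_spec stands in for the real per-spec entry point:

def run_track(specs, run_spec, strict=True):
    # With strict=False (--non-strict), a failing spec is recorded and the
    # remaining specs still run; with strict=True the old behavior remains.
    failures = []
    for spec in specs:
        try:
            run_spec(spec)
        except Exception as err:
            if strict:
                raise
            failures.append((spec, err))
    return failures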

Shareable setup across spec files in run-track

In order to keep things organized it is useful to split tests into multiple spec files.
This has the disadvantage that the setup needs to be run for each spec file.

run-track could try to group the spec files based on the setup if the tests are read-only.
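
A minimal sketch of grouping spec files by identical setup so each setup runs only once; the spec structure is assumed, not cr8's actual format:

from collections import defaultdict

def group_by_setup(specs):
    # Specs with identical setup statements can share one setup run,
    # provided their queries are read-only.
    groups = defaultdict(list)
    for spec in specs:
        groups[tuple(spec.get('setup', []))].append(spec)
    return groups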

Create benchmark table if it doesn't exist

If someone installs cr8 via pip, the benchmarks.sql file isn't included, so creating the table is rather troublesome. It should probably just be created automatically if it is missing.

It might also be worth providing an option to define the table name; for example, the result hosts could be specified as -r localhost:4200/<tablename>
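
A minimal sketch of creating the results table on demand before inserting; the columns are illustrative, not the actual benchmarks.sql schema:

def ensure_result_table(cursor, table='benchmarks'):
    # Create the results table automatically if benchmarks.sql wasn't shipped
    # with the pip install. Column set is illustrative only.
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS {} (
            statement STRING,
            started TIMESTAMP,
            ended TIMESTAMP,
            runtime_stats OBJECT
        )
    '''.format(table))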

cr8 timeit unable to handle comments with semicolon in individual line

Describe the bug
When using cr8 timeit with complex queries that contain comments, failures depend on whether a semicolon appears on the same line where the comment starts.

To Reproduce
Pipe the following:

/* 
comment 2 ;
*/
select 1;

-->

SQLParseException[line 1:1: mismatched input '/' expecting {'SELECT'

This instead works:

/* comment 2 ; */
select 1;

Expected behavior
Runs select 1; without a SQLParseException

cr8 Version: (cr8 --version)

0.26.1

Support git bisect workflow to find perf regression

To find out which commit introduced a performance regression:

Add a sub-command that can be used with git bisect in a Crate repo to
automatically have it build the current checked out Crate version and then run
a spec file against that build.

After each run, prompt the user to mark the result as good or bad.
Optionally, an average duration threshold could be used to decide automatically.

Current manual steps:

In repo:

  • git clean -xdff
  • git submodule update --init
  • ./gradlew clean distTar (or installDist)
  • launch node
  • cr8 run-spec ...
  • manually determine if runtime is good or bad
  • git bisect [good|bad]
  • repeat

Steps to implement this:

  • add support for cr8 run-crate /path/to/repo; this runs the first 4 steps listed above

  • add a cr8 run-crate-and-spec [version|path] [specpath] sub-command

    • (Maybe call it run-adhoc-track or something like that?)
    • (Maybe add a --fail-if [expression] option. E.g. --fail-if "runtime_stats['mean'] > 0.150")

Another option would be to support chaining run-crate with other commands, like cr8 run-crate latest-nightly -- run-spec ... or cr8 run-crate latest-nightly -- timeit ...
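
A minimal sketch of a script usable with git bisect run, assuming "builds and the spec run succeeds" counts as good; paths, the spec file, and the omitted node launch are all illustrative:

import subprocess
import sys

def main():
    # git bisect run conventions: exit 0 = good, 1 = bad, 125 = skip commit.
    build = subprocess.run(['./gradlew', 'clean', 'installDist'])
    if build.returncode != 0:
        sys.exit(125)  # unbuildable commit: ask git bisect to skip it
    # (launching the freshly built node is omitted here)
    run = subprocess.run(['cr8', 'run-spec', 'spec.toml', 'localhost:4200'])
    sys.exit(0 if run.returncode == 0 else 1)

if __name__ == '__main__':
    main()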

KeyboardInterrupt shouldn't print stacktrace

## Running Query:
   Statement: select e.name as employee, o.name as office from employees e full join
   Concurrency: 1
   Iterations: 500
 16%|█████████████▌                                                                        | 79/500 [00:43<03:43,  1.88 requests/s]^C# Running tearDown
Traceback (most recent call last):
  File "bin/cr8", line 11, in <module>
    load_entry_point('cr8==0.6.1.dev31+ngd46ceee', 'console_scripts', 'cr8')()
  File "lib/python3.5/site-packages/cr8/main.py", line 24, in main
    p.dispatch()
  File "lib/python3.5/site-packages/argh/helpers.py", line 55, in dispatch
    return dispatch(self, *args, **kwargs)
  File "lib/python3.5/site-packages/argh/dispatching.py", line 174, in dispatch
    for line in lines:
  File "lib/python3.5/site-packages/argh/dispatching.py", line 277, in _execute_command
    for line in result:
  File "lib/python3.5/site-packages/argh/dispatching.py", line 260, in _call
    result = function(*positional, **keywords)
  File "lib/python3.5/site-packages/cr8/run_spec.py", line 235, in run_spec
    executor.run_queries(spec.queries, spec.meta)
  File "lib/python3.5/site-packages/cr8/run_spec.py", line 177, in run_queries
    result = runner.run()
  File "lib/python3.5/site-packages/cr8/timeit.py", line 130, in run
    measure, statements, self.concurrency, num_items=self.repeats)
  File "lib/python3.5/site-packages/cr8/aio.py", line 56, in run_many
    return loop.run_until_complete(map(coro, iterable, total=num_items))
  File "/usr/lib/python3.5/asyncio/base_events.py", line 375, in run_until_complete
    self.run_forever()
  File "/usr/lib/python3.5/asyncio/base_events.py", line 345, in run_forever
    self._run_once()
  File "/usr/lib/python3.5/asyncio/base_events.py", line 1276, in _run_once
    event_list = self._selector.select(timeout)
  File "/usr/lib/python3.5/selectors.py", line 441, in select
    fd_event_list = self._epoll.poll(timeout, max_ev)
KeyboardInterrupt
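
A minimal sketch of the usual fix: catch KeyboardInterrupt at the entry point and exit quietly; dispatch() is a placeholder for cr8's real command dispatch:

import sys

def dispatch():
    raise KeyboardInterrupt  # placeholder for the real command dispatch

def main():
    try:
        dispatch()
    except KeyboardInterrupt:
        # 130 = 128 + SIGINT: conventional Ctrl-C exit code, no traceback.
        sys.exit(130)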

JDK path points to the wrong location when running latest-nightly

The JDK path points to the wrong location when starting latest-nightly with cr8 0.18.1.dev5+g74d5c9f.d20200427 on macOS.

> cr8 run-crate latest-nightly
Skipping download, tarball alrady extracted at /Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73
Starting Crate process
CrateDB launching:
    PID: 3913
    Logs: /Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/logs/cr8-crate-run848645912.log
    Data: /var/folders/49/v601s_vs0rqd3lx1q7s7shsr0000gn/T/tmpnxgx8rt5 (removed on stop)
/Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/bin/crate: line 129: /Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/jdk/Contents/Home/bin/java: No such file or directory
/Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/bin/crate: line 129: exec: /Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/jdk/Contents/Home/bin/java: cannot execute: No such file or directory
CrateDB didn't start in time or couldn't form a cluster.

The path:
/Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/jdk/Contents/Home/bin/java
should point to:
/Users/mkleen/.cache/cr8/crates/crate-4.2.0-202004270003-1d3ac73/jdk/bin/java
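
A minimal sketch of resolving the bundled JDK path by probing both layouts instead of assuming one; purely illustrative:

import os

def java_binary(crate_dir):
    # Some bundles nest the JDK under Contents/Home (macOS bundle layout),
    # others keep it directly under jdk/; probe both.
    for sub in ('jdk/bin/java', 'jdk/Contents/Home/bin/java'):
        candidate = os.path.join(crate_dir, sub)
        if os.path.exists(candidate):
            return candidate
    return 'java'  # fall back to whatever is on PATH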

GEO_SHAPE fake data provider causes `ColumnValidationException`

Steps to reproduce:

cr> create table t (shape geo_shape);
cr8 insert-fake-data --hosts localhost:4200 --table t -n 100000 -b 1
SqlException: SQLActionException[ColumnValidationException: Validation failed for shape: 'POLYGON (( 26.54999544368754 -82.30341222804802, 23.187737765374873 -90.62529200543769, 21.08304227283339 -91.58445700128993, 11.574181468199995 -86.73945042562916, 17.433520322649688 -76.38310406124984, 26.54999544368754 -82.30341222804802 ))' cannot be cast to type geo_shape]
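
The rejected polygon has second coordinates below -90, which hints at latitude values outside the valid range (or swapped lon/lat). A minimal sketch of a provider that stays inside WGS84 bounds; hypothetical, not cr8's actual faker provider:

import random

def fake_polygon(center_lon=0.0, center_lat=0.0, radius=5.0, points=5):
    # WKT expects 'lon lat' pairs; clamp lon to [-180, 180] and lat to [-90, 90].
    # A real provider would also need to avoid self-intersecting rings.
    coords = []
    for _ in range(points):
        lon = max(-180.0, min(180.0, center_lon + random.uniform(-radius, radius)))
        lat = max(-90.0, min(90.0, center_lat + random.uniform(-radius, radius)))
        coords.append('{} {}'.format(lon, lat))
    coords.append(coords[0])  # close the ring
    return 'POLYGON (( {} ))'.format(', '.join(coords))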

Ensure yellow/green special statement support

Until there is "safe" table creation in Crate, it would be good to have some kind of ensure yellow and ensure green statement support in cr8.

These could then be used in cr8 spec-files after table creation to make sure everything is ready before data ingestion starts.
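
A minimal sketch of what an "ensure green" statement could poll, under the assumption that all shards being in state STARTED is a usable proxy for green:

import time

def ensure_green(cursor, timeout=30.0):
    # Poll sys.shards until every shard is STARTED or the timeout expires.
    deadline = time.time() + timeout
    while time.time() < deadline:
        cursor.execute("SELECT count(*) FROM sys.shards WHERE state != 'STARTED'")
        if cursor.fetchone()[0] == 0:
            return
        time.sleep(0.5)
    raise TimeoutError('cluster did not reach green state in time')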

insert-fake-data fails in Windows Python

Python 3.5.2 😢

Traceback (most recent call last):
  File "C:\Users\mibe\AppData\Local\Programs\Python\Python35-32\Scripts\cr8-script.py", line 9, in <module>
    load_entry_point('cr8', 'console_scripts', 'cr8')()
  File "c:\users\mibe\sandbox\mikethebeer\cr8\cr8\__main__.py", line 29, in main
    p.dispatch()
  File "C:\Users\mibe\AppData\Local\Programs\Python\Python35-32\lib\site-packages\argh\helpers.py", line 55, in dispatch
    return dispatch(self, *args, **kwargs)
  File "C:\Users\mibe\AppData\Local\Programs\Python\Python35-32\lib\site-packages\argh\dispatching.py", line 174, in dis
patch
    for line in lines:
  File "C:\Users\mibe\AppData\Local\Programs\Python\Python35-32\lib\site-packages\argh\dispatching.py", line 277, in _ex
ecute_command
    for line in result:
  File "C:\Users\mibe\AppData\Local\Programs\Python\Python35-32\lib\site-packages\argh\dispatching.py", line 260, in _ca
ll
    result = function(*positional, **keywords)
  File "c:\users\mibe\sandbox\mikethebeer\cr8\cr8\insert_fake_data.py", line 248, in insert_fake_data
    loop.add_signal_handler(signal.SIGINT, stop)
  File "C:\Users\mibe\AppData\Local\Programs\Python\Python35-32\lib\asyncio\events.py", line 475, in add_signal_handler
    raise NotImplementedError
NotImplementedError
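
A minimal sketch of the usual portability fix: fall back to the signal module where the event loop doesn't implement add_signal_handler (the Windows case in the traceback):

import signal

def install_sigint(loop, stop):
    # loop.add_signal_handler raises NotImplementedError on Windows event
    # loops, so fall back to a plain signal handler there.
    try:
        loop.add_signal_handler(signal.SIGINT, stop)
    except NotImplementedError:
        signal.signal(signal.SIGINT, lambda signum, frame: stop())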

json2insert fails if thread concurrency decreases

If the concurrency parameter is decreased below a certain number, the json2insert command throws an exception.

(.venv) cr8 [master +] > cat /Users/mibe/demo_0_.json | cr8 json2insert -c 10 mibe.demo st1.p.fir.io:44200 | jq '.'
Executing requests async bulk_size=1000 concurrency=10
191 requests [00:09, 16.29 requests/s]Traceback (most recent call last):
  File "/Users/mibe/sandbox/cr8/.venv/bin/cr8", line 9, in <module>
    load_entry_point('cr8==0.4.1.dev10+ng61b1deb', 'console_scripts', 'cr8')()
  File "/Users/mibe/sandbox/cr8/cr8/main.py", line 22, in main
    p.dispatch()
  File "/Users/mibe/sandbox/cr8/.venv/lib/python3.5/site-packages/argh/helpers.py", line 55, in dispatch
    return dispatch(self, *args, **kwargs)
  File "/Users/mibe/sandbox/cr8/.venv/lib/python3.5/site-packages/argh/dispatching.py", line 174, in dispatch
    for line in lines:
  File "/Users/mibe/sandbox/cr8/.venv/lib/python3.5/site-packages/argh/dispatching.py", line 277, in _execute_command
    for line in result:
  File "/Users/mibe/sandbox/cr8/.venv/lib/python3.5/site-packages/argh/dispatching.py", line 265, in _call
    for line in result:
  File "/Users/mibe/sandbox/cr8/cr8/json2insert.py", line 72, in json2insert
    aio.run(f, bulk_queries, concurrency, loop)
  File "/Users/mibe/sandbox/cr8/cr8/aio.py", line 58, in run
    consume(q)))
  File "/Users/mibe/.pyenv/versions/3.5.0/lib/python3.5/asyncio/base_events.py", line 342, in run_until_complete
    return future.result()
  File "/Users/mibe/.pyenv/versions/3.5.0/lib/python3.5/asyncio/futures.py", line 274, in result
    raise self._exception
  File "/Users/mibe/.pyenv/versions/3.5.0/lib/python3.5/asyncio/tasks.py", line 239, in _step
    result = coro.send(value)
  File "/Users/mibe/sandbox/cr8/cr8/aio.py", line 32, in map_async
    await q.put(task)
  File "/Users/mibe/.pyenv/versions/3.5.0/lib/python3.5/asyncio/queues.py", line 140, in put
    'queue non-empty, why are getters waiting?')
AssertionError: queue non-empty, why are getters waiting?
Task was destroyed but it is pending!
task: <Task pending coro=<measure() running at /Users/mibe/sandbox/cr8/cr8/aio.py:24> wait_for=<Future pending cb=[wrap_future.<locals>._check_cancel_other() at /Users/mibe/.pyenv/versions/3.5.0/lib/python3.5/asyncio/futures.py:403, Task._wakeup()]>>
Task was destroyed but it is pending!
task: <Task pending coro=<measure() running at /Users/mibe/sandbox/cr8/cr8/aio.py:24> wait_for=<Future pending cb=[wrap_future.<locals>._check_cancel_other() at /Users/mibe/.pyenv/versions/3.5.0/lib/python3.5/asyncio/futures.py:403, Task._wakeup()]>>
201 requests [00:09, 20.88 requests/s]

If the concurrency parameter is increased instead, it works fine.

(.venv) cr8 [master +] > python -V                                                            16:40:12
Python 3.5.0

min version requirement for spec files

If tracks are created to run many things across a wide range of versions, it is likely that older versions won't support some of the statements in the spec files.

For that case it would make sense to be able to specify a min_version, either per spec or per query, so that the spec or query is skipped when run against an older Crate version.
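
A minimal sketch of such a per-spec check; the min_version key and the version-string format are assumptions:

def should_run(spec, server_version):
    # Skip specs whose min_version is newer than the server under test.
    required = spec.get('min_version')
    if required is None:
        return True
    as_tuple = lambda v: tuple(int(p) for p in v.split('.'))
    return as_tuple(server_version) >= as_tuple(required)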

Unrealistic runtime_stats values for request duration

The runtime_stats of a bulk request show a very low request time (mean ~35 ms, max ~45 ms across 8 bulk requests). However, the reported rate of 3.54 requests/s suggests that each request actually takes about 280 ms.

Running setUp
Running benchmark
9 requests [00:02,  3.54 requests/s]
{'bulk_size': 1000,
 'concurrency': 1,
 'ended': 1465362542607,
 'runtime_stats': {'max': 45.40254,
                   'mean': 34.800436375000004,
                   'median': 36.285183,
                   'min': 16.215288,
                   'n': 8,
                   'percentile': {'50': 35.907986,
                                  '75': 37.408257,
                                  '90': 40.147488,
                                  '95': 45.40254,
                                  '99': 45.40254,
                                  '99_9': 45.40254},
                   'stdev': 8.573975858600406,
                   'variance': 73.51306202386256},
 'started': 1465362540064,
 'statement': 'insert into CAMSYSTEM_01_CA (STATUS, TIMESTAMP_MS, CALCULATION, '
              'VALUE, VARIABLE, TIMESTAMP_S, STRVALUE) values (?, ?, ?, ?, ?, '
              '?, ?)',
 'version_info': {'hash': '4713a5498fadf14b2359e13c641e6734f8189dc5',
                  'number': '0.55.0'}}

support for skipping setup/teardown

When trying to figure out if a change improved performance, it is sometimes necessary to run a spec file multiple times. It would be nice if it were possible to keep the data and re-use it for subsequent runs instead of cleaning it each time.

Consistent CLI

Usage is a bit confusing because the arguments are different:

timeit [-s stmt] [-w warmup] [-r repeat] [-c concurrency] hosts

insert-json [-b bulk-size] [-c concurrency] [--hosts hosts] table

insert-fake-data [-b bulk-size] [-c concurrency] [--mapping-file mapping_file] hosts fqtable num_records

run-spec [-r result-hosts] spec benchmark_hosts

run-track [-r result-hosts] [-c crate-root] track

run-crate [-e env ] [-s setting] version
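
One possible direction, sketched purely to illustrate the point: make hosts a uniform --hosts option across all subcommands, e.g.:

timeit [-s stmt] [-w warmup] [-r repeat] [-c concurrency] --hosts hosts
insert-json [-b bulk-size] [-c concurrency] --hosts hosts table
insert-fake-data [-b bulk-size] [-c concurrency] [--mapping-file mapping_file] --hosts hosts fqtable num_records
run-spec [-r result-hosts] --hosts hosts spec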

Starting clusters with several nodes

I use cr8 to also run tests with CrateDB clusters containing several nodes.
It is always a bit of a hassle to set up the configuration properly.

Would it make sense to be able to start several CrateDB nodes forming a cluster with cr8?
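
A minimal sketch built on the create_node function that appears in the tracebacks above; whether it accepts a settings argument like this is an assumption, not verified against cr8's API:

from cr8.run_crate import create_node

# Assumption: create_node accepts per-node settings; the keys below are
# standard discovery/cluster settings, not confirmed for cr8.
nodes = [
    create_node(
        version='latest-nightly',
        settings={
            'cluster.name': 'cr8-cluster',
            'node.name': 'node-{}'.format(i),
        },
    )
    for i in range(3)
]
# Starting the nodes and wiring up unicast discovery is omitted here.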

Crate 3.0 refuses to start

$ cr8 run-crate latest-nightly
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
Exception in thread "main" java.lang.RuntimeException: Invalid value [1b] for the [cluster.routing.allocation.disk.watermark.low] setting.
at org.elasticsearch.node.internal.CrateSettingsPreparer.validateKnownSettings(CrateSettingsPreparer.java:99)
at org.elasticsearch.node.internal.CrateSettingsPreparer.prepareEnvironment(CrateSettingsPreparer.java:80)
at io.crate.bootstrap.CrateDB.createEnv(CrateDB.java:109)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:85)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
at org.elasticsearch.cli.Command.main(Command.java:90)
at io.crate.bootstrap.CrateDB.main(CrateDB.java:88)
at io.crate.bootstrap.CrateDB.main(CrateDB.java:81)
Caused by: java.lang.IllegalArgumentException: unable to consistently parse [cluster.routing.allocation.disk.watermark.low=1b], [cluster.routing.allocation.disk.watermark.high=1b], and [cluster.routing.allocation.disk.watermark.flood_stage=95%] as percentage or bytes
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.doValidate(DiskThresholdSettings.java:170)
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.access$000(DiskThresholdSettings.java:38)
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings$LowDiskWatermarkValidator.validate(DiskThresholdSettings.java:101)
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings$LowDiskWatermarkValidator.validate(DiskThresholdSettings.java:95)
at org.elasticsearch.common.settings.Setting.get(Setting.java:361)
at org.elasticsearch.common.settings.Setting.get(Setting.java:342)
at org.elasticsearch.node.internal.CrateSettingsPreparer.validateKnownSettings(CrateSettingsPreparer.java:97)
... 7 more
Caused by: ElasticsearchParseException[failed to parse setting [cluster.routing.allocation.disk.watermark.flood_stage] with value [95%] as a size in bytes: unit is missing or unrecognized]
at org.elasticsearch.common.unit.ByteSizeValue.parseBytesSizeValue(ByteSizeValue.java:175)
at org.elasticsearch.common.unit.ByteSizeValue.parseBytesSizeValue(ByteSizeValue.java:133)
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.thresholdBytesFromWatermark(DiskThresholdSettings.java:343)
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.doValidateAsBytes(DiskThresholdSettings.java:194)
at org.elasticsearch.cluster.routing.allocation.DiskThresholdSettings.doValidate(DiskThresholdSettings.java:159)
... 13 more
Exiting because CrateDB didn't start correctly
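
The root cause is visible in the trace: watermark.low and watermark.high are pinned to 1b while flood_stage stays at the 95% default, and CrateDB 3.0 insists that all three parse the same way. A sketch of a consistent set; whether cr8 stores its defaults exactly like this is an assumption:

# All three watermarks must be either percentages or byte sizes; mixing
# '1b' with the default '95%' makes CrateDB 3.0 refuse to start.
WATERMARK_SETTINGS = {
    'cluster.routing.allocation.disk.watermark.low': '1b',
    'cluster.routing.allocation.disk.watermark.high': '1b',
    'cluster.routing.allocation.disk.watermark.flood_stage': '1b',
}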

Getting an error when using a list of JSON objects

Describe the bug
If I have a file containing a list of JSON objects, I get an AttributeError:

AttributeError: 'list' object has no attribute 'items'

To Reproduce
Steps to reproduce the behavior:

1. Create some pandas dataset
2. data_set.to_json('data.json', orient='records')

cr8 insert-json --table somedb --host somehost -c 50 -b 2500 -i data.json


Attached the file
data.json.zip
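
A workaround on the producing side: pandas can emit newline-delimited JSON directly, giving one object per line instead of a single top-level list (assuming insert-json expects one JSON object per line):

import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': ['x', 'y']})
# lines=True writes NDJSON: one JSON object per line, no surrounding list.
df.to_json('data.json', orient='records', lines=True)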

run-crate doesn't apply all settings

cr8 run-crate latest-nightly -s psql.enabled=true -s psql.port=5434
/bin/java [...] -Des.cluster.routing.allocation.disk.watermark.low=1b -Des.cluster.routing.allocation.disk.watermark.high=1b -Des.discovery.initial_state_timeout=0 -Des.discovery.zen.ping.multicast.enabled=False -Des.network.host=127.0.0.1 -Des.udc.enabled=False -Des.cluster.name=cr8-crate-run210601515 -Des.psql.port=5434 -Des.path.data=/tmp/tmplg2b5t_a io.crate.bootstrap.CrateF

Note that -Des.psql.port=5434 appears in the resulting java invocation, but psql.enabled=true does not.

Use https as bug report


timeout for queries in spec files

If, due to a performance regression or bug, a query suddenly becomes a lot slower and takes hours instead of seconds, it stalls the whole benchmark suite and makes running the same specs across different versions troublesome.

It would be great to be able to define a per-query timeout so that such a query is interrupted and skipped if it takes longer to execute.
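
A minimal sketch of a per-query timeout with asyncio, which cr8 already uses for execution; the execute coroutine and the skip behavior are illustrative:

import asyncio

async def run_with_timeout(execute, stmt, timeout):
    # Interrupt and skip a statement that exceeds its per-query timeout
    # instead of letting it stall the whole benchmark suite.
    try:
        return await asyncio.wait_for(execute(stmt), timeout=timeout)
    except asyncio.TimeoutError:
        return None  # recorded as skipped in the results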

Use https or the progress protocol

