Code Monkey home page Code Monkey logo

spanner-cli's Introduction

spanner-cli run-tests

Interactive command line tool for Cloud Spanner.

gif

Description

spanner-cli is an interactive command line tool for Google Cloud Spanner.
You can control your Spanner databases with idiomatic SQL commands.

Install

Install Go and run the following command.

# For Go 1.16+
go install github.com/cloudspannerecosystem/spanner-cli@latest

# For Go <1.16
go get -u github.com/cloudspannerecosystem/spanner-cli

Or you can download the old binary from the releases.

Usage

Usage:
  spanner-cli [OPTIONS]

spanner:
  -p, --project=       (required) GCP Project ID. [$SPANNER_PROJECT_ID]
  -i, --instance=      (required) Cloud Spanner Instance ID [$SPANNER_INSTANCE_ID]
  -d, --database=      (required) Cloud Spanner Database ID. [$SPANNER_DATABASE_ID]
  -e, --execute=       Execute SQL statement and quit.
  -f, --file=          Execute SQL statement from file and quit.
  -t, --table          Display output in table format for batch mode.
  -v, --verbose        Display verbose output.
      --credential=    Use the specific credential file
      --prompt=        Set the prompt to the specified format
      --history=       Set the history file to the specified path
      --priority=      Set default request priority (HIGH|MEDIUM|LOW)
      --role=          Use the specific database role
      --directed-read= Directed read option (replica_location:replica_type).
                       The replicat_type is optional and either READ_ONLY or READ_WRITE.

Help Options:
  -h, --help        Show this help message

Unless you specify a credential file with --credential, this tool uses Application Default Credentials as credential source to connect to Spanner databases.
Please make sure to prepare your credential by gcloud auth application-default login.

Example

Interactive mode

$ spanner-cli -p myproject -i myinstance -d mydb
Connected.
spanner> CREATE TABLE users (
      ->   id INT64 NOT NULL,
      ->   name STRING(16) NOT NULL,
      ->   active BOOL NOT NULL
      -> ) PRIMARY KEY (id);
Query OK, 0 rows affected (30.60 sec)

spanner> SHOW TABLES;
+----------------+
| Tables_in_mydb |
+----------------+
| users          |
+----------------+
1 rows in set (18.66 msecs)

spanner> INSERT INTO users (id, name, active) VALUES (1, "foo", true), (2, "bar", false);
Query OK, 2 rows affected (5.08 sec)

spanner> SELECT * FROM users ORDER BY id ASC;
+----+------+--------+
| id | name | active |
+----+------+--------+
| 1  | foo  | true   |
| 2  | bar  | false  |
+----+------+--------+
2 rows in set (3.09 msecs)

spanner> BEGIN;
Query OK, 0 rows affected (0.02 sec)

spanner(rw txn)> DELETE FROM users WHERE active = false;
Query OK, 1 rows affected (0.61 sec)

spanner(rw txn)> COMMIT;
Query OK, 0 rows affected (0.20 sec)

spanner> SELECT * FROM users ORDER BY id ASC;
+----+------+--------+
| id | name | active |
+----+------+--------+
| 1  | foo  | true   |
+----+------+--------+
1 rows in set (2.58 msecs)

spanner> DROP TABLE users;
Query OK, 0 rows affected (25.20 sec)

spanner> SHOW TABLES;
Empty set (2.02 msecs)

spanner> EXIT;
Bye

Batch mode

By passing SQL from standard input, spanner-cli runs in batch mode.

$ echo 'SELECT * FROM users;' | spanner-cli -p myproject -i myinstance -d mydb
id      name    active
1       foo     true
2       bar     false

You can also pass SQL with command line option -e.

$ spanner-cli -p myproject -i myinstance -d mydb -e 'SELECT * FROM users;'
id      name    active
1       foo     true
2       bar     false

With -t option, results are displayed in table format.

$ spanner-cli -p myproject -i myinstance -d mydb -e 'SELECT * FROM users;' -t
+----+------+--------+
| id | name | active |
+----+------+--------+
| 1  | foo  | true   |
| 2  | bar  | false  |
+----+------+--------+

Directed reads mode

spanner-cli now supports directed reads, a feature that allows you to read data from a specific replica of a Spanner database. To use directed reads with spanner-cli, you need to specify the --directed-read flag. The --directed-read flag takes a single argument, which is the name of the replica that you want to read from. The replica name can be specified in one of the following formats:

  • <replica_location>
  • <replica_location>:<replica_type>

The <replica_location> specifies the region where the replica is located such as us-central1, asia-northeast2.
The <replica_type> specifies the type of the replica either READ_WRITE or READ_ONLY.

$ spanner-cli -p myproject -i myinstance -d mydb --directed-read us-central1

$ spanner-cli -p myproject -i myinstance -d mydb --directed-read us-central1:READ_ONLY

$ spanner-cli -p myproject -i myinstance -d mydb --directed-read asia-northeast2:READ_WRITE

Directed reads are only effective for single queries or queries within a read-only transaction. Please note that directed read options do not apply to queries within a read-write transaction.

Note

If you specify an incorrect region or type for directed reads, directed reads will not be enabled and your requsts won't be routed as expected. For example, in a multi-region configuration nam3, if you mistype us-east1 as us-east-1, the connection will succeed, but directed reads will not be enabled.

To perform directed reads to asia-northeast2 in a multi-region configuration asia1, you need to specify asia-northeast2 or asia-northeast2:READ_WRITE. Since the replicas placed in asia-northeast2 are READ_WRITE replicas, directed reads will not be enabled if you specify asia-northeast2:READ_ONLY.

Please refer to the Spanner documentation to verify the valid configurations.

Syntax

In the following syntax, we use <> for a placeholder, [] for an optional keyword, and {} for a mutually exclusive keyword.

  • The syntax is case-insensitive.
  • \G delimiter is also supported for displaying results vertically.
Usage Syntax Note
List databases SHOW DATABASES;
Switch database USE <database> [ROLE <role>]; The role you set is used for accessing with fine-grained access control.
Create database CREATE DATABSE <database>;
Drop database DROP DATABASE <database>;
List tables SHOW TABLES;
Show table schema SHOW CREATE TABLE <table>;
Show columns SHOW COLUMNS FROM <table>;
Show indexes SHOW INDEX FROM <table>;
Create table CREATE TABLE ...;
Change table schema ALTER TABLE ...;
Delete table DROP TABLE ...;
Truncate table TRUNCATE TABLE <table>; Only rows are deleted. Note: Non-atomically because executed as a partitioned DML statement.
Create index CREATE INDEX ...;
Delete index DROP INDEX ...;
Create role CREATE ROLE ...;
Drop role DROP ROLE ...;
Grant GRANT ...;
Revoke REVOKE ...;
Query SELECT ...;
DML {INSERT|UPDATE|DELETE} ...;
Partitioned DML PARTITIONED {UPDATE|DELETE} ...;
Show Query Execution Plan EXPLAIN SELECT ...;
Show DML Execution Plan EXPLAIN {INSERT|UPDATE|DELETE} ...;
Show Query Execution Plan with Stats EXPLAIN ANALYZE SELECT ...;
Show DML Execution Plan with Stats EXPLAIN ANALYZE {INSERT|UPDATE|DELETE} ...;
Start a new query optimizer statistics package construction ANALYZE;
Start Read-Write Transaction BEGIN [RW] [PRIORITY {HIGH|MEDIUM|LOW}] [TAG <tag>]; See Request Priority for details on the priority. The tag you set is used as both transaction tag and request tag. See also Transaction Tags and Request Tags.
Commit Read-Write Transaction COMMIT;
Rollback Read-Write Transaction ROLLBACK;
Start Read-Only Transaction BEGIN RO [{<seconds>|<RFC3339-formatted time>}] [PRIORITY {HIGH|MEDIUM|LOW}] [TAG <tag>]; <seconds> and <RFC3339-formatted time> is used for stale read. See Request Priority for details on the priority. The tag you set is used as request tag. See also Transaction Tags and Request Tags.
End Read-Only Transaction CLOSE;
Exit CLI EXIT;

Customize prompt

You can customize the prompt by --prompt option.
There are some defined variables for being used in prompt.

Variables:

  • \p : GCP Project ID
  • \i : Cloud Spanner Instance ID
  • \d : Cloud Spanner Database ID
  • \t : In transaction

Example:

$ spanner-cli -p myproject -i myinstance -d mydb --prompt='[\p:\i:\d]\t> '
Connected.
[myproject:myinstance:mydb]> SELECT * FROM users ORDER BY id ASC;
+----+------+--------+
| id | name | active |
+----+------+--------+
| 1  | foo  | true   |
| 2  | bar  | false  |
+----+------+--------+
2 rows in set (3.09 msecs)

[myproject:myinstance:mydb]> begin;
Query OK, 0 rows affected (0.08 sec)

[myproject:myinstance:mydb](rw txn)> ...

The default prompt is spanner\t> .

Config file

This tool supports a configuration file called spanner_cli.cnf, similar to my.cnf.
The config file path must be ~/.spanner_cli.cnf.
In the config file, you can set default option values for command line options.

Example:

[spanner]
project = myproject
instance = myinstance
prompt = "[\\p:\\i:\\d]\\t> "

Configuration Precedence

  1. Command line flags(highest)
  2. Environment variables
  3. .spanner_cli.cnf in current directory
  4. .spanner_cli.cnf in home directory(lowest)

Request Priority

You can set request priority for command level or transaction level. By default MEDIUM priority is used for every request.

To set a priority for command line level, you can use --priority={HIGH|MEDIUM|LOW} command line option.

To set a priority for transaction level, you can use PRIORITY {HIGH|MEDIUM|LOW} keyword.

Here are some examples for transaction-level priority.

# Read-write transaction with low priority
BEGIN PRIORITY LOW;

# Read-only transaction with low priority
BEGIN RO PRIORITY LOW;

# Read-only transaction with 60s stale read and medium priority
BEGIN RO 60 PRIORITY MEDIUM;

# Read-only transaction with exact timestamp and medium priority
BEGIN RO 2021-04-01T23:47:44+00:00 PRIORITY MEDIUM;

Note that transaction-level priority takes precedence over command-level priority.

Transaction Tags and Request Tags

In a read-write transaction, you can add a tag following BEGIN RW TAG <tag>. spanner-cli adds the tag set in BEGIN RW TAG as a transaction tag. The tag will also be used as request tags within the transaction.

# Read-write transaction
# transaction_tag = tx1
+--------------------+
| BEGIN RW TAG tx1;  |
|                    |
| SELECT val         |
| FROM tab1      +-----request_tag = tx1
| WHERE id = 1;      |
|                    |
| UPDATE tab1        |
| SET val = 10   +-----request_tag = tx1
| WHERE id = 1;      |
|                    |
| COMMIT;            |
+--------------------+

In a read-only transaction, you can add a tag following BEGIN RO TAG <tag>. Since read-only transaction doesn't support transaction tag, spanner-cli adds the tag set in BEGIN RO TAG as request tags.

# Read-only transaction
# transaction_tag = N/A
+--------------------+
| BEGIN RO TAG tx2;  |
|                    |
| SELECT SUM(val)    |
| FROM tab1      +-----request_tag = tx2
| WHERE id = 1;      |
|                    |
| CLOSE;             |
+--------------------+

Using with the Cloud Spanner Emulator

This tool supports the Cloud Spanner Emulator via the SPANNER_EMULATOR_HOST environment variable.

$ export SPANNER_EMULATOR_HOST=localhost:9010
# Or with gcloud env-init:
$ $(gcloud emulators spanner env-init)

$ spanner-cli -p myproject -i myinstance -d mydb

How to develop

Run unit tests.

$ make test

Run integration tests, which connects to real Cloud Spanner database.

$ PROJECT=${PROJECT_ID} INSTANCE=${INSTANCE_ID} DATABASE=${DATABASE_ID} CREDENTIAL=${CREDENTIAL} make test

TODO

  • Show secondary index by "SHOW CREATE TABLE"

Disclaimer

Do not use this tool for production databases as the tool is still alpha quality.

Please feel free to report issues and send pull requests, but note that this application is not officially supported as part of the Cloud Spanner product.

spanner-cli's People

Contributors

110y avatar apstndb avatar c2nes avatar derpferd avatar lgruen avatar magiccrafter avatar matope avatar michity15169 avatar takabow avatar tkuchiki avatar toga4 avatar yfuruyama avatar zchee avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

spanner-cli's Issues

Support Change Streams DDL

Support DDLs for managing change streams.

https://cloud.google.com/spanner/docs/change-streams/manage?hl=en

  • CREATE CHANGE STREAM <name> FOR ...
  • ALTER CHANGE STREAM <name> SET FOR ...
  • ALTER CHANGE STREAM <name> SET OPTIONS ...
  • ALTER CHANGE STREAM <name> DROP FOR ALL
  • DROP CHANGE STREAM <name>
spanner> CREATE CHANGE STREAM EverythingStream FOR ALL;
ERROR: invalid statement

I propose to support generic CREATE / DROP / ALTER DDL statements so that spanner-cli can respond to the new features of Cloud Spanner with a minimum of effort.

EXPLAIN <table> is confusing and undocumented

I have noticed during the work of #43, EXPLAIN without SELECT is matched with showColumnRe.

$ spanner-cli -e 'EXPLAIN WITH t AS (SELECT 1) SELECT * FROM t'
ERROR: table "WITH t AS (SELECT 1) SELECT * FROM t" doesn't exist

It is undocumented behavior and confusing result.
IMO, it should be decommissioned or documented.

Wounded transaction is inconsistent

I have found that once the transaction is aborted, the subsequent transaction get the different aborted result even if the same transaction was run.

Let me explain with the example.

Example:

This example is the Lost Update example I explained in the blog post (en, ja).

Let's call the left pane is txn1 and right pane is txn2.

Screen Shot 2020-12-18 at 19 02 03

In the 1st try, txn2 was aborted since txn1 had higher priority than txn2. In the 2nd try, however, txn1 was aborted even though txn1 started the SELECT statement earlier than txn2.

Use spanner-cli specific request tags for queries like `SELECT 1`

spanner-cli itself may issue SQLs in addition to SQLs directly issued by users.
For example, SELECT 1 issued for heartbeat purposes, etc.

Such queries appear in Cloud Spanner's query statistics and may confuse users who are unaware that spanner-cli is issuing them.
To identify such queries more easily, I suggest adding spanner-cli specific request tags to the queries.

As far as I can find, spanner-cli seems to issue its own SELECT query for the following uses.

  • SELECT 1 - At the beginning of a Read-write transaction
  • SELECT 1 - To check if DB exists.
  • SELECT 1 - Heartbeat
  • SELECT ... FROM INFORMATION_SCHEMA.TABLES ... - SHOW TABLES
  • SELECT ... FROM INFORMATION_SCHEMA.INDEXES ... - SHOW INDEX
  • SELECT ... FROM INFORMATION_SCHEMA.COLUMNS ... - SHOW COLUMNS

I am not sure what specific string of text to tag, so I need your opinion.

As an example, it seems that Cloud Console uses sys_cloud_console_* to issue some queries.

spanner> SELECT SUBSTR(text,0,20) AS text, request_tag,
      -> FROM spanner_sys.query_stats_top_minute
      -> WHERE request_tag <> "";
+----------------------+---------------------------------------------+
| text                 | request_tag                                 |
+----------------------+---------------------------------------------+
| SELECT TABLE_NAME, T | sys_cloud_console_DatabaseTables            |
| SELECT VIEW_DEFINITI | sys_cloud_console_TableDetailViewDefinition |
| SELECT OPTION_NAME,  | sys_cloud_console_DatabaseOptions           |
| SELECT INDEX_NAME, I | sys_cloud_console_TableDetailIndexes        |
| SELECT TABLE_NAME, C | sys_cloud_console_TableDetailTableColumnMap |
| SELECT TABLE_NAME, C | sys_cloud_console_DatabaseTableColumnMap    |
| SELECT TABLE_NAME, C | sys_cloud_console_TableDetailColumns        |
| SELECT TABLE_NAME, T | sys_cloud_console_TableDetail               |
| SELECT TABLE_NAME, T | sys_cloud_console_DatabaseTables            |
| SELECT OPTION_NAME,  | sys_cloud_console_DatabaseOptions           |
| SELECT TABLE_NAME, C | sys_cloud_console_DatabaseTableColumnMap    |
+----------------------+---------------------------------------------+

Install?

More detailed install instructions would be nice.
A lot of people using this are usually in a hurry and/or just want to do a quick PoC, without having to understand how golang works environment-wise, debugging stuff etc.

I get spanner-cli: command not found after performing the installation (Yes, go version prints the version and I've tried launching new terminals).

A FAQ could be useful.

Case-sensitiveness of identifiers in spanner-cli statements

In Cloud Spanner native statements, identifiers are case-insensitive and quoted identifiers are case-sensitive.
It should be emulated in spanner-cli statements.

  • Using INFORMATION_SCHEMA
    • SHOW INDEX
    • SHOW COLUMNS
    • SHOW CREATE TABLES
  • Not using INFORMATION_SCHEMA
    • DROP DATABASE
    • SHOW CREATE TABLES

Installation issue

Need some help on installation. Thanks!

Got the following installation error while trying to install it on a CentOS linux box, looks like some go lib can't be found, here are the full console log

go get -u github.com/cloudspannerecosystem/spanner-cli
go: downloading github.com/cloudspannerecosystem/spanner-cli v0.9.6
go: downloading cloud.google.com/go v0.79.0
go: downloading github.com/chzyer/readline v0.0.0-20180603132655-2972be24d48e
go: downloading github.com/jessevdk/go-flags v1.4.0
go: downloading github.com/olekukonko/tablewriter v0.0.4
go: downloading github.com/xlab/treeprint v1.0.1-0.20200715141336-10e0bc383e01
go: downloading google.golang.org/api v0.43.0
go: downloading google.golang.org/genproto v0.0.0-20210331142528-b7513248f0ba
go: downloading cloud.google.com/go/spanner v1.17.0
go: downloading google.golang.org/grpc v1.36.1
go: downloading github.com/olekukonko/tablewriter v0.0.5
go: downloading github.com/jessevdk/go-flags v1.5.0
go: downloading github.com/xlab/treeprint v1.1.0
go: downloading google.golang.org/protobuf v1.26.0
go: downloading golang.org/x/sys v0.0.0-20210320140829-1e4c9ba3b0c4
go: downloading github.com/mattn/go-runewidth v0.0.8
go: downloading cloud.google.com/go v0.81.0
go: downloading google.golang.org/grpc v1.37.0
go: downloading github.com/mattn/go-runewidth v0.0.12
go: downloading golang.org/x/sys v0.0.0-20210423185535-09eb48e85fd7
go: downloading google.golang.org/genproto v0.0.0-20210423144448-3a41ef94ed2b
go: downloading github.com/golang/protobuf v1.5.1
go: downloading github.com/googleapis/gax-go/v2 v2.0.5
go: downloading github.com/googleapis/gax-go v1.0.3
go: downloading go.opencensus.io v0.23.0
go: downloading github.com/golang/protobuf v1.5.2
go: downloading golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1
go: downloading golang.org/x/net v0.0.0-20210316092652-d523dce5a7f4
go: downloading github.com/jstemmer/go-junit-report v0.9.1
go: downloading golang.org/x/lint v0.0.0-20201208152925-83fdc39ff7b5
go: downloading golang.org/x/tools v0.1.0
go: downloading golang.org/x/net v0.0.0-20210423184538-5f58ad60dda6
go: downloading github.com/golang/groupcache v0.0.0-20200121045136-8c9f03a8e57e
go: downloading github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da
go: downloading golang.org/x/text v0.3.5
go: downloading golang.org/x/mod v0.4.1
go: downloading golang.org/x/mod v0.4.2
go: downloading golang.org/x/text v0.3.6
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden

"SHOW TABLES" doesn't work

"SHOW TABLES" will fail with the error below:

spanner> SHOW TABLES;
ERROR: spanner: code = "InvalidArgument", desc = "Syntax error: Unexpected \\\"-\\\" [at 1:41]\\nSELECT t.table_name AS Tables_in_xxxxx FROM information_schema.ta...\\n

Ability to set deadlines on long running operations

A useful feature would be to be able to set shorter (or longer!) deadlines on potentially long running operations, specifically SELECT but also INSERT WHERE SELECT, UPDATE and COMMIT.

For example, the default SELECT deadline (ExecuteStreamingSQL) is 3600s or 1hr. This means that if a user executes a SELECT with a full table scan on a large table, it could continue running for up to an hour without the database being aware that, the user has attempted to cancel it by killing the Spanner-CLI process. One can imagine a scenario where a user gets impatient, kills the client process and tries again several times this loading the DB without realising.

(This is unlike other DB engines which use a TCP connection to the CLI client, so that the server can detect when a client has been killed, and terminate any queries),

Suggestion:

  • Set a shorter deadline for potentially long running APIs by default (the console UI uses 5 mins)
  • When a DEADLINE EXCEEDED error is raised, report the error with some text indicating that the deadline can be extended (and how)

Format of STRUCT is ambiguous

Currently, spanner-cli formats of STRUCT values as [...].
I think it is ambiguous with format of ARRAY values.
(We know nested ARRAY and top-level STRUCT is forbid so strictly ambiguous.)

$ spanner-cli -e 'SELECT ARRAY(SELECT AS STRUCT 1, 2)'
[[1, 2]]
$ spanner-cli -e 'SELECT SAMPLE_LOCK_REQUESTS FROM SPANNER_SYS.LOCK_STATS_TOP_HOUR' -t
...
[[tbl._exists, Exclusive], [tbl._exists, Exclusive], [tbl._exists, Exclusive], [tbl._exists, Exclusive]]
...

How about does STRUCT use (...) like native notation?

Support Partitioned DML

Cloud Spanner has two execution modes for DML, DML and Partitioned DML, but currently spanner-cli only supports normal DML. It would be better to support Partitioned DML in spanner-cli since it can process large-scale operations over the entire rows.

One thing we might need to discuss would be syntax for executing a Partitioned DML.

Context: #92 (comment)

SHOW CHANGE STREAMS statement

refs #135 (comment)

  • Show watched tables and columns
  • Show FOR ALL or empty streams also
  • Show options(currently only retention_period)

Proposed layout

+----------------------------------------------------------------+----------------------+
| CHANGE_STREAM                                                  | OPTIONS              |
+----------------------------------------------------------------+----------------------+
| Empty                                                          |                      |
| EverythingStream FOR ALL                                       |                      |
| NamesAndAlbums FOR Singers(FirstName, LastName), Albums, Songs | retention_period=36h |
| SingerAlbumStream FOR Singers, Albums                          |                      |
+----------------------------------------------------------------+----------------------+

Semantics

SELECT CHANGE_STREAM_NAME || IF(`ALL`, " FOR ALL", IFNULL(" FOR " || (
  SELECT STRING_AGG(TABLE_NAME || IF(ALL_COLUMNS, "", FORMAT("(%s)", (
    SELECT STRING_AGG(COLUMN_NAME, ", ")
    FROM INFORMATION_SCHEMA.CHANGE_STREAM_COLUMNS CSC
    WHERE (CST.CHANGE_STREAM_NAME, CST.TABLE_NAME) = (CSC.CHANGE_STREAM_NAME, CSC.TABLE_NAME)
  ))), ", ")
  FROM INFORMATION_SCHEMA.CHANGE_STREAM_TABLES CST
  WHERE CS.CHANGE_STREAM_NAME = CST.CHANGE_STREAM_NAME
), "")) AS CHANGE_STREAM,
IFNULL((
  SELECT STRING_AGG(FORMAT("%s=%s", OPTION_NAME, OPTION_VALUE), ", ")
  FROM INFORMATION_SCHEMA.CHANGE_STREAM_OPTIONS CSO
  WHERE CS.CHANGE_STREAM_NAME = CSO.CHANGE_STREAM_NAME
), "") AS OPTIONS
FROM INFORMATION_SCHEMA.CHANGE_STREAMS CS

References

Partitioned DML over 30s always fails by context deadline exceeded

Partitioned DML over 30s always fails by context deadline exceeded.

$ time ./spanner-cli -p ${CLOUDSDK_CORE_PROJECT} -i ${CLOUDSDK_SPANNER_INSTANCE} -d ${DATABASE_ID} --execute='PARTITIONED UPDATE Songs SET Duration = Duration + 1 WHERE TRUE' 
ERROR: rpc error: code = DeadlineExceeded desc = context deadline exceeded
./spanner-cli -p ${CLOUDSDK_CORE_PROJECT} -i ${CLOUDSDK_SPANNER_INSTANCE} -d   0.35s user 0.47s system 2% cpu 34.068 total

gcloud databases execute-sql succeeds.

$ time gcloud spanner databases execute-sql --instance=merpay-sponsored-instance ${DATABASE_ID} --sql='UPDATE Songs SET Duration = Duration + 1 WHERE TRUE' --enable-partitioned-dml
Statement modified a lower bound of 1024000 rows
gcloud spanner databases execute-sql --instance=merpay-sponsored-instance     0.55s user 0.22s system 1% cpu 1:10.73 total

AttributeError: 'database_id'

I am getting following error to connect to the spanner instance using spanner-cli

Traceback (most recent call last):
File "/home/venv/lib/python3.9/site-packages/proto/message.py", line 561, in getattr
pb_type = self._meta.fields[key].pb_type
KeyError: 'database_id'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/venv/bin/spanner-cli", line 8, in
sys.exit(main.main())
File "/home/venv/lib/python3.9/site-packages/click/core.py", line 1137, in call
return self.main(*args, **kwargs)
File "/home/venv/lib/python3.9/site-packages/click/core.py", line 1062, in main
rv = self.invoke(ctx)
File "/home/venv/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/venv/lib/python3.9/site-packages/click/core.py", line 763, in invoke
return __callback(*args, **kwargs)
File "/home/venv/lib/python3.9/site-packages/spannercli/main.py", line 403, in main
cli = SpannerCli(
File "/home/venv/lib/python3.9/site-packages/spannercli/main.py", line 64, in init
self.rehash()
File "/home/venv/lib/python3.9/site-packages/spannercli/main.py", line 87, in rehash
self.set_completion_databases()
File "/home/venv/lib/python3.9/site-packages/spannercli/main.py", line 92, in set_completion_databases
data = self.list_databases()
File "/home/venv/lib/python3.9/site-packages/spannercli/main.py", line 127, in list_databases
data.append(d.database_id)
File "/home/venv/lib/python3.9/site-packages/proto/message.py", line 566, in getattr
raise AttributeError(str(ex))
AttributeError: 'database_id'

statement separator don't treat comments as a token separator

The statement separator implementation treat comments as empty but it should be treated as a kind of a token separator as like whitespaces.
This behavior can make query execution an error or even change the meaning of the query.

Example

spanner-cli evaluate SELECT 0x1/**/A as SELECT 0x1A.

$ spanner-cli -d ${DATABASE} --execute='SELECT 0x1/**/A' -t                                             
+----+
|    |
+----+
| 26 |
+----+

Expected behavior

It should be evaluated as SELECT 0x1 A(equivalent of SELECT 0x1 AS A).

$ gcloud spanner databases execute-sql ${DATABASE} --sql='SELECT 0x1/**/A'               
A
1

Related specification

https://cloud.google.com/spanner/docs/reference/standard-sql/lexical?hl=en

You can separate tokens with comments or whitespace such as spaces, backspaces, tabs, or newlines.

CI fails

The latest CI failed for the test case accessing real Spanner instance: https://github.com/cloudspannerecosystem/spanner-cli/actions/runs/4684507346/jobs/8300721270

=== RUN   TestDml
    integration_test.go:108: failed to create table: err=rpc error: code = Unauthenticated desc = Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.

I will wait for a few days to check if it's just a temporal issue.

Support connection without spanner.databases.get IAM permission

I have a service account that can't list databases but it would be nice to connect to a specific database if I know it by name and do stuff. I'm able to do that with the SQLWorkbench/J and the Simba JDBC driver for Spanner.

This is the error I get:

$ spanner-cli -p foo -i conversion -d conversion-prod -e 'select count(*) from account_owner'
2018/11/08 15:47:08 failed to connect to Spanner: rpc error: code = PermissionDenied desc = Resource projects/foo/instances/conversion/databases/conversion-prod is missing IAM permission: spanner.databases.get.

Version Retention

I'm unable to import from a spanner-dump file. Spanner-cli throws and error when it tries to run the first block
ALTER DATABASE `example-db` SET OPTIONS ( version_retention_period = '7d' ); ); id STRING(1024), ) PRIMARY KEY(id);

Document precedence of configurations.

Currently, spanner-cli has various ways of configuration.

  • command line options -p, -i, -d
  • Environment variables #55
  • .spanner_cli.cnf in $HOME and current directory #51

Although, precedence is not well documented.
It should be documented to prevent unexpected behaviors.

Show query stats for verbose mode

Feature Request

Now we have --verbose mode, it would be great if query stats could be shown in the query result.

Example,

spanner> select id from comments order by text asc limit 3;
+-------+
| id    |
+-------+
| 23462 |
| 7739  |
| 17666 |
+-------+
3 rows in set (16.15 msecs), timestamp: 2020-04-04T14:00:14.669758+09:00, rows_scanned: 100, data_bytes_read: 2315570, cpu_time: 13.76 msecs

spanner>

Consideration

Since query stats have multiple statistics information, we need to consider which information should be shown and how it should be visualized.

This is an example of the output from query stats.

stats: map[string]interface {}{"bytes_returned":"21", "cpu_time":"13.76 msecs", "data_bytes_read":"2315570", "deleted_rows_scanned":"6463", "elapsed_time":"16.15 msecs", "filesystem_delay_seconds":"0 msecs", "optimizer_version":"2", "query_plan_creation_time":"3.41 msecs", "query_text":"select id from comments order by text asc limit 3", "remote_server_calls":"0/0", "rows_returned":"3", "rows_scanned":"100", "runtime_creation_time":"0 msecs"}

Result of SHOW COLUMNS is wrong if the table name is same name with INFORMATION_SCHEMA

The result of SHOW COLUMNS command is corrupted if table name is same with table of INFORMATION_SCHEMA or SPANNER_SYS.
Join condition should be constrained by TABLE_SCHEMA.
https://cloud.google.com/spanner/docs/information-schema?hl=en

spanner> CREATE TABLE TABLES(pk INT64) PRIMARY KEY(pk);
Query OK, 0 rows affected (19.86 sec)

spanner> SHOW COLUMNS FROM TABLES;
+-------------------+-------------+------+-------------+-----------+---------+
| Field             | Type        | NULL | Key         | Key_Order | Options |
+-------------------+-------------+------+-------------+-----------+---------+
| pk                | INT64       | YES  | PRIMARY_KEY | ASC       | NULL    |
| TABLE_CATALOG     | STRING(MAX) | NO   | PRIMARY_KEY | ASC       | NULL    |
| pk                | INT64       | YES  | PRIMARY_KEY | ASC       | NULL    |
| TABLE_CATALOG     | STRING(MAX) | NO   | PRIMARY_KEY | ASC       | NULL    |
| TABLE_SCHEMA      | STRING(MAX) | NO   | PRIMARY_KEY | ASC       | NULL    |
| TABLE_SCHEMA      | STRING(MAX) | NO   | PRIMARY_KEY | ASC       | NULL    |
| TABLE_NAME        | STRING(MAX) | NO   | PRIMARY_KEY | ASC       | NULL    |
| TABLE_NAME        | STRING(MAX) | NO   | PRIMARY_KEY | ASC       | NULL    |
| PARENT_TABLE_NAME | STRING(MAX) | YES  | NULL        | NULL      | NULL    |
| ON_DELETE_ACTION  | STRING(MAX) | YES  | NULL        | NULL      | NULL    |
| SPANNER_STATE     | STRING(MAX) | YES  | NULL        | NULL      | NULL    |
+-------------------+-------------+------+-------------+-----------+---------+
11 rows in set (34.96 msecs)

spanner> CREATE TABLE QUERY_STATS_TOP_MINUTE(pk INT64) PRIMARY KEY(pk);
Query OK, 0 rows affected (19.86 sec)

spanner> SHOW COLUMNS FROM QUERY_STATS_TOP_MINUTE;
+---------------------+-------------+------+-------------+-----------+---------+
| Field               | Type        | NULL | Key         | Key_Order | Options |
+---------------------+-------------+------+-------------+-----------+---------+
| INTERVAL_END        | TIMESTAMP   | YES  | NULL        | NULL      | NULL    |
| pk                  | INT64       | YES  | PRIMARY_KEY | ASC       | NULL    |
| TEXT                | STRING(MAX) | YES  | NULL        | NULL      | NULL    |
| TEXT_TRUNCATED      | BOOL        | YES  | NULL        | NULL      | NULL    |
| TEXT_FINGERPRINT    | INT64       | YES  | NULL        | NULL      | NULL    |
| EXECUTION_COUNT     | INT64       | YES  | NULL        | NULL      | NULL    |
| AVG_LATENCY_SECONDS | FLOAT64     | YES  | NULL        | NULL      | NULL    |
| AVG_ROWS            | FLOAT64     | YES  | NULL        | NULL      | NULL    |
| AVG_BYTES           | FLOAT64     | YES  | NULL        | NULL      | NULL    |
| AVG_ROWS_SCANNED    | FLOAT64     | YES  | NULL        | NULL      | NULL    |
| AVG_CPU_SECONDS     | FLOAT64     | YES  | NULL        | NULL      | NULL    |
+---------------------+-------------+------+-------------+-----------+---------+
11 rows in set (13.01 msecs)

Support query-level optimizer version selection

https://github.com/googleapis/google-cloud-go/blob/master/spanner/CHANGES.md#v130

Adds the support of providing query options (optimizer version) via three ways (precedence follows the order): client-level < environment variables < query-level. The environment variable is set by "SPANNER_OPTIMIZER_VERSION".

The environment variables based optimizer version will work well but it requires restarting(or another spanner-cli instance).
It would be added by:

  • SET optimizer level-like configuration statement.
  • Option of EXPLAIN SELECT statement.

Note:

SELECT * FROM SPANNER_SYS.SUPPORTED_OPTIMIZER_VERSIONS

seems available.

Show progress of CREATE INDEX

Currently, spanner-cli doesn't show anything until CREATE INDEX is done.

Recently, Cloud Spanner supports tracking of the progress of index backfill operations.
https://cloud.google.com/spanner/docs/release-notes#April_06_2021

You can now track the progress of long-running index backfill operations through the gcloud command line tool, REST API, and RPC API. For more information, see Checking the progress of a secondary index backfill.

So we can show progress of index creation using polling.

Consideration

  • How to visualize progress?
    • Bar or percent?
    • With elapsed time
  • Multiple DDL statements

Condition of Merge Join is not displayed

spanner-cli can't display conditions of Merge Join.

example query (modified from https://cloud.google.com/spanner/docs/query-execution-operators?hl=en#merge-join)

EXPLAIN
SELECT a.AlbumTitle, s.SongName
FROM Albums AS a JOIN@{join_method=merge_join} Songs AS s
ON a.SingerId = s.SingerId AND a.AlbumId = s.AlbumId
AND a.AlbumTitle <> s.SongName;
$ spanner-cli -p ${CLOUDSDK_CORE_PROJECT} -i ${CLOUDSDK_SPANNER_INSTANCE} -d ${DATABASE_ID} -f test-case.sql -t                           
+----+---------------------------------------------------------------------------------+
| ID | Query_Execution_Plan (EXPERIMENTAL)                                             |
+----+---------------------------------------------------------------------------------+
|  0 | Distributed Union                                                               |
|  1 | +- Serialize Result                                                             |
|  2 |    +- Merge Join                                                                |
|  3 |       +- Local Distributed Union                                                |
|  4 |       |  +- Table Scan (Full scan: true, Table: Albums)                         |
|  8 |       +- Local Distributed Union                                                |
|  9 |          +- Index Scan (Full scan: true, Index: SongsBySingerAlbumSongNameDesc) |
+----+---------------------------------------------------------------------------------+

Actually, raw planNodes of Merge Join have conditions in childLinks[2](merge join condition) and childLinks[3](residual condition) without the child type.

            - childLinks:
                - childIndex: 3
                - childIndex: 8
                - childIndex: 13
                - childIndex: 20
              displayName: Merge Join
              index: 2
              kind: RELATIONAL
...
            - childLinks:
                - childIndex: 14
                - childIndex: 17
              displayName: Function
              index: 13
              kind: SCALAR
              shortRepresentation:
                description: (($SingerId = $SingerId_1) AND ($AlbumId = $AlbumId_1))
...
            - childLinks:
                - childIndex: 21
                - childIndex: 22
              displayName: Function
              index: 20
              kind: SCALAR
              shortRepresentation:
                description: ($AlbumTitle != $SongName)

So it can be displayed like

$ go run ./ -p ${CLOUDSDK_CORE_PROJECT} -i ${CLOUDSDK_SPANNER_INSTANCE} -d ${DATABASE_ID} -f test-case.sql -t
+----+---------------------------------------------------------------------------------+
| ID | Query_Execution_Plan (EXPERIMENTAL)                                             |
+----+---------------------------------------------------------------------------------+
|  0 | Distributed Union                                                               |
|  1 | +- Serialize Result                                                             |
| *2 |    +- Merge Join                                                                |
|  3 |       +- Local Distributed Union                                                |
|  4 |       |  +- Table Scan (Full scan: true, Table: Albums)                         |
|  8 |       +- Local Distributed Union                                                |
|  9 |          +- Index Scan (Full scan: true, Index: SongsBySingerAlbumSongNameDesc) |
+----+---------------------------------------------------------------------------------+
Predicates(identified by ID):
 2: Condition: (($SingerId = $SingerId_1) AND ($AlbumId = $AlbumId_1))
    Residual Condition: ($AlbumTitle != $SongName)

Note:
The official plan visualizerIt also can't display it but it should be displayed.
official visualizer

It needs to investigate LEFT [OUTER] JOIN, RIGHT [OUTER] JOIN, FULL [OUTER] JOIN.

Improve EXPLAIN (add child type)

The query plan has a large amount of raw information, so the information displayed is limited.
Some information may be reasonable to display additionally.

Child Type

Binary operators has Type of ChildLink and they are referred in Official document.
https://cloud.google.com/spanner/docs/query-execution-operators?hl=en#cross-apply

Web UI displays child type like below:
image

EXPLAIN of spanner-cli may be enhanced like below:

$ spanner-cli -e 'EXPLAIN SELECT * FROM Albums@{FORCE_INDEX=AlbumsByAlbumTitle}'
Query_Execution_Plan (EXPERIMENTAL)

.
└── Distributed Union (subquery_cluster_node: 1)
    └── Distributed Cross Apply (subquery_cluster_node: 13)
        ├── Create Batch 
        │   └── Distributed Union (call_type: Local, subquery_cluster_node: 4)
        │       └── Compute Struct 
        │           └── Scan (scan_type: IndexScan, Full scan: true, scan_target: AlbumsByAlbumTitle)
        └── [Map]  Serialize Result 
            └── Cross Apply 
                ├── Scan (scan_target: $v2, scan_type: BatchScan)
                └── [Map]  Distributed Union (call_type: Local, subquery_cluster_node: 20)
                    └── FilterScan 
                        └── Scan (scan_target: Albums, scan_type: TableScan)

(Real Cross Apply and Distributed Cross Apply seems not to have Input type with the first child.)

executing statement followed by comment fails, which seems a bug

Maybe, I found a regression related to processing a comment after a statement.

When the version is v0.9.16, the following command outputs 1.

echo 'select 1; /* comment */' > stmt.sql
spanner-cli -p $PROJECT -i $INSTANCE -d $DATABASE -f stmt.sql

However, when the version is v0.9.17 the above command fails with ERROR: invalid statement.

Does the behavior with v0.9.17 need correction?

Priority support

Intro

Recently Cloud Spanner released the explicit priority support for data requests to Cloud Spanner.

In terms of supporting priority in spanner-cli, there would be two things we need to consider:

  1. Default priority
  2. Allow users to specify the priority

Default priority

Without specifying the priority Cloud Spanner runs a request with HIGH priority, so the current requests from spanner-cli are running with HIGH priority.

spanner-cli, however, is not meant to be used for the situation where requests should be done earlier than end-user's requests, so it might be a good idea to use MEDIUM or LOW priority as default so that it won't disrupt the end-user's requests.

Allow users to specify the priority

To allow users to specify the priority we can think of two options.

  1. Session level
  2. Statement level

Session level can be achieved with the command option like --priority=LOW when starting up the CLI.

Statement level would be a little bit controversial as there is no query hint for priority. One idea would be treating a special comment as a priority hint (e.g. SELECT * FROM t # priority: LOW), but it doesn't look good.

As a reference, TiDB uses select HIGH_PRIORITY | LOW_PRIORITY * from t notation to specify the priority.

Finish read-write transaction when aborted

Currently, spanner-cli continues read-write transaction mode when the RPC returns a code "Aborted".

> BEGIN RW;
Query OK, 0 rows affected (0.01 sec)

(rw txn)> @{LOCK_SCANNED_RANGES=exclusive} SELECT * FROM tbl WHERE pk = 0;
Empty set (3.72 msecs)

(rw txn)> @{LOCK_SCANNED_RANGES=exclusive} SELECT * FROM tbl WHERE pk = 0;
ERROR: spanner: code = "Aborted", desc = "Transaction was aborted."
(rw txn)> BEGIN RW;
ERROR: you're in read-write transaction. Please finish the transaction by 'COMMIT;' or 'ROLLBACK;'
(rw txn)> COMMIT;
ERROR: spanner: code = "Aborted", desc = "Transaction was aborted."
> BEGIN RW;
Query OK, 0 rows affected (2.22 sec)

In this state, users can't execute another statement until execute ROLLBACK or COMMIT.
(^D terminates spanner-cli)

Expected behavior

Finish read-write transaction mode when aborted and users can continue another operations immediately.

> BEGIN RW;
Query OK, 0 rows affected (0.01 sec)

(rw txn)> @{LOCK_SCANNED_RANGES=exclusive} SELECT * FROM tbl WHERE pk = 0;
Empty set (3.72 msecs)

(rw txn)> @{LOCK_SCANNED_RANGES=exclusive} SELECT * FROM tbl WHERE pk = 0;
ERROR: spanner: code = "Aborted", desc = "Transaction was aborted."
> BEGIN RW;
Query OK, 0 rows affected (2.22 sec)

EXPLAIN with subqueries shows duplicated subtrees

Some plan with subqueries show duplicate subtrees.

https://cloud.google.com/spanner/docs/query-execution-operators?hl=en#array_subqueries

SELECT a.AlbumId,
ARRAY(SELECT ConcertDate
      FROM Concerts
      WHERE Concerts.SingerId = a.SingerId)
FROM Albums AS a;
+-----+------------------------------------------------------------------+
| ID  | Query_Execution_Plan (EXPERIMENTAL)                              |
+-----+------------------------------------------------------------------+
|   0 | Distributed Union                                                |
|   1 | +- Local Distributed Union                                       |
|   2 |    +- Serialize Result                                           |
|   3 |       +- Index Scan (Full scan: true, Index: AlbumsByAlbumTitle) |
|   7 |       +- Array Subquery                                          |
|  *8 |       |  +- Distributed Union                                    |
|   9 |       |     +- Local Distributed Union                           |
| *10 |       |        +- FilterScan                                     |
|  11 |       |           +- Index Scan (Index: ConcertsBySingerId)      |
|   7 |       +- [Scalar] Array Subquery                                 |
|  *8 |          +- Distributed Union                                    |
|   9 |             +- Local Distributed Union                           |
| *10 |                +- FilterScan                                     |
|  11 |                   +- Index Scan (Index: ConcertsBySingerId)      |
+-----+------------------------------------------------------------------+
Predicates(identified by ID):
  8: Split Range: ($SingerId_1 = $SingerId)
 10: Seek Condition: ($SingerId_1 = $SingerId)
  8: Split Range: ($SingerId_1 = $SingerId)
 10: Seek Condition: ($SingerId_1 = $SingerId)

This happens because the actual query plan have duplicate children.

    - childLinks:
      - childIndex: 3 -- link to Index Scan
      - childIndex: 6 -- link to Reference: $AlbumId
      - childIndex: 7 -- link to Array Subquery
      - childIndex: 7 -- link to Array Subquery(same as above)
        type: Scalar
      displayName: Serialize Result
      index: 2
      kind: RELATIONAL

Note:

Another scalar subquery only have a link with type: Scalar.
https://cloud.google.com/spanner/docs/query-execution-operators?hl=en#scalar_subqueries

SELECT FirstName,
IF(FirstName='Alice',
   (SELECT COUNT(*)
    FROM Songs
    WHERE Duration > 300),
   0)
FROM Singers;
+-----+-----------------------------------------------------------------------+
| ID  | Query_Execution_Plan (EXPERIMENTAL)                                   |
+-----+-----------------------------------------------------------------------+
|   0 | Distributed Union                                                     |
|   1 | +- Local Distributed Union                                            |
|   2 |    +- Serialize Result                                                |
|   3 |       +- Index Scan (Full scan: true, Index: SingersByFirstLastName)  |
|  10 |       +- [Scalar] Scalar Subquery                                     |
|  11 |          +- Global Stream Aggregate (scalar_aggregate: true)          |
|  12 |             +- Distributed Union                                      |
|  13 |                +- Local Stream Aggregate (scalar_aggregate: true)     |
|  14 |                   +- Local Distributed Union                          |
| *15 |                      +- FilterScan                                    |
|  16 |                         +- Table Scan (Full scan: true, Table: Songs) |
+-----+-----------------------------------------------------------------------+
Predicates(identified by ID):
 15: Residual Condition: ($Duration > 300)
    - childLinks:
      - childIndex: 3  -- link to Index Scan
      - childIndex: 5  -- link to Reference: $FirstName
      - childIndex: 6  -- link to Function: IF(($FirstName = 'Alice'), $sv_1, 0)
      - childIndex: 10 -- link to Scalar Subquery
        type: Scalar
      displayName: Serialize Result
      index: 2
      kind: RELATIONAL

Implementation choices:

  • When child link with same node ID is appeared in same parent node,
    ignore everything except the first one(or the last one?)
  • In Serialize Result, the first child and children with type: Scalar seems appropriate subtree.

Panic happens when user attempts to run EXPLAIN ANALYZE for emulator

spanner> explain analyze select * from Singers;
(*spanner.QueryPlan)(nil)                                                                                                                             
panic: runtime error: invalid memory address or nil pointer dereference        
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x160c53a]                           
                                                                                                                                                      
goroutine 1 [running]:                                                                                                                                
main.BuildQueryPlanTree(0x0, 0x0, 0x1)                                                                                                                
        .../spanner-cli/query_plan.go:76 +0x3a
main.processPlanImpl(0x0, 0xc000132001, 0x17bbf40, 0x4, 0xc00063fba8, 0x1, 0x1, 0x1a, 0x0, 0x0)                                                       
        .../spanner-cli/statement.go:552 +0xe6                                                    
main.processPlanWithStats(...)                                                                                                                        
        .../spanner-cli/statement.go:540
main.(*ExplainAnalyzeStatement).Execute(0xc0002d7200, 0xc00049c5b0, 0x0, 0x0, 0x0)                                                                    
        .../spanner-cli/statement.go:516 +0x35b
main.(*Cli).RunInteractive(0xc00049c620, 0x0)
        .../spanner-cli/cli.go:183 +0x4a2      
main.main()                                                                                        
        .../spanner-cli/main.go:106 +0x369

This is because emulator doesn't set query plan nodes to the result set.
https://github.com/GoogleCloudPlatform/cloud-spanner-emulator/blob/77188b228e7757cd56ecffb5bc3ee85dce5d6ae1/frontend/handlers/queries.cc#L224-L230

Send `show create table\g` occur Panic

spanner-cli version : 0.6.3

> show create table  tb1\g;

-panic: regexp: Compile(`(?i)^CREATE TABLE tb1\g`): error parsing regexp: invalid escape sequence: `\g`

goroutine 1 [running]:
regexp.MustCompile(0xc0000a6840, 0x17, 0x12)
	/usr/local/go/src/regexp/regexp.go:311 +0x152
main.(*ShowCreateTableStatement).Execute(0xc0005b4000, 0xc00045c8c0, 0x1d6d0a0, 0xc0005b4000, 0x0)
	/Users/x/go/pkg/mod/github.com/cloudspannerecosystem/[email protected]/statement.go:392 +0x315
main.(*Cli).RunInteractive(0xc00045c930, 0x0)
	/Users/x/go/pkg/mod/github.com/cloudspannerecosystem/[email protected]/cli.go:183 +0x4a2

Defer transaction heartbeat to the time when the first SELECT or DML is executed

We run the transaction heartbeat just after a user run BEGIN; in the interactive shell, but we can defer starting the heartbeat to the time when the first SELECT or DML is executed.

This is possible because Cloud Spanner allows users to call BeginTransaction RPC beforehand so that we can remove overhead of calling BeginTransaction when users want to run an actual transaction (e.g. See WriteSessions in Go).

It's important to start the heartbeat after a user runs a SELECT or DML because the heartbeat uses SELECT 1 query and it could be recognized as an "older transaction" during deadlock detection (document).

Query having `;` in the string literal fails execution in the batch mode

Issue

If a query has ; in the string literal, it fails due to the syntax error.

$ echo 'select "TL;DR";' | spanner-cli -p ${PROJECT} -i ${INSTANCE} -d ${DATABASE}
ERROR: spanner: code = "InvalidArgument", desc = "Syntax error: Unclosed string literal [at 1:8]\\nselect \\\"TL\\n       ^"

This is because spanner-cli simply separates each statement by delimiter ; or \G, so the query was separated in the middle of the statement.
https://github.com/cloudspannerecosystem/spanner-cli/blob/master/cli.go#L323

This only happens for batch execution mode.

Solutions

  1. Parse query and find appropriate delimiters. This is an ideal way, but there is no official Go parser yet (spansql would have less coverage of supporting queries).
  2. (Only for batch execution) Regard a single line as a single query. This looks like a simple solution, but we have to change option -e to allow only a single query.

Request tag(per-transaction vs. per-query?)

Currently, spanner-cli supports per-transaction tag and it is used as the request tag of all queries in the transaction.
#128

Users can be confused by the behavior that multiple queries with the same request tag string are grouped into a single row.

https://cloud.google.com/spanner/docs/introspection/troubleshooting-with-tags?hl=en#things_to_note

When you assign a REQUEST_TAG, statistics for multiple queries that have the same tag string are grouped in a single row in query statistics table. Only the text of one of those queries is shown in the TEXT field.

Possible actions:

a. Add a note to multiple queries with the same tag in the "Transaction Tags and Request Tags" section of the README
b. Add per-request request tag syntax with/without the explicit transaction

Related discussion: #129 (comment) (We are also not familiar with the behavior)

DDL batching support

spanner-cli -e" {statements} " can execute multiple statements, but they are executed sequentially.
If statements are DDLs, it should be executed as a single large batch in practice.
https://cloud.google.com/spanner/docs/schema-updates?hl=en#large-updates

Consideration:

  • add a flag like --batch or not
  • check all statements before execution and batch them only if they are all DDL statements
    • or batch a subsequence of statements that are consecutive DDL statements

Table output EXPLAIN seems broken border because ambiguous width

The output of EXPLAIN with -t(--table) seems broken border in some environment.

$ spanner-cli -e "EXPLAIN SELECT * FROM Songs@{FORCE_INDEX=SongsBySongName}" -t
+--------------------------------------------------------------------------------------------------------+
| Query_Execution_Plan (EXPERIMENTAL)                                                                    |
+--------------------------------------------------------------------------------------------------------+
|                                                                                                        |
| .                                                                                                      |
| └── Distributed Union (subquery_cluster_node: 1)                                                    |
|     └── Distributed Cross Apply (subquery_cluster_node: 15)                                         |
|         ├── Create Batch                                                                            |
|         │   └── Distributed Union (call_type: Local, subquery_cluster_node: 4)                     |
|         │       └── Compute Struct                                                                 |
|         │           └── Scan (scan_target: SongsBySongName, scan_type: IndexScan, Full scan: true) |
|         └── Serialize Result                                                                        |
|             └── Cross Apply                                                                         |
|                 ├── Scan (scan_target: $v2, scan_type: BatchScan)                                   |
|                 └── Distributed Union (call_type: Local, subquery_cluster_node: 23)                 |
|                     └── FilterScan                                                                  |
|                         └── Scan (scan_target: Songs, scan_type: TableScan)                         |
|                                                                                                        |
+--------------------------------------------------------------------------------------------------------+

It can be fixed by replacing to unambiguous width characters.
xlab/treeprint#2 (comment)

For example

func (n *Node) Render() string {
	tree := treeprint.New()
	renderTree(tree, &Link{Dest: n})
	return "\n" + strings.NewReplacer(
		"└", "+",
		"├", "+",
		"│", "|",
		"─", "-",
		).Replace(tree.String())
}
$ spanner-cli -e "EXPLAIN SELECT * FROM Songs@{FORCE_INDEX=SongsBySongName}" -t
+----------------------------------------------------------------------------------------------------+
| Query_Execution_Plan (EXPERIMENTAL)                                                                |
+----------------------------------------------------------------------------------------------------+
|                                                                                                    |
| .                                                                                                  |
| +-- Distributed Union (subquery_cluster_node: 1)                                                   |
|     +-- Distributed Cross Apply (subquery_cluster_node: 15)                                        |
|         +-- [Input]  Create Batch                                                                  |
|         |   +-- Distributed Union (call_type: Local, subquery_cluster_node: 4)                     |
|         |       +-- Compute Struct                                                                 |
|         |           +-- Scan (Full scan: true, scan_type: IndexScan, scan_target: SongsBySongName) |
|         +-- [Map]  Serialize Result                                                                |
|             +-- Cross Apply                                                                        |
|                 +-- [Input]  Scan (scan_target: $v2, scan_type: BatchScan)                         |
|                 +-- [Map]  Distributed Union (subquery_cluster_node: 23, call_type: Local)         |
|                     +-- FilterScan                                                                 |
|                         +-- Scan (scan_type: TableScan, scan_target: Songs)                        |
|                                                                                                    |
+----------------------------------------------------------------------------------------------------+

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.