cloudspannerecosystem / spanner-cli Goto Github PK

View Code? Open in Web Editor NEW

216.0 8.0 28.0 921 KB

Interactive command line tool for Cloud Spanner

License: Apache License 2.0

Go 99.36% Makefile 0.64%

spanner-cli's Introduction

spanner-cli

Interactive command line tool for Cloud Spanner.

Description

spanner-cli is an interactive command line tool for Google Cloud Spanner.
You can control your Spanner databases with idiomatic SQL commands.

Install

Install Go and run the following command.

# For Go 1.16+
go install github.com/cloudspannerecosystem/spanner-cli@latest

# For Go <1.16
go get -u github.com/cloudspannerecosystem/spanner-cli

Or you can download the old binary from the releases.

Usage

Usage:
  spanner-cli [OPTIONS]

spanner:
  -p, --project=       (required) GCP Project ID. [$SPANNER_PROJECT_ID]
  -i, --instance=      (required) Cloud Spanner Instance ID [$SPANNER_INSTANCE_ID]
  -d, --database=      (required) Cloud Spanner Database ID. [$SPANNER_DATABASE_ID]
  -e, --execute=       Execute SQL statement and quit.
  -f, --file=          Execute SQL statement from file and quit.
  -t, --table          Display output in table format for batch mode.
  -v, --verbose        Display verbose output.
      --credential=    Use the specific credential file
      --prompt=        Set the prompt to the specified format
      --history=       Set the history file to the specified path
      --priority=      Set default request priority (HIGH|MEDIUM|LOW)
      --role=          Use the specific database role
      --directed-read= Directed read option (replica_location:replica_type).
                       The replicat_type is optional and either READ_ONLY or READ_WRITE.

Help Options:
  -h, --help        Show this help message

Unless you specify a credential file with --credential, this tool uses Application Default Credentials as credential source to connect to Spanner databases.
Please make sure to prepare your credential by gcloud auth application-default login.

Example

Interactive mode

$ spanner-cli -p myproject -i myinstance -d mydb
Connected.
spanner> CREATE TABLE users (
      ->   id INT64 NOT NULL,
      ->   name STRING(16) NOT NULL,
      ->   active BOOL NOT NULL
      -> ) PRIMARY KEY (id);
Query OK, 0 rows affected (30.60 sec)

spanner> SHOW TABLES;
+----------------+
| Tables_in_mydb |
+----------------+
| users          |
+----------------+
1 rows in set (18.66 msecs)

spanner> INSERT INTO users (id, name, active) VALUES (1, "foo", true), (2, "bar", false);
Query OK, 2 rows affected (5.08 sec)

spanner> SELECT * FROM users ORDER BY id ASC;
+----+------+--------+
| id | name | active |
+----+------+--------+
| 1  | foo  | true   |
| 2  | bar  | false  |
+----+------+--------+
2 rows in set (3.09 msecs)

spanner> BEGIN;
Query OK, 0 rows affected (0.02 sec)

spanner(rw txn)> DELETE FROM users WHERE active = false;
Query OK, 1 rows affected (0.61 sec)

spanner(rw txn)> COMMIT;
Query OK, 0 rows affected (0.20 sec)

spanner> SELECT * FROM users ORDER BY id ASC;
+----+------+--------+
| id | name | active |
+----+------+--------+
| 1  | foo  | true   |
+----+------+--------+
1 rows in set (2.58 msecs)

spanner> DROP TABLE users;
Query OK, 0 rows affected (25.20 sec)

spanner> SHOW TABLES;
Empty set (2.02 msecs)

spanner> EXIT;
Bye

Batch mode

By passing SQL from standard input, spanner-cli runs in batch mode.

$ echo 'SELECT * FROM users;' | spanner-cli -p myproject -i myinstance -d mydb
id      name    active
1       foo     true
2       bar     false

You can also pass SQL with command line option -e.

$ spanner-cli -p myproject -i myinstance -d mydb -e 'SELECT * FROM users;'
id      name    active
1       foo     true
2       bar     false

With -t option, results are displayed in table format.

$ spanner-cli -p myproject -i myinstance -d mydb -e 'SELECT * FROM users;' -t
+----+------+--------+
| id | name | active |
+----+------+--------+
| 1  | foo  | true   |
| 2  | bar  | false  |
+----+------+--------+

Directed reads mode

spanner-cli now supports directed reads, a feature that allows you to read data from a specific replica of a Spanner database. To use directed reads with spanner-cli, you need to specify the --directed-read flag. The --directed-read flag takes a single argument, which is the name of the replica that you want to read from. The replica name can be specified in one of the following formats:

<replica_location>
<replica_location>:<replica_type>

The <replica_location> specifies the region where the replica is located such as us-central1, asia-northeast2.
The <replica_type> specifies the type of the replica either READ_WRITE or READ_ONLY.

$ spanner-cli -p myproject -i myinstance -d mydb --directed-read us-central1

$ spanner-cli -p myproject -i myinstance -d mydb --directed-read us-central1:READ_ONLY

$ spanner-cli -p myproject -i myinstance -d mydb --directed-read asia-northeast2:READ_WRITE

Directed reads are only effective for single queries or queries within a read-only transaction. Please note that directed read options do not apply to queries within a read-write transaction.

Note

If you specify an incorrect region or type for directed reads, directed reads will not be enabled and your requsts won't be routed as expected. For example, in a multi-region configuration nam3, if you mistype us-east1 as us-east-1, the connection will succeed, but directed reads will not be enabled.

To perform directed reads to asia-northeast2 in a multi-region configuration asia1, you need to specify asia-northeast2 or asia-northeast2:READ_WRITE. Since the replicas placed in asia-northeast2 are READ_WRITE replicas, directed reads will not be enabled if you specify asia-northeast2:READ_ONLY.

Please refer to the Spanner documentation to verify the valid configurations.

Syntax

In the following syntax, we use <> for a placeholder, [] for an optional keyword, and {} for a mutually exclusive keyword.

The syntax is case-insensitive.
\G delimiter is also supported for displaying results vertically.

Usage	Syntax	Note
List databases	`SHOW DATABASES;`
Switch database	`USE <database> [ROLE <role>];`	The role you set is used for accessing with fine-grained access control.
Create database	`CREATE DATABSE <database>;`
Drop database	`DROP DATABASE <database>;`
List tables	`SHOW TABLES;`
Show table schema	`SHOW CREATE TABLE <table>;`
Show columns	`SHOW COLUMNS FROM <table>;`
Show indexes	`SHOW INDEX FROM <table>;`
Create table	`CREATE TABLE ...;`
Change table schema	`ALTER TABLE ...;`
Delete table	`DROP TABLE ...;`
Truncate table	`TRUNCATE TABLE <table>;`	Only rows are deleted. Note: Non-atomically because executed as a partitioned DML statement.
Create index	`CREATE INDEX ...;`
Delete index	`DROP INDEX ...;`
Create role	`CREATE ROLE ...;`
Drop role	`DROP ROLE ...;`
Grant	`GRANT ...;`
Revoke	`REVOKE ...;`
Query	`SELECT ...;`
DML	`{INSERT\|UPDATE\|DELETE} ...;`
Partitioned DML	`PARTITIONED {UPDATE\|DELETE} ...;`
Show Query Execution Plan	`EXPLAIN SELECT ...;`
Show DML Execution Plan	`EXPLAIN {INSERT\|UPDATE\|DELETE} ...;`
Show Query Execution Plan with Stats	`EXPLAIN ANALYZE SELECT ...;`
Show DML Execution Plan with Stats	`EXPLAIN ANALYZE {INSERT\|UPDATE\|DELETE} ...;`
Start a new query optimizer statistics package construction	`ANALYZE;`
Start Read-Write Transaction	`BEGIN [RW] [PRIORITY {HIGH\|MEDIUM\|LOW}] [TAG <tag>];`	See Request Priority for details on the priority. The tag you set is used as both transaction tag and request tag. See also Transaction Tags and Request Tags.
Commit Read-Write Transaction	`COMMIT;`
Rollback Read-Write Transaction	`ROLLBACK;`
Start Read-Only Transaction	`BEGIN RO [{<seconds>\|<RFC3339-formatted time>}] [PRIORITY {HIGH\|MEDIUM\|LOW}] [TAG <tag>];`	`<seconds>` and `<RFC3339-formatted time>` is used for stale read. See Request Priority for details on the priority. The tag you set is used as request tag. See also Transaction Tags and Request Tags.
End Read-Only Transaction	`CLOSE;`
Exit CLI	`EXIT;`

Customize prompt

You can customize the prompt by --prompt option.
There are some defined variables for being used in prompt.

Variables:

\p : GCP Project ID
\i : Cloud Spanner Instance ID
\d : Cloud Spanner Database ID
\t : In transaction

Example:

$ spanner-cli -p myproject -i myinstance -d mydb --prompt='[\p:\i:\d]\t> '
Connected.
[myproject:myinstance:mydb]> SELECT * FROM users ORDER BY id ASC;
+----+------+--------+
| id | name | active |
+----+------+--------+
| 1  | foo  | true   |
| 2  | bar  | false  |
+----+------+--------+
2 rows in set (3.09 msecs)

[myproject:myinstance:mydb]> begin;
Query OK, 0 rows affected (0.08 sec)

[myproject:myinstance:mydb](rw txn)> ...

The default prompt is spanner\t> .

Config file

This tool supports a configuration file called spanner_cli.cnf, similar to my.cnf.
The config file path must be ~/.spanner_cli.cnf.
In the config file, you can set default option values for command line options.

Example:

[spanner]
project = myproject
instance = myinstance
prompt = "[\\p:\\i:\\d]\\t> "

Configuration Precedence

Command line flags(highest)
Environment variables
.spanner_cli.cnf in current directory
.spanner_cli.cnf in home directory(lowest)

Request Priority

You can set request priority for command level or transaction level. By default MEDIUM priority is used for every request.

To set a priority for command line level, you can use --priority={HIGH|MEDIUM|LOW} command line option.

To set a priority for transaction level, you can use PRIORITY {HIGH|MEDIUM|LOW} keyword.

Here are some examples for transaction-level priority.

# Read-write transaction with low priority
BEGIN PRIORITY LOW;

# Read-only transaction with low priority
BEGIN RO PRIORITY LOW;

# Read-only transaction with 60s stale read and medium priority
BEGIN RO 60 PRIORITY MEDIUM;

# Read-only transaction with exact timestamp and medium priority
BEGIN RO 2021-04-01T23:47:44+00:00 PRIORITY MEDIUM;

Note that transaction-level priority takes precedence over command-level priority.

Transaction Tags and Request Tags

In a read-write transaction, you can add a tag following BEGIN RW TAG <tag>. spanner-cli adds the tag set in BEGIN RW TAG as a transaction tag. The tag will also be used as request tags within the transaction.

# Read-write transaction
# transaction_tag = tx1
+--------------------+
| BEGIN RW TAG tx1;  |
|                    |
| SELECT val         |
| FROM tab1      +-----request_tag = tx1
| WHERE id = 1;      |
|                    |
| UPDATE tab1        |
| SET val = 10   +-----request_tag = tx1
| WHERE id = 1;      |
|                    |
| COMMIT;            |
+--------------------+

In a read-only transaction, you can add a tag following BEGIN RO TAG <tag>. Since read-only transaction doesn't support transaction tag, spanner-cli adds the tag set in BEGIN RO TAG as request tags.

# Read-only transaction
# transaction_tag = N/A
+--------------------+
| BEGIN RO TAG tx2;  |
|                    |
| SELECT SUM(val)    |
| FROM tab1      +-----request_tag = tx2
| WHERE id = 1;      |
|                    |
| CLOSE;             |
+--------------------+

Using with the Cloud Spanner Emulator

This tool supports the Cloud Spanner Emulator via the SPANNER_EMULATOR_HOST environment variable.

$ export SPANNER_EMULATOR_HOST=localhost:9010
# Or with gcloud env-init:
$ $(gcloud emulators spanner env-init)

$ spanner-cli -p myproject -i myinstance -d mydb

How to develop

Run unit tests.

$ make test

Run integration tests, which connects to real Cloud Spanner database.

$ PROJECT=${PROJECT_ID} INSTANCE=${INSTANCE_ID} DATABASE=${DATABASE_ID} CREDENTIAL=${CREDENTIAL} make test

TODO

Show secondary index by "SHOW CREATE TABLE"

Disclaimer

Do not use this tool for production databases as the tool is still alpha quality.

Please feel free to report issues and send pull requests, but note that this application is not officially supported as part of the Cloud Spanner product.

spanner-cli's People

Contributors

Stargazers

Watchers

spanner-cli's Issues

Support Change Streams DDL

Support DDLs for managing change streams.

https://cloud.google.com/spanner/docs/change-streams/manage?hl=en

CREATE CHANGE STREAM <name> FOR ...
ALTER CHANGE STREAM <name> SET FOR ...
ALTER CHANGE STREAM <name> SET OPTIONS ...
ALTER CHANGE STREAM <name> DROP FOR ALL
DROP CHANGE STREAM <name>

spanner> CREATE CHANGE STREAM EverythingStream FOR ALL;
ERROR: invalid statement

I propose to support generic CREATE / DROP / ALTER DDL statements so that spanner-cli can respond to the new features of Cloud Spanner with a minimum of effort.

EXPLAIN <table> is confusing and undocumented

I have noticed during the work of #43, EXPLAIN without SELECT is matched with showColumnRe.

$ spanner-cli -e 'EXPLAIN WITH t AS (SELECT 1) SELECT * FROM t'
ERROR: table "WITH t AS (SELECT 1) SELECT * FROM t" doesn't exist

It is undocumented behavior and confusing result.
IMO, it should be decommissioned or documented.

Update spanner library

There was a bug that was fixed in v1.45.0 - googleapis/google-cloud-go#7136
We are constantly getting this error.

Could you please release a new version with fresh spanner library?

Wounded transaction is inconsistent

I have found that once the transaction is aborted, the subsequent transaction get the different aborted result even if the same transaction was run.

Let me explain with the example.

Example:

This example is the Lost Update example I explained in the blog post (en, ja).

Let's call the left pane is txn1 and right pane is txn2.

In the 1st try, txn2 was aborted since txn1 had higher priority than txn2. In the 2nd try, however, txn1 was aborted even though txn1 started the SELECT statement earlier than txn2.

Use spanner-cli specific request tags for queries like `SELECT 1`

spanner-cli itself may issue SQLs in addition to SQLs directly issued by users.
For example, SELECT 1 issued for heartbeat purposes, etc.

Such queries appear in Cloud Spanner's query statistics and may confuse users who are unaware that spanner-cli is issuing them.
To identify such queries more easily, I suggest adding spanner-cli specific request tags to the queries.

As far as I can find, spanner-cli seems to issue its own SELECT query for the following uses.

SELECT 1 - At the beginning of a Read-write transaction
SELECT 1 - To check if DB exists.
SELECT 1 - Heartbeat
SELECT ... FROM INFORMATION_SCHEMA.TABLES ... - SHOW TABLES
SELECT ... FROM INFORMATION_SCHEMA.INDEXES ... - SHOW INDEX
SELECT ... FROM INFORMATION_SCHEMA.COLUMNS ... - SHOW COLUMNS

I am not sure what specific string of text to tag, so I need your opinion.

As an example, it seems that Cloud Console uses sys_cloud_console_* to issue some queries.

spanner> SELECT SUBSTR(text,0,20) AS text, request_tag,
      -> FROM spanner_sys.query_stats_top_minute
      -> WHERE request_tag <> "";
+----------------------+---------------------------------------------+
| text                 | request_tag                                 |
+----------------------+---------------------------------------------+
| SELECT TABLE_NAME, T | sys_cloud_console_DatabaseTables            |
| SELECT VIEW_DEFINITI | sys_cloud_console_TableDetailViewDefinition |
| SELECT OPTION_NAME,  | sys_cloud_console_DatabaseOptions           |
| SELECT INDEX_NAME, I | sys_cloud_console_TableDetailIndexes        |
| SELECT TABLE_NAME, C | sys_cloud_console_TableDetailTableColumnMap |
| SELECT TABLE_NAME, C | sys_cloud_console_DatabaseTableColumnMap    |
| SELECT TABLE_NAME, C | sys_cloud_console_TableDetailColumns        |
| SELECT TABLE_NAME, T | sys_cloud_console_TableDetail               |
| SELECT TABLE_NAME, T | sys_cloud_console_DatabaseTables            |
| SELECT OPTION_NAME,  | sys_cloud_console_DatabaseOptions           |
| SELECT TABLE_NAME, C | sys_cloud_console_DatabaseTableColumnMap    |
+----------------------+---------------------------------------------+

Install?

More detailed install instructions would be nice.
A lot of people using this are usually in a hurry and/or just want to do a quick PoC, without having to understand how golang works environment-wise, debugging stuff etc.

I get spanner-cli: command not found after performing the installation (Yes, go version prints the version and I've tried launching new terminals).

A FAQ could be useful.

Show predicates like Seek/Residual Conditions

Currently, EXPLAIN and EXPLAIN ANALYZE don't show predicate functions like Seek Condition, Residual Condition and Split Predicate.
They are described in Query execution operators docs and very useful to optimize queries.

It should be considered how to show predicates.

The display format of the execution plans in other databases may be helpful in designing.

https://use-the-index-luke.com/sql/explain-plan

Case-sensitiveness of identifiers in spanner-cli statements

In Cloud Spanner native statements, identifiers are case-insensitive and quoted identifiers are case-sensitive.
It should be emulated in spanner-cli statements.

Using INFORMATION_SCHEMA
- SHOW INDEX
- SHOW COLUMNS
- ~~SHOW CREATE TABLES~~
Not using INFORMATION_SCHEMA
- DROP DATABASE
- SHOW CREATE TABLES

Support more staleness options in `BEGIN RO`

Currently, spanner-cli only utilize staleness options of Strong Read by BEGIN RO and Exact Staleness by BEGIN RO <seconds>.
If other options Read Timestamp, Max Staleness, Min Read Timestamp is supported, it will be nice for testing purpose.
https://cloud.google.com/spanner/docs/reference/rpc/google.spanner.v1?hl=en#snapshot-read-only-transactions

Additionally, displaying a read timestamp will be useful for some purpose.

Support QueryMode=PROFILE a.k.a EXPLAIN ANALYZE

Currently, EXPLAIN statement work as QueryMode=PLAN.
I want to see the result of QueryMode=PROFILE.

It may be similar to EXPLAIN ANALYZE of PostgreSQL.
https://www.postgresql.org/docs/12/sql-explain.html

Considerations:

What runtime information is displayed?

Installation issue

Need some help on installation. Thanks!

Got the following installation error while trying to install it on a CentOS linux box, looks like some go lib can't be found, here are the full console log

go get -u github.com/cloudspannerecosystem/spanner-cli
go: downloading github.com/cloudspannerecosystem/spanner-cli v0.9.6
go: downloading cloud.google.com/go v0.79.0
go: downloading github.com/chzyer/readline v0.0.0-20180603132655-2972be24d48e
go: downloading github.com/jessevdk/go-flags v1.4.0
go: downloading github.com/olekukonko/tablewriter v0.0.4
go: downloading github.com/xlab/treeprint v1.0.1-0.20200715141336-10e0bc383e01
go: downloading google.golang.org/api v0.43.0
go: downloading google.golang.org/genproto v0.0.0-20210331142528-b7513248f0ba
go: downloading cloud.google.com/go/spanner v1.17.0
go: downloading google.golang.org/grpc v1.36.1
go: downloading github.com/olekukonko/tablewriter v0.0.5
go: downloading github.com/jessevdk/go-flags v1.5.0
go: downloading github.com/xlab/treeprint v1.1.0
go: downloading google.golang.org/protobuf v1.26.0
go: downloading golang.org/x/sys v0.0.0-20210320140829-1e4c9ba3b0c4
go: downloading github.com/mattn/go-runewidth v0.0.8
go: downloading cloud.google.com/go v0.81.0
go: downloading google.golang.org/grpc v1.37.0
go: downloading github.com/mattn/go-runewidth v0.0.12
go: downloading golang.org/x/sys v0.0.0-20210423185535-09eb48e85fd7
go: downloading google.golang.org/genproto v0.0.0-20210423144448-3a41ef94ed2b
go: downloading github.com/golang/protobuf v1.5.1
go: downloading github.com/googleapis/gax-go/v2 v2.0.5
go: downloading github.com/googleapis/gax-go v1.0.3
go: downloading go.opencensus.io v0.23.0
go: downloading github.com/golang/protobuf v1.5.2
go: downloading golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1
go: downloading golang.org/x/net v0.0.0-20210316092652-d523dce5a7f4
go: downloading github.com/jstemmer/go-junit-report v0.9.1
go: downloading golang.org/x/lint v0.0.0-20201208152925-83fdc39ff7b5
go: downloading golang.org/x/tools v0.1.0
go: downloading golang.org/x/net v0.0.0-20210423184538-5f58ad60dda6
go: downloading github.com/golang/groupcache v0.0.0-20200121045136-8c9f03a8e57e
go: downloading github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da
go: downloading golang.org/x/text v0.3.5
go: downloading golang.org/x/mod v0.4.1
go: downloading golang.org/x/mod v0.4.2
go: downloading golang.org/x/text v0.3.6
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden
go get: google.golang.org/[email protected]: reading https://proxy.golang.org/google.golang.org/genproto/@v/v0.0.0-20210423144448-3a41ef94ed2b.zip: 403 Forbidden

"SHOW TABLES" doesn't work

"SHOW TABLES" will fail with the error below:

spanner> SHOW TABLES;
ERROR: spanner: code = "InvalidArgument", desc = "Syntax error: Unexpected \\\"-\\\" [at 1:41]\\nSELECT t.table_name AS Tables_in_xxxxx FROM information_schema.ta...\\n

Ability to set deadlines on long running operations

A useful feature would be to be able to set shorter (or longer!) deadlines on potentially long running operations, specifically SELECT but also INSERT WHERE SELECT, UPDATE and COMMIT.

For example, the default SELECT deadline (ExecuteStreamingSQL) is 3600s or 1hr. This means that if a user executes a SELECT with a full table scan on a large table, it could continue running for up to an hour without the database being aware that, the user has attempted to cancel it by killing the Spanner-CLI process. One can imagine a scenario where a user gets impatient, kills the client process and tries again several times this loading the DB without realising.

(This is unlike other DB engines which use a TCP connection to the CLI client, so that the server can detect when a client has been killed, and terminate any queries),

Suggestion:

Set a shorter deadline for potentially long running APIs by default (the console UI uses 5 mins)
When a DEADLINE EXCEEDED error is raised, report the error with some text indicating that the deadline can be extended (and how)

Format of STRUCT is ambiguous

Currently, spanner-cli formats of STRUCT values as [...].
I think it is ambiguous with format of ARRAY values.
(We know nested ARRAY and top-level STRUCT is forbid so strictly ambiguous.)

$ spanner-cli -e 'SELECT ARRAY(SELECT AS STRUCT 1, 2)'
[[1, 2]]

$ spanner-cli -e 'SELECT SAMPLE_LOCK_REQUESTS FROM SPANNER_SYS.LOCK_STATS_TOP_HOUR' -t
...
[[tbl._exists, Exclusive], [tbl._exists, Exclusive], [tbl._exists, Exclusive], [tbl._exists, Exclusive]]
...

How about does STRUCT use (...) like native notation?

Support Partitioned DML

Cloud Spanner has two execution modes for DML, DML and Partitioned DML, but currently spanner-cli only supports normal DML. It would be better to support Partitioned DML in spanner-cli since it can process large-scale operations over the entire rows.

One thing we might need to discuss would be syntax for executing a Partitioned DML.

Context: #92 (comment)

SHOW CHANGE STREAMS statement

refs #135 (comment)

Show watched tables and columns
Show FOR ALL or empty streams also
Show options(currently only retention_period)

Proposed layout

+----------------------------------------------------------------+----------------------+
| CHANGE_STREAM                                                  | OPTIONS              |
+----------------------------------------------------------------+----------------------+
| Empty                                                          |                      |
| EverythingStream FOR ALL                                       |                      |
| NamesAndAlbums FOR Singers(FirstName, LastName), Albums, Songs | retention_period=36h |
| SingerAlbumStream FOR Singers, Albums                          |                      |
+----------------------------------------------------------------+----------------------+

Semantics

SELECT CHANGE_STREAM_NAME || IF(`ALL`, " FOR ALL", IFNULL(" FOR " || (
  SELECT STRING_AGG(TABLE_NAME || IF(ALL_COLUMNS, "", FORMAT("(%s)", (
    SELECT STRING_AGG(COLUMN_NAME, ", ")
    FROM INFORMATION_SCHEMA.CHANGE_STREAM_COLUMNS CSC
    WHERE (CST.CHANGE_STREAM_NAME, CST.TABLE_NAME) = (CSC.CHANGE_STREAM_NAME, CSC.TABLE_NAME)
  ))), ", ")
  FROM INFORMATION_SCHEMA.CHANGE_STREAM_TABLES CST
  WHERE CS.CHANGE_STREAM_NAME = CST.CHANGE_STREAM_NAME
), "")) AS CHANGE_STREAM,
IFNULL((
  SELECT STRING_AGG(FORMAT("%s=%s", OPTION_NAME, OPTION_VALUE), ", ")
  FROM INFORMATION_SCHEMA.CHANGE_STREAM_OPTIONS CSO
  WHERE CS.CHANGE_STREAM_NAME = CSO.CHANGE_STREAM_NAME
), "") AS OPTIONS
FROM INFORMATION_SCHEMA.CHANGE_STREAMS CS

References

Partitioned DML over 30s always fails by context deadline exceeded

Partitioned DML over 30s always fails by context deadline exceeded.

$ time ./spanner-cli -p ${CLOUDSDK_CORE_PROJECT} -i ${CLOUDSDK_SPANNER_INSTANCE} -d ${DATABASE_ID} --execute='PARTITIONED UPDATE Songs SET Duration = Duration + 1 WHERE TRUE' 
ERROR: rpc error: code = DeadlineExceeded desc = context deadline exceeded
./spanner-cli -p ${CLOUDSDK_CORE_PROJECT} -i ${CLOUDSDK_SPANNER_INSTANCE} -d   0.35s user 0.47s system 2% cpu 34.068 total

gcloud databases execute-sql succeeds.

$ time gcloud spanner databases execute-sql --instance=merpay-sponsored-instance ${DATABASE_ID} --sql='UPDATE Songs SET Duration = Duration + 1 WHERE TRUE' --enable-partitioned-dml
Statement modified a lower bound of 1024000 rows
gcloud spanner databases execute-sql --instance=merpay-sponsored-instance     0.55s user 0.22s system 1% cpu 1:10.73 total

AttributeError: 'database_id'

I am getting following error to connect to the spanner instance using spanner-cli

Traceback (most recent call last):
File "/home/venv/lib/python3.9/site-packages/proto/message.py", line 561, in getattr
pb_type = self._meta.fields[key].pb_type
KeyError: 'database_id'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/venv/bin/spanner-cli", line 8, in
sys.exit(main.main())
File "/home/venv/lib/python3.9/site-packages/click/core.py", line 1137, in call
return self.main(*args, **kwargs)
File "/home/venv/lib/python3.9/site-packages/click/core.py", line 1062, in main
rv = self.invoke(ctx)
File "/home/venv/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/venv/lib/python3.9/site-packages/click/core.py", line 763, in invoke
return __callback(*args, **kwargs)
File "/home/venv/lib/python3.9/site-packages/spannercli/main.py", line 403, in main
cli = SpannerCli(
File "/home/venv/lib/python3.9/site-packages/spannercli/main.py", line 64, in init
self.rehash()
File "/home/venv/lib/python3.9/site-packages/spannercli/main.py", line 87, in rehash
self.set_completion_databases()
File "/home/venv/lib/python3.9/site-packages/spannercli/main.py", line 92, in set_completion_databases
data = self.list_databases()
File "/home/venv/lib/python3.9/site-packages/spannercli/main.py", line 127, in list_databases
data.append(d.database_id)
File "/home/venv/lib/python3.9/site-packages/proto/message.py", line 566, in getattr
raise AttributeError(str(ex))
AttributeError: 'database_id'

statement separator don't treat comments as a token separator

The statement separator implementation treat comments as empty but it should be treated as a kind of a token separator as like whitespaces.
This behavior can make query execution an error or even change the meaning of the query.

Example

spanner-cli evaluate SELECT 0x1/**/A as SELECT 0x1A.

$ spanner-cli -d ${DATABASE} --execute='SELECT 0x1/**/A' -t                                             
+----+
|    |
+----+
| 26 |
+----+

Expected behavior

It should be evaluated as SELECT 0x1 A(equivalent of SELECT 0x1 AS A).

$ gcloud spanner databases execute-sql ${DATABASE} --sql='SELECT 0x1/**/A'               
A
1

Related specification

https://cloud.google.com/spanner/docs/reference/standard-sql/lexical?hl=en

You can separate tokens with comments or whitespace such as spaces, backspaces, tabs, or newlines.

CI fails

The latest CI failed for the test case accessing real Spanner instance: https://github.com/cloudspannerecosystem/spanner-cli/actions/runs/4684507346/jobs/8300721270

=== RUN   TestDml
    integration_test.go:108: failed to create table: err=rpc error: code = Unauthenticated desc = Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.

I will wait for a few days to check if it's just a temporal issue.

Support DDL statements for fine-grained access control

GRANT { privileges ON TABLE tablelist | ROLE rolelist } TO ROLE rolelist;
REVOKE { privileges ON TABLE tablelist | ROLE rolelist } FROM ROLE rolelist;

Note: CREATE and DROP are already supported as generic DDLs.

CREATE ROLE database_role_name;
DROP ROLE database_role_name;

https://cloud.google.com/spanner/docs/configure-fgac?hl=en
https://cloud.google.com/spanner/docs/reference/standard-sql/data-definition-language?hl=en

Support connection without spanner.databases.get IAM permission

I have a service account that can't list databases but it would be nice to connect to a specific database if I know it by name and do stuff. I'm able to do that with the SQLWorkbench/J and the Simba JDBC driver for Spanner.

This is the error I get:

$ spanner-cli -p foo -i conversion -d conversion-prod -e 'select count(*) from account_owner'
2018/11/08 15:47:08 failed to connect to Spanner: rpc error: code = PermissionDenied desc = Resource projects/foo/instances/conversion/databases/conversion-prod is missing IAM permission: spanner.databases.get.

Version Retention

I'm unable to import from a spanner-dump file. Spanner-cli throws and error when it tries to run the first block
ALTER DATABASE `example-db` SET OPTIONS ( version_retention_period = '7d' ); ); id STRING(1024), ) PRIMARY KEY(id);

Document precedence of configurations.

Currently, spanner-cli has various ways of configuration.

command line options -p, -i, -d
Environment variables #55
.spanner_cli.cnf in $HOME and current directory #51

Although, precedence is not well documented.
It should be documented to prevent unexpected behaviors.

Show query stats for verbose mode

Feature Request

Now we have --verbose mode, it would be great if query stats could be shown in the query result.

Example,

spanner> select id from comments order by text asc limit 3;
+-------+
| id    |
+-------+
| 23462 |
| 7739  |
| 17666 |
+-------+
3 rows in set (16.15 msecs), timestamp: 2020-04-04T14:00:14.669758+09:00, rows_scanned: 100, data_bytes_read: 2315570, cpu_time: 13.76 msecs

spanner>

Consideration

Since query stats have multiple statistics information, we need to consider which information should be shown and how it should be visualized.

This is an example of the output from query stats.

stats: map[string]interface {}{"bytes_returned":"21", "cpu_time":"13.76 msecs", "data_bytes_read":"2315570", "deleted_rows_scanned":"6463", "elapsed_time":"16.15 msecs", "filesystem_delay_seconds":"0 msecs", "optimizer_version":"2", "query_plan_creation_time":"3.41 msecs", "query_text":"select id from comments order by text asc limit 3", "remote_server_calls":"0/0", "rows_returned":"3", "rows_scanned":"100", "runtime_creation_time":"0 msecs"}

Result of SHOW COLUMNS is wrong if the table name is same name with INFORMATION_SCHEMA

The result of SHOW COLUMNS command is corrupted if table name is same with table of INFORMATION_SCHEMA or SPANNER_SYS.
Join condition should be constrained by TABLE_SCHEMA.
https://cloud.google.com/spanner/docs/information-schema?hl=en

spanner> CREATE TABLE TABLES(pk INT64) PRIMARY KEY(pk);
Query OK, 0 rows affected (19.86 sec)

spanner> SHOW COLUMNS FROM TABLES;
+-------------------+-------------+------+-------------+-----------+---------+
| Field             | Type        | NULL | Key         | Key_Order | Options |
+-------------------+-------------+------+-------------+-----------+---------+
| pk                | INT64       | YES  | PRIMARY_KEY | ASC       | NULL    |
| TABLE_CATALOG     | STRING(MAX) | NO   | PRIMARY_KEY | ASC       | NULL    |
| pk                | INT64       | YES  | PRIMARY_KEY | ASC       | NULL    |
| TABLE_CATALOG     | STRING(MAX) | NO   | PRIMARY_KEY | ASC       | NULL    |
| TABLE_SCHEMA      | STRING(MAX) | NO   | PRIMARY_KEY | ASC       | NULL    |
| TABLE_SCHEMA      | STRING(MAX) | NO   | PRIMARY_KEY | ASC       | NULL    |
| TABLE_NAME        | STRING(MAX) | NO   | PRIMARY_KEY | ASC       | NULL    |
| TABLE_NAME        | STRING(MAX) | NO   | PRIMARY_KEY | ASC       | NULL    |
| PARENT_TABLE_NAME | STRING(MAX) | YES  | NULL        | NULL      | NULL    |
| ON_DELETE_ACTION  | STRING(MAX) | YES  | NULL        | NULL      | NULL    |
| SPANNER_STATE     | STRING(MAX) | YES  | NULL        | NULL      | NULL    |
+-------------------+-------------+------+-------------+-----------+---------+
11 rows in set (34.96 msecs)

spanner> CREATE TABLE QUERY_STATS_TOP_MINUTE(pk INT64) PRIMARY KEY(pk);
Query OK, 0 rows affected (19.86 sec)

spanner> SHOW COLUMNS FROM QUERY_STATS_TOP_MINUTE;
+---------------------+-------------+------+-------------+-----------+---------+
| Field               | Type        | NULL | Key         | Key_Order | Options |
+---------------------+-------------+------+-------------+-----------+---------+
| INTERVAL_END        | TIMESTAMP   | YES  | NULL        | NULL      | NULL    |
| pk                  | INT64       | YES  | PRIMARY_KEY | ASC       | NULL    |
| TEXT                | STRING(MAX) | YES  | NULL        | NULL      | NULL    |
| TEXT_TRUNCATED      | BOOL        | YES  | NULL        | NULL      | NULL    |
| TEXT_FINGERPRINT    | INT64       | YES  | NULL        | NULL      | NULL    |
| EXECUTION_COUNT     | INT64       | YES  | NULL        | NULL      | NULL    |
| AVG_LATENCY_SECONDS | FLOAT64     | YES  | NULL        | NULL      | NULL    |
| AVG_ROWS            | FLOAT64     | YES  | NULL        | NULL      | NULL    |
| AVG_BYTES           | FLOAT64     | YES  | NULL        | NULL      | NULL    |
| AVG_ROWS_SCANNED    | FLOAT64     | YES  | NULL        | NULL      | NULL    |
| AVG_CPU_SECONDS     | FLOAT64     | YES  | NULL        | NULL      | NULL    |
+---------------------+-------------+------+-------------+-----------+---------+
11 rows in set (13.01 msecs)

Support query-level optimizer version selection

https://github.com/googleapis/google-cloud-go/blob/master/spanner/CHANGES.md#v130

Adds the support of providing query options (optimizer version) via three ways (precedence follows the order): client-level < environment variables < query-level. The environment variable is set by "SPANNER_OPTIMIZER_VERSION".

The environment variables based optimizer version will work well but it requires restarting(or another spanner-cli instance).
It would be added by:

SET optimizer level-like configuration statement.
Option of EXPLAIN SELECT statement.

Note:

SELECT * FROM SPANNER_SYS.SUPPORTED_OPTIMIZER_VERSIONS

seems available.

Proposal: support accessing database with fine-grained access control

CLI Option:

spanner-cli -p <project> -i <instance> -d <database> --role <role>

Syntax:

USE <database> WITH ROLE <role>;

https://cloud.google.com/spanner/docs/access-with-fgac?hl=en

Where's the binary?

https://github.com/cloudspannerecosystem/spanner-cli/blob/043d3b67688009327300f7e57bc3e6adc313c034/README.md

Enhance DDL batching

Currently, DDL batching is enabled only if all statements are DDL statements.
It can be enhanced to support consecutive DDLs which is followed by another statements.
It is useful to support the use case that database export and import database using DDLs and DMLs in a single file.
e.g. https://github.com/cloudspannerecosystem/spanner-dump

Show progress of CREATE INDEX

Currently, spanner-cli doesn't show anything until CREATE INDEX is done.

Recently, Cloud Spanner supports tracking of the progress of index backfill operations.
https://cloud.google.com/spanner/docs/release-notes#April_06_2021

You can now track the progress of long-running index backfill operations through the gcloud command line tool, REST API, and RPC API. For more information, see Checking the progress of a secondary index backfill.

So we can show progress of index creation using polling.

Consideration

How to visualize progress?
- Bar or percent?
- With elapsed time
Multiple DDL statements
- mix of CREATE INDEX and other DDLs which doesn't support progressPercent .

Is there a way to get query result as CSV?

It would be great if there is a way to get the query result as CSV to be able to export it to BigQuery or other databases.

Condition of Merge Join is not displayed

spanner-cli can't display conditions of Merge Join.

example query (modified from https://cloud.google.com/spanner/docs/query-execution-operators?hl=en#merge-join)

EXPLAIN
SELECT a.AlbumTitle, s.SongName
FROM Albums AS a JOIN@{join_method=merge_join} Songs AS s
ON a.SingerId = s.SingerId AND a.AlbumId = s.AlbumId
AND a.AlbumTitle <> s.SongName;

$ spanner-cli -p ${CLOUDSDK_CORE_PROJECT} -i ${CLOUDSDK_SPANNER_INSTANCE} -d ${DATABASE_ID} -f test-case.sql -t                           
+----+---------------------------------------------------------------------------------+
| ID | Query_Execution_Plan (EXPERIMENTAL)                                             |
+----+---------------------------------------------------------------------------------+
|  0 | Distributed Union                                                               |
|  1 | +- Serialize Result                                                             |
|  2 |    +- Merge Join                                                                |
|  3 |       +- Local Distributed Union                                                |
|  4 |       |  +- Table Scan (Full scan: true, Table: Albums)                         |
|  8 |       +- Local Distributed Union                                                |
|  9 |          +- Index Scan (Full scan: true, Index: SongsBySingerAlbumSongNameDesc) |
+----+---------------------------------------------------------------------------------+

Actually, raw planNodes of Merge Join have conditions in childLinks[2](merge join condition) and childLinks[3](residual condition) without the child type.

            - childLinks:
                - childIndex: 3
                - childIndex: 8
                - childIndex: 13
                - childIndex: 20
              displayName: Merge Join
              index: 2
              kind: RELATIONAL
...
            - childLinks:
                - childIndex: 14
                - childIndex: 17
              displayName: Function
              index: 13
              kind: SCALAR
              shortRepresentation:
                description: (($SingerId = $SingerId_1) AND ($AlbumId = $AlbumId_1))
...
            - childLinks:
                - childIndex: 21
                - childIndex: 22
              displayName: Function
              index: 20
              kind: SCALAR
              shortRepresentation:
                description: ($AlbumTitle != $SongName)

So it can be displayed like

$ go run ./ -p ${CLOUDSDK_CORE_PROJECT} -i ${CLOUDSDK_SPANNER_INSTANCE} -d ${DATABASE_ID} -f test-case.sql -t
+----+---------------------------------------------------------------------------------+
| ID | Query_Execution_Plan (EXPERIMENTAL)                                             |
+----+---------------------------------------------------------------------------------+
|  0 | Distributed Union                                                               |
|  1 | +- Serialize Result                                                             |
| *2 |    +- Merge Join                                                                |
|  3 |       +- Local Distributed Union                                                |
|  4 |       |  +- Table Scan (Full scan: true, Table: Albums)                         |
|  8 |       +- Local Distributed Union                                                |
|  9 |          +- Index Scan (Full scan: true, Index: SongsBySingerAlbumSongNameDesc) |
+----+---------------------------------------------------------------------------------+
Predicates(identified by ID):
 2: Condition: (($SingerId = $SingerId_1) AND ($AlbumId = $AlbumId_1))
    Residual Condition: ($AlbumTitle != $SongName)

Note:
The official plan visualizerIt also can't display it but it should be displayed.

It needs to investigate LEFT [OUTER] JOIN, RIGHT [OUTER] JOIN, FULL [OUTER] JOIN.

Utilize Cloud Spanner Emulator in integration test

integration_test.go can use Cloud Spanner Emulator.
refs googleapis/google-cloud-go@25803d8

DmlStatement should show result set of THEN RETURN

Currently, spanner-cli doesn't show result sets of DmlStatement.
UPDATE, INSERT, and DELETE with a THEN RETURN clause return result sets so it is better to support.

We can't use rwTxn.Update* and need to use rwTxn.Query*.
https://cloud.google.com/spanner/docs/dml-tasks?hl=en#client-library-dml-return

Improve EXPLAIN (add child type)

The query plan has a large amount of raw information, so the information displayed is limited.
Some information may be reasonable to display additionally.

Child Type

Binary operators has Type of ChildLink and they are referred in Official document.
https://cloud.google.com/spanner/docs/query-execution-operators?hl=en#cross-apply

Web UI displays child type like below:

EXPLAIN of spanner-cli may be enhanced like below:

$ spanner-cli -e 'EXPLAIN SELECT * FROM Albums@{FORCE_INDEX=AlbumsByAlbumTitle}'
Query_Execution_Plan (EXPERIMENTAL)

.
└── Distributed Union (subquery_cluster_node: 1)
    └── Distributed Cross Apply (subquery_cluster_node: 13)
        ├── Create Batch 
        │   └── Distributed Union (call_type: Local, subquery_cluster_node: 4)
        │       └── Compute Struct 
        │           └── Scan (scan_type: IndexScan, Full scan: true, scan_target: AlbumsByAlbumTitle)
        └── [Map]  Serialize Result 
            └── Cross Apply 
                ├── Scan (scan_target: $v2, scan_type: BatchScan)
                └── [Map]  Distributed Union (call_type: Local, subquery_cluster_node: 20)
                    └── FilterScan 
                        └── Scan (scan_target: Albums, scan_type: TableScan)

(Real Cross Apply and Distributed Cross Apply seems not to have Input type with the first child.)

Stdout affected row count when executed sql from file

When update sql is executed with -f flag, the result is not shown in stdout.
It would be better to stdout update/delete result as well

Update the installation instruction to match the latest golang module installation approach

executing statement followed by comment fails, which seems a bug

Maybe, I found a regression related to processing a comment after a statement.

When the version is v0.9.16, the following command outputs 1.

echo 'select 1; /* comment */' > stmt.sql
spanner-cli -p $PROJECT -i $INSTANCE -d $DATABASE -f stmt.sql

However, when the version is v0.9.17 the above command fails with ERROR: invalid statement.

Does the behavior with v0.9.17 need correction?

Priority support

Intro

Recently Cloud Spanner released the explicit priority support for data requests to Cloud Spanner.

In terms of supporting priority in spanner-cli, there would be two things we need to consider:

Default priority
Allow users to specify the priority

Default priority

Without specifying the priority Cloud Spanner runs a request with HIGH priority, so the current requests from spanner-cli are running with HIGH priority.

spanner-cli, however, is not meant to be used for the situation where requests should be done earlier than end-user's requests, so it might be a good idea to use MEDIUM or LOW priority as default so that it won't disrupt the end-user's requests.

Allow users to specify the priority

To allow users to specify the priority we can think of two options.

Session level
Statement level

Session level can be achieved with the command option like --priority=LOW when starting up the CLI.

Statement level would be a little bit controversial as there is no query hint for priority. One idea would be treating a special comment as a priority hint (e.g. SELECT * FROM t # priority: LOW), but it doesn't look good.

As a reference, TiDB uses select HIGH_PRIORITY | LOW_PRIORITY * from t notation to specify the priority.

Finish read-write transaction when aborted

Currently, spanner-cli continues read-write transaction mode when the RPC returns a code "Aborted".

> BEGIN RW;
Query OK, 0 rows affected (0.01 sec)

(rw txn)> @{LOCK_SCANNED_RANGES=exclusive} SELECT * FROM tbl WHERE pk = 0;
Empty set (3.72 msecs)

(rw txn)> @{LOCK_SCANNED_RANGES=exclusive} SELECT * FROM tbl WHERE pk = 0;
ERROR: spanner: code = "Aborted", desc = "Transaction was aborted."
(rw txn)> BEGIN RW;
ERROR: you're in read-write transaction. Please finish the transaction by 'COMMIT;' or 'ROLLBACK;'
(rw txn)> COMMIT;
ERROR: spanner: code = "Aborted", desc = "Transaction was aborted."
> BEGIN RW;
Query OK, 0 rows affected (2.22 sec)

In this state, users can't execute another statement until execute ROLLBACK or COMMIT.
(^D terminates spanner-cli)

Expected behavior

Finish read-write transaction mode when aborted and users can continue another operations immediately.

> BEGIN RW;
Query OK, 0 rows affected (0.01 sec)

(rw txn)> @{LOCK_SCANNED_RANGES=exclusive} SELECT * FROM tbl WHERE pk = 0;
Empty set (3.72 msecs)

(rw txn)> @{LOCK_SCANNED_RANGES=exclusive} SELECT * FROM tbl WHERE pk = 0;
ERROR: spanner: code = "Aborted", desc = "Transaction was aborted."
> BEGIN RW;
Query OK, 0 rows affected (2.22 sec)

EXPLAIN with subqueries shows duplicated subtrees

Some plan with subqueries show duplicate subtrees.

https://cloud.google.com/spanner/docs/query-execution-operators?hl=en#array_subqueries

SELECT a.AlbumId,
ARRAY(SELECT ConcertDate
      FROM Concerts
      WHERE Concerts.SingerId = a.SingerId)
FROM Albums AS a;

+-----+------------------------------------------------------------------+
| ID  | Query_Execution_Plan (EXPERIMENTAL)                              |
+-----+------------------------------------------------------------------+
|   0 | Distributed Union                                                |
|   1 | +- Local Distributed Union                                       |
|   2 |    +- Serialize Result                                           |
|   3 |       +- Index Scan (Full scan: true, Index: AlbumsByAlbumTitle) |
|   7 |       +- Array Subquery                                          |
|  *8 |       |  +- Distributed Union                                    |
|   9 |       |     +- Local Distributed Union                           |
| *10 |       |        +- FilterScan                                     |
|  11 |       |           +- Index Scan (Index: ConcertsBySingerId)      |
|   7 |       +- [Scalar] Array Subquery                                 |
|  *8 |          +- Distributed Union                                    |
|   9 |             +- Local Distributed Union                           |
| *10 |                +- FilterScan                                     |
|  11 |                   +- Index Scan (Index: ConcertsBySingerId)      |
+-----+------------------------------------------------------------------+
Predicates(identified by ID):
  8: Split Range: ($SingerId_1 = $SingerId)
 10: Seek Condition: ($SingerId_1 = $SingerId)
  8: Split Range: ($SingerId_1 = $SingerId)
 10: Seek Condition: ($SingerId_1 = $SingerId)

This happens because the actual query plan have duplicate children.

    - childLinks:
      - childIndex: 3 -- link to Index Scan
      - childIndex: 6 -- link to Reference: $AlbumId
      - childIndex: 7 -- link to Array Subquery
      - childIndex: 7 -- link to Array Subquery(same as above)
        type: Scalar
      displayName: Serialize Result
      index: 2
      kind: RELATIONAL

Note:

Another scalar subquery only have a link with type: Scalar.
https://cloud.google.com/spanner/docs/query-execution-operators?hl=en#scalar_subqueries

SELECT FirstName,
IF(FirstName='Alice',
   (SELECT COUNT(*)
    FROM Songs
    WHERE Duration > 300),
   0)
FROM Singers;

+-----+-----------------------------------------------------------------------+
| ID  | Query_Execution_Plan (EXPERIMENTAL)                                   |
+-----+-----------------------------------------------------------------------+
|   0 | Distributed Union                                                     |
|   1 | +- Local Distributed Union                                            |
|   2 |    +- Serialize Result                                                |
|   3 |       +- Index Scan (Full scan: true, Index: SingersByFirstLastName)  |
|  10 |       +- [Scalar] Scalar Subquery                                     |
|  11 |          +- Global Stream Aggregate (scalar_aggregate: true)          |
|  12 |             +- Distributed Union                                      |
|  13 |                +- Local Stream Aggregate (scalar_aggregate: true)     |
|  14 |                   +- Local Distributed Union                          |
| *15 |                      +- FilterScan                                    |
|  16 |                         +- Table Scan (Full scan: true, Table: Songs) |
+-----+-----------------------------------------------------------------------+
Predicates(identified by ID):
 15: Residual Condition: ($Duration > 300)

    - childLinks:
      - childIndex: 3  -- link to Index Scan
      - childIndex: 5  -- link to Reference: $FirstName
      - childIndex: 6  -- link to Function: IF(($FirstName = 'Alice'), $sv_1, 0)
      - childIndex: 10 -- link to Scalar Subquery
        type: Scalar
      displayName: Serialize Result
      index: 2
      kind: RELATIONAL

Implementation choices:

When child link with same node ID is appeared in same parent node,
ignore everything except the first one(or the last one?)
In Serialize Result, the first child and children with type: Scalar seems appropriate subtree.

Panic happens when user attempts to run EXPLAIN ANALYZE for emulator

spanner> explain analyze select * from Singers;
(*spanner.QueryPlan)(nil)                                                                                                                             
panic: runtime error: invalid memory address or nil pointer dereference        
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x160c53a]                           
                                                                                                                                                      
goroutine 1 [running]:                                                                                                                                
main.BuildQueryPlanTree(0x0, 0x0, 0x1)                                                                                                                
        .../spanner-cli/query_plan.go:76 +0x3a
main.processPlanImpl(0x0, 0xc000132001, 0x17bbf40, 0x4, 0xc00063fba8, 0x1, 0x1, 0x1a, 0x0, 0x0)                                                       
        .../spanner-cli/statement.go:552 +0xe6                                                    
main.processPlanWithStats(...)                                                                                                                        
        .../spanner-cli/statement.go:540
main.(*ExplainAnalyzeStatement).Execute(0xc0002d7200, 0xc00049c5b0, 0x0, 0x0, 0x0)                                                                    
        .../spanner-cli/statement.go:516 +0x35b
main.(*Cli).RunInteractive(0xc00049c620, 0x0)
        .../spanner-cli/cli.go:183 +0x4a2      
main.main()                                                                                        
        .../spanner-cli/main.go:106 +0x369

This is because emulator doesn't set query plan nodes to the result set.
https://github.com/GoogleCloudPlatform/cloud-spanner-emulator/blob/77188b228e7757cd56ecffb5bc3ee85dce5d6ae1/frontend/handlers/queries.cc#L224-L230

Send `show create table\g` occur Panic

spanner-cli version : 0.6.3

> show create table  tb1\g;

-panic: regexp: Compile(`(?i)^CREATE TABLE tb1\g`): error parsing regexp: invalid escape sequence: `\g`

goroutine 1 [running]:
regexp.MustCompile(0xc0000a6840, 0x17, 0x12)
	/usr/local/go/src/regexp/regexp.go:311 +0x152
main.(*ShowCreateTableStatement).Execute(0xc0005b4000, 0xc00045c8c0, 0x1d6d0a0, 0xc0005b4000, 0x0)
	/Users/x/go/pkg/mod/github.com/cloudspannerecosystem/[email protected]/statement.go:392 +0x315
main.(*Cli).RunInteractive(0xc00045c930, 0x0)
	/Users/x/go/pkg/mod/github.com/cloudspannerecosystem/[email protected]/cli.go:183 +0x4a2

Defer transaction heartbeat to the time when the first SELECT or DML is executed

We run the transaction heartbeat just after a user run BEGIN; in the interactive shell, but we can defer starting the heartbeat to the time when the first SELECT or DML is executed.

This is possible because Cloud Spanner allows users to call BeginTransaction RPC beforehand so that we can remove overhead of calling BeginTransaction when users want to run an actual transaction (e.g. See WriteSessions in Go).

It's important to start the heartbeat after a user runs a SELECT or DML because the heartbeat uses SELECT 1 query and it could be recognized as an "older transaction" during deadlock detection (document).

Query having `;` in the string literal fails execution in the batch mode

Issue

If a query has ; in the string literal, it fails due to the syntax error.

$ echo 'select "TL;DR";' | spanner-cli -p ${PROJECT} -i ${INSTANCE} -d ${DATABASE}
ERROR: spanner: code = "InvalidArgument", desc = "Syntax error: Unclosed string literal [at 1:8]\\nselect \\\"TL\\n       ^"

This is because spanner-cli simply separates each statement by delimiter ; or \G, so the query was separated in the middle of the statement.
https://github.com/cloudspannerecosystem/spanner-cli/blob/master/cli.go#L323

This only happens for batch execution mode.

Solutions

Parse query and find appropriate delimiters. This is an ideal way, but there is no official Go parser yet (spansql would have less coverage of supporting queries).
(Only for batch execution) Regard a single line as a single query. This looks like a simple solution, but we have to change option -e to allow only a single query.

Request tag(per-transaction vs. per-query?)

Currently, spanner-cli supports per-transaction tag and it is used as the request tag of all queries in the transaction.
#128

Users can be confused by the behavior that multiple queries with the same request tag string are grouped into a single row.

https://cloud.google.com/spanner/docs/introspection/troubleshooting-with-tags?hl=en#things_to_note

When you assign a REQUEST_TAG, statistics for multiple queries that have the same tag string are grouped in a single row in query statistics table. Only the text of one of those queries is shown in the TEXT field.

Possible actions:

a. Add a note to multiple queries with the same tag in the "Transaction Tags and Request Tags" section of the README
b. Add per-request request tag syntax with/without the explicit transaction

Related discussion: #129 (comment) (We are also not familiar with the behavior)

DDL batching support

spanner-cli -e" {statements} " can execute multiple statements, but they are executed sequentially.
If statements are DDLs, it should be executed as a single large batch in practice.
https://cloud.google.com/spanner/docs/schema-updates?hl=en#large-updates

Consideration:

add a flag like --batch or not
check all statements before execution and batch them only if they are all DDL statements
- or batch a subsequence of statements that are consecutive DDL statements

Table output EXPLAIN seems broken border because ambiguous width

The output of EXPLAIN with -t(--table) seems broken border in some environment.

$ spanner-cli -e "EXPLAIN SELECT * FROM Songs@{FORCE_INDEX=SongsBySongName}" -t
+--------------------------------------------------------------------------------------------------------+
| Query_Execution_Plan (EXPERIMENTAL)                                                                    |
+--------------------------------------------------------------------------------------------------------+
|                                                                                                        |
| .                                                                                                      |
| └── Distributed Union (subquery_cluster_node: 1)                                                    |
|     └── Distributed Cross Apply (subquery_cluster_node: 15)                                         |
|         ├── Create Batch                                                                            |
|         │   └── Distributed Union (call_type: Local, subquery_cluster_node: 4)                     |
|         │       └── Compute Struct                                                                 |
|         │           └── Scan (scan_target: SongsBySongName, scan_type: IndexScan, Full scan: true) |
|         └── Serialize Result                                                                        |
|             └── Cross Apply                                                                         |
|                 ├── Scan (scan_target: $v2, scan_type: BatchScan)                                   |
|                 └── Distributed Union (call_type: Local, subquery_cluster_node: 23)                 |
|                     └── FilterScan                                                                  |
|                         └── Scan (scan_target: Songs, scan_type: TableScan)                         |
|                                                                                                        |
+--------------------------------------------------------------------------------------------------------+

It can be fixed by replacing to unambiguous width characters.
xlab/treeprint#2 (comment)

For example

func (n *Node) Render() string {
	tree := treeprint.New()
	renderTree(tree, &Link{Dest: n})
	return "\n" + strings.NewReplacer(
		"└", "+",
		"├", "+",
		"│", "|",
		"─", "-",
		).Replace(tree.String())
}

$ spanner-cli -e "EXPLAIN SELECT * FROM Songs@{FORCE_INDEX=SongsBySongName}" -t
+----------------------------------------------------------------------------------------------------+
| Query_Execution_Plan (EXPERIMENTAL)                                                                |
+----------------------------------------------------------------------------------------------------+
|                                                                                                    |
| .                                                                                                  |
| +-- Distributed Union (subquery_cluster_node: 1)                                                   |
|     +-- Distributed Cross Apply (subquery_cluster_node: 15)                                        |
|         +-- [Input]  Create Batch                                                                  |
|         |   +-- Distributed Union (call_type: Local, subquery_cluster_node: 4)                     |
|         |       +-- Compute Struct                                                                 |
|         |           +-- Scan (Full scan: true, scan_type: IndexScan, scan_target: SongsBySongName) |
|         +-- [Map]  Serialize Result                                                                |
|             +-- Cross Apply                                                                        |
|                 +-- [Input]  Scan (scan_target: $v2, scan_type: BatchScan)                         |
|                 +-- [Map]  Distributed Union (subquery_cluster_node: 23, call_type: Local)         |
|                     +-- FilterScan                                                                 |
|                         +-- Scan (scan_type: TableScan, scan_target: Songs)                        |
|                                                                                                    |
+----------------------------------------------------------------------------------------------------+

Support EXPLAIN of DML

DML has execution plans and it is useful to see plans of queries which have subqueries.
https://cloud.google.com/spanner/docs/query-execution-operators?hl=en#apply-mutations

Currently, the target of EXPLAIN is limited for WITH, SELECT and statement hints but it can be loosen.
https://github.com/cloudspannerecosystem/spanner-cli/blob/v0.7.5/statement.go#L97

cloudspannerecosystem / spanner-cli Goto Github PK

spanner-cli's Introduction

spanner-cli

Description

Install

Usage

Example

Interactive mode

Batch mode

Directed reads mode

Syntax

Customize prompt

Config file

Configuration Precedence

Request Priority

Transaction Tags and Request Tags

Using with the Cloud Spanner Emulator

How to develop

TODO

Disclaimer

spanner-cli's People

Contributors

Stargazers

Watchers

Forkers

spanner-cli's Issues

Example:

Proposed layout

Semantics

References

Example

Expected behavior

Related specification

Feature Request

Consideration

Consideration

Child Type

Intro

Default priority

Allow users to specify the priority

Expected behavior

Issue

Solutions

For example

Recommend Projects

Recommend Topics

Recommend Org