opensearch-project / sql Goto Github PK

View Code? Open in Web Editor NEW

110.0 24.0 129.0 88.3 MB

Query your data using familiar SQL or intuitive Piped Processing Language (PPL)

Home Page: https://opensearch.org/docs/latest/search-plugins/sql/index/

License: Apache License 2.0

Java 97.51% Shell 0.06% Python 0.08% ANTLR 2.22% Groovy 0.06% Scala 0.08%

sql ppl

sql's Introduction

OpenSearch SQL
- Code Summary
- Highlights
- Documentation
- Forum
- Contributing
- Attribution
- Code of Conduct
- Security
- License
- Copyright

OpenSearch SQL

OpenSearch enables you to extract insights out of OpenSearch using the familiar SQL or Piped Processing Language (PPL) query syntax. Use aggregations, group by, and where clauses to investigate your data. Read your data as JSON documents or CSV tables so you have the flexibility to use the format that works best for you.

The following projects are related to the SQL plugin, but stored in the different repos. Please refer to links below for details. This document will focus on the SQL plugin for OpenSearch.

Code Summary

SQL Engine


Test and build	[![SQL CI][sql-ci-badge]][sql-ci-link]
Code coverage
Distribution build tests
Backward compatibility tests

Repository Checks


DCO Checker
Link Checker

Issues

Highlights

Besides basic filtering and aggregation, OpenSearch SQL also supports complex queries, such as querying semi-structured data, JOINs, set operations, sub-queries etc. Beyond the standard functions, OpenSearch functions are provided for better analytics and visualization. Please check our documentation for more details.

Recently we have been actively improving our query engine primarily for better correctness and extensibility. Behind the scene, the new enhanced engine has already supported both SQL and Piped Processing Language. Please find more details in SQL Engine V2 - Release Notes.

Documentation

Please refer to the SQL Language Reference Manual, Piped Processing Language (PPL) Reference Manual, OpenSearch SQL/PPL Engine Development Manual and Technical Documentation for detailed information on installing and configuring plugin.

Forum

For additional help with the plugin, including questions about opening an issue, visit the OpenSearch Forum.

Contributing

See developer guide and how to contribute to this project.

Attribution

This project is based on the Apache 2.0-licensed elasticsearch-sql project. Thank you eliranmoyal, shi-yuan, ansjsun and everyone else who contributed great code to that project. Read this for more details Attributions.

Code of Conduct

This project has adopted an Open Source Code of Conduct.

Security

If you discover a potential security issue in this project we ask that you notify OpenSearch Security directly via email to [email protected]. Please do not create a public GitHub issue.

License

See the LICENSE file for our project's licensing. We will ask you to confirm the licensing of your contribution.

Copyright

sql's People

Contributors

Stargazers

Watchers

Forkers

dai-chen chloe-zh chyun frecap weicongs-amazon vachashah sreekarjami ketanv3 changsongyang penghuo neosun100 ohltyler zhongnansu lyndonbauto luxugang steliord stjordanis dblock joshuali925 davidcui1225 abbashus wonray spbjss uestczqyangys tengda-he kylepbit st631369677 kavithacm amarouane-abdelhak bit-quill ps48 ylwu-amzn chouchoucha baby-opens zelinh alinalibq eugenesk24 risdenk vamsi-amazon top1zz naarcha-aws gaiksaya mengweieric mch2 peterzhuamazon amitgalitz bobblue owaiskazi19 jackiehanyang jdanndc jokerzh isabella232 swiddis mytn1992 biolspikachu mochi-zhang anandtripathy dplusic prudhvigodithi lanicc maxksyunz superz123 gang01 cwillum kolchfa-aws jngz-es themaherkhalil naheedmk derek-ho natebower zane-neo spartanapple jkdll lygvk cdai yang-db reta garrys-bq rishabh6788 b4sjoo ryanbogan truekraft rupal-bq git950715 seankao-az sksaifuddin qpc-github austintlee liuyonghengheng dhrubo-os chenhaifengkeda dbwiddis revillavinodh ameetchaudhari jaguilar-atl sumukhswamy lchqlchq dreamer-89 pranishd1 kaituo

sql's Issues

[BUG] Highlight.js used is EOL

Describe the bug
A EOL warning is displayed

To Reproduce
Steps to reproduce the behavior:

Open Query Workbench
Click "Explain"
View the developer console in the browser and see the message "Version 9 of Highlight.js has reached EOL and is no longer supported. Please upgrade or ask whatever dependency you are using to upgrade. https://github.com/highlightjs/highlight.js/issues/2877"

Expected behavior
No EOL messages

Plugins
All OpenSearch Plugins

Host/Environment (please complete the following information):

OS: [e.g. iOS] MacOS
Version [e.g. 22] Chrome Version 90.0.4430.212 (Official Build) (x86_64)
(OpenSearch 1.0-RC1)

Data type conversion improvement

Is your feature request related to a problem? Please describe.
Currently the type check is very strict and thus not very user friendly. We need to support implicit casting and make it consistent with explicit casting. For example: if A can be converted to B implicitly, this should be done by explicit CAST function too.

opendistro-for-elasticsearch/sql#1056: Comparison between date and string
opendistro-for-elasticsearch/sql#1061: Comparison between bool and string
opendistro-for-elasticsearch/sql#510: Comparison between date and long?

Cast function in PPL: opendistro-for-elasticsearch/sql#924

More SQL types supported:

DECIMAL
SIGNED INT
CHAR
etc

Describe the solution you'd like
Relax the type check by well defined conversion rules.

Describe alternatives you've considered
Follow the strict data type grammar, such as TIMESTAMP(...).

Support implicit cast between bool and string

[Feature] Support querying a data stream

Is your feature request related to a problem? Please describe.

A data stream is an index abstraction very similar to an index alias. Under the hood, it points to multiple backing indices and allows searches using a single named resource. SQL/PPL already supports querying an index alias by resolving the index expression to its corresponding concrete indices.

But when trying to query a data stream, the index expression resolution doesn’t include backing indices of a data stream and results in the following error.

POST /_opensearch/_sql
{
  "query": "SELECT * FROM my-data-stream LIMIT 10"
}

{
  "error": {
    "reason": "Error occurred in OpenSearch engine: no such index [my-data-stream]",
    "details": "org.opensearch.index.IndexNotFoundException: no such index [my-data-stream]\nFor more details, please send request for Json format to see the raw response from OpenSearch engine.",
    "type": "IndexNotFoundException"
  },
  "status": 404
}

Describe the solution you'd like

Index expression resolution should include backing indices of a data stream to make the query behaviour of data streams consistent with index aliases.

Additional context

This change builds on top of the following feature to support data streams in OpenSearch:
opensearch-project/OpenSearch#675

Disable the DELETE operation by default.

User manual: https://github.com/opensearch-project/sql/blob/main/docs/user/dml/delete.rst

1.1 Description

For some reason, the DELETE statement was not removed from NLPchina-ES-SQL codebase before ODFE launch. This is the only DML supported and makes SQL plugin non readonly.

Requests from community:

1.2 Proposal & Impact

Option 1 Disable completely: Users won’t be able to delete docs via SQL endpoint. From then on, they can only do this by DSL.
Option 2 Disable by setting: Add a new setting to disable DELETE by default, however user with permission can change the setting to enable DELETE.

Add support for stats aggregate function

#99

Deprecate SQL cursor keep alive setting

Remove opensearch.sql.cursor.keep_alive setting and make the parameter per-request instead of stateful cluster setting.

OpenSearch renaming in doctest log

Describe the bug
One of the prerequisite of the doc test is to start a local cluster, currently the doctest is logging this task as :doctest:startES. Not sure if there is any other similar legacy naming in doc test.

To Reproduce
Steps to reproduce the behavior:

run ./gradlew :doctest:doctest or ./gradlew build
See error when doctest is started

Expected behavior
startOpenSearch or startOS

Plugins
SQL

Host/Environment (please complete the following information):

OS: MacOS
Version: Catalina 10.15.7

Additional context
Add any other context about the problem here.

Prepare for pentest

Index expression improvement

Is your feature request related to a problem? Please describe.
Index expression is supposed to be defined clearly in both SQL and PPL.

opendistro-for-elasticsearch/sql#842: Comma-separated index names in PPL
opendistro-for-elasticsearch/sql#169: Multi-index in SQL
opendistro-for-elasticsearch/sql#221: Conflicts with comma join

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Why aggregation still use ElasticsearchQueryRequest instead of ElasticsearchScrollRequest

ElasticsearchQueryRequest has a size limit(default is 200) which make aggregation did not work well. Apparently you know that cause I see the comment in class of ElasticsearchNodeClient:

  /**
   * TODO: Scroll doesn't work for aggregation. Support aggregation later.
   */
  @Override
  public ElasticsearchResponse search(ElasticsearchRequest request) {
    return request.search(
        req -> client.search(req).actionGet(),
        req -> client.searchScroll(req).actionGet()
    );
  }

And I see ElasticsearchScrollRequest is already there, but ElasticsearchIndexScan still use ElasticsearchQueryRequest in its constructor function:

  /**
   * Todo.
   */
  public ElasticsearchIndexScan(ElasticsearchClient client,
                                Settings settings, String indexName,
                                ElasticsearchExprValueFactory exprValueFactory) {
    this.client = client;
    this.request = new ElasticsearchQueryRequest(indexName,
            settings.getSettingValue(Settings.Key.QUERY_SIZE_LIMIT), exprValueFactory);
  }

I am wondering why.

Add support for distinct aggregation

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
Add support for the following aggregations. Note that COUNT(DISTINCT) is highest priority and others are optional. MySQL reference: https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html#function_count

COUNT(DISTINCT expr)
AVG(DISTINCT expr)
SUM(DISTINCT expr)
MIN(DISTINCT expr)
MAX(DISTINCT expr)

PPL may also needs to support count distinct. Reference: https://docs.splunk.com/Documentation/SCS/current/SearchReference/Aggregatefunctions#distinct_count.28.26lt.3Bvalue.26gt.3B.29_or_dc.28.26lt.3Bvalue.26gt.3B.29

Describe alternatives you've considered
N/A

Some of text functions are not implemented in new engine

Related issues:
#74
#73

Missing functions include:
ASCII #73
LEFT
LOCATE #74
REPLACE

[BUG] Non-paginated results returned silently if query not supported by cursor

Describe the bug
If the query is not supported by cursor to paginate, such as GROUP BY, full result set is returned silently instead of error with clear message.

To Reproduce

Create any index
Enable cursor setting manually or merge PR #75
Run a GROUP BY query
See the empty response as below

curl -H 'Content-Type: application/json' -XPOST "http://localhost:9200/_plugins/_sql" -d'
{
   "query": "SELECT state.keyword, count(1) FROM accounts GROUP BY state",
   "fetch_size": 10
}'
{
  "schema": [
    {
      "name": "state.keyword",
      "type": "double"
    },
    {
      "name": "COUNT(1)",
      "type": "double"
    }
  ],
  "total": 51,
  "datarows": [
    [
      "TX",
      0
    ],
    [
      "MD",
      0
    ],
    [
      "ID",
    ......
  ],
  "size": 51,
  "status": 200
}

Expected behavior
Expect an error thrown with clear message saying that the type of query is not supported for pagination yet.

Host/Environment (please complete the following information):

OS: [e.g. iOS] Mac
Version [e.g. 22] OpenSearch beta-1 (or current develop branch)

reporting bug fixes

[BUG] Tableau integration with OpenDistro using JDBC driver

Describe the bug
I am trying to connect to a remote OpenDistro (v1.13.2) using Tableau (v2021.1) via the JDBC driver (opendistro-sql-jdbc-1.13.0.0.jar). Tableau is able list the indexes and the fields in each index. It is able to associate the data type of each field appropriately (STRING vs DOUBLE etc). However, when I try to create a Tableau chart by dragging/dropping index fields into the Tableau column/row boxes, I get the following error:

2021-05-06 13:16:03.900 -0500 (,,,,2,77) grpc-default-executor-6 : INFO com.tableau.connect.service.ProtocolService - Running query
SELECT kibana_sample_data_flights.Carrier AS Carrier
FROM kibana_sample_data_flights
GROUP BY kibana_sample_data_flights.Carrier
2021-05-06 13:16:03.900 -0500 (,,,,2,77) grpc-default-executor-6 : INFO com.tableau.connect.grpc.GrpcProtocolService - End local request 77 /runQuery.
2021-05-06 13:16:03.911 -0500 (,,,,2,77) pool-3-thread-4 : INFO com.tableausoftware.jdbc.FetchSizeHelpers - Setting max result buffer size to 322122547 bytes, 30% of the max heap size.
2021-05-06 13:16:03.911 -0500 (,,,,2,77) pool-3-thread-4 : INFO com.tableausoftware.jdbc.FetchSizeHelpers - Setting adaptive fetch size to 2000.
2021-05-06 13:16:03.911 -0500 (,,,,2,77) pool-3-thread-4 : INFO com.tableausoftware.data.ProtobufJDBCResultSet - BindingInfo{columnName='Carrier', columnTypeName='DOUBLE', columnLabel='Carrier', columnType=8, precision=15, scale=0, columnDisplaySize=25, protobufType=DOUBLE, nullable=unknown}
2021-05-06 13:16:03.911 -0500 (,,,,2,77) pool-3-thread-4 : ERROR com.tableau.connect.service.QueryTask - Query for protocol 2 exiting with error class java.sql.SQLDataException
2021-05-06 13:16:03.911 -0500 (,,,,2,78) grpc-default-executor-6 : ERROR com.tableau.connect.grpc.GrpcProtocolService - Failed in fetchResults. SQLState=null, ErrorCode=0.
java.sql.SQLDataException: Can not parse Logstash Airways as a Double

To Reproduce
Steps to reproduce the behavior:

Open up Tableau 2021.1
Select "Other Databases (JDBC)".
In the "Other Databases (JDBC) dialog type:

URL: jdbc:elasticsearch://https://[remote-opendistro-server-dns-or-ip]:9200/
Dialect: you can select any one but I selected MySQL
Username: [userid for your local OpenDistro]
Password: [password for your local OpenDistro]
Properties file: [path to your properties file]. Create a properties file with the following contents:
- auth=BASIC
- hostnameVerification=true
- user=xxxxx
- password=xxxxx
- trustSelfSigned=true
- useSSL=true
- trustStoreLocation=[path to jks file]
- trustStorePassword=xxxxx

Click on "Sign In" button
In the next screen select the desired table/index. I selected the kibana_sample_data_flights index. You should then see the field names from the index displayed.
Click on the "Sheet 1" tab to create a new chart.
Drag/Drop a string field into the "Columns/Rows" boxes. I dragged the "Carrier" field into the Columns box You should then get an error like:
An error occurred while communicating with the Other Databases (JDBC) data source 'kibana_sample_data_flights (jdbc:elasticsearch://https://wd14efkode01.aci.awscloud:9200/)'
Bad Connection: Tableau could not connect to the data source.
Error Code: FAB9A2C5
Can not parse Logstash Airways as a Double
SELECT kibana_sample_data_flights.Carrier AS Carrier
FROM kibana_sample_data_flights
GROUP BY kibana_sample_data_flights.Carrier
Check the Tableau logs and you will see that the error is:
java.sql.SQLDataException: Can not parse Logstash Airways as a Double
at com.amazon.opendistroforelasticsearch.jdbc.types.TypeHelper.stringConversionException(TypeHelper.java:33) ~[?:?]
at com.amazon.opendistroforelasticsearch.jdbc.types.DoubleType.asDouble(DoubleType.java:50) ~[?:?]
at com.amazon.opendistroforelasticsearch.jdbc.types.DoubleType.fromValue(DoubleType.java:38) ~[?:?]
at com.amazon.opendistroforelasticsearch.jdbc.types.DoubleType.fromValue(DoubleType.java:22) ~[?:?]
at com.amazon.opendistroforelasticsearch.jdbc.types.BaseTypeConverter.convert(BaseTypeConverter.java:60) ~[?:?]
at com.amazon.opendistroforelasticsearch.jdbc.ResultSetImpl.getObjectX(ResultSetImpl.java:589) ~[?:?]
at com.amazon.opendistroforelasticsearch.jdbc.ResultSetImpl.getObjectX(ResultSetImpl.java:583) ~[?:?]
at com.amazon.opendistroforelasticsearch.jdbc.ResultSetImpl.getDoubleX(ResultSetImpl.java:305) ~[?:?]
at com.amazon.opendistroforelasticsearch.jdbc.ResultSetImpl.getDouble(ResultSetImpl.java:299) ~[?:?]
at com.tableausoftware.data.ProtobufJDBCResultSet.addValue(ProtobufJDBCResultSet.java:451) ~[jdbcserver.jar:20211.0.17]
at com.tableausoftware.data.ProtobufJDBCResultSet.addRow(ProtobufJDBCResultSet.java:410) ~[jdbcserver.jar:20211.0.17]
at com.tableausoftware.data.ProtobufJDBCResultSet.fetchRows(ProtobufJDBCResultSet.java:230) ~[jdbcserver.jar:20211.0.17]
at com.tableausoftware.data.ProtobufJDBCResultSet.nextBlockImpl(ProtobufJDBCResultSet.java:282) ~[jdbcserver.jar:20211.0.17]
at com.tableausoftware.data.ProtobufResultSet.nextBlock(ProtobufResultSet.java:48) ~[jdbcserver.jar:20211.0.17]
at com.tableau.connect.service.QueryTask.readData(QueryTask.java:139) ~[jdbcserver.jar:20211.0.17]
at com.tableau.connect.service.QueryTask.call(QueryTask.java:102) ~[jdbcserver.jar:20211.0.17]
at com.tableau.connect.service.QueryTask.call(QueryTask.java:56) ~[jdbcserver.jar:20211.0.17]
at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: java.lang.NumberFormatException: For input string: "Logstash Airways"
at jdk.internal.math.FloatingDecimal.readJavaFormatString(Unknown Source) ~[?:?]
at jdk.internal.math.FloatingDecimal.parseDouble(Unknown Source) ~[?:?]
at java.lang.Double.parseDouble(Unknown Source) ~[?:?]
at java.lang.Double.valueOf(Unknown Source) ~[?:?]
at com.amazon.opendistroforelasticsearch.jdbc.types.DoubleType.asDouble(DoubleType.java:48) ~[?:?]
... 19 more

Or for query issues:

Create the index with test documents by 'POST ...'
Run the query 'SELECT ...'
See error

Expected behavior
I expect that the integration between Tableau/OpenDistro via JDBC driver works for the remote OpenDistro like it does for the local Opendistro.

Plugins
Please list all plugins currently enabled.

Screenshots
If applicable, add screenshots to help explain your problem.

Host/Environment (please complete the following information):

OS: [RedHat Linux]
Version [Red Hat Enterprise Linux Server 7.9 (Maipo)]
Kernel: Linux 3.10.0-1160.25.1.el7.x86_64
OpenDistro v1.13.2
Tableau Desktop For Mac v2021.1

Additional context
Add any other context about the problem here.

Replace the icon with OpenSearch new icon image in ODBC installers

Currently the looking of the installer:

As can see the ODFE icon is at the left bottom, we need to replace it with our latest OpenSearch icon/background images.

Icon/background location:
https://github.com/opensearch-project/sql/blob/develop/sql-odbc/src/installer/banner.bmp
https://github.com/opensearch-project/sql/blob/develop/sql-odbc/src/installer/dialog.bmp
https://github.com/opensearch-project/sql/blob/develop/sql-odbc/src/installer/icon.ico
https://github.com/opensearch-project/sql/blob/develop/sql-odbc/src/installer/Resources/background.bmp
https://github.com/opensearch-project/sql/blob/develop/sql-odbc/src/installer/Resources/background_darkaqua.bmp

OpenSearch RC 1 release

Get SQL/PPL, JDBC, ODBC, CLI and Query Workbench ready for OpenSearch RC 1 release, including:

Build against OpenSearch 1.x branch (1.0 later) instead of beta1 or rc1 tag.
Version bump and rename artifacts to xxx-1.0.0.0-rc1.zip
Add release notes
Merge to main branch and cut release branch from there
Tag and publish release

Support LOCATE in new SQL Engine

Describe the bug
The LOCATE function is documented by not implemented in the new SQL engine.

String query improvement

Is your feature request related to a problem? Please describe.

Old tracking issue: opendistro-for-elasticsearch/sql#710

opendistro-for-elasticsearch/sql#861: Locate with IF function

Describe the solution you'd like
N/A

Describe alternatives you've considered
N/A

Additional context
Add any other context or screenshots about the feature request here.

Query planner refactoring

Is your feature request related to a problem? Please describe.
Refactor logical plan operators for higher abstraction as abstract language for different frontend (SQL, PPL, Graph etc).

Describe the solution you'd like
N/A

Describe alternatives you've considered
N/A

Additional context
Add any other context or screenshots about the feature request here.

Plugin Settings Backwards Compatibility

Add support backwards compatibility for SQL plugin settings
Currently PPL settings can be only changed via Elasticsearch _cluster/settings, we need to enable PPL settings in plugin settings using the similar endpoint with SQL settings. Related issue: opendistro-for-elasticsearch/sql#981
Change the settings to consistently in _plugins/_query/settings endpoint and in plugins.sql/ppl/query.* naming for SQL/PPL/common settings

OpenSearch function support

Is your feature request related to a problem? Please describe.
Currently both SQL (new engine) and PPL doesn't support any OpenSearch functions. We should prioritize this, especially the most common full text search and aggregate functions.

Here are the feature requests from community:

opendistro-for-elasticsearch/sql#1093: Percentile function support in PPL stats
opendistro-for-elasticsearch/sql#1075: Geo function support in SQL
opendistro-for-elasticsearch/sql#878: Geo point support
opendistro-for-elasticsearch/sql#530: Histogram
opendistro-for-elasticsearch/sql#313: Geo point support

Geo point: opendistro-for-elasticsearch/sql#1064
Meta fields: opendistro-for-elasticsearch/sql#1038
Aggregate params: opendistro-for-elasticsearch/sql#615
Array data: opendistro-for-elasticsearch/sql#618

Describe the solution you'd like
Expose OpenSearch functions in SQL/PPL.

Describe alternatives you've considered
N/A

Additional context
Add any other context or screenshots about the feature request here.

[BUG] Simple SELECT queries fail to be anonymized

Describe the bug
QueryDataAnonymizer throws exception and fail to handle simple SELECT queries.

To Reproduce
Steps to reproduce the behavior:

Use test index accounts
Run the query SELECT * FROM accounts GROUP BY state HAVING count(1) > 10
See error

Expected behavior
Expect identifier names in the original query be anonymized.

Host/Environment (please complete the following information):

OS: [e.g. iOS] MacOS
Version [e.g. 22] OpenSearch SQL 1.0.0.0-beta1

Additional context

Error logs as below:

[2021-04-27T14:34:54,532][WARN ][c.a.o.s.l.u.QueryDataAnonymizer] Caught an exception when anonymizing sensitive data
[2021-04-27T14:34:54,534][INFO ][c.a.o.s.l.p.RestSqlAction] Incoming request /_opensearch/_sql?format=jdbc: SELECT * FROM accounts GROUP BY state HAVING count(1) > 10;

Support ASCII in new SQL Engine

Describe the bug
The ASCII function is documented by not implemented in the new SQL engine.

JOIN support improvement

Is your feature request related to a problem? Please describe.

There is limited support for JOIN queries which needs improvement. In particular, post processing such as filtering and aggregation after inner/outer join is most wanted. This capability is required by PPL lookup command.

Here are the feature requests from community:

opendistro-for-elasticsearch/sql#505: Left join
opendistro-for-elasticsearch/sql#351: Multi-join
opendistro-for-elasticsearch/sql#301: Aggregate after join
opendistro-for-elasticsearch/sql#289: Select function after join
opendistro-for-elasticsearch/sql#236: Aggregate after join
opendistro-for-elasticsearch/sql#221: Comma join
opendistro-for-elasticsearch/sql#124: Left join with WHERE
opendistro-for-elasticsearch/sql#110: Aggregate after join
opendistro-for-elasticsearch/sql#987: Multi-join

Describe the solution you'd like
N/A

Describe alternatives you've considered
N/A

Additional context
Add any other context or screenshots about the feature request here.

[BUG] Legacy settings are rejected by RestQuerySettingAction

Describe the bug
A clear and concise description of what the bug is.

To Reproduce
Steps to reproduce the behavior:
Send request from OpenSearch Dashboards dev tools:

PUT _plugins/_query/settings
{
  "transient" : {
    "opendistro.sql.enabled" : "true"
  }
}

Got error:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "action_request_validation_exception",
        "reason" : "Validation Failed: 1: no settings to update;"
      }
    ],
    "type" : "action_request_validation_exception",
    "reason" : "Validation Failed: 1: no settings to update;"
  },
  "status" : 400
}

Expected behavior
Setting should be able to update with either legacy setting opendistro.sql.*, opendistro.ppl.*, opendistro.query.* or the new setting plugins.sql.*, plugins.ppl.*, plugins.query.*

Plugins
Please list all plugins currently enabled.
SQL

Host/Environment (please complete the following information):

OS: [e.g. iOS] MacOS
Version [e.g. 22] Catalina

Additional context
Add any other context about the problem here.

[BUG] Legacy .mez file for PowerBI connector

test-epic

[BUG] Query creates unexpected behaviour and cannot be explained

Describe the bug
Some queries will execute and yet cannot be explained. Query result was unexpected.

To Reproduce
Steps to reproduce the behavior:

Go to Query Workbench
Run SHOW tables LIKE .% (note the .)
Click the Run button
See the results (I have two less here than with SHOW tables LIKE % - why? 🤷‍♂️ Doesn't seem to follow the wildcard pattern)
Click 'Explain'
The results will be "This query is not explainable."

Expected behavior
Any run-able query can be explained.

Plugins
All OpenSearch Plugins.

Host/Environment (please complete the following information):

OS: MacOs
Version Chrome Version 90.0.4430.212 (Official Build) (x86_64)
OpenSearch 1.0 RC-1

Refactor the AggregationResponse Parser to support extended_stats aggregation

It is the precondition to implement push down for #99,

Support implicit cast between date and string

Add support for stats aggregate function

Is your feature request related to a problem? Please describe.
Currently stats aggregate function support is missing in both SQL and PPL.

Describe the solution you'd like
Add support for the following stats aggregate function in SQL. MySQL reference: https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html#function_std

1.STD
2.STDDEV
3.STDDEV_POP
4.STDDEV_SAMP
5.VAR_POP
6.VAR_SAMP
7.VARIANCE

PPL also needs to support some of them. Reference: https://docs.splunk.com/Documentation/SCS/current/SearchReference/Aggregatefunctions#stdev.28.26lt.3Bvalue.26gt.3B.29

Note that these functions may be optimized by pushing down to OpenSearch stats/extended_stats aggregation API: https://docs-beta.opensearch.org/docs/opensearch/metric-agg/#stats-extended_stats-matrix_stats. This can be part of #44 so this task can focus on basic implementation without pushdown.

Describe alternatives you've considered
N/A

Set operation improvement

Is your feature request related to a problem? Please describe.
Improvements on set operations such as union and intersection by migrating to the new SQL engine.

Here are the feature requests from community:

opendistro-for-elasticsearch/sql#388: Old set operation only works with JSON format.
opendistro-for-elasticsearch/sql#371: Union multi indices
opendistro-for-elasticsearch/sql#114: Explain issue

Describe the solution you'd like
N/A

Describe alternatives you've considered
N/A

Additional context
Add any other context or screenshots about the feature request here.

Test epic

test-epic

[BUG] Query Workbench Header says "OpenSearchDashboards"

Copying over original issue: opensearch-project/OpenSearch-Dashboards#429

The header for Query Workbench pages says "OpenSearchDashboards" (no space). It should say "Query Workbench".

Disable the legacy SQL/PPL Setting

1.1 Descriptions
The following SQL plugin settings may be deprecated:

[A] Setting switching between old and new engine:
- opensearch.sql.engine.new.enabled
[B] Legacy semantic analyzer settings in old engine:
[C] Default response format in old engine:
- opensearch.sql.query.response.format

1.2 Proposal & Impacts

[A] Enable and remove the setting: Users cannot disable new engine: should be fine because new engine is already turned on in ODFE 1.13.
[B] Disable and remove all the settings: No semantic check when queries fall back.
[C] Make JDBC default and remove the setting: This was added long time ago when default response format switched from JSON to JDBC (table). Should be fine to remove this directly.

Open question: Should JSON formatter be deprecated? Currently queries are handled by old engine if format=json present. This may confuse users if the old engine behaves different due to missing feature.

Add Query Engine Healthcheck for ODBC Driver Connectivity Test

In current code base the ODBC driver retrieves the installed plugins as the method to check the SQL plugin status before establishing the connection. We need to bring up a better/standard way to do the plugin health check for the ODBC driver connectivity test.

[BUG] Why nested has a limit of 3?

Search for "size" : 3, in the text below.

Why this limit? It appears returning only three result top.

I tried to look in code but I couldn't find where it is defined.
Can anyone point it to me?

SQL query:

POST /_opendistro/_sql
{
  "query" : """
        SELECT e.name AS employeeName,
               p.name AS projectName
        FROM employees_nested AS e,
             e.projects AS p
        WHERE p.name LIKE '%security%'
        """
}

Explain:

{
  "from" : 0,
  "size" : 200,
  "query" : {
    "bool" : {
      "filter" : [
        {
          "bool" : {
            "must" : [
              {
                "nested" : {
                  "query" : {
                    "wildcard" : {
                      "projects.name" : {
                        "wildcard" : "*security*",
                        "boost" : 1.0
                      }
                    }
                  },
                  "path" : "projects",
                  "ignore_unmapped" : false,
                  "score_mode" : "none",
                  "boost" : 1.0,
                  "inner_hits" : {
                    "ignore_unmapped" : false,
                    "from" : 0,
                    "size" : 3,
                    "version" : false,
                    "seq_no_primary_term" : false,
                    "explain" : false,
                    "track_scores" : false,
                    "_source" : {
                      "includes" : [
                        "projects.name"
                      ],
                      "excludes" : [ ]
                    }
                  }
                }
              }
            ],
            "adjust_pure_negative" : true,
            "boost" : 1.0
          }
        }
      ],
      "adjust_pure_negative" : true,
      "boost" : 1.0
    }
  },
  "_source" : {
    "includes" : [
      "name"
    ],
    "excludes" : [ ]
  }
}

Enable scalar function aggregation in PPL

Is your feature request related to a problem? Please describe.

opendistro-for-elasticsearch/sql#1079

Describe the solution you'd like
Actually the core engine already has this capability. What is needed is just to turn on this in ANTLR grammar in PPL.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Deprecate SQL cursor enabling setting

Deprecate cursor setting opensearch.sql.cursor.enabled and opensearch.sql.cursor.fetch_size by:

Enable cursor by default
Don't paginate if fetch_size absent (BREAKING CHANGE)

Doc: https://github.com/opensearch-project/sql/blob/main/docs/user/admin/settings.rst#opensearch-sql-cursor-enabled

Nested field query improvement

Is your feature request related to a problem? Please describe.
Nested field query is supported by following PartiQL specification: https://github.com/opensearch-project/sql/blob/main/docs/user/beyond/partiql.rst. There are limitations regarding the nested levels and query types that needs improvement.

opendistro-for-elasticsearch/sql#1055: Comparison between nested field
opendistro-for-elasticsearch/sql#963: ?
opendistro-for-elasticsearch/sql#797: Nested field involved in join
opendistro-for-elasticsearch/sql#399: Select *
opendistro-for-elasticsearch/sql#398: Aggregate over nested field
opendistro-for-elasticsearch/sql#245: Nested field involved in join
opendistro-for-elasticsearch/sql#155: Aggregate over nested field
opendistro-for-elasticsearch/sql#152: Deep nested field
opendistro-for-elasticsearch/sql#87: Filter nested field

Describe the solution you'd like
N/A

Describe alternatives you've considered
N/A

Additional context
Add any other context or screenshots about the feature request here.

Subquery support improvement

Is your feature request related to a problem? Please describe.
Currently there are only limited support for subqueries in FROM and WHERE clause: https://github.com/opensearch-project/sql/blob/main/docs/user/dql/complex.rst#subquery.

Here are the feature requests from community:

[Low] opendistro-for-elasticsearch/sql#1109: Subqueries in SELECT clause
[Medium] opendistro-for-elasticsearch/sql#1111: JOIN between subqueries

Describe the solution you'd like
N/A

Describe alternatives you've considered
N/A

Additional context
Add any other context or screenshots about the feature request here.

Support charting from PPL

Ensure backwards compatibility with ODFE

Coming from opensearch-plugins#12, enable upgrade to OpenSearch, while supporting backward compatibility.

Rename Namespaces (e.g. com.amazon.opendistroforelasticsearch to org.opensearch)
Rename Classes, Methods, Variables
Rename Remaining Identifiers (e.g. Opendistro to OpenSearch).
Rename, and Backwards Compatibility for Rest APIs
Rename, and Backwards Compatibility for Settings
Rename, and Backwards Compatibility for Indices
Run in a backwards compatible way on top of OpenSearch 1.0 that has joined an ES 7.10.x cluster
Run in a backwards compatible way on top of OpenSearch 1.0 that has joined an ODFE 1.13.x cluster
Drop in replacement for the Opendistro version 1.13 of the plugin

test-epic

Supporting IN in new Engine [BUG]

[2021-05-19T01:03:13,199][INFO ][c.a.o.s.l.p.RestSQLQueryAction] [elasticsearch_wqrgrid0660_0] Request is falling back to old SQL engine due to: Failed to parse query due to offending symbol [in] at: '
    SELECT ID FROM crygojxttsi6fvsthw2eaq 
    WHERE (Type in' <--- HERE... More details: Expecting tokens in {'AND', 'IS', 'LIKE', 'NOT', 'OR', 'REGEXP', '*', '/', '%', '+', '-', '=', '>', '<', '!', '.', ')'}```

It looks like the IN is not yet supported in the new engine. 

Some queries that have both `IN` and a comparison between two fields will fail because the old engine doesn't seem to be able to handle the comparison of the two fields, and the new one doesn't handle IN.

I would like to work on a patch, do you know of any specific blocker or macro steps that I should work on? 
A 3-4 liners practical guidance would be great :) 

Thank you!

How to enable Debug for integ-test ES server in elastic search

Running normal tests (without ES RestIntegTestTask) in intellij is very easy. We just import the project and run it it the Debug.

However I cannot find a way to run the underlying ES search server in debug mode.
I tried to add -Dcluster.debug -Dtest.debug, but it doesn't seem opening any port and the test will just run without stopping at breakpoints.

I tried to look at the code of the RestIntegTestTask gradle task (https://github.com/search?q=RestIntegTestTask&type=code), but not much insight from it.

Any advice?

Date time query improvement

Is your feature request related to a problem? Please describe.
Many date functions are already supported though some common ones are still missing (in new engine). Meanwhile time zone support is also high demand.

Here are the feature requests from community:

Functions: opendistro-for-elasticsearch/sql#709
1.1 opendistro-for-elasticsearch/sql#1073: NOW()
Time zone: opendistro-for-elasticsearch/sql#1062

Describe the solution you'd like
N/A

Describe alternatives you've considered
N/A

Additional context
Add any other context or screenshots about the feature request here.

opensearch-project / sql Goto Github PK

sql's Introduction

OpenSearch SQL

Code Summary

SQL Engine

Repository Checks

Issues

Highlights

Documentation

Forum

Contributing

Attribution

Code of Conduct

Security

License

Copyright

sql's People

Contributors

Stargazers

Watchers

Forkers

sql's Issues

Recommend Projects

Recommend Topics

Recommend Org