presto-client-node's Introduction

presto-client-node

A Node.js client library for the distributed query engine "Presto".

var presto = require('presto-client');
var client = new presto.Client({user: 'myname'});

client.execute({
  query:   'SELECT count(*) as cnt FROM tblname WHERE ...',
  catalog: 'hive',
  schema:  'default',
  source:  'nodejs-client',
  state:   function(error, query_id, stats){ console.log({message:"status changed", id:query_id, stats:stats}); },
  columns: function(error, data){ console.log({resultColumns: data}); },
  data:    function(error, data, columns, stats){ console.log(data); },
  success: function(error, stats){},
  error:   function(error){}
});

Installation

npm install -g presto-client

Or add presto-client to your own package.json, and do npm install.

API

new Client(opts)

Instantiate a client object and set its default configuration.

  • opts [object]
    • host [string]
      • Presto coordinator hostname or address (default: localhost)
    • ssl [object]
      • Setting an object enables SSL and verifies the server certificate with the following options (default: null):
        • ca: An authority certificate or array of authority certificates to check the remote host against
        • cert: Public x509 certificate to use (default : null)
        • ciphers : Default cipher suite to use. (default: https://nodejs.org/api/tls.html#tls_modifying_the_default_tls_cipher_suite)
        • key: Private key to use for SSL (default: null)
        • passphrase: A string of passphrase for the private key or pfx (default: null)
        • pfx: Certificate, Private key and CA certificates to use for SSL. (default: null).
        • rejectUnauthorized: If not false, the server certificate is verified against the list of supplied CAs, and the connection is rejected when verification fails (default: true)
        • secureProtocol: Optional SSL method to use. The possible values are listed as SSL_METHODS, use the function names as strings. For example, "SSLv3_method" to force SSL version 3 (default: SSLv23_method)
        • servername: Server name for the SNI (Server Name Indication) TLS extension
    • port [integer]
      • Presto coordinator port (default: 8080)
    • user [string]
      • Username of query (default: process user name)
    • source [string]
      • Source of query (default: nodejs-client)
    • basic_auth [object]
      • Pass in a user and password to enable Authorization Basic headers on all requests.
      • basic_auth: {user: "user", password: "password"} (default:null)
    • custom_auth [string]
      • Sets HTTP Authorization header with the provided string.
      • Throws exception if basic_auth is also given at the same time
    • catalog [string]
      • Default catalog name
    • schema [string]
      • Default schema name
    • checkInterval [integer]
      • Interval in milliseconds between RPCs that check the query status (default: 800ms)
    • enableVerboseStateCallback [boolean]
      • Enable more verbose callback for Presto query states (default: false)
      • When set to true, this flag modifies the condition of the state change callback to return data every checkInterval(default: 800ms). Modify checkInterval if you wish to change the frequency.
      • Otherwise (false), the state change callback will only be called upon a change in state.
      • The purpose of this option is to enable verbose updates in state callbacks, so that values such as "percentage complete" and "processed rows" can be extracted while the query remains in the same state, e.g. "RUNNING".
    • jsonParser [object]
      • Custom json parser if required (default: JSON)
    • engine [string]
      • Change headers set. Added for compatibility with Trino.
      • Available options: presto, trino (default: presto)
    • timeout [integer :optional]
      • Number of seconds a query is allowed to run before it starts returning results (default: 60). Set to null or 0 to disable.

return value: client instance object
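
The "percentage complete" mentioned under enableVerboseStateCallback could be derived roughly as sketched below. This is only an illustration: progressPercent and onState are hypothetical names, and the sketch assumes the stats object carries completedSplits and totalSplits fields, which this README does not document.

```javascript
// Hypothetical helper: derive a rough completion percentage from the stats
// object handed to the `state` callback.
// Assumption: stats has `completedSplits` and `totalSplits` fields.
function progressPercent(stats) {
  if (!stats || !stats.totalSplits) return 0; // nothing scheduled yet
  return Math.round(((stats.completedSplits || 0) / stats.totalSplits) * 100);
}

// Example `state` callback for a client created with
// enableVerboseStateCallback: true.
function onState(error, queryId, stats) {
  console.log(queryId, stats.state, progressPercent(stats) + '%');
}
```

A function like onState would be passed as the state option of execute().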

execute(opts)

This is an API to execute queries. (Using "/v1/statement" HTTP RPC.)

Execute query on Presto cluster, and fetch results.

Attributes of opts [object] are:

  • query [string]
  • catalog [string]
  • schema [string]
  • timezone [string :optional]
  • user [string :optional]
  • prepares [array(string) :optional]
    • The array of prepared statements, without PREPARE query0 FROM prefix.
    • Prepared queries can be referred to as queryN (N: index), like query0 or query1, in the query specified as query. Example:
      client.execute({ query: 'EXECUTE query0 USING 2', prepares: ['SELECT 2 + ?'], /* ... */ });
  • info [boolean :optional]
    • fetch query info (execution statistics) for success callback, or not (default false)
  • headers [object :optional]
    • additional headers to be included in the request, check the full list for Trino and Presto engines
  • timeout [integer :optional]
  • cancel [function() :optional]
    • the client stops fetching query results if this callback returns true
  • state [function(error, query_id, stats) :optional]
    • called when the query state changes (or on every checkInterval if enableVerboseStateCallback is set)
      • stats.state: one of QUEUED, PLANNING, STARTING, RUNNING, FINISHED, CANCELED, or FAILED
    • query_id
      • id string like 20140214_083451_00012_9w6p5
    • stats
      • object which contains running query status
  • columns [function(error, data) :optional]
    • called once when the columns and their types are found in the results
    • data
      • array of field info
      • [ { name: "username", type: "varchar" }, { name: "cnt", type: "bigint" } ]
  • data [function(error, data, columns, stats) :optional]
    • called once per fetch of query results (may be called two or more times)
    • data
      • array of rows, each row being an array of column values
      • [ [ "tagomoris", 1013 ], [ "dain", 2056 ], ... ]
    • columns (optional)
      • same as data of columns callback
    • stats (optional)
      • runtime statistics object of query
  • retry [function() :optional]
    • called if a request was retried due to server returning 502, 503, or 504
  • success [function(error, stats, info) :optional]
    • called once when all results are fetched (default: value of callback)
  • error [function(error) :optional]
    • callback for errors of query execution (default: value of callback)
  • callback [function(error, stats) :optional]
    • callback for query completion (on both success and failure)
    • one of callback or success must be specified
    • one of callback or error must be specified

Callbacks order (success query) is: columns -> data (-> data xN) -> success (or callback)
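
Because data may fire several times before success, a caller who wants the whole result set has to accumulate rows. The following is a minimal sketch of one way to do that with a Promise. executeAll is a hypothetical wrapper, not part of presto-client, and fakeClient is a stand-in that only mimics the callback order described above so the sketch can run without a cluster.

```javascript
// Hypothetical wrapper (not part of presto-client): collect every chunk
// delivered to `data` and resolve once `success` fires.
function executeAll(client, opts) {
  return new Promise(function (resolve, reject) {
    var rows = [];
    var cols = null;
    client.execute(Object.assign({}, opts, {
      columns: function (error, data) { cols = data; },
      data: function (error, data) {
        // `data` may be called several times; append instead of resolving here.
        data.forEach(function (row) { rows.push(row); });
      },
      success: function (error, stats) {
        resolve({ columns: cols, rows: rows, stats: stats });
      },
      error: reject
    }));
  });
}

// Stand-in client that mimics the documented callback order (dry run only).
var fakeClient = {
  execute: function (o) {
    o.columns(null, [{ name: 'cnt', type: 'bigint' }]);
    o.data(null, [[1], [2]]);
    o.data(null, [[3]]);
    o.success(null, { state: 'FINISHED' });
  }
};

executeAll(fakeClient, { query: 'SELECT 1' }).then(function (result) {
  console.log(result.rows.length); // 3
});
```

With a real client, the same call would be executeAll(client, { query: '...', catalog: 'hive', schema: 'default' }).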

query(query_id, callback)

Get the current status of a query. (Same as 'Raw' in the Presto Web UI in a browser.)

  • query_id [string]
  • callback [function(error, data)]

kill(query_id, callback)

Stop query immediately.

  • query_id [string]
  • callback [function(error) :optional]

nodes(opts, callback)

Get the list of nodes in the Presto cluster.

  • opts [object :optional]
    • specify null, undefined or {} (currently)
  • callback [function(error,data)]
    • error
    • data
      • array of node objects

BIGINT value handling

JavaScript's standard JSON module cannot handle BIGINT values correctly because of the limited precision of Number.

JSON.parse('{"bigint":1139779449103133602}').bigint //=> 1139779449103133600

If your query returns numeric values and precision is important for that query, you can swap the JSON parser with any module that has a parse method.

var JSONbig = require('json-bigint');
JSONbig.parse('{"bigint":1139779449103133602}').bigint.toString() //=> "1139779449103133602"
// set client option
var client = new presto.Client({
  // ...
  jsonParser: JSONbig,
  // ...
});

Development

When working on this library, you can use the included docker-compose.yml file to spin up Presto and Trino instances, which can be done with:

docker compose up

Once you see the following messages, you'll be able to connect to Presto at http://localhost:18080 and Trino at http://localhost:18081, without username/password:

presto-client-node-trino-1   | 2023-06-02T08:12:37.760Z	INFO	main	io.trino.server.Server	======== SERVER STARTED ========
presto-client-node-presto-1  | 2023-06-02T08:13:29.760Z	INFO	main	com.facebook.presto.server.PrestoServer	======== SERVER STARTED ========

After making a change, you can run the available test suite by doing:

npm run test

Versions

  • 1.1.0:
    • add automatic retries for server errors
    • follow redirects if servers simply redirect client's request
    • fix bug of HTTP/HTTPS protocol handling
    • fix bug about cancelling queries
  • 1.0.0:
    • add test cases and CI setting, new options and others, thanks to the many contributions from Matthew Peveler (@MasterOdin)
    • add "timeout" option to retry requests for server errors
    • change "error" callback (or "callback" as fallback path) to be specified when "success" is used
  • 0.13.0:
    • add "headers" option on execute() to specify any request headers
  • 0.12.2:
    • fix the bug of the "prepares" option
  • 0.12.1:
    • add "user" option on execute() to override the user specified per client
  • 0.12.0:
    • add X-Trino-Prepared-Statement to support SQL placeholder
    • catch Invalid URL errors
  • 0.11.2:
    • fix regression for basic_auth feature
  • 0.11.1:
    • fix a critical bug around the code for authorization
  • 0.11.0:
  • 0.10.0:
    • add "engine" option to execute queries on Trino
  • 0.9.0:
    • make "catalog" and "schema" options optional (need to specify those in queries if omitted)
  • 0.8.1:
    • fix to specify default ports of http/https if nextUri doesn't have ports
  • 0.8.0:
    • fix the bug about SSL/TLS handling if redirections are required
  • 0.7.0:
    • support the change of prestodb 0.226 (compatible with others)
  • 0.6.0:
    • add X-Presto-Source if "source" specified
  • 0.5.0:
    • remove support for execute(arg, callback) using /v1/execute
  • 0.4.0:
    • add a parameter to call status callback in verbose
  • 0.3.0:
    • add Basic Authentication support
  • 0.2.0:
    • add HTTPS support
  • 0.1.3:
    • add X-Presto-Time-Zone if "timezone" specified
  • 0.1.2:
    • add X-Presto-Session if "session" specified
  • 0.1.1:
    • fix a bug where HTTP-level errors were not handled correctly
  • 0.1.0:
    • add option to pass customized json parser to handle BIGINT values
    • add check for required callbacks of query execution
  • 0.0.6:
    • add API to get/delete queries
    • add callback state on query execution
  • 0.0.5:
    • fix to do error check on query execution
  • 0.0.4:
    • actually send a cancel request for canceled queries
  • 0.0.3:
    • simple and immediate query execution support
  • 0.0.2: maintenance release
    • add User-Agent header with version
  • 0.0.1: initial release

Todo

  • node: "failed" node list support
  • patches welcome!

Author & License

  • tagomoris
  • License:
    • MIT (see LICENSE)

presto-client-node's People

Contributors

clarkmalmgren, deugene, ebyhr, ekrekr, hazkiel, jhecking, kwent, likyh, malgogi, martin-kieliszek, masterodin, mrorz, pawelrychlik, psg2, puneetjaiswal, snoble, tagomoris, tarekrached


presto-client-node's Issues

Prepare statement too large for header

I'm passing in a large prepare query but unfortunately getting a response like Error: execution error: Bad Message 431 reason: Request Header Fields Too Large. These queries are defined by 3rd parties so I don't have direct control over it.

My goal is to DESCRIBE OUTPUT preparedQuery, but I can't do this if preparing the query fails. Any workaround or suggestions?

Real world example: datalake-graphql-wrapper

Hi,

first of all - Thanks @tagomoris for all your work - it's awesome <3

I'm working at @dbsystel in a team which uses presto/trino to provide the data from the data lake.
We had the problem that the existing API solution (with flask, s3 download, etc.) had a lot of performance problems.

We found the package presto-client-node while searching for alternatives to our existing api and after some tests we knew that it's the solution for our problem! It's now one of the core packages for our new api to generate the GraphQL schema for the GraphQL API and fetching the data from the presto/trino cluster.

Since we use only OSS in our API, we decided to publish it as OSS on GitHub as well. (What you give is what you get ;) )

Here is the link to the repo: https://github.com/dbsystel/datalake-graphql-wrapper

Here is a short summary of the current features:

  • Automatic endpoint generation via interactive cli
    • Generates the interfaces and available endpoint fields based on the fetched database schema
    • Generates the filter fields for all root fields
    • Sorting
    • Pagination
  • Support for nested fields
  • date/time transformation
  • Written in TypeScript
  • Easy to extend

Feel free to fork/clone the repo or just copy some parts of it :)

Code does not follow Presto HTTP Best Practices

Your code doesn't look like it follows what the Presto owners have dictated as the best way to interact with it via HTTP:

https://github.com/prestodb/presto/wiki/HTTP-Protocol

For example, you should not be checking the status field to find out if you need to wait longer for a query. As the Presto owners suggest:

The status field is only for displaying to humans as a hint about the query's state on the server. It is not in sync in the query state from the client's perspective and must not be used to determine whether the query is finished.

Instead, the code should be checking for a 503.

how to set enable_hive_syntax like php?

We know that prestoClient uses Presto syntax by default, not Hive. But when I have to run SQL in Hive format, I run into various problems, like:

image

So, how can I set enable_hive_syntax like in PHP?

In PHP, enable_hive_syntax is set like this:

$session->setProperty(new Property('enable_hive_syntax', true));

support for bigint values

it seems PRESTO API will return big ints as numbers in the json responses (when querying a table with a bigint column),
when parsing that using regular JSON.parse they will be rounded to js Number precision.

this can be fixed by using a JSON parser that supports bigints, like this one:
https://www.npmjs.com/package/json-bigint

Starting 0.161. /v1/execute resource got removed

Cf. https://prestodb.io/docs/current/release/release-0.161.html

And when trying to use https://github.com/tagomoris/presto-client-node#executearg-callback, the API is returning 404.

{ message: 'execution error:<html>\n<head>\n<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>\n<title>Error 404 Not Found</title>\n</head>\n<body><h2>HTTP ERROR 404</h2>\n<p>Problem accessing /v1/execute. Reason:\n<pre>    Not Found</pre></p>\n</body>\n</html>\n', error: null, code: 404 }

I think the only way is to use v1/statement now.

Query execution fails with multiple database connection

Hi,

We are executing a query connecting with Trino (Postgres catalog, big query catalog).

We found an issue while executing a query which connects to both of the databases. However the query works fine with 2 joins. But if we use more than 2 joins in the query, the query fails with the following error.

/opt/trino/trino-server/app/node_modules/presto-client/lib/presto-client/index.js:355
if (response.stats.state === 'QUEUED'
^

TypeError: Cannot read property 'state' of undefined
at /opt/trino/trino-server/app/node_modules/presto-client/lib/presto-client/index.js:355:36
at IncomingMessage.<anonymous> (/opt/trino/trino-server/app/node_modules/presto-client/lib/presto-client/index.js:133:13)
at IncomingMessage.emit (events.js:326:22)
at IncomingMessage.EventEmitter.emit (domain.js:483:12)
at endReadableNT (_stream_readable.js:1241:12)
at processTicksAndRejections (internal/process/task_queues.js:84:21)

We are using presto-client package inside a moleculer service. We have deployed our application on premises.

Flow of query generation and execution of query:

  1. We get all the columns from all the database tables that user selects.

  2. Add those column names in the select query which has multiple joins.

  3. Execute the query using presto-client. The result is fetched by accumulating the chunks delivered to the data method
    data [function(error, data, columns, stats) :optional] into an array; after all the data is fetched, the success method formats the array as we need.

    Please help us fix this issue.

Thanks in advance.

How to define presto session properties?

Hi, is there a way to define either on the connection or on the query execution presto session properties? Like the properties we set up on the cli by doing SET SESSION ?

Thank You
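
One possible approach, sketched under assumptions: the Versions list says 0.1.2 adds X-Presto-Session when "session" is specified, and execute() accepts a headers option. The helper below is hypothetical and only builds the header value; it assumes the header takes comma-separated name=value pairs, which is not stated in this README and should be checked against the Presto/Trino protocol documentation. The property names are illustrative.

```javascript
// Hypothetical helper: serialize session properties into a value for the
// X-Presto-Session header (X-Trino-Session on Trino).
// Assumption: the header takes comma-separated name=value pairs.
function sessionHeaderValue(props) {
  return Object.keys(props)
    .map(function (name) {
      return name + '=' + encodeURIComponent(String(props[name]));
    })
    .join(',');
}

var headers = {
  'X-Presto-Session': sessionHeaderValue({
    enable_hive_syntax: true,
    query_max_run_time: '10m'
  })
};
console.log(headers['X-Presto-Session']);
// enable_hive_syntax=true,query_max_run_time=10m
```

The resulting object could then be passed as the headers option of execute().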

Cant get all the results of a query execute

Hi guys,
I'm trying to get all the results from a query but today I noticed that sometimes it wouldn't return all the rows. After reading the docs I understand that the data attribute from client.execute function can be called 2 or more times and that's why my code is wrong.

executePrestoQuery(query: string): Promise<any> {
    return new Promise<any>((resolve, reject) => {
      this.client.execute({
        query: query,
        catalog: "hive",
        schema: "default",
        source: "nodejs-client",
        data: function (error, data, columns, stats) {
          resolve(data);
        },
        success: function (error, stats, info) {
          console.log("success");
        },
        error: function (error) {
          reject(error);
        },
      });
    });
  }

That said, what's the best way to return all the data from the client.execute function? Can I return the data in the success callback?

(Sorry If it's an easy solution I am new to Javascript)
Thank you
Guilherme

Undefined error when nextUri request returns non-200 status code

We hit the following error in production:

TypeError: Cannot read properties of undefined (reading 'state')
    at /app/foo/node_modules/presto-client/lib/presto-client/index.js:338:70
    at IncomingMessage.<anonymous> (/app/foo/node_modules/presto-client/lib/presto-client/index.js:133:13)
    at IncomingMessage.emit (node:events:525:35)
    at IncomingMessage.emit (node:domain:489:12)
    at endReadableNT (node:internal/streams/readable:1358:12)
    at processTicksAndRejections (node:internal/process/task_queues:83:21)

From debugging, it seems that the trino server we were querying against returned a 50x response (our logging did not capture this) on the request in fetch_next function, and so typeof response === 'string' and so response.stats === undefined in the callback.

Looking at the presto docs, they document that a 503 error should trigger a retry, while the trino docs say that a 502, 503, or 504 should trigger a retry. Anything beyond those and 200 should be considered an error case.

My proposed change would be that for both the initial request and all subsequent requests against nextUri, if they return 200, then handle it as right now. If it returns [502, 503, 504], then do a retry after 50-100ms. If the request returns something outside of those codes, then return an error with some generic message.

For the 50x case, will probably want to add some way to timeout querying so that if the server never responds, the query doesn't just permanently hang.
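
The proposal above can be sketched as a small decision function. The names classifyStatus and nextAction are hypothetical and do not come from the library; the deadline parameter illustrates the suggested timeout, so that retries stop once it has passed.

```javascript
// Sketch of the proposed status handling; all names are hypothetical.
var RETRYABLE = [502, 503, 504];

function classifyStatus(code) {
  if (code === 200) return 'ok';
  if (RETRYABLE.indexOf(code) !== -1) return 'retry';
  return 'error';
}

// Decide the next step for a response. `now` and `deadline` are timestamps in
// milliseconds; once the deadline has passed, a retryable status is an error.
function nextAction(code, now, deadline) {
  var kind = classifyStatus(code);
  if (kind === 'retry' && now >= deadline) return 'error';
  return kind;
}

console.log(classifyStatus(503));          // retry
console.log(nextAction(503, 2000, 1000));  // error
```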

SSL redirect failing

When I connect to a Presto server using the presto-client library with SSL (cert and key info) and execute a query, the REST call to /v1/statement works. But when the client then hits the "nextUri" endpoint, the server returns 401 because the certificate information is missing from that call.

This PR fixes that issue. Please review and let me know.

master...souryabharath:patch-1

nextUri can return 301 redirect due to https/http mismatch

When trino/presto is behind a load balancer, it can be set up such that one connects to the load balancer via HTTPS, while the cluster itself was not set up with HTTPS and the http-server.process-forward setting was not enabled. In this case, the HTTPS call to /v1/statement returns a nextUri that is http, to which the load balancer responds with a 301 pointing to the appropriate https endpoint. However, the client cannot handle this and ends up in an error state. I believe the client should follow the redirects, as other clients (e.g. dbeaver) seem to do.

To handle this, I would propose using the follow-redirects package which is a drop-in replacement for the builtin http and https modules that are already used, just that it'll silently handle redirects. This would potentially break usage for anyone using this within a web context, but not sure that's an issue. Could also instead use node-fetch which similarly supports this, but would require a larger rewrite of how requests are done.

Let me know if such a change would be accepted and which library you'd prefer @tagomoris.

When nextUri is returned without a specified port, ERR_SOCKET_BAD_PORT is returned

When Presto returns a nextUri without a specified port number(e.g. http://prestoserver/v1/api/...), the following error is returned: RangeError: Port should be >= 0 and < 65536. Received .RangeError [ERR_SOCKET_BAD_PORT]: Port should be >= 0 and < 65536. Received .

This is due to the opts.port being an empty string when making a subsequent request due to: 5b87989#diff-bc0ed6253188f59cfb43f199aaea9eabR57-R67

Specifically, line 59, URL returns href.port as ''.

The way to fix this is to add the following at line 69: if (opts.port === '') delete opts.port; or opts.port = client.port;
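
A third option is to fall back to the default port of the URI scheme. The sketch below illustrates that idea; resolvePort is a hypothetical name and not a function in presto-client.

```javascript
// Sketch: pick a usable port for a nextUri that may omit one.
// `resolvePort` is a hypothetical helper, not part of presto-client.
function resolvePort(nextUri) {
  var href = new URL(nextUri);
  // URL reports an empty string when the URI has no explicit port.
  if (href.port !== '') return Number(href.port);
  return href.protocol === 'https:' ? 443 : 80; // scheme default
}

console.log(resolvePort('http://prestoserver/v1/api/'));        // 80
console.log(resolvePort('https://prestoserver:8443/v1/api/'));  // 8443
```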

My scenario is not an unusual situation. I simply have a firewall forwarding port 80 (as well as 443) to the presto coordinator on port 8080. The above error is seen when trying to contact it using port 80 or port 443 (with ssl turned on).

Would you be able to add one of the above-mentioned fixes?

Thanks

TypeError if missing error callback when using success callback

Right now, the code only requires that either success or callback be defined:

if (!opts.success && !opts.callback)
throw {message: "callback function 'success' (or 'callback') not specified"};

However, if I define success and an error happens I get:

/foo/node_modules/presto-client/lib/presto-client/index.js:340
                    error_callback(error || response.error);
                    ^

TypeError: error_callback is not a function

The current documentation doesn't really make this clear that'll happen (to me), especially combined with #70 where success is documented to have an error as part of its callback signature.

Would it make sense to have error also be required if the success callback is used and callback is not defined? Or perhaps just provide a no-op function when defining error_callback, e.g. error_callback = opts.error || opts.callback || (() => {});? Or just update the README documentation?
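
The no-op fallback option could look like the sketch below. pickErrorCallback is a hypothetical name; this only illustrates the suggestion and is not the library's actual behavior.

```javascript
// Sketch of the no-op fallback; `pickErrorCallback` is a hypothetical name.
function pickErrorCallback(opts) {
  // Prefer `error`, then `callback`, then a function that does nothing.
  return opts.error || opts.callback || function () {};
}

var onError = function (err) { console.error(err); };
console.log(pickErrorCallback({ error: onError }) === onError); // true
console.log(typeof pickErrorCallback({ success: function () {} })); // function
```

The trade-off is that a silent no-op hides failures from callers who forgot to pass error, whereas requiring the callback surfaces the mistake early.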

Queries from Node client 100x slower than from Trino CLI

I'm using this Node client to query Trino running in a local container. Simple queries such as SELECT NULL; take 1-2 seconds as seen in the Trino query console, while the same query issued from the CLI complete in milliseconds. Is this expected, can I improve my configuration, or are there any improvements to the driver that can be made?

image
image

run error

/home/app/node_modules/presto-client/lib/presto-client/index.js:1
(function (exports, require, module, __filename, __dirname) { const { URL } = require('url') ;
^

SyntaxError: Unexpected token {
at exports.runInThisContext (vm.js:53:16)
at Module._compile (module.js:373:25)
at Object.Module._extensions..js (module.js:416:10)
at Module.load (module.js:343:32)
at Function.Module._load (module.js:300:12)
at Module.require (module.js:353:17)
at require (internal/module.js:12:17)
at Object.<anonymous> (/home/app/gleeman/node_modules/presto-client/index.js:3:11)
at Module._compile (module.js:409:26)
at Object.Module._extensions..js (module.js:416:10)

I got this error on new presto.Client.
I use node 4.9.1.

v. 0.6.0 issues connecting to Presto Starburst 302-E.11-AWS

We are trying to access Presto Starburst version 302-E.11-AWS. Using the same credentials with presto-cli-302 we are able to access it, but with presto-client we get the following error:

{
  message: 'Access Denied: Cannot select from table copilot_api.ad_content',
  errorCode: 4,
  errorName: 'PERMISSION_DENIED',
  errorType: 'USER_ERROR',
  failureInfo: {
    type: 'io.prestosql.spi.security.AccessDeniedException',
    message: 'Access Denied: Cannot select from table copilot_api.ad_content',
    suppressed: [],
    stack: [
      'io.prestosql.spi.security.AccessDeniedException.denySelectTable(AccessDeniedException.java:176)',
      'io.prestosql.spi.security.AccessDeniedException.denySelectTable(AccessDeniedException.java:171)',
      'io.prestosql.plugin.hive.security.SqlStandardAccessControl.checkCanSelectFromColumns(SqlStandardAccessControl.java:204)',
      'io.prestosql.plugin.base.security.ForwardingConnectorAccessControl.checkCanSelectFromColumns(ForwardingConnectorAccessControl.java:153)',
      'io.prestosql.plugin.hive.security.PartitionsAwareAccessControl.checkCanSelectFromColumns(PartitionsAwareAccessControl.java:141)',
      'io.prestosql.security.AccessControlManager.lambda$checkCanSelectFromColumns$66(AccessControlManager.java:617)',
      'io.prestosql.security.AccessControlManager.authorizationCheck(AccessControlManager.java:802)',
      'io.prestosql.security.AccessControlManager.checkCanSelectFromColumns(AccessControlManager.java:617)',
      'io.prestosql.sql.analyzer.Analyzer.lambda$null$0(Analyzer.java:81)',
      'java.util.LinkedHashMap.forEach(LinkedHashMap.java:684)',
      'io.prestosql.sql.analyzer.Analyzer.lambda$analyze$1(Analyzer.java:80)',
      'java.util.LinkedHashMap.forEach(LinkedHashMap.java:684)',
      'io.prestosql.sql.analyzer.Analyzer.analyze(Analyzer.java:79)',
      'io.prestosql.sql.analyzer.Analyzer.analyze(Analyzer.java:68)',
      'io.prestosql.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:211)',
      'io.prestosql.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:98)',
      'io.prestosql.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:761)',
      'io.prestosql.execution.SqlQueryManager.createQueryInternal(SqlQueryManager.java:361)',
      'io.prestosql.execution.SqlQueryManager.lambda$createQuery$4(SqlQueryManager.java:303)',
      'io.prestosql.$gen.Presto_302_e_11_aws____20190802_153802_1.run(Unknown Source)',
      'java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)',
      'java.util.concurrent.FutureTask.run(FutureTask.java:266)',
      'java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)',
      'java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)',
      'java.lang.Thread.run(Thread.java:748)'
    ]
  }
}

Can you please specify what version of Presto is supported by presto-client 0.6.0?

timeZoneId is null

I ran into a NullPointer problem upon running this very simple script:

var presto = require('presto-client');
var client = new presto.Client({ catalog: 'cassandra', schema: 'test' });

client.execute({ query: 'show tables' }, function(error, data, columns) {
  if (error) return console.log(error);
  console.log(data);
});

The error I get:

{ message: 'execution error:java.lang.NullPointerException: timeZoneId is null\n\tat java.util.Objects.requireNonNull(Objects.java:228)\n\tat com.facebook.presto.client.ClientSession.<init>(ClientSession.java:144) .........',
  error: null,
  code: 500 }

So I jumped into prestodb code and it turns out that the timeZoneId is required prestodb/presto@f8ceb17#diff-a804c427cefb2a1682c3bea281daf18aR68

I'm running prestodb 0.157.

Remove error from callbacks that don't return an error

Looking at the source code, I noticed that for the execute function the following callbacks:

  • state
  • columns
  • data

all document that their first argument is error. However, in all cases, they only ever receive null, and that if there was an error, it would have gone to the error/callback callback function, and then the above callbacks are never called.

My proposal would be to remove the first argument from those callbacks as it simplifies the call signature as well as better indicates to downstream users that they don't need to worry about any sort of error handling along those callbacks.

How to setup connection pool

Hi, there,

Is there a way to set up a connection pool? What are the default connection pool settings?

Thanks for the help!
Sam

How to provide certificate file (for secure connection) when connecting to presto?

I am facing this error while connecting to presto using the presto-client,
unable to get local issuer certificate

const readKeystore = jks.toPem(fs.readFileSync('C:/Users/userDir/Desktop/keystore.jks'), 'keyStorePassword');
const { cert, key } = readKeystore['prestossl'];

let hiveConfig: any = {}
hiveConfig.user = 'user_name'
hiveConfig.password = 'password'
hiveConfig.port = '8085'
hiveConfig.host = 'myhadoopuser.mydomain.com'
hiveConfig.ssl = {
    key : key,
    cert : cert
};
let presto = new Presto(hiveConfig)

Add CI tests for the library

Adding tests / CI was mentioned in #50, but I'm splitting this off into its own issue as I feel like this could be accomplished without also converting the library to TS.

Having a test suite that could be run against the library would be awesome. To accomplish this, my suggestion would be to utilize the testcontainers library to spin up a presto/trino container + postgres container that are networked together, and then can just run through the basics of the existing client methods. With this in place, would be easy to extend over time with more complex configurations.

Happy to contribute to this effort @tagomoris if a PR would be accepted following the above plan (or if you've got some other idea of how to do it).

Add support for Trino

PrestoSQL is now Trino and has changed a few bits of the protocol.
https://trino.io/blog/2020/12/27/announcing-trino.html

I'm getting an error on User missing as the server expects X-Trino headers instead of X-Presto.
Hack to get this lib working with Trino, run this before any request:

const Headers = require('presto-client/lib/presto-client/headers').Headers;
for (let key of Object.keys(Headers)) {
    Headers[key] = Headers[key].replace("Presto", "Trino");
}

Discussion: support kerberos authentication

I am trying to connect to a Trino server guarded with Kerberos authentication.

Currently the authorization header is designated here, in HTTP basic auth: https://github.com/tagomoris/presto-client-node/blob/master/lib/presto-client/index.js#L98

I think there are 2 ways to implement Kerberos support:

  1. Introduce a new option custom_auth in Client constructor opts for a custom authorization header string. When present, it will set authorization header as specified somewhere near here in the code. Developers must find their own way to generate the authorization header (for Kerberos it means using other libraries like kerberos or krb5.)
  2. Introduce an optional dependency kerberos and support kerberos authentication similar to PyHive's.

Personally I prefer 1 because it introduces less code change on the library. If this is preferred, I can submit a pull request on this :)
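
To make option 1 concrete, here is a rough sketch of how the header could be chosen. `custom_auth` is the option proposed above and does not exist in the library yet; `resolveAuthorizationHeader` is a hypothetical helper, not the code at the linked line.

```javascript
// Sketch of option 1: prefer a caller-supplied Authorization header value,
// fall back to HTTP basic auth, otherwise send no Authorization header.
function resolveAuthorizationHeader(opts) {
  if (typeof opts.custom_auth === 'string' && opts.custom_auth.length > 0) {
    // The caller generated this value, e.g. "Negotiate <token>" from a Kerberos library.
    return opts.custom_auth;
  }
  if (opts.basic_auth) {
    const credentials = opts.basic_auth.user + ':' + opts.basic_auth.password;
    return 'Basic ' + Buffer.from(credentials).toString('base64');
  }
  return undefined;
}

console.log(resolveAuthorizationHeader({ custom_auth: 'Negotiate abc123' })); // Negotiate abc123
```

With this shape the library never needs to know about Kerberos; it only forwards the string it was given.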

Presto-client not working with Presto 0.157.1

When trying to run presto-client with Presto version 0.157.1, I get the following error:

Unable to execute HTTP request: java.lang.RuntimeException:
  Unexpected error: java.security.InvalidAlgorithmParameterException:
    the trustAnchors parameter must be non-empty

Support variable precision time, timestamp, time with time zone, timestamp with time zone types

Over the last couple of releases (up to and including Presto 341), Presto gained support for variable precision time, timestamp, time with time zone, and timestamp with time zone types.
For backward compatibility reasons, these are not rendered with their actual precision to the client unless the client sends the proper X-Presto-Client-Capabilities header.

Add support for these, along with properly setting the header.

See more at trinodb/trino#1284

[HELP]: Access Denied: User pbennett cannot impersonate user paul

I keep running into this issue when trying to run a query. It seems the client is trying to use my computer's user name instead of my actual credentials. This is how I have set up the client:

const client = new Client({
  host: 'lga-xxx-adhoc.xxx.com',
  ssl: {
    rejectUnauthorized: true,
  },
  port: 8443,
  catalog: 'gridhive',
  schema: 'rpt',
  source: 'nodejs-client',
  basic_auth: {user: 'pbennett', password: 'xxx'},
})

I have used the official Presto Python client for a couple of projects, and this is how I connected there:

with prestodb.dbapi.connect(
        host=c.DBHOST,
        port=int(c.DBPORT),
        user=c.DBUSER,
        catalog="gridhive",
        schema="rpt",
        http_scheme="https",
        auth=prestodb.auth.BasicAuthentication(c.DBUSER, c.DBPWD),
    ) as conn:

The only real difference I see here is the http_scheme="https". I am also not sure which SSL option to use.

Any suggestions on a resolution would be great.
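
One likely cause, assuming the client falls back to the operating system account name when `user` is not set: the server authenticates `pbennett` through basic auth, but the query is submitted as a different user, which the server treats as impersonation. The Python snippet passes `user` explicitly, while the Node.js config above does not. The sketch below sets `user` to the basic auth user name; `withMatchingUser` is a hypothetical helper and not part of the library.

```javascript
// Sketch: make the query user match the authenticated user unless one was given explicitly.
function withMatchingUser(opts) {
  const copy = Object.assign({}, opts);
  if (copy.basic_auth && !copy.user) {
    copy.user = copy.basic_auth.user;
  }
  return copy;
}

const clientOptions = withMatchingUser({
  host: 'lga-xxx-adhoc.xxx.com',
  port: 8443,
  basic_auth: { user: 'pbennett', password: 'xxx' },
});
console.log(clientOptions.user); // pbennett

// Usage: const client = new Client(clientOptions);
```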

Unable to configure SQL preprocessing using 'prepares'. "message: 'Prepared statement not found: query0'"

Has anyone encountered a similar problem?

This is my code :

    this.trino.execute({
      query: "EXECUTE query0 USING TIMESTAMP'2022-05-14 00:00:00',TIMESTAMP'2022-05-17 00:00:00'",
      prepares: ['select * from localdevmysql217.binglog.appointmentindex_202108 where between ? and ?'],
      state: function (error, query_id, stats) {
        console.log({ message: 'status changed', id: query_id, stats: stats });
      },
      columns: function (error, data) {
        console.log({ resultColumns: JSON.stringify(data) });
      },
      data: function (error, data, columns, stats) {
        console.log('data: ', data);
      },
      success: function (error, stats) {
        console.log(stats);
      },
      error: function (error) {
        console.log(error);
      },
    });

This is the return result:

{
  message: 'Prepared statement not found: query0',
  errorCode: 5,
  errorName: 'NOT_FOUND',
  errorType: 'USER_ERROR',
  failureInfo: {
    type: 'io.trino.spi.TrinoException',
    message: 'Prepared statement not found: query0',
    suppressed: [],
    stack: [
      'io.trino.util.Failures.checkCondition(Failures.java:64)',
      'io.trino.Session.getPreparedStatement(Session.java:277)',
      'io.trino.Session.getPreparedStatementFromExecute(Session.java:271)',
      'io.trino.execution.QueryPreparer.prepareQuery(QueryPreparer.java:65)',
      'io.trino.execution.QueryPreparer.prepareQuery(QueryPreparer.java:56)',
      'io.trino.dispatcher.DispatchManager.createQueryInternal(DispatchManager.java:180)',
      'io.trino.dispatcher.DispatchManager.lambda$createQuery$0(DispatchManager.java:149)',
      'io.trino.$gen.Trino_375____20220520_013819_2.run(Unknown Source)',
      'java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)',
      'java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)',
      'java.base/java.lang.Thread.run(Thread.java:829)'
    ]
  }
}
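
One possible explanation, not confirmed in this thread, is that the prepared statements are sent under an X-Presto- header name that a Trino server ignores (see the Trino header issue above). As far as I understand the client protocol, prepared statements travel in a request header whose value is a comma-separated list of `name=<URL-encoded SQL>` pairs. The sketch below shows that encoding under this assumption; `buildPreparedStatementHeader` is a hypothetical helper, and the header name is passed in rather than hard-coded.

```javascript
// Sketch: encode an array of statements as query0=<sql>,query1=<sql>,...
// The wire format and header name are assumptions to verify against the server docs.
function buildPreparedStatementHeader(prepares, headerName) {
  const value = prepares
    .map(function (sql, index) {
      return 'query' + index + '=' + encodeURIComponent(sql);
    })
    .join(',');
  const header = {};
  header[headerName] = value;
  return header;
}

const header = buildPreparedStatementHeader(['select 1'], 'X-Trino-Prepared-Statement');
console.log(header['X-Trino-Prepared-Statement']); // query0=select%201
```

Comparing the header actually sent by the client with what the server expects should show whether the name mismatch is the cause.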

Support for Authorization Header on all requests

Perhaps adding the following to Client.prototype.request function will enable use-cases that require auth on all requests:

in index.js, Client.prototype.request function

if (opts instanceof Object) {
    // other code as normal
    if (client.password && client.user) {
        // Buffer.from replaces the deprecated `new Buffer(...)` constructor
        opts.headers[Headers.AUTHORIZATION] = 'Basic ' + Buffer.from(client.user + ":" + client.password).toString("base64");
    }
    // other code as normal
}

This would require the headers.js file to also include the following:

Headers.AUTHORIZATION = 'Authorization';

The potential issue with this however, is for nextUri requests. nextUri requests enter the request function as a string, and not as an object. Hence, perhaps this could be a solution to ensure that nextUri requests still pass an Authorization header:

// Top of index.js
var parseUrl = require('parse-url');

// Within Client.prototype.request function
if (opts instanceof Object) {
   // omitting for this example
  } else {
// must be a nextUri request - hence it is a string that must have its protocol, host, port, pathname extracted and an AUTH header passed through - ONLY if a client.password is used
    if (client.password){
      var href = parseUrl(opts); 
      opts = {}
      opts.host = href.resource
      opts.port = href.port
      opts.protocol = client.protocol;
      opts.path = href.pathname;
      opts.headers = {};
      opts.headers[Headers.AUTHORIZATION] = 'Basic ' + Buffer.from(client.user + ":" + client.password).toString("base64");
    }
  }
// Continue with request
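
The same idea can be sketched without the extra `parse-url` dependency by using Node's built-in `URL` class. This is a standalone illustration, not the library's code: `nextUriToRequestOptions` is a hypothetical helper, and the example URI is made up.

```javascript
// Sketch: turn a nextUri string into request options that carry a Basic auth header.
function nextUriToRequestOptions(nextUri, user, password) {
  const url = new URL(nextUri);
  const opts = {
    protocol: url.protocol, // e.g. 'https:'
    host: url.hostname,
    port: url.port ? Number(url.port) : undefined, // undefined lets the default port apply
    path: url.pathname + url.search,
    headers: {},
  };
  if (user && password) {
    opts.headers['Authorization'] =
      'Basic ' + Buffer.from(user + ':' + password).toString('base64');
  }
  return opts;
}

const nextOpts = nextUriToRequestOptions(
  'https://host.example:8443/v1/statement/some-query-id/1',
  'user',
  'pass'
);
console.log(nextOpts.path); // /v1/statement/some-query-id/1
```

Taking the protocol from the URI itself, rather than from the client configuration, keeps the request consistent with whatever the server returned.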

Client password would be entered like so within the calling code:

var client = new presto.Client({
  user: 'user', password: 'pass', port: 8443, host: 'host.example',
  catalog: 'hive', schema: 'default',
  ssl: { cert: '/usr/local/share/ca-certificates/presto.crt' }
});

Please comment or review

Retrieving query status and cancelling query

It seems that the kill and query commands call the "v1/query/" endpoint, but I'm getting the following 404 (path not found) error:
{"timestamp": "...", "status": 404, "error": "Not Found", "path": "v1/query/{query-id}"}

I noticed that in Presto + Trino documentation, they say that these get and delete requests should be made to the nextUri, which include the "v1/statement" path as opposed to above: https://prestodb.io/docs/current/develop/client-protocol.html

columns callback but no data callback

What does it mean that I get the columns() callback with stuff that looks ok, but the data() callback is never called? The final "callback()" is called, with no error.

I can use hive-driver to do the same query on my hadoop server. I launch the presto server with a hive.properties in catalogs that looks like:

connector.name=hive-hadoop2
hive.metastore.uri=thrift://hadoop-master:9083
hive.s3.ssl.enabled=false
hive.s3.path-style-access=true
hive.s3.endpoint=http://moto-server:5000

My presto client:

client = new presto.Client({
  schema: 'default',
  catalog: 'hive',
  source: 'nodejs-client',
});

Do I need something else for "schema"?

My table in hive looks like

CREATE EXTERNAL TABLE `chunks`(
 ... )
ROW FORMAT SERDE
  'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
  'paths'='attributes,code,data,fwVersion,shoe,timestamp,ts,tz,user')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3a://zc-bigdata/'

Rewrite this library in TypeScript

This is my potential personal project - to complete things like:

  • Type definitions #28
  • Tests for minimum cases
  • CIs w/ recent Presto/Trino

The current release & crash steps are not modern ones (disclosure - this project started as my personal&trial project in the far past :P)

How to set resultset size?

Hi, there,

For a big query with a huge number of records/rows (a couple of million), it takes a very long time to retrieve all the results (it has been 20 minutes and it is still running). I believe that if we could set the result set size, that is, how many rows are retrieved per fetch, it would boost performance a lot. But I can't find any documentation about such a setting. Could anyone let me know how to do that?

Thanks,
Sam

Clarification needed on response

DATA callback method:
data [function(error, data, columns, stats) :optional] - called per fetch of query results (may be called 2 or more times)

  1. Is it possible to get results in a single callback?
  2. Is it possible to get result data in SUCCESS callback method? - how to differentiate 201 and 204?
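
For question 1, one approach is to accumulate rows across the `data` calls and hand them over once `success` fires. The sketch below uses the callback signatures quoted above; `executeAndCollect` is a hypothetical wrapper, and `fakeClient` is a stand-in that only imitates the callback order so the example runs without a server.

```javascript
// Sketch: gather all fetched rows and deliver them in a single callback.
function executeAndCollect(client, opts, done) {
  const rows = [];
  let resultColumns = null;
  client.execute(Object.assign({}, opts, {
    columns: function (error, columns) {
      if (!error) resultColumns = columns;
    },
    data: function (error, data, columns) {
      if (error) return;
      resultColumns = resultColumns || columns;
      for (const row of data) rows.push(row); // one call per fetch, so append
    },
    success: function (error, stats) {
      done(error || null, rows, resultColumns, stats);
    },
    error: function (error) {
      done(error);
    },
  }));
}

// Stand-in client: emits columns, two data batches, then success.
const fakeClient = {
  execute: function (opts) {
    const columns = [{ name: 'cnt', type: 'bigint' }];
    opts.columns(null, columns);
    opts.data(null, [[1], [2]], columns, {});
    opts.data(null, [[3]], columns, {});
    opts.success(null, { state: 'FINISHED' });
  },
};

executeAndCollect(fakeClient, { query: 'SELECT ...' }, function (error, rows) {
  console.log(rows.length); // 3
});
```

This holds every row in memory, so it suits small and medium result sets better than queries that return millions of rows.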

version loading throws an error when bundled

I'm using presto-client in a project that bundles code using Webpack, and the following line throws:

lib.version = JSON.parse(fs.readFileSync(__dirname + '/package.json')).version;

I've created a patch locally using yarn but others might run into this too. Node.js and Webpack support importing package.json directly which should be well supported by other bundlers that support Node.js:

diff --git a/index.js b/index.js
index 0f5fe1cb2b9ce459711f1eb29eaa002405f0fd60..e77aed24d74cf5b5eb3e3c9a4d14b2631516ce9f 100644
--- a/index.js
+++ b/index.js
@@ -1,6 +1,6 @@
 var fs = require('fs');
 
 var lib = require('./lib/presto-client');
-lib.version = JSON.parse(fs.readFileSync(__dirname + '/package.json')).version;
+lib.version = require('./package.json').version;
 
 exports.Client = lib.Client;
