amazon-qldb-driver-python's Introduction

Amazon QLDB Python Driver

This is the Python driver for Amazon Quantum Ledger Database (QLDB), which allows Python developers to write software that makes use of Amazon QLDB.

For getting started with the driver, see Python and Amazon QLDB.

Requirements

Basic Configuration

See Accessing Amazon QLDB for information on connecting to AWS.

Required Python versions

Pyqldb 2.x requires Python 3.4 or later.

Pyqldb 3.x requires Python 3.6 or later.

For details on installing Python, see the official Python documentation.

Getting Started

Please see the Quickstart guide for the Amazon QLDB Driver for Python.

First, install the driver using pip:

pip install pyqldb

Then from a Python interpreter, call the driver and specify the ledger name:

from pyqldb.driver.qldb_driver import QldbDriver

qldb_driver = QldbDriver(ledger_name='test-ledger')

for table in qldb_driver.list_tables():
    print(table)

See Also

  1. Getting Started with Amazon QLDB Python Driver: A guide that gets you started with executing transactions with the QLDB Python driver.
  2. QLDB Python Driver Cookbook: The cookbook provides code samples for some simple QLDB Python driver use cases.
  3. Amazon QLDB Python Driver Tutorial: In this tutorial, you use the QLDB Driver for Python to create an Amazon QLDB ledger and populate it with tables and sample data.
  4. Amazon QLDB Python Driver Samples: A DMV-based example application that demonstrates how to use QLDB with the QLDB Driver for Python.
  5. The QLDB Python driver accepts and returns Amazon Ion documents. Amazon Ion is a richly-typed, self-describing, hierarchical data serialization format offering interchangeable binary and text representations. For more information, read the Ion docs.
  6. Amazon QLDB supports the PartiQL query language. PartiQL provides SQL-compatible query access across multiple data stores containing structured data, semistructured data, and nested data. For more information, read the PartiQL docs.
  7. Refer to the section Common Errors while using the Amazon QLDB Drivers, which describes runtime errors that can be thrown by the Amazon QLDB Driver when calling the qldb-session APIs.

Development

Setup

Assuming that you have Python and virtualenv installed, set up your environment and install the dependencies as follows, instead of using the pip install pyqldb command shown above:

$ git clone https://github.com/awslabs/amazon-qldb-driver-python
$ cd amazon-qldb-driver-python
$ virtualenv venv
...
$ . venv/bin/activate
$ pip install -r requirements.txt
$ pip install -e .

Running Tests

You can run the unit tests with this command:

$ pytest --cov-report term-missing --cov=pyqldb tests/unit

You can run the integration tests with this command:

$ pytest tests/integration

Documentation

Sphinx is used for documentation. You can generate HTML locally with the following:

$ pip install -r requirements-docs.txt
$ pip install -e .
$ cd docs
$ make html

Getting Help

Please use these community resources for getting help.

  • Ask a question on StackOverflow and tag it with the amazon-qldb tag.
  • Open a support ticket with AWS Support.
  • Make a new thread at AWS QLDB Forum.
  • If you think you may have found a bug, please open an issue.

Opening Issues

If you encounter a bug with the Amazon QLDB Python Driver, we would like to hear about it. Please search the existing issues to see if others are experiencing the same problem before opening a new issue. When opening a new issue, we will need the version of the Amazon QLDB Python Driver, the Python language version, and the OS you're using. Please also include a reproduction case for the issue when appropriate.

The GitHub issues are intended for bug reports and feature requests. For help and questions with using Amazon QLDB Python Driver, please make use of the resources listed in the Getting Help section. Keeping the list of open issues lean will help us respond in a timely manner.

License

This library is licensed under the Apache 2.0 License.

amazon-qldb-driver-python's People

Contributors

allimn, amazon-auto, amzn-paunort, aurghob33, aws-qldb-github-bot, battesonb, butleragrant, bwinchester, byronlin13, danieledwardknudsen, dependabot[bot], dominitio, guyilin-amazon, plasmaintec, saumehta9, shuiwan, shushen, simonz-bq, trstephen-amazon

amazon-qldb-driver-python's Issues

Rename `QldbDriver` property `pool_limit` to `max_concurrent_transactions`

The property pool_limit defines the maximum number of sessions a driver instance can have.
Amazon QLDB currently supports a single transaction at a time on each session, which means that another transaction cannot be started on a session while a transaction is already open on it. In turn, the pool_limit property governs the maximum number of transactions that can be open at a time on a driver instance. Renaming pool_limit to max_concurrent_transactions would therefore be more intuitive to developers.

Before:

from pyqldb.driver.pooled_qldb_driver import PooledQldbDriver

qldb_driver = PooledQldbDriver(ledger_name='vehicle-registration', pool_limit=10)

After:

from pyqldb.driver.qldb_driver import QldbDriver

qldb_driver = QldbDriver(ledger_name='vehicle-registration', max_concurrent_transactions=10)
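
The relationship between a session limit and the number of open transactions can be sketched without the driver. The following is a toy model only, not the driver's implementation: the SessionPool class, its run method, and the peak counter are hypothetical names used to show that a pool of N single-transaction sessions never has more than N transactions open at once.

```python
import threading
import time


class SessionPool:
    """Toy pool: each session carries at most one open transaction."""

    def __init__(self, max_concurrent_transactions):
        # One semaphore slot per session in the pool.
        self._slots = threading.BoundedSemaphore(max_concurrent_transactions)
        self._lock = threading.Lock()
        self.open_transactions = 0
        self.peak = 0

    def run(self, work):
        with self._slots:  # blocks while every session is busy
            with self._lock:
                self.open_transactions += 1
                self.peak = max(self.peak, self.open_transactions)
            try:
                return work()
            finally:
                with self._lock:
                    self.open_transactions -= 1


pool = SessionPool(max_concurrent_transactions=2)
threads = [
    threading.Thread(target=pool.run, args=(lambda: time.sleep(0.01),))
    for _ in range(8)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print("peak open transactions:", pool.peak)
```

Eight workers share two slots, so the recorded peak never exceeds the configured limit.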

IonPyBool value gets converted to IonPyInt

Description

Documents on QLDB with boolean values get automatically converted to IonPyInt upon SELECT. Expected behavior is to retrieve an IonPyBool value instead.

Example

Sample document stored on Amazon QLDB:

{
  "doc_id": "1",
  "conditions": {
    "1_monthly_balance": {
      "product_id": "1",
      "consumption_metric": true,
      "indicator": "is_active"
    }
  }
}

Sample document retrieved after SELECT statement:

{
  "doc_id": "1",
  "conditions": {
    "1_monthly_balance": {
      "product_id": "1",
      "consumption_metric": 1,
      "indicator": "is_active"
    }
  }
}

Environment

  • Amazon QLDB Python Driver Version: 3.2.2
  • Python language version: 3.9
  • OS you’re using: Testing on an AWS Lambda function
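
One plausible reading of this behaviour, offered as an assumption rather than a confirmed diagnosis: Python's bool cannot be subclassed, so a wrapper type that carries extra metadata has to derive from int, and an int subclass serializes as 1 or 0. The sketch below uses a hypothetical WrappedBool class as a stand-in for such a type, and a hypothetical restore_bools helper that converts the values back to plain booleans.

```python
import json


class WrappedBool(int):
    """Hypothetical stand-in for a boolean wrapper type derived from int."""

    def __new__(cls, value):
        return super().__new__(cls, bool(value))


def restore_bools(node, bool_type=WrappedBool):
    """Recursively replace wrapper instances with plain Python bools."""
    if isinstance(node, bool_type):
        return bool(node)
    if isinstance(node, dict):
        return {key: restore_bools(value, bool_type) for key, value in node.items()}
    if isinstance(node, list):
        return [restore_bools(value, bool_type) for value in node]
    return node


doc = {
    "doc_id": "1",
    "conditions": {
        "1_monthly_balance": {"consumption_metric": WrappedBool(True)}
    },
}

print(json.dumps(doc))                 # the wrapper is written as an integer
print(json.dumps(restore_bools(doc)))  # the plain bool is written as true
```

If the driver's boolean type behaves like this stand-in, passing the real type as bool_type would apply the same conversion.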

Simplify interfaces - Move Session pooling functionality to `QldbDriver` and remove `PooledQldbDriver`

Currently, the interfaces of pyqldb provide developers with too many ways to do things: PooledQldbDriver and QldbDriver, driver interfaces, session interfaces, and transaction interfaces. This creates considerable confusion for developers.

In version 2.x, we provided two interfaces for the driver: the standard QldbDriver and a PooledQldbDriver. As the name suggests, the PooledQldbDriver maintained a pool of sessions, which allowed you to reuse the underlying connections.
Over time, we realized that customers would simply want to use the pooling mechanism for its benefits instead of the standard driver. Therefore, we propose that the PooledQldbDriver be removed and the pooling functionality be moved to QldbDriver.

Before:

from pyqldb.driver.pooled_qldb_driver import PooledQldbDriver

qldb_driver = PooledQldbDriver(ledger_name='vehicle-registration')

After:

from pyqldb.driver.qldb_driver import QldbDriver

qldb_driver = QldbDriver(ledger_name='vehicle-registration')

Possible Injection Issue

In the sample application from the docs, you have an example of a query where you are formatting the query string directly with input for the table, field and value. I have a similar query in my codebase and it gets flagged by our static code analyzer for possible SQL injection, since the user input is directly touching the query string.

It is currently possible with the driver to parameterize the value and pass it as an argument to the transaction_executor, but not the table or field. Are there any plans to fully parameterize the PartiQL query to prevent injection?

Is injection even possible in PartiQL?
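
Until identifiers can be parameterized, one common mitigation is to check the table and field against an allow-list before formatting them into the statement, while still binding the value with a ? placeholder. This is a minimal sketch under that assumption; ALLOWED_FIELDS, build_select, and the table and field names are hypothetical and not part of the driver.

```python
# Hypothetical allow-list: table name -> fields that may appear in a WHERE clause.
ALLOWED_FIELDS = {
    "Vehicle": {"VIN", "Make"},
    "Person": {"GovId"},
}


def build_select(table, field):
    """Return a statement whose identifiers come only from the allow-list."""
    if table not in ALLOWED_FIELDS:
        raise ValueError("unknown table: {!r}".format(table))
    if field not in ALLOWED_FIELDS[table]:
        raise ValueError("unknown field for {}: {!r}".format(table, field))
    # The value itself is never formatted in; it is bound to the placeholder.
    return "SELECT * FROM {} AS t WHERE t.{} = ?".format(table, field)


print(build_select("Vehicle", "VIN"))
```

The returned statement would then be passed to the executor together with the user-supplied value as a bound parameter.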

Support ion-python 0.7.0 for IonToJSONEncoder support

I'm working on a simple application based upon QLDB in python, and my clients are used to accepting JSON rather than ION. Currently it's not possible to use this driver and the IonToJSONEncoder support in ion-python, so I either have to manually down-convert, or the clients will have to understand ION responses.

I'm not familiar enough with the projects to say for sure, but the list of changes for 0.6.0 and 0.7.0 of ion-python are pretty small, so it seems unlikely that it would break anything.
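
A manual down-conversion of the kind mentioned above could look like the following sketch. It relies only on the standard json module and assumes that the values to be converted behave like Python's Decimal, datetime, and bytes; the DownConvertEncoder name and the chosen string representations are illustrative choices, not ion-python behaviour.

```python
import base64
import json
from datetime import datetime
from decimal import Decimal


class DownConvertEncoder(json.JSONEncoder):
    """Convert a few non-JSON types to strings; a lossy, illustrative mapping."""

    def default(self, o):
        if isinstance(o, Decimal):
            return str(o)
        if isinstance(o, datetime):
            return o.isoformat()
        if isinstance(o, (bytes, bytearray)):
            return base64.b64encode(bytes(o)).decode("ascii")
        return super().default(o)


record = {"name": "Cake", "ppu": Decimal("0.55"), "created": datetime(2020, 1, 2, 3, 4, 5)}
print(json.dumps(record, cls=DownConvertEncoder))
```

Type information such as decimal precision or annotations is lost in this mapping, which is one reason a dedicated encoder is preferable where available.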

Remove `QldbDriver` property `pool_timeout`.

The pool_timeout property controlled the amount of time the driver waited for a session to become available when the pool limit was reached and no session was present in the pool. The driver would throw a SessionPoolEmpty error if it had waited for the duration of pool_timeout and could not get a session. We found that this parameter was not very useful.
Going forward, developers will no longer need to configure this parameter; the wait will be internal to the pooling logic.
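
For readers unfamiliar with the behaviour being removed, the wait-then-fail pattern can be sketched with a semaphore. This is an illustration only: the SessionPoolEmpty class here is a local stand-in with the same name as the error mentioned above, and acquire_session is a hypothetical helper, not driver code.

```python
import threading


class SessionPoolEmpty(Exception):
    """Local stand-in for the error raised when no session becomes available."""


def acquire_session(slots, timeout_seconds):
    """Wait up to timeout_seconds for a free slot, then give up."""
    if not slots.acquire(timeout=timeout_seconds):
        raise SessionPoolEmpty(
            "no session available after {} seconds".format(timeout_seconds)
        )


slots = threading.BoundedSemaphore(1)  # a pool with a single session
acquire_session(slots, timeout_seconds=0.05)  # succeeds immediately

try:
    acquire_session(slots, timeout_seconds=0.05)  # pool exhausted, times out
except SessionPoolEmpty as error:
    print("timed out:", error)
```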

Invalid ledger name in the README sample code

The README says qldb_driver = PooledQldbDriver(ledger_name='test_ledger') but test_ledger is not a valid name. When you try to use the sample code you get:

...
botocore.errorfactory.BadRequestException: An error occurred (BadRequestException) when calling the SendCommand operation: 1 validation error detected: Value 'test_ledger' at 'startSession.ledgerName' failed to satisfy constraint: Member must satisfy regular expression pattern: (?!^.*--)(?!^[0-9]+$)(?!^-)(?!.*-$)^[A-Za-z0-9-]+$
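
The pattern quoted in the error message can be checked locally before calling the service. The sketch below copies that pattern verbatim; the is_valid_ledger_name function is a hypothetical helper, and the service may enforce further constraints (such as length) that this pattern does not express.

```python
import re

# Pattern copied from the BadRequestException message above.
LEDGER_NAME_PATTERN = re.compile(
    r"(?!^.*--)(?!^[0-9]+$)(?!^-)(?!.*-$)^[A-Za-z0-9-]+$"
)


def is_valid_ledger_name(name):
    """Return True if the name satisfies the quoted pattern."""
    return LEDGER_NAME_PATTERN.fullmatch(name) is not None


for candidate in ("test_ledger", "test-ledger", "12345", "-ledger", "ledger-", "my--ledger"):
    print(candidate, is_valid_ledger_name(candidate))
```

Underscores are rejected because the character class only allows letters, digits, and hyphens, which is why test-ledger works where test_ledger does not.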

Remove interfaces which allow developers to get a session from the pool and execute transaction.

Remove interfaces which allow developers to grab a session from the pool and execute a transaction.

The driver provides too many options to execute a transaction. This makes it difficult to use and to decide how and when to use one option over another. Also, the session and transaction interfaces require developers to take care of retries themselves, which seems to be an unnecessary overhead. We should remove the get_session() method on the driver interface. This means developers will no longer be able to get a session from the pool and execute transactions. Instead, developers should use execute_lambda on the driver interface, which does all the work of getting a session, implicitly starting and committing transactions, and retrying. Also, the QldbSession, PooledQldbSession and Transaction classes should either be made internal/private or removed, since they serve no purpose as far as public interfaces are concerned.

Before


try:
    with qldb_driver.get_session() as session:
        transaction = session.start_transaction()
        transaction.execute_statement("CREATE TABLE Person")
        transaction.commit()
except Exception as e:
    ...

After

def create_table(transaction_executor):
    # Transaction started implicitly, no need to explicitly start transaction
    transaction_executor.execute_statement("CREATE TABLE Person")
    # Transaction committed implicitly on return,
    # no need to explicitly commit transaction


qldb_driver.execute_lambda(lambda executor: create_table(executor))

Also the method execute_statement should be removed from QldbDriver.

To execute a statement, developers should use the QldbDriver.execute_lambda method that takes a lambda function. Check the Cookbook for more examples.

error parameters

I get this error: Query expects 4 parameters, but statement has 1. This is my code:

obj = {
    "statement": 'UPDATE {} AS r SET r.signatures = ?, r.isSigned = ? WHERE r.DocumentID = ? and r.userName=?'.format(os.environ["QlDBTableDocuments"]),
    "parameters": (ion.convert_object_to_ion(sigantures),
                   ion.convert_object_to_ion(isSigned),
                   documentId,
                   userName)
}
session.execute_statement(obj["statement"], obj["parameters"])
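
A likely cause, assuming execute_statement takes its parameters as variadic positional arguments (statement, *parameters): passing the tuple itself supplies one argument, not four. The stub below is a local stand-in that only counts placeholders and arguments; it is not the driver's method, and the table name and values are made up for illustration.

```python
def execute_statement(statement, *parameters):
    """Stand-in: checks that the argument count matches the placeholders."""
    expected = statement.count("?")
    if expected != len(parameters):
        raise ValueError(
            "statement has {} placeholders, got {} parameters".format(
                expected, len(parameters)
            )
        )
    return parameters


statement = (
    "UPDATE Documents AS r SET r.signatures = ?, r.isSigned = ? "
    "WHERE r.DocumentID = ? AND r.userName = ?"
)
params = (["sig-1"], True, "doc-1", "alice")

try:
    execute_statement(statement, params)  # the whole tuple counts as one argument
except ValueError as error:
    print("tuple passed as-is:", error)

print("unpacked:", execute_statement(statement, *params))  # four arguments
```

Under that assumption, unpacking the tuple with * at the call site supplies one argument per placeholder.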

Support for Python >3.10 on Windows

We've identified an issue building this driver with versions of Python newer than 3.10 on Windows. See this build failure. A sample error:

C:\Users\runneradmin\AppData\Local\Temp\pip-install-_pw79w3w\amazon-ion_4634f832fc9049cb90a06b861a617272\ion-c\tools\ionsymbols\ionsymbols.c(573,18): warning C4477: 'fprintf' : format string '%ld' requires an argument of type 'long', but variadic argument 3 has type 'int64_t' [C:\Users\runneradmin\AppData\Local\Temp\pip-install-_pw79w3w\amazon-ion_4634f832fc9049cb90a06b861a617272\ion-c\tools\ionsymbols\ionsymbols.vcxproj]

Haven't done a thorough triage here. I will point out that the amazon-ion/ion-python repo only builds for Python up to 3.11 and excludes Windows. Some issues that look related:

amazon.ion 0.10.0 introduces breaking change

Following the QLDB Python installation instructions installed amazon.ion 0.10.0 as a dependency (released Feb 13, 2023). Running the code produced the following error:

ImportError: cannot import name 'Enum' from 'amazon.ion.util'

code from the tutorial that I ran:

from pyqldb.driver.qldb_driver import QldbDriver

qldb_driver = QldbDriver(ledger_name='test-ledger')

for table in qldb_driver.list_tables():
    print(table)

Had to roll back to amazon.ion==0.9.3 to fix the issue.

Cheers,
Lee

Move method `list_tables` to the driver instance

The Session class exposes a helper method called list_tables which lists all the tables present in your ledger. The proposal is to migrate this method to the QldbDriver.

Before

session = driver.get_session()
tables = session.list_tables()

Now

tables = driver.list_tables()

Feature Request: QLDBDriver logs PartiQL statements used in SendCommand operation

Currently working on getting QLDB set up in my project and finding it difficult to debug my PartiQL statements being sent to the executor. Getting errors like

botocore.errorfactory.BadRequestException: An error occurred (BadRequestException) when calling the SendCommand operation: Parser Error: at line 2, column 29: Invalid path component, expecting either an IDENTIFIER or STAR, got: QUESTION_MARK with value: 
1; Expected identifier for simple path

But it's difficult to understand where the syntax error is. Could you please add an option in the QLDBDriver to log the PartiQL statement sent in SendCommand to debug/info level on a StreamHandler?
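
Until such an option exists, a thin wrapper around the executor can log each statement before delegating. This is a sketch under assumptions: LoggingExecutor and FakeExecutor are hypothetical names, the wrapper assumes the wrapped object exposes execute_statement(statement, *parameters), and only the parameter count is logged so that values do not leak into logs.

```python
import logging

logger = logging.getLogger("partiql.statements")


class LoggingExecutor:
    """Hypothetical wrapper that logs statements before delegating."""

    def __init__(self, executor):
        self._executor = executor

    def execute_statement(self, statement, *parameters):
        logger.debug(
            "PartiQL statement: %s (parameter count: %d)", statement, len(parameters)
        )
        return self._executor.execute_statement(statement, *parameters)


class FakeExecutor:
    """Stand-in for a real transaction executor, used only for the demo."""

    def execute_statement(self, statement, *parameters):
        return []


logging.basicConfig(level=logging.DEBUG)  # StreamHandler on the root logger
LoggingExecutor(FakeExecutor()).execute_statement(
    "SELECT * FROM Person AS p WHERE p.GovId = ?", "ABC-123"
)
```

With a real driver, the wrapper would be applied inside the function passed to execute_lambda, for example by wrapping the executor argument before issuing statements.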

'QldbDriver' object has no attribute '_retry_limit'

The above error is always present in the code, but it only becomes a problem during an OCC failure.

@property
def retry_limit(self):
    """
    The number of automatic retries for statement executions using convenience methods on sessions when
    an OCC conflict or retriable exception occurs.
    """
    return self._retry_limit

Should be something like:
return self._retry_config._retry_limit

pip install pyqldb==3.1.0
lib/python3.7/site-packages/pyqldb/driver/qldb_driver.py
pyqldb.driver.qldb_driver.QldbDriver.retry_limit
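
The suggested change amounts to delegating the property to the retry configuration object. The classes below are simplified stand-ins written for illustration (the names RetryConfigSketch and DriverSketch are hypothetical) and do not reproduce the driver's actual constructors.

```python
class RetryConfigSketch:
    """Stand-in for a retry configuration holding the limit."""

    def __init__(self, retry_limit=4):
        self._retry_limit = retry_limit

    @property
    def retry_limit(self):
        return self._retry_limit


class DriverSketch:
    """Stand-in for a driver that no longer stores _retry_limit itself."""

    def __init__(self, retry_config=None):
        self._retry_config = retry_config or RetryConfigSketch()

    @property
    def retry_limit(self):
        # Delegate instead of reading a missing self._retry_limit attribute.
        return self._retry_config.retry_limit


print(DriverSketch().retry_limit)
print(DriverSketch(RetryConfigSketch(retry_limit=2)).retry_limit)
```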

Incompatible with version amazon.ion v0.10.0

Hi,
We encountered this bug. It was working with amazon.ion v0.9.3.

[ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_function': cannot import name 'Enum' from 'amazon.ion.util' (/var/task/amazon/ion/util.py)
Traceback (most recent call last):

OCC conflict or retriable exception

I'm trying to insert and update several docs in a QLDB table. My lambda function receives 2 lists with the docs. Each list has a maximum of 500 docs. First I insert the new docs and then I want to update other docs. Here is how I'm doing that:

new_docs = event['new_docs']
updated_docs = event['updated_docs']

pooled_qldb_driver = PooledQldbDriver(ledger_name)
try:
    with pooled_qldb_driver.get_session() as session:
        if len(new_docs) != 0:
            insert(session, table_name, new_docs)

        if len(updated_docs) != 0:
            update(session, table_name, updated_docs)
except Exception as e:
    print('Error inserting or updating documents. ')
    print(e)

I'm testing the lambda with 2 lists at maximum capacity (500 docs). The updated_docs list has the same docs as the new_docs list with some values modified (that's a possible case for my lambda function). To insert the list of new documents, I split the list into batches of 40 docs and then insert each batch into the QLDB table. But to update the documents, I'm updating one document at a time. When I execute my lambda this way, I get the following error:

OCC conflict or retriable exception occurred: An error occurred (OccConflictException) when calling the SendCommand operation: Optimistic concurrency control (OCC) failure encountered while committing the transaction

I wonder if that error is caused because the lambda is trying to update the documents before the insertion finishes, or because the list of documents is big and they are updated one at a time.

So, my questions are:

  1. What is causing this error?
  2. Is there a way to update multiple documents in one statement?
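
The batching step described above (splitting up to 500 docs into groups of 40) can be written as a small generator. This sketch covers only the splitting; the batches function name is hypothetical, and how each batch is written to the ledger is left out.

```python
def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


new_docs = [{"doc_id": str(i)} for i in range(500)]
for batch in batches(new_docs, 40):
    # Each batch would be inserted in its own statement here.
    print(len(batch), "docs starting at", batch[0]["doc_id"])
```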

Simplify Interfaces - Remove QldbDriver.execute_statement.

QldbDriver.execute_statement does not really serve a different purpose than QldbDriver.execute_lambda; in fact, anything that can be achieved by execute_statement can easily be achieved by execute_lambda. As part of the simplification of interfaces, the proposal here is to remove QldbDriver.execute_statement.

Before

qldb_driver.execute_statement("CREATE TABLE Person")

After

qldb_driver.execute_lambda(lambda txn: txn.execute_statement("CREATE TABLE Person"))

Note: The execute_statement method on Executor will remain untouched.

Add support for defining custom retry backoffs.

The proposal here is to allow developers to define their own backoff strategy when a transaction is retried for reasons like OCC failures or timeouts. This should be optional, and a default backoff should be used if none is defined.

API Specs:

class pyqldb.config.retry_config.RetryConfig(retry_limit=4, base=10, custom_backoff=None)
	Retry and Backoff Config for QldbDriver

	Parameters
	retry_limit (int) – The number of automatic retries for statement executions 
	using pyqldb.driver.qldb_driver.QldbDriver.execute_lambda() when an OCC conflict 
	or retriable exception occurs. This value must not be negative.

	base (int) – The base number of milliseconds to use in the exponential backoff 
	for operation retries. Defaults to 10 ms.

	custom_backoff (function) – A custom function that accepts a retry count, error, and transaction id,
	and returns the amount of time to delay in milliseconds. If the result is a negative
	value, the backoff will be considered to be zero. The base option will be ignored if this option is supplied.

Usage

from pyqldb.config.retry_config import RetryConfig
from pyqldb.driver.qldb_driver import QldbDriver

# Configuring Retry limit to 2
retry_config = RetryConfig(retry_limit=2)
qldb_driver = QldbDriver("test-ledger", retry_config=retry_config)

# Configuring a custom backoff which increases the delay by 1s for each attempt.

def custom_backoff(retry_attempt, error, transaction_id):
    return 1000 * retry_attempt

retry_config = RetryConfig(retry_limit=2, custom_backoff=custom_backoff)
qldb_driver = QldbDriver("test-ledger", retry_config=retry_config)

Note: The retry_limit property on QldbDriver will go away after this change.
Note: Also remove retry_indicator, since it provides little value.
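
How a delay might be chosen from these options can be sketched as follows. The spec above only states that a negative custom result counts as zero and that base is ignored when custom_backoff is supplied; the doubling formula, the random jitter, and the cap_ms parameter are assumptions made for this illustration, and backoff_delay_ms is a hypothetical function name.

```python
import random


def backoff_delay_ms(retry_attempt, error=None, transaction_id=None,
                     base=10, custom_backoff=None, cap_ms=5000):
    """Return a delay in milliseconds for the given retry attempt."""
    if custom_backoff is not None:
        # A negative custom result is treated as zero; base is ignored.
        return max(0, custom_backoff(retry_attempt, error, transaction_id))
    # Assumed default: exponential growth from base, capped, with full jitter.
    ceiling = min(cap_ms, base * (2 ** retry_attempt))
    return random.uniform(0, ceiling)


def custom_backoff(retry_attempt, error, transaction_id):
    return 1000 * retry_attempt


for attempt in range(1, 4):
    print(attempt,
          backoff_delay_ms(attempt, custom_backoff=custom_backoff),
          round(backoff_delay_ms(attempt), 2))
```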

pyqldb==3.2.2 - python setup.py egg_info did not run successfully.

Hi, I'm getting this error trying to install pyqldb==3.2.2:

pip install pyqldb==3.2.2

Collecting pyqldb==3.2.2
Using cached pyqldb-3.2.2.tar.gz (24 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'error'
pip : error: subprocess-exited-with-error
At line:1 char:1

+ pip install pyqldb==3.2.2
    + CategoryInfo          : NotSpecified: (  error: subprocess-exited-with-error:String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError
    
    
    python setup.py egg_info did not run successfully.
    exit code: 1
    
    [8 lines of output]
    Traceback (most recent call last):
      File "<string>", line 2, in <module>
      File "<pip-setuptools-caller>", line 34, in <module>
      File "C:\Users\chris\AppData\Local\Temp\pip-install-g4luqruc\pyqldb_a670b15363e5465eb7f94e87a5a160a6\setup.py", line 32, in <module>
        long_description=open('README.md').read(),
      File "C:\Users\chris\AppData\Local\Programs\Python\Python39\lib\encodings\cp874.py", line 23, in decode
        return codecs.charmap_decode(input,self.errors,decoding_table)[0]
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x99 in position 5233: character maps to <undefined>
    [end of output]
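
The traceback shows README.md being read with the platform's default code page (cp874) rather than UTF-8. The sketch below reproduces that failure mode with a temporary file and shows that passing encoding="utf-8" to open avoids it. The sample text is an assumption chosen because its curly apostrophe contains the byte 0x99 in UTF-8; whether the packaged setup.py was changed this way is not stated here.

```python
import os
import tempfile

text = "OS you’re using"  # the curly apostrophe encodes to E2 80 99 in UTF-8

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "README.md")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)

    try:
        # Simulates a Windows default code page of cp874.
        with open(path, encoding="cp874") as handle:
            handle.read()
        decoded_with_cp874 = True
    except UnicodeDecodeError as error:
        decoded_with_cp874 = False
        print("cp874 failed:", error)

    # Stating the encoding explicitly makes the read platform-independent.
    with open(path, encoding="utf-8") as handle:
        roundtrip = handle.read()

print(roundtrip == text)
```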
    

Update transaction hash is slow

Updating the hash for a transaction is very slow for certain statements with larger Ion inputs when using QLDB python driver. In some cases over 60% of client-side time on a transaction is spent in this method (_update_hash) for a request that has a total client-side latency > 500 ms. In comparison, hashing the same data in the standard Python hashlib takes <1 millisecond.

https://github.com/awslabs/amazon-qldb-driver-python/blob/v3.2.0/pyqldb/transaction/transaction.py#L116

An example:
In my environment, computing the Ion hash of the following Ion document takes ~2.5 ms using the Python Ion hash library (which _update_hash uses) while computing the native hash is < 0.01 ms. Computing the hash of a list of 100 of these entries in a single Ion object scales linearly and takes ~200ms using Python Ion hash and <1 ms with the native library. Hence if a document containing a list of 100 entries similar to below is inserted into QLDB, the driver will spend 200ms hashing the data.

$ion_1_0 { id:"0001", type:"donut", name:"Cake", ppu:0.55, batters:{ batter:[ { id:"1001", type:"Regular" }, { id:"1002", type:"Chocolate" }, { id:"1003", type:"Blueberry" }, { id:"1004", type:"Devil's Food" } ] }, topping:[ { id:"5001", type:"None" }, { id:"5002", type:"Glazed" }, { id:"5005", type:"Sugar" }, { id:"5007", type:"Powdered Sugar" }, { id:"5006", type:"Chocolate with Sprinkles" }, { id:"5003", type:"Chocolate" }, { id:"5004", type:"Maple" } ] }
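
The native-hash side of the comparison can be timed with the standard library alone. This sketch does not measure the Ion hash, and SHA-256 is an assumed choice, since the text above only mentions hashlib; the payload is a shortened version of the document repeated 100 times, and native_hash is a hypothetical helper name.

```python
import hashlib
import timeit

# Shortened text form of the sample document, repeated to mimic a 100-entry list.
entry = b'{ id:"0001", type:"donut", name:"Cake", ppu:0.55 }'
payload = entry * 100


def native_hash(data):
    """Hash raw bytes with hashlib (SHA-256 assumed for illustration)."""
    return hashlib.sha256(data).digest()


runs = 1000
elapsed = timeit.timeit(lambda: native_hash(payload), number=runs)
print("mean seconds per hash: {:.6f}".format(elapsed / runs))
```

Timing the Ion hash of the same data in the same loop would give the other half of the comparison.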
