vertica / vertica-python Goto Github PK

View Code? Open in Web Editor NEW

378.0 1.8K 178.0 1.19 MB

Official native Python client for the Vertica Analytics Database.

Home Page: https://www.vertica.com/

License: Apache License 2.0

Python 100.00%

python database driver

vertica-python's Introduction

vertica-python

vertica-python is a native Python client for the Vertica (http://www.vertica.com) database. vertica-python is the replacement of the deprecated Python client vertica_db_client, which was removed since Vertica server version 9.3.

📢 08/14/2018: vertica-python becomes Vertica’s first officially supported open source database client, see the blog here.

Please check out release notes to learn about the latest improvements.

vertica-python has been tested with Vertica 24.2.0 and Python 3.7/3.8/3.9/3.10/3.11/3.12. Feel free to submit issues and/or pull requests (Read up on our contributing guidelines).

Installation

To install vertica-python with pip:

# Latest release version
pip install vertica-python

# Latest commit on master branch
pip install git+https://github.com/vertica/vertica-python.git@master

To install vertica-python from source, run the following command from the root directory:

python setup.py install

Source code for vertica-python can be found at:

https://github.com/vertica/vertica-python

Using Kerberos authentication

vertica-python has optional Kerberos authentication support for Unix-like systems, which requires you to install the kerberos package:

pip install kerberos

Note that kerberos is a python extension module, which means you need to install python-dev. The command depends on the package manager and will look like

sudo [yum|apt-get|etc] install python-dev

Then see this section for how to config Kerberos for a connection.

Usage

📜 The basic vertica-python usage is common to all the database adapters implementing the DB-API v2.0 protocol.

Create a connection

The example below shows how to create a Connection object:

import vertica_python

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             # autogenerated session label by default,
             'session_label': 'some_label',
             # default throw error on invalid UTF-8 results
             'unicode_error': 'strict',
             # SSL is disabled by default
             'ssl': False,
             # autocommit is off by default
             'autocommit': True,
             # using server-side prepared statements is disabled by default
             'use_prepared_statements': False,
             # connection timeout is not enabled by default
             # 5 seconds timeout for a socket operation (Establishing a TCP connection or read/write operation)
             'connection_timeout': 5}

# simple connection, with manual close
try:
    connection = vertica_python.connect(**conn_info)
    # do things
finally:
    connection.close()

# using `with` for auto connection closing after usage
with vertica_python.connect(**conn_info) as connection:
    # do things

Connection Option	Description
host	The server host of the connection. This can be a host name or an IP address. *Default*: "localhost"
port	The port of the connection. *Default*: 5433
user	The database user name to use to connect to the database. *Default*: (for non-OAuth connections) OS login user name (for OAuth connections) ""
password	The password to use to log into the database. *Default*: ""
database	The database name. *Default*: ""
autocommit	See Autocommit. *Default*: False
backup_server_node	See Connection Failover. *Default*: []
binary_transfer	See Data Transfer Format. *Default*: False (use text format transfer)
connection_load_balance	See Connection Load Balancing. *Default*: False (disabled)
connection_timeout	The number of seconds (can be a nonnegative floating point number) the client waits for a socket operation (Establishing a TCP connection or read/write operation). *Default*: None (no timeout)
disable_copy_local	See COPY FROM LOCAL. *Default*: False
kerberos_host_name	See Kerberos Authentication. *Default*: the value of connection option `host`
kerberos_service_name	See Kerberos Authentication. *Default*: "vertica"
log_level	See Logging.
log_path	See Logging.
oauth_access_token	See OAuth Authentication. *Default*: ""
request_complex_types	See SQL Data conversion to Python objects. *Default*: True
session_label	Sets a label for the connection on the server. This value appears in the client_label column of the v_monitor.sessions system table. *Default*: an auto-generated label with format of `vertica-python-{version}-{random_uuid}`
ssl	See TLS/SSL. *Default*: False (disabled)
unicode_error	See UTF-8 encoding issues. *Default*: 'strict' (throw error on invalid UTF-8 results)
use_prepared_statements	See Passing parameters to SQL queries. *Default*: False
workload	Sets the workload name associated with this session. Valid values are workload names that already exist in a workload routing rule on the server. If a workload name that doesn't exist is entered, the server will reject it and it will be set to the default. *Default*: ""
dsn	See Set Properties with Connection String.

Below are a few important connection topics you may deal with, or you can skip and jump to the next section: Send Queries and Retrieve Results

Set Properties with Connection String

Another way to set connection properties is passing a connection string to the keyword parameter dsn of vertica_python.connect(dsn='...', **kwargs). The connection string is of the form:

vertica://(user):(password)@(host):(port)/(database)?(arg1=val1&arg2=val2&...)

The connection string would be parsed by vertica_python.parse_dsn(connection_str), and the parsing result (a dictionary of keywords and values) would be merged with kwargs. If the same keyword is specified in both the sources, the kwargs value overrides the parsed dsn value. The (arg1=val1&arg2=val2&...) section can handle string/numeric/boolean values, blank and invalid value would be ignored.

import vertica_python

connection_str = ('vertica://admin@localhost:5433/db1?connection_load_balance=True&connection_timeout=1.5&'
                  'session_label=vpclient+123%7E456')
print(vertica_python.parse_dsn(connection_str))
# {'user': 'admin', 'host': 'localhost', 'port': 5433, 'database': 'db1',
#  'connection_load_balance': True, 'connection_timeout': 1.5, 'session_label': 'vpclient 123~456'}

additional_info = {
    'password': 'some_password', 
    'backup_server_node': ['10.6.7.123', ('10.20.82.77', 6000)]  # invalid value to be set in a connection string
    }

with vertica_python.connect(dsn=connection_str, **additional_info) as conn:
   # do things

TLS/SSL

You can pass True to ssl to enable TLS/SSL connection (equivalent to TLSMode=require).

import vertica_python

# [TLSMode: require]
conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'ssl': True}
connection = vertica_python.connect(**conn_info)

You can pass an ssl.SSLContext to ssl to customize the SSL connection options. Server mode TLS examples:

import vertica_python
import ssl

# [TLSMode: require]
# Ensure connection is encrypted.
ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ssl_context.check_hostname = False
ssl_context.verify_mode = ssl.CERT_NONE

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'ssl': ssl_context}
connection = vertica_python.connect(**conn_info)


# [TLSMode: verify-ca]
# Ensure connection is encrypted, and client trusts server certificate.
ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ssl_context.verify_mode = ssl.CERT_REQUIRED
ssl_context.check_hostname = False
ssl_context.load_verify_locations(cafile='/path/to/ca_file.pem') # CA certificate used to verify server certificate

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'ssl': ssl_context}
connection = vertica_python.connect(**conn_info)


# [TLSMode: verify-full]
# Ensure connection is encrypted, client trusts server certificate,
# and server hostname matches the one listed in the server certificate.
ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ssl_context.verify_mode = ssl.CERT_REQUIRED
ssl_context.check_hostname = True
ssl_context.load_verify_locations(cafile='/path/to/ca_file.pem') # CA certificate used to verify server certificate

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'ssl': ssl_context}
connection = vertica_python.connect(**conn_info)

Mutual mode TLS example:

import vertica_python
import ssl

# [TLSMode: verify-full]
# Ensure connection is encrypted, client trusts server certificate,
# and server hostname matches the one listed in the server certificate.
ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ssl_context.verify_mode = ssl.CERT_REQUIRED
ssl_context.check_hostname = True
ssl_context.load_verify_locations(cafile='/path/to/ca_file.pem') # CA certificate used to verify server certificate

# For Mutual mode, provide client certificate and client private key to ssl_context.
# CA certificate used to verify client certificate should be set at the server side.
ssl_context.load_cert_chain(certfile='/path/to/client.pem', keyfile='/path/to/client.key')

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'ssl': ssl_context}
connection = vertica_python.connect(**conn_info)

See more on SSL options here.

Kerberos Authentication

In order to use Kerberos authentication, install dependencies first, and it is the user's responsibility to ensure that an Ticket-Granting Ticket (TGT) is available and valid. Whether a TGT is available can be easily determined by running the klist command. If no TGT is available, then it first must be obtained by running the kinit command or by logging in. You can pass in optional arguments to customize the authentication. The arguments are kerberos_service_name, which defaults to "vertica", and kerberos_host_name, which defaults to the value of argument host. For example,

import vertica_python

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             # The service name portion of the Vertica Kerberos principal
             'kerberos_service_name': 'vertica_krb',
             # The instance or host name portion of the Vertica Kerberos principal
             'kerberos_host_name': 'vcluster.example.com'}

with vertica_python.connect(**conn_info) as conn:
    # do things

OAuth Authentication

To authenticate via OAuth, provide an oauth_access_token that authorizes a user to the database.

import vertica_python

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'database': 'a_database',
             # valid OAuth access token
             'oauth_access_token': 'xxxxxx'}

with vertica_python.connect(**conn_info) as conn:
    # do things

Logging

Logging is disabled by default if neither log_level or log_path are set. Passing value to at least one of those options to enable logging.

When logging is enabled, the default value of log_level is logging.WARNING. You can find all levels here. And the default value of log_path is 'vertica_python.log', the log file will be in the current execution directory. If log_path is set to '' (empty string) or None, no file handler is set, logs will be processed by root handlers. For example,

import vertica_python
import logging

## Example 1: write DEBUG level logs to './vertica_python.log'
conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'log_level': logging.DEBUG}
with vertica_python.connect(**conn_info) as connection:
    # do things

## Example 2: write WARNING level logs to './path/to/logs/client.log'
conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'log_path': 'path/to/logs/client.log'}
with vertica_python.connect(**conn_info) as connection:
   # do things

## Example 3: write INFO level logs to '/home/admin/logs/vClient.log'
conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'log_level': logging.INFO,
             'log_path': '/home/admin/logs/vClient.log'}
with vertica_python.connect(**conn_info) as connection:
   # do things

## Example 4: use root handlers to process logs by setting 'log_path' to '' (empty string) 
conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'log_level': logging.DEBUG,
             'log_path': ''}
with vertica_python.connect(**conn_info) as connection:
    # do things

Connection Failover

Supply a list of backup hosts to backup_server_node for the client to try if the primary host you specify in the connection parameters (host, port) is unreachable. Each item in the list should be either a host string (using default port 5433) or a (host, port) tuple. A host can be a host name or an IP address.

import vertica_python

conn_info = {'host': 'unreachable.server.com',
             'port': 888,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'backup_server_node': ['123.456.789.123', 'invalid.com', ('10.20.82.77', 6000)]}
connection = vertica_python.connect(**conn_info)

Connection Load Balancing

Connection Load Balancing helps automatically spread the overhead caused by client connections across the cluster by having hosts redirect client connections to other hosts. Both the server and the client need to enable load balancing for it to function. If the server disables connection load balancing, the load balancing request from client will be ignored.

import vertica_python

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'vdb',
             'connection_load_balance': True}

# Server enables load balancing
with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    cur.execute("SELECT NODE_NAME FROM V_MONITOR.CURRENT_SESSION")
    print("Client connects to primary node:", cur.fetchone()[0])
    cur.execute("SELECT SET_LOAD_BALANCE_POLICY('ROUNDROBIN')")

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    cur.execute("SELECT NODE_NAME FROM V_MONITOR.CURRENT_SESSION")
    print("Client redirects to node:", cur.fetchone()[0])

## Output
#  Client connects to primary node: v_vdb_node0003
#  Client redirects to node: v_vdb_node0005

Data Transfer Format

There are two formats for transferring data from a server to a vertica-python client: text and binary. For example, a FLOAT type data is represented as a 8-byte IEEE-754 floating point number (fixed-width) in binary format, and a human-readable string (variable-width) in text format. The text format of values is whatever strings are produced and accepted by the input/output conversion functions for the particular data type.

Depending on the data type, binary transfer is generally more efficient and requires less bandwidth than text transfer. However, when transferring a large number of small values, binary transfer may use more bandwidth.

A connection is set to use text format by default. Set binary_transfer to True to use binary format.

import vertica_python

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'vdb',
             'binary_transfer': True  # False by default
             }

# Server enables binary transfer
with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    ...

Ideally, the output data should be the same for these two formats. However, there are edge cases:

FLOAT data: binary format might offer slightly greater precision than text format. E.g. select ATAN(12.345) returns 1.48996835348642 (text) or 1.489968353486419 (binary)
TIMESTAMPTZ data: text format always use the session timezone, but binary format might fail to get session timezone and use local timezone.
NUMERIC data: In old server versions, the precision and scale is incorrect when querying a NUMERIC column that is not from a specific table with prepared statement in binary format. E.g. select ?::NUMERIC or select node_id, ?/50 from nodes. In newer server versions, binary transfer is forcibly disabled for NUMERIC data by the server, regardless of client-side values of binary_transfer and use_prepared_statements.

Send Queries and Retrieve Results

The Connection class encapsulates a database session. It allows to:

create new Cursor instances using the cursor() method to execute database commands and queries.
terminate transactions using the methods commit() or rollback().

The class Cursor allows interaction with the database:

send commands to the database using methods such as execute(), executemany() and copy.
retrieve data from the database, iterating on the cursor or using methods such as fetchone(), fetchmany(), fetchall(), nextset().

import vertica_python

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'user': 'some_user',
             'password': 'some_password',
             'database': 'vdb'}

# Connect to a vertica database
with vertica_python.connect(**conn_info) as conn:
    # Open a cursor to perform database operations
    # vertica-python only support one cursor per connection
    cur = conn.cursor()
    
    # Execute a command: create a table
    cur.execute("CREATE TABLE tbl (a INT, b VARCHAR)")
    
    # Insert a row
    cur.execute("INSERT INTO tbl VALUES (1, 'aa')")
    inserted = cur.fetchall()  # [[1]]
    
    # Bulk Insert with executemany()
    # Pass data to fill a query placeholders and let vertica-python perform the correct conversion
    cur.executemany("INSERT INTO tbl(a, b) VALUES (?, ?)", [(2, 'bb'), (3, 'foo'), (4, 'xx'), (5, 'bar')], use_prepared_statements=True)
    # OR
    # cur.executemany("INSERT INTO tbl(a, b) VALUES (%s, %s)", [(6, 'bb'), (7, 'foo'), (8, 'xx'), (9, 'bar')], use_prepared_statements=False)
    
    # Query the database and obtain data as Python objects.
    cur.execute("SELECT * FROM tbl")
    datarow = cur.fetchone()         # [1, 'aa']
    remaining_rows = cur.fetchall()  # [[2, 'bb'], [3, 'foo'], [4, 'xx'], [5, 'bar']]

    # Make the changes to the database persistent
    conn.commit()
    
    # Execute a query with MULTIPLE statements
    cur.execute("SELECT 1; SELECT 2; ...; SELECT N")
    while True:  # Fetch the result set for each statement
        rows = cur.fetchall()
        print(rows)
        if not cur.nextset():
            break
    # Output:        
    # [[1]]
    # [[2]]
    # ...
    # [[N]]

Frequently Asked Questions 💬

Why does my query return empty results?

If you think Cursor.fetch*() should return something, check whether your query contains multiple statements. It is very likely that you miss to call Cursor.nextset().

Why does my query not throw an error?

vertica-python tries to throw exceptions in the Cursor.execute() method, but depending on your query, there are some exceptions that can only be raised when you call fetchone() fetchmany() or fetchall(). In addition, if your query has multiple statements, errors that is not in the first statement cannot be thrown by execute(). It is recommended to always call fetchall() after execute() in order to capture any error. And for a query with multiple statements, call fetchall() and nextset() as the above example code shows.

Why is this client N times slower than another vertica client?

You may find vertica-python performs much slower executing same query on same machine than another python client (e.g. pyodbc) or client in other programming language. This is because vertica-python is a pure Python program and CPython (the official implementation of Python, which is an interpreted, dynamic language) computation is often many times slower than compiled languages like C and Go, or JIT (Just-in-Time) compiled languages like Java and JavaScript. Therefore, if you want to get better performance, instead of using the official CPython interpreter, try other performance-oriented interpreters such as PyPy.

Stream query results

Streaming is recommended if you want to further process each row, save the results in a non-list/dict format (e.g. Pandas DataFrame), or save the results in a file.

cur = connection.cursor()
cur.execute("SELECT * FROM a_table LIMIT 2")

for row in cur.iterate():
    print(row)
# [ 1, 'some text', datetime.datetime(2014, 5, 18, 6, 47, 1, 928014) ]
# [ 2, 'something else', None ]

In-memory results as list

cur = connection.cursor()
cur.execute("SELECT * FROM a_table LIMIT 2")
cur.fetchall()
# [ [1, 'something'], [2, 'something_else'] ]

In-memory results as dictionary

cur = connection.cursor('dict')
cur.execute("SELECT * FROM a_table LIMIT 2")
cur.fetchall()
# [ {'id': 1, 'value': 'something'}, {'id': 2, 'value': 'something_else'} ]
connection.close()

Nextset

If you execute multiple statements in a single call to execute(), you can use Cursor.nextset() to retrieve all of the data.

cur.execute('SELECT 1; SELECT 2;')

cur.fetchone()
# [1]
cur.fetchone()
# None

cur.nextset()
# True

cur.fetchone()
# [2]
cur.fetchone()
# None

cur.nextset()
# False

Passing parameters to SQL queries

vertica-python provides two methods for passing parameters to a SQL query:

Server-side binding
Client-side binding

⚠️ Prerequisites: Only SQL literals (i.e. query values) should be bound via these methods: they shouldn’t be used to merge table or field names to the query (vertica-python will try quoting the table name as a string value, generating invalid SQL as it is actually a SQL Identifier). If you need to generate dynamically SQL queries (for instance choosing dynamically a table name) you have to construct the full query yourself.

Server-side binding: Query using prepared statements

Vertica server-side prepared statements let you define a statement once and then run it many times with different parameters. Internally, vertica-python sends the query and the parameters to the server separately. Placeholders in the statement are represented by question marks (?). Server-side prepared statements are useful for preventing SQL injection attacks.

import vertica_python

# Enable using server-side prepared statements at connection level
conn_info = {'host': '127.0.0.1',
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'use_prepared_statements': True,
             }

with vertica_python.connect(**conn_info) as connection:
    cur = connection.cursor()
    cur.execute("CREATE TABLE tbl (a INT, b VARCHAR)")
    cur.execute("INSERT INTO tbl VALUES (?, ?)", [1, 'aa'])
    cur.execute("INSERT INTO tbl VALUES (?, ?)", [2, 'bb'])
    cur.executemany("INSERT INTO tbl VALUES (?, ?)", [(3, 'foo'), (4, 'xx'), (5, 'bar')])
    cur.execute("COMMIT")

    cur.execute("SELECT * FROM tbl WHERE a>=? AND a<=? ORDER BY a", (2,4))
    cur.fetchall()
    # [[2, 'bb'], [3, 'foo'], [4, 'xx']]

🚫 Vertica server-side prepared statements does not support executing a query string containing multiple statements.

You can set use_prepared_statements option in cursor.execute*() functions to override the connection level setting.

import vertica_python

# Enable using server-side prepared statements at connection level
conn_info = {'host': '127.0.0.1',
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             'use_prepared_statements': True,
             }

with vertica_python.connect(**conn_info) as connection:
    cur = connection.cursor()
    cur.execute("CREATE TABLE tbl (a INT, b VARCHAR)")

    # Executing compound statements
    cur.execute("INSERT INTO tbl VALUES (?, ?); COMMIT", [1, 'aa'])
    # Error message: Cannot insert multiple commands into a prepared statement

    # Disable prepared statements but forget to change placeholders (?)
    cur.execute("INSERT INTO tbl VALUES (?, ?); COMMIT;", [1, 'aa'], use_prepared_statements=False)
    # TypeError: not all arguments converted during string formatting

    cur.execute("INSERT INTO tbl VALUES (%s, %s); COMMIT;", [1, 'aa'], use_prepared_statements=False)
    cur.execute("INSERT INTO tbl VALUES (:a, :b); COMMIT;", {'a': 2, 'b': 'bb'}, use_prepared_statements=False)

# Disable using server-side prepared statements at connection level
conn_info['use_prepared_statements'] = False
with vertica_python.connect(**conn_info) as connection:
    cur = connection.cursor()

    # Try using prepared statements
    cur.execute("INSERT INTO tbl VALUES (?, ?)", [3, 'foo'])
    # TypeError: not all arguments converted during string formatting

    cur.execute("INSERT INTO tbl VALUES (?, ?)", [3, 'foo'], use_prepared_statements=True)

    # Query using named parameters
    cur.execute("SELECT * FROM tbl WHERE a>=:n1 AND a<=:n2 ORDER BY a", {'n1': 2, 'n2': 4})
    cur.fetchall()
    # [[2, 'bb'], [3, 'foo']]

Note: In other drivers, the batch insert is converted into a COPY statement by using prepared statements. vertica-python currently does not support that. More details

Client-side binding: Query using named parameters or format parameters

vertica-python can automatically convert Python objects to SQL literals, merge the query and the parameters on the client side, and then send the query to the server: using this feature your code will be more robust and reliable to prevent SQL injection attacks. You need to set use_prepared_statements option to False (at connection level or in cursor.execute*()) to use client-side binding.

Variables can be specified with named (:name) placeholders.

cur = connection.cursor()
data = {'propA': 1, 'propB': 'stringValue'}
cur.execute("SELECT * FROM a_table WHERE a = :propA AND b = :propB", data, use_prepared_statements=False)
# converted into a SQL command similar to: SELECT * FROM a_table WHERE a = 1 AND b = 'stringValue'

cur.fetchall()
# [ [1, 'stringValue'] ]

Variables can also be specified with positional format (%s) placeholders. The placeholder must always be a %s, even if a different placeholder (such as a %d for integers or %f for floats) may look more appropriate. Never use Python string concatenation (+) or string parameters interpolation (%) to pass variables to a SQL query string.

cur = connection.cursor()
data = (1, "O'Reilly")
cur.execute("SELECT * FROM a_table WHERE a = %s AND b = %s" % data) # WRONG: % operator
cur.execute("SELECT * FROM a_table WHERE a = %d AND b = %s", data)  # WRONG: %d placeholder
cur.execute("SELECT * FROM a_table WHERE a = %s AND b = %s", data)  # correct
# converted into a SQL command: SELECT * FROM a_table WHERE a = 1 AND b = 'O''Reilly'

cur.fetchall()
# [ [1, "O'Reilly"] ]

The placeholder must not be quoted. vertica-python will add quotes where needed.

>>> cur.execute("INSERT INTO table VALUES (':propA')", {'propA': "someString"}) # WRONG
>>> cur.execute("INSERT INTO table VALUES (:propA)", {'propA': "someString"})   # correct
>>> cur.execute("INSERT INTO table VALUES ('%s')", ("someString",)) # WRONG
>>> cur.execute("INSERT INTO table VALUES (%s)", ("someString",))   # correct

In order to merge the query (with placeholders) and the parameters on the client side, parameter values (python object) are converted to SQL literals (str). vertica-python supports default mapping to SQL literals for many standard Python types (str, bytes, bool, int, float, decimal.Decimal, tuple, list, set, dict, datetime.datetime, datetime.date, datetime.time, uuid.UUID). For complex types, in some cases, you may need explicit typecasting for the placeholder (e.g. %s::ARRAY[ARRAY[INT]], %s::ROW(varchar,int,date)):

from datetime import date
cur.execute("CREATE TABLE table (a INT, b ARRAY[DATE])")
value = [date(2021, 6, 10), date(2021, 6, 12), date(2021, 6, 30)]

cur.execute("INSERT INTO table VALUES (%s, %s)", [100, value], use_prepared_statements=False)  # WRONG
# Error Message: Column "b" is of type array[date] but expression is of type array[varchar], Sqlstate: 42804, 
# Hint: You will need to rewrite or cast the expression

cur.execute("INSERT INTO table VALUES (%s, %s::ARRAY[DATE])", [100, value], use_prepared_statements=False)  # correct
# converted into a SQL command: INSERT INTO vptest VALUES (100, ARRAY['2021-06-10','2021-06-12','2021-06-30']::ARRAY[DATE])

# Client-side binding of cursor.executemany is different from cursor.execute internally
# But it also supports some of complex types mapping
cur.executemany("INSERT INTO table (a, b) VALUES (%s, %s)", [[100, value]], use_prepared_statements=False)

Register new SQL literal adapters

It is possible to adapt new Python types to SQL literals via Cursor.register_sql_literal_adapter(py_class_or_type, adapter_function) function. Example:

class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Adapter should return a string value
def adapt_point(point):
    return "STV_GeometryPoint({},{})".format(point.x, point.y)

cur = conn.cursor()
cur.register_sql_literal_adapter(Point, adapt_point)

cur.execute("INSERT INTO geom_data (geom) VALUES (%s)", [Point(1.23, 4.56)])
cur.execute("select ST_asText(geom) from geom_data")
cur.fetchall()
# [['POINT (1.23 4.56)']]

To help you debug the binding process during Cursor.execute*(), Cursor.object_to_sql_literal(py_object) function can be used to inspect the SQL literal string converted from a Python object.

cur = conn.cursor
cur.object_to_sql_literal("O'Reilly")  # "'O''Reilly'"
cur.object_to_sql_literal(None)  # "NULL"
cur.object_to_sql_literal(True)  # "True"
cur.object_to_sql_literal(Decimal("10.00000"))  # "10.00000"
cur.object_to_sql_literal(datetime.date(2018, 9, 7))  # "'2018-09-07'"
cur.object_to_sql_literal(Point(-71.13, 42.36))  # "STV_GeometryPoint(-71.13,42.36)" if you registered in previous step

Cursor.executemany(): Server-side binding vs Client-side binding

Cursor.executemany(query, seq_of_parameters, use_prepared_statements=None)

Execute the same query or command with a sequence of input data.

PARAMETERS

query (str or bytes) – The query to execute.
seq_of_parameters (a list/tuple of Sequences or Mappings) – The parameters to pass to the query.
use_prepared_statements (bool) – Use connection level setting by default. If set, execute the query using server-side prepared statements or not.

When use_prepared_statements=True (Server-side binding), the query should contain only a single statement. Internally, vertica-python sends the query and each set of parameters to the server separately.

When use_prepared_statements=False (Client-side binding), the query is limited to simple INSERT statements only. The batch insert is converted into a COPY FROM STDIN statement by the client. This is more efficient than performing separate queries (may even faster than Server-side binding), but in case of other statements you may consider using copy.

# Note the query parameter placeholders difference!
cur.executemany("INSERT INTO tbl(a, b) VALUES (?, ?)", [(2, 'bb'), (3, 'foo'), (4, 'xx'), (5, 'bar')], use_prepared_statements=True)

cur.executemany("INSERT INTO tbl(a, b) VALUES (%s, %s)", [(6, 'bb'), (7, 'foo'), (8, 'xx'), (9, 'bar')], use_prepared_statements=False)
cur.executemany("INSERT INTO tbl(a, b) VALUES (:a, :b)", [{'a': 2, 'b': 'bb'}, {'a': 3, 'b': 'foo'}], use_prepared_statements=False)

Insert and commit/rollback

cur = connection.cursor()

# inline commit (when 'use_prepared_statements' is False)
cur.execute("INSERT INTO a_table (a, b) VALUES (1, 'aa'); commit;")

# commit in execution
cur.execute("INSERT INTO a_table (a, b) VALUES (1, 'aa')")
cur.execute("INSERT INTO a_table (a, b) VALUES (2, 'bb')")
cur.execute("commit;")

# connection.commit()
cur.execute("INSERT INTO a_table (a, b) VALUES (1, 'aa')")
connection.commit()

# connection.rollback()
cur.execute("INSERT INTO a_table (a, b) VALUES (0, 'bad')")
connection.rollback()

Autocommit

Session parameter AUTOCOMMIT can be configured by the connection option and the Connection.autocommit read/write attribute:

import vertica_python

# Enable autocommit at startup
conn_info = {'host': '127.0.0.1',
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             # autocommit is off by default
             'autocommit': True,
             }
             
with vertica_python.connect(**conn_info) as connection:
    # Check current session autocommit setting
    print(connection.autocommit)    # should be True
    # If autocommit is True, statements automatically commit their transactions when they complete.
    
    # Set autocommit setting with attribute
    connection.autocommit = False
    print(connection.autocommit)    # should be False
    # If autocommit is False, the methods commit() or rollback() must be manually invoked to terminate the transaction.

To set AUTOCOMMIT to a new value, vertica-python uses Cursor.execute() to execute a command internally, and that would clear your previous query results, so be sure to call Cursor.fetch*() to save your results before you set autocommit.

Using COPY FROM

⚠️ Prerequisites: If the data is in files not STDIN, only files on the client system should be loaded via these methods below. For files on the server system, run "COPY target-table FROM 'path‑to‑data'" with Cursor.execute().

There are 2 methods to do copy:

Method 1: "COPY FROM STDIN" sql with Cursor.copy()

cur = connection.cursor()
cur.copy("COPY test_copy (id, name) FROM stdin DELIMITER ',' ",  csv)

Where csv is either a string or a file-like object (specifically, any object with a read() method). If using a file, the data is streamed (in chunks of buffer_size bytes, which defaults to 128 * 2 ** 10).

with open("/tmp/binary_file.csv", "rb") as fs:
    cursor.copy("COPY table(field1, field2) FROM STDIN DELIMITER ',' ENCLOSED BY '\"'",
                fs, buffer_size=65536)

Method 2: "COPY FROM LOCAL" sql with Cursor.execute()

import sys
import vertica_python

conn_info = {'host': '127.0.0.1',
             'user': 'some_user',
             'password': 'some_password',
             'database': 'a_database',
             # False by default
             #'disable_copy_local': True,
             # Don't support executing COPY LOCAL operations with prepared statements
             'use_prepared_statements': False
             }

with vertica_python.connect(**conn_info) as connection:
    cur = connection.cursor()
    
    # Copy from local file
    cur.execute("COPY table(field1, field2) FROM LOCAL"
                " 'data_Jan_*.csv','data_Feb_01.csv' DELIMITER ','"
                " REJECTED DATA 'path/to/write/rejects.txt'"
                " EXCEPTIONS 'path/to/write/exceptions.txt'",
                buffer_size=65536
    )
    print("Rows loaded:", cur.fetchall())
    
    # Copy from local stdin
    cur.execute("COPY table(field1, field2) FROM LOCAL STDIN DELIMITER ','", copy_stdin=sys.stdin)
    print("Rows loaded:", cur.fetchall())

    # Copy from local stdin (compound statements)
    with open('f1.csv', 'r') as fs1, open('f2.csv', 'r') as fs2:
        cur.execute("COPY tlb1(field1, field2) FROM LOCAL STDIN DELIMITER ',';"
                    "COPY tlb2(field1, field2) FROM LOCAL STDIN DELIMITER ',';",
                    copy_stdin=[fs1, fs2], buffer_size=65536)
    print("Rows loaded 1:", cur.fetchall())
    cur.nextset()
    print("Rows loaded 2:", cur.fetchall())
    
    # Copy from local stdin (StringIO)
    from io import StringIO
    data = "Anna|123-456-789\nBrown|555-444-3333\nCindy|555-867-53093453453\nDodd|123-456-789\nEd|123-456-789"
    cur.execute("COPY customers (firstNames, phoneNumbers) FROM LOCAL STDIN ENFORCELENGTH RETURNREJECTED AUTO",
                copy_stdin=StringIO(data))

When connection option disable_copy_local set to True, disables COPY LOCAL operations, including copying data from local files/stdin and using local files to store data and exceptions. You can use this property to prevent users from writing to and copying from files on a Vertica host, including an MC host. Note that this property doesn't apply to Cursor.copy().

The data for copying from/writing to local files is streamed in chunks of buffer_size bytes, which defaults to 128 * 2 ** 10.

When executing "COPY FROM LOCAL STDIN", copy_stdin should be a file-like object or a list of file-like objects (specifically, any object with a read() method).

Cancel the current database operation

Connection.cancel() interrupts the processing of the current operation. Interrupting query execution will cause the cancelled method to raise a vertica_python.errors.QueryCanceled. If no query is being executed, it does nothing. You can call this function from a different thread/process than the one currently executing a database operation.

from multiprocessing import Process
import time
import vertica_python

def cancel_query(connection, timeout=5):
    time.sleep(timeout)
    connection.cancel()

# Example 1: Cancel the query before Cursor.execute() return.
#            The query stops executing in a shorter time after the cancel message is sent.
with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()

    # Call cancel() from a different process
    p1 = Process(target=cancel_query, args=(conn,))
    p1.start()

    try:
        cur.execute("<Long running query>")
    except vertica_python.errors.QueryCanceled as e:
        pass
    p1.join()
    
# Example 2: Cancel the query after Cursor.execute() return.
#            Less number of rows read after the cancel message is sent.
with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    cur.execute("SELECT id, time FROM large_table")
    nCount = 0
    try:
        while cur.fetchone():
            nCount += 1
            if nCount == 100:
                conn.cancel()
    except vertica_python.errors.QueryCanceled as e:
        pass
        # nCount is less than the number of rows in large_table

SQL Data conversion to Python objects

When a query is executed and Cursor.fetch*() is called, SQL data (bytes) are deserialized as Python objects. The following table shows the default mapping from SQL data types to Python objects:

SQL data type	Python object type
NULL	None
BOOLEAN	bool
INTEGER	int
FLOAT	float
NUMERIC	decimal.Decimal
CHAR	str
VARCHAR	str
LONG VARCHAR	str
BINARY	bytes
VARBINARY	bytes
LONG VARBINARY	bytes
UUID	uuid.UUID
DATE	datetime.date^[1]
TIME	datetime.time^[2]
TIMETZ	datetime.time^[2]
TIMESTAMP	datetime.datetime^[1]
TIMESTAMPTZ	datetime.datetime^[1]
INTERVAL	dateutil.relativedelta.relativedelta
ARRAY	list^[3]
SET	set^[3]
ROW	dict^[3]
MAP	dict^[3]

^[1]Python’s datetime.date and datetime.datetime only supports date ranges 0001-01-01 to 9999-12-31. Retrieving a value of BC date or future date (year>9999) results in an error.

^[2]Python’s datetime.time only supports times until 23:59:59. Retrieving a value of 24:00:00 results in an error.

^[3]If connection option 'request_complex_types' set to False, the server returns all complex types as VARCHAR/LONG VARCHAR Json strings, so the client will convert data to str instead. Server before v12.0.2 cannot provide enough metadata for complex types, the behavior is equal to request_complex_types=False.

Bypass data conversion to Python objects

The Cursor.disable_sqldata_converter attribute can bypass the result data conversion to Python objects.

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    sql = "select 'foo'::VARCHAR, 100::INT, '2001-12-01 02:50:00'::TIMESTAMP"
    
    #### Convert SQL types to Python objects ####
    print(cur.disable_sqldata_converter)   # Default is False
    # False
    cur.execute(sql)
    print(cur.fetchall())
    # [['foo', 100, datetime.datetime(2001, 12, 1, 2, 50)]]
    
    #### No Conversion: return raw bytes data ####
    cur.disable_sqldata_converter = True   # Set attribute to True
    cur.execute(sql)
    print(cur.fetchall())
    # [[b'foo', b'100', b'2001-12-01 02:50:00']]

As a result, this can improve query performance when you call fetchall() but ignore/skip result data.

Customize data conversion to Python objects

The Cursor.register_sqldata_converter(oid, converter_func) method allows to customize how SQL data values are converted to Python objects when query results are returned.

PARAMETERS:

oid – The Vertica type OID to manage.
converter_func – The converter function to register for oid. The function should have two arguments <val, ctx>. Data Transfer Format matters for val (SQL data value). ctx is a dict managing resources that may be used by convertions. E.g. ctx['column'].format_code would be 0 (Text transfer format) or 1 (Binary transfer format).

The Cursor.unregister_sqldata_converter(oid) method allows to cancel customization and use the default converter.

Each Vertica type OID is an integer representing a SQL type, you can look up OIDs in vertica_python.datatypes:

$ python3
>>> from vertica_python.datatypes import VerticaType
>>> {k:v for k,v in dict(VerticaType.__dict__).items() if not k.startswith('_')}
{'UNKNOWN': 4, 'BOOL': 5, 'INT8': 6, 'FLOAT8': 7, 'CHAR': 8, 'VARCHAR': 9, 'DATE': 10, 'TIME': 11, 'TIMESTAMP': 12, 'TIMESTAMPTZ': 13, 'INTERVAL': 14, 'INTERVALYM': 114, 'TIMETZ': 15, 'NUMERIC': 16, 'VARBINARY': 17, 'UUID': 20, 'LONGVARCHAR': 115, 'LONGVARBINARY': 116, 'BINARY': 117, 'ROW': 300, 'ARRAY': 301, 'MAP': 302, 'ARRAY1D_BOOL': 1505, 'ARRAY1D_INT8': 1506, 'ARRAY1D_FLOAT8': 1507, 'ARRAY1D_CHAR': 1508, 'ARRAY1D_VARCHAR': 1509, 'ARRAY1D_DATE': 1510, 'ARRAY1D_TIME': 1511, 'ARRAY1D_TIMESTAMP': 1512, 'ARRAY1D_TIMESTAMPTZ': 1513, 'ARRAY1D_INTERVAL': 1514, 'ARRAY1D_INTERVALYM': 1521, 'ARRAY1D_TIMETZ': 1515, 'ARRAY1D_NUMERIC': 1516, 'ARRAY1D_VARBINARY': 1517, 'ARRAY1D_UUID': 1520, 'ARRAY1D_BINARY': 1522, 'ARRAY1D_LONGVARCHAR': 1519, 'ARRAY1D_LONGVARBINARY': 1518, 'SET_BOOL': 2705, 'SET_INT8': 2706, 'SET_FLOAT8': 2707, 'SET_CHAR': 2708, 'SET_VARCHAR': 2709, 'SET_DATE': 2710, 'SET_TIME': 2711, 'SET_TIMESTAMP': 2712, 'SET_TIMESTAMPTZ': 2713, 'SET_INTERVAL': 2714, 'SET_INTERVALYM': 2721, 'SET_TIMETZ': 2715, 'SET_NUMERIC': 2716, 'SET_VARBINARY': 2717, 'SET_UUID': 2720, 'SET_BINARY': 2722, 'SET_LONGVARCHAR': 2719, 'SET_LONGVARBINARY': 2718}
>>>
>>> # Use VerticaType.XXXX to refer to an OID
>>> VerticaType.VARCHAR
9
>>>
>>> from vertica_python.datatypes import TYPENAME
>>> TYPENAME   # mapping from OIDs to readable names
{4: 'Unknown', 5: 'Boolean', 6: 'Integer', 7: 'Float', 8: 'Char', 9: 'Varchar', 115: 'Long Varchar', 10: 'Date', 11: 'Time', 15: 'TimeTz', 12: 'Timestamp', 13: 'TimestampTz', 117: 'Binary', 17: 'Varbinary', 116: 'Long Varbinary', 16: 'Numeric', 20: 'Uuid', 300: 'Row', 301: 'Array', 302: 'Map', 1505: 'Array[Boolean]', 1506: 'Array[Int8]', 1507: 'Array[Float8]', 1508: 'Array[Char]', 1509: 'Array[Varchar]', 1510: 'Array[Date]', 1511: 'Array[Time]', 1512: 'Array[Timestamp]', 1513: 'Array[TimestampTz]', 1515: 'Array[TimeTz]', 1516: 'Array[Numeric]', 1517: 'Array[Varbinary]', 1520: 'Array[Uuid]', 1522: 'Array[Binary]', 1519: 'Array[Long Varchar]', 1518: 'Array[Long Varbinary]', 2705: 'Set[Boolean]', 2706: 'Set[Int8]', 2707: 'Set[Float8]', 2708: 'Set[Char]', 2709: 'Set[Varchar]', 2710: 'Set[Date]', 2711: 'Set[Time]', 2712: 'Set[Timestamp]', 2713: 'Set[TimestampTz]', 2715: 'Set[TimeTz]', 2716: 'Set[Numeric]', 2717: 'Set[Varbinary]', 2720: 'Set[Uuid]', 2722: 'Set[Binary]', 2719: 'Set[Long Varchar]', 2718: 'Set[Long Varbinary]'}

Example: Vertica numeric to Python float

Normally Vertica NUMERIC values are converted to Python decimal.Decimal instances, because both the types allow fixed-precision arithmetic and are not subject to rounding. Sometimes, however, you may want to perform floating-point math on NUMERIC values, and decimal.Decimal may get in the way. If you are fine with the potential loss of precision and you simply want to receive NUMERIC values as Python float, you can register a converter on NUMERIC.

import vertica_python
from vertica_python.datatypes import VerticaType

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             ...
            }

conn = vertica_python.connect(**conn_info)
cur = conn.cursor()
cur.execute("SELECT 123.45::NUMERIC")
print(cur.fetchone()[0])
# Decimal('123.45')

def convert_numeric(val, ctx):
    # val: bytes - this is a text representation of NUMERIC value
    # ctx: dict - some info that may be useful to the converter
    return float(val)

cur.register_sqldata_converter(VerticaType.NUMERIC, convert_numeric)
cur.execute("SELECT 123.45::NUMERIC")
print(cur.fetchone()[0])
# 123.45

cur.unregister_sqldata_converter(VerticaType.NUMERIC)  # reset
cur.execute("SELECT 123.45::NUMERIC")
print(cur.fetchone()[0])
# Decimal('123.45')

Example: Vertica complex types

The raw bytes data of complex types (ARRAY, MAP, ROW, SET) are in JSON format for both Text & Binary transfer format.

import json
import numpy as np
import vertica_python
from vertica_python.datatypes import VerticaType

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'binary_transfer': False/True,
             ...
            }

conn = vertica_python.connect(**conn_info)
cur = conn.cursor()

#==================================================
cur.execute("SELECT ARRAY[-1.234, 0, 1.66, null, 50]::ARRAY[FLOAT]")
data = cur.fetchone()[0]
print(type(data))  # <class 'list'>
print(data)  # [-1.234, 0.0, 1.66, None, 50.0]
numpy_data = np.array(data) # This is equal to the query value below
#==================================================
def convert_array(val, ctx):
    # val: b'[-1.234,0.0,1.66,null,50.0]'
    json_data = json.loads(val)
    return np.array(json_data)
# VerticaType.ARRAY1D_FLOAT8 represents one-dimensional array of FLOAT type
cur.register_sqldata_converter(VerticaType.ARRAY1D_FLOAT8, convert_array)
cur.execute("SELECT ARRAY[-1.234, 0, 1.66, null, 50]::ARRAY[FLOAT]")
data = cur.fetchone()[0]
print(type(data))  # <class 'numpy.ndarray'>
print(data)  # [-1.234 0.0 1.66 None 50.0]

#==================================================
# VerticaType.ARRAY represents multidimensional array or contain ROWs
cur.register_sqldata_converter(VerticaType.ARRAY, convert_array)
cur.execute("SELECT ARRAY[ARRAY[-1, 234, 5],ARRAY[88, 0, 19]]::ARRAY[ARRAY[INT]]")
data = cur.fetchone()[0]
print(type(data))  # <class 'numpy.ndarray'>
print(data)
#[[ -1 234   5]
# [ 88   0  19]]

import pandas
import vertica_python
from io import BytesIO
from vertica_python.datatypes import VerticaType

conn_info = {'host': '127.0.0.1',
             'port': 5433,
             'binary_transfer': False/True,
             ...
            }

conn = vertica_python.connect(**conn_info)
cur = conn.cursor()

cur.execute("SELECT ROW(ROW('a','b') as row1, ROW('c','d') as row2)")
data = cur.fetchone()[0]
print(type(data)) # <class 'dict'>
print(data)  # {'row1': {'f0': 'a', 'f1': 'b'}, 'row2': {'f0': 'c', 'f1': 'd'}}

def convert_row(val, ctx):
    # val: b'{"row1":{"f0":"a","f1":"b"},"row2":{"f0":"c","f1":"d"}}'
    return pandas.read_json(BytesIO(val), orient='index')

cur.register_sqldata_converter(VerticaType.ROW, convert_row)
cur.execute("SELECT ROW(ROW('a','b') as row1, ROW('c','d') as row2)")
data = cur.fetchone()[0]
print(type(data)) # <class 'pandas.core.frame.DataFrame'>
print(data)
#      f0 f1
# row1  a  b
# row2  c  d

If you want to learn how default converters for each transfer format and oid works, look at the source code at vertica_python/vertica/deserializer.py

Shortcuts

The Cursor.execute() method returns self. This means that you can chain a fetch operation, such as fetchone(), to the execute() call:

row = cursor.execute(...).fetchone()

for row in cur.execute(...).fetchall():
    ...

Rowcount oddities

vertica_python behaves a bit differently than dbapi when returning rowcounts.

After a select execution, the rowcount will be -1, indicating that the row count is unknown. The rowcount value will be updated as data is streamed.

cur.execute('SELECT 10 things')

cur.rowcount == -1  # indicates unknown rowcount

cur.fetchone()
cur.rowcount == 1
cur.fetchone()
cur.rowcount == 2
cur.fetchall()
cur.rowcount == 10

After an insert/update/delete, the rowcount will be returned as a single element row:

cur.execute("DELETE 3 things")

cur.rowcount == -1  # indicates unknown rowcount
cur.fetchone()[0] == 3

UTF-8 encoding issues

While Vertica expects varchars stored to be UTF-8 encoded, sometimes invalid strings get into the database. You can specify how to handle reading these characters using the unicode_error connection option. This uses the same values as the unicode type (https://docs.python.org/3/library/codecs.html#error-handlers)

cur = vertica_python.Connection({..., 'unicode_error': 'strict'}).cursor()
cur.execute(r"SELECT E'\xC2'")
cur.fetchone()
# caught 'utf8' codec can't decode byte 0xc2 in position 0: unexpected end of data

cur = vertica_python.Connection({..., 'unicode_error': 'replace'}).cursor()
cur.execute(r"SELECT E'\xC2'")
cur.fetchone()
# �

cur = vertica_python.Connection({..., 'unicode_error': 'ignore'}).cursor()
cur.execute(r"SELECT E'\xC2'")
cur.fetchone()
#

License

Apache 2.0 License, please see LICENSE for details.

Contributing guidelines

Have a bug or an idea? Please see CONTRIBUTING.md for details.

Acknowledgements

We would like to thank the contributors to the Ruby Vertica gem (https://github.com/sprsquish/vertica), as this project gave us inspiration and help in understanding Vertica's wire protocol. These contributors are:

vertica-python's People

Contributors

Stargazers

Watchers

Forkers

marklr bearnshaw mverrilli jakubjedelsky rodevl kevints kennethdamica oldmantaiter ruwen mmogylenko bourivouh johano99 carolssnz rahul1 lupko jgao d13sl0w frozen jberka ospreyx upcounsel rforgione jrowan dennisobrien lemeryfertitta shivamy remiremi r-wheeler eliangcs wtj snakescott rutube carlekerot fredrikappelros vparikh10 ominux nchammas nicktrav twneale pauleikis mmourafiq rekbun yileye aoracle stevenstip zmaril deepakprakash nehajaiswal5 mattjacobus ravindraswamy massmutual drefrome jsingh3 bin2000 twheys floux stasfilin bradleyhurley startappdev ivgitk jkim karthiksharmasg etsy anovosiolov satryacode fanxuuu avmukilan tudit mulby alecxe xs2rajni andrewjudson epalese etchandler92 micseb mi-volodin cain-eyespage retsam69 achi0012 bourbaki asitti silentkevin donalddong ptzagk zhtsh pachewise oxanam adeturckheim blackeyepanda bryanherger darwinyip pcerny alextby dschoni pkhanpara lordshinjo vinceatbluelabs ianze qinchaofeng alexmilk

vertica-python's Issues

Bug : cursor.description[idx].name returns only function name when we use an aggregate function on a column

when using cursor.description.name on select sum(a), sum(b) from temp_table we get name = "sum" for both columns while the name should also include attribute ( sum(a) ) to make name unique. We are facing an issue due to this bug.

ConnectionError: Connection is closed when used in a multithreaded environment with Eventlet

vertica-python on Windows

it seems that the package imports crypt which is not supported on Windows.
Any way around this ?

Non-ASCII Parameters Not Working

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import json, vertica_python

cnxinfo = {'host': 'verticadb01', 'port': 5433, 'user': 'user1', 'password': 'blahblah', 'database': 'mydb'}

cnx = vertica_python.connect(**cnxinfo)
crsr = cnx.cursor()

## This doesn't work
#crsr.execute(u'SELECT * FROM my_schema.my_table WHERE lang = %s ORDER BY id', (u'Español',))

# This works
crsr.execute(u'SELECT * FROM my_schema.my_table WHERE lang = \'Español\' ORDER BY id')

d = crsr.fetchall()
cnx.close()
print(json.dumps(d))

cursor.rowcount is incorrect

In vertica_python 0.3.6, it was ok:

In [4]: cursor.execute('SELECT 1 UNION SELECT 2')

In [5]: cursor.rowcount
Out[5]: 2

But in the latest 0.5.1, you have to iterate all rows to get rowcount:

In [10]: cursor.execute('SELECT 1 UNION SELECT 2')

In [11]: cursor.rowcount
Out[11]: 0

In [12]: cursor.fetchall()
Out[12]: [[1], [2]]

In [13]: cursor.rowcount
Out[13]: 2

0.5.5 release on GitHub is missing description and files

Unlike the other releases on GitHub, 0.5.5 appears to be missing a description as well as the usual source package.

cursor.description.name returns "sum","sum when using select sum(column

connection reset in cursor.execute

Please see below

import sys

sys.path.insert(0, '/home/vrbala/Downloads/vertica-python-932fe36bfb8744d903332201a0981954029f8fa2')

from vertica_python import connect

_CONNECTION_INFO = {
        'host': 'host.com',
        'port': 5433,
        'user': 'user',
        'password': 'password',
        'database': 'vertica',
        'read_timeout': 86400
        }

connection = connect(
    _CONNECTION_INFO
    )

c = connection.cursor()
c.execute("select version();")
result = c.fetchall()
print result

c.execute("select * from schema.bala_test;")
result = c.fetchall()
print result

c.execute("update schema.bala_test set x_col = 2; ")
result = c.fetchall()
print result

c.execute("commit; ")
result = c.fetchall()
print result

c.execute("insert into schema.bala_test values (1, 1); ")
result = c.fetchall()
print result

c.execute("commit; ")
result = c.fetchall()
print result

c.execute("select * from schema.bala_test;")
result = c.fetchall()
print result

connection.close()

When I run the script,

% python vertica-test.py
[[u'Vertica Analytic Database v7.1.1-11']]
[[Decimal('1'), u'1'], [Decimal('1'), u'1']]
[[2]]
[]
[[1]]
[]
[[Decimal('1'), u'1'], [Decimal('1'), u'1']]

The test table had 2 lines originally [[1,1],[1,1]]. If you notice even after the update and insert queries, nothing has changed in the tables.

Analysis shows that the reset_connection call in cursor.execute causes every execute call to behave like a transaction and essentially a new connection. This behavior is not correct and breaks code that relies on a connection object representing a single connection & transaction unless user chooses to reboot it.

connect() chokes on unicode arguments

Passing in a dictionary with unicode keys and values to connect() yields:

ConnectionError: argument for 's' must be a string

The stack trace leads back here.

I'm not sure how other database modules handle unicode connection parameters, but I think psycopg2 accepts them. Does it make sense to modify vertica-python to accept unicode parameters as well?

You will have unicode connection information if, for example, you are loading it from a file via json.loads(). The work-around I'm using is to load the connection information using ast.literal_eval(), which gives you a dictionary with ASCII strings.

named arguments collide in substrings

If I use the named arguments :s and :start_date, then :start_date gets interpreted as :s and tart_date

"SELECT * FROM trips WHERE begintrip_lat > :s AND begintrip_at > :start_date LIMIT 4;", params={'s': 37, 'start_date': '2015-4-1'})

yields

'SELECT * FROM trips WHERE begintrip_lat > 37 AND begintrip_at > 37tart_date LIMIT 4;'

any plan to make this work with Python 3 ?

currently it is not compatible.

update error using sqlalchemy

Hi.
I use your library (0.5.5) with sqlalchemy (1.0.12) and get that situation:

`2016-04-22 16:26:24,691 INFO sqlalchemy.engine.base.Engine UPDATE orders SET comment=:comment, ts_last_activity=:ts_last_activity, state=:state, previous_state=:previous_state, flags=:flags WHERE orders.id = :orders_id
2016-04-22 16:26:24,691 INFO sqlalchemy.engine.base.Engine {'comment': '"last_activity"=>"2016-04-22 13:26"', 'orders_id': 2000205, 'previous_state': 500, 'state': 300, 'flags': 2048, 'ts_last_activity': 1461320780}
2016-04-22 16:26:24,790 INFO sqlalchemy.engine.base.Engine ROLLBACK

StaleDataError: UPDATE statement on table 'orders' expected to update 1 row(s); -1 were matched.`

Here I'm checking the session state:

>>> SessionVertica.identity_map
<sqlalchemy.orm.identity.WeakInstanceDict object at 0x7f43c5e46f90>
>>> SessionVertica.identity_map.check_modified()
True
>>> SessionVertica.identity_map.keys()
[(<class 'hasoffers.core.model.order.S'>, (2000205,))]
>>> recounted_base.id
2000205

As you see, I have rows for update, but i get the error above.
Do you have any idea, what does it means?

import vertica_python

after installing vertica-python, I encountered the following issue:

import vertica_python
File "C:\Users\user_name\AppData\Local\Continuum\Anaconda3\lib\site-packages\vertica_python-0.5.5-py3.5.egg\vertica_python\vertica\connection.py", line 153
except Exception, e:
^
SyntaxError: invalid syntax

errors not reported when multiple statements in a query

When cursor.execute is passed a query that has multiple statements, and one of the statements results in a database error, that error is not raised (unless it is the first statement).

Here's an example of a query that when run on its own raises a MissingColumn exception:

import vertica_python

params = get_default_db_params()
connection = vertica_python.connect(**params)
cursor = connection.cursor()
cursor.execute('select a;')  # bad query to raise and exception

When this query is among multiple statements (and not the first statement) no error is raised.

cursor.execute('select 1; select a;')

The expected results is a MissingColumn exception.

For reference, pyodbc raises a ProgrammingError, and jaydebeapi raises a java.sql.SQLSyntaxErrorExceptionPyRaisable.

Tested with vertica_python 0.6.3 on Python 2.7.11+ and Python 3.5.1+, with Vertica 7.2.3 db.

SSL socket is blocking

When using an SSL connection, the socket blocks indefinitely.

Here's the line that is currently blocking the connection:

ready = select.select([self._socket()], [], [], self.options['read_timeout'])

Here is the patch that fixed the issue in the Ruby gem.

"Cursor is closed" exception after upgrading to 0.4.5

After upgrading to 0.4.5 I'm got 'Cursor is closed' exception on second query. code example:

    sql = vertica_python.connect(params)

    cur = sql.cursor('dict')
    cur.execute("select 1 from dual")
    print(cur.fetchall())
    cur.close()

    cur = sql.cursor('dict')
    cur.execute("select 2 from dual")
    print(cur.fetchall())
    cur.close()

$ ./test.py 
[{'?column?': 1}]
Traceback (most recent call last):
  File "./test.py", line 23, in main
    cur.execute("select 2 from dual")
  File "/usr/local/lib/python2.6/dist-packages/vertica_python/vertica/cursor.py", line 47, in execute
    raise errors.Error('Cursor is closed')
vertica_python.errors.Error: Cursor is closed

this worked fine before and works with other sql databases.

psycopg2

You use psycopg2 module in vertica/cursor.py, but it isn't in requirements. Shouldn't it be there? Or don't you plan to replace it? I'm preparing an rpm package for fedora/epel and need to add psycopg2 as dependency.

Dependency on Unix specific package crypt

I got the following while trying
from vertica_python import connect

ImportError Traceback (most recent call last)
in ()
----> 1 from vertica_python import connect

build\bdist.win32\egg\vertica_python__init__.py in ()

build\bdist.win32\egg\vertica_python\vertica\connection.py in ()

build\bdist.win32\egg\vertica_python\vertica\messages__init__.py in ()

build\bdist.win32\egg\vertica_python\vertica\messages\frontend_messages\password.py in ()

ImportError: No module named crypt

COPY FROM LOCAL does not work?

I am not sure, if COPY FROM LOCAL should work, so do not hesitate to close this issue with status "not yet implemented" ;-)

Please, see simple PoC demonstrating the issue here:
https://github.com/jaceksan/vertica-python/blob/copy_local_poc/jacek/copy_local_poc.py

Problem with nordic character i.e. ä

Got this error on this line in connection.py self.connection.write(messages.Query(operation))

when execute('update users set lastname= :lastname where username = :username', {'lastname': 'someöö', 'username': 'sss'}

Is it a encoding problem?

vertica_python.errors.ConnectionError: unsupported authentication method: 9

I'm not sure if a setting changed on the cluster I connect to or not, however I started getting this error out of the blue, nothing has changed on the code side. It happens on connect using:

 { 'host': 'vertica', 'port': 5433, 'user': 'someuser', 'password': 'somepassword', 'database': 'MYDB'}

Any ideas where to look or what this is about?

vertica-python 0.4.1 errors on commit

vertica-python version: 0.4.1
vertica version: 7.0.1-4

Please look at the script below:

import sys
sys.path.insert(0, '/path/to/vertica-python/vertica-python-master')
from vertica_python import connect

connection = connect({
        'host': 'host.vertica.com',
        'port': 5433,
        'user': 'user',
        'password': 'password',
        'database': 'vertica',
        'read_timeout': 86400
        }
    )

c = connection.cursor()
c.execute("COMMIT;")
result = c.fetchall()
connection.close()
print result

This is a contrived example but when I run this script,

% python2.7 vertica-test.py
Message type: <class 'vertica_python.vertica.messages.backend_messages.notice_response.NoticeResponse'>
<vertica_python.vertica.messages.backend_messages.notice_response.NoticeResponse object at 0x7ffff0a06510>
Traceback (most recent call last):
  File "vertica-test.py", line 18, in <module>
    c.execute("COMMIT;")
  File "/path/to/vertica-python/vertica-python-master/vertica_python/vertica/cursor.py", line 73, in execute
    raise errors.QueryError("Unexpected message type", self.last_execution)
  File "/path/to/vertica-python/vertica-python-master/vertica_python/errors.py", line 77, in __init__
    error_response.error_message(), repr(self.one_line_sql()))

How to iterate rows?

In the current implementation, is there a way to iterate and process rows without loading all rows in raw form into memory?

python3.x support

A patch is being made to support python3 by the debian guys ( attached to this issue ).

It may be great to integrate it in the source or in a branch for the nondebian users ?
python3-vertica-patch.txt

Version 0.5.4 is not Python 2 compatible

The latest release of vertica-python on PyPi breaks compatibility with Python 2. This is not documented, and hopefully, not intentional.

A change made here raises the following error when a connection attempt is made:
vertica_python.errors.ConnectionError: str() takes at most 1 argument (2 given)

I cannot find the change in this repo, but the diff looks like this:

--- a/vertica-python-0.5.2/vertica_python/vertica/messages/frontend_messages/query.py
+++ b/vertica-python-0.5.4/vertica_python/vertica/messages/frontend_messages/query.py
@@ -1,4 +1,4 @@
-from __future__ import absolute_import
+

 from struct import pack

@@ -13,7 +13,7 @@ class Query(FrontendMessage):
     def to_bytes(self):
         s = self.query_string
         if isinstance(s, str):
-                s = unicode(s, 'utf-8')
+                s = str(s, 'utf-8')
         encoded = s.encode('utf-8')
         return self.message_string(pack('{0}sx'.format(len(encoded)), encoded))

Support for UPDATE, DELETE statements

Is there any chance to support these additional statements?

Thx

Copy method cannot read response

I am trying a simple case of inserting to an empty test table using the COPY method. Vertica 6.1.1.

from vertica_python import connect
from StringIO import StringIO

connection = connect({
    'host': '127.0.0.1', 'port': 5433,
    'user': USER, 'password': PASSWORD,
    'database': 'MainDB'
})

io = StringIO()
io.write('1|hello\n')
io.seek(0)

connection.copy("COPY testpyvertica FROM STDIN", io)

Results in:

Traceback (most recent call last):
  File "/home/danielh/Projects/generic/generic.py", line 22, in <module>
    connection.copy("COPY testpyvertica FROM STDIN", io)
  File "/usr/local/lib/python2.7/dist-packages/vertica_python/vertica/connection.py", line 193, in copy
    return self.run_with_job_lock(job)
  File "/usr/local/lib/python2.7/dist-packages/vertica_python/vertica/connection.py", line 210, in run_with_job_lock
    return job.run()
  File "/usr/local/lib/python2.7/dist-packages/vertica_python/vertica/query.py", line 31, in run
    message = self.connection.read_message()
  File "/usr/local/lib/python2.7/dist-packages/vertica_python/vertica/connection.py", line 159, in read_message
    raise ConnectionError(e.message)
vertica_python.vertica.error.ConnectionError: unpack requires a string argument of length 5

The table remains empty after the code has run.

Release versioning

Hi,
fedora/epel package maintainer here. I have a small request - is it possible to tag releases of your code in one way? There are tags starting with "v" (eg. v0.3.5) and others without (eg. 0.3.6).
Thanks!

Support for named parameters is broken in Python 3

Support for named parameters is not working in Python 3. This seems to stem from how psycopg2 is used to quote the parameters.

The following example demonstrates how it breaks down for certain unicode characters:

>>> from vertica_python import connect
>>> conn = connect(host='localhost', database='docker', user='dbadmin', password='')
>>> cursor = conn.cursor()
>>> cursor.execute('SELECT :param', {'param': u'\u16b1'})
>>> cursor.fetchone()
[b'\\341\\232\\261']

We can also see it breaking down for ordinary characters that should not give us any issue regardless of what encoding we are using:

>>> from vertica_python import connect
>>> conn = connect(host='localhost', database='docker', user='dbadmin', password='')
>>> cursor = conn.cursor()
>>> cursor.execute('CREATE TABLE test (test_col VARCHAR)')
>>> cursor.execute('SELECT 1 FROM test WHERE test_col = :param', {'param': 'test123'})
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    cursor.execute('SELECT 1 FROM test WHERE test_col = :param', {'param': 'test123'})
  File "/home/fredrik/projects/vertica-python/vertica_python/vertica/cursor.py", line 93, in execute
    raise errors.QueryError.from_error_response(message, operation)
vertica_python.errors.QueryError: Severity: b'ERROR', Message: b'Operator does not exist: varchar = varbinary', Sqlstate: b'42883', Hint: b'No operator matches the given name and argument type(s). You may need to add explicit type casts', Routine: b'op_error', File: b'parse_oper.c', Line: b'577', SQL: "SELECT 1 FROM test WHERE test_col = 'test123'::bytea"

The issue comes from this line. It results in a byte array that represents the quoted original string, but with ::bytea attached to the end. This in turn causes both of the issues I listed above. See the following example of what adapt(...).getquoted() does with UTF-8 encoded byte arrays.

>>> from psycopg2.extensions import adapt
>>> adapt('test123'.encode('utf8')).getquoted()
b"'test123'::bytea"

Python3 cursor.description uses bytes objects for column names instead of strings

First -- great work bringing Python3 compatibility!

Now to the bug...

When accessing the names of columns from a result set using vertica-python under Python3, the values are byte objects instead of strings. This breaks things like pandas code that expects to access column names by their string value, not by a bytes object.

Here's an example run on Ubuntu 14.04.4, Python 3.4.3, vertica-python 0.6.2:

connection = vertica_python.connect(**{
    'host': 'myverticahost',
    'port': 5433,
    'database': 'database',
    'user': username,
    'password': password,
})

query = """
select 'US' as isocode, 'United States' as name
union all
select 'CA' as isocode, 'Canada' as name
union all
select 'MX' as isocode, 'Mexico' as name
"""
cursor = connection.cursor()
cursor.execute(query)
description = cursor.description
print(description[0].name)

Output:

b'isocode'

Ideally the value would be 'isocode'.

With pandas:

import pandas as pd
df = pd.read_sql(query, connection)
print(df.columns)

Output:

Index([b'isocode', b'name'], dtype='object')

I think a fix would be to change this line to add a .decode() to the end.
https://github.com/uber/vertica-python/blob/6cceb6e8726948f8752ca258004f87864e5b491d/vertica_python/vertica/column.py#L127

        self.name = col['name'].decode()

If column names can be None then a little more care would be required.

I assume column names can always be interpreted as utf-8. I haven't tested this, but the Vertica documentation suggests this.
https://my.vertica.com/docs/7.1.x/HTML/Content/Authoring/SQLReferenceManual/LanguageElements/Identifiers.htm

setup.py depends on pip.req, causing pants to fail with twitter.common.python

Pants runs setup.py in an isolated sys.path containing only stdlib modules. Since pip is not part of the stdlib until Python 3.4 (PEP-453) it's not available in this environment causing pip to fail.

% ./pants py -r vertica-python
**** Failed to install vertica-python-0.2.1. stdout:

**** Failed to install vertica-python-0.2.1. stderr:
Traceback (most recent call last):
  File "<stdin>", line 6, in <module>
  File "setup.py", line 5, in <module>
    from pip.req import parse_requirements
ImportError: No module named pip.req

Traceback (most recent call last):
  File "/home/ksweeney/workspace/aurora/build-support/pants-0.0.23.pex/.bootstrap/_twitter_common_python/pex.py", line 223, in execute
    self.execute_entry(entry_point, args)
  File "/home/ksweeney/workspace/aurora/build-support/pants-0.0.23.pex/.bootstrap/_twitter_common_python/pex.py", line 271, in execute_entry
    runner(entry_point)
  File "/home/ksweeney/workspace/aurora/build-support/pants-0.0.23.pex/.bootstrap/_twitter_common_python/pex.py", line 294, in execute_pkg_resources
    runner()
  File "/home/ksweeney/.pex/install/twitter.pants-0.0.23-py2-none-any.whl.43ba1ea2658c9c6cead0dacfb32a3ee8bf9c140c/twitter.pants-0.0.23-py2-none-any.whl/twitter/pants/bin/pants_exe.py", line 194, in main
    _run()
  File "/home/ksweeney/.pex/install/twitter.pants-0.0.23-py2-none-any.whl.43ba1ea2658c9c6cead0dacfb32a3ee8bf9c140c/twitter.pants-0.0.23-py2-none-any.whl/twitter/pants/bin/pants_exe.py", line 177, in _run
    result = command.run(lock)
  File "/home/ksweeney/.pex/install/twitter.pants-0.0.23-py2-none-any.whl.43ba1ea2658c9c6cead0dacfb32a3ee8bf9c140c/twitter.pants-0.0.23-py2-none-any.whl/twitter/pants/commands/command.py", line 97, in run
    return self.execute()
  File "/home/ksweeney/.pex/install/twitter.pants-0.0.23-py2-none-any.whl.43ba1ea2658c9c6cead0dacfb32a3ee8bf9c140c/twitter.pants-0.0.23-py2-none-any.whl/twitter/pants/commands/py.py", line 179, in execute
    executor.dump()
  File "/home/ksweeney/.pex/install/twitter.pants-0.0.23-py2-none-any.whl.43ba1ea2658c9c6cead0dacfb32a3ee8bf9c140c/twitter.pants-0.0.23-py2-none-any.whl/twitter/pants/python/python_chroot.py", line 201, in dump
    platforms=platforms)
  File "/home/ksweeney/.pex/install/twitter.pants-0.0.23-py2-none-any.whl.43ba1ea2658c9c6cead0dacfb32a3ee8bf9c140c/twitter.pants-0.0.23-py2-none-any.whl/twitter/pants/python/resolver.py", line 109, in resolve_multi
    platform=platform)
  File "/home/ksweeney/.pex/install/twitter.common.python-0.5.6-py2-none-any.whl.4f8c7b255333f2398b1dbb1621bf5a1175a43fe3/twitter.common.python-0.5.6-py2-none-any.whl/twitter/common/python/resolver.py", line 107, in resolve
    new_requirements.update(requires(highest_package, requirement))
  File "/home/ksweeney/.pex/install/twitter.common.python-0.5.6-py2-none-any.whl.4f8c7b255333f2398b1dbb1621bf5a1175a43fe3/twitter.common.python-0.5.6-py2-none-any.whl/twitter/common/python/resolver.py", line 81, in requires
    raise Untranslateable('Package %s is not translateable.' % package)
Untranslateable: Package SourcePackage(u'https://pypi.python.org/packages/source/v/vertica-python/vertica-python-0.2.1.tar.gz#md5=7c02e55913b18f2e082f166dfc40894f') is not translateable.

Cc: @wickman

DBAPI spec conformance

Thanks for putting out this code!

Any chance of seeing the library's API conform to the Python standard DB API?

http://www.python.org/dev/peps/pep-0249/

That way it would play well with other code in the Python DB ecosystem, such as sqlalchemy.

(Speaking of which - any tips on quickly getting this going with sqlalchemy?)

Inserts fail with Cannot commit: No transaction in progress

from vertica_python import connect

def get_connection():
return connect({
'host': '192.168.19.170',
'port': 5433,
'user': 'vertica_dba',
'password': 'password',
'database': 'ceilometer'
})

with get_connection() as conn:
insert_stmt = "INSERT INTO %s..."
cur = conn.cursor()
cur.execute(insert_stmt)

fails with the following stack trace:

raise errors.QueryError("Unexpected message type", self.last_execution)
vertica_python.errors.QueryError: Unexpected message type, SQL: 'commit'

how to use application load balancing with vertica-python module

Could you please provide sample code to specify application load balancing as part opening connection with vertica python module.

{code}
'ConnectionLoadBalance' : 1
{code}

Connection VIA DSN?

connection = connect({
'host': '127.0.0.1',
'port': 5433,
'user': 'some_user',
'password': 'some_password',
'database': 'a_database'

})

any way to do it like pyodbc:
e.g.
vertica_python.connect('host=xxx.xxx.xxx;dsn=vertica;uid=vertica')
?

DELETE / UPDATE statements do not set rowcount

The rowcount variable is not set when using a DELETE or UPDATE statement.

Decimal support is broken python3

If vertical column contains numeric value, vertica python tries to convert it to Decimal and the following error occurs:

  File "/Users/psileikis/Workspaces/vertica-python/vertica_python/vertica/column.py", line 119, in <lambda>
    ('numeric', lambda s: Decimal(s)),
nose.proxy.TypeError: conversion from bytes to Decimal is not supported

read_timeout default

When the read_timeout default is set to 10 minutes and I have a long running query, I get a connection timeout error. Changing this to 60 minutes fixed my problem.

Does this parameter always cause a connection timeout for queries that exceed the timeout? (As in, is this the intent of this parameter, or is there some other side-effect going on, like a heartbeat not working, etc) If so, 10 minutes default seems pretty short and should be called out in the docs somewhere.

SSL connections fail in python3 due to minor bytes/str distinction

This one's an easy fix, pull request here: #114

handler in cursor.execute throw error

As the readme.md mentions I tried using the handler but it throws an error. Upon looking at the code I don't see anything that supports it either. Am I missing something ?

Feature request: setting client_label

I'm not sure if there is a way to do this, but it would be very helpful if we could set the client_label (like JDBC does). This is also sometimes called a session label, I think.

CRUD Operations

Sorry to leave an Issue, but there was no email address to contact. My question is does the driver support normal CRUD operations?

please set 0.2.0 tag on github

Is there some reason for not setting tag 0.2.0 on github? It is quite different from what pypi says.

Many thanks!

Automated tests via Travis (or some other CI service)

Are the maintainers interested in automating the project's tests via a CI service like Travis? We can use a dockerized instance of Vertica to enable this.

It'll let us test PRs and hopefully move more quickly on refactoring the codebase or adding stuff as the need arises.

Python 3 timestamp_tz_parse TypeError

The following function:

def timestamp_tz_parse(s):
    # if timezome is simply UTC...
    if s.endswith('+00'):
        # remove time zone
        ts = timestamp_parse(s[:-3])
        ts = ts.replace(tzinfo=pytz.UTC)
        return ts
    # other wise do a real parse (slower)
    return parser.parse(s)

raises TypeError: endswith first arg must be bytes or a tuple of bytes, not str at this line.

It appears that s is a bytes object, not str. Changing if s.endswith('+00'): to if s.endswith(b'+00'): resolves the issue for me.

I'm using:

Python 3.4.4
vertica-python==0.6.2

Release 0.6.0 does not install properly with pip

Trying to import the latest release of vertica-python fails due to some dependencies are missing from setup.py:

Python 2.7.11 (default, Jan 21 2016, 16:14:29) 
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import vertica_python
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/fredrik/.pyenv/versions/vertica-test/lib/python2.7/site-packages/vertica_python/__init__.py", line 3, in <module>
    from vertica_python.vertica.connection import Connection
  File "/home/fredrik/.pyenv/versions/vertica-test/lib/python2.7/site-packages/vertica_python/vertica/connection.py", line 10, in <module>
    from builtins import str
ImportError: No module named builtins

Seem like future has been added as a dependency, but only to the requirements.txt file, but not the actual list of dependencies in setup.py.

extend connections to be context managers

For example:

with psycopg2.connect(DSN) as conn:
    with conn.cursor() as curs:
        curs.execute(SQL)
When a connection exits the with block, if no exception has been raised by the block, the transaction is committed. In case of exception the transaction is rolled back. In no case the connection is closed: a connection can be used in more than a with statement and each with block is effectively wrapped in a transaction.

Right now this doesn't appear to be possible in vertica-python. It would be nice to have similar behavior, though I gather this is not a standard part of the Python DB API.

executemany missing?

Hello,

I see that the module is missing executemany function, Do you have any plans to implement it in next version?

Regards,
Srinivasa KP

0.2.1 tag

Please add tag for 0.2.1 version. I use github as main source in RPM packages, so I can't update packages effectively (and officially). Thanks!

Cursor.nextset support?

Hey there,

We've been using vertica-python to run SQL queries containing multiple insert statements (into temp tables) followed by a select statement at the end to join all the temp table data and pull it out, e.g.

INSERT INTO temp_table (...) SELECT ... ;
INSERT INTO temp_table (...) SELECT ... ;
INSERT INTO temp_table (...) SELECT ... ;
SELECT ... from temp_table WHERE ...;

Somewhere around this commit I think vertica-python's behaviour changed to return the result of the first statement rather than the last: 102f681. This isn't ideal because that means our jobs are returning the number of rows inserted to the temp table rather than the data we actually want.

As a result we're stuck on vertica-python 0.3.6 :(

I thought that I could make a workaround involving the Cursor.nextset() function but then I saw it's not implemented. Actually, in that commit above it seems the placeholder was also removed.

Any chance we can work together to either restore the previous behaviour or implement the .nextset function?

This is an awesome library btw :)

Cheers
David

Recommend Projects

React

A declarative, efficient, and flexible JavaScript library for building user interfaces.
Vue.js

🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
Typescript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
TensorFlow

An Open Source Machine Learning Framework for Everyone
Django

The Web framework for perfectionists with deadlines.
Laravel

A PHP framework for web artisans
D3

Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

javascript

JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
web

Some thing interesting about web. New door for the world.
server

A server is a program made to process requests and deliver data to clients.
Machine learning

Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Visualization

Some thing interesting about visualization, use data art
Game

Some thing interesting about game, make everyone happy.

Recommend Org

Facebook

We are working to build community through open source technology. NB: members must have two-factor auth.
Microsoft

Open source projects and samples from Microsoft.
Google

Google ❤️ Open Source for everyone.
Alibaba

Alibaba Open Source for everyone
D3

Data-Driven Documents codes.
Tencent

China tencent open source team.