
pg8000's People

Contributors

etrepum, gi0baro, hackgnar, jamadden, krysros, lucmult, mfenniak, nad2000, paulbarter, pfhayes, phdru, reingart, repl-mathieu-fenniak, sitaktif, tlocke, ulope, williamjmorenor, zeha, zilder, zzzeek


pg8000's Issues

Make user optional on connect

This is convenient for testing scenarios where you just want to test against localhost as the current user (as far as I can tell, env["USER"] or similar is what libpq/psycopg2 uses as the default).
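
A minimal sketch of the requested fallback (the helper below is illustrative, not pg8000's actual API; PGUSER is libpq's own override variable):

    import getpass
    import os

    def default_user():
        # Mirror libpq's behaviour: prefer PGUSER, then the OS user name.
        return os.environ.get("PGUSER") or getpass.getuser()

    # pg8000.connect(user=user if user is not None else default_user())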

LookupError: unknown encoding: UNICODE (Amazon RedShift)

Good morning. I'm glad to see this library. I'm a big fan of the increasingly popular mysql-connect library for Python, and this seems to be an excellent brother in arms. Maybe rename it to postgresql-connect so that people can find it more readily, as it took me a while.

Anyhow, I'm running into a "LookupError: unknown encoding: UNICODE" when connecting to Amazon's data warehouse, RedShift, and was wondering if you knew of a way I could set the encoding?

Code below.

Cheers,

-Kevin

from pg8000 import DBAPI
conn = DBAPI.connect(host="my-redshift-server.us-west-2.redshift.amazonaws.com", port=1234, database="insight", user="kevin", password="1234")
cursor = conn.cursor()
cursor.execute("select * from monkeys where monkey_id = 3")
cursor.fetchone()

DEFAULT sequence and NOT NULL Conflict

I have the following default column (inspected with \d table):

 alt_name_id   | integer  | not null default nextval('alt_names_id_seq'::regclass)

When inserting and not specifying a value for the column, I am getting a ProgrammingError exception because I haven't set a value for the NOT NULL column:

pg8000.errors.ProgrammingError: ('ERROR', '23502', 'null value in column "alt_name_id" violates not-null constraint')

I realise it's a kind of chicken-and-egg problem, but it seems the NOT NULL constraint is being checked before the DEFAULT is applied. There is no problem with the same INSERT statement run directly on the Postgres server.

My workaround for the problem was just to specify the nextval() directly in my INSERT statement.

    INSERT INTO alt_names 
    (alt_name_id, canon_name_id, alt_name) 
    VALUES ( nextval('alt_names_id_seq'), :1, :2 );
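
A standard-SQL alternative (untested here, but it avoids naming the sequence) is the DEFAULT keyword, which asks the server to apply the column default itself:

    INSERT INTO alt_names 
    (alt_name_id, canon_name_id, alt_name) 
    VALUES ( DEFAULT, :1, :2 );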

Versions:
pg8000: 1.9.6
PGSQL: 9.3

9.4 beta 2 fails to connect

Reporting configuration

Reporting platform: OSX 10.9.4 (13E28)
Reporting postgres: Postgres.app 9.4 beta 2 here
Reporting Python: Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 00:54:21) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
pg8000.__version__ = 1.9.14

Reproduction steps:

import pg8000
conn = pg8000.connect(user="myuser")

Expected results:

Connection success

Actual results:

Traceback (most recent call last):
  File "pg8000connect.py", line 9, in testConnect
    conn = pg8000.connect(user="myuser")
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pg8000/__init__.py", line 148, in connect
    user, host, unix_sock, port, database, password, socket_timeout, ssl)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pg8000/core.py", line 1179, in __init__
    raise exc_info()[1]
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pg8000/core.py", line 1174, in __init__
    self.handle_messages(None)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pg8000/core.py", line 1650, in handle_messages
    self._read(data_len - 4), cursor)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pg8000/core.py", line 1714, in handle_PARAMETER_STATUS
    map(int, value.decode("ascii").split('.')[:2]))
ValueError: invalid literal for int() with base 10: '4beta2'

----------------------------------------------------------------------

Root cause

Seems to be caused by "beta" version strings (here '9.4beta2'), which the server-version parser does not expect.

Copy-Interface should support Columns

The copy interface, as it is implemented right now, does not support specifying which columns the data should go into. For some cases this is really useful.

See: http://www.postgresql.org/docs/current/static/sql-copy.html

Here is a patch for implementing it:

@@ -550,11 +550,14 @@ class Cursor(Iterator):
             self.execute(operation, parameters)
         self._row_count = -1

-    def copy_from(self, fileobj, table=None, sep='\t', null=None, query=None):
+    def copy_from(self, fileobj, table=None, sep='\t', null=None, query=None, columns=None):
         if query is None:
             if table is None:
                 raise CopyQueryOrTableRequiredError()
-            query = "COPY %s FROM stdout DELIMITER '%s'" % (table, sep)
+            if columns is None:
+                query = "COPY %s FROM stdout DELIMITER '%s'" % (table, sep)
+            else:
+                query = "COPY %s (%s) FROM stdout DELIMITER '%s'" % (table, ",".join(columns), sep)
             if null is not None:
                 query += " NULL '%s'" % (null,)
         self.copy_execute(fileobj, query)
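
A usage sketch of the patched signature (assuming the patch above is applied; the table, file and column names are illustrative):

    cursor = conn.cursor()
    with open("names.tsv") as f:
        # Load only canon_name_id and alt_name; alt_name_id takes its default.
        cursor.copy_from(f, table="alt_names",
                         columns=("canon_name_id", "alt_name"))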

Not working with SA enums properly.

I am trying to use SA 0.6's new Enum code, but pg8000 chokes on it. Here is the example code and the error:

from sqlalchemy import *
from sqlalchemy.orm import *
url = 'postgresql+pg8000://postgres@localhost/test8000'

engine = create_engine(url, encoding='utf-8')

metadata = MetaData(bind=engine)
metadata.drop_all()

metadata = MetaData(bind=engine)
some_table = Table(u'somes', metadata,
    Column(u'some_id', Integer, primary_key=True),
    Column(u'e', Enum('one', 'two', name=u'e'))
)

metadata.create_all()

class Some(object):pass

mapper(Some, some_table)

session = sessionmaker(bind=engine)()

s = Some()
setattr(s, 'e', 'one')

session.add(s)
session.flush()
session.commit()

session.expunge_all()

print session.query(Some).all()

produces:

$ python pg8000test.py
Traceback (most recent call last):
  File "pg8000test.py", line 28, in <module>
    session.flush()
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/orm/session.py", line 1307, in flush
    self._flush(objects)
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/orm/session.py", line 1385, in _flush
    flush_context.execute()
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/orm/unitofwork.py", line 261, in execute
    UOWExecutor().execute(self, tasks)
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/orm/unitofwork.py", line 753, in execute
    self.execute_save_steps(trans, task)
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/orm/unitofwork.py", line 768, in execute_save_steps
    self.save_objects(trans, task)
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/orm/unitofwork.py", line 759, in save_objects
    task.mapper._save_obj(task.polymorphic_tosave_objects, trans)
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/orm/mapper.py", line 1420, in _save_obj
    c = connection.execute(statement.values(value_params), params)
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/engine/base.py", line 991, in execute
    return Connection.executors[c](self, object, multiparams, params)
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement
    return self.__execute_context(context)
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/engine/base.py", line 1076, in __execute_context
    self._cursor_execute(context.cursor, context.statement, context.parameters[0], context=context)
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/engine/base.py", line 1138, in _cursor_execute
    self._handle_dbapi_exception(e, statement, parameters, cursor, context)
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/engine/base.py", line 1136, in _cursor_execute
    self.dialect.do_execute(cursor, statement, parameters, context=context)
  File "/Users/percious/clients/mvpss/mvp2.1/src/sqlalchemy_trunk/lib/sqlalchemy/engine/default.py", line 207, in do_execute
    cursor.execute(statement, parameters)
  File "/Users/percious/clients/mvpss/mvp2.1/src/pg8000/pg8000/dbapi.py", line 243, in _fn
    return fn(self, *args, **kwargs)
  File "/Users/percious/clients/mvpss/mvp2.1/src/pg8000/pg8000/dbapi.py", line 312, in execute
    self._execute(operation, args)
  File "/Users/percious/clients/mvpss/mvp2.1/src/pg8000/pg8000/dbapi.py", line 317, in _execute
    self.cursor.execute(new_query, *new_args)
  File "/Users/percious/clients/mvpss/mvp2.1/src/pg8000/pg8000/interface.py", line 303, in execute
    self._stmt = PreparedStatement(self.connection, query, statement_name="", *[{"type": type(x), "value": x} for x in args])
  File "/Users/percious/clients/mvpss/mvp2.1/src/pg8000/pg8000/interface.py", line 108, in __init__
    self._parse_row_desc = self.c.parse(self._statement_name, statement, types)
  File "/Users/percious/clients/mvpss/mvp2.1/src/pg8000/pg8000/protocol.py", line 920, in _fn
    return fn(self, *args, **kwargs)
  File "/Users/percious/clients/mvpss/mvp2.1/src/pg8000/pg8000/protocol.py", line 1094, in parse
    return reader.handle_messages()
  File "/Users/percious/clients/mvpss/mvp2.1/src/pg8000/pg8000/protocol.py", line 906, in handle_messages
    raise exc
sqlalchemy.exc.ProgrammingError: (ProgrammingError) ('ERROR', '42804', 'column "e" is of type e but expression is of type text') u'INSERT INTO somes (e) VALUES (%s) RETURNING somes.some_id' ['one']

Support for PyPy

Is PyPy supported? Can you please add it to the build process to make sure?

pg8000.errors.ProgrammingError: ('ERROR', '34000', 'portal "pg8000_portal_1" does not exist')

I'm getting this with 1.9.9. Please let me know any thoughts you have on how I might be using the library wrongly to end up with this. Note that my script runs for quite a while, doing many db calls, before it hits this, so subjectively it feels like pg8000 is getting itself confused after a while. I tried doing shorter transactions, but that didn't seem to help. Thanks very much for any guidance!

File "/usr/local/lib/python2.7/dist-packages/pg8000/core.py", line 658, in fetchmany
islice(self, self.arraysize if num is None else num))
File "/usr/local/lib/python2.7/dist-packages/pg8000/six.py", line 441, in next
return type(self).next(self)
File "/usr/local/lib/python2.7/dist-packages/pg8000/core.py", line 2044, in next
self.c.handle_messages(self)
File "/usr/local/lib/python2.7/dist-packages/pg8000/core.py", line 1619, in handle_messages
raise error
pg8000.errors.ProgrammingError: ('ERROR', '34000', 'portal "pg8000_portal_1" does not exist')

Caching prepared statements

At the moment the only time we re-use a prepared statement is in executemany(). Even there, the prepared statement isn't re-used between calls to executemany(). I propose having an option to cache prepared statements, keyed on the query string.
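
A rough sketch of the idea (the names below are illustrative, not pg8000's internals): key the cache on the query text, so later executes of the same text reuse the server-side statement instead of re-parsing.

    class StatementCache(object):
        """Map query strings to already-prepared statements."""

        def __init__(self):
            self._statements = {}

        def get(self, connection, query):
            ps = self._statements.get(query)
            if ps is None:
                # Hypothetical prepare step; pg8000 would send PARSE here.
                ps = connection.prepare(query)
                self._statements[query] = ps
            return ps

A real implementation would also need a size bound and invalidation on DDL, since schema changes can break a cached plan.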

Remap Py2 next()

Possible optimization for Py2: remap the next() method of the cursor class rather than inheriting from the six.py Iterator.
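
A sketch of the remapping (a common Py2/Py3 compatibility idiom, not pg8000's actual code):

    import sys

    class Cursor(object):
        def __next__(self):
            # ... fetch and return the next row, or raise StopIteration ...
            raise StopIteration

    if sys.version_info[0] == 2:
        # Bind next directly, avoiding six.Iterator's extra call indirection.
        Cursor.next = Cursor.__next__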

Support for PostgreSQL 9.1

PostgreSQL 9.1 adds a new oid which turns up a lot and causes introspection in SQLAlchemy to fail when using pg8000 1.08. I assume once I submit this issue I get to attach the patch which will fix it...

Can I register my own composite types?

I use psycopg2 now, but this project looks interesting!

In psycopg2, I can register composite types, so that when I do something like

select (users.*)::users
from users
where users.user_id = 99

Instead of just getting a tuple / dict / list of the data in that row, I get an instance of the class I registered for the users table / type.

Is this possible or on the roadmap for pg8000?
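
For reference, the psycopg2 behaviour described above looks like this (a sketch; it assumes a composite type named users exists on the server):

    import psycopg2
    import psycopg2.extras

    conn = psycopg2.connect("dbname=test")
    cur = conn.cursor()
    # Map the server-side composite type "users" to a Python named tuple.
    psycopg2.extras.register_composite("users", cur)
    cur.execute("select (users.*)::users from users where users.user_id = %s",
                (99,))
    row = cur.fetchone()[0]  # a named tuple, not a plain tuple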

Exception on connect

When calling DBAPI.connect I get the following exception:

conn = DBAPI.connect(host="dbserv1", database='testdb3', user="postgres", password="****_**")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/karl/Desktop/dbtest/pg8000/dbapi.py", line 574, in connect
    password=password, socket_timeout=socket_timeout, ssl=ssl)
  File "/home/karl/Desktop/dbtest/pg8000/dbapi.py", line 470, in __init__
    self.conn = interface.Connection(**kwargs)
  File "/home/karl/Desktop/dbtest/pg8000/interface.py", line 433, in __init__
    self.c.authenticate(user, password=password, database=database)
  File "/home/karl/Desktop/dbtest/pg8000/protocol.py", line 1030, in authenticate
    self._cache_record_attnames()
  File "/home/karl/Desktop/dbtest/pg8000/protocol.py", line 1048, in _cache_record_attnames
    [])
  File "/home/karl/Desktop/dbtest/pg8000/protocol.py", line 918, in _fn
    return fn(self, *args, **kwargs)
  File "/home/karl/Desktop/dbtest/pg8000/protocol.py", line 1090, in parse
    return reader.handle_messages()
  File "/home/karl/Desktop/dbtest/pg8000/protocol.py", line 904, in handle_messages
    raise exc
pg8000.errors.ProgrammingError: ('ERROR', '42846', 'cannot cast type regproc to text')

Connection with psycopg2 using the same parameters works fine.

New release

I think all the outstanding bugs have been fixed, so do we think it's time for a new release? I'm not totally sure what's involved. @mfenniak, is there anything I can do to help things along?

UnicodeDecodeError with sqlalchemy

Since I've updated pg8000 from 1.08 to 1.9.10, I get a UnicodeDecodeError when I query a database that contains any Unicode characters (in this special case, German umlauts). Setting the encoding='utf-8' option in sqlalchemy seems to be ignored completely since 1.9.

some traceback:

Traceback (most recent call last):
  File "debug.py", line 37, in <module>
    main.import_mediaserver()
  File "/volume1/@appstore/synodlnatrakt/share/SynoDLNAtrakt/synodlnatrakt/main.py", line 207, in import_mediaserver
    for result in dbresult:
  File "/volume1/@appstore/synodlnatrakt/share/SynoDLNAtrakt/lib/sqlalchemy/orm/query.py", line 2176, in __iter__
    return self._execute_and_instances(context)
  File "/volume1/@appstore/synodlnatrakt/share/SynoDLNAtrakt/lib/sqlalchemy/orm/query.py", line 2191, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/volume1/@appstore/synodlnatrakt/share/SynoDLNAtrakt/lib/sqlalchemy/engine/base.py", line 1450, in execute
    params)
  File "/volume1/@appstore/synodlnatrakt/share/SynoDLNAtrakt/lib/sqlalchemy/engine/base.py", line 1583, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/volume1/@appstore/synodlnatrakt/share/SynoDLNAtrakt/lib/sqlalchemy/engine/base.py", line 1690, in _execute_context
    context)
  File "/volume1/@appstore/synodlnatrakt/share/SynoDLNAtrakt/lib/sqlalchemy/engine/default.py", line 335, in do_execute
    cursor.execute(statement, parameters)
  File "/volume1/@appstore/synodlnatrakt/share/SynoDLNAtrakt/lib/pg8000/core.py", line 531, in execute
    self._c.execute(self, operation, args)
  File "/volume1/@appstore/synodlnatrakt/share/SynoDLNAtrakt/lib/pg8000/core.py", line 1554, in execute
    self.handle_messages(cursor)
  File "/volume1/@appstore/synodlnatrakt/share/SynoDLNAtrakt/lib/pg8000/core.py", line 1619, in handle_messages
    self._read(data_len - 4), cursor)
  File "/volume1/@appstore/synodlnatrakt/share/SynoDLNAtrakt/lib/pg8000/core.py", line 1607, in handle_DATA_ROW
    row.append(func(data, data_idx, vlen))
  File "/volume1/@appstore/synodlnatrakt/share/SynoDLNAtrakt/lib/pg8000/core.py", line 976, in text_recv
    data[offset: offset + length], self._client_encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 64: ordinal not in range(128)

With 1.08 it works perfectly, but sadly 1.08 sometimes gives me a different error:
pg8000.errors.NotSupportedError: type oid 705 not mapped to py type

SSL is very broken

I think if you test with ssl=True, you will find that it doesn't work at all, because the _sock_lock defined in the Connection object in protocol.py is initialized after its first use in __init__ (protocol.py:948).

Simply running with ssl=True should crash, but let me know if you need a traceback, and I can provide a patch if you are unable to recreate it.

'inf' and '-inf' in date, timestamp and interval types

If pg8000 tries to map date or timestamp columns with the value 'inf' or '-inf', it crashes.

From [https://github.com/mfenniak/pg8000/blob/master/pg8000/core.py#L938] ff. it seems not to be handled at all. My suggestion would be to implement it the way psycopg2 does, using date.max and datetime.max [http://initd.org/psycopg/docs/usage.html#infinite-dates-handling].
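
A sketch of the psycopg2-style mapping (illustrative only; pg8000's actual receive functions and timestamp formats differ):

    import datetime

    def timestamp_recv(text):
        # Clamp PostgreSQL's special values, as psycopg2 does.
        if text == "infinity":
            return datetime.datetime.max
        if text == "-infinity":
            return datetime.datetime.min
        # Real parsing must also handle timestamps without fractional seconds.
        return datetime.datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")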

Server version parsing fail on non-numeric versions

In core.py, lines 1723-1724:

            self._server_version = tuple(
                map(int, value.decode("ascii").split('.')[:2]))

This code raises an exception if Postgres has a non-numeric version, in my case '9.3beta2'. I made a workaround on my server, but value.decode("ascii").replace("beta", ".beta").split('.')[:2] is obviously not the best solution.
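
A more tolerant sketch that keeps only the leading digits of each component (an assumption about a possible fix, not the change that was eventually applied):

    import re

    def parse_server_version(value):
        # b'9.4beta2' -> (9, 4); b'9.3.5' -> (9, 3)
        parts = value.decode("ascii").split(".")[:2]
        return tuple(int(re.match(r"\d+", part).group()) for part in parts)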

Selecting nonstandard column types causes utf8 errors

While adding pg8000 support to pganalyze-collector, I noticed that a few non-standard column types cause errors.

Two I've identified so far are:

  • inet (as referenced in pg_stat_activity)
  • xid (as referenced in pg_locks)

Explicitly casting them to ::text works as expected.

Example output:

DEBUG - 2014-03-11 16:17:07,192 Running query: SHOW is_superuser
DEBUG - 2014-03-11 16:17:07,193 Running query: SHOW server_version_num
DEBUG - 2014-03-11 16:17:07,193 Running query: SHOW client_encoding
[{'client_encoding': u'UTF8'}]
DEBUG - 2014-03-11 16:17:07,193 Running query: SELECT extname FROM pg_extension
DEBUG - 2014-03-11 16:17:07,194 Found pg_stat_plans, using it for query information
[..]
DEBUG - 2014-03-11 16:17:07,346 Running query: SELECT * FROM pg_stat_bgwriter
DEBUG - 2014-03-11 16:17:07,347 Running query: SELECT * FROM pg_stat_database WHERE datname = current_database()
DEBUG - 2014-03-11 16:17:07,348 Running query: 
SELECT d.datname AS database,
       n.nspname AS schema,
       c.relname AS relation,
       l.locktype,
       l.page,
       l.tuple,
       l.virtualxid,
       l.transactionid,
       l.virtualtransaction,
       l.pid,
       l.mode,
       l.granted
FROM pg_locks l
LEFT JOIN pg_catalog.pg_class c ON l.relation = c.oid
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
LEFT JOIN pg_catalog.pg_database d ON d.oid = l.database
WHERE l.pid <> pg_backend_pid();

Traceback (most recent call last):
  File "./pganalyze-collector.py", line 1174, in <module>
    if __name__ == '__main__': main()
  File "./pganalyze-collector.py", line 1157, in main
    data['postgres'] = fetch_postgres_information()
  File "./pganalyze-collector.py", line 1059, in fetch_postgres_information
    info['locks']    = PI.Locks()
  File "./pganalyze-collector.py", line 386, in Locks
    return db.run_query(query)
  File "./pganalyze-collector.py", line 747, in run_query
    raise e
UnicodeDecodeError: 'utf8' codec can't decode byte 0x8e in position 1: invalid start byte

add support for ssh tunnel connection

Some database connection clients support connecting via SSH tunneling. One such example is Navicat, which can connect using an SSH tunnel. This is extremely useful because the PostgreSQL port (5432) is usually blocked by most firewalls, but the SSH port (22) is not, allowing us to connect to remote servers.
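
Until then, this can be done outside the driver with the third-party sshtunnel package (a sketch; all host names and credentials below are placeholders):

    import pg8000
    from sshtunnel import SSHTunnelForwarder

    with SSHTunnelForwarder(
        "bastion.example.com",
        ssh_username="me",
        remote_bind_address=("db.internal", 5432),
    ) as tunnel:
        # The tunnel exposes the remote 5432 on a local ephemeral port.
        conn = pg8000.connect(user="me", host="127.0.0.1",
                              port=tunnel.local_bind_port, database="mydb")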

Decimal 0.260 gets truncated to 0

I'm doing a very simple INSERT:

cur.execute('insert into foo values (%s)', [decimal.Decimal('0.260')])

I'd expect 0.26 to be stored in table foo, but psql shows it as 0.
Table schema: CREATE TABLE foo (a NUMERIC);

On the other hand, this simple select returns 0.2600:
cur.execute('SELECT %s', [decimal.Decimal('0.260')])

-ch

Repeated calls to convert_paramstyle() in executemany()

In executemany() I think there may be needless reparsing of the query; there may be scope for creating just new_args, as sketched below. The bit of code is:

for parameters in parameter_sets:
    new_query, new_args = convert_paramstyle(
        paramstyle, operation, parameters)

    self._stmt.execute(*new_args, stream=None)
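
A sketch of the proposal (compile_paramstyle is hypothetical: it would split today's convert_paramstyle into a one-off parse of the SQL plus a cheap per-row argument mapper):

    # Parse the operation once, outside the loop...
    new_query, make_args = compile_paramstyle(paramstyle, operation)

    for parameters in parameter_sets:
        # ...and only remap the argument values per parameter set.
        self._stmt.execute(*make_args(parameters), stream=None)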

Handle dates greater than 9999-12-31

I am working with a database that contains dates as far into the future as 10000-01-01. After getting some errors, I took a quick peek at the date handling code and discovered that it is hard-coded to work with four-digit years. The error I got was from datetime.date, about the month being out of range; I guess that's because the parameter passed in was the string '-0'. I'm not sure if this is something anyone cares about, but I thought I would mention it.

'ERROR', 'XX000', 'no unpinned buffers available'

I've recently upgraded to pg8000 1.9.9 from 1.08. I have a system that writes continuously to a database and it now only works for about 12 hours before I get the error
pg8000.errors.ProgrammingError: ('ERROR', 'XX000', 'no unpinned buffers available')
Currently I execute quite a few statements per transaction; could that be the problem?
Could someone tell me what this error means?

Empty array issue

SELECTing an empty array field (of any primitive type) raises an exception:

[03/Nov/2011:21:17:23] HTTP Traceback (most recent call last):
  ...
  File "pg8000/types.py", line 442, in array_recv
    element_len, = struct.unpack("!i", data[:4])
error: unpack str size too short for format

This is caused by the "data" variable containing an empty string.

A quick but working fix could be the following:

         if len(data):
             element_len, = struct.unpack("!i", data[:4])
         else:
             element_len = -1

No tags on github for pg8000 releases

I would expect git tags to be defined for each pg8000 release so far (i.e. 1.00 to 1.08) so that people can get a particular release from the history (and optionally via the GitHub download feature).

Note that git does not push tags by default, so this may just be an oversight.

LookupError: unknown encoding: UNICODE (Amazon RedShift) (for real)

This is a real issue with Amazon RedShift: "unicode" needs to be mapped to "utf8" in the pg_to_py_encodings table in the types.py file. Amazon seems to be using the nonstandard Postgres encoding name "UNICODE".

My hack solution without changing the driver is to do the following:

 # Bug fix so the driver can access AWS Redshift:
 from pg8000 import types
 types.pg_to_py_encodings["unicode"] = "utf8"

Wheel packaging format for pypi.org

Would it be possible to get a universal .whl package on PyPI?

If I'm right, it's just a matter of adding a setup.cfg file at the root level with

[wheel]
universal = 1

and then running this to upload to PyPI:

python setup.py sdist bdist_wheel upload -r pypi

COPY with Amazon Redshift gives ValueError: invalid literal for int() with base 10: 'COPY'

Execute of a "COPY" command against Amazon Redshift cluster returns the following traceback (pg8000 1.9.10 installed via pip on windows 7 64bit and python 2.6.6):

Traceback (most recent call last):
  File "samplercontrol.py", line 256, in <module>
    import_to_redshift(opts)
  File "samplercontrol.py", line 211, in import_to_redshift
    cursor.execute(sql)
  File "c:\python26\lib\site-packages\pg8000\core.py", line 531, in execute
    self._c.execute(self, operation, args)
  File "c:\python26\lib\site-packages\pg8000\core.py", line 1554, in execute
    self.handle_messages(cursor)
  File "c:\python26\lib\site-packages\pg8000\core.py", line 1619, in handle_messages
    self._read(data_len - 4), cursor)
  File "c:\python26\lib\site-packages\pg8000\core.py", line 1589, in handle_COMMAND_COMPLETE
    row_count = int(values[-1])
ValueError: invalid literal for int() with base 10: 'COPY'

There was an old issue about a similar error with SELECT, and SELECT indeed works fine in 1.9.10. Unfortunately, COPY gives the error above.

The particular COPY command used does not seem to matter, but this is the one I used:
COPY sampler_test FROM 's3://--deleted--.manifest' MANIFEST
CREDENTIALS 'aws_access_key_id=--deleted--;aws_secret_access_key=--deleted--;token=--deleted--' DELIMITER ' ' GZIP MAXERROR 5

Forced transactions prevent CREATE DATABASE

Although the docs claim pg8000 operates in an "autocommit" mode unless you start a transaction, the execute() function starts a transaction anyway, preventing any command that may not be executed inside a transaction (such as CREATE DATABASE).

The following is a patch I've used successfully in my efforts to make pg8000 work with Django. I realise it's quite minimal and probably not thought through quite far enough, but it works.

diff --git a/pg8000/dbapi.py b/pg8000/dbapi.py
index 43d2b6d..d9a2115 100644
--- a/pg8000/dbapi.py
+++ b/pg8000/dbapi.py
@@ -482,6 +482,9 @@ class ConnectionWrapper(object):
         self.notifies_lock = threading.Lock()
         self.conn.NotificationReceived += self._notificationReceived
 
+    def autocommit(self, state):
+        self.conn.autocommit = state
+
     @require_open_connection
     def begin(self):
         self.conn.begin()
diff --git a/pg8000/interface.py b/pg8000/interface.py
index acbfeba..24c4324 100644
--- a/pg8000/interface.py
+++ b/pg8000/interface.py
@@ -439,6 +439,7 @@ class Connection(Cursor):
         self._rollback = PreparedStatement(self, "ROLLBACK TRANSACTION")
         self._unnamed_prepared_statement_lock = threading.RLock()
         self.in_transaction = False
+        self.autocommit = False
 
     # An event handler that is fired when NOTIFY occurs for a notification that
@@ -491,6 +492,8 @@ class Connection(Cursor):
     def begin(self):
         if self.is_closed:
             raise ConnectionClosedError()
+        if self.autocommit:
+            return
         self._begin.execute()
         self.in_transaction = True

Disappearing u's in arrays of strings.

When you pass pg8000 a list of strings to be used as a PostgreSQL array, pg8000 quietly removes all of the u's. For example:

>>> import pg8000
>>> conn = pg8000.connect( ... )
>>> cursor = conn.cursor()
>>> cursor.execute('SELECT %s', (['fubar'],))
>>> cursor.fetchone()
[['fbar']]

Looking through the code a bit, I learned that this seems intentional: the translation table below converts a Python list repr into a Postgres array literal by mapping '[' and ']' to braces and deleting every space, apostrophe and 'u' (the unicode prefix), which also deletes those characters from the data itself:

arr_trans = dict(zip(map(ord, u("[] 'u")), list(u('{}')) + [None] * 3))
-------------------------------------^

Is this necessary for sanitizing strings? It's a hard problem to track down from a library user's perspective. In lieu of a more robust method for sanitizing array strings, perhaps pg8000 would be better off raising an exception saying that the letter 'u' is forbidden. (The same applies to ' ' [space] and ''' [apostrophe].)

Unicode arrays unsupported on Python 2.5

In core.py line 1717, the array contents are checked only for str, and this prevents inserts of Unicode arrays from working under Python 2.5 / Jython.

This patch works, although it may break other platforms:

--- a/pg8000/core.py
+++ b/pg8000/core.py
@@ -1714,7 +1714,7 @@ class Connection(object):
             else:
                 raise ArrayContentNotSupportedError(
                     "numeric not supported as array contents")
-        elif typ is str:
+        elif typ in (str,unicode):
             oid, fc, send_func = (25, FC_BINARY, self.py_types[str][2])
             array_typeoid = pg_array_types[oid]
         else:

dev_tasks.txt

I've removed dev_tasks.txt and put the contents below. If there are outstanding issues we can separate them out into individual issues.

  • user documentation for types module
    • documentation that includes example code of common things to do
      • common ops
      • insert bytea
      • timestamp manipulation
    • record support that works better, based upon class definitions and using decorators; both in and out support
    • replace generic exceptions with more specific exceptions
    • remove interface.py and move such logic into dbapi.py
      • document that fetchmany uses prepared statements
    • remove in-code documentation that belongs in sphinx templates
    • properly close all PreparedStatement instances during tests
    • review notify patch
    • unit tests
    • Support "void" return type from funcs, bug report from Ken Sell

Changes suggested by the web2py group

The people working on web2py have given their desiderata for pg8000. Here's an extract of an email that details some of the changes that they'd like:

Tony: there are other changes needed for web2py (see the diff for the web2py DAL I've attached earlier); the most important are:

  • missing __version__ attribute (it is now in setup.py; web2py needs it to differentiate driver capabilities)
  • connect doesn't support a dsn string anymore (you need to pass keyword parameters)
  • set_client_encoding is not present anymore (you need to execute SQL SET ...)
  • the server_version attribute has a leading underscore (_server_version); this is needed to detect server capabilities like JSON

Also, for the pg8000 driver currently in web2py, I'd applied many bugfixes reported in github / launchpad for the original project (mainly data types, unicode, importing, etc.).
Also, I'd improved the psycopg2 compatibility (i.e. set_client_encoding, autocommit, set_isolation_level), implementing the simple query protocol (the one that psycopg2 uses, which avoids the overhead of prepared statements for one-off queries) and two-phase commit support.

The last one would be important for web2py's distributed_transaction_commit, but currently gluon.dal sends raw SQL PREPARE TRANSACTION / COMMIT PREPARED / ROLLBACK PREPARED without using the DBAPI's proposed TPC connection methods (tpc_begin, tpc_prepare, tpc_commit, tpc_rollback, tpc_recover).

You can see the detail of the changes applied to the web2py contrib pg8000 here:

https://code.google.com/p/pg8000/source/list

https://github.com/reingart/pg8000/commits/master

executemany: wrong py_types calculation

I'm using pg8000 1.9.6, I have a table with integer fields, and I want to insert many rows at once.
Suppose I have these values:

params = ((1), (40000))

The executemany method of the Cursor class initializes the PreparedStatement with the first parameter set from "params" and tries to infer the field types from it. So in this case, using the value 1, it decides the type is int2, whose range is -32768 to 32767.
The problem is that the same PreparedStatement is used for all the parameter sets, and it fails with the second one, which is 40000.

In order to "fix" this, I've done this temporary patch:
http://pastebin.com/AABdPbNP

load credentials from .pgpass file

libpq implementations load credentials from the .pgpass file.

pg8000 should load credentials from the ~/.pgpass file, as the other drivers do.

This is very important because putting credentials in source code is a very bad practice, and at the moment the alternative is to manually implement reading them from somewhere else, instead of loading them from the standardised location.
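
A minimal sketch of the lookup (it ignores the backslash-escaping rules and the PGPASSFILE override that a full implementation would honour):

    import os.path

    def pgpass_lookup(host, port, database, user):
        wanted = (host, str(port), database, user)
        with open(os.path.expanduser("~/.pgpass")) as pgpass:
            for line in pgpass:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                fields = line.split(":")  # host:port:database:user:password
                if len(fields) != 5:
                    continue
                # '*' in any of the first four fields matches anything.
                if all(fld in ("*", w) for fld, w in zip(fields[:4], wanted)):
                    return fields[4]
        return None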

Speed up executemany by re-using prepared statement

I've added a new test to performance.py to measure how executemany performs. On my laptop it averages about a second. I've added an experimental branch called 'many' on my repo, which has an executemany that re-uses a single prepared statement:

tlocke/pg8000@cba3b18

It performs about 40% better, but is it a good approach? Since it only sends PARSE once, what happens if the first set of params had nulls? Are there other problems?

Supporting XML type

pg8000.errors.NotSupportedError is raised when pg8000 selects a column of xml type. Support for the XML type is needed.

Traceback (most recent call last):
  ...
  File "/Library/Python/2.6/site-packages/SQLAlchemy-0.6beta3-py2.6.egg/sqlalchemy/engine/default.py", line 266, in do_execute
    cursor.execute(statement, parameters)
  File "build/bdist.macosx-10.6-universal/egg/pg8000/dbapi.py", line 243, in _fn
    return fn(self, *args, **kwargs)
  File "build/bdist.macosx-10.6-universal/egg/pg8000/dbapi.py", line 312, in execute
    self._execute(operation, args)
  File "build/bdist.macosx-10.6-universal/egg/pg8000/dbapi.py", line 317, in _execute
    self.cursor.execute(new_query, *new_args)
  File "build/bdist.macosx-10.6-universal/egg/pg8000/interface.py", line 304, in execute
    self._stmt.execute(*args, **kwargs)
  File "build/bdist.macosx-10.6-universal/egg/pg8000/interface.py", line 139, in execute
    self._row_desc, cmd = self.c.bind(self._portal_name, self._statement_name, args, self._parse_row_desc, kwargs.get("stream"))
  File "build/bdist.macosx-10.6-universal/egg/pg8000/protocol.py", line 920, in _fn
    return fn(self, *args, **kwargs)
  File "build/bdist.macosx-10.6-universal/egg/pg8000/protocol.py", line 1085, in bind
    output_fc = [types.py_type_info(f) for f in row_desc.fields]
  File "build/bdist.macosx-10.6-universal/egg/pg8000/types.py", line 162, in py_type_info
    raise NotSupportedError("type oid %r not mapped to py type" % type_oid)
NotSupportedError: (NotSupportedError) type oid 142 not mapped to py type u'SELECT %s || CAST(thought.id AS TEXT) AS anon_1, CAST(xpath(%s, thought.body_ir) AS TEXT[]) AS anon_2, CAST(thought.deleted_at IS NOT NULL AS BOOLEAN) AS anon_3, thought.id AS thought_id, thought.topic AS thought_topic, thought.context AS thought_context, thought.author_name AS thought_author_name, thought.body_text AS thought_body_text, thought.body_ir AS thought_body_ir, thought.created_at AS thought_created_at, thought.updated_at AS thought_updated_at, thought.deleted_at AS thought_deleted_at \nFROM thought \nWHERE %s || CAST(thought.id AS TEXT) = %s \n LIMIT 1 OFFSET 0' (u'?', '//topic/@normal-name', u'?', u'?12345')

Returning row as list rather than tuple

When we build the result, we turn the row into a tuple:

ps._cached_rows.append(tuple(row))

In the interests of speed I'm proposing to do away with the conversion to tuple and just return the row as a list:

ps._cached_rows.append(row)

This wouldn't break the DBAPI 2.0 spec as I read it.

Does anyone have any objections?

rowcount not set for SELECT statements

The doc for rowcount says that rowcount should be updated by SELECT statements (if it can be determined by the interface):

    ##
    # This read-only attribute specifies the number of rows that the last
    # .execute*() produced (for DQL statements like 'select') or affected (for
    # DML statements like 'update' or 'insert').
    # <p>
    # The attribute is -1 in case no .execute*() has been performed on the
    # cursor or the rowcount of the last operation cannot be determined by
    # the interface.
    # <p>
    # Stability: Part of the DBAPI 2.0 specification.
    @property
    def rowcount(self):
        return self._row_count

But it seems that SELECT is not part of the "commands with count" in pg8000/core.py:

    self._commands_with_count = (
        b("INSERT"), b("DELETE"), b("UPDATE"), b("MOVE"),
        b("FETCH"), b("COPY"))

Hence, rowcount is never updated for SELECT statements, in pg8000/core.py:

    def handle_COMMAND_COMPLETE(self, data, cursor):
        values = data[:-1].split(BINARY_SPACE)
        command = values[0]
        if command in self._commands_with_count: # <---- Here
            row_count = int(values[-1])
            if cursor._row_count == -1:
                cursor._row_count = row_count
            else:
                cursor._row_count += row_count
        if command in DDL_COMMANDS:
            for k in self._caches:
                self._caches[k]['ps'].clear()

I suggest that we either try to determine rowcount for SELECTs, as the documentation says, or change the documentation to reflect the fact that pg8000 is not even trying to determine the rowcount for such statements.
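
For what it's worth, the server's CommandComplete tag for a SELECT is "SELECT n" on modern servers, so the count is available; the first option could be as small as adding SELECT to the tuple above (a sketch, not a tested patch):

    self._commands_with_count = (
        b("INSERT"), b("DELETE"), b("UPDATE"), b("MOVE"),
        b("FETCH"), b("COPY"), b("SELECT"))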

Statement number moved from module to connection

At the moment the statement and portal numbers (and their locks) are at module level, but I think we can make them connection level to reduce contention. Will have to check this to make sure.
