Prom

An opinionated, lightweight, asynchronous ORM for PostgreSQL or SQLite.

1 Minute Getting Started with SQLite

First, install prom:

$ pip install prom[sqlite]

Set an environment variable:

$ export PROM_DSN=sqlite://:memory:

Start Python using the asyncio REPL (so the top-level await statements below work):

$ python -m asyncio

Create a prom Orm:

>>> import prom
>>>
>>> class Foo(prom.Orm):
...     table_name = "foo_table_name"
...     bar = prom.Field(int)
...
>>>

Now go wild and create some Foo objects:

>>> for x in range(10):
...     f = await Foo.create(bar=x)
...
>>>

Now query them:

>>> f = await Foo.query.one()
>>> f.bar
0
>>> f.pk
1
>>>
>>> async for f in await Foo.query.in_bar([3, 4, 5]):
...     f.pk
...
3
4
5
>>>

Update them:

>>> async for f in await Foo.query:
...     f.bar += 100
...     await f.save()
...
>>>

and get rid of them:

>>> async for f in await Foo.query:
...     await f.delete()
...
>>>

Congratulations, you have now created, retrieved, updated, and deleted from your database.


Configuration

Prom can be automatically configured on import by setting the environment variable PROM_DSN.

The PROM_DSN should define a dsn url:

<full.python.path.InterfaceClass>://<username>:<password>@<host>:<port>/<database>?<options=val&query=string>#<name>

The built-in interface classes don't need their full Python paths; you can just use sqlite and postgres.

So, to use the built-in Postgres interface with the testdb database on host localhost, with username testuser and password testpw:

postgres://testuser:testpw@localhost/testdb

And to set it in your environment:

export PROM_DSN=postgres://testuser:testpw@localhost/testdb

After you've set the environment variable, you just need to import prom in your code:

import prom

and Prom will take care of parsing the dsn url(s) and creating the connection(s) automatically.
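
To make the url anatomy concrete, here's a standalone sketch that pulls a dsn apart with Python's urllib.parse. This is just an illustration of how the pieces map, not prom's actual parser, and the timeout option is a made-up example:

from urllib.parse import urlparse, parse_qs

dsn = "postgres://testuser:testpw@localhost:5432/testdb?timeout=5#conn_1"
parts = urlparse(dsn)

print(parts.scheme)            # interface: 'postgres'
print(parts.username)          # 'testuser'
print(parts.password)          # 'testpw'
print(parts.hostname)          # 'localhost'
print(parts.port)              # 5432
print(parts.path.lstrip("/"))  # database: 'testdb'
print(parse_qs(parts.query))   # options: {'timeout': ['5']}
print(parts.fragment)          # connection name: 'conn_1'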

Multiple db interfaces or connections

If you have multiple connections, you can set multiple environment variables by adding a numeric suffix:

export PROM_DSN_1=postgres://testuser:testpw@localhost/testdb1#conn_1
export PROM_DSN_2=sqlite://:memory:#conn_2

It's easy to have one set of prom.Orm children use one connection and another set use a different connection, since the fragment part of a prom dsn url sets the connection name:

import prom

class Orm1(prom.Orm):
    connection_name = "conn_1"
  
class Orm2(prom.Orm):
    connection_name = "conn_2"

Now, any child class that extends Orm1 will use conn_1 and any child class that extends Orm2 will use conn_2.

Creating Models

Check out the README to see how to define the db schema and create models your python code can use.

Querying Rows

Check out the README to see how to perform queries on the db.

Versions

While Prom will most likely work on other versions, it is tested on Python 3.10.

Installation

Postgres

If you want to use Prom with Postgres:

$ apt-get install libpq-dev python3-dev
$ pip install prom[postgres]

Prom

Prom installs using pip:

$ pip install prom[sqlite]
$ pip install prom[postgres]

and to install the latest and greatest:

$ pip install --upgrade "git+https://github.com/Jaymon/prom#egg=prom"

Using for the first time

Prom takes the approach that you don't want to be hassled with table installation while developing. When it tries to do something and sees that the table doesn't exist yet, it uses the fields defined on your prom.model.Orm child to create the table for you. That way you don't have to remember to run a script or craft a custom db query to add your tables; Prom takes care of that automatically.

Likewise, if you add a field (as long as the field is not required), prom will add that field to your table so you don't have to bother crafting ALTER queries while developing.

If you want to install the tables manually, you can create a script and use the Orm's install() method:

await SomeOrm.install()
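
Since install() is a coroutine, a minimal install script could look something like this (SomeOrm is just a stand-in for your own orm class):

import asyncio
import prom

class SomeOrm(prom.Orm):
    table_name = "some_orm_table"
    bar = prom.Field(int)

async def main():
    # creates the table using the fields defined on SomeOrm
    await SomeOrm.install()

if __name__ == "__main__":
    asyncio.run(main())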


prom's Issues

problem updating

First, I created a new object and just wanted to switch the pk to another value and update the database. That didn't work, so I had to do something like this:

c._id = c_id                  # point the orm at a different primary key
for k in c.fields:
    c.modified_fields.add(k)  # force every field to count as dirty
c.update()

Second, notice the c.update(); that's there because c.save() didn't work, and I need to figure out why it didn't. c_id was definitely a valid primary key.

One thing that could be done is: if the _id is set, trigger all the fields to be marked as modified. We could also add a force() method that would save all the fields regardless of whether they were dirty or not.
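
A minimal sketch of that workaround as a standalone helper, using the same fields/modified_fields/update() calls as the snippet above (force_update is a hypothetical name):

def force_update(orm):
    # mark every field as dirty so update() writes all of them
    for k in orm.fields:
        orm.modified_fields.add(k)
    orm.update()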

QueryField class

Query classes have to have Interface instances, and those interface instances know what type of interface they are (eg, SQLiteInterface), so they could have a field_class on them that the query builder could use to set values. Those query field classes could then generate the SQL instead of having all the crazy create-SQL code.

Likewise, you could have clause classes, like SelectClause, WhereClause, and SortClause, that could also be set and would attach QueryField instances.

This would also help with the new caching stuff I'm adding.
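
A rough sketch of what the clause idea could look like; every name here is hypothetical and the SQL generation is deliberately simplistic:

class QueryField:
    def __init__(self, name, value):
        self.name = name
        self.value = value

class WhereClause:
    def __init__(self):
        self.fields = []

    def append(self, field):
        self.fields.append(field)

    def render(self):
        # each QueryField contributes its own fragment instead of one
        # giant create-SQL method doing everything
        parts = ["{} = %s".format(f.name) for f in self.fields]
        return "WHERE " + " AND ".join(parts), [f.value for f in self.fields]

w = WhereClause()
w.append(QueryField("bar", 5))
print(w.render())  # ('WHERE bar = %s', [5])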

overriding indexes on subclasses

This should be supported

class Parent(prom.Orm):
    index_foo = prom.Index("foo")

class Child(Parent):
    index_foo = None

The Child class should be able to set indexes to None that it doesn't want to use.

Recovering from connection error

File "/usr/local/lib/python2.7/dist-packages/prom/query.py", line 573, in get_one
    d = self._query('get_one')
  File "/usr/local/lib/python2.7/dist-packages/prom/query.py", line 682, in _query
    return getattr(i, method_name)(s, self)
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/base.py", line 479, in get_one
    ret = self._get_query(self._get_one, schema, query, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/base.py", line 463, in _get_query
    ret = callback(schema, query, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/base.py", line 851, in _get_one
    return self.query(query_str, *query_args, fetchone=True, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/base.py", line 194, in query
    return self._query(query_str, query_args, **query_options)
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/base.py", line 625, in _query
    return ret
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/base.py", line 170, in connection
    self.raise_error(e)
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/base.py", line 155, in connection
    yield connection
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/base.py", line 595, in _query
    cur = connection.cursor()
InterfaceError: connection already closed

Prom should be able to recover from this error and only propagate it if reconnecting fails N times or so.
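
A sketch of what that recovery could look like as a decorator around the interface query methods; InterfaceError is stubbed here, and the close()/connect() calls are assumptions about the interface API:

import functools

class InterfaceError(Exception):
    # stand-in for prom's InterfaceError
    pass

def reconnecting(retries=3):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            for attempt in range(1, retries + 1):
                try:
                    return func(self, *args, **kwargs)
                except InterfaceError:
                    if attempt == retries:
                        # failed to reconnect N times, propagate
                        raise
                    self.close()    # assumed interface methods
                    self.connect()
        return wrapper
    return decorator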

Query magical-ness problem

Setup some classes like this:

class FooQuery(prom.Query):
    pass

class Foo(prom.Orm):
    pass

class BarQuery(prom.Query):
    pass

class Bar(Foo):
    pass

then do this:

q = Foo.query
q = Bar.query
print(type(q).__name__)

It won't be BarQuery as expected, but FooQuery. This is because the magic uses the class decorator, which makes the attribute behave more like a static value than a per-class one, so Foo setting it sets it for Bar also.

I think a fix to this might be to figure out a way to use @classmethod, which might solve it? I haven't done any research and it might not be possible. If going that route isn't possible, you could keep an internal dict that sets things up by class name.

with transaction() converts all errors to InterfaceErrors

this is unexpected behavior; it should be changed so that only the driver-specific errors (psycopg2, sqlite) are converted to InterfaceError, and everything else is left untouched:

try:
    with interface.transaction():
        raise StopIteration() # this will be converted to an InterfaceError
except StopIteration:
    pass # this will never be triggered

Query.like()

add a like where you can put in values:

Foo.query.like_bar("che%") # SELECT * FROM foo WHERE bar::text LIKE 'che%'


Make tests more flexible to multiple interfaces

Right now tests usually instantiate a Postgres or a SQLite interface directly. It would be cool if a test runner could be added that sets the environment variable to Postgres and runs all the tests, then sets the environment variable to SQLite and runs them all again. That way you would only have to modify one thing (like adding a MySQL dsn) and the suite would automatically run against the new environment.

Something like this should allow us to consolidate the Postgres gevent tests because they could be simplified to just another dsn.

For an example of a bad test that instantiates each interface directly, see config_test.ObjectFieldTest.
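
Something like this runner sketch is what's being described; the dsn values are placeholders, and pytest is assumed as the test entry point:

import os
import subprocess
import sys

dsns = [
    "postgres://testuser:testpw@localhost/testdb",
    "sqlite://:memory:",
    # adding MySQL later would just be another entry here
]

for dsn in dsns:
    print("=== running tests against {} ===".format(dsn))
    env = dict(os.environ, PROM_DSN=dsn)
    ret = subprocess.run([sys.executable, "-m", "pytest"], env=env)
    if ret.returncode != 0:
        sys.exit(ret.returncode)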

Add iterator_class to the model

The query instance should then use the iterator class to create the instances. This would let us customize iterators for certain models, and since the iterator is not a property of the query class, we wouldn't have to create custom query classes just to get a custom iterator.
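
A sketch of the proposed hook; iterator_class doesn't exist yet, and the prom.Iterator base class path is an assumption:

import prom

class ShoutingIterator(prom.Iterator):  # base class path assumed
    def __next__(self):
        o = super().__next__()
        o.name = o.name.upper()  # customize instances as they stream out
        return o

class User(prom.Orm):
    table_name = "user_table"
    name = prom.Field(str)
    iterator_class = ShoutingIterator  # proposed attribute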

DsnConnection should pass parsing dsn to interface

It could first parse the interface modulepath and then use that to call modulepath.parse_dsn(), a classmethod that parses the dsn. This would allow the SQLite-specific code (the :memory: stuff) to be moved out of DsnConnection.__init__, and the interface connection method that normalizes the path, host, and database variables could be consolidated into the parse_dsn method too.
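
A minimal sketch of the proposed classmethod; the real signature and return shape would be up to the interface:

class SQLiteInterface:
    @classmethod
    def parse_dsn(cls, dsn):
        # SQLite-specific parsing (like :memory:) lives here instead
        # of in DsnConnection.__init__
        body = dsn.split("://", 1)[1]
        body, _, name = body.partition("#")
        return {"database": body, "name": name}

print(SQLiteInterface.parse_dsn("sqlite://:memory:#conn_2"))
# {'database': ':memory:', 'name': 'conn_2'}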

more differentiated exceptions

Right now everything just gets raised as an InterfaceError, but this should be broken down further: there should be IntegrityError, ConnectionError, etc., all children of InterfaceError. That way we can normalize the errors between all interfaces and still pick and choose what we want to catch. Right now, if you just want to ignore an integrity error you have to catch all the errors, which is pretty bad, because the code might silently ignore a more serious error.
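
The proposed hierarchy might look like this sketch (names taken straight from the paragraph above; note that ConnectionError would shadow the builtin of the same name):

class InterfaceError(Exception): pass
class ConnectionError(InterfaceError): pass  # shadows the builtin; naming TBD
class IntegrityError(InterfaceError): pass

def insert_duplicate_row():
    # stand-in for an insert that violates a unique constraint
    raise IntegrityError("duplicate key")

try:
    insert_duplicate_row()
except IntegrityError:
    pass  # ignore duplicates only; a ConnectionError would still bubble up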

if passing pickle into field

Right now we have a dump field that maps to TEXT, but it would be better if it mapped to BLOB; then we wouldn't have to worry about base64 encoding the pickled objects.

prom base cli command

It would be cool to have a simple prom command:

$ prom

That would connect you to whatever shell belongs to the database your PROM_DSN points to, without you having to do anything else. I think we would need pexpect, though, to hand control from the python script to the db shell (eg, psql):

import pexpect
child = pexpect.spawn('psql ...')
child.interact()  # give control of the child to the user
# (assumes the @arg decorator from the cli library and prom's
# get_interface are already imported)
@arg("--connection-name", "-c",
    dest="conn_name",
    default="",
    help="the connection name (from prom dsn) you want to restore")
def main(conn_name):
    """quick way to get into the db shell using prom dsn"""
    inter = get_interface(conn_name)
    conn = inter.connection_config

    if "postgres" in conn.interface_name.lower():
        # TODO -- need to write password to pgpass file and set environment variable
        # and call psql, I don't think you can put the password on the cli
        cmd = [
            "psql",
            "--dbname",
            conn.database,
            "--username",
            conn.username,
            "--password",
            conn.password,
            "--host",
            conn.host,
            "--port",
            str(conn.port),
        ]

    elif "sqlite" in conn.interface_name.lower():
        cmd = [
            "sqlite3",
            conn.database
        ]
    else:
        raise RuntimeError("Unsupported interface")

    # pexpect code goes here

Remove update index

Not sure we need the automatically created indexes on both _created and _updated.

break up isetter to iupdate and icreate

It would just be easier if all the methods had the very same interface: you get passed the value, and you return a value. Right now I have to look up isetter every time I want to use it, because this is the definition:

@foo.isetter
def foo(cls, val, is_update, is_modified):
    ...

I would also just like to make this whole interface better and easier to use. I ran into an issue the other day where this method was called with val=None when the field wasn't actually being set, which broke the principle of least surprise. I just think the whole interface can be cleaned up a bit; maybe also add an inochange method or something like that.

Query should run values through isetter also

I'm not sure if it does this or not, but Query should run values through the isetter method before it runs the query. That way, if you have a mapping from string to int in the isetter methods, say something like:

mapping = {
    "foo": 1,
}

Then doing something like:

MyOrm.query.is_name("foo")

the "foo" value would be converted to 1 before running the query.

Expand Field with dbset and dbget methods

They should act similar to the fget and fset methods, but fire when the field is about to be written to the db or has just been read from the db.

the dbset method:

def dbset(val, is_update=False, is_modified=False):
    """
    run this method right before the field is to be passed to the db to be written

    val -- mixed -- the current value the field contains
    is_update -- boolean -- True if updating, False if inserting
    is_modified -- boolean -- True if the field has been modified since last pull from db
    """
    return val

The dbget method:

def dbget(val):
    """val has just been pulled from the db, the returned val will be put into the field"""
    return val

adding a key to a JsonField

not sure what the fix for this looks like, but if you have a json field that contains a dict, and you then modify that dict, it doesn't update modified_fields, so calling save() won't actually save anything.

class Foo(Orm):
    bar = JsonField(True)

f = Foo.create(bar={"che": 1})
f.bar["baz"] = 2
f.save() # this won't actually update

f.modified_fields.add("bar")
f.save() # now it will update correctly

Let's do some locking

We should be able to pass in serializable=True to the transaction method to cause the begin to do:

BEGIN ISOLATION LEVEL SERIALIZABLE;

Also, it would be cool to add a .lock() method to query that would cause it to add the FOR UPDATE at the end:

SELECT balance FROM accounts WHERE user_id = 1 FOR UPDATE;


add Orm.__delattr__

if the attribute is a schema field, it should be set to None; otherwise, move down the line to normal attribute deletion
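
A minimal sketch, assuming the orm can look up its schema fields by name:

class Orm:
    # ... rest of the class ...

    def __delattr__(self, name):
        if name in self.schema.fields:  # schema lookup is assumed
            # schema fields get nulled instead of removed
            setattr(self, name, None)
        else:
            # everything else moves down the line to normal deletion
            super().__delattr__(name)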

Prom async bug

On one of our tests, the foo/bar script uses greenthreads, and when the two clients start up without a wait between them, they cause a connection pool error:

c1 = ScriptClient("foo/bar")
r1 = c1.run_async()

c2 = ScriptClient("foo/bar")
r2 = c2.run_async()

c1.wait()
c2.wait()

The error:

'NoneType' object has no attribute 'getconn'
Traceback (most recent call last):
  ...
  File "/usr/local/lib/python2.7/dist-packages/prom/query.py", line 573, in get_one
    d = self._query('get_one')
  File "/usr/local/lib/python2.7/dist-packages/prom/query.py", line 682, in _query
    return getattr(i, method_name)(s, self) # i.method_name(schema, query)
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/base.py", line 485, in get_one
    ret = self._get_query(self._get_one, schema, query, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/prom/decorators.py", line 58, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/base.py", line 453, in _get_query
    with self.connection(**kwargs) as connection:
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/base.py", line 177, in connection
    self.raise_error(e)
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/base.py", line 167, in connection
    connection = self.get_connection()
  File "/usr/local/lib/python2.7/dist-packages/prom/interface/postgres.py", line 98, in get_connection
    connection = self.connection_pool.getconn()
AttributeError: 'NoneType' object has no attribute 'getconn'

This only shows up in tests. I think the reason is that we've cleared all the tables, so prom's error-handling table creation code kicks in on the first connection, and then the second connection causes a problem?

EnumField

a field that converts to an int behind the scenes, but allows the live object to have a set of values, so you could do something like:

class Foo(Orm):
  bar = EnumField("a", "b", "c", "d")

f = Foo()
f.bar = ["a", "b"]
f.save() # bar will save into the db as 3

f2 = f.query.get_pk(f.pk)
f2.bar # ["a", "b"]

So the enum field will set the appropriate field setters to make sure the live orm always has a set of values and the db always gets the int. When the field is created, it takes the values you passed in and assigns each one a value in the order it was passed, doubling as it goes, so each value gets its own power-of-two bit:

EnumField("a", "b", "c", "d")
a = 1
b = 2
c = 4
d = 8
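
A standalone sketch of that encoding, separate from any prom machinery; each value gets its own bit, and a set of values round-trips through an int:

def make_enum(*values):
    bits = {v: 1 << i for i, v in enumerate(values)}  # a=1, b=2, c=4, d=8

    def encode(vals):
        # what the db would store, eg {"a", "b"} -> 3
        return sum(bits[v] for v in set(vals))

    def decode(n):
        # what the live orm would see, eg 3 -> {"a", "b"}
        return {v for v, b in bits.items() if n & b}

    return encode, decode

encode, decode = make_enum("a", "b", "c", "d")
print(encode(["a", "b"]))  # 3
print(decode(3))           # {'a', 'b'}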

looks like there might be a bug in all()

if you specify a limit and an offset and then call all(), it looks like it still moves through all the rows instead of just the slice of rows delineated by limit and offset.

Orm.is_modified() should be True only if the new value set is different from the old value.

This is a debatable addition to prom. The problem that prompted this issue is /customer/update is called with timezone data every time the iOS app is opened. When new code was added to update Salesforce whenever the User is updated, this caused the Salesforce queue to grow faster than the requests could be processed.

So, should this kind of work be done by the client or the orm library? The decision is yours!

Orm.read_only

if you set this to True on a prom.Orm instance, it will cause insert() and update() to fail.

I was in a situation where I needed to mimic some production data in the test environment, and it would've been nice to have that added level of WTF-protection just in case I accidentally triggered a change.

Moving through rows using pk or limit/offset

This was a rough test, but there didn't seem to be any meaningful difference between the two approaches:

with pout.p("cursor"):
    user_id = 0
    while True:
        users = User.query.gt_pk(user_id).asc_pk().get(5000)
        if users:
            for u in users:
                user_id = u.pk
                print user_id
        else:
            break


with pout.p("all"):
    for u in User.query.asc_pk().all():
        user_id = u.pk
        print user_id

a small map/reduce like iterator

it would be cool to have something that could break apart a table so each thread works on one part of it. The idea:

# imagine the user table has 100 rows and is managed by the User prom.Orm child

def names_that_begin_with_f(u):
    return 1 if u.name.lower().startswith("f") else 0

p = MR(threads=10, orm_class=User, target=names_that_begin_with_f)

p.run()

What the run() method will do is find out how many rows are in the table, create the requested number of threads, divide the row count by the number of threads, and tell each thread to work on its own section of the table. Each thread then goes through its section using all() and returns the results.

There are times where we want to do some simple crunching and this would speed that up quite a bit.
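
A sketch of what run() could do, using the synchronous query style from the surrounding issues; the MR class itself plus the count(), limit(), and offset() usage are assumptions:

import threading

def run(orm_class, target, thread_count):
    total = orm_class.query.count()
    per_thread = total // thread_count + 1
    results = [None] * thread_count

    def worker(i):
        # each thread crunches its own slice of the table
        q = orm_class.query.limit(per_thread).offset(i * per_thread)
        results[i] = [target(o) for o in q.all()]

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results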

additions to Pool class

add to utils.Pool; this will allow setting an object without having to go through create_value. The __missing__ code would also need to be simplified to account for the new __setitem__ stuff:

    # TODO -- move this into prom
    def __setitem__(self, key, val):
        super(UserPool, self).__setitem__(key, val)
        try:
            self.pq.add(key, val)
        except OverflowError:
            self.popitem()
            self.pq.add(key, val)

    # TODO -- move this into prom
    def popitem(self):
        dead_key, dead_val, dead_priority = self.pq.popitem()
        del self[dead_key]
        return dead_key, dead_val

add Field db=False flag

this will cause a field that is passed into __init__() or modify() to be set on the class instance but not persisted to the db. That way it can go through all the normal Field validation (like the normalize method) if need be, but it ultimately won't be a column in the db.
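
Proposed usage would look something like this sketch (db=False doesn't exist yet):

import prom

class Foo(prom.Orm):
    table_name = "foo_table"
    bar = prom.Field(int)
    scratch = prom.Field(str, db=False)  # validated, but never a db column

f = Foo(bar=1, scratch="instance-only value")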

Cleanup Query api

I would like to explore the following interface:

  1. if you add select fields, then get() or all() would return just the selected fields, not Orm instances. Likewise, get_one() would return just the value, not an orm instance. Basically, we never want half-populated orm objects to be the default; if you did want a half-populated orm object, you could just create it and pass in the returned values. We almost exclusively want lists when we use the select() method.

  2. change get_one() to one() so that it is more in line with first() and last(). I know why I named it the way I did: to make it similar to get() and get_pk().

  3. remove Query value() and values(); because of number 1 above we wouldn't need them any longer, and if you ever needed them with the full fields, they would still exist on the returned iterator.

query.ref fails when the pk is passed in

self.query.ref("foo.bar.Che", self.pk)

This will return every foo.bar.Che in the db, not just the foo.bar.Che matching self.pk. At the very least, if a value is passed in and the discover code can't match it up, an exception should be raised, because the last thing we want is all the rows being returned.

Cleanup Iterator api

Merge AllIterator and Iterator into one class with a flag. I think I've considered this before and it was hairy, but it's worth it so we can layer iterators on top of these iterators, like a UniqueIterator that skips rows that have been seen before, or a FilterIterator that takes a filter callback.

Basically, right now we can't do cool iterators because there are two distinct paths that can be followed (get() and all()), and both would need to be supported.

Orm.pool() method improvements

it would be cool to treat this method differently depending on whether it was called on the class or on an instance

So, if it was called on just the class, nothing would change; but if it is called on an instance, the returned Pool instance will already have that instance in its cache:

class Foo(Orm): pass

Foo.pool() # nothing changes, it returns an empty Pool instance

f = Foo.create()
f.pool() # the Pool instance contains f instance at f.pk

get_modified sets container fields as modified but depopulate doesn't

Similar to the problem with JsonField: let's say you have a custom container that is stored in the db as a string, but becomes a set() (or something similar) when pulled out of the db. Well, if you did something like:

f = FooOrm()
f.myset.add("val")

then modified_fields wouldn't register that the field has changed, so there needs to be a way to account for this.

possible solutions:

  1. pass a value into the Field() creation that basically says to always mark this as modified, something like: myset = Field(str, dirty=True)

  2. when pulling values from the db, take hashes of the values and compare those hashes with the current values to determine what is modified. This would be a wholesale rewrite of the current system and would be way slower when pulling rows from the db, but it would be generic and wouldn't require the end user to be aware of anything.

  3. when pulling out of the db, if after calling Field.isetter() the value is no longer the same type the field declares (ie, myset is stored in the db as a string but after calling the isetter it is now a set), then automatically mark that field as modified. This would replace the current checks for ObjectField and JsonField and the like, since this system would be pretty generic, relatively fast (only one extra isinstance() check), and wouldn't involve the end user having to do anything; see the sketch after this list.
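
A sketch of option 3; the hydration hook and the field attributes used here are hypothetical stand-ins for whatever prom actually uses internally:

def hydrate_field(orm, field_name, field, raw_val):
    # hypothetical: runs as each raw db value is loaded into the orm
    val = field.isetter_value(orm, raw_val)  # assumed isetter invocation
    setattr(orm, field_name, val)
    if not isinstance(val, field.interface_type):  # eg set vs str
        # the type changed on the way out of the db, so assume the
        # container can be mutated without prom noticing -- mark it dirty
        orm.modified_fields.add(field_name)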

auto create orms

given a table schema, output a valid Orm class that has fields and indexes for all the columns in the db

add unique select

Foo.unique("bar_id").gt__created(now - ago).lt__created(now).values()

Would result in:

SELECT DISTINCT bar_id FROM foo WHERE _created > ... AND _created < ...
