clarkduvall / serpy

ridiculously fast object serialization

Home Page: http://serpy.readthedocs.org/en/latest/

License: MIT License

Python 99.88% Shell 0.12%

serpy's Introduction

serpy: ridiculously fast object serialization

serpy is a super simple object serialization framework built for speed. serpy serializes complex datatypes (Django Models, custom classes, ...) to simple native types (dicts, lists, strings, ...). The native types can easily be converted to JSON or any other format needed.

The goal of serpy is to be able to do this simply, reliably, and quickly. Since serializers are class based, they can be combined, extended and customized with very little code duplication. Compared to other popular Python serialization frameworks like marshmallow or Django Rest Framework Serializers, serpy is at least an order of magnitude faster.

Source

Source at: https://github.com/clarkduvall/serpy

If you want a feature, send a pull request!

Documentation

Full documentation at: http://serpy.readthedocs.org/en/latest/

Installation

$ pip install serpy

Examples

Simple Example

import serpy

class Foo(object):
    """The object to be serialized."""
    y = 'hello'
    z = 9.5

    def __init__(self, x):
        self.x = x


class FooSerializer(serpy.Serializer):
    """The serializer schema definition."""
    # Use a Field subclass like IntField if you need more validation.
    x = serpy.IntField()
    y = serpy.Field()
    z = serpy.Field()

f = Foo(1)
FooSerializer(f).data
# {'x': 1, 'y': 'hello', 'z': 9.5}

fs = [Foo(i) for i in range(100)]
FooSerializer(fs, many=True).data
# [{'x': 0, 'y': 'hello', 'z': 9.5}, {'x': 1, 'y': 'hello', 'z': 9.5}, ...]
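Because .data is built from plain dicts and lists, it can be passed straight to the standard json module. A minimal sketch, using a literal dict as a stand-in for the FooSerializer(f).data output shown above so that the snippet runs without serpy installed:

```python
import json

# Stand-in for FooSerializer(f).data from the Simple Example.
data = {'x': 1, 'y': 'hello', 'z': 9.5}

# sort_keys keeps the key order stable regardless of dict ordering.
encoded = json.dumps(data, sort_keys=True)
print(encoded)
# {"x": 1, "y": "hello", "z": 9.5}
```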

Nested Example

import serpy

class Nestee(object):
    """An object nested inside another object."""
    n = 'hi'


class Foo(object):
    x = 1
    nested = Nestee()


class NesteeSerializer(serpy.Serializer):
    n = serpy.Field()


class FooSerializer(serpy.Serializer):
    x = serpy.Field()
    # Use another serializer as a field.
    nested = NesteeSerializer()

f = Foo()
FooSerializer(f).data
# {'x': 1, 'nested': {'n': 'hi'}}

Complex Example

import serpy

class Foo(object):
    y = 1
    z = 2
    super_long_thing = 10

    def x(self):
        return 5


class FooSerializer(serpy.Serializer):
    w = serpy.Field(attr='super_long_thing')
    x = serpy.Field(call=True)
    plus = serpy.MethodField()

    def get_plus(self, obj):
        return obj.y + obj.z

f = Foo()
FooSerializer(f).data
# {'w': 10, 'x': 5, 'plus': 3}

Inheritance Example

import serpy

class Foo(object):
    a = 1
    b = 2


class ASerializer(serpy.Serializer):
    a = serpy.Field()


class ABSerializer(ASerializer):
    """ABSerializer inherits the 'a' field from ASerializer.

    This also works with multiple inheritance and mixins.
    """
    b = serpy.Field()

f = Foo()
ASerializer(f).data
# {'a': 1}
ABSerializer(f).data
# {'a': 1, 'b': 2}

License

serpy is free software distributed under the terms of the MIT license. See the LICENSE file.

serpy's Issues

Push a new release to PyPI

Would it be possible to create a new release and push it up to PyPI? Thanks!

Question: RelatedManager (Django)

I currently do an ugly hack to serialize object with a RelatedManager attribute.

I have two objects, House and Door. A house has several doors. I want to serialize a house with nested serialization.

home.doors = home.door_set.all()
data = HomeSerializer(home).data

I mainly do this because RelatedManager is not iterable. Do you have a better way?
I would like to use the same serializer for a Door and for a RelatedManager of Door.

class HomeSerializer(serpy.Serializer):
    door_set = DoorSerializer(many=True)  # Does not work

But I could not find a proper way to do this. I only started using your lib (which is awesome btw!) yesterday, so I may be missing something.

It would be great if you could add an example of an object with a RelatedManager attribute to the documentation.

include test suite in PyPi tarball

Hi,
could you please consider adding tests/ to the PyPI tarball, so we can use it for testing the package?

No way to omit field from output

With the reversion of #38, it seems there is currently no way to omit a value from the serialized output. I'm happy to submit a PR, but would appreciate some direction to know what would be an appropriate solution:

Add an option to the field, omit_if_null
Add an option to the serializer initializer, omit_if_null
???
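One workaround that needs no change to serpy itself is to post-process the serialized output. A minimal sketch, assuming the output is a plain dict or a list of dicts; drop_none is a hypothetical helper name, not part of serpy:

```python
def drop_none(data):
    """Return a copy of serialized output with None-valued keys removed.

    Hypothetical helper, not part of serpy. Recurses into nested dicts
    and lists so nested serializer output is handled too. None items
    inside a list are kept; only dict keys are dropped.
    """
    if isinstance(data, dict):
        return {k: drop_none(v) for k, v in data.items() if v is not None}
    if isinstance(data, list):
        return [drop_none(item) for item in data]
    return data


# Stand-in for a serializer's .data output.
serialized = {'a': 1, 'b': None, 'nested': {'c': None, 'd': 2}}
cleaned = drop_none(serialized)
print(cleaned)
```

This costs an extra pass over the output, so a field-level option would still be the faster solution.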

Methods are not overridden by the child class's implementation

If I have a serializer, TopLevelSerializer with a MethodField which uses a method get_thing to calculate the value of the field, when I create another serializer which inherits from TopLevelSerializer and re-implement get_thing on that child class, I expect the child class's implementation of get_thing to be called when I serialize an object with the child class. However, the TopLevelSerializer's implementation is always called. This breaks the standard way that inheritance is supposed to work in Python.

Control over labels

There are cases where an application may need control over the output label where it might differ from the attribute name.

The case I am working on is a JSON-LD serializer, where labels can start with the @ prefix: @id, @type, etc. This is not a valid attribute name in Python, but it should be reflected in the serialized output.

I've tried to get around this with custom subclasses, but adherence to the attribute name as output label is fairly deep in the functioning of the serializer.

So what I propose is to add an optional attribute, label, to serializer fields that would get carried along in the _compile_field_to_tuple tuple. If label is not None, the _serialize method uses it as the key in the output instead of name.

If this sounds like something you would find useful, I could put together a pull request and send it upstream.
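Without such an option, one possible stopgap is to rename keys after serialization. A minimal sketch; relabel is a hypothetical helper, not part of serpy, and the dict below stands in for a serializer's .data output:

```python
def relabel(data, labels):
    """Rename keys of a serialized dict according to a {name: label} map.

    Hypothetical post-processing helper, not part of serpy. Keys that
    are missing from `labels` are kept unchanged.
    """
    return {labels.get(key, key): value for key, value in data.items()}


# Stand-in for a serializer's .data output.
serialized = {'id': 'http://example.com/1', 'type': 'Person', 'name': 'Ada'}
json_ld = relabel(serialized, {'id': '@id', 'type': '@type'})
print(json_ld)
```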

Context is not set on serializers

There is a context keyword argument, but values passed there are not set on the object.

See: https://github.com/clarkduvall/serpy/blob/master/serpy/serializer.py#L87

Remove Python 2.6 from testing matrix

Should python 2.6 be removed from the Travis testing matrix? It is no longer supported by Django...

Django GeoJSON / Geometry fields.

Is there a way to get these to work? I've had a go, but it doesn't seem to like them being treated as JSON fields.

I get the following message:

"Object of type 'Point' is not JSON serializable"

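The quoted message is what Python's json module reports for an object it cannot encode, so one option is to convert geometry objects to plain types at encoding time. A minimal sketch using the default hook of json.dumps; the Point class here is a stand-in with assumed x and y attributes, and encode_geometry is a hypothetical name:

```python
import json


class Point(object):
    """Stand-in for a geometry object; only x and y are assumed here."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def encode_geometry(obj):
    """json.dumps `default` hook: turn a Point into a GeoJSON-like dict."""
    if isinstance(obj, Point):
        return {'type': 'Point', 'coordinates': [obj.x, obj.y]}
    # Anything else is still unsupported, so report it the usual way.
    raise TypeError('Object of type %s is not JSON serializable'
                    % type(obj).__name__)


data = {'name': 'depot', 'location': Point(1.5, 2.0)}
encoded = json.dumps(data, default=encode_geometry, sort_keys=True)
print(encoded)
```

The same conversion could instead be done inside a serializer method, so that the output already contains only native types.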
Expose fields such that swagger can autogenerate models/code?

Hi, I'm trying to use Serpy in my project. I have a very simple userprofile model that I'm testing with Serpy, which I have pasted below. I've also included two images that show the django serializer versus serpy, and how that renders in Swagger. Any idea how to make it play nicely with each other?

class UserProfileSerializer(serpy.Serializer):
    class Meta:
        model = UserProfile
        fields = (
            'id',
            'user',
            'notes',
            'created_at',
            'updated_at'
        )
        read_only_fields = ('created_at', 'updated_at')

    id = serpy.Field()
    user = serpy.StrField()
    notes = serpy.StrField()
    created_at = serpy.Field()
    updated_at = serpy.Field()

Inconsistent: child serializer required=False returns None

Serpy seems to be designed to omit a field from the serialized data if the field has required=False

However if you use a Serializer as a field, i.e. a child serializer, then the field is always present in the data.

Is this inconsistency deliberate?

...or should this part maybe not be indented as part of the else block?
https://github.com/clarkduvall/serpy/blob/master/serpy/serializer.py#L110

is this library no longer maintained ?

Can't install serpy from source

It seems like it's not possible to install serpy from source:

<trimmed>
Obtaining serpy from git+git://github.com/infoxchange/serpy.git@88c9362b5430e9902670a2460d84f0dad684fec9#egg=serpy (from -r requirements.txt (line 104))
  Cloning git://github.com/infoxchange/serpy.git (to 88c9362b5430e9902670a2460d84f0dad684fec9) to /virtualenv/src/serpy
  Could not find a tag or branch '88c9362b5430e9902670a2460d84f0dad684fec9', assuming commit.
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/virtualenv/src/serpy/setup.py", line 4, in <module>
        import serpy
      File "serpy/__init__.py", line 1, in <module>
        from serpy.fields import (
      File "serpy/fields.py", line 1, in <module>
        import six
    ImportError: No module named six

It looks like setup.py imports serpy in order to get the version string from /serpy/__init__.py. __init__.py in turn tries to import six, which hasn't been installed as a dependency because setup() hasn't run yet.

tox.ini installs six separately, which I guess is why this hasn't been seen in travis...
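A common way around this kind of failure is for setup.py to read the version string as text rather than importing the package. A minimal sketch, assuming the version is stored as a simple __version__ = '...' assignment (an assumption about the file layout); read_version is a hypothetical helper name:

```python
import re


def read_version(source):
    """Extract a __version__ string from module source text.

    Works on the text of the file, so the module (and its imports,
    such as six) never has to be executed.
    """
    match = re.search(r"^__version__\s*=\s*['\"]([^'\"]+)['\"]",
                      source, re.MULTILINE)
    if match is None:
        raise RuntimeError('no __version__ assignment found')
    return match.group(1)


# Illustrative module text; in setup.py this would come from reading
# the package's __init__.py file.
example_source = "import six\n__version__ = '0.1.0'\n"
version = read_version(example_source)
print(version)
# 0.1.0
```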

How to serialize django model using serpy?

Kinda new to Python and Django, so I apologize if this is a noobish question. How would I serialize the following related models using serpy instead of DRF ModelSerializer, and then use the serpy serializer in a DRF ModelViewSet?

Models:

class Person(models.Model):
    first_name = models.CharField(_("First Name"), max_length=128, null=True)
    middle_name = models.CharField(_("Middle Name"), max_length=128, blank=True, null=True)
    last_name = models.CharField(_("Last Name"), max_length=128, null=True)
    email = models.EmailField(_("Email"), blank=True, null=True)

    def __unicode__ (self):
        return '%s: %s %s %s' % (self.id, self.first_name, self.middle_name, self.last_name)

class Bookmark(models.Model):
    person = models.ForeignKey(Person, related_name='bookmarks', on_delete=models.CASCADE)
    title = models.CharField(_("Title"), max_length=9999, blank=True, null=True)
    url = models.URLField(_("Url"), max_length=9999, blank=True, null=True)

    def __unicode__ (self):
        return '%s %s' % (self.title, self.url)

DRF ModelSerializers:

class BookmarkSerializer(ModelSerializer):
    class Meta:
        model = Bookmark
        fields = ('id', 'title', 'url', 'person')

class PersonSerializer(ModelSerializer):
    bookmarks = BookmarkSerializer(many=True, read_only=True)

    class Meta:
        model = Person
        fields = ('id', 'first_name', 'middle_name', 'last_name', 'email', 'bookmarks')

DRF ModelViewSet:

class PersonViewSet(viewsets.ModelViewSet):
    queryset = Person.objects.all()
    serializer_class = PersonSerializer

Type Hints

First, nice library!

I'm integrating serpy into some of my projects, and I need to exclude serpy from the mypy type check because serpy does not have any type information.

Once serpy supports Python 3 and Python 2 is unsupported, serpy could include type hint annotations.

DateField is not supported

I understand we can still define a DateField using serpy.MethodField(), but it would be handy to have one built in.

Patch for compatibility with RelatedManager (reverse foreign-key)

It wasn't working with nested reverse foreign keys; I needed to add .all():

def to_value(self, instance):
    fields = self._compiled_fields
    if self.many:
        serialize = self._serialize
        return [serialize(o, fields) for o in instance.all()]
    return self._serialize(instance, fields)

This makes it possible to do something like:

class ParentSerpy(serpy.Serializer):
    children = ChildSerpy(attr='child_set', many=True)
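Note that the patch above calls .all() unconditionally, so a plain list passed with many=True would no longer work. A sketch of a more defensive variant that only calls .all() when it exists; iter_instances and FakeManager are hypothetical names used for illustration and are not part of serpy or Django:

```python
def iter_instances(instance):
    """Return something iterable for a many=True value.

    Manager-like objects (anything with a callable .all()) are
    expanded; lists, tuples and other iterables pass through as-is.
    """
    all_method = getattr(instance, 'all', None)
    if callable(all_method):
        return all_method()
    return instance


class FakeManager(object):
    """Stand-in for a related manager: not iterable, but has .all()."""

    def __init__(self, items):
        self._items = items

    def all(self):
        return list(self._items)


from_manager = [o for o in iter_instances(FakeManager(['a', 'b']))]
from_list = [o for o in iter_instances(['c', 'd'])]
print(from_manager, from_list)
```

Inside to_value, the loop would then iterate over iter_instances(instance) rather than instance.all().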

Accessing request object in method field - solution not working?

Hi,

I am trying to pass a context into my base class, as described in #12 for example, but this is not working for me.

Any clues? Has this always worked? I am using an older version of Serpy, could this be a problem?

serpy==0.1.0
djangorestframework==3.3.2

Input validation with serpy

Hi,

I use serpy with DRF. However, when I try to validate input, I get an error from the serpy serializer, as described in the documentation. Will input validation be implemented, or do I need to use two serializers: the standard DRF serializer for form validation and the serpy one for list serialization?

Thanks

Is it still under active development?

I am planning to use this library for one of my Django projects. But the last commit I see is more than a month old. I am curious to know if the library is still under active development?

accessing request object in method field?

I provide data based on the user that is currently authenticated.
With DRF I simply access self.context.request. I see the serializer accepts the context object, but leaves it at None. Is this a choice by design or a feature to be implemented?
Dealbreaker for me in any case.

In terms of performance testing, FYI I am getting 50% gains compared to DRF. Which is significant and would be great to have in prod...!

REST framework compat.

Couple of easy API changes that'd make serpy directly compatible with serializer_class = ... in the generic views with REST framework.

Accept but ignore the context argument.
Accept but error on the data argument, with "serpy serializers do not support input validation".

(Anything else I'm missing?)
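The two changes listed above could look roughly like the following constructor. This is only a sketch of the proposal: ReadOnlySerializer is a hypothetical stand-in class, not serpy's actual Serializer, and the RuntimeError type is an arbitrary choice.

```python
class ReadOnlySerializer(object):
    """Hypothetical stand-in base class, not serpy's actual Serializer."""

    def __init__(self, instance=None, many=False, data=None, context=None,
                 **kwargs):
        if data is not None:
            # Input validation is out of scope, so fail loudly.
            raise RuntimeError(
                'serpy serializers do not support input validation')
        self.instance = instance
        self.many = many
        # `context` is accepted for DRF compatibility but intentionally
        # ignored, as the proposal suggests.


ok = ReadOnlySerializer(instance=object(), context={'request': None})

try:
    ReadOnlySerializer(data={'x': 1})
except RuntimeError as exc:
    error_message = str(exc)
```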

Be interested to know if you think that'd be worth doing. If so then we could probably link to serpy as an alternative for read-only endpoints.

We'd also want to do a decent job of explaining what use-cases serpy does and doesn't support vs REST framework serializers, but we could address that once we're ready to link to it as an alternative third party package, some obvious points here for later reference:

Serialization only, no deserialization.
Unordered.
No relational types.

Serpy does not catch Key and Attribute errors on MethodField

I think there should be a way to stop serialization of a MethodField. A simple approach would be to catch the KeyError and AttributeError during MethodField serialization and decide whether the exception should be suppressed or raised. If the field is required, exceptions should be propagated, but if the field is optional, the exception should be suppressed just like in other fields.

This is less of a feature request and more of a bugfix, because consistency of behavior is important.

MethodField's staticmethod expects self parameter, unused and inconsistent with DRF

Django Rest Framework's SerializerMethodField expects something like this:

@staticmethod
def get_field(data):
    return ret

Whereas Serpy requires the self parameter:

@staticmethod
def get_field(self, data):
    return ret

Exceptions are not easy to trace

Because serpy takes the simple approach of using builtin type functions, exceptions are hard to trace back.

For example, if data is a string that is expected to be cast to an integer and the int function fails, the exception will only say invalid literal for int() with base 10: '', which is easy to understand but hard to trace back to a field.

What I propose is a small try except block in _serialize, that'll modify the exception to include serializer and field names. I believe, this'll improve error tracing and debugging vastly.
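The idea can be sketched independently of serpy's internals. In this hypothetical helper (convert_field is not a serpy function), a failing converter is re-raised with the serializer and field names prepended to the message:

```python
def convert_field(serializer_name, field_name, to_value, raw):
    """Apply a converter, re-raising failures with serializer/field context."""
    try:
        return to_value(raw)
    except (TypeError, ValueError) as exc:
        # Keep the original exception type, but say where it happened.
        raise type(exc)('%s.%s: %s' % (serializer_name, field_name, exc))


try:
    convert_field('FooSerializer', 'x', int, '')
except ValueError as exc:
    message = str(exc)

print(message)
# FooSerializer.x: invalid literal for int() with base 10: ''
```

A try/except in the hot path has a cost, so whether this fits serpy's speed goals would need to be measured.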

Support lists

As far as I can tell, this package doesn't support many=True on the basic fields. How should I include a field that is a list of strings in a serializer?

Serpy doesn't support deserialization

Serpy could easily be extended to support deserialization from dict back into Complex objects like Django objects. This would make it feature complete and a good candidate for replacing DRF fully. Is this something that's on the roadmap?

I realized I'm writing something similar to Serpy because I need faster serializer, but I also need deserialization. I can port my changes into Serpy.

serpy no longer supports None as field values

As a result of #38, the following things happen when None is returned from a MethodField:

If required=True (default), serialization throws an exception
If required=False, the key and value are not included in the output dictionary.

I'm using MethodField because it's a concrete example. I believe the same thing happens with most (if not all) other fields.

As you might expect, this breaks any use case of serpy that needs to return None.

The django rest framework docs state

Setting this to False also allows the object attribute or dictionary key to be omitted from output when serializing the instance. If the key is not present it will simply not be included in the output representation.

In concrete terms, the following serializer returns the data as seen below

class DrfSerializer(serializers.Serializer):
    value = serializers.IntegerField(required=...)

| DRF 3.7.3 | `required=True` | `required=False` |
| --- | --- | --- |
| `data = {'value': 1}` | `{'value': 1}` | `{'value': 1}` |
| `data = {'value': None}` | `{'value': None}` | `{'value': None}` |
| `data = {}` | `KeyError(u"Got KeyError when...` | `{}` |

When using the following serpy 0.2.0 serializer I get the results below

class SerpySerializer(serpy.DictSerializer):
    value = serpy.Field(required=...)

| serpy 0.2.0 | `required=True` | `required=False` |
| --- | --- | --- |
| `data = {'value': 1}` | `{'value': 1}` | `{'value': 1}` |
| `data = {'value': None}` | `TypeError('Field {0} is required', 'value')` | `{}` |
| `data = {}` | `KeyError('value',)` | `KeyError('value',)` |

For reference, this is the output from serpy 0.1.1

| serpy 0.1.1 | `required=True` | `required=False` |
| --- | --- | --- |
| `data = {'value': 1}` | `{'value': 1}` | `{'value': 1}` |
| `data = {'value': None}` | `{'value': None}` | `{'value': None}` |
| `data = {}` | `KeyError('value',)` | `KeyError('value',)` |

As seen in the tables, PR #38 does the following things

breaks cases that previously matched DRF behavior
adds no new cases that match DRF behavior
breaks any serializers that wanted to return None.

For these reasons I'd like to suggest #38 be reverted.

Should support field ordering

In some cases, it's desirable for the data to be ordered, in others, it's mandatory for the deserialization to respect order - example, deserializing Django objects with reverse fkey relations.

Non-required key raises key error when not present

In serpy 0.2, if a field set to required=False is not present in the object to be serialized, it will raise a KeyError. Based on my understanding of how DRF works, if a field set to required=False has a missing key, it will just skip it and move on.

http://www.django-rest-framework.org/api-guide/fields/#required

Nested 'self' field in Serializer

Is it possible somehow to have nested serializers which contain objects of the same class?
For example:

class PersonSerializer(serpy.Serializer):
    name = serpy.Field()
    age = serpy.IntField()
    childrens = PersonSerializer(many=True)

This code of course is invalid, but it should visualize my problem.
Marshmallow has something like:
childrens = fields.Nested('self', many=True)
If it's not possible I may try to implement feature like this.

Can't seem to run benchmarks from a clone

Docs say do this:

$ git clone https://github.com/clarkduvall/serpy.git && cd serpy
$ tox -e benchmarks

So:

$ mktmpenv --python=`which python3`
$ python -V
Python 3.4.3
$ git clone https://github.com/clarkduvall/serpy.git && cd serpy
$ pip install -r requirements.txt   # this line is missing from the docs btw
$ tox -e benchmarks
GLOB sdist-make: /path/tmp-64a1638fe69a816/serpy/setup.py
benchmarks create: /path/tmp-64a1638fe69a816/serpy/.tox/benchmarks
benchmarks installdeps: Django==1.7.7, djangorestframework==3.1.1, marshmallow==1.2.4
benchmarks inst: /path/tmp-64a1638fe69a816/serpy/.tox/dist/serpy-0.1.0.zip
...
Traceback (most recent call last):
      File "<string>", line 20, in <module>
      File "/path/pip-ipr7v027-build/setup.py", line 4, in <module>
        import serpy
      File "/path/pip-ipr7v027-build/serpy/__init__.py", line 1, in <module>
        from serpy.fields import (
      File "/path/pip-ipr7v027-build/serpy/fields.py", line 1, in <module>
        import six
    ImportError: No module named 'six'

Seems like maybe the deps for testenv aren't being carried across to testenv:benchmarks and nor is the install_requires from the setup.py being picked up?

Other tox -e invocations work fine.

Serializing properties which may or may not be present in object being passed in

Forgive me if my terminology is a bit off. So far I've been pretty impressed with Serpy - we were able to get some serious performance out of it compared to the Django Rest Framework serializer when returning several thousand objects in a single call (don't ask).

I have run into a small challenge, and I'm not quite sure the best way to overcome it. I've got a Serializer that inherits from serpy.Serializer, which has a number of fields (let's call them a, b, c, d, etc.). These fields may or may not be present in the object which I need to serialize, but I'd always like to make sure that they are present with a value of None in the serialized data. So my serializer looks something like:

class MySerializer(serpy.Serializer):
    a = serpy.FloatField()
    b = serpy.FloatField()
    c = serpy.FloatField()
    # ... and so on, for a bunch of fields

Inside of a Django view, the QuerySet of objects is filtered, and passed into the serializer.

# someObjects is a QuerySet of objects, which may or may not contain a, b, c, etc.
serializer = MySerializer(someObjects, many=True)

This raises the following exception:

... views.py", line 80, in get
    return Response(serializer.data)
  File "C:\Python27\lib\site-packages\serpy\serializer.py", line 139, in data
    self._data = self.to_value(self.instance)
  File "C:\Python27\lib\site-packages\serpy\serializer.py", line 128, in to_value
    return [serialize(o, fields) for o in instance]
  File "C:\Python27\lib\site-packages\serpy\serializer.py", line 116, in _serialize
    result = to_value(result)
TypeError: float() argument must be a string or a number

Switching the FloatFields to required=False no longer raises the exception, but it also no longer puts the corresponding dictionary key in the output. Ideally I would like a, b, c, etc. to always be present in the output but with a value of None. Is there a way to do this en masse, without having to do something for every field like this?
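One possible approach is a converter that lets None pass through instead of handing it to float(). A minimal sketch: float_or_none is a hypothetical helper, and wiring it into a field (for example as the to_value callable seen in the traceback above) is an assumption about serpy's field API that should be checked against the installed version.

```python
def float_or_none(value):
    """Convert to float, but let None pass through unchanged."""
    if value is None:
        return None
    return float(value)


# Illustrative raw attribute values, including a missing (None) one.
raw_values = [1, '2.5', None]
converted = [float_or_none(v) for v in raw_values]
print(converted)
# [1.0, 2.5, None]
```

Defining the converter once and reusing it for every such field avoids per-field special cases.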