
cerberus's Introduction

Cerberus

Cerberus is a lightweight and extensible data validation library for Python.

>>> v = Validator({'name': {'type': 'string'}})
>>> v.validate({'name': 'john doe'})
True

Features

Cerberus provides type checking and other base functionality out of the box and is designed to be non-blocking and easily and widely extensible, allowing for custom validation. It has no dependencies, but has the potential to become yours.
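For example, a custom rule can be added by subclassing Validator. A minimal sketch (the isodd rule below is made up for illustration and is not part of Cerberus; registration details vary slightly between versions):

>>> from cerberus import Validator
>>> class MyValidator(Validator):
...     def _validate_isodd(self, isodd, field, value):
...         # custom rules are plain methods named _validate_<rulename>
...         if isodd and not value % 2:
...             self._error(field, "must be an odd number")
...
>>> v = MyValidator({'n': {'type': 'integer', 'isodd': True}})
>>> v.validate({'n': 4})
False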

Versioning & Interpreter support

Starting with Cerberus 1.2, it is maintained according to semantic versioning. So, a major release sheds off the old and defines a space for the new, minor releases ship further new features and improvements (you know the drill, new bugs are inevitable too), and micro releases polish a definite amount of features to glory.

We intend to test Cerberus against all CPython interpreters at least until half a year after their end of life and against the most recent PyPy interpreter as a requirement for a release. If you still need to use it with a potential security hole in your setup, it should most probably work with the latest minor version branch from the time when the interpreter was still tested. Subsequent minor versions have good chances as well. In any case, you are advised to run the contributed test suite on your target system.

Funding

Cerberus is an open source, collaboratively funded project. If you run a business and are using Cerberus in a revenue-generating product, it would make business sense to sponsor its development: it ensures the project that your product relies on stays healthy and actively maintained. Individual users are also welcome to make a recurring pledge or a one time donation if Cerberus has helped you in your work or personal projects.

Every single sign-up makes a significant impact towards making Cerberus possible. To learn more, check out our funding page.

Documentation

Complete documentation is available at http://docs.python-cerberus.org

Installation

Cerberus is on PyPI, so all you need to do is:

$ pip install cerberus

Testing

Just run:

$ python setup.py test

Or you can use tox to run the tests under all supported Python versions. Make sure the required python versions are installed and run:

$ pip install tox  # first time only
$ tox

Contributing

Please see the Contribution Guidelines.

Cerberus is an open source project by Nicola Iarocci. See the license file for more information.

cerberus's People

Contributors

arshsingh, baubie, calve, cd3, crunk1, dkellner, dnohales, eelkeh, entropiae, flargebla, funkyfuture, gilbsgilbs, girogiro, hvdklauw, inirudebwoy, joshvillbrandt, kynan, martijnvermaat, misja, mmellison, nicoddemus, nicolaiarocci, nikitavlaznev, oev81, otibsa, pohmelie, pws21, rredkovich, russellluo, s4heid


cerberus's Issues

Allow maxlength/minlength on lists and dicts

For instance, we have an i18n system.
The user has to post a title as a dict, with the language code as key and the actual string as value. It doesn't matter which one you provide, but you have to provide at least one.

We can generate the schema based on the languages we know we support, but none of them are really required. Having a minlength: 1 on the dict would be awesome.
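Until such a rule exists, a workaround is a small Validator subclass; a sketch (the minkeys rule is made up here, not part of Cerberus):

from cerberus import Validator

class LengthValidator(Validator):
    # hypothetical rule: fail when a dict has fewer than min_keys entries
    def _validate_minkeys(self, min_keys, field, value):
        if isinstance(value, dict) and len(value) < min_keys:
            self._error(field, "must contain at least %d key(s)" % min_keys)

schema = {'title': {'type': 'dict', 'minkeys': 1,
                    'schema': {'en': {'type': 'string'}, 'de': {'type': 'string'}}}}

v = LengthValidator(schema)
v.validate({'title': {}})            # False: no language provided
v.validate({'title': {'en': 'Hi'}})  # True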

Inline schema validation

Hello. My idea is to validate a value not only by type, but also against another schema. For example:

schemas = {}
schemas['dog'] = {
  'name': {
     'type': 'string'       
   },
   'owner': {
     'schema': 'person' # validate not by type, but by "person" schema 
   }
}

schemas['person'] = {
  'name': {
    'type': 'string'
  }
}

v = Validator(schemas)

How about this?
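For reference, later Cerberus releases (the 1.x series) added schema registries that cover this; a rough sketch, assuming such a release is installed:

from cerberus import Validator, schema_registry

schema_registry.add('person', {'name': {'type': 'string'}})

dog_schema = {
    'name': {'type': 'string'},
    # a string constraint for 'schema' is looked up in the registry
    'owner': {'type': 'dict', 'schema': 'person'},
}

v = Validator(dog_schema)
v.validate({'name': 'Rex', 'owner': {'name': 'john doe'}})  # True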

[BUG] Nested document validation is broken

Nested document validation seems to be broken in the master branch. After a bit of debugging, it seems that this is due to the fact that the self.document attribute of sub-document validators gets assigned a copy of the contents of the parent document (the context), causing further validation steps to fail.

To reproduce

import cerberus

schema = {
    'info': {
        'type': 'dict',
        'schema': {
            'name': {'type': 'string', 'required': True}
        }
    }
}

validator = cerberus.Validator(schema)
res = validator.validate({'info': {'name': 'my name'}})
if not res:
    print(validator._errors)

Rename `keyschema` to `valueschema`

this is a separate aspect of #83:

  • renaming keyschema to valueschema
    • makes the terminology more pythonic and unambiguous in that way
    • keyschema will be an alias for valueschema
      • should somehow yell out that it is deprecated
      • assuming that client code is tested, it could even raise an exception; that may be a step in a later major release
    • the sooner the better

so far there's only been affirmative feedback.

Proposal: Validation of dictionary-keys

unless i overlooked something, it is not possible to validate the dictionary keys of a document or of a dictionary within it. with jsonschema this is done with patternProperties. also, the key types should be validatable.

it should be enough to enable checks for dicts:

>>> v = Validator({'mapping': {'type': 'dict', 'propertyschema': {'type': 'string', 'regex': '^[a-zA-Z]+$'}}})
>>> v.validate({'mapping': {'foo': 'bar'}})
True
>>> v.validate({'mapping': {'foo_bar': 'foobar'}})
False
>>> v.validate({'mapping': {1: 2}})
False
>>> v.schema = {'mapping': {'type': 'dict', 'propertyschema': {'type': 'integer', 'min': 0, 'max': 9}}}
...

and one could use a trick to check the document's top-level properties, which would be worth noting imo:

>>> v.validate({'document': document})

though i'm not very happy with the term keyschema as of now, because it's confusing; from a pythonic point of view, valueschema is much less ambiguous, imo.

if there is consensus regarding the schema design, i may go ahead and implement this.

ERROR_EMPTY_BAD_TYPE is not an error I expect in validator.errors

v = Validator({'field': {'required': True, 'type': 'string', 'empty': False}})

v.validate({'field': 1})

I expected v.errors to only contain:

value of field 'field' must be of string type

but it also contained:

'empty' rule only applies to string fields

Which I would expect to get when my schema defines it for any type other than string, not when I validate something that should contain a string but doesn't.

Can't install with pip (Python 3.3)

I'm unable to install cerberus either from pypi or git. I'm actually trying to install eve, but cerberus seems to be the culprit (something in the LICENSE file?):

Downloading/unpacking cerberus
  Downloading Cerberus-0.3.0.tar.gz
  Running setup.py egg_info for package cerberus
    Traceback (most recent call last):
      File "<string>", line 16, in <module>
      File "C:\dev\misc\eve\env\build\cerberus\setup.py", line 17, in <module>
        license=open('LICENSE').read(),
      File "C:\dev\misc\eve\env\lib\encodings\cp1252.py", line 23, in decode
        return codecs.charmap_decode(input,self.errors,decoding_table)[0]
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 287: character maps to <undefined>

Permit min/max with floats

Currently min/max can only be applied to ints. It'd be helpful if this was extended to floats as well.

Thanks
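For illustration, this is what the requested behaviour looks like once min/max accept floats (a sketch; newer releases do support this):

from cerberus import Validator

v = Validator({'price': {'type': 'float', 'min': 0.0, 'max': 99.99}})
v.validate({'price': 10.5})   # True
v.validate({'price': 120.0})  # False: exceeds max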

`allow_unknown` does not apply to sub-dictionaries in a list

#40 doesn't seem to apply when the sub-dictionary is in a list.

To reproduce:

import cerberus
cerberus.__version__
# '0.8'
v = cerberus.Validator(allow_unknown=True)
schema = {
    'a_dict': {
        'type': 'dict',
        'schema': {
            'address': {'type': 'string'},
            'city': {'type': 'string', 'required': True}
        }
    }
}
document = {
    'a_dict': {
        'address': 'my address',
        'city': 'my town',
        'extra': True
    }
}
v.validate(document, schema)
# True
schema_w_list = {
    'list_o_dicts': {
        'type': 'list',
        'minlength': 1,
        'schema': {
            'type': 'dict',
            'schema': {
                'address': {'type': 'string'},
                'city': {'type': 'string', 'required': True}
            }
        }
    }
}
document_w_list = {
    'list_o_dicts': [{
        'address': 'my address',
        'city': 'my town',
        'extra': True
    }]
}
v.validate(document_w_list, schema_w_list)
# False
v.errors
# {'list_o_dicts': {0: {'extra': 'unknown field'}}}

On a side-note, cerberus is awesome.

Request: Allow pyyaml to be used in unit tests

I use YAML to write the schema in my applications, which is one of the greatest features of cerberus in my mind. I can write a schema for a configuration file in plain-text and load it into a nested dict.

I think the documentation should list this as a feature, but I'd also like to be able to write unit tests in YAML; it is a much cleaner syntax than a nested dict.
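A minimal sketch of that workflow with pyyaml (the schema keys here are only illustrative):

import yaml
from cerberus import Validator

schema = yaml.safe_load("""
name:
  type: string
age:
  type: integer
  min: 0
""")

v = Validator(schema)
v.validate({'name': 'john doe', 'age': 42})  # True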

`readonly` validation should happen before any other validation

Currently, when a field is marked as readonly and has a custom validation rule, that validation rule gets executed before readonly is checked. Ideally, if 'readonly': True is provided, that check should happen before any other validation, since it's pointless to validate a value that isn't allowed.

Use cases

Use Case 1 - record level validation:
Say I have two fields, "Amount" and "Calculation Type". Amount is always a float; however, when Calculation Type is "Percent", Amount must be bounded from 0 to 100.

Use Case 2 - table level validation:
Say I have multiple records and after grouping certain fields, no duplication can be present.

Use case 3 - Multiple coercion functions:
I need to convert a value to string, and perform a couple of other functions before validation. Any tips here?

Appreciate your help! This is probably one of the coolest python projects I've seen in a while.
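For use case 1, one possible approach is a Validator subclass whose custom rule looks at the sibling field via self.document; a sketch (percent_bounded is a made-up rule name, not part of Cerberus):

from cerberus import Validator

class RecordValidator(Validator):
    # hypothetical rule: bound 'amount' only when the calculation type is 'Percent'
    def _validate_percent_bounded(self, percent_bounded, field, value):
        if percent_bounded and self.document.get('calculation_type') == 'Percent':
            if not 0 <= value <= 100:
                self._error(field, "must be between 0 and 100 when calculation_type is 'Percent'")

schema = {
    'calculation_type': {'type': 'string', 'allowed': ['Fixed', 'Percent']},
    'amount': {'type': 'float', 'percent_bounded': True},
}

v = RecordValidator(schema)
v.validate({'calculation_type': 'Percent', 'amount': 150.0})  # False
v.validate({'calculation_type': 'Fixed', 'amount': 150.0})    # True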

Validator.validate_schema() not implemented

Hello!

I have installed Cerberus with 'pip install cerberus'. The version installed is Cerberus==0.7.
I have a problem when I do:

document_schema = {'curr': {'maxlength': 3, 'minlength': 3, 'required': True, 'type': 'string'},
'dep': {'minlength': 1, 'required': True, 'type': 'string'}}
v = Validator(document_schema)

v.validate_schema(document)


AttributeError Traceback (most recent call last)
in ()
----> 1 v.validate_schema(document)

AttributeError: 'Validator' object has no attribute 'validate_schema'

According to the documentation, Cerberus==0.7 implements this function, but it is not implemented. I need this function to validate the schema before validating documents at runtime. I cannot afford a schema exception at runtime.

Any solution?

A lot of Thanks! ;-)
Cerberus is fantastic.
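One way to fail early is to construct the Validator up front and catch schema errors there; a sketch, assuming a release where the schema is checked on construction and SchemaError is importable from the cerberus package:

from cerberus import Validator, SchemaError

document_schema = {'curr': {'maxlength': 3, 'minlength': 3, 'required': True, 'type': 'string'},
                   'dep': {'minlength': 1, 'required': True, 'type': 'string'}}

try:
    v = Validator(document_schema)  # schema problems surface here, not at validate() time
except SchemaError as exc:
    raise SystemExit('invalid schema: %s' % exc)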

Facilitate type conversions

This might be totally out of your intended scope of Cerberus, and the implementation I have in mind is not 100% pretty, but let me explain my use case :)

Data I want to validate can come in serialized as JSON, but also as HTTP form data, or even query string parameters. In the latter cases, everything is basically a string. You could also consider datetime values in JSON.

I'd like to cast/convert some values (e.g. integers serialized as strings in HTTP form data) in the Validator, such that only one pass over the schema and data is needed and validation rules on the converted value can be used as usual. This conversion should then be done before the other validation rules are applied.

Currently I use the following hack (example) in a Validator subclass to do any conversions in the type rules:

def _validate_type_integer(self, field, value):
    if isinstance(value, basestring):
        try:
            self.document[field] = int(value)
        except ValueError:
            pass
    super(ApiValidator, self)._validate_type_integer(field, self.document[field])

To make sure type rules are applied first, I applied this patch (subclassing or monkey-patching would basically mean duplicating the entire _validate method): martijnvermaat/cerberus@dd0de1f85

List and dictionary types are where it gets uglier. Some hacking is necessary to make sure converted values end up in the top-level validator document.

After validation, I can then take the data from the validator document field and directly work with the converted values.

The alternative is to do the conversion before validation, but that would mean duplicating Cerberus' schema parsing and document traversal.

So this is what I do now, and it works fine, but I'm not happy about having to use the patched _validate method. (In practice I have more intricate data conversions that just strings to integers.)

Other issues I see:

  • The naming "validate" no longer accurately describes what's being done.
  • You might not like the idea of the input document being modified (solution could be to generate a copy with converted values).

What are your thoughts? Do you see this as a valid use case? I could see this implemented in a cleaner way by having optional _convert_integer etc rules, which are always applied first.
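For the record, later Cerberus releases added a coerce rule that covers this use case; a sketch, assuming such a release:

from cerberus import Validator

schema = {'age': {'type': 'integer', 'coerce': int},
          'name': {'type': 'string', 'coerce': str}}

v = Validator(schema)
v.validate({'age': '42', 'name': 123})  # True: values are coerced before the type check
v.document                              # {'age': 42, 'name': '123'}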

Shouldn't it permit passing/posting multiple items for a list-type field?

Given this field schema (snippet):

'tels': {
  'type': 'list', 
  'items': [{
      'type': 'dict', 
      'schema': {
          'text': {'type': 'string', 'required': True},
          'ext': {'type': 'string'},
          'note': {'type': 'string', 'maxlength': 64}
      }}]}

and this test snippet:

  key = 'Apple'
  doc = {
      'cnam':  key,
      'tels': [{
          'text': self.random_string(8),
          'ext': self.random_string(3),
          'note': self.random_string(16),
      },{
          'text': self.random_string(8),
          'ext': self.random_string(3),
          'note': self.random_string(16),
      }],
    }
  payload = {}
  payload[key] = json.dumps(doc)
  r, status = self.post('/%s/' % url, data=payload)

Upon validation it returns:

"'tels': lenght of list should be 1"

Note: misspelling, it should be "length". :)
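For an arbitrary number of homogeneous entries, the schema rule (rather than items) is the tool to reach for; a sketch:

from cerberus import Validator

tels_schema = {
    'tels': {
        'type': 'list',
        'schema': {  # applied to every element of the list
            'type': 'dict',
            'schema': {
                'text': {'type': 'string', 'required': True},
                'ext': {'type': 'string'},
                'note': {'type': 'string', 'maxlength': 64},
            },
        },
    },
}

v = Validator(tels_schema)
v.validate({'tels': [{'text': '555-0100'}, {'text': '555-0101', 'ext': '12'}]})  # True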


allow_unknown does not respect custom validators

The Cerberus API allows custom validation properties and custom validation types to be added by creating a class that inherits from cerberus.Validator. It also allows validation rules to be supplied for arbitrary fields, via the allow_unknown property. However, the schema for unknown properties cannot make use of custom validation properties and custom validation types.

For example:

from cerberus import Validator

class CustomValidator(Validator):

    def _validate_type_foo(self, field, value):
        if not value == "foo":
            self.error(field, "Expected a foo")


v = CustomValidator({})
v.allow_unknown = {"type": "foo"}

v.validate( { "fred": "foo", "barney": "foo" } )

I would expect the call to .validate() to approve that document, but instead
I get a traceback:

Traceback (most recent call last):
  File "test-cerberus.py", line 13, in <module>
    v.validate( { "fred": "foo", "barney": "foo" } )
  File "cerberus/cerberus.py", line 165, in validate
    return self._validate(document, schema, update=update, context=context)
  File "cerberus/cerberus.py", line 228, in _validate
    self.allow_unknown})
  File "cerberus/cerberus.py", line 118, in __init__
    self.validate_schema(schema)
  File "cerberus/cerberus.py", line 278, in validate_schema
    errors.ERROR_UNKNOWN_TYPE % value)
cerberus.cerberus.SchemaError: unrecognized data-type 'foo'

Validating values of dicts with arbitrary/unknown names of keys - is it possible..?

Having a document like this one:

    document = {
        'aaa': {
            'bbb': [
                {'ddd': {'xxx': 123, 'yyy': 'some string'}},
                {'eee': {'zzz': 555}},
            ],
            'ccc': [
                {'ddd': {'xxx': 789, 'yyy': 'some other string'}},
            ]
        }
    }

...I can validate it with the following schema:

    schema = {
        'aaa': {
            'type': 'dict',
            'schema': {
                'bbb': {
                    'type': 'list',
                    'schema': {
                        'type': 'dict',
                        'schema': {
                            'ddd': {
                                'type': 'dict',
                                'schema': {
                                    'xxx': {'type': 'integer'},
                                    'yyy': {'type': 'string'},
                                },
                            },
                            'eee': {
                                'type': 'dict',
                                'schema': {
                                    'zzz': {'type': 'integer'},
                                },
                            }
                        }
                    }
                },
                'ccc': {
                    'type': 'list',
                    'schema': {
                        'type': 'dict',
                        'schema': {
                            'ddd': {
                                'type': 'dict',
                                'schema': {
                                    'xxx': {'type': 'integer'},
                                    'yyy': {'type': 'string'},
                                },
                            }
                        }
                    }
                }  # /ccc
            }
        }  # /aaa
    }

...but the thing is, I need the possibility to use arbitrary names for keys bbb and ccc (and yes, there may be an arbitrary number of them on this level).
In other words, how can I validate their values (which are lists of ddd and eee dicts, and those dicts always have the same structure), without knowing their names..? Is something like that possible with Cerberus..?
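Later Cerberus releases answer this with the valueschema rule, which validates every value of a mapping against one definition regardless of its key; a sketch, assuming such a release:

from cerberus import Validator

entry_schema = {
    'type': 'dict',
    'schema': {
        'ddd': {'type': 'dict', 'schema': {'xxx': {'type': 'integer'},
                                           'yyy': {'type': 'string'}}},
        'eee': {'type': 'dict', 'schema': {'zzz': {'type': 'integer'}}},
    },
}

schema = {
    'aaa': {
        'type': 'dict',
        'valueschema': {'type': 'list', 'schema': entry_schema},
    }
}

document = {'aaa': {'whatever': [{'ddd': {'xxx': 123, 'yyy': 'some string'}}],
                    'anything': [{'eee': {'zzz': 555}}]}}

Validator(schema).validate(document)  # True, whatever the key names under 'aaa' are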

Request: Allow list of schema to be validated against

It would be useful to allow the schema key to be a list, which would be interpreted as a set of possible schemas; the entry should validate if any of the schemas validates. It is already possible to give a list of types, but if I want to make sure that a value is a dict that has one of two possible formats, it is not possible without creating _validate_type_* functions.

validictory allows this because it handles custom types in a way similar to the proposal in Issue #96
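Later releases added the anyof / allof / oneof / noneof rules, which cover this; a rough sketch, assuming such a release:

from cerberus import Validator

schema = {
    'value': {
        'type': 'dict',
        'anyof': [
            {'schema': {'kind': {'allowed': ['point']},
                        'x': {'type': 'integer', 'required': True},
                        'y': {'type': 'integer', 'required': True}}},
            {'schema': {'kind': {'allowed': ['label']},
                        'text': {'type': 'string', 'required': True}}},
        ],
    }
}

v = Validator(schema)
v.validate({'value': {'kind': 'label', 'text': 'hello'}})  # True: matches the second variant
v.validate({'value': {'kind': 'point', 'x': 1}})           # False: fits neither variant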

Allow a linelength of 120 chars?!

i'm really much for pep8 compliance. however, i think that nowadays there's hardly a need to restrict line length to 80 characters, which also results in less readable code due to lots of line continuations.

what about setting the allowed line-length to 120 characters?

Pre-processing and collecting valid data

During validation of GET parameters I'd like to preprocess them first.
For example, we have the following query:

GET /resource?type=foo,bar,baz&relation_ids=[1,2,3]&count=true

Parsed parameters become a dictionary like this

parameters = {
    "type": "foo,bar,baz",
    "relation_ids": "[1,2,3]",
    "count": "true"
}

So validating against a schema like {"type": {"type": "string"}, "count": {"type": "string"}, "relation_ids": {"type": "string"}} doesn't make much sense.
This is why I'd suggest enabling a preprocessing rule for each field before validation.

Collecting valid data may also be useful, because otherwise we would perform the same processing of the input parameters, executing exactly the same code twice.

Pseudo code may describe the idea better

schema = {
    "type": {
        "type": "string_list",  # custom type
        "prepare": True  # invokes self._prepare_type() before any validations
        "store_to": "types"  # new field name in processed document
        "empty": False
    },
    "relation_ids": {
        "type": "ints_list",  # custom type
        "prepare": True,  # self._prepare_relation_ids()
        "store_to": "relation_ids",
        "empty": False
    },
    "count": {
        "type": "boolean",
        "prepare": True,  # self._prepare_count()
        "store_to": "count",
        "empty": False
    }
}

class HTTPAPIValidator(cerberus.Validator):
    def __init__(self, *args, **kwargs):
        self.prepare = kwargs.pop('prepare', False)  # global switch, off by default
        super(HTTPAPIValidator, self).__init__(*args, **kwargs)
        self._processed_document = {}

    @property
    def processed_document(self):
        return self._processed_document

    def _prepare_type(self, field, value):
        try:
            return value.strip('[]').split(',')
        except (TypeError, ValueError):
            self._error(field, 'cannot process field {0}'.format(field))

    def validate(self, *args, **kwargs):
        # add logic for storing to self._processed_document after validation
        ...

Currently, the same may be implemented by subclassing cerberus.Validator and using a schema like:

schema = {
    "type": {
        "type": "string", "empty": False,
        "split_to_field": "types"  # invokes self._validate_split_to_field(self, target_field, field, value)
    },
    "relation_ids": {
        "type": "ints_list", "empty": False,
        "int_split_to_field": True,  # self._validate_ints_split_to_field
    },
    "count": {
        'type': 'string', 'empty': False,
        'bool_to_field': 'count',
        'allowed': ['true', 'false']
    }
}

custom fields on a list of objects don't work properly

sample schema:

{'a': {'schema': {'b': {'oid': 'here', 'type': 'string'}}, 'type': 'list'}}

sample doc:

{'a': [ { 'b' : '33' } ] }

custom validation function for oid:

    def _validate_oid(self, *args):
        print(args)

output:

SchemaError: unknown rule 'b' for field '0'

If I change line 298-299 in cerberus.py to

validator = self.__class__(schema)
validator.validate(value[i])

output:

('here', 'b', '33')

as it should be.

Proposal: custom coerce-methods

i think it's better to include this feature before 0.9 is released. and as there is a reliable pattern for types, it wouldn't be much effort i guess.

one could also add an example to the docs and tests that illustrates a subclassed Validator that makes use of coercing in conjunction with contextual instance-properties.

Proposal: add exclusion-checks

it'd be handy to declare properties to exclude others:

{'property_a': {'excludes': 'property_b'},
 'property_b': {'excludes': ['property_a', 'property_c']}}

if a property is not present in the document, excluded by a present one, but marked required, the requirement should be dismissed. this way mutually exclusive requirements are possible:

schema = \
{'property_a': {'excludes': 'property_b', 'required': True},
 'property_b': {'excludes': 'property_a', 'required': True}}

valid_documents = \
[{'property_a': 'foo'}, {'property_b': 'bar'}]

invalid_documents = \
[{'property_a': 'foo', 'property_b': 'bar'}, {'property_c': 'baz'}]

since i can make use of this and may be implementing it in the next days, i'd appreciate any thoughts on this.

Validator options not passed to "sub-schemas"

To validate a nested dict I am using the "schema" rule to define the rules for each level of the dict. The schema rule creates a new Validator object with the schema defined by the rule and passes the corresponding value in the document to the validate function.

However, the options of the main Validator instance are not passed to these sub-Validator instances. So, if the "allow_unknown" attribute is set to true in the main Validator, it does not get set for all sub-Validators.

My idea would be to add a "set_parent" method to the Validator class that would set the "parent" attribute to some Validator instance and copy all setting attributes from it (so allow_unknown, ignore_none_values, etc. would be copied).

I think the idea of Validators having parents would be useful in general, because custom validation rules could then access data from an arbitrary position in the main schema no matter how far down they were created. If the documents passed into the validate function also knew about their parent, then you could write validation rules that check other fields in the document to decide if the current field is valid (which is what I ultimately want to do).

Allow to pass error-handlers to a `Validator`-instance

this is a follow-up to #89 and #90.

i propose to introduce error handlers in order to allow dealing with errors more flexibly.

to achieve that:

  • Validator.__init__ takes an optional error_handler-object
  • Validator._error is extended, so it stores the following data about an error:
    • trail - a list that represents the path to the field in the document (eg: ['a_dict', 'a_list']); a common prefix can be specified upon calling Validator.validate
    • field, value - as of now
    • constraint - the constraint that failed
    • message - a simple error message, like currently implemented
  • Validator.errors calls the format(?)-method of error_handler that may return errors in a desired format and / or do whatever its purpose is

there will be one default handler and two more as reference implementations:

  • BasicErrorHandler
    • returns errors as now
    • but concatenates trail and field
  • HumanReadableErrorHandler
    • is targeted to end-users
    • concatenates trail and field
    • suggests valid keys, in case of an unallowed value
    • the index of list-items is increased by one
    • list-items are prefixed with item #
  • YamlErrorHandler
    • structures errors in a dictionary
    • joins it into a yaml-file

TypeError "takes at least 2 arguments" in _validate_keyschema

Between 0.7 and 0.7.2 the _validate_keyschema function has changed.

0.7 https://github.com/nicolaiarocci/cerberus/blob/bc52d9f39b0f90ea9656e734119b63c8b47945f0/cerberus/cerberus.py#L356

def _validate_keyschema(self, schema, field, value):
    for key, document in value.items():
        validator = self.__class__(schema)
        validator.validate({key: document}, {key: schema})
        if len(validator.errors):
            self._error(field, validator.errors)

0.7.2 https://github.com/nicolaiarocci/cerberus/blob/0d2486f53becf757f7824ff9c40d92635b759a86/cerberus/cerberus.py#L403

def _validate_keyschema(self, schema, field, value):
    for key, document in value.items():
        validator = self.__class__()
        validator.validate(
            {key: document}, {key: schema}, context=self.document)
        if len(validator.errors):
            self._error(field, validator.errors)

But 0.7.2 gives me the following traceback in my Eve application:

Traceback (most recent call last):
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/flask/app.py", line 1820, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/flask/app.py", line 1403, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/eve/endpoints.py", line 55, in collections_endpoint
    response = post(resource)
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/eve/methods/common.py", line 229, in rate_limited
    return f(*args, **kwargs)
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/eve/auth.py", line 57, in decorated
    return f(*args, **kwargs)
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/eve/methods/common.py", line 675, in decorated
    r = f(resource, **combined_args)
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/eve/methods/post.py", line 146, in post
    validation = validator.validate(document)
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/cerberus/cerberus.py", line 160, in validate
    return self._validate(document, schema, update=update, context=context)
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/cerberus/cerberus.py", line 216, in _validate
    validator(definition[rule], field, value)
  File "/Users/roessland/.virtualenvs/**********/lib/python2.7/site-packages/cerberus/cerberus.py", line 403, in _validate_keyschema
    validator = self.__class__()
TypeError: __init__() takes at least 2 arguments (1 given)

Reverting it back to validator = self.__class__(schema) makes the Eve API work correctly again.

It seems like this change was done in this commit: 7d04a1c

Is this a bug?

Schemas aren't validated when constructing a Validator

This issue is more a matter of philosophy than a bug, but I thought I'd write it down anyway.

Impact:

If a schema contains an error then the validating program using the schema will only crash if it receives data pertaining to the erroneous part of the schema. This can make debugging hard.

Reproduction:

import cerberus

schema = {
    'ok': {'type': 'string'},
    'poop': {'not_valid_key_for_schema': 'wrong'}}
validator = cerberus.Validator(schema)  # works fine even though validation of 'poop' will fail
validator.validate({'ok': 'everything is awesome'})
validator.validate({'poop': 'ah, it broke'})

Suggested behaviour:

I wonder whether, when creating a validator or updating its schema, cerberus should validate the schema itself against an internal schema for what schema definitions can look like. This would avoid only hitting mistakes in a schema definition when runtime data hits that part of the schema, which could be quite rare.

Penny for your thoughts?

% string formatting doesn't allow missing parameters

Consider using the str.format() method so that if someone wants to override the error.* messages, they don't get an error when a %s placeholder is missing from the new message.

>>> 'some message' % 'x'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting
>>> 'some message'.format('x')
'some message'
>>>

Return doc/sub_doc and/or offset to failing items in a list.

When a list item fails, how would you know WHICH item(s) in the list failed? Would it help to return the offending item with the error? Perhaps list offsets pointing to failed items?

This may also relate to results returned when posting a list of base docs.

Proposal: allow_unknown should be configurable separately for each schema

Something like this:

schema = {
    'a_dict': {
        'type': 'dict',
        'allowed_unknown': True,
        'schema': {
            'address': {'type': 'string'},
            'city': {'type': 'string', 'required': True}
        }
    }
}

Why? Our use case:
We validate a big JSON document with mixed data. Some parts of this structure are the REST response from an external service which is under active development and often changes its data format; other parts are our data (they are more or less stable). It's very handy to disable/enable "allowed_unknown" for some parts of this struct during "step by step" API stabilization.

FR: Consider adding 'email' as a core type

It'd be useful if cerberus had 'email' as a core type so users don't have to find a regex to validate emails, etc. It's just a small thing to do, but given how almost every web app will want to validate email addresses, it'd speed up using the library if it were a core type. Otherwise users have to learn how to add custom types, write a regex, etc. Adding a 'regex' parameter to the string type would also be useful.

Just a small thing, but I think it'd be helpful.

Add unique in list validator

I use this custom validator a lot - it might be useful for others as well. It is most useful for objects with embedded lists of objects.

    def _validate_unique_in_list(self, unique_in_list, field, value):
        """ Enforce uniqueness of fields listed in unique_in_list against a
        list of objects in value.
        """
        # init error object
        errors = {}

        # force input to list
        unique_fields = unique_in_list
        if type(unique_fields) is not list:
            unique_fields = [unique_fields]

        for unique_field in unique_fields:
            # build hash set
            hashes = []
            for i, channel in enumerate(value):
                if isinstance(channel[unique_field], dict):
                    h = hash(frozenset(channel[unique_field].items()))
                else:
                    h = hash(channel[unique_field])
                hashes.append(h)

            # log duplicates
            for i, h in enumerate(hashes):
                if hashes.count(h) > 1:
                    if str(i) not in errors:
                        errors[str(i)] = {}
                    errors[str(i)][unique_field] = \
                        "value '%s' must be unique in list" % \
                        value[i][unique_field]

        # report errors
        if len(errors) > 0:
            self._error(field, errors)

This can be used like this where unique_in_list is a single field name or a list of field names:

'field_name': {
    'type': 'list',
    'unique_in_list': '_id',
    'schema': { ... }
}

Another improvement that could be made would be modifying the self._error() function to extend the errors object instead of assigning field to errors. This would allow me to have multiple self._error() calls without superfluous structure in the resulting errors object.

Why does the list type exclude strings?

I don't understand the rationale, and I'm hoping you might explain why you did this.

Thank you for your work!

edit
I'm creating an endpoint in eve, and I need to reference several other resources, but I don't see the point of telling the consumer that they are consuming IDs like this:

    related:
        [ {id: "1axZ"}, {id: "F2mA"}, {id: "uM5q"} ] 

I would like to do this:

    related:
        [ "1axZ", "F2mA", "uM5q" ]

Maybe I'm wrong, though, in how this should be constructed.

Proposal: option to purge unknown fields on validation

I have the need to not only validate the data but also remove unknown fields. I couldn't see an easy way in _validate to get a direct reference to the document element being parsed, so for now I subclass Validator and do a clean-up post validation based on the errors dict, see this gist. I was wondering: has this been discussed before, would it be something to include, and how best to handle it via _validate?
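For reference, later Cerberus releases grew a purge_unknown option that does exactly this during normalization; a sketch, assuming such a release:

from cerberus import Validator

schema = {'name': {'type': 'string'}}

v = Validator(schema, purge_unknown=True)
v.normalized({'name': 'john doe', 'stray': 'field'})  # {'name': 'john doe'}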

suggestion to make it easier to find this project

Looks like cerberus is exactly what I was looking for. Bad news is that I was searching for the schema and validation keywords on PyPI and didn't notice your package... Luckily, because you just did a release today, I noticed your project.

I would suggest adding the word schema somewhere in the project description and/or keywords, so people can find your project! Thanks.

why wrap _validate with validate and validate_update?

if all the validate and validate_update methods do is just pass the update parameter, why not just let the user do it?

this is much better:

validate(document)
validate(document, update=True)

than this:

validate(document)
validate_update(document)

allow_unknown in schema dict broken

The cerberus documentation gives an example of setting allow_unknown in your schema dictionary. However, even when following the example verbatim, cerberus chokes:

https://cerberus.readthedocs.org/en/latest/#allowing-the-unknown

from cerberus import Validator
v = Validator()
schema = {
  'name': {'type': 'string'},
  'a_dict': {
    'type': 'dict',
    'allow_unknown': True,
    'schema': {
      'address': {'type': 'string'}
    }
  }
}
v.validate({'name': 'john', 'a_dict':{'an_unknown_field': 'is allowed'}}, schema)

produces

cerberus.cerberus.SchemaError: unknown rule 'allow_unknown' for field 'a_dict'

Proposal: allow explicit rules per type

atm it is possible to test a value to be one of multiple types, eg:

'some_field': {'type': ['a_type', 'b_type']}

for simple cases this is fine. however:

  • in case it's a dict, you can't test schema if list is also allowed
  • any other rule can only be same for each possible type
  • schema-definitions become more or less unclear:
{'a_field': {'type': ['dict', 'list', 'string'],
             'schema': {'type': ['dict', 'string'],
                        'regex': 'foo.*',
                        'keyschema': {'type': 'string',
                                      'regex': '.*'}},
             'keyschema': {'type': 'string',
                           'regex': '.*'},
             'regex': 'foo.*'}}

so, i propose to extend and also allow this notation:

{'a_field': {'type': {'dict': {'keyschema': {'type': 'string', 'regex': '.*'}},
                      'list': {'schema': {'type': {'dict': {'keyschema': {'type': 'string', 'regex': '.*'}},
                                                   'string': {'regex': 'foo.*'}}}},
                      'string': {'regex': 'foo.*'}}}}

so, allowed types can also be mappings whose value is a schema that will be validated against the value if the key is recognized as a valid type of the value. thus (not in the example), rules could be very different depending on the actual type. it seems quite easy to implement and shouldn't break anything.

though that idea came up while debugging as a more-or-less workaround at first, i still find it's not a bad idea. imo, its sexiness is caused by its explicit nature. atm i don't need it, but that may change any moment.

default keyword

Although not part of validation, where would be the best place to specify default values?
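For what it's worth, later Cerberus releases added a default rule that is applied during normalization; a sketch, assuming such a release:

from cerberus import Validator

v = Validator({'amount': {'type': 'integer', 'default': 1}})
v.normalized({})  # {'amount': 1}
v.validate({})    # True, and v.document == {'amount': 1}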
