voluptuous's Introduction

CONTRIBUTIONS ONLY

What does this mean? I do not have time to fix issues myself. The only way fixes or new features will be added is by people submitting PRs.

Current status: Voluptuous is largely feature stable. There hasn't been a need to add new features in a while, but there are some bugs that should be fixed.

Why? I no longer use Voluptuous personally (in fact I no longer regularly write Python code). Rather than leave the project in a limbo of people filing issues and wondering why they're not being worked on, I believe this notice will more clearly set expectations.

Voluptuous is a Python data validation library

Voluptuous, despite the name, is a Python data validation library. It is primarily intended for validating data coming into Python as JSON, YAML, etc.

It has three goals:

  1. Simplicity.
  2. Support for complex data structures.
  3. Provide useful error messages.

Contact

Voluptuous now has a mailing list! Send a mail to [email protected] to subscribe. Instructions will follow.

You can also contact me directly via email or Twitter.

To file a bug, create a new issue on GitHub with a short example of how to replicate the issue.

Documentation

The documentation is provided here.

Contribution to Documentation

Documentation is built using Sphinx. You can install it with:

pip install -r requirements.txt

To build the sphinx-apidoc output from scratch, set PYTHONPATH to the voluptuous/voluptuous repository.

Changelog

See CHANGELOG.md.

Why use Voluptuous over another validation library?

Validators are simple callables: No need to subclass anything, just use a function.

Errors are simple exceptions: A validator can just raise Invalid(msg) and expect the user to get useful messages.

Schemas are basic Python data structures: Should your data be a dictionary of integer keys to strings? {int: str} does what you expect. List of integers, floats or strings? [int, float, str].

Designed from the ground up for validating more than just forms: Nested data structures are treated in the same way as any other type. Need a list of dictionaries? [{}]

Consistency: Types in the schema are checked as types. Values are compared as values. Callables are called to validate. Simple.

Show me an example

Twitter's user search API accepts query URLs like:

$ curl 'https://api.twitter.com/1.1/users/search.json?q=python&per_page=20&page=1'

To validate this we might use a schema like:

>>> from voluptuous import Schema
>>> schema = Schema({
...   'q': str,
...   'per_page': int,
...   'page': int,
... })

This schema very succinctly and roughly describes the data required by the API, and will work fine. But it has a few problems. Firstly, it doesn't fully express the constraints of the API. According to the API, per_page should be restricted to at most 20, defaulting to 5, for example. To describe the semantics of the API more accurately, our schema will need to be more thoroughly defined:

>>> from voluptuous import Required, All, Length, Range
>>> schema = Schema({
...   Required('q'): All(str, Length(min=1)),
...   Required('per_page', default=5): All(int, Range(min=1, max=20)),
...   'page': All(int, Range(min=0)),
... })

This schema fully enforces the interface defined in Twitter's documentation, and goes a little further for completeness.

"q" is required:

>>> from voluptuous import MultipleInvalid, Invalid
>>> try:
...   schema({})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "required key not provided @ data['q']"
True

...must be a string:

>>> try:
...   schema({'q': 123})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "expected str for dictionary value @ data['q']"
True

...and must be at least one character in length:

>>> try:
...   schema({'q': ''})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "length of value must be at least 1 for dictionary value @ data['q']"
True
>>> schema({'q': '#topic'}) == {'q': '#topic', 'per_page': 5}
True

"per_page" is a positive integer no greater than 20:

>>> try:
...   schema({'q': '#topic', 'per_page': 900})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "value must be at most 20 for dictionary value @ data['per_page']"
True
>>> try:
...   schema({'q': '#topic', 'per_page': -10})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "value must be at least 1 for dictionary value @ data['per_page']"
True

"page" is an integer >= 0:

>>> try:
...   schema({'q': '#topic', 'page': 'one'})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc)
"expected int for dictionary value @ data['page']"
>>> schema({'q': '#topic', 'page': 1}) == {'q': '#topic', 'page': 1, 'per_page': 5}
True

Defining schemas

Schemas are nested data structures consisting of dictionaries, lists, scalars and validators. Each node in the input schema is pattern matched against corresponding nodes in the input data.

Literals

Literals in the schema are matched using normal equality checks:

>>> schema = Schema(1)
>>> schema(1)
1
>>> schema = Schema('a string')
>>> schema('a string')
'a string'

Types

Types in the schema are matched by checking if the corresponding value is an instance of the type:

>>> schema = Schema(int)
>>> schema(1)
1
>>> try:
...   schema('one')
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "expected int"
True

URLs

URLs in the schema are matched using the urlparse library.

>>> from voluptuous import Url
>>> schema = Schema(Url())
>>> schema('http://w3.org')
'http://w3.org'
>>> try:
...   schema('one')
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "expected a URL"
True

Lists

Lists in the schema are treated as a set of valid values. Each element in the schema list is compared to each value in the input data:

>>> schema = Schema([1, 'a', 'string'])
>>> schema([1])
[1]
>>> schema([1, 1, 1])
[1, 1, 1]
>>> schema(['a', 1, 'string', 1, 'string'])
['a', 1, 'string', 1, 'string']

However, an empty list ([]) is treated as is. If you want to specify a list that can contain anything, specify it as list:

>>> schema = Schema([])
>>> try:
...   schema([1])
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "not a valid value @ data[1]"
True
>>> schema([])
[]
>>> schema = Schema(list)
>>> schema([])
[]
>>> schema([1, 2])
[1, 2]

Sets and frozensets

Sets and frozensets are treated as a set of valid values. Each element in the schema set is compared to each value in the input data:

>>> schema = Schema({42})
>>> schema({42}) == {42}
True
>>> try:
...   schema({43})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "invalid value in set"
True
>>> schema = Schema({int})
>>> schema({1, 2, 3}) == {1, 2, 3}
True
>>> schema = Schema({int, str})
>>> schema({1, 2, 'abc'}) == {1, 2, 'abc'}
True
>>> schema = Schema(frozenset([int]))
>>> try:
...   schema({3})
...   raise AssertionError('Invalid not raised')
... except Invalid as e:
...   exc = e
>>> str(exc) == 'expected a frozenset'
True

However, an empty set (set()) is treated as is. If you want to specify a set that can contain anything, specify it as set:

>>> schema = Schema(set())
>>> try:
...   schema({1})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "invalid value in set"
True
>>> schema(set()) == set()
True
>>> schema = Schema(set)
>>> schema({1, 2}) == {1, 2}
True

Validation functions

Validators are simple callables that raise an Invalid exception when they encounter invalid data. The criteria for determining validity are entirely up to the implementation; it may check that a value is a valid username with pwd.getpwnam(), it may check that a value is of a specific type, and so on.

The simplest kind of validator is a Python function that raises ValueError when its argument is invalid. Conveniently, many builtin Python functions have this property. Here's an example of a date validator:

>>> from datetime import datetime
>>> def Date(fmt='%Y-%m-%d'):
...   return lambda v: datetime.strptime(v, fmt)
>>> schema = Schema(Date())
>>> schema('2013-03-03')
datetime.datetime(2013, 3, 3, 0, 0)
>>> try:
...   schema('2013-03')
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "not a valid value"
True

In addition to simply determining if a value is valid, validators may mutate the value into a valid form. An example of this is the Coerce(type) function, which returns a function that coerces its argument to the given type:

def Coerce(type, msg=None):
    """Coerce a value to a type.

    If the type constructor throws a ValueError, the value will be marked as
    Invalid.
    """
    def f(v):
        try:
            return type(v)
        except ValueError:
            raise Invalid(msg or ('expected %s' % type.__name__))
    return f

This example also shows a common idiom where an optional human-readable message can be provided. This can vastly improve the usefulness of the resulting error messages.

Dictionaries

Each key-value pair in a schema dictionary is validated against each key-value pair in the corresponding data dictionary:

>>> schema = Schema({1: 'one', 2: 'two'})
>>> schema({1: 'one'})
{1: 'one'}

Extra dictionary keys

By default, any additional keys in the data that are not in the schema will trigger exceptions:

>>> schema = Schema({2: 3})
>>> try:
...   schema({1: 2, 2: 3})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "extra keys not allowed @ data[1]"
True

This behaviour can be altered on a per-schema basis. To allow additional keys use Schema(..., extra=ALLOW_EXTRA):

>>> from voluptuous import ALLOW_EXTRA
>>> schema = Schema({2: 3}, extra=ALLOW_EXTRA)
>>> schema({1: 2, 2: 3})
{1: 2, 2: 3}

To remove additional keys use Schema(..., extra=REMOVE_EXTRA):

>>> from voluptuous import REMOVE_EXTRA
>>> schema = Schema({2: 3}, extra=REMOVE_EXTRA)
>>> schema({1: 2, 2: 3})
{2: 3}

It can also be overridden per-dictionary by using the catch-all marker token Extra as a key:

>>> from voluptuous import Extra
>>> schema = Schema({1: {Extra: object}})
>>> schema({1: {'foo': 'bar'}})
{1: {'foo': 'bar'}}

Required dictionary keys

By default, keys in the schema are not required to be in the data:

>>> schema = Schema({1: 2, 3: 4})
>>> schema({3: 4})
{3: 4}

Similarly to how extra keys work, this behaviour can be overridden per-schema:

>>> schema = Schema({1: 2, 3: 4}, required=True)
>>> try:
...   schema({3: 4})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "required key not provided @ data[1]"
True

And per-key, with the marker token Required(key):

>>> schema = Schema({Required(1): 2, 3: 4})
>>> try:
...   schema({3: 4})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "required key not provided @ data[1]"
True
>>> schema({1: 2})
{1: 2}

Optional dictionary keys

If a schema has required=True, keys may be individually marked as optional using the marker token Optional(key):

>>> from voluptuous import Optional
>>> schema = Schema({1: 2, Optional(3): 4}, required=True)
>>> try:
...   schema({})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "required key not provided @ data[1]"
True
>>> schema({1: 2})
{1: 2}
>>> try:
...   schema({1: 2, 4: 5})
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "extra keys not allowed @ data[4]"
True
>>> schema({1: 2, 3: 4})
{1: 2, 3: 4}

Recursive / nested schema

You can use voluptuous.Self to define a nested schema:

>>> from voluptuous import Schema, Self
>>> recursive = Schema({"more": Self, "value": int})
>>> recursive({"more": {"value": 42}, "value": 41}) == {'more': {'value': 42}, 'value': 41}
True

Extending an existing Schema

It often comes in handy to have a base Schema that is extended with more requirements. In that case you can use Schema.extend to create a new Schema:

>>> from voluptuous import Schema
>>> person = Schema({'name': str})
>>> person_with_age = person.extend({'age': int})
>>> sorted(list(person_with_age.schema.keys()))
['age', 'name']

The original Schema remains unchanged.

Objects

Each key-value pair in a schema dictionary is validated against each attribute-value pair in the corresponding object:

>>> from voluptuous import Object
>>> class Structure(object):
...     def __init__(self, q=None):
...         self.q = q
...     def __repr__(self):
...         return '<Structure(q={0.q!r})>'.format(self)
...
>>> schema = Schema(Object({'q': 'one'}, cls=Structure))
>>> schema(Structure(q='one'))
<Structure(q='one')>

Allow None values

To allow a value to be None as well, use Any:

>>> from voluptuous import Any

>>> schema = Schema(Any(None, int))
>>> schema(None)
>>> schema(5)
5

Error reporting

Validators must throw an Invalid exception if invalid data is passed to them. All other exceptions are treated as errors in the validator and will not be caught.

Each Invalid exception has an associated path attribute representing the path in the data structure to our currently validating value, as well as an error_message attribute that contains the message of the original exception. This is especially useful when you want to catch Invalid exceptions and give some feedback to the user, for instance in the context of an HTTP API.

>>> def validate_email(email):
...     """Validate email."""
...     if not "@" in email:
...         raise Invalid("This email is invalid.")
...     return email
>>> schema = Schema({"email": validate_email})
>>> exc = None
>>> try:
...     schema({"email": "whatever"})
... except MultipleInvalid as e:
...     exc = e
>>> str(exc)
"This email is invalid. for dictionary value @ data['email']"
>>> exc.path
['email']
>>> exc.msg
'This email is invalid.'
>>> exc.error_message
'This email is invalid.'

The path attribute is used during error reporting, but also during matching to determine whether an error should be reported to the user or if the next match should be attempted. This is determined by comparing the depth of the path where the check is, to the depth of the path where the error occurred. If the error is more than one level deeper, it is reported.

The upshot of this is that matching is depth-first and fail-fast.

To illustrate this, here is an example schema:

>>> schema = Schema([[2, 3], 6])

Each value in the top-level list is matched depth-first in-order. Given input data of [[6]], the inner list will match the first element of the schema, but the literal 6 will not match any of the elements of that list. This error will be reported back to the user immediately. No backtracking is attempted:

>>> try:
...   schema([[6]])
...   raise AssertionError('MultipleInvalid not raised')
... except MultipleInvalid as e:
...   exc = e
>>> str(exc) == "not a valid value @ data[0][0]"
True

If we pass the data [6], the 6 is not a list type and so will not recurse into the first element of the schema. Matching will continue on to the second element in the schema, and succeed:

>>> schema([6])
[6]

Multi-field validation

Validation rules that involve multiple fields can be implemented as custom validators. It's recommended to use All() to do a two-pass validation - the first pass checking the basic structure of the data, and only after that, the second pass applying your cross-field validator:

from voluptuous import All, Invalid, Schema

def passwords_must_match(passwords):
    if passwords['password'] != passwords['password_again']:
        raise Invalid('passwords must match')
    return passwords

schema = Schema(All(
    # First "pass" for field types
    {'password': str, 'password_again': str},
    # Follow up the first "pass" with your multi-field rules
    passwords_must_match
))

# valid
schema({'password': '123', 'password_again': '123'})

# raises MultipleInvalid: passwords must match
schema({'password': '123', 'password_again': 'and now for something completely different'})

With this structure, your multi-field validator will run with pre-validated data from the first "pass" and so will not have to do its own type checking on its inputs.

The flipside is that if the first "pass" of validation fails, your cross-field validator will not run:

# raises Invalid because password_again is not a string
# passwords_must_match() will not run because first-pass validation already failed
schema({'password': '123', 'password_again': 1337})

Running tests

Voluptuous uses pytest:

$ pip install pytest
$ pytest

To also include a coverage report:

$ pip install pytest pytest-cov 'coverage>=3.0'
$ pytest --cov=voluptuous voluptuous/tests/

Other libraries and inspirations

Voluptuous is heavily inspired by Validino, and to a lesser extent, jsonvalidator and json_schema.

pytest-voluptuous is a pytest plugin that helps in using voluptuous validators in asserts.

I greatly prefer the light-weight style promoted by these libraries to the complexity of libraries like FormEncode.

voluptuous's People

Contributors

alecthomas, aleks-v-k, antoni-szych-rtbhouse, balloob, bdraco, cdce8p, charlax, cjw296, epenet, hoxu, jd, jmwri, jonafato, kellerza, lucalianas, minboost, monopolis, nareshnootoo, ngaya-ll, nickovs, odedfos, rwieckowski, sirfz, spacegaier, svisser, tankanow, tbillon, thatneat, timgates42, tusharmakkar08

voluptuous's Issues

Required fields which exist but fail validation are present in multiple_invalid.errors

For example

from voluptuous import MultipleInvalid, Schema

schema = Schema({"a": 1}, required=True)
try:
    schema({"a": 2})
except MultipleInvalid as e:
    print(e.errors)
[voluptuous.Invalid('not a valid value for dictionary value'),
 voluptuous.Invalid('required key not provided')]

This doesn't seem to be the right behavior. If a required key is provided, the only failure tracked should be on the validation rule.

An official successor to voluptuous

Hi @alecthomas ,

Heavily inspired by your library, I've created a full-featured successor, good, which has all the features of voluptuous and many more, is fully documented, is compatible with voluptuous, and at the same time is both modular (example) and more performant.
The library is well-tested both with unit-tests and real experience.

So I'm wondering whether you could declare it an official successor to voluptuous?

Kind regards,
Mark

dictionary in lists is not raising errors

I am pretty sure that this is an error, but am not positive. Dictionaries don't appear to raise validation errors when they are in a list with other elements.

I think that the following should raise Invalid:

In [1]: from voluptuous import *
In [2]: v = Schema([dict, basestring])
In [3]: v(["foo", "bar"])
Out[3]: ['foo', 'bar']

Note that the following raises an error:

In [4]: v = Schema(dict)
In [5]: v("foo")
MultipleInvalid: expected dict
In [6]: v = Schema([dict])
In [7]: v(["foo"])
MultipleInvalid: invalid list value @ data[0]

Tested against PyPi version and trunk.

Wrong custom message

For example,

test.py
from voluptuous import Schema, All, Msg, Invalid
validate = Schema({'field' : Msg(All(str), 'Must be a string')})
validate({'field' : 1234})

Afterwards I get an exception with the message "Must be a string for dictionary value" instead of "Must be a string".

voluptuous.SchemaError: unsupported schema data type 'bool'

import voluptuous

voluptuous.Schema(bool)  # ok
voluptuous.Schema(False)  # not ok

Is there a reason True and False literals aren't handled? If not I'll send a patch.

It also looks like bool is being handled as a callable and not as a type. I'm not sure if this was intended.

Allow Invalid to be subclassed

We would like to be able to have our custom validators raise subclasses of Invalid so we can differentiate between the types of errors raised further up the stack. Voluptuous mostly supports this except in a few places where it creates a new Invalid() object from the error that was raised. This not only loses the subclass of Invalid that might have been raised, but it also destroys any arguments to the Subclass's constructor that might have been added.

Here are the two examples that I found; there could be more:

  1. https://github.com/alecthomas/voluptuous/blob/master/voluptuous.py#L290
    I don't have a solution for this case: if you need to alter the msg, I don't believe that's possible without recreating Invalid. Can we drop the alteration of the msg?

  2. https://github.com/alecthomas/voluptuous/blob/master/voluptuous.py#L290
    In this case, it seems like

except Invalid as e:
   e.path = path + e.path 
   raise e

would do.

I will gladly put together a PR but thought I'd ask about how to approach 1) above.

please tag v0.6

Hello,

I am packaging your script for Debian and the system requires the version to be tagged in the git repository. The system currently only finds 0.1, 0.2 and 0.3 :(

http://githubredir.debian.net/githubredir.cgi?author=alecthomas&project=voluptuous

Can you please:

git tag 0.6 8bd3ac007af6c0ad802ab5df72aa3c49134db7a3
git tag 0.6 283e9b5e76bd4d1f57bf475f0ba198c984dab45e
git tag 0.4 7b7f674c6fa604c718e0089323b6d2dfe21134bb
git push --tags

Then check that http://githubredir.debian.net/githubredir.cgi?author=alecthomas&project=voluptuous does show up the new versions :-]

Thanks!

Package restructure

I've been considering a restructure of the package layout. I propose that the markers, validators, and schema definition objects are separated into corresponding modules.

Note: This is intended to just get a rough idea of what a potential package layout could look like. I purposely did not copy over all potential declarations for a specific module.

voluptuous.schema

Contains the top level objects needed to initialize schema definitions and the exceptions/error handling as a result of the validation process.

class Undefined(object):
UNDEFINED = Undefined()

class Error(Exception):
class SchemaError(Error):
class Invalid(Error):
class MultipleInvalid(Invalid):
class Schema(object):

def _compile_scalar(schema):
def _iterate_mapping_candidates(schema):
def _iterate_object(obj):

class Object(dict):

etc...

voluptuous.markers

class Marker(object):

class Required(Marker):
class Optional(Marker):
class Exclusive(Optional):
class Inclusive(Optional):

etc...

voluptuous.validators

def Any(*validators, **kwargs):
def All(*validators, **kwargs):
def Boolean(v):
def Length(min=None, max=None, msg=None):

etc...

voluptuous.util

Kind of a catch-all for now, may not be necessary at first.

@contextmanager
def raises(exc, msg=None):

Why?

  1. Future maintenance and additions will be easier.
  2. Less potential for merge conflicts.
  3. Modules are neat.

Backwards compatibility

Introducing a package restructure would certainly break current usage. To mitigate this, perhaps voluptuous/__init__.py could contain aliases to all the currently available declarations.

from voluptuous import markers
from voluptuous import validators
...
Required = markers.Required
Optional = markers.Optional
...
Boolean = validators.Boolean

etc...

(I'm hoping) that could be a fix. If that doesn't work or is undesired for some reason, it may be a deal-breaker on the restructure.

Would you be open to a PR of this nature? I'd appreciate your thoughts. Thanks!

Named internal functions

Hi, I need to document my input file schema for end-users. So far I've been able to crawl the data structures using the same strategy as the Schema class. It works for the most part and I can generate docs with Sphinx.

The last hurdle is that most of the validator functions return an anonymous function called f which is not possible to identify. I've tried about ten different ways to do it in a robust manner, but they didn't work. I can think of a few remaining ways such as looking for internal variable names, etc, but think it would be too easy to break on updates.

I suppose the easiest way to identify them would be to name them the same name as the outer function. The second might be to give them a docstring with the name inside. Either would make identification easy.

Would you be up for such a change?
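A minimal sketch of the first suggestion, using only the standard library. The decorator `named_validator` and the validator `MaxLength` are hypothetical names and are not part of voluptuous; the idea is just to copy the factory's name onto the callable it returns.

```python
import functools


def named_validator(factory):
    """Hypothetical decorator: name the returned callable after its factory."""
    @functools.wraps(factory)
    def build(*args, **kwargs):
        validator = factory(*args, **kwargs)
        # Replace the anonymous name "f" with the factory's own name.
        validator.__name__ = factory.__name__
        validator.__qualname__ = factory.__qualname__
        return validator
    return build


@named_validator
def MaxLength(limit):
    """Reject values longer than ``limit`` (illustrative validator)."""
    def f(value):
        if len(value) > limit:
            raise ValueError("length of value must be at most %d" % limit)
        return value
    return f


check = MaxLength(3)
```

A documentation crawler could then read `check.__name__` to learn which factory produced the validator.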

Naming the data root

Since validation errors are often sent to the user, it's annoying to see errors like

expected int for dictionary value @ data['a']

"data"? What the hell is that? :)

I suggest to add an argument, data_name to Schema.__call__ so we can use it like this:

Schema({'a': int})({'a': None}, data_name="user")

and so the error renders to:

expected int for dictionary value @ user['a']

This is disputable, since it's easy to just render a different error message like this:

try:
    schema(data)
except Invalid as e:
    raise RuntimeError('{msg} @ {name}{path}'.format(
        msg=e.message, 
        name='user', 
        path=e.path))

What do you think?

Nested schema validation?

A suggestion perhaps? The possibility to say All(list, Validate(another_schema)), so that it validates all the elements of the list using another schema that you have defined?

How do I validate dates?

If there is no builtin validation available for dates, is it possible to add custom validations? Is it just a matter of writing another function like Any or All?

New release on PyPI

Would you consider releasing voluptuous on PyPI?
The current release is 5 months old!

I would love to use the REMOVE_EXTRA extra value!

Method of validating contents of exact list indices

As it stands, I have not been able to determine how to validate exact indices of lists.

Use case: Validating externally generated JSON that represents the concept of a tuple with an array.

I might get a data structure that looks like this:

[
  "hourly_report",
  17,
  [
    { "foo": "bar" },
    { "foo": "bar" }
  ],
  [ "option1", "option2", "option3" ]
]

I would want to validate that the first element of the outer list is a string, the second element is an integer within a specific range, the third element is a list of an arbitrary number of key/value pairs, the fourth element is a list of strings.

If you specify a list in a schema: v.Schema([ str, int, KeyValueSchema, StrListSchema]) (assuming that KeyValueSchema and StrListSchema were defined elsewhere) it won't work, because the interpretation of a list of values in a schema is to check each item of the list schema against each item of the list data, and if any of the schema items succeeds for each item of the data, that item passes validation.

Yes, it's poor data design. But I have to deal with LOTS of this legacy data, and being able to retrofit some validation would be nice.

Perhaps validators need to be passed an extra parameter by the list validator, containing the index we're looking at. Validators could ignore this parameter unless they're coded to look for it. Then one could do v.Schema([v.Index(1, str), v.Index(2,int) . . .

But that seems sloppy of itself. I'm not sure what the best way is to handle these cases.

I did post to the mailing list a week or two ago, regarding this, but nobody's weighed in on it.

potential bug: missing key thrown when passing empty extra dict

I am not sure if this is expected behavior or a bug, so I am submitting it for further consideration. Any guidance is much appreciated!

Demo setup

from voluptuous import Extra, Schema

schema = Schema({
    'a': int,
    'b': {Extra: object}
}, required=True)

sample_x = {
    'a': 2,
    'b': {
        'hello': 'world'
    }
}

try:
    schema(sample_x)
except Exception as e:
    print 'error: sample_x'
    print e

sample_y = {
    'a': 2,
    'b': {}
}

try:
    schema(sample_y)
except Exception as e:
    print 'error: sample_y'
    print e

Note:

  • Running Ubuntu 14.04, Python 2.7
  • Running voluptuous 0.8.5:
$ pip show voluptuous

---
Name: voluptuous
Version: 0.8.5
Location: /usr/local/lib/python2.7/dist-packages
Requires: setuptools

Observed

error: sample_y
required key not provided @ data['b'][<function Extra at 0x7fc74762fb90>]

Expected

Both sample_x and sample_y pass because the required key is provided with an empty dict (object) using the specified Extra setup

extra=False and unvalidated keys

We've come across an interesting scenario where our schema looks something like this:

schema = Schema({
  Required('some_key'): a_real_validator,
  'optional_key': no_op_validator 
}, extra=False)

...

def no_op_validator(value):
  return value

We don't have validation to perform on the optional key, but since extra is False, we can't just drop it from the schema. Is there a less ugly solution than the no_op_validator? Should we add something to voluptuous?

Conflicts with Python's builtin

I work @ludia on Dynamodb-mapper projects and plan to use Voluptuous as the validation layer. I love its extremely simple syntax and ability to validate nested structures.

Sadly, a couple of packaged validators (all and range) conflict with Python's builtins (http://docs.python.org/library/functions.html).

Would you accept a pull request converting the lowercasesyntaxforvalidator to CamelCaseSyntaxForValidators ?

Replace a value that is not validated and won't be accepted by the database

There are some db fields, such as date, datetime, or int that will only accept correct type or formatted values. If a type does not validate, it would be nice to replace it with something that the db could insert.

The way that All() works, according to its doc, is that the value passes from one validator to the next. However, I find that if one validator raises Invalid and does not return the value, then the next validator does not run.

Voluptuous has a Replace() function but this function does not raise an Invalid error message or check for validity.

So it seems like the choice is either to raise an Invalid exception or to replace the value, but not both. So, for example, in the case where I want to run several validators on a date input, and I'd like to catch and record/display the errors and also nullify the field if it's not representative of a valid date, I am not yet sure how to proceed.

Has anyone ever encountered this? I am using version 0.8.5

Order matters for list valued schema

    from voluptuous import Schema
    d_1 = {'tag': 'hi', 'cats': str}
    d_2 = {'tag': 'hi', 'cats': int}

    # Works since d_2 matches
    s = Schema({'val': [d_2, d_1]})
    s({'val': [{'tag': 'hi', 'cats': 2}]})

    # Doesn't work: the d_1 exception is allowed to bubble up without d_2 being tried.
    s = Schema({'val': [d_1, d_2]})
    s({'val': [{'tag': 'hi', 'cats': 2}]})

I would not expect order to matter for the list validations. For nested structures, there is a length check on path that causes the exception not to be swallowed when it should be. Just started trying to use this yesterday though, so maybe I am misunderstanding. Tried also putting d_1 and d_2 within an Any clause inside a list but the same issue happened.

A way to get all validation failures?

I see in the docs that you want voluptuous to fail fast when there are validation failures, but would you be open to a PR that configurably aggregates and raises all validation failures?

i.e.

schema = Schema({'name': str, 'number': int}, all_errors=True)
schema({'name': 123, 'number': '234'})
# Raises MultipleInvalid with both name and number Invalid exceptions in MultipleInvalid.errors

While I totally understand the desire for failing fast, it also makes for a bad user experience if you fill out a form, get 1 error, fix it and hit submit, and then get a different error.

I could try to pull together a PR that does this, but want to make sure you'd be willing to accept it before I roll my sleeves up.
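The aggregation idea can be sketched in plain Python. This is only an illustration of the behaviour being proposed, not Voluptuous code: `AggregatedInvalid` and `validate_all` are hypothetical names, and the schema is simplified to a flat mapping of key to expected type.

```python
class AggregatedInvalid(Exception):
    """Hypothetical exception that carries every failure, not just the first."""

    def __init__(self, errors):
        super().__init__("; ".join(errors))
        self.errors = errors


def validate_all(schema, data):
    """Check every key of a flat {key: type} schema and collect all failures."""
    errors = []
    for key, expected in schema.items():
        if key not in data:
            errors.append("required key not provided @ data[%r]" % key)
        elif not isinstance(data[key], expected):
            errors.append("expected %s @ data[%r]" % (expected.__name__, key))
    if errors:
        raise AggregatedInvalid(errors)
    return data


try:
    validate_all({"name": str, "number": int}, {"name": 123, "number": "234"})
except AggregatedInvalid as exc:
    collected = exc.errors  # both failures are reported in one pass
```

A form handler could then show every entry of `collected` at once instead of one error per submit.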

Make all messages i18n friendly

Voluptuous is awesome, but there are some baked in messages that I'd love to be able to translate with babel (or whatever other tools you might use). Not sure how one might approach this, but perhaps we can brainstorm.

Have the option to drop extra keys

Hi there,
In the special case where you want to accept only a subset of the JSON data, it would be nice to be able to drop the extra keys that are not needed.
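Without such an option, one workaround is to filter the data before validating it. A minimal sketch, assuming the set of wanted keys is known up front; `drop_extra_keys` is a hypothetical helper, not part of Voluptuous:

```python
def drop_extra_keys(data, allowed):
    """Return a copy of data containing only the keys listed in allowed."""
    return {key: value for key, value in data.items() if key in allowed}


payload = {"name": "Ada", "age": 36, "debug": True}
trimmed = drop_extra_keys(payload, allowed={"name", "age"})
```

The trimmed mapping can then be passed to a schema that rejects extra keys.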

Unicode validation fails if nested in a list in a dictionary

Hi,
I'm trying to validate a JSON object, thus all strings are in unicode. I have the following schema, which requires me to nest a dictionary of keys as integers and values as lists ('new_player'). When I use a unicode matching in the upper levels ('team', 'player'), it works fine, but when I try to do so in the list, I get an exception. How do I validate this correctly? Thanks.

schema = Schema({
        Required('team'): {
            Match(r'\d+'): All(unicode, Length(min = 1))
        },
        Required('player'): {
            Match(r'\d+'): unicode 
        },
        Required('division'): {
            Match(r'\d+'): All(int, Range(min = 1))
        },
        Required('new_player'): {
            Match(r'\d+'): [
                unicode
            ]
        }
    })

If the list is not nested within a dictionary, the matching is fine ('players'):

schema = Schema({
        Required('division_id'): All(int, Range(min = 1)),
        Required('name'): All(unicode, Length(min = 1)),
        Required('players'): [
            All(unicode, Length(min = 1))
        ]
    })

setuptools/Manifest dependency

Not sure if this helps or not, but I was able to avoid the setuptools and Manifest dependency (regarding the need to have the readme available) at install time like so:

# readme is needed at upload time, not install time
try:
    with open('README.rst') as readme:
        long_description = readme.read()
except IOError:
    long_description = ''

setup(
    # ....
    long_description=long_description,
)

Custom message for required keys breaks dictionary validation

The code in #19 broke ordinary dictionary validation. Consider:

git checkout 0762b991f155609520731ec88c8265fe587ca99b
nosetests

Failed example:
    schema({3: 4})
Expected:
    Traceback (most recent call last):
    ...
    InvalidList: required key not provided @ data[1]
Got:
    Traceback (most recent call last):
    ...
    AttributeError: 'int' object has no attribute 'msg'

Working on a fix.

Have the option to rename/refactor keys

You may want to save the validated json document to a database which may have a different structure.
An extra argument for determining what the final name of the field should be would save an extra refactoring on the data.

i.e. something like

>>> schema = Schema({Required('json_key'): [str, coerce_key("database_key")]})
>>> print schema({"json_key": "val"})
{
    "database_key": "val"
}

Coerce for Decimal Doesn't Catch decimal.InvalidOperation

from decimal import Decimal
from voluptuous import Coerce, MultipleInvalid, Schema

schema = Schema(Coerce(Decimal))
try:
    schema('t')
except MultipleInvalid as e:
    print("exception MultipleInvalid: " + str(e))

This outputs decimal.InvalidOperation: Invalid literal for Decimal: 't' instead of the expected message. Voluptuous only catches ValueError and TypeError in Coerce. Maybe it needs to catch InvalidOperation as well?
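The root cause is visible in the standard library's exception hierarchy: decimal.InvalidOperation derives from ArithmeticError, not from ValueError or TypeError. The sketch below checks that and shows one possible workaround, a hypothetical `coerce_decimal` callable that re-raises every conversion failure as ValueError:

```python
from decimal import Decimal, InvalidOperation

# InvalidOperation derives from ArithmeticError, so neither of these holds.
is_value_error = issubclass(InvalidOperation, ValueError)
is_type_error = issubclass(InvalidOperation, TypeError)


def coerce_decimal(value):
    """Hypothetical coercer that reports every conversion failure as ValueError."""
    try:
        return Decimal(value)
    except (InvalidOperation, ValueError, TypeError):
        raise ValueError("expected Decimal, got %r" % (value,))


try:
    coerce_decimal("t")
except ValueError as exc:
    message = str(exc)
```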

Does Voluptuous support conditional validation?

Really two main cases, for example:

  1. Let's say we have a token that is part of the schema. We first want to validate that the token is a valid hex value, and then validate that it is a token we actually know about in our datastore and that it is still valid. We only want to run the second check (which requires a DB lookup) if the token at least looks valid (hex).

  2. Validation requires two pieces of the schema. For example, a user is retrieving a list of comments for a feed item, and the schema payload requires the user_id (the authenticating user in this case) as well as the feed item_id. Both are needed in the datastore query that checks for read permission, so we'd need them both in a single callback.
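The first case amounts to short-circuit chaining: run the cheap check first and only reach the expensive one if it passes. A plain-Python sketch of that idea follows; `chain`, `is_hex` and `known_token` are hypothetical names, and the set lookup stands in for a real datastore query.

```python
def chain(*validators):
    """Run validators in order; the first failure stops the chain."""
    def run(value):
        for validator in validators:
            value = validator(value)
        return value
    return run


lookups = []  # records which tokens reached the (simulated) datastore


def is_hex(value):
    int(value, 16)  # raises ValueError for non-hex input
    return value


def known_token(value):
    lookups.append(value)  # stands in for an expensive database query
    if value not in {"deadbeef"}:
        raise ValueError("unknown token")
    return value


validate_token = chain(is_hex, known_token)
validate_token("deadbeef")

try:
    validate_token("not-hex")
except ValueError:
    pass  # rejected by is_hex, so known_token never ran
```

For the second case, a callable applied to the whole mapping rather than to a single key would see both user_id and item_id in one call.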

Validating against uncalled @message functions silently succeeds

>>> import voluptuous
>>> s = voluptuous.Schema(voluptuous.Url)
>>> s(1)
<function Url at 0x7fc518a101b8>

This is because voluptuous.Url is really decorator(f) from within @message(), which looks like a validator and returns the actual validator.

If voluptuous doesn't accept callables as values when validating, the message decorator could check that f is callable. Otherwise I'm not sure, maybe something like decorator._not_validator = True before returning decorator, and looking for that attribute when processing schemas.
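The pitfall can be reproduced with a simplified stand-in for such a decorator. This is a sketch of the pattern only, not Voluptuous's actual code; the `is_validator_factory` attribute is a hypothetical marker in the spirit of the `_not_validator` suggestion.

```python
def message(default):
    """Simplified stand-in for a message decorator factory (not Voluptuous's code)."""
    def decorator(f):
        def factory(msg=None):
            def validator(value):
                try:
                    return f(value)
                except ValueError:
                    raise ValueError(msg or default)
            return validator
        factory.is_validator_factory = True  # hypothetical marker attribute
        return factory
    return decorator


@message("expected a URL")
def url(value):
    if not str(value).startswith("http"):
        raise ValueError
    return value


# Forgetting the call: the value is swallowed as `msg` and a function comes back.
result = url(1)

# Calling the factory first yields the real validator.
check = url()
```

A schema compiler could look for the marker attribute and reject `url` when it is used without being called.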

InvalidList is a confusing exception name

When I get an InvalidList exception, I always think that I have a problem with a list. Instead, it simply means that multiple errors may be included.

Perhaps something less ambiguous could be devised? Some ideas:

  • Invalid: i.e. simply extend Invalid itself to contain optional child errors.
  • MultipleInvalid
  • Invalids
  • InvalidErrors

something else?

Install bug :( 0.8.2 from PyPI

Downloading/unpacking voluptuous
  Downloading voluptuous-0.8.2.tar.gz
  Running setup.py egg_info for package voluptuous
    WARNING: Could not locate pandoc, using Markdown long_description.
    Traceback (most recent call last):
      File "<string>", line 16, in <module>
      File "/tmp/pip_build_root/voluptuous/setup.py", line 15, in <module>
        long_description = open('README.md').read()
    IOError: [Errno 2] No such file or directory: 'README.md'
    Complete output from command python setup.py egg_info:
    WARNING: Could not locate pandoc, using Markdown long_description.
    Traceback (most recent call last):
      File "<string>", line 16, in <module>
      File "/tmp/pip_build_root/voluptuous/setup.py", line 15, in <module>
        long_description = open('README.md').read()
    IOError: [Errno 2] No such file or directory: 'README.md'

I believe this file needs to be in the MANIFEST, or the open() can be wrapped in a try-except. It's not needed at install time.

Order should not matter for dict key value schema

Related to the "Order matters for list valued schema" bug. The order of the keys should not matter in a dict...

Here is code to reproduce the bug:

import yaml
import voluptuous
import ipdb

_service_config = {
        voluptuous.Required('display_name'): str,
        'depends_on':
            {str: voluptuous.Any('init','exist')},
        'node_init_functions': [str],
        'web_init_functions': [str]
        }

_event_handler = {
        str:
            {voluptuous.Required('handler'): str,
                voluptuous.Required('config'): str
            }
        }

_sandbox_config1 = {
        voluptuous.Required('services'): 
            {str:
                voluptuous.Any(str, None)
            },
         'event_handler': _event_handler,
         voluptuous.Optional(str): _service_config
        }

_sandbox_config2 = {
        voluptuous.Optional(str): _service_config,
        voluptuous.Required('services'): 
            {str:
                voluptuous.Any(str, None)
            },
         'event_handler': _event_handler,
        }

_sandbox_config3 = {
        'event_handler': _event_handler,
        voluptuous.Optional(str): _service_config,
        voluptuous.Required('services'): 
            {str:
                voluptuous.Any(str, None)
            },
        }

_sandbox_config4 = {
        'event_handler': _event_handler,
        voluptuous.Required('services'): 
            {str:
                voluptuous.Any(str, None)
            },
        voluptuous.Optional(str): _service_config,
        }

_sandbox_config5 = {
        voluptuous.Optional(str): _service_config,
        'event_handler': _event_handler,
        voluptuous.Required('services'): 
            {str:
                voluptuous.Any(str, None)
            },
        }

if __name__ == '__main__':                
    config = yaml.safe_load(
            """
            services:
                FriendSync: friendsync

            event_handler:
                TrackingEventHandler:
                    handler: wd.tracking:WDTrackingEventHandler
                    config: wd.tracking:WDConsumerConfig

            Barter:
                display_name: Trading
            """)

    print('----> config: %s\n' % config)
    print('----> _sandbox_config1: %s\n' % _sandbox_config1)
    print('----> _sandbox_config2: %s\n' % _sandbox_config2)
    print('----> _sandbox_config3: %s\n' % _sandbox_config3)
    print('----> _sandbox_config4: %s\n' % _sandbox_config4)
    print('----> _sandbox_config5: %s\n' % _sandbox_config5)

    validate1 = voluptuous.Schema(_sandbox_config1, required=False, extra=False)
    validate2 = voluptuous.Schema(_sandbox_config2, required=False, extra=False)
    validate3 = voluptuous.Schema(_sandbox_config3, required=False, extra=False)
    validate4 = voluptuous.Schema(_sandbox_config4, required=False, extra=False)
    validate5 = voluptuous.Schema(_sandbox_config5, required=False, extra=False)

    ipdb.set_trace()
    validate1(config)

    #ipdb.set_trace()
    #validate2(config)

    #ipdb.set_trace()
    #validate3(config)

    #ipdb.set_trace()
    #validate4(config)

    #ipdb.set_trace()
    #validate5(config)

Whole schema validator

Field validators can be added, but for now there is no way to validate a schema as a whole (like Django's clean).
It would be nice to have something like:

def whole_schema_validator(data):
    if data['number1'] + data['number2'] > 5:
        raise Invalid('number1 + number2 must not exceed 5')
    return data

Schema({
    'number1': int,
    'number2': int,
}, validators=[whole_schema_validator])

COPYING missing in package in pypi

Hello, the file COPYING is missing from the tarball available on PyPI. You have to add it to MANIFEST.in. As this file contains the license text, this is particularly important.

default_to() with dict doesn't appear to work

I was happy to find the default_to() validator to return defaults while validating a yaml document.

Unfortunately it only seems to work in literals and lists and is not able to create dictionary keys. Is that possible? Or perhaps I'm using it wrong?

    schema1 = Schema({
        'sudo':  default_to(True),
    })
    print schema1({})

    schema2 = Schema([default_to(True)])
    print schema2([None])

prints

{}
[True]

I also tried giving the field as None, but then it breaks what I really want to do:

    schema1 = Schema({
        'sudo':  all(bool, default_to(True)),
    })
    print schema1({'sudo':None})
voluptuous.InvalidList: expected bool for dictionary value @ data['sudo']

Hashable markers

Currently, markers are not correctly hashable and comparable, so after a schema is defined, it's impossible to make changes to it:

from voluptuous import Optional, Any
d['a'] = Any(None, d['a'])

This raises an exception, since Optional('a') != 'a' and their hashes differ:

Traceback (most recent call last):
  File "test.py", line 17, in <module>
    d['a'] = Any(None, d['a'])
KeyError: 'a'

I suggest to add the following to markers:

class Marker(object):
    # ...
    def __hash__(self):
        return hash(self.schema)

    def __eq__(self, other):
        return self.schema == (other.schema if isinstance(other, Marker) else other)

Now they are correctly hashable and compare equal with strings.

One caveat: if the schema argument is not hashable, this will fail. But those markers are only used as dictionary keys anyway, which must be hashable.
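The effect of the suggested methods can be checked with a stand-alone sketch. The `Marker` and `Optional` classes below are minimal stand-ins that model only hashing and equality; they are not the library's classes.

```python
class Marker(object):
    """Minimal stand-in for a schema marker; only hashing and equality are modeled."""

    def __init__(self, schema):
        self.schema = schema

    def __hash__(self):
        return hash(self.schema)

    def __eq__(self, other):
        return self.schema == (other.schema if isinstance(other, Marker) else other)


class Optional(Marker):
    pass


d = {Optional('a'): int}
d['a'] = (None, d['a'])  # lookup and assignment both find the marker key
```

Because the marker and the plain string hash and compare equal, the assignment updates the existing entry and the original Optional key is kept.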

How do I apply a validation condition to keys?

Let's say part of a dict is user-supplied data to be inserted into a database. I do not know in advance what the keys will be named or how many there will be. I only know they will follow a given syntax. Is there any way to achieve this with voluptuous?

Custom error message lost in exception

Sorry to bug you again. I have two small issues.

One is that the validation function I wrote doesn't work if I use the inner f() function pattern from the readme. Any idea why that is? I see many of those in voluptuous.py have it yet they do work. If I use it, the inner function is returned but has done nothing.

Second is that my validation error message gets lost. Even with the msg=None pattern from the readme.

from voluptuous import Schema, all, any, Invalid
def check(v):
    #~ def f(v):
        if len(v.split()) != 3:
            raise Invalid('Expected length of 3.')
        return v
    #~ return f

schema = Schema([ all(any(None, basestring), check) ])

print schema([ 'one two three four' ])

The exception traceback returns this:

voluptuous.InvalidList: invalid list value @ data[0]

Porting to new version issue

I am trying to port my program from .4x to the latest version. Unfortunately, I was subclassing Schema to augment validate_dict. In the new version I no longer have access to it, since all of those functions were moved inside others. What should I do?

float as a validator

Hi Alec,

I'm using float as a validator for a value in a dictionary:

schema = Schema({"test": float})
schema({"test": 123})

I understand that 123 is technically an integer, but the float builtin function is capable of casting an integer to a float, so it was unexpected to me that the above code failed. If I replace float with lambda x: float(x) it works fine, so I imagine this type is checked for manually and treated differently. Is it fair to assume that the set of specific "types" (float, int, etc.) is validated more strictly than the builtin functions associated with those types?
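The observed difference is consistent with a type being matched by an instance check while any other callable is simply called. The plain-Python sketch below shows the two operations side by side; it assumes that is how the two kinds of validator are treated and does not use Voluptuous itself.

```python
value = 123

# A type used as a validator is checked with isinstance, which rejects an int.
passes_type_check = isinstance(value, float)

# A callable used as a validator is simply called, so conversion succeeds.
converted = (lambda x: float(x))(value)
```

When conversion is the intent, Coerce(float) states it explicitly.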

Update Pypi version of Voluptuous

Any chance you can update the version of Voluptuous on PyPI? The latest error_message commit, for example, hasn't been deployed yet.

Thanks!

List length validation breaks paths in Invalid exceptions

I'm using version 0.6 of Voluptuous, installed from PyPI.

I'm trying to validate a schema with a list of subschemas. Everything works fine unless I also want to validate that there is at least one list item.

Schema without list length validation:

import voluptuous as v

schema = v.Schema({
    v.Required('items'): [{
        v.Required('foo'): str,
    }]
})

Now if I call the schema with following data:

schema({'items': [{}]})

I'm getting MultipleInvalid: required key not provided @ data['items'][0]['foo'] which is ok.

Schema with list length validation:

import voluptuous as v

schema = v.Schema({
    v.Required('items'): v.All([{
        v.Required('foo'): str,
    }], v.Length(min=1))
})

Now if I call this schema with the same data as before, I get MultipleInvalid: required key not provided for dictionary value @ data['items']. It doesn't provide the correct path to the foo key, and there is no way to detect which validator caused the error.

Doing more research I've found that using All is the cause of the issue, as I'm getting the same incorrect validation error using schema like this:

import voluptuous as v

schema = v.Schema({
    v.Required('items'): v.All([{
        v.Required('foo'): str,
    }])
})

Feature request: pretty printing voluptuous

It would be rather nice if voluptuous knew how to pretty-print a schema. pprint is rather horrible at pretty printing. json.dumps with indent does a much nicer job, but is (for obvious reasons) not compatible with voluptuous...

AttributeError on callable validator.

Not sure if I'm doing the right thing here...

The doc page on callable validators is a bit confusing. First it says to raise Invalid, then it says the stdlib raises ValueError, and later on it mentions MultipleInvalid. Which to use?

The only one that seems to get a response is MultipleInvalid, but when I use it I get an error. My sample code:

def validate_varname(value):
    #~ raise ValueError('blah blah...')
    #~ raise Invalid('blah blah...')
    raise MultipleInvalid('blah blah...')
    return value

vars_schema = {
    validate_varname:   basestring,
}

With that I get this exception:

Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/pave/main.py", line 262, in load_data
    data = schema(data)
File "/usr/local/lib/python2.7/dist-packages/voluptuous.py", line 207, in __call__
    return self._compiled([], data)
File "/usr/local/lib/python2.7/dist-packages/voluptuous.py", line 418, in validate_dict
    return base_validate(path, iteritems(data), out)
File "/usr/local/lib/python2.7/dist-packages/voluptuous.py", line 268, in validate_mapping
    out[new_key] = cvalue(key_path, value)
File "/usr/local/lib/python2.7/dist-packages/voluptuous.py", line 418, in validate_dict
    return base_validate(path, iteritems(data), out)
File "/usr/local/lib/python2.7/dist-packages/voluptuous.py", line 257, in validate_mapping
    new_key = ckey(key_path, key)
File "/usr/local/lib/python2.7/dist-packages/voluptuous.py", line 535, in validate_callable
    raise Invalid(e.msg, path + e.path)
File "/usr/local/lib/python2.7/dist-packages/voluptuous.py", line 168, in msg
    return self.errors[0].msg
AttributeError: 'str' object has no attribute 'msg'

What am I doing wrong?

Validate all keys over nested dictionaries

I'm looking to do something like the following:

Given a JSON object parsed as a number of nested dictionaries, make sure no value of string type has length > 100. Just some basic overflow checking.

Is it possible to accomplish this with the current setup? I can see how it might be done if it was a single dictionary of base types, but if the dictionaries can be nested arbitrarily deep, it gets more complicated. It looks like voluptuous can't currently handle this situation?
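Since validators are plain callables, one option is a recursive function that walks the parsed structure itself. A minimal sketch, assuming the data consists only of dicts, lists and scalar values; `check_string_lengths` is a hypothetical helper, not a Voluptuous validator.

```python
def check_string_lengths(data, limit=100, path=()):
    """Recursively walk dicts and lists, rejecting any string longer than limit."""
    if isinstance(data, str):
        if len(data) > limit:
            raise ValueError("string too long @ %s" % "/".join(map(str, path)))
    elif isinstance(data, dict):
        for key, value in data.items():
            check_string_lengths(value, limit, path + (key,))
    elif isinstance(data, list):
        for index, value in enumerate(data):
            check_string_lengths(value, limit, path + (index,))
    return data


document = {"user": {"name": "Ada", "tags": ["x" * 101]}}
try:
    check_string_lengths(document)
except ValueError as exc:
    error = str(exc)
```

Only values are checked here; dictionary keys would need a similar test inside the dict branch.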

way to get at the args of a failed validator?

for example, if i have a validator that fails for a Length(), how can i look up what the min and max are, and which of those triggered it? i'm thinking especially of the case of 'code that doesn't want to know validation details other than by digging in the raised error'.

if it doesn't exist as a feature, it would be nice to have these to have a chance to make a better message for the user.

i think i remember that spring validation errors does something like that, it might be an idea worth considering. http://docs.spring.io/spring/docs/2.5.6/api/org/springframework/validation/Errors.html

thanks! (i really like this library btw.)

Jython doesn't like 'from .voluptuous import *'

Jython, which apparently still follows Python 2.5 rules for import (even in 2.7 beta), does not like the combination of a relative import with import *, as is now done in the top-level init:

from .voluptuous import *

I have a few ideas on this:

  1. This is a Jython bug, so fix it upstream there.
  2. Who really cares about Jython anyway.
  3. Maybe an absolute import would work, except that it doesn't:
    from voluptuous.voluptuous import *
  4. * imports are sort of nasty anyway, so let's explicitly import:
    from .voluptuous import (Undefined, SchemaError, Invalid, MultipleInvalid, Schema, Marker, Optional, Required,
                             Extra, extra, Msg, message, truth, Coerce, IsTrue, IsFalse, Boolean, Any, All, Match,
                             Replace, Url, IsFile, IsDir, PathExists, Range, Clamp, Length, Lower, Upper,
                             Capitalize, Title, DefaultTo)

Well, that's not very pretty, is it?

Of these, I'm most inclined to suggestion 1.
