seperman / deepdiff

DeepDiff: Deep Difference and search of any Python object/data. DeepHash: Hash of any object based on its contents. Delta: Use deltas to reconstruct objects by adding deltas together.

Home Page: http://zepworks.com

License: Other

Python 99.99% Shell 0.01%

python tree deep-search repetition difference comparison report-repetition nested recursive diff

deepdiff's Introduction

Hello!

My name is Sep. Welcome to my page!

If you deal with dirty data that is entered by humans (e.g., customer data or product data), I am building a tool for it: Qluster. Please ping me, and I will set you up with an account. I would love to hear your feedback!

Please check out my blog, Zepworks, for articles that I may publish from time to time.

Updates

DeepDiff 7.0.0 - Apr 8 2024

DeepDiff 7 comes with an improved delta object. Delta to flat dictionaries has undergone a major change. We have also introduced Delta serialize to flat rows.
Subtracting delta objects has dramatically improved at the cost of holding more metadata about the original objects.
When verbose=2, and the "path" of an item has changed in a report between t1 and t2, we include it as new_path.
path(use_t2=True) returns the correct path to t2 in any reported change in the tree view.
Python 3.7 support is dropped and Python 3.12 is officially supported.

⚡ Fun fact

Try drinking olive oil if you are too hungry. I am, as I write this.

deepdiff's People

Contributors

Stargazers

Watchers

Forkers

zhenxingwu brbsix andrewyoung1991 sipp11 hongphong asfaltboy wangfenjin joshainglis ki11roy nfvs pivotal-energy-solutions zeapo ebw44 pombredanne b-jazz victorhahncastell bcpendo powergo gwgrant fellow haidenayitou sandysuehe bernhard10 mckapitoshka finnhughes bbetter173 martyhub robertglen yeahydq debasishmaji adrianer moloney chuck-lee heavybear richardjtorres phiweger testerman77 bgro gitter-badger v-a-kernel florianheigl serv-inc polozhevets self-maurya maxrothman dangelsaurus mjsilva sgenoud idokor sreecodeslayer ddyuewang solertis mthaddon blue-sky-evereyday land-pack williamd67 amespan seersucker mecforlove pmbi vault-the jmccreight rafic92 vertexes thangphuocnguyen immarvin mark90 soleronline itsx brianmaissy joshuathayer sh9901 hanxiaole devipriyasarkar jyhess skamdart xuweibj margaretgorguissian chunimuni mellon85 simonfontana coldino solidest antimirov hamidarrivy jayvdb spider2449 bopamo stastechno moult fi-do hugovk hhaiwzz oholimoli euan-tilley van-ess0 nathanielobrown mkaras93 chkothe laike9m

deepdiff's Issues

no module named helper

Python 2.7 and virtualenv.
Installed latest 2.5.0.

Getting exception: no module named helper.
file: diff.py, line: 14 from deepdiff.helper import py3.
Same for line 15.
Same for contenthash.py, search.py.

Removing the deepdiff. prefix solved this issue. (i.e. from helper import py3).

Comparing based on hashes does not work with the significant_digits option

Unfortunately I currently don't know how to best solve this.
The following options seem possible:
-) Change the hashing function to only take n significant digits into account.
-) Use comparison without hashes if the significant_digits option is present
-) Do the comparison with unchanged hashes and postprocess the result.

My current workaround is to postprocess the diff result like this:

    diff = DeepDiff(a, b, significant_digits=12, ignore_order=True)
    if "set_item_added" in diff and "set_item_removed" in diff:
        while True:
            for item in diff["set_item_removed"]:
                if item in diff["set_item_added"]:
                    diff["set_item_removed"].remove(item)
                    diff["set_item_added"].remove(item)
                    break
            else:
                break

This works because the dictionary returned by diff contains strings, which are formatted having only 12 digits.
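
For illustration, the first option (a hash that only takes n digits into account) could look roughly like the sketch below. It uses only the standard library, the helper name hash_number is hypothetical, and it is not how DeepDiff hashes values internally; it only shows the idea of formatting a number before hashing it.

```python
import hashlib

def hash_number(value, significant_digits):
    """Hash a number after formatting it to a fixed number of decimal places."""
    text = "{:.{prec}f}".format(value, prec=significant_digits)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

a = 0.1 + 0.2  # differs from 0.3 only far beyond the 12th decimal place
b = 0.3

same_at_12 = hash_number(a, 12) == hash_number(b, 12)
same_at_17 = hash_number(a, 17) == hash_number(b, 17)
```

With 12 digits both values format to the same string and therefore hash identically; with 17 digits the floating point noise becomes visible and the hashes differ.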

Ignore order for nested iterables

I have a list that contains nested lists. Is there any way to make ignore_order recursive?
Example:

>>> a = [1, 2, [3, 4]]
>>> b = [[4, 3], 2, 1]
>>> DeepDiff(a, b, ignore_order=True)
{'iterable_item_removed': {'root[2]': [3, 4]}, 'iterable_item_added': {'root[0]': [4, 3]}}
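
One way to picture what a recursive ignore_order would mean is to reduce both inputs to an order-insensitive form before comparing them. The sketch below is a standard-library illustration under that assumption; the function name canonical is hypothetical and this is not DeepDiff's algorithm. Note that it turns tuples into lists and sorts children by their repr.

```python
def canonical(obj):
    """Return an order-insensitive form of nested lists, tuples, and dicts."""
    if isinstance(obj, dict):
        # Visit keys in a stable order so the repr of the result is deterministic.
        return {k: canonical(obj[k]) for k in sorted(obj, key=repr)}
    if isinstance(obj, (list, tuple)):
        # Canonicalize children first, then sort them by their repr.
        return sorted((canonical(item) for item in obj), key=repr)
    return obj

a = [1, 2, [3, 4]]
b = [[4, 3], 2, 1]
nested_equal = canonical(a) == canonical(b)
```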

Error when comparing lists of sets if ignore_order=True

Error when comparing lists of sets if ignore_order=True.
For example,
DeepDiff([{1}, {2}, {3}], [{1}, {2}, {3}, {4}], ignore_order=True)
printed
Warning: Can not produce a hash for set([1]) item in root and thus not counting this object.
Warning: Can not produce a hash for set([2]) item in root and thus not counting this object.
Warning: Can not produce a hash for set([3]) item in root and thus not counting this object.
Warning: Can not produce a hash for set([1]) item in root and thus not counting this object.
Warning: Can not produce a hash for set([2]) item in root and thus not counting this object.
Warning: Can not produce a hash for set([3]) item in root and thus not counting this object.
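
The warnings come from the fact that a set is not hashable. As a minimal standard-library sketch of one possible workaround (the helper name count_items is hypothetical, and this is not the library's fix), each set can be converted to a frozenset before counting:

```python
from collections import Counter

def count_items(items):
    """Count list items, converting unhashable sets to hashable frozensets."""
    return Counter(frozenset(i) if isinstance(i, set) else i for i in items)

t1 = [{1}, {2}, {3}]
t2 = [{1}, {2}, {3}, {4}]

added = count_items(t2) - count_items(t1)
removed = count_items(t1) - count_items(t2)
```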

TextDiff incorrectly represents dictionary_item_removed when value is None

Before 3.0 the following test would pass but now fails:

def test_none_item_removed(self):
    t1 = {1: None, 2: 2}
    t2 = {2: 2}
    ddiff = DeepDiff(t1, t2)
    result = {
        'dictionary_item_removed': {'root[1]'}
    }
    self.assertEqual(ddiff, result)

with:
AssertionError: {'dictionary_item_removed': set(['root'])} != {'dictionary_item_removed': set(['root[1]'])}

I'm not sure where or how to fix it. The closest I've come is tweaking auto_generate_child_rel to remove the if self.down.t1 is not None: check, but that causes some other tests to fail.

In path(), because level.t1_child_rel is None, we don't append the dictionary key to the path, so it looks like the whole dictionary was removed.

export in real JSON

I currently use deepdiff to report differences between versioned objects, and I store the result using pymongo. However, some of the values are not JSON compliant, so I can't store them without converting them first. For instance, when I add new dictionaries to my object, the resulting set of added items is not serializable as is (I have to convert it to a list). It would be nice to have an option to output a JSON-compliant object.
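
Until such an option exists, the conversion can be done at serialization time. The sketch below uses the default hook of json.dumps from the standard library; the dictionary is hand-written to resemble a diff result that contains a set, and the helper name to_jsonable is hypothetical.

```python
import json

def to_jsonable(value):
    """Fallback for json.dumps: turn sets into sorted lists."""
    if isinstance(value, (set, frozenset)):
        return sorted(value, key=repr)
    raise TypeError("Cannot serialize %r" % (value,))

# A hand-written dictionary shaped like a diff result that contains a set.
result = {"dictionary_item_added": {"root['b']", "root['a']"}}
as_json = json.dumps(result, default=to_jsonable)
```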

Problem with ignore_order and lists

Following example did not work for me:

deepdiff.DeepDiff(
{"a":[[{"b":2,"c":4},{"b":2,"c":3}]]},
{"a":[[{"b":2,"c":3},{"b":2,"c":4}]]},
ignore_order=True
)
output:
{'iterable_item_added': {"root['a'][0]": [{'b': 2, 'c': 3}, {'b': 2, 'c': 4}]},
'iterable_item_removed': {"root['a'][0]": [{'b': 2, 'c': 4},
{'b': 2, 'c': 3}]}}

Feature: Similarity instead of difference

It would be great to be able to compare two objects for similar key values, for example:
d1 = {"keyA": {"value": 10}, "keyB": {"value": 20}}
d2 = {"keyA": {"value": 30}, "keyB": {"value": 20}}

diff = DeepDiff(d1, d2, reverse_diff=True)
>> {"similar_values": ["[keyB][value]"]}

I'm working on a tool that needs to compare Python dictionaries and not only find diffs (which this library does perfectly!), but also detect overlapping values.
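
To make the request concrete, here is a small standard-library sketch of the behaviour I have in mind. The function name similar_paths and the root[...] path format are only assumptions for illustration; nothing like this exists in the library as far as I know.

```python
def similar_paths(d1, d2, path="root"):
    """Return paths whose leaf values are equal in both nested dictionaries."""
    matches = []
    for key in d1.keys() & d2.keys():
        child = "%s[%r]" % (path, key)
        v1, v2 = d1[key], d2[key]
        if isinstance(v1, dict) and isinstance(v2, dict):
            matches.extend(similar_paths(v1, v2, child))
        elif v1 == v2:
            matches.append(child)
    return sorted(matches)

d1 = {"keyA": {"value": 10}, "keyB": {"value": 20}}
d2 = {"keyA": {"value": 30}, "keyB": {"value": 20}}
common = similar_paths(d1, d2)
```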

set comparison broken sometimes

>>> deepdiff.DeepDiff(set(), set([1]))
{'set_item_added': set(['root[1]'])}
>>> deepdiff.DeepDiff(set(), set([None]))
{}

Recursion into dictionaries is limited to type dict

Recursion into dictionaries is currently limited to just those objects of (or inheriting from) type dict, as per deepdiff.py line 441

elif isinstance(t1, dict): self.__diff_dict(t1, t2, parent, parents_ids)

This causes a problem for my intended use of your excellent package, because other than my top-most dictionary, all of my nested dictionaries are subclassed from collections.MutableMapping, thus isinstance returns False causing my dictionary-like objects to not get processed as dictionaries.

Instead of checking for type dict, may I suggest checking for collections.Mapping (the base class for mappings, i.e., dictionaries)? E.g.:

elif isinstance(t1, collections.Mapping): self.__diff_dict(t1, t2, parent, parents_ids)

That would allow recursion into any object that supports dictionary lookups.
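
The difference between the two checks can be seen with a small dictionary-like class. This is a standalone sketch: the class name Settings is made up, and on current Python versions the abstract base classes are imported from collections.abc rather than collections.

```python
from collections.abc import Mapping, MutableMapping

class Settings(MutableMapping):
    """A dictionary-like class that does not inherit from dict."""
    def __init__(self, **items):
        self._items = dict(items)
    def __getitem__(self, key):
        return self._items[key]
    def __setitem__(self, key, value):
        self._items[key] = value
    def __delitem__(self, key):
        del self._items[key]
    def __iter__(self):
        return iter(self._items)
    def __len__(self):
        return len(self._items)

s = Settings(a=1)
is_dict = isinstance(s, dict)        # the dict check skips this object
is_mapping = isinstance(s, Mapping)  # the ABC check accepts it
```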

Handling of renames

For my migration system, one of the more common use cases is renaming a dictionary key (which in jsonschema is often the name of an object's field). What I get from that is:

{
    "dictionary_item_removed": {"root['properties']['obatzter']": {"description": "Custom user notes", "type": "string", "format": "html", "title": "User notes"}}, 
    "dictionary_item_added": {"root['properties']['apfelsalat']": {"description": "Custom user notes", "type": "string", "format": "html", "title": "User notes"}}
}

Where the fieldname "obatzter" was renamed to "apfelsalat". The above syntax is not so nice for migration, so I'd love to rather get a diff that says:

{
    "dictionary_item_renamed": {"root['properties']['obatzter']": "root['properties']['apfelsalat']"}
}

Might that please be possible?
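
In the meantime, a rename can be inferred by post-processing the result. The sketch below assumes the diff maps paths to values as in the output above; the function name detect_renames and the dictionary_item_renamed idea are hypothetical, and pairing by equal value is only a heuristic.

```python
def detect_renames(diff):
    """Pair removed and added items that carry an identical value."""
    removed = dict(diff.get("dictionary_item_removed", {}))
    added = dict(diff.get("dictionary_item_added", {}))
    renamed = {}
    for old_path, old_value in list(removed.items()):
        for new_path, new_value in list(added.items()):
            if old_value == new_value:
                renamed[old_path] = new_path
                del removed[old_path]
                del added[new_path]
                break
    return renamed, removed, added

field = {"description": "Custom user notes", "type": "string"}
diff = {
    "dictionary_item_removed": {"root['properties']['obatzter']": field},
    "dictionary_item_added": {"root['properties']['apfelsalat']": dict(field)},
}
renamed, removed, added = detect_renames(diff)
```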

Support dates

For some reason datetimes are supported but not dates...

Regex support in exclude_path

I have two lists of dictionaries

list_1 = [{'a': 1, 'b': 2}, {'c': 4, 'b': 5}]
list_2 = [{'a': 1, 'b': 3}, {'c': 4, 'b': 5}]

and I want to ignore key 'b' in all dictionaries while comparing them.
I think that it would be pretty nice if I could just use

diff = DeepDiff(list_1, list_2, exclude_regex=r"root\[\d+\]\['b'\]")

or something like that.
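
As a workaround sketch, the same pattern can be applied to the reported paths after the fact. This uses only the re module; the helper name filter_paths and the sample paths are made up for illustration.

```python
import re

def filter_paths(paths, exclude_patterns):
    """Drop every path that matches one of the compiled patterns."""
    return [p for p in paths if not any(rx.search(p) for rx in exclude_patterns)]

exclude = [re.compile(r"root\[\d+\]\['b'\]")]
changed = ["root[0]['b']", "root[1]['a']", "root[12]['b']"]
kept = filter_paths(changed, exclude)
```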

Null string DeepSearch bug

So, I found this bug while I was working on finding empty values inside the complete dictionary (given below)

one = {
	"num_list":[1,3,2], 
	"long_dict":{"empty":"", "string": [2,0,0], "somewhere": "around"},
	"alpha_list":["","","a","b"]
}

Upon DeepSearch(one, "", verbose_level=2), I expect it to show "root['long_dict']['empty']": '', "root['alpha_list'][0]": '', and "root['alpha_list'][1]": '', but for some reason I get the output below.

{'matched_paths': {"root['long_dict']": {'empty': '',
   'somewhere': 'around',
   'string': [2, 0, 0]},
  "root['long_dict']['empty']": '',
  "root['long_dict']['somewhere']": 'around',
  "root['long_dict']['string']": [2, 0, 0],
  "root['alpha_list']": ['', '', 'a', 'b'],
  "root['something somewhere']": [1, 3, 2]},
 'matched_values': {"root['long_dict']['empty']": '',
  "root['long_dict']['somewhere']": 'around',
  "root['alpha_list'][0]": '',
  "root['alpha_list'][1]": '',
  "root['alpha_list'][2]": 'a',
  "root['alpha_list'][3]": 'b'}}

Here, we can see that the values 'around', 'a', and 'b' are matched as an empty string. 🤔 ?

What I am trying to achieve is really simple: I just want to look for empty values anywhere in the object concerned (possibly in a loop) and then report them based on 'matched_values'.

Difference from RFC 6902

It would be great to have a summary of how the library differs from the RFC 6902 standard (https://tools.ietf.org/html/rfc6902) and its Python implementations.

Tuples with equal content show diff even when ignore_order=True

{'values_changed': {"root['parent_tuple_list']": {'new_value': '((12, 15, 14), '
                                                               '(15, 13, 14), '
                                                               '(12, 13, 14), '
                                                               '(15, 13), (15, '
                                                               '16), (16,), '
                                                               '(), (6,), '
                                                               '(6,), (), '
                                                               '(9,), (9,), '
                                                               '(7,), (7, 8), '
                                                               '(8,), (8,), '
                                                               '(10, 11))',
                                                  'old_value': '((12, 13, 14), '
                                                               '(12, 15, 14), '
                                                               '(15, 13, 14), '
                                                               '(15, 16), '
                                                               '(16,), (15, '
                                                               '13), (), (6,), '
                                                               '(6,), (), '
                                                               '(9,), (9,), '
                                                               '(7,), (7, 8), '
                                                               '(8,), (8,), '
                                                               '(10, 11))'}}}

Confused about output

json.dumps(ddiff)

"dic_item_added": [ "root[u'dont', u'stop', u'believing']" ],

dic_item_added is an array of only one element, even though three keys were added. Wouldn't it make more sense if there were three elements, one per item? I'm trying to fail a test if specific fields are in the dic_item_added array, and the format makes it very difficult to parse. Maybe I'm missing something?

Diffing two floating point numbers

My definition of "significant digits" is different from yours.

t1 = 0.0000584916556452
t2 = 0.0000584936556451

Your definition implies that the 9th significant digit above is different.

Alternatively, if we look at the above numbers in scientific notation,

t1 = 5.8491655645217175e-5
t2 = 5.8493655645217175e-5

The 5th significant digit is different. You would get about the same result if you switched the format from

'{:.Xf}'.format(t1)

to

'{:.Xe}'.format(t1) (where X should be X-1)
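
The two format specifiers can be compared directly on the values above. This is a plain standard-library check and does not call the library; the variable names are only for illustration.

```python
t1 = 0.0000584916556452
t2 = 0.0000584936556451

# Fixed-point: counts digits after the decimal point, leading zeros included.
fixed_8 = "{:.8f}".format(t1) == "{:.8f}".format(t2)
fixed_9 = "{:.9f}".format(t1) == "{:.9f}".format(t2)

# Scientific: counts digits after the first non-zero digit.
sci_3 = "{:.3e}".format(t1) == "{:.3e}".format(t2)
sci_4 = "{:.4e}".format(t1) == "{:.4e}".format(t2)
```

With the f format the two numbers only start to differ at 9 digits, while with the e format they already differ at .4e, that is, at the 5th significant digit.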

thanks,
Brian West

Mistake in using of logger

See file deepdiff/diff.py:579:
logger.warning("Can not produce a hash for %s item in %s and "
"thus not counting this object: %s" % (item, parent), e) # << HERE

The closing bracket of the string-formatting tuple is misplaced. It should be:
"thus not counting this object: %s" % (item, parent, e))

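An alternative that avoids this class of mistake is to let the logging module do the interpolation by passing the values as arguments. The sketch below is standalone: the logger name, the ListHandler class, and the sample values are made up so the result can be inspected.

```python
import logging

logger = logging.getLogger("example")
records = []

class ListHandler(logging.Handler):
    """Collect formatted messages so the result can be inspected."""
    def emit(self, record):
        records.append(record.getMessage())

logger.addHandler(ListHandler())
logger.propagate = False

item, parent, error = {1}, "root", TypeError("unhashable type: 'set'")
# Pass the values as arguments; logging interpolates them lazily.
logger.warning("Can not produce a hash for %s item in %s and "
               "thus not counting this object: %s", item, parent, error)
```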
Add ignore fields support

Hi,

There are occasions when we need to ignore some fields even if they differ, such as timestamps or auto-increment counters.

I'd like to implement this feature and send you a PR. What do you think?

Print out value when dictionary Item added and/or removed

For example, it would be nice to have:

In [10]: t1 = {1: 1, 2: 2, 3: [4, 5, 6]}

In [11]: t2 = {1: 1, 2: 2}

In [12]: DeepDiff(t1, t2)
Out[12]: {'dic_item_removed': {'root[3]'}}   # v 1.6
Out[12]: {'dic_item_removed': {'root[3]': [4, 5, 6]}}   # Nice to have

Output is not as per docs

I am not sure if I am doing something wrong, but the output I get is a set within a dict:

>>> pprint(ddiff)
{'dic_item_added': set(["root['base_path/someapp/app_ids/android']", "root['base_path/someapp/app_ids/ios']"])}

Can't properly compare dictionaries if specific key used

Hi!

I'm trying to compare XML documents converted to dicts, and here is the issue:

    from deepdiff import DeepDiff

    d1 = {
        'item': [
            {'title': 1, 'http://purl.org/rss/1.0/modules/content/:encoded': '1'},
            {'title': 2, 'http://purl.org/rss/1.0/modules/content/:encoded': '2'}
        ]
    }

    d2 = {
        'item': [
            {'http://purl.org/rss/1.0/modules/content/:encoded': '1', 'title': 1},
            {'http://purl.org/rss/1.0/modules/content/:encoded': '2', 'title': 2}
        ]
    }

    DeepDiff(d1, d2, ignore_order=True)  # Returns non empty dictionary
    DeepDiff(d1, d2, ignore_order=False)  # Returns {}

I expected the result to be the same in both cases: an empty dictionary.

Pretty Difference output mode?

I was wondering whether a pretty output mode would be a nice addition to the library, something more human readable.

For instance, the output below:
{'type_changes': {"root['a']['b']['c']": {'old_type': <type 'str'>, 'new_value': 42, 'old_value': 'foo', 'new_type': <type 'int'>}}}

could be replaced with something like:
type_changes for a.b.c, "foo" (string type) replaced by 42 (integer type)

Creating a wrapper for the library doesn't seem like the best way to overcome this, as a new update of the core library could break the wrapper.

Feature: Add threshold for numerics

We already have ignore significant digits but what we need to add is a threshold for numerics so that it ignores the diff if it is within the threshold.

For example if threshold is 1% then the diff between 1000 and 1001 should be ignored.
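
A relative threshold like this can be expressed with math.isclose from the standard library. The sketch below only illustrates the comparison rule; the function name within_threshold and the default of 1% are assumptions, not an existing option.

```python
import math

def within_threshold(old, new, threshold=0.01):
    """Return True when two numbers differ by at most the relative threshold."""
    return math.isclose(old, new, rel_tol=threshold)

ignored = within_threshold(1000, 1001)       # 0.1% apart, inside 1%
reported = not within_threshold(1000, 1020)  # 2% apart, outside 1%
```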

Document minimum python version >= 2.7 due to set literals

I believe this won't work on Python < 2.7 because of the use of set literals, which weren't introduced until 2.7. On 2.6 and earlier, the import will generate a syntax error. The documentation should indicate that it requires 2.7 or later.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/deepdiff-0.6.1-py2.6.egg/deepdiff/__init__.py", line 1, in <module>
    from .deepdiff import DeepDiff
  File "/usr/lib/python2.6/site-packages/deepdiff-0.6.1-py2.6.egg/deepdiff/deepdiff.py", line 213
    self.__diff(t1, t2, parents_ids=frozenset({id(t1)}))

option to ignore type changes (unicode vs string), or (datetime vs string).

When comparing previous versions of objects against the API, the revision storage introduces some natural deltas: strings become unicode, and datetime objects become ISO strings.

Implement index-agnostic list comparison

Sometimes, I want to ensure that a value is in a list but I don't care if the object is at a different index in the list.

I'd like to be able to pass a flag so that [1, 2, 3, 4] returns no difference against [3, 4, 2, 1].

However, [1,2,3,4] compared to [3,2,1,4,1] should be able to tell me that an extra '1' is present in the 2nd list.
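
The behaviour described here is a multiset comparison, which collections.Counter already provides for hashable items. A minimal sketch, with a hypothetical function name and result keys:

```python
from collections import Counter

def multiset_diff(t1, t2):
    """Compare two lists by content and count, ignoring positions."""
    c1, c2 = Counter(t1), Counter(t2)
    return {"added": dict(c2 - c1), "removed": dict(c1 - c2)}

same = multiset_diff([1, 2, 3, 4], [3, 4, 2, 1])
extra = multiset_diff([1, 2, 3, 4], [3, 2, 1, 4, 1])
```

The first call reports nothing, and the second reports that the value 1 appears one extra time in the second list.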

Issue comparing bytes

In Python 3.5 I get the following error when trying to diff two objects that contain byte strings:

site-packages/deepdiff/diff.py in __diff_str(self, level)
    904 
    905         # do we add a diff for convenience?
--> 906         if '\n' in level.t1 or '\n' in level.t2:
    907             diff = difflib.unified_diff(
    908                 level.t1.splitlines(), level.t2.splitlines(), lineterm='')

TypeError: a bytes-like object is required, not 'str'

I guess you need to use b'\n' if the element is a byte string.
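
Something along these lines would pick the separator that matches the type of the value. This is only a standalone sketch with a hypothetical helper name, not a patch against diff.py:

```python
def contains_newline(value):
    """Check for a newline using a separator of the same type as the value."""
    newline = b"\n" if isinstance(value, bytes) else "\n"
    return newline in value

text_result = contains_newline("a\nb")
bytes_result = contains_newline(b"a\nb")
no_newline = contains_newline(b"ab")
```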

Comparing similar objects when ignoring order

d1 = {
    'key1': 'val1',
    'key2': [
        {
            'key3': 'val3',
            'key4': 'val4',
        },
        {
            'key5': 'val5',
            'key6': 'val6',
        },
    ],
}



d2 = {
    'key1': 'val1',
    'key2': [
        {
            'key3': 'val3',
            'key4': 'val4',
        },
        {
            'key5': 'CHANGE',
            'key6': 'val6',
        },
    ],
}

>>> diff = DeepDiff(d1, d2)
{'values_changed': {"root['key2'][1]['key5']": {'newvalue': 'CHANGE', 'oldvalue': 'val5'}}}

works as expected. What seems to cause a problem is when
I re-order the list and have a changed value in one of the list's dicts.

Using ignore_order=True:

d1 = {
    'key1': 'val1',
    'key2': [
        {
            'key3': 'val3',
            'key4': 'val4',
        },
        {
            'key5': 'val5',
            'key6': 'val6',
        },
    ],
}



d2 = {
    'key1': 'val1',
    'key2': [
        {
            'key5': 'CHANGE',
            'key6': 'val6',
        },
        {
            'key3': 'val3',
            'key4': 'val4',
        },
    ],
}

>>> diff = DeepDiff(d1, d2, ignore_order=True)
{'iterable_item_removed': {"root['key2'][1]": {'key6': 'val6', 'key5': 'val5'}}, 'iterable_item_added': {"root['key2'][0]": {'key6': 'val6', 'key5': 'CHANGE'}}}

It looks like it sees it as a new dict instead of a changed value in a current dict.

Comparing types fail when string ends with "%"

Hi there,

This is a pretty easy one! Yup this fails - You probably want to quote or escape it. The line numbers are off as I was trying to figure out what it was :)

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"50%"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":None}}
>>> ddiff = DeepDiff(t1, t2)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 213, in __init__
    self.__diff(t1, t2, parents_ids=frozenset({id(t1)}))
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 471, in __diff
    self.__diff_dict(t1, t2, parent, parents_ids)
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 329, in __diff_dict
    self.__diff_common_children(t1, t2, t_keys_intersect, print_as_attribute, parents_ids, parent, parent_text)
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 352, in __diff_common_children
    self.__diff(t1_child, t2_child, parent=parent_text % (parent, item_key_str), parents_ids=parents_added)
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 471, in __diff
    self.__diff_dict(t1, t2, parent, parents_ids)
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 329, in __diff_dict
    self.__diff_common_children(t1, t2, t_keys_intersect, print_as_attribute, parents_ids, parent, parent_text)
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 352, in __diff_common_children
    self.__diff(t1_child, t2_child, parent=parent_text % (parent, item_key_str), parents_ids=parents_added)
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 461, in __diff
    error % (parent, "Unknown {}:".format(t1), "Unknown {}:".format(t2)))
ValueError: unsupported format character '=' (0x3d) at index 7

Mappings should be compared with __diff_dict instead of __diff_iterable

Currently, __diff_dict is used for MutableMapping instances, while __diff_iterable is used for Mapping instances. I think this should be changed to be more consistent.

Support recursion in custom objects

I have this scenario trying to inspect some items from the boto package:

class test(object):
    def __init__(self):
        self.loop = self

a = test()
b = a

print DeepDiff(a, b)

--- cut ---
  File "E:\Projects\Repos\Cookbook\cookbooks-v8\stack-monitor\files\default\deepdiff.py", line 244, in __diffdict
    self.__diffit(t1[item], t2[item], parent=parent_text % (parent, item_str))
  File "E:\Projects\Repos\Cookbook\cookbooks-v8\stack-monitor\files\default\deepdiff.py", line 325, in __diffit
    self.__diffdict(t1, t2, parent, attributes_mode=True)
  File "E:\Projects\Repos\Cookbook\cookbooks-v8\stack-monitor\files\default\deepdiff.py", line 244, in __diffdict
    self.__diffit(t1[item], t2[item], parent=parent_text % (parent, item_str))
  File "E:\Projects\Repos\Cookbook\cookbooks-v8\stack-monitor\files\default\deepdiff.py", line 321, in __diffit
    elif isinstance(t1, Iterable):
  File "C:\Apps\Python\lib\abc.py", line 141, in __instancecheck__
    subtype in cls._abc_negative_cache):
  File "C:\Apps\Python\lib\_weakrefset.py", line 75, in __contains__
    return wr in self.data
RuntimeError: maximum recursion depth exceeded in cmp
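
A common way to survive self-references is to carry the ids of the objects already on the current path and stop when one is seen again. The sketch below shows that idea in isolation; the function name walk is hypothetical and it only counts objects rather than diffing them.

```python
def walk(obj, seen=frozenset()):
    """Count the objects visited, skipping any object already on the current path."""
    if id(obj) in seen:
        return 0  # Cycle detected: stop instead of recursing forever.
    seen = seen | {id(obj)}
    attributes = vars(obj).values() if hasattr(obj, "__dict__") else ()
    return 1 + sum(walk(value, seen) for value in attributes)

class Test(object):
    def __init__(self):
        self.loop = self

visited = walk(Test())
```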

Feature: Search for types/objects

Currently, when encountering an object, DeepSearch examines that object's __dict__ without testing whether the object itself is what's being searched for. For example:

>>> from deepdiff import DeepSearch
>>> from uuid import uuid4
>>> foo = uuid4()
>>> DeepSearch({1: foo}, foo)
{}

This would also allow global singletons like None and functions to be searched for.

This seems like it'd be a fairly small change, just an if item == obj check at the top of DeepSearch.__search_obj. If you think this is a good idea, I'm happy to make a PR.
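
To show the effect of testing equality before descending, here is a tiny standalone search over dicts, lists, and tuples. It is a sketch of the idea only; the function name search and the path format are assumptions and this is not the DeepSearch implementation.

```python
from uuid import uuid4

def search(obj, target, path="root"):
    """Return paths where target is found, testing equality before descending."""
    if obj == target:
        return [path]
    found = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            found.extend(search(value, target, "%s[%r]" % (path, key)))
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            found.extend(search(value, target, "%s[%d]" % (path, index)))
    return found

foo = uuid4()
hits = search({1: foo, 2: [None, foo]}, foo)
none_hits = search({1: None}, None)
```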

Please add support for diffing datetime.timedelta objects

>>> import datetime
>>> import deepdiff
>>> deepdiff.DeepDiff(0, 0)
{}
>>> deepdiff.DeepDiff(datetime.date(2017,5,1), datetime.date(2017,5,1))
{}
>>> deepdiff.DeepDiff(datetime.timedelta(0), datetime.timedelta(0))
{'unprocessed': ['root: 0:00:00 and 0:00:00']}
>>> deepdiff.DeepDiff(datetime.timedelta(0), datetime.timedelta(1))
{'unprocessed': ['root: 0:00:00 and 1 day, 0:00:00']}
>>>

String comparison uses identity testing - it should use equality testing

deepdiff.py ln:291

"elif t1 is not t2" should be rewritten into "elif t1 != t2"

http://stackoverflow.com/questions/1504717/why-does-comparing-strings-in-python-using-either-or-is-sometimes-produce

DeepSearch doesn't search for inherited class attributes

class Foo(object):
    bar = 'baz'

deepdiff.DeepSearch(Foo, 'baz')
#-> {}

The solution to this looks pretty simple: in DeepSearch.__search_obj(), change this line to obj = {attr: getattr(obj, attr) for attr in dir(obj)}. dir() traverses the inheritance tree, finding attributes not in __dict__. I'm happy to make a PR if this looks good to you.
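
The difference between the two lookups is easy to see on a subclass. This is a standalone sketch; the Child class is added only for the demonstration.

```python
class Foo(object):
    bar = 'baz'

class Child(Foo):
    pass

# vars() only sees attributes defined directly on Child.
own = vars(Child)
# dir() also lists names inherited from Foo and object.
via_dir = {name: getattr(Child, name) for name in dir(Child)}

found_in_own = 'baz' in own.values()
found_via_dir = 'baz' in via_dir.values()
```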

Applying diff on an object

I have two objects O1 and O2.

If d = diff(O1, O2), can I generate O2 using d and O1?

Feature: exclude specific objects

It's currently possible to exclude types from DeepDiff and DeepSearch using the exclude_types kwarg. However, it's not possible to exclude certain values. Here's an example of what I'm looking for:

>>> d1 = {1: None, 2: 'a'}
>>> d2 = {1: 'foo', 2: 'b'}
>>> DeepDiff(d1, d2, exclude=[None])
{'values_changed': {'root[2]': {'new_value': 'b', 'old_value': 'a'}}}

This would be useful for excluding complex objects from a diff or search, such as UUIDs or datetimes. Though not strictly necessary, it might be a good idea to also make it possible to compare searched and excluded objects with is rather than ==. I'm less confident about what that API should look like, but maybe something like an id_comparison=False kwarg?

I'm happy to look at making a PR for this if you're on board with this change.

feature request: comparing dict with strings to dict with regex values

I'd like to compare a dictionary of actual string values against a dictionary in which some of the expected values are regexes to be matched.

E.g., something along these lines:

    actual = { 'a': 'abcd', 'b': 'test' }
    # expecting 'b' to be exactly 'test' and expecting 'a' to be a four letter lowercase word
    expected = { 'a': re.compile('[a-z]{4}'),   'b': 'test' }   
    print DeepDiff(actual, expected,  regexMatching=True)    # prints {}

I figure it will be very easy to implement, a small extension of type_changes from string to regex.
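
For reference, the matching rule I have in mind can be written in a few lines with the re module. This is a standalone sketch with a hypothetical function name; it returns a boolean instead of a diff and is not tied to the library.

```python
import re

def matches(actual, expected):
    """Compare values, treating compiled patterns in expected as matchers."""
    if isinstance(expected, re.Pattern):
        return isinstance(actual, str) and expected.fullmatch(actual) is not None
    if isinstance(expected, dict) and isinstance(actual, dict):
        return (actual.keys() == expected.keys()
                and all(matches(actual[k], expected[k]) for k in expected))
    return actual == expected

actual = {'a': 'abcd', 'b': 'test'}
expected = {'a': re.compile('[a-z]{4}'), 'b': 'test'}
ok = matches(actual, expected)
bad = matches({'a': 'ABCD', 'b': 'test'}, expected)
```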

DeepDiff has a bug when comparing repeated child elements.

Comparing the following
{u'root': {u'a': [{u'a1': u'a2'}, {u'a1': u'a1', u'a2': u'a2'}], u'b': u'b1'}}
with
{u'root': {u'a': [{u'a1': u'0'}, {u'a1': u'1'}, {u'a1': u'a2'}, {u'a1': u'a1', u'a2': u'a2'}], u'b': u'b1'}}
gives the following result, which is incorrect. It should only have added elements.
{'dic_item_removed': [u"root['root']['a'][1][u'a2']"],
'list_added': [u"root['root']['a']: [{u'a1': u'0'}, {u'a1': u'1'}]"],
'values_changed': [u"root['root']['a'][0]['a1']:\n--- \n+++ \n@@ -1 +1 @@\n-a2\n+0",
u"root['root']['a'][1]['a1']:\n--- \n+++ \n@@ -1 +1 @@\n-a1\n+1"]}

The above json is normalized from the following XML:
dataA = <root> <a><a1>a1</a1><a2>a2</a2></a> <a><a1>a2</a1></a> <b>b1</b> </root>
dataB = <root> <a><a1>0</a1></a> <a><a1>1</a1></a> <a><a1>a2</a1></a> <a><a1>a1</a1><a2>a2</a2></a> <b>b1</b> </root>

Take order into consideration for iterables

Add support for exclusion of deep paths

I couldn't find in the documentation (and none of my attempts in code were successful) how to exclude a specific field in a list of objects.

ddiff = DeepDiff(response, request,
                 verbose_level=2,
                 exclude_paths={"root['case']['files'][]['presignedUrl']"})

I want to ignore the field presignedUrl, which exists in one JSON (for all objects in the list files) and doesn't exist in the other JSON object.
But I get dictionary_item_removed for "root['case']['files'][0]['presignedUrl']" in ddiff.
How can this be done? If this is not supported, it would be very good to have an interface like JsonPath.

For complex cases I need even more: exclude presignedUrl and ignore the order of the list. That is, if this field is deleted (or excluded) from every object in the list files, the lists should be considered the same regardless of the order of their objects (because, excluding presignedUrl, all objects in the lists are identical, just in a different order). Is that possible too?

Request JSON:

{
  "case": {
    "accountId": "4fa27068c95d4c6db1eb",
    "accountName": "Account Name",
    "creationDate": "2009-03-29T09:35:28+00:00",
    "doctorName": "Test Doctor",
    "externalId": "c9a055cb-afdd-41c9-87d3-6b92f7b4b6bb",
    "externalOrderGuid": "7297f93b-2a1a-4887-bcf1-34332839bc60",
    "externalOrderId": "108c73c4-85bd-44a0-98ce-8c37983f047e",
    "files": [
      {
        "active": -17345,
        "caseUid": "8ccb4c6a-8b64-4d3f-9c95-712f94cfd198",
        "created": "1977-07-03T23:53:43+00:00",
        "dentalType": "313a5fb5-cb10-4a1e-9fad-04b601f657aa",
        "extension": "6b47aaf4ae",
        "fileType": "4f5502758c",
        "geocxRec": "be82b6ee-7004-4933-b348-da80a7ed8cd0",
        "geocxSection": "67cc6496-b8dd-4519-bbaf-b8a95ad3d49a",
        "guid": "4daa7069-51f9-4c8b-8e80-1272b0bff515",
        "isDesign": 11380,
        "localPath": "257c7509-3d81-402d-a038-885cbbfe0484",
        "md5": "066d3c09-ffc7-4f2e-b121-10ef4ba96f21",
        "modified": "1992-01-01T22:40:55+00:00",
        "name": "f95f4ab4-c667-4b97-a997-db84254d3d67",
        "objectType": "27c5c6e7-8fcc-4951-8294-a8c51198d417",
        "range": "aea6a6ef-e912-43c6-a626-b4fc6dcc3747",
        "revision": -7363820143813281823,
        "s3Path": "387e847e-064b-4221-bfe2-1e84beaf4728",
        "size": 7654534542432139115,
        "status": "b59a1088-c2e7-4fec-8609-4aa132cb2703",
        "units": "9827d9adb14c4f58b6fc",
        "updated": "2011-03-24T08:27:38+00:00"
      }
    ],
    "folder": "/tmp/junk",
    "id": "8ccb4c6a-8b64-4d3f-9c95-712f94cfd198",
    "lastDownload": "1998-04-08T15:53:50+00:00",
    "lastDownloadUser": "SomeOne",
    "lastEditor": "Last Editor",
    "locationId": 1602668402,
    "locationName": "130b3ca7-89f1-4f4a-bfd0-81032c20d85d",
    "message": "Just some string",
    "millingDate": "1980-01-09T00:36:48+00:00",
    "millingLastEditor": "Milling Last Editor",
    "millingStatus": "Status",
    "models": [],
    "orderId": "50d4039e-d96e-47b1-b243-88ba8d455b7d",
    "patientName": "Patient Name",
    "printStatus": "02244413-f971-486d-8d7a-250f8295cf89",
    "providerId": "e6bbe6e2-99ba-4c09-8d73-e5d469ed4cc4",
    "publicId": "2ceb31f8-5e7f-499a-857c-704eedf5644a",
    "reference": "db474a32-12b2-4e95-931d-e5fe0e1eb305",
    "service": "service",
    "shippingAddress": 10,
    "shippingDate": "1977-08-02T20:56:13+00:00",
    "shippingUrl": "Shipping Url",
    "status": "WORKING",
    "statusExternal": "Something",
    "submissionDate": "1994-05-10T12:38:26+00:00",
    "updateDate": "1971-04-15T23:28:27+00:00",
    "userName": "User Name",
    "workListId": 1905264324,
    "workListName": "791651d4-d76c-4536-8abe-69fa320ceecf"
  }
}

Response JSON:

{
  "case": {
    "accountId": "4fa27068c95d4c6db1eb",
    "accountName": "Account Name",
    "creationDate": "2009-03-29T13:35:28+04:00",
    "doctorName": "Test Doctor",
    "externalId": "c9a055cb-afdd-41c9-87d3-6b92f7b4b6bb",
    "externalOrderGuid": "7297f93b-2a1a-4887-bcf1-34332839bc60",
    "externalOrderId": "108c73c4-85bd-44a0-98ce-8c37983f047e",
    "files": [
      {
        "active": -17345,
        "caseUid": "8ccb4c6a-8b64-4d3f-9c95-712f94cfd198",
        "created": "1977-07-04T02:53:43+03:00",
        "dentalType": "313a5fb5-cb10-4a1e-9fad-04b601f657aa",
        "extension": "6b47aaf4ae",
        "fileType": "4f5502758c",
        "geocxRec": "be82b6ee-7004-4933-b348-da80a7ed8cd0",
        "geocxSection": "67cc6496-b8dd-4519-bbaf-b8a95ad3d49a",
        "guid": "4daa7069-51f9-4c8b-8e80-1272b0bff515",
        "isDesign": 11380,
        "localPath": "257c7509-3d81-402d-a038-885cbbfe0484",
        "md5": "066d3c09-ffc7-4f2e-b121-10ef4ba96f21",
        "modified": "1992-01-02T00:40:55+02:00",
        "name": "f95f4ab4-c667-4b97-a997-db84254d3d67",
        "objectType": "27c5c6e7-8fcc-4951-8294-a8c51198d417",
        "presignedUrl": "https://387e847e-064b-4221-bfe2-1e84beaf4728.s3.amazonaws.com/?AWSAccessKeyId=AKIAJ3DIXXEGSTRQUT5Q&Expires=1501358834&Signature=C%2Bulgq0CnVr4%2BXy2nYujAL6%2BKY8%3D",
        "range": "aea6a6ef-e912-43c6-a626-b4fc6dcc3747",
        "revision": -7363820143813281823,
        "s3Path": "387e847e-064b-4221-bfe2-1e84beaf4728",
        "size": 7654534542432139115,
        "status": "b59a1088-c2e7-4fec-8609-4aa132cb2703",
        "units": "9827d9adb14c4f58b6fc",
        "updated": "2011-03-24T11:27:38+03:00"
      }
    ],
    "folder": "/tmp/junk",
    "id": "8ccb4c6a-8b64-4d3f-9c95-712f94cfd198",
    "lastDownload": "1998-04-08T19:53:50+04:00",
    "lastDownloadUser": "SomeOne",
    "lastEditor": "Last Editor",
    "locationId": 1602668402,
    "locationName": "130b3ca7-89f1-4f4a-bfd0-81032c20d85d",
    "message": "Just some string",
    "millingDate": "1980-01-09T03:36:48+03:00",
    "millingLastEditor": "Milling Last Editor",
    "millingStatus": "Status",
    "models": [],
    "orderId": "50d4039e-d96e-47b1-b243-88ba8d455b7d",
    "patientName": "Patient Name",
    "printStatus": "02244413-f971-486d-8d7a-250f8295cf89",
    "providerId": "e6bbe6e2-99ba-4c09-8d73-e5d469ed4cc4",
    "publicId": "2ceb31f8-5e7f-499a-857c-704eedf5644a",
    "reference": "db474a32-12b2-4e95-931d-e5fe0e1eb305",
    "service": "service",
    "shippingAddress": 10,
    "shippingDate": "1977-08-02T23:56:13+03:00",
    "shippingUrl": "Shipping Url",
    "status": "WORKING",
    "statusExternal": "Something",
    "submissionDate": "1994-05-10T16:38:26+04:00",
    "updateDate": "1971-04-16T02:28:27+03:00",
    "userName": "User Name",
    "workListId": 1905264324,
    "workListName": "791651d4-d76c-4536-8abe-69fa320ceecf"
  }
}
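
A side note on the two payloads above: several timestamp fields differ as strings only because the response uses a different UTC offset. The sketch below is one way to check that, assuming the values are ISO 8601 strings with explicit offsets. The helper name `same_instant` is made up for illustration and is not part of DeepDiff.

```python
from datetime import datetime


def same_instant(a: str, b: str) -> bool:
    """Return True when two ISO 8601 strings with UTC offsets name the same moment."""
    # Offset-aware datetimes compare by absolute time, not by wall-clock text.
    return datetime.fromisoformat(a) == datetime.fromisoformat(b)


# Values copied from the "lastDownload" fields of the request and response above.
request_value = "1998-04-08T15:53:50+00:00"
response_value = "1998-04-08T19:53:50+04:00"

strings_match = request_value == response_value
instants_match = same_instant(request_value, response_value)
```

A plain string comparison reports these fields as changed, while a comparison of the parsed values does not.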

DeepDiff object is not JSON serializable

Unfortunately, although the output of deepdiff.diff.DeepDiff is a dictionary, it cannot be stored as JSON.

The values of "dictionary_item_added" and "dictionary_item_removed" are sets (why not lists?).
Besides that, some keys are None (I don't know what causes it; I just get an exception).

Would it be a problem to make it JSON serializable?
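
As a possible workaround on the caller's side, the standard `json` module accepts a `default` hook that can turn sets into lists. This is only a sketch: `diff_like` is a hand-written stand-in shaped like the result described above (the paths are invented), and `to_jsonable` is a hypothetical helper, not a DeepDiff function.

```python
import json


def to_jsonable(obj):
    """`default` hook for json.dumps: convert sets to sorted lists."""
    if isinstance(obj, (set, frozenset)):
        # Sort by repr so the output is stable even for mixed element types.
        return sorted(obj, key=repr)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Stand-in for a diff result; key names follow the issue text, paths are made up.
diff_like = {
    "dictionary_item_added": {"root['b']", "root['a']"},
    "dictionary_item_removed": {"root['c']"},
}

encoded = json.dumps(diff_like, default=to_jsonable, sort_keys=True)
decoded = json.loads(encoded)
```

The round trip yields lists instead of sets, so anything that relies on set semantics would have to convert them back after loading.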

Make it work with pypy

Identify structural key differences, but ignore values

I would like to compare 2 objects and identify structural differences to ensure parity, but because these objects will always have different values, I only want to get the differences in the key structure. Is this possible with this module? When performing the checks it appears to ignore structural differences in deep objects and always note value differences.
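
To make the request concrete, here is a minimal sketch of a structure-only comparison written without DeepDiff. It assumes the objects are built from dicts and lists. The names `key_paths` and `structure_diff` are hypothetical, and the path format only imitates the `root[...]` style.

```python
def key_paths(obj, prefix="root"):
    """Collect the path of every dict key and list index, ignoring leaf values."""
    paths = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}[{key!r}]"
            paths.add(path)
            paths |= key_paths(value, path)
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            path = f"{prefix}[{index}]"
            paths.add(path)
            paths |= key_paths(value, path)
    return paths


def structure_diff(t1, t2):
    """Report paths present in only one of the two objects."""
    p1, p2 = key_paths(t1), key_paths(t2)
    return {"added": sorted(p2 - p1), "removed": sorted(p1 - p2)}


a = {"user": {"name": "x", "tags": [1, 2]}, "id": 1}
b = {"user": {"name": "y", "tags": [3, 4], "email": "e"}, "id": 2}
result = structure_diff(a, b)
```

Here only `root['user']['email']` is reported, even though every value differs. Note that in this sketch a list of a different length counts as a structural difference.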

conda package for deepdiff

Hello @seperman, I'd like to add a deepdiff conda package to the community packaging channel, conda-forge (https://anaconda.org/conda-forge). Presently the submission is here. Please let me know if you've any objection to conda packaging in general. In particular, would you like to be listed as a maintainer of the recipe/package? Thanks for your consideration.

Request: Support Lists of Dicts

Thanks for working on this library @seperman ! A real time saver, love it!

I have a small request that I would like to make: Can it support lists of dicts as well? So something like this for example:

from deepdiff import DeepDiff

t1 = [{"a": 2}, {"b": 3}]
t2 = [{"b": 3}, {"a": 2}]

DeepDiff(t1, t2, ignore_order=True)

The expected output (as a user) would be:

{}

But the actual output is:

{'dic_item_added': ["root[0]['b']", "root[1]['a']"], 'dic_item_removed': ["root[0]['a']", "root[1]['b']"]}

This makes sense looking at the source code, because ignore_order is supported only for unhashable iterable items.

But supporting that flag for lists of dicts would be very useful!

Let me know if you can support something like that. I wouldn't mind opening a Pull Request to get this going.
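
For illustration, one way to get the expected result for a flat list of dicts is to compare the lists as multisets of canonical JSON strings. This is a sketch of the idea only, not how DeepDiff implements ignore_order. It assumes the items are JSON-serializable, and both helper names are made up.

```python
import json
from collections import Counter


def as_multiset(items):
    """Count items by a canonical JSON string so that dicts become hashable."""
    return Counter(json.dumps(item, sort_keys=True) for item in items)


def equal_ignoring_order(list1, list2):
    """True when both lists hold the same items the same number of times."""
    return as_multiset(list1) == as_multiset(list2)


t1 = [{"a": 2}, {"b": 3}]
t2 = [{"b": 3}, {"a": 2}]
same = equal_ignoring_order(t1, t2)
```

Using a `Counter` rather than a `set` keeps duplicates significant, so `[x, x]` and `[x]` still compare as different.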

Can't compare dict if it contains list of dicts with different order

Hi.
Here is one more example of a wrong comparison. It may have the same root cause as #22, but in this case the result is wrong regardless of which value is passed for ignore_order.

from deepdiff import DeepDiff

dict1 = {
    'items': [
        {
            'tags': [
                {'id': '2', 'title': '2'},
                {'id': '1', 'title': '1'}
            ],
            'title': '1'
        },
        {
            'tags': [
                {'id': '1', 'title': '1'},
                {'id': '2', 'title': '2'}
            ],
            'title': '2'
        }
    ]
}

dict2 = {
    'items': [
        {
            'tags': [
                {'id': '1', 'title': '1'},
                {'id': '2', 'title': '2'}
            ],
            'title': '1'
        },
        {
            'tags': [
                {'id': '2', 'title': '2'},
                {'id': '1', 'title': '1'}
            ],
            'title': '2'
        }
    ]
}

DeepDiff(dict1, dict2, ignore_order=True)  # Returns a non-empty dictionary
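
As a cross-check that the two structures really are equal once order is ignored at every level, the sketch below normalizes nested lists before comparing. It does not use DeepDiff and says nothing about its internals. `canonical` is a hypothetical helper, and the data is a compact copy of the example above.

```python
import json


def canonical(obj):
    """Return a copy of obj in which every list is sorted by its canonical JSON text."""
    if isinstance(obj, dict):
        return {key: canonical(value) for key, value in obj.items()}
    if isinstance(obj, list):
        normalized = [canonical(item) for item in obj]
        return sorted(normalized, key=lambda item: json.dumps(item, sort_keys=True))
    return obj


tag1 = {'id': '1', 'title': '1'}
tag2 = {'id': '2', 'title': '2'}

d1 = {'items': [{'tags': [tag2, tag1], 'title': '1'},
                {'tags': [tag1, tag2], 'title': '2'}]}
d2 = {'items': [{'tags': [tag1, tag2], 'title': '1'},
                {'tags': [tag2, tag1], 'title': '2'}]}

plain_equal = d1 == d2
order_free_equal = canonical(d1) == canonical(d2)
```

The inner lists are normalized first, so the sort key of an outer item no longer depends on the order of its nested lists.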

Enhancement to add ignore_case

I have two dictionaries coming from two different data sources, one of which unfortunately messes up the case.

I was planning to add an ignore_case option that applies to string comparison. If this is already doable in the current implementation, then I can try it out.
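
Until such an option exists, a rough workaround is to case-fold the strings on both sides before comparing. The sketch below assumes the data consists of dicts, lists, and scalars. `fold_case` and its `fold_keys` flag are hypothetical names, not DeepDiff parameters.

```python
def fold_case(obj, fold_keys=False):
    """Return a copy of obj with every string value case-folded."""
    if isinstance(obj, str):
        return obj.casefold()
    if isinstance(obj, dict):
        return {
            (fold_case(key) if fold_keys else key): fold_case(value, fold_keys)
            for key, value in obj.items()
        }
    if isinstance(obj, list):
        return [fold_case(item, fold_keys) for item in obj]
    return obj


source_a = {"city": "Berlin", "tags": ["Red", "BLUE"]}
source_b = {"city": "BERLIN", "tags": ["red", "blue"]}

raw_equal = source_a == source_b
folded_equal = fold_case(source_a) == fold_case(source_b)
```

Keys are left untouched by default because folding them can merge two distinct keys such as "Name" and "name". Pass `fold_keys=True` only when that is acceptable.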