
deepdiff's Introduction

Hello!

My name is Sep. Welcome to my page!

If you deal with dirty data that is entered by humans (e.g., customer data or product data), I am building a tool for it: Qluster. Please ping me, and I will set you up with an account. I would love to hear your feedback!

Please check out my blog, Zepworks, for articles that I may publish from time to time.

Updates

DeepDiff 7.0.0 - Apr 8 2024

  • DeepDiff 7 comes with an improved delta object. Delta to flat dictionaries has undergone a major change. We have also introduced Delta serialize to flat rows.
  • Subtracting delta objects has dramatically improved, at the cost of holding more metadata about the original objects.
  • When verbose_level=2, and the "path" of an item has changed in a report between t1 and t2, we include it as new_path.
  • path(use_t2=True) returns the correct path to t2 in any reported change in the tree view.
  • Python 3.7 support is dropped and Python 3.12 is officially supported.

⚡ Fun fact

Try drinking olive oil if you are too hungry. I am, as I write this.

deepdiff's People

Contributors

amsb, az-pz, b-jazz, bernhard10, brianmaissy, brianmedigate, chkothe, dependabot[bot], devipriyasarkar, dhanvantari, dustingtorres, eggachecat, flowolf, havardthom, hugovk, jaraco, jayvdb, jvacek, kor4ik, leoslf, lyz-code, maggelus, martin-kokos, seperman, sf-tcalhoun, uwefladrich, van-ess0, victorhahncastell, williamjamieson, yaelmi3

deepdiff's Issues

no module named helper

Python 2.7 and virtualenv.
Installed latest 2.5.0.

Getting exception: no module named helper.
file: diff.py, line: 14 from deepdiff.helper import py3.
Same for line 15.
Same for contenthash.py, search.py.

Removing the deepdiff. prefix solved this issue. (i.e. from helper import py3).

Comparing based on hashes does not work with the significant_digits option

Unfortunately I currently don't know how to best solve this.
The following options seem possible:
-) Change the hashing function to only take n significant digits into account.
-) Use comparison without hashes if the significant_digits option is present
-) Do the comparison with unchanged hashes and postprocess the result.

My current workaround is to postprocess the diff result like this:

    diff = DeepDiff(a, b, significant_digits=12, ignore_order=True) 
    if "set_item_added" in diff and "set_item_removed" in diff:
        while True:
            for item in  diff["set_item_removed"]:
                if item in diff["set_item_added"]:
                    diff["set_item_removed"].remove(item)
                    diff["set_item_added"].remove(item)
                    break
            else:
                break

This works because the dictionary returned by diff contains strings, which are formatted having only 12 digits.
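
The removal loop above amounts to discarding every entry that appears in both collections. A minimal sketch of the same idea with plain set operations, assuming both report entries are sets of already-formatted strings (the helper name drop_common is hypothetical and not part of DeepDiff):

```python
def drop_common(added, removed):
    """Remove, in place, every item that is present in both sets."""
    common = added & removed
    added -= common
    removed -= common
    return added, removed


# Stand-ins for diff["set_item_added"] and diff["set_item_removed"].
added = {"1.000000000000", "2.500000000000"}
removed = {"1.000000000000", "3.000000000000"}
drop_common(added, removed)
print(added)    # {'2.500000000000'}
print(removed)  # {'3.000000000000'}
```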

Ignore order for nested iterables

I have a list, that contains lists inside, is there any way to make ignore_order recursive?
Example:

>>> a = [1, 2, [3, 4]]
>>> b = [[4, 3], 2, 1]
>>> DeepDiff(a, b, ignore_order=True)
{'iterable_item_removed': {'root[2]': [3, 4]}, 'iterable_item_added': {'root[0]': [4, 3]}}
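
To illustrate what a recursive ignore_order could mean, here is a small stdlib-only sketch, not DeepDiff's implementation. It assumes all leaf values are hashable; the function name canonical is made up for this example. Each list is turned into a multiset of its (recursively canonicalised) items, so order stops mattering at every depth.

```python
from collections import Counter


def canonical(obj):
    """Build a hashable form of obj in which list order is irrelevant at every depth."""
    if isinstance(obj, dict):
        return ("dict", frozenset((key, canonical(value)) for key, value in obj.items()))
    if isinstance(obj, (list, tuple)):
        # Counter keeps duplicates, so [1, 1] and [1] stay different.
        counts = Counter(canonical(item) for item in obj)
        return ("seq", frozenset(counts.items()))
    return obj


a = [1, 2, [3, 4]]
b = [[4, 3], 2, 1]
print(canonical(a) == canonical(b))  # True
```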

Error when comparing lists of sets if ignore_order=True

Error when comparing lists of sets if ignore_order=True.
For example,
DeepDiff([{1}, {2}, {3}], [{1}, {2}, {3}, {4}], ignore_order=True)
printed
Warning: Can not produce a hash for set([1]) item in root and thus not counting this object.
Warning: Can not produce a hash for set([2]) item in root and thus not counting this object.
Warning: Can not produce a hash for set([3]) item in root and thus not counting this object.
Warning: Can not produce a hash for set([1]) item in root and thus not counting this object.
Warning: Can not produce a hash for set([2]) item in root and thus not counting this object.
Warning: Can not produce a hash for set([3]) item in root and thus not counting this object.
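
The warnings stem from the fact that plain Python sets are unhashable. As a rough sketch of one possible way around it (an assumption, not what DeepDiff does internally), each set can be copied into a frozenset before counting; multiset_diff is a hypothetical helper name.

```python
from collections import Counter


def multiset_diff(first, second):
    """Compare two lists of sets without regard to order."""
    # Plain sets are unhashable; frozenset copies can be counted.
    left = Counter(frozenset(item) for item in first)
    right = Counter(frozenset(item) for item in second)
    return {"removed": left - right, "added": right - left}


result = multiset_diff([{1}, {2}, {3}], [{1}, {2}, {3}, {4}])
print(result["added"])    # Counter({frozenset({4}): 1})
print(result["removed"])  # Counter()
```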

TextDiff incorrectly represents dictionary_item_removed when value is None

Before 3.0 the following test would pass but now fails:

def test_none_item_removed(self):
        t1 = {1: None, 2: 2}
        t2 = {2: 2}
        ddiff = DeepDiff(t1, t2)
        result = {
            'dictionary_item_removed': {'root[1]'}
        }
        self.assertEqual(ddiff, result)

with:
AssertionError: {'dictionary_item_removed': set(['root'])} != {'dictionary_item_removed': set(['root[1]'])}

I'm not sure where or how to fix it? The closest I've come is by tweaking auto_generate_child_rel to remove if self.down.t1 is not None: but that causes some other tests to fail.

In path() because level.t1_child_rel is None we don't append the dictionary key to the path so it looks like the whole dictionary is removed.

export in real JSON

I currently use deepdiff to report differences between versioned objects, and I store the result using pymongo. Some of the attribute values are not JSON compliant, so I can't store them without converting them first. For instance, when I add new dictionaries to my object, the set of added items is not serializable as is (I have to convert it to a list). It would be nice to have an option to output a JSON-compliant object.
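
Until such an option exists, one workaround is to give json.dumps a fallback converter. The sketch below uses a hand-written report dictionary as a stand-in for a real diff result; to_jsonable is a hypothetical helper, and turning unknown objects into strings is an assumption that may not suit every use case.

```python
import json


def to_jsonable(value):
    """Fallback for json.dumps: sets become sorted lists, other objects become strings."""
    if isinstance(value, (set, frozenset)):
        return sorted(value, key=str)
    return str(value)


# Stand-in for a diff result that contains a set.
report = {"dictionary_item_added": {"root['b']", "root['a']"}}
print(json.dumps(report, default=to_jsonable))
# {"dictionary_item_added": ["root['a']", "root['b']"]}
```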

Problem with ignore_order and lists

Following example did not work for me:

deepdiff.DeepDiff(
{"a":[[{"b":2,"c":4},{"b":2,"c":3}]]},
{"a":[[{"b":2,"c":3},{"b":2,"c":4}]]},
ignore_order=True
)
output:
{'iterable_item_added': {"root['a'][0]": [{'b': 2, 'c': 3}, {'b': 2, 'c': 4}]},
'iterable_item_removed': {"root['a'][0]": [{'b': 2, 'c': 4},
{'b': 2, 'c': 3}]}}

Feature: Similarity instead of difference

It would be great to be able to compare two objects for similar key values, for example:
d1 = {"keyA": {"value": 10}, "keyB": {"value": 20}}
d2 = {"keyA": {"value": 30}, "keyB": {"value": 20}}

diff = DeepDiff(d1, d2, reverse_diff=True)
>> {"similar_values": ["[keyB][value]"]}

I'm working on a tool that needs to compare Python dictionaries and not only find diffs (which this library does perfectly!), but also detect overlapping values.
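
As an illustration of the requested behaviour, here is a minimal sketch that walks two nested dictionaries and collects the paths whose leaf values are equal. It is independent of DeepDiff; the function name similar_paths and the root[...] path format are assumptions chosen to resemble the library's reports.

```python
def similar_paths(first, second, path="root"):
    """Return paths whose leaf values are equal in both nested dictionaries."""
    matches = []
    for key in first:
        if key not in second:
            continue
        child = "{}[{!r}]".format(path, key)
        left, right = first[key], second[key]
        if isinstance(left, dict) and isinstance(right, dict):
            matches.extend(similar_paths(left, right, child))
        elif left == right:
            matches.append(child)
    return matches


d1 = {"keyA": {"value": 10}, "keyB": {"value": 20}}
d2 = {"keyA": {"value": 30}, "keyB": {"value": 20}}
print(similar_paths(d1, d2))  # ["root['keyB']['value']"]
```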

Recursion into dictionaries is limited to type dict

Recursion into dictionaries is currently limited to just those objects of (or inheriting from) type dict, as per deepdiff.py line 441

elif isinstance(t1, dict): self.__diff_dict(t1, t2, parent, parents_ids)

This causes a problem for my intended use of your excellent package, because other than my top-most dictionary, all of my nested dictionaries are subclassed from collections.MutableMapping, thus isinstance returns False causing my dictionary-like objects to not get processed as dictionaries.

Instead of checking for type dict, may I suggest that you could instead check for collections.Mapping (the base class for mapping i.e dictionaries). E.g.

elif isinstance(t1, collections.Mapping): self.__diff_dict(t1, t2, parent, parents_ids)

That would allow recursion into any object that supports dictionary lookups.
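
A short demonstration of the difference between the two checks. In current Python versions the abstract base classes live in collections.abc (the collections.Mapping alias was removed in Python 3.10). The Settings class is a hypothetical dictionary-like type written only for this example.

```python
from collections.abc import Mapping, MutableMapping


class Settings(MutableMapping):
    """Hypothetical dictionary-like class that does not inherit from dict."""

    def __init__(self, **items):
        self._items = dict(items)

    def __getitem__(self, key):
        return self._items[key]

    def __setitem__(self, key, value):
        self._items[key] = value

    def __delitem__(self, key):
        del self._items[key]

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


nested = Settings(a=1)
print(isinstance(nested, dict))     # False
print(isinstance(nested, Mapping))  # True
```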

Handling of renames

For my migration system, one of the most common use cases is renaming a dictionary key (which in jsonschema is often the name of an object's field), so what I get from that is:

{
    "dictionary_item_removed": {"root['properties']['obatzter']": {"description": "Custom user notes", "type": "string", "format": "html", "title": "User notes"}}, 
    "dictionary_item_added": {"root['properties']['apfelsalat']": {"description": "Custom user notes", "type": "string", "format": "html", "title": "User notes"}}
}

Where the fieldname "obatzter" was renamed to "apfelsalat". The above syntax is not so nice for migration, so I'd love to rather get a diff that says:

{
    "dictionary_item_renamed": {"root['properties']['obatzter']": "root['properties']['apfelsalat']"}
}

Might that please be possible?
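
In the meantime, renames can be approximated by post-processing the report: pair each removed path with an added path that carries an equal value. The sketch below assumes both report entries are dictionaries mapping paths to values, as in the output above; detect_renames is a hypothetical helper, and equal values under different keys are treated as a rename, which is a heuristic.

```python
def detect_renames(removed, added):
    """Pair removed and added paths whose values are equal."""
    renamed = {}
    unmatched = dict(added)
    for old_path, old_value in removed.items():
        # First added path with an equal value, if any.
        match = next((path for path, value in unmatched.items() if value == old_value), None)
        if match is not None:
            renamed[old_path] = match
            del unmatched[match]
    return renamed


field = {"description": "Custom user notes", "type": "string", "format": "html", "title": "User notes"}
removed = {"root['properties']['obatzter']": dict(field)}
added = {"root['properties']['apfelsalat']": dict(field)}
print(detect_renames(removed, added))
# {"root['properties']['obatzter']": "root['properties']['apfelsalat']"}
```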

Support dates

For some reason datetimes are supported but not dates...

Regex support in exclude_path

I have two lists of dictionaries

list_1 = [{'a': 1, 'b': 2}, {'c': 4, 'b': 5}]
list_2 = [{'a': 1, 'b': 3}, {'c': 4, 'b': 5}]

and I want to ignore key 'b' in all dictionaries while comparing them.
I think that it would be pretty nice if I could just use

diff = DeepDiff(list_1, list_2, exclude_regex="root\[\d+\]\['b'\]")

or something like that.
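
To show the intended matching, here is a small sketch that filters a list of report paths with the regular expression from above. It works on plain path strings rather than calling DeepDiff; drop_excluded is a hypothetical helper name.

```python
import re


def drop_excluded(paths, pattern):
    """Keep only the paths that do not fully match the exclusion pattern."""
    regex = re.compile(pattern)
    return [path for path in paths if not regex.fullmatch(path)]


paths = ["root[0]['b']", "root[1]['a']", "root[1]['b']"]
print(drop_excluded(paths, r"root\[\d+\]\['b'\]"))  # ["root[1]['a']"]
```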

Null string DeepSearch bug

So, I found this bug while I was working on finding empty values inside the complete dictionary (given below)

one = {
	"num_list":[1,3,2], 
	"long_dict":{"empty":"", "string": [2,0,0], "somewhere": "around"},
	"alpha_list":["","","a","b"]
}

Upon DeepSearch(one, "", verbose_level=2), I expect it to show "root['long_dict']['empty']": '', "root['alpha_list'][0]": '', and "root['alpha_list'][1]": '', but for some reason I get the output below.

{'matched_paths': {"root['long_dict']": {'empty': '',
   'somewhere': 'around',
   'string': [2, 0, 0]},
  "root['long_dict']['empty']": '',
  "root['long_dict']['somewhere']": 'around',
  "root['long_dict']['string']": [2, 0, 0],
  "root['alpha_list']": ['', '', 'a', 'b'],
  "root['something somewhere']": [1, 3, 2]},
 'matched_values': {"root['long_dict']['empty']": '',
  "root['long_dict']['somewhere']": 'around',
  "root['alpha_list'][0]": '',
  "root['alpha_list'][1]": '',
  "root['alpha_list'][2]": 'a',
  "root['alpha_list'][3]": 'b'}}

Here, we can see that the values 'around', 'a', and 'b' are matched as an empty string. 🤔

What I am trying to achieve is really simple: I just want to look for empty values anywhere in the object concerned, wherever they occur, and then report them based on 'matched_values'.
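
One plausible explanation, assuming the search matches strings by substring containment, is that in Python the expression "" in "around" is always True. A minimal sketch of an exact-match search for empty strings, independent of DeepSearch (find_empty_strings is a hypothetical helper):

```python
def find_empty_strings(obj, path="root"):
    """Return the paths of all values that are exactly the empty string."""
    found = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            found.extend(find_empty_strings(value, "{}[{!r}]".format(path, key)))
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            found.extend(find_empty_strings(value, "{}[{}]".format(path, index)))
    elif obj == "":
        found.append(path)
    return found


one = {
    "num_list": [1, 3, 2],
    "long_dict": {"empty": "", "string": [2, 0, 0], "somewhere": "around"},
    "alpha_list": ["", "", "a", "b"],
}
print("" in "around")  # True: every string contains the empty string
print(find_empty_strings(one))
# ["root['long_dict']['empty']", "root['alpha_list'][0]", "root['alpha_list'][1]"]
```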

Tuples with equal content show diff even when ignore_order=True

{'values_changed': {"root['parent_tuple_list']": {'new_value': '((12, 15, 14), '
                                                               '(15, 13, 14), '
                                                               '(12, 13, 14), '
                                                               '(15, 13), (15, '
                                                               '16), (16,), '
                                                               '(), (6,), '
                                                               '(6,), (), '
                                                               '(9,), (9,), '
                                                               '(7,), (7, 8), '
                                                               '(8,), (8,), '
                                                               '(10, 11))',
                                                  'old_value': '((12, 13, 14), '
                                                               '(12, 15, 14), '
                                                               '(15, 13, 14), '
                                                               '(15, 16), '
                                                               '(16,), (15, '
                                                               '13), (), (6,), '
                                                               '(6,), (), '
                                                               '(9,), (9,), '
                                                               '(7,), (7, 8), '
                                                               '(8,), (8,), '
                                                               '(10, 11))'}}}

Confused about output

json.dumps(ddiff)

"dic_item_added": [ "root[u'dont', u'stop', u'believing']" ],

dic_item_added is an array of only one element, even though three keys were added. Wouldn't it make more sense to have three elements, one for each item? I'm trying to fail a test if specific fields are in the dic_item_added array, and the format makes it very difficult to parse. Maybe I'm missing something?

Diffing two floating point numbers

My definition of "significant digits" is different from yours.

t1 = 0.0000584916556452
t2 = 0.0000584936556451

Your definition implies that the 9th significant digit above is different.

Alternatively, if we look at the above numbers in scientific notation,

t1 = 5.8491655645217175e-5
t2 = 5.8493655645217175e-5

The 5th significant digit is different. You would get about the same result if you switched the formatting from

'{:.Xf}'.format(t1)

to

'{:.Xe}'.format(t1) (where X should be X-1)

thanks,
Brian West
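
The two definitions can be compared directly with the numbers from this report. The sketch below only uses Python string formatting; same_fixed and same_scientific are hypothetical helper names.

```python
t1 = 0.0000584916556452
t2 = 0.0000584936556451


def same_fixed(a, b, digits):
    """Compare using a fixed number of digits after the decimal point."""
    return "{:.{}f}".format(a, digits) == "{:.{}f}".format(b, digits)


def same_scientific(a, b, digits):
    """Compare using a fixed number of digits after the leading digit."""
    return "{:.{}e}".format(a, digits) == "{:.{}e}".format(b, digits)


print(same_fixed(t1, t2, 8), same_fixed(t1, t2, 9))            # True False
print(same_scientific(t1, t2, 3), same_scientific(t1, t2, 4))  # True False
```

With fixed-point formatting the numbers only start to differ at 9 decimal places, while in scientific notation they differ once 4 digits after the leading digit (5 significant digits) are kept.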

Mistake in using of logger

see file deepdiff/diff.py:579
logger.warning("Can not produce a hash for %s item in %s and "
"thus not counting this object: %s" % (item, parent), e) # << HERE

The closing bracket of the string-formatting tuple is misplaced; all three values belong inside it:
"thus not counting this object: %s" % (item, parent, e))

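A self-contained sketch of the corrected call. The logger name is hypothetical, and passing the values as separate arguments (so that logging formats the message lazily) is a common alternative rather than necessarily the project's chosen fix.

```python
import logging

logger = logging.getLogger("deepdiff_example")  # hypothetical logger name

MESSAGE = ("Can not produce a hash for %s item in %s and "
           "thus not counting this object: %s")


def format_warning(item, parent, error):
    """All three values belong inside the formatting tuple."""
    return MESSAGE % (item, parent, error)


def warn_unhashable(item, parent, error):
    # Passing the values as arguments lets logging do the formatting lazily.
    logger.warning(MESSAGE, item, parent, error)


print(format_warning("x", "root", "boom"))
# Can not produce a hash for x item in root and thus not counting this object: boom
```

With only two values in the tuple, as in the reported line, the % operator raises "TypeError: not enough arguments for format string".
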
Add ignore fields support

Hi,

There are occasions when we need to ignore some fields even if they differ, such as a timestamp or an auto-increment counter.

I'd like to implement this feature and send you a PR. What do you think?

Print out value when dictionary Item added and/or removed

For example, it will be nice to have:

In [10]: t1 = {1: 1, 2: 2, 3: [4, 5, 6]}

In [11]: t2 = {1: 1, 2: 2}

In [12]: DeepDiff(t1, t2)
Out[12]: {'dic_item_removed': {'root[3]'}}   # v 1.6
Out[12]: {'dic_item_removed': {'root[3]': [4, 5, 6]}}   # Nice to have

Output is not as per docs

I am not sure if I am doing something wrong, but the output I get is a set within a dict:

>>> pprint(ddiff)
{'dic_item_added': set(["root['base_path/someapp/app_ids/android']", "root['base_path/someapp/app_ids/ios']"])}

Can't properly compare dictionaries if specific key used

Hi!

I'm trying to compare XML documents converted to dicts, and here is the issue:

    from deepdiff import DeepDiff

    d1 = {
        'item': [
            {'title': 1, 'http://purl.org/rss/1.0/modules/content/:encoded': '1'},
            {'title': 2, 'http://purl.org/rss/1.0/modules/content/:encoded': '2'}
        ]
    }

    d2 = {
        'item': [
            {'http://purl.org/rss/1.0/modules/content/:encoded': '1', 'title': 1},
            {'http://purl.org/rss/1.0/modules/content/:encoded': '2', 'title': 2}
        ]
    }

    DeepDiff(d1, d2, ignore_order=True)  # Returns non empty dictionary
    DeepDiff(d1, d2, ignore_order=False)  # Returns {}

I expected the result to be the same in both cases: an empty dictionary.

Pretty Difference output mode?

I was wondering if a pretty output wouldn't be a nice addon to the library.
Something more human readable ?

For instance the below :
{'type_changes': {"root['a']['b']['c']": {'old_type': <type 'str'>, 'new_value': 42, 'old_value': 'foo', 'new_type': <type 'int'>}}}

could be replaced with something like :
type_changes for a.b.c, "foo" (string type) replaced by 42 (integer type)

Creating a wrapper for the library doesn't seem like the best way to overcome this, as a new update of the core library could break the wrapper.

Feature: Add threshold for numerics

We already have ignore significant digits but what we need to add is a threshold for numerics so that it ignores the diff if it is within the threshold.

For example if threshold is 1% then the diff between 1000 and 1001 should be ignored.
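
A minimal sketch of such a check using math.isclose from the standard library. The helper name within_threshold and its relative parameter are assumptions for illustration; note that math.isclose measures the tolerance relative to the larger of the two magnitudes.

```python
import math


def within_threshold(old, new, relative=0.01):
    """Return True when old and new differ by at most the relative threshold."""
    return math.isclose(old, new, rel_tol=relative)


print(within_threshold(1000, 1001))  # True
print(within_threshold(1000, 1020))  # False
```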

Document minimum python version >= 2.7 due to set literals

I believe this won't work on Python < 2.7 because of the use of the set literal, which wasn't introduced until 2.7. On 2.6 and earlier, the import will generate a syntax error. The documentation should indicate that it requires 2.7 or later.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/deepdiff-0.6.1-py2.6.egg/deepdiff/__init__.py", line 1, in <module>
    from .deepdiff import DeepDiff
  File "/usr/lib/python2.6/site-packages/deepdiff-0.6.1-py2.6.egg/deepdiff/deepdiff.py", line 213
    self.__diff(t1, t2, parents_ids=frozenset({id(t1)}))

Implement index-agnostic list comparison

Sometimes, I want to ensure that a value is in a list but I don't care if the object is at a different index in the list.

I'd like to be able to pass a flag so that [1, 2, 3, 4] returns no difference against [3, 4, 2, 1].

However, [1,2,3,4] compared to [3,2,1,4,1] should be able to tell me that an extra '1' is present in the 2nd list.
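
The requested behaviour corresponds to comparing the lists as multisets. A small sketch with collections.Counter, assuming the items are hashable (unordered_list_diff is a hypothetical helper name):

```python
from collections import Counter


def unordered_list_diff(first, second):
    """Compare two lists as multisets: order is ignored, duplicates are counted."""
    left, right = Counter(first), Counter(second)
    return {"missing": dict(left - right), "extra": dict(right - left)}


print(unordered_list_diff([1, 2, 3, 4], [3, 4, 2, 1]))
# {'missing': {}, 'extra': {}}
print(unordered_list_diff([1, 2, 3, 4], [3, 2, 1, 4, 1]))
# {'missing': {}, 'extra': {1: 1}}
```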

Issue comparing bytes

In python 3.5 I get the following error when trying to diff two objects that contain byte strings:

site-packages/deepdiff/diff.py in __diff_str(self, level)
    904 
    905         # do we add a diff for convenience?
--> 906         if '\n' in level.t1 or '\n' in level.t2:
    907             diff = difflib.unified_diff(
    908                 level.t1.splitlines(), level.t2.splitlines(), lineterm='')

TypeError: a bytes-like object is required, not 'str'

I guess you need to use b'\n' if the element is a byte string.
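
A sketch of that suggestion: pick the newline literal that matches the type of the value before testing for containment. The helper name contains_newline is hypothetical.

```python
def contains_newline(text):
    """Check for a newline in either a str or a bytes value."""
    newline = b"\n" if isinstance(text, bytes) else "\n"
    return newline in text


print(contains_newline(b"line one\nline two"))  # True
print(contains_newline("single line"))          # False
```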

Comparing similar objects when ignoring order

d1 = {
    'key1': 'val1',
    'key2': [
        {
            'key3': 'val3',
            'key4': 'val4',
        },
        {
            'key5': 'val5',
            'key6': 'val6',
        },
    ],
}



d2 = {
    'key1': 'val1',
    'key2': [
        {
            'key3': 'val3',
            'key4': 'val4',
        },
        {
            'key5': 'CHANGE',
            'key6': 'val6',
        },
    ],
}

>>> diff = DeepDiff(d1, d2)
{'values_changed': {"root['key2'][1]['key5']": {'newvalue': 'CHANGE', 'oldvalue': 'val5'}}}

works as expected. What seems to cause a problem is when
I re-order the list and have a changed value in one of the list's dicts.

Using with ignore_order=True.

d1 = {
    'key1': 'val1',
    'key2': [
        {
            'key3': 'val3',
            'key4': 'val4',
        },
        {
            'key5': 'val5',
            'key6': 'val6',
        },
    ],
}



d2 = {
    'key1': 'val1',
    'key2': [
        {
            'key5': 'CHANGE',
            'key6': 'val6',
        },
        {
            'key3': 'val3',
            'key4': 'val4',
        },
    ],
}

>>> diff = DeepDiff(d1, d2, ignore_order=True)
{'iterable_item_removed': {"root['key2'][1]": {'key6': 'val6', 'key5': 'val5'}}, 'iterable_item_added': {"root['key2'][0]": {'key6': 'val6', 'key5': 'CHANGE'}}} 

It looks like it sees it as a new dict instead of a changed value in a current dict.

Comparing types fail when string ends with "%"

Hi there,

This is a pretty easy one! Yup this fails - You probably want to quote or escape it. The line numbers are off as I was trying to figure out what it was :)

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"50%"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":None}}
>>> ddiff = DeepDiff(t1, t2)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 213, in __init__
    self.__diff(t1, t2, parents_ids=frozenset({id(t1)}))
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 471, in __diff
    self.__diff_dict(t1, t2, parent, parents_ids)
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 329, in __diff_dict
    self.__diff_common_children(t1, t2, t_keys_intersect, print_as_attribute, parents_ids, parent, parent_text)
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 352, in __diff_common_children
    self.__diff(t1_child, t2_child, parent=parent_text % (parent, item_key_str), parents_ids=parents_added)
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 471, in __diff
    self.__diff_dict(t1, t2, parent, parents_ids)
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 329, in __diff_dict
    self.__diff_common_children(t1, t2, t_keys_intersect, print_as_attribute, parents_ids, parent, parent_text)
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 352, in __diff_common_children
    self.__diff(t1_child, t2_child, parent=parent_text % (parent, item_key_str), parents_ids=parents_added)
  File "/Users/rh0dium/.virtualenvs/deep_diff_bug/lib/python2.7/site-packages/deepdiff/deepdiff.py", line 461, in __diff
    error % (parent, "Unknown {}:".format(t1), "Unknown {}:".format(t2)))
ValueError: unsupported format character '=' (0x3d) at index 7
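
The traceback suggests that a string containing user data is later used as a %-format template, so a literal percent sign in the data gets interpreted as a format specifier. A minimal sketch of two ways to avoid that; the template below is made up for illustration and is not the library's actual message.

```python
def escape_percent(text):
    """Double every percent sign so it survives a later %-format step."""
    return text.replace("%", "%%")


key = "50%"

# Option 1: escape the data before it becomes part of a %-template.
template = "root[" + escape_percent(key) + "][%s]"
print(template % "child")  # root[50%][child]

# Option 2: never put data into the template; pass it as an argument instead.
print("root[{}][{}]".format(key, "child"))  # root[50%][child]
```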

Support recursion in custom objects

I have this scenario trying to inspect some items from the boto package:

class test(object):
    def __init__(self):
        self.loop = self

a = test()
b = a

print DeepDiff(a, b)

--- cut ---
  File "E:\Projects\Repos\Cookbook\cookbooks-v8\stack-monitor\files\default\deepdiff.py", line 244, in __diffdict
    self.__diffit(t1[item], t2[item], parent=parent_text % (parent, item_str))
  File "E:\Projects\Repos\Cookbook\cookbooks-v8\stack-monitor\files\default\deepdiff.py", line 325, in __diffit
    self.__diffdict(t1, t2, parent, attributes_mode=True)
  File "E:\Projects\Repos\Cookbook\cookbooks-v8\stack-monitor\files\default\deepdiff.py", line 244, in __diffdict
    self.__diffit(t1[item], t2[item], parent=parent_text % (parent, item_str))
  File "E:\Projects\Repos\Cookbook\cookbooks-v8\stack-monitor\files\default\deepdiff.py", line 321, in __diffit
    elif isinstance(t1, Iterable):
  File "C:\Apps\Python\lib\abc.py", line 141, in __instancecheck__
    subtype in cls._abc_negative_cache):
  File "C:\Apps\Python\lib\_weakrefset.py", line 75, in __contains__
    return wr in self.data
RuntimeError: maximum recursion depth exceeded in cmp
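
The usual remedy for self-referencing objects is to remember which objects have already been visited. The sketch below shows the idea on a simple attribute walk; it is not DeepDiff's code, and count_objects is a hypothetical helper.

```python
def count_objects(obj, visited=None):
    """Count distinct objects reachable through instance attributes."""
    if visited is None:
        visited = set()
    if id(obj) in visited:
        return 0  # already seen: stop instead of recursing forever
    visited.add(id(obj))
    total = 1
    attributes = vars(obj).values() if hasattr(obj, "__dict__") else ()
    for value in attributes:
        total += count_objects(value, visited)
    return total


class Test(object):
    def __init__(self):
        self.loop = self


print(count_objects(Test()))  # 1
```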

Feature: Search for types/objects

Currently when encountering an object, DeepSearch examines that object's __dict__ without testing whether the object is what's being searched for. By example:

>>> from deepdiff import DeepSearch
>>> from uuid import uuid4
>>> foo = uuid4()
>>> DeepSearch({1: foo}, foo)
{}

This would also allow global singletons like None and functions to be searched for.

This seems like it'd be a fairly small change, just an if item == obj check at the top of DeepSearch.__search_obj. If you think this a good idea, I'm happy to make a PR.
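
A stdlib-only sketch of the proposed equality check, placed before descending into containers. It is independent of DeepSearch; find_object and the path format are assumptions for illustration.

```python
from uuid import uuid4


def find_object(container, target, path="root"):
    """Return paths of all values equal to target, checking equality before descending."""
    if container == target:
        return [path]
    matches = []
    if isinstance(container, dict):
        for key, value in container.items():
            matches.extend(find_object(value, target, "{}[{!r}]".format(path, key)))
    elif isinstance(container, (list, tuple)):
        for index, value in enumerate(container):
            matches.extend(find_object(value, target, "{}[{}]".format(path, index)))
    return matches


foo = uuid4()
print(find_object({1: foo}, foo))        # ['root[1]']
print(find_object({"a": [None]}, None))  # ["root['a'][0]"]
```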

Please add support for diffing datetime.timedelta objects

>>> import datetime
>>> import deepdiff
>>> deepdiff.DeepDiff(0, 0)
{}
>>> deepdiff.DeepDiff(datetime.date(2017,5,1), datetime.date(2017,5,1))
{}
>>> deepdiff.DeepDiff(datetime.timedelta(0), datetime.timedelta(0))
{'unprocessed': ['root: 0:00:00 and 0:00:00']}
>>> deepdiff.DeepDiff(datetime.timedelta(0), datetime.timedelta(1))
{'unprocessed': ['root: 0:00:00 and 1 day, 0:00:00']}
>>> 

DeepSearch doesn't search for inherited class attributes

class Foo(object):
    bar = 'baz'

deepdiff.DeepSearch(Foo, 'baz')
#-> {}

The solution to this looks pretty simple: in DeepSearch.__search_obj(), change this line to obj = {getattr(obj, attr) for attr in dir(obj)}. dir() traverses the inheritance tree, finding attributes not in __dict__. I'm happy to make a PR if this looks good to you.
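
A short sketch of the difference between __dict__ and dir() for inherited attributes. The Child class and the search_attributes helper are hypothetical additions for this example; note that getattr takes both the object and the attribute name.

```python
class Foo(object):
    bar = "baz"


class Child(Foo):
    pass


def search_attributes(obj, value):
    """Return names of non-dunder attributes (including inherited ones) equal to value."""
    return [name for name in dir(obj)
            if not name.startswith("__") and getattr(obj, name) == value]


print("bar" in vars(Child))               # False: bar is defined on Foo, not Child
print(search_attributes(Child, "baz"))    # ['bar']
print(search_attributes(Child(), "baz"))  # ['bar']
```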

Feature: exclude specific objects

It's currently possible to exclude types from DeepDiff and DeepSearch using the exclude_types kwarg. However, it's not possible to exclude certain values. Here's an example of what I'm looking for:

>>> d1 = {1: None, 2: 'a'}
>>> d2 = {1: 'foo', 2: 'b'}
>>> DeepDiff(d1, d2, exclude=[None])
{'values_changed': {'root[2]': {'new_value': 'b', 'old_value': 'a'}}}

This would be useful for excluding complex objects from a diff or search, such as UUIDs or datetimes. Though not strictly necessary, it might be a good idea to also make it possible to use is rather than == to compare searched and excluded objects. I'm less confident about what that API should look like, but maybe something like an id_comparison=False kwarg?

I'm happy to look at making a PR for this if you're on board with this change.

feature request: comparing dict with strings to dict with regex values

I'd like to compare a dictionary where some of the expected values are regex to be matched to matching string values.

eg. something along:

    actual = { 'a': 'abcd', 'b': 'test' }
    # expecting 'b' to be exactly 'test' and expecting 'a' to be a four letter lowercase word
    expected = { 'a': re.compile('[a-z]{4}'),   'b': 'test' }   
    print DeepDiff(actual, expected,  regexMatching=True)    # prints {}

I figure it will be very easy to implement, a small extension of type_changes from string to regex.
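
A rough sketch of the requested comparison for flat dictionaries, written without DeepDiff. The helpers value_matches and mismatched_keys are hypothetical, and using fullmatch (rather than search) is an assumption about the desired semantics.

```python
import re

PATTERN_TYPE = type(re.compile(""))


def value_matches(actual, expected):
    """Compare by regex when expected is a compiled pattern, otherwise by equality."""
    if isinstance(expected, PATTERN_TYPE):
        return isinstance(actual, str) and expected.fullmatch(actual) is not None
    return actual == expected


def mismatched_keys(actual, expected):
    """Return the keys whose actual value does not satisfy the expectation."""
    return sorted(key for key in expected
                  if key not in actual or not value_matches(actual[key], expected[key]))


actual = {"a": "abcd", "b": "test"}
expected = {"a": re.compile("[a-z]{4}"), "b": "test"}
print(mismatched_keys(actual, expected))  # []
```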

DeepDiff has a bug when comparing repeated child elements.

Comparing the following
{u'root': {u'a': [{u'a1': u'a2'}, {u'a1': u'a1', u'a2': u'a2'}], u'b': u'b1'}}
with
{u'root': {u'a': [{u'a1': u'0'}, {u'a1': u'1'}, {u'a1': u'a2'}, {u'a1': u'a1', u'a2': u'a2'}], u'b': u'b1'}}
gives the following result, which is incorrect. It should only have added elements.
{'dic_item_removed': [u"root['root']['a'][1][u'a2']"],
'list_added': [u"root['root']['a']: [{u'a1': u'0'}, {u'a1': u'1'}]"],
'values_changed': [u"root['root']['a'][0]['a1']:\n--- \n+++ \n@@ -1 +1 @@\n-a2\n+0",
u"root['root']['a'][1]['a1']:\n--- \n+++ \n@@ -1 +1 @@\n-a1\n+1"]}

The above json is normalized from the following XML:
dataA = <root> <a><a1>a1</a1><a2>a2</a2></a> <a><a1>a2</a1></a> <b>b1</b> </root>
dataB = <root> <a><a1>0</a1></a> <a><a1>1</a1></a> <a><a1>a2</a1></a> <a><a1>a1</a1><a2>a2</a2></a> <b>b1</b> </root>

Add support for exclusion of deep paths

I couldn't find in the documentation (and all my attempts in code were unsuccessful) how to exclude a specific field in a list of objects.

ddiff = DeepDiff(response, request,
                 verbose_level=2,
                 exclude_paths={"root['case']['files'][]['presignedUrl']"})

I want to ignore the field presignedUrl for one JSON (for all objects in the list files); this field doesn't exist in the other JSON object.
But I get dictionary_item_removed for "root['case']['files'][0]['presignedUrl']" in ddiff.
How can this be done? If this is not supported, it would be very good to have an interface like JsonPath.

For complex cases I need even more: exclude presignedUrl and ignore the order of the list. That is, if this field is deleted (or excluded) from every object in the list files, the lists should be considered the same regardless of the order of their objects (because, excluding presignedUrl, all objects in the lists are exactly the same, just in a different order). Is that possible too?
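
To make the intended wildcard semantics concrete, here is a sketch that turns a path containing an empty index [] into a regular expression matching any list index. The [] notation is the one used in this report, not an existing DeepDiff feature, and wildcard_to_regex is a hypothetical helper.

```python
import re


def wildcard_to_regex(path):
    """Turn a path with empty [] placeholders into a regex matching any list index."""
    escaped = re.escape(path)
    # After escaping, the empty index reads \[\]; replace it with "any integer index".
    return re.compile(escaped.replace(r"\[\]", r"\[\d+\]"))


pattern = wildcard_to_regex("root['case']['files'][]['presignedUrl']")
print(bool(pattern.fullmatch("root['case']['files'][0]['presignedUrl']")))  # True
print(bool(pattern.fullmatch("root['case']['id']")))                        # False
```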

Request JSON:

{
  "case": {
    "accountId": "4fa27068c95d4c6db1eb",
    "accountName": "Account Name",
    "creationDate": "2009-03-29T09:35:28+00:00",
    "doctorName": "Test Doctor",
    "externalId": "c9a055cb-afdd-41c9-87d3-6b92f7b4b6bb",
    "externalOrderGuid": "7297f93b-2a1a-4887-bcf1-34332839bc60",
    "externalOrderId": "108c73c4-85bd-44a0-98ce-8c37983f047e",
    "files": [
      {
        "active": -17345,
        "caseUid": "8ccb4c6a-8b64-4d3f-9c95-712f94cfd198",
        "created": "1977-07-03T23:53:43+00:00",
        "dentalType": "313a5fb5-cb10-4a1e-9fad-04b601f657aa",
        "extension": "6b47aaf4ae",
        "fileType": "4f5502758c",
        "geocxRec": "be82b6ee-7004-4933-b348-da80a7ed8cd0",
        "geocxSection": "67cc6496-b8dd-4519-bbaf-b8a95ad3d49a",
        "guid": "4daa7069-51f9-4c8b-8e80-1272b0bff515",
        "isDesign": 11380,
        "localPath": "257c7509-3d81-402d-a038-885cbbfe0484",
        "md5": "066d3c09-ffc7-4f2e-b121-10ef4ba96f21",
        "modified": "1992-01-01T22:40:55+00:00",
        "name": "f95f4ab4-c667-4b97-a997-db84254d3d67",
        "objectType": "27c5c6e7-8fcc-4951-8294-a8c51198d417",
        "range": "aea6a6ef-e912-43c6-a626-b4fc6dcc3747",
        "revision": -7363820143813281823,
        "s3Path": "387e847e-064b-4221-bfe2-1e84beaf4728",
        "size": 7654534542432139115,
        "status": "b59a1088-c2e7-4fec-8609-4aa132cb2703",
        "units": "9827d9adb14c4f58b6fc",
        "updated": "2011-03-24T08:27:38+00:00"
      }
    ],
    "folder": "/tmp/junk",
    "id": "8ccb4c6a-8b64-4d3f-9c95-712f94cfd198",
    "lastDownload": "1998-04-08T15:53:50+00:00",
    "lastDownloadUser": "SomeOne",
    "lastEditor": "Last Editor",
    "locationId": 1602668402,
    "locationName": "130b3ca7-89f1-4f4a-bfd0-81032c20d85d",
    "message": "Just some string",
    "millingDate": "1980-01-09T00:36:48+00:00",
    "millingLastEditor": "Milling Last Editor",
    "millingStatus": "Status",
    "models": [],
    "orderId": "50d4039e-d96e-47b1-b243-88ba8d455b7d",
    "patientName": "Patient Name",
    "printStatus": "02244413-f971-486d-8d7a-250f8295cf89",
    "providerId": "e6bbe6e2-99ba-4c09-8d73-e5d469ed4cc4",
    "publicId": "2ceb31f8-5e7f-499a-857c-704eedf5644a",
    "reference": "db474a32-12b2-4e95-931d-e5fe0e1eb305",
    "service": "service",
    "shippingAddress": 10,
    "shippingDate": "1977-08-02T20:56:13+00:00",
    "shippingUrl": "Shipping Url",
    "status": "WORKING",
    "statusExternal": "Something",
    "submissionDate": "1994-05-10T12:38:26+00:00",
    "updateDate": "1971-04-15T23:28:27+00:00",
    "userName": "User Name",
    "workListId": 1905264324,
    "workListName": "791651d4-d76c-4536-8abe-69fa320ceecf"
  }
}

Response JSON:

{
  "case": {
    "accountId": "4fa27068c95d4c6db1eb",
    "accountName": "Account Name",
    "creationDate": "2009-03-29T13:35:28+04:00",
    "doctorName": "Test Doctor",
    "externalId": "c9a055cb-afdd-41c9-87d3-6b92f7b4b6bb",
    "externalOrderGuid": "7297f93b-2a1a-4887-bcf1-34332839bc60",
    "externalOrderId": "108c73c4-85bd-44a0-98ce-8c37983f047e",
    "files": [
      {
        "active": -17345,
        "caseUid": "8ccb4c6a-8b64-4d3f-9c95-712f94cfd198",
        "created": "1977-07-04T02:53:43+03:00",
        "dentalType": "313a5fb5-cb10-4a1e-9fad-04b601f657aa",
        "extension": "6b47aaf4ae",
        "fileType": "4f5502758c",
        "geocxRec": "be82b6ee-7004-4933-b348-da80a7ed8cd0",
        "geocxSection": "67cc6496-b8dd-4519-bbaf-b8a95ad3d49a",
        "guid": "4daa7069-51f9-4c8b-8e80-1272b0bff515",
        "isDesign": 11380,
        "localPath": "257c7509-3d81-402d-a038-885cbbfe0484",
        "md5": "066d3c09-ffc7-4f2e-b121-10ef4ba96f21",
        "modified": "1992-01-02T00:40:55+02:00",
        "name": "f95f4ab4-c667-4b97-a997-db84254d3d67",
        "objectType": "27c5c6e7-8fcc-4951-8294-a8c51198d417",
        "presignedUrl": "https://387e847e-064b-4221-bfe2-1e84beaf4728.s3.amazonaws.com/?AWSAccessKeyId=AKIAJ3DIXXEGSTRQUT5Q&Expires=1501358834&Signature=C%2Bulgq0CnVr4%2BXy2nYujAL6%2BKY8%3D",
        "range": "aea6a6ef-e912-43c6-a626-b4fc6dcc3747",
        "revision": -7363820143813281823,
        "s3Path": "387e847e-064b-4221-bfe2-1e84beaf4728",
        "size": 7654534542432139115,
        "status": "b59a1088-c2e7-4fec-8609-4aa132cb2703",
        "units": "9827d9adb14c4f58b6fc",
        "updated": "2011-03-24T11:27:38+03:00"
      }
    ],
    "folder": "/tmp/junk",
    "id": "8ccb4c6a-8b64-4d3f-9c95-712f94cfd198",
    "lastDownload": "1998-04-08T19:53:50+04:00",
    "lastDownloadUser": "SomeOne",
    "lastEditor": "Last Editor",
    "locationId": 1602668402,
    "locationName": "130b3ca7-89f1-4f4a-bfd0-81032c20d85d",
    "message": "Just some string",
    "millingDate": "1980-01-09T03:36:48+03:00",
    "millingLastEditor": "Milling Last Editor",
    "millingStatus": "Status",
    "models": [],
    "orderId": "50d4039e-d96e-47b1-b243-88ba8d455b7d",
    "patientName": "Patient Name",
    "printStatus": "02244413-f971-486d-8d7a-250f8295cf89",
    "providerId": "e6bbe6e2-99ba-4c09-8d73-e5d469ed4cc4",
    "publicId": "2ceb31f8-5e7f-499a-857c-704eedf5644a",
    "reference": "db474a32-12b2-4e95-931d-e5fe0e1eb305",
    "service": "service",
    "shippingAddress": 10,
    "shippingDate": "1977-08-02T23:56:13+03:00",
    "shippingUrl": "Shipping Url",
    "status": "WORKING",
    "statusExternal": "Something",
    "submissionDate": "1994-05-10T16:38:26+04:00",
    "updateDate": "1971-04-16T02:28:27+03:00",
    "userName": "User Name",
    "workListId": 1905264324,
    "workListName": "791651d4-d76c-4536-8abe-69fa320ceecf"
  }
}

DeepDiff object is not JSON serializable

Unfortunately, although the output of deepdiff.diff.DeepDiff is a dictionary, it cannot be stored as JSON.

The "dictionary_item_added" and "dictionary_item_removed" values are sets (why not lists?).
Besides that, some keys are None (I don't know what causes it; I just get an exception).

Would it be a problem to make it JSON serializable?
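
As a rough illustration of the set problem described above, here is a minimal stdlib-only sketch. It does not use DeepDiff itself: `diff_like` is a hypothetical dictionary shaped like the report in question, and `encode_sets` is a helper name invented for this example. The idea is to hand `json.dumps` a `default` hook that turns sets into sorted lists.

```python
import json


def encode_sets(obj):
    """Fallback for json.dumps: turn sets into sorted lists."""
    if isinstance(obj, (set, frozenset)):
        # Sort by repr so the output is deterministic even for mixed types.
        return sorted(obj, key=repr)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Hypothetical report; the keys mirror the ones named in the issue.
diff_like = {
    "dictionary_item_added": {"root['b']", "root['a']"},
    "dictionary_item_removed": {"root['c']"},
}

encoded = json.dumps(diff_like, default=encode_sets, sort_keys=True)
decoded = json.loads(encoded)
```

After a round trip the sets come back as lists, so any code that relies on set semantics would need to convert them again.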

Identify structural key differences, but ignore values

I would like to compare two objects and identify structural differences to ensure parity. Because these objects will always have different values, I only want the differences in the key structure. Is this possible with this module? When I perform the checks, it appears to ignore structural differences in deep objects and always reports value differences.
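
One way to picture the request is the stdlib-only sketch below. It is not a DeepDiff feature: `key_paths` and `structure_diff` are hypothetical helpers that collect the key paths of two nested structures and compare only those paths, ignoring leaf values. It assumes the data consists of dicts, lists, and scalar values, and that list positions are significant.

```python
def key_paths(obj, prefix="root"):
    """Return the set of key paths in a nested dict/list, ignoring leaf values."""
    paths = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}[{key!r}]"
            paths.add(path)
            paths |= key_paths(value, path)
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            paths |= key_paths(value, f"{prefix}[{index}]")
    return paths


def structure_diff(t1, t2):
    """Report key paths present in only one of the two structures."""
    p1, p2 = key_paths(t1), key_paths(t2)
    return {"added": sorted(p2 - p1), "removed": sorted(p1 - p2)}


t1 = {"case": {"id": 1, "files": [{"name": "a", "size": 10}]}}
t2 = {"case": {"id": 2, "files": [{"name": "b"}], "folder": "/tmp"}}
result = structure_diff(t1, t2)
```

Here `id` and `name` differ in value but not in structure, so only the missing `size` key and the new `folder` key are reported.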

Request: Support Lists of Dicts

Thanks for working on this library @seperman ! A real time saver, love it!

I have a small request that I would like to make: Can it support lists of dicts as well? So something like this for example:

from deepdiff import DeepDiff

t1 = [{"a": 2}, {"b": 3}]
t2 = [{"b": 3}, {"a": 2}]

DeepDiff(t1, t2, ignore_order=True)

The expected output (as a user) would be:

{}

But the actual output is:

{'dic_item_added': ["root[0]['b']", "root[1]['a']"], 'dic_item_removed': ["root[0]['a']", "root[1]['b']"]}

Which makes sense looking at the source code, because ignore_order is supported only for unhashable iterable items.

But supporting that flag for lists of dicts would be very useful!

Let me know if you can support something like that. I wouldn't mind opening a Pull Request to get this going.
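
To make the request concrete, here is a small stdlib-only sketch of an order-insensitive comparison for a flat list of dicts. It is not how DeepDiff implements `ignore_order`: `as_hashable` and `same_items_any_order` are hypothetical names, and the approach assumes every dict value is itself hashable. Each dict is frozen into a `frozenset` of its items so that the two lists can be compared as multisets.

```python
from collections import Counter


def as_hashable(d):
    """Freeze a flat dict into a frozenset of (key, value) pairs."""
    return frozenset(d.items())


def same_items_any_order(list1, list2):
    """True if both lists hold the same dicts, counting duplicates, in any order."""
    return Counter(map(as_hashable, list1)) == Counter(map(as_hashable, list2))


t1 = [{"a": 2}, {"b": 3}]
t2 = [{"b": 3}, {"a": 2}]
reordered_equal = same_items_any_order(t1, t2)
changed_equal = same_items_any_order(t1, [{"a": 2}, {"b": 4}])
```

Using `Counter` rather than `set` keeps duplicates significant, so `[{"a": 2}, {"a": 2}]` is not treated as equal to `[{"a": 2}]`.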

Can't compare dict if it contains list of dicts with different order

Hi.
Here is one more example of a wrong comparison. It may have the same root cause as #22, but in this case the result is wrong regardless of the value passed for ignore_order.

from deepdiff import DeepDiff

# Same items and tags in both dicts; only the order of the nested tag lists differs.
dict1 = {
    'items': [
        {
            'tags': [
                {'id': '2', 'title': '2'},
                {'id': '1', 'title': '1'}
            ],
            'title': '1'
        },
        {
            'tags': [
                {'id': '1', 'title': '1'},
                {'id': '2', 'title': '2'}
            ],
            'title': '2'
        }
    ]
}

dict2 = {
    'items': [
        {
            'tags': [
                {'id': '1', 'title': '1'},
                {'id': '2', 'title': '2'}
            ],
            'title': '1'
        },
        {
            'tags': [
                {'id': '2', 'title': '2'},
                {'id': '1', 'title': '1'}
            ],
            'title': '2'
        }
    ]
}

DeepDiff(dict1, dict2, ignore_order=True)  # Returns a non-empty dictionary
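
For reference, the behavior the reporter expects can be sketched without DeepDiff by reducing both structures to a canonical form before comparing them. The helpers `canonical` and `equal_ignoring_order` below are hypothetical, and the sketch assumes JSON-compatible data (string keys; values that are dicts, lists, strings, numbers, booleans, or None). Every list is sorted by the JSON text of its already-canonicalized elements, so order is ignored at every nesting level.

```python
import json


def canonical(obj):
    """Recursively sort every list so that element order no longer matters."""
    if isinstance(obj, dict):
        return {key: canonical(value) for key, value in obj.items()}
    if isinstance(obj, list):
        items = [canonical(value) for value in obj]
        # JSON text with sorted keys gives a total order over mixed structures.
        return sorted(items, key=lambda value: json.dumps(value, sort_keys=True))
    return obj


def equal_ignoring_order(a, b):
    return canonical(a) == canonical(b)


tag1 = {"id": "1", "title": "1"}
tag2 = {"id": "2", "title": "2"}
d1 = {"items": [{"tags": [tag2, tag1], "title": "1"},
                {"tags": [tag1, tag2], "title": "2"}]}
d2 = {"items": [{"tags": [tag1, tag2], "title": "1"},
                {"tags": [tag2, tag1], "title": "2"}]}
same = equal_ignoring_order(d1, d2)
```

This only answers "equal or not"; unlike a real diff it does not say where two structures differ.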

Enhancement to add ignore_case

I have two dictionaries coming from two different data sources, one of which unfortunately messes up the case.

I was planning to add an ignore_case option that applies to string comparison. If this is already doable in the current implementation, I can try it out.
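
Until such an option exists, one possible workaround is to normalize the case of both inputs before diffing them. The sketch below is stdlib-only and purely illustrative: `fold_case` is a hypothetical helper, and it assumes that case should be ignored in dictionary keys as well as in string values.

```python
def fold_case(obj):
    """Return a copy of obj with every string (keys included) case-folded."""
    if isinstance(obj, str):
        return obj.casefold()
    if isinstance(obj, dict):
        # Keys that differ only by case collapse into one entry here.
        return {fold_case(key): fold_case(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [fold_case(value) for value in obj]
    return obj


source_a = {"Name": "ALICE", "tags": ["Admin", 1]}
source_b = {"name": "alice", "tags": ["admin", 1]}
match = fold_case(source_a) == fold_case(source_b)
```

The normalized copies can then be passed to whichever comparison is in use; the originals are left untouched.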
