
jsondiff's Introduction

jsondiff

Diff JSON and JSON-like structures in Python.

Installation

pip install jsondiff

Quickstart

>>> import jsondiff as jd
>>> from jsondiff import diff

>>> diff({'a': 1, 'b': 2}, {'b': 3, 'c': 4})
{'c': 4, 'b': 3, delete: ['a']}

>>> diff(['a', 'b', 'c'], ['a', 'b', 'c', 'd'])
{insert: [(3, 'd')]}

>>> diff(['a', 'b', 'c'], ['a', 'c'])
{delete: [1]}

# Typical diff looks like what you'd expect...
>>> diff({'a': [0, {'b': 4}, 1]}, {'a': [0, {'b': 5}, 1]})
{'a': {1: {'b': 5}}}

# You can exclude some jsonpaths from the diff (doesn't work if the value types are different)
>>> diff({'a': 1, 'b': {'b1': 20, 'b2': 21}, 'c': 3},  {'a': 1, 'b': {'b1': 22, 'b2': 23}, 'c': 30}, exclude_paths=['b.b1', 'c'])
{'b': {'b2': 23}}

# ...but similarity is taken into account
>>> diff({'a': [0, {'b': 4}, 1]}, {'a': [0, {'c': 5}, 1]})
{'a': {insert: [(1, {'c': 5})], delete: [1]}}

# Support for various diff syntaxes
>>> diff({'a': 1, 'b': 2}, {'b': 3, 'c': 4}, syntax='explicit')
{insert: {'c': 4}, update: {'b': 3}, delete: ['a']}

>>> diff({'a': 1, 'b': 2}, {'b': 3, 'c': 4}, syntax='symmetric')
{insert: {'c': 4}, 'b': [2, 3], delete: {'a': 1}}

>>> diff({'list': [1, 2, 3], "poplist": [1, 2, 3]}, {'list': [1, 3]}, syntax="rightonly")
{"list": [1, 3], delete: ["poplist"]}

# Special handling of sets
>>> diff({'a', 'b', 'c'}, {'a', 'c', 'd'})
{discard: set(['b']), add: set(['d'])}

# Load and dump JSON
>>> print(diff('["a", "b", "c"]', '["a", "c", "d"]', load=True, dump=True))
{"$delete": [1], "$insert": [[2, "d"]]}

# NOTE: Default keys in the result are objects, not strings!
>>> d = diff({'a': 1, 'delete': 2}, {'b': 3, 'delete': 4})
>>> d
{'delete': 4, 'b': 3, delete: ['a']}
>>> d[jd.delete]
['a']
>>> d['delete']
4
# Alternatively, you can use marshal=True to get back strings with a leading $
>>> diff({'a': 1, 'delete': 2}, {'b': 3, 'delete': 4}, marshal=True)
{'delete': 4, 'b': 3, '$delete': ['a']}

Command Line Client

Usage:

jdiff [-h] [-p] [-s {compact,symmetric,explicit}] [-i INDENT] [-f {json,yaml}] first second

positional arguments:
  first
  second

optional arguments:
  -h, --help            show this help message and exit
  -p, --patch
  -s {compact,symmetric,explicit}, --syntax {compact,symmetric,explicit}
                        Diff syntax controls how differences are rendered (default: compact)
  -i INDENT, --indent INDENT
                        Number of spaces to indent. None is compact, no indentation. (default: None)
  -f {json,yaml}, --format {json,yaml}
                        Specify file format for input and dump (default: json)

Examples:

$ jdiff a.json b.json -i 2

$ jdiff a.json b.json -i 2 -s symmetric

$ jdiff a.yaml b.yaml -f yaml -s symmetric

Development

Install development dependencies and test locally with

pip install -r requirements-dev.txt
# ... do your work ... add tests ...
pytest

Installing From Source

To install from source run

pip install .

This will install the library and cli for jsondiff as well as its runtime dependencies.

Testing before release

python -m build
twine check dist/*

jsondiff's People

Contributors

abhaystoic, alexgann, calebmadrigal, corytodd, cp-macrofab, ericremoreynolds, erykoff, fzumstein, gadgetjunkie, jwilk, kloczek, mgorny, neumond, payam54, pixelb, r3m0t, ramwin


jsondiff's Issues

Why is this returning dictionaries with unhashable key names?

Hi,

Unless I'm missing something, it's unnecessarily difficult to do a .get("delete", None) on the dictionaries returned by jsondiff.diff() because the keys are named delete rather than "delete". Is this intentional, and is there a workaround I'm not aware of?

Thanks!

Apply diff

Is there any plan to add an apply_diff kind of function which would apply the diff to an object?

Issue in case of replace

a = {1:2}
b = {5:3}
diff(a,b,syntax="explicit")

The output in the above scenario is {5:3} rather than {$replace: {5: 3}} or {$insert: {5: 3}, $delete: {1:2}}, which I think would be a better result for the explicit syntax.

'Consul' object has no attribute 'txn'

I try to use txn for atomic transactions:

import platform
print("python ", platform.python_version())
import consul
print("consul ", consul.__version__)

c = consul.Consul()

print(dir(c))
c.txn.put(dict())

Output:

python  3.6.2
consul  0.7.1
['ACL', 'Agent', 'Catalog', 'Coordinate', 'Event', 'Health', 'KV', 'Operator', 'Query', 'Session', 'Status', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'acl', 'agent', 'catalog', 'connect', 'consistency', 'coordinate', 'dc', 'event', 'health', 'http', 'kv', 'operator', 'query', 'scheme', 'session', 'status', 'token']
Traceback (most recent call last):
  File "./txn.py", line 10, in <module>
    c.txn.put(dict())
AttributeError: 'Consul' object has no attribute 'txn'

How can I use txn?

Equality operator implementation for Symbols

When the diff has been performed, we obtain a dictionary with keys of type Union[T, jsondiff.symbols.Symbol] (T being the type(s) of the keys in the two original dictionaries). This made it hard for me to compare the keys efficiently, especially because the jsondiff.symbols.Symbol class does not define any kind of comparison operator.
It would be much easier to have a simple equality operator that returns True if the label attributes of the two Symbols are identical.
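
The request above can be sketched as follows. This is a minimal illustration using a hypothetical stand-in class, not the code of jsondiff.symbols.Symbol; only the label attribute is taken from the issue text. Note that a class which defines __eq__ without __hash__ becomes unhashable, so both are needed for symbols to keep working as dictionary keys.

```python
class Symbol:
    """Hypothetical stand-in for jsondiff.symbols.Symbol (not the library's code)."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return self.label

    def __eq__(self, other):
        # Two symbols are equal when their labels match.
        if isinstance(other, Symbol):
            return self.label == other.label
        return NotImplemented

    def __hash__(self):
        # Defining __eq__ removes the default hash, so provide one that
        # agrees with equality; otherwise symbols cannot be dict keys.
        return hash(("Symbol", self.label))


delete_a = Symbol("delete")
delete_b = Symbol("delete")
result = {delete_a: ["a"], "delete": 4}

print(delete_a == delete_b)   # label-based equality
print(result[delete_b])       # lookup through an equal symbol
print(result["delete"])       # the plain string key stays distinct
```

With label-based equality, any symbol carrying the same label can be used for the lookup, while the string "delete" still refers to the user's own key.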

Bug when second list is empty

Kind of a bug:

>>> import jsondiff
>>> a = []
>>> b = ['a', 'b', 'c']
>>> jsondiff.diff(a,b)
['a', 'b', 'c']  # ok
>>> jsondiff.diff(b,a)
[]  # wrong. expected: {delete: [3, 2, 1]}
>>> jsondiff.diff(b,a, syntax='explicit')
[]  # wrong. expected: {delete: [3, 2, 1]}
>>> jsondiff.diff(a,b, syntax='symmetric')
[[], ['a', 'b', 'c']]  # ok ?

2.0.0: pep517 based build fails with latest `setuptools` 62.0.0

It looks like the just-released 2.0.0 fails in a pep517-based build with the latest setuptools 62.0.0:

+ /usr/bin/python3 -sBm build -w --no-isolation
* Getting dependencies for wheel...
running egg_info
creating jsondiff.egg-info
writing jsondiff.egg-info/PKG-INFO
writing dependency_links to jsondiff.egg-info/dependency_links.txt
writing entry points to jsondiff.egg-info/entry_points.txt
writing top-level names to jsondiff.egg-info/top_level.txt
writing manifest file 'jsondiff.egg-info/SOURCES.txt'
reading manifest file 'jsondiff.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'jsondiff.egg-info/SOURCES.txt'
* Building wheel...
running bdist_wheel
running build
running build_py
creating build
creating build/lib
creating build/lib/jsondiff
copying jsondiff/__init__.py -> build/lib/jsondiff
copying jsondiff/cli.py -> build/lib/jsondiff
copying jsondiff/symbols.py -> build/lib/jsondiff
installing to build/bdist.linux-x86_64/wheel
running install
running install_lib
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/wheel
creating build/bdist.linux-x86_64/wheel/jsondiff
copying build/lib/jsondiff/__init__.py -> build/bdist.linux-x86_64/wheel/jsondiff
copying build/lib/jsondiff/cli.py -> build/bdist.linux-x86_64/wheel/jsondiff
copying build/lib/jsondiff/symbols.py -> build/bdist.linux-x86_64/wheel/jsondiff
running install_egg_info
running egg_info
writing jsondiff.egg-info/PKG-INFO
writing dependency_links to jsondiff.egg-info/dependency_links.txt
writing entry points to jsondiff.egg-info/entry_points.txt
writing top-level names to jsondiff.egg-info/top_level.txt
reading manifest file 'jsondiff.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
adding license file 'LICENSE'
writing manifest file 'jsondiff.egg-info/SOURCES.txt'
Copying jsondiff.egg-info to build/bdist.linux-x86_64/wheel/jsondiff-2.0.0-py3.8.egg-info
running install_scripts
Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/pep517/in_process/_in_process.py", line 363, in <module>
    main()
  File "/usr/lib/python3.8/site-packages/pep517/in_process/_in_process.py", line 345, in main
    json_out['return_val'] = hook(**hook_input['kwargs'])
  File "/usr/lib/python3.8/site-packages/pep517/in_process/_in_process.py", line 261, in build_wheel
    return _build_backend().build_wheel(wheel_directory, config_settings,
  File "/usr/lib/python3.8/site-packages/setuptools/build_meta.py", line 244, in build_wheel
  File "/usr/lib/python3.8/site-packages/setuptools/build_meta.py", line 229, in _build_with_temp_dir
  File "/usr/lib/python3.8/site-packages/setuptools/build_meta.py", line 281, in run_setup
  File "/usr/lib/python3.8/site-packages/setuptools/build_meta.py", line 174, in run_setup
  File "setup.py", line 8, in <module>
    setup(
  File "/usr/lib/python3.8/site-packages/setuptools/__init__.py", line 87, in setup
  File "/usr/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 148, in setup
  File "/usr/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 163, in run_commands
  File "/usr/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 967, in run_commands
  File "/usr/lib/python3.8/site-packages/setuptools/dist.py", line 1214, in run_command
  File "/usr/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
  File "/usr/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 335, in run
    self.run_command('install')
  File "/usr/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
    help = {}
  File "/usr/lib/python3.8/site-packages/setuptools/dist.py", line 1214, in run_command
  File "/usr/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
  File "/usr/lib/python3.8/site-packages/setuptools/command/install.py", line 68, in run
  File "/usr/lib/python3.8/site-packages/setuptools/_distutils/command/install.py", line 682, in run
  File "/usr/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
    help = {}
  File "/usr/lib/python3.8/site-packages/setuptools/dist.py", line 1214, in run_command
  File "/usr/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
  File "/usr/lib/python3.8/site-packages/setuptools/command/install_scripts.py", line 19, in run
ModuleNotFoundError: No module named 'setuptools.command.easy_install'

List ordering is preserved

I'm trying to compare and diff some json structures.
I expected the following two lists to compare as equal, but they do not:

>>> jsondiff.diff(
    [u'all_users', u'viewer'],
    [u'viewer', u'all_users']
)
{delete: [0], insert: [(1, u'all_users')]}

Should ordering be considered when comparing?
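
One possible workaround, sketched here with the standard library only, is to compare the lists as multisets before (or instead of) diffing them. The helper name same_items_any_order is an assumption made for illustration and is not part of jsondiff. It only works when the list items are hashable, so lists of dicts would need a different approach.

```python
from collections import Counter


def same_items_any_order(a, b):
    """Return True when two lists hold the same hashable items, ignoring order."""
    return Counter(a) == Counter(b)


old = ['all_users', 'viewer']
new = ['viewer', 'all_users']

print(same_items_any_order(old, new))
```

If duplicates do not matter either, converting both lists to sets before calling diff is another option, since the Quickstart shows that sets get their own order-independent handling.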

jsondiff.diff returns a list instead of dict when loaded through OrderedDict

With the two attached jsons (renamed to .txt to appease github…) trying to run jsondiff.diff on them returns a list.

In my application I have already loaded the JSONs, since I do some processing on them before handing them over to jsondiff, and we need to use OrderedDict for that.

Transcript:

In [1]: import json

In [2]: import jsondiff

In [3]: import collections

In [4]: with open('1.json') as f:
   ...:     a = json.load(f, object_pairs_hook=collections.OrderedDict)
   ...:     

In [5]: with open('2.json') as f:
   ...:     b = json.load(f, object_pairs_hook=collections.OrderedDict)
   ...:     

In [6]: type(jsondiff.diff(a, b))
Out[6]: list

In [7]: jsondiff.diff(a, b)
Out[7]: 
[OrderedDict([('id', '/build/node-module-deps-6.1.0/2nd/example/files/bar.js'),
              ('source',
               'module.exports = function (n) {\n    return n * 100;\n};\n'),
              ('deps', OrderedDict()),
              ('file',
               '/build/node-module-deps-6.1.0/2nd/example/files/bar.js')]),
 OrderedDict([('id', '/build/node-module-deps-6.1.0/2nd/example/files/foo.js'),
              ('source',
               "var bar = require('./bar');\n\nmodule.exports = function (n) {\n    return n * 111 + bar(n);\n};\n"),
              ('deps',
               OrderedDict([('./bar',
                             '/build/node-module-deps-6.1.0/2nd/example/files/bar.js')])),
              ('file',
               '/build/node-module-deps-6.1.0/2nd/example/files/foo.js')]),
 OrderedDict([('file',
               '/build/node-module-deps-6.1.0/2nd/example/files/main.js'),
              ('id',
               '/build/node-module-deps-6.1.0/2nd/example/files/main.js'),
              ('source',
               "var foo = require('./foo');\nconsole.log('main: ' + foo(5));\n"),
              ('deps',
               OrderedDict([('./foo',
                             '/build/node-module-deps-6.1.0/2nd/example/files/foo.js')])),
              ('entry', True)])]

In [8]: 

Is this expected? I always saw dicts being returned by jsondiff, and also I can't spot anything in jsondiff's code that obviously returns a list out of itself.

1.txt
2.txt

Feature request: diff merging

I'm using jsondiff to reduce how much I'm sending to clients. My current scheme returns a list of diffs since the last time the client requested the state. It would be better if it sent one diff with all changes merged together, in case of redundancies.

As a side note: it would be nice if there were a way to turn the result of a diff back into a diff. In other words, turn the strings representing symbols back into symbols. This way I can internally represent state as dictionaries without having to serialize them to a json blob to apply the diff.

No difference label if one of the JSONs is empty

Hi,

I ran some tests with the library and hit a case where one of the compared JSON documents is empty: { }. I am using syntax='explicit', and the diff returns exactly the non-empty JSON. I would like it to return something like:

{
  insert: ...
}

The "insert" tag is quite important during my parsing.

Can I dump only $update?

I only care about $update, but the JSON result also includes $insert/$delete. Can I dump only $update? Thanks!

Incorrect diff for a string

Hi, I've got two JSON files:

old.json

{
  "network": {
    "domain": "old.domain"
  }
}

new.json

{
  "network": {
    "domain": "new.domain"
  }
}

When I run:

jsondiff.diff(old_json, new_json, syntax="symmetric")

I've got this:

{
    "network": {
        "domain": [
            "old.domain",
            "new.domain"
        ]
    }
}

Instead of:

{
    "network": {
        "domain": {
            "$delete": [ "old.domain" ],
            "$insert": [ "new.domain" ]
        }
    }
}

The expected JSON I've written is probably not right; it's just an example to show the diff keys I expect.

Is this normal behavior? Did I forget something?

When giving a file that is not JSON, there's traceback output

Hi, thanks for a good tool. I like it!

However, from time to time I run it on a file that is not JSON, because of a typo or other human error.

> jdiff file-a.json file-b.json 
Traceback (most recent call last):
  File "/opt/conda/bin/jdiff", line 8, in <module>
   ...
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I think that the end user should never have to see a traceback. Wrapping this call in a try-except block that just reports that file-a.json is not a valid JSON file would be a reasonable quality-of-life improvement. I'm a developer and understand the traceback, but someone who is less technically savvy will not necessarily understand the quite cryptic JSONDecodeError message.

All the best
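
The suggested try-except wrapper could look roughly like this. It is a sketch using only the standard library; the function name load_json_or_exit and the wording of the messages are assumptions, and it is not taken from the jdiff source.

```python
import json
import sys


def load_json_or_exit(path):
    """Load a JSON file, exiting with a short message instead of a traceback."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except json.JSONDecodeError as err:
        # JSONDecodeError carries the position of the first offending character.
        sys.exit(f"{path} is not a valid JSON file: {err.msg} "
                 f"(line {err.lineno}, column {err.colno})")
    except OSError as err:
        # Covers missing files, permission problems and similar I/O failures.
        sys.exit(f"cannot read {path}: {err.strerror}")
```

Passing a string to sys.exit prints it to stderr and sets a non-zero exit status, which keeps the tool usable in shell scripts.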

Is there a way to use indent other than the CLI tool?

Currently I use the following:

import json
from jsondiff import diff

with open('a.json', encoding='utf-8') as f1:
    data1 = json.load(f1)
with open('b.json', encoding='utf-8') as f2:
    data2 = json.load(f2)

abc = diff(data1, data2, syntax='symmetric')
print(json.dumps(abc, indent=4))
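
One caveat with json.dumps on a raw diff is that the standard encoder rejects dictionary keys that are not basic JSON types, and the Quickstart notes that the default keys in a diff are objects, not strings. The README's marshal=True option is the documented way to get string keys. As a library-independent sketch, the hypothetical helper stringify_keys below converts keys before pretty-printing; the Marker class is only a stand-in for a symbol-like key.

```python
import json


class Marker:
    """Hypothetical stand-in for a non-string key such as a diff symbol."""

    def __init__(self, label):
        self.label = label

    def __str__(self):
        return "$" + self.label


def stringify_keys(value):
    """Recursively copy value, replacing every non-string dict key with str(key)."""
    if isinstance(value, dict):
        return {key if isinstance(key, str) else str(key): stringify_keys(item)
                for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [stringify_keys(item) for item in value]
    return value


result = {"list": {Marker("delete"): [(1, 2)]}}
pretty = json.dumps(stringify_keys(result), indent=4)
print(pretty)
```

Integer keys are also turned into strings by this helper, which matches what the JSON encoder would do with them anyway.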

[Question] Comparison to DeepDiff

Hey. I wonder whether jsondiff has advantages over DeepDiff? I use the latter to compare big dicts and see a delta.

It would be great if you could say a few words about the library's strengths. 👍🏻

1.3.1: test suite uses `nose-randomly`

nose-randomly is archived and is no longer maintained: https://github.com/adamchainz/nose-randomly/

Additionally, with default settings pytest cannot find any tests:

+ /usr/bin/pytest -ra tests
=========================================================================== test session starts ============================================================================
platform linux -- Python 3.8.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /home/tkloczko/rpmbuild/BUILD/jsondiff-1.3.1
plugins: anyio-3.3.4, black-0.3.12
collected 0 items

========================================================================== no tests ran in 0.02s ===========================================================================

Only when pytest is started with tests/*.py it shows:

-2.fc35.x86_64/usr/lib/python3.8/site-packages
+ /usr/bin/pytest -ra tests/__init__.py tests/generate_readme.py tests/utils.py
=========================================================================== test session starts ============================================================================
platform linux -- Python 3.8.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /home/tkloczko/rpmbuild/BUILD/jsondiff-1.3.1
plugins: anyio-3.3.4, black-0.3.12
collected 0 items / 2 errors

================================================================================== ERRORS ==================================================================================
________________________________________________________________ ERROR collecting tests/generate_readme.py _________________________________________________________________
ImportError while importing test module '/home/tkloczko/rpmbuild/BUILD/jsondiff-1.3.1/tests/generate_readme.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib64/python3.8/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/__init__.py:8: in <module>
    from nose_random import randomize
E   ModuleNotFoundError: No module named 'nose_random'
________________________________________________________________ ERROR collecting tests/generate_readme.py _________________________________________________________________
ImportError while importing test module '/home/tkloczko/rpmbuild/BUILD/jsondiff-1.3.1/tests/generate_readme.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib64/python3.8/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/__init__.py:8: in <module>
    from nose_random import randomize
E   ModuleNotFoundError: No module named 'nose_random'
========================================================================= short test summary info ==========================================================================
ERROR tests/generate_readme.py
ERROR tests/generate_readme.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 2 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
============================================================================ 2 errors in 0.16s =============================================================================

Patch example

Hi,

Could someone please give (or document) an example of how to apply a diff between two JSON documents? I guess it uses the patch method, but I can't figure out how to use it. Help, please.

Thanks
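
To make the idea of applying a diff concrete, here is a minimal sketch that does not use the library's own patch support. It assumes a flat diff in the string-key form that the Quickstart shows for marshal=True; the function name apply_flat_diff is hypothetical, and nested diffs, lists and sets are deliberately not handled.

```python
def apply_flat_diff(original, marshaled_diff, delete_key="$delete"):
    """Apply a flat, compact-style diff to a dict and return a new dict.

    Only top-level keys are handled: every ordinary key is set to its new
    value and every key listed under delete_key is removed.
    """
    patched = dict(original)
    for key, value in marshaled_diff.items():
        if key != delete_key:
            patched[key] = value
    for key in marshaled_diff.get(delete_key, []):
        patched.pop(key, None)
    return patched


a = {'a': 1, 'delete': 2}
d = {'delete': 4, 'b': 3, '$delete': ['a']}   # marshaled diff from the Quickstart
patched = apply_flat_diff(a, d)
print(patched)
```

Applying the Quickstart diff to its first input reproduces the second input, which is the property any patch routine should have.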

Can't use standard json.dump

Example:

a = {
    "dict": {
        "nested_list": [
            {
                "@dict_list_dict_f1": 1,
            },
            {
                "@dict_list_dict_f1": 2,
            },
        ]
    }
}

b = {
    "dict": {
        "nested_list": [
            {
                "@dict_list_dict_f1": 1,
            },
        ]
    }
}

import json
from jsondiff import diff
d = diff(a, b, syntax='symmetric')

print(d)
{'dict': {'nested_list': {delete: [(1, {'@dict_list_dict_f1': 2})]}}}

print(json.dumps(d))

Traceback (most recent call last):
    print(json.dumps(d))
  File "/usr/lib/python3.5/json/__init__.py", line 230, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python3.5/json/encoder.py", line 198, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python3.5/json/encoder.py", line 256, in iterencode
    return _iterencode(o, 0)
TypeError: keys must be a string


print(diff(a, b,  syntax='symmetric', dump=True))
{"dict": {"nested_list": {"$delete": [[1, {"@dict_list_dict_f1": 2}]]}}}
# the internal dump converts the nested tuple to a list?
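
On the tuple question: JSON has no tuple type, so Python's standard json module writes tuples as arrays, and they come back as lists after a round trip. The snippet below shows this with the standard library alone; it says nothing about how jsondiff's dump option is implemented internally.

```python
import json

# The same (index, value) pair that appears in the diff above.
entry = (1, {"@dict_list_dict_f1": 2})

encoded = json.dumps({"$delete": [entry]})
decoded = json.loads(encoded)

print(encoded)
print(type(decoded["$delete"][0]).__name__)
```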

Question. Writing difference to third file in JSON format.

I've got two JSON files pulled from the same website a day apart, which I save using json.dump with "indent=4" to format them.

I would like to compare the old JSON file with the new JSON file with the possibility that the newer file will have additional objects.
If there are additional objects, I'd like to export them to a third JSON file with the same format.
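
A standard-library sketch of this workflow is shown below. The function names added_entries and write_added are assumptions for illustration, and the comparison only looks at the top level: new keys for two dicts, new items for two lists. A nested comparison would need a real diff.

```python
import json


def added_entries(old, new):
    """Return the top-level entries of new that are absent from old."""
    if isinstance(old, dict) and isinstance(new, dict):
        return {key: value for key, value in new.items() if key not in old}
    if isinstance(old, list) and isinstance(new, list):
        return [item for item in new if item not in old]
    raise TypeError("expected two dicts or two lists")


def write_added(old_path, new_path, out_path):
    """Write the entries that only the newer file contains to out_path."""
    with open(old_path, encoding="utf-8") as handle:
        old = json.load(handle)
    with open(new_path, encoding="utf-8") as handle:
        new = json.load(handle)
    with open(out_path, "w", encoding="utf-8") as handle:
        # indent=4 keeps the same formatting as the input files.
        json.dump(added_entries(old, new), handle, indent=4)
```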

ERROR in main: Server Error: maximum recursion depth exceeded in comparison

File "lib/python3.6/site-packages/jsondiff/__init__.py", line 597, in diff
    return cls(**kwargs).diff(a, b, fp)
  File "lib/python3.6/site-packages/jsondiff/__init__.py", line 501, in diff
    d, s = self._obj_diff(a, b)
  File "lib/python3.6/site-packages/jsondiff/__init__.py", line 488, in _obj_diff
    return self._list_diff(a, b)
  File "lib/python3.6/site-packages/jsondiff/__init__.py", line 408, in _list_diff
    for sign, value, pos, s in self._list_diff_0(C, X, Y, len(X), len(Y)):
  File "lib/python3.6/site-packages/jsondiff/__init__.py", line 374, in _list_diff_0
    for annotation in self._list_diff_0(C, X, Y, i-1, j-1):
  File "lib/python3.6/site-packages/jsondiff/__init__.py", line 374, in _list_diff_0
    for annotation in self._list_diff_0(C, X, Y, i-1, j-1):
  File "lib/python3.6/site-packages/jsondiff/__init__.py", line 374, in _list_diff_0
    for annotation in self._list_diff_0(C, X, Y, i-1, j-1):
  [Previous line repeated 958 more times]
  File "lib/python3.6/site-packages/jsondiff/__init__.py", line 372, in _list_diff_0
    d, s = self._obj_diff(X[i-1], Y[j-1])
  File "lib/python3.6/site-packages/jsondiff/__init__.py", line 484, in _obj_diff
    return self._dict_diff(a, b)
  File "lib/python3.6/site-packages/jsondiff/__init__.py", line 468, in _dict_diff
    d, s = self._obj_diff(v, w)
  File "lib/python3.6/site-packages/jsondiff/__init__.py", line 494, in _obj_diff
    return self.options.syntax.emit_value_diff(a, b, 1.0), 1.0
  File "lib/python3.6/site-packages/jsondiff/__init__.py", line 109, in emit_value_diff
    if s == 1.0:
RecursionError: maximum recursion depth exceeded in comparison
[2018-07-15 14:47:23,259] ERROR in main: Server Error: maximum recursion depth exceeded in comparison
127.0.0.1 - - [15/Jul/2018 14:47:23] "POST /compare HTTP/1.1" 500

Floating point imprecision causes different dicts to be marked as similar if highly nested

Either we should migrate to integer-only arithmetic or use Python's decimal library to increase precision.

I have a use case where JSON schemas are marked as similar if the difference is at the deepest level of a fairly nested schema.

Alternatively instead of using similarity score < 1 to check for similarity we should maybe change criteria to be that returned diff is not empty?

                d, s = self._obj_diff(v, w)
                if s < 1.0:
                    changed[k] = d

i.e. this code may need to check d instead of s.
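
The effect can be reproduced with a simplified model, which is an assumption for illustration and not jsondiff's actual scoring code: each nesting level averages the scores of ten children, nine of which are identical (score 1) while one lies on the path to a single differing leaf (score 0). In exact arithmetic the result stays below 1 at any depth, but in floating point it eventually rounds to exactly 1.0, so a check like s < 1.0 stops firing.

```python
from fractions import Fraction


def nested_score(depth, siblings=10, arithmetic=float):
    """Score of a tree with one differing leaf, averaged level by level."""
    score = arithmetic(0)  # the differing leaf itself
    for _ in range(depth):
        # (siblings - 1) identical children score 1 each; one child carries score.
        score = (arithmetic(siblings - 1) + score) / arithmetic(siblings)
    return score


float_score = nested_score(30)
exact_score = nested_score(30, arithmetic=Fraction)

print(float_score == 1.0)   # the float has collapsed to exactly 1.0
print(exact_score < 1)      # the exact value is still below 1
```

Checking whether the returned diff is empty, as proposed above, avoids the problem because it does not depend on the rounded score.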

Publish sdist and bdist wheel

The benefits of wheels are well documented. See: https://pythonwheels.com/
This package is pure Python and publishing it as both source and as a wheel is simple.

Would you accept a PR that adds a Makefile that makes it easy for you to release both an sdist and bdist_wheel to PyPI when you create releases?

Is there a simple way to convert the output dictionary to dictionary keys?

I see the output is a dictionary with the location of the items that were removed. Is there a quick way to turn that into a list of dict keys (or indices, if they are nested) so I can go straight to the location in my JSON where the diff occurred? I know I can do this with a few for loops, but I'm wondering whether there's a built-in way.
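
The "few for loops" can be folded into a small recursive generator. This sketch assumes a nested compact-style diff with string marker keys, as the Quickstart shows for marshal=True; the name diff_paths is hypothetical. It treats every nested dict as a deeper level of the diff, so a dict that was inserted wholesale as a new value cannot be told apart from a nested change.

```python
def diff_paths(diff, prefix=(), delete_key="$delete"):
    """Yield (path, change) pairs from a nested compact-style diff."""
    for key, value in diff.items():
        if key == delete_key:
            # Each deleted key or index becomes its own path.
            for removed in value:
                yield prefix + (removed,), "deleted"
        elif isinstance(value, dict):
            yield from diff_paths(value, prefix + (key,), delete_key)
        else:
            yield prefix + (key,), value


example = {'a': {1: {'b': 5}}, '$delete': ['c']}
paths = list(diff_paths(example))
print(paths)
```

Each path is a tuple of keys and indices that can be followed step by step into the original document.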

Syntax Error: invalid syntax

from jsondiff import diff

print diff('["a", "b", "c"]', '["a", "c", "d"]', load=True, dump=True)
File "", line 1
print diff('["a", "b", "c"]', '["a", "c", "d"]', load=True, dump=True)
^
SyntaxError: invalid syntax

Are the tests working?

I cloned the project and ran the tests, but they fail. Do you have any advice?

$ pytest -vv

output

self = <tests.test_jsondiff.JsonDiffTests testMethod=test_a>

    def test_a(self):

        self.assertEqual({}, diff(1, 1))
        self.assertEqual({}, diff(True, True))
        self.assertEqual({}, diff('abc', 'abc'))
        self.assertEqual({}, diff([1, 2], [1, 2]))
        self.assertEqual({}, diff((1, 2), (1, 2)))
        self.assertEqual({}, diff({1, 2}, {1, 2}))
        self.assertEqual({}, diff({'a': 1, 'b': 2}, {'a': 1, 'b': 2}))
        self.assertEqual({}, diff([], []))
        self.assertEqual({}, diff(None, None))
        self.assertEqual({}, diff({}, {}))
        self.assertEqual({}, diff(set(), set()))

        self.assertEqual(2, diff(1, 2))
        self.assertEqual(False, diff(True, False))
        self.assertEqual('def', diff('abc', 'def'))
        self.assertEqual([3, 4], diff([1, 2], [3, 4]))
        self.assertEqual((3, 4), diff((1, 2), (3, 4)))
        self.assertEqual({3, 4}, diff({1, 2}, {3, 4}))
>       self.assertEqual({replace: {'c': 3, 'd': 4}}, diff({'a': 1, 'b': 2}, {'c': 3, 'd': 4}))
E       TypeError: unhashable type: 'Symbol'

tests/test_jsondiff.py:45: TypeError

New release??

Can you publish a new release so folks can pipx install this?

Incorrect output?

Hi,
I am using jsondiff in a django project where I save some data in json format into the database. Basically the json is a list of dictionaries: [{"name":"James"},{"name":"John"}]

When I receive a new json in the request this is the way I try to get the json changes:

old = '[{}]'
new = '[{"name":"James"}]'

difference = diff(old, new, load=True, syntax='symmetric')

What I would expect as the value for difference is something like this:

{insert: [(0, {'name': 'James'})]}

but I got this:

[[{}], [{'name': 'James'}]]

Is this the correct behaviour?

Max recursion limit

List differences are evaluated recursively, which leads to a very large number of stack frames for long lists and ultimately runtime errors related to max recursion limit.

Are there any plans to implement this iteratively?
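
For reference, an iterative variant is straightforward when the list diff is based on a longest-common-subsequence table, which the recursive _list_diff_0(C, X, Y, i, j) frames in the traceback of the related issue suggest. The sketch below is a generic LCS backtrack written with a loop; it is not the library's implementation, it compares items by plain equality, and the names lcs_table and list_diff_iterative are assumptions.

```python
def lcs_table(x, y):
    """Build the standard longest-common-subsequence length table."""
    table = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table


def list_diff_iterative(x, y):
    """Backtrack through the table with a loop, so stack depth stays constant."""
    table = lcs_table(x, y)
    i, j = len(x), len(y)
    ops = []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and x[i - 1] == y[j - 1]:
            i, j = i - 1, j - 1                      # common item, no change
        elif j > 0 and (i == 0 or table[i][j - 1] >= table[i - 1][j]):
            ops.append(("insert", j - 1, y[j - 1]))  # item only in y
            j -= 1
        else:
            ops.append(("delete", i - 1, x[i - 1]))  # item only in x
            i -= 1
    ops.reverse()
    return ops


print(list_diff_iterative(['a', 'b', 'c'], ['a', 'c']))
```

The table still needs memory proportional to len(x) * len(y), so this removes the recursion limit but not the cost of diffing very long lists.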

Introduce YAML support for cli

It would be cool to have built-in support for YAML on the CLI. Most of my data is YAML and I end up having to pipe it through yq so native would be nice.

Since this only affects the cli, I propose leaving all of the current structure alone. We would need a new loader/dumper pair and a --format parameter so the user can hint their file type. We could get fancy and auto-detect the type but I'm not keen on that yet.

There should also be explicit tests for all of the loader/dumper variants.

Huge memory consumption with large JSON files

I just tried to diff two Firefox memory reports (73MiB each) and jdiff exploded; I had to shoot it down at about 5GiB of RAM usage. You can try it yourself: the reports can be generated on the internal about:memory page.

Comparing large JSON files

Hi and thanks for the great library.
My use case is that I compare DOM trees that are represented as JSON files to find the difference between two similar webpages. Unfortunately, I have problems comparing two large JSON files (>300K) as the comparison never comes to an end (I stopped after 10 minutes).

I'm not sure whether this is due to a bug in the code and/or due to the complexity of the JSON files. While debugging a bit, I realized that many elements are compared multiple times with the same element (or also themselves). For instance the following element from diff1.json when compared to diff2.json (Example files).

diff1["childNodes"][0]["childNodes"][1]["childNodes"][29]["childNodes"][18]["childNodes"][0]["childNodes"][0]["childNodes"][1]["childNodes"][0]["childNodes"][1]["childNodes"][1]["childNodes"][0]["childNodes"][0]["childNodes"][2]["childNodes"][40]["childNodes"][0]
{'nodeName': '#text',
 'nodeValue': 'Tienda Kindle',
 'childNodes': [],
 'attributes': {}}

Is there any option in the library to compare large JSONs or do you have any recommendation how to approach this use case?
Thank you!

Can you return both positions in a and b when an item is replaced?

print(diff(['a', {"a": 0, "b": 111}], [{"a": 0, "b": 11}, 'c']))
{0: {'b': 11}, insert: [(1, 'c')], delete: [0]}

I need a result like:
{replace: [(1, 0, {'b': 11})], insert: [(1, 'c')], delete: [0]}

because I want to know both positions, in a and in b, when an item is replaced.
