max-sixty / pytest-accept Goto Github PK

A pytest plugin for automatically updating doctest outputs

License: Apache License 2.0

Python 100.00%

pytest snapshot-testing regression-testing literate-testing

pytest-accept's Introduction

pytest-accept

pytest-accept is a pytest plugin for automatically updating doctest outputs. It runs doctests, observes the generated outputs, and writes them to the doctests' documented outputs.

It's designed for a couple of audiences:

Folks who work with doctests, and don't enjoy manually copying & pasting outputs from the pytest error log to their doctests' documented outputs. pytest-accept does the copying & pasting for you.
Folks who generally find writing any tests a bit annoying, and prefer to develop by "running the code and seeing if it works". This library aims to make testing a joyful part of that development loop.

pytest-accept is decoupled from the doctests it works with — it can be used with existing doctests, and the doctests it edits are no different from normal doctests.

Jesse, what the?

Here's an example of what pytest-accept does: given a file like add.py containing an incorrect documented output:

def add(x, y):
    """
    Adds two values.

    >>> add(1, 1)
    3

    >>> add("ab", "c")
    'bac'
    """

    return x + y

...running doctests using pytest and passing --accept replaces the existing incorrect values with correct values:

pytest --doctest-modules examples/add.py --accept

diff --git a/examples/add.py b/examples/add.py
index 10a71fd..c2c945f 100644
--- a/examples/add.py
+++ b/examples/add.py
@@ -3,10 +3,10 @@ def add(x, y):
     Adds two values.

     >>> add(1, 1)
-    3
+    2

     >>> add("ab", "c")
-    'bac'
+    'abc'
     """

     return x + y

This style of testing is fairly well-developed in some languages, although still doesn't receive the attention I think it deserves, and historically hasn't had good support in python.

Confusingly, it's referred to "snapshot testing" or "regression testing" or "expect testing" or "literate testing" or "acceptance testing". The best explanation I've seen on this testing style is from @yminsky in a Jane Street Blogpost. @matklad also has an excellent summary in his blog post How to Test.

Installation

pip install pytest-accept

What about pytest tests?

A previous effort in assert_plugin.py attempted to do this for assert statements, and the file contains some notes on the effort. The biggest problem is pytest stops on the first assert failure in each test, which is very limiting. (Whereas pytest can be configured to continue on doctest failures, which this library takes advantage of.)

It's probably possible to change pytest's behavior here, but it's a significant effort on the pytest codebase.

Some alternatives:

Use an existing library like pytest-regtest, which offers file snapshot testing (i.e. not inline).
We could write a specific function / fixture, like accept(result, "abc"), similar to frameworks like rust's excellent insta (which I developed some features for), or ocaml's ppx_expect.
- But this has the disadvantage of coupling the test to the plugin: it's not possible to run tests independently of the plugin, or use the plugin on general assert tests. And one of the great elegances of pytest is its deferral to a normal assert statement.
Some of this testing feels like writing a notebook and testing that. pytest-notebook fully implements this.

Anything else?

Nothing ground-breaking! Some notes:

If a docstring uses escape characters such as \n, python will interpret them as the escape character rather than the literal. Use a raw string to have it interpreted as a literal. e.g. this fails:
```
def raw_string():
    """
    >>> "\n"
    '\n'
    """
```
but succeeds with:
```
def raw_string():
-    """
+    r"""
    >>> "\n"
    '\n'
```
Possibly pytest-accept could do more here — e.g. change the format of the docstring. But that would not be trivial to implement, and may be too invasive.
The library attempts to confirm the file hasn't changed between the start and end of the test and won't overwrite the file where it detects there's been a change. This can be helpful for workflows where the tests run repeatedly in the background (e.g. using something like watchexec) while a person is working on the file, or when the tests take a long time, maybe because of --pdb. To be doubly careful, passing --accept-copy will cause the plugin to instead create a file named {file}.py.new rather than overwriting the file on any doctest failure.
- It will overwrite the existing documented values, though these aren't generally useful per se — they're designed to match the generated of the code. The only instances they could be useful is where they've been manual curated (e.g. removing volatile outputs like hashes), and in those cases ideally they can be restored from version control. Or as above, pass --accept-copy to be conservative.
This is still fairly early, has mostly been used by me & xarray and there may be some small bugs. Let me know anything at all and I'll attempt to fix them.
It currently doesn't affect the printing of test results; the doctests will still print as failures.
- TODO: A future version could print something about them being fixed.
Python's doctest library is imperfect:
- It can't handle indents, and probably other things.
  - We modify the output to match the doctest format; e.g. with blanklines. If generated output isn't sufficient for the doctest to pass, and there is some form of output that's sufficient, please report as a bug.
- The syntax for .* is an ellipsis ..., which is also the syntax for continuing a code line, so the beginning of a line must always be specified.
- The syntax for all the directives is arguably less than aesthetically pleasing.
- It doesn't have an option for pretty printing, so the test must pretty print itself with pprint(x), which is verbose.
- It reports line numbers incorrectly in some cases — two docstring lines separated with continuation character \ is counted as one, meaning this library will not have access to the correct line number for doctest inputs and outputs.

pytest-accept's People

Contributors

Stargazers

Watchers

Forkers

illviljan antoniol untitaker

pytest-accept's Issues

Issue in case of excess Ellipsis `...` at the end of a multi-line doctest statement

If there is a trailing ellipsis at the end of a doctest statement

My IDE syntax coloring, and various other examples in the xarray codebase suggests me that there should not be any trailing Ellipsis at the end of a multi-line doctest statement, but I thought it was still worth mentioning this edge case!

Screenshot: Diff before and after running pytest-accept

Check if a file has changed before overwriting

As discussed in the readme, currently pytest-accept will overwrite a file unless --accept-copy is passed.

Generally that's fine. But recently I've been using it as a realtime tool recently (i.e. it runs with pytest -f --accept, running on every save), which is really cool. And it requires pausing on each save for it to generate and write new results, and getting that wrong causes an overwrite and resolving the change in an editor (or losing it), which can get confusing, even if it's only a couple of seconds of changes.

I think by default it should check that the file hasn't changed before overwriting. If it's changed, it still fails the test as it would otherwise, and prints a message saying it didn't overwrite and suggesting to run again

correcting linebreaks: `\n` --> `\\n`

Assume a function with doctest where the return create a linebreak \n:

import xarray as xr

def func(a, b):
    """
    Example:
    >>> import xarray as xr
    >>> func(xr.DataArray([5], dims='a'), xr.DataArray([1], dims='a'))
    """
    c = a + b
    c = c.assign_attrs(a=a, b=b)
    return c

pytest --doctest-modules file.py --accept adds the expected:

import xarray as xr

def func(a, b):
    """
    Example:
    >>> import xarray as xr
    >>> func(xr.DataArray([5], dims='a'), xr.DataArray([1], dims='a'))
    <xarray.DataArray (a: 1)>
    array([6])
    Dimensions without coordinates: a
    Attributes:
        a:        <xarray.DataArray (a: 1)>\narray([5])\nDimensions without coord...
        b:        <xarray.DataArray (a: 1)>\narray([1])\nDimensions without coord...
    """
    c = a + b
    c = c.assign_attrs(a=a, b=b)
    return c

Usually I expect that I can run pytest --doctest-modules file.py --accept and all doctests pass. Now they dont because a single \ is present but docstrings somehow expect \\n. I get an error message:

――――――――――――――――――――――――――――――――――――――――――――――――― ERROR collecting file.py ――――――――――――――――――――――――――――――――――――――――――――――――――
../../anaconda3/envs/climpred-dev/lib/python3.9/doctest.py:939: in find
    self._find(tests, obj, name, module, source_lines, globs, {})
../../anaconda3/envs/climpred-dev/lib/python3.9/site-packages/_pytest/doctest.py:522: in _find
    doctest.DocTestFinder._find(  # type: ignore
../../anaconda3/envs/climpred-dev/lib/python3.9/doctest.py:1001: in _find
    self._find(tests, val, valname, module, source_lines,
../../anaconda3/envs/climpred-dev/lib/python3.9/site-packages/_pytest/doctest.py:522: in _find
    doctest.DocTestFinder._find(  # type: ignore
../../anaconda3/envs/climpred-dev/lib/python3.9/doctest.py:989: in _find
    test = self._get_test(obj, name, module, globs, source_lines)
../../anaconda3/envs/climpred-dev/lib/python3.9/doctest.py:1073: in _get_test
    return self._parser.get_doctest(docstring, globs, name,
../../anaconda3/envs/climpred-dev/lib/python3.9/doctest.py:675: in get_doctest
    return DocTest(self.get_examples(string, name), globs,
../../anaconda3/envs/climpred-dev/lib/python3.9/doctest.py:689: in get_examples
    return [x for x in self.parse(string, name)
../../anaconda3/envs/climpred-dev/lib/python3.9/doctest.py:651: in parse
    self._parse_example(m, name, lineno)
../../anaconda3/envs/climpred-dev/lib/python3.9/doctest.py:720: in _parse_example
    self._check_prefix(want_lines, ' '*indent, name,
../../anaconda3/envs/climpred-dev/lib/python3.9/doctest.py:805: in _check_prefix
    raise ValueError('line %r of the docstring for %s has '
E   ValueError: line 10 of the docstring for tes.func has inconsistent leading whitespace: 'array([5])'

Manually correcting \n to \\n lets tests pass:

import xarray as xr

def func(a, b):
    """
    Example:
    >>> import xarray as xr
    >>> func(xr.DataArray([5], dims='a'), xr.DataArray([1], dims='a'))
    <xarray.DataArray (a: 1)>
    array([6])
    Dimensions without coordinates: a
    Attributes:
        a:        <xarray.DataArray (a: 1)>\\narray([5])\\nDimensions without coord...
        b:        <xarray.DataArray (a: 1)>\\narray([1])\\nDimensions without coord...
    """
    c = a + b
    c = c.assign_attrs(a=a, b=b)
    return c

So would it be possible to add a small correction .replace("\n", "\\n") to pytest-accept?
Probably \n is not the only issue here.

Detect and redact memory addresses?

De we want to automatically remove strings that are unlikely to be consistent across runs, e.g. memory addresses (<__main__.A at 0x13ab3cd90>)?

Thoughts:

Most of the time this is useful, but occasionally someone might want to produce and test a string like this, so do we provide opt-out?
We don't want to put pytest-accept options into the test file, because it violates the principle of decoupling the library from the tests.
It could be a commandline flag? But then do these proliferate? We could have a single one for all the output adjusting (e.g. trimming > 1000 rows or columns, adding blankline comments, etc)
What do we redact? Matching memory addresses only, e.g. 0x1[0-f]{8}? Or something to cover more examples?

Issue with docstrings containing backslashes at the end of line

Hi,

I am currently using pytest-accept for this issue of xarray: pydata/xarray#8690

I noticed a bug when a docstring contains a backslash at the end of line, then the lines are not properly replaced, an offset is introduced and it breaks the docstring

Example of a docstring that will surely fail on the xarray repo

Zoom on the section containing a backslash:

   combine_attrs : {"drop", "identical", "no_conflicts", "drop_conflicts", \
                     "override"} or callable, default: "override"

This docstring will make the auto-replacement fail. I attach here a screenshot to visually see the issue:

When removing the backslash, the pytest-accept tool successfully replace. Currently as a way to mitigate, I plan to remove all the backslashes at ends of line, running pytest-accept and then git revert the backslash removal.

Thanks!

Edit: the issue is already known and mentioned in the README

it reports line numbers incorrectly in some cases — two docstring lines separated with continuation character \ is counted as one, meaning this library will not have access to the correct line number for doctest inputs and outputs.

Issue with doctest result containing escaped backslashes

When a doctest result contains escaped backslashes, after a run of pytest-accept, they are not escaped anymore

Potential solution according to the README: use a raw string as a docstring

Long lines support

Thanks for this project! I was trying to put some colour on this approach of doing testing. I have read the same blogpost as you from Jane Street folks and was looking if someone did something similar in the python land.
I really like the approach you took to leverage the already existing doctest-modules facility.

I want to propose the implementation of a --long-lines option to make pytest-accept avoid wrapping up "big" lines and I want all the output to be shown. We have a bunch of scrapers and we want to integrate tests altogether with code.

This is an example of how one can wrap long output in doctests, the --long-lines option ought to produce something like:

def large_return_value():
    """
def large_return_value():
    """
    >>> large_return_value()
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1\
, 1, 1, 1, 1, 1, 1, 1,1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,\
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1\
, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,\
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1\
, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,\
 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1\
, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \
1, 1, 1, 1, 1, 1, 1]
    """

    return [ 1 for _ in range(300)]

    """

    return [ 1 for _ in range(300)]

This is a hacky PR I created just to have a baseline for this conversation: #42

Would you accept a contribution for the above?

Also can you give me guidance on how to setup the development environment? I come from a virtualenv-inspired workflow but so far I had no success in installing pytest_accept in there. I am totally unfamiliar with poetry.
If I do pip install -e . I find it does not execute the pytest11 hook to install the plugin and I have to supply manually the -p option for pytest to find the plugin.

Truncate long lines

Currently if a result is one line of 100K characters, pytest-accept will add it to the file, which can then crash an editor.

Probably we should truncate the middle and leave X characters at the beginning and end of each line, joined with ...