lincc-frameworks / python-project-template
Python project best practices for scientific software
Home Page: https://lincc-ppt.readthedocs.io/
License: BSD 3-Clause "New" or "Revised" License
It would be good to include some information about what to expect when moving from a setup.py file to a pyproject.toml file.
We should also explain what directory structure is "necessary" for the various components to work together, e.g. pylint expects there to be a ./src directory for linting.
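For illustration, a minimal pyproject.toml using the src/ layout might look like the following. All names, pins, and dependencies here are hypothetical placeholders, not the template's actual values:

```toml
# Hypothetical minimal pyproject.toml replacing a setup.py
[build-system]
requires = ["setuptools>=62", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "example-package"          # distribution name; hyphens are fine here
version = "0.0.1"
requires-python = ">=3.8"
dependencies = ["numpy"]

[project.optional-dependencies]
dev = ["pytest", "pre-commit"]

[tool.setuptools.packages.find]
where = ["src"]                   # matches the ./src layout the linters expect
```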
Should the author of our libraries be the individuals, or LINCC-Frameworks?
Disclaimer: I don't know if this exists.
It would be very nice if we could find something that will allow us to record the time to run a test as a proxy for code performance.
Ideally this could be applied to the smoke test automation so that dormant projects will be monitored as well.
A specific example where this would have been valuable: the project FlexCode takes a dependency on xgboost, but didn't define the specific version of the dependency. The xgboost version was updated from 0.9 to 1.0, and that introduced a significant increase in FlexCode run times. This went completely unnoticed until someone asked, "hey, isn't this taking a lot longer to run???"
If there was a smoke test that recorded test run time in addition to pass/fail, then at least there would have been an indicator of the problem.
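pytest can already report the slowest tests with its `--durations=N` flag; persisting those numbers between runs is the missing piece. A rough sketch of the idea (the file name, helper name, and slowdown threshold are all arbitrary choices for illustration):

```python
import json
import time
from pathlib import Path

def record_duration(name, func, history_file=Path("durations.json"), slowdown=1.5):
    """Run `func`, record its wall-clock time in a JSON history file, and
    return True if it ran more than `slowdown` times slower than the last
    recorded run -- a crude signal that a dependency bump hurt performance."""
    history = json.loads(history_file.read_text()) if history_file.exists() else {}
    start = time.perf_counter()
    func()
    elapsed = time.perf_counter() - start
    previous = history.get(name)
    history[name] = elapsed
    history_file.write_text(json.dumps(history))
    return previous is not None and elapsed > slowdown * previous
```

Wired into a smoke-test workflow, the history file could be stashed as a CI artifact so dormant projects still get a regression signal.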
Want to make sure that readthedocs gets updated automatically.
A makefile should be present to allow users to build docs locally. Currently, if a user wanted to create one for the existing template, they'd likely have to copy one over and modify it as needed. We can protect from potential confusion by just having a predefined file ready to go.
One note, is that a default Makefile (e.g. one made by sphinx-quickstart) expects the documentation source files to live in a source directory. So this default should either be changed in the makefile, or the doc source files should be moved into a source/ folder. The advantage of the latter is that it seems more like the standard.
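For reference, the Makefile that sphinx-quickstart generates is approximately the following, with SOURCEDIR being the setting to change if the doc sources don't move into a source/ folder (note that make recipes must be indented with tabs):

```makefile
# Minimal Sphinx Makefile, approximating the sphinx-quickstart default.
SPHINXOPTS    ?=
SPHINXBUILD   ?= sphinx-build
SOURCEDIR     = source
BUILDDIR      = build

help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS)

.PHONY: help Makefile

# Route all unknown targets (html, latexpdf, ...) to sphinx-build -M.
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS)
```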
Check to see if a new issue is automatically added to lincc-frameworks project tracker.
Currently there is a GitHub action that will hydrate two test projects (one uses the default responses to the questions, the other provides some non-default responses). However, those tests were introduced before the template allowed the user to make branching choices (i.e. black vs. pylint) that result in different files or file content.
It would be nice to be able to confirm, via something like integration or unit tests, that the various responses result in hydrated test projects that contain the correct files or file contents.
It's too wordy and complicated. Needs a tl;dr section at the top.
Add pytest coverage to the list of dev dependencies, pre-commit hook, and GitHub workflow.
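A sketch of what the pyproject.toml side of that might look like; the package name is a placeholder:

```toml
# Illustrative additions for coverage reporting (package name hypothetical)
[project.optional-dependencies]
dev = ["pytest", "pytest-cov"]

[tool.pytest.ini_options]
addopts = "--cov=example_package --cov-report=term-missing"
```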
In the hydrated project we install the package with pip install .
But in the template tests, we run pip install -e . and then run the pytest step differently as well.
If we could execute the hydrated template CI as part of the template CI tests that would be ideal, and prevent us from having to keep the two in sync.
Not strictly necessary but a nice to have.
VSCode does this, and it does no harm that I know of.
Moving the template into a subdirectory, ./python_project_template, will allow us to separate metadata and tests for the template from metadata and tests that are the template.
For example, we want this project to have a README.md file, but we don't want to populate a new project with the same README.md file. Instead, we'll use ./python-project-template/README.md as the hydrated README file. Copier has a configuration option that allows defining the template directory, so we can use that to separate the template itself from its metadata.
Need to fix: currently it seems that readthedocs builds don't work as expected. See this build: https://readthedocs.org/projects/hipscat/builds/19413701/
Teach me to use GitHub Projects first thing in the morning. Nothing here!
The pre-commit hook that is meant to clear output from notebooks isn't discovering the notebooks in the nb/ directory.
Currently, isort and pylint look at all files recursively from the root directory. We need to specify that they should only look in the ./src directory.
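Both of these can likely be addressed with per-hook `files` patterns in .pre-commit-config.yaml. A sketch, assuming the nbstripout and isort hooks are the ones in use; the repo revisions are placeholders, not pinned recommendations:

```yaml
# Illustrative .pre-commit-config.yaml fragment
repos:
  - repo: https://github.com/kynan/nbstripout
    rev: 0.6.1
    hooks:
      - id: nbstripout
        files: ^nb/.*\.ipynb$   # explicitly match notebooks under nb/
  - repo: https://github.com/pycqa/isort
    rev: 5.12.0
    hooks:
      - id: isort
        files: ^src/            # only touch files under ./src
```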
As it stands, the user of the template must pip install . or pip install .[dev] in order for changes to the source code to be discoverable by tests. Thus if a user adds a src method, and then immediately adds a test for it, the test will fail because the new method has not been packaged.
In some sense this is a good thing, because it means that what's being tested is what would be deployed. But at the same time, it means that every small change requires a pip install in order to be tested.
One way around this is to include a .env file in the template with the contents:
PYTHONPATH=src
This .env file will be picked up by VSCode (at least) and presumably other editors as well. It means that tests run by executing the code in the src directory instead of the site-packages directory when using the editor for testing.
Using the terminal still requires that the code be pip installed for the tests to work.
Currently the CI tests only cover Python 3.10. It should be easy enough to expand the test matrix to include Python 3.8-3.10.
GitHub documentation about test matrix: https://docs.github.com/en/actions/using-jobs/using-a-matrix-for-your-jobs#example-using-a-multi-dimension-matrix
An example of using multiple different versions of Python for testing: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python#using-the-python-starter-workflow
Likely this is the diff that will be necessary, but need to test to be sure:
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index e17de1c..0dd5d30 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -13,6 +13,7 @@ jobs:
     strategy:
       fail-fast: true
       matrix:
+        python-version: ['3.8', '3.9', '3.10']
         include:
           - name: Base example
@@ -29,7 +30,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v2
         with:
-          python-version: '3.10'
+          python-version: ${{ matrix.python-version }}
       - name: Install Python dependencies
         run: |
           sudo apt-get update
Users could become apprehensive about using the template if they have to answer a bunch of questions and they don't feel like they know the consequences of the answers, or what they will be used for. i.e. "Am I locking myself into something that I can't undo by answering black vs. pylint vs. none?" "Is it important that I get the package name just right the first time?"
We should provide some additional, preferably structured documentation (in readthedocs) for each of the questions to let the users know things like:
Those are a lot of points, and they may not all be necessary. We also don't want to overwhelm the user with documentation, so conciseness would be good too 🤷
Can we implement this without needing an API key? If we do need one, should we just leave instructions for the users?
This is a great example to work from: https://github.com/JamesALeedham/Sphinx-Autosummary-Recursion#integrating-jupyter-notebooks-with-sphinx
It would be nice to have an example output from using the template as a parallel repo.
It would be really nice if someone who isn't related to the project were to do it. Just as a way to make sure that the project template works as expected. Look out for missing documentation, provide feedback, etc.
Currently the testing is done manually, we should find a way to automate it.
Just to have a simple test that actually runs.
A validator was added in PR#46 that disallows hyphens in project names or package names. This check makes sense for Python module names, but hyphens are common in project names (pytest-cov, pre-commit, and hipscat-import, for a couple of examples).
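A sketch of a validator that distinguishes the two cases — hyphens allowed in the distribution/project name, but the importable package name restricted to a valid identifier. This is illustrative, not the actual code from PR#46:

```python
import re

# PEP 508-style distribution names may contain hyphens, dots, and
# underscores; importable module names must be valid Python identifiers.
PROJECT_NAME_RE = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$")

def valid_project_name(name: str) -> bool:
    """Accept names like 'pytest-cov' or 'hipscat-import'."""
    return bool(PROJECT_NAME_RE.match(name))

def valid_package_name(name: str) -> bool:
    """Accept only names that can appear in an `import` statement."""
    return name.isidentifier() and not name.startswith("_")
```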
Want to introduce a pre-commit hook to run cd docs; make html and confirm that it succeeds. Nothing else really beyond that. The documentation will actually be built by ReadTheDocs; we just need to make sure that it can be built.
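pre-commit's documented `repo: local` mechanism should support this; a sketch (hook id and name are made up):

```yaml
# Illustrative local hook that just verifies the docs build
repos:
  - repo: local
    hooks:
      - id: sphinx-build-check
        name: Check that Sphinx docs build
        entry: bash -c 'cd docs && make html'
        language: system
        pass_filenames: false
```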
It would be really nice to have some way to track who is using this template.
Tracking the number of forks of the repo is one way, but most users won't use the template in that way.
One alternative is to include a page in the Sphinx-rendered documentation. But again, not every user will publish a page on read the docs, and even if they did, it would be difficult to search/track.
Another alternative is to include something in the main README, or in a separate README, outside the root folder.
Encouraging the use of a GitHub tag might be good too, but unreliable.
Wait until we get some user feedback/discuss with the team before piling on more linting tools.
For reference:
Expected behavior is that we should be able to create a new tag or release in GitHub, and that setuptools_scm would automatically populate a _version.py
file with version information.
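The corresponding pyproject.toml configuration would be roughly the following; the package path is a placeholder:

```toml
# Illustrative setuptools_scm configuration (package path hypothetical)
[build-system]
requires = ["setuptools>=62", "setuptools_scm"]
build-backend = "setuptools.build_meta"

[tool.setuptools_scm]
write_to = "src/example_package/_version.py"
```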
The pre-commit hooks are great, but it might be a good idea to have a check that refuses a pull request with style errors if a user has the hooks set up incorrectly.
The shorter name is easy to miss. What is the advantage of the short versus long name?
The instructions here (https://git-lfs.com/) make it look like this would be a pretty straightforward operation to perform when the template is hydrated. We could add the git lfs install step as one of the "tasks" in copier.yml, and include a .gitattributes file in the template that covers some common file types.
It would also be wise to add some documentation in the readme about what it is, why it's useful, and pointing to the git-lfs documentation.
Extra bonus would be to make this an optional feature that would be included by default. But I kind of like just making it available out of the box.
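A starter .gitattributes might look like this; the file types listed are guesses at what our scientific projects would want tracked in LFS:

```text
# Illustrative .gitattributes tracking some common binary types with git-lfs
*.fits    filter=lfs diff=lfs merge=lfs -text
*.hdf5    filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.png     filter=lfs diff=lfs merge=lfs -text
```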
We should include information for citing software in publications.
I don't know what this looks like, but once we figure it out, it would be good to include in the template.
Line width is all over the place.
Could we, maybe as part of a smoke test, determine if there's a new version of the template a project was based on, and spit out a warning?
Why? I'd want to keep projects from getting stale and getting far behind what we've determined are best practices.
Alternatively, can we generate a list of projects that have used the template and which version they're on, so that we can run copier update on them after updating the template version?
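Copier records the template version in the hydrated project's .copier-answers.yml (as `_commit`), so the check could compare that tag against the template's latest release. A sketch of just the comparison; fetching the latest tag (e.g. via `git ls-remote --tags`) is omitted, and the function names are made up:

```python
# Compare the template tag recorded by copier against the latest release tag.
def parse_tag(tag: str) -> tuple:
    """Turn a tag like 'v1.2.0' into a comparable tuple (1, 2, 0)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))

def template_outdated(answers_commit: str, latest_tag: str) -> bool:
    """True if the hydrated project's template version is behind the latest."""
    return parse_tag(answers_commit) < parse_tag(latest_tag)
```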
This will make it more clear if a failed build is due to linting problems or failed packaging or tests.
MIT, GNU, ...?
Verify that this is added to the lincc-frameworks project automatically.
Melissa ported the installation and usage instructions to readthedocs here: https://lincc-ppt.readthedocs.io/en/latest/
This makes most of the main README file duplicative and prone to getting out of date. We should significantly reduce what is in the README file and direct users to the readthedocs page for all the detailed instructions.
For the most part this will only affect people bringing the template into an existing project. But it would be nice to give people the option from the beginning.
I think that the most technically challenging part would be the logical switch between black vs. pylint vs. any other linter.
Capturing some findings from freaky fixit + discussion w/ Drew:
pipx ensurepath was needed to get copier working. It popped up as a warning when running, but may be worth an explicit warning in the docs for naive pipx users.