ufs-community / uwtools
Workflow tools for use with applications with UFS and beyond
License: GNU Lesser General Public License v2.1
Problem:
In docs/Contributors_Guide/github_workflow.rst
, it says "The first command pulls changes from the original repository... that you see when you run git remote -v and that you set to upstream in step 4 above)." However, there is no step 4, and the docs do not indicate how to set upstream remote. A comment was added to the file indicating this problem.
Proposed Solution:
Add an explanation for how to set upstream remote (possibly borrow from METplus docs). Delete comment.
Description
As a model user, I want help in providing the necessary config object to generate a field table file
(https://jira-epic.woc.noaa.gov/browse/UW-197)
Requirements
FieldTableConfig (#152 ) must be updated to accept a help flag that returns the necessary table fields
Acceptance Criteria (Definition of Done)
Given: a user wants to see what fields are necessary for a field table
When: the user provides a flag
Then: Help is printed to the screen to give guidance on the fields necessary for the field table
Additional context
#155 Needs to be completed first
This test shouldn't be passing.
The number here should be 144, based on the props
input.
Originally posted by @aerorahul in #45 (comment)
As the scheduling tool, I want a simple method for outputting formatted job card content to a file
After Issue #53 is completed, update the Github Action to post an HTML report to Github Pages.
Keep only the most recent report.
This should only be triggered when pushing/merging to development.
pytest-cov --cov-report html
is the base command to use to generate the report.
Refer to docs in #53 for more information.
Description
In the current logger settings, both the console and file streams assume the same level and formats as the main logger:
https://github.com/ufs-community/workflow-tools/blob/e7c0cecc238a632d56097989b382b3845c59ccb1/src/uwtools/logger.py#L128
However, in several situations, we would like these to be different. Relatedly, the fix for issue #121 (#125) has to make a hardcoded change to maintain consistency with previous tests, removing the timestamps from the console stream.
Requirements
An appropriate update would be to add arguments to the logger that allow a user to modify the level and format of individual streams without having to manually override with logging commands.
Describe alternatives you've considered
The current alternative, part of #125, hardcodes the change, but a configurable option is preferable
Acceptance Criteria (Definition of Done)
Add arguments for the user to pass that allow for control of the stream level and format.
Description
As part of UW-229, add a method to the config base class to traverse the dictionary (self.data) and return the depth as an integer. Must increment each time a new dictionary is found as a value and only increment when the structure goes any deeper than the current level.
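A minimal sketch of such a depth method, shown here as a standalone function and assuming the data is a plain nested dictionary (the real method would operate on self.data):

```python
def dict_depth(data):
    """Return the nesting depth of a dict: {} -> 1, {"a": {"b": 1}} -> 2.

    Depth only increments when a value is itself a dictionary, i.e. when
    the structure goes deeper than the current level.
    """
    if not isinstance(data, dict):
        return 0
    return 1 + max((dict_depth(v) for v in data.values()), default=0)

print(dict_depth({"a": {"b": {"c": 1}}, "d": 2}))  # 3
```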
Description
As we implement args more broadly across the toolset, it would be better practice to declare some args as mutually exclusive at the argument-parsing stage.
Requirements
Creating a mutually exclusive group during argument parsing, such as
group = parser.add_mutually_exclusive_group()
allows us to throw a simple error at the start, preventing bad behavior. Argparse also shows the exclusivity on -h invocation, which makes it nicer for the user.
Acceptance Criteria (Definition of Done)
Implement mutual exclusion for all argparse implementations
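For illustration, a minimal mutually exclusive group (the --dry-run/--outfile flag names here are hypothetical, not the tools' actual arguments):

```python
import argparse

parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group()
# Hypothetical flags: the real tools define their own arguments.
group.add_argument("--dry-run", action="store_true", help="print instead of writing")
group.add_argument("--outfile", help="path for rendered output")

args = parser.parse_args(["--dry-run"])
print(args.dry_run)  # True

# Passing both flags exits immediately with a usage error, e.g.:
# error: argument --outfile: not allowed with argument --dry-run
```

Argparse also lists the two flags as `[--dry-run | --outfile]` in the -h usage line, documenting the exclusivity for free.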
Create a tool to convert string-based memory values from one measurement to another.
Example - '100MB' to KB = '100000KB'
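A sketch of such a converter, assuming decimal units (1 MB = 1000 KB) to match the '100MB' to '100000KB' example; the function name and unit table are illustrative, not an existing API:

```python
import re

# Assumed decimal unit factors, expressed in KB.
UNITS_IN_KB = {"KB": 1, "MB": 1000, "GB": 1000**2, "TB": 1000**3}

def convert_memory(value, to_unit):
    """Convert a string like '100MB' to another unit, e.g. KB -> '100000KB'."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([A-Za-z]+)", value.strip())
    if not match:
        raise ValueError(f"unrecognized memory value: {value!r}")
    number, unit = float(match.group(1)), match.group(2).upper()
    kb = number * UNITS_IN_KB[unit]
    return f"{kb / UNITS_IN_KB[to_unit.upper()]:g}{to_unit.upper()}"

print(convert_memory("100MB", "KB"))  # 100000KB
```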
Description
Update user guide
https://jira-epic.woc.noaa.gov/browse/UW-234
Requirements
The RtD Users' Guide must be updated to reflect the usage of the two new tools.
There is no need to duplicate the CLI documentation, but it should include information on how to clone, install, and run the standalone tools.
A couple of examples (lean on the test cases) on popular functions should be included, as well.
Acceptance Criteria (Definition of Done)
A new user can come to the guide on RtD, clone, install, and run the standalone tools.
Must be reviewed by Gillian.
In LSF, the memory needs to be represented in terms of kilobytes.
Example
The user will request memory: 1MB, but the scheduler directive would be #BSUB -R rusage[mem=1000]
Description
https://jira-epic.woc.noaa.gov/browse/UW-209
Requirements
Test the assumption that the polymorphism built into the Config subclasses for YAML, namelist, and INI files is sufficient to transform to another subclass. May require making the input file path optional in the __init__ methods of each.
Acceptance Criteria (Definition of Done)
One additional test in test_config.py must test all transform combinations and complete successfully
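The full set of combinations can be enumerated mechanically; in the real suite, each (source, destination) pair would feed a parametrized test (e.g. pytest.mark.parametrize) that loads a config in the source format, transforms it to the destination format, and compares the underlying dictionaries. A minimal sketch of the enumeration:

```python
import itertools

# The three supported formats under test.
FORMATS = ("yaml", "ini", "f90nml")

# Every ordered pair, including same-to-same: 3 x 3 = 9 combinations.
combinations = list(itertools.product(FORMATS, repeat=2))

print(len(combinations))  # 9
```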
Now that more of the logic is apparent, refactor out some of the cruft to make this easier to use.
Description
As a user, I would like to know which template variables will be filled in, which will be left as jinja2 templates, and a full list of all that are needed (https://jira-epic.woc.noaa.gov/browse/UW-202)
Requirements
Add a "values_needed" type command line argument for the config parser that provides a list of template variables referenced, variables that can be filled in with their values, and those that will remain in templates upon completion. Maybe this is just extra output when run in debug mode? Formatting can be left to the developer, but make sure it's clear and readable.
Acceptance Criteria (Definition of Done)
GIVEN: A user would like additional information from the templater
WHEN: In debug mode (dry-run or not)
THEN: The templater will print to screen/log the values required by the template and the values supplied by any provided config object.
WHEN: There is no value provided
THEN: The output will clearly state that -- we must differentiate between no value provided and None/Null values.
Additional context
How much additional work would it be to have the output contain the line number of the template?
CRH thinks it would require an extra step of cycling through each of the undefined variables and searching it in the text-based template.
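For a rough sketch of the "values_needed" report, the referenced names can be collected and bucketed against the config values; this hypothetical illustration uses a regex for brevity, though for real Jinja2 templates jinja2.meta.find_undeclared_variables is the robust way to collect referenced names:

```python
import re

# Hypothetical template and config values for illustration only.
template_source = "Hello {{ user }}, run date is {{ cycle }} at {{ hour }}Z."
values = {"user": "uwtools", "cycle": None}

# Collect names referenced as {{ name }}.
needed = set(re.findall(r"\{\{\s*(\w+)\s*\}\}", template_source))

# Differentiate: filled in, explicitly None/Null, and not provided at all.
provided = {name for name in needed if values.get(name) is not None}
set_to_none = {name for name in needed if name in values and values[name] is None}
missing = needed - values.keys()

print("referenced:", sorted(needed))     # ['cycle', 'hour', 'user']
print("filled in:", sorted(provided))    # ['user']
print("None/Null:", sorted(set_to_none)) # ['cycle']
print("no value:", sorted(missing))      # ['hour']
```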
Additional logging inheritance is added in #188 and will need to be tested in other tools such as config and set_config
Description
We need to ensure that there are no additional issues in other tools from the added log inheritance
Requirements
The added tests need to validate the updates made in #188
Acceptance Criteria (Definition of Done)
Tests need to be added to test_config and test_set_config that check the logging level with the updated inheritance and validate the passed log level and file
Expected behavior
Pylint should provide feedback on the standards as they should be applied in the test framework code.
Current behavior
Pylint ignores test code in several cases.
Detailed Description
The pylint blanket ignore was implemented due to the numerous errors that arise from setting up a proper test environment with stubs. After further consideration, the team would like to resolve these types of linting offenders by listing them explicitly, allowing most of the PEP8 standards to be applied to the entire code base.
Possible Implementation
Depending on the scope of various linting rules, the ignore rules can be provided for the entire test module via a configure file for pylint, or they can be provided in the code itself to apply to the whole file, method, or line.
Description
Several references hard-code forward slashes in paths:
https://github.com/ufs-community/workflow-tools/blob/e7c0cecc238a632d56097989b382b3845c59ccb1/tests/test_templater.py#L127-L129
However, several kinds of calls do not autocorrect this on Windows platforms. It would be cleaner to make the paths platform-independent.
Requirements
The simplest edit is to use the existing os.path.join() behavior:
input_file = os.path.join(uwtools_file_base, "fixtures", "nml.IN")
Describe alternatives you've considered
As no tier-1 or major platforms use Windows, this is a very minor feature.
Acceptance Criteria (Definition of Done)
Paths automatically resolve correctly on Windows platforms
Description
Review the current linter rules to ensure there are no style gaps, especially around whitespace before an opening parenthesis, e.g. print ('hello world') rather than the correct print('hello world').
Description
https://jira-epic.woc.noaa.gov/browse/UW-210
Requirements
Add a set of tests that compare two config objects for every ordered combination of two file types drawn from f90nml, INI, and YAML (9 tests), with user output that shows the differences. Additionally, comparison should only be supported for files with the same structure (depth of sections, etc.)
Describe alternatives you've considered
May be partly completed by #129
Acceptance Criteria (Definition of Done)
All combinations must be tested with similar structures and must correctly fail on differing structures
Additional context
An additional option may be to allow user comparison of any two optional files as a tool
Description
As a model user, I want to configure a field table file for the model with a YAML config object.
https://jira-epic.woc.noaa.gov/browse/UW-16
Scheduled for UW Sprint 7.2 ending 17 February 2023
The model requires a diag table file at run time. Its content maps mostly to a dictionary, but it doesn't follow a standard convention; instead it uses syntax like that described here. Include a class that will write this format from a Configuration object. There is no need to parse this content at this time.
Acceptance Criteria (Definition of Done)
GIVEN: A configuration object with exactly the correct entries.
WHEN: The user runs set_config.py with a field table output file (or syntax).
THEN: The output is formulated with syntax expected by a field table.
GIVEN: A configuration object with more entries than needed for the field table
WHEN: The user runs set_config.py with a field table output file (or dry run).
THEN: The tool knows which key/value pairs to translate to field table syntax for the final product.
GIVEN: A configuration object does not contain enough information needed for a field table.
WHEN: The user runs set_config.py with a field table output file (or dry-run).
THEN: The tool identifies which required key/value pairs are missing and gives guidance to the user before creating the final product.
GIVEN: A user wants to see what fields are necessary for a field table
WHEN: The user provides a flag (maybe named show_me)
THEN: Help is printed to the screen to give guidance on the fields necessary for the field table.
Expected behavior
atparse_to_jinja2 tool should replace all @[] on a line with {{}}
Current behavior
Currently it only checks for and replaces the first instance of @[] on a line.
Possible Implementation
after_atparse needs to be checked for additional @[ and logic repeated until no more are detected.
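As an alternative to repeating the existing after_atparse logic in a loop, a single regex substitution handles every occurrence on the line (the function name here is illustrative, not the tool's actual API):

```python
import re

def atparse_line_to_jinja2(line):
    """Replace every @[name] occurrence on a line with {{ name }}."""
    return re.sub(r"@\[([^\]]*)\]", r"{{ \1 }}", line)

print(atparse_line_to_jinja2("path=@[ROOT]/@[CASE]/out"))
# path={{ ROOT }}/{{ CASE }}/out
```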
Description
The following text header was likely intended to be '\n':
https://github.com/ufs-community/workflow-tools/blob/e7c0cecc238a632d56097989b382b3845c59ccb1/tests/test_templater.py#L53
It would look cleaner that way, but would require editing the reference files as well.
Requirements
Update all header text from this test and reference to use the new line
Acceptance Criteria (Definition of Done)
Proper new line is performed and passes the tests
Description
We settled on documenting the code in-line with PEP-8 docstring conventions. Those are in large part missing, and this was uncovered during a recent fix to the linting workflows.
Acceptance Criteria (Definition of Done)
All pylint disable strings for missing docstrings are removed at the module, method, and class levels.
On the PR for the updated users guide documentation, one of the code checks under GitHub code scanning, CodeQL, was neutral (did not pass or fail), and GitHub generated the following message:
Warning: Code scanning cannot determine the alerts introduced by this pull request, because 1 configuration present on refs/heads/develop was not found:
Actions workflow (codeql-analysis.yml)
.github/workflows/codeql-analysis.yml:analyze/language:python
No new alerts
When the branch was actually merged into develop, however, CodeQL ran and all tests passed.
Expected behavior
CodeQL check on pull request should pass (expected in this case since only documentation was altered) or fail, no configurations should be missing since that file was not altered.
Current behavior
CodeQL check was 'neutral', stated 1 configuration was missing : .github/workflows/codeql-analysis.yml:analyze/language:python
However, when the branch was merged into develop, all tests passed, including the CodeQL check
To Reproduce
Details on behavior can be found here
Context
The only changes here were to the user's guide/documentation, all .rst files located in the docs directory; there was no change to any code and no change to .github/workflows/codeql-analysis.yml. Possibly this behavior was caused by documentation builds/updates.
Description
As currently configured, logger.py has tools for updating a logger object. However, when accessed externally it will currently overwrite the object with default parameters unless otherwise specified.
Requirements
As mentioned in #188, a check will need to be added to Logger "if _name in logging.root.manager.loggerDict" to bypass the default level and handler settings.
Acceptance Criteria (Definition of Done)
Given: the user provides a logger object name to the Logger tool
When: the named logger object is already instantiated
Then: the logger tool will only read the named logger object
Description
Several places use a local-but-fixed output path, such as logfile = os.path.join(os.path.dirname(__file__), "templater.log"). This will continue to append to the log on subsequent runs, as well as on simultaneous runs from the same install. This should be changed to check for the existence of the file as well as to vary the path, probably with a PID or timestamp.
Requirements
Each generated log from a tool should be unique. Additionally, each instance of a test log should be removed at the end of a test.
Acceptance Criteria (Definition of Done)
Given: the user runs a UW tool
When: a log file is created
Then: the logger tool will ensure the file path is unique via the file handler
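One minimal sketch of such a unique path, combining a timestamp with the process ID so that both sequential and simultaneous runs get distinct files (the function name is illustrative, not an existing API):

```python
import os
import time

def unique_log_path(directory, tool_name):
    """Build a log path unlikely to collide across sequential or
    simultaneous runs, using a timestamp plus the process ID."""
    stamp = time.strftime("%Y%m%dT%H%M%S")
    return os.path.join(directory, f"{tool_name}_{stamp}_{os.getpid()}.log")

print(unique_log_path("/tmp", "templater"))
```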
The LSF scheduler is a bit picky. Specifically:
Walltime must be HH:MM instead of HH:MM:SS. The input can be HH:MM:SS, but the output needs to be HH:MM.
Memory must be in KB with no units. E.g. #BSUB -R rusage[mem=1000] is requesting 1MB of memory. The input can be in any units, but the output needs to be in KB. @aerorahul will double check the correct units for this request.
Expected behavior
Correct syntax should be used when creating the job-card.
Current behavior
The user needs to know how to provide the request. Failing which, the job is rejected by the scheduler.
Machines affected
Machines running IBM's LSF scheduler.
Description
Based on testing for #129, the current parsers are returning their own internal types. This works just fine individually, but causes issues if using config.py to convert one type to another.
Requirements
Updates need to be made to all config.py concrete methods in _load and dump_file to ensure the intermediate output/input is a consistent dictionary type.
Describe alternatives you've considered
Workarounds can be added in tests and several options for the type correction are available. However, the better behavior is to allow for broad user options without additional steps.
Acceptance Criteria (Definition of Done)
Existing tests pass, while conversion tests added in #129 now pass when converted to assert
Additional context
Potential issue was initially identified when looking at F90 namelist parsing and noting that an OrderedDict() type was being returned and causing some issues in assertions. Further testing shows that there are some concerns regarding consistency of the PyYAML SafeLoader in creating dict types. The simple.yaml file intermediate type returns "base = 'kale' fruit = 'banana' vegetable = 'tomato' how_many = 12 dressing = 'balsamic' /" from _load. Additionally, the configparser() appears to require an additional step to read/write dictionaries.
Description
As a user, I need a user interface for handling config files.
https://jira-epic.woc.noaa.gov/browse/UW-203
Scheduled for UW Sprint 7.2 ending 17 February 2023
The config parsing tool (probably set_config.py) needs a command line interface for user interaction. The end purpose of the tool is to update key/value pairs, concatenate sections or key/value pairs, compare two fully formed config files, and transform an input base value to a different output configure type.
Requirements
It should provide all the standard options as outlined in the Toolbox MVP and should aim at achieving the following behaviors:
-i to accept an input "base" config file
-o to write a rendered output file
-c to accept a user config file
These files will need some way of knowing what type of file we give it. As a first pass, I think that determining type from the suffix is sufficient. We'll likely want a better solution at a future date.
Assume the output file type is the same as the input file type in this ticket.
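The suffix-based first pass could look like the following sketch; the suffix-to-format mapping is an assumption for illustration, not the tool's actual table:

```python
from pathlib import Path

# Hypothetical suffix map for first-pass, suffix-based detection.
SUFFIX_TO_FORMAT = {".yaml": "yaml", ".yml": "yaml", ".ini": "ini", ".nml": "f90nml"}

def detect_format(path):
    """Guess the config format from the file suffix."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUFFIX_TO_FORMAT:
        raise ValueError(f"cannot determine config format for {path!r}")
    return SUFFIX_TO_FORMAT[suffix]

print(detect_format("base.nml"))  # f90nml
```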
Expected behavior
The links in the documentation should work, but some of them don't. I first noticed this when Rahul was presenting at the Sprint review on Monday 3/7.
Current behavior
Some links currently point to rst files, which, as expected, do not display correctly; links to rst files won't work.
Machines affected
All browsers. This is evident by clicking the links in ReadTheDocs.
To Reproduce
For example, the links on this page (and potentially others):
https://unified-workflow.readthedocs.io/en/latest/Contributors_Guide/introduction.html
Context
The links in the documentation should work.
Detailed Description
Reformat the links as the Sphinx documentation describes:
https://docs.readthedocs.io/en/stable/guides/cross-referencing-with-sphinx.html
Additional Information
No additional information
Possible Implementation
One idea to prevent this from happening again is to add an item to the Pull Request template to ensure all links in the documentation work before approving the PR, or to add an item to the Issue template to ensure all links work before submitting a PR (or both).
Edit the documentation for submission with ECC-334.
Description
Update the pull request template to include co-author information
pytest is failing on test_scheduler with the following:
(uwtools) (base) login1.stampede2(1060)$ pytest
========================================================== test session starts ===========================================================
platform linux -- Python 3.7.9, pytest-7.1.1, pluggy-1.0.0
rootdir: /home1/02441/bcash/workflow-tools
collected 4 items
tests/test_scheduler.py .F.. [100%]
================================================================ FAILURES ================================================================
_____________________________________________________________ test_scheduler _____________________________________________________________
def test_scheduler():
    expected = """#SBATCH scheduler=slurm
#SBATCH --job-name=abcd
#SBATCH extra_stuff=12345"""
    props = {"scheduler": "slurm", "job_name": "abcd", "extra_stuff": "12345"}
    js = JobScheduler.get_scheduler(props)
    actual = js.job_card.content()
    assert actual == expected
E AssertionError: assert '#SBATCH sche...ra_stuff12345' == '#SBATCH sche...a_stuff=12345'
E - #SBATCH scheduler=slurm
E ? -
E + #SBATCH schedulerslurm
E #SBATCH --job-name=abcd
E - #SBATCH extra_stuff=12345
E ? -
E + #SBATCH extra_stuff12345
tests/test_scheduler.py:23: AssertionError
======================================================== short test summary info =========================================================
FAILED tests/test_scheduler.py::test_scheduler - AssertionError: assert '#SBATCH sche...ra_stuff12345' == '#SBATCH sche...a_stuff=12345'
====================================================== 1 failed, 3 passed in 0.08s =======================================================
Is this related to the discussion about the user needing to know when to supply '='?
something is not right here either.
The -n value is nodes * tasks_per_node. The tasks_per_node is the ptile line.
Originally posted by @aerorahul in #45 (comment)
Error in file xyz.py
etc.....
Description
Add additional CLI args that make the config parser more user friendly. Please reference the Toolbox MVP for specifics.
The "conversion" flag (please choose an appropriate flag name) is meant to allow a user the flexibility to provide, say, a yaml base config and write out an f90 namelist.
Each of the base, config, and output file types could technically be different file types (Jeez, I hope that folks don't ACTUALLY do THAT regularly). The most common use case might be that we have a base file in f90nml format, the user config in YAML, and the output needs to be either yaml or f90nml.
Tests already exist for these conversions in test_config:: test_transform_config. Reference those tests for how this is implemented.
Acceptance Criteria (Definition of Done)
GIVEN: A user wants to provide different types of files for input and output
WHEN: The file is provided via the existing flags and an output type is also provided
THEN: A conversion to the final config type is performed before writing the file (or printing to log/stdout in dry-run mode)
Implement loading configuration object from .yaml files.
Interface Example:
import sys

import yaml

input_file = sys.argv[1]  # incoming yaml file
with open(input_file, "r") as fh:
    try:
        cfg = yaml.load(fh, Loader=yaml.FullLoader)
    except yaml.YAMLError as exc:
        print(exc)
Usage Example:
https://github.com/ufs-community/workflow-tools/blob/feature/scheduler/scheduler_tests_demo/test_scheduler.py
Expected behavior
The GHA test for pylint should catch items like "unused imports", and it doesn't seem to be doing that.
Current behavior
On PR #19, I couldn't find a reference to ast.List
in src/uwtools/generic_scheduler/scheduler.py
after looking into how it was being used. I checked whether all the tests passed, mainly the linter, which I expected to fail, and found that pylint reported a 10/10. I expected it to fail with a score < 10.
Machines affected
GitHub Actions
Requirements:
Complete using Github actions. Model off of the tests and lint GitHub actions.
Refer to pytest-cov documentation for more reference.
def test_lsf4():
    expected = """#BSUB -P account_name
#BSUB -q batch
#BSUB -W 00:01
#BSUB -n 12
#BSUB -R affinity[core(3)]
#BSUB -R span[ptile=6]"""
    props = {
        "scheduler": "lsf",
        "account": "account_name",
        "queue": "batch",
        "walltime": "00:01:00",
        "nodes": 2,
        "tasks_per_node": 6,
        "threads": 3,
    }
    js = JobScheduler.get_scheduler(props)
    actual = js.job_card.content()
    assert actual == expected

def test_lsf5():
    expected = """#BSUB -P account_name
#BSUB -q batch
#BSUB -W 00:01
#BSUB -n 12
#BSUB -R affinity[core(3)]
#BSUB -R span[ptile=6]
#BSUB -R rusage[mem=1000]"""
    props = {
        "scheduler": "lsf",
        "account": "account_name",
        "queue": "batch",
        "walltime": "00:01:00",
        "nodes": 2,
        "tasks_per_node": 6,
        "threads": 3,
        "memory": "1MB",
    }
    js = JobScheduler.get_scheduler(props)
    actual = js.job_card.content()
    assert actual == expected
Expected behavior
We should be checking the return status of the "make html" command in build_documentation.sh. If "make clean html" returns a bad status, the GHA runs should also fail.
Current behavior
Currently, the GHA jobs still succeed since we fail to return a bad status from that script.
Machines affected
Only GitHub Actions
To Reproduce
No need to reproduce.
Context
No additional context is needed.
Detailed Description
Described above under expected behavior.
Additional Information
No additional information needed.
Possible Implementation
The changes will look something like:
make clean html
if [ $? != 0 ]; then
echo "ERROR: make clean html failed"
exit 1
fi
cd -
See example here.
consider adding the following:
echo ERROR: Warnings/Errors found in documentation
echo Summary:
grep WARNING ${DOCS_DIR}/_build/warnings.log
grep ERROR ${DOCS_DIR}/_build/warnings.log
grep CRITICAL ${DOCS_DIR}/_build/warnings.log
echo Review this log file or download documentation_warnings.log artifact
and
echo INFO: Documentation was built successfully.
See example here.
Description
Refactor the existing YAML Parser to use Jinja2-based extensions.
https://jira-epic.woc.noaa.gov/browse/UW-188
Scheduled for UW Sprint 7.1 ending 03 February 2023
The existing YAML Parser leverages JCSDA Solo tools template.py, nice_dict.py, and yaml_file.py. The refactor should take into account the specifications outlined in the Technical Design Document contributed as the result of UW-187, with the following guidelines:
The need for template.py will be obviated by the use of Jinja2 as the templating language. Please remove it as part of the refactor.
The contents of nice_dict.py provide a capability to use the nice namespace "dot" notation for dictionaries (a feature we'd like to retain), but in large part duplicate Python-native capabilities (included in the standard library). The preference is to use the Python standard implementations when possible. Please remove it as part of the refactor.
The implementation of yaml_file.py contributes methods that are generally unneeded, or that could be abstracted into a more generic configuration base class. Please remove it as part of the refactor.
Requirements
In general the YAML config parser should enable the following tasks:
Read in a YAML file as a dictionary using pyyaml.
Write a YAML file from a dictionary using pyyaml.
Fill in references to other YAML key/value pairs.
Fill in references to environment variables.
Refrain from filling in values that are not defined.
Recursively reference key/value pairs in "deep" YAML sections.
The algorithm that does this has been developed and included in SRW in https://github.com/ufs-community/ufs-srweather-app/blob/develop/ush/python_utils/config_parser.py under the extend_yaml function. At this point, that function should be included in a method under the Config class in config.py, tests written, and contributed to workflow-tools.
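A rough sketch of the rendering idea for a flat dictionary, using Jinja2's DebugUndefined so that references without a defined value are left intact (per the "refrain from filling in" requirement); the function name, key names, and flat-dict restriction are assumptions of this sketch, not the extend_yaml implementation:

```python
import os

import jinja2

def render_values(cfg, max_passes=10):
    """Fill {{ name }} references in string values from the config itself
    plus environment variables; undefined references are left as-is."""
    env = jinja2.Environment(undefined=jinja2.DebugUndefined)
    context = {**os.environ, **cfg}
    for _ in range(max_passes):
        changed = False
        for key, val in cfg.items():
            if isinstance(val, str) and "{{" in val:
                new = env.from_string(val).render(context)
                if new != val:
                    cfg[key] = context[key] = new
                    changed = True
        if not changed:  # fixed point reached
            break
    return cfg

cfg = {"root": "/data", "run_dir": "{{ root }}/run", "leftover": "{{ not_defined }}"}
render_values(cfg)
print(cfg["run_dir"])  # /data/run
```

The multi-pass loop handles chained references (a value that renders into another reference); the real algorithm must also recurse into nested sections, which this flat-dict sketch omits.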
Acceptance Criteria (Definition of Done)
The full scope of the tool should be able to:
Note: These AC are subject to change with backlog grooming with the completion of UW-187
Description
Modify the README.md file to point to the ReadTheDocs Unified Workflow link instead of here.
Requirements
Follow formatting rules
Describe alternatives you've considered
No alternative necessary
Acceptance Criteria (Definition of Done)
Update the link and PR approval
Additional context
N/A
Description
Add dry-run flag to config_parser.
Requirements
Add additional CLI args that make the config parser more user friendly. Please reference the Toolbox MVP for specifics.
The --dry-run feature shows the rendered final text, but doesn't write a file
Acceptance Criteria (Definition of Done)
GIVEN: The user wants to know what the final output will look like from the config parser, but doesn't want to create a file
WHEN: A --dry-run flag is provided via CLI
THEN: The rendered final content is printed to stdout/log.
Description
As a user, I'd like to see uniform logging for the ins/outs of a whole variety of functions/methods
(https://jira-epic.woc.noaa.gov/browse/UW-208)
Requirements
Logger.py should be updated with a decorator function to be called with debug to provide additional information
Acceptance Criteria (Definition of Done)
Given: the user provides a debug flag to any logger-enabled tool
When: a logger decorator function has been applied to a called method
Then: information about the caller, local variables, and expected output will be printed to the screen and log file
Additional context
This ticket is more about being able to wrap calls to any existing methods
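A minimal sketch of such a wrapping decorator, assuming the standard logging module (the decorator name and log format are illustrative, not the Logger.py API):

```python
import functools
import logging

def log_call(func):
    """Log a function's arguments and return value at DEBUG level."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger = logging.getLogger(func.__module__)
        logger.debug("calling %s(args=%r, kwargs=%r)", func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        logger.debug("%s returned %r", func.__name__, result)
        return result
    return wrapper

@log_call
def add(a, b):
    return a + b

logging.basicConfig(level=logging.DEBUG)
print(add(2, 3))  # 5, with the call and return logged at DEBUG
```

Because it wraps at call time, the decorator can be applied to any existing method without changing its body, matching the intent of this ticket.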