
gawd's People

Contributors

alexandredecan, dependabot[bot], drupol, pooya-rostami, step-security-bot, tommens


gawd's Issues

apply a "changeset" on a file

"gawd file1.yml file2.yml" returns a set of changes (diffs) that describes how to go from file1.yml to file2.yml.
So it might also be useful to support "gawd file1.yml changeset", which takes a YAML file and a changeset as input and returns file2.yml, i.e., the result of applying the changeset to the YAML file.

Treat literals as strings, do not convert them to Python datatypes

When reading a workflow file, ruamel.yaml is used to parse the yaml and its default behaviour is applied. The default behaviour means that all literals that have a specific meaning in yaml (such as "true", "false", integers, floats, etc.) are converted to their corresponding Python data types.

The problem is that when we report the changes that we find between two workflow files, we report them as Python objects, implying for example that key: true changed to key: false will be reported as True becoming False (where True and False are Booleans) instead of true becoming false (as strings).

A possibility would be to convert all these specific values to strings when reading them through ruamel. However, (1) it doesn't seem easy to tune ruamel to get this behaviour (it seems we'll have to change the behaviour of a subclass of the Constructor class, but support for specific data types seems to be implemented through class methods that are called explicitly); (2) this also means that our similarity function will be applied exclusively on strings, implying that the difference between (for example) True and False won't be 1 anymore, but the result of difflib.SequenceMatcher's ratio on the two strings. Similarly, comparing 11 and 12 will lead to a similarity of 50% where it is currently 0%.
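The string side of point (2) can be checked directly with the standard library. This is only a minimal sketch: the helper name string_similarity is hypothetical and is not taken from gawd's code; it simply wraps difflib.SequenceMatcher's ratio as mentioned above.

```python
from difflib import SequenceMatcher


def string_similarity(left: str, right: str) -> float:
    """Return difflib's similarity ratio between two strings (0.0 to 1.0)."""
    # Hypothetical helper for illustration; not gawd's actual similarity function.
    return SequenceMatcher(None, left, right).ratio()


# "11" and "12" share one character out of four in total: 2 * 1 / 4
print(string_similarity("11", "12"))       # 0.5

# "true" and "false" only share the final "e": 2 * 1 / 9
print(string_similarity("true", "false"))  # 0.2222222222222222
```

So once literals are kept as strings, two values that are entirely different as integers or Booleans can still come out as partially similar.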

To be discussed...

Report changes in fixed/predictable order

Hello,

Currently, the changes that are reported by gawd are provided in an arbitrary order that mostly depends on the code (i.e., we first address everything except jobs, then jobs for example). At some point, it could be interesting to report the changes in a specific, fixed order. This will notably "solve" 6394d5d where I changed the way we compare the output of gawd to ignore the order in which changes are reported.

I see at least two options:

  • We can sort changes according to their path in lexicographical order. Ties are broken by looking at the kind of change (for example, if the same step is both moved and changed, we consistently report "moved" first, or consistently "changed" first). That way, we ensure an order, but this order is not really relevant to users;
  • We can sort changes according to their path, in order of appearance: since a YAML file is mostly a combination of dicts and lists, and both are ordered, we can rely on this order to report the changes. Based on the current implementation, I would say the easiest would be to write a function sort_paths that takes a list of paths and the workflow file (the Python structure), and returns a sorted list of paths following the order in which these paths can be found in the given workflow file.

The second approach seems more "ergonomic" in the sense that changes are provided in the order in which they can be seen in the file. It remains to decide (1) whether we sort changes according to their path/position in the first workflow file or the second one; (2) and where to insert changes that only affect one file (e.g., let's say we sort changes according to the second workflow file, as in diff, where do we put "removed" changes since their corresponding paths do not exist in this second file?).
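The second option could look roughly like the sketch below. It rests on several assumptions that are not taken from gawd: paths are represented as tuples of keys and list indices, iter_paths is a hypothetical helper, and paths that do not exist in the given workflow are simply placed at the end, which is only one possible answer to question (2).

```python
from typing import Any, Iterator, List, Tuple

Path = Tuple[Any, ...]  # assumed representation: keys and list indices


def iter_paths(node: Any, prefix: Path = ()) -> Iterator[Path]:
    """Yield every path of a nested dict/list structure in order of appearance."""
    yield prefix
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_paths(value, prefix + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_paths(value, prefix + (index,))


def sort_paths(paths: List[Path], workflow: Any) -> List[Path]:
    """Sort paths following the order in which they appear in the workflow."""
    rank = {path: position for position, path in enumerate(iter_paths(workflow))}
    missing = len(rank)  # unknown paths go last, keeping their input order
    return sorted(paths, key=lambda path: rank.get(path, missing))


workflow = {
    "name": "CI",
    "on": "push",
    "jobs": {"build": {"steps": [{"run": "make"}, {"run": "make test"}]}},
}
paths = [("jobs", "build", "steps", 1, "run"), ("on",), ("jobs", "removed"), ("name",)]
print(sort_paths(paths, workflow))
```

Because sorted is stable, changes whose path exists only in the other file keep their relative order at the end of the list.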

apply gawd directly on git commits

It could be useful to run the tool by providing as input just a commit reference in some git repo and a reference to a YAML file in that commit. In that case, the tool would compute the diff for that specific file in that commit, with respect to the previous available version of that file, in order to return all of the changes that were made to the file by that commit.

In the same vein, it would be useful to have the tool do the same thing for all YAML workflow files touched by a given commit. (This would correspond to applying the tool several times, once for each YAML workflow file found in the commit.)
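One way to obtain the two versions to compare is "git show <rev>:<path>". The sketch below is an illustration only: the function names are hypothetical, and "previous available version" is approximated by the file as it exists in the first parent of the commit (<commit>^), which may not be what is wanted for merge commits or for files added by the commit.

```python
import subprocess
from typing import List, Optional, Tuple


def show_args(revision: str, path: str) -> List[str]:
    """Build the git command that prints a file as it exists at a revision."""
    return ["git", "show", f"{revision}:{path}"]


def read_version(repo: str, revision: str, path: str) -> Optional[str]:
    """Return the file content at the revision, or None if git reports an error."""
    result = subprocess.run(
        show_args(revision, path), cwd=repo, capture_output=True, text=True
    )
    return result.stdout if result.returncode == 0 else None


def versions_around(repo: str, commit: str, path: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (content in the first parent, content in the commit itself)."""
    return read_version(repo, f"{commit}^", path), read_version(repo, commit, path)
```

The two returned strings could then be written to temporary files and compared with "gawd file1.yml file2.yml".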

Use "--short" by default, convert it to "--long"

Hello,

Currently, when reporting on a change through the CLI, values are provided in their "long" version. We have an option to make the output shorter, by specifying --short (or -s). I propose that, by default, we truncate the output (i.e., we apply the behaviour of --short) while providing a new option, namely --long or --full-value or..., to indicate that the output shouldn't be truncated. The rationale behind this suggested change is that I expect users to be more interested in the kind of changes that were made to the file (and the path of those changes) than in the values (since, if you know the path, you can easily find the full values in the respective files).
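A rough sketch of the proposed default with argparse follows. Everything here is an assumption made for illustration: the positional argument names, the --long spelling (the issue leaves the final name open), the 30-character width, and the "..." suffix are not taken from gawd's CLI.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI where truncated output is the default."""
    parser = argparse.ArgumentParser(prog="gawd")
    parser.add_argument("first")
    parser.add_argument("second")
    parser.add_argument(
        "--long", action="store_true", help="do not truncate reported values"
    )
    return parser


def render(value: object, full: bool, width: int = 30) -> str:
    """Return the value as text, truncated to `width` characters unless `full`."""
    text = str(value)
    if full or len(text) <= width:
        return text
    return text[: width - 3] + "..."


args = build_parser().parse_args(["file1.yml", "file2.yml"])
print(render("x" * 40, args.long))  # truncated, since --long was not given
```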

Do not use a "." as component separator in paths

At least when accessing gawd programmatically, the components of a path should be returned as a list of strings, not as a single string where components are separated by ".". The rationale is that many workflows use this character as part of a valid job name, which makes it difficult to split such a path back into its components afterwards.
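The ambiguity is easy to reproduce. In the sketch below, the workflow data, the key containing a dot, and the lookup helper are all hypothetical and only serve to illustrate the point; they do not reflect gawd's actual path format.

```python
from typing import Any, List

# Hypothetical workflow structure with a key that contains a "."
workflow = {"jobs": {"build": {"env": {"app.version": "1.0"}}}}

components = ["jobs", "build", "env", "app.version"]
joined = ".".join(components)  # "jobs.build.env.app.version"


def lookup(node: Any, parts: List[str]) -> Any:
    """Follow a list of path components through nested dicts."""
    for part in parts:
        node = node[part]
    return node


print(lookup(workflow, components))  # 1.0
print(joined.split("."))             # ['jobs', 'build', 'env', 'app', 'version']
```

Splitting the joined string yields five components instead of four, so the original path cannot be recovered, whereas the list form stays unambiguous.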
