rulecheck's Issues

Standard rule setting failure handling

Document/support rules failing config (bad settings).
Maybe rules should print their usage/help text on error? Alternatively, they could provide a method that returns that text, and loadRules would print it when a rule's init raises an exception.

Create test module to help rule authors test their rules.

They should be able to specify:
The rule class for rulechecktest to create
A file for input
Maybe support taking in an already created srcml file
Settings dictionary for their rule
Get a list of log calls the rule made
Run standard set of asserts on their rule creation (does it do all the correct things on object creation?)
Provide help text

Output format not automatically parseable by GCC parsers

GCC's output is slightly different in that it adds spaces after some colons:
hello.cpp:4:1: error: ‘returna’ was not declared in this scope

Update the log output to add these spaces so IDEs and other tools that look for errors/warnings in tool output will be more likely to pick up rulecheck's output.
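As a minimal sketch of the target layout, the helper below builds a message in the `file:line:column: severity: message` shape shown in the GCC example above. The function name and its parameters are hypothetical and are not taken from rulecheck's logger.

```python
def format_gcc_style(path, line, column, severity, message):
    """Build a diagnostic line with a space after the position and severity colons."""
    return f"{path}:{line}:{column}: {severity}: {message}"


# Hypothetical violation reported by a rule.
print(format_gcc_style("hello.cpp", 4, 1, "warning", "example message"))
# hello.cpp:4:1: warning: example message
```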

Provide standard way for rules to provide help text

The system should be able to print help text for any rule. A standard way for rules to report help text (or a standard method that is run and may then print text) will be needed, in addition to a command line argument that takes the name of the rule for which to display help information.
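One possible shape for this, sketched under assumptions: each rule class exposes a `get_help` method, and the system looks the rule up by name. The method name, the `registry` dictionary, and `ExampleRule` are all hypothetical and do not come from rulecheck.

```python
class ExampleRule:
    """Hypothetical rule used only to illustrate the help-text hook."""

    @staticmethod
    def get_help():
        return "example.rule: flags lines longer than the 'max_length' setting."


def help_for(rule_name, registry):
    """Return help text for a rule name, or a short message if it is unknown."""
    rule_class = registry.get(rule_name)
    if rule_class is None:
        return f"Unknown rule: {rule_name}"
    return rule_class.get_help()


print(help_for("example.rule", {"example.rule": ExampleRule}))
```

A command line argument would then only need to pass the requested rule name to a lookup like `help_for`.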

Support a per rule werror setting

Implement this so that each rule implementation doesn't have to handle it on its own. Warnings reported by a rule would be promoted to errors automatically by the system. The werror setting would be specified in the config json.
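A rough sketch of central promotion, assuming the rule's settings arrive as a dictionary with a boolean `werror` key. The function name and the severity strings are illustrative assumptions.

```python
def effective_severity(severity, rule_settings):
    """Promote a warning to an error when the rule's 'werror' setting is true."""
    if severity == "warning" and rule_settings.get("werror", False):
        return "error"
    return severity


# The logger, not the rule, would apply the promotion.
print(effective_severity("warning", {"werror": True}))
```

Keeping this in one place means rule authors only ever report warnings and never need to read the setting themselves.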

Additional integration test

Have an integration test parse the output, using an any_other rule that logs every xml element together with a visit_line that logs every line, and show that the reported position always increases.

"Mute after n" option

A "Mute after n" option would print a summary for a rule on a file at the (n+1)th message from that rule, instead of printing every message (to reduce log size).
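The counting involved could look roughly like the sketch below. The class and method names are hypothetical, and the notice text is only a placeholder for whatever summary the real option would print.

```python
from collections import defaultdict


class MutingLog:
    """Collect messages, muting each (rule, file) pair after `mute_after` messages."""

    def __init__(self, mute_after):
        self.mute_after = mute_after
        self.counts = defaultdict(int)  # (rule, path) -> messages seen so far
        self.output = []

    def log(self, rule, path, message):
        key = (rule, path)
        self.counts[key] += 1
        seen = self.counts[key]
        if seen <= self.mute_after:
            self.output.append(f"{path}: {rule}: {message}")
        elif seen == self.mute_after + 1:
            # The (n+1)th message is replaced by a one-time notice.
            self.output.append(f"{path}: {rule}: further messages muted")

    def muted_count(self, rule, path):
        """Number of messages suppressed for this rule on this file."""
        return max(0, self.counts[(rule, path)] - self.mute_after)
```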

Support ignoring rule based on a source comment

Support ignore based on line comment.

Possibly like:
// NORCNEXTLINE(rulepack.myrule)
would ignore rulepack.myrule on the line following the comment.
// NORC(rulepack.myrule)
would ignore rulepack.myrule on the line the comment appears on.

Technically, the implementation does not have to restrict its search for the keywords to comments. It may do a plaintext search on a line. This will be simpler and should be good enough.

Any whitespace between the start of the comment and the keyword (NORC or NORCNEXTLINE) would be allowed.

Multiple rules may be specified using comma as a separator:
// NORC(rulepack.rule1, rulepack.rule2)

The following would ignore all violations by any rule:
// NORC(*)
// NORCNEXTLINE(*)

Text after NORC/NORCNEXTLINE, other than the contents of the immediately following parentheses, will be ignored. That way the comment may contain a reason/rationale for the disabling of the rule.

Unrecognized rules will be ignored as the user may simply not be executing rulecheck with the rule activated.

Syntax is inspired by NOLINT and NOLINTNEXTLINE comments for clang-tidy: https://clang.llvm.org/extra/clang-tidy/.
A consistent syntax will help users if they are using both tools.
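The proposal above could be prototyped with a plaintext search per line, as in this sketch. The function names are hypothetical; the regular expression simply looks for `NORC(...)` or `NORCNEXTLINE(...)` anywhere on a line and ignores whatever follows the closing parenthesis.

```python
import re

# Try the longer keyword first so NORCNEXTLINE is not read as NORC.
_DIRECTIVE = re.compile(r"\b(NORCNEXTLINE|NORC)\(([^)]*)\)")


def parse_ignores(lines):
    """Map 1-based line numbers to the set of rule names ignored there."""
    ignored = {}
    for number, text in enumerate(lines, start=1):
        for keyword, names in _DIRECTIVE.findall(text):
            target = number + 1 if keyword == "NORCNEXTLINE" else number
            rules = {name.strip() for name in names.split(",") if name.strip()}
            ignored.setdefault(target, set()).update(rules)
    return ignored


def is_ignored(ignored, line_number, rule_name):
    """True if the rule, or the '*' wildcard, is ignored on that line."""
    rules = ignored.get(line_number, set())
    return "*" in rules or rule_name in rules
```

Unrecognized rule names need no special handling here: they are stored but never matched by an active rule.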

If srcml's position's line information decrements, rulecheck will report wrong line information

srcml may report a starting position of line X and then, in subsequent elements, report starting positions of Y < X. One known case is the following construct:

#ifdef __cplusplus
extern "C" {
#endif

#include ... // Or other CPP statements

// C statements

#ifdef __cplusplus
}
#endif

In the above case the block_content tag for the extern "C" block will have a starting position where the C statements start instead of where the following CPP statements are.

Provide handling of srcml namespaces

The elements passed to visitors can be searched via XPath, but they all carry the full namespace, which makes searches cumbersome. Either a helper map for the srcml namespaces could be provided, or the namespaces could be stripped when parsing the srcml output, prior to iterating over the tree. See the most popular answer (not the accepted answer) here: https://stackoverflow.com/questions/18159221/remove-namespace-and-prefix-from-xml-in-python-using-lxml
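A sketch of the stripping option using only the standard library's `xml.etree.ElementTree` (the linked answer uses lxml instead). The helper name is hypothetical, and the namespace URI in the sample is assumed to be the one srcml emits. Only element tags are rewritten; namespaced attribute names are left untouched.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Remove the '{namespace}' prefix from every element tag, in place."""
    for element in root.iter():
        if isinstance(element.tag, str) and element.tag.startswith("{"):
            element.tag = element.tag.split("}", 1)[1]
    return root


# Assumed sample of srcml-like output.
xml_text = (
    '<unit xmlns="http://www.srcML.org/srcML/src">'
    "<function><name>main</name></function></unit>"
)
root = strip_namespaces(ET.fromstring(xml_text))
print(root.find("./function/name").text)
```

After stripping, searches such as `./function/name` work without a namespace map.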

Add argument to run an ignore list cleanup

The ignore list cleanup option would output a new list with updated line numbers. For example, if the original list had an issue at line 5 but it was found at line 6, it is still a match, and the new list would list line 6.

Support specifying multiple rules in a single object of a config file

If a rulepack has many rules, it is tedious to create/manage a rule object entry for each one in the json config file. Support the use of wildcards (* and ?) in names. May still require the rulepack to be specified. For example:
name: linuxstyle.*
name: linuxstyle.rule2?

Also, support wildcards pulling in multiple rules, but if a specific rule is also named with its own settings, replace the instance pulled in via the wildcard instead of instantiating the rule a second time.

  • NOTE: This last one may need some more thought as it is a bit inconsistent with how multiple rule instantiation is done today.
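The wildcard expansion and the replace-on-exact-name behavior could be sketched with Python's `fnmatch` module. The function name, the `settings` key, and the list of available rule names are assumptions for illustration; only the `name` key appears in the examples above.

```python
from fnmatch import fnmatchcase


def _has_wildcard(pattern):
    return "*" in pattern or "?" in pattern


def expand_rule_entries(entries, available_rules):
    """Return {rule_name: settings}, letting exact names override wildcard hits."""
    selected = {}
    # Pass 1: wildcard entries pull in every matching rule.
    for entry in entries:
        if _has_wildcard(entry["name"]):
            for rule_name in available_rules:
                if fnmatchcase(rule_name, entry["name"]):
                    selected[rule_name] = entry.get("settings", {})
    # Pass 2: exactly named rules replace anything pulled in above.
    for entry in entries:
        if not _has_wildcard(entry["name"]) and entry["name"] in available_rules:
            selected[entry["name"]] = entry.get("settings", {})
    return selected
```

Because the result is keyed by rule name, this sketch never creates two instances of one rule, which is exactly the inconsistency the note above flags.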

Support Python 3.7

Currently Python 3.8 is required. Change components to allow 3.7 to be used so that additional systems can be used. (Some corporate systems may not be allowed to upgrade to 3.8 yet.)

Refactor to remove ignore logic from logger

The logger module's log_violation method currently knows too much about how ignores work. It requests the hash of the ignore and then passes it to various other ignore module methods. Update log_violation and ignore module so that the logger doesn't need to request the hash value unless it is going to show it on the console.

Rulecheck sometimes reads beyond last line of file.

This is caused by this srcml bug: srcML/srcML#1697
The incorrect end position specified by srcml is beyond the last 'real' line in the file.
In the example given in the srcml issue:

1: #if THIS\n
2: main() {\n
3: }\n
4: #endif\n
5:

Line 5 has no content and thus is not read in by rulecheck when it calls Python's readlines function. In other words, readlines will read in a list of 4 strings, not 5. But srcml specifies the end position as being on line 5, and rulecheck then tries to process lines up to and including line 5, which is beyond the end of the list of lines.
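One defensive option, sketched here with a hypothetical helper, is to clamp the end line reported by srcml to the number of lines actually read. This is only an illustration of the idea, not rulecheck's actual fix.

```python
import io


def lines_in_range(lines, start_line, end_line):
    """Return lines start_line..end_line (1-based, inclusive), clamped to the file."""
    last_available = min(end_line, len(lines))
    return lines[start_line - 1:last_available]


# Same content as the example above; readlines yields 4 strings, not 5.
source = "#if THIS\nmain() {\n}\n#endif\n"
lines = io.StringIO(source).readlines()
print(len(lines_in_range(lines, 1, 5)))
```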

Additional summary output

Have the logger keep a total count of errors and warnings (two counts) for each rule (by rule name) over all files and include that in the summary. Print Error Count, Warning Count, Rule Name, in that order, so rule name length doesn't impact output formatting.
Counts for multiple instantiations of the same rule would not be kept separately.
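A small sketch of such a summary, keyed by rule name so that several instantiations of one rule share a row. The class name, severity strings, and column widths are assumptions.

```python
from collections import Counter


class ViolationSummary:
    """Accumulate per-rule error and warning totals across all files."""

    def __init__(self):
        self.errors = Counter()
        self.warnings = Counter()

    def record(self, rule_name, severity):
        if severity == "error":
            self.errors[rule_name] += 1
        elif severity == "warning":
            self.warnings[rule_name] += 1

    def format(self):
        """Counts come first so long rule names cannot break the columns."""
        rows = [f"{'Errors':>8} {'Warnings':>8}  Rule"]
        for name in sorted(set(self.errors) | set(self.warnings)):
            rows.append(f"{self.errors[name]:>8} {self.warnings[name]:>8}  {name}")
        return "\n".join(rows)
```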

Support "strict" mode to disable ignore directives

Support an argument and per-rule setting to disable ignore via line comments (strict mode).
(Consider and document what will happen if a rule is ignored via the hash lookup method. Maybe also disable that method as well for strict mode, or print warning, or do nothing?)

Ignore via hash should limit line number difference when finding matches

Ignore via hash needs to match only within +/- n lines of the listed line number, because many lines may hash to the same value when their content is identical. To support this, the hashes must be loaded from the ignore list with duplicates counted, and each count decremented as matches are found in the files being searched. If the count reaches 0, or the line is not within n lines of the ignore row, the violation is reported.
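The counting and line-window logic could be sketched as follows. The class name, the `(hash, line_number)` entry layout, and the use of SHA-256 are assumptions; how rulecheck actually computes and stores its hashes is not stated here.

```python
import hashlib
from collections import Counter


def line_hash(rule_name, line_text):
    """Hypothetical hash of a violation; the real hash inputs may differ."""
    data = f"{rule_name}:{line_text.strip()}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


class IgnoreList:
    """Ignore entries counted per (hash, line), consumed as matches are found."""

    def __init__(self, entries, tolerance):
        self.tolerance = tolerance
        self.remaining = Counter(entries)  # duplicates raise the count

    def consume(self, digest, line_number):
        """Return True and use up one entry if a nearby match remains."""
        candidates = [
            (abs(listed - line_number), listed)
            for (entry_hash, listed), count in self.remaining.items()
            if entry_hash == digest
            and count > 0
            and abs(listed - line_number) <= self.tolerance
        ]
        if not candidates:
            return False  # caller reports the violation
        _, listed = min(candidates)  # prefer the closest listed line
        self.remaining[(digest, listed)] -= 1
        return True
```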

Add rule to language applicability support

Rules need to declare which languages they parse. The same srcml tags appear across several languages, but a style or other rule might apply to only one language or a few, not all.
Rules can be dynamic in this by taking in a setting and changing what they return in the language getter.
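As an illustration, a rule could expose a language getter whose result depends on a setting. The class, the `get_languages` method name, the `languages` setting key, and the language strings are all hypothetical.

```python
class ExampleStyleRule:
    """Hypothetical rule that applies to C and C++ unless settings say otherwise."""

    DEFAULT_LANGUAGES = frozenset({"C", "C++"})

    def __init__(self, settings):
        # A 'languages' setting, if present, overrides the default set.
        self._languages = frozenset(settings.get("languages", self.DEFAULT_LANGUAGES))

    def get_languages(self):
        return self._languages


def rules_for_language(rules, language):
    """Keep only the rules that declare support for the file's language."""
    return [rule for rule in rules if language in rule.get_languages()]
```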

Additional internal tests needed

Test and cleanup as needed what happens when: rule path not found, rule not found, source not found, srcml not found, config file not found, srcml returns an error, settings for rule not present in json, no rules specified in config json
