lins's Introduction

lins

Lins is a lightweight, extensible linter for prose, developed specifically with LaTeX in mind. The tool is written in Nim and inspired by Vale.

Documentation

The project's documentation is available here.

Building

If none of the release packages targets your platform, you can still build and use this tool provided that you have a C compiler that targets your platform.

  1. Download and install the Nim compiler and its tools.

  2. Clone this repository and run

    nimble install
    
  3. Since Lins relies on PCRE for its regular expression support via dynamic linking, you will also have to build or install PCRE as a library.

Version numbers

Releases follow semantic versioning to determine how the version number is incremented. If the specification is ever broken by a release, this will be documented in the changelog.

Reporting a bug

If you discover a bug or what you believe is unintended behavior, please submit an issue on the issue board. A minimal working example and a short description of the context are appreciated and go a long way towards fixing the problem quickly.

License

Lins is free software released under the MIT license.

Third-party dependencies

  • Nim's standard library
  • NimYAML
  • Regular expression support is provided by the PCRE library package, which is open source software, written by Philip Hazel, and copyright by the University of Cambridge, England.

Author

Lins is maintained by Marcus Eriksson.

lins's Issues

Format specifier to insert context match

It might be valuable to be able to insert the matching text from the context regular expression of a scope entry.

Maybe a dedicated format specifier and not one based on numbers like $1 is a good idea.

Control sequence existence rule

How about the ability to define a rule to

  • trigger if a control sequence is used in a particular scope or
  • trigger if a control sequence is not used in a particular scope.

This would require some modification of the LaTeXTextSegment and the LaTeX parser to track the control sequences encountered in the segment.

Built-in rules (LaTeX)

This issue collects and organizes ideas about built-in rules in the LaTeX linter.

Any built-in rules must be able to be disabled by the user.

  • Empty sections
    Trigger if \section is immediately followed by another section command, i.e. without text in between.

    \section{A}
    % This should be avoided!
    \subsection{AB}
  • Absence of groups/options
    Trigger if a capture group is missing from a control sequence or environment. For example, a missing [!ht] from a float environment, if that's a preference.
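The empty-section check from the list above could be prototyped roughly as follows. This is a minimal Python sketch, not Lins code: the function name `find_empty_sections` and the set of sectioning commands it recognizes are assumptions made for illustration, and it works on raw text rather than on the linter's parsed segments.

```python
import re

# Sectioning commands recognized by this sketch (an assumed list, not Lins's own).
SECTION_RE = re.compile(r"\\(?:sub){0,2}section\*?\{[^}]*\}")
# A LaTeX comment: an unescaped % through the end of the line.
COMMENT_RE = re.compile(r"(?<!\\)%.*")


def find_empty_sections(source: str) -> list[int]:
    """Return 1-based line numbers of section commands that are followed
    by another section command with no text in between."""
    empty = []
    pending = None  # line of the last section command still waiting for text
    for lineno, line in enumerate(source.splitlines(), start=1):
        line = COMMENT_RE.sub("", line)
        pos = 0
        for match in SECTION_RE.finditer(line):
            if line[pos:match.start()].strip():
                pending = None  # text appeared before this command
            if pending is not None:
                empty.append(pending)
            pending = lineno
            pos = match.end()
        if line[pos:].strip():
            pending = None  # text after the last command on this line
    return empty
```

Comments are ignored, so the example from the list (a comment between `\section{A}` and `\subsection{AB}`) is still reported. Anything else, including a `\label`, counts as text in this simplified version.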

Add rule to use en-dashes for number ranges

Figure out if it's possible to make a rule that enforces the use of en-dashes (--) for number ranges.

If possible, this rule should match occurrences of \ref{.}--\ref{.}.
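One possible shape for such a rule is a regular expression that matches a single hyphen between two numbers or two `\ref` commands while leaving `--` alone. The Python sketch below only illustrates the idea; the names `RANGE_RE` and `find_hyphen_ranges` are hypothetical, and the pattern would also flag things like ISO dates, so a real rule would need further restrictions.

```python
import re

# Either a plain number or a \ref{...} command.
NUM = r"(?:\d+|\\ref\{[^}]*\})"
# A single hyphen between two such items. The lookarounds reject "--"
# and avoid starting a match in the middle of a number.
RANGE_RE = re.compile(rf"(?<![\d-]){NUM}-(?!-){NUM}")


def find_hyphen_ranges(text: str) -> list[str]:
    """Return every range written with a single hyphen instead of an en-dash."""
    return [match.group(0) for match in RANGE_RE.finditer(text)]
```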

Warn about tab characters

Issue a warning when encountering tab characters \t or \v. This would require emitting rule violations from the lexer so it's related to #17 in terms of infrastructure support.
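As a rough illustration of what the lexer would need to report, the following Python sketch scans the raw input and records the position of every `\t` and `\v`. The function name `find_tabs` and the tuple layout are assumptions, not part of Lins.

```python
def find_tabs(source: str) -> list[tuple[int, int, str]]:
    """Return (line, column, escape) for every horizontal or vertical tab."""
    hits = []
    # Split on "\n" only: str.splitlines() would treat "\v" as a line break.
    for lineno, line in enumerate(source.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "\t":
                hits.append((lineno, col, "\\t"))
            elif ch == "\v":
                hits.append((lineno, col, "\\v"))
    return hits
```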

Link checker

A cool idea would be to add an option like --follow-urls which doesn't run the linter but instead identifies URLs and checks if they return a 404 code. Any dead links would then be reported to the user and the exit code set accordingly.

I don't think you would want to always check the links, since the speed penalty is too great.

Refactor configuration file parser

The configuration file parser in cfg.nim is needlessly complicated with all the states and whatnot. Rewrite it with the token eating philosophy of the language parsers and lexers.

UI to exclude math environments

We need to expose a way to exclude the special math environments ($, \[, etc.) in a rule file.

Currently, the math scope is available in the scope section, and it includes all of these environments. There is a UI to exclude equations, since equation is an environment, but no way to exclude inline math $ or displayed math $$.

Replace \textbackslash with \

A nice-to-have feature would be to treat \textbackslash similarly to control symbols, i.e. inserting the resulting character that would be typeset into the text segment.

Negative matching

It might be useful to support negative matching, i.e. a rule which if it doesn't match in the specified scope, a violation is generated.

This is particularly useful to work around the fact that regexes don't support non-fixed-width look-behind or look-ahead assertions. Those cases will be better served by these negative matching rules.
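The core of a negative matching rule is small: search the scope text and report a violation only when nothing matches. The Python sketch below shows that inversion under the assumption that the scope's text is already available as a string; `check_negative` and its parameters are hypothetical names.

```python
import re


def check_negative(pattern: str, scope_text: str, message: str) -> list[str]:
    """Return [message] if `pattern` does not occur in `scope_text`,
    otherwise an empty list (no violation)."""
    if re.search(pattern, scope_text) is None:
        return [message]
    return []
```

For example, `check_negative(r"\\label\{", figure_body, "figure has no label")` would flag a figure whose body never uses `\label`, something that is awkward to express with look-around assertions.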

Implement YAML parser

While the project is currently using NimYAML, that project sometimes seems to have stagnated. Each time there's a new Nim version it takes a relatively long time before breaking changes are fixed.

The YAML needs of this project are a subset of what NimYAML provides; we would only need a parser and not the serialization engine.

This should ideally be implemented in the same way as the existing lexers and parsers in the project. I'm thinking something along the lines of the Nim JSON parser, outputting variant objects and leaving the translation to fixed types to the user. NimYAML does away with this step by leveraging the macro system, which is nice, but introduces complexity.

Literal interpretation of capture group

Similar to the list of control sequences whose contents are expanded into the parent segment, we should implement a list of control sequences whose contents are interpreted literally.

For example, constructions like \date{\today} currently don't generate any lintable output (everything is suppressed). Perhaps they should, so that you can write a rule to enforce a dynamic dating of the document.

It is also technically possible to add a specific control sequence literally to the segment every time it's encountered, like \today for example.

Tab completion

It should be possible to offer more intelligent tab completion on Unix systems.

One way seems to be using compgen.

Title capitalization rule

A new rule type should be added that triggers title capitalization checks.

The rule file itself is responsible for defining a scope and selecting a capitalization style, although I think we will start by supporting just the Chicago style.

Here is a link to an online formatter. The Chicago manual of style also defines the rules on page 526.

Customizable list of expanding control sequences

Currently, the list of control sequences whose text is expanded into the parent segment is hard coded and consists of \emph, \textbf and \texttt.

There is value in opening up this list to allow the user to add custom control sequences that should be handled in the same way.

Maybe via the configuration file?
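To make the idea concrete, here is a small Python sketch in which the built-in list is extended by a user-supplied tuple. It operates on plain strings with a regular expression, which is not how the Lins parser works; `DEFAULT_EXPANDING`, `expand` and the `extra` parameter are illustrative assumptions.

```python
import re

# The hard-coded list named in this issue.
DEFAULT_EXPANDING = ("emph", "textbf", "texttt")


def expand(text: str, extra: tuple[str, ...] = ()) -> str:
    """Replace \\name{content} with content for every expanding control
    sequence, including user-supplied ones in `extra`."""
    names = "|".join(re.escape(name) for name in DEFAULT_EXPANDING + extra)
    pattern = re.compile(rf"\\(?:{names})\{{([^{{}}]*)\}}")
    previous = None
    # Repeat until nothing changes so that nested commands are unwrapped too.
    while previous != text:
        previous = text
        text = pattern.sub(r"\1", text)
    return text
```

With this shape, a configuration file entry would only need to provide the extra names, for example `expand(segment, extra=("term",))` for a custom `\term` macro.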

Implement trailing context

Right now, only leading contexts are supported. Although the infrastructure is in place for both, the trailing context is always empty.

The question is how to implement it. The current stack implementation + buffer refill creates a few corner cases that may be tough to handle.

There are a lot of potential use cases for trailing contexts, e.g. checking that \section is followed by a \label.
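The `\section` and `\label` use case can be sketched in Python as below. The sketch reads the whole input as one string, so it sidesteps the stack and buffer refill corner cases mentioned above rather than solving them; the names `SECTION_RE`, `LABEL_RE` and `sections_without_label` are assumptions.

```python
import re

SECTION_RE = re.compile(r"\\section\{[^}]*\}")
# A label that follows immediately, allowing only whitespace in between.
LABEL_RE = re.compile(r"\s*\\label\{[^}]*\}")


def sections_without_label(source: str) -> list[str]:
    """Return each \\section{...} whose trailing context does not start
    with a \\label{...}."""
    missing = []
    for match in SECTION_RE.finditer(source):
        trailing = source[match.end():]  # the trailing context of the match
        if LABEL_RE.match(trailing) is None:
            missing.append(match.group(0))
    return missing
```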

Rename context labels

Rename context labels:

  • 'before' to 'leading'
  • 'after' to 'trailing'

This will be a breaking change but works better, not least in the documentation. "The leading context" is clearer than "the context before" (before what?). Since before/after are prepositions, the phrasing gets needlessly wordy.

Check line endings

Offer support for checking the line endings in the linted file, enforcing either Unix-style line endings (\n) or Windows-style line endings (\c\n).

This would have to be extracted from the lexer somehow since that's where the input file is read.

Maybe it's activated by a rule file but the existence of the file only activates a fixed check in the linter.
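A fixed check of this kind could look like the Python sketch below, which works on the raw bytes of the file and reports every line whose terminator differs from the expected one. Note that `\c` is Nim's escape for a carriage return, written `\r` in Python. The function name and the `expected` parameter are hypothetical.

```python
def find_line_ending_violations(data: bytes, expected: bytes = b"\n") -> list[int]:
    """Return 1-based numbers of lines whose terminator is not `expected`."""
    violations = []
    for lineno, line in enumerate(data.splitlines(keepends=True), start=1):
        if line.endswith(b"\r\n"):
            ending = b"\r\n"
        elif line.endswith(b"\n"):
            ending = b"\n"
        elif line.endswith(b"\r"):
            ending = b"\r"
        else:
            continue  # last line without a terminator: nothing to check
        if ending != expected:
            violations.append(lineno)
    return violations
```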

Emphasis tracking

It should be possible to implement an emphasis tracker. The \emph macro is often used to introduce a term and to provide its definition. After the term has been introduced to the reader, \emph should no longer be used. The goal here would be to trigger a rule violation on repeated uses of \emph for the same term.

Is this useful in a general case or potentially too aggressive?
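To get a feel for how aggressive such a rule would be, a quick prototype can be run over an existing document. The Python sketch below remembers every emphasized term and reports later repetitions; it compares terms case-insensitively and ignores nested braces, both of which are simplifying assumptions, as are the names `EMPH_RE` and `repeated_emphasis`.

```python
import re

# \emph{...} with no nested braces inside the argument.
EMPH_RE = re.compile(r"\\emph\{([^{}]*)\}")


def repeated_emphasis(source: str) -> list[tuple[str, int]]:
    """Return (term, line) for every \\emph use of an already emphasized term."""
    seen = set()
    repeats = []
    for match in EMPH_RE.finditer(source):
        # Normalize case and whitespace before comparing terms.
        term = " ".join(match.group(1).lower().split())
        if term in seen:
            line = source.count("\n", 0, match.start()) + 1
            repeats.append((term, line))
        else:
            seen.add(term)
    return repeats
```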
