sthenic / lins

Lins is a lightweight, extensible linter for prose—specifically developed with LaTeX in mind.

Home Page: https://sthenic.github.io/lins

License: MIT License

lins's Introduction

Lins is a lightweight, extensible linter for prose—specifically developed with LaTeX in mind. The tool is written in Nim and inspired by Vale.

Documentation

The project's documentation is available at https://sthenic.github.io/lins.

Building

If none of the release packages targets your platform, you can still build and use this tool provided that you have a C compiler that targets your platform.

Download and install the Nim compiler and its tools.
Clone this repository and run
```
nimble install
```
Since Lins relies on PCRE for its regular expression support via dynamic linking, you will also have to build or install PCRE as a library.

Version numbers

Releases follow semantic versioning to determine how the version number is incremented. If the specification is ever broken by a release, this will be documented in the changelog.

Reporting a bug

If you discover a bug or what you believe is unintended behavior, please submit an issue on the issue board. A minimal working example and a short description of the context are appreciated and go a long way towards fixing the problem quickly.

License

Lins is free software released under the MIT license.

Third-party dependencies

Nim's standard library
NimYAML
Regular expression support is provided by the PCRE library package, which is open source software, written by Philip Hazel, and copyright by the University of Cambridge, England.

Author

Lins is maintained by Marcus Eriksson.

lins's Issues

Format specifier to insert context match

It might be valuable to be able to insert the matching text from the context regular expression of a scope entry.

Maybe a dedicated format specifier and not one based on numbers like $1 is a good idea.

Control sequence existence rule

How about the ability to define a rule to

trigger if a control sequence is used in a particular scope or
trigger if a control sequence is not used in a particular scope.

This would require some modification of the LaTeXTextSegment and the LaTeX parser to track the control sequences encountered in the segment.
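
As a rough illustration of the idea, the sketch below collects the control sequences seen in a segment and then evaluates a "must exist" or "must not exist" rule against that set. It is written in Python rather than Nim and does not touch LaTeXTextSegment; the names `control_sequences` and `existence_violation` are hypothetical.

```python
import re

# Control words only (a backslash followed by letters). Control symbols
# such as \% are ignored in this simplified sketch.
CONTROL_WORD_RE = re.compile(r"\\([A-Za-z]+)")


def control_sequences(segment_text):
    """Return the set of control word names found in the segment text."""
    return set(CONTROL_WORD_RE.findall(segment_text))


def existence_violation(segment_text, name, required):
    """Return True if the rule is violated.

    required=True  -> violation when the control sequence is absent.
    required=False -> violation when the control sequence is present.
    """
    present = name in control_sequences(segment_text)
    return present != required
```

A real implementation would presumably record the control sequences while parsing instead of scanning the text a second time.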

Built-in rules (LaTeX)

This issue collects and organizes ideas about built-in rules in the LaTeX linter.

Any built-in rules must be able to be disabled by the user.

Empty sections
Trigger if \section is immediately followed by another section command, i.e. without text in between.
```
\section{A}
% This should be avoided!
\subsection{AB}
```
Absence of groups/options
Trigger if a capture group is missing from a control sequence or environment. For example, a missing [!ht] from a float environment, if that's a preference.
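
The empty-section rule could be prototyped roughly as follows. This is a Python sketch operating on raw text with a regular expression, not the way the Lins parser works; `find_empty_sections` is a hypothetical name and only \section, \subsection and \subsubsection are considered.

```python
import re

# A sectioning command with a braced title (no nested braces in the title).
SECTION_RE = re.compile(r"\\(?:sub){0,2}section\*?\{[^}]*\}")


def find_empty_sections(source):
    """Return 1-based line numbers of section commands that are followed
    by another section command with no text in between."""
    matches = list(SECTION_RE.finditer(source))
    violations = []
    for current, following in zip(matches, matches[1:]):
        between = source[current.end():following.start()]
        # Comments do not count as text: drop everything after an unescaped %.
        without_comments = re.sub(r"(?<!\\)%.*", "", between)
        if not without_comments.strip():
            violations.append(source.count("\n", 0, current.start()) + 1)
    return violations
```

Applied to the example above, the function reports line 1, where \section{A} is followed directly by \subsection{AB}.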

Add rule to use en-dashes for number ranges

Figure out if it's possible to make a rule that enforces the use of en-dashes (--) for number ranges.

If possible, this rule should match occurrences of \ref{.}--\ref{.}.
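
A first approximation, assuming a plain regular expression is enough: match a number or a \ref{...}, then a single hyphen, then another number or \ref{...}. The Python sketch below is only meant to test the pattern; `find_hyphen_ranges` is a hypothetical name, and the pattern will also flag dates such as 2020-01-15 and subtractions in math.

```python
import re

# One operand of a range: an integer or a \ref{...}.
OPERAND = r"(?:\d+|\\ref\{[^}]*\})"

# Two operands joined by exactly one hyphen (optionally surrounded by spaces).
# The lookarounds reject "--" and "---", which are already dashes.
RANGE_RE = re.compile(rf"{OPERAND}\s*(?<!-)-(?!-)\s*{OPERAND}")


def find_hyphen_ranges(text):
    """Return the ranges written with a single hyphen instead of an en-dash."""
    return [match.group(0) for match in RANGE_RE.finditer(text)]
```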

Warn about tab characters

Issue a warning when encountering tab characters \t or \v. This would require emitting rule violations from the lexer so it's related to #17 in terms of infrastructure support.
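
The check itself is simple. A minimal Python sketch, independent of the Lins lexer, that reports the position of every horizontal or vertical tab (`find_tabs` is a hypothetical name):

```python
def find_tabs(text):
    """Return (line, column, name) for each tab character, 1-based."""
    names = {"\t": "horizontal tab", "\v": "vertical tab"}
    hits = []
    # split("\n") is used on purpose: str.splitlines() would treat \v as a
    # line boundary and hide the character we are looking for.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, char in enumerate(line, start=1):
            if char in names:
                hits.append((line_no, col_no, names[char]))
    return hits
```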

Link checker

A cool idea would be to add an option like --follow-urls which doesn't run the linter but instead identifies URLs and checks if they return a 404 code. Any dead links would then be reported to the user and the exit code set accordingly.

I don't think you would want to always check the links; the speed penalty is too great.
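
A sketch of what such a mode might do, in Python and under the assumption that a HEAD request is enough to detect a dead link. The function names are hypothetical, and the status lookup is passed in as a parameter so that the URL extraction can be tried without network access.

```python
import re
import urllib.error
import urllib.request

# Stops at whitespace and braces so that \url{...} and \href{...}{...} work.
URL_RE = re.compile(r"https?://[^\s{}]+")


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a HEAD request to url.

    Connection-level failures (urllib.error.URLError) are not handled here.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def find_dead_links(text, fetch_status=http_status):
    """Return the unique URLs in text for which fetch_status reports 404."""
    dead = []
    for url in dict.fromkeys(URL_RE.findall(text)):  # de-duplicate, keep order
        if fetch_status(url) == 404:
            dead.append(url)
    return dead
```

The exit code could then be derived from whether the returned list is empty.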

Refactor configuration file parser

The configuration file parser in cfg.nim is needlessly complicated with all the states and whatnot. Rewrite it with the token eating philosophy of the language parsers and lexers.

UI to exclude math environments

We need to expose a way to exclude the special math environments ($, \[, etc.) in a rule file.

Currently, the math scope in the scope section covers all of these environments. Equations can be excluded because they are ordinary environments, but there is no way to exclude inline math $ or displayed math $$.

Replace \textbackslash with \

A nice-to-have feature would be to treat \textbackslash similarly to control symbols, i.e. inserting the resulting character that would be typeset into the text segment.

Fix support for initial values for line and column numbers

The new linter structure does not yet respect line_init and col_init, which are set by the CLI options --line and --col.

Negative matching

It might be useful to support negative matching, i.e. a rule that generates a violation if it doesn't match in the specified scope.

This is particularly useful to work around the fact that regexes don't support non-fixed-width look-behind or look-ahead assertions. Those cases will be better served by these negative matching rules.
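
The semantics are the inverse of an ordinary rule. In the Python sketch below (hypothetical function name, not the Lins rule format), a violation is produced exactly when the pattern is absent from the scope text:

```python
import re


def negative_match(pattern, scope_text):
    """Return a violation message if pattern does NOT occur in scope_text,
    otherwise None."""
    if re.search(pattern, scope_text) is None:
        return f"expected a match for {pattern!r} in this scope"
    return None
```

For instance, a rule with the pattern \\label\{ applied to the contents of a figure environment would trigger for every figure without a label.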

Implement YAML parser

While the project is currently using NimYAML, that project sometimes seems to have stagnated. Each time there's a new Nim version it takes a relatively long time before breaking changes are fixed.

The YAML needs of this project are a subset of what NimYAML provides; we would only need a parser and not the serialization engine.

This should ideally be implemented in the same way as the existing lexers and parsers in the project. I'm thinking something along the lines of the Nim JSON parser, outputting variant objects and leaving the translation to fixed types to the user. NimYAML does away with this step by leveraging the macro system, which is nice, but introduces complexity.

Literal interpretation of capture group

Similar to the list of control sequences whose contents are expanded into the parent segment, we should implement a list of control sequences whose contents are interpreted literally.

For example, constructions like \date{\today} currently don't generate any lintable output (everything is suppressed). Perhaps they should, so that you can write a rule to enforce dynamic dating of the document.

It is also technically possible to add a specific control sequence literally to the segment every time it's encountered, \today for example.

Tab completion

It should be possible to offer more intelligent tab completion on Unix systems.

One way seems to be using compgen.

Title capitalization rule

A new rule type should be added that triggers title capitalization checks.

The rule file itself is responsible for defining a scope and selecting a capitalization style, although I think we will start by supporting just the Chicago style.

Here is a link to an online formatter. The Chicago manual of style also defines the rules on page 526.

Customizable list of expanding control sequences

Currently, the list of control sequences whose text is expanded into the parent segment is hard coded and consists of \emph, \textbf and \texttt.

There is value in opening up this list to allow the user to add custom control sequences that should be handled in the same way.

Maybe via the configuration file?
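
To make the intent concrete, here is a Python sketch in which the set of expanding control sequences is a parameter instead of a constant. It works by text substitution, which is a simplification of how segments are built in Lins; `expand` and the \term macro in the usage note are hypothetical.

```python
import re

# The built-in list mentioned above.
DEFAULT_EXPANDING = frozenset({"emph", "textbf", "texttt"})


def expand(text, expanding=DEFAULT_EXPANDING):
    """Replace \\name{contents} with contents for every name in expanding."""
    if not expanding:
        return text
    names = "|".join(re.escape(name) for name in sorted(expanding))
    pattern = re.compile(rf"\\(?:{names})\{{([^{{}}]*)\}}")
    # Innermost groups are expanded first; repeat until nothing changes
    # so that nested control sequences are handled too.
    while True:
        expanded = pattern.sub(r"\1", text)
        if expanded == text:
            return text
        text = expanded
```

A user-defined macro would then be handled by extending the set, e.g. expand(text, DEFAULT_EXPANDING | {"term"}).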

Implement trailing context

Right now, only leading contexts are supported. Although the infrastructure is in place for both, the trailing context is always empty.

The question is how to implement it. The current stack implementation + buffer refill creates a few corner cases that may be tough to handle.

There are a lot of potential use cases for trailing contexts, e.g. checking that \section is followed by a \label.

Rename context labels

Rename context labels:

'before' to 'leading'
'after' to 'trailing'

This will be a breaking change but works better, not least in the documentation. "The leading context" is clearer than "the context before" (before what?). Since before/after are prepositions, the wording gets needlessly clumsy.

Check line endings

Offer support for checking the line endings in the linted file, enforcing either Unix-style line endings (\n) or Windows-style line endings (\c\n).

This would have to be extracted from the lexer somehow since that's where the input file is read.

Maybe it's activated by a rule file but the existence of the file only activates a fixed check in the linter.
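
The fixed check could look something like the Python sketch below, which works on the raw bytes of the file and reports the lines whose ending differs from the expected one. The function name is hypothetical; note that Nim's \c is the carriage return, written \r in Python.

```python
import re

# \r\n must come first so that it is not counted as \r followed by \n.
LINE_ENDING_RE = re.compile(rb"\r\n|\r|\n")


def find_line_ending_violations(data, expected=b"\n"):
    """Return 1-based numbers of lines that do not end with expected."""
    violations = []
    for line_no, match in enumerate(LINE_ENDING_RE.finditer(data), start=1):
        if match.group(0) != expected:
            violations.append(line_no)
    return violations
```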

Emphasis tracking

It should be possible to implement an emphasis tracker. The \emph macro is often used to introduce a term and to provide its definition. After the term has been introduced to the reader, \emph should no longer be used. The goal here would be to trigger a rule violation on repeated uses of \emph for the same term.

Is this useful in a general case or potentially too aggressive?
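
To get a feel for how aggressive this would be, the tracker can be prototyped in a few lines. The Python sketch below assumes that two uses refer to the same term when they are equal after lowercasing and whitespace normalization, which is an assumption and not a settled definition; `find_repeated_emphasis` is a hypothetical name.

```python
import re

# \emph{...} without nested braces.
EMPH_RE = re.compile(r"\\emph\{([^{}]*)\}")


def find_repeated_emphasis(source):
    """Return (line, term) for every \\emph whose term was emphasized before."""
    seen = set()
    repeats = []
    for match in EMPH_RE.finditer(source):
        # Normalize case and whitespace before comparing terms.
        term = " ".join(match.group(1).lower().split())
        if term in seen:
            line = source.count("\n", 0, match.start()) + 1
            repeats.append((line, match.group(1)))
        else:
            seen.add(term)
    return repeats
```

Running this over a few real documents would show how often \emph is repeated for ordinary stress rather than for introducing a term.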