abnf's Introduction

Augmented BNF for Syntax Specifications: ABNF

Internet technical specifications often need to define a formal syntax and are free to employ whatever notation their authors deem useful. Over the years, a modified version of Backus-Naur Form (BNF), called Augmented BNF (ABNF), has been popular among many Internet specifications. It balances compactness and simplicity with reasonable representational power.

RFC 5234

Note: input passed as []byte(...) must be UTF-8 encoded.

Function Generator

The function generator builds the operators in memory.

g := ParserGenerator{
	RawABNF: rawABNF,
}
functions := g.GenerateABNFAsOperators()
// e.g. functions["ALPHA"]([]byte("a"))

Code Generator

Both the Core ABNF and the ABNF Definition contained within this package were created by the generator.

corePkg := externalABNF{
	operator:    true,
	packageName: "github.com/elimity-com/abnf/core",
}
g := Generator{
	PackageName:  "definition",
	RawABNF:      rawABNF,
	ExternalABNF: map[string]ExternalABNF{
		"ALPHA":  corePkg,
		"BIT":    corePkg,
		// etc.
	},
}
f := g.GenerateABNFAsAlternatives()
// e.g. ioutil.WriteFile("./definition/abnf_definition.go", []byte(fmt.Sprintf("%#v", f)), 0644)

(Currently) Not Supported
  • free-form prose
  • incremental alternatives

The "core" rules are used variously among higher-level rules. They might be formed into a lexical analyzer or simply be part of the main ruleset.

Elements form a sequence of one or more rule names and/or value definitions, combined according to the various operators defined in this package, such as alternative and repetition.

HEXDIG

The spec lists only uppercase letters for HEXDIG. Read literally, that makes it case sensitive,
i.e. 0x6e != 0x6E

HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"

This implementation also accepts lowercase letters, so that 0x6e == 0x6E.

HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
               / "a" / "b" / "c" / "d" / "e" / "f"
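The extended rule above can be sketched as a plain byte check. This is a minimal illustration only; `isHexDigit` is a hypothetical helper and not part of this package's API.

```go
package main

import "fmt"

// isHexDigit reports whether b is a hexadecimal digit. It accepts both
// uppercase and lowercase letters, mirroring the extended HEXDIG rule.
// (Hypothetical helper for illustration; not part of the package.)
func isHexDigit(b byte) bool {
	switch {
	case '0' <= b && b <= '9':
		return true
	case 'A' <= b && b <= 'F':
		return true
	case 'a' <= b && b <= 'f':
		return true
	}
	return false
}

func main() {
	for _, b := range []byte("6eEg") {
		fmt.Printf("%q -> %v\n", b, isHexDigit(b))
	}
}
```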

EOL

Text files created on DOS/Windows machines have different line endings than files created on Unix/Linux. DOS uses a carriage return followed by a line feed (\r\n) as a line ending, while Unix uses just a line feed (\n).

This is why this package also allows a bare LF, which is NOT compliant with the specification.

CRLF = CR LF / LF
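A minimal sketch of what the relaxed rule accepts, assuming the input is a byte slice. `matchEOL` is a hypothetical name used for illustration and does not exist in this package.

```go
package main

import (
	"bytes"
	"fmt"
)

// matchEOL returns the length of the line ending at the start of s:
// 2 for CR LF, 1 for a bare LF, and 0 if s starts with neither.
// (Hypothetical helper for illustration; not part of the package.)
func matchEOL(s []byte) int {
	switch {
	case bytes.HasPrefix(s, []byte("\r\n")):
		return 2
	case bytes.HasPrefix(s, []byte("\n")):
		return 1
	}
	return 0
}

func main() {
	for _, in := range []string{"\r\nrest", "\nrest", "\rrest"} {
		fmt.Printf("%q -> %d\n", in, matchEOL([]byte(in)))
	}
}
```

A lone CR is still rejected, since only LF was added as an alternative.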

Operator Precedence

RFC 5234 3.10

highest

  1. Rule name, prose-val, Terminal value
  2. Comment
  3. Value range
  4. Repetition
  5. Grouping, Optional
  6. Concatenation
  7. Alternative

lowest
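As an illustration of the last two entries: because concatenation binds tighter than alternative, `"a" / "b" "c"` groups as `"a" / ("b" "c")`. The sketch below uses small stand-in matchers (`Lit`, `Concat`, `Alt`, all hypothetical and much simpler than this package's operators) to compare that grouping with the misreading `("a" / "b") "c"`.

```go
package main

import (
	"fmt"
	"strings"
)

// Matcher consumes a prefix of s and returns the rest; ok is false on failure.
type Matcher func(s string) (rest string, ok bool)

// Lit matches the literal string lit.
func Lit(lit string) Matcher {
	return func(s string) (string, bool) {
		if strings.HasPrefix(s, lit) {
			return s[len(lit):], true
		}
		return s, false
	}
}

// Concat applies each matcher in order to the remaining input.
func Concat(ms ...Matcher) Matcher {
	return func(s string) (string, bool) {
		rest := s
		for _, m := range ms {
			var ok bool
			if rest, ok = m(rest); !ok {
				return s, false
			}
		}
		return rest, true
	}
}

// Alt returns the result of the first alternative that matches.
func Alt(ms ...Matcher) Matcher {
	return func(s string) (string, bool) {
		for _, m := range ms {
			if rest, ok := m(s); ok {
				return rest, true
			}
		}
		return s, false
	}
}

// full reports whether m consumes all of s.
func full(m Matcher, s string) bool {
	rest, ok := m(s)
	return ok && rest == ""
}

var (
	// "a" / "b" "c" as the precedence table groups it: "a" / ("b" "c")
	byPrecedence = Alt(Lit("a"), Concat(Lit("b"), Lit("c")))
	// The same text misread as ("a" / "b") "c"
	misread = Concat(Alt(Lit("a"), Lit("b")), Lit("c"))
)

func main() {
	for _, in := range []string{"a", "bc", "ac"} {
		fmt.Printf("%q: by precedence=%v, misread=%v\n",
			in, full(byPrecedence, in), full(misread, in))
	}
}
```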

abnf's People

Contributors

q-uint

abnf's Issues

Code Generator rule "alias" generates stack overflow/error.

When a rule is simply equal to another rule, the generated rule returns itself instead of the other rule.

rule-a = ALPHA / DIGIT
rule-b = rule-a

// rule-a = ALPHA / DIGIT
func RuleA() operators.Operator {
    return operators.Alts(
        "rule-a",
        core.ALPHA(),
        core.DIGIT(),
    )
}
// rule-b = rule-a
func RuleB() operators.Operator {
    return RuleB()  // NOTE - ERROR HERE - should be RuleA
}
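
For comparison, a sketch of the expected shape, in which the alias delegates to the rule it names. The `Operator` type here is a simplified stand-in so that the example is self-contained; the package's real operator type and the generator's real output may differ.

```go
package main

import "fmt"

// Operator is a simplified stand-in for the package's operator type.
type Operator func(s []byte) bool

// RuleA: rule-a = ALPHA / DIGIT
func RuleA() Operator {
	return func(s []byte) bool {
		if len(s) != 1 {
			return false
		}
		c := s[0]
		return 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z' || '0' <= c && c <= '9'
	}
}

// RuleB: rule-b = rule-a
// The alias returns the aliased rule, so the call chain terminates.
func RuleB() Operator {
	return RuleA()
}

func main() {
	fmt.Println(RuleB()([]byte("a")), RuleB()([]byte("-")))
}
```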

Range Operator

The following line results in an empty slice.

operators.Range("%x5D-10FFFF", []byte{93}, []byte{16, 255, 255})([]byte("x"))

"x" has a decimal value of 120 which is in the range.

Hex converted to bytes by:

abnf/tree.go

Lines 500 to 507 in a0ffb9d

func hexStringToBytes(hexStr string) []int {
	n, _ := strconv.ParseInt(hexStr, 16, 64)
	b := make([]int, (len(hexStr)+1)/2)
	for i := range b {
		b[i] = int(byte(n >> uint64(8*(len(b)-i-1))))
	}
	return b
}

Range checked by:

abnf/operators/leaves.go

Lines 55 to 81 in a0ffb9d

func Range(key string, l, h []byte) Operator {
	return func(s []byte) Alternatives {
		if len(s) == 0 || len(s) < len(l) || bytes.Compare(s[:len(l)], l) < 0 {
			return nil
		}
		var l int
		for i := range h {
			if i+1 <= i+1 && bytes.Compare(s[:i+1], h) <= 0 {
				l++
			} else {
				break
			}
		}
		if l == 0 {
			return nil
		}
		return []*Node{
			{
				Key: key,
				Value: s[:l],
			},
		}
	}
}

Can not parse IPv6 addresses

A bug in the Optional function prevents the library from parsing IPv6 addresses correctly.

Simplified example:

rule := Concat(`[ *1( a ":" ) a ] "::"`,
	Optional(`[ *1( a ":" ) a ]`,
		Concat(`*1( a ":" ) a`,
			Repeat(`*1( a ":" )`, 0, 1,
				Concat(`a ":"`,
					Rune(`a`, 'a'),
					Rune(`:`, ':'),
				),
			),
			Rune(`a`, 'a'),
		),
	),
	String(`::`, "::", false),
)

for _, s := range []string{
	"::",
	"a::",
	"a:a::",
} {
	r := rule([]rune(s))
	if r == nil {
		fmt.Printf("no value found for: %s", s)
	}
}

Result:

no value found for: a::

I think this is because:

  1. "::": the optional value is empty.
  2. "a::": finds "a:" but cannot find the following "a", and thus skips the optional value, which is incorrect.
  3. "a:a::": finds the optional value.
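
One way to avoid this kind of failure is to let every operator return all positions at which a match can end, so that an optional or repeated part can fall back to a shorter match. The sketch below applies that idea to the simplified rule. The `Parser` type and the `Lit`, `Concat`, `Optional` and `Matches` helpers are hypothetical stand-ins, not this package's operators, and `*1( ... )` is written as `Optional` because it means zero or one repetition.

```go
package main

import (
	"fmt"
	"strings"
)

// Parser returns every offset at which a match starting at pos can end.
// An empty result means no match.
type Parser func(s string, pos int) []int

// Lit matches the literal string lit.
func Lit(lit string) Parser {
	return func(s string, pos int) []int {
		if strings.HasPrefix(s[pos:], lit) {
			return []int{pos + len(lit)}
		}
		return nil
	}
}

// Concat chains parsers, continuing from every end offset found so far.
func Concat(ps ...Parser) Parser {
	return func(s string, pos int) []int {
		ends := []int{pos}
		for _, p := range ps {
			var next []int
			for _, e := range ends {
				next = append(next, p(s, e)...)
			}
			ends = next
		}
		return ends
	}
}

// Optional keeps both outcomes: skipping p, and every way p can match.
func Optional(p Parser) Parser {
	return func(s string, pos int) []int {
		return append([]int{pos}, p(s, pos)...)
	}
}

// Matches reports whether p can consume all of s.
func Matches(p Parser, s string) bool {
	for _, e := range p(s, 0) {
		if e == len(s) {
			return true
		}
	}
	return false
}

// rule = [ *1( a ":" ) a ] "::"
var rule = Concat(
	Optional(Concat(
		Optional(Concat(Lit("a"), Lit(":"))),
		Lit("a"),
	)),
	Lit("::"),
)

func main() {
	for _, s := range []string{"::", "a::", "a:a::"} {
		fmt.Printf("%q -> %v\n", s, Matches(rule, s))
	}
}
```

For "a::", the inner optional reports both "matched a:" and "matched nothing", so the following "a" can still be found from the start of the input.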
