The ohm-s from hpi-swa

SUnit tests in the 'testing' message category

Tests throughout all Ohm-S packages are consistently placed inside the message category testing instead of the conventional tests.

Write documentation to reflect differences between Ohm and Ohm/JS

Unexpected difference between lexical and syntactic rule

I was using the following rules in Ohm-S:

Text {
  TextRule
    = text

  textCharacter
      = "\\\""
      | "\\\\"
      | ~"\\" ~"\"" any

  text
    = "\"" textCharacter+ "\""
}

When trying to match the input Hello World starting from text, everything works as expected and the string is matched correctly. When starting from TextRule, however, the input cannot be matched.

I tried the very same setup in Ohm/JS, which was able to match with either rule as the starting rule.

Add range operator

OhmSmalltalk grammar cannot be parsed by Ohm/JS

I stumbled upon this using the interactive Ohm editor and confirmed it by setting up Ohm/JS locally.

Error: Line 207, col 7:
  206 |     | stringLiteral
> 207 |     | ByteArrayLiteral
              ^~~~~~~~~~~~~~~~
  208 |     | symbolInArrayLiteral
Cannot apply syntactic rule ByteArrayLiteral from here (inside a lexical context)

After converting ByteArrayLiteral and LiteralArrayLiteralInLiteralArray to lexical rules, it does get accepted. I'm not sure, however, if that has an impact on the grammar's correctness.

Apply should allow parameterized applys as arguments

Smalltalk grammar problem with message receivers

Currently the Smalltalk grammar parses a variable name that starts with a special literal name, such as false, as a unary send without a space.
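
One common remedy for this kind of problem is to accept a special literal only when it is not directly followed by another identifier character. The workspace sketch below shows that check on plain strings. The block name isSpecialLiteral and the list of literals are assumptions made for illustration; they are not part of Ohm/S or its Smalltalk grammar.

```smalltalk
"Workspace sketch. isSpecialLiteral is a hypothetical helper, not Ohm/S API."
| specialLiterals isSpecialLiteral |
specialLiterals := #('nil' 'true' 'false').
isSpecialLiteral := [:source |
	specialLiterals anySatisfy: [:literal | | next |
		(source beginsWith: literal) and: [
			"Either the literal is the whole token, or the next character
			 must not continue an identifier."
			source size = literal size or: [
				next := source at: literal size + 1.
				(next isAlphaNumeric or: [next = $_]) not]]]].
isSpecialLiteral value: 'false'.          "true"
isSpecialLiteral value: 'falsePositive'.  "false, an ordinary variable name"
```

In a grammar the same idea would be expressed as a negative lookahead after the literal.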

FindFirst:startingAt: not understood

The method FindFirst:startingAt: is not in the package. It is easy to implement yourself in your local image, but it is missing out of the box.

Also your CI fails due to that ;)

OhmPExprOpt always produces a node with content empty string

You can see it when using OhmSmalltalk>>number

Add a parse result object instead of the true or exception interface

Add parameterized actions

ohmjs/ohm#38 (comment)

Add CaseInsensitiveTerminal

Add the capability to update the DebuggerMap in the source rewriter

Fix issue with bootstrapping the built in grammar

Port the built-in rules to Ohm/S

https://github.com/harc/ohm/blob/master/src/built-in-rules.ohm

OhmSmalltalk can not parse all of the Squeak/Smalltalk methods

Remove dependency to Context/S 2 for standalone Ohm version

BinarySelectorChar in OhmExplicitSendsSmalltalk

The expression 3-6 is evaluated to normalFloatingPointLiteral_exponent. This seems to be an issue with the binaryMessageSelectorChar rule of the grammar.

Fixing it would make the styling with OmegaPrint correct when '-' is used as a binary message. Here you can see a picture of how it is formatted right now.
Thank you.

Allow access to skipped whitespace

In order to support Pretty Printers, the CST should allow access to whitespace. The whitespace does not have to be explicitly materialized in the CST but could also be re-constructed after the fact.

Here are two suggestions:

1. CST nodes support the method skippedSpaces that returns the spaces skipped before the node was applied. There are two options for how the spaces could be returned:
   - The spaces are returned as a simple string
   - The spaces are returned as a spaces node
2. CST nodes support the method childrenIncludingSpaces that returns a collection of all parsed nodes including the space nodes in between
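
To illustrate the "re-constructed after the fact" variant: if every child knows its interval in the input, the skipped whitespace is simply the gap between consecutive intervals. The sketch below models intervals as start -> stop associations, which is an assumption made for the example; real CST nodes would supply their own interval objects.

```smalltalk
"Workspace sketch. Intervals are modelled as start -> stop associations."
| input childIntervals position skipped |
input := 'foo   := 3'.
childIntervals := {1 -> 3. 7 -> 8. 10 -> 10}.
position := 1.
skipped := childIntervals collect: [:interval | | spaces |
	"Everything between the end of the previous child and the start of
	 this one was skipped before this child was applied."
	spaces := input copyFrom: position to: interval key - 1.
	position := interval value + 1.
	spaces].
skipped "an Array holding '', '   ' and ' '"
```

Each entry corresponds to what skippedSpaces could answer for the respective child when returning a simple string.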

Any thoughts from your side @Paula-Kli?

SmalltalkGrammar literalArray problems with symbols in literalArray

Add the construction of recipes to the compilation of base, built in, and ohm grammars

Arity of some rules behaves inconsistently

The following rule yields nodes with only four children if the optional part does not match. It yields nodes with five children if it matches. We should check with Ohm/JS whether this is intended or accidental behavior.

Pragma =
'<' identifier (':' Literal)? '>'

Check base grammar rules

Add Unicode Character sets

New super-splice operator

In Ohm v15.3.0 a new operator was added:
ohmjs/ohm@b519a05
https://github.com/harc/ohm/blob/master/doc/syntax-reference.md#defining-extending-and-overriding-rules

The super-splice operator (...) can be used to append and/or prepend cases to the supergrammar rule body. E.g., if the supergrammar defines comment = multiLineComment, then comment := ... | singleLineComment is equivalent to comment := multiLineComment | singleLineComment.

Rename _ to any

First rule of a grammar denotes the start rule

Add a nice way to use Ohm/S grammars without Gramada

OhmExplicitSendsSmalltalk has only OperandCascades

When matching something like self new test; test, the result is an OperandCascade with self as operand and new test and test as MessageChains. I was wondering whether that should be a UnaryCascade instead.

If it should not: what is a UnaryCascade, and what is a BinaryCascade? I cannot find an example that is matched as one of those.

Fix checks in the child grammar if the super grammar changes

The results of many pexprs produce ambiguous nodes

The CST resulting from parsing aabbbcbcd with the following grammar is ambiguous:

ManyTestGrammar { StartRule = "a"+ ("b"+ "c")+ "d"+ }

We cannot determine the matching intervals of "b"+ anymore.
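
One plausible reading of the problem, sketched with plain Arrays standing in for CST nodes: if the children of a repetition are collected column by column, the boundary between iterations disappears. The data layout below is an assumption made for illustration and does not claim to mirror the actual Ohm/S node classes.

```smalltalk
"Workspace sketch with plain Arrays standing in for CST nodes."
| perIteration flattened |
"What the two iterations of the group (b+ c)+ matched in aabbbcbcd."
perIteration := #( #('bbb' 'c') #('b' 'c') ).
"Collecting the children column by column loses the iteration boundary."
flattened := {
	perIteration inject: '' into: [:all :each | all , each first].
	perIteration inject: '' into: [:all :each | all , each second]}.
flattened "#('bbbb' 'cc')"
```

From the flattened form alone it is no longer visible that the first "b"+ covered three characters and the second only one.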

Late conversion of inline rule names to message names can cause clashes

Inline rule names do not match with the message that is sent to semantic actions. It is therefore possible to do the following:

SomeGrammar {
  RuleOne
    = "hello"
  Rule
    = "hello" "world" -- one
}

At the moment, the underscore in an inline rule name is removed and the subsequent character is capitalized when searching for methods in semantic actions. Both rules would therefore try to look up the method RuleOne. Due to differing arities, only one of the rules can be acted upon.

This duplication of rules is not detected, since the inline rule name conforms to the Ohm/JS implementation (Rule_one) and is only later converted.
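
A minimal sketch of how such a clash could be detected early, by applying the conversion described above to all rule names and looking for duplicates. The block actionNameFor is a hypothetical helper written for this example; it only mimics the described conversion and is not taken from the Ohm/S sources.

```smalltalk
"Workspace sketch. actionNameFor is a hypothetical helper, not Ohm/S API."
| actionNameFor ruleNames seen clashes |
actionNameFor := [:ruleName | | parts |
	"Remove the underscore and capitalize the character that follows it."
	parts := ruleName subStrings: '_'.
	parts size = 1
		ifTrue: [ruleName]
		ifFalse: [parts first , parts second capitalized]].
ruleNames := #('RuleOne' 'Rule_one').
seen := Dictionary new.
clashes := OrderedCollection new.
ruleNames do: [:name | | actionName |
	actionName := actionNameFor value: name.
	(seen includesKey: actionName)
		ifTrue: [clashes add: (seen at: actionName) -> name]
		ifFalse: [seen at: actionName put: name]].
clashes "holds the association 'RuleOne' -> 'Rule_one'"
```

Running a check like this when the grammar is created would report the duplicate before any semantic action is looked up.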

Refactor makeGrammar interface to match Ohm/JS interface

not in a sequence in a syntactic rule skips spaces

Which is somewhat wrong, as there was no further matched child.

OhmSHRuleParser errors on incomplete Smalltalk method

On a fresh subclass of OhmGrammarSmalltalkProxy, when writing the class-side serializedGrammar method, the error Error: ByteString called #basicNew: with invalid argument -1 is thrown once the text editor contains the following:

serializedGrammar
^ '

The error is caused because the following code always assumes the opening single quote to be closed in the end:

Ohm-S/packages/Ohm-Support.package/OhmSHRuleParser.class/instance/rangesIn.classOrMetaClass.workspace.environment.context..st

Line 14 in 98bf92f

	grammarSource := aString copyFrom: offset + 1 to: (aString findLast: [:c | c = $']) - 1.
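
A sketch of a guarded variant, assuming that offset is the index of the opening quote. In that case findLast: answers the same index while the literal is unterminated, so the quoted line asks for a copy of size -1. The guard below is only one possible approach and not the project's actual fix.

```smalltalk
"Workspace sketch. Variable names follow the quoted line; the guard is an assumption."
| aString offset lastQuote grammarSource |
aString := 'serializedGrammar
^ '''.
offset := aString indexOf: $'.
lastQuote := aString findLast: [:c | c = $'].
grammarSource := lastQuote > offset
	ifTrue: [aString copyFrom: offset + 1 to: lastQuote - 1]
	"No closing quote yet: take everything after the opening quote."
	ifFalse: [aString copyFrom: offset + 1 to: aString size].
grammarSource "an empty String while the literal is still unterminated"
```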

Change the matching API to the new Ohm one

i.e. no matchContents:startingFrom: etc.

Connected to #3

Write test to ensure Unicode is parseable

#replaceParametersWithArguments: creates unnecessarily deep copies

Almost all of a rule body is copied during #replaceParametersWithArguments:

Update escape sequences (list and means to escape...)

Different line breaks are an issue

https://github.com/harc/ohm/blob/master/doc/syntax-reference.md mentions:

Special characters (", \, and ') can be escaped with a backslash -- e.g., "\"" will match a literal quote character in the input stream. Other valid escape sequences include: \b (backspace), \f (form feed), \n (line feed), \r (carriage return), and \t (tab), as well as \x followed by 2 hex digits and \u followed by 4 hex digits, for matching characters by code point.
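
As a rough illustration of the list above, the following workspace sketch resolves the single-character escapes in a terminal's source text. It is not the Ohm/S implementation, and it deliberately leaves out the \x and \u forms.

```smalltalk
"Workspace sketch covering only the single-character escapes; \x and \u are left out."
| escapes unescape |
escapes := Dictionary new.
escapes
	at: $b put: Character backspace;
	at: $f put: (Character value: 12);
	at: $n put: Character lf;
	at: $r put: Character cr;
	at: $t put: Character tab.
unescape := [:source | | in |
	in := source readStream.
	String streamContents: [:out |
		[in atEnd] whileFalse: [ | c |
			c := in next.
			(c = $\ and: [in atEnd not])
				ifTrue: [ | escaped |
					escaped := in next.
					"Quote, apostrophe and backslash simply stand for themselves."
					out nextPut: (escapes at: escaped ifAbsent: [escaped])]
				ifFalse: [out nextPut: c]]]].
(unescape value: 'a\tb') = ('a' , (String with: Character tab) , 'b') "true"
```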

In terminalExpression a text instead of a string

When you compute the value for a _terminal node with memoization, the result is a text instead of a string.
Is this intended behavior?
You can see it in this screenshot:

The memoization is called in the OhmSynthesizedAttribute >> value:

Add parameterized rules

Check whether there are duplicate parameter names
Cache newly created body in application

Lexification operator '#' not supported

Lexification currently seems to not be supported. Trying to create a grammar using this operator fails, throwing an OhmMatchFailure.

OhmCheckStructure does not check rule name

I have the following set of rules:

integerConstant
	= decimalConstant
	| octalConstant

decimalConstant
	= nonzeroDigit digit*

octalConstant
	= "0" octalDigit*

octalDigit
	= "0".."7"

nonzeroDigit
	= "1".."9"

In a test case (subclass of OhmSyntaxTestCase) I want to make sure an integerConstant rule is parsed as the correct subrule as follows:

testGrammarParsesIntegerConstants

	startRule := #integerConstant.
	self shouldParse: '042' to: #(integerConstant (octalConstant '042'))

While this test passes as expected, I also noticed that it actually passes no matter what rule name I specify in the structure I supposedly assert against:

testGrammarParsesIntegerConstants

	startRule := #integerConstant.
	self shouldParse: '042' to: #(integerConstant (thisCanBeWhatever '042'))

It looks like the rule names are not compared on leaf levels:

Ohm-S/packages/Ohm-Core.package/OhmCheckStructure.class/instance/defaultExpression..st

Lines 6 to 13 in 98bf92f

	"Check for a value"
	(self structure size = 2 and: [self structure second isString])
		ifTrue: [ ^ aNode interval contents = self structure second].
	nonPrimitiveChildren := aNode children reject: [:n | n ruleName = OhmParsingExpression terminalRuleIdentifier].
	"Check whether substructure can be valid at all"
	((aNode ruleName = self structure first) and: [nonPrimitiveChildren size = (self structure size - 1)])
		ifFalse: [^ false].

Is this correct behavior? And if so, what would be the expected way to write this kind of test?
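
For comparison, a stricter leaf check would compare the rule name in addition to the contents. The sketch below expresses that check as a standalone block over a rule name, the matched contents, and the expected structure. The block leafMatches is hypothetical, and whether OhmCheckStructure should behave this way is exactly the open question.

```smalltalk
"Workspace sketch. leafMatches is a hypothetical block, not part of OhmCheckStructure."
| leafMatches |
leafMatches := [:ruleName :contents :structure |
	(structure size = 2 and: [structure second isString])
		and: [ruleName = structure first
			and: [contents = structure second]]].
leafMatches value: #octalConstant value: '042' value: #(octalConstant '042').      "true"
leafMatches value: #octalConstant value: '042' value: #(thisCanBeWhatever '042').  "false"
```

With a check like this, the second test above would fail as expected.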

BinaryCascade in latest fix

In the latest commit from 07.07.2020 binary cascades are again matched as operand cascades.

We tried matching 1 + 2 negated; negated. In the commit from 06.07.2020 it was matched as binaryCascade; in the commit from 07.07.2020 it is matched as operandCascade from OhmExplicitSendsSmalltalk.

Fix potential infinite loop in Smalltalk grammar

For one thing, the current Ohm/JS version complains about the following rule in the Smalltalk grammar:

BlockLiteral = "[" BlockArguments? ExecutableCode? "]"

with the message:
"Nullable expression ExecutableCode is not allowed inside '?' (possible infinite loop)"

Since ExecutableCode can already be empty (via MoreExecutableCode and Statements), I removed the ? and it works.
I do not know, however, whether the Smalltalk implementation would cope with ExecutableCode?.
