otterkit / otterkit-cobol


A free and open source Standard COBOL compiler for 64-bit environments

Home Page: https://otterkit.com

License: Apache License 2.0

C# 40.25% HTML 0.10% C 59.66%

cobol compiler dotnet

otterkit-cobol's Introduction

Otterkit Toolchain

Contributing

If you have any questions, please feel free to reach out by opening an issue or a new discussion post. We're happy to welcome new contributors!

Please keep in mind that the repository is licensed under Apache-2.0. All contributions (unless otherwise specified) will be made available under the same terms (Apache-2.0, Submission of Contributions).

Copyright

Unless otherwise specified, all files in this repository (except the Otterkit logo) are licensed under the Apache-2.0 license.

See LICENSE for the full license.

See NOTICE for third-party software licenses.

otterkit-cobol's People

gabrielesilinic nicolashouriet pinkuburu mggates39 muayadal robertartigas yutaro-sakamoto tudor44 kjarrio reburninator kant2002 drozendaal vilirocha

otterkit-cobol's Issues

[✨]: Congratulations for the work

Not a feature request at all,
just a thanks for the work being done! 👏👏👏

[Core]: Defining an Embedded COBOL subset as a C replacement for systems programming (and why strict aliasing is broken)

Let's start with the problem that I'm having (and attempting to solve). While trying to figure out some memory related details in our runtime, I realized that some COBOL features are impossible to implement in standard conforming C without invoking undefined behavior. More specifically, according to the rules in the C standard, it's not possible to implement COBOL's REDEFINES (unions) in C without invoking undefined behavior, and by extension most kinds of type punning are also prohibited.

I'll clarify what I mean by this, as this might also come as a surprise to other C developers. Both the C99 and C11 standards contain a footnote saying the following:

If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

Which at first makes it seem like C allows the kind of type punning we need to make COBOL's REDEFINES work. However, footnotes are specified in the foreword as non-normative:

In accordance with Part 3 of the ISO/IEC Directives, this foreword, the introduction, notes, footnotes, and examples are also for information only.

Meaning that footnotes can't define normative behavior and should only clarify the existing normative text with additional information. No normative text exists in the C standard that specifies the kind of type punning described in the footnote. In fact, we have sections in the standard that contradict what is said in the footnote:

The value of at most one of the members can be stored in a union object at any time. . . .

If the value of only one member can be stored in a union, then the value of the other members is non-existent (and reading from them would be UB). Nothing stops Clang and GCC from optimizing (currently or in the future) based on the assumption that union members, other than the very first written to, are in fact non-existent, so we risk undefined behavior here.

The other issue is that, even if we consider the footnote as normative (and resolve the conflicts), C's strict aliasing rules (in 6.5) would still prohibit us from accessing the same memory location as objects of different types:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.

So we have to assume that accessing (reading from or writing to) the same memory location using different types is at best implementation-defined (-fno-strict-aliasing), and at worst undefined behavior. I'm not the first one to have issues with how the standard defines these rules: as it turns out, the Linux kernel is compiled with -fno-strict-aliasing for this same reason (and generates a ton of type punning warnings when the flag is turned off).
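For contrast, the last bullet of the quoted list (a character type) is the one access path the rules always allow. A minimal sketch of that permitted case follows; the helper names `byte_of_u32` and `sum_bytes_u32` are hypothetical and only illustrate inspecting an object's bytes through `unsigned char`. It does not give two differently typed names for the same storage, which is what REDEFINES needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reads byte `index` of a 32-bit object.
   Access through unsigned char is allowed by the "character type"
   bullet of the 6.5 list quoted above. */
static unsigned char byte_of_u32(const uint32_t *value, size_t index)
{
    assert(index < sizeof *value);
    const unsigned char *bytes = (const unsigned char *)value;
    return bytes[index];
}

/* Sums all bytes of the value; the sum does not depend on byte order. */
static unsigned sum_bytes_u32(uint32_t value)
{
    unsigned sum = 0;
    for (size_t i = 0; i < sizeof value; i++)
        sum += byte_of_u32(&value, i);
    return sum;
}
```

Reading a `uint32_t` through a `float *` instead would fall outside the quoted list, which is the case the rest of this issue is concerned with.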

This strict aliasing issue is, of course, not only quite awful for any low-level systems programming, but also really bad for us because Standard COBOL does not have the same strict aliasing restrictions, and in fact, the ability to access the same memory location as different types is required in order to make REDEFINES (unions) work as specified by the standard.

As described in section 13.18.44 of the COBOL23 standard:

The REDEFINES clause allows the same computer storage area to be described by different data description entries.

And again a little further down, in the general rules for the clause:

When the same storage area is defined by more than one data description entry, the data-name associated with any of those data description entries may be used to reference that storage area.

Being able to access the same memory location as different types is a requirement for Standard COBOL, and is at the same time a violation of Standard C's strict aliasing rules. So we can conclude that C as it currently is, cannot be used to implement COBOL's REDEFINES without invoking undefined or implementation-defined behavior (and by extension, C# unions have the same issue).

Some people have suggested using memcpy as a workaround to make type punning work in C, but this has several issues, one being that in order for this to work we need the compiler to be able to recognize the use of memcpy for type punning and optimize it out (not always possible, and risks more UB). If for some reason the call to memcpy is not optimized away and the call is actually executed then we'd be calling memcpy on overlapping memory, which is undefined behavior.
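For reference, the memcpy workaround usually looks something like the sketch below. It assumes `float` and `uint32_t` have the same size (checked at compile time), and the function names are hypothetical. Note that it copies between two distinct objects, so it only shows the copy-based idiom; it does not model two data items that share one storage area.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide on the target platform. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Copies the object representation of a float into a separate
   uint32_t object. Source and destination are distinct objects. */
static uint32_t float_to_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* The reverse direction: rebuilds a float from its stored bits. */
static float bits_to_float(uint32_t bits)
{
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Whether the copy is optimized away is up to the compiler, which is the first concern raised above.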

Also, using a memcpy-like function for type punning can cause undefined behavior in COBOL. As described in section 14.6.10:

When the data items referenced by a sending and a receiving operand in any statement are identified as sharing either a part of or all of their storage areas, and the rules for the statement do not provide for a specific result in the following circumstances, then:

When the data items are not described by the same data description entry, the result of the statement is undefined.

I'd rather not have to deal with conflicting UB on both sides and strict aliasing weirdness, so I'm proposing a subset of Standard COBOL, that I'll be calling "Embedded COBOL", which consists only of language features that map as close to one-to-one as possible with freestanding standard C.

Please note that this won't be another dialect that Otterkit will directly support in addition to Standard COBOL, but rather a subset of Standard COBOL features that we'll be prioritizing (implementing first) in order to replace our existing C code as soon as possible.

I'll update this issue soon to start adding the C to COBOL feature mappings, let me know if anyone has any feature suggestions from C (that map to COBOL) that I should add to the list.

Tagging both @gabrielesilinic, and @TriAttack238. I need your feedback on this.

Also tagging @GitMensch. I don't think COBOL has any restrictions that would make this impossible, let me know if I'm wrong on this. Also, feel free to add any C features I should include that map directly to COBOL.

[Core dev]: Standard COBOL manual memory allocation, pointers, and address of variables

Current Issue:

Standard COBOL defines statements for manual memory allocation (ALLOCATE and FREE), a data item clause where those can be used (BASED), and data item pointers / addresses (ADDRESS OF identifier).

C# itself does not like giving out pointers to GC allocated memory, because there's no guarantee that it won't be moved around by the GC eventually. Keeping all variables pinned is also not ideal (potentially very bad actually).

This means that we realistically can't get the address of a .NET heap allocated object. This is needed though, because Standard COBOL allows users to get the address of a working-storage data item, which is always a static heap allocation.

Possible Solutions:

Call C malloc and free.
Call Rust's equivalent of malloc and free.
A C or Rust implementation of a "memory pool" (Native equivalent of an ArrayPool<T>).
Custom stack-based memory allocator (outside the actual stack).
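As a rough illustration of the last option, here is a minimal sketch of a stack-style (bump) allocator that lives in a malloc'd region rather than on the call stack. All names (`cob_stack`, `cob_stack_alloc`, and so on) are hypothetical and not part of Otterkit; the point is only that addresses handed out stay fixed until released, unlike GC-managed memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    unsigned char *base;  /* start of the malloc'd region */
    size_t capacity;      /* total size of the region in bytes */
    size_t top;           /* offset of the next free byte */
} cob_stack;

/* Returns 1 on success, 0 if the backing allocation failed. */
static int cob_stack_init(cob_stack *s, size_t capacity)
{
    s->base = malloc(capacity);
    s->capacity = capacity;
    s->top = 0;
    return s->base != NULL;
}

/* Returns `size` bytes aligned to `align` (a power of two),
   or NULL when the region is exhausted. */
static void *cob_stack_alloc(cob_stack *s, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t start = (uintptr_t)s->base + s->top;
    uintptr_t aligned = (start + (align - 1)) & ~(uintptr_t)(align - 1);
    size_t offset = (size_t)(aligned - (uintptr_t)s->base);
    if (offset > s->capacity || size > s->capacity - offset)
        return NULL;
    s->top = offset + size;
    return s->base + offset;
}

/* A mark records the current top so later allocations can be undone. */
static size_t cob_stack_mark(const cob_stack *s) { return s->top; }

/* Frees everything allocated after `mark` in one step. */
static void cob_stack_release(cob_stack *s, size_t mark)
{
    assert(mark <= s->top);
    s->top = mark;
}

static void cob_stack_destroy(cob_stack *s)
{
    free(s->base);
    s->base = NULL;
    s->capacity = s->top = 0;
}
```

A caller would take a mark on entry to a source unit, allocate its BASED or local items, and release back to the mark on exit. Individual FREE of items in the middle of the stack is not handled by this sketch.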

[🐛]: Error compiling hello world

Describe the bug
Attempting to compile a "Hello World" style program to see what the codegen would look like.

To Reproduce
Created a file main.cob with the contents

IDENTIFICATION DIVISION.
PROGRAM-ID. IDSAMPLE.
ENVIRONMENT DIVISION.
PROCEDURE DIVISION.
    DISPLAY 'HELLO WORLD'.
    STOP RUN.

Then ran otterkit build main.cob

Expected behavior
Compile succeeded

Actual behavior
Compiler crashed

> otterkit  build .\main.cob
Otterkit parsing error: C:\Users\jaredpar\temp\cobol\main.cob:1:0
Unexpected token: Missing source unit ID name (PROGRAM-ID, FUNCTION-ID, CLASS-ID...), the identification division header is optional but every source unit must still have an ID.

     |
   1 | PROGRAM-ID. IDSAMPLE.
     |        ~

Attempting recovery: Unexpected tokens will be ignored until a separator period or an anchor point is found
(Anchors: OPTIONS, ENVIRONMENT, DATA, PROCEDURE)

Otterkit parsing recovery: C:\Users\jaredpar\temp\cobol\main.cob:1:3
Parser recovered at the following anchor point: (Anchor: ".")

     |
   1 | PROGRAM-ID. IDSAMPLE.
     |           ~

Unhandled exception. System.InvalidOperationException: Stack empty.
   at System.Collections.Generic.Stack`1.ThrowForEmptyStack()
   at System.Collections.Generic.Stack`1.Peek()
   at Otterkit.Analyzer.<Analyze>g__IDENTIFICATION|7_1() in /Users/ktlsf/Documents/GitHub/otterkit/src/OtterkitAnalyzer.cs:line 198
   at Otterkit.Analyzer.<Analyze>g__Source|7_0() in /Users/ktlsf/Documents/GitHub/otterkit/src/OtterkitAnalyzer.cs:line 116
   at Otterkit.Analyzer.Analyze(List`1 tokenList, String fileName) in /Users/ktlsf/Documents/GitHub/otterkit/src/OtterkitAnalyzer.cs:line 101
   at Otterkit.OtterkitCompiler.CommandLineArguments(String[] args) in /Users/ktlsf/Documents/GitHub/otterkit/src/OtterkitCompiler.cs:line 153
   at Otterkit.OtterkitCompiler.Main(String[] args) in /Users/ktlsf/Documents/GitHub/otterkit/src/OtterkitCompiler.cs:line 39

Platform Information (please complete the following information):

OS: Windows
OS Version 11
CPU Architecture x64
Otterkit version: 1.0.15-alpha

Additional context
I have effectively zero experience with COBOL. This could very well be a simple PEBKAC problem.

Mostly just curious about what the code gen strategy was and wanted to take a look at it.

[Core Dev] External data items implementation

Opening a new issue to discuss the implementation of external data items a little better.

Need to figure out how to resolve an external data item's storage.

[Core Dev]: Dynamic binding and the External Repository.

The Standard COBOL External Repository

Standard COBOL's External Repository contains all information required for activating (calling/invoking) programs, functions, or methods and for checking conformance (Checking their signature, in case dynamic binding is required).

The information contained in the repository is:

The name of the source unit.
The type of the source unit (Whether it is a program, function, class, or an interface).
A description of its parameters (Their type declarations, whether they are optional or not, and if they are passed by value or by reference. Classes only: generics and inheritance).
A description of the returning data item (and its type declaration).
The exceptions that could be raised in the source unit.
The object properties and methods (Classes only: the previous items also apply when storing its methods).
Extra information such as: if DECIMAL-POINT IS COMMA clause is specified, currency symbols, and locale information.
Also, "Any other information that the implementation requires".

The Standard requires us to "provide a mechanism that allows the user to specify whether to update the external repository when a compilation unit is compiled".

From this we can tell that the external repository works like a local manifest which contains all program, function, class, method and interface signatures that belong to a given project. It can also be used during runtime when the signature of a program can't be statically checked during compile time (Raises an exception if it fails to find the signature at runtime).

If updating the external repository is turned off by the user then any kind of runtime signature check will fail if the signature was not added to it before the user turned it off.

*> External repository is global to each project
*> It cannot contain duplicate or conflicting program, function and class names


           ------------- Project dependency signatures
           |
           ˅
External Repository  <------- User project signatures
           ˄
           |
           ------------- Extra library signatures

The requirement to allow the user to update the repository tells us that it either has to be a local manifest file (probably too slow), or a compile time source-generated data structure populated with all signatures from a project (Including dependencies), with a compiler option to not update it.

The problem with dynamic binding and the External Repository

There's a problem with storing all of this information in a single place: there's no way to store and return different delegate types from a single data structure that allows lookup by a string value, like a dictionary (I might be wrong about this).

I thought about storing the types themselves with System.Type, but that would require heavy use of reflection to get them back and call the procedure method from it.

// Dictionaries *have* to contain the type of delegate it will store
// But we can't store just a single *kind* of signature
var dict = new Dictionary<string, Func<ICOBOLType, ICOBOLType>>();

// Something like this would not be possible (I think):
var signatures = new Dictionary<string, Signature<type or void, any params...>>();

A source-generated static class with different fields or properties to store those might work, but only if it still allows lookup with an arbitrary user-defined string value without causing a compiler error (I'm not sure what exactly that would look like).

*> This is because COBOL allows getting a program pointer from a string:
set ptr-name to address of program str-var.

*> The value of str-var can't be statically checked during 
*> compile time if it is obtained from user input:
accept str-var from standard-input.

*> The name of the program will be checked during runtime
*> and allows calling it afterwards, which will cause another
*> runtime check for its parameters and return types:
call ptr-name using ... returning ... .

Dynamic binding like this can't be done without reflection in C# as far as I know.

// This is exactly what we need, but it requires reflection:
Type.GetType("TypeName").GetMethod("MethodName");

Information about C# types and signatures might be available in the compiled assemblies or during compilation, but I'm not sure how to access those with Otterkit without using reflection (and how to use those during runtime to call methods).

We're trying to avoid the use of reflection in order to support NativeAOT, with both the compiler and the generated C# code. Which probably means that this will have to be source-generated, ideally without causing a C# compiler error or fatal runtime exception when the program or function is not found.
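To make the "source-generated table" idea concrete, here is a small sketch of a string-keyed signature table with a runtime check that reports a status instead of crashing. It is written in C rather than C# and is purely illustrative: the uniform entry-point shape, the names (`cob_signature`, `repo_find`, `repo_call`) and the status codes are all assumptions, not Otterkit's design. The table is written by hand here, where a compiler would generate it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: every program shares one entry-point shape, so a single
   table can hold them all; argv carries the USING arguments. */
typedef int (*cob_entry)(int argc, void **argv);

typedef struct {
    const char *name;   /* externalized program name */
    int param_count;    /* number of USING parameters */
    cob_entry entry;    /* address of the program's code */
} cob_signature;

static int prog_hello(int argc, void **argv)
{
    (void)argc; (void)argv;
    return 0;
}

static int prog_add(int argc, void **argv)
{
    (void)argc;
    return *(int *)argv[0] + *(int *)argv[1];
}

/* Hand-written stand-in for a generated repository table. */
static const cob_signature repository[] = {
    { "HELLO",   0, prog_hello },
    { "ADD-TWO", 2, prog_add   },
};

enum { COB_OK = 0, COB_NOT_FOUND = -1, COB_BAD_ARGS = -2 };

/* Looks a program up by a name that is only known at run time. */
static const cob_signature *repo_find(const char *name)
{
    for (size_t i = 0; i < sizeof repository / sizeof repository[0]; i++)
        if (strcmp(repository[i].name, name) == 0)
            return &repository[i];
    return NULL;
}

/* Checks the signature before calling; failures become status codes
   that a runtime could turn into COBOL exception conditions. */
static int repo_call(const char *name, int argc, void **argv, int *result)
{
    const cob_signature *sig = repo_find(name);
    if (sig == NULL)
        return COB_NOT_FOUND;
    if (sig->param_count != argc)
        return COB_BAD_ARGS;
    *result = sig->entry(argc, argv);
    return COB_OK;
}
```

The sketch only checks the parameter count; a real signature check would also compare parameter types, passing conventions and the returning item.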

We might be able to do this with the Dynamic type, but we need to benchmark how much slower that would be and if it is compatible with NativeAOT. It might have a big performance impact if every program or function pointer has to use the dynamic type to hold their references.

If all else fails, we might have to look into emitting IL into a sort of custom assembly and try to bypass all C# checks, or use another "intermediate" language to hold these signatures and calls for us.

Community help and suggestions?

We'll leave this issue open for discussion and suggestions until we figure out the best way to implement and handle this functionality. We might need community help on this, from both the COBOL side and the C# side.

We need to add a few unit tests to ensure functionality is not broken by PRs

**Is your feature request related to a problem? **
As a first step towards CI/CD for every check-in, we need to establish some unit tests for each component here.

Describe the solution you'd like
I would like us to create a Tests folder with unit test projects, to catch newly introduced bugs and improve code quality.


[🐛]:

Describe the bug
otterkit new app doesn't work

To Reproduce
otterkit new app
This gives

C:>otterkit new app
No templates or subcommands found matching: 'otterkit-export'.

To list installed templates similar to 'otterkit-export', run:
   dotnet new list otterkit-export
To search for the templates on NuGet.org, run:
   dotnet new search otterkit-export

And those instructions don't work either, viz

C:>   dotnet new list otterkit-export
No templates found matching: 'otterkit-export'.

To search for the templates on NuGet.org, run:
   dotnet new search otterkit-export


For details on the exit code, refer to https://aka.ms/templating-exit-codes#103

C:>   dotnet new search otterkit-export
Searching for the templates...
Matches from template source: NuGet.org
No templates found matching: 'otterkit-export'.


For details on the exit code, refer to https://aka.ms/templating-exit-codes#103

Platform Information (please complete the following information):

OS: Windows
OS Version: 11
CPU Architecture: x64
Libotterkit Version:

>dotnet new install Otterkit.Templates::1.7.50
The following template packages will be installed:
   Otterkit.Templates::1.7.50

Success: Otterkit.Templates::1.7.50 installed the following templates:
Template Name               Short Name    Language  Tags
--------------------------  ------------  --------  -----------------
Otterkit COBOL Application  otterkit-app  [C#]      COBOL/Application
Otterkit COBOL Library      otterkit-lib  [C#]      COBOL/Library


C:>dotnet tool install --global Otterkit --version 1.0.80
Tool 'otterkit' is already installed.

[Docs]: V-ISA IR: Mapping the virtual instruction set architecture to other ISAs.

This issue is meant as documentation, showing the mapping between a virtual instruction set architecture (used as an intermediate representation) to other ISAs, and also explanations for its design choices, and its calling convention.

Virtual registers:

| Zero | Stack Pointers | General Registers | Vector Registers |
| --- | --- | --- | --- |
| `zero` | `frame, stack` | `[r0, r1, r2, ...]` | `[v0, v1, v2, ...]` |

Calling convention:

For a procedural callee:

The first six integer or pointer arguments are passed in registers r0 through r5, excess arguments are passed on the stack. The first eight floating-point and SIMD vector arguments are passed in registers v0 through v7, excess arguments are passed on the stack.
If the callee is described as having a return variable, then the address of a valid storage area of appropriate length and alignment for the variable's type must be passed on the stack immediately before the call (or equivalent) instruction.
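The register assignment rule for a procedural callee can be sketched as a small classifier. This is an illustrative model only: the type and function names are hypothetical, and the numbering of stack slots (0, 1, 2, ... in argument order) is an assumption, since the text above does not specify the order in which excess arguments are placed on the stack.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ARG_INTEGER, ARG_POINTER, ARG_FLOAT, ARG_VECTOR } arg_class;
typedef enum { LOC_GENERAL_REG, LOC_VECTOR_REG, LOC_STACK } loc_kind;

typedef struct {
    loc_kind kind;
    int index;  /* register number (r0.., v0..) or assumed stack slot */
} arg_location;

enum { GENERAL_ARG_REGS = 6, VECTOR_ARG_REGS = 8 };

/* Assigns each argument to r0-r5, v0-v7, or the stack, following the
   rule described above. `out` must have room for `count` entries. */
static void assign_arguments(const arg_class *args, size_t count,
                             arg_location *out)
{
    int next_general = 0, next_vector = 0, next_stack = 0;
    for (size_t i = 0; i < count; i++) {
        int is_vector = args[i] == ARG_FLOAT || args[i] == ARG_VECTOR;
        if (!is_vector && next_general < GENERAL_ARG_REGS)
            out[i] = (arg_location){ LOC_GENERAL_REG, next_general++ };
        else if (is_vector && next_vector < VECTOR_ARG_REGS)
            out[i] = (arg_location){ LOC_VECTOR_REG, next_vector++ };
        else
            out[i] = (arg_location){ LOC_STACK, next_stack++ };
    }
}
```

With seven integer arguments and one floating-point argument, the first six integers land in r0 through r5, the float in v0, and the seventh integer on the stack.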

For an object-oriented callee:

Notes:

Why an explicit zero register?

Without one, we'd need to take a register, zero it out, and use its value whenever we need a zero. This can cause pipeline stalls in some architectures (RISC-V and MIPS), because any operations that use a zero would then be dependent on a previous zeroing operation.

Having one also makes register allocation easier. The compiler doesn't need to identify a register that won't cause a stall when clobbered with zero, or spill into the stack whenever we need a zero and no registers are immediately available.

Any "compare against 0" operations (common in branching code) are potentially cheaper by not requiring a register zeroing operation beforehand. Any memory zeroing operations are also potentially cheaper by already having a 64-bit (8 bytes) zero immediately available in a register.

[🐛]: Tried the first program and complaining about

Describe the bug
biroj@88665a0182c9 demonewapp % otterkit build --run -e hello.cob --free
Analyzer Error [COB0085]: Missing source unit definition.
╭─/> [hello.cob:1:7]
│
1 │ program-id. hello.
│ ~~~~/> Expected a source unit id definition.
│
│ Note: The identification header is optional but every source unit must still have an ID.
────╯
Unhandled exception. System.InvalidOperationException: Stack empty.
   at System.Collections.Generic.Stack`1.ThrowForEmptyStack()
   at System.Collections.Generic.Stack`1.Peek()
   at Otterkit.Analyzers.Analyzer.Source() in C:\Users\KTSno\Documents\GitHub\otterkit\Otterkit.Analyzers\src\Analyzer.cs:line 54
   at Otterkit.Analyzers.Analyzer.Analyze(List`1 tokenList) in C:\Users\KTSno\Documents\GitHub\otterkit\Otterkit.Analyzers\src\Analyzer.cs:line 17
   at Otterkit.Otterkit.CommandLineArguments(String[] args) in C:\Users\KTSno\Documents\GitHub\otterkit\src\Otterkit.cs:line 115
   at Otterkit.Otterkit.Main(String[] args) in C:\Users\KTSno\Documents\GitHub\otterkit\src\Otterkit.cs:line 27
zsh: abort otterkit build --run -e hello.cob --free
To Reproduce
https://gist.github.com/birojnayak/6b939399336ac80da8e50a47e58ae8de <= CobolFile

ran this command
otterkit build --run -e hello.cob --free

Expected behavior
Expecting CS file


Platform Information (please complete the following information):

OS: Mac
OS Version [e.g. Ubuntu 22.04]
CPU Architecture [e.g. x64]
Libotterkit Version 1.0.70


[Core Dev] Ordering of the Data Division clauses

Alright so, while implementing the "IS EXTERNAL" clause for the data items I got a bit confused about the ordering of these clauses, mostly because IBM, Micro Focus and GnuCOBOL let you order those randomly (with some exceptions).

For example, all three implementations allow you to write the value clause before the picture clause, and the usage clause after the value clause.

But the standard defines it differently: it gives a very specific ordering for those clauses (picture > usage > value). The ordering is implied by format 1 of the data description entry, where the COBOL metalanguage does not allow the clauses to be reordered. Compare this with the metalanguage used for "ON EXCEPTION" and "NOT ON EXCEPTION" (on the call statement), which does allow the user to choose which one to write first.

One of the main goals of Otterkit COBOL is to follow the standard as strictly as possible, to avoid accidentally creating yet another dialect. One issue with that, it seems, is that Otterkit would be breaking compatibility with every other implementation.

I'm still in favor of strictly conforming to the standard though, rather than doing my own thing. I'm opening this issue in hopes to get feedback on this.

[✨]: Include NIST85 tests - adjusted

I see that this compiler is mostly targeted at "modern standard COBOL", which I guess would be the 2022 standard.
This means that, in general, much COBOL85 code will not be accepted. Still, it is likely reasonable to either "support enough" to compile and test NIST85 code, or at least do the following:

get GnuCOBOL 3.2 preview or later
build it from source
run make test -j8 COBOL_FLAGS=--save-temps
try to compile the generated ".i" files with otterkit: those have all copybooks included, all comments removed, all line continuations replaced, and the code effectively in free format

Either add "enough" COBOL85 like the comment paragraphs (AUTHOR. and friends) or remove them with sed or by hand.
Ignore others like the ones using ALTER.

Aim for the NC module first, adding everything that does not compile to your own testsuite, and also check the execution results there. Compile the programs and add the features they need one by one.
This will get you many failures on the first modules - but very reasonable ones, and much less with each follow-up.

And in the end this will get you many of the real world usages of COBOL and some special cases, too.

[Core]: Rethinking how our COBOL backend should be structured (or, what went wrong with our backend).

As you might have noticed by our last commit being two months ago, we've been in a bit of a hiatus. The reason being that our C# backend most likely won't work as I originally thought it would, so I reserved these past two months to study assembly more closely and to look for alternative solutions that could help us continue working on the backend. I have a couple of ideas, but they're not completely refined yet. I'll document it once I finish refining the full picture of the new backend architecture.

While we still want to support interoperability with it, our backend will most likely not generate C# anymore due to several mapping issues that I'll explain below.

TL;DR COBOL doesn't map particularly well to current Algol-like languages.

After staring at both COBOL and assembly code for the past two months, I've noticed that it seems to map extremely well to assembly though. In fact, COBOL appears to map directly to assembly much better than C ever did (the PDP-11 might be the only exception).

If you stare at both side by side long enough, you start noticing that the separation of the data and procedure divisions is not just for syntactic purposes, but it's actually a clear separation of the data and text segments in machine code. In virtual memory, one has read/write permissions, and the other has read/execute permissions, because of this they can never be placed together (overlap) in memory, and this separation is made explicit in COBOL's syntax.

This also carries over to COBOL's object oriented syntax. Though in this case, there's instead a clear syntactic separation between the stack, the heap, and the code segment. An object method cannot contain a working-storage section (the heap), and the object itself cannot contain a local-storage section (the stack), and the code itself is separate from both. Even though it supports OOP concepts, it still maps surprisingly well to assembly, certainly much better than other object oriented languages.

The statement based syntax also appears to be somewhat similar to assembly mnemonics in the way they are written (similar to: verb operand operand ...), it can be argued that COBOL statements make a sort of high-level assembly language. This becomes clear when you realize that user-defined paragraphs and sections in the procedure division are just assembly labels, and in fact they have the same behavior as a label with a conditional jump at the end to determine whether to fall through to the next label or return back to the perform statement that called it. This "fall through" behavior combined with the ability to call a label as if it was a parameterless function, is not easily mapped to other high-level languages, but it maps extremely well to any modern assembly language.
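The "label plus conditional jump" behavior described here can be modelled in a few lines. The sketch below is only a model of the semantics, with hypothetical paragraph names and a trace string standing in for real statements; it is not a claim about how a backend should emit code, and it ignores nested PERFORM ranges. Each case label plays the role of a paragraph, and the check after each body decides whether to return to the PERFORM or fall through to the next label.

```c
#include <assert.h>
#include <string.h>

enum paragraph { PARA_A, PARA_B, PARA_C };

/* Runs paragraphs starting at `first`. After each paragraph body, the
   comparison against `last` acts as the conditional jump: return to
   the PERFORM that started the range, or fall through to the next
   label. `trace` must be large enough for the appended letters. */
static void perform_thru(enum paragraph first, enum paragraph last,
                         char *trace)
{
    switch (first) {
    case PARA_A:
        strcat(trace, "A");
        if (last == PARA_A) return;
        /* fall through */
    case PARA_B:
        strcat(trace, "B");
        if (last == PARA_B) return;
        /* fall through */
    case PARA_C:
        strcat(trace, "C");
        if (last == PARA_C) return;
        /* fall through */
    default:
        return;
    }
}
```

Here `perform_thru(PARA_B, PARA_B, trace)` corresponds to performing a single paragraph, while `perform_thru(PARA_A, PARA_C, trace)` corresponds to a range that falls through all three.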

Realizing this was an absolutely enlightening moment. The whole reason why everything is so neatly separated in COBOL, from the divisions to the statements, in both procedural and object oriented code, appears to be so that it can still map well enough to some abstract assembly language. The same can't be said for Algol-like languages. They don't map that well directly to assembly anymore, on any modern architecture.

That's the problem right there, the reason why we've been having so much trouble with the backend trying to map COBOL to C# or C is because we're essentially trying to do the job of a disassembler. We're taking an assembly-like language and trying to turn it into an Algol-like language, we're doing the opposite of what we should be doing. COBOL is essentially a portable low-level assembly-like language, it's not Algol-like, it's not Pascal-like, it's not ML-like, there's nothing else like it. The English-like syntax just makes it much easier to read, remove all of the extra words in most of the statements and the final result will look much like an assembly mnemonic.

It also gives users additional features that are arguably lower level than even what C allows you to do without extra library support. Like the ability to override how a symbol is to be exported to the object file. To align a bit field to the first bit of the first available byte boundary (aligned clause). To align a struct in such a way that all items are synchronized to the left or to the right of a natural boundary, changing where the padding bytes will be (synchronized clause). To change the justification of a bit field, effectively switching its entire in memory bit ordering (justified clause). To change the in-memory endianness of... IEEE 754 binary floating-point variables??? There are more of these useful low level features, and that includes raw pointers as well!

These are not the features nor the syntax of a high-level language nowadays; they give some extreme flexibility over how the bits and bytes are stored both in memory and in the object file. We probably shouldn't be lifting a low-level (arguably extremely) assembly-like language like COBOL to a higher level Algol-like language. It just doesn't map as cleanly as I thought it would when I started the project; I messed up to astronomical proportions. I'll be looking for alternative solutions that don't require depending on the behavior of other languages.

Some immediate problems with trying to map COBOL to some Algol-like languages:

C is... an interesting one, here are some of the issues with it.
- With strict aliasing we can't implement the redefines clause, @GitMensch and I have tested this on another GitHub issue. We need to be able to address the same memory location as different incompatible types, the C standard seems to disagree.
- Without control over how symbols are exported we can't implement externalized names the way the standard requires, especially for overloaded methods. C compilers will sometimes mess with the symbols, there's no way for the user to (portably) override them on an individual source unit basis, and also no way to make them (portably) case insensitive.
- We can't change the endianness of individual floating-point variables at compile time, so we can't implement the endianness switch in some of the usage clause formats. Using a char array to encode the endianness would break with strict aliasing.
- It doesn't support portable bit fields which are required for Boolean variables (different from Algol-like bools), it's all implementation-specific behavior at best with no control over padding.
- It doesn't allow us to specify the exact layout of structs down to the padding bytes, which are required for both the aligned (for bit fields) and synchronized clauses. Might be possible using some horrible spaghetti, but do we really want that, will the spaghetti be portable?
- It doesn't allow us to specify where things should be stored in the final object file. It would be needed for us to optimize the external repository, and to not have it be dropped into an arbitrary place in the data segment (or worse, in the code segment) that can only be found with dlsym or GetProcAddress, both of which are OS dependent.
- Finally, buffer-related APIs in C are usually null terminated, but COBOL buffers can't be null terminated due to reference modification and, for example, the ability to access both the alphanumeric elementary items individually from a group item, or the entire group as a single contiguous alphanumeric item. There's also null mentions of a null terminator in the COBOL standard.
- And a couple more issues but I'll stop here before it gets too long.
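To illustrate the buffer point above, here is a minimal sketch of what fixed-length, space-padded COBOL items look like from C. All names (move_alnum, refmod, the CUSTOMER layout) are hypothetical and only for illustration; this is not how the Otterkit runtime does it. Every length is passed explicitly because nothing in the buffer marks where an item ends.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout of the COBOL group item
 *   01 CUSTOMER.
 *      05 FIRST-NAME PIC X(5).
 *      05 LAST-NAME  PIC X(7).
 * The group is 12 contiguous bytes with no terminator anywhere. */
enum {
    FIRST_NAME_OFF = 0, FIRST_NAME_LEN = 5,
    LAST_NAME_OFF  = 5, LAST_NAME_LEN  = 7,
    CUSTOMER_LEN   = 12
};

/* Alphanumeric MOVE: copy left-justified, truncate on the right,
 * pad the remainder of the receiving field with spaces. */
void move_alnum(char *field, size_t field_len, const char *src, size_t src_len)
{
    assert(field != NULL && src != NULL);
    size_t n = src_len < field_len ? src_len : field_len;
    memcpy(field, src, n);
    memset(field + n, ' ', field_len - n);
}

/* Reference modification item(start:length), with a 1-based start as in COBOL.
 * Copies the slice into out and returns 0, or returns -1 if it is out of range. */
int refmod(const char *item, size_t item_len, size_t start, size_t length, char *out)
{
    assert(item != NULL && out != NULL);
    if (start < 1 || length < 1 || length > item_len || start - 1 > item_len - length)
        return -1;
    memcpy(out, item + (start - 1), length);
    return 0;
}
```

With FIRST-NAME set to "ADA" and LAST-NAME set to "LOVELACE", the group reads as the 12 bytes "ADA  LOVELAC", and CUSTOMER(4:5) spans both elementary items. A null terminator after either field would corrupt the group view, which is why the usual strlen-style APIs don't fit.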

C# is just too high-level to make it a truly useful target for Standard COBOL without also having to sacrifice most of its low-level features (like managed COBOL implementations would do), or having to P/Invoke assembly functions just to make everything conform to the standard. It also has all the issues pointed out above (minus strict aliasing).

Handling end of line in case the variable declaration is numeric

Describe the bug


01 Total    PIC 99 value 0.

When the value is written as 0. at the end of the line, the parser should recognize the period as the end of the entry and take the initial value to be just 0, instead of failing.
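One way a lexer could tell the two roles of the period apart is sketched below. This is a hypothetical C helper (scan_numeric_literal is not an Otterkit function), and it assumes the default decimal point rather than DECIMAL-POINT IS COMMA. It relies on the rule that a decimal point may not be the rightmost character of a numeric literal, so a period with no digit after it is treated as the separator.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns how many characters at the start of s form a numeric literal,
 * or 0 if s does not start with one. A period is consumed only when a
 * digit follows it, so the final period in "0." is left for the parser
 * to handle as the end of the entry. */
size_t scan_numeric_literal(const char *s)
{
    assert(s != NULL);
    size_t i = 0;
    size_t digits = 0;

    /* Optional leading sign. */
    if (s[i] == '+' || s[i] == '-')
        i++;

    /* Integer part. */
    while (isdigit((unsigned char)s[i])) {
        i++;
        digits++;
    }

    /* Fractional part: only when the period is followed by a digit. */
    if (s[i] == '.' && isdigit((unsigned char)s[i + 1])) {
        i++;
        while (isdigit((unsigned char)s[i])) {
            i++;
            digits++;
        }
    }

    return digits > 0 ? i : 0;
}
```

For the declaration above, the scanner consumes only the 0 and leaves the period behind, while in a value such as 0.5 the period is part of the literal.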

To Reproduce
https://gist.github.com/birojnayak/4bd0d4332d98170f74803b3e217ca4f8

Expected behavior
The parser shouldn't throw an exception.

[✨]: Ideas for a Standard Library

The COBOL Standard Idea

While modern COBOL has made great strides from the 1985 standard that still colors public perception, there are still a few holes where the language does not quite match up to modern standards and expectations. Therefore, there is the idea for a standard library written in 2023 Standard COBOL that is attached to the compiler. However, we need ideas, which is where this issue thread comes in.

How to format ideas.

If you have an idea for an addition to the standard library, your proposal needs three important aspects:

What is this feature? Are there any examples from other languages that implement this feature well in their standard libraries or elsewhere? What use cases does it serve? Should this feature be in the standard library (i.e., would it benefit a majority of projects/developers)?
How exactly would it work? You don't need to write a full college essay, but some clarification on expected behavior and maybe internals would be great. How would it fit in with the rest of the standard library and COBOL's language philosophy as a whole?
How hard would it be to implement? Everything takes work and time, and it's important to think about the resources it takes to implement a feature before going ahead with it. For example, while true randomness would be cool, maintaining a constant server connection to a cosmic ray detector or a lava lamp camera is impractical, to say the least.

However, that is not usually the end. Other commenters can discuss and suggest modifications to the idea as they wish, within reason.

The First Idea

This is not only a legitimate proposal for the standard library, but an example for future proposals.

Idea: Dynamic Data Structures.

COBOL's tables work just as well as arrays in other languages, as structures of predefined length for variables of the same type. However, nearly every programming language in the modern day also has Dynamic Data Structures: structures that order data but do not have a set length. These are perfect for handling an unknown number of elements, which is quite a common need. I propose to add not one but two dynamic data structures to the standard library: Vectors (Dynamic Arrays) and Linked Lists.
Both structures will be classes in the standard library that generate mutable objects. These objects will be able to store virtually any number of values of the same type. Structures will also support concatenation with tables and their own type (as in type of structure and type of value), along with stack-like properties (more on that later). These structures can also hold other structures, such as a vector holding another vector or a linked list. Assume all methods are for the instance of the respective class unless told otherwise. First, Vectors:

Vectors are dynamic arrays, which means that under the hood vector objects contain a table, and that table is swapped out for a smaller or larger one when needed. According to this YouTube video, these tables should grow or shrink by a factor of 2, starting from the initial size of the internal table (which is 1 by default), for efficient resizing. So if the internal table is 4 items long and completely filled, a new table 8 items long will be created and all of the values of the old table will be carried over. Conversely, if an internal table 16 items long holds 8 items and 1 item is removed, it is now at less than one-half capacity and the new internal table will be 8 items long. Unlike linked lists, vectors will support indexing, slicing (creating a smaller vector with an indexed subset of elements), and insertion and deletion by index.
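The resizing policy described above can be sketched in C as follows. This is only an illustration of the doubling and halving rule, not a proposed implementation: the names (IntVector, vec_push, vec_pop) are hypothetical, the element type is fixed to int instead of being generic, and realloc stands in for swapping the internal table.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical vector of ints; count is the number of stored items,
 * capacity is the length of the internal table. */
typedef struct {
    int *items;
    size_t count;
    size_t capacity;
} IntVector;

/* Swap the internal table for one of new_capacity items. */
static int vec_resize(IntVector *v, size_t new_capacity)
{
    int *p = realloc(v->items, new_capacity * sizeof *p);
    if (p == NULL)
        return -1;              /* old table is still valid */
    v->items = p;
    v->capacity = new_capacity;
    return 0;
}

/* Start with an internal table of 1 item, as in the proposal. */
int vec_init(IntVector *v)
{
    assert(v != NULL);
    v->items = NULL;
    v->count = 0;
    v->capacity = 0;
    return vec_resize(v, 1);
}

/* Append a value, doubling the table when it is completely filled. */
int vec_push(IntVector *v, int value)
{
    assert(v != NULL);
    if (v->count == v->capacity && vec_resize(v, v->capacity * 2) != 0)
        return -1;
    v->items[v->count++] = value;
    return 0;
}

/* Remove the last value, halving the table once it drops below half capacity. */
int vec_pop(IntVector *v, int *out)
{
    assert(v != NULL && out != NULL);
    if (v->count == 0)
        return -1;
    *out = v->items[--v->count];
    if (v->capacity > 1 && v->count < v->capacity / 2)
        (void)vec_resize(v, v->capacity / 2);   /* a failed shrink is harmless */
    return 0;
}

void vec_free(IntVector *v)
{
    assert(v != NULL);
    free(v->items);
    v->items = NULL;
    v->count = 0;
    v->capacity = 0;
}
```

One design note: shrinking as soon as the table is less than half full means a push and pop sequence right at the boundary can resize on every call. Many implementations wait until the table is only a quarter full before halving, which might be worth considering here.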
Linked lists are composed of object nodes that reference each other. I am specifically talking about Doubly-Linked Lists, where nodes can access both the previous node and the next node. However, a reference to the list always starts at the first (Head) node, and the following nodes must be traversed with a next() or prev() method on each node.

While it is technically possible for each node of a linked list to have a different type, in my proposal each vector and linked list will only hold one type of value using COBOL's generics system. Whether this includes interfaces or just regular types and classes I am not sure.

These structures will also have methods that consider the structure as a whole, such as a static equals() that checks if two structures are of the same type and values, and a contains() method to check if a certain value exists in a structure.

These structures will also implement the stack interface, which will allow users to use the push() method to add a value to the end of the structure, and a pop() method to remove a value from the end of the structure and return the value. The linked list structure can also push and pop at the beginning.
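As a rough sketch of the doubly-linked list with push and pop at both ends, something like the following C code would do. The names (Node, IntList, list_push_back and so on) are hypothetical, the element type is fixed to int, and manual malloc/free stands in for the object allocation an actual COBOL class would use.

```c
#include <assert.h>
#include <stdlib.h>

/* Each node references both of its neighbours. */
typedef struct Node {
    int value;
    struct Node *prev;
    struct Node *next;
} Node;

/* The list keeps the Head node and, to make pushes at the end cheap, the tail. */
typedef struct {
    Node *head;
    Node *tail;
} IntList;

/* push(): add a value at the end. Returns 0, or -1 if allocation fails. */
int list_push_back(IntList *l, int value)
{
    assert(l != NULL);
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = NULL;
    n->prev = l->tail;
    if (l->tail != NULL) l->tail->next = n; else l->head = n;
    l->tail = n;
    return 0;
}

/* Push at the beginning, which only the linked list offers. */
int list_push_front(IntList *l, int value)
{
    assert(l != NULL);
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->prev = NULL;
    n->next = l->head;
    if (l->head != NULL) l->head->prev = n; else l->tail = n;
    l->head = n;
    return 0;
}

/* pop(): remove the last value and return it through out. -1 if empty. */
int list_pop_back(IntList *l, int *out)
{
    assert(l != NULL && out != NULL);
    Node *n = l->tail;
    if (n == NULL)
        return -1;
    *out = n->value;
    l->tail = n->prev;
    if (l->tail != NULL) l->tail->next = NULL; else l->head = NULL;
    free(n);
    return 0;
}

/* Pop at the beginning. -1 if empty. */
int list_pop_front(IntList *l, int *out)
{
    assert(l != NULL && out != NULL);
    Node *n = l->head;
    if (n == NULL)
        return -1;
    *out = n->value;
    l->head = n->next;
    if (l->head != NULL) l->head->prev = NULL; else l->tail = NULL;
    free(n);
    return 0;
}
```

Traversal with next() or prev() then amounts to following the next or prev pointer starting from head or tail.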

While it is a lot of work to implement these features, vectors and linked lists are nothing new to the world of computer science, so they should be relatively easy to implement. Additionally, COBOL's high-level language features such as garbage collection and generics should make implementation less of a hassle compared to a language where precise memory management is a priority.

And that is finally the end. Now, any questions or comments?

[🐛]: No templates or subcommands found matching: 'otterkit-export'.

Hello,

As the title says, I have a problem running the following command as described in the documentation: otterkit new app

The error I received is:

& ❯ otterkit new app
No templates or subcommands found matching: 'otterkit-export'.

To list installed templates similar to 'otterkit-export', run:
   dotnet new list otterkit-export
To search for the templates on NuGet.org, run:
   dotnet new search otterkit-export


For details on the exit code, refer to https://aka.ms/templating-exit-codes#103

Running the above commands gives me: No templates found matching: 'otterkit-export'.

I've checked whether I have .NET 7 installed, and it seems I have everything set up correctly:

& ❯ dotnet --info
.NET SDK:
 Version:   7.0.400
 Commit:    73bf45718d

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.22621
 OS Platform: Windows
 RID:         win10-x64
 Base Path:   C:\Program Files\dotnet\sdk\7.0.400\

Host:
  Version:      7.0.11
  Architecture: x64
  Commit:       ecb34f85ec

.NET SDKs installed:
  6.0.414 [C:\Program Files\dotnet\sdk]
  7.0.102 [C:\Program Files\dotnet\sdk]
  7.0.400 [C:\Program Files\dotnet\sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.All 2.1.30 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.30 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.32 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 5.0.17 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 6.0.13 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 6.0.21 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 6.0.22 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 7.0.2 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 7.0.10 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.1.30 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.32 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 5.0.17 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 6.0.13 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 6.0.15 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 6.0.18 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 6.0.20 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 6.0.21 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 6.0.22 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 7.0.2 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 7.0.10 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 7.0.11 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.WindowsDesktop.App 3.1.32 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 5.0.17 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 6.0.13 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 6.0.21 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 6.0.22 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 7.0.2 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 7.0.10 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]

Other architectures found:
  x86   [C:\Program Files (x86)\dotnet]
    registered at [HKLM\SOFTWARE\dotnet\Setup\InstalledVersions\x86\InstallLocation]

Environment variables:
  Not set

global.json file:
  Not found

Learn more:
  https://aka.ms/dotnet/info

Download .NET:
  https://aka.ms/dotnet/download

Also checking for installed templates gives me the following output:

& ❯ dotnet new list
These templates matched your input:

Template Name                                 Short Name                  Language    Tags
--------------------------------------------  --------------------------  ----------  ---------------------------------------------------------
.NET MAUI App                                 maui                        [C#]        MAUI/Android/iOS/macOS/Mac Catalyst/Windows/Tizen
.NET MAUI Blazor App                          maui-blazor                 [C#]        MAUI/Android/iOS/macOS/Mac Catalyst/Windows/Tizen/Blazor
.NET MAUI Class Library                       mauilib                     [C#]        MAUI/Android/iOS/macOS/Mac Catalyst/Windows/Tizen
.NET MAUI ContentPage (C#)                    maui-page-csharp            [C#]        MAUI/Android/iOS/macOS/Mac Catalyst/WinUI/Tizen/Xaml/Code
.NET MAUI ContentPage (XAML)                  maui-page-xaml              [C#]        MAUI/Android/iOS/macOS/Mac Catalyst/WinUI/Tizen/Xaml/Code
.NET MAUI ContentView (C#)                    maui-view-csharp            [C#]        MAUI/Android/iOS/macOS/Mac Catalyst/WinUI/Tizen/Xaml/Code
.NET MAUI ContentView (XAML)                  maui-view-xaml              [C#]        MAUI/Android/iOS/macOS/Mac Catalyst/WinUI/Tizen/Xaml/Code
.NET MAUI ResourceDictionary (XAML)           maui-dict-xaml              [C#]        MAUI/Android/iOS/macOS/Mac Catalyst/WinUI/Xaml/Code
Android Activity template                     android-activity            [C#]        Android/Mobile
Android Application                           android                     [C#]        Android/Mobile
Android Class Library                         androidlib                  [C#]        Android/Mobile
Android Java Library Binding                  android-bindinglib          [C#]        Android/Mobile
Android Layout template                       android-layout              [C#]        Android/Mobile
Android Wear Application                      androidwear                 [C#]        Android/Mobile
ASP.NET Core Empty                            web                         [C#],F#     Web/Empty
ASP.NET Core gRPC Service                     grpc                        [C#]        Web/gRPC
ASP.NET Core Web API                          webapi                      [C#],F#     Web/WebAPI
ASP.NET Core Web App                          webapp,razor                [C#]        Web/MVC/Razor Pages
ASP.NET Core Web App (Model-View-Controller)  mvc                         [C#],F#     Web/MVC
ASP.NET Core with Angular                     angular                     [C#]        Web/MVC/SPA
ASP.NET Core with React.js                    react                       [C#]        Web/MVC/SPA
ASP.NET Core with React.js and Redux          reactredux                  [C#]        Web/MVC/SPA
Blazor Server App                             blazorserver                [C#]        Web/Blazor
Blazor Server App Empty                       blazorserver-empty          [C#]        Web/Blazor/Empty
Blazor WebAssembly App                        blazorwasm                  [C#]        Web/Blazor/WebAssembly/PWA
Blazor WebAssembly App Empty                  blazorwasm-empty            [C#]        Web/Blazor/WebAssembly/PWA/Empty
Class Library                                 classlib                    [C#],F#,VB  Common/Library
Console App                                   console                     [C#],F#,VB  Common/Console
dotnet gitignore file                         gitignore                               Config
Dotnet local tool manifest file               tool-manifest                           Config
EditorConfig file                             editorconfig                            Config
global.json file                              globaljson                              Config
iOS Application                               ios                         [C#],F#,VB  iOS/Mobile
iOS Binding Library                           iosbinding                  [C#]        iOS/Mobile
iOS Class Library                             ioslib                      [C#],VB     iOS/Mobile
iOS Controller                                ios-controller              [C#]        iOS/Mobile
iOS Storyboard                                ios-storyboard              [C#]        iOS/Mobile
iOS Tabbed Application                        ios-tabbed                  [C#]        iOS/Mobile
iOS View                                      ios-view                    [C#]        iOS/Mobile
iOS View Controller                           ios-viewcontroller          [C#]        iOS/Mobile
Mac Catalyst Application                      maccatalyst                 [C#],VB     macOS/Mac Catalyst
Mac Catalyst Binding Library                  maccatalystbinding          [C#]        macOS/Mac Catalyst
Mac Catalyst Class Library                    maccatalystlib              [C#],VB     macOS/Mac Catalyst
Mac Catalyst Controller                       maccatalyst-controller      [C#]        macOS/Mac Catalyst
Mac Catalyst Storyboard                       maccatalyst-storyboard      [C#]        macOS/Mac Catalyst
Mac Catalyst View                             maccatalyst-view            [C#]        macOS/Mac Catalyst
Mac Catalyst View Controller                  maccatalyst-viewcontroller  [C#]        macOS/Mac Catalyst
MSBuild Directory.Build.props file            buildprops                              MSBuild/props
MSBuild Directory.Build.targets file          buildtargets                            MSBuild/props
MSTest Test Project                           mstest                      [C#],F#,VB  Test/MSTest
MVC ViewImports                               viewimports                 [C#]        Web/ASP.NET
MVC ViewStart                                 viewstart                   [C#]        Web/ASP.NET
NuGet Config                                  nugetconfig                             Config
NUnit 3 Test Item                             nunit-test                  [C#],F#,VB  Test/NUnit
NUnit 3 Test Project                          nunit                       [C#],F#,VB  Test/NUnit
Otterkit COBOL Application                    otterkit-app                [C#]        COBOL/Application
Otterkit COBOL Library                        otterkit-lib                [C#]        COBOL/Library
Protocol Buffer File                          proto                                   Web/gRPC
Razor Class Library                           razorclasslib               [C#]        Web/Razor/Library/Razor Class Library
Razor Component                               razorcomponent              [C#]        Web/ASP.NET
Razor Page                                    page                        [C#]        Web/ASP.NET
Solution File                                 sln,solution                            Solution
Web Config                                    webconfig                               Config
Windows Forms App                             winforms                    [C#],VB     Common/WinForms
Windows Forms Class Library                   winformslib                 [C#],VB     Common/WinForms
Windows Forms Control Library                 winformscontrollib          [C#],VB     Common/WinForms
Worker Service                                worker                      [C#],F#     Common/Worker/Web
WPF Application                               wpf                         [C#],VB     Common/WPF
WPF Class Library                             wpflib                      [C#],VB     Common/WPF
WPF Custom Control Library                    wpfcustomcontrollib         [C#],VB     Common/WPF
WPF User Control Library                      wpfusercontrollib           [C#],VB     Common/WPF
xUnit Test Project                            xunit                       [C#],F#,VB  Test/xUnit

Sorry to bother but am I missing something? Probably yes, but could you help me to resolve my issue? Thank you.

Parser should honor literals

Describe the bug
01 Num2 PIC 9 values zeros.
The language parser fails for numeric literals (here, the figurative constant zeros).

To Reproduce
https://gist.github.com/birojnayak/4bd0d4332d98170f74803b3e217ca4f8

Expected behavior
Parser should honor it.
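For reference, ZERO, ZEROS and ZEROES are all spellings of the same figurative constant, and COBOL words are case insensitive. A minimal sketch of recognizing them is shown below; the function names are hypothetical and this is not taken from the Otterkit lexer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive comparison of a token of known length against an
 * uppercase keyword. The token does not need to be null terminated. */
static int word_equals(const char *tok, size_t len, const char *kw)
{
    size_t i;
    for (i = 0; i < len && kw[i] != '\0'; i++) {
        if (toupper((unsigned char)tok[i]) != kw[i])
            return 0;
    }
    return i == len && kw[i] == '\0';
}

/* Returns 1 if the token is any spelling of the ZERO figurative constant. */
int is_zero_figurative(const char *tok, size_t len)
{
    assert(tok != NULL);
    return word_equals(tok, len, "ZERO")
        || word_equals(tok, len, "ZEROS")
        || word_equals(tok, len, "ZEROES");
}
```

A parser that accepts the token here could then treat it the same way as the literal 0 when initializing a numeric item.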

[✨]: include yourself in compiler-explorer

Is your feature request related to a problem?
It is nice to be able to compare the results of compilers, and it also gives the project some additional visibility.

Describe the solution you'd like
Add yourself to compiler-explorer.
This would mean adding it to the existing COBOL definition in the main repo, see compiler-explorer/compiler-explorer@6f7218a, and also integrating it into the infrastructure repository. The matching commit to the one above is compiler-explorer/infra@8948308.

Note that the main entry is based on GCC so that's "cheap", you'd use a "complete" definition like https://github.com/compiler-explorer/compiler-explorer/blob/main/lib/compilers/gnucobol.ts that more or less just uses the build and use instructions you already have.

The .net one is possibly also useful to inspect: https://github.com/compiler-explorer/compiler-explorer/blob/main/lib/compilers/dotnet.ts

Additional context
Note: before this is done - ensure that the codegen works for the examples (which are 'simple'), those are available at https://github.com/compiler-explorer/compiler-explorer/tree/main/examples/cobol

After that it is likely a good idea to create a new compiler issue there and discuss the necessary steps and if they may do the actual work.
