
minsk's Introduction

Minsk

Have you considered Minsk? -- Worf, naming things.

This repo contains Minsk, a handwritten compiler in C#. It illustrates basic concepts of compiler construction and how one can tool the language inside of an IDE by exposing APIs for parsing and type checking.

This compiler uses many of the concepts that you can find in the Microsoft C# and Visual Basic compilers, code named Roslyn.

Live coding

This code base was written live during streaming. You can watch the recordings on YouTube or browse the episode PRs.

Browsing the code

If you want to browse the code, check out the symbolic source browser.

Donations

Some people kindly asked me whether I accept donations. I have the luxury of working for a great employer and I make a good salary. That means I have got the time and means to produce these videos and share my passion for open source and .NET.

But not everyone has that luxury. If you find these videos helpful and you want to give back, consider donating to organizations that help the less fortunate to get into the tech industry, such as Black Girls Code.

Thank you ❤

minsk's People

Contributors

acolom, aniel, btastic, catalin-andronie, cbenghi, centreboard, distantcam, dolifer, fedeazzato, fredrikhr, iwillspeak, ltrzesniewski, neme12, nirmal4g, noslaver, nothisisnotdaniel, ostorc, owenneil, stefan991, superjmn, terrajobst, thebluesky, thild, tiesmaster, tkharaishvili, warappa, youssef1313


minsk's Issues

[Proposal] Consider adding REPL command history

It would be handy if we could use Up/Down buttons to walk through the history

This package seems to be working (AFAIK @filip_woj has used it for dotnet-script).

Also this sounds like a good up-for-grabs feature :)

Empty input will result in a crash

Leaving a blank input or just whitespace results in a crash

»     

Unhandled Exception: System.ArgumentNullException: Value cannot be null.
Parameter name: key
   at System.Collections.Generic.Dictionary`2.FindEntry(TKey key)
   at System.Collections.Generic.Dictionary`2.TryGetValue(TKey key, TValue& value)
   at Minsk.CodeAnalysis.Binding.BoundScope.TryLookup(String name, VariableSymbol& variable) in /home/nathan/Projects/minsk/src/Minsk/CodeAnalysis/Binding/BoundScope.cs:line 28
   at Minsk.CodeAnalysis.Binding.Binder.BindNameExpression(NameExpressionSyntax syntax) in /home/nathan/Projects/minsk/src/Minsk/CodeAnalysis/Binding/Binder.cs:line 146
   at Minsk.CodeAnalysis.Binding.Binder.BindExpression(ExpressionSyntax syntax) in /home/nathan/Projects/minsk/src/Minsk/CodeAnalysis/Binding/Binder.cs:line 119
   at Minsk.CodeAnalysis.Binding.Binder.BindExpressionStatement(ExpressionStatementSyntax syntax) in /home/nathan/Projects/minsk/src/Minsk/CodeAnalysis/Binding/Binder.cs:line 106
   at Minsk.CodeAnalysis.Binding.Binder.BindStatement(StatementSyntax syntax) in /home/nathan/Projects/minsk/src/Minsk/CodeAnalysis/Binding/Binder.cs:line 69
   at Minsk.CodeAnalysis.Binding.Binder.BindGlobalScope(BoundGlobalScope previous, CompilationUnitSyntax syntax) in /home/nathan/Projects/minsk/src/Minsk/CodeAnalysis/Binding/Binder.cs:line 24
   at Minsk.CodeAnalysis.Compilation.get_GlobalScope() in /home/nathan/Projects/minsk/src/Minsk/CodeAnalysis/Compilation.cs:line 35
   at Minsk.CodeAnalysis.Compilation.Evaluate(Dictionary`2 variables) in /home/nathan/Projects/minsk/src/Minsk/CodeAnalysis/Compilation.cs:line 50
   at Minsk.Program.Main() in /home/nathan/Projects/minsk/src/mc/Program.cs:line 67

Production ready?

I want to use Minsk for my next project. Is it production ready?

Seriously: Are you planning to continue the series?
Loved it so far!

Optional Tokens

How could I use an optional semicolon for every expression?

Could you introduce a generic method?

New episodes?

It has been quite a while since the last episode, are there any new episodes planned?

Print should accept any

Right now print only accepts string, which requires conversions. We should make print take any instead.

Replace reflection based tree-traversal with Roslyn source generator

We should replace this code:

public IEnumerable<SyntaxNode> GetChildren()
{
    var properties = GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance);
    foreach (var property in properties)
    {
        if (typeof(SyntaxNode).IsAssignableFrom(property.PropertyType))
        {
            var child = (SyntaxNode)property.GetValue(this);
            if (child != null)
                yield return child;
        }
        else if (typeof(SeparatedSyntaxList).IsAssignableFrom(property.PropertyType))
        {
            var separatedSyntaxList = (SeparatedSyntaxList)property.GetValue(this);
            foreach (var child in separatedSyntaxList.GetWithSeparators())
                yield return child;
        }
        else if (typeof(IEnumerable<SyntaxNode>).IsAssignableFrom(property.PropertyType))
        {
            var children = (IEnumerable<SyntaxNode>)property.GetValue(this);
            foreach (var child in children)
            {
                if (child != null)
                    yield return child;
            }
        }
    }
}

With a source generator that creates GetChildren() for each of the classes derived from SyntaxNode.
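As a rough illustration of the target shape, the generator could emit one override per node type that yields the children directly, with no reflection. The types below are simplified stand-ins rather than the actual Minsk classes, and the second partial declaration is hand-written here only to imitate what generated output might look like.

```csharp
using System.Collections.Generic;

// Simplified stand-ins for the real syntax classes (assumptions for illustration).
public abstract class SyntaxNode
{
    public abstract IEnumerable<SyntaxNode> GetChildren();
}

public sealed class SyntaxToken : SyntaxNode
{
    public SyntaxToken(string text) { Text = text; }
    public string Text { get; }

    // Tokens are leaves, so they have no children.
    public override IEnumerable<SyntaxNode> GetChildren()
    {
        yield break;
    }
}

// The hand-written part of the node: constructor and properties only.
public sealed partial class BinaryExpressionSyntax : SyntaxNode
{
    public BinaryExpressionSyntax(SyntaxNode left, SyntaxToken operatorToken, SyntaxNode right)
    {
        Left = left;
        OperatorToken = operatorToken;
        Right = right;
    }

    public SyntaxNode Left { get; }
    public SyntaxToken OperatorToken { get; }
    public SyntaxNode Right { get; }
}

// What a generator might place in a separate file for the partial class above.
public sealed partial class BinaryExpressionSyntax
{
    public override IEnumerable<SyntaxNode> GetChildren()
    {
        yield return Left;
        yield return OperatorToken;
        yield return Right;
    }
}
```

Declaring the node classes as partial is what would let a generator add the override without touching the hand-written file.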

Parsing and syntax for dot accessor!

I'm writing here so I can reach you @terrajobst and all the interested people following this channel.
I'm trying to add a . (dot) accessor to a programming language I'm trying to build. I wanted to know what would be the best approach for it...

should I use the BinaryExpressionSyntax class and treat . (dot) as an operator?
or should I build a class specially for this... ie: DotAccessorExpressionSyntax
what kind of information should I keep there?

Use emitter in REPL

We should emit an assembly into memory and load that instead of using the Evaluator. This avoids having to support two backends.

Disallow tabs

We've made too many compromises already, too many retreats. They invade our files, and we fall back. They assimilate entire code bases, and we fall back. Not again. The line must be drawn here! This far, and no further! And I will make them pay for what they've done!
-- VP of Engineering Jean-Luc Picard when fighting tabs

As requested in this tweet, Minsk should only allow spaces for indentation and fail for tabs.
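A minimal sketch of such a check, assuming it runs over the raw source text before or during lexing. The class name, method name, and message format are made up for illustration and are not part of Minsk.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of Minsk.
public static class TabChecker
{
    // Returns one message per line whose leading whitespace contains a tab.
    public static List<string> FindTabIndentation(string text)
    {
        var diagnostics = new List<string>();
        var lines = text.Split('\n');
        for (var i = 0; i < lines.Length; i++)
        {
            foreach (var c in lines[i])
            {
                if (c == '\t')
                {
                    // Line numbers are reported 1-based.
                    diagnostics.Add($"({i + 1}): Tabs are not allowed for indentation.");
                    break;
                }

                // Stop at the first character that is not indentation.
                if (c != ' ')
                    break;
            }
        }

        return diagnostics;
    }
}
```

This only looks at indentation, so a tab inside a string literal or after the first token on a line would not be flagged.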

Future of the Project

Are there any plans to continue the project? It's been amazing so far, would love to see it completed!

Still interested!

Hi Immo,

just wanted to let you know that there are some people still interested in a continuation of this series.

Thanks,
Flo

[QUESTION] Parsing comments/documentation and attach it to function declarations

To all of you parser developers! I'm making a parser for a custom language. This language uses JSDoc as a way of adding function documentation and metadata. My idea was to tokenize the input with the following types of comments:

// my line comment
/* my multi line comment */  
/// this is a documentation

then I would strip them from the List...

var comments = tokens.Where(x => x.Kind == Kind.SingleLComment || x.Kind == Kind.MultiLComment);
var documentation = tokens.Where(x => x.Kind == Kind.Documentation);
var code = tokens.Where(x => !comments.Contains(x) && !documentation.Contains(x));

now the code tokens can be normally parsed as intended...

now in the parser I want to be able to, at any time, check if there is a token previous to a declaration and attach it to the declaration model as metadata... for example...

ParseFunctionDeclaration()
{
            var functionKeyword = MatchToken(SyntaxKind.FunctionKeyword);
            var identifier = MatchToken(SyntaxKind.IdentifierToken);
            var openParenthesisToken = MatchToken(SyntaxKind.OpenParenthesisToken);
            var parameters = ParseParameterList();
            var closeParenthesisToken = MatchToken(SyntaxKind.CloseParenthesisToken);
            var meta = functionKeyword.TryMatchPrevious(SyntaxKind.Documentation, out var token) ? token : null; 
}

even though the tokens are no longer next to each other.. (I stripped the original list) I was thinking of keeping them linked using a pointer to the previous token (during the tokenization process).

my problem is I may be breaking the model structure here... as the tokens will have access to each other and they will need to have a TryMatchPrevious method. It doesn't really make sense because a token shouldn't have match functionality.

on the other hand I can just put the function in the Parser and have its signature be:

TryMatchPrevious(kindToMatch, out var token);
or even
TryMatchPrevious(startToken, kindToMatch, out var token);

what do you think of this approach? am I overthinking it? is this too much? is there a simple way of implementing this!!
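One alternative that avoids linking tokens to each other is the trivia approach that Roslyn uses: comments are attached to the next significant token as leading trivia, so the parser can read the documentation straight off the function keyword it already matched. The sketch below is only an illustration of that idea; the Kind values, Token class, and helper names are assumptions and do not come from Minsk or from the code above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical token kinds for this sketch.
public enum Kind { LineComment, BlockComment, Documentation, FunctionKeyword, Identifier }

public sealed class Token
{
    public Token(Kind kind, string text) { Kind = kind; Text = text; }
    public Kind Kind { get; }
    public string Text { get; }

    // Comments that appeared directly before this token in the source.
    public List<Token> LeadingTrivia { get; } = new List<Token>();
}

public static class TriviaAttacher
{
    private static bool IsComment(Kind kind) =>
        kind == Kind.LineComment || kind == Kind.BlockComment || kind == Kind.Documentation;

    // Moves every comment onto the next non-comment token and returns only the code tokens.
    public static List<Token> Attach(IEnumerable<Token> tokens)
    {
        var code = new List<Token>();
        var pending = new List<Token>();
        foreach (var token in tokens)
        {
            if (IsComment(token.Kind))
            {
                pending.Add(token);
                continue;
            }

            token.LeadingTrivia.AddRange(pending);
            pending.Clear();
            code.Add(token);
        }

        // Comments after the last code token are dropped in this sketch.
        return code;
    }

    // Returns the last documentation comment in front of the token, or null if there is none.
    public static Token GetDocumentation(Token token) =>
        token.LeadingTrivia.LastOrDefault(t => t.Kind == Kind.Documentation);
}
```

With this shape, the parser would call something like GetDocumentation(functionKeyword) after matching the keyword, and tokens never need a match method of their own.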

[Question] A different lowering for while-statement perhaps

Hey @terrajobst,

I am really enjoying your compiler series.
In episode 8 (lowering), you described the while statement as this sequence of steps:

  • goto check
  • continue
  • body
  • gotoTrue condition continue
  • end

What if we had a version with one fewer label, as follows:

  • begin
  • gotoFalse condition end
  • body
  • goto begin
  • end

Would it impact anything later?
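One way to compare the two shapes is to count how many jump instructions each one executes. The sketch below simulates both step lists for a loop whose condition holds a given number of times. It is a counting model written for this comparison, not Minsk's lowerer, and whether the difference matters for Minsk is not something the episode states.

```csharp
// Hypothetical counting model for the two lowerings listed above.
public static class WhileLowering
{
    // Steps: goto check; continue; body; gotoTrue condition continue; end.
    public static int JumpsWithCheckAtBottom(int iterations)
    {
        var remaining = iterations;
        var jumps = 1;          // the initial "goto check"
        while (true)
        {
            jumps++;            // "gotoTrue condition continue" is executed
            if (remaining == 0)
                break;          // condition is false: fall through to end
            remaining--;        // body
        }

        return jumps;
    }

    // Steps: begin; gotoFalse condition end; body; goto begin; end.
    public static int JumpsWithCheckAtTop(int iterations)
    {
        var remaining = iterations;
        var jumps = 0;
        while (true)
        {
            jumps++;            // "gotoFalse condition end" is executed
            if (remaining == 0)
                break;          // condition is false: jump to end
            remaining--;        // body
            jumps++;            // "goto begin"
        }

        return jumps;
    }
}
```

For three iterations the first shape executes 5 jump instructions and the second executes 7: the version with one fewer label runs two jumps per iteration instead of one, while both produce the same behavior.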

Thanks!!

Add a language server

I would love to see some episodes on how to create a language server for Minsk, as I imagine this needs to tie in deeply with the compiler.

Would also enable some cool integrations with VSCode.

Dead project?

Is this project dead? It seems like it is not going to be continued, because there are no interactions with it. I would personally think that it is a good idea to continue with it, but I don't know what you (@terrajobst) want to do.

Add support for arrays

We should add support for arrays! They would allow us to create complex structures.

Changes:

Restructure for loops to iterate over an array:

for <variableName> of <array> 
{
}

The to keyword would create an array:

var nums: int[] = 1 to 10
for i of nums
{
}

Function array(length: number): any[]:

let list: string[] = array(5)

To make that possible we would also have to implement nulls (#171).

Function count(list: any[]): int:

let length: int = count(list)

New syntax [0, 1] to create initialized arrays (automatically detect types):

let list2 = [1, 2, 3, 4, 5] // I know, you could use 1 to 5 here

New syntax list[i]:

let value = list2[2] // = 3

That would allow us to create such functions:

function prompt(questions: string[]): string[]
{
    let result: string[] = array(count(questions))
    var index = 0;
    for q of questions
    {
        print(q)
        result[index] = input()
        index++ // I think we already have that
    }
    return result
}

Expose a semantic model

We should add a semantic model API that allows answering questions like "which type does this ExpressionSyntax resolve to?" or "what VariableSymbol is referenced by this NameExpressionSyntax?"

Support 5!

After this: [image]
I think you should add support for factorials :)

Generator doesn't work on Unix systems

I have tried to compile the project via build.sh, but on both Mac and Ubuntu I got errors during compilation.
On WSL Ubuntu 20:

CSC : error CS2011: Error opening response file 'P:\tmp/tmp569c46c419f443d4a597c88937c6f562.rsp' [/mnt/p/minsk/src/Minsk.Generators/Minsk.Generators.csproj]

CSC : warning CS2008: No source files specified. [/mnt/p/minsk/src/Minsk.Generators/Minsk.Generators.csproj]

CSC : error CS1562: Outputs without source must have the /out option specified [/mnt/p/minsk/src/Minsk.Generators/Minsk.Generators.csproj]

and on Mac (I am sorry for Czech in output, but I couldn't persuade dotnet to print it in English):

/Users/ostorc/.nuget/packages/microsoft.net.compilers/3.6.0-4.final/tools/Microsoft.CSharp.Core.targets(59,5): error MSB3883: Neočekávaná výjimka:  [/Users/ostorc/Projects/minsk/src/Minsk.Generators/Minsk.Generators.csproj]
/Users/ostorc/.nuget/packages/microsoft.net.compilers/3.6.0-4.final/tools/Microsoft.CSharp.Core.targets(59,5): error : System.ComponentModel.Win32Exception (8): Exec format error [/Users/ostorc/Projects/minsk/src/Minsk.Generators/Minsk.Generators.csproj]
/Users/ostorc/.nuget/packages/microsoft.net.compilers/3.6.0-4.final/tools/Microsoft.CSharp.Core.targets(59,5): error : at System.Diagnostics.Process.ForkAndExecProcess(String filename, String[] argv, String[] envp, String cwd, Boolean redirectStdin, Boolean redirectStdout, Boolean redirectStderr, Boolean setCredentials, UInt32 userId, UInt32 groupId, UInt32[] groups, Int32& stdinFd, Int32& stdoutFd, Int32& stderrFd, Boolean usesTerminal, Boolean throwOnNoExec) [/Users/ostorc/Projects/minsk/src/Minsk.Generators/Minsk.Generators.csproj]
/Users/ostorc/.nuget/packages/microsoft.net.compilers/3.6.0-4.final/tools/Microsoft.CSharp.Core.targets(59,5): error : at System.Diagnostics.Process.StartCore(ProcessStartInfo startInfo) [/Users/ostorc/Projects/minsk/src/Minsk.Generators/Minsk.Generators.csproj]
/Users/ostorc/.nuget/packages/microsoft.net.compilers/3.6.0-4.final/tools/Microsoft.CSharp.Core.targets(59,5): error : at System.Diagnostics.Process.Start() [/Users/ostorc/Projects/minsk/src/Minsk.Generators/Minsk.Generators.csproj]
/Users/ostorc/.nuget/packages/microsoft.net.compilers/3.6.0-4.final/tools/Microsoft.CSharp.Core.targets(59,5): error : at Microsoft.Build.Utilities.ToolTask.ExecuteTool(String pathToTool, String responseFileCommands, String commandLineCommands) [/Users/ostorc/Projects/minsk/src/Minsk.Generators/Minsk.Generators.csproj]
/Users/ostorc/.nuget/packages/microsoft.net.compilers/3.6.0-4.final/tools/Microsoft.CSharp.Core.targets(59,5): error : at Microsoft.CodeAnalysis.BuildTasks.ManagedCompiler.ExecuteTool(String pathToTool, String responseFileCommands, String commandLineCommands) [/Users/ostorc/Projects/minsk/src/Minsk.Generators/Minsk.Generators.csproj]

Both systems have the current preview version of dotnet (5.0.100-preview.3.20216.6).

On Windows it works fine.

[Question]FunctionSymbol references syntax

Hello,

I'm currently trying to make the mental separation between parsing and binding.

And everything makes sense so far. Parsing is the step to make sense of the syntax, while binding is the step to make sense of the semantics.
And this distinction works in Minsk with all bound types and symbols, except FunctionSymbol.

FunctionSymbol has a reference to the declaration syntax.

My question is now: Is this on purpose or is it just something that wasn't addressed yet?

If it is on purpose, what is the benefit?
As far as I can see, the syntax should not be relevant after successful binding.

Edit: Fix spelling, Markdown

[Feature request] Source Generators

I propose adding facilities to the Minsk compiler that would allow it to better cater to users and use-cases that could benefit from code generation. The implementation and the external user-facing interface when interacting with the compiler can be inspired by the recent addition to the Roslyn compiler. Enhancing the compiler with such a feature would not only benefit boilerplate-heavy scenarios, where codebases can be made much less susceptible to the fragility arising from alteration or addition of code that's reliant on handwritten boilerplate code, but also indirectly supplement existing/future metaprogramming abilities of the language without adding special syntax to the language just for this purpose.

Optimize string concatenation

We should flatten multiple string concatenations and use the overloads of String.Concat that allow us to pass in multiple parameters/an array. That's better than only handling two strings at a time.
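A minimal sketch of the flattening step, assuming a tiny expression tree with only string literals and a binary concatenation node. The Expr, Literal, and Concat types are hypothetical stand-ins rather than Minsk's bound nodes, and the sketch evaluates the result directly instead of emitting IL.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins for bound expression nodes.
public abstract class Expr { }

public sealed class Literal : Expr
{
    public Literal(string value) { Value = value; }
    public string Value { get; }
}

public sealed class Concat : Expr
{
    public Concat(Expr left, Expr right) { Left = left; Right = right; }
    public Expr Left { get; }
    public Expr Right { get; }
}

public static class ConcatFlattener
{
    // Collects the operands of nested concatenations in left-to-right order.
    public static List<Expr> Flatten(Expr node)
    {
        var operands = new List<Expr>();
        Collect(node, operands);
        return operands;
    }

    private static void Collect(Expr node, List<Expr> operands)
    {
        if (node is Concat concat)
        {
            Collect(concat.Left, operands);
            Collect(concat.Right, operands);
        }
        else
        {
            operands.Add(node);
        }
    }

    // Joins all operands with a single String.Concat call instead of one call per pair.
    // This sketch assumes every operand is a Literal.
    public static string Evaluate(Expr node)
    {
        var parts = new List<string>();
        foreach (var operand in Flatten(node))
            parts.Add(((Literal)operand).Value);
        return string.Concat(parts);
    }
}
```

An emitter would use the flattened operand count to pick a fixed-arity String.Concat overload or the array overload.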

Remove dead code

After we did #98 we should also remove dead code. The compiler should also emit warnings for dead code.

I want to repost your video

Hello dear Immo, I'm a Chinese video maker. I'm interested in your videos such as "making a compiler" and so on. Can I post them to other websites such as bilibili? I will tell other people that you are the original author and put the original link under these videos.
Hoping for your reply.

Exception when evaluating token sequences <ArbitraryToken> <IdentifierToken>

Currently there is a problem with matching the following expected token sequence:
[ArbitraryToken] <IdentifierToken>

In case of valid inputs, this works like a charm. But think of malformed inputs like:
for while - while is a keyword, not an identifier
var 1 - 1 is a number, not an identifier
while let - let is a keyword, not an identifier
...
When an expected token of kind IdentifierToken cannot be matched (see above), the parser inserts one with {Kind = SyntaxKind.Identifier, Text = null, Value = null} and reports an "unexpected token" diagnostic.

The parser diagnostics don't get reported immediately but are concatenated with the diagnostics from the binder, which implies binding the (malformed) syntax tree first. And when it comes to binding, those inserted tokens start to hurt:
Most of the parsed identifier tokens are bound to VariableSymbols with SyntaxToken.Text as the variable name. And when trying to declare it in its scope this ultimately leads to either
_variables.ContainsKey(null)
or
_variables.Add(null, variable)

Dictionaries don't like nulls, so both calls lead to an ArgumentNullException.
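One of the possible fixes, shown only as a sketch, is to guard the scope itself so that a missing name never reaches the dictionary. The classes below are simplified stand-ins for Minsk's BoundScope and VariableSymbol, and the guard placement is an assumption rather than the maintainer's preferred solution.

```csharp
using System.Collections.Generic;

// Simplified stand-ins for the real classes (assumptions for illustration).
public sealed class VariableSymbol
{
    public VariableSymbol(string name) { Name = name; }
    public string Name { get; }
}

public sealed class BoundScope
{
    private readonly Dictionary<string, VariableSymbol> _variables =
        new Dictionary<string, VariableSymbol>();

    public bool TryDeclare(VariableSymbol variable)
    {
        // A token inserted by the parser has no text; the parser has already
        // reported that error, so the symbol is simply not declared.
        if (string.IsNullOrEmpty(variable.Name))
            return false;

        if (_variables.ContainsKey(variable.Name))
            return false;

        _variables.Add(variable.Name, variable);
        return true;
    }

    public bool TryLookup(string name, out VariableSymbol variable)
    {
        variable = null;
        return !string.IsNullOrEmpty(name) && _variables.TryGetValue(name, out variable);
    }
}
```

A caller that reports a redeclaration error whenever TryDeclare returns false would need to skip that report for empty names to avoid a misleading second diagnostic.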

I decided to not file a PR, because there are various ways of fixing that issue and I don't know which one you would prefer.

/Peter

EDIT: Tried to clarify the problem.

Support comparing two values of type any

Right now, this doesn't compile:

var x: any = "12"
var y: any = "2"
var result = x == y

The error is:

Binary operator '==' is not defined for types 'any' and 'any'.
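If any maps to System.Object at runtime, one option is to define == on any through the static object.Equals method, which compares boxed values by value rather than by reference. This is a sketch under that assumption; the AnyOperators helper is a hypothetical name.

```csharp
// Hypothetical helper, assuming 'any' is represented as System.Object.
public static class AnyOperators
{
    // object.Equals handles nulls and dispatches to the runtime type's Equals,
    // so two boxed ints or two strings with the same value compare equal.
    public static bool AreEqual(object left, object right)
    {
        return object.Equals(left, right);
    }
}
```

Using the C# == operator on two object operands would instead compare references, so two separately boxed integers with the same value would compare as different.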

Null

Add support for null:

function x(y: int)
{
    if (y > 10) return y - 10
    else return null
}

Also mentioned in #170
