mar-toolkit's Introduction

The MAR Toolkit

The MAR toolkit is a set of programs designed for building bots for the game Much Assembly Required.

Note: This project is not officially part of the game.

Features

Macro MAR (MMAR) assembly language

The Macro MAR language is designed to be a superset of Much Assembly Required's language. In addition to all the features of MAR, MMAR adds its own features such as macros, file modularity, and assemble-time math.

MAR Disassembler

The toolkit contains a disassembler that can be used to inspect binary MAR code.

MAR Floppy Manager

A 'floppy disk' manager is included in the toolkit to ease the process of reading/writing binary data to/from 'MAR floppy disks'.

Contents

  • docs - Contains the MMAR language reference and other information.
  • src - Contains the toolkit's applications and libraries.

mar-toolkit's People

Contributors

francessco121

mar-toolkit's Issues

Create a 'MAR floppy disk' manager tool

Currently, there's no easy way to put a compiled MAR binary into the game's floppy media. A simple CLI tool could be made that allows users to:

  • Extract sectors into a separate binary file
  • Overwrite sectors with a binary file
  • Clear the entire media (can be achieved with the create command)
  • Create a blank media
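
The operations listed above could be sketched roughly as follows. This is a minimal illustration only: `SECTOR_BYTES` is a placeholder value, and the function names are hypothetical rather than the tool's actual commands. The real sector size and media layout should be taken from the game's documentation.

```python
SECTOR_BYTES = 1024  # assumption: placeholder sector size, not taken from the game


def create_blank(sector_count: int) -> bytearray:
    """Create zero-filled media; also serves as 'clear the entire media'."""
    return bytearray(sector_count * SECTOR_BYTES)


def extract_sectors(media: bytes, first: int, count: int) -> bytes:
    """Copy `count` sectors starting at sector index `first`."""
    start = first * SECTOR_BYTES
    end = start + count * SECTOR_BYTES
    if first < 0 or count < 0 or end > len(media):
        raise ValueError("sector range is outside the media")
    return bytes(media[start:end])


def overwrite_sectors(media: bytearray, first: int, data: bytes) -> None:
    """Write `data` over the media, starting at sector index `first`."""
    start = first * SECTOR_BYTES
    if first < 0 or start + len(data) > len(media):
        raise ValueError("data does not fit in the media")
    media[start:start + len(data)] = data
```

A CLI wrapper would read and write the media file around these helpers, for example calling `overwrite_sectors` with the bytes of a compiled binary.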

Commit pubspec.lock files

These are supposed to be in source control for application packages (i.e. the assembler, disassembler, and floppy manager).

Add an IR optimizer step to the assembler

There are many small optimizations the assembler can perform before outputting MAR code. There should be an optional step right before outputting code which optimizes a list of IR lines.

Starting optimizations

A small list of initial optimizations that can be supported with this issue.

Note: A small rule that applies for all of these optimizations: Any optimization that would normally result in an instruction being removed CANNOT be performed if that instruction has a label.

Redundant MOV

Two sequential MOV instructions that copy data between the same two operands in opposite directions are redundant, and the second MOV can be omitted:

MOV A, B
MOV B, A

to:

MOV A, B

Overwritten MOV

A MOV instruction that writes to the same destination as the MOV immediately before it makes the earlier MOV redundant:

MOV A, 0x0001
MOV A, 0x0002

to:

MOV A, 0x0002

JMPs to the next instruction

A JMP instruction which simply jumps to the instruction right after it can be omitted:

JMP label
label:

to:

label:
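
The three optimizations above could be sketched as a single peephole pass. This is a minimal illustration under stated assumptions, not the assembler's implementation: the `Line` type and `optimize` function are hypothetical, operands are assumed to be normalized strings, and the sketch deliberately skips the MOV rules whenever an operand contains a memory reference (`[`), because an address may depend on a register that the first MOV changes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Line:
    op: str                      # mnemonic, or "" for a label-only line
    args: Tuple[str, ...] = ()   # operands as normalized text
    label: Optional[str] = None  # label attached to this line, if any


def _plain_mov(line: Line) -> bool:
    """True for a two-operand MOV without memory operands (assumption: '[' marks memory)."""
    return (line.op == "MOV" and len(line.args) == 2
            and not any("[" in arg for arg in line.args))


def optimize(lines: List[Line]) -> List[Line]:
    out: List[Line] = []
    for i, cur in enumerate(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else None
        prev = out[-1] if out else None

        # JMP to the next instruction: drop an unlabeled JMP whose target
        # is the label on the very next line.
        if (cur.op == "JMP" and cur.label is None and len(cur.args) == 1
                and nxt is not None and nxt.label == cur.args[0]):
            continue

        if prev is not None and _plain_mov(prev) and _plain_mov(cur):
            # Redundant MOV: "MOV A, B" followed by an unlabeled "MOV B, A".
            if cur.label is None and cur.args == (prev.args[1], prev.args[0]):
                continue
            # Overwritten MOV: same destination, and the new source does not
            # read that destination, so the unlabeled earlier MOV is dead.
            if (prev.label is None and cur.args[0] == prev.args[0]
                    and cur.args[1] != cur.args[0]):
                out.pop()

        out.append(cur)
    return out
```

Each rule checks the label of the instruction it would remove, matching the note above. The sketch makes one pass only; a real optimizer might repeat until nothing changes.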

Potentially unsafe optimizations

The following is a list of optimizations that can only be done assuming the user never intentionally reads memory below the stack pointer (SP). These may need to be optional separately from the safe optimizations above (e.g. the assembler could have a flag --no-stack-optimizations to disable these).

Condensed sequential PUSH & POP

Sequential PUSH and POP instructions can be condensed into a single MOV:

PUSH 0x001
POP A

to:

MOV A, 0x001

Omitting sequential PUSH & POP from & to same destination

Sequential PUSH and POP instructions to and from the same destination can be omitted entirely:

PUSH A
POP A

to:

; Nothing

PUSH overwritten by SP increment

A PUSH instruction immediately followed by an increment of the stack pointer (SP) effectively "erases" the value pushed onto the stack:

PUSH 0x0001
ADD SP, 2 ; Second operand can be any value above 0

to:

ADD SP, 1 ; Second operand must be decremented to account for the missing PUSH!

  • If the ADD specifically only adds 1 to the stack pointer, both instructions can be removed.
  • This is only possible if the second operand of ADD is an immediate as we need to decrement the value.
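
The stack-related rewrites in this section could be sketched as a second peephole pass. Again this is only an illustration under assumptions: `Line` and `optimize_stack` are hypothetical names, the sketch requires both instructions to be unlabeled, and it skips PUSH/POP pairs whose operands mention SP.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Line:
    op: str
    args: Tuple[str, ...] = ()
    label: Optional[str] = None


def _immediate(text: str) -> Optional[int]:
    """Parse an integer literal such as '2' or '0x2'; None if not an immediate."""
    try:
        return int(text, 0)
    except ValueError:
        return None


def optimize_stack(lines: List[Line]) -> List[Line]:
    out: List[Line] = []
    i = 0
    while i < len(lines):
        cur = lines[i]
        nxt = lines[i + 1] if i + 1 < len(lines) else None
        if (nxt is not None and cur.op == "PUSH" and len(cur.args) == 1
                and cur.label is None and nxt.label is None):
            src = cur.args[0]
            # PUSH x / POP y becomes MOV y, x (or nothing when x and y match).
            if (nxt.op == "POP" and len(nxt.args) == 1
                    and "SP" not in src and "SP" not in nxt.args[0]):
                if nxt.args[0] != src:
                    out.append(Line("MOV", (nxt.args[0], src)))
                i += 2
                continue
            # PUSH x / ADD SP, n becomes ADD SP, n-1 (or nothing when n == 1).
            if nxt.op == "ADD" and len(nxt.args) == 2 and nxt.args[0] == "SP":
                n = _immediate(nxt.args[1])
                if n is not None and n > 0:
                    if n > 1:
                        out.append(Line("ADD", ("SP", str(n - 1))))
                    i += 2
                    continue
        out.append(cur)
        i += 1
    return out
```

The sketch does not model any status flags that the ADD might update, which a real implementation would need to consider before shrinking or removing it.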

Assembler should split strings into multiple DW lines/operands to support escape characters which aren't supported by textual MAR

For example, the MAR assembler does not support \0 (the null character) as an escape character. The MMAR assembler should assemble:

DW "Hello \0 Null"

into:

DW "Hello ", 0x0, " Null"
; OR
DW "Hello "
DW 0x0
DW " Null"

MAR only supports the Java escape characters found here (not including escape sequences for char literals).

In the future, this will allow the assembler to support more advanced escape sequences such as \x1234 (putting the word 0x1234 at that position in the string).
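
The splitting step could be sketched as below. This is an illustration under assumptions: `split_dw_string`, `PASS_THROUGH`, and `SPLIT_OUT` are hypothetical names, and the set of escapes that textual MAR accepts is a guess that should be checked against the list referenced above.

```python
from typing import List

# Assumption: escapes that textual MAR accepts and that are passed through as-is.
PASS_THROUGH = set("tbnrf\"'\\")
# Escapes this sketch lowers to separate numeric operands.
SPLIT_OUT = {"0": 0x0}


def split_dw_string(body: str) -> List[str]:
    """Split the text between the quotes of a DW string into DW operands."""
    operands: List[str] = []
    chunk = ""
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            chunk += ch
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash at end of string")
        esc = body[i + 1]
        if esc in SPLIT_OUT:
            if chunk:                      # flush the text collected so far
                operands.append(f'"{chunk}"')
                chunk = ""
            operands.append(hex(SPLIT_OUT[esc]))
        elif esc in PASS_THROUGH:
            chunk += ch + esc              # leave the escape for MAR to handle
        else:
            raise ValueError(f"unsupported escape \\{esc}")
        i += 2
    if chunk:
        operands.append(f'"{chunk}"')
    return operands


print("DW " + ", ".join(split_dw_string(r"Hello \0 Null")))
# prints: DW "Hello ", 0x0, " Null"
```

Supporting a sequence like \x1234 later would mean adding another branch that reads the following hex digits and emits them as one numeric operand.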

The disassembler should be able to read relocation sections

Currently, programs with relocation sections break how the disassembler labels jump and call targets since it doesn't know the real start of the program. A flag should be added to the disassembler to tell it to try and parse the relocation section. Additionally, the relocation section should be displayed in a separate section before the main disassembly.

The #scope macro

The #scope macro (and its counterpart #endscope) would allow users to scope identifiers. This assists in reducing the number of identifiers (mainly labels) that exist in the global scope. Often, labels are specific to a piece of code and are not meant to be used elsewhere, but still must have a unique name across the entire program.

Identifiers defined inside of a scope must only be unique within that scope. They may, however, 'shadow' identifiers defined outside of the scope.

Scopes may be optionally named to allow access to nested identifiers.

Example 1 - Unnamed scopes

The following example creates a scope around a block of code. Any identifiers defined inside of this scope cannot be referenced outside. The labels do_thing and end only have to be unique within the scope, allowing the names to be nice and short.

some_procedure:
#scope
  cmp A, 0
  jz do_thing
  
  push 0
  jmp end

  do_thing:
  push 1

  end:
#endscope

Example 2 - Named scopes

Sometimes, users may want to allow access to identifiers but still keep them out of the global scope. Scopes can optionally be named to primarily allow two things: referencing 'local' labels inside of a procedure, or scoping multiple procedures under the same name. Nested identifiers may be referenced by using the dot (.) accessor on the scope identifier.

#scope procedure
  option_a:
  push 0
  jmp end

  option_b:
  push 1

  end:
#endscope

; ...

jmp procedure.option_a ; Pushes 0
jmp procedure.option_b ; Pushes 1

Named scopes may also be used to create "modules" or "packages" of procedures:

#scope utils
  do_thing:
  #scope
    cmp A, 0
    jz end

    push 0

    end:
  #endscope

  do_other_thing:
  #scope
    mov A, 10
    loop:
    dec A
    cmp A, 0
    jnz loop
  #endscope
#endscope

; ...

jmp utils.do_thing
jmp utils.do_other_thing
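
One way the lookup rules in this proposal could work is sketched below. This is an assumption-laden illustration, not a specification: the `Scope` class and its methods are hypothetical, and the addresses in the usage example are made up.

```python
class Scope:
    """A scope holding identifiers, with optional named child scopes."""

    def __init__(self, parent=None, name=None):
        self.parent = parent
        self.symbols = {}   # identifier -> value (e.g. a label address)
        self.named = {}     # scope name -> child Scope
        if parent is not None and name is not None:
            parent.named[name] = self   # unnamed scopes stay unreachable

    def define(self, ident, value):
        # Identifiers only need to be unique within their own scope.
        if ident in self.symbols:
            raise ValueError(f"duplicate identifier {ident!r}")
        self.symbols[ident] = value

    def resolve(self, path):
        """Resolve 'name' or 'scope.name', searching outward from this scope."""
        first, *rest = path.split(".")
        scope = self
        while scope is not None:
            if not rest and first in scope.symbols:
                return scope.symbols[first]     # innermost definition wins
            if rest and first in scope.named:
                return scope.named[first]._descend(rest)
            scope = scope.parent
        raise KeyError(path)

    def _descend(self, parts):
        scope = self
        for part in parts[:-1]:
            scope = scope.named[part]
        return scope.symbols[parts[-1]]


# Mirrors the 'utils' example above, with made-up addresses.
root = Scope()
utils = Scope(root, "utils")
utils.define("do_thing", 0x10)
body = Scope(utils)            # unnamed scope after do_thing:
body.define("end", 0x14)
```

With this setup, `root.resolve("utils.do_thing")` finds the label through the named scope, while `root.resolve("end")` raises `KeyError` because `end` lives in an unnamed scope.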

Notes

Obsoletes #19.

Create a MAR disassembler

Now that the MMAR assembler supports binary output, a disassembler is going to be really important for debugging user code and verifying that the assembler outputs correct binary.

Simple initial features

  • Takes 1 MAR binary file as input
  • Outputs a textual MAR source file as output
  • Comments should replace instructions that could not be read

Future features

  • How to deal with valid data that aren't instructions? (will be moved to a separate issue)

Document how the ORG directive affects assembled binary

The biggest difference right now is that the default ORG value differs from the MAR assembler's. MAR defaults to 0x0200, while the toolkit's assembler defaults to 0x0000. This technically isn't a compliance issue with MAR, since it only matters when compiling to binary: users can't simply compile programs into binary via the UI and use them elsewhere.

This also changes when relocation table support is added to the binary assembler.

Identifiers which start with an underscore could be scoped to the file

The lack of scoping in MMAR makes it difficult to make large programs. One way to add this could be to treat label and constant identifiers which start with an underscore as being private to the file they're defined in.

This does not solve the problem of nesting labels however, and may just encourage users to split each function into its own file.

Document how MMAR deals with sections

Edit: The behavior documented in this issue is not how the assembler currently works. See the official docs for the updated version.


Note: Despite this issue containing basically the entire documentation, it's not documented already because this behavior needs to be ironed out a little.

The following example needs to be explained (possibly in the file modularity section):

lib.mmar

.data
  label: DW 0x0001

.text
  mov A, [label]

main.mmar

#include "lib.mmar"

.text
  brk

which results in:

.text
  mov A, [label]
  brk

.data
  label: DW 0x0001

The documentation should also note how the current section being parsed carries over from includes, explained with an example like the following (although maybe this behavior should change?):

lib.mmar

.data
  label: DW 0x0002

main.mmar

#include "lib.mmar"

mov X, [label]

.text
  mov Y, X
  brk

which results in:

.text
  mov Y, X
  brk

.data
  label: DW 0x0002
  mov X, [label]

Add the unary negation operator to MMAR

MAR allows negative integer literals. MMAR should treat '-' as a unary operator to support both negative integer literals and integer negation in constant expressions.

For example, these should be valid:

EXAMPLE_1 equ -5
EXAMPLE_2 equ -0x0001
EXAMPLE_3 equ -EXAMPLE_1 - 30

which results in the following MAR:

EXAMPLE_1 equ -0x5
EXAMPLE_2 equ -0x1
EXAMPLE_3 equ 0xFFE7
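
The value of EXAMPLE_3 follows from 16-bit two's-complement wrapping, which can be checked with a short sketch. The helper name `to_word` is hypothetical, and the 16-bit word size is inferred from the 0xFFE7 result above.

```python
WORD_MASK = 0xFFFF  # assumption: constants are stored as 16-bit words


def to_word(value: int) -> int:
    """Wrap a signed integer into an unsigned 16-bit word."""
    return value & WORD_MASK


EXAMPLE_1 = -5
# Unary minus binds tighter than binary minus: (-EXAMPLE_1) - 30 = 5 - 30 = -25
EXAMPLE_3 = -EXAMPLE_1 - 30

print(f"0x{to_word(EXAMPLE_3):04X}")  # prints: 0xFFE7
```

If the negation were instead applied to the whole expression, the result would be -(EXAMPLE_1 - 30) = 35, so operator precedence matters here.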

Allow MMAR integer literals to separate digits with underscores

Using binary literals is quite challenging since a full 16-bit binary literal looks like 0b0000000000000000. This issue proposes to allow underscores ('_') to separate digits so that the previous example can instead be written as 0b0000_0000_0000_0000.

Underscores should not be allowed to prefix the literal or prefix the base prefix.

Invalid examples:

  • _0xFF
  • 0_xFF
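
A literal parser following this proposal could be sketched as below. This is an illustration under assumptions: `parse_int_literal` is a hypothetical name, and the sketch only accepts single underscores between digits, which is stricter than the issue text requires (it also rejects forms like a trailing or doubled underscore, which the proposal does not address).

```python
import re

# One pattern per base: digits, optionally separated by single underscores.
_PATTERNS = [
    (re.compile(r"0[xX]([0-9A-Fa-f]+(?:_[0-9A-Fa-f]+)*)"), 16),
    (re.compile(r"0[bB]([01]+(?:_[01]+)*)"), 2),
    (re.compile(r"([0-9]+(?:_[0-9]+)*)"), 10),
]


def parse_int_literal(text: str) -> int:
    """Parse a hex, binary, or decimal literal that may contain underscores."""
    for pattern, base in _PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            return int(match.group(1).replace("_", ""), base)
    raise ValueError(f"invalid integer literal: {text!r}")


print(parse_int_literal("0b0000_0000_0000_0000"))  # prints: 0
print(parse_int_literal("0xFF_FF"))                # prints: 65535
```

Both invalid examples above (`_0xFF` and `0_xFF`) fail to match any pattern and raise `ValueError`.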
