miniforth

miniforth is a real mode FORTH that fits in an MBR boot sector. The following standard words are available:

+ - ! @ c! c@ dup drop swap emit u. >r r> [ ] : ; load

Additionally, there is one non-standard word. s: ( buf -- buf+len ) will copy the rest of the current input buffer to buf, and terminate it with a null byte. The address of said null byte will be pushed onto the stack. This is designed for saving the code being run so that it can later be put in a disk block, when no block editor is available yet.
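The stack effect of s: can be modelled with a short Python sketch. This is only an illustration of the behaviour described above, not the actual implementation; the function name s_colon, the flat memory array, and the example addresses are assumptions made for the example.

```python
def s_colon(memory: bytearray, buf: int, rest_of_input: bytes) -> int:
    """Model of `s: ( buf -- buf+len )`.

    Copies the rest of the input buffer to memory[buf:], writes a
    terminating null byte, and returns the address of that null byte.
    """
    end = buf + len(rest_of_input)
    memory[buf:end] = rest_of_input
    memory[end] = 0
    return end


# Hypothetical session: two lines saved back to back at address 0x1000.
memory = bytearray(0x10000)
line1 = b": double dup + ;"
line2 = b" : quadruple double double ;"

end1 = s_colon(memory, 0x1000, line1)
# The returned address can be fed straight into the next call, which
# overwrites the previous null terminator and appends the new line.
end2 = s_colon(memory, end1, line2)
saved = bytes(memory[0x1000:end2])
```

Because the result is the address of the terminator, consecutive uses of s: append to the same buffer without any extra bookkeeping.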

The dictionary is case-sensitive. If a word is not found, it is converted into a number with no error checking. For example, g results in the decimal 16, extending the 0123456789abcdef of hexadecimal. On boot, the number base is set to hexadecimal.
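To see why g comes out as 16, here is a minimal Python sketch of digit conversion without error checking. It is an assumption about the general approach rather than a transcription of the boot sector's code; in particular, the 16-bit wraparound and the treatment of characters other than 0-9 and lowercase letters are guesses.

```python
def parse_number(word: str, base: int = 16) -> int:
    """Convert a word to a number without validating its digits."""
    value = 0
    for ch in word:
        if ch <= "9":
            digit = ord(ch) - ord("0")
        else:
            # Letters simply continue counting after 9: a=10 ... f=15, g=16, ...
            digit = ord(ch) - ord("a") + 10
        # Assumption: cells are 16 bits wide, so the result wraps around.
        value = (value * base + digit) & 0xFFFF
    return value


assert parse_number("ff") == 255
assert parse_number("g") == 16    # one past f, as described above
assert parse_number("1g") == 32   # 1 * 16 + 16
```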

Backspace works, but not how you're used to: the erased input will still be visible on screen until you write something else.

Various aspects of this project's internals are described in detail on my blog.

Trying it out

You can either build a disk image yourself (see below), or download one from the releases page.

When Miniforth boots, no prompt will be shown on the screen. However, if what you're typing is being shown on the screen, it is working. You can:

- do some arithmetic: 1 2 + u.
- load additional functionality from disk: 1 load (see Onwards from miniforth below).

Building a disk image

You will need yasm and python3, which you can obtain with nix-shell or your package manager of choice. Then run ./build.sh.

This will create the following artifacts:

- boot.bin - the built bootsector.
- uefix.bin - the chainloader (see below).
- disk.img - a disk image with the contents of block*.fth installed into the blocks.
- boot.lst - a listing with the raw bytes of each instruction. Note that the dd 0xdeadbeef are removed by compress.py.

The build will print the number of used bytes, as well as the number of block files found. You can run the resulting disk image in QEMU with ./run.sh, or pass ./run.sh boot.bin if you do not want to include the code from *.fth in your disk. QEMU will run in curses mode; to exit, press Alt + 2, then type q and press Enter.

UEFI sucks, or `uefix.bin`

Running Miniforth on real hardware is certainly possible. In fact, I hardly use emulation for it these days. Anything not unreasonably old should work. However, most implementations of UEFI have a bug/misfeature where they try to parse the partition table of an MBR before booting from it. As Miniforth does not include a partition table in the bootsector (as there is simply no space for it), this prevents booting on most UEFI computers.

As a workaround, this repository includes a small chainloader. So, instead of this disk layout, which works on sufficiently old computers:

LBA 0   - boot.bin
LBA 1   - unused
LBA 2-3 - Forth block 1
...       ...

...the disk image generated by build.sh looks as follows:

LBA 0   - uefix.bin
LBA 1   - boot.bin
LBA 2-3 - Forth block 1
...       ...

Blocks

load ( blk -- ) loads a 1K block of FORTH source code from disk and executes it. All other block operations are deferred to user code. Thus, after appropriate setup, one can get an arbitrarily feature-rich system by simply typing 1 load — see Onwards from miniforth below.

Each pair of sectors on disk forms a single block. Block number 0 is partially used by the MBR, and is thus reserved.
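The block-to-sector mapping described above can be written down as a small Python sketch. It is an illustration only, not code from this repository; the helper names are made up, and the 512-byte sector size is an assumption that follows from a 1K block spanning two sectors.

```python
SECTOR_SIZE = 512        # assumed: two sectors make up one 1K block
SECTORS_PER_BLOCK = 2
BLOCK_SIZE = SECTOR_SIZE * SECTORS_PER_BLOCK


def block_to_lba(blk: int) -> tuple[int, int]:
    """Return the first and last LBA occupied by block number blk."""
    first = blk * SECTORS_PER_BLOCK
    return first, first + SECTORS_PER_BLOCK - 1


def block_offset(blk: int) -> int:
    """Return the byte offset of block number blk within a disk image."""
    return blk * BLOCK_SIZE


assert block_to_lba(1) == (2, 3)   # matches "LBA 2-3 - Forth block 1"
assert block_to_lba(0) == (0, 1)   # overlaps the boot sector, hence reserved
assert block_offset(1) == 1024
```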

System variables

Due to space constraints, variables such as STATE or BASE couldn't be exposed by creating separate words. Depending on the variable, the address is either hardcoded or pushed onto the stack on boot:

- >IN is a word at 0xa02. It stores the pointer to the first unparsed character of the null-terminated input buffer.
- The stack on boot is LATEST STATE BASE HERE #DISK (with #DISK on top).
- STATE has a non-standard format - it is a byte, where 0 means compiling, and 1 means interpreting.
- #DISK is not a variable, but the saved disk number of the boot media.
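As a reading aid, the following Python sketch labels the values found on the stack at boot and decodes the STATE byte. The helper names and the example addresses are hypothetical; only the ordering and the 0/1 encoding come from the description above.

```python
# Bottom-to-top order of the stack on boot, as listed above.
BOOT_STACK_ORDER = ("LATEST", "STATE", "BASE", "HERE", "#DISK")


def label_boot_stack(stack: list[int]) -> dict[str, int]:
    """Pair each of the five boot-time stack entries with its name."""
    return dict(zip(BOOT_STACK_ORDER, stack[-len(BOOT_STACK_ORDER):]))


def describe_state(state_byte: int) -> str:
    """Decode STATE; note that it is inverted relative to standard Forth."""
    return {0: "compiling", 1: "interpreting"}.get(state_byte, "unknown")


# Hypothetical values; the real addresses depend on the build.
labels = label_boot_stack([0x7C10, 0x7C20, 0x7C30, 0x7C40, 0x80])
first_popped = BOOT_STACK_ORDER[-1]   # "#DISK" is on top, so it comes off first
```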

Onwards from miniforth

The main goal of the project is bootstrapping a full system on top of Miniforth as a seed. Thus the repository also contains various Forth code that may run on top of Miniforth and extend its capabilities.

In bootstrap.fth (1 load):
- A simple assembler is implemented, and then used to implement additional primitives, which wouldn't fit in Miniforth itself. This includes control flow words like IF/THEN and BEGIN/UNTIL, as well as calls to the BIOS disk interrupt to allow manipulating the code on disk.
  
  For the syntax of the assembler, see No branches? No problem — a Forth assembler.
- Exception handling is implemented, with semantics a little different from standard Forth. See Contextful exceptions with Forth metaprogramming.
- A separate, more featureful outer interpreter overrides the one built into Miniforth, to correct the ugly backspace behavior and handle things such as uncaught exceptions and vocabularies.
- A way of searching for occurrences of a particular string in the code stored in the blocks is provided:
  - 10 20 grep create searches blocks $10 through $20 inclusive for occurrences of create
  - If your search term includes spaces, use grep" — the syntax is similar to s" string literals: 10 20 grep" : >number"
In editor.fth (30 load), a vi-like block editor is implemented. It can be started with e.g. 10 edit to edit block 10.
- Non-standard keybindings:
  - Q to quit back to the Forth REPL.
  - [ to look at the previous block.
  - ] to look at the next block.
- After first use, you can use the shorthand ed to reopen the last-edited block.
- Use run to execute the last-edited block. This sets a flag to prevent a chain of --> from loading all the subsequent blocks.
- Changes are saved to disk whenever you use run or open a different block with edit or the [/] keybinds. You can also trigger this manually with save.

All this code was originally developed within Miniforth itself, which meant it was stored within a disk image — a format that's not very friendly to tooling like Git or GitHub's web interface. This disparity is handled by two Python scripts:

- mkdisk.py takes the files and merges them into a bootable disk image;
- splitdisk.py extracts the code from a disk image's blocks and splits it into files.
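The round trip between source files and a disk image can be sketched as follows. This is not the code of mkdisk.py or splitdisk.py, and the function names are invented for the example. It assumes the chainloader layout shown earlier (uefix.bin at LBA 0, boot.bin at LBA 1, block n at byte offset n * 1024) and, as a further assumption, pads each block with spaces.

```python
SECTOR = 512
BLOCK = 1024


def make_image(uefix: bytes, boot: bytes, blocks: dict[int, bytes]) -> bytes:
    """Merge boot code and numbered source blocks into one disk image."""
    if len(uefix) > SECTOR or len(boot) > SECTOR:
        raise ValueError("boot code must fit in one sector")
    top = max(blocks, default=0)
    image = bytearray((top + 1) * BLOCK)          # zero-filled
    image[0:len(uefix)] = uefix                   # LBA 0
    image[SECTOR:SECTOR + len(boot)] = boot       # LBA 1
    for blk, source in blocks.items():
        if blk < 1:
            raise ValueError("block 0 is reserved for the boot code")
        if len(source) > BLOCK:
            raise ValueError(f"block {blk} does not fit in 1K")
        # Assumption: unused space in a block is filled with spaces.
        image[blk * BLOCK:(blk + 1) * BLOCK] = source.ljust(BLOCK, b" ")
    return bytes(image)


def split_image(image: bytes) -> dict[int, bytes]:
    """Extract the non-empty blocks from a disk image, skipping block 0."""
    blocks = {}
    for blk in range(1, len(image) // BLOCK):
        data = image[blk * BLOCK:(blk + 1) * BLOCK].rstrip(b" \x00")
        if data:
            blocks[blk] = data
    return blocks


sources = {1: b": double dup + ;", 3: b": quadruple double double ;"}
image = make_image(b"\xeb\xfe", b"\x90\x90", sources)
recovered = split_image(image)
```

In this sketch, trailing padding is stripped on the way out, so sources that do not end in whitespace survive the round trip unchanged.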

Free bytes

At this moment, not counting the 55 AA signature at the end, 499 bytes are used, leaving 11 bytes for any potential improvements.

Byte saving leaderboard:

- Ilya Kurdyukov saved 24 bytes. Thanks!
- Peter Ferrie saved 5 bytes. Thanks!

If a feature is strongly desirable, potential tradeoffs include:

- 1 byte: Use a SPECIAL_BYTE for compression such that it can be turned into 0xad with inc [di-1] or another instruction of the same size. This has the disadvantage that avoiding occurrences of SPECIAL_BYTE becomes harder, and the solution of simply changing the special byte no longer works.
- ?? bytes: Don't expose the BASE variable and hardcode hexadecimal instead; as it turns out, BASE is not that useful. The current bootstrap doesn't make use of BASE in the initial interpreter.
- 7 bytes: Remove the - word (with the expectation that the user will assemble their own primitives later anyway).
- 6 bytes: Remove the + word (with the expectation that the user will define : negate 0 swap - ; : + negate - ;).
  - Note that bootstrapping with neither + nor - would be, to put it mildly, quite hard.
- 12 bytes: Remove the emit word.
- 9 bytes: Don't push the addresses of variables kept by self-modifying code. This essentially changes the API with each edit (NOTE: it's 9 bytes because this makes it beneficial to keep >IN in the literal field of an instruction).
- ?? bytes: Instead of storing the names of the primitives, let the user pick their own names on boot. This would take very little new code, since the decompressor would simply have to borrow some code from :. However, reboots would become somewhat bothersome.
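The claim that + can be rebuilt from - (via : negate 0 swap - ; : + negate - ;) can be checked with a small Python model. It assumes 16-bit cells with wraparound, which is the natural cell size in real mode but is not stated explicitly above; the function names are only for illustration.

```python
MASK = 0xFFFF  # assumption: 16-bit cells that wrap around


def sub(a: int, b: int) -> int:
    """The primitive - word: ( a b -- a-b )."""
    return (a - b) & MASK


def negate(n: int) -> int:
    """: negate 0 swap - ;   which computes 0 - n."""
    return sub(0, n)


def add(a: int, b: int) -> int:
    """: + negate - ;   which computes a - (0 - b)."""
    return sub(a, negate(b))


# The derived + agrees with ordinary 16-bit addition, including overflow.
for a in (0, 1, 0x1234, 0x7FFF, 0x8000, 0xFFFF):
    for b in (0, 1, 0x00FF, 0x8000, 0xFFFF):
        assert add(a, b) == (a + b) & MASK
```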
