
*** Armish - An assembler quite contraire

Armish is the Amish of the computer-world. The Amish plough forth through life
under the twin burden of an extreme religion and an accompanying extreme way of
life. As a result they're marginalized and they watch life pass them by. But
they might just hold the seed for a future way of living, you know, with global
warming and all. It has worked in the past; you never know. You never know, I
say.

Armish ploughs through bit-space under the twin burden of being an assembler
and being written in lisp. As a result it will be marginalized and it will
watch with an unending waxing headache how programmers pass it by, riding their
high octane steeds of high-level languages. But it might just hold the seed for
a future mass acceptance of languages long forgotten, garbage collected os-ses
of yonder, recursive programming-joys untold... (*sigh*) ...  You never
know... You never know, da#@^$it!!



** License - LLGPL, see the LICENSE file

** Authors

- Jeff Massung 
- Ties Stuij



** Concept

Armish is an assembler like any other, but with parentheses around the
instructions. Its aim is to be a bit on the general side. It assembles arm and
thumb code, but new features of the latest processors are not (yet/ever)
supported. Those processors are quite backwards compatible though, last time I
checked. Armish should be usable for arm architectures 3 through 5 (up to arm9),
and of those just the core: no enhanced dsp instructions or such. Complaints
about missing features are welcome. See Liards for a lib that actually uses it.



** Installation

To get the latest development version, do a darcs get:

darcs get http://common-lisp.net/project/armish/darcs/armish

Versioned releases are on hold for now.

Armish depends on Umpa-Lumpa, Arnesi, Split-Sequence and FiveAM.

darcs get http://common-lisp.net/project/liards/darcs/umpa-lumpa
darcs get http://common-lisp.net/project/bese/repos/arnesi_dev
darcs get http://common-lisp.net/project/bese/repos/fiveam
http://ww.telent.net/cclan/split-sequence.tar.gz

Once you've got those wired into your asdf machinery together with Armish,
just fire it up.


** Testing

You can check that (a subset of) the standard instructions work. If you're not
on linux or windows, first build the reference assembler from the supplied c
source and be sure to name the result aasm. For linux the executable is already
there. Make the executable executable (`chmod 755 aasm' or equivalent) and then
evaluate:

(run! 'arm-suite)

There might be errors here caused by my faulty attempt to support your
implementation or operating system. The test calls an external assembler
supplied with Armish to check the instructions, but the functions for reading
from and writing to other processes are implementation-dependent. Armish is
only tested on sbcl on Linux, so if you can, please fix this by editing run-prog
and process-output in helpers.lisp and control-check in test.lisp. Oh! And send
a fix if you want! Thanks. Or just remove your implementation from the two
feature-query lists in control-check.

The test compares the output of Armish with that of a reference implementation
included in this release. This shouldn't be too interesting for the average
user; I myself would at least expect the tested functionality to work. It might
give you a warm fuzzy feeling to know this assembler doesn't just fool you
along, like it would me. Perhaps more interesting is that you can check the
syntax of this assembler by looking at the test cases in the test.lisp file.

More practically, if you do feel the need to hack on this assembler-thing (I
for one encourage it), you'll find some handy functions in the test.lisp file
that can help with debugging.



** Exported functions

- assemble - assembles forms into a list of opcode bytes for the specified chip
and processor mode.

syntax: (assemble chip mode forms) where:

chip - decides which chipset to assemble for - is a number or a symbol

Currently assemble accepts the first value of each pairing below and translates
it into the second. The higher the value, the newer the chip. Check the function
get-version to see the latest supported symbols or to add your own.

accepted     translates to

0            0
'all         0
3            3
4            4
'version-4   4
'4t          4.2
'ARM7TDMI    4.2
'arm7        4.2
5            5
'version-5   5
'5TExP       5.3
'5TE         5.4
'ARM946E-S   5.4
'arm9        5.4


mode - decides if you want to start in arm or thumb mode - accepts one of the
following symbols: 'arm 'code32 'thumb 'code16


forms - are, yes, the forms to be assembled

example usage:
(assemble 'arm9 'arm
  '(:label (mov r3 r4) (b :label)))

==> (4 48 160 225 253 255 255 234)
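
The result is a flat list of bytes. As a rough aid for reading it, the sketch below regroups the bytes into 32-bit words, assuming little-endian byte order (which the example output is consistent with). bytes->words is a hypothetical helper, not something Armish exports.

```lisp
;; Hypothetical helper, not part of Armish: regroup a flat byte list into
;; 32-bit words, assuming little-endian byte order.
(defun bytes->words (bytes)
  (loop for (b0 b1 b2 b3) on bytes by #'cddddr
        ;; Missing trailing bytes are treated as zero.
        collect (logior b0
                        (ash (or b1 0) 8)
                        (ash (or b2 0) 16)
                        (ash (or b3 0) 24))))

(format t "~{~8,'0X~^ ~}~%"
        (bytes->words '(4 48 160 225 253 255 255 234)))
;; prints: E1A03004 EAFFFFFD
```

Read this way, E1A03004 is the arm encoding of mov r3, r4, and EAFFFFFD is an unconditional branch with a word offset of -3, which lands back on the label at address 0.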


- emit-asm - emits a list of assembly forms ready to be fed to assemble. Symbols
that are part of the assembler syntax are emitted as they are; variables that
are not part of the assembler syntax are replaced by their values, so we can
enrich the assembler with variables while still keeping a clean, uniform syntax.
Inside emit-asm, wrap forms you want evaluated as Common Lisp in (ea ...), as in
escape assembler.

syntax: (emit-asm form1 form2 ...)

example usage:

(let ((foo 'r4)
      (bar 'r6))
  (emit-asm
   :loop
   (stmib r3 (r3 foo_bar))
   (b :loop)))

==> (:LOOP (STMIB R3 (R3 R4_R6)) (B :LOOP))


- align - pads a list of bytes with zeroes until its length is a multiple of
bytes. bytes defaults to four if not supplied.

syntax: (align byte-lst &optional bytes)

example usage: (align '(1 2 3 4 5)) ==> (1 2 3 4 5 0 0 0)


- aligned - returns address rounded up to the next bytes-byte boundary. bytes
defaults to four if not supplied.

syntax: (aligned address &optional bytes)

example usage: (aligned (length '(1 2 3 4 5))) ==> 8
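
The relation between align and aligned can be sketched in a few lines of plain Common Lisp. This is a model of the documented behaviour, not Armish's source: the sketch- names are made up, and it assumes an address that is already aligned is returned unchanged.

```lisp
;; Sketch only; the sketch- prefix marks these as stand-ins, not Armish's code.
(defun sketch-aligned (address &optional (bytes 4))
  "Round ADDRESS up to the next multiple of BYTES.
Assumption: an already aligned ADDRESS is returned unchanged."
  (* bytes (ceiling address bytes)))

(defun sketch-align (byte-lst &optional (bytes 4))
  "Pad BYTE-LST with zeroes until its length is a multiple of BYTES."
  (let ((padding (- (sketch-aligned (length byte-lst) bytes)
                    (length byte-lst))))
    (append byte-lst (make-list padding :initial-element 0))))

(sketch-aligned 5)            ; => 8
(sketch-align '(1 2 3 4 5))   ; => (1 2 3 4 5 0 0 0)
(sketch-align '(1 2 3) 2)     ; => (1 2 3 0)
```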

Some other functions are exported as well (actually just one at the time of
writing), but those are simple helper functions that might aid the programmer
on a general level. I had a separate package for them, but decided to cut it to
keep the package count down. I might reintroduce it though. Look in helpers.lisp
for functions that might aid you.


- set-armish-string-encoding - sets the armish string encoding. Armish passes
the encoding to the arnesi string-to-octets function, which tries to do the right
thing. From the documentation: "We guarantee that :UTF-8, :UTF-16 and :ISO-8859-1 will
work as expected. Any other values are simply passed to the underlying lisp's
function and the results are implementation dependent."

syntax: (set-armish-string-encoding :keyword)

example usage: (set-armish-string-encoding :ISO-8859-1) ==> :ISO-8859-1



** Instruction syntax

Basically this assembler follows the arm assembler syntax, but lispified: wrap
the expressions in parentheses, get rid of the commas, replace curly braces and
square brackets with parentheses, replace - with _ in the register lists of
multiple load/store instructions, and get rid of the pound signs before
literals and immediates. Barrel shifter modifiers and labels are keywords.

examples:
standard assembly syntax             lisp syntax

ldmhied r12!, {r2, r15, r13-r14}  -> (ldmhied r12! (r2 r15 r13_r14))
ldrbt r5, [r2], -r1, ror #12      -> (ldrbt r5 (r2) -r1 :ror 12)

For a pretty extensive, case by case comparison of the instructions, see the
test cases in the test.lisp file.
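
To make the r13_r14 range notation concrete, here is a small sketch that turns a lispified register list into the 16-bit register mask of an arm load/store multiple instruction, where bit n stands for register n. Both functions are hypothetical helpers for illustration, not Armish internals, and they only handle rN names and rN_rM ranges (no sp, lr or pc).

```lisp
;; Hypothetical helpers, not Armish's internals.
(defun register-number (name)
  "Parse a register name string such as \"R13\" into 13."
  (parse-integer name :start 1))

(defun register-list-mask (registers)
  "Build the 16-bit LDM/STM register mask for REGISTERS, a list of symbols
like r2 or ranges like r13_r14."
  (let ((mask 0))
    (dolist (reg registers mask)
      (let* ((name (symbol-name reg))
             (split (position #\_ name))
             (lo (register-number (subseq name 0 split)))
             (hi (if split
                     (register-number (subseq name (1+ split)))
                     lo)))
        ;; Set one bit per register in the (possibly single-element) range.
        (loop for n from lo to hi
              do (setf mask (logior mask (ash 1 n))))))))

(format t "~4,'0X~%" (register-list-mask '(r2 r15 r13_r14)))
;; prints: E004
```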



** Assembler format, features and conventions


* Directives

code16 - assemble as thumb
code32 - assemble as arm

pool - dump the literal pool

align - align code to a 4 byte boundary
align-hw - align code to a 2 byte boundary
(align &optional bytes) - align code to a bytes byte boundary; defaults to
four if bytes is not supplied

(dcb byte &rest bytes) - define one or more bytes
(byte byte &rest bytes) - same as dcb

(dcw byte &rest bytes) - define one or more 16 bit words
(hword byte &rest bytes) - same as dcw

(dcd byte &rest bytes) - define one or more 32 bit words
(word byte &rest bytes) - same as dcd

(dword byte &rest bytes) - define one or more 64 bit words
(quad byte &rest bytes) - same as dword

(bin byte-size bin-list) - a more general directive than the previous ones.
First specify how many bytes of storage space the data items of the bin-list
should take, then supply the list itself. For example assembling
(bin 2 (1 2 3 4)) results in (1 0 2 0 3 0 4 0) 

(binae list-of-bin-lists) - same as above, but now we specify a list of
(byte-size bin-list) lists. For example assembling
(binae ((2 (1 2)) (4 (3 4)))) results in (1 0 2 0 3 0 0 0 4 0 0 0)

(space size &optional (fill 0)) - pad the assembly output with size bytes, all
of value fill

"string" - a literal string is encoded to bytes using *string-encoding*
(defaults to :utf-8), at arnesi's string-to-octets discretion. No auto-align.

(string &rest strings) - the strings are concatenated and then encoded to bytes
using *string-encoding* (defaults to :utf-8), at arnesi's string-to-octets
discretion. If the symbol :null-terminated is present, the concatenated string
(so NOT each individual string) is terminated with *string-end* (defaults to 0).
The code that follows won't be automatically aligned.

:label - an unadulterated keyword will be treated as a label

For convenience the :code-end label is appended to the forms to be compiled, in
case one wants to jump to code placed directly after the compiled code.
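
The byte layout produced by the bin and binae directives can be modelled in plain Common Lisp. The sketch below reproduces the two documented results by writing each value as byte-size little-endian bytes. The sketch- functions are illustrative stand-ins, not the code Armish runs.

```lisp
;; Sketch only; these model the documented output, they are not Armish's code.
(defun sketch-bin (byte-size values)
  "Encode each of VALUES as BYTE-SIZE little-endian bytes."
  (loop for value in values
        append (loop for i below byte-size
                     collect (ldb (byte 8 (* 8 i)) value))))

(defun sketch-binae (bin-lists)
  "Encode a list of (byte-size values) pairs and concatenate the results."
  (loop for (byte-size values) in bin-lists
        append (sketch-bin byte-size values)))

(sketch-bin 2 '(1 2 3 4))              ; => (1 0 2 0 3 0 4 0)
(sketch-binae '((2 (1 2)) (4 (3 4))))  ; => (1 0 2 0 3 0 0 0 4 0 0 0)
```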


* pseudo-instructions

(ldr register literal) - loads the value of literal into register. Encodes in
two's complement
(ldr register literal :pi) - loads the value of literal into register, treating
it as a positive integer (or 0)

(adr register label) - loads the address of label into register
(nop) - no operation; translates into (mov r0 r0) in arm and (mov r8 r8) in
thumb
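
As a reminder of what two's complement encoding means for a literal, the sketch below shows the 32-bit pattern a negative value wraps to. It assumes literals occupy one 32-bit word; sketch-literal-word is a made-up name for illustration and says nothing about how Armish lays out its literal pool.

```lisp
;; Sketch only, assuming 32-bit literals.
(defun sketch-literal-word (literal)
  "Return the 32-bit pattern of LITERAL; negative values wrap to two's complement."
  (ldb (byte 32 0) literal))

(format nil "~8,'0X" (sketch-literal-word -2))  ; => "FFFFFFFE"
(sketch-literal-word 42)                         ; => 42
```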


* register and coprocessor syntax

Registers can be written in the familiar way: rx, where x is a number from 0 to
15. The link register, stack pointer and program counter can be written as lr,
sp and pc.

Coprocessor registers can be written as cx or crx where x is a number from 0 to
15.

Coprocessors can be written as px or cpx where x is a number from 0 to 15.
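
For reference, the usual arm numbering behind the aliases is sp = r13, lr = r14 and pc = r15. The sketch below maps a register symbol to its number on that basis. sketch-register-number is a hypothetical helper, not an Armish function, and it ignores the coprocessor names.

```lisp
;; Hypothetical helper, not part of Armish.
(defun sketch-register-number (reg)
  "Map a register symbol (r0..r15, sp, lr, pc) to its number, or NIL."
  (let ((name (string-upcase (symbol-name reg))))
    (cond ((string= name "SP") 13)
          ((string= name "LR") 14)
          ((string= name "PC") 15)
          ;; rN: an R followed by nothing but digits, with N in 0..15.
          ((and (> (length name) 1)
                (char= (char name 0) #\R)
                (every #'digit-char-p (subseq name 1)))
           (let ((n (parse-integer name :start 1)))
             (and (<= 0 n 15) n)))
          (t nil))))

(mapcar #'sketch-register-number '(r0 r12 sp lr pc r16))
;; => (0 12 13 14 15 NIL)
```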



** History

The core of the thing is a file called thumb.lisp, which Jeff Massung was kind
enough to dig up from his digital archive. It was a beta version of a thumb
assembler, and it assembled thumb opcodes if you fed it instructions. The Armish
team of one has expanded on it by modifying the thumb code a bit and by adding
arm instructions and facilities to make it more like a traditional assembler.
Who knows, maybe you can even have some use for it.



** Todo

- no arm or thumb adrl pseudo-instruction
- in the arm ldr instruction, encode a constant load more efficiently if possible
instead of always loading from memory
- write enhanced dsp instructions
- document code; think about a comment-extractor
