newlandsvalley / purescript-abc-parser

Yet another parser for the ABC Notation

License: MIT License

PureScript 98.61% Dhall 1.39%

abc abc-parser abc-notation

purescript-abc-parser's Introduction

purescript-abc-parser

This is a parser for version 2.1 of Chris Walshaw's ABC Notation which is primarily designed as an interchange format for scores of traditional music. Also included are functions to manipulate the parse tree in order to provide alteration of tempo, transposition, conversion to MIDI etc.

For more information, see the guide.

Motivation

The goal of this project is not to produce a general purpose parser for all forms of music in a wide variety of computational settings. Rather, it is to provide a tool that will parse an individual traditional tune when presented to it in a browser - either from a file or from keyed input. In particular, the parser is designed to handle the majority of tunes housed in the major Western European collections - particularly The Session, FolkWiki, Spillefolk and abcnotation.com.

Consequently, aspects of the spec that apply to other settings or other musical forms will be ignored or curtailed. In addition, parts of the specification marked as volatile will be treated as being non-normative and in some cases ignored.

It is assumed that it will work in cooperation with other modules which will be responsible for such aspects as editing, displaying or playing the score. It is a particular design aim to support editor applications such that a user may, if she prefers, edit the tune body before even thinking about the headers.

Support for ABC Version 2.2

As far as I can tell, ABC version 2.2 is also supported. Unfortunately, very many sections of this spec are still marked as volatile.

The main changes in the spec are to do with multiple voices and in particular, the manner in which clefs for a variety of (possibly transposing) instruments may be represented. This is not a problem for most traditional music collections. In this parser, clef descriptions (and all other voice properties) are parsed, but left predominantly untyped.

Support for Polyphony

There is a degree of support for polyphony in the Voice module. If the ABC contains multiple V: (voice) headers, then it gives a separate ABC tune for each voice. These can then be passed to a suitable polyphonic player. However, the Midi module remains monophonic.

Deviations from the spec

Tunebooks. Only one tune is allowed per file, and the file's text must be entirely dedicated to that tune. Consequently the need for free text or embedded fragments does not arise.
Typeset text. Not supported. It is assumed that any associated score-engraving software will include its own typesetting strategy.
Chord Symbols. Parsed but ignored (intentionally) in the MIDI module. These tend to sound terrible and, in my opinion, tend to be too dictatorial.
Decorations are supported against bar lines, notes and chords but are not currently supported against (the start of) tuplets or rests. You may, of course, decorate any note in a tuplet.
Mandatory information fields (headers). In an editor application, it is important to allow the user the option of first entering the notes and only later the information fields, whilst parsing the input after each keystroke. For this reason, mandatory headers are not enforced - it is assumed that later software modules will enforce them in many circumstances. In particular, the X:(reference-number) header has no usefulness in a browser setting.
Unicode escape sequences. Browsers have full Unicode support and I would expect users to use fully Unicode-aware editors these days, so this feature is ignored.

Issues

Slurs (represented by round brackets) are awkward. They seem to be impossible to match - for instance they can span across bars or even across separate lines of music. I attach them directly to the notes (or note groups) that delineate the slur. However, where the slur is not directly attached to the note (e.g. when attached to a broken rhythm operator) then the parser is lenient, accepting but discarding the slur bracket. Where a note is prefaced both by grace note(s) and an opening slur then the grace note must come before the slur bracket.
I have found no description of how a tuplet should be validated. Currently, tuplets must be completely contained within a bar and the number of items in the tuplet must agree with its signature. Spaces are allowed between the notes but tuplets may not be embedded, one inside the other.
Grace notes are not supported against chords. (I am unclear what the specification defines here with respect to grace notes and see note above.)
Grace notes are, however, supported against notes in all other contexts and attached to them directly, although optionally mediated by a left slur bracket.
In translating to MIDI, only a single voice is recognized.

To Build

npm run build

To Test

npm run test

purescript-abc-parser's Issues

Support comments at the end of header lines

2.2.5 Comments and remarks
A percent symbol (%) will cause the remainder of any input line to be ignored. It can be used to add a comment to the end of an abc line or as a comment line in its own right. (To get a percent symbol, type \% - see text strings.)

Alternatively, you can use the syntax [r:remark] to write a remark in the middle of a line of music.

Example:

    |:DEF FED| % this is an end of line comment
    % this is a comment line
    DEF [r:and this is a remark] FED:|
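
The comment rule quoted above can be sketched as follows. This is an illustration in Python rather than the project's PureScript, and `strip_comment` is a hypothetical helper, not part of the parser. It assumes that a backslash-escaped percent is a literal percent and does not start a comment.

```python
import re

# A '%' starts a comment unless it is escaped with a backslash
# (assumption: the escape is written as a backslash followed by '%').
_COMMENT = re.compile(r"(?<!\\)%.*$")


def strip_comment(line: str) -> str:
    """Return the line with any trailing % comment removed (hypothetical helper)."""
    return _COMMENT.sub("", line).rstrip()
```

The same function applies to header lines and music lines alike, since the rule covers "any input line"; the [r:remark] form is left untouched.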

Tuplet notes may be separated by spaces

I think this means both that we must parse the space properly and also count the notes. At the moment, the tuplet must contain at least one rest/note and the contents are terminated by the first space. There is no attempt to ensure the number of notes in the content equals the number required by the signature.
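
A rough sketch of what "parse the space and count the notes" might look like, in Python rather than PureScript. `read_tuplet` and the simplified note pattern are assumptions made for illustration; only the simple `(p` signature is handled and real ABC notes are considerably richer.

```python
import re

# Simplified note/rest pattern (an assumption, far short of full ABC):
# optional spaces, optional accidental, pitch letter or rest, octave marks, length.
_ITEM = re.compile(r" *(?:\^\^?|__?|=)?[A-Ga-gz][,']*\d*/*\d*")
_SIGNATURE = re.compile(r"\((\d)")


def read_tuplet(text):
    """Read a tuplet at the start of text.

    Returns (items, remaining_text), or None if the signature is missing
    or fewer notes are present than the signature requires.
    """
    signature = _SIGNATURE.match(text)
    if signature is None:
        return None
    expected = int(signature.group(1))
    pos = signature.end()
    items = []
    for _ in range(expected):
        item = _ITEM.match(text, pos)
        if item is None:
            return None  # too few notes for the signature
        items.append(item.group().strip())
        pos = item.end()
    return items, text[pos:]
```

Because exactly `expected` items are consumed, a space no longer terminates the tuplet, and a short tuplet is rejected rather than silently accepted.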

Extend the reach of the voice module

Currently, the module partitions the tune into separate voices when the tune body is tagged with inline voice headers:

    [V:1] abc... 
    [V:1] def...
    [V:2] GAB...
    [V:2] Cde...

but not when presented with free-standing voice headers:

    V:1
    abd... 
    def...
    V:2
    GAB...
    Cde

Support optional key-value pairs at the end of a key header

For example:

K:Gmin shift=GD

which indicates moveable pitch transpositions. This is needed for ABC 2.2 support.

Strip the quotes when parsing chord symbols

ABC defines these (awkwardly) as strings surrounded by double quotes. We should strip off the quotes before saving them in the ADT (and restore them in canonical). This is needed because we now intend to use them in melody and score modules.

Liberalise the handling of bar lines

The ABC specification is really messy here. After listing the various possibilities of thick/thin line combinations and showing that only the single thin line can have repeat markers against it, it throws it all away by saying:

Abc parsers should be quite liberal in recognizing bar lines. In the wild, bar lines may have any shape, using a sequence of | (thin bar line), [ or ] (thick bar line), and : (dots), e.g. |[| or [|:::

We need to improve things in order to accommodate the ::| or |:: multiple repeat marker. I think that the best we can do is firstly restrict ourselves to these bar line combinations:

   |    thin
   ||   thin-thin
   [|   thick-thin
   |]   thin-thick
   ]|   thick-thin

and then to allow any number up to a sensible maximum (0-2?) of repeat : markers both immediately before and after the bar lines, and also to allow the freestanding :: version as a synonym for :|:. Note that it would probably be tricky also to include the |[ combination because of the possibility of ambiguity when we come to adding variant markers. Here, as far as I can see, the following would all be legal:

   |1 
   [1
   | [1

I guess this construct is best parsed as a separate entity immediately after the bar line and colon(s).

As far as the ABC ADT is concerned, we retain the Begin/End/BeginAndEnd repeat descriptor but add to it an integer indicating the number of repeats. (Currently it is only 1.) Otherwise, no change.
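
The restricted grammar proposed above might be sketched like this (Python, for illustration only). `BarLine` and `read_bar_line` are hypothetical names; the sketch simply counts up to two colons on each side of one of the five listed shapes, treats a free-standing :: as :|:, and leaves any variant marker in the remaining text to be parsed separately.

```python
import re
from typing import NamedTuple, Optional, Tuple


class BarLine(NamedTuple):
    end_repeats: int    # number of ':' before the bar line
    shape: str          # one of |  ||  [|  |]  ]|
    begin_repeats: int  # number of ':' after the bar line


# Two-character shapes are tried before the single thin bar line.
_BAR = re.compile(r"(:{0,2})(\[\||\|\]|\]\||\|\||\|)(:{0,2})")


def read_bar_line(text: str) -> Optional[Tuple[BarLine, str]]:
    """Read a bar line at the start of text, returning it and the remaining text."""
    match = _BAR.match(text)
    if match:
        bar = BarLine(len(match.group(1)), match.group(2), len(match.group(3)))
        return bar, text[match.end():]
    if text.startswith("::"):
        # free-standing '::' is treated as a synonym for ':|:'
        return BarLine(1, "|", 1), text[2:]
    return None
```

For example, `read_bar_line("|1 abc")` yields a thin bar line with no repeats and leaves `1 abc` for a separate variant-marker parser.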

partitionVoices and partitionTuneBody should return a NonEmptyArray not an Array

This is because, if there are voices, the array is not empty and if there are not, the original tune is returned as a singleton.

Translation to MIDI ignores grace notes

Perhaps now is the time to support them.

Add support for the default unit note length

From 3.1.7 L: - unit note length

If there is no L: field defined, a unit note length is set by default, based on the meter field M:. This default is calculated by computing the meter as a decimal: if it is less than 0.75 the default unit note length is a sixteenth note; if it is 0.75 or greater, it is an eighth note. For example, 2/4 = 0.5, so, the default unit note length is a sixteenth note, while for 4/4 = 1.0, or 6/8 = 0.75, or 3/4= 0.75, it is an eighth note. For M:C (4/4), M:C| (2/2) and M:none (free meter), the default unit note length is 1/8.

At the moment we just use 1/8.
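
The rule from the spec is small enough to sketch directly. The following Python is an illustration, not the parser's code: `default_unit_note_length` is a hypothetical name, the meter is assumed to arrive as a (numerator, denominator) pair, and `None` stands in for M:none.

```python
from fractions import Fraction


def default_unit_note_length(meter):
    """Default L: value for a meter given as (numerator, denominator).

    None represents M:none (free meter). M:C and M:C| are assumed to have
    been resolved to (4, 4) and (2, 2) before this is called.
    """
    if meter is None:
        return Fraction(1, 8)
    numerator, denominator = meter
    # Exact rational comparison avoids any floating-point edge case at 0.75.
    if Fraction(numerator, denominator) < Fraction(3, 4):
        return Fraction(1, 16)
    return Fraction(1, 8)
```

With this, 2/4 gives a sixteenth note while 4/4, 6/8 and 3/4 all give an eighth note, matching the examples in the quoted text.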

Cannot find module 'Main'

How do you run the parser? When I do:
pulp run
I get

* Building project in C:\...\...\...\purescript-abc-parser
* Build successful.
internal/modules/cjs/loader.js:905
  throw err;
  ^

Error: Cannot find module 'Main'
Require stack:
- C:\...\...\AppData\Local\Temp\pulp-run2022222-14824-131d0pd.49pxi.js\index.js
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:902:15)
    at Function.Module._load (internal/modules/cjs/loader.js:746:27)
    at Module.require (internal/modules/cjs/loader.js:974:19)
    at require (internal/modules/cjs/helpers.js:93:18)
    at Object.<anonymous> (C:\...\...\AppData\Local\Temp\pulp-run2022222-14824-131d0pd.49pxi.js\index.js:1:1)
    at Module._compile (internal/modules/cjs/loader.js:1085:14)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1114:10)
    at Module.load (internal/modules/cjs/loader.js:950:32)
    at Function.Module._load (internal/modules/cjs/loader.js:790:12)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:76:12) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [
    'C:\\...\\...\\AppData\\Local\\Temp\\pulp-run2022222-14824-131d0pd.49pxi.js\\index.js'
  ]
}
* ERROR: Subcommand terminated with exit code 1

It would be nice if someone beefed up the readme with instructions on how to parse an .abc file.

Add support for partitioning the voices

At the moment, the Midi module is pretty much stuck with monophonic tunes - it ignores the separate voices completely. However, as a minimum, we should support the partitioning of the ABC tune body into separate such bodies - one for each voice (where voice inline headers are present). This would allow, for example, players to be built which are polyphonic.

To do this, we must identify if an inline voice header starts each Score line and extract its voice id (if the voice header doesn't exist, we use an arbitrary id). We then partition the TuneBody on this id. BodyInfo headers would be sent to each partition.
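
As a rough illustration of that plan, here is a Python sketch that works on raw text lines rather than on the parsed TuneBody. `partition_body` and `DEFAULT_VOICE` are hypothetical, and treating any line that starts with a letter and a colon as a body information field is a simplification.

```python
import re

_INLINE_VOICE = re.compile(r"\[V:\s*([^\]\s]+)[^\]]*\]")
_BODY_INFO = re.compile(r"[A-Za-z]:")  # simplification: 'K:...', 'M:...' and so on
DEFAULT_VOICE = "unnamed"  # arbitrary id for score lines without a voice header


def voice_of(line):
    """Voice id from a leading inline [V:...] header, or the default id."""
    match = _INLINE_VOICE.match(line)
    return match.group(1) if match else DEFAULT_VOICE


def partition_body(lines):
    """Partition tune body lines by voice id, copying info fields to every voice."""
    score_ids = [voice_of(line) for line in lines if not _BODY_INFO.match(line)]
    # dict keeps first-seen order and drops repeated ids
    partitions = {voice_id: [] for voice_id in score_ids}
    for line in lines:
        if _BODY_INFO.match(line):
            for body in partitions.values():
                body.append(line)
        else:
            partitions[voice_of(line)].append(line)
    return partitions
```

A tune with no voice headers comes back as a single partition under the default id, so callers never have to special-case monophonic input.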

Grace notes not supported against tuplets

For example, {G}(3F4D4E4 causes a parse error.

Make decorations type-safe

We need a Decoration enumerated type to replace the stringly typed decorations we have at present. Look to do this as and when we resolve #13.

Support the deprecated and outmoded exclamation mark line-break character

See:

10.2.1 Outdated line-breaking

There are plenty of examples on folkwiki - particularly those with multiple parts that still employ it.

Implement variant endings more fully

At the moment, the parser supports variant endings of the form |n <notes> but the Midi module ignores these for values of n above 2, i.e. only 2 variant endings are supported. This is also true in abc-melody.

The parser does not support variants of the form 1,3 or 1-3. Note, too, that the P: - parts header is marked as volatile in the spec and thus subject to change.

It seems to me that we ought to extend the variant endings support to the Midi module (and to abc-melody) whilst still ignoring the Parts setting for the time being. This is probably the most significant and awkward improvement we could make to abc-parser.

4.10 Variant endings

In combination with P: part notation, it is possible to notate more than two variant endings for a section that is to be repeated a number of times.

For example, if the header of the tune contains P:A4.B4 then parts A and B will each be played 4 times. To play a different ending each time, you could write in the tune:

P:A
<notes> | [1  <notes>  :| [2 <notes> :| [3 <notes> :| [4 <notes> |]
The Nth ending starts with [N and ends with one of ||, :| |] or [|. You can also mark a section as being used for more than one ending e.g.

[1,3 <notes> :|
plays on the 1st and 3rd endings and

[1-3 <notes> :|
plays on endings 1, 2 and 3. In general, '[' can be followed by any list of numbers and ranges as long as it contains no spaces e.g.

[1,3,5-7  <notes>  :| [2,4,8 <notes> :|
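
The list-and-range syntax in the quoted section expands mechanically. A minimal Python sketch, assuming the text after the `[` has already been isolated (`parse_endings` is a hypothetical helper):

```python
from typing import List


def parse_endings(spec: str) -> List[int]:
    """Expand an ending list such as '1,3,5-7' into individual ending numbers."""
    endings = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            endings.extend(range(int(first), int(last) + 1))  # ranges are inclusive
        else:
            endings.append(int(part))
    return endings
```

Malformed input such as an empty item raises `ValueError` from `int`, which a real parser would turn into a parse failure.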

Voice titles should be the tune name followed by the voice name

At the moment, it's just the voice name.

Make the Midi module aware of multiple repeats

At the moment, it handles first and second voltas and single simple repeats.

Allow spaces between the notes in chords

i.e. [A B C] as well as [ABC]. I've been looking at ABC samples for 8 or 9 years now and have just come across my first example of this in the wild.

Add a function to remove repeat markers

This is necessary if we want to play a thumbnail which may have a begin-repeat marker that we need to ignore.

Attach decorations to the note that they decorate

Rather than being free-standing. This is similar to what we've done with grace notes.

Introduce a Bar structure

At the moment, the parser doesn't discriminate bars. Bar lines are just ordinary items (such as notes, rests etc.) that can appear in a line of music. There is no concept of a Bar itself being a container for these items. We should alter the Abc ADT (and parser) so that bars are thus represented. This should, in turn, make it simpler to write players or score engravers which need to interpret the ADT.

Handle line continuations better

At the moment, a continuation is parsed as just another Music item and the parser makes no attempt whatsoever to join the line with a continuation to the following line. This is all very well but is not helpful for score engraving modules which want to treat the two lines as one continuous line.

We can fix this, I think, largely by having the parser's continuation production consume the eol and by having it remember any comment after the continuation character. This has the effect of joining the lines in the ADT.

Extend the reach of decorations

At the moment, we can decorate a note or a 'y' typesetting space. However, from 4.14 Decorations:

Note that the decorations may be applied to notes, rests, note groups, and bar lines

This is another nasty area of the spec because ambiguity is introduced between the kosher !trill! type of decoration and the deprecated and outmoded single exclamation mark which indicates a line-break. Also, although the decorations may be applied in all of these places, there is no concept of specifying decorations appropriate to the context. e.g. note decorations are allowed against bars. The ambiguity is admitted in the spec, which supplies an algorithm 12.2 Loose Interpretation:

When encountering a !, scan forward. If you find another ! before encountering any of |[:], a space, or the end of a line, then you have a decoration, otherwise it is a line-break.

However I fail to see how this helps. Decorations are applied immediately before the object they decorate, whereas, in the wild, the deprecated line-break exclamation appears at the end of a line, attached to nothing.
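
For reference, the scan-forward rule quoted from 12.2 can be written down literally. This Python sketch (with the hypothetical name `classify_bang`) only restates the spec's algorithm; it does not address the objection above about where the decoration attaches.

```python
_STOPPERS = set("|[:] ")  # the characters named in the quoted rule, plus space


def classify_bang(line: str, pos: int) -> str:
    """Classify the '!' at line[pos] as 'decoration' or 'line-break'."""
    if line[pos] != "!":
        raise ValueError("expected '!' at the given position")
    for ch in line[pos + 1:]:
        if ch == "!":
            return "decoration"
        if ch in _STOPPERS:
            return "line-break"
    return "line-break"  # reached the end of the line
```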

Allow spaces between grace notes and the note that they 'grace'

I've just seen my first example in the wild. For example: {ab} cd. We should support this, irrespective of what the spec might say, because it's cost-free.

Add better support for multiple voices in the parser

There is minimal support for multiple voices in the parser. 'V' headers are recognised, but any parameters to the header are saved as an untyped String and then effectively forgotten. There is thus no opportunity for later processes (MIDI translation or score engraving) to do anything sensible with them.

As a start, we should replace the single String parameter with the pair (String, StrMap). The String parameter will represent the voice ID and the Map will contain any voice properties (see section 7.1 of the 2.1 version of the spec).

This seems to be a sensible first step. Note this statement from 7. Multiple Voices:

VOLATILE: Multi-voice music is under active review, with discussion about control voices and
interaction between P:, V: and T: fields. It is intended that the syntax will be finalised in abc 2.2.
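
To make the proposed (String, StrMap) shape concrete, here is a small Python sketch of splitting the text after V: into a voice id and a map of properties. `parse_voice_header` is hypothetical, property values are kept as untyped strings, and bare properties without a key (which the spec also permits) are simply skipped.

```python
import re

# key=value pairs, where the value is either a quoted string or a bare word
_PROPERTY = re.compile(r'(\w[\w-]*)=("[^"]*"|\S+)')


def parse_voice_header(value: str):
    """Split the text after 'V:' into (voice_id, properties)."""
    parts = value.strip().split(None, 1)
    voice_id = parts[0] if parts else ""
    rest = parts[1] if len(parts) > 1 else ""
    properties = {key: val.strip('"') for key, val in _PROPERTY.findall(rest)}
    return voice_id, properties
```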

Allow tupleted notes to contain slurs

(and throw them away if necessary). This is required now that we've tightened up tuplet parsing.

In the voice module, retitling is inconsistent

getVoiceMap retitles, partitionVoices doesn't.

Allow slurred notes in broken rhythm pairs

Specifically in the second pairing. This fails to parse at the moment:

A>(BC)

As a first shot, accept the slur open bracket and throw it away.

Consider relaxing the syntax for ties

The parser implements the spec:

You can tie two notes of the same pitch together, within or between bars, with a - symbol, e.g. abc-|cba or c4-c4. The tie symbol must always be adjacent to the first note of the pair, but does not need to be adjacent to the second, e.g. c4 -c4 and abc|-cba are not legal.

However a good deal of ABC in the wild attaches the tie to the second note. Maybe we ought to relax the syntax here in order to allow these rogue examples if there is no serious downside. Some other parsers seem to do this.

Consider replacing specific header getter functions with suitable optics

These functions, like getTitle, mostly live in the metadata module. I think it makes sense to make these more general and more usable. I suspect it is less useful to supply optics for the tune body because in nearly all cases, you have to process the body sequentially.

The module also provides getHeaderMap which provides a map from name to header for the first header encountered of each kind. I suspect that this can be deprecated in favour of optics as well.

Add option to convert to MIDI at an over-ridden tempo

This will make it straightforward for tempo sliders to alter the playback rate. It should just be a matter of providing a further toMidi function.

Tuplets may include embedded tuplets

See, for example, the penultimate bar in http://abcnotation.com/tunePage?a=www.folkwiki.se/pub/cache/_J%F6ns_Lars_fars_Bodapolska_c32f9a/0001.

Support multiple simple repeats

Simple (non-variant) repeats are only supported once. The parser only allows a single colon, but the main impact will be on the Midi module (and abc-melody) as is the case for improved variant endings.

4.8 Repeat/bar symbols

...
By extension, |:: and ::| mean the start and end of a section that is to be repeated three times, and so on.
...

Add a thumbnail function

This would produce a new ABC tune from an existing one, but representing only the first two bars of the tune (not counting the lead-in notes). This could then be used in score generation to produce a thumbnail image.

Rewrite the Transcription module using ExceptT

Now that I have some practical experience of how to use it!

X: header may be empty

From the spec:

The X: field may be empty, although this is not recommended.

Add a function to get the voice names

This can be added to the Voice module and should be independent of retrieving the actual partitioned voices themselves.

Consider redefining MeterSignature

Currently it is Tuple Int Int but would perhaps be more useful as { numerator :: Int, denominator :: Int }. We could then have a new module Abc.Meter and move here the Meter-related functions from Abc.Metadata and add a function: commonTime.

Allow the Midi module to produce raw MIDI

At the moment, the Midi module functions toMidi and toMidiAtBpm produce a Midi.Recording type. We should make functions with these names produce raw MIDI (a list of bytes) and change the names of these two functions to toMidiRecording and toMidiRecordingAtBpm.

Support ties between chords?

I'm not sure about whether to do this. Section 4.1.1 in the spec only mentions ties between notes, but it is not entirely clear whether in this context a note might in fact be a bunch of notes, not just an individual one. This is not cost-free because downstream melody generators would be strongly affected. I'm inclined to ignore it.

MIDI pitches are incorrect for B# and B##

We convert an ABC note's pitch (pitch class, accidental, octave) to a MIDI pitch. The octave is determined by the commas and apostrophes attached to the pitch class in the ABC source (A, A A' etc.). However, we constrain the mapping between ABC and MIDI pitch by assuming that the pitch offset within any given octave must lie between 0 and 11 (because of the 12-note scale).

In fact, this is incorrect for B# and B## because they effectively belong in the next octave up. i.e. they should have in-octave offsets of 12 and 13 respectively.

We should also take the opportunity to move the MIDI pitch calculating functions to their own module within MIDI.
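
The arithmetic behind the fix can be sketched as follows (Python, for illustration). The names and the octave numbering are assumptions: here the octave containing middle C is numbered 5 so that middle C maps to MIDI 60. The point is only that the in-octave offset is not wrapped back into 0-11.

```python
_NATURAL_OFFSET = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
_ACCIDENTAL_SHIFT = {"bb": -2, "b": -1, "": 0, "#": 1, "##": 2}


def midi_pitch(pitch_class: str, accidental: str, octave: int) -> int:
    """MIDI pitch for a note, letting the in-octave offset leave the 0-11 range.

    Assumed octave numbering: middle C is ("C", "", 5) and maps to 60.
    """
    offset = _NATURAL_OFFSET[pitch_class] + _ACCIDENTAL_SHIFT[accidental]
    return 12 * octave + offset


def wrapped_midi_pitch(pitch_class: str, accidental: str, octave: int) -> int:
    """The constrained calculation: wrapping the offset drops B# by an octave."""
    offset = (_NATURAL_OFFSET[pitch_class] + _ACCIDENTAL_SHIFT[accidental]) % 12
    return 12 * octave + offset
```

With the unwrapped offset, the B# just below middle C sounds the same as middle C, as it should; the wrapped version places it an octave too low.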

Grace notes in tuplets or broken rhythm pairs parse incorrectly

This is a significant problem. It is caused partly by the fact that we don't check that the required number of notes in a tuplet is provided (which we could later do). However, the main cause is that grace is a separate production within music. We should change the ADT, and therefore the parser, so that grace notes must be attached to a following note-like production. To do this, we need to add a 'GracedNote' type to the ADT and remove grace from the music options.

More work needed on slurs

I find slurs are the hardest of all things to handle in ABC. Where should you attach them? I originally had them free-standing as representing just a 'Music item' and hence on a par with notes, chords, tuplets, inline headers etc. But this was both too liberal (you don't need slurs round headers) and too restrictive (you couldn't get slurs round the notes inside a tuplet or a broken-rhythm pair).

I then attached the slurs to the start and end of 'graceable' notes themselves. So currently, individual notes or notes within tuplets or broken-rhythm pairs can have slurs attached. But not chords, nor slurs that start at the start of a tuplet.

Perhaps the simplest thing is to allow them now to be attached to chords and presumably to the start of tuplets.

Review the use of decorations

The parser attempts to follow a literal interpretation of the ABC spec. Section 4.14 Decorations:

Decorations should be placed before the note which they decorate - see order of abc constructs.

Order of constructs:

The order of abc constructs for a note is: grace notes, chord symbols, annotations/decorations (e.g. Irish roll, staccato marker or up/downbow), accidentals, note, octave, ...

i.e. clearly decorations decorate individual notes and not note groups. However, in the wild, it is quite common to see a !coda! decoration coming immediately before a tuplet note group or even an inline header. Such uses seem reasonable, seem to break the spec, and are disallowed by the current parser.

Be lenient when parsing 'degenerate' slurs

We have now committed to associating slurs explicitly with the notes that define the span of the slur. We now have to be as lenient as possible in parsing slurs which are detached from a note by another construction. We should at least parse these forms and, if possible, fix them to the 'standard' representation. For example:

slurred tuplets:   ((3GBA) should be treated as (3(GBA)

slurred grace notes: ({c}BA) should be treated as {c}(BA)

slurred broken rhythm operators: (f a>)g should be treated as (f a)>g
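
The three rewrites above can be illustrated as plain text substitutions. This Python sketch is only a demonstration of the intended normalisation (`normalise_slurs` is hypothetical); the parser itself would presumably do the equivalent while building the ADT, and handling `<` alongside `>` is an assumption.

```python
import re

_REWRITES = [
    # slurred tuplets:          ((3GBA)  ->  (3(GBA)
    (re.compile(r"\(\((\d)"), r"(\1("),
    # slurred grace notes:      ({c}BA)  ->  {c}(BA)
    (re.compile(r"\((\{[^}]*\})"), r"\1("),
    # slurred broken rhythm:    (f a>)g  ->  (f a)>g
    (re.compile(r"([<>]+)\)"), r")\1"),
]


def normalise_slurs(text: str) -> str:
    """Move detached slur brackets next to the notes that delimit the slur."""
    for pattern, replacement in _REWRITES:
        text = pattern.sub(replacement, text)
    return text
```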

Allow rests in tuplets

Although the spec is unclear, we obviously need these. An example instance is in the Fastän polska which starts with a triplet where the first component is a rest.

Support 'dotted' slurs and ties

I've never seen this in the wild, but had missed it in the spec.