The MuT music tool

For 8088 MPH I wrote a tool to convert Amiga MOD (module) files to the format required by my 4-channel PC speaker playback routine for the 4.77MHz 8088. The MOD file solution never felt quite ideal to me. The playback routine has some possibilities (like SID-style ring modulation) which can't be expressed in a MOD file, and there are also a lot of things you can do in a MOD that won't really work with my player. So if I make it easy to try arbitrary MODs with the player, people are likely to try MODs that don't come out very well and conclude that the player is rubbish.

What I really wanted was to write my own tracker designed specifically for the player routines I had written (and some variants that I might want to write). But writing a whole tracker is a big project - particularly the GUI (GUIs take ages and aren't the most interesting things to program).

So I started thinking: what's the simplest piece of software I could write that a musician could use to compose music for this player? What about a command line tool - a sort of compiler which accepts text files as input and generates binary music data in the appropriate format (or indeed various formats) as output? This isn't entirely unprecedented - there have been various tools for processing text files into music such as Grigasoft's "Polyphonic Music" and John Worley's "Clockwork Pianola".

This is the design I came up with - I'm calling it "MuT" ("music tool", pronounced "mute") for now - it can't seem to decide if it's a tracker, a musical instrument or a programming language.

Inside the text files we would probably want to have something similar to a more traditional tracker's pattern grid, with different channels arranged horizontally and time going vertically. Rather than using semantically-significant spaces and newlines (which cause all sorts of trouble) I think a nice way to do it would be for the musician to lay out the grid using the "&" character to separate voices (think "C & E" means a C and an E playing at the same time) and the "|" character to indicate a new time division (think "bar is a measure of time" though the "|" interval would usually be shorter than a musical bar, obviously). So an empty grid would look something like:

output =
     &     &     &     |
     &     &     &     |
     &     &     &     |
     &     &     &     ;

The spaces could then be filled in with notes:

output = sine@(
 C4 & E4 & G4 & C5  |
 C4 & E4 & A4 & C5  |
 C4 & F4 & A4 & C5  |
 D4 & F4 & A4 & D5  |
 D4 & F4 & B4 & D5  |
 D4 & G4 & B4 & D5  |
 E4 & G4 & B4 & E5  |
 E4 & G4 & C5 & E5  );

Leaving a space blank causes the note in the same voice in the previous division to continue, so this could also be written:

output = sine@(
 C4 & E4 & G4 & C5  |
    &    & A4 &     |
    & F4 &    &     |
 D4 &    &    & D5  |
    &    & B4 &     |
    & G4 &    &     |
 E4 &    &    & E5  |
    &    & C5 &     );

If you want to silence a voice, put a 0 in there instead of a blank:

output = sine@(
 C4 & E4 & G4 & C5  |
 0  & 0  & A4 & 0   |
    & F4 & 0  &     |
 D4 & 0  &    & D5  |
 0  &    & B4 & 0   |
    & G4 & 0  &     |
 E4 & 0  &    & E5  |
 0  &    & C5 & 0   );
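
The blank-versus-0 rule can be sketched outside MuT. The following Python is only an illustration, not MuT's parser: the function name expand_grid and the use of None for silence are assumptions made here. It splits a grid on "|" and "&", carries the previous note forward for blank cells, and treats 0 as silence.

```python
def expand_grid(text):
    """Expand a MuT-style grid into one list of notes per time division.

    Hypothetical helper: a blank cell repeats the voice's previous note,
    and a "0" cell silences the voice (represented here as None).
    """
    rows = [row.split("&") for row in text.split("|")]
    current = [None] * len(rows[0])    # every voice starts silent
    expanded = []
    for row in rows:
        for voice, cell in enumerate(row):
            cell = cell.strip()
            if cell == "0":
                current[voice] = None  # explicit silence
            elif cell:
                current[voice] = cell  # new note
            # blank cell: keep whatever the voice was already playing
        expanded.append(list(current))
    return expanded


grid = """
 C4 & E4 & G4 & C5  |
 0  & 0  & A4 & 0   |
    & F4 & 0  &     |
 D4 & 0  &    & D5
"""
for division in expand_grid(grid):
    print(division)
```

Each printed row shows what every voice is sounding during that division, which is the information a player routine would ultimately need.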

One can also put different instruments into the grid:

output =
 sine@C4 & square@E4 & triangle@G4 & bell@C5  |
 sine@C4 & square@E4 & triangle@A4 & bell@C5  |
 sine@C4 & square@F4 & triangle@A4 & bell@C5  |
 sine@D4 & square@F4 & triangle@A4 & bell@D5  |
 sine@D4 & square@F4 & triangle@B4 & bell@D5  |
 sine@D4 & square@G4 & triangle@B4 & bell@D5  |
 sine@E4 & square@G4 & triangle@B4 & bell@E5  |
 sine@E4 & square@G4 & triangle@C5 & bell@E5  ;

Instrument names are resolved per voice, and "." is (by convention) a sort of default instrument name, so this can also be written:

output =
 { . = sine; } .@C4 &
      { . = square; } .@E4 &
           { . = triangle; } .@G4 &
                      { . = bell; } .@C5  |
               .@C4 & .@E4 & .@A4 & .@C5  |
               .@C4 & .@F4 & .@A4 & .@C5  |
               .@D4 & .@F4 & .@A4 & .@D5  |
               .@D4 & .@F4 & .@B4 & .@D5  |
               .@D4 & .@G4 & .@B4 & .@D5  |
               .@E4 & .@G4 & .@B4 & .@E5  |
               .@E4 & .@G4 & .@C5 & .@E5  ;

One can make instruments in all sorts of ways. A few simple (chiptune-esque) waveforms are built into the program, or you can load them from a file:

bell = load("BELL.WAV")@s/440;

The MuT syntax is extremely flexible and rich enough to do just about any kind of audio processing, and hopefully it's also intuitive enough that it's not too difficult to figure out how to do just that.

What follows is a more technical and in-depth explanation, so if you're not a programmer or mathematician you might want to stop reading here.

Basically most MuT objects are functions from the real numbers to the complex numbers. Both the range and domain of these functions have units (some integer power of seconds - s) which are tracked by MuT so that it can distinguish between time-domain functions like sine@C4 and pure functions like sine.

There are various operators for acting on MuT objects:

+ - pointwise addition: output = sine@C4 + square@E4; sounds the same as output = sine@C4 & square@E4; but it has some different behavior in other ways - you can't do the "empty space means continue playing the same instrument in that voice" trick if you use + instead of & in your grid, and you can't use a + b as an l-value.

- - pointwise subtraction: same as + but the function on the right has its phase inverted.

* - pointwise multiplication (aka modulation). Also useful for volume envelopes.

/ and ^ - pointwise division and power respectively. Mostly for completeness' sake, though they do have some uses, especially for scalars.

[] indexes into a function, just like an array in C or Pascal. So sine[1] = 0.841.... When you put a function instead of a scalar inside the brackets, you (naturally) get function composition. So a[b][x] = a[b[x]].
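
As a rough illustration of the two uses of [], here is a Python sketch in which MuT objects are modelled as plain Python callables. The helper name index and the example function double are made up for this sketch, and sine is assumed to be indexed in radians (which is what sine[1] = 0.841... suggests).

```python
import math


def index(f, arg):
    """Sketch of MuT's f[arg]: a scalar gives a value, a function gives a composition."""
    if callable(arg):
        return lambda x: f(arg(x))  # a[b] is the function x -> a[b[x]]
    return f(arg)


sine = math.sin                     # assumed: dimensionless sine indexed in radians


def double(x):
    """Hypothetical example function."""
    return 2 * x


print(index(sine, 1))                  # sine[1], about 0.841
print(index(index(sine, double), 1))   # sine[double][1], the same as sine[double[1]]
```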

{} allows execution of statements during the evaluation of an expression. This is handy for redefining instruments while inside the grid (see example above), or changing the tempo, amongst other things. The tempo (i.e. the amount of time covered by one "|") is set by the special variable division. So if MuT sees { division /= 2; } inside an expression, the following grid rows will be played at twice the speed (only durations are affected, not frequencies). The scope of any changes inside {} is the remainder of the statement in which it occurs.

@ - basis scale. This is where it gets really interesting. This is an operator only applicable to functions, not scalars (so unlike +, -, *, / and ^ it isn't found on calculators). Suppose you have a function f and scalars k and x. Then (f@k)[x] = f[k*x]. So if f is a waveform then f@2 is the same waveform played twice as fast (and an octave higher). The @ operator also adjusts domain units if the right-hand side has a value which is not dimensionless. So, if f has a dimensionless domain (e.g. sine) then f@110*Hz (or f@110/s) will be a normal time-indexed waveform (i.e. a sine wave with a frequency of 110Hz). As I'm sure you can see this is a very useful operator for an audio program!
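
For real scalars, the rule (f@k)[x] = f[k*x] is small enough to sketch in Python. This is an illustration only: the names scale and unit_sine are invented here, unit tracking is left out entirely, and the waveform is assumed to complete one cycle per unit of its argument so that scaling by 110 gives 110 cycles per second.

```python
import math


def scale(f, k):
    """Sketch of f@k for a real scalar k: (f@k)[x] = f[k*x]."""
    return lambda x: f(k * x)


def unit_sine(x):
    """Assumed convention for this sketch: one full cycle per unit of x."""
    return math.sin(2 * math.pi * x)


octave_up = scale(unit_sine, 2)   # same waveform, twice as fast
a2 = scale(unit_sine, 110)        # plays the role of sine@110*Hz, with x in seconds

print(octave_up(0.125), unit_sine(0.25))  # the same point on the wave
print(a2(1 / 440))                        # a quarter of the way through a 110Hz cycle
```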

It gets better, though. Suppose we have a complex unit i = sqrt(-1); and we make f@i compute the Fourier transform of f. Surprisingly, I discovered (after defining it that way) that the mathematics of doing so works out very neatly - time scaling and Fourier transformations are closely related, mathematically - they are both Linear Canonical Transformations, and there's a nice mapping from complex numbers to LCTs which gives an (efficiently computable) meaning to f@z for any complex number z. Using f@i in combination with * allows us to do convolutions and therefore any linear filters you care to describe (high-pass, low-pass, band-pass, notch, echoes, resonances, you name it). Using other complex numbers on the right-hand side of @ gives us inverse Fourier transforms and fractional Fourier transforms (the musical utility of which I know not, but I'm sure inventive musicians will find some interesting uses for it).
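
The "Fourier transform plus * gives convolution" idea can be checked numerically. The sketch below is not how MuT computes f@i and says nothing about the complex-number-to-LCT mapping: it swaps in a naive discrete Fourier transform on short sample lists as a stand-in, and all the function names are invented for the example. It applies an echo filter once by multiplying spectra and once by direct circular convolution, which should agree up to rounding.

```python
import cmath


def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for f@i in this sketch)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out


def filter_via_spectrum(signal, kernel):
    """Circular convolution done as a pointwise product of spectra."""
    product = [a * b for a, b in zip(dft(signal), dft(kernel))]
    return [value.real for value in dft(product, inverse=True)]


def circular_convolution(signal, kernel):
    """Direct circular convolution, for comparison."""
    n = len(signal)
    return [sum(signal[j] * kernel[(t - j) % n] for j in range(n))
            for t in range(n)]


signal = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
echo = [1.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]  # original plus a half-volume echo

print(filter_via_spectrum(signal, echo))
print(circular_convolution(signal, echo))
```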

One can also use a function on the right-hand side of @, which will result in a different basis scaling for each sample in the output - i.e. for scalar x and functions f and w, (f@w)[x] = (f@(w[x]))[x]. That's how the first non-trivial example above works.
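
Written out in Python, the function-valued case is a one-liner. As before this is a sketch with invented names (scale_by, unit_sine, rising) and real-valued scale factors only.

```python
import math


def scale_by(f, w):
    """Sketch of f@w for a function w: (f@w)[x] = (f@(w[x]))[x] = f[w[x] * x]."""
    return lambda x: f(w(x) * x)


def unit_sine(x):
    """Assumed convention for this sketch: one full cycle per unit of x."""
    return math.sin(2 * math.pi * x)


def rising(x):
    """Hypothetical scale function that grows over time."""
    return 100.0 + 50.0 * x


sweep = scale_by(unit_sine, rising)
print(sweep(0.5), unit_sine(125.0 * 0.5))  # the same value, by definition
```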

& - next voice operator. This just moves to the next voice, as we've seen above. It can also be used on the left-hand side of an assignment: (a & b) = (c & d); does the same thing as a = c; b = d;. This is useful for grouping values together into a compound object.

| - time sequence operator. c = a | b yields a for division followed by b for division. Functions in MuT repeat once they are complete, so if you evaluate c in a time sequence with a sufficiently long division, it'll sound like a | b | a | b | .... Similarly you can loop a division-long section of just one function by doing c = a | ;.

As with &, | can be used on the left-hand side of an assignment: (a | b) = c; assigns the part of c between 0 and division to a and the part of c from division to division*2 to b. So by using just assignment, the special division variable and the | operator you can make whatever edits to a waveform you like.
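
Both directions of | can be sketched in Python with functions of time. The names sequence and split are invented here, and two details are assumptions rather than anything stated above: each part is restarted at its own time 0 when its division begins, and the pieces produced by splitting loop after one division.

```python
def sequence(parts, division):
    """Sketch of a | b | ...: each part plays for one division, then the whole loops."""
    total = division * len(parts)

    def played(t):
        t = t % total                      # functions repeat once complete
        i = min(int(t // division), len(parts) - 1)
        return parts[i](t - i * division)  # assumed: each part restarts at time 0

    return played


def split(c, count, division):
    """Sketch of (a | b | ...) = c: cut c into division-long pieces."""
    return [lambda t, i=i: c((t % division) + i * division) for i in range(count)]


def a(t):
    return 1.0 + t


def b(t):
    return -1.0 - t


c = sequence([a, b], 0.5)
print(c(0.25), c(0.75), c(1.25))   # a, then b, then a again

first, second = split(c, 2, 0.5)
print(first(0.25), second(0.25))   # the a half and the b half of c
```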

The comparison operators ==, !=, <, <=, >, >= work the same way as their C equivalents, yielding (in general) a function whose values are boolean (true, false) elements rather than numbers. These can be used with the ?: operator to combine two functions via a predicate.

The % operator has a couple of uses (which I might change to use different characters). On its own it gives the output from the previous voice in a voice set, which is useful for ring-modulation style effects (such as those that can be done by the SID chip and by my MOD player routine). If it comes immediately after a sequence, it yields a function which is the time-to-index mapping of that sequence. So (a|b|c)% is a function that is 0 from 0 to division, 1 from division to division*2 and 2 from division*2 to division*3. You can also change the time-to-index mapping of an l-value by assigning to this function. If the time-to-index mapping function takes a non-integral value at any point then the corresponding two elements of the sequence are mixed, so you can create fades and glissandos. Time-scaling the time-to-index mapping function will speed up or slow down the playing of that sequence without affecting the frequencies.
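
Here is a Python sketch of the time-to-index mapping idea. The names are invented, each part is evaluated at the global time for simplicity, and the linear crossfade for non-integral indices is an assumption (the description above only says the two elements are "mixed").

```python
import math


def play_sequence(parts, index_map):
    """Sketch of a sequence driven by a time-to-index mapping function."""
    last = len(parts) - 1

    def played(t):
        position = max(0.0, min(index_map(t), last))  # stay inside the sequence
        low = math.floor(position)
        high = min(low + 1, last)
        frac = position - low
        # Assumed mixing rule: linear crossfade between neighbouring parts.
        return (1 - frac) * parts[low](t) + frac * parts[high](t)

    return played


def default_map(division):
    """The mapping (a|b|c)%: 0 during the first division, 1 during the second, ..."""
    return lambda t: math.floor(t / division)


def low_note(t):
    return 220.0


def high_note(t):
    return 440.0


stepped = play_sequence([low_note, high_note], default_map(0.5))
glide = play_sequence([low_note, high_note], lambda t: t)  # index rises smoothly

print(stepped(0.25), stepped(0.75))  # one part, then the other
print(glide(0.25))                   # a quarter of the way from 220 to 440
```

Replacing the mapping with a stretched or compressed copy of itself changes how quickly the sequence is traversed without touching what each part contains.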

If you combine (via multiplication or basis-scaling) two functions which don't have the same domain units you end up with a compound object. When this compound object is basis-scaled, it'll move the dimensions of the compound object closer and leave other elements unchanged. So you can create an instrument by combining a waveform (dimensionless domain) and a volume envelope (time domain), and when this is time-scaled the scaling will change the frequency of the waveform part without speeding up or slowing down the volume envelope.
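
The waveform-plus-envelope instrument can be imitated in Python by keeping the two parts separate until a frequency is chosen. This is only a sketch of the behaviour described above, with invented names (make_instrument, unit_sine, decay) and no unit tracking.

```python
import math


def make_instrument(waveform, envelope):
    """Sketch of a compound object: dimensionless waveform times time-domain envelope."""
    def at_frequency(freq_hz):
        # Basis-scaling only touches the waveform; the envelope keeps its own time axis.
        return lambda t: waveform(freq_hz * t) * envelope(t)
    return at_frequency


def unit_sine(x):
    """Assumed convention for this sketch: one full cycle per unit of x."""
    return math.sin(2 * math.pi * x)


def decay(t):
    """Hypothetical volume envelope: exponential decay over time in seconds."""
    return math.exp(-3.0 * t)


pluck = make_instrument(unit_sine, decay)
low, high = pluck(220.0), pluck(440.0)

# At each note's first peak the output equals the envelope at that moment.
print(low(1 / 880), decay(1 / 880))
print(high(1 / 1760), decay(1 / 1760))
```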

I'm contemplating a built-in function that will convert a waveform into a sequence so that it can be time-scaled without changing the pitch (or vice-versa) in the manner of a phase vocoder, but I haven't quite got all the details ironed out yet.

Arbitrary functions can be defined: e.g. major(x) = x + x@5/4 + x@3/2; will turn any waveform or function into a major chord using its input as the base note. If we allow functions to be recursive, then (by the Church-Turing thesis) MuT becomes a Turing-complete programming language (for better or for worse).
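
The major example translates directly into Python if @ is modelled as the simple real-valued scaling described earlier. The helper scale and the test tone a4 are names made up for this sketch.

```python
import math


def scale(f, k):
    """Sketch of f@k for a real scalar k: (f@k)[x] = f[k*x]."""
    return lambda x: f(k * x)


def major(x):
    """Sketch of major(x) = x + x@5/4 + x@3/2 (root, major third, perfect fifth)."""
    third, fifth = scale(x, 5 / 4), scale(x, 3 / 2)
    return lambda t: x(t) + third(t) + fifth(t)


def a4(t):
    """Hypothetical base note: a 440Hz sine, with t in seconds."""
    return math.sin(2 * math.pi * 440.0 * t)


chord = major(a4)  # 440Hz, 550Hz and 660Hz sounding together
print(chord(0.001))
```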

It's handy to be able to have libraries of predefined functions, instruments, values (note frequencies, intervals) etc. Executing an include statement will read in a file and insert its contents into the program at that point (much like C's #include directive). Conventionally, a MuT input file will start with include "standard.mut"; to include the predefined variables which come with the program, or include "pc_speaker.mut"; which does the same but also sets up special variables for the PC speaker output mode.

There are some more design documents (which are somewhat out of date but cover a few more dark corners) and the beginnings of an implementation on my GitHub. Being a command-line tool, it's not terribly user-friendly but it probably wouldn't be hard for someone more GUI-oriented than me to slap a user interface on top of it. One fairly simple user-interface for this might just be a special text editor, which plays whatever object is under the cursor whenever a particular key combination is pressed (and which highlights the parts of the MuT program which are being used to compute the currently-playing notes). An interesting enhancement would be a little knob that appears whenever the mouse hovers over a numeric value; dragging this knob changes that numeric value in real-time, changing the currently playing sound if that value is involved in it.

2 Responses to “The MuT music tool”

  1. enrique says:

    Nice idea, it seems a bit complex to compose in it btw.
    Did you think about modifying some open source tracker and adding your own commands? I remember IT was open sourced, and the always present OMP tracker?
    Good work anyway!

    • Andrew says:

      Thanks. Yeah, for solving the original problem modifying an existing tracker would probably be better. But this thing has taken on a life of its own now, and I'd be really interested to hear what a musician could do with the kind of flexibility this system would make possible. Maybe not even for composing but just for generating new sounds/instruments from existing ones. As for being complex to compose in, I'd hope that the GUI ideas I mentioned could help with that (or at least some better documentation that is more tailored to musicians than implementers). Having a programming language for music isn't entirely unprecedented (there's a whole list of them at https://en.wikipedia.org/wiki/List_of_audio_programming_languages) but it seems like most of them are more limited (being to MIDI as MuT is to MOD). And I don't think any of them have any equivalent of the generalized @ operator.
