In the words of the authors, “LilyPond is a music engraving program, devoted to producing the highest-quality sheet music possible.”
Lilypond has a wealth of documentation. Most of it consists of examples that show syntax and semantics of individual features. I find it difficult to infer from that the underlying principles of the semantics of the language.
It seems that the documentation targets (semi) professional musicians that are hobby programmers (at best). I am a professional computer scientist and a hobby musician (at best). I find that lilypond docs are lacking in structure and exactness when compared to definitions of general or domain specific programming languages that I usually work with. (Of course there are enough under-documented programming languages and libraries out there, but I avoid them.)
There is a subset of lilypond documentation targeted at programmers that want to extend lilypond. We can infer some of the underlying semantics from that, but the description is mixed with implementation detail that a plain “user” does not need. Still, interesting reading.
So, the following is my attempt to explain lilypond in a “semantics first” way.
And I do think this kind of documentation would serve both groups of users well. Certainly a musician is capable of structured thinking? Music is structure that you can hear. But I am digressing.
I should add that my application area is writing scores in the style of “real books” (for jazz and rock), to be used in a hobbyist band. That is, a song absolutely must fit on one page, and you want to see global structure, chords, voice; plus perhaps some extra snippets for a typical instrumental or vocal accompaniment, or break. For reasons of copyright, I don’t publish these scores. But I show my default empty score. So, my typical score will exercise most of the basic features of lilypond, but probably just very few of the advanced ones.
While writing this text here, I am trying to remember how I learned lilypond, and what was the information that I was missing the most. In several cases, I later found that the information had been in the docs all along, but I did not know where to look. Sometimes, I asked on the mailing list and I do appreciate the help I got.
Also, I invite comments (by email) on this text. I do not necessarily want to make this text longer than it already is, but I do want to repair errors, and perhaps add specific links to original documentation.
lilypond is a command line program that reads a textual description of music and produces a graphical description and MIDI.
A very short lilypond program is
\score{ \new Staff { c d e f } \layout{} \midi{} }
Put this in foo.ly
, call lilypond foo.ly
, it will produce files foo.pdf
and foo.midi
.
Notes:

- The music to be rendered goes inside \score (with options for the graphical rendering inside \layout, and options for MIDI inside \midi).
- You can omit \layout{} (it will still produce pdf).
- If you omit \midi{}, it will not produce MIDI.
- If there are several \score{} blocks in one lilypond program, each one will produce separate output file(s).

The PDF file can be rendered on screen, or be sent to a printer. The MIDI file can be rendered (to sound that you can hear) by an external (hardware) synthesizer device, or by a software synthesizer running on your computer (e.g., timidity). Details of rendering are outside the scope of this document. There are WYSIWYG environments (IDEs) that integrate a text editor, and hide the actual command line call. Of course I never use these.
Music is one of

- a note (e.g., c4) that has a pitch (c) and a duration (4)
- a rest (r4) that has a duration (4)
- a skip (s4) that has a duration (4)
- a sequential composition ({ .. })
- a parallel composition (<< .. >>, the semantics is “start all components at the same time, and finish when the last component finishes (?)”)
- a transposition (\transpose c d { .. }). There also is \shiftDurations to scale the duration of notes. There probably are others, e.g., see \repeat unfold below.

This is the “pure data” aspect of music.
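These constructors can be combined freely. A small sketch (the concrete notes are arbitrary, chosen only to exercise each constructor once):

```
\score {
  \new Staff {
    { c'4 d' e' f' }               % sequential composition
    << { g'2 g' } { e'2 e' } >>    % parallel composition: both parts start together
    r4 s4                          % a rest (printed) and a skip (invisible)
    \transpose c d { c'4 d' e' f' } % the same phrase, one whole tone higher
  }
  \layout{}
}
```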
For reference, the prototypical example of an algebraic (tree-structured) data type (ADT) that represents music is in the Haskore paper by Paul Hudak et al. (1996); see also its usage examples. I guess that internally, lilypond does much the same, but since it’s LISP (not Haskell), the internal representation is only weakly (dynamically) typed (not statically).
In fact there is also an ADT (by Rohan Drape) that represents lilypond’s concept of music.
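To make this concrete, here is a minimal sketch in the spirit of such an ADT. It is modeled on, but not identical to, the Haskore type; all the names are my own:

```haskell
-- A hypothetical, stripped-down Music ADT (Haskore-style):
data Pitch = C | D | E | F | G | A | B   -- ignoring octaves and accidentals
type Dur   = Rational                    -- e.g., 1/4 is a quarter note

data Music
  = Note Pitch Dur        -- a pitched note with a duration
  | Rest Dur              -- silence with a duration
  | Seq Music Music       -- sequential composition ({ .. })
  | Par Music Music       -- parallel composition (<< .. >>)
  | Transpose Int Music   -- shift all pitches by an interval
```

The point is that lilypond's constructors map almost one-to-one onto such a tree-structured type.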
There is also a “processing” aspect for scores.
Music, as defined above, is ultimately put inside the \score{ }
environment for rendering.
For production of the PDF, musical atoms will be assigned to specific parts of the page: notes appear on staffs, chord symbols appear above staffs, lyrics appear below staffs. Also, for the production of MIDI, atoms will be assigned to specific parts of the output: notes appear on certain MIDI channels.
These assignments are handled by contexts. A Context is described by (or associated with?) a block declaration like \new Staff { ... }
or \new Voice { ... }
.
There seems to be an implied, or intended, nesting of contexts: a Score contains Staff contexts, and a Staff contains Voice contexts.
This interacts (sometimes in inexplicable ways) with arbitrary nesting of contexts - which is allowed by the language, and sometimes required to get a specific result. E.g., this works (a \new Staff inside a Voice):
c d e f << { g a b c } \new Staff { g g g c } >>
(it produces an extra staff, extending over one bar, running in parallel) but this does not:
c d e f << { g a b c } \new Staff { g g g c } >>
c d e f << { g a b c } \new Staff { g g g c } >>
(it produces another extra staff, but it’s on a different level, so overall it wastes space). A work-around was suggested here.
There is some magic: in some places, specification of context can be omitted, and then some default context is re-used, or created. E.g., a Staff
already has an implied Voice
, so if you \new Voice
inside it, then you actually get the second voice (?)
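A sketch of this (I have not verified it against every lilypond version; the \voiceOne / \voiceTwo commands set the usual stem directions):

```
\score {
  \new Staff <<
    { \voiceOne c''4 d'' e'' f'' }      % ends up in the Staff's implied (first) voice
    \new Voice { \voiceTwo c'4 b a g }  % the explicit \new Voice becomes the second voice
  >>
}
```

The equivalent shorthand << { ... } \\ { ... } >> creates both voices implicitly.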
For me it is difficult (or, “magic”) to refer to contexts, but in some cases it works. E.g., it is possible to add lyrics to a voice, where the syllables of the text are aligned with the notes in the voice.
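For the record, a sketch of the lyrics case, which does work by referring to a named context (the voice name "melody" and the syllables are my own choices):

```
\score {
  <<
    \new Voice = "melody" { c'4 d' e' f' }
    \new Lyrics \lyricsto "melody" { la la la la }
  >>
}
```

\lyricsto aligns one syllable per note of the named voice.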
In some places, elements inside one context do appear to influence the processing of other contexts. E.g., if we have a parallel composition of two staffs, each with a voice, and one voice contains a \repeat volta
, then the repeat is actually displayed for the parallel staff as well. (But I think it does not affect a repeat when the MIDI for the other staff is produced.)
In addition to writing scores literally (each note/rest that you write in the .ly file, creates one visible note/rest in the .pdf file) we can use mechanisms that save typing by re-using code.
The most important one (and its usage is highly recommended in the docs) is: you can define names that then denote other things (typically, music):
foo = { c d }
bar = { e f \foo }
\score{ \new Staff { \foo \bar } }
I habitually use this feature for
bpm = 120
... \new Staff { \tempo 4 = \bpm ... }
... \midi { \tempo 4 = \bpm }
This sets the tempo for the MIDI file, and also makes it visible on the score sheet.
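Assembled into one complete (hypothetical) score, this looks as follows; the names bpm and melody, and the notes, are mine:

```
bpm = 120
melody = { c'4 d' e' f' }

\score {
  \new Staff { \tempo 4 = \bpm \melody }
  \layout { }
  \midi { \tempo 4 = \bpm }
}
```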
I think the following properties are true for names:
These are heavy obstacles when trying to use lilypond in a “programming language” way.
Note: also \repeat unfold
is a notation for re-using code, see next item.
\repeat unfold N { .. }
where N is a number, and the argument will be composed sequentially with itself N times.
Note: composition is sequential. These two are not equivalent (the first one does not give two parallel staffs):
\score{ << \repeat unfold 2 \new Staff { c } >> }
\score{ << \new Staff { c } \new Staff { c } >> }
Lilypond has several ways to denote repeats. If you want a compressed representation on the page, but all notes in the MIDI, then you must do
music = << ... >> % <-- bulk of the text goes here
\score{\music \layout{...}}
\score{\unfoldRepeats \music \midi{...}}
This is documented (but I did not know it for a long time.)
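Spelled out as a complete sketch (the actual music is a placeholder):

```
music = { \repeat volta 2 { c'4 d' e' f' } }

\score { \music \layout { } }               % compressed repeat on the page
\score { \unfoldRepeats \music \midi { } }  % all notes present in the MIDI
```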
The abstract syntax for repeats with alternative endings
\repeat volta 2 { c4 d e f }
\alternative {{ g a b c} {f e d c }}
is broken: the semantics (for both graphics and MIDI rendering) of the first part (first line) depends on the second, while they appear syntactically separate. But MIDI rendering is correct (with \unfoldRepeats
as described above).
There is a way to write more powerful programs: lilypond programs can contain LISP code, since the lilypond compiler is using this anyway under the hood (actually: the GUILE Scheme dialect of LISP).
Analogy:
Inside “lilypond mode” you can embed a range in “LISP mode” by #( .. )
. Inside LISP mode, you can embed lilypond mode by #{ .. #}
.
Combined example:
#(define (foo x y) (if x y) )
\score {
#(foo #t #{c d e#} )
#(foo #f #{c d e#} )
}
Inside LISP mode, you can use functions that you defined, and also functions from libraries.
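For example, a small music function of my own devising (a sketch: the exact define-music-function signature varies between lilypond versions - older versions also require parser and location arguments):

```
twice = #(define-music-function (m) (ly:music?)
  #{ #m #m #})

\score {
  \new Staff { \twice { c'4 d' } }   % expands to c'4 d' c'4 d'
}
```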
There is also $( .. )
to embed LISP, but it has somewhat different semantics: code embedded that way is executed already while parsing (?), so you may be surprised by visibility problems. Some declarations may not have been elaborated when you think they should be.
Quite often, the basic building blocks of scores are sequences of notes. Lilypond has invented notation for writing these while saving keystrokes.
This is generally well-documented already, as it is a purely local thing, and does not refer to any global structural concept.
Relative pitch: pitch is denoted by letters c,d,e,.. The octave is left implicit.
If notes x y
are adjacent (subsequent) in the text, then the octave of y
is chosen in such a way that the resulting pitch is nearest to the pitch of x
. This means that g a b c
works (it is an ascending sequence, and the final c
is from the next octave, one half tone above b
). If you don’t want this, then you can jump an octave higher with y'
, or lower with y,
. This works nicely except that I think these annotations are in the wrong place - it should be 'y
and ,y
because you are first changing the octave, then setting the note. A.G. comments that in fact it should be ' y
(and , y
), i.e., two separate items: the first is an invisible, inaudible, zero-time atom that just moves the “current position” by an octave, the second one then is relative to this.
Another problem is that the “x
before y
” relation refers to the order in the source code. So if one voice ends with x
, and next note y
that is in the next line of the source code may be for a different staff, different voice, at a totally different time - but it is still pitch-relative to x
. The recommended work-around is to put notes inside a block \relative c' { ...}
- this relates the block contents to the given absolute pitch.
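So these two sketches should denote the same pitches:

```
{ c' d' e' f' g' a' b' c'' }        % absolute octave marks everywhere
\relative c' { c d e f g a b c }    % each note placed nearest to its predecessor
```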
Default duration: duration is denoted by numbers: c4
denotes a fourth, c4.
denotes an extended fourth (c4 ~ c8
). If you omit the denotation of the duration, then the duration of the previous note is used. So you can write c4 d e f
as a shorter equivalent of c4 d4 e4 f4
. Again, this has the same problem as relative pitch: if you omit the declaration, then lilypond uses whatever it has read previously, even if that’s unrelated musically. The recommended work-around is to state the duration for the first note in a bar, or any group of related notes.
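For example:

```
{ c'4 d' e' f' }   % same as { c'4 d'4 e'4 f'4 }
{ c'4. d'8 }       % c'4. is a dotted quarter, i.e., c'4 ~ c'8
```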
It is also possible to scale the duration by *, as in s1*10 (this gives 10 bars of silence), but this is problematic for visible notes, as it does not change their appearance (so you cannot see that the duration is multiplied).
Bar checks: Useful feature! Each staff has a time, and you can set it as in \time 6/4
. If you don’t set it, it is \time 4/4
. This defines the length of a bar. When you write a sequence of notes, you can add |
(vertical bar). The semantics of a “bar check” is: if we are indeed at the start of a bar, then remain silent, else complain. So, if you use bar checks, you don’t have to mark each bar, but when you put a mark, it will be checked. As I said, very useful.
A.G. reminds me that this also works with ties across bars: we can write
{ \time 4/4 c4 d e f8 g8 ~ | g1 }
I must have assumed that ~
and |
are both operators, and concluded that you could syntactically use only one of them.
Now, what I would want from lilypond is checks for

- the total duration of a piece of music, e.g., with a (hypothetical) notation { c4 d .. } :: 2 (meaning 2 whole notes),
- that all components of a parallel composition << .. >> have equal length.

The big challenge in the design of a textual music (representation) language is the following: You have to go from source text (what you type when you create the score) to score sheet (what you look at when you play) to actual music (what you hear). You cannot put each note in the source by itself, and also not in the score sheet, so you use abstractions. There are two kinds: abstractions that are visible on the score sheet (e.g., volta repeats, which the musician expands while playing), and abstractions that exist only in the source text (e.g., \repeat unfold, named music expressions) and are interpreted by lilypond. Why the difference?
My main critique of the lilypond language design is
I’d like to end this with a positive statement: Above all, lilypond does work as advertised. I can write “real book”-like lead sheets “from scratch” (that is, from the audio) with a reasonable amount of work (say, 1 hour per song, for a first playable approximation), then lilypond renders them nicely, and we use them for practice in the band. I can also produce MIDI and play along for self-study.
The “pain” of using lilypond was never strong enough for me to say “yikes, I’d rather write Haskore and have that compiled to lilypond”. Of course, it cannot hurt to try alternatives, say, hly and others, also mentioned there.