Types in csound-expression

This is my best attempt at explaining some design choices in https://hackage.haskell.org/package/csound-expression, “a library to make electronic music”, by Anton Kholomiov.

This will be like a compressed version of https://github.com/spell-music/csound-expression/blob/master/tutorial/chapters/BasicTypesTutorial.md which has lots of examples.

The basic types and type classes of c-e are:

Sig: description of a signal
SE a (polymorphic): side-effecting action that produces a result of type a
Evt a (polymorphic): a stream of events of type a

(I am ignoring Str and Tab)

Signals and their Descriptions

For audio processing, the central concept is “Signal”, which is a function from time to (numerical) value. So why don’t we declare type Sig = (Time -> Double)?

In the above, I was careful to write “description of a signal”. The actual signal (the semantics of the description) is not known to the Haskell program, since it will only be realized by the csound back-end. There is no way to compute (in ghci) the value of some Sig at some specific time. What we can do is

create signal descriptions (e.g., constant signals, described by polymorphic numerical literals)
combine signal descriptions (add, multiply, etc.), producing other signal descriptions,
but the only way to consume them, is to use them as argument for the dac function, which will compile the description to a csound expression, and send it to the csound server for rendering.

What we cannot do is, e.g., program a function f :: Double -> Double in Haskell and map it over a Sig. This would work if Sig were an actual signal, but it is only a signal description. So, to map a function over the description of a signal, we would need a description of the function that the csound back-end understands. This works if we can write the function at a generic type, e.g., f :: Fractional a => a -> a defined by f x = 0.5 * x: since we have instance Fractional Sig, we can then just apply f, as in f (osc 220).
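The difference between a signal and its description can be sketched with a toy model. The Sig type below is an assumption made for illustration (a small expression tree) and is not the library's actual definition; only the names osc, Num and Fractional are taken from the text. It shows why a generic function can be applied to a description while a function on Double cannot.

```haskell
-- Toy model of a signal *description*; this is NOT the library's real Sig.
data Sig
  = Const Double      -- constant signal
  | Osc Sig           -- oscillator whose frequency is given by another signal
  | Add Sig Sig
  | Mul Sig Sig
  | Div Sig Sig
  | Abs Sig
  | Signum Sig
  deriving (Eq, Show)

-- Arithmetic only builds bigger descriptions; nothing is evaluated here.
instance Num Sig where
  (+)         = Add
  (*)         = Mul
  negate x    = Mul (Const (-1)) x
  abs         = Abs
  signum      = Signum
  fromInteger = Const . fromInteger

instance Fractional Sig where
  (/)          = Div
  fromRational = Const . fromRational

osc :: Sig -> Sig
osc = Osc

-- A generic function: usable at Double and at the toy Sig alike.
halve :: Fractional a => a -> a
halve x = 0.5 * x

-- At Sig, applying halve yields a description, not a number.
example :: Sig
example = halve (osc 220)
```

In this model, a function of type Double -> Double has nothing to act on: a Sig contains no samples, only constructors, so the only functions that apply are those that can be expressed through the class instances.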

Numerical notation and operations on Signal descriptions

For convenience of writing, we can use signal descriptions like numbers - sometimes. We have instance Num Sig, so we can use operators + and * to combine signal descriptions. We can also write numeric literals to denote constant signals, e.g., 440 :: Sig. This is needed, e.g., when writing osc 440, where osc :: Sig -> Sig expresses the fact that the frequency of this harmonic oscillation is given by the argument signal.

This allows us to write osc (220 * exp (osc 1)), where the outer osc denotes a VCO, and the argument expression denotes the LFO that produces its control voltage. Note that exp is used at type Sig -> Sig here. This works because of instance Floating Sig.

There is another type D: description of a number. It is used, e.g., in the envelope generator xeg :: D -> D -> D -> D -> Sig. We can produce D by double :: Double -> D, but this function has no inverse. We can make a constant signal by sig :: D -> Sig. There is no inverse function, so we can not voltage-control this envelope generator.

Actions

Some CE functions use Sig, some use SE Sig, e.g., white noise is white :: SE Sig. The difference is that Sig is a signal description, while SE Sig is an action that

may have a side effect (invisible: initialize some data structures, or visible: draw GUI elements on screen)
and produce Sig as result.

The type system enforces the distinction between an action and its result. We use actions in these ways:

create an action by return x (without side effects) or by some pre-defined function, e.g., white (create a white noise generator), slider .. (create a GUI element)
combine actions sequentially, mostly using do notation, where the result is still an action
run a (combined) action by using it as an argument of dac.

CE has some convenience instances and functions that allow us to handle Sig and SE Sig alike in several common situations. For instance, given these functions

sqr :: Sig -> Sig
white :: SE Sig
mlp :: Sig -> Sig -> Sig -> Sig

this will work: mlp 400 0.8 $ sqr 300 but this will not: mlp 400 0.8 $ white.

We could write mlp 400 0.8 <$> white but it would still show the difference. The solution is to use at:

at (mlp 400 0.8) $ sqr 300
at (mlp 400 0.8) $ white

This works because of

class SigSpace b => At a b c where
  type family AtOut a b c :: *
  at :: At a b c => (a -> b) -> c -> AtOut a b c
instance SigSpace a => At Sig Sig a where
  type AtOut Sig Sig a = a
  at f a = mapSig f a
instance SigSpace Sig
instance SigSpace a => SigSpace (SE a)

Since we also have

instance At Sig (SE Sig) (SE Sig)

we can write

at (fvdelays 1 [(utri 0.1, 0.9)] 0.8)
  $ hall 0.5 $ mul (upw 0.1 1) $ sqr 300

where the first argument of at has type Sig -> SE Sig because of

fvdelays :: MaxDelayTime
     -> [(DelayTime, Feedback)] -> Balance -> Sig -> SE Sig
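To make the role of SigSpace more concrete, here is a toy model of the idea. Everything in it is an assumption made for illustration: Sig is a tiny expression type, SE is a hand-written counter-threading action, Lowpass stands in for a filter like mlp (with fewer arguments), and the class is a simplified SigSpace with a single method. It is not how the library implements these names; it only shows how one function can map a signal transformer over both a Sig and an SE Sig.

```haskell
-- Toy stand-ins; NOT the library's real types.
data Sig
  = Sqr Double            -- square wave with a fixed frequency
  | Noise Int             -- noise generator, numbered when it is created
  | Lowpass Double Sig    -- low-pass filter with a cutoff frequency
  deriving (Eq, Show)

-- Toy action: threads a counter that numbers the noise generators.
newtype SE a = SE { runSE :: Int -> (a, Int) }

instance Functor SE where
  fmap f (SE g) = SE $ \n -> let (a, n') = g n in (f a, n')

-- Each use of white allocates a fresh generator: that is its side effect.
white :: SE Sig
white = SE $ \n -> (Noise n, n + 1)

-- Simplified SigSpace: map a signal transformer over anything signal-like.
class SigSpace a where
  mapSig :: (Sig -> Sig) -> a -> a

instance SigSpace Sig where
  mapSig f = f

instance SigSpace a => SigSpace (SE a) where
  mapSig f = fmap (mapSig f)

-- One spelling for both cases, in the spirit of at.
filteredSqr :: Sig
filteredSqr = mapSig (Lowpass 400) (Sqr 300)

filteredNoise :: SE Sig
filteredNoise = mapSig (Lowpass 400) white
```

The instance for SE a is just fmap underneath, which is why writing mlp 400 0.8 <$> white also works; the class merely hides the difference in the notation.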

What can we do with results of Actions?

Actions of type SE a are executed by the csound back-end. This means that their results can never be observed by the Haskell program (the front-end). So, the statement instance Monad SE seems misleading: the bind function (>>=) has type

SE a -> (a -> SE b) -> SE b

indicating that in a >>= f, the function f can look at the result of action a, and decide about the continuation afterwards.

I don’t think this can happen. Instead, instance Applicative SE should be enough: we still can combine actions, with

(<*>) :: SE (a -> b) -> SE a -> SE b

, but the type makes clear that we have to decide on the continuation (after SE a) beforehand, by providing an action SE (a -> b) that produces the continuation.

An indicator for Applicative is that programs actually don’t use (>>=), but only fmap. Example: https://hackage.haskell.org/package/csound-catalog-0.7.2/docs/src/Csound.Catalog.Wave.Deserted.html#wind

The do notation for monads is used in CE in several places, e.g., https://hackage.haskell.org/package/csound-expression-5.3.2/docs/src/Csound.Control.Gui.html#lift2

lift2 gf f ma mb = source $ do
    (ga, a) <- ma
    (gb, b) <- mb
    return $ (gf ga gb, f a b)

but this could be rewritten into

lift2 gf f ma mb = source $
  (\ (ga, a) (gb, b) -> (gf ga gb, f a b)) <$> ma <*> mb

With recent compilers, that’s not even necessary, since “Applicative Do” is available. https://ghc.haskell.org/trac/ghc/wiki/ApplicativeDo.
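The equivalence of the two formulations can be checked outside the library. The sketch below is a simplification and an assumption on my part: the source wrapper is dropped, the action type is left abstract (any Monad, respectively any Applicative), and the names lift2M and lift2A are made up for this comparison.

```haskell
-- Monadic formulation, following the do block quoted above.
lift2M :: Monad m
       => (g -> g -> g) -> (a -> b -> c) -> m (g, a) -> m (g, b) -> m (g, c)
lift2M gf f ma mb = do
  (ga, a) <- ma
  (gb, b) <- mb
  return (gf ga gb, f a b)

-- Applicative formulation: a curried two-argument function,
-- lifted over both actions with <$> and <*>.
lift2A :: Applicative m
       => (g -> g -> g) -> (a -> b -> c) -> m (g, a) -> m (g, b) -> m (g, c)
lift2A gf f ma mb =
  (\(ga, a) (gb, b) -> (gf ga gb, f a b)) <$> ma <*> mb

-- Both versions agree, here tried at the Maybe type.
viaMonad, viaApplicative :: Maybe (String, Int)
viaMonad       = lift2M (++) (+) (Just ("a", 1)) (Just ("b", 2))
viaApplicative = lift2A (++) (+) (Just ("a", 1)) (Just ("b", 2))
```

Neither pattern match inspects the result of ma before deciding to run mb, which is exactly the situation where Applicative suffices.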

Type Classes

We earlier said that dac takes a Sig argument, and now it’s SE Sig? Both statements are true, since the actual type is

dac :: RenderCsd a => a -> IO ()

This uses type classes with instances

class RenderCsd a where ...
instance Sigs a => RenderCsd a
instance Sigs a => RenderCsd (SE a)

class Sigs a
instance Sigs Sig

There are (many) more signal-like things, e.g., some effects produce stereo signals, like magicCave :: Sig -> Sig2, with

type Sig2 = (Sig, Sig)
instance (Sigs a1, Sigs a2) => Sigs (a1, a2)
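How a single dac can accept a Sig, a pair of signals, and an SE Sig can be sketched with a toy version of these classes. All definitions below are assumptions made for illustration: Sig is just a label, SE carries no real effects, the method names channels and render are made up, and dac prints a string instead of talking to Csound. The RenderCsd instances are spelled out per type so that the sketch needs no language extensions.

```haskell
-- Toy stand-ins; NOT the library's real definitions.
newtype Sig  = Sig String   -- a signal description, here just a label
newtype SE a = SE a         -- an action; side effects are omitted in this toy

-- Things made of signals: one channel, or tuples of such things.
class Sigs a where
  channels :: a -> [String]

instance Sigs Sig where
  channels (Sig name) = [name]

instance (Sigs a, Sigs b) => Sigs (a, b) where
  channels (x, y) = channels x ++ channels y

-- Things the toy dac accepts.
class RenderCsd a where
  render :: a -> String

instance RenderCsd Sig where
  render s = "play " ++ unwords (channels s)

instance (Sigs a, Sigs b) => RenderCsd (a, b) where
  render p = "play " ++ unwords (channels p)

instance Sigs a => RenderCsd (SE a) where
  render (SE x) = "run effects, then play " ++ unwords (channels x)

-- Stand-in for dac: prints instead of sending anything to Csound.
dac :: RenderCsd a => a -> IO ()
dac = putStrLn . render
```

With these instances, dac (Sig "mono"), dac (Sig "left", Sig "right") and dac (SE (Sig "noise")) all type-check, and the instance chosen decides how the argument is rendered.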

Events

(The documentation in https://github.com/spell-music/csound-expression/blob/master/tutorial/chapters/EventsTutorial.md is fine, I am just repeating essentials here.)

Evt a is a stream of events of type a. We can think of it as (a representation for) [(Time, a)], with increasing time.

Events can come from external sources (e.g., a keyboard connected via MIDI), or from internal sources, e.g., metro 2 :: Evt Unit is a unit event every half second.

This can be used to trigger sound generation. We need some extra types.

A score Sco a is an event with a duration. This models notes: when we read or play a note (in a written score), we see when to play it (that’s the time of the event), but we also see (immediately) for how long to play it (the graphical representation for quarter notes, halves, etc.). We make a stream of eighth notes from a metronome:

withDur :: Sig -> Evt a -> Evt (Sco a)
withDur (1/8) $ metro 2 :: Evt (Sco Unit)
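The list picture of event streams can be written down directly. This is only a mental model: the library's Evt is not a list, and the types here are simplified assumptions (Double instead of Sig, () instead of Unit, and a made-up record Sco with fields duration and payload).

```haskell
-- A list-based mental model; the library's Evt is NOT implemented like this.
type Time  = Double
type Evt a = [(Time, a)]    -- events with increasing time stamps

-- A note: a value together with its duration.
data Sco a = Sco { duration :: Time, payload :: a }
  deriving (Eq, Show)

-- freq unit events per second, starting at time 0 (an infinite list).
metro :: Double -> Evt ()
metro freq = [ (fromIntegral k / freq, ()) | k <- [0 :: Int ..] ]

-- Turn every event into a note of the given duration.
withDur :: Time -> Evt a -> Evt (Sco a)
withDur d evts = [ (t, Sco d x) | (t, x) <- evts ]

-- The first three eighth notes of the stream from the text.
firstNotes :: [(Time, Sco ())]
firstNotes = take 3 (withDur (1/8) (metro 2))
```

In this model, firstNotes contains notes at times 0, 0.5 and 1, each with duration 0.125.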

An instrument is of type a -> SE Sig; it makes signals from notes.

The sched function produces a signal from an instrument and a score.

sched :: (Arg a, Sigs b) => (a -> SE b) -> Evt (Sco a) -> b

We will use this for a = Unit, b = Sig, in:

dac $ sched (\ _ -> mul (fades 0.01 0.3) $ pink )
    $ withDur (1/8) $ metro 2

Finally, some music

let ticker a b c = sched (const $ return $ sqr a) $ withDur b $ metro c
dac $ at (echo 0.75 $ uosc 0.17)
    $ hall 0.3 $ fvdelay 1 (0.005 * uosc 0.15) 0.2
    $ mlp (exp (1 * osc 0.08) * 500) (utri 0.04 * 0.9) $ mul 0.3
    $ mul (uosc 0.12) (ticker (sqr 180) 0.1 2)
    + mul (uosc 0.3) (ticker (sqr 150) 0.1 3)
    + mul (uosc 0.21) (ticker (sqr 120) 0.1 1)
    + mul (uosc 0.17) (ticker (sqr 240) 0.1 0.5)
