The Karplus-Strong Algorithm
Quarto
Quarto enables you to weave together content and executable code into a finished document. To learn more about Quarto see https://quarto.org.
WebR - R in the Browser
WebR is a version of the statistical language R compiled for the browser and Node.js using WebAssembly, via Emscripten.
WebR makes it possible to run R code in the browser without the need for an R server to execute the code: the R interpreter runs directly on the user’s machine. Several R packages have also been ported for use with webR, and can be loaded in the usual way using the library() function.
The Karplus-Strong Algorithm
The Karplus-Strong algorithm is a simple digital feedback loop with an internal buffer of M samples. The buffer is filled with a set of initial values and the loop, when running, produces an arbitrarily long output signal. Although elementary, the K-S loop can be used to synthesize interesting musical sounds, as we will see in this notebook.
Let’s start with a basic implementation of the K-S loop:
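As a minimal sketch in R: without any decay, the loop simply outputs the initial buffer over and over, so a basic version can tile the buffer and trim the result to the requested length. The function name ks_basic is a hypothetical choice for illustration.

```r
# Hypothetical helper (name chosen for illustration): basic K-S loop without decay.
# The output is the initial buffer x repeated end to end, trimmed to N samples.
ks_basic <- function(x, N) {
  rep(x, length.out = N)
}

# A 3-sample buffer looped to 8 samples
y <- ks_basic(c(1, 2, 3), 8)
print(y)  # 1 2 3 1 2 3 1 2
```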
Now that we have our utility functions, let’s set up the sampling rate:
With this sampling rate, and since the period of the generated signal is equal to the length of the initial buffer, we can compute the fundamental frequency of the resulting sound. For instance, if we initialize the K-S algorithm with a vector of 50 values, the buffer will fit 16000/50 = 320 times in a second’s worth of samples or, in other words, the resulting frequency will be 320 Hz, which corresponds roughly to an E4 on a piano (329.6 Hz).
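The arithmetic can be checked with a few lines of R. This sketch assumes the sampling rate of 16000 Hz used in the text; the variable names are illustrative, and the comparison with E4 uses the standard equal-temperament formula with A4 at 440 Hz.

```r
Fs <- 16000          # sampling rate in Hz, as used in the text
M  <- 50             # length of the initial buffer in samples
f0 <- Fs / M         # fundamental frequency of the looped buffer
print(f0)            # 320

# E4 is 5 semitones below A4 (440 Hz) in equal temperament
e4 <- 440 * 2^(-5 / 12)
# Distance between the generated pitch and E4 in cents (100 cents = 1 semitone)
cents <- 1200 * log2(f0 / e4)
print(round(c(E4 = e4, cents = cents), 1))
```

The generated pitch comes out about half a semitone below E4, which is why the match is only rough.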
We still haven’t talked about what to use as the initial values for the buffer. Well, the cool thing about K-S is that we can use pretty much anything we want; as a matter of fact, using random values will give you a totally fine sound. As a proof, consider this initial data set:
Let’s now generate a 2-second audio clip:
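A possible sketch of these two steps, assuming a buffer of 50 uniformly distributed random values and the basic buffer-tiling loop; the helper name ks_basic and the choice of runif are assumptions made for illustration.

```r
set.seed(1)                         # fixed seed so the sketch is reproducible
Fs <- 16000                         # sampling rate in Hz
b  <- runif(50, min = -1, max = 1)  # 50 random values as the initial buffer

# Basic K-S loop (hypothetical helper): tile the buffer and trim to N samples
ks_basic <- function(x, N) rep(x, length.out = N)

y <- ks_basic(b, 2 * Fs)            # two seconds of audio samples
print(length(y))                    # 32000
# The signal is periodic with period 50: samples 51..100 repeat samples 1..50
print(all(y[51:100] == y[1:50]))    # TRUE
```

How the samples in y are turned into audible sound depends on the environment the notebook runs in, so playback is left out of this sketch.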
OK, so the K-S algorithm works! From the signal processing point of view, we can describe the system with the following block diagram (ignore the factor \(\alpha\) for the moment):
The output can be expressed as \[ y[n] = x[n] + y[n - M] \] assuming that the input is the finite-support signal \[ x[n] = \begin{cases} 0 & \mbox{for $n < 0$} \\ b_n & \mbox{for $0 \le n < M$} \\ 0 & \mbox{for $n \ge M$} \end{cases} \]
Let’s implement the K-S algorithm as a signal processing loop:
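One way to write the difference equation \(y[n] = x[n] + y[n - M]\) directly is sketched here, assuming zero initial conditions. The name ks_loop is hypothetical, and the loop favors clarity over speed.

```r
# Hypothetical helper: K-S as a feedback loop, y[n] = x[n] + y[n - M],
# with zero initial conditions. R vectors are 1-based, so index i = n + 1.
ks_loop <- function(x, N) {
  M <- length(x)
  y <- numeric(N)
  for (i in seq_len(N)) {
    input    <- if (i <= M) x[i] else 0      # x[n] is zero outside the buffer
    feedback <- if (i > M) y[i - M] else 0   # y[n - M] is zero for n < M
    y[i] <- input + feedback
  }
  y
}

x <- c(0.5, -1, 0.25)
print(ks_loop(x, 7))
```

The output is the buffer repeated end to end, the same result as tiling the buffer directly.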
By looking at the block diagram we can see a simple modification that adds a lot of realism to the sound: by setting \(\alpha\) to a value close to but less than one, we can introduce a decay in the note that produces guitar-like sounds: \[ y[n] = x[n] + \alpha y[n - M] \]
If we now plot the resulting K-S output, we can see the decaying envelope:
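The decay can also be checked numerically. The sketch below assumes a loop implementing \(y[n] = x[n] + \alpha y[n - M]\) with zero initial conditions; the name ks_decay and the value \(\alpha = 0.9\) are chosen only for illustration.

```r
# Hypothetical helper: K-S loop with decay, y[n] = x[n] + alpha * y[n - M]
ks_decay <- function(x, N, alpha = 0.99) {
  M <- length(x)
  y <- numeric(N)
  for (i in seq_len(N)) {
    input    <- if (i <= M) x[i] else 0
    feedback <- if (i > M) alpha * y[i - M] else 0
    y[i] <- input + feedback
  }
  y
}

set.seed(1)
M <- 50
x <- runif(M, min = -1, max = 1)
y <- ks_decay(x, 20 * M, alpha = 0.9)

# Peak amplitude within each successive period of M samples (one column per period)
peaks <- apply(matrix(abs(y), nrow = M), 2, max)
# Each period is the previous one scaled by alpha, so the ratio is constant
print(range(peaks[-1] / peaks[-length(peaks)]))
```

Calling plot(y, type = "l") on the result should show the same exponential envelope graphically.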
There is just one last detail (the devil’s in the details, here as everywhere else). Consider the output of a dampened K-S loop; every time the initial buffer goes through the loop, it gets multiplied by \(\alpha\) so that we can write
\[ y[n] = \alpha^{\lfloor n/M \rfloor}x[n \mod M] \]
(think about it and it will make sense). What that means is that the decay envelope is dependent on both \(\alpha\) and \(M\) or, in other words, the higher the pitch of the note, the faster its decay. For instance:
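To make the dependence on \(M\) concrete, the following sketch evaluates the closed-form expression and compares the envelope after one second for two buffer lengths. It assumes a sampling rate of 16000 Hz and \(\alpha = 0.99\); the function names are hypothetical.

```r
Fs <- 16000
alpha <- 0.99

# Closed form y[n] = alpha^floor(n / M) * x[n mod M], with 0-based n
ks_closed_form <- function(x, N, alpha) {
  M <- length(x)
  n <- 0:(N - 1)
  alpha^(n %/% M) * x[(n %% M) + 1]
}

# Envelope value alpha^floor(n / M) after one second (n = Fs samples)
envelope_after_1s <- function(M) alpha^floor(Fs / M)

low_note  <- envelope_after_1s(100)  # 160 Hz: 160 passes through the loop
high_note <- envelope_after_1s(50)   # 320 Hz: 320 passes through the loop
print(c(low = low_note, high = high_note))
```

With these values the higher note has dropped to roughly 4% of its initial amplitude after one second, while the lower one is still at about 20%.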
This is no good and therefore we need to compensate so that, if \(\alpha\) is the same, the decay rate is the same. This leads us to the last implementation of the K-S algorithm:
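One possible way to compensate, sketched under the assumption that \(\alpha\) is interpreted as the per-pass decay for a reference buffer length: raise \(\alpha\) to the power \(M / M_{\mathrm{ref}}\), so that longer buffers, which loop less often, lose more amplitude per pass. The names ks and ref_len and the reference length of 50 samples are illustrative assumptions.

```r
# Hypothetical helper: K-S loop with a decay rate that does not depend on pitch.
# ref_len is an assumed reference buffer length: alpha is the per-pass decay
# that a buffer of ref_len samples would get.
ks <- function(x, N, alpha = 0.99, ref_len = 50) {
  M <- length(x)
  a <- alpha^(M / ref_len)   # longer buffers loop less often, so decay more per pass
  y <- numeric(N)
  for (i in seq_len(N)) {
    input    <- if (i <= M) x[i] else 0
    feedback <- if (i > M) a * y[i - M] else 0
    y[i] <- input + feedback
  }
  y
}

# Constant buffers make the envelope easy to read off
y50  <- ks(rep(1, 50), 400)
y100 <- ks(rep(1, 100), 400)
# After 200 samples both notes have decayed by the same factor, 0.99^4
print(c(y50[201], y100[201]))
```

With this adjustment the envelope after \(n\) samples is approximately \(\alpha^{n / M_{\mathrm{ref}}}\) regardless of the buffer length.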
Playing Music!
Let’s now play some cool guitar and, arguably, no guitar chord is as cool as the opening chord of “A Hard Day’s Night”, by The Beatles.
Much has been written about the chord (which, in fact, is made up of 2 guitars, one of which is a 12-string, a piano and a bass) but to keep things simple, we will accept the most prevalent thesis, which states that the notes are \(D_3, F_3, G_3, F_4, A_4, C_5\) and \(G_5\). To give it a “wider” feeling we will add an extra \(D_2\) below.
In Western music, where equal temperament is used, \(A_4\) is the reference pitch, with a frequency of 440 Hz. All other notes can be computed using the formula \(f(n) = A_4 \times 2^{n/12}\) where \(n\) is the number of half-tones between \(A_4\) and the desired note. The exponent \(n\) is positive if the note is above \(A_4\) and negative otherwise.
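The formula translates directly into R. In this sketch the function name note_freq is hypothetical, and the half-tone offsets for the chord's notes are counted by hand from \(A_4\).

```r
# Frequency of the note n half-tones away from A4 (440 Hz), equal temperament
note_freq <- function(n, a4 = 440) a4 * 2^(n / 12)

# Half-tone offsets from A4 for the notes of the chord (negative = below A4)
offsets <- c(D2 = -31, D3 = -19, F3 = -16, G3 = -14,
             F4 = -4,  A4 = 0,   C5 = 3,   G5 = 10)
print(round(note_freq(offsets), 2))
```

For example, \(D_3\) lies 19 half-tones below \(A_4\), which gives a frequency of about 146.83 Hz.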
Each note is generated using a separate Karplus-Strong algorithm. We try to mix the different “instruments” by assigning a different gain to each note. Also, we sustain Paul’s D note on the bass a bit longer by changing the corresponding decay factor.
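A rough sketch of how such a mix could be put together is given here. Everything in it is an assumption made for illustration: the helper names, the pitch-compensated loop with a reference length of 50 samples, and in particular the gains and decay factors, which are guesses and not values from the notebook.

```r
Fs <- 16000

# Hypothetical helpers, restated so the sketch is self-contained
note_freq <- function(n, a4 = 440) a4 * 2^(n / 12)
ks <- function(x, N, alpha = 0.99, ref_len = 50) {
  M <- length(x)
  a <- alpha^(M / ref_len)                 # pitch-independent decay
  y <- numeric(N)
  for (i in seq_len(N)) {
    y[i] <- (if (i <= M) x[i] else 0) + (if (i > M) a * y[i - M] else 0)
  }
  y
}

# One row per note: half-tone offset from A4, gain and decay factor.
# Gains and decay factors are illustrative guesses; D3 gets a slower decay.
chord <- data.frame(
  note   = c("D2", "D3", "F3", "G3", "F4", "A4", "C5", "G5"),
  offset = c(-31, -19, -16, -14, -4, 0, 3, 10),
  gain   = c(2, 3, 1, 3, 1, 1, 1, 3),
  alpha  = c(0.99, 0.999, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99)
)

set.seed(1)
N <- 2 * Fs
y <- numeric(N)
for (k in seq_len(nrow(chord))) {
  M <- round(Fs / note_freq(chord$offset[k]))  # buffer length for this pitch
  x <- runif(M, min = -1, max = 1)             # random initial buffer
  y <- y + chord$gain[k] * ks(x, N, chord$alpha[k])
}
y <- y / max(abs(y))                           # normalize to [-1, 1]
```

Because buffer lengths are rounded to whole samples, each note's pitch is only approximately the equal-tempered one, and the error grows for higher notes.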
Close enough, no? (Check here).
You can now play around with other famous chords; try for instance the “Mystic Chord” by Scriabin, whose notes are
\(C_3, F^{\sharp}_3, B^{\flat}_3, E_4, A_4, D_5\).
(Check here)
Final Quiz
How would you describe what’s happening here?
Answer to Final Quiz
What’s happening in the final quiz is a demonstration of the additive nature of sound waves. We’re creating two separate random buffers (a of length 100 and b of length 80) and then:
- We repeat buffer a four times: c(a, a, a, a), creating a composite buffer with a fundamental frequency of Fs/100 = 160 Hz
- We repeat buffer b five times: c(b, b, b, b, b), creating a composite buffer with a fundamental frequency of Fs/80 = 200 Hz
- We add these two repeated patterns together, creating a buffer that contains two overlapping frequencies
When we feed this combined buffer into the Karplus-Strong algorithm, we’re essentially hearing two notes played simultaneously - a chord with frequencies of 160 Hz and 200 Hz. These two frequencies have a ratio of 5:4, which is a major third interval in music theory.
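The construction of the combined buffer can be sketched as follows, assuming a sampling rate of 16000 Hz and uniformly distributed random values; the exact random generator is an assumption.

```r
set.seed(1)
Fs <- 16000
a <- runif(100, min = -1, max = 1)   # one period of the 160 Hz component
b <- runif(80,  min = -1, max = 1)   # one period of the 200 Hz component

# Both repetitions are 400 samples long, so they can be added sample by sample
x <- c(a, a, a, a) + c(b, b, b, b, b)
print(length(x))                     # 400
print(c(Fs / 100, Fs / 80))          # 160 200
print((Fs / 80) / (Fs / 100))        # 1.25, i.e. a 5:4 ratio
```

The lengths match because 400 is the least common multiple of 100 and 80, so the sum contains a whole number of periods of each component.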
The two frequencies differ by 40 Hz (the beat frequency), which is also the rate at which the combined 400-sample buffer repeats, since Fs/400 = 40 Hz. This difference can give the sound a rough or pulsating quality, an example of constructive and destructive interference between two sound waves with different frequencies.
In musical terms, we’re creating a simple harmony by combining two notes in a pleasing interval ratio.