Karplus-Strong in C++
The Karplus-Strong algorithm is a proto-physical model. The underlying theory is covered in the Karplus-Strong Section of the Sound Synthesis Introduction. Although the resulting sounds are very interesting, the algorithm is easy to implement, especially in C/C++. It is based on a single buffer, filled with noise, and a moving average filter for smoothing.
The Noise Buffer
Besides the general framework of all examples in this teaching unit, the karplus_strong_example needs just a few additional elements, defined in the class's header:
    // the buffer length
    int l_buff = 600;
    // the 'playback position' in the buffer
    int buffer_pos = 0;
    /// noise buffer
    double *noise_buffer;
    /// length of moving average filter
    int l_smooth = 10;
    // feedback gain
    double gain = 1.0;
Note that the pitch of the resulting sound is hard-coded in this example, since it is based only on the sampling rate of the system and the buffer length. In contrast to the original Karplus-Strong algorithm, this version uses an arbitrary length for the moving average filter, instead of only two samples. This results in a faster decay of high frequency components.
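As a rough check of the hard-coded pitch: one pass through the buffer corresponds to one period, so the fundamental frequency is approximately the sampling rate divided by the buffer length. The helper below is a hypothetical illustration; the sampling rate of 48 kHz used in the note is an assumption (it depends on the JACK server), and the small additional delay of the smoothing filter is ignored.

```cpp
// Approximate fundamental frequency of the buffer loop in Hz.
// Assumption: one pass through the buffer equals one period;
// the extra delay of the moving average filter is neglected.
double approx_fundamental_hz(double sample_rate, int l_buff)
{
    return sample_rate / static_cast<double>(l_buff);
}
```

With `l_buff = 600` and an assumed sampling rate of 48000 Hz, this gives a fundamental of about 80 Hz.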
Initializing the Buffer
Since the noise buffer is implemented as a pointer to an array of doubles, it first needs to be allocated and initialized. This happens in the constructor of the karplus_strong_example class:
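A minimal sketch of such an allocation could look as follows. The class name `NoiseBufferSketch` is hypothetical and only stands in for the buffer-related members of the example class; zero-initializing the array and releasing it in the destructor are assumptions about a reasonable implementation, not a copy of the example's code.

```cpp
#include <algorithm>

// Hypothetical stand-in for the buffer-related members of the example class.
class NoiseBufferSketch
{
public:
    explicit NoiseBufferSketch(int length)
        : l_buff(length), noise_buffer(new double[length])
    {
        // start with silence until the first pluck fills the buffer
        std::fill(noise_buffer, noise_buffer + l_buff, 0.0);
    }

    ~NoiseBufferSketch() { delete[] noise_buffer; }

    // the class owns the raw array, so copying is disabled
    NoiseBufferSketch(const NoiseBufferSketch&) = delete;
    NoiseBufferSketch& operator=(const NoiseBufferSketch&) = delete;

    int l_buff;            // the buffer length
    int buffer_pos = 0;    // the 'playback position' in the buffer
    double* noise_buffer;  // the noise buffer
};
```

A `std::vector<double>` would avoid the manual memory management; the raw pointer is kept here only to stay close to the header shown above.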
Plucking the Algorithm
Each time the Karplus-Strong algorithm is excited, or plucked, the buffer needs to be filled with a sequence of random noise. At each call of the JACK callback function (process), it is checked whether a new event has been triggered via MIDI or OSC. If so, the playback position of the buffer is set to 0 and each sample of the noise_buffer is filled with a random double between -1 and 1:
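The excitation step can be sketched as below. This is an illustration under assumptions: it uses a `std::vector<double>` instead of the raw pointer, the function name `pluck` is hypothetical, and the random number generator from `<random>` is one possible choice, not necessarily the one used in the example.

```cpp
#include <random>
#include <vector>

// Refills the buffer with uniform noise and resets the playback position.
// 'rng' is passed in so that the caller controls the seeding.
void pluck(std::vector<double>& noise_buffer, int& buffer_pos, std::mt19937& rng)
{
    // uniform random doubles in the half-open range [-1, 1)
    std::uniform_real_distribution<double> dist(-1.0, 1.0);

    buffer_pos = 0;
    for (double& sample : noise_buffer)
        sample = dist(rng);
}
```

In a real-time context the generator should be created once, outside the audio callback, and only be used inside it.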
Running Through the Buffer
The sound is generated by directly writing the samples of the noise_buffer to the JACK output buffer. The buffer_pos counter is wrapped at the buffer size, which makes the read-out circular. This example writes the same mono signal to both channels of a stereo output.
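The circular read-out can be sketched as follows. The function `render_block` and its parameters are hypothetical: `out_left` and `out_right` stand in for the two JACK output buffers of one callback, which hold `float` samples, and a `std::vector<double>` replaces the raw pointer.

```cpp
#include <cstddef>
#include <vector>

// Copies the looped buffer to two output channels (same mono signal on both).
void render_block(const std::vector<double>& noise_buffer, std::size_t& buffer_pos,
                  float* out_left, float* out_right, std::size_t nframes)
{
    for (std::size_t i = 0; i < nframes; ++i)
    {
        const float sample = static_cast<float>(noise_buffer[buffer_pos]);
        out_left[i]  = sample;
        out_right[i] = sample;

        // wrap the counter at the buffer size to make the read-out circular
        buffer_pos = (buffer_pos + 1) % noise_buffer.size();
    }
}
```

Because `buffer_pos` is kept between calls, the loop continues seamlessly across callback boundaries.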
Smoothing the Buffer
The above version results in a never-ending oscillation, a white tone. The timbre of this tone changes with every triggering, since a unique random sequence is used each time. With the additional smoothing, the tone will decay and lose the high spectral components, gradually. This is done as follows:
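A minimal sketch of such a smoothing step is given below. It is an illustration under stated assumptions, not the listing of the example itself: a `std::vector<double>` replaces the raw pointer, the function name `karplus_strong_tick` is hypothetical, and the average is taken over the current and the following `l_smooth - 1` samples with wrap-around.

```cpp
#include <cstddef>
#include <vector>

// Returns the current output sample and replaces it in the buffer by the
// scaled moving average over the next 'l_smooth' samples (circular).
double karplus_strong_tick(std::vector<double>& buffer, std::size_t& pos,
                           std::size_t l_smooth, double gain)
{
    const std::size_t n = buffer.size();
    const double out = buffer[pos];

    double sum = 0.0;
    for (std::size_t i = 0; i < l_smooth; ++i)
        sum += buffer[(pos + i) % n];

    // writing the average back is what makes the tone decay over time
    buffer[pos] = gain * sum / static_cast<double>(l_smooth);

    pos = (pos + 1) % n;
    return out;
}
```

With `l_smooth = 2` this corresponds to the two-sample average of the original algorithm; larger values remove high frequency components faster.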
Compiling
To compile the KarplusStrongExample, run the following command line:
g++ -Wall -L/usr/lib src/yamlman.cpp src/main.cpp src/karplus_strong_example.cpp src/oscman.cpp src/midiman.cpp -ljack -llo -lyaml-cpp -lsndfile -lrtmidi -o karplus_strong
This call of the g++ compiler links all necessary libraries and creates the binary karplus_strong.
Running the Example
The binary can be started with the following command line:
This will use the configurations from the YAML file and wait for OSC input. The easiest way of triggering the synth via OSC is to use the Puredata patch from the example's directory.
Exercises
Using MIDI with RtMidi
Although the MIDI protocol is quite old and has several drawbacks, it is still widely used and is appropriate for many applications. Read the MIDI section in the Computer Music Basics for a deeper introduction.
The development system used in this class relies on the RtMidi framework. This allows the inclusion of any ALSA MIDI device on Linux systems and hence any USB MIDI device. The RtMidi Tutorial gives a thorough introduction to the use of the library.
ALSA MIDI
The Advanced Linux Sound Architecture (ALSA) makes audio and MIDI interfaces accessible to software. As an API it is part of the Linux kernel. Other frameworks, like JACK or PulseAudio, work on a higher level and rely on ALSA.
Finding your ALSA MIDI Devices
After connecting a MIDI device to a USB port, it should be available via ALSA. All ALSA MIDI devices can be listed with the following shell command:
The output of this request can look as follows:
    Dir Device    Name
    IO  hw:1,0,0  nanoKONTROL MIDI 1
    IO  hw:2,0,0  PCR-1 MIDI 1
    I   hw:2,0,1  PCR-1 MIDI 2
In this case, two USB MIDI devices are connected. They can be addressed by their MIDI device ID (here hw:1 and hw:2).
The MIDI Tester Example
The MIDI tester example can be used to print all incoming MIDI messages to the console. This can be helpful for reverse-engineering MIDI devices to figure out their controller numbers.
The MIDI Manager Class
The MIDI Manager class introduced in this test example is used as a template for following examples which use MIDI. For receiving messages, RtMidi offers a queued MIDI input and a user callback mode. In the latter case, each incoming message triggers a callback function. For the queued mode, as used here, incoming messages are collected until retrieved by an additional process.
The midiMessage struct is used to store incoming messages. It holds the three standard MIDI message bytes plus a Boolean for the processing state.
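The description above can be sketched as follows. The field names and the helper `to_midi_message` are assumptions made for illustration. In the queued mode, RtMidi delivers each message as a `std::vector<unsigned char>` through `RtMidiIn::getMessage()`; the sketch only converts such a vector and does not open a MIDI port itself.

```cpp
#include <vector>

// Hypothetical layout: three MIDI bytes plus a flag for the processing state.
struct midiMessage
{
    int byte1 = -1;                 // status byte, e.g. note on / control change
    int byte2 = -1;                 // first data byte, e.g. note or controller number
    int byte3 = -1;                 // second data byte, e.g. velocity or value
    bool hasBeenProcessed = false;  // set once the message has been handled
};

// Converts a raw byte vector into the struct.
// Fields without a corresponding byte keep the value -1.
midiMessage to_midi_message(const std::vector<unsigned char>& raw)
{
    midiMessage m;
    if (raw.size() > 0) m.byte1 = raw[0];
    if (raw.size() > 1) m.byte2 = raw[1];
    if (raw.size() > 2) m.byte3 = raw[2];
    return m;
}
```

A polling loop would call `getMessage()` until it returns an empty vector and convert each non-empty result this way.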
SuperCollider Granular Example
The TGrains UGen is an easy-to-use granular synth. It uses a Hanning window for each grain and offers control over position, pitch and length of the grains.
The help files offer multiple examples for using this unit generator.
The following example uses a simple pulse train for triggering grains.
Reading Channels
A single channel is loaded to a buffer from a sample for this granular example. The duration in seconds can be queried from the buffer object, once loaded.
The Granular Node
The granular node uses an Impulse UGen to create a trigger signal for the TGrains UGen.
This node has several arguments to control the granular process:
The density defines how often a grain is triggered per second.
Every grain can be pitch shifted by a value (1 = default rate).
The grain duration is specified in seconds.
The grain center is defined in seconds.
A gain parameter can be used for amplification.
buffer specifies the index of the buffer to be used.
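To illustrate how pitch, duration, center and gain shape a single grain, the following C++ sketch renders one Hann-windowed grain from a mono sample. It is a conceptual illustration only and not how TGrains is implemented; the function `render_grain`, its parameter names and the linear interpolation are assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Renders one grain: 'center_s' and 'duration_s' in seconds, 'rate' as
// playback rate (1 = original pitch), 'gain' as linear amplification.
std::vector<float> render_grain(const std::vector<float>& buffer, double sample_rate,
                                double center_s, double duration_s,
                                double rate, double gain)
{
    const double pi = std::acos(-1.0);
    const long long frames = std::llround(duration_s * sample_rate);
    if (frames < 2 || buffer.size() < 2)
        return std::vector<float>();

    const std::size_t n = static_cast<std::size_t>(frames);
    const double last = static_cast<double>(buffer.size() - 1);
    const double center = center_s * sample_rate;
    std::vector<float> grain(n, 0.0f);

    for (std::size_t i = 0; i < n; ++i)
    {
        // Hann window over the grain duration
        const double w = 0.5 * (1.0 - std::cos(2.0 * pi * i / (n - 1)));
        // read position, symmetric around the center, scaled by the rate
        const double pos = center + (static_cast<double>(i) - 0.5 * (n - 1)) * rate;
        if (pos < 0.0 || pos > last)
            continue;  // outside the sample: leave silence

        // linear interpolation between neighbouring samples
        const std::size_t i0 = static_cast<std::size_t>(pos);
        const std::size_t i1 = std::min(i0 + 1, buffer.size() - 1);
        const double frac = pos - static_cast<double>(i0);
        const double s = (1.0 - frac) * buffer[i0] + frac * buffer[i1];
        grain[i] = static_cast<float>(gain * w * s);
    }
    return grain;
}
```

The density parameter is not part of this sketch; it would determine how often such a grain is started per second.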
Once the node has been created with a nil buffer, the buffer index of the previously loaded sample can be passed. Depending on the nature of the sample, this can already result in something audible:
Manual Parameter Setting
As with any node, the arguments of the granular process can be set manually. Since the center is specified in seconds, the buffer duration is useful at this point.
Exercise
The Fourier Transform
The Discrete Fourier Transform
The frequency-domain representation gives insight into the composition of time series and hence of musical signals. In the digital domain we are foremost interested in discrete signals and will thus introduce the Discrete Fourier Transform (DFT). This section does not aim at a full introduction of the DFT, but illustrates a few aspects which help to better understand the basics of computer music and sound synthesis.
The DFT $X[k]$ of a discrete signal $x$ with the length $N \in \mathbb{N}$ and the sampling frequency $f_s$ is calculated as follows. For every frequency bin $k = 0, \ldots, N-1$ of the output, the correlation of the signal with a complex oscillation of the normalized angular frequency $2 \pi \frac{k}{N}$ is calculated:
$$ \begin{eqnarray} X[k] & = & \sum\limits_{n=0}^{N-1} x[n] \left( \cos \left(2 \pi k \frac{n}{N} \right) -j \sin \left(2 \pi k \frac{n}{N} \right) \right) \\ X[k] & = & \sum\limits_{n=0}^{N-1} x[n] e^{-j 2 \pi k \frac{n}{N}} \end{eqnarray} $$
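The sum above can be evaluated directly. The following sketch is a naive implementation with a cost of $N^2$ operations, meant only to mirror the formula term by term; the function name `dft` is an assumption, and practical applications would use an FFT library instead.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive DFT: correlates the signal with one complex oscillation per bin.
std::vector<std::complex<double>> dft(const std::vector<double>& x)
{
    const double pi = std::acos(-1.0);
    const std::size_t N = x.size();
    std::vector<std::complex<double>> X(N);

    for (std::size_t k = 0; k < N; ++k)
    {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t n = 0; n < N; ++n)
        {
            const double phase = 2.0 * pi * static_cast<double>(k * n)
                                 / static_cast<double>(N);
            // x[n] * (cos(phase) - j sin(phase)) = x[n] * e^{-j phase}
            sum += x[n] * std::complex<double>(std::cos(phase), -std::sin(phase));
        }
        X[k] = sum;
    }
    return X;
}
```

For a cosine with exactly two periods in $N = 8$ samples, only the bins $k = 2$ and $k = 6$ are nonzero, each with a magnitude of $N/2 = 4$.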
Since real and imaginary part of the complex oscillation have a relative phase of $\frac{\pi}{2}$, the correlation does not only deliver information on the magnitude of spectral components, but also on their phase. The real part is a cosine, whereas the imaginary part is a sine function. The following plot shows the real and imaginary component of the complex oscillations for the indices $k=1$ and $k=2$:
Absolute Representation
In many cases, the absolute values of DFT spectra will be shown only for the positive frequencies. This representation is used in most examples in the following sections. The sine wave becomes a single peak at its frequency:
DFT Support Points
For a DFT with $N$ points and a sampling rate $f_s$, the DFT bins $b[k]$ - or support points - are located at the following frequencies:
$$ b[k] = k \frac{f_s}{N} $$
Only for harmonic signals located exactly at these frequencies can the DFT result be expressed by the Dirac delta function.
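The bin frequencies follow directly from the formula above. The helper below is a hypothetical illustration; the values $N = 1024$ and $f_s = 16\ \mathrm{kHz}$ in the note are example values chosen for this sketch.

```cpp
// Frequency in Hz of DFT bin k for an N-point DFT at the given sampling rate.
double bin_frequency_hz(int k, double sample_rate, int n_points)
{
    return k * sample_rate / n_points;
}
```

With $N = 1024$ and $f_s = 16\ \mathrm{kHz}$, the bins are spaced by $15.625\ \mathrm{Hz}$. A sine at $100\ \mathrm{Hz}$ then lies between bin 6 ($93.75\ \mathrm{Hz}$) and bin 7 ($109.375\ \mathrm{Hz}$) and therefore does not fall on a support point.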
Links
This website visualizes the DFT very nicely for a better understanding: https://jackschaedler.github.io/circles-sines-signals/dft_walkthrough.html
The Fourier Transform
DFT of a Sine Wave
In the field of musical signal processing, the sine wave (respectively the cosine wave) is the basic element of complex sounds. Sinusoids can be used to model and synthesize any periodic signal. Hence, the frequency domain representation of these harmonic functions is fundamental to the understanding of many algorithms for analysis and synthesis. In most visualizations in the spectral domain, a sinusoidal component is shown as a single peak at the oscillation's frequency. However, when viewed closely, this peak is smeared and accompanied by several side lobes. The following example derives these characteristics, based on a sine wave of $1024$ samples with a frequency of $f_0 = 100\ \mathrm{Hz}$ at a sampling rate of $f_s = 16\ \mathrm{kHz}$:
For calculating the DFT of sinusoidal signals, it makes sense to express them in the complex notation through Euler's formula:
$$ \sin\left(2 \pi f_0 \frac{n}{f_s}\right) = \frac{1}{2j} \left( e^{j 2 \pi f_0 \frac{n}{f_s} } -e^{-j2 \pi f_0 \frac{n}{f_s}} \right) $$
The DFT of the sine wave thus extends to:
$$ X[k] = \sum\limits_{n=0}^{N-1} \frac{1}{2j} \left(e^{j 2 \pi \frac{f_0}{f_s} n} -e^{-j2 \pi \frac{f_0}{f_s} n} \right) e^{-j 2 \pi k \frac{n}{N}} $$
Solving:
$$
\begin{eqnarray}
X[k] & = & \frac{1}{2j} \sum\limits_{n=0}^{N-1} \left( e^{- j 2 \pi k \frac{n}{N} + j 2 \pi \frac{f_0}{f_s} n } - e^{- j 2 \pi k \frac{n}{N} - j 2 \pi\frac{f_0}{f_s} n} \right) \\
& = & \frac{1}{2j} \sum\limits_{n=0}^{N-1} \left( e^{- j 2 \pi \left( \frac{k}{N} - \frac{f_0}{f_s} \right) n } - e^{- j 2 \pi \left( \frac{k}{N} + \frac{f_0}{f_s} \right) n} \right) \\
& = & \frac{1}{2j} \sum\limits_{n=0}^{N-1} e^{- j 2 \pi \left( \frac{k}{N} - \frac{f_0}{f_s} \right) n } - \frac{1}{2j} \sum\limits_{n=0}^{N-1} e^{- j 2 \pi \left( \frac{k}{N} + \frac{f_0}{f_s} \right) n}
\end{eqnarray}
$$
Geometric Series
Using the geometric series formula (with acknowledgements to 1)
$$ \sum\limits_{n=0}^{N-1} a^n = \frac{1-a^{N}}{1-a} $$
the above equation results in:
$$
X[k] = \frac{1}{2j} \frac{1-e^{-j 2 \pi \left( \frac{k}{N} - \frac{f_0}{f_s} \right) N}}{1-e^{-j 2 \pi \left( \frac{k}{N} - \frac{f_0}{f_s} \right)}}
- \frac{1}{2j} \frac{1-e^{-j 2 \pi \left( \frac{k}{N} + \frac{f_0}{f_s} \right) N}}{1-e^{-j 2 \pi \left( \frac{k}{N} + \frac{f_0}{f_s} \right)}}
$$
Factoring out
The above equation features the following term:
$$ E = \frac{1-e^{-j \Lambda N}}{1-e^{-j \Lambda}} $$
After factoring out the term $\frac{e^{-j \Lambda \frac{N}{2}}}{e^{-j \frac{\Lambda}{2}}}$ we can solve further, using Euler's formula:
$$
\begin{eqnarray}
E & = & \frac{e^{-j \Lambda \frac{N}{2}}} {e^{-j \frac{\Lambda}{2}}} \cdot \frac{e^{j \Lambda \frac{N}{2}} - e^{-j \Lambda \frac{N}{2}}}{e^{j \frac{\Lambda}{2}} - e^{-j \frac{\Lambda}{2}}} \\
& = & e^{-j \Lambda \frac{N-1}{2} } \underbrace{\frac{\sin(\Lambda \frac{N}{2} )}{\sin(\frac{\Lambda}{2})}}_{\text{Dirichlet function}}
\end{eqnarray}
$$
As the plot below shows, the term is characterized by a Dirichlet function, which is related to the sinc function. Inserting
$$ \begin{eqnarray} \Lambda(-f_0) & = & 2 \pi \left( \frac{k}{N} - \frac{f_0}{f_s} \right) \\ \Lambda(+f_0) & = & 2 \pi \left( \frac{k}{N} + \frac{f_0}{f_s} \right) \end{eqnarray} $$
we get the following result for the spectrum of the sinusoid:
$$
\begin{eqnarray}
X[k] & = & \frac{1}{2j} \left( e^{-j \Lambda(-f_0) \frac{N-1}{2} } \frac{\sin(\Lambda(-f_0) \frac{N}{2} )}{\sin(\frac{\Lambda(-f_0) }{2})}
- e^{-j \Lambda(+f_0) \frac{N-1}{2} } \frac{\sin(\Lambda(+f_0) \frac{N}{2} )}{\sin(\frac{\Lambda(+f_0)}{2})} \right)
\end{eqnarray}
$$
According to the shift theorem, the above equation holds two components, one centered at $f_0$ and another centered at $-f_0$, each with a main lobe and a series of decaying side lobes.
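The closed form can be checked numerically against the direct DFT sum. The sketch below evaluates the geometric-series term as $E(\Lambda) = e^{-j \Lambda \frac{N-1}{2}} \sin(\Lambda \frac{N}{2}) / \sin(\frac{\Lambda}{2})$ for both values of $\Lambda$ and combines them with the factor $\frac{1}{2j}$. All function names are hypothetical, and the handling of the case $\sin(\frac{\Lambda}{2}) = 0$ (where the sum equals $N$) is an addition needed for a robust evaluation.

```cpp
#include <cmath>
#include <complex>

using cplx = std::complex<double>;

// Direct evaluation of one DFT bin of sin(2 pi f0 n / fs), n = 0 ... N-1.
cplx dft_bin_direct(int k, int N, double f0, double fs)
{
    const double pi = std::acos(-1.0);
    cplx sum(0.0, 0.0);
    for (int n = 0; n < N; ++n)
    {
        const double x = std::sin(2.0 * pi * f0 * n / fs);
        sum += x * std::polar(1.0, -2.0 * pi * k * n / static_cast<double>(N));
    }
    return sum;
}

// Closed form of sum_{n=0}^{N-1} e^{-j lambda n}.
cplx dirichlet_term(double lambda, int N)
{
    const double denom = std::sin(lambda / 2.0);
    if (std::abs(denom) < 1e-12)
        return cplx(static_cast<double>(N), 0.0);  // all N terms equal 1
    return std::polar(1.0, -lambda * (N - 1) / 2.0)
           * (std::sin(lambda * N / 2.0) / denom);
}

// Spectrum of the sine from the two shifted Dirichlet terms.
cplx dft_bin_closed_form(int k, int N, double f0, double fs)
{
    const double pi = std::acos(-1.0);
    const double lambda_minus = 2.0 * pi * (static_cast<double>(k) / N - f0 / fs);
    const double lambda_plus  = 2.0 * pi * (static_cast<double>(k) / N + f0 / fs);
    return (dirichlet_term(lambda_minus, N) - dirichlet_term(lambda_plus, N))
           / cplx(0.0, 2.0);
}
```

For $N = 1024$, $f_0 = 100\ \mathrm{Hz}$ and $f_s = 16\ \mathrm{kHz}$, both functions should agree for every bin up to rounding errors.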
Magnitude Plot
The plot below visualizes the result, with two main lobes and the decaying side lobes:
Background
The EOC
The Electronic Orchestra Charlottenburg (EOC) was founded at the TU Studio in 2017 as a place for developing and performing with custom musical instruments on large loudspeaker setups.
EOC Website: https://eo-charlottenburg.de/
Initially, the EOC worked in a traditional live setup with a sound director. Several requests arose during the first years:
- enable control of the mixing and rendering system through musicians
  - control spatialization
- flexible spatial arrangement of musicians
  - break up rigid stage setup
- distribution of data
  - scores
  - playing instructions
  - visualization of system states
The SPRAWL System
During the winter semester 2019-20, Chris Chafe was invited as guest professor at the Audio Communication Group. In combined classes, the SPRAWL network system was designed and implemented to address the requests listed above in local networks:
Quarantine Sessions
The quarantine sessions are an ongoing concert series between CCRMA at Stanford, the TU Studio in Berlin, the Orpheus Institute in Gent, Belgium and various guests:
These sessions use the same software components as the SPRAWL System. Audio is transmitted via JackTrip and SuperCollider is used for signal processing.
SuperCollider for the Remote Server
By default, SuperCollider is built with Qt and X for GUI elements and the ScIDE. This can be a problem when running it on a remote server without a persistent SSH connection and starting it as a system service. However, for maintenance purposes a version with full GUI support is a useful tool. One solution is to compile and install both versions and make them selectable via symbolic links:
- build and standard-install a full version of SuperCollider
- build a headless version of SuperCollider (without system install)
- replace the following binaries in /usr/bin with symbolic links to the headless version
  - scsynth
  - sclang
  - supernova
- create scripts for changing the symlink targets
This allows you to redirect the symlinks to the GUI version for development and testing whereas they point to the headless version otherwise.
Compiling a Headless SC
The SC Linux build instructions are very detailed: https://github.com/supercollider/supercollider/blob/develop/README_LINUX.md
Compiling it without all graphical components is straightforward. Simply add the CMake flags -DNO_X11=ON and -DSC_QT=OFF for building a headless version of SuperCollider.
Using JackTrip in the HUB Mode
About JackTrip
In this class we will use JackTrip for audio over network connections, although there have also been some successful tests with zita-njbridge. JackTrip can be used for peer-to-peer connections and for server-client setups. For the latter, JackTrip was extended with the so-called HUB Mode for the SPRAWL System and the EOC in 2019-20.
---
Basics
For connecting to a server or hosting your own instance, the machine needs to be connected to a router directly via Ethernet. Wi-Fi does not provide a robust connection and leads to significant dropouts. JackTrip needs the following ports for communication. If a machine is behind a firewall, these need to be added as exceptions:
| Port | Protocol | Purpose |
|---|---|---|
| 4464 | TCP/UDP | audio packets |
| 61002-62000 | UDP | establish connection (server only) |
The Nils Branch
Due to the increasing interest caused by the pandemic and the endless list
of feature requests, the JackTrip project has been growing rapidly since early 2020,
and the repository has many branches.
In this class we are using the nils branch, which implements some unique features
we need for the flexible routing system. Please check the instructions for compiling and installing a specific branch: Compiling JackTrip
Starting JackTrip
JACK Parameters
Before starting JackTrip on the server or the clients, a JACK server needs to be booted on the system. Read the chapter Using JACK Audio from the Computer Music Basics class for getting started with JACK. A purely remote server, as used in this class, does not have or need an audio interface and can thus be booted with the dummy driver:
So far, the version of JackTrip used with the SPRAWL system requires all participants
to run their JACK server at the same sample rate and buffer size.
Recent changes to the JackTrip dev branch allow mixing different buffer sizes
but have not been tested with this setup.
The overall system's buffer size is defined by the weakest link, i.e. the
client with the worst connection.
Although tests between two sites have worked with buffer sizes down to $16$ samples,
a buffer size of $128$ or $256$ samples usually works for a group.
Experience has shown that about a tenth of all participants have an insufficient
internet connection for participating without significant dropouts.
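To relate the buffer sizes mentioned above to time, the duration of one buffer is the number of frames divided by the sample rate. The helper below is a hypothetical illustration; the sample rate of 48 kHz used in the note is an assumption, since the rate of the SPRAWL setup is not stated here.

```cpp
// Duration of one audio buffer in milliseconds.
double buffer_duration_ms(int frames, double sample_rate)
{
    return 1000.0 * frames / sample_rate;
}
```

At an assumed 48 kHz, a buffer of 16 samples lasts about 0.33 ms, 128 samples about 2.67 ms and 256 samples about 5.33 ms. This is only the contribution of a single buffer; network transmission and additional queueing add to the total latency.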
JackTrip Parameters
As with most command line programs, JackTrip gives you a list of all available
parameters with the help flag:

    $ jacktrip -h
A single instance is launched on the SPRAWL Server with the following
arguments:
The following arguments are needed for starting a JackTrip client instance and connecting to the SPRAWL server (the server.address can be found in the private data area):
Using SSH for Remote Access
SSH (Secure Shell Protocol) gives access to a remote computer. In its basic form it allows the execution of command lines in a terminal, if the remote machine is running an SSH server. It can also be used for remote graphical user interfaces.
Connecting to an SSH Server
To connect to a remote machine, that machine needs to run an SSH server. On the client side, an SSH connection can be established from the terminal without additional installations on Linux and Mac machines and, since Windows 10, on Windows. The ssh command takes the remote user's credentials (username and IP address). This user must exist on the remote machine. The remote SSH server will ask for the user's password if no SSH key has been installed.
X11 Forwarding
X11, or the X Window System, is a framework for a graphical user interface (GUI) environment, used on Unix systems.
With X11 Forwarding, SSH can also be used to run applications with a GUI remotely.
When connecting to an SSH server from a Linux machine, simply add the -X argument to do so:
X11 On Mac
On Mac you need to install XQuartz <https://www.xquartz.org/> to enable X11 Forwarding.
Afterwards, the -X argument will enable X11 Forwarding:
X11 on Windows
Although SSH is possible from Windows' built-in PowerShell or Windows Terminal, X11 Forwarding requires PuTTY and additional tools:
- install PuTTY
- install VcXsrv
- enable X11 in PuTTY
Remote Commands
SSH can also be used to send single commands, without starting a remote session. This example launches the jack_simple_client, which plays a continuous sine tone on the remote machine.