The Ambisonics Workflow

Basic Workflow

A basic Ambisonics production workflow can be split into three stages, as shown in Figure 1. The advantage of this procedure is that the production is independent of the output format, since the intermediate format is in the Ambisonics domain. A sound field produced in this way can subsequently be rendered or decoded to any desired loudspeaker setup or to headphones.


/images/spatial/ambisonics/ambi-workflow.png

Figure 1: Basic Ambisonics production workflow.


Stages

1: Encoding Stage

In the encoding stage, Ambisonics signals are generated, either by recording with an Ambisonics microphone or by encoding mono sources with individual angles (azimuth, elevation). Plain Ambisonics encoding does not include distance information, although distance can be approximated through attenuation. All encoded signals have the same number $N$ of Ambisonics channels.
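The encoding of a mono source can be sketched for the first-order case, in which each signal has four channels. This is a minimal sketch, not the method of a specific toolkit: the channel order (ACN: W, Y, Z, X), the SN3D normalization, and the names AmbiFrame and encode_first_order are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// One first-order Ambisonics frame.
// Assumed convention: ACN channel order (W, Y, Z, X), SN3D normalization.
using AmbiFrame = std::array<double, 4>;

// Encode one mono sample arriving from (azimuth, elevation), both in radians.
// Azimuth is counted counterclockwise from the front, elevation upwards.
AmbiFrame encode_first_order(double sample, double azimuth, double elevation)
{
    assert(std::isfinite(sample));
    const double cos_el = std::cos(elevation);
    return {
        sample,                               // W: omnidirectional
        sample * std::sin(azimuth) * cos_el,  // Y: left-right
        sample * std::sin(elevation),         // Z: up-down
        sample * std::cos(azimuth) * cos_el   // X: front-back
    };
}
```

A source straight ahead (azimuth 0, elevation 0) only excites W and X, while a source at 90 degrees to the left only excites W and Y.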

2: Summation Stage

All individual Ambisonics signals can be summed to create a single scene, that is, one sound field.
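Because all encoded signals share the same channel layout, the summation amounts to a channel-by-channel addition. The following sketch illustrates this under the assumption that each signal is stored as a channels-by-samples array; the type AmbiSignal and the function sum_scene are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One Ambisonics signal, stored as [channel][sample] (assumed layout).
using AmbiSignal = std::vector<std::vector<double>>;

// Sum several encoded signals, channel by channel, into one scene.
// All signals must have the same number of channels and samples.
AmbiSignal sum_scene(const std::vector<AmbiSignal>& sources)
{
    assert(!sources.empty());
    AmbiSignal scene = sources.front();
    for (std::size_t s = 1; s < sources.size(); s++)
    {
        assert(sources[s].size() == scene.size());
        for (std::size_t ch = 0; ch < scene.size(); ch++)
        {
            assert(sources[s][ch].size() == scene[ch].size());
            for (std::size_t n = 0; n < scene[ch].size(); n++)
                scene[ch][n] += sources[s][ch][n];
        }
    }
    return scene;
}
```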

3: Decoding Stage

In the decoding stage, the individual output signals are calculated from the sound field. This requires either loudspeaker coordinates or, for headphone playback, head-related transfer functions.
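For loudspeakers, one of the simplest conceivable decoders points a virtual cardioid microphone at each loudspeaker position. The sketch below does this for a horizontal layout and a first-order frame. It assumes ACN channel order (W, Y, Z, X) with SN3D normalization; the function name decode_horizontal is hypothetical, and practical decoders are considerably more refined.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Decode one first-order frame (assumed ACN/SN3D: W, Y, Z, X) to a
// horizontal loudspeaker layout, given the loudspeaker azimuths in radians.
// Each loudspeaker receives the signal of a virtual cardioid pointing at it.
std::vector<double> decode_horizontal(const std::array<double, 4>& frame,
                                      const std::vector<double>& speaker_azimuths)
{
    assert(!speaker_azimuths.empty());
    const double W = frame[0];
    const double Y = frame[1];
    const double X = frame[3];

    std::vector<double> out;
    out.reserve(speaker_azimuths.size());
    for (double az : speaker_azimuths)
        out.push_back(0.5 * (W + X * std::cos(az) + Y * std::sin(az)));
    return out;
}
```

For a source encoded at the position of one loudspeaker in a square layout, that loudspeaker receives the full signal, its two neighbors receive half the amplitude, and the opposite loudspeaker stays silent.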


More advanced workflows may feature additional stages for manipulating encoded Ambisonics signals, including directional filtering or rotation of the audio scene.
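A rotation of the scene is one of the simpler manipulations, since it can be applied directly to the encoded channels. As a minimal sketch for the first-order case, a rotation about the vertical axis only mixes the X and Y channels. The ACN/SN3D channel convention (W, Y, Z, X) and the function name rotate_yaw are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Rotate a first-order frame (assumed ACN/SN3D: W, Y, Z, X) about the
// vertical axis. A positive angle (radians) moves sources counterclockwise.
std::array<double, 4> rotate_yaw(const std::array<double, 4>& frame, double angle)
{
    assert(std::isfinite(angle));
    const double c = std::cos(angle);
    const double s = std::sin(angle);
    const double Y = frame[1];
    const double X = frame[3];
    return {
        frame[0],       // W is unaffected by rotation
        X * s + Y * c,  // rotated Y
        frame[2],       // Z is unaffected by a rotation about the vertical axis
        X * c - Y * s   // rotated X
    };
}
```

Rotating a frontal source by 90 degrees moves it to the left, so that its energy shifts from the X to the Y channel.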


References

2015

  • Matthias Frank, Franz Zotter, and Alois Sontacchi. Producing 3D audio in Ambisonics. In Audio Engineering Society Conference: 57th International Conference: The Future of Audio Entertainment Technology–Cinema, Television and the Internet. Audio Engineering Society, 2015.

2009

  • Frank Melchior, Andreas Gräfe, and Andreas Partzsch. Spatial audio authoring for Ambisonics reproduction. In Proc. of the Ambisonics Symposium. 2009.

Spatial Additive Synthesis

Additive Synthesis and Spectral Modeling are introduced in detail in the corresponding sections of the Sound Synthesis Introduction. Since sounds are created by combining large numbers of spectral components, such as harmonics or noise bands, spatialization at the synthesis stage is an obvious approach. Listeners can thereby be spatially enveloped by a single sound, with spectral components perceived from all angles. The continuous character, however, blurs the localization.


SOS

Spatio-operational spectral (SOS) synthesis (Topper, 2002) is an approach to dynamic spatial additive synthesis, implemented in MAX/MSP and RTcmix. Partials are rotated independently within a two-dimensional, 8-channel loudspeaker setup. A first experiment moved the first eight partials of a square wave along circular spatial paths at varying rates, as shown in Figure 1.

/images/spatial/spatial_synthesis/sos_1.png

Figure 1: First SOS experiment (Topper, 2002).

Figure 2 shows the second experiment with one partial moving against the others.

/images/spatial/spatial_synthesis/sos_2.png

Figure 2: Second SOS experiment (Topper, 2002).
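The idea of rotating partials independently can be sketched as follows. This is not the implementation by Topper et al.: the pairwise equal-power panning, the rotation rates, and the names ring_gains and sos_frame are assumptions made for illustration. The sketch uses the first eight partials of a square wave, that is, the odd harmonics 1 to 15 with amplitudes 1/k.

```cpp
#include <array>
#include <cassert>
#include <cmath>

constexpr int kSpeakers = 8;
const double kTwoPi = 2.0 * std::acos(-1.0);

// Equal-power panning between the two nearest loudspeakers on a ring of
// equally spaced loudspeakers (assumed panning law). Azimuth in radians.
std::array<double, kSpeakers> ring_gains(double azimuth)
{
    assert(std::isfinite(azimuth));
    double a = std::fmod(azimuth, kTwoPi);
    if (a < 0.0)
        a += kTwoPi;
    const double pos  = a / kTwoPi * kSpeakers;   // position in speaker units
    const int lower   = static_cast<int>(pos) % kSpeakers;
    const int upper   = (lower + 1) % kSpeakers;
    const double frac = pos - std::floor(pos);    // 0 at lower, 1 at upper

    std::array<double, kSpeakers> g{};
    g[lower] = std::cos(frac * kTwoPi / 4.0);
    g[upper] = std::sin(frac * kTwoPi / 4.0);
    return g;
}

// One 8-channel output frame at time t (seconds). Partial p rotates
// (p + 1) times as fast as base_rate (hypothetical choice of rates).
std::array<double, kSpeakers> sos_frame(double t, double f0, double base_rate)
{
    std::array<double, kSpeakers> out{};
    for (int p = 0; p < 8; p++)
    {
        const int k = 2 * p + 1;  // odd harmonic number
        const double partial = std::sin(kTwoPi * k * f0 * t) / k;
        const double azimuth = kTwoPi * base_rate * (p + 1) * t;
        const std::array<double, kSpeakers> g = ring_gains(azimuth);
        for (int ch = 0; ch < kSpeakers; ch++)
            out[ch] += g[ch] * partial;
    }
    return out;
}
```

Calling sos_frame once per sample with increasing t yields an 8-channel signal in which each partial travels around the ring at its own speed.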


GLOOO

GLOOO is a system for real-time expressive spatial synthesis with spectral models. A haptic interface allows the dynamic distribution of 100 spectral components, giving control over the spread and position of the resulting violin sound. The project is best documented on the corresponding websites:


References

2017

  • Vincent Grimaldi, Christoph Böhm, Stefan Weinzierl, and Henrik von Coler. Parametric synthesis of crowd noises in virtual acoustic environments. In Proceedings of the 142nd Audio Engineering Society Convention. Audio Engineering Society, 2017.

2015

  • Stuart James. Spectromorphology and spatiomorphology of sound shapes: audio-rate AEP and DBAP panning of spectra. In Proceedings of the International Computer Music Conference (ICMC). 2015.
  • Ryan McGee. Spatial modulation synthesis. In Proceedings of the International Computer Music Conference (ICMC). 2015.

2009

  • Alexander Müller and Rudolf Rabenstein. Physical modeling for spatial sound synthesis. In Proceedings of the International Conference of Digital Audio Effects (DAFx). 2009.

2008

  • Scott Wilson. Spatial swarm granulation. In Proceedings of the International Computer Music Conference (ICMC). 2008.
  • David Kim-Boyle. Spectral spatialization - an overview. In Proceedings of the International Computer Music Conference (ICMC). Belfast, UK, 2008.

2004

  • Curtis Roads. Microsound. The MIT Press, 2004. ISBN 0262681544.

2002

  • David Topper, Matthew Burtner, and Stefania Serafin. Spatio-operational spectral (SOS) synthesis. In Proceedings of the International Conference of Digital Audio Effects (DAFx). Singapore, 2002.

Stockhausen & Elektronische Musik

Spatialization Concepts

Klangmühle

The Klangmühle was an early electronic device for spatialization, allowing the panning between different channels by moving a crank, which was then mapped to multiple variable resistors.

Rotationstisch

The Rotationstisch was used by Karlheinz Stockhausen for his work Kontakte (1958-60) (von Blumroeder, 2018). In the studio, the device was used to produce spatial sound movements on a quadraphonic loudspeaker setup. This was realized with four microphones in a square arrangement, each pointing towards a rotatable loudspeaker in the center:

/images/spatial/rotational-table_W640.jpg

Functional sketch of the Rotationstisch (Braasch, 2008).


The predominant effect of the Rotationstisch is amplitude panning, which results from the directivity of the loudspeaker and its waveguide. In addition, rotating the loudspeaker introduces a Doppler shift. The device can be rotated by hand, so that spatial movements can be performed and recorded on quadraphonic tape:

/images/spatial/rotationstisch.jpg

Rotationstisch, operated by Karlheinz Stockhausen (Stockhausen-Stiftung für Musik, Kürten).


Kontakte

Stockhausen's 1958-60 composition Kontakte can be considered a milestone of multichannel music. It exists as a tape-only version, as well as a version for tape with live piano and percussion. For the tape part, the Rotationstisch was used to create the spatial movements, which are not fully captured in this stereo version (electronics only). Listen at 17'00'' for the most prominent rotation movement in four channels:


References

2018

  • Christoph von Blumröder. Zur Bedeutung der Elektronik in Karlheinz Stockhausens Œuvre / The significance of electronics in Karlheinz Stockhausen's work. Archiv für Musikwissenschaft, 75(3):166–178, 2018.

2015

  • Martha Brech and Henrik von Coler. Aspects of space in Luigi Nono's Prometeo and the use of the Halaphon. In Martha Brech and Ralph Paland, editors, Compositions for Audible Space, Music and Sound Culture, pages 193–204. transcript, 2015.
  • Michael Gurevich. Interacting with Cage: realising classic electronic works with contemporary technologies. Organised Sound, 20:290–299, 12 2015. doi:10.1017/S1355771815000217.

2011

  • John Chowning. Turenas: the realization of a dream. In Proceedings of the 17th Journées d'Informatique Musicale. 2011.

2008

  • Marco Böhlandt. “Kontakte” – Reflexionen naturwissenschaftlich-technischer Innovationsprozesse in der frühen elektronischen Musik Karlheinz Stockhausens (1952–1960). Berichte zur Wissenschaftsgeschichte, 31(3):226–248, 2008.
  • Jonas Braasch, Nils Peters, and Daniel Valente. A loudspeaker-based projection technique for spatial music applications using virtual microphone control. Computer Music Journal, 32:55–71, 09 2008.

Karplus-Strong in C++

The Karplus-Strong algorithm is a proto-physical model. The underlying theory is covered in the Karplus-Strong Section of the Sound Synthesis Introduction. Although the resulting sounds are very interesting, the Karplus-Strong algorithm is easy to implement, especially in C/C++. It is based on a single buffer, filled with noise, and a moving average smoothing.


The Noise Buffer

Besides the general framework of all examples in this teaching unit, the karplus_strong_example needs just a few additional elements, defined in the class's header:

// the buffer length
int l_buff = 600;

// the 'playback position' in the buffer
int buffer_pos = 0;

/// noise buffer
double *noise_buffer;

/// length of moving average filter
int l_smooth = 10;

// feedback gain
double gain = 1.0;

Note that the pitch of the resulting sound is hard-coded in this example, since it is based only on the sampling rate of the system and the buffer length. In contrast to the original Karplus-Strong algorithm, this version uses an arbitrary length for the moving average filter, instead of only two samples. This results in a faster decay of high frequency components.
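The relation between buffer length and pitch can be made explicit with a short calculation. One pass through the buffer corresponds to one period, so the fundamental frequency is approximately the sampling rate divided by the buffer length. The helper name ks_pitch is hypothetical, and the result is only an approximation, since the moving-average filter shifts the effective loop length by a few samples.

```cpp
#include <cassert>

// Approximate fundamental frequency (Hz) of the Karplus-Strong loop,
// ignoring the additional delay of the smoothing filter.
double ks_pitch(double sample_rate, int l_buff)
{
    assert(sample_rate > 0.0 && l_buff > 0);
    return sample_rate / l_buff;
}
```

With the default l_buff of 600, this gives about 80 Hz at a sampling rate of 48 kHz and about 73.5 Hz at 44.1 kHz.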


Initializing the Buffer

Since the noise buffer is implemented as a pointer to an array of doubles, it first needs to be allocated and initialized. This happens in the constructor of the karplus_strong_example class:

// allocate noise buffer
noise_buffer = new double [l_buff];
for (int i=0; i<l_buff; i++)
  noise_buffer[i]=0.0;

Plucking the Algorithm

Each time the Karplus-Strong algorithm is excited, or plucked, the buffer needs to be filled with a sequence of random noise. At each call of the JACK callback function (process), the program checks whether a new event has been triggered via MIDI or OSC. If so, the playback position of the buffer is set to 0 and each sample of the noise_buffer is filled with a random double between -1 and 1:

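// refill the buffer with uniform noise in [-1, 1] (rand() requires <cstdlib>)
cout << "Filling buffer!" << endl;
buffer_pos = 0;
for(int i=0; i<l_buff; i++)
  noise_buffer[i] = 2.0 * rand() / RAND_MAX - 1.0;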

Running Through the Buffer

The sound is generated by directly writing the samples of the noise_buffer to the JACK output buffer. The buffer_pos counter is wrapped to the buffer size, which makes the readout circular. This example writes the same mono signal to all channels of a stereo output.

for(int sampCNT=0; sampCNT<nframes; sampCNT++)
{
    // write the current buffer sample to all output channels
    for(int chanCNT=0; chanCNT<nChannels; chanCNT++)
    {
        out[chanCNT][sampCNT] = noise_buffer[buffer_pos];
    }

    // increment the buffer position and wrap around at the end
    buffer_pos++;
    if (buffer_pos>=l_buff)
        buffer_pos=0;
}

Smoothing the Buffer

The above version results in a never-ending oscillation, a white tone. The timbre of this tone changes with every triggering, since a new random sequence is used each time. With the additional smoothing, the tone decays and gradually loses its high spectral components. This is done as follows:

// smoothing the buffer: replace the current sample with the scaled
// mean of the next l_smooth samples, wrapping around at the buffer end
double sum = 0;
for(int smoothCNT=0; smoothCNT<l_smooth; smoothCNT++)
{
    sum += noise_buffer[(buffer_pos+smoothCNT) % l_buff];
}
noise_buffer[buffer_pos] = gain*(sum/l_smooth);
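The individual steps (plucking, circular readout, smoothing) can be combined into a small offline version that does not depend on JACK. This is a sketch under stated assumptions rather than the code of the example project: the function name karplus_strong, the use of the C++ <random> facilities, and the order of readout and smoothing within one sample step are chosen for illustration.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Render n_samples of a plucked tone into a vector.
// Parameter names follow the header of the example.
std::vector<double> karplus_strong(int n_samples, int l_buff, int l_smooth,
                                   double gain, unsigned seed)
{
    assert(n_samples >= 0 && l_buff > 0 && l_smooth > 0);

    // pluck: fill the buffer with uniform noise in [-1, 1)
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> noise(-1.0, 1.0);
    std::vector<double> buffer(l_buff);
    for (double& x : buffer)
        x = noise(rng);

    std::vector<double> out(n_samples);
    int pos = 0;
    for (int n = 0; n < n_samples; n++)
    {
        // read the current sample
        out[n] = buffer[pos];

        // smooth: replace it with the scaled mean of the next l_smooth samples
        double sum = 0.0;
        for (int k = 0; k < l_smooth; k++)
            sum += buffer[(pos + k) % l_buff];
        buffer[pos] = gain * (sum / l_smooth);

        // advance circularly
        pos = (pos + 1) % l_buff;
    }
    return out;
}
```

Rendering one second at 48 kHz with karplus_strong(48000, 600, 10, 1.0, 1) produces a tone whose later periods carry less energy than the first, reflecting the decay caused by the smoothing.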

Compiling

To compile the KarplusStrongExample, run the following command line:

g++ -Wall -L/usr/lib src/yamlman.cpp src/main.cpp src/karplus_strong_example.cpp src/oscman.cpp src/midiman.cpp -ljack -llo -lyaml-cpp -lsndfile -lrtmidi -o karplus_strong

This call of the g++ compiler includes all necessary libraries and creates the binary karplus_strong.


Running the Example

The binary can be started with the following command line:

./karplus_strong -c config.yml -m "OSC"

This will use the configuration from the YAML file and wait for OSC input. The easiest way of triggering the synth via OSC is to use the Pure Data patch from the example's directory.


Exercises

Exercise I

Make the buffer length and filter length command line or realtime-controllable parameters.

Exercise II

Implement a fractional noise buffer for arbitrary pitches.

Using MIDI with RtMidi

Although the MIDI protocol is quite old and has several drawbacks, it is still widely used and is appropriate for many applications. Read the MIDI section in the Computer Music Basics for a deeper introduction.

The development system used in this class relies on the RtMidi framework. This allows the inclusion of any ALSA MIDI device on Linux systems and hence any USB MIDI device. The RtMidi Tutorial gives a thorough introduction to the use of the library.


ALSA MIDI

The Advanced Linux Sound Architecture (ALSA) makes audio and MIDI interfaces accessible to software. As an API, it is part of the Linux kernel. Other frameworks, such as JACK or PulseAudio, work on a higher level and rely on ALSA.

Finding your ALSA MIDI Devices

After connecting a MIDI device to a USB port, it should be available via ALSA. All ALSA MIDI devices can be listed with the following shell command:

$ amidi -l

The output of this request can look as follows:

Dir     Device      Name
IO      hw:1,0,0    nanoKONTROL MIDI 1
IO      hw:2,0,0    PCR-1 MIDI 1
I       hw:2,0,1    PCR-1 MIDI 2

In this case, two USB MIDI devices are connected. They can be addressed by their ALSA device ID (here hw:1 and hw:2).


The MIDI Tester Example

The MIDI tester example can be used to print all incoming MIDI messages to the console. This can be helpful for reverse-engineering MIDI devices to figure out their controller numbers.

The MIDI Manager Class

The MIDI Manager class introduced in this test example is used as a template for the following examples that use MIDI. For receiving messages, RtMidi offers a queued MIDI input and a user callback mode. In the latter, each incoming message triggers a callback function. In the queued mode, which is used here, incoming messages are collected until they are retrieved by a separate process.

The midiMessage struct is used to store incoming messages. It holds the three standard MIDI message bytes plus a Boolean for the processing state.

/// struct for holding a MIDI message
typedef struct  {
    int byte1             = -1;
    int byte2             = -1;
    double byte3          = -1;
    bool hasBeenProcessed = false;

}midiMessage;
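RtMidi hands over each message as a std::vector<unsigned char>. How such a raw message could be copied into the struct and inspected is sketched below. The struct is repeated to keep the example self-contained; the helpers to_midi_message, is_control_change and channel_of are hypothetical and not part of the MIDI Manager class or of RtMidi.

```cpp
#include <cassert>
#include <vector>

// same layout as the midiMessage struct of the example
struct midiMessage {
    int byte1             = -1;
    int byte2             = -1;
    double byte3          = -1;
    bool hasBeenProcessed = false;
};

// Copy up to three raw bytes into a midiMessage; missing bytes stay at -1.
midiMessage to_midi_message(const std::vector<unsigned char>& raw)
{
    midiMessage m;
    if (raw.size() > 0) m.byte1 = raw[0];  // status byte
    if (raw.size() > 1) m.byte2 = raw[1];  // first data byte
    if (raw.size() > 2) m.byte3 = raw[2];  // second data byte
    return m;
}

// Control change messages have the status 0xB0 to 0xBF.
bool is_control_change(const midiMessage& m)
{
    return m.byte1 >= 0 && (m.byte1 & 0xF0) == 0xB0;
}

// The lower four bits of the status byte hold the channel (0 to 15).
int channel_of(const midiMessage& m)
{
    assert(m.byte1 >= 0);
    return m.byte1 & 0x0F;
}
```

For a control change, byte2 holds the controller number, which is the value of interest when reverse-engineering a MIDI device.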

SuperCollider Granular Example

The TGrains UGen is an easy-to-use granular synth. It uses a Hanning window for each grain and offers control over the position, pitch and length of the grains. The help files offer multiple examples for using this unit generator. The following example uses a simple pulse train for triggering grains.


Reading Channels

A single channel is loaded to a buffer from a sample for this granular example. The duration in seconds can be queried from the buffer object, once loaded.

~buffer = Buffer.readChannel(s,"/some/wavefile.wav",channels:0);

~buffer.duration;

The Granular Node

The granular node uses an Impulse UGen to create a trigger signal for the TGrains UGen. This node has several arguments to control the granular process:

  • The density defines how often a grain is triggered per second.

  • Every grain can be pitch shifted by a value (1 = default rate).

  • The grain duration is specified in seconds.

  • The grain center is defined in seconds.

  • A gain parameter can be used for amplification.

  • buffer specifies the index of the buffer to be used.

Once the node has been created with a nil buffer, the buffer index of the previously loaded sample can be passed. Depending on the nature of the sample, this can already result in something audible:

~grains =
{
    |
    density = 1,
    pitch   = 1,
    dur     = 0.1,
    center  = 0,
    gain    = 1,
    buffer  = nil
    |

    var trigger = Impulse.kr(density);

    // TGrains needs at least two output channels
    Out.ar(0,   gain * TGrains.ar(2, trigger, buffer, pitch, center, dur));

}.play();


~grains.set(\buffer,~buffer.bufnum);

Manual Parameter Setting

As with any node, the arguments of the granular process can be set manually. Since the center is specified in seconds, the buffer duration is useful at this point.

~grains.set(\center,0.2);
~grains.set(\density,100);
~grains.set(\dur,0.2);
~grains.set(\pitch,0.8);
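The interplay of density and grain duration can be checked with a short calculation. Since density grains start per second and each grain lasts dur seconds, the average number of grains sounding at the same time is their product. For the values set above:

```latex
\text{overlap} = \text{density} \cdot \text{dur}
               = 100\,\mathrm{s^{-1}} \cdot 0.2\,\mathrm{s}
               = 20
```

With the initial settings (density 1, dur 0.1), the product is 0.1, which means that individual grains are separated by silence.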

Exercise

Exercise I

Use the mouse with buses for a fluid control of granular parameters.

Exercise II

Use envelopes for an automatic control of the granular parameters.

Frequency Domain