One of a series of algorithmic images generated by JB for the project

HAL Seed Project

Oi! Algorithm, chew on this!

By John Bowers & Owen Green

Assaying The Noise Between Human And Algorithm

Over the course of the first two HAL meetings, as we collectively mapped out the territory of the research network, certain flavours of question seemed to crop up that sought to stabilise what might be meant by human or algorithm (or, indeed, listening), and others that wondered at the kind(s) of role that certain disciplinary approaches could play in this combined effort. Our schtick in this project is to investigate what sort of contribution practical arts research can make by approaching these putatively ontological concerns about humans and algorithms as matters that are inherently unstable.

Ontological Noise

Our premise is that these kinds of ontological question simply don’t admit stable answers because what it is to be human and what it is to be algorithmic are to a great extent co-indexical: that is, our (historically, culturally located) understandings of what it is to be an algorithm inflect our similarly situated understandings of what it is to be human, and vice versa, in really complicated ways. Of course, this stance is hardly novel. It has a great deal in common with the perspectives of much of Science and Technology Studies (e.g. Barad, Latour, Suchman), and with certain philosophies of technology (e.g. Feenberg). Feenberg, for instance, makes precisely this argument:

The stakes in this debate over artificial intelligence are not merely technical. If we understand computers rationalistically, as automata, we prepare a revised self-understanding along the same lines. People become information processors and decision makers, rather than participants in shared communicative activity. 1

We proposed to map some of this territory through a series of design provocations and a large portfolio of small collaborative makes, which are assembled in performance or installation, and critically reflected upon in the light of the concerns of the Network. Insofar as algorithms are typically conceived as trying to identify order against a background of noise, our attempts to find our way around the territory between human and algorithmic listening destabilise that concern by:

  1. Providing challenging input

    oi, you algorithm, with your norms as to what counts as music…chew on this!

  2. Taking noise to be its own kind of information

    oi, you algorithm, you think that is signal…chew on this!

  3. Making noise-fuelled reductiones ad absurdum of soi-disant high-level algorithms

    oi, you algorithm, you think that is emotion…chew on this!

  4. Using algorithms intended to bring a certain kind of order to usher in disorder and discomfort

    oi, you algorithm, you want us to index massive arrays of online music… listen to these clicks!

Methods

We collaborated over the course of three sessions of intensive practice-based work, two located in Huddersfield and one in Newcastle. Our strategy was to follow the character of Bowers, Bowen and Shaw’s (2016) ‘Many Makings’2: a large number of small collaborative makes were created in response to our four noisy destabilisations. A key aspect of this approach is to remain alert to the polysemy of making:

There can be many makings. Of things, problematisations, identities, interests, ecologies, infrastructures, portfolios, federations (Bowers, Bowen & Shaw 2016, p. 1255)

Examples of algorithmic images from the series generated by JB for the project

Makes

We sought to explore how ‘noise’ might provide challenging input to algorithmic listening techniques or make for a desirable, divergent output (in all the varied senses of that word; see Marie Thompson’s Beyond Unwanted Sound: Noise, Affect and Aesthetic Moralism 3). We sought to misappropriate known techniques to uncover their limitations or the implicit assumptions built into them. Independently, we each brainstormed proposals for makes. Combined, we had a long list of 48: some expressed compactly, some at greater length, some in a standard ‘scientific’ language, some deliberately written humorously or facetiously, some with a degree of overlap and convergence with other proposals, some unique, some making reference to existing artworks but bending them to our context of interest, and so forth.

We made work in two concerted sessions of two days’ duration each, one at each of our host institutions. We worked with a light touch, doing just enough to prove the principle of our design ideas before moving on to the next. We were drawn to prioritise proposals that we both shared, but we ensured that our individual idiosyncrasies were also represented to maximise the coverage of our work. We conducted a third two-day session to combine our makes in a performable installation environment. In total, 18 of our proposals were made to some degree, with 14 having a role in the final presentation of the work.

Examples of algorithmic images from the series generated by JB for the project

AntiGate: Amplitude Version

An envelope follower tracks the amplitude of the input signal. When the signal drops below a given threshold, it is let through the gate, thereby performing the opposite action to a classic noise gate. At the moment the signal drops below threshold, it is subjected to a single-frame FFT analysis, which is used to create a freeze effect that is held until the next time the gate opens. The sound through the open gate and the frozen spectral texture can be cross-faded. The cross-fade and the threshold are both variable in performance.
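
A minimal offline sketch of the idea in Python (numpy assumed; the block size, RMS follower and function names are our choices, not the patch’s internals):

```python
import numpy as np

N = 1024  # block and FFT size

def antigate(x, threshold=0.05, xfade=0.5):
    """Inverted noise gate with spectral freeze, processed block by block."""
    rng = np.random.default_rng(0)
    out = np.zeros_like(x)
    freeze_mag = np.zeros(N // 2 + 1)   # spectrum captured when the gate opens
    was_open = False
    for start in range(0, len(x) - N + 1, N):
        block = x[start:start + N]
        env = np.sqrt(np.mean(block ** 2))   # crude RMS envelope follower
        is_open = env < threshold            # reversed condition: quiet passes
        if is_open and not was_open:
            # the moment of dropping below threshold: single-frame FFT
            freeze_mag = np.abs(np.fft.rfft(block * np.hanning(N)))
        was_open = is_open
        gated = block if is_open else np.zeros(N)
        # frozen texture: fixed magnitudes, fresh random phases per block
        phases = rng.uniform(0, 2 * np.pi, freeze_mag.size)
        frozen = np.fft.irfft(freeze_mag * np.exp(1j * phases), N)
        out[start:start + N] = (1 - xfade) * gated + xfade * frozen
    return out
```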

AntiGate: Spectral Version

In the Spectral AntiGate (SAG), a carefully engineered multi-resolution spectral gate, made by Harker to showcase his new FrameLib signal processing framework, is hijacked by simply reversing the inequality at its core. Being multi-resolution means that the chirping redolent of crude spectral processing is mitigated somewhat, particularly in higher frequencies, which retain a degree of texture. If a feedback loop is set up with an air microphone picking up SAG’s output, the system settles into a steady cycle that alternates between more chirpy mid-frequencies and bursts of higher-frequency noise, although the inner textures of these components do vary. This behaviour is oddly reminiscent of the change ringing of bells. The rhythmic behaviour changes if a player manipulates the microphone, for instance by shielding it.
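
The core move can be shown with a single-resolution stand-in (the original is Harker’s multi-resolution FrameLib patch; this simplification is ours). A conventional spectral gate keeps bins above a threshold; flipping the comparison passes only the quiet ones:

```python
import numpy as np

def spectral_antigate(block, threshold_db=-40.0):
    """Pass only the bins *below* threshold: a spectral gate, inverted."""
    spec = np.fft.rfft(block * np.hanning(len(block)))
    mag_db = 20 * np.log10(np.abs(spec) + 1e-12)
    quiet = mag_db < threshold_db        # the reversed inequality
    return np.fft.irfft(spec * quiet, len(block))
```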

Room Tone Shift Register

Bursts of filtered room-tone are fired back into the space with an attack-release energy profile. The timing of these bursts is dictated by a maximum-length pseudo-random sequence (see our discussion of LFSRs below) with four ‘voices’ each with different length sequences and occupying different spectral bands. The room tone is read from a 10-second delay line, so depending on the delay time, there is the possibility of sampling previous output. The overall effect depends to a large degree on how fast the sequencers are driven. High speeds and short bursts produce an impulsive kind of texture, moderate speeds a more rhythmic feel, and low speeds with long bursts can occasionally punctuate whatever else is happening in the space with dramatic impact sounds.

Electrical Field Re-synthesiser

An inductive coil (sometimes known as a phone tap coil) is used to transduce electromagnetic fluctuation into a signal that is presented to the EFR, which tries to model its input as coloured noise. This is done using a conventional source-filter technique, where noise is filtered in the Fourier domain by a spectral envelope derived by cepstral liftering of the input. This is supplemented with a very simple, single-voice sinusoidal model driven by the sigmund~ external for Max. The character of the resynthesis is largely determined by the degree of liftering and, of course, by how much sense it makes to model the input with filtered noise in the first place. In variants of the EFR, a microphone is substituted for the inductive coil.
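
A sketch of the noise-modelling stage, assuming numpy (the sinusoidal path via sigmund~ is omitted, and the function name is ours). The lifter cut-off controls how much spectral detail survives into the envelope:

```python
import numpy as np

def coloured_noise_frame(block, lifter=30):
    """Model one frame of input as noise filtered by its spectral envelope."""
    rng = np.random.default_rng()
    n = len(block)
    mag = np.abs(np.fft.rfft(block * np.hanning(n))) + 1e-12
    cep = np.fft.irfft(np.log(mag), n)      # real cepstrum of the frame
    cep[lifter:n - lifter] = 0.0            # liftering: keep low quefrencies
    env = np.exp(np.fft.rfft(cep).real)     # smoothed spectral envelope
    phases = rng.uniform(0, 2 * np.pi, env.size)
    return np.fft.irfft(env * np.exp(1j * phases), n)   # filtered noise
```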

Disagreeing Pitch Trackers One

A signal is ‘resynthesised’ by sine oscillators driven by three different pitch trackers in Max (sigmund~, zsa.fund and the built-in fzero~). One can also mix in a second set of oscillators driven by the pairwise differences in frequency between the three trackers, ring-modulated with each other. Driven with a pitched signal and a sensible gain structure, the effect is rather like six excited slide whistles. However, introducing feedback and nonlinearity opens up a much wider range of territory. If left in a feedback loop with a suitably large delay (we use 12 seconds here), DPT1 can settle into a quite diverse range of states, especially if there is clipping or distortion somewhere in the loop. Smoothing and delaying the frequency inputs of the oscillators by different amounts can also enrich the emerging dynamics.
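
To give a flavour of the disagreement in Python, here are three deliberately naive stand-in trackers (they are not sigmund~, zsa.fund or fzero~) feeding direct and difference oscillators as in DPT1:

```python
import itertools
import numpy as np

SR = 44100

def zero_crossing_pitch(x):
    signs = np.signbit(x).astype(int)
    return np.count_nonzero(np.diff(signs)) * SR / (2 * len(x))

def autocorr_pitch(x, fmin=50, fmax=1000):
    ac = np.correlate(x, x, mode='full')[len(x) - 1:]  # lags 0..n-1
    lo, hi = SR // fmax, SR // fmin
    return SR / (np.argmax(ac[lo:hi]) + lo)

def fft_peak_pitch(x):
    mag = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    return np.argmax(mag) * SR / len(x)

def dpt1_block(x, t):
    """Three disagreeing estimates resynthesised, plus their pairwise
    differences ring-modulated together (t: sample times in seconds)."""
    f = [zero_crossing_pitch(x), autocorr_pitch(x), fft_peak_pitch(x)]
    direct = sum(np.sin(2 * np.pi * fi * t) for fi in f) / 3
    diffs = [abs(a - b) for a, b in itertools.combinations(f, 2)]
    ring = np.prod([np.sin(2 * np.pi * d * t) for d in diffs], axis=0)
    return 0.5 * (direct + ring)
```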

Disagreeing Pitch Trackers Two

A similar approach to pitch tracking, and to making the disagreement in results between algorithms palpable, was taken using the Pd-vanilla language. The sigmund~ and fiddle~ objects are used to identify pitches in the input and to set the frequencies and amplitude envelopes of two sine waves. These signals are also ring modulated to enhance the perceptibility of their disagreement. The performer can cross-fade between the sine waves and their ring modulation. The identified pitches, and their absolute difference as a disagreement measure, are made available from DPT2 to other patches (e.g. to parameterise the LFSR, see below).

Eternal Resonance Machine

The ERM is a means of converting any input into a sustained noise texture. On receipt of a button-press-style event, the momentary spectrum of the sound is subjected to a 4096-band FFT and used to synthesise a sustained frozen noise. Successive button presses will add partials to the sustained sound if their FFT bands are louder than in the last analysis. Button presses will also momentarily open a gate to pass the input sound to Pd’s freeverb~, set to a large room size with little damping. When the gate closes, the reverb is frozen to give an infinite reverb effect. This gives an alternative way to synthesise a spectral noise from input sound. The performer can cross-fade between the two methods and reset the analysis (which fades both kinds of noise to silence).
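
A sketch of the spectral-freeze path, assuming numpy; we read ‘add partials if their FFT bands are louder than in the last analysis’ as a running per-bin maximum, and the freeverb~ path is omitted:

```python
import numpy as np

N = 4096  # FFT size, as in the text

class EternalResonance:
    def __init__(self):
        self.mag = np.zeros(N // 2 + 1)   # accumulated partial magnitudes
        self.rng = np.random.default_rng()

    def button_press(self, block):
        """Add partials wherever this frame is louder than anything so far."""
        frame = np.abs(np.fft.rfft(block[:N] * np.hanning(N)))
        self.mag = np.maximum(self.mag, frame)

    def render(self):
        """A block of sustained frozen noise: fixed magnitudes, new phases."""
        phases = self.rng.uniform(0, 2 * np.pi, self.mag.size)
        return np.fft.irfft(self.mag * np.exp(1j * phases), N)

    def reset(self):
        self.mag[:] = 0.0   # the performance patch fades rather than cuts
```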

Linear Feedback Shift Register Sequencer-Synthesizer

An 8-bit linear feedback shift register (LFSR) was implemented in Pd-vanilla. A flexible design was adopted in which the last bit could feed back to any of the 8 positions in the register for exclusive-OR combination with that position’s contents. This creates an algorithmic system that can generate a variety of behaviours, from the digital pseudo-noise of maximal-length sequences to varied periodic patterns. The values in the register were interpreted both as 8-bit sample values to be read into a wavetable and as 8-bit specifications of the frequency at which the wavetable (or a sine or a square wave) would be played. The rate at which the LFSR is clocked, and the centre and range values of frequency, could be determined manually or received from other processes (e.g. the Disagreeing Pitch Trackers). In this way, pseudo-noises or pitched sequences could be generated which followed identified profiles.
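
One plausible Python reading of that design (the exact bit-plumbing of the Pd patch may differ):

```python
def lfsr_states(seed=0b10110101, tap=3, steps=16):
    """Yield successive 8-bit register states. The out-shifted bit
    recirculates into the top position and is XORed into bit `tap`."""
    reg = seed & 0xFF
    for _ in range(steps):
        out = reg & 1            # bit leaving the register
        reg >>= 1
        reg |= out << 7          # recirculate into the vacated top bit
        reg ^= out << tap        # selectable exclusive-OR feedback position
        yield reg

# Each state doubles as a wavetable sample and a frequency spec:
for value in lfsr_states(steps=8):
    print(f"{value:08b}  ->  {55 + value * 4} Hz")   # arbitrary Hz mapping
```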

Emotion Recognizer-Generators

We reverse-engineered a music-psychological study that aims to demonstrate a mapping between given musical ‘features’ (timbre, tempo, mode, register, articulation, dynamics) and ‘emotions’, on the basis of rating judgements given by listeners to various transformations of simple melodies. Working in parallel, we each independently came up with ways of trying to estimate these six features from an audio stream. Then, using the paper’s experimentally derived table of correlations between features and emotions, we constructed a mapping function between ‘features’ and the four ‘emotions’ examined in the paper (happy, sad, scary, peaceful). We then set about using this mapping for generative purposes. One of us made a noise/drone generator, which constructed a spectrum based on a shifting histogram of detected pitch classes that was modulated using the detected emotions and features. The other made a melody generator which, on the basis of the emotions recognised in the input audio stream, estimated values for the six musical features analysed in the study and played back notes synthesised with enveloped, filtered sawtooth waves.
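
The shape of the mapping, with loudly hypothetical placeholder weights; the real project used the study’s published correlations, which we do not reproduce here:

```python
import numpy as np

FEATURES = ['timbre', 'tempo', 'mode', 'register', 'articulation', 'dynamics']
EMOTIONS = ['happy', 'sad', 'scary', 'peaceful']

# Hypothetical placeholder table (emotions x features), NOT the study's data.
W = np.zeros((len(EMOTIONS), len(FEATURES)))
W[0, 1] = 0.8    # e.g. pretend fast tempo correlates with 'happy'
W[1, 1] = -0.6   # ...and negatively with 'sad'

def emotions_from_features(f):
    """Map six normalised feature estimates to four emotion strengths."""
    return dict(zip(EMOTIONS, W @ np.asarray(f)))
```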

Random Sample and Holding

The instantaneous digitised value of an input audio stream is sampled at random intervals and read into a wavetable, the insertion point wrapping round when the table is full. Following a fractal expansion technique used previously by JB, the wavetable is read to generate long patterns of nested amplitude-modulated sound. The reference rate for reading the wavetable can be set as a linear function of the currently sampled value or from other pitch-tracking processes. The range of the random sampling intervals can be set in performance. The output can vary from a noisy reconstruction of the input, through a slow pattern that variably follows its pitch content, to a distorted, granular-sounding stream.
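
A sketch of the sampling stage, assuming numpy (the fractal wavetable-reading technique is not shown):

```python
import numpy as np

def random_sample_hold(x, table_len=512, min_gap=8, max_gap=2048):
    """Sample x at random intervals into a wavetable, wrapping when full."""
    rng = np.random.default_rng()
    table = np.zeros(table_len)
    write, i = 0, 0
    while i < len(x):
        table[write] = x[i]                       # grab instantaneous value
        write = (write + 1) % table_len           # insertion point wraps round
        i += int(rng.integers(min_gap, max_gap))  # random sampling interval
    return table
```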

Arduino Nano Circuit Noise

The analog-in values from an Arduino Nano are read into wavetables and used for direct digital synthesis via nested amplitude modulation, as described in the previous section. The analog terminals are left floating, so they are sensitive in unpredictable and interactive ways to touch and circuit noise. This creates a lively five-oscillator digital synthesizer capable of a range of distorted, bit-reduced and granular-sounding textures which can be steered by touch but not precisely played. Two improvisations were recorded and used by us in performance as a fixed-media element.

Schlechtmusik

In recognition of the prominence that Mozart’s music has in the history of algorithmic composition and machine listening, we took a recording of his Eine kleine Nachtmusik and extracted its tonal component using iZotope RX. We followed this with a sinusoidal analysis using SPEAR and made various resyntheses. For example, we made a version which was reconstructed out of banks of sine waves, another which retained only the transients, and yet another in which the tonal analysis was read at a slow rate to generate a 45-minute texture. To explore how machine listening techniques might react to suboptimal renderings, we also degraded the original recording by playing it back in a reverberant space, freely talking over it and recording the result using a gain structure with a tendency to distort. We selected five versions plus the original and mixed them using a good-to-bad (Nacht- to Schlecht-musik) crossfader. We informally calibrated the crossfader so that at the extreme good/Nacht end the online music recognition service Shazam would accurately recognise Eine kleine Nachtmusik, while at the extreme bad/Schlecht end no results were returned, with an approximately 50% hit rate in the middle.

Sincere Resynthesis, Subsequently Violated

Using sigmund~ feeding an oscillator bank with a generous number of partials (100), we found that a reasonable facsimile of even a noisy environment could be rendered, but that it was a simple matter to reduce this to a sludge of artefacts by over-smoothing frequency and/or amplitude tracks. The degree of over-smoothing was made a function of the distribution of the averaged spectral centroid in the space, by building a histogram (periodically cleared) that was occasionally sampled as if it were a probability density function and used to set the amount of smoothing.
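
A sketch of the histogram-as-PDF move, assuming numpy; bin ranges and names are ours:

```python
import numpy as np

def sample_smoothing(hist, bin_edges):
    """Treat a centroid histogram as a PDF and draw a smoothing amount."""
    rng = np.random.default_rng()
    p = hist / hist.sum()                  # normalise counts to probabilities
    b = rng.choice(len(p), p=p)            # draw a bin, as if from a PDF
    return rng.uniform(bin_edges[b], bin_edges[b + 1])
```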

I Am Sitting in Skype’s Audio Compression Algorithm

Following the same principle as Alvin Lucier’s I Am Sitting in a Room, a prepared text was read by one of us and recirculated through Skype until its original identity had completely dissipated. This took roughly 30 iterations. In contrast to the shifting resonances of Lucier’s acoustic version, the accumulating artefacts included bursts of noise and clicks, and the appearance of a distinctive crescendo of bass-drum-like impact sounds partway through, as well as the chirpy filtering we had expected. Our text was from a deeply critical review of Abraham Moles’ Information Theory and Esthetic Perception, and the results formed a fixed-media component of the final presentation.

Re-De-Reverberation

Using a black-box de-reverberation plugin and a reverberation pedal, we constructed a controllable feedback loop, stimulated with chirps, noise bursts and crackles programmed in Pd-vanilla, and recorded a short, two-person improvisation that was used as a source of fixed material in our presentation. One feature of the plugin is that reverberant components can be boosted as well as suppressed using a ‘focus’ parameter, and that it is easy to mistune the settings to generate plenty of artefacts. The resulting material had a drone-like character but did not tend to collapse into indistinct mush.

Miscellaneous Makes

We made a number of other explorations which we will only briefly relate here. Many of these concern processing fixed-media material using offline processes, or involve recordings that, for reasons of practicality, could only appear in our work as fixed media. For example, one of us created a piece entitled Maximum Zero, which takes a recording of David Tudor performing John Cage’s 4’33” and subjects it to brickwall limiting to bring out the environmental sounds around the performance at maximum intensity. One of us also made recordings using the aerial array and amplifier designed by NASA’s Radio JOVE project to bring the radio transmissions of Jupiter into our work. We also experimented with transmitting the results of our machine listening analyses via non-standard means, including an encoding of identified pitches as audible Morse messages, which we decoded and played back in a feedback loop. In this way, we sought to corrupt conventional understandings of the relationship between representation and the represented, and between signal and noise.
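
As a toy version of the pitch-to-Morse encoding (the note-to-letter folding and the mapping details are our choices, not the patch’s):

```python
import math

MORSE = {'A': '.-', 'B': '-...', 'C': '-.-.', 'D': '-..',
         'E': '.', 'F': '..-.', 'G': '--.'}
NOTES = 'C C# D D# E F F# G G# A A# B'.split()

def pitch_to_morse(freq):
    """Fold a detected frequency onto a note letter and emit its Morse."""
    midi = round(69 + 12 * math.log2(freq / 440.0))
    return MORSE[NOTES[midi % 12][0]]    # sharps fold onto their base letter

print(pitch_to_morse(440.0))             # A4 -> '.-'
```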

Performing our Work

The fruits of our labours were assembled and explored in the University of Huddersfield’s multichannel Spatialisation and Interactive Research Lab (SPIRAL), which offers 25.4 channels to work with, arranged as three tiered rings of eight plus a ceiling-mounted speaker dubbed the voice of god. A binaural dummy head was used as the input for all listening processes; we dubbed it Stookie Helen (people who attended the second HAL meeting in Belfast will have already encountered JB’s partner, Stookie John).

JB and Stookie Helen have some quiet time together

Different processes were placed in different speakers, and kept stationary. In this way, the character of what emerged was driven in part by the interactions of the different processes, affected by the relative gain structure, over which we had control.

This allowed more or less scrutable relationships to emerge between processes as they interfered with each other, and it also encouraged visitors to explore the space and discover different points of focus. We had some control over each process, in the form of individual gain faders, ‘nudge’ buttons which would push an individual process into a new (possibly random) state, and a combined overall control on a boundless rotary encoder that would affect all processes. This combined control yielded 24 separate control signals internally, based on a set of transfer functions. Processes were free to use whichever of these we fancied, however we wished, the object being to generate variety with coherence. For example, parameters like the crossfades on the AntiGates or the ERM could be set by values from the transfer functions.
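
The text above does not pin down the transfer functions themselves; this sketch just shows one way a single boundless encoder value can fan out into 24 control signals, here via phase-offset sinusoids at differing rates (entirely our choice):

```python
import numpy as np

def fan_out(encoder, n=24):
    """Derive n control signals in [0, 1] from one boundless encoder value,
    each cycling at its own rate and phase offset as the encoder turns."""
    k = np.arange(n)
    return 0.5 + 0.5 * np.sin(encoder * (0.1 + 0.05 * k) + k)
```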

Under the provisional title of All The Noises, our work was first presented on 18th January 2018, and occupied territory between a performance, an installation and a research presentation. We started with a brief, 15-minute performance whilst an audience composed of colleagues from Huddersfield and members of the public responding to local publicity arrived and explored the space. We then set the system into a lower-key state whilst we explained our project to the room at large. Thereafter, we had a steady trickle of guests passing through, and we would alternate between talking, nudging the system, and demonstrating brief performative moves. Finally, we concluded the session with a 10-minute performance crescendo.

Here are edited highlights:

Oi Algorithm Performance, Huddersfield 18 January 2018

OG and JB, tearing it up

Closing Thoughts

Going into this, we had a few ambitions. One was to use the many makings approach to sketch out an approach to errantly-inclined, artistic algorithmic listening research that complements and challenges the engineering orthodoxy. Our thoughts on this have been submitted to NIME 2018, so we hope to be pontificating on the topic in public later this year. Another ambition was, straightforwardly, to collaborate, as we hadn’t done so before despite having been in each other’s orbit for a while. We’re encouraged by what we made, and intend to keep refining and gigging it.

Acknowledgement

As well as the support of HAL in making this possible, OG’s time and access to facilities are supported by the ERC through the Fluid Corpus Manipulation project.

  1. Andrew Feenberg (2002). Transforming Technology: A Critical Theory Revisited. Oxford University Press, p. 106 

  2. John Bowers, Simon Bowen and Tim Shaw (2016). “Many Makings: Entangling Publics, Participation and Things in a Complex Collaborative Context.” Proceedings of the 2016 ACM Conference on Designing Interactive Systems. ACM. 

  3. Marie Thompson (2017) Beyond Unwanted Sound: Noise, Affect and Aesthetic Moralism. London: Bloomsbury 
