Immersive audio is one of the most exciting frontiers of the contemporary sound experience. We hear about “immersion” in movies, video games, concerts, and even streaming platforms. But how did we get here? When did we start thinking of sound as something that could move around us and not just come from a speaker in front?

In this post, I want to share with you a historical journey that I consider fundamental to understanding how the ideas of sound spatiality were born and what path we took to reach the tools we use today in immersive mixing studios.

Although today talking about spatial audio may sound futuristic, the concern for distributing sound in space has deep roots in the history of 20th-century experimental music. One of the names that has always fascinated me is Karlheinz Stockhausen, who in the 1950s was already working with the idea of moving sounds through the physical space of a room. His work Gesang der Jünglinge (1955-56), produced at the Studio for Electronic Music of the WDR in Cologne, is a paradigmatic piece in this sense. I was lucky enough to experience this work at Spæs Lab in Berlin, at an intimate listening session curated by its members and directors, Johannes Scherzer and Gerriet K. Sharma, with whom I had deep and spirited conversations.

In this work, Stockhausen combines recordings of a child’s voice (edited and processed) with electronic synthesis, using five independent channels distributed in a spatial configuration designed to surround the audience.

What I find most powerful is that it was not just about “putting” sounds in different speakers: the composition was built taking into account the spatial trajectory of the sound events. Stockhausen considered space as a structural dimension of musical time. In his words, space becomes an organizational parameter as important as pitch, duration, or dynamics. In Gesang der Jünglinge, the movement of the voice between speakers generates a sensation of ubiquity and displacement, often articulated with the transformation of the timbre, accentuating the direct relationship between spatial form and sound form.

On the other hand, Pierre Schaeffer, a pioneer of concrete music, worked with a different approach, although also deeply influential in the conception of space. In his Traité des objets musicaux (1966), Schaeffer introduces the idea of the “sound object” as a morphic and perceptual unit of sound, shifting the focus from the source or notation to phenomenological listening. Although his compositions were not based on multichannel systems per se, the treatment of space was implicit in the manipulation of the stereophonic (and occasionally quadraphonic) field, especially in works such as Etude aux chemins de fer (1948) or Symphonie pour un homme seul (1950, together with Pierre Henry).

In these works, Schaeffer used panning, filters, artificial reverberation, and the montage of recorded fragments to create a sound architecture in which the sounds seemed to come from different depths or directions. It was not a spatiality based on exact coordinates, but on the subjective perception of auditory space. The articulation of space was given through the montage and transformation of the materials: train noises, footsteps, voices, all converted into spatially organized events in an abstract sound narrative.

In the early 70s, the recording industry experimented with quadraphonics: a four-channel system that allowed sound to be reproduced from the four corners of a room. Although it had little commercial success, it left an important artistic legacy. Pink Floyd, for example, was one of the groups that best knew how to explore its potential.

During the production of The Dark Side of the Moon (1973), engineer Alan Parsons and the band prepared a quadraphonic version designed for live concerts. In those presentations, they used a specially designed joystick, the Azimuth Co-ordinator, to control the movement of sound in real time.
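To get a feel for what a device like the Azimuth Co-ordinator was doing, here is a minimal sketch of equal-power quadraphonic panning in Python. This illustrates the general principle only, not the actual circuitry of Pink Floyd's device: the joystick position (x, y) and the gain mapping are my own hypothetical simplification.

```python
import math

def quad_pan(x, y):
    """Equal-power panning over four corner speakers.

    x, y in [-1, 1]: joystick position, x = left/right, y = front/back.
    Returns gains for (front-left, front-right, rear-left, rear-right).
    """
    # Map each axis to a pan angle between 0 and pi/2.
    lr = (x + 1) / 2 * math.pi / 2          # x = -1 -> fully left
    fb = (1 - (y + 1) / 2) * math.pi / 2    # y =  1 -> fully front
    front, rear = math.cos(fb), math.sin(fb)
    left, right = math.cos(lr), math.sin(lr)
    return (front * left, front * right, rear * left, rear * right)
```

Because each axis uses a sine/cosine pair, the squared gains always sum to one, so the perceived loudness stays constant as the sound sweeps around the room.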

Effects such as clocks, heartbeats, laughter, and voices were sent to different parts of the venue, creating an immersive and cinematic auditory environment. It seems key to me how the band understood sound as an architectural component of the show: you didn’t just listen to the music, you “entered” into it.

At the end of the 20th century, cinema took a fundamental step with the introduction of 5.1 surround sound, which combined front and rear speakers with a dedicated low-frequency channel (subwoofer). This configuration became standard both in movie theaters and in home theater systems. It was a huge advance in terms of immersion, although still limited by the idea of fixed channels.

Meanwhile, in the academic world and sound art, much more flexible systems were being developed. Ambisonics, created in the 70s by Michael Gerzon, allowed a three-dimensional sound field to be encoded using a mathematical system that could then be decoded for any number of speakers. This made it ideal for installations and multichannel compositions.
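The core idea of Ambisonics can be sketched in a few lines: a mono source is encoded into a small set of directional components (B-format), and those components can later be decoded for whatever speaker layout is available. Below is a first-order encode (using the traditional FuMa convention, with W scaled by 1/√2) and a naive projection decode to a horizontal ring of speakers. Real decoders are considerably more sophisticated, so treat this as a conceptual sketch only.

```python
import math

def encode_fuma(azimuth, elevation):
    """First-order Ambisonic (B-format) encoding gains for a mono
    source at the given direction (angles in radians).
    Returns (W, X, Y, Z) in the FuMa convention (W scaled by 1/sqrt(2))."""
    w = 1.0 / math.sqrt(2.0)
    x = math.cos(azimuth) * math.cos(elevation)
    y = math.sin(azimuth) * math.cos(elevation)
    z = math.sin(elevation)
    return (w, x, y, z)

def decode_basic(bformat, speaker_azimuths):
    """Naive 'projection' decode of the horizontal components to an
    arbitrary ring of speakers: each speaker samples the encoded
    sound field in its own direction."""
    w, x, y, z = bformat
    return [0.5 * (math.sqrt(2.0) * w + math.cos(az) * x + math.sin(az) * y)
            for az in speaker_azimuths]
```

The key point is that the encoding never mentions speakers at all: the same B-format signal can be decoded for four, eight, or any number of loudspeakers, which is exactly what made the format so attractive for installations.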

Another important advance was Wave Field Synthesis (WFS), a technology developed in the 90s that uses arrays of dozens of speakers to recreate “wave fronts” that simulate a sound source located at a specific point in space. WFS allows the sound to appear to come from a specific place, even if the listener moves within the room. Although costly and complex, it was a clear inspiration for later systems such as those developed by HOLOPLOT, including its implementation at the Sphere in Las Vegas.
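The principle behind WFS can also be sketched conceptually: each speaker in the array plays a delayed, attenuated copy of the source signal, so that the individual wavelets add up as if they radiated from the virtual source. The Python sketch below computes per-speaker delays from propagation time and uses a 1/√r amplitude fall-off; this is only a rough stand-in for the true 2.5D WFS driving functions, which also involve filtering and selection of active speakers.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, in air at ~20 °C

def wfs_delays_and_gains(source_pos, speaker_positions):
    """Per-speaker (delay in seconds, amplitude weight) for a virtual
    point source behind a loudspeaker array.

    Simplified model: the delay is the propagation time from the
    virtual source to each speaker, and the amplitude falls off with
    1/sqrt(distance), as in common 2.5D WFS driving functions."""
    sx, sy = source_pos
    out = []
    for (px, py) in speaker_positions:
        r = math.hypot(px - sx, py - sy)
        delay = r / SPEED_OF_SOUND
        gain = 1.0 / math.sqrt(max(r, 1e-6))  # avoid division by zero
        out.append((delay, gain))
    return out
```

Because the delays reproduce the curvature of the original wavefront, the localization holds up even as the listener walks around, which is the property that sets WFS apart from panning-based systems.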

Already in the 2010s, the 4DSOUND system revolutionized the landscape. Not only did it allow sound sources to be placed at any point in space, but it could also be integrated with live performances, controlling spatial movements in an expressive and musical way. In simple words, it was not only about “where” to put a sound, but “how” it moves and transforms in time. It also included effects processes such as granular spatial synthesis, spatial delays, and reverbs, all within its own ecosystem: modules built in Max/MSP for Ableton Live and a renderer written in C++.
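The idea of composing with “how” a sound moves, rather than with a fixed position, can be illustrated with a toy trajectory generator. To be clear, this is not 4DSOUND's actual API, just a conceptual Python sketch of my own: an orbit function produces a time-varying position around the listener, and a crude cosine-weighted renderer turns that position into per-speaker gains.

```python
import math

def orbit(t, radius=1.0, period=8.0):
    """Position of a source orbiting the listener:
    one full circle every `period` seconds."""
    angle = 2.0 * math.pi * (t / period)
    return (radius * math.cos(angle), radius * math.sin(angle))

def speaker_gains(pos, speaker_angles):
    """Cosine-weighted gain toward each speaker direction (radians).
    A crude stand-in for a real spatial renderer."""
    x, y = pos
    src = math.atan2(y, x)
    return [max(0.0, math.cos(src - a)) for a in speaker_angles]

# Sampling the trajectory at control rate would drive the speaker
# gains frame by frame, e.g.:
# for frame in range(steps):
#     gains = speaker_gains(orbit(frame / control_rate), quad_angles)
```

The musical point is that the trajectory itself (its speed, radius, and shape) becomes a compositional parameter, exactly the shift the 4DSOUND approach made explicit.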

Today, immersive audio has been integrated into much broader contexts. It is no longer just for cinema or artistic installations. Festivals such as Mutek in Montreal and Sónar in Barcelona, as well as venues like the Barbican Centre in London, have programmed works that use immersive systems such as Dolby Atmos or 4DSOUND. Artists such as Suzanne Ciani, a pioneer of the modular synthesizer, have presented concerts in 360° where the sound travels throughout the room. Arca, with her complex and highly processed works, has also explored immersion as an expressive language.

Björk, for her part, took immersive audio to another level with her show Cornucopia, where the sound, video, and set design were perfectly synchronized to create a total immersive experience. Even artists more linked to pop or commercial electronics such as The Weeknd or Billie Eilish have released albums available in formats such as Dolby Atmos on digital platforms such as Apple Music.

In my experience, I have explored both facets, from digital platforms to live shows, and although there is now a wide range of devices for listening at home or individually, I will always prefer the shared experience of a room full of speakers.

Personally, I believe that immersive audio is not a fad or an isolated technique: it is the result of an evolution that mixes art, science, and technology. What we have available today (compatible DAWs, spatialization plugins, platforms like Apple Music or TIDAL that support Atmos) is the result of decades of exploration and development. Immersive audio does not replace stereo; it opens a new dimension in which to create, communicate, and connect in a very subtle way.

As this technology becomes more accessible, we will see new works, new forms of listening, and new audiences emerge. And it is very likely that in a few years, immersive sound will be as common as stereo sound is today.

Christopher Manhey.-
