
No one would disagree that providing an experience that causes moviegoers to return to a theatre is a smart business decision. But what exactly is the “premium moviegoing experience” that everyone is talking about, and how do you deliver it?
It’s more than just oversized screens, plush recliners or haptic seating, gourmet dining options, and immersive sound. Nor is it limited to premium large-format auditoriums.
But when you strip away all of the extra amenities, the moviegoing experience comes down to the movie itself, and how it is perceived. Breaking it down even further, it is the audio and visual perception of each moviegoer that ultimately determines the quality of the experience for them. This is not to discount any other sensory perception, but arguably, cinema is primarily an audio-visual (AV) experience.
Great strides have been made to enhance the visual experience since the digital cinema revolution. We now have greatly expanded color gamut, dynamic range, brightness, resolution, image depth (3D), and more. All of these improvements give the filmmaker the tools to create any visual world they wish, with startling realism.
On the other hand, similar improvements on the audio side of the cinema technology equation have not kept pace. Significant advancements in transducer design, signal processing, integrated system design, and networking have become common in other professional sound markets, such as installed AV systems in commercial buildings (public address, paging, background and foreground music) and live and touring sound. But for some reason, similar advancements have not yet been fully embraced in the cinema world. This clearly represents an opportunity to further enhance the premium experience and create customer preference for your theatre.
Let’s look at the nature of sound perception. It provides two sources of critical information that have been fundamental to the survival of our species: localization (where is it coming from?), and source identification (what is it?).
In terms of the sound systems in a theatre, localization involves where the loudspeakers are installed, the characteristics of the loudspeakers, and any signal processing that is applied to them. Immersive sound formats have dramatically expanded the range of localization experiences that can be delivered.
The other half of the cinema sound experience – source identification – has not seen the same advances as localization. Of course, localizing a sound helps us identify its source (if it’s overhead, it’s probably a bird or an airplane), but identification also depends on the actual nature of the sound, or sound quality, delivered by your loudspeakers. Since quality is inherently a subjective value judgement, for sound systems we can objectively define quality as the degree to which the system reproduces the signal it receives. The output should be the same as the input – or as close to it as physics will allow.
What Makes Good Cinema Sound?
The reproduction of a cinema soundtrack is unique among types of recorded sound because it must support a visual experience in order to deliver the artist’s intent (in this case, the filmmaker). The filmmaker is attempting to bring us into a very specific and intentional world that they have labored to create.
Unlike music, the visual arts, or even live theatre, theatrical exhibition places a priority on duplicating the same experience regardless of the venue. The goal is consistency of experience from creation to presentation, for every patron, in every venue. Achieving this is a critical part of a premium experience. And since the goal in cinema is to make the output equivalent to the input, we need to eliminate (as much as possible) anything in between that might alter the final presentation.
If we focus on the sound half of the cinema AV experience, the sound that’s heard when the film is mixed should be the same sound that the audience hears when the film is presented in a movie theatre. Then, and only then, is the filmmaker’s creative intent successfully delivered.
The challenge is that technology doesn’t always make it easy. Passing a signal (sound in its electrical form) through a sound system in order to reproduce it accurately for various rooms and audience sizes always alters the signal to some degree, due to the laws of physics combined with the inherent compromises of the technology itself. Another word for alteration is distortion: any difference between the input and the output is the result of some type of distortion.
The Good, The Bad, and The Accurate
Since sound is a perception and therefore subjective, there is no such thing as universally “good” sound. Every living vertebrate has its own unique set of microphones (ears) and signal processing device (brain), but there is no common calibration among individuals. What sounds good to me might not sound good to you. There is, however, “accurate” sound: sound that is transmitted from the source to the listener with minimal alteration or distortion of the original. Accuracy is the aim, since we can’t control how each individual perceives sound.
There are three characteristics of sound that dictate its accuracy: frequency (its spectral content), amplitude (how loud it is), and time (how long it takes to reach our ears).
The accuracy of sound transmission from the cinema media player to the listener (moviegoer) is dictated by the main components of a sound system: electronics, loudspeakers, and room acoustics.
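As a concrete illustration (not an industry test procedure), the sketch below shows one way those three characteristics could be checked in practice: compare the signal sent into the system with the signal measured at the listening position, and look at how amplitude, frequency content, and arrival time differ. The sample rate, test signal, and variable names are assumptions made purely for this example.

```python
# Minimal sketch: quantify accuracy by comparing a system's output to its input
# in frequency, amplitude, and time. `x` is the signal fed to the system and
# `y` is the signal measured at the listening position (both assumed here).
import numpy as np
from scipy import signal

fs = 48_000                      # sample rate in Hz (assumed)
x = np.random.randn(fs)          # stand-in test signal (pink noise or sweeps in practice)
y = np.copy(x)                   # stand-in "perfect" system: output equals input

# H1 transfer-function estimate H(f) = Pxy(f) / Pxx(f) between input and output.
f, Pxy = signal.csd(x, y, fs=fs, nperseg=4096)
_, Pxx = signal.welch(x, fs=fs, nperseg=4096)
H = Pxy / Pxx

magnitude_db = 20 * np.log10(np.abs(H) + 1e-12)   # amplitude vs. frequency
phase = np.unwrap(np.angle(H))                    # phase vs. frequency
group_delay = -np.gradient(phase, 2 * np.pi * f)  # time: frequency-dependent delay (s)

# A perfectly accurate system: magnitude_db ~ 0 dB everywhere and
# group_delay ~ constant, i.e. all frequencies arrive together.
print(f"amplitude error: {np.ptp(magnitude_db):.2f} dB")
print(f"arrival-time spread: {np.ptp(group_delay) * 1000:.3f} ms")
```

In a sketch like this, a perfectly accurate system keeps the amplitude error at 0 dB and the arrival-time spread at zero; any deviation is, by the definition above, distortion.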

When the soundtrack is created, sound designers and mixers make creative judgements based on what they hear, which is inevitably filtered by the sound system and its interaction with the room they are listening in. When the final soundtrack is played back to audiences in theatres, these same two variables also affect the final listening experience. For the best possible “translation” from post-production to the theatre, alteration (distortion) of the signal (the sound) caused by both the room and the sound system must be minimized.
Fortunately, there are well-established industry standards (e.g. SMPTE ST-202, SMPTE RP-2096, ISO 2969) which, if followed as closely as possible, can mitigate room-to-room variance caused by room acoustics. There are some standards for sound system components, but the highly competitive nature of the audio business has led to a wide range of actual performance. Attempts to “level the playing field” by referring to published technical specifications have not made objective comparison any easier; in fact, it’s often even more difficult to evaluate one system or component against another on paper. The ideal situation would be a sound system that adds no alteration to the signal whatsoever.
Over the years, when selecting sound system components, audio manufacturers and consumers have generally focused on two essential characteristics: frequency and amplitude. If a published system specification chart appears as a relatively “flat” amplitude vs. frequency graph (like the yellow line in the figure below), it should be a good-sounding system – or so says conventional wisdom.

This would be reasonable logic if it weren’t for one thing: time. Except for a single-frequency pure sine wave tone (which no one wants to listen to), a complex sound (like a film soundtrack) is composed of a nearly infinite number of frequencies, all being transmitted at approximately the same time. Since no single loudspeaker can effectively reproduce the incredibly wide range of frequencies in a complex sound, we use several loudspeaker types, such as compression drivers (attached to horns) for mid and high frequencies and cone drivers for low and mid frequencies, each of which performs optimally only within a given “band” of frequencies.
The problem is that the frequencies reproduced by each of these types of loudspeakers might not be transmitted to the listener’s ears all at the same time. Some frequencies might lag behind others. This disparity in the time domain is called “phase delay”, a frequency-dependent stretching of the temporal makeup of the signal. It may be a tiny amount of delay, but it is perceptible and can degrade the true fidelity of the sound you’re trying to reproduce. All of the parts are there, but the arrivals have been stretched over time. A reproduced violin might sound less like a violin because the temporal cues have been altered. A snare drum will sound less “tight” when phase delay widens the impulse. A human voice might not sound exactly as it would if you were actually standing face-to-face with that person. When the frequency response approaches flatness and the phase response approaches linearity, the sound is accurately reproduced – as the sound creators intended when they heard it on the dub stage. This is premium sound.
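To make the point concrete, here is a small, hypothetical sketch (not a measurement of any real loudspeaker) of a conventional 4th-order Linkwitz-Riley crossover between a cone driver and a compression driver; the crossover frequency and sample rate are assumed values. The summed response measures essentially flat in amplitude, yet its group delay is not constant across frequency, which is exactly the temporal stretching described above.

```python
# Sketch: a flat-looking crossover that still smears time.
# A 4th-order Linkwitz-Riley crossover sums to a flat amplitude response,
# but its phase rotates around the crossover, so frequencies near and below
# the crossover arrive later than the highest frequencies.
import numpy as np
from scipy import signal

fs = 48_000   # sample rate (assumed)
fc = 1_200    # crossover between cone driver and compression driver (assumed)

# 4th-order Linkwitz-Riley = two cascaded 2nd-order Butterworth sections.
b_lp, a_lp = signal.butter(2, fc, btype="low", fs=fs)
b_hp, a_hp = signal.butter(2, fc, btype="high", fs=fs)

w, h_lp = signal.freqz(np.convolve(b_lp, b_lp), np.convolve(a_lp, a_lp), worN=4096, fs=fs)
_, h_hp = signal.freqz(np.convolve(b_hp, b_hp), np.convolve(a_hp, a_hp), worN=4096, fs=fs)

summed = h_lp + h_hp                          # idealized acoustic sum (time-aligned drivers)
magnitude_db = 20 * np.log10(np.abs(summed))  # ~0 dB everywhere: "flat" on paper
phase = np.unwrap(np.angle(summed))
group_delay_ms = -np.gradient(phase, 2 * np.pi * w) * 1000

# magnitude_db is essentially flat, but group_delay_ms is not constant:
# the arrival time of different frequencies is spread by a fraction of a
# millisecond, something an amplitude-vs-frequency chart alone never reveals.
print(f"level deviation across the band: {np.ptp(magnitude_db):.2f} dB")
print(f"spread in arrival time: {np.ptp(group_delay_ms):.2f} ms")
```

A spec sheet built from the amplitude curve alone would call this hypothetical system “flat”; only when the phase (or group delay) response is examined and brought as close to linear as possible does the time component of accuracy come into view.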

If, as the famous George Lucas quote goes, “sound is half the picture”, then clearly premium sound is part of the premium moviegoing experience. But if you’re not delivering accurate sound for your audiences, then you’re not fully delivering on the promise of the premium experience.
- CJ Tech: Sound and the Premium Moviegoing Experience - January 30, 2025