Binaural recording

Neumann KU100 microphone used to record binaural sound

Binaural recording is a method of recording sound that uses two microphones, arranged with the intent of creating a 3-D stereo sound sensation that gives the listener the impression of actually being in the room with the performers or instruments. This effect is often created using a technique known as "dummy head recording", in which a mannequin head is fitted with a microphone in each ear. Binaural recording is intended for replay over headphones and does not translate properly over stereo speakers. The idea of a three-dimensional or "internal" form of sound has also been applied in other technologies, such as stethoscopes, which create "in-head" acoustics, and IMAX films, which create a three-dimensional acoustic experience.

The term "binaural" has frequently been confused with "stereo", partly because the recording industry misused it widely as a marketing buzzword in the mid-1950s. Conventional stereo recordings do not factor in natural ear spacing or the "head shadow" of the head and ears, since these effects arise naturally as a person listens, generating their own interaural time differences (ITDs) and interaural level differences (ILDs). Because the loudspeaker crosstalk of conventional stereo interferes with binaural reproduction, either headphones or crosstalk cancellation of the signals intended for the loudspeakers, such as Ambiophonics, is required. For listening over conventional stereo speakers or MP3 players, a pinna-less dummy head, such as the sphere microphone or Ambiophone, may be preferable for quasi-binaural recording. As a general rule, for true binaural results, the audio recording and reproduction chain, from microphone to listener's brain, should contain one and only one set of pinnae (preferably the listener's own) and one head shadow.

Recording technique

With a simple recording method, two microphones are placed 18 cm (7 in) apart, facing away from each other. This spacing roughly approximates the distance between an average person's ear canals, but that alone is not enough to create a true binaural recording. More elaborate techniques exist in pre-packaged forms. A typical binaural recording unit has two high-fidelity microphones mounted in a dummy head, inset in ear-shaped molds to capture the frequency-dependent changes (known as head-related transfer functions, or HRTFs, in the psychoacoustic research community) that occur naturally as sound wraps around the human head and is "shaped" by the form of the outer and inner ear. The Neumann KU-81 and KU-100 are the most commonly used binaural packages, especially by musicians. A simplified form of binaural recording can be achieved using microphones with a separating element, such as the Jecklin Disk. Not all of the cues required for exact localization of sound sources are preserved this way, but the result also works well for loudspeaker reproduction.
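As a rough illustration of why spacing alone captures only part of the picture, the sketch below computes the arrival-time difference between two microphones 18 cm apart for a distant source. It assumes free-field propagation with no head between the capsules and a speed of sound of 343 m/s; the function name and constants are illustrative and not taken from any particular product or library.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 °C


def spaced_pair_delay_s(spacing_m: float, azimuth_deg: float) -> float:
    """Arrival-time difference between two microphones for a distant source.

    azimuth_deg is measured from straight ahead (0) towards one side (90).
    Free-field model only: no head, so no shadowing and no pinna filtering.
    """
    return spacing_m * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND_M_S


if __name__ == "__main__":
    for azimuth in (0, 30, 60, 90):
        delay_us = spaced_pair_delay_s(0.18, azimuth) * 1e6
        print(f"azimuth {azimuth:>2} deg: {delay_us:6.1f} microseconds")
```

Under these assumptions the largest delay, for a source directly to one side, is about half a millisecond. The model reproduces a timing cue only; the level and spectral cues described by HRTFs are absent, which is the gap a dummy head is meant to fill.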

In the late 1960s, Aiwa and Sony offered headphones with a pair of microphones mounted on the headband around two inches above the ears. These allowed pseudo-binaural recordings to be made.

Miniature binaural "in-ear" or "near-ear" microphones can be linked to a portable Digital Audio Tape (DAT) or MiniDisc recorder, bypassing the need for a dummy head by using the recordist's own head. The first clip-in binaural microphones using the recordist's own head were offered by Sennheiser in 1974. The first clip-on binaural microphones using the recordist's own head were offered by Core Sound LLC in 1989. The first completely "in-ear" binaural microphones using the recordist's own head were offered by Sound Professionals in 1999. Roland Corporation also offers its CS-10EM in-ear binaural microphone set.

Binaural re-recording techniques

The technique of binaural re-recording is simple but not yet well established. It follows the same principle as Worldizing,[1] a technique used by film sound designers in which sound is played over a loudspeaker in a real-world location and then re-recorded, picking up the aspects and characteristics of the real-world environment.[2]

Using a physical space to shape a sound that is then re-recorded is nothing new; recording studios have done so with echo chambers for many years. In 1959, Irving Townsend famously used an echo chamber during the post-production of Miles Davis's album 'Kind of Blue'. "[the effect of the echo chamber on Kind of Blue is] just a bit of sweetening. At 30th Street, a line was run from the mixing console down into a low-ceilinged, concrete basement room - about twelve by fifteen feet in size - where we set up a speaker and a good omnidirectional microphone."[3]

In binaural re-recording, a binaural microphone is used to record content played over a multi-channel speaker set-up. The binaural head, or microphone, therefore in theory records how a human listener would hear the multi-channel content. A film soundtrack, for example, is recorded by the binaural microphone with all the environmental cues of the given location, as well as reverberations, including those created by the listener's torso (assuming a HATS[4] model is used). Like certain binaural recordings made with a Neumann KU100[5] or a HATS[4] model, this method can produce convincing 3D sound.
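The acoustic process can be approximated in software. The following is a minimal sketch, not the method of any study cited here: it assumes each loudspeaker-to-ear path is linear and time-invariant and is described by a measured binaural room impulse response (BRIR), so each ear signal is the sum of every speaker channel convolved with that speaker's impulse response for that ear. The channel names and impulse responses used below are hypothetical placeholders.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binaural_downmix(channels, brirs):
    """Simulate re-recording a multi-channel mix with a binaural head.

    channels: dict mapping a speaker name to its list of samples.
    brirs:    dict mapping the same name to (ir_to_left_ear, ir_to_right_ear).
    Returns (left_ear, right_ear) as lists of equal length.
    """
    length = max(
        len(channels[name]) + max(len(brirs[name][0]), len(brirs[name][1])) - 1
        for name in channels
    )
    left = [0.0] * length
    right = [0.0] * length
    for name, samples in channels.items():
        ir_left, ir_right = brirs[name]
        for k, value in enumerate(convolve(samples, ir_left)):
            left[k] += value
        for k, value in enumerate(convolve(samples, ir_right)):
            right[k] += value
    return left, right


if __name__ == "__main__":
    # Hypothetical two-speaker set-up with toy impulse responses:
    # each speaker reaches the near ear directly and the far ear
    # one sample later at half the amplitude.
    mix = {"L": [1.0, 0.0], "R": [0.0, 1.0]}
    toy_brirs = {"L": ([1.0], [0.0, 0.5]), "R": ([0.0, 0.5], [1.0])}
    print(binaural_downmix(mix, toy_brirs))
```

Because the impulse responses come from a single head in a single room, the result carries one fixed set of HRTFs and room cues, just as a physical re-recording does.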

A common problem is that some listeners cannot make sense of binaural recordings and, in most such cases, perceive no externalisation. However, this issue is common to all binaural recording and is not specific to re-recording.

Examples of binaural re-recording were produced by Stanford engineering student Jorge Miramontes,[6] who watched sections of the film Saving Private Ryan while wearing Sound Professionals SP-TFB-2 in-ear binaural microphones. As mentioned previously, when binaural microphones are used, all recorded audio is subjected to a single set of HRTFs, and the outcome may not translate well to every listener (although this is the case with all methods of improving spatial awareness in soundtracks).

Playback

Three key cues produce the 3D stereo effect: timing, loudness and timbre. Sound coming from the left arrives first at the left ear and microseconds later at the right ear. The head muffles the sound, making it louder at the left ear than at the right ear. The head and other parts of the body also deflect the sound, changing its frequency spectrum on its way from the left side to the right side. The human brain interprets these differences and automatically gives the listener the sensation that the sound comes from a certain location.[7]
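The size of the timing cue can be estimated with Woodworth's classical spherical-head approximation, ITD = (a / c)(θ + sin θ), where a is the head radius, c the speed of sound and θ the source azimuth in radians. The sketch below is an illustration under assumed values (a head radius of 8.75 cm and c = 343 m/s); the approximation applies to distant sources at azimuths between 0° and 90° and says nothing about the level or timbre cues.

```python
import math

HEAD_RADIUS_M = 0.0875       # assumption: a commonly quoted average head radius
SPEED_OF_SOUND_M_S = 343.0   # assumption: dry air at roughly 20 °C


def woodworth_itd_s(azimuth_deg: float,
                    head_radius_m: float = HEAD_RADIUS_M,
                    speed_of_sound: float = SPEED_OF_SOUND_M_S) -> float:
    """Interaural time difference for a rigid spherical head (Woodworth model).

    azimuth_deg runs from 0 (straight ahead) to 90 (directly to one side).
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("model is used here only for azimuths of 0 to 90 degrees")
    theta = math.radians(azimuth_deg)
    return head_radius_m / speed_of_sound * (theta + math.sin(theta))


if __name__ == "__main__":
    for azimuth in (0, 15, 45, 90):
        print(f"{azimuth:>2} deg: {woodworth_itd_s(azimuth) * 1e6:6.1f} microseconds")
```

With these assumed values the difference grows from zero for a source straight ahead to roughly 0.65 ms for a source directly to one side, consistent with the "microseconds later" arrival described above.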

Once recorded, the binaural effect can be reproduced using headphones. It does not work with mono playback, nor does it work over loudspeakers, because the acoustics of that arrangement degrade the channel separation through natural crosstalk. An approximation can be obtained if the listening environment is carefully designed and expensive crosstalk cancellation equipment is employed.
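The idea behind crosstalk cancellation can be shown with a deliberately simplified model. The sketch assumes that each ear hears its own loudspeaker at full level plus the opposite loudspeaker scaled by a single gain g, with no delay and no frequency dependence. Real systems must handle both, so this is a toy illustration of the principle rather than a usable canceller, and all names in it are hypothetical.

```python
def speakers_to_ears(left_spk, right_spk, crosstalk_gain):
    """Toy acoustic model: each ear also hears the opposite speaker, scaled."""
    ear_left = [l + crosstalk_gain * r for l, r in zip(left_spk, right_spk)]
    ear_right = [r + crosstalk_gain * l for l, r in zip(left_spk, right_spk)]
    return ear_left, ear_right


def cancel_crosstalk(left_bin, right_bin, crosstalk_gain):
    """Pre-process binaural signals so the toy model delivers them intact.

    Inverts the 2x2 mixing matrix [[1, g], [g, 1]] sample by sample.
    """
    determinant = 1.0 - crosstalk_gain ** 2
    if abs(determinant) < 1e-12:
        raise ValueError("mixing matrix is singular for this crosstalk gain")
    left_spk = [(l - crosstalk_gain * r) / determinant
                for l, r in zip(left_bin, right_bin)]
    right_spk = [(r - crosstalk_gain * l) / determinant
                 for l, r in zip(left_bin, right_bin)]
    return left_spk, right_spk


if __name__ == "__main__":
    binaural_left, binaural_right = [1.0, 0.0], [0.0, 1.0]
    # Without cancellation, each ear receives a mixture of both channels.
    print(speakers_to_ears(binaural_left, binaural_right, 0.5))
    # With cancellation, the ears receive (approximately) the original signals.
    feeds = cancel_crosstalk(binaural_left, binaural_right, 0.5)
    print(speakers_to_ears(*feeds, 0.5))
```

The inversion works only while the assumed gain matches the real acoustic path, which is one reason practical cancellation depends on a carefully controlled listening position and environment.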

Any set of headphones that provides good right and left channel isolation is sufficient to hear the immersive effects of the recording. Several high-end headset manufacturers have created units specifically for binaural playback. Even normal headphones have been found to suffer from poor externalization, especially if the headphone completely blocks the ear from the outside. Experiments suggest that an open-ear design, in which the drivers sit in front of the pinnae and the ear canal remains open to the air, externalizes better. The hypothesis is that completely blocking the ear canal alters the radiation impedance seen from the eardrum looking outward, which negatively affects externalization.[citation needed]

There are some complications with the playback of binaural recordings through headphones. The sound picked up by a microphone placed in or at the entrance of the ear canal has a frequency spectrum that is very different from the one picked up by a free-standing microphone. The diffuse-field head-related transfer function (HRTF), that is, the frequency response at the eardrum averaged over sounds arriving from all possible directions, is far from flat, with peaks and dips exceeding 10 dB. Frequencies from around 2 kHz to 5 kHz in particular are strongly amplified compared with free-field presentation.[8]
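A diffuse-field response of this kind can be estimated from a set of directional HRTF magnitude measurements. The sketch below makes two assumptions that are not stated in the text: the directions are weighted equally (a real measurement grid would normally need weights reflecting how densely it samples the sphere), and the average is taken over power, not over decibel values. The input data in the example are made up.

```python
import math


def diffuse_field_response_db(magnitudes_by_direction):
    """Power-average HRTF magnitudes across directions, per frequency bin.

    magnitudes_by_direction: list of lists of linear magnitudes |H(f)|,
    one inner list per measured direction, all on the same frequency bins.
    Returns the averaged response in dB for each bin.
    """
    n_directions = len(magnitudes_by_direction)
    n_bins = len(magnitudes_by_direction[0])
    response_db = []
    for k in range(n_bins):
        mean_power = sum(m[k] ** 2 for m in magnitudes_by_direction) / n_directions
        response_db.append(10.0 * math.log10(mean_power))
    return response_db


if __name__ == "__main__":
    # Hypothetical magnitudes for three directions and three frequency bins.
    measurements = [
        [1.0, 2.0, 0.8],
        [1.0, 3.0, 0.5],
        [1.0, 2.5, 0.6],
    ]
    for bin_index, level in enumerate(diffuse_field_response_db(measurements)):
        print(f"bin {bin_index}: {level:+.1f} dB")
```

Comparing the resulting curve with 0 dB shows where the ear boosts or attenuates sound on average across directions, which is the kind of deviation the 10 dB peaks and dips refer to.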

Known issues

Timbral issues

In January 2012, BBC R&D worked with BBC Radio 4 to produce a binaural production of Private Peaceful, the book by Michael Morpurgo.[9] The 88-minute dramatization featured a reproduction of a 5.1 speaker system and was released in four variations. At the start of each variation, the listener heard a series of test signals, allowing them to choose the version that gave the best spatial experience. In doing so, BBC R&D acknowledged that the success of binaural reproduction varies between listeners, and therefore provided different mixes based on different sets of HRTF data. The release was accompanied by a survey that all listeners were asked to complete. It asked how successful the binaural reproduction was for the listener and which version (1-4) the listener thought was most successful.

In an interview in September 2012, Chris Pike of BBC R&D stated that "you may get good spatial impression but timbral coloration is often an issue".[10] Timbral coloration is mentioned in much of the research on spatial enhancement. It is sometimes attributed to the misuse of HRTF data, or to an insufficient amount of it, when reproducing binaural audio, or to the fact that a given end user simply does not respond well to the HRTF data that was collected. Francis Rumsey states in the 2011 article 'Whose head is it anyway?'[11] that "badly implemented HRTFs can give rise to poor timbral quality, poor externalisation, and a host of other unwanted results".[11] Getting the HRTF data right is key to making the final product a success, and making the HRTF data as extensive as possible may leave less room for errors such as timbral issues. The HRTFs used for Private Peaceful[9] were obtained by measuring impulse responses in a reverberant room in order to capture a sense of space, but the result is not very external and has obvious timbral issues, as pointed out by Pike.[10]

Juha Merimaa of Sennheiser Research Laboratories in California discusses using HRTF filters and EQ to reduce timbral issues in his paper 'Modification of HRTF Filters to Reduce Timbral Effects in Binaural Synthesis, Part 2: Individual HRTFs' (2010).[12] When tested on a panel of listeners, modifying the HRTF filters to reduce timbral issues did not affect the spatial localisation previously achieved with the data. This suggests that the timbral effects of HRTF processing can be reduced, but only through further EQ manipulation of the audio. If this route is explored further, researchers will have to accept that the audio is being heavily manipulated to achieve a greater sense of spatial awareness, and that this manipulation causes irreversible changes to the audio, which content creators may not welcome. Consideration must be given to how much manipulation is appropriate and to what extent, if any, it affects the end user's experience. It is also important to consider the room in which the BRIR and HRTF data were collected, as different rooms will influence the end results.

When recording a set of HRTF data, only a limited number of measurements can be taken for distribution, and end users have to find the best results for themselves. The best HRTF data for any individual would of course be data collected from their own pinnae, which content creators for mobile applications do not currently do. Because of this, timbral issues may be unavoidable when using non-personal HRTF data or when distributing audio that has already been spatially processed. The most feasible route to improving spatial awareness in audio may be to explore head tracking or other methods of collecting HRTF data at the user end.

Timbral issues related to headphones

The headphones used by consumers will inevitably affect the end results. One issue is the wide range in quality of consumer-level headphones. Many MP3 players and tablets are supplied with low-budget earphones, which can cause problems for spatially enhanced audio.

Ideal listening conditions will most likely be experienced with headphones designed and calibrated to give as flat a frequency response as possible, in order to reduce colouration of the audio. In most circumstances, end users have not found this enough of a problem to invest in headphones that let them hear audio exactly as the content creator intended; they instead continue to use bundled headphones or, in some cases, buy headphones endorsed and branded by certain artists. As previously discussed, timbral effects are present when BRIR and HRTF data are used to create spatially improved audio, the technique used by Chris Pike and BBC R&D.[10] Because the results exhibited timbral issues, this method may not yet be a successful way of creating spatially enhanced audio for headphones, but timbral issues also arise from the choice of headphones. "[Are timbral issues brought about by the use of BRIR and HRTF data] any worse than the difference between some cheap headphones that you get with an mp3 player versus some nice Sennheisers".[10]

The point raised by Dr Mason[who?] in this interview is that even if audio is successfully reproduced in a more realistic 3D space, the effect could be damaged by the end user's choice of headphones. Cheaper headphones, and indeed more expensive headphones with EQ colouration, influence how the audio is heard by the end user. Headphones that colour the audio can therefore undermine any chance of experiencing an improved spatial soundtrack.[citation needed]

Commonly used binaural microphones

Neumann KU100

Neumann KU100 binaural microphone

The Neumann KU100 is a dummy head microphone used to record in binaural stereo. "It resembles the human head and has two microphone capsules built into the ears".[5] It is a commonly used binaural microphone and is used by BBC R&D teams.[13]

G.R.A.S. Head & Torso Simulator KEMAR (HATS)

"KEMAR was initially invented in collaboration with the audiological industry for the use of hearing aid development, and is still the defacto standard for this industry – however since then the usage of KEMAR has spread into a multitude of other industries like: telecommunications, hearing protection test, automotive development etc."[4] KEMAR's dimensions are based on large-scale statistical research so that they are as close as possible to average human measurements. Like the Brüel & Kjær simulators, the KEMAR model includes a torso. Torso reflections have been found to contribute considerably to a successful binaural recording.[14]

Brüel & Kjær head and torso simulators (HATS)

"Designed to be used in-situ electroacoustics tests on, for example, telephone handsets, headsets, audio conference devices, microphones, headphones, hearing aids and hearing protectors."[15] Similar to the G.R.A.S. HATS model, these are head and torso simulators (HATS) designed to replicate human hearing as closely as possible.

Brüel & Kjær head and torso simulator (HATS)

3Dio range

The 3Dio range of binaural microphones features two silicone ear (pinna) moulds separated by 19 cm (close to the average distance between ears). The microphone capsules placed inside the ears range from the Primo EM172 in the Free Space and Free Space XLR models to the DPA 4060 in the Pro II model. The 3Dio range is considerably cheaper than, for example, the Neumann KU100 and is therefore used more at the consumer-to-prosumer level. The main difference between the 3Dio models and the KEMAR or KU100 is the absence of a head model. The 3Dio relies entirely on its pinna moulds to achieve a binaural effect from the stereo recording.

Sound.Codes Kaan

Kaan is a DIY binaural microphone for sound artists. It is a 3D-printed model based on an averaged human ear canal, intended to reproduce the average resonant frequency of the human ear canal. Its form factor and weight make it easy to sample environments that would be harder to capture with other microphones and their accompanying ADCs and recorders.

Primo EM172 capsules are placed at the eardrum position, 235 mm apart, a distance taken as the average earlobe-to-earlobe spacing. The sigmoid form of Kaan's canal is said to make up for the missing head to a large extent.

Sound Professionals SP-TFB-2

The SP-TFB-2 is an in-ear wearable stereo microphone that is worn like earphones, inside the wearer's pinnae. It uses the wearer's own pinnae to create the binaural effect.[16]

ZiBionic

The ZiBionic One is a binaural microphone for ASMR recording. The specific shapes and sizes of a binaural recording device "affect the behaviour - such as absorption, transmission, reflection, interference - of acoustic waves". Like the 3Dio, the ZiBionic has no head model, but its body shape and the resulting head shadow are designed so that ASMR recording techniques (close-range sound sources, such as whispering) are picked up more effectively by the two capsules inside the ear-shaped microphones.

Hooke Verse

The Hooke Verse is a relatively new binaural device: an in-ear wearable set of microphones that connects to recording devices over Bluetooth with lossless recording. Its codec allows the user to capture audio along with video. The device also uses microphone windscreens to cut down on wind noise, a common problem with wearable devices and smartphones. "Smartphone manufacturers face a double problem with wind noise. Not only is turbulence present in the airflow at large, but the rectangular shape of a smartphone produces miniscule eddies around itself."[17]

History

The history of binaural recording goes back to 1881. The first binaural unit, the Théâtrophone, was an array of carbon telephone microphones installed along the front edge of the Opéra Garnier. The signal was sent to subscribers through the telephone system and required them to wear a special headset with a tiny speaker for each ear.

The novelty wore off, and there was no significant interest in the technology until around forty years later, when a Connecticut radio station began to broadcast binaural shows. Stereo radio had not yet been implemented, so the station broadcast the left channel on one frequency and the right channel on a second. Listeners had to own two radios and plug the left and right earpieces of their headsets into the respective radios. The expense of owning two radios was, at the time, too much for a broad audience, and binaural again faded into obscurity.

In 1978, Lou Reed released the first commercially produced binaural pop record, Street Hassle, a combination of live and studio recordings.[18]

Binaural stayed in the background because of the expensive, specialized equipment required for quality recordings and the need for headphones for proper reproduction. Particularly in pre-Walkman days, most consumers considered headphones an inconvenience and were only interested in recordings that could be listened to on a home stereo system or in automobiles. Lastly, the kinds of material suited to binaural recording do not typically have a high market value. Studio recordings benefit little from a binaural set-up, beyond natural cross-feed, because the spatial quality of a studio is not very dynamic or interesting. The recordings of most interest are live orchestral performances and ambient "environmental" recordings of city sounds, nature and similar subjects.

The modern era has seen a resurgence of interest in binaural, partially due to the widespread availability of headphones, cheaper methods of recording and the general increased commercial interest in 360° audio technology.

A small grassroots movement of people building their own recording sets and swapping them on the Internet has joined the very few CDs available for purchase.

The online ASMR community is another movement of late which has widely employed binaural recordings.

The rise of Dolby Atmos and other 360° audio film technologies in commercial entertainment has increased the popularity of binaural simulation, whose purpose is to adapt the 360° soundtrack fully for headphones and earphones. Users can watch 360° audio films and listen to music with the immersive surround sound experience remaining intact despite using just the two headset speakers. Notably, any full 360° multi-channel soundtrack is automatically converted to simulated binaural audio when listened to with headphones.

In 2013, David Cittadini and Andrew Hills used binaural recording techniques and technologies on the Australian short film The Blind Passenger. On 29 and 31 August 2013, they recorded The Metropolitan Orchestra binaurally, the first binaural recording of an orchestra in Australia.

In 2017 Ninja Theory used binaural recording techniques for the video game Hellblade: Senua's Sacrifice. This was done to immerse the player in the mindset of the player character, Senua, who is affected by psychosis and hears multiple voices in her head.

References

  1. "Worldizing". Filmsound.org.
  2. Burtt, Ben (2001). Galactic Phrase Book & Travel Guide. New York: Random House. pp. 136–137.
  3. Kahn, Ashley (2002). Kind of Blue: The Making of a Masterpiece. London: Granta Publications. p. 102.
  4. "Head & Torso Simulators".
  5. "Dummy Head KU 100". www.neumann.com.
  6. Miramontes, Jorge. "5.1 Re-recording to Binaural". ccrma.stanford.edu.
  7. Rumsey, Francis (2001). Spatial Audio. Focal Press. pp. 62–64. ISBN 0 240 51623 0.
  8. Hertsens, Tyll (6 February 2016). "Headphone Measurements Explained". InnerFidelity. Retrieved 22 July 2016.
  9. "BBC - Radio 4 and 4 Extra Blog". Retrieved 6 April 2017.
  10. Costerton, Benjamin (2013). "A Systematic Review of the Most Appropriate Methods of Achieving Spatially Enhanced Audio for Headphone Use". Pinecone Research Labs.
  11. Rumsey, Francis (2011). "Whose Head Is it Anyway? Optimizing Binaural Audio". Journal of the Audio Engineering Society. 59.
  12. Merimaa, Juha (2010). "Modification of HRTF Filters to Reduce Timbral Effects in Binaural Synthesis, Part 2: Individual HRTFs". Journal of the Audio Engineering Society.
  13. "Binaural Broadcasting". bbc.co.uk.
  14. Ham, H,L (1991). "Measuring a Dummy Head in Search of Pinna Cues". Journal of the Audio Engineering Society.
  15. "TYPE 4128-C Head and Torso Simulator (HATS)". Brüel & Kjær. Retrieved 6 May 2017.
  16. Do, Tuan (5 April 2016). "Sound Professionals SP-TFB-2 Low-Noise In-Ear Binaural Microphones Review". Techwalls.
  17. http://blogs.bl.uk/sound-and-vision/2010/07/how-to-reduce-wind-noise-on-your-smartphone-recordings.html
  18. Nusser, Dick (14 January 1978). "Arista Has 1st Stereo/Binaural Disk". Billboard. Retrieved 7 April 2014.