Head-related transfer function
A head-related transfer function (HRTF) is a response that characterizes how an ear receives a sound from a point in space; a pair of HRTFs for two ears can be used to synthesize a binaural sound that seems to come from a particular point in space. It is a transfer function, describing how a sound from a specific point will arrive at the ear (generally at the outer end of the auditory canal). Some consumer home entertainment products designed to reproduce surround sound from stereo (two-speaker) headphones use HRTFs. Some forms of HRTF-processing have also been included in computer software to simulate surround sound playback from loudspeakers.
Humans have just two ears, but can locate sounds in three dimensions – in range (distance), in direction above and below, in front and to the rear, as well as to either side. This is possible because the brain, the inner ear and the external ears (pinnae) work together to make inferences about location. This ability to localize sound sources may have developed in humans as an evolutionary necessity, since the eyes can only see a fraction of the world around a viewer, and vision is hampered in darkness, while the ability to localize a sound source works in all directions, to varying accuracy, regardless of the surrounding light.
Humans estimate the location of a source by taking cues derived from one ear (monaural cues), and by comparing cues received at both ears (difference cues or binaural cues). Among the difference cues are time differences of arrival and intensity differences. The monaural cues come from the interaction between the sound source and the human anatomy, in which the original source sound is modified before it enters the ear canal for processing by the auditory system. These modifications encode the source location, and may be captured via an impulse response which relates the source location and the ear location. This impulse response is termed the head-related impulse response (HRIR). Convolution of an arbitrary source sound with the HRIR converts the sound to that which would have been heard by the listener if it had been played at the source location, with the listener's ear at the receiver location. HRIRs have been used to produce virtual surround sound.
The HRTF is the Fourier transform of the HRIR. The HRTF is also sometimes known as the anatomical transfer function (ATF).
The HRTFs for the left and right ears (expressed in the time domain as HRIRs) describe the filtering of a sound source x(t) before it is perceived at the left and right ears as xL(t) and xR(t), respectively.
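As a minimal sketch of this filtering, the following Python fragment convolves a mono signal x with a left-ear and a right-ear HRIR to obtain xL and xR. The HRIR values are made-up toy numbers rather than measured data, and the helper names convolve and binauralize are hypothetical. Where NumPy is available, numpy.convolve would normally replace the hand-written loop.

```python
def convolve(signal, kernel):
    """Full linear convolution of two sequences (pure Python)."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def binauralize(x, hrir_left, hrir_right):
    """Return (x_L, x_R): the mono signal x filtered by each ear's HRIR."""
    return convolve(x, hrir_left), convolve(x, hrir_right)


# Hypothetical toy HRIRs for a source on the listener's left:
# the right-ear response starts two samples later and is weaker.
hrir_left = [1.0, 0.5, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.6, 0.3]

x = [1.0, 0.0, -1.0]  # short mono test signal
x_left, x_right = binauralize(x, hrir_left, hrir_right)
```

In this toy case the right-ear signal is a delayed and attenuated version of the left-ear signal, which is the kind of time and intensity difference the binaural cues rely on.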
The HRTF can also be described as the modifications to a sound from a direction in free air to the sound as it arrives at the eardrum. These modifications include the shape of the listener's outer ear, the shape of the listener's head and body, the acoustical characteristics of the space in which the sound is played, and so on. All these characteristics will influence how (or whether) a listener can accurately tell what direction a sound is coming from.
How HRTF works
The associated mechanism varies between individuals, as their head and ear shapes differ.
The HRTF describes how a given sound wave input (parameterized as frequency and source location) is filtered by the diffraction and reflection properties of the head, pinna, and torso, before the sound reaches the transduction machinery of the eardrum and inner ear (see auditory system). Biologically, the source-location-specific prefiltering effects of these external structures aid in the neural determination of source location, particularly the determination of the source's elevation (see vertical sound localization).
Linear systems analysis defines the transfer function as the complex ratio between the output signal spectrum and the input signal spectrum as a function of frequency. Blauert (1974; cited in Blauert, 1981) initially defined the transfer function as the free-field transfer function (FFTF). Other terms include free-field to eardrum transfer function and the pressure transformation from the free-field to the eardrum. Less specific descriptions include the pinna transfer function, the outer ear transfer function, the pinna response, or directional transfer function (DTF).
The transfer function H(f) of any linear time-invariant system at frequency f is:
- H(f) = Output(f) / Input(f)
One method used to obtain the HRTF from a given source location is therefore to measure the head-related impulse response (HRIR), h(t), at the eardrum for the impulse δ(t) placed at the source. The HRTF H(f) is the Fourier transform of the HRIR h(t).
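The relationship between h(t) and H(f) can be illustrated with a short Python sketch. It assumes a discrete-time HRIR sampled at a handful of points, uses a naive discrete Fourier transform in place of the continuous one, and takes made-up sample values. The function name dft is hypothetical, and a library routine such as numpy.fft.fft would normally be used instead.

```python
import cmath


def dft(h):
    """Naive DFT: H[k] = sum_n h[n] * exp(-2*pi*i*k*n/N)."""
    n_samples = len(h)
    return [
        sum(h[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples)
            for n in range(n_samples))
        for k in range(n_samples)
    ]


# Hypothetical HRIR: direct sound plus one weaker copy two samples later.
hrir = [1.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]

# HRTF as the (discrete) Fourier transform of the HRIR.
hrtf = dft(hrir)
magnitudes = [abs(value) for value in hrtf]

# Equivalent view as Output(f) / Input(f): a unit impulse has a flat
# spectrum, so dividing by it leaves the spectrum of the response itself.
impulse = [1.0] + [0.0] * 7
ratio = [out / inp for out, inp in zip(dft(hrir), dft(impulse))]
```

Because the spectrum of the unit impulse is 1 at every frequency, the ratio of output to input spectra coincides with the transform of the measured response, which is why an impulse measurement yields the HRTF directly.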
Even when measured for a "dummy head" of idealized geometry, HRTFs are complicated functions of frequency and the three spatial variables. For distances greater than 1 m from the head, however, the HRTF can be said to attenuate inversely with range. It is this far-field HRTF, H(f, θ, φ), that has most often been measured. At closer range, the difference in level observed between the ears can grow quite large, even in the low-frequency region within which negligible level differences are observed in the far field.
HRTFs are typically measured in an anechoic chamber to minimize the influence of early reflections and reverberation on the measured response. HRTFs are measured at small increments of θ such as 15° or 30° in the horizontal plane, with interpolation used to synthesize HRTFs for arbitrary positions of θ. Even with small increments, however, interpolation can lead to front-back confusion, and optimizing the interpolation procedure is an active area of research.
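The simplest form of such interpolation can be sketched in Python as a linear blend of the two measured HRIRs that bracket the requested azimuth. This is only an illustrative baseline under stated assumptions, not a method taken from the literature cited here: it assumes a regular grid starting at 0° and covering the full circle, it ignores the arrival-time differences between neighbouring responses, and both the function name interpolate_hrir and the toy data are hypothetical.

```python
import math


def interpolate_hrir(measured, azimuth_deg):
    """Linearly blend the two measured HRIRs that bracket azimuth_deg.

    measured maps azimuth in degrees to an HRIR (list of floats) on a
    regular grid that starts at 0 and covers the full circle.
    """
    angles = sorted(measured)
    step = angles[1] - angles[0]
    azimuth_deg %= 360
    lower = (azimuth_deg // step) * step
    upper = (lower + step) % 360          # wraps from the last angle to 0
    weight = (azimuth_deg - lower) / step
    return [(1 - weight) * a + weight * b
            for a, b in zip(measured[lower], measured[upper])]


# Hypothetical two-tap "HRIRs" on a 30-degree grid (not measured data).
measured = {az: [1.0 + 0.5 * math.cos(math.radians(az)), 0.25]
            for az in range(0, 360, 30)}

hrir_45 = interpolate_hrir(measured, 45)    # halfway between 30 and 60
hrir_345 = interpolate_hrir(measured, 345)  # halfway between 330 and 0
```

A blend of this kind treats azimuth as the only variable and does nothing to resolve front-back confusion, which is one reason more careful interpolation schemes are studied.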
In order to maximize the signal-to-noise ratio (SNR) in a measured HRTF, it is important that the impulse being generated be of high volume. In practice, however, it can be difficult to generate impulses at high volumes and, if generated, they can be damaging to human ears, so it is more common for HRTFs to be directly calculated in the frequency domain using a frequency-swept sine wave or by using maximum length sequences. User fatigue is still a problem, however, highlighting the need for the ability to interpolate based on fewer measurements.
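The maximum length sequence approach can be illustrated with a toy Python sketch. It relies on several simplifying assumptions: a 15-sample sequence from a 4-bit shift register, periodic (circular) excitation, no measurement noise, and an impulse response shorter than one period. The function names and the "true" response are hypothetical. The recovery step uses the property that the circular autocorrelation of a ±1 maximum length sequence of length N is N at lag 0 and -1 at every other lag.

```python
def mls15():
    """Return a +/-1 maximum length sequence of period 15.

    Bits follow the recurrence a[n] = a[n-3] XOR a[n-4].
    """
    bits = [1, 0, 0, 0]
    while len(bits) < 15:
        bits.append(bits[-3] ^ bits[-4])
    return [1.0 if b else -1.0 for b in bits]


def circular_response(h, s):
    """Steady-state output when s is played periodically through h."""
    n = len(s)
    return [sum(h[m] * s[(k - m) % n] for m in range(len(h)))
            for k in range(n)]


def recover_hrir(y, s):
    """Estimate the impulse response by circular cross-correlation."""
    n = len(s)
    r = [sum(y[i] * s[(i - k) % n] for i in range(n)) for k in range(n)]
    # r[k] = (n + 1) * h[k] - sum(h), and sum(r) equals sum(h).
    total = sum(r)
    return [(rk + total) / (n + 1) for rk in r]


s = mls15()
h_true = [0.0, 1.0, 0.5, -0.25]   # hypothetical response to be measured
y = circular_response(h_true, s)  # what the ear microphone would record
estimate = recover_hrir(y, s)     # h_true followed by zeros
```

The excitation spreads its energy over the whole period instead of concentrating it in a single loud click, which is why such sequences can reach a useful signal-to-noise ratio at moderate playback levels.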
The head-related transfer function is involved in resolving the cone of confusion, a set of points on a cone around the interaural axis for which the interaural time difference (ITD) and interaural level difference (ILD) are identical, so that sound sources at many different locations produce the same difference cues. When a sound is received by the ear, it can either travel straight into the ear canal or be reflected off the pinna and enter the ear canal slightly later. The sound will contain many frequencies, so many copies of the signal reach the ear canal at different times, depending on frequency (according to reflection, diffraction, and the size of the structures of the ear relative to the wavelength). These copies overlap, so that some frequencies are enhanced (where the phases of the copies match) while others are canceled out (where the phases do not match). Essentially, the brain is looking for frequency notches in the signal that correspond to particular known directions of sound.
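The enhancement and cancellation described above can be illustrated with a deliberately simplified Python model: a direct path plus a single delayed, weaker reflection. A real pinna produces many reflections and diffraction effects, and the delay and gain used here are hypothetical values chosen only to place the first notch at a round number. The function names are also hypothetical. In this model the two copies are in antiphase, and a notch appears, at the frequencies (2k + 1) / (2 × delay).

```python
import cmath


def reflection_response(freq_hz, delay_s, gain):
    """Frequency response of a direct path plus one delayed, scaled copy."""
    return 1 + gain * cmath.exp(-2j * cmath.pi * freq_hz * delay_s)


def notch_frequencies(delay_s, max_hz):
    """Frequencies (2k+1)/(2*delay) up to max_hz where the copies cancel."""
    notches = []
    k = 0
    while (2 * k + 1) / (2 * delay_s) <= max_hz:
        notches.append((2 * k + 1) / (2 * delay_s))
        k += 1
    return notches


delay = 50e-6  # hypothetical extra path delay via the pinna (50 microseconds)
gain = 0.5     # hypothetical relative strength of the reflected copy

notches = notch_frequencies(delay, 20_000)                 # one notch near 10 kHz
at_notch = abs(reflection_response(10_000, delay, gain))   # 1 - gain
at_peak = abs(reflection_response(20_000, delay, gain))    # 1 + gain
```

Because the extra path length through the pinna changes with the direction of the source, the notch frequency shifts with direction as well, which is the kind of spectral cue the paragraph above refers to.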
If another person's ears were substituted, the individual would not immediately be able to localize sound, as the patterns of enhancement and cancellation would be different from those patterns the person's auditory system is used to. However, after some weeks, the auditory system would adapt to the new head-related transfer function. The inter-subject variability in the spectra of HRTFs has been studied through cluster analyses.
Recordings processed via an HRTF that approximates the HRTF of the listener, such as in a computer gaming environment (see A3D, EAX and OpenAL), can be heard through stereo headphones or speakers and interpreted as if they comprise sounds coming from all directions, rather than from just two points on either side of the head. The perceived accuracy of the result depends on how closely the HRTF data set matches the characteristics of one's own ears.
See also
- Binaural recording
- Dummy head recording
- Environmental audio extensions
- Sound Retrieval System
- Sound localization
- Transfer function
References
- Daniel Starch (1908). Perimetry of the localization of sound. State University of Iowa. p. 35 ff.
- Begault, D.R. (1994) 3D sound for virtual reality and multimedia. AP Professional.
- So, R.H.Y., Leung, N.M., Braasch, J. and Leung, K.L.(2006) A low cost, Non-individualized surround sound system based upon head-related transfer functions. An Ergonomics study and prototype development. Applied Ergonomics, 37, pp.695-707.
- Blauert, J. (1997) Spatial hearing: the psychophysics of human sound localization. MIT Press.
- Hofman, Paul M.; Van Riswick, JG; Van Opstal, AJ (September 1998). "Relearning sound localization with new ears" (PDF). Nature Neuroscience 1 (5): 417–421. doi:10.1038/1633. PMID 10196533.
- So, R.H.Y., Ngan, B., Horner, A., Leung, K.L., Braasch, J. and Blauert, J. (2010) Toward orthogonal non-individualized head-related transfer functions for forward and backward directional sound: cluster analysis and an experimental study. Ergonomics, 53(6), pp.767-781.