Temporal envelope and fine structure

From Wikipedia, the free encyclopedia

Temporal envelope (ENV) and temporal fine structure (TFS) are changes in the amplitude and frequency of sound perceived by humans over time. These temporal changes are responsible for several aspects of auditory perception, including loudness, pitch and timbre perception and spatial hearing.

Complex sounds such as speech or music are decomposed by the peripheral auditory system of humans into narrow frequency bands. The resulting narrow-band signals convey information at different time scales ranging from less than one millisecond to hundreds of milliseconds. A dichotomy between slow "temporal envelope" cues and faster "temporal fine structure" cues has been proposed to study several aspects of auditory perception (e.g., loudness, pitch and timbre perception, auditory scene analysis, sound localization) at two distinct time scales in each frequency band.[1][2][3][4][5][6][7] Over the last decades, a wealth of psychophysical, electrophysiological and computational studies based on this envelope/fine-structure dichotomy have examined the role of these temporal cues in sound identification and communication, how these temporal cues are processed by the peripheral and central auditory system, and the effects of aging and cochlear damage on temporal auditory processing. Although the envelope/fine-structure dichotomy has been debated and questions remain as to how temporal fine structure cues are actually encoded in the auditory system, these studies have led to a range of applications in various fields including speech and audio processing, clinical audiology and rehabilitation of sensorineural hearing loss via hearing aids or cochlear implants.


Outputs of simulated cochlear filters centred at 364, 1498 and 4803 Hz (from bottom to top) in response to a segment of a speech signal, the sound “en” in “sense”. These filter outputs are similar to the waveforms that would be observed at places on the basilar membrane tuned to 364, 1498 and 4803 Hz. For each centre frequency, the signal can be considered as a slowly-varying envelope (EBM) imposed on a more rapid temporal fine structure (TFSBM). The envelope for each band signal is shown by the thick line.

Notions of temporal envelope and temporal fine structure may have different meanings in many studies. An important distinction to make is between the physical (i.e., acoustical) and the biological (or perceptual) description of these ENV and TFS cues.

Schematic representation of the three levels of temporal envelope (ENV) and temporal fine structure (TFS) cues conveyed by a band-limited signal processed by the peripheral auditory system.

Any sound whose frequency components cover a narrow range (called a narrowband signal) can be considered as an envelope (ENVp, where p denotes the physical signal) superimposed on a more rapidly oscillating carrier, the temporal fine structure (TFSp).[8]
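As an illustrative sketch of this decomposition, a narrowband signal is commonly written as a slowly varying amplitude multiplying a rapidly oscillating carrier. The symbols a(t), φ(t), f_c and θ(t) are notation introduced here for illustration and do not come from the text:

```latex
x(t) = a(t)\,\cos\bigl(\phi(t)\bigr), \qquad \phi(t) = 2\pi f_c t + \theta(t), \qquad a(t) \ge 0
```

In this notation a(t) plays the role of ENVp and cos φ(t) the role of TFSp, with f_c a nominal centre frequency and θ(t) a comparatively slow phase modulation.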

Many sounds in everyday life, including speech and music, are broadband; the frequency components spread over a wide range and there is no well-defined way to represent the signal in terms of ENVp and TFSp. However, in a normally functioning cochlea, complex broadband signals are decomposed by the filtering on the basilar membrane (BM) within the cochlea into a series of narrowband signals.[9] Therefore, the waveform at each place on the BM can be considered as an envelope (ENVBM) superimposed on a more rapidly oscillating carrier, the temporal fine structure (TFSBM).[10] The ENVBM and TFSBM depend on the place along the BM. At the apical end, which is tuned to low (audio) frequencies, ENVBM and TFSBM vary relatively slowly with time, while at the basal end, which is tuned to high frequencies, both ENVBM and TFSBM vary more rapidly with time.[10]

Both ENVBM and TFSBM are represented in the time patterns of action potentials in the auditory nerve;[11] these are denoted ENVn and TFSn. TFSn is represented most prominently in neurons tuned to low frequencies, while ENVn is represented most prominently in neurons tuned to high (audio) frequencies.[11][12] For a broadband signal, it is not possible to manipulate TFSp without affecting ENVBM and ENVn, and it is not possible to manipulate ENVp without affecting TFSBM and TFSn.[13][14]

Temporal envelope (ENV) processing

Neurophysiological aspects

Examples of sinusoidally amplitude- and frequency-modulated signals

The neural representation of stimulus envelope, ENVn, has typically been studied using well-controlled ENVp modulations, that is, sinusoidally amplitude-modulated (AM) sounds. Cochlear filtering limits the range of AM rates encoded in individual auditory-nerve fibers. In the auditory nerve, the strength of the neural representation of AM decreases with increasing modulation rate. At the level of the cochlear nucleus, several cell types show an enhancement of ENVn information. Multipolar cells can show band-pass tuning to AM tones with AM rates between 50 and 1000 Hz.[15][16] Some of these cells show an excellent response to the ENVn and provide inhibitory sideband inputs to other cells in the cochlear nucleus, giving a physiological correlate of comodulation masking release, a phenomenon whereby the detection of a signal in a masker is improved when the masker has correlated envelope fluctuations across frequency (see section below).[17][18]
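The sinusoidally modulated stimuli used in such studies can be sketched in a few lines. The following is a minimal illustration, not the stimulus-generation code of any cited study; the function names and parameters (`sam_tone`, `sfm_tone`, `depth`, `deviation`) are assumptions introduced here.

```python
import math

def sam_tone(fc, fm, depth, fs, dur):
    """Sinusoidally amplitude-modulated (SAM) tone.

    The envelope is 1 + depth*sin(2*pi*fm*t); the carrier is a sine at fc.
    """
    n = int(round(fs * dur))
    return [
        (1.0 + depth * math.sin(2.0 * math.pi * fm * i / fs))
        * math.sin(2.0 * math.pi * fc * i / fs)
        for i in range(n)
    ]

def sfm_tone(fc, fm, deviation, fs, dur):
    """Sinusoidally frequency-modulated (SFM) tone.

    The instantaneous frequency is fc + deviation*sin(2*pi*fm*t); the phase
    is its time integral (up to a constant), so the amplitude stays flat.
    """
    beta = deviation / fm  # modulation index (hypothetical parameterization)
    n = int(round(fs * dur))
    return [
        math.sin(2.0 * math.pi * fc * i / fs
                 - beta * math.cos(2.0 * math.pi * fm * i / fs))
        for i in range(n)
    ]
```

For example, `sam_tone(1000.0, 10.0, 0.5, 16000, 0.2)` yields a 1 kHz carrier whose amplitude swings between 0.5 and 1.5 ten times per second, whereas the corresponding `sfm_tone` keeps a constant amplitude and varies only its frequency.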

Responses to the temporal-envelope cues of speech or other complex sounds persist up the auditory pathway, eventually to the various fields of the auditory cortex in many animals. In the primary auditory cortex, responses can encode AM rates by phase-locking up to about 20–30 Hz,[19][20][21][22] while faster rates induce sustained and often tuned responses.[23][24] A topographical representation of AM rate has been demonstrated in the primary auditory cortex of awake macaques.[25] This representation is approximately perpendicular to the axis of the tonotopic gradient, consistent with an orthogonal organization of spectral and temporal features in the auditory cortex. Combining these temporal responses with the spectral selectivity of A1 neurons gives rise to the spectro-temporal receptive fields that often capture cortical responses to complex modulated sounds well.[26][27] In secondary auditory cortical fields, responses become temporally more sluggish and spectrally broader, but are still able to phase-lock to the salient features of speech and musical sounds.[28][29][30][31] Tuning to AM rates below about 64 Hz is also found in the human auditory cortex,[32][33][34][35] as revealed by brain-imaging techniques (fMRI) and cortical recordings in epileptic patients (electrocorticography). This is consistent with neuropsychological studies of brain-damaged patients[36] and with the notion that the central auditory system performs some form of spectral decomposition of the ENVp of incoming sounds. The ranges over which cortical responses accurately encode the temporal-envelope cues of speech have been shown to be predictive of the human ability to understand speech. In the human superior temporal gyrus (STG), an anterior-posterior spatial organization of spectro-temporal modulation tuning has been found in response to speech sounds, the posterior STG being tuned for temporally fast varying speech sounds with low spectral modulations and the anterior STG being tuned for temporally slow varying speech sounds with high spectral modulations.[37]

One unexpected aspect of phase locking in the auditory cortex has been observed in the responses elicited by complex acoustic stimuli with spectrograms that exhibit relatively slow envelopes (< 20 Hz), but that are carried by fast modulations as high as hundreds of hertz. Speech and music, as well as various modulated noise stimuli, have such temporal structure.[38] For these stimuli, cortical responses phase-lock to both the envelope and the fine structure induced by interactions between unresolved harmonics of the sound, thus reflecting the pitch of the sound and exceeding the typical upper limits of cortical phase-locking to envelopes of a few tens of hertz. This paradoxical relation[38][39] between the slow and fast cortical phase-locking to the carrier “fine structure” has been demonstrated both in the auditory[38] and visual[40] cortices. It has also been shown to be amply manifested in measurements of the spectro-temporal receptive fields of the primary auditory cortex, giving them unexpectedly fine temporal accuracy and selectivity bordering on a 5–10 ms resolution.[38][40] The underlying causes of this phenomenon have been attributed to several possible origins, including nonlinear synaptic depression and facilitation, and/or a cortical network of thalamic excitation and cortical inhibition.[38][41][42][43] There are many functionally significant and perceptually relevant reasons for the coexistence of these two complementary dynamic response modes. They include the ability to accurately encode onsets and other rapid ‘events’ in the ENVp of complex acoustic and other sensory signals, features that are critical for the perception of consonants (speech) and percussive sounds (music), as well as the texture of complex sounds.[38][44]

Psychoacoustical aspects

The perception of ENVp depends on which AM rates are contained in the signal. Low rates of AM, in the 1–8 Hz range, are perceived as changes in intensity, that is, loudness fluctuations (a percept that can also be evoked by frequency modulation, FM); at higher rates, AM is perceived as roughness, with the greatest roughness sensation occurring at around 70 Hz;[45] at even higher rates, AM can evoke a weak pitch percept corresponding to the modulation rate.[46] Rainstorms, crackling fire, chirping crickets or galloping horses produce "sound textures" (the collective result of many similar acoustic events), whose perception is mediated by ENVn statistics.[47][48]

The auditory detection threshold for AM as a function of AM rate, referred to as the temporal modulation transfer function (TMTF),[49] is best for AM rates in the range from 4 to 150 Hz and worsens outside that range.[49][50][51] The cutoff frequency of the TMTF gives an estimate of temporal acuity (temporal resolution) for the auditory system. This cutoff frequency corresponds to a time constant of about 1–3 ms for the auditory system of normal-hearing humans.
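The link between a cutoff frequency and a time constant can be made concrete under one explicit assumption: that the TMTF behaves like a first-order lowpass filter, for which the time constant is τ = 1/(2π f_c). The text does not state which relation the cited studies used, so the sketch below is illustrative only, and the function names are hypothetical.

```python
import math

def cutoff_from_tau(tau_ms):
    """Cutoff frequency (Hz) of a first-order lowpass with time constant tau_ms."""
    return 1000.0 / (2.0 * math.pi * tau_ms)

def tau_from_cutoff(fc_hz):
    """Time constant (ms) of a first-order lowpass with cutoff fc_hz."""
    return 1000.0 / (2.0 * math.pi * fc_hz)

# Assumed first-order relation, applied to the 1-3 ms range quoted above.
for tau in (1.0, 3.0):
    print(f"tau = {tau:.0f} ms  ->  cutoff = {cutoff_from_tau(tau):.1f} Hz")
```

Under this assumption, time constants of 1 and 3 ms map to cutoffs of roughly 159 Hz and 53 Hz respectively.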

Correlated envelope fluctuations across frequency in a masker can aid detection of a pure tone signal, an effect known as comodulation masking release.[18]

AM applied to a given carrier can perceptually interfere with the detection of a target AM imposed on the same carrier, an effect termed modulation masking.[52][53] Modulation-masking patterns are tuned (greater masking occurs for masking and target AMs close in modulation rate), suggesting that the human auditory system is equipped with frequency-selective channels for AM. Moreover, AM applied to spectrally remote carriers can perceptually interfere with the detection of AM on a target sound, an effect termed modulation detection interference.[54] The notion of modulation channels is also supported by the demonstration of selective adaptation effects in the modulation domain.[55][56][57] These studies show that AM detection thresholds are selectively elevated above pre-exposure thresholds when the carrier frequency and the AM rate of the adaptor are similar to those of the test tone.

Human listeners are sensitive to relatively slow "second-order" AM cues, which correspond to fluctuations in the strength of AM. These cues arise from the interaction of different modulation rates, previously described as "beating" in the envelope-frequency domain. Perception of second-order AM has been interpreted as resulting from nonlinear mechanisms in the auditory pathway that produce an audible distortion component at the envelope beat frequency in the internal modulation spectrum of the sounds.[58][59][60]
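The envelope "beating" can be illustrated with the sum-to-product identity for two modulators of equal depth. The modulation rates f_1 and f_2 are symbols introduced here for illustration:

```latex
\cos(2\pi f_1 t) + \cos(2\pi f_2 t)
  = 2\,\cos\bigl(\pi (f_2 - f_1)\, t\bigr)\,\cos\bigl(\pi (f_1 + f_2)\, t\bigr)
```

The modulation at the mean rate (f_1 + f_2)/2 therefore waxes and wanes in strength with period 1/|f_2 - f_1|, that is, at the envelope beat frequency, even though the envelope spectrum itself contains components only at f_1 and f_2. A nonlinearity is needed to create an actual component at |f_2 - f_1|, which is the role attributed above to nonlinear mechanisms in the auditory pathway.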

Interaural time differences in the envelope provide binaural cues even at high frequencies where TFSn cannot be used.[61]

Models of normal envelope processing

Diagram of the common part of the envelope perception model of Torsten Dau and EPSM.

The most basic computer model of ENV processing is the leaky integrator model.[62][49] This model extracts the temporal envelope of the sound (ENVp) via bandpass filtering, half-wave rectification (which may be followed by fast-acting amplitude compression), and lowpass filtering with a cutoff frequency between about 60 and 150 Hz. The leaky integrator is often used with a decision statistic based on either the resulting envelope power, the max/min ratio, or the crest factor. This model accounts for the loss of auditory sensitivity for AM rates higher than about 60–150 Hz for broadband noise carriers.[49] Based on the concept of frequency selectivity for AM,[53] the perception model of Torsten Dau[63] incorporates broadly tuned bandpass modulation filters (with a Q value around 1) to account for data from a broad variety of psychoacoustic tasks and particularly AM detection for noise carriers with different bandwidths, taking into account their intrinsic envelope fluctuations. This model has been extended to account for comodulation masking release (see sections above).[64] The shapes of the modulation filters have been estimated[65] and an “envelope power spectrum model” (EPSM) based on these filters can account for AM masking patterns and AM depth discrimination.[66] The EPSM has been extended to the prediction of speech intelligibility[67] and to account for data from a broad variety of psychoacoustic tasks.[68] A physiologically-based processing model simulating brainstem responses has also been developed to account for AM detection and AM masking patterns.[69]
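The core stages of the leaky integrator model can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation: the input is assumed to be already narrowband (the bandpass stage is omitted), compression is left out, the lowpass is a simple one-pole filter, and the 100 Hz cutoff is an arbitrary value within the 60–150 Hz range quoted above. Function names are hypothetical.

```python
import math

def leaky_integrator_envelope(x, fs, cutoff_hz=100.0):
    """Half-wave rectify x, then smooth it with a one-pole lowpass filter."""
    # Smoothing coefficient of the one-pole lowpass (assumed filter type).
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    env, y = [], 0.0
    for v in x:
        r = v if v > 0.0 else 0.0   # half-wave rectification
        y += alpha * (r - y)        # leaky integration
        env.append(y)
    return env

def max_min_ratio(env, skip):
    """Decision statistic: max/min of the envelope after an onset transient."""
    seg = env[skip:]
    return max(seg) / min(seg)

# Demo: a 4 kHz tone with and without 8 Hz AM (depth 0.5), 0.5 s at 32 kHz.
fs = 32000
n = fs // 2
am = [(1.0 + 0.5 * math.sin(2.0 * math.pi * 8.0 * i / fs))
      * math.sin(2.0 * math.pi * 4000.0 * i / fs) for i in range(n)]
flat = [math.sin(2.0 * math.pi * 4000.0 * i / fs) for i in range(n)]

skip = fs // 20  # discard the first 50 ms while the integrator charges up
print(max_min_ratio(leaky_integrator_envelope(am, fs), skip))
print(max_min_ratio(leaky_integrator_envelope(flat, fs), skip))
```

The max/min statistic is clearly larger for the modulated tone than for the unmodulated one, which is the kind of contrast a decision stage would use to detect AM.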

Temporal fine structure (TFS) processing

Neurophysiological aspects

Phase locking recorded from a neuron in the cochlear nucleus in response to a sinusoidal acoustic stimulus at the cell’s best frequency (in this case 240 Hz). The stimulus was approximately 20 dB above the neuron’s threshold at its best frequency. The neural outputs (action potentials) are shown in the upper trace and the stimulus waveform in the lower trace.

The neural representation of temporal fine structure, TFSn, has been studied using stimuli with well-controlled TFSp: pure tones, harmonic complex tones, and frequency-modulated (FM) tones.

Auditory-nerve fibres are able to represent low-frequency sounds via their phase-locked discharges (i.e., TFSn information). The upper frequency limit for phase locking is species dependent. It is about 5 kHz in the cat, 9 kHz in the barn owl and just 4 kHz in the guinea pig. The upper limit of phase locking in humans is not known, but current indirect estimates suggest it is about 4–5 kHz.[70] Phase locking is a direct consequence of the transduction process: the probability of transduction channel opening increases when the stereocilia are stretched and decreases when they are pushed in the opposite direction. This has led some to suggest that phase locking is an epiphenomenon. The upper limit appears to be determined by a cascade of low-pass filters at the level of the inner hair cell and auditory-nerve synapse.[71][72]
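The strength of phase locking can be quantified from spike times. One widely used index, not named in the text above and introduced here only as an illustration, is the vector strength: each spike is mapped to a unit vector at its stimulus phase, and the length of the mean vector is reported. The function name and the example spike trains are hypothetical.

```python
import math

def vector_strength(spike_times, freq_hz):
    """Vector strength of spike_times (in seconds) relative to a tone at freq_hz.

    Returns 1.0 when every spike falls at the same stimulus phase and a value
    near 0.0 when spikes are spread evenly over the stimulus cycle.
    """
    if not spike_times:
        raise ValueError("need at least one spike")
    c = sum(math.cos(2.0 * math.pi * freq_hz * t) for t in spike_times)
    s = sum(math.sin(2.0 * math.pi * freq_hz * t) for t in spike_times)
    return math.hypot(c, s) / len(spike_times)

# Idealized examples for a 240 Hz tone:
locked = [k / 240.0 for k in range(100)]          # one spike per cycle, same phase
spread = [k / (240.0 * 8) for k in range(800)]    # eight evenly spaced phases
print(vector_strength(locked, 240.0))
print(vector_strength(spread, 240.0))
```

Real spike trains fall between these two extremes, and the index declines as stimulus frequency approaches the species-dependent upper limit of phase locking.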

TFSn information in the auditory nerve may be used to encode the (audio) frequency of low-frequency sounds, including single tones and more complex stimuli such as frequency-modulated tones or steady-state vowels (see role and applications to speech and music).

The auditory system goes to some length to preserve this TFSn information, with giant synapses (endbulbs of Held) in the ventral cochlear nucleus. These synapses contact bushy cells (spherical and globular) and faithfully transmit (or enhance) the temporal information present in the auditory nerve fibers to higher structures in the brainstem.[73] The spherical bushy cells project to the medial superior olive and the globular bushy cells project to the medial nucleus of the trapezoid body (MNTB). The MNTB is also characterized by giant synapses (calyces of Held) and provides precisely timed inhibition to the lateral superior olive. The medial and lateral superior olive and MNTB are involved in the encoding of interaural time and intensity differences. There is general acceptance that the temporal information is crucial in sound localization, but it is still contentious whether the same temporal information is used to encode the frequency of complex sounds.

Several problems remain with the idea that the TFSn is important in the representation of the frequency components of complex sounds. The first problem is that the temporal information deteriorates as it passes through successive stages of the auditory pathway (presumably due to low-pass dendritic filtering). The second problem, which follows from the first, is that the temporal information must be extracted at an early stage of the auditory pathway. No such stage has currently been identified, although there are theories about how temporal information can be converted into rate information (see section Models of normal processing: limitations).

Psychoacoustical aspects

It is often assumed that many perceptual capacities rely on the ability of the monaural and binaural auditory system to encode and use TFSn cues evoked by components in sounds with frequencies below about 1–4 kHz. These capacities include discrimination of frequency,[74][4][75][76] discrimination of the fundamental frequency of harmonic sounds,[75][4][76] detection of FM at rates below 5 Hz,[77] melody recognition for sequences of pure tones and complex tones,[74][4] lateralization and localization of pure tones and complex tones,[78] and segregation of concurrent harmonic sounds (such as speech sounds).[79] It appears that TFSn cues require correct tonotopic (place) representation to be processed optimally by the auditory system.[80] Moreover, musical pitch perception has been demonstrated for complex tones with all harmonics above 6 kHz, demonstrating that it is not entirely dependent on neural phase locking to TFSBM (i.e., TFSn) cues.[81]

As for FM detection, the current view assumes that in the normal auditory system, FM is encoded via TFSn cues when the FM rate is low (<5 Hz) and when the carrier frequency is below about 4 kHz,[77][82][83][84] and via ENVn cues when the FM is fast or when the carrier frequency is higher than 4 kHz.[77][85][86][87][84] This is supported by single-unit recordings in the low brainstem.[73] According to this view, TFSn cues are not used to detect FM with rates above about 10 Hz because the mechanism decoding the TFSn information is “sluggish” and cannot track rapid changes in frequency.[77] Several studies have shown that auditory sensitivity to slow FM at low carrier frequency is associated with speech identification for both normal-hearing and hearing-impaired individuals when speech reception is limited by acoustic degradations (e.g., filtering) or concurrent speech sounds.[88][89][90][91][92] This suggests that robust speech intelligibility is determined by accurate processing of TFSn cues.

Models of normal processing: limitations

The separation of a sound into ENVp and TFSp appears inspired partly by how sounds are synthesized and by the availability of a convenient way to separate an existing sound into ENV and TFS, namely the Hilbert transform. There is a risk that this view of auditory processing[93] is dominated by these physical/technical concepts, similarly to how cochlear frequency-to-place mapping was for a long time conceptualized in terms of the Fourier transform. Physiologically, there is no indication of a separation of ENV and TFS in the auditory system for stages up to the cochlear nucleus. Only at that stage does it appear that parallel pathways, potentially enhancing ENVn or TFSn information (or something akin to it), may be implemented through the temporal response characteristics of different cochlear nucleus cell types.[73] It may therefore be useful to better simulate cochlear nucleus cell types to understand the true concepts for parallel processing created at the level of the cochlear nucleus. These concepts may be related to separating ENV and TFS but are unlikely to be realized in the manner of the Hilbert transform.
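For reference, the Hilbert-transform separation discussed above can be sketched as follows. This is a signal-processing illustration of the physical ENVp/TFSp decomposition only, not a model of auditory processing. To stay self-contained it uses a naive discrete Fourier transform from the standard library; in practice a routine such as `scipy.signal.hilbert` would normally be used. All function names are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short signals)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(x):
    """Analytic signal of real x: suppress the negative-frequency half-spectrum."""
    n = len(x)
    gain = [0.0] * n
    gain[0] = 1.0                       # keep DC as is
    if n % 2 == 0:
        gain[n // 2] = 1.0              # keep the Nyquist bin as is
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        gain[k] = 2.0                   # double the positive frequencies
    spectrum = dft(x)
    return dft([s * g for s, g in zip(spectrum, gain)], inverse=True)

def hilbert_env_tfs(x):
    """Split a narrowband signal into envelope and unit-amplitude fine structure."""
    z = analytic_signal(x)
    env = [abs(v) for v in z]                      # ENVp: magnitude
    tfs = [math.cos(cmath.phase(v)) for v in z]    # TFSp: cosine of the phase
    return env, tfs

# Demo: an AM tone whose envelope is known in advance.
n = 256
true_env = [1.0 + 0.5 * math.cos(2.0 * math.pi * 2 * i / n) for i in range(n)]
x = [true_env[i] * math.cos(2.0 * math.pi * 32 * i / n) for i in range(n)]
env, tfs = hilbert_env_tfs(x)
print(max(abs(a - b) for a, b in zip(env, true_env)))  # recovery error
```

For this narrowband test signal the recovered envelope matches the imposed one to within rounding error, and multiplying `env` by `tfs` sample by sample reconstructs the input. For broadband signals the decomposition is far less well behaved, in line with the caveats above.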

A computational model of the peripheral auditory system[94][95] may be used to simulate auditory-nerve fiber responses to complex sounds such as speech, and quantify the transmission (i.e., internal representation) of ENVn and TFSn cues. In two simulation studies,[96][97] the mean-rate and spike-timing information was quantified at the output of such a model to characterize, respectively, the short-term rate of neural firing (ENVn) and the level of synchronization due to phase locking (TFSn) in response to speech sounds degraded by vocoders.[98][99] The best model predictions of vocoded-speech intelligibility were found when both ENVn and TFSn cues were included, providing evidence that TFSn cues are important for intelligibility when the speech ENVp cues are degraded.

At a more fundamental level, similar computational modeling was used to demonstrate that the functional dependence of human just-noticeable frequency differences on pure-tone frequency was not accounted for unless temporal information was included (most notably for mid-to-high frequencies, even above the nominal cutoff in physiological phase locking).[100][101] However, a caveat of most TFS models is that optimal model performance with temporal information typically over-estimates human performance.

An alternative view is to assume that TFSn information at the level of the auditory nerve is converted into rate-place (ENVn) information at a later stage of the auditory system (e.g., the low brainstem). Several modelling studies proposed that the neural mechanisms for decoding TFSn are based on correlation of the outputs of adjacent places.[102][103][104][105][106]

Role in speech and music perception

Role of temporal envelope in speech and music perception

Amplitude modulation spectra (left) and frequency modulation spectra (right), calculated on a corpus of English or French sentences.[107]

The ENVp plays a critical role in many aspects of auditory perception, including in the perception of speech and music.[2][7][108][109] Speech recognition is possible using cues related to the ENVp, even in situations where the original spectral information and TFSp are highly degraded.[110] Indeed, when the spectrally local TFSp from one sentence is combined with the ENVp from a second sentence, only the words of the second sentence are heard.[111] The ENVp rates most important for speech are those below about 16 Hz, corresponding to fluctuations at the rate of syllables.[112][107][113] On the other hand, the fundamental frequency (“pitch”) contour of speech sounds is primarily conveyed via TFSp cues,[107] although some information on the contour can be perceived via rapid envelope fluctuations corresponding to the fundamental frequency.[2] For music, slow ENVp rates convey rhythm and tempo information, whereas more rapid rates convey the onset and offset properties of sound (attack and decay, respectively) that are important for timbre perception.[114]

Role of TFS in speech and music perception

The ability to accurately process TFSp information is thought to play a role in our perception of pitch (i.e., the perceived height of sounds), an important sensation for music perception, as well as our ability to understand speech, especially in the presence of background noise.[4]

Role of TFS in pitch perception

Although pitch retrieval mechanisms in the auditory system are still a matter of debate,[76][115] TFSn information may be used to retrieve the pitch of low-frequency pure tones[75] and estimate the individual frequencies of the low-numbered (ca. 1st-8th) harmonics of a complex sound,[116] frequencies from which the fundamental frequency of the sound can be retrieved according to, e.g., pattern-matching models of pitch perception.[117] A role of TFSn information in pitch perception of complex sounds containing intermediate harmonics (ca. 7th-16th) has also been suggested[118] and may be accounted for by temporal or spectrotemporal[119] models of pitch perception. The degraded TFSn cues conveyed by cochlear implant devices may also be partly responsible for impaired music perception of cochlear implant recipients.[120]

Role of TFS cues in speech perception

TFSp cues are thought to be important for the identification of speakers and for tone identification in tonal languages.[121] In addition, several vocoder studies have suggested that TFSp cues contribute to the intelligibility of speech in quiet and noise.[98] Although it is difficult to isolate TFSp from ENVp cues,[109][122] there is evidence from studies in hearing-impaired listeners that speech perception in the presence of background noise can be partly accounted for by the ability to accurately process TFSp,[92][99] although the ability to “listen in the dips” of fluctuating maskers does not seem to depend on periodic TFSp cues.[123]

Role in environmental sound perception

Environmental sounds can be broadly defined as nonspeech and nonmusical sounds in the listener’s environment that can convey meaningful information about surrounding objects and events.[124] Environmental sounds are highly heterogeneous in terms of their acoustic characteristics and source types, and may include human and animal vocalizations, water- and weather-related events, and mechanical and electronic signaling sounds. Given the great variety of sound sources that give rise to environmental sounds, both ENVp and TFSp play an important role in their perception. However, the relative contributions of ENVp and TFSp can differ considerably for specific environmental sounds. This is reflected in the variety of acoustic measures that correlate with different perceptual characteristics of objects and events.[125][126][127]

Early studies highlighted the importance of envelope-based temporal patterning in perception of environmental events. For instance, Warren & Verbrugge demonstrated that constructed sounds of a glass bottle dropped on the floor were perceived as bouncing when high-energy regions in four different frequency bands were temporally aligned, producing amplitude peaks in the envelope.[128] In contrast, when the same spectral energy was distributed randomly across bands, the sounds were heard as breaking. More recent studies using vocoder simulations of cochlear implant processing demonstrated that many temporally-patterned sounds can be perceived with little original spectral information, based primarily on temporal cues.[126][127] Sounds such as footsteps, horse galloping, helicopter flying, ping-pong playing, clapping and typing were identified with a high accuracy of 70% or more with a single channel of envelope-modulated broadband noise or with only two frequency channels. In these studies, envelope-based acoustic measures such as the number of bursts and peaks in the envelope were predictive of listeners’ abilities to identify sounds based primarily on ENVp cues. On the other hand, identification of brief environmental sounds without strong temporal patterning in ENVp may require a much larger number of frequency channels. Sounds such as a car horn or a train whistle were poorly identified even with as many as 32 frequency channels.[126] Listeners with cochlear implants, which transmit envelope information for specific frequency bands but do not transmit TFSp, have considerably reduced abilities in identification of common environmental sounds.[129][130][131]
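The "single channel of envelope-modulated broadband noise" condition can be sketched as below. This is a rough illustration rather than the processing used in the cited studies, whose filter types and cutoffs are not given in the text: the envelope is taken here by full-wave rectification and a one-pole lowpass, the 16 Hz cutoff is an arbitrary choice, and uniform white noise stands in for the broadband carrier. Function names are hypothetical.

```python
import math
import random

def smooth_envelope(x, fs, cutoff_hz=16.0):
    """Full-wave rectify x and smooth it with a one-pole lowpass filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, env = 0.0, []
    for v in x:
        y += alpha * (abs(v) - y)
        env.append(y)
    return env

def single_channel_noise_vocoder(x, fs, cutoff_hz=16.0, seed=0):
    """Keep only the broadband envelope of x and impose it on white noise."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [e * rng.uniform(-1.0, 1.0) for e in smooth_envelope(x, fs, cutoff_hz)]

def rms(seg):
    return math.sqrt(sum(v * v for v in seg) / len(seg))

# Demo: three 100 ms tone bursts separated by 200 ms of silence (fs = 8 kHz).
fs = 8000
burst, gap = 800, 1600
x = []
for _ in range(3):
    x += [math.sin(2.0 * math.pi * 500.0 * i / fs) for i in range(burst)]
    x += [0.0] * gap

y = single_channel_noise_vocoder(x, fs)
print(rms(y[400:800]), rms(y[1600:2400]))  # late burst vs. late gap
```

The output is noise-like in its fine structure but retains the burst pattern of the input, which is the kind of temporal patterning listeners are reported to exploit for sounds such as footsteps or clapping.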

In addition, individual environmental sounds are typically heard within the context of larger auditory scenes where sounds from multiple sources may overlap in time and frequency. When heard within an auditory scene, accurate identification of individual environmental sounds is contingent on the ability to segregate them from other sound sources or auditory streams in the auditory scene, which involves further reliance on ENVp and TFSp cues (see Role in auditory scene analysis).

Role in auditory scene analysis

Auditory scene analysis refers to the ability to perceive sounds coming from different sources separately. Any acoustical difference can potentially lead to auditory segregation,[132] and so any cues based either on ENVp or TFSp are likely to assist in segregating competing sound sources.[133] Such cues involve percepts such as pitch.[134][135][136][137] Binaural TFSp cues producing interaural time differences have not always resulted in clear source segregation, particularly with simultaneously presented sources, although successful segregation of sequential sounds, such as noise or speech, has been reported.[138]

Effects of age and hearing loss on temporal envelope processing

Developmental aspects

In infancy, behavioral AM detection thresholds[139] and forward or backward masking thresholds[139][140][141] observed in 3-month-olds are similar to those observed in adults. Electrophysiological studies conducted in 1-month-old infants using 2000 Hz AM pure tones indicate some immaturity in the envelope following response (EFR). Although sleeping infants and sedated adults show the same effect of modulation rate on EFR, infants’ estimates were generally poorer than adults’.[142][143] This is consistent with behavioral studies conducted with school-age children showing differences in AM detection thresholds compared to adults. Children systematically show worse AM detection thresholds than adults until 10–11 years. However, the shape of the TMTF (the cutoff) is already similar to that of adults in children as young as 5 years.[144][145] Sensory versus non-sensory factors for this long maturation are still debated,[146] but the results generally appear to be more dependent on the task or on sound complexity for infants and children than for adults.[147] Regarding the development of speech ENVp processing, vocoder studies suggest that infants as young as 3 months are able to discriminate a change in consonants when the faster ENVp information of the syllables is preserved (< 256 Hz) but less so when only the slowest ENVp is available (< 8 Hz).[148] Older children of 5 years show abilities similar to those of adults in discriminating consonant changes based on ENVp cues (< 64 Hz).[149]

Neurophysiological aspects

The effects of hearing loss and age on neural coding are generally believed to be smaller for slowly varying envelope responses (i.e., ENVn) than for rapidly varying temporal fine structure (i.e., TFSn).[150][151] Enhanced ENVn coding following noise-induced hearing loss has been observed in peripheral auditory responses from single neurons[152] and in central evoked responses from the auditory midbrain.[153] The enhancement in ENVn coding of narrowband sounds occurs across the full range of modulation frequencies encoded by single neurons.[154] For broadband sounds, the range of modulation frequencies encoded in impaired responses is broader than normal (extending to higher frequencies), as expected from reduced frequency selectivity associated with outer-hair-cell dysfunction.[155] The enhancement observed in neural envelope responses is consistent with enhanced auditory perception of modulations following cochlear damage, which is commonly believed to result from loss of cochlear compression that occurs with outer-hair-cell dysfunction due to age or noise overexposure.[156] However, the influence of inner-hair-cell dysfunction (e.g., shallower response growth for mild-moderate damage and steeper growth for severe damage) can confound the effects of outer-hair-cell dysfunction on overall response growth and thus ENVn coding.[152][157] Not surprisingly, therefore, modeling predicts that the relative effects of outer-hair-cell and inner-hair-cell dysfunction create individual differences in speech intelligibility based on the strength of envelope coding of speech relative to noise.

Psychoacoustical aspects

For sinusoidal carriers, which have no intrinsic envelope (ENVp) fluctuations, the TMTF is roughly flat for AM rates from 10 to 120 Hz, but increases (i.e., threshold worsens) for higher AM rates,[51][158] provided that spectral sidebands are not audible. The shape of the TMTF for sinusoidal carriers is similar for young and older people with normal audiometric thresholds, but older people tend to have higher detection thresholds overall, suggesting poorer “detection efficiency” for ENVn cues in older people.[159][160] Provided that the carrier is fully audible, the ability to detect AM is usually not adversely affected by cochlear hearing loss and may sometimes be better than normal, for both noise carriers[161][162] and sinusoidal carriers,[158][163] perhaps because loudness recruitment (an abnormally rapid growth of loudness with increasing sound level) “magnifies” the perceived amount of AM (i.e., ENVn cues). Consistent with this, when the AM is clearly audible, a sound with a fixed AM depth appears to fluctuate more for an impaired ear than for a normal ear. However, the ability to detect changes in AM depth can be impaired by cochlear hearing loss.[163] Speech processed with a noise vocoder, such that mainly envelope information is delivered in multiple spectral channels, has also been used to investigate envelope processing in hearing impairment. In these studies, hearing-impaired individuals could not make use of such envelope information as well as normal-hearing individuals, even after audibility factors were taken into account.[164] Additional experiments suggest that age negatively affects the binaural processing of ENVp, at least at low audio frequencies.[165]
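The AM stimuli behind a TMTF for sinusoidal carriers can be illustrated with a short sketch. It builds a sinusoidally amplitude-modulated tone, expresses the modulation depth m in dB as 20·log10(m) (a common convention for plotting AM detection thresholds), and lists the spectral sidebands at the carrier frequency plus and minus the AM rate. The function names, the 16 kHz sampling rate and the default duration are illustrative assumptions, not taken from the cited studies.

```python
import math

def sam_tone(fc, fm, m, dur=0.5, fs=16000):
    """Sinusoidally amplitude-modulated tone.

    fc: carrier frequency (Hz), fm: AM rate (Hz), m: modulation depth (0..1).
    dur and fs are assumed values chosen for illustration only.
    """
    n = int(dur * fs)
    return [
        (1.0 + m * math.sin(2 * math.pi * fm * i / fs))
        * math.sin(2 * math.pi * fc * i / fs)
        for i in range(n)
    ]

def depth_db(m):
    """Modulation depth in dB re. 100% modulation (m = 1 gives 0 dB)."""
    return 20.0 * math.log10(m)

def sideband_freqs(fc, fm):
    """Frequencies of the two spectral sidebands of a SAM tone."""
    return (fc - fm, fc + fm)
```

For a 1000 Hz carrier modulated at 120 Hz, the sidebands fall at 880 and 1120 Hz, which is why sideband audibility becomes a concern as the AM rate increases.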

Models of impaired temporal envelope processing

The perception model of ENV processing[63] that incorporates selective (bandpass) AM filters accounts for many perceptual consequences of cochlear dysfunction, including enhanced sensitivity to AM for sinusoidal and noise carriers,[166][167] abnormal forward masking (the rate of recovery from forward masking being generally slower than normal for impaired listeners),[168] stronger interference effects between AM and FM,[82] and enhanced temporal integration of AM.[167] The model of Torsten Dau[63] has been extended to account for the discrimination of complex AM patterns by hearing-impaired individuals and the effects of noise-reduction systems.[169] The performance of the hearing-impaired individuals was best captured when the model combined the loss of peripheral amplitude compression resulting from the loss of the active mechanism in the cochlea[166][167][168] with an increase in internal noise in the ENVn domain.[166][167][82] Phenomenological models simulating the response of the peripheral auditory system showed that impaired AM sensitivity in individuals experiencing chronic tinnitus with clinically normal audiograms could be predicted by a substantial loss of auditory-nerve fibers with low spontaneous rates and some loss of auditory-nerve fibers with high spontaneous rates.[170]

Effects of age and hearing loss on TFS processing

Developmental aspects

Very few studies have systematically assessed TFS processing in infants and children. The frequency-following response (FFR), thought to reflect phase-locked neural activity, appears to be adult-like in 1-month-old infants when using a pure tone (centered at 500, 1000 or 2000 Hz) modulated at 80 Hz with a modulation depth of 100%.[142]

As for behavioral data, six-month-old infants require larger frequency transitions than adults to detect an FM change in a 1-kHz tone.[171] However, 4-month-old infants are able to discriminate two different FM sweeps,[172] and they are more sensitive to FM cues swept from 150 Hz to 550 Hz than at lower frequencies.[173] In school-age children, performance in detecting FM changes improves between 6 and 10 years, and sensitivity to a low modulation rate (2 Hz) is poor until 9 years.[174]

For speech sounds, only one vocoder study has explored the ability of school-age children to rely on TFSp cues to detect consonant changes, showing the same abilities for 5-year-olds as for adults.[149]

Neurophysiological aspects

Psychophysical studies have suggested that degraded TFS processing due to age and hearing loss may underlie some suprathreshold deficits, such as deficits in speech perception;[10] however, debate remains about the underlying neural correlates.[150][151] The strength of phase locking to the temporal fine structure of signals (TFSn) in quiet listening conditions remains normal in peripheral single-neuron responses following cochlear hearing loss.[152] Although these data suggest that the fundamental ability of auditory-nerve fibers to follow the rapid fluctuations of sound remains intact following cochlear hearing loss, deficits in phase-locking strength do emerge in background noise.[175] This finding, which is consistent with the common observation that listeners with cochlear hearing loss have more difficulty in noisy conditions, results from reduced cochlear frequency selectivity associated with outer-hair-cell dysfunction.[156] Although only limited effects of age and hearing loss have been observed in terms of TFSn coding strength of narrowband sounds, more dramatic deficits have been observed in TFSn coding quality in response to broadband sounds, which are more relevant for everyday listening. A dramatic loss of tonotopicity can occur following noise-induced hearing loss, where auditory-nerve fibers that should be responding to mid frequencies (e.g., 2–4 kHz) have dominant TFS responses to lower frequencies (e.g., 700 Hz).[176] Notably, the loss of tonotopicity generally occurs only for TFSn coding but not for ENVn coding, which is consistent with greater perceptual deficits in TFS processing.[10] This tonotopic degradation is likely to have important implications for speech perception, and can account for degraded coding of vowels following noise-induced hearing loss in which most of the cochlea responds to only the first formant, eliminating the normal tonotopic representation of the second and third formants.

Psychoacoustical aspects

Several psychophysical studies have shown that older people with normal hearing and people with sensorineural hearing loss often show impaired performance for auditory tasks that are assumed to rely on the ability of the monaural and binaural auditory system to encode and use TFSn cues, such as: discrimination of sound frequency,[76][177][178] discrimination of the fundamental frequency of harmonic sounds,[76][177][178][179] detection of FM at rates below 5 Hz,[180][181][91] melody recognition for sequences of pure tones and complex sounds,[182] lateralization and localization of pure tones and complex tones,[78][183][165] and segregation of concurrent harmonic sounds (such as speech sounds).[79] However, it remains unclear to what extent deficits associated with hearing loss reflect poorer TFSn processing or reduced cochlear frequency selectivity.[182]

Models of impaired processing

The quality of the representation of a sound in the auditory nerve is limited by refractoriness, adaptation, saturation, and reduced synchronization (phase locking) at high frequencies, as well as by the stochastic nature of action potentials.[184] However, the auditory nerve contains thousands of fibers. Hence, despite these limiting factors, the properties of sounds are reasonably well represented in the population nerve response over a wide range of levels[185] and audio frequencies (see Volley Theory).

The coding of temporal information in the auditory nerve can be disrupted by two main mechanisms: reduced synchrony and loss of synapses and/or auditory nerve fibers.[186] The impact of disrupted temporal coding on human auditory perception has been explored using physiologically inspired signal-processing tools. The reduction in neural synchrony has been simulated by jittering the phases of the multiple frequency components in speech,[187] although this has undesired effects in the spectral domain. The loss of auditory nerve fibers or synapses has been simulated by assuming (i) that each afferent fiber operates as a stochastic sampler of the sound waveform, with greater probability of firing for higher-intensity and sustained sound features than for lower-intensity or transient features, and (ii) that deafferentation can be modeled by reducing the number of samplers.[184] However, this also has undesired effects in the spectral domain. Both jittering and stochastic undersampling degrade the representation of the TFSn more than the representation of the ENVn. Both also impair the recognition of speech in noisy backgrounds without degrading recognition in silence, supporting the argument that TFSn is important for recognizing speech in noise.[3] Finally, both mimic the effects of aging on speech perception.[188]
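The idea of stochastic undersampling can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the published implementation: each hypothetical "sampler" keeps a waveform sample with a probability proportional to its instantaneous magnitude, the samplers are averaged, and deafferentation is mimicked by lowering `n_samplers`. The firing rule, the averaging step and all names are assumptions made for this example.

```python
import math
import random

def stochastic_undersample(x, n_samplers, rng):
    """Average of n_samplers stochastic copies of x.

    Assumed rule: a sampler keeps sample v with probability |v| / peak,
    so higher-intensity features are more likely to be represented.
    """
    peak = max(abs(v) for v in x) or 1.0
    out = [0.0] * len(x)
    for _ in range(n_samplers):
        for i, v in enumerate(x):
            if rng.random() < abs(v) / peak:
                out[i] += v
    return [v / n_samplers for v in out]

def rms_error(x, n_samplers, seed=0):
    """RMS deviation from the expected (infinite-sampler) output."""
    rng = random.Random(seed)
    peak = max(abs(v) for v in x) or 1.0
    expected = [v * abs(v) / peak for v in x]
    y = stochastic_undersample(x, n_samplers, rng)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, expected)) / len(x))

# 50 ms of a 500 Hz tone at an assumed 16 kHz sampling rate
tone = [math.sin(2 * math.pi * 500 * i / 16000) for i in range(800)]
```

Comparing `rms_error(tone, 200)` with `rms_error(tone, 5)` shows the waveform representation becoming noisier as the number of samplers drops, which is the sense in which fewer fibers degrade fine temporal detail in this toy model.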

Transmission by hearing aids and cochlear implants

Temporal envelope transmission

Individuals with cochlear hearing loss usually have a smaller than normal dynamic range between the level of the weakest detectable sound and the level at which sounds become uncomfortably loud.[189][190] To compress the large range of sound levels encountered in everyday life into the small dynamic range of the hearing-impaired person, hearing aids apply amplitude compression, which is also called automatic gain control (AGC). The basic principle of such compression is that the amount of amplification applied to the incoming sound progressively decreases as the input level increases. Usually, the sound is split into several frequency “channels”, and AGC is applied independently in each channel. As a result of compressing the level, AGC reduces the amount of envelope fluctuation in the input signal (ENVp) by an amount that depends on the rate of fluctuation and the speed with which the amplification changes in response to changes in input sound level.[191][192] AGC can also change the shape of the envelope of the signal.[193] Cochlear implants are devices that electrically stimulate the auditory nerve, thereby creating the sensation of sound in a person who would otherwise be profoundly or totally deaf. The electrical dynamic range is very small,[194] so cochlear implants usually incorporate AGC prior to the signal being filtered into multiple frequency channels.[195] The channel signals are then subjected to instantaneous compression to map them into the limited dynamic range for each channel.[196]
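The dependence of envelope flattening on compression speed can be illustrated with a toy single-channel AGC acting directly on an envelope. This is a sketch under assumptions: the one-pole level detector, the 3:1 compression ratio, the time constants and the function names are all hypothetical choices, not parameters of any real hearing aid.

```python
import math

def compress_envelope(env, fs, ratio=3.0, tau=0.005):
    """Apply a toy AGC to an envelope.

    The gain follows a one-pole smoothed level estimate with time
    constant tau (seconds); ratio is the assumed compression ratio.
    """
    alpha = math.exp(-1.0 / (tau * fs))
    level = env[0]
    out = []
    for e in env:
        level = alpha * level + (1.0 - alpha) * e  # smoothed level estimate
        gain = level ** (1.0 / ratio - 1.0)        # less gain at higher levels
        out.append(e * gain)
    return out

def modulation_depth(env):
    """Modulation index (max - min) / (max + min) of an envelope."""
    hi, lo = max(env), min(env)
    return (hi - lo) / (hi + lo)

fs = 1000  # envelope sampling rate in Hz (assumed)
env_in = [1.0 + 0.5 * math.sin(2 * math.pi * 4.0 * i / fs) for i in range(4 * fs)]
fast = compress_envelope(env_in, fs, tau=0.005)  # fast-acting AGC
slow = compress_envelope(env_in, fs, tau=0.5)    # slow-acting AGC
```

Evaluating `modulation_depth` on the second half of `fast` and `slow` (to skip the start-up transient) shows that the fast-acting setting reduces the 4 Hz modulation depth substantially, while the slow-acting setting leaves it close to the input value of 0.5.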

Cochlear implants differ from hearing aids in that the entire acoustic hearing is replaced with direct electric stimulation of the auditory nerve, achieved via an electrode array placed inside the cochlea. Hence, factors other than device signal processing also strongly contribute to overall hearing, such as etiology, nerve health, electrode configuration and proximity to the nerve, and the overall process of adaptation to an entirely new mode of hearing.[197][198][199][200] Almost all information in cochlear implants is conveyed by the envelope fluctuations in the different channels. This is sufficient to give reasonable perception of speech in quiet, but not in noisy or reverberant conditions.[201][202][203][204][121][110][205][206][207][208] The processing in cochlear implants is such that the TFSp is discarded in favor of fixed-rate pulse trains amplitude-modulated by the ENVp within each frequency band. Implant users are sensitive to these ENVp modulations, but performance varies across stimulation site, stimulation level, and across individuals.[209][210] The TMTF shows a low-pass filter shape similar to that observed in normal-hearing listeners.[210][211][212] Voice pitch or musical pitch information, conveyed primarily via weak periodicity cues in the ENVp, results in a pitch sensation that is not salient enough to support music perception,[213][214] talker sex identification,[215][216] lexical tones,[217][218] or prosodic cues.[219][220][221] Listeners with cochlear implants are susceptible to interference in the modulation domain,[222][223] which likely contributes to difficulties listening in noise.

Temporal fine structure transmission

Hearing aids usually process sounds by filtering them into multiple frequency channels and applying AGC in each channel. Other signal processing in hearing aids, such as noise reduction, also involves filtering the input into multiple channels.[224] The filtering into channels can affect the TFSp of sounds depending on characteristics such as the phase response and group delay of the filters. However, such effects are usually small. Cochlear implants also filter the input signal into frequency channels. Usually, the ENVp of the signal in each channel is transmitted to the implanted electrodes in the form of electrical pulses of fixed rate that are modulated in amplitude or duration. Information about TFSp is discarded. This is justified by the observation that people with cochlear implants have a very limited ability to process TFSp information, even if it is transmitted to the electrodes,[225] perhaps because of a mismatch between the temporal information and the place in the cochlea to which it is delivered.[76] Reducing this mismatch may improve the ability to use TFSp information and hence lead to better pitch perception.[226] Some cochlear implant systems transmit information about TFSp in the channels of the cochlear implants that are tuned to low audio frequencies, and this may improve the pitch perception of low-frequency sounds.[227]
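The envelope-on-pulse-train scheme described above can be sketched for a single channel as follows. This is an illustrative simplification, not any manufacturer's processing strategy: the channel signal is assumed to be already band-limited, the envelope is estimated by rectification and a one-pole low-pass filter, and the 200 Hz cutoff, 1000 pulses per second rate and function names are hypothetical.

```python
import math

def envelope(x, fs, cutoff=200.0):
    """Crude ENVp estimate: full-wave rectification, then one-pole low-pass."""
    alpha = math.exp(-2 * math.pi * cutoff / fs)
    y, state = [], 0.0
    for v in x:
        state = alpha * state + (1 - alpha) * abs(v)
        y.append(state)
    return y

def pulse_train(env, fs, rate=1000):
    """Fixed-rate pulses whose amplitudes sample the envelope.

    The carrier fine structure (TFSp) never reaches the output.
    """
    step = int(round(fs / rate))
    out = [0.0] * len(env)
    for i in range(0, len(env), step):
        out[i] = env[i]
    return out

fs = 16000  # assumed sampling rate in Hz
# Hypothetical channel signal: 1 kHz carrier with 10 Hz amplitude modulation
x = [math.sin(2 * math.pi * 1000 * i / fs)
     * (1 + 0.8 * math.sin(2 * math.pi * 10 * i / fs)) for i in range(1600)]
env = envelope(x, fs)
pulses = pulse_train(env, fs, rate=1000)
```

Because the pulse timing is fixed by `rate` alone, changing the carrier frequency of `x` alters only the pulse amplitudes through the envelope, which is one way to see why TFSp information is lost in this kind of coding.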

Training effects and plasticity of temporal-envelope processing

Perceptual learning resulting from training has been reported for various auditory AM detection or discrimination tasks,[228][229][230] suggesting that the responses of central auditory neurons to ENVp cues are plastic and that practice may modify the circuitry of ENVn processing.[230][231]

The plasticity of ENVn processing has been demonstrated in several ways. For instance, the ability of auditory-cortex neurons to discriminate voice-onset time cues for phonemes is degraded following moderate hearing loss (20–40 dB HL) induced by acoustic trauma.[232] Interestingly, developmental hearing loss reduces cortical responses to slow, but not fast (100 Hz), AM stimuli, in parallel with behavioral performance.[233] Indeed, a transient hearing loss (15 days) occurring during the "critical period" is sufficient to elevate AM thresholds in adult gerbils.[234] Even non-traumatic noise exposure reduces the phase-locking ability of cortical neurons as well as the animals' behavioral capacity to discriminate between different AM sounds.[235] Behavioral training or pairing protocols involving neuromodulators also alter the ability of cortical neurons to phase lock to AM sounds.[236][237] In humans, hearing loss may result in an unbalanced representation of speech cues: ENVn cues are enhanced at the cost of TFSn cues (see: Effects of age and hearing loss on temporal envelope processing). Auditory training may reduce the representation of speech ENVn cues for elderly listeners with hearing loss, who may then reach levels comparable to those observed for normal-hearing elderly listeners.[238] Finally, intensive musical training induces both behavioral effects, such as higher sensitivity to pitch variations (for Mandarin linguistic pitch), and better synchronization of brainstem responses to the f0-contour of lexical tones for musicians compared with non-musicians.[239]

Clinical evaluation of TFS sensitivity

Fast, easy-to-administer psychophysical tests have been developed to assist clinicians in the screening of TFS-processing abilities and the diagnosis of suprathreshold temporal auditory processing deficits associated with cochlear damage and ageing. These tests may also be useful for audiologists and hearing-aid manufacturers to explain and/or predict the outcome of hearing-aid fitting in terms of perceived quality, speech intelligibility or spatial hearing.[240][241] These tests may eventually be used to recommend the most appropriate compression speed in hearing aids[242] or the use of directional microphones. The need for such tests is corroborated by strong correlations between slow-FM or spectro-temporal modulation detection thresholds and aided speech intelligibility in competing backgrounds for hearing-impaired persons.[90][243] Clinical tests can be divided into two groups: those assessing monaural TFS processing capacities (TFS1 test) and those assessing binaural capacities (binaural pitch, TFS-LF, TFS-AF).

TFS1: this test assesses the ability to discriminate between a harmonic complex tone and its frequency-transposed (and thus, inharmonic) version.[244][245][246][159]

Binaural pitch: these tests evaluate the ability to detect and discriminate binaural pitch, and melody recognition using different types of binaural pitch.[182][247]

TFS-LF: this test assesses the ability to discriminate low-frequency pure tones that are identical at the two ears from the same tones differing in interaural phase.[248][249]

TFS-AF: this test assesses the highest audio frequency of a pure tone up to which a change in interaural phase can be discriminated.[250]
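The stimulus contrast used in a TFS1-style test can be sketched by listing component frequencies. Shifting every component of a harmonic complex by the same number of hertz keeps the component spacing (and hence the envelope repetition rate) unchanged while making the tone inharmonic, so the two sounds differ mainly in their fine structure. The sketch below covers only this step; the function names and example values are illustrative assumptions, and any further stimulus processing used in the actual clinical test is omitted.

```python
def harmonic_freqs(f0, lowest, highest):
    """Frequencies of harmonics lowest..highest of fundamental f0 (Hz)."""
    return [k * f0 for k in range(lowest, highest + 1)]

def shifted_freqs(f0, lowest, highest, shift):
    """The same components, all shifted upward by `shift` Hz (inharmonic)."""
    return [f + shift for f in harmonic_freqs(f0, lowest, highest)]

def spacings(freqs):
    """Differences between adjacent component frequencies."""
    return [b - a for a, b in zip(freqs, freqs[1:])]

harmonic = harmonic_freqs(100.0, 9, 11)        # 900, 1000, 1100 Hz
shifted = shifted_freqs(100.0, 9, 11, 25.0)    # 925, 1025, 1125 Hz
```

Both tones have 100 Hz between adjacent components, but only the first has components at integer multiples of 100 Hz.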

Objective measures using envelope and TFS cues

Signal distortion, additive noise, reverberation, and audio processing strategies such as noise suppression and dynamic-range compression can all impact speech intelligibility and speech and music quality.[251][252][253][254][255] These changes in the perception of the signal can often be predicted by measuring the associated changes in the signal envelope and/or temporal fine structure (TFS). Objective measures of the signal changes, when combined with procedures that associate the signal changes with differences in auditory perception, give rise to auditory performance metrics for predicting speech intelligibility and speech quality.

Changes in the TFS can be estimated by passing the signals through a filterbank and computing the coherence[256] between the system input and output in each band. Intelligibility predicted from the coherence is accurate for some forms of additive noise and nonlinear distortion,[251][255] but works poorly for ideal binary mask (IBM) noise suppression.[253] Speech and music quality for signals subjected to noise and clipping distortion have also been modeled using the coherence[257] or using the coherence averaged across short signal segments.[258]
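A coherence computation of this kind can be sketched with the standard library alone. The example estimates the magnitude-squared coherence between an input and an output at a single DFT bin, averaged over non-overlapping segments; a single bin stands in for one filterbank band here, which is a simplification. The segment length, bin index, test signals and function names are assumptions made for illustration, not the procedure of the cited metrics.

```python
import cmath
import random

def dft_bin(seg, k):
    """DFT coefficient of segment seg at bin k (naive sum)."""
    n = len(seg)
    return sum(v * cmath.exp(-2j * cmath.pi * k * i / n) for i, v in enumerate(seg))

def msc(x, y, k, seg_len=64):
    """Magnitude-squared coherence between x and y at bin k.

    Cross- and auto-spectra are averaged over non-overlapping segments.
    """
    sxy, sxx, syy = 0j, 0.0, 0.0
    for start in range(0, len(x) - seg_len + 1, seg_len):
        X = dft_bin(x[start:start + seg_len], k)
        Y = dft_bin(y[start:start + seg_len], k)
        sxy += X * Y.conjugate()
        sxx += abs(X) ** 2
        syy += abs(Y) ** 2
    return abs(sxy) ** 2 / (sxx * syy)

rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(64 * 32)]   # hypothetical system input
scaled = [0.5 * v for v in x]                   # linear gain only
noisy = [v + rng.gauss(0, 1) for v in x]        # input plus independent noise
```

A pure gain change leaves the coherence at 1, whereas adding independent noise lowers it, which is why coherence is sensitive to additive noise and distortion but not to simple level changes.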

Changes in the signal envelope can be measured using several different procedures. The presence of noise or reverberation will reduce the modulation depth of a signal, and multiband measurement of the envelope modulation depth of the system output is used in the speech transmission index (STI) to estimate intelligibility.[259] While accurate for noise and reverberation applications, the STI works poorly for nonlinear processing such as dynamic-range compression.[260] An extension to the STI estimates the change in modulation by cross-correlating the envelopes of the speech input and output signals.[261][262] A related procedure, also using envelope cross-correlations, is the short-time objective intelligibility (STOI) measure,[253] which works well for its intended application in evaluating noise suppression, but which is less accurate for nonlinear distortion.[263] Envelope-based intelligibility metrics have also been derived using modulation filterbanks[67] and using envelope time-frequency modulation patterns.[264] Envelope cross-correlation is also used for estimating speech and music quality.[265][266]
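The reduction of modulation depth by noise and reverberation can be made concrete with a closed form commonly quoted in the STI literature (it is not stated in the text above, and is given here only as an illustration). For stationary noise at a given signal-to-noise ratio and an idealized exponential reverberant decay with reverberation time T, the factor by which a modulation at rate F is reduced is the product of 1/sqrt(1 + (2πFT/13.8)²) and 1/(1 + 10^(−SNR/10)). The function name and argument names are assumptions.

```python
import math

def modulation_reduction(f_mod, snr_db, rt60):
    """Factor by which envelope modulation depth is reduced.

    f_mod: modulation rate (Hz); snr_db: signal-to-noise ratio (dB);
    rt60: reverberation time (s), assuming an exponential decay.
    """
    reverb = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * f_mod * rt60 / 13.8) ** 2)
    noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb * noise
```

At 0 dB SNR without reverberation the modulation depth is halved, and for a fixed reverberation time the reduction grows with modulation rate, so fast envelope fluctuations are smeared more than slow ones.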

Envelope and TFS measurements can also be combined to form intelligibility and quality metrics. A family of metrics for speech intelligibility,[263] speech quality,[267][268] and music quality[269] has been derived using a shared model of the auditory periphery[270] that can represent hearing loss. Using a model of the impaired periphery leads to more accurate predictions for hearing-impaired listeners than using a normal-hearing model, and the combined envelope/TFS metric is generally more accurate than a metric that uses envelope modulation alone.[263][267]

See also


  1. ^ Viemeister NF, Plack CJ (1993). Human Psychophysics. Springer Handbook of Auditory Research. Springer, New York, NY. pp. 116–154. doi:10.1007/978-1-4612-2728-1_4. ISBN 978-1-4612-7644-9.
  2. ^ a b c Rosen S (June 1992). "Temporal information in speech: acoustic, auditory and linguistic aspects". Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 336 (1278): 367–73. Bibcode:1992RSPTB.336..367R. doi:10.1098/rstb.1992.0070. PMID 1354376.
  3. ^ a b Drullman R (January 1995). "Temporal envelope and fine structure cues for speech intelligibility". The Journal of the Acoustical Society of America. 97 (1): 585–92. Bibcode:1995ASAJ...97..585D. doi:10.1121/1.413112. PMID 7860835.
  4. ^ a b c d e Moore BC (December 2008). "The role of temporal fine structure processing in pitch perception, masking, and speech perception for normal-hearing and hearing-impaired people". Journal of the Association for Research in Otolaryngology. 9 (4): 399–406. doi:10.1007/s10162-008-0143-x. PMC 2580810. PMID 18855069.
  5. ^ De Boer E (September 1956). "Pitch of inharmonic signals". Nature. 178 (4532): 535–6. Bibcode:1956Natur.178..535B. doi:10.1038/178535a0. PMID 13358790.
  6. ^ Zeng FG, Nie K, Liu S, Stickney G, Del Rio E, Kong YY, Chen H (September 2004). "On the dichotomy in auditory perception between temporal envelope and fine structure cues". The Journal of the Acoustical Society of America. 116 (3): 1351–4. Bibcode:2004ASAJ..116.1351Z. doi:10.1121/1.1777938. PMID 15478399.
  7. ^ a b Plomp R (1983). "Perception of speech as a modulated signal". Proceedings of the 10th International Congress of Phonetic Sciences, Utrecht: 19–40.
  8. ^ Hilbert D (1912). Grundzüge einer allgemeinen theorie der linearen integralgleichungen. University of California Libraries. Leipzig, B. G. Teubner.
  9. ^ Ruggero MA (July 1973). "Response to noise of auditory nerve fibers in the squirrel monkey". Journal of Neurophysiology. 36 (4): 569–87. doi:10.1152/jn.1973.36.4.569. PMID 4197339.
  10. ^ a b c d Moore BC (2014-05-04). Auditory Processing of Temporal Fine Structure: Effects of Age and Hearing Loss. New Jersey: World Scientific Publishing Company. ISBN 9789814579650.
  11. ^ a b Joris PX, Louage DH, Cardoen L, van der Heijden M (June 2006). "Correlation index: a new metric to quantify temporal coding". Hearing Research. 216–217: 19–30. doi:10.1016/j.heares.2006.03.010. PMID 16644160.
  12. ^ Heinz MG, Swaminathan J (September 2009). "Quantifying envelope and fine-structure coding in auditory nerve responses to chimaeric speech". Journal of the Association for Research in Otolaryngology. 10 (3): 407–23. doi:10.1007/s10162-009-0169-8. PMC 3084379. PMID 19365691.
  13. ^ Søndergaard PL, Decorsière R, Dau T (2011-12-15). "On the relationship between multi-channel envelope and temporal fine structure". Proceedings of the International Symposium on Auditory and Audiological Research. 3: 363–370.
  14. ^ Shamma S, Lorenzi C (May 2013). "On the balance of envelope and temporal fine structure in the encoding of speech in the early auditory system". The Journal of the Acoustical Society of America. 133 (5): 2818–33. Bibcode:2013ASAJ..133.2818S. doi:10.1121/1.4795783. PMC 3663870. PMID 23654388.
  15. ^ Joris PX, Schreiner CE, Rees A (April 2004). "Neural processing of amplitude-modulated sounds". Physiological Reviews. 84 (2): 541–77. doi:10.1152/physrev.00029.2003. PMID 15044682.
  16. ^ Frisina RD (August 2001). "Subcortical neural coding mechanisms for auditory temporal processing". Hearing Research. 158 (1–2): 1–27. doi:10.1016/S0378-5955(01)00296-9. PMID 11506933.
  17. ^ Pressnitzer D, Meddis R, Delahaye R, Winter IM (August 2001). "Physiological correlates of comodulation masking release in the mammalian ventral cochlear nucleus". The Journal of Neuroscience. 21 (16): 6377–86. doi:10.1523/JNEUROSCI.21-16-06377.2001. PMC 6763188. PMID 11487661.
  18. ^ a b Hall JW, Haggard MP, Fernandes MA (July 1984). "Detection in noise by spectro-temporal pattern analysis". The Journal of the Acoustical Society of America. 76 (1): 50–6. Bibcode:1984ASAJ...76R..50H. doi:10.1121/1.391005. PMID 6747111.
  19. ^ Eggermont JJ (April 1994). "Temporal modulation transfer functions for AM and FM stimuli in cat auditory cortex. Effects of carrier type, modulating waveform and intensity". Hearing Research. 74 (1–2): 51–66. doi:10.1016/0378-5955(94)90175-9. PMID 8040099.
  20. ^ Bieser A, Müller-Preuss P (1996). "Auditory responsive cortex in the squirrel monkey: neural responses to amplitude-modulated sounds". Exp Brain Res. 108 (2): 273–84. doi:10.1007/BF00228100. PMID 8815035.
  21. ^ Liang L, Lu T, Wang X (May 2002). "Neural representations of sinusoidal amplitude and frequency modulations in the primary auditory cortex of awake primates". Journal of Neurophysiology. 87 (5): 2237–61. doi:10.1152/jn.2002.87.5.2237. PMID 11976364.
  22. ^ Schreiner CE, Urbas JV (January 1988). "Representation of amplitude modulation in the auditory cortex of the cat. II. Comparison between cortical fields". Hearing Research. 32 (1): 49–63. doi:10.1016/0378-5955(88)90146-3. PMID 3350774.
  23. ^ Lu T, Liang L, Wang X (November 2001). "Temporal and rate representations of time-varying signals in the auditory cortex of awake primates". Nature Neuroscience. 4 (11): 1131–8. doi:10.1038/nn737. PMID 11593234.
  24. ^ Eggermont JJ (November 1991). "Rate and synchronization measures of periodicity coding in cat primary auditory cortex". Hearing Research. 56 (1–2): 153–67. doi:10.1016/0378-5955(91)90165-6. PMID 1769910.
  25. ^ Baumann S, Joly O, Rees A, Petkov CI, Sun L, Thiele A, Griffiths TD (January 2015). "The topography of frequency and time representation in primate auditory cortices". eLife. 4. doi:10.7554/eLife.03256. PMC 4398946. PMID 25590651.
  26. ^ Depireux DA, Elhilali M, eds. (2014-01-15). Handbook of Modern Techniques in Auditory Cortex (first ed.). Nova Science Pub Inc. ISBN 9781628088946.
  27. ^ Kowalski N, Depireux DA, Shamma SA (November 1996). "Analysis of dynamic spectra in ferret primary auditory cortex. I. Characteristics of single-unit responses to moving ripple spectra" (PDF). Journal of Neurophysiology. 76 (5): 3503–23. doi:10.1152/jn.1996.76.5.3503. hdl:1903/5688. PMID 8930289.
  28. ^ Mesgarani N, Chang EF (May 2012). "Selective cortical representation of attended speaker in multi-talker speech perception". Nature. 485 (7397): 233–6. Bibcode:2012Natur.485..233M. doi:10.1038/nature11020. PMC 3870007. PMID 22522927.
  29. ^ John MS, Picton TW (March 2000). "Human auditory steady-state responses to amplitude-modulated tones: phase and latency measurements". Hearing Research. 141 (1–2): 57–79. doi:10.1016/S0378-5955(99)00209-9. PMID 10713496.
  30. ^ Atiani S, David SV, Elgueda D, Locastro M, Radtke-Schuller S, Shamma SA, Fritz JB (April 2014). "Emergent selectivity for task-relevant stimuli in higher-order auditory cortex". Neuron. 82 (2): 486–99. doi:10.1016/j.neuron.2014.02.029. PMC 4048815. PMID 24742467.
  31. ^ Schreiner CE, Urbas JV (1986). "Representation of amplitude modulation in the auditory cortex of the cat. I. The anterior auditory field (AAF)". Hearing Research. 21 (3): 227–41. doi:10.1016/0378-5955(86)90221-2. PMID 3013823.
  32. ^ Giraud AL, Lorenzi C, Ashburner J, Wable J, Johnsrude I, Frackowiak R, Kleinschmidt A (September 2000). "Representation of the temporal envelope of sounds in the human brain". Journal of Neurophysiology. 84 (3): 1588–98. doi:10.1152/jn.2000.84.3.1588. PMID 10980029.
  33. ^ Liégeois-Chauvel C, Lorenzi C, Trébuchon A, Régis J, Chauvel P (July 2004). "Temporal envelope processing in the human left and right auditory cortices". Cerebral Cortex. 14 (7): 731–40. doi:10.1093/cercor/bhh033. PMID 15054052.
  34. ^ Herdener M, Esposito F, Scheffler K, Schneider P, Logothetis NK, Uludag K, Kayser C (November 2013). "Spatial representations of temporal and spectral sound cues in human auditory cortex". Cortex; A Journal Devoted to the Study of the Nervous System and Behavior. 49 (10): 2822–33. doi:10.1016/j.cortex.2013.04.003. PMID 23706955.
  35. ^ Schönwiesner M, Zatorre RJ (August 2009). "Spectro-temporal modulation transfer function of single voxels in the human auditory cortex measured with high-resolution fMRI". Proceedings of the National Academy of Sciences of the United States of America. 106 (34): 14611–6. Bibcode:2009PNAS..10614611S. doi:10.1073/pnas.0907682106. PMC 2732853. PMID 19667199.
  36. ^ Griffiths TD, Penhune V, Peretz I, Dean JL, Patterson RD, Green GG (April 2000). "Frontal processing and auditory perception". NeuroReport. 11 (5): 919–22. doi:10.1097/00001756-200004070-00004. PMID 10790855.
  37. ^ Hullett PW, Hamilton LS, Mesgarani N, Schreiner CE, Chang EF (February 2016). "Human Superior Temporal Gyrus Organization of Spectrotemporal Modulation Tuning Derived from Speech Stimuli". The Journal of Neuroscience. 36 (6): 2014–26. doi:10.1523/JNEUROSCI.1779-15.2016. PMC 4748082. PMID 26865624.
  38. ^ a b c d e f Elhilali M, Fritz JB, Klein DJ, Simon JZ, Shamma SA (February 2004). "Dynamics of precise spike timing in primary auditory cortex". The Journal of Neuroscience. 24 (5): 1159–72. doi:10.1523/JNEUROSCI.3825-03.2004. PMC 6793586. PMID 14762134.
  39. ^ Boer, E. de (1985). "Auditory Time Constants: A Paradox?". Time Resolution in Auditory Systems. Proceedings in Life Sciences. Springer, Berlin, Heidelberg. pp. 141–158. doi:10.1007/978-3-642-70622-6_9. ISBN 9783642706240.
  40. ^ a b Bair W, Koch C (August 1996). "Temporal precision of spike trains in extrastriate cortex of the behaving macaque monkey" (PDF). Neural Computation. 8 (6): 1185–202. doi:10.1162/neco.1996.8.6.1185. PMID 8768391.
  41. ^ Simon JZ, Depireux DA, Klein DJ, Fritz JB, Shamma SA (March 2007). "Temporal symmetry in primary auditory cortex: implications for cortical connectivity". Neural Computation. 19 (3): 583–638. arXiv:q-bio/0608027. doi:10.1162/neco.2007.19.3.583. PMID 17298227.
  42. ^ Theunissen FE, Sen K, Doupe AJ (March 2000). "Spectral-temporal receptive fields of nonlinear auditory neurons obtained using natural sounds". The Journal of Neuroscience. 20 (6): 2315–31. doi:10.1523/JNEUROSCI.20-06-02315.2000. PMC 6772498. PMID 10704507.
  43. ^ David SV, Mesgarani N, Fritz JB, Shamma SA (March 2009). "Rapid synaptic depression explains nonlinear modulation of spectro-temporal tuning in primary auditory cortex by natural stimuli". The Journal of Neuroscience. 29 (11): 3374–86. doi:10.1523/JNEUROSCI.5249-08.2009. PMC 2774136. PMID 19295144.
  44. ^ Bieser A, Müller-Preuss P (March 1996). "Auditory responsive cortex in the squirrel monkey: neural responses to amplitude-modulated sounds". Experimental Brain Research. 108 (2): 273–84. doi:10.1007/bf00228100. PMID 8815035.
  45. ^ Fastl H (2007). Psychoacoustics - Facts and Models. Springer. ISBN 9783540231592.
  46. ^ Burns EM, Viemeister NF (December 1981). "Played‐again SAM: Further observations on the pitch of amplitude‐modulated noise". The Journal of the Acoustical Society of America. 70 (6): 1655–1660. Bibcode:1981ASAJ...70.1655B. doi:10.1121/1.387220.
  47. ^ McDermott JH, Simoncelli EP (September 2011). "Sound texture perception via statistics of the auditory periphery: evidence from sound synthesis". Neuron. 71 (5): 926–40. doi:10.1016/j.neuron.2011.06.032. PMC 4143345. PMID 21903084.
  48. ^ McWalter R, Dau T (2017-09-11). "Cascaded Amplitude Modulations in Sound Texture Perception". Frontiers in Neuroscience. 11: 485. doi:10.3389/fnins.2017.00485. PMC 5601004. PMID 28955191.
  49. ^ a b c d Viemeister NF (November 1979). "Temporal modulation transfer functions based upon modulation thresholds". The Journal of the Acoustical Society of America. 66 (5): 1364–80. Bibcode:1979ASAJ...66.1364V. doi:10.1121/1.383531. PMID 500975.
  50. ^ Sheft S, Yost WA (August 1990). "Temporal integration in amplitude modulation detection". The Journal of the Acoustical Society of America. 88 (2): 796–805. Bibcode:1990ASAJ...88..796S. doi:10.1121/1.399729. PMID 2212305.
  51. ^ a b Kohlrausch A, Fassel R, Dau T (August 2000). "The influence of carrier level and frequency on modulation and beat-detection thresholds for sinusoidal carriers". The Journal of the Acoustical Society of America. 108 (2): 723–34. Bibcode:2000ASAJ..108..723K. doi:10.1121/1.429605. PMID 10955639.
  52. ^ Bacon SP, Grantham DW (June 1989). "Modulation masking: effects of modulation frequency, depth, and phase". The Journal of the Acoustical Society of America. 85 (6): 2575–80. Bibcode:1989ASAJ...85.2575B. doi:10.1121/1.397751. PMID 2745880.
  53. ^ a b Houtgast T (April 1989). "Frequency selectivity in amplitude-modulation detection". The Journal of the Acoustical Society of America. 85 (4): 1676–80. Bibcode:1989ASAJ...85.1676H. doi:10.1121/1.397956. PMID 2708683.
  54. ^ Yost WA, Sheft S (February 1989). "Across-critical-band processing of amplitude-modulated tones". The Journal of the Acoustical Society of America. 85 (2): 848–57. Bibcode:1989ASAJ...85..848Y. doi:10.1121/1.397556. PMID 2925999.
  55. ^ Kay RH, Matthews DR (September 1972). "On the existence in human auditory pathways of channels selectively tuned to the modulation present in frequency-modulated tones". The Journal of Physiology. 225 (3): 657–77. doi:10.1113/jphysiol.1972.sp009962. PMC 1331136. PMID 5076392.
  56. ^ Tansley BW, Suffield JB (September 1983). "Time course of adaptation and recovery of channels selectively sensitive to frequency and amplitude modulation". The Journal of the Acoustical Society of America. 74 (3): 765–75. Bibcode:1983ASAJ...74..765T. doi:10.1121/1.389864. PMID 6630734.
  57. ^ Wojtczak M, Viemeister NF (August 2003). "Suprathreshold effects of adaptation produced by amplitude modulation". The Journal of the Acoustical Society of America. 114 (2): 991–7. Bibcode:2003ASAJ..114..991W. doi:10.1121/1.1593067. PMID 12942978.
  58. ^ Lorenzi C, Simpson MI, Millman RE, Griffiths TD, Woods WP, Rees A, Green GG (November 2001). "Second-order modulation detection thresholds for pure-tone and narrow-band noise carriers". The Journal of the Acoustical Society of America. 110 (5 Pt 1): 2470–8. Bibcode:2001ASAJ..110.2470L. doi:10.1121/1.1406160. PMID 11757936.
  59. ^ Ewert SD, Verhey JL, Dau T (December 2002). "Spectro-temporal processing in the envelope-frequency domain". The Journal of the Acoustical Society of America. 112 (6): 2921–31. Bibcode:2002ASAJ..112.2921E. doi:10.1121/1.1515735. PMID 12509013.
  60. ^ Füllgrabe C, Moore BC, Demany L, Ewert SD, Sheft S, Lorenzi C (April 2005). "Modulation masking produced by second-order modulators". The Journal of the Acoustical Society of America. 117 (4 Pt 1): 2158–68. Bibcode:2005ASAJ..117.2158F. doi:10.1121/1.1861892. PMC 2708918. PMID 15898657.
  61. ^ Klein-Hennig M, Dietz M, Hohmann V, Ewert SD (June 2011). "The influence of different segments of the ongoing envelope on sensitivity to interaural time delays". The Journal of the Acoustical Society of America. 129 (6): 3856–72. Bibcode:2011ASAJ..129.3856K. doi:10.1121/1.3585847. PMID 21682409.
  62. ^ Strickland EA, Viemeister NF (June 1996). "Cues for discrimination of envelopes". The Journal of the Acoustical Society of America. 99 (6): 3638–46. Bibcode:1996ASAJ...99.3638S. doi:10.1121/1.414962. PMID 8655796.
  63. ^ a b c Dau T, Kollmeier B, Kohlrausch A (November 1997). "Modeling auditory processing of amplitude modulation. I. Detection and masking with narrow-band carriers". The Journal of the Acoustical Society of America. 102 (5 Pt 1): 2892–905. Bibcode:1997ASAJ..102.2892D. doi:10.1121/1.420344. PMID 9373976.
  64. ^ Piechowiak T, Ewert SD, Dau T (April 2007). "Modeling comodulation masking release using an equalization-cancellation mechanism" (PDF). The Journal of the Acoustical Society of America. 121 (4): 2111–26. Bibcode:2007ASAJ..121.2111P. doi:10.1121/1.2534227. PMID 17471726.
  65. ^ Ewert SD, Dau T (September 2000). "Characterizing frequency selectivity for envelope fluctuations". The Journal of the Acoustical Society of America. 108 (3 Pt 1): 1181–96. Bibcode:2000ASAJ..108.1181E. doi:10.1121/1.1288665. PMID 11008819.
  66. ^ Wakefield GH, Viemeister NF (September 1990). "Discrimination of modulation depth of sinusoidal amplitude modulation (SAM) noise". The Journal of the Acoustical Society of America. 88 (3): 1367–73. Bibcode:1990ASAJ...88.1367W. doi:10.1121/1.399714. PMID 2229672.
  67. ^ a b Jørgensen S, Ewert SD, Dau T (July 2013). "A multi-resolution envelope-power based model for speech intelligibility". The Journal of the Acoustical Society of America. 134 (1): 436–46. Bibcode:2013ASAJ..134..436J. doi:10.1121/1.4807563. PMID 23862819.
  68. ^ Biberger T, Ewert SD (August 2016). "Envelope and intensity based prediction of psychoacoustic masking and speech intelligibility". The Journal of the Acoustical Society of America. 140 (2): 1023–1038. Bibcode:2016ASAJ..140.1023B. doi:10.1121/1.4960574. PMID 27586734.
  69. ^ Nelson PC, Carney LH (August 2006). "Cues for masked amplitude-modulation detection". The Journal of the Acoustical Society of America. 120 (2): 978–90. Bibcode:2006ASAJ..120..978N. doi:10.1121/1.2213573. PMC 2572864. PMID 16938985.
  70. ^ Verschooten E, Robles L, Joris PX (February 2015). "Assessment of the limits of neural phase-locking using mass potentials". The Journal of Neuroscience. 35 (5): 2255–68. doi:10.1523/JNEUROSCI.2979-14.2015. PMC 6705351. PMID 25653380.
  71. ^ Palmer AR, Russell IJ (1986). "Phase-locking in the cochlear nerve of the guinea-pig and its relation to the receptor potential of inner hair-cells". Hearing Research. 24 (1): 1–15. doi:10.1016/0378-5955(86)90002-X. PMID 3759671.
  72. ^ Weiss TF, Rose C (May 1988). "A comparison of synchronization filters in different auditory receptor organs". Hearing Research. 33 (2): 175–9. doi:10.1016/0378-5955(88)90030-5. PMID 3397327.
  73. ^ a b c Paraouty N, Stasiak A, Lorenzi C, Varnet L, Winter IM (April 2018). "Dual Coding of Frequency Modulation in the Ventral Cochlear Nucleus". The Journal of Neuroscience. 38 (17): 4123–4137. doi:10.1523/JNEUROSCI.2107-17.2018. PMC 6596033. PMID 29599389.
  74. ^ a b Moore BC (September 1973). "Frequency difference limens for short-duration tones". The Journal of the Acoustical Society of America. 54 (3): 610–9. Bibcode:1973ASAJ...54..610M. doi:10.1121/1.1913640. PMID 4754385.
  75. ^ a b c Moore B (2013-04-05). An Introduction to the Psychology of Hearing: Sixth Edition (6th ed.). Leiden: BRILL. ISBN 9789004252424.
  76. ^ a b c d e f Plack CJ (2005). Pitch - Neural Coding and Perception. Springer Handbook of Auditory Research. Springer. ISBN 9780387234724.
  77. ^ a b c d Moore BC, Sek A (October 1996). "Detection of frequency modulation at low modulation rates: evidence for a mechanism based on phase locking". The Journal of the Acoustical Society of America. 100 (4 Pt 1): 2320–31. Bibcode:1996ASAJ..100.2320M. doi:10.1121/1.417941. PMID 8865639.
  78. ^ a b Lacher-Fougère S, Demany L (October 2005). "Consequences of cochlear damage for the detection of interaural phase differences". The Journal of the Acoustical Society of America. 118 (4): 2519–26. Bibcode:2005ASAJ..118.2519L. doi:10.1121/1.2032747. PMID 16266172.
  79. ^ a b Hopkins K, Moore BC, Stone MA (February 2008). "Effects of moderate cochlear hearing loss on the ability to benefit from temporal fine structure information in speech". The Journal of the Acoustical Society of America. 123 (2): 1140–53. Bibcode:2008ASAJ..123.1140H. doi:10.1121/1.2824018. PMC 2688774. PMID 18247914.
  80. ^ Oxenham AJ, Bernstein JG, Penagos H (February 2004). "Correct tonotopic representation is necessary for complex pitch perception". Proceedings of the National Academy of Sciences of the United States of America. 101 (5): 1421–5. doi:10.1073/pnas.0306958101. PMC 337068. PMID 14718671.
  81. ^ Oxenham AJ, Micheyl C, Keebler MV, Loper A, Santurette S (May 2011). "Pitch perception beyond the traditional existence region of pitch". Proceedings of the National Academy of Sciences of the United States of America. 108 (18): 7629–34. doi:10.1073/pnas.1015291108. PMC 3088642. PMID 21502495.
  82. ^ a b c Paraouty N, Ewert SD, Wallaert N, Lorenzi C (July 2016). "Interactions between amplitude modulation and frequency modulation processing: Effects of age and hearing loss". The Journal of the Acoustical Society of America. 140 (1): 121–131. Bibcode:2016ASAJ..140..121P. doi:10.1121/1.4955078. PMID 27475138.
  83. ^ Demany L, Semal C (March 1989). "Detection thresholds for sinusoidal frequency modulation". The Journal of the Acoustical Society of America. 85 (3): 1295–301. Bibcode:1989ASAJ...85.1295D. doi:10.1121/1.397460. PMID 2708671.
  84. ^ a b Ernst SM, Moore BC (December 2010). "Mechanisms underlying the detection of frequency modulation". The Journal of the Acoustical Society of America. 128 (6): 3642–8. Bibcode:2010ASAJ..128.3642E. doi:10.1121/1.3506350. PMID 21218896.
  85. ^ Zwicker, E (1956-01-01). "Die elementaren Grundlagen zur Bestimmung der Informationskapazität des Gehörs". Acta Acustica United with Acustica. 6 (4): 365–381.
  86. ^ Maiwald, D (1967). "Ein Funktionsschema des Gehörs zur Beschreibung der Erkennbarkeit kleiner Frequenz- und Amplitudenänderungen". Acustica. 18: 81–92.
  87. ^ Saberi K, Hafter ER (April 1995). "A common neural code for frequency- and amplitude-modulated sounds". Nature. 374 (6522): 537–9. Bibcode:1995Natur.374..537S. doi:10.1038/374537a0. PMID 7700378.
  88. ^ Ruggles D, Bharadwaj H, Shinn-Cunningham BG (September 2011). "Normal hearing is not enough to guarantee robust encoding of suprathreshold features important in everyday communication". Proceedings of the National Academy of Sciences of the United States of America. 108 (37): 15516–21. Bibcode:2011PNAS..10815516R. doi:10.1073/pnas.1108912108. PMC 3174666. PMID 21844339.
  89. ^ Johannesen PT, Pérez-González P, Kalluri S, Blanco JL, Lopez-Poveda EA (September 2016). "The Influence of Cochlear Mechanical Dysfunction, Temporal Processing Deficits, and Age on the Intelligibility of Audible Speech in Noise for Hearing-Impaired Listeners". Trends in Hearing. 20: 2331216516641055. doi:10.1177/2331216516641055. PMC 5017567. PMID 27604779.
  90. ^ a b Lopez-Poveda EA, Johannesen PT, Pérez-González P, Blanco JL, Kalluri S, Edwards B (January 2017). "Predictors of Hearing-Aid Outcomes". Trends in Hearing. 21: 2331216517730526. doi:10.1177/2331216517730526. PMC 5613846. PMID 28929903.
  91. ^ a b Buss E, Hall JW, Grose JH (June 2004). "Temporal fine-structure cues to speech and pure tone modulation in observers with sensorineural hearing loss". Ear and Hearing. 25 (3): 242–50. doi:10.1097/01.AUD.0000130796.73809.09. PMID 15179115.
  92. ^ a b Strelcyk O, Dau T (May 2009). "Relations between frequency selectivity, temporal fine-structure processing, and speech reception in impaired hearing" (PDF). The Journal of the Acoustical Society of America. 125 (5): 3328–45. Bibcode:2009ASAJ..125.3328S. doi:10.1121/1.3097469. PMID 19425674.
  93. ^ Ewert SD, Paraouty N, Lorenzi C (January 2018). "A two-path model of auditory modulation detection using temporal fine structure and envelope cues". The European Journal of Neuroscience. 51 (5): 1265–1278. doi:10.1111/ejn.13846. PMID 29368797.
  94. ^ Zilany MS, Bruce IC, Nelson PC, Carney LH (November 2009). "A phenomenological model of the synapse between the inner hair cell and auditory nerve: long-term adaptation with power-law dynamics". The Journal of the Acoustical Society of America. 126 (5): 2390–412. Bibcode:2009ASAJ..126.2390Z. doi:10.1121/1.3238250. PMC 2787068. PMID 19894822.
  95. ^ Zilany MS, Bruce IC, Carney LH (January 2014). "Updated parameters and expanded simulation options for a model of the auditory periphery". The Journal of the Acoustical Society of America. 135 (1): 283–6. Bibcode:2014ASAJ..135..283Z. doi:10.1121/1.4837815. PMC 3985897. PMID 24437768.
  96. ^ Wirtzfeld MR, Ibrahim RA, Bruce IC (October 2017). "Predictions of Speech Chimaera Intelligibility Using Auditory Nerve Mean-Rate and Spike-Timing Neural Cues". Journal of the Association for Research in Otolaryngology. 18 (5): 687–710. doi:10.1007/s10162-017-0627-7. PMC 5612921. PMID 28748487.
  97. ^ Moon IJ, Won JH, Park MH, Ives DT, Nie K, Heinz MG, Lorenzi C, Rubinstein JT (September 2014). "Optimal combination of neural temporal envelope and fine structure cues to explain speech identification in background noise". The Journal of Neuroscience. 34 (36): 12145–54. doi:10.1523/JNEUROSCI.1025-14.2014. PMC 4152611. PMID 25186758.
  98. ^ a b Lorenzi C, Gilbert G, Carn H, Garnier S, Moore BC (December 2006). "Speech perception problems of the hearing impaired reflect inability to use temporal fine structure". Proceedings of the National Academy of Sciences of the United States of America. 103 (49): 18866–9. Bibcode:2006PNAS..10318866L. doi:10.1073/pnas.0607364103. PMC 1693753. PMID 17116863.
  99. ^ a b Hopkins K, Moore BC (July 2011). "The effects of age and cochlear hearing loss on temporal fine structure sensitivity, frequency selectivity, and speech reception in noise". The Journal of the Acoustical Society of America. 130 (1): 334–49. Bibcode:2011ASAJ..130..334H. doi:10.1121/1.3585848. PMID 21786903.
  100. ^ Heinz MG, Colburn HS, Carney LH (October 2001). "Evaluating auditory performance limits: I. One-parameter discrimination using a computational model for the auditory nerve". Neural Computation. 13 (10): 2273–316. doi:10.1162/089976601750541804. PMID 11570999.
  101. ^ Heinz MG, Colburn HS, Carney LH (October 2001). "Evaluating auditory performance limits: II. One-parameter discrimination with random-level variation". Neural Computation. 13 (10): 2317–38. doi:10.1162/089976601750541813. PMID 11571000.
  102. ^ Carney, Laurel H.; Heinz, Michael G.; Evilsizer, Mary E.; Gilkey, Robert H.; Colburn, H. Steven (2002). "Auditory Phase Opponency: A Temporal Model for Masked Detection at Low Frequencies". Acta Acustica United with Acustica. 88 (3): 334–47.
  103. ^ Deng L, Geisler CD (December 1987). "A composite auditory model for processing speech sounds". The Journal of the Acoustical Society of America. 82 (6): 2001–12. Bibcode:1987ASAJ...82.2001D. doi:10.1121/1.395644. PMID 3429735.
  104. ^ Loeb GE, White MW, Merzenich MM (1983). "Spatial cross-correlation. A proposed mechanism for acoustic pitch perception". Biological Cybernetics. 47 (3): 149–63. doi:10.1007/BF00337005. PMID 6615914.
  105. ^ Shamma S, Klein D (May 2000). "The case of the missing pitch templates: how harmonic templates emerge in the early auditory system". The Journal of the Acoustical Society of America. 107 (5 Pt 1): 2631–44. Bibcode:2000ASAJ..107.2631S. doi:10.1121/1.428649. hdl:1903/6017. PMID 10830385.
  106. ^ Shamma SA (November 1985). "Speech processing in the auditory system. II: Lateral inhibition and the central processing of speech evoked activity in the auditory nerve". The Journal of the Acoustical Society of America. 78 (5): 1622–32. Bibcode:1985ASAJ...78.1622S. doi:10.1121/1.392800. PMID 3840813.
  107. ^ a b c Varnet L, Ortiz-Barajas MC, Erra RG, Gervain J, Lorenzi C (October 2017). "A cross-linguistic study of speech modulation spectra". The Journal of the Acoustical Society of America. 142 (4): 1976–1989. Bibcode:2017ASAJ..142.1976V. doi:10.1121/1.5006179. PMID 29092595.
  108. ^ Van Tasell DJ, Soli SD, Kirby VM, Widin GP (October 1987). "Speech waveform envelope cues for consonant recognition". The Journal of the Acoustical Society of America. 82 (4): 1152–61. Bibcode:1987ASAJ...82.1152V. doi:10.1121/1.395251. PMID 3680774.
  109. ^ a b Ghitza O (September 2001). "On the upper cutoff frequency of the auditory critical-band envelope detectors in the context of speech perception". The Journal of the Acoustical Society of America. 110 (3 Pt 1): 1628–40. Bibcode:2001ASAJ..110.1628G. doi:10.1121/1.1396325. PMID 11572372.
  110. ^ a b Shannon RV, Zeng FG, Kamath V, Wygonski J, Ekelid M (October 1995). "Speech recognition with primarily temporal cues". Science. 270 (5234): 303–4. Bibcode:1995Sci...270..303S. doi:10.1126/science.270.5234.303. PMID 7569981.
  111. ^ Smith ZM, Delgutte B, Oxenham AJ (March 2002). "Chimaeric sounds reveal dichotomies in auditory perception". Nature. 416 (6876): 87–90. Bibcode:2002Natur.416...87S. doi:10.1038/416087a. PMC 2268248. PMID 11882898.
  112. ^ Drullman R, Festen JM, Plomp R (February 1994). "Effect of temporal envelope smearing on speech reception". The Journal of the Acoustical Society of America. 95 (2): 1053–64. Bibcode:1994ASAJ...95.1053D. doi:10.1121/1.408467. PMID 8132899.
  113. ^ Singh NC, Theunissen FE (December 2003). "Modulation spectra of natural sounds and ethological theories of auditory processing". The Journal of the Acoustical Society of America. 114 (6 Pt 1): 3394–411. Bibcode:2003ASAJ..114.3394S. doi:10.1121/1.1624067. PMID 14714819.
  114. ^ Iverson P, Krumhansl CL (November 1993). "Isolating the dynamic attributes of musical timbre". The Journal of the Acoustical Society of America. 94 (5): 2595–603. Bibcode:1993ASAJ...94.2595I. doi:10.1121/1.407371. PMID 8270737.
  115. ^ Cheveigné, Alain de (2005). "Pitch Perception Models". Pitch. Springer Handbook of Auditory Research. 24. Springer, New York, NY. pp. 169–233. doi:10.1007/0-387-28958-5_6. ISBN 9780387234724.
  116. ^ Moore BC, Glasberg BR, Low KE, Cope T, Cope W (August 2006). "Effects of level and frequency on the audibility of partials in inharmonic complex tones". The Journal of the Acoustical Society of America. 120 (2): 934–44. Bibcode:2006ASAJ..120..934M. doi:10.1121/1.2216906. PMID 16938981.
  117. ^ Terhardt E (May 1974). "Pitch, consonance, and harmony". The Journal of the Acoustical Society of America. 55 (5): 1061–9. Bibcode:1974ASAJ...55.1061T. doi:10.1121/1.1914648. PMID 4833699.
  118. ^ Santurette S, Dau T (January 2011). "The role of temporal fine structure information for the low pitch of high-frequency complex tones". The Journal of the Acoustical Society of America. 129 (1): 282–92. Bibcode:2011ASAJ..129..282S. doi:10.1121/1.3518718. PMID 21303009.
  119. ^ Santurette S, Dau T, Oxenham AJ (December 2012). "On the possibility of a place code for the low pitch of high-frequency complex tones". The Journal of the Acoustical Society of America. 132 (6): 3883–95. Bibcode:2012ASAJ..132.3883S. doi:10.1121/1.4764897. PMC 3528728. PMID 23231119.
  120. ^ Gfeller K, Turner C, Oleson J, Zhang X, Gantz B, Froman R, Olszewski C (June 2007). "Accuracy of cochlear implant recipients on pitch perception, melody recognition, and speech reception in noise". Ear and Hearing. 28 (3): 412–23. doi:10.1097/AUD.0b013e3180479318. PMID 17485990.
  121. ^ a b Zeng FG, Nie K, Stickney GS, Kong YY, Vongphoe M, Bhargave A, Wei C, Cao K (February 2005). "Speech recognition with amplitude and frequency modulations". Proceedings of the National Academy of Sciences of the United States of America. 102 (7): 2293–8. Bibcode:2005PNAS..102.2293Z. doi:10.1073/pnas.0406460102. PMC 546014. PMID 15677723.
  122. ^ Apoux F, Yoho SE, Youngdahl CL, Healy EW (September 2013). "Role and relative contribution of temporal envelope and fine structure cues in sentence recognition by normal-hearing listeners". The Journal of the Acoustical Society of America. 134 (3): 2205–12. Bibcode:2013ASAJ..134.2205A. doi:10.1121/1.4816413. PMC 3765279. PMID 23967950.
  123. ^ Freyman RL, Griffin AM, Oxenham AJ (October 2012). "Intelligibility of whispered speech in stationary and modulated noise maskers". The Journal of the Acoustical Society of America. 132 (4): 2514–23. Bibcode:2012ASAJ..132.2514F. doi:10.1121/1.4747614. PMC 3477190. PMID 23039445.
  124. ^ Dick, Frederic; Krishnan, Saloni; Leech, Robert; Saygin, Ayşe Pinar (2016). "Environmental Sounds". Neurobiology of Language. pp. 1121–1138. doi:10.1016/b978-0-12-407794-2.00089-4. ISBN 978-0-12-407794-2.
  125. ^ Lemaitre, Guillaume; Grimault, Nicolas; Suied, Clara (2018). "Acoustics and Psychoacoustics of Sound Scenes and Events". Computational Analysis of Sound Scenes and Events. pp. 41–67. doi:10.1007/978-3-319-63450-0_3. ISBN 978-3-319-63449-4.
  126. ^ a b c Shafiro, Valeriy (June 2008). "Identification of Environmental Sounds With Varying Spectral Resolution". Ear and Hearing. 29 (3): 401–420. doi:10.1097/AUD.0b013e31816a0cf1. PMID 18344871.
  127. ^ a b Gygi, Brian; Kidd, Gary R.; Watson, Charles S. (March 2004). "Spectral-temporal factors in the identification of environmental sounds". The Journal of the Acoustical Society of America. 115 (3): 1252–1265. Bibcode:2004ASAJ..115.1252G. doi:10.1121/1.1635840. PMID 15058346.
  128. ^ Warren, William H.; Verbrugge, Robert R. (1984). "Auditory perception of breaking and bouncing events: A case study in ecological acoustics". Journal of Experimental Psychology: Human Perception and Performance. 10 (5): 704–712. doi:10.1037/0096-1523.10.5.704.
  129. ^ Inverso, Yell; Limb, Charles J. (August 2010). "Cochlear Implant-Mediated Perception of Nonlinguistic Sounds". Ear and Hearing. 31 (4): 505–514. doi:10.1097/AUD.0b013e3181d99a52. PMID 20588119.
  130. ^ Shafiro, Valeriy; Gygi, Brian; Cheng, Min-Yu; Vachhani, Jay; Mulvey, Megan (July 2011). "Perception of Environmental Sounds by Experienced Cochlear Implant Patients". Ear and Hearing. 32 (4): 511–523. doi:10.1097/AUD.0b013e3182064a87. PMC 3115425. PMID 21248643.
  131. ^ Harris, Michael S.; Boyce, Lauren; Pisoni, David B.; Shafiro, Valeriy; Moberly, Aaron C. (October 2017). "The Relationship Between Environmental Sound Awareness and Speech Recognition Skills in Experienced Cochlear Implant Users". Otology & Neurotology. 38 (9): e308–e314. doi:10.1097/MAO.0000000000001514. PMC 6205294. PMID 28731964.
  132. ^ Moore BC, Gockel HE (April 2012). "Properties of auditory stream formation". Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 367 (1591): 919–31. doi:10.1098/rstb.2011.0355. PMC 3282308. PMID 22371614.
  133. ^ Cusack R, Roberts B (July 2004). "Effects of differences in the pattern of amplitude envelopes across harmonics on auditory stream segregation". Hearing Research. 193 (1–2): 95–104. doi:10.1016/j.heares.2004.03.009. PMID 15219324.
  134. ^ Vliegen J, Oxenham AJ (January 1999). "Sequential stream segregation in the absence of spectral cues". The Journal of the Acoustical Society of America. 105 (1): 339–46. Bibcode:1999ASAJ..105..339V. doi:10.1121/1.424503. PMID 9921660.
  135. ^ Grimault N, Micheyl C, Carlyon RP, Arthaud P, Collet L (July 2000). "Influence of peripheral resolvability on the perceptual segregation of harmonic complex tones differing in fundamental frequency". The Journal of the Acoustical Society of America. 108 (1): 263–71. Bibcode:2000ASAJ..108..263G. doi:10.1121/1.429462. PMID 10923890.
  136. ^ Grimault N, Bacon SP, Micheyl C (March 2002). "Auditory stream segregation on the basis of amplitude-modulation rate". The Journal of the Acoustical Society of America. 111 (3): 1340–8. Bibcode:2002ASAJ..111.1340G. doi:10.1121/1.1452740. PMID 11931311.
  137. ^ Yamagishi S, Otsuka S, Furukawa S, Kashino M (July 2017). "Comparison of perceptual properties of auditory streaming between spectral and amplitude modulation domains". Hearing Research. 350: 244–250. doi:10.1016/j.heares.2017.03.006. PMID 28323019.
  138. ^ David M, Lavandier M, Grimault N, Oxenham AJ (September 2017). "Discrimination and streaming of speech sounds based on differences in interaural and spectral cues". The Journal of the Acoustical Society of America. 142 (3): 1674–1685. Bibcode:2017ASAJ..142.1674D. doi:10.1121/1.5003809. PMC 5617732. PMID 28964066.
  139. ^ a b Levi EC, Werner LA (1996). "Amplitude modulation detection in infancy: Update on 3-month-olds". Assoc. Res. Otolaryngol. 19: 142.
  140. ^ Werner LA (October 1996). "The development of auditory behavior (or what the anatomists and physiologists have to explain)". Ear and Hearing. 17 (5): 438–46. doi:10.1097/00003446-199610000-00010. PMID 8909892.
  141. ^ Werner LA (April 1999). "Forward masking among infant and adult listeners". The Journal of the Acoustical Society of America. 105 (4): 2445–53. Bibcode:1999ASAJ..105.2445W. doi:10.1121/1.426849. PMID 10212425.
  142. ^ a b Levi EC, Folsom RC, Dobie RA (September 1995). "Coherence analysis of envelope-following responses (EFRs) and frequency-following responses (FFRs) in infants and adults". Hearing Research. 89 (1–2): 21–7. doi:10.1016/0378-5955(95)00118-3. PMID 8600128.
  143. ^ Levi EC, Folsom RC, Dobie RA (June 1993). "Amplitude-modulation following response (AMFR): effects of modulation rate, carrier frequency, age, and state". Hearing Research. 68 (1): 42–52. doi:10.1016/0378-5955(93)90063-7. PMID 8376214.
  144. ^ Hall JW, Grose JH (July 1994). "Development of temporal resolution in children as measured by the temporal modulation transfer function". The Journal of the Acoustical Society of America. 96 (1): 150–4. Bibcode:1994ASAJ...96..150H. doi:10.1121/1.410474. PMID 7598757.
  145. ^ Peter V, Wong K, Narne VK, Sharma M, Purdy SC, McMahon C (February 2014). "Assessing spectral and temporal processing in children and adults using temporal modulation transfer function (TMTF), Iterated Ripple Noise (IRN) perception, and spectral ripple discrimination (SRD)". Journal of the American Academy of Audiology. 25 (2): 210–8. doi:10.3766/jaaa.25.2.9. PMID 24828221.
  146. ^ Werner LA (2007). "Issues in human auditory development". Journal of Communication Disorders. 40 (4): 275–83. doi:10.1016/j.jcomdis.2007.03.004. PMC 1975821. PMID 17420028.
  147. ^ Buss E, Hall JW, Grose JH, Dev MB (August 1999). "Development of adult-like performance in backward, simultaneous, and forward masking". Journal of Speech, Language, and Hearing Research. 42 (4): 844–9. doi:10.1044/jslhr.4204.844. PMID 10450905.
  148. ^ Cabrera L, Werner L (July 2017). "Infants' and Adults' Use of Temporal Cues in Consonant Discrimination" (PDF). Ear and Hearing. 38 (4): 497–506. doi:10.1097/AUD.0000000000000422. PMC 5482774. PMID 28338496.
  149. ^ a b Bertoncini J, Serniclaes W, Lorenzi C (June 2009). "Discrimination of speech sounds based upon temporal envelope versus fine structure cues in 5- to 7-year-old children". Journal of Speech, Language, and Hearing Research. 52 (3): 682–95. doi:10.1044/1092-4388(2008/07-0273). PMID 18952853.
  150. ^ a b Le Prell CG (2012). Noise-Induced Hearing Loss - Scientific Advances. Springer Handbook of Auditory Research. Springer. ISBN 9781441995223.
  151. ^ a b Manley GA (2017). Understanding the Cochlea. Springer Handbook of Auditory Research. Springer. ISBN 9783319520711.
  152. ^ a b c Kale S, Heinz MG (December 2010). "Envelope coding in auditory nerve fibers following noise-induced hearing loss". Journal of the Association for Research in Otolaryngology. 11 (4): 657–73. doi:10.1007/s10162-010-0223-6. PMC 2975881. PMID 20556628.
  153. ^ Zhong Z, Henry KS, Heinz MG (March 2014). "Sensorineural hearing loss amplifies neural coding of envelope information in the central auditory system of chinchillas". Hearing Research. 309: 55–62. doi:10.1016/j.heares.2013.11.006. PMC 3922929. PMID 24315815.
  154. ^ Kale S, Heinz MG (April 2012). "Temporal modulation transfer functions measured from auditory-nerve responses following sensorineural hearing loss". Hearing Research. 286 (1–2): 64–75. doi:10.1016/j.heares.2012.02.004. PMC 3326227. PMID 22366500.
  155. ^ Henry KS, Kale S, Heinz MG (2014-02-17). "Noise-induced hearing loss increases the temporal precision of complex envelope coding by auditory-nerve fibers". Frontiers in Systems Neuroscience. 8: 20. doi:10.3389/fnsys.2014.00020. PMC 3925834. PMID 24596545.
  156. ^ a b Ruggero MA, Rich NC (April 1991). "Furosemide alters organ of Corti mechanics: evidence for feedback of outer hair cells upon the basilar membrane". The Journal of Neuroscience. 11 (4): 1057–67. doi:10.1523/JNEUROSCI.11-04-01057.1991. PMC 3580957. PMID 2010805.
  157. ^ Heinz MG, Young ED (February 2004). "Response growth with sound level in auditory-nerve fibers after noise-induced hearing loss". Journal of Neurophysiology. 91 (2): 784–95. doi:10.1152/jn.00776.2003. PMC 2921373. PMID 14534289.
  158. ^ a b Moore BC, Glasberg BR (August 2001). "Temporal modulation transfer functions obtained using sinusoidal carriers with normally hearing and hearing-impaired listeners". The Journal of the Acoustical Society of America. 110 (2): 1067–73. Bibcode:2001ASAJ..110.1067M. doi:10.1121/1.1385177. PMID 11519575.
  159. ^ a b Füllgrabe C, Moore BC, Stone MA (2014). "Age-group differences in speech identification despite matched audiometrically normal hearing: contributions from auditory temporal processing and cognition". Frontiers in Aging Neuroscience. 6: 347. doi:10.3389/fnagi.2014.00347. PMC 4292733. PMID 25628563.
  160. ^ Wallaert N, Moore BC, Lorenzi C (June 2016). "Comparing the effects of age on amplitude modulation and frequency modulation detection". The Journal of the Acoustical Society of America. 139 (6): 3088–3096. Bibcode:2016ASAJ..139.3088W. doi:10.1121/1.4953019. PMID 27369130.
  161. ^ Bacon SP, Gleitman RM (June 1992). "Modulation detection in subjects with relatively flat hearing losses". Journal of Speech and Hearing Research. 35 (3): 642–53. doi:10.1044/jshr.3503.642. PMID 1608256.
  162. ^ Moore BC, Shailer MJ, Schooneveldt GP (August 1992). "Temporal modulation transfer functions for band-limited noise in subjects with cochlear hearing loss". British Journal of Audiology. 26 (4): 229–37. doi:10.3109/03005369209076641. PMID 1446186.
  163. ^ a b Schlittenlacher J, Moore BC (November 2016). "Discrimination of amplitude-modulation depth by subjects with normal and impaired hearing". The Journal of the Acoustical Society of America. 140 (5): 3487–3495. Bibcode:2016ASAJ..140.3487S. doi:10.1121/1.4966117. PMID 27908066.
  164. ^ Başkent D (November 2006). "Speech recognition in normal hearing and sensorineural hearing loss as a function of the number of spectral channels". The Journal of the Acoustical Society of America. 120 (5): 2908–2925. Bibcode:2006ASAJ..120.2908B. doi:10.1121/1.2354017. PMID 17139748.
  165. ^ a b King A, Hopkins K, Plack CJ (January 2014). "The effects of age and hearing loss on interaural phase difference discrimination". The Journal of the Acoustical Society of America. 135 (1): 342–51. Bibcode:2014ASAJ..135..342K. doi:10.1121/1.4838995. PMID 24437774.
  166. ^ a b c Derleth RP, Dau T, Kollmeier B (September 2001). "Modeling temporal and compressive properties of the normal and impaired auditory system". Hearing Research. 159 (1–2): 132–49. doi:10.1016/S0378-5955(01)00322-7. PMID 11520641.
  167. ^ a b c d Wallaert N, Moore BC, Ewert SD, Lorenzi C (February 2017). "Sensorineural hearing loss enhances auditory sensitivity and temporal integration for amplitude modulation". The Journal of the Acoustical Society of America. 141 (2): 971–980. Bibcode:2017ASAJ..141..971W. doi:10.1121/1.4976080. PMID 28253641.
  168. ^ a b Jepsen ML, Dau T (January 2011). "Characterizing auditory processing and perception in individual listeners with sensorineural hearing loss". The Journal of the Acoustical Society of America. 129 (1): 262–81. Bibcode:2011ASAJ..129..262J. doi:10.1121/1.3518768. PMID 21303008.
  169. ^ Ives DT, Kalluri S, Strelcyk O, Sheft S, Miermont F, Coez A, Bizaguet E, Lorenzi C (October 2014). "Effects of noise reduction on AM perception for hearing-impaired listeners". Journal of the Association for Research in Otolaryngology. 15 (5): 839–48. doi:10.1007/s10162-014-0466-8. PMC 4164688. PMID 24899379.
  170. ^ Paul BT, Bruce IC, Roberts LE (February 2017). "Evidence that hidden hearing loss underlies amplitude modulation encoding deficits in individuals with and without tinnitus". Hearing Research. 344: 170–182. doi:10.1016/j.heares.2016.11.010. PMID 27888040.
  171. ^ Aslin RN (August 1989). "Discrimination of frequency transitions by human infants". The Journal of the Acoustical Society of America. 86 (2): 582–90. Bibcode:1989ASAJ...86..582A. doi:10.1121/1.398237. PMID 2768673.
  172. ^ Colombo J, Horowitz FD (April 1986). "Infants' attentional responses to frequency modulated sweeps". Child Development. 57 (2): 287–91. doi:10.2307/1130583. JSTOR 1130583. PMID 3956313.
  173. ^ Leibold LJ, Werner LA (2007-09-01). "Infant Auditory Sensitivity to Pure Tones and Frequency-Modulated Tones". Infancy. 12 (2): 225–233. doi:10.1111/j.1532-7078.2007.tb00241.x.
  174. ^ Dawes P, Bishop DV (August 2008). "Maturation of visual and auditory temporal processing in school-aged children". Journal of Speech, Language, and Hearing Research. 51 (4): 1002–15. doi:10.1044/1092-4388(2008/073). PMID 18658067.
  175. ^ Henry KS, Heinz MG (October 2012). "Diminished temporal coding with sensorineural hearing loss emerges in background noise". Nature Neuroscience. 15 (10): 1362–4. doi:10.1038/nn.3216. PMC 3458164. PMID 22960931.
  176. ^ Henry KS, Kale S, Heinz MG (February 2016). "Distorted Tonotopic Coding of Temporal Envelope and Fine Structure with Noise-Induced Hearing Loss". The Journal of Neuroscience. 36 (7): 2227–37. doi:10.1523/JNEUROSCI.3944-15.2016. PMC 4756156. PMID 26888932.
  177. ^ a b Moore BC, Peters RW (May 1992). "Pitch discrimination and phase sensitivity in young and elderly subjects and its relationship to frequency selectivity". The Journal of the Acoustical Society of America. 91 (5): 2881–93. Bibcode:1992ASAJ...91.2881M. doi:10.1121/1.402925. PMID 1629481.
  178. ^ a b Moore BC (2008). Moore BC (ed.). Cochlear Hearing Loss: Physiological, Psychological and Technical Issues (Second ed.). Wiley Online Library. doi:10.1002/9780470987889. ISBN 9780470987889.
  179. ^ Hopkins K, Moore BC (August 2007). "Moderate cochlear hearing loss leads to a reduced ability to use temporal fine structure information". The Journal of the Acoustical Society of America. 122 (2): 1055–68. Bibcode:2007ASAJ..122.1055H. doi:10.1121/1.2749457. PMID 17672653.
  180. ^ Moore BC, Skrodzka E (January 2002). "Detection of frequency modulation by hearing-impaired listeners: effects of carrier frequency, modulation rate, and added amplitude modulation". The Journal of the Acoustical Society of America. 111 (1 Pt 1): 327–35. Bibcode:2002ASAJ..111..327M. doi:10.1121/1.1424871. PMID 11833538.
  181. ^ Grose JH, Mamo SK (December 2012). "Frequency modulation detection as a measure of temporal processing: age-related monaural and binaural effects". Hearing Research. 294 (1–2): 49–54. doi:10.1016/j.heares.2012.09.007. PMC 3505233. PMID 23041187.
  182. ^ a b c Santurette S, Dau T (January 2007). "Binaural pitch perception in normal-hearing and hearing-impaired listeners". Hearing Research. 223 (1–2): 29–47. doi:10.1016/j.heares.2006.09.013. PMID 17107767.
  183. ^ Grose JH, Mamo SK (December 2010). "Processing of temporal fine structure as a function of age". Ear and Hearing. 31 (6): 755–60. doi:10.1097/AUD.0b013e3181e627e7. PMC 2966515. PMID 20592614.
  184. ^ a b Lopez-Poveda EA, Barrios P (2013-07-16). "Perception of stochastically undersampled sound waveforms: a model of auditory deafferentation". Frontiers in Neuroscience. 7: 124. doi:10.3389/fnins.2013.00124. PMC 3712141. PMID 23882176.
  185. ^ Young ED, Sachs MB (November 1979). "Representation of steady-state vowels in the temporal aspects of the discharge patterns of populations of auditory-nerve fibers". The Journal of the Acoustical Society of America. 66 (5): 1381–1403. Bibcode:1979ASAJ...66.1381Y. doi:10.1121/1.383532. PMID 500976.
  186. ^ Zeng FG, Kong YY, Michalewski HJ, Starr A (June 2005). "Perceptual consequences of disrupted auditory nerve activity". Journal of Neurophysiology. 93 (6): 3050–63. doi:10.1152/jn.00985.2004. PMID 15615831.
  187. ^ Pichora-Fuller MK, Schneider BA, Macdonald E, Pass HE, Brown S (January 2007). "Temporal jitter disrupts speech intelligibility: a simulation of auditory aging". Hearing Research. 223 (1–2): 114–21. doi:10.1016/j.heares.2006.10.009. PMID 17157462.
  188. ^ Lopez-Poveda EA (2014-10-30). "Why do I hear but not understand? Stochastic undersampling as a model of degraded neural encoding of speech". Frontiers in Neuroscience. 8: 348. doi:10.3389/fnins.2014.00348. PMC 4214224. PMID 25400543.
  189. ^ Fowler EP (1936-12-01). "A method for the early detection of otosclerosis: a study of sounds well above threshold". Archives of Otolaryngology–Head & Neck Surgery. 24 (6): 731–741. doi:10.1001/archotol.1936.00640050746005.
  190. ^ Moore BC (June 2004). "Testing the concept of softness imperception: loudness near threshold for hearing-impaired ears". The Journal of the Acoustical Society of America. 115 (6): 3103–11. Bibcode:2004ASAJ..115.3103M. doi:10.1121/1.1738839. PMID 15237835.
  191. ^ Stone MA, Moore BC (December 1992). "Syllabic compression: effective compression ratios for signals modulated at different rates". British Journal of Audiology. 26 (6): 351–61. doi:10.3109/03005369209076659. PMID 1292819.
  192. ^ Plomp R (June 1988). "The negative effect of amplitude compression in multichannel hearing aids in the light of the modulation-transfer function". The Journal of the Acoustical Society of America. 83 (6): 2322–7. Bibcode:1988ASAJ...83.2322P. doi:10.1121/1.396363. PMID 3411024.
  193. ^ Stone MA, Moore BC (March 2007). "Quantifying the effects of fast-acting compression on the envelope of speech". The Journal of the Acoustical Society of America. 121 (3): 1654–64. Bibcode:2007ASAJ..121.1654S. doi:10.1121/1.2434754. PMID 17407902.
  194. ^ Bacon S (2004). Compression: From Cochlea to Cochlear Implants. Springer Handbook of Auditory Research. Springer. ISBN 9780387004969.
  195. ^ Boyle PJ, Büchner A, Stone MA, Lenarz T, Moore BC (April 2009). "Comparison of dual-time-constant and fast-acting automatic gain control (AGC) systems in cochlear implants". International Journal of Audiology. 48 (4): 211–21. doi:10.1080/14992020802581982. PMID 19363722.
  196. ^ Clark GM, Blamey PJ, Brown AM, Gusby PA, Dowell RC, Franz BK, Pyman BC, Shepherd RK, Tong YC, Webb RL (1987). "The University of Melbourne–nucleus multi-electrode cochlear implant". Advances in Oto-Rhino-Laryngology. 38: V–IX, 1–181. doi:10.1159/000414597. PMID 3318305.
  197. ^ Başkent D, Gaudrain E, Tamati TN, Wagner A (2016). "Chapter 12: Perception and Psychoacoustics of Speech in Cochlear Implant Users". In Cacace AT, de Kleine E, Holt AG, van Dijk P (eds.). Scientific Foundations of Audiology: Perspectives from Physics, Biology, Modeling, and Medicine. San Diego, CA: Plural Publishing, Inc. pp. 285–319. ISBN 978-1-59756-652-0.
  198. ^ Bierer JA, Faulkner KF (April 2010). "Identifying cochlear implant channels with poor electrode-neuron interface: partial tripolar, single-channel thresholds and psychophysical tuning curves". Ear and Hearing. 31 (2): 247–58. doi:10.1097/AUD.0b013e3181c7daf4. PMC 2836401. PMID 20090533.
  199. ^ Lazard DS, Vincent C, Venail F, Van de Heyning P, Truy E, Sterkers O, et al. (November 2012). "Pre-, per- and postoperative factors affecting performance of postlinguistically deaf adults using cochlear implants: a new conceptual model over time". PLOS One. 7 (11): e48739. Bibcode:2012PLoSO...748739L. doi:10.1371/journal.pone.0048739. PMC 3494723. PMID 23152797.
  200. ^ Holden LK, Firszt JB, Reeder RM, Uchanski RM, Dwyer NY, Holden TA (December 2016). "Factors Affecting Outcomes in Cochlear Implant Recipients Implanted With a Perimodiolar Electrode Array Located in Scala Tympani". Otology & Neurotology. 37 (10): 1662–1668. doi:10.1097/MAO.0000000000001241. PMC 5113723. PMID 27755365.
  201. ^ Boyle PJ, Nunn TB, O'Connor AF, Moore BC (March 2013). "STARR: a speech test for evaluation of the effectiveness of auditory prostheses under realistic conditions". Ear and Hearing. 34 (2): 203–12. doi:10.1097/AUD.0b013e31826a8e82. PMID 23135616.
  202. ^ Won JH, Drennan WR, Nie K, Jameyson EM, Rubinstein JT (July 2011). "Acoustic temporal modulation detection and speech perception in cochlear implant listeners". The Journal of the Acoustical Society of America. 130 (1): 376–88. Bibcode:2011ASAJ..130..376W. doi:10.1121/1.3592521. PMC 3155593. PMID 21786906.
  203. ^ Fu QJ (September 2002). "Temporal processing and speech recognition in cochlear implant users". NeuroReport. 13 (13): 1635–9. doi:10.1097/00001756-200209160-00013. PMID 12352617.
  204. ^ Friesen LM, Shannon RV, Baskent D, Wang X (August 2001). "Speech recognition in noise as a function of the number of spectral channels: comparison of acoustic hearing and cochlear implants". The Journal of the Acoustical Society of America. 110 (2): 1150–63. Bibcode:2001ASAJ..110.1150F. doi:10.1121/1.1381538. PMID 11519582.
  205. ^ Moore DR, Shannon RV (June 2009). "Beyond cochlear implants: awakening the deafened brain". Nature Neuroscience. 12 (6): 686–91. doi:10.1038/nn.2326. PMID 19471266.
  206. ^ Stickney GS, Zeng FG, Litovsky R, Assmann P (August 2004). "Cochlear implant speech recognition with speech maskers". The Journal of the Acoustical Society of America. 116 (2): 1081–91. Bibcode:2004ASAJ..116.1081S. doi:10.1121/1.1772399. PMID 15376674.
  207. ^ Blamey P, Artieres F, Başkent D, Bergeron F, Beynon A, Burke E, et al. (2013). "Factors affecting auditory performance of postlinguistically deaf adults using cochlear implants: an update with 2251 patients" (PDF). Audiology & Neuro-Otology. 18 (1): 36–47. doi:10.1159/000343189. PMID 23095305.
  208. ^ Başkent D, Clarke J, Pals C, Benard MR, Bhargava P, Saija J, Sarampalis A, Wagner A, Gaudrain E (October 2016). "Cognitive compensation of speech perception with hearing impairment, cochlear implants, and aging". Trends in Hearing. 20: 2331216516670279. doi:10.1177/2331216516670279. PMC 5056620.
  209. ^ Pfingst BE, Burkholder-Juhasz RA, Xu L, Thompson CS (February 2008). "Across-site patterns of modulation detection in listeners with cochlear implants". The Journal of the Acoustical Society of America. 123 (2): 1054–62. Bibcode:2008ASAJ..123.1054P. doi:10.1121/1.2828051. PMC 2431465. PMID 18247907.
  210. ^ a b Chatterjee M, Oberzut C (September 2011). "Detection and rate discrimination of amplitude modulation in electrical hearing". The Journal of the Acoustical Society of America. 130 (3): 1567–80. Bibcode:2011ASAJ..130.1567C. doi:10.1121/1.3621445. PMC 3188971. PMID 21895095.
  211. ^ Shannon RV (April 1992). "Temporal modulation transfer functions in patients with cochlear implants". The Journal of the Acoustical Society of America. 91 (4 Pt 1): 2156–64. Bibcode:1992ASAJ...91.2156S. doi:10.1121/1.403807. PMID 1597606.
  212. ^ Cazals Y, Pelizzone M, Saudan O, Boex C (October 1994). "Low-pass filtering in amplitude modulation detection associated with vowel and consonant identification in subjects with cochlear implants". The Journal of the Acoustical Society of America. 96 (4): 2048–54. Bibcode:1994ASAJ...96.2048C. doi:10.1121/1.410146. PMID 7963020.
  213. ^ Cooper WB, Tobey E, Loizou PC (August 2008). "Music perception by cochlear implant and normal hearing listeners as measured by the Montreal Battery for Evaluation of Amusia". Ear and Hearing. 29 (4): 618–26. doi:10.1097/AUD.0b013e318174e787. PMC 2676841. PMID 18469714.
  214. ^ Galvin JJ, Fu QJ, Nogaki G (June 2007). "Melodic contour identification by cochlear implant listeners". Ear and Hearing. 28 (3): 302–19. doi:10.1097/01.aud.0000261689.35445.20. PMC 3627492. PMID 17485980.
  215. ^ Fu QJ, Chinchilla S, Nogaki G, Galvin JJ (September 2005). "Voice gender identification by cochlear implant users: the role of spectral and temporal resolution". The Journal of the Acoustical Society of America. 118 (3 Pt 1): 1711–8. Bibcode:2005ASAJ..118.1711F. doi:10.1121/1.1985024. PMID 16240829.
  216. ^ Fuller CD, Gaudrain E, Clarke JN, Galvin JJ, Fu QJ, Free RH, Başkent D (December 2014). "Gender categorization is abnormal in cochlear implant users". Journal of the Association for Research in Otolaryngology. 15 (6): 1037–48. doi:10.1007/s10162-014-0483-7. PMC 4389960. PMID 25172111.
  217. ^ Peng SC, Lu HP, Lu N, Lin YS, Deroche ML, Chatterjee M (May 2017). "Processing of Acoustic Cues in Lexical-Tone Identification by Pediatric Cochlear-Implant Recipients". Journal of Speech, Language, and Hearing Research. 60 (5): 1223–1235. doi:10.1044/2016_JSLHR-S-16-0048. PMC 5755546. PMID 28388709.
  218. ^ Wang W, Zhou N, Xu L (April 2011). "Musical pitch and lexical tone perception with cochlear implants". International Journal of Audiology. 50 (4): 270–8. doi:10.3109/14992027.2010.542490. PMC 5662112. PMID 21190394.
  219. ^ Chatterjee M, Peng SC (January 2008). "Processing F0 with cochlear implants: Modulation frequency discrimination and speech intonation recognition". Hearing Research. 235 (1–2): 143–56. doi:10.1016/j.heares.2007.11.004. PMC 2237883. PMID 18093766.
  220. ^ Fu QJ, Galvin JJ (December 2007). "Vocal emotion recognition by normal-hearing listeners and cochlear implant users". Trends in Amplification. 11 (4): 301–15. doi:10.1177/1084713807305301. PMC 4111530. PMID 18003871.
  221. ^ Chatterjee M, Zion DJ, Deroche ML, Burianek BA, Limb CJ, Goren AP, Kulkarni AM, Christensen JA (April 2015). "Voice emotion recognition by cochlear-implanted children and their normally-hearing peers". Hearing Research. 322: 151–62. doi:10.1016/j.heares.2014.10.003. PMC 4615700. PMID 25448167.
  222. ^ Chatterjee M, Oba SI (December 2004). "Across- and within-channel envelope interactions in cochlear implant listeners". Journal of the Association for Research in Otolaryngology. 5 (4): 360–75. doi:10.1007/s10162-004-4050-5. PMC 2504569. PMID 15675001.
  223. ^ Chatterjee M, Kulkarni AM (February 2018). "Modulation detection interference in cochlear implant listeners under forward masking conditions". The Journal of the Acoustical Society of America. 143 (2): 1117–1127. Bibcode:2018ASAJ..143.1117C. doi:10.1121/1.5025059. PMC 5821512. PMID 29495705.
  224. ^ Alcántara JL, Moore BC, Kühnel V, Launer S (January 2003). "Evaluation of the noise reduction system in a commercial digital hearing aid". International Journal of Audiology. 42 (1): 34–42. doi:10.3109/14992020309056083. PMID 12564514.
  225. ^ Moore BC (March 2003). "Coding of sounds in the auditory system and its relevance to signal processing and coding in cochlear implants". Otology & Neurotology. 24 (2): 243–54. doi:10.1097/00129492-200303000-00019. PMID 12621339.
  226. ^ Rader T, Döge J, Adel Y, Weissgerber T, Baumann U (September 2016). "Place dependent stimulation rates improve pitch perception in cochlear implantees with single-sided deafness". Hearing Research. 339: 94–103. doi:10.1016/j.heares.2016.06.013. PMID 27374479.
  227. ^ Roy AT, Carver C, Jiradejvong P, Limb CJ (October 2015). "Musical Sound Quality in Cochlear Implant Users: A Comparison in Bass Frequency Perception Between Fine Structure Processing and High-Definition Continuous Interleaved Sampling Strategies". Ear and Hearing. 36 (5): 582–90. doi:10.1097/AUD.0000000000000170. PMID 25906173.
  228. ^ Fitzgerald MB, Wright BA (February 2011). "Perceptual learning and generalization resulting from training on an auditory amplitude-modulation detection task". The Journal of the Acoustical Society of America. 129 (2): 898–906. Bibcode:2011ASAJ..129..898F. doi:10.1121/1.3531841. PMC 3070992. PMID 21361447.
  229. ^ Fitzgerald MB, Wright BA (December 2005). "A perceptual learning investigation of the pitch elicited by amplitude-modulated noise". The Journal of the Acoustical Society of America. 118 (6): 3794–803. Bibcode:2005ASAJ..118.3794F. doi:10.1121/1.2074687. PMID 16419824.
  230. ^ a b Sabin AT, Eddins DA, Wright BA (May 2012). "Perceptual learning evidence for tuning to spectrotemporal modulation in the human auditory system". The Journal of Neuroscience. 32 (19): 6542–9. doi:10.1523/JNEUROSCI.5732-11.2012. PMC 3519395. PMID 22573676.
  231. ^ Joosten ER, Shamma SA, Lorenzi C, Neri P (July 2016). "Dynamic Reweighting of Auditory Modulation Filters". PLOS Computational Biology. 12 (7): e1005019. Bibcode:2016PLSCB..12E5019J. doi:10.1371/journal.pcbi.1005019. PMC 4939963. PMID 27398600.
  232. ^ Aizawa N, Eggermont JJ (March 2006). "Effects of noise-induced hearing loss at young age on voice onset time and gap-in-noise representations in adult cat primary auditory cortex". Journal of the Association for Research in Otolaryngology. 7 (1): 71–81. doi:10.1007/s10162-005-0026-3. PMC 2504589. PMID 16408166.
  233. ^ Rosen MJ, Sarro EC, Kelly JB, Sanes DH (2012-07-26). "Diminished behavioral and neural sensitivity to sound modulation is associated with moderate developmental hearing loss". PLOS One. 7 (7): e41514. Bibcode:2012PLoSO...741514R. doi:10.1371/journal.pone.0041514. PMC 3406049. PMID 22848517.
  234. ^ Caras ML, Sanes DH (July 2015). "Sustained Perceptual Deficits from Transient Sensory Deprivation". The Journal of Neuroscience. 35 (30): 10831–42. doi:10.1523/JNEUROSCI.0837-15.2015. PMC 4518056. PMID 26224865.
  235. ^ Zhou X, Merzenich MM (May 2012). "Environmental noise exposure degrades normal listening processes". Nature Communications. 3: 843. Bibcode:2012NatCo...3..843Z. doi:10.1038/ncomms1849. PMID 22588305.
  236. ^ Bao S, Chang EF, Woods J, Merzenich MM (September 2004). "Temporal plasticity in the primary auditory cortex induced by operant perceptual learning". Nature Neuroscience. 7 (9): 974–81. doi:10.1038/nn1293. PMID 15286790.
  237. ^ Kilgard MP, Merzenich MM (December 1998). "Plasticity of temporal information processing in the primary auditory cortex". Nature Neuroscience. 1 (8): 727–31. doi:10.1038/3729. PMC 2948964. PMID 10196590.
  238. ^ Anderson S, White-Schwoch T, Choi HJ, Kraus N (2013). "Training changes processing of speech cues in older adults with hearing loss". Frontiers in Systems Neuroscience. 7: 97. doi:10.3389/fnsys.2013.00097. PMC 3842592. PMID 24348347.
  239. ^ Wong PC, Skoe E, Russo NM, Dees T, Kraus N (April 2007). "Musical experience shapes human brainstem encoding of linguistic pitch patterns". Nature Neuroscience. 10 (4): 420–2. doi:10.1038/nn1872. PMC 4508274. PMID 17351633.
  240. ^ Perez E, McCormack A, Edmonds BA (2014). "Sensitivity to temporal fine structure and hearing-aid outcomes in older adults". Frontiers in Neuroscience. 8: 7. doi:10.3389/fnins.2014.00007. PMC 3914396. PMID 24550769.
  241. ^ Rönnberg J, Lunner T, Ng EH, Lidestam B, Zekveld AA, Sörqvist P, Lyxell B, Träff U, Yumba W, Classon E, Hällgren M, Larsby B, Signoret C, Pichora-Fuller MK, Rudner M, Danielsson H, Stenfelt S (November 2016). "Hearing impairment, cognition and speech understanding: exploratory factor analyses of a comprehensive test battery for a group of hearing aid users, the n200 study". International Journal of Audiology. 55 (11): 623–42. doi:10.1080/14992027.2016.1219775. PMC 5044772. PMID 27589015.
  242. ^ Moore BC, Sęk A (September 2016). "Preferred Compression Speed for Speech and Music and Its Relationship to Sensitivity to Temporal Fine Structure". Trends in Hearing. 20: 2331216516640486. doi:10.1177/2331216516640486. PMC 5017572. PMID 27604778.
  243. ^ Bernstein JG, Danielsson H, Hällgren M, Stenfelt S, Rönnberg J, Lunner T (November 2016). "Spectrotemporal Modulation Sensitivity as a Predictor of Speech-Reception Performance in Noise With Hearing Aids". Trends in Hearing. 20: 2331216516670387. doi:10.1177/2331216516670387. PMC 5098798. PMID 27815546.
  244. ^ Sęk A, Moore BC (January 2012). "Implementation of two tests for measuring sensitivity to temporal fine structure". International Journal of Audiology. 51 (1): 58–63. doi:10.3109/14992027.2011.605808. PMID 22050366.
  245. ^ Moore BC, Vickers DA, Mehta A (October 2012). "The effects of age on temporal fine structure sensitivity in monaural and binaural conditions". International Journal of Audiology. 51 (10): 715–21. doi:10.3109/14992027.2012.690079. PMID 22998412.
  246. ^ Füllgrabe C (December 2013). "Age-dependent changes in temporal-fine-structure processing in the absence of peripheral hearing loss". American Journal of Audiology. 22 (2): 313–5. doi:10.1044/1059-0889(2013/12-0070). PMID 23975124.
  247. ^ Santurette S, Dau T (April 2012). "Relating binaural pitch perception to the individual listener's auditory profile" (PDF). The Journal of the Acoustical Society of America. 131 (4): 2968–86. Bibcode:2012ASAJ..131.2968S. doi:10.1121/1.3689554. PMID 22501074.
  248. ^ Hopkins K, Moore BC (December 2010). "Development of a fast method for measuring sensitivity to temporal fine structure information at low frequencies". International Journal of Audiology. 49 (12): 940–6. doi:10.3109/14992027.2010.512613. PMID 20874427.
  249. ^ Füllgrabe C, Harland AJ, Sęk AP, Moore BC (December 2017). "Development of a method for determining binaural sensitivity to temporal fine structure" (PDF). International Journal of Audiology. 56 (12): 926–935. doi:10.1080/14992027.2017.1366078. PMID 28859494.
  250. ^ Füllgrabe C, Moore BC (January 2017). "Evaluation of a Method for Determining Binaural Sensitivity to Temporal Fine Structure (TFS-AF Test) for Older Listeners With Normal and Impaired Low-Frequency Hearing". Trends in Hearing. 21: 2331216517737230. doi:10.1177/2331216517737230. PMC 5669320. PMID 29090641.
  251. ^ a b Kates JM, Arehart KH (April 2005). "Coherence and the speech intelligibility index". The Journal of the Acoustical Society of America. 117 (4 Pt 1): 2224–37. Bibcode:2005ASAJ..117.2224K. doi:10.1121/1.1862575. PMID 15898663.
  252. ^ Arehart KH, Kates JM, Anderson MC (June 2010). "Effects of noise, nonlinear processing, and linear filtering on perceived speech quality". Ear and Hearing. 31 (3): 420–36. doi:10.1097/AUD.0b013e3181d3d4f3. PMID 20440116.
  253. ^ a b c Taal CH, Hendriks RC, Heusdens R, Jensen J (September 2011). "An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech". IEEE Transactions on Audio, Speech, and Language Processing. 19 (7): 2125–2136. doi:10.1109/tasl.2011.2114881.
  254. ^ Croghan NB, Arehart KH, Kates JM (September 2014). "Music preferences with hearing aids: effects of signal properties, compression settings, and listener characteristics". Ear and Hearing. 35 (5): e170–84. doi:10.1097/AUD.0000000000000056. PMID 25010635.
  255. ^ a b Arehart K, Souza P, Kates J, Lunner T, Pedersen MS (2015). "Relationship Among Signal Fidelity, Hearing Loss, and Working Memory for Digital Noise Suppression". Ear and Hearing. 36 (5): 505–16. doi:10.1097/aud.0000000000000173. PMC 4549215. PMID 25985016.
  256. ^ Carter G, Knapp C, Nuttall A (August 1973). "Estimation of the magnitude-squared coherence function via overlapped fast Fourier transform processing". IEEE Transactions on Audio and Electroacoustics. 21 (4): 337–344. doi:10.1109/TAU.1973.1162496.
  257. ^ Arehart KH, Kates JM, Anderson MC, Harvey LO (August 2007). "Effects of noise and distortion on speech quality judgments in normal-hearing and hearing-impaired listeners". The Journal of the Acoustical Society of America. 122 (2): 1150–64. Bibcode:2007ASAJ..122.1150A. doi:10.1121/1.2754061. PMID 17672661.
  258. ^ Tan CT, Moore BC (May 2008). "Perception of nonlinear distortion by hearing-impaired people". International Journal of Audiology. 47 (5): 246–56. doi:10.1080/14992020801945493. PMID 18465409.
  259. ^ Houtgast T, Steeneken HJM (March 1985). "A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria". The Journal of the Acoustical Society of America. 77 (3): 1069–1077. Bibcode:1985ASAJ...77.1069H. doi:10.1121/1.392224.
  260. ^ Hohmann V, Kollmeier B (February 1995). "The effect of multichannel dynamic compression on speech intelligibility". The Journal of the Acoustical Society of America. 97 (2): 1191–5. Bibcode:1995ASAJ...97.1191H. doi:10.1121/1.413092. PMID 7876441.
  261. ^ Goldsworthy RL, Greenberg JE (December 2004). "Analysis of speech-based Speech Transmission Index methods with implications for nonlinear operations". The Journal of the Acoustical Society of America. 116 (6): 3679–89. Bibcode:2004ASAJ..116.3679G. doi:10.1121/1.1804628. PMID 15658718.
  262. ^ Ludvigsen C, Elberling C, Keidser G, Poulsen T (1990). "Prediction of intelligibility of non-linearly processed speech". Acta Oto-Laryngologica. Supplementum. 469: 190–5. doi:10.1080/00016489.1990.12088428. PMID 2356726.
  263. ^ a b c Kates JM, Arehart KH (November 2014). "The Hearing-Aid Speech Perception Index (HASPI)". Speech Communication. 65: 75–93. doi:10.1016/j.specom.2014.06.002.
  264. ^ Chi T, Gao Y, Guyton MC, Ru P, Shamma S (November 1999). "Spectro-temporal modulation transfer functions and speech intelligibility". The Journal of the Acoustical Society of America. 106 (5): 2719–32. Bibcode:1999ASAJ..106.2719C. doi:10.1121/1.428100. hdl:1903/6121. PMID 10573888.
  265. ^ Huber R, Kollmeier B (November 2006). "PEMO-Q—A New Method for Objective Audio Quality Assessment Using a Model of Auditory Perception". IEEE Transactions on Audio, Speech and Language Processing. 14 (6): 1902–1911. doi:10.1109/TASL.2006.883259.
  266. ^ Huber R, Parsa V, Scollie S (2014-11-17). "Predicting the perceived sound quality of frequency-compressed speech". PLOS One. 9 (11): e110260. Bibcode:2014PLoSO...9k0260H. doi:10.1371/journal.pone.0110260. PMC 4234248. PMID 25402456.
  267. ^ a b Kates J, Arehart K (2014-03-20). "The Hearing-Aid Speech Quality Index (HASQI) Version 2". Journal of the Audio Engineering Society. 62 (3): 99–117. doi:10.17743/jaes.2014.0006. ISSN 1549-4950.
  268. ^ Kates J, Arehart K (2014-03-20). "The Hearing-Aid Speech Quality Index (HASQI) Version 2". Journal of the Audio Engineering Society. 62 (3): 99–117. doi:10.17743/jaes.2014.0006.
  269. ^ Kates JM, Arehart KH (February 2016). "The Hearing-Aid Audio Quality Index (HAAQI)". IEEE/ACM Transactions on Audio, Speech, and Language Processing. 24 (2): 354–365. doi:10.1109/taslp.2015.2507858. PMC 4849486. PMID 27135042.
  270. ^ Kates J (2013). "An auditory model for intelligibility and quality predictions". Acoustical Society of America Journal. Proceedings of Meetings on Acoustics. ASA. 133 (5): 050184. Bibcode:2013ASAJ..133.3560K. doi:10.1121/1.4799223.