How rising and decaying sound profiles shape auditory perception

In humans, sounds that increase in intensity over time (up-ramp) are perceived as louder than down-ramping sounds. Here, we show that in mice this bias exists as well and is reflected in the complex nonlinearities of auditory cortex activity.

HFSP Career Development Award holder Brice Bathellier and colleagues
authored on Fri, 16 September 2016

It is well recognized that sound perception relies on the frequency decomposition of acoustic waves by the auditory system. The frequency spectrum is, however, not the only characteristic that influences perception and identification of sounds. Psychophysical experiments in audition have shown that temporal features, i.e. the sequence of intensity and frequency variations, are also crucial determinants of perception, not only for sound localization but also for identification. For example, the recognition of musical instruments by humans strongly depends on the time-intensity profile of the tones and is strongly impaired by time-reversal of the waveform.

Figure: Two-photon calcium imaging of the mouse auditory cortex (top) shows a surprising asymmetry between the neuronal population responses to up- versus down-ramping sounds.

Even for simple percepts, such as loudness, temporal features play an important role. Several psychophysical experiments have shown that sounds whose intensity ramps up over time (up-ramps) are perceived overall as louder, or as changing more in loudness, than their time-reversed counterparts (down-ramps). It was proposed that this asymmetry helps to emphasize approaching sound sources relative to receding ones, with potential importance for predator detection. However, the mechanisms underlying this well-characterized effect have so far been poorly understood.

In this report, we recorded the activity of thousands of neurons in the mouse auditory cortex using two-photon calcium imaging while presenting up- and down-ramping sounds to the awake animal. Combined with behavioral assays, these measurements showed that the positive bias for up-ramping sounds over down-ramping sounds is also present in mice at both the cortical and perceptual levels, indicating a remarkably general property of the mammalian auditory system.

Thanks to the exceptionally large size of the collected neuronal sample, we could precisely analyze how the cortex encodes the intensity modulations of the sounds, and demonstrate that this bias is the result of profound computational nonlinearities which go beyond sensory adaptation mechanisms. In particular, we showed that different ensembles of neurons specifically detect a variety of features of the time course and magnitude of the intensity modulations, in addition to being specific to the frequency of the sound. Features of the up-ramps (e.g. low-magnitude onsets and high-magnitude offsets) were more often represented, which explains the observed activity asymmetry in favor of up-ramps at the level of the entire auditory cortex.

Using neural modeling, we also show that computationally implementing such an asymmetric representation of up- and down-ramping sounds requires a succession of at least two linear operations, each of them followed by one nonlinear operation (e.g. cutting out activity below a threshold). With such a “multilayer” architecture, as compared to a simpler one (one linear operation followed by a nonlinearity, as traditionally assumed), the representations of up- and down-ramps are better separated and thus can generate very distinct percepts based only on the intensity modulation profile of the sound, as we in fact experience every day. These results point towards the importance of “multilayer” computational architectures, now heavily used in artificial intelligence (deep learning), to reproduce and interpret even relatively simple properties of our perceptual systems.
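To make the idea of "two linear operations, each followed by a nonlinearity" concrete, here is a minimal Python sketch. It is not the model used in the study: the filter kernels, the weights in `W2`, the thresholds and the function names are all hypothetical choices made for illustration, and the numbers it prints are not meant to reproduce the measured bias. It only shows how a second linear-nonlinear stage can respond to conjunctions of features (for instance, an abrupt onset that coincides with high intensity) that depend on the direction of the ramp.

```python
import numpy as np

def relu(x, threshold):
    """Nonlinearity: cut out activity below a threshold."""
    return np.maximum(x - threshold, 0.0)

# Hypothetical causal temporal filters (first linear operation):
# instantaneous intensity, intensity increase, intensity decrease.
KERNELS = [np.array([1.0]), np.array([1.0, -1.0]), np.array([-1.0, 1.0])]

# Hypothetical second-layer weights: each output unit sums two layer-1 units.
W2 = np.array([[1.0, 0.0, 1.0],   # intensity + decrease
               [1.0, 1.0, 0.0]])  # intensity + increase

def layer1(envelope, threshold=0.1):
    """First stage: linear temporal filtering, then thresholding."""
    filtered = np.vstack(
        [np.convolve(envelope, k)[:envelope.size] for k in KERNELS]
    )
    return relu(filtered, threshold)

def two_layer_response(envelope, threshold=0.5):
    """Second stage on top of the first; returns time-summed activity per unit."""
    hidden = layer1(envelope)
    return relu(W2 @ hidden, threshold).sum(axis=1)

# Up-ramp and its time-reversed down-ramp, each followed by a short silence
# so that the abrupt offset of the up-ramp is part of the stimulus.
ramp = np.linspace(0.0, 1.0, 50)
silence = np.zeros(10)
up = np.concatenate([ramp, silence])
down = np.concatenate([ramp[::-1], silence])

print("one layer,  up:  ", layer1(up).sum(axis=1))
print("one layer,  down:", layer1(down).sum(axis=1))
print("two layers, up:  ", two_layer_response(up))
print("two layers, down:", two_layer_response(down))
```

In this toy setting the two ramps contain exactly the same set of intensity values, only in reverse order, yet the second-stage units respond differently to them because they combine thresholded features that co-occur at different moments in each sound.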

Reference

Temporal asymmetries in auditory coding and perception reflect multi-layered nonlinearities. Deneux T, Kempf A, Daret A, Ponsot E, Bathellier B. Nat Commun. 2016, 7:12682. doi: 10.1038/ncomms12682.
