New Audio Cue Improved Depth Perception Only for Some Adults

TL;DR: A 2026 iScience study found that after about 1 hour of training, adults could use a new audio cue for depth judgments, but only about half combined it with a familiar visual cue well enough to improve perceptual precision.

Key Findings

  1. A total of 78 adults completed the depth task after training on an artificial link between sound pitch and distance.
  2. Familiar visual cues combined strongly: 52 of 70 participants, or 74%, improved precision when disparity and size cues appeared together.
  3. The new audio cue was more variable: 31 of 63 participants, or 49%, improved precision when audio and disparity cues appeared together.
  4. Reliability weighting still worked: 91% of participants shifted weight toward the audio cue when the visual disparity cue became noisier.
  5. The pattern was stable: repeat testing suggested these differences reflected person-to-person variation more than random measurement noise.

Source: Scheller et al. tested how quickly adults could fold a newly learned audio signal into depth perception, comparing it with the familiar visual pairing of binocular disparity and object size.

Adults Learned the Audio-Depth Mapping Quickly

Human depth perception normally uses multiple familiar cues. Binocular disparity, the difference between the two eyes’ views, and size cues, the way apparent object size changes with distance, are classic examples.

The researchers asked whether a newly learned cue could join that system after brief training. In this experiment, the new cue was sound pitch.

For some participants, pitch increased with simulated distance. For others, pitch decreased with simulated distance.

The mapping direction was counterbalanced so the result would not depend on a natural pitch-distance association. The training lasted about 1 hour and included 336 trials, giving participants a short but explicit chance to learn the artificial relationship.

After training, participants completed two-interval forced-choice judgments. A square built from a random-dot stereogram appeared twice, each time at a simulated depth, and participants reported whether the second presentation was closer or farther away.

  • Familiar-familiar condition: disparity and size cues were tested together.
  • Familiar-novel condition: disparity and the newly learned audio cue were tested together.
  • Single-cue trials: each cue was also tested alone so the researchers could estimate the precision of each source of information.
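
The two-interval forced-choice task described above can be sketched with a simple signal-detection model. This is a minimal illustration, not the study's analysis code. It assumes each interval yields a depth estimate with independent Gaussian noise of standard deviation sigma; the function name p_report_farther and the arbitrary depth units are hypothetical.

```python
import math


def p_report_farther(depth_difference, sigma):
    """Probability of judging the second interval as farther than the first.

    Assumption: each interval's depth estimate carries independent Gaussian
    noise with standard deviation `sigma`, so the difference between the two
    estimates has standard deviation sqrt(2) * sigma.
    """
    z = depth_difference / (math.sqrt(2.0) * sigma)
    # Standard normal cumulative distribution, written with math.erf.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Illustrative values only: a less noisy cue gives a steeper psychometric curve.
for sigma in (0.5, 1.0, 2.0):
    print(sigma, round(p_report_farther(1.0, sigma), 3))
```

Under this model, a smaller sigma means higher precision, which is the quantity the single-cue and combined-cue trials are meant to estimate.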

Familiar Visual Cues Improved Precision More Reliably

The main question was not whether participants noticed the sound. The stricter question was whether the new audio cue improved perceptual precision when paired with a familiar visual cue.

For familiar visual cues, the precision benefit was clear. When disparity and size appeared together, sensory noise dropped relative to the best single cue, and the combined performance did not differ significantly from an optimal-combination prediction.
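
The article does not spell out the optimal-combination prediction. A common choice in the cue-combination literature is the maximum-likelihood model for independent Gaussian cues, and the sketch below assumes that model. The function names and the example noise values are hypothetical and are not taken from the study.

```python
import math


def optimal_combined_sigma(sigma_a, sigma_b):
    """Predicted noise when two independent Gaussian cues are combined optimally.

    Assumption: standard maximum-likelihood cue combination, where the
    combined variance is (var_a * var_b) / (var_a + var_b).
    """
    var_a, var_b = sigma_a ** 2, sigma_b ** 2
    return math.sqrt(var_a * var_b / (var_a + var_b))


def precision_gain(sigma_a, sigma_b, sigma_combined):
    """Positive when measured combined noise is below the best single cue."""
    return min(sigma_a, sigma_b) - sigma_combined


# Made-up single-cue noise levels, in arbitrary depth units.
sigma_disparity, sigma_size = 3.0, 4.0
predicted = optimal_combined_sigma(sigma_disparity, sigma_size)
print(predicted)  # approximately 2.4, below the best single cue (3.0)
print(precision_gain(sigma_disparity, sigma_size, predicted))
```

In this framing, a participant shows a precision benefit when the measured combined noise falls below the best single-cue noise, and the measured value can then be compared with the predicted one.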

At the participant level, 52 of 70 people showed a precision benefit from the two familiar visual cues. That equals 74% of the familiar-cue analysis sample.

The new audio cue showed a different pattern. At the group level, combining disparity and audio did not significantly reduce sensory noise compared with the best single cue, and performance differed from the optimal-combination prediction.

Individual results mattered, though: 31 of 63 participants, or 49%, showed a positive precision gain when the audio cue and disparity cue appeared together. The other half did not gain from the paired cues.

  • Familiar visual pair: most people combined the cues well enough to improve precision.
  • Audio-visual pair: about half improved, while the other half behaved as if they were relying on one cue or switching strategies.
  • Mapping direction: the benefit did not depend on whether pitch increased or decreased with simulated distance.

Figure: Comparison of cue combination, reliability weighting, and incongruence sensitivity for familiar visual cues versus a new audio cue. After short training, familiar visual cues produced a clearer precision benefit than the newly learned audio cue, even though most participants still re-weighted the audio cue by reliability.

Reliability Weighting Worked Even for the New Cue

The audio cue did not simply fail. Participants used it in other computationally organized ways.

Researchers tested re-weighting, meaning whether participants shifted trust away from a cue when it became less reliable. When noise was added to the disparity cue, participants should lean more on the other cue if they are using reliability information.

That happened for both cue pairs:

  • Disparity plus size: when disparity became noisier, 56 of 60 participants, or 93%, shifted weight toward the size cue.
  • Disparity plus audio: when disparity became noisier, 50 of 55 participants, or 91%, shifted weight toward the newly learned sound cue.

The empirical weight shifts also did not differ significantly from optimal predictions. In practical terms, many participants treated the new sound as a reliability-weighted source of depth information even when it did not consistently improve final precision.
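
The optimal weight predictions mentioned above are not given as formulas in this article. A minimal sketch, assuming the standard model in which each cue is weighted by its inverse variance, shows why adding noise to disparity should move weight toward the other cue. The function name and the noise values are hypothetical.

```python
def reliability_weights(sigma_a, sigma_b):
    """Return (weight_a, weight_b) proportional to each cue's reliability.

    Assumption: reliability is the inverse variance, 1 / sigma**2, and the
    two weights are normalized so they sum to 1.
    """
    rel_a = 1.0 / sigma_a ** 2
    rel_b = 1.0 / sigma_b ** 2
    total = rel_a + rel_b
    return rel_a / total, rel_b / total


# Made-up noise levels: disparity starts out more reliable than the audio cue.
w_disparity, w_audio = reliability_weights(1.0, 2.0)
print(w_disparity, w_audio)  # disparity gets most of the weight

# After noise is added to disparity, the audio cue's weight rises.
w_disparity_noisy, w_audio_noisy = reliability_weights(2.0, 2.0)
print(w_disparity_noisy, w_audio_noisy)
```

A shift like the one from the first call to the second is the re-weighting signature the study looked for in each participant.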

  1. Cue combination: asks whether two cues reduce depth-judgment noise compared with the best single cue.
  2. Reliability weighting: asks whether the brain trusts the less noisy cue more.
  3. Congruence sensitivity: asks whether performance worsens when the learned cue relationship is reversed.

Incongruent Cue Pairings Reduced Performance

The third marker was congruence sensitivity. If participants learned the cue relationship, reversing that relationship should make judgments less precise.

That is what the study found. Incongruent pairings increased sensory noise for both cue pairs. For the familiar visual pair, 57 of 59 participants, or 97%, showed reduced precision when the size-disparity relationship was reversed.

For the familiar-new cue pair, 50 of 59 participants, or 85%, showed reduced precision when the learned audio-disparity relationship was reversed. The audio cue was therefore meaningful to most participants, even when it did not reliably produce an additive precision gain.

The new cue was learned, weighted, and detected as congruent or incongruent. Full integration into native depth perception was the step that varied most across people.
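
The participant-level percentages reported for the three markers can be re-derived from the counts given in the text. The snippet below is only an arithmetic check; the dictionary labels are shorthand introduced here, not terms from the study.

```python
# (participants showing the effect, participants analyzed), as reported above.
reported_counts = {
    "combination benefit, disparity + size": (52, 70),
    "combination benefit, disparity + audio": (31, 63),
    "re-weighting, disparity + size": (56, 60),
    "re-weighting, disparity + audio": (50, 55),
    "incongruence cost, disparity + size": (57, 59),
    "incongruence cost, disparity + audio": (50, 59),
}


def percent(count, total):
    """Share of participants as a whole-number percentage."""
    return round(100 * count / total)


for label, (count, total) in reported_counts.items():
    print(f"{label}: {count}/{total} = {percent(count, total)}%")
```

Laid out this way, the audio pair comes close to the familiar pair on re-weighting (91% versus 93%), trails on incongruence cost (85% versus 97%), and falls furthest behind on combination benefit (49% versus 74%).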

Individual Differences Were Part of the Result

Large person-to-person differences might have been random noise, so the researchers repeated key measures across sessions. There was no additional training between the repeated sessions.

Sensory-noise estimates were significantly repeatable across single-cue and combined-cue conditions. Combination benefits also showed repeatability, although those estimates were interpreted more cautiously because some mixed-effect models had convergence problems.

The overall pattern still pointed one way: the variability reflected stable individual differences more than random measurement error. Some people appeared able to combine the newly learned audio cue with vision quickly; others did not.

  • Application boundary: sensory-substitution and sensory-augmentation devices need training plans that account for individual perceptual flexibility.
  • Training boundary: 1 hour supported mapping and reliability weighting, but not full cue integration in everyone.
  • Research boundary: future studies need enough power to test individual-level cue combination rather than relying only on group averages.

What This Means for Sensory Augmentation

Sensory-substitution tools often try to translate one kind of information into another signal, such as turning spatial information into sound or touch. This study suggests that learning the code is not the only bottleneck.

Participants used a sound-distance mapping after short training. Many also adjusted cue weights when visual reliability changed.

The harder step was making the new cue improve precision in the same way familiar visual cues did.

Main limitation: this was a controlled psychophysics experiment in young healthy adults, not a clinical test of an assistive device. The cue was also auditory, and other modalities or longer training schedules may produce different integration patterns.

A new sensory cue became usable quickly in this task, but full integration into perception still depended on the person, the training period, or both.

Citation: Scheller et al. Learning new perceptual skills: Individual differences in the computations that integrate novel sensory cues into depth perception. iScience. 2026;29:115526. DOI: 10.1016/j.isci.2026.115526.

Study Design: Psychophysics experiment comparing familiar-familiar and familiar-novel cue integration during depth judgments.

Sample Size: 78 healthy adults completed the task; cue-pair analyses used smaller samples after pre-specified exclusions.

Key Statistic: 52 of 70 participants combined familiar visual cues with a precision benefit, compared with 31 of 63 for the newly learned audio-visual cue pair.

Caveat: The study tested short-term learning in a controlled depth-perception task, so it does not show how much training real sensory-augmentation devices would require.

Brain ASAP