TY - JOUR
T1 - Spatial alignment between faces and voices improves selective attention to audio-visual speech
AU - Fleming, Justin T.
AU - Maddox, Ross K.
AU - Shinn-Cunningham, Barbara G.
N1 - Publisher Copyright: © 2021 Author(s).
PY - 2021/10/1
Y1 - 2021/10/1
AB - The ability to see a talker's face improves speech intelligibility in noise, provided that the auditory and visual speech signals are approximately aligned in time. However, the importance of spatial alignment between corresponding faces and voices remains unresolved, particularly in multi-talker environments. In a series of online experiments, we investigated this using a task that required participants to selectively attend a target talker in noise while ignoring a distractor talker. In experiment 1, we found improved task performance when the talkers' faces were visible, but only when corresponding faces and voices were presented in the same hemifield (spatially aligned). In experiment 2, we tested for possible influences of eye position on this result. In auditory-only conditions, directing gaze toward the distractor voice reduced performance, but this effect could not fully explain the cost of audio-visual (AV) spatial misalignment. Lowering the signal-to-noise ratio (SNR) of the speech from +4 to -4 dB increased the magnitude of the AV spatial alignment effect (experiment 3), but accurate closed-set lipreading caused a floor effect that influenced results at lower SNRs (experiment 4). Taken together, these results demonstrate that spatial alignment between faces and voices contributes to the ability to selectively attend AV speech.
UR - http://www.scopus.com/inward/record.url?scp=85118743664&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85118743664&partnerID=8YFLogxK
U2 - 10.1121/10.0006415
DO - 10.1121/10.0006415
M3 - Article
C2 - 34717460
AN - SCOPUS:85118743664
SN - 0001-4966
VL - 150
SP - 3085
EP - 3100
JO - Journal of the Acoustical Society of America
JF - Journal of the Acoustical Society of America
IS - 4
ER -