We report a series of experiments designed to assess the effect of audiovisual semantic congruency on the identification of visually presented pictures. Participants made unspeeded identification responses concerning a series of briefly presented, and then rapidly masked, pictures. A naturalistic sound was sometimes presented together with the picture at a stimulus onset asynchrony (SOA) that varied between 0 and 533 ms (auditory lagging). The sound could be semantically congruent, semantically incongruent, or neutral (white noise) with respect to the target picture. The results showed that when the picture and sound had simultaneous onsets, a semantically congruent sound improved, whereas a semantically incongruent sound impaired, participants' picture identification performance, as compared to performance in the white-noise control condition. A significant facilitatory effect was also observed at SOAs of around 300 ms, whereas no such semantic congruency effects were observed at the longest interval (533 ms). These results suggest that the neural representations associated with visual and auditory stimuli can interact in a shared semantic system. Furthermore, this crossmodal semantic interaction is not constrained by the need for strict temporal coincidence of the constituent auditory and visual stimuli. We therefore suggest that audiovisual semantic interactions likely occur in a short-term buffer which rapidly accesses, and temporarily retains, the semantic representations of multisensory stimuli in order to form a coherent multisensory object representation. These results are explained in terms of Potter's (1993) notion of conceptual short-term memory.

Type

Journal article

Pages

389 - 404

Keywords

Adult; Animals; Dogs; Female; Humans; Male; Perceptual Masking; Semantics; Signal Detection, Psychological; Vocalization, Animal