The superior temporal sulcus (STS) and gyrus (STG) are commonly identified as functionally relevant for multisensory integration of audiovisual (AV) stimuli. However, most neuroimaging studies of AV integration have used stimuli of short duration in explicit evaluative tasks, whereas many of our AV experiences are long and ambiguous. It is unclear whether the enhanced activity in auditory, visual, and AV brain areas is also synchronised over time across subjects when they are exposed to such multisensory stimuli. We used intersubject correlation to investigate which brain areas are synchronised across novices for uni- and multisensory versions of a 6-min 26-s unedited recording of an unfamiliar Indian dance (Bharatanatyam). In Bharatanatyam, music and dance are choreographed together so that each modality depends closely on the other. Activity in the middle and posterior STG was significantly correlated between subjects and also showed significant enhancement for AV integration when the functional magnetic resonance signals were contrasted against each other using a general linear model conjunction analysis. These results extend previous studies by showing an intermediate step of synchronisation for novices: while there was a consensus across subjects’ brain activity in areas relevant for unisensory processing and AV integration of related audio and visual stimuli, we found no evidence for synchronisation of higher-level cognitive processes, suggesting that these were idiosyncratic.
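To illustrate what intersubject correlation measures, the following is a minimal sketch in plain Python. It assumes the pairwise variant, in which the Pearson correlation of one voxel's time series is computed for every pair of subjects and then averaged; the abstract does not state which variant was used. The function names `pearson` and `pairwise_isc` and the toy data are hypothetical and are not taken from the study.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("time series must have equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return float("nan")  # a constant series has no defined correlation
    return sum(a * b for a, b in zip(dx, dy)) / denom


def pairwise_isc(series_by_subject):
    """Mean Pearson r over all subject pairs for a single voxel.

    Assumption: pairwise averaging; a leave-one-out scheme is another
    common choice and would give different values.
    """
    if len(series_by_subject) < 2:
        raise ValueError("need at least two subjects")
    rs = [pearson(a, b) for a, b in combinations(series_by_subject, 2)]
    return sum(rs) / len(rs)


# Toy data (hypothetical): three subjects, one voxel, six time points.
shared = [0.0, 1.0, 0.5, -0.5, -1.0, 0.2]
subjects = [
    shared,
    [2 * v + 1 for v in shared],  # same time course, rescaled
    [v + 0.3 for v in shared],    # same time course, shifted
]
isc = pairwise_isc(subjects)
```

Because the three toy time courses differ only by scale and offset, the resulting value is close to 1; in real data, a voxel driven by idiosyncratic processing would yield a value near 0. Repeating the computation for every voxel gives a whole-brain map that can then be tested for significance.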