A great deal of research effort has been put in exploring crossmodal correspondences in the field of cognitive science which refer to the systematic associations frequently made between different sensory modalities (e.g. high pitch is matched with angular shapes). However, the possibilities cross-modality opens in the digital world have been relatively unexplored. Therefore, we consider that studying the plasticity and the effects of crossmodal correspondences in a mulsemedia setup can bring novel insights about improving the human-computer dialogue and experience. Mulsemedia refers to the combination of three or more senses to create immersive experiences. In our experiments, users were shown six video clips associated with certain visual features based on color, brightness, and shape. We examined if the pairing with crossmodal matching sound and the corresponding auto-generated haptic effect, and smell would lead to an enhanced user QoE. For this, we used an eye-tracking device as well as a heart rate monitor wristband to capture users’ eye gaze and heart rate whilst they were experiencing mulsemedia. After each video clip, we asked the users to complete an on-screen questionnaire with a set of questions related to smell, sound and haptic effects targeting their enjoyment and perception of the experiment. Accordingly, the eye gaze and heart rate results showed significant influence of the cross-modally mapped multisensorial effects on the users’ QoE. Our results highlight that when the olfactory content is crossmodally congruent with the visual content, the visual attention of the users seems shifted towards the correspondent visual feature. Crosmodally matched media is also shown to result in an enhanced QoE compared to a video only condition.