Making audience experiences more meaningful and emotionally engaging through mixed visual and audio media
Bonner, Ramduny-Ellis and Peebles

Whereas interaction and delivery behaviour in a conventional lecture are generally predictable, this paper presents suggestions for alternative and innovative delivery methods that use different visual and auditory modes to create a ‘performance lecture’. A performance lecture is distinct from a conventional lecture in several ways. Its primary purpose is not simply to impart knowledge didactically, but to find ways of making spectatorship emotionally engaging. This paper first discusses the constituent parts of our evolving framework for performance lectures. We then review some initial visually-based demonstrator work, a 12-minute ‘triptych’ demonstrator video, followed by some preliminary analysis of a user evaluation study. Suggestions for further work conclude this paper.


INTRODUCTION
Whereas interaction and delivery behaviour in a conventional lecture are generally predictable, this paper presents suggestions for alternative and innovative delivery methods that use different visual and auditory modalities to create a 'performance lecture'. The first part of the paper explains our development approach, which uses a 'framework' to help guide what a performance lecture could be. This is followed by an account of some early development and evaluation work on a 'triptych' video, which begins to examine the effectiveness of spatially distributed display screens.

WHAT IS A PERFORMANCE LECTURE?
The traditional lecture has existed for well over 2000 years as a vehicle for sharing or imparting knowledge and remains the principal teaching and learning tool in nearly all educational institutions, despite research evidence indicating its ineffectiveness (Costin 1972; Bligh 1998) and its unpopularity with students (Maloney and Lally 1998; Sander, Stevenson et al. 2000). Recently, interactive technologies have been used to improve the learning experience (e.g. Shneiderman, Borkowski et al. 1998; Draper and Brown 2004), and lectures have been supplemented with e-learning and social networking environments (Mason and Rennie 2008). Despite attempts to replace the lecture with computer-based learning (Dewhurst, Macleod et al. 2000; Cramer, Collins et al. 2007) or to improve the popularity of the lecture format through conferences like 'TED' and 'Fame Lab', the conventional lecture remains fundamentally unchanged. The lecture usually embodies socially accepted rituals and structures, consisting, in the main, of an introduction leading to a series of arguments or facts and ending with a summary or conclusions. One of the key characteristics of a lecture is the single flow of knowledge or information from the speaker to the audience. If other forms of communication technology exist, they consist of supplementary visual aids such as text, static images and perhaps video.
This research begins to question this format by exploring what a lecture could become. The approach is not to supplement the lecture with more sophisticated communication tools but rather to begin a deeper examination of how the relationship between the lecturer and audience could evolve. It must also be stressed that our intention is not to replace the conventional lecture but to re-appraise it conceptually, with the eventual aim that some of the ideas might trickle down into conventional lectures. Artists have, for many centuries, questioned the relationship between art and the audience by evoking a wide range of emotional responses to their work. Performance artists often take this further by finding new ways to interactively engage with an audience. Artists, however, usually make aesthetic and not pedagogical judgements about their work. In this research our aim was to explore whether the structure and delivery of a lecture could be transformed 'performatively'. Would it be possible to build new 'representations' of action (Laurel 1993) by disseminating knowledge through artistic as well as conventional pedagogical means?
A performance lecture is therefore distinct from a conventional lecture. Its primary purpose is not simply to impart knowledge didactically, but to find alternative ways of making spectatorship engaging, particularly as an emotional experience in the same way an artist might do. Visual and audio technologies now allow all manner of possibilities to be explored, for example, recorded and live speech, animated text, spatially distributed soundscapes and music, combined with multiple video projections which can be simultaneously displayed on any surface. This paper sets out the first few steps in this interesting territory.

FRAMEWORK FOUNDATIONS
Much of the design and delivery of a performance lecture is still uncertain and unknown. To help the development process, we began by producing a development framework, which is briefly described here. Benford and Giannachi (2011) refer to the use of 'sensitizing concepts' as a means of providing guidance and analytical direction for exploratory fields of study. In their work, 'trajectories' have been used both to retrospectively re-examine artistic work in a 'mixed-reality' environment and as a tool for envisioning future work. A similar approach is taken here. The framework was produced as a structuring tool, to help foster new ideas while also providing development coherence.
We view a performance lecture as a hybrid artefact stemming from three disciplines: cognitive psychology, performance art and interaction design. Each field offers different viewpoints, particularly between art and science, supporting a gathering argument that these two 'engines of culture' need to re-combine (Wilson 2002). Seminal work from each field helped direct and shape the framework and provided many sources of inspiration and tools to assist in developing and evaluating performance lectures. Cognitive psychology supports development through theories related to listening, seeing, memory and emotional engagement. Interaction design offers analytical models such as activity theory, and related research integrating the arts and sciences, such as mixed reality installations and games. Performance art provides a historical context, creative inspiration and critical argument on the nature of performance.

Cognitive psychology
One of the important goals is to ensure that performance lectures engage audiences cognitively, drawing on recognised empirical and theoretical evidence from the field of cognitive psychology. For example, an understanding of the cognitive processing strategies for text and visual information is essential to ensure effective learning if these two forms of information are delivered concurrently (Schnotz 2002). Similarly, high-quality audio has a greater effect on the perceived quality of an interactive experience than its relative visual or screen representation (Reeves and Nass 1996). Existing research evidence is helping to determine how the delivery characteristics of a performance could be shaped. Too much multimedia learning is based on what the technology can do and not on how audiences learn within a multimedia environment (Mayer 1997).

Interaction design
Two areas of research have influenced our approach so far. First, Laurel (1993) used the analogy of theatre as an interface metaphor to help design user experiences. She defined 'interactivity' as 'the ability of humans to participate in actions in a representational context'. Laurel's mapping of the theatre to human-computer interaction is helping to define the rhetorical structure of a lecture. For example, one important element within a narrative is the use of 'causal inference': how one event is relevant or linked to another. Consideration therefore needs to be given to how explicitly or implicitly these relationships are made.
Second, activity theory (Bødker 1991) provides an analytical structure with which to design and subsequently evaluate a performance lecture. One of the key principles of activity theory is to bring together activity, set within a wider social and organisational context, and 'consciousness', which can be defined as the unification of attention, intention, memory, reasoning and speech located within everyday practice (Nardi 1996). An activity only occurs when a subject (which can be a person or group) is motivated to do something by an 'object'. The term 'object' is an important concept and has a different meaning from its typical understanding in interaction design. An object can be physical, such as a book, or it can be an idea or concept explained within a book, or it can be a shared understanding between individuals about a book. The object can also change during the process of activity. Clearly, explicitly identifying, designing and using 'objects' within a performance lecture has benefits in terms of thinking about the rhetorical structure, which is discussed later.

Performance art
Both video and audio have been used extensively by artists to evoke some form of emotional response. The film, music, TV and games industries are good examples. There is also a strong history of experimental performance art. For instance, Joan Jonas used video projections and mirrors facing the audience to make them more conscious of their spectator role; Laurie Anderson devised new instruments such as the tape-bow violin, a talking stick (a baton-like MIDI controller which produces and replicates sounds) and a voice filter to deepen her voice. Peter Greenaway's 'Tulse Luper Suitcases' is a series of digital media events and artefacts which collectively challenge the role of cinema as a passive linear event. Large music events such as rock concerts now frequently use a mixture of electronic, audio and visual artefacts to extend the shared spectator experience, for example Chris Milk's interactive spheres, which featured at a recent Arcade Fire concert.

FRAMEWORK PARAMETERS
Drawing knowledge and inspiration from these fields provides different analytical lenses, which help to maintain a balance that is neither artistically nor technologically centred but presents alternative ways of exploring the performer-audience relationship. By combining the emotional and the pedagogical journey, the intention is to offer different audience experiences oscillating between aesthetic appreciation and learning. This section offers potential creative design possibilities for performance lectures. These include the overall structure of the lecture in terms of its rhetorical construction, the employment and mix of delivery modalities, and issues around spectatorship and how this could be changed.
Rhetoric is about strategies for thinking, writing and speaking that communicate effectively. Given that a performance lecture is concerned with blending aesthetic and instructive experiences, greater emphasis and consideration need to be given to rhetorical parameters. In this way a lecture can be critically reappraised or overhauled, thus allowing greater expressive latitude. These communication parameters have been thought of in terms of space, time, delivery and engagement. Four parameters are offered:

• Modal expressions
• Spatial design
• Temporal flow
• Audience engagement and spectatorship
These parameters form the design palette by which a performance lecture can be creatively considered in terms of production and delivery. By using the framework in this way all aspects of a lecture can be reappraised using divergent thinking thus avoiding conventional lecture-design boundaries. Constraints or convergent thinking is imposed by framing these parameters through the disciplines of cognitive psychology and interaction design. Through this dual approach, new forms of expressive lectures can emerge. Each of these parameters is now examined in more detail with some suggestions of how they could be deployed.

Modal expressions
Oration is the most common modal expression in a lecture. Much has been written about how to improve presentation skills, so it is unnecessary to consider them here. Presentation style will only be considered when it has an impact on other aspects of rhetorical structure. What is more important here is the consideration of other modalities and how they might be mixed and integrated into a performance lecture. Music is an important, but very different, modal expression to speech. Music does not use a language structure in the same way as speech, as its component elements do not have collective, conventional meaning or understanding. Nevertheless, music has the ability to intensify or weaken emotional engagement and is capable of the 'blurring of self-awareness and the heightening of fellow-feeling' (Honing 2009). Music can evoke physiological as well as strong emotional responses in the listener. For instance, some studies have found evidence of a physiological link between sound frequencies and natural brainwave activity (Treasure 2007).
The key question here is how music can be used as a communication interface to supplement and augment a narrative grounded in cognitive theory and not purely as a pleasurable listening experience. Traditionally, music works in the 'Romantic' sense in that the emotions of the work attempt to be a literal transmission of the composer's emotional intent. Views differ on whether music can arouse commonly-understood emotions or can embody these emotions (Robinson 2005). Studies have found that listeners are aroused at the same point in music, for example during harmonic changes, but attribute different emotional responses to it.
Soundscapes are another potential channel of modal expression. The term was defined by Schafer (1977) and refers to an auditory landscape which can consist of natural and/or artificially created sounds. Schafer offers a classification of physical sound characteristics which relate to the listening experience. These are: perceived distance from the listener; estimated intensity of original sound; distinctiveness of the sound; the natural or manufactured texture of the sound; and the environmental influences on the sound such as reverb, echo or displacement. To be effective, soundscapes need to form a relationship with, and be relevant to, the listener. Soundscapes usually trigger an association with a specific place, event or even, nowadays, a product. Intel's five-tone melody is a good example of this. Sound therefore offers inferential possibilities, associative meaning and emotional engagement.
Static images such as pictures and diagrams play a very useful and important role in the lecture and often help the presenter to illustrate complex and conceptually abstract ideas in clear and concise ways. More recently, moving images such as animation or video have also been used to augment and supplement key points. Generally, the goal of using such media is instructionally oriented, that is, the intended purpose is to help the audience to grasp or appreciate the presenter's argument through visual means.
Our aim is to find alternative ways of using imagery to deepen the learning experience through emotional engagement. Video artists and cinematographers commonly explore this territory. In the late 19th century, Muybridge was one of the pioneers of recording motion using stop-motion photography. He regarded his work as artistic rather than scientific, placing greater importance on expressing movement than on accurate recording. Today, viewing behaviour 'around' the moving image through media such as TV and film is extremely sophisticated (Silverstone 1994). Viewers are capable of quickly suspending disbelief and of discerning between staged, real and artificial acts. The moving image is finding new expressions through projections onto walls and buildings as media architecture.
Artists are continually challenging our relationship with the moving image. Paul Sermon's work 'Telematic Dreaming' explores the relationship between actual and projected images of people. The telematic installation consists of two beds, each located in a different place. Each bed has a projector above it and a display screen showing the other bed; each bed permits one real occupant and one virtual occupant, who is projected from the other bed, thus allowing interaction between real and virtual occupants. The work explores how we behave and see ourselves within an intimate space which is occupied by a 'virtual' stranger. Sermon's work powerfully challenges the emotional sense of remote intimacy and social engagement.
The video image is now so commonplace and mundane that intermittent distraction and attention are quite natural, particularly with TV and cinema viewing. Consequently, it is difficult for video artists to immerse their audience within their work (Petersen 2010), a viewing habit which is at odds with the social conventions of an art gallery. Artists have to embrace or contend with distraction and non-attention. Petersen states, 'art must reconfigure itself and develop a more adequate framework for understanding contemporary conditions of cultural creation and reception'.
The mixing of modal expressions is also likely to be complex, as they can be combined and distributed in many permutations over time and space. There is evidence that mixed modalities can be highly beneficial in improving audience comprehension. Mayer (1997) argues that presenting information in more than one mode improves learning, although the conditions for significant improvement are quite specific. Other studies suggest that if human senses are brought into harmony one obtains a phenomenon of super-additivity where the combined effect is 'multiplicative, not additive' (Treasure 2007).
Related to this is the McGurk effect, an illusion in which a subject's comprehension of a spoken syllable is overridden when the subject is observing the talker's articulation. In other words, the spoken sound 'v' can be perceived to sound like 'b' if the listener can see the talker's mouth. At a neurological level this is also known as multisensory integration, where the neurological response is faster if different senses are stimulated at approximately the same time and place (Holmes and Spence 2005). If a sound and a movement appear to derive from the same place and time, the response is faster than if a single stimulus were received.
Thus, research in cognition combined with artistic endeavours offers opportunities to examine and explore new ways of engaging an audience.

Spatial design
Although large visual displays are increasingly being used in large public spaces, little research on spatial design has been carried out. Interactive displays are a different matter. Attention has been given to understanding how to engage people in closed public events, for example, how they move through different phases of awareness (Brignull and Rogers 2003). In the entertainment industry, the use of multi-screen displays is becoming increasingly popular. The cinematographer Eric Scott has produced a multi-screen experience for the music group 'The Hives' for their 'Tick Tick Boom' tour. However, studies related to how viewers cognitively process and interpret spatially distributed display information were not found.
In contrast, location-based audio has been used effectively in learning environments by FitzGerald, Sharples et al. (2011), who offer guidelines for the organisation of audio content, particularly in relation to placement and movement. The artist Peter Batty has created an installation of audio graffiti work where audio files are spatially located within a room.

Temporal flow
Two elements of temporal flow are available within a lecture: conceptual and perceptual time. Conceptual time can be thought of as the interplay between the underlying chronological sequence of events, or story, and the narrative structure presented in real time. Perceptual time is related to the experience in the 'here and now' and how preceding and expected events influence transitions between events. Cooke (2010) argues that live media performance should be regarded as an experience only in the here and now, thus making 'time itself a medium'. Temporal flow therefore provides an additional parameter to be considered in the design and delivery of a performance lecture.

Audience engagement and spectatorship
Emotional engagement is increasingly being recognised as an important approach to improving the learning experience (Kort, Reilly et al. 2001). For this reason, our intention is to place strong emphasis on creating emotional resonance. Artists often engage audiences in this way. Fundamental to powerful artistic work is the ability to elicit strong emotional engagement. Robinson (2006) provides a comprehensive review of how emotions are used in our engagement with and understanding of the 'arts'. Emotion can be thought of as a transaction between the organism and its environment; it should therefore be considered not as a state but as a continual interactive process between a person and an environment.
Consequently, a list of 'basic emotions' (Lazarus 1991) such as anger, anxiety, fright, guilt, shame, sadness, envy, jealousy, disgust, happiness, pride, relief, hope, love and compassion defines emotion too simplistically. Ortony (1991) classified emotions very differently and thought of them as appraisals rooted in goals, standards, or tastes and attitudes. Different emotions are formed depending on whether they contribute to or negate internally held values. Drawing on empirical and theoretical research, Robinson (2006) argues that an emotional reaction can go through a series of appraisals: affective appraisal, cognitive appraisal and cognitive reappraisal. First, the affective appraisal moves quickly through the amygdala and then more slowly through a cognitive route via the neocortex. The fast route seems strongly connected to emotional memory as a physiological reaction, either to innate stimuli or to reactions that have been remembered as a result of individual experiences.
This can then be cognitively appraised further, with judgement being dependent on our personal interests, goals and desires. Thus, emotions are not only triggered by automatic affective appraisal but also by environmental conditions. Cognitive appraisals are contingent on some form of 'incomplete' evaluative judgement between our personal values and social context. Emotional experiences can then be further re-appraised many times, which may result in an alternative summary judgement. In this three-stage process, an insult could, at first, make us angry, followed by a cognitive appraisal to remain calm, followed by a reflective re-appraisal questioning why the insult was given in the first place. Emotional engagement can help us become more astute in our understanding of human motivation and achievement. Thus rational thinking cannot occur without emotional involvement. One of the pleasures of art is being able to distance ourselves from a portrayed event: we can experience abhorrence while at the same time enjoying an understanding of our cognitive appraisal of such an event.
The use of 'inference' is also an important cognitive tool. Artists make careful decisions about what to make explicit and, perhaps more importantly, what to omit. Omission or redundancy allows the spectator to make inferences, that is, to fill in the gaps. Inference can be used as an important element to assist deeper understanding. What helps is an understanding of the audience's prior knowledge, as this has a strong effect on what can be inferred. There has to be a good fit between the audience's strategies for adaptation and those presented by the artist, otherwise the audience will not experience emotional congruence.
An important aspect of this research will be to explore alternative mechanisms to assist the audience to 'decode' the lecture, thus encouraging active analytical listening. Active listening involves a purposeful intent to understand what is being said or listened to. This should be followed by some degree of moment-by-moment reflection and then a level of critical analysis of the performance. One way of doing this is by adhering to the listener's pattern of expectations and also playing with the cognitive functions of listening. For example, we use 'differencing' to select or de-select what we listen to and pattern recognition to identify and tune into certain sounds.
Memory is a useful tool for emotional engagement. The practices of remembrance through the process of recollection can help give meaning by giving sense to present personal or collective identity, as well as changing values and behaviour over time. Robinson (2005) has shown that the importance and memorability of recalled events from a novel can be predicted in terms of their relation to the main 'causal chain'. These causal events can be either articulated by the author or inferred. Memory can be used as a thematic component but also as a structural element in the delivery of a performance lecture. Collective or cultural memory could be used at the thematic level. Cultural memory can be thought of as a shared knowledge of the past which does not form part of a formal historical discourse (Plate and Smelik 2009). Good examples of these are traumatic experiences that have been publicly shared, such as the recent city riots in the summer of 2011.

Summary
The framework can therefore be thought of as having interweaving horizontal and vertical strands. The disciplines of cognitive psychology, interaction design and performance art offer opportunities to reappraise our thinking about the constitution of a lecture by superimposing these disciplines onto four different parameters of rhetorical structure. Further work is required to evolve and refine the rhetorical elements, particularly in terms of how best to use them to design and deliver performance lectures and how to measure their effectiveness.
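As an illustration only, the interweaving strands could be sketched as a simple grid in which every discipline is crossed with every rhetorical parameter. The discipline and parameter names come from the framework described above; the dictionary layout, the function name `build_framework_grid` and the example note are our assumptions and do not describe an implemented tool.

```python
from itertools import product

# Names taken from the framework text; the grid layout itself is an assumption.
DISCIPLINES = ("cognitive psychology", "interaction design", "performance art")
PARAMETERS = (
    "modal expressions",
    "spatial design",
    "temporal flow",
    "audience engagement and spectatorship",
)


def build_framework_grid():
    """Return a dict mapping each (discipline, parameter) pair to a list of design notes."""
    return {cell: [] for cell in product(DISCIPLINES, PARAMETERS)}


grid = build_framework_grid()

# Hypothetical example note, paraphrasing a point made in the cognitive psychology section.
grid[("cognitive psychology", "modal expressions")].append(
    "Consider processing strategies when text and visuals are delivered concurrently (Schnotz 2002)"
)

# Three disciplines crossed with four parameters give twelve cells to consider.
print(len(grid))
```

Used in this way, each cell acts as a prompt: a design idea is recorded against the discipline that motivates or constrains it and the parameter it affects.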
Our aim is to be bold and innovative in approach. Innovative artists pitch their work at the boundaries of audience interpretation and our intention is to do the same. The formation of the framework has already helped in bringing divergent and convergent thinking together. Through this process, useful research and development questions have emerged. Because mixed modalities are being deployed, it follows that the narrative structure need not be linear. This poses a host of design issues. Audience attention will be deliberately shifted from time to time away from the presenter or performer to other forms of audio and video media. How should this be designed to ensure that alternative modes of engagement supplement rather than just distract the audience? Furthermore, emotion and inference-making are closely bound together; where and how, therefore, should inferential meaning be placed? To begin to answer some of these questions, we set up a small experiment which explored some of the rhetorical parameters raised above.

TRIPTYCH VIDEO STUDY
A 12-minute video was produced primarily to investigate how spatially distributed content would be perceived and understood by a spectator (Figure 1), and also to explore how the use of mixed modalities affected content retention and recall. The video presents reportage on a jukebox manufacturer. Video and audio content is divided across three 50" plasma screens. The reportage gives an insider's view of the company, the working environment and how the products are manufactured. It also conveys the organisational culture and management style, and how the company is coping with changes within the industry. The video footage is divided into four chapters.
The first chapter sets the scene, both in terms of the content and, more importantly, by introducing the viewer to the spatially delivered audio and video. The second chapter portrays the family feel of the business. Industrial sounds are mixed with music emitted from jukeboxes undergoing quality testing, along with numerous radios belonging to factory workers. The third chapter reflects the office environment using a subdued soundscape, supplemented with slowly repeating still images, which contrasts with the factory environment. The final chapter depicts a more poignant and emotional side to the business. It consists primarily of an audio interview with the semi-retired owner and father of the current managing director. It reveals a candid and heartfelt account of his personal views of the business and himself. Added to this are occasional visual vignettes relating to his sentiments.
Not all the video screens are active at the same time. Viewing therefore becomes a more active process, as the viewer has to attend to different screens at different points in the production. On occasions the viewer has to identify which screen is synchronised with the audio while another screen shows video footage a few seconds ahead of the synchronised footage and the third screen shows footage a few seconds behind. Thus, the viewer can simultaneously observe some aspects of the narrative before or after they appear on the active or synchronised screen. Added to this, each video may also contain still photographs, supplementary audio and textual information. Some elements of the video and audio material are also repeated to reinforce key concepts. This 'layering' is deliberately intense at times while at other times only a single audio channel is provided.
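The staggered playback described above can be sketched as a fixed time offset per screen relative to the audio track. This is a minimal sketch under stated assumptions: the screen names, the three-second offsets and the function names are hypothetical, since the description specifies only that footage runs 'a few seconds' ahead of or behind the synchronised screen, and the sketch does not describe how the demonstrator video was actually produced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Screen:
    name: str
    offset_s: float  # footage offset in seconds relative to the audio track


# Hypothetical layout and offsets; only "a few seconds" is stated in the text.
SCREENS = (
    Screen("left", -3.0),   # shows footage a few seconds behind the audio
    Screen("centre", 0.0),  # the screen synchronised with the audio
    Screen("right", 3.0),   # shows footage a few seconds ahead of the audio
)


def frame_positions(audio_time_s, duration_s, screens=SCREENS):
    """Return the footage position (seconds) each screen shows, clamped to the clip length."""
    return {
        s.name: min(max(audio_time_s + s.offset_s, 0.0), duration_s)
        for s in screens
    }


def synchronised_screen(screens=SCREENS):
    """Return the name of the screen whose footage matches the audio."""
    return next(s.name for s in screens if s.offset_s == 0.0)


# One minute into the 12-minute (720 s) video.
positions = frame_positions(audio_time_s=60.0, duration_s=720.0)
print(positions)              # {'left': 57.0, 'centre': 60.0, 'right': 63.0}
print(synchronised_screen())  # centre
```

Clamping keeps the leading and trailing screens within the clip at its start and end; a production could equally blank those screens instead.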
A user study was carried out to assess how well viewers could understand and interpret the content and also to assess the effectiveness of the adopted delivery styles formed from the framework.
Twelve participants took part in the study, all undergraduate students studying either psychology or computing science. Seven were male and five were female. The majority were aged between 20 and 30 years. Eye movement was also video recorded in order to track gaze across the three screens. Each session was video recorded and was followed by a 20-minute semi-structured interview. Participants were probed further if their responses were deemed to be lacking in detail. Interviews were audio recorded and later transcribed for further analysis.
The questionnaire was designed to elicit as much free-recall material as possible while avoiding potentially influential experimenter prompts. Once this elicitation approach was exhausted, more specific questions were posed to identify if specific elements of content could be accurately recalled and also to gather subjective views about the delivery styles. The key aim was to discover participant attitudes, perceptions and understanding of the delivery style and how opinions were formed.
At the time of writing, the transcripts were only just beginning to be analysed, so at this stage our findings are based more on anecdotal evidence and a cursory review of the data. Of the 12 subjects, 7 found the video to be entertaining and engaging but 8 thought they had to work hard to understand it. Although the majority of viewers enjoyed the experience in some way, about 4 subjects gave strong negative views. In free recall, most subjects were able to discern the key elements of the video, such as the company having family values and the changes in the music industry. However, there appeared to be less agreement on the meaning of the narrative structure of the video. Also revealing was what viewers attended to. A strong theme of the video was the prevalence of different forms of music and sounds within the factory and offices. A minority of viewers did not recall any aspect of this, even after being prompted for a response. One subject even responded, 'what music?' This requires further and deeper analysis, but our early review suggests that subjects are more receptive to certain modes of delivery than others. Furthermore, it would seem this does not correlate proportionally with modal exposure. For example, quite a few subjects commented on the text displayed on the screen and found they could provide rich detail around or connected to these statements, even though their overall occurrence was relatively rare. Another initial observation involved a series of images which simultaneously flash across different screens accompanied by the sound of slamming doors. Intriguingly, one of the four images contained the face of a young woman, and a few subjects commented on this fleeting image and were curious to know its contextual significance.
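For orientation, the counts reported above can be expressed as proportions of the 12 subjects. The short sketch below is ours rather than part of the study's analysis; the labels paraphrase the text, the third count is the approximate figure given ('about 4'), and the categories are not mutually exclusive, so the percentages are not expected to sum to 100.

```python
N_PARTICIPANTS = 12

# Counts as reported in the preliminary review; labels are paraphrased.
REPORTED_COUNTS = {
    "found the video entertaining and engaging": 7,
    "had to work hard to understand it": 8,
    "gave strong negative views (approximate)": 4,
}


def as_percentages(counts, total):
    """Convert raw counts to whole-number percentages of the sample."""
    return {label: round(100 * n / total) for label, n in counts.items()}


percentages = as_percentages(REPORTED_COUNTS, N_PARTICIPANTS)
for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

With a sample of this size each participant accounts for roughly eight percentage points, which is one reason these figures should be read as indicative only.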
It is not possible to draw firm conclusions at this stage, but early indications suggest that distributed media increase attention but do not override personally preferred reception channels. Some of the video was designed to convey its meaning inferentially, but early analysis suggests subjects did not respond to this very well. Deeper analysis of the results may show that the video was too much of an assault on the senses despite being only 12 minutes long.

FURTHER WORK
Further analysis will be carried out on the study transcripts and will help inform further development work and methodological improvements.
The video was a passive encounter and we intend to develop more interactive engagement between the performer and the audience in the future. Work has begun on a lecture known as 'Faith in Progress', which takes a critical view of our relationship with technology. The work from the triptych video suggests that if innovative methods of delivery are to succeed in the way we hope, then we need to find ways of helping the audience to decode the performance lecture.
Artistic work containing a strong narrative usually conforms to a recognisable rhetorical structure, such as that of a novel, play, film or painting. Conventional lectures also have their conventions. A performance lecture has been intentionally designed to sit somewhere in between. We need to ensure that audiences understand its place and role.