Spaces of Interaction

As the world becomes increasingly computationally enabled, so our view of human-computer interaction (HCI) needs to evolve. The proliferation of wireless connectivity and mobile devices in all their various forms moves people from being outside a computer and interacting with it to being inside an information space and moving through it. Sensors on the body, wearable computers, wireless sensor networks, increasingly believable virtual characters and speech-based systems are all contributing to new interactive environments. New forms of interaction such as gesture and touch are rapidly emerging and interactions involving emotion and a real sense of presence are beginning. These are the new spaces of interaction we need to understand, design and engineer. Most importantly these new forms of interaction are fundamentally embodied. Older views of a disembodied cognition need to be replaced with an understanding of how people with bodies live in and move through spaces of interaction.

Keywords: Human-Computer Interaction, Interaction Design, Navigation of Information Space, gesture, emotion, presence, software architecture.


INTRODUCTION
As the world becomes increasingly computationally enabled, so our view of human-computer interaction (HCI) needs to evolve. The proliferation of wireless connectivity and mobile devices in all their various forms moves people from being outside a computer and interacting with it to being inside an information space and moving through it. Whilst this shift of relationship has been noted before (Benyon, 2001; Benyon, 2006), the hybrid, mixed reality spaces that are now becoming more common make it timely to revisit the idea.
New technologies such as biological sensors worn on the body, interactive clothes (or wearable "computers") and mobiles equipped with accelerometers, magnetometers and other sensors that register spatial orientation are becoming commonplace. Wireless Sensor Networks (WSNs) are beginning to appear, either embedded in a physical environment or formed ad hoc by connecting suitably functional devices. Data projectors, whether fixed or mobile, can project displays onto a surface, and tracking technologies such as video can register interactive gestures such as pointing. Thus any surface has the potential to be an interactive device. These developments open up a space of possibilities for touch, gesture-based, physical and bodily-based interaction.
A second strand of developments concerns the anthropomorphism of interaction, something we have termed "personification technologies" (Benyon and Mival, 2008). The friendly voice of a "sat nav", telephone help-line or railway station announcer may be produced through a text-to-speech (TTS) system, but it may evoke a response that ascribes personality to the system. Automatic Speech Recognition (ASR) is reaching a point of robustness where people believe the system understands them more deeply than it does. The realism of computer-generated characters previously only available in movies is now coming to characters operating in real-time. People are connected to one another across distances through more media and using more modalities than ever before.
The information spaces of this new age mix the real and the virtual in new ways, and this means that it can become difficult to tell whether one is interacting with a device or a person. The natural and intuitive forms of interaction based on speech, touch and gesture remove the mediating effect of keyboards and screens, open up social and emotional expression, and increase accessibility for those excluded by language-based interaction.
HCI needs to change to deal with this new environment, and indeed is beginning to do so. The rise of Interaction Design has helped to bring a "designerly way of thinking" (Cross, 2001) to HCI. Experience Design recognises the importance of aesthetics, engagement and pleasure to interaction. Expressive Interaction focuses on the impact of speech, touch and gesture on opportunities for new forms of interaction. In this paper we consider HCI in the context of spaces of interaction, looking at the theoretical, engineering and design issues that are raised.

UNDERSTANDING SPACES OF INTERACTION
In 2006 an ACM CHI conference workshop tackled the question "What is the next generation of Human-Computer Interaction?" (Jacob, 2006). Robert Jacob introduced the workshop through the idea of reality-based, or real-world, interaction. The workshop explored many of the issues concerned with mixed reality and tangible user interfaces. Building on this work, Jacob and colleagues now identify four key characteristics of reality-based interaction: naïve physics, bodily awareness and skills, environment awareness and skills, and social awareness and skills (Jacob, 2008).
Naïve physics concerns people's abilities to understand how objects behave in the real world. For example, unsupported objects fall to earth, larger objects are usually heavier than smaller ones, round objects roll and so on. The importance of the body in reality-based interaction is also emphasised. People are physical beings, living in an environment populated by other people, and hence they naturally have the awareness and skills that such an existence brings.
The body is also central to new and emerging forms of interaction. Touch, gesture and proximity are all bodily-based forms of interaction, and in the context of the mixed realities that characterize the next generation of technological environments, the relationship between the physical and digital worlds becomes the focus. This in turn has led to a rediscovery of phenomenology as a philosophical underpinning for considering interaction. Merleau-Ponty (1962), for example, views the body as the condition and context through which people are in the world. Our bodily experiences are integral to how we come to interpret and thus make sense of the world.
Playing a central role in phenomenology, embodiment offers a way of explaining how we create meaning from our interactions with the everyday world we inhabit. Our experience of the world depends on our human bodies. This is true in both a physical sense and also in a cultural way (Fallman, 2003).
Paul Dourish (Dourish, 2001) has argued that embodiment offers a way of explaining how we create meaning from our interactions with the everyday world we inhabit. Our experience of the world depends on our human bodies, both in a physical way and also through our learnt, culturally-determined behaviours and experiences. Dourish provides a modern account of embodied cognition, highlighting the social as well as the physical aspects of being a person with a body. Underlying these theories are fundamental views of cognition such as experientialism (Lakoff and Johnson, 1999) and cognitive semantics (Fauconnier and Turner, 2002; Imaz and Benyon, 2007). Most recently these views have been brought together by Shaleph O'Neil in his book Interactive Media: The Semiotics of Embodied Interaction.
For people outside the field of HCI, this move may seem strange, but it is an important shift. HCI was built on the concept of a Human Information Processor (Card, Moran and Newell, 1983) and this disembodied view of cognitive processing has continued to dominate the theoretical underpinnings of HCI. Even during the "turn to the social" movement that took place in the 1990s (Bannon, 1991) many of the central ideas of cognition were not questioned. It is only in the 21st century that the philosophy of phenomenology and the importance of the body to cognition have (re)gained favour.
However, it is not just the body that needs to be taken into account as we seek a better understanding of how we come to understand. Outside HCI, in diverse areas such as psychology, neurology, medicine and sociology, there has been a wave of new research on the importance of emotion in cognition (Katz, 1999). Neurologists have studied how the brain works and how emotion processes are a key part of cognition. Emotion processes are positioned in the middle of most processing going from frontal lobe processing in the brain, via the brain stem to the body and back (Katz, 1999). Bodily movements and emotion processes are tightly coupled. As discussed by Sheets-Johnstone, there is "a generative as well as expressive relationship between movement and emotion" (Sheets-Johnstone, 2009). Certain movements will generate emotion processes and vice versa. Emotions, however, are not hardwired processes in our brains, but changeable and interesting regulating processes for our social selves. As such, they are constructed in dialogue between ourselves and the culture and social settings we live in. Emotion is a social and dynamic communication mechanism. We learn how and when certain emotions are appropriate, and we learn the appropriate expressions of emotions for different cultures, contexts and situations. The way we make sense of emotions is a combination of the experiential processes in our bodies and how emotions arise and are expressed in specific situations in the world, in interaction with others, coloured by cultural practices that we have learnt. Katz (1999) provides us with a rich account of how people, individually and in groups, actively produce emotion as part of their social practices. For example, when he discusses anger among car drivers in Los Angeles, he shows how anger is produced as a consequence of a loss of embodiment with the car (as part of our body), the road and the general experience of travelling. He connects the social situation on the road and the lack of communicative possibilities between cars and their drivers with the anger that is produced when we are cut up by another car. He even sees anger as a graceful way to regain embodiment afterwards.
This philosophy leads to a new conception of emotion. Emotion is not something to be measured through galvanic skin response, transmitted or stored somewhere. Emotion is constructed in interaction, where the system supports people in understanding and experiencing their own emotions. An interactional perspective on emotion makes emotional experiences available for reflection. That is, the aim is to create a representation that incorporates people's everyday experiences and that they can later reflect on (Höök, Ståhl, Sundström, and Laaksolahti, 2008).
Another area of work that has arisen as a result of new forms of interaction is in the area of presence. Advances in technologies have made a huge difference to tele-presence, so that things such as telemedicine (where a doctor in one location operates on a person in another location) have become possible. The remote control of distant vehicles such as the Mars Lander is made possible because fine motor control can be experienced by an operator thanks to advances in haptic (touch) technologies, another bodily form of interaction.
Research into tele-presence has spawned a philosophical interest in what it means to be somewhere, or to be with someone. Research into presence (the word is used, confusingly, as a contraction of "tele-presence" and for the concept of presence) has established the importance of presence as a human capability for intention, attention and social cohesion. Riva and Waterworth (2004) argue that humans are social beings, pre-programmed to prioritise the presence of others. Our sense of presence of the other arises from the integration of information about three levels of being of the sensed person, all arrived at from the observation of the physical cues inherent in actions: the physical, the physiological and the psychological. At the physical level, we confirm that the patterns of bodily movements are those of a recognised person, or we register those of an unknown person. Each person has their own way of acting, which is revealed even when they are in a neutral state. At the physiological level, we infer the emotional state of the person from how they are behaving. Finally, at the psychological level, we interpret what we are observing in terms of the focus of attention and likely mode of cognition of the other person.
For example, we register their physical being, that it is indeed they. We may then detect that they are anxious, and go on to conclude that the cause of the anxiety is not the activity or person with whom they are currently engaged (or that it is), all this from perceiving their way of moving parts of their bodies (or the results of those movements, such as the characteristics of their way of speaking).
Our sciences of emotion, presence, interaction and the role of the body in technology-mediated experiences require considerable development in terms of theory and in terms of experimental data that verifies the theory. It is one thing to move away from disembodied cognition as a basis for HCI, but it is another to translate the new philosophy of embodiment into good designs. At present we do not know how to design for these new environments and these new experiences.

ENGINEERING SPACES OF INTERACTION
Software to support these new information spaces of embodied interaction is undergoing a period of radical change. The traditional software model of HCI relies on a sequential input-output feedback loop. The Seeheim and Arch models separate the presentation of information (the user interface) from the semantics of the interaction that forms the interface to the application. In the middle is the "dialogue" control. Another classic model of user interface software separates the conceptual model of information (the model) from how it is presented (the view) and the relationship between them (the controller). This model-view-controller (MVC) model, and others such as presentation, abstraction, control (PAC), have dominated user interface software descriptions since the 1980s. All of these models rely on a person interacting with an application. In a similar way the OSI standard 7-layer model made the application the top of its structure. All these standards are set to change.
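As an illustration of the separation that MVC describes, the following is a minimal sketch in Python. The class names (TemperatureModel, TextView, Controller) are hypothetical and are not taken from any of the systems or models cited here; the sketch only shows a model that knows nothing about presentation, a view that renders it, and a controller that maps input onto changes to the model.

```python
class TemperatureModel:
    """Conceptual model: holds the data and knows nothing about presentation."""

    def __init__(self, celsius=20.0):
        self.celsius = celsius
        self._observers = []

    def attach(self, observer):
        # Views register themselves so they can be told about changes.
        self._observers.append(observer)

    def set_celsius(self, value):
        self.celsius = value
        for observer in self._observers:
            observer.refresh(self)


class TextView:
    """View: one possible presentation of the model."""

    def __init__(self):
        self.rendered = ""

    def refresh(self, model):
        self.rendered = f"{model.celsius:.1f} C"


class Controller:
    """Controller: maps a person's input onto changes to the model."""

    def __init__(self, model):
        self.model = model

    def handle_input(self, text):
        self.model.set_celsius(float(text))


model = TemperatureModel()
view = TextView()
model.attach(view)
Controller(model).handle_input("23.5")
print(view.rendered)
```

Note that the sketch assumes a single person issuing one input at a time to one application, which is exactly the assumption that the newer spaces of interaction call into question.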
Work on tangible user interfaces as well as other more subtle forms of input and output such as audio, gesture, touch and full body interaction blurs the abstract/concrete distinction that characterises graphical user interfaces (GUIs). For example, with tangible user interfaces (TUIs) control is coupled with the model of the domain, not separate from it as it is in traditional models (Ishii and Ullmer, 1997). Other basic models of user interfaces and HCI are also proving inadequate to these new forms of interaction. GUIs force people to express things in a very precise way with selections from menus and clicks on icons. With the more natural expressive interfaces that are needed to accommodate people in a complex environment, we cannot shoehorn people into such precise actions. People cannot be expected to point "correctly", to wave "correctly" at some device or to walk "correctly" through a door. Systems must be tolerant of a wider range of actions and of differences between people; they must get to know the individuals concerned.
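One simple way to picture this kind of tolerance is a pointing interpreter that accepts any direction within an angular margin of a target, where the margin can differ from person to person. The sketch below is an assumption made for illustration only: the function name nearest_target, the target bearings and the default tolerance of 15 degrees are all hypothetical.

```python
def nearest_target(pointing_angle_deg, targets, tolerance_deg=15.0):
    """Return the target whose bearing is closest to the pointing angle.

    `targets` maps a target name to its bearing in degrees. None is returned
    when no target lies within `tolerance_deg`, so an imprecise gesture is
    either accepted or ignored, never forced onto a distant target.
    """
    best_name, best_error = None, tolerance_deg
    for name, bearing in targets.items():
        # Smallest absolute difference between two angles, allowing wrap-around.
        error = abs((pointing_angle_deg - bearing + 180.0) % 360.0 - 180.0)
        if error <= best_error:
            best_name, best_error = name, error
    return best_name


room = {"lamp": 30.0, "door": 120.0}
print(nearest_target(38.0, room))                       # close enough to the lamp
print(nearest_target(75.0, room))                       # too far from both targets
print(nearest_target(75.0, room, tolerance_deg=50.0))   # a wider, per-person margin
```

A system that "gets to know" an individual could, under this assumption, adjust the tolerance for each person over time rather than fixing it for everyone.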
Another difference with these new information spaces is that the input-output feedback loop is not sequential, as the Seeheim and Arch models suggest. Modern interfaces need to deal with multiple, multimodal and concurrent interactions. Given the vast range of possible interaction modalities, providing models and tools for integrating and combining them becomes a real challenge. The OpenInterface (OI) project (OI-Project; Coutrix and Nigay, 2006) has developed a framework for prototyping multimodal interaction. It includes a platform that handles heterogeneous software components and a development environment that defines access to interaction capabilities at multiple levels of abstraction, with a repository of reusable components and generic mechanisms for combining modalities. This approach aims to deal with the inherent difficulties of interleaving interactions that use many modalities simultaneously (such as speech, touch and gesture).
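To give a flavour of what combining concurrent modalities involves, the following sketch groups events from different modalities that occur close together in time, so that a spoken command and a pointing gesture can be interpreted as one interaction. It is not the OpenInterface framework or its API; the Event type, the fuse function and the one-second window are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    modality: str      # e.g. "speech", "gesture", "touch"
    payload: str       # what was recognised
    timestamp: float   # seconds since the start of the session


def fuse(events, window=1.0):
    """Group events that start within `window` seconds of the group's first event."""
    groups = []
    for event in sorted(events, key=lambda e: e.timestamp):
        if groups and event.timestamp - groups[-1][0].timestamp <= window:
            groups[-1].append(event)
        else:
            groups.append([event])
    return groups


events = [
    Event("gesture", "point:lamp", 0.4),
    Event("speech", "turn that on", 0.0),
    Event("touch", "tap:screen", 3.0),
]
for group in fuse(events):
    print([(e.modality, e.payload) for e in group])
```

A fixed time window is the simplest possible combination rule; real multimodal platforms would also need to handle recognition uncertainty and conflicting inputs.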
The essential problem is that the interaction needs to change to take account of different contexts. Lieberman and Selker (2000) argue that the OSI seven-layer model of software is inadequate to deal with interactions that cross applications. Quality of service, authentication, privacy and security should all be available across the whole interaction over time. Applications currently do not share context, but interactions now need to move seamlessly across devices and across applications. For example, if I have identified a restaurant on my iPhone I should be able to transfer this effortlessly to my sat nav. Even if I can do this, I will not typically be able to transfer the previous interaction that enabled me to arrive at the choice of restaurant, something that is an essential part of the context of the whole interaction.
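The restaurant example can be sketched as a small context record that carries not only the current selection but also the interaction history that led to it, serialised so that another device could pick it up. The InteractionContext type, its field names and the example values are hypothetical; no existing phone or sat nav interface is implied.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class InteractionContext:
    user: str
    selection: str
    history: list = field(default_factory=list)  # steps that led to the selection

    def to_json(self):
        # Serialise the whole context, history included, for transfer.
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))


# On the phone: the choice plus how it was reached.
phone_context = InteractionContext(
    user="alice",
    selection="Example Bistro",
    history=["searched: italian", "filtered: within 2 km", "compared: 3 results"],
)
message = phone_context.to_json()

# On the sat nav: the same context is restored, history included.
satnav_context = InteractionContext.from_json(message)
print(satnav_context.selection, len(satnav_context.history))
```

The sketch deliberately leaves out the harder issues named above, such as authentication, privacy and quality of service, which would have to hold across the whole transfer.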
Similar issues arise in the area of ambient intelligence (Aarts and Marzano, 2003). Systems need to be aware of the context of interactions: who is interacting, what has gone before, what the person has done in the past, what states the person and the whole system are in, and what the characteristics of the wider environment are. Data provided by a wide range of sensors needs to be integrated with the cognitive and affective characteristics of the person at that time, their behaviours and the activities that they are undertaking.
Interactive multimodal visualizations are concerned with harnessing the power of novel interactive techniques with novel presentations of large quantities of data. Indeed Card (Card, 2007) argues that visualization is concerned with "amplifying cognition" (or more generally amplifying meaning-making). It achieves this by increasing the memory and processing resources available to people, reducing the search for information, helping people to detect patterns in the data, helping people to draw inferences from the data, and encoding data in an interactive medium. The aim of the designer is to provide people with a good overview of the extent of the whole dataset (the information space), to allow zooming in to focus on details when required, and to provide dynamic queries that filter out the data that is not required.
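The three design aims in the last sentence (overview, zoom and dynamic query) can be sketched over a small in-memory dataset as follows. The function names and the example records are hypothetical and stand in for whatever data and interaction techniques a real visualization would use.

```python
def overview(records, key):
    """Summarise the extent of the whole dataset for one attribute."""
    values = [record[key] for record in records]
    return {"count": len(values), "min": min(values), "max": max(values)}


def zoom(records, key, low, high):
    """Focus on the records whose attribute lies inside a range."""
    return [record for record in records if low <= record[key] <= high]


def dynamic_query(records, **criteria):
    """Filter out records that do not match every given attribute value."""
    return [
        record for record in records
        if all(record.get(name) == value for name, value in criteria.items())
    ]


restaurants = [
    {"name": "A", "cuisine": "italian", "distance_km": 0.8},
    {"name": "B", "cuisine": "thai", "distance_km": 2.5},
    {"name": "C", "cuisine": "italian", "distance_km": 4.1},
]
print(overview(restaurants, "distance_km"))
print([r["name"] for r in zoom(restaurants, "distance_km", 0.0, 3.0)])
print([r["name"] for r in dynamic_query(restaurants, cuisine="italian")])
```

In an interactive system each of these calls would be driven continuously by the person's input, for example by a slider, a gesture or a gaze position, rather than called once.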
The problem is that graphical output modalities are reaching their limits: human visual perceptual and cognitive resources are limited, and the size of screens does not increase in proportion to the size of the information spaces. A second challenge is the existence of time and resource constraints. Multimodal output user interfaces play a central role in increasing the information bandwidth between the human and the computer. This includes combining graphical modalities with sound and using multiple complementary graphical modalities (one for conveying the high-level structure of the information space and another for specific details), exploiting one or multiple display surfaces (wall, screen, PDA, etc.). In addition to multimodal output interfaces, input multimodality can facilitate exploration and navigation within the large information space: examples include active modalities such as two-handed interaction and physical objects as landmarks, and passive modalities such as eye-gaze and position tracking. Looking further ahead brings a mixture of realities in addition to modalities.
Advances in tele-presence, as mentioned above, blur the distinction between the virtual and the real and between the near and the far. You might feel that you are touching a real object, but may be interacting with a virtual one. You may think you are interacting with a real person, but may be interacting with a simulation.
The state of the art in character representation and animation includes full body and face animation, with bone-based skin deformation for the body and different deformation techniques for the face (e.g. weighted morph targets, muscle models, bone-based animation, procedural animation); visual text-to-speech and lip sync for the animation of the mouth; (semi-)automatic personalization of the face from a single image; and prototypes of mobile face animation players. Research in this area is expected to produce fully automatic production of natural looking character animation, reducing the gap between the state-of-the-art crafted animations seen in high-budget movies and the currently much less impressive automatic animations seen in real-time interactive systems. Statistical methods are used to generate facial gestures (Zoric, Smid and Pandzic, 2006). The next goal is to study how emotions influence behaviour, and how this can be represented in the statistical model.
The key element enabling advanced human-to-human interaction is the efficiency and reliability of data communication over a network. We foresee that the networking infrastructure, together with an appropriately designed software architecture, must be capable of transmitting, processing and synchronising large amounts of data communicated in various formats over multiple transmission channels. It is essential to deliver efficient methods for storing and transmitting presence over networks, including emerging types of media such as emotions.
The data communicated between distant humans interacting through ICT must be transmitted taking due account of the constraints of the network, or networks, that constitute the presence transmission channels during the interaction sessions. This concerns all types of networking, including broadband over wired or wireless links, mobile communication and personal area networks. The level of presence will depend heavily on the available bandwidth and on the end devices in terms of computational power, storage and display.

DESIGNING SPACES OF INTERACTION
The art of HCI will need to change if designers are to create experiences that allow people to build relationships with their personification technologies, to express and interact with emotions, to feel present and to move through large, mixed reality information spaces. Interaction design and HCI will need to understand and develop a new set of techniques that will enable people to work at this level. New methodologies and new attitudes to design will be developed.
Emotion and presence are part of our social ways of being in the world; they colour our dreams, hopes and experiences of the world. If we aim to design for emotion and presence, we need to address aspects of aesthetic experiences in our design processes. Dewey, for example, distinguishes aesthetic experiences from other aspects of our life by placing them between two extremes on a scale. At one end of that scale, in everyday life there are many experiences where we just drift and experience an unorganized flow of events, and at the other end we experience events that do have a clear beginning and end but that only mechanically connect the events with one another. Aesthetic experiences exist between those extremes. They have a beginning and an end; they can be uniquely named afterwards (e.g. "when I first heard jazz at the Village Vanguard") but in addition, the experience has a unity, a single quality that pervades the entire experience: "An experience has a unity that gives it its name, that meal, that storm, that rupture of a friendship. The existence of this unity is constituted by a single quality that pervades the entire experience in spite of the variation of its constituent parts." (Dewey, 1934).
In such a holistic perspective, it will not make sense to talk about emotions, or the sense of presence, as something separate from our embodied experience of being in the world. It is the coming together, the confluence, of humans and technologies that is our focus. This aesthetic quality of the whole interaction has been well described as Technology as Experience (McCarthy and Wright, 2005).
These spaces of interaction will include many devices, virtual humans and other forms of interaction that encourage people to anthropomorphise the object of their interaction. We call these "personification technologies" to include robots, on-screen avatars and other autonomous systems imbued with character that demonstrate intelligence and affect, and that know their "owners" personally (Benyon and Mival, 2008). Personification technologies enable intelligent interaction with people in terms of speech and language, gesture and other forms of touch and non-speech audio. They are believable, intuitive, and humane conversational partners. They are autonomous and personality rich. Virtual humans are one example of personification technologies. Others include autonomous toys such as Sony's AIBO, engaging devices, and ambient environments. In all cases the idea is to enhance interaction by getting people to engage more; by turning interactions into relationships. Bickmore and Picard (2007) argue that maintaining relationships involves managing expectations, attitudes and intentions. They emphasise that relationships are long-term, built up over time through many interactions. Relationships are fundamentally social and emotional, persistent and personalised. Citing Kelley, they say that relationships demonstrate interdependence between two parties: a change in one results in a change to the other. Relationships demonstrate unique patterns of interaction for a particular dyad, a sense of "reliable alliance".
It is these characteristics of relationships as rich and extended forms of affective and social interaction that we are trying to tease apart so that we can understand personification technologies. Digesting all our experience to date, we describe the technology in terms of utility, form, personality, emotion, social aspects and trust. Designing for relationships is very different from designing for function. Interaction design has always embraced the importance of form as well as function, and now it is taking on board emotional design and designing for a high sense of presence too.
The final part of design that we address is the idea of moving through information spaces. Benford and colleagues (Benford, et al., 2009) introduce the concept of "interaction trajectories" in their analysis of their experiences with a number of mixed reality, pervasive experiences. Drawing upon areas such as dramaturgy and museum design, they identify the importance of design for interactions that take place over time and through physical as well as digital spaces. These hybrid experiences take people through mixed spaces, times, roles and interfaces. Trajectories "take their participants on journeys" (p. 712). They argue that the trajectories need to be coherent, part of a connected whole. This use of the spatial analysis for understanding interaction is reminiscent of the idea of HCI as "navigation of information space" (Benyon, 1998; 2001; 2005; 2006). With this view, cognitive engineering gives way to a more design-oriented discipline akin to architecture or interior design. These disciplines emphasise flow and the unfolding of experience. It is the design of people's experiences as they move through environments containing mixed reality, multimodal interactions over time that needs to be foregrounded in the design of spaces of interaction. Navigation of information space, or interaction trajectories, emphasizes the importance of the body as a central component of interaction.

CONCLUSION
The future promises to bring new spaces of interaction to people. Characterised by new, more intuitive and less rigid forms of interaction such as speech, touch and gesture, these new spaces will combine multimodal forms of input and output in novel ways. The spaces of interaction will include networks of computational devices, wirelessly connected and mobile devices with new sensors that enable new forms of interaction. People will move through these spaces, transitioning from one form of interactive experience to another. These technological developments are driving new thinking about HCI and interaction design. Designers will need to understand and design for presence in these hybrid environments, exploiting technologies to make people feel present and enabling people to reflect on and understand their presence. Emotional interaction will become an important part of these spaces of interaction as people connect with each other through new media that allow subtle and possibly quite new forms of emotional engagement. Anthropomorphism will bring people to interact with devices and virtual characters as if they were people. Designers are embracing these new opportunities through concepts of navigation, aesthetic and experience design.
In their turn, this new thinking and these new design concepts push technological boundaries. However, it is the software architectures and network infrastructure that are limiting opportunities. New architectures are needed to support moving across applications and moving through environments, whilst maintaining the quality of service, security, privacy and other features of the infrastructure. Interactions have to move smoothly, if not seamlessly, from one device to another and from one modality to another.
The social implications of these new spaces of interaction are difficult to foresee and the cultural impact is difficult to anticipate until we have better models and a better understanding of the issues. Once we can prototype these spaces of interaction we can investigate, understand and provide design advice and regulation to secure balanced and appropriate interactions for the future.