Towards the improvement of self-service systems via emotional virtual agents

Christopher Martin, School of Computing & Engineering Systems, University of Abertay, Bell Street, Dundee, c.martin@abertay.ac.uk
Leslie Ball, School of Computing & Engineering Systems, University of Abertay, Bell Street, Dundee, l.ball@abertay.ac.uk
Jacqueline Archibald, School of Computing & Engineering Systems, University of Abertay, Bell Street, Dundee, j.archibald@abertay.ac.uk
Lloyd Carson, School of Social & Health Sciences, University of Abertay, Bell Street, Dundee, l.carson@abertay.ac.uk


INTRODUCTION
This paper describes research which contributes towards the development of an empathetic system which will detect and improve a user's affective state during a problematic self-service interaction (SSI) through the use of an affective agent. Self-Service Technologies (SSTs) are those which allow a person to obtain goods or services from a retailer or service provider without the need for another person to be involved in the transaction. SSTs are used in many situations including high street shops, supermarkets and ticket kiosks. The use of SSTs may provide benefits such as improved customer service (for example allowing service 24 hours a day, 7 days a week), reduced labour costs and improved efficiency (Cho & Fiorito, 2010). Fewer than 5% of causes for dissatisfaction with SST interactions were found to be the fault of the customer (Meuter et al., 2000; Pujari, 2004), indicating that there is a need for businesses and SST manufacturers to improve these interactions in order to reduce causes for dissatisfaction (Martin et al., unpublished). The frustration caused by a negative SSI can have a detrimental effect on a user's behavioural intentions towards the retailer, impacting the likelihood that the user will continue doing business with them in the future and whether they will recommend them to other potential users (Lin & Hsieh, 2006; Johnson et al., 2008). By adopting affective computing practices in SSI design, such as giving computers the ability to detect and react intelligently to human emotions and to express their own simulated emotions, user experiences may be improved (Klein et al., 1999; Jaksic et al., 2006; Wang et al., 2009).
Affective agents have been found to reduce frustration during human-computer interactions (HCIs) (Klein et al., 1999; Jaksic et al., 2006); we are therefore investigating their effectiveness at improving negative affective states in an SST user during a shopping scenario. We propose a system which will detect negative affective states in a user and express appropriate empathetic reactions using an affective virtual agent.
Two stages of research were identified. The purpose of stage 1 (reported in Martin et al., in press) was to investigate whether emotional facial expressions are present during SST use, to determine whether a vision-based emotion detector would be suitable for this system. The purpose of stage 2 (reported in Martin et al., unpublished) was to gather opinions on the appropriateness of several emotional facial expressions which may be expressed by the virtual agent during an SSI.

LITERATURE REVIEW
Several studies have focussed on having computers recognise emotions from facial expressions (Essa & Pentland, 1995; Tian et al., 2001; Valstar & Pantic, 2006; Kotsia & Pitas, 2007; Kumano et al., 2009; Lucey et al., 2010; Satiyan & Nagarajan, 2010). As reported in Martin et al. (in press), in stage 1 of this research we examined the facial Action Units (AUs) displayed during SSIs. AUs are descriptors for the movement of individual facial features which combine to form facial expressions (Kanade et al., 2000). Prototypical definitions of AU activation patterns have been identified for each of Ekman's (1987) basic emotions: anger, disgust, fear, happiness, sadness and surprise (Lucey et al., 2010). In order to develop an affect detection system, facial expressions which occur during SST interactions need to be identified so that a detector can be trained.
We intend to use Haar-like feature extraction (Essa & Pentland, 1995; Kumano et al., 2009; Satiyan & Nagarajan, 2010) to detect facial expressions due to its ease of training and implementation.
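The detector's implementation is not described in this paper. As a minimal sketch of the general idea only, a Haar-like feature is commonly computed as the difference between the pixel sums of adjacent rectangles, evaluated cheaply through an integral image. The function names and the 4x4 patch below are hypothetical and chosen purely for illustration.

```python
def integral_image(img):
    """Return a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: top half minus bottom half."""
    half = h // 2
    top = rect_sum(ii, x, y, w, half)
    bottom = rect_sum(ii, x, y + half, w, half)
    return top - bottom


# Hypothetical grey-level patch (not study data): bright upper rows, dark lower rows.
patch = [
    [9, 9, 9, 9],
    [9, 9, 9, 9],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
ii = integral_image(patch)
feature = two_rect_feature(ii, 0, 0, 4, 4)
print(feature)
```

A large positive or negative value indicates a strong horizontal intensity edge inside the window. A trained detector would combine many such features at different positions and scales.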
Social and emotional intelligence, the abilities of an individual to deal rationally and effectively with social situations and the emotions of others (Salovey & Mayer, 1990; Pantic & Rothkrantz, 2003; Pantic, 2005), have been argued to be important for successful social functioning (Salovey & Mayer, 1990). Nass et al. (1994) showed that humans apply the same social rules to interactions with computers as they do to interactions with other humans. If people have similar social expectations for computers as they do for other humans, an interaction with a system whose perceived social or emotional intelligence falls below their expectations may prove unsatisfactory. Therefore it is important that a computer's perceived social and emotional intelligence is acceptable to users (Martin et al., in press; Pantic, 2005; Picard, 1997; Picard, 2001).
In order to progress towards the proposed empathetic self-service system, stage 2 of our research involved determining whether any emotional behaviours are found to be appropriate or inappropriate during a problematic SSI. Appropriate behaviours will be included in the abilities of our emotional agent while inappropriate behaviours will be excluded.
Virtual agents are personas involved in an HCI which interact with a user via a combination of a graphical representation, printed text and recorded or synthesised speech. Affective virtual agents are those with affective abilities including affect detection and emotional expressiveness (Picard, 1998; Jaksic et al., 2006; Peters et al., 2006). The use of affective agents in HCIs has produced encouraging findings for the improvement of these interactions, including causing a user to interact longer with error-prone (Klein et al., 1999) or difficult tasks (Wang et al., 2009) and reducing frustration (Jaksic et al., 2006).
While these findings indicate potential benefits of affective agents, the studies by Klein et al. (1999) and Jaksic et al. (2006) relied on a Wizard of Oz methodology, where the researchers manipulated the agent "behind the scenes" to simulate automated behaviour. The study by Wang et al. (2009) required users to manually self-report their emotional state. Our proposed empathetic system will be fully automatic.
The appropriateness of an agent's emotional behaviour for the situation in which it is used has been found to affect the agent's believability (Lim & Aylett, 2007), the degree of trust the user has in the agent (Cramer et al., 2010) and the user's stress level, with inappropriate emotional behaviour inducing stress (Becker et al., 2005). If our system is to be successful in reducing negative affective states in users during frustrating interactions, the emotional behaviour of the agent must be acceptable to users. Since appropriateness of emotional behaviour varies with situation, a study was carried out to evaluate a range of emotional facial expressions to determine whether any of them were acceptable in a problematic self-service shopping task. Acceptable emotions would be included in our empathetic system while inappropriate emotions would be excluded.

METHOD
In order to identify AUs which occur during SSIs, in stage 1 customers at a supermarket were filmed using self-service checkouts. Customers were approached when they were queuing and asked whether they would like to be filmed while they scanned their shopping. Those who agreed were directed to a self-service checkout where a video camera was set up on a tripod. The camera filmed the participant's head and shoulders for the duration of their interaction.
Thirty participants took part in the study (17 male, 13 female). In order to minimise disruption to customers and to increase the appeal of taking part, participants were not asked any questions or required to perform any additional task apart from reading and signing the informed consent form. Although it may have been useful to gather demographic data, information about participants' technological background and their current and previous experiences with SSTs, this study's main aim was simply to collect data on emotional facial expressions from as large a group of participants as possible; the researchers therefore wanted to maximise the appeal of taking part. The collected footage was viewed by a researcher and facial expressions were manually identified. The footage was divided into clips: short sequences of footage containing either a facial expression or a neutral face.
Clips containing a neutral face were discarded and the remainder were analysed by the researcher in order to determine which AUs were present at peak emotional expression. Expressions exhibited as a result of interacting with a member of staff, companion or the researcher were discarded as they were not elicited by the SST device. It is possible that some AUs were exhibited by the participants but not observed by the researcher due to factors including the resolution of the recorded footage (320 x 240 pixels), the lighting in the scene and participant features such as hair styles, glasses, hats, scarves and facial hair.
The study carried out during stage 2 required participants to attempt a computer-based self-service shopping task. Twenty-six participants took part (14 males, 12 females) with an average age of 29.3 years. Ethical approval was obtained from the University of Abertay School of Computing and Engineering Systems Research Ethics Sub-Committee. The study had a within-subjects design with each participant being subjected to the same conditions. Participants interacted with a computer program which required them to complete a shopping task, similar to most online shopping scenarios which utilise a "shopping basket".
Participants received on-screen instructions to add various items from a list into a virtual shopping basket. A 3D animated agent was present on the screen during the task, displaying a looped animation of emotionally neutral behaviour.
Participants were asked to attempt to complete all tasks, with the option of clicking an on-screen "quit" button if they felt they could not complete a task.
In order to simulate a real-world SST interaction where an event makes it impossible for a user to achieve their goal without the assistance of a member of staff, the program purposefully obstructed the user, making the task impossible to complete. This was achieved by causing the program to add incorrect items to the shopping basket after a preset number of actions had been carried out by the user. If participants asked the researcher for assistance they were advised to click the "quit" button if they felt that they were unable to complete the task. After the quit button was pressed, participants were presented with a screen displaying an array of seven animated faces, each displaying either anger, disgust, fear, happiness, sadness, surprise or the emotionally neutral behaviour expressed during the task. The faces were arranged randomly and animations lasted for four seconds before being repeated.
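The obstruction mechanism described above could be sketched as follows. This is an illustration under stated assumptions, not the program used in the study: the class name, the catalogue contents, the threshold of two actions and the random choice of the wrong item are all hypothetical.

```python
import random


class ObstructedBasket:
    """Basket that starts adding wrong items after a preset number of actions."""

    def __init__(self, catalogue, obstruct_after=3, seed=None):
        self.catalogue = list(catalogue)
        self.obstruct_after = obstruct_after  # hypothetical threshold, not from the study
        self.actions = 0
        self.items = []
        self._rng = random.Random(seed)

    def add(self, requested):
        """Record one user action and return the item actually added."""
        self.actions += 1
        if self.actions > self.obstruct_after:
            # Obstruction phase: pick any catalogue item other than the requested one.
            wrong = [item for item in self.catalogue if item != requested]
            added = self._rng.choice(wrong)
        else:
            added = requested
        self.items.append(added)
        return added


# Hypothetical catalogue and action sequence.
basket = ObstructedBasket(["milk", "bread", "eggs", "tea"], obstruct_after=2, seed=0)
results = [basket.add(item) for item in ["milk", "bread", "eggs", "tea"]]
print(results)
```

The first two requests are honoured; every later request yields a different item, so the instructed basket can never be completed and the participant must eventually press the "quit" button.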
Participants used on-screen 5-point Likert scales to record how appropriate they felt each emotional expression was in the current situation. The points of each Likert scale were labelled, in order, as "completely inappropriate", "slightly inappropriate", "neither appropriate nor inappropriate", "slightly appropriate" and "completely appropriate". Participants were then presented with an on-screen questionnaire gathering details of their age and gender as well as a freeform writing section where they could suggest changes they would like to see made to the agent in order to make it more acceptable.
The emotionally expressive agent used in this study was developed by Sloan et al. (2011). Its emotional behaviours consisted of animations of the six basic universal emotions (Ekman, 1987). These animations had previously been validated (Sloan et al., 2010) to confirm that each reliably portrayed the emotion it purported to. The basic universal emotions were chosen for this study as each has a corresponding facial expression which is universally produced and recognised by people of all cultures (Ekman, 1987). Humans arguably have a much wider range of emotions (e.g. rage, frustration and contentment); however, these do not have universally recognised facial expressions and were therefore unsuitable for this study.

RESULTS
In the stage 1 research, 30 interactions were filmed and analysed, with an average length of 302 seconds. The activation of 16 different AUs was observed in the video clips.
A chi-square goodness-of-fit test was carried out to determine whether the frequencies of occurrences for each AU deviated significantly from the expected distribution. This was found to be the case (χ²(15, n = 213) = 261.066, p < 0.001), with AU 23 and AU 24 occurring at least twice as frequently as any other AU (Figure 1).
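For reference, the goodness-of-fit statistic has the standard form below. The expected distribution is not stated in the text; the worked values assume, purely for illustration, that all k = 16 AUs were expected to occur equally often among the n = 213 observations.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad
E_i = \frac{n}{k} = \frac{213}{16} \approx 13.31, \qquad
\mathrm{df} = k - 1 = 15
```

Here O_i is the observed number of occurrences of the i-th AU and E_i the expected number. The degrees of freedom, k - 1 = 15, agree with the value reported in the test result.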
In the stage 2 study, results from agent appropriateness Likert scales were combined from 5 to 3 categories: Inappropriate, Neither Appropriate Nor Inappropriate and Appropriate (Figure 2). Chi-square goodness-of-fit tests were carried out to assess responses to each of the animated expressions. Appropriateness ratings for the 'disgust' and emotionally neutral facial expressions were found to deviate significantly from the expected distribution. The frequency of each rating was compared between males and females. Significant variations were found in the distributions of each rating between males and females, indicating that:
• More females rated surprise as inappropriate.
• More males rated surprise as appropriate.
• More males rated fear as inappropriate.
• More females rated fear as neither.
• More females rated happiness as neither.
• More females rated sadness as neither.
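The collapsing of the 5-point scale into three categories and the subsequent goodness-of-fit test can be sketched as below. This is a sketch under assumptions: equal expected frequencies across the three categories are assumed, the helper names are hypothetical, and the twelve responses are invented so that the arithmetic yields a statistic of 6.5 with n = 12; they are not the study's data.

```python
import math
from collections import Counter

# Mapping from the five Likert labels to the three combined categories.
COLLAPSE = {
    "completely inappropriate": "Inappropriate",
    "slightly inappropriate": "Inappropriate",
    "neither appropriate nor inappropriate": "Neither",
    "slightly appropriate": "Appropriate",
    "completely appropriate": "Appropriate",
}
CATEGORIES = ["Inappropriate", "Neither", "Appropriate"]


def collapse(responses):
    """Count responses per combined category, in CATEGORIES order."""
    counts = Counter(COLLAPSE[r] for r in responses)
    return [counts[c] for c in CATEGORIES]


def chi_square_gof(observed):
    """Goodness-of-fit statistic assuming equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def p_value_df2(statistic):
    """Upper-tail p-value; this closed form is valid only for 2 degrees of freedom."""
    return math.exp(-statistic / 2)


# Hypothetical responses from 12 raters (illustrative only, not study data).
responses = (
    ["completely inappropriate"] * 5
    + ["slightly inappropriate"] * 3
    + ["neither appropriate nor inappropriate"] * 3
    + ["slightly appropriate"] * 1
)
observed = collapse(responses)
stat = chi_square_gof(observed)
p = p_value_df2(stat)
print(observed, stat, round(p, 4))
```

With three categories the test has two degrees of freedom, for which the chi-square upper-tail probability reduces to exp(-χ²/2). For other degrees of freedom a statistics library would be needed to obtain the p-value.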
Participants made several suggestions about improvements which could be made to the agent, the majority of which related to the agent's appearance, including comments suggesting the agent should change its age and gender based on the user's age and gender (although it was not specified whether these should be changed to match or contrast the user's gender and age).

DISCUSSION
In stage 1, a collection of video clips was gathered showing people spontaneously producing facial expressions during an SSI. AU 23 and AU 24 were observed with particular frequency.
We propose that the appearance of these AUs in the collected footage indicates a negative emotional state occurring in these SST users. These AUs were used to train a Haar-like feature detection system; however, the performance of the system made it unsuitable for use.
The results of stage 2 indicate that, for an affective agent in an SSI, disgust was rated as an inappropriate facial expression with significantly higher frequency than expected while an emotionally neutral expression was rated as appropriate with significantly higher frequency than expected.
This suggests that disgust is an unsuitable emotion to be expressed during a problematic SST interaction while an emotionally neutral expression may be preferable to other basic emotions.
It was found that females rated surprise as inappropriate with significantly higher frequency than expected. Significant differences were found between males' and females' appropriateness ratings for fear, happiness, sadness and surprise, suggesting that an agent's behaviour should vary, depending on the user's gender, in order to be perceived as appropriate.
Participants also suggested that the agent should be able to change its age and gender depending on that of the user, although participants did not specify whether they would prefer an agent of their own or opposite gender or their own or a different age.
The results suggest that none of the basic emotions were considered appropriate by all participants. This could be due to participants wanting to see more complex emotions (frustration, impatience, concern etc.), participants wanting to see less extreme manifestations of the basic emotions or participants not wanting to be presented with any agent emotions during an SSI. Complex emotions may be difficult to implement as there is a lack of literature on reliably recognised facial expressions for these emotions.

CONCLUSION & FUTURE WORK
This paper describes two stages in the development of an empathetic system which will detect a negative affective state in an SST user and attempt to improve this state via empathetic feedback from an affective agent. Facial expressions of self-service checkout users were analysed and it was found that AUs commonly linked to anger were observed with particular frequency. A study was carried out to determine the appropriateness of emotional facial expressions displayed by an affective agent in response to a problematic SSI. Results indicate that disgust was rated as inappropriate, and emotionally neutral behaviour as appropriate, with significantly higher frequency than expected. Gender differences were also observed, suggesting that agents may be considered more appropriate if they alter their behaviour based on the user's gender.
In order to continue development of the proposed empathetic system, the researchers will investigate appropriate emotions to be displayed in problematic SSIs, gender differences in perception of appropriate affective behaviour and the development of an acceptable affect detection system.

Figure 1: Occurrences of AUs observed in SST users

Emotions were rated as Neither Appropriate Nor Inappropriate by a minority of participants. The majority of participants rated emotions as either Inappropriate or Appropriate. When responses for males and females were examined individually, it was found that the distribution for surprise in females varied significantly from the expected distribution (χ²(2, n = 12) = 6.5, p < 0.04), indicating that females found surprise inappropriate.