A Remote Breathing Relaxation System: A Case Study for Web-based Real-time Adaptive Human to Virtual Human Interaction
Sanobar Dar • Aniko Ekart • Ulysses Bernardet

We present a web-based design for a human-agent interaction (HAI) system. In this relaxation system, a virtual human acts as a breathing coach and guides the user through a series of breathing exercises. The core of the proposed design is a step ladder approach for optimising the difficulty level of the breathing exercises. We use real-time optimisation to achieve a seamless and effective interaction between the human user and the virtual coach. To give the user a personalised exercise, the coach adapts the exercise regime based on the user's response. For a natural interaction experience, our virtual coach looks like a human, speaks in a human voice, and displays nonverbal gestures.


INTRODUCTION
When people communicate, they exchange messages. The efficacy of this message transfer depends on utilising all the communication channels, namely intentional, unintentional, verbal, nonverbal, implicit, and explicit [1]. While human-to-human interaction exploits all these channels effortlessly, human-to-agent interaction is still developing. One such development is the implementation of real-time systems that adapt to the user's feedback. For example, a real-time virtual tutor that adapts its instructions to the students was shown to improve their learning significantly [2]. Another example is the virtual coach Gabby, who reads the user's breathing and dynamically adjusts the instructions accordingly [3].
Although interactive virtual humans have been used as coaches or instructors before, most of these systems are text-based or voice-based, and some have only a 2D face or a 2D body. These systems lack the physical attributes required for multichannel communication. They also lack the sense of personal connection that is felt through the exchange of bodily cues between interaction partners. With such limitations, human to virtual human interaction is still not on par with human-to-human interaction.
Based on the above research, we propose the design of a real-time, web-based human to virtual human interaction system. The system is a step toward improving human to virtual human interaction. The design outlines a relaxation system that is based on breathing physiology. A virtual coach guides the user's breathing to a relaxed level using an exercise optimisation approach in real time. We aim to build an easy-to-access and affordable system that gives the user a stimulating interaction experience close to that of human-to-human interaction.

DESIGN OF THE SYSTEM
The design of the system is based on a multichannel interaction between a human and a virtual human. The virtual coach uses a variety of verbal instructions and nonverbal cues. The aim is to decrease the effort and increase the efficiency of human-to-agent interaction by increasing the number of communication channels used simultaneously.
The main feature of the proposed design is exercise optimisation, implemented using a step ladder approach in which the difficulty level of the breathing exercises is altered in steps until users find the optimal difficulty level for themselves. The system is designed to accommodate user feedback in real time and to adapt the difficulty level by changing the length of the inhale, hold, and exhale phases of the breathing cycle.

Interaction Channels
For humans to have a seamless, effortless, and satisfying interaction with a computer, the process needs to be bidirectional and multichannel [1]. The use of multiple channels in interaction not only ensures continuity but also adds to the naturalness of this process. Therefore, we have incorporated verbal, nonverbal, explicit, and implicit channels in our human-to-agent interaction. The coach communicates with the user using all the channels by giving breathing instructions to the user through verbal cues, nonverbal gestures, and explicit instructions. The user communicates back by explicitly choosing the breathing exercise that they find to be the most comfortable.

Breathing Exercises
The virtual coach guides the user's breathing through a series of relaxation exercises based on a well-established breathing technique, Box breathing [4]. In Box breathing, each sub-cycle of the breath cycle, i.e., breathe in (inhale), hold the breath (hold), and breathe out (exhale), has the same duration.
Traditionally, in the Box breathing exercise, the inhale is 4 seconds, the hold is 4 seconds, the exhale is 4 seconds, and the second hold is 4 seconds (4x4x4x4). In our implementation, we adapt the difficulty level between exercises: in the first iteration, we use 2x2x2x2 for the easy exercise and 4x4x4x4 for the difficult exercise, i.e., the inhale, hold, exhale, and hold are each 2 seconds long in the short breathing exercise and 4 seconds long in the long breathing exercise. Each exercise is carried out for 2 minutes. We then adapt the final exercise based on the user's response, and this exercise is carried out for 5 minutes to attain relaxation. This method is based on the National Health Service recommendation for a standard breathing exercise that can be performed to attain relaxation [5].
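The timings above can be made concrete with a minimal Python sketch. It only illustrates the arithmetic of the two initial exercises; the names `BoxExercise`, `phase_seconds`, and `total_seconds` are assumptions introduced for this example and are not part of the system described here.

```python
from dataclasses import dataclass

# The four sub-cycles of one Box breathing cycle, in order.
PHASES = ("inhale", "hold", "exhale", "hold")


@dataclass(frozen=True)
class BoxExercise:
    """Hypothetical description of one Box breathing exercise."""
    phase_seconds: int   # length of each sub-cycle (all four are equal)
    total_seconds: int   # how long the exercise is carried out

    @property
    def cycle_seconds(self) -> int:
        # One full cycle is inhale + hold + exhale + hold.
        return self.phase_seconds * len(PHASES)

    def full_cycles(self) -> int:
        # Number of complete cycles that fit into the exercise.
        return self.total_seconds // self.cycle_seconds

    def breaths_per_minute(self) -> float:
        return 60 / self.cycle_seconds


# The two initial exercises, each carried out for 2 minutes.
easy = BoxExercise(phase_seconds=2, total_seconds=120)       # 2x2x2x2
difficult = BoxExercise(phase_seconds=4, total_seconds=120)  # 4x4x4x4

for name, ex in (("easy", easy), ("difficult", difficult)):
    print(name, ex.cycle_seconds, ex.full_cycles(), ex.breaths_per_minute())
```

With these values, the 2-second exercise has an 8-second cycle (7.5 breaths per minute, 15 full cycles in two minutes), while the 4-second exercise has a 16-second cycle (3.75 breaths per minute, 7 full cycles).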

Adaptive Difficulty Level
We use an adaptive difficulty level based on the optimisation of breathing exercises for each user. Here, a step ladder approach based on simulated annealing [6] is used for optimising the breathing cycle length, as shown in Figure 1. Simulated annealing is a popular combinatorial optimisation method inspired by statistical mechanics [7]. Annealing originates in metallurgy, where it is used to alter the physical properties of an object by first heating it and then slowly lowering the temperature in increments. Our design uses the same principle: we alter the length of the inhale, hold, exhale, and hold in increments, eventually arriving at the optimum length for each user. The optimisation protocol is as follows. The user performs two breathing exercises: a shorter breathing exercise (2x2x2x2) and a longer breathing exercise (4x4x4x4). The reason for including shorter timings is that some people find it hard to inhale, exhale, or hold their breath for 4 seconds; they are therefore recommended to start with a 2- or 3-second cycle to ensure that the exercise is comfortable to perform [8].
After completing both exercises, the user is asked to assess the length of each exercise: whether it was too short, too long, or fine. The purpose of starting with two exercises is to give the user a reference for comparing the longer and the shorter exercise. Depending on the user's response, the system either reduces or increases the length of the breath cycle. Table 1 shows the step ladder optimisation for all possible participant responses. The Ex3 column contains the adapted length of the breathing sub-cycle, determined from the participant's responses to both Ex1 and Ex2. This customised approach personalises the breathing exercises for each user, giving them the freedom to choose between different lengths of the inhale, hold, exhale, and hold of the breathing cycle. In the last exercise of the session, the user follows the coach through the adapted breathing exercise to attain relaxation.
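The adaptation step can be illustrated with a minimal Python sketch. The rule below is a hypothetical stand-in: it does not reproduce the mapping in Table 1, and the function name, the 1-second step size, and the 1-second lower bound are all assumptions made for illustration only.

```python
from enum import Enum


class Response(Enum):
    TOO_SHORT = "too short"
    FINE = "fine"
    TOO_LONG = "too long"


def adapt_phase_length(short_resp: Response, long_resp: Response,
                       short_s: int = 2, long_s: int = 4,
                       step_s: int = 1, min_s: int = 1) -> int:
    """Pick the sub-cycle length (seconds) for the third exercise.

    ASSUMED rule, not the paper's Table 1:
    - even the short exercise felt too long -> step down below it
    - even the long exercise felt too short -> step up above it
    - exactly one exercise felt fine        -> reuse that length
    - otherwise                             -> step to the midpoint
    """
    if short_resp is Response.TOO_LONG:
        return max(min_s, short_s - step_s)
    if long_resp is Response.TOO_SHORT:
        return long_s + step_s
    if short_resp is Response.FINE and long_resp is not Response.FINE:
        return short_s
    if long_resp is Response.FINE and short_resp is not Response.FINE:
        return long_s
    return (short_s + long_s) // 2


# Example: 2 s felt too short and 4 s felt too long, so try 3 s.
print(adapt_phase_length(Response.TOO_SHORT, Response.TOO_LONG))
```

Under this assumed rule, the third exercise always moves by at most one step from the two reference exercises, which mirrors the incremental character of the step ladder.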

System Architecture
The core of the system is the breathing controller (Figure 2), which is implemented using the game engine Unity (https://unity3d.com). It receives the user response from the screen and sends control signals to the visual and auditory components of the virtual human, which then adapt the breathing rate, breathing instructions, and breathing movements based on the user's response [9]. The core of the breathing controller is a finite state machine implemented using PlayMaker [10], a visual scripting add-on for Unity. It allows for a quicker, visual implementation of finite state machines compared to traditional scripting. Figure 3 shows the working of the finite state machine. Following the Introduction, the states Breathe In, Hold, Breathe Out, and Hold are cycled through for the intended duration of the exercise. These states in turn control the breathing movements, the breathing instructions, and the time per state. Unlike other systems, the design allows for real-time interactivity.
The virtual environment and the virtual human are created and rendered using Unity. The novel implementation of the system allows it to run standalone in a web browser. The system is deployed using Unity's WebGL build target; WebGL (Web Graphics Library) is a browser API for rendering interactive 2D and 3D graphics. The system works in the browser without the need for any installation or instrumentation. This allows people to use the system without any additional equipment and without travelling to a lab. This is particularly useful when people are restricted to their homes, for example due to physical limitations or during the recent Covid-19 pandemic.
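The state cycle of the breathing controller (Introduction, then Breathe In, Hold, Breathe Out, Hold, repeated for the duration of the exercise) can be sketched in plain Python. This is only an illustration of the control flow: the actual controller is built visually in PlayMaker, and the names `State`, `run_exercise`, and the `FINISHED` state are assumptions introduced here. The duration of the Introduction is not modelled.

```python
from enum import Enum, auto
from typing import Iterator, Tuple


class State(Enum):
    INTRODUCTION = auto()
    BREATHE_IN = auto()
    HOLD_AFTER_INHALE = auto()
    BREATHE_OUT = auto()
    HOLD_AFTER_EXHALE = auto()
    FINISHED = auto()          # assumed terminal state


# Order in which the breathing states are cycled through.
BREATH_CYCLE = (
    State.BREATHE_IN,
    State.HOLD_AFTER_INHALE,
    State.BREATHE_OUT,
    State.HOLD_AFTER_EXHALE,
)


def run_exercise(phase_seconds: int,
                 total_seconds: int) -> Iterator[Tuple[int, State]]:
    """Yield (start time in seconds, state) for every state entered."""
    if phase_seconds <= 0:
        raise ValueError("phase_seconds must be positive")
    yield 0, State.INTRODUCTION
    elapsed = 0
    step = 0
    # Enter the next state only while it still fits in the exercise.
    while elapsed + phase_seconds <= total_seconds:
        yield elapsed, BREATH_CYCLE[step % len(BREATH_CYCLE)]
        elapsed += phase_seconds
        step += 1
    yield elapsed, State.FINISHED


# Two full 2x2x2x2 cycles (16 seconds in total).
timeline = list(run_exercise(phase_seconds=2, total_seconds=16))
for start, state in timeline:
    print(f"{start:>3}s  {state.name}")
```

In a real controller, entering each state would also trigger the matching animation and spoken instruction; here each transition is simply reported with its start time.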
The virtual coach is a high-quality, post-processed, realistic character with a natural human appearance and speech. The character is equipped with verbal as well as nonverbal dialogue skills and displays animations for idling, standing, pointing, sitting, acknowledging, nodding, looking around, shifting weight, and waving (as shown in Figure 4). The gestures and animations used in the system are downloaded from Mixamo (https://www.mixamo.com) and edited in Unity.
For advanced animation control, the character allows direct manipulation of the skeleton model's joints (e.g., the neck joint or the spine joint). The virtual coach speaks in a natural human voice: the dialogues are scripted and pre-recorded by a voice artist. The coach's mouth movements are lip-synced to the recorded voice. The combination of gesticulation and lip-syncing provides a realistic environment in which users can interact with a virtual coach as they would with a human coach.
Our framework offers potential for further personalisation; for example, the character can be customised to the user's preference for age, gender, and appearance features such as skin colour.

DISCUSSION AND CONCLUSION
This paper describes the advantages of real-time adaptive web-based systems that make simultaneous use of multiple bidirectional communication channels in human-to-agent interaction. We have proposed a design for a relaxation system that uses a virtual human as a breathing coach. The coach instructs users when to breathe in, hold, breathe out, and hold their breath. It follows an established breathing exercise and uses an optimisation approach to find the optimum length of the breathing cycle, which allows for a personalised exercise regime for each user.
We plan to carry out a pilot study to test the validity and efficacy of the system. We will investigate how people perceive a realistic virtual human as a breathing coach and whether they are influenced by the coach. Our aim is to find out whether people are willing to use a virtual coach instead of a real human coach. In this way, we aim to give users a natural interaction experience with the freedom to use the system on demand.
Although the system uses a realistic virtual human, there is room for improvement by updating the character to a more realistic one. We have included Box breathing exercises in our regime; further exercise regimes such as the Papworth technique, Deep breathing, or other techniques [11], [12] can also be included to give users more choice for personalisation.
Our system is based on the concrete theoretical concepts of human-to-human interaction as well as scientific research suggesting the advantages of virtual reality in health and wellbeing. This is a step towards improving the accessibility and affordability of personalised healthcare that can be realised within the comfort of our homes.