Using physiological signals to measure the Quality-of-Experience of Health Care Professionals when interacting with a clinical guideline mobile app
Mitchell • Kanwal • de Quincey

Digital content adaptation and personalisation is a crucial component in increasing user engagement and is becoming of interest to designers and developers in areas related to clinical information delivery. Achieving this requires new data-intensive methods that go beyond traditional user-centred design approaches. In this position paper, we discuss how, although user-centred design has been shown to be useful for generating generalised design guidelines (predominantly driven by qualitative data collection techniques), more quantitative methods and measures such as Quality of Experience could not only augment standard user research methods but also provide data to inform the adaptation and personalisation of interfaces. We propose a solution-by-design that gathers personal preferences through users' physiological data (using pupillometry) and discuss how it would be useful for applications such as mobile apps for clinical guidelines, where access for in-situ data collection is increasingly challenging.


INTRODUCTION
Digital content adaptation and personalisation is becoming more prevalent (Madan 2021). A popular example is the identification and delivery of interest-based digital content and services. Digital content providers (Netflix, Amazon Prime, Disney Plus etc.) use large amounts of visitor data to train deep learning algorithms which, in turn, predict a new user's preferences based on their age group, geolocation, and other personal information. However, due to legal, ethical, and privacy requirements, using individuals' information, even to train machine learning or deep learning models, is not recommended. Therefore, the research community is actively seeking different ways to address these issues, including federated learning, pseudonymisation, anonymisation, and encryption, as well as more traditional user research methods such as interviews and questionnaires.
In this paper we propose a solution-by-design to gather personal preferences through users' physiological data for customising the presentation of digital content. This will pave the way for the development of customised digital applications and platforms in a variety of domains, improving the interactivity of digital devices and their content (Bergstrom et al. 2014); in this position paper, the approach is applied to a clinical guidelines mobile app.
Utilising user-centred design (UCD) methodologies, the Bedside Clinical Guidelines (BCG) app has been developed to deliver 157 clinical guidelines to acute care clinicians via a mobile device (Mitchell et al. 2020). During the development of the BCG application, methods such as focus groups, think-aloud sessions during basic lab-based clinical scenarios, and system usability questionnaires were used to elicit feedback from clinical users. The findings from these mostly qualitative UCD methods informed the production of a set of 'general' design guidelines for producing such applications and presenting information to users (Mitchell et al. 2021). This research demonstrated that UCD methods can be useful in producing generalised design guidelines for all users, but also pointed to the fact that different types of clinicians may want guideline information displayed to them in a more personalised manner, dependent on their level of experience.
Access to clinical environments is often restricted, and clinicians work in high-stress environments and frequently face interruptions in workflow. As a result, standard UCD methodologies are often limited to predefined scenarios or unrealistic settings (Mitchell et al. 2020). Such methods can fail to gather data that is representative of experience in a more realistic setting, and they do not gather the types of data needed to inform adaptation and personalisation of information presentation. Therefore, more implicit/passive and quantitative methods of gathering information that more accurately reflect the overall Quality of Experience (QoE) are required.
To assess web-based digital content, similar measures have been developed and used, such as usefulness, desirability, accessibility, findability and credibility (Richardson et al. 2021), collected through binary questionnaires. Furthermore, some UX tools have also been developed for in-use assessment of web products, such as event-tracking, click-tracking, A/B split testing, visual feedback and wireframing (Haije 2017). However, these tools are technology-focused: they are helpful in assessing the usefulness of the content provided but fail to address the customisation of the product or service. Hence we need a merger of the two methodologies, i.e. qualitative as well as quantitative analysis of users' preferences. This paper therefore proposes utilising other sources of data, such as physiological signals, to measure QoE in order to determine potential customisation and presentation of content in the BCG app.

USER-CENTRED VS. TECHNOLOGY-CENTRED ASSESSMENT
As Nielsen (2001) has pointed out, "Don't listen to your user, watch them doing it". UCD-related methods therefore quite often suffer from being too subjective, lacking personalisation, differing in interpretation and not eliciting tacit knowledge. Analysis is often based on collective responses from multiple users and can therefore serve a population, but not an individual's preferences. It is therefore important to combine user-centred research with technology-centred assessment of users' experiences. QoE aims at solving this by considering all factors that contribute to a user's perceived quality of a system or service, including system, human and contextual factors (Reiter et al. 2014). In the case of the BCG app mentioned in the previous section, development has so far been reliant on traditional user-centred methods and pre-determined scenarios. By considering QoE, however, a more complete measurement of the user's expectations, feelings, perceptions, cognition and satisfaction can be captured (Laghari et al. 2011), creating not only a more meaningful assessment of user interaction but also more quantitative methods to utilise in user modelling, adaptation and personalisation of content presentation.

ASSESSING MEANINGFUL USER INTERACTION
Pine and Gilmore (2011) have mapped a user's engagement experience to active participation and absorption during an interaction. The experience is then categorised into four realms, as shown in Figure 1: entertainment, education, escapist, and aesthetics. The spectator of a magic show, a football game, or a music concert shows entertainment-level engagement, while a wine taster or a cookery class participant experiences educational engagement. Similarly, tourism, paragliding, or any unusual activity gives escapist engagement, and lastly, having a haircut or drinking coffee in a coffee shop reflects an aesthetic experience. This basic user engagement can be mapped to a basic valence and arousal model, also shown in Figure 1, and therefore one model can be used to point out the other model's state. For example, a person's interaction with a clinical guideline can be ranked based on their engagement level, and based on active engagement and absorption measures the user can be identified as the one gaining more knowledge. This can also be reflected by identifying the emotional status of the user from the arousal and valence measures: high valence and arousal values depict the excitement and happiness of the user, in turn reflecting an information gain. This gives enough of a basis to combine the two models (Emotional-Model and Experience Realm) to develop meaningful, attractive, and engaging interfaces for human-computer interaction.
Using a valence-arousal model, we can classify a human's emotional response from their physiological data. Physiological data include EEG, ECG, heart rate, Galvanic Skin Response, blood pressure, and pupil response (Dzedzickis et al. 2020; Rafique et al. 2021a). Sensors usually need to be physically attached to the human body to obtain these physiological responses, which makes people uncomfortable, and so they cannot be used on a large scale. However, some of these responses can be collected in an unobtrusive manner, such as the pupil's response captured using a camera. Using pupillometry for classifying human emotion is an emerging research domain.
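As a minimal illustration of the valence-arousal idea, the sketch below maps a (valence, arousal) pair to one of four coarse quadrants. The `AffectSample` class, the `quadrant` function, the [-1, 1] value range and the zero threshold are all assumptions made for this example and are not taken from the cited work; a real classifier would first have to estimate valence and arousal from physiological signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffectSample:
    """One affect estimate (hypothetical structure for illustration)."""
    valence: float  # assumed range [-1, 1]; negative means unpleasant
    arousal: float  # assumed range [-1, 1]; negative means calm


def quadrant(sample: AffectSample) -> str:
    """Return a coarse quadrant label, splitting both axes at zero (an assumption)."""
    arousal_label = "high-arousal" if sample.arousal >= 0 else "low-arousal"
    valence_label = "high-valence" if sample.valence >= 0 else "low-valence"
    return f"{arousal_label}/{valence_label}"


if __name__ == "__main__":
    print(quadrant(AffectSample(valence=0.6, arousal=0.7)))
    print(quadrant(AffectSample(valence=-0.4, arousal=0.2)))
```

The labels are deliberately neutral; attaching emotion names to each quadrant would be a further modelling decision.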

PUPILLOMETRY AS A METRIC FOR QOE
Pupillometry is the study of the reactivity of the pupil in the human eye; this reactivity is measured by capturing the change in pupil size. A pupil constantly constricts or dilates due to neurological activity in the human brain, also described as cognition. High-level cognition reflects mental effort, whereas neutral or low-level cognition shows satisfaction or uninterested behaviour. The relationship between eye movements and pupil responses can be used to interpret brain functions, as shown in Table 1. Pupil responses are not discrete, so temporal data analysis is required to assess the response accurately. Machine learning and deep learning methods can therefore be employed to capture the constriction and dilation patterns (Rafique et al. 2021a).
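To make the need for temporal analysis concrete, the following sketch shows one simple way to pre-process a pupil diameter time series: each sample is expressed relative to a baseline and the result is smoothed with a trailing moving average. This is only an illustrative pre-processing step, not the machine learning or deep learning approach of Rafique et al. (2021a). The function names, the millimetre units and the choice of baseline (the mean of the first few samples) are assumptions.

```python
from statistics import mean


def relative_dilation(diameters_mm, baseline_n):
    """Express each sample as a fractional change from the baseline.

    The baseline is the mean of the first `baseline_n` samples (an assumption).
    """
    if baseline_n <= 0 or baseline_n > len(diameters_mm):
        raise ValueError("baseline_n must be between 1 and the number of samples")
    baseline = mean(diameters_mm[:baseline_n])
    return [(d - baseline) / baseline for d in diameters_mm]


def moving_average(values, window):
    """Smooth a series with a trailing window; early samples use a shorter window."""
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed


if __name__ == "__main__":
    series = [4.0, 4.0, 5.0, 3.0]  # made-up pupil diameters in mm
    change = relative_dilation(series, baseline_n=2)
    print(change)
    print(moving_average(change, window=2))
```

Positive values indicate dilation relative to the baseline and negative values indicate constriction; a sequence model could then be trained on such series.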
This paper proposes that pupil responses can also be used to assess QoE from the technology's perspective. Figure 2 shows low-cost but comfortable equipment for capturing pupil responses and gaze detection. Pupil size variations, fixation and eye tracking can be combined to measure QoE against individual functionalities of the technology under study without affecting normal activities.

Table 1
Study                            Application
Hess (1965)                      Interest in food and music
Hahnemann and Beatty (1967)      Decision making
Ellermeier and Westphal (1995)   Pain estimation
Ramsøy et al. (2012)             Visual processing
Nowack et al. (2013)             Online information processing
Wykowska et al. (2013)           Action-planning task
Sirois and Brisson (2014)        Data processing
Zekveld and Kramer (2014)        Language processing
Strohl et al. (2016)             User experience
Rafique et al. (2021a,b)         Emotions classification

Figure 2: Eye Tracking device by Moritz and Patera (2014)

1 https://pupil-labs.com/
2 https://www.tobii.com/
Relating this back to the BCG app, this method would be particularly useful for making assessments in real time at real locations, i.e. in a hospital. Eye-tracking can be used to detect the function in use, while pupil response can be used to understand the cognitive load. A proposed framework for assessing technology-based QoE would include capturing a video recording of pupil size variation as well as the visual context through a forward-facing camera. Furthermore, eye gaze will also be captured and recorded by the equipment shown in Figure 2. This data can then be fed into a deep learning model, as described by Rafique et al. (2021a), to classify the user's behaviour in terms of arousal and valence, with high arousal and high valence for a specific functionality indicating that it is the most convenient to use. A simple mathematical equation representing QoE is shown in Equation 1, where x, y represent the gaze location on the video frame showing the visual content and t represents the time. Analysis of valence and arousal data for specific visual content needs to be done for a time interval defined by the gaze detection module.
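The framework described above could be prototyped along the following lines. The sketch assumes that an upstream model has already produced a valence and arousal estimate for every time-stamped gaze sample, and that the app's functions occupy known rectangular regions of the video frame. It then averages valence and arousal per region. This is an illustrative aggregation rather than the paper's Equation 1; `Region`, `Sample`, `per_region_affect` and the region names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Region:
    """A rectangular area of the video frame tied to one app function (hypothetical)."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1


@dataclass(frozen=True)
class Sample:
    """One gaze sample with affect estimates assumed to come from an upstream model."""
    t: float        # time in seconds
    x: float        # gaze location on the video frame
    y: float
    valence: float
    arousal: float


def per_region_affect(samples, regions):
    """Return {region name: (mean valence, mean arousal)} for samples inside each region."""
    grouped = defaultdict(list)
    for sample in samples:
        for region in regions:
            if region.contains(sample.x, sample.y):
                grouped[region.name].append(sample)
                break  # assign each sample to the first matching region only
    return {
        name: (mean([s.valence for s in group]), mean([s.arousal for s in group]))
        for name, group in grouped.items()
    }


if __name__ == "__main__":
    regions = [Region("search", 0, 0, 100, 50), Region("guideline", 0, 50, 100, 200)]
    samples = [
        Sample(0.0, 10, 10, 0.5, 0.5),
        Sample(0.1, 20, 20, 0.0, 1.0),
        Sample(0.2, 10, 100, -0.5, 0.25),
    ]
    print(per_region_affect(samples, regions))
```

Grouping by screen region stands in for the time intervals that a gaze detection module would define; samples that fall outside every region are ignored.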
The proposed method is not suitable, and hence not recommended, for requirements analysis or for getting feedback during the development phase. It should be used after the product/service passes other software testing processes to avoid any technology-based bias in the user's behaviour.

CONCLUSION
In this position paper, we have proposed the use of QoE measurement, using pupillometry as a metric, to provide a more meaningful assessment of the user experience by enabling in-context use, measured in real time. This not only has the advantage of being more realistic than the scenario-driven research completed in previous studies, but also offers the ability to collect quantitative data that can be used to model user behaviour. In areas such as the design of mobile apps for use in hospitals by clinicians, this offers a unique method for studying not only real-world usage but also how digital content can be adapted and personalised.