A Model-based Heuristic Evaluation Method of Exploratory Search

Emilie Palagi, Fabien Gandon, Alain Giboin, Raphaël Troncy


INTRODUCTION
Exploratory search (ES) is a particular information seeking activity in terms of "problem context and/or strategies employed" (White & Roth, 2009). In this paper we will use White's (2016) definition: "[the term] exploratory search can be used to describe both an information-seeking problem context that is open-ended, persistent, and multifaceted, and an information seeking process that is opportunistic, iterative, and multi-tactical. […] Although almost all searches are in some way exploratory, it is not only the act of exploration that makes a search exploratory; the search must also include complex cognitive activities associated with knowledge acquisition and the development of cognitive skills".
Evaluating ES systems is still an open issue. A main question is whether evaluation methods can effectively assess that users' ES behaviours and tasks are actually supported by ES systems. One reason is that these methods rely on a model of ES which is still loosely defined, or at least on a definition which is not yet clear and stable. The few existing evaluations based explicitly on a model proceed at too low a level, avoiding the analysis of a user's exploration process in its entirety (Bozzon, Brambilla, Ceri, & Mazza, 2013; Wilson, schraefel, & White, 2009). Thus, the evaluation methods of ES are still incomplete, as they are not fully based on a suitable ES process model.
Our main goal is to design a set of user-centered methods, based on a suitable model of ES, which provide a better understanding of the user, such as her needs or her behaviours when performing an ES. The method we present here is an inspection method in line with Nielsen's heuristic evaluation (Nielsen, 1994); it makes it possible to evaluate ES systems without users. The method is based on a model of ES emphasizing the transitions between the ES steps, called ES "features" hereafter, and aims to verify whether the evaluated system supports ES behaviours. The method consists of a set of heuristics based on this model and of their associated procedure of use (including the heuristics presented in a checklist format).

THE MODEL'S FEATURES AND TRANSITIONS
The ten features of the model (from A to J) express typical ES behaviours such as having an evolving information need or a serendipitous attitude (Palagi, Gandon, Giboin, & Troncy, 2017). Table 1 presents a non-exhaustive list of possible transitions between these 10 features. The transitions were elicited by comparing the features of the model with actual behaviours of three information-seekers performing an ES task. Two transitions were not observed in the records analysis but inferred from basic behaviours occurring commonly in an ES task. Users can perform the two following transitions in their ES process: G → F: When the user does backward or forward steps, she can have an idea and change her goal(s) of search. The ES process is evolving and, in fact, a user should be able to change her goal at any time in the ES session.
I  G: When the user analyses the relevance of a result, she also analyses the pathway that took her there. If she is not satisfied, she can do backward or forward steps to restore a satisfactory state. This situation is not specific to ES and can occur in all type of information seeking. Indeed, a user can analyse an irrelevant result and come back to the result list for example.
One will notice that a search session always starts with A (Define the search space) and ends with J (Stop the search session).
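To make the structure concrete, the transitions above can be sketched as a small graph. This is our illustrative reconstruction, not code from the paper: it includes only the transitions and feature labels explicitly mentioned in the text (Table 1 lists more), and the session check encodes only the two structural constraints just stated.

```python
# Illustrative sketch (not the authors' implementation): a partial
# transition graph limited to the transitions mentioned in the text.
TRANSITIONS = {
    "A": {"B"},       # Define the search space -> (Re)Formulate the query
    "E": {"D"},       # Pinpoint result -> Put some information aside
    "F": {"B"},       # Change goal(s) -> (Re)Formulate the query
    "G": {"F"},       # Backward/forward steps -> Change goal(s)
    "I": {"G", "D"},  # Analyse results -> backtrack, or put aside
}

def is_plausible_session(trace):
    """Check the two structural constraints stated in the text:
    a search session always starts with A and ends with J."""
    return len(trace) >= 2 and trace[0] == "A" and trace[-1] == "J"
```

A recorded session such as `["A", "B", "I", "D", "J"]` satisfies the constraints, whereas one missing the opening A or closing J does not.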

HEURISTICS OF EXPLORATORY SEARCH
Heuristic evaluation helps experts identify potential issues in a system's interface. We propose a list of ES heuristics (see details in Table 2) presenting interface or interaction recommendations that help users carry out their exploration.
These heuristics are classified according to user interface elements: the home page, the list of results (or data), elements' description, the browsing history (including breadcrumbs and other similar features), and any screen. This classification guides the expert and makes the evaluation easier: the expert can focus on one (element of the) interface at a time and thus split the evaluation, for better attention and a lower cognitive load.

Heuristics' design procedure
The heuristics of ES are based on the model's features and transitions. However, in the design process of the heuristics, we realised that one heuristic is not necessarily linked to only one feature or to only one transition. So we had to cluster features and transitions. First of all, we listed the different ways to facilitate the transitions. For example, in Table 2, the transition F (Change goal(s)) → B ((Re)Formulate the query) can be realised in different ways and implies different needs in terms of interface elements: the user must be able to change her search easily at any time during her search session. This covers different situations, which is why two heuristics are proposed:
- The search bar or the bookmarks are accessible at any time during the search session.
- The system should enable starting a new search from the search bar but also from an element of the interface (e.g. a result, a word/element in the result description…).
After that, we clustered the transitions corresponding to the same action with the same interface elements. For example, while we cannot cluster the two previous heuristics, the transitions E (Pinpoint result) → D (Put some information aside) and I (Analyse results) → D can both be facilitated by bookmarks.
Note that the transitions C, D, E, G, H, I → F have no linked heuristics because no interface element can facilitate the transition to F. Indeed, this feature cannot be translated into a physical action: it occurs when the user decides to change her search goal.
Heuristics facilitating the transitions between features are not sufficient to support the whole ES process: some of the model's features need their own heuristics. For example, feature A (Define the search space) implies a description of the system and tools helping the user to (1) understand what kind of information she will find by using the system, and (2) define her search space (e.g. with a query auto-completion feature, suggestions…). The heuristic corresponding to the transition A → B ((Re)Formulate the query) is more focused on how to facilitate the query (re)formulation after learning more about the system at the beginning of the search session. That is why heuristics that facilitate some of the model's features are also necessary, and we added heuristics that support these features. Note that the heuristics "A & B" and "E & I" are also clustered, sometimes more than once.
For ease of use, the heuristics are classified into different screens. To this end, we sometimes need to split a heuristic in two, e.g. the previous E, I → D:

List of results/data: The system should allow the user to bookmark elements from the result list.

Elements' description: The system should allow the user to bookmark elements from the elements' description.
The evaluator is able to use these classified heuristics to evaluate an ES system without knowing or understanding the model of the ES process or its transitions. She only uses the heuristics and verifies whether the system satisfies all of them.
All the heuristics provide elements that support at least one feature or transition. Thus, if a system satisfies all the heuristics, it supports the ES process and behaviour. This evaluation should be performed by persons familiar with the ES process (e.g. ES system designers or UX designers). The level of knowledge of ES can vary, but a certain level of experience is required. The evaluators will use their own experience and the heuristics to analyse and improve their search system.

Heuristics' use: an evaluation checklist
For an easy use of the heuristics of ES, we propose to present them to the evaluators in a form format. Indeed, first tests of the heuristics showed that they are more understandable when phrased in an interrogative form. The evaluation checklist is at: http://bit.ly/evaluation-checklist

EVALUATION OF THE HEURISTICS OF EXPLORATORY SEARCH
In order to determine whether an ES system supports users' ES behaviours, our heuristics aim to identify the system's features, interface elements or possible actions that are required for the proper achievement of an ES task. We assume that an evaluation of the system with our heuristics of ES allows a better identification of these elements. We want to evaluate the following hypotheses derived from our research questions: (H1) users following the evaluation checklist identify more provided or missing features than users not following the evaluation checklist; (H2) users following the evaluation checklist identify features more systematically than users not following the evaluation checklist.

Methodology and protocol
Twenty persons participated in the evaluation, all of them computer scientists. None of them knew the evaluated systems beforehand. Ten participants evaluated Discovery Hub: five with the heuristics (Group A) and five without the heuristics (Group B). The ten other participants evaluated 3cixty: five with the heuristics (Group A) and five without the heuristics (Group B). In the user tests, we asked Group A to use the heuristics in the form format mentioned in Section 3.2, and we provided Group B with paper and a pen. Groups A and B first familiarized themselves with their respective system through a scenario-based demo. Then Group A (evaluation with the checklist) evaluated the system by following the evaluation checklist, whereas Group B replayed the scenario-based demo of the familiarization phase and formulated all the negative and positive aspects that could facilitate or compromise the exploration.

Metrics
We measured the effectiveness of our heuristic method in terms of precision, recall and F-measure of the participants' answers. For Group A, an answer is each "Yes" or "No" checked in the multiple-choice list; for Group B, an answer is each participant comment that corresponds to an answer in the evaluation form.
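As a sketch of this scoring (our reconstruction, not the authors' code), precision, recall and F-measure can be computed against a gold-standard set of checklist answers; the item names in the example are hypothetical.

```python
def precision_recall_f(gold, answers):
    """Score a participant against the gold standard.

    gold: {item: "Yes"/"No"} for every checklist item;
    answers: {item: "Yes"/"No"} for the items the participant
    addressed (Group B typically covers only a subset).
    An answer counts as correct when it matches the gold standard.
    """
    correct = sum(1 for item, a in answers.items() if gold.get(item) == a)
    precision = correct / len(answers) if answers else 0.0
    recall = correct / len(gold) if gold else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Hypothetical example: 4 checklist items, participant addresses 2.
gold = {"q1": "Yes", "q2": "No", "q3": "Yes", "q4": "Yes"}
answers = {"q1": "Yes", "q2": "Yes"}
p, r, f = precision_recall_f(gold, answers)  # p=0.5, r=0.25, f=1/3
```

A Group B participant who comments on few checklist items thus gets a low recall even when those comments are correct, which is what drives the F-measure gap reported below.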
We created a gold standard by asking two usability experts with a good knowledge of the ES process to evaluate the two systems independently with the evaluation checklist. After a discussion about the understandability of the questions, the two experts reached a perfect consensus. For (H1), considering the F-measures of 83.18% vs. 14.66% and 89.55% vs. 24.99%, we can say that users with the evaluation checklist identify more provided or missing system features than users not following the evaluation checklist. For (H2), with standard deviations of 3.35 vs. 10.32 and 3.45 vs. 9.26, we can say that users following the evaluation checklist identified features more systematically than users not following the evaluation checklist.

Results and conclusion
This evaluation demonstrates that the method significantly helps the evaluators identify more, and more systematically, the presence or absence of elements or possible actions required to effectively support ES behaviours than without the heuristics (Mann-Whitney test, p < 0.02). Our experiment also shows that a more complete evaluation of these systems requires a combined evaluation with well-known usability and user experience heuristics such as Nielsen's and Bastien and Scapin's (Bastien & Scapin, 1993; Nielsen, 1994).
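The Mann-Whitney test used above can be sketched in a few lines of pure Python. This is an illustrative reconstruction only: it uses the normal approximation without tie or continuity correction, whereas with five participants per group the exact test would normally be preferred.

```python
from statistics import NormalDist

def mann_whitney(x, y):
    """Two-sided Mann-Whitney U test, normal approximation
    (no tie or continuity correction); illustrative sketch only."""
    pooled = sorted(x + y)

    def rank(v):
        # Average rank of value v in the pooled sample (handles ties).
        first = pooled.index(v) + 1
        count = pooled.count(v)
        return first + (count - 1) / 2

    n1, n2 = len(x), len(y)
    u1 = sum(rank(v) for v in x) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mu) / sigma  # u <= mu, so z <= 0
    return u, min(1.0, 2 * NormalDist().cdf(z))
```

With two fully separated samples of five scores each, U is 0 and the approximate two-sided p-value falls below 0.02, matching the order of magnitude reported above.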

ACKNOWLEDGMENT
This work was partly funded by the French Government (National Research Agency, ANR) through the "Investments for the Future" Program reference #ANR-11-LABX-0031-01.