This paper reports the empirical findings of a meta-evaluation of the application of work tasks in the evaluation of interactive information retrieval (IIR) systems.
The purpose of the meta-evaluation is to determine whether it is advisable to apply work tasks in future evaluations of IIR systems. We investigate whether test persons’ search behaviour differs between their own real information needs and simulated information needs. The hypothesis is that, if no difference exists, real information needs can validly be replaced with simulated information needs through the application of simulated work task situations. We are also interested in learning what defines a ‘good’ work task situation.
The empirical results of the meta-evaluation provide positive evidence for the application of simulated work task situations in the evaluation of IIR systems. The results also indicate that tailoring work task situations to the group of test persons is important, because it affects the test persons’ motivation. Furthermore, the results show that varying the degree of semantic openness of the simulated situations makes no difference to the test persons’ search treatment. Finally, it is verified that the test persons exhibit a general pattern of assessment behaviour. This finding provides a further experimental reason for permuting work tasks between test persons, in order to avoid bias in the retrieval results.