Serious Games as Input versus Modulation: Different Evaluations of Utility

The paper discusses two different approaches in designing and evaluating serious games: games as inputs in non-game activities, and games as modulation of non-game activities. Playing and gaming offer powerful metaphors and interpretive repertoires for making sense of professional challenges: for example, business and politics may be seen as gameful, while computer engineering may be seen as playful. Serious games are uniquely positioned to support or modify such repertoires, turning them more or less competitive, collaborative, exploratory, rule bound or rule bending etc. Their modulation force thus becomes a distinctive topic of evaluation. We discuss a case study illustrating how a successful assessment of a serious game seen as input for educational activities has obscured its ambivalent modulating influence on creating a playful take on computer engineering. Common glosses of serious games as 'competitive' or 'useful for learning' may divert attention from the relationships between specific game features, such as a particular organization of competitions and score display, and play styles. A successful translation of game playing into a desired professional ethos depends on fine-tuning relevant game features and game related discourse.


INTRODUCTION
Serious games are widely used and praised in a variety of professional fields. Their rise to prominence as forms of professional training and vicarious experience has brought the issue of assessment to the forefront. There is a considerable body of literature on game evaluation, and a significant research thread considers the specificity of serious games, including games in education. We take this concern further and discuss, on the basis of a case study, several open issues in the assessment of serious games that derive from their variable positions in relation to the wider non-game activity.
We start the discussion with a review of evaluation strategies for serious games. We then present related work on serious games and the challenge they pose for the distinction between 'serious' and 'playful'/'gameful' activities. We distinguish between two versions of this contrast that are sometimes conflated: the inward versus outward orientation of game play, and the gameful versus playful styles of playing a game. Both distinctions are relevant for understanding serious gaming practices and for assessing their utility. We go on to discuss the relatively neglected topic of playfulness versus gamefulness in game playing styles, and we argue that such styles may be designed as well as experienced as support for a particular professional ethos. For example, business and politics may be pursued in a gameful, competitive frame (Applbaum 1999), while computer engineering is sometimes defined as a playful, collaborative practice, as illustrated by the classical 'hacker ethic' (Raymond 2001; Himanen 2001). Serious games are most often assessed in light of their contribution to knowledge and skills; we ask how they can be assessed in light of the professional ethos that they inspire. We elaborate this question by discussing an assessment exercise for World of USO (WoUSO), a game designed and played in an introductory operating systems course focused on Linux. We conclude by highlighting the need to understand and fine-tune the design features that shape game playing styles, as well as game related discourse.

RELATED WORK
Games are notoriously difficult to define as a unitary concept, and serious games, which include an apparent paradox in their very name (Statler et al. 2011), extend this challenge. To provide a starting point for our discussion, we define serious games as games which are purposefully associated with a non-game activity by its organisers, in order to improve participants' performance, however defined. Because of this association, serious games are subject to assessment in relation to the success of the respective non-game activity. To put it simply, any game that is organised in support of an activity that may also exist without it is a serious game, and it may therefore be evaluated in regard to its utility.

Evaluating serious games
There is a substantial literature on the evaluation of game usability and related variables such as utility, efficiency, user experience, playability and user satisfaction. Game evaluation may be primarily oriented towards design, focusing on heuristic development and assessment (Pinelle and Wong 2008; Desurvire et al. 2004), or towards understanding and measuring user experiences (Bernhaupt et al. 2007; Obrist et al. 2009; Nacke et al. 2009; Bernhaupt 2010). A further strand of research deals with heuristics that account for successful games and serve as useful tools for designers; some of this research has specifically investigated heuristics that promote learning in serious games (Malone 1982; Habgood 2005).
A related body of evaluation literature focuses on serious games, especially games used for training and education, and discusses their efficacy in relation to specific learning outcomes (Hays 2005; Connolly et al. 2012). There is a wide diversity of methods employed in game evaluation research, depending on the research question. Quantitative estimates of game influence rely on experiments or regression analysis; user experiences may be explored with qualitative or quantitative survey methods, as well as with biometric indicators of embodied experiences.
In the following sections we argue that there is a significant aspect of playing serious games that has received less attention: participants' play styles and their relationship with the non-game activities of interest. Playing styles are shaped by specific game features, for example by how players and other participants are represented in the game (by profiles, feedback, scoring displays). Playfulness may be supported by designing game spaces that afford low-stakes exploration and bricolage, while gamefulness is encouraged by clear outcomes and competition. Therefore, in evaluating game playing styles and their adequacy for the purposes of a serious game, there is considerable scope for identifying heuristics that lead to desired outcomes.

Two axes for describing game playing
Serious games seem intrinsically paradoxical, challenging the serious versus playful contrast. We can identify two distinctions that may be evaded, or outright contradicted, by the concept of serious games.
Firstly, there is the distinction between seriousness and playfulness, which has also been discussed as a contrast between 'gaming' and 'playing' (Zimmerman 2004), or between 'Progress' and 'Play' (Barr et al. 2007). In this classification, playing essentially refers to open-ended, exploratory activities that orient to rules but also defy and create rules, while gaming refers to purpose-driven, rule-constrained and often competitive activities. Any particular game may accommodate a more playful or gameful overall approach, and local combinations of styles in particular situations; at the same time, there are specific game features that encourage one of these styles at the expense of the other.
Secondly, there is the distinction between participants' orientation towards an outcome that is only relevant in play (such as acquiring points, immersion, or character complexity), versus an orientation towards outcomes that are also (even if not exclusively) relevant in non-game settings, such as playing for a higher course grade, for skill development, for impressing a partner, for exchanging in-game currency into cash ('gold farming') or for other direct material benefits (in professional playing, or by betting). Games may even be played with an intense inward orientation, by recruiting elements from non-game contexts (money, relationships, images, or stories) to transform them into in-game resources. There are also games, especially the so-called 'pervasive games', that thrive on this very distinction, playing with the borders of the 'circle of play' (Brynskov and Ludvigsen 2006). Overall, the distinction between game and non-game worlds and logics is relevant insofar as participants orient towards it and use it to order their activities, as is the case with other social distinctions. The game / non-game contrast may be very powerful, or it may simply fade into a routinized irrelevance for participants (Pargman and Jakobsson 2008).
The defining feature of serious games is that they are designed, or at least organised, for players in view of an outcome that is also relevant in a non-game setting, be it skills, knowledge, social relationships or experiences. Participants may still play the game with a completely inward orientation, but for the organisers there is at least some outward orientation. Assessment of serious games usually reflects this outward orientation by estimating the contribution of game participation to the outcomes of interest through controlled experiments or regression analysis (see Figure 1). The inward orientation of play is usually captured by measures of satisfaction and immersion in the game world.

GAME PLAYING AS INPUT AND AS MODULATION
Although the link between game involvement and non-game activities is explicitly referred to in serious game design and evaluation, its varieties are scarcely discussed. Games are implicitly considered as potential contributions or inputs to non-game activities. Depending on the ratio of game to non-game elements in an activity we may speak of gamification (Deterding et al. 2011), when the game elements are introduced locally into a non-game activity, of game insertion, when complete games are part of a broader activity, or even of game encompassment, when game playing becomes the main component of the activity.
There is, nevertheless, an alternative type of relationship between a serious game and its corresponding non-game activity: the game may act as a commentary on the activity, a modulation of that practice. After all, as Bateson (1972) observed, playing a game involves meta-level communication that signals that specific acts are to be taken as playful, not as serious. Still, in non-game contexts people may perform serious activities as if they were playing, drawing on the game's interpretive resources. In other words, the game may be purposefully designed and / or experienced by players as an impetus towards a certain professional style: more or less playful / gameful, collaborative / competitive etc. For instance, in therapy, play may be used to modulate the therapist-client relationship, and also to orient patients towards a lighter approach to life's hardships (Coulter and Rushbrook 2010).
Professionals make use of various interpretive repertoires to make sense of their work and to account for their successes and failures (Mulkay and Gilbert 1982). Metaphors of game and play are common in professional fields. Business, politics, legal counselling and war, as well as private actions such as flirting or being popular, are susceptible to being framed as adversarial games: very 'serious' games, in yet another sense of the term. Framing an activity as a particular type of 'game' is consequential for decision making: it makes quite a difference whether we understand business negotiations as zero-sum or positive-sum games, as competitive or partly collaborative games, or as not being a game at all.
Our focus in this paper is on computer engineering, particularly on the 'hacker ethic' as a professional style. As proof of the relevance of interpretive repertoires in professional communities of practice, the phrase 'hacker ethic' has become a catchphrase since its initial formalization by Levy (1984). Articles and books in academic, professional or underground publications have examined what this ethic is and what it is not, whether it has changed (Mizrach n.d.), and what its social consequences are. Playfulness is one of the core values of the hacker ethic, and it is also associated with its central tension: the hacker / cracker distinction (Raymond 2001; Himanen 2001).
Game playing is not neutral towards professional interpretive repertoires: it resonates selectively with vocabularies of competition or collaboration, playfulness or gamefulness, tolerance or aversion towards rule-bending. Therefore, game playing is not only a fun, intrinsically motivated element of an activity; it may also be seen as a commentary on an activity, modulating it in a particular style.
Games may be designed specifically as an inflection rather than as a contribution; irrespective of the designers' intentions, games afford players distinctive tonalities of professional practice. How can this aspect of serious game playing be evaluated?
In order to establish what sort of interpretive repertoire is encouraged by game playing, evaluators should inquire into players' strategies and also into discourses about game play. For example, how do designers and players present the game interaction: as competitive or collaborative, as playful or gameful, as a metaphorical representation of the non-game activity, or as a contrast to it?

DESIGN AND EVALUATION OF A MODULATING GAME: WORLD OF USO
The game of World of USO (Deaconescu et
When the game reached its 4th edition in 2010, the designer team (which includes the author as a member) decided that the game should be significantly improved to attract more student involvement. The game was rewritten for its 5th edition, in 2011, leading to a palpable increase in student interest and prompting our first systematic assessment exercise. In what follows we briefly present the game and then the experience of this evaluation research, in order to highlight several open issues and to propose tentative solutions.
World of USO includes several types of activities, each contributing to a participant's score:
• Question of the day (QotD): a daily quiz question on technical topics, aimed to keep attention alive and mobilise participation.
• Challenges: each player may challenge one opponent per day in a competition in which they must answer, within a limited time interval, 5 quiz questions on technical topics, drawn at random from a library; participants receive points based on their speed and accuracy, and final scores are weighted in inverse proportion to participants' initial scores, thus favouring fights between a player and his / her superiors in game rank; each player may challenge only one player, but s/he may be challenged by an unlimited number of opponents in a given day, which aims to encourage interaction, as participants ask other colleagues to challenge them.
• Spells: players may convert points into gold and buy various spells to increase their returns or to hinder their opponents.
• Special quests: an offline, team based quest involving all sorts of activities (singing, writing poems, visiting the library or the subway station) and bringing evidence of such travels and exploits at the end of the course lecture.
• Weekly quests: a series of quiz questions that build on complex and off-topic technical challenges, unrelated to the specifics of the course, and various pieces of information on computer science history and culture.
• Game development activities: students may report bugs and suggest features on the project portal, and they may contribute quiz questions for the library that underlies Question of the Day, using a special feature of the game interface.
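The scoring rule for challenges can be illustrated with a minimal sketch. The exact formula used in WoUSO is not given here, so the function names, the time limit, the points per answer and the normalisation by the pair's mean initial score are all assumptions, chosen only to show how weighting in inverse proportion to a player's initial score favours challenges against higher-ranked opponents.

```python
# Hypothetical sketch of challenge scoring; NOT the actual WoUSO formula.

def raw_points(correct, seconds_used, time_limit=300, per_answer=100):
    """Reward accuracy, plus a bonus for the share of time left unused.

    time_limit and per_answer are assumed values for illustration.
    """
    speed_bonus = max(0.0, 1.0 - seconds_used / time_limit)
    return correct * per_answer * (1.0 + speed_bonus)


def weighted_points(raw, own_initial, opponent_initial):
    """Scale raw points in inverse proportion to the player's own initial score.

    The pair's mean initial score is an assumed normalising constant.
    """
    pair_mean = (own_initial + opponent_initial) / 2
    return raw * pair_mean / max(own_initial, 1)


# Same raw performance for both players: 5 correct answers in half the time.
raw = raw_points(correct=5, seconds_used=150)
underdog = weighted_points(raw, own_initial=1000, opponent_initial=3000)
favourite = weighted_points(raw, own_initial=3000, opponent_initial=1000)
print(underdog, favourite)
```

Under these assumed parameters, equal raw performance earns the lower-ranked player more points than the higher-ranked one, which is the incentive the rule is meant to create.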
Figure 2 presents the components of the WoUSO game and their relationship to the USO course activities. Some elements are in direct communication: special quests are rounded off in lecture encounters, quizzes for QotD and challenge tests address technical topics from the course subject matter, and the course forum hosts a thread dedicated to WoUSO, in which players ask for quest hints and comment on the game features. Game administration for the duration of the game (one semester) is carried out by the faculty team and includes question formulation, solving bugs, monitoring game play, adding or modifying the rules of the game (which are not actually frozen at the beginning of the game), occasional raids to detect players that crop points or otherwise cheat the system, and offering hints for Quest participants. The two waving lines indicate zones of inflexion, that is, transitions between styles of activity. WoUSO affords two distinctions in students' activity: the serious (coursework) vs. the game play (WoUSO) line, and also the player vs. designer line. Students engage the game as players and also, occasionally, as designers.

GAME ASSESSMENT
We initially designed the evaluation of WoUSO following established methods for measuring game contribution to learning outcomes and player satisfaction. Because voluntary participation is a key feature of the game, we could not make use of an experimental methodology based on randomization; therefore, we designed a survey that would allow an estimate of the influence of WoUSO play (measured by the final game score) on students' end-of-semester knowledge, measured by their final grades (in the exam and in the practical test). As control variables we introduced students' initial knowledge, measured by a test specifically designed for this purpose and administered at the beginning of the semester, and students' involvement in school work during the semester, measured by laboratory participation.
The resulting path model (Figure 3) was estimated on a population of 347 students, including 87 women (25% of the total); the results are presented in Table 4. We included gender as a control variable because of our expectation, based on familiarity with game involvement and also on the literature on gender and gaming (such as Bonanno and Kommers 2007), that women would be, on average, less interested in WoUSO. In addition to estimating game efficacy for acquiring knowledge, we asked students for an anonymous evaluation of their satisfaction with WoUSO, course and laboratory activities, elicited with a one-page survey (WoUSO contributors, 2012b) in the final week of the semester. We discuss the assessment results below. Although the assessment findings were both interesting and positive, supporting our enthusiasm for further improvements, we gradually realised that they did not reflect some of our key design concerns. This realization and some tentative solutions were brought forward during a qualitative exploration of WoUSO play, relying on interviews with designers and student players, presented in section 5.3.

Evaluation of student satisfaction with WoUSO
The anonymous student evaluation indicates that around 25% of the students participated systematically (several times per month or more) in WoUSO (see Table 2). In order to understand the variability in students' evaluations, we classified the opinions of all respondents who provided answers into three clusters, using the K-means clustering method in the SPSS statistical package (Table 3). The first resulting cluster, comprising 43% of the respondents who volunteered opinions, includes students who are highly appreciative of the game, grading it with 5 or more on a scale from -10 to 10; the second cluster, with 45% of the respondents, includes students who are rather neutral, while the third cluster, grouping the remaining 12%, includes students who dislike the game.
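The clustering step can be illustrated with a small, self-contained k-means sketch. This is not the SPSS procedure used in the study: the ratings below are invented for illustration, the two rating criteria and the starting centroids are assumptions, and the function names are hypothetical.

```python
# Minimal k-means sketch on INVENTED ratings (scale -10..10); not the SPSS routine.

def sq_dist(a, b):
    """Squared Euclidean distance between two rating vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of rating vectors."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def kmeans(points, centroids, max_iter=50):
    """Alternate assignment and centroid update until centroids stop moving."""
    centroids = list(centroids)
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        updated = [mean_point(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Each tuple: one respondent's ratings on two assumed criteria.
ratings = [(8, 9), (7, 6), (9, 8), (1, 0), (-1, 2), (0, -1), (-8, -7), (-6, -9)]
# Assumed starting centroids: appreciative, neutral, critical.
centroids, clusters = kmeans(ratings, [(5, 5), (0, 0), (-5, -5)])
sizes = [len(c) for c in clusters]
print(sizes)
```

The resulting cluster sizes, divided by the number of respondents, give shares of the kind reported above for the appreciative, neutral and critical groups.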

Contribution of WoUSO to student learning
Given that a substantial share of WoUSO activities, namely QotD and challenges, involved solving technical puzzles with direct relevance for course topics, we were interested to see to what extent involvement in the game supports student learning, measured by final grades. Data indicate that game and course participation were, indeed, convergent. In the scatterplot in Figure 4 we can observe that players with the highest scores in the game also had above-average grades in their final exam. The exam grade correlates positively with the game score (R=0.335), and it also correlates positively with the player's rank (R=0.346).
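For readers who wish to reproduce this kind of descriptive check, the following sketch computes a Pearson correlation after excluding outliers above 8000 points, the threshold used in our analysis. The score and grade pairs are invented for illustration and the function name is hypothetical; the sketch does not reproduce the coefficients reported above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# INVENTED (game score, exam grade) pairs, for illustration only.
records = [(500, 5.0), (1200, 6.5), (2500, 6.0),
           (4000, 8.0), (6000, 7.5), (9500, 6.0)]

# Exclude outliers with final game scores larger than 8000 points.
kept = [(score, grade) for score, grade in records if score <= 8000]
r = pearson_r([s for s, _ in kept], [g for _, g in kept])
print(len(kept), round(r, 3))
```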
Outliers with WoUSO final scores larger than 8000 points (9 cases) are excluded from analysis. Coefficients describe the student population included in analysis, and they cannot be generalised to a larger population. Therefore inferential information (such as statistical significance) is not relevant to our current analysis.
We used the statistical program Mplus v.5 to estimate the path model from Figure 3, in order to see whether participation in WoUSO is associated with the final exam and practical test grades, when controlling for laboratory participation and the initial test score (students with WoUSO final scores higher than 8000 were excluded from analysis, as outliers). Model results indicate that the estimated influence of game participation on grades in our student population is comparable in magnitude with the influence of laboratory participation and of initial familiarity with the subject matter. It is also interesting that, while being a woman has a small negative association with the final course grade, marking a disadvantage for women students, this disadvantage does not exist for WoUSO participation.

Evaluation of WoUSO as a modulating experience for computer engineering learning
Aiming to achieve a better understanding of student experiences of WoUSO play and of the game's impact in the student community, we extended our assessment exercise to include: 1) Interviews with game developers, asking: what does the game aim for? What are its successes and failures? What problems arose in game development and organization? These interviews served as dialogical reflections on the game, bringing forward the team's aims and reservations about the game; 2) An analysis of the online presence of WoUSO, including: a) conversations on the dedicated forum thread, and b) all mentions of WoUSO on blogs, web pages etc.; 3) Interviews with students, conducted after game completion by a team of doctoral students in Sociology, as part of a project on student experiences in technical education. Interviews included a section that probed how WoUSO is presented in a conversation about student life, that is, how the game experience is used by students as a discursive resource to make sense of what it means to be a junior student in our Department. Respondents were recruited by interviewers through e-mail invitations indicating their research interest in student experiences, without any mention of WoUSO or any other specific elements of the course or the curriculum. Since interviewers were visibly from outside the Department, and interviews were confidential, we could thus investigate whether WoUSO is a memorable and noteworthy experience for students in the interview context when there are no obvious prompts, and what aspects of it are brought into discussion. Eight students have volunteered to participate in interviews to date, including intensive players, occasional players and non-players; the evaluation is therefore rather tentative at this point.
During this qualitative and open-ended inquiry I have come to realise, as a participant in the game development and as a reader of interviews, that WoUSO was designed primarily as an impulse towards playfulness in study and work, and not as a fun drill practice for students. Finding that WoUSO play does contribute to higher knowledge was, of course, good news, but it did not actually address the more pressing evaluation question: is the play experience cultivating an enthusiasm for technical elegance, for computer humour, for collaboration in the spirit of openness and collective problem solving? Is WoUSO encouraging students to work and learn in the spirit of the 'hacker ethic' as it is convincingly summarised by Levy (1984), Torvalds and Himanen (Himanen 2001) and Raymond (2001)?
The qualitative inquiry does not yet allow a formulation of definite answers. Still, interviews and text analyses point towards several ambiguities, or even contradictions, in WoUSO game playing styles. The game challenges (in which players compete against one another in answering technical quiz questions) are meant to encourage swift problem solving and thus to reward proficiency. Still, it appears that a significant proportion of the played challenges are in fact simulated competitions, in which a player competes either with amicable adversaries or even with him/herself by using other players' accounts, obtained as a favour or, possibly, by cracking them. For some of the players who approach the top of the hierarchy, considerable energy goes into managing the social networking useful for gathering large amounts of points in challenges or in quests. Players are actively trying to find bugs and to trick the game system, in what is generally seen as proof of technical expertise. Still, only a fraction of the detected bugs are reported, usually after being exploited for a while. There is little collaboration in joint problem solving; collaboration usually consists in offering complete answers, participating in challenge competitions, or donating accounts. On the other hand, the launch of a new Weekly Quest is an occasion of intense involvement, extended exploration and in-depth search for technical information and advice on the Internet. Weekly Quests, rather than challenges, seem to be the game activity that encourages students towards enthusiastic problem tackling and technical savoir faire. The fact that Quests are also amenable to conversation as individual topics, without reference to the total score of the player, means that they offer a valuable opportunity for players to start afresh in the game, building and presenting a competent character.
Overall, WoUSO participation is not only quantitatively but also qualitatively uneven: the relatively few top players seem largely oriented to accumulating points at virtually any cost, including rule bending and system cracking, while the lower ranking players play by the rules, with some unease at their low standing, at their lack of chances to recover a better position in the hierarchy, and with a feeling that the highest scores are at least in part due to unfair advantage. This polarised participation is encouraged by the game scoring and feedback system, which continuously displays, on the front page, the top 10 players according to the total accumulated points; the overall hierarchy is available one click away.
Descriptions of WoUSO, formal and informal, in interview accounts and online presentations of all sorts, make reference to the game's competitiveness, which renders it addictive and fun, and praise the fact that it promotes student learning and insightful technical thinking. Still, the positive and unproblematic presentation of competitiveness as a stimulating game feature obscures the downsides of the current organization of competition in the game and the possibility of alternative organization: for example, by promoting competition between teams more intensively than competition between individuals, and by publishing results at a lower level of aggregation (such as displaying winners for every Quest, or making the top of weekly performance more visible), thus encouraging new entries into competitive encounters. Similarly, the positive and unproblematic mention of WoUSO's usefulness for knowledge acquisition obscures, from students and from designers, its ambivalence in encouraging playfulness as a style of work.
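The idea of publishing results at a lower level of aggregation can be sketched as a weekly leaderboard computed from a log of point-earning events. This is an illustrative sketch only: the event format, the player names and the function name are hypothetical and do not describe the WoUSO code base.

```python
from collections import defaultdict


def weekly_top(events, n=3):
    """Return the top-n players per week from (player, week, points) events.

    Ties are broken alphabetically by player name (an assumed convention).
    """
    totals = defaultdict(lambda: defaultdict(int))
    for player, week, points in events:
        totals[week][player] += points
    return {
        week: sorted(by_player.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
        for week, by_player in sorted(totals.items())
    }


# INVENTED event log: (player, week number, points earned).
events = [
    ("ana", 1, 300), ("bogdan", 1, 120), ("bogdan", 1, 200),
    ("ana", 2, 50), ("carmen", 2, 400), ("bogdan", 2, 90),
]
board = weekly_top(events, n=2)
print(board)
```

Because each week starts from zero, a late entrant such as the hypothetical player "carmen" can top a weekly board without having to catch up with the cumulative hierarchy.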

CONCLUSIONS
This paper discusses several specific issues in assessing the utility of serious games. Serious games are linked to a non-game activity, and this relationship is usually modelled as one of input-output: the game is seen to contribute to the achievement of desired results. Quantitative evaluation of serious games as useful components of an activity usually involves experiments or regression analysis, and it is the dominant assessment strategy. We argue that a significant dimension of the relationship between games and non-game activities is relatively under-studied: games may also be 'serious' insofar as they modulate an activity and cultivate a specific style or ethos, such as a playful (or a gameful) professional orientation. For instance, playing a game for business training may have an influence on understanding business as a game, and even as a game of a very particular kind. There are many ways to play a game, but some of them are favoured by game features, in specific play situations. Game playing styles may be designed and experienced as a vicarious professional ethos, contributing to the constitution of interpretive repertoires for professional activities.
We discuss, as a case study, the game World of USO. Its assessment indicates that the game had a positive influence on student knowledge. Still, this is not all there is to be said about its contribution to student learning. An inquiry into the playing styles that the game affords indicates that several of its features seem to distract players from a playful and collaborative style, encouraging instead intensely competitive strategies. These styles of play at best become irrelevant for cultivating technical finesse, and at worst encourage a malicious kind of savoir faire, turning would-be hackers into crackers. We hypothesise, at this point of the evaluation exercise, that the game scoring and feedback system promotes intensive competitiveness at the expense of technical playfulness; we also conclude that challenges should be redesigned to better match the desired game playing style. Quests are the game activity that seems to have the highest potential to cultivate a playful take on technical complications. To make a more general concluding point, the cultivation of a desired game playing style is amenable to heuristic devices, building upon evaluative research.
Last but not least, designing a game as modulation, not solely as input, also requires the development of an adequate vocabulary to describe the game, including game presentations and debriefing sessions for players. In order to make full use of a serious game's potential to modulate its associated activity, specific features and game-related discourse should be fine-tuned to match the desired ethos.

Figure 2. Integration of WoUSO game and USO course activities (faculty designed activities are marked with blue; game design activities are marked with yellow; student play activities are marked with green)

As an open source, experimental game, designed and maintained by volunteers, WoUSO has evolved through a collection of initiatives, oriented towards the broad objective of stimulating junior students' interest in technical sophistication and computer fun. Unlike many serious games used in education, WoUSO participation is voluntary: students choose whether to play it or not, and their game score is completely independent of their course grade. Players with the highest final score win the game. Each year, at the end of the first semester, WoUSO hosts a festivity, awarding prizes for the highest scoring ten participants.

Figure 4. Scatterplot of players' score in WoUSO and their final exam grade in the USO course. N=233

Table 1. Game play orientation, style and corresponding assessment strategies

Figure 1. Linking games to non-game activities and formulating corresponding assessment strategies. Diagram labels: Game; Action; Result; Assessment; "Game is in activity"; "Game is about activity"; "Do players have a playful or a gameful approach?"; "Is game play used as an interpretive resource?"

Table 2. Distribution of frequencies for student participation: "How often have you participated in the ?"

Table 3. Student classification in three clusters according to their patterns of evaluation for WoUSO. Evaluation for all criteria is marked on a scale from -10 to +10.

Table 4. Standardised results (STDYX) for a path model estimating the influence of WoUSO participation on two measures of student learning