The hidden assumptions in public engagement: A case study of engaging on ethics in government data analysis

This study examines the hidden assumptions around running public-engagement exercises in government. We study an example of public engagement on the ethics of combining and analysing data in national government – often called data science ethics. We study hidden assumptions, drawing on hidden curriculum theories in education research, as this approach allows us to identify conscious and unconscious underlying processes related to conducting public engagement that may impact results. Through participation in the 2016 Public Dialogue for Data Science Ethics in the UK, four key themes were identified that exposed underlying public-engagement norms. First, that organizers had constructed a strong imagined public as neither overly critical nor supportive, which they used to find and engage participants. Second, that official aims of the engagement, such as including publics in developing ethical data regulations, were overshadowed by underlying meta-objectives, such as counteracting public fears. Third, that advisory group members, organizers and publics understood the term 'engagement' in varying ways, from creating interest to public inclusion. And finally, that stakeholder interests, particularly government hopes for a positive report, influenced what was written in the final report. Reflection on these underlying mechanisms, such as the development of meta-objectives that seek to benefit government and technical stakeholders rather than publics, suggests that the practice of public engagement can, in fact, shut down opportunities for meaningful public dialogue.

• The hidden assumptions of conducting public engagement include norms that shape both the design of public participation activities and the reporting of results.
• These norms reflect familiar themes in theory around public engagement, including how imagining the public can limit who is involved in engagement exercises.
• Practitioners of public engagement should take note of what is meant by terms such as 'engagement' to different types of participants, including publics, government, academics and industry.

Introduction
Public engagement with data science is an emerging field of interest in the wake of government and public attentiveness to the ways that personal and public data are used in government settings. This public attentiveness, and subsequent public-engagement interest, has been located around several high-profile events that have raised concerns about privacy and safe data sharing, for example in the care.data and Google DeepMind cases (Caldicott, 2016; Carter et al., 2015; Powles and Hodson, 2017). Data science is the combination and application of data, including 'born digital' data, such as Twitter feeds, and more traditional forms of 'digitized' data, such as administrative government records. Data science is posited to create smarter, more responsive government services (Drew, 2015). However, others are concerned that these smarter services also have the potential to do harm. For example, is it alright for children to be digitally monitored to predict school and social outcomes (Vale, 2016)? Is it alright not to track these children, if following them could flag a child in need of support? The benefit, and indeed harm, of these systems is dependent on individual and community perspectives related to privacy, harm prevention, political ideology, personal values and more.
There have been a number of attempts to determine the public view on privacy issues around data science, and particularly on data sharing (Cameron et al., 2014; Davidson et al., 2013; Ipsos MORI Social Research Institute, 2016b; MRC, 2007; Sciencewise, 2012, 2014). Thus far, these government public consultations have found publics hesitantly supportive of the use of personal data that is informed by a clear public benefit. However, concerns are often embedded in commercial access to personal data, and the opacity of algorithmic mechanisms such as machine learning, that is, digital and automated processes of decision making (Davidson et al., 2013; Cameron et al., 2014; Ipsos MORI Social Research Institute, 2016b; Sciencewise, 2014). Public engagements around data use in the UK have thus far mostly been one-off workshop-style events (Aitken et al., 2016) that probe public opinion on privacy and consent. There has been little critical reflection on these emerging data science engagements, nor on how their limited focus on privacy and consent may influence what is seen as available for public influence. As Selin et al. (2017: 636) argue of these fixed views of public opinion engagements, 'practices are thus often shielded from being contingent, mobile, and ultimately intertwined'. They go on to suggest that attention is often not 'paid to … the ways in which [engagement] design, publics, and findings are co-produced' (ibid.). Thus, the organization of public engagement, in this case with data science, is an opportunity for critical study.
Definitions of public engagement with science and technology vary. At the most basic level, it is some form of interface between individuals who develop or govern technology and 'publics'. There are several relevant definitions and descriptions to consider. For example, Rowe and Frewer (2005) discuss the nature of such interfaces and suggest a typology of public-engagement mechanisms: communication, consultation and participation, where engagement moves from telling, to asking, and ultimately to including publics in science and policy. Fiorino (1990) discusses the purpose of engagement, wherein engagement is instrumental, substantive or normative. Instrumental engagements include publics due to operational requirements, while substantive engagements include publics to improve a technology. Normative engagements include publics because they are due a role in technologies that may impact them. For the purposes of this paper, we broadly define public engagement as the inclusion of non-technical and non-governing publics in the imagination, development and/or regulation of technology (Rempel et al., 2018), in this case, government data science.
Definitions of the public vary from stricter, more defined categorizations to fluid and dynamic ones. Positivist perspectives, such as that of Renn (2008), categorize publics in distinct groups, that is, as stakeholder, affected, observing or general publics. Newman (2011), on the other hand, argues for dynamic and fluid publics who are called upon and produced through summoning, mediation and mobilization by leaders. This paper more closely aligns with Newman's (2011) view, and considers publics to be dynamic groups of individuals defined both through self-identification and political process. This challenges the notion of 'the public' as a singular group with a singular point of view on any science or technology topic. It also highlights that publics hold varying degrees of power within public-engagement exercises, where non-governing and non-technical publics are often not given, or able to take up, the responsibility of governing technological development.
In reflecting critically on the organization of public engagement, we draw on theories of the 'hidden curriculum' (Cribb and Bignold, 1999). The hidden curriculum, popularized in education research, exists alongside the explicit syllabus and describes embedded methods of teaching (Cribb and Bignold, 1999; Cotton et al., 2013). For example, Cribb and Bignold (1999: 197) discuss the roles of 'loss of idealism' and 'emotional socialisation' in medical education. Loss of idealism refers to an increasing sense of cynicism among medical students over the course of their education, while emotional socialization refers to the strategizing and 'management' of emotions introduced through medical education, such as increased exposure to death (Cribb and Bignold, 1999). While the term 'hidden curriculum' may suggest intentional deceit surrounding professional practices, it is better considered as a way to identify the underlying processes related to conducting any kind of work.
The 'on the surface' processes related to running a public engagement are by definition transparent, but 'below the surface' processes offer an opportunity for a more nuanced critical consideration and a contrast with these more transparent procedures.The hidden curriculum perspective provides a theoretical mechanism through which to analyse qualitative data on the underlying norms and practices of public engagement that may influence the results of, and power relations within, public-engagement exercises.For example, we can ask questions using this theoretical framework to understand whether publics have substantive influence on policy changes related to technology regulation.Thus, the aim of this study is to examine an example of how public engagement with data science is currently operationalized from the perspective of the hidden curriculum.

Methods and context
The ethnographic site
This study used ethnographic methods to observe a government-led public engagement with data science in the United Kingdom. The Public Dialogue on Data Science Ethics (the Dialogue) ran from late 2015 to mid-2016 as a joint venture by the Government Data Science Partnership (the Government Digital Service, the Office for National Statistics and the Government Office for Science), Ipsos MORI and Sciencewise. The Dialogue aimed to identify what 'the public' thinks is appropriate for government data science, inform an ethical framework and set future goals for engagement. Consultations were with 88 individuals over five events: two in London, and one each in Sheffield, Wolverhampton and Taunton. The workshops involved small-group discussion, deliberation and discussion of examples of government data science, and hypothetical deliberative scenarios. A further 2,003 people were surveyed on hypothetical government data science projects to determine what was most important to the public in data science acceptability: data type, use of aggregate versus individual data, scope of coverage of the data set, purpose, the human role in the project, or the clarity of decisions. Results of the Dialogue are reported in Public Dialogue on the Ethics of Data Science in Government by Ipsos MORI Social Research Institute (2016a) and by Drew (2016), as well as used to develop an online game called Data Dilemmas that allowed users to find their 'data personality'.

Methodology
Fieldwork over four and a half months included observation and participation in planning exercises and Dialogue events. The first author (ESR) attended all meetings and events, and was joined by the second and third authors (JB and HD) for the final government workshop. The final data set included ethnographic notes taken at 12 teleconference meetings, 3 larger face-to-face advisory group meetings, 2 public workshops and 1 government workshop, as well as the final Dialogue report and 3 blogs written by the project leads. Private emails and documents were not included, to protect the privacy of the organizers. Ethnography was chosen for its informal method of capturing a social phenomenon, as outlined by Hammersley and Atkinson (2007). Essentially, ethnography allows for the description of the social world, in this case describing how an engagement process was undertaken. As well, Cotton et al. (2013: 196) argue that 'observation is widely regarded as an important tool for revealing the nuances of the hidden curriculum … since hidden curriculum research entails the search for meanings and contexts which may not be immediately visible to actors in that context'. As hidden curricula are in situ social phenomena that may not be obvious to those who run public engagements, ethnography allows the researcher to query these curricula without disturbing the context. Thus, ethnography allowed the lead author to observe the processes of the Dialogue while not interrupting the 'hidden' norms that she sought to understand.
Inductive thematic analysis was used to analyse the notes and text data. NVivo 10 was used to support an iterative process of coding, theme identification and review (Braun and Clarke, 2006). Initial codes were reviewed by all authors, and a final code set was developed and subsequently used to code the remaining notes and text documents. The final codes were then again discussed by all authors and grouped into themes. Inductive thematic analysis develops themes from the data themselves and does not apply a pre-existing code set to the notes. This is an iterative and dynamic process of reading the notes, developing candidate themes and codes, and rereading and recoding until the authors are satisfied that the themes developed represent the notes taken as well as the questions of interest. All notes were taken, and primary analysis was conducted, by ESR, while all three authors refined and developed the final themes presented. The key question of interest was to query the practices and norms of public engagement that may influence the explicit and 'on the surface' results of the exercises. Essentially, these questions asked: how is a public engagement run? And how does that positively or negatively influence the published results of the engagement? Themes are presented below and are substantiated through examples from the Dialogue, as well as quotations from published Dialogue materials where relevant. While the themes are set within the context of data science, they may also be relevant to other technologies that are subject to government-led public engagement.
This study received ethical approval from the Department of Psychology's Research Ethics Board at the University of Bath.

Results and discussion
There were several lessons or themes identified through the ethnographic study.We will discuss the four lessons most relevant to the 'hidden curriculum', those things that shaped 'on the surface' processes: (1) how organizers identified an appropriate public; (2) how they set the purpose of the dialogue; (3) the variations in nomenclature; and (4) how interests of stakeholders were negotiated, particularly through the theme of setting the discussion parameters.We end with a brief discussion on the remaining minor themes, including time as a key constraint, how meaning was interpreted from the Dialogue materials, and mechanisms of validation through publication.
Lesson 1: Constructing the imagined public
Inherent to any public engagement, and indeed this Dialogue, is imagining and constructing a public. While there was relatively unproblematic referral to the Dialogue participants as 'the public', organizers and advisory group members made several references to seeking opinions from the 'average' person, who was imagined to be neither highly critical nor extremely supportive of data science. Members of this group were not imagined to be activists, nor to be highly literate in data science methodologies. Data science literacy was seen as a key group identification criterion, and separate workshops were held with 'techy' and 'high data' user groups in order to partition the general and specialist publics. Participants were recruited in particular from a range of areas and cities within England where it was perceived these 'average' opinion holders may reside, for example in Sheffield, rather than solely in London, where it may be more likely to find 'techy' participants. In writing up the results, efforts were made to report a diversity of opinions from these different groups. This was seen to demonstrate to external parties that participants were part of the ideal public, that is, neither too positive nor too critical. Having established the perceived 'average' status of participants, a chapter of the final report was devoted to describing differing perspectives, to highlight that the public held a mix of notionally positive opinions on data science. From the hidden curriculum perspective, we can see these imaginations of the 'average' opinion, and the separating out of groups by level of data literacy, as multifaceted practices that shape who the public are, and introduce an element of bias in the reporting of results. Barnett et al. (2012) and Walker et al. (2010) argue that political actors hold preconceived conceptions or imaginations of the public and what they believe; this theme demonstrates how preconceptions can be used to define and shape the kinds of publics that are enrolled in engagement exercises. By imagining the 'real' public as being neither overly critical nor supportive, these sorts of individuals are then targeted for enrolment in the engagement exercise. While this reflects theoretical literature around the fluid and politically constructed nature of the public (Newman, 2011), it also demonstrates how the theoretical process of imagining the public lends itself to limiting who is and is not the 'public' in public engagement (Mahony and Stephansen, 2017). The imaginations are made reality.

Lesson 2: Developing explicit and meta-objectives
The Dialogue reported several explicit objectives, as outlined by Ipsos MORI Social Research Institute (2016a), that have the potential to align with substantive engagement, for example having the public comment on the ethical guide (Fiorino, 1990). If the motivation behind this is to improve the guide, it fits within a substantive aim, while if the motivation is to avoid future public opposition, then it aligns more closely with instrumental engagement. These motivations can be explored more clearly by surfacing underlying or meta-objectives. One key example was the aim to anticipate and assuage public concerns around data science. The Dialogue process was seen as a way to explore and counteract potential public fears, and therefore build a positive space for government data science. As MP Matt Hancock stated, the guideline's purpose was to help 'people in government to feel confident using new techniques' (Hancock, 2016). In fact, during the planning sessions there were several reflections on how this dialogue might help avoid highly publicized negative data events. These underlying purposes to 'counteract fears' were not unproblematic to the organizers. Some raised concerns about this instrumental objective of communicating and calming. Thus, these tensions were highlighted during the processes of the engagement. And while there were efforts to ensure that the Dialogue allowed for open communication from publics, these hidden objectives persisted. As Felt and Fochler (2010: 228) argue, engagement can be a way of 'surveying and assessing potential critical voices', and this was evident in this Dialogue. The Dialogue's more explicit aims of public empowerment seem at odds with the discourse and planning around empowering government data users. It is difficult to determine in final reports whether the meta-objectives or the explicit objectives drove the conclusions. This process of holding two opposing kinds of objectives suggests that despite being explicitly substantive, public-engagement practice may be implicitly instrumental.

Lesson 3: Contesting the meaning of 'engagement'
Organizers, advisory group members and publics held varying conceptions of what was meant by 'engagement'. At times, engagement represented public inclusion, particularly when discussing the workshop components, but it was also referred to as being about generating interest, for example that the online tool should be engaging for users. This led to confusion during advisory group meetings, which included a wide group of stakeholders, where there were evident contestations as to what engagement as an action comprised. For some, engagement was public participation, while for others engagement was seen as education. These contested meanings were exemplified by discussion of the online Data Dilemmas game. It was often referred to as an engagement tool, but also as an education or knowledge tool. The final report stated that the 'engagement tool [will be used] to engage a wider audience in a public debate around data science' (Ipsos MORI Social Research Institute, 2016a: 10). One can see how contested meanings could alter the substance of this sentence. If engagement is evoking interest and the tool excites publics, then the work is done. If engagement is participation, then the tool would need to be followed by more active public involvement. Thus, the former meaning suggests that participating in the game is itself engagement, while the latter suggests that participating in the game should enable further engagement. While authors such as Smallman (2016) demonstrate that academic researchers have in recent years intended engagement to be participative, other meanings are more common outside of academia. The term engagement is multidimensional. In practice, one cannot assume that the academic shorthand of engagement as public participation is consistent in government or industry settings, where engagement more often means communication. Establishing a common language between academia, publics, government and industry may help all parties to reflect critically on the meaning of engagement, and thus on how to encourage more inclusive and substantive engagement practice (where this is at least part of the aim).

Lesson 4: Negotiating stakeholder interests
Throughout the Dialogue, the organizers acted as intermediaries between the wider stakeholder group, consisting of public engagement and data science professionals and interested government departments, and the participants. They negotiated expectations, particularly those of the government departments, by anticipating what could be sensitive topics. For example, discussion around data use to counteract terrorism was considered to be a sensitive case. There was ongoing adaptation to anticipated problems to ensure that these subjects did not become roadblocks. One of the principal concerns was balancing the tone of the engagement, that is, ensuring that the report did not read either too positively or too negatively. It was anticipated that participants would want a representative account of their discussions, while government would want a positive tone, and a negative tone would at times be necessary to balance out the arguments. Organizers had to develop, whether real or perceived, knowledge and imagination of 'hot topics' for both government and publics. The Dialogue materials, the final report, the case studies of government data science and more can be seen as objects processed to be as uncontested as possible. For example, a sub-theme included deciding what sort of materials to present to publics during the engagement. This process of setting the discussion parameters was also a way of drawing lines around what could be said. The Dialogue used real government case studies to stimulate debate, which were designed to test certain perceptions of public concern, such as keeping data secure. Thus, running the workshops was not simply a process of organizing materials and resources, but was also a constant negotiation of what could be said and what should be said, both to publics and to stakeholders. This anticipation of concerns furthers an understanding of public engagement as a process of preventing critique (Felt and Fochler, 2010). This theme suggests that beyond public engagement identifying critique, it is also a practice of anticipating concerns for the engagement itself.

Other lessons and practices of interest
There were several underlying practical constraints during the Dialogue, including managing time and public understanding. From inception meeting to final report, the Dialogue lasted just six months. This short timescale left little time for reflection during the Dialogue process, and organizers were in a constant cycle of iterating plans, reports and workshop materials. At one point during the project, the team piloted an education session on a Friday night and proceeded to reconfigure it for another session on Saturday morning. While organizers planned to allow spontaneous suggestions for ethical concerns, there was very limited time for this process after the education and deliberation components. The discussion was then limited to these predetermined ethical concerns. These limitations are then, necessarily, reflected in the reporting, where making meaning is performed. As Lezaun and Soneryd (2007: 288) argue, the reporting stage is about 'attempting to officialise a singular meaning for the exercise'. In order to maintain clarity of meaning, the report was written with different prospective readers in mind, for example media versus government, and organizers focused on avoiding writing in a way that could be interpreted negatively. The final, and perhaps most practical, step in the engagement process was releasing these results. A final stakeholder event was held to discuss the report with a wider group of interested parties. In interacting with these larger groups, there was a focus on not overstating the applicability of the results. For example, the final report states that 'the views of proportions of the qualitative group should not be extrapolated to the population at large' and that the 'results are intended to be illustrative rather than statistically reliable' (Ipsos MORI Social Research Institute, 2016a: 66). These caveats can function as a way of safeguarding the organizers from the aforementioned negative meanings.

Critical reflections on the lead author's role in the engagement
The first author's role in the Dialogue was as an observer, and at rare times a participant in larger meetings. However, due to her limited role, there were meetings and communications that she was unable to observe. It is impossible to know whether following more private exchanges would have changed the nature of the themes discussed here. The majority of her notes come from teleconference meetings, which also risk missing non-verbal cues. However, this allowed her to take notes freely without attracting notice. She found it challenging to feel comfortable in her role, being a non-expert in the design of public-engagement exercises, but being unaware of the norms of public engagement allowed her to highlight processes that she may not have recognized as hidden knowledge had she been more familiar with government-run public engagement. As time went on, she felt more comfortable as an observer and found ways to give herself a sense of value in the process. Although she attempted to minimize incorporating her own views on data science and public engagement in her interpretation of the process, as she became more familiar with the organizers, she suspects that her comfort led to unconscious self-filtering to avoid reflecting the Dialogue in a negative light. Evaluation of any dialogue is challenging; as Rowe et al. (2005) highlight, it is difficult to define effectiveness in public engagement. As such, and due to the limitations of ethnography, this study is meant to be descriptive rather than evaluative.

Concluding thoughts and reflection on future practice
Public engagement with data science is becoming increasingly popular in an attempt to understand where the social and ethical lines should lie in government data science. The Public Dialogue on Data Science Ethics is an example of this. This ethnographic study examined the Dialogue to understand how a public engagement with data science is operationalized from the perspective of the hidden curriculum. Embedded within the practical considerations of organizing meetings and creating stimulus plans are the underlying processes of negotiating stakeholder interests, identifying who the public is, and setting the purpose of the Dialogue. There was also a key lesson in defining what engagement consists of; examples in this Dialogue included an 'engaging' online quiz-style tool and workshops that were a public 'engagement'. With cautious interpretation that this Dialogue's processes are in some manner reflective of wider trends, the authors conclude by questioning how these processes of doing engagement may impact the space for public involvement in data science, now and in the future.
In a recent paper, Rowe and Watermeyer (2018) highlight, similarly to our discussion here, that public engagements face several 'dilemmas' that limit the substantive nature of public inclusion. Of relevance to this paper, they highlight how public engagements are less likely to happen in cases where controversy is expected. We add to this that controversy is also managed within engagements themselves, as organizers anticipate concerns and set the parameters of what can be discussed.