PIC, a paediatric-specific intensive care database

Abstract

PIC (Paediatric Intensive Care) is a large, paediatric-specific, single-centre, bilingual database comprising information on children admitted to critical care units at a large children’s hospital in China. The database is deidentified and includes vital sign measurements, medications, laboratory measurements, fluid balance, diagnostic codes, length of hospital stay, survival data, and more. The data are publicly available after registration, which requires completing a training course on research with human subjects and signing a data use agreement that mandates responsible handling of the data and adherence to the principle of collaborative research. Although PIC can be considered an extension of the widely used MIMIC (Medical Information Mart for Intensive Care) database into the field of paediatric critical care, it has many unique characteristics and can support database-driven academic and industrial applications such as machine learning algorithms, clinical decision support tools, quality improvement initiatives, and international data sharing.

Measurement(s)	Demographics • Vital Signs Measurement • Medication • clinical laboratory measurement • fluid balance • length of hospital stay • survival time • microbiological information
Technology Type(s)	digital curation
Sample Characteristic - Organism	Homo sapiens
Sample Characteristic - Environment	Intensive Care Unit
Sample Characteristic - Location	China

Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.11481810

Author and article information

Contributors

Qiang Shu: shuqiang@zju.edu.cn

Haomin Li: hmli@zju.edu.cn (ORCID: http://orcid.org/0000-0002-6420-7719)

Journal

Journal ID (nlm-ta): Sci Data

Journal ID (iso-abbrev): Sci Data

Title: Scientific Data

Publisher: Nature Publishing Group UK (London)

ISSN (Electronic): 2052-4463

Publication date (Electronic): 13 January 2020

Publication date PMC-release: 13 January 2020

Publication date Collection: 2020

Volume: 7

Electronic Location Identifier: 14

Affiliations

[1] GRID grid.411360.1, The Children’s Hospital of Zhejiang University School of Medicine and National Clinical Research Center for Child Health, Hangzhou, China

[2] ISNI 0000 0004 1759 700X, GRID grid.13402.34, The College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China

Author information

Haomin Li http://orcid.org/0000-0002-6420-7719

Article

Publisher ID: 355

DOI: 10.1038/s41597-020-0355-4

PMC ID: 6957490

PubMed ID: 31932583

SO-VID: b12aae27-5bc2-4d35-a4e7-516a869706c0

License:

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article.

History

Date received: 24 September 2019

Date accepted: 20 December 2019

Funding

Funded by: FundRef https://doi.org/10.13039/501100001809, National Natural Science Foundation of China (National Science Foundation of China)

Award ID: 81871456

Award Recipient: Haomin Li

Funded by: FundRef https://doi.org/10.13039/501100004751, Chinese Ministry of Science and Technology | Department of S and T for Social Development (Department of S&T for Social Development)

Award ID: 2016YFC0901905

Award Recipient: Haomin Li

Custom metadata

Keywords: paediatrics, health services
