Early in the epidemic: impact of preprints on global discourse about COVID-19 transmissibility

Majumder, Maimuna S; Mandl, Kenneth D.

doi:10.1016/S2214-109X(20)30113-3


Author(s): Maimuna S Majumder ^a , Kenneth D Mandl ^a

Publication date (Electronic): 24 March 2020

Journal: The Lancet. Global Health

Publisher: The Author(s). Published by Elsevier Ltd.


Abstract

Since it was first reported by WHO on Jan 5, 2020, over 80 000 cases of a novel coronavirus disease (COVID-19) have been diagnosed in China, with exportation events to nearly 90 countries, as of March 6, 2020.[1] Given the novelty of the causative pathogen (named SARS-CoV-2), scientists have rushed to fill epidemiological, virological, and clinical knowledge gaps, resulting in over 50 new studies about the virus between January 10 and January 30 alone.[2] However, in an era where the immediacy of information has become an expectation of decision makers and the general public alike, many of these studies have been shared first in the form of preprint papers, before peer review. Over the past three decades, preprint servers have become commonplace in the scientific publication ecosystem, and COVID-19 has prompted a seemingly unprecedented use of these platforms.[3]

Although peer review is crucial for the validation of science, the ongoing outbreak has showcased the speed with which preprints can disseminate information during emergencies. In this Comment, we used both preprint and peer-reviewed studies that estimated the transmissibility potential (ie, basic reproduction number [R0]) of SARS-CoV-2 on or before Feb 1, 2020, to investigate the role that preprints have had in information dissemination during the ongoing outbreak. We also analysed the agreement of preprint estimates with those presented by peer-reviewed studies and propose a consensus-based approach for evaluating the validity of preprint findings during public health crises.

For our analysis, we collected publicly available data from scientific studies, news reports, and search trends pertaining to SARS-CoV-2 and its R0. R0 is defined as the average number of secondary infections that a new case might transmit in a fully susceptible population; estimates of R0 can provide decision makers with insights into the epidemic potential of a given outbreak.
Relevant news reports were identified through MediaCloud and search trends through Google Search Trends; both served as proxy indicators for information dissemination. Meanwhile, relevant scientific studies were identified through a combination of Google Scholar searches and, to address possible delays in indexing, searches of four popular public preprint servers (ie, arXiv, bioRxiv, medRxiv, and Social Science Research Network [SSRN]) that we believe are representative of the relevant preprint literature. Search terms and specifications for each data source are outlined in the appendix (p 2).

All studies discovered through Google Scholar, arXiv, bioRxiv, medRxiv, and SSRN were manually checked for relevance to the topic area of interest. We retained only studies that included estimates for the R0 associated with SARS-CoV-2 in the body of the text. This initial data discovery phase yielded 11 individual studies; for each study, we manually curated the date of first publication, publication platform, review status (ie, preprint vs peer-reviewed), and methodological details (appendix p 3).[4–18] R0 estimates were also extracted from each study for further analysis. In the event of multiple R0 estimates (because of preprint revisions after the first version or the use of multiple approaches in a single study), each estimate was recorded and treated as a separate entry to represent all available knowledge at any given point in time (appendix p 3).

Given that the first known preprint estimates for R0 were posted to SSRN by us on Jan 23, we plotted search trend fractions and news report volume between Jan 23 and Feb 1 (appendix p 4). Baseline data for both sources before Jan 23, 2020, showed negligible search trend interest and news report volume, and data collected up to Feb 9, 2020, showed diminishing interest and volume after the catchment window (appendix p 4).
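
The curation step described above (one entry per estimate, so that revisions and multiple approaches remain visible over time) could be represented roughly as follows. This is only a sketch: the class name `R0Estimate`, its field names, the helper `known_by`, and all values are hypothetical placeholders, not the structure or data used in the study.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class R0Estimate:
    """One curated R0 estimate; field names are assumptions for illustration."""
    study: str                      # hypothetical study label
    published: date                 # date this version became public
    platform: str                   # e.g. a preprint server or a journal
    peer_reviewed: bool             # review status at time of posting
    mean: float                     # central R0 estimate
    lower: Optional[float] = None   # lower bound of the reported range
    upper: Optional[float] = None   # upper bound of the reported range
    version: int = 1                # preprint version number
    approach: Optional[int] = None  # index of the approach, if several


def known_by(entries: List[R0Estimate], cutoff: date) -> List[R0Estimate]:
    """Return entries public on or before `cutoff`, by date then study label."""
    available = [e for e in entries if e.published <= cutoff]
    return sorted(available, key=lambda e: (e.published, e.study))


# Placeholder entries only; these are NOT values from the 11 studies.
entries = [
    R0Estimate("Study A", date(2020, 1, 23), "SSRN", False, 2.5, 2.0, 3.1),
    R0Estimate("Study A", date(2020, 1, 26), "SSRN", False, 2.7, 2.2, 3.2, version=2),
    R0Estimate("Study B", date(2020, 1, 29), "Journal X", True, 2.4, 1.8, 3.0),
]

for e in known_by(entries, date(2020, 1, 26)):
    print(e.study, e.version, e.mean)
```

Keeping a revised estimate as a new entry, rather than overwriting the earlier one, is what allows a query such as `known_by` to reconstruct the knowledge available on a given date.
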
To illustrate when each of the 11 relevant studies became available to the public, indicator bars were overlaid on the search trend and news report data by date of publication (appendix p 4). We then plotted each of the 16 R0 estimates produced by the 11 studies, including both the mean and the estimate range (eg, 95% CI, 95% credible interval, and so on) presented (appendix p 3). Estimates were plotted by date of publication and alphabetically therein, offering a side-by-side comparison of preprint versus peer-reviewed results; averages and 95% CIs were also computed for both groups (figure).

Figure: R0 mean and range estimates from 11 different studies of COVID-19 as a function of time
For preprints that were revised before publication of the first relevant peer-reviewed study on Jan 29, the version number is indicated between parentheses as (n). When multiple R0 estimates were presented in a single study because of the use of multiple approaches, the version number is followed by a single decimal place to indicate the approach used (n.n). If a first author published more than one relevant independent study before Feb 1, the version number is followed immediately by an alphabetical marker ordered by date of publication (nx). Ranges presented vary by study (eg, 95% CI, 95% credible interval, and so on) and are presented in the appendix (p 3). R0=basic reproduction number.

Google Search Trends and MediaCloud data suggested that both general (ie, search) interest and news media interest in the R0 associated with COVID-19 peaked before the publication of relevant peer-reviewed studies during the early stages of the epidemic. In the selected time frame, search interest peaked on Jan 27 after a sharp increase between Jan 23 and Jan 25, immediately after the publication of five early preprint studies, all of which estimated R0, in bioRxiv, medRxiv, and SSRN.
Meanwhile, news media interest peaked on Jan 28, coinciding with a sixth preprint study published in arXiv (appendix p 4). The first peer-reviewed estimates were then published by Li and colleagues in The New England Journal of Medicine on Jan 29 at 17:00 h (Eastern Standard Time), followed by four additional peer-reviewed studies in Eurosurveillance, The International Journal of Infectious Diseases, The Lancet, and Journal of Clinical Medicine up to Feb 1.[14, 19]

The average R0 estimate was 3·61 (95% CI 2·77–4·45) across the preprint group and 2·54 (2·17–2·91) across the peer-reviewed group, showing overlap in 95% CIs despite a wide diversity of modelling methods and data sources used both within and across groups (appendix p 3). Although the average for the preprint group was higher than that for the peer-reviewed group, this effect was driven primarily by two upper-limit outlier estimates (with R0 higher than the 95% CI maximum; figure).[9, 10] Exclusion of these two estimates by use of a consensus-based approach based on the 95% CIs yielded an average R0 estimate of 3·02 (95% CI 2·65–3·39) for the preprint group. Notably, two studies in the peer-reviewed group had previously been published as preprints.[15, 16] Although estimates presented by Riou and Althaus remained unchanged after peer review, estimates presented by Zhao and colleagues were higher before peer review than afterwards.

Our findings suggest that, because of the speed of their release, preprints (rather than peer-reviewed literature in the same topic area) might be driving discourse related to the ongoing COVID-19 outbreak. Although our analysis focused on search trends and news media data as a measure for general discourse, it is likely that preprints are also influencing policy-making discussions, given that WHO announced on Jan 26, 2020, that they would be creating a repository of relevant studies, including those that have not yet been peer-reviewed.[20]

Nevertheless, despite the advantages of speedy information delivery, the lack of peer review can also translate into issues of credibility and misinformation, both intentional and unintentional. This particular drawback has been highlighted during the ongoing outbreak, especially after the high-profile withdrawal of a virology study from the preprint server bioRxiv, which erroneously claimed that COVID-19 contained HIV “insertions”.[21] The very fact that this study was withdrawn showcases the power of open peer review during emergencies; the withdrawal itself appears to have been prompted by outcry from dozens of scientists from around the globe who had access to the study because it was placed on a public server.[22] Much of this outcry was documented on Twitter (a microblogging platform) and on longer-form popular science blogs, signalling that such fora would serve as rich additional data sources for future work on the impact of preprints on public discourse.[22]

However, instances such as the one described showcase the need for caution when acting upon the science put forth by any one preprint. With this in mind, taking multiple studies into consideration as presented in our analysis can help operationalise the kind of caution necessitated by preprints while simultaneously allowing for important, robust insights before the publication of a peer-reviewed study in the same topic area. Here, we used a simple method in which we plotted the ten R0 estimates that were posted as preprints before publication of the first peer-reviewed study on Jan 29; we then took the average of these estimates and excluded the two estimates that qualified as upper-limit outliers, both upon visual inspection and as a function of the 95% CI.
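
The simple consensus method just described (average the estimates, drop those above the group's upper 95% CI limit, then re-average) might be sketched as below. The text does not state how the 95% CIs were computed, so the normal approximation used here (mean ± 1·96 standard errors) is an assumption, as are the function names; the input values are illustrative placeholders and not the estimates from the studies.

```python
from math import sqrt
from statistics import mean, stdev


def mean_ci(values, z=1.96):
    """Return (mean, lower, upper) for a normal-approximation 95% CI.

    Assumption: the CI is mean +/- z * sample SD / sqrt(n); needs n >= 2.
    """
    m = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return m, m - half_width, m + half_width


def consensus_estimate(values, z=1.96):
    """Exclude estimates above the group's upper CI limit, then re-average.

    Returns ((mean, lower, upper) of the kept values, kept, dropped).
    """
    _, _, upper = mean_ci(values, z)
    kept = [v for v in values if v <= upper]
    dropped = [v for v in values if v > upper]
    return mean_ci(kept, z), kept, dropped


# Illustrative placeholder values only; NOT the estimates from the studies.
example_r0 = [2.0, 2.5, 2.6, 2.9, 3.0, 3.1, 3.3, 3.8, 5.5, 6.5]

(before_mean, before_lo, before_hi) = mean_ci(example_r0)
(after_mean, after_lo, after_hi), kept, dropped = consensus_estimate(example_r0)
print(f"before: {before_mean:.2f} ({before_lo:.2f}-{before_hi:.2f})")
print(f"after:  {after_mean:.2f} ({after_lo:.2f}-{after_hi:.2f}); dropped {dropped}")
```

Only the upper limit is checked, mirroring the upper-limit outliers discussed in the text; a symmetric rule would also test values against the lower limit.
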
Even before outlier elimination, this simple method yielded average R0 estimates similar to those presented by the peer-reviewed studies subsequently published on and after Jan 29; however, more complex approaches that incorporate weighted averages based on estimate confidence, similar to traditional meta-analytical methods, offer a promising avenue for future work. Such collective, consensus-based approaches will arguably be easiest to use when the research of interest is quantitative in nature; nevertheless, given that many crucial epidemiological parameters that inform decision making (eg, incubation period, generation time, and so on) are quantitative, our proposed approach could work well in these contexts too.

Our work showcases the powerful role preprints can have during public health crises because of the timeliness with which they can disseminate new information. Furthermore, given that two of the preprints included in this analysis were later published in peer-reviewed outlets, the evidence shows that even prestigious journals now permit the sharing of important findings before peer review and that the use of preprint platforms does not jeopardise future peer-reviewed publication.[15, 16] Without question, primacy and peer-reviewed publications are key metrics in individual professional advancement (eg, academic promotion); nevertheless, the impact of preprints on discourse and decision making pertaining to the ongoing COVID-19 outbreak suggests that we must rethink how we reward and recognise community contributions during present and future public health crises.
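
The weighted-average extension mentioned above as future work is not specified in the text. One conventional option from meta-analysis is fixed-effect inverse-variance pooling, sketched here under stated assumptions: each reported range is treated as a symmetric 95% interval on the natural scale, so the standard error is taken as (upper − lower) / (2 × 1·96). The function names and input values are hypothetical placeholders, not study data.

```python
from math import sqrt


def se_from_ci(lower, upper, z=1.96):
    """Approximate a standard error from a symmetric 95% interval (assumption)."""
    return (upper - lower) / (2 * z)


def inverse_variance_pool(estimates, z=1.96):
    """Fixed-effect pooled estimate from (mean, lower, upper) tuples.

    Each estimate is weighted by 1 / SE^2, so narrower intervals count more.
    Intervals must have non-zero width, otherwise the weight is undefined.
    """
    weights = [1.0 / se_from_ci(lo, hi, z) ** 2 for _, lo, hi in estimates]
    total = sum(weights)
    pooled = sum(w * m for w, (m, _, _) in zip(weights, estimates)) / total
    pooled_se = sqrt(1.0 / total)
    return pooled, pooled - z * pooled_se, pooled + z * pooled_se


# Illustrative placeholder values only; NOT the estimates from the studies.
example = [(2.2, 1.4, 3.0), (2.6, 2.4, 2.8), (3.0, 2.0, 4.0)]

pooled, lo, hi = inverse_variance_pool(example)
print(f"pooled R0: {pooled:.2f} ({lo:.2f}-{hi:.2f})")
```

Because credible intervals and asymmetric ranges do not map cleanly onto a standard error, a real analysis would probably need per-study handling (for example, pooling on a log scale or using a random-effects model) rather than this single rule.
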

Most cited references 14

Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus–Infected Pneumonia

Qun Li, Xuhua Guan, Peng Wu … (2020)

Abstract Background The initial cases of novel coronavirus (2019-nCoV)–infected pneumonia (NCIP) occurred in Wuhan, Hubei Province, China, in December 2019 and January 2020. We analyzed data on the first 425 confirmed cases in Wuhan to determine the epidemiologic characteristics of NCIP. Methods We collected information on demographic characteristics, exposure history, and illness timelines of laboratory-confirmed cases of NCIP that had been reported by January 22, 2020. We described characteristics of the cases and estimated the key epidemiologic time-delay distributions. In the early period of exponential growth, we estimated the epidemic doubling time and the basic reproductive number. Results Among the first 425 patients with confirmed NCIP, the median age was 59 years and 56% were male. The majority of cases (55%) with onset before January 1, 2020, were linked to the Huanan Seafood Wholesale Market, as compared with 8.6% of the subsequent cases. The mean incubation period was 5.2 days (95% confidence interval [CI], 4.1 to 7.0), with the 95th percentile of the distribution at 12.5 days. In its early stages, the epidemic doubled in size every 7.4 days. With a mean serial interval of 7.5 days (95% CI, 5.3 to 19), the basic reproductive number was estimated to be 2.2 (95% CI, 1.4 to 3.9). Conclusions On the basis of this information, there is evidence that human-to-human transmission has occurred among close contacts since the middle of December 2019. Considerable efforts to reduce transmission will be required to control outbreaks if similar dynamics apply elsewhere. Measures to prevent or reduce transmission should be implemented in populations at risk. (Funded by the Ministry of Science and Technology of China and others.)

Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study

Joseph Wu, Kathy Leung, Gabriel Leung (2020)

Summary Background Since Dec 31, 2019, the Chinese city of Wuhan has reported an outbreak of atypical pneumonia caused by the 2019 novel coronavirus (2019-nCoV). Cases have been exported to other Chinese cities, as well as internationally, threatening to trigger a global outbreak. Here, we provide an estimate of the size of the epidemic in Wuhan on the basis of the number of cases exported from Wuhan to cities outside mainland China and forecast the extent of the domestic and global public health risks of epidemics, accounting for social and non-pharmaceutical prevention interventions. Methods We used data from Dec 31, 2019, to Jan 28, 2020, on the number of cases exported from Wuhan internationally (known days of symptom onset from Dec 25, 2019, to Jan 19, 2020) to infer the number of infections in Wuhan from Dec 1, 2019, to Jan 25, 2020. Cases exported domestically were then estimated. We forecasted the national and global spread of 2019-nCoV, accounting for the effect of the metropolitan-wide quarantine of Wuhan and surrounding cities, which began Jan 23–24, 2020. We used data on monthly flight bookings from the Official Aviation Guide and data on human mobility across more than 300 prefecture-level cities in mainland China from the Tencent database. Data on confirmed cases were obtained from the reports published by the Chinese Center for Disease Control and Prevention. Serial interval estimates were based on previous studies of severe acute respiratory syndrome coronavirus (SARS-CoV). A susceptible-exposed-infectious-recovered metapopulation model was used to simulate the epidemics across all major cities in China. The basic reproductive number was estimated using Markov Chain Monte Carlo methods and presented using the resulting posterior mean and 95% credible interval (CrI).
Findings In our baseline scenario, we estimated that the basic reproductive number for 2019-nCoV was 2·68 (95% CrI 2·47–2·86) and that 75 815 individuals (95% CrI 37 304–130 330) have been infected in Wuhan as of Jan 25, 2020. The epidemic doubling time was 6·4 days (95% CrI 5·8–7·1). We estimated that in the baseline scenario, Chongqing, Beijing, Shanghai, Guangzhou, and Shenzhen had imported 461 (95% CrI 227–805), 113 (57–193), 98 (49–168), 111 (56–191), and 80 (40–139) infections from Wuhan, respectively. If the transmissibility of 2019-nCoV were similar everywhere domestically and over time, we inferred that epidemics are already growing exponentially in multiple major cities of China with a lag time behind the Wuhan outbreak of about 1–2 weeks. Interpretation Given that 2019-nCoV is no longer contained within Wuhan, other major Chinese cities are probably sustaining localised outbreaks. Large cities overseas with close transport links to China could also become outbreak epicentres, unless substantial public health interventions at both the population and personal levels are implemented immediately. Independent self-sustaining outbreaks in major cities globally could become inevitable because of substantial exportation of presymptomatic cases and in the absence of large-scale public health interventions. Preparedness plans and mitigation interventions should be readied for quick deployment globally. Funding Health and Medical Research Fund (Hong Kong, China).

Preliminary estimation of the basic reproduction number of novel coronavirus (2019-nCoV) in China, from 2019 to 2020: A data-driven analysis in the early phase of the outbreak

Shi Zhao, Qianyin Lin, Jinjun Ran … (2020)

Highlights
• The novel coronavirus (2019-nCoV) pneumonia has caused 2033 confirmed cases, including 56 deaths in mainland China, by 2020-01-26 17:06.
• We aim to estimate the basic reproduction number of 2019-nCoV in Wuhan, China using the exponential growth model method.
• We estimated that the mean R0 ranges from 2.24 to 3.58 with an 8-fold to 2-fold increase in the reporting rate.
• Changes in reporting likely occurred and should be taken into account in the estimation of R0.

Author and article information

Contributors

Maimuna S Majumder

Journal

Journal ID (nlm-ta): Lancet Glob Health

Journal ID (iso-abbrev): Lancet Glob Health

Title: The Lancet. Global Health

Publisher: The Author(s). Published by Elsevier Ltd.

ISSN (Electronic): 2214-109X

Publication date PMC-release: 24 March 2020

Publication date (Electronic): 24 March 2020

Affiliations

[a] Computational Health Informatics Program, Boston Children's Hospital, and Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA

Article

Publisher ID: S2214-109X(20)30113-3

DOI: 10.1016/S2214-109X(20)30113-3

PMC ID: 7159059

PubMed ID: 32220289

SO-VID: 1da0ba21-821e-4a4c-ae45-88a78dd99bc6

License:

Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
