      Data standards and standardization: The shortest plank of bucket for the COVID-19 containment

      review-article


          Abstract

          In the battle against the unprecedented worldwide COVID-19 pandemic, biomedical informatics, especially data standards and data standardization, has played a significant role in many aspects of containment, including understanding disease mechanisms, 1 improving clinical care, 2 triaging resource needs, 3 advising policy-making, 4 implementing public health countermeasures, 5 enhancing technical innovation in syndromic surveillance, 6 developing vaccines, and enabling wide coverage of vaccination. 7 Nevertheless, the development of standards for COVID-19 data collection has faced many obstacles 8 globally since the very beginning of the pandemic, which led to misleading statistics, inefficient communication, biased policy-making, and clinical risks. 9 COVID-19 provided a prominent opportunity to test the data infrastructure in different regions, and many issues and challenges have been exposed. Efforts to access and align existing healthcare data infrastructure in the context of the pandemic highlighted complicated interoperability challenges, which remain significant barriers to real-time data analytics and hurdles for improving health outcomes through data-driven responses. 10 By reflecting on the COVID-19 related data standards in chronological order (Figure 1), recommendations are made with the goal of promoting a globally aligned standardization of healthcare data and the establishment of a community of common health for humankind amid the current and potential future global public health crises.

          Figure 1. Timeline of data standards development during the initial phase of COVID-19.

Recognizing the value of data standards and standardization for COVID-19 containment

Medical practice, in both routine and emergency scenarios, is now continuously recorded by digital systems, covering electronic health records; physiologic, laboratory, and imaging data; and decision-making and treatment information. Therefore, when no clinical trial data are available to inform a rapidly evolving situation or an unknown disease, the public expects rapid, large-scale data collection and analysis to support strategic decision-making and the sharing of best practices. 11 A critical component of the proposed strategy is the democratization of data: all collected information (observing necessary privacy standards) should be made publicly available immediately upon release in machine-readable formats based on open data standards, enabling data-informed decision making for all stakeholders.

Data standards empower international knowledge discovery and solution exploitation

Understanding the clinical characteristics of COVID-19 and responses to its treatment brought enormous value to clinicians when trial-based evidence was sparse. 1 , 12 The large-scale real-world evidence generation network formed within the framework of OHDSI (Observational Health Data Sciences and Informatics) 1 brought an innovative approach to coordinating data sources from different institutes, countries, and languages. It aligned a cohort of over 4.5 million cases and retrospectively described the unknown disease with strong representativeness across populations and regions (Europe, United States, South Korea, and China). OHDSI developed a comprehensive vocabulary system to incorporate the data standards used in different countries and areas and implemented them in data processing and analytics.
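
As a rough illustration of why a shared vocabulary makes records from different sites comparable, the sketch below maps site-specific codes onto common concept identifiers before counting. It is a minimal sketch under stated assumptions, not OHDSI's implementation: the vocabulary names, local codes, concept ids, and the `standardize` and `count_by_concept` helpers are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical lookup table: (source vocabulary, local code) -> shared concept id.
# The ids are placeholders for illustration only; they are not real
# OHDSI, SNOMED CT, LOINC, or ICD identifiers.
CONCEPT_MAP = {
    ("SITE_A_DX", "NCP"): 9000001,       # site A's local diagnosis label
    ("SITE_B_DX", "COVID19"): 9000001,   # site B's label for the same concept
    ("SITE_A_LAB", "PCR_POS"): 9000002,  # site A's positive PCR result code
    ("SITE_B_LAB", "RT-PCR+"): 9000002,  # site B's code for the same result
}


@dataclass(frozen=True)
class SourceRecord:
    site: str
    vocabulary: str
    code: str


def standardize(records):
    """Split records into (site, concept_id) pairs and records with no mapping."""
    mapped, unmapped = [], []
    for rec in records:
        concept_id = CONCEPT_MAP.get((rec.vocabulary, rec.code))
        if concept_id is None:
            unmapped.append(rec)  # keep for manual review instead of dropping
        else:
            mapped.append((rec.site, concept_id))
    return mapped, unmapped


def count_by_concept(mapped):
    """Count standardized records per shared concept, across all sites."""
    counts = {}
    for _site, concept_id in mapped:
        counts[concept_id] = counts.get(concept_id, 0) + 1
    return counts
```

Once both sites' diagnosis codes resolve to the same concept id, a single query can count cases across them, while unmapped codes are set aside for review rather than silently lost.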
The high-level standardization and implementation of multiple standards enabled the OHDSI network to bring insights into clinical characteristics, 13 treatment pathways, 14 and patient subgroup analysis. 15 The network also provided important evidence on potentially repurposable medications, demonstrating an important approach to screening existing therapies in the absence of clinical trials of a new regimen. 14 Last but not least, data standardization and data sharing significantly improved the recruitment efficiency of clinical trials for new treatments and effectively monitored potential side effects of various medicinal products and vaccines. 16

The sharing of data has been restricted in order to comply with related regulations, and the potential of data-driven knowledge discovery and transfer has been weakened accordingly. However, under high pressure, the scientific community has been robust in encouraging novel studies and data sharing without violating data privacy. It is important to point out that data standards and their implementation in different countries and languages have enabled multinational studies without raising concerns about data governance or leakage of original data. Within the coordinating mechanisms organized by OHDSI, 1 TriNetX, 12 ICODA, 17 and other open-science networks, insights can be extracted, with unprecedented scale and efficiency, from multiple independent databases around the world thanks to their common data models, vocabulary control, quality control, privacy protection mechanisms, and ethics standards.

Data standards enable data-informed decision making

Statistical analysis of the epidemiological trend required a standard nomenclature for the disease and high-quality data standardization in case reporting as well as data collection at both regional and global levels.
18 Inferring the size of the population of potential contacts from epidemiological data provided one of the key parameters for public health policy-making. It is difficult to assess the accuracy of data at the population level when the relevant data are distributed in silos and the data owners are unwilling to share them. Our experience, as illustrated by the Honghu Hybrid System (HHS), 5 was to use digital technologies to connect various, if not all, data sources, integrate and standardize the data, and generate a near real-time (daily) surveillance system in an area with a population close to one million. Errors in statistics during the emergency period of the pandemic were inevitable. A double-check mechanism, enabled by independent channels (digital vs. manual), effectively minimized mismatched information.

Moreover, to mitigate the huge burden of medical needs and the manpower shortage, many clinical decision-support systems (CDSS), mostly machine-learning based and data-driven, were developed and implemented at different checkpoints of the data flow 19 for syndromic surveillance, triaging, severity classification, and outcome prediction. Although successes were reported within individual development sites, these systems could hardly be transplanted to other sites. The major reasons for this challenge include inconsistency in data standards and standardization, lack of usability for laypersons, difficulty of deployment in resource-poor settings, and potential ethical pitfalls or legal barriers. 20 The systems with the highest success rate of migration were classifiers of chest CT images based on artificial intelligence (AI) technologies, 21 since the data in Picture Archiving and Communication Systems (PACS) around the world follow the Digital Imaging and Communications in Medicine (DICOM) standard.
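
The double-check between a digital and a manual reporting channel described earlier in this section can be pictured as a simple reconciliation of daily counts. The sketch below is an assumption-laden illustration, not the HHS implementation: the `reconcile` function, the date keys, and the `tolerance` parameter are hypothetical.

```python
def reconcile(digital, manual, tolerance=0):
    """Compare per-day case counts from two independent channels.

    Both arguments are dicts mapping a date string to a count. Returns a dict
    of the days that need review, mapped to (digital_count, manual_count).
    A day is flagged when either channel has no report for it or when the two
    counts differ by more than `tolerance`.
    """
    flagged = {}
    for day in sorted(set(digital) | set(manual)):
        d = digital.get(day)
        m = manual.get(day)
        if d is None or m is None or abs(d - m) > tolerance:
            flagged[day] = (d, m)
    return flagged


# Illustrative, made-up counts for three days.
digital_counts = {"2020-02-01": 12, "2020-02-02": 15}
manual_counts = {"2020-02-01": 12, "2020-02-02": 14, "2020-02-03": 9}
days_to_review = reconcile(digital_counts, manual_counts)
```

Days on which the channels agree pass through untouched; only disagreements and missing reports are escalated, which keeps the manual workload proportional to the number of mismatches.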
However, the power of AI and data-driven predictive science played little role in improving the general level of clinical care for COVID-19 patients, especially severe cases, as the data infrastructure of standards and standardization was not ready for such challenges.

Reflection and effort on improving the level of data standardization

As an old Chinese proverb says, it is never too late to mend the fence. There is an urgent need to reflect on the causes of the low effectiveness of data sharing, data mining, and data science applications during the COVID-19 pandemic. The most important factor, and the shortest plank of the bucket for the effort to contain the pandemic, is the lack of a widely implemented clinical data standard system and the varying levels of data standardization. This diminished the value of all the investment in hardware and software. In order to quickly form an international data sharing network to generate real-world evidence and understand the disease as well as the affected populations, 22 it is important to implement standards beyond the classification code (ICD). SNOMED CT (Systematized Nomenclature of Medicine – Clinical Terms), LOINC (Logical Observation Identifiers Names and Codes), and RxNorm are among the top recommended terminology systems. 1

In November 2020, the European Commission declared its commitment to establishing the European Health Data Space (EHDS), with the goal of facilitating access to and better utilization of European health data (eg, EHR, genomic, public health, and registry data). 23 Meanwhile, the European Commission announced a financial support program for member countries implementing SNOMED CT as their core clinical vocabulary standard to enhance interoperability and increase the value of the data.
24 This provided a good example for the Western Pacific countries and regions to learn from in building a data sharing platform for the future by clearly defining best practices for fair benefit sharing, transparent and accountable governance of public and private sector data, true commitment to public dialogue, and global cooperation.

Recommendations for a tested preparedness

Strengthen the leadership of WHO

Reflecting on the initial phase of the COVID-19 pandemic, the identification of the pathogenic microorganism and its nomenclature, the characterization of the clinical manifestations, and the definition of the disease (from novel coronavirus pneumonia to COVID-19) were the key steps for global coordination of research resources and implementation of public health countermeasures. 10 WHO played an essential role in coordinating expert resources, government support, and worldwide implementation, which laid the foundation for disease classification in healthcare IT systems, epidemiological statistics, and multi-center research programs. ICD has proven efficient and cost-effective, considering its implementation in multiple languages across countries in a short time. International collaboration, under the leadership of WHO, should be strengthened to be better prepared for future global public health emergencies. The upcoming ICD-11, 25 which has been significantly modified to meet increasing needs for classification with more granularity, a hierarchical terminology structure, coverage of clinical phenotypes, and incorporation of traditional medicine, will help improve the preparedness of data infrastructure in different countries.

Avoid potential bias and conflicts

Bias has been observed in the process of naming the disease.
The use of the name of Wuhan city, where the world first learned about the virus, by some politicians and experts raised widespread emotional conflict worldwide and caused an unnecessary waste of time and resources in that special period when every hour counted in battling the disease, including taking care of patients and conducting research to understand the disease. We recommend that such bias and conflicts be avoided, following the current naming methodology for COVID-19, to improve the implementation of the standards in all relevant countries and areas.

Equity in technology access and international collaboration

There is also a recognized unmet need to help low-to-middle income countries accomplish data standardization and apply healthcare IT technologies. A regional effort to control a disease with such high transmissibility will not succeed without the involvement of all countries and regions. Training, financial support for infrastructure, free implementation of mature systems, and manpower support in data standardization and analytics are essential, 5 especially for low-to-middle income countries and areas. 26

Conclusion

Healthcare IT, data sciences, and AI have failed public expectations during the COVID-19 pandemic due to the inadequate preparedness of IT infrastructure in most countries, if not all. The lack of data standards and the low-to-middle level of data standardization were among the major causes and the shortest plank in the bucket for the containment of the pandemic. With strong coordination by WHO, a global effort to increase interoperability among the healthcare IT systems of different countries will be a fundamental step in preparing for the next pandemic of unknown origin.

Contributors

Dr. Gong Mengchun contributed to the conceptualisation and writing – original draft of the manuscript. Mr. Jiao Yuanshi contributed to visualisation and writing – original draft of the manuscript. Dr.
Yang Gong and Dr. Liu Li contributed to writing – review & editing of this paper and provided valuable suggestions.

Declaration of interests

None.


                Author and article information

                Journal
                Lancet Reg Health West Pac
                The Lancet Regional Health: Western Pacific
                The Authors. Published by Elsevier Ltd.
                2666-6065
                11 August 2022
                : 100565
                Affiliations
                [a ]Nanfang Hospital, Southern Medical University, Guangzhou, China
                [b ]Institute of Health Management, Southern Medical University, Guangzhou, China
                [c ]Digital Health China Technologies, Beijing, China
                [d ]School of Biomedical Informatics, University of Texas Health Science Center at Houston, United States
                Author notes
                [* ]Corresponding author at: Nanfang Hospital, Southern Medical University, 1838 North Guangzhou Ave, Guangzhou, China.
                Article
                S2666-6065(22)00180-8 100565
                10.1016/j.lanwpc.2022.100565
                9366352
                35971388
                © 2022 The Authors

                Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.

                Categories
                Viewpoint

                digital health,data standard,data standardization,covid-19,it infrastructure
