title: Translation of Genomics into Precision Medicine with Artificial Intelligence: A Narrative Review on Applications, Challenges and Future Perspectives

In the field of genomics, the broad availability of genetic information offered by next-generation sequencing technologies and the rapid growth in biomedical publications have led to the advent of the big-data era. The integration of artificial intelligence (AI) approaches such as machine learning, deep learning, and natural language processing (NLP) to tackle the challenges of scalability and high dimensionality of data, and to transform big data into clinically actionable knowledge, is expanding and becoming the foundation of precision medicine. In this paper, we review the current status and future directions of AI application in genomics within the context of workflows to integrate genomic analysis for precision cancer care. The existing AI solutions and their limitations in cancer genetic testing and diagnostics, such as variant calling and interpretation, are critically analyzed. In addition, the present paper highlights the challenges to AI adoption in digital healthcare with regard to data requirements, algorithmic transparency, reproducibility, and real-world assessment, and discusses the importance of preparing patients and physicians for modern digitized healthcare. We believe that AI will remain the main driver of healthcare transformation toward precision medicine, yet the unprecedented challenges it poses should be addressed to ensure safety and a beneficial impact on healthcare.


PHARMACOGENOMICS AND PRECISION MEDICINE
 Pharmacogenomics is the study of how genes affect a person's response to drugs. This relatively new field combines pharmacology (the science of drugs) and genomics (the study of genes and their functions) to develop effective, safe medications and doses tailored to a person's genetic makeup.
 Precision medicine, also called personalized medicine, is a form of medicine that uses information about a person's own genes or proteins to prevent, diagnose, or treat disease. In cancer, precision medicine uses specific information about a person's tumor to help make a diagnosis, plan treatment, find out how well treatment is working, or make a prognosis. Examples of precision medicine include using targeted therapies to treat specific types of cancer cells, such as HER2-positive breast cancer cells, or using tumor marker testing to help diagnose cancer.

HOW NEXT-GENERATION SEQUENCING IS CHANGING THE LANDSCAPE OF CANCER GENOMICS
• Next-generation sequencing (NGS) is being applied broadly as a valuable method for gaining insights into the genomic profile of a tumor.
• Cancer panels are designed specifically to detect clinically relevant somatic mutations with high confidence. Germline mutations in cancer-predisposing genes such as BRCA1/2 are also detected to assess cancer risk.
• In 2017, the FDA approved several NGS-based panels related to oncology: Oncomine Dx Target Test, Praxis Extended RAS Panel, MSK-IMPACT, and FoundationOne CDx. The recent FDA approval of larotrectinib for tumors with NTRK gene fusions, a tumor-agnostic indication, also expands the clinical utilization of NGS.
• Liquid biopsy holds great promise due to its noninvasive nature.
• Cell-free DNA (cfDNA) released by dying tumor cells, cell-derived vesicles termed exosomes, and circulating tumor cells (CTCs), which shed from the tumor and enter the vasculature, are often used as sources of tumor DNA.
• The Cancer Genome Atlas (TCGA) project highlights how NGS screens can facilitate the discovery of novel oncogenic mechanisms and patient stratification.
• In a recent study, the regulatory role of F-box/WD repeat-containing protein 7 (Fbw7) in cancer cell oxidative metabolism was discovered using ML algorithms (Davis et al. 2018).
• Finally, NGS supports the discovery of novel biomarkers such as mutation signatures and tumor mutational burden (TMB).
• TMB has been shown to be an effective biomarker for predicting the response to immunotherapy, an innovative area of research that uses the body's own immune system to fight cancer.
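As a rough illustration of the TMB biomarker mentioned above, TMB is commonly reported as the number of somatic coding mutations per megabase of sequenced territory. The sketch below assumes that definition; the `Variant` record, the set of counted consequence types, and the function name are hypothetical choices made for illustration, and real assays differ in which variants they count and how they filter them.

```python
from dataclasses import dataclass

# Variant classes counted toward TMB in this sketch. This is an
# illustrative assumption: real panels differ in what they include.
COUNTED_CONSEQUENCES = {"missense", "nonsense", "frameshift", "inframe_indel"}


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record used only for this example."""
    gene: str
    consequence: str
    is_somatic: bool


def tumor_mutational_burden(variants, covered_bases):
    """Return counted somatic mutations per megabase of covered sequence."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    counted = sum(
        1
        for v in variants
        if v.is_somatic and v.consequence in COUNTED_CONSEQUENCES
    )
    return counted / (covered_bases / 1_000_000)
```

Germline calls and synonymous changes are excluded here only because of the assumptions stated above; a panel-specific pipeline would substitute its own filtering rules.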

CHALLENGES IN CANCER GENOMICS DATA INTERPRETATION
 First, combining data profiles at various levels results in high dimensionality with a large number of covariates. Data sparsity from high dimensionality, combined with high heterogeneity from diverse types of data, imposes a significant difficulty on integrative analyses.
 Second, better standards for data generation and reporting are needed to facilitate data integration and to reduce bias. Sample acquisition and preparation procedures need to be well regulated for data generation, and sequencing platforms and computational pipelines need to be carefully calibrated and validated.
 Last, but not least, well-designed studies with causal inference are needed to filter out biomarkers that have strong correlative effects but no real causative effects in tumorigenesis. On the genetic level, the pathogenic variants could be significantly enriched in cases compared to controls, and/or the variant could be co-inherited with disease status within affected families. On the informatic level, the pathogenic variants could be found at a location predicted to cause functional disruption (for example, a protein-binding region). On the experimental level, the pathogenic variants could significantly alter the levels, splicing, or normal biochemical function of the product of the affected genes. Finally, the cellular phenotype in patient-derived cells, model organisms, or engineered equivalents could be rescued by addition of the wild-type gene product or specific knockdown of the variant allele.
 The advancement of ML technologies is bound to impact the interpretation of genomic sequencing data, which has traditionally relied on manual curation by experts in the field. These curation efforts rely on protein structure, functional studies, and, more recently, in silico models such as SIFT, PANTHER-PSEP, PolyPhen2, and others that predict the functional impact of genetic alterations. Two key limitations of manually curating and interpreting the results from genomics data are scalability and reproducibility. These challenges continue to grow as more genomic data become available.
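The genetic-level criterion above, enrichment of a variant in cases compared to controls, can be sketched with a one-sided Fisher's exact test on a 2x2 table of carriers and non-carriers. This is only one possible test, chosen here as an assumption for illustration; the function name and argument layout are hypothetical, and a real association study would also handle population structure and multiple testing.

```python
from math import comb


def fisher_exact_enrichment(case_carriers, case_total,
                            control_carriers, control_total):
    """One-sided Fisher's exact p-value for carrier enrichment in cases.

    Returns the hypergeometric probability of seeing at least
    `case_carriers` carriers among the cases if carrier status were
    unrelated to disease status.
    """
    if not (0 <= case_carriers <= case_total):
        raise ValueError("case_carriers must be between 0 and case_total")
    if not (0 <= control_carriers <= control_total):
        raise ValueError("control_carriers must be between 0 and control_total")
    if case_total + control_total == 0:
        raise ValueError("at least one sample is required")

    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denominator = comb(total, case_total)
    upper = min(carriers, case_total)

    # Sum the upper tail: every table with k >= observed carriers in cases.
    tail = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denominator
```

A small p-value supports enrichment but is correlative evidence only, which is why the text pairs it with informatic and experimental evidence.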

PRECISION MEDICINE AND AI
 With precision medicine and the advancement of NGS, genomic profiles of patients have been increasingly used for risk prediction, disease diagnosis, and development of targeted therapies. Gene expression is an important part of patients' genomic profiles, and, interestingly, ML classification methods applied to gene expression data are not new. Historically, comprehensive gene expression analysis was done with microarrays; it is now done with RNA-seq. Expression data are analyzed to identify the significant genes in upregulated or downregulated pathways (Lyu and Haque 2018; Hwang et al. 2002), and are also used to train models that predict cancer subtypes and prognosis when outcome data or diagnosis information is available (Bartsch et al. 2016; Pepke and Ver Steeg 2017).
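To make the idea of ML classification on gene expression data concrete, the sketch below uses a nearest-centroid classifier. It is not the method of any study cited above; it is a deliberately simple stand-in, and the function names and the toy two-gene profiles in the usage note are hypothetical.

```python
from collections import defaultdict
from math import dist


def fit_centroids(profiles, labels):
    """Average the expression profiles of each subtype into one centroid."""
    if len(profiles) != len(labels) or not profiles:
        raise ValueError("profiles and labels must be non-empty and aligned")
    sums = {}
    counts = defaultdict(int)
    for profile, label in zip(profiles, labels):
        if label not in sums:
            sums[label] = [0.0] * len(profile)
        sums[label] = [s + x for s, x in zip(sums[label], profile)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict_subtype(centroids, profile):
    """Assign the subtype whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], profile))
```

For example, after `centroids = fit_centroids([[5.0, 1.0], [6.0, 0.5], [1.0, 5.5], [0.5, 6.0]], ["A", "A", "B", "B"])`, calling `predict_subtype(centroids, [5.2, 1.1])` assigns the new sample to subtype "A". Real expression matrices have thousands of genes and far fewer samples, so feature selection and cross-validation would be needed before any such model is trusted.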

CHALLENGES TO AI ADOPTION IN HEALTHCARE
 Lack of ground truth to validate the benefit
The evaluation of AI accuracy is critical to gauge how well the system performs in assisting experts, and to make AI less of a black box. In cancer genomics, variant classification, assessment of clinical relevance, literature validation, and summarization are traditionally done by human experts. To prove the usefulness of an AI application, it needs to be evaluated in comparison with human experts and not only with other AI solutions. However, this is rarely done due to the lack of publicly accessible knowledge bases for ground truth data.
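One way to compare an AI system against human experts, as called for above, is to measure chance-corrected agreement between the two sets of variant classifications. The sketch below uses Cohen's kappa; this metric is an assumption chosen for illustration and is not prescribed by the text, and the function name is hypothetical.

```python
from collections import Counter


def cohen_kappa(expert_labels, ai_labels):
    """Chance-corrected agreement between expert and AI classifications."""
    if len(expert_labels) != len(ai_labels) or not expert_labels:
        raise ValueError("label lists must be non-empty and the same length")
    n = len(expert_labels)

    # Fraction of variants on which the two raters agree.
    observed = sum(e == a for e, a in zip(expert_labels, ai_labels)) / n

    # Agreement expected by chance from each rater's label frequencies.
    expert_freq = Counter(expert_labels)
    ai_freq = Counter(ai_labels)
    expected = sum(expert_freq[c] * ai_freq[c] for c in expert_freq) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

A kappa near 1 indicates agreement well beyond chance, while a value near 0 indicates agreement no better than chance. Such a comparison still depends on having an expert-curated reference set, which is the gap the paragraph above identifies.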

 Transparency and reproducibility
AI is a hot field and its use has been claimed by many platforms and companies. However, detailed information on AI techniques and models is not clearly presented and there is considerable variability in methodologies from company to company.

 Patient/physician education
Digitization of healthcare has provided both caregivers and patients with access to big-data information and cognitive insights, transforming healthcare and clinical workflows (Mesko et al. 2017). The point of care has shifted from the clinic and physician to the patient. The old paradigm of a paternalistic physician-patient relationship has been transformed into an equal-level partnership with shared medical decision-making. Experience-based medicine has evolved into evidence-based and patient-centered approaches. Both physicians and patients need to be prepared for this revolutionary role of AI in healthcare.