      OrthologAL: a Shiny application for quality-aware humanization of non-human pre-clinical high-dimensional gene expression data

      brief-report

          Abstract

          Motivation

          Single-cell and spatial transcriptomics provide unprecedented insight into diseases. Pharmacotranscriptomic approaches are powerful tools that leverage gene expression data for drug repurposing and discovery. Multiple databases attempt to connect human cellular transcriptional responses to small molecules for use in transcriptome-based drug discovery efforts. However, preclinical research often requires in vivo experiments in non-human species, which makes utilizing such valuable resources difficult. To facilitate both human orthologous conversion of non-human transcriptomes and the application of pharmacotranscriptomic databases to pre-clinical research models, we introduce OrthologAL. OrthologAL interfaces with BioMart to access different gene sets from the Ensembl database, allowing for ortholog conversion without the need for user-generated code.
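A minimal sketch of what a BioMart ortholog lookup can look like when written by hand, which is the step the application is described as sparing the user. This is not OrthologAL's code. The dataset name `mmusculus_gene_ensembl` and the attribute names are assumptions based on Ensembl's public mouse gene dataset and should be checked against the current Ensembl release; the helper names are hypothetical. The sketch only builds the request and does not contact the server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Public Ensembl BioMart service endpoint.
BIOMART_URL = "https://www.ensembl.org/biomart/martservice"

# Assumed attribute names (mouse gene dataset, human homologue fields).
# Verify them against the Ensembl release you are querying.
MOUSE_TO_HUMAN_ATTRIBUTES = [
    "external_gene_name",
    "hsapiens_homolog_associated_gene_name",
    "hsapiens_homolog_orthology_type",
    "hsapiens_homolog_orthology_confidence",
]


def build_biomart_query(dataset, attributes):
    """Return a BioMart XML query string requesting `attributes` from `dataset`."""
    query = ET.Element(
        "Query",
        {
            "virtualSchemaName": "default",
            "formatter": "TSV",
            "header": "1",
            "uniqueRows": "1",
            "datasetConfigVersion": "0.6",
        },
    )
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


def build_request_url(dataset, attributes):
    """Return the full GET URL carrying the XML query as the `query` parameter."""
    xml_query = build_biomart_query(dataset, attributes)
    return BIOMART_URL + "?" + urlencode({"query": xml_query})


if __name__ == "__main__":
    print(build_request_url("mmusculus_gene_ensembl", MOUSE_TO_HUMAN_ATTRIBUTES))
```

Fetching the printed URL should return a tab-separated table with one row per mouse gene and human homologue pair, which can then serve as the mapping table for conversion.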

          Results

          Researchers can input their single-cell or other high-dimensional gene expression data from any species as a Seurat object, and OrthologAL will output a human ortholog-converted Seurat object for download and use. To demonstrate the utility of this application, we tested OrthologAL using single-cell, single-nuclei, and spatial transcriptomic data derived from common preclinical models, including patient-derived orthotopic xenografts of medulloblastoma, and mouse and rat models of spinal cord injury. OrthologAL efficiently converts the genes in these data types to their corresponding human orthologs while preserving the dimensional architecture of the original non-human expression data. OrthologAL will be broadly useful for the simple conversion of Seurat objects and for applying preclinical, high-dimensional transcriptomics data to functional human-derived small molecule predictions.
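The conversion idea can be illustrated on a toy gene-by-cell count table. The sketch below is an assumption-laden illustration, not OrthologAL's implementation: it uses plain Python dictionaries instead of a Seurat object, the function name `humanize_counts` is hypothetical, and the policy of keeping only one-to-one, high-confidence orthologs is one possible reading of "quality-aware". The orthology type string `ortholog_one2one` and the 0/1 confidence flag follow Ensembl's conventions; the `Toy*` gene names are invented for the example.

```python
def humanize_counts(counts, orthologs, require_high_confidence=True):
    """Rename source-species genes to human orthologs in a gene-by-cell table.

    counts:    dict mapping source gene symbol -> list of per-cell counts.
    orthologs: iterable of (source_gene, human_gene, orthology_type, confidence)
               tuples, where confidence is 1 (high) or 0 (low).
    Returns (humanized_counts, unmapped_genes). Cell order is never changed;
    only gene rows are renamed or dropped.
    """
    # Build the mapping, keeping only one-to-one orthologs that pass the filter.
    mapping = {}
    for source, human, orth_type, confidence in orthologs:
        if orth_type != "ortholog_one2one":
            continue
        if require_high_confidence and confidence != 1:
            continue
        mapping[source] = human

    humanized = {}
    unmapped = []
    for gene, values in counts.items():
        human = mapping.get(gene)
        if human is None:
            unmapped.append(gene)  # no accepted ortholog: row is dropped
        else:
            humanized[human] = list(values)  # copy so the input stays untouched
    return humanized, unmapped


if __name__ == "__main__":
    # Toy data: three cells, four mouse genes.
    counts = {
        "Gfap": [5, 0, 2],
        "Sox2": [1, 3, 0],
        "ToyGene1": [0, 1, 0],  # absent from the ortholog table
        "ToyGene2": [2, 2, 2],  # one-to-many, so excluded by this policy
    }
    orthologs = [
        ("Gfap", "GFAP", "ortholog_one2one", 1),
        ("Sox2", "SOX2", "ortholog_one2one", 1),
        ("ToyGene2", "TOYGENE2A", "ortholog_one2many", 1),
    ]
    humanized, unmapped = humanize_counts(counts, orthologs)
    print(humanized)
    print(unmapped)
```

Reporting the dropped genes alongside the converted table lets the user judge how much of the original signal survives the conversion before running downstream analyses.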

          Availability and implementation

          OrthologAL is available for download as an R package with functions to launch the Shiny GUI at https://github.com/AyadLab/OrthologAL or via Zenodo at https://doi.org/10.5281/zenodo.15225041. The medulloblastoma single-cell transcriptomics data were downloaded from the NCBI Gene Expression Omnibus with the identifier GSE129730. 10X Visium data of medulloblastoma PDX mouse models from Vo et al. were acquired by contacting the authors, and the raw data are available from ArrayExpress under the identifier E-MTAB-11720. The single-cell and single-nuclei transcriptomics data of rat and mouse spinal-cord injury were acquired from the Gene Expression Omnibus under the identifiers GSE213240 and GSE234774.

                Author and article information

                Contributors
                Role: Associate Editor
                Journal
                Bioinformatics
                Oxford University Press
                1367-4803
                1367-4811
                June 2025
                20 May 2025
                Volume: 41
                Issue: 6
                Article: btaf311
                Affiliations
                Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC 20007, United States
                The Miami Project to Cure Paralysis, Department of Neurological Surgery, University of Miami Miller School of Medicine, Miami, FL 33136, United States
                Department of Chemistry and Chemical Biology, Rutgers University, The State University of New Jersey, Piscataway, NJ 08854, United States
                Author notes
                Corresponding author. Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University, 3970 Reservoir Rd NW Washington, D.C. District of Columbia 20007, United States. E-mail: na853@georgetown.edu.
                Author information
                https://orcid.org/0000-0002-3335-8426
                https://orcid.org/0009-0002-1994-3095
                Article
                DOI: 10.1093/bioinformatics/btaf311
                PMCID: PMC12158155
                PMID: 40392208
                © The Author(s) 2025. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 24 January 2025
                : 22 April 2025
                : 05 May 2025
                : 15 May 2025
                : 11 June 2025
                Page count
                Pages: 4
                Funding
                Funded by: NINDS, DOI 10.13039/100000065;
                Award ID: RM1NS133003
                Award ID: R01NS118023
                Categories
                Applications Note
                Gene Expression

                Bioinformatics & Computational biology
