
      Chromosome‐level genome assembly of Prunella vulgaris L. provides insights into pentacyclic triterpenoid biosynthesis


          SUMMARY

          Prunella vulgaris is one of the best-selling and most widely used medicinal herbs. It is recorded in the Chinese Pharmacopoeia as a leading medicine for cleansing and protecting the liver and has been a main constituent of many herbal tea formulas in China for centuries. It is also a traditional folk medicine in Europe and in other Asian countries. Pentacyclic triterpenoids are a major class of bioactive compounds produced in P. vulgaris; however, their biosynthetic mechanism remains to be elucidated. Here, we report a chromosome‐level reference genome of P. vulgaris assembled using a combination of Illumina, ONT, and Hi‐C technologies. The assembly is 671.95 Mb in size, with a scaffold N50 of 49.10 Mb and a BUSCO completeness of 98.45%. About 98.31% of the sequence was anchored to 14 pseudochromosomes. Comparative genome analysis revealed a recent whole‐genome duplication (WGD) in P. vulgaris. Genome‐wide analysis identified 35 932 protein‐coding genes (PCGs), of which 59 encode enzymes involved in 2,3‐oxidosqualene biosynthesis. In addition, 10 PvOSC, 358 PvCYP, and 177 PvUGT genes were identified, of which 5 PvOSCs, 25 PvCYPs, and 9 PvUGTs were predicted to be involved in the biosynthesis of pentacyclic triterpenoids. Biochemical activity assays of recombinant PvOSC2, PvOSC4, and PvOSC6 proteins showed that they are a mixed amyrin synthase (MAS), a lupeol synthase (LUS), and a β‐amyrin synthase (BAS), respectively. The results provide a solid foundation for further elucidating the biosynthetic mechanism of pentacyclic triterpenoids in P. vulgaris.
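
The summary reports contiguity as a scaffold N50. For readers unfamiliar with the metric, the following is a minimal sketch of how N50 is commonly defined: the length of the scaffold at which the running total of lengths, taken from longest to shortest, first reaches half of the assembly size. The function name `n50` and the toy lengths are assumptions made for illustration; they are not the authors' pipeline and not the actual P. vulgaris scaffold lengths.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and accumulate them; the N50 is
    the length at which the running total first reaches at least half of
    the summed length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (in Mb), chosen only to illustrate the metric.
toy_scaffolds = [50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 40: 50 + 40 = 90 reaches half of 150
```

A larger N50 means that more of the assembly sits in long scaffolds, which is why the value is usually quoted alongside the total assembly size.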

          Significance Statement

          The first chromosome‐level reference genome of Prunella vulgaris, and of the genus Prunella, is reported. Fifty‐nine 2,3‐oxidosqualene biosynthesis‐related genes and 39 pentacyclic triterpenoid biosynthesis‐related PvOSC, PvCYP, and PvUGT genes were systematically analyzed. PvOSC2, PvOSC4, and PvOSC6 were experimentally verified to be a mixed amyrin synthase, a lupeol synthase, and a β‐amyrin synthase, respectively. The results provide a solid foundation for further elucidating the biosynthetic mechanism of bioactive compounds in the widely used medicinal plant P. vulgaris.


                Author and article information

                Journal: The Plant Journal
                Publisher: Wiley
                ISSN: 0960-7412, 1365-313X
                Dates: January 16 2024; May 2024
                Volume 118, Issue 3, Pages 731-752
                Affiliations
                [1] Key Lab of Chinese Medicine Resources Conservation, State Administration of Traditional Chinese Medicine of the People's Republic of China, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China
                [2] Engineering Research Center of Chinese Medicine Resource, Ministry of Education, Beijing 100193, China
                Article
                DOI: 10.1111/tpj.16629
                © 2024

                http://onlinelibrary.wiley.com/termsAndConditions#vor
