      Is Open Access

      Optimized design and assessment of whole genome tiling arrays

      research-article


          Abstract

          Motivation

          Recent advances in microarray technologies have made it feasible to interrogate whole genomes with tiling arrays, and this technique is rapidly becoming one of the most important high-throughput functional genomics assays. For large mammalian genomes, analyzing oligonucleotide tiling array data is complicated by the presence of non-unique sequences on the array, which increase the overall noise in the data and may lead to false positive results due to cross-hybridization. The ability to create custom microarrays using maskless array synthesis has led us to consider ways to optimize array design characteristics to improve data quality and analysis. We have identified a number of design parameters to be optimized, including the uniqueness of the probe sequences within the whole genome, melting temperature and self-hybridization potential.

          Results

          We introduce the uniqueness score, U, a novel quality measure for oligonucleotide probes, and present a method to compute it quickly. We show that U is equivalent to the number of shortest unique substrings in the probe and describe an efficient greedy algorithm to design mammalian whole genome tiling arrays using probes that maximize U. Using the mouse genome, we demonstrate how several optimizations influence the tiling array design characteristics. With a sensible set of parameters, our designs cover 78% of the mouse genome, including many regions previously considered ‘untilable’ due to the presence of repetitive sequence. Finally, we compare our whole genome tiling array designs with commercially available designs.
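To make the idea of counting shortest unique substrings concrete, here is a minimal brute-force sketch on a toy sequence. It is not the authors' implementation, and several points are assumptions made for illustration: a "shortest unique substring" is read as a substring that occurs exactly once in the sequence and loses that property if either its first or its last character is removed; only one strand is considered; and the function names (`count_occurrences`, `minimal_unique_substrings`, `uniqueness_score`) are hypothetical. The fast method mentioned in the abstract is not reproduced here; this quadratic scan is only practical for very short inputs.

```python
def count_occurrences(text, pattern):
    """Count (possibly overlapping) occurrences of pattern in text."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def minimal_unique_substrings(genome):
    """Return half-open (start, end) intervals of minimal unique substrings.

    Assumed definition: genome[start:end] occurs exactly once, and neither
    genome[start:end-1] nor genome[start+1:end] occurs exactly once.
    """
    n = len(genome)
    intervals = []
    for i in range(n):
        for j in range(i + 1, n + 1):
            if count_occurrences(genome, genome[i:j]) == 1:
                # j is the first end that makes the substring unique, so
                # dropping the last character already loses uniqueness.
                # It remains to check dropping the first character.
                is_single_char = (j - i == 1)
                if is_single_char or count_occurrences(genome, genome[i + 1:j]) > 1:
                    intervals.append((i, j))
                break  # longer substrings from i cannot be minimal
    return intervals


def uniqueness_score(intervals, probe_start, probe_length):
    """Count minimal unique substrings lying entirely inside the probe."""
    probe_end = probe_start + probe_length
    return sum(
        1
        for start, end in intervals
        if start >= probe_start and end <= probe_end
    )


if __name__ == "__main__":
    toy_genome = "ACACGT"  # toy input, not real data
    probe_length = 3
    intervals = minimal_unique_substrings(toy_genome)
    for start in range(len(toy_genome) - probe_length + 1):
        probe = toy_genome[start:start + probe_length]
        print(start, probe, uniqueness_score(intervals, start, probe_length))
```

Under these assumptions, a probe that sits in a repeated region contains few or no such substrings and scores low, while a probe spanning distinctive sequence scores higher, which is the intuition behind selecting probes that maximize U.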

          Availability

          Source code is available under an open source license from http://www.ebi.ac.uk/~graef/arraydesign/


                Author and article information

                Journal
                9808944
                20940
                Bioinformatics (Oxford, England)
                1367-4803
                1367-4811
                4 April 2018
                01 July 2007
                10 April 2018
                Volume: 23
                Issue: 13
                Pages: i195-i204
                Affiliations
                [1] EMBL–European Bioinformatics Institute, Hinxton, Cambridge, UK
                [2] Nijmegen Center for Molecular Life Sciences, Radboud University Nijmegen, The Netherlands
                [3] Center for Bioinformatics, University of Hamburg, Germany
                [4] Nijmegen Center for Molecular Life Sciences, Radboud University Nijmegen Medical Center, The Netherlands
                Author notes
                [*] To whom correspondence should be addressed: flicek@ebi.ac.uk
                Article
                EMS76926
                10.1093/bioinformatics/btm200
                5892713
                17646297
                96720ddc-3ea8-4758-8166-4f746d79fa8c

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                Categories
                Article

                Bioinformatics & Computational biology