      Is Open Access

      SPRINT: A new parallel framework for R

      product-review


          Abstract

          Background

          Microarray analysis allows the simultaneous measurement of thousands to millions of genes or sequences across tens to thousands of different samples. The analysis of the resulting data tests the limits of existing bioinformatics computing infrastructure. A solution to this issue is to use High Performance Computing (HPC) systems, which contain many processors and more memory than desktop computer systems. Many biostatisticians use R to process the data gleaned from microarray analysis and there is even a dedicated group of packages, Bioconductor, for this purpose. However, to exploit HPC systems, R must be able to utilise the multiple processors available on these systems. There are existing modules that enable R to use multiple processors, but these are either difficult to use for the HPC novice or cannot be used to solve certain classes of problems. A method of exploiting HPC systems, using R, but without recourse to mastering parallel programming paradigms is therefore necessary to analyse genomic data to its fullest.

          Results

          We have designed and built a prototype framework that allows parallelised functions to be added to R, making HPC systems easy to exploit. The Simple Parallel R INTerface (SPRINT) is a wrapper around such parallelised functions. Using them requires very little modification to existing sequential R scripts and no expertise in parallel computing. As an example, we created a function that computes a pairwise correlation matrix. Executed with SPRINT on an HPC resource of eight processors, this computation completes more than three times faster than R on a single processor.
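The pairwise correlation example can be illustrated with a minimal sketch. It is written in Python rather than R and is not SPRINT's API or implementation; the function names (`standardise`, `correlation_block`, `parallel_correlation`) and the row-block decomposition are assumptions chosen for illustration. Each worker correlates one block of rows against all rows, and the blocks are concatenated into the full matrix.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt


def standardise(row):
    """Centre a row and scale it to unit length, so a dot product gives Pearson's r."""
    mean = sum(row) / len(row)
    centred = [x - mean for x in row]
    norm = sqrt(sum(x * x for x in centred))
    if norm == 0.0:
        raise ValueError("constant row has no defined correlation")
    return [x / norm for x in centred]


def correlation_block(rows, start, stop):
    """Correlate rows[start:stop] against every row; returns a list of matrix rows."""
    return [
        [sum(a * b for a, b in zip(rows[i], other)) for other in rows]
        for i in range(start, stop)
    ]


def parallel_correlation(data, workers=4):
    """Pairwise Pearson correlation matrix, computed in row blocks by a worker pool."""
    rows = [standardise(r) for r in data]
    n = len(rows)
    size = max(1, -(-n // workers))  # ceiling division: rows per block
    bounds = [(s, min(s + size, n)) for s in range(0, n, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves block order, so rows come back in matrix order.
        blocks = pool.map(lambda b: correlation_block(rows, *b), bounds)
        return [row for block in blocks for row in block]
```

A thread pool is used here only so the sketch runs anywhere without extra setup. In CPython, threads do not speed up pure-Python arithmetic, so a real speed-up would need separate processes or a message-passing layer on the HPC system. The calling code stays sequential in style: `matrix = parallel_correlation(samples, workers=8)`.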

          Conclusion

          SPRINT allows the biostatistician to concentrate on the research problem rather than the computation, while still exploiting HPC systems. It is easy to use and will become more useful as further functions are added to the framework.


                Author and article information

                Journal
                BMC Bioinformatics
                Publisher: BioMed Central
                ISSN: 1471-2105
                Published: 29 December 2008
                Volume: 9
                Article: 558
                Affiliations
                [1] EPCC, The University of Edinburgh, James Clerk Maxwell Building, Mayfield Road, Edinburgh, EH9 3JZ, UK
                [2] Division of Pathway Medicine (DPM), The University of Edinburgh Medical School, Chancellor's Building, 49 Little France Crescent, Edinburgh, EH16 4SB, UK
                Article
                Publisher ID: 1471-2105-9-558
                DOI: 10.1186/1471-2105-9-558
                PMC ID: 2628907
                PubMed ID: 19114001
                da784940-431a-4493-905c-abe235dacdbd
                Copyright © 2008 Hill et al; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 24 September 2008
                Accepted: 29 December 2008
                Categories
                Software

                Bioinformatics & Computational biology
