Enspara: Modeling molecular ensembles with scalable data structures and parallel computing

Abstract

Markov state models (MSMs) are quantitative models of protein dynamics that are useful for uncovering the structural fluctuations that proteins undergo, as well as the mechanisms of these conformational changes. Given the enormity of conformational space, there has been ongoing interest in identifying a small number of states that capture the essential features of a protein. Generally, this is achieved by making assumptions about the properties of relevant features, for example, that the most important features are those that change slowly. An alternative strategy is to keep as many degrees of freedom as possible and subsequently learn from the model which of the features are most important. In these larger models, however, traditional approaches quickly become computationally intractable. In this paper, we present enspara, a library for working with MSMs that provides several novel algorithms and specialized data structures that dramatically improve the scalability of traditional MSM methods. This includes ragged arrays for minimizing memory requirements, Message Passing Interface (MPI)-parallelized implementations of compute-intensive operations, and a flexible framework for model construction and analysis.
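To make the ragged-array idea in the abstract concrete, the following is a minimal sketch in Python with NumPy. It is not enspara's API: the names `RaggedSketch` and `count_transitions` are hypothetical and introduced only for illustration. The sketch assumes that each trajectory is a sequence of integer state assignments and that trajectories differ in length, so storing them in one flat buffer with row offsets avoids the padding a rectangular array would need. The counting function shows one simple way such a structure could feed an MSM transition count matrix at a given lag time.

```python
import numpy as np


class RaggedSketch:
    """Hypothetical ragged array: one flat buffer plus row offsets.

    Illustrative only; this is not the data structure shipped in enspara.
    """

    def __init__(self, rows):
        lengths = np.array([len(r) for r in rows], dtype=np.int64)
        # offsets[i] is where row i starts; offsets[-1] is the total length.
        self.offsets = np.concatenate(([0], np.cumsum(lengths)))
        if len(rows):
            self.data = np.concatenate([np.asarray(r) for r in rows])
        else:
            self.data = np.empty(0, dtype=np.int64)

    def __len__(self):
        return len(self.offsets) - 1

    def __getitem__(self, i):
        n = len(self)
        if i < 0:
            i += n  # support negative indices such as ragged[-1]
        if not 0 <= i < n:
            raise IndexError("row index out of range")
        # A slice of the flat buffer is a view, so no data is copied.
        return self.data[self.offsets[i]:self.offsets[i + 1]]


def count_transitions(ragged, n_states, lag=1):
    """Count i -> j transitions at the given lag (assumed >= 1).

    Transitions are counted within each row only, never across the
    boundary between two trajectories.
    """
    counts = np.zeros((n_states, n_states), dtype=np.int64)
    for k in range(len(ragged)):
        row = ragged[k]
        # np.add.at accumulates correctly when index pairs repeat.
        np.add.at(counts, (row[:-lag], row[lag:]), 1)
    return counts


# Three trajectories of different lengths (state labels are made up).
trajs = [[0, 1, 1], [2], [1, 0, 0, 2, 2]]
ragged = RaggedSketch(trajs)
counts = count_transitions(ragged, n_states=3)
print(ragged.data.size, counts.sum())
```

In this toy case the flat buffer holds 9 elements, whereas a padded 3 x 5 rectangular array would hold 15. The gap grows with the spread in trajectory lengths, which is the memory saving the abstract refers to.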

Author and article information

Journal

Title: The Journal of Chemical Physics

Abbreviated Title: J. Chem. Phys.

Publisher: AIP Publishing

ISSN (Print): 0021-9606

ISSN (Electronic): 1089-7690

Publication date (Created): January 28, 2019

Publication date (Print): January 28, 2019

Volume: 150

Issue: 4

Page: 044108

Affiliations

[1] Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, 660 South Euclid Avenue, St. Louis, Missouri 63110, USA

Article

DOI: 10.1063/1.5063794

PMC ID: 6910589

PubMed ID: 30709308

SO-VID: eeb7ef40-f70f-4f54-b00b-153045e16508
