
      Laryngeal Pressure Estimation With a Recurrent Neural Network

      research-article


          Abstract

          Quantifying the physical parameters of voice production is essential for understanding the process of phonation and can aid in voice research and diagnosis. As an alternative to invasive measurements, they can be estimated by formulating an inverse problem using a numerical forward model. However, high-fidelity numerical models are often computationally too expensive for this. This paper presents a novel approach to train a long short-term memory network to estimate the subglottal pressure in the larynx at massively reduced computational cost using solely synthetic training data. We train the network on synthetic data from a numerical two-mass model and validate it on experimental data from 288 high-speed ex vivo video recordings of porcine vocal folds from a previous study. The training requires significantly fewer model evaluations compared with the previous optimization approach. On the test set, we maintain a comparable performance of 21.2% versus previous 17.7% mean absolute percentage error in estimating the subglottal pressure. The evaluation of one sample requires a vanishingly small amount of computation time. The presented approach is able to maintain estimation accuracy of the subglottal pressure at significantly reduced computational cost. The methodology is likely transferable to estimate other parameters and training with other numerical models. This improvement should allow the adoption of more sophisticated, high-fidelity numerical models of the larynx. The vast speedup is a critical step to enable a future clinical application and knowledge of parameters such as the subglottal pressure will aid in diagnosis and treatment selection.

          Abstract

          In clinical routine, the air pressure in the larynx is usually inaccessible. We trained a neural network on data from a numerical model to estimate this pressure from endoscopic videos at minimal computational cost.
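The abstract reports accuracy as a mean absolute percentage error (MAPE) on the subglottal pressure. The sketch below shows how such a figure is conventionally computed; it is only an illustration, not the authors' evaluation code. The function name and the pressure values are hypothetical and do not come from the article.

```python
def mean_absolute_percentage_error(true_values, estimates):
    """Return the MAPE in percent for paired true and estimated values."""
    if len(true_values) != len(estimates):
        raise ValueError("inputs must have the same length")
    if not true_values:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for true_value, estimate in zip(true_values, estimates):
        if true_value == 0:
            raise ValueError("MAPE is undefined when a true value is zero")
        # Relative error of one sample, taken as an absolute value
        total += abs((estimate - true_value) / true_value)
    return 100.0 * total / len(true_values)


# Hypothetical subglottal pressures in pascals, for illustration only
measured = [1000.0, 1500.0, 2000.0]
estimated = [1200.0, 1350.0, 2000.0]
mape = mean_absolute_percentage_error(measured, estimated)
print(f"MAPE: {mape:.1f}%")
```

Because each error is divided by the true value, small measured pressures weigh heavily in the average, which is worth keeping in mind when comparing the 21.2% and 17.7% figures quoted above.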


                Author and article information

                Contributors
                Journal
                Title: IEEE Journal of Translational Engineering in Health and Medicine
                Abbreviated title: IEEE J Transl Eng Health Med (JTEHM)
                Journal IDs: 0063400, IJTEBN
                Publisher: IEEE
                ISSN: 2168-2372
                Publication dates: 2019; 27 December 2018
                Volume: 7
                Article number: 2000111
                Affiliations
                [1] Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, 91054 Erlangen, Germany
                Author notes
                Article
                Article number: 2000111
                DOI: 10.1109/JTEHM.2018.2886021
                PMC ID: 6331197
                PMID: 30680252
                a8953d4f-0f9a-4ce7-8402-57a678d9fa8b
                This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/
                History
                09 July 2018
                24 October 2018
                30 November 2018
                14 January 2019
                Page count
                Figures: 6, Tables: 3, Equations: 76, References: 68, Pages: 11
                Funding
                Funded by: Deutsche Forschungsgemeinschaft, fundref 10.13039/501100001659;
                Award ID: 391215328 DO1247/10-1
                This work was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under grant 391215328 DO1247/10-1.
                Categories
                Article

                Keywords: high-speed video, inverse problem, recurrent neural networks, vocal fold dynamics, voice physiology
