      GEDI: Gammachirp Envelope Distortion Index for Predicting Intelligibility of Enhanced Speech

      Preprint


          Abstract

          In this study, we propose the gammachirp envelope distortion index (GEDI), a measure based on the signal-to-distortion ratio in the auditory envelope domain (SDRenv), to predict the intelligibility of speech enhanced by nonlinear algorithms. GEDI quantifies the distortion between the enhanced and clean speech representations in the temporal envelope domain, where the envelopes are extracted by a gammachirp auditory filterbank followed by a modulation filterbank. We also extend GEDI with multi-resolution analysis (mr-GEDI) to predict speech intelligibility under non-stationary noise conditions. We evaluated GEDI on speech enhanced by classic spectral subtraction and by a state-of-the-art Wiener filtering method. The predictions were compared with human results at various signal-to-noise ratios with additive pink and babble noise. The results showed that mr-GEDI predicted the intelligibility curves more accurately than the short-time objective intelligibility (STOI) measure and the hearing aid speech perception index (HASPI).
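
The idea of a signal-to-distortion ratio computed on temporal envelopes can be illustrated with a toy sketch. This is not the GEDI implementation: it assumes a single broadband channel instead of a gammachirp filterbank, skips the modulation filterbank, and uses a crude rectify-and-smooth envelope. The function names (`envelope`, `sdr_env_db`, `am_tone`, `distort`), the 10 ms smoothing window, and the test signals are all hypothetical choices made for illustration only.

```python
import math

FS = 16000  # sampling rate in Hz (assumed for this sketch)


def envelope(signal, win):
    """Crude temporal envelope: full-wave rectification, then a causal moving average."""
    rectified = [abs(x) for x in signal]
    out = []
    acc = 0.0
    for i, x in enumerate(rectified):
        acc += x
        if i >= win:
            acc -= rectified[i - win]      # drop the sample that left the window
        out.append(acc / min(i + 1, win))  # shorter window during the first samples
    return out


def sdr_env_db(clean, enhanced, win=160):
    """Envelope-domain signal-to-distortion ratio in dB (single-band toy version)."""
    env_c = envelope(clean, win)
    env_e = envelope(enhanced, win)
    signal_power = sum(c * c for c in env_c)
    distortion_power = sum((e - c) ** 2 for c, e in zip(env_c, env_e))
    if distortion_power == 0.0:
        return math.inf  # identical envelopes: no distortion at all
    return 10.0 * math.log10(signal_power / distortion_power)


def am_tone(duration=1.0, carrier=1000.0, mod=4.0, depth=0.8):
    """Amplitude-modulated tone used here as a stand-in for clean speech."""
    n = int(duration * FS)
    return [
        (1.0 + depth * math.sin(2 * math.pi * mod * i / FS))
        * math.sin(2 * math.pi * carrier * i / FS)
        for i in range(n)
    ]


def distort(signal, depth, rate=3.0):
    """Hypothetical enhancement artifact: a slow spurious gain fluctuation."""
    return [
        x * (1.0 + depth * math.sin(2 * math.pi * rate * i / FS))
        for i, x in enumerate(signal)
    ]


clean = am_tone()
mild = sdr_env_db(clean, distort(clean, 0.1))
heavy = sdr_env_db(clean, distort(clean, 0.5))
print(f"SDRenv, mild distortion:  {mild:.2f} dB")
print(f"SDRenv, heavy distortion: {heavy:.2f} dB")
```

In this toy setting a stronger envelope distortion yields a lower ratio, which is the qualitative behavior an envelope-based index relies on. Mapping such a ratio to predicted intelligibility requires the per-channel auditory and modulation analysis described in the abstract, which this sketch does not attempt.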

                Author and article information

                Date: 03 April 2019
                arXiv: 1904.02096
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
                Comments: Preprint, 29 pages, 5 tables, 8 figures
                Categories: cs.SD, eess.AS
                Subjects: Electrical engineering, Graphics & Multimedia design
