
      Gammatone Cepstral Coefficients: Biologically Inspired Features for Non-Speech Audio Classification


          Most cited references (15)


          Derivation of auditory filter shapes from notched-noise data.

          A well-established method for estimating the shape of the auditory filter is based on the measurement of the threshold of a sinusoidal signal in a notched-noise masker, as a function of notch width. To measure the asymmetry of the filter, the notch has to be placed both symmetrically and asymmetrically about the signal frequency. In previous work several simplifying assumptions and approximations were made in deriving auditory filter shapes from the data. In this paper we describe modifications to the fitting procedure which allow more accurate derivations. These include: 1) taking into account changes in filter bandwidth with centre frequency when allowing for the effects of off-frequency listening; 2) correcting for the non-flat frequency response of the earphone; 3) correcting for the transmission characteristics of the outer and middle ear; 4) limiting the amount by which the centre frequency of the filter can shift in order to maximise the signal-to-masker ratio. In many cases, these modifications result in only small changes to the derived filter shape. However, at very high and very low centre frequencies and for hearing-impaired subjects the differences can be substantial. It is also shown that filter shapes derived from data where the notch is always placed symmetrically about the signal frequency can be seriously in error when the underlying filter is markedly asymmetric. New formulae are suggested describing the variation of the auditory filter with frequency and level. The implications of the results for the calculation of excitation patterns are discussed and a modified procedure is proposed. The appendix lists FORTRAN computer programs for deriving auditory filter shapes from notched-noise data and for calculating excitation patterns. The first program can readily be modified so as to derive auditory filter shapes from data obtained with other types of maskers, such as rippled noise.
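The abstract above mentions new formulae for how the auditory filter varies with frequency, but it does not reproduce them. As a rough illustration, the equivalent rectangular bandwidth (ERB) expression commonly attributed to this paper can be sketched as below. The constants 24.7 and 4.37 and the function name `erb_hz` do not appear in the text above and are assumptions here; the level dependence mentioned in the abstract is not modelled.

```python
def erb_hz(centre_frequency_hz: float) -> float:
    """Approximate ERB (in Hz) of the auditory filter at a given centre frequency.

    Assumed form: ERB = 24.7 * (4.37 * F + 1), with F in kHz.
    The constants are not given in the abstract; they are the commonly
    quoted values and should be checked against the paper itself.
    """
    if centre_frequency_hz < 0:
        raise ValueError("centre frequency must be non-negative")
    f_khz = centre_frequency_hz / 1000.0
    return 24.7 * (4.37 * f_khz + 1.0)


if __name__ == "__main__":
    # Bandwidth grows roughly linearly with centre frequency.
    for f in (100.0, 1000.0, 4000.0):
        print(f, erb_hz(f))
```

Gammatone filterbanks of the kind used for gammatone cepstral coefficients typically take their per-channel bandwidths from an expression of this shape.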

            A cochlear frequency-position function for several species--29 years later.

            Accurate cochlear frequency-position functions based on physiological data would facilitate the interpretation of physiological and psychoacoustic data within and across species. Such functions might aid in developing cochlear models, and cochlear coordinates could provide potentially useful spectral transforms of speech and other acoustic signals. In 1961, an almost-exponential function was developed (Greenwood, 1961b, 1974) by integrating an exponential function fitted to a subset of frequency resolution-integration estimates (critical bandwidths). The resulting frequency-position function was found to fit cochlear observations on human cadaver ears quite well and, with changes of constants, those on elephant, cow, guinea pig, rat, mouse, and chicken (Békésy, 1960), as well as in vivo (behavioral-anatomical) data on cats (Schuknecht, 1953). Since 1961, new mechanical and other physiological data have appeared on the human, cat, guinea pig, chinchilla, monkey, and gerbil. It is shown here that the newer extended data on human cadaver ears and from living animal preparations are quite well fit by the same basic function. The function essentially requires only empirical adjustment of a single parameter to set an upper frequency limit, while a "slope" parameter can be left constant if cochlear partition length is normalized to 1 or scaled if distance is specified in physical units. Constancy of slope and form in dead and living ears and across species increases the probability that the function fitting human cadaver data may apply as well to the living human ear. This prospect increases the function's value in plotting auditory data and in modeling concerned with speech and other bioacoustic signals, since it fits the available physiological data well and, consequently (if those data are correct), remains independent of, and an appropriate means to examine, psychoacoustic data and assumptions.
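The almost-exponential frequency-position function described in this abstract can be sketched as follows. The abstract gives no constants, so the values used here (A = 165.4, a = 2.1, k = 0.88, with position x normalized to 0 at the apex and 1 at the base) are the commonly quoted human values and are assumptions, as are the function names. In this form, A is the parameter that sets the upper frequency limit and a is the "slope" parameter that stays constant when partition length is normalized to 1.

```python
import math


def greenwood_frequency(x: float, A: float = 165.4, a: float = 2.1, k: float = 0.88) -> float:
    """Characteristic frequency (Hz) at normalized cochlear position x.

    Assumed form: F = A * (10 ** (a * x) - k), with x = 0 at the apex
    and x = 1 at the base. Default constants are the commonly quoted
    human values, not taken from the abstract.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    return A * (10.0 ** (a * x) - k)


def greenwood_position(f_hz: float, A: float = 165.4, a: float = 2.1, k: float = 0.88) -> float:
    """Inverse mapping: normalized position for a characteristic frequency in Hz."""
    if f_hz / A + k <= 0:
        raise ValueError("frequency is outside the range of the function")
    return math.log10(f_hz / A + k) / a


if __name__ == "__main__":
    # Print the frequency map at a few evenly spaced positions along the partition.
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(x, round(greenwood_frequency(x), 1))
```

With these assumed constants the mapping runs from roughly 20 Hz at the apex to roughly 20.7 kHz at the base.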

              Environmental Sound Recognition With Time–Frequency Audio Features


                Author and article information

                Journal: IEEE Transactions on Multimedia (IEEE Trans. Multimedia)
                Publisher: Institute of Electrical and Electronics Engineers (IEEE)
                ISSN: 1520-9210, 1941-0077
                Published: December 2012
                Volume: 14
                Issue: 6
                Pages: 1684-1689
                Type: Article
                DOI: 10.1109/TMM.2012.2199972
                © 2012