      Is Open Access

      Text-Independent Speaker Verification Using 3D Convolutional Neural Networks

      Preprint

          Abstract

          This paper applies a 3D Convolutional Neural Network (3D-CNN) architecture to text-independent speaker verification. In the development phase, a CNN is trained to classify speakers at the utterance level. In the enrollment phase, the trained network is used to create a speaker model for each speaker directly from the extracted features. Finally, in the evaluation phase, the features extracted from the test utterance are compared to the stored speaker model to verify the claimed identity. One of the main challenges is creating the speaker models. Previously reported approaches build a speaker model by averaging the features extracted from the speaker's utterances, which is known as a d-vector system. We propose using 3D-CNNs for direct speaker model creation: in both the development and enrollment phases, an identical number of utterances per speaker is fed to the network to represent the speaker's utterances and create the speaker model. This captures speaker-related information and, at the same time, builds a system that is more robust to within-speaker variation. We demonstrate that the proposed method significantly outperforms the d-vector verification system.
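The d-vector baseline that the abstract compares against can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy 3-dimensional embeddings, the use of cosine similarity as the comparison, and the 0.7 threshold are all assumptions made for illustration. The abstract states only that utterance features are averaged into a speaker model and that test features are compared to that model.

```python
import math


def average_embeddings(embeddings):
    """Build a d-vector style speaker model: the element-wise mean
    of the per-utterance embeddings collected at enrollment."""
    if not embeddings:
        raise ValueError("need at least one enrollment utterance")
    dim = len(embeddings[0])
    count = len(embeddings)
    return [sum(e[i] for e in embeddings) / count for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors.
    (Assumed scoring function; the abstract does not name one.)"""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify(test_embedding, speaker_model, threshold=0.7):
    """Accept the claimed identity if the score reaches the threshold.
    The threshold value is hypothetical and would be tuned on held-out data."""
    return cosine_similarity(test_embedding, speaker_model) >= threshold


# Toy embeddings standing in for features extracted by a trained network.
enrollment = [[0.9, 0.1, 0.0], [1.0, 0.0, 0.1], [0.8, 0.2, 0.1]]
model = average_embeddings(enrollment)

same_speaker = [0.95, 0.05, 0.05]
other_speaker = [0.0, 0.1, 1.0]

print(verify(same_speaker, model))
print(verify(other_speaker, model))
```

In this baseline each utterance is embedded independently and the averaging happens afterwards. The approach described in the abstract differs in that a fixed number of utterances from the same speaker is fed to the 3D-CNN together, so the network itself produces the speaker model.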


                Author and article information

                Date: 2017-05-25
                arXiv ID: 1705.09422
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
                Submitted to: 5th IEEE Global Conference on Signal and Information Processing (2017)
                Category: cs.CV (Computer vision & Pattern recognition)
