Text-Independent Speaker Verification Using 3D Convolutional Neural Networks

Preprint

Author(s): Amirsina Torfi, Nasser M. Nasrabadi, Jeremy Dawson

Publication date Created: 2017-05-25


Abstract

This paper applies a 3D Convolutional Neural Network (3D-CNN) architecture to text-independent speaker verification. In the development phase, a CNN is trained to classify speakers at the utterance level. In the enrollment phase, the trained network is used to directly create a model for each speaker from the extracted features. Finally, in the evaluation phase, the features extracted from the test utterance are compared with the stored speaker model to verify the claimed identity. One of the main challenges is the creation of the speaker models. Previously reported approaches create speaker models by averaging the features extracted from a speaker's utterances, an approach known as the d-vector system. We propose using 3D-CNNs for direct speaker model creation: in both the development and enrollment phases, an identical number of utterances per speaker is fed to the network to represent the speaker's utterances and create the speaker model. The network thereby captures speaker-related information and, at the same time, yields a system that is more robust to within-speaker variation. We demonstrate that the proposed method significantly outperforms the d-vector verification system.
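
The d-vector baseline named in the abstract (averaging per-utterance features into a speaker model, then comparing a test utterance against it) can be illustrated with a minimal sketch. This is not the authors' code. The abstract does not say how features are compared, so cosine similarity, the threshold value, the toy three-dimensional embeddings, and the function names `average_embeddings`, `cosine_similarity`, and `verify` are all assumptions made for illustration.

```python
import math


def average_embeddings(embeddings):
    """Build a d-vector style speaker model: the element-wise mean
    of the per-utterance embeddings collected at enrollment."""
    if not embeddings:
        raise ValueError("need at least one enrollment embedding")
    dim = len(embeddings[0])
    if any(len(e) != dim for e in embeddings):
        raise ValueError("all embeddings must have the same length")
    n = len(embeddings)
    return [sum(e[i] for e in embeddings) / n for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors (assumed scoring function;
    the abstract only says the features are 'compared')."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify(test_embedding, speaker_model, threshold=0.7):
    """Accept the claimed identity if the score reaches the threshold.
    The threshold 0.7 is a hypothetical value, not taken from the paper."""
    return cosine_similarity(test_embedding, speaker_model) >= threshold


# Toy enrollment embeddings for one speaker (made-up numbers).
enrollment = [[1.0, 0.0, 0.2], [0.8, 0.1, 0.0], [0.9, -0.1, 0.1]]
model = average_embeddings(enrollment)

accepted = verify([0.95, 0.05, 0.1], model)   # close to the speaker model
rejected = verify([0.0, 1.0, 0.0], model)     # nearly orthogonal to it
```

In the method the abstract proposes, the averaging step would be replaced by a speaker model produced directly by the 3D-CNN from a fixed number of utterances; only the final comparison step would resemble this sketch.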

Author and article information

arXiv ID: 1705.09422

SO-VID: 468d5131-c3c8-4021-b619-1c916f141415

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Comments: Submitted to the 5th IEEE Global Conference on Signal and Information Processing (2017)

Categories: cs.CV

ScienceOpen disciplines: Computer vision & Pattern recognition
