Conference paper

McLaren, M.; Lei, Y.; Ferrer, L.; The Institute of Electrical and Electronics Engineers Signal Processing Society "Advances in deep neural network approaches to speaker recognition" (2015) 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015. 2015-August:4814-4818

Abstract:

The recent application of deep neural networks (DNN) to speaker identification (SID) has resulted in significant improvements over current state-of-the-art on telephone speech. In this work, we report a similar achievement in DNN-based SID performance on microphone speech. We consider two approaches to DNN-based SID: one that uses the DNN to extract features, and another that uses the DNN during feature modeling. Modeling is conducted using the DNN/i-vector framework, in which the traditional universal background model is replaced with a DNN. The recently proposed use of bottleneck features extracted from a DNN is also evaluated. Systems are first compared with a conventional universal background model (UBM) Gaussian mixture model (GMM) i-vector system on the clean conditions of the NIST 2012 speaker recognition evaluation corpus, where a lack of robustness to microphone speech is found. Several methods of DNN feature processing are then applied to bring significantly greater robustness to microphone speech. To direct future research, the DNN-based systems are also evaluated in the context of audio degradations including noise and reverberation. © 2015 IEEE.
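The DNN/i-vector idea described in the abstract can be illustrated with a minimal sketch. It assumes that the only change from a conventional UBM/GMM system is where the per-frame class posteriors come from: GMM component posteriors in the conventional case, DNN output posteriors in the DNN case. The function names, the array shapes, and the random linear layer standing in for a trained DNN are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def baum_welch_stats(posteriors, features):
    """Zeroth- and first-order statistics used for i-vector extraction.

    posteriors: (T, K) per-frame class posteriors, each row sums to 1.
                In a UBM/GMM system these are Gaussian component
                posteriors; in a DNN/i-vector system they are the DNN's
                output-class posteriors (assumption for this sketch).
    features:   (T, D) acoustic features for the same T frames.
    """
    zeroth = posteriors.sum(axis=0)      # (K,)  soft frame counts per class
    first = posteriors.T @ features      # (K, D) posterior-weighted feature sums
    return zeroth, first

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Toy data: T frames, D-dimensional features, K classes (all hypothetical).
rng = np.random.default_rng(0)
T, D, K = 200, 20, 8
features = rng.normal(size=(T, D))

# Stand-in for a trained DNN: a random linear layer followed by softmax.
weights = rng.normal(size=(D, K))
dnn_posteriors = softmax(features @ weights)

zeroth, first = baum_welch_stats(dnn_posteriors, features)
```

Because the statistics depend only on the posterior matrix, the downstream i-vector extractor can in principle stay the same when the source of the posteriors changes. The features passed to the DNN and those accumulated in the statistics need not be identical in a real system; the sketch reuses one array only for brevity.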

Record:

Document: Conference paper
Title: Advances in deep neural network approaches to speaker recognition
Author: McLaren, M.; Lei, Y.; Ferrer, L.; The Institute of Electrical and Electronics Engineers Signal Processing Society
Affiliation: Speech Technology and Research Laboratory, SRI International, California, United States
Departamento de Computación, FCEN, Universidad de Buenos Aires and CONICET, Argentina
Keywords: bottleneck features; channel mismatch; deep neural networks; normalization; speaker recognition
Year: 2015
Volume: 2015-August
First page: 4814
Last page: 4818
DOI: http://dx.doi.org/10.1109/ICASSP.2015.7178885
Journal title: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Abbreviated journal title: ICASSP IEEE Int Conf Acoust Speech Signal Process Proc
ISSN: 1520-6149
CODEN: IPROD
Record: https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_15206149_v2015-August_n_p4814_McLaren

Citations:

---------- APA ----------
McLaren, M., Lei, Y., Ferrer, L. & The Institute of Electrical and Electronics Engineers Signal Processing Society (2015). Advances in deep neural network approaches to speaker recognition. 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015, 2015-August, 4814-4818.
http://dx.doi.org/10.1109/ICASSP.2015.7178885
---------- CHICAGO ----------
McLaren, M., Lei, Y., Ferrer, L., The Institute of Electrical and Electronics Engineers Signal Processing Society. "Advances in deep neural network approaches to speaker recognition." 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 2015-August (2015): 4814-4818.
http://dx.doi.org/10.1109/ICASSP.2015.7178885
---------- MLA ----------
McLaren, M., Lei, Y., Ferrer, L., The Institute of Electrical and Electronics Engineers Signal Processing Society. "Advances in deep neural network approaches to speaker recognition." 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015, vol. 2015-August, 2015, pp. 4814-4818.
http://dx.doi.org/10.1109/ICASSP.2015.7178885
---------- VANCOUVER ----------
McLaren, M., Lei, Y., Ferrer, L., The Institute of Electrical and Electronics Engineers Signal Processing Society. Advances in deep neural network approaches to speaker recognition. ICASSP IEEE Int Conf Acoust Speech Signal Process Proc. 2015;2015-August:4814-4818.
http://dx.doi.org/10.1109/ICASSP.2015.7178885