Exploring the role of phonetic bottleneck features for speaker and language recognition

McLaren, M.; Ferrer, L.; Lawson, A.; The Institute of Electrical and Electronics Engineers Signal Processing Society

doi:10.1109/ICASSP.2016.7472744

McLaren, M.; Ferrer, L.; Lawson, A.; The Institute of Electrical and Electronics Engineers Signal Processing Society "Exploring the role of phonetic bottleneck features for speaker and language recognition" (2016) 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016. 2016-May:5575-5579

https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_15206149_v2016-May_n_p5575_McLaren

Abstract:

Using bottleneck features extracted from a deep neural network (DNN) trained to predict senone posteriors has resulted in new, state-of-the-art technology for language and speaker identification. For language identification, the features' dense phonetic information is believed to enable improved performance by better representing language-dependent phone distributions. For speaker recognition, the role of these features is less clear, given that a bottleneck layer near the DNN output layer is thought to contain limited speaker information. In this article, we analyze the role of bottleneck features in these identification tasks by varying the DNN layer from which they are extracted, under the hypothesis that speaker information is traded for dense phonetic information as the layer moves toward the DNN output layer. Experiments support this hypothesis under certain conditions, and highlight the benefit of using a bottleneck layer close to the DNN output layer when DNN training data is matched to the evaluation conditions, and a layer more central to the DNN otherwise. © 2016 IEEE.
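The layer-varying setup described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration and is not taken from the paper: the helper names `build_dnn` and `extract_bottleneck`, the layer sizes, the tanh activations, and the use of random (untrained) weights in place of a DNN trained to predict senone posteriors. The sketch only shows the mechanics of placing a narrow layer at a chosen depth and reading its outputs as per-frame features.

```python
import numpy as np


def build_dnn(input_dim, hidden_dim, bottleneck_dim, n_hidden,
              bottleneck_pos, n_senones, seed=0):
    """Return (W, b) pairs for an MLP whose hidden layer at 0-based index
    `bottleneck_pos` is narrowed to `bottleneck_dim` units.

    Weights are random: a real system would train them on senone targets.
    """
    rng = np.random.default_rng(seed)
    sizes = [hidden_dim] * n_hidden
    sizes[bottleneck_pos] = bottleneck_dim
    dims = [input_dim] + sizes + [n_senones]
    return [(rng.standard_normal((d_in, d_out)) * 0.1, np.zeros(d_out))
            for d_in, d_out in zip(dims[:-1], dims[1:])]


def extract_bottleneck(layers, frames, bottleneck_pos):
    """Forward `frames` up to the bottleneck layer and return its linear
    (pre-activation) outputs, one feature vector per frame."""
    h = frames
    for i, (W, b) in enumerate(layers):
        z = h @ W + b
        if i == bottleneck_pos:
            return z
        h = np.tanh(z)
    raise ValueError("bottleneck_pos is beyond the last layer")


# Hypothetical input: 200 frames of 40-dimensional acoustic features.
frames = np.random.default_rng(1).standard_normal((200, 40))

# Move the bottleneck from the first hidden layer toward the output layer.
features = {
    pos: extract_bottleneck(build_dnn(40, 512, 80, 5, pos, 1000), frames, pos)
    for pos in range(5)
}
```

Each entry of `features` is a frames-by-bottleneck matrix that could feed a downstream speaker or language recognition back end; comparing back-end performance across `pos` values mirrors the kind of comparison the abstract describes.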

Record:

Document:	Conference paper
Title:	Exploring the role of phonetic bottleneck features for speaker and language recognition
Author:	McLaren, M.; Ferrer, L.; Lawson, A.; The Institute of Electrical and Electronics Engineers Signal Processing Society
Affiliation:	Speech Technology and Research Laboratory, SRI International, CA, United States; Departamento de Computación, FCEN, Universidad de Buenos Aires and CONICET, Argentina
Keywords:	Bottleneck Features; Deep Neural Networks; Language Recognition; Speaker Recognition
Year:	2016
Volume:	2016-May
Start page:	5575
End page:	5579
DOI:	http://dx.doi.org/10.1109/ICASSP.2016.7472744
Journal title:	41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016
Abbreviated journal title:	ICASSP IEEE Int Conf Acoust Speech Signal Process Proc
ISSN:	15206149
CODEN:	IPROD
Record URL:	https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_15206149_v2016-May_n_p5575_McLaren

Citations:

---------- APA ----------

McLaren, M., Ferrer, L., Lawson, A. & The Institute of Electrical and Electronics Engineers Signal Processing Society (2016). Exploring the role of phonetic bottleneck features for speaker and language recognition. 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, 2016-May, 5575-5579.
http://dx.doi.org/10.1109/ICASSP.2016.7472744

---------- CHICAGO ----------

McLaren, M., Ferrer, L., Lawson, A., The Institute of Electrical and Electronics Engineers Signal Processing Society. "Exploring the role of phonetic bottleneck features for speaker and language recognition." 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 2016-May (2016): 5575-5579.
http://dx.doi.org/10.1109/ICASSP.2016.7472744

---------- MLA ----------

McLaren, M., Ferrer, L., Lawson, A., The Institute of Electrical and Electronics Engineers Signal Processing Society. "Exploring the role of phonetic bottleneck features for speaker and language recognition." 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, vol. 2016-May, 2016, pp. 5575-5579.
http://dx.doi.org/10.1109/ICASSP.2016.7472744

---------- VANCOUVER ----------

McLaren, M., Ferrer, L., Lawson, A., The Institute of Electrical and Electronics Engineers Signal Processing Society. Exploring the role of phonetic bottleneck features for speaker and language recognition. ICASSP IEEE Int Conf Acoust Speech Signal Process Proc. 2016;2016-May:5575-5579.
http://dx.doi.org/10.1109/ICASSP.2016.7472744