Conference paper

McLaren, M.; Ferrer, L.; Castan, D.; Lawson, A.; Morgan N.; Georgiou P.; Narayanan S.; Metze F.; Amazon Alexa; Apple; eBay; et al.; Google; Microsoft "The 2016 Speakers in the Wild speaker recognition evaluation" (2016) 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016. 08-12-September-2016:823-827

Abstract:

The newly collected Speakers in the Wild (SITW) database was central to a text-independent speaker recognition challenge held as part of a special session at Interspeech 2016. The SITW database is composed of audio recordings from 299 speakers collected from open-source media, with an average of 8 sessions per speaker. The recordings contain unconstrained or "wild" acoustic conditions, rarely found in large speaker recognition datasets, and multi-speaker recordings for both speaker enrollment and verification. This article provides details of the SITW speaker recognition challenge and an analysis of the evaluation results. Of the 25 international teams involved in the challenge, 11 participated in an evaluation track. Teams were tasked with applying existing and novel speaker recognition algorithms to the challenges associated with the real-world conditions of SITW. We analyze some of the top-performing systems submitted during the evaluation and suggest future research directions. Copyright ©2016 ISCA.

Record:

Document type: Conference paper
Title: The 2016 Speakers in the Wild speaker recognition evaluation
Authors: McLaren, M.; Ferrer, L.; Castan, D.; Lawson, A.; Morgan N.; Georgiou P.; Narayanan S.; Metze F.; Amazon Alexa; Apple; eBay; et al.; Google; Microsoft
Affiliations: Speech Technology and Research Laboratory, SRI International, CA, United States
Departamento de Computación, FCEN, Universidad de Buenos Aires, CONICET, Argentina
Keywords: Evaluation; Speaker recognition; Speakers in the wild database; Audio recordings; Character recognition; Database systems; Speech communication; Speech processing; Acoustic conditions; Evaluation results; Future research directions; International team; Speaker recognition evaluations; Text independents; Speech recognition
Year: 2016
Volume: 08-12-September-2016
Start page: 823
End page: 827
DOI: http://dx.doi.org/10.21437/Interspeech.2016-1137
Journal title: 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016
Abbreviated journal title: Proc. Annu. Conf. Int. Speech. Commun. Assoc., INTERSPEECH
ISSN: 2308-457X
Record URL: https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_2308457X_v08-12-September-2016_n_p823_McLaren

