Abstract:
The newly collected Speakers in the Wild (SITW) database was central to a text-independent speaker recognition challenge held as part of a special session at Interspeech 2016. The SITW database is composed of audio recordings from 299 speakers collected from open-source media, with an average of 8 sessions per speaker. The recordings contain unconstrained or "wild" acoustic conditions, rarely found in large speaker recognition datasets, and multi-speaker recordings for both speaker enrollment and verification. This article provides details of the SITW speaker recognition challenge and an analysis of the evaluation results. There were 25 international teams involved in the challenge, of which 11 participated in an evaluation track. Teams were tasked with applying existing and novel speaker recognition algorithms to the challenges associated with the real-world conditions of SITW. We analyze some of the top-performing systems submitted during the evaluation and outline future research directions. Copyright ©2016 ISCA.
Record:
Document: |
Conference
|
Title: | The 2016 speakers in the wild speaker recognition evaluation |
Author: | McLaren, M.; Ferrer, L.; Castan, D.; Lawson, A.; Morgan N.; Georgiou P.; Morgan N.; Narayanan S.; Metze F.; Amazon Alexa; Apple; eBay; et al.; Google; Microsoft |
Affiliation: | Speech Technology and Research Laboratory, SRI International, CA, United States; Departamento de Computación, FCEN, Universidad de Buenos Aires, CONICET, Argentina
|
Keywords: | Evaluation; Speaker recognition; Speakers in the wild database; Audio recordings; Character recognition; Database systems; Speech communication; Speech processing; Acoustic conditions; Evaluation results; Future research directions; International team; Speaker recognition evaluations; Text independents; Speech recognition |
Year: | 2016
|
Volume: | 08-12-September-2016
|
Start page: | 823
|
End page: | 827
|
DOI: |
http://dx.doi.org/10.21437/Interspeech.2016-1137 |
Journal title: | 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016
|
Abbreviated journal title: | Proc. Annu. Conf. Int. Speech. Commun. Assoc., INTERSPEECH
|
ISSN: | 2308457X
|
Record: | https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_2308457X_v08-12-September-2016_n_p823_McLaren |
References:
- NIST Speaker Recognition Evaluations, http://www.nist.gov/itl/iad/mig/sre.cfm
- Gonzalez-Rodriguez, J., Evaluating automatic speaker recognition systems: An overview of the NIST speaker recognition evaluations (1996-2014) (2014) Loquens, 1 (1)
- McLaren, M., Ferrer, L., Castan, D., Lawson, A., The speakers in the wild (SITW) speaker recognition database (2016) Submitted to Interspeech, 2016
- Poh, N., Bengio, S., Estimating the confidence interval of expected performance curve in biometric authentication using joint bootstrap (2007) Proc. ICASSP, Honolulu, Apr
- Lei, Y., Scheffer, N., Ferrer, L., McLaren, M., A novel scheme for speaker recognition using a phonetically-aware deep neural network (2014) Proc. ICASSP, Florence, Italy, May
- Matejka, P., Zhang, L., Ng, T., Mallidi, S.H., Glembek, O., Ma, J., Zhang, B., Neural network bottleneck features for language identification (2014) Proc. Odyssey-14, Joensuu, Finland, Jun
- McLaren, M., Lei, Y., Ferrer, L., Advances in deep neural network approaches to speaker recognition (2015) Proc. ICASSP, Brisbane, Australia, May
- McLaren, M., Van Leeuwen, D., Source-normalized LDA for robust speaker recognition using i-vectors from multiple speech sources (2012) Audio, Speech, and Language Processing, IEEE Transactions on, 20 (3), pp. 755-766
- Zhou, X., Garcia-Romero, D., Duraiswami, R., Espy-Wilson, C., Shamma, S., Linear versus mel frequency cepstral coefficients for speaker recognition (2011) Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on, IEEE
- Garcia-Romero, D., Espy-Wilson, C., Analysis of i-vector length normalization in speaker recognition systems (2011) Proc. Interspeech, Florence, Italy, Aug
- Ferrer, L., McLaren, M., Scheffer, N., Lei, Y., Graciarena, M., Mitra, V., A noise-robust system for NIST 2012 speaker recognition evaluation (2013) Proc. Interspeech, Lyon, France, Aug
- DARPA RATS Program, http://www.darpa.mil/program/robust-atuomatic-transcription-of-speech
- Thomas, S., Saon, G., Van Segbroeck, M., Narayanan, S.S., Improvements to the IBM speech activity detection system for the DARPA RATS program (2015) Proc. ICASSP, Brisbane, Australia, May
- Ma, J., Improving the speech activity detection for the DARPA RATS phase-3 evaluation (2014) Proc. Interspeech, Singapore, Sep
- Graciarena, M., Alwan, A., Ellis, D., Franco, H., Ferrer, L., Hansen, J.H., Janin, A., Mitra, V., All for one: Feature combination for highly channel-degraded speech activity detection (2013) Proc. Interspeech, Lyon, France, Aug
- Ferrer, L., Graciarena, M., Mitra, V., A phonetically aware system for speech activity detection (2016) Proc. ICASSP, Shanghai, China, Mar
- Ferrer, L., Burget, L., Plchot, O., Scheffer, N., A unified approach for audio characterization and its application to speaker recognition (2012) Proc. Odyssey-12, Singapore, Jun
- McLaren, M., Lawson, A., Ferrer, L., Scheffer, N., Lei, Y., Trial-based calibration for speaker recognition in unseen conditions (2014) Proc. Odyssey-14, Joensuu, Finland, Jun
Citations:
---------- APA ----------
McLaren, M., Ferrer, L., Castan, D., Lawson, A., Morgan N., Georgiou P., Morgan N., ..., Amazon Alexa; Apple; eBay; et al.; Google; Microsoft (2016). The 2016 speakers in the wild speaker recognition evaluation. 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016, 08-12-September-2016, 823-827.
http://dx.doi.org/10.21437/Interspeech.2016-1137
---------- CHICAGO ----------
McLaren, M., Ferrer, L., Castan, D., Lawson, A., Morgan N., Georgiou P., et al. "The 2016 speakers in the wild speaker recognition evaluation". 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 08-12-September-2016 (2016): 823-827.
http://dx.doi.org/10.21437/Interspeech.2016-1137
---------- MLA ----------
McLaren, M., Ferrer, L., Castan, D., Lawson, A., Morgan N., Georgiou P., et al. "The 2016 speakers in the wild speaker recognition evaluation". 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016, vol. 08-12-September-2016, 2016, pp. 823-827.
http://dx.doi.org/10.21437/Interspeech.2016-1137
---------- VANCOUVER ----------
McLaren, M., Ferrer, L., Castan, D., Lawson, A., Morgan N., Georgiou P., et al. The 2016 speakers in the wild speaker recognition evaluation. Proc. Annu. Conf. Int. Speech. Commun. Assoc., INTERSPEECH. 2016;08-12-September-2016:823-827.
http://dx.doi.org/10.21437/Interspeech.2016-1137