Abstract:
We present a system for detecting lexical stress in English words spoken by English learners. The system uses both spectral and segmental features to detect three levels of stress for each syllable in a word. The segmental features are computed on the vowels and include normalized energy, pitch, spectral tilt and duration measurements. The spectral features are computed at the frame level and are modeled by one Gaussian Mixture Model (GMM) for each stress class. These GMMs are used to obtain segmental posteriors, which are then appended to the segmental features to obtain a final set of GMMs. The segmental GMMs are used to obtain posteriors for each stress class. The system was tested on English speech from native English-speaking children and from Japanese-speaking children with variable levels of English proficiency. Our algorithm results in an error rate of approximately 13% on native data and 20% on Japanese non-native data. © 2014 IEEE.
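The frame-level step described in the abstract can be illustrated with a minimal sketch. It assumes diagonal-covariance GMMs, frames treated as independent within a vowel, and uniform class priors; none of these details are stated in the abstract. The function names, the toy one-dimensional models, and the feature values are hypothetical, not the authors' implementation.

```python
import math

def diag_gmm_loglik(frame, gmm):
    """Log-likelihood of one frame under a diagonal-covariance GMM.

    gmm is a list of (weight, mean_vector, variance_vector) components.
    """
    comp = []
    for weight, mean, var in gmm:
        ll = math.log(weight)
        for x, m, v in zip(frame, mean, var):
            ll -= 0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        comp.append(ll)
    top = max(comp)  # log-sum-exp over components for numerical stability
    return top + math.log(sum(math.exp(c - top) for c in comp))

def segment_posteriors(frames, class_gmms, priors):
    """Posterior over stress classes for the frames of one vowel segment.

    Assumption of this sketch: frame log-likelihoods are simply summed,
    then combined with the class priors by Bayes' rule.
    """
    scores = {
        label: math.log(priors[label])
        + sum(diag_gmm_loglik(f, gmm) for f in frames)
        for label, gmm in class_gmms.items()
    }
    top = max(scores.values())
    norm = sum(math.exp(s - top) for s in scores.values())
    return {label: math.exp(s - top) / norm for label, s in scores.items()}

# Hypothetical toy models: one 1-D Gaussian per stress class.
class_gmms = {
    "unstressed": [(1.0, [0.0], [1.0])],
    "secondary": [(1.0, [1.0], [1.0])],
    "primary": [(1.0, [2.0], [1.0])],
}
priors = {label: 1.0 / 3.0 for label in class_gmms}

frames = [[1.9], [2.1], [2.0]]  # made-up spectral frames for one vowel
post = segment_posteriors(frames, class_gmms, priors)

# Append the spectral posteriors to (made-up) segmental features, giving
# the augmented vector that a second-stage model would consume.
segmental = [0.8, 1.2, -0.3, 0.15]  # e.g. energy, pitch, tilt, duration
augmented = segmental + [post[label] for label in class_gmms]
```

With these toy values the frames sit closest to the "primary" model, so that class receives the largest posterior. A real system would train the class models on data and use multi-dimensional spectral features.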
Record:
Document type: | Conference paper |
Title: | Lexical stress classification for language learning using spectral and segmental features |
Author: | Ferrer, L.; Bratt, H.; Richey, C.; Franco, H.; Abrash, V.; Precoda, K. |
City: | Florence |
Affiliation: | Speech Technology and Research Laboratory, SRI International, CA, United States; CONICET, Argentina; Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Argentina |
Keywords: | Computer-aided language learning; Gaussian Mixture Models; Stress classification; Communication channels (information theory); Computer aided instruction; Object recognition; Computer-Aided Language Learning; English word; Gaussian Mixture Model; Language learning; Non-native; Spectral feature; Spectral tilt; Stress classifications; Signal processing |
Year: | 2014 |
First page: | 7704 |
Last page: | 7708 |
DOI: | http://dx.doi.org/10.1109/ICASSP.2014.6855099 |
Journal title: | 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 |
Abbreviated journal title: | ICASSP IEEE Int Conf Acoust Speech Signal Process Proc |
ISSN: | 1520-6149 |
CODEN: | IPROD |
Record: | https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_15206149_v_n_p7704_Ferrer |
Citations:
---------- APA ----------
Ferrer, L., Bratt, H., Richey, C., Franco, H., Abrash, V. & Precoda, K. (2014). Lexical stress classification for language learning using spectral and segmental features. 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014, 7704-7708.
http://dx.doi.org/10.1109/ICASSP.2014.6855099
---------- CHICAGO ----------
Ferrer, L., Bratt, H., Richey, C., Franco, H., Abrash, V., Precoda, K. "Lexical stress classification for language learning using spectral and segmental features". 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 (2014): 7704-7708.
http://dx.doi.org/10.1109/ICASSP.2014.6855099
---------- MLA ----------
Ferrer, L., Bratt, H., Richey, C., Franco, H., Abrash, V., Precoda, K. "Lexical stress classification for language learning using spectral and segmental features". 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014, 2014, pp. 7704-7708.
http://dx.doi.org/10.1109/ICASSP.2014.6855099
---------- VANCOUVER ----------
Ferrer, L., Bratt, H., Richey, C., Franco, H., Abrash, V., Precoda, K. Lexical stress classification for language learning using spectral and segmental features. ICASSP IEEE Int Conf Acoust Speech Signal Process Proc. 2014:7704-7708.
http://dx.doi.org/10.1109/ICASSP.2014.6855099