Article

Abstract:

We present a series of studies of affirmative cue words, a family of cue words such as "okay" or "alright" that speakers use frequently in conversation. These words pose a challenge for spoken dialogue systems because of their ambiguity: They may be used for agreeing with what the interlocutor has said, indicating continued attention, or for cueing the start of a new topic, among other meanings. We describe differences in the acoustic/prosodic realization of such functions in a corpus of spontaneous, task-oriented dialogues in Standard American English. These results are important both for interpretation and for production in spoken language applications. We also assess the predictive power of computational methods for the automatic disambiguation of these words. We find that contextual information and final intonation figure as the most salient cues to automatic disambiguation. © 2012 Association for Computational Linguistics.

Record:

Document: Article
Title: Affirmative cue words in task-oriented dialogue
Author: Gravano, A.; Hirschberg, J.; Běnuš, S.
Affiliation: Departamento de Computación, FCEyN, Universidad de Buenos Aires, Pabellón I, Ciudad Universitaria, (C1428EGA) Buenos Aires, Argentina
Columbia University, United States
Constantine the Philosopher University and Institute of Informatics, Slovak Academy of Sciences, Slovakia
Year: 2012
Volume: 38
Issue: 1
First page: 1
Last page: 39
DOI: http://dx.doi.org/10.1162/COLI_a_00083
Journal title: Computational Linguistics
Abbreviated journal title: Comput. Linguist.
ISSN: 0891-2017
Record: https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_08912017_v38_n1_p1_Gravano

Citations:

---------- APA ----------
Gravano, A., Hirschberg, J. & Běnuš, S. (2012). Affirmative cue words in task-oriented dialogue. Computational Linguistics, 38(1), 1-39.
http://dx.doi.org/10.1162/COLI_a_00083
---------- CHICAGO ----------
Gravano, A., Hirschberg, J., Běnuš, S. "Affirmative cue words in task-oriented dialogue." Computational Linguistics 38, no. 1 (2012): 1-39.
http://dx.doi.org/10.1162/COLI_a_00083
---------- MLA ----------
Gravano, A., Hirschberg, J., Běnuš, S. "Affirmative cue words in task-oriented dialogue." Computational Linguistics, vol. 38, no. 1, 2012, pp. 1-39.
http://dx.doi.org/10.1162/COLI_a_00083
---------- VANCOUVER ----------
Gravano, A., Hirschberg, J., Běnuš, S. Affirmative cue words in task-oriented dialogue. Comput. Linguist. 2012;38(1):1-39.
http://dx.doi.org/10.1162/COLI_a_00083