2016 1st Workshop on Speech-Centric Natural Language Processing (SCNLP)

December 11, 2016, Osaka, Japan

We are happy to introduce the 1st SCNLP workshop, which will be held at COLING!

SCNLP's goal is to unite the ASR and NLP communities to discuss new frameworks for exploiting the rich information present in the speech signal to improve the capabilities of natural language processing applications such as conversational agents, question-answering systems, machine translation, and search. In addition to acoustic environment information, the audio signal may contain speaker-specific features that may indicate the speaker’s emotional state, demographic information, and uncertainty in the utterance, all of which may influence the output of the NLP component. SCNLP encourages novel contributions that revisit conventional NLP problems with a focus on incorporating the richness of spoken language, as well as contributions that promote cross-fertilization between statistical methods for ASR and NLP.

We look forward to seeing you!

Workshop Organizers


Important Dates


Call for Papers

We invite submissions of both long and short papers on original and unpublished work. As in the main conference, submissions are limited to 8 pages. All accepted submissions will be presented as posters. Additionally, selected submissions will be presented orally.

Topics of interest include but are not limited to:

All submissions should conform to COLING 2016 style guidelines. Long and short paper submissions must be anonymized. Please submit your papers at https://www.softconf.com/coling2016/SCNLP/.


Program Committee


Motivation

Language technologies have come of age and are playing an increasingly vital role in our everyday lives. From human-machine conversational technologies to text and speech analytics, we are routinely in contact with language technologies, with or without our knowledge. This progress is directly attributable to robust accuracy improvements in the automatic speech recognition (ASR) and natural language processing (NLP) communities. While both communities use data-driven techniques to achieve robustness, the opportunity to jointly optimize robustness is widely ignored in the majority of speech-driven natural language processing systems; instead, speech-centric NLP tasks predominantly rely on a sequential application of independently optimized ASR and NLP tools.

While advancements in ASR have been demonstrated through significant reductions in word error rate for a variety of word transcription tasks, the standard ASR evaluation metric does not account for the varied uses of the transcriptions in downstream NLP tasks. Furthermore, the impact of rich para-lexical information latent in speech on downstream tasks has not received sufficient attention due to the disproportionate emphasis on word transcription in speech processing. Likewise, although NLP research has begun to address the problem of extra-grammatical and telegraphic texts in user-generated social media, the traditional focus of the field has been on well-edited written texts. As a result, the majority of speech-centric NLP systems do not exploit the weighted multi-string hypotheses typically produced by speech recognizers, but instead treat the problem as a simple ASR-NLP pipeline which transforms ASR outputs into text-like input, such as N-best word hypotheses, prior to processing with conventional NLP tools. Such approaches yield suboptimal output quality, leaving potentially significant room for improvement from leveraging the rich information available in the speech input.

The purpose of this workshop is to unite the ASR and NLP communities to discuss new frameworks for exploiting the rich information present in the speech signal to improve the capabilities of natural language processing applications such as conversational agents, question-answering systems, machine translation, and search. In addition to acoustic environment information, the audio signal may contain speaker-specific features that may indicate the speaker’s emotional state, demographic information, and uncertainty in the utterance, all of which may influence the output of the NLP component. For example, a dialogue system may infer negative feedback from the consumer’s responses and switch to a different dialogue strategy to obtain the information it needs to carry out its task. We invite contributions that revisit conventional NLP problems with a focus on incorporating the richness of spoken language, as well as contributions that promote cross-fertilization between statistical methods for ASR and NLP.


Contact

Send us an email at scnlp {AT} interactions.com.