Task description
Speech summarization has been of great interest to the community because speech is the principal modality of human communication, and speech transcripts are not as easy to skim, search or browse as textual messages. Speech recorded in call centers offers a great opportunity to study goal-oriented, focused conversations between an agent and a caller. The Call Centre Conversation Summarization (CCCS) task consists of automatically generating summaries of spoken conversations in the form of textual synopses that inform the reader about the content of a conversation and might be used for browsing a large database of recordings. In contrast to news summarization, where extractive approaches have been very successful, the CCCS task aims to foster work on abstractive summarization, in order to depict what happened in a conversation rather than what people actually said.
The MultiLing'15 CCCS track leverages conversations from the DECODA and LUNA corpora of French and Italian call center recordings, both with transcripts available in their original language as well as in English translation (both manual and automatic). Recording durations range from a few minutes to 15 minutes, and recordings involve two or sometimes more speakers. Drawn from the public transportation and help desk domains, the dialogs offer a rich range of situations (with emotions such as anger or frustration) while staying within a coherent domain.
Given transcripts, participants in the task shall generate abstractive summaries that inform a reader about the main events of the conversations, such as the objective of the caller, whether and how the agent resolved it, and the attitude of both parties. Evaluation will be performed by comparing submissions to reference synopses written by experts. Both the conversations and the reference summaries are kindly provided by the SENSEI project.
To participate in this task, please contact the MultiLing organisers.
Participants have to submit one synopsis for each conversation, with a length limit of 7% in terms of words. They can participate in any of the English, Italian and French tracks and submit up to 3 runs per track. Participants have to write a paper describing their system and submit it to the SIGDIAL special session.
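As a rough illustration of the length limit, the sketch below checks a synopsis against a word budget. It assumes that the 7% is measured against the number of whitespace-separated words in the conversation transcript, which this page does not spell out, and the function names are hypothetical rather than part of any official tooling.

```python
# Assumption: the 7% limit is relative to the transcript's word count,
# and words are counted by splitting on whitespace.
LIMIT_PERCENT = 7


def word_budget(transcript: str, percent: int = LIMIT_PERCENT) -> int:
    """Return the maximum number of words allowed in a synopsis."""
    n_words = len(transcript.split())
    # Integer arithmetic avoids floating-point rounding; rounds down.
    return n_words * percent // 100


def within_limit(synopsis: str, transcript: str, percent: int = LIMIT_PERCENT) -> bool:
    """Check whether a synopsis fits the word budget of its transcript."""
    return len(synopsis.split()) <= word_budget(transcript, percent)


if __name__ == "__main__":
    transcript = " ".join(["word"] * 200)   # stand-in for a 200-word transcript
    synopsis = " ".join(["word"] * 14)      # stand-in for a 14-word synopsis
    print(word_budget(transcript))
    print(within_limit(synopsis, transcript))
```

Participants should confirm the exact counting rule (tokenization, rounding, and the text the percentage refers to) with the organisers before relying on such a check.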
Data
Dates
Revision created 3350 days ago by Benoit Favre
Revision created 3572 days ago by George Giannakopoulos (Admin)