Overview

MultiLing 2013 covers three subdomains of Natural Language Processing, focused on the multilingual aspect of summarization. Each domain is allocated a separate section of the workshop.
The three domains are:

  • Multilingual multi-document summarization: Summarization, especially from multiple documents, has received increasing attention in recent years, mostly due to the growing volume and redundancy of information available online. More recently, interest has grown in methods that can function across a variety of languages. Multilingual multi-document summarization is the domain that researches such methods and studies their requirements and intricacies.
  • Multilingual summary evaluation: Summary evaluation has been an open question for several years, even though there exist methods that correlate well with human judgement when called upon to compare systems. In the multilingual setting, it is not obvious that these methods will perform as well as they do in the English-language setting.
    In fact, some preliminary results have shown that several problems may arise in the multilingual setting [Giannakopoulos et al., 2011]. This section of the workshop aims to cover and discuss these research problems and corresponding solutions.
  • Multilingual summarization data collection and exploitation: The collection of multilingual corpora for summarization and summarization evaluation poses a challenge in itself.
    This section of the workshop works towards well-defined practices for the collection of such data, as well as the implementation and use of community tools to support the collection process. Furthermore, this section will include a discussion on how we can maximize the benefit of the generated corpora to the scientific community.

 

References:

[Giannakopoulos et al., 2011] Giannakopoulos, G., El-Haj, M., Favre, B., Litvak, M., Steinberger, J., and Varma, V. (2011). TAC2011 MultiLing Pilot Overview.

The workshop will build upon the results of a set of research community tasks, which are described in the following paragraphs.

Tasks

Summarization Task

This MultiLing task aims to evaluate the application of (partially or fully) language-independent summarization algorithms to a variety of languages. Each system participating in the task will be asked to provide summaries for a range of different languages, based on corresponding corpora. The MultiLing Pilot of 2011 used 7 languages, while we aim for at least 8 languages in the MultiLing 2013 workshop. Participating systems will be required to apply their methods to a minimum of two languages. Evaluation will favour systems that apply their methods to more languages.

The MultiLing task requires systems to generate a single, fluent, representative summary from a set of documents describing an event sequence. The language of the document set will be within a given range of languages, and all documents in a set share the same language. The output summary should be in the same language as its source documents and should be 250 words at most.
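As an illustration of the 250-word limit, here is a minimal Python sketch of how a participant might check or enforce it before submitting. It assumes that a word is a whitespace-delimited token. That assumption is ours, not an official definition from the task, and it will not hold for languages written without spaces between words. The function names are hypothetical.

```python
MAX_WORDS = 250  # length limit stated in the task description


def count_words(summary: str) -> int:
    """Count whitespace-delimited tokens (an assumed word definition)."""
    return len(summary.split())


def within_limit(summary: str, max_words: int = MAX_WORDS) -> bool:
    """Return True if the summary has at most max_words tokens."""
    return count_words(summary) <= max_words


def truncate(summary: str, max_words: int = MAX_WORDS) -> str:
    """Keep only the first max_words tokens, joined by single spaces."""
    return " ".join(summary.split()[:max_words])
```

A summary that fails `within_limit` could be passed through `truncate`, although cutting at a hard token boundary may harm fluency, so shortening at a sentence boundary is likely preferable in practice.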

Evaluation Task

This task aims to examine how well automated systems can evaluate summaries in different languages. It takes as input the summaries generated by automatic systems and humans in the Summarization Task. The output should be a grading of the summaries. Ideally, the automatic evaluation should correlate maximally with human judgement.
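To make the notion of correlating with human judgement concrete, the following Python sketch computes a rank correlation between automatic grades and human grades for the same summaries. The choice of Spearman's rank correlation is an assumption made for illustration, since the task description does not name a specific measure. The function names and the sample grades are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 upwards; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical grades for five summaries (illustrative values only).
human_grades = [1, 2, 3, 4, 5]
automatic_grades = [0.10, 0.20, 0.30, 0.50, 0.90]
agreement = spearman(automatic_grades, human_grades)
```

A value near 1 would indicate that the automatic method ranks summaries in nearly the same order as the human judges, while a value near 0 would indicate little agreement.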

Venue

MultiLing 2013 will be held as part of ACL 2013.

Details to be announced.