Task description
Following the 2015 task, the multi-lingual single-document summarization task is to generate a single-document summary for each of the given Wikipedia featured articles in at least one of the about 38 languages provided. The training data will be the Single-Document Summarization Task data from MultiLing 2015. A new set of test data will be generated from additional Wikipedia featured articles.
The summaries will be evaluated via automatic methods, and participants will be required to perform some limited summarization evaluations. The manual evaluation will consist of pairwise comparisons of machine-generated summaries. Each evaluator will be presented with the human-generated summary and two machine summaries. The evaluation task is to read the human summary and judge whether one machine summary is significantly closer to the information content of the human summary (e.g. system A > system B) or whether the two machine summaries cover a comparable amount of the information in the human summary.
Papers on multi-lingual summarization based on the 2015 training and test data may be submitted for consideration as part of the workshop proceedings. Note that the 2017 test data will be released the day AFTER the papers are due. There will be an opportunity for poster sessions presenting results on the 2017 data for all who submit summaries for at least 5 languages in the 2017 multi-lingual summarization task.
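The pairwise manual evaluation described above yields one of three verdicts per comparison: system A is closer to the human summary, system B is closer, or the two are comparable. As a rough illustration only, the sketch below tallies such verdicts into per-system win, tie, and loss counts. The function name `tally_pairwise`, the verdict labels `"A"`, `"B"`, and `"tie"`, and the system names are assumptions made for this example; they are not part of the official evaluation procedure.

```python
from collections import Counter


def tally_pairwise(judgments):
    """Aggregate pairwise verdicts into (wins, ties, losses) per system.

    judgments: iterable of (system_a, system_b, verdict) tuples, where
    verdict is "A" (system_a closer to the human summary), "B"
    (system_b closer), or "tie" (comparable). These labels are
    hypothetical, chosen for this sketch.
    """
    wins, ties, losses = Counter(), Counter(), Counter()
    for system_a, system_b, verdict in judgments:
        if verdict == "A":
            wins[system_a] += 1
            losses[system_b] += 1
        elif verdict == "B":
            wins[system_b] += 1
            losses[system_a] += 1
        elif verdict == "tie":
            ties[system_a] += 1
            ties[system_b] += 1
        else:
            raise ValueError(f"unknown verdict: {verdict!r}")
    systems = set(wins) | set(ties) | set(losses)
    return {s: (wins[s], ties[s], losses[s]) for s in systems}


if __name__ == "__main__":
    # Made-up judgments, for illustration only.
    example = [
        ("sys1", "sys2", "A"),
        ("sys1", "sys3", "tie"),
        ("sys2", "sys3", "B"),
    ]
    for system, counts in sorted(tally_pairwise(example).items()):
        print(system, counts)
```

Raw counts like these are only a starting point; how the verdicts are actually combined into a ranking is up to the organizers.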
Data
For 2017, the training data will be the 2015 test data, which may be downloaded from the 2015 site or via the direct download link, together with the 2015 training data.
The submitted summaries for 2015 and their automatic evaluation scores can be downloaded by clicking here. ROUGE needs to be modified to run on multilingual data; you may download the modifications, together with the scripts used for 2015, here.
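To make concrete why a multilingual setting needs care in ROUGE-style scoring, here is a minimal sketch of unigram recall (the idea behind ROUGE-1 recall) using Unicode-aware tokenization. This is a simplified illustration, not the modified ROUGE scripts referred to above: the function names are hypothetical, there is no stemming or stopword handling, and the `\w+` tokenizer does not segment languages written without spaces between words (e.g. Chinese, Japanese, Thai).

```python
import re
from collections import Counter


def tokenize(text):
    # In Python 3, \w on str patterns matches Unicode word characters,
    # so non-Latin scripts are not silently dropped.
    return re.findall(r"\w+", text.lower())


def unigram_recall(reference, candidate):
    """Fraction of reference unigrams also found in the candidate.

    Counts are clipped, so a word repeated in the candidate cannot be
    credited more often than it occurs in the reference.
    """
    ref_counts = Counter(tokenize(reference))
    cand_counts = Counter(tokenize(candidate))
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand_counts[tok]) for tok, count in ref_counts.items())
    return overlap / total


if __name__ == "__main__":
    print(unigram_recall("the cat sat on the mat", "the cat lay on a mat"))
```

For official scores, use the modified ROUGE and the 2015 scripts linked above rather than a simplified stand-in like this one.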
The 2017 test data is available here. The data consist of 30 featured Wikipedia articles for each of 41 languages and are provided both in an XML format and as raw text.
Results
To be announced
Dates
Training data available Dec 18, 2016
Submission deadline extended to Monday, Jan 30, 2017 (end of day GMT-12, i.e. end of day wherever you are in the world).
Test data available Jan 31, 2017: Now available!
Submissions Due Feb 15, 2017
Manual Evaluation Begins: February 20, 2017
Preliminary Results Released: March 15, 2017
Workshop: April 3, 2
Revision created 2902 days ago by George Giannakopoulos (Admin)