(2019-07-10) Submission process details updated.
(2019-05-22) Training data for the Wikipedia featured articles is available here.
(2019-06-02) Training data for the Wikinews articles is now available to download here.
(2019-06-30) Test data for the Wikinews articles is now available to download here.
(2019-07-03) Test data for the Wikipedia featured articles is available here.

The objective of the Headline Generation (HG) task is to explore some of the challenges highlighted by current state-of-the-art approaches to creating informative headlines for news articles: non-descriptive headlines, out-of-domain training data, and generating headlines from long documents that are not well represented by the head heuristic.

We propose to make available a large set of training data for headline generation and to create evaluation conditions that emphasize those challenges. Our data sets will draw from Wikinews as well as Wikipedia. The latter set will leverage data previously released for the MultiLing 2015 and 2017 tasks. For Wikinews, systems will attempt to generate news headlines; for Wikipedia, they will attempt to generate both the title and the main section headings. In each case the input is the respective document with its headline, or its title and subject headings, removed. Participants will be required to generate summaries for at least 3 languages, which will be evaluated via both automatic and manual methods.

The manual evaluation will consist of pairwise comparisons of machine(-generated) summaries. Each evaluator will be presented with the human(-generated) summary and two machine summaries. The evaluation task is to read the human summary and judge whether one machine summary is significantly closer to the human summary in information content (e.g. system A > system B) or whether the two machine summaries are comparably close to the human summary in the amount of information they contain. The manual evaluation will be carried out by volunteers who are native or fluent speakers, and participation is strongly encouraged.
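As an illustration only, the pairwise judgments described above could be recorded and tallied along the following lines. The task description does not specify how judgments are aggregated, so the label names, the tuple layout, and the `tally` function are all hypothetical.

```python
from collections import Counter

# Hypothetical labels for the three possible outcomes of one comparison.
A_BETTER, B_BETTER, COMPARABLE = "A>B", "B>A", "A~B"


def tally(judgments):
    """Count wins and ties per system from (system_a, system_b, label) triples."""
    wins = Counter()
    ties = Counter()
    for sys_a, sys_b, label in judgments:
        if label == A_BETTER:
            wins[sys_a] += 1
        elif label == B_BETTER:
            wins[sys_b] += 1
        elif label == COMPARABLE:
            # Neither summary is significantly closer to the human summary.
            ties[sys_a] += 1
            ties[sys_b] += 1
        else:
            raise ValueError(f"unknown label: {label!r}")
    return wins, ties


# Made-up judgments, for demonstration only.
judgments = [
    ("sys1", "sys2", A_BETTER),
    ("sys1", "sys3", COMPARABLE),
    ("sys2", "sys3", B_BETTER),
]
wins, ties = tally(judgments)
print(dict(wins))
print(dict(ties))
```

Simple win and tie counts are only one possible aggregation; the organizers may rank systems differently.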

The automatic evaluation will be performed primarily by the HEVas system, which measures the quality of a headline in terms of both informativeness and readability. Secondary automatic evaluations may include other established automatic evaluation methods.


Please refer to the call for participation for the important dates of the task.

The task coordinator is Dr. Marina Litvak (litvak.marina at gmail dot com) and the task mailing list is multiling19hg at scify dot org.

Participants should submit results via email to the task coordinator with the subject line:
[multiling19][headline generation results] [systemid]
where [systemid] is an identifier or description of the participating organization or system.
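For participants who script their submissions, a minimal sketch of building the subject line is shown below. The function name `build_subject` and the sample identifier are hypothetical, and the sketch assumes the square brackets around the system identifier are kept literally, as in the template above.

```python
def build_subject(system_id: str) -> str:
    """Return the e-mail subject line in the format given on this page."""
    system_id = system_id.strip()
    if not system_id:
        raise ValueError("system_id must not be empty")
    # Assumption: the brackets around the identifier are part of the subject.
    return f"[multiling19][headline generation results] [{system_id}]"


# "example-system" is a placeholder identifier, not a real submission.
print(build_subject("example-system"))
# prints: [multiling19][headline generation results] [example-system]
```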

If the results are accompanied by a paper or technical report describing the system, it should be submitted through the process described in the call for community task participation.
The result submission procedure will be announced once the test data are released.