This file contains all the dataset files related to the MultiLing 2011 Pilot.
It repackages the existing individual packages, which were previously uploaded as separate files.
We would greatly appreciate it if you filled in the "Using the MultiLing Pilot 2011 Dataset" form before downloading. You can find more information in the README files in the archive, as well as in the following paper:
Comments
Hi George,
Can you post the options you used for ROUGE? For example, did you use ROUGE-1.5.5 with the following options?
-x -u -d -m -n 2 -2 4 -c 95 -r 1000 -f A -t 0 -a -l 250
Also, did you have to modify ROUGE to work with the other languages?
Best regards,
John
Hi John.
The command line we used was:
ROUGE-1.5.5.pl -a -x -2 4 -u -c 95 -e <ROUGEDIR> -r 1000 -n 2 -f A -p 0.5 -t 0 -d <SETTINGS.XML File>
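For anyone scripting this, the command above can be assembled as an argument list. This is only a minimal sketch: the function name `build_rouge_command` and the example paths are hypothetical, and the two placeholders (`<ROUGEDIR>` and the settings XML file) must be filled in with your own locations. The flags and their order are copied from the command line quoted above.

```python
import shlex


def build_rouge_command(rouge_dir, settings_xml, script="ROUGE-1.5.5.pl"):
    """Return the argument list for the command line quoted above.

    rouge_dir    -- stands in for <ROUGEDIR> (value passed to -e)
    settings_xml -- stands in for <SETTINGS.XML File>
    """
    return [
        script,
        "-a", "-x",
        "-2", "4",
        "-u",
        "-c", "95",
        "-e", rouge_dir,
        "-r", "1000",
        "-n", "2",
        "-f", "A",
        "-p", "0.5",
        "-t", "0",
        "-d", settings_xml,
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = build_rouge_command("/path/to/rouge/data", "settings.xml")
    print(shlex.join(cmd))
    # To actually run it (requires Perl and a local ROUGE installation):
    # import subprocess; subprocess.run(cmd, check=True)
```

Keeping the arguments in a list rather than a single string avoids quoting problems if the paths contain spaces.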
We did not modify ROUGE to work with the other languages. As we later saw, this may have caused a problem, because tokenization can behave differently in some languages. However, it is unclear whether the effect of the different tokenization was significant. Marina Litvak later tried some changes to adapt it to Hebrew and may have more information.
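To illustrate why tokenization can matter across languages, here is a small sketch comparing a tokenizer that only recognizes ASCII letters and digits with a Unicode-aware one. Both functions are hypothetical and written for this example; they do not reproduce ROUGE's internal tokenization, and the sample sentences are made up.

```python
import re


def ascii_tokenize(text):
    # Hypothetical tokenizer: only runs of a-z and 0-9 count as tokens,
    # so every other character acts as a separator.
    return re.findall(r"[a-z0-9]+", text.lower())


def unicode_tokenize(text):
    # In Python 3, \w on str patterns matches Unicode letters and digits.
    return re.findall(r"\w+", text.lower())


if __name__ == "__main__":
    english = "The summary was short."
    hebrew = "הסיכום היה קצר"  # made-up sample sentence

    for name, text in [("english", english), ("hebrew", hebrew)]:
        print(name, len(ascii_tokenize(text)), len(unicode_tokenize(text)))
```

With an ASCII-only rule, the Hebrew sentence yields no tokens at all, while the English one is unaffected. Any n-gram overlap computed on such output would differ by language for reasons unrelated to summary quality.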
Hi John, George,
Yes, we also discussed it with Michael Elhadad. There may be a significant effect on ROUGE results depending on whether or not it is adapted to a particular language. I think Michael can explain it better. I used a version of ROUGE that his student had adapted to Hebrew. Then I also adapted it to Arabic in the same manner.