<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" >
<channel>
	<title><![CDATA[MultiLing Community Site: Invited talk]]></title>
	<link>http://multiling.iit.demokritos.gr/pages/view/1677/invited-talk</link>
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">http://multiling.iit.demokritos.gr/pages/view/1677/invited-talk</guid>
	<pubDate>Fri, 23 Aug 2019 09:21:15 +0300</pubDate>
	<link>http://multiling.iit.demokritos.gr/pages/view/1677/invited-talk</link>
	<title><![CDATA[Invited talk]]></title>
	<description><![CDATA[<p><span style="text-decoration: underline;"><strong>Title:</strong></span></p><p>Mining and Enriching Multilingual Scientific Text Collections</p><p><span style="text-decoration: underline;"><strong>Speaker:</strong></span></p><p>Horacio Saggion<br /><br />Large Scale Text Understanding Systems Lab<br />Natural Language Processing Group<br />Department of Information and Communication Technologies<br />Universitat Pompeu Fabra</p><p><span style="text-decoration: underline;"><strong>Abstract:</strong></span></p><p>Scientists worldwide are confronted with exponential growth in the number of scientific documents being made available. For example, Elsevier publishes over 250K scientific articles per year (one every two minutes) and holds over 7 million publications; MedLine, the most important source in biomedical research, contains 21 million scientific references; and the World Intellectual Property Organization (WIPO) holds some 70 million records. This unprecedented volume of information complicates the task of researchers, who face the pressure of keeping up to date with discoveries in their own disciplines and the challenge of searching for innovation, finding new and interesting problems to solve, checking already solved problems or hypotheses, and gathering information on past and current methods, solutions, and techniques. At the same time, with the rise of open science initiatives and social media, research is more connected and open, creating new opportunities but also challenges for the scientific community.</p><p>In this scenario of scientific information overload, natural language processing has a key role to play. Over the past few years we have seen a number of tools for the analysis of the structure of scientific documents (e.g. 
transforming PDF to XML), methods for extracting keywords, and classifiers that sort sentences into argumentative categories. However, deep analysis of scientific documents, such as finding key claims, assessing the argumentative quality and strength of the research, or summarizing the key contributions of a piece of work, is less common. Besides, most research in scientific text processing is carried out for the English language, neglecting both the share of scientific information available in other languages and the fact that scientific publications are often bilingual.</p><p>In this talk, I will present work carried out in our laboratory towards the development of a system for &ldquo;deep&rdquo; analysis and annotation of scientific text collections. Originally built for English, it has now been adapted to Spanish. After a brief overview of the system and its main components, I will present the development of a bilingual (Spanish and English) fully annotated text resource in the field of natural language processing that we have created with our system, together with a faceted-search and visualization system to explore the created resource.</p><p>With this scenario in mind, I will speculate on the challenges and opportunities that the scientific field brings to our community, not only in terms of language but also from the point of view of social media and science education.</p><p><span style="text-decoration: underline;"><strong>Speaker short bio</strong></span>:</p><p>Horacio Saggion is an Associate Professor at the Department of Information and Communication Technologies, Universitat Pompeu Fabra (UPF), Barcelona. 
He is the head of the Large Scale Text Understanding Systems Lab, associated with the Natural Language Processing group (TALN), where he works on automatic text summarization, text simplification, information extraction, sentiment analysis, and related topics. Horacio obtained his PhD in Computer Science from Universite de Montreal, Canada, in 2000. He obtained his BSc in Computer Science from Universidad de Buenos Aires in Argentina, and his MSc in Computer Science from UNICAMP in Brazil. He was the Principal Investigator for UPF in the EU projects Dr Inventor and Able-to-Include and is currently Principal Investigator of the national project TUNER and the Maria de Maeztu project Mining the Knowledge of Scientific Publications. Horacio has published over 150 works in leading scientific journals, conferences, and books in the field of human language technology. He organized four international workshops in the areas of text summarization and information extraction, and was scientific Co-chair of STIL 2009 and scientific Chair of SEPLN 2014. He is a regular programme committee member for international conferences such as ACL, EACL, COLING, EMNLP, IJCNLP, and IJCAI, and is an active reviewer for international journals in computer science, information processing, and human language technology. Horacio has given courses, tutorials, and invited talks at a number of international events including LREC, ESSLLI, IJCNLP, NLDB, and RuSSIR.</p>]]></description>
	<dc:creator>Nikiforos Pittaras</dc:creator>
</item>

</channel>
</rss>