Text2Story 2023
Sixth International Workshop on Narrative Extraction from Texts
held in conjunction with the 45th European Conference on Information Retrieval
Recent years have seen a stream of continuously evolving information, making it unmanageable and time-consuming for an interested reader to track and keep up with all the essential information and the various aspects of a story. Automated narrative extraction from text offers a compelling approach to this problem. It involves identifying the subset of interconnected raw documents, extracting the critical narrative story elements, and representing them in an adequate final form (e.g., timelines) that conveys the key points of the story in an easy-to-understand format. Although information extraction and natural language processing have made significant progress towards an automatic interpretation of texts, the automated identification and analysis of the different elements of a narrative present in a document (set) still poses significant unsolved challenges.
In the sixth edition of the Text2Story workshop, we aim to bring to the forefront the challenges involved in understanding the structure of narratives and in incorporating their representation in well-established models, as well as in modern architectures (e.g., transformers), which are now common and form the backbone of almost every IR and NLP application. We hope that the workshop will provide a common forum to consolidate the multi-disciplinary efforts and foster discussions to identify the wide-ranging issues related to the narrative extraction task. In this regard, we encourage high-quality, original submissions covering the following topics:
We challenge interested researchers to consider submitting a paper that makes use of the tls-covid19 dataset, published at ECIR'21, under the scope and purposes of the Text2Story workshop. tls-covid19 consists of a number of curated topics related to the COVID-19 outbreak, with associated news articles from Portuguese and English news outlets and their respective reference timelines as gold standard. While it was designed to support timeline summarization research tasks, it can also be used for other tasks, including the study of news coverage of the COVID-19 pandemic.
We invite two kinds of submissions:
Original and high-quality unpublished contributions to the theory and practical aspects of the narrative extraction task. Full papers should introduce existing approaches and describe the methodology and the experiments conducted in detail. Negative-result papers highlighting tested hypotheses that did not yield the expected outcome are also welcome.
Unpublished short papers describing work in progress; demo and resource papers presenting research/industrial prototypes, datasets or software packages; position papers introducing a new point of view, a research vision or a reasoned opinion on the workshop topics; and dissemination papers describing project ideas, ongoing research lines, case studies or summarized versions of papers previously published in high-quality conferences/journals that are worth sharing with the Text2Story community, but where novelty is not a fundamental issue.
Papers must be submitted electronically in PDF format through EasyChair. All submissions must be in English and formatted according to the one-column CEUR-ART style with no page numbers. Templates, in either Word or LaTeX, can be found in the following zip folder. There is also an Overleaf page for LaTeX users.
IMPORTANT: Please include between brackets the type of submission (full; negative results; work in progress; demo and resource; position; dissemination) in the paper title.
Papers submitted to Text2Story 2023 should be original work and different from papers that have been previously published, accepted for publication, or that are under review at other venues. The exception to this rule is "dissemination papers". Pre-prints submitted to arXiv are eligible.
Submissions will be peer-reviewed by at least two members of the programme committee. The accepted papers will appear in the proceedings published in the CEUR Workshop Proceedings (indexed in Scopus and DBLP) as long as they don't conflict with previous publication rights.
Abstract: Facilitating news consumption at scale is still quite challenging. Some research effort has focused on devising useful structures for facilitating news navigation for humans, but benchmarks and objective evaluations of such structures are not common. One area that has progressed recently is news timeline summarisation. In this talk, we present some of our work on long-range large-scale news timeline summarisation. Timelines present the most important events of a topic linearly in chronological order and are commonly used by news editors to organise long-ranging topics for news consumers. Tools for automatic timeline summarisation can address the cost of manual effort and the infeasibility of manually covering many topics over long time periods and massive news corpora. We first compare different high-level approaches to timeline summarisation, identify the modules and features important for this task, and present new state-of-the-art results with a simple new method. We provide several examples of automatic timelines and present both a quantitative and a qualitative analysis of these structured news summaries. Most of our tools and datasets are available online on GitHub.
Bio: Dr. Georgiana Ifrim is an Associate Professor at the School of Computer Science, UCD, co-lead of the SFI Centre for Research Training in Machine Learning (ML-Labs) and SFI Funded Investigator with the Insight Centre for Data Analytics and VistaMilk SFI Centre. Dr. Ifrim holds a PhD and MSc in Machine Learning, from Max-Planck Institute for Informatics, Germany, and a BSc in Computer Science, from University of Bucharest, Romania. Her research focuses on effective approaches for large-scale sequence learning, time series classification, and text mining. She has published more than 50 peer-reviewed articles in top-ranked international journals and conferences and regularly holds senior positions in the program committees for IJCAI, AAAI, and ECML-PKDD, as well as being a member of the editorial board of the Machine Learning Journal, Springer.
Abstract: A narrative is a conceptual basis of collective human understanding. Humans use stories to represent characters' intentions, feelings and the attributes of objects and events. A widely-held thesis in psychology to justify the centrality of narrative in human life is that humans make sense of reality by structuring events into narratives. Therefore, narratives are central to human activity in cultural, scientific, and social areas. Story maps are computer science realizations of narratives based on maps. They are online interactive maps enriched with text, pictures, videos, and other multimedia information, whose aim is to tell a story over a territory. This talk presents a semi-automatic workflow that, using a CRM-based ontology and Semantic Web technologies, produces semantic narratives in the form of story maps (and timelines as an alternative representation) from textual documents. An expert user first assembles one territory-contextual document containing text and images. Then, automatic processes use natural language processing and Wikidata services to (i) extract entities and geospatial points of interest associated with the territory, (ii) assemble a logically-ordered sequence of events that constitute the narrative, enriched with entities and images, and (iii) openly publish online semantic story maps and an interoperable Linked Open Data-compliant knowledge base for event exploration and inter-story correlation analyses. Once the story maps are published, users can review them through a user-friendly web tool. Overall, our workflow complies with Open Science directives of open publication and multidisciplinary support and is appropriate to convey "information going beyond the map" to scientists and the general public.
As demonstrations, the talk will show workflow-produced story maps to represent (i) 23 European rural areas across 16 countries, their value chains and territories, (ii) a Medieval journey, (iii) the history of the legends, biological investigations, and AI-based modelling for habitat discovery of the giant squid Architeuthis dux.
Bio: Valentina Bartalesi Lenzi is a researcher at the CNR-ISTI and external professor of Semantic Web in the Computer Science master's degree course at the University of Pisa. She earned her PhD in Information Engineering from the University of Pisa and graduated in Digital Humanities from the University of Pisa. Her research fields mainly concern Knowledge Representation, Semantic Web technologies, and the development of formal ontologies for representing textual content and narratives. She has participated in several European and National research projects, including MINGEI, PARTHENOS, E-RIHS PP, IMAGO. She is the author of over 50 peer-reviewed articles in national and international conferences and scientific journals.
Displaying agenda in event timezone (Dublin local time).
| Time | Item | Presenters | Mode | Materials |
|---|---|---|---|---|
| 09h00 - 09h10 | Introduction | Ricardo Campos | in-person | slides |
| | Session Chair: Adam Jatowt | | | |
| 09h10 - 09h50 | Keynote 1: Structured Summarisation of News at Scale | Georgiana Ifrim, University College Dublin | in-person | |
| 09h50 - 10h10 | Multilingual Analysis of YouTube's Recommendation System: Examining Topic and Emotion Drift in the 'Cheng Ho' Narrative | Ugochukwu Onyepunuka, Mustafa Alassad, Lotenna Nwana and Nitin Agarwal | in-person | |
| 10h10 - 10h30 | NewsLines: Narrative Visualization of News Stories | Mariana Costa and Sérgio Nunes | remote | video |
| 10h30 - 11h00 | Coffee Break | | | |
| | Session Chair: Hugo Sousa | | | |
| 11h00 - 11h20 | Annotation and visualisation of reporting events in textual narratives | Purificação Silvano, Evelin Amorim, António Leal, Inês Cantante, Silva Fátima, Alípio Jorge, Ricardo Campos and Sérgio Nunes | in-person | slides |
| 11h20 - 11h40 | Segmenting Narrative Synopses into Spans for Different Event Reporting Modes | Pablo Gervás | in-person | |
| 11h40 - 12h00 | Demo Pitch | | | |
| | Integration of a Semantic Storytelling Recommender System in Speech Assistants | Maria Gonzalez Garcia, Julian Moreno Schneider, Malte Ostendorff and Georg Rehm | in-person | slides |
| | Extracting Imprecise Geographical and Temporal References from Journey Narratives | Ignatius Ezeani, Paul Rayson and Ian Gregory | in-person | |
| | The Funhouse Mirror Has Two Sides: Visual Storification of Debates with Comics | Tony Veale | in-person | |
| | Comprehensive Terms Board Visualization for News Analysis and Editorial Story Planning | Ishrat Sami, Tony Russell-Rose and Larisa Soldatova | remote | slides, video |
| | A Web Tool to Create and Visualise Semantic Story Maps | Valentina Bartalesi, Emanuele Lenzi and Nicolò Pratelli | in-person | slides |
| 12h00 - 12h30 | Demo Session | | | |
| 12h30 - 13h30 | Lunch Break | | | |
| | Session Chair: Alípio Jorge | | | |
| 13h30 - 14h10 | Keynote 2: Creating and Visualising Semantic Story Maps | Valentina Bartalesi, ISTI-CNR | in-person | slides |
| 14h10 - 14h25 | On the Definition of Prescriptive Annotation Guidelines for Language-Agnostic Subjectivity Detection | Federico Ruggeri, Francesco Antici, Andrea Galassi, Katerina Korre, Ariann Muti and Alberto Barrón-Cedeño | remote | slides, video |
| 14h25 - 14h40 | Edge Labelling in Narrative Knowledge Graphs | Vani Kanjirangat and Alessandro Antonucci | remote | slides |
| 14h40 - 15h00 | End-to-End Temporal Relation Extraction in the Clinical Domain | José Javier Saiz and Begoña Altuna | in-person | slides |
| 15h00 - 15h30 | Coffee Break | | | |
| | Session Chair: Purificação Silvano | | | |
| 15h30 - 15h50 | Cross-lingual transfer learning for detecting negative campaign in Israeli municipal elections: a case study | Marina Litvak, Natalia Vanetik and Lin Miao | in-person | video |
| 15h50 - 16h10 | The Same Thing - Only Different: Classification of Movies by their Story Types | Chang Liu, Armin Shmilovici and Mark Last | in-person | slides |
| | Session Chair: Marina Litvak | | | |
| 16h10 - 16h30 | ScANT: A Small Corpus of Scene-Annotated Narrative Texts | Tarfah Alrashid and Robert Gaizauskas | remote | |
| 16h30 - 16h45 | A cognitive theoretical approach of rhetorical news analysis | Ishrat Sami, Tony Russell-Rose and Larisa Soldatova | remote | slides, video |
| | Session Chair: Jochen Leidner | | | |
| 16h45 - 17h00 | Modelling Interestingness: stories as L-Systems and Magic Squares | Cosimo Palma | in-person | slides, video |
| 17h00 - 17h20 | On the Readability of Misinformation in Comparison to the Truth | Mohammadali Tavakoli, Harith Alani and Gregoire Burel | remote | |
| 17h20 - 17h40 | Multi-label Infectious Disease News Event Corpus | Jakub Piskorski, Nicolas Stefanovitch, Brian Doherty, Jens Linge, Sopho Karazi, Jas Mantero, Guillaume Jacquet, Alessio Spadaro and Giulia Teodori | in-person | slides |
| 17h40 - 18h00 | Best Paper and Reviewers Award | Ricardo Campos, Alípio Jorge, Adam Jatowt, Marina Litvak | in-person | |
Text2Story 2023 will be held at the 45th European Conference on Information Retrieval (ECIR'23) in Dublin, Ireland.
Be aware that power plug sockets in Ireland are of type G (of British origin). You may need to consider bringing or buying an adaptor.
Registration at ECIR 2023 is required to attend the workshop (don't forget to select the Text2Story workshop).
This project is financed by the ERDF - European Regional Development Fund through the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 and by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia within project PTDC/CCI-COM/31857/2017 (NORTE-01-0145-FEDER-03185)