April 2nd, 2023 - Dublin, Ireland

Text2Story 2023

Sixth International Workshop on Narrative Extraction from Texts
held in conjunction with the 45th European Conference on Information Retrieval

Call for papers

Overview

In recent years, the continuous stream of evolving information has made it unmanageable and time-consuming for an interested reader to track and keep up with all the essential information and the various aspects of a story. Automated narrative extraction from text offers a compelling approach to this problem. It involves identifying the subset of interconnected raw documents, extracting the critical narrative story elements, and representing them in an adequate final form (e.g., timelines) that conveys the key points of the story in an easy-to-understand format. Although information extraction and natural language processing have made significant progress towards the automatic interpretation of texts, the automated identification and analysis of the different elements of a narrative present in a document (or document set) still poses significant unsolved challenges.

Call for papers

In the sixth edition of the Text2Story workshop, we aim to bring to the forefront the challenges involved in understanding the structure of narratives and in incorporating their representation in well-established models, as well as in modern architectures (e.g., transformers) which are now common and form the backbone of almost every IR and NLP application. It is hoped that the workshop will provide a common forum to consolidate multi-disciplinary efforts and foster discussions that identify the wide-ranging issues related to the narrative extraction task. In this regard, we encourage the submission of high-quality, original work covering the following topics:

Narrative Representation Models
Story Evolution and Shift Detection
Temporal Relation Identification
Temporal Reasoning and Ordering of Events
Causal Relation Extraction and Arrangement
Narrative Summarization
Multi-modal Summarization
Automatic Timeline Generation
Storyline Visualization
Comprehension of Generated Narratives and Timelines
Big Data Applied to Narrative Extraction
Personalization and Recommendation of Narratives
User Profiling and User Behavior Modeling
Sentiment and Opinion Detection in Texts
Argumentation Analysis
Bias Detection and Removal in Generated Stories
Ethical and Fair Narrative Generation
Misinformation and Fact Checking
Bots Influence
Narrative-focused Search in Text Collections
Event and Entity Importance Estimation in Narratives
Multilinguality: Multilingual and Cross-lingual Narrative Analysis
Evaluation Methodologies for Narrative Extraction
Resources and Dataset Showcase
Dataset Annotation for Narrative Generation/Analysis
Applications in Social Media (e.g. narrative generation during a natural disaster)
Language Models and Transfer Learning in Narrative Analysis
Narrative Analysis in Low-resource Languages
Text Simplification

tls-covid19 Dataset

We challenge interested researchers to consider submitting a paper that makes use of the tls-covid19 dataset, published at ECIR'21, within the scope and purposes of the Text2Story workshop. tls-covid19 consists of a number of curated topics related to the COVID-19 outbreak, with associated news articles from Portuguese and English news outlets and their respective reference timelines as gold standard. While it was designed to support timeline summarization research tasks, it can also be used for other tasks, including the study of news coverage of the COVID-19 pandemic.

Important Dates

February 6th, 2023 - Submission Deadline
March 3rd, 2023 - Acceptance Notification
March 17th, 2023 - Camera-ready copies
April 2nd, 2023 - Workshop

Submissions

We invite two kinds of submissions:

Full Papers

up to 7 pages + references

Original and high-quality unpublished contributions to the theory and practical aspects of the narrative extraction task. Full papers should introduce existing approaches and describe in detail the methodology and the experiments conducted. Negative-result papers that highlight tested hypotheses which did not yield the expected outcome are also welcome.

Work in Progress | Demos | Dissemination Papers

up to 4 pages + references

Unpublished short papers describing work in progress; demo and resource papers presenting research/industrial prototypes, datasets or software packages; position papers introducing a new point of view, a research vision or a reasoned opinion on the workshop topics; and dissemination papers describing project ideas, ongoing research lines, case studies or summarized versions of papers previously published in high-quality conferences/journals that are worth sharing with the Text2Story community, but where novelty is not a fundamental issue.

Papers must be submitted electronically in PDF format through EasyChair. All submissions must be in English and formatted according to the one-column CEUR-ART style with no page numbers. Templates, in either Word or LaTeX, can be found in the following zip folder. There is also an Overleaf page for LaTeX users.

IMPORTANT: Please include between brackets the type of submission (full; negative results; work in progress; demo and resource; position; dissemination) in the paper title.

Papers submitted to Text2Story 2023 should be original work and different from papers that have been previously published, accepted for publication, or that are under review at other venues. The only exception to this rule is "dissemination papers". Pre-prints submitted to arXiv are eligible.

Submissions will be peer-reviewed by at least two members of the programme committee. The accepted papers will appear in the proceedings published at CEUR workshop proceedings (indexed in Scopus and DBLP) as long as they don't conflict with previous publication rights.

Workshop Format

Authors of accepted papers will be given 15 minutes for oral presentations.

Organization

Organizing Committee

Ricardo Campos (INESC TEC; Ci2-Smart Cities Research Center - Polytechnic Institute of Tomar, Tomar, Portugal)
Alípio M. Jorge (INESC TEC; University of Porto, Portugal)
Adam Jatowt (University of Innsbruck, Austria)
Sumit Bhatia (Adobe Media and Data Science Research Lab, India)
Marina Litvak (Shamoon Academic College of Engineering, Israel)

Program Committee

Álvaro Figueira (INESC TEC & University of Porto)
Andreas Spitz (University of Konstanz)
Antoine Doucet (Université de La Rochelle)
António Horta Branco (University of Lisbon)
Anubhav Jangra (IIT Patna, India)
Arian Pasquali (Faktion AI)
Bart Gajderowicz (University of Toronto)
Begoña Altuna (Universidad del País Vasco)
Behrooz Mansouri (Rochester Institute of Technology)
Brenda Santana (Federal University of Rio Grande do Sul)
Bruno Martins (IST & INESC-ID, University of Lisbon)
Daniel Loureiro (Cardiff University)
Dennis Aumiller (Heidelberg University)
Dhruv Gupta (Norwegian University of Science and Technology)
Dyaa Albakour (Signal UK)
Evelin Amorim (INESC TEC)
Henrique Lopes Cardoso (LIACC & University of Porto)
Hugo Sousa (INESC TEC & University of Porto)
Ismail Altingovde (Middle East Technical University)
Irina Rabaev (Shamoon College of Engineering)
Jiexin Wang (South China University of Technology, China)
João Paulo Cordeiro (INESC TEC & University of Beira Interior)
Kiran Bandeli (Walmart Inc.)
Liana Ermakova (HCTI, Université de Bretagne Occidentale)
Luca Cagliero (Politecnico di Torino)
Ludovic Moncla (INSA Lyon)
Luis Filipe Cunha (INESC TEC & University of Minho)
Marc Finlayson (Florida International University)
Marc Spaniol (Université de Caen Normandie)
Moreno La Quatra (Politecnico di Torino)
Natalia Vanetik (Shamoon College of Engineering)
Nuno Guimarães (INESC TEC & University of Porto)
Pablo Gamallo (University of Santiago de Compostela)
Pablo Gervás (Universidad Complutense de Madrid)
Paulo Quaresma (Universidade de Évora)
Paul Rayson (Lancaster University)
Purificação Silvano (CLUP & University of Porto)
Raghav Jain (Indian Institute of Technology, Patna)
Ross Purves (University of Zurich)
Satya Almasian (Heidelberg University)
Sérgio Nunes (INESC TEC & University of Porto)
Simra Shahid (Adobe's Media and Data Science Research Lab)
Sriharsh Bhyravajjula (University of Washington)
Udo Kruschwitz (University of Regensburg)
Valentina Bartalesi (ISTI-CNR, Italy)
Veysel Kocaman (John Snow Labs & Leiden University)
Wenzhi Cao (University of Wisconsin, USA)
Yang Zhang (Southwestern University of Finance and Economics, China)

Proceedings Chair

João Paulo Cordeiro (INESC TEC & Universidade da Beira Interior)
Conceição Rocha (INESC TEC)

Web and Dissemination Chair

Hugo Sousa (INESC TEC & University of Porto)
Behrooz Mansouri (Rochester Institute of Technology)

Invited Speakers

Structured Summarisation of News at Scale

Speaker: Georgiana Ifrim, University College Dublin, Ireland

Abstract: Facilitating news consumption at scale is still quite challenging. Some research effort has focused on coming up with useful structures for facilitating news navigation for humans, but benchmarks and objective evaluation of such structures are not common. One area that has progressed recently is news timeline summarisation. In this talk, we present some of our work on long-range large-scale news timeline summarisation. Timelines present the most important events of a topic linearly in chronological order and are commonly used by news editors to organise long-ranging topics for news consumers. Tools for automatic timeline summarisation can address the cost of manual effort and the infeasibility of manually covering many topics, over long time periods and massive news corpora. In this talk, we first compare different high-level approaches to timeline summarisation, identify the modules and features important for this task, and present new state-of-the-art results with a simple new method. We provide several examples of automatic timelines and present both a quantitative and qualitative analysis of these structured news summaries. Most of our tools and datasets are available online on GitHub.

Bio: Dr. Georgiana Ifrim is an Associate Professor at the School of Computer Science, UCD, co-lead of the SFI Centre for Research Training in Machine Learning (ML-Labs) and SFI Funded Investigator with the Insight Centre for Data Analytics and VistaMilk SFI Centre. Dr. Ifrim holds a PhD and MSc in Machine Learning, from Max-Planck Institute for Informatics, Germany, and a BSc in Computer Science, from University of Bucharest, Romania. Her research focuses on effective approaches for large-scale sequence learning, time series classification, and text mining. She has published more than 50 peer-reviewed articles in top-ranked international journals and conferences and regularly holds senior positions in the program committees for IJCAI, AAAI, and ECML-PKDD, as well as being a member of the editorial board of the Machine Learning Journal, Springer.

Creating and Visualising Semantic Story Maps

Speaker: Valentina Bartalesi, CNR-ISTI, Italy

Abstract: A narrative is a conceptual basis of collective human understanding. Humans use stories to represent characters' intentions, feelings and the attributes of objects, and events. A widely-held thesis in psychology to justify the centrality of narrative in human life is that humans make sense of reality by structuring events into narratives. Therefore, narratives are central to human activity in cultural, scientific, and social areas. Story maps are computer science realizations of narratives based on maps. They are online interactive maps enriched with text, pictures, videos, and other multimedia information, whose aim is to tell a story over a territory. This talk presents a semi-automatic workflow that, using a CRM-based ontology and Semantic Web technologies, produces semantic narratives in the form of story maps (and timelines as an alternative representation) from textual documents. An expert user first assembles one territory-contextual document containing text and images. Then, automatic processes use natural language processing and Wikidata services to (i) extract entities and geospatial points of interest associated with the territory, (ii) assemble a logically-ordered sequence of events that constitute the narrative, enriched with entities and images, and (iii) openly publish online semantic story maps and an interoperable Linked Open Data-compliant knowledge base for event exploration and inter-story correlation analyses. Once the story maps are published, the users can review them through a user-friendly web tool. Overall, our workflow complies with Open Science directives of open publication and multi-discipline support and is appropriate to convey "information going beyond the map" to scientists and the general public.
As demonstrations, the talk will show workflow-produced story maps to represent (i) 23 European rural areas across 16 countries, their value chains and territories, (ii) a Medieval journey, (iii) the history of the legends, biological investigations, and AI-based modelling for habitat discovery of the giant squid Architeuthis dux.

Bio: Valentina Bartalesi Lenzi is a researcher at the CNR-ISTI and external professor of Semantic Web in the Computer Science master's degree course at the University of Pisa. She earned her PhD in Information Engineering from the University of Pisa and graduated in Digital Humanities from the University of Pisa. Her research fields mainly concern Knowledge Representation, Semantic Web technologies, and the development of formal ontologies for representing textual content and narratives. She has participated in several European and National research projects, including MINGEI, PARTHENOS, E-RIHS PP, IMAGO. She is the author of over 50 peer-reviewed articles in national and international conferences and scientific journals.

Programme

Displaying agenda in event timezone (Dublin local time).



09h00 - 09h10	Introduction (Ricardo Campos) in-person \| slides

Session Chair: Adam Jatowt
09h10 - 09h50	Keynote 1: Structured Summarisation of News at Scale (Georgiana Ifrim, University College Dublin) in-person
09h50 - 10h10	Multilingual Analysis of YouTube's Recommendation System: Examining Topic and Emotion Drift in the 'Cheng Ho' Narrative (Ugochukwu Onyepunuka, Mustafa Alassad, Lotenna Nwana and Nitin Agarwal) in-person
10h10 - 10h30	NewsLines: Narrative Visualization of News Stories (Mariana Costa and Sérgio Nunes) remote \| video

10h30 - 11h00	Coffee Break

Session Chair: Hugo Sousa
11h00 - 11h20	Annotation and visualisation of reporting events in textual narratives (Purificação Silvano, Evelin Amorim, António Leal, Inês Cantante, Fátima Silva, Alípio Jorge, Ricardo Campos and Sérgio Nunes) in-person \| slides
11h20 - 11h40	Segmenting Narrative Synopses into Spans for Different Event Reporting Modes (Pablo Gervás) in-person
11h40 - 12h00	Demo Pitch
	Integration of a Semantic Storytelling Recommender System in Speech Assistants (Maria Gonzalez Garcia, Julian Moreno Schneider, Malte Ostendorff and Georg Rehm) in-person \| slides
	Extracting Imprecise Geographical and Temporal References from Journey Narratives (Ignatius Ezeani, Paul Rayson and Ian Gregory) in-person
	The Funhouse Mirror Has Two Sides: Visual Storification of Debates with Comics (Tony Veale) in-person
	Comprehensive Terms Board Visualization for News Analysis and Editorial Story Planning (Ishrat Sami, Tony Russell-Rose and Larisa Soldatova) remote \| slides \| video
	A Web Tool to Create and Visualise Semantic Story Maps (Valentina Bartalesi, Emanuele Lenzi and Nicolò Pratelli) in-person \| slides

12h00 - 12h30	Demo Session

12h30 - 13h30	Lunch Break

Session Chair: Alípio Jorge
13h30 - 14h10	Keynote 2: Creating and Visualising Semantic Story Maps (Valentina Bartalesi, ISTI-CNR) in-person \| slides
14h10 - 14h25	On the Definition of Prescriptive Annotation Guidelines for Language-Agnostic Subjectivity Detection (Federico Ruggeri, Francesco Antici, Andrea Galassi, Katerina Korre, Arianna Muti and Alberto Barrón-Cedeño) remote \| slides \| video
14h25 - 14h40	Edge Labelling in Narrative Knowledge Graphs (Vani Kanjirangat and Alessandro Antonucci) remote \| slides
14h40 - 15h00	End-to-End Temporal Relation Extraction in the Clinical Domain (José Javier Saiz and Begoña Altuna) in-person \| slides

15h00 - 15h30	Coffee Break

Session Chair: Purificação Silvano
15h30 - 15h50	Cross-lingual transfer learning for detecting negative campaign in Israeli municipal elections: a case study (Marina Litvak, Natalia Vanetik and Lin Miao) in-person \| video
15h50 - 16h10	The Same Thing - Only Different: Classification of Movies by their Story Types (Chang Liu, Armin Shmilovici and Mark Last) in-person \| slides

Session Chair: Marina Litvak
16h10 - 16h30	ScANT: A Small Corpus of Scene-Annotated Narrative Texts (Tarfah Alrashid and Robert Gaizauskas) remote
16h30 - 16h45	A cognitive theoretical approach of rhetorical news analysis (Ishrat Sami, Tony Russell-Rose and Larisa Soldatova) remote \| slides \| video

Session Chair: Jochen Leidner
16h45 - 17h00	Modelling Interestingness: stories as L-Systems and Magic Squares (Cosimo Palma) in-person \| slides \| video
17h00 - 17h20	On the Readability of Misinformation in Comparison to the Truth (Mohammadali Tavakoli, Harith Alani and Gregoire Burel) remote
17h20 - 17h40	Multi-label Infectious Disease News Event Corpus (Jakub Piskorski, Nicolas Stefanovitch, Brian Doherty, Jens Linge, Sopho Karazi, Jas Mantero, Guillaume Jacquet, Alessio Spadaro and Giulia Teodori) in-person \| slides

17h40 - 18h00	Best Paper and Reviewers Award (Ricardo Campos, Alípio Jorge, Adam Jatowt, Marina Litvak) in-person

Attending

Text2Story 2023 will be held at the 45th European Conference on Information Retrieval (ECIR'23) in Dublin, Ireland.

Be aware that power plug sockets in Ireland are of type G (of British origin). You may need to consider bringing or buying an adaptor.

Registration at ECIR 2023 is required to attend the workshop (don't forget to select the Text2Story workshop).

Acknowledgements

This project is financed by the ERDF - European Regional Development Fund through the North Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 and by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia within project PTDC/CCI-COM/31857/2017 (NORTE-01-0145-FEDER-03185).