[CfP] CL-SciSumm @SIGIR 2019: 5th Computational Linguistics Scientific Document Summarization Shared Task

You are invited to participate in the 5th Computational Linguistics (CL) Scientific Summarization Shared Task (http://wing.comp.nus.edu.sg/~cl-scisumm2019), sponsored by SRI International and the Chan-Zuckerberg Initiative (CZI), to be held as part of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019) in Paris, France on 25th July 2019. This task follows up on the successful CL-SciSumm 2018 @ SIGIR 2018 and three previous editions. In this task, a training corpus of CL research papers is released. Participants are invited to enter their systems in a task-based evaluation on a blind test set.
The CL-SciSumm 2019 corpus is expected to be of interest to a broad community, including those working in computational linguistics and natural language processing, text summarization, discourse structure in scholarly discourse, paraphrase, textual entailment and text simplification. The task is automatic scientific paper summarization in the Computational Linguistics (CL) domain. The output summaries will be of two types: faceted summaries of the traditional self-summary (the abstract) and of the community summary (the collection of citation sentences, or ‘citances’). We also propose to group the citances by the facets of the text that they refer to.
This is the 5th edition of the Shared Task and the 3rd at SIGIR, following the 2nd edition at JCDL ’16 and a successful pilot task at TAC ’14 at NIST, USA.
=== Important Dates ===
Training Set Release: Already Online
Deadline for Registration and Short System Descriptions: March 30, 2019
Test Set Release: Already Online
System Runs Due: May 24, 2019
Preliminary System Reports Due: June 9, 2019
Camera Ready Contributions: July 7, 2019
Participants present at the BIRNDL 2019 workshop: July 25, 2019 in Paris, France
=== The CL-SciSumm Corpus ===
The CL-SciSumm corpus is created by randomly sampling documents from the ACL Anthology corpus and selecting their citing papers. Citing papers may include papers from outside the Anthology.
The manually annotated training set of 40 reference papers and their citing papers is already available for download and can be used by participants to pilot their systems. In addition, this year we have introduced 1,000 document sets that were automatically annotated for use as training data, generated following Nomoto (2018). For Task 2, 1,000 summaries released as part of SciSummNet (Yasunaga et al., 2019) have been included as human summaries to train on.
The training set is available for download on GitHub (https://github.com/WING-NUS/scisumm-corpus).
The test set of 20 articles (input documents with blind ground truth) is available at https://github.com/WING-NUS/scisumm-corpus/tree/master/data/Test-Set-2018.
The system outputs from the test set should be submitted to the task organizers, for the collation of the final results to be presented at the workshop.
=== Registration ===
Teams that wish to participate in the CL Shared Task track at BIRNDL 2019 are invited to register on EasyChair (https://easychair.org/conferences/?conf=clscisumm2019) with a title and a tentative abstract describing their approach.
Participants are advised to register as soon as possible in order to receive timely access to evaluation resources, including development and testing data. Registration does not commit you to participation, but it helps the organizers plan. All participants who submit system runs are welcome to present their system in the poster session at the BIRNDL Workshop, while the team with the best-performing system will be invited to present its paper in the main session. Dissemination of CL-SciSumm work and results other than in the workshop proceedings is welcomed, but the conditions of participation specifically preclude any advertising claims based on these results. Any questions about conference participation may be sent to the organizers listed below.
=== Organising Committee ===
- Muthu Kumar Chandrasekaran, SRI International, (https://www.linkedin.com/in/muthukumarc87/)
- Michihiro Yasunaga, Yale, (https://michiyasunaga.github.io/)
- Dayne Freitag, SRI International, (https://www.sri.com/about/people/dayne-freitag)
- Dragomir Radev, Yale, (http://www.cs.yale.edu/~radev)
- Kokil Jaidka, Nanyang Technological University, (http://kokiljaidka.wordpress.com/)
- Min-Yen Kan, National University of Singapore, (https://www.comp.nus.edu.sg/~kanmy/)