Join Workshop on Computational Terminology

CompuTerm 2020
LREC 2020 (Marseille, France)
Sunday, 16th May 2020
Marseille, France
https://sites.google.com/view/computerm2020
Computational Terminology covers an increasingly important aspect
of a range of areas in Natural Language Processing, such as text
mining, information retrieval, information extraction,
summarization, textual entailment, document management systems,
question-answering systems, ontology building, and machine
translation. Terminological information is paramount for knowledge
mining from texts, including bilingual texts, for scientific
discovery and competitive intelligence. Scientific needs in
fast-growing domains (such as biology, medicine, chemistry and
ecology) and the overwhelming amount of textual data published
daily demand that terminology be acquired and managed
systematically and automatically, while in well-established domains
(such as law, economy, banking and music) the demand is for
fine-grained analyses of documents for knowledge description and
acquisition. For all specialized domains, multilingual terminology
is increasingly mandatory.
Four years have passed since the last Computerm workshop, held at
Coling 2016. During this period, deep learning and neural methods
have become the state of the art for most NLP applications,
reaching higher performance on various tasks. This workshop aims to
investigate what deep learning has brought to computational
terminology and its traditional topics, its impact on human
applications, and the new questions it raises within the scope of
terminology.
The aim of this sixth Computerm workshop is to bring together
Natural Language Processing and Human Language Technology
researchers as well as terminology researchers and practitioners to
discuss recent advances in computational terminology and its impact
on automatic and human applications. We will also host a special
session for the TermEval shared task, which uses the large,
manually annotated ACTER dataset (Annotated Corpora for Term
Extraction Research), covering multiple domains and languages.
For the general session, we call for submissions in the following
areas, though the list does not limit the range of topics:
* term extraction
* event recognition and extraction
* acquisition of semantic relations among terms
* distributional semantic analysis
* term variation management
* definition and terminological context extraction
* consideration of the user expertise
* monolingual and multilingual terminological resources
* robustness and portability of statistical methods, including neural methods
* detection of unfortunate terminological artefacts
* social networks and modern media processing
* utilization of terminologies in various NLP applications
* evaluation of terminological methods and tools
* terminology diversity according to geographical area, layman/academic, gender
The workshop is open to submissions using different approaches,
ranging from term extraction in various languages (using verb
co-occurrence, information-theoretic approaches, machine learning,
etc.) and the extraction of translation pairs from bilingual
corpora based on terminology, up to semantics-oriented approaches
and theoretical aspects of terminology.
Computerm 2020 will host the TermEval shared task on monolingual
term extraction using the ACTER dataset. This dataset contains over
100k manual annotations in comparable corpora in three different
languages (English, French, and Dutch) and four different domains
(corruption, dressage, heart failure, and wind energy). Participants
in the shared task can enter for one or multiple languages and will
get access to the annotated data in three of the domains, while the
heart failure domain will be provided at a later stage for
evaluation. Participants can choose from different tracks and will
be ranked based on the F1-scores of their lists of automatically
extracted terms on the evaluation corpus. Apart from the scores,
there will also be more in-depth evaluations of how the tools
handle difficulties, e.g. infrequent terms, single-word vs.
multiword terms, etc. All information concerning the shared task is
available at http://termeval.ugent.be
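As an illustration of the ranking criterion described above, the
sketch below computes precision, recall and F1 for a list of
extracted terms against a gold-standard list. It assumes exact
matching of unique, lowercased term strings; the function name
`term_f1` and the sample terms are hypothetical, and the official
TermEval scoring may differ in its details (see the shared task
website).

```python
def term_f1(extracted, gold):
    """Return (precision, recall, F1) for two collections of terms."""
    # Assumption: terms are compared as unique, lowercased strings,
    # so duplicates and case differences are ignored.
    extracted_set = {term.strip().lower() for term in extracted}
    gold_set = {term.strip().lower() for term in gold}

    # Terms that appear in both the system output and the gold standard.
    true_positives = len(extracted_set & gold_set)

    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: 2 of 3 extracted terms are in the gold list,
# and 2 of 4 gold terms were found.
system_terms = ["heart failure", "ejection fraction", "patient"]
gold_terms = ["heart failure", "ejection fraction", "cardiomyopathy", "diuretic"]
p, r, f1 = term_f1(system_terms, gold_terms)
print(f"{p:.3f} {r:.3f} {f1:.3f}")  # prints "0.667 0.500 0.571"
```

Because the score is computed on the list of unique terms rather than
on individual occurrences, an infrequent term counts as much as a
frequent one under this assumption.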
Authors may submit system description papers to CompuTerm 2020,
indicating that they are for the TermEval shared task.
PROGRAM COMMITTEE CHAIRS
General:
Béatrice Daille, LS2N, University of Nantes, France
Kyo Kageura, Library and Information Science Laboratory,
University of Tokyo, Japan
Ayla Rigouts Terryn, LT3 Language and Translation Technology
Team, Ghent University, Belgium
TermEval shared task:
Patrick Drouin, OLST Observatoire de Linguistique Sens-Texte,
Université de Montréal, Canada
Els Lefever, LT3 Language and Translation Technology Team,
Ghent University, Belgium
Ayla Rigouts Terryn, LT3 Language and Translation Technology
Team, Ghent University, Belgium
Véronique Hoste, Ghent University, Belgium
Important dates:
- 1st workshop CFP: 9th December 2019
- Paper due date: 20th February 2020
- Notification of acceptance: 13th March 2020
- Camera-ready deadline: 25th March 2020
- Workshop: Sunday, 16th May 2020
Submission Instructions
Submissions must be written in English, anonymized for review, and
use the Word or LaTeX template files provided by LREC 2020
(https://lrec2020.lrecconf.org/en/submission2020/authors-kit/).
- Long paper submission: up to 8 pages of content, plus 2 pages for
references; final versions of long papers may use one additional
page: up to 9 pages, with unlimited pages for references
- Short paper submission: up to 4 pages of content, plus 2 pages for
references; final versions of short papers: up to 5 pages, with
unlimited pages for references
PDF files must be submitted electronically via the START submission
system, which will be available soon.
CONTACT
For any inquiries regarding the workshop, please send an email to:
- General session: beatrice.daille@univ-nantes.fr
- TermEval shared task: ayla.rigoutsterryn@ugent.be