Learning Semantic Similarities for the Financial Domain

[WWW 2021] FinSIM-2 Shared Task 2021


Greetings,
We would like to invite you to submit to FinSIM-2, the 2nd shared task on Learning Semantic Similarities for the Financial Domain, in conjunction with The Web Conference 2021, April 19-23, 2021, Ljubljana, Slovenia!
It will be held at The Web Conference 2021 as part of the FinWeb-2021 workshop.
Shared Task URL: https://sites.google.com/nlg.csie.ntu.edu.tw/finweb2021/shared-task-finsim-2
Workshop URL: https://sites.google.com/nlg.csie.ntu.edu.tw/finweb2021
Registration Form: https://forms.gle/oJetvLQfpPNLuSqJ6
_____________________________________________
The FinSim 2021 shared task aims to spark interest from communities in NLP, ML/AI, Knowledge Engineering and Financial document processing. Going beyond the mere representation of words is a key step to industrial applications that make use of Natural Language Processing (NLP). This is typically addressed using either 1) unsupervised corpus-derived representations such as word embeddings, which are typically opaque to human understanding but very useful in NLP applications, or 2) manually tagged resources such as corpora, lexica, taxonomies and ontologies, which typically have low coverage and contain inconsistencies but provide a deeper understanding of the target domain.
These two methods form the two ends of a spectrum which a number of approaches have attempted to combine, particularly in tasks aiming at expanding the coverage of manual resources using automatic methods.
The SemEval community has organized several evaluation campaigns to stimulate the development of methods which extract semantic/lexical relations between concepts/words (Bordea et al. 2015, Bordea et al. 2016, Jurgens and Pilehvar 2016, Camacho-Collados et al. 2018).
There are also a large number of datasets and challenges that specifically look at how to automatically populate knowledge bases such as DBpedia or Wikidata (e.g. KBP challenges).
To the best of our knowledge, FinSim 2020 was the first task to attempt to combine these methods for the Financial domain.
The second edition, FinSim-2, focuses on the evaluation of semantic representations by assessing how well systems automatically classify terms from the Financial domain against a domain ontology. Participants will be given a list of carefully selected terms from the Financial domain, such as “European depositary receipt” and “Interest rate swaps”, and will be asked to design a system which can automatically classify them into the most relevant hypernym (or top-level) concept in an external ontology. For example, given the set of concepts “Bonds”, “Unclassified”, “Share”, “Loan”, the most relevant hypernym of “European depositary receipt” is “Share”.
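To make the task format concrete, here is a minimal sketch of a term-to-hypernym classifier. It is not the organizers' baseline: it swaps in character trigram counts as a crude stand-in for word embeddings, and the training pairs in TRAIN are hypothetical examples invented for illustration, not taken from the task data.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical (term, hypernym) pairs; the real training set is
# released by the organizers and will differ.
TRAIN = [
    ("Convertible bond", "Bonds"),
    ("Government bond", "Bonds"),
    ("American depositary receipt", "Share"),
    ("Preferred share", "Share"),
    ("Mortgage loan", "Loan"),
    ("Syndicated loan", "Loan"),
]


def build_profiles(pairs):
    """Sum the n-gram counts of all training terms under each hypernym."""
    profiles = {}
    for term, label in pairs:
        profiles.setdefault(label, Counter()).update(char_ngrams(term))
    return profiles


def classify(term, profiles):
    """Return the hypernym whose profile is most similar to the term."""
    vec = char_ngrams(term)
    return max(profiles, key=lambda label: cosine(vec, profiles[label]))


profiles = build_profiles(TRAIN)
prediction = classify("European depositary receipt", profiles)
print(prediction)
```

A competitive system would likely replace the trigram vectors with learned embeddings and draw on the ontology itself, but the input and output shape stays the same: one term in, one concept label out.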
This year, we propose an enriched dataset in terms of volume and quality. We are interested in systems which make creative use of relevant resources such as ontologies and lexica, as well as systems which make use of contextual word embeddings such as BERT (Devlin et al. 2018).
Participating systems are expected to provide for each given term the most relevant concept (hypernym/synonym) in an external ontology: the Financial Industry Business Ontology (FIBO). Performance will be measured according to the accuracy with which financial terms are classified, and according to recall (based on the total number of predictions).
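As an illustration of the accuracy measure mentioned above, the following sketch scores a list of predicted concepts against gold labels. The function name and the toy labels are assumptions made for this example; the official scoring scripts released by the organizers are the authoritative reference and may differ.

```python
def accuracy(gold, predicted):
    """Fraction of terms whose predicted concept matches the gold concept."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Hypothetical labels for four terms, for illustration only.
gold = ["Share", "Bonds", "Loan", "Share"]
predicted = ["Share", "Bonds", "Share", "Share"]

score = accuracy(gold, predicted)  # 3 of 4 predictions match
print(f"accuracy = {score:.2f}")
```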
This task is open to everyone. The only exceptions are the co-chairs of the organizing team, who cannot submit a system, and who will serve as an authority to resolve any disputes concerning ethical issues or completeness of system descriptions.
A USD$1000 prize will be awarded to the best-performing teams.
_____________________________________________
To register your interest in participating in the FinSim shared task, please use the following Google form no later than February 10th, 2021: https://forms.gle/oJetvLQfpPNLuSqJ6
__________________________________________
Important dates:
Dec 23, 2020: First announcement of the shared task and beginning of registration
Jan 08, 2021: Release of training set & scoring scripts.
Feb 02, 2021: Release of test set.
Feb 10, 2021: Registration deadline.
Feb 10, 2021: System Submission deadline.
Feb 15, 2021: Release of results.
Feb 19, 2021: Shared task title and abstract due
Feb 23, 2021: Shared task paper submissions due
Mar 01, 2021: Camera-ready version of shared task paper due
April 19-23, 2021: FinWeb 2021 Workshop (Ljubljana, Slovenia)
_________________________________________
Contact:
For any questions about the shared task, please contact us at:
fin.sim.task@gmail.com
______________________________________
Shared task organizers:
- Youness Mansar, Fortia Financial Solutions
- Ismail El Maarouf, Fortia Financial Solutions
- Juyeon Kang, Fortia Financial Solutions
____________________________________
References
Georgeta Bordea, Paul Buitelaar, Stefano Faralli and Roberto Navigli (2015). “SemEval-2015 Task 17: Taxonomy Extraction Evaluation (TExEval)”. In Proceedings of SemEval 2015, co-located with NAACL HLT 2015, Denver, CO, USA.
Georgeta Bordea, Els Lefever, and Paul Buitelaar (2016). “SemEval-2016 Task 13: Taxonomy Extraction Evaluation (TExEval-2)”. In Proceedings of the 10th International Workshop on Semantic Evaluation, San Diego, CA, USA.
Jose Camacho-Collados, Claudio Delli Bovi, Luis Espinosa-Anke, Sergio Oramas, Tommaso Pasini, Enrico Santus, Vered Shwartz, Roberto Navigli, and Horacio Saggion (2018). “SemEval-2018 Task 9: Hypernym Discovery”. In Proceedings of the 12th International Workshop on Semantic Evaluation (SemEval-2018), New Orleans, LA, United States. Association for Computational Linguistics.
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova (2018). “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”. https://arxiv.org/abs/1810.04805v2.
David Jurgens and Mohammad Taher Pilehvar (2016). “SemEval-2016 Task 14: Semantic Taxonomy Enrichment”. In Proceedings of SemEval-2016, NAACL-HLT.
The Financial Industry Business Ontology (FIBO): https://spec.edmcouncil.org/fibo/