The 4th International Workshop on Big Data Tools, Methods, and Use Cases for Innovative Scientific Discovery (BTSD) 2022

BTSD 2022

Call for Papers
Program Chairs
Sangkeun (Matt) Lee
Jong Youl Choi
Anika Tabassum
Organizers’ Background
Sangkeun (Matt) Lee received his Ph.D. degree in computer science and engineering from Seoul National University in 2012. He is currently an R&D Associate in the Computer Science and Mathematics Division at Oak Ridge National Laboratory. He studies big data, data science, and machine learning, and has applied state-of-the-art data analysis technologies in many application domains. He has developed several data analytics software packages, one of which, ORiGAMI, won the 2016 DOE R&D 100 Award. He has contributed to many leading computer science conferences and journals, such as ACM WWW, ACM RecSys, and Expert Systems with Applications. For the last few years, he has collaborated with scientists across various domains, including material science, nuclear science, and mechanical engineering, and has published papers in scientific journals such as Journal of Nuclear Materials, Acta Materialia, The Electricity Journal, and Advanced Theory and Simulations.
Jong Youl (Jong) Choi is a researcher in the Discrete Algorithms Group, Computer Science and Mathematics Division, Oak Ridge National Laboratory (ORNL), Oak Ridge, Tennessee, USA. He earned his Ph.D. degree in Computer Science from Indiana University Bloomington in 2012 and his MS degree in Computer Science from New York University in 2004. His research interests span data mining and machine learning algorithms, high-performance data-intensive computing, and parallel and distributed systems. More specifically, he focuses on researching and developing data-centric machine learning algorithms for large-scale data management, in situ/in-transit data processing, and data management for code coupling. He actively serves on conference committees and as a journal reviewer, for venues such as ParaMo, CCPE, and CLUS.
Anika Tabassum is currently a Postdoctoral Researcher at Oak Ridge National Laboratory, where she contributes to deep learning for multi-scale and multimodal battery analytics and plasma simulation for fusion energy. Her research focuses on developing deep learning models for robust scientific computing; specifically, she works on knowledge-guided ML and scientific ML. She received her Ph.D. from the Department of Computer Science at Virginia Tech, where she worked on applying knowledge-guided ML to multiple challenges in power system failures and clean energy. Her Ph.D. research was funded by an NSF Urban Computing fellowship. Apart from her primary research focus, she also worked on designing a COVID-19 forecasting model for the CDC challenge. She has published in multiple venues, such as ACM SIGKDD, AAAI, CIKM, IEEE BigData, and IAAI, and in journals such as ACM TIST and Elsevier journals. She completed her bachelor's degree in Computer Science and Engineering at the Bangladesh University of Engineering and Technology.
Introduction to Workshop
Advances in big data technology, artificial intelligence, and machine learning have produced many success stories in a wide range of areas, especially in industry. These success stories have motivated scientists in physics, chemistry, materials, medicine, and many other fields to explore new ways of using big data tools in their scientific activities.
However, there are barriers to overcome. Most existing big data tools, systems, and methodologies were developed without considering scientific purposes or scientists’ specific requirements. They were not designed for scientists who have little or no knowledge of programming or computer science. Computer scientists, in turn, often find it very challenging to understand the domain problem because they lack sufficient background knowledge.
We expect that big data technologies can play a major role in scientific innovation in many ways. Many ongoing scientific projects around the world already aim to discover novel hypotheses, analyze big multidimensional data that could not be handled manually, and use machines to reduce the time required by complex calculations. This workshop intends to bring domain scientists and computer scientists together to explore and extend opportunities in the development of big data tools, systems, and methodologies for scientific discovery, to share success stories and lessons learned, and to discuss challenges that, if overcome, would enable successful collaboration across different domains, especially between domain scientists and computer/data scientists.
In this workshop, we discuss the following questions:
What makes big data tools for scientists different from the existing tools?
What specific needs and challenges do domain scientists face when they try to adopt big data tools?
How can computer scientists and domain scientists communicate to define a feasible problem together?
What are the barriers to using big data for scientific discovery, and how do these barriers differ across science domains?
Workshop History
The International Workshop on Big Data Tools, Methods, and Use Cases for Innovative Scientific Discovery (BTSD) was first held in December 2019 in conjunction with the IEEE Big Data 2019 conference, organized by Matt Lee and Travis Johnston. A total of 12 papers were accepted. It was a great start toward building a strong scientific collaboration community. The second BTSD workshop was held in December 2020 as a virtual workshop in conjunction with IEEE Big Data 2020. A total of 11 papers were accepted and presented. The third BTSD workshop was held in December 2021 as a virtual workshop in conjunction with IEEE Big Data 2021. A total of 9 papers were accepted and presented. It was a great opportunity to communicate and to learn from experiences across many scientific domains.
Research Topics Included in the Workshop
Big data tools, systems, and methods related to, but not limited to:
Scientific data processing
Artificial intelligence/Deep neural networks/Machine learning
Text mining/Graph mining
Database/Query processing/Query Optimization
Parallel computation/High Performance Computing
Visualization/User Interface/HCI
Parallelization/Performance/Scalability
High Performance Computing …
that facilitate innovation and discovery in a scientific domain, such as:
Physics
Chemistry
Material science
Mechanical engineering
Nuclear engineering
Biomedical science …
Use cases, success stories, and lessons learned in scientific discovery using big data tools, systems, and methods
Program Committee Members
Youngjae Kim, Sogang University, South Korea
Feng Bao, Florida State University, USA
Supriya Chinthavali, Oak Ridge National Laboratory, USA
Guimu Guo, Rowan University, USA
Ramakrishnan Kannan, Oak Ridge National Laboratory, USA
Seungha Shin, University of Tennessee, USA
Pei Zhang, Oak Ridge National Laboratory, USA
Ivy Peng, Lawrence Livermore National Laboratory, USA
Ralph Kube, Princeton Plasma Physics Laboratory, USA
Ohyung Kwon, Korea Institute of Industrial Technology, South Korea
Paper Submission
Please submit a short paper (minimum 4 pages, up to 6 pages, in IEEE 2-column format) or a full paper (minimum 8 pages, up to 10 pages, in IEEE 2-column format) through the online submission system.
Papers should be formatted according to the IEEE Computer Society Proceedings Manuscript Formatting Guidelines (see link to "formatting instructions" below).
https://wi-lab.com/cyberchair/2022/bigdata22/scripts/submit.php?subarea=S16&undisplay_detail=1&wh=/cyberchair/2022/bigdata22/scripts/ws_submit.php
Formatting Instructions
8.5" x 11" (DOC, PDF)
LaTeX Formatting Macros
Important Dates
Oct 1, 2022: Abstract submission
Oct 8, 2022: Due date for full workshop paper submission
Nov 1, 2022: Notification of paper acceptance to authors
Nov 20, 2022: Camera-ready of accepted papers
Presentation Preparation
To be announced
Registration
To be announced
Workshop Primary Contact
Sangkeun (Matt) Lee, Discrete Algorithms Group, Computer Science and Mathematics Division, Oak Ridge National Laboratory, TN, USA. Tel: +1 865 574 8858 Email: lees4@ornl.gov