DEXA 2023 Keynotes

Keynote 1

Physics-informed Machine Learning

Stéphane Bressan

National University of Singapore (Singapore)


In 1687, Isaac Newton published his groundbreaking work, "Philosophiæ Naturalis Principia Mathematica." Newton's remarkable discoveries unveiled the laws of motion and the law of universal gravitation, propelling humanity's understanding of the physical world to new heights. In a letter to Robert Hooke in 1675, in response to an invitation to collaborate, Newton humbly remarked, "If I have seen further, it is by standing on the shoulders of giants." This metaphor swiftly became a powerful symbol of intellectual and scientific progress, signifying the idea that knowledge is built upon foundations laid by brilliant minds that came before us.

Fast-forwarding to the present, we find ourselves amidst a triumphant statistical machine learning revolution. In 2016, Google's AlphaGo, a deep reinforcement learning algorithm, astounded the world by outperforming a professional Go player. The following year, CheXNet, a deep convolutional neural network developed at Stanford University, surpassed radiologists in accurately detecting pneumonia from chest X-ray images. And in 2020, AlphaFold, a neural network model created by DeepMind, revolutionised protein structure prediction, surpassing other existing methods.

These advancements stand on the shoulders of giants. They owe their existence to the work of logicians, mathematicians, physicists, neurobiologists, computer scientists, and cyberneticists who have paved the way for the birth of modern machine learning models and algorithms. They also owe their existence to the work of material, electrical, electronics and other engineers, whose ingenuity has birthed the computer hardware and technology enabling such performance.

However, the remarkable ascent of machine learning is not solely reliant on these contributions. It thrives on the vast amounts of data permeating the global information infrastructure, enabling the construction of accurate representations of the world. What about knowledge?

In this context, we propose exploring and discussing how machine learning can both leverage and contribute to scientific knowledge. We explore how the training of a machine learning model can be informed by the fundamental principles of the very systems it seeks to comprehend and how it can create symbolic scientific knowledge. We explore applications in classical mechanics, fluid mechanics, quantum many-body systems, macroeconomics, chemistry, and astronomy. Along this journey, we cross the paths of such great minds as William Rowan Hamilton, Ernst Ising, Richard Feynman, and Johannes Kepler.
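As a toy illustration of the general idea of physics-informed training (a minimal sketch, not the speaker's method; all numbers and names are invented), the snippet below fits a quadratic free-fall trajectory to noisy observations while adding a penalty that enforces Newton's second law, y'' = -g, on the learned model:

```python
import numpy as np

# Illustrative physics-informed fitting: fit y(t) = a + b*t + c*t**2 to noisy
# free-fall data, with a penalty forcing the model to obey y'' = -g.
g = 9.81
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 21)
y_true = 10.0 * t - 0.5 * g * t**2               # true trajectory, v0 = 10 m/s
y_obs = y_true + rng.normal(0.0, 0.05, t.size)   # noisy observations

params = np.zeros(3)   # [a, b, c]
lr = 1e-3              # gradient-descent step size
lam = 10.0             # weight of the physics residual
for _ in range(20000):
    a, b, c = params
    err = (a + b * t + c * t**2) - y_obs
    # gradient of the data loss, mean squared error
    grad = np.array([err.sum(), (err * t).sum(), (err * t**2).sum()]) * 2 / t.size
    # physics residual: for a quadratic model, y''(t) = 2c, which should equal -g
    res = 2 * c + g
    grad[2] += 4 * lam * res   # d/dc of lam * (2c + g)**2
    params -= lr * grad

a, b, c = params
print(round(2 * c, 2))   # recovered acceleration, close to -g
```

The physics term pins the curvature of the model to the known law, so the recovered acceleration stays close to -9.81 even with noisy data; the same pattern, with a neural network in place of the quadratic and a differential-equation residual in place of `2c + g`, is the core of physics-informed neural networks.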


Stéphane Bressan is Associate Professor in the Department of Computer Science of the School of Computing (SoC) of the National University of Singapore (NUS). Stéphane is Track Leader for Maritime Information Technologies at NUS Centre for Maritime Studies (CMS), Affiliate Professor at NUS Business Analytics Centre, Faculty Affiliate at NUS Institute of Data Science, and a member of the Image & Pervasive Access Lab (IPAL) (Singapore-France CNRS UMI 2955).

In 1990, Stéphane joined the European Computer-industry Research Centre (ECRC) of Bull, ICL, and Siemens in Munich (Germany). From 1996 to 1998, he was Research Associate at the Sloan School of Management of the Massachusetts Institute of Technology (MIT) (United States of America).
Stéphane's research interest is the integration, management and analysis of data from heterogeneous, disparate, and distributed sources. Stéphane has developed expertise in data- and physics-driven modelling, simulation, and optimisation with data mining and machine learning algorithms.


Keynote 2

Data integration revitalized: from Data Warehouse through Data Lake to Data Mesh

Robert Wrembel

Poznan University of Technology; Artificial Intelligence and Cybersecurity Center (Poland)

Over the years, data integration (DI) architectures have evolved from those supporting virtual integration (mediated, federated), through physical integration (data warehouse), to those supporting both virtual and physical integration (data lake, lakehouse, polystore, data mesh/fabric). Regardless of type, all of these DI architectures include an integration layer. This layer is implemented by sophisticated software that runs so-called DI processes. The integration layer is responsible for ingesting data from various sources (typically heterogeneous and distributed) and for homogenizing the data into formats suitable for further processing and analysis. Nowadays, all business domains, e.g., medical systems, smart cities, and precision/smart agriculture, produce large volumes of highly heterogeneous data, which calls for further advancements in data integration technologies. In this paper, I present my subjective view on still-to-be-developed data integration techniques, namely: (1) novel agile/flexible integration techniques, (2) cost-based and ML-based execution optimization of DI processes, and (3) quality assurance techniques in complex multi-modal data systems.
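As a minimal, hypothetical illustration of the homogenization step an integration layer performs (the sources, field names, and units below are invented), the sketch maps records from two heterogeneous sources into one canonical schema before loading:

```python
# Two heterogeneous sources reporting the same kind of fact in different
# schemas and units (invented examples, e.g. from medical systems).
source_a = [{"patient": "p1", "temp_f": 98.6}]       # imperial units, own field names
source_b = [{"id": "p2", "temperature_c": 37.0}]     # SI units, different field names

def homogenize_a(rec):
    # map source A's schema to the canonical one, converting Fahrenheit to Celsius
    return {"patient_id": rec["patient"],
            "temperature_c": round((rec["temp_f"] - 32) * 5 / 9, 1)}

def homogenize_b(rec):
    # source B already uses Celsius; only the identifier field is renamed
    return {"patient_id": rec["id"], "temperature_c": rec["temperature_c"]}

# The DI process: ingest from each source, apply its mapping, merge the results.
unified = [homogenize_a(r) for r in source_a] + [homogenize_b(r) for r in source_b]
print(unified)
```

Real DI processes of course face schema discovery, data cleaning, deduplication, and scale, which is precisely where the agile integration and ML-based optimization techniques discussed above come in.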
Robert Wrembel (PhD, Dr. Habil.) is an associate professor in the Faculty of Computing and Telecommunications at Poznan University of Technology (Poland). In 2008 he received a post-doctoral degree in computer science (habilitation), specializing in database systems and data warehouses. He was deputy dean of the Faculty of Computing and Management (2008-2012) and of the Faculty of Computing (2012-2016). Since January 2023 he has been the chair of the Data Processing Technologies group at Poznan University of Technology. Recently, he has been leading the creation of the Artificial Intelligence and Cybersecurity Center in Poznań.
He was a consultant at the software house Rodan Systems (2002-2003) and a lecturer at Oracle Poland (1998-2005). Currently he is an IT consultant at the Centrum Medyczne HCP hospital. Within the last 10 years he has carried out three R&D projects (two for Samsung Electronics and one for a company in the energy sector, Kogeneracja Zachód). Currently, as a team leader, he is carrying out a fourth R&D project for the biggest Polish bank, PKO BP. He cooperates with the IBM Software Lab Kraków in Poland. At his university, he led the Erasmus Mundus Joint Doctorate Program "Information Technologies for Business Intelligence - Doctoral College" (2013-2020).
Robert has visited numerous research and education centers, including Universitat Politècnica de Catalunya - BarcelonaTech (Catalunya), Université Lyon 2 (France), Universidad de Costa Rica (Costa Rica), Klagenfurt University (Austria), Loyola University (USA), INRIA Paris-Rocquencourt (France), and Université Paris Dauphine (France). In 2012 he graduated from a two-month innovation and entrepreneurship program at Stanford University. In 2013 he did an internship at the BI company Targit (USA).
In 2010 he received the IBM Faculty Award for highly competitive research; in 2011 he was awarded the Medal of the Committee of National Education (from the Minister of National Education); in 2016, the Silver Medal for Long-lasting Service (from the President of the Republic of Poland); in 2019, the IBM Shared University Research Award; and, also in 2019, the International Federation for Information Processing (IFIP) Service Award. He is a senior ACM member, a country representative in the IFIP Technical Committee TC 2 (Software: Theory and Practice), and the chair of the IFIP Working Group 2.6 (Database).

Keynote 3

From an Interpretable Predictive Model to a Model Agnostic Explanation

Osmar R. Zaïane

Amii Fellow & Canada CIFAR AI Chair, University of Alberta, Canada

Today, the limelight is on deep learning. With its massive success, other machine learning paradigms have had to take the backstage. Yet other models, particularly rule-based learning methods, are more readable and explainable, can be competitive when labelled data is not abundant, and may therefore be more suitable for applications where transparency is a must. One such rule-based method is the lesser-known associative classifier. The power of associative classifiers is to determine patterns from the data and to perform classification based on the features most indicative of the prediction. Early approaches suffered from cumbersome thresholds that required prior knowledge. We present a new associative classifier approach that is even more accurate while generating a smaller model. It can also be used in an explainable AI pipeline to explain inferences from other classifiers, irrespective of the predictive model used inside the black box.
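To make the general associative-classification idea concrete (a toy sketch, not the speaker's new approach; the dataset and thresholds are invented), the snippet below mines single-feature rules of the form "feature=value → class" that meet support and confidence thresholds, then predicts with the most confident matching rule:

```python
from collections import Counter

# Tiny labelled dataset: (feature dict, class label).
data = [({"outlook": "sunny", "windy": "no"}, "play"),
        ({"outlook": "sunny", "windy": "yes"}, "stay"),
        ({"outlook": "rain", "windy": "yes"}, "stay"),
        ({"outlook": "rain", "windy": "no"}, "play"),
        ({"outlook": "sunny", "windy": "no"}, "play")]

min_support, min_conf = 2, 0.7   # the thresholds early approaches relied on

# Count item occurrences and (item, class) co-occurrences.
items, item_class = Counter(), Counter()
for features, label in data:
    for fv in features.items():
        items[fv] += 1
        item_class[(fv, label)] += 1

# Keep rules "feature=value -> class" passing both thresholds.
rules = []
for (fv, label), count in item_class.items():
    conf = count / items[fv]
    if count >= min_support and conf >= min_conf:
        rules.append((fv, label, conf))
rules.sort(key=lambda r: -r[2])   # most confident rules first

def classify(features):
    # Predict with the first (most confident) rule whose antecedent matches.
    for (feat, val), label, conf in rules:
        if features.get(feat) == val:
            return label
    return None

print(classify({"outlook": "rain", "windy": "no"}))   # matches "windy=no -> play"
```

Because each prediction is traced back to an explicit rule with a stated confidence, the inference is readable, which is what makes such classifiers attractive for transparent pipelines and for explaining black-box models.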
Osmar Zaïane is a Professor in Computing Science at the University of Alberta, Canada, a Fellow of the Alberta Machine Intelligence Institute (Amii), and a Canada CIFAR AI Chair. Dr. Zaïane obtained his Ph.D. from Simon Fraser University, Canada, in 1999. He has published more than 400 papers in refereed international conferences and journals. He is an Associate Editor of many international journals on data mining and data analytics and has served as program chair and general chair for scores of international conferences in the field of knowledge discovery and data mining. Dr. Zaïane has received numerous awards, including the 2010 ACM SIGKDD Service Award from the ACM Special Interest Group on Knowledge Discovery and Data Mining, which runs the world's premier data science, big data, and data mining conference.
Dr. Zaïane focuses on pattern discovery and information extraction from large databases, also known as data mining. His work involves data mining from disparate heterogeneous data sources, such as the Internet, as well as the analysis of complex information networks, also known as social network analysis. Specific research projects include the development of tools for data analytics, such as Meerkat, a tool for analyzing changes over time in a network of entities. Other research projects of Dr. Zaïane's relate to data mining in health informatics and the development of tools for document categorization and decision support systems. He focuses on building applications that can improve decision-making in fields from business to medicine, allowing decisions to be based on data and data analysis. Through the application of machine learning and methods of knowledge discovery, he devises ways to personalize applications, automate processes, and improve upon current data science practices.