Invited Talks
Data integration in the AI era: research trends and still open issues
Robert Wrembel (Poznan University of Technology, Poland)
Abstract
Data integration (DI) has been an area of intensive research for decades. These efforts have resulted in a few reference DI architectures, which can be categorized as supporting: (1) virtual integration (federated and mediated), (2) physical integration (data warehouse), and (3) hybrid integration (data lake, data lakehouse, data mesh). Regardless of their specific type, all these architectures rely on a sophisticated integration layer. This layer is implemented by dedicated software for designing, orchestrating, and running so-called DI processes. Nowadays, large volumes of highly heterogeneous data are produced in all business domains, e.g., by medical systems, smart cities, and smart agriculture, which requires further advancements in data integration technologies. The widespread adoption of artificial intelligence (AI) solutions is now extending towards DI, opening new research paths and generating open problems.
In this talk, I will share my perspective on the application and potential of AI solutions in DI. I will also highlight unresolved issues within the field of DI. Specifically, I will explore: the optimization of DI processes, the role of user-defined functions (UDFs) in DI, data quality with a focus on deduplication, a novel DI architecture based on the connectors-as-a-service paradigm, and data provenance.
The talk will be structured into three main parts: (1) an overview of data integration architectures, (2) selected AI techniques for DI, and (3) still open problems in DI. The findings presented in the talk are based on my experience in running research and development DI projects for various business entities.
Robert Wrembel (PhD, Dr. Habil.) is an associate professor in the Faculty of Computing and Telecommunications at Poznan University of Technology (Poland). In 2008 he received a post-doctoral degree in computer science (habilitation), specializing in database systems and data warehouses. He served as deputy dean of the Faculty of Computing and Management (2008-2012) and of the Faculty of Computing (2012-2016). Since January 2023 he has been the chair of the Data Processing Technologies group at Poznan University of Technology. He was a consultant at a software house (2002-2003) and a lecturer at Oracle Poland (1998-2005). Currently he is an IT consultant in a private hospital. Within the last 10 years he has carried out four R&D projects: one for a large financial institution in Poland, one for a company in the energy sector, and two for a corporation in the electronics field. He cooperates with IBM Software Lab Kraków in Poland. At his university, he led the Erasmus Mundus Joint Doctorate Program - Information Technologies for Business Intelligence - Doctoral College (2013-2020). Robert has visited numerous research and education centers, including: INRAE Clermont-Ferrand (France), Free University of Bozen-Bolzano (Italy), Università degli Studi di Milano (Italy), Universitat Politècnica de Catalunya - BarcelonaTech (Spain), Université Lyon 2 (France), Universidad de Costa Rica (Costa Rica), Klagenfurt University (Austria), Loyola University (USA), INRIA Paris-Rocquencourt (France), and Université Paris Dauphine (France). In 2012 he graduated from a two-month innovation and entrepreneurship program at Stanford University. In 2013 he completed an internship at the BI company Targit (USA). His research interests encompass data integration, data quality, databases, data warehouses, and data lakes.
Blending Contextual Data with Heterogeneous Time Dimensions for Improved Time Series Analysis
Anton Dignös (Free University of Bozen-Bolzano, Italy)
Abstract
In modern industrial settings, sensors continuously generate vast amounts of time series data critical for automation and process optimization. However, analyzing this data in isolation limits its effectiveness, as it often lacks integration with contextual factors that influence outcomes but are not directly observable. While traditional data fusion techniques aim at combining multimodal data such as images or videos, contextual factors in industrial environments frequently differ not in modality but in temporal structure. We identify four distinct time dimensions - constant, time series, events, and intervals - that commonly characterize contextual data in these settings. By transforming diverse time structures into a unified format, we enable the application of conventional machine learning techniques, enhancing the depth and accuracy of industrial data analysis. This talk presents a case study and initial work on a foundational approach for systematically integrating such temporally heterogeneous contextual factors into time series analysis.
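The abstract does not say how the four time dimensions are brought into a unified format, so the following is only an illustrative sketch under stated assumptions, not the speaker's method. It assumes a regular grid of integer timestamps and one plausible alignment rule per dimension: broadcasting for constants, last observation carried forward for time series, a trailing-window count for events, and an "active" indicator for intervals. All function and column names are hypothetical.

```python
from bisect import bisect_right


def align_constant(grid, value):
    """Constant context: repeat the same value at every grid timestamp."""
    return [value for _ in grid]


def align_series(grid, samples):
    """Time series context: last observation carried forward.

    `samples` is a list of (timestamp, value) pairs sorted by timestamp.
    Grid points before the first sample get None.
    """
    times = [t for t, _ in samples]
    out = []
    for t in grid:
        i = bisect_right(times, t)  # number of samples with timestamp <= t
        out.append(samples[i - 1][1] if i > 0 else None)
    return out


def align_events(grid, event_times, window):
    """Event context: count events in the trailing window (t - window, t]."""
    ev = sorted(event_times)
    return [bisect_right(ev, t) - bisect_right(ev, t - window) for t in grid]


def align_intervals(grid, intervals):
    """Interval context: 1 if t lies in any half-open [start, end), else 0."""
    return [int(any(s <= t < e for s, e in intervals)) for t in grid]


def build_feature_table(grid, machine_id, temperature, alarms, maintenance, window):
    """Combine all four aligned columns into one row per grid timestamp.

    The column names are hypothetical and chosen only for illustration.
    """
    columns = {
        "t": list(grid),
        "machine_id": align_constant(grid, machine_id),
        "temperature": align_series(grid, temperature),
        "alarms_last_window": align_events(grid, alarms, window),
        "in_maintenance": align_intervals(grid, maintenance),
    }
    return [dict(zip(columns, row)) for row in zip(*columns.values())]


# Made-up example data: timestamps are plain integers (e.g. seconds).
grid = [0, 10, 20, 30]
rows = build_feature_table(
    grid,
    machine_id="M1",
    temperature=[(5, 1.5), (18, 2.0)],
    alarms=[3, 12, 14, 29],
    maintenance=[(8, 21)],
    window=10,
)
for row in rows:
    print(row)
```

Each resulting row is a flat record indexed by a single timestamp, which is the kind of tabular input that conventional machine learning models expect. Other alignment rules (interpolation, time since last event, remaining interval duration) would fit the same structure.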
Anton Dignös is an assistant professor in the Faculty of Engineering at the Free University of Bozen-Bolzano (Italy). He received his PhD in Computer Science in 2014 from the Department of Computer Science at the University of Zurich (Switzerland). His research focuses on database technologies for advanced query processing and data analytics, with particular emphasis on temporal data, time series, and data summarization. He is the program co-chair of DAWAK 2025 and previously served as program co-chair of the EDBT PhD Workshop in 2023 and 2024.