Bottom-up Knowledge Graph-based Data Management
André Pomp
Implementing data science use cases that rely on machine learning requires access to a sufficient amount of data. The available data usually varies widely in source, format and quality. Data scientists therefore need ways to discover, understand and access potentially relevant data sources. At the same time, data providers need ways to make their data sources available independently of how they will later be used. An important approach to simplifying the provision and consumption of data is Ontology-based Data Management (OBDM), in which an ontology serves as a common shared conceptualization. However, current OBDM approaches rely on ontologies that are created in advance, and both the creation and the maintenance of such ontologies are complex and time-consuming. This thesis introduces a novel approach, called Bottom-up Knowledge Graph (BUKG), that improves OBDM by overcoming the issues of traditional ontology engineering in the management of (semi-)structured data sources. Instead of having ontologies created and maintained top-down, a BUKG learns the individual conceptualizations of data providers and consumers and continuously integrates them into its common shared conceptualization. In this way, the presented approach contributes to semantic data management by supporting the seamless collection, integration, discovery and understanding of, and access to, heterogeneous data sources based on a novel bottom-up conceptualization.