European Master's Program in Computational Logic

26 August 2015

Master's Thesis Defence by Ms Alifah Syamsiyah

Ms Alifah Syamsiyah defended her master's thesis on 'Ontology-Driven Extraction of Event Logs from Relational Databases' at unibz on 24 August 2015.

Abstract: The need to improve and support business process management in competitive and rapidly changing environments is growing steadily. Process mining is a new discipline that has the potential to provide meaningful insights to organizations. The main idea behind process mining is to use the event data and logs recorded in an organization's information system to extract process-related information that reflects reality. One important open challenge in this research area is how to obtain good-quality event data as the starting point. The purpose of this thesis is to provide a solution to this problem. Since contemporary organizations typically store their data in legacy information systems managed through standard relational technology, we provide techniques and tools for extracting event data from relational databases. Our approach is ontology-driven and relies in particular on two ontologies: a domain ontology that captures the relevant concepts and relations involved in the business of the organization, and an event ontology that captures the general notion of event logs and related concepts, mirroring the structure of the XES IEEE Standard for event logs. An annotation mechanism is then introduced to help domain experts highlight event-related notions inside the domain ontology. We then leverage the ontology-based data access paradigm to link the data to the domain ontology and the domain ontology to the event ontology, using the annotations in the latter step. This provides the basis for extracting XES-related information from the legacy data. Two approaches are supported during the extraction: materialization, in which an XES event log is explicitly constructed, and virtualization, in which XES-related queries are reformulated as corresponding queries to be posed over the legacy data. Notably, process mining tools can seamlessly benefit from both strategies without changing a single line of code.
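As a rough illustration of the materialization strategy mentioned in the abstract, the following sketch builds an XES-style log from relational rows. It is not the thesis's method: it skips the domain ontology, the event ontology, and the annotations entirely and maps table columns straight to XES attributes. The table orders_history, its columns, and the function names are hypothetical, chosen only for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_demo_db():
    """Create a tiny in-memory 'legacy' database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders_history (order_id TEXT, status TEXT, changed_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders_history VALUES (?, ?, ?)",
        [
            ("o1", "created", "2015-08-01T09:00:00"),
            ("o2", "created", "2015-08-01T09:30:00"),
            ("o1", "shipped", "2015-08-02T14:00:00"),
        ],
    )
    return conn


def materialize_xes(conn):
    """Explicitly construct an XES-style log: one trace per order, one event per row."""
    rows = conn.execute(
        "SELECT order_id, status, changed_at FROM orders_history "
        "ORDER BY order_id, changed_at"
    )
    # Group rows by case identifier (assumed here to be the order id).
    traces = defaultdict(list)
    for case_id, activity, timestamp in rows:
        traces[case_id].append((activity, timestamp))

    log = ET.Element("log", {"xes.version": "1.0"})
    for case_id, events in traces.items():
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string", {"key": "concept:name", "value": case_id})
        for activity, timestamp in events:
            event = ET.SubElement(trace, "event")
            ET.SubElement(event, "string", {"key": "concept:name", "value": activity})
            ET.SubElement(event, "date", {"key": "time:timestamp", "value": timestamp})
    return log


xes_log = materialize_xes(build_demo_db())
xes_text = ET.tostring(xes_log, encoding="unicode")
```

In the virtualization strategy no such log would be built; a query asking, for example, for all events of a trace would instead be rewritten into SQL over the legacy tables at query time.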