Query processing of schema design problems for data-driven renormalization

Author: 
Date created: 
2017-10-16
Identifier: 
etd10419
Keywords: 
Functional dependencies
Database design
Renormalization
NoSQL
Abstract: 

In recent decades, a growing share of information has been stored or delivered in non-relational data models, either in NoSQL databases or through Software as a Service (SaaS) applications. Users often want to load these datasets into a business intelligence (BI) application or a relational database for further analysis. The data-driven renormalization framework is often used to transform non-relational data into relational data. In this thesis, we explore how to help users make design decisions in such a framework. We formally define two kinds of queries, the point query and the stable interval query, to support these decisions. We propose two index structures that both represent a list of functional dependencies (FDs) concisely and process the queries efficiently. We conduct experiments on two real datasets and show that our algorithms greatly outperform the baseline method when processing a large set of FDs.

Document type: 
Thesis
Rights: 
This thesis may be printed or downloaded for non-commercial research and scholarly purposes. Copyright remains with the author.
Senior supervisor: 
Jian Pei
Department: 
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) M.Sc.