Gholami, Sajjad

Resource type

Thesis

Thesis type

(Thesis) M.Sc.

Date created

2016-09-20

Authors/Contributors

Author: Gholami, Sajjad

Abstract

A model selection score measures how well a model fits a dataset. We describe a new method for extending, or upgrading, a Bayesian network (BN) score designed for single-table i.i.d. data to multi-relational datasets (databases). A multi-relational BN model provides an integrated statistical analysis of the interdependent data tables in a database. We focus on log-linear model selection scores. Our key theoretical desideratum is preserving consistency: if a model selection score is statistically consistent for single-table data, then the upgraded score should be statistically consistent for relational data as well. It is difficult if not impossible to define an upgrade method for log-linear model scores that preserves consistency, if the upgraded score is a function of a single model only. We therefore develop a novel approach where an upgraded model score is a gain function that compares two models: a current vs. an alternative BN structure. Our main theorem establishes that model search based on our upgraded gain function preserves consistency. Empirical evaluation on six benchmark relational databases shows that our upgraded scores select an informative BN structure that strikes a balance between overly sparse and overly dense graph structures.

Keywords

Identifier

etd9823

Copyright statement

Copyright is held by the author.

Permissions

This thesis may be printed or downloaded for non-commercial research and scholarly purposes.

Scholarly level

Graduate student (Masters)

Supervisor or Senior Supervisor

Thesis advisor: Schulte, Oliver

Member of collection

Computing Science Theses

Download file	Size
etd9823_SGholami.pdf	917.11 KB

Upgrading Bayesian Network Scores for Multi-Relational Data
