Lockhart, Brandon

Resource type

Thesis

Thesis type

(Thesis) M.Sc.

Date created

2021-04-08

Authors/Contributors

Author: Lockhart, Brandon

Abstract

Obtaining an explanation for an SQL query result can enrich the analysis experience, reveal data errors, and provide deeper insight into the data. Inference query explanation seeks to explain unexpected aggregate query results on inference data; such queries are challenging to explain because an explanation may need to be derived from the source, training, or inference data in an ML pipeline. In this work, we model an objective function as a black-box function and propose BOExplain, a novel framework for explaining inference queries using Bayesian optimization (BO). An explanation is a predicate defining the input tuples that should be removed so that the query result of interest is significantly affected. BO, a technique for finding the global optimum of a black-box function, is used to find the best predicate. We develop two new techniques (individual contribution encoding and warm start) to handle categorical variables. We perform experiments showing that the predicates found by BOExplain have a higher degree of explanation than those found by state-of-the-art query explanation engines. We also show that BOExplain is effective at deriving explanations for inference queries from source and training data on three real-world datasets.
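The idea of an explanation as a predicate that removes input tuples can be illustrated with a minimal Python sketch. It is not BOExplain's implementation: the toy table, the column meanings, the function names, and the rule that a predicate removing every tuple is rejected are all assumptions made for illustration, and plain random search is swapped in where the thesis uses Bayesian optimization.

```python
import random

# Hypothetical toy table of (age, predicted_label) pairs standing in for inference data.
ROWS = [(22, 0), (25, 0), (31, 1), (35, 1), (38, 1), (44, 0), (52, 0), (60, 0)]


def query(rows):
    """Aggregate query of interest: AVG(predicted_label)."""
    return sum(label for _, label in rows) / len(rows)


def objective(low, high, rows=ROWS):
    """Black-box objective: query result after removing tuples with low <= age <= high."""
    kept = [r for r in rows if not (low <= r[0] <= high)]
    if not kept:
        # Assumption: a predicate that removes every tuple is not a useful explanation.
        return float("inf")
    return query(kept)


def random_search(n_iter=200, seed=0):
    """Stand-in for BO: sample range predicates and keep the one minimizing the objective."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_iter):
        low = rng.randint(20, 60)
        high = rng.randint(low, 60)
        score = objective(low, high)
        if best is None or score < best[0]:
            best = (score, low, high)
    return best


baseline = query(ROWS)
best_score, best_low, best_high = random_search()
print(f"baseline AVG = {baseline:.3f}")
print(f"remove {best_low} <= age <= {best_high}: AVG = {best_score:.3f}")
```

Here the search only ever calls `objective`, which is what makes the function a black box. A Bayesian optimizer would take the place of `random_search` and choose each new predicate based on the scores seen so far rather than sampling blindly.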

Keywords

ML explanation, SQL explanation

Identifier

etd21307

Copyright statement

Copyright is held by the author(s).

Permissions

This thesis may be printed or downloaded for non-commercial research and scholarly purposes.

Supervisor or Senior Supervisor

Thesis advisor: Wang, Jiannan

Language

English

Member of collection

Computing Science Theses

Download file	Size
input_data\21360\etd21307.pdf	1.37 MB

Explaining inference queries with Bayesian optimization
