Jiang, Bin

Resource type

Thesis

Thesis type

(Thesis) Ph.D.

Date created

2011-05-24

Authors/Contributors

Author: Jiang, Bin

Abstract

Uncertain data has been rapidly accumulated in many important applications, such as sensor networks, market analysis, and social networks. Analyzing large collections of uncertain data has become an essential task. Generally, uncertainty means the lack of certainty due to having limited knowledge of the data being examined. An uncertain object cannot be described exactly in one state. Instead, it has more than one possible representation. Therefore, we model an uncertain data set as a set of uncertain objects, each of which has a set of instances, in a domain consisting of multiple attributes. In this thesis, we put emphasis on summarizing certainty in uncertain data. We systematically identify three types of uncertainty, namely, value uncertainty, membership uncertainty, and relationship uncertainty, at the levels of objects, instances, and domains of uncertain data. In particular, we develop techniques for clustering uncertain objects to summarize objects, detecting outlying instances to summarize instances, and learning domain orders to summarize domains. Technically, we combine statistical analysis and data mining techniques to investigate uncertain data. We develop efficient and scalable algorithms to tackle the computational challenges of large uncertain data sets. We also conduct comprehensive empirical studies on real and synthetic data sets to verify the effectiveness of the proposed summarization techniques and the efficiency of our algorithms.

Keywords

Identifier

etd6646

Copyright statement

Copyright is held by the author.

Permissions

The author granted permission for the file to be printed and for the text to be copied and pasted.

Scholarly level

Graduate student (PhD)

Supervisor or Senior Supervisor

Thesis advisor: Pei, Jian

Member of collection

Computing Science Theses

Download file	Size
etd6646_BJiang.pdf	1.52 MB

Summarizing certainty in uncertain data
