Zhao, Zijin

Resource type

Thesis

Thesis type

(Thesis) M.Sc.

Date created

2017-06-19

Authors/Contributors

Author: Zhao, Zijin

Abstract

Heavy label noise, in which the observed labels of instances are corrupted, arises in many practical scenarios. Classification under heavy label noise has attracted considerable attention because label noise can have many negative consequences. Many state-of-the-art approaches assume that label noise is class-dependent and therefore do not generalize to settings where this assumption fails. In this thesis, we propose a Markov chain sampling framework, MCS, to overcome the limitations of existing methods for the binary classification problem. The main idea is to use the predictions of a sequence of classifiers, as an ensemble, to detect mislabeled instances. The classifiers are trained on different subsets of the training data, obtained by sampling the states of a carefully designed Markov chain with a random walk. The proposed MCS framework is general and can accommodate a wide spectrum of classification algorithms. We theoretically prove the correctness and effectiveness of the MCS framework. We further present experimental results demonstrating the effectiveness and efficiency of the proposed framework and its derivative algorithms.
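
The general idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not specify the Markov chain, so the sketch assumes a simple random walk over subsets of the training set (each step toggles one randomly chosen instance in or out of the subset), a nearest-centroid base classifier, and a majority-vote threshold. The names `train_centroids`, `predict`, and `flag_mislabeled`, and the parameters `steps` and `threshold`, are hypothetical.

```python
import random


def train_centroids(points, labels):
    """Fit a nearest-centroid classifier: one mean vector per class (0 and 1)."""
    centroids = {}
    for cls in (0, 1):
        members = [p for p, y in zip(points, labels) if y == cls]
        if not members:
            return None  # this subset lacks a class; the caller skips it
        dim = len(members[0])
        centroids[cls] = [
            sum(p[d] for p in members) / len(members) for d in range(dim)
        ]
    return centroids


def predict(centroids, point):
    """Return the class whose centroid is closest to the point."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(point, center))
    return 0 if sq_dist(centroids[0]) <= sq_dist(centroids[1]) else 1


def flag_mislabeled(points, noisy_labels, steps=200, threshold=0.5, seed=0):
    """Flag instances whose observed label disagrees with most classifiers.

    Assumed chain (illustrative only): a state is a subset of training
    indices, and one random-walk step toggles the membership of a single
    uniformly chosen index.
    """
    rng = random.Random(seed)
    n = len(points)
    state = set(rng.sample(range(n), n // 2))  # start from a random half
    disagreements = [0] * n
    votes = 0
    for _ in range(steps):
        i = rng.randrange(n)
        state.symmetric_difference_update({i})  # toggle index i in or out
        idx = sorted(state)
        model = train_centroids(
            [points[j] for j in idx], [noisy_labels[j] for j in idx]
        )
        if model is None:
            continue
        votes += 1
        # Every trained classifier votes on every training instance.
        for j in range(n):
            if predict(model, points[j]) != noisy_labels[j]:
                disagreements[j] += 1
    if votes == 0:
        return []
    return [j for j in range(n) if disagreements[j] / votes > threshold]
```

Calling `flag_mislabeled(points, noisy_labels)` returns the indices of instances that more than half of the sampled classifiers contradict. Any binary classifier could stand in for the nearest-centroid model here, which mirrors the abstract's claim that the framework is not tied to one classification algorithm.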

Identifier

etd10244

Copyright statement

Copyright is held by the author.

Permissions

This thesis may be printed or downloaded for non-commercial research and scholarly purposes.

Scholarly level

Graduate student (Masters)

Supervisor or Senior Supervisor

Thesis advisor: Pei, Jian

Member of collection

Computing Science Theses

Download file	Size
etd10244_ZZhao.pdf	563.38 KB

Classification in the presence of heavy label noise: A Markov chain sampling framework
