Resource type
Thesis type
(Thesis) M.Sc.
Date created
2017-06-19
Authors/Contributors
Author (aut): Zhao, Zijin
Abstract
Heavy label noise, in which the observed labels of instances are corrupted, is present in many practical scenarios. Classification with heavy label noise is of great significance and has attracted considerable attention, since label noise can have many negative consequences. Many state-of-the-art approaches assume that label noise is class-dependent, and thus do not generalize to situations where this assumption does not hold. In this thesis, we propose a Markov chain sampling framework, MCS, to overcome the limitations of existing methods in the binary classification problem. The main idea is to use the predictions of a sequence of classifiers as an ensemble to detect mislabeled instances. The classifiers in the sequence are trained on different subsets of the training data, obtained by sampling the states of a carefully designed Markov chain with a random walk. Our proposed MCS framework is general and can accommodate a wide spectrum of classification algorithms. We theoretically prove the correctness and effectiveness of the MCS framework. We further present experimental results showing the effectiveness and efficiency of the proposed framework and derivative algorithms.
Document
Identifier
etd10244
Copyright statement
Copyright is held by the author.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor (ths): Pei, Jian
Member of collection
| Download file | Size |
|---|---|
| etd10244_ZZhao.pdf | 563.38 KB |