Integrating knowledge-driven and data-driven approaches in the derivation of clinical prediction rules

Resource type
Thesis type
(Thesis) Ph.D.
Date created
2006
Abstract
Clinical prediction rules play an important role in medical practice. They expedite diagnosis and treatment for serious cases and limit unnecessary tests for low-probability cases. However, the creation process for prediction rules is costly, lengthy, and involves several steps: initial clinical trials, rule generation and refinement, validation, and evaluation in clinical settings. With the current development of efficient data mining algorithms and the growing accessibility of vast amounts of medical data, the creation of clinical rules can be supported by automated or semi-automated rule induction from existing data sources. A data-driven method based on the reuse of previously collected medical records and clinical trial statistics is very cost-effective; however, it requires well-defined and intelligent methods for data, information, and knowledge integration. This thesis presents a new framework for the integration of domain knowledge into purely data-driven techniques for the derivation of clinical prediction rules. We concentrate on two aspects: knowledge representation for the predictors and prediction rules, and knowledge-based evaluation of the automatically induced models. We propose a new integrative framework, a semio-fuzzy approach that has its theoretical foundations in semiotics and fuzzy logic. Semiotics provides a representation for the measurement and interpretation of the medical predictors. Fuzzy logic provides an explicit representation for the imprecision of the measurements and prediction rules. The integrative framework is applied to the construction of a knowledge repository for existing facts and rules, detection of medical outliers, handling of missing values, handling of imbalanced data, and feature selection. Several machine learning techniques are considered, based on model comprehensibility, interpretability, and practical utility in clinical settings.
This new semio-fuzzy framework is applied to the creation of prediction rules for the diagnosis of obstructive sleep apnea, a serious and under-diagnosed respiratory disorder, and tested on heterogeneous clinical data sets. The induced decision trees and logistic regression models are evaluated in the context of existing clinical prediction rules published in the medical literature. We describe how the induced rules may confirm, contradict, and expand the expert-created rules.
Copyright statement
Copyright is held by the author.
Permissions
The author has not granted permission for the file to be printed nor for the text to be copied and pasted. If you would like a printable copy of this thesis, please contact summit-permissions@sfu.ca.
Language
English