Safari, Amir Hosein

Resource type

Thesis

Thesis type

(Thesis) M.Sc.

Date created

2021-08-24

Authors/Contributors

Author: Safari, Amir Hosein

Abstract

Drug resistance in Mycobacterium tuberculosis (MTB) is a growing threat to human health worldwide. One way to mitigate the risk of drug resistance is to enable clinicians to prescribe the right antibiotic drugs to each patient through methods that predict drug resistance in MTB using whole-genome sequencing (WGS) data. Existing machine learning methods for this task typically convert the WGS data from a given bacterial isolate into features corresponding to single-nucleotide polymorphisms (SNPs) or short sequence segments of a fixed length K (K-mers). Here, we introduce a gene burden-based method for predicting drug resistance in MTB. We define one numerical feature per gene, equal to the number of mutations in that gene in a given isolate. This representation greatly reduces the number of model parameters. We further propose a model that considers both gene order and locality structure through a Long-term Recurrent Convolutional Network (LRCN) architecture, which combines convolutional and recurrent layers. We find that using these strategies yields a substantial, statistically significant improvement over state-of-the-art methods on a large dataset of MTB isolates, and we suggest that this improvement is driven by our method's ability to account for the order of the genes in the genome and their organization into operons.
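
The gene burden representation described in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the function name `gene_burden`, the gene names, the coordinates, and the input layout (a list of mutation positions plus a list of non-overlapping gene intervals sorted by start) are all assumptions made for illustration. It only shows the idea of one count per gene, kept in genome order so that a model sensitive to gene order could consume it.

```python
from bisect import bisect_right
from collections import Counter


def gene_burden(mutation_positions, genes):
    """Count mutations per gene for one isolate.

    Assumed (hypothetical) input layout:
      mutation_positions: iterable of 1-based genome positions carrying a mutation
      genes: list of (name, start, end) tuples, sorted by start,
             non-overlapping, with inclusive 1-based coordinates
    Returns a list of counts, one per gene, in genome order.
    """
    starts = [start for _, start, _ in genes]
    counts = Counter()
    for pos in mutation_positions:
        # Index of the last gene whose start is <= pos (or -1 if none).
        i = bisect_right(starts, pos) - 1
        # Count the mutation only if it also falls before that gene's end;
        # intergenic positions are ignored in this sketch.
        if i >= 0 and pos <= genes[i][2]:
            counts[genes[i][0]] += 1
    return [counts[name] for name, _, _ in genes]


# Hypothetical toy annotation and mutation calls, not real MTB data.
toy_genes = [("geneA", 1, 100), ("geneB", 150, 300), ("geneC", 301, 400)]
toy_mutations = [5, 50, 120, 150, 399, 500]
features = gene_burden(toy_mutations, toy_genes)
print(features)
```

With this layout each isolate yields a vector whose length equals the number of genes, rather than one feature per SNP or K-mer, which is the source of the parameter reduction the abstract mentions.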

Extent

43 pages.

Identifier

etd21607

Copyright statement

Copyright is held by the author(s).

Permissions

This thesis may be printed or downloaded for non-commercial research and scholarly purposes.

Supervisor or Senior Supervisor

Thesis advisor: Libbrecht, Maxwell

Language

English

Member of collection

Computing Science Theses

Download file	Size
etd21607.pdf	2.65 MB

Title

Predicting drug resistance in M. tuberculosis using a long-term Recurrent Convolutional Network
