Resource type
Thesis type
(Thesis) M.Sc.
Date created
2021-12-17
Authors/Contributors
Abstract
Motivation: Drug resistance is an increasingly serious risk to human health around the world. Techniques that predict drug resistance across bacterial species from whole-genome sequencing (WGS) data could let doctors administer the appropriate antibiotics to each patient, reducing the chance of drug resistance. Currently available machine learning techniques for this purpose transform WGS data from a specific bacterial isolate into features corresponding to single-nucleotide polymorphisms (SNPs) or short sequence segments of a defined length (k-mers). We present a novel technique for predicting drug resistance in multiple bacterial species based on gene burden. Our multi-input multi-output network predicts the resistance of multiple species to multiple antibiotic drugs. Results: On a large dataset of isolates from six species, we find that using these strategies yields a statistically significant improvement over state-of-the-art methods, and that this improvement is driven by our method's ability to account for the order of the genes in the genome and to train jointly on multiple bacterial species.
Document
Extent
30 pages.
Identifier
etd21971
Copyright statement
Copyright is held by the author(s).
Supervisor or Senior Supervisor
Thesis advisor: Libbrecht, Maxwell
Language
English
Member of collection
Download file | Size |
---|---|
etd21971.pdf | 977.51 KB |