Orabi, Baraa

Resource type

Thesis

Thesis type

(Thesis) M.Sc.

Date created

2018-12-07

Authors/Contributors

Author: Orabi, Baraa

Abstract

The use of circulating tumour DNA (ctDNA) in cancer genomics has the potential for rapid and non-invasive monitoring of patient-specific tumour progression. However, detection of low-allele-frequency variations in ctDNA raises many challenges, including the handling of sequencing errors. Tagging DNA molecules with Unique Molecular Identifiers (UMIs) attempts to mitigate sequencing errors; UMI-tagged molecules are PCR-amplified and then sequenced independently. Analyzing UMI-tagged sequencing data requires clustering the reads that originate from the same molecule and then correcting sequencing errors within each cluster. The size of current datasets requires this process to be resource-efficient. To address this problem, we introduce Calib, a computational tool that clusters and error-corrects UMI-tagged sequencing data. Calib is efficient, and its parameters have been optimized for different dataset setups. On simulated datasets, Calib is highly accurate. On a real dataset, Calib results in significantly reduced false-positive rates in downstream variation calling.
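
The two steps named in the abstract, clustering reads by molecule and then correcting errors within each cluster, can be illustrated with a minimal sketch. This is not Calib's algorithm: it assumes exact UMI matches and equal-length reads, whereas a real tool cannot rely on either, since the tags themselves can contain sequencing errors. The function names and the toy reads are hypothetical.

```python
from collections import Counter, defaultdict


def cluster_by_umi(reads):
    """Group (umi, sequence) pairs by exact UMI match (a simplifying assumption)."""
    clusters = defaultdict(list)
    for umi, seq in reads:
        clusters[umi].append(seq)
    return dict(clusters)


def consensus(seqs):
    """Majority vote at each position; assumes all sequences have equal length."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def error_correct(reads):
    """Return one consensus sequence per UMI cluster."""
    return {umi: consensus(seqs) for umi, seqs in cluster_by_umi(reads).items()}


# Toy input: two molecules, one read carrying a single substitution error.
reads = [
    ("AACG", "ACGTACGT"),
    ("AACG", "ACGTACGT"),
    ("AACG", "ACGAACGT"),  # simulated sequencing error at position 3
    ("TTGC", "GGGTTTAA"),
    ("TTGC", "GGGTTTAA"),
]
corrected = error_correct(reads)
```

In this toy input the erroneous base is outvoted by the two correct reads in its cluster, so each molecule is reduced to a single corrected sequence.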

Keywords

Identifier

etd19999

Copyright statement

Copyright is held by the author.

Permissions

This thesis may be printed or downloaded for non-commercial research and scholarly purposes.

Scholarly level

Graduate student (Masters)

Supervisor or Senior Supervisor

Thesis advisor: Chauve, Cedric

Member of collection

Computing Science Theses

Download file	Size
etd19999.pdf	2.3 MB

Alignment-free clustering and error correction of UMI tagged DNA molecules
