Nickchi, Payman

Resource type

Thesis

Thesis type

(Thesis) Ph.D.

Date created

2023-12-19

Authors/Contributors

Author (aut): Nickchi, Payman

Abstract

This thesis investigates two distinct projects: one in statistical genetics, which identifies rare causal variants using a sequence-relatedness approach, and another on goodness-of-fit tests based on the empirical distribution function (EDF) for general likelihood models. First, we investigate an association method based on sequence relatedness for identifying causal variants in a genomic region. We conduct linkage analysis using sequences, rather than individuals as in traditional methods, as the unit of observation. We introduce two sequence-relatedness approaches that associate similarity in genetic relatedness with similarity in trait values, and we compare them with two common genotypic-association methods. In a simulation study, we show that the sequence-relatedness methods improve the localization and detection of rare causal variants in an allelically heterogeneous disease trait. In addition, we introduce a post-hoc labeling procedure, based on the idea of genealogical nearest neighbors, to identify potential carriers or non-carriers of causal variants among case sequences.
Second, we introduce a goodness-of-fit test based on the EDF in the presence of parameter estimation, which can be applied to any general likelihood model. Computing the P-value for such a test requires the limiting large-sample covariance function of a stochastic process. This function depends on key elements of the model, including the Fisher information matrix and the derivatives of the cumulative distribution function under the null hypothesis. Computing these elements is often not straightforward and can be computationally intensive or impractical in some cases. In this thesis, we review the theory and propose a new method that estimates the covariance function of the process directly from the sample rather than by analytical calculation. We consider two broad cases: an independent and identically distributed sample, and a response variable whose expected value depends on covariates (e.g., a linear model or generalized linear model). Through simulations, we demonstrate the reliability of the estimation method. Finally, we provide computational tools as an R package for practical implementation.
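For readers unfamiliar with EDF tests under parameter estimation, the following minimal sketch may help. It is not the thesis's method and not its R package. It computes the Cramér-von Mises statistic for a normal model with estimated mean and standard deviation, and approximates the P-value with a parametric bootstrap, a different and simpler technique than the sample-based covariance estimation proposed in the thesis. All function names are hypothetical and chosen for illustration only.

```python
import random
from statistics import NormalDist, fmean, pstdev


def cramer_von_mises(z):
    """W^2 computed from probability integral transforms z_i = F(x_i; theta_hat)."""
    n = len(z)
    z_sorted = sorted(z)
    return 1.0 / (12 * n) + sum(
        (zi - (2 * i - 1) / (2 * n)) ** 2
        for i, zi in enumerate(z_sorted, start=1)
    )


def cvm_normal_estimated(x):
    """W^2 for a normal null with mean and sd estimated by maximum likelihood."""
    mu = fmean(x)
    sigma = pstdev(x, mu)  # MLE of sigma uses divisor n
    fitted = NormalDist(mu, sigma)
    return cramer_von_mises([fitted.cdf(v) for v in x])


def bootstrap_p_value(x, n_boot=200, seed=0):
    """Parametric-bootstrap P-value (an assumption of this sketch, not the
    thesis's approach): refit the model on each sample simulated from the fit."""
    rng = random.Random(seed)
    observed = cvm_normal_estimated(x)
    mu = fmean(x)
    sigma = pstdev(x, mu)
    n = len(x)
    exceed = 0
    for _ in range(n_boot):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if cvm_normal_estimated(sample) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    skewed = [rng.expovariate(1.0) for _ in range(200)]
    print(bootstrap_p_value(skewed))
```

Re-estimating the parameters inside every bootstrap replicate is what accounts for parameter estimation here. The covariance-function approach described in the abstract aims to avoid this kind of repeated refitting.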

Extent

143 pages.

Keywords

Identifier

etd22918

Copyright statement

Copyright is held by the author(s).

Permissions

This thesis may be printed or downloaded for non-commercial research and scholarly purposes.

Supervisor or Senior Supervisor

Thesis advisor (ths): Lockhart, Richard

Thesis advisor (ths): Graham, Jinko

Language

English

Member of collection

Statistics and Actuarial Science Theses

Download file	Size
etd22918.pdf	1.42 MB

Linkage fine-mapping on sequences from case-control studies and goodness-of-fit tests based on empirical distribution function for general likelihood models
