Covariance-adjusted, sparse, reduced-rank regression with application to imaging-genetics data

Date created: 
ADNI study
Alzheimer's disease
Imaging genetics data
High dimensional data
Dimension reduction
Sparse data
Multiple response regression
Variable selection
Covariance estimation
Spatial correlation

Alzheimer's disease (AD) is one of the most challenging diseases in the world, and it is crucial for researchers to explore the relationship between AD and genes. In this project, we analyze data from 179 cognitively normal individuals, provided by the Alzheimer's Disease Neuroimaging Initiative (ADNI). The data contain magnetic resonance imaging measures in 56 brain regions of interest and alternate allele counts of 510 single nucleotide polymorphisms (SNPs) from 33 candidate genes for AD. Our objectives are to explore the data structure and prioritize interesting SNPs. Standard linear regression models are inappropriate in this research context because they cannot account for sparsity in the SNP effects or for the spatial correlations between brain regions. We therefore review the method of covariance-adjusted, sparse, reduced-rank regression (Cov-SRRR), which simultaneously performs variable selection and covariance estimation, and apply it to these data. In our findings, SNP rs16871157 has the highest variable importance probability (VIP) in the bootstrap analysis. In addition, the estimated coefficients corresponding to the thickness measures of the temporal lobe area have the largest absolute values and are negative, which is consistent with current AD research.

Document type: 
Graduating extended essay / Research project
This thesis may be printed or downloaded for non-commercial research and scholarly purposes. Copyright remains with the author.
Jinko Graham
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.