Resource type
Thesis type
(Project) M.Sc.
Date created
2022-04-12
Authors/Contributors
Author (aut): Shahjahan, Zayed
Abstract
In this project, we consider a simple new approach to variable selection in linear regression based on the Sum-of-Single-Effects model. The approach is particularly well-suited to big-data settings where variables are highly correlated and effects are sparse. It shares the computational simplicity and speed of traditional stepwise methods of variable selection in regression, but instead of selecting a single variable at each step, it computes a distribution over variables that captures uncertainty about which variable to select. This uncertainty in variable selection is summarized conveniently by credible sets of variables with an attached probability for the entire set. To illustrate the approach, we apply it to a big-data problem in genetics.
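As a rough illustration of the credible-set idea described in the abstract, the sketch below takes a probability distribution over candidate variables and returns the smallest set whose total probability reaches a chosen coverage level. It is not taken from the thesis: the function name `credible_set`, the `coverage` parameter, and the example probabilities are all hypothetical and were chosen only for illustration.

```python
def credible_set(probs, coverage=0.95):
    """Return (indices, total_prob) for the smallest set of variables whose
    summed selection probability reaches `coverage`.

    `probs` is assumed to be a distribution over variables (non-negative,
    summing to 1), e.g. the per-variable selection probabilities from one step.
    """
    # Visit variables from most to least probable.
    order = sorted(range(len(probs)), key=lambda j: probs[j], reverse=True)
    chosen, total = [], 0.0
    for j in order:
        chosen.append(j)
        total += probs[j]
        if total >= coverage:
            break
    return sorted(chosen), total


# Hypothetical example: variables 1 and 2 are highly correlated, so the
# probability mass is split between them rather than placed on one variable.
example_probs = [0.02, 0.48, 0.47, 0.02, 0.01]
members, prob = credible_set(example_probs, coverage=0.9)
print(members, round(prob, 2))
```

In this made-up example neither correlated variable can be singled out, but the pair together carries most of the probability, which is the kind of uncertainty a credible set is meant to summarize.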
Document
Extent
35 pages.
Identifier
etd21867
Copyright statement
Copyright is held by the author(s).
Supervisor or Senior Supervisor
Thesis advisor (ths): Graham, Jinko
Language
English
Member of collection
Download file | Size |
---|---|
etd21867.pdf | 386.49 KB |