Resource type
Thesis type
(Project) M.Sc.
Date created
2006
Authors/Contributors
Author: Zamar, David Sebastian
Abstract
Current methods for exact inference in logistic regression cannot handle large data sets because of the memory required to store large networks. We provide and implement an algorithm capable of conducting (approximate) exact inference for large data sets. Application fields such as genetic epidemiology, in which logistic regression models are fit to larger data sets that are sparse or unbalanced, may benefit from this work. We illustrate our method by applying it to a diabetes data set that could not be analyzed using existing methods implemented in software packages such as LogXact and SAS. We include a listing of our code along with documented instructions and examples of all user methods. The code will be submitted to the Comprehensive R Archive Network as a freely available R package after further testing.
Copyright statement
Copyright is held by the author.
Language
English
Download file | Size |
---|---|
etd2390.pdf | 927.38 KB |