Penalized Logistic Regression in Case-control Studies

Author: 
Date created: 
2016-12-16
Identifier: 
etd9912
Keywords: 
Logistic regression
Case-control data
Small samples
Separation
Profile likelihood
Abstract: 

Likelihood-based inference for odds ratios in logistic regression models is problematic in small samples. For example, maximum-likelihood estimators may be seriously biased, or may not exist at all because of separation. Firth proposed a penalized likelihood approach that avoids these problems. However, his approach is based on a prospective sampling design, and its application to case-control data has not yet been fully justified. To address the shortcomings of standard likelihood-based inference, we describe: i) naive application of Firth logistic regression, which ignores the case-control sampling design, and ii) an extension of Firth's method to case-control data proposed by Zhang. We present a simulation study evaluating the empirical performance of the two approaches in small to moderate case-control samples. Our simulation results suggest that, even though there is no formal justification for applying Firth logistic regression to case-control data, it performs as well as Zhang logistic regression, which is justified for case-control data.
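
The following sketch is illustrative only and is not taken from the thesis. It shows the naive (prospective) Firth approach on a toy data set with complete separation, where the ordinary maximum-likelihood estimate does not exist. It assumes the standard form of Firth's penalty for logistic regression (log-likelihood plus half the log-determinant of the Fisher information) and solves the resulting modified score equations by Fisher scoring. The function name `firth_logistic`, the toy data, and the tolerance settings are hypothetical choices made for this example; Zhang's case-control extension is not implemented here.

```python
import numpy as np


def firth_logistic(X, y, max_iter=200, tol=1e-8):
    """Sketch of Firth-penalized logistic regression (prospective model).

    Iterates Fisher scoring on the modified score
    U*(beta) = X' (y - pi + h * (1/2 - pi)),
    where h holds the diagonal of the weighted hat matrix.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ beta))      # fitted probabilities
        w = pi * (1.0 - pi)                       # logistic variance weights
        info = X.T @ (w[:, None] * X)             # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # h_i = w_i * x_i' (X'WX)^{-1} x_i, the hat-matrix diagonal
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        score = X.T @ (y - pi + h * (0.5 - pi))   # Firth-modified score
        step = info_inv @ score
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            return beta
    raise RuntimeError("Firth iteration did not converge")


# Toy data (hypothetical): y is 0 for every x < 0 and 1 for every x > 0,
# so the covariate separates the outcomes completely.
x = np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5])
y = np.array([0, 0, 0, 1, 1, 1])
X = np.column_stack([np.ones_like(x), x])         # intercept + slope

beta_hat = firth_logistic(X, y)
print("Firth estimates (intercept, log odds ratio):", beta_hat)
```

On these data the penalized estimate of the log odds ratio is finite, whereas the unpenalized likelihood increases without bound as the slope grows. A production analysis would more likely rely on an established implementation and would add safeguards such as step-halving.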

Document type: 
Graduating extended essay / Research project
Rights: 
This thesis may be printed or downloaded for non-commercial research and scholarly purposes. Copyright remains with the author.
Supervisor(s): 
Jinko Graham
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.