Bayesian Model Averaging (BMA) has been proposed as a solution to the variable selection problem when there is uncertainty about the true model in regression. Recent research has identified a drawback: BMA can, and in practice does, give biased parameter estimates in the presence of confounding, because it is optimized for prediction rather than parameter estimation. Although some newer research attempts to correct this bias under confounding, none of the current algorithms handles either large data sets or survival outcomes. The Approximate Two-phase Bayesian Adjustment for Confounding (ATBAC) algorithm proposed in this paper handles both, and we apply it to a large medical cohort study, THIN (The Health Improvement Network), to estimate the effect of statins on the risk of stroke. Using simulation and analytical techniques, we address two main topics. First, we demonstrate the ability of ATBAC to perform unbiased parameter estimation on survival data while accounting for model uncertainty. Second, we discuss when variable selection techniques are and are not helpful in the first place, and find that in some large data sets variable selection for parameter estimation is unnecessary.
Copyright is held by the author.
The author granted permission for the file to be printed and for the text to be copied and pasted.