Advanced Monte Carlo methods and applications

Bayesian statistics
Sequential Monte Carlo
Markov chain Monte Carlo
Genome-wide association studies

Monte Carlo methods have emerged as standard tools for Bayesian statistical inference in sophisticated models. Sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC) are two main classes of methods for sampling from high-dimensional probability distributions. This thesis develops methodologies within these classes to address problems in different research areas. Phylogenetic tree reconstruction is a central task in evolutionary biology. Traditional MCMC methods may suffer from the curse of dimensionality and the local-trap problem. Firstly, we introduce a new combinatorial SMC method with a novel and efficient proposal distribution. We also explore combining SMC and Gibbs sampling to jointly estimate the phylogenetic trees and evolutionary parameters of genetic data sets. Secondly, we propose an "embarrassingly parallel" method for Bayesian phylogenetic inference, annealed SMC, based on recent advances in the SMC literature such as adaptive determination of annealing parameters. Another application of the methods presented in this thesis is in genome-wide association studies. Linear mixed models (LMMs) are powerful methods for controlling confounding caused by population structure. We develop a Bayesian hierarchical model to jointly estimate LMM parameters and the genetic similarity matrix using genetic sequences and phenotypes. We develop an SMC method to jointly approximate the posterior distributions of the LMM parameters and phylogenetic trees. We also consider parameter estimation for nonlinear differential equation (DE) systems from noisy measurements of dynamic systems. We develop a fully Bayesian framework for nonlinear DE systems. A flexible nonparametric function is used to represent the dynamic process so that expensive numerical solvers can be avoided. We derive an SMC method to sample from multimodal DE posterior distributions.
In addition, we consider Bayesian computing problems related to importance sampling and misclassification in multinomial data. Lastly, motivated by a personalized recommender system with dynamic preference changes, we develop a new hidden Markov model (HMM) and propose an efficient online SMC algorithm for the HMM by hybridizing SMC with the EM algorithm.

This thesis may be printed or downloaded for non-commercial research and scholarly purposes. Copyright remains with the author.
Liangliang Wang
Science: Department of Statistics and Actuarial Science
Thesis type: (Thesis) Ph.D.