Statistics and Actuarial Science - Theses, Dissertations, and other Required Graduate Degree Essays

Bayesian Computational Methods and Applications

Author: 
Date created: 
2014-04-24
Abstract: 

The purpose of this thesis is to develop Bayesian methodology, together with the proper computational tools, to address two different problems. The first problem, which is more general from a methodological point of view, appears in computer experiments. We consider emulation of realizations of a monotone function at a finite set of inputs available from a computationally intensive simulator. We develop a Bayesian method for incorporating the monotonicity information in Gaussian process models that are traditionally used as emulators. The resulting posterior in the monotone emulation setting is difficult to sample from due to the restrictions caused by the monotonicity constraint. Overcoming the difficulties faced in sampling from the constrained posterior motivated the development of a variant of sequential Monte Carlo samplers, which is introduced at the beginning of this thesis. Our proposed algorithm, which can be used in a variety of frameworks, is based on imposing the constraint in a sequential manner. We demonstrate the applicability of the sampler to different cases with two examples: one in inference for differential equation models and the second in approximate Bayesian computation. The second focus of the thesis is on an application in the area of particle physics. The statistical procedures used in the search for a new particle are investigated, and a Bayesian alternative method is proposed that can address decision making and inference for a class of problems in this area. The sampling algorithm and components of the model used for this application are related to methods used in the first part of the thesis.

Document type: 
Thesis
File(s): 
Supervisor(s): 
Richard Lockhart
Derek Bingham, Hugh Chipman, David Campbell
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Thesis) Ph.D.

Irregularly Spaced Time Series Data with Time Scale Measurement Error

Date created: 
2014-05-23
Abstract: 

This project is divided into two sections. The first section models irregularly spaced time series data in which the time scale is measured with error. Modelling irregularly spaced time series data is challenging on its own, as traditional time series techniques handle only equally (regularly) spaced data. In addition, the measurement error in the time scale makes it even more challenging to incorporate measurement error models and functional approaches to model the time series. Thus, this project is based on a Bayesian approach to model a flexible regression function when the time scale is measured with error. The regression functions are modelled with regression P-splines, and the posterior is explored using a fully Bayesian method based on Markov chain Monte Carlo (MCMC) techniques. In section two, we use windowed moving correlations to identify the relationship (dependency) between two irregularly spaced time series data sets, each modelled using regression P-splines and a fully Bayesian method. The validity of the suggested methodology is then explored using two simulations. It is then applied to two irregularly spaced time series data sets, each subject to measurement error in the time scale, to identify the dependency between them in terms of statistically significant correlations.

Document type: 
Graduating extended essay / Research project
File(s): 
Supervisor(s): 
Dave Campbell
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Model assessment: Bayes assisted tests and tests for discrete data

Author: 
Date created: 
2014-01-23
Abstract: 

In this thesis, two areas of goodness-of-fit are discussed and new methodology is proposed. In the first, Bayesian methods are introduced to provide a narrow band of alternative continuous distributions when the distribution tested is uniform or normal. A particular use of Bayesian methods allows consideration of the problem of testing the distribution of latent (unobserved) variables when these are connected by a known relationship to a set of observed variables. The technique is used to advance an interesting procedure introduced in geology by Krumbein and, for a modern example, to test the distribution of the frailty term (random effects) in a Cox proportional hazards (PH) model. The second part of the thesis deals with discrete data, with particular emphasis on applying Cramér-von Mises statistics. Tests are proposed for K samples in an ordered contingency table. Finally, the K-sample procedure is applied to testing the fit of the binary regression model to longitudinal (correlated) data using generalized estimating equations. A common thread throughout the thesis is the use of Cramér-von Mises statistics, or closely related statistics, for testing.

Document type: 
Thesis
File(s): 
Supervisor(s): 
Richard A Lockhart
Michael A Stephens
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Thesis) Ph.D.

Understanding risks in a hybrid pension plan with stochastic rates of return

Author: 
Date created: 
2014-03-03
Abstract: 

The solvency risk, contribution rate risk, and benefit risk of a hybrid pension plan with stochastic investment returns are studied in this project. Gaussian, autoregressive and moving average processes are used to model the rate of return. The first two moments of the funding level, the contribution rate and the benefit payment are presented both at the stationary status and during evolution. Three investment strategies are considered and the risks generated in the hybrid pension plan are compared. Different sets of valuation rates of interest are used to understand the impact of regulative environmental change on the hybrid pension plan. The trade-off between the contribution and benefit risks and the optimum region of risk sharing are discussed to provide an insight of the relationship between plan sponsors and employees under a hybrid pension plan.

Document type: 
Graduating extended essay / Research project
File(s): 
Supervisor(s): 
Yi Lu
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Variations of the linear logarithm hazard transform for modelling cohort mortality rates

Author: 
Date created: 
2014-01-23
Abstract: 

Observing that there is a linear relationship between two sequences of the logarithm of the forces of mortality (hazard rates of the future lifetime) for two years, two variations of the linear logarithm hazard transform (LLHT) model are proposed in this project. We first regress the sequence of the logarithm of the forces of mortality for a cohort in year y on that for a base year. Next, we repeat the same procedure a number of times with y increased by one and the base year unchanged each time, and produce two sequences of slope and intercept parameters which both look linear. Then the simple linear regression and random walk with drift model are applied to each of these two parameter sequences. The fitted parameters can be used to forecast cohort mortality rates. Deterministically and stochastically forecasted cohort mortality rates with the two LLHT-based approaches, and the Lee-Carter and CBD models are presented, and their corresponding forecasted errors and associated confidence intervals are calculated for comparing the forecasting performances. Applications in pricing term life insurance and annuities are also given for illustration.

Document type: 
Graduating extended essay / Research project
File(s): 
Supervisor(s): 
Cary Chi-Liang Tsai
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Predicting Ovarian Cancer Survival Times: Performance of Parametric Methods and Random Survival Forests

Author: 
Date created: 
2014-01-03
Abstract: 

This project is an exploration of the performance of parametric and nonparametric methods in predicting time to recurrence (progression of cancer) and time to death in late-stage ovarian cancer patients. The Weibull survival model is a common parametric method and is fit to the data for both death and recurrence, while Ishwaran et al.’s method of fitting random survival forests (2008) is employed as a nonparametric method. Performance of these models is evaluated using Harrell’s C-index and Lawless & Yuan’s cross-validation estimator (2010).

Document type: 
Graduating extended essay / Research project
File(s): 
Supervisor(s): 
Rachel Altman
Thomas Loughin
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Probabilistic solution of differential equations for Bayesian uncertainty quantification and inference

Date created: 
2013-12-09
Abstract: 

In many areas of applied science the time and space evolution of variables can be naturally described by differential equation models, which define states implicitly as functions of their own rates of change. Inference for differential equation models requires an explicit representation of the states (the solution), which is typically not known in closed form, but can be approximated by a variety of discretization-based numerical methods. However, numerical error analysis is not well-suited for describing functional discretization error in a way that can be propagated through the inverse problem, and is consequently ignored in practice. Because its impact can be substantial, characterizing the effect of discretization uncertainty propagation on inference has been an important open problem. We develop a probability model for the systematic uncertainty introduced by a finite-dimensional representation of the infinite-dimensional solution of ordinary and partial differential equation problems. The result is a probability distribution over the space of possible state trajectories, describing our belief about the unknown solution given information generated from the model over a discrete grid. Our probabilistic approach provides a useful alternative to deterministic numerical integration techniques in cases when models are chaotic, ill-conditioned, or contain unmodelled functional variability. Based on these results, we develop a fully probabilistic Bayesian approach for the statistical inverse problem of inference and prediction for intractable differential equation models from data, which characterizes and propagates discretization uncertainty in the estimation. Our approach is demonstrated on a number of challenging forward and inverse problems.

Document type: 
Thesis
File(s): 
Supervisor(s): 
David Campbell
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Thesis) Ph.D.

Methodology for analyzing at-sea dive behaviour of a marine mammal

Author: 
Date created: 
2013-08-29
Abstract: 

The population of northern fur seals (Callorhinus ursinus) in the Pribilof Islands, Alaska has declined dramatically during the past 35 years. Arresting the decline of the species requires an understanding of their foraging behaviour at sea and is particularly important for those adult females whose foraging success is also linked to pup survival. We propose an augmented state space methodology for studying behavioural patterns using high-resolution movement time series. We show how non-stationary time series models that describe systems whose parameters evolve slowly over time relative to the state dynamics can be estimated at relevant time scales for behavioural inference. This framework allows us to relate the time-varying parameter estimates of an auto-regressive system model to the seal's at-sea behaviour. The at-sea behaviour states of eleven lactating female northern fur seals were then matched, spatially and temporally, to a set of environmental variables, some of which were averages that represented the oceanic conditions over a large spatial area. The mismatch of scale between seal behaviour and the spatial variables was accounted for by applying an error-in-covariate Bayesian hierarchical model. Using this approach, we were able to link together, in a single model, northern fur seals that went to disparate regions of the eastern Bering Sea with widely variable information about their underlying environmental fields. This application of a hierarchical model relates changes in identifiable behavioural states of the northern fur seal to changes in the Alaska commercial groundfish industry over a diurnal foraging cycle. The methodology described in this thesis is adaptable for analyzing any type of high-resolution movement data on marine predators, and will allow for the characterization of other at-sea behaviours as well as other descriptors of pelagic habitat and foraging success.

Document type: 
Thesis
File(s): 
Supervisor(s): 
Richard Routledge
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Thesis) Ph.D.

Classification based on supervised clustering with application to juvenile idiopathic arthritis

Author: 
Date created: 
2013-08-16
Abstract: 

Juvenile Idiopathic Arthritis (JIA) is the most common rheumatic disease of childhood. Our objective is to predict the results of remission so that those children who are likely to experience poor remission outcomes could benefit from early aggressive treatment. Many classification techniques could provide either a binary prediction or an estimated probability of remission. However, parents would like to know more specifically about the remission outcomes of children similar to their own. In this project, we propose a supervised clustering method that provides this information. Inspired by the basic idea of supervised principal component analysis, we perform supervision by selecting and/or weighting explanatory variables differently depending on their associations with the class response. Our supervised clustering method is applied to JIA data and to data simulated with known properties. Our method is shown to be competitive with an existing supervised clustering method, classification trees and random forests in terms of out-of-sample misclassification rates.

Document type: 
Graduating extended essay / Research project
File(s): 
Supervisor(s): 
Thomas M. Loughin
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Modeling Mortality Rates with the Linear Logarithm Hazard Transform Approaches

Author: 
Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2013-06-24
Abstract: 

In this project, two approaches to modeling mortality rates based on the linear logarithm hazard transform (LLHT) are proposed. Empirical observations show that there is a linear relationship between two sequences of the logarithm of the forces of mortality (hazard rates of the future lifetime) for two years. The two estimated parameters of the linear relationship can be used for forecasting mortality rates. Deterministic and stochastic mortality rates are predicted with the LLHT, Lee-Carter and CBD models, and their corresponding forecast errors are calculated to compare forecasting performance. Finally, applications to pricing some mortality-linked securities based on the forecasted mortality rates are presented for illustration.

Document type: 
Graduating extended essay / Research project
File(s): 
Supervisor(s): 
Cary Chi-Liang Tsai
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.