Statistics and Actuarial Science - Theses, Dissertations, and other Required Graduate Degree Essays


Bayesian Sensitivity Analysis for Non-ignorable Missing Data in Longitudinal Studies

Date created: 
2017-04-13
Abstract: 

The use of Bayesian statistical methods to handle missing data in biomedical studies has become popular in recent years. In this thesis, we propose a novel Bayesian sensitivity analysis (BSA) model that accounts for the influence of missing outcome data on the estimation of treatment effects in randomized controlled trials with non-ignorable missing data. We implement the method in the probabilistic programming language Stan and apply it to data from the Vancouver At Home (VAH) Study, a randomized controlled trial that provided housing to homeless people with mental illness. We compare the results of BSA to those from an existing Bayesian longitudinal model that ignores missingness in the outcome. Furthermore, we demonstrate in a simulation study that, when a diffuse conservative prior describing a range of assumptions about the bias effect is used, BSA credible intervals are wider and cover the target parameters at a higher rate than those of existing methods, and that sensitivity increases as the percentage of missingness increases.
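
As a rough illustration of the idea (not the Stan model developed in the thesis), the Python sketch below adjusts a complete-case treatment effect estimate with a bias parameter drawn from a diffuse prior; all data and parameter values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical illustration (not the thesis's model): adjust a complete-case
# treatment effect estimate with a bias parameter whose diffuse prior encodes
# assumptions about non-ignorable missingness.
n = 500
treat = rng.integers(0, 2, n)
y = 1.0 * treat + rng.normal(0.0, 1.0, n)               # true effect = 1.0
drop = rng.random(n) < np.where(treat == 1, 0.3, 0.1)   # more dropout if treated
y_obs, t_obs = y[~drop], treat[~drop]

# Complete-case estimate of the treatment effect and its standard error.
n1, n0 = (t_obs == 1).sum(), (t_obs == 0).sum()
est = y_obs[t_obs == 1].mean() - y_obs[t_obs == 0].mean()
se = np.sqrt(y_obs[t_obs == 1].var(ddof=1) / n1
             + y_obs[t_obs == 0].var(ddof=1) / n0)

# Sensitivity analysis: draw the bias effect from a diffuse prior and
# propagate it through the sampling uncertainty of the estimate.
delta = rng.normal(0.0, 0.5, 10_000)            # diffuse prior on the bias
adjusted = rng.normal(est, se, 10_000) - delta
lo, hi = np.percentile(adjusted, [2.5, 97.5])
naive_lo, naive_hi = est - 1.96 * se, est + 1.96 * se
print((naive_lo, naive_hi), (lo, hi))
```

Because the bias prior adds uncertainty that the observed data cannot reduce, the adjusted interval is wider than the naive one, mirroring the behaviour the abstract reports for BSA credible intervals.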

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Lawrence McCandless
Joan Hu
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Delta Hedging for Single Premium Segregated Fund

Date created: 
2017-03-31
Abstract: 

Segregated funds are individual insurance contracts that offer the growth potential of investment in underlying assets while providing a guarantee that protects part of the money invested. The guarantee can cause significant losses to the insurer, which makes it essential for the insurer to hedge this risk. In this project, we discuss the effectiveness of delta hedging by studying the distribution of hedging errors under different assumptions about the return on the underlying assets. We consider a geometric Brownian motion model and a regime-switching lognormal model for equity returns, and compare hedging effectiveness when risk-free rates are constant or stochastic. Two one-factor short-rate models, the Vasicek and CIR models, are used to model the risk-free rate. We find that delta hedging is in general effective, but large hedging errors can occur when the assumptions of the Black-Scholes framework are violated.
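
As a rough illustration of the kind of experiment described (not the project's actual setup), the sketch below delta-hedges a short European put, a simple proxy for a maturity guarantee, along simulated GBM paths with daily rebalancing; all parameter values are hypothetical:

```python
import math

import numpy as np

def norm_cdf(x):
    """Standard normal CDF via math.erf (avoids a SciPy dependency)."""
    return 0.5 * (1.0 + np.array([math.erf(v / math.sqrt(2.0))
                                  for v in np.atleast_1d(x)]))

# Hypothetical parameters, for illustration only.
S0, K, r, sigma, T = 100.0, 100.0, 0.03, 0.2, 1.0
steps, n_paths = 252, 2000
dt = T / steps
rng = np.random.default_rng(0)

def put_delta(S, tau):
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(d1) - 1.0          # Black-Scholes delta of a European put

def put_price(S, tau):
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return K * math.exp(-r * tau) * norm_cdf(-d2) - S * norm_cdf(-d1)

# Delta-hedge a short put along GBM paths with daily rebalancing.
S = np.full(n_paths, S0)
shares = put_delta(S, T)                    # stock position of the hedge
cash = put_price(S0, T)[0] - shares * S     # premium received minus stock cost
for k in range(steps):
    S = S * np.exp((r - 0.5 * sigma**2) * dt
                   + sigma * math.sqrt(dt) * rng.standard_normal(n_paths))
    tau = T - (k + 1) * dt
    if tau > 1e-12:
        new_shares = put_delta(S, tau)
    else:                                   # at expiry: payoff delta
        new_shares = np.where(S < K, -1.0, 0.0)
    cash = cash * math.exp(r * dt) - (new_shares - shares) * S
    shares = new_shares

hedging_error = cash + shares * S - np.maximum(K - S, 0.0)
print(hedging_error.mean(), hedging_error.std())
```

Under the Black-Scholes assumptions the hedging error averages out to roughly zero with daily rebalancing; fat tails, regime switches, or stochastic rates in the return process would inflate it, which is the effect the project studies.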

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Gary Parker
Barbara Sanders
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Analysis of Target Benefit Plans with Aggregate Cost Method

Date created: 
2017-04-06
Abstract: 

The operational characteristics of a target benefit plan based on an aggregate pension cost method are studied through simulation under a multivariate time series model for projected interest rates and equity returns. The performance of the target benefit plan is evaluated by applying a variety of performance metrics for benefit security, benefit adequacy, benefit stability and intergenerational equity. Performance is shown to improve when the economy remains relatively stable over time and when the choice of valuation rate does not create persistent gains or losses.

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Gary Parker
Barbara Sanders
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Predictive Estimation in Canadian Federal Elections

Date created: 
2017-04-20
Abstract: 

Various estimation methods are employed to provide seat projections during Canadian federal elections. This project explores discrepancies between the actual outcomes of recent Canadian federal elections and the predictions of existing approaches, such as those proposed by Grenier and Rosenthal. Each seat projection procedure relies on a set of assumptions, but these assumptions are not explicitly listed in the accessible references. We formulate the assumptions required by the two prediction procedures proposed by Rosenthal and present variance estimation procedures. Departures from the assumptions are explored with real data from the 2006, 2008, 2011, and 2015 federal elections. An extensive simulation study is conducted to examine the potential impact of various deviations from the assumptions. The simulation indicates that, compared to other assumption violations, misleading polling results may cause the most damage to the prediction. In addition, we find that the prediction is least affected by changes in the number of voters, and that heterogeneity of riding patterns within a region may not affect the prediction at the national level.

Document type: 
Graduating extended essay / Research project
File(s): 
Senior supervisor: 
Joan Hu
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Stochastic Modelling and Comparison of Two Pension Plans

Date created: 
2017-04-19
Abstract: 

In this project, we simulate the operation of a stylized jointly sponsored pension plan (JSPP) and a stylized defined contribution (DC) plan with identical contribution patterns using a vector autoregressive model for key economic variables. The performance of the two plans is evaluated by comparing the distribution of pension ratios for a specific cohort of new entrants. We find that the DC plan outperforms the JSPP in terms of expected pension ratio, and experiences only a moderate degree of downside risk. This downside risk is not enough to outweigh the upside potential even for a relatively risk-averse member, as reflected in the expected discounted utility of benefits under the two plans. Under more sophisticated rate stabilization techniques, the probability that the DC plan outperforms the JSPP increases. When the bond yield and stock return processes begin from values far above their long-term means (not far below, as is the case today), the DC plan is projected to outperform the JSPP even more frequently, because the higher required contributions accrue to the advantage of the individual member only, instead of also financing benefits for others.

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Barbara Sanders
Gary Parker
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Analysis of the Bitcoin Exchange using Particle MCMC Methods

Date created: 
2017-03-24
Abstract: 

Stochastic volatility models (SVMs) are commonly used to model time series data. They have many applications in finance and are useful tools for describing the evolution of asset returns. The motivation for this project is to determine whether stochastic volatility models can be used to model Bitcoin exchange rates in a way that contributes to an effective trading strategy. We consider a basic SVM and several extensions that include fat tails, leverage, and covariate effects. A Bayesian approach with the particle Markov chain Monte Carlo (PMCMC) method is employed to estimate the model parameters. We assess the fit of the estimated models using the deviance information criterion (DIC). Simulation studies are conducted to assess the performance of particle MCMC and to compare it with the traditional MCMC approach. We then apply the proposed method to the Bitcoin exchange rate data and compare the effectiveness of each type of SVM.
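
A minimal sketch of the basic SVM and a bootstrap particle filter, the building block of PMCMC, might look as follows (all parameter values are illustrative, and this is not the project's implementation):

```python
import numpy as np

rng = np.random.default_rng(1)

# A basic stochastic volatility model (parameter values are illustrative):
#   h_t = mu + phi * (h_{t-1} - mu) + sigma_eta * eta_t,  eta_t ~ N(0, 1)
#   y_t = exp(h_t / 2) * eps_t,                           eps_t ~ N(0, 1)
mu, phi, sigma_eta, T = -1.0, 0.95, 0.3, 300
h = np.empty(T)
h[0] = mu + sigma_eta / np.sqrt(1 - phi**2) * rng.standard_normal()
for t in range(1, T):
    h[t] = mu + phi * (h[t - 1] - mu) + sigma_eta * rng.standard_normal()
y = np.exp(h / 2) * rng.standard_normal(T)

def pf_loglik(y, mu, phi, sigma_eta, n_particles=500):
    """Bootstrap particle filter estimate of the log-likelihood of an SVM."""
    x = mu + sigma_eta / np.sqrt(1 - phi**2) * rng.standard_normal(n_particles)
    ll = 0.0
    for obs in y:
        # Propagate particles through the latent AR(1) volatility process.
        x = mu + phi * (x - mu) + sigma_eta * rng.standard_normal(n_particles)
        var = np.exp(x)                          # conditional variance of y_t
        w = np.exp(-0.5 * obs**2 / var) / np.sqrt(2 * np.pi * var)
        ll += np.log(w.mean())
        x = rng.choice(x, size=n_particles, p=w / w.sum())  # resample
    return ll

print(pf_loglik(y, mu, phi, sigma_eta))
```

In PMCMC, this noisy log-likelihood estimate replaces the intractable exact likelihood inside a Metropolis-Hastings acceptance ratio, which is what makes parameter estimation for SVMs feasible.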

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Liangliang Wang
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Bayesian methods for multi-modal posterior topologies

Date created: 
2017-04-18
Abstract: 

The purpose of this thesis is to develop efficient Bayesian methods to address multi-modality in posterior topologies. In Chapter 2 we develop a new general Bayesian methodology that simultaneously estimates the parameters of interest and the probability of the model. The proposed methodology builds on the Simulated Tempering algorithm, a powerful sampler for multi-modal distributions that is nevertheless difficult to use in practice because it requires choosing a suitable prior for the temperature and a temperature schedule. Our proposed algorithm removes this requirement while preserving the sampling efficiency of the Simulated Tempering algorithm. We illustrate the applicability of the new algorithm on examples involving mixtures of Gaussian distributions and ordinary differential equation (ODE) models. Chapter 3 proposes a general optimization strategy that combines results from different optimization or parameter estimation methods to overcome the shortcomings of any single method. Embedding the proposed strategy in the Incremental Mixture Importance Sampling with Optimization (IMIS-Opt) algorithm significantly improves sampling efficiency and removes IMIS-Opt's dependence on the choice of prior. We demonstrate that the resulting algorithm provides accurate parameter estimates in cases where IMIS-Opt gets trapped in a local mode of an ODE model. Finally, the resulting algorithm is implemented within the Approximate Bayesian Computation framework to draw likelihood-free inference. Chapter 4 introduces a generalization of the Bayesian Information Criterion (BIC) that handles multi-modality in the posterior space. The BIC is a computationally efficient model selection tool, but it relies on the assumption that the posterior distribution is unimodal. When the posterior is multi-modal, the BIC uses only one posterior mode and discards the information in the remaining modes. We demonstrate that the BIC produces inaccurate estimates of the posterior probability of a bimodal model, which in some cases leads it to select a sub-optimal model. As a remedy, we propose a Multi-modal BIC (MBIC) that incorporates all relevant posterior modes while preserving the computational efficiency of the BIC. The accuracy of the MBIC is demonstrated on bimodal models and mixtures of Gaussian distributions.
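
A toy version of Simulated Tempering on a bimodal target illustrates both its power and the tuning problem the thesis addresses; the temperature ladder and the uniform pseudo-prior below are arbitrary choices for illustration, not the thesis's method:

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy target: a well-separated two-component Gaussian mixture (unnormalized).
def log_target(x):
    return np.logaddexp(-0.5 * ((x + 3) / 0.5) ** 2,
                        -0.5 * ((x - 3) / 0.5) ** 2)

betas = np.array([1.0, 0.5, 0.2, 0.05])   # illustrative temperature ladder
x, k = -3.0, 0
samples = []
for it in range(50_000):
    # Metropolis update of x at the current temperature beta_k.
    prop = x + rng.normal(0.0, 1.0)
    if np.log(rng.random()) < betas[k] * (log_target(prop) - log_target(x)):
        x = prop
    # Random-walk proposal on the temperature index, with a uniform
    # pseudo-prior over the unnormalized target -- tuning this pseudo-prior
    # and the ladder is exactly the practical difficulty noted above.
    j = k + rng.choice([-1, 1])
    if 0 <= j < len(betas):
        if np.log(rng.random()) < (betas[j] - betas[k]) * log_target(x):
            k = j
    if k == 0:                             # keep only cold-chain samples
        samples.append(x)

samples = np.array(samples)
print((samples < 0).mean(), (samples > 0).mean())
```

At high temperatures the flattened target lets the chain cross the valley between the modes, so the cold-chain samples cover both modes, whereas a plain Metropolis sampler with the same step size would typically stay trapped in one.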

Document type: 
Thesis
Senior supervisor: 
David Campbell
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Thesis) Ph.D.

Marginal Loglinear Models for Three Multiple-Response Categorical Variables

Date created: 
2016-12-09
Abstract: 

Many survey questions include a phrase like “Choose all that apply”, which lets respondents choose any number of options from a predefined list of items. Responses to these questions result in multiple-response categorical variables (MRCVs). This thesis focuses on analyzing and modeling three MRCVs. There are 232 possible models representing different combinations of associations. Parameters are estimated using generalized estimating equations generated by a pseudo-likelihood, and the variances of the estimators are corrected using sandwich methods. Due to the large number of possible models, model comparisons based on nested models would be inappropriate. As an alternative, model averaging is proposed as a model comparison tool and as a way to account for model selection uncertainty. Further, the calculations required to compute the variance of the estimators can exceed 32-bit machine capacity even for a moderately large number of items. This issue is addressed by reducing the dimensions of the matrices involved.

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Thomas Loughin
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

A Shot Quality Adjusted Plus-Minus for the NHL

Date created: 
2016-12-19
Abstract: 

We explore two regression models for creating an adjusted plus-minus statistic for the NHL: an OLS regression model and a penalized gamma-lasso regression model. The traditional plus-minus metric is a simple marginal statistic that allocates +1 to every player on the ice when their team scores a goal and -1 when their team allows one. This is a very noisy and uninformative statistic, since it does not account for the quality of the other players on the ice with an individual. We build on previous research to create a more informative statistic that accounts for all of the players on the ice. Previous research has focused on goals to build an adjusted plus-minus, which is information-poor because only about five goals are scored per game. We improve upon this by instead using shots, which provide roughly ten times as much information per game. We use shot location data from 2007 to 2013 to create a smoothed map of the probability of scoring a goal from each location in the offensive zone, and then model the shots from the 2014-2015 season to obtain player estimates. Finally, we compare our adjusted plus-minus to the traditional plus-minus and conduct a salary analysis to determine whether teams properly value players for the quality of the shots they take and allow.
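
The +1/-1 allocation of the traditional plus-minus can be computed directly from goal events; the toy data below are invented purely to show the bookkeeping:

```python
from collections import defaultdict

# Toy goal events (hypothetical data): each event records the scoring team and
# the players on the ice for both teams when the goal was scored.
events = [
    {"team": "A", "on_ice": {"A": ["a1", "a2"], "B": ["b1", "b2"]}},
    {"team": "B", "on_ice": {"A": ["a1", "a3"], "B": ["b1", "b3"]}},
    {"team": "A", "on_ice": {"A": ["a2", "a3"], "B": ["b2", "b3"]}},
]

# Traditional plus-minus: +1 to every skater on the ice for the scoring team,
# -1 to every skater on the ice for the team scored against.
plus_minus = defaultdict(int)
for ev in events:
    for team, players in ev["on_ice"].items():
        sign = 1 if team == ev["team"] else -1
        for p in players:
            plus_minus[p] += sign

print(dict(plus_minus))
# a2 is on for two goals scored (+2); b2 is on for two goals allowed (-2).
```

An adjusted plus-minus replaces this marginal tally with a regression in which each event is a row, each player an indicator column, and the response is the (shot-quality-weighted) outcome, so teammate and opponent quality is controlled for.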

Document type: 
Graduating extended essay / Research project
File(s): 
Senior supervisor: 
Tim Swartz
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

IBNR Claims Reserving Using INAR Processes

Date created: 
2016-12-15
Abstract: 

This project studies the reserving problem for incurred but not reported (IBNR) claims in non-life insurance. Based on an idea presented in Kremer (1995), we propose a new Poisson INAR (integer-valued autoregressive) model for the unclosed claim counts, i.e., the numbers of claims that have been reported but not yet closed. The properties and the prediction of the proposed Poisson INAR model are discussed. We modify the estimation methods proposed in Silva et al. (2005) for replicated INAR(1) processes so that they apply to our model, and introduce new algorithms for estimating the model parameters. The performance of the three estimation methods used in this project is compared, and the impact of the sample size on the accuracy of the estimates is examined in a simulation study. To illustrate, we also present the prediction results of our proposed model using a generated sample.
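
A minimal simulation of a Poisson INAR(1) process with binomial thinning, together with simple moment-based estimates (not the estimation algorithms developed in the project), might look like:

```python
import numpy as np

rng = np.random.default_rng(3)

# Poisson INAR(1): X_t = alpha o X_{t-1} + eps_t, where "o" denotes binomial
# thinning and eps_t ~ Poisson(lam). Parameter values are illustrative.
alpha, lam, T = 0.6, 2.0, 20_000
x = np.empty(T, dtype=int)
x[0] = rng.poisson(lam / (1 - alpha))          # start near the stationary mean
for t in range(1, T):
    survivors = rng.binomial(x[t - 1], alpha)  # binomial thinning of last count
    x[t] = survivors + rng.poisson(lam)        # plus new arrivals

# Moment-based estimates: alpha is the lag-1 autocorrelation, and the
# stationary mean lam / (1 - alpha) then identifies lam.
alpha_hat = np.corrcoef(x[:-1], x[1:])[0, 1]
lam_hat = x.mean() * (1 - alpha_hat)
print(alpha_hat, lam_hat)
```

Thinning gives the model its integer-valued interpretation for claim counts: each open claim survives to the next period independently with probability alpha, while new unclosed claims arrive as a Poisson stream.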

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Yi Lu
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.