Statistics and Actuarial Science - Theses, Dissertations, and other Required Graduate Degree Essays

Stochastic Modelling and Comparison of Two Pension Plans

Author: 
Date created: 
2017-04-19
Abstract: 

In this project, we simulate the operation of a stylized jointly sponsored pension plan (JSPP) and a stylized defined contribution (DC) plan with identical contribution patterns using a vector autoregressive model for key economic variables. The performance of the two plans is evaluated by comparing the distribution of pension ratios for a specific cohort of new entrants. We find that the DC plan outperforms the JSPP in terms of expected pension ratio, and experiences only a moderate degree of downside risk. This downside risk is not enough to outweigh the upside potential even for a relatively risk-averse member, as reflected in the expected discounted utility of benefits under the two plans. Under more sophisticated rate stabilization techniques, the probability that the DC plan outperforms the JSPP increases. When the bond yield and stock return processes begin from values far above their long-term means (not far below, as is the case today), the DC plan is projected to outperform the JSPP even more frequently, because the higher required contributions accrue to the advantage of the individual member only, instead of also financing benefits for others.

Document type: 
Graduating extended essay / Research project
File(s): 
Senior supervisor: 
Barbara Sanders
Gary Parker
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Analysis of the Bitcoin Exchange using Particle MCMC Methods

Author: 
Date created: 
2017-03-24
Abstract: 

Stochastic volatility models (SVM) are commonly used to model time series data. They have many applications in finance and are useful tools to describe the evolution of asset returns. The motivation for this project is to determine if stochastic volatility models can be used to model Bitcoin exchange rates in a way that can contribute to an effective trading strategy. We consider a basic SVM and several extensions that include fat tails, leverage, and covariate effects. The Bayesian approach with the particle Markov chain Monte Carlo (PMCMC) method is employed to estimate the model parameters. We assess the goodness of the estimated model using the deviance information criterion (DIC). Simulation studies are conducted to assess the performance of particle MCMC and to compare with the traditional MCMC approach. We then apply the proposed method to the Bitcoin exchange rate data and compare the effectiveness of each type of SVM.
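
For orientation, a basic stochastic volatility model is often written in the following textbook form. This is a sketch of one common parameterization, not necessarily the one used in the project; the symbols $y_t$, $h_t$, $\mu$, $\phi$ and $\sigma_\eta$ are assumptions introduced here.

```latex
\begin{aligned}
y_t &= \exp(h_t / 2)\,\varepsilon_t, & \varepsilon_t &\sim N(0, 1), \\
h_t &= \mu + \phi\,(h_{t-1} - \mu) + \sigma_\eta\,\eta_t, & \eta_t &\sim N(0, 1),
\end{aligned}
```

Here $y_t$ is the observed return and $h_t$ is the latent log-variance. In this notation, a fat-tailed extension typically replaces the normal $\varepsilon_t$ with a Student-$t$ innovation, and a leverage extension allows $\varepsilon_t$ and $\eta_t$ to be correlated.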

Document type: 
Graduating extended essay / Research project
File(s): 
Senior supervisor: 
Liangliang Wang
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Bayesian methods for multi-modal posterior topologies

Date created: 
2017-04-18
Abstract: 

The purpose of this thesis is to develop efficient Bayesian methods to address multi-modality in posterior topologies. In Chapter 2 we develop a new general Bayesian methodology that simultaneously estimates the parameters of interest and the probability of the model. The proposed methodology builds on the Simulated Tempering algorithm, which is a powerful sampling algorithm that handles multi-modal distributions but is difficult to use in practice because it requires choosing a suitable prior for the temperature and a temperature schedule. Our proposed algorithm removes this requirement, while preserving the sampling efficiency of the Simulated Tempering algorithm. We illustrate the applicability of the new algorithm to different examples involving mixture models of Gaussian distributions and ordinary differential equation models. Chapter 3 proposes a general optimization strategy, which combines results from different optimization or parameter estimation methods to overcome shortcomings of a single method. Embedding the proposed optimization strategy in the Incremental Mixture Importance Sampling with Optimization algorithm (IMIS-Opt) significantly improves sampling efficiency and removes the dependence on the choice of the prior of the IMIS-Opt. We demonstrate that the resulting algorithm provides accurate parameter estimates, while the IMIS-Opt gets trapped in a local mode in the case of the ordinary differential equation (ODE) models. Finally, the resulting algorithm is implemented within the Approximate Bayesian Computation framework to draw likelihood-free inference. Chapter 4 introduces a generalization of the Bayesian Information Criterion (BIC) that handles multi-modality in the posterior space. The BIC is a computationally efficient model selection tool, but it relies on the assumption that the posterior distribution is unimodal. When the posterior is multi-modal, the BIC uses only one posterior mode, while discarding the information from the rest of the modes. We demonstrate that the BIC produces inaccurate estimates of the posterior probability of the bimodal model, which in some cases results in the BIC selecting the sub-optimal model. As a remedy, we propose a Multi-modal BIC (MBIC) that incorporates all relevant posterior modes, while preserving the computational efficiency of the BIC. The accuracy of the MBIC is demonstrated through bimodal models and mixture models of Gaussian distributions.
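
As background for the single-mode limitation described above, the standard BIC and the marginal-likelihood approximation it corresponds to can be written as follows. This is the usual textbook definition, not the MBIC proposed in the thesis; the symbols $k$, $n$ and $\hat{L}$ are notation assumed here.

```latex
\mathrm{BIC} = k \ln n - 2 \ln \hat{L},
\qquad
\ln p(y \mid M) \approx \ln \hat{L} - \frac{k}{2} \ln n = -\tfrac{1}{2}\,\mathrm{BIC},
```

where $\hat{L} = p(y \mid \hat{\theta}, M)$ is the likelihood maximized over the parameters of model $M$, $k$ is the number of free parameters, and $n$ is the sample size. The approximation comes from a Laplace expansion around a single mode $\hat{\theta}$, which is why mass near other posterior modes is not reflected in it.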

Document type: 
Thesis
Senior supervisor: 
Dr. David Campbell
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Thesis) Ph.D.

Marginal Loglinear Models for Three Multiple-Response Categorical Variables

Author: 
Date created: 
2016-12-09
Abstract: 

Many survey questions include a phrase like “Choose all that apply”, which lets respondents choose any number of options from predefined lists of items. Responses to these questions result in multiple-response categorical variables (MRCVs). This thesis focuses on analyzing and modeling three MRCVs. There are 232 possible models representing different combinations of associations. Parameters are estimated using generalized estimating equations generated by a pseudo-likelihood, and variances of the estimators are corrected using sandwich methods. Due to the large number of possible models, model comparisons based on nested models would be inappropriate. As an alternative, model averaging is proposed as a model comparison tool as well as a way to account for model selection uncertainty. Further, the calculations required for computing the variance of the estimators can exceed 32-bit machine capacity even for a moderately large number of items. This issue is addressed by reducing the dimensions of the matrices.

Document type: 
Graduating extended essay / Research project
File(s): 
Senior supervisor: 
Thomas Loughin
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

A Shot Quality Adjusted Plus-Minus for the NHL

Date created: 
2016-12-19
Abstract: 

We explore two regression models for creating an adjusted plus-minus statistic for the NHL. We compare an OLS regression model and a penalized gamma-lasso regression model. The traditional plus-minus metric is a simple marginal statistic that allocates a +1 to players for scoring a goal and a -1 for allowing a goal according to whether they were on the ice. This is a very noisy and uninformative statistic since it does not take into account the quality of the other players on the ice with an individual. We build on previous research to create a more informative statistic that takes into account all of the players on the ice. This previous research has focused on goals to build an adjusted plus-minus, which is information deficient because only approximately 5 goals are scored per game. We improve upon this by instead using shots, which provide us with ten times as much information per game. We use shot location data from 2007 to 2013 to create a smoothed probability map for the probability of scoring a goal from all locations in the offensive zone. We then model the shots from the 2014-2015 season to get player estimates. Two models are compared, an OLS regression and a penalized regression (lasso). Finally, we compare our adjusted plus-minus to the traditional plus-minus and complete a salary analysis to determine if teams are properly valuing players for the quality of shots they are taking and allowing.
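
The traditional plus-minus described above can be illustrated with a minimal sketch. The goal events and player labels below are hypothetical and are not data from the project; the function `plus_minus` is a name introduced here for illustration.

```python
from collections import defaultdict


def plus_minus(goals):
    """Compute traditional plus-minus from goal events.

    Each event is a pair (players_on_ice_for, players_on_ice_against):
    everyone on the ice for the scoring team gets +1, and everyone on
    the ice for the team that allowed the goal gets -1.
    """
    totals = defaultdict(int)
    for on_ice_for, on_ice_against in goals:
        for player in on_ice_for:
            totals[player] += 1
        for player in on_ice_against:
            totals[player] -= 1
    return dict(totals)


# Hypothetical goal events (illustrative only).
goals = [
    (["A", "B"], ["X", "Y"]),
    (["A", "C"], ["X", "Z"]),
    (["X", "Y"], ["B", "C"]),
]
ratings = plus_minus(goals)
print(ratings)
```

Note that each player's total depends only on who happened to be on the ice at the time of a goal, which is the source of the noise the abstract refers to.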

Document type: 
Graduating extended essay / Research project
File(s): 
Senior supervisor: 
Dr. Tim Swartz
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

IBNR Claims Reserving Using INAR Processes

Author: 
Date created: 
2016-12-15
Abstract: 

This project studies the reserving problem for incurred but not reported (IBNR) claims in non-life insurance. Based on an idea presented in Kremer (1995), we propose a new Poisson INAR (integer-valued autoregressive) model for the unclosed claim counts, that is, the numbers of claims that have been reported but not sufficiently reported. The properties and the prediction of the proposed Poisson INAR model are discussed. We modify the estimation methods proposed in Silva et al. (2005) for replicated INAR(1) processes so that they can be applied to our model, and we introduce new algorithms for estimating the model parameters. The performance of the three estimation methods used in this project is compared, and the impact of the sample size on the accuracy of the estimates is examined in a simulation study. To illustrate, we also present the prediction results of our proposed model using a generated sample.
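
To make the INAR idea concrete, the sketch below simulates the standard Poisson INAR(1) process, X_t = alpha ∘ X_{t-1} + e_t, where "∘" denotes binomial thinning and e_t is Poisson with mean lam. This is the textbook process, not the new model proposed in the project; the parameter values and function names are assumptions chosen for illustration.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def binomial_thinning(x, alpha, rng):
    """Each of the x existing counts survives independently with prob. alpha."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def simulate_inar1(n, alpha, lam, x0, seed=0):
    """Simulate n values of a Poisson INAR(1) process starting from x0."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n - 1):
        survivors = binomial_thinning(path[-1], alpha, rng)
        arrivals = poisson(lam, rng)
        path.append(survivors + arrivals)
    return path


# Illustrative parameters; the stationary mean is lam / (1 - alpha) = 4.
path = simulate_inar1(n=2000, alpha=0.5, lam=2.0, x0=4)
sample_mean = sum(path) / len(path)
print(sample_mean)
```

The thinning step keeps the process integer-valued, which is what distinguishes INAR models from ordinary autoregressions applied to counts.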

Document type: 
Graduating extended essay / Research project
File(s): 
Senior supervisor: 
Yi Lu
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Sports analytics

Date created: 
2016-12-08
Abstract: 

This thesis consists of a compilation of four research papers. Chapter 2 investigates the powerplay in one-day cricket. The form of the analysis takes a “what if” approach where powerplay outcomes are substituted with what might have happened had there been no powerplay. This leads to a paired comparisons setting consisting of actual matches and hypothetical parallel matches where outcomes are imputed during the powerplay period. We also investigate individual batsmen and bowlers and their performances during the powerplay. Chapter 3 considers the problem of determining optimal substitution times in soccer. An analysis is presented based on Bayesian logistic regression. We find that with evenly matched teams, there is a goal scoring advantage to the trailing team during the second half of a match. We observe that there is no discernible time during the second half when there is a benefit due to substitution. Chapter 4 explores two avenues for the modification of tactics in Twenty20 cricket. The first idea is based on the realization that wickets are of less importance in Twenty20 cricket than in other formats of cricket (e.g. one-day cricket and Test cricket). The second idea may be applicable when there exists a sizeable mismatch between two competing teams. In this case, the weaker team may be able to improve its win probability by increasing the variance of run differential. A specific variance inflation technique which we consider is increased aggressiveness in batting. Chapter 5 explores new definitions for pace of play in ice hockey. Using detailed event data from the 2015-2016 regular season of the National Hockey League (NHL), the distance of puck movement with possession is the proposed criterion in determining the pace of a game. Although intuitive, this notion of pace does not correlate with expected and familiar quantities such as goals scored and shots taken.

Document type: 
Thesis
File(s): 
Senior supervisor: 
Tim Swartz
Department: 
Science: Statistics and Actuarial Science
Thesis type: 
(Dissertation) Ph.D.

Pricing Defaultable Catastrophe Bonds with Compound Doubly Stochastic Poisson Losses and Liquidity Risk

Date created: 
2016-12-15
Abstract: 

The catastrophe bond (CAT bond) is a modern financial instrument for transferring the risk of natural disasters to capital markets. In this project, we provide a structure of payoffs for a zero-coupon CAT bond in which the premature default of the issuer is also considered. The defaultable CAT bond price is computed by Monte Carlo simulations under the Vasicek interest rate model with losses generated from a compound doubly stochastic Poisson process. In the underlying Poisson process, the intensity of occurrence is assumed to follow a geometric Brownian motion. Moreover, the issuer’s daily total asset value is modelled by the approach proposed in Duan et al. (1995), and the liquidity process is incorporated to capture the additional return of investors. Finally, a sensitivity analysis is carried out to explore the effects of key parameters on the CAT bond price.

Document type: 
Graduating extended essay / Research project
File(s): 
Senior supervisor: 
Cary Chi-Liang Tsai
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Sparse Multivariate Reduced-Rank Regression with Covariance Estimation

Author: 
Date created: 
2016-12-14
Abstract: 

Multivariate multiple linear regression is multiple linear regression, but with multiple responses. Standard approaches assume that observations from different subjects are uncorrelated and so estimates of the regression parameters can be obtained through separate univariate regressions, regardless of whether the responses are correlated within subjects. There are three main extensions to the simplest model. The first assumes a low rank structure on the coefficient matrix that arises from a latent factor model linking predictors to responses. The second reduces the number of parameters through variable selection. The third allows for correlations between response variables in the low rank model. Chen and Huang propose a new model that falls under the reduced-rank regression framework, employs variable selection, and estimates correlations among error terms. This project reviews their model, describes its implementation, and reports the results of a simulation study evaluating its performance. The project concludes with ideas for further research.

Document type: 
Graduating extended essay / Research project
File(s): 
Senior supervisor: 
Jinko Graham
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

A multi-state model for a life insurance product with integrated health rewards program

Author: 
Date created: 
2016-12-02
Abstract: 

With the prevalence of chronic diseases that account for a significant portion of deaths, a new approach to life insurance has emerged to address this issue. The new approach integrates health rewards programs with life insurance products; the insureds are classified into fitness statuses according to their level of participation and receive premium reductions at the superior statuses. We introduce a Markov chain process to model the dynamic transitions between the fitness statuses, which are linked to corresponding levels of mortality risk reduction. We then embed this transition process into a stochastic multi-state model to describe the new life insurance product. Formulas are given for calculating its benefit, premium, reserve and surplus. These results are compared with those of traditional life insurance. Numerical examples are given for illustration.
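
The Markov chain for fitness statuses can be sketched as follows. The three statuses and the transition probabilities below are hypothetical values chosen only for illustration; they are not taken from the project.

```python
def step(dist, P):
    """Advance a status distribution by one period: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


# Hypothetical statuses, ordered from lowest to highest participation.
statuses = ["bronze", "silver", "gold"]

# Hypothetical one-period transition matrix; each row sums to 1.
P = [
    [0.70, 0.25, 0.05],
    [0.20, 0.60, 0.20],
    [0.05, 0.25, 0.70],
]

# Every insured starts in the lowest status.
dist = [1.0, 0.0, 0.0]
for _ in range(5):
    dist = step(dist, P)

for name, prob in zip(statuses, dist):
    print(f"{name}: {prob:.4f}")
```

In a multi-state insurance model, status probabilities of this kind would be combined with status-dependent mortality to value benefits and premiums; that step is not shown here.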

Document type: 
Graduating extended essay / Research project
File(s): 
Senior supervisor: 
Yi Lu
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.