Statistics and Actuarial Science - Theses, Dissertations, and other Required Graduate Degree Essays

Distributions of Time to First Spot Fire

Date created: 
2017-08-15
Abstract: 

In wildfire management, a spot fire is the result of an airborne ember igniting a separate fire away from the main wildfire. Under certain environmental and wildfire conditions, a burning ember can breach a fuel break, such as a river or road, and produce a spot fire. This project derives distributions of the time to the first spot fire in various situations and verifies them by simulation. To demonstrate the implementation of the distributions in practice, we incorporate a stochastic fire spread model. This research assesses the likelihood of a spot fire occurring past a fuel break, taking into account both spotting distance and spotting rate. This contrasts with the traditional approach, which considers only the maximal spotting distance, and can serve as a tool for fire management.
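In the simplest setting of this kind, embers are launched according to a homogeneous Poisson process and each one independently breaches the fuel break, so thinning the process makes the time to the first spot fire exponential. A minimal simulation sketch under that assumption (the rate and breach probability below are hypothetical illustrations, not the project's fire spread model):

```python
import random

def time_to_first_spot_fire(rate, p, rng):
    """Simulate ember launches as a Poisson process with intensity `rate`;
    each ember independently breaches the fuel break with probability `p`.
    Returns the arrival time of the first breaching ember."""
    t = 0.0
    while True:
        t += rng.expovariate(rate)   # inter-arrival time of the next ember
        if rng.random() < p:         # this ember lands past the fuel break
            return t

rng = random.Random(42)
rate, p = 2.0, 0.1                   # hypothetical spotting rate and breach probability
times = [time_to_first_spot_fire(rate, p, rng) for _ in range(20000)]
mean_time = sum(times) / len(times)
# Thinning a Poisson process gives an Exponential(rate * p) first-breach time,
# so the sample mean should be close to 1 / (rate * p) = 5.
```

This matches the verification-by-simulation idea in the abstract: the empirical distribution of simulated first-breach times can be compared against the derived exponential law.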

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Joan Hu
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Mendelian randomization for causal inference of the relationship between obesity and 28-day survival following septic shock

Date created: 
2017-08-10
Abstract: 

Septic shock is a leading cause of death in intensive care units. Septic shock occurs when a body-wide infection leads to low blood pressure, and ultimately organ failure. Some recent studies suggest that overweight and obese patients have a better chance of survival following septic shock than normal or underweight patients. In this project we apply Mendelian randomization to assess whether the observed obesity effect on 28-day survival following septic shock is causal or more likely due to unmeasured confounding variables. Mendelian randomization is an instrumental variables approach that uses genetic markers as instruments. Under modelling assumptions, unconfounded estimates of the obesity effect can be obtained by fitting a model for 28-day survival that includes a residual obesity term. Data for the project come from the Vasopressin and Septic Shock Trial (VASST). Our analysis suggests that the observed obesity effect on survival following septic shock is not causal.
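As a toy illustration of the instrumental-variables idea (not the VASST analysis, and using a simple Wald-ratio estimator on a continuous outcome rather than the residual-inclusion survival model described above), simulated data in which the exposure-outcome association is entirely due to an unmeasured confounder show the naive regression slope is biased while the genetic instrument recovers the null:

```python
import random

def cov(a, b):
    """Sample covariance (divisor n; only ratios of covariances are used)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

rng = random.Random(1)
n = 50000
# Hypothetical data-generating process: U confounds X (obesity measure) and
# Y (survival score); the true causal effect of X on Y is zero, mimicking
# the project's conclusion that the observed effect is not causal.
G = [rng.choice([0, 1, 2]) for _ in range(n)]        # instrument: allele count
U = [rng.gauss(0, 1) for _ in range(n)]              # unmeasured confounder
X = [0.5 * g + u + rng.gauss(0, 1) for g, u in zip(G, U)]
Y = [0.0 * x + 0.8 * u + rng.gauss(0, 1) for x, u in zip(X, U)]

naive = cov(X, Y) / cov(X, X)   # confounded regression slope: clearly positive
iv    = cov(G, Y) / cov(G, X)   # Wald/IV ratio using G: close to the true 0
```

Because G is independent of U but associated with X, the ratio estimate strips out the confounded part of the association, which is the core logic of Mendelian randomization.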

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Brad McNeney
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

A Multi-Dimensional Bühlmann Credibility Approach to Modeling Multi-Population Mortality Rates

Date created: 
2017-06-08
Abstract: 

In this project, we first propose a multi-dimensional Bühlmann credibility approach to forecasting mortality rates for multiple populations, and then compare forecasting performance among the proposed approach and the joint-k, co-integrated, and augmented common factor Lee-Carter models. The model is applied to mortality data from the Human Mortality Database for both genders of three well-developed countries, for a given age span and a wide range of fitting year spans. Empirical illustrations show that the proposed multi-dimensional Bühlmann credibility approach produces more accurate forecasts, as measured by MAPE (mean absolute percentage error), than those based on the Lee-Carter models.
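For context, the classical one-dimensional Bühlmann estimator, which the project generalizes to multiple populations jointly, weights each population's sample mean against the grand mean by the credibility factor Z = n / (n + EPV/VHM), where EPV is the expected process variance and VHM the variance of hypothetical means. A minimal sketch with made-up numbers (a balanced toy example, not the paper's multi-dimensional model):

```python
def buhlmann(data):
    """data: dict mapping population -> list of n observed values
    (e.g. log-mortality rates). Returns the credibility-weighted
    estimate Z * Xbar_i + (1 - Z) * Xbar for each population."""
    r = len(data)
    n = len(next(iter(data.values())))
    means = {k: sum(v) / n for k, v in data.items()}
    grand = sum(means.values()) / r
    # Expected process variance: average within-population sample variance.
    epv = sum(sum((x - means[k]) ** 2 for x in v) / (n - 1)
              for k, v in data.items()) / r
    # Variance of hypothetical means: between-population variance,
    # bias-corrected for within-population noise.
    vhm = sum((m - grand) ** 2 for m in means.values()) / (r - 1) - epv / n
    Z = n / (n + epv / vhm) if vhm > 0 else 0.0
    return {k: Z * means[k] + (1 - Z) * grand for k in data}

# Each population's estimate is shrunk from its own mean toward the grand mean.
est = buhlmann({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})
```

Shrinking noisy population-specific means toward a pooled mean is the mechanism by which the credibility approach borrows strength across populations when forecasting.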

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Cary Chi-Liang Tsai
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Using AI and Statistical Techniques to Correct Play-by-play Substitution Errors

Date created: 
2017-05-26
Abstract: 

Play-by-play is an important data source for basketball analysis, particularly for leagues that cannot afford the infrastructure for collecting video tracking data; it enables advanced metrics such as adjusted plus-minus and lineup analyses such as With Or Without You (WOWY). However, such analysis is not possible unless all substitutions are recorded correctly. In this paper we use six seasons of play-by-play from the Canadian university league to derive a framework for automated cleaning of play-by-play that is littered with substitution logging errors. These errors include missing substitutions, unequal numbers of players subbing in and out, substitution patterns of a player not alternating between in and out, and more. We define features to build a prediction model that identifies correctly and incorrectly recorded substitutions, and outline a simple heuristic for player activity that infers the players not accounted for in the substitutions. We define two performance measures for objectively quantifying the effectiveness of this framework. The play-by-play produced by the algorithm opens up a set of statistics that were previously unobtainable for the Canadian university league, improving its analytics capabilities: coaches can refine strategy, leading to a more competitive product, and media can introduce modern statistics in their coverage to increase fan engagement.

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Tim Swartz
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

An applied analysis of high-dimensional logistic regression

Date created: 
2017-05-16
Abstract: 

In the high-dimensional setting, we investigate common regularization approaches for fitting logistic regression models with binary response variables. A literature review is provided on generalized linear models; regularization approaches including the lasso, ridge, elastic net, and relaxed lasso; and recent post-selection methods for obtaining p-values of coefficient estimates proposed by Lockhart et al. and Bühlmann et al. We consider varying (n, p) conditions and assess model performance on several evaluation metrics, such as sparsity, accuracy, and algorithmic time efficiency. Through a simulation study, we find that Bühlmann et al.'s multi-sample splitting method performed poorly when the selected covariates were highly correlated. When λ was chosen through cross-validation, the elastic net performed similarly to the lasso, but it did not possess the level of sparsity suggested by Zou and Hastie.

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Richard Lockhart
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Bayesian Sensitivity Analysis for Non-ignorable Missing Data in Longitudinal Studies

Date created: 
2017-04-13
Abstract: 

The use of Bayesian statistical methods to handle missing data in biomedical studies has become popular in recent years. In this thesis, we propose a novel Bayesian sensitivity analysis (BSA) model that accounts for the influences of missing outcome data on the estimation of treatment effects in randomized controlled trials with non-ignorable missing data. We implement the method using the probabilistic programming language Stan, and apply it to data from the Vancouver At Home (VAH) Study, a randomized controlled trial that provided housing to homeless people with mental illness. We compare the results of BSA to those from an existing Bayesian longitudinal model that ignores missingness in the outcome. Furthermore, we demonstrate in a simulation study that, when a diffuse conservative prior describing a range of assumptions about the bias effect is used, BSA credible intervals have greater length and higher coverage of the target parameters than existing methods, and that sensitivity increases as the percentage of missingness increases.

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Lawrence McCandless
Joan Hu
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Delta Hedging for Single Premium Segregated Fund

Date created: 
2017-03-31
Abstract: 

Segregated funds are individual insurance contracts that offer the growth potential of investment in underlying assets while providing a guarantee to protect part of the money invested. The guarantee can cause significant losses to the insurer, which makes it essential for the insurer to hedge this risk. In this project, we discuss the effectiveness of delta hedging by studying the distribution of hedging errors under different assumptions about the return on underlying assets. We consider a geometric Brownian motion and a regime-switching lognormal model for equity returns, and compare hedging effectiveness when risk-free rates are constant or stochastic. Two one-factor short-rate models, the Vasicek and CIR models, are used to model the risk-free rate. We find that delta hedging is in general effective, but large hedging errors can occur when the assumptions of the Black-Scholes framework are violated.
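A rough sketch of the hedging-error experiment in the simplest setting above (geometric Brownian motion equity, constant risk-free rate, daily rebalancing), with the maturity guarantee represented as a plain European put; all parameter values are hypothetical illustrations, not those used in the project:

```python
import math, random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def put_delta(S, K, r, sigma, tau):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(d1) - 1.0

def bs_put(S, K, r, sigma, tau):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return K * math.exp(-r * tau) * norm_cdf(-d2) - S * norm_cdf(-d1)

def hedging_error(rng, S0=100.0, K=100.0, r=0.03, sigma=0.2, T=1.0, steps=252):
    """Delta-hedge a short put (the maturity guarantee) under GBM; return
    the terminal hedging error (replicating portfolio minus payoff)."""
    dt = T / steps
    S = S0
    cash = bs_put(S0, K, r, sigma, T)          # premium received
    delta = put_delta(S0, K, r, sigma, T)
    cash -= delta * S                           # buy delta shares (delta < 0: short)
    for i in range(1, steps):
        S *= math.exp((r - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * rng.gauss(0, 1))
        cash *= math.exp(r * dt)
        new_delta = put_delta(S, K, r, sigma, T - i * dt)
        cash -= (new_delta - delta) * S         # rebalance the stock position
        delta = new_delta
    S *= math.exp((r - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * rng.gauss(0, 1))
    cash = cash * math.exp(r * dt) + delta * S  # liquidate at maturity
    return cash - max(K - S, 0.0)

rng = random.Random(7)
errors = [hedging_error(rng) for _ in range(500)]
mean_err = sum(errors) / len(errors)
premium = bs_put(100.0, 100.0, 0.03, 0.2, 1.0)
# With the model matching the hedge, discrete daily rebalancing leaves only a
# small, roughly zero-mean hedging error relative to the premium.
```

Repeating the experiment with equity paths generated from a different model than the one used to compute the deltas (e.g. regime-switching returns, as in the project) is how the distribution of hedging errors under model misspecification would be studied.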

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Gary Parker
Barbara Sanders
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Analysis of Target Benefit Plans with Aggregate Cost Method

Date created: 
2017-04-06
Abstract: 

The operational characteristics of a target benefit plan based on an aggregate pension cost method are studied through simulation under a multivariate time series model for projected interest rates and equity returns. The performance of the target benefit plan is evaluated by applying a variety of performance metrics for benefit security, benefit adequacy, benefit stability and intergenerational equity. Performance is shown to improve when the economy remains relatively stable over time and when the choice of valuation rate does not create persistent gains or losses.

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Gary Parker
Barbara Sanders
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Predictive Estimation in Canadian Federal Elections

Date created: 
2017-04-20
Abstract: 

Various estimation methods are employed to provide seat projections during Canadian federal elections. This project explores discrepancies between the real outcomes of recent Canadian federal elections and the predictions of existing approaches such as those proposed by Grenier and Rosenthal. Each seat projection procedure requires a set of assumptions, but these assumptions are not explicitly listed in the accessible references. We formulate the assumptions required by the two prediction procedures proposed by Rosenthal, and present variance estimation procedures. Departures from the assumptions are explored with real data from the 2006, 2008, 2011, and 2015 federal elections. An extensive simulation study is conducted to examine the potential impacts of various deviations from the assumptions. The simulation indicates that, compared to other assumption violations, misleading polling results may cause the most damage to the prediction. It also suggests that the prediction is least affected by changes in the number of voters, and that heterogeneity of riding patterns within a region may not affect the prediction at the national level.

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Joan Hu
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.

Stochastic Modelling and Comparison of Two Pension Plans

Date created: 
2017-04-19
Abstract: 

In this project, we simulate the operation of a stylized jointly sponsored pension plan (JSPP) and a stylized defined contribution (DC) plan with identical contribution patterns using a vector autoregressive model for key economic variables. The performance of the two plans is evaluated by comparing the distribution of pension ratios for a specific cohort of new entrants. We find that the DC plan outperforms the JSPP in terms of expected pension ratio, and experiences only a moderate degree of downside risk. This downside risk is not enough to outweigh the upside potential even for a relatively risk-averse member, as reflected in the expected discounted utility of benefits under the two plans. Under more sophisticated rate stabilization techniques, the probability that the DC plan outperforms the JSPP increases. When the bond yield and stock return processes begin from values far above their long-term means (not far below, as is the case today), the DC plan is projected to outperform the JSPP even more frequently, because the higher required contributions accrue to the advantage of the individual member only, instead of also financing benefits for others.

Document type: 
Graduating extended essay / Research project
Senior supervisor: 
Barbara Sanders
Gary Parker
Department: 
Science: Department of Statistics and Actuarial Science
Thesis type: 
(Project) M.Sc.