Statistics and Actuarial Science - Theses, Dissertations, and other Required Graduate Degree Essays

Hockey pools for profit: A simulation based player selection strategy

Author: 
Date created: 
2005
Abstract: 

The goal of this project is to develop an optimal player selection strategy for a common playoff hockey pool. The challenge is to make the strategy applicable in real time. Most selection methods rely on the draftee's hockey knowledge. Our selection strategy was created by applying appropriate statistical models to regular season data and introducing a reasonable optimality criterion. A simulated draft is performed in order to test our selection method. The results suggest that the approach is superior to several ad hoc strategies.
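As a toy illustration of testing a draft strategy by simulated drafts, here is a minimal sketch. It is not the thesis's models or optimality criterion: the players, scoring probabilities, and the simple greedy-vs-random comparison are all invented for illustration.

```python
import random

random.seed(1)

# Assumed probability that each fictional player scores in a given game.
score_prob = {"A": 0.8, "B": 0.6, "C": 0.5, "D": 0.4, "E": 0.3, "F": 0.2}

def season_points(p, games=20):
    """One Bernoulli point opportunity per game."""
    return sum(random.random() < p for _ in range(games))

def draft(strategy, pool, picks=3):
    if strategy == "greedy":
        # Pick players with the highest assumed scoring probability.
        return sorted(pool, key=pool.get, reverse=True)[:picks]
    return random.sample(list(pool), picks)      # ad hoc: random picks

def pool_total(team):
    return sum(season_points(score_prob[p]) for p in team)

n = 500
greedy_avg = sum(pool_total(draft("greedy", score_prob)) for _ in range(n)) / n
random_avg = sum(pool_total(draft("random", score_prob)) for _ in range(n)) / n
```

Averaged over many simulated drafts, the model-informed (greedy) strategy should outscore the ad hoc random picks.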

Document type: 
Thesis
File(s): 
Department: 
Department of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

Methods for the analysis of spatio-temporal multi-state processes

Author: 
Date created: 
2005
Abstract: 

Studies of recurring infection or chronic disease often collect longitudinal data on the disease status of subjects. Multi-state transitional models are commonly used for describing the development of such longitudinal data. In this setting, we model a stochastic process which, at any point in time, occupies one of a discrete set of states, and interest centers on the transition process between states. For example, states may refer to the number of recurrences of an event or the stage of a disease. Geographic referencing of data collected in longitudinal studies is increasingly common as scientific databases are being linked with GIS systems. This has created a need for statistical methods addressing the resulting spatial-longitudinal structure of the data. In this thesis, we develop hierarchical mixed multi-state models for the analysis of such longitudinal data when the processes corresponding to different subjects may be correlated spatially over a region. Methodological developments have been strongly driven by studies in forestry and spatial epidemiology. Motivated by an application in forest ecology studying pine weevil infestations, the second chapter develops methods for handling mixtures of populations for spatial discrete-time two-state processes. The two-state discrete-time transitional model, often used for studying chronic conditions in human populations, is extended to settings where subjects are spatially arranged. A mixed spatially correlated mover-stayer model is developed. Here, clustering of infection is modelled by a spatially correlated random effect reflecting the density or closeness of the individuals under study. Analysis is carried out using maximum likelihood, with a Monte Carlo EM algorithm for implementation, and also using a fully Bayesian analysis. The third chapter presents continuous-time spatial multi-state models.
Here, joint modelling of both the spatial correlation and the correlation between different transition rates is required, and a multivariate spatial approach is employed. A proportional intensities frailty model is developed where baseline intensity functions are modelled using both parametric Weibull forms and flexible representations based on cubic B-splines. The methodology is applied to a study of invasive cardiac procedures in Quebec, examining readmission and mortality rates over a four-year period. Finally, in the fourth chapter we return to the two-state discrete-time setting. An extension of the mixed mover-stayer model is motivated and developed within the Bayesian framework. Here, a multivariate conditional autoregressive (MCAR) model is incorporated, providing flexible joint correlation structures. We also consider a test for the number of mixture components, quantifying the existence of a hidden subgroup of 'stayers' within the population. Posterior summarization is based on a Metropolis-Hastings sampler, and methods for assessing model goodness-of-fit are based on posterior predictive comparisons.
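As a rough illustration of the (non-spatial) mover-stayer idea described in this abstract, here is a minimal simulation sketch. All parameter values are invented, and the spatially correlated random effect is omitted entirely; this shows only the latent stayer/mover mixture over a two-state discrete-time chain.

```python
import random

random.seed(7)

p_stay = 0.3              # assumed proportion of latent "stayers"
p_01, p_10 = 0.2, 0.4     # mover transition probabilities (0=healthy, 1=infected)

def simulate_path(steps=10):
    """One subject's discrete-time state path under the mixture."""
    if random.random() < p_stay:
        return [0] * (steps + 1)          # stayers never leave state 0
    state, path = 0, [0]
    for _ in range(steps):
        if state == 0:
            state = 1 if random.random() < p_01 else 0
        else:
            state = 0 if random.random() < p_10 else 1
        path.append(state)
    return path

paths = [simulate_path() for _ in range(2000)]
# The never-infected fraction exceeds p_stay, because some movers
# also escape infection by chance over a short follow-up.
never = sum(all(s == 0 for s in path) for path in paths) / len(paths)
```

The inflated never-infected fraction is exactly what makes the number of mixture components worth testing: observed "stayers" are a mix of true stayers and lucky movers.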

Document type: 
Thesis
File(s): 
Department: 
Department of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Thesis (Ph.D.)

Confidentiality and variance estimation in complex surveys

Author: 
Date created: 
2004
Abstract: 

A variance estimator in a large survey based on jackknife or balanced repeated replication typically requires a large number of replicates and replicate weights. Reducing the number of replicates has important advantages for computation and for limiting the risk of data disclosure in public use data files. In the first part of this thesis, we propose algorithms adapted from scheduling theory to reduce the number of replicates. The algorithms are simple and efficient and can easily be adapted to account for analytic domains. An important concern with combining strata is that the resulting variance estimators may be inconsistent. We establish conditions for the consistency of the variance estimators and give bounds on the attained precision of the variance estimators that are linked to the consistency conditions. The algorithms are applied both to a real sample survey and to samples from simulated populations, and they perform very well, attaining variance estimators with precision levels close to the upper bounds. Another important issue in survey sampling is the conflict between information sharing and disclosure control. Statistical agencies routinely release microdata for public use with stratum and/or cluster indicators suppressed for confidentiality. For the purpose of variance estimation, pseudo-cluster indicators are sometimes produced for use in linearization methods, or replicate weights for use in resampling methods. If care is not taken, these can be used to (partially) reconstruct the stratum and/or cluster indicators and thus inadvertently break confidentiality. In the second part of this thesis, we demonstrate the dangers and adapt algorithms from scheduling theory and elsewhere to reduce this danger.
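To illustrate the replicate idea underlying these methods, here is a minimal delete-one jackknife for the variance of a sample mean: one replicate estimate per dropped unit. Grouping units to reduce the number of replicates, as studied in the thesis, would coarsen this scheme; the data are invented.

```python
def jackknife_var(y):
    """Delete-one jackknife variance estimate for the sample mean."""
    n = len(y)
    full = sum(y) / n
    reps = [(sum(y) - yi) / (n - 1) for yi in y]   # leave-one-out means
    return (n - 1) / n * sum((r - full) ** 2 for r in reps)

y = [4.0, 7.0, 13.0, 16.0]
jk = jackknife_var(y)

# For the sample mean the jackknife reproduces s^2 / n exactly.
s2 = sum((v - 10.0) ** 2 for v in y) / (len(y) - 1)
```

For the sample mean the jackknife is exact; its value for complex surveys is that the same replicate machinery extends to nonlinear statistics.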

Document type: 
Thesis
File(s): 
Department: 
Department of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Thesis (Ph.D.)

Parametric changepoint survival model with application to coronary artery bypass graft surgery data

Author: 
Date created: 
2005
Abstract: 

Typical survival analyses treat the time to failure as a response and use parametric models, such as the Weibull or log-normal, or semi-parametric methods, such as the Cox proportional hazards model, to estimate survivor functions and investigate the effect of covariates. In some circumstances, for example where treatment is harsh, the empirical survivor curve appears segmented, with a steep initial descent followed by a plateau or a less sharp decline. This is the case in the analysis of survival experience after coronary artery bypass surgery, the application which motivated this project. We employ a parametric Weibull changepoint model for the analysis of such data, and bootstrap procedures for estimation of standard errors. In addition, we consider the effect on the analyses of rounding of the data, with such rounding leading to large numbers of ties.
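A hedged sketch of one possible two-piece Weibull changepoint survivor function (parameter values invented; the project's exact parameterization may differ): one Weibull cumulative hazard applies before the changepoint tau and another after, with the cumulative hazard kept continuous at tau, producing the steep-descent-then-plateau shape described above.

```python
import math

def cum_hazard(t, shape, scale):
    """Weibull cumulative hazard (t / scale) ** shape."""
    return (t / scale) ** shape

def survivor(t, tau=1.0, k1=0.6, s1=2.0, k2=1.2, s2=10.0):
    """Two-piece Weibull survivor function with changepoint tau.

    k1 < 1 gives a steep early descent; the large second scale s2
    produces the near-plateau after the changepoint. The cumulative
    hazard is continuous at tau, so the survivor curve has no jump.
    """
    if t <= tau:
        H = cum_hazard(t, k1, s1)
    else:
        H = cum_hazard(tau, k1, s1) + cum_hazard(t - tau, k2, s2)
    return math.exp(-H)
```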

Document type: 
Thesis
File(s): 
Department: 
Department of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

The Duckworth-Lewis method and Twenty20 cricket

Date created: 
2010
Abstract: 

Cricket is a very popular sport around the world, but since most versions of cricket take longer to play than other popular sports, matches are often affected by unfavourable weather conditions. In 1998, Duckworth and Lewis developed a method for resetting the target scores for the team batting second in interrupted one-day cricket. Twenty20 is the latest form of cricket. Currently, the Duckworth-Lewis method is also used for resetting targets in interrupted Twenty20 matches. However, this may be less than ideal since the scoring pattern in Twenty20 is much more aggressive than that in one-day cricket. In this project, we consider the use of the Duckworth-Lewis method for resetting target scores in interrupted Twenty20 matches. The construction of the Duckworth-Lewis table is reviewed, and alternative resource tables for Twenty20, constructed in a nonparametric fashion using two different approaches, are presented.
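To illustrate how a Duckworth-Lewis-style resource table resets a target, here is a toy sketch. The resource percentages are invented, not the published D/L values, and team 1 is assumed to have used its full 100% of resources.

```python
# Toy resource table: (overs remaining, wickets lost) -> % of resources left.
toy_resources = {
    (20, 0): 100.0,
    (10, 0): 61.0,
    (10, 3): 49.0,
    (5, 5): 22.0,
    (0, 0): 0.0,
}

def reset_target(first_innings_score, overs_left_at_stop, wickets_lost,
                 overs_left_at_restart=0, wickets_at_restart=0):
    """Scale the chase by the fraction of resources actually available."""
    lost = (toy_resources[(overs_left_at_stop, wickets_lost)]
            - toy_resources[(overs_left_at_restart, wickets_at_restart)])
    available = 100.0 - lost
    return int(first_innings_score * available / 100.0) + 1
```

For example, if the chase of 180 is abandoned with 10 overs left and 3 wickets down, 49% of the toy resources are lost, so the revised target is one more run than 51% of 180. A Twenty20-specific table would change only the numbers in the dictionary, not this scaling logic.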

Document type: 
Thesis
File(s): 
Supervisor(s): 
D
Department: 
Department of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

Modeling investment returns with a multivariate Ornstein-Uhlenbeck process

Author: 
Date created: 
2010
Abstract: 

A multivariate Ornstein-Uhlenbeck process is used to model the returns on different investment instruments. Model parameters are estimated under the principle of covariance equivalence. Fitted models can be used to price insurance products and analyze the risk associated with different asset allocation strategies. To illustrate the results obtained, an annuity is studied when assets are allocated between equity and two types of bonds. To show the importance of using a multivariate model, annuity prices are compared to those obtained from independent univariate processes.
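A hedged sketch of simulating a bivariate Ornstein-Uhlenbeck process by Euler-Maruyama discretization. The matrices and means below are invented for illustration; they are not parameters estimated under covariance equivalence as in the project.

```python
import math
import random

random.seed(3)

# Assumed model ingredients (illustrative, not fitted values):
A = [[0.5, 0.1], [0.1, 0.8]]        # mean-reversion matrix
mu = [0.05, 0.03]                   # long-run mean returns
S = [[0.02, 0.0], [0.005, 0.01]]    # diffusion (volatility) matrix

def step(x, dt=1.0 / 12.0):
    """One Euler-Maruyama step of dX = A(mu - X)dt + S dW."""
    z = [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]
    return [
        x[i]
        + sum(A[i][j] * (mu[j] - x[j]) for j in range(2)) * dt
        + sum(S[i][j] * z[j] for j in range(2)) * math.sqrt(dt)
        for i in range(2)
    ]

x, path = [0.0, 0.0], []
for _ in range(600):                # 50 years of monthly steps
    x = step(x)
    path.append(x)

# Long-run averages should hover near the assumed means mu.
avg = [sum(p[i] for p in path) / len(path) for i in range(2)]
```

The off-diagonal entries of A and S are what a set of independent univariate processes would discard, which is the comparison the abstract's annuity example highlights.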

Document type: 
Thesis
File(s): 
Supervisor(s): 
G
Department: 
Department of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

Actuarial applications of the linear hazard transform

Author: 
Date created: 
2010
Abstract: 

In this thesis, we study the linear hazard (LH) transform and its applications in actuarial science. Under the LH transform, the survival function of a risk is distorted, which provides a safety margin for pricing insurance products. Combined with an alpha-approximation assumption, the net single premium of a continuous insurance policy can be approximated in terms of the net single premiums of the corresponding discrete policies. We also find that the LH transform is well suited to fitting one mortality curve to another by regression. With this mortality-fitting method, mortality rates for future years can be predicted as well. Finally, applications of the LH transform to an insurance company's asset management, such as mortality swaps, risk ordering and optimal reinsurance, are explored.
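The basic LH distortion can be sketched directly: transforming the hazard rate as h*(t) = a·h(t) + b gives the distorted survival function S*(t) = S(t)^a · exp(-b·t). A minimal illustration, with an assumed exponential baseline and illustrative (a, b):

```python
import math

def lh_transform(S, a, b):
    """Distort a survival function via h*(t) = a*h(t) + b."""
    return lambda t: S(t) ** a * math.exp(-b * t)

S = lambda t: math.exp(-0.1 * t)         # baseline: constant hazard 0.1
S_star = lh_transform(S, a=1.2, b=0.02)  # loaded survival for pricing

# With a >= 1 and b >= 0 the distorted survival lies below the baseline,
# i.e. mortality is loaded, which gives the pricing safety margin.
```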

Document type: 
Thesis
File(s): 
Supervisor(s): 
C
Department: 
Department of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

Integer-valued autoregressive processes with dynamic heterogeneity and their applications in automobile insurance

Author: 
Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2009
Abstract: 

Bonus-malus systems in automobile insurance describe how past claim frequencies determine future insurance premiums. The potential risks of policyholders vary due to differences in driving behavior, which leads to unobserved heterogeneity in individual average claim counts. While the Poisson distribution has been used as a simple model for discrete count data, the negative binomial distribution is suggested for modeling claim counts with unobserved heterogeneity, obtained by letting the mean parameter of the Poisson distribution follow a Gamma distribution. In this project, we introduce an integer-valued autoregressive process with dynamic heterogeneity to model the random fluctuations and correlations of the heterogeneity from year to year. Some properties of the model are studied, and a bonus-malus system is built and illustrated using the Gibbs sampler. Finally, comparisons with other existing models are provided in terms of the extent to which they use the claim history.
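The Poisson-Gamma route to the negative binomial mentioned above can be illustrated by simulation: each policyholder's Poisson mean is a Gamma draw, so the marginal claim counts are negative binomial and overdispersed. All parameter values are invented, and the year-to-year dynamics of the project's model are not shown.

```python
import math
import random

random.seed(11)

def poisson(lam):
    """Poisson draw by Knuth's method; adequate for small means."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

# Each policyholder's Poisson mean is a Gamma draw (shape 2, scale 0.25),
# so the marginal claim count is negative binomial with mean 0.5.
counts = [poisson(random.gammavariate(2.0, 0.25)) for _ in range(20000)]

mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
# Overdispersion: the sample variance exceeds the sample mean.
```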

Document type: 
Thesis
File(s): 
Supervisor(s): 
Y
Department: 
Department of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

Use of the Lognormal Distribution for Survival Data: Inference and Robustness

Author: 
Date created: 
2004
Abstract: 

Two data sets are presented and various distributions, including the lognormal, are fitted to the data. A method is given to calculate exact confidence intervals for the quantiles of the lognormal distribution. The coverage probability of the confidence intervals is examined when the lognormal distribution is the correct model, and for various departures from lognormality. In addition, the connection between the coverage probability and the p-value from a goodness-of-fit test is explored.
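As a rough stand-in for the exact intervals described above, here is a sketch of estimating a lognormal quantile from the log scale, with a percentile-bootstrap interval. The bootstrap is a substitute for illustration only, not the exact construction studied in the project, and the data are simulated.

```python
import math
import random

random.seed(5)

def lognormal_quantile(data, z=0.0):
    """Quantile estimate exp(m + z*s) from the log-scale normal fit.

    z is the standard-normal quantile of the desired probability
    (z = 0 gives the median).
    """
    logs = [math.log(x) for x in data]
    n = len(logs)
    m = sum(logs) / n
    s = math.sqrt(sum((v - m) ** 2 for v in logs) / (n - 1))
    return math.exp(m + z * s)

# Simulated lognormal sample; the true median is e^1, about 2.718.
data = [random.lognormvariate(1.0, 0.5) for _ in range(200)]
est = lognormal_quantile(data)

# Percentile bootstrap interval (a stand-in for the exact interval).
boot = sorted(
    lognormal_quantile([random.choice(data) for _ in data])
    for _ in range(500)
)
ci = (boot[12], boot[487])          # roughly a 95% interval
```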

Document type: 
Thesis
File(s): 
Department: 
Department of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

A statistical method for high-throughput screening of predicted orthologs

Author: 
Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2009
Abstract: 

Orthologs are genes in different species that diverged from a common ancestral gene after speciation. Their identification is critical for reliable prediction of gene function in newly sequenced genomes. Orthologous genes are usually identified by a high-throughput method called Reciprocal-Best BLAST-hit (RBH). As RBH is subject to errors from incomplete sequencing or gene loss in a species, a bioinformatics tool called Ortholuge was developed that identifies RBH-predicted orthologs with atypical genetic divergence. However, determining the cut-off for atypical divergence in Ortholuge is computationally intensive, so we propose a faster statistical procedure and examine its performance by simulation. We find that performance depends on the fit of the assumed model for the distribution of divergence measures in true orthologs.
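The RBH idea behind the predictions screened here is simple to sketch: a gene pair is called a putative ortholog only when each gene is the other's best-scoring hit. The gene names and scores below are invented for illustration.

```python
# Invented best-hit tables: gene -> (best hit in the other genome, score).
hits_a_to_b = {"geneA1": ("geneB1", 95), "geneA2": ("geneB3", 80)}
hits_b_to_a = {"geneB1": ("geneA1", 93), "geneB3": ("geneA9", 70)}

def reciprocal_best_hits(a2b, b2a):
    """Keep only pairs where each gene is the other's best hit."""
    pairs = []
    for a, (b, _) in a2b.items():
        best_back = b2a.get(b)
        if best_back is not None and best_back[0] == a:
            pairs.append((a, b))
    return pairs
```

Here geneA2 is dropped because its best hit, geneB3, points back to a different gene; Ortholuge-style screening then further filters the surviving pairs by divergence.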

Document type: 
Thesis
File(s): 
Supervisor(s): 
J
Department: 
Department of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)