Statistics and Actuarial Science - Theses, Dissertations, and other Required Graduate Degree Essays

Optimal fractional factorial split-plot designs for model selection

Author: 
Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2009
Abstract: 

Fractional factorial designs are widely used in screening experiments, where the goal is to identify significant effects. It is not always possible to perform the trials in a completely random order, and hence fractional factorial split-plot designs arise. To identify optimal fractional factorial split-plot designs in this setting, the Hellinger distance criterion (Bingham and Chipman (2007)) is proposed. The approach is Bayesian and directly incorporates common experimenter assumptions. By specifying prior distributions for the model space, the criterion for fractional factorial split-plot designs aims to discriminate between the most probable competing models. Techniques for evaluating the criterion and searching for optimal designs are proposed. The criterion is then illustrated through a few examples, with further discussion of the choice of hyperparameters and the flexibility of the criterion.

Document type: 
Thesis
File(s): 
Supervisor(s): 
D
Department: 
Dept. of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

Multi-state processes with duration-dependent transition intensities: statistical methods and applications

Author: 
Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2009
Abstract: 

Multi-state processes provide a convenient framework for the analysis of event history data, which arise in many fields including public health, biomedical and health services research, reliability, business, and the social sciences. This thesis develops methods for statistical analysis with multi-state processes, various Markov processes in particular, and presents applications of the methodology. Starting with the homogeneous semi-Markov (HSM) process, a generalization of the classical homogeneous Markov process, we propose an estimation procedure for right-censored data as an alternative to existing approaches, avoiding their possible inconsistency in estimating the transition probabilities. Two simulation-based algorithms are implemented to construct confidence bands for the HSM kernel and the sojourn time distributions. The modulated semi-Markov (MSM) process extends the HSM process to a Cox regression setting, allowing for general time-dependent covariates but invalidating the usual martingale methods for deriving asymptotics. We consider estimation of the regression parameters in the MSM model and establish the consistency, asymptotic normality and efficiency of the estimators by applying modern empirical process theory. As a further generalization, the nonhomogeneous semi-Markov (NHSM) process assumes that its transition intensity involves two time scales: the individual study time since the onset of the process and the duration time in the current state. We provide estimation procedures for the parameters in four model specifications with the NHSM process. The last topic of the thesis is dependent censoring in event history data analysis. We focus on a particular informative censoring scheme with the observation of an NHSM process, and adapt a copula-based approach for dependent competing risks. Finite sample properties of all the proposed methods are examined via simulation. 
In addition, with the proposed methods, we conduct analyses of two real data sets, the human sleep data presented in Kneib and Hennerfeind (2008) and the hospitalization data collected by the CAYACS program (PI: M. McBride) with BC Cancer Centre.

Document type: 
Thesis
File(s): 
Supervisor(s): 
X
Department: 
Dept. of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Thesis (Ph.D.)

Assessing longevity risk with generalized linear array models

Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2009
Abstract: 

Longevity risk is becoming more important in the current economic environment; if mortality improvements are larger than expected, profits erode in the annuity business and in defined benefit pension schemes. The Lee-Carter model, although a popular model for mortality rates by age and calendar year, has been critiqued for its inflexibility. A recently proposed alternative is to smooth the mortality surface with a generalized linear array model (GLAM), allowing for an additive surface of shocks. We compare the GLAM and Lee-Carter models by fitting them to Swedish mortality data. Lee-Carter mortality predictions are calculated, and a time series method for GLAM prediction is developed. The predicted mortality rates and associated uncertainties are compared directly, and their impact on annuity pricing is analyzed. Letting future mortality be stochastic, we can calculate the expected value and variance of the present value for various annuities.

Document type: 
Thesis
File(s): 
Supervisor(s): 
G
Department: 
Dept. of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

Multiple hypothesis testing procedures with applications to epidemiologic studies

Author: 
Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2009
Abstract: 

Epidemiologic and genetic studies often involve testing a large number of hypotheses with test statistics that are potentially dependent. In this project, we investigate multiple testing procedures that control the family-wise error rate and the false discovery rate. We consider several classic and novel multiple hypothesis testing procedures. Furthermore, we compare the results of procedures that take advantage of the dependence structure among the test statistics with those of procedures that do not. The data used are from a case-control study of non-Hodgkin lymphoma.
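
The abstract does not say which procedures the project examined. As a hedged illustration of the two error rates it mentions, the sketch below implements two standard textbook procedures: the Bonferroni correction (family-wise error rate) and the Benjamini-Hochberg step-up procedure (false discovery rate, valid under independence or positive dependence). The function names and the example p-values are hypothetical and are not taken from the project.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H_i when p_i <= alpha / m; controls the family-wise error rate."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure; controls the false discovery rate.

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Illustrative p-values only (not from the non-Hodgkin lymphoma study).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042]
bonferroni_result = bonferroni(example_p)
bh_result = benjamini_hochberg(example_p)
```

On these illustrative values Bonferroni rejects only the two smallest p-values, while the step-up rule rejects all five because the largest p-value falls below its own threshold. This is the sense in which false discovery rate control is less conservative.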

Document type: 
Thesis
File(s): 
Supervisor(s): 
J
Department: 
Dept. of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

Variable-weighted ultrametric optimization for mixed-type data: continuous, ordinal, nominal, binary symmetric and binary asymmetric

Author: 
Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2009
Abstract: 

Scientific research begins with hypothesis generation, for which cluster analysis (CA) can be used. Traditionally, CA involves continuous variables weighted equally, and the subjective choice of linkage and stopping rules. Variable weighting for cluster analysis (VWCA), beginning with De Soete (1985/6), produces weights that may be useful for hypothesis generation. De Soete’s VWCA optimized ultrametricity, a property of better separated clusters, without requiring CA. We developed variable-weighted ultrametric optimization for mixed-type data (VWUO-MD), starting with a variable-weighted, multivariate distance for data with any number of continuous, ordinal, nominal, binary symmetric and binary asymmetric (e.g., rare disease) variables. In Monte Carlo simulations we found that weights are consistent with a priori relationships between variables, under several distributions. On some relationships (e.g., single group linear), the method performs poorly. Compared to De Soete, VWUO-MD better penalizes for 0-weights, and better ensures a unique solution with a strategic random restart procedure. The bootstrap covariance matrix is slightly conservative. For mixtures of at least four continuous/nominal variables, a U-statistic-based covariance matrix performs well. Point estimates and covariances are invariant to column/category/record order and affine transformations. We analyzed a subset of the Joint Canada/United States Survey of Health: working, mature students 50+ years old who received health services in the past year (n=167), split into training and testing segments. Prescreening within types and backwards elimination with VWUO-MD reduced the space. The final 14 variable weights were plotted as a scree plot. On the testing segment, a model was fit from the upper scree plot variables. Similar models were fit from the lower scree plot, prescreening and backwards elimination reject variables. 
Models were ordered on overall statistical significance and the upper model had the best fit, indicating that VWUO-MD had successfully mined these data for hypotheses. We learned that reduction in activities due to a long term health condition was associated with consultations with a mental health professional in the past year (odds ratio=12.25, 95% CI=1.67, 90.02). While needing additional research, in its present form VWUO-MD produces variable weights that may be informative for hypothesis generation on data with varied mixtures of data types.

Document type: 
Thesis
File(s): 
Supervisor(s): 
R
Department: 
Dept. of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Thesis (Ph.D.)

Analyses of physician visits from childhood and adolescent cancer survivors

Author: 
Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2009
Abstract: 

The main focus of the CAYACS program on health care service at BC Cancer Agency is the long-term effect on childhood and adolescent cancer survivors in multiple domains. This project conducts various analyses to evaluate the frequency and cost of physician visits on two different time scales, calendar time and individual time. Starting with cross-sectional analyses, we obtain results comparable to those from the previous CAYACS project on physician visits. We investigate the frequency and cost for the whole cohort over (calendar) time with time series analyses. Furthermore, on the individual time scale, we analyze the physician visits with longitudinal models to provide individual-specific inferences on the effects of risk factors. The method in Diehr et al. (1999) is adapted to deal with longitudinal physician visit costs that have a positive mass at zero.

Document type: 
Thesis
File(s): 
Supervisor(s): 
X
Department: 
Dept. of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

The discounted penalty function and the distribution of the total dividend payments in a multi-threshold Markovian risk model

Author: 
Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2009
Abstract: 

In this thesis, we study the expected discounted penalty function and the total dividend payments in a risk model with a multi-threshold dividend strategy, where the claim arrivals are modeled by a Markovian arrival process (MAP) and the claim amounts are correlated with the inter-claim times. Systems of integro-differential equations in matrix form are derived for the expected discounted penalty function and the moments of the discounted dividend payments prior to ruin. A recursive approach based on the integro-differential equations is then provided to obtain the analytical solutions. In addition to the differential approach, by employing some newly obtained results in the actuarial literature, another recursive approach with respect to the number of layers is also developed for the expected discounted dividend payments. Examples with exponentially distributed claim amounts are illustrated numerically.

Document type: 
Thesis
File(s): 
Supervisor(s): 
Y
Department: 
Dept. of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

Bayesian clustering for synchronized diving

Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2009
Abstract: 

Synchronized diving has been one of the most widely viewed Olympic events since its first appearance at the Sydney Olympic Games in 2000. It gives spectators the ability to compare performances of divers on their own without much understanding of the technical details of the sport. In this project, we develop methodology to investigate the complexity of judges' scores and the relative behaviour of judges from synchronized diving events. We explore a Bayesian clustering methodology as introduced in Gill, Swartz and Treschow (2007) to cluster judges. A model that captures the characteristics of the judges' scores is introduced and a dataset from the 12th FINA World Championships in Melbourne 2007 is fit using the proposed model. We demonstrate how the missing values arising from the judging system can be easily handled in a Bayesian analysis via implementation in WinBUGS. The analysis may reveal associations among judges.

Document type: 
Thesis
File(s): 
Supervisor(s): 
T
Department: 
Dept. of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

On fitting a mixture of two von Mises distributions, with applications

Author: 
Peer reviewed: 
No, item is not peer reviewed.
Date created: 
2009
Abstract: 

Circular data refers to data recorded as points on a circle, either denoting directions, or times when the circle acts as a clock. The von Mises distribution is frequently used to analyze circular data sets with a clear peak. When two clear peaks appear on the circle, a mixture of two von Mises distributions is often used to analyze the data. Parameter estimates are produced by using maximum likelihood estimation, and Watson's U-square is used to test the fit. Two data sets will be discussed in this project: times of Sudden Infant Death Syndrome (SIDS) occurrences and times of Fatal Crash accidents.
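
As a hedged illustration of the model named in the abstract, the sketch below evaluates the density and log-likelihood of a two-component von Mises mixture, f(θ) = p·VM(θ; μ1, κ1) + (1 − p)·VM(θ; μ2, κ2), where VM(θ; μ, κ) = exp(κ cos(θ − μ)) / (2π I0(κ)). It is not the project's code: the function names are hypothetical, the Bessel function I0 is approximated by a truncated power series that is adequate only for moderate κ, and the numerical maximization and Watson's U-square test are not shown.

```python
import math


def bessel_i0(x, terms=50):
    """Modified Bessel function I0 via its power series (assumes moderate x)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        # term_k = (x/2)^(2k) / (k!)^2, built from the previous term.
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def von_mises_pdf(theta, mu, kappa):
    """Von Mises density on the circle with mean direction mu, concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def mixture_pdf(theta, p, mu1, kappa1, mu2, kappa2):
    """Two-component von Mises mixture with mixing proportion p for component 1."""
    return (p * von_mises_pdf(theta, mu1, kappa1)
            + (1.0 - p) * von_mises_pdf(theta, mu2, kappa2))


def mixture_log_likelihood(thetas, p, mu1, kappa1, mu2, kappa2):
    """Log-likelihood of angles (in radians) under the mixture."""
    return sum(math.log(mixture_pdf(t, p, mu1, kappa1, mu2, kappa2)) for t in thetas)
```

Maximum likelihood estimates would be obtained by maximizing `mixture_log_likelihood` over the five parameters with a numerical optimizer. Times of day can be mapped to angles by θ = 2π · hour / 24.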

Document type: 
Thesis
File(s): 
Supervisor(s): 
M
Department: 
Dept. of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)

Analysis of occupational cohort data using exposure as a continuous time-dependent variable

Date created: 
2005
Abstract: 

In occupational cohort studies, a group of workers is followed over time, and disease and work history information are collected for each individual in order to determine whether exposure to a particular substance is linked to differences in mortality or disease incidence rates. These studies are typically analysed by treating cumulative exposure as a categorical variable and then comparing disease or mortality rates between different exposure groups. A main shortfall of such analyses is a heavy dependence on the choice of these exposure categories, as certain choices may mask or exaggerate important features of the dose-response curve. In this project, an extension to the Cox proportional hazards model is used to treat cumulative exposure as a continuous variable and to model the dose-response curve nonparametrically for a study of aluminium smelter workers conducted by the British Columbia Cancer Agency. The results are compared to those of the categorical analyses.

Document type: 
Thesis
File(s): 
Department: 
Department of Statistics and Actuarial Science - Simon Fraser University
Thesis type: 
Project (M.Sc.)