Over the last decade, the number and sophistication of methods for performing regression on complex datasets have increased substantially. Despite this, our literature review found that research on the impact of heteroscedasticity on many widely used modern regression methods appears to be sparse. Our research therefore seeks to clarify the impact that heteroscedasticity has on the predictive effectiveness of modern regression methods. To achieve this objective, we begin by analyzing the ability of ten modern regression methods to predict outcomes for three medium-sized datasets that each feature heteroscedasticity. We then use insights from this work to develop a simulation model and design an experiment that explores the impact of various factors on the prediction accuracy of the ten regression methods. These factors include linearity, sparsity, the signal-to-noise ratio, the number of explanatory variables, and the use of a variance-stabilizing transformation.
Copyright is held by the author.
The author granted permission for the file to be printed and for the text to be copied and pasted.