Data-driven action-value functions for evaluating players in professional team sports

Date created: 
Action-Value Q-function
Deep Reinforcement Learning
Mimic Learning
Regression Tree
Deep Representation Learning
Variational Auto-Encoder

As more and larger event stream datasets for professional sports become available, there is growing interest in modeling the complex play dynamics to evaluate player performance. Among these models, a common player evaluation method is assigning values to player actions. Traditional action-values metrics, however, consider very limited game context and player information. Furthermore, they provide directly related to goals (e.g., shots), not all actions. Recent work has shown that reinforcement learning provided powerful methods for addressing quantifying the value of player actions in sports. This dissertation develops deep reinforcement learning (DRL) methods for estimating action values in sports. We make several contributions to DRL for sports. First, we develop neural network architectures that learn an action-value Q-function from sports events logs to estimate each team's expected success given the current match context. Specifically, our architecture models the game history with a recurrent network and predicts the probability that a team scores the next goal. From the learned Q-values, we derive a Goal Impact Metric (GIM) for evaluating a player's performance over a game season. We show that the resulting player rankings are consistent with standard player metrics and temporally consistent within and across seasons. Second, we address the interpretability of the learned Q-values. While neural networks provided accurate estimates, the black-box structure prohibits understanding the influence of different game features on the action values. To interpret the Q-function and understand the influence of game features on action values, we design an interpretable mimic learning framework for the DRL. The framework is based on a Linear Model U-Tree (LMUT) as a transparent mimic model, which facilitates extracting the function rules and computing the feature importance for action values. Third, we incorporate information about specific players into the action values, by introducing a deep player representation framework. In this framework, each player is assigned a latent feature vector called an embedding, with the property that statistically similar players are mapped to nearby embeddings. To compute embeddings that summarize the statistical information about players, we implement a Variational Recurrent Ladder Agent Encoder (VaRLAE) to learn a contextualized representation for when and how players are likely to act. We learn and evaluate deep Q-functions from event data for both ice hockey and soccer. These are challenging continuous-flow games where game context and medium-term consequences are crucial for properly assessing the impact of a player's actions.

Document type: 
This thesis may be printed or downloaded for non-commercial research and scholarly purposes. Copyright remains with the author.
Oliver Schulte
Applied Sciences: School of Computing Science
Thesis type: 
(Thesis) Ph.D.