Improving reliability of large-scale multimedia services

Resource type
Thesis type
(Thesis) Ph.D.
Date created
2018-11-20
Authors/Contributors
Abstract
Online multimedia communication services, such as Skype and Google Hangouts, are used by millions of users every day. They have Service Level Agreements (SLAs) covering aspects such as reliability, response time, and uptime. They provide acceptable quality on average, but users occasionally suffer from reduced audio quality, dropped video streams, and failed sessions. SLA violations lead to low customer satisfaction, fines, and even loss of business. Service providers monitor the performance of their services and take corrective measures when failures are encountered. Current techniques for managing failures and anomalies are reactive, do not adapt to dynamic changes, and require massive amounts of data to create, train, and test the predictors. In addition, the accuracy of these methods is highly compromised by changes in the service environment and working conditions. Furthermore, multimedia services are composed of complex software components typically implemented as web services. Efficient coordination of web services is challenging and expensive because of their stateless nature and constant change. We propose a new approach to creating dynamic failure predictors for multimedia services in real time and keeping their accuracy high during run-time changes. We use synthetic transactions to generate current data about the service. The data is used in its ephemeral state to create, train, test, and maintain accurate failure predictors. Next, we propose a proactive, lightweight approach for estimating the capacity of different components of the multimedia system and using the estimates to allocate resources to multimedia sessions in real time.
Last, we propose a simple and effective optimization to current web service transaction management protocols. We have implemented all the proposed methods for failure prediction, capacity estimation, and web services coordination in a large-scale, commercial multimedia system that processes millions of sessions every day. Our empirical results show significant performance gains across several metrics, including quality of the multimedia sessions, number of failed sessions, accuracy of failure prediction, and false positive rates of the anomaly detectors.
Document
Identifier
etd19963
Copyright statement
Copyright is held by the author.
Permissions
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Hefeeda, Mohamed
Member of collection
Download file Size
etd19963.pdf 1.72 MB