Distributed in-network task scheduling for datacenters

Resource type
Thesis type
(Thesis) M.Sc.
Date created
2021-10-26
Authors/Contributors
Abstract
Scheduling latency-sensitive applications in large-scale datacenters is challenging. Current approaches rely on application-layer schedulers, which impose high overheads and result in long latencies. We present Saqr, the first in-network, datacenter-wide scheduler that supports short tasks with execution times on the order of tens of microseconds. Saqr introduces new network-level constructs and a distributed scheduling policy that enable network switches to schedule tasks efficiently within the network, at line rate and with minimal latency. We implemented Saqr in a testbed with high-speed programmable switches and compared its performance against the state-of-the-art in-network scheduler, Racksched. Our results show that, compared to Racksched, Saqr reduces the tail response time by up to 85% and the processing load on switches by up to 2.5x. In addition, we compared Saqr with Racksched using large-scale simulations with diverse and dynamic workloads; the results show that Saqr substantially outperforms Racksched across all performance metrics.
Document
Extent
46 pages.
Identifier
etd21691
Copyright statement
Copyright is held by the author(s).
Permissions
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Supervisor or Senior Supervisor
Thesis advisor: Hefeeda, Mohamed
Language
English
Member of collection
Download file
etd21691.pdf
Size
1.18 MB