The emergence of deep learning has attracted attention from a wide range of fields and spawned a large number of applications. With the rapid growth of mobile computing, many deep learning applications are now designed for mobile devices. However, since deep learning tasks are computation-intensive, the limited resources of a mobile device cannot execute such applications effectively. The traditional approach is to push the data and the workload to the remote cloud, but this introduces high data transmission delay and can bottleneck overall performance. In this thesis, we apply an emerging paradigm, edge computing, to mobile deep learning applications. Compared with cloud learning, communication delay can be significantly reduced by pushing the workload to the near-end edge. Unlike existing edge learning frameworks, which address only inference or only training, this thesis focuses on both and puts forward distinct optimization approaches for each. Specifically, the thesis proposes a layer-level partitioning strategy for inference tasks and an edge compression approach with autoencoder preprocessing for training tasks, exploiting all the available resources of the devices, the edge servers, and the cloud to collaboratively improve the performance of mobile deep learning applications. To further verify the optimization performance in practice, we formulate a scheduling problem for multi-task execution and propose an efficient heuristic scheduling algorithm. Real-world experiments and extensive simulations show that our edge learning framework can achieve up to 70% delay reduction.
Copyright is held by the author.
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Thesis advisor: Liu, Jiangchuan