Resource type
Thesis type
(Thesis) M.Sc.
Date created
2009
Authors/Contributors
Author: Zhang, Yongping
Abstract
Developing software for data-intensive applications on a cluster has always been a challenge. However, the cost advantage of clusters over traditional shared-memory systems has driven the migration of data warehouses to clusters. We propose splitcube, a new approach to OLAP database computation on clusters. Splitcube provides very effective dynamic load balancing with low overhead. We study different ways of splitting the input data for parallel processing in an attempt to heuristically optimize the cost of processing queries for a specific workload at a prescribed level of pre-aggregation. Our results on two real-life datasets show substantial performance improvement in three respects: 1) both splitcube building time and query response time show near-linear speedup on up to 64 processors; 2) the idle time is less than 6% of the total execution time in all but one instance; and 3) splitcube achieves near-linear or better speedup on much larger datasets.
Copyright statement
Copyright is held by the author.
Language
English
Download file | Size |
---|---|
etd4487_YZhang.pdf | 1.03 MB |