Zhang, Yongping

Resource type

Thesis

Thesis type

(Thesis) M.Sc.

Date created

2009

Authors/Contributors

Author: Zhang, Yongping

Abstract

Software development on a cluster for data-intensive applications has always been a challenge. However, the cost advantage of clusters over traditional shared-memory systems has driven the migration of data warehouses to clusters. We propose splitcube, a new approach to OLAP database computation on a cluster. Splitcube ensures very effective dynamic load balancing with low overhead. We study different ways of splitting the input data for parallel processing in an attempt to heuristically optimize the cost of processing queries for a specific workload at a prescribed level of pre-aggregation. Our results on two real-life datasets show substantial performance improvements in three respects: 1) both splitcube building time and query response time show near-linear speedup on up to 64 processors; 2) the idle time is less than 6% of the total execution time in all but one instance; and 3) splitcube achieves near-linear or better speedup with much larger datasets.
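
The general idea of splitting input data for parallel aggregation can be illustrated with a minimal sketch. This is not the splitcube algorithm from the thesis, whose details the abstract does not give. The sample rows, the function names (`split_by`, `aggregate`, `build_cube`), and the use of a local thread pool in place of cluster nodes are all assumptions made for illustration. The pool hands the next queued partition to whichever worker becomes free, which is one simple form of dynamic load balancing.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fact rows: (region, product, sales). Not data from the thesis.
ROWS = [
    ("east", "pen", 3),
    ("east", "ink", 5),
    ("west", "pen", 2),
    ("west", "ink", 7),
    ("east", "pen", 4),
]


def split_by(rows, dim_index):
    """Partition rows by the value of one dimension column."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[dim_index]].append(row)
    return list(parts.values())


def aggregate(part):
    """Sum the measure for each (region, product) cell in one partition."""
    cells = defaultdict(int)
    for region, product, sales in part:
        cells[(region, product)] += sales
    return dict(cells)


def build_cube(rows, dim_index=0, workers=2):
    """Aggregate partitions in a worker pool (a stand-in for cluster nodes)."""
    cube = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Free workers pull the next queued partition from the pool.
        for partial in pool.map(aggregate, split_by(rows, dim_index)):
            # Each cell key contains the split dimension's value, so
            # partitions never share a cell and a plain update suffices.
            cube.update(partial)
    return cube
```

Calling `build_cube(ROWS)` splits on region, while `build_cube(ROWS, dim_index=1)` splits on product. Both produce the same aggregated cells, but the partition sizes differ, so the choice of split affects how evenly work is spread across workers.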

Keywords

Copyright statement

Copyright is held by the author.

Scholarly level

Graduate student (Masters)

Language

English

Member of collection

Computing Science Theses

Download file	Size
etd4487_YZhang.pdf	1.03 MB

OLAP database computation with a splitcube in a cluster
