Mining multidimensional distinct patterns

Resource type
Thesis type
(Thesis) M.Sc.
Date created
2010-11-15
Authors/Contributors
Abstract
How do we find the dominant groups of customers, by age, sex and location, that were responsible for at least 85% of the sales of the iPad, MacBook and iPhone? To answer questions of this type, we introduce a novel data mining task: mining multidimensional distinct patterns (DPs). Given a multidimensional data set where each tuple carries some attribute values and a transaction, multidimensional DPs are itemsets whose absolute support ratio in a group-by on the attributes, measured against the rest of the data set, passes a given threshold. A baseline algorithm uses BUC as the cubing algorithm and passes two distinct sets of transactions associated with the tuples of the cell to a pattern mining algorithm called DPMiner. An advanced algorithm adds several effective pruning techniques that eliminate redundant DPMiner processing and reduce the runtime. An empirical comparison of the baseline and advanced algorithms demonstrates that the advanced algorithm is significantly faster.
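The task described in the abstract can be illustrated with a small sketch. It is not the thesis's algorithm: it uses neither BUC nor DPMiner, and the abstract does not give the exact definition of the support ratio. The sketch assumes, based on the "at least 85% of the sales" example, that the ratio is the share of an itemset's absolute support that falls inside a group-by cell. The toy data and the names `ROWS`, `support`, `cell_share` and `is_distinct` are hypothetical.

```python
# Hypothetical toy data: (age_group, sex, location, transaction).
ROWS = [
    ("18-25", "F", "Vancouver", {"iPad", "iPhone"}),
    ("18-25", "F", "Vancouver", {"iPad", "iPhone", "MacBook"}),
    ("18-25", "F", "Vancouver", {"iPad", "iPhone"}),
    ("18-25", "M", "Burnaby", {"iPhone"}),
    ("26-40", "F", "Vancouver", {"MacBook"}),
    ("26-40", "M", "Burnaby", {"iPad", "iPhone"}),
]


def support(itemset, transactions):
    """Absolute support: number of transactions containing every item."""
    return sum(1 for t in transactions if itemset <= t)


def cell_share(itemset, cell, rows):
    """Share of the itemset's support that falls inside a group-by cell.

    `cell` holds one value per attribute; None acts as a wildcard
    (the attribute is aggregated away). This ratio is an assumed
    reading of the abstract, not the thesis's stated definition.
    """
    inside, outside = [], []
    for *attrs, txn in rows:
        matches = all(c is None or c == a for c, a in zip(cell, attrs))
        (inside if matches else outside).append(txn)
    s_in = support(itemset, inside)
    s_out = support(itemset, outside)
    total = s_in + s_out
    return s_in / total if total else 0.0


def is_distinct(itemset, cell, rows, threshold):
    """True if the itemset passes the threshold for this cell."""
    return cell_share(itemset, cell, rows) >= threshold


if __name__ == "__main__":
    pattern = {"iPad", "iPhone"}
    young_women = ("18-25", "F", None)  # group-by on age and sex only
    print(cell_share(pattern, young_women, ROWS))
    print(is_distinct(pattern, young_women, ROWS, 0.85))
```

In this toy data, three of the four transactions containing both iPad and iPhone come from the cell (18-25, F, any location), so the itemset would pass a threshold of 0.7 but not 0.85. A full miner would have to enumerate every cell and every candidate itemset, which is the cost the cubing and pruning steps in the abstract are meant to control.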
Identifier
etd6280
Copyright statement
Copyright is held by the author.
Permissions
The author granted permission for the file to be printed and for the text to be copied and pasted.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Pei, Jian
Attachment Size
etd6280_TKubendranathan.pdf 493.72 KB