Resource type
Thesis type
(Thesis) M.Sc.
Date created
2011-03-09
Authors/Contributors
Author: Alrayes, Norah
Abstract
Large organizations, such as government agencies, often distribute their information on the web in the form of multidimensional tables. This thesis describes the extraction of data cubes from these tables; the cubes can then be collectively queried by decision-makers using popular OLAP tools. The tables are also a valuable resource for answering user questions, improving faceted search, and generating ontologies. Because of their inherently sophisticated design, improving the quality of information extraction from multidimensional tables is essential. In this thesis, algorithms are presented for assigning labels to dimensions, domain integration, identification of the measure dimension, table integration, and table partitioning. Experiments were conducted on some 800 tables from Statistics Canada, and our success rate was greater than 90% for each component that was tested.
Document
Identifier
etd6480
Copyright statement
Copyright is held by the author.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Luk, Wo-Shun
Member of collection
Download file | Size |
---|---|
etd6480_NAlrayes.pdf | 2.45 MB |