Resource type
Thesis type
(Thesis) M.Sc.
Date created
2014-10-06
Authors/Contributors
Author: Denis, Dale Lamont
Abstract
This thesis investigates the feasibility of constructing a high-performance, Unicode-capable, regular expression search tool by combining parallel bit stream technologies and algorithms together with the dynamic compilation capabilities of the LLVM compiler infrastructure. A prototype implementation of icGREP successfully demonstrates the feasibility of this undertaking, with asymptotic performance fully in line with that predicted by earlier prototyping work. The icGREP implementation extends the Parabix regular expression algorithms to include new techniques for efficient Unicode character matching. Performance evaluations in comparison with other Unicode-capable regular expression search tools show asymptotic performance advantages that are often over 10X, although the overhead of dynamiccompilation techniques confines the benefits to relatively large input files.
Document
Identifier
etd8659
Copyright statement
Copyright is held by the author.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Cameron, Robert
Member of collection
Download file | Size |
---|---|
etd8659_DDenis.pdf | 3.14 MB |