Validation is a major component of XML file processing. This thesis investigates the use of single-instruction multiple-data (SIMD) and parallel bit stream technologies for high-performance XML validation. The content models and datatypes of the schema are translated into regular expressions and then into parallel bitwise operations. The element content and data of the instance file are extracted to form byte streams, which are then transformed into parallel bit streams. Finally, the parallel bitwise operations are applied to the corresponding bit streams to validate each content model or datatype. The method is then studied by varying characteristics of the instance files, such as the proportion of content data and the number of element occurrences. Performance is also compared with Xerces, the well-known XML parser and validator. The parallel bit stream validation algorithm requires fewer than 20 cycles per byte, whereas Xerces requires 40 to 300 cycles per byte.
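The pipeline described above (byte stream, then parallel bit streams, then bitwise operations derived from a regular expression) can be illustrated with a minimal sketch. It is not the implementation from the thesis. It assumes that Python integers stand in for SIMD registers, that fields are terminated by a `;` byte, and that the datatype being checked is the regular expression `[0-9]+`. All function names are hypothetical, and the carry-propagation identity used for `C*` is one common formulation that the thesis may or may not use.

```python
def transpose(data: bytes) -> list[int]:
    """Build 8 basis bit streams: basis[k] has bit i set when bit k of data[i] is set."""
    basis = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                basis[k] |= 1 << i
    return basis


def digit_stream(basis: list[int], n: int) -> int:
    """Character-class stream for [0-9] (0x30..0x39), computed only from basis streams."""
    mask = (1 << n) - 1
    inv = lambda s: ~s & mask
    high = inv(basis[7]) & inv(basis[6]) & basis[5] & basis[4]  # high nibble == 0x3
    low = inv(basis[3]) | (inv(basis[2]) & inv(basis[1]))       # low nibble <= 9
    return high & low


def byte_stream(basis: list[int], n: int, value: int) -> int:
    """Stream with bit i set when data[i] == value."""
    mask = (1 << n) - 1
    s = mask
    for k in range(8):
        s &= basis[k] if (value >> k) & 1 else ~basis[k] & mask
    return s


def match_star(m: int, c: int) -> int:
    """Advance every cursor in m through zero or more positions of class c.

    The addition carries each cursor across a run of class bits in one step.
    """
    return (((m & c) + c) ^ c) | m


def invalid_field_ends(data: bytes, sep: bytes = b";") -> list[int]:
    """Return positions of terminators whose field does not match [0-9]+.

    Assumes every field, including the last, is terminated by sep.
    """
    n = len(data)
    mask = (1 << n) - 1
    basis = transpose(data)
    digit = digit_stream(basis, n)
    sep_s = byte_stream(basis, n, sep[0])

    starts = ((sep_s << 1) | 1) & mask        # position 0 and each position after a terminator
    cursor = (starts & digit) << 1            # consume the one mandatory digit
    cursor = match_star(cursor, digit) & mask # consume any further digits
    errors = sep_s & ~cursor                  # terminators that no cursor reached
    return [i for i in range(n) if (errors >> i) & 1]
```

For example, `invalid_field_ends(b"123;45;7a;;900;")` flags the terminators at positions 9 and 10, which close the fields `7a` and the empty field. All fields are checked by the same handful of bitwise operations, with no per-field loop, which is the property a SIMD implementation would exploit with fixed-width registers.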
Copyright is held by the author.
The author granted permission for the file to be printed, but not for the text to be copied and pasted.
Supervisor or Senior Supervisor
Thesis advisor: Cameron, Robert