Resource type
Thesis type
((Thesis)/(Dissertation)) Ph.D.
Date created
2012-03-06
Authors/Contributors
Author: Jowhari, Hossein
Abstract
This thesis is concerned with the study of problems related to the measurement of disorder in the data stream model where the input, accessed in sequential manner, is a long sequence while there are strict memory limitations. In particular, we study sublinear-space algorithms for the following problems. Measuring sortedness. In this category, we study the problem of approximating the length of the longest increasing subsequence of a stream. Its dual problem, known as distance to monotonicity, is another measure that is considered in this work. Measuring periodicity. We study detecting periodicity and estimating closedness to periodicity in time-series data streams. Finding duplicates. In contrast to periodicity which concerns consecutive repetitions, we study standard notion of repetition as well and give algorithms for reporting a duplicate in data streams. Streaming algorithms and lower bounds on the memory requirement are presented for the above problems. In particular, in our main contributions, a tight lower bound on the space complexity of deterministic estimation of the length of the longest increasing subsequences is shown; a streaming algorithm for pattern matching and computing the period of a sequence are given, and finally we show tight bounds on the space complexity of finding duplicates in long streams. To accomplish the latter, we have designed an efficient streaming samplers that has broad applications in processing dynamic data streams.
Document
Identifier
etd7104
Copyright statement
Copyright is held by the author.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Ergun, Funda
Member of collection
Download file | Size |
---|---|
etd7104_HJowhari.pdf | 1.08 MB |