Search engines are a key factor shaping the way people interact with today's World Wide Web. It is therefore critically important for webmasters to understand and improve the ranking of their pages, and equally important for search engines to inspect how a page obtains its ranking in order to detect malicious attempts to manipulate the natural ranking of pages. PageRank is a popular algorithm used by search engines such as Google to rank the web pages returned as search results. It assigns each web page (i.e., graph node) a score reflecting its importance, which is a function of the scores of the other pages that have a hyperlink, or a path of hyperlinks (graph edges), to that page. Beyond the web graph, PageRank is also used to rank graph nodes in other contexts, such as the citation network of the scholarly literature. In this thesis, we take a closer look at the PageRank algorithm on graphs and analyze how each node collects its PageRank score from other nodes. We develop a systematic method for calculating the contribution that individual nodes make to each other's PageRank score, i.e., the difference it would make to the PageRank score of node v if node u did or did not exist. We then present an approximation algorithm with guaranteed error bounds for approximating page contributions to any given target node v; the algorithm operates in a local neighborhood of v, since real-world graphs such as the web are too large to be processed as a whole. We evaluate our algorithm on a web graph and a citation graph dataset. Our experimental results indicate that we can estimate page contribution values with good accuracy. Moreover, our results show that we can find high-contribution supporter nodes for a given target node in less time than previous works.
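To make the two notions in the abstract concrete, here is a minimal sketch (not the thesis's local approximation algorithm): PageRank computed by standard power iteration, and the contribution of a node u to a target v measured naively, as the abstract defines it, by comparing v's score with and without u in the graph. The graph representation and function names are illustrative assumptions.

```python
# Illustrative sketch only: exact PageRank by power iteration, and the
# "contribution" of node u to a target's score, taken here as the change
# in the target's score when u (and its links) are deleted. The thesis
# instead approximates contributions locally with error bounds; this
# brute-force version recomputes PageRank on the whole graph.

def pagerank(graph, damping=0.85, iters=100):
    """graph: dict mapping node -> list of out-neighbors."""
    nodes = list(graph)
    n = len(nodes)
    scores = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            out = graph[u]
            if out:
                share = damping * scores[u] / len(out)
                for v in out:
                    new[v] += share
            else:
                # Dangling node: spread its mass uniformly over all nodes.
                for v in nodes:
                    new[v] += damping * scores[u] / n
        scores = new
    return scores

def contribution(graph, u, target, **kw):
    """Change in target's PageRank score when node u is removed."""
    with_u = pagerank(graph, **kw)[target]
    reduced = {v: [w for w in nbrs if w != u]
               for v, nbrs in graph.items() if v != u}
    without_u = pagerank(reduced, **kw)[target]
    return with_u - without_u
```

For example, in the tiny graph `{"a": ["c"], "b": ["c"], "c": []}`, node c is linked to by both a and b and receives the highest score, and `contribution(graph, "a", "c")` quantifies how much a's presence alters c's score. Note that deleting a node also changes the normalization (scores always sum to 1), so a contribution computed this way can be negative even for a supporter.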
Copyright is held by the author.
Thesis advisor: Bulatov, Andrei