The news at 404: Archiving and accessing online news content

Resource type
Thesis type
(Thesis) M.A.
Date created
As news outlets focus their resources on producing more digital content, in some cases ceasing print production entirely, the questions of if and how online news is archived become ever more pressing. While methods of delivering news content online have developed extensively over the past decade, strategies for preserving this content remain unclear, leaving content vulnerable to erasure. Moreover, the study of these archiving efforts remains largely at the fringe of academic research. Digital news cannot be archived in the same ways as its paper counterpart. The content is archived primarily through news media websites and news aggregation sites such as Factiva, LexisNexis, and ProQuest's Canadian Newsstream, which offer subscription-based access to archived digital content. This study aims to contribute to the understanding of the processes, policies, and perhaps politics, of news archiving in the digital era. This project documents the rates at which online Canadian national newspapers archive digital articles. In addition to monitoring archiving and permanent deletion trends, this project presents the variation in rates at which national news articles are archived by the secondary archiving services Canadian Newsstream, Factiva, and LexisNexis. For this project, a sample of 688 online articles was collected over a constructed week from the Globe and Mail, National Post, and CBC News websites. Of the 688 total, 210 stories were from CBC, 240 were from the Globe, and 238 were from the Post. A quantitative content analysis was conducted on the sample to identify potential trends in how and why some articles are excluded from media outlets' archives and secondary archives. At the end of a five-year observation period, 584 of the original 688 articles were still available through the original sites. Fifty-five of the original 688 stories were permanently deleted from the news sites.
The study finds significant article loss by the Canadian Newsstream, Factiva, and LexisNexis archives, with rates of missing articles three to five times higher for these secondary sites than for the news media sites. Several factors affect the archiving rates for articles in the samples. They include the parceling of licensed content, the use of video content on news websites, and the reliance on wire stories, which are not archived at the same rate as content generated in-house. At its root, this project seeks to raise questions about long-term access to information. As the news media transitions further into the digital realm, the ability of individuals to access content becomes less certain. This potentially affects community memory and the ability of individuals to access their history through media, and it reduces the capacity of researchers to conduct news media-based historical analyses. Threats to future access to Canadian digital news media are threats to myriad forms of research that rely on news articles for historical information.
Copyright statement
Copyright is held by the author.
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Gruneau, Rick
Member of collection
