What Happened to Gone but Not Forgotten: Recovering the Dead Web?

'Gone but Not Forgotten: Recovering the Dead Web' names the ongoing global effort to combat link rot and the loss of digital information. Led largely by the Internet Archive and its Wayback Machine, the effort archives large portions of the web to preserve cultural heritage and keep the historical record accessible, despite the challenges posed by dynamic web content and limited resources. Studies published in 2024 and 2026 continue to document the high rate at which web pages disappear, underscoring the critical and evolving role of digital preservation.

Quick Answer

The phrase "Gone but Not Forgotten: Recovering the Dead Web" refers to the critical and ongoing work of digital preservationists, primarily the Internet Archive, to save disappearing online content. As of July 2026, efforts continue to intensify, driven by alarming statistics on link rot and the publication of new research like the 2026 book "Vanishing Culture." The Internet Archive's Wayback Machine remains central, actively rescuing a significant portion of otherwise lost web pages and expanding its reach through new initiatives like Internet Archive Switzerland and collaborations to address the 'Digital Curation Crisis.'

📊 Key Facts

Webpages from 2013 inaccessible a decade later: 38% (Pew Research Center, 2024)
Pages sampled 2013-2023 now inaccessible: 25% (Pew Research Center, 2024)
Dead pages rescued by Wayback Machine: ~15% (Internet Archive, 2026 analysis)
URLs found dead on live web, 2023 data: 65% (Old Dominion University, 2026)
Wayback Machine archived pages, Oct 2025: over 1 trillion (Wikipedia, 2025)
Wayback Machine data archived, Oct 2025: over 99 petabytes (Wikipedia, 2025)
Broken links fixed by TARB project: over 30 million (Internet Archive, 2026)

📅 Complete Timeline (15 events)

1. May 1996 (Major)

Internet Archive Begins Archiving Web Pages

The Internet Archive begins caching web pages, laying the groundwork for future digital preservation efforts.

2. October 2001 (Critical)

Wayback Machine Launched Publicly

Brewster Kahle and Bruce Gilliat launch the Wayback Machine, providing public access to archived web pages and aiming for 'universal access to all knowledge.'

3. October 2009 (Major)

Yahoo Discontinues GeoCities; Internet Archive Preserves Content

Yahoo shuts down GeoCities, a major early web hosting service. The Internet Archive conducts deep crawls to preserve as much of its content as possible, creating a significant collection of 'dead web' material.

4. October 2012 (Notable)

Internet Archive Makes 80TB of Crawl Data Available for Research

The Internet Archive announces the availability of 80 terabytes of archived web crawl data from 2011 for researchers, promoting broader engagement with web archives.

5. June 2021 (Major)

Jonathan Zittrain Publishes 'The Internet Is Rotting'

Jonathan Zittrain's article in The Atlantic highlights the pervasive problem of link rot, reporting that 25% of deep links in New York Times articles had rotted and that 72% of the links dating from 1998 were dead.

6. March 2022 (Major)

SUCHO Initiative Launched to Archive Ukrainian Heritage

The Saving Ukrainian Cultural Heritage Online (SUCHO) initiative is launched with support from the Internet Archive to preserve Ukrainian digital cultural heritage amidst conflict, demonstrating rapid response archiving.

7. 2023 (Major)

Old Dominion University Link Rot Study Findings

A comprehensive longitudinal study from Old Dominion University, analyzing 27.3 million URLs from the Wayback Machine since 1996, reports that approximately 65% of sampled URLs were found dead on the live web in 2023.

8. 2024 (Critical)

Pew Research Center Publishes 'When Online Content Disappears'

Pew Research Center releases a significant study on link rot, finding that 38% of webpages from 2013 were inaccessible a decade later, and a quarter of all webpages sampled between 2013-2023 were no longer accessible.

9. October 2025 (Critical)

Wayback Machine Exceeds 1 Trillion Pages Archived

The Wayback Machine reaches a major milestone, archiving over 1 trillion web pages and more than 99 petabytes of data, demonstrating its continuous growth and scale.

10. January 2026 (Major)

Discussions on AI and Emerging Technologies in Archaeology

Conferences and lectures, such as 'This Is Archaeology,' begin to extensively explore the role of AI, blockchain, and other emerging technologies in transforming archaeological data management and preservation, addressing the 'Digital Curation Crisis.'

11. April 23, 2026 (Critical)

Internet Archive Publishes 'Gone but Not Forgotten: Recovering the Dead Web' Blog Post and 'Vanishing Culture' Book

The Internet Archive publishes a blog post with the titular phrase, summarizing link rot studies and the Wayback Machine's rescue efforts. Concurrently, they introduce the 2026 book 'Vanishing Culture: A Report on Our Fragile Cultural Record,' emphasizing the loss of digital memory.

12. May 27, 2026 (Major)

Internet Archive Switzerland Launched

The Internet Archive launches a new non-profit foundation in Switzerland, focusing on preserving endangered archives globally and pioneering the archiving of generative AI models.

13. June 5, 2026 (Major)

IIPC Web Archiving Conference Highlights Open-Source Sustainability

The International Internet Preservation Consortium (IIPC) Web Archiving Conference in Brussels emphasizes the critical need for sustained investment in open-source tools and collective stewardship for web archiving infrastructure.

14. June 23, 2026 (Major)

Cambridge University Press Discusses Digital Curation Crisis in Archaeology

Cambridge University Press publishes an article detailing how the 'Digital Curation Crisis' is endangering archaeology, highlighting the vast amounts of digital data being created without adequate long-term preservation strategies.

15. July 2, 2026 (Major)

Library of Congress Launches America 250 Semiquincentennial Web Archive

The Library of Congress announces the America 250 Semiquincentennial Web Archive, documenting how Americans commemorate the nation's 250th anniversary, showcasing ongoing efforts in targeted web archiving for historical events.

🔍 Deep Dive Analysis

The 'dead web' is the product of link rot: web pages become inaccessible over time because sites close, content is removed, or URLs change. This digital decay threatens the historical record and cultural memory, and it motivates the continuing efforts grouped under the banner of 'Gone but Not Forgotten: Recovering the Dead Web.' The Internet Archive, through its Wayback Machine, is the foremost institution addressing this challenge, aiming to provide 'universal access to all knowledge' by archiving what is now more than a trillion web pages.

The urgency of this recovery work is consistently highlighted by various studies. For instance, a 2024 Pew Research Center study revealed that 38% of webpages from 2013 were no longer accessible a decade later, and approximately 25% of pages sampled between 2013 and 2023 had vanished. An analysis by Internet Archive data scientist Sawood Alam, published in April 2026, demonstrated that the Wayback Machine had successfully rescued roughly 15% of these otherwise lost pages. Earlier research, such as Jonathan Zittrain's 2021 article in The Atlantic, 'The Internet Is Rotting,' found that 72% of New York Times links from 1998 were dead, while a 2023 Old Dominion University study (analyzing data since 1996) reported that about 65% of sampled URLs were dead on the live web. Brewster Kahle, founder of the Internet Archive, has long cited the average life of a webpage to be a mere 40 to 100 days.
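Percentages like the ones above come from checking a sample of URLs and counting how many no longer resolve. The following is a minimal sketch of that arithmetic, not the method of any cited study. The function name `link_rot_by_year`, the `DEAD_STATUSES` set, and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Assumption: which outcomes count as "dead" is a methodological choice.
# This set is illustrative and is not taken from any of the cited studies.
DEAD_STATUSES = {404, 410}


def link_rot_by_year(samples):
    """Return {year: percent of sampled URLs that are dead}.

    samples is an iterable of (year_published, http_status) pairs, where
    http_status is None when the request failed outright (DNS error, timeout).
    """
    total = defaultdict(int)
    dead = defaultdict(int)
    for year, status in samples:
        total[year] += 1
        if status is None or status in DEAD_STATUSES:
            dead[year] += 1
    return {year: round(100 * dead[year] / total[year], 1) for year in sorted(total)}


# Made-up sample: four URLs first seen in 2013 and four first seen in 2023.
sample = [
    (2013, 404), (2013, 200), (2013, None), (2013, 200),
    (2023, 200), (2023, 200), (2023, 410), (2023, 200),
]
print(link_rot_by_year(sample))  # {2013: 50.0, 2023: 25.0}
```

Real studies differ in how they treat redirects, soft 404s, and server errors, so figures from different sources are not always directly comparable.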

Key turning points in this ongoing battle include the continuous development and expansion of the Wayback Machine, which by October 2025 had archived over 1 trillion web pages and 99 petabytes of data. Initiatives like the 'Save Page Now' service empower individual users to contribute to preservation, while the 'Turn All References Blue (TARB)' project has fixed over 30 million broken links on wikis by leveraging the Internet Archive's resources. However, significant challenges persist, including resource limitations, the complexity of archiving JavaScript-heavy or dynamic content, and deliberate blocking by some publishers who, despite often relying on the Wayback Machine for their own research, prevent their content from being archived.
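To make the recovery step concrete, here is a hedged sketch of looking up an archived copy of a dead link through the Wayback Machine's public availability endpoint (`https://archive.org/wayback/available`). The helper names `availability_url`, `closest_snapshot`, and `lookup` are hypothetical, and the response shape shown in the comments is an assumption based on the Internet Archive's published API description. Check the current documentation before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL; timestamp is an optional YYYYMMDDhhmmss prefix."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response):
    """Extract the snapshot URL from a decoded JSON response, or None.

    Assumed shape: {"archived_snapshots": {"closest": {"available": true,
    "url": "...", "timestamp": "...", "status": "200"}}}
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(page_url, timestamp=None):
    """Perform the network request (not executed in this sketch)."""
    with urlopen(availability_url(page_url, timestamp), timeout=10) as resp:
        return closest_snapshot(json.load(resp))


# Offline demonstration with a hand-written response in the assumed shape.
fake = {"archived_snapshots": {"closest": {
    "available": True,
    "url": "http://web.archive.org/web/20130919044612/http://example.com/",
    "timestamp": "20130919044612",
    "status": "200",
}}}
print(closest_snapshot(fake))
print(closest_snapshot({"archived_snapshots": {}}))  # None: no capture found
```

A link checker could call `lookup` only for URLs already found dead, then substitute the returned snapshot URL for the broken link, which is broadly the kind of repair the TARB project performs at scale.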

As of July 2026, the field of digital preservation is seeing renewed focus and innovation. The 2026 book, "Vanishing Culture: A Report on Our Fragile Cultural Record" by Messarra et al., further emphasizes the critical role of libraries and archives in maintaining cultural history. The Internet Archive launched Internet Archive Switzerland in May 2026, specifically to protect endangered archives globally and begin archiving generative AI models, marking a new frontier in preservation. Discussions at the June 2026 IIPC Web Archiving Conference in Brussels highlighted the need for sustained investment in open-source web archiving infrastructure and collective responsibility. Simultaneously, the 'Digital Curation Crisis' is a growing concern in archaeology, where vast amounts of digital data are being produced without clear long-term preservation paths, underscoring the broader societal need for robust digital archiving strategies.

People Also Ask

What is the 'Dead Web'?
The 'Dead Web' is the large body of online content that has become inaccessible over time because of website shutdowns, page removals, or broken links, a phenomenon commonly known as link rot. This loss of digital information makes it harder to preserve historical and cultural records.

How is the 'Dead Web' being recovered?
The 'Dead Web' is primarily being recovered by organizations like the Internet Archive, whose Wayback Machine has systematically archived more than a trillion web pages. Other initiatives include 'Save Page Now' for user contributions, projects like 'Turn All References Blue' to fix broken links, and specialized archiving tools.

What is the role of the Wayback Machine in this effort?
The Wayback Machine is central to recovering the 'Dead Web.' It acts as a digital library that stores historical snapshots of websites, lets users view how pages looked in the past, and has rescued roughly 15% of the pages that a 2024 Pew study found had otherwise disappeared.

How much web content is lost to link rot?
A 2024 Pew Research Center study found that 38% of webpages from 2013 were inaccessible a decade later, and 25% of pages sampled between 2013 and 2023 were no longer available. Brewster Kahle, founder of the Internet Archive, has cited an average webpage lifespan of only 40 to 100 days.

What are the biggest challenges in recovering the Dead Web?
Key challenges include the sheer volume and dynamic nature of web content, technical limitations with JavaScript-heavy pages, deliberate blocking of archiving bots by some websites, paywalls, login requirements, and the vastness of the 'deep web,' which is not easily crawled. Resource limitations also play a significant role.