

What are Librarians Doing (or Facing) About Digital Banning, Information Loss, and Censorship


My collection as of Feb. 10, 2025

As we contemplate the potential loss of so much information under the Trump regime, positive efforts are under way to safeguard the digital record of medical, legal, foreign aid, political, and many other kinds of information. Some of these projects archive or back up the data outside the USA to ensure that there won’t be a new digital Dark Ages.

Sadly, this is no Chicken Little “The Sky is Falling” stuff.

Whitehouse.gov captures from: 2008 Sept. 15; 2013 Mar. 21; 2017 Feb. 3; and 2021 Feb. 25

Every four years, before and after the U.S. presidential election, a team of libraries and research organizations, including the Internet Archive, works together to preserve material from U.S. government websites during the transition of administrations.

These “End of Term” (EOT) Web Archive projects have been completed for term transitions in 2004, 2008, 2012, 2016, and 2020, with 2024 well underway. The effort preserves a record of the U.S. government as it changes over time for historical and research purposes.

With two-thirds of the process complete, the 2024/2025 EOT crawl has collected more than 500 terabytes of material, including more than 100 million unique web pages. All this information, produced by the U.S. government—the largest publisher in the world—is preserved and available for public access at the Internet Archive.

“Access by the people to the records and output of the government is critical,” said Mark Graham, director of the Internet Archive’s Wayback Machine and a participant in the EOT Web Archive project. “Much of the material published by the government has health, safety, security and education benefits for us all.”

The EOT Web Archive project is part of the Internet Archive’s daily routine of recording what’s happening on the web. For more than 25 years, the Internet Archive has worked to preserve material from web-based social media platforms, news sources, governments, and elsewhere across the web. Access to these preserved web pages is provided by the Wayback Machine. “It’s just part of what we do day in and day out,” Graham said. 

To support the EOT Web Archive project, the Internet Archive devotes staff and technical infrastructure to focus on preserving U.S. government sites. The web archives are based on seed lists of government websites and nominations from the general public. Coverage includes websites in the .gov and .mil web domains, as well as government websites hosted on .org, .edu, and other top level domains. 

The Internet Archive provides a variety of discovery and access interfaces to help the public search and understand the material, including APIs and a full text index of the collection. Researchers, journalists, students, and citizens from across the political spectrum rely on these archives to help understand changes on policy, regulations, staffing and other dimensions of the U.S. government. 
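As one illustration of the APIs mentioned above, the Wayback Machine offers a public availability endpoint that reports the archived capture closest to a given date. The following is a minimal Python sketch, not the EOT project's own tooling. The helper names (`build_query`, `closest_snapshot`, `lookup`) are hypothetical, and the response shape is assumed to be a JSON object with an `archived_snapshots.closest` entry.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query(url, timestamp=None):
    """Build the request URL; timestamp is YYYYMMDD (or longer), optional."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(payload):
    """Return (snapshot_url, timestamp) from a decoded response, or None.

    Assumes the response nests the result under
    payload["archived_snapshots"]["closest"].
    """
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def lookup(url, timestamp=None):
    """Query the endpoint over the network and return the closest capture."""
    with urlopen(build_query(url, timestamp), timeout=30) as response:
        return closest_snapshot(json.load(response))
```

Calling `lookup("whitehouse.gov", "20170203")` would ask for the capture nearest to Feb. 3, 2017. The request building and response parsing are kept separate from the network call so that each piece can be checked offline.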

As an added layer of preservation, the 2024/2025 EOT Web Archive will be uploaded to the Filecoin network for long-term storage, where previous term archives are already stored. While separate from the EOT collaboration, this effort is part of the Internet Archive’s Democracy’s Library project. Filecoin Foundation (FF) and Filecoin Foundation for the Decentralized Web (FFDW) support Democracy’s Library to ensure public access to government research and publications worldwide.

According to Graham, the 2024/2025 EOT crawl is so large because the team gets better with experience every term and because growing use of the web as a publishing platform means more material to archive. He also credits the EOT Web Archive’s success to the support and collaboration from its partners.

Web archiving is more than just preserving history—it’s about ensuring access to information for future generations. The End of Term Web Archive serves to safeguard versions of government websites that might otherwise be lost. By preserving this information and making it accessible, the EOT Web Archive has empowered researchers, journalists and citizens to trace the evolution of government policies and decisions.

More questions? Visit https://eotarchive.org/ to learn more about the End of Term Web Archive.

Harvard Law School Library Innovation Lab Announces Launch of Data.gov Archive

https://lil.law.harvard.edu/blog/2025/02/06/announcing-data-gov-archive/

“Today we released our archive of data.gov on Source Cooperative. The 16TB collection includes over 311,000 datasets harvested during 2024 and 2025, a complete archive of federal public datasets linked by data.gov. It will be updated daily as new datasets are added to data.gov.

This is the first release in our new data vault project to preserve and authenticate vital public datasets for academic research, policymaking, and public use.

We’ve built this project on our long-standing commitment to preserving government records and making public information available to everyone. Libraries play an essential role in safeguarding the integrity of digital information. By preserving detailed metadata and establishing digital signatures for authenticity and provenance, we make it easier for researchers and the public to cite and access the information they need over time.

In addition to the data collection, we are releasing open source software and documentation for replicating our work and creating similar repositories. With these tools, we aim not only to preserve knowledge ourselves but also to empower others to save and access the data that matters to them.

For suggestions and collaboration on future releases, please contact us at [email protected].

This project builds on our work with the Perma.cc web archiving tool used by courts, law journals, and law firms; the Caselaw Access Project, sharing all precedential cases of the United States; and our research on Century Scale Storage. This work is made possible with support from the Filecoin Foundation for the Decentralized Web and the Rockefeller Brothers Fund.”

Archivists Work to Identify and Save the Thousands of Datasets Disappearing From Data.gov

Jason Koebler, Jan 30, 2025 at 2:36 PM

More than 2,000 datasets have disappeared from data.gov since Trump was inaugurated. But analyzing exactly what happened and where it went is going to take some time.

https://www.404media.co/archivists-work-to-identify-and-save-the-thousands-of-datasets-disappearing-from-data-gov/

The NSA’s “Big Delete”

Today, the National Security Agency (NSA) is planning a “Big Delete” of websites and internal network content that contain any of 27 banned words.

https://popular.info/p/the-nsas-big-delete

Update on the 2024/2025 End of Term Web Archive

https://blog.archive.org/2025/02/06/update-on-the-2024-2025-end-of-term-web-archive/

A complete archive of all CDC datasets uploaded before January 28th, 2025

Under the direction of Trump, the CDC has been removing data related to gender, vaccines, and climate change.

CDC datasets uploaded before January 28th, 2025 : Centers for Disease Control and Prevention · archive.org

An archive of all CDC datasets uploaded to https://data.cdc.gov/browse before January 28th, 2025.

Excludes corrupt datasets and data not publicly accessible.

https://archive.org/details/20250128-cdc-datasets

CDC site scrubs HIV content following Trump DEI policies

The federal health agency began removing all content related to gender identity on Friday.

https://www.nbcnews.com/health/health-news/trump-dei-hiv-cdc-website-removed-lgbtq-rcna190068

CDC Ordered to Scrub Website of Words Like ‘Transgender’ and ‘LGBT’

A CDC employee spoke with Gizmodo about the “unprecedented” changes.


Posted on: February 10, 2025, 5:48 pm Category: Uncategorized

One Response


  1. BV Martens said

    Thank you, thank you for this!
