====The Oakland Archive Policy====
The Wayback Machine's retroactive exclusion policy is based in part upon ''Recommendations for Managing Removal Requests and Preserving Archival Integrity'', known as ''The Oakland Archive Policy'', published in 2002 by the School of Information Management and Systems at [[University of California, Berkeley]]. The policy gives a website owner the right to block access to the site's archives.<ref>{{cite web |title=Recommendations for Managing Removal Requests And Preserving Archival Integrity |date=December 14, 2002 |publisher=[[University of California]] |url=http://www2.sims.berkeley.edu/research/conferences/aps/removal-policy.html |access-date=October 20, 2024 |url-status=dead |archive-url=https://web.archive.org/web/20030502165937/http://sims.berkeley.edu/research/conferences/aps/removal-policy.html |archive-date=May 2, 2003}}</ref> Wayback has complied with this policy to help avoid expensive litigation.<ref>{{cite web |title=Retroactive robots.txt removal of past crawls AKA Oakland Archive Policy |date=July 7, 2014 |publisher=Internet Archive |url=https://archive.org/post/1019415/retroactive-robotstxt-removal-of-past-crawls-aka-oakland-archive-policy |access-date=September 14, 2017 |url-status=live |archive-url=https://web.archive.org/web/20171010124036/https://archive.org/post/1019415/retroactive-robotstxt-removal-of-past-crawls-aka-oakland-archive-policy |archive-date=October 10, 2017 }}</ref> The retroactive exclusion policy began to relax in 2017, when Wayback stopped honoring robots.txt on U.S. government and military web sites for both crawling and displaying web pages. As of April 2017, Wayback is ignoring robots.txt more broadly, not just for U.S.
government websites.<ref>{{cite web |url=http://blog.archive.org/2017/04/17/robots-txt-meant-for-search-engines-dont-work-well-for-web-archives/ |title=Robots.txt meant for search engines don't work well for web archives |work=Internet Archive Blogs |first=Mark |last=Graham |date=April 17, 2017 |access-date=April 16, 2017 |url-status=live |archive-url=https://web.archive.org/web/20170417131508/http://blog.archive.org/2017/04/17/robots-txt-meant-for-search-engines-dont-work-well-for-web-archives/ |archive-date=April 17, 2017}}</ref><ref>{{cite web |title=Archivierung des Internets: Internet Archive ignoriert künftig robots.txt |date=April 25, 2017 |url=https://www.heise.de/newsticker/meldung/Archivierung-des-Internets-Internet-Archive-ignoriert-kuenftig-robots-txt-3693558.html |publisher=heise online |access-date=May 14, 2017 |language=de |url-status=live |archive-url=https://web.archive.org/web/20170427035659/https://www.heise.de/newsticker/meldung/Archivierung-des-Internets-Internet-Archive-ignoriert-kuenftig-robots-txt-3693558.html |archive-date=April 27, 2017}}</ref><ref>{{cite web |title=Suchmaschinen: Internet Archive will künftig Robots.txt-Einträge ignorieren – Golem.de |url=https://www.golem.de/news/suchmaschinen-internet-archive-will-kuenftig-robots-txt-eintraege-ignorieren-1704-127446.html |access-date=May 14, 2017 |language=de |url-status=live |archive-url=https://web.archive.org/web/20170619210648/https://www.golem.de/news/suchmaschinen-internet-archive-will-kuenftig-robots-txt-eintraege-ignorieren-1704-127446.html |archive-date=June 19, 2017}}</ref><ref>{{cite news |title=Internet Archive will ignore robots.txt files to keep historical record accurate |url=https://www.digitaltrends.com/computing/internet-archive-robots-txt/ |newspaper=Digital Trends |access-date=May 14, 2017 |date=April 24, 2017 |url-status=live |archive-url=https://web.archive.org/web/20170516130029/https://www.digitaltrends.com/computing/internet-archive-robots-txt/ |archive-date=May 16, 2017}}</ref>