==Uses==
From its public launch in 2001, the Wayback Machine has been studied by scholars both for the ways it stores and collects data and for the actual pages contained in its archive. As of 2013, scholars had written about 350 articles on the Wayback Machine, mostly from the [[information technology]], [[Library and information science|library science]], and [[social science]] fields. Social science scholars have used the Wayback Machine to analyze how the development of a company's website from the mid-1990s to the present has affected that company's growth.<ref name="Arora" />

When the Wayback Machine archives a page, it usually includes most of the hyperlinks, keeping those links active when they could just as easily have been broken by the Internet's instability. Researchers in India studied how effectively the Wayback Machine saves hyperlinks in online scholarly publications and found that it saved slightly more than half of them.<ref>{{cite journal |last1=Sampath Kumar |first1=B.T. |last2=Prithviraj |first2=K.R. |date=October 21, 2014 |title=Bringing life to dead: Role of Wayback Machine in retrieving vanished URLs |journal=Journal of Information Science |volume=41 |issue=1 |pages=71–81 |doi=10.1177/0165551514552752 |s2cid=28320982 |issn=0165-5515}}</ref>

"Journalists use the Wayback Machine to view dead websites, dated news reports, and changes to website contents. Its content has been used to hold politicians accountable and expose battlefield lies."<ref name="Nelson">{{cite web |url=https://www.usnews.com/news/articles/2016-08-17/wayback-machine-wont-censor-archive-for-taste-director-says-after-olympics-article-scrubbed |first1=Steven |last1=Nelson |date=August 17, 2016 |website=U.S.
News & World Report |title=Wayback Machine Won't Censor Archive for Taste, Director Says After Olympics Article Scrubbed |archive-url=https://web.archive.org/web/20170106151933/http://www.usnews.com/news/articles/2016-08-17/wayback-machine-wont-censor-archive-for-taste-director-says-after-olympics-article-scrubbed |archive-date=January 6, 2017 |url-status=live |access-date=May 14, 2017}}</ref> In 2014, an archived social media page of [[Igor Girkin]], a separatist rebel leader in Ukraine, showed him boasting about his troops having shot down a suspected Ukrainian military airplane before it became known that the plane was actually a civilian Malaysia Airlines jet ([[Malaysia Airlines Flight 17]]), after which he deleted the post and blamed Ukraine's military for downing the plane.<ref name="Nelson"/><ref>{{cite magazine |title=What the Web Said Yesterday |url=https://www.newyorker.com/magazine/2015/01/26/cobweb |url-access=limited |magazine=[[The New Yorker]] |access-date=May 14, 2017 |url-status=live |archive-url=https://web.archive.org/web/20150125141230/http://www.newyorker.com/magazine/2015/01/26/cobweb |archive-date=January 25, 2015 |first=Jill |last=Lepore |author-link=Jill Lepore |date=January 26, 2015 }}</ref>

In 2017, the [[March for Science]] originated from a discussion on [[Reddit]] in which someone reported having visited Archive.org and discovered that all references to [[climate change]] had been deleted from the White House website.
In response, a user commented, "There needs to be a Scientists' March on Washington".<ref>{{cite news |title=The March for Science began with this person's 'throwaway line' on Reddit |url=https://www.washingtonpost.com/news/speaking-of-science/wp/2017/04/21/the-march-for-science-began-with-this-persons-throwaway-line-on-reddit/ |date=April 21, 2017 |first1=Ben |last1=Guarino |newspaper=Washington Post |access-date=April 23, 2017 |url-status=live |archive-url=https://web.archive.org/web/20170423081417/https://www.washingtonpost.com/news/speaking-of-science/wp/2017/04/21/the-march-for-science-began-with-this-persons-throwaway-line-on-reddit/ |archive-date=April 23, 2017}}</ref><ref>{{cite news |url=https://www.washingtonpost.com/news/speaking-of-science/wp/2017/01/24/are-scientists-going-to-march-on-washington/ |url-access=subscription |date=January 25, 2017 |first1=Sarah |last1=Kaplan |title=Are scientists going to march on Washington? |newspaper=The Washington Post |access-date=January 31, 2017 |url-status=live |archive-url=https://web.archive.org/web/20170131152535/https://www.washingtonpost.com/news/speaking-of-science/wp/2017/01/24/are-scientists-going-to-march-on-washington/ |archive-date=January 31, 2017}}</ref><ref>{{cite news |last1=Foley |first1=Katherine Ellen |title=The global March for Science started with a single Reddit thread |url=https://qz.com/965485/the-global-march-for-science-started-with-a-single-reddit-thread/ |date=April 22, 2017 |work=Quartz |access-date=April 23, 2017 |url-status=live |archive-url=https://web.archive.org/web/20170424004314/https://qz.com/965485/the-global-march-for-science-started-with-a-single-reddit-thread/ |archive-date=April 24, 2017}}</ref>

[[Wikipedia community|Wikipedia editors]] rely heavily on the site for verification, access to references, and content creation.<ref name="Graham">{{Cite web|url=http://blog.archive.org/2018/10/01/more-than-9-million-broken-links-on-wikipedia-are-now-rescued/|title=More
than 9 million broken links on Wikipedia are now rescued|first=Mark|last=Graham|date=October 1, 2018 |website=Internet Archive Blogs |url-status=live |archive-url= https://archive.today/20230408194542/http://blog.archive.org/2018/10/01/more-than-9-million-broken-links-on-wikipedia-are-now-rescued/ |archive-date= April 8, 2023 }}</ref> The Internet Archive has been archiving new URLs as they are added to Wikipedia.<ref name="Graham" />

In September 2020, the Internet Archive announced a partnership with [[Cloudflare]] to automatically archive websites served via Cloudflare's "Always Online" service, which also lets Cloudflare direct users to the archived copy of a site when it cannot reach the original host.<ref name="archive-partners">{{Cite web |last=Graham |first=Mark |date=September 17, 2020 |title=Cloudflare and the Wayback Machine, joining forces for a more reliable Web |url= http://blog.archive.org/2020/09/17/internet-archive-partners-with-cloudflare-to-help-make-the-web-more-useful-and-reliable/ |access-date=September 17, 2020 |website= Internet Archive Blogs}}</ref>

=== Limitations ===
In 2014, there was a six-month lag time between when a website was crawled and when it became available for viewing in the Wayback Machine.<ref>{{cite web |url=https://archive.org/about/faqs.php |title=Internet Archive Frequently Asked Questions |date=April 2, 2014 |website=Internet Archive |archive-url=https://web.archive.org/web/20140402223358/https://archive.org/about/faqs.php |archive-date=April 2, 2014|url-status=dead|access-date=November 23, 2018}}</ref> As of 2024, the lag time is 3 to 10 hours.<ref name="Using" />

The Wayback Machine offers only limited search facilities. Its "Site Search" feature allows users to find a site based on words describing the site, rather than words found on the web pages themselves.<ref name="Bates" />

The Wayback Machine does not include every web page ever made due to the limitations of its web crawler.
The Wayback Machine cannot completely archive web pages that contain interactive features such as Flash platforms, forms written in JavaScript, and [[progressive web application]]s, because those functions require interaction with the host website. For example, since approximately July 9, 2013, the Wayback Machine has been unable to display YouTube comments when saving videos' watch pages because, according to the Archive Team, comments are no longer "loaded within the page itself."<ref>{{cite web|url=https://www.archiveteam.org/index.php?title=YouTube#Comment_loading|website=archiveteam.org|title=YouTube – Archiveteam|access-date=August 6, 2020|archive-date=August 5, 2020|archive-url=https://web.archive.org/web/20200805184742/https://www.archiveteam.org/index.php?title=YouTube#Comment_loading|url-status=live}}</ref> The Wayback Machine's web crawler has difficulty extracting anything not coded in HTML or one of its variants, which can often result in broken hyperlinks and missing images. The web crawler also cannot archive "orphan pages" that are not linked to by other pages.<ref name="Bates">{{cite journal |last=Bates |first=Mary Ellen |date=2002 |title=The Wayback Machine |journal=Online |volume=26 |pages=80 }}</ref><ref>{{cite web |url=https://archive.org/about/faqs.php |title=Internet Archive Frequently Asked Questions |website=Internet Archive |access-date=October 18, 2018 |archive-url=https://web.archive.org/web/20130420213122/https://archive.org/about/faqs.php |archive-date=April 20, 2013 |url-status=live }}</ref> The Wayback Machine's crawler only follows a predetermined number of hyperlinks based on a preset depth limit, so it cannot archive every hyperlink on every page.<ref name="Crawls" />

===In legal evidence===
====Civil litigation====
=====''Netbula LLC v. Chordiant Software Inc.''=====
In a 2009 case, ''Netbula, LLC v.
Chordiant Software Inc.'', defendant Chordiant filed a motion to compel Netbula to disable the [[robots.txt]] file on its website that was causing the Wayback Machine to retroactively remove access to previous versions of pages it had archived from Netbula's site, pages that Chordiant believed would support its case.<ref name="Lloyd"/>

Netbula objected to the motion on the ground that defendants were asking to alter Netbula's website and that they should have subpoenaed Internet Archive for the pages directly.<ref>{{cite web |last=Cortes |first=Antonio |date=October 2009 |title=Motion Opposing Removal of Robots.txt |url=http://www.american-justice.org/index.cgi/Page/116/OPPOSITION-TO-MOTION-TO-COMPEL-REMOVAL-OF-ROBOT-TXT-FILE-FROM-WEBSITE/ |access-date=October 15, 2009 |url-status=dead |archive-url=https://web.archive.org/web/20101027050350/http://www.american-justice.org/index.cgi/Page/116/OPPOSITION-TO-MOTION-TO-COMPEL-REMOVAL-OF-ROBOT-TXT-FILE-FROM-WEBSITE |archive-date=October 27, 2010 }}</ref> An employee of Internet Archive, however, filed a sworn statement supporting Chordiant's motion, stating that it could not produce the web pages by any other means "without considerable burden, expense and disruption to its operations."<ref name="Lloyd"/>

Magistrate Judge Howard Lloyd in the Northern District of California, San Jose Division, rejected Netbula's arguments and ordered Netbula to disable the robots.txt blockage temporarily in order to allow Chordiant to retrieve the archived pages that it sought.<ref name="Lloyd">{{cite web |last=Lloyd |first=Howard |date=October 2009 |title=Order to Disable Robots.txt |url=http://www.american-justice.org/upload/page/123/69/docket-187-order-on-IA-motion.pdf |access-date=October 15, 2009 |archive-url=https://web.archive.org/web/20190808173832/http://www.american-justice.org/upload/page/123/69/docket-187-order-on-IA-motion.pdf |archive-date=August 8, 2019 |url-status=dead }}</ref>

=====''Telewizja Polska USA, Inc. v. Echostar Satellite''=====
In an October 2004 case, ''[[Telewizja Polska|Telewizja Polska USA, Inc.]] v. Echostar Satellite'', No. 02 C 3293, 65 Fed. R. Evid. Serv. 673 (N.D. Ill. October 15, 2004), a litigant attempted to use the Wayback Machine archives as a source of admissible evidence, perhaps for the first time. Telewizja Polska is the provider of [[TVP Polonia]] and [[EchoStar]] operates the [[Dish Network]]. Prior to the trial proceedings, EchoStar indicated that it intended to offer Wayback Machine snapshots as proof of the past content of Telewizja Polska's website. Telewizja Polska brought a motion ''[[in limine]]'' to suppress the snapshots on the grounds of [[hearsay]] and unauthenticated source, but Magistrate Judge Arlander Keys rejected Telewizja Polska's assertion of hearsay and denied TVP's motion ''in limine'' to exclude the evidence at trial.<ref>{{cite journal |last=Gelman |first=Lauren |date=November 17, 2004 |title=Internet Archive's Web Page Snapshots Held Admissible as Evidence |journal=Packets |volume=2 |issue=3 |url=http://cyberlaw.stanford.edu/packets002728.shtml |access-date=January 4, 2007 |archive-url=https://web.archive.org/web/20110430095339/http://cyberlaw.stanford.edu/packets002728.shtml |archive-date=April 30, 2011 |url-status=dead }}</ref><ref>{{cite journal |last=Howell |first=Beryl A.
|date=February 2006 |title=Proving Web History: How to use the Internet Archive |journal=Journal of Internet Law |pages=3–9 |url=http://www.strozfriedberg.com/files/Publication/fee98a34-d739-478b-a7db-6af37b757714/Presentation/PublicationAttachment/aae88469-9835-4fe4-ae5f-38637924314f/BAHPROVINGWEBHISTORY.pdf |archive-url=https://web.archive.org/web/20100705043226/http://www.strozfriedberg.com/files/Publication/fee98a34-d739-478b-a7db-6af37b757714/Presentation/PublicationAttachment/aae88469-9835-4fe4-ae5f-38637924314f/BAHPROVINGWEBHISTORY.pdf |url-status=dead |archive-date=July 5, 2010 |access-date=August 6, 2008}}</ref> At the trial, however, District Court Judge Ronald Guzman overruled Magistrate Judge Keys' findings and held that neither the affidavit of the Internet Archive employee nor the underlying pages (i.e., the Telewizja Polska website) were admissible as evidence. Judge Guzman reasoned that the employee's affidavit contained both hearsay and inconclusive supporting statements, and that the purported web page printouts were not self-authenticating.<ref>{{cite web |title=Looking For Evidence in Virtual Places Admissibility of Internet Evidence |url=https://www.netforlawyers.com/page/looking-evidence-virtual-places-admissibility-internet-evidence |access-date=June 14, 2020 |archive-url=https://web.archive.org/web/20190701055139/https://www.netforlawyers.com/page/looking-evidence-virtual-places-admissibility-internet-evidence |archive-date=July 1, 2019 |url-status=live}}</ref><ref>{{cite book |last1=Levitt |first1=Carole A. |last2=Rosch |first2=Mark E.
|title=Find Info Like a Pro: Mining the Internet's Publicly Available Resources for Investigative Research, Volume 1 |date=2010 |publisher=American Bar Association |isbn=978-1-60442-890-2 |pages=194–196 |url=https://books.google.com/books?id=SUErZdbvcOkC |access-date=June 14, 2020 |archive-date=December 18, 2020 |archive-url=https://web.archive.org/web/20201218143732/https://books.google.com/books?id=SUErZdbvcOkC |url-status=live }}</ref>

====Patent law====
{{Main|Internet as a source of prior art}}
The [[United States Patent and Trademark Office]] and the [[European Patent Office]] will accept date stamps from the Internet Archive as evidence of when a given Web page was accessible to the public. These dates are used to determine whether a Web page is available as [[prior art]], for instance, in examining a patent application.<ref>{{cite web |title=Prior Art in the Field of Business Method Patents – When is an Electronic Document a Printed Publication for Prior Art Purposes? |first=Wynn W. |last=Coggins |date=Fall 2002 |url=http://www.uspto.gov/patents/resources/methods/aiplafall02paper.jsp |work=USPTO |url-status=dead |archive-url=https://web.archive.org/web/20120921083344/http://www.uspto.gov/patents/resources/methods/aiplafall02paper.jsp |archive-date=September 21, 2012 |access-date=August 15, 2012 }}</ref>

====Limitations of utility====
There are technical limitations to archiving a website, and as a consequence, opposing parties in litigation can misuse the results provided by website archives. This problem can be exacerbated by the practice of submitting screenshots of web pages in complaints, answers, or expert witness reports when the underlying links are not exposed and the screenshots can therefore contain errors.
For example, archives such as the Wayback Machine do not fill out forms and therefore do not include the contents of non-[[Semantic URL|RESTful]] e-commerce databases in their archives.<ref>{{cite web |url=http://www.practice.com/2008/12/29/debunking-the-wayback-machine |title=Debunking the Wayback Machine |archive-url=https://web.archive.org/web/20100629050840/http://www.practice.com/2008/12/29/debunking-the-wayback-machine |archive-date=June 29, 2010}}</ref>
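
Both the date stamps discussed under patent law and the "underlying links" discussed above are carried by the snapshot link itself. The archive-url values cited in this section follow the pattern <code>https://web.archive.org/web/</code>, then a 14-digit stamp, then the original URL. The following is a minimal sketch of reading a capture time back out of such a link. It is an illustration only, not an Internet Archive tool: the names <code>SNAPSHOT_RE</code> and <code>parse_snapshot_url</code> are hypothetical, the code handles only the 14-digit form seen here, and it assumes the stamp is in UTC.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern for snapshot links of the form seen in this section's
# archive-url values: https://web.archive.org/web/<14 digits>/<original URL>
SNAPSHOT_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def parse_snapshot_url(url):
    """Return (capture_time, original_url) for a 14-digit snapshot link."""
    match = SNAPSHOT_RE.match(url)
    if match is None:
        raise ValueError(f"not a 14-digit snapshot URL: {url!r}")
    stamp, original = match.groups()
    # The 14 digits are read as YYYYMMDDhhmmss; UTC is an assumption here.
    captured = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return captured, original


if __name__ == "__main__":
    # Example link taken from a citation in the Limitations subsection.
    captured, original = parse_snapshot_url(
        "https://web.archive.org/web/20140402223358/https://archive.org/about/faqs.php"
    )
    print(captured.isoformat(), original)
```

A link that does not match the pattern raises <code>ValueError</code> rather than returning a guessed date. A screenshot, by contrast, offers no link to check.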