
Topic: How safe are archive sites ? (Read 249 times)

member
Activity: 116
Merit: 10
October 07, 2018, 01:46:29 AM
#5
Basically, they set a computer program (called a "crawler") to browse the Internet and capture all the pages it reaches. The crawler follows links on pages to find more pages, which means that popular pages will be captured more often, and pages to which no links exist (sometimes called the "deep web") are excluded. Pages that indicate they do not want to be archived, via an instruction in a file called "robots.txt," are not archived by the crawler.
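To make the description above concrete, here is a minimal sketch of such a crawler in Python. It is only an illustration, not how archive.org actually implements theirs: the `crawl` function, the `example-archiver` agent name and the `site.test` pages are all made up for the example, and the "Internet" is a small in-memory dictionary so that it runs without any network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, robots_txt="", agent="example-archiver", max_pages=100):
    """Breadth-first crawl; returns {url: html} for every page captured.

    fetch(url) is a caller-supplied function (an assumption of this sketch)
    that returns the page HTML, or None if the page cannot be loaded.
    """
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())

    captured = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue and len(captured) < max_pages:
        url = queue.popleft()
        if not robots.can_fetch(agent, url):
            continue  # robots.txt asks crawlers to stay away from this page
        html = fetch(url)
        if html is None:
            continue
        captured[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts
            link, _fragment = urldefrag(urljoin(url, href))
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return captured


# Hypothetical in-memory "Internet" standing in for real HTTP requests
site = {
    "http://site.test/": '<a href="/a">A</a> <a href="/private/secret">S</a>',
    "http://site.test/a": '<a href="/">home</a> <a href="/b">B</a>',
    "http://site.test/b": "no links here",
    "http://site.test/private/secret": "please do not archive",
    "http://site.test/orphan": "nothing links to this page",
}
robots_txt = "User-agent: *\nDisallow: /private/"

pages = crawl("http://site.test/", site.get, robots_txt=robots_txt)
print(sorted(pages))
```

In this toy run the `/orphan` page is never captured because no link leads to it, and `/private/secret` is skipped because of the robots.txt rule, which matches the two exclusions described above.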
legendary
Activity: 3654
Merit: 8909
https://bpip.org
July 05, 2018, 02:19:25 PM
#4
archive.org is a non-profit organization that has been around for ~20 years. They archive the web using a crawler (like Google) and also allow manual submissions. I'm not aware of any incidents regarding the reliability of their archives, but that of course doesn't mean much.

archive.is (.fo, .li, .today) has been around for ~5 years and archives web pages on demand only. It's run by some anonymous guy somewhere in Europe, and its reputation has been unblemished so far. There is a lot of info about it in the FAQ and even more in the blog:
http://archive.is/faq
https://blog.archive.is/archive

For example:
Quote
All data is stored on HDFS, textual content is duplicated 3 times among servers in different datacenters and images are duplicated 2 times
member
Activity: 486
Merit: 27
HIRE ME FOR SMALL TASK
July 05, 2018, 12:26:16 PM
#3
Be careful: most of the stubborn old viruses that are really difficult to delete come from archive.org. They usually show up in certain places when you search for torrents. It is better to use a private network and not accept cookies or notifications unless necessary.

What do you want to search on archive? 
jr. member
Activity: 238
Merit: 9
July 05, 2018, 11:59:48 AM
#2
Ahhhh the magic of the old internet services still intact....

>WHOIS ARCHIVE.ORG

> SEARCHING.....

Domain Name:ARCHIVE.ORG
Domain ID: D2445039-LROR
Creation Date: 1995-12-14T05:00:00Z
Updated Date: 2013-03-04T00:20:16Z
Registry Expiry Date: 2022-12-13T05:00:00Z
Sponsoring Registrar:easyDNS Technologies Inc. (R1247-LROR)
Sponsoring Registrar IANA ID: 469
WHOIS Server:
Referral URL:
Domain Status: clientTransferProhibited
Domain Status: clientUpdateProhibited
Registrant ID:tuPYhPbV5VJa0FS2
Registrant Name:Internet Archive
Registrant Organization:Internet Archive
Registrant Street: 300 Funston Avenue
Registrant City:San Francisco
Registrant State/Province:CA
Registrant Postal Code:94118
Registrant Country:US
Registrant Phone:+1.4155616767
Registrant Phone Ext:
Registrant Fax: +1.1231231234
Registrant Fax Ext:
Registrant Email:[email protected]
Admin ID:tu1R9cns9Biz7aUp
Admin Name:Internet Archive
Admin Organization:
Admin Street: 300 Funston Avenue
Admin City:San Francisco
Admin State/Province:CA
Admin Postal Code:94118
Admin Country:US
Admin Phone:+1.4155616767
Admin Phone Ext:
Admin Fax:
Admin Fax Ext:
Admin Email:[email protected]
Tech ID:tu1R9cns9Biz7aUp
Tech Name:Internet Archive
Tech Organization:
Tech Street: 300 Funston Avenue
Tech City:San Francisco
Tech State/Province:CA
Tech Postal Code:94118
Tech Country:US
Tech Phone:+1.4155616767
Tech Phone Ext:
Tech Fax:
Tech Fax Ext:
Tech Email:[email protected]
Name Server:NS1.ARCHIVE.ORG
Name Server:NS2.ARCHIVE.ORG
Name Server:SFBA.SNS-PB.ISC.ORG
Name Server:ORD.SNS-PB.ISC.ORG
Name Server:AMS.SNS-PB.ISC.ORG
Name Server:NS3.ARCHIVE.ORG
Name Server:
Name Server:
Name Server:
Name Server:
Name Server:
Name Server:
Name Server:
DNSSEC:Unsigned
Ucy
sr. member
Activity: 2674
Merit: 403
Compare rates on different exchanges & swap.
July 05, 2018, 04:49:50 AM
#1
I depend on archive sites so much that I sometimes fear they may be hacked or have their contents changed.
* How exactly do they work?

* Where are the archives stored?
 IPFS? Blockchain?

* Why are people confident in the sites?

* Who owns the sites?

I use both https://archive.fo/ and https://archive.org/web/