What does HackerNews think of wayback-machine-downloader?

Download an entire website from the Wayback Machine.

Language: Ruby

Came here to say this. My guess is that Amazon paid them to go away. If my guess is accurate (I could certainly be wrong), then Amazon could have them add a robots.txt banning archive.org. If they do that, access to the archive will be removed. Mirror it now if you want the content.

One nice way to do so (handy for any site that you think may vanish from the Wayback Machine): https://github.com/hartator/wayback-machine-downloader
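
For anyone curious what mirroring involves under the hood, here is a minimal sketch in Ruby. It assumes the Wayback Machine's public CDX index (for listing captures) and the `id_` URL form (for fetching a capture without the archive's rewritten links). The helper names `cdx_query_url` and `snapshot_url` are hypothetical and only for illustration; this is not necessarily how wayback-machine-downloader does it internally.

```ruby
require "uri"

# Hypothetical helpers for illustration; not part of wayback-machine-downloader.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

# Build a CDX query URL that lists captures for every path under a domain.
def cdx_query_url(domain)
  params = {
    "url"      => "#{domain}/*",        # match everything under the domain
    "output"   => "json",               # JSON rows instead of plain text
    "fl"       => "timestamp,original", # only the fields needed to fetch a capture
    "filter"   => "statuscode:200",     # skip redirects and errors
    "collapse" => "digest"              # drop adjacent captures with identical content
  }
  "#{CDX_ENDPOINT}?#{URI.encode_www_form(params)}"
end

# Build the URL of one archived capture. The "id_" suffix asks for the
# original bytes, without the Wayback toolbar or rewritten links.
def snapshot_url(timestamp, original_url)
  "https://web.archive.org/web/#{timestamp}id_/#{original_url}"
end
```

A mirror script would request `cdx_query_url("example.com")`, then fetch `snapshot_url(timestamp, original)` for each row and write the body to disk.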

There are a few out there ready to go. Here's one:

https://github.com/hartator/wayback-machine-downloader

You just dump and sync to S3 and use Terraform to provision the Route 53 and bucket setups.
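
The "dump and sync" step could look roughly like the Ruby sketch below. It assumes the AWS CLI is installed and configured, that the downloader wrote its output to its default `./websites/<domain>/` directory, and that the bucket already exists. The helper name `s3_sync_command` and the bucket name in the usage note are hypothetical.

```ruby
# Hypothetical helper: build the `aws s3 sync` invocation for one mirrored site.
# Returning an argument array (rather than a string) avoids shell quoting issues.
def s3_sync_command(domain, bucket)
  local_dir = File.join("websites", domain) # assumed downloader output directory
  ["aws", "s3", "sync", local_dir, "s3://#{bucket}"]
end
```

To run it, pass the array to `system`, for example `system(*s3_sync_command("example.com", "my-example-bucket"))`. The Route 53 records and the bucket itself would still be provisioned separately, as the comment suggests.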

Yes, they are mostly content sites. The hardest part is filtering out adult domains, assuming you don't want them. A staggering number of adult domains expire every year and get huge traffic.

They should be able to use https://github.com/hartator/wayback-machine-downloader and get at least a static version of the website back online.

Really cool, congrats!

I have built something similar, but to retrieve a backup for one of my dead websites. It was a fun project.

Shameless plug: https://github.com/hartator/wayback-machine-downloader/

  wayback_machine_downloader www.c2.com -c 20

Ref: https://github.com/hartator/wayback-machine-downloader
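
The `-c 20` in the command above sets the downloader's concurrency, i.e. how many files it fetches at a time. The general idea can be sketched in Ruby as a small worker pool. This is an illustration under assumptions, not the tool's actual implementation: `download_concurrently` is a hypothetical name, and the fetch step is passed in as a block so the sketch runs without network access.

```ruby
# Hypothetical worker-pool sketch: process `urls` with at most `concurrency`
# threads, calling the given block to fetch each one.
def download_concurrently(urls, concurrency, &fetch)
  queue = Queue.new
  urls.each { |url| queue << url }
  queue.close # once drained, pop returns nil instead of blocking

  results = {}
  mutex = Mutex.new

  workers = Array.new(concurrency) do
    Thread.new do
      # Each worker pulls URLs until the closed queue is empty.
      while (url = queue.pop)
        body = fetch.call(url)
        mutex.synchronize { results[url] = body }
      end
    end
  end

  workers.each(&:join)
  results
end
```

In real use the block would perform the HTTP request and write the file to disk; a higher concurrency finishes sooner but puts more load on archive.org.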

Download an entire website from the Internet Archive Wayback Machine.

https://github.com/hartator/wayback-machine-downloader