How to download an entire website from the Internet Archive Wayback Machine
#4
(10-07-2018, 07:49 PM)youssefbasha Wrote: Downloads what? Only the index page or all the pages? And can I download the MySQL database with it?

No, you can only download HTML pages.  What you get is essentially an HTML snapshot of every archived page that a guest could view.  It makes sense that it can't download a database.  If it could, the site owners would obviously have been in an uproar, since anyone would be able to steal the content of the Forum.

Also, not all of the pages are included.  Only the top layers of links are.  So if it's a Forum, you'll notice that pages start to go missing as you click deeper.  I'd say you get maybe two, or at most three, layers.
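To picture what "two or three layers" means, here is a minimal sketch of a depth-limited crawl over a made-up link graph.  The page names and the `reachable_within` helper are hypothetical, and this is only an illustration of the idea, not how the Wayback Machine's crawler actually works.

```python
from collections import deque


def reachable_within(links, start, max_depth):
    """Return {page: depth} for every page reachable from `start`
    by following at most `max_depth` links (breadth-first)."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        depth = seen[page]
        if depth == max_depth:
            continue  # do not follow links any deeper than the limit
        for target in links.get(page, []):
            if target not in seen:
                seen[target] = depth + 1
                queue.append(target)
    return seen


# Hypothetical Forum layout: index -> forum -> thread -> page 2 of the thread.
forum_links = {
    "index": ["forum-a"],
    "forum-a": ["thread-1"],
    "thread-1": ["thread-1-page-2"],
}

captured = reachable_within(forum_links, "index", max_depth=2)
print(sorted(captured))  # "thread-1-page-2" is three clicks deep, so it is missing
```

With a limit of two layers, the second page of the thread never gets captured, which matches what you see when you click deeper into an archived Forum.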

It also doesn't capture restricted pages.  Only the parts of a Forum that can be viewed as a guest are included, and nothing more than that.

(10-08-2018, 05:05 AM)Abinash Wrote: Never heard of this, and if it actually does the thing like the title tells then it's seriously something else lol.
Exactly what do you mean by this?  Did you read the tutorial?  There was no claim that it can capture a MySQL database.  In fact, if you had read the last paragraph of my tutorial, you'd have noticed that I said it can't extract a database.

deanhills Wrote: I'm happy with the outcome so far, but I haven't taken it to its conclusion yet.  My download project is very tricky in that it is a Forum instead of a static Website.  I think for uncomplicated static Websites this will work fine.  Not sure about Forums and Blogs though.  There's an issue with time stamps and the way the Forums and Blogs have been archived.  And of course there's no database.  The layers of .html pages don't go that deep.  Hopefully I'll be able to report back about this at a later stage.  I'm hoping to get a snapshot of the Forum on X date.  It will be interesting to see what appears.
I doubt the Wayback Machine was ever intended to copy Websites in detail.  It only "snapshots" them, creating a superficial HTML representation of each one.

If you do want detailed backups of Forums from the Wayback Machine, I'm sure it could be arranged on a paid basis by contacting the people there.

But this was not what this tutorial was about.  This tutorial contains simple, abbreviated steps for installing the wayback downloader to download pages from the Internet Archive.  The script is available in different forms from quite a number of authors; you can check GitHub or Google "wayback downloader".  If you check the reviews of the scripts, they all agree that you don't get a consistent result: you don't get all of the pages, and two downloads of the same site don't come out the same.  But for simple Websites, you do get a snapshot view of what the Website looked like.
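If you'd like to see which pages were actually archived before running a downloader, the Wayback Machine has a public CDX search endpoint that lists captures.  Below is a rough sketch in Python (standard library only) that builds such a query and turns the result rows into snapshot links.  The function names and the example domain are my own for illustration, and I'm not claiming this is how any particular downloader script does it.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain, before=None):
    """Build a CDX query URL listing captured pages under `domain`.

    `before` is an optional upper date bound such as "20181007"
    (it is passed as the CDX `to` parameter).
    """
    params = [
        ("url", domain + "/*"),          # every archived URL under the domain
        ("output", "json"),              # JSON rows; the first row is a header
        ("fl", "timestamp,original"),    # only the fields needed here
        ("filter", "statuscode:200"),    # skip redirects and error captures
        ("collapse", "urlkey"),          # keep a single capture per URL
    ]
    if before:
        params.append(("to", before))
    return CDX_ENDPOINT + "?" + urlencode(params)


def snapshot_urls(rows):
    """Turn CDX JSON rows into Wayback links to the raw archived pages."""
    if not rows:
        return []  # no captures were found
    header, *captures = rows
    ts = header.index("timestamp")
    orig = header.index("original")
    # The "id_" suffix asks for the page as archived, without the Wayback toolbar.
    return [f"https://web.archive.org/web/{r[ts]}id_/{r[orig]}" for r in captures]


query = build_cdx_query("example.com", before="20181007")
print(query)
```

To fetch the list, open the printed URL in a browser or load it with `urllib.request.urlopen` and `json.load`, then pass the result to `snapshot_urls`.  Comparing that list against what a downloader saved is a quick way to see which pages were never archived in the first place.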


