
Web page archiver

Ruby gem that saves a web page as one single file


What is it?

Web page archiver is a gem for creating web page archives: single files that contain the images, JavaScript, CSS, and the HTML itself. You could of course zip these files, but browsers have hardly any support for viewing a zipped page without the user extracting it first. This gem offers two alternatives: MHTML, or HTML with no external references. MHTML (MIME HTML) is the default archive format for Internet Explorer and Opera, and both browsers can also read it. HTML with no external references is not written by any browser as a standard archive format (that I know of), but it can be read by any browser that supports the data URI scheme (all modern browsers, although size limits apply in IE8).
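To illustrate what "HTML with no external references" means, here is a minimal sketch of the data URI scheme itself. It is not the gem's implementation: `to_data_uri` is a hypothetical helper, and the stylesheet is made-up sample content. The idea is that the referenced file is Base64-encoded and placed directly in the attribute that would otherwise hold its URL.

```ruby
# Hypothetical helper (not part of the gem): encodes raw bytes as a data URI.
# pack("m0") produces Base64 without line breaks.
def to_data_uri(bytes, mime_type)
  "data:#{mime_type};base64,#{[bytes].pack('m0')}"
end

# Made-up sample stylesheet, standing in for a file the page links to.
css = "body { color: red; }"
data_uri = to_data_uri(css, "text/css")

# The stylesheet now travels inside the HTML instead of as a separate file.
puts %(<link rel="stylesheet" href="#{data_uri}">)
```

Images and scripts can be inlined the same way by changing the MIME type (for example `image/png`).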

How does it work?

Simple:

uri = "http://murb.github.com/web-page-archiver/static/"
result = WebPageArchiver::MhtmlGenerator.generate(uri)

or

result = WebPageArchiver::DataUriHtmlGenerator.generate(uri)
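A likely next step is to write the archive to disk. The sketch below assumes that `result` is a String holding the complete archive, which the examples above do not state explicitly. `save_archive` is a hypothetical helper, not part of the gem, and the sample string stands in for a real generator result.

```ruby
require "tmpdir"

# Hypothetical helper (not part of the gem): writes archive content to a file.
# Binary mode avoids newline or encoding translation of the archive bytes.
def save_archive(content, path)
  File.binwrite(path, content.to_s)
end

# Stand-in for the value returned by one of the generate calls (assumed to be a String).
result = "<html><body>archived page</body></html>"

path = File.join(Dir.tmpdir, "archived-page.html")
save_archive(result, path)
```

For an MHTML archive, a `.mht` or `.mhtml` file extension is the usual choice.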

References