
unmht - Unpack MIME HTML archives

unmht is an unpacker for MIME HTML (.mht / .mhtml) archives. I wrote it because mht-rip did not work for me, because mht-rip and mhtconv seemed to have been inactive for some time, and because Perl makes programming this kind of thing ridiculously easy.

unmht saves all files except the primary HTML file to a subdirectory and rewrites HTML links to point to the saved files, so that the page can be viewed offline.

Get unmht: Download

unmht has the following Perl module dependencies: HTML::PullParser, HTML::Tagset, MIME::Base64, MIME::QuotedPrint and Getopt::Long. After checking you have these (try perldoc HTML::PullParser etc.), put unmht somewhere in your path and extract the manual page with pod2man unmht > /usr/local/man/man1/unmht.1 (or similar).
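As an alternative to running perldoc for each module, the check can be scripted. The following is a minimal sketch, not part of unmht itself; the helper name missing_modules is made up for illustration. It tries to load each module from the dependency list above and reports the ones that fail.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Modules named in unmht's dependency list.
my @modules = qw(HTML::PullParser HTML::Tagset MIME::Base64
                 MIME::QuotedPrint Getopt::Long);

# Hypothetical helper: return the names of the modules that cannot be loaded.
sub missing_modules {
    my @missing;
    for my $module (@_) {
        # Turn Foo::Bar into Foo/Bar.pm so that require accepts it as a string.
        (my $file = "$module.pm") =~ s{::}{/}g;
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

my @missing = missing_modules(@modules);
print @missing ? "Missing: @missing\n" : "All dependencies found.\n";
```

Any module reported as missing would have to be installed (for example from CPAN or your distribution's packages) before unmht can run.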


unmht's manual page

NAME

unmht - Unpack a MIME HTML archive

SYNOPSIS

unmht unpacks MIME HTML archives that some browsers (such as Opera) save by default. The file extensions of such archives are .mht or .mhtml.

The first HTML file in the archive is taken to be the primary web page; the other files it contains are treated as "page requisites" such as images or frames. The primary web page is written to the output directory (the current directory by default), and the requisites to a subdirectory named after the primary HTML file name without its extension, with "_files" appended. Link URLs in all HTML files that refer to requisites are rewritten to point to the saved files.
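The naming rule for the requisites subdirectory can be illustrated with a few lines of Perl. This is a sketch of the rule as described here, not code taken from unmht; the helper name requisites_dir and the example file names are assumptions.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: derive the name of the requisites subdirectory
# from the name of the primary HTML file.
sub requisites_dir {
    my ($primary) = @_;
    # Strip any leading directories and the last extension (".html", ".htm", ...).
    my ($name) = fileparse($primary, qr/\.[^.]*/);
    return $name . "_files";
}

# Example file names, chosen for illustration only.
print requisites_dir($_), "\n" for ("index.html", "saved/page.htm");
```

Under this rule a primary file index.html would have its images and frames stored in index_files, next to it in the output directory.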

OPTIONS

-h, -?, --help

Print a brief usage summary.

-l, --list

List archive contents instead of unpacking. Four columns are output: file name, MIME type, size and URL. Unavailable entries are replaced by "(?)".

-o directory/ or name, --output directory/ or name

If the argument ends in a slash or is an existing directory, unpack to that directory instead of the current directory. Otherwise the argument is taken as the path of the file to write the primary HTML file to. If the output directory does not exist, it is created.
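The way the -o argument is interpreted can be sketched as follows. This only mirrors the rule stated above and is not unmht's actual code; the helper name resolve_output and the example paths are assumptions. The sketch does not create any directories; in a real script, make_path from File::Path could be used for that.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: split an --output argument into (directory, file name).
# The file name is undef when the argument names a directory.
sub resolve_output {
    my ($arg) = @_;
    # Trailing slash or existing directory: unpack into that directory.
    return ($arg, undef) if $arg =~ m{/$} || -d $arg;
    # Otherwise the argument is the path of the primary HTML file.
    my ($name, $dir) = fileparse($arg);
    return ($dir, $name);
}

# Example arguments, chosen for illustration only.
for my $arg ("saved/", "saved/page.html") {
    my ($dir, $file) = resolve_output($arg);
    printf "%-16s directory=%s file=%s\n", $arg, $dir,
        defined $file ? $file : "(named after the archive's primary page)";
}
```

Note that a path to a directory that does not exist yet has to be written with a trailing slash; without it, the last component is taken as the file name.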

SEE ALSO

http://www.volkerschatz.com/unix/uware/unmht.html

http://www.loganowen.com/mht-rip/

http://sourceforge.net/projects/mhtconv/

COPYLEFT

unmht is Copyright (c) 2012 Volker Schatz. It may be copied and/or modified under the same terms as Perl.

