14 July 2010

Blender 2.4 Documentation in CHM

This is my first attempt at creating a CHM version of Blender's wiki pages. I used HTTrack with some deliberate scan rules (filters) to include only the Manual, Reference, Books and Tutorials in English. Afterwards I used some simple regexes to trim and strip the unnecessary parts.

However, I wanted to keep the wiki's navigation to make up for the poor bookmarks, so several more regexes had to be applied. Once it was complete, I realized it was too big: the images alone took up 180MB.
I've managed to optimize the image files to almost half of their original size. Now it's 100MB of CHM (with images):
PNG: first, files > 50KB were converted to 8-bit depth, then all PNGs were optimized with optipng
JPG: optimized with jpegoptim at 75%
GIF: optimized with gifsicle
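
The three image steps above can be sketched as a small Python helper that decides which commands to run for each file. This is only a rough reconstruction, not the exact commands I ran: using pngquant for the 8-bit conversion is an assumption (the tool is not named above), reading "75%" as jpegoptim's maximum quality is an assumption, and so is the -O2 level for gifsicle. The function names are made up for illustration.

```python
import subprocess
from pathlib import Path

PNG_8BIT_THRESHOLD = 50 * 1024  # "files > 50KB" from the list above


def plan_commands(path, size_bytes):
    """Return the command lines (argv lists) to run for one image file."""
    ext = Path(path).suffix.lower()
    if ext == ".png":
        cmds = []
        if size_bytes > PNG_8BIT_THRESHOLD:
            # Assumption: pngquant reduces the PNG to an 8-bit palette in place.
            cmds.append(["pngquant", "--force", "--ext", ".png", "256", path])
        # Every PNG gets a lossless optipng pass afterwards.
        cmds.append(["optipng", path])
        return cmds
    if ext in (".jpg", ".jpeg"):
        # Assumption: "75%" means a maximum JPEG quality of 75.
        return [["jpegoptim", "--max=75", path]]
    if ext == ".gif":
        # Assumption: optimization level 2, rewriting the file in place.
        return [["gifsicle", "--batch", "-O2", path]]
    return []  # not an image we care about


def optimize_tree(root):
    """Walk a mirrored site and run the planned commands on every image."""
    for p in Path(root).rglob("*"):
        if p.is_file():
            for cmd in plan_commands(str(p), p.stat().st_size):
                subprocess.run(cmd, check=True)
```

Calling optimize_tree on the HTTrack output folder would need all four tools installed and on the PATH. The lossy steps (palette reduction and JPEG quality) are the ones that actually shrink the files the most, so try them on a copy first.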

Here is how I fetched it from Blender's wiki using HTTrack.
At first I didn't know what the site map looked like, so I started without any filter and downloaded only the HTML pages. After some hours HTTrack finished with a very messy result. I browsed it for quite a while to work out which parts were important, which should be included, and so on. Here are the scan rules for the English documentation:

+*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar -*title=* -*BlenderWiki:* -*Category:* -*Org:* -*Meta:* -*Talk:* -*User:* -*-Flag-* -*Special:* -*File:* -*:2.5/* -*action=* -*section=* -*stylish.swf* -*Attic:* -*Robotics:* -*Dev:* -*index.php/BlenderDev* -*Template:* -*/invalid_files/* -*Help:* -*index.php/AR/* -*:AR/* -*index.php/BG/* -*:BG/* -*index.php/BR/* -*:BR/* -*index.php/CA/* -*:CA/* -*index.php/CZ/* -*:CZ/* -*index.php/DE/* -*:DE/* -*index.php/DK/* -*:DK/* -*index.php/EL/* -*:EL/* -*index.php/ES/* -*:ES/* -*index.php/FA/* -*:FA/* -*index.php/FI/* -*:FI/* -*index.php/FR/* -*:FR/* -*index.php/ID/* -*:ID/* -*index.php/IT/* -*:IT/* -*index.php/KO/* -*:KO/* -*index.php/MN/* -*:MN/* -*index.php/NL/* -*:NL/* -*index.php/PL/* -*:PL/* -*index.php/PT/* -*:PT/* -*index.php/RO/* -*:RO/* -*index.php/RU/* -*:RU/* -*index.php/SV/* -*:SV/* -*index.php/TH/* -*:TH/* -*index.php/TR/* -*:TR/* -*index.php/ZH/* -*:ZH/*

The language filter could be much simpler if HTTrack supported regex...
After fetching the wiki a second time, I had a relatively clean dump.
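
Since the language exclusions in the scan rules are just the same two patterns repeated for each language code, they don't have to be typed by hand. Here is a minimal Python sketch that generates them, plus the single regex that would do the same job if HTTrack accepted one. The function names and the example URLs in the comments are only illustrations, not anything HTTrack provides.

```python
import re

# The 25 language codes excluded in the scan rules above.
LANGS = ["AR", "BG", "BR", "CA", "CZ", "DE", "DK", "EL", "ES", "FA",
         "FI", "FR", "ID", "IT", "KO", "MN", "NL", "PL", "PT", "RO",
         "RU", "SV", "TH", "TR", "ZH"]


def language_rules(codes):
    """Build the '-*index.php/XX/* -*:XX/*' pair for every language code."""
    return " ".join(f"-*index.php/{c}/* -*:{c}/*" for c in codes)


# What the whole language filter would collapse to as one regex:
# a language code directly after 'index.php/' or ':' and followed by '/'.
LANG_RE = re.compile(r"(?:index\.php/|:)(?:" + "|".join(LANGS) + r")/")


def is_translated(url):
    """True if the URL would be excluded by the language rules."""
    return LANG_RE.search(url) is not None


if __name__ == "__main__":
    # Paste the printed line after the other scan rules.
    print(language_rules(LANGS))
    # Hypothetical URLs, just to show the regex in action:
    print(is_translated("http://example.org/index.php/Doc:FR/Manual"))
    print(is_translated("http://example.org/index.php/Doc:Manual"))
```

Adding or dropping a language is then a one-word change in LANGS instead of editing two wildcard rules.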

Here are the files (dedicated to all users who CAN'T afford an internet connection; somebody please mirror them, but only with direct links, not rapidshit, ziddumb or other crap):


Blender249Man.chm Contains the manual in English, 30MB (this is the main file and it links to the 4 files below)
Blender249Books.chm Books converted to HTML, 32MB
Blender249Tuts.chm Tutorials and Theory, 31MB
Blender249API.chm Python API and Game Engine API for Blender 2.49, 1MB
Blender249Ref.chm Contains the Reference, FAQ and Scripts Catalog, 6MB

To decompile, use 7-Zip or run hh -decompile [outputdir] [chmfile]
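
If you want to unpack all five files in one go, a small Python sketch like the one below builds the command lines for both tools. It assumes 7z (the 7-Zip command-line program) or hh.exe is on the PATH; note that hh -decompile takes the output folder before the CHM file. The function name and the per-file output folders are just my choice for the example.

```python
import subprocess
from pathlib import Path

CHM_FILES = [
    "Blender249Man.chm",
    "Blender249Books.chm",
    "Blender249Tuts.chm",
    "Blender249API.chm",
    "Blender249Ref.chm",
]


def extract_commands(chm_file, out_dir):
    """Return the two alternative command lines for unpacking one CHM."""
    return {
        # 7-Zip: 'x' extracts with full paths, -o sets the output folder.
        "7zip": ["7z", "x", chm_file, f"-o{out_dir}"],
        # HTML Help: output folder first, then the CHM file (Windows only).
        "hh": ["hh.exe", "-decompile", out_dir, chm_file],
    }


def extract_all(tool="7zip"):
    """Unpack every CHM into a folder named after the file (without .chm)."""
    for chm in CHM_FILES:
        out_dir = Path(chm).stem
        subprocess.run(extract_commands(chm, out_dir)[tool], check=True)
```

Run extract_all() from the folder that holds the downloaded files, or extract_all("hh") on Windows if you don't have 7-Zip.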

Blender249Doc.chm (July 13, 2010) The whole thing as a very basic, text-only version, 6.8MB

