Converting CHM ebooks to PDF for eSlick
CHM stands for Compiled HTML Help. It’s a precursor to Microsoft’s LIT format, and there are still computer books floating around with electronic copies in CHM format on the accompanying CD.
CHM is easy to read on the desktop but not as easy to read on devices like phones or ebook readers. It’s a very good candidate for conversion because of its Table of Contents feature: when converting a CHM file to PDF using HTMLDOC, you can preserve links and generate chapter bookmarks.
HTMLDOC has open source and commercial components. The open source version isn’t easy to use since it comes as source only. If you’re running Linux, you can use chm2pdf, which works wonderfully for ebooks and websites. There’s no Windows version yet, but since all of the source is open I imagine there’ll be a port soon.
Converting on Linux
On Kubuntu my installation was as simple as: apt-get install chm2pdf
From there, I discovered that the script doesn’t deal with spaces in file names very well. It didn’t work when I enclosed them in quotation marks or when I escaped the spaces with \. As a stopgap I just renamed the file to have no spaces.
From there: chm2pdf --book ebook.chm
That was enough to generate a PDF with hyperlinks that looks like it’ll reflow easily. I don’t have my eSlick yet so I can’t test, but I will revisit this.
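The rename stopgap can be scripted. Here is a minimal Python sketch, assuming chm2pdf is installed and on the PATH. The function names space_free_name and convert_chm are made up for this example, and it assumes (untested) that chm2pdf writes the PDF into the directory it is run from.

```python
import re
import shutil
import subprocess
import tempfile
from pathlib import Path


def space_free_name(filename: str) -> str:
    """Replace each run of whitespace in a file name with one underscore."""
    return re.sub(r"\s+", "_", filename.strip())


def convert_chm(chm_path: str) -> Path:
    """Copy the CHM to a space-free name and run `chm2pdf --book` on the copy.

    Returns the temporary working directory. Assumption: chm2pdf leaves
    the generated PDF in the directory it was started from.
    """
    source = Path(chm_path)
    workdir = Path(tempfile.mkdtemp(prefix="chm2pdf_"))
    copy = workdir / space_free_name(source.name)
    shutil.copy(source, copy)
    # Same invocation as in the post, run on the renamed copy.
    subprocess.run(["chm2pdf", "--book", copy.name], cwd=workdir, check=True)
    return workdir
```

Calling convert_chm("My Linux Book.chm") would leave the original file untouched and work on a copy named My_Linux_Book.chm.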
If you want to change the size you can use the --size parameter. You can use letter, a4, legal, or specify measurements and margins.
Converting on Windows
I haven’t been able to find a package as straightforward as chm2pdf on Windows, but I was able to get similar results by using GridinSoft’s CHM Decoder and the last freeware version of HTMLDOC. Essentially it’s a manual way of doing the same thing that chm2pdf does.
Here’s what you do to decompile the CHM file.
- First off, open up CHM Decoder.
- In the Open File tab, click the Open button and select the CHM you want to convert.
- Click the Decode tab. Append a directory name to the one it selected. It’ll create a new directory. It doesn’t matter what name you use.
Now comes the tricky part. In chm2pdf, it keeps track of the ordering of the HTML files. You lose the ordering here and have to reconstruct it by hand. The difficulty here varies from book to book. Keep the CHM file open in one window so you can peek at the Table of Contents.
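If the decompiled directory contains the CHM’s contents file (an .hhc file, which is an HTML sitemap whose entries carry a "Local" param pointing at each page), the ordering can be read from it instead of from the viewer. This is only a sketch: it assumes CHM Decoder extracts the .hhc file alongside the HTML, and the names HhcOrder and reading_order are made up for this example.

```python
from html.parser import HTMLParser


class HhcOrder(HTMLParser):
    """Collect the `Local` targets of an .hhc contents file in document order."""

    def __init__(self):
        super().__init__()
        self.files = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, not values.
        if tag != "param":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "local":
            return
        # Drop any #anchor so each HTML file is listed only once.
        target = (attr.get("value") or "").split("#", 1)[0]
        if target and target not in self.files:
            self.files.append(target)


def reading_order(hhc_text: str) -> list:
    """Return the HTML files named in the contents file, first mention first."""
    parser = HhcOrder()
    parser.feed(hhc_text)
    parser.close()
    return parser.files
```

Feeding it the text of the .hhc file gives a list of file names to add in HTMLDOC in that order. Pages that never appear in the table of contents won’t be listed, so it’s still worth comparing against the directory.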
Now fire up HTMLDOC.
- You’ll be greeted by the Input tab first. Make sure that the Document Type is set to Book.
- Click on the “Add Files…” button and find the table of contents file, usually called toc.html. Add it first.
- Click “Add Files…” again. You’ll need to add the files in order here. Only add HTML files; don’t worry about images, since they’ll get converted along with the pages. See screenshots below. It may be as easy as shift-selecting the entire thing, or you might have to add chapter by chapter.
- Click the Output tab. Select the PDF radio button in Output format. Click the Browse button on Output path, browse to where you want the PDF to go, type in a name and click OK.
- Click on the Page tab. Check that the margins are okay. Generally Universal works since CHM files are usually easy to reflow.
- Click on the PDF tab. Select PDF version 1.4 (Acrobat 5.0). For first page, select TOC.
- Click the Generate button.
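HTMLDOC also has a command-line mode, so the GUI settings above can be expressed as a single command. The sketch below only builds the argument list. It assumes an htmldoc executable is on the PATH and that your build accepts these options; htmldoc_command is a name made up for this example.

```python
import subprocess


def htmldoc_command(html_files, output_pdf):
    """Build an htmldoc command line mirroring the GUI settings above."""
    return [
        "htmldoc",
        "--book",               # Document Type: Book
        "--format", "pdf14",    # PDF version 1.4 (Acrobat 5.0)
        "--size", "universal",  # Page size: Universal
        "--firstpage", "toc",   # First page: TOC
        "-f", output_pdf,       # Output path
        *html_files,            # input files, in reading order
    ]


def generate_pdf(html_files, output_pdf):
    """Run htmldoc with the options above; raises if it exits with an error."""
    subprocess.run(htmldoc_command(html_files, output_pdf), check=True)
```

For example, generate_pdf(["toc.html", "ch01.html", "ch02.html"], "ebook.pdf") would pass the files in the given order, the same way you add them in the Input tab.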
Check the generated PDF. HTMLDOC will output the table of contents with hyperlinks, so even if the files are out of order the result is great for reference books. If you’re reading straight through a manual, though, you may want to ensure that the ordering is correct.