EPub from Word: A Third Option

If you recall “EPub, First Attempt” (three whole days ago), I tried two free options for creating an ePub ebook file from a fully formatted book in Word: either saving it as PDF and converting it via Calibre, or saving it as Word’s “filtered HTML” and converting it via Calibre.

I wasn’t thrilled with either method.

  • The ePub-from-PDF version had great-looking type, but the page headers and footers were included within the stream and there were a number of other oddities, including a useless Contents band.
  • The ePub-from-HTML version (surprisingly, much larger than the ePub-from-PDF version) had a working Contents band and no extraneous page headers and footers, but the onscreen type, while clearly a rendition of the actual type used in the book, was pretty awful.

I can see that a fair number of people have looked at or downloaded the two versions. So far, I’ve had no actual feedback on how they do or don’t work either on ereaders or on ereader simulations.

Meanwhile, I realized that there was a third option: RTF.

  • Here’s an ePub-from-RTF version. Its file size is about halfway between the other two: bigger than the from-PDF version, smaller than the from-HTML version. It clearly makes no attempt at all to provide the original typeface(s). The contents panel is essentially unpopulated and useless. The contents page within the book itself is odd.
  • On the other hand: It looks pretty good…no extraneous footers or headers, and the type looks good (depending on the typeface you choose, since it’s entirely your choice).
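
For anyone who wants to repeat the comparison, here is a minimal Python sketch of the three routes. It assumes Calibre’s ebook-convert command-line tool is installed and on the PATH, and that Word has already saved the book as PDF, filtered HTML, and RTF. The file names (book.pdf, book.htm, book.rtf) and the helper names are hypothetical, and no conversion options are passed, so whatever Calibre does by default applies.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, out_dir):
    """Build the ebook-convert command line for one intermediate file.

    The output name records the route taken, e.g. book-from-rtf.epub.
    """
    source = Path(source)
    route = source.suffix.lower().lstrip(".")  # "pdf", "htm", or "rtf"
    target = Path(out_dir) / f"{source.stem}-from-{route}.epub"
    # ebook-convert picks the formats from the two file extensions.
    return ["ebook-convert", str(source), str(target)]


def convert_all(sources, out_dir):
    """Run each conversion and return {output path: size in bytes}."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    sizes = {}
    for source in sources:
        command = build_convert_command(source, out_dir)
        subprocess.run(command, check=True)
        output = Path(command[-1])
        sizes[str(output)] = output.stat().st_size
    return sizes


if __name__ == "__main__":
    # Hypothetical file names: whatever Word's "Save As" produced.
    candidates = [Path("book.pdf"), Path("book.htm"), Path("book.rtf")]
    available = [p for p in candidates if p.exists()]
    if shutil.which("ebook-convert") and available:
        for name, size in convert_all(available, "epub-out").items():
            print(f"{name}: {size} bytes")
    else:
        print("ebook-convert or the input files were not found; nothing converted.")
```

The printed sizes make it easy to see how the three files compare; the quality of the type and the contents still has to be judged by opening each file in a reader.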

Whadda you think?

3 Responses to “EPub from Word: A Third Option”

  1. John says:

    While not quite what you were looking for, I used Calibre to convert your epubs to my Kindle format.

    Table of contents. Fine in the HTML version. Corrupted in the RTF version (odd “graph-definition>” inserts). Odd wrapping in the PDF version.

    Bullets. Only RTF showed normal bullets, but wrapped a bit off. Other versions used a different symbol, both quite readable.

    Cover missing in the PDF version.

    Main text seemed quite readable in all cases, though I only clicked through about 15-20 pages/locations.

    Hope that is of some use to you.

  2. John says:

    Happy to send you the Kindle (mobi) versions if you are interested.

  3. walt says:

    John: Sending me the Kindle/mobi versions probably wouldn’t help. What you described is pretty much what I saw in Calibre’s own reader–except that the main text wasn’t that readable in the HTML version. (Best guess: The typefaces aren’t embedded, and Calibre’s e-reader view doesn’t use partial-pixel enhancement…and I have Berkeley Book, but you don’t, so in your case the HTML version uses something that’s more readable. That may be more complicated than necessary.)

    The contents-page corruption is exactly what I saw in the RTF-to-ePub version.

    The bullet symbol used in the book is a chevron, so it’s not surprising that it sometimes converts to something other than a bullet.