EPub, First Attempt

So…being a sometimes-advocate of open and all that, and since Lulu now supports ePub, The Standard Ebook Format…

I thought I’d see whether using it makes any sense for the huge (513pp. 6×9, 191K words) collection of OA articles that may or may not emerge as Open Access and Libraries: Essays from Cites & Insights, 2001-2009.

The project itself is on the back burner for a few weeks while I see whether one possible way of getting an index pans out. Meanwhile, I could see what generating an ePub version was like.

The tools

Checking online and asking around, the only software I could find that matches the probable income from the ePub version–that is, $0–was Calibre, which is really an ebook organization (and viewing) program but also includes routines to convert to ePub from various input formats, including PDF and HTML.

The conversion routine is interesting, because it wants to know what reader the output will be used on. (There’s “default,” which may or may not be Kindle, but also a bunch of individual choices.)

  • I had this silly idea that ePub is a device-independent standard. If that’s true, then I don’t get the question.
  • More specifically, if I do an ePub version, it will most certainly be intended to be device-independent.
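
For what it's worth, the same question comes up if the conversion is scripted rather than run from the dialog. Here is a minimal sketch, assuming Calibre's command-line tool `ebook-convert` is installed and that its `--output-profile` option corresponds to the reader choice in the dialog. The file names are hypothetical, and I haven't checked how much the profile actually changes the resulting ePub.

```python
import shutil
import subprocess


def build_convert_command(source, target, profile="default"):
    """Return the argument list for one Calibre conversion.

    Assumes Calibre's `ebook-convert` tool; `profile` is passed to its
    --output-profile option, with "default" meaning no particular reader.
    """
    return ["ebook-convert", source, target, "--output-profile", profile]


def convert(source, target, profile="default"):
    """Run the conversion, or raise if ebook-convert is not on the PATH."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed?")
    subprocess.run(build_convert_command(source, target, profile), check=True)


# Hypothetical file names, for illustration only:
# convert("oa-essays.htm", "oa-essays.epub")
print(" ".join(build_convert_command("oa-essays.htm", "oa-essays.epub")))
```

Sticking with the "default" profile is the closest thing to asking for a device-independent file, at least as far as I can tell.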

The trials

I decided to try this two ways, in both cases starting with a Word document that’s designed as a 6×9 book with good margins, using Berkeley Oldstyle Book for body text and Friz Quadrata for major headings, with “typical” page headers and footers (centered page # on first page of chapter, page # and book name in italics on other even-numbered pages, chapter name in italics and page # on other odd-numbered pages).

The PDF used for input was prepared using “Save as PDF,” which yields bookmarks and is really great for use on a PDF-supporting viewer. (Unfortunately, it appears to carry a phantom “Arial” that’s not embedded, which means it may not be possible to upload it to Lulu–which requires that all typefaces be embedded. If I “print to PDF” instead, I can set the PDF properties to embed everything, even Arial, but you don’t get bookmarks in that case. Irrelevant for a printed book, relevant for a PDF-download version.)

The HTML was prepared using Word “Save as filtered HTML,” which is the advice given by another service that does ePub conversion (but only to make the ePubs available through that service…not what I need).

  • PDF-to-ePub results (as opened in Calibre’s ebook viewer): The type looks great. There’s an optional contents band, but it doesn’t really work. Ebook page breaks are peculiar, and text breaks even more so. The page headers and footers show up in the stream (which becomes something like 1,200 pages from the original 519 including prefatory material).
  • HTML-to-ePub results (as opened in Calibre’s ebook viewer): Uggh… The type looks awful, very nearly unreadable, for reasons that escape me. There are no margins. (I think that’s true with the PDF-to-ePub as well.) On the other hand, the table of contents pane (optional) works just fine–even if there’s an odd pagebreak before the first level-2 heading in each chapter. No extraneous running page headers or footers, and the Friz Quadrata headings are absolutely crisp. The 513-page book turns into 1,800-odd pages (or whatever).
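
One untested guess about the missing margins: Calibre's `ebook-convert` tool has `--margin-left`, `--margin-right`, `--margin-top` and `--margin-bottom` options, measured in points, so asking for wider margins at conversion time might help. The sketch below only builds the command. The 36-point value and the file names are hypothetical, and whether Calibre's viewer honors these margins is an assumption I have not verified.

```python
def build_margin_command(source, target, margin_pts=36.0):
    """Return an ebook-convert argument list that sets all four page margins.

    margin_pts is in points (72 points to the inch); 36.0 is an arbitrary
    illustrative value, not a recommendation.
    """
    cmd = ["ebook-convert", source, target]
    for side in ("left", "right", "top", "bottom"):
        cmd += [f"--margin-{side}", str(margin_pts)]
    return cmd


# Hypothetical file names, for illustration only.
print(" ".join(build_margin_command("oa-essays.htm", "oa-essays.epub")))
```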


At this point, I’d be a good deal more embarrassed to offer either variety of ePub than I already am by the semi-clunky HTML versions of Cites & Insights essays…which have odd margins but at least have clean typography and proper flow.

Maybe I’m missing something.

Update 5/9/10: Remainder of post removed as no longer relevant. Here’s what there is of an ePub version, but I strongly recommend the free PDF or the $17.50 trade paperback at Lulu.

4 Responses to “EPub, First Attempt”

  1. Joe Clark says:

    ePub is XHTML 1.1 plus CSS. CSS is historically your problem. It continues to be so here.

  2. walt says:

    I’ll have to push a little at this, specifically “CSS is historically your problem.” No, it isn’t. At least not if you mean “As a writer, it’s your job to create appropriate CSS.”

    As a writer, I shouldn’t have to be a programmer. (I was a programmer for five decades. Been there, done that, don’t much want to do it any more.) I can buy a highly sophisticated piece of software that handles the “computer stuff” so I can focus on writing–and, as a quondam book designer, on page and publication layout. That’s Word 2007 (or its competitors).

    I can generate a predictably high-quality PDF from a Word document–quickly and easily. No programming required.

    If I can’t generate a similarly high-quality ePub from a Word document, quickly and easily, it’s little help to say “just go in and program your way out of it–after all, it’s just CSS plus XHTML.” Which is saying ePub is an exotic format that you need special skills to create–not precisely the best way to make a so-called standard popular.

    A wholly unsuitable answer is to tell me that, as a writer, I should be creating my own CSS. Sorry, but that won’t fly. The ideal would be an ePub “printer driver” for Word (and OpenOffice and GDocs, I suppose) or an effective .doc-to-.epub converter. (A .docx-to-.epub converter would be better still, but the loss in layout and typefitting quality in backconverting from .docx to .doc, while real, is not huge.)

    Otherwise, in the absence of convincing evidence that ePub will yield substantially more sales (irrelevant for this $0 publication) than PDF–and, when I asked the question for a priced book, the apparent volume of increased sales was zero–there’s no reason to do ePub. Which is bad news for a supposed standard. (Yes, I’ve thought about tech standards a lot–heck, I wrote the book on technical standards in libraries back in the day.)

  3. Steve Lawson says:

    I know nothing about ePub, but why let that stop me?

    CSS isn’t programming, it’s a stylesheet. I’d argue that it’s not “exotic,” given that it’s CSS that is styling this blog–even these words as I type them.

    If it’s CSS, that would imply to me that it would be easy enough to get a style from someone else and customize it for your own purposes, as you have for this blog. I like messing with CSS, so if it’s something I could advise you on, let me know.

    If all you are saying is that it’s currently difficult for writers to produce an ePub document with no fuss, then I believe you. But if you are saying that people who want to produce ebooks that meet a reasonably high standard of design shouldn’t have to learn to tweak some CSS (or find a friend who can for them) I’d say that’s unrealistic, and doesn’t jibe with my experience using blogs.

  4. walt says:

    Steve: Yes and no. I modified the CSS for this blog, true enough. (It’s programming–it’s a set of instructions for the computer–but it’s a different form of programming.) But I *modified* CSS–heck, I didn’t write it from scratch. And I didn’t have to disassemble a combined package to do so, then reassemble it–after all, an .epub file is a single file (which may be a zipped package for all I know, just as .docx is apparently a zipped XML package).

    I’ve already done the design work in Word. To be a robust standard, ePub should be a carrier, just as PDF is. At this point, that’s clearly not the case. I don’t think the blog analogy works except for people who craft their own CSS from scratch…or maybe it does, since I’d guess that a very high percentage of WordPress bloggers (and an even higher percentage of Blogger bloggers) don’t do any direct editing of the scripts that run their blogs, *or need to.*