Kindlegen Bloat: Strip it or Leave it?

If you’ve converted many ePubs with Kindlegen and kept an eye on file sizes, you may well have noticed that the Kindle files are much larger than the original ePubs, typically around twice as large. It turns out that Kindlegen stashes a copy of the original ePub sources inside its output file, which explains the doubling in size. Unfortunately, this can lead to super-sized Kindle files, particularly for heavily illustrated works, since the images get duplicated along with everything else.
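If you’d like to check the doubling on your own files, here is a minimal Python sketch that compares the two sizes. The function names and the file paths in the usage note are my own illustrations, not part of Kindlegen or any other tool.

```python
import os


def bloat_ratio(epub_bytes, kindle_bytes):
    """Return how many times larger the Kindle file is than the ePub."""
    if epub_bytes <= 0:
        raise ValueError("ePub size must be a positive number of bytes")
    return kindle_bytes / epub_bytes


def bloat_ratio_for_files(epub_path, kindle_path):
    """Same ratio, measured from two files on disk (paths are hypothetical)."""
    return bloat_ratio(os.path.getsize(epub_path), os.path.getsize(kindle_path))
```

For example, `bloat_ratio_for_files("mybook.epub", "mybook.mobi")` should come out somewhere near 2 if the Kindle file still carries its embedded copy of the ePub sources.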

To see how to strip out this bloat, head over to the MobileRead forums, in particular to pdurrant’s post about his excellent Kindlestrip Python script and AppleScript wrapper. Note that if you don’t already have Python installed, you’ll need to download and install it before you can use the script.

For many people (me included!) the immediate instinct will be to run the above script on everything, in order to minimize file sizes, transfer/bandwidth requirements, and the space taken up on our customers’ Kindle devices. However … consider why Amazon might have designed their conversion tool to force the inclusion of this source information in its output file; they hardly did so by accident. Are they planning to use the stashed information in future versions of their Kindle hardware and software, so that future Kindles can get closer to the layout of the original ePub? Will future generations of Kindle support ePub directly, and look for ePub files inside Kindle files? If so, then clearly there’s a trade-off involved in deciding whether to strip the file.

Personally, I’d consider heavily illustrated works clear candidates for stripping, since the duplication of images is so costly. And if you’ve designed your ePub to be fully “Kindle-friendly”, then maybe there’s little benefit to leaving the extra information in place, since the basic Kindle version will already be fine. However, for Kindle files that suffer by comparison with their ePub originals, or where the overhead is negligible, you might want to consider leaving the bloat in place, in the hope that it might come in useful one day.
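One rough way to judge whether a book counts as “heavily illustrated” is to see how much of the ePub is taken up by images. An ePub is a zip archive, so Python’s standard `zipfile` module can read it. The sketch below is only an illustration: the function names and the list of image extensions are my own assumptions, and what share counts as “heavy” is for you to decide.

```python
import os
import zipfile

# Assumed set of image types; adjust to match what your ePubs contain.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".svg"}


def image_share_from_sizes(entries):
    """Given (name, size_in_bytes) pairs, return the image fraction (0 to 1)."""
    total = 0
    images = 0
    for name, size in entries:
        total += size
        if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
            images += size
    return images / total if total else 0.0


def image_share(epub_path):
    """Fraction of an ePub's compressed bytes that belong to image files."""
    with zipfile.ZipFile(epub_path) as archive:
        return image_share_from_sizes(
            (info.filename, info.compress_size) for info in archive.infolist()
        )
```

A result of, say, 0.9 from `image_share("mybook.epub")` would mean that roughly nine-tenths of the duplicated source is images, which makes the book a strong candidate for stripping. A low figure suggests the overhead is mostly text and may not be worth worrying about.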