Tell HN: Today I learned Epub is just HTML/CSS

ocdtrekkie · on April 8, 2021

This is why I'm super irritated Microsoft dropped EPUB support in Edge when switching to Chromium, and why it's so frustrating that EPUB support isn't common by default in operating systems: It's literally just HTML/CSS in an opinionated structure. EPUB should be as ubiquitously supported as PDF is today, no browser has an excuse for not supporting it.

arthur2e5 · on April 8, 2021

One of the possible reasons Chromium is bad for EPUB is its lack of MathML support. But then it's not like all EPUB readers have it, and it's not like MathJax can't be used for rendering.

naikrovek · on April 9, 2021

epub supports MathML. are you saying Chrome doesn't support MathML?

rijoja · on April 9, 2021

Sadly yes. There are solutions such as MathJax, however this is obviously not as nice as having it built into a lower level.

detaro · on April 9, 2021

yep. although Igalia is running some efforts to get it added again.

jrimbault · on April 8, 2021

The epub reader in old-Edge was quite nice.

ocdtrekkie · on April 8, 2021

It really was. And just the fact that it was default-installed was a main perk: I could rely on it being on, say, a work PC, where I can't install software for personal use.

Hammershaft · on April 8, 2021

It was the nicest one I've used on any platform. Now I read epubs on Apple Books and I come away a little frustrated every time I use it (especially when I copy and paste).

roryokane · on April 21, 2021

> especially when I copy and paste

You can make copy and paste work normally in Apple Books with the solutions on https://apple.stackexchange.com/q/137047 ‘Don't want iBooks to always paste the “Excerpt From” of what I have copied’.

908B64B197 · on April 9, 2021

The PDF reader too. Wish they repackaged it as a standalone app.

emayljames · on April 8, 2021

Firefox addon: https://addons.mozilla.org/en-GB/firefox/addon/epubreader/

Chrome addon: https://chrome.google.com/webstore/detail/epubreader/jhhclmf...

ocdtrekkie · on April 8, 2021

Addons/extensions are a major security risk. For example,

Firefox addon: Download files and read and modify the browser’s download history, Access browser activity during navigation, Access your data for all web sites

Chrome Web Store obscures the full permission list, but the comments for that extension admit that the "read and change all your data on the websites you visit" right is needed.

It's 2021, you should view all browser addons as a threat.

irrational · on April 8, 2021

If it is that easy, there must be a business/political reason for not supporting it. Does anyone have a clue what that might be?

ocdtrekkie · on April 8, 2021

I think it was probably a combination of low usage (because a lot of people either use Kindle or a DRM-encumbered EPUB platform like Adobe Digital Editions), and the fact that Microsoft likely was more concerned with prioritizing porting over other more critical functionality like Active Directory integration into the Chrome codebase.

Presumably if Google decided to add EPUB support, Edge would get it back too, but Microsoft hasn't decided that feature is valuable enough to add onto their modifications.

anoncow · on April 8, 2021

Low usage is a chicken and egg thing.

ocdtrekkie · on April 8, 2021

I agree, I am just speculating as to why Microsoft did what it did. :)

I also think it's particularly sad that Windows 10 had a perfectly good EPUB app, Reader, that Microsoft deprecated aggressively to force everyone to read eBooks in Edge... only to remove eBook support from Edge too.

rijoja · on April 9, 2021

They don't support MathML which also is a part of the format apparently. I think Google dropped it presumably not to have to worry about securing the code base.

BlueTemplar · on April 8, 2021

Sadly, these days html is drm-encumbered too : https://www.eff.org/deeplinks/2017/09/open-letter-w3c-direct...

chocolatkey · on April 8, 2021

One of the reasons they couldn't just port it over is that it used some trident-specific CSS layouts to display the book pages. I remember extracting the JS/CSS they used and being very confused.

asdff · on April 8, 2021

Probably just as simple as a lack of willingness to put an engineer on the feature. Mozilla killed off the RSS reader in Firefox for similar reasons.

BlueTemplar · on April 8, 2021

Sigh... Well, at least Thunderbird still has it !

BiteCode_dev · on April 8, 2021

On the other hand, you can inline everything into an html page and just ask the browser to open that.

An ebook, unzipped, is rarely bigger than 1 MB, which is smaller than most pages are today.

ASalazarMX · on April 8, 2021

> An ebook, unzipped, is rarely bigger than 1 MB, which is smaller than most pages are today.

Oh no, we can't have that. Here are some "beautiful, performant and lightweight" Electron ePUB readers/organizers: https://www.electronjs.org/apps?q=epub

Turing_Machine · on April 8, 2021

Here's an idea:

Rather than complaining about the Electron apps, write better cross-platform apps that don't use Electron.

I know it's fashionable to dog on Electron, but if it didn't fill a legitimate need, people wouldn't use it.

It's easy to compare an Electron app, that actually exists, with some imaginary native app that doesn't.

It's not so easy to find the budget and personnel to actually build dedicated apps for minority platforms. The choice generally isn't between "bloated Electron app" and "sleek native app". It's between "bloated Electron app" and "no app at all".

ASalazarMX · on April 8, 2021

In the case of eBook readers, there's plenty of native software that fills that role in every platform, there's hardly a need for a cross-platform Electron ePUB reader.

I suspect Electron frequently fills a need for the developers, instead of their users. It's easy to deploy, cross-platform and stable, I give you that, but the users pay for it with RAM, disk, CPU and energy.

A single user might not mean much, but multiply that by the millions of Electron programs installed, that's the scale of lost resources that pay for the advantages of Electron.

Turing_Machine · on April 8, 2021

> there's hardly a need for a cross-platform Electron ePUB reader.

If there's "no need" for them, why are people using them? How come you get to decide what other people "need"?

> I suspect Electron frequently fills a need for the developers, instead of their users.

It fills the need of the users to have actual apps they can install and run, rather than imaginary ones.

> that's the scale of lost resources that pay for the advantages of Electron.

People don't hand-write programs in assembly language any more, either, even though that means that you can no longer write a word processor that runs in 12K of RAM.

ASalazarMX · on April 8, 2021

I see we can't meet in the middle on this issue.

I'm not saying all Electron programs are bad. VSCode, for example, is surprisingly good for many use cases, and being an IDE with many features, its resource usage is pretty justified.

I'm also not buying that people need software that only exists as Electron programs. Check the categories in https://www.electronjs.org/apps, there's even taskbar notifications and app launchers, do those merit running a dedicated browser?

It's not that users need "non-imaginary" software and Electron fills that need, it's that most users don't know about native and web frameworks, and they will install software as long as their computer can run it, even if better alternatives exist right now.

rualca · on April 9, 2021

> In the case of eBook readers, there's plenty of native software that fills that role in every platform, there's hardly a need for a cross-platform Electron ePUB reader.

I see this sort of assertion pop up often, but it never is accompanied by specific verifiable examples of what those options are.

Can you point out a single example of said native software that fills that role in every platform? A single one.

ASalazarMX · on April 12, 2021

I think you misunderstood my response. It's not that the same software works as an eBook reader on every platform, but that nearly every platform has native and high-quality eBook readers. Try and run your Electron application on a Symbian phone or an iPad, for example.

BlueTemplar · on April 8, 2021

What's wrong with Qt (and such) ??

(For instance : VLC, Spyder...)

rualca · on April 9, 2021

> What's wrong with Qt (and such) ??

Compared with simple html+javascript running on a webview, Qt is relatively hard to maintain and develop, relies on source code generations and processors to work, prototyping tools are subpar and undermaintained, has no support for centralized theming, its basic support for component-specific theming is already CSS shoved in a convoluted way, its model/view take is absurd and very poorly engineered, and yeah there's the fact that it forces you to write frontend code in C++. But wait, it's not even C++ because it requires code to be preprocessed to generate boilerplate code.

And let's not pretend that Qt's widgets successor is already a markup+javascript combo that takes the bad parts of javascript and bundles it with the bad parts of a custom markup language.

rijoja · on April 9, 2021

Well developers that can use those are probably more expensive for one.

BlueTemplar · on April 9, 2021

Are they ?

http://www.pyqtgraph.org/

rijoja · on April 11, 2021

I'd bet that there are more people who know web technologies, and if you follow the law of supply and demand I think you can figure out the rest.

What is python qt support like for android and so forth by the way?

BlueTemplar · on April 11, 2021

Looks like it was added with Qt5 (which itself uses Python 3 rather than Qt4's Python 2).

BlueTemplar · on April 8, 2021

For some reason Firefox still doesn't support MIME HTML (.mhtml) even though Thunderbird does ?? (.eml)

https://tools.ietf.org/html/rfc2557

huachimingo · on April 8, 2021

But what could I do when I try to convert some big epubs (like dictionaries, over 6MB) to a PDF?

Calibre seems to use all the memory after some time and then it starts to use swap memory...

salamandersauce · on April 9, 2021

Does pandoc work?

BlueTemplar · on April 8, 2021

Why would you want to do that ?!?

APhoenixRises · on April 8, 2021

My assumption was that it was a branch of functionality that could be dropped to make any maintenance of classic Edge easier. I was really irritated when they dropped support as epub support for Windows has been minimal until recently.

eurasiantiger · on April 9, 2021

Lack of DRM.

ipsum2 · on April 8, 2021

There are a few JS libraries that allow you to have basically the same functionality, for example: https://github.com/futurepress/epubjs-reader/

systemvoltage · on April 8, 2021

Great. Instead of us choosing the simplest solution, i.e. native support in browsers that already have the HTML/CSS engine, we continue to build layers and layers of bloated abstractions, now with Javascript™.

Not the fault of the library developer - they’re just trying to help, but it’s like instead of fixing holes in the ship, we build pumps to dump the water out. Too many pumps on the ship and it gets bloated and can’t take any cargo. This is the current web in a nutshell. We need a ship captain that can guide us authoritatively.

GoblinSlayer · on April 9, 2021

SumatraPDF supports it.

johnchristopher · on April 8, 2021

I don't think epub is just HTML/CSS, there's something in the way it's being processed by readers.

I have tried a lot of browser extension based and standalone readers in the past and they all render things differently with different bugs. Something doesn't add up.

kcartlidge · on April 8, 2021

Structurally it's a renamed ZIP file containing a pre-defined collection of XHTML, XML, NCX, OPF, and HTML/CSS content files.

So it's not quite just HTML/CSS when packaged up, but it is just HTML/CSS when it comes to the actual text content.

Other than getting the various constituent files zipped up, with their interrelated contents synced, the only other oddity is that the renamed ZIP must always start with an uncompressed 'mimetype' file.
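
The 'mimetype' rule above can be checked with a few lines of Python. This is only a sketch using the standard `zipfile` module; the function name `has_valid_mimetype_entry` is made up for illustration, and it assumes the order reported by `infolist()` matches the physical order of entries in the archive, which is the usual case but not something every ZIP writer guarantees.

```python
import zipfile


def has_valid_mimetype_entry(epub_file) -> bool:
    """Return True if the first ZIP entry is an uncompressed 'mimetype' file.

    `epub_file` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(epub_file) as zf:
        entries = zf.infolist()
        if not entries:
            return False
        first = entries[0]
        return (
            first.filename == "mimetype"
            # ZIP_STORED means the entry is stored without compression.
            and first.compress_type == zipfile.ZIP_STORED
            and zf.read(first) == b"application/epub+zip"
        )
```

A file that fails this check may still open in lenient readers, so treat a `False` result as a hint rather than a verdict.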

The IDPF maintain the standard. The easy one is v2 (http://idpf.org/epub/201) but that is now deprecated. Unfortunately v3 allows more interactivity and scripting - and we all know how bad the tech industry is at keeping that kind of stuff secure.

johnchristopher · on April 8, 2021

Now that you mention it I am pretty sure v3 is not entirely backward compatible with v2. Which could explain some oddities on older e-readers or out-of-date software.

capableweb · on April 8, 2021

> I don't think epub is just HTML/CSS

It's XHTML to be precise, and that doesn't change no matter what processing is done by readers, it's still (just) XHTML. The format is literally described in the submission :)

> they all render things differently with different bugs

Just like HTML did in the beginning.

johnchristopher · on April 8, 2021

> > they all render things differently with different bugs

> Just like HTML did in the beginning.

ePub wasn't born yesterday. It's revision 3, first released ~2006/7. XHTML was proposed to correct and prevent the kind of problems html4 had because of how it organically grew, specifically relying on its XML root (no pun intended). Epub should definitely not suffer from bugs like HTML had in the pre-XHTML and pre-HTML5 era.

I sometimes have bugs like:

- whole book is black

- some pages can't be loaded/read so I have to skip them

- some toc and back link don't work like they should (probable bad markup)

There's also some readers oddities:

- completely inconsistent line-height

- aligned setting not working at all

etc.

Anyway, there's a reason XHTML2 didn't happen and we got HTML5 instead. Either ePub has some extensions that are not trivial to implement or most readers are buggy. Or both.

richeyryan · on April 9, 2021

I've worked on an ePub parser and renderer and the issues you're describing sound pretty familiar.

The three main components of the ePub (aside from the actual pages) are the TOC, the spine and the manifest. The manifest basically tells you where everything is, the TOC is the table of contents which can link to various pages and the spine gives you the traversal order.

Some mistakes I've seen are using the TOC to traverse the book. Using the spine to traverse the book but not handling hidden pages properly. Not handling two page spread properly.
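
To make the manifest/spine relationship concrete, here is a rough Python sketch that walks the spine and resolves each entry through the manifest. The sample package document and the name `reading_order` are invented for illustration, and it assumes that "hidden pages" refers to spine items marked `linear="no"`; a real reader has to deal with much more than this.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical, heavily trimmed package document (no metadata section).
SAMPLE_OPF = """
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="nav" href="toc.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="c1" href="chapter01.xhtml" media-type="application/xhtml+xml"/>
    <item id="c2" href="chapter02.xhtml" media-type="application/xhtml+xml"/>
    <item id="notes" href="notes.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="c1"/>
    <itemref idref="c2"/>
    <itemref idref="notes" linear="no"/>
  </spine>
</package>
"""


def reading_order(opf_xml: str) -> list[tuple[str, bool]]:
    """Return (href, is_linear) pairs in spine order."""
    root = ET.fromstring(opf_xml)
    # The manifest maps item ids to file paths.
    hrefs = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }
    order = []
    # The spine, not the TOC, defines the traversal order.
    for ref in root.findall("opf:spine/opf:itemref", OPF_NS):
        is_linear = ref.get("linear", "yes") != "no"
        order.append((hrefs[ref.get("idref")], is_linear))
    return order
```

Note that `toc.xhtml` is in the manifest but never appears in the result, which is the point: the TOC links into the book, while the spine decides what comes next.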

So yeah the spec is nuanced and it would be easy to make a reader that worked with a lot of books but then had weird issues on another set of books that aren't particularly different. We ended up writing our own parser because we kept finding issues with the main open source ones.

I recall using this repo (https://github.com/IDPF/epub3-samples) to test specific functionality to make sure it was in line with the spec.

johnchristopher · on April 9, 2021

Thanks for the explanation.

jrimbault · on April 8, 2021

You can use 7z to decompress and view an epub's contents. Typically: a table of contents `toc.xhtml` file, some `chapterXX.xhtml` files, maybe a few CSS and image files. (I don't remember the archive format epub uses, probably zip, but 7z will guess for you)

frosted-flakes · on April 9, 2021

You can also just rename it to have a .zip extension and open it in the OS file explorer.
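
If you'd rather script it than rename files, Python's `zipfile` module will open an .epub as-is. A small sketch, with `inspect_epub` being a made-up name: it lists the entries and reads `META-INF/container.xml` to find where the package (OPF) file lives. It assumes an unencrypted book with a well-formed container file.

```python
import xml.etree.ElementTree as ET
import zipfile

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def inspect_epub(epub_file) -> tuple[list[str], str]:
    """Return (all entry names, path of the package document)."""
    with zipfile.ZipFile(epub_file) as zf:
        names = zf.namelist()
        container = ET.fromstring(zf.read("META-INF/container.xml"))
    rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no <rootfile> element")
    # full-path is relative to the root of the archive.
    return names, rootfile.get("full-path")
```

From there, `zf.read()` on any of the listed XHTML files gives you the chapter markup directly.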

notjustanymike · on April 8, 2021

Crazy right? And then you realize you can publish a legitimate epub using a JAMStack, which means some of us may have turned our onboarding documentation into a book, preloaded it onto a cheap branded android tablet, and then sent it to our premium clients as marketing schwag!

mettamage · on April 8, 2021

That sounds awesome!

Wait, so you could actually do all of that and then let it interact with APIs as well? When JS gets involved like this, I can see some crazy applications in my mind packaged as an ".epub" book.

I guess it depends on what reader you're targetting then. A quick cursory search shows that not all of them support JS. Makes sense to me.

richeyryan · on April 8, 2021

I used to work for a large US-based publisher with a big presence in education. I worked on the ePub parser and renderer written in React. As a company we basically took the standard and ran with it. Each book could have its own interactive widgets where kids could do reading comprehension questions or math problems and the system would capture all this for the teacher to grade. We had closed captioned audio for a lot of books that the ePub reader would co-ordinate and play. Last I heard they abandoned all that for a completely proprietary format though. It's been the only situation where I've gotten elbow deep into implementing a specification. It was interesting feeling out the nuances and finding the optional parts of the spec that actually end up being important because it was all planned to fit together.

inetknght · on April 8, 2021

> Wait, so you could actually do all of that and then let it interact with APIs as well?

With epub? I hope not!

> I guess it depends on what reader you're targetting then. A quick cursory search shows that not all of them support JS. Makes sense to me.

Any epub reader supporting javascript would very much be an antifeature.

GRiMe2D · on April 8, 2021

The EPUB 3.0 spec includes JavaScript, so many existing readers probably support it. IIRC, iBooks on Mac also supports JavaScript, but it is only activated after the user clicks.

https://www.w3.org/publishing/epub3/epub-contentdocs.html#se...

smnrchrds · on April 8, 2021

It would be a gem in the hands of someone like Bret Victor.

kemayo · on April 8, 2021

It's all at a very approachable level, too. I had to write an epub-maker as a necessary component for a project, and it turned out to be ~150 lines of python. You have to make a few indexes in XML and stick them into a zip file, basically.

https://github.com/kemayo/leech/blob/master/ebook/epub.py

phuff · on April 8, 2021

Hey I did this, too! I forgot until just now :)

https://github.com/phuff/epub_builder

I built it because I wanted to have something that made a daily brief news paper that was personalized and sent to my kindle. It makes an epub and uses kindlegen to convert it to a .mobi. There's a lot of fun epub formatting stuff you can do.

Here's the system that makes the daily newspaper, but it's been so long I'm not sure it's actually functional code outside of my production version:

https://github.com/phuff/steward

ivansavz · on April 8, 2021

Yes it's "just" HTML/CSS, but given the wide range of ePub reader capabilities, it's not like you can just take any web page and put it in an .epub. You have to be conservative, and use only basic stuff. Also JavaScript is not supported by most ePub readers, so many of the modern web "dynamic" niceties are not available.

For example, rendering math on the web has been a solved problem for many years thanks to MathJax and KaTeX, but these require JS, so cannot be used in ePubs (unless you know the reader supports scripting).

If anyone is interested, I wrote a mega blog post about my journey to produce decent-looking math equations inside an ePub (and mobi files): https://minireference.com/blog/generating-epub-from-latex/ some discussion from when I posted on HN https://news.ycombinator.com/item?id=26356903

mettamage · on April 8, 2021

Hi Ivan, nice to see you here! Haven't gotten back to my math, but did improve in programming ;-)

The struggle to get better at math and find time + motivation for it is real.

I'm the high school textbook guy, if that rings any bells.

bobbylarrybobby · on April 8, 2021

Couldn’t you just use KaTeX to compile to MathML ahead of time?

ivansavz · on April 8, 2021

Yeah, MathML is going to be the "right" way to do this in the long term, but right now not many reader devices support it, so not a viable option.

I'm doing testing though, and hopefully going to see more MathML in the future (in browsers and ePub readers).

salamandersauce · on April 9, 2021

Kobos do. Apple iBooks does. PocketBooks Android app does (not sure about their readers though). Kindles still don't, which I feel is holding back its use in books at least. It seems like with KFX they will at least convert MathML to images for publishers now? So hopefully that will lead publishers to use it.

Out of all my epub math textbooks I have exactly one done with MathML and it's fucking glorious compared to the dogshit image based ones most publishers put out with blurry images intended for the 800x600 readers of 2007 and not modern 300 dpi ones. I had one book where literally 2/3rds of the equations were just missing from the file and unviewable on any device. This started on PAGE 7. I then had to ask the publisher to fix it, which they sort of did by replacing with a PDF version.

alias_neo · on April 8, 2021

I found this out last week when I bought a Kobo Forma and started converting all of my favourite Markdown documents to epub to stick on there. Calibre even lets you create a TOC by specifying the header regex (#, ##, etc. for Markdown), it's great! I had to edit a few manually to tweak the layout, and Calibre (https://calibre-ebook.com/) has a nice editor for epubs built in.

ineedasername · on April 8, 2021

It actually goes deeper than that: Epub is basically a specific implementation of DocBook [0], which is itself a specific XML specification derived from the grand daddy of markup languages SGML.

[0] https://en.wikipedia.org/wiki/DocBook

rchaud · on April 8, 2021

Epubs are basically what "motherfuckingwebsite.com" advocates for.

Despite being HTML/CSS, the layouts aren't particularly interesting though. Most content reads from top to bottom, and the formatting is identical whether you read it on a phone or a tablet.

cwitty88 · on April 8, 2021

I was at a developer conference and one of the original Apple guys was there (name evades me at the moment). He mentioned that after they built webkit and wanted to move into the book space with the iPad launch that Steve Jobs wanted to reuse all of the webkit work. They did that and made epub.

Turing_Machine · on April 8, 2021

It's pretty easy to write code that generates and displays EPUB2.

EPUB3 is a dog's breakfast -- it's hard to think of a better example of "second system effect". As far as I know, there's still not even one reference implementation that supports the full standard, even though it's been out for nearly 10 years. It gains you very little over EPUB2 for standard novels written in western scripts. EPUB3 is only needed if you require embedded scripting, support for non-alphabetical or bidirectional scripts, etc. I believe that most commercial "EPUB3" files still have an EPUB2 toc.ncx file and are designed to fall back to EPUB2 if the reader doesn't support EPUB3 (there are a lot of readers like this).

Something that's easy to overlook: "The mimetype file must be a text document in ASCII that contains the string application/epub+zip. It must also be uncompressed, unencrypted, and the first file in the ZIP archive".

All the other files in the ZIP can be compressed normally.

What this means in practice is that uncompressing an EPUB is easy (just rename it to .zip, if necessary, and run unzip), but recompressing it requires some care.

Assuming you've got your book's content in an OEBPS folder, and the container XML file in the META-INF folder, you can do it like this:

    zip -X0 test.epub mimetype
    zip -X9Dr test.epub META-INF OEBPS
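
For anyone who would rather do this from Python than from the shell, roughly the same thing can be sketched with the standard `zipfile` module. The function name `pack_epub` and the directory layout are assumptions for illustration; the output path should live outside the source folder so the archive doesn't try to include itself.

```python
import zipfile
from pathlib import Path


def pack_epub(src_dir: str, out_path: str) -> None:
    """Zip src_dir into out_path with 'mimetype' first and uncompressed."""
    src = Path(src_dir)
    with zipfile.ZipFile(out_path, "w") as zf:
        # First entry: the mimetype file, stored as-is (like `zip -X0`).
        zf.writestr(
            "mimetype",
            "application/epub+zip",
            compress_type=zipfile.ZIP_STORED,
        )
        # Everything else can be deflated (like `zip -X9Dr`).
        for path in sorted(src.rglob("*")):
            rel = path.relative_to(src).as_posix()
            if path.is_file() and rel != "mimetype":
                zf.write(path, rel, compress_type=zipfile.ZIP_DEFLATED)
```

Calling `pack_epub("book", "test.epub")` on a folder containing META-INF and OEBPS should give an archive laid out the same way as the two `zip` commands above. Directory entries are skipped, which mirrors the `-D` flag.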

(edit to fix code formatting)

BlueTemplar · on April 9, 2021

Wait, so what is the difference between .epub and .mhtml already ?

walton_simons · on April 8, 2021

Shoutout to some excellent software for ebook wrangling.

The first is the "Standard Ebooks"[1] toolset, which is a suite of Python scripts to create, process, and build ebooks in all common formats. The results on the Standard Ebooks site speak for themselves. They're impeccable in every way, and far better than many big name, commercially produced efforts.

GitHub: https://github.com/standardebooks/tools

How to use: https://standardebooks.org/contribute/producing-an-ebook-ste...

The second is Sigil, which is a great editor if you prefer to work with a GUI:

GitHub: https://github.com/Sigil-Ebook/Sigil

Homepage: https://sigil-ebook.com/about/

[1] https://standardebooks.org/

neweraccount · on April 8, 2021

On Linux, I rename epub to zip, unzip it and use browser to read books.

k_sze · on April 8, 2021

If you don't already know about Calibre, I wholeheartedly recommend it.

ok123456 · on April 8, 2021

You don't even need to load the books into calibre's database to view it. You can invoke 'ebook-viewer' from the command line directly with the epub file's path as the argument.

ASalazarMX · on April 8, 2021

There's even a console reader, if the reason for unzipping is reading in the text terminal: https://github.com/wustho/epy

JNRowe · on April 8, 2021

emacs can be a surprisingly comfortable text mode epub reader too, via nov.el¹ which has been discussed here². If you use a GUI emacs build you get inline images and other goodies, but starting emacs with -nw can be a reasonable solution for quickly checking a book in a term.

Note: You don't have to be a full-time emacs user to use nov.el.

¹ https://depp.brause.cc/nov.el/

² https://news.ycombinator.com/item?id=21426315

stewx · on April 8, 2021

I learned this a while back and used the Python web page scraping tool BeautifulSoup to take an eBook version of a cookbook and generate individual recipe files compatible with my favourite recipe manager, Paprika.

hoophoop · on April 8, 2021

Epub can do all sorts of homecalling / user tracking using HTML or CSS or javascript.

What's even worse - almost all Epub readers don't do proper sandboxing.

Tagbert · on April 8, 2021

That is one more reason why it should be supported in browsers where we have better understood ways to control this.

hoophoop · on April 9, 2021

Browsers don't block any homecalling either.

firefoxd · on April 8, 2021

To be a little pedantic, it's XHTML.

I recently published a book and going through the w3c epub specifications was a pain. Instead, I bought a book I wanted to read then reverse engineered it.

For small files you can use the w3c online validator, which will give you an overwhelming list of errors.

Note: The kindle does not support epub, instead it uses kpf. For that you have to download a 333 MB program to convert your epubs.

mettamage · on April 8, 2021

Note: it's simply the wiki page, but in all my years that I read .epub files I never bothered to check the wiki page. So it is to my surprise I found out that it's just some XML and HTML/CSS!

sumtechguy · on April 8, 2021

I had the same reaction when I found out word docx files are just zip files with a bunch of xml in there.

lucb1e · on April 8, 2021

You might be surprised to learn just how many files are zip files.

Java software (jar, war): zip files

Android packages (apk): zip files

OpenDocument Format (odt, ods, odp): zip files

Quake 3 / OpenArena / Urban Terror / etc. (pk3): zip files

Firefox/Thunderbird/Chromium extensions (xpi, crx): zip files

EPUB: :D

sumtechguy · on April 8, 2021

Just look for that PK in the first 2 bytes of the file and it is a good chance it is a zip file. That jar/war one has saved me a few times in figuring out what exactly the compiler did to a program.
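
That "PK" check is easy to script. A minimal sketch in Python, with both function names invented here: the first does the two-byte check described above, the second asks the standard `zipfile` module for a second opinion, since two matching bytes on their own are only a hint.

```python
import io
import zipfile


def has_pk_signature(data: bytes) -> bool:
    """True if the data starts with the two bytes 'PK'."""
    return data[:2] == b"PK"


def is_probably_zip(data: bytes) -> bool:
    """Combine the quick 'PK' check with zipfile's own detection."""
    # is_zipfile looks for the end-of-central-directory record,
    # so it rejects files that merely begin with 'PK'.
    return has_pk_signature(data) and zipfile.is_zipfile(io.BytesIO(data))
```

In practice you would read the bytes with something like `open(path, "rb").read()` and pass them in; for large files, reading only the first two bytes is enough for the quick check.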

Turing_Machine · on April 8, 2021

Yeah, it is surprising at first, but after you think about it, maybe not so much.

If you need to cram a bunch of files into one package, zip is the obvious candidate. There are well-tested libraries and apps for dealing with zips for essentially every language and operating system.

As the saying goes, "don't mess with success".

ok123456 · on April 8, 2021

numpy's npz is also zip.

doodpants · on April 8, 2021

Docx was Microsoft's answer to the Open Document Format used by OpenOffice, which is also just zipped XML files.

rvz · on April 8, 2021

cool.

rvz · on April 8, 2021

To Downvoters: So there is something wrong with saying 'cool'? What is the problem this time? There is nothing malicious or 'offensive' about reacting to something by saying 'cool'. Come on.

Care to explain yourselves this time?

GrumpySloth · on April 8, 2021

I didn't downvote, but it's a low-effort comment that doesn't really add anything to the discussion. There is no information, no argument, no widening of context, no additional perspective, nothing. It also isn't an acknowledgement-type reply like "I see" ending a discussion. In other words, it decreases signal-to-noise ratio. You can compare it to "congrats" email chains at corps.

EDIT: It would be ok in a live discussion. But on a forum, not so much.

robin_reala · on April 9, 2021

The HN equivalent of a “cool” response is clicking the upvote button.

rvz · on April 11, 2021

Excuse me?

Am I 'not allowed' to express my own opinion of this post via typing in text, even if it says 'cool'.

I think I am getting tone-policed.

open-source-ux · on April 8, 2021

ePub is an open format but as the wiki page states "it is supported by almost all hardware readers, except for Kindle".

I can recommend Kobo as an e-ink e-reader that supports ePub with one caveat: Kobo requires you to sign-up for a Kobo account before you can even use the device - horrible. It's easy to search online to find a way to bypass this.

Although Kobo is an alternative to Kindle, you won't find the range of titles that Amazon sells. However, I think e-readers are best for text-only, small paperback-sized books. Anything else simply doesn't fit the small screen and is inferior to the physical version of a title. (Amazon sells a lot of Kindle titles that are simply unsuitable for small e-reader screens.)

m-p-3 · on April 8, 2021

One of the benefits of EPUB is that text can be reflowed, so the display size doesn't matter much, unlike PDF which sets a specific page size. I'm not sure about the MOBI format, but I assume it has similar features to EPUB?

At least it's possible to strip the DRM on Amazon books with the right set of tools, and Calibre is able to convert them to EPUB.

open-source-ux · on April 8, 2021

"One of the benefit of EPUB is that text can be reflowed, so the display size doesn't matter much"

I do feel that the e-ink reader screen size does matter because reflowed text only works well for small, paperback-sized books. Any book larger than this small size that also features tables, charts, images, diagrams, code listings and more, will not display well on a small e-ink screen.

Tagbert · on April 8, 2021

Mobi is very similar to EPub. It’s almost a 0.9 version of EPub. The main differences are in the container.

codpiece · on April 8, 2021

When I travel, I often put all my documents and PDFs on a Kobo. Easy to read, great battery life, barcodes display even when sleeping.

rijoja · on April 9, 2021

Had an Amazon Kindle but it was practically useless for me due to how locked down it was. If I wanted to add something esoteric I had to mail it to Amazon for whatever reason. I don't mind paying for books, but I would hate to have Amazon decide what books I read.

Is the kobo better in this regard?

kcartlidge · on April 10, 2021

- You can create an account to set up the device then never network-connect again

- Shows as an external drive when connected via a micro-USB cable

- Copy and paste books, then when you unplug from your laptop your Kobo sees them

- With Calibre you can tag your books; tags become collections on-device

- Understands more formats, but Calibre can also convert to EPUB during copying

- You can also read the Adobe-DRM protected books (or ask Apprentice Alf)

rijoja · on April 11, 2021

Thanks!

karol · on April 8, 2021

Does it mean you could build a better web for knowledge sharing than medium and similar? It seems epub only supports a subset of CSS.

hosh · on April 8, 2021

People sometimes want to preserve whole websites that they can then access and use in their personal library. (Or at least, back before the cloud and streaming got a lot of people off of developing and maintaining their own personal library).

I am thinking of a scenario where, if there is a collapse (societal, economic, political, or technological), how can knowledge be disseminated and preserved in a resilient way?

frosted-flakes · on April 9, 2021

A printer?

hosh · on April 9, 2021

That can work, although the paper should be acid-free, and there are probably considerations for the ink that I am ignorant of.

Even better if the paper and ink can be made onsite, and the printer is repairable by someone within the nearby geographical region.

gbraad · on April 8, 2021

... and still it's an issue to open an epub on a desktop using a browser.

LordDragonfang · on April 9, 2021

Not to sound egotistical, but did you learn that from an HN comment? Because I literally pointed that out to someone 4 days ago on here.

https://news.ycombinator.com/item?id=26702559

rambojazz · on April 9, 2021

What's going on with the link "wikipedia.org/wiki/EPUB#:~:text=EPUB%20is%20an%20e%2Dbook,smartphones%2C%20tablets%2C%20and%20computers."? There doesn't seem to be any ID in the page with that fragment.

qwtel · on April 9, 2021

https://web.dev/text-fragments/#text-fragments
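For anyone curious how such a link is put together, here is a rough Python sketch that rebuilds the fragment quoted above. The helper names are made up for illustration, the `en.wikipedia.org` host is an assumption (the quoted link omits it), and only the `text=start,end` form is handled. Note that `-` has to be escaped by hand, since it has syntactic meaning inside a text directive but `quote()` leaves it alone.

```python
from typing import Optional
from urllib.parse import quote


def encode_fragment_text(text: str) -> str:
    """Percent-encode text for use inside a #:~:text= directive."""
    # safe="" makes quote() escape everything except letters, digits and "_.-~".
    # "-" survives quote(), so it is escaped separately as %2D.
    return quote(text, safe="").replace("-", "%2D")


def text_fragment_url(page_url: str, start: str, end: Optional[str] = None) -> str:
    """Build a link that asks the browser to scroll to and highlight a text range."""
    directive = encode_fragment_text(start)
    if end is not None:
        # A comma separates the start of the range from its end.
        directive += "," + encode_fragment_text(end)
    return f"{page_url}#:~:text={directive}"


url = text_fragment_url(
    "https://en.wikipedia.org/wiki/EPUB",  # assumed host
    "EPUB is an e-book",
    "smartphones, tablets, and computers.",
)
print(url)
```

Browsers without text fragment support should simply ignore the directive and load the page as usual.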

tannhaeuser · on April 8, 2021

Epub is XHTML not HTML, though.

frosted-flakes · on April 9, 2021

XHTML is also valid HTML, so both are true.

GoblinSlayer · on April 9, 2021

It went down the google hole in epub3.

darkhorse13 · on April 8, 2021

Does EPUB support JavaScript as well? And if it doesn't, are there any similar alternatives? Seems like a single file document that can also pull in data from somewhere could be pretty useful to say the least.

progval · on April 8, 2021

> EPUB 3 Reading Systems may optionally support scripting, which was explicitly discouraged in EPUB 2.

http://idpf.org/epub/30/spec/epub30-changes.html#sec-new-cha...

inetknght · on April 8, 2021

Books should be immutable much like a true real website. Anyone using javascript in a book should not be writing a book. If you're writing javascript then go write an app.

Karawebnetwork · on April 8, 2021

I can see some useful cases. For example, in a computer science book, you could update a caption space that gets its data from the web. This would allow you to display an "obsolete sample code" warning below the examples. When the user is not connected to the internet, you could display "Get online to know code snippet status". And so on.

inetknght · on April 8, 2021

> in a computer science book, you could update a caption space that gets its data from the web.

First, there's opportunity for that web endpoint to stop functioning. Second, there's opportunity for that web endpoint to become taken over by malice. And third, there's opportunity to turn that caption space into an advertisement.

So, to put it succinctly: fuck no.

> This would allow you to display an "obsolete sample code" warning below the examples.

So now the book isn't timeless. It changes. It's no longer a book.

A better idea: include the "obsolete sample code" warning in the book and ask the user to check for the latest practices at a URL also included in the book.

> When the user is not connected to the internet, you could display "Get online to know code snippet status". And so on.

When the user is not connected to the internet should be the only case ever considered for a book. Otherwise you're not writing a book. You're writing an app.

Karawebnetwork · on April 8, 2021

> So, to put it succinctly: fuck no.

Cheers.

rijoja · on April 9, 2021

Also I wouldn't want my reading habits to be tracked by anyone, or at least I'd want the tracking to be minimized. Let's say that I want to read the Communist Manifesto: that shouldn't go into the hands of people creating targeted political propaganda campaigns, or of whoever keeps track of credit scores.

rchaud · on April 8, 2021

You know how online newspaper articles have ads appearing in between paragraphs? That is precisely how JS would be implemented in epubs.

Sure, we all have pleasant visions of truly interactive ebooks driven by creatively built JS content. But in the real world, ads would be the first thing to be added if JS was supported.

darkhorse13 · on April 8, 2021

I know what you mean, I really do. But think of an interactive document that keeps updating itself with remote data. That sounds really cool to me.

finiteseries · on April 8, 2021

That’s a website.

darkhorse13 · on April 8, 2021

It is. But you can't really run an HTML file locally without at least setting up some type of server (if you plan to make requests). And I know setting up a server is extremely easy, but it's almost impossible for someone who hasn't programmed before.
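As a rough illustration of the "setting up a server" step, here is a sketch using only Python's standard library. The `serve_directory` helper and the throwaway `index.html` page are made up for the example; from a terminal, `python -m http.server` run inside a folder does much the same thing.

```python
import functools
import tempfile
import threading
import urllib.request
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


def serve_directory(directory: str, port: int = 0) -> ThreadingHTTPServer:
    """Serve `directory` on localhost from a background thread.

    Port 0 asks the operating system to pick any free port.
    """
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Demo: serve a temporary folder with one page, fetch it over HTTP, then stop.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "index.html").write_text("<h1>Hello from a local file</h1>", encoding="utf-8")
    server = serve_directory(folder)
    port = server.server_address[1]

    # An empty ProxyHandler keeps any system proxy settings out of the way.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"http://127.0.0.1:{port}/index.html") as response:
        body = response.read().decode("utf-8")

    server.shutdown()
    server.server_close()

print(body)
```

That covers the technical side only; it does nothing for the point about people who have never programmed before.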

vxNsr · on April 8, 2021

The point you’re missing is that if you just change the file extension of an ePub to html it will work fine in the browser. There’s nothing special going on. Just because most .html files are served from a remote location doesn’t mean they need to be: you can send someone a .html file as a download which they can open from their desktop, and it will work just as well as an ePub file. In fact, because the browser recognizes the file extension, it will likely run it better!

kcartlidge · on April 10, 2021

> if you just change the file extension of an ePub to html it will work fine in the browser

If that's the case, you didn't have a genuine EPUB to start with. To meet the spec it needs to be in a container (a renamed ZIP file) and have a handful of related metadata and navigation files alongside it.

That said, the actual text of the book is done by HTML/CSS, but within the EPUB container file.
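A minimal sketch of that layout in Python, assuming EPUB 3 conventions as I understand them: a `mimetype` entry stored first and uncompressed, a `META-INF/container.xml` pointing at the package file, and a package file listing a navigation document plus the content. The `OEBPS/` file names, the title and the placeholder UUID are made up for illustration, and the result has not been run through a validator.

```python
import io
import zipfile

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Example Book</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2021-04-10T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc"><ol><li><a href="chapter1.xhtml">Chapter 1</a></li></ol></nav>
  </body>
</html>
"""

CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><h1>Chapter 1</h1><p>Just XHTML inside a ZIP.</p></body>
</html>
"""


def build_epub() -> bytes:
    """Assemble the example files into an in-memory EPUB-style ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
        for name, text in (
            ("META-INF/container.xml", CONTAINER_XML),
            ("OEBPS/content.opf", CONTENT_OPF),
            ("OEBPS/nav.xhtml", NAV_XHTML),
            ("OEBPS/chapter1.xhtml", CHAPTER_XHTML),
        ):
            archive.writestr(name, text, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


epub_bytes = build_epub()
with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
    names = archive.namelist()
    mimetype_info = archive.getinfo("mimetype")
print(names)
```

Writing `epub_bytes` to a file ending in `.epub` gives something a reading app can be pointed at. Renaming that file to `.html` gives a browser nothing it can render, since it is still a ZIP archive.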

dariusj18 · on April 8, 2021

> you just change the file extension of an ePub to html it will work fine in the browser

Isn't epub a zip file of a bunch of html docs, metadata and images?

BlueTemplar · on April 9, 2021

I don't understand, by definition, where is that remote data going to come from, if not another computer ? Also, peer to peer software has made it easier.

IanGabes · on April 8, 2021

Most frequently used in my experience by malware: https://en.wikipedia.org/wiki/MHTML

Mooty · on April 8, 2021

Maybe what you are searching for is just a html page exported.

vbezhenar · on April 8, 2021

According to Wikipedia, EPUB requires readers to support the HTML5, JavaScript, CSS and SVG formats. To me it seems like another name for HTML.

BlueTemplar · on April 8, 2021

It's still not the full specification, it doesn't seem to support current animation formats. (Fingers crossed for AV1 ?)

rijoja · on April 9, 2021

I could agree on gifs or animated pngs or something, but AV1 I think is pushing it a bit too far. What you are looking for if you want an offline webpage is Electron, right?

BlueTemplar · on April 9, 2021

Why is AV1 a bad choice for that "something" ??

No, why would I want Electron ??

IMHO EPUB should specifically be script-free.

And if I wanted scripts in my document¤, then instead of packaging the whole browser (that would be overkill!!) with the document like Electron does (?), I would rather use MHTML.

¤ The following was supposed to be my example, but since this website uses Flash, these scripts are no longer easily run. But since they are described, you should have a good idea what they are going for :

http://resonanceswavesandfields.blogspot.com/2007/08/phasors...

rijoja · on April 11, 2021

> IMHO EPUB should specifically be script-free.

Agreed.

Yeah maybe I am just knee jerking, with AV1. Broadly speaking however there would be two schools of image and video compression, one for computer generated images and one for images of natural concepts. So AV1 isn't really suited for animated graphs and the like at the end of the day.

On second thought, maybe allowing AV1 in would open the door for sound, which really would take us too far away from the book format.

I do definitely see the need for having animations such as the ones you link in your post, however. So many things that take sentence after sentence to describe can be described far more efficiently with an animation.

CSS Animations or animated SVG maybe?

BlueTemplar · on April 11, 2021

Hmm, I should really look into vector graphics one of these days.

Yeah, animated SVGs would be even better for this use case - but you need raster graphics support too for other use cases (some animations might require the inclusion of photographs).

I don't see how sound support is an issue - no more than color and high frame rate support are issues for a format that might also end up displayed on grayscale only displays incapable of high framerates. It's up to the creator of the media to take these into account (or not).

rijoja · on April 12, 2021

Ah yeah, me too. I only realised that you could animate them while writing the comment above.

I think that we should define a book as something that you can use only with your eyes.

If I bought a book and I would need to use headphones to "read" all the content in it I would feel a bit defrauded.

BlueTemplar · on April 12, 2021

But epubs aren't books. What is the point of trying to artificially restrict a format ?

(Though a (sub?)format dedicated to "e-ink" readers might actually be a good idea ?)

rijoja · on April 16, 2021

Yes, epubs aren't books. I think you really nailed it there. E-ink display devices have way lower refresh rates than what we are used to, and as such are probably really bad at playing video, or unable to play it at all.

BlueTemplar · on April 8, 2021

You probably want mhtml = eml ?

GoblinSlayer · on April 9, 2021

You want Electron.

garrickvanburen · on April 8, 2021

It’s so fun!

I learned this a while back when I was deep in @font-face and web fonts and style sheets.

I was almost immediately discouraged because modern features like @font-face were inconsistently supported.

Haven’t checked in a while now, maybe it’s better.

rijoja · on April 9, 2021

Huh, that is interesting. Well, it's not just HTML/CSS, as it also includes MathML, which sadly isn't present in all modern browsers.

fnord77 · on April 8, 2021

sounds like the kindle format (actually mostly .mobi) is moving towards html/css too

prewett · on April 8, 2021

When I checked, years ago, it seemed that .mobi included an embedded EPUB file. The books I checked also had it in the old format, presumably for compatibility, but I think I remember seeing something that this was not recommended as the only form.

thomond · on April 8, 2021

It's actually always used XHTML. Both .mobi and .epub are based on the Open eBook format - https://en.wikipedia.org/wiki/Open_eBook

rhapsodic · on April 8, 2021

I wonder why I can't save an MS Word doc as an Epub.

sto_hristo · on April 8, 2021

- Wait, email and epub are just html?

- points gun Always has been.

rchaud · on April 8, 2021

clearly it's not just html because if it was, dragging an epub file into the browser would display the book in a navigable format.

adjav · on April 8, 2021

Yeah, it's compressed HTML/CSS with some additional XML files for use in navigation and DRM support.
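To see those extra XML files from the reading side, here is a hedged Python sketch that opens an archive, follows `META-INF/container.xml` to the package file, and lists what the manifest declares. The `list_manifest` name and the tiny in-memory demo archive are made up for the example; a real book has a much fuller package file, and this sketch does not look at DRM at all.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the container file and the package file.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


def list_manifest(epub_file):
    """Return (package_path, [(href, media_type), ...]) for an EPUB path or file object."""
    with zipfile.ZipFile(epub_file) as archive:
        # container.xml says where the package (.opf) file lives inside the archive.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        package_path = container.find("c:rootfiles/c:rootfile", NS).attrib["full-path"]

        # The package file's manifest lists every resource in the book.
        package = ET.fromstring(archive.read(package_path))
        items = [
            (item.attrib["href"], item.attrib["media-type"])
            for item in package.findall("opf:manifest/opf:item", NS)
        ]
    return package_path, items


# Demo archive holding only the two XML files the function reads (hypothetical content).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr(
        "META-INF/container.xml",
        '<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
        '<rootfiles><rootfile full-path="OEBPS/content.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles></container>',
    )
    archive.writestr(
        "OEBPS/content.opf",
        '<package xmlns="http://www.idpf.org/2007/opf" version="3.0"><manifest>'
        '<item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>'
        '<item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>'
        '<item id="css" href="style.css" media-type="text/css"/>'
        "</manifest></package>",
    )

package_path, items = list_manifest(demo)
print(package_path)
for href, media_type in items:
    print(href, media_type)
```

Passing a path such as `"book.epub"` instead of the in-memory buffer should work the same way, as long as the file is not encrypted.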

theandrewbailey · on April 8, 2021

Except for the time when email was plain text only.

mattl · on April 8, 2021

Or NeXT RTFd