Common Issues found in EPUB Files – and How to Fix Them!

At NNELS, we have had the opportunity to review hundreds of EPUBs from publishers all over Canada, which means that we have been able to learn, first-hand, what errors are the most common! 

In this resource, we will

  • describe the most common issues we see;
  • explain how they can be a barrier to accessibility;
  • provide a way to check for these issues in a book; and
  • share solutions, including how to set things up in InDesign so that the issue is avoided, and how to remediate the issue in the code of the book if it is too late for InDesign!

Language Attributes in Content Documents (XHTML Files)

Issue description:

Each XHTML content document that makes up an EPUB should have both “lang” and “xml:lang” attributes on the <html> element. Frequently, one or both of these are missing.

Accessibility impact:

The language attributes need to be on the <html> element in order to ensure that the content is correctly parsed by the user’s technology. Both lang and xml:lang are required in order to ensure that it will be parsed correctly, regardless of the technology the reader is using.

How to check:

Open up the book in code-viewing/editing software, like Sigil, Calibre, BBEdit, or whatever you are comfortable with. At the top of each content document, there will be an <html> tag. If the language attributes are missing, it may look something like this:

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">

(Depending on the book, there may be more attributes present in the <html> element – that is fine.)
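
For a book with many content documents, this check can be scripted. The following is a minimal Python sketch, not an official tool: the function name missing_language_attributes is hypothetical, and it assumes you have already read one XHTML file from the unzipped EPUB into a string. It looks only at the opening <html> tag and reports which of the two attributes are absent.

```python
import re

# Opening <html ...> tag of a content document.
HTML_TAG = re.compile(r"<html\b[^>]*>", re.IGNORECASE)
# A plain lang attribute; the lookbehind stops it from matching inside xml:lang.
LANG = re.compile(r"""(?<![:\w-])lang\s*=\s*["']([^"']*)["']""")
XML_LANG = re.compile(r"""\bxml:lang\s*=\s*["']([^"']*)["']""")


def missing_language_attributes(xhtml_text):
    """Return the language attributes that the <html> tag is missing."""
    match = HTML_TAG.search(xhtml_text)
    if match is None:
        # No <html> tag found at all; report that instead of guessing.
        return ["<html>"]
    tag = match.group(0)
    missing = []
    if not LANG.search(tag):
        missing.append("lang")
    if not XML_LANG.search(tag):
        missing.append("xml:lang")
    return missing
```

An empty list means both attributes are present; it does not confirm that the declared language is the right one, so a visual check is still worthwhile.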

How to address in InDesign:

Ensure that the [Basic Paragraph] Paragraph Style has its language set as the language of the main content. The language is set in the “Advanced Character Formats” tab in the Paragraph Style Options window; this is accessed by right-clicking on the [Basic Paragraph] style, and selecting “Edit [Basic Paragraph]”. When you export the book, the language attributes should be present on each document (except the OPF file) – but always double check!

How to remediate in the code:

Ensure that the <html> element of each XHTML document has both “lang” and “xml:lang” attributes. That will result in an <html> element that looks something like:

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" lang="en" xml:lang="en">

If you are doing this manually in the code, you should be able to use find and replace! Simply “Find” the original HTML element (e.g., <html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">) and set the “Replace” as your original code, plus the language declarations (e.g., <html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" lang="en" xml:lang="en">).
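
If the opening <html> tag varies from file to file, a plain find and replace may miss some documents. As an alternative, here is a small Python sketch of the same remediation. The function name add_language_attributes and the default language "en" are assumptions for illustration; it works on the text of one XHTML file and only adds whichever attribute is missing.

```python
import re

HTML_TAG = re.compile(r"<html\b([^>]*)>")
LANG = re.compile(r"""(?<![:\w-])lang\s*=\s*["']([^"']*)["']""")
XML_LANG = re.compile(r"""\bxml:lang\s*=\s*["']([^"']*)["']""")


def add_language_attributes(xhtml_text, language="en"):
    """Add lang and xml:lang to the first <html> tag if they are missing."""
    def fix(match):
        attrs = match.group(1)
        lang = LANG.search(attrs)
        xml_lang = XML_LANG.search(attrs)
        # Reuse a value that is already declared so both attributes agree.
        declared = lang or xml_lang
        value = declared.group(1) if declared else language
        if lang is None:
            attrs += f' lang="{value}"'
        if xml_lang is None:
            attrs += f' xml:lang="{value}"'
        return f"<html{attrs}>"

    return HTML_TAG.sub(fix, xhtml_text, count=1)
```

Run it on a copy of the book first, and re-open a few files afterwards to confirm the result.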

Connection to WCAG or EPUB Accessibility Specification, if applicable:

WCAG 3.1.1 Language of Page, Level A.


Language Shifts within Content

Issue description:

Sometimes, a book will have a few words in a language that is different from the main content of the book. If these are not coded correctly, they will not be pronounced with the correct accent.

Accessibility impact:

If text that is in another language is not pronounced with the correct accent, it can sound very confusing. Watch this video segment from a NNELS Accessibility Consultant demonstrating this using VoiceOver, the iOS screen reader software (watch from 8:15-9:53).

How to check if language shifts are coded correctly:

Open up the book in a code-viewing/editing software, like Sigil, Calibre, BBEdit, or whatever you are comfortable with, and look for text in another language.

If it has been correctly coded, it will look something like:

<p class="bodyText">The French man turned to me and said <span class="FrenchLang" lang="fr-CA" xml:lang="fr-CA">“Je suis très français.”</span></p>

If the text is in italics, it may look like this, as the language attributes can also be placed within <i> elements:

<p class="bodyText">The French man turned to me and said <i lang="fr-CA" xml:lang="fr-CA">“Je suis très français.”</i></p>

If the language shift is not marked up, it will simply look like:

<p class="bodyText">The French man turned to me and said “Je suis très français.”</p>

Or like this, if it has been styled as italic:

<p class="bodyText">The French man turned to me and said <i>“Je suis très français.”</i></p>
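
A script cannot reliably tell which words are in another language, but it can list the text that is already tagged, which makes it easier to spot what was missed. The sketch below uses Python's standard html.parser module; the class name LanguageShiftLister and the function list_language_shifts are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class LanguageShiftLister(HTMLParser):
    """Collect the text of every element, other than <html>, that declares a language."""

    def __init__(self):
        super().__init__()
        self.shifts = []  # finished records, in document order of closing
        self._open = []   # stack of (tag, record or None) for open elements

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        record = None
        if tag != "html" and ("lang" in attrs or "xml:lang" in attrs):
            record = {"tag": tag, "lang": attrs.get("lang"),
                      "xml_lang": attrs.get("xml:lang"), "text": ""}
        self._open.append((tag, record))

    def handle_data(self, data):
        # Text belongs to every language-tagged element that is still open.
        for _, record in self._open:
            if record is not None:
                record["text"] += data

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, saving any tagged records on the way.
        while self._open:
            open_tag, record = self._open.pop()
            if record is not None:
                self.shifts.append(record)
            if open_tag == tag:
                break


def list_language_shifts(xhtml_text):
    parser = LanguageShiftLister()
    parser.feed(xhtml_text)
    parser.close()
    return parser.shifts
```

If a passage you know is in French does not appear in the returned list, it has not been marked up.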

How to address in InDesign:

If you are developing a book in InDesign, you can create a Character Style which, when applied to text, will ensure that it is exported with the required language attributes. To do this:

  1. Create a Character Style, and name it something meaningful for your work – likely, the name of the Language (e.g. “French”)
  2. Under “Advanced Character Formats”, set the language from the dropdown menu.
  3. Under “Export Tagging”, choose “span” from the dropdown menu, or type in “i” if you are styling the non-English text as italic

When you apply this Character Style to text segments, it should be correctly marked up in the exported EPUB file.

How to remediate in the code:

If you are reviewing a completed EPUB file, then look through the book to find segments of text in other languages, and wrap those segments in span tags with the language attributes, or, if the text is italicized, add the language attributes into the <i> tag.
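
Once you have identified a phrase, the edit itself is mechanical. The following Python sketch shows one way it could be done; mark_language_shift is a hypothetical helper, and it assumes the phrase appears in the file exactly as you pass it in and has not been tagged already (running it twice on the same text would wrap the phrase twice).

```python
def mark_language_shift(xhtml_text, phrase, language):
    """Tag every occurrence of phrase with lang and xml:lang attributes."""
    attrs = f'lang="{language}" xml:lang="{language}"'
    italic = f"<i>{phrase}</i>"
    if italic in xhtml_text:
        # The phrase is already italicized: add the attributes to the <i> tag.
        return xhtml_text.replace(italic, f"<i {attrs}>{phrase}</i>")
    # Otherwise wrap the phrase in a new <span>.
    return xhtml_text.replace(phrase, f"<span {attrs}>{phrase}</span>")
```

Because it replaces every occurrence, it is best suited to distinctive phrases rather than short, common words.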

Connection to WCAG or EPUB Accessibility Specification, if applicable:

WCAG 3.1.2 Language of Parts, Level AA.


Context Breaks

Issue Description:

When someone is reading a book visually, it is usually easy to see a context break. It might be coded with a character or symbol, a small image, or extra space between paragraphs.

Accessibility impact:

We often see context breaks coded in ways that are not fully accessible, including:

  • With a character, like * or § – this can be confusing, as it is read as a normal paragraph;
  • Using an image that has been marked up as decorative (which it is not, as it is functional) or has unhelpful alt-text; or
  • Using blank space which has been coded using CSS—this is totally ignored by screen reader software.

How to check context break coding:

Open up the book in a code-viewing/editing software that has a preview pane like Sigil or Calibre. Navigate to a part of the book that has a context break, and take a look at the code.

  • Is it coded with an <hr/> element? Or perhaps <hr class="something"/>? Then you are all set – this is the way to do it!
  • Is it coded with a character, like: <p class="center">***</p>? This is somewhat okay—it does communicate to users that there is a break or change of context here—but it would still be better to code it with <hr/>. We’ll discuss how to do it below.
  • Is it coded with an image? If yes, does the alt-text say “context break” or “section break”? This is also okay—but again, would be better as an <hr/>. If left as an image without using <hr/>, make sure the alt-text is not left empty, and contains a useful description.
  • Is it coded using CSS that adds extra white space between paragraphs? This is not ok – screen readers will not announce this at all, and readers who are using assistive technology will not be alerted to the fact that they have passed a context break. In the code, you will be able to determine if this is the case because (usually) the paragraph that comes either before or after the context break will have a different class than the rest of the main body paragraphs.
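
The first two cases in the list above can be spotted with a quick script. This is a rough Python sketch under stated assumptions: summarize_context_breaks is a hypothetical name, and a "symbol paragraph" is taken to mean a <p> whose only content is punctuation or symbols. Image-based and whitespace-based breaks still need to be checked by eye.

```python
import re

HR = re.compile(r"<hr\b[^>]*>")
# A paragraph containing no letters, digits, or nested tags, e.g. <p class="center">***</p>.
SYMBOL_PARAGRAPH = re.compile(r"<p\b[^>]*>\s*([^\w<>]+?)\s*</p>")


def summarize_context_breaks(xhtml_text):
    """Count <hr> elements and list paragraphs that contain only symbols."""
    symbols = [s for s in SYMBOL_PARAGRAPH.findall(xhtml_text) if s.strip()]
    return {"hr": len(HR.findall(xhtml_text)), "symbol_paragraphs": symbols}
```

A file that reports symbol paragraphs but no <hr> elements is a good candidate for the remediation described later in this section.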

How to address in InDesign:

While you are able to edit the export options for a paragraph style to be <hr>, using the following remediation options is cleaner and more reliable.

How to remediate in the code:

The best way to code a context break is by using an <hr/> element. One thing to note about the three following approaches: you may want or need to play around with the CSS styling (size, padding, positioning, etc.) in order to achieve the appearance you want – this is fine! As long as the context break is <hr/> based, it will be great. And also note: we have used the class names of “character”, “blankSpace” and “image” – but you can name them whatever you like.

Character Context Breaks

To use a textual character, like an asterisk:

In the CSS:

hr.character {
overflow: visible;
border:0;
text-align:center;
}

hr.character:after {
content: "***";
display:inline-block;
position:relative;
font-size:1em;
padding:1em;
}

In the HTML:

<hr class="character" />

Note that you can change the characters in the “content” line of the CSS to whatever you like.
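
If the book currently codes these breaks as paragraphs of asterisks or section signs, the HTML swap can be automated. Here is a minimal Python sketch; replace_symbol_breaks is a hypothetical helper, and it assumes the break paragraphs contain nothing but * or § characters, optionally separated by spaces.

```python
import re

# Matches paragraphs such as <p class="center">***</p> or <p>* * *</p>.
SYMBOL_BREAK = re.compile(r"<p\b[^>]*>\s*[*§]+(?:\s*[*§]+)*\s*</p>")


def replace_symbol_breaks(xhtml_text, hr_class="character"):
    """Replace symbol-only paragraphs with an <hr> carrying the given class."""
    return SYMBOL_BREAK.sub(f'<hr class="{hr_class}" />', xhtml_text)
```

The CSS for the chosen class still has to be added to the stylesheet by hand.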

Blank Space Context Breaks

To add blank space:

In the CSS:

hr.blankSpace {
border:0;
height:2em;
}

In the HTML:

<hr class="blankSpace" />

Note that you can change the height to whatever size you wish.

Image Context Breaks

To use an image:

In the CSS:

hr.image {
display:block;
background: transparent url("images/sectionBreakImage.png") no-repeat center;
height:2em;
border:0;
}

In the HTML:

<hr class="image" />

Note that you will need to change the URL to direct systems to your image, and you may need to play around with the sizing/spacing.

Connection to WCAG or EPUB Accessibility Specification, if applicable:

WCAG 1.3.1 Information and Relationships, Level A.


Emphasis

Issue Description:

Text may be emphasized for several reasons, such as vocal emphasis or technical terms. In HTML, there are several emphasis tags available, with specific defined uses for each. Frequently, they are used incorrectly, such as <em> being used for italic text that does not have vocal emphasis, or having all forms of emphasis being coded visually, just using CSS.

Accessibility Impact:

Using the correct markup will allow readers to better distinguish between the different kinds of emphasis. Note that at the time of writing (Fall, 2022), there is not a lot of support by assistive technology for differentiating between these elements, but having correct markup will allow readers to take advantage of it if support improves.

How to check:

Open up the book in a code-viewing/editing software that has a preview pane like Sigil or Calibre. Navigate to a part of the book that has some bolding or italicization, and take a look at the code. Is <em> used on words that require vocal emphasis? Is <i> used on a dream sequence or a word in another language? Then you may have it correct! But, if the usage doesn’t match the directions below, you may want to do some editing.
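
To get a quick picture of which emphasis elements a file uses at all, you can count them. This is a small Python sketch with a hypothetical function name, count_emphasis_elements; it only counts opening tags and cannot judge whether each one is used correctly.

```python
import re
from collections import Counter

# Opening tags only; \b keeps <i> from matching <img> and <b> from matching <br> or <body>.
EMPHASIS_TAG = re.compile(r"<(em|strong|i|b|cite)\b")


def count_emphasis_elements(xhtml_text):
    """Return how many times each emphasis element opens in the text."""
    return Counter(EMPHASIS_TAG.findall(xhtml_text))
```

A book with plenty of italic text on screen but a count of zero for all five elements is probably doing its emphasis with CSS alone.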

How to address in InDesign:

When you use InDesign, you can create Character Styles, which are a collection of character formatting attributes that can be applied to a selected range of text. When it comes to emphasis, there are a variety of different elements that should be used for different situations; they are described below, in the “How to remediate the code” section, and are also broken down here: Emphasis Options Overview.

You can create Character Styles for each emphasis style – Strong, Bold, Italic, Emphasis, and Cite, and code each style to map to the corresponding HTML element. To do this:

  1. Create a Character Style, and name it whatever you wish – something that corresponds to the HTML code would be a good idea!
  2. Under “Basic Character Formats”, set the style as bold or italic.
  3. Under “Export Tagging”, type i, b, em, strong, or cite in the tag field, depending on what style you are creating.
  4. Repeat the process for all other character emphasis styles you know will be present in the book.

Then, when you export the book, those HTML elements will be correctly assigned to the corresponding text – but be sure to double check!

How to remediate in the code:

Use the correct element for marking up emphasis:

  • <em> and <strong> should be used for vocal emphasis. <em> is used for vocal stress, like “I never said she stole the money”, and <strong> is used to indicate content that is of “strong importance,” including things of great seriousness or urgency, like “Be warned: this book may change your life.”
  • <i> and <b> should be used for keywords, thoughts, etc., (like: “I wonder where he’s going, she thought.” or “The oxalis triangularis is an interesting plant.” or “The Titanic set sail in 1912.”).
  • <cite> should be used for titles of works (like “The article appeared in The New York Times.”).
  • CSS should be used for styling that is purely visual (like “This book is dedicated to all my loved ones.” or “Figure 1. A graph demonstrating growth over time.”).

Take a look at our Emphasis Options Overview page for more examples.

Connection to WCAG or EPUB Accessibility Specification, if applicable:

WCAG 1.3.1 Information and Relationships, Level A. Note that correct formatting of emphasis is a best practice, not a requirement for EPUB accessibility, but it is something to start learning about!


Lists

Issue Description:

Lists commonly occur in front and back matter, such as in a table of contents or index. We frequently see them marked up as paragraphs, instead of using list formatting.

Accessibility impact:

When lists are not marked up properly, assistive technology is not able to indicate the presence of the list. Indentation using CSS is often used to visually indicate hierarchy, but this is ignored by assistive technology. Assistive technology has features that allow users to skip to the beginning or end of a list, or move to the next or previous list item; these features are only available for properly marked-up lists.

How to check:

Open up the book in a code-viewing/editing software that has a preview pane like Sigil or Calibre. Navigate to a part of the book that has a list, and take a look at the code. Is it a series of <p> elements? Or does it use <ol> / <ul> and <li>? If it is a series of paragraphs (<p>), you may want to do some editing.
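
Lists that were exported as paragraphs often give themselves away because each paragraph starts with a number or a bullet character. The Python sketch below looks for that pattern; list_like_paragraphs is a hypothetical name, and the set of bullet characters it recognizes is an assumption you may need to extend.

```python
import re

# A paragraph of plain text that starts with "1.", "2)", or a bullet-like character.
LIST_LIKE = re.compile(r"<p\b[^>]*>\s*((?:\d+[.)]|[•*-])\s+[^<]*)</p>")


def list_like_paragraphs(xhtml_text):
    """Return the text of paragraphs that look like list items."""
    return [text.strip() for text in LIST_LIKE.findall(xhtml_text)]
```

Paragraphs that contain inline markup, or lists with no visible marker at all, will not be caught, so treat the result as a starting point.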

How to address in InDesign:

If you have created list styles in InDesign, then you can ensure they are exported correctly, as lists, when you export the book. In the export dialog, there is a tab called “Text”. In this tab, you can tell InDesign to map Bulleted lists to “Unordered Lists”, and map Numbered lists to “Ordered Lists”. Be sure to check in the code to make sure it worked correctly! But, as described below, if you want any complex styling or formatting, you will likely want to work with the CSS.

How to remediate in the code:

Mark up lists using list elements: <ol> for ordered lists, like a table of contents or the steps in a recipe, and <ul> for unordered lists, like lists of options or grocery lists.

We also recommend, depending on the content, using labelled sections to break up lists in back matter, such as indexes which divide the entries into alphabetical groups. The DAISY Knowledge Base has separate pages for bibliographies, glossaries, and indexes with examples of how the divisions can be semantically marked up to make them available to non-visual users. These divisions can be especially helpful if there are a lot of entries.

If you want to use your own visual styling, you can use CSS.

Removing Default Formatting:

NOTE: If you wish to remove default formatting, and use list-style-type:none; or list-style:none; in the CSS (as detailed below), keep this in mind: some systems will not recognize an ordered or unordered list with a list-style or list-style-type value of none. That said, there is a workaround: include the ARIA role="list" in the list element.

If you are including an <ol>, the default behaviour of the HTML is to add numbers, like:

  1. Step one
  2. Step two
  3. Step three

But, you may already have the numbers (or letters, or roman numerals, etc.) in the text, so you need to add something to the CSS, or else you’ll end up with:

  1. 1. Step one
  2. 2. Step two
  3. 3. Step three

To avoid this, add this to the CSS:

ol.noNumbers {
list-style-type: none;
}

Then, in the HTML, you would have:

<ol class="noNumbers" role="list">
<li>1. Step one</li>
<li>2. Step two</li>
<li>3. Step three</li>
</ol>
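
If you are converting many paragraph-based lists, generating this markup from the item text saves typing. The following is a minimal Python sketch; build_list is a hypothetical helper, and the class and role values are simply passed through as given.

```python
from html import escape


def build_list(items, ordered=True, css_class=None, role=None):
    """Build <ol> or <ul> markup from a sequence of plain-text items."""
    tag = "ol" if ordered else "ul"
    attrs = ""
    if css_class:
        attrs += f' class="{escape(css_class)}"'
    if role:
        attrs += f' role="{escape(role)}"'
    lines = [f"<{tag}{attrs}>"]
    # Escape &, < and > so the item text cannot break the markup.
    lines += [f"<li>{escape(item, quote=False)}</li>" for item in items]
    lines.append(f"</{tag}>")
    return "\n".join(lines)
```

Calling build_list(["1. Step one", "2. Step two", "3. Step three"], css_class="noNumbers", role="list") produces the same markup as the example above.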

The same code, list-style-type: none; can be used for unordered lists as well to remove visual bullets.

Substituting an image for default bullets:

If you want to substitute an image for the bullet, you can use the following code:

ul.specialBullet {
list-style-image: url("images/my-bullet.png");
}

Then, in the HTML, you would have:

<ul class="specialBullet">
<li>Books</li>
<li>Magazines</li>
<li>Movies</li>
</ul>

Note that you will need to change the URL to direct systems to your image.

Connection to WCAG or EPUB Accessibility Specification, if applicable:

WCAG 1.3.1 Information and Relationships, Level A. Note that coding all lists as HTML lists is not required – but it is definitely a best practice.


Heading Hierarchy

Issue Description:

Headings represent a key way for readers who use assistive technology to get a feel for how the book is laid out, know when a new section begins, and navigate efficiently between sections. We frequently see a few different problems with headings and heading hierarchy, including:

  • heading elements are not used (which means that the content will not be recognized as a heading) – we most frequently see some code like <p class="heading1Title">;
  • heading elements are not used in order – levels are skipped (which will be confusing to machines and assistive technology);
  • two separate heading tags are used when a heading breaks over two (or more) lines;
  • a single heading with a chapter number and title that is marked up as two headings, and that treats the chapter title as a subsection; and
  • in a book which has parts and chapters, headings for both parts and chapters are marked up as <h1> – heading level one.

Accessibility impact:

If headings are not marked up properly, it is more difficult for assistive technology users to understand the structure of a document. This becomes especially hard when there is more than one level of hierarchy, such as subsections within a chapter. Many screen readers and reading systems allow users to move through the headings in a document, which allows users to easily move to a section of interest. If headings are not marked up correctly, the user’s experience can be a lot more confusing or annoying than it needs to be!

How to check:

One great way to check your headings is to run the book through the Ace by DAISY app and take a look at the Outlines tab > Headings outline. This will show you all of the HTML-coded headings that are in the book, and what level they are (h1, h2, h3, etc.).

  • If heading elements were not used, there won’t be much there, other than perhaps the Table of Contents.
  • If heading elements were not used in order, it will be apparent in the Headings outline – you may see some red-alert text that lets you know if a heading level has been skipped, or you may see that h1 was not used at all, and everything in the book is either h2 or h3.
  • If a single heading breaks over two (or more) lines, it will appear in the Ace by DAISY Headings outline like:
    • [h1] Chapter One
    • [h1] An Unexpected Party
  • If a single heading treats the second part as a sub-section, it will look like:
    • [h1] Chapter One
      • [h2] An Unexpected Party
  • And finally, if a book treats both parts and chapters as level one headings, it will look like:
    • [h1] Part One
    • [h1] Chapter One: Playing Pilgrims
    • [h1] Chapter Two: A Merry Christmas
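
If you want a similar outline without leaving your code editor, the idea can be approximated in a few lines of Python. This sketch is not how Ace by DAISY works internally; the names HeadingCollector, heading_outline, and skipped_levels are hypothetical, and it assumes you pass in the XHTML of the content documents joined together in reading order.

```python
import re
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) for every h1 to h6 element."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self._level = int(tag[1])
            self._text = []
        elif tag == "br" and self._level is not None:
            self._text.append(" ")  # a line break inside a heading reads as a space

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            text = " ".join("".join(self._text).split())
            self.outline.append((self._level, text))
            self._level = None


def heading_outline(xhtml_text):
    parser = HeadingCollector()
    parser.feed(xhtml_text)
    parser.close()
    return parser.outline


def skipped_levels(outline):
    """Return headings that sit more than one level below the heading before them."""
    problems = []
    previous = 0
    for level, text in outline:
        if level > previous + 1:
            problems.append((level, text))
        previous = level
    return problems
```

Because the starting level is treated as zero, a book whose first heading is an h2 is also reported.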

How to address in InDesign:

When creating paragraph styles for headings in InDesign, make sure to go into the Paragraph Style Options dialog, navigate to the Export Tagging tab, and set the tag to the appropriate heading level. If your headings are on a single line, this will be all you need to do.

If a single heading breaks over two (or more) lines:

If your heading breaks over two lines, like:

Chapter One

Title of the Chapter

Then you will need to do a little more work.

  1. First, create a Paragraph Style that is tagged with the appropriate heading level (maybe called “chapterHeading” or something similar).
  2. Type the text of the heading, with a space after the first line, followed by a soft return (usually accomplished by hitting Shift + Enter) before the second line.
  3. Then, apply the “chapterHeading” Paragraph Style to the whole paragraph, which should be the full heading.
  4. Then, if you want one line to have a different visual appearance, create a Character Style with the qualities you desire. Select the text you want to have this style, and apply it.

When you export the EPUB, you should see code that looks like:

<h1 id="_idParaDest-1" class="heading-1">CHAPTER I <br /><span class="chapTitle">Chapter Title</span></h1>

For a great demonstration, watch Laura Brady’s video in her InDesign course: 1.2 Basic Styles.

How to remediate in the code:

Note that in all of the following solutions, CSS will likely need to be updated in order to ensure that the desired visual appearance is maintained.

If heading elements were not used:

Add HTML headings (<h1> for top level headings, <h2> for the next level, etc.).

If heading elements are out of order:

Change the HTML headings to match the correct structure of the book. Make sure no heading levels are skipped, and that the book does not start with an <h2> heading (which we see sometimes).

If a single heading breaks over two (or more) lines:

A single heading can break over two (or more) lines for a number of different reasons. Here we’ll share each potential scenario, and explain how to correctly format it.

Scenario 1:

The chapter number and title are on two separate lines, each within their own heading tags, and they have the same appearance/styling like:

<h1 class="heading1">Chapter One</h1>
<h1 class="heading1">The Unexpected Party</h1>

Scenario 1 Solution:

Use <br/> between the lines, and wrap the whole thing in a single set of heading tags, like:

<h1 class="heading1">Chapter One<br/>
The Unexpected Party</h1>

Scenario 2:

The chapter number and title are on two separate lines, each within their own heading tags, and they have different appearance/styling like:

<h1 class="chapterNumber">Chapter One</h1>
<h1 class="chapterTitle">The Unexpected Party</h1>

Scenario 2 Solution:

Ensure, in the CSS, that the styles can be applied to <span> elements. Then, correct the code so that the HTML looks like:

<h1><span class="chapterNumber">Chapter One</span><br/>
<span class="chapterTitle">The Unexpected Party</span></h1>
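
Scenarios 1 and 2 follow a regular enough pattern that the merge can be scripted. The sketch below is one possible approach in Python, not a finished tool: merge_heading_pair is a hypothetical name, it only handles two adjacent headings of the same level, and it assumes the headings contain plain text with no inline markup. The CSS still needs to be updated separately.

```python
import re

# Two adjacent headings of the same level, each containing plain text only.
PAIR = re.compile(r"<(h[1-6])\b([^>]*)>([^<]*)</\1>\s*<\1\b([^>]*)>([^<]*)</\1>")


def merge_heading_pair(xhtml_text):
    """Merge adjacent same-level headings into a single heading."""
    def merge(match):
        tag, attrs1, text1, attrs2, text2 = match.groups()
        if attrs1 == attrs2:
            # Same styling: keep the attributes on the heading and add a line break.
            return f"<{tag}{attrs1}>{text1}<br/>\n{text2}</{tag}>"
        # Different styling: move each set of attributes onto its own <span>.
        return (f"<{tag}><span{attrs1}>{text1}</span><br/>\n"
                f"<span{attrs2}>{text2}</span></{tag}>")

    return PAIR.sub(merge, xhtml_text)
```

Review every change it makes: two headings that sit next to each other are not always two halves of one title.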

Scenario 3:

The chapter name is on more than one line, but is part of a single phrase/name, like:

<h1 class="chapterTitle">The</h1>
<h1 class="chapterTitle">Unexpected</h1>
<h1 class="chapterTitle">Party</h1>

Scenario 3 Solution:

Use CSS to create a style which allows text to break over two lines. Add this to the CSS (and remember, you can change the class name to whatever you want, instead of “break” as we have it here):

span.break {
display: block;
}

Then, update the HTML to look something like:

<h1 class="chapterTitle">The <span class="break">Unexpected</span> <span class="break">Party</span></h1>

If a single heading treats the second part as a sub-section:

Same as the above examples, in “If a single heading breaks over two lines”; any scenario may apply, depending on how they are styled. Just make sure that a single heading element is used, and that you choose the correct heading level to correctly reflect the hierarchy of the book.

If a book treats both parts and chapters as level one headings:

Top level headings like front matter and back matter sections (table of contents, list of figures, index, bibliography, about the author, etc.) will pretty much always be <h1> (heading level one), unless they are a subsection of something. If a book has parts, chapters, and subsections, these should be <h1>, <h2>, and <h3>, respectively; if there is another level of subsections, these should be <h4>, etc. If a book just has chapters (i.e., no parts), then the chapters will be <h1>, and any subsections will begin at <h2>.

Connection to WCAG or EPUB Accessibility Specification, if applicable:

WCAG 1.3.1 Information and Relationships, Level A. Correct heading structure is imperative to accessibility.


Capitalization for Vocal Emphasis

Issue description:

Sometimes, we see words in all capital letters used to indicate importance or vocal emphasis, like “stop” in the following example sentence: “Please, STOP doing that!”.

Accessibility impact:

When a word is in all caps for emphasis, this will not be announced by a screen reader. When the emphasized word impacts the meaning of a sentence, this can be a problem – think of the difference between “I never said SHE stole the money” vs. “I never said she stole the MONEY”. The emphasis is important, and the reader will not be exposed to that distinction.

There is an additional potential impact to accessibility: very occasionally, depending on the reading system and software of the user, a screen reader may read out all-caps words letter-by-letter. This is rare, but can happen.

How to check:

Take a look through the book in code-viewing/editing software like Sigil, Calibre, BBEdit, or whatever you are comfortable with, looking for capitalized words used as emphasis. Look in the HTML code for the capitalized words.
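
Searching for capitalized words by eye is slow, so a script can produce a list of candidates. This Python sketch uses the standard html.parser module; AllCapsFinder and find_unmarked_all_caps are hypothetical names, and "all caps" is taken to mean two or more consecutive capital letters from A to Z. Acronyms will show up too, so each result needs a human decision.

```python
import re
from html.parser import HTMLParser

ALL_CAPS = re.compile(r"\b[A-Z]{2,}\b")


class AllCapsFinder(HTMLParser):
    """Collect all-caps words that are not inside <em> or <strong>."""

    EMPHASIS = {"em", "strong"}

    def __init__(self):
        super().__init__()
        self.words = []
        self._emphasis_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.EMPHASIS:
            self._emphasis_depth += 1

    def handle_endtag(self, tag):
        if tag in self.EMPHASIS and self._emphasis_depth:
            self._emphasis_depth -= 1

    def handle_data(self, data):
        # Only report words that have no emphasis markup around them.
        if self._emphasis_depth == 0:
            self.words.extend(ALL_CAPS.findall(data))


def find_unmarked_all_caps(xhtml_text):
    finder = AllCapsFinder()
    finder.feed(xhtml_text)
    finder.close()
    return finder.words
```

Words already wrapped in <em> or <strong> are left out of the list, since they need no further work.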

How to address in InDesign:

When words in all capital letters are included in the text, be sure to also style them as <strong> or <i>, depending on the usage. This way, if the word is read as a word, screen reader users will be alerted to the semantic meaning.

How to remediate in the code:

Use one of the standard HTML options for vocal emphasis; you can leave the word in capitals and simply wrap it in <em>. If the emphasized text warns of danger, you can use <strong> instead of <em>.

Note: support for these formatting tags (<em>, <strong>, etc.) is not universal, and generally needs to be enabled by the user if it is available; however, they are currently the best option for marking up emphasis, and using them is the best way to future-proof your EPUB!

Connection to WCAG or EPUB Accessibility Specification, if applicable:

WCAG 1.3.1 Information and Relationships, Level A.
