Experimental Lex

Playing with words.

Page Layout Hacks in HTML5 – Part 2

Let’s keep going on this topic. The first post laid out some of the challenges we faced in using HTML5 as the display format for an iPad app. Essentially, that also means it was used as the source format, where the text copy was mixed in with the code. That has to be some kind of programming anti-pattern, where the view logic is mixed with the business logic. In this case, the business logic is defined by the user experience requirements, which call for horizontal page flipping with multi-column text layout and fully-justified text.

As we continue this exploration, it becomes clearer that the external development team did not understand the modern CSS technologies that are now available for handling these kinds of display requirements. All things considered, this area of HTML and CSS is somewhat obscure… really an “edge-case” for most of the web development world. Almost all prior web design and development has relied on arbitrary pagination of vertically scrollable pages instead of pagination based on a fixed page/column size. Without devices that impose specific pagination constraints, these programming techniques were mostly unknown and virtually irrelevant.

While we continue to research the so-called “right way” of structuring HTML5 content to support text reflow between columns and pages, we should still look at the hacks that were necessary. Hacks seem to be a constant in any programming endeavor.

Aesthetics of Magazine-Style Publishing

In newspaper and magazine publishing, multi-column article layout is the norm and full justification of text is pretty standard. The narrower columns are meant to be easier to read, since the readers’ eyes have to do less work while scanning words from one margin to the other continuously. This also improves the readability of the content, since it becomes less likely for the reader to get lost in the middle of a long sentence.

The full justification of text forces each line of paragraph text to use the full width of the column (with the exception of the last line), and this gives the content a cleaner look. Without the full justification, the text is (usually) left-justified and the other margin has a “ragged” edge.

This improved readability creates additional challenges. Often, there are places in the text where long words close together get word-wrapped in an inconvenient place which results in excessive spacing between words. With the narrower columns of multi-column layout, this problem is compounded.

In addition, the usual concerns of widows and orphans need to be addressed. When paragraphs are split across pages and columns, there are often problems where a single line from the beginning or end of a paragraph is stranded by itself at the top or bottom of a page/column.
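On the web side, CSS has properties aimed at these same print concerns. The following is a minimal sketch and not what our project used: the class name article-copy is made up for illustration, and support for hyphens, widows and orphans varies between rendering engines, so treat it as something to test rather than a guaranteed fix.

```html
<style>
  /* Hypothetical class name, for illustration only. */
  .article-copy p {
    text-align: justify; /* full justification; the last line is not stretched */
    hyphens: auto;       /* allow hyphenation to reduce large gaps between words */
    orphans: 2;          /* ask for at least 2 lines at the bottom of a column/page */
    widows: 2;           /* ask for at least 2 lines at the top of the next one */
  }
</style>

<!-- hyphens: auto relies on the lang attribute to choose hyphenation rules -->
<div class="article-copy" lang="en">
  <p>Gregor’s glance then turned to the window.</p>
</div>
```

The widows and orphans values only matter where the content is actually broken across columns or pages; in a single scrolling column they have nothing to act on.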

Web-Based Reading

On web pages, article content is usually presented in a single column and readers are expected to scroll down the page to continue reading. In addition, the paragraph width for an article is usually narrow, which is meant to improve the readability. This has the side effect of making the article longer in terms of vertical space. A long article on a website might have pagination links, but there is no obligation for the page content to fit into a single viewable page area. Hence, vertical scrolling is usually a requirement in web-based reading.

E-Book Reading

For digital book publishing, page layout is less of a concern since paragraph text is meant to reflow smoothly regardless of screen size. Yet, e-book readers like the Kindle and iBooks are designed to simulate the book reading experience by paginating content into non-scrollable pages. The horizontal page-flipping experience is also part of most e-book readers, which is important to book-lovers and purists.

Tablet-Based Reading

The coming revolution in digital publishing is being driven by tablet-based devices like the iPad. For publishers who want to provide a magazine-style look and feel, it may be necessary to support a multi-column layout, often with full justification of text. In the remainder of this article, I will share some of the tricks and hacks that might come in handy if you are forced down this perilous path.

Code Hacks

Force Justifying the Last Line

With fully-justified text, the last line at the bottom of a column or page is the one you have to worry about. In order to force-justify a line, you need to employ a trick that makes the paragraph think there is more text that continues on the next line. In the example below, we inserted a <span> tag with the style visibility:hidden, and within the tag we have a long series of characters. Together, this simulates a long word that gets word-wrapped to the next line, yet is invisible.

<p>One morning, as Gregor Samsa was waking up from anxious dreams, he discovered that in his bed he had been changed into a monstrous verminous bug. He lay on his armour-hard back and saw, as he lifted his head up a little, <span style="visibility: hidden">aaaaaaaaaaaaaaa</span></p>
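If the rendering engine you are targeting supports it, the CSS text-align-last property may achieve the same effect without the invisible filler text. This is a sketch under that assumption: we did not use it in the project, the class name is hypothetical, and support should be verified in the WebView you deploy to.

```html
<style>
  /* Hypothetical class for a paragraph fragment that continues
     in the next column or on the next page. */
  p.continues {
    text-align: justify;
    text-align-last: justify; /* stretch the final line of this block too */
  }
</style>

<p class="continues">One morning, as Gregor Samsa was waking up from anxious dreams, he discovered that in his bed he had been changed into a monstrous verminous bug. He lay on his armour-hard back and saw, as he lifted his head up a little,</p>
```

The class would only go on the fragments that are cut off by a column or page break, so that the true last line of each paragraph keeps its normal alignment.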

Tweaking Word Spacing

Sometimes, paragraph text will split across pages or columns in an unattractive way. A common example is an orphan line that is stranded in the next page or column. One way to handle this is to increase or decrease the number of pixels between each word by setting the word-spacing CSS style. This will often help you get the text reflow you want.

<p style="word-spacing:2px;">Gregor’s glance then turned to the window. The dreary weather—the rain drops were falling audibly down on the metal window ledge—made him quite melancholy.</p>

Note: the word-spacing style only accepts whole numbers when rendered by Safari/WebKit.

Squeezing Words Together

In some cases, a line is word-wrapped even though it looks like there is almost enough space to fit the next word on the same line. One trick you can apply is to use the HTML thin space entity (&thinsp;), which is narrower than a standard space. I was surprised to find out about it, and it’s very useful in this kind of situation.

for the contact felt like a cold shower all over&thinsp;him.

Keeping Words Together

On occasion, you may prefer to keep two words together, especially when a paragraph is split between pages. This can be achieved with the HTML non-breaking space (&nbsp;). However, the &nbsp; and &thinsp; entities both have a fixed width and will not stretch out in fully-justified paragraphs, which can make the word spacing look odd.
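For example, in this illustrative fragment (using the same sample text as above), the non-breaking space keeps the name together, so “Gregor” cannot be left behind at the end of a line or page while “Samsa” wraps to the next:

```html
<p>One morning, as Gregor&nbsp;Samsa was waking up from anxious dreams, he discovered that in his bed he had been changed into a monstrous verminous bug.</p>
```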

Superscript and Subscript Handling

Perfect and precise line height is important from a visual perspective and to ensure that paragraph height is consistent. When you have a two-column layout, it is important that the lines of text in each column are lined up precisely.

So, it is surprising to find out that the CSS line-height style can easily get broken by superscript or subscript text in a paragraph. If you have footnote markers, this will be a common problem. The fix for this is to modify the CSS for the <sup> or <sub> tags. Here’s an example:

sup {
  vertical-align: baseline;
  position: relative;
  bottom: 8px;
}

Page Layout Hacks in HTML5

I have been meaning to share some of my experiences and solutions with advanced page layout using HTML5. This is a tough topic for a number of reasons. First, using HTML5 as an editable source format is a very imperfect practice. Really, it feels like a bunch of hacks you have to perform to make it … stick. Second, I have a feeling that the way we did it this time is not the way we will do it next time. I’m thinking we did not truly use the right CSS and HTML5 solutions available to us. So, I am hoping that we will also uncover more elegant hacks along the way.

So, let’s get started. We will start by discussing the underlying challenges based on the design requirements of the publication.

Tools and Workflow Overview
Like so many web designs, our digital publication was designed in Adobe Photoshop. The ideal magazine-style experience was mocked up to look like pages you might find in GQ or New York Magazine. First, we had the usual Latin text to serve as placeholder for the copy, and later the copy deck was inserted into the PSD one page at a time.

Within the PSD, separate layers were defined for each page and the content for each page was defined as a set of Photoshop text and image containers. You could switch between “pages” by turning off visibility of one layer and turning on visibility in another. It’s an age-old Photoshop trick for creating pages of content using one background and set of styles.

Oh, did I mention that the copy deck consists of a bunch of MS Word files floating through the dark matter of our corporate e-mail system? As the copy for each article is updated, there is a massive challenge in getting copy updates into the PSDs, which became the de facto “master copy” of the content under development.

Yes, I did mention that this is an imperfect system.

Choosing the Right Tools and Workflow
I know it might make more sense to use Adobe InDesign as the basis of our page layout and content editing. However, InDesign is not really part of the standard workflow of a web design and development team. Yet. Moreover, the end product is an iPad app that renders HTML5 content with a magazine-style design and there is no automatic workflow to move InDesign pages to a custom HTML5 display solution. Perhaps someday.

Two-Column Page Layout and Paragraph Splitting
The core design called for a traditional page-flipping user experience. The desire was to provide the obviousness of flipping pages in iBooks, yet provide rich layouts with images in a multi-column layout. In our publication design, paragraph text is presented in a two-column layout with full text justification throughout. With two-column or any multi-column layout, text reflow from one column to the next and from one page to the next is a challenge. Even in a single column layout, you need to account for paragraphs that are split between pages.

With word processing and page layout programs, text reflow is a natural part of the software. Pagination is also handled smoothly since the program is always aware of the amount of space consumed by the text based on a multitude of variables, including: font styles and sizes, horizontal spacing between characters and words, vertical spacing between lines and paragraphs, etc.

With HTML, a paragraph is represented by content wrapped in a <p> element, which consists of an opening <p> tag and closing </p> tag. Everything in between is treated as a single paragraph. Even though it is sometimes called the “paragraph” tag, the <p> tag does not provide a way of reflowing between columns or pages. And so, a paragraph that breaks across different columns or different pages must be split into separate <p> elements.
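For comparison, CSS multi-column layout lets a single container flow its paragraphs across columns without any manual splitting. The sketch below is a hedged illustration rather than what our app did: the class name and dimensions are invented, it only covers columns inside one fixed-height box (not flipping between pages), and older WebKit builds may need the -webkit- prefixed forms of these properties.

```html
<style>
  /* Hypothetical page box; the dimensions are placeholders. */
  .page-copy {
    height: 600px;      /* fixed height, so text must move on to the next column */
    column-count: 2;
    column-gap: 24px;
    column-fill: auto;  /* fill column 1 completely before starting column 2 */
    text-align: justify;
  }
</style>

<div class="page-copy">
  <!-- Each paragraph stays a single <p>; the browser decides where it breaks. -->
  <p>One morning, as Gregor Samsa was waking up from anxious dreams, he discovered that in his bed he had been changed into a monstrous verminous bug.</p>
  <p>Gregor’s glance then turned to the window.</p>
</div>
```

Because a paragraph that straddles the column break is still one <p> element, its last line before the break is not treated as the end of a paragraph, which is the situation the manual splitting approach struggles with.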

Problems with Broken Copy
When splitting copy across paragraphs, the biggest problem is finding out where paragraphs are split and moving copy between different paragraph blocks. As copy gets updated or when formatting is changed, there is the possibility that the text reflow will result in a “domino effect,” where the developer must move chunks of text between paragraphs in different columns/pages. Inevitably, mistakes are made in this tedious task of moving text from one paragraph to another.

Problems with Full-Justification
Splitting paragraphs becomes a particular problem for fully-justified text, where the left and right margins of each paragraph are perfectly aligned because the words and characters are spread out to fill each line. Naturally, the last line of a fully-justified paragraph is exempt from this formatting rule. And since we are forced to split paragraphs between columns and pages, the last line of a column or page can look horribly wrong when it falsely assumes itself to be the last line of a paragraph.

Problems with Column Splitting
Our problem with splitting paragraphs is further compounded by page layouts that allow for banner images that span across columns. In the sample diagram below, you can see the different regions identified for column 1 and column 2, top and bottom. The column 1 paragraph text at the top (col1 top) needs to reflow to the bottom (col1 bottom). Again, we have the problem with paragraph splitting and the treatment of fully-justified text.


From the viewpoint of HTML programming, this is further complicated by the fact that HTML content is structured left-to-right, then top-to-bottom. In the diagram above, note how the two “top” areas are surrounded by yellow background. Even though it seems unnatural, the text copy for “col1 top” and “col2 top” is wrapped in the same horizontal <div> container. The same goes for the two areas of text at the bottom. This adds to the difficulty of proofing and fixing text copy.
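As a rough sketch of that structure, the markup ends up ordered by row instead of by reading order. The class names and the image file name are hypothetical (not the ones from our project), and the sketch assumes the banner image sits between the top and bottom regions.

```html
<!-- Source order: col1 top, col2 top, col1 bottom, col2 bottom.
     Reading order: col1 top, col1 bottom, col2 top, col2 bottom. -->
<div class="row top">
  <div class="col1"><p>col1 top: the article text starts here</p></div>
  <div class="col2"><p>col2 top: continues from col1 bottom</p></div>
</div>

<img class="banner" src="banner.jpg" alt="Banner image spanning both columns">

<div class="row bottom">
  <div class="col1"><p>col1 bottom: continues from col1 top</p></div>
  <div class="col2"><p>col2 bottom: continues from col2 top</p></div>
</div>
```

Anyone proofing the copy has to jump back and forth between the two row containers to follow a single column of text.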

We are just getting started with this exploration of the many challenges of managing text and page layout in a complex design. It may already seem that choosing HTML5 as the display format is a horrible mistake. Yet, I will go ahead and make two observations:

  1. We are hoping to discover a better approach to managing text copy and page layout as we dig deeper. In our initial development effort, the bulk of the HTML5 content development was handled by an external team. Some of the technical decisions that were made deserve some re-evaluation.
  2. We need a better content workflow and integration model for text copy that will provide better quality. Is it possible for HTML5 to be smarter about content and page layout? We intend to find out through some hardcore experimentation.

Display Formats in Digital Publishing

This is an offshoot of the series “HTML5 in Digital Publishing”. In the first article, we tried to explain the significance of HTML5 and how it has become an important part of digital publishing and mobile devices. Today, we continue this analysis by addressing the rationale for choosing HTML5 versus other display formats.

In this article, we will focus on “display formats” in digital publishing. A display format defines how the content is rendered for display and viewed by the user. Therefore, a display format is also related to the technologies in the development platform that is used to publish content to the device. For example, the Apple iOS software development kit (SDK) is a development platform that targets a set of devices (iPhone, iPad, etc). With the iOS SDK, you have a choice of rendering content through HTML web views or native app components.

HTML5 as Display Format
The choice of HTML5 as a display format is easy to justify. Most tablet devices have strong support for HTML5 content views and this makes HTML5 a good platform-agnostic strategy. In the rapidly-evolving world of devices, we are seeing consistent and wide-spread support of HTML5, particularly through the WebKit browser engine. When you test complex HTML and CSS across different WebKit browsers, you usually see great consistency.

Images as Display Format
Image-based content display is a reasonable choice for some publishers. This option is especially suitable for photography and art where full-page image galleries are the desired experience. And yet, in a tablet device with a touch-based interface, an image gallery or slideshow can feel very flat and boring. When creating an image-based experience, it is a good idea to look for opportunities to add interactive features such as image pan and zoom, text layers, and visual navigation.

Using images as a display format also opens up the possibility of eliminating complex page layout issues by using images exported from programs like Adobe Photoshop and InDesign. By authoring complex text and image layouts in a graphics/design program and publishing images instead of text, you can guarantee absolute page fidelity when comparing the comps to the end product. However, the idea of publishing books without text might seem unsavory to some. It seems odd to remove the text from a book or magazine and only display the screenshots.

As you can guess, replacing text content with images would remove the possibility of searching and selecting text content, which is one of the promising features of digital e-readers.

Native as Display Format
We use this abstract term “native” when referring to content that is implemented through the programming language and tools required by the development platform. Among the current development platforms that have native programming languages are: Apple iOS, Android, Adobe Flash/AIR, and Windows Phone 7. “Native” also has the connotation of being expensive and proprietary, which is usually true. Native application programming requires specialized programmers and is often more time-consuming.

To make native app content more plausible as a display format in digital publishing, there are a few approaches you can take:

  1. Use a template-based system for loading data fields into content template(s):

    Native code will perform the task of reading data stored in a database or as structured data like XML and then injecting the data into a template. With native apps, a template is often a big chunk of code that places data on the screen as well as the layout, formatting and effects to apply when rendering the content.

    The drawback here is the same as any template-based publishing system… the content templates can feel too restrictive and the content look-and-feel may look stagnant and boring. The art directors will never go along with it.

  2. Mix-and-match with HTML and images:

    Each development platform has support for web view and image view components and it is possible to create a native app solution that uses both to enrich the experience. Since most HTML5 and image-based solutions still require a native app as the container around the web view components, a mixture of native and other display formats will usually be present at some level.

EPUB and Kindle
Perhaps it’s unfair to lump these two together since they are competing e-book formats. However, as display formats they are similar enough to group together. These e-book formats both use HTML as the core document storage format and both have a standard packaging structure that defines the organization of files within a file structure.

E-books also speak to a narrower audience within the digital publishing universe. Most e-books will only contain chapters and paragraphs of text, presented in a format similar to the paper-based books that they may eventually replace.

EPUB and Kindle formats are both interesting beasts and we can learn much from how they are constructed. We will analyze them both later in a separate set of articles.

HTML5 in Digital Publishing: Part 1

This is the first in a series of posts on the use of HTML5 as a content format in digital publishing. This will be an informal journal with no real plan as to the number of posts or the topics that will be covered beyond the current post. In this first post, we will provide an intro to HTML5 and why it is relevant to digital publishing.

Explain HTML5
We should start by explaining what HTML5 is. I am sure it is not adequate to say that HTML5 is just a newer version of HTML. In general, I assume the audience here is kind of technical, but not necessarily involved in web development. So, I will start by explaining the big picture. Bear with me. This exploration is not intended to be a boring roundup of technology history. There’s a story with real meaning here.

Since the beginning of the Internet, the primary way for interacting with the Web* was through a web browser. The content that makes up a web page is assembled in a text structure called HTML and delivered to a web browser. HTML is a hierarchical text structure that resembles XML, which means that it has named elements (or “tags”) with metadata attributes that define specific page layout and formatting details. The HTML text that is rendered by a web browser will often have references to images and other media, and the browser will also fetch and display that content.
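A minimal, made-up page illustrates this. Elements are nested inside one another, attributes such as src and alt carry the metadata, and the image is a separate file that the browser fetches on its own. The file name is a placeholder.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Sample Page</title>
  </head>
  <body>
    <h1>Sample Article</h1>
    <!-- The <p> element is nested inside <body>, which is nested inside <html>. -->
    <p>This paragraph is one element in the hierarchy.</p>
    <!-- The src attribute points to media that the browser fetches separately. -->
    <img src="photo.jpg" alt="A placeholder photo">
  </body>
</html>
```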

Altogether, that complex mass of tags and metadata is received by a web browser and translated for display on a computer screen for a person to view and interact with. In general, when we refer to “HTML”, we usually mean HTML4 and prior versions. With each new version of HTML, there are new features that are defined through new tags and attributes (usually with corresponding updates to CSS and Javascript). To support the new features, new web browsers are released and updated. This takes us back to “HTML5 is just a newer version of HTML”.

Just kidding. It’s much more than that.

HTML5 is Really About Mobile
With HTML5, we have a new and evolving world of Internet-connected devices that includes computers, televisions, and mobile devices. With mobile devices, especially smartphones and tablet devices, there is a driving need for alternate ways of viewing web content, due to the different content consumption habits of people when they are away from their computers and laptops. One major factor is the need for mobile devices to be able to display content for users who are not currently connected to the Internet or when mobile networks are too slow.

With the iPhone and the iPad, Apple redefined mobile content consumption by creating an app-centric universe of mobile apps. Instead of depending on the web browser and an Internet connection for content, apps are capable of delivering content and entertainment when the user is away from work/home or simply relaxing. With current and future generations of mobile devices, the web browser is no longer the primary means of interacting with the Internet.

And yet, the definition of a web browser has changed, or maybe it has lost its original meaning as a program that can display websites. Custom apps are also capable of displaying web content, either from remote websites or from content stored locally. In mobile applications development, there is the notion of a “web view” component, which is like an embedded web browser that can display HTML content without looking like a web browser (with windows and tabs and menus, etc). The end-user may see it as richly-formatted content, while the source content may in fact be HTML.

Summary: Why HTML5 is Relevant
To bring this long-winded story home, I will summarize what this all means:

  • The browser is now embedded and invisible: The “webview” component in mobile apps is an HTML5-capable browser engine, but it doesn’t look like a browser. Very often, it is the WebKit rendering engine underneath, and that’s a good thing. This means you can expect consistency in the display of HTML5 content.
  • The Web is now local: Webview components are often used to display content that is stored locally on the device (and often deployed in the downloadable app). As users and devices become more mobile, the Web will be there with or without an Internet connection.
  • HTML is still a good publishing format: E-book readers like the Apple iBooks app use the WebKit browser engine to read HTML files included inside an EPUB file. On top of that, they add an interactive Table of Contents, bookmarks, and thumbnail navigation to make the book experience more exciting. You can do the same and create your own custom reader to deliver the experience you want.

Bottom line: HTML is no longer limited to the traditional web browser-based experience. And yet, it still supports the traditional browser-based content model.

HTML5 Features
HTML5, as a language that defines a number of features, was developed as the Internet evolved toward mobile computing. Without going into the details of each feature, the overall enhancements in HTML5 can be described as follows:

  1. Portable: The portability of mobile devices also requires a web content model that is capable of operating without an Internet connection. To support this need, HTML5 provides additional features like database storage to allow HTML5 content to store and query data in a local database instead of a remote website.
  2. Media-Capable: Online video and audio in desktop web browsers almost always depend on the Adobe Flash plug-in. With mobile devices, Flash does not have the same pervasiveness due to performance constraints and licensing issues. One of the key goals of HTML5 is to provide built-in media players for video and audio content.
  3. Canvas Animation: Again, without the Flash plugin, there is a need to provide advanced animation capabilities. The HTML5 Canvas, with lots of help from Javascript, aims to provide this.
  4. Location-Aware: To provide location-based experiences in web content, HTML5 provides support for geolocation data for the current user location (if the user gives permission to share their geolocation info).
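To make the list above a little more concrete, here is a small sketch that touches each of the four areas. It is illustrative only: the file name and the stored key are placeholders, and localStorage is used as a simple stand-in for the local database storage mentioned in the first item.

```html
<!DOCTYPE html>
<html lang="en">
  <body>
    <!-- Media-Capable: built-in player, no plug-in required -->
    <video src="clip.mp4" controls width="320"></video>

    <!-- Canvas Animation: a drawing surface controlled from Javascript -->
    <canvas id="stage" width="320" height="100"></canvas>

    <script>
      // Canvas: draw one rectangle (an animation would redraw it in a loop).
      var ctx = document.getElementById("stage").getContext("2d");
      ctx.fillRect(10, 10, 80, 40);

      // Portable: keep data on the device so it is available offline.
      // localStorage is an assumption here, standing in for database storage.
      localStorage.setItem("lastPage", "12");

      // Location-Aware: the browser asks the user for permission first.
      if (navigator.geolocation) {
        navigator.geolocation.getCurrentPosition(function (pos) {
          console.log(pos.coords.latitude, pos.coords.longitude);
        });
      }
    </script>
  </body>
</html>
```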

NEXT: Choosing a Content Format for Digital Publishing
So far, we have only started to explain the role of HTML5 in our evolving world of Internet devices. Next time, we will need to address the rationale for choosing HTML5 and what the other options are. When you consider the alternatives, you might decide that HTML5 is the best approach. Let the smackdown begin.

Baker: Publish HTML5 to iPad

With every passing day, there is more innovation in digital publishing and it is mind-blowing. And increasingly, the innovations are being shared as open source projects. I first read about the Baker EBook Framework from the Mashable story published last week and it was another one of those jaw-dropping moments. I made plans to try it out and report on my findings.

HTML5 Publishing Workflow

Background: I’ve been involved in a proprietary digital publishing project that is using a similar architecture. When we were planning the architecture, we felt sure we were following the right path. The iPad magazine-style apps were either bloated slideshows with clever/weird navigation or they were full-blown native apps and not really books or magazines. Or they were just EPUB books.

We chose to build a publishing model that resembles the EPUB model in terms of how HTML5-rendered content is organized, but is more magazine-like. Magazines are full-bleed color experiences with rich layouts and images. It is such an obvious model (at first anyway) that I am not so surprised that others are following the same path.

The Baker Framework follows this model. If you can build the page as HTML5 and make it look beautiful in a WebKit browser (Safari/Chrome), you should be able to deploy it with perfect fidelity in an iPad app and other platforms that have a WebKit engine. That includes Android and Adobe AIR apps.

The 5-Minute Test Drive
On the bakerframework.com website, the home page shows you the 3 easy steps to publishing your content in an iPad app. I skipped to step 3, since it seems like the others are not necessary if you already have your HTML content. I should mention that the Baker Framework is an Xcode project, which means you need a recent-generation Mac that can run the iPhone SDK in Xcode.

Note: content development is the hard part. Good original content doesn’t just appear. It gets created through much effort and review. Keep that in mind.

Since things have been terribly busy lately, I only had a few minutes to try out Baker. This is partly because of the nonstop activity and innovation in digital publishing. I downloaded the framework, looked at the instructions for about 30 seconds and started to add my own HTML5 files and assets. I clicked on Build and Run in Xcode.

OMG. It freaking works. It’s a little strange to see a free, open-source solution that replicates the functionality of our internal and proprietary iPad publishing platform. The page fidelity is … uncanny. And yet…

The Reality of WebView Rendering
The WebView renderer, in generic terms, is the component in iOS or Android that can load HTML from a file or URL and display it. In a web browser, users expect a delay when a page is requested. The page-load psychology for web browsers is fairly tolerant of this reality.

However, the iPad/tablet computing generation is pretty used to the idea of immediate gratification. And rendering HTML on a mobile or tablet device does not feel immediate. As you swipe with your fingers to flip between pages, you experience a delay before the page content is displayed. I think it’s about 1.2 seconds even on the iPhone Simulator. I saw similar results for our custom app, but probably faster. Regardless, that’s not good enough for the impatient, attention-deficit world we live in.

Conclusion (for now)
Baker Framework is very cool, although it’s still early-stage. It may make it easy for you to get your HTML pages into an iPad app, but that’s not quite enough yet. In an upcoming article, we will discuss the hardcore realities of the HTML5-based content approach for publishing to the iPad and similar devices and platforms.