Experimental Lex

Playing with words.

Tag Archives: dpub

Lies, Damned Lies, and Mime Types

If you’ve ever had the joy of building EPUB files and struggled with the cryptic error messages produced by epubcheck, you might find this interesting. I was having a number of facepalm moments the other day while dealing with an image icon for a Twitter user. My new Twitter friend @longreads is giving me a steady stream of wonderful content to read. And yet, I was having an awful time trying to figure out why epubcheck was rejecting the longreads Twitter icon as having the wrong mime type.

Here is the url for the icon. It has a “.jpg” extension, which suggests that its mime type is “image/jpeg”. If you download the file and open it in an image viewer app like the Preview app for Mac, it can tell you that it is actually a PNG file. In my automated workflow, this little time bomb silently waits until the EPUB is created before it announces itself during epubcheck validation.

I was previously using the file extension to determine the mime type, and that was obviously not going to work in this world of deceptive file names. Moreover, there are image urls that do not have any file extension. So, I tried to get clever and check the “Content-Type” header in the response when downloading files. However, I even found that this was not always correct.

I did some research on existing open source tools (Java and Python-based) and it is surprising how many use the file extension as the main determinant. And the tools that actually read the file header were said to be buggy. So, it dawned on me later that I should just look into the epubcheck source code and find out how it was reading image types.

Here’s the source code for BitmapChecker.java in the epubcheck source code. As is, it is not designed to be used externally, so I created a copy and compiled it into a command-line tool like epubcheck. It is called MimeCheck and you can run it like this. It reads the file header and returns the mime type.

$ java -jar mimecheck-1.0.jar twitpic.jpg


That only works if the file is in the same directory as the mimecheck-1.0.jar file. You can also provide the absolute path and it works the same. Here’s a download link for the jar file if you are interested in using it.
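
For readers who only want the gist of header-based detection, here is a minimal sketch in Java. It is not the BitmapChecker code from epubcheck and not the MimeCheck source. The class name MimeSniffer and its methods are made up for illustration, and it only recognizes the well-known leading bytes of PNG, JPEG, and GIF files.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper (not the epubcheck BitmapChecker): guesses an image
// mime type from the first bytes of a file instead of trusting its name.
final class MimeSniffer {

    // Returns the mime type implied by the header bytes, or null if unknown.
    static String sniff(byte[] header) {
        if (startsWith(header, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)) {
            return "image/png";
        }
        if (startsWith(header, 0xFF, 0xD8, 0xFF)) {
            return "image/jpeg";
        }
        if (startsWith(header, 'G', 'I', 'F', '8')) {
            return "image/gif";
        }
        return null;
    }

    // Returns the mime type suggested by the file extension alone.
    static String fromExtension(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".png")) return "image/png";
        if (lower.endsWith(".jpg") || lower.endsWith(".jpeg")) return "image/jpeg";
        if (lower.endsWith(".gif")) return "image/gif";
        return null;
    }

    // True when the data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] header;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            header = in.readNBytes(8); // readNBytes(int) needs Java 11 or later
        }
        String actual = sniff(header);
        String claimed = fromExtension(args[0]);
        System.out.println(actual == null ? "unknown" : actual);
        if (actual != null && claimed != null && !actual.equals(claimed)) {
            System.err.println("Warning: extension suggests " + claimed);
        }
    }
}
```

Saved as MimeSniffer.java, this should run on Java 11 or later with "java MimeSniffer.java twitpic.jpg". Comparing the sniffed type against the extension is what would catch a PNG hiding behind a “.jpg” name before epubcheck does.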


Let me know if you have any questions or if you need the source code. I think I have a few other EPUB-related tools that I can contribute and perhaps turn this into an open source project on Google code. That depends on whether anyone else finds it useful. I realize that not many people need to do automated builds of EPUB files.

The Social Content Graph

Greetings and happy holidays! Apologies for not publishing anything lately. I feel physical pain every day that goes by when I am not writing anything. Possibly, my choice of long-form articles has hampered me from reaching that tipping point where writing becomes an easy and fluid task. So, we will try a new approach by writing lighter pieces. Many of these might look like “fluff” pieces based on current news, personal views, and speculation, yet still within the general universe of digital publishing and mobile content. We can worry about how to re-assemble this content into a book some other day. The only thing that matters is actually writing.

With that said, I would like to discuss my vision of the social content graph. This is not a new term and I have not researched prior usage of this term. However, I do feel it is a meaningful term that has relevance to digital publishing. When you look at the entire spectrum of companies, platforms, software, and content in the extended digital content ecosystem today, you will recognize that reading is becoming more social.

That may sound like Captain Obvious talking. Nonetheless, this is the essence of the social content graph. The term “social graph” is well known and widely used in reference to a person’s social network. A person with a large number of Twitter followers and Facebook friends has a large social graph and the content they share has a powerful “network effect”. Perhaps it’s just the technical and contemporary way of quantifying popularity.

So is there such a thing as a content graph? A quick search on Google shows that it was a significant term in 2010. For example, here’s an interesting quote from this article called The Content Graph and the Future of Brands:

In the Social Graph, you’re defined by your friends. In the Content Graph, a content brand is defined by its distribution relationships with other content brands.

Unfortunately, that’s not the Content Graph I am thinking about. Instead, I am trying to express how digital content is published, consumed, shared, aggregated, republished, and consumed again in the digital world today. Publishers and content creators seek to publish content that is original and popular. Content is given life by consumers who share the content with others. Until the content is consumed, shared and discussed, the content barely exists. (Call up metaphors like “tree falling in the woods” or “Waiting for Godot”.)

Consider the intertwining relationships between content creators, consumers, and companies like Twitter, Bit.ly, and Flipboard in our digital content ecosystem. Sharing content via Twitter is usually done with shortened URLs (generated through services like Bit.ly), which redirect users to the original URL. The importance of short URLs is a by-product of the 140-character limit that is built into Twitter. While this limit originates from the character limit of SMS messages, it also provides a universal rule that makes all messages short and easy to browse.

Anatomy of a Tweet

When browsing through Twitter messages, you see a microcosm of specialized syntax designed to pack as much content and meaning as possible within the 140-character limit. For those new to Twitter, it can be a daunting experience trying to grok the meaning. The most basic tweet is just text from a Twitter user. In addition, a tweet can have any of the following:

  1. link: usually a shortened URL (example: http://bit.ly/eOsrVQ)
  2. #hashtag: one or more topic tags that serve as searchable metadata
  3. @username: used for replies and mentions
  4. RT (retweet): a flag that signifies when one user has republished another user’s tweet
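
The four parts listed above can be picked out of a tweet with a few regular expressions. The following is a rough sketch in Java, not Twitter's own parsing rules, which are more involved. The class name TweetParts and the patterns are assumptions made for illustration. For example, a "#" inside a URL fragment would be miscounted as a hashtag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts links, hashtags, mentions, and the RT flag
// from tweet text using simplified patterns (not Twitter's official rules).
final class TweetParts {
    private static final Pattern LINK = Pattern.compile("https?://\\S+");
    private static final Pattern HASHTAG = Pattern.compile("(?<!\\w)#(\\w+)");
    private static final Pattern MENTION = Pattern.compile("(?<!\\w)@(\\w+)");
    private static final Pattern RETWEET = Pattern.compile("(?<!\\w)RT\\s+@\\w+");

    static List<String> links(String tweet) {
        return findAll(LINK, tweet, 0);
    }

    static List<String> hashtags(String tweet) {
        return findAll(HASHTAG, tweet, 1);
    }

    static List<String> mentions(String tweet) {
        return findAll(MENTION, tweet, 1);
    }

    // True when the text contains an "RT @username" marker.
    static boolean isRetweet(String tweet) {
        return RETWEET.matcher(tweet).find();
    }

    // Collects the requested capture group from every match, in order.
    private static List<String> findAll(Pattern pattern, String text, int group) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            found.add(m.group(group));
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up sample tweet for illustration only.
        String tweet = "RT @longreads: Great piece on #epub and #html5 http://bit.ly/eOsrVQ";
        System.out.println("links:    " + links(tweet));
        System.out.println("hashtags: " + hashtags(tweet));
        System.out.println("mentions: " + mentions(tweet));
        System.out.println("retweet:  " + isRetweet(tweet));
    }
}
```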

In addition, your Twitter feed contains not only the people you follow, but also the extended conversation taking place between people in your Twitter network and their network. Each “@” mention is a clickable link that takes you to a user’s Twitter page. Thus, it becomes another point of interest as you browse for interesting content. Yes, it seems overwhelming and yet it happens to be the best way of getting the latest and most interesting content. In the end, the Twitter messages that contain links are often the ones that are the most interesting, and the Twitter users who share the most interesting content are usually the ones worth following.

Curated Content

This brings us to curated content, which is content that is shared and republished by tastemakers and thought leaders within different areas of interest. In contrast, content that is found through organic search is not hand-picked and the quality of search results can vary greatly.

Flipboard is an iPad app that presents streams of curated content that the reader chooses, across a number of topics. Most notably, Twitter integration in Flipboard presents the links shared by people in your Twitter network in a pleasing user interface that resembles a magazine. Hence, Flipboard and other reading apps like it are an important part of our social content graph. Of course, the content you consume in Flipboard is easy to share with others through the usual channels (Twitter, Facebook, e-mail, etc).

Social Reading

When you have reading devices and apps that have social networking “baked in”, you have the beginnings of a social reading experience. Curated content that is recommended to you through your social network is an entry point to social reading. Social reading is also found at a deeper level, where readers can share bookmarks, comments, and quotes from the content they are reading within their social network or with everyone. Such social reading features are found in the Amazon Kindle reader and may have originated there. We are starting to see this kind of social reading and sharing in education reading apps like Inkling.

Mapping the Social Content Graph

I’m going to admit that my concept of the Social Content Graph is still half-baked, and I think that’s ok for now. The point I am trying to make is that the social content graph is a complex beast, since it is a chain of people and content links. The reason why Flipboard is such an excellent reading experience is that it understands that this is a complex beast and it tries to organize it for you in a way that makes it pleasing to browse and consume.

And yet, Flipboard is just a reading experience and does not help you understand and organize your social content graph. You still need to bookmark or republish the content you like if you want to be able to find and re-read content in the future. That feels a little weird to me… sharing by Twitter just because I don’t have a convenient way of mapping and saving the parts that I want to keep.

In my mind, I have this mental image of a social content graph somehow looking like that clever visualization that you see in a LinkedIn profile that shows how you are related to another person. It nicely illustrates the “degrees of separation” between you and others on LinkedIn. And the ideal visualization of my social content graph would be something like that. It would show me a 2D/3D spatial view of the people I follow and the content I like, and it would let me pivot the view along the people axis and the content axis. Someday, it would be nice to explore the reverse angle and see the people who follow me or like the content I have shared or created. Yeah, whenever that happens.
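
To make the “degrees of separation” idea a little more concrete, here is a small sketch in Java of how such a graph could be stored and queried. Everything in it is an assumption for illustration: the class name SocialContentGraph, the "person:" and "content:" node prefixes, and the choice to treat every link as undirected, even though following someone on Twitter is really a one-way relationship.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical model: people and content items are nodes in one graph,
// and follows/shares/likes are simplified to undirected links.
final class SocialContentGraph {
    private final Map<String, Set<String>> edges = new HashMap<>();

    // Adds an undirected link, e.g. person-follows-person or person-shared-content.
    void link(String a, String b) {
        edges.computeIfAbsent(a, k -> new LinkedHashSet<>()).add(b);
        edges.computeIfAbsent(b, k -> new LinkedHashSet<>()).add(a);
    }

    // Breadth-first search: number of hops between two nodes, or -1 if unconnected.
    int degreesOfSeparation(String from, String to) {
        if (from.equals(to)) return 0;
        Map<String, Integer> distance = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        distance.put(from, 0);
        queue.add(from);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            for (String next : edges.getOrDefault(node, Collections.<String>emptySet())) {
                if (distance.containsKey(next)) continue; // already visited
                int hops = distance.get(node) + 1;
                if (next.equals(to)) return hops;
                distance.put(next, hops);
                queue.add(next);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        SocialContentGraph graph = new SocialContentGraph();
        graph.link("person:me", "person:longreads");
        graph.link("person:longreads", "content:some-article");
        System.out.println(graph.degreesOfSeparation("person:me", "content:some-article"));
    }
}
```

Pivoting between the “people axis” and the “content axis” would then amount to filtering the same graph by node type.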

Page Layout Hacks in HTML5

I have been meaning to share some of my experiences and solutions with advanced page layout using HTML5. This is a tough topic for a number of reasons. First, using HTML5 as an editable source format is a very imperfect practice. Really, it feels like a bunch of hacks you have to perform to make it … stick. Second, I have a feeling that the way we did it this time is not the way we will do it next time. I’m thinking we did not truly use the right CSS and HTML5 solutions available to us. So, I am hoping that we will also uncover more elegant hacks along the way.

So, let’s get started. We will start by discussing the underlying challenges based on the design requirements of the publication.

Tools and Workflow Overview
Like so many web designs, our digital publication was designed in Adobe Photoshop. The ideal magazine-style experience was mocked up to look like pages you might find in GQ or New York Magazine. First, we had the usual Latin text to serve as placeholder for the copy, and later the copy deck was inserted into the PSD one page at a time.

Within the PSD, separate layers were defined for each page and the content for each page was defined as a set of Photoshop text and image containers. You could switch between “pages” by turning off visibility of one layer and turning on visibility in another. It’s an age-old Photoshop trick for creating pages of content using one background and set of styles.

Oh, did I mention that the copy deck consists of a bunch of MS Word files floating through the dark matter of our corporate e-mail system? As the copy for each article is updated, there is a massive challenge in getting copy updates into the PSDs, which became the de facto “master copy” of the content under development.

Yes, I did mention that this is an imperfect system.

Choosing the Right Tools and Workflow
I know it might make more sense to use Adobe InDesign as the basis of our page layout and content editing. However, InDesign is not really part of the standard workflow of a web design and development team. Yet. Moreover, the end product is an iPad app that renders HTML5 content with a magazine-style design and there is no automatic workflow to move InDesign pages to a custom HTML5 display solution. Perhaps someday.

Two-Column Page Layout and Paragraph Splitting
The core design called for a traditional page-flipping user experience. The desire was to provide the obviousness of flipping pages in iBooks, yet provide rich layouts with images in a multi-column layout. In our publication design, paragraph text is presented in a two-column layout with full text justification throughout. With two-column or any multi-column layout, text reflow from one column to the next and from one page to the next is a challenge. Even in a single column layout, you need to account for paragraphs that are split between pages.

With word processing and page layout programs, text reflow is a natural part of the software. Pagination is also handled smoothly since the program is always aware of the amount of space consumed by the text based on a multitude of variables, including: font styles and sizes, horizontal spacing between characters and words, vertical spacing between lines and paragraphs, etc.

With HTML, a paragraph is represented by content wrapped in a <p> element, which consists of an opening <p> tag and closing </p> tag. Everything in between is treated as a single paragraph. Even though it is sometimes called the “paragraph” tag, the <p> tag does not provide a way of reflowing between columns or pages. And so, a paragraph that breaks across different columns or different pages must be split into separate <p> elements.

Problems with Broken Copy
When splitting copy across paragraphs, the biggest problem is finding out where paragraphs are split and moving copy between different paragraph blocks. As copy gets updated or when formatting is changed, you have the possibility that the text reflow will result in a “domino effect” as the developer must move chunks of text between paragraphs in different columns/pages. Inevitably, there are mistakes made in this tedious task of moving text from one paragraph to another.

Problems with Full-Justification
Paragraph splitting becomes a particular problem for fully-justified text, where the left and right margins of each paragraph are perfectly aligned because the words and characters are spread out to fill each line. Naturally, the last line of a fully-justified paragraph is exempt from this formatting rule. And since we are forced to split paragraphs between columns and pages, the last line of a column or page can look horribly wrong when it falsely assumes itself to be the last line of a paragraph.
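
One possible way to handle the split is sketched below in Java. It cuts a paragraph at a given word count and emits two <p> elements, marking the first fragment with the CSS text-align-last property so that its final line is justified like the rest. This is an illustration, not the approach used on the project: the class name ParagraphSplitter and the split-head/split-tail class names are invented, the input is assumed to be plain text that is already HTML-escaped, and support for text-align-last varies by rendering engine, so it would need testing on the target device.

```java
import java.util.Arrays;

// Hypothetical helper: splits one paragraph of plain text into two <p>
// elements so that it can continue in the next column or on the next page.
final class ParagraphSplitter {

    // Returns {head, tail} markup; the head keeps the first wordsInHead words.
    static String[] split(String text, int wordsInHead) {
        String[] words = text.trim().split("\\s+");
        int cut = Math.max(0, Math.min(wordsInHead, words.length));
        String head = String.join(" ", Arrays.copyOfRange(words, 0, cut));
        String tail = String.join(" ", Arrays.copyOfRange(words, cut, words.length));
        return new String[] {
            // The head is not a real paragraph end, so its last line is justified too.
            "<p class=\"split-head\" style=\"text-align: justify; text-align-last: justify;\">"
                + head + "</p>",
            // The tail ends the real paragraph, so its last line stays ragged.
            "<p class=\"split-tail\" style=\"text-align: justify;\">" + tail + "</p>"
        };
    }

    public static void main(String[] args) {
        String[] parts = split("The quick brown fox jumps over the lazy dog", 5);
        System.out.println(parts[0]);
        System.out.println(parts[1]);
    }
}
```

The hard part remains finding the cut point, since that depends on how the text renders. This sketch simply takes it as a parameter.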

Problems with Column Splitting
Our problem with splitting paragraphs is further compounded by page layouts that allow for banner images that span across columns. In the sample diagram below, you can see the different regions identified for column 1 and column 2, top and bottom. The column 1 paragraph text at the top (col1 top) needs to reflow to the bottom (col1 bottom). Again, we have the problem with paragraph splitting and the treatment of fully-justified text.


From the viewpoint of HTML programming, this is further complicated by the fact that HTML content is structured left-to-right, then top-to-bottom. In the diagram above, note how the two “top” areas are surrounded by yellow background. Even though it seems unnatural, the text copy for “col1 top” and “col2 top” is wrapped in the same horizontal <div> container. Same thing for the two areas of text at the bottom. This adds to the difficulty of proofing and fixing text copy.
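
The mismatch between reading order and markup order can be shown with a short Java sketch. It takes the four text regions in the order a reader follows them (col1 top, col1 bottom, col2 top, col2 bottom) and emits them in the row-by-row order described above. The class name BannerPageLayout, the row/col/banner class names, and the placement of the banner between the two rows are assumptions based on the description, not the project's actual markup.

```java
// Hypothetical helper: renders a two-column page with a banner image that
// spans both columns, using one <div> per horizontal row.
final class BannerPageLayout {

    // Arguments arrive in reading order; the markup comes out in row order.
    static String render(String col1Top, String col1Bottom,
                         String col2Top, String col2Bottom, String bannerSrc) {
        StringBuilder html = new StringBuilder();
        html.append(row(col1Top, col2Top));
        html.append("<div class=\"banner\"><img src=\"")
            .append(bannerSrc)
            .append("\" alt=\"\"></div>\n");
        html.append(row(col1Bottom, col2Bottom));
        return html.toString();
    }

    // One horizontal container holding the left and right column text.
    private static String row(String left, String right) {
        return "<div class=\"row\">\n"
             + "  <div class=\"col\"><p>" + left + "</p></div>\n"
             + "  <div class=\"col\"><p>" + right + "</p></div>\n"
             + "</div>\n";
    }

    public static void main(String[] args) {
        System.out.print(render("first", "second", "third", "fourth", "banner.jpg"));
    }
}
```

In the output, "third" appears before "second", which is exactly what makes proofing the copy in the HTML source so awkward.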

We are just getting started with this exploration of the many challenges of managing text and page layout in a complex design. It may already seem that choosing HTML5 as the display format is a horrible mistake. Yet, I will go ahead and make two observations:

  1. We are hoping to discover a better approach to managing text copy and page layout as we dig deeper. In our initial development effort, the bulk of the HTML5 content development was handled by an external team. Some of the technical decisions that were made deserve some re-evaluation.
  2. We need a better content workflow and integration model for text copy that will provide better quality. Is it possible for HTML5 to be smarter about content and page layout? We intend to find out through some hardcore experimentation.

Display Formats in Digital Publishing

This is an offshoot of the series “HTML5 in Digital Publishing”. In the first article, we tried to explain the significance of HTML5 and how it has become an important part of digital publishing and mobile devices. Today, we continue this analysis by addressing the rationale for choosing HTML5 versus other display formats.

In this article, we will focus on “display formats” in digital publishing. A display format defines how the content is rendered for display and viewed by the user. Therefore, a display format is also related to the technologies in the development platform that is used to publish content to the device. For example, the Apple iOS software development kit (SDK) is a development platform that targets a set of devices (iPhone, iPad, etc). With the iOS SDK, you have a choice of rendering content through HTML web views or native app components.

HTML5 as Display Format
The choice of HTML5 as a display format is easy to justify. Most tablet devices have strong support for HTML5 content views and this makes HTML5 a good platform-agnostic strategy. In the rapidly-evolving world of devices, we are seeing consistent and wide-spread support of HTML5, particularly through the WebKit browser engine. When you test complex HTML and CSS across different WebKit browsers, you usually see great consistency.

Images as Display Format
Image-based content display is a reasonable choice for some publishers. This option is especially suitable for photography and art where full-page image galleries are the desired experience. And yet, in a tablet device with a touch-based interface, an image gallery or slideshow can feel very flat and boring. When creating an image-based experience, it is a good idea to look for opportunities to add interactive features such as image pan and zoom, text layers, and visual navigation.

Using images as a display format also opens up the possibility of eliminating complex page layout issues by using images exported from programs like Adobe Photoshop and InDesign. By authoring complex text and image layouts in graphics/design programs and publishing images instead of text, you can guarantee absolute page fidelity when comparing the comps to the end product. However, the idea of publishing books without text might seem unsavory to some. It seems odd to remove the text from a book or magazine and only display the screenshots.

As you can guess, replacing text content with images would remove the possibility of searching and selecting text content, which is one of the promising features of digital e-readers.

Native as Display Format
We use this abstract term “native” when referring to content that is implemented through the programming language and tools required by the development platform. Among the current development platforms that have native programming languages are: Apple iOS, Android, Adobe Flash/AIR, and Windows Phone 7. “Native” also has the connotation of being expensive and proprietary, which is usually true. Native application programming requires specialized programmers and is often more time-consuming.

To make native app content more plausible as a display format in digital publishing, there are a few approaches you can take:

  1. Use a template-based system for loading data fields into content template(s):

    Native code will perform the task of reading data stored in a database or as structured data like XML and then injecting the data into a template. With native apps, a template is often a big chunk of code that places data on the screen as well as the layout, formatting and effects to apply when rendering the content.

    The drawback here is the same as any template-based publishing system… the content templates can feel too restrictive and the content look-and-feel may look stagnant and boring. The art directors will never go along with it.

  2. Mix-and-match with HTML and images:

    Each development platform has support for web view and image view components and it is possible to create a native app solution that uses both to enrich the experience. Since most HTML5 and image-based solutions still require a native app as the container around the web view components, a mixture of native and other display formats will usually be present at some level.
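
The template-based approach in item 1 boils down to filling named slots with data fields. Here is a bare-bones sketch in Java. It uses a text template with {{field}} placeholders as a stand-in for the “big chunk of code” a native app would use. The class name ContentTemplate and the placeholder syntax are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: injects data fields into a template that marks
// its slots as {{fieldName}}. Unknown fields are replaced with nothing.
final class ContentTemplate {
    private static final Pattern FIELD = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    static String fill(String template, Map<String, String> fields) {
        Matcher m = FIELD.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = fields.getOrDefault(m.group(1), "");
            // quoteReplacement keeps "$" and "\" in the data from being
            // interpreted as regex replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> article = new HashMap<>();
        article.put("title", "Lies, Damned Lies, and Mime Types");
        article.put("body", "Body copy loaded from XML or a database.");
        System.out.println(fill("<h1>{{title}}</h1><p>{{body}}</p>", article));
    }
}
```

Because every article passes through the same template, every article comes out looking the same, which is the drawback noted above.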

EPUB and Kindle
Perhaps it’s unfair to lump these two together since they are competing e-book formats. However, as display formats they are similar enough to group together. These e-book formats both use HTML as the core document storage format and both have a standard packaging structure that defines the organization of files within a file structure.
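
For EPUB, the packaging structure is a zip archive with a few fixed rules: the first entry is a file named mimetype, stored without compression and containing the text application/epub+zip, and META-INF/container.xml points to the package document. The Java sketch below writes just that skeleton. It is not a complete or valid EPUB, since it contains no package document or content files. The class name EpubSkeleton and the OPF path passed in are placeholders.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: builds only the outer shell of an EPUB container
// (mimetype entry plus META-INF/container.xml) in memory.
final class EpubSkeleton {

    static byte[] build(String opfPath) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            // The mimetype entry must come first and must not be compressed.
            byte[] mime = "application/epub+zip".getBytes(StandardCharsets.US_ASCII);
            ZipEntry mimeEntry = new ZipEntry("mimetype");
            mimeEntry.setMethod(ZipEntry.STORED);
            mimeEntry.setSize(mime.length);
            mimeEntry.setCompressedSize(mime.length);
            CRC32 crc = new CRC32();
            crc.update(mime);
            mimeEntry.setCrc(crc.getValue()); // STORED entries need size and CRC up front
            zip.putNextEntry(mimeEntry);
            zip.write(mime);
            zip.closeEntry();

            // container.xml tells the reader where the package (OPF) file lives.
            String container =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
              + "<container version=\"1.0\""
              + " xmlns=\"urn:oasis:names:tc:opendocument:xmlns:container\">\n"
              + "  <rootfiles>\n"
              + "    <rootfile full-path=\"" + opfPath + "\""
              + " media-type=\"application/oebps-package+xml\"/>\n"
              + "  </rootfiles>\n"
              + "</container>\n";
            zip.putNextEntry(new ZipEntry("META-INF/container.xml"));
            zip.write(container.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] epub = build("OEBPS/content.opf"); // placeholder path
        System.out.println("Wrote " + epub.length + " bytes");
    }
}
```

The package document referenced by container.xml is where each file's media type is declared in the manifest.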

E-books also speak to a narrower audience within the digital publishing universe. Most e-books will only contain chapters and paragraphs of text, presented in a format similar to the paper-based books that they may eventually replace.

EPUB and Kindle formats are both interesting beasts and we can learn much from how they are constructed. We will analyze them both later in a separate set of articles.

HTML5 in Digital Publishing: Part 1

This is the first in a series of posts on the use of HTML5 as a content format in digital publishing. This will be an informal journal with no real plan as to the number of posts or the topics that will be covered beyond the current post. In this first post, we will provide an intro to HTML5 and why it is relevant to digital publishing.

Explain HTML5
We should start by explaining what HTML5 is. I am sure it is not adequate to say that HTML5 is just a newer version of HTML. In general, I assume the audience here is kind of technical, but not necessarily involved in web development. So, I will start by explaining the big picture. Bear with me. This exploration is not intended to be a boring roundup of technology history. There’s a story with real meaning here.

Since the beginning of the Internet, the primary way of interacting with the Web was through a web browser. The content that makes up a web page is assembled in a text structure called HTML and delivered to a web browser. HTML is a hierarchical text structure that resembles XML, which means that it has named elements (or “tags”) with metadata attributes that define specific page layout and formatting details. The HTML text that is rendered by a web browser will often have references to images and other media, and the browser will also fetch and display that content.

Altogether, that complex mass of tags and metadata is received by a web browser and translated for display on a computer screen for a person to view and interact with. In general, when we refer to “HTML”, we usually mean HTML4 and prior versions. With each new version of HTML, there are new features that are defined through new tags and attributes (usually with corresponding updates to CSS and Javascript). To support the new features, new web browsers are released and updated. This takes us back to “HTML5 is just a newer version of HTML”.

Just kidding. It’s much more than that.

HTML5 is Really About Mobile
With HTML5, we have a new and evolving world of Internet-connected devices that includes computers, televisions, and mobile devices. With mobile devices, especially smartphones and tablet devices, there is a driving need for alternate ways of viewing web content, due to the different content consumption habits of people when they are away from their computers and laptops. One major factor is the need for mobile devices to be able to display content for users who are not currently connected to the Internet or when mobile networks are too slow.

With the iPhone and the iPad, Apple redefined mobile content consumption by creating an app-centric universe of mobile apps. Instead of depending on the web browser and an Internet connection for content, apps are capable of delivering content and entertainment when the user is away from work/home or simply relaxing. With current and future generations of mobile devices, the web browser is no longer the primary means of interacting with the Internet.

And yet, the definition of a web browser has changed, or maybe lost its original meaning as a program that can display websites. Custom apps are also capable of displaying web content, either from remote websites or from content stored locally. In mobile application development, there is the notion of a “web view” component, which is like an embedded web browser that can display HTML content without looking like a web browser (with windows and tabs and menus, etc). The end-user may see it as richly-formatted content, while the source content may in fact be HTML.

Summary: Why HTML5 is Relevant
To bring this long-winded story home, I will summarize what this all means:

  • The browser is now embedded and invisible: The “webview” component in mobile apps is an HTML5-capable browser engine, but it doesn’t look like a browser. Very often, it is the WebKit rendering engine underneath, and that’s a good thing. This means you can expect consistency in the display of HTML5 content.
  • The Web is now local: Webview components are often used to display content that is stored locally on the device (and often deployed in the downloadable app). As users and devices become more mobile, the Web will be there with or without an Internet connection.
  • HTML is still a good publishing format: EBook readers like the Apple iBooks app use the WebKit browser engine to read HTML files included inside an EPUB file. On top of that, they add an interactive Table of Contents, bookmarks, and thumbnail navigation to make the book experience more exciting. You can do the same and create your own custom reader to deliver the experience you want.

Bottom line: HTML is no longer limited to the traditional web browser-based experience. And yet, it still supports the traditional browser-based content model.

HTML5 Features
HTML5, as a language that defines a number of features, was developed as the Internet evolved towards mobile computing. Without going into the details of each feature, the overall enhancements in HTML5 can be described as follows:

  1. Portable: The portability of mobile devices also requires a web content model that is capable of operating without an Internet connection. To support this need, HTML5 provides additional features like database storage to allow HTML5 content to store and query data in a local database instead of a remote website.
  2. Media-Capable: Online video and audio in desktop web browsers almost always depend on the Adobe Flash plug-in. With mobile devices, Flash does not have the same pervasiveness due to performance constraints in mobile devices and due to legal licensing issues. One of the key goals of HTML5 is to provide built-in media players for video and audio content.
  3. Canvas Animation: Again, without the Flash plugin, there is a need to provide advanced animation capabilities. The HTML5 Canvas, with lots of help from Javascript, aims to provide this.
  4. Location-Aware: To provide location-based experiences in web content, HTML5 provides support for geolocation data for the current user location (if the user gives permission to share their geolocation info).

NEXT: Choosing a Content Format for Digital Publishing
So far, we have only started to explain the role of HTML5 in our evolving world of Internet devices. Next time, we will need to address the rationale for choosing HTML5 and what the other options are. When you consider the alternatives, you might decide that HTML5 is the best approach. Let the smackdown begin.

Baker: Publish HTML5 to iPad

With every passing day, there is more innovation in digital publishing and it is mind-blowing. And increasingly, the innovations are being shared as open source projects. I first read about the Baker EBook Framework from the Mashable story published last week and it was another one of those jaw-dropping moments. I made plans to try it out and report on my findings.

HTML5 Publishing Workflow

Background: I’ve been involved in a proprietary digital publishing project that is using a similar architecture. When we were planning the architecture, we felt sure we were following the right path. The iPad magazine-style apps were either bloated slideshows with clever/weird navigation or they were full-blown native apps and not really books or magazines. Or they were just EPUB books.

We chose to build a publishing model that resembles the EPUB model in terms of content organization of HTML5-rendered content, but is more magazine-like. Magazines are full-bleed color experiences with rich layouts and images. It is such an obvious model (at first anyway), that I am not so surprised that others are following the same path.

The Baker Framework follows this model. If you can build the page as HTML5 and make it look beautiful in a WebKit browser (Safari/Chrome), you should be able to deploy it with perfect fidelity in an iPad app and other platforms that have a WebKit engine. That includes Android and Adobe AIR apps.

The 5-Minute Test Drive
On the bakerframework.com website, the home page shows you the 3 easy steps to publishing your content in an iPad app. I skipped to step 3, since it seems like the others are not necessary if you already have your HTML content. I should mention that the Baker Framework is an Xcode project, which means you need a recent generation Mac that can run the iPhone SDK in Xcode.

Note: content development is the hard part. Good original content doesn’t just appear. It gets created through much effort and review. Keep that in mind.

Since things have been terribly busy lately, I only had a few minutes to try out Baker. This is partly because of the nonstop activity and innovation in digital publishing. I downloaded the framework, looked at the instructions for about 30 seconds and started to add my own HTML5 files and assets. I clicked on Build and Run in Xcode.

OMG. It freaking works. It’s a little strange to see a free, open-source solution that replicates the functionality of our internal and proprietary iPad publishing platform. The page fidelity is … uncanny. And yet…

The Reality of WebView Rendering
The WebView renderer, in generic terms, is the component in iOS or Android that can load html from a file or URL and display it. Web browsers have a built-in delay that users expect when a page is requested. The page-load psychology for web browsers is fairly tolerant of this reality.

However, the iPad/tablet computing generation is pretty used to the idea of immediate gratification. And rendering HTML on a mobile or tablet device does not feel immediate. As you swipe with your fingers to flip between pages, you experience a delay before the page content is displayed. I think it’s about 1.2 seconds even on the iPhone Simulator. I saw similar results for our custom app, but probably faster. Regardless, that’s not good enough for the impatient, attention-deficit world we live in.

Conclusion (for now)
Baker Framework is very cool, although it’s still early-stage. It may make it easy for you to get your HTML pages into an iPad app, but that’s not quite enough yet. In an upcoming article, we will discuss the hardcore realities of the HTML5-based content approach for publishing to iPad and similar devices/platforms.

Anthologize WordPress-to-EPUB Publishing

I recently wrote about my interest in WordPress-Based Publishing Tools and I am continuing that thread with a test-drive of the Anthologize WordPress plugin. The story behind Anthologize is interesting. It is an open source plugin for WordPress that originated from an innovation project called One Week | One Tool hosted by the Center for History and New Media at George Mason University. I totally love the tagline “Digital Humanities Barn Raising”. (As an aside, I am curious whether digital humanities will be a mainstream college degree in the near future).

Anthologize will appeal to a specific audience in the realm of digital publishing: WordPress-based publishers interested in publishing to the EPUB format. The EPUB format is an e-book format that has been adopted by most e-book readers (pretty much every device except the Amazon Kindle). I should also mention that Anthologize can export to other formats besides EPUB, such as PDF. It is very possible that Anthologize will support other digital formats and workflows in the future.

There are a number of programs that let you create and assemble book content and export to the EPUB format. However, these programs and tools often require advanced technical skills, which makes them unattractive to content creators, who really want the kind of simplicity provided by blog platforms. Hence, the idea of integrating EPUB authoring tools into WordPress is an attractive one. Note, however, that custom plugins like this can only be installed on independent blog servers running the WordPress software and not on hosted blog sites on WordPress.com.

Getting Started
While I have nearly 20 years of hard technology experience and can create and deploy massive Internet sites across a dozen servers, I still shy away from managing my own blog server. This blog, Experimental Lex, is hosted on WordPress.com and I have other blogs on Blogger and Tumblr. I, too, like simplicity. I want my writing persona to never think about server technologies or hackers or whatever.

So it’s a little amusing that I have to set up my own WordPress server to try out Anthologize. That’s okay. I’m sure it won’t be the only WordPress-based digital publishing solution that I will need to explore. (Note to self: must try out CoverPad). If you have ever tried or witnessed a WordPress install, you will know that it’s a piece of cake. A few minutes (maybe 5) and you should be up and running.

Installing Anthologize is quite easy as well. Look for the “Plugins” menu in the left navigation and click on “Add New”. You can then search for “Anthologize” by name and WordPress will download and install it.

It is worth noting that the latest release of Anthologize is still at version 0.5-alpha, which means it is not quite complete.

Test Drive
After the plugin is installed, you will have an “Anthologize” menu in the left nav. Click on “My Projects” and you will see the empty “My Projects” page.

At this point, it is probably worthwhile to discuss the concepts and terminology used in Anthologize. A “project” represents a collection of content that will be assembled and exported as an EPUB or other digital reader format. You start by creating a project with a name and then start adding “parts” to it. In e-book terms, the parts will be chapters and other sections that make up a book.

In the screenshot, you can see that I created a project called “Digital Future” and added parts for “Introduction” and “Wordpress-Driven Publishing”. Since this is a brand-spanking new WordPress blog, the only blog post I had was the “Hello World” post. I can drag-and-drop “posts” from the “Items” section of the screen into the Parts organizer.

Anthologize copies each post that you drag over and creates a “Library Item”. Several items can be added to a part and you can re-order or edit the content of the item within the Anthologize editor. Since it is a copy of the original post, the edits you make will not affect the real blog post.
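To make the terminology concrete, here is a minimal sketch of the project / part / library item idea in Python. The class and method names are my own invention for illustration and are not taken from the Anthologize source code. The point it shows is the copy semantics: editing a library item leaves the original post untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    """A blog post as it exists on the blog (hypothetical model)."""
    title: str
    body: str


@dataclass
class LibraryItem:
    """An independent copy of a post that lives inside a project."""
    title: str
    body: str


@dataclass
class Part:
    """A chapter or section; holds an ordered list of library items."""
    name: str
    items: list = field(default_factory=list)

    def add_post(self, post):
        # Copy the post's fields so later edits touch only the library item.
        item = LibraryItem(title=post.title, body=post.body)
        self.items.append(item)
        return item


@dataclass
class Project:
    """A collection of parts that will be exported as one e-book."""
    name: str
    parts: list = field(default_factory=list)

    def add_part(self, name):
        part = Part(name)
        self.parts.append(part)
        return part


hello = Post("Hello World", "Welcome to WordPress.")
project = Project("Digital Future")
intro = project.add_part("Introduction")
item = intro.add_post(hello)

# Editing the copy does not change the original post.
item.body = "Edited for the e-book."
```

After this runs, `hello.body` still holds the original text while the item inside the “Introduction” part holds the edited version.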

Importing Content
Up until this point, I had not considered the “Import Content” menu in the left navigation. I assumed it might be some kind of upload tool that converts Microsoft Word docs, etc. Not so! It actually lets you pull content from an RSS feed into Anthologize.

I tried it out with the RSS feed from Paidcontent.org and within moments I had a rich collection of content to use in my sample project.

Ok, maybe that’s cool if you are one of those content pirates who rip off other people’s work. (just kidding) It wasn’t until some time later that I realized that you could pull your own RSS feed content. That means you could download blog posts from your hosted blog site using the RSS feed, assuming that the feed contains the full posts. And if you are running Anthologize on your own computer like I am, you can use this setup for assembling and creating e-books.
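Whether a feed carries full posts or only summaries is easy to check for yourself. The sketch below is not how Anthologize does its import (I have not read that code); it is just a small Python example that parses an RSS 2.0 feed and looks for the `content:encoded` element, which is where many blog feeds put the full post body. The sample feed is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS "content" module, which defines <content:encoded>.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"

# A made-up feed: one item with the full body, one with only a summary.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>Full post</title>
      <description>Short summary.</description>
      <content:encoded><![CDATA[<p>The complete post body.</p>]]></content:encoded>
    </item>
    <item>
      <title>Summary only</title>
      <description>Short summary.</description>
    </item>
  </channel>
</rss>"""


def read_posts(feed_xml):
    """Return one dict per feed item, preferring the full body when present."""
    root = ET.fromstring(feed_xml)
    posts = []
    for item in root.iter("item"):
        full = item.findtext(CONTENT_NS + "encoded")
        posts.append({
            "title": item.findtext("title"),
            # Fall back to the summary when the feed has no full body.
            "body": full if full is not None else item.findtext("description"),
            "is_full": full is not None,
        })
    return posts


posts = read_posts(SAMPLE_FEED)
for post in posts:
    print(post["title"], "(full)" if post["is_full"] else "(summary only)")
```

If every item comes back as summary only, the feed will not give you enough material to assemble a book.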

Now that’s a pretty big deal, since this truly becomes a more powerful digital publishing workflow. This system was designed with the understanding that the blog publishing workflow is different from publishing an e-book. Yet, it lets you re-use and re-integrate your content in a fairly smooth way.

Exporting Content
The Export Content feature is pretty barebones at this stage. As you can see in the screenshots below, you have some very basic controls over the metadata and formatting of the exported EPUB file. Still, it is very gratifying to see that it successfully produces a valid EPUB file that looks pretty decent in terms of formatting.
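For a quick sanity check on an exported file before running a full validator, you can look at the container itself. An EPUB is a ZIP archive whose first entry is a file named `mimetype` containing `application/epub+zip`, and it carries a `META-INF/container.xml` file. The Python sketch below checks only those basics; the function names are mine, and it is in no way a substitute for epubcheck.

```python
import io
import zipfile


def looks_like_epub(data):
    """Check only the basic EPUB container rules, not full validity."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = archive.namelist()
            # The mimetype file must be the first entry in the archive.
            if not names or names[0] != "mimetype":
                return False
            if archive.read("mimetype") != b"application/epub+zip":
                return False
            return "META-INF/container.xml" in names
    except zipfile.BadZipFile:
        return False


def build_sample(mimetype):
    """Build a tiny in-memory archive for trying out the check."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry is stored uncompressed.
        archive.writestr("mimetype", mimetype, compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", "<container/>")
    return buffer.getvalue()


print(looks_like_epub(build_sample("application/epub+zip")))
print(looks_like_epub(build_sample("text/plain")))
```

To check a real export, read the file with `open(path, "rb").read()` and pass the bytes to `looks_like_epub`.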



Inkling and the Reinvention of Education

We live in exciting times. The next decade or so will see the gradual exit of paper-based books and magazines as printed content moves from the physical world to digital. The digital trendsetters, who already read books and other content on tablet devices like the iPad and Kindle and countless other devices to come, have already embraced this future. The upcoming battle between Apple, Amazon, and Google on this playing field is mainly focused on this audience. However, I continue to wonder if this playing field will eventually encompass education.

The education market opportunity for digital publishing is almost an accepted fact. We probably agree that it’s going to be huge. Yet, we may not agree on what products and solutions will succeed in this market. The reading experience for novels and magazines on tablet devices is a more linear experience than the kind of interactive reading and problem-solving that students must do as part of their homework and curriculum.

And so it is always exciting to hear about new companies and products in the e-learning marketplace that are pioneering the way. Inkling is one such company that is offering an interactive reading/learning software platform for the iPad. When I heard about Inkling, I had to give it a try.

As an iPad app, installation was dead simple. It’s also a free download, so there were no excuses at all. Upon install and after you complete the registration form, you find yourself looking at the one book that is pre-installed. An essential classic — The Elements of Style. Except that it’s not really the Strunk and White edition. It’s the Inkling Edition “based on the work of William Strunk, Jr.”.

You can either feel horrified or amused at this point. They are messing with the classics and adapting them for the digital future. If you want the original Strunk and White, you probably want it to look like the original, front to back. And this is not the old-timer’s original edition. In the first 2 to 3 pages, you will see commentary like “ftw” and “wtf” as part of the dialogue.

However, if you accept that the digital textbooks of the future need to be updated and adapted for the next generation of students, then you will likely enjoy the experience. I’m in this camp and I found it to be intriguing and inspiring… with some reservations.

Test Drive
Here are a few screenshots to help you visualize the Inkling experience. If you have an iPad, just download it and try it yourself. In the opening screen after signing up / signing in, you see the one book. In Part 2 of this article, I will discuss the Inkling business model and speculate about the digital publishing workflow behind this system. It can’t possibly be easy.


Table of Contents

When you open the book, you immediately can see that this is not a traditional table of contents. The TOC is one of the primary areas for enhancement in any digital publication and this one wants you to know that the linear TOC you knew as a child is a piece of history.


Cover Page

On the cover page, you see a nice splashy title and image along with a toolbar on the left and a call-to-action graphic near the bottom to instruct you on scrolling the content using a swipe gesture. Note that you will need to do a heavy swipe from the bottom of the screen to the top.


Navigation Menu

The menu button in the toolbar shows you several actions available to you. The “Highlights” and “Notes” menu options are particularly interesting. According to the Inkling website, these can contain shared notes provided by other students reading the same pages in the same book.


Search Tool

And lastly, here’s a screenshot of the Search Tool, also found in the toolbar. I found it strange that there were no matches for the word “Footnote”. I felt sure that Strunk and White addressed this topic. We will have to do a little fact-checking.


WordPress-Based Publishing Tools

As the digital publishing future unfolds, there are several unknowns we often think about. We like to speculate about which devices will win or lose, which e-book format will dominate, etc. Essentially, these are questions about how users will consume written content. From the publisher’s perspective, there are also questions about what tools are best to use. Often this is a complicated decision based on the reading platforms that the publisher wants to target.   

Blog-based publishers are an important market segment that is starting to adapt to the changing landscape of digital publishing. In the last decade, they had it easy in the sense that they only had to target one device (personal computers) and essentially one format (HTML)*. Now and in the rapidly approaching future, blog-based publishers will need to evaluate how their blog content looks on a tablet-based browser or whether they need their own custom reader app.

In general, blog-based publishers will prefer solutions that let them preserve their existing publishing tools and workflow, so they can continue to concentrate on writing great content. Since a large percentage of blog publishers use WordPress, it makes sense for them to evaluate solutions based on WordPress plugins. For example, Akismet is a WordPress plugin that helps control and moderate comment spam and is provided in the standard WordPress installation.

CoverPad / PadPressed

A few days ago, I mentioned the CoverPad app from PadPressed, which offers a set of publishing tools and apps that help publishers target the iPad. PadPressed makes this possible through their custom WordPress plugin which they sell as a commercial product. This highlights an interesting digital publishing workflow for blog-based publishers who want to reach the growing market of tablet-based readers.

I visited the PadPressed website to try it out, but found that I needed to buy the product before I could try it out. The pricing seemed fair enough for publishers who are using WordPress. However, since I don’t have my own WordPress server, I was a little shy about paying to try it out. Eventually, I will get around to it.

It is worth noting that PadPressed has stated that they will be moving away from WordPress-based solutions and towards CMSs in general. It seems they have their eye on the larger market of digital publishing and not just content published via WordPress.


Another intriguing WordPress plugin called Anthologize shows great potential. This plugin helps you assemble and build content in the EPUB format, which is the book format used in the Apple iBooks app and several other reader apps available for Apple iOS and Android devices. It originated from the One Week | One Tool project hosted by the Center for History and New Media at George Mason University. It is now an open source project that is quickly evolving. Here’s a brief description pulled from the Anthologize website:

Anthologize is a free, open-source, plugin that transforms WordPress 3.0 into a platform for publishing electronic texts. Grab posts from your WordPress blog, import feeds from external sites, or create new content directly within Anthologize. Then outline, order, and edit your work, crafting it into a single volume for export in several formats, including—in this release—PDF, ePUB, TEI.

I don’t have my own WordPress server, but I suddenly have one good reason to set one up. By the way, you need a self-managed WordPress installation (not necessarily on your own server), since the plugin is not meant to be installed on a hosted WordPress service. Since I was just testing it out, I installed the whole WordPress stack on my computer along with the Anthologize plugin.

I will save the review for another post, but I can say that I was definitely impressed. For an early alpha (version 0.5), it has lots of potential and clearly shows that the Anthologize creator(s) have a good understanding of the digital publishing workflow possibilities.