
The Story Behind HTML5

By Alan Houser | Fellow

We often use standards without thinking about the process by which they were developed or the people behind them. But standards are created by real people, working for and representing real organizations. The standards development process includes conflict, compromise, consensus-building, and sometimes rebellion. Usually this process happens privately among the members of standards organizations, but sometimes the process is very public.

The last major version of HTML, HTML 4.01, was finalized by the World Wide Web Consortium (W3C) in December 1999. That was more than 10 years ago. What has happened since then? Quite a bit, actually. After nearly 10 years of controversy, contention, rebellion, and reconciliation, we are now close to the next-generation markup language for the World Wide Web—HTML5.

Philosophies of HTML5

The WHAT Working Group sought to create an HTML5 specification that meets the needs of browser vendors, Web designers, programmers, and users of the World Wide Web. The following are some of their guiding principles:

  • Pave the cowpaths—Support common behaviors and conventions that have been developed over the 20 years of Web history.

  • Support best practices—Support best practices in HTML markup, including structured semantics and accessibility.

  • Graceful degradation—When a Web browser encounters an error condition in HTML markup, don’t “die” in response. Make a reasonable attempt to display the page.

  • Document error handling—Specify how browsers should handle error situations and ill-formed HTML markup.

  • Don’t break things—Retain backward compatibility with existing Web pages and all previous versions of HTML.

The Next Generation Web, circa 2000

After HTML 4.01 was finalized in 1999, the W3C shifted its focus almost exclusively to XML-based markup languages. The W3C published the first of these specifications, XHTML 1.0, in January 2000. This specification was essentially a reformulation of HTML 4.01 that conforms to XML syntax rules, including:

  • all begin tags must have a close tag

  • all tags must be properly nested

  • all attribute values must be quoted

  • letter-case is meaningful

  • documents must conform to a document type definition (DTD).
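The syntax rules above can be checked mechanically with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The sample snippets and the `is_well_formed` helper are hypothetical and exist only for illustration. The parser tests well-formedness only, so the last rule (conformance to a DTD) is not covered here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical snippets: one conforming, then one violation per syntax rule.
samples = {
    "conforming": '<p class="note">Hello<br/></p>',
    "missing close tag": '<p class="note">Hello<br></p>',
    "improper nesting": "<p><b><i>Hello</b></i></p>",
    "unquoted attribute": "<p class=note>Hello</p>",
    "mismatched letter case": "<P>Hello</p>",
}

results = {name: is_well_formed(markup) for name, markup in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'accepted' if ok else 'rejected'}")
```

Only the first snippet is accepted. Each of the others is ordinary, widely used HTML 4.01 style markup that an XML parser refuses.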

The Problem with XML

The creators of XML envisioned a language that required 100% conformance to the specification and was unforgiving of any deviation from the specification. In fact, they built this requirement into the XML specification. Commonly referred to as “draconian error handling,” the specification says that upon encountering any syntax error, the application processing the XML (e.g., a Web browser) must abort processing. If a Web page contains a single XML error, the browser will abort and not show anything. Content must conform or not be displayed.
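To make the all-or-nothing behavior concrete, here is a small sketch, again assuming Python's standard `xml.etree.ElementTree` as a stand-in for a browser's XML parser. The `display_strict` function and the two sample pages are hypothetical. The sketch imitates the rule described above and is not how any particular browser is implemented.

```python
import xml.etree.ElementTree as ET


def display_strict(page: str) -> str:
    """Return the page's text, or nothing at all if the markup has any XML error."""
    try:
        root = ET.fromstring(page)
    except ET.ParseError:
        return ""  # all-or-nothing: one error hides the entire page
    return " ".join(text.strip() for text in root.itertext() if text.strip())


good_page = "<html><body><p>Tom &amp; Jerry</p><p>Second paragraph</p></body></html>"
# Same page, but the ampersand is not escaped: a single syntax error.
bad_page = "<html><body><p>Tom & Jerry</p><p>Second paragraph</p></body></html>"

print(repr(display_strict(good_page)))
print(repr(display_strict(bad_page)))
```

The second page differs from the first by one unescaped character, yet none of its content is returned.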

This draconian error handling of XML was supposed to fix the imperfect HTML markup underlying the World Wide Web. You had to be perfect to play. If your HTML was broken, in any way, your page would not display.

This requirement of perfect XML syntax caused a great amount of concern among browser vendors. It also ran counter to a well-known principle of computer communications known as “Postel’s Law,” which states: “Be liberal in what you accept, and conservative in what you send.” The original Web conformed to Postel’s Law. Web browsers were forgiving of ill-formed markup. Web pages were simple to create. An eight-year-old could write an HTML Web page. And the Web exploded in popularity as a result.

But many in the W3C and other communities thought a stricter Web would be a more perfect Web. By moving the Web to XML-based markup, the W3C hoped to rid the Web of “broken” syntax and markup that is common in HTML Web pages, provide compatibility with other related XML-based specifications (including RDF [resource description framework], which would provide the basis for the semantic Web, and SVG, for scalable vector graphics), and provide the capability to process XML Web content with XML tools and languages, including XSLT (the XML transformation language).

The W3C knew they couldn’t make this transition immediately. Even though XHTML 1.0 documents conform to XML syntax, browsers are allowed to parse them as standard HTML documents. But this flexibility ended with XHTML 1.1, which required that browsers parse XHTML 1.1 documents as XML.

The W3C began to create the next generation language for the Web and called it XHTML 2.0. The W3C working group began working on XHTML 2.0 in August 2002.

8 For, 14 Against

But several browser vendors and a community of Web application developers were gravely concerned about XHTML 2.0. Not only would XHTML 2.0 require draconian error handling, it was by design not compatible with HTML or previous versions of XHTML, and it did not provide features needed for Web applications, such as real-time generated graphics and audio and video handling.

In June 2004, at a W3C workshop, a group of participants proposed that HTML (not XHTML) should be extended to support Web application requirements. The W3C considered their proposal and voted it down—8 for, 14 against. At the W3C, XML remained the future of the World Wide Web (see www.w3.org/2004/04/webapps-cdf-ws/minutes-20040602.html#topic28.1).

Just a few days later, representatives of several browser vendors split from the W3C and formed their own standards organization. They called themselves the Web Hypertext Application Technology (WHAT) Working Group and began development on a specification they called Web Applications 1.0, which later became HTML5. From the WHAT Working Group FAQ (http://wiki.whatwg.org/wiki/FAQ):

“The WHATWG was founded by individuals of Apple, the Mozilla Foundation, and Opera Software in 2004, after a W3C workshop. Apple, Mozilla and Opera were becoming increasingly concerned about the W3C’s direction with XHTML, lack of interest in HTML and apparent disregard for the needs of real-world authors.”

In contrast to the private, consensus-driven processes of the W3C, the WHAT Working Group features open membership and public deliberations. It also places a substantial amount of decision-making authority with the specification editor. The editor of the WHAT Working Group specification is Ian Hickson, formerly employed by Opera Software and currently by Google. Under this model, the WHAT Working Group was able to produce a specification relatively quickly, especially in contrast to the typically slow development pace found in the standards world.

Reconciliation

For about two years, the WHAT Working Group developed HTML, while the W3C continued work on XHTML. But it became increasingly apparent that XHTML was not gaining traction among browser vendors or authors.

In October 2006, Tim Berners-Lee, W3C Director and inventor of the World Wide Web, acknowledged that the goal of XML for the Web may have been a mistake (see http://dig.csail.mit.edu/breadcrumbs/node/166). The W3C soon re-chartered an HTML working group, which used the efforts of the WHAT Working Group as its starting point. Originally, the W3C HTML working group was to work in parallel with the XHTML working group. The W3C later (2009) ended the efforts of the XHTML working group.

Instead of XML’s draconian error handling, HTML5 adopts the philosophy of graceful degradation. Browsers should try to display Web pages no matter how broken the HTML markup may be. Although HTML5 does declare some older HTML markup to be obsolete, the specification details how browsers should handle deprecated tags. Browsers are required to handle old HTML markup.
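A forgiving parser can be sketched with Python's standard `html.parser` module. This is an illustration only: `html.parser` is a lenient tag scanner, not an implementation of the HTML5 parsing algorithm, and the `TextCollector` class, the `display_lenient` function, and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect visible text and keep going past markup errors."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Keep every non-empty run of text, whatever tags surround it.
        if data.strip():
            self.chunks.append(data.strip())


def display_lenient(page: str) -> str:
    """Return whatever text can be recovered from the page."""
    parser = TextCollector()
    parser.feed(page)
    parser.close()
    return " ".join(parser.chunks)


# Unescaped ampersand, unclosed <p> and <html>, and badly nested <b>/<i>.
broken_page = "<html><body><p>Tom & Jerry<p><b><i>Second</b></i> paragraph</body>"
print(display_lenient(broken_page))
```

Despite several markup errors, all of the page's text is recovered, which is the outcome the graceful degradation principle asks of browsers.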

There remains evidence of the original split between the W3C and WHAT Working Group. Other W3C working groups restrict membership to employees of W3C member organizations and to “invited experts” at the sole discretion of the working group chair. However, anybody can join the HTML5 working group. All it takes is to declare oneself an invited expert by signing up for an email list (www.whatwg.org/mailing-list).

How to Get Started

The HTML5 specification is in its final phase of development at the WHAT Working Group, with completion scheduled for 2012. Many components of HTML5 are ready to use now. Current versions of all modern Web browsers provide some support for HTML5, although that support varies. If you wish to start using HTML5 now, your general strategy will be to choose a specific feature you wish to use, code that feature in HTML5, test whether the feature is supported, and provide a fallback option if a person opens your Web page in a browser that does not support the feature. Most HTML5 features can be imitated with JavaScript. The Modernizr JavaScript library provides feature tests that you can include in your Web pages with just a few lines of JavaScript code.

You will probably want to learn HTML5 and CSS3 together. CSS3 (cascading stylesheets) is a W3C recommendation for styling content in Web browsers, print, and other output media. CSS3 explicitly considers the styling requirements and constraints of today’s wide range of output devices, including smartphones and tablets.

Recent Developments

January 2011 brought several interesting developments in HTML5.

HTML5 provides a

The W3C announced a branding campaign for HTML5, including a logo. This logo is provided under a Creative Commons license, giving people free rein to use it. Notably, the logo is intended to promote not only HTML5 but the current generation of Web standards, including CSS3 and SVG (see www.w3.org/html/logo/).

Within days of the W3C HTML5 logo announcement, the WHAT Working Group announced that it would no longer refer to its specification as HTML5. The specification would henceforth be referred to as HTML and would be a “living standard,” updated and modified by consensus of the WHATWG members and the specification editor (see http://blog.whatwg.org/html-is-the-new-html5). Was this announcement a sign of continuing contention between the W3C and the WHAT Working Group? Perhaps.

Alan Houser (arh@groupwellesley.com) is an accomplished trainer, consultant, and conference presenter. Alan is a member of the OASIS DITA Technical Committee and contributed to the DITA 1.2 specification. He is an STC Fellow and served as the STC liaison to the W3C, 2007–2009. Alan has served STC most recently as conference manager for the annual STC Summit.

Key HTML5 Features

Doctype

Structure

[…] (which may be nested) and […]. […], […], and […] ([…] to […]).

Semantics

Interactivity

Audio

Graphics and Video

Web Application Support

Suggested Reading

World Wide Web Consortium (W3C): www.w3.org/

W3C HTML5 Working Group: www.w3.org/html/wg/

W3C HTML5 Working Draft: www.w3.org/TR/html5/

WHAT Working Group (WHATWG): www.whatwg.org/

WHATWG HTML5 spec: www.whatwg.org/specs/web-apps/current-work/multipage/

Dive Into HTML5 by Mark Pilgrim: http://diveintohtml5.org/

HTML5 for Web Designers by Jeremy Keith

Modernizr JavaScript library: www.modernizr.com/
