HTML Coding Standards

Many software developers do not consider HTML to be a programming language - because it isn't. Strictly speaking, HTML is a text markup language; it does not have any of the looping or flow control constructs present in a real programming language.

However, the format and style of HTML code are important, primarily because others may be required to modify and/or maintain it one day. Hence, HTML should be logically organized and neatly laid out; it must be easy to read and maintain.

This page serves as an example of good HTML source. The markup used in this page adheres to all of the style guidelines described below. This style guide was originally written in September 2000. Since then it has seen a number of revisions by different authors. It is quite comical to note that the last few revisions were made either using an HTML editor or by an individual completely lacking a clue, as the style of the HTML markup was just terrible!! The most recent version of this guide has returned it to its desired state.

A note about HTML editors

Thou shalt not use HTML editors. This rule is not negotiable.

If one has little technical expertise and merely wishes to post some information on the web quickly, then HTML editors are an acceptable solution. However, any computer geek worth their pay really has no excuse to use these horrible beasts. The HTML markup language is simple enough to learn and use that all developers should be able to form a good working knowledge of its use in a short time period. Most of my colleagues have learned to use HTML within a few hours' time.

Most WYSIWYG HTML editors generate poor quality HTML source that violates many of the points discussed in this style guide. Most HTML editors place far more markup on a page than is necessary to achieve a given layout. Furthermore, most HTML editors generate poorly formed HTML, as many neglect to properly terminate container tags.

When developing web tools, one must interweave the business logic of the problem domain with the generation of HTML markup. It is very tedious to grab the output from an HTML editor and cut and paste it into one's script/servlet/whatever. As the markup generated by HTML editors is excessively verbose, the resulting script or servlet would become very long and messy, which in turn makes the code harder to read and maintain.

The worst thing about HTML editors is that if you open a file containing markup that was either written by hand or generated by another editor, your editor will place all sorts of idiotic tags in the HTML source and generally munge it beyond recognition. I have seen examples of editors tripling the size of an HTML file merely by opening it and then saving it (Microsoft FrontPage is really bad for this)!! The result of this feature is that someone (hopefully you) will have to wade through the code and manually eliminate all of the stupidity so that the web page may once again display properly. As a side note, Microsoft Word's 'Save As HTML' feature does not work well for anything but very simple documents so don't even think about using it!!

Make the web site available to as wide an audience as possible

The purpose of any web page or site often falls into one of four categories: education, entertainment, sales or communication. To be effective in any of these categories, the first and most basic thing a web author must achieve is to reach as large an audience as possible. Hence it makes sense to ensure that a web page or site is accessible to as many potential users as possible. Ideally, one's web site should be accessible to 100% of one's potential user base.

A target audience for a web site will contain users having a variety of browsers running on a larger variety of operating systems. Therefore, a web site must be made to function on as many of these browsers and platforms as possible. Ideally, one's web site should have all content and/or functionality (in the case of web forms) accessible to all users on all platforms and all browsers.

Out in the corporate intranet and the big bad public internet, a large number of web sites exist that are only viewable by browser X on platform Y. Individuals using anything but a specific magic combination of OS and browser would not be able to view such a web site. This practice is not only reckless, irresponsible and stupid but it also demonstrates a profound lack of technical expertise on the part of the web developer responsible for the site.

If one was trying to sell a product on one's web site, would it be deemed acceptable to only attempt to sell the product to a portion of one's potential customer base? Would the owners of a firm be happy to learn that potential buyers [of the firm's product] did not even have an opportunity to purchase the goods simply because the web site wouldn't let them in? While it is true that only rich people can afford poor programmers, it is also true that only rich people can afford poor web developers!!

Be a minimalist

Minimalist design is good. Far too often, web pages are cluttered with cutesy bells and whistles that do not enhance the page content or usability. Such extraneous elements do little more than consume precious bandwidth and make pages slower to download and render. A good example of a useless bandwidth-chewing page element would be an animated GIF of a mailbox opening and closing placed next to an email address.

When placing an element on a page, one should ask the question "if I don't include this thing, will the content or the usability of the page be seriously compromised?" If the answer is no, then deep six the superfluous element. By adopting a minimalist design philosophy, one will hopefully build pages that are not overly cluttered.

Be mindful of the seven plus or minus two rule

The average human's short term memory can retain seven plus or minus two items at any given time. Presentation of more than this magic number of stimuli can overload one's sensory inputs. Pages containing more than 7 ± 2 logical elements or groupings of elements will tend to visually overwhelm the average user. Such sites are often viewed as being overly busy and hard to use.

However, portal style sites (those that offer many bits of content and/or services on one page) have become very popular. The CNN website is a good example of a really busy portal style site. Just because such layouts are frequently used by big commercial sites does not necessarily mean that such designs are a good idea or that such sites were designed with basic human computer usability principles in mind.

If it is absolutely necessary to include great gobs of content on a single page, designers must recognize that they do so at the expense of usability. At the very least, such portal style sites should always provide a simple hierarchical site map, site content search tool or index that may be used as an alternative form of navigation.
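
As a rough sketch, a simple hierarchical site map need be nothing more than a nested list of links. The section names and file names below are hypothetical and serve only to illustrate the idea.

```html
<!-- Hypothetical site map; section and file names are assumptions. -->
<UL>
  <LI><A HREF="news/index.html">News</A>
    <UL>
      <LI><A HREF="news/world.html">World</A></LI>
      <LI><A HREF="news/business.html">Business</A></LI>
    </UL>
  </LI>
  <LI><A HREF="weather/index.html">Weather</A></LI>
  <LI><A HREF="sports/index.html">Sports</A></LI>
</UL>
```

Each nested list mirrors one level of the site, so a user can see the whole structure at a glance.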

Thou shalt not use the BLINK tag

Thou shalt not ever even consider using the BLINK tag. This rule is not negotiable.

The BLINK tag is perhaps one of the least value added tags in the HTML markup definition. Some browsers (IE) do not support this tag as its misuse can cause a web page to be virtually unreadable.

Don't be afraid to place an element in more than one location

There is no hard and fast rule that states that a given piece of content or navigation element must live in one and only one place. Different people have different conceptual models and may look for a specific piece of content in different locations. Hence if an element is placed in more than one location, there is a greater chance that a user will find it when looking where they think it should be.

The use of multiple access paths for a given content element ensures that it will be easily found by diverse users. This approach will also reduce the memory load of the user as they need not learn where things are in a site to begin using it [the site] effectively.

Consider for example a site designed to provide an online presence for a manufacturer of bicycles. One important feature of this site would be contact information for product support and warranty assistance. One could place a link to such info in locations such as the following:

- on the main page
- in the company yellow pages
- on a product support or warranty info page
- alongside product descriptions or specifications
- in a FAQ list

One problem with this approach is that site maintenance becomes more difficult; when a link changes it must be updated in multiple locations. Therefore this approach must be used wisely in order to balance the usability and the maintainability of a web site.

Easy navigation

Nothing is more annoying than getting N levels deep in a web site and then having no means of getting back to the top level. The severity of this problem is greatly amplified in sites that use framed layouts. One would think that good navigation elements would be basic to all web sites but this is not always the case.

It is critically important that a set of pages have an easy to use and ever-present navigation system. It is important that the navigation system does not change form or style from page to page; positive transfer is a property of human computer interaction that one must harness.

The navigation system need be nothing more complex than plain old links. Many Java applet or JavaScript navigation tools are also available. However, one must realize that the use of such tools will increase the load and render time for a page and may make a web site inaccessible to some browsers. For example, the use of Flash components for site navigation locks out users on older browsers and some operating systems (mostly versions of Unix for which there are no Flash plug-ins).
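
As a minimal sketch of a plain-link navigation system, one might place something like the following in the same position on every page. The page names are hypothetical and are not taken from any real site.

```html
<!-- Hypothetical navigation bar; all file names are assumptions. -->
<P>
  [<A HREF="index.html">Home</A>]
  [<A HREF="products.html">Products</A>]
  [<A HREF="support.html">Support</A>]
  [<A HREF="contact.html">Contact</A>]
</P>
```

Because it consists of nothing but anchors, such a bar requires no plug-ins or scripting and adds almost nothing to the download time.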

When designing a web site, it is important to think about navigation before one actually begins to build any content. Often the navigation system used will influence the structure of a web site. It is much better to build a web site around a strong navigation system than to add a poor one after the fact.

Include author and/or owner information on every web page

One must include contact information for the author and/or owner on every web page. This rule is not negotiable.

Inclusion of this information facilitates feedback or questions from readers. This information also denotes who is responsible for maintaining the pages should the content require updates.
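
One possible way to do this is with the ADDRESS element and a mailto link at the foot of each page. The name and email address shown are placeholders only.

```html
<!-- Hypothetical page footer; the name and address are placeholders. -->
<HR>
<ADDRESS>
  Maintained by Jane Doe
  (<A HREF="mailto:webmaster@example.com">webmaster@example.com</A>)
</ADDRESS>
```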

Scrolling

It is very rare to create web content that fits completely into one browser pane. Often scrolling is required to view the entire page content.

In such cases [where scrolling is required] one should endeavor to construct the page such that either vertical or horizontal scrolling is required but not both. Pages that require both vertical and horizontal scrolling are difficult to use and may frustrate users not benefiting from large monitors or high resolution video cards.

An astute web designer will decide which scrolling plane will be used prior to designing the page layout. Such a clever web designer will then employ this rule consistently throughout all pages in a logical set.

Don't put vast amounts of HTML markup on one line

As an example, here is what you probably should not do:

<HTML><HEAD><TITLE>My Annoying Page</TITLE></HEAD><BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#003399" ALINK="#003399" VLINK="#003399" BGSOUND="isuck.wav"><TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0" WIDTH="644" ALIGN="CENTER" VSPACE="0" HSPACE="0"><TR VALIGN="TOP"><TD ALIGN="LEFT" VALIGN="MIDDLE">Are you annoyed yet?</TD></TR></TABLE></BODY></HTML>

This page will render normally. However, the HTML source is a bit hard to read. Future maintenance attempts will probably be met with much frustration and cursing.

A better way to format the above code would be as follows:

<HTML>

<HEAD>
  <TITLE>My Not So Annoying Page</TITLE>
</HEAD>

<BODY BGCOLOR="white">

  <TABLE CELLSPACING="0" CELLPADDING="0" BORDER="0"
         WIDTH="644" ALIGN="center" VSPACE="0" HSPACE="0">

    <TR VALIGN="top">
      <TD ALIGN="left" VALIGN="middle">Feeling better??</TD>
    </TR>

  </TABLE>

</BODY>
</HTML>

By using whitespace (judiciously) to separate logical page elements, the HTML source is made much easier to read.

Note that indentation was used in the second example. This is usually a good idea as it adds to the readability of your HTML. Also note that the tags were all in upper case with all attribute values being in lower case. Some web authors prefer tags in lower case. However, one should keep in mind that when HTML markup is embedded within a Java servlet (for example), upper case tags make it much easier to distinguish between HTML and Java.

Don't overuse whitespace

This advice may seem contrary to the previous point but it isn't. While it is a good idea to judiciously use whitespace to separate page elements, it isn't necessary to leave a blank line between every line of HTML markup. Too much whitespace can be very distracting.

Pick a case for tags (and stick with it)

Some web authors choose to write all HTML tags in upper case while some choose lower case. Neither choice is wrong but the two styles should never be mixed!! Once one has decided which convention one will use, one must use it consistently throughout a page or set of pages. Mixed case HTML source is not acceptable.

A good style to adopt is to use upper case for HTML tags and attribute names and to use lower case for attribute values. Use of this style produces HTML in which it is easy to visually distinguish between the markup and the content.

Don't use tags that STADN (sit there and do nothing)

STADN tags do just that - they don't actually contribute much to the content or layout of a page. An example of a STADN tag would be:

<FONT SIZE=2><B>&nbsp;</B></FONT>

The bold and font tags do not contribute to the layout or appearance of the non-breaking space. We could add any number of surrounding tags to the non-breaking space and it still wouldn't affect the appearance of the page.

Most HTML editors liberally insert STADN tags. This is yet another reason why HTML editors must not be used.

Another more insidious form of a STADN tag is one that re-defines existing page defaults.

<FONT SIZE="3" FACE="Times New Roman" COLOR="black">

This font tag specifies (redundantly) the browser default font face. In general, unless one has a really good reason, one should not set a specific font face for an entire page. I don't think the content of a given page will be any more powerful or impressive if rendered in the 'Courier' font rather than the default (for most browsers) 'Times New Roman' font. If the font specified is one not known to a browser, the page will be rendered using the default anyway, so why bother?

It is also worth noting that if one uses a specific font for a page they must specify the font face for every table cell on that page (unless CSS is used)!! For whatever reason, font specifications do not get picked up in tables and one is forced to specify the font for each cell. This equates to a lot of extra work for little tangible reward. I equate this behavior with that of beating one's own head with a stick because it feels good when one stops!!
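
If a specific font really is required, a style sheet avoids the per-cell repetition. The following is a minimal sketch, assuming that the browsers one must support handle CSS; the font choice is purely illustrative.

```html
<HEAD>
  <TITLE>Font set once with CSS</TITLE>
  <!-- One rule covers the body text and every table cell. -->
  <STYLE TYPE="text/css">
    BODY, TD, TH { font-family: Arial, Helvetica, sans-serif; }
  </STYLE>
</HEAD>
```

Listing TD and TH alongside BODY in the selector is what spares one from repeating a FONT tag in each cell.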

Terminate all tags

Most HTML tags are container tags; they require both an opening and closing tag. Such tags must be properly closed to ensure correct rendering on all browsers. HTML source in which container tags have not been properly closed is described as being poorly formed. It is worth noting that poorly formed XML documents will not parse let alone render!!

Many web browsers will correctly render the content without the use of the proper closing tag. This tends to lull web authors into doing the wrong thing; the closing paragraph and list item tags are rarely used for example. However, just because one can get away with this practice most of the time does not mean that it is the right thing to do.

Poorly formed HTML is often difficult to maintain and/or modify. Often the lack of closing tags leads to undesired side effects when new content is added. Hence the use of closing tags will yield HTML source that is more robust and easier to maintain.
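
As a short illustration, here is a fragment in which every paragraph and list item is explicitly closed. The content itself is invented for the example.

```html
<!-- Every container tag that is opened is also closed. -->
<P>Things to bring:</P>
<UL>
  <LI>Helmet</LI>
  <LI>Water bottle</LI>
</UL>
<P>See you at the trail head.</P>
```

New items or paragraphs can be inserted anywhere in such a fragment without changing where the existing elements end.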

Enclose tag attribute values in quotes

Thou shalt always enclose tag attribute values in single or double quotes. One shall not mix single and double quotes as once thou choseth a style thou shalt stick with it!!

As an example, the following HTML doth sucketh and shall be avoided:

<IMG SRC=/sompath/to/image.gif size=42>
<TABLE BORDER='2' CELLPADDING="5" CELLSPACING=5>

The following HTML is good and would free one from the wrath of the omnipotent web master:

<IMG SRC="/good/image.gif" ALT="my sample image" SIZE="42">
<TABLE BORDER="0" CELLSPACING="5" CELLPADDING="2">

Use colour names instead of RGB values

When specifying colours, one should endeavor to use colour names instead of RGB values. The HTML 4.0 standard defines 16 standard colours but most popular browsers support an extended set of colour names.

For example, one could specify the colour black as #000000. If one does not understand RGB values for specifying colours then this value will be more than a little cryptic and hence add complexity to the maintenance of a web page. It would be much easier to specify this colour as black.

There are exceptions to this rule of course. When one wishes to use a particular colour for which there is no predefined name, one must specify an RGB value. However, if a colour name is close enough one should use that colour name. For example, consider the two colours magenta and #FD00FD. There is little apparent visual difference between these two colours; a user's video card and/or monitor will probably further narrow the perceived hue difference between these colours. Hence one does not gain much (besides obfuscated markup) by using RGB colour values when a colour name will suffice.
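
A brief sketch of the rule and its exception follows. The RGB value in the second cell is an arbitrary pale shade chosen only for illustration.

```html
<!-- Colour names where a name exists or is close enough. -->
<TABLE BORDER="0">
  <TR>
    <TD BGCOLOR="silver">A named colour is self-explanatory.</TD>
    <!-- An RGB value is justified only when no name comes close. -->
    <TD BGCOLOR="#E8F0D8">A custom pale shade.</TD>
  </TR>
</TABLE>
```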

Don't change link colours or decoration

It is important to remember that people spend most of their time [on the web] viewing sites other than yours. Thus people will build up a set of expectations and learned behaviors based on what most other websites do. If you deviate from any common design conventions or practices, you run the risk of confusing and/or annoying your users.

The vast majority of websites currently available use the default colours and decoration for links. This simple link decoration convention allows users to quickly distinguish a link from content and to determine if they have previously visited that link's target content. The use of link decoration takes advantage of the positive transfer usability heuristic.

Simply stated, positive transfer is the ability to apply past learning experience to a new but similar situation. For example, once a person learns how to ride a pedal bike, they can probably ride just about any similarly designed pedal bike in the world; the learned skill was not confined to the one unique bike on which they learned to ride.

The naive web designer is often tempted to remove the underline on a link and to change the visited, unvisited and active colours; often the same colour is used for all three link states and the colour is chosen more to match the page's aesthetics than to aid usability or function. This practice makes it more difficult to distinguish content from links and may result in users having to scrub your entire page (with their mouse pointer) to determine what is a link and what is not.

Most web pages do not muck with the colour of links. Most users have become accustomed to the standard colours used by visited and unvisited links. Therefore specifying custom link colours would negate any positive transfer effect afforded by a user's prior web surfing experience and tends to decrease the usability of a web page. The resulting page may look pretty but it will be harder to use.

To illustrate an even more insidious practice, consider the following HTML snippet:

<BODY TEXT="#000000" BGCOLOR="#FFFFFF" LINK="#003399"
VLINK="#003399" ALINK="#003399">

The colours specified in the example above are almost identical to browser defaults. It is also worth pointing out that specifying the text colour as black (#000000) is useless, as this is the default on most browsers!!

It is also worth noting that the same colour is used for visited, unvisited and active links. This is particularly insidious as the colour of links gives a user an indication of which links they have followed and which they have not. Removing this colour coding removes an important visual cue that would otherwise enhance the usability of a web page.

Removing all of the redundant and silly attributes, the above HTML snippet should appear as follows:

<BODY BGCOLOR="white">

As a parting comment, changing link colours could impact readers with certain types of colour blindness. Changing link colours to match a particular colour scheme may render the page next to useless for colour blind people. Similarly, poorly chosen link colours could simply vanish against a background image, texture or colour.

Avoid using too many large images

One should avoid placing large images on a page. Large images take a long time to download and render. The use of many large images will ultimately serve only to delay the loading time of the page. Many users will not wait for a really slow page to download a metric butt-load of images. Many people will skip a slow page and move on to something faster. If the web page they are skipping represents a product you are trying to sell, then one can infer that a slow page may be directly equated to lost sales.

A good way to use images on one's web page is to place small thumbnails of the images on the page. Most images can be reduced to 25% of their original size and still provide an adequate preview of the larger images. The thumbnail images should then be used as clickable links to the full size images. If this technique is used, users will be able to view only the images they wish to see without being penalized by lengthy download times.

It is important to note that the images must be reduced to thumbnails using a suitable graphics tool. Merely using the full size image and scaling it to 25% (using attributes of the IMG tag) is not recommended because the browser will then have to both download and scale the image.
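
The thumbnail technique can be sketched as follows. The file names and the 160 by 120 pixel thumbnail dimensions are assumptions; the point is that the thumbnail is a separate, pre-shrunk file and that WIDTH and HEIGHT state its true size rather than scaling a larger image.

```html
<!-- Hypothetical file names; bike-thumb.jpg is assumed to be 160x120. -->
<A HREF="images/bike-large.jpg"><IMG SRC="images/bike-thumb.jpg"
   ALT="Mountain bike, click for full size"
   WIDTH="160" HEIGHT="120" BORDER="0"></A>
```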

Do not use cheesy custom bullets or line-separators

The use of custom images for list bullets or line-separators should be avoided at all costs. One should endeavor to use standard HTML elements in place of custom images. Such cheesy offerings rarely add value to a page; their only contribution is to lengthen page download and render times.

If for artistic reasons one absolutely must use custom bullets or lines, do so sparingly. The overuse of such things can clutter up a page and make the consumption of the actual content more difficult. Nobody likes to read an overly busy web page.
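
For comparison, standard HTML elements already offer a little variety without any image downloads. This is only a sketch; the list content is invented.

```html
<!-- Standard elements: no image downloads required. -->
<UL TYPE="square">
  <LI>Frames</LI>
  <LI>Wheels</LI>
</UL>
<HR WIDTH="80%" SIZE="1">
```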

Avoid using background images

Background images seldom enhance the content of a web page. As a general rule, if an element doesn't add to a page's content or usability, the element should not be there. The use of background images does not represent an exception to this rule.

More often than not, background images make the page content harder to read. Busy background images or those whose colouration closely matches that of the page text can make it difficult to read the content.

Most browsers download and render the background image prior to rendering any content. Consequently one is forced to sit and wait while an image of pocket lint downloads and renders. We know that impatient users will not wait for a silly background image to load and will move on to another page. This is bad.

Linking to other pages

When linking to other pages within a set of related pages, one should endeavor to use relative links. The use of relative links in a set of related pages (a training manual for example) would enable one to move the page(s) to a new location (on the server) without the need for numerous link updates. If every link were absolute, moving the pages to a new storage location (on the server) would require the update of every link in the set of pages. I think we can all agree that this would not be a good use of one's time and would contribute to a repetitive strain injury!

However, when linking to other pages on the same server that are not related, one should use absolute links. If one uses relative links for unrelated pages, one may need to update the links every time something is moved on the server.

When linking to pages on another server, one has no choice but to use a fully qualified URL.
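
The three cases above might look like the following sketch. All paths and host names are hypothetical.

```html
<!-- All paths and host names below are hypothetical. -->

<!-- Relative link: another page in the same set of related pages. -->
<A HREF="chapter2.html">Next chapter</A>

<!-- Absolute link: an unrelated page on the same server. -->
<A HREF="/hr/holidays.html">Company holidays</A>

<!-- Fully qualified URL: a page on another server. -->
<A HREF="http://www.example.com/docs/index.html">External documentation</A>
```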