HTML and CSS Standards
Webpage Analysis Criteria
Below are the criteria by which I judge the quality of a site's HTML/CSS and its accessibility.
The analysis starts with numerical counts of various features and then evaluates the meaning
of those numbers within the given context.
There are few absolutes, but there should be neither validation
errors nor <font> tags.
The whole purpose of the HTML/CSS distinction is to separate structure from presentation through
judicious use of CSS in combination with well-structured HTML. HTML provides a semantic and
logical structure to the file while the CSS provides the appearance you want to achieve. The
separation also makes it possible to change the appearance of an element throughout a site from
one location, the CSS file(s), rather than having to edit each HTML file or some collection
of include files. This separation also simplifies meeting accessibility standards.
For my own part, all code will validate as correct HTML and CSS and will address accessibility
issues. I use XHTML 1.0 Strict unless otherwise required. This most restrictive implementation
path ultimately leads to the most flexible, accessible, and easily maintainable code.
There is a glossary at the end of this document in case any term,
especially an acronym, is unfamiliar. Glossary terms are italicized and underlined with red dots
on first appearance.
HTML
- DOCTYPE:
- Is there a DOCTYPE declaration and, more
importantly, does the code conform to it? Very often there is a declaration of HTML 4.01
but then the code includes XHTML tag closers on contentless tags (e.g., link, input)
or vice-versa. This throws the validator into a tizzy, producing spurious
errors and masking real ones. Also, is it a valid declaration so browsers use standards
mode instead of quirks mode rendering?
- Validation errors:
- With standards well defined and agreed upon, everyone should strive to meet those standards.
Adherence improves accessibility and uniformity across platforms and
user agents. It also reduces the chance of user agents giving
wildly different interpretations to the applied CSS (I've seen some really screwy looking
material when code violated standards).
- There are some recent sophisticated developments in accessibility which use techniques
which don't validate under the present system. Those are the only exceptions to
validation that should be accepted.
- <font> tags:
- Totally unnecessary with CSS and a heavy-handed and inflexible way of controlling text.
CSS has more ways to achieve more effects.
See HTML/CSS vs. the Browsers and W3C QA Tips.
- <table> tags:
- Generally superseded by <div> and CSS for layout purposes.
CSS is far more flexible and allows the HTML to be structured in a logical fashion
for people who use alternate access methods (e.g., people with disabilities);
tables often present material in an illogical order. Any page with more than five tables
for layout is using them badly, and even five is excessive. I have also seen
a single table used badly. Tabular data should take full advantage of <th>,
<colgroup>, and other structural markup, though colgroup, thead,
etc. are not well supported in CSS by current browsers.
We hope that will change in the new releases.
- For a demonstration of the flexibility provided by abandoning tables for layout purposes
see CSS Zen Garden. In the right-hand column you
will find alternate views of the same material—exactly the same material
since there is no difference in the HTML; the apparent changes are in the applied CSS.
There are over 200 designs the webmaster has thought worthy of posting—more than
1000 were rejected. My favorite is
“Mozart”,
number 189, and another good one for its simplicity and clarity is
“CSS Co.,
Ltd.”, number 209.
- Improper inter-element spacing:
- Using <br>, &nbsp;, and spacer images is also made obsolete
by CSS. Margins and padding can be controlled better and margins can be negative
to achieve visual effects like these outdented paragraphs
(effect achieved here through a semantic/HTML method).
- <div> tags:
- The preferred structural element, but can also be over-used with multiple nesting levels
just because someone didn't bother to think out structure (tables redux).
Machine generation (e.g., PHP, ColdFusion) often prompts people to stop thinking
and overload a page (especially when using tables but also with divs).
- <h#> tags:
- Should be used to give structure to the page and never used solely to size text. They are
needed only if the page has a relevant structure. If used, they must start with h1 and go in
proper nesting order. They serve the same purpose as an outline, but the
Web page includes the text that fills in the outline.
Accessibility standards encourage their use (at least <h1>),
a blind informant says he relies heavily on levels of headings, and a
survey confirms their common use.
- For instance, this page uses “HTML and CSS Standards” as its
<h1> and the page title, “Webpage Analysis Criteria”,
as its <h2>. It then also has several <h3> and
<h4> tags to organize subsidiary portions.
The size and other display characteristics are controlled by CSS.
- <ul> tags:
- Should be used for all menus and for anything else that looks like a list. The fact that
the menu is horizontal or you don't want bullets is irrelevant; we're talking structure
(semantics), not appearance. CSS performs the bulletless horizontal magic (see
Zephyr Press for a horizontal menu—top and bottom—and
MGA or
Wheelchair Mobility for a vertical menu whose buttons are solely CSS creations). See
list tips at W3C for more
information and resources.
- Forms:
- Is the form restricted in scope to its place of use? People commit one of two form sins—enclose
the whole page in a form even though the actual form is only a small part of the page or
break the form across multiple structural elements. Avoiding the latter by committing the
former is not a solution. A form whose only purpose is to present a search box should consist
of little more than the associated input tags. Feedback and data entry forms may require
extensive structural elements within the form, but not the whole page. Placing a form where
the syntax actually allows one also seems to be a challenge.
- Liquid design:
- Is the website of fixed width or does it adjust to the window size? A site that looks great
on a 1280px wide screen may require horizontal scrolling on a 1024px screen. Now that screens
are wide enough, many people want to have two windows open side-by-side. Fixed width sites
make this difficult because they don't adjust to the window size. Most of the
Wikipedia pages can be scrunched down to
less than 500px before they become unusable. See
Web Matters for a more complete explanation with examples.
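The numerical side of the HTML criteria above can be roughed out with a short script. What follows is a minimal sketch under stated assumptions, not a definitive analyzer: it uses only Python's standard html.parser module, and the names TagCounter, count_tags, and WATCHED, as well as the choice of which tags to tally, are hypothetical and picked for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

# Tags tallied by this sketch. The selection is an assumption based on
# the criteria discussed above, not a fixed standard.
WATCHED = {"font", "table", "div", "br", "ul", "form"}


class TagCounter(HTMLParser):
    """Count watched start tags and track the deepest <div> nesting."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()
        self.div_depth = 0
        self.max_div_depth = 0

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        if tag in WATCHED:
            self.counts[tag] += 1
        if tag == "div":
            self.div_depth += 1
            self.max_div_depth = max(self.max_div_depth, self.div_depth)

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1


def count_tags(html):
    """Return (tag counts as a dict, maximum <div> nesting depth)."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return dict(parser.counts), parser.max_div_depth


SAMPLE = """<html><body>
<div><div><p>Hello<br>world</p></div></div>
<table><tr><td><font size="2">old</font></td></tr></table>
<ul><li>Home</li></ul>
</body></html>"""

counts, depth = count_tags(SAMPLE)
print(counts, depth)
```

The counts are only the starting point. As noted at the top, the numbers still have to be interpreted in context: one table holding tabular data is fine, while one table used for layout is not.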
CSS
- Validation errors:
- Not generally a severe problem, but sometimes people invent values or even properties or
use a value that's not valid for that property. There are also several proprietary or
proposed properties that are used, but then cross-browser behaviour is difficult to control.
From CSS Zen Garden—“The only real requirement we have is that your CSS validates.”
- Efficiency/readability:
- This is where WYSInWYG tools truly shine in their stupidity.
I have seen rules like elementx { border-top: 1px solid red; padding-bottom: 4px;
padding-top: 4px; margin-right: 7px; border-left: red 1px solid; margin-left: 7px;
border-bottom: 1px solid red; margin-top: 7px; border-right: solid 1px red;
padding-left: 4px; padding-right: 4px; margin-bottom: 7px; }.
I've seen this sort of thing many times,
often in one seemingly endless and unreadable line, as above; it is not rare
(and often font or other information is included randomly just to add to the confusion).
If someone actually wants to read the code without having to feed it into some compatible
WYSInWYG tool, it will take many minutes to figure out that the rule says
elementx { margin: 7px; border: 1px solid red; padding: 4px; }.
And the first way probably isn't even the right way
when there are differences in the TRBL values.
- I format my CSS so it's easier to read than a single line,
making the rule above appear in my files as
    elementx {
        margin: 7px;
        border: 1px solid red;
        padding: 4px;
    }
- External vs. page-level vs. inline:
- As much as possible CSS should be put in external files for sharing across pages. A typical
page should have no inline styles and only a minimum of page-level styles. Inline styles
should be used only when there is a single usage of that style and it is unlikely to be
needed by any other element. Home pages are often different from the rest of a
site; a few page-level styles are okay but extensive CSS should be moved to a home page-specific
CSS file. Remember, an HTML page can reference as many CSS files as required to do the job
and one CSS file can reference others to help provide some coherent structure. I hate trying
to read through CSS files that are 20+ screens long (there are a lot of them) just to find
the code that applies to some limited portion of a page. Some of those bloated CSS files
are a result of the efficiency issue mentioned above, but not all.
- As an example of multiple CSS files, King's College London
uses a different colour scheme for each major portion of the
Website—Undergraduate,
Graduate, Research, etc. This is effected by the invocation of different external
CSS files whose only function is to control issues surrounding colour (background, border,
associated images, etc.). Other CSS files create a consistent appearance across
the whole Website. The same is true of
Graficsmiths, a site I restructured from using frames and other outdated techniques.
These Standards pages also mix and match CSS files to achieve the desired effect.
- Text size:
- Should be specified relatively rather than absolutely, so it can be resized by users (see
HTML/CSS vs. the Browsers
for basic type information and two scalable methods).
- Class/id names:
- Should be chosen to reflect the function of the matter covered and not its appearance (e.g.,
bad: class="bluebox"; good: id="special-note").
Some things, like class and id name choices, cannot be quantified. Another is where to place
rules and yet another is how much repetition to tolerate. These are judgment calls where I choose
the highest level that can easily be controlled. For example, people often specify the same
font-family for all paragraphs, headings, and table data when it could be specified once in the body
rule. (The WGBH home page specifies font-family twenty times, but at no
point does it use any other font-family, even the default.)
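Repetition of this kind is easy to measure roughly. The sketch below counts how often one property is declared in a stylesheet. It is a naive text match rather than a real CSS parser, and the function name count_property and the sample stylesheet are assumptions made for illustration.

```python
import re


def count_property(css, prop):
    """Count declarations of `prop` in a stylesheet using a naive text match."""
    # Strip /* ... */ comments so commented-out rules are not counted.
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)
    # Require that the property name is not the tail of a longer name
    # (for example, a vendor-prefixed property).
    pattern = re.compile(r"(?<![\w-])" + re.escape(prop) + r"\s*:", re.IGNORECASE)
    return len(pattern.findall(css))


# Hypothetical stylesheet showing the repetition described above.
SAMPLE_CSS = """
p  { font-family: Arial, sans-serif; }
h1 { font-family: Arial, sans-serif; }
td { font-family: Arial, sans-serif; }
/* body { font-family: Arial, sans-serif; } */
"""

print(count_property(SAMPLE_CSS, "font-family"))
```

A high count does not prove bad CSS by itself. If every declaration carries the same value, though, a single declaration in the body rule would probably do the same job.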
A large CSS file is not, by itself, an indication of good CSS usage. Very often CSS is over-specified
and underutilized. Bloating occurs from things mentioned already as well as creating many more
classes than needed. BBC news has at least seven CSS files,
all large, and I couldn't find the
organizing principle. One class name can occur in multiple files, making it frighteningly difficult
to find the rule that applies at any given moment. Yet many such sites still have a
<body> tag that specifies the non-standard attributes of marginheight,
marginwidth, leftmargin, and topmargin—which are correctly handled in CSS.
There is also an issue called “liquid design”, which refers to building a site so it
adjusts to the user's window width. As screens get wider (1280 or 1600 pixels),
people often want two windows visible at the same time. Narrow windows with a fixed-width
site often end up forcing horizontal scrolling—a true inconvenience most of the time.
For the “how to” basics of CSS usage check A CSS Quick Reference.
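Returning to the efficiency point above, collapsing four longhand margin or padding declarations into one shorthand value follows the TRBL order mechanically. The sketch below assumes the declarations have already been read into a dictionary. The names SIDES and collapse_box are hypothetical, and border shorthand is deliberately left out because it combines width, style, and color rather than four sides.

```python
SIDES = ("top", "right", "bottom", "left")  # TRBL order


def collapse_box(prop, decls):
    """Return the shorthand value for `prop` ("margin" or "padding").

    `decls` maps longhand property names to values. Returns None when any
    of the four sides is missing, since the shorthand would then reset
    sides the original rule left untouched.
    """
    try:
        top, right, bottom, left = (decls[f"{prop}-{side}"] for side in SIDES)
    except KeyError:
        return None
    if top == right == bottom == left:
        values = [top]                       # one value: all four sides
    elif top == bottom and right == left:
        values = [top, right]                # two values: vertical, horizontal
    elif right == left:
        values = [top, right, bottom]        # three values: top, sides, bottom
    else:
        values = [top, right, bottom, left]  # four values: full TRBL
    return " ".join(values)


# The margin declarations from the machine-generated rule quoted earlier.
generated = {
    "margin-top": "7px", "margin-right": "7px",
    "margin-bottom": "7px", "margin-left": "7px",
}
print(collapse_box("margin", generated))
```

Comparing the values as plain strings keeps the sketch simple. It will not notice that 0 and 0px are equivalent, so treat the result as a suggestion to verify, not a guaranteed minimal form.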
JavaScript
As with CSS, as much JavaScript as possible (ideally all functions) should be in external files to
reduce clutter and, through caching, load times, leaving only the function invocations (onload, etc.)
in the HTML. JavaScript also needs a fallback for user agents that don't recognize JavaScript or have
it turned off. Very often people use JS for menus or even create the menu with JS, making navigation
of a site impossible for users of such agents.
See J Korpela
for one example of how to fix a common problem and see the rest of the page for more JavaScript
advice—including having the introduction say “Specifically, one should never rely
on JavaScript alone in the processing of data entered by user” [my emphasis].
That being said, the Web is constantly evolving and where it started out as an alternate publishing
medium, it has recently also acquired the function of an alternate application platform. Instead
of writing a document in MSWord or some other word processor and then sending copies (print
or electronic) to interested parties, it is now possible to write the document on a word processor
accessed through the web, have it immediately available to others, and to allow them to contribute
to or modify/edit it themselves. You can also create and submit forms whose contents vary according
to initial and evolving conditions, where JavaScript changes the page dynamically, without going
back to the server. I don't think the standards have caught up with this situation.
Flash
Flash is almost universally hated by accessibility and usability people. “in the usability
field, we've learned that more technical capabilities and a broader set of design options
usually translate into more rope for hanging the users.”
(http://www.useit.com/alertbox/20021125.html). Flash isn't accessible unless people make
a special effort; since most make no effort on the website, why should they make an effort with Flash?
One guide says the content is in the HTML, the layout is in the CSS, and JS and Flash are
decoration only. One site I know (which means there must be more) uses JS to create the menu
on the client and is thus unusable by the blind, etc. Any number of sites are made
solely in Flash.
Accessibility
The W3C Introduction to Web Accessibility
and its referenced pages answer more questions than I ever dreamed of asking.
It provides the rationale and methods for creating accessible pages.
The Web Accessibility Initiative (WAI) is the
W3C set of “Strategies, guidelines, resources to make the Web
accessible to people with disabilities.” The method of achieving accessibility is set out
in the Web Content Accessibility Guidelines (WCAG 1.0
and 2.0).
WCAG 1.0 consists of fourteen guidelines, each with several checkpoints which are grouped into
three priority levels. Some of these checkpoints can be checked with automated tools while others
must be checked manually. Version 2.0 is similar, but updated to reflect more recent changes in Web
practices and feedback on 1.0.
The U.S. government has standards set forth in Section 508
that government sites and contractors are supposed to follow. They are similar to WAI, but not
as rigorous. The U.K.
also has its own set of standards, as do other governments.
"Since validation is the first step towards ensuring accessibility" (http://www.w3.org/WAI/AU/reviews/homesite#gl4),
simply converting from the old, table-based structure to modern structure and validating
the code will reduce the number of accessibility errors and warnings significantly. For instance,
the alt attribute for images is required for both validation and WCAG. In addition,
WCAG wants the contents of the alt attribute to be meaningful; that has to be a manual check.
Eliminating the use of images as spacers thus eliminates all associated errors and warnings.
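The automated half of the alt check is simple enough to sketch. The code below, using only Python's standard html.parser module, lists images that have no alt attribute at all and, separately, images whose alt text is empty. The names AltChecker and check_alt are hypothetical. Whether the alt text that is present is meaningful remains a manual check, as noted above.

```python
from html.parser import HTMLParser


class AltChecker(HTMLParser):
    """Collect <img> tags that lack an alt attribute or have an empty one."""

    def __init__(self):
        super().__init__()
        self.missing = []  # src of images with no alt attribute at all
        self.empty = []    # src of images with alt="" (review by hand)

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        src = attrs.get("src") or "(no src)"
        if "alt" not in attrs:
            self.missing.append(src)
        elif not (attrs["alt"] or "").strip():
            self.empty.append(src)


def check_alt(html):
    """Return (images missing alt, images with empty alt)."""
    checker = AltChecker()
    checker.feed(html)
    checker.close()
    return checker.missing, checker.empty


SAMPLE = (
    '<img src="spacer.gif">'
    '<img src="rule.gif" alt="">'
    '<img src="logo.gif" alt="Company logo">'
)
missing, empty = check_alt(SAMPLE)
print("missing alt:", missing)
print("empty alt:", empty)
```

The two lists are kept apart on purpose. A missing alt attribute is a validation error, while an empty one can be a deliberate choice for a purely decorative image and only needs a human look.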
Several online tools make it relatively easy to find and fix many errors. I use
ATRC,
FAE,
Basic Checker, and
Cynthia Says at various times.
There are others and if you find one you particularly like, please let me know. I also use
Web Developer toolbar
to easily turn off CSS and JS, validate pages, and other things.
Unfortunately, though I am intensely interested in accessibility and test all pages, I will never
post a claim of passing any particular standard. The problem is that few standards agree with
each other and some even violate HTML logic, while others are so obtusely
written that I can't understand them.
- Skip links:
- Generally agreed that they're needed; disagreement on how to implement them. Should they be
visible or invisible? If invisible, how can they be hidden without adversely affecting one audience
or another? Should the link be "skip to content" or "skip to navigation", with the page structured
accordingly?
- Access keys:
- Some people use them extensively, even for things I never dreamed of, while others say flat out,
don't use them. My blind informant doesn't use them, but someone with limited mobility and
who can't use a mouse probably would find them useful.
- Headings:
- (Also see the notes under HTML.) Some people advocate using h2 to mark menus,
even if one or more precede the h1. Most accessibility checkers would flag this as an error.
- WCAG 1.0 vs. WCAG 2.0 vs. Section 508 vs. …
- I had just gotten used to working with WCAG 1.0 when WCAG 2.0 was released. The two are not
completely incompatible, but there are some significant changes. And other people have
other ideas. Section 508 is too lax, so I do what I think gives a reasonable
result—unless the client has specific requirements.
- Text size:
- Many advocates recommend setting body { font-size: 100%; }. The laudable intent
is based on the principle that users set their preferred size in their own stylesheet and
any adjustments are based on that. Unfortunately, users rarely do so (my computer
science PhD nephew is the only person I know who does), and most Web
authors are now graphic artists, rather than techies, who never look at the code because
they're using a WYSInWYG tool.
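The heading-order rule mentioned under Headings (start with h1, then nest without skipping levels) is one of the checks that lends itself to automation. The sketch below is an assumption about how such a check could work, not a description of any particular accessibility checker. The names HeadingOrder and heading_problems are hypothetical.

```python
import re
from html.parser import HTMLParser


class HeadingOrder(HTMLParser):
    """Record the level of each h1..h6 tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        match = re.fullmatch(r"h([1-6])", tag)
        if match:
            self.levels.append(int(match.group(1)))


def heading_problems(html):
    """Return a list of human-readable heading-order problems."""
    parser = HeadingOrder()
    parser.feed(html)
    parser.close()
    levels = parser.levels
    problems = []
    if levels and levels[0] != 1:
        problems.append(f"first heading is h{levels[0]}, not h1")
    # Going deeper by more than one level at a time skips part of the outline.
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            problems.append(f"h{prev} is followed by h{cur}, skipping a level")
    return problems


# A menu marked with h2 ahead of the h1, as some people advocate.
print(heading_problems("<h2>Menu</h2><h1>Title</h1><h2>Section</h2>"))
```

Moving back up the outline (an h2 after an h4, say) is treated as normal here, since that is simply the next section starting.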
Graphs
Page graph
Websites_as_Graphs
This tool graphs the tags on an individual page within a website, despite its name. It creates
a tree of color-coded nodes that gives some idea of how the page is put together. For instance,
lots of red says table-based structure and green indicates div-based. Lots of nodes off the
body tag indicate a probable lack of structure. All elements within a form should be clustered
together (apparently harder than one might think). I've also seen pages where the form tag encloses
everything (<body><form>…</form></body>), even though only a few lines,
if any, are the real form. Lots of images may indicate their use as spacers.
I'd like to see an additional color for lists, since they should be a strong structural element.
Not all table tags get colored red; caption, th, thead, tbody, tfoot, col, and colgroup are omitted.
Unbranched chains of red or green indicate nesting that is probably not well thought out and
therefore unnecessary.
A reduction in the number of nodes almost invariably means a gain in clarity of structure and
with it, easier maintenance and modification.
What do the colors mean?
- black: the html tag, the root node
- green: the div tag
- red: tables (table, tr, and td
tags; not th, tbody, colgroup, etc.)
- orange: paragraphs, line breaks, and block quotes (p,
br, and blockquote tags)
- blue: links (the a tag)
- violet: images (the img tag)
- yellow: forms (form, input,
textarea, select, and option tags)
- gray: all other tags
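The color key above amounts to a lookup table from tag name to color, with gray as the default. The sketch below applies that table to a page and tallies start tags by color. It only approximates the proportions the graph would show and makes no claim about how the graphing tool itself is implemented. The names COLOR_OF, ColorTally, and tally_colors are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Tag-to-color mapping taken from the key above; anything else is gray.
COLOR_OF = {
    "html": "black",
    "div": "green",
    "table": "red", "tr": "red", "td": "red",
    "p": "orange", "br": "orange", "blockquote": "orange",
    "a": "blue",
    "img": "violet",
    "form": "yellow", "input": "yellow", "textarea": "yellow",
    "select": "yellow", "option": "yellow",
}


class ColorTally(HTMLParser):
    """Tally start tags by the color the key assigns to them."""

    def __init__(self):
        super().__init__()
        self.colors = Counter()

    def handle_starttag(self, tag, attrs):
        self.colors[COLOR_OF.get(tag, "gray")] += 1


def tally_colors(html):
    """Return a dict mapping color name to number of start tags."""
    parser = ColorTally()
    parser.feed(html)
    parser.close()
    return dict(parser.colors)


SAMPLE = (
    "<html><body><div><table><tr><th>Head</th><td>"
    "<a href='#'><img src='x.gif' alt=''></a>"
    "</td></tr></table></div><form><input name='q'></form></body></html>"
)
print(tally_colors(SAMPLE))
```

In the sample, th lands in the gray tally along with body, which matches the observation above that not all table tags get colored red.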
node size (px) ==> # nodes
- 27 ==> 38, 83
- 17 ==> 227 ?
- 15 ==> 130
- 10 ==> 230
- 9 ==> 310, 340
- 8 ==> 370
- 7 ==> 410
- 6 ==> 620,~840
Site graph
Glossary
- <…>:
- Material enclosed between angle brackets constitutes one of many HTML tags and associated
attributes which control what appears on the computer screen.
- Accessibility:
- The concept that Web pages should be structured and constructed in such a way that they
are available to the widest range of people possible regardless of access method. Some accommodations
are directed at hand-held devices or text-only browsers. Others address disabilities ranging
from color-blindness to cognitive and physical impairments (perhaps 10% of the U.S. population).
- CSS:
- Cascading Style Sheets—a tri-level system of applying rules to control the appearance of
Web pages. The rules consist of one or more property/value pairs and can be applied to multiple
pages with external files, to a single page, or to a single tag.
A CSS Quick Reference gives the basic outline for use.
- DOCTYPE:
- A formal statement at the beginning of a conforming document of its Document Type Definition
(DTD), a rigorous specification for a language so the user agent knows how to treat what
follows. Failure to include a DOCTYPE leaves the user agent to guess at what parsing rules
to use and how best to display the document.
- Browsers operate in “standards mode” or “quirks mode”
depending on whether a correct DOCTYPE is present. Quirks mode imitates
the older, nonstandard behaviors that standards mode does not reproduce.
- HTML:
- HyperText Markup Language—the basic language for writing pages that appear on the World
Wide Web (WWW). This includes XHTML (eXtensible HTML), which is an application of XML (eXtensible
Markup Language), a more rigorous definition of how a markup language should be structured.
Until HTML version 4.0 the language did not have a clear definition that most players accepted
and agreed would be the basis for browser and other user agent development.
- Tag (W3C often uses “element” to refer to a tag):
- The basic structural element of HTML which may include several attribute/value pairs to
more precisely control its effect on a Web page.
- TRBL:
- Top, Right, Bottom, Left (TRouBLe—i.e., stay out of trouble by following this sequence);
the sequence for interpreting CSS shorthand properties. For example, the rule img { margin-left:
5px; margin-right: 2px; margin-top: 10px; margin-bottom: 0px; } is more simply and clearly
written as img { margin: 10px 2px 0px 5px; }.
- User agent:
- Any device through which a person accesses the Web, whether it be one of the standard browsers,
a handheld device (PDA, cell phone), a text-only browser, or screen reader or tactile device
for the blind (list not exhaustive).
- Validation:
- The process of measuring HTML or other code against a precise syntactic definition or other
specification of a standard.
- W3C:
- World Wide Web Consortium—the body responsible for setting
standards for the Web, e.g., HTML and CSS. Its members constitute various
stakeholders in the Web.
- Web or WWW:
- Shorthand for World Wide Web. (WWW is sometimes spoken as “dub-dub-dub”
to avoid having to say so many syllables.)
- Website:
- The collection of pages (one or many; static or dynamic) which originate at a single page
(generally designated the home page) which is itself uniquely identified by a WWW domain
name (e.g., NPR.org).
- WYSInWYG:
- What you see is NOT what you get—my reformulation of the usual WYSIWYG (What You See Is
What You Get) description of a tool that, unlike original computer tools, purported, like
MSWord, to immediately reflect the appearance of the final product. A WYSInWYG tool, on
the other hand, mimics its namesake but has no hope of actually fulfilling that mission
because of internal and external constraints beyond its control. Any visual web authoring
tool is WYSInWYG because it uses an internal browser which is of necessity different from
all outside browsers.
- There is also another similar formulation called WYSINWOG—What You See Is Not What Others
Get. Again, the reason is that each browser interprets the code differently and not everyone
is using the standard visual browsers. That's why disciplined use of modern methods and
standards is necessary.