[SGML] the NET shorttag
2005-01-27
Now that we're all validating our documents, can you tell me why this document (view source) validates and this document (view source) does not? (Hint: you might want to take a look at the parse tree.)
And why are people saying that XHTML served as text/html should show ">" characters all over the place?
The answer is the SGML NET (null end tag) shorttag. In SGML it is legal to write <em/bla/ instead of <em>bla</em>. So when you use XHTML constructs like <br /> in SGML (read: XHTML documents served as text/html), the tag should close when the parser encounters the "/", and the ">" remains. That ">" is then treated as character data and shown. Most browsers do not do this, however, which is lucky for the unaware.
Now you know what is going on with my two documents. The a start tag in the first document closes as soon as the parser encounters the first "/" in the URL. The parser then reads empty element content and closes the element at the second "/". Everything after that is treated as character data, including the bogus attribute. So the first document is valid.
The second document does not validate for the same reason: the a element is already closed when the parser reaches </a>, so an error occurs.
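To make the parse above easier to follow, here is a minimal sketch of a NET-aware tokenizer in Python. It is a toy model, not a real SGML parser: the function name tokenize, the event tuples, the small EMPTY_ELEMENTS set, and the example.org URL are all my own assumptions for illustration, and it ignores DTDs, comments, entities, and omitted tags.

```python
import string

# Characters allowed in names and unquoted attribute values (toy model).
NAME_CHARS = set(string.ascii_letters + string.digits + ".-_:")
# Small illustrative subset of elements declared EMPTY in HTML.
EMPTY_ELEMENTS = {"br", "hr", "img"}


def tokenize(text):
    """Return a list of events for a toy NET-aware parse of `text`."""
    events = []
    open_elements = []  # stack of (name, net_enabled)
    data = []
    i = 0

    def flush():
        # Emit any pending character data as one event.
        if data:
            events.append(("data", "".join(data)))
            data.clear()

    def read_name(pos):
        end = pos
        while end < len(text) and text[end] in NAME_CHARS:
            end += 1
        return text[pos:end], end

    while i < len(text):
        ch = text[i]
        if text.startswith("</", i):
            # Ordinary end tag: must match the innermost open element.
            flush()
            name, i = read_name(i + 2)
            name = name.lower()
            close = text.find(">", i)
            i = len(text) if close == -1 else close + 1
            if open_elements and open_elements[-1][0] == name:
                open_elements.pop()
                events.append(("end", name))
            else:
                events.append(("error", "unexpected </%s>" % name))
        elif ch == "<" and i + 1 < len(text) and text[i + 1] in string.ascii_letters:
            flush()
            name, i = read_name(i + 1)
            name = name.lower()
            attrs = {}
            net = False
            while i < len(text):
                if text[i].isspace():
                    i += 1
                elif text[i] == ">":
                    i += 1
                    break
                elif text[i] == "/":
                    # NET-enabling start tag: the tag ends right here.
                    net = True
                    i += 1
                    break
                else:
                    attr, i = read_name(i)
                    if not attr:
                        i += 1  # skip a character we do not model
                        continue
                    value = attr
                    if i < len(text) and text[i] == "=":
                        i += 1
                        if i < len(text) and text[i] in "\"'":
                            quote = text[i]
                            close = text.find(quote, i + 1)
                            if close == -1:
                                close = len(text)
                            value = text[i + 1:close]
                            i = min(close + 1, len(text))
                        else:
                            # Unquoted value stops at the first non-name character.
                            value, i = read_name(i)
                    attrs[attr.lower()] = value
            events.append(("start", name, attrs))
            if name in EMPTY_ELEMENTS:
                events.append(("end", name))  # no content, ends at once
            else:
                open_elements.append((name, net))
        elif ch == "/" and open_elements and open_elements[-1][1]:
            # Null end tag: closes the NET-enabled element.
            flush()
            events.append(("end", open_elements.pop()[0]))
            i += 1
        else:
            data.append(ch)
            i += 1
    flush()
    return events


if __name__ == "__main__":
    for sample in ("<em/bla/", "<br />", "<a href=http://example.org/ bogus=1>text</a>"):
        print(sample, tokenize(sample))
```

Feeding it an a start tag with an unquoted URL shows the href value stopping at "http:", the element closing at the second "/", the rest of the tag turning into character data, and a trailing </a> being reported as unexpected.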
This can be avoided by quoting attribute values. In fact, you are only allowed to omit the quotes if the attribute value contains only letters (a-z and A-Z), digits (0-9), hyphens (ASCII decimal 45), periods (ASCII decimal 46), underscores (ASCII decimal 95), and colons (ASCII decimal 58) (see W3C: attributes in HTML 4.01 on this).
Note that this is only a validation issue: commonly used browsers do not support the NET shorttag (Lynx does). See this example.
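The quoting rule is easy to check mechanically. The sketch below assumes two helper names of my own, may_omit_quotes and render_attribute. It is only an illustration of the character rule quoted above, not part of any validator.

```python
import re

# Letters, digits, periods, underscores, colons and hyphens only.
_UNQUOTED_OK = re.compile(r"[A-Za-z0-9._:-]+")


def may_omit_quotes(value):
    """True if `value` may appear unquoted under the HTML 4.01 rule."""
    return _UNQUOTED_OK.fullmatch(value) is not None


def render_attribute(name, value):
    """Write name=value, adding quotes whenever the rule requires them."""
    if may_omit_quotes(value):
        return "%s=%s" % (name, value)
    escaped = value.replace("&", "&amp;").replace('"', "&quot;")
    return '%s="%s"' % (name, escaped)


if __name__ == "__main__":
    print(render_attribute("type", "text"))
    print(render_attribute("href", "http://example.org/"))
```

A URL always contains a "/", so it fails the check and gets quoted, which is exactly what keeps the NET shorttag from kicking in. Quoting everything unconditionally is the simpler habit.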
