Specification

XHTML Elements

code/i stands for "an i element immediately within a code element". This notation is from XPath.

XHTML elements must be in the XHTML namespace, http://www.w3.org/1999/xhtml.

| XHTML | Docbook | Notes |
| --- | --- | --- |
| b, i, em, strong | emphasis | The role attribute is the original tag name. |
| dfn | glossitem, and also primary indexterm | |
| code/i, tt/i, pre/i | replaceable | In practice, i within monospace content is usually used to mean replaceable text. If you're using it for emphasis, use em instead. |
| pre, body/code | programlisting | |
| img | inlinemediaobject/imageobject/imagedata | In an inline context. |
| img | [informal]figure/mediaobject/imageobject/imagedata | If it has a title attribute or db:title it's wrapped in a figure. Otherwise it's wrapped in an informalfigure. |
| table | [informal]table | XHTML table becomes Docbook table if it has a summary attribute; informaltable otherwise. |
| ul | itemizedlist | But see the processing instruction below. |

Links

Table 1. Link Translation

| XHTML | Docbook | Notes |
| --- | --- | --- |
| <a name="name"> | <anchor id="{$anchor-id-prefix}name"> | An anchor within a hn element is attached to the enclosing section as an id attribute instead. |
| <a href="#name"> | <link linkend="{$anchor-id-prefix}name"> | |
| <a href="url"> | <ulink url="url"> | |
| <a href="mailto:address"> | <email>address</email> | |
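
The mapping in Table 1 can be sketched as a small Python function. This illustrates the table and is not html2db.xsl's implementation. The function name translate_link and its anchor_id_prefix parameter are hypothetical stand-ins for the $anchor-id-prefix stylesheet parameter. The input is assumed to be namespace-free for brevity, mailto links are assumed to be recognised from the href value, and the special case of an anchor inside a heading is not modelled.

```python
import xml.etree.ElementTree as ET


def translate_link(a: ET.Element, anchor_id_prefix: str = "") -> ET.Element:
    """Map an XHTML a element to a Docbook element (illustrative sketch only)."""
    href = a.get("href")
    name = a.get("name")

    if href is None:
        if name is None:
            raise ValueError("a element has neither href nor name")
        # <a name="name"> becomes an anchor with a prefixed id.
        return ET.Element("anchor", {"id": anchor_id_prefix + name})

    if href.startswith("mailto:"):
        # Assumption: a mailto: URL becomes an email element holding the address.
        out = ET.Element("email")
        out.text = href[len("mailto:"):]
        return out

    if href.startswith("#"):
        # Internal reference: point at the prefixed anchor id.
        out = ET.Element("link", {"linkend": anchor_id_prefix + href[1:]})
    else:
        # Any other URL becomes an external ulink carrying that URL.
        out = ET.Element("ulink", {"url": href})
    out.text = a.text
    return out
```

With a prefix of "doc-", an input of <a href="#intro">Intro</a> yields a link element whose linkend is "doc-intro", so internal references and the anchors they point to stay in step.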

Tables

XHTML table support is minimal. html2db.xsl changes the element names and counts the columns (this is necessary to get table footnotes to span all the columns), but it does not attempt to deal with tables in their full generality.

An XHTML table with a summary attribute generates a table, whose title is the value of that summary. An XHTML table without a summary generates an informaltable.

Any trs that contain ths are pulled to the top of the table and placed inside a thead. Other trs are placed inside a tbody. This matches the common XHTML table pattern, where the first row is a header row.
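
The table rules above can be sketched in Python as follows. This is a reading of the prose, not the stylesheet's code. The function name translate_table is hypothetical, the input is assumed to be namespace-free, and the tgroup, row and entry element names are assumed from the Docbook table model rather than taken from html2db.xsl's output. Columns are counted as the widest row, ignoring colspan.

```python
import xml.etree.ElementTree as ET


def translate_table(xhtml_table: ET.Element) -> ET.Element:
    """Sketch of the summary and header-row rules (illustrative only)."""
    summary = xhtml_table.get("summary")
    # A summary attribute selects table (with a title); otherwise informaltable.
    out = ET.Element("table" if summary is not None else "informaltable")
    if summary is not None:
        ET.SubElement(out, "title").text = summary

    rows = list(xhtml_table.iter("tr"))
    # Rows containing a th are pulled to the top; the rest form the body.
    header_rows = [tr for tr in rows if tr.find("th") is not None]
    body_rows = [tr for tr in rows if tr.find("th") is None]

    # Assumption: the column count is the number of cells in the widest row.
    cols = max((len(tr) for tr in rows), default=0)
    tgroup = ET.SubElement(out, "tgroup", {"cols": str(cols)})

    for section_name, group in (("thead", header_rows), ("tbody", body_rows)):
        if not group:
            continue
        section = ET.SubElement(tgroup, section_name)
        for tr in group:
            row = ET.SubElement(section, "row")
            for cell in tr:
                # Only the cell's leading text is copied in this sketch.
                ET.SubElement(row, "entry").text = cell.text
    return out
```

Note that a header row is moved to the thead even when it is not the first row in the source, which is the "pulled to the top" behaviour described above.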

Implicit Blocks

XHTML allows li, dd, and td elements to contain either inline text (for instance, <li>a list item</li>) or block structure (<li><p>a block</p></li>). The corresponding Docbook elements require block structure, such as para.

html2db.xsl provides limited support for wrapping naked text in these positions in para elements. If a list item or table cell item directly contains text, all text up to the position of the first element (or all text, if there is no element) is wrapped in para. This handles the simple case of an item that directly contains text, and also the case of an item that contains text followed by blocks such as paragraphs.

Note that this algorithm is easily confused. It doesn't distinguish between block and inline XHTML elements, so it will only wrap the first word in <li>some <b>bold</b> text</li>, leading to badly formatted output. The workaround is to wrap troublesome content in explicit <p> tags.
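
The wrapping rule and its limitation can be reproduced with a short Python sketch. It models only the behaviour described above and is not taken from html2db.xsl. The helper names wrap_leading_text and convert are hypothetical, and the item's own tag is left untranslated to keep the focus on the para wrapping.

```python
import copy
import xml.etree.ElementTree as ET


def wrap_leading_text(item: ET.Element) -> ET.Element:
    """Wrap the text that precedes the first child element in a para."""
    out = ET.Element(item.tag, dict(item.attrib))
    # ElementTree keeps the text before the first child element in .text.
    if item.text and item.text.strip():
        ET.SubElement(out, "para").text = item.text
    # Children are copied unchanged: block and inline elements are not
    # distinguished, which is the weakness described above.
    for child in item:
        out.append(copy.deepcopy(child))
    return out


def convert(xhtml: str) -> str:
    """Parse one item, apply the rule, and serialize the result."""
    return ET.tostring(wrap_leading_text(ET.fromstring(xhtml)), encoding="unicode")
```

Under this sketch, convert("<li>a list item</li>") gives <li><para>a list item</para></li>, while convert("<li>some <b>bold</b> text</li>") gives <li><para>some </para><b>bold</b> text</li>: only the first word lands in the para. Wrapping the content in an explicit <p> leaves the item with no leading text, so nothing is wrapped incorrectly.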

Docbook Elements

Elements from the Docbook namespace are passed through as is. There are two ways to include a Docbook element in your XHTML source:

Global prefix

A fake Docbook namespace[2] declaration may be added to the document root element. Anywhere in the document, the prefix from this namespace declaration may be used to include a Docbook element. This is useful if a document contains many Docbook elements, such as footnote or glossterm, interspersed with XHTML. (In this case, however, it may be more convenient to allow these elements in the XHTML namespace and add a customization layer that translates them to Docbook elements. See Customization.)

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:db="urn:docbook">
  ...
  <p>Some text<db:footnote>and a footnote</db:footnote>.</p>

Local namespace

A Docbook element may be introduced along with a prefix-less namespace declaration. This is useful for embedding a Docbook document fragment (a hierarchy of elements that all use Docbook tags) within an XHTML document.

  ...
  <articleinfo xmlns="urn:docbook">
    <author>
      <firstname>...</firstname>
  ...

The source to this document illustrates both of these techniques.
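
To a namespace-aware parser the two techniques are equivalent: both put the element in the urn:docbook namespace. The Python sketch below checks this with the standard library parser. The sample document is hypothetical (the placement of articleinfo and the name "Ada" are made up for illustration), and it says nothing about how html2db.xsl itself matches these elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical document combining the global-prefix and local-namespace styles.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml" xmlns:db="urn:docbook">
  <body>
    <articleinfo xmlns="urn:docbook">
      <author><firstname>Ada</firstname></author>
    </articleinfo>
    <p>Some text<db:footnote>and a footnote</db:footnote>.</p>
  </body>
</html>"""

root = ET.fromstring(DOC)

# ElementTree spells qualified names as {namespace-uri}localname.
docbook_tags = [el.tag for el in root.iter() if el.tag.startswith("{urn:docbook}")]
xhtml_tags = [el.tag for el in root.iter()
              if el.tag.startswith("{http://www.w3.org/1999/xhtml}")]

for tag in docbook_tags:
    print(tag)
```

The prefix-less declaration on articleinfo is inherited by its descendants, so author and firstname are in the Docbook namespace as well, while p stays in the XHTML namespace.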

Note

Both these techniques will cause your document to be invalid as XHTML. In order to validate an XHTML document that contains Docbook elements, you will need to create a custom schema. Technically, you then ought to place your document in a different namespace, but this will cause html2db.xsl not to recognize it!

Output Processing Instructions

html2db.xsl adds a few processing instructions to the output file. The Docbook XSL stylesheets ignore these, but if you write a customization layer for Docbook XSL, you can use the information in them to customize the HTML output. This can be used, for example, to give the a elements in the HTML files that Docbook XSL creates the same onclick and target attribute values they had in the input document.

<?html2db attribute="name" value="value"?>

Placed inside a link element to capture the value of the a target and onclick attributes. name is the name of the attribute (target or onclick), and value is its value, with " and \ replaced by \" and \\, respectively.
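
The escaping rule can be sketched in Python, together with a reader for the instruction's data. This is an illustration of the rule as stated, not code from html2db.xsl. The function names and the regular expression are hypothetical, and the reader assumes the two pseudo-attributes appear in the order shown above, separated by whitespace.

```python
import re


def escape_pi_value(value: str) -> str:
    """Replace \\ with \\\\ and " with \\" (backslashes first)."""
    # Escaping backslashes first keeps the backslashes added for quotes
    # from being doubled again.
    return value.replace("\\", "\\\\").replace('"', '\\"')


def unescape_pi_value(escaped: str) -> str:
    """Undo escape_pi_value by dropping the backslash before each escaped character."""
    return re.sub(r"\\(.)", r"\1", escaped, flags=re.DOTALL)


# attribute="..." value="...", where the value may contain \" and \\ pairs.
_PI_DATA = re.compile(
    r'attribute="(?P<name>[^"]*)"\s+value="(?P<value>(?:[^"\\]|\\.)*)"',
    re.DOTALL,
)


def parse_attribute_pi(data: str):
    """Return (name, value) from the PI data, or None if it has another form."""
    match = _PI_DATA.fullmatch(data.strip())
    if match is None:
        return None
    return match.group("name"), unescape_pi_value(match.group("value"))
```

A customization layer written in XSLT would need equivalent string handling. The point of the sketch is that the value can contain quotes safely only because the reader treats a backslash and the character after it as a unit.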

<?html2db element="br"?>

Represents the location of an XHTML br element in the source document.

You can also include <?db2html?> processing instructions in the HTML source document, and they will be copied through to the Docbook output file unchanged (as will all other processing instructions).



[2] The fake Docbook namespace is urn:docbook. Docbook doesn't really have a namespace, and if it did, it wouldn't be this one. See Docbook namespace for a discussion of this issue.