RFC2629 through XSLT                                       J. F. Reschke
                                                              greenbytes
                                                              April 2005


Transforming RFC2629-formatted XML through XSLT



1. Introduction

This document describes a set of XSLT transformations that can be used to transform RFC2629-compliant XML (see [RFC2629]) to various output formats, such as HTML and PDF.


2. Supported RFC2629 elements

rfc2629.xslt supports all RFC2629 grammar elements as well as the extensions implemented in xml2rfc 1.21.

2.1 Extension elements

In addition, rfc2629.xslt supports a set of extension elements, using elements and attributes in the namespace "http://greenbytes.de/2002/rfcedit".

Note that these extensions are experimental. Please email the author in case you're interested in using these extensions.


3. Processing Instructions

All PIs can be set as XSLT parameters as well. A parameter overrides any value found in the source file being transformed.

Using processing instructions:

<?rfc toc="yes"?>
<?rfc-ext support-rfc2731="no"?>

Using XSLT parameters:

saxon foo.xml rfc2629.xslt xml2rfc-toc=yes \
  xml2rfc-ext-support-rfc2731=no > result.html
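
The override rule described above can be illustrated with a small Python sketch. It is not part of rfc2629.xslt: the function name effective_settings and the regular expression are assumptions made for illustration, and only PIs with the target "rfc" and a single pseudo-attribute are handled.

```python
import re

# Matches <?rfc name="value"?> PIs carrying a single pseudo-attribute.
# PIs with the target "rfc-ext" are deliberately not matched.
PI_PATTERN = re.compile(r"""<\?rfc\s+([\w-]+)\s*=\s*["']([^"']*)["']\s*\?>""")


def effective_settings(source, params):
    """Return settings keyed by XSLT parameter name; params override PIs."""
    settings = {}
    for name, value in PI_PATTERN.findall(source):
        # "rfc toc" corresponds to the XSLT parameter "xml2rfc-toc".
        settings["xml2rfc-" + name] = value
    settings.update(params)  # explicitly passed parameters take precedence
    return settings


source = '<?rfc toc="yes"?>\n<?rfc symrefs="no"?>\n<rfc/>'
result = effective_settings(source, {"xml2rfc-toc": "no"})
print(result)
```

Here the source file asks for a table of contents, but the parameter passed on the command line switches it off again.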

3.1 Supported xml2rfc-compatible PIs

PI target  PI pseudo-attribute  XSLT parameter name  default    comment
rfc        background           xml2rfc-background   (not set)
rfc        compact              xml2rfc-compact      "no"       only applies to HTML output method when printing
rfc        comments             xml2rfc-comments     (not set)
rfc        editing              xml2rfc-editing      "no"
rfc        footer               xml2rfc-footer       (not set)
rfc        header               xml2rfc-header       (not set)
rfc        inline               xml2rfc-inline       (not set)
rfc        iprnotified          xml2rfc-iprnotified  "no"
rfc        linkmailto           xml2rfc-linkmailto   "yes"
rfc        private              xml2rfc-private      (not set)
rfc        sortrefs             xml2rfc-sortrefs     "no"
rfc        symrefs              xml2rfc-symrefs      "no"
rfc        toc                  xml2rfc-toc          "no"
rfc        tocdepth             xml2rfc-tocdepth     99
rfc        topblock             xml2rfc-topblock     "yes"

3.2 Unsupported xml2rfc-compatible PIs

PI target  PI pseudo-attribute  comment
rfc        include              incompatible with XML/XSLT processing model
rfc        needLines
rfc        slides
rfc        strict
rfc        subcompact
rfc        tocindent            (defaults to "yes")
rfc        tocompact

3.3 Extension PIs

PI target  PI pseudo-attribute      XSLT parameter name               default  description
rfc-ext    allow-markup-in-artwork  xml2rfc-allow-markup-in-artwork   "no"     Enables support for specific elements inside artwork elements (using this extension makes the document incompatible with the RFC2629bis DTD; see the description of the conversion XSLT in Section 10.3).
rfc-ext    authors-section          xml2rfc-ext-authors-section                When "end", place the authors section at the end (just before the copyright statements). This seems to be the preferred order in the newest RFCs.
rfc-ext    parse-xml-in-artwork     xml2rfc-parse-xml-in-artwork      "no"     May be used to enable parsing of XML content in figures (MSXML only).
rfc-ext    support-rfc2731          xml2rfc-ext-support-rfc2731       "yes"    Decides whether the HTML transformation should generate META tags according to Section 6.4.
rfc-ext    sec-no-trailing-dots     xml2rfc-ext-sec-no-trailing-dots           When set to "yes", add trailing dots to section numbers. This seems to be the preferred format in the newest RFCs.

4. Anchors

The transformation automatically generates anchors that are supposed to be stable and predictable and that can be used to identify specific parts of the document. Anchors are generated both in HTML and XSL-FO content (but the latter will only be used for PDF output when the XSL-FO engine supports producing PDF anchors).

The following anchors get auto-generated:

Anchor name          Description
rfc.abstract         Abstract
rfc.authors          Authors section
rfc.copyright        Copyright section
rfc.copyrightnotice  Copyright notice
rfc.figure.n         Figures (titled)
rfc.figure.u.n       Figures (untitled)
rfc.index            Index
rfc.ipr              Intellectual Property
rfc.iref.n           Internal references
rfc.note.n           Notes (from front section)
rfc.references       References
rfc.references.n     Additional references
rfc.section.n        Section n
rfc.section.n.p.m    Section n, paragraph m
rfc.status           Status of memo
rfc.toc              Table of contents
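
Because the anchor names follow a fixed pattern, links to specific parts of a generated document can be built mechanically. The following Python sketch shows this for the section and paragraph anchors listed above. The helper names and the output file name rfc2616.html are assumptions made for illustration, and only a top-level section number is used.

```python
def section_anchor(section, paragraph=None):
    """Build the anchor name for a section, optionally for one paragraph."""
    anchor = "rfc.section." + str(section)
    if paragraph is not None:
        anchor += ".p." + str(paragraph)
    return anchor


def fragment_url(document_url, anchor):
    """Append the anchor as a fragment identifier to the document URL."""
    return document_url + "#" + anchor


print(fragment_url("rfc2616.html", section_anchor(4)))
print(fragment_url("rfc2616.html", section_anchor(4, 2)))
```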

5. Supported XSLT engines

The transformation requires a non-standard extension function (see exsl:node-set), which is, however, widely available. XSLT processors that do not support this extension (or a functional equivalent) are currently not supported.

5.1 Standalone Engines

The following XSLT engines are believed to work well:

5.2 In-Browser Engines

The following browsers seem to work fine:

The following browsers are known not to work properly:


6. Transforming to HTML

Transformation to HTML can be done inside the browser if it supports XSLT. To enable this, add the following processing instruction to the start of the source file:

  <?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>

(and ensure that rfc2629.xslt is present).
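
For a batch of source files, the processing instruction can be added by a script. The Python sketch below is one possible approach and not part of the distribution. The function name add_stylesheet_pi is hypothetical, and the code assumes that an XML declaration, if present, starts at the very first character of the file.

```python
def add_stylesheet_pi(xml_text, href="rfc2629.xslt"):
    """Insert an xml-stylesheet PI after the XML declaration, if any."""
    pi = "<?xml-stylesheet type='text/xsl' href='%s' ?>\n" % href
    if xml_text.startswith("<?xml "):
        # Keep the XML declaration first; it must start the document.
        end = xml_text.index("?>") + 2
        return xml_text[:end] + "\n" + pi + xml_text[end:].lstrip("\n")
    return pi + xml_text


source = '<?xml version="1.0"?>\n<rfc/>'
print(add_stylesheet_pi(source))
```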

6.1 HTML compliance

The transformation result is supposed to conform to the HTML 4.01 strict DTD [HTML]. This can be checked using the W3C's online validator at <http://validator.w3.org>.

6.2 Standard HTML LINK elements

LINK elements have existed since HTML 2.0. They can be used to embed content-independent links inside the document. Unfortunately, only a few user agents fully support this element, namely Mozilla, where the feature is called "Site Navigation Bar" (disabled by default!).

The following LINK elements are produced:

LINK type  description
alternate  for RFCs, a link to the authoritative ASCII version on the IETF web site
appendix   pointer to all top-level appendices
author     pointer to "authors" section
chapter    pointer to all top-level sections
contents   pointer to table of contents
copyright  pointer to copyright statement
index      pointer to index

The figure below shows how Mozilla Firefox displays the Site Navigation Bar for rfc2396.xml.


(LINK elements displayed in Mozilla Firefox for RFC2396.xml)

6.3 Standard HTML metadata

The following standard HTML META elements are produced:

META name  description
generator  from XSLT engine version and stylesheet version
keywords   from keyword elements in front section

6.4 Dublin Core (RFC2731) metadata

Unless turned off using the "rfc-ext support-rfc2731" processing instruction, the transformation will generate metadata according to [RFC2731].

The following DCMI properties are produced:

META name                description
DC.Creator               from author information in front section
DC.Date.Issued           from date information in front section
DC.Description.Abstract  from abstract
DC.Identifier            document URN [RFC2648] from "docName" attribute
DC.Relation.Replaces     from "obsoletes" attribute
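
As a rough illustration of what such metadata looks like in the HTML output, the Python sketch below renders property/value pairs as META elements in the general style of [RFC2731]. It does not reproduce the exact output of rfc2629.xslt: the function name dc_meta_elements and the sample values, including the date format, are assumptions.

```python
from html import escape


def dc_meta_elements(properties):
    """Render (name, value) pairs as HTML META elements."""
    lines = []
    for name, value in properties:
        # Escape both parts so that quotes and ampersands stay valid HTML.
        lines.append('<meta name="%s" content="%s">' % (escape(name), escape(value)))
    return lines


# Sample values are made up for illustration.
properties = [
    ("DC.Creator", "Reschke, J. F."),
    ("DC.Date.Issued", "2005-04"),
]
for line in dc_meta_elements(properties):
    print(line)
```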

6.5 Experimental hCard support

The generated author information is formatted in hCard format.


7. Transforming to XHTML

Transforming to XHTML requires slightly different XSLT output options and is implemented by the derived transformation script rfc2629toXHTML.xslt.

Note: Microsoft Internet Explorer does not support XHTML. Therefore it usually makes more sense to generate plain old HTML.


8. Transforming to CHM (Microsoft Compiled Help)

To generate a CHM file using Microsoft's HTML Help Compiler (hhc), three files are required in addition to the HTML file.

  1. hhc - table of contents file (HTML)
  2. hhk - index file (HTML)
  3. hhp - project file (plain text)

The three files are generated with three specific transformations, each requiring the additional XSLT parameter "basename" to specify the filename prefix.

Example:

saxon rfc2616.xml rfc2629toHhp.xslt basename=rfc2616  > rfc2616.hhp
saxon rfc2616.xml rfc2629toHhc.xslt basename=rfc2616  > rfc2616.hhc
saxon rfc2616.xml rfc2629toHhk.xslt basename=rfc2616  > rfc2616.hhk
hhc rfc2616.hhp
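
Since the four commands differ only in the file extension, they can be generated from the base name. The Python sketch below builds the same command lines as in the example above. It assumes a "saxon" wrapper command with that argument order, as used throughout this document, and the function name chm_commands is hypothetical.

```python
def chm_commands(basename):
    """Return the three transformation commands followed by the hhc call."""
    commands = []
    for ext in ("hhp", "hhc", "hhk"):
        # "hhp" selects rfc2629toHhp.xslt and produces <basename>.hhp, etc.
        commands.append(
            "saxon %s.xml rfc2629to%s.xslt basename=%s > %s.%s"
            % (basename, ext.capitalize(), basename, basename, ext)
        )
    commands.append("hhc %s.hhp" % basename)
    return commands


for command in chm_commands("rfc2616"):
    print(command)
```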

9. Transforming to PDF via XSL-FO

Transformation to XSL-FO [XSL-FO] format is available through rfc2629toFO.xslt (which includes rfc2629.xslt, so keep both in the same folder).

Compared to HTML user agents, XSL-FO engines unfortunately either come as open source (for instance, Apache FOP) or feature-complete (for instance, AntennaHouse XSL Formatter), but not both at the same time.

As Apache FOP needs special workarounds (page breaking, table layout), and some popular extensions aren't standardized yet, the translation produces a generic output (hopefully) conforming to [XSL-FO-11-WD]. Specific backends (xsl11toFop.xslt, xsl11toXep.xslt, xsl11toAn.xslt) then provide post-processing for the individual processors.

9.1 Extension feature matrix

XSL 1.1 WD
  PDF anchors:               no, but can be auto-generated from "id" attributes
  PDF bookmarks:             yes
  PDF document information:  no, but uses XEP output extensions
  Index cleanup:             yes

Antenna House XSL formatter
  PDF anchors:               no
  PDF bookmarks:             yes (from XSL 1.1 bookmarks)
  PDF document information:  yes (from XEP document info)
  Index cleanup:             yes (just page duplicate elimination, from XSL 1.1 page index)

Apache FOP
  PDF anchors:               yes
  PDF bookmarks:             yes (from XSL 1.1 bookmarks)
  PDF document information:  no
  Index cleanup:             no

RenderX XEP
  PDF anchors:               no
  PDF bookmarks:             yes (from XSL 1.1 bookmarks)
  PDF document information:  yes
  Index cleanup:             yes (from XSL 1.1 page index)

9.2 Example: producing output for Apache FOP

Example:

saxon rfc2616.xml rfc2629toFO.xslt > tmp.fo
saxon tmp.fo xsl11toFop.xslt > rfc2629.fo

10. Utilities

10.1 Checking References

check-ietf-references.xslt can be used to check all references to RFC-series IETF publications (note this script requires a local copy of <ftp://ftp.isi.edu/in-notes/rfc-index.xml>). For instance:

> saxon rfc2518.xml check-ietf-references.xslt
Normative References:
RFC1766: [PROPOSED STANDARD] obsoleted by RFC3066 RFC3282
RFC2277: [BEST CURRENT PRACTICE] (-> BCP0018) ok
RFC2119: [BEST CURRENT PRACTICE] (-> BCP0014) ok
RFC2396: [DRAFT STANDARD] ok
RFC2069: [PROPOSED STANDARD] obsoleted by RFC2617
RFC2068: [PROPOSED STANDARD] obsoleted by RFC2616
RFC2141: [PROPOSED STANDARD] ok
RFC2279: [PROPOSED STANDARD] obsoleted by RFC3629
Informational References:
RFC2026: [BEST CURRENT PRACTICE] (-> BCP0009) ok
RFC1807: [INFORMATIONAL] ok
RFC2291: [INFORMATIONAL] ok
RFC2413: [INFORMATIONAL] ok
RFC2376: [INFORMATIONAL] obsoleted by RFC3023
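
The report is line-oriented, so it can be post-processed easily, for instance to list only the references that need updating. The Python sketch below parses lines in the format shown above. The regular expression is derived from this sample output only and may not cover every line the script can produce, and the function name obsoleted is hypothetical.

```python
import re

# RFCnnnn: [STATUS] optional "(-> BCPnnnn) " then the remark.
LINE = re.compile(r"^(RFC\d+): \[([A-Z ]+)\] (?:\(-> \w+\) )?(.*)$")
PREFIX = "obsoleted by "


def obsoleted(report):
    """Map each obsoleted RFC to the list of RFCs that replace it."""
    result = {}
    for line in report.splitlines():
        match = LINE.match(line.strip())
        if match and match.group(3).startswith(PREFIX):
            result[match.group(1)] = match.group(3)[len(PREFIX):].split()
    return result


report = """Normative References:
RFC1766: [PROPOSED STANDARD] obsoleted by RFC3066 RFC3282
RFC2277: [BEST CURRENT PRACTICE] (-> BCP0018) ok
RFC2396: [DRAFT STANDARD] ok
RFC2069: [PROPOSED STANDARD] obsoleted by RFC2617
"""
print(obsoleted(report))
```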

10.2 Producing reference entries for books

amazon-asin.xslt uses the Amazon web services to generate a <reference> element for a given ASIN (ISBN).

For instance:

<?xml version="1.0" encoding="utf-8"?>
<references>
 <reference target="urn:isbn:0134516591">
   <front>
     <title>Simple Book, The: An Introduction to Internet Management,
               Revised Second Edition</title>
     <author surname="Rose"
                fullname="Marshall T. Rose" initials="M. T. ">
       <organization/>
     </author>
     <author surname="Marshall"
                fullname="Rose T. Marshall" initials="R. T.">
       <organization/>
     </author>
     <date year="1996" month="March"/>
   </front>
   <seriesInfo name="Prentice Hall" value=""/>
 </reference>
</references>

Note that the resulting XML usually requires checking; in this case, Amazon's database is playing tricks with Marshall's name.

10.3 Down-converting to RFC2629bis DTD

clean-for-DTD.xslt can be used to down-convert some extensions to a format that is supported by the base xml2rfc distribution. Note that these extensions are experimental (feedback appreciated).

The following mappings are done:

10.4 Extracting artwork

With extract-artwork.xslt, artwork elements that are named through the "name" attribute can be extracted. This can be used to check their syntax automatically (for instance, when ABNF rules appear within a figure element).

For instance:

saxon rfc3986.xml extract-artwork.xslt name=uri.abnf
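
The effect of the extraction can be approximated in a few lines of Python, which may help when no XSLT processor is at hand. This is a sketch of the idea and not a port of extract-artwork.xslt: the function name extract_artwork, the sample document, and the choice to concatenate all matching artwork elements are assumptions.

```python
import xml.etree.ElementTree as ET


def extract_artwork(xml_text, name):
    """Concatenate the text of all artwork elements with the given name."""
    root = ET.fromstring(xml_text)
    parts = []
    for artwork in root.iter("artwork"):
        if artwork.get("name") == name:
            # itertext() also collects text inside nested markup.
            parts.append("".join(artwork.itertext()))
    return "".join(parts)


# Minimal made-up source document with two named artwork elements.
sample = """<rfc><middle><section title="Syntax">
<figure><artwork name="uri.abnf">URI = scheme ":" hier-part
</artwork></figure>
<figure><artwork>not extracted</artwork></figure>
<figure><artwork name="uri.abnf">scheme = ALPHA
</artwork></figure>
</section></middle></rfc>"""
print(extract_artwork(sample, "uri.abnf"))
```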

11. Informative References

[RFC2629] Rose, M.T., “Writing I-Ds and RFCs using XML”, RFC 2629, June 1999.
[RFC2648] Moats, R., “A URN Namespace for IETF Documents”, RFC 2648, August 1999.
[RFC2731] Kunze, J.A., “Encoding Dublin Core Metadata in HTML”, RFC 2731, December 1999.
[HTML] Raggett, D., Hors, A., and I. Jacobs, “HTML 4.01 Specification”, W3C REC REC-html401-19991224, December 1999.
[XSL-FO] Adler, S., Berglund, A., Caruso, J., Deach, S., Graham, T., Grosso, P., Gutentag, E., Milowski, R., Parnell, S., Richman, J., and S. Zilles, “Extensible Stylesheet Language (XSL) Version 1.0”, W3C REC REC-xsl-20011015, October 2001.
[XSL-FO-11-WD] Berglund, A., “Extensible Stylesheet Language (XSL) Version 1.1”, W3C WD WD-xsl11-20031217, December 2003.

Author's Address

Julian F. Reschke
greenbytes GmbH
Salzmannstrasse 152
Muenster, NW 48159
Germany
Phone: +49 251 2807760
Fax: +49 251 2807761
URI: http://greenbytes.de/tech/webdav/
