view-source

An XML Source Formatter in XSLT

About view-source

The view-source XSLT stylesheet takes any XML file as input and renders an HTML view of its source as output. You can choose between an indented version that follows the structure of the elements and a version that preserves the original whitespace.

Because view-source is itself an XSLT stylesheet, its advantage over other syntax highlighters is that it is rooted in, and tightly integrated with, the XML world. Handling namespaces comes naturally to it, and the difference between concepts like “local name” and “qualified name” is built right into its execution environment.
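To make the distinction between the two kinds of names concrete, here is a minimal Python sketch using the standard library's ElementTree. It is not part of view-source; the sample document and the helper `split_name` are made up for illustration. ElementTree resolves a qualified name such as `atom:link` into a namespace URI plus a local name and drops the prefix, which is the kind of information an XSLT processor has available natively.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input, loosely modelled on an RSS feed.
SAMPLE = (
    '<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">'
    '<channel><atom:link rel="self"/></channel>'
    '</rss>'
)


def split_name(tag):
    """Split ElementTree's '{uri}local' notation into (namespace URI, local name)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    # No namespace: the whole tag is the local name.
    return None, tag


root = ET.fromstring(SAMPLE)
names = [split_name(el.tag) for el in root.iter()]
for uri, local in names:
    print(local, "in", uri or "no namespace")
```

A text-based highlighter only ever sees the string `atom:link`; it has to do extra work to learn which namespace the prefix is bound to.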

Example

One example is easy to explain: take the source of this page and run it through view-source. The resulting document is a valid XHTML file.

Another example is the RSS feed of the W3C Twitter account (abridged, retrieved on 24 Sep. 2010). The feed, transformed with view-source, looks like this:

<rss version="2.0">
  <channel>
    <title>Twitter / w3c</title>
    <link>http://twitter.com/w3c</link>
    <atom:link type="application/rss+xml" href="http://twitter.com/statuses/user_timeline/35761106.rss" rel="self" />
    <description>Twitter updates from W3C Team / w3c.</description>
    <language>en-us</language>
    <ttl>40</ttl>
  <item>
    <title>w3c: RT @scrawford: Tim berners-lee: net neutrality should be an amendment to the constitution - it&apos;s that important #stppICTconf</title>
    <description>w3c: RT @scrawford: Tim berners-lee: net neutrality should be an amendment to the constitution - it&apos;s that important #stppICTconf</description>
    <pubDate>Fri, 24 Sep 2010 07:31:53 +0000</pubDate>
    <guid>http://twitter.com/w3c/statuses/25384071023</guid>
    <link>http://twitter.com/w3c/statuses/25384071023</link>
    <twitter:source>&lt;a href=&quot;http://identi.ca&quot; rel=&quot;nofollow&quot;&gt;identica&lt;/a&gt;</twitter:source>
    <twitter:place />
  </item>
  <item>
    <title>w3c: RDFa API Draft Published http://ow.ly/1981zX</title>
    <description>w3c: RDFa API Draft Published http://ow.ly/1981zX</description>
    <pubDate>Thu, 23 Sep 2010 13:27:06 +0000</pubDate>
    <guid>http://twitter.com/w3c/statuses/25306604005</guid>
    <link>http://twitter.com/w3c/statuses/25306604005</link>
    <twitter:source>&lt;a href=&quot;http://identi.ca&quot; rel=&quot;nofollow&quot;&gt;identica&lt;/a&gt;</twitter:source>
    <twitter:place />
  </item>
  <item>
    <title>w3c: One Web Day and W3C Community Groups http://ow.ly/197jRE</title>
    <description>w3c: One Web Day and W3C Community Groups http://ow.ly/197jRE</description>
    <pubDate>Wed, 22 Sep 2010 18:30:08 +0000</pubDate>
    <guid>http://twitter.com/w3c/statuses/25235054643</guid>
    <link>http://twitter.com/w3c/statuses/25235054643</link>
    <twitter:source>&lt;a href=&quot;http://identi.ca&quot; rel=&quot;nofollow&quot;&gt;identica&lt;/a&gt;</twitter:source>
    <twitter:place />
  </item>

  [...]

  </channel>
</rss>

Download the Stylesheet

The project is hosted at GitHub, where you can find additional information regarding the stylesheet. From this page, you can directly download the following packages:

Other Projects of Mine

To remove nodes of a certain namespace from an XML document, you can use any of the three implementations of rm-ns: XSLT, JavaScript, or Python. Whether you need to strip SOAP envelopes or the remains of Word’s HTML export, these are the tools for the task.
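As a rough illustration of the idea, and not the actual rm-ns code, the following Python sketch removes every element of one namespace using only the standard library. The function name `remove_namespace`, the sample document, and the URI `urn:example:extra` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def remove_namespace(root, uri):
    """Remove, in place, every descendant element whose tag is in namespace `uri`.

    Simplifications: the root element itself is never removed, namespaced
    attributes are left alone, and text following a removed element (its
    tail) is dropped together with it.
    """
    prefix = "{" + uri + "}"
    for parent in list(root.iter()):
        for child in list(parent):
            if isinstance(child.tag, str) and child.tag.startswith(prefix):
                parent.remove(child)
    return root


# Hypothetical input with two elements in the namespace to be removed.
SAMPLE = (
    '<item xmlns:t="urn:example:extra">'
    '<title>RDFa API Draft Published</title>'
    '<t:source>identica</t:source><t:place/>'
    '</item>'
)

item = remove_namespace(ET.fromstring(SAMPLE), "urn:example:extra")
remaining = [child.tag for child in item]
print(ET.tostring(item, encoding="unicode"))
```

Unwrapping a SOAP envelope while keeping its payload would need a different strategy, since this sketch discards the removed elements together with their children.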

Unicodeinfo is a set of tools for accessing the data of the Unicode database. At the moment the collection consists of a toolset that converts the Unicode data set to an SQLite database, and a Python module with various useful methods to complement Python’s own unicodedata library.

Copyright & License

Copyright © 2010 Manuel Strehl. All rights reserved.

The code is dual-licensed under an MIT-style license and the GNU General Public License, version 2. You are free to choose either of the two for your project. In most cases, you can just use the stylesheet, as long as the copyright statement stays intact.

The Official README

                                XML Source View

                  Syntax highlighting for XML files with XSLT


This stylesheet package contains XSLT styles for syntax highlighting of
arbitrary XML files.


P a r a m e t e r s :
=====================

* format: controls whether the output is pretty-printed or kept as close as
  possible to the original source. The default is to apply formatting.

* base-indent: sets the indentation step for each level when the output is
  formatted. The default is two spaces.

* style: the name of a stylesheet (without extension) to be used for display.
  Note that the content of the stylesheet, although otherwise plain CSS, must
  be wrapped in a <css> element in the empty namespace.
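
To illustrate what "format" and "base-indent" mean in practice, here is a
simplified Python sketch of indentation by nesting level. It is not how the
stylesheet is implemented; the function name indent_source and the sample
input are assumptions, and attributes, namespaces, comments and mixed content
are ignored.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def indent_source(elem, base_indent="  ", level=0):
    """Return the source lines of `elem`, indented by nesting level."""
    pad = base_indent * level
    text = escape((elem.text or "").strip())
    children = list(elem)
    if not children:
        if text:
            return [f"{pad}<{elem.tag}>{text}</{elem.tag}>"]
        return [f"{pad}<{elem.tag} />"]
    # Elements with children: one line per tag, children one level deeper.
    lines = [f"{pad}<{elem.tag}>"]
    for child in children:
        lines.extend(indent_source(child, base_indent, level + 1))
    lines.append(f"{pad}</{elem.tag}>")
    return lines


doc = ET.fromstring(
    "<channel><title>Twitter / w3c</title><ttl>40</ttl><place/></channel>"
)
two = indent_source(doc)                       # default step: two spaces
four = indent_source(doc, base_indent="    ")  # a wider step
print("\n".join(two))
```

With formatting switched off, the stylesheet instead keeps the whitespace of
the original source, so no such re-indentation takes place.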


H o w   t o   D e p l o y :
===========================

a) in Firefox: Add the following lines to your XML file:

    <?xslt-param name="format" select="true()" ?>
    <?xml-stylesheet type="text/xsl" href="view-source.xsl"?>

    (Other browsers don't support <?xslt-param ?>; there you have to edit
    view-source.xsl itself.)

b) via a command line XSLT processor:

    $ saxon -s:source.xml -xsl:view-source.xsl -o:out.xhtml

    $ xalan -IN source.xml -XSL view-source.xsl -OUT out.xhtml

c) inside PHP:

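    <?php
    // Load and compile the stylesheet.
    $xsl = new DOMDocument;
    $xsl->load('view-source.xsl');
    $proc = new XSLTProcessor;
    $proc->importStylesheet($xsl);

    // Load the document and transform it; setParameter() takes string
    // values, so the flag is passed as '1' (what TRUE is converted to).
    $xml = new DOMDocument;
    $xml->load('source.xml');
    $proc->setParameter('', 'format', '1');
    $proc->transformToURI($xml, 'file:///tmp/out.xhtml');
    ?>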

d) in Python with libxml2 and libxslt bindings:

    #! /usr/bin/env python

    import libxml2, libxslt

    # Parse and compile the stylesheet.
    styledoc = libxml2.parseFile("view-source.xsl")
    style = libxslt.parseStylesheetDoc(styledoc)

    # Parameter values are passed as strings holding XPath expressions.
    doc = libxml2.parseFile('source.xml')
    result = style.applyStylesheet(doc, {"format": "true()"})

    out = open('out.xhtml', 'w')
    out.write(result.serialize())
    out.close()

    # The bindings do not free these objects automatically.
    style.freeStylesheet()
    doc.freeDoc()
    result.freeDoc()
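
For option a), the two processing instructions can also be added by a script
instead of by hand. The following Python sketch places them right after the
XML declaration, if there is one. The helper name add_view_source_pis and the
sample inputs are assumptions for illustration; the instructions themselves
are the ones shown under a).

```python
import re

# The two processing instructions from option a), in the same order.
PIS = (
    '<?xslt-param name="format" select="true()" ?>\n'
    '<?xml-stylesheet type="text/xsl" href="view-source.xsl"?>\n'
)


def add_view_source_pis(xml_text, pis=PIS):
    """Insert the processing instructions after the XML declaration, if any."""
    # An XML declaration, when present, must be the very first thing in the file.
    match = re.match(r"<\?xml\s.*?\?>\s*", xml_text, flags=re.DOTALL)
    if match:
        declaration = match.group(0).rstrip()
        return declaration + "\n" + pis + xml_text[match.end():]
    return pis + xml_text


with_decl = add_view_source_pis('<?xml version="1.0"?>\n<rss version="2.0"/>')
without_decl = add_view_source_pis('<rss version="2.0"/>')
print(with_decl)
```

The sketch works on text rather than on a parsed tree, so the rest of the
file, including its original whitespace, is left untouched.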


L i c e n s e :
===============

The stylesheet is published under an MIT-style license and the GPL v2. Choose
whichever you prefer.