       Examples and sample code for SSAX parsing and SXML transformations

Remove-tags example 
-------------------

Given an XML document, remove all the markup and print the resulting document.
This is a simple in-out application.

    $ make run-remove-markup-bigloo
    $ ./run-remove-markup-bigloo xml/ddn.rdf

See the files remove-markup.scm and run-remove-markup.scm.
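
The core of such a program can be sketched with ssax:make-parser, which
builds a parser out of user-supplied handlers. The code below is an
illustration written for this README, not a copy of remove-markup.scm.
It assumes that SSAX is already loaded; the names enter-element,
leave-element and collect-text are made up for the sketch.

```scheme
; Assumes the SSAX library is already loaded. The handler names are
; illustrative; ssax:make-parser and its handler tags come from SSAX.

; An element opens: pass the text collected so far down to its content.
(define (enter-element elem-gi attributes namespaces expected-content seed)
  seed)

; An element closes: pass the text collected inside it back up.
(define (leave-element elem-gi attributes namespaces parent-seed seed)
  seed)

; Character data is delivered as two string fragments; the second may be
; empty. The seed is the list of fragments seen so far, most recent first.
(define (collect-text string1 string2 seed)
  (if (string=? string2 "")
      (cons string1 seed)
      (cons string2 (cons string1 seed))))

; Parse the document on xml-port and print its character data only.
(define (remove-markup xml-port)
  (let ((fragments
         ((ssax:make-parser
           NEW-LEVEL-SEED    enter-element
           FINISH-ELEMENT    leave-element
           CHAR-DATA-HANDLER collect-text)
          xml-port '())))
    (for-each display (reverse fragments))))
```

With SSAX loaded, (call-with-input-file "xml/ddn.rdf" remove-markup)
would print the text of the sample document without its tags.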

The Makefile includes targets that run the above example under a few other
Scheme compilers/interpreters.


Outline example
--------------- 

Pretty-print the structure of an XML document (disregarding the
character data).
This example corresponds to outline.c of the Expat distribution.
The example demonstrates how to transform an XML document on the
fly, as we parse it.

    $ make run-outline-bigloo
    $ ./run-outline-bigloo xml/total_weather.xsl

Note that the tags of elements with no XML namespace are printed as
they are. Tags of elements within an XML namespace are printed as a
pair (Namespace-URI . Local-name).

See the files outline.scm and run-outline.scm.
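
The on-the-fly style can be sketched as follows: the seed threaded
through the parser is just the current indentation string, and all
output happens inside the handlers while parsing is still in progress.
This is a minimal illustration, not a copy of outline.scm. It assumes
that SSAX is already loaded; the handler names are made up for the
sketch.

```scheme
; Assumes the SSAX library is already loaded. The handler names are
; illustrative; ssax:make-parser and its handler tags come from SSAX.

; An element opens: print its name at the current indentation and
; return a deeper indentation for its content. elem-gi is a symbol, or
; a (Namespace-URI . Local-name) pair, which display prints as a pair.
(define (outline-enter elem-gi attributes namespaces expected-content indent)
  (display indent)
  (display elem-gi)
  (newline)
  (string-append "  " indent))

; An element closes: go back to the indentation of its parent.
(define (outline-leave elem-gi attributes namespaces parent-indent indent)
  parent-indent)

; Character data is disregarded; the indentation stays as it is.
(define (outline-text string1 string2 indent)
  indent)

; Parse the document on xml-port, printing the outline as we go.
(define (outline xml-port)
  ((ssax:make-parser
    NEW-LEVEL-SEED    outline-enter
    FINISH-ELEMENT    outline-leave
    CHAR-DATA-HANDLER outline-text)
   xml-port ""))
```

Each element name is printed on its own line, indented by two spaces
per nesting level. Nothing is accumulated, so the memory needed does
not grow with the amount of character data in the document.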

The Makefile includes targets that run the above example under a few other
Scheme compilers/interpreters.


SXML example
------------

Transform an XML document into SXML. See ../docs/SXML.html for a
description of SXML.

    $ make run-sxml-bigloo
    $ ./run-sxml-bigloo xml/OMF-sample.xml

See the file run-sxml.scm.
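
For a quick experiment, the conversion can be tried on a string. The
sketch below assumes that SSAX is already loaded and that the Scheme
system provides open-input-string (SRFI 6); the one-element document is
made up for the illustration.

```scheme
; Assumes the SSAX library is already loaded, and that the Scheme
; system provides open-input-string (SRFI 6).
(define sample
  (ssax:xml->sxml
   (open-input-string "<greeting lang=\"en\">Hello</greeting>")
   '()))                      ; no namespace-prefix assignments

; sample is now
;   (*TOP* (greeting (@ (lang "en")) "Hello"))
```

A file is converted the same way by passing a file port, for instance
(call-with-input-file "xml/OMF-sample.xml"
  (lambda (port) (ssax:xml->sxml port '()))).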

The Makefile includes targets that run the above example under a few other
Scheme compilers/interpreters.



Permissive HTML parsing
-----------------------

This example gets SSAX to _permissively_ parse HTML documents or HTML
fragments. Because HTML browsers are so lax, many web pages on the
Internet contain invalid HTML. For example, the FrontPage editor is
notorious for creating such invalid sequences as <b><i>text</b></i>.

SSAX is more than an XML parser: it is a library that includes
lexers and parsers of various kinds. The library comes in handy when
we need to permissively parse (or, better said, lex) HTML. Because we
accept even HTML with unmatched tags, this example makes no attempt
to recover the document structure. We faithfully record all the tags
as they occur and let the user sort out what matches what. Here is the
result of parsing an ill-formed sample document, which includes
comments, parsed entities and attributes of various kinds.
	
Source:
 <html> <head> <title> </title> <title> whatever </title> </head>
  <body> <a href="url">link</a> <p align=center> <ul compact style='aa'>
  <p> BLah <!-- comment <comment> --> <i> italic <b> bold <tt> ened </i> still  &lt; bold </b>
  </body>
  <P> But not done yet...
	
Result: a flat list (a token stream)
	
(#(START html ()) " " #(START head ()) " " #(START title ())
  " " #(END title) " " #(START title ()) " whatever " #(END title) " "
  #(END head) "\n  "
  #(START body ()) " " #(START a ((href . "url"))) "link" #(END a) " "
  #(START p ((align . "center"))) " "
  #(START ul ((compact . "compact") (style . "aa")))
  "\n  " #(START p ()) " BLah  " #(START i ()) " italic "
  #(START b ()) " bold " #(START tt ()) " ened " #(END i)
  " still  < bold " #(END b) "\n  " #(END body)
  "\n  " #(START P ()) " But not done yet...")
	
Note that the parser handled "&lt;" and other parsed entity
references, and that it preserved all the whitespace (including
newlines). Also note that the three possible styles of HTML attributes
    <tag name="value">  <tag name=value>  <tag name>
are handled properly, and that attribute values may also be enclosed
in single quotes. An attribute given without a value, such as compact
above, is recorded with its own name as the value. Comments are
silently skipped.
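
Sorting out what matches what is left to the user. One possible
approach, sketched below in plain Scheme without any SSAX procedures,
walks the token stream with a stack of open tags. The policy (an END
token must name the innermost open tag, and any other END token is
reported as stray) and the names check-nesting, start-token?,
end-token? and token-name are assumptions made for this illustration;
they are not part of the example's code.

```scheme
; A token is a string (character data), #(START name attributes),
; or #(END name), as in the result shown above.
(define (start-token? t) (and (vector? t) (eq? (vector-ref t 0) 'START)))
(define (end-token? t)   (and (vector? t) (eq? (vector-ref t 0) 'END)))
(define (token-name t)   (vector-ref t 1))

; Walk the token stream with a stack of open tags. An END token that
; names the innermost open tag closes it; any other END token is
; recorded as stray and the stack is left alone. Returns a list of two
; lists, both in document order: the names of the stray END tokens and
; the names of the tags still open at the end of the input.
(define (check-nesting tokens)
  (let loop ((rest tokens) (open '()) (stray '()))
    (cond
      ((null? rest)
       (list (reverse stray) (reverse open)))
      ((start-token? (car rest))
       (loop (cdr rest) (cons (token-name (car rest)) open) stray))
      ((end-token? (car rest))
       (let ((name (token-name (car rest))))
         (if (and (pair? open) (eq? name (car open)))
             (loop (cdr rest) (cdr open) stray)
             (loop (cdr rest) open (cons name stray)))))
      (else                               ; character data
       (loop (cdr rest) open stray)))))

; The token stream of the mis-nested sequence <b><i>text</b></i>
(define mis-nested-tokens
  (list '#(START b ()) '#(START i ()) "text" '#(END b) '#(END i)))

(check-nesting mis-nested-tokens)         ; => ((b) (b))
```

In the sample run, </b> arrives while i is the innermost open tag, so
it is reported as stray, and b is consequently never closed. A more
forgiving policy, such as closing every tag up to the nearest matching
one, would only change the branch that handles END tokens.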

    $ make run-html-parse-sample-bigloo-i

This command runs the example that parses the sample HTML document and
prints out the above result.
