3.2. Module HdomParser

The module HdomParser provides functions for parsing XML files and building the generic tree data structure XmlTree from these documents. The whole parsing process takes place in the State-I/O monad of the module XmlState, so that well-formedness errors can be reported and the intermediate results of the individual computations can be traced by printing them.
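To make the idea of a State-I/O monad more concrete, the following is a minimal sketch of a state monad layered over IO. It is not the definition used in XmlState. All names in it (StateIO, SysState, getState, modifyState, io, trace, reportError) are assumptions chosen for illustration.

```haskell
-- A hand-rolled state monad over IO (illustrative only, not the XmlState code).
newtype StateIO s a = StateIO { runStateIO :: s -> IO (a, s) }

instance Functor (StateIO s) where
  fmap f (StateIO g) = StateIO $ \s -> do
    (a, s') <- g s
    return (f a, s')

instance Applicative (StateIO s) where
  pure a = StateIO $ \s -> return (a, s)
  StateIO f <*> StateIO g = StateIO $ \s -> do
    (h, s1) <- f s
    (a, s2) <- g s1
    return (h a, s2)

instance Monad (StateIO s) where
  StateIO g >>= k = StateIO $ \s -> do
    (a, s') <- g s
    runStateIO (k a) s'

-- Hypothetical system state: a trace level and an error level.
data SysState = SysState { traceLevel :: Int, errorLevel :: Int }
  deriving (Eq, Show)

-- Read the current state.
getState :: StateIO s s
getState = StateIO $ \s -> return (s, s)

-- Update the current state.
modifyState :: (s -> s) -> StateIO s ()
modifyState f = StateIO $ \s -> return ((), f s)

-- Run an IO action inside the monad without touching the state.
io :: IO a -> StateIO s a
io act = StateIO $ \s -> do
  a <- act
  return (a, s)

-- Print a message only if the current trace level is high enough.
trace :: Int -> String -> StateIO SysState ()
trace lvl msg = do
  st <- getState
  if traceLevel st >= lvl then io (putStrLn msg) else return ()

-- Report an error and remember the highest error level seen so far.
reportError :: Int -> String -> StateIO SysState ()
reportError lvl msg = do
  io (putStrLn ("error: " ++ msg))
  modifyState (\st -> st { errorLevel = max lvl (errorLevel st) })
```

A computation such as `trace 1 "start" >> reportError 2 "bad"` can be run with `runStateIO` and an initial `SysState`; the final state then carries the error level that a caller can inspect.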

Parse functions

parseDoc :: String -> IO [XmlTree]

Parses the XML file whose name is passed as the argument and returns the resulting list of XmlTree nodes.

parseXmlFile :: IO [XmlTree]

Parses the XML file specified by command line arguments:

  • --source "source file" - XML file to parse

  • --encoding "encoding" - Encoding scheme used in the file (optional)
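How such "--name value" pairs could be collected from an argument list can be sketched as follows. This is an assumption about the general approach, not the code used by parseXmlFile; the helper names parseOptions, sourceFile and encoding are hypothetical. In a real program the argument list would typically come from System.Environment.getArgs.

```haskell
-- Collect "--name value" pairs from an argument list.
-- Simplification: the token after an option is always taken as its value,
-- and tokens that do not start with "--" are skipped.
parseOptions :: [String] -> [(String, String)]
parseOptions (('-':'-':name) : value : rest) = (name, value) : parseOptions rest
parseOptions (_ : rest)                      = parseOptions rest
parseOptions []                              = []

-- The mandatory source file, if it was given.
sourceFile :: [(String, String)] -> Maybe String
sourceFile = lookup "source"

-- The optional encoding scheme.
encoding :: [(String, String)] -> Maybe String
encoding = lookup "encoding"
```

Returning Maybe for both options lets the caller decide how to treat a missing --source (an error) differently from a missing --encoding (a default).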

The following example shows how an XML parser is constructed in the module HdomParser. As described above, the parsing process runs in the State-I/O monad defined in the module XmlState. The individual parsing steps are of type XmlStateFilter; a plain XmlFilter such as transfAllCharRef is lifted to that type before it is used.


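processXmlN :: Int -> XmlTree -> IO [XmlTree]
processXmlN n t0
    = run' $ do
             -- initialise the system state from the attributes of the root node
             setSysState (selXTagAttrl . getNode $ t0)
             setTraceLevel n
             -- read the document, parse it, then resolve references and the DTD
             t1 <- getXmlContents $ t0
             t2 <- parseXmlDoc    $$< t1
             t3 <- liftM transfAllCharRef $$< t2
             t4 <- processDTD     $$< t3
             t5 <- processGeneralEntities $$< t4
             -- return the tree only if no error was reported
             el <- getErrorLevel
             return ( if el == 0
                      then t5
                      else [] )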
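The operator $$< in the example chains the steps: each step receives the list of trees produced by the previous one. A plausible reading, sketched here under that assumption and not taken from the library source, is that it applies a monadic filter to every tree in the list and concatenates the results. The type synonym MFilter and the string filters splitWords and dropShort are hypothetical stand-ins for XmlStateFilter and the real parsing steps.

```haskell
-- A filter maps one value to a list of results inside some monad m.
type MFilter m a = a -> m [a]

-- Apply a monadic filter to every element of a list and concatenate the results.
($$<) :: Monad m => MFilter m a -> [a] -> m [a]
f $$< ts = do
  rss <- mapM f ts
  return (concat rss)

-- Hypothetical filters over strings instead of XmlTree nodes.
splitWords :: MFilter IO String
splitWords s = return (words s)

dropShort :: MFilter IO String
dropShort w = return [w | length w > 2]

-- Same shape as processXmlN: each step consumes the previous result list.
pipeline :: String -> IO [String]
pipeline t0 = do
  t1 <- splitWords t0
  t2 <- dropShort $$< t1
  return t2
```

Because a filter may return zero, one or several results, a step can drop a tree (empty list) or expand it without the following steps having to care.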

Actions during parsing

getXmlContents :: XmlStateFilter a

Reads the XML file. The filename is taken from the attribute named "source", which must be present in the initial node t0.
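The lookup of the "source" attribute can be illustrated with a much simplified node type. The type Node and the functions sourceOf and readSource are hypothetical; the real getXmlContents works on XmlTree and runs in the State-I/O monad.

```haskell
-- Hypothetical, much simplified node: a tag name plus an attribute list.
data Node = Node
  { nodeName  :: String
  , nodeAttrs :: [(String, String)]
  }

-- Look up the "source" attribute; report a readable error if it is missing.
sourceOf :: Node -> Either String FilePath
sourceOf n =
  case lookup "source" (nodeAttrs n) of
    Just path -> Right path
    Nothing   -> Left "attribute \"source\" is missing in the initial node"

-- Read the file named by the "source" attribute, if there is one.
readSource :: Node -> IO (Either String String)
readSource n =
  case sourceOf n of
    Left err   -> return (Left err)
    Right path -> Right <$> readFile path
```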

parseXmlDoc :: XmlStateFilter a

Parses the XML file and builds the XmlTree.

transfAllCharRef :: XmlFilter

Replaces character references with the characters they denote. Because transfAllCharRef is a plain XmlFilter, it has to be lifted to an XmlStateFilter (with liftM in the example above) before it can be used in the monad.
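Both ideas, substituting character references and lifting a pure filter into a monad, can be sketched on plain strings. This is an illustration only: substCharRefs, transfCharRefs and liftF are hypothetical names, liftF is used instead of liftM to avoid confusion with Control.Monad.liftM, and the sketch assumes that every reference denotes a valid code point.

```haskell
import Data.Char (chr, isDigit, isHexDigit)
import Numeric (readHex)

-- Replace decimal (&#66;) and hexadecimal (&#x43;) character references.
-- Anything that is not a complete reference is copied unchanged.
substCharRefs :: String -> String
substCharRefs ('&':'#':'x':rest)
  | (ds, ';':rest') <- span isHexDigit rest
  , [(n, "")] <- readHex ds
  = chr n : substCharRefs rest'
substCharRefs ('&':'#':rest)
  | (ds, ';':rest') <- span isDigit rest
  , [(n, "")] <- reads ds
  = chr n : substCharRefs rest'
substCharRefs (c:cs) = c : substCharRefs cs
substCharRefs []     = []

-- A pure filter: one input, a list of results.
transfCharRefs :: String -> [String]
transfCharRefs s = [substCharRefs s]

-- Lift a pure filter into any monad so it can be used next to monadic filters.
liftF :: Monad m => (a -> [a]) -> a -> m [a]
liftF f = return . f
```

With these definitions, `liftF transfCharRefs` has the same shape as a monadic filter in IO, so it can be combined with steps that read files or report errors.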

processDTD :: XmlStateFilter a

Substitutes parameter entities, evaluates the conditional sections of the DTD (include sections are kept, ignore sections are removed), and merges the internal and external DTD subsets.

processGeneralEntities :: XmlStateFilter a

Substitutes general entities.

getErrorLevel :: XState state Int

Returns the current error level. If an error occurred while the XmlStateFilters were applied (the error level is not 0), processXmlN returns an empty list; otherwise it returns the constructed XmlTree.
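The "empty list on error" convention can be sketched independently of the toolbox. Here an IORef stands in for the error level that the real code keeps in the monad's state; ErrState, newErrState, issueError, currentErrorLevel and guardResult are hypothetical names.

```haskell
import Data.IORef

-- Hypothetical stand-in for the error level held in the system state.
newtype ErrState = ErrState (IORef Int)

-- Start with error level 0, meaning "no error".
newErrState :: IO ErrState
newErrState = ErrState <$> newIORef 0

-- Record an error: the stored level is raised to at least n.
issueError :: ErrState -> Int -> IO ()
issueError (ErrState r) n = modifyIORef r (max n)

-- Read the current error level.
currentErrorLevel :: ErrState -> IO Int
currentErrorLevel (ErrState r) = readIORef r

-- Keep the result only if no error was recorded, as processXmlN does.
guardResult :: ErrState -> [a] -> IO [a]
guardResult st ts = do
  el <- currentErrorLevel st
  return (if el == 0 then ts else [])
```

A caller can therefore treat an empty result list as "parsing failed" and does not need to inspect the error level itself.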