4.2. Creating a validating XML parser

Validation is the only public module for validating XML documents. It exports functions for validating a complete XML document, for validating its parts (the DTD or the document content) separately, and for transforming a document.

validateAndTransform :: XmlSFilter

Combines validation and transformation of a document. If validation reports errors or fatal errors, the list of errors is returned; otherwise the transformed document is returned.

validate :: XmlSFilter

Checks if the DTD and the document are valid.

validateDTD :: XmlSFilter

Checks if the DTD is valid.

validateDoc :: XmlSFilter

Checks if the document corresponds to the given DTD.

transform :: XmlSFilter

Transforms the document with respect to the given DTD. Validating parsers are expected to normalize attribute values and to add the default values of attributes declared in the DTD. This function should only be called after a successful validation.
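The two transformation steps named above can be illustrated with a minimal, self-contained sketch. It does not use the toolbox's own types: Attrs, AttType, normalizeValue and addDefaults are hypothetical names introduced only for this illustration. The normalization follows the XML 1.0 rule that white space characters in attribute values are replaced by spaces and that, for attributes not declared as CDATA, leading and trailing spaces are dropped and runs of spaces collapsed. Expansion of character and entity references is left out.

```haskell
module Main where

-- Hypothetical attribute representation: (name, value) pairs.
type Attrs = [(String, String)]

-- Declared attribute types, reduced to the distinction that matters here.
data AttType = CData | Tokenized
  deriving (Eq, Show)

-- XML white space: space, tab, carriage return, line feed.
isXmlSpace :: Char -> Bool
isXmlSpace c = c `elem` " \t\r\n"

-- Replace any XML white space character by a plain space.
toSpace :: Char -> Char
toSpace c = if isXmlSpace c then ' ' else c

-- Split on single spaces, keeping empty pieces between adjacent spaces.
splitOnSpace :: String -> [String]
splitOnSpace s = case break (== ' ') s of
  (w, [])       -> [w]
  (w, _ : rest) -> w : splitOnSpace rest

-- CDATA values only get white space mapped to spaces; all other types
-- are additionally trimmed and have runs of spaces collapsed.
normalizeValue :: AttType -> String -> String
normalizeValue CData     = map toSpace
normalizeValue Tokenized =
  unwords . filter (not . null) . splitOnSpace . map toSpace

-- Append every declared default whose attribute is missing on the element.
addDefaults :: Attrs -> Attrs -> Attrs
addDefaults defaults attrs =
  attrs ++ [ d | d@(name, _) <- defaults, name `notElem` map fst attrs ]

main :: IO ()
main = do
  print (normalizeValue Tokenized "  a \t b\n")
  print (addDefaults [("lang", "en")] [("att1", "x")])
```

An attribute that is already present keeps its value; only missing attributes receive their declared default.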

The following example shows how the functions validate and transform can be used in an XML processing application. The document is valid if validate returns an empty list or a list containing only warnings. If the list contains errors or fatal errors, the document is not valid. A valid document is transformed and displayed to the user.


printMsg :: XmlTrees -> XmlTrees -> IO ()
printMsg errors doc
    -- the document is valid if no error or fatal error was reported
    = if null ((isError +++ isFatalError) $$ errors)
        then
          if null errors
            then do
              putStrLn "The document is valid."
              putStrLn (xmlTreesToString $ transform doc)
            else do
              -- only warnings were reported: show the result and the warnings
              putStrLn "The document is valid, but there were warnings:"
              putStrLn (xmlTreesToString $ transform doc)
              putStrLn (showXErrors errors)
        else do
          putStrLn "The document is not valid. List of errors:"
          putStrLn (showXErrors errors)



main :: IO ()
main
    = do
      doc <- parseDoc "invalid.xml"
      -- validate the parsed document and report the result
      printMsg (validate doc) doc
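
The function validateAndTransform described earlier folds the same decision into a single filter. The following is a minimal sketch of that idea and not the toolbox's implementation: Severity, Node, SFilter, validateAndTransformWith and the two toy filters are hypothetical stand-ins, and the choice to return all messages (including warnings) when an error is present is an assumption.

```haskell
module Main where

-- Hypothetical stand-ins for the toolbox types (assumptions, not the real
-- definitions): a node is either a validation message or document content.
data Severity = Warning | Error | FatalError
  deriving (Eq, Show)

data Node
  = Message Severity String
  | Element String
  deriving (Eq, Show)

-- Same shape as a filter that maps a list of trees to a list of trees.
type SFilter = [Node] -> [Node]

-- True for messages that make a document invalid; warnings do not count.
isSerious :: Node -> Bool
isSerious (Message sev _) = sev /= Warning
isSerious _               = False

-- Validate first; hand back the messages if any of them is serious,
-- otherwise return the transformed document.
validateAndTransformWith :: SFilter -> SFilter -> SFilter
validateAndTransformWith validateF transformF doc
  | any isSerious msgs = msgs
  | otherwise          = transformF doc
  where
    msgs = validateF doc

-- Toy validator: every element named "y" is reported as undeclared.
toyValidate :: SFilter
toyValidate doc =
  [ Message Error ("Element \"" ++ n ++ "\" not declared in DTD.")
  | Element n <- doc, n == "y" ]

-- Toy transformation: leaves the document unchanged.
toyTransform :: SFilter
toyTransform = id

main :: IO ()
main = do
  mapM_ print (validateAndTransformWith toyValidate toyTransform [Element "a", Element "y"])
  mapM_ print (validateAndTransformWith toyValidate toyTransform [Element "a", Element "c"])
```

Because the result is either a list of messages or a document, a caller still has to inspect the returned nodes to tell the two cases apart.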

		

Running the module ValidateExample from the example directory of the Haskell XML Toolbox on the invalid document invalid.xml, shown first in the example, produces the error messages listed after it.

Example 4-1. Validating a document with errors


<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>

<!DOCTYPE a [
<!ATTLIST a  att1  CDATA  #IMPLIED>
<!ELEMENT a  (z, c?)>
<!ELEMENT b  EMPTY>
<!ELEMENT c  (#PCDATA)>
]>

<a att2="test">
    <y/>
    <c>hello world</c>
</a>

The document is not valid. List of errors:
Warning: The element type "z", used in content model of element "a", is not declared.
Error: The content of element "a" must match ( "z" , "c"? ). Element "z" expected,
       but Element "y" found.
Error: Attribute "att2" of element "a" is not declared in DTD.
Error: Element "y" not declared in DTD.
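
The content model error above results from checking the children of element a, here y and c, against the declared model (z, c?). One common way to perform such a check is with derivatives of regular expressions. The sketch below illustrates that technique under the assumption of a simplified model type; Model, nullable, deriv and matches are hypothetical names, and the code is not taken from the toolbox.

```haskell
module Main where

-- Hypothetical content-model type; not the toolbox's own representation.
data Model
  = Empty              -- matches the empty sequence of children
  | None               -- matches nothing at all
  | Name String        -- exactly one child element with this name
  | Seq Model Model    -- (a, b)
  | Alt Model Model    -- (a | b)
  | Opt Model          -- a?
  | Star Model         -- a*
  deriving (Eq, Show)

-- Does the model accept the empty sequence of children?
nullable :: Model -> Bool
nullable Empty     = True
nullable None      = False
nullable (Name _)  = False
nullable (Seq a b) = nullable a && nullable b
nullable (Alt a b) = nullable a || nullable b
nullable (Opt _)   = True
nullable (Star _)  = True

-- Derivative: the model that remains after consuming one child named n.
deriv :: String -> Model -> Model
deriv _ Empty = None
deriv _ None  = None
deriv n (Name m)
  | n == m    = Empty
  | otherwise = None
deriv n (Seq a b)
  | nullable a = Alt (Seq (deriv n a) b) (deriv n b)
  | otherwise  = Seq (deriv n a) b
deriv n (Alt a b) = Alt (deriv n a) (deriv n b)
deriv n (Opt a)   = deriv n a
deriv n (Star a)  = Seq (deriv n a) (Star a)

-- Consume the children one by one; the content is valid if the
-- remaining model accepts the empty sequence.
matches :: Model -> [String] -> Bool
matches m = nullable . foldl (flip deriv) m

-- The content model of element "a" from the example DTD: (z, c?)
modelA :: Model
modelA = Seq (Name "z") (Opt (Name "c"))

main :: IO ()
main = do
  print (matches modelA ["y", "c"])  -- children of the invalid document
  print (matches modelA ["z", "c"])
```

With this representation the children y, c are rejected because the first derivative already yields a model that matches nothing.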