4.3. Validation of the Document Type Definition

The DTD is validated by the module DTDValidation. Notations, unparsed entities, element declarations and attribute declarations are checked for conformance to the constraints of the XML 1.0 specification [WWW01].

The following checks are performed:

DTD

Notations

Unparsed entities

Element declarations

Attribute declarations

Each check is implemented as a separate function that takes the child list of the XDTD DOCTYPE node as input and returns a list of errors. Some functions take additional arguments that supply context information; for example, validating unparsed entities requires a list of all declared notations. The result of validating the DTD is the concatenation of the results of all validation functions.
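The structure described above can be sketched as follows. This is a minimal illustration, not the toolbox's actual code: the flat XNode type, the attribute keys "name" and "ndata", the error type, and the names DTDCheck, declaredNames, checkNotations, checkUnparsedEntities and validateDTD are all assumptions made for this example. The duplicate-notation check stands in for a check that needs no context; the unparsed-entity check shows how a list of declared notations is passed in as an extra argument.

```haskell
import Data.List (nub, tails)

-- Simplified stand-ins for the toolbox's node types (assumptions).
data DTDElem = DOCTYPE | NOTATION | ENTITY | ELEMENT | ATTLIST
  deriving (Eq, Show)

type Attributes = [(String, String)]

data XNode = XDTD DTDElem Attributes
  deriving (Eq, Show)

type ValidationError = String

-- A check maps the children of the DOCTYPE node to a list of errors.
type DTDCheck = [XNode] -> [ValidationError]

-- Names of all declarations of the given kind (hypothetical helper).
declaredNames :: DTDElem -> [XNode] -> [String]
declaredNames kind nodes =
  [ n | XDTD k al <- nodes, k == kind, Just n <- [lookup "name" al] ]

-- Values that occur more than once in a list, each reported once.
duplicates :: [String] -> [String]
duplicates xs = nub [ x | (x : rest) <- tails xs, x `elem` rest ]

-- A check that needs no context: notation names must be unique.
checkNotations :: DTDCheck
checkNotations dtd =
  [ "Notation \"" ++ n ++ "\" is declared more than once."
  | n <- duplicates (declaredNames NOTATION dtd) ]

-- A check that needs context: the list of all declared notations.
checkUnparsedEntities :: [String] -> DTDCheck
checkUnparsedEntities notations dtd =
  [ "Notation \"" ++ n ++ "\" of unparsed entity \"" ++ e ++ "\" is not declared."
  | XDTD ENTITY al <- dtd
  , Just e <- [lookup "name" al]
  , Just n <- [lookup "ndata" al]
  , n `notElem` notations ]

-- The overall result is the concatenation of all individual results.
validateDTD :: [XNode] -> [ValidationError]
validateDTD dtd =
  concatMap ($ dtd) [ checkNotations, checkUnparsedEntities notations ]
  where
    notations = declaredNames NOTATION dtd
```

Because every check has the same result type, adding a further check in this sketch only means appending one more function to the list in validateDTD.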

The following example shows a filter function that checks the validity constraint "No Notation on Empty Element" (section 3.3.1 of the XML 1.0 specification [WWW01]). The constraint states that an attribute of type NOTATION must not be declared for an element declared EMPTY. A notation attribute gives an application a hint about how the content of an element should be processed. The notation might refer to a program that can process content in a particular format, e.g. a base-64 encoded JPEG. Because empty elements cannot have content, attributes of type NOTATION are forbidden on them. The function checkNoNotationForEmptyElement is initialized with a list of the names of all elements declared EMPTY. The constructed filter is then applied to all XDTD ATTLIST nodes of type NOTATION, which have been selected from the DTD by another filter function.

Example 4-2. Validation that notations are not declared for EMPTY elements


-- The first argument is the list of all element names declared EMPTY.
checkNoNotationForEmptyElement :: [String] -> XmlFilter
checkNoNotationForEmptyElement emptyElems nd@(NTree (XDTD ATTLIST al) _)
    = if elemName `elem` emptyElems
      then err ("Attribute \""++ attName ++"\" of type NOTATION must not be "++
                "declared on the element \""++ elemName ++"\" declared EMPTY.") nd
      else []
      where
      elemName = getAttrValue1 a_name  al   -- element the attribute is declared for
      attName  = getAttrValue1 a_value al   -- name of the declared attribute

-- Any node other than an XDTD ATTLIST node is a programming error in the caller.
checkNoNotationForEmptyElement _ nd
    = error ("checkNoNotationForEmptyElement: illegal parameter:\n" ++ show nd)
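How such a filter might be constructed and applied can be sketched in a self-contained form. The following is an illustration under simplifying assumptions, not the toolbox's implementation: the tree and node types are reduced stand-ins, the attribute keys "name", "value" and "type" are assumed, errors are represented by a hypothetical XError node, and the helpers attr, emptyElements, isNotationAttlist and validateNoNotationOnEmpty are invented for this example. Unlike the original, the simplified check returns an empty list for nodes of the wrong kind instead of aborting.

```haskell
import Data.Maybe (fromMaybe)

-- Reduced stand-ins for the toolbox's tree types (assumptions).
data NTree a = NTree a [NTree a]
  deriving (Eq, Show)

data DTDElem = ELEMENT | ATTLIST
  deriving (Eq, Show)

type Attributes = [(String, String)]

data XNode = XDTD DTDElem Attributes
           | XError String            -- hypothetical error node
  deriving (Eq, Show)

type XmlTree   = NTree XNode
type XmlFilter = XmlTree -> [XmlTree]

-- Look up an attribute value, defaulting to the empty string.
attr :: String -> Attributes -> String
attr key = fromMaybe "" . lookup key

-- Names of all elements whose declared content type is EMPTY.
emptyElements :: [XmlTree] -> [String]
emptyElements dtd =
  [ attr "name" al
  | NTree (XDTD ELEMENT al) _ <- dtd
  , attr "type" al == "EMPTY" ]

-- Selects ATTLIST nodes that declare an attribute of type NOTATION.
isNotationAttlist :: XmlFilter
isNotationAttlist nd@(NTree (XDTD ATTLIST al) _)
  | attr "type" al == "NOTATION" = [nd]
isNotationAttlist _ = []

-- Simplified version of the check: yields one error node per violation.
checkNoNotationForEmptyElement :: [String] -> XmlFilter
checkNoNotationForEmptyElement emptyElems (NTree (XDTD ATTLIST al) _)
  | elemName `elem` emptyElems = [NTree (XError msg) []]
  | otherwise                  = []
  where
    elemName = attr "name"  al
    attName  = attr "value" al
    msg = "Attribute \"" ++ attName ++ "\" of type NOTATION must not be "
       ++ "declared on the element \"" ++ elemName ++ "\" declared EMPTY."
checkNoNotationForEmptyElement _ _ = []

-- Initialize the filter with the EMPTY element names, then apply it
-- to the NOTATION attribute declarations selected from the DTD.
validateNoNotationOnEmpty :: [XmlTree] -> [XmlTree]
validateNoNotationOnEmpty dtd =
  concatMap (checkNoNotationForEmptyElement (emptyElements dtd))
            (concatMap isNotationAttlist dtd)
```

Collecting the EMPTY element names once and passing them in keeps the check itself local to a single node, so it fits the filter type without needing access to the rest of the DTD.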
			

The validation functions cannot check whether a content model is deterministic. The XML 1.0 specification states this requirement for compatibility with SGML, because some SGML tools rely on unambiguous content models. XML processors may flag non-deterministic content models as errors, but the Haskell XML Toolbox does not, because it does not need deterministic content models to check whether the children of an element are valid (see Section 4.7).
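To illustrate why determinism is not strictly necessary for validation, the sketch below matches a list of child element names against a content model by repeatedly taking derivatives of the model. This is one technique that works without determinism, shown here as an assumption for illustration; it is not presented as the algorithm of Section 4.7. The Model type and the functions nullable, deriv and matches are names invented for this example. The model ((b, c) | (b, d)) is non-deterministic, since after reading b a parser with one element of lookahead cannot know which branch it is in, yet the matcher handles it.

```haskell
-- Content models over element names (illustrative type).
data Model
  = Fail                 -- matches no sequence at all
  | Eps                  -- matches only the empty sequence
  | Sym String           -- matches exactly one element with this name
  | Seq Model Model      -- (a, b)
  | Alt Model Model      -- (a | b)
  | Star Model           -- a*
  deriving (Eq, Show)

-- Does the model accept the empty sequence?
nullable :: Model -> Bool
nullable Fail      = False
nullable Eps       = True
nullable (Sym _)   = False
nullable (Seq a b) = nullable a && nullable b
nullable (Alt a b) = nullable a || nullable b
nullable (Star _)  = True

-- The model that remains after consuming one element name.
deriv :: String -> Model -> Model
deriv _ Fail    = Fail
deriv _ Eps     = Fail
deriv x (Sym y) = if x == y then Eps else Fail
deriv x (Seq a b)
  | nullable a  = Alt (Seq (deriv x a) b) (deriv x b)
  | otherwise   = Seq (deriv x a) b
deriv x (Alt a b) = Alt (deriv x a) (deriv x b)
deriv x (Star a)  = Seq (deriv x a) (Star a)

-- A child sequence is valid if the model left after consuming
-- all names accepts the empty sequence.
matches :: Model -> [String] -> Bool
matches model = nullable . foldl (flip deriv) model

-- The non-deterministic model ((b, c) | (b, d)).
ambiguous :: Model
ambiguous = Alt (Seq (Sym "b") (Sym "c")) (Seq (Sym "b") (Sym "d"))
```

With these definitions, matches ambiguous ["b", "d"] evaluates to True and matches ambiguous ["b", "b"] to False: both alternatives are tracked at once, so no choice between them has to be made when b is read.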