2.3. Filter functions

2.3.1. Introduction

Filters are the basic functions for processing the XmlTree representation of XML documents. A filter takes a node or a list of nodes and returns a list of nodes. The result list may be empty, contain a single item, or contain several items.

The idea of filters was adopted from HaXml [WWW21], but has been modified. In HaXml, filters work only on the document subset of XML documents. The Haskell XML Toolbox uses the generic tree data type NTree for modeling XML documents. This generic data model makes it possible to generalize HaXml's filter idea so that filters can process the whole XML document, including both the DTD subset and the document subset. This generalization allows a very uniform design of XML processing applications based on filters. In fact, the whole XML parser of the Haskell XML Toolbox works internally with filters. The differences between HaXml's approach and the approach of the Haskell XML Toolbox are described in depth in Section 5.2.

TFilter and TSFilter are the filter types for the general n-ary tree defined by the data type NTree. A filter of type TFilter takes a node and returns a list of nodes. A filter of type TSFilter takes a list of nodes and returns a list of nodes, too.


type TFilter  node	= NTree  node -> NTrees node
type TSFilter node	= NTrees node -> NTrees node
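
To make the two type synonyms concrete, the following sketch defines a minimal NTree and one filter of each kind. It assumes that NTree is a node value paired with a list of subtrees; the names childrenOf, leavesOnly and sampleTree are hypothetical and used only for illustration, they are not claimed to be part of the toolbox.

```haskell
-- Assumed shape of the generic tree: a node value plus a list of subtrees.
data NTree node = NTree node (NTrees node)
  deriving (Eq, Show)

type NTrees node = [NTree node]

type TFilter  node = NTree  node -> NTrees node
type TSFilter node = NTrees node -> NTrees node

-- Hypothetical TFilter: returns the children of the passed node.
childrenOf :: TFilter node
childrenOf (NTree _ cs) = cs

-- Hypothetical TSFilter: keeps only the trees that have no children.
leavesOnly :: TSFilter node
leavesOnly ts = [t | t@(NTree _ []) <- ts]

sampleTree :: NTree String
sampleTree = NTree "root" [NTree "a" [], NTree "b" [NTree "c" []]]
```

Applying childrenOf to sampleTree yields the two subtrees rooted at "a" and "b"; applying leavesOnly to that result keeps only the tree rooted at "a".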
			

XmlFilter and XmlSFilter are based on these types. They work only on trees whose nodes are of type XNode.


type XmlFilter  = TFilter  XNode
type XmlSFilter = TSFilter XNode
			

The filters can be used to select parts of a document, to construct new document parts or to change document parts. They can even be used for checking validity constraints as described in Chapter 4. In this case a filter returns an empty list if the document is valid, or a list of errors otherwise.

Filters can sometimes be thought of as predicates. In this case they are used for deciding whether or not to keep their input. Unlike a predicate in logic, such a filter does not return a Boolean value: if the predicate is false, an empty list is returned; if the predicate is true, a list containing the passed node is returned.

All filters share the same basic type, so they can be combined with the help of the combinators described in Section 2.4. This makes it possible to build complex filters from simpler ones.

The following list describes the basic filter functions for processing XML documents represented as an XmlTree. Some functions are higher-order functions and return a filter function as a result. The arguments of these functions are used to construct parameterized filters. This is useful, for example, for constructing filters that return only nodes with a certain property.

2.3.2. Filters from module NTree

Simple filters

none :: TFilter node

Takes any node and always returns an empty list. Algebraically zero.

this :: TFilter node

Takes any node and always returns a list containing the passed node. Algebraically unit.
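
The terms zero and unit can be illustrated with a small sketch. It assumes a simplified NTree and a locally defined sequential composition named compose, which is a hypothetical stand-in for the combinators of Section 2.4; childrenOf and sampleTree are likewise illustrative names only.

```haskell
-- Assumed shape of the generic tree.
data NTree node = NTree node [NTree node]
  deriving (Eq, Show)

type NTrees node = [NTree node]
type TFilter node = NTree node -> NTrees node

none :: TFilter node
none _ = []

this :: TFilter node
this t = [t]

-- Hypothetical helper: run g first, then f on every result of g.
compose :: TFilter node -> TFilter node -> TFilter node
compose f g = concatMap f . g

-- Hypothetical example filter: returns the children of a node.
childrenOf :: TFilter node
childrenOf (NTree _ cs) = cs

sampleTree :: NTree Int
sampleTree = NTree 1 [NTree 2 [], NTree 3 []]

-- none behaves like zero: composing with it always gives the empty list.
zeroLeft, zeroRight :: NTrees Int
zeroLeft  = (none `compose` childrenOf) sampleTree
zeroRight = (childrenOf `compose` none) sampleTree

-- this behaves like unit: composing with it changes nothing.
unitLeft, unitRight :: NTrees Int
unitLeft  = (this `compose` childrenOf) sampleTree
unitRight = (childrenOf `compose` this) sampleTree
```

Under these assumptions zeroLeft and zeroRight are both the empty list, while unitLeft and unitRight are both equal to childrenOf sampleTree.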

Selection filters

isOfNode :: (node -> Bool) -> TFilter node

Takes a predicate function and returns a filter. The filter returns a list with the passed node if the predicate function is true for the node, otherwise it returns an empty list.

isNode :: Eq node => node -> TFilter node

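The same as isOfNode, except that a reference node is taken instead of a predicate function. The filter returns a list with the passed node if it equals the reference node, otherwise it returns an empty list.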
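
A minimal sketch of how these two selection filters might be written is given below. The definitions are an illustration based on the descriptions above and on an assumed NTree shape, not the toolbox's actual source; sampleNodes, evenNodes and fours are hypothetical example values.

```haskell
-- Assumed shape of the generic tree.
data NTree node = NTree node [NTree node]
  deriving (Eq, Show)

type NTrees node = [NTree node]
type TFilter node = NTree node -> NTrees node

-- Keep the passed tree if the predicate holds for its root node.
isOfNode :: (node -> Bool) -> TFilter node
isOfNode p t@(NTree n _)
  | p n       = [t]
  | otherwise = []

-- Keep the passed tree if its root node equals the reference node.
isNode :: Eq node => node -> TFilter node
isNode ref = isOfNode (== ref)

sampleNodes :: NTrees Int
sampleNodes = [NTree 1 [], NTree 2 [NTree 5 []], NTree 4 []]

-- Trees whose root node is even.
evenNodes :: NTrees Int
evenNodes = concatMap (isOfNode even) sampleNodes

-- Trees whose root node is exactly 4.
fours :: NTrees Int
fours = concatMap (isNode 4) sampleNodes
```

Here evenNodes contains the trees rooted at 2 and 4, including the child of 2, and fours contains only the tree rooted at 4. The predicate is applied to the root node only, not to the children.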

Filters for modifying nodes

mkNTree :: NTree node -> TFilter node

Takes a node and returns a filter. The filter always returns a list with this node and ignores the passed one.

replaceNode :: node -> TFilter node

Takes a node and returns a filter. The filter replaces the passed node with the given one. The children of the passed node are attached to the new node.

replaceChildren :: NTrees node -> TFilter node

Like replaceNode, except that in this case the children are replaced by the given list. The passed node itself is not modified.

modifyNode :: (node -> Maybe node) -> TFilter node

Takes a function for modifying nodes and returns a filter. The function (node -> Maybe node) is applied to the passed node; the children are not modified.

modifyNode0 :: (node -> node) -> TFilter node

Like modifyNode, except that the type of the modification function (node -> node) is different.

modifyChildren :: TSFilter node -> TFilter node

Takes a filter that processes lists of nodes and returns a new filter. The new filter applies the filter TSFilter node to the child list of a passed node. The node itself is not modified.
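
The modifying filters can be sketched as follows. The definitions are written from the descriptions above against an assumed NTree shape and are not the toolbox's actual source. In particular, returning an empty list when the function passed to modifyNode yields Nothing is an assumption of this sketch. The names sampleTree and renameP are hypothetical.

```haskell
-- Assumed shape of the generic tree.
data NTree node = NTree node [NTree node]
  deriving (Eq, Show)

type NTrees node = [NTree node]
type TFilter  node = NTree  node -> NTrees node
type TSFilter node = NTrees node -> NTrees node

mkNTree :: NTree node -> TFilter node
mkNTree t = const [t]

-- New node value, children of the passed tree are kept.
replaceNode :: node -> TFilter node
replaceNode n (NTree _ cs) = [NTree n cs]

-- New children, node value of the passed tree is kept.
replaceChildren :: NTrees node -> TFilter node
replaceChildren cs (NTree n _) = [NTree n cs]

-- Assumption: a result of Nothing produces an empty list.
modifyNode :: (node -> Maybe node) -> TFilter node
modifyNode f (NTree n cs) = [NTree n' cs | Just n' <- [f n]]

modifyNode0 :: (node -> node) -> TFilter node
modifyNode0 f (NTree n cs) = [NTree (f n) cs]

modifyChildren :: TSFilter node -> TFilter node
modifyChildren f (NTree n cs) = [NTree n (f cs)]

sampleTree :: NTree String
sampleTree = NTree "p" [NTree "x" [], NTree "y" []]

-- Hypothetical modification function: renames "p" and rejects anything else.
renameP :: String -> Maybe String
renameP n = if n == "p" then Just "div" else Nothing
```

For example, replaceNode "q" sampleTree keeps both children under a new root "q", modifyChildren reverse sampleTree swaps the two children, and modifyNode renameP sampleTree yields a single tree rooted at "div".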

2.3.3. Filters from module XmlTreeAccess

Predicate filters

isTag :: TagName -> XmlFilter

Takes a TagName and returns a filter. The filter returns a list with the passed node if its name equals TagName, otherwise an empty list is returned.

isOfTag :: (TagName -> Bool) -> XmlFilter

Takes a predicate function (TagName -> Bool) and returns a filter. The filter applies the predicate function to the name of a passed node. If the predicate function is true, the filter returns a list with the node. Otherwise an empty list is returned.

attrHasValue :: AttrName -> (AttrValue -> Bool) -> XmlFilter

Constructs a predicate filter for attributes whose value meets a predicate function. The constructed filter returns a list with the passed node if the node has an attribute with name AttrName and its value matches the predicate function (AttrValue -> Bool). Otherwise an empty list is returned.

Many further predicate filters are provided by the module XmlTreePredicates: isXCdata, isXCharRef, isXCmt, isXDTD, isXEntityRef, isXError, isXNoError, isXPi, isXTag, isXText, etc. These filters are used for identifying special types of nodes.
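
The following sketch shows how the three predicate filters above behave. It uses a reduced stand-in for XNode with only three constructors and assumes that an attribute list is a list of name/value pairs; both are simplifications made for this example. The sketch also restricts attrHasValue to XTag nodes. The value link is a hypothetical sample element.

```haskell
type TagName   = String
type AttrName  = String
type AttrValue = String
type TagAttrl  = [(AttrName, AttrValue)]  -- assumed representation

-- Reduced stand-in for XNode (the real type has more constructors).
data XNode
  = XText String
  | XCmt  String
  | XTag  TagName TagAttrl
  deriving (Eq, Show)

data NTree node = NTree node [NTree node]
  deriving (Eq, Show)

type XmlTree   = NTree XNode
type XmlTrees  = [XmlTree]
type XmlFilter = XmlTree -> XmlTrees

-- Keep an element if the predicate holds for its name.
isOfTag :: (TagName -> Bool) -> XmlFilter
isOfTag p t@(NTree (XTag n _) _)
  | p n = [t]
isOfTag _ _ = []

isTag :: TagName -> XmlFilter
isTag n = isOfTag (== n)

-- Keep an element if it has the attribute and its value satisfies p.
attrHasValue :: AttrName -> (AttrValue -> Bool) -> XmlFilter
attrHasValue a p t@(NTree (XTag _ al) _)
  | maybe False p (lookup a al) = [t]
attrHasValue _ _ _ = []

-- Hypothetical sample: <a href="index.html">home</a>
link :: XmlTree
link = NTree (XTag "a" [("href", "index.html")]) [NTree (XText "home") []]
```

With these definitions isTag "a" link and attrHasValue "href" (== "index.html") link both return a list containing link, whereas isTag "p" link and attrHasValue "id" (const True) link return an empty list.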

Construction filters

mkXTag :: TagName -> TagAttrl -> XmlTrees -> XmlFilter

The created filter constructs an XTag node with the name TagName, an attribute list TagAttrl and a list of children. The passed node is ignored by the filter.

mkXText :: String -> XmlFilter

The created filter constructs an XText node with text data. The passed node is ignored by the filter. There exists a shortcut function txt that does the same.

mkXCharRef :: Int -> XmlFilter

The created filter constructs an XCharRef node with a reference number to a character. The passed node is ignored by the filter.

mkXEntityRef :: String -> XmlFilter

The created filter constructs an XEntityRef node with an entity reference. The passed node is ignored by the filter.

mkXCmt :: String -> XmlFilter

The created filter constructs an XCmt node with text data. The passed node is ignored by the filter. There exists a shortcut function cmt that does the same.

mkXDTD :: DTDElem -> TagAttrl -> XmlTrees -> XmlFilter

The created filter constructs an XDTD node. The type of the node is specified by the algebraic data type DTDElem. The node has attributes and a list of children. The passed node is ignored by the filter.

mkXPi :: String -> TagAttrl -> XmlFilter

The created filter constructs an XPi node with a name and attributes. The passed node is ignored by the filter.

mkXCdata :: String -> XmlFilter

The created filter constructs an XCdata node with text data. The passed node is ignored by the filter.

mkXError :: Int -> String -> XmlFilter

The created filter constructs an XError node with an error level and an error message. The passed node is stored in the child list of this error node, so that the location where the error occurred can be preserved. The shortcut functions warn, err and fatal of type String -> XmlFilter can be used to create specific error nodes.

mkXElem :: TagName -> TagAttrl -> [XmlFilter] -> XmlFilter

The created filter constructs an XTag node with the name TagName and the attribute list TagAttrl. Its child list is constructed by applying the filter list [XmlFilter] to the passed node. There exists a shortcut function tag that does the same.

mkXSElem :: TagName -> [XmlFilter] -> XmlFilter

The created filter constructs a simple XTag node. It works like mkXElem except that no attribute list is created. There exists a shortcut function stag that does the same.

mkXEElem :: TagName -> XmlFilter

The created filter constructs an empty XTag node. It works like mkXSElem except that no child list is created. There exists a shortcut function etag that does the same.
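
A few of the construction filters can be sketched as follows. The code is an illustration written from the descriptions above, using a reduced stand-in for XNode and an assumed name/value pair representation for attribute lists; it is not the toolbox's actual source. The values anyNode, paragraph, wrapped and failure are hypothetical, and the error level 2 is an arbitrary number chosen for the example.

```haskell
type TagName  = String
type TagAttrl = [(String, String)]  -- assumed representation

-- Reduced stand-in for XNode (the real type has more constructors).
data XNode
  = XText  String
  | XTag   TagName TagAttrl
  | XError Int String
  deriving (Eq, Show)

data NTree node = NTree node [NTree node]
  deriving (Eq, Show)

type XmlTree   = NTree XNode
type XmlTrees  = [XmlTree]
type XmlFilter = XmlTree -> XmlTrees

mkXText :: String -> XmlFilter
mkXText s = const [NTree (XText s) []]

-- The child list is built by applying every child filter to the passed node.
mkXElem :: TagName -> TagAttrl -> [XmlFilter] -> XmlFilter
mkXElem n al fs t = [NTree (XTag n al) (concatMap ($ t) fs)]

mkXSElem :: TagName -> [XmlFilter] -> XmlFilter
mkXSElem n = mkXElem n []

mkXEElem :: TagName -> XmlFilter
mkXEElem n = mkXSElem n []

-- The passed node is kept as the only child of the error node.
mkXError :: Int -> String -> XmlFilter
mkXError level msg t = [NTree (XError level msg) [t]]

anyNode :: XmlTree
anyNode = NTree (XText "input") []

-- <p>hello<br/></p>; the passed node does not appear in the result.
paragraph :: XmlTrees
paragraph = mkXSElem "p" [mkXText "hello", mkXEElem "br"] anyNode

-- A child filter that returns its input copies the passed node into the element.
wrapped :: XmlTrees
wrapped = mkXSElem "wrap" [\t -> [t]] anyNode

failure :: XmlTrees
failure = mkXError 2 "unexpected text" anyNode
```

The value paragraph shows that constant child filters ignore the passed node, wrapped shows that a child filter may also reuse it, and failure shows how the passed node is preserved below the error node.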

Selection filters

getXTagName :: XmlFilter

If the passed node is of type XTag, a list with an XText node is returned. This node contains the name of the element. Otherwise an empty list is returned.

getXTagAttr :: AttrName -> XmlFilter

If the passed node is of type XTag and there exists an attribute with the name AttrName, a list with an XText node is returned. This node contains the value of the attribute. Otherwise an empty list is returned.

getXDTDAttr :: AttrName -> XmlFilter

The same as getXTagAttr except that it works on XDTD nodes.

getXText :: XmlFilter

If the passed node is of type XText, a list with an XText node is returned. This node contains text data. Otherwise an empty list is returned.

getXCmt :: XmlFilter

If the passed node is of type XCmt, a list with an XText node is returned. This node contains the text data of the comment. Otherwise an empty list is returned.

getXPiName :: XmlFilter

If the passed node is of type XPi, a list with an XText node is returned. This node contains the name of the processing instruction. Otherwise an empty list is returned.

getXCdata :: XmlFilter

If the passed node is of type XCdata, a list with an XText node is returned. This node contains the text data of the CDATA section. Otherwise an empty list is returned.

getXError :: XmlFilter

If the passed node is of type XError, a list with an XText node is returned. This node contains the error message. Otherwise an empty list is returned.
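
The selection filters all follow the same pattern: match on the node type, wrap the selected string in an XText node, and return an empty list for every other node type. The sketch below illustrates this for four of them, using a reduced stand-in for XNode and an assumed name/value pair representation for attribute lists. The value img is a hypothetical sample element.

```haskell
type TagName  = String
type AttrName = String
type TagAttrl = [(AttrName, String)]  -- assumed representation

-- Reduced stand-in for XNode (the real type has more constructors).
data XNode
  = XText String
  | XCmt  String
  | XTag  TagName TagAttrl
  deriving (Eq, Show)

data NTree node = NTree node [NTree node]
  deriving (Eq, Show)

type XmlTree   = NTree XNode
type XmlTrees  = [XmlTree]
type XmlFilter = XmlTree -> XmlTrees

-- Helper for this sketch: a text node without children.
textNode :: String -> XmlTree
textNode s = NTree (XText s) []

getXTagName :: XmlFilter
getXTagName (NTree (XTag n _) _) = [textNode n]
getXTagName _                    = []

getXTagAttr :: AttrName -> XmlFilter
getXTagAttr a (NTree (XTag _ al) _) = [textNode v | Just v <- [lookup a al]]
getXTagAttr _ _                     = []

getXText :: XmlFilter
getXText (NTree (XText s) _) = [textNode s]
getXText _                   = []

getXCmt :: XmlFilter
getXCmt (NTree (XCmt c) _) = [textNode c]
getXCmt _                  = []

-- Hypothetical sample: <img src="logo.png"/>
img :: XmlTree
img = NTree (XTag "img" [("src", "logo.png")]) []
```

For the sample element, getXTagName img returns a text node containing "img" and getXTagAttr "src" img returns a text node containing "logo.png", while getXTagAttr "alt" img and getXCmt img return an empty list.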

Substitution filters

replaceTagName :: TagName -> XmlFilter

The constructed filter replaces the name of an XTag or XPi node by the TagName and returns a list with the modified node.

replaceAttrl :: TagAttrl -> XmlFilter

The constructed filter replaces the attribute list of an XTag, XDTD or XPi node by the TagAttrl and returns a list with the modified node.

modifyTagName :: (TagName -> TagName) -> XmlFilter

The constructed filter modifies the name of an XTag or XPi node by applying the function (TagName -> TagName) to the name. The filter returns a list with the modified node.

modifyAttrl :: (TagAttrl -> TagAttrl) -> XmlFilter

The constructed filter modifies the attribute list of an XTag, XDTD or XPi node by applying the function (TagAttrl -> TagAttrl) to the attribute list. The filter returns a list with the modified node.

modifyAttr :: AttrName -> AttrValue -> XmlFilter

The constructed filter changes the value of the attribute whose name equals AttrName to AttrValue and returns a list with the modified node.
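
The substitution filters can be sketched in the same way. The code below uses a reduced stand-in for XNode without XDTD nodes and an assumed name/value pair representation for attribute lists. Two points are assumptions of this sketch and are not stated in the descriptions above: node types that have no name or attribute list yield an empty list, and modifyAttr leaves the attribute list unchanged if no attribute with the given name exists. The value img is a hypothetical sample element.

```haskell
type TagName   = String
type AttrName  = String
type AttrValue = String
type TagAttrl  = [(AttrName, AttrValue)]  -- assumed representation

-- Reduced stand-in for XNode (XDTD and other constructors are omitted).
data XNode
  = XText String
  | XPi   TagName TagAttrl
  | XTag  TagName TagAttrl
  deriving (Eq, Show)

data NTree node = NTree node [NTree node]
  deriving (Eq, Show)

type XmlTree   = NTree XNode
type XmlTrees  = [XmlTree]
type XmlFilter = XmlTree -> XmlTrees

modifyTagName :: (TagName -> TagName) -> XmlFilter
modifyTagName f (NTree (XTag n al) cs) = [NTree (XTag (f n) al) cs]
modifyTagName f (NTree (XPi  n al) cs) = [NTree (XPi  (f n) al) cs]
modifyTagName _ _                      = []  -- assumption of this sketch

replaceTagName :: TagName -> XmlFilter
replaceTagName n = modifyTagName (const n)

modifyAttrl :: (TagAttrl -> TagAttrl) -> XmlFilter
modifyAttrl f (NTree (XTag n al) cs) = [NTree (XTag n (f al)) cs]
modifyAttrl f (NTree (XPi  n al) cs) = [NTree (XPi  n (f al)) cs]
modifyAttrl _ _                      = []  -- assumption of this sketch

replaceAttrl :: TagAttrl -> XmlFilter
replaceAttrl al = modifyAttrl (const al)

-- Sets the value of every attribute named a; other attributes are kept.
modifyAttr :: AttrName -> AttrValue -> XmlFilter
modifyAttr a v = modifyAttrl (map setValue)
  where
    setValue (k, old)
      | k == a    = (k, v)
      | otherwise = (k, old)

-- Hypothetical sample: <img src="a.png" alt=""/>
img :: XmlTree
img = NTree (XTag "img" [("src", "a.png"), ("alt", "")]) []
```

For the sample element, replaceTagName "image" img changes only the element name, modifyAttr "alt" "logo" img changes only the value of the alt attribute, and modifyAttrl (filter ((/= "alt") . fst)) img removes the alt attribute.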