A.6. Performance and profiling

Time and space profiling is very useful for optimizing programs and removing bottlenecks. Profiling support has already been added to the XML parser, but this information has not been exploited yet; at the moment the main focus of the parser lies on a clear and simple design.

A.6.1. Usage information

The validating XML parser can be analyzed with the time and space profiling system of the Glasgow Haskell Compiler (GHC). Cost centers have been added for all main computations, so that the parts of the parser that consume the most time and memory can be identified.
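Cost centers are attached to expressions in the source code with GHC's SCC pragma. The following lines are only a sketch of this mechanism; the function body is made up for illustration and is not taken from the parser's sources:

-- Hypothetical example of an SCC annotation; expandEntities merely
-- stands in for the real parameter entity substitution.
processDTD :: String -> String
processDTD dtd = {-# SCC "processDTD" #-} expandEntities dtd
  where
    expandEntities = id

main :: IO ()
main = putStrLn (processDTD "<!DOCTYPE example>")

When such a program is compiled with -prof and run with +RTS -p -RTS, the time and memory allocated by the annotated expression are reported under the cost center processDTD.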

These tests are run with the tool Profiling, which is built by executing make Profiling in the root directory of the Haskell XML Toolbox.

Profiling must be invoked with an XML file as test input and with several run-time parameters that influence GHC's profiling behavior. These parameters are described in depth in the GHC documentation.
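For example, assuming the input file is called test.xml, the standard GHC run-time options for profiling lead to invocations like the following (-px generated the XML format understood by ghcprof in the GHC versions of that time):

Profiling test.xml +RTS -p -RTS       time and allocation profile, text format
Profiling test.xml +RTS -px -RTS      time and allocation profile, XML format for ghcprof
Profiling test.xml +RTS -hc -RTS      heap profile by cost-center stack
Profiling test.xml +RTS -hr -RTS      heap profile by retainer set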

Four predefined profiling runs have been added as targets to the Makefile in the root directory of the Haskell XML Toolbox. The input file for Profiling can be set via the variable PROFILING_INPUT.
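A typical invocation therefore looks like this; the file name is only an example:

make runprofiling2 PROFILING_INPUT=./examples/example.xml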

Predefined profiling targets:

make runprofiling1

Time and allocation profiling - Generate profiling information in XML format and display it with GHC's ghcprof tool.

make runprofiling2

Time and allocation profiling - Generate profiling information in text format and display it with less.

make runprofiling3

Heap profiling - Break down the graph by the cost-center stack which produced the data.

make runprofiling4

Heap profiling - Break down the graph by the retainer set.
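The heap profiling targets write a file Profiling.hp. If the Makefile does not already perform the conversion, GHC's tool hp2ps turns this file into a PostScript graph:

hp2ps -c Profiling.hp

The option -c selects a colored graph; the result is written to Profiling.ps.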

A.6.2. How to interpret the results

This section shows how to interpret the results of the profiling tests. The W3C Recommendation of the Extensible Hypertext Markup Language (XHTML) with its Strict DTD was chosen as test input. The test machine was a Pentium 4 with 1.6 GHz and 256 MB of memory, running Debian 3.0.

The results of time and space profiling show that most time and memory are consumed by the function processDTD. This function performs the substitution of parameter entities in the internal and external subset of the DTD. The next most resource-intensive function is parseXmlDoc, which parses an XML document and constructs an XmlTree representation from it.


total time  =        1.26 secs   (63 ticks @ 20 ms)
total alloc = 102,664,004 bytes  (excludes profiling overheads)


COST CENTRE                    MODULE               %time %alloc

processDTD                     HdomParser            27.0   45.6
parseXmlDoc                    HdomParser            19.0   29.3
buildAllValFcts                DocValidation         14.3    5.0
validateAttributes             DTDValidation         12.7    0.5
processGeneralEntities         HdomParser             7.9    9.6
MAIN                           MAIN                   6.3    5.0
buildAllTransFcts              DocTransformation      4.8    1.8
IdVal.validateIds              Validation             4.8    1.1
validateElements               DTDValidation          3.2    0.2
getXmlContents                 HdomParser             0.0    1.1

The ranking of the cost centers may change dramatically when the profiling tests are performed with other documents. Here, parameter entities are used extensively in the Strict DTD of XHTML, so it is not surprising that the function handling their substitution consumes the most time and space. Another important aspect is the ratio of the DTD subset to the document subset: if, for example, the DTD subset is very large, its processing functions will account for a correspondingly large share of the time and space consumption.

Heap profiling produces graphs that show the memory consumption of the cost centers over the program's run time. These graphs show very clearly how the execution behavior of Haskell programs differs from that of programs written in imperative languages. Haskell evaluates expressions lazily: arguments to functions are evaluated only when they are needed, and functions compute their results only as far as their callers demand them.
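A small self-contained example of this behavior: although the list [1 ..] is infinite, the program terminates, because take demands only its first five elements.

-- Lazy evaluation: only five squares are ever computed.
main :: IO ()
main = print (take 5 (map (^ 2) [1 ..]))   -- prints [1,4,9,16,25]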

As a consequence of lazy evaluation, Haskell programs do not have a strict sequential evaluation order; instead their execution resembles a stream. The graphs show this behavior very clearly. First the parsing functions start; after they have produced enough output, processing of the DTD begins; finally the validation process starts its calculations. The graphs show that almost all functions work in parallel once they have enough input.
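This interleaving can be made visible with Debug.Trace in a small sketch; the two stages are artificial and merely stand in for parsing, DTD processing and validation:

import Debug.Trace (trace)

-- The consumer (take 3) drives the whole pipeline, so the trace
-- messages of the two stages interleave instead of appearing
-- stage by stage.
stage1 :: [Int] -> [Int]
stage1 = map (\x -> trace ("stage1 " ++ show x) (x + 1))

stage2 :: [Int] -> [Int]
stage2 = map (\x -> trace ("stage2 " ++ show x) (x * 2))

main :: IO ()
main = print (take 3 (stage2 (stage1 [1 ..])))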