Xml Light Parser
Basic parsing functions are available in the Xml module; this module provides a way to create, configure and run an Xml parser.
type source =
| SFile of string
| SChannel of in_channel
| SString of string
| SLexbuf of Lexing.lexbuf
Several kinds of resources can contain Xml documents.
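As an illustration of the source kinds, the sketch below parses the same document from a string, from a lexing buffer and from a file. It assumes the Xml-Light library is linked, that the parser is run with a function `XmlParser.parse : t -> source -> Xml.xml` (not listed in this section), and that `Xml.to_string` is available in the Xml module; the temporary file merely stands in for a real document on disk.

```ocaml
(* Sketch only: XmlParser.parse and Xml.to_string are assumed here. *)
let text = "<greeting lang=\"en\">hello</greeting>"

let p = XmlParser.make ()

(* From an in-memory string. *)
let from_string = XmlParser.parse p (XmlParser.SString text)

(* From a lexing buffer created by the caller. *)
let from_lexbuf =
  XmlParser.parse p (XmlParser.SLexbuf (Lexing.from_string text))

(* From a file: write the document to a temporary file first. *)
let from_file =
  let path = Filename.temp_file "xmlparser" ".xml" in
  let oc = open_out path in
  output_string oc text;
  close_out oc;
  let doc = XmlParser.parse p (XmlParser.SFile path) in
  Sys.remove path;
  doc

let () = print_endline (Xml.to_string from_string)
```

An `SChannel` source would be used the same way, with an `in_channel` opened by the caller.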
val make : unit -> t
This function returns a new parser with default options.
val prove : t -> bool -> unit
This function enables or disables automatic DTD proving by the parser. Note that Xml documents having no reference to a DTD are never proved when parsed, but you can prove them later using the Dtd module. By default, prove is true.
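A minimal sketch of proving a document after parsing, under the following assumptions: the Xml-Light library is linked, the parser is run with `XmlParser.parse` (not listed in this section), and the Dtd module offers `Dtd.parse_string`, `Dtd.check` and `Dtd.prove` with the signatures used below. The inline DTD is hypothetical and exists only for the example.

```ocaml
(* Hypothetical one-element DTD, used only for this sketch. *)
let dtd_text = "<!ELEMENT note (#PCDATA)>"

let p = XmlParser.make ()

(* Turn automatic proving off for this parser. *)
let () = XmlParser.prove p false

let doc =
  XmlParser.parse p (XmlParser.SString "<note>remember the milk</note>")

(* Prove later: parse and check the DTD, then prove the document
   against it, naming "note" as the expected root element. *)
let checked = Dtd.check (Dtd.parse_string dtd_text)
let proved = Dtd.prove checked "note" doc

let () = print_endline (Xml.to_string proved)
```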
val resolve : t -> (string -> Dtd.checked) -> unit
When parsing an Xml document from a file using the Xml.parse_file function, the DTD file, if one is declared by the Xml document, has to be in the same directory as the xml file. When using other parsing functions, such as on a string or on a channel, the parser always raises Xml.File_not_found if a DTD file is needed and prove is enabled. To enable loading of the DTD file, the user has to configure the Xml parser with a resolve function that takes the DTD filename as argument and returns a checked DTD. The user can then implement any kind of DTD loading strategy, and can use the Dtd module functions to parse and check the DTD file. By default, the resolve function raises Xml.File_not_found.
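One possible loading strategy is to look up every DTD in a fixed directory. The sketch below assumes the Xml-Light library is linked, that `Dtd.parse_file` and `Dtd.check` exist with the signatures used here, and that the parser is run with `XmlParser.parse`. The names `load_dtd`, `dtd_dir` and `requested` are hypothetical, and the DTD is written to a temporary file so the example can run on its own.

```ocaml
(* Create a throwaway DTD file; a real program would ship its own. *)
let dtd_path = Filename.temp_file "note" ".dtd"
let () =
  let oc = open_out dtd_path in
  output_string oc "<!ELEMENT note (#PCDATA)>";
  close_out oc

let dtd_dir = Filename.dirname dtd_path
let dtd_name = Filename.basename dtd_path

(* Records which DTD filenames the parser asked for. *)
let requested = ref []

(* Resolver: load the named DTD from dtd_dir and check it. *)
let load_dtd name =
  requested := name :: !requested;
  Dtd.check (Dtd.parse_file (Filename.concat dtd_dir name))

let p = XmlParser.make ()
let () = XmlParser.resolve p load_dtd

(* The document is parsed from a string, so without the resolver the
   DTD could not be found. *)
let doc =
  let text =
    Printf.sprintf "<!DOCTYPE note SYSTEM \"%s\"><note>hi</note>" dtd_name
  in
  XmlParser.parse p (XmlParser.SString text)

let () = Sys.remove dtd_path
let () = print_endline (Xml.to_string doc)
```

The same shape works for other strategies, such as a search path or an in-memory table of already checked DTDs.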
val check_eof : t -> bool -> unit
When an Xml document is parsed, the parser checks that the end of the document is reached, so for example parsing "<A/><B/>" fails instead of returning only the A element. You can turn off this check by setting check_eof to false. By default, check_eof is true.
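The difference can be sketched on the "<A/><B/>" input mentioned above. This assumes the Xml-Light library is linked, that the parser is run with `XmlParser.parse`, and that the failure is reported as the `Xml.Error` exception from the Xml module; `parse_with` is a hypothetical helper.

```ocaml
let input = "<A/><B/>"

(* Parse input with check_eof set as requested; None means the
   parser rejected the document. *)
let parse_with ~check =
  let p = XmlParser.make () in
  XmlParser.check_eof p check;
  try Some (XmlParser.parse p (XmlParser.SString input))
  with Xml.Error _ -> None

let strict = parse_with ~check:true    (* default behavior *)
let lenient = parse_with ~check:false  (* stops after the first element *)

let describe = function
  | None -> "rejected"
  | Some x -> "parsed element " ^ Xml.tag x

let () =
  Printf.printf "check_eof=true: %s\ncheck_eof=false: %s\n"
    (describe strict) (describe lenient)
```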
Once the parser is configured, you can run it on any kind of Xml document source to parse its contents into an Xml data structure.
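A short sketch of configuring a parser, running it and inspecting the result. It assumes the Xml-Light library is linked, that the run function is `XmlParser.parse : t -> source -> Xml.xml` (its signature is not listed in this section), and that `Xml.xml` has the `Element` and `PCData` constructors used below; `summary` is a hypothetical helper.

```ocaml
let p = XmlParser.make ()

(* Configure first, then run. *)
let () =
  XmlParser.prove p false;
  XmlParser.check_eof p true

let doc =
  XmlParser.parse p
    (XmlParser.SString "<book id=\"42\"><title>OCaml</title></book>")

(* Summarize the root node of the parsed structure. *)
let summary = function
  | Xml.Element (tag, attrs, children) ->
      Printf.sprintf "element %s, %d attribute(s), %d child node(s)"
        tag (List.length attrs) (List.length children)
  | Xml.PCData text -> "text: " ^ text

let () = print_endline (summary doc)
```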
val concat_pcdata : t -> bool -> unit
When several PCData elements are separated by a \n (or \r\n), you can either split the PCData into two distinct PCData elements or merge them into one PCData with \n as separator. The default behavior is to concatenate the PCData, but this can be changed for a given parser with this flag.
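To compare the two settings, the sketch below parses the same two-line text with the flag on and off and counts the child nodes. It assumes the Xml-Light library is linked, that the parser is run with `XmlParser.parse`, and that `Xml.children` is available in the Xml module; `children_with` is a hypothetical helper. Going by the description above, one would expect a single merged node in the first case and separate nodes in the second, but the exact counts are printed rather than assumed.

```ocaml
let input = "<p>first line\nsecond line</p>"

(* Child nodes of the root element, with concat_pcdata set as requested. *)
let children_with ~concat =
  let p = XmlParser.make () in
  XmlParser.concat_pcdata p concat;
  Xml.children (XmlParser.parse p (XmlParser.SString input))

let merged = children_with ~concat:true
let split = children_with ~concat:false

let () =
  Printf.printf "concat=true: %d node(s), concat=false: %d node(s)\n"
    (List.length merged) (List.length split)
```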