package pacomb
Sources
md5=e48dc9fae5b96632bd1de929a49af71c
sha512=e4bf5dcfb0d4c5225a81fffe8e74cd9c147221eb9c8278b05d22391da0e06c6997e5b9a83a6431d72829f07f402da2449778cfe0bd56e7e2d3c8e08bbc1a73d5
Description
Pacomb is a parsing library that compiles grammars to combinators prior to parsing, together with a PPX extension for writing parsers inside OCaml files.
The advantages of Pacomb are:
- Grammars as first-class values defined in your OCaml files. This is an example from the distribution:

    (* The three levels of priorities *)
    type p = Atom | Prod | Sum
    let%parser rec
      (* This includes each priority level in the next one *)
      expr p = Atom < Prod < Sum
      (* all other rules are selected by their priority level *)
      ; (p=Atom) (x::FLOAT) => x
      ; (p=Atom) '(' (e::expr Sum) ')' => e
      ; (p=Prod) (x::expr Prod) '*' (y::expr Atom) => x*.y
      ; (p=Prod) (x::expr Prod) '/' (y::expr Atom) => x/.y
      ; (p=Sum ) (x::expr Sum ) '+' (y::expr Prod) => x+.y
      ; (p=Sum ) (x::expr Sum ) '-' (y::expr Prod) => x-.y

- Good performance:
  - on non-ambiguous grammars, 2 to 3 times slower than ocamlyacc;
  - on ambiguous grammars, O(N^3 ln(N)) can be achieved.
- Parsing from left to right (despite the use of combinators), so the whole input need not be kept in memory and streams can be parsed.
- Dependent sequences, allowing for self-extensible grammars (for example, a new infix operator with a given priority).
- Management of blanks, which for instance allows nested languages using different kinds of comments or blanks.
- Support for cache and merge for ambiguous grammars (to get O(N^3 ln(N))).
- Enough support for UTF-8 to write a parser for a language using UTF-8.
- Documentation and various examples illustrating most possibilities.
All this makes Pacomb a promising solution for implementing languages in OCaml.
Published: 30 Jul 2023
README
PaComb: an efficient parsing library for OCaml
PaComb implements a representation of grammars with semantic actions (values returned as a result of parsing). Parsing is performed by compiling grammars defined with the Grammar module (or indirectly through a PPX extension) to the combinators of the Combinator module. The library offers scannerless parsing, but the Lex module provides a notion of terminals and blanks that gives a simple way to write grammars in two phases, as usual.
The main advantage of PaComb and similar solutions, contrary to ocamlyacc, is that grammars (compiled or not) are first class values. This allows using the full power of OCaml for manipulating grammars. For example, this is very useful when working with syntax extension mechanisms.
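To make "grammars are first-class values" concrete, here is a minimal sketch in plain OCaml. It deliberately does not use PaComb's Grammar or Combinator modules: the type `parse_fn` and the helpers `char`, `alt`, `map` and `one_of` are hypothetical names invented for this illustration. The point is only that, once a parser is an ordinary value, an ordinary function can build one from data.

```ocaml
(* Illustration only: a toy parser type, NOT PaComb's API.
   A parser reads string [s] from index [i] and returns a value
   together with the next index, or None on failure. *)
type 'a parse_fn = string -> int -> ('a * int) option

(* Parse one specific character. *)
let char (c : char) : char parse_fn =
  fun s i ->
    if i < String.length s && s.[i] = c then Some (c, i + 1) else None

(* Try [p]; if it fails, try [q] at the same position. *)
let alt (p : 'a parse_fn) (q : 'a parse_fn) : 'a parse_fn =
  fun s i -> match p s i with Some _ as r -> r | None -> q s i

(* Apply a semantic action [f] to the result of [p]. *)
let map (f : 'a -> 'b) (p : 'a parse_fn) : 'b parse_fn =
  fun s i -> match p s i with Some (x, j) -> Some (f x, j) | None -> None

(* Parsers are ordinary values, so an ordinary function can build one:
   a table of (character, value) pairs becomes a single parser. *)
let one_of (table : (char * 'a) list) : 'a parse_fn =
  List.fold_right
    (fun (c, v) acc -> alt (map (fun _ -> v) (char c)) acc)
    table
    (fun _ _ -> None)

(* A parser computed from data at run time. *)
let bool_lit : bool parse_fn = one_of [ ('t', true); ('f', false) ]
```

A generated parser such as one produced by ocamlyacc is fixed at compile time, whereas a value like `bool_lit` above could just as well be built from a table read at run time. That is the property that syntax-extension mechanisms rely on.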
Importantly, the performance of PaComb is very good: it is only two to five times slower than parsers generated by ocamlyacc, which is a compiler.
Defining languages using the Grammar module directly is cumbersome. For that reason, PaComb provides a BNF-like PPX syntax extension (enabled using the -ppx pacomb.ppx compilation flag).
Complete documentation can be generated with ocamldoc (make doc).
Pacomb also supports self-extensible grammars, ambiguous grammars (with merge), late rejection of rules by raising an exception from action code, priorities, and more.
The complete documentation is also available online: https://raffalli.eu/opam
As a teaser, here is the usual calculator example:
    (* The three levels of priorities *)
    type p = Atom | Prod | Sum
    let%parser rec
      (* This includes each priority level in the next one *)
      expr p = Atom < Prod < Sum
      (* all other rules are selected by their priority level *)
      ; (p=Atom) (x::FLOAT) => x
      ; (p=Atom) '(' (e::expr Sum) ')' => e
      ; (p=Prod) (x::expr Prod) '*' (y::expr Atom) => x*.y
      ; (p=Prod) (x::expr Prod) '/' (y::expr Atom) => x/.y
      ; (p=Sum ) (x::expr Sum ) '+' (y::expr Prod) => x+.y
      ; (p=Sum ) (x::expr Sum ) '-' (y::expr Prod) => x-.y
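
For readers unfamiliar with priority levels, the following hand-written evaluator is a rough sketch of what the three levels Atom < Prod < Sum express. It is plain OCaml using only the standard library, not PaComb, and it is not how PaComb compiles the grammar. The names `eval` and `Syntax_error` are invented for this illustration, and two simplifications are assumed: numbers are unsigned decimal literals and blanks are not handled. The left-recursive rules of the grammar become loops, which yields the same left-associative results.

```ocaml
(* Illustration only: plain OCaml, no PaComb involved. *)
exception Syntax_error of int  (* carries the offending input position *)

let eval (s : string) : float =
  let n = String.length s in
  let pos = ref 0 in
  let peek () = if !pos < n then Some s.[!pos] else None in
  let is_num c = (c >= '0' && c <= '9') || c = '.' in
  (* Atom: a float literal or a parenthesised Sum. *)
  let rec atom () =
    match peek () with
    | Some '(' ->
      incr pos;
      let e = sum () in
      if peek () = Some ')' then (incr pos; e)
      else raise (Syntax_error !pos)
    | Some c when is_num c ->
      let start = !pos in
      while !pos < n && is_num s.[!pos] do incr pos done;
      (match float_of_string_opt (String.sub s start (!pos - start)) with
       | Some x -> x
       | None -> raise (Syntax_error start))
    | _ -> raise (Syntax_error !pos)
  (* Prod: Atoms combined left to right with '*' and '/'. *)
  and prod () =
    let rec loop acc =
      match peek () with
      | Some '*' -> incr pos; loop (acc *. atom ())
      | Some '/' -> incr pos; loop (acc /. atom ())
      | _ -> acc
    in
    loop (atom ())
  (* Sum: Prods combined left to right with '+' and '-'. *)
  and sum () =
    let rec loop acc =
      match peek () with
      | Some '+' -> incr pos; loop (acc +. prod ())
      | Some '-' -> incr pos; loop (acc -. prod ())
      | _ -> acc
    in
    loop (prod ())
  in
  let v = sum () in
  (* The whole input must be consumed. *)
  if !pos = n then v else raise (Syntax_error !pos)
```

With this sketch, `eval "1+2*3"` evaluates to `7.` and `eval "(1+2)*3"` to `9.`, because a Prod is fully reduced before it is used as an operand of a Sum.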
Dependencies (4)
- stdlib-shims
- ppxlib >= "0.10.0"
- dune >= "1.9.0"
- ocaml >= "4.04.1"
Dev Dependencies
None
Used by
None
Conflicts
None