Operators
Goals
The learning goals of this tutorial are:
- Use operators as functions and, conversely, functions as operators
- Assign the right associativity and precedence to a custom operator
- Use and define custom let binders
Using Binary Operators
In OCaml, almost all binary operators are regular functions. The function underlying an operator is referred to by surrounding the operator symbol with parentheses. Here are the addition, string concatenation, and equality functions:
# (+);;
- : int -> int -> int = <fun>
# (^);;
- : string -> string -> string = <fun>
# (=);;
- : 'a -> 'a -> bool = <fun>
Note: the operator symbol for multiplication is *, but the function can't be referred to as (*). This is because comments in OCaml are delimited by (* and *). To resolve the parsing ambiguity, space characters must be inserted to get the multiplication function.
# ( * );;
- : int -> int -> int = <fun>
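As a small sketch of the spaced form in use, the multiplication function can be passed directly to List.fold_left. The name product is hypothetical and only serves the illustration.

```ocaml
(* The spaces around the asterisk keep the parser from seeing a comment
   delimiter. *)
let product numbers = List.fold_left ( * ) 1 numbers

let twenty_four = product [1; 2; 3; 4]
```

Here twenty_four evaluates to 24, and product [] returns the neutral element 1.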
Using operators as functions is convenient when combined with partial application. For instance, here is how to get the values that are greater than or equal to 10 in a list of integers, using the function List.filter and an operator.
# List.filter;;
- : ('a -> bool) -> 'a list -> 'a list = <fun>
# List.filter (( <= ) 10);;
- : int list -> int list = <fun>
# List.filter (( <= ) 10) [6; 15; 7; 14; 8; 13; 9; 12; 10; 11];;
- : int list = [15; 14; 13; 12; 10; 11]
# List.filter (fun n -> 10 <= n) [6; 15; 7; 14; 8; 13; 9; 12; 10; 11];;
- : int list = [15; 14; 13; 12; 10; 11]
The first two lines and the last line are informative only.
- The first shows the List.filter type, which is a function taking two parameters. The first parameter is a function; the second is a list.
- The second is the partial application of List.filter to ( <= ) 10, a function returning true if applied to a number that is greater than or equal to 10.
- The last is equivalent to the third, with the partially applied operator written as an anonymous function.
In the third line, all the arguments expected by List.filter are provided. The returned list contains the values satisfying the ( <= ) 10 function.
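One point worth keeping in mind is that partially applying an operator fixes its left operand. The sketch below illustrates this with a few operators; the names numbers, at_least_ten, at_most_ten, and flipped are hypothetical.

```ocaml
let numbers = [6; 15; 7; 14; 8; 13; 9; 12; 10; 11]

(* ( <= ) 10 behaves like fun n -> 10 <= n: it keeps values >= 10. *)
let at_least_ten = List.filter (( <= ) 10) numbers

(* ( >= ) 10 behaves like fun n -> 10 >= n: it keeps values <= 10. *)
let at_most_ten = List.filter (( >= ) 10) numbers

(* ( - ) 1 behaves like fun n -> 1 - n, not fun n -> n - 1. *)
let flipped = List.map (( - ) 1) [1; 2; 3]
```

For non-commutative operators such as subtraction or comparison, an explicit anonymous function is often easier to read than the partial application.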
Defining Binary Operators
It is also possible to define binary operators. Here is an example:
# let cat s1 s2 = s1 ^ " " ^ s2;;
val cat : string -> string -> string = <fun>
# let ( ^? ) = cat;;
val ( ^? ) : string -> string -> string = <fun>
# "hi" ^? "friend";;
- : string = "hi friend"
It is recommended practice to define operators in two steps, as shown in the example. The first definition contains the function's logic. The second definition is merely an alias of the first one. This gives the operator a default pronunciation and clearly indicates that it is syntactic sugar: a means to ease reading by making the text more compact.
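The same two-step pattern can be sketched with another, purely illustrative, function. The names join_path and config as well as the operator ^/ are assumptions made for this example; they are not part of the standard library.

```ocaml
(* Step 1: a named function holds the logic. *)
let join_path dir file = dir ^ "/" ^ file

(* Step 2: the operator is only an alias of the named function. *)
let ( ^/ ) = join_path

let config = "etc" ^/ "app" ^/ "config.toml"
```

Readers who do not know the symbol can still look up join_path, and config evaluates to "etc/app/config.toml".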
Unary Operators
Unary operators are also called prefix operators. In some contexts, it can make sense to replace a function's name with a symbol. This is often done for functions that perform some sort of conversion on their argument.
# let ( !! ) = Lazy.force;;
val ( !! ) : 'a lazy_t -> 'a = <fun>
# let rec transpose = function
    | [] | [] :: _ -> []
    | rows -> List.(map hd rows :: transpose (map tl rows));;
val transpose : 'a list list -> 'a list list = <fun>
# let ( ~: ) = transpose;;
val ( ~: ) : 'a list list -> 'a list list = <fun>
This allows users to write more compact code. However, be careful not to write excessively terse code, as it is harder to maintain. The meaning of an operator must be obvious to most readers; otherwise, it does more harm than good.
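As a brief sketch of how the two prefix operators defined above read at their use sites, the definitions are repeated here so that the block is self-contained. The names answer, forced, and columns are hypothetical.

```ocaml
let ( !! ) = Lazy.force

let rec transpose = function
  | [] | [] :: _ -> []
  | rows -> List.(map hd rows :: transpose (map tl rows))

let ( ~: ) = transpose

(* A suspended computation, forced with the prefix operator. Prefix
   operators bind tighter than +, so this reads as (!!answer) + 1. *)
let answer = lazy (6 * 7)
let forced = !!answer + 1

(* Rows become columns. *)
let columns = ~:[[1; 2; 3]; [4; 5; 6]]
```

Here forced is 43 and columns is [[1; 4]; [2; 5]; [3; 6]].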
Allowed Operators
OCaml has a subtle syntax; not everything is allowed as an operator symbol. An operator symbol is an identifier with a special syntax, so it must have the following structure:
Prefix Operator
- First character, either: ? ~ !
- Following characters, at least one if the first character is ? or ~, optional otherwise: $ & * + - / = > @ ^ | % < ! . : ? ~
Binary Operator
- First character, either: $ & * + - / = > @ ^ | % < #
- Following characters, at least one if the first character is #, optional otherwise: $ & * + - / = > @ ^ | % < ! . : ? ~
This is defined in the Prefix and Infix symbols section of The OCaml Manual.
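To make these rules concrete, here is a minimal sketch with one binary and one prefix operator. Both symbols and their meanings are hypothetical choices for this example.

```ocaml
(* Binary operator: first character |, followed by ?. *)
let ( |? ) opt default = Option.value opt ~default

(* Prefix operator: first character !, followed by %. *)
let ( !% ) n = float_of_int n /. 100.

let present = Some 3 |? 0
let missing = None |? 0
let quarter = !% 25
```

Here present is 3, missing is 0, and quarter is 0.25.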
Tips:
- Don't define operators with a wide scope. Restrict their scope to a module or a function.
- Don't use many of them.
- Before defining a custom binary operator, check that the symbol is not already used. This can be done in two ways:
  - Surround the candidate symbol with parentheses in UTop and see whether it responds with a type or with an Unbound value error
  - Use Sherlocode to check whether it is already used in some OCaml project
- Avoid shadowing existing operators.
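One possible way to restrict the scope of operators, sketched under the assumption of a small hypothetical module Vec_ops, is to define them inside a module and bring them into scope with a local open.

```ocaml
(* Hypothetical module grouping operators on integer pairs, so that they
   stay out of the global scope. *)
module Vec_ops = struct
  let add (x1, y1) (x2, y2) = (x1 + x2, y1 + y2)
  let scale k (x, y) = (k * x, k * y)

  let ( +| ) = add
  let ( *| ) = scale
end

(* Local open: the operators are visible only inside the parentheses. *)
let moved = Vec_ops.((1, 2) +| 2 *| (3, 4))
```

Outside the local open, +| and *| remain unbound, which keeps the operators from leaking into unrelated code. Here moved is (7, 10).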
Operator Associativity and Precedence
Let's illustrate operator associativity with an example. The following function concatenates its string arguments, surrounded by | characters and separated by a _ character.
# let par s1 s2 = "|" ^ s1 ^ "_" ^ s2 ^ "|";;
val par : string -> string -> string = <fun>
# par "hello" "world";;
- : string = "|hello_world|"
Let's turn par into two different operators:
# let ( @^ ) = par;;
val ( @^ ) : string -> string -> string = <fun>
# let ( &^ ) = par;;
val ( &^ ) : string -> string -> string = <fun>
At first sight, the operators @^ and &^ are the same. However, the OCaml parser allows forming expressions using several operators without parentheses, and there they behave differently.
# "foo" @^ "bar" @^ "bus";;
- : string = "|foo_|bar_bus||"
# "foo" &^ "bar" &^ "bus";;
- : string = "||foo_bar|_bus|"
Although both expressions call the same function (par), they are evaluated in different orders.
- Expression "foo" @^ "bar" @^ "bus" is evaluated as if it were "foo" @^ ("bar" @^ "bus"). Parentheses are added at the right, therefore @^ associates to the right.
- Expression "foo" &^ "bar" &^ "bus" is evaluated as if it were ("foo" &^ "bar") &^ "bus". Parentheses are added at the left, therefore &^ associates to the left.
Operator precedence governs how expressions combining different operators without parentheses are interpreted. For instance, using the same operators, here is how expressions using both are evaluated:
# "foo" &^ "bar" @^ "bus";;
- : string = "|foo_|bar_bus||"
# "foo" @^ "bar" &^ "bus";;
- : string = "||foo_bar|_bus|"
In both cases, values are passed to @^ before &^. Therefore, it is said that @^ has precedence over &^.
Rules for operator priorities are detailed in the Expressions section of the OCaml Manual. They can be summarised in the following way. The first character of an operator dictates its associativity and priority. The groups below list these first characters. All operators in a group have the same associativity and precedence. Groups are sorted in increasing precedence order.
- Left associative: $ & < = > |
- Right associative: @ ^
- Left associative: + -
- Left associative: % * /
- Left associative: #
The complete precedence list is longer. It also covers operators starting with **, which are right associative and bind tighter than those starting with *, as well as the predefined operators that are not allowed to be used as custom operators. The OCaml Manual has a table that sums up the operator associativity rules.
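The effect of the first character can be checked with a minimal sketch in which two hypothetical operators, -! and @-, are both aliases of integer subtraction.

```ocaml
(* Both operators subtract; only their first character differs. *)
let ( -! ) = ( - )  (* starts with -, so left associative *)
let ( @- ) = ( - )  (* starts with @, so right associative *)

(* Parsed as (10 -! 3) -! 2 *)
let left = 10 -! 3 -! 2

(* Parsed as 10 @- (3 @- 2) *)
let right = 10 @- 3 @- 2

(* The - group has precedence over the @ group: 10 @- (3 -! 2) *)
let mixed = 10 @- 3 -! 2
```

Here left is 5, while right and mixed are both 9, although every operator involved calls the same function.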
Binding Operators
OCaml allows the creation of custom let operators. This is often used with monad-related functions such as Option.bind or List.concat_map. See Monads for more on this topic.
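As a minimal sketch of the List.concat_map case, the following defines a let* operator for lists. The argument order is swapped with respect to List.concat_map because a binding operator receives the bound value first and the continuation second. The name pairs is hypothetical.

```ocaml
(* The bound list comes first, the body of the binding second. *)
let ( let* ) xs f = List.concat_map f xs

(* Every combination of an element of the first list with an element of
   the second one. *)
let pairs =
  let* x = [1; 2] in
  let* y = ['a'; 'b'] in
  [(x, y)]
```

Here pairs is [(1, 'a'); (1, 'b'); (2, 'a'); (2, 'b')].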
The doi_parts function attempts to extract the registrant and identifier parts from a string expected to contain a Digital Object Identifier (DOI).
# let ( let* ) = Option.bind;;
val ( let* ) : 'a option -> ('a -> 'b option) -> 'b option = <fun>
# let doi_parts s =
    let open String in
    let* slash = rindex_opt s '/' in
    let* dot = rindex_from_opt s slash '.' in
    let prefix = sub s 0 dot in
    let len = slash - dot - 1 in
    if len >= 4 && ends_with ~suffix:"10" prefix then
      let registrant = sub s (dot + 1) len in
      let identifier = sub s (slash + 1) (length s - slash - 1) in
      Some (registrant, identifier)
    else
      None;;
val doi_parts : string -> (string * string) option = <fun>
# doi_parts "doi:10.1000/182";;
- : (string * string) option = Some ("1000", "182")
# doi_parts "https://doi.org/10.1000/182";;
- : (string * string) option = Some ("1000", "182")
This function uses Option.bind as a custom binder over the calls to rindex_opt and rindex_from_opt. This makes it possible to consider only the case where both searches succeed and return the positions of the found characters. If either of them fails, doi_parts implicitly returns None.
The let open String in construct allows calling the functions rindex_opt, rindex_from_opt, length, ends_with, and sub from module String without the String. prefix within the definition of doi_parts.
The rest of the function applies if the relevant delimiting characters have been found. It performs additional checks and, if possible, extracts the registrant and identifier from the string s.
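The implicit None comes from the way a binding is expanded: let* x = e in body stands for ( let* ) e (fun x -> body). The sketch below shows both forms side by side on a simpler computation. The names safe_div, with_binder, and desugared are hypothetical.

```ocaml
let ( let* ) = Option.bind

(* Hypothetical helper: divides only when the divisor is non-zero. *)
let safe_div a b = if b = 0 then None else Some (a / b)

(* Written with the binding operator. *)
let with_binder a b c =
  let* x = safe_div a b in
  let* y = safe_div x c in
  Some (x + y)

(* The same computation with the operator applied explicitly. *)
let desugared a b c =
  ( let* ) (safe_div a b) (fun x ->
      ( let* ) (safe_div x c) (fun y ->
          Some (x + y)))
```

Both functions return Some 30 for the arguments 100, 5, and 2. As soon as one division yields None, Option.bind skips the remaining function and returns None.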