Memory Representation of Values
Prerequisites
This is an adaptation of the chapter Memory Representation of Values from the book Real World OCaml, reproduced here with permission.
This document covers the precise mapping from OCaml types to runtime values and walks you through them via the toplevel. The internal runtime representation of OCaml values in memory is relevant
- when you want to interface with code written in other programming languages, e.g. C, Rust, and others,
- when using various external system tools that operate on OCaml binaries, such as profiling tools and debuggers
The OCaml toolchain is very predictable. The compiler minimizes the amount of optimization magic that it performs, and relies instead on its straightforward execution model for good performance. With some experience, you can know rather precisely where a block of performance-critical OCaml code is spending its time.
At the end of this document, you will be able to translate between OCaml values and their memory representation.
OCaml Types Disappear at Runtime
The OCaml compiler runs through several phases during the compilation
process. The first phase is syntax checking, during which source files are
parsed into abstract syntax trees (ASTs). The next stage is a type checking
pass over the AST. In a validly typed program, a function cannot be applied
with an unexpected type. For example, the print_endline
function must
receive a single string
argument, and an int
will result in a type error.
Since OCaml verifies these properties at compile time, it doesn't need to keep track of as much information at runtime. Thus, later stages of the compiler can discard and simplify the type declarations to a much more minimal subset that's actually required to distinguish polymorphic values at runtime. This is a major performance win versus something like a Java or .NET method call, where the runtime must look up the concrete instance of the object and dispatch the method call. Those languages amortize some of the cost via "Just-in-Time" dynamic patching, but OCaml prefers runtime simplicity instead.
We'll explain this compilation pipeline in more detail in The Compiler Frontend: Parsing And Type Checking and The Compiler Backend: Byte Code And Native Code. -->
We'll cover how these values are managed by the runtime later on in Understanding The Garbage Collector.
OCaml Blocks and Values
A running OCaml program uses blocks of memory (i.e., contiguous sequences of words in RAM) to represent values such as tuples, records, closures, or arrays. An OCaml program implicitly allocates a block of memory when such a value is created:
# type t = { foo: int; bar: int };;
type t = { foo : int; bar : int; }
# let x = { foo = 13; bar = 14 };;
val x : t = {foo = 13; bar = 14}
The type declaration t
doesn't take up any memory at runtime, but the
subsequent let
binding allocates a new block of memory with two words of
available space. One word holds the foo
field, and the other word holds the
bar
field. The OCaml compiler translates such an expression into an
explicit allocation for the block from OCaml's runtime system.
OCaml uses a uniform memory representation in which every OCaml variable is stored as a value. An OCaml value is a single memory word that is either an immediate integer or a pointer to some other memory. The OCaml runtime tracks all values so that it can free them when they are no longer needed. It thus needs to be able to distinguish between integer and pointer values, since it scans pointers to find further values but doesn't follow integers that don't point to anything meaningful beyond their immediate value.
Distinguishing Integers and Pointers at Runtime
Wrapping primitive types (such as integers) inside another data structure that records extra metadata about the value is known as boxing. Values are boxed in order to make it easier for the garbage collector (GC) to do its job, but at the expense of an extra level of indirection to access the data within the boxed value.
OCaml values don't all have to be boxed at runtime. Instead, values
use a single tag bit per word to distinguish integers and pointers at
runtime. The value is an integer if the lowest bit of the block word
is nonzero, and a pointer if the lowest bit of the block word is
zero. Several OCaml types map onto this integer representation,
including bool
, int
, the empty list, and unit
. Some types, like
variants, sometimes use this integer representation and sometimes
don't. In particular, for variants, constant constructors, i.e.,
constructors with no arguments like None
, are represented as
integers, but constructors like Some
that carry associated values
are boxed.
This representation means that integers are unboxed runtime values in OCaml so that they can be stored directly without having to allocate a wrapper block. They can be passed directly to other function calls in registers and are generally the cheapest and fastest values to use in OCaml.
A value is treated as a memory pointer if its lowest bit is zero. A pointer value can still be stored unmodified despite this, since pointers are guaranteed to be word-aligned (with the bottom bits always being zero).
The only problem that remains with this memory representation is distinguishing between pointers to OCaml values (which should be followed by the GC) and pointers into the system heap to C values (which shouldn't be followed).
The mechanism for this is simple, since the runtime system keeps track of the heap blocks it has allocated for OCaml values. If the pointer is inside a heap chunk that is marked as being managed by the OCaml runtime, it is assumed to point to an OCaml value. If it points outside the OCaml runtime area, it is treated as an opaque C pointer to some other system resource.
Some History About OCaml's Word-Aligned Pointers
The alert reader may be wondering how OCaml can guarantee that all of its pointers are word-aligned. In the old days, when RISC chips such as Sparc, MIPS, and Alpha were commonplace, unaligned memory accesses were forbidden by the instruction set architecture and would result in a CPU exception that terminated the program. Thus, all pointers were historically rounded off to the architecture word size (usually 32 or 64 bits).
Modern CISC processors such as the Intel x86 do support unaligned memory accesses, but the chip still runs faster if accesses are word-aligned. OCaml therefore simply mandates that all pointers be word-aligned, which guarantees that the bottom few bits of any valid pointer will be zero. Setting the bottom bit to a nonzero value is a simple way to mark an integer, at the cost of losing that single bit of precision.
An even more alert reader will be wondering about the performance
implications are for integer arithmetic using this tagged representation.
Since the bottom bit is set, any operation on the integer has to shift the
bottom bit right to recover the "native" value. The native code OCaml
compiler generates efficient x86 assembly code in this case, taking advantage
of modern processor instructions to hide the extra shifts where possible.
Addition is a single LEA
x86 instruction, subtraction can be two
instructions, and multiplication is only a few more.
Blocks and Values
An OCaml block is the basic unit of allocation on the heap. A block consists of a one-word header (either 32 or 64 bits depending on the CPU architecture) followed by variable-length data that is either opaque bytes or an array of fields. The header has a multipurpose tag byte that defines whether to interpret the subsequent data as opaque bytes or OCaml fields.
The GC never inspects opaque bytes. If the tag indicates an array of OCaml fields are present, their contents are all treated as more valid OCaml values. The GC always inspects fields and follows them as part of the collection process described earlier.
The size
field records the length of the block in memory words. This is 22
bits on 32-bit platforms, which is the reason OCaml strings are limited to 16
MB on that architecture. If you need bigger strings, either switch to a
64-bit host, or use the Bigarray
module.
The 2-bit color
field is used by the GC to keep track of its state during
mark-and-sweep collection.
We'll come back to this field in
Understanding The Garbage Collector.
This tag isn't exposed to OCaml source code in any case.
A block's tag byte is multipurpose, and indicates whether the data array
represents opaque bytes or fields. If a block's tag is greater than or equal
to No_scan_tag
(251), then the block's data are all opaque bytes, and are
not scanned by the collector. The most common such block is the string
type, which we describe in more detail later in this chapter.
The exact representation of values inside a block depends on their static
OCaml type. All OCaml types are distilled down into values
, and summarized below.
-
int
orchar
are stored directly as a value, shifted left by 1 bit, with the least significant bit set to 1. -
unit
,[]
,false
are all stored as OCamlint
0. -
true
is stored as OCamlint
1. -
Foo | Bar
variants are stored as ascending OCamlint
s, starting from 0. -
Foo | Bar of int
variants with parameters are boxed, while variants with no parameters are unboxed. -
Polymorphic variants with parameters are boxed with an extra header word to store the value, as compared to normal variants. Polymorphic variants with no parameters are unboxed.
-
Floating-point numbers are stored as a block with a single field containing the double-precision float.
-
Strings are word-aligned byte arrays with an explicit length.
-
[1; 2; 3]
lists are stored as1::2::3::[]
where[]
is an int, andh::t
a block with tag 0 and two parameters. -
Tuples, records, and arrays are stored as a C array of values. Arrays can be variable size, but tuples and records are fixed-size.
-
Records or arrays that are all float use a special tag for unboxed arrays of floats, or records that only have
float
fields.
Integers, Characters, and Other Basic Types
Many basic types are efficiently stored as unboxed integers at runtime. The
native int
type is the most obvious, although it drops a single bit of
precision due to the tag bit. Other atomic types such as unit
and the empty
list []
value are stored as constant integers. Boolean values have a value
of 1
and 0
for true
and false
, respectively.
These basic types such as empty lists and unit
are very efficient to use,
since integers are never allocated on the heap. They can be passed directly
in registers and not appear on the stack if you don't have too many
parameters to your functions. Modern architectures such as x86_64
have a
lot of spare registers to further improve the efficiency of using unboxed
integers.
Tuples, Records, and Arrays
Tuples, records, and arrays are all represented identically at runtime as a
block with tag 0
. Tuples and records have constant sizes determined at
compile time, whereas arrays can be of variable length. While arrays are
restricted to containing a single type of element in the OCaml type system,
this is not required by the memory representation.
You can check the difference between a block and a direct integer yourself
using the Obj
module, which exposes the internal representation of values
to OCaml code:
# Obj.is_block (Obj.repr (1,2,3));;
- : bool = true
# Obj.is_block (Obj.repr 1);;
- : bool = false
The Obj.repr
function retrieves the runtime representation of any OCaml
value. Obj.is_block
checks the bottom bit to determine if the value is a
block header or an unboxed integer.
Floating-Point Numbers and Arrays
Floating-point numbers in OCaml are always stored as full, double-precision
values. Individual floating-point values are stored as a block with a single
field that contains the number. This block has the Double_tag
set, which
signals to the collector that the floating-point value is not to be scanned:
# Obj.tag (Obj.repr 1.0);;
- : int = 253
# Obj.double_tag;;
- : int = 253
Since each floating-point value is boxed in a separate memory block,
it can be inefficient to handle large arrays of floats in comparison
to unboxed integers. OCaml therefore special-cases records or arrays
that contain only float
types. These are stored in a block that
contains the floats packed directly in the data section, with
Double_array_tag
set to signal to the collector that the contents
are not OCaml values.
First, let's check that float arrays do in fact have a different tag number from normal floating-point values:
# Obj.double_tag;;
- : int = 253
# Obj.double_array_tag;;
- : int = 254
This tells us that float arrays have a tag value of 254. Now let's test some
sample values using the Obj.tag
function to check that the allocated block
has the expected runtime tag, and also use Obj.double_field
to retrieve a
float from within the block:
# Obj.tag (Obj.repr [| 1.0; 2.0; 3.0 |]);;
- : int = 254
# Obj.tag (Obj.repr (1.0, 2.0, 3.0) );;
- : int = 0
# Obj.double_field (Obj.repr [| 1.1; 2.2; 3.3 |]) 1;;
- : float = 2.2
# Obj.double_field (Obj.repr 1.234) 0;;
- : float = 1.234
The first thing we tested was that a float array has the correct unboxed float array tag value (254). However, the next line tests a tuple of floating-point values instead, which are not optimized in the same way and have the normal tuple tag value (0).
Only records and arrays can have the float array optimization, and for records, every single field must be a float.
Variants and Lists
Basic variant types with no extra parameters for any of their branches are
simply stored as an OCaml integer, starting with 0
for the first option and
in ascending order:
# type t = Apple | Orange | Pear;;
type t = Apple | Orange | Pear
# ((Obj.magic (Obj.repr Apple)) : int);;
- : int = 0
# ((Obj.magic (Obj.repr Pear)) : int);;
- : int = 2
# Obj.is_block (Obj.repr Apple);;
- : bool = false
Obj.magic
unsafely forces a type cast between any two OCaml types; in this
example, the int
type hint retrieves the runtime integer value. The
Obj.is_block
confirms that the value isn't a more complex block, but just
an OCaml int
.
Variants that have parameters are a little more complex. They are stored as blocks, with the value tags ascending from 0 (counting from leftmost variants with parameters). The parameters are stored as words in the block:
# type t = Apple | Orange of int | Pear of string | Kiwi;;
type t = Apple | Orange of int | Pear of string | Kiwi
# Obj.is_block (Obj.repr (Orange 1234));;
- : bool = true
# Obj.tag (Obj.repr (Orange 1234));;
- : int = 0
# Obj.tag (Obj.repr (Pear "xyz"));;
- : int = 1
# (Obj.magic (Obj.field (Obj.repr (Orange 1234)) 0) : int);;
- : int = 1234
# (Obj.magic (Obj.field (Obj.repr (Pear "xyz")) 0) : string);;
- : string = "xyz"
In the preceding example, the Apple
and Kiwi
values are still stored as
normal OCaml integers with values 0
and 1
, respectively. The Orange
and
Pear
values both have parameters and are stored as blocks whose tags ascend
from 0
(and so Pear
has a tag of 1
, as the use of Obj.tag
verifies).
Finally, the parameters are fields that contain OCaml values within the
block, and Obj.field
can be used to retrieve them.
Lists are stored with a representation that is exactly the same as if the
list was written as a variant type with Nil
and Cons
. The empty list
[]
is an integer 0
, and subsequent blocks have tag 0
and two
parameters: a block with the current value, and a pointer to the rest of the
list.
Obj Module Considered Harmful
Obj
is an undocumented module that exposes the internals of the OCaml
compiler and runtime. It is very useful for examining and understanding how
your code will behave at runtime but should never be used for production
code unless you understand the implications. The module bypasses the OCaml
type system, making memory corruption and segmentation faults possible.
Some theorem provers such as Coq do output code that uses Obj
internally,
but the external module signatures never expose it. Unless you too have a
machine proof of correctness to accompany your use of Obj
, stay away from
it except for debugging!
Due to this encoding, there is a limit around 240 variants with parameters that applies to each type definition, but the only limit on the number of variants without parameters is the size of the native integer (either 31 or 63 bits). This limit arises because of the size of the tag byte, and that some of the high-numbered tags are reserved.
Polymorphic Variants
Polymorphic variants are more flexible than normal variants when writing code but are slightly less efficient at runtime. This is because there isn't as much static compile-time information available to optimize their memory layout.
A polymorphic variant without any parameters is stored as an unboxed integer
and so only takes up one word of memory, just like a normal variant. This
integer value is determined by applying a hash function to the name of the
variant. The hash function is exposed via the compiler-libs
package that
reveals some of the internals of the OCaml compiler:
# #require "ocaml-compiler-libs.common";;
# Btype.hash_variant "Foo";;
- : int = 3505894
# (Obj.magic (Obj.repr `Foo) : int);;
- : int = 3505894
The hash function is designed to give the same results on 32-bit and 64-bit architectures, so the memory representation is stable across different CPUs and host types.
Polymorphic variants use more memory space than normal variants when
parameters are included in the data type constructors. Normal variants use
the tag byte to encode the variant value and save the fields for the
contents, but this single byte is insufficient to encode the hashed value for
polymorphic variants. They must allocate a new block (with tag 0
) and store
the value in there instead. Polymorphic variants with constructors thus use
one word of memory more than normal variant constructors.
Another inefficiency over normal variants is when a polymorphic variant constructor has more than one parameter. Normal variants hold parameters as a single flat block with multiple fields for each entry, but polymorphic variants must adopt a more flexible uniform memory representation, since they may be reused in a different context across compilation units. They allocate a tuple block for the parameters that is pointed to from the argument field of the variant. There are thus three additional words for such variants, along with an extra memory indirection due to the tuple.
The extra space usage is generally not significant in a typical application, and polymorphic variants offer a great deal more flexibility than normal variants. However, if you're writing code that demands high performance or must run within tight memory bounds, the runtime layout is at least very predictable. The OCaml compiler never switches memory representation due to optimization passes. This lets you predict the precise runtime layout by referring to these guidelines and your source code.
String Values
OCaml string
s (and their mutable cousins, bytes
) are standard
OCaml blocks with the header size defining the size of the string in
machine words. The String_tag
(252) is higher than the
No_scan_tag
, indicating that the contents of the block are opaque to
the collector. The block contents are the contents of the string, with
padding bytes to align the block on a word boundary.
On a 32-bit machine, the padding is calculated based on the modulo of the string length and word size to ensure the result is word-aligned. A 64-bit machine extends the potential padding up to 7 bytes instead of 3. Given a string length modulo 4:
0
has padding00 00 00 03
1
has padding00 00 02
2
has padding00 01
3
has padding00
This string representation is a clever way to ensure that the contents are always zero-terminated by the padding word and to still compute its length efficiently without scanning the whole string. The following formula is used:
number_of_words_in_block * sizeof(word) - last_byte_of_block - 1
The guaranteed NULL
termination comes in handy when passing a string to C,
but is not relied upon to compute the length from OCaml code. OCaml strings
can thus contain NULL
bytes at any point within the string.
Care should be taken that any C library functions that receive these buffers
can also cope with arbitrary bytes within the buffer contents and are not
expecting C strings. For instance, the C memcopy
or memmove
standard
library functions can operate on arbitrary data, but strlen
or strcpy
both require a NULL
-terminated buffer, and neither has a mechanism for
encoding a NULL
value within its contents.
Custom Heap Blocks
OCaml supports custom heap blocks via a Custom_tag
that lets the runtime
perform user-defined operations over OCaml values. A custom block lives in
the OCaml heap like an ordinary block and can be of whatever size the user
desires. The Custom_tag
(255) is higher than No_scan_tag
and so isn't
scanned by the GC.
The first word of the data within the custom block is a C pointer to a
struct
of custom operations. The custom block cannot have pointers to OCaml
blocks and is opaque to the GC:
struct custom_operations {
char *identifier;
void (*finalize)(value v);
int (*compare)(value v1, value v2);
intnat (*hash)(value v);
void (*serialize)(value v,
/*out*/ uintnat * wsize_32 /*size in bytes*/,
/*out*/ uintnat * wsize_64 /*size in bytes*/);
uintnat (*deserialize)(void * dst);
int (*compare_ext)(value v1, value v2);
};
The custom operations specify how the runtime should perform polymorphic
comparison, hashing and binary marshaling. They also optionally contain a
finalizer that the runtime calls just before the block is
garbage-collected. This finalizer has nothing to do with ordinary OCaml
finalizers (as created by Gc.finalize
and explained in
Understanding The Garbage Collector).
They are instead used to call C cleanup functions such as free
.
Managing External Memory with Bigarray
A common use of custom blocks is to manage external system memory directly from within OCaml. The Bigarray interface was originally intended to exchange data with Fortran code, and maps a block of system memory as a multidimensional array that can be accessed from OCaml. Bigarray operations work directly on the external memory without requiring it to be copied into the OCaml heap (which is a potentially expensive operation for large arrays).
Bigarray sees a lot of use beyond just scientific computing, and several Core libraries use it for general-purpose I/O:
Iobuf
: The Iobuf
module maps I/O buffers as a one-dimensional array of bytes. It
provides a sliding window interface that lets consumer processes read from
the buffer while it's being filled by producers. This lets OCaml use I/O
buffers that have been externally allocated by the operating system without
any extra data copying.
Bigstring
: The Bigstring
module provides a String
-like interface that uses
Bigarray
internally. The Bigbuffer
collects these into extensible
string buffers that can operate entirely on external system memory.
The Lacaml library isn't part of Core but provides the recommended interfaces to the widely used BLAS and LAPACK mathematical Fortran libraries. These allow developers to write high-performance numerical code for applications that require linear algebra. It supports large vectors and matrices, but with static typing safety of OCaml to make it easier to write safe algorithms.
Conclusion
We covered the precise mapping from OCaml types to their internal runtime representation in memory, which should enable you to better understand the output of various debugging and profiling tools, as well as how to integrate your OCaml code with code from other languages.
Other recommended tutorials:
Help Improve Our Documentation
You are encouraged to contribute to the original sources of this page at the Real World OCaml GitHub repository.