val run :
  ?preserve:bool ->
  ?csize:int ->
  int ->
  demux:(unit -> 'a) ->
  work:('a -> 'b) ->
  mux:('b -> unit) ->
  unit
run ~csize:10 16 ~demux:f ~work:g ~mux:h runs function g in parallel on 16 cores. Inputs to g are produced by function f and grouped into chunks of 10 (the chunk size csize). If not provided, csize defaults to 1. The performance-optimal csize depends on your computer, the functions you are using, and the granularity of your computation; elements which are fast to process may benefit from a csize greater than 1. The demux function f must raise Parany.End_of_input once it is done. Outputs of g are consumed by function h.

Functions f and g are run by different threads. Function g is run in parallel by several threads (16 in this example). Only the mux function h is run by the same thread that called Parany.run.

~preserve is an optional parameter which defaults to false. If set to true, results are accumulated by h in the same order that f emitted them. However, for parallel performance reasons, the jobs may still be computed by g out of order.
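As an illustration of the description above, here is a minimal sketch of a run call, assuming the parany library is installed and linked. The helper name parallel_squares and its arguments are hypothetical and are not part of the library; only Parany.run and Parany.End_of_input come from this page.

```ocaml
(* Sketch: square the integers 1..n in parallel and return the results
   in input order. [parallel_squares] is a hypothetical helper. *)
let parallel_squares ?(csize = 1) ncores n =
  let next = ref 1 in
  (* demux: produce one input per call, then signal the end of input *)
  let demux () =
    if !next > n then raise Parany.End_of_input
    else begin
      let x = !next in
      incr next;
      x
    end
  in
  (* work: the function run in parallel by the workers *)
  let work x = x * x in
  (* mux: run by the thread that called Parany.run, so a plain ref
     is used here as the accumulator *)
  let acc = ref [] in
  let mux y = acc := y :: !acc in
  (* ~preserve:true asks for results to reach mux in demux order *)
  Parany.run ~preserve:true ~csize ncores ~demux ~work ~mux;
  List.rev !acc
```

A call such as parallel_squares ~csize:10 4 100 would use 4 workers and chunks of 10 inputs. Dropping ~preserve:true should still yield the same set of squares, but their order in the returned list would then not be guaranteed.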
Inside of its work function, a parallel worker can call get_rank () to know its rank. The first spawned worker thread has rank 0, the second one has rank 1, and so on. With N parallel workers, ranks are in 0..N-1.
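The rank can be used, for instance, to tag each result with the worker that computed it. The following is a rough sketch under the same assumption that parany is installed; ranks_seen is a hypothetical helper, and which ranks actually show up depends on how the jobs happen to be scheduled.

```ocaml
(* Sketch: collect the distinct worker ranks that processed n jobs.
   [ranks_seen] is a hypothetical helper, not part of Parany. *)
let ranks_seen ncores n =
  let next = ref 0 in
  let demux () =
    if !next >= n then raise Parany.End_of_input
    else begin
      let x = !next in
      incr next;
      x
    end
  in
  (* get_rank is called from inside the work function, as required *)
  let work x = (Parany.get_rank (), x) in
  let seen = ref [] in
  let mux (rank, _x) =
    if not (List.mem rank !seen) then seen := rank :: !seen
  in
  Parany.run ncores ~demux ~work ~mux;
  List.sort compare !seen
```

With ncores set to 4, every rank in the returned list is expected to lie in 0..3, although not every worker is guaranteed to receive a job.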
module DLS : sig ... end
A domain/thread private store
module Parmap : sig ... end
Wrapper module for near-compatibility with Parmap