BED data files.
A BED file has the format shown below, where columns must be separated by a tab character.
chrA   lo1   hi1
chrA   lo2   hi2
.      .     .
.      .     .
.      .     .
chrB   lo1   hi1
chrB   lo2   hi2
.      .     .
.      .     .
.      .     .
The definition is that intervals are zero-based and half-open. So by default the line "chrA lo hi" is parsed to the interval [lo + 1, hi] on chromosome chrA. Similarly, when printing, the default is to print [lo - 1, hi]. The optional argument increment_lo_hi allows changing this behavior for non-conformant files. In addition, the optional argument chr_map is a string -> string function that allows changing the chromosome name to a specified format, and defaults to identity.
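The default coordinate shift can be sketched with the standard library alone. This is only an illustration under assumptions, not Biocaml's implementation: interval_of_bed_line and bed_line_of_interval are hypothetical names, and only the first three columns are handled.

```ocaml
(* Hypothetical helpers, not part of Biocaml: they only illustrate the
   default coordinate shift described above. *)

(* Parse "chr<TAB>lo<TAB>hi" (zero-based, half-open) into the interval
   (chr, lo + 1, hi). Returns None when a bound is not an integer or
   fewer than three columns are present. *)
let interval_of_bed_line (line : string) : (string * int * int) option =
  match String.split_on_char '\t' line with
  | chr :: lo :: hi :: _ ->
    (match int_of_string_opt lo, int_of_string_opt hi with
     | Some lo, Some hi -> Some (chr, lo + 1, hi)
     | _ -> None)
  | _ -> None

(* Inverse direction: print the interval (chr, lo, hi) as a BED line,
   shifting the lower bound back down by one. *)
let bed_line_of_interval ((chr, lo, hi) : string * int * int) : string =
  Printf.sprintf "%s\t%d\t%d" chr (lo - 1) hi
```

With these definitions, interval_of_bed_line "chrA\t0\t100" evaluates to Some ("chrA", 1, 100), and printing that interval gives back the original line.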
Some tools require that the set of intervals do not overlap within each chromosome. This is not enforced, but you can use any_overlap to verify this property when needed.
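One possible way to check the non-overlap property is sketched below, using only the standard library. It is not Biocaml's any_overlap: has_overlap is a hypothetical name, and the sketch assumes intervals are given as closed (chr, lo, hi) triples, as produced by the default parsing.

```ocaml
(* Hypothetical sketch, not Biocaml's any_overlap: returns true if any two
   closed intervals (chr, lo, hi) on the same chromosome share a position. *)
let has_overlap (intervals : (string * int * int) list) : bool =
  (* Polymorphic compare orders triples by chromosome, then lower bound,
     then upper bound. *)
  let sorted = List.sort compare intervals in
  (* Walk the sorted list while tracking the largest upper bound seen so
     far on the current chromosome. *)
  let rec scan prev = function
    | [] -> false
    | (chr, lo, hi) :: rest ->
      (match prev with
       | Some (prev_chr, max_hi) when prev_chr = chr ->
         lo <= max_hi || scan (Some (chr, max max_hi hi)) rest
       | _ -> scan (Some (chr, hi)) rest)
  in
  scan None sorted
```

Sorting first keeps the check at O(n log n) instead of comparing every pair of intervals.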
module Biocaml_bed:
sig
type item =
string * int * int * Biocaml_table.Row.t
type parsing_spec =
[ `enforce of Biocaml_table.Row.t_type | `strings ]
module Error:
sig
(with sexp)
type parsing_base =
[ `wrong_format of
[ `column_number | `float_of_string of string | `int_of_string of string ] *
Biocaml_table.Row.t_type * string
| `wrong_number_of_columns of Biocaml_table.Row.t ]
type parsing =
[ `bed of parsing_base ]
type t =
parsing
val parsing_base_of_sexp : Sexplib.Sexp.t -> parsing_base
val sexp_of_parsing_base : parsing_base -> Sexplib.Sexp.t
val parsing_of_sexp : Sexplib.Sexp.t -> parsing
val sexp_of_parsing : parsing -> Sexplib.Sexp.t
end
In_channel Functions
exception Error of Error.t
The exception raised by the *_exn functions.
val in_channel_to_item_stream : ?buffer_size:int ->
?more_columns:parsing_spec ->
Pervasives.in_channel ->
(item, [> Error.parsing ]) Core.Std.Result.t Stream.t
Parses an input channel into a stream of item values.
val in_channel_to_item_stream_exn : ?buffer_size:int ->
?more_columns:parsing_spec ->
Pervasives.in_channel -> item Stream.t
Like in_channel_to_item_stream, but uses exceptions for errors (raised within Stream.next).
val item_of_line : how:parsing_spec ->
Biocaml_lines.item ->
(item, [> Error.parsing ]) Core.Std.Result.t
val item_to_line : item -> Biocaml_lines.item
Converts a single item to a line.
module Transform:
sig
val string_to_item : ?more_columns:Biocaml_bed.parsing_spec ->
unit ->
(string,
(string * int * int * Biocaml_table.Row.item array,
[> Biocaml_bed.Error.parsing ])
Core.Std.Result.t)
Biocaml_transform.t
Creates a Biocaml_transform.t-based parser, while providing the format of the additional columns (default `strings).
val item_to_string : unit -> (Biocaml_bed.item, string) Biocaml_transform.t
Creates a Biocaml_transform.t which “prints” BED data (reminder: includes ends-of-line).
end
val item_of_sexp : Sexplib.Sexp.t -> item
val sexp_of_item : item -> Sexplib.Sexp.t
val parsing_spec_of_sexp : Sexplib.Sexp.t -> parsing_spec
val sexp_of_parsing_spec : parsing_spec -> Sexplib.Sexp.t
end
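The signature pairs result-returning functions with *_exn variants that raise instead. The pattern can be sketched with the standard library as follows. Everything here is a simplified, hypothetical stand-in rather than Biocaml code: parse_line, parse_line_exn, Parse_error and the reduced item and parsing_error types are assumptions made for illustration, extra columns are kept as plain strings, and coordinates are left unshifted.

```ocaml
(* Hypothetical, simplified stand-ins for the types in the signature above;
   extra columns are kept as plain strings. *)
type item = string * int * int * string list

type parsing_error =
  [ `wrong_number_of_columns of string list
  | `int_of_string of string ]

(* Hypothetical exception, named differently from the result constructor
   Error to avoid shadowing it. *)
exception Parse_error of parsing_error

(* Result-returning parser for one tab-separated line. *)
let parse_line (line : string) : (item, parsing_error) result =
  match String.split_on_char '\t' line with
  | chr :: lo :: hi :: more ->
    (match int_of_string_opt lo, int_of_string_opt hi with
     | Some lo, Some hi -> Ok (chr, lo, hi, more)
     | None, _ -> Error (`int_of_string lo)
     | _, None -> Error (`int_of_string hi))
  | columns -> Error (`wrong_number_of_columns columns)

(* Exception-raising variant built on top of the result-returning one. *)
let parse_line_exn (line : string) : item =
  match parse_line line with
  | Ok item -> item
  | Error e -> raise (Parse_error e)
```

The result-returning form lets a caller handle each malformed line individually, while the exception-raising form is shorter to use when any error should abort processing.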