module Biocaml_bed:BED data. A BED file is in the format shown below, where columns must be separted by a tab character.sig
..end
chrA lo1 hi1 chrA lo2 hi2 . . . . . . . . . chrB lo1 hi1 chrB lo2 hi2 . . . . . . . . .
The definition is that intervals are zero based and half-open. So by
default the line "chrA lo hi" is parsed to the interval [lo + 1,
hi]
on chromosome chrA
. Similarly, when printing, the default
is to print [lo - 1, hi]
. The optional argument
increment_lo_hi
allows changing this behavior for non-conformant
files. In addition, the optional argument chr_map
is a string
-> string
function that allows changing of the chromosome name to
a specified format, and defaults to identity
.
Some tools require that the set of intervals do not overlap within
each chromosome. This is not enforced, but you can use
any_overlap
to verify this property when needed.
val enum_input : Batteries.IO.input -> (string * int * int) Batteries.Enum.t
val sqlite_db_of_enum : ?db:string ->
?table:string -> (string * int * int) Batteries.Enum.t -> Sqlite3.db