Module Biocaml_entrez (.ml)

Entrez Utilities API

This module provides partial access to Entrez databases such as Pubmed, Gene or Protein. The API offered by the NCBI is based on HTTP requests, and this module contains a few functions to ease the construction of appropriate URLs. It also offers higher-level access, with parsers for the answers returned by Entrez.

Databases in Entrez can be seen as collections of records, each record representing an object of the database. The basic usage of the low-level API is to first search a database with the esearch utility: given a query string, esearch returns a collection of identifiers. These identifiers are then used to fetch the actual records with the efetch utility. With the high-level API, these two operations are done in one call.

module Biocaml_entrez: 
sig
type database = [ `gene
| `genome
| `geodatasets
| `geoprofiles
| `protein
| `pubmed
| `pubmedcentral
| `sra
| `taxonomy
| `unigene ]
Represents available databases

Low-level access



For documentation of the parameters, see the NCBI E-utilities reference.
val esearch_url : ?retstart:int ->
?retmax:int ->
?rettype:[ `count | `uilist ] ->
?field:string ->
?datetype:[ `edat | `mdat | `pdat ] ->
?reldate:int ->
?mindate:string ->
?maxdate:string -> database -> string -> string
Construction of esearch URLs.
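
To make the shape of such a URL concrete, here is a minimal sketch that assembles one by hand. It is not the library's implementation: `percent_encode` and `esearch_url_sketch` are hypothetical names, only two of the optional parameters are handled, and the base address is the public E-utilities endpoint, which may differ from what `esearch_url` actually emits.

```ocaml
(* Hypothetical re-implementation for illustration; not the library's code. *)

(* Percent-encode every byte outside the RFC 3986 unreserved set. *)
let percent_encode s =
  let buf = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match c with
      | 'A' .. 'Z' | 'a' .. 'z' | '0' .. '9' | '-' | '_' | '.' | '~' ->
          Buffer.add_char buf c
      | _ -> Buffer.add_string buf (Printf.sprintf "%%%02X" (Char.code c)))
    s;
  Buffer.contents buf

(* Assumed endpoint of the esearch utility. *)
let base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

(* Only a subset of the optional parameters is handled here. *)
let esearch_url_sketch ?retstart ?retmax ~db query =
  let opt name = function
    | None -> []
    | Some n -> [ (name, string_of_int n) ]
  in
  let params =
    [ ("db", db); ("term", query) ]
    @ opt "retstart" retstart
    @ opt "retmax" retmax
  in
  base ^ "?"
  ^ String.concat "&"
      (List.map (fun (k, v) -> k ^ "=" ^ percent_encode v) params)
```

With the real module, the equivalent call would take the database as a polymorphic variant (for example `` `pubmed ``) rather than a string.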
type esearch_answer = {
   count : int;
   retmax : int;
   retstart : int;
   ids : string list;
}
Represents the result of a request to esearch
val esearch_answer_of_string : string -> esearch_answer
Parses an esearch answer in XML format
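
The fields of `esearch_answer` can be used to page through a large result set. The sketch below assumes that `count` is the total number of matches and that `retstart` is the offset of the first identifier in `ids`, as in the esearch XML output; `next_retstart` is a hypothetical helper, not part of this module.

```ocaml
(* Local copy of the record documented above, so the sketch compiles alone. *)
type esearch_answer = {
  count : int;
  retmax : int;
  retstart : int;
  ids : string list;
}

(* Hypothetical helper: offset of the next page, or None once all [count]
   results have been seen. Assumes [retstart] is the offset of this page. *)
let next_retstart a =
  let next = a.retstart + List.length a.ids in
  if a.ids = [] || next >= a.count then None else Some next
```

The returned offset could then be passed as `?retstart` to `esearch_url` to request the following page.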
val esummary_url : ?retstart:int ->
?retmax:int -> database -> string list -> string
Construction of esummary URLs
val efetch_url : ?rettype:string ->
?retmode:[ `asn_1 | `text | `xml ] ->
?retstart:int ->
?retmax:int ->
?strand:[ `minus | `plus ] ->
?seq_start:int ->
?seq_stop:int -> database -> string list -> string
Construction of efetch URLs. Note that this access method does not support more than 200 ids. For valid values of rettype and retmode, please consult the official specification.
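
One way to live with the 200-id limit is to split a long identifier list into batches before building the URLs. The helper below is a sketch of that idea; `batches` is a hypothetical name and is not provided by this module.

```ocaml
(* Split [xs] into consecutive batches of at most [size] elements,
   preserving order. *)
let batches size xs =
  if size <= 0 then invalid_arg "batches: size must be positive";
  let rec go acc cur n = function
    | [] -> List.rev (if cur = [] then acc else (List.rev cur :: acc))
    | x :: rest ->
        if n = size then go (List.rev cur :: acc) [ x ] 1 rest
        else go acc (x :: cur) (n + 1) rest
  in
  go [] [] 0 xs
```

Each batch returned by `batches 200 ids` can then be given to `efetch_url` in turn, yielding one URL per batch.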

High-level access


module type Fetch = 
sig
type 'a fetched 
val fetch : string -> (string -> 'a) -> 'a fetched
val (>>=) : 'a fetched ->
('a -> 'b fetched) -> 'b fetched
val (>|=) : 'a fetched ->
('a -> 'b) -> 'b fetched
end
A signature for an HTTP request framework
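
As an illustration of what an instance of this signature might look like, here is a minimal blocking sketch. It assumes that the first argument of `fetch` is a URL and the second a parser applied to the response body, which the signature suggests but does not state. `fake_http_get` and `Blocking_fetch` are hypothetical names, and the HTTP call is replaced by a canned response so that the example runs offline.

```ocaml
(* Local copy of the signature documented above. *)
module type Fetch = sig
  type 'a fetched
  val fetch : string -> (string -> 'a) -> 'a fetched
  val ( >>= ) : 'a fetched -> ('a -> 'b fetched) -> 'b fetched
  val ( >|= ) : 'a fetched -> ('a -> 'b) -> 'b fetched
end

(* Hypothetical stand-in for an HTTP GET: returns a canned body so that the
   sketch runs offline. A real instance would perform the request here. *)
let fake_http_get url = "body of " ^ url

(* Blocking instance: a fetched value is simply the value itself. *)
module Blocking_fetch : Fetch with type 'a fetched = 'a = struct
  type 'a fetched = 'a
  let fetch url parse = parse (fake_http_get url)
  let ( >>= ) x f = f x
  let ( >|= ) x f = f x
end

(* Chain two requests, mirroring the search-then-fetch workflow:
   the result of the first request determines the URL of the second. *)
let result =
  let open Blocking_fetch in
  fetch "http://example.org/search" String.length
  >>= fun n ->
  fetch (Printf.sprintf "http://example.org/fetch/%d" n) String.uppercase_ascii
  >|= fun body -> String.length body
```

The operators have the same shape as the bind and map operators of Lwt, so an asynchronous instance could plausibly set `'a fetched` to a promise type instead.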

module Make: 
functor (F : Fetch) ->
sig

module Object_id: 
sig
type t = [ `int of int | `string of string ] 
val to_string : t -> string
end

module Dbtag: 
sig
type t = {
   db : string;
   tag : Biocaml_entrez.Make.Object_id.t;
}
end

module Gene_ref: 
sig
type t = {
   locus : string option;
   allele : string option;
   desc : string option;
   maploc : string option;
   pseudo : bool option;
   db : Biocaml_entrez.Make.Dbtag.t list;
}
end

module PubmedSummary: 
sig
type t = {
   pmid : int;
   doi : string option;
   pubdate : string option;
   source : string option;
   title : string;
}
val search : string -> t list F.fetched
end

module Pubmed: 
sig
type t = {
   pmid : int;
   title : string;
   abstract : string;
}
val search : string -> t list F.fetched
end

module Gene: 
sig
type t = {
   _type : [ `miscRNA
| `ncRNA
| `other
| `protein_coding
| `pseudo
| `rRNA
| `scRNA
| `snRNA
| `snoRNA
| `tRNA
| `transposon
| `unknown ]
;
   summary : string option;
   gene : Biocaml_entrez.Make.Gene_ref.t;
}
val search : string -> t list F.fetched
end
end
end