A Solexa score is defined as -10 * log_10(p/(1-p)) rounded to an
integer, where p is a probability. Phred scores are far more
widely used, and the Biocaml_phred_score module supports
converting Solexa scores to Phred scores.
For details see "The Sanger FASTQ file format for sequences with
quality scores, and the Solexa/Illumina FASTQ variants". This module
supports what is called the fastq-solexa format in this paper, with one
exception. We are more permissive here in allowing conversions
from/to the entire range of visible ASCII characters (codes 33 -
126) instead of restricting to codes 59 - 126 as specified in this
paper. The smaller range is apparently based on the original
Solexa software returning minimum scores of -5, but there is no
reason for this minimum based on the general definition of Solexa
scores.
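
The relationship between the two scales follows from their definitions. The sketch below is illustrative only and assumes the usual Phred definition, -10 * log_10(p). Solving the Solexa definition for p gives p = 1 / (1 + 10^(s/10)), so the Phred score is 10 * log_10(1 + 10^(s/10)). The name phred_of_solexa is hypothetical and is not the Biocaml_phred_score API.

```ocaml
(* Hypothetical helper, not part of Biocaml: converts an integer Solexa
   score to the (unrounded) Phred score for the same probability. *)
let phred_of_solexa (s : int) : float =
  10.0 *. log10 (1.0 +. (10.0 ** (float_of_int s /. 10.0)))

let () =
  (* Print a few conversions; the two scales converge for high scores. *)
  List.iter
    (fun s -> Printf.printf "solexa %3d -> phred %.2f\n" s (phred_of_solexa s))
    [ -5; 0; 10; 40 ]
```

The result is left as a float here; how to round it to an integer Phred score is a separate choice.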
exception Error of
val of_ascii : char -> t
of_ascii x returns the Solexa score encoded by ASCII character
x. Raises Error if x's ASCII code is not between 33 and 126.
val to_ascii : t -> char
to_ascii t encodes t as an ASCII character. Raises Error if
t cannot be encoded as a visible ASCII character (codes 33 - 126).
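
The ASCII encoding can be sketched as follows. This is a minimal illustration, not the module's source. It assumes the offset of 64 used by the fastq-solexa format, which is consistent with code 59 corresponding to a score of -5. The exception Bad_score and the constant offset are hypothetical names, and scores are represented as plain int.

```ocaml
(* Hypothetical stand-in for the module's Error exception. *)
exception Bad_score of string

(* Assumed fastq-solexa offset: ASCII code = score + 64. *)
let offset = 64

(* Decode a visible ASCII character (codes 33 - 126) to a score. *)
let of_ascii (c : char) : int =
  let code = Char.code c in
  if code < 33 || code > 126 then
    raise (Bad_score (Printf.sprintf "ASCII code %d is outside 33 - 126" code))
  else code - offset

(* Encode a score as a visible ASCII character, if one exists. *)
let to_ascii (q : int) : char =
  let code = q + offset in
  if code < 33 || code > 126 then
    raise (Bad_score (Printf.sprintf "score %d has no visible ASCII code" q))
  else Char.chr code
```

Under this assumption the permissive range of codes 33 - 126 corresponds to scores from -31 to 62, whereas the stricter range 59 - 126 starts at -5.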
val of_probability : ?f:(float -> int) -> float -> t
of_probability ~f x returns
-10 * log_10(x/(1-x)), which is the definition of Solexa scores.
Solexa scores are integral, and it is unclear what convention is
used to convert the resulting float value to an integer. Thus,
f is provided to dictate this. The default is to
round the computed score to the closest integer.
Raises Error if x is not between 0.0 and 1.0.
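
A minimal sketch of how the optional rounding function might be used is given below. It is not the module's implementation. It assumes that ties round upward (round_nearest is a hypothetical helper), represents scores as plain int, raises Invalid_argument rather than the module's Error, and rejects the endpoints 0.0 and 1.0 because the score is infinite there.

```ocaml
(* Hypothetical default: round to the closest integer, ties upward. *)
let round_nearest (v : float) : int = int_of_float (floor (v +. 0.5))

(* Sketch of of_probability: compute -10 * log_10(x/(1-x)) and let the
   caller choose how the float is turned into an integer via ~f. *)
let of_probability ?(f = round_nearest) (x : float) : int =
  if not (x > 0.0 && x < 1.0) then
    invalid_arg "of_probability: x must be between 0.0 and 1.0"
  else f (-10.0 *. log10 (x /. (1.0 -. x)))

let () =
  (* The same probability under two rounding conventions. *)
  Printf.printf "nearest:  %d\n" (of_probability 0.1);
  Printf.printf "truncate: %d\n" (of_probability ~f:int_of_float 0.1)
```

For x = 0.1 the unrounded score is about 9.54, so the two conventions disagree, which is the reason f is exposed.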
val to_probability : t -> float
to_probability x converts x to a probability. Note this is not the
inverse of of_probability due to the rounding done by the latter.
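
The loss of information from rounding can be seen in a short sketch. It is illustrative only: scores are plain int, rounding to the nearest integer is assumed, and the closed form p = 1 / (1 + 10^(q/10)) is obtained by solving q = -10 * log_10(p/(1-p)) for p.

```ocaml
(* Sketch of the forward conversion with round-to-nearest (assumed). *)
let of_probability (x : float) : int =
  int_of_float (floor ((-10.0 *. log10 (x /. (1.0 -. x))) +. 0.5))

(* Sketch of the reverse conversion: invert the Solexa definition. *)
let to_probability (q : int) : float =
  1.0 /. (1.0 +. (10.0 ** (float_of_int q /. 10.0)))

let () =
  (* Round trip: the recovered probability differs from the input. *)
  let p = 0.1 in
  let q = of_probability p in
  Printf.printf "p = %g -> score %d -> p' = %g\n" p q (to_probability q)
```

Here 0.1 maps to a score of 10, and a score of 10 maps back to 1/11, not 0.1.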