SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM (Sequence Alignment/Map), BAM (Binary Alignment/Map) and CRAM formats, written by Heng Li. These files are generated as output by short read aligners like BWA. Both simple and advanced tools are provided, supporting complex tasks like variant calling and alignment viewing as well as sorting, indexing, data extraction and format conversion.

SAM files can be very large (tens of gigabytes is common), so compression is used to save space. SAM files are human-readable text files, and BAM files are simply their binary equivalent, whilst CRAM files use a restructured, column-oriented binary container format. BAM files are typically compressed and more efficient for software to work with than SAM. SAMtools makes it possible to work directly with a compressed BAM file, without having to uncompress the whole file. Additionally, since the SAM/BAM format is somewhat complex (containing reads, references, alignments, quality information, and user-specified annotations), SAMtools reduces the effort needed to use SAM/BAM files by hiding low-level details.
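To illustrate what "human-readable" means for SAM, the following is a minimal Python sketch that splits one alignment line into the eleven mandatory tab-separated columns defined by the SAM specification, plus any optional tags. It is not part of SAMtools: the names `SamRecord`, `parse_sam_line`, `is_unmapped`, `is_reverse` and the example read are hypothetical, chosen only for illustration, and the sketch ignores header lines and does not convert typed tag values.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SamRecord:
    """One alignment line: the 11 mandatory SAM columns plus optional tags."""
    qname: str   # read (query) name
    flag: int    # bitwise flags
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # CIGAR string describing the alignment
    rnext: str   # reference name of the mate/next read
    pnext: int   # position of the mate/next read
    tlen: int    # observed template length
    seq: str     # read sequence
    qual: str    # base qualities (ASCII-encoded)
    tags: Dict[str, str] = field(default_factory=dict)


def parse_sam_line(line: str) -> SamRecord:
    """Parse a single non-header SAM line (hypothetical helper)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 11:
        raise ValueError(f"expected at least 11 columns, got {len(cols)}")
    tags = {}
    for raw in cols[11:]:
        # Optional fields look like TAG:TYPE:VALUE; the value is kept as text.
        tag, _type_code, value = raw.split(":", 2)
        tags[tag] = value
    return SamRecord(
        qname=cols[0], flag=int(cols[1]), rname=cols[2], pos=int(cols[3]),
        mapq=int(cols[4]), cigar=cols[5], rnext=cols[6], pnext=int(cols[7]),
        tlen=int(cols[8]), seq=cols[9], qual=cols[10], tags=tags,
    )


def is_unmapped(rec: SamRecord) -> bool:
    """Flag bit 0x4 marks a read that has no alignment."""
    return bool(rec.flag & 0x4)


def is_reverse(rec: SamRecord) -> bool:
    """Flag bit 0x10 marks a read aligned to the reverse strand."""
    return bool(rec.flag & 0x10)


# A made-up alignment line, for illustration only.
EXAMPLE_LINE = "read1\t16\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII\tNM:i:0"

if __name__ == "__main__":
    record = parse_sam_line(EXAMPLE_LINE)
    print(record.qname, record.rname, record.pos, is_reverse(record))
```

Even this small sketch has to deal with bit flags, optional tags and column order by hand, and it only covers the text form. Reading BAM or CRAM additionally involves decompression and binary decoding, which is the kind of low-level detail SAMtools hides.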