A genomic analysis toolkit focused on variant discovery. The GATK is the industry standard for identifying SNPs and indels in germline DNA and RNAseq data. Its scope is now expanding to include somatic short variant calling, and to tackle copy number (CNV) and structural variation (SV). In addition to the variant callers themselves, the GATK also includes many utilities to perform related tasks such as processing and quality control of high-throughput sequencing data, and bundles the popular Picard toolkit.
These tools were primarily designed to process exomes and whole genomes generated with Illumina sequencing technology, but they can be adapted to handle a variety of other technologies and experimental designs. And although it was originally developed for human genetics, the GATK has since evolved to handle genome data from any organism, with any level of ploidy.
This package is built with Spack and optimized for AVX, AVX2, and AVX512 CPUs. To use the optimized version, add `source /etc/profile.d/zlmod.sh` to your submit script before loading any modules. By default, the AVX2-optimized version is in your path, because the head node has a Haswell CPU. The AVX2-optimized version runs on Skylake (enge, im2080, chem, health) and Cascade Lake (hawkcpu, hawkmem, hawkgpu, infolab) CPUs, but not on Ivy Bridge (debug) CPUs.
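If you are unsure which instruction set a compute node supports, a quick check of its CPU flags can help. The snippet below is a minimal sketch, not part of the site tooling: the function name `simd_level` is hypothetical, and it assumes a Linux node where `/proc/cpuinfo` is readable.

```shell
#!/bin/bash
# Hypothetical helper: print the highest AVX level found in a list of CPU flags.
# With no argument, the flags are read from /proc/cpuinfo (Linux only).
simd_level() {
    local flags
    if [ "$#" -gt 0 ]; then
        flags="$1"
    else
        flags="$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null)"
    fi
    # Pad with spaces so each flag is matched as a whole word.
    flags=" ${flags} "
    case "$flags" in
        *" avx512f "*) echo "avx512" ;;
        *" avx2 "*)    echo "avx2" ;;
        *" avx "*)     echo "avx" ;;
        *)             echo "none" ;;
    esac
}

# Report the level for the node this script runs on.
simd_level
```

A node that reports `avx2` or `avx512` should be able to run the default AVX2 build; a node that reports only `avx` (as Ivy Bridge does) cannot.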
| Version | Module name |
|---|---|
| 4.1.0.0 | gatk/4.1.0.0 |
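Putting the pieces together, a submit script might look like the sketch below. It assumes a Slurm scheduler and treats `hawkcpu` (one of the names listed above) as a partition name; the `#SBATCH` values are placeholders to adapt to your own job. Only the `source` line and the module name come from this page.

```shell
#!/bin/bash
#SBATCH --partition=hawkcpu   # assumption: Slurm, with hawkcpu as a partition name
#SBATCH --ntasks=1            # placeholder resource request
#SBATCH --time=01:00:00       # placeholder time limit

# Enable the optimized module tree; this must come before any module commands.
source /etc/profile.d/zlmod.sh

# Load the GATK module listed in the table above.
module load gatk/4.1.0.0

# List the available GATK tools to confirm that the module loaded correctly.
gatk --list
```

Replace the final command with the GATK tool invocation your analysis needs.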
For more information, visit https://gatk.broadinstitute.org/hc/en-us