Adapter Removal
Removes adapter sequences, trims low quality bases from 3' ends, or merges overlapping pairs into consensus
AfterQC
Automatic filtering, trimming, error removal, and quality control for FastQ data
Anglerfish
Quality controls Illumina libraries sequenced on Oxford Nanopore flowcells
Bakta
Rapid & standardized annotation of bacterial genomes, MAGs & plasmids
Bamdst
Lightweight tool to compute depth-of-coverage statistics for target regions of BAM file(s)
Bamtools
Provides both a programmer's API and an end-user's toolkit for handling BAM files
BBDuk
Common data-quality-related trimming, filtering, and masking operations with a k-mer-based approach
BBTools
Pre-processing, assembly, alignment, and statistics tools for DNA/RNA sequencing reads
Bcftools
Utilities for variant calling and manipulating VCFs and BCFs
bcl2fastq
Demultiplexes data and converts BCL files to FASTQ file formats for downstream analysis
BCL Convert
Demultiplexes data and converts BCL files to FASTQ file formats for downstream analysis
biobambam2
Tools for early stage alignment file processing
BioBloom Tools
Assigns reads to different references using bloom filters. This is faster than alignment and can be used for contamination detection
BISCUIT
Maps bisulfite converted DNA sequence reads and determines cytosine methylation states
Bismark
Maps bisulfite converted sequence reads and determines cytosine methylation states
Bowtie 1
Ultrafast, memory-efficient short read aligner
Bowtie 2 / HISAT2
Results from both Bowtie 2 and HISAT2, tools for aligning reads against a reference genome
BUSCO
Assesses genome assembly and annotation completeness
Bustools
Tools for BUS files - a file format for single-cell RNA-seq data designed to facilitate the development of modular workflows for data processing
CCS
PacBio tool that generates highly accurate single-molecule consensus reads (HiFi Reads)
Cell Ranger
Analyzes single cell expression or VDJ data produced by 10x Genomics
CheckQC
Checks a set of quality criteria against an Illumina runfolder
ClipAndMerge
Adapter clipping and read merging for ancient DNA data
Cluster Flow
Simple and flexible bioinformatics pipeline tool
Conpair
Estimates concordance and contamination for tumor–normal pairs
Cutadapt
Finds and removes adapter sequences, primers, poly-A tails, and other types of unwanted sequences
DeDup
Improved duplicate removal for merged/collapsed reads in ancient DNA analysis
deepTools
Tools to process and analyze deep sequencing data
DIAMOND
Sequence aligner for protein and translated DNA searches, a drop-in replacement for NCBI BLAST
Disambiguate
Disambiguates reads aligned to two different species (e.g. human and mouse)
DRAGEN
Illumina Bio-IT Platform that uses FPGA for secondary analysis of sequencing data
DRAGEN-FastQC
Illumina Bio-IT Platform that uses FPGA for secondary analysis of sequencing data
eigenstratdatabasetools
Tools to compare and manipulate the contents of EigenStrat databases, and to calculate SNP coverage statistics in such databases
fastp
All-in-one FASTQ preprocessor (QC, adapters, trimming, filtering, splitting...)
FastQ Screen
Screens a library of sequences in FastQ format against a set of sequence databases to see if the composition of the library matches with what you expect
FastQC
Quality control tool for high throughput sequencing data
featureCounts
Counts mapped reads for genomic features such as genes, exons, promoters, gene bodies, genomic bins and chromosomal locations
fgbio
Processing and evaluating data containing UMIs
FLASh
Merges paired-end reads from next-generation sequencing experiments
Flexbar
Barcode and adapter removal tool
Freyja
Recovers relative lineage abundances from mixed SARS-CoV-2 samples
Ganon
Metagenomics classification: quickly assigns sequence fragments to their closest reference among thousands of references via Interleaved Bloom Filters of k-mer/minimizers
GATK
Wide variety of tools with a primary focus on variant discovery and genotyping
GffCompare
Tool to compare, merge and annotate one or more GFF files with a reference annotation in GFF format
GLIMPSE
Low-coverage whole genome sequencing imputation
goleft indexcov
Quickly estimates coverage from a whole-genome BAM index, providing 16KB resolution
GoPeaks
Calls peaks in CUT&TAG/CUT&RUN datasets
Haplocheck
Detects in-sample contamination in mtDNA or WGS sequencing studies by analyzing the mitochondrial content
hap.py
Benchmarks variant calls against gold standard truth datasets
HiCExplorer
Hi-C analysis from processing to visualization
HiC-Pro
Pipeline for Hi-C data processing
HiCUP
Mapping and quality control on Hi-C data
HiFiasm
Haplotype-resolved assembler for accurate HiFi reads
HISAT2
Maps DNA or RNA reads against a genome or a population of genomes
HOMER
Motif discovery and next-gen sequencing analysis
HOPS
Screens output from the metagenomic aligner MALT for ancient DNA characteristics
Hostile
Removes host sequences from short and long read (meta)genomes, from paired or unpaired fastq[.gz]
HTSeq Count
Part of the HTSeq package: counts reads covering specified genomic features
HUMID
Reference-free tool to quickly remove duplicates from FastQ files, with or without UMIs
Iso-Seq
Identifies transcripts in PacBio single-molecule sequencing data (HiFi reads)
iVar
Functions for viral amplicon-based sequencing
Kaiju
Taxonomic classification for metagenomics
Kallisto
Quantifies abundances of transcripts (or more generally, of target sequences) from RNA-Seq data
Kraken
Taxonomic classification using exact k-mer matches to find the lowest common ancestor (LCA) of a given sequence
leeHom
Bayesian reconstruction of ancient DNA
Librarian
Predicts the sequencing library type from the base composition of a FastQ file
Lima
Demultiplexes PacBio single-molecule sequencing reads
Long Ranger
Sample demultiplexing, barcode processing, alignment, quality control, variant calling, phasing, and structural variant calling
MACS2
Identifies transcription factor binding sites in ChIP-seq data
MALT
Aligns metagenomic reads to a database of reference sequences (such as NR, GenBank or Silva) and outputs a MEGAN RMA file
mapDamage
Tracks and quantifies damage patterns in ancient DNA sequences
MetaPhlAn
Profiles the composition of microbial communities from metagenomic shotgun sequencing data
methylQA
Methylation sequencing data quality assessment tool
MinIONQC
Quality control for ONT (Oxford Nanopore) long reads
mirtop
Annotates miRNAs and isomiRs and computes general statistics in mirGFF3 format
miRTrace
Quality control for small RNA sequencing data
Mosdepth
Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing
Motus
Microbial profiling through marker gene (MG)-based operational taxonomic units (mOTUs)
mtnucratio
Computes mitochondrial to nuclear genome ratios in NGS datasets
MultiVCFAnalyzer
Reads multiple VCF files into combined genotype calls, produces summary statistics and downstream formats
nanoq
Reports read quality and length from nanopore sequencing data
NanoStat
Reports various statistics for long-read datasets in FASTQ, BAM, or albacore sequencing summary format (supports NanoPack; NanoPlot, NanoComp)
Nextclade
Viral genome alignment, clade assignment, mutation calling, and quality checks
ngs-bits
Calculates statistics from FASTQ, BAM, and VCF
ngsderive
Forensic tool for backwards computing library information in sequencing data
Nonpareil
Estimates metagenomic coverage and sequence diversity
ODGI
Analysis and manipulation of pangenome graphs structured in the variation graph model
OptiType
Precision HLA typing from next-generation sequencing data
pairtools
Toolkit for Chromatin Conformation Capture experiments. Handles short-read paired reference alignments, extracts 3C-specific information, and performs common tasks such as sorting, filtering, and deduplication
Pangolin
Uses variant calls to assign SARS-CoV-2 genome sequences to global lineages
pbmarkdup
Takes one or multiple sequencing chips of an amplified library as HiFi reads and marks or removes duplicates
Peddy
Compares familial-relationships and sexes as reported in a PED file with those inferred from a VCF
phantompeakqualtools
Computes informative enrichment and quality measures for ChIP-seq/DNase-seq/FAIRE-seq/MNase-seq data
Picard
Tools for manipulating high-throughput sequencing data
Porechop
Finds and removes adapters from Oxford Nanopore reads
Preseq
Estimates library complexity, showing how many additional unique reads are sequenced for increasing total read count
PRINSEQ++
C++ implementation of the prinseq-lite.pl program. Filters, reformats, and trims genomic and metagenomic reads
Prokka
Rapid annotation of prokaryotic genomes
PURPLE
A purity, ploidy and copy number estimator for whole genome tumor data
Pychopper
Identifies, orients, trims and rescues full length Nanopore cDNA reads. Can also rescue fused reads
pycoQC
Computes metrics and generates interactive QC plots for Oxford Nanopore Technologies sequencing data
qc3C
Reference-free and BAM based quality control for Hi-C data
QoRTs
Toolkit for analysis, QC, and data management of RNA-Seq datasets
QualiMap
Quality control of alignment data and its derivatives like feature counts
QUAST
Quality assessment tool for genome assemblies
RNA-SeQC
RNA-Seq metrics for quality control and process optimization
Rockhopper
Bacterial RNA-seq analysis: align reads to coding sequences, rRNAs, tRNAs, and miscellaneous RNAs
RSEM
Estimates gene and isoform expression levels from RNA-Seq data
RSeQC
Evaluates high throughput RNA-seq data
Salmon
Quantifies expression of transcripts using RNA-seq data
Sambamba
Toolkit for interacting with BAM/CRAM files
Samblaster
Marks duplicates and extracts discordant and split reads from SAM files
Samtools
Toolkit for interacting with BAM/CRAM files
Sargasso
Separates mixed-species RNA-seq reads according to their species of origin
Sequali
Sequencing quality control for both long-read and short-read data
SeqWho
Determines FASTQ(A) sequencing file source protocol and the species of origin, to check that the composition of the library is expected
SeqyClean
Filters adapters, vectors, and contaminants while quality trimming
SexDetErrmine
Calculates relative coverage of X and Y chromosomes and their associated error bars from the depth of coverage at specified SNPs
Sickle
A windowed adaptive trimming tool for FASTQ files using quality scores
Skewer
Adapter trimming tool for NGS paired-end sequences
Snippy
Rapid haploid variant calling and core genome alignment
SnpEff
Annotates and predicts the effects of variants on genes (such as amino acid changes)
SNPsplit
Allele-specific alignment sorter. Determines allelic origin of reads that cover known SNP positions
Somalier
Genotype to pedigree correspondence checks from sketches derived from BAM/CRAM or VCF
SortMeRNA
Program for filtering, mapping and OTU-picking NGS reads in metatranscriptomic and metagenomic data
Sourmash
Quickly searches, compares, and analyzes genomic and metagenomic data sets
Space Ranger
Tool to analyze 10x Genomics spatial transcriptomics data
Stacks
Analyzes restriction enzyme-based data (e.g. RAD-seq)
STAR
Universal RNA-seq aligner
Supernova
De novo genome assembler of 10x Genomics linked-reads
THetA2
Estimates tumour purity and clonal / subclonal copy number
Tophat
Splice junction RNA-Seq reads mapper for mammalian-sized genomes
Truvari
Benchmarking, merging, and annotating structural variants
UMICollapse
Algorithms for efficiently collapsing reads with Unique Molecular Identifiers
UMI-tools
Tools for dealing with Unique Molecular Identifiers (UMIs)/(RMTs) and scRNA-Seq barcodes
VarScan2
Variant detection in massively parallel sequencing data
VCFTools
Program to analyse and report on VCF files
VEP
Determines the effect of variants on genes, transcripts and protein sequences, as well as regulatory regions
VerifyBAMID
Detects sample contamination and/or sample swaps
VG
Toolkit to manipulate and analyze graphical genomes, including read alignment
WhatsHap
Phasing genomic variants using DNA reads (aka read-based phasing, or haplotype assembly)
Xengsort
Fast xenograft read sorter based on space-efficient k-mer hashing
Xenome
Classifies reads from xenograft sources