Abawaca (1.07)– abawaca (A Binning Algorithm Without A Cool
Acronym) is a binning program that can take advantage of different types of information such as
differential coverage and DNA signature
Abyss (2.1.0)– ABySS is a parallel,
paired-end sequence assembler that is designed for short reads. The single-processor version is useful
for assembling genomes up to 100 Mbases in size. The parallel version is implemented using MPI and is
capable of assembling larger genomes.
AFNI (17.2.16)– AFNI is a set of C programs for processing,
analyzing, and displaying functional MRI (FMRI) data
Ants (2.2)– Advanced Normalization Tools (ANTs)
extracts information from complex datasets that include imaging.
BBMap (38.16)– BBMap is a splice-aware global aligner for DNA and RNA sequencing reads. It can align reads
from all major platforms – Illumina, 454, Sanger, Ion Torrent, Pac Bio, and Nanopore. BBMap has a large
array of options, described in its shell script. It can output many different statistics files, such as
an empirical read quality histogram, insert-size distribution, and genome coverage, with or without
generating a sam file.
Bedtools (2.27.1)– The bedtools utilities are a
Swiss-army knife of tools for a wide range of genomics analysis tasks. The most widely-used tools enable
genome arithmetic: that is, set theory on the genome. For example, bedtools allows one to intersect,
merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file
formats such as BAM, BED, GFF/GTF, VCF.
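
As a rough illustration of the "genome arithmetic" described for bedtools, the sketch below merges and intersects intervals on a single chromosome in plain Python. It is not how bedtools is implemented and does not reproduce its per-feature output; the function names `merge` and `intersect` are hypothetical helpers, and intervals are assumed to be half-open (start, end) pairs as in BED.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), BED-style (assumption)


def merge(intervals: List[Interval]) -> List[Interval]:
    """Collapse overlapping or book-ended intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the regions covered by both interval sets (set intersection)."""
    a, b = merge(a), merge(b)
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


if __name__ == "__main__":
    print(merge([(1, 5), (3, 8), (10, 12)]))
    print(intersect([(1, 8), (10, 12)], [(5, 11)]))
```

Real data spans many chromosomes and strands, so in practice the same logic would be applied per chromosome, which is the bookkeeping bedtools handles for you.
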
blast (2.6.0)– Basic Local Alignment Search Tool
is a sequence comparison algorithm optimized for speed used to search sequence databases for optimal
local alignments to a query.
blat (35)– BLAT on DNA is designed to quickly find
sequences of 95% and greater similarity of length 25 bases or more. It may miss more divergent or
shorter sequence alignments. It will find perfect sequence matches of 20 bases. BLAT on proteins finds
sequences of 80% and greater similarity of length 20 amino acids or more.
bowtie2 (188.8.131.52, 2.3.3)– Bowtie is
an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human
genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a
Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome
(2.9 GB for paired-end).
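
To give a feel for the Burrows-Wheeler transform underlying the index mentioned for Bowtie, here is a naive sketch in Python. It builds the transform by sorting all rotations, which is only practical for toy strings; real aligners use compressed index structures instead. The function names `bwt` and `inverse_bwt` and the `$` sentinel are assumptions made for illustration.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    assert sentinel not in text, "sentinel must not occur in the input"
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The original text is the row that ends with the sentinel.
    row = next(r for r in table if r.endswith(sentinel))
    return row[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)
    print(inverse_bwt(transformed))
```

The transform tends to group identical characters together, which is what makes the index compressible and keeps the memory footprint small.
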
bwa (3.16)– BWA is a program for aligning sequencing reads
against a large reference genome (e.g. human genome). It has two major components, one for reads shorter
than 150 bp and the other for longer reads.
checkm (1.0.9)– Assess the quality of microbial
genomes recovered from isolates, single cells, and metagenomes
concoct (0.4.0)– A program for unsupervised binning of
metagenomic contigs by using nucleotide composition, coverage data in multiple samples and linkage data
from paired end reads.
cufflinks (2.2.1)– Cufflinks assembles
transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq
samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of
transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many
reads support each one, taking into account biases in library preparation protocols.
DAS_Tool (1.1)– DAS Tool is an automated method that
integrates the results of a flexible number of binning algorithms to calculate an optimized,
non-redundant set of bins from a single assembly.
epacts (3.2.6)– EPACTS (Efficient and
Parallelizable Association Container Toolbox) is a versatile software pipeline to perform various
statistical tests for identifying genome-wide association from sequence data through a user-friendly
interface, both to scientific analysts and to method developers.
FastQC (0.11.5)– FastQC aims to
provide a simple way to do some quality control checks on raw sequence data coming from high throughput
sequencing pipelines.
freesurfer (2017)– FreeSurfer is a set of tools for
analysis and visualization of structural and functional brain imaging data
fsl (5.0.10)– FSL is a comprehensive library of
analysis tools for FMRI, MRI and DTI brain imaging data.
GATK (3.80, 184.108.40.206)– The Genome Analysis Toolkit
or GATK is a software package developed at the Broad Institute to analyze high-throughput sequencing
data.
kallisto (0.44.0)– kallisto is a program for
quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using
high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly
determining the compatibility of reads with targets, without the need for alignment.
lammps (mar 2017)– LAMMPS is a classical molecular dynamics
code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. LAMMPS has
potentials for soft materials (biomolecules, polymers) and solid-state materials (metals,
semiconductors) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more
generically, as a parallel particle simulator at the atomic, meso, or continuum scale.
maxbin (2.2.4)– MaxBin is software for binning
assembled metagenomic sequences based on an Expectation-Maximization algorithm.
megahit (1.1.3)– An ultra-fast single-node solution for
large and complex metagenomics assembly via succinct de Bruijn graph
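
For readers unfamiliar with the de Bruijn graphs that assemblers such as megahit build on, the following is a minimal sketch in Python. It stores the graph in an ordinary dictionary, not the succinct representation megahit uses, and the function name `de_bruijn` is a hypothetical helper chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def de_bruijn(reads: List[str], k: int) -> Dict[str, List[str]]:
    """Build a de Bruijn graph: nodes are (k-1)-mers, each k-mer adds one edge."""
    graph: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix to its suffix.
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    for node, successors in de_bruijn(["ACGTAC", "GTACG"], 3).items():
        print(node, "->", successors)
```

An assembler then walks paths through such a graph to reconstruct contigs; error correction, reverse complements, and graph simplification are omitted here.
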
metabat (2.12.1)– A robust statistical framework
for reconstructing genomes from metagenomic data.
miRA (1.2.0)– MIRA is a whole genome
shotgun and EST sequence assembler
miRExpress (2.1.4)– A database-supported, efficient and
flexible tool for detecting miRNA expression profiles.
MiRge (2018)– A
fast, smart small RNA-seq solution to process samples in a highly multiplexed fashion. miRge employs a
Bayesian alignment approach, whereby reads are sequentially aligned against customized mature miRNA,
hairpin miRNA, noncoding RNA and mRNA sequence libraries.
oases (0.2.09)– Oases is a de novo transcriptome
assembler designed to produce transcripts from short read sequencing technologies, such as Illumina,
SOLiD, or 454 in the absence of any genomic assembly.
picard (2017)– A set of Java command line tools
for manipulating high-throughput sequencing (HTS) data and formats
Pplacer (1.1)– Pplacer places query sequences
on a fixed reference phylogenetic tree to maximize phylogenetic likelihood or posterior probability
according to a reference alignment.
prodigal (2.6.3)– Prodigal (Prokaryotic Dynamic
Programming Genefinding Algorithm) is a microbial (bacterial and archaeal) gene finding program
rsem (1.3.0)– Accurate quantification of gene and isoform
expression from RNA-Seq data
samtools (1.5)– SAM (Sequence Alignment/Map) format is a
generic format for storing large nucleotide sequence alignments
snpEff (12_2017)– Genetic variant annotation and effect prediction toolbox.
SNPiR (12_2017)– Identifies single-nucleotide
polymorphisms (SNPs) in RNA-seq data. SNPiR consists of (1) a modified RNA-seq read-mapping procedure
that allows alignment of reads to the reference in a splice-aware manner, (2) variant calling using the
Genome Analysis Toolkit (GATK) and (3) vigorous filtering of false-positive calls.
SOAPdenovo (r240)– SOAPdenovo is a novel
short-read assembly method that can build a de novo draft assembly for human-sized genomes. The
program is specially designed to assemble Illumina GA short reads. It creates new opportunities for
building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective
way.
spades (3.11, 3.12)– SPAdes – St. Petersburg genome
assembler – is intended for both standard isolates and single-cell MDA bacteria assemblies.
star (2.5)– RNAseq aligner
stringtie (1.3.3)– StringTie is a fast and highly
efficient assembler of RNA-Seq alignments into potential transcripts. It uses a novel network flow
algorithm as well as an optional de novo assembly step to assemble and quantitate full-length
transcripts representing multiple splice variants for each gene locus.
subread (1.5.3)– The Subread software package is a tool kit
for processing next-gen sequencing data. It includes Subread aligner, Subjunc exon-exon junction
detector and featureCounts read summarization program
tophat (2.1.1)– TopHat is a fast splice
junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra
high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice
junctions between exons.
transabyss (2.0.1)– de novo
assembly of RNA-Seq data using ABySS
trimmomatic (0.36)– a flexible read trimming
tool for Illumina NGS data.
Trinity (2.6.6)– Trinity, developed at the
Broad Institute and the Hebrew University of Jerusalem, represents a novel method for the efficient and
robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent
software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of
RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each
representing the transcriptional complexity at a given gene or locus, and then processes each graph
independently to extract full-length splicing isoforms and to tease apart transcripts derived from
paralogous genes.
vcf2maf (2017)– Convert a VCF into a MAF, where each
variant is annotated to only one of all possible gene isoforms