Data portals

The Cancer Genome Atlas (TCGA)
The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate the understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing.

ENCODE project (ENCyclopedia Of DNA Elements)
The goal of ENCODE is to build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control cells and circumstances in which a gene is active.

NIH Roadmap Epigenomics Mapping Consortium
The goal of the Roadmap Epigenomics Mapping Consortium is to produce a public resource of human epigenomic data to catalyze basic biology and disease-oriented research.

FANTOM (Functional ANnoTation Of the Mammalian genome)
FANTOM is a worldwide collaborative project aiming at identifying all functional elements in mammalian genomes.

Epigenomic mapping of distinct types of haematopoietic cells from healthy individuals and of their malignant leukaemic counterparts.

Web application that integrates and unifies high-throughput cancer profiling data so that target expression across a large volume of cancer types, subtypes, and experiments can be assessed online.

Gene Expression Omnibus (GEO)
A public functional genomics data repository supporting MIAME-compliant data submissions.


The R Project for Statistical Computing
R is a free software environment for statistical computing and graphics.

Bioconductor provides tools for the analysis and comprehension of high-throughput genomic data. Bioconductor uses the R statistical programming language, and is open source and open development.

The Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyze next-generation sequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance.

SAMtools provides various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.

Allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF/GTF, VCF.
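
To make two of the interval operations named above concrete, here is a minimal sketch in plain Python. It is not the tool's implementation or API. The function names `merge` and `intersect`, the tuple layout, and the choice to merge intervals that merely touch are assumptions made for illustration. Coordinates are treated as 0-based and half-open, as in BED files.

```python
from typing import List, Tuple

# Hypothetical representation: (chromosome, start, end), 0-based half-open.
Interval = Tuple[str, int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Collapse overlapping or touching intervals on the same chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous interval instead of starting a new one.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping portion of every pair of intervals from a and b.

    Quadratic pairwise comparison: fine for a sketch, too slow for real data.
    """
    overlaps: List[Interval] = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # non-empty overlap only
                overlaps.append((chrom_a, start, end))
    return sorted(overlaps)


# Made-up example coordinates, not real annotations.
peaks = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
genes = [("chr1", 180, 250), ("chr2", 10, 60)]
merged_peaks = merge(peaks)
shared = intersect(peaks, genes)
```

Real interval tools work on sorted files and use sweep or index structures rather than the pairwise loop shown here.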

Software for motif discovery and next-generation sequencing analysis.

Model-based Analysis for ChIP-Seq.

A program to map bisulfite-treated sequencing reads to a genome of interest and perform methylation calls in a single step.

An ultrafast, memory-efficient short read aligner.

Trim Galore
A wrapper script to automate quality and adapter trimming as well as quality control, with some added functionality to remove biased methylation positions for RRBS sequence files (directional, non-directional, or paired-end sequencing).

This pipeline aligns RRBS and ERRBS reads to a given genome and calls per-base DNA methylation scores.

methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants.

Optimized DMR analysis based on bimodal normal distribution model and cost function for regional methylation analysis.

BSMAP is short-read mapping software for bisulfite sequencing reads.

Tools for analyzing and visualizing Illumina's 450k array data.

Genomic Regions Enrichment of Annotations Tool (GREAT) predicts functions of cis-regulatory regions.

The Galaxy project
Galaxy is an open, web-based platform for data intensive biomedical research.

Circos is a software package for visualizing data and information.

Roadmap epigenome project protocols database
Protocols in epigenetics.

CGI Hunter
A software tool for exhaustive CpG island annotation.

Software for the analysis of MeDIP-seq data.

A tool to visualise and analyse high throughput mapped sequence data.


The next generation sequencing community.

Genome browsers

UCSC Genome Browser.

Human Epigenome Browser from WashU
A cutting-edge resource for visualizing and interacting with whole-genome datasets.

Integrative Genomics Viewer
The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets.