Mar. 12, 2022: ShinyGO 0.75c released with customized annotation databases for 12 species.

Feb. 11, 2022: ShinyGO 0.75c released with a customized database including annotation and pathway information for 5 species requested by users.

Feb. 8, 2022: ShinyGO v0.75 officially released. Old versions are still available. See the last tab.

Nov. 15, 2021: Database update. ShinyGO v0.75 available in testing mode. It includes an Ensembl database update, new species from Ensembl Fungi and Ensembl Protists, and an update of STRING-db (5090 species) to v11.5.

Oct. 25, 2021: Interactive genome plot. Identification of genomic regions significantly enriched with user genes.

Oct. 23, 2021: Version 0.741. A fully customizable enrichment chart! Switch between bar, dot, or lollipop plots. Detailed gene information with links on the Genes tab.

Oct. 15, 2021: Version 0.74. Database updated to Ensembl Release 104 and STRING v11. We now recommend the use of background genes in enrichment analysis. V0.74 is much faster, even with large sets of background genes.

We recently hired Jenny for database updates and user support. Email Jenny for questions, suggestions or data contributions.

2/3/2020: Now published by Bioinformatics.

Just paste your gene list to get enriched GO terms and other pathways for over 400 plant and animal species, based on annotation from Ensembl, Ensembl Plants and Ensembl Metazoa. An additional 5000 genomes (including bacteria and fungi) are annotated based on STRING-db (v.11). In addition, it also produces KEGG pathway diagrams with your genes highlighted, hierarchical clustering trees and networks summarizing overlapping terms/pathways, protein-protein interaction networks, gene characteristics plots, and enriched promoter motifs. See example outputs below:


















FDR is calculated based on the nominal P-value from the hypergeometric test. Fold Enrichment is defined as the percentage of genes in your list belonging to a pathway, divided by the corresponding percentage in the background. FDR tells us how likely it is that the enrichment arose by chance. Large gene sets tend to have smaller FDR. As a measure of effect size, Fold Enrichment indicates how strongly the genes of a certain pathway are overrepresented. When 'Remove redundant pathway' is selected, similar pathways sharing 95% of their genes are represented by the most significant pathway. Pathways that are too big or too small are excluded from the analysis using the Pathway Size limits.
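The quantities described above can be illustrated with a minimal Python sketch. It is not ShinyGO's implementation: the function names are hypothetical, and the Benjamini-Hochberg procedure is assumed here as the FDR adjustment, since the text only says that FDR is derived from the nominal P-values.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Upper-tail P(X >= k): k of the n query genes fall in a pathway of
    K genes, given a background of N genes."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def fold_enrichment(k, n, K, N):
    """Fraction of query genes in the pathway divided by the fraction of
    background genes in the pathway."""
    return (k / n) / (K / N)


def benjamini_hochberg(pvalues):
    """Adjust P-values with the Benjamini-Hochberg procedure.
    Assumption: the source does not name the FDR method."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

For example, 10 pathway genes in a list of 100, against a pathway of 200 genes in a background of 20,000, gives a Fold Enrichment of (10/100) / (200/20000), that is, 10.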



Please select KEGG from the pathway databases to conduct enrichment analysis first. Then you can visualize your genes on any of the significant pathways. KEGG diagrams are available only for some species.


Your genes are highlighted in red. Downloading a pathway diagram from KEGG can take 3 minutes.
A hierarchical clustering tree summarizing the correlation among significant pathways listed in the Enrichment tab. Pathways with many shared genes are clustered together. Bigger dots indicate more significant P-values.
Some sizes may not work. Try different combinations. Figure needs to be wide as pathway names are long.
Similar to the Tree tab, this interactive plot also shows the relationship between enriched pathways. Two pathways (nodes) are connected if they share 20% (default) or more genes. You can move the nodes by dragging them, zoom in and out by scrolling, and shift the entire network by clicking on an empty point and dragging. Darker nodes are more significantly enriched gene sets. Bigger nodes represent larger gene sets. Thicker edges represent more overlapping genes.
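The edge rule can be sketched in a few lines of Python. This is an illustration only: the text does not say how the shared percentage is normalized, so the sketch assumes it is measured against the smaller of the two gene sets, and the function name `overlap_edges` is hypothetical.

```python
from itertools import combinations


def overlap_edges(pathways, cutoff=0.2):
    """Return (pathway_a, pathway_b, shared_fraction) for every pair whose
    overlap meets the cutoff. `pathways` maps a name to a set of genes."""
    edges = []
    for (name_a, genes_a), (name_b, genes_b) in combinations(pathways.items(), 2):
        smaller = min(len(genes_a), len(genes_b))
        if smaller == 0:
            continue
        # Assumption: overlap is relative to the smaller gene set.
        fraction = len(genes_a & genes_b) / smaller
        if fraction >= cutoff:
            edges.append((name_a, name_b, fraction))
    return edges
```

The returned fraction could then drive edge thickness, in line with the description that thicker edges mean more overlapping genes.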
Your genes are grouped by functional categories defined by high-level GO terms.
The characteristics of your genes are compared with the rest in the genome. Chi-squared and Student's t-tests are run to see if your genes have special characteristics when compared with all the other genes or, if uploaded, a customized background.
The genes are represented by red dots. The purple lines indicate regions where these genes are statistically enriched, compared to the density of genes in the background. We scan the genome with a sliding window. Each window is further divided into several equal-sized steps for sliding. Within each window, we use the hypergeometric test to determine whether your genes are significantly overrepresented. Essentially, the genes in each window define a gene set/pathway, on which we carry out enrichment analysis. The chromosomes may be only partly shown, as we use the last gene's location to draw the line. Mouse over to see gene symbols. Zoom in on regions of interest.
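A rough Python sketch of this scan for a single chromosome is given below. Everything beyond the description above is an assumption made for illustration: the function names, the use of one chromosome's genes as the background, and the absence of a multiple-testing correction.

```python
from math import comb


def upper_tail(k, n, K, N):
    """Hypergeometric P(X >= k) for k query genes among K genes in a window."""
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1)) / total


def scan_windows(positions, query, window_size, steps):
    """Slide a window along one chromosome and test each window.

    positions: dict mapping gene -> start coordinate (background genes)
    query:     set of user genes
    steps:     number of equal-sized steps per window (hypothetical parameter)
    Returns a list of (start, end, k, K, p_value) for windows with query genes.
    """
    step = max(1, window_size // steps)
    N = len(positions)
    n = sum(1 for gene in query if gene in positions)
    hits = []
    if n == 0:
        return hits
    last = max(positions.values())  # the last gene's location ends the scan
    for start in range(0, last + 1, step):
        end = start + window_size
        in_window = [g for g, pos in positions.items() if start <= pos < end]
        K = len(in_window)
        k = sum(1 for g in in_window if g in query)
        if k == 0:
            continue
        hits.append((start, end, k, K, upper_tail(k, n, K, N)))
    return hits
```

Each window is treated as a gene set, so the same test used for pathways applies unchanged; overlapping windows are not independent, which a real analysis would need to account for.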
The promoter sequences of your genes are compared with those of the other genes in the genome in terms of transcription factor (TF) binding motifs. "*Query gene" indicates a transcription factor coded by a gene included in your list.
Your genes are sent to the STRING-db website for enrichment analysis and retrieval of a protein-protein network. We try to match your species with the archaeal, bacterial, and eukaryotic species in the STRING server and send the genes. If it is running, please wait until it finishes. This can take 5 minutes, especially the first time, when ShinyGO downloads large annotation files.



For feedback, please contact us, or visit our homepage. For details, please see our paper and a detailed demo. ShinyGO shares many functionalities and databases with iDEP. Source code at GitHub.

Citation:
Ge SX, Jung D & Yao R, Bioinformatics 2020

Previous versions (still functional):
ShinyGO V0.74, based on database derived from Ensembl Release 104, archived on Feb. 8, 2022
ShinyGO V0.65, based on database derived from Ensembl Release 103, archived on Oct. 15, 2021
ShinyGO V0.61, based on database derived from Ensembl Release 96, archived on May 23, 2020
ShinyGO V0.60, based on database derived from Ensembl BioMart version 96, archived on Nov 6, 2019
ShinyGO V0.51, based on database derived from Ensembl BioMart version 95, archived on May 20, 2019
ShinyGO V0.50, based on database derived from Ensembl BioMart version 92, archived on March 29, 2019
ShinyGO V0.41, based on database derived from Ensembl BioMart version 91, archived on July 11, 2018
Based on gene ontology (GO) annotation and gene ID mapping of 315 animal and plant genomes in Ensembl BioMart release 96 as of 5/20/2019. In addition, 115 archaeal, 1678 bacterial, and 238 eukaryotic genomes are annotated based on STRING-db v10. Additional pathway data are being collected for some model species from different sources.
Genomes based on STRING-db are marked as STRING-db. If the same genome is included in both Ensembl and STRING-db, users should use the Ensembl annotation, as it is more up to date and is supported in more functional modules.

Sources for human pathway databases:

| Type | Subtype/Database name | #GeneSets | Source |
|---|---|---|---|
| Gene Ontology | Biological Process (BP) | 15796 | Ensembl 92 |
| | Cellular Component (CC) | 1916 | Ensembl 92 |
| | Molecular Function (MF) | 4605 | Ensembl 92 |
| KEGG | KEGG | 327 | Release 86.1 |
| Curated | Biocarta | 249 | Whichgenes 1.5 |
| | GeneSetDB.EHMN | 55 | GeneSetDB |
| | Panther | 168 | 1.0.4 |
| | HumanCyc | 240 | pathway Commons V9 |
| | INOH | 576 | pathway Commons V9 |
| | NetPath | 27 | pathway Commons V9 |
| | PID | 223 | pathway Commons V9 |
| | PSP | 327 | pathway Commons V9 |
| | Recon X | 2339 | pathway Commons V9 |
| | Reactome | 2010 | V64 |
| | Wiki | 457 | 20180610 |
| TF.Target | CircuitsDB.TF | 829 | V2012 |
| | ENCODE | 181 | V70.0 |
| | Marbach2016 | 628 | regulatorycircuits Release 1.0 |
| | RegNetwork.TF | 1400 | 7/1/2017 |
| | TFacts | 428 | Feb. 2012 |
| | tftargets.ITFP | 1926 | tftargets May,2017 |
| | tftargets.Neph2012 | 16476 | tftargets May,2017 |
| | tftargets.TRED | 131 | tftargets May,2017 |
| | TRRUST | 793 | V2 |
| miRNA.Targets | CircuitsDB.miRNA | 140 | V. 2012 |
| | GeneSetDB.MicroCosm | 44 | GeneSetDB |
| | miRDB | 2588 | V 5.0 |
| | miRTarBase | 2599 | V 7.0 |
| | RegNetwork.miRNA | 618 | V. 2015 |
| | TargetScan | 219 | V7.2 |
| MSigDB.Computational | Computational gene sets | 858 | MSigDB 6.1 |
| MSigDB.Curated | Literature | 3465 | MSigDB 6.1 |
| MSigDB.Hallmark | hallmark | 50 | MSigDB 6.1 |
| MSigDB.Immune | Immune system | 4872 | MSigDB 6.1 |
| MSigDB.Location | Cytogenetic band | 326 | MSigDB 6.1 |
| MSigDB.Motif | TF and miRNA Motifs | 836 | MSigDB 6.1 |
| MSigDB.Oncogenic | Oncogenic signatures | 189 | MSigDB 6.1 |
| PPI | BioGRID | 15542 | 3.4.160 |
| | CORUM | 2178 | 02.07.2017 |
| | BIND | 3807 | pathway Commons V9 |
| | DIP | 2630 | pathway Commons V9 |
| | HPRD | 7141 | pathway Commons V9 |
| | IntAct | 11991 | pathway Commons V9 |
| Drug | GeneSetDB.MATADOR | 266 | GeneSetDB |
| | GeneSetDB.SIDER | 473 | GeneSetDB |
| | GeneSetDB.STITCH | 4616 | GeneSetDB |
| | GeneSetDB.T3DB | 846 | GeneSetDB |
| | SMPDB | 699 | pathway Commons V9 |
| | CTD | 8758 | pathway Commons V9 |
| | Drugbank | 2563 | pathway Commons V9 |
| Other | GeneSetDB.CancerGenes | 23 | GeneSetDB |
| | GeneSetDB.MethCancerDB | 21 | GeneSetDB |
| | GeneSetDB.MethyCancer | 54 | GeneSetDB |
| | GeneSetDB.MPO | 3134 | GeneSetDB |
| | HPO | 6785 | May,2018 |
| Total: | | 140,438 | |

Sources for mouse pathway databases:

| Type | Source | #Sets | Note |
|---|---|---|---|
| Co-expression | Literature | 8,742 | Differentially expressed genes from 2526 studies |
| | MSigDB | 3,964 | Molecular Signature Database, v.6.0 |
| | L2L | 248 | List of lists, v.2006.2 |
| | CancerGenes* | 23 | Cancer gene lists |
| | GeneSigDB | 494 | Gene Signature Database, R.4 |
| Gene Ontology | GO_BP | 11,943 | V2017.5 |
| | GO_MF | 2,932 | |
| | GO_CC | 1,475 | |
| Curated pathways | Biocarta* | 176 | Metabolic and signaling pathways |
| | PANTHER | 151 | Ontology-based pathway database, v3.4.1 |
| | WikiPathways* | 146 | Open platform for pathway curation |
| | INOH* | 73 | Integrating network objects with hierarchies |
| | NetPath* | 25 | Signal transduction pathways |
| Metabolic pathways | KEGG | 314 | Metabolic pathways, R.82.0 |
| | EHMN* | 53 | Edinburgh human metabolic network |
| | MouseCyc | 321 | Mouse Biochemical Pathways, v2013.7 |
| Drug related | CTD* | 910 | The Comparative Toxicogenomics Database |
| | SIDER* | 460 | Side Effect Resource |
| | MATADOR* | 248 | Manually Annotated Targets and Drugs Online Resource |
| | DrugBank* | 136 | Open data drug and target database |
| | SMPDB* | 74 | Small Molecule Pathway Database |
| miRNA Target Genes | miRDB | 1,912 | miRNA target prediction and annotations, v 5.0 |
| | microRNA.org | 314 | Predicted miRNA targets, v.R2010 |
| | Grimson et al. | 179 | Predicted miRNA targets, v.6.2 |
| | TarBase | 84 | Experimentally validated miRNA targets, v.6.0 |
| | miRTarBase | 775 | Experimentally validated miRNA targets, V6.1 |
| | MicroCosm | 464 | Predicted targets |
| | PicTar | 35 | Predicted miRNA sites, v. 2007.3 |
| TF Target Genes | TFactS* | 101 | Predicted TF targets |
| | TRED | 99 | Confirmed TF target genes, v.2013.7 |
| | CircuitsDB | 94 | Mixed miRNA/TF regulation, v. 2012 |
| | TRANSFAC | 78 | Confirmed TF binding sites, v7.0 |
| Others | Location | 341 | Genomic location on chromosomes, v.2017 |
| | HPO* | 1,518 | The human phenotype ontology |
| | STITCH* | 3,929 | Interaction networks of chemicals and proteins |
| | MPO* | 2,943 | Mammalian Phenotype Ontology |
| | T3DB* | 722 | Database of common toxins and their targets |
| | PID* | 193 | Pathway Interaction Database |
| | MethyCancer* | 50 | Human DNA methylation and cancer |
| | MethCancerDB* | 19 | Aberrant DNA methylation in human cancer |
| | Total | 46,758 | |

*Secondary data from GeneSetDB




Input:

A list of gene IDs, separated by tabs, spaces, commas, or newline characters. Ensembl gene IDs are used internally to identify genes. Other types of IDs are mapped to Ensembl gene IDs using the ID mapping information available in Ensembl BioMart.
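As an illustration of the accepted input format, the following Python sketch splits pasted text on tabs, spaces, commas, and newlines. The function name is hypothetical, and dropping duplicate IDs is an assumption that the text does not state; mapping to Ensembl gene IDs is not covered here.

```python
import re


def parse_gene_list(text):
    """Split pasted text into gene IDs on whitespace and commas.
    Assumption: duplicates are dropped while the first-seen order is kept."""
    genes = []
    seen = set()
    for token in re.split(r"[\s,]+", text.strip()):
        if token and token not in seen:
            seen.add(token)
            genes.append(token)
    return genes
```

For instance, `parse_gene_list("TP53, BRCA1\tEGFR\nTP53  MYC")` yields the four distinct IDs in their original order.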

Output:

Enriched GO terms and pathways:


In addition to the enrichment table, a set of plots is produced. If the KEGG database is chosen, enriched pathway diagrams are shown with the user's genes highlighted, like the one below:


Many GO terms are related. Some are even redundant, like "cell cycle" and "cell cycle process". To visualize such relatedness in enrichment results, we use a hierarchical clustering tree and network. In this hierarchical clustering tree, related GO terms are grouped together based on how many genes they share. The size of the solid circle corresponds to the enrichment FDR.


In the network below, each node represents an enriched GO term. Related GO terms are connected by a line, whose thickness reflects the percentage of overlapping genes. The size of the node corresponds to the number of genes.


Through API access to STRING-db, we also retrieve a protein-protein interaction (PPI) network. In addition to a static network image, users can also get access to interactive graphics at the www.string-db.org web server.
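For readers who want to query STRING directly, the sketch below builds a request URL for the public STRING REST API `network` method. It only illustrates the general API shape and is not necessarily how ShinyGO issues its requests; the function name and the `caller_identity` value are placeholders, and the species is given as an NCBI taxonomy ID.

```python
from urllib.parse import urlencode


def string_network_url(genes, species_taxon_id, output_format="tsv"):
    """Build a STRING API URL requesting the interaction network for `genes`.
    Identifiers are joined with carriage returns, which urlencode turns
    into %0D as the STRING API expects."""
    params = {
        "identifiers": "\r".join(genes),
        "species": species_taxon_id,           # NCBI taxonomy ID, e.g. 9606 for human
        "caller_identity": "example_script",   # placeholder name for this sketch
    }
    return f"https://string-db.org/api/{output_format}/network?{urlencode(params)}"
```

The URL can then be fetched with any HTTP client; the sketch stops at URL construction so that it runs without network access.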


ShinyGO also detects transcription factor (TF) binding motifs enriched in the promoters of user's genes.


Changes:

6/6/2021: V0.66 Adjusted interface.
6/2/2021: V0.66 add customized background genes.
5/23/2021: V0.65 Database update to Ensembl 103 and STRING-db v11.
11/3/2019: V 0.61 Improved visualization based on suggestions from reviewers. Interactive networks.
5/20/2019: V0.60 Upgraded to Ensembl Biomart 96. Add annotation from STRING-db v10
3/29/2019: V0.51 Update annotation to Ensembl release 95. Interface change. Demo gene lists. Error messages.
9/10/2018: V0.5 Upgraded to Ensembl Biomart 92
4/30/2018: V0.42 changed figure configurations for tree.
4/27/2018: V0.41 Change to ggplot2, add grid and gridExtra packages
4/24/2018: V0.4 Add STRING API, KEGG diagram, tree display and network.