Mar. 12, 2022: ShinyGO 0.75c released with customized annotation databases for 12 species.
Feb. 11, 2022: ShinyGO 0.75c released with a customized database including annotation and pathway information for 5 species requested by users.
Feb. 8, 2022: ShinyGO v0.75 officially released. Old versions are still available. See the last tab.
Nov. 15, 2021: Database update. ShinyGO v0.75 available in testing mode. It includes Ensembl database update, new species from Ensembl Fungi and Ensembl Protists, and STRINGdb (5090 species) update to 11.5.
Oct. 25, 2021: Interactive genome plot. Identification of genomic regions significantly enriched with user genes.
Oct. 23, 2021: Version 0.741. A fully customizable enrichment chart! Switch between bar, dot or lollipop plots. Detailed gene information with links on the Genes tab.
Oct. 15, 2021: Version 0.74. Database updated to Ensembl Release 104 and STRING v11. We now recommend the use of background genes in enrichment analysis. V0.74 is much faster, even with large sets of background genes.
We recently hired Jenny for database updates and user support. Email Jenny for questions, suggestions or data contributions.
2/3/2020: Now published by Bioinformatics.
Just paste your gene list to get enriched GO terms and other pathways for over 400 plant and animal species, based on annotation from Ensembl, Ensembl Plants and Ensembl Metazoa. An additional 5000 genomes (including bacteria and fungi) are annotated based on STRING-db (v.11). In addition, it also produces KEGG pathway diagrams with your genes highlighted, hierarchical clustering trees and networks summarizing overlapping terms/pathways, protein-protein interaction networks, gene characteristics plots, and enriched promoter motifs. See example outputs below:
FDR is calculated based on the nominal P-value from the hypergeometric test. Fold Enrichment is defined as the percentage of genes in your list belonging to a pathway, divided by the corresponding percentage in the background. FDR tells us how likely it is that the enrichment arose by chance. Large gene sets tend to have smaller FDR. As a measure of effect size, Fold Enrichment indicates how drastically genes of a certain pathway are overrepresented. When 'Remove redundant pathway' is selected, similar pathways sharing 95% of genes are represented by the most significant pathway. Pathways that are too big or too small are excluded from analysis using the Pathway Size limits.
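The quantities described above can be sketched in a few lines of Python. This is only an illustration, not ShinyGO's actual code: the function names are hypothetical, and the Benjamini-Hochberg procedure is an assumption, since the text only says that FDR is derived from the nominal P-values.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """Upper-tail hypergeometric P-value, P(X >= k).

    k: genes in the list that belong to the pathway
    n: size of the gene list
    K: pathway genes in the background
    N: size of the background
    """
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def fold_enrichment(k, n, K, N):
    """Percentage of list genes in the pathway divided by the background percentage."""
    return (k / n) / (K / N)

def benjamini_hochberg(pvalues):
    """Adjust P-values with the Benjamini-Hochberg procedure (assumed here)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Example: 10 of 100 list genes fall in a pathway of 200 genes,
# with a background of 20000 genes.
p = hypergeom_pvalue(10, 100, 200, 20000)
fe = fold_enrichment(10, 100, 200, 20000)
```

Because the P-value depends on the absolute counts while Fold Enrichment depends only on the ratio, a large pathway can reach a small FDR with a modest Fold Enrichment, which is why the two are best read together.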
Please select KEGG from the pathway databases to conduct enrichment analysis first. Then you can visualize your genes on any of the significant pathways. KEGG diagrams are available only for some species.
Your genes are highlighted in red. Downloading a pathway diagram from KEGG can take 3 minutes.
A hierarchical clustering tree summarizing the correlation among significant pathways listed in the Enrichment tab. Pathways with many shared genes are clustered together. Bigger dots indicate more significant P-values.
Some sizes may not work. Try different combinations. Figure needs to be wide as pathway names are long.
Similar to the Tree tab, this interactive plot also shows the relationship between enriched pathways. Two pathways (nodes) are connected if they share 20% (default) or more genes. You can move the nodes by dragging them, zoom in and out by scrolling, and shift the entire network by clicking on an empty point and dragging. Darker nodes are more significantly enriched gene sets. Bigger nodes represent larger gene sets. Thicker edges represent more overlapping genes.
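The edge rule above can be illustrated with a short Python sketch. The exact definition of the shared fraction is not stated here, so the code assumes it is the number of shared genes divided by the size of the smaller gene set; the function names and the example pathways are hypothetical.

```python
from itertools import combinations

def overlap_fraction(genes_a, genes_b):
    """Shared genes divided by the smaller set size (an assumed definition)."""
    if not genes_a or not genes_b:
        return 0.0
    return len(genes_a & genes_b) / min(len(genes_a), len(genes_b))

def build_edges(pathways, cutoff=0.2):
    """Return (pathway_a, pathway_b, fraction) for every pair at or above the cutoff."""
    edges = []
    for (name_a, genes_a), (name_b, genes_b) in combinations(pathways.items(), 2):
        frac = overlap_fraction(genes_a, genes_b)
        if frac >= cutoff:
            edges.append((name_a, name_b, frac))
    return edges

# Hypothetical enriched pathways and their genes.
pathways = {
    "A": {"g1", "g2", "g3", "g4", "g5"},
    "B": {"g1", "g6", "g7", "g8", "g9"},
    "C": {"g10"},
}
edges = build_edges(pathways)
```

Raising the cutoff removes weaker links and leaves a sparser network; the stored fraction could be used to set edge thickness.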
Your genes are grouped by functional categories defined by high-level GO terms.
The characteristics of your genes are compared with the rest in the genome. Chi-squared and Student's t-tests are run to see if your genes have special characteristics when compared with all the other genes or, if uploaded, a customized background.
The genes are represented by red dots. The purple lines indicate regions where these genes are statistically enriched, compared to the density of genes in the background. We scan the genome with a sliding window. Each window is further divided into several equal-sized steps for sliding. Within each window we use the hypergeometric test to determine if your genes are significantly overrepresented. Essentially, the genes in each window define a gene set/pathway, and we carry out enrichment analysis on it. The chromosomes may be only partly shown as we use the last gene's location to draw the line. Mouse over to see gene symbols. Zoom in on regions of interest.
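A minimal Python sketch of such a sliding-window scan on one chromosome follows. It is an illustration under stated assumptions, not the implementation used by ShinyGO: the function names, the window width, the number of steps, and the use of a single chromosome as the background are all hypothetical choices made for the example.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """Upper-tail hypergeometric P-value, P(X >= k)."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def scan_chromosome(query_pos, background_pos, window, steps):
    """Slide a window along a chromosome and test each window for enrichment.

    query_pos: positions of the user's genes (must also be in background_pos)
    background_pos: positions of all background genes
    window: window width in base pairs (assumed parameter)
    steps: number of equal-sized steps per window (assumed parameter)
    """
    N = len(background_pos)
    n = len(query_pos)
    step = max(1, window // steps)
    results = []
    start = 0
    last_gene = max(background_pos)  # the scan ends at the last gene's location
    while start <= last_gene:
        stop = start + window
        K = sum(start <= p < stop for p in background_pos)
        k = sum(start <= p < stop for p in query_pos)
        if k > 0:
            results.append((start, stop, k, K, hypergeom_pvalue(k, n, K, N)))
        start += step
    return results

# Hypothetical data: 100 background genes, 5 clustered query genes.
background = list(range(0, 1000, 10))
query = [100, 110, 120, 130, 140]
windows = scan_chromosome(query, background, window=100, steps=2)
```

Since many overlapping windows are tested, the resulting P-values would normally be corrected for multiple testing before regions are reported.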
The promoter sequences of your genes are compared with those of the other genes in the genome in terms of transcription factor (TF) binding motifs. "*Query gene" indicates a transcription factor coded by a gene included in your list.
Your genes are sent to the STRING-db website for enrichment analysis and retrieval of a protein-protein network. We try to match your species with the archaeal, bacterial, and eukaryotic species in the STRING server and send the genes. If it is running, please wait until it finishes. This can take 5 minutes, especially the first time, when ShinyGO downloads large annotation files.
Citation:
Ge SX, Jung D & Yao R, Bioinformatics 2020
Previous versions (still functional):
ShinyGO V0.74, based on database derived from Ensembl Release 104, archived on Feb. 8, 2022
ShinyGO V0.65, based on database derived from Ensembl Release 103, archived on Oct. 15, 2021
ShinyGO V0.61, based on database derived from Ensembl Release 96, archived on May 23, 2020
ShinyGO V0.60, based on database derived from Ensembl BioMart version 96, archived on Nov 6, 2019
ShinyGO V0.51, based on database derived from Ensembl BioMart version 95, archived on May 20, 2019
ShinyGO V0.50, based on database derived from Ensembl BioMart version 92, archived on March 29, 2019
ShinyGO V0.41, based on database derived from Ensembl BioMart version 91, archived on July 11, 2018
Based on gene ontology (GO) annotation and gene ID mapping of 315 animal and plant genomes in Ensembl BioMart release 96 as of 5/20/2019.
In addition, 115 archaeal, 1678 bacterial, and 238 eukaryotic genomes are annotated based on STRING-db v10.
Additional pathway data are being collected for some model species from different sources.
Genomes based on STRING-db are marked as STRING-db. If the same genome is included in both Ensembl and STRING-db, users should use the Ensembl annotation, as it is more up to date and is supported in more functional modules.
Sources for human pathway databases:
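| Type | Subtype/Database name | #GeneSets | Source |
|---|---|---|---|
| Gene Ontology | Biological Process (BP) | 15796 | Ensembl 92 |
| Gene Ontology | Cellular Component (CC) | 1916 | Ensembl 92 |
| Gene Ontology | Molecular Function (MF) | 4605 | Ensembl 92 |
| KEGG | KEGG | 327 | Release 86.1 |
| Curated | Biocarta | 249 | Whichgenes 1.5 |
| Curated | GeneSetDB.EHMN | 55 | GeneSetDB |
| Curated | Panther | 168 | 1.0.4 |
| Curated | HumanCyc | 240 | pathway Commons V9 |
| Curated | INOH | 576 | pathway Commons V9 |
| Curated | NetPath | 27 | pathway Commons V9 |
| Curated | PID | 223 | pathway Commons V9 |
| Curated | PSP | 327 | pathway Commons V9 |
| Curated | Recon X | 2339 | pathway Commons V9 |
| Curated | Reactome | 2010 | V64 |
| Curated | Wiki | 457 | 20180610 |
| TF.Target | CircuitsDB.TF | 829 | V2012 |
| TF.Target | ENCODE | 181 | V70.0 |
| TF.Target | Marbach2016 | 628 | regulatorycircuits Release 1.0 |
| TF.Target | RegNetwork.TF | 1400 | 7/1/2017 |
| TF.Target | TFacts | 428 | Feb. 2012 |
| TF.Target | tftargets.ITFP | 1926 | tftargets May,2017 |
| TF.Target | tftargets.Neph2012 | 16476 | tftargets May,2017 |
| TF.Target | tftargets.TRED | 131 | tftargets May,2017 |
| TF.Target | TRRUST | 793 | V2 |
| miRNA.Targets | CircuitsDB.miRNA | 140 | V. 2012 |
| miRNA.Targets | GeneSetDB.MicroCosm | 44 | GeneSetDB |
| miRNA.Targets | miRDB | 2588 | V 5.0 |
| miRNA.Targets | miRTarBase | 2599 | V 7.0 |
| miRNA.Targets | RegNetwork.miRNA | 618 | V. 2015 |
| miRNA.Targets | TargetScan | 219 | V7.2 |
| MSigDB.Computational | Computational gene sets | 858 | MSigDB 6.1 |
| MSigDB.Curated | Literature | 3465 | MSigDB 6.1 |
| MSigDB.Hallmark | hallmark | 50 | MSigDB 6.1 |
| MSigDB.Immune | Immune system | 4872 | MSigDB 6.1 |
| MSigDB.Location | Cytogenetic band | 326 | MSigDB 6.1 |
| MSigDB.Motif | TF and miRNA Motifs | 836 | MSigDB 6.1 |
| MSigDB.Oncogenic | Oncogenic signatures | 189 | MSigDB 6.1 |
| PPI | BioGRID | 15542 | 3.4.160 |
| PPI | CORUM | 2178 | 02.07.2017 |
| PPI | BIND | 3807 | pathway Commons V9 |
| PPI | DIP | 2630 | pathway Commons V9 |
| PPI | HPRD | 7141 | pathway Commons V9 |
| PPI | IntAct | 11991 | pathway Commons V9 |
| Drug | GeneSetDB.MATADOR | 266 | GeneSetDB |
| Drug | GeneSetDB.SIDER | 473 | GeneSetDB |
| Drug | GeneSetDB.STITCH | 4616 | GeneSetDB |
| Drug | GeneSetDB.T3DB | 846 | GeneSetDB |
| Drug | SMPDB | 699 | pathway Commons V9 |
| Drug | CTD | 8758 | pathway Commons V9 |
| Drug | Drugbank | 2563 | pathway Commons V9 |
| Other | GeneSetDB.CancerGenes | 23 | GeneSetDB |
| Other | GeneSetDB.MethCancerDB | 21 | GeneSetDB |
| Other | GeneSetDB.MethyCancer | 54 | GeneSetDB |
| Other | GeneSetDB.MPO | 3134 | GeneSetDB |
| Other | HPO | 6785 | May,2018 |
| Total: | | 140,438 | |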
Sources for mouse pathway databases:
| Type | Source | #Sets | Note |
|---|---|---|---|
| Co-expression | Literature | 8,742 | Differentially expressed genes from 2526 studies |
| Co-expression | MSigDB | 3,964 | Molecular Signature Database, v.6.0 |
| Co-expression | L2L | 248 | List of lists, v.2006.2 |
| Co-expression | CancerGenes* | 23 | Cancer gene lists |
| Co-expression | GeneSigDB | 494 | Gene Signature Database, R.4 |
| Gene Ontology | GO_BP | 11,943 | V2017.5 |
| Gene Ontology | GO_MF | 2,932 | |
| Gene Ontology | GO_CC | 1,475 | |
| Curated pathways | Biocarta* | 176 | Metabolic and signaling pathways |
| Curated pathways | PANTHER | 151 | Ontology-based pathway database, v3.4.1 |
| Curated pathways | WikiPathways* | 146 | Open platform for pathway curation |
| Curated pathways | INOH* | 73 | Integrating network objects with hierarchies |
| Curated pathways | NetPath* | 25 | Signal transduction pathways |
| Metabolic pathways | KEGG | 314 | Metabolic pathways, R.82.0 |
| Metabolic pathways | EHMN* | 53 | Edinburgh human metabolic network |
| Metabolic pathways | MouseCyc | 321 | Mouse Biochemical Pathways, v2013.7 |
| Drug related | CTD* | 910 | The Comparative Toxicogenomics Database |
| Drug related | SIDER* | 460 | Side Effect Resource |
| Drug related | MATADOR* | 248 | Manually Annotated Targets and Drugs Online Resource |
| Drug related | DrugBank* | 136 | Open data drug and target database |
| Drug related | SMPDB* | 74 | Small Molecule Pathway Database |
| miRNA Target Genes | miRDB | 1,912 | miRNA target prediction and annotations, v 5.0 |
| miRNA Target Genes | microRNA.org | 314 | Predicted miRNA targets, v.R2010 |
| miRNA Target Genes | Grimson et al. | 179 | Predicted miRNA targets, v.6.2 |
| miRNA Target Genes | TarBase | 84 | Experimentally validated miRNA targets, v.6.0 |
| miRNA Target Genes | miRTarBase | 775 | Experimentally validated miRNA targets, V6.1 |
| miRNA Target Genes | MicroCosm | 464 | Predicted targets |
| miRNA Target Genes | PicTar | 35 | Predicted miRNA sites, v. 2007.3 |
| TF Target Genes | TFactS* | 101 | Predicted TF targets |
| TF Target Genes | TRED | 99 | Confirmed TF target genes, v.2013.7 |
| TF Target Genes | CircuitsDB | 94 | Mixed miRNA/TF regulation, v. 2012 |
| TF Target Genes | TRANSFAC | 78 | Confirmed TF binding sites, v7.0 |
| Others | Location | 341 | Genomic location on chromosomes, v.2017 |
| Others | HPO* | 1,518 | The human phenotype ontology |
| Others | STITCH* | 3,929 | Interaction networks of chemicals and proteins |
| Others | MPO* | 2,943 | Mammalian Phenotype Ontology |
| Others | T3DB* | 722 | Database of common toxins and their targets |
| Others | PID* | 193 | Pathway Interaction Database |
| Others | MethyCancer* | 50 | Human DNA methylation and cancer |
| Others | MethCancerDB* | 19 | Aberrant DNA methylation in human cancer |
| Total | | 46,758 | |

*Secondary data from GeneSetDB
Input:
A list of gene IDs, separated by tabs, spaces, commas or newline characters.
Ensembl gene IDs are used internally to identify genes. Other types of IDs will be mapped to Ensembl
gene IDs using ID mapping information available in Ensembl BioMart.
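As an illustration of the input handling described above, the following Python sketch splits pasted text on tabs, spaces, commas and newlines and then looks up Ensembl gene IDs. It is a sketch under assumptions: the function names are hypothetical, dropping duplicate IDs is an added convenience, and the small `id_map` dictionary stands in for the ID mapping information from Ensembl BioMart.

```python
import re

def parse_gene_list(text):
    """Split pasted text on whitespace and commas, keeping the first copy of each ID."""
    tokens = re.split(r"[\s,]+", text.strip())
    seen = set()
    genes = []
    for token in tokens:
        if token and token not in seen:
            seen.add(token)
            genes.append(token)
    return genes

def map_to_ensembl(genes, id_map):
    """Return (mapped Ensembl IDs, unmapped input IDs) using a lookup table."""
    mapped, unmapped = [], []
    for gene in genes:
        if gene in id_map:
            mapped.append(id_map[gene])
        else:
            unmapped.append(gene)
    return mapped, unmapped

# Illustrative lookup table; a real one would come from Ensembl BioMart.
id_map = {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"}

genes = parse_gene_list("TP53, BRCA1\tEGFR\nMYC TP53")
mapped, unmapped = map_to_ensembl(genes, id_map)
```

Reporting the unmapped IDs separately makes it easier to spot a wrong species selection or an unsupported ID type.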
Output:
Enriched GO terms and pathways:
In addition to the enrichment table, a set of plots is produced. If the KEGG database is chosen, enriched pathway diagrams are shown with the user's genes highlighted, like the one below:
Many GO terms are related. Some are even redundant, like "cell cycle" and "cell cycle process".
To visualize such relatedness in enrichment results, we use a hierarchical clustering tree and network.
In this hierarchical clustering tree, related GO terms are grouped together based on how many genes they share. The size of the solid circle corresponds to the enrichment FDR.
In the network below, each node represents an enriched GO term. Related GO terms are connected by a line whose thickness reflects the percentage of overlapping genes. The size of the node corresponds to the number of genes.
Through API access to STRING-db, we also retrieve a protein-protein interaction (PPI) network. In addition to a static network image, users can also get access to interactive graphics at the www.string-db.org web server.
ShinyGO also detects transcription factor (TF) binding motifs enriched in the promoters of the user's genes.
Changes:
6/6/2021: V0.66 Adjusted interface.
6/2/2021: V0.66 add customized background genes.
5/23/2021: V0.65 Database update to Ensembl 103 and STRING-db v11.
11/3/2019: V 0.61 Improved visualization based on suggestions from reviewers. Interactive networks.
5/20/2019: V0.60 Upgraded to Ensembl Biomart 96. Add annotation from STRING-db v10
3/29/2019: V0.51 Update annotation to Ensembl release 95. Interface change. Demo gene lists. Error messages.
9/10/2018: V0.5 Upgraded to Ensembl Biomart 92
4/30/2018: V0.42 changed figure configurations for tree.
4/27/2018: V0.41 Change to ggplot2, add grid and gridExtra packages
4/24/2018: V0.4 Add STRING API, KEGG diagram, tree display and network.