This is still a work in progress. Please refer to the PDF document for other sections.

API documentation

GET|POST /api/summary/[type]

Returns a JSON summary of the data available on WebGestalt. [type] is one of idtype, geneset, referenceset, or network, which list the supported ID mappings, gene sets, reference sets, and networks (for NTA), respectively. If the organism parameter is omitted, data for all supported organisms are returned. Supported organisms are athaliana, btaurus, celegans, cfamiliaris, drerio, sscrofa, dmelanogaster, ggallus, hsapiens, mmusculus, rnorvegicus, scerevisiae.

Optional parameters

Name      Description                     Default
organism  Limit the data to one organism  -

Code example

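library(httr)
type <- "geneset"
organism <- "athaliana"
# The type is part of the URL path; organism is an optional query filter
response <- GET(file.path("http://www.webgestalt.org/api/summary", type),
        query=list(organism=organism))
if (response$status_code == 200) {
    # content() parses a JSON response into a nested R list
    jsonData <- content(response)
    print(jsonData)
}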

Example output:

{
  "geneontology": [
    {
      "name": "Biological_Process",
      "description": "The gene ontology biological process database was downloaded from http://www.geneontology.org/.",
      "idtype": "entrezgene"
    },
    {
      "name": "Biological_Process_noRedundant",
      "description": "The gene ontology biological process database was downloaded from http://www.geneontology.org/. Then, we only contain the non-redundant categories by selecting the most general categories in each branch of the GO DAG structure from all categories with the number of annotated genes from 20 to 500.",
      "idtype": "entrezgene"
    },
    {
      "name": "Cellular_Component",
      "description": "The gene ontology cellular component database was downloaded from http://www.geneontology.org/.",
      "idtype": "entrezgene"
    },
    {
      "name": "Cellular_Component_noRedundant",
      "description": "The gene ontology cellular component database was downloaded from http://www.geneontology.org/. Then, we only contain the non-redundant categories by selecting the most general categories in each branch of the GO DAG structure from all categories with the number of annotated genes from 20 to 500.",
      "idtype": "entrezgene"
    },
    {
      "name": "Molecular_Function",
      "description": "The gene ontology molecular function database was downloaded from http://www.geneontology.org/.",
      "idtype": "entrezgene"
    },
    {
      "name": "Molecular_Function_noRedundant",
      "description": "The gene ontology molecular function database was downloaded from http://www.geneontology.org/. Then, we only contain the non-redundant categories by selecting the most general categories in each branch of the GO DAG structure from all categories with the number of annotated genes from 20 to 500.",
      "idtype": "entrezgene"
    }
  ],
  "pathway": [
    {
      "name": "KEGG",
      "description": "The KEGG pathway database was downloaded from http://www.kegg.jp/.",
      "idtype": "entrezgene"
    },
    {
      "name": "Wikipathway",
      "description": "The Wikipathway database was downloaded from http://www.wikipathway.org/.",
      "idtype": "entrezgene"
    }
  ],
  "network": [
    {
      "name": "PPI_BIOGRID",
      "description": "The protein-protein interaction (PPI) network was downloaded from BIOGRID (https://thebiogrid.org/). Then, we used the NetSAM R package (http://bioconductor.org/packages/release/bioc/html/NetSAM.html) to identify the hierarchical co-expression modules.",
      "idtype": "entrezgene"
    }
  ],
  "disease": [],
  "drug": [],
  "phenotype": [],
  "chromosomalLocation": [
    {
      "name": "CytogeneticBand",
      "description": "",
      "idtype": "entrezgene"
    }
  ],
  "community-contributed": [],
  "others": []
}
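
The keys of this summary (e.g. "pathway") match the dbType values and each entry's name matches a dbName accepted by /api/geneset, so the "dbType_dbName" shorthand can be derived from the summary. The following is a minimal sketch, assuming the response has already been parsed into a nested list as content(response) does for JSON. The summaryData object is a hand-copied excerpt of the output above, and databaseIds is a hypothetical helper name, not part of WebGestalt.

```r
# Hand-copied excerpt of the example output above, in nested-list form.
summaryData <- list(
  pathway = list(
    list(name = "KEGG", idtype = "entrezgene"),
    list(name = "Wikipathway", idtype = "entrezgene")
  ),
  chromosomalLocation = list(
    list(name = "CytogeneticBand", idtype = "entrezgene")
  ),
  disease = list()  # empty for this organism
)

# Hypothetical helper: build "dbType_dbName" strings for every listed database.
databaseIds <- function(summaryData) {
  ids <- lapply(names(summaryData), function(dbType) {
    dbNames <- vapply(summaryData[[dbType]],
                      function(entry) entry$name, character(1))
    # paste() would return "disease_" for an empty type, so skip those.
    if (length(dbNames) == 0) character(0) else paste(dbType, dbNames, sep = "_")
  })
  unlist(ids)
}

print(databaseIds(summaryData))
```

Each resulting string could then be passed as the database parameter of /api/geneset, as in the "pathway_KEGG" example in the next section.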

GET|POST /api/geneset

Returns gene set data files, i.e. GMT, description file, DAG edge list, network.

Required parameters

Name      Description
organism  See above for supported organisms.
dbType    Database type, e.g. ‘geneontology’, ‘pathway’. See summary results for supported types.
dbName    Database name, e.g. ‘Biological_Process’, ‘KEGG’. See summary results.
database  Shorthand for “dbType_dbName”, e.g. “pathway_KEGG”.
fileType  One of “gmt”, “des”, “dag”; see below for details.

fileType:

- gmt: the functional annotation in GMT format.
- des: a two-column file that maps each gene set ID in the GMT to its description.
- dag: for some databases such as GO, a DAG edge list with parent and child columns.

Optional parameters

Name  Description
ids   Only for fileType “des”: descriptions are returned for just the given subset of gene set IDs.

Code example

library(httr)
organism <- "hsapiens"
database <- "pathway_KEGG"
fileType <- "gmt"
response <- GET("http://www.webgestalt.org/api/geneset",
        query=list(organism=organism, database=database, fileType=fileType))
if (response$status_code == 200) {
    # The file is handled as a single text string
    fileContent <- content(response)
    write(fileContent, "geneset.gmt")
    # Split into lines; each GMT line holds one gene set
    geneSetData <- unlist(strsplit(fileContent, "\n", fixed=TRUE))
    print(geneSetData[1:3])
}

Example output:

First several lines of the GMT file.

hsa00010    http://www.kegg.jp/kegg-bin/show_pathway?hsa00010   10327   124 125 126 127 128 130 130589  131 160287  1737    1738    2023    2026    2027    217 218 219 220 2203    221 222 223 224 226 229 230 2538    2597    26330   2645    2821    3098    3099    3101    387712  3939    3945    3948    441531  501 5105    5106    5160    5161    5162    5211    5213    5214    5223    5224    5230    5232    5236    5313    5315    55276   55902   57818   669 7167    80201   83440   84532   8789    92483   92579   9562
hsa00020    http://www.kegg.jp/kegg-bin/show_pathway?hsa00020   1431    1737    1738    1743    2271    3417    3418    3419    3420    3421    4190    4191    47  48  4967    50  5091    5105    5106    5160    5161    5162    55753   6389    6390    6391    6392    8801    8802    8803
hsa00030    http://www.kegg.jp/kegg-bin/show_pathway?hsa00030   132158  2203    221823  226 229 22934   230 2539    25796   2821    414328  51071   5211    5213    5214    5226    5236    55276   5631    5634    6120    64080   6888    7086    729020  8277    84076   8789    9104    9563
...
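
Each GMT line holds a gene set ID, a link, and then the gene IDs, separated by tabs. The following is a rough sketch of turning that text into a named list of gene ID vectors. The input is two shortened lines taken from the output above, and parseGmt is a hypothetical helper name rather than a WebGestalt function.

```r
# Two shortened lines in GMT layout: ID, link, then gene IDs, tab-separated.
fileContent <- paste(
  "hsa00010\thttp://www.kegg.jp/kegg-bin/show_pathway?hsa00010\t10327\t124\t125",
  "hsa00020\thttp://www.kegg.jp/kegg-bin/show_pathway?hsa00020\t1431\t1737",
  sep = "\n"
)

# Hypothetical helper: parse GMT text into a list named by gene set ID.
parseGmt <- function(text) {
  lines <- unlist(strsplit(text, "\n", fixed = TRUE))
  lines <- lines[nzchar(lines)]                 # drop empty lines
  fields <- strsplit(lines, "\t", fixed = TRUE)
  geneSets <- lapply(fields, function(f) f[-(1:2)])  # keep only gene IDs
  names(geneSets) <- vapply(fields, function(f) f[1], character(1))
  geneSets
}

geneSets <- parseGmt(fileContent)
print(lengths(geneSets))  # number of genes per gene set
```

Gene IDs are kept as character strings here, which avoids assuming that every ID type is numeric.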

GET|POST /api/reference

Returns the reference set text file used in over-representation analysis (ORA). The reference file has one NCBI Entrez gene ID or one phosphosite motif sequence per line.

Required parameters

Name          Description
organism      See above for supported organisms.
referenceSet  Name of the reference set. See summary results.

Code example

library(httr)
organism <- "hsapiens"
referenceSet <- "genome_protein-coding"
response <- GET("http://www.webgestalt.org/api/reference",
        query=list(organism=organism, referenceSet=referenceSet))
if (response$status_code == 200) {
    # The file is handled as a single text string
    fileContent <- content(response)
    write(fileContent, "reference.txt")
    # Split into lines; each line holds one ID
    genes <- unlist(strsplit(fileContent, "\n", fixed=TRUE))
    print(genes[1:3])
}

Example output:

1
2
9
...
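
Because the reference set serves as the background in ORA, one plausible use of this file is to check which genes of interest it contains. The sketch below assumes the one-ID-per-line layout described above; the reference text is the first three lines of the example output, and interestGenes is an illustrative input that includes a made-up ID.

```r
# First lines of the reference file from the example output above.
fileContent <- "1\n2\n9\n"

# Illustrative input; "0" is a made-up ID that is not in the reference.
interestGenes <- c("2", "9", "0")

# One ID per line; drop any empty lines.
referenceGenes <- unlist(strsplit(fileContent, "\n", fixed = TRUE))
referenceGenes <- referenceGenes[nzchar(referenceGenes)]

inReference <- interestGenes %in% referenceGenes
print(interestGenes[inReference])   # IDs found in the reference
print(interestGenes[!inReference])  # IDs absent from the reference
```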