Help

InnateDB is designed for use with the Mozilla Firefox web browser.

Please see below for help and tutorials for InnateDB. If you have any questions about the database, please feel free to contact us.

A PowerPoint tutorial on using InnateDB is available here.

A guide to using Cerebral, for pathway and interaction network visualization in InnateDB, is available here.

A guide to how manual curation of molecular interactions is done in InnateDB is available here.

Notes on using Cerebral

Please note: Java Runtime Environment (JRE) version 1.5.0 or greater must be installed for Cerebral to work.
When using Firefox in a Linux environment, please ensure that the Cerebral .jnlp file is opened using javaws rather than Firefox itself.

Note that on Mac OS X machines the latest Java update (Java update 4 - 1.6.0_13) seems to have caused an issue in launching .jnlp files. These files must be launched to start the webstart version of Cytoscape/Cerebral and CyOog. To fix this, simply open the directory /System/Library/CoreServices. The issue should then be resolved.

Notes on linking to InnateDB
One can link to gene cards using an InnateDB or Ensembl Gene ID specified in the following URL format:

http://www.innatedb.ca/getGeneCard.do?id=ENSG00000136560
http://www.innatedb.ca/getGeneCard.do?id=IDBG-73552
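
If you need to generate many such links (for example from a spreadsheet of gene IDs), the URL format above can be built programmatically. The following is a minimal sketch; the function name gene_card_url is our own and is not part of InnateDB.

```python
from urllib.parse import urlencode

# Base address taken from the example links above.
BASE_URL = "http://www.innatedb.ca/getGeneCard.do"


def gene_card_url(gene_id: str) -> str:
    """Build a gene card link from an InnateDB (IDBG-...) or Ensembl (ENSG...) gene ID.

    Hypothetical helper: it only formats the URL and does not check
    that the ID actually exists in the database.
    """
    gene_id = gene_id.strip()
    if not gene_id:
        raise ValueError("gene_id must not be empty")
    return f"{BASE_URL}?{urlencode({'id': gene_id})}"


if __name__ == "__main__":
    for gid in ("ENSG00000136560", "IDBG-73552"):
        print(gene_card_url(gid))
```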

Advanced Search for Genes and Proteins

This page provides the user with a number of options to search for genes or proteins of interest.

Choose organism - InnateDB contains human and mouse genes and proteins.

The default on this page is to search for proteins/genes by the protein/gene name. A search of InnateDB by the Protein/Gene name will automatically search the full gene name, all encoded protein names, the gene symbol and all synonym symbols (alternative names) for the gene. This option provides users the flexibility to find a gene/protein without having to know the HUGO name for it, which may be different from the common name(s). An exact search will be much faster and will return database entries that exactly match your query.

Several other options for searching the database are provided on this page:

  • InnateDB molecule ID: stable identifiers for a gene/protein in InnateDB (e.g. IDBG-82738 is the ID for human TLR4).
  • HUGO gene symbol/ID: Human Genome Organization Gene Nomenclature Committee official symbol or numerical ID. (For human genes only).
  • Gene Ontology (GO) Accession: search for genes/proteins that are annotated by the GO consortium as being involved in a particular function, role or localization using the GO ID number (e.g. GO:0006954 is the ID for inflammatory response).
  • Gene Ontology (GO) Term: as above but allows free-text search (e.g. inflammatory). Using exact searches is not recommended here.
  • Ensembl ID: search using a human or mouse gene, protein or transcript accession number from the Ensembl database.
  • UniProt Accession: search using an accession number from the UniProt protein database.
  • Entrez Gene ID: search using an ID number from NCBI's Entrez Gene database.
  • RefSeq Accession: search using an accession number from NCBI's RefSeq database.
  • UniGene ID: search using an ID number from NCBI's UniGene database.
  • OMIM ID: search using an ID number from the Online Mendelian Inheritance in Man database.
  • EMBL Accession: search for a particular gene/protein based on the EMBL accession number.
  • PFAM Accession: search for encoded proteins containing matching PFAM domains.
  • InterPro Accession: search for encoded proteins containing matching InterPro domains.

Choose to return genes (default) or proteins using the "Return a list of" menu.

Advanced Search for Interactions

InnateDB contains detailed information for more than 150,000 human and mouse molecular interactions integrated from several of the major public interaction databases, along with several thousand manually curated interactions relevant to innate immunity. See the statistics page for further details.

The Advanced Search for Interactions page allows users to search InnateDB for molecular interactions of interest.

  • Interaction Participant: interactions involving particular genes/proteins of interest may be searched in the same way as the gene/protein searches described above.
  • Interaction Xref: search for interactions using an ID number for InnateDB (e.g. IDB-104135), by PubMed ID or an ID from one of the external interaction databases.
  • Interaction Level: by default interactions involving molecules that directly interact with the genes/proteins of interest are returned. By choosing "Show direct and 2nd order interactions" both direct and secondary interactions are returned. Secondary interactors are molecules which interact with the direct interactors of the genes/proteins of interest.
  • Host system: interaction searches can be limited to interactions detected in vitro, in vivo or ex vivo. The default is to return all.
  • Cell Type: interaction searches can be limited to interactions annotated to occur in a particular cell type. Choose a cell type by browsing through the hierarchy of cell type terms, or search for a cell type of interest by typing the name in the box provided and hitting enter (e.g. try typing 'neut' and hitting enter). Open Biomedical Ontologies (OBO) cell type ontologies are used. Note: interactions with at least one piece of evidence matching the chosen cell type term will be returned. Checking the "extended search" checkbox will also include all child terms of the selected term in the search.
  • Tissue Type: interaction searches can be limited to interactions annotated to occur in a particular tissue type. Choose a tissue type in the same way as described for cell type. OBO BRENDA tissue type terms are used. Note: interactions matching the chosen tissue type term and interactions matching any of that term's child terms will be returned.
  • Interaction Type: interaction searches can be limited to interactions involved in a particular molecular function (e.g. phosphorylation, acetylation, etc).
  • Molecule Type: search for interactions where at least one participant in the interaction is of the selected molecule type (protein, DNA or RNA).
  • Interaction Detection Method: select a particular interaction detection method (e.g. coimmunoprecipitation) from the menu to return interactions detected using this method. Selection works in the same way as for the cell and tissue type boxes. OBO PSI Molecular Interaction terms are used.
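
The difference between the two "Interaction Level" settings can be pictured as a one-step versus two-step neighborhood in the interaction network. The sketch below is only an illustration on toy data (molecules A to F are hypothetical) and is not InnateDB's own implementation: direct interactions are those that include a query molecule, and second-order interactions additionally include any interaction involving a direct interactor.

```python
def direct_interactors(query, interactions):
    """Return molecules that interact directly with any query molecule."""
    query = set(query)
    partners = set()
    for a, b in interactions:
        if a in query:
            partners.add(b)
        if b in query:
            partners.add(a)
    return partners - query


def interactions_for(query, interactions, second_order=False):
    """Select interactions around the query molecules.

    second_order=False: interactions involving a query molecule.
    second_order=True:  also interactions involving a direct interactor.
    """
    seeds = set(query)
    if second_order:
        seeds |= direct_interactors(query, interactions)
    return [(a, b) for a, b in interactions if a in seeds or b in seeds]


if __name__ == "__main__":
    # Toy network with hypothetical molecule names.
    toy = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]
    print(interactions_for(["A"], toy))                     # direct only
    print(interactions_for(["A"], toy, second_order=True))  # direct + 2nd order
```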

To reduce redundancy, interactions in InnateDB that have the same participants and interaction type are grouped together by default. Choose 'No' to return all redundant interactions separately.

Search Interactions or Genes by Pathway

The pathway advanced search page allows users to search for genes or interactions found within a particular pathway. More than 3,500 pathways have been loaded in InnateDB from major public pathway databases.

Pathways in InnateDB and the molecules in them are species-specific. Please ensure you have selected the correct species.

You can select an example pathway from the top drop-down menu or search any of the 3,500+ pathways from all data sources by typing the pathway name in the second box.

You can also search by the external database pathway ID (e.g. REACTOME: 166016).

By selecting "Return a list of Genes", a list of all genes annotated in the pathway will be returned.

By default, the list of molecular interactions returned is restricted to interactions between annotated members of the pathway. If "No" is selected, a more comprehensive list of interactions is returned, displaying interactions between pathway members and all other molecules with which they interact.

Note: If a pathway member has no annotated interactions, then it will not be returned using the "Return a list of interactions" function.
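
To make the two interaction options and the note above concrete, here is a small sketch on toy data. The function names and the molecules A, B, C and X are hypothetical, and the code only mirrors the behavior described in this section rather than InnateDB's internal logic.

```python
def pathway_interactions(members, interactions, members_only=True):
    """Select interactions for a pathway.

    members_only=True:  both participants must be pathway members (default).
    members_only=False: at least one participant must be a pathway member.
    """
    members = set(members)
    if members_only:
        return [(a, b) for a, b in interactions if a in members and b in members]
    return [(a, b) for a, b in interactions if a in members or b in members]


def members_without_interactions(members, interactions):
    """Pathway members that take part in no interaction and so never appear."""
    seen = {molecule for pair in interactions for molecule in pair}
    return sorted(set(members) - seen)


if __name__ == "__main__":
    members = ["A", "B", "C"]                   # C has no annotated interactions
    interactions = [("A", "B"), ("B", "X")]     # X is not a pathway member
    print(pathway_interactions(members, interactions))
    print(pathway_interactions(members, interactions, members_only=False))
    print(members_without_interactions(members, interactions))
```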

Data Analysis - Upload your own data

We have had reports of users having some issues uploading data to InnateDB while using the Google Chrome browser. InnateDB is best used with the Firefox browser and is also tested with current versions of IE and Safari.

1. Select a file to upload by clicking on the "Upload File" button - upload a tab-delimited file of protein/gene identifiers or accession numbers and obtain a list of all genes, proteins, pathways, interactors or interactions that they are associated with. Alternatively, click on the "Web Form" button and paste your tab-delimited data into the text box (max. 1000 lines).

Note: There should be only one accession number per row. Probes that map to multiple genes should be removed.

Accession numbers from the following databases are currently accepted:

  • Ensembl
  • RefSeq
  • Entrez
  • UniProt
  • InnateDB (gene IDs only)
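
Before uploading, it can help to check your data against the requirements above. The following is a rough pre-upload check, not an official validator: the function name, the choice of separator characters used to spot multiple accessions in one cell, and the assumption that the identifier sits in the first column are all ours.

```python
def check_upload(text, id_column=0, max_lines=1000):
    """Return a list of human-readable problems found in tab-delimited text.

    Checks (assumptions based on the notes above):
      - no more than max_lines non-empty lines (web form limit)
      - every row has a non-empty accession in id_column
      - the accession cell does not look like several IDs joined together
    """
    problems = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) > max_lines:
        problems.append(f"{len(lines)} lines exceeds the limit of {max_lines}")
    for number, line in enumerate(lines, start=1):
        fields = line.split("\t")
        if id_column >= len(fields) or not fields[id_column].strip():
            problems.append(f"line {number}: missing accession")
            continue
        accession = fields[id_column].strip()
        if any(sep in accession for sep in (",", ";", " ", "|")):
            problems.append(f"line {number}: more than one accession in '{accession}'")
    return problems


if __name__ == "__main__":
    sample = "ENSG00000136560\t2.1\t0.01\nIDBG-73552\t-1.8\t0.03\n"
    print(check_upload(sample))  # an empty list means no problems were found
```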

2. Click on the column headers to specify which column in your data file contains the identifiers/accession numbers for each gene (and which database they come from). This is called the "Cross-reference ID".
You can only specify one cross-reference ID column. Please note that when using identifiers from InnateDB, only gene IDs are allowed, not interaction IDs.

3. Specify the Cross-reference database. This is the database where the identifiers in the cross-reference column come from.

4. If you have included gene expression data - identify which columns contain the gene expression values and their associated p-values.

You may also identify the column containing the probe IDs if you have included them in your file.

Including quantitative data such as gene expression data is optional, but it is a very useful way to investigate quantitative data in a pathway and interaction network context and to carry out subsequent analyses such as pathway over-representation analysis. The gene expression values in your file are mapped to the molecules identified by the cross-reference IDs.

Expression values must be in the format where a value of +2 represents a 2 fold increase in expression and a value of -2 a 2 fold decrease in expression.
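
If your expression values are stored as plain ratios (treated/control) or as log2 ratios, they need to be converted to this signed fold-change format first. The sketch below uses the common convention that a ratio below 1 becomes the negative reciprocal; this convention is our assumption of how to reach the format described above, and the function names are hypothetical.

```python
def signed_fold_change(ratio: float) -> float:
    """Convert an expression ratio to a signed fold change.

    ratio 2.0 -> +2.0 (2 fold increase), ratio 0.5 -> -2.0 (2 fold decrease).
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else -1.0 / ratio


def signed_fold_change_from_log2(log2_ratio: float) -> float:
    """Convert a log2 ratio to a signed fold change."""
    return signed_fold_change(2.0 ** log2_ratio)


if __name__ == "__main__":
    for ratio in (2.0, 0.5, 1.0):
        print(ratio, "->", signed_fold_change(ratio))
    for log2_ratio in (1.0, -1.0):
        print(log2_ratio, "->", signed_fold_change_from_log2(log2_ratio))
```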

You can specify values from up to ten different conditions or time-points. You can also specify a name for each condition.


5. Choose whether you want to return interactions, interactors, genes or pathways associated with your list of genes or proteins.

  • Returning a list of interactions allows one to identify all interactions in InnateDB in which the genes (or their encoded products) in the uploaded list are a participant and to construct a network of these interactions for visualization and further analysis. Detailed annotation and evidence is then available for each interaction. The resulting interaction network may then be downloaded in a variety of supported formats or interactively visualized using Cerebral.

  • Returning a list of interactors allows one to identify all molecules in InnateDB which interact with the genes (or their encoded products) in the uploaded list.

  • Returning a list of genes provides detailed annotation for each gene in the uploaded list and is a prerequisite to performing a Gene Ontology over-representation analysis.

  • Returning a list of pathways provides pathway annotation for each gene in the uploaded list and is a prerequisite to performing a pathway over-representation analysis.

Filter the interaction batch search results

You can choose to filter the results by using one of the following methods:

  • Genes - This will not return interactions involving any molecule other than those in the uploaded file, i.e. if molecule A interacts with B and C but only A and B are in your file, the interaction between A and C will not appear in the returned results. This is very useful for constructing a network of interactions only between molecules in the uploaded list (e.g. differentially expressed genes).

  • Pathways - This option limits the interactions returned to a particular pathway. You can search for any of the 3,500+ pathways from all data sources by typing the name of the pathway in the text box and by selecting one of the given choices.

Pathway Over-representation Analysis

To do pathway over-representation analysis (ORA) you first need to upload a list of gene identifiers and associated fold-change in gene expression values (and P values) as described above.

InnateDB recommends that you upload all genes from your array dataset, not just differentially expressed (DE) genes (probes mapping to multiple different genes should be removed). The pathway ORA tool uses the proportion of DE genes on the whole array to determine if a particular pathway is significant.

InnateDB also provides users with the option of uploading a subset of genes and performing the pathway ORA on it. This subset analysis uses a slightly different algorithm that does not take gene expression values into account. This is necessary as the algorithm does not know the proportion of DE genes on the array. Therefore, this analysis cannot handle data from multiple conditions.

If you have multiple probes for the same gene these values will be averaged for the purposes of the pathway ORA.
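
As an illustration of what this averaging means for your data, the sketch below collapses several probe values per gene into one value. It assumes a plain arithmetic mean, which is our reading of "averaged" and not a documented detail; the second gene ID and all expression values are made up for the example.

```python
from collections import defaultdict
from statistics import mean


def average_probes(rows):
    """Collapse (gene_id, fold_change) rows to one mean value per gene."""
    by_gene = defaultdict(list)
    for gene_id, fold_change in rows:
        by_gene[gene_id].append(fold_change)
    return {gene_id: mean(values) for gene_id, values in by_gene.items()}


if __name__ == "__main__":
    rows = [
        ("ENSG00000136560", 2.0),   # probe 1 for the same gene
        ("ENSG00000136560", 4.0),   # probe 2 for the same gene
        ("ENSG_EXAMPLE", -3.0),     # hypothetical gene with a single probe
    ]
    print(average_probes(rows))
```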

Make sure you select "Return a list of Pathways" from the menu in the data analysis section.

A list of pathways associated with the uploaded genes will be returned.

To do the pathway ORA click on the red Pathway ORA button at the top of the page.

This will take you to a page where you can choose the parameters for the pathway over-representation analysis.

First you need to specify whether you are analyzing an entire array dataset or just a subset of genes.
If you try to analyze a subset of genes using the entire dataset algorithm or vice versa your results will NOT be correct.

If you are analyzing a complete array dataset choose the following parameters for the pathway over-representation analysis.

  • Fold-Change Cutoff (+/-): choose what fold-change in gene expression threshold should be used to determine which genes are differentially expressed. Default = +/- 1.5.
  • Expression P-Value Cutoff: choose what P value threshold associated with each fold-change in gene expression value should be used to determine which genes are differentially expressed. Default P < 0.05.
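
The effect of these two cutoffs can be sketched as a simple classification rule. The code below uses the default values given above; whether a value exactly equal to the fold-change cutoff counts as differentially expressed is our assumption, and the function name is hypothetical.

```python
def classify_gene(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """Label a gene as 'up', 'down' or 'unchanged'.

    A gene is treated as differentially expressed when its P value is
    below p_cutoff and its signed fold change reaches +/- fc_cutoff.
    """
    if p_value < p_cutoff:
        if fold_change >= fc_cutoff:
            return "up"
        if fold_change <= -fc_cutoff:
            return "down"
    return "unchanged"


if __name__ == "__main__":
    # Made-up (fold change, P value) pairs for illustration.
    for fc, p in [(2.0, 0.01), (-1.6, 0.04), (3.0, 0.2), (1.2, 0.001)]:
        print(fc, p, classify_gene(fc, p))
```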

Now choose the analysis algorithm and multiple testing correction method.

  • Choose algorithm: several different statistical methods are available to determine if pathways are significantly associated with DE genes - Hypergeometric, Fisher & Chi Square.
  • Choose Correction Method: two options to correct for multiple testing are included - the Benjamini & Hochberg correction for the false discovery rate (FDR) and the more conservative Bonferroni correction.
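
For readers who want to see what these options compute, here is a textbook sketch of the hypergeometric test and the two correction methods. It is not InnateDB's code: the exact test formulation used by the tool is not described here, so the one-sided (upper tail) form and all names below are assumptions.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k): chance of seeing at least k DE genes in a pathway.

    N = genes on the array, K = DE genes on the array,
    n = genes in the pathway, k = DE genes in the pathway.
    """
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def bonferroni(pvalues):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini & Hochberg FDR-adjusted P values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up counts: 10 genes on the array, 5 DE, pathway of 5 genes, all DE.
    print(hypergeom_upper_tail(5, 10, 5, 5))
    raw = [0.01, 0.04, 0.03, 0.005]
    print(bonferroni(raw))
    print(benjamini_hochberg(raw))
```

Bonferroni controls the chance of any false positive and so is stricter; Benjamini & Hochberg controls the expected proportion of false positives among the pathways reported as significant.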

Hit submit. A new page will be returned showing the pathways that are significantly associated with up-regulated genes.

Click the green button to see pathways that are significantly associated with down-regulated genes.

Click on the 'summary' link to see information for all genes in the pathway. The interactions in the pathway, along with overlaid gene expression data can be visualized by clicking on the 'visualize' link.

Gene Ontology Over-representation Analysis

To do gene ontology over-representation analysis (ORA) you first need to upload a list of gene identifiers and associated fold-change in gene expression values (and P values) as described above.

Make sure you select "Return a list of Genes" from the 4th drop-down menu in the data analysis section.

Gene annotation including gene ontology terms associated with the uploaded genes will be returned. (To show associated gene ontology terms go into the show/hide options at the top of the page and select the relevant columns to display - this is not necessary for the ORA tool).

To do the gene ontology ORA click on the red Ontology ORA button at the top of the page.

This will take you to a page where you can choose the parameters for the over-representation analysis.

Please see the pathway ORA section above for further details regarding these options.