GOBU Overview
This document gives readers a conceptual understanding of GOBU. For other documents, please refer to our document list.
Layout
Whenever you run GOBU and load a properly prepared file, the program shows its main window with three tree components arranged from left to right. For convenience, we call the left tree the "GO tree", the middle tree the "user tree", and the right tree the "focus GO tree".
User Tree
First, the loaded file is visualized as the user tree. In the user tree, biologists mark some nodes with an "R" icon; we call these representative nodes (R-nodes for short). Generally speaking, biologists have to organize their data into a tree structure and designate the objects they are interested in as R-nodes. For example, in the figure above, the user data is organized by the following procedure:
- Collect all human protein-coding gene IDs from the NCBI database, along with their location information
- Group gene IDs into one group if they are on the same chromosome
- For each group, divide the gene IDs into sub-groups according to their map location
- Assign gene IDs as R-nodes
(We provide some simple utilities for building your own data; please see the Technical Section.)
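The grouping procedure above can be sketched in a few lines of Python. This is only an illustration of the idea, not GOBU's input format or one of its utilities: the record layout, the gene IDs, the map locations, and the `build_user_tree` function are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical input records: (gene_id, chromosome, map_location).
# The values are placeholders, not real NCBI data.
records = [
    ("gene_A", "chr1", "1p36"),
    ("gene_B", "chr1", "1p36"),
    ("gene_C", "chr1", "1q21"),
    ("gene_D", "chr2", "2p25"),
]


def build_user_tree(records):
    """Group gene IDs by chromosome, then by map location.

    Returns a nested mapping: chromosome -> map location -> list of R-nodes.
    """
    tree = defaultdict(lambda: defaultdict(list))
    for gene_id, chromosome, location in records:
        # Each gene ID becomes a leaf that is flagged as an R-node.
        tree[chromosome][location].append({"id": gene_id, "is_r_node": True})
    return tree


user_tree = build_user_tree(records)

# Print the tree with indentation, marking R-nodes with "[R]".
for chromosome, locations in user_tree.items():
    print(chromosome)
    for location, genes in locations.items():
        print("  " + location)
        for gene in genes:
            print("    [R] " + gene["id"])
```

In this sketch the chromosome and map-location levels are plain grouping nodes, and only the gene IDs at the bottom are flagged as R-nodes.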
We also assign annotations to the R-nodes (i.e., gene IDs), including GO annotations and genomic locations. Doing so makes it convenient to:
- use a map-location node to represent its descendant R-nodes, i.e., the gene IDs at this map location.
- use a chromosome node to represent its descendant R-nodes, i.e., the gene IDs on this chromosome.
- treat the descendant nodes of an R-node as its annotations.
GO Tree
Once R-nodes and their GO annotations have been assigned appropriately, biologists can observe the GO distribution directly on the GO tree: a number in parentheses on a GO tree node is the total number of R-nodes annotated with that node or any of its descendants. Biologists can simply expand or collapse the GO tree to browse it.
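As a rough illustration of how such a count could be computed, the sketch below walks a small GO fragment and collects the R-nodes annotated with a term or any of its descendants. The term names, the annotations, and the function `r_nodes_under` are hypothetical, and counting each R-node only once per GO node is an assumption of this sketch rather than a documented GOBU behavior.

```python
# Hypothetical GO fragment: term -> list of child terms.
go_children = {
    "GO:root": ["GO:a", "GO:b"],
    "GO:a": ["GO:c"],
    "GO:b": ["GO:c"],  # GO:c has two parents
    "GO:c": [],
}

# Hypothetical annotations: R-node -> set of GO terms it is annotated with.
annotations = {
    "gene_A": {"GO:c"},
    "gene_B": {"GO:a"},
    "gene_C": {"GO:b", "GO:c"},
}


def r_nodes_under(term, go_children, annotations):
    """Return the set of R-nodes annotated with `term` or any descendant."""
    genes = {gene for gene, terms in annotations.items() if term in terms}
    for child in go_children.get(term, []):
        genes |= r_nodes_under(child, go_children, annotations)
    return genes


# Using a set means an R-node is counted once per GO node (an assumption).
counts = {
    term: len(r_nodes_under(term, go_children, annotations))
    for term in go_children
}

for term, count in counts.items():
    print(f"{term} ({count})")
```

With this toy data, `GO:a` gets a count of 3 because it inherits the two R-nodes annotated with its child `GO:c` in addition to its own.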
Another benefit is that clicking any node of the GO tree selects a GO annotation, namely the clicked node together with its descendants. The selection is reflected in the user tree, where nodes that have a "selected" descendant are shown in color and all other nodes are shown in gray.
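The selection rule can be sketched as a bottom-up pass over the user tree. The tree layout, the `go_term` key, and the functions `selected_terms` and `mark` are hypothetical names chosen for this example. Following the earlier convention, annotations are modeled as child nodes of their R-node, and the sketch assumes that a selected annotation node is itself shown in color.

```python
# Hypothetical GO fragment: term -> list of child terms.
go_children = {
    "GO:root": ["GO:a", "GO:b"],
    "GO:a": ["GO:c"],
    "GO:b": [],
    "GO:c": [],
}

# Hypothetical user tree; annotation nodes carry a "go_term" key.
user_tree = {
    "name": "chr1",
    "children": [
        {"name": "loc_1", "children": [
            {"name": "gene_A", "children": [
                {"name": "gene_A/GO:c", "go_term": "GO:c"}]},
            {"name": "gene_B", "children": [
                {"name": "gene_B/GO:b", "go_term": "GO:b"}]},
        ]},
        {"name": "loc_2", "children": [
            {"name": "gene_C", "children": [
                {"name": "gene_C/GO:b", "go_term": "GO:b"}]},
        ]},
    ],
}


def selected_terms(clicked, go_children):
    """Return the clicked GO term together with all of its descendants."""
    selected = {clicked}
    for child in go_children.get(clicked, []):
        selected |= selected_terms(child, go_children)
    return selected


def mark(node, selected, colors):
    """Record 'color' or 'gray' for every node; return True if colored."""
    hit = node.get("go_term") in selected
    for child in node.get("children", []):
        # Call mark() first so every child is visited, even after a hit.
        hit = mark(child, selected, colors) or hit
    colors[node["name"]] = "color" if hit else "gray"
    return hit


colors = {}
mark(user_tree, selected_terms("GO:a", go_children), colors)
for name, color in colors.items():
    print(name, color)
```

Clicking `GO:a` here selects `GO:a` and `GO:c`, so `gene_A`, `loc_1`, and `chr1` are colored while `gene_B`, `gene_C`, and `loc_2` stay gray.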
Focus GO Tree
Every click on a node of the user tree or the GO tree shows a "summary" in the focus GO tree:
- Clicking a GO node: the focus GO tree is a partial tree of the GO tree, composed of all paths from the GO Root to the clicked GO node. Note that some GO nodes appear more than once, and the clicked GO node is shown as "selected".
- Clicking a colored user node: the focus GO tree is a partial tree of the GO tree, composed of all paths from the GO Root to the colored GO descendants of the clicked user node. These GO descendants are shown as "selected".
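Both cases come down to enumerating root-to-node paths. The sketch below shows one way to do that on a small hypothetical GO fragment; the term names and the functions `paths_from_root` and `focus_paths` are illustrative and are not taken from GOBU. Because a GO term can have more than one parent, the same term can end several distinct paths, which is why it may appear more than once in the focus GO tree.

```python
# Hypothetical GO fragment: term -> list of child terms.
go_children = {
    "GO:root": ["GO:a", "GO:b"],
    "GO:a": ["GO:c"],
    "GO:b": ["GO:c"],  # GO:c is reachable through two parents
    "GO:c": [],
}


def paths_from_root(root, target, go_children):
    """Return every path (as a list of terms) from `root` to `target`."""
    if root == target:
        return [[root]]
    paths = []
    for child in go_children.get(root, []):
        for tail in paths_from_root(child, target, go_children):
            paths.append([root] + tail)
    return paths


def focus_paths(root, targets, go_children):
    """Collect the paths for several targets, e.g. the colored GO
    descendants of a clicked user node."""
    paths = []
    for target in targets:
        paths.extend(paths_from_root(root, target, go_children))
    return paths


# Case 1: a GO node is clicked.
for path in paths_from_root("GO:root", "GO:c", go_children):
    print(" > ".join(path))

# Case 2: a user node with two colored GO descendants is clicked.
for path in focus_paths("GO:root", ["GO:a", "GO:c"], go_children):
    print(" > ".join(path))
```

In the first case `GO:c` ends two paths, one through `GO:a` and one through `GO:b`, so it would be displayed twice.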
Special Functionalities
Table Building
Biologists can designate any node of the GO tree as a reporting node and then, with a few operations, build a table showing the GO distributions of different sub-trees of the user tree. The table can be saved as an MS Excel file for further use.
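To make the shape of such a table concrete, the sketch below builds one row per reporting node and one column per user sub-tree, where each cell counts the R-nodes in that sub-tree annotated with the reporting node or one of its descendants. The data, the table layout, and the function names are assumptions made for illustration, and the sketch writes CSV text (which spreadsheet programs can open) as a stand-in for the Excel file that GOBU saves.

```python
import csv
import io

# Hypothetical GO fragment: term -> list of child terms.
go_children = {
    "GO:root": ["GO:a", "GO:b"],
    "GO:a": ["GO:c"],
    "GO:b": [],
    "GO:c": [],
}

# Hypothetical user sub-trees: name -> {R-node: set of GO terms}.
subtrees = {
    "chr1": {"gene_A": {"GO:c"}, "gene_B": {"GO:a"}},
    "chr2": {"gene_C": {"GO:b"}, "gene_D": {"GO:c"}},
}

reporting_nodes = ["GO:a", "GO:b"]


def term_and_descendants(term, go_children):
    """Return `term` together with all of its descendants."""
    terms = {term}
    for child in go_children.get(term, []):
        terms |= term_and_descendants(child, go_children)
    return terms


def build_table(reporting_nodes, subtrees, go_children):
    """Rows are reporting nodes, columns are user sub-trees."""
    rows = [["GO node"] + list(subtrees)]
    for term in reporting_nodes:
        covered = term_and_descendants(term, go_children)
        row = [term]
        for genes in subtrees.values():
            # Count R-nodes with at least one annotation under `term`.
            row.append(sum(1 for terms in genes.values() if terms & covered))
        rows.append(row)
    return rows


table = build_table(reporting_nodes, subtrees, go_children)

# Serialize as CSV text; a real export would write to a file instead.
buffer = io.StringIO()
csv.writer(buffer).writerows(table)
print(buffer.getvalue())
```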
User-Defined Data Types and the Extendable Architecture
To make GOBU easier for biologists to use, GOBU lets them create their own data types and corresponding extension software. For example, we created the LOC data type to represent genomic locations. Under the extendable architecture, we provide an extension (plugin) that draws these genomic locations on human chromosomes.