Short abstract
A software platform called Osprey has been developed for visualization and manipulation of complex interaction networks.
Abstract
We have developed a software platform called Osprey for visualization and manipulation of complex interaction networks. Osprey builds data-rich graphical representations that are color-coded for gene function and experimental interaction data. Mouse-over functions allow rapid elaboration and organization of network diagrams in a spoke model format. User-defined large-scale datasets can be readily combined with Osprey for comparison of different methods.
Rationale
The rapidly expanding biological datasets of physical, genetic and functional interactions present a daunting task for data visualization and evaluation [1]. Existing applications such as Pajek allow the user to visualize networks in a simple graphical format [2], but lack the features needed for functional assessment and comparative analysis between datasets. Typically, interaction networks are viewed within a graphing application, but the data are manipulated in other contexts, often manually.
To address these shortfalls, we developed a network visualization system called Osprey that not only represents interactions in a flexible and rapidly expandable graphical format, but also provides options for functional comparisons between datasets. Osprey was developed with the Sun Microsystems Java Standard Development Kit version 1.4.0_02 [3], which allows it to be used both in stand-alone form and as an add-on viewer for online interaction databases.
Network visualization
Osprey represents genes as nodes and interactions as edges between nodes (Figure 1). Unlike other applications, Osprey is fully customizable and allows the user to define personal settings for generation of interaction networks, as described below. Any interaction dataset can be loaded into Osprey using one of several standard file formats, or by upload from an underlying interaction database. By default, Osprey uses the General Repository for Interaction Datasets (The GRID [4]) as its database, from which the user can rapidly build out interaction networks. User-defined interactions are added or removed through mouse-over pop-up windows that link to the database. Networks can be saved as tab-delimited text files for future manipulation or exported as JPEG (JPG), portable network graphics (PNG) or scalable vector graphics (SVG) images [5]. The SVG image format allows the user to produce high-quality images that can be opened in applications such as Adobe Illustrator [6] for further manipulation.
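As an illustration of the node-and-edge representation described above, the following is a minimal Python sketch, not Osprey's own code. The four-column tab-delimited layout (gene, gene, experimental system, source), the function names and the sample rows are all assumptions made for this example; the file formats that Osprey actually reads and writes are not specified here.

```python
import csv
import io
from collections import defaultdict


def load_interactions(handle):
    """Read tab-delimited rows into a mapping of gene pair -> evidence.

    Assumed (hypothetical) columns: gene_a, gene_b, system, source.
    """
    edges = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip comment lines and rows that lack the four assumed columns.
        if len(row) < 4 or row[0].startswith("#"):
            continue
        gene_a, gene_b, system, source = row[:4]
        # Sort the pair so that A-B and B-A map to the same undirected edge.
        key = tuple(sorted((gene_a, gene_b)))
        edges[key].add((system, source))
    return edges


def build_adjacency(edges):
    """Derive a node -> neighbouring nodes mapping from the edge table."""
    adjacency = defaultdict(set)
    for gene_a, gene_b in edges:
        adjacency[gene_a].add(gene_b)
        adjacency[gene_b].add(gene_a)
    return adjacency


# Made-up rows for illustration only; these are not real interaction data.
SAMPLE = (
    "#gene_a\tgene_b\tsystem\tsource\n"
    "geneA\tgeneB\ttwo-hybrid\tstudy1\n"
    "geneB\tgeneA\taffinity-capture\tstudy2\n"
    "geneA\tgeneC\ttwo-hybrid\tstudy1\n"
)

edges = load_interactions(io.StringIO(SAMPLE))
adjacency = build_adjacency(edges)
```

Keeping the supporting evidence on each edge, rather than storing one edge per experiment, is one way to make later filtering and coloring by experimental system or source straightforward.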
Searches and filters
A drawback of current network visualization systems is the inability to search large graphs for an individual gene. To overcome this problem, Osprey allows text-search queries by gene name. A further difficulty with visualization systems is the absence of functional information within the graphical interface. Osprey remedies this by providing a one-click link to all database fields for every displayed node, including open reading frame (ORF) name, gene aliases, and a description of gene function. By default, this information is obtained from The GRID, which in turn compiles gene annotations provided by the Saccharomyces Genome Database (SGD [7]). Various filters have been developed that allow the user to query the network. For example, an interaction network can be parsed for interactions derived from a particular experimental method. Current Osprey filters include source, function, experimental system and connectivity (Figure 2).
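The search and filter operations described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than a description of Osprey's implementation: the record layout, the function names, the gene names and the definition of connectivity as the number of interaction records per gene are all hypothetical.

```python
from collections import Counter

# Hypothetical records: (gene_a, gene_b, experimental_system, source).
EDGES = [
    ("geneA", "geneB", "two-hybrid", "study1"),
    ("geneA", "geneC", "affinity-capture", "study2"),
    ("geneB", "geneC", "two-hybrid", "study2"),
    ("geneC", "geneD", "synthetic-lethality", "study1"),
]


def filter_by_system(edges, system):
    """Keep only interactions reported by one experimental system."""
    return [edge for edge in edges if edge[2] == system]


def filter_by_connectivity(edges, min_degree):
    """Keep interactions whose two genes each have at least min_degree records.

    Degree is counted once over the full input; it is not recomputed
    after genes are dropped (an assumption of this sketch).
    """
    degree = Counter()
    for gene_a, gene_b, _system, _source in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    keep = {gene for gene, count in degree.items() if count >= min_degree}
    return [edge for edge in edges if edge[0] in keep and edge[1] in keep]


def search_genes(edges, query):
    """Case-insensitive substring search over all gene names in the network."""
    needle = query.lower()
    genes = {gene for edge in edges for gene in edge[:2]}
    return sorted(gene for gene in genes if needle in gene.lower())


two_hybrid_only = filter_by_system(EDGES, "two-hybrid")
well_connected = filter_by_connectivity(EDGES, 2)
hits = search_genes(EDGES, "genea")
```

Because each filter returns a plain list of records, filters can be chained, for example by applying a source filter to the output of a connectivity filter.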
Network layout
As network complexity increases, graphical representations become cluttered and difficult to interpret. Osprey simplifies network layouts through user-implemented node relaxation, which disperses nodes and edges according to any one of a number of layout options. Any given node or set of nodes can be locked into place in order to anchor the network. Osprey also provides several default network layouts, including circular, concentric circles, spoke and dual ring (Figure 3). Finally, for comparison of large-scale datasets, Osprey can superimpose two or more datasets in an additive manner. In conjunction with filter options, this feature allows interactions specific to any given approach to be identified.
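Two of the ideas in this section, a circular layout and the additive superimposition of datasets, can be sketched as follows. This is a simplified illustration and not Osprey's layout algorithm: the function names, the even angular spacing and the dataset labels are assumptions introduced for the example.

```python
import math


def circular_layout(nodes, radius=1.0, center=(0.0, 0.0)):
    """Place nodes at evenly spaced angles on a circle."""
    count = len(nodes)
    cx, cy = center
    return {
        node: (
            cx + radius * math.cos(2 * math.pi * index / count),
            cy + radius * math.sin(2 * math.pi * index / count),
        )
        for index, node in enumerate(nodes)
    }


def superimpose(datasets):
    """Merge datasets additively, recording which datasets support each edge.

    datasets maps a (hypothetical) dataset label to an iterable of gene pairs.
    """
    merged = {}
    for label, pairs in datasets.items():
        for gene_a, gene_b in pairs:
            key = tuple(sorted((gene_a, gene_b)))
            merged.setdefault(key, set()).add(label)
    return merged


def specific_to(merged, label):
    """Return the edges supported by the named dataset and no other."""
    return sorted(edge for edge, labels in merged.items() if labels == {label})


positions = circular_layout(["geneA", "geneB", "geneC", "geneD"])
merged = superimpose({
    "datasetX": [("geneA", "geneB"), ("geneB", "geneC")],
    "datasetY": [("geneB", "geneA"), ("geneC", "geneD")],
})
only_x = specific_to(merged, "datasetX")
```

Recording the supporting datasets on each merged edge is what makes it possible to pick out interactions unique to one approach, as in the `specific_to` helper.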
Color representations
Osprey allows user-defined colors to indicate gene function, experimental systems and data sources. Genes are colored by their biological process as defined by standardized Gene Ontology (GO) annotations. Genes that have been assigned more than one process are represented as multicolored pie charts. Osprey currently recognizes 29 biological processes derived from the categories maintained by the GO Consortium [8]. Interactions are colored by experimental system along the entire length of the edge between two nodes. If a given interaction is supported by multiple experimental systems, the edge is segmented into multiple colors to reflect each system. Alternatively, interactions can be colored by data source, again with a multicolored edge if more than one source supports the interaction. When combined with filter options, a network can be rapidly visualized according to any number of experimental parameters.
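The pie-chart nodes and segmented edges described above amount to simple geometry, sketched here in Python. The equal share given to each process or experimental system, the function names and the color assignments are assumptions for illustration; the source does not state how Osprey sizes the slices or segments.

```python
# Hypothetical palettes; Osprey's colors are user-defined.
PROCESS_COLORS = {"cell cycle": "green", "DNA repair": "red"}
SYSTEM_COLORS = {"two-hybrid": "blue", "affinity-capture": "orange"}


def pie_slices(processes, palette):
    """Return (start_angle, end_angle, color) in degrees, one slice per process."""
    if not processes:
        return []
    step = 360.0 / len(processes)
    return [
        (index * step, (index + 1) * step, palette[process])
        for index, process in enumerate(processes)
    ]


def edge_segments(start, end, systems, palette):
    """Split the line from start to end into equal colored pieces, one per system."""
    if not systems:
        return []
    (x0, y0), (x1, y1) = start, end
    count = len(systems)
    segments = []
    for index, system in enumerate(systems):
        t0, t1 = index / count, (index + 1) / count
        # Linear interpolation between the two node positions.
        seg_start = (x0 + (x1 - x0) * t0, y0 + (y1 - y0) * t0)
        seg_end = (x0 + (x1 - x0) * t1, y0 + (y1 - y0) * t1)
        segments.append((seg_start, seg_end, palette[system]))
    return segments


slices = pie_slices(["cell cycle", "DNA repair"], PROCESS_COLORS)
segments = edge_segments((0, 0), (4, 0), ["two-hybrid", "affinity-capture"], SYSTEM_COLORS)
```

A gene with a single process reduces to one slice spanning the full circle, and an interaction with a single system reduces to one segment spanning the whole edge.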
Osprey download
A personal copy of the Osprey network visualization system version 0.9.9 for use in not-for-profit organizations can be downloaded from the Osprey webpage [9]. Registration is required for the sole purpose of enabling notification of software fixes and updates. A limited version of Osprey for online interaction viewing is available at The GRID website [4]. For implementation of Osprey as an online viewer for other online interaction databases, please contact the authors.
Acknowledgments
We thank Hosam Abdulrrazek for contributions to our layout algorithms, and Lorrie Boucher, Ashton Breitkreutz and Paul Jorgensen for suggestions on Osprey features. Development of Osprey was supported by the Canadian Institutes of Health Research. M.T. is a Canada Research Chair in Biochemistry.
A previous version of this manuscript was made available before peer review at http://genomebiology.com/2002/3/12/preprint/0012/
References
- Vidal M. A biological atlas of functional maps. Cell. 2001;104:333–339. doi: 10.1016/s0092-8674(01)00221-5.
- Batagelj V, Mrvar A. Pajek - program for large network analysis. Connections. 1998;21:47–57.
- Sun Microsystems Java Standard Development Kit 1.4.0_02 http://java.sun.com
- The GRID http://biodata.mshri.on.ca/grid
- Batik SVG toolkit http://xml.apache.org/batik/
- Adobe Illustrator 10 http://www.adobe.com/products/illustrator/main.html
- Cherry JM, Ball C, Dolinski K, Dwight S, Harris M, Matese JC, Sherlock G, Binkley G, Jin H, Weng S, Botstein D. Saccharomyces Genome Database June 2002 ftp://genome-ftp.stanford.edu/pub/yeast/SacchDB/
- The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology. Nat Genet. 2000;25:25–29. doi: 10.1038/75556.
- Osprey http://biodata.mshri.on.ca/osprey
- Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM, et al. Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 2002;415:141–147. doi: 10.1038/415141a.
- Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, Adams SL, Millar A, Taylor P, Bennett K, Boutilier K, et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature. 2002;415:180–183. doi: 10.1038/415180a.