The diversity of online resources storing biological data in different formats poses a challenge for bioinformaticians who need to integrate and analyse their biological data. The semantic web provides a standard to facilitate knowledge integration using statements built as triples, each describing a relation between two objects. WikiPathways, an online collaborative pathway resource, is now available in the semantic web through a SPARQL endpoint at http://sparql.wikipathways.org. Having biological pathways in the semantic web allows rapid integration, using SPARQL queries, with data from other resources that contain information about elements present in pathways. In order to convert WikiPathways content into meaningful triples, we developed two new vocabularies that capture the graphical representation and the pathway logic, respectively. Each gene, protein, and metabolite in a given pathway is defined with a standard set of identifiers to support linking to several other biological resources in the semantic web. WikiPathways triples were loaded into the Open PHACTS discovery platform and are available through its Web API (https://dev.openphacts.org/docs) to be used in various tools for drug development. We combined various semantic web resources with the newly converted WikiPathways content using a variety of SPARQL query types and third-party resources, such as the Open PHACTS API. The ability to use pathway information to form new links across diverse biological data highlights the utility of integrating WikiPathways in the semantic web.
WikiPathways is a crowd-sourced online platform for biological pathways, built on the same underlying platform as Wikipedia. Pathways are saved as graphical images embedded in a set of metadata elements (i.e., references, lists of pathway elements, and context annotations). Pathways serve as proxies of biological knowledge in their role as descriptors of processes. Yet integrating these hubs of biological knowledge with other biological data resources remains challenging due to a cacophony of file formats, identifier systems, and hidden content. We show how the semantic web enables a straightforward integration of heterogeneous biological data sources. We took high-quality pathways from a curated WikiPathways set and converted their content into a data format native to the semantic web. In this format, data is expressed as a set of statements, each built from web addresses. As a result, we were able to integrate external resources (e.g., EBI Expression Atlas) with pathway content in a single query.
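To make the idea of statements built from web addresses concrete, the following minimal Python sketch models a few triples and a pattern match over them, which is roughly what a single SPARQL triple pattern does. The namespaces, the pathway and element identifiers, and the `isPartOf` predicate are hypothetical placeholders chosen for illustration; they are not the actual WikiPathways vocabularies, and the `match` helper is not part of any library.

```python
from typing import List, Optional, Tuple

# A triple is (subject, predicate, object), each written as a web address (URI).
Triple = Tuple[str, str, str]

# Hypothetical namespaces, for illustration only (not the real vocabularies).
EX = "http://example.org/pathway/"
VOCAB = "http://example.org/vocab#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A toy store: one pathway and two of its elements.
triples: List[Triple] = [
    (EX + "WP1", RDF_TYPE, VOCAB + "Pathway"),
    (EX + "geneA", RDF_TYPE, VOCAB + "GeneProduct"),
    (EX + "geneA", VOCAB + "isPartOf", EX + "WP1"),
    (EX + "metaboliteB", VOCAB + "isPartOf", EX + "WP1"),
]


def match(
    store: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t
        for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# All elements that are part of the hypothetical pathway WP1.
members = sorted(t[0] for t in match(triples, p=VOCAB + "isPartOf", o=EX + "WP1"))
```

Because every subject, predicate, and object is a globally unique address, triples from a second resource that mention the same gene address can simply be appended to the same store and matched with the same patterns, which is the kind of integration described above.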