REVIEW ARTICLES

Biomedical ontologies and their development, management, and applications in and beyond China

Pan, Hongjie; Zhu, Yan; Yang, Sheng; Wang, Zhigang; Zhou, Wei; He, Yongqun; Yang, Xiaolin

doi: 10.1097/JBR.0000000000000051

Abstract

A quick glimpse of the ontology framework

Ontology, historically, referred to a branch of philosophy originating in ancient Greece that is concerned with the study of the nature and relations of being.[1] In modern computer science, the word has been borrowed to refer to a common, controlled knowledge representation designed to support knowledge sharing and computer reasoning in a specific domain.[2] Specifically, an ontology consists of a set of terms that represent entities and a set of relations that specify the semantic relationships between those terms in a specific domain; both the terms and the relations should be interpretable by computers and humans. Currently, the World Wide Web Consortium (W3C) Web Ontology Language (OWL) is the most widely used ontology language.

Ontologies can be divided into three types from the perspective of application. The first type is the upper ontology, also called top-level ontology, which provides a common framework for defining and structuring the terms and relations in a specific domain.[2] Commonly used upper-level ontologies include the Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE),[3] the Suggested Upper Merged Ontology (SUMO),[4–6] the CYC upper ontology (CYC),[7] the General Formal Ontology (GFO),[8] and the Basic Formal Ontology (BFO).[1,9] The second type is the reference ontology (domain ontology), which represents the terms and their relationships in a specific domain. For example, the Disease Ontology (DOID)[10] and the Human Phenotype Ontology (HPO)[11] are widely used domain ontologies. The third type is the application ontology, which is developed for a specific scenario or application and provides a minimal terminological structure to fit its limited needs.[12]

In recent decades, ontologies have been widely used in the biomedical area to help solve the problem of data heterogeneity, enabling advanced data analysis, knowledge organization, and reasoning.

Database search strategy

The authors used the following criteria and procedures to retrieve the related literature: (i) a general search on ontology development between 2000 and 2019 was performed, mainly using Google Scholar, the NCBI databases, and the Google search engine, with a focus on ontology development and application; (ii) to narrow the coverage to ontologies used in the biomedical field, a further search using the keywords “ontology”, “biomedical”, and “clinical” was performed to gain a preliminary understanding of biomedical ontologies; (iii) the results were screened with the additional terms “resource” and “repository” to find centralized ontology portals and to obtain a comprehensive view of biomedical ontology groups and the communities they support; (iv) similarly, the same process was applied to the domestic academic literature via the Baidu search engine and related academic databases in China to find ontology resources built in China.

Ontology, biomedical big data, and precision medicine

Since Michael Cox and David Ellsworth, scientists at the U.S. National Aeronautics and Space Administration, first described the forthcoming challenge of big data in 1997,[13] the idea of big data has rapidly spread around the world and has influenced all fields of natural science, social science, and engineering.[14] In the biomedical domain, with the completion of the Human Genome Project and the rapid development of new technologies such as high-throughput sequencing, a large amount of biological data has been generated, including genomic, proteomic, and clinical data. Precision medicine,[15] also known as personalized medicine, is a new medical concept and model that aims to achieve individualized medicine and relies on the integration of data from diverse sources,[16] including multi-omics, electronic health records, and even wearable devices. However, the huge heterogeneity of these biological data has greatly impeded data harmonization and integration for precision medicine. Therefore, an effective and efficient precision medicine decision system requires advanced data analysis, knowledge organization, representation, and reasoning.

Ontology plays a pivotal role in the organization and application of precision medicine data. First, ontology provides a terminology framework that reduces data heterogeneity and allows data to be interoperable between biomedical information systems. When data are annotated with biomedical ontologies, the data and the descriptions of metadata are coded by a unique ID and its corresponding label. For example, the terms in HPO can be used to standardize a patient's symptoms and signs, and DOID can be used to represent diagnosis results. As for metadata standardization, the Ontology for Biomedical Investigations (OBI)[17] provides a formal representation of the content of metadata types. As a result, it not only clearly specifies the concepts of metadata, but also describes the relationships between them. In this way, semantic heterogeneity between biomedical information systems can be reduced to ensure interoperability, and automatic data integration, analysis pipelines, and extensible reasoning-based data analysis methods can be realized more easily.
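The annotation step described above, in which free-text findings are coded by a unique ID and its label, can be sketched as follows. This is a minimal illustration and not the method of any particular tool: the lookup table, the `OntologyTerm` class, and the `annotate` function are assumptions introduced here, and the two HPO identifiers should be checked against the current HPO release before any real use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyTerm:
    term_id: str  # unique identifier, e.g. "HP:0001250"
    label: str    # human-readable label


# Toy lookup table (assumption). A real system would query an ontology
# service or a curated synonym list instead of a hard-coded dictionary.
LEXICON = {
    "seizure": OntologyTerm("HP:0001250", "Seizure"),
    "seizures": OntologyTerm("HP:0001250", "Seizure"),
    "fever": OntologyTerm("HP:0001945", "Fever"),
}


def annotate(free_text_findings):
    """Map free-text findings to ontology terms.

    Returns (matched_terms, unmatched_strings) so that findings without
    an ontology term stay visible for manual curation.
    """
    matched, unmatched = [], []
    for finding in free_text_findings:
        term = LEXICON.get(finding.strip().lower())
        if term is None:
            unmatched.append(finding)
        else:
            matched.append(term)
    return matched, unmatched


matched, unmatched = annotate(["Seizures", "fever ", "night sweats"])
```

Two records that spell a finding differently end up with the same identifier, which is what makes them comparable across information systems.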

Ontology can also help to form the precise and useful classifications required for individualized medicine. Classifications describe entities from domains of interest, such as diseases, phenotypes, medications, and exposures, by naming the entities in each domain and providing computational specifications of diverse degrees of sophistication.[18] In an ontology, the most important relationship is the ‘is_a’ relation, which is used to construct the basic hierarchy. Beyond that, an ontology also specifies other logical relationships between entities, such as the ‘has phenotype’ (RO_0002200) relation, which is mainly used to state that when a disease inheres in an organism, the organism has a given phenotype. Figure 1 shows how a disease is represented in DOID. A disease term can have multiple parents, and disease-related factors such as etiology, genetic mutation, phenotype, and location of the disease are expressed in description logic. Thus, a multiaxial classification is established that classifies a patient more accurately and deeply, and it also provides a computable framework for further data analysis.

Figure 1:
The representation of tyrosinemia type II in the Disease Ontology (DO). The figure shows the ontological representation of a disease term and its related terms in DO. Different colors represent different semantic relations between the center node (red block) and the nodes it points to (blocks in colors other than red).
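The ‘is_a’ hierarchy with multiple parents described above can be sketched as a small graph traversal. The identifiers and parent links below are hypothetical and are not copied from DO; they only illustrate how a term with two parents inherits ancestors along both axes.

```python
# Toy is_a hierarchy: each term maps to its direct parents.
# All identifiers are hypothetical (assumption), not real DO terms.
IS_A = {
    "EX:tyrosinemia_type_II": ["EX:tyrosinemia", "EX:autosomal_recessive_disease"],
    "EX:tyrosinemia": ["EX:amino_acid_metabolic_disorder"],
    "EX:amino_acid_metabolic_disorder": ["EX:metabolic_disease"],
    "EX:autosomal_recessive_disease": ["EX:genetic_disease"],
    "EX:metabolic_disease": ["EX:disease"],
    "EX:genetic_disease": ["EX:disease"],
    "EX:disease": [],
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def is_subclass_of(term, candidate, is_a=IS_A):
    """True if `term` equals `candidate` or has it as an ancestor."""
    return term == candidate or candidate in ancestors(term, is_a)
```

Because the term has two parents, a query for all genetic diseases and a query for all metabolic diseases both return it, which is the practical meaning of a multiaxial classification.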

Ontology also provides the formal knowledge representation required for medical decision support, whereby clinicians, staff, patients, or other individuals can use this knowledge and person-specific information, intelligently filtered or presented at appropriate times, to enhance health and health care.[19] In such a system, a central task is to develop a reasoning system and to specify the knowledge on which it operates. In an ontology-driven clinical decision support system, the key concepts and relationships from clinical practice guidelines, a source of evidence-based knowledge, are formalized in ontologies to ensure knowledge sharing, integration, and reuse. In Assessment and Treatment of Hypertension: Evidence-Based Automation - Clinical Decision Support,[20] an automated, clinical practice guideline-based clinical decision support system, a general guideline ontology was created to define how a guideline is represented; this ontology can then be instantiated to build a knowledge base for a specific disease or disease condition. A problem-solving program then combines the knowledge base with individual patient data from electronic health records to determine whether the patient is eligible for treatment in accordance with the guideline.[21]

Ontology resource open sharing

In the past 2 decades, the range and diversity of ontologies have increased dramatically. The demand for ontology reuse, data integration and annotation, and ontology alignment has led to the development of specialized ontology repositories. The best-known biomedical ontology resource repositories are the National Center for Biomedical Ontology (NCBO) BioPortal,[22,23] Ontobee,[24] the Ontology Lookup Service (OLS),[25,26] and AberOWL.[27–29]

These ontology repositories provide many ontology tools. Table 1 lists these repositories and their ontology application tools. For example, Ontology Lookup Service OxO and NCBO BioPortal Annotator are data annotation tools that annotate natural language using ontology terms so that semantic standardization can be realized. Linked data services have been a recent research hotspot, although difficulties remain in representing ontologies online. One important principle of linked data is that a Uniform Resource Identifier (URI) should return useful data when it is looked up,[30] and in particular that the URI of an ontology term should show the meaning of that single term. NCBO BioPortal proposed a way of dereferencing ontology terms by using a Persistent Uniform Resource Locator (PURL) server to redirect the ontology term to a permanent HTML page. However, the user still cannot directly use the URI of the term to dereference a real web page and obtain a corresponding Resource Description Framework (RDF) file. To accomplish the conversion to linked data, He's Group[31] created a different term dereferencing model in Ontobee. In this model, all ontology triples are stored in a Virtuoso triple store, SPARQL queries are submitted to the triple store via PHP, and the returned SPARQL query results are stored in JSON format and as an OWL file for the term. Ontobee converts the JSON to an HTML page via PHP and embeds the resulting HTML page link in the OWL file via Extensible Stylesheet Language Transformations (XSLT). When a browser accesses the corresponding ontology term URI, the HTML page and the OWL file are acquired simultaneously, so that the HTML page is displayed and the OWL file content is available in the source code of the page. Ontobee therefore realizes the dereferencing of ontology term URIs and is a true ontology linked data server. It is the default linked data server for the Open Biological and Biomedical Ontology (OBO) Foundry.[32,33]
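The first steps of the dereferencing model above, turning a term identifier into a URI and asking a triple store about that URI, can be sketched as follows. This is not Ontobee's PHP code. The helper names are introduced here for illustration, the PURL pattern `http://purl.obolibrary.org/obo/PREFIX_ID` is the convention commonly used for OBO ontology terms, and the query is generic SPARQL; no endpoint is contacted.

```python
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_purl(curie):
    """Convert a compact identifier such as 'GO:0045944' to an OBO-style PURL."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a compact identifier: {curie!r}")
    return f"{OBO_PURL_BASE}{prefix}_{local_id}"


def label_query(term_uri):
    """Build a SPARQL query that asks for the rdfs:label of one term."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?label WHERE {\n"
        f"  <{term_uri}> rdfs:label ?label .\n"
        "}"
    )


uri = curie_to_purl("GO:0045944")
query = label_query(uri)
```

In a deployment like the one described, a query of this kind would be submitted to the triple store and the result rendered as HTML and OWL; the rendering step is omitted here.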

Table 1:
Ontology resource repositories.

To provide diverse ontology services, several ontology libraries have been constructed using the technology frameworks mentioned above. Among them, the French AgroPortal ontology resource repository, based on the NCBO BioPortal architecture, is dedicated to building multilingual ontology resources in agronomy and related domains, and adds new external ontology term mapping information and new metadata services. The ontology resources in AgroPortal cover six languages, including English and French.[34–39] MedPortal, created in China, is also based on the NCBO BioPortal framework and is used in the biomedical field.[40] It now stores 55 ontology resources in both Chinese and English, with the aim of meeting the needs of domestic Chinese users for high-quality international ontology terms and their Chinese translations. In addition, mirrors of ontology resources and tools, such as Ontobee and Ontofox,[41] are also used in China to meet the needs of domestic ontology development and application.

Community-based ontology development

With the proliferation of ontologies, heterogeneity and redundancy have become major problems in ontology development. A traditional approach to solving these problems is to build retrospective mappings based on synonymy relations.[32] However, this method introduces new errors and confusion into the ontologies. Therefore, the OBO Foundry, as an internationally recognized ontology developer community, advocates a new strategy of building community-based biomedical ontologies according to a specific set of principles.[32]

The OBO Foundry established community-based ontology development principles for the life sciences. These mainly encompass open use, non-overlapping and strictly scoped content, collaborative development, and common syntax and relations; more details can be viewed online (http://www.obofoundry.org). Under the guidance of these principles, OBO ontologies have become more logically formed, interoperable, accurate, and easily reused, with less redundancy. As of September 2019, 10 mature ontologies have met the OBO Foundry principles and are recommended as preferred targets for community convergence. The other 149 ontologies are candidates and require further updating and verification. According to an analysis of ontology reuse in NCBO BioPortal in 2017, content from OBO ontologies is the most widely reused by other ontologies for the purpose of creating new ontology content.[23]

To disseminate the OBO principles and promote the quality of biomedical ontologies in China, the China Biomedical Ontology Joint Working Group (OntoChina, http://www.ontochina.org) was established in October 2017 with support from NCMI (China National Population Health Data Sharing Platform).[16] Based on the OBO principles, one primary task for OntoChina is to build English-Chinese bilingual ontologies or Chinese translation ontologies based on the original English reference ontology. OntoChina is also committed to building a comprehensive informatics framework to support ontology development and ontology-based applications. OntoChina has become an interactive and collaborative community devoted to ontology research in and beyond China.

A snapshot of biomedical ontology applications

Currently, ontology has become an important solution to the biomedical big data issue. Here we will provide an overview of some successful applications of biomedical ontology.

Ontologies make science data more Findable, Accessible, Interoperable and Reusable (FAIR)

In 2016, the Future of Research Communications and e-Scholarship community (FORCE 11) introduced a set of fundamental principles for scientific data management and stewardship, termed FAIR, which define the criteria that open science data should meet.[42]

One FAIR principle concerns representing metadata in a manner that both machines and humans can process. Ontology provides logical definitions and relationships for metadata to enhance interoperability between different metadata systems.[16] OBI provides precisely defined terms for describing the types of metadata that record how investigations in the biological and medical domains are conducted.[17] Dugan et al[43] established a standardized metadata system for human pathogen genome sequences based on OBI to maintain a consistent representation of data from different projects or institutes. Gonçalves et al[44] set up the Center for Expanded Data Annotation and Retrieval (CEDAR) workbench, which supports biomedical metadata construction with a template designer, metadata editing tools, and a metadata repository for building and storing users’ metadata templates. The workbench wraps the NCBO BioPortal API, providing standard ontology vocabulary support for metadata creation.[44]

Several software programs have been developed to support semantic standardization. The Investigation-Study-Assay (ISA) software suite helps users to apply all types of data standards in the life sciences (minimum reporting guidelines, terminologies, and formats) to provide rich and interoperable descriptions of experimental metadata.[45] OntoMaton, a tool in ISA and a plugin for Google Sheets, facilitates ontology searches and annotates experimental data with ontology terms through the NCBO BioPortal API.[46] Furthermore, to facilitate metadata curation in online databases, a browser extension, CEDAR OnDemand, powered by NCBO BioPortal, enables users to seamlessly enter ontology-based metadata through the existing web forms of the native database systems.[47] Of note, data scientists, domain experts, and software developers should work together to create an environment in which data standards are easily found and can be used by experimental data collectors, which is the first step toward FAIR data.

Gene Ontology (GO) and GO annotation

GO is likely the most successful ontology. Since it was established 20 years ago, it has boosted bioinformatics database construction and data analysis. GO describes the functions of gene products across all species in a consistent and computer-accessible manner, using a controlled vocabulary and logical relationships that facilitate the integration of public biological data.[48,49] In GO, gene product function covers three distinct aspects: molecular function (the activity of a gene product at the molecular level), cellular component (the location of a gene product's activity relative to biological structures), and biological process (a larger biological program in which a gene's molecular function is utilized).[48,49] GO annotation, an international collaborative project, was initiated to provide high-quality electronic and manual annotations of individual gene products from specific species using the standardized vocabulary in GO.[50] In this way, the relationships between specific gene products and GO terms are constructed. Consequently, a GO knowledge base was formed, in which an annotation is represented in a “triple” format: UniProtKB:P04637 “involved in” GO:0045944. Here, UniProtKB:P04637 denotes the cellular tumor antigen p53 of Homo sapiens, and GO:0045944 denotes the process of positive regulation of transcription by RNA polymerase II. Furthermore, the curator can assign an evidence code from the Evidence and Conclusion Ontology[51,52] to a GO annotation so that the user can assess the reliability of the annotation.
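The “triple” format above can be sketched as a small record type plus an index from GO terms to gene products. The `GOAnnotation` class and `index_by_term` function are assumptions introduced for illustration and are not the GO consortium's data model; the second annotation and its evidence value are hypothetical placeholders, not real curated data.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class GOAnnotation(NamedTuple):
    gene_product: str               # subject, e.g. a UniProtKB accession
    relation: str                   # e.g. "involved in"
    go_term: str                    # object, a GO identifier
    evidence: Optional[str] = None  # evidence code, if the curator assigned one


ANNOTATIONS = [
    # The triple quoted in the text.
    GOAnnotation("UniProtKB:P04637", "involved in", "GO:0045944"),
    # Hypothetical second record, only to make the index non-trivial.
    GOAnnotation("EX:hypothetical_protein", "involved in", "GO:0045944",
                 evidence="EX:placeholder_code"),
]


def index_by_term(annotations):
    """Group gene products by the GO term they are annotated to."""
    index = defaultdict(set)
    for ann in annotations:
        index[ann.go_term].add(ann.gene_product)
    return dict(index)


TERM_INDEX = index_by_term(ANNOTATIONS)
```

An index of this shape, from term to annotated gene products, is the input that term-based analyses typically start from.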

The GO knowledge base has had a significant impact on bioinformatics data analysis. The most widely used GO-based method is GO enrichment analysis, which identifies the functions that are over-represented in a given list of genes.[53] The multiple layers of biological complexity captured in computational function annotations also raise the question of how well predictions reflect biological reality, and several evaluation mechanisms have been used to improve prediction performance.[52] In 2017, the GO consortium provided a new annotation model, GO-Causal Activity Modeling,[54] which offers a detailed semantic model of how one or several gene products contribute to the execution of a biological process. Although GO-Causal Activity Modeling data are currently limited, the model will promote new data analysis methods based on the GO knowledge base.
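One common way to score enrichment, offered here as a sketch rather than as the method of any specific tool cited above, is the one-sided hypergeometric test. Assume a background of N genes, of which K are annotated to a GO term, and a study list of n genes, of which k carry the term; the p-value is the probability of seeing k or more annotated genes in a random list of size n. The function name and the numbers in the example are illustrative assumptions.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """One-sided p-value P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background
    K: background genes annotated to the GO term
    n: genes in the study list
    k: study-list genes annotated to the GO term
    """
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Illustrative numbers only: 20 background genes, 5 annotated,
# a study list of 4 genes, 2 of them annotated.
p_value = hypergeom_enrichment_p(N=20, K=5, n=4, k=2)
```

In practice one such test is run per GO term, so the resulting p-values need a multiple-testing correction before terms are reported as enriched.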

HPO and rare disease diagnosis

The HPO is committed to the collection and integration of standard terminology and disease annotation information for human disease phenotype abnormalities to enable large-scale disease phenotype computer analysis.[11] The contents come from databases and medical literature such as Orphanet,[55] DECIPHER,[56] and Online Mendelian Inheritance in Man (OMIM),[57] covering more than 10,000 clinically abnormal phenotypic terms and over 50,000 genetic disease annotations.

In clinical data, the abnormal phenotype terminology in HPO can be used to diagnose human diseases. In addition, the relationships between genes and phenotypes have been established and recorded in many databases. Phenotypes can therefore be used as a bridge to infer the relationships between genes and diseases. Phenotypic Interpretation of eXomes (PhenIX), a computational disease diagnosis tool based on HPO data developed by Zemojtel et al,[58] can rank the variants obtained by sequencing and select the mutations most likely to cause disease, thereby assisting clinicians in diagnosis. Because simple term matching performs poorly when different physicians describe the features of a syndrome in non-ontological ways, the Ontological Similarity Search (OSS) has been used to find a “good match” among syndromes or diseases by ranking the similarity scores of terms related to the presentation.[59] The Ontological Similarity Search with P values (OSS-PV) further improves the analysis, resulting in identification of the “best match” disease. The application Phenomizer has implemented OSS-PV online to assist physicians in human genetic diagnostics.[59,60] Additionally, to promote the use of HPO in non-English-speaking countries, many countries have translated the source HPO into a local version, such as the Chinese version of HPO, the CHPO database (http://www.chinahpo.org/#), which aims to promote rare disease research in China.
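The idea of ranking diseases by ontological similarity rather than exact term matching can be sketched with an information-content measure. The sketch below scores two terms by the information content of their most informative common ancestor and averages the best match over the query terms. Whether this matches the exact scoring used in Phenomizer is an assumption, the P value step is not shown, and the hierarchy, disease names, and annotations are all hypothetical toy data.

```python
from math import log

# Toy is_a hierarchy: child -> direct parents (hypothetical terms).
PARENTS = {
    "root": [],
    "neuro": ["root"],
    "seizure": ["neuro"],
    "ataxia": ["neuro"],
    "skin": ["root"],
    "rash": ["skin"],
}

# Hypothetical disease-to-phenotype annotations.
DISEASES = {
    "disease_A": {"seizure", "rash"},
    "disease_B": {"ataxia"},
    "disease_C": {"rash"},
    "disease_D": {"seizure"},
}


def ancestors_inclusive(term):
    """The term itself plus everything reachable through is_a links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = log(n / c), where c diseases out of n are annotated to t or a descendant."""
    counts = {term: 0 for term in PARENTS}
    for terms in DISEASES.values():
        covered = set()
        for term in terms:
            covered |= ancestors_inclusive(term)
        for term in covered:
            counts[term] += 1
    n = len(DISEASES)
    return {term: log(n / c) for term, c in counts.items() if c > 0}


IC = information_content()


def term_similarity(a, b):
    """IC of the most informative common ancestor of a and b."""
    common = ancestors_inclusive(a) & ancestors_inclusive(b)
    return max(IC.get(term, 0.0) for term in common)


def score(query, disease_terms):
    """Average, over query terms, of the best similarity to any disease term."""
    best = [max(term_similarity(q, d) for d in disease_terms) for q in query]
    return sum(best) / len(best)


def rank(query):
    """Diseases ordered from best to worst match for the query terms."""
    return sorted(DISEASES, key=lambda d: score(query, DISEASES[d]), reverse=True)
```

With the query `["seizure", "rash"]`, `disease_B` still receives a non-zero score because ataxia and seizure share the ancestor `neuro`, whereas exact term matching would score it zero.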

Challenges and ontology development in China

The development of biomedical big data has promoted the open sharing of biomedical ontologies and the development of the OBO Foundry, and the number of open and shared ontologies has therefore increased rapidly in recent years. The open sharing of biomedical ontology resources in turn promotes the reuse of ontology resources and the development of new ontology resources for biomedical big data integration. For ontology curators and developers, the reuse of ontology terms reduces the complexity of the review process for new terminology, but it also brings new problems to curators and users. First, it is difficult to determine the degree to which existing terms should be reused,[23] that is, whether to reuse the entire ontology or only a module of it.[61–63] Second, it is unclear how to keep reused terms synchronized with the versions of the source ontology, especially for terms reused in partial modules, and version control is relatively cumbersome. Last, developers may be unsure which top-level framework to choose for the ontology: should they choose the top-level framework design pattern recommended by OBO or the ontology framework that is most suitable for the field? Such a decision needs to be discussed before ontology construction. In addition, as shown in Table 2, we analyzed the distribution of ontologies across four repositories, ie, NCBO BioPortal, Ontobee, OLS, and AberOWL. Overall, there are 141 ontologies stored in the four repositories. In many cases, copies of the same ontology in different repositories have no version control, or their source is unknown. The ontology developer or user needs to check the versions manually to obtain the desired ontology. These issues need to be discussed and resolved by the OBO Foundry and/or other ontology communities.

Table 2:
Ontologies stored in more than one repository.

Additionally, the development of biomedical big data has increased the demand for multilingual ontologies. Biomedical big data, especially clinically related data, are often recorded in native languages, whereas existing bio-ontologies are mostly written in English. It is often necessary to translate the corresponding ontology standard or database into a local-language copy before the ontology can be applied. For example, in France, Japan, and other countries, the translation of the terms of the International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD10) has been completed successfully. The OBO Foundry has also suggested that multi-language ontologies should be constructed. In the past 2 years, OntoChina has translated several OBO ontologies into Chinese and built OWL files for multi-language ontologies, such as the Cell Line Ontology (CLO),[64] OBI, and BFO. As an advocate for the Chinese translation of high-quality English ontologies and for the OBO Foundry, OntoChina will continue to promote the construction of Chinese translations of high-quality middle-level ontologies and their corresponding ontology applications in the near future.

Acknowledgments

None.

Author contributions

HP and XY designed and drafted the manuscript. YH, XY and YZ supervised and wrote the manuscript. All authors revised the manuscript and approved the final version of the manuscript.

Financial support

This work was supported by Chinese Academy of Medical Science (CAMS) Innovation Fund for Medical Sciences (CIFMS) (No. 2018-I2M-AI-009 to XY), Independent Subject Project Funded by Basic Scientific Research Fund of Chinese Academy of Chinese Medical Science (No. zz110318 to YZ) and the University of Michigan Global Reach Award (to YH).

Conflicts of interest

The authors declare that they have no conflicts of interest.

References

[1]. Arp R, Smith B, Spear AD. Building Ontologies with Basic Formal Ontology. Cambridge, USA: MIT Press; 2015.
[2]. Robinson PN, Bauer S. Introduction to Bio-Ontologies. Boca Raton, USA: CRC Press; 2011.
[3]. Gangemi A, Guarino N, Masolo C, et al. Sweetening ontologies with DOLCE. Knowledge Engineering and Knowledge Management: Ontologies and the Semantic Web. Berlin, Heidelberg. 2002.
[4]. Pease A. The suggested upper merged ontology: a large ontology for the semantic web and its applications. Working Notes of the AAAI-2002 Workshop on Ontologies and the Semantic Web. Edmonton, Canada. 2002.
[5]. Niles I, Pease A. Towards a standard upper ontology. Proceedings of the international conference on Formal Ontology in Information Systems. Ogunquit, ME, USA. 2001.
[6]. Melo Gd, Suchanek F, Pease A. Integrating YAGO into the suggested upper merged ontology. 2008 20th IEEE International Conference on Tools with Artificial Intelligence. 2008.
[7]. Matuszek C, Cabral J, Witbrock MJ, et al. An Introduction to the Syntax and Content of Cyc. AAAI Spring Symposium: Formalizing and Compiling Background Knowledge and Its Applications to Knowledge Representation and Question Answering. Stanford, CA, USA. 2006.
[8]. Herre H. General Formal Ontology (GFO): a foundational ontology for conceptual modelling. In: Poli R, Healy M, Kameas A, eds. Theory and Applications of Ontology: Computer Applications. Dordrecht: Springer Netherlands; 2010:297–345.
[9]. Arp R, Smith B. Function, role and disposition in Basic Formal Ontology. Proceedings of Bio-Ontologies Workshop, Intelligent Systems for Molecular Biology (ISMB), Toronto. 2008:45–48.
[10]. Kibbe WA, Arze C, Felix V, et al Disease Ontology 2015 update: an expanded and updated database of human diseases for linking biomedical knowledge through disease data. Nucleic Acids Res 2015;43:D1071–D1078.
[11]. Robinson PN, Köhler S, Bauer S, et al The human phenotype ontology: a tool for annotating and analyzing human hereditary disease. Am J Hum Genet 2008;83:610–615.
[12]. Menzel C. Reference ontologies - application ontologies: Either/or or both/and? Proceedings of the KI2003 Workshop on Reference Ontologies and Application Ontologies. Hamburg, Germany. 2003.
[13]. Manyika J, Chui M, Brown B, Bughin J, et al. Big data: The next frontier for innovation, competition, and productivity. http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation. [Accessed August 1, 2019].
[14]. Freitas A, Curry E. Big data curation. In: Cavanillas JM, Curry E, Wahlster W, eds. New Horizons for a Data-Driven Economy: A Roadmap for Usage and Exploitation of Big Data in Europe. Cham: Springer International Publishing. 2016:87–118.
[15]. Collins FS, Varmus H. A new initiative on precision medicine. N Engl J Med 2015;372:793–795.
[16]. He Y, Yu H, Yang X, et al Ontology: Foundation of biomedical big data and precision medicine research. Shengwu Xinxi Xue 2018;16:7–14.
[17]. Bandrowski A, Brinkman R, Brochhausen M, et al The ontology for biomedical investigations. PLoS One 2016;11:e0154556.
[18]. Haendel MA, Chute CG, Robinson PN. Classification, ontology, and precision medicine. N Engl J Med 2018;379:1452–1462.
[19]. Osheroff JA, Teich JM, Middleton B, et al A roadmap for national action on clinical decision support. J Am Med Inform Assoc 2007;14:141–145.
[20]. Tso GJ, Tu SW, Oshiro C, et al Automating guidelines for clinical decision support: knowledge engineering and implementation. AMIA Annu Symp Proc 2017;2016:1189–1198.
[21]. Musen MA, Middleton B, Greenes RA. Clinical decision-support systems. In: Shortliffe EH, Cimino JJ, eds. Biomedical Informatics: Computer Applications in Health Care and Biomedicine. London: Springer London; 2014:643–674.
[22]. Salvadores M, Alexander PR, Musen MA, et al BioPortal as a dataset of linked biomedical ontologies and terminologies in RDF. Semant Web 2013;4:277–284.
[23]. Ochs C, Perl Y, Geller J, et al An empirical analysis of ontology reuse in BioPortal. J Biomed Inform 2017;71:165–177.
[24]. Ong E, Xiang Z, Zhao B, et al Ontobee: A linked ontology data server to support ontology term dereferencing, linkage, query and integration. Nucleic Acids Res 2017;45:D347–D352.
[25]. Côté RG, Jones P, Apweiler R, et al The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries. BMC Bioinformatics 2006;7:97.
[26]. Côté R, Reisinger F, Martens L, et al The ontology lookup service: bigger and better. Nucleic Acids Res 2010;38:W155–W160.
[27]. Hoehndorf R, Slater L, Schofield PN, et al Aber-OWL: a framework for ontology-based data access in biology. BMC Bioinformatics 2015;16:26.
[28]. Slater L, Rodríguez-García MÁ, O'Shea K, et al. Experiences with Aber-OWL, an Ontology Repository with OWL EL Reasoning. 12th International Experiences and Directions Workshop on Ontology Engineering. Bethlehem, PA, USA. 2016.
[29]. Slater L, Gkoutos GV, Schofield PN, et al Using AberOWL for fast and scalable reasoning over BioPortal ontologies. J Biomed Semantics 2016;7:49.
[30]. Berners-Lee T. Linked Data - Design Issues. http://www.w3.org/DesignIssues/LinkedData.html. [Accessed September 20, 2019].
[31]. He Y. He Group. http://www.hegroup.org/. [Accessed March 1, 2019].
[32]. Smith B, Ashburner M, Rosse C, et al The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol 2007;25:1251.
[33]. Ghazvinian A, Noy NF, Musen MA. How orthogonal are the OBO Foundry ontologies? J Biomed Semantics 2011;2:S2.
[34]. Jonquet C, Dzalé-Yeumo E, Arnaud E, et al. AgroPortal: a proposition for ontology-based services in the agronomic domain. IN-OVIVE. Rennes, France. 2015.
[35]. Jonquet C, Toulet A, Arnaud E, et al. Reusing the NCBO BioPortal technology for agronomy to build AgroPortal. International Conference on Biomedical Ontologies: Demo Session. Corvallis. 2016.
[36]. Jonquet C, Toulet A, Arnaud E, et al. AgroPortal: an open repository of ontologies and vocabularies for agriculture and nutrition data. GODAN Summit. New York, USA. 2016.
[37]. Jonquet C. AgroPortal: an ontology repository for agronomy. EFITA WCCA Congress. Montpellier, France. 2017.
[38]. Jonquet C, Toulet A, Arnaud E, et al AgroPortal: A vocabulary and ontology repository for agronomy. Comput Electron Agric 2018;144:126–143.
[39]. Jonquet C, Toulet A, Dutta B, et al Harnessing the power of unified metadata in an ontology repository: the case of AgroPortal. J Data Semant 2018;7:191–221.
[40]. Guo J, Yang S, Shi F, et al MedPortal: a biomedical ontology repository and platform focused on precision medicine. Zhongguo Shengwu Yixue Gongcheng Xuebao 2017;36:557–564.
[41]. Xiang Z, Courtot M, Brinkman RR, et al OntoFox: web-based support for ontology reuse. BMC Res Notes 2010;3:175.
[42]. Wilkinson MD, Dumontier M, Aalbersberg IJ, et al The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 2016;3:160018.
[43]. Dugan VG, Emrich SJ, Giraldo-Calderón GI, et al Standardized metadata for human pathogen/vector genomic sequences. PLoS One 2014;9:e99979.
[44]. Gonçalves RS, O’Connor MJ, Martínez-Romero M, et al. The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata that Describe Scientific Experiments. The Semantic Web – ISWC. Cham. 2017.
[45]. Sansone S-A, Rocca-Serra P, Field D, et al Toward interoperable bioscience data. Nat Genet 2012;44:121–126.
[46]. Maguire E, González-Beltrán A, Whetzel PL, et al OntoMaton: a bioportal powered ontology widget for Google Spreadsheets. Bioinformatics 2013;29:525–527.
[47]. Bukhari SAC, Martínez-Romero M, O’ Connor MJ, et al CEDAR OnDemand: a browser extension to generate ontology-based scientific metadata. BMC Bioinformatics 2018;19:268.
[48]. Ashburner M, Ball CA, Blake JA, et al Gene ontology: tool for the unification of biology. The gene ontology consortium. Nat Genet 2000;25:25–29.
[49]. The Gene Ontology Consortium. The gene ontology resource: 20 years and still going strong. Nucleic Acids Res 2019;47:D330–D338.
[50]. Camon E, Barrell D, Brooksbank C, et al The Gene Ontology Annotation (GOA) Project--application of GO in SWISS-PROT, TrEMBL and InterPro. Comp Funct Genomics 2003;4:71–74.
[51]. Chibucos MC, Siegele DA, Hu JC, et al The Evidence and Conclusion Ontology (ECO): supporting GO annotations. Methods Mol Biol 2017;1446:245–259.
[52]. Dessimoz C, Škunca N. The Gene Ontology Handbook. New York, USA. 2017.
[53]. Mi H, Muruganujan A, Huang X, et al Protocol Update for large-scale genome and gene function analysis with the PANTHER classification system. Nat Protoc 2019;14:703–721.
[54]. Balhoff JP, Good B, Carbon S, et al. Arachne: an OWL RL Reasoner Applied to Gene Ontology Causal Activity Models (and Beyond). International Semantic Web Conference. Monterey, CA, USA. 2018.
[55]. Rath A, Olry A, Dhombres F, et al Representation of rare diseases in health information systems: The orphanet approach to serve a wide range of end users. Hum Mutat 2012;33:803–808.
[56]. Firth HV, Richards SM, Bevan AP, et al DECIPHER: Database of chromosomal imbalance and phenotype in humans using ensembl resources. Am J Hum Genet 2009;84:524–533.
[57]. Amberger JS, Hamosh A. Searching Online Mendelian Inheritance in Man (OMIM): a knowledgebase of human genes and genetic phenotypes. Curr Protoc Bioinformatics 2017;58:1.2.1–1.2.12.
[58]. Zemojtel T, Köhler S, Mackenroth L, et al Effective diagnosis of genetic disease by computational phenotype analysis of the disease-associated genome. Sci Transl Med 2014;6:252ra123.
[59]. Köhler S, Vasilevsky NA, Engelstad M, et al The human phenotype ontology in 2017. Nucleic Acids Res 2017;45:D865–D876.
[60]. Köhler S, Schulz MH, Krawitz P, et al Clinical diagnostics in human genetics with semantic similarity searches in ontologies. Am J Hum Genet 2009;85:457–464.
[61]. Stuckenschmidt H, Parent C, Spaccapietra S. Modular ontologies: concepts, theories and techniques for knowledge modularization. Berlin: Springer; 2009.
[62]. Doran P, AM, Tamma V, Iannone L. Ontology Module Extraction for Ontology Reuse: An Ontology Engineering Perspective. New York: ACM; 2007.
[63]. Grau BC, Horrocks I, Kazakov Y, et al. Just the right amount: extracting modules from ontologies. Proceedings of the 16th international conference on World Wide Web. Banff, Alberta, Canada. 2007.
[64]. Ong E, Xie J, Ni Z, et al Ontological representation, integration, and analysis of LINCS cell line cells and their cellular responses. BMC Bioinformatics 2017;18:556.
Keywords:

biomedical big data; community-based; ontology; ontology-based application; open sharing

Copyright © 2019 The Chinese Medical Association. Published by Wolters Kluwer Health, Inc.