Machine Learning-Friendly Biomedical Datasets for Equivalence and Subsumption Ontology Matching

Paper Title

Machine Learning-Friendly Biomedical Datasets for Equivalence and Subsumption Ontology Matching

Authors

Yuan He, Jiaoyan Chen, Hang Dong, Ernesto Jiménez-Ruiz, Ali Hadian, Ian Horrocks

Abstract

Ontology Matching (OM) plays an important role in many domains such as bioinformatics and the Semantic Web, and its research is becoming increasingly popular, especially with the application of machine learning (ML) techniques. Although the Ontology Alignment Evaluation Initiative (OAEI) represents an impressive effort for the systematic evaluation of OM systems, it still suffers from several limitations including limited evaluation of subsumption mappings, suboptimal reference mappings, and limited support for the evaluation of ML-based systems. To tackle these limitations, we introduce five new biomedical OM tasks involving ontologies extracted from Mondo and UMLS. Each task includes both equivalence and subsumption matching; the quality of reference mappings is ensured by human curation, ontology pruning, etc.; and a comprehensive evaluation framework is proposed to measure OM performance from various perspectives for both ML-based and non-ML-based OM systems. We report evaluation results for OM systems of different types to demonstrate the usage of these resources, all of which are publicly available as part of the new BioML track at OAEI 2022.
