Paper Title

Multi-modal Protein Knowledge Graph Construction and Applications

Authors

Cheng, Siyuan; Liang, Xiaozhuan; Bi, Zhen; Chen, Huajun; Zhang, Ningyu

Abstract

Existing data-centric methods for protein science generally cannot sufficiently capture and leverage biological knowledge, which may be crucial for many protein tasks. To facilitate research in this field, we create ProteinKG65, a knowledge graph for protein science. Using the Gene Ontology and the UniProt knowledge base as a basis, we transform and integrate various kinds of knowledge into GO terms with aligned descriptions and protein entities with aligned protein sequences. ProteinKG65 is mainly dedicated to providing a specialized protein knowledge graph, bringing the knowledge of the Gene Ontology to protein function and structure prediction. We also illustrate the potential applications of ProteinKG65 with a prototype. Our dataset can be downloaded at https://w3id.org/proteinkg65.
