Paper Title
EHRKit: A Python Natural Language Processing Toolkit for Electronic Health Record Texts
Paper Authors
Paper Abstract
The Electronic Health Record (EHR) is an essential part of the modern medical system and impacts healthcare delivery, operations, and research. Although EHRs contain structured information, their unstructured text is attracting much attention and has become an exciting research field. The success of recent neural Natural Language Processing (NLP) methods has led to a new direction for processing unstructured clinical notes. In this work, we create a Python library for clinical texts, EHRKit. This library contains two main parts: MIMIC-III-specific functions and task-specific functions. The first part introduces a list of interfaces for accessing MIMIC-III NOTEEVENTS data, including basic search, information retrieval, and information extraction. The second part integrates many third-party libraries for up to 12 off-the-shelf NLP tasks such as named entity recognition, summarization, and machine translation.
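To make the "basic search" idea concrete, the following is a minimal sketch of keyword and patient lookup over note records. It is not EHRKit's API: the `Note` class and the functions `search_notes` and `notes_for_patient` are hypothetical names chosen for illustration, the field names only mirror the SUBJECT_ID, CATEGORY, and TEXT columns of the MIMIC-III NOTEEVENTS table, and the notes themselves are synthetic.

```python
from dataclasses import dataclass


@dataclass
class Note:
    """Hypothetical in-memory stand-in for one NOTEEVENTS row."""
    row_id: int
    subject_id: int
    category: str
    text: str


# Synthetic examples only; these are not real patient records.
NOTES = [
    Note(1, 101, "Discharge summary",
         "Patient admitted with pneumonia. Treated with antibiotics."),
    Note(2, 101, "Nursing",
         "Patient resting comfortably. No acute distress."),
    Note(3, 202, "Discharge summary",
         "History of diabetes. Admitted for chest pain."),
]


def search_notes(notes, keyword):
    """Return notes whose text contains `keyword`, ignoring case."""
    needle = keyword.lower()
    return [n for n in notes if needle in n.text.lower()]


def notes_for_patient(notes, subject_id):
    """Return all notes that belong to one patient."""
    return [n for n in notes if n.subject_id == subject_id]


if __name__ == "__main__":
    for note in search_notes(NOTES, "admitted"):
        print(note.row_id, note.category)
```

A real interface would presumably query the database rather than scan a list in memory; the sketch only shows the shape of the search and lookup operations the abstract mentions.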