Paper Title
MED-SE: Medical Entity Definition-based Sentence Embedding
Authors
Abstract
We propose Medical Entity Definition-based Sentence Embedding (MED-SE), a novel unsupervised contrastive learning framework designed for clinical texts that exploits the definitions of medical entities. To this end, we conduct an extensive analysis of multiple sentence embedding techniques in clinical semantic textual similarity (STS) settings. In the entity-centric setting that we designed, MED-SE achieves significantly better performance, while existing unsupervised methods, including SimCSE, show degraded performance. Our experiments elucidate the inherent discrepancies between general-domain and clinical-domain texts, and suggest that entity-centric contrastive approaches may help bridge this gap and lead to better representations of clinical sentences.
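The abstract does not spell out the training objective, so the following is only a minimal sketch of a generic in-batch contrastive (InfoNCE-style) loss, not the authors' implementation. It assumes, purely for illustration, that each clinical sentence is paired with the definition of a medical entity it mentions as its positive, and that the other definitions in the batch serve as negatives. The function names, the toy vectors, and the temperature value are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(sentence_vecs, definition_vecs, temperature=0.05):
    """Mean InfoNCE loss over a batch.

    Assumption: definition_vecs[i] is the positive for sentence_vecs[i];
    every other definition in the batch is treated as a negative.
    """
    losses = []
    for i, sentence in enumerate(sentence_vecs):
        logits = [cosine(sentence, d) / temperature for d in definition_vecs]
        # Log-sum-exp with the max subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical): two sentences and two entity definitions.
sentences = [[1.0, 0.1], [0.1, 1.0]]
definitions_aligned = [[0.9, 0.2], [0.2, 0.9]]   # each definition matches its sentence
definitions_swapped = [[0.2, 0.9], [0.9, 0.2]]   # definitions paired with the wrong sentence

loss_aligned = info_nce(sentences, definitions_aligned)
loss_swapped = info_nce(sentences, definitions_swapped)
```

In this toy setup the loss is lower when each sentence embedding lies close to the embedding of its own entity definition, which is the behavior an entity-centric contrastive objective would reward. In practice the vectors would come from a trained sentence encoder rather than being written by hand.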