Paper Title
Want to Identify, Extract and Normalize Adverse Drug Reactions in Tweets? Use RoBERTa
Authors
Abstract
This paper presents our approach to Task 2 and Task 3 of the Social Media Mining for Health (SMM4H) 2020 shared tasks. Task 2 requires distinguishing adverse drug reaction (ADR) tweets from non-ADR tweets, which we treat as binary classification. Task 3 involves extracting ADR mentions and then mapping them to MedDRA codes. We treat the extraction of ADR mentions as sequence labeling and their normalization as multi-class classification. Our system is based on the pre-trained language model RoBERTa. It achieves (a) an F1-score of 58% in Task 2, which is 12% above the average score; (b) a relaxed F1-score of 70.1% in ADR extraction for Task 3, which is 13.7% above the average score; and (c) a relaxed F1-score of 35% in ADR extraction + normalization for Task 3, which is 5.8% above the average score. Overall, our models achieve promising results in both tasks, with significant improvements over the average scores.
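The abstract reports relaxed F1-scores for ADR extraction but does not define the matching rule. The following is a minimal sketch, assuming "relaxed" means that a predicted ADR span counts as correct when it overlaps a gold span by at least one character. The names `Span`, `overlaps`, and `relaxed_f1` are hypothetical, the offsets are made up for illustration, and this is not the official SMM4H scorer.

```python
from typing import List, Tuple

# Hypothetical span type: (start, end) character offsets, end exclusive.
Span = Tuple[int, int]


def overlaps(a: Span, b: Span) -> bool:
    """Return True when two half-open spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def relaxed_f1(gold: List[Span], pred: List[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under overlap-based matching.

    Assumption: any overlap between a predicted span and a gold span
    counts as a hit. Exact boundary agreement is not required.
    """
    # Predicted spans that touch at least one gold span.
    matched_pred = sum(1 for p in pred if any(overlaps(p, g) for g in gold))
    # Gold spans that are touched by at least one predicted span.
    matched_gold = sum(1 for g in gold if any(overlaps(g, p) for p in pred))

    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Illustrative offsets only: one prediction overlaps a gold span, one does not.
gold_spans = [(10, 18), (30, 42)]
pred_spans = [(12, 20), (50, 55)]
precision, recall, f1 = relaxed_f1(gold_spans, pred_spans)
print(precision, recall, f1)
```

Under this assumption a prediction with imprecise boundaries still earns credit, which is why relaxed scores are typically higher than strict, exact-match scores on the same output.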