Duluth at SemEval-2019 Task 6: Lexical Approaches to Identify and Categorize Offensive Tweets

Paper title

Duluth at SemEval-2019 Task 6: Lexical Approaches to Identify and Categorize Offensive Tweets

Paper author

Pedersen, Ted

Abstract

This paper describes the Duluth systems that participated in SemEval-2019 Task 6, Identifying and Categorizing Offensive Language in Social Media (OffensEval). For the most part these systems took traditional Machine Learning approaches that built classifiers from lexical features found in manually labeled training data. However, our most successful system for classifying a tweet as offensive (or not) was a rule-based black-list approach, and we also experimented with combining the training data from two different but related SemEval tasks. Our best systems in each of the three OffensEval tasks placed in the middle of the comparative evaluation, ranking 57th of 103 in task A, 39th of 75 in task B, and 44th of 65 in task C.
