Automatic Detection of Sexist Statements Commonly Used at the Workplace

Paper Title

Automatic Detection of Sexist Statements Commonly Used at the Workplace

Authors

Dylan Grosz, Patricia Conde-Cespedes

Abstract

Detecting hate speech in the workplace is a unique classification task, as the underlying social context implies a subtler version of conventional hate speech. Applications of a state-of-the-art workplace sexism detection model include aids for Human Resources departments, AI chatbots, and sentiment analysis. Most existing hate speech detection methods, although robust and accurate, focus on hate speech found on social media, specifically Twitter. Social media is much more anonymous than the workplace, so it tends to lend itself to more aggressive, "hostile" versions of sexism. Datasets with large amounts of "hostile" sexism therefore pose a slightly easier detection task, since "hostile" sexist statements can hinge on a couple of words that, regardless of context, tip the model off that a statement is sexist. In this paper we present a dataset of sexist statements that are more likely to be said in the workplace, as well as a deep learning model that can achieve state-of-the-art results. Previous research has created state-of-the-art models to distinguish "hostile" and "benevolent" sexism based simply on aggregated Twitter data. Our deep learning methods, initialized with GloVe or random word embeddings, use LSTMs with attention mechanisms to outperform those models on a more diverse, filtered dataset that is more targeted towards workplace sexism, leading to an F1 score of 0.88.
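
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1 score for a binary sexist / not-sexist classifier could be computed. The abstract does not say whether the reported 0.88 is a binary, macro, or weighted F1, so the positive-class definition here is an assumption. The function name `f1_score` and the toy labels are hypothetical and are not taken from the paper's data or code.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels invented for illustration only (1 = sexist, 0 = not sexist).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

precision, recall, f1 = f1_score(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, so it only approaches 1.0 when the classifier both catches most sexist statements and rarely flags neutral ones.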
