Paper Title
Inflating Topic Relevance with Ideology: A Case Study of Political Ideology Bias in Social Topic Detection Models
Authors
Abstract
We investigate the impact of political ideology biases in training data. Through a set of comparison studies, we examine the propagation of biases in several widely used NLP models and its effect on overall retrieval accuracy. Our work highlights the susceptibility of large, complex models to propagating biases from human-selected input, which may lead to a deterioration of retrieval accuracy, and the importance of controlling for these biases. Finally, as a way to mitigate the bias, we propose learning a text representation that is invariant to political ideology while still judging topic relevance.