Paper Title
Unintended Bias in Language Model-driven Conversational Recommendation
Paper Authors
Paper Abstract
Conversational Recommendation Systems (CRSs) have recently started to leverage pretrained language models (LMs) such as BERT for their ability to semantically interpret a wide range of preference statement variations. However, pretrained LMs are well known to be prone to intrinsic biases in their training data, which may be exacerbated by biases embedded in domain-specific language data (e.g., user reviews) used to fine-tune LMs for CRSs. We study a recently introduced LM-driven recommendation backbone (termed LMRec) of a CRS to investigate how unintended bias (i.e., language variations such as name references or indirect indicators of sexual orientation or location that should not affect recommendations) manifests in significantly shifted price and category distributions of restaurant recommendations. The alarming results we observe strongly indicate that LMRec has learned to reinforce harmful stereotypes through its recommendations. For example, offhand mentions of names associated with the Black community significantly lower the price distribution of recommended restaurants, while offhand mentions of common male-associated names lead to an increase in recommended alcohol-serving establishments. These and many related results presented in this work raise a red flag that advances in the language handling capability of LM-driven CRSs do not come without significant challenges related to mitigating unintended bias in future deployed CRS assistants with a potential reach of hundreds of millions of end-users.
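The abstract describes a counterfactual probing setup: identical queries that differ only in an offhand name mention are fed to the LM-driven recommender, and shifts in the price distribution of the returned restaurants are measured. Below is a minimal Python sketch of that kind of probe under stated assumptions; `recommend_prices` and the toy recommender are hypothetical stand-ins (not the authors' LMRec code or its API), and the example names are illustrative only.

```python
# Minimal sketch of a counterfactual name-substitution probe for an
# LM-driven recommender. The recommender interface here is hypothetical:
# a callable mapping a query string to the price levels of its top-k results.

from statistics import mean
from typing import Callable, List


def probe_price_shift(
    recommend_prices: Callable[[str], List[float]],  # hypothetical: query -> price levels of top-k restaurants
    template: str,
    name_a: str,
    name_b: str,
) -> float:
    """Return the difference in mean recommended price level between two name mentions."""
    prices_a = recommend_prices(template.format(name=name_a))
    prices_b = recommend_prices(template.format(name=name_b))
    return mean(prices_a) - mean(prices_b)


if __name__ == "__main__":
    # Toy recommender used only to make the sketch runnable; a real probe
    # would call the fine-tuned LM-driven recommendation backbone instead.
    def toy_recommender(query: str) -> List[float]:
        return [2.0, 3.0, 3.0] if "Greg" in query else [1.0, 2.0, 2.0]

    shift = probe_price_shift(
        toy_recommender,
        template="My friend {name} and I are looking for a place to eat.",
        name_a="Greg",
        name_b="Jamal",
    )
    print(f"Mean price-level shift between name variants: {shift:+.2f}")
```

A full audit along these lines would aggregate such shifts over many templates and name lists (and similarly over category distributions, e.g., alcohol-serving establishments) and test whether the differences are statistically significant.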