Paper Title
orgFAQ: A New Dataset and Analysis on Organizational FAQs and User Questions
Authors
Abstract
Frequently Asked Questions (FAQ) webpages are created by organizations for their users. FAQs are used in several scenarios, e.g., to answer user questions. Conversely, the content of FAQs is, by definition, shaped by user questions. Several FAQ datasets exist to promote research in this field. However, we claim that, because they were collected from community websites, they do not correctly represent the challenges associated with FAQs in an organizational context. Thus, we release orgFAQ, a new dataset composed of $6988$ user questions and $1579$ corresponding FAQs that were extracted from organizations' FAQ webpages in the Jobs domain. In this paper, we provide an analysis of the properties of such FAQs and demonstrate the usefulness of our new dataset by utilizing it in a relevant task from the Jobs domain. We also show the value of the orgFAQ dataset in a task from a different domain: the COVID-19 pandemic.