Learning to Characterize Matching Experts

Paper title

Learning to Characterize Matching Experts

Authors

Roee Shraga, Ofra Amir, Avigdor Gal

Abstract

Matching is a task at the heart of any data integration process, aimed at identifying correspondences among data elements. Matching problems were traditionally solved in a semi-automatic manner, with correspondences being generated by matching algorithms and outcomes subsequently validated by human experts. Human-in-the-loop data integration has been recently challenged by the introduction of big data and recent studies have analyzed obstacles to effective human matching and validation. In this work we characterize human matching experts, those humans whose proposed correspondences can mostly be trusted to be valid. We provide a novel framework for characterizing matching experts that, accompanied with a novel set of features, can be used to identify reliable and valuable human experts. We demonstrate the usefulness of our approach using an extensive empirical evaluation. In particular, we show that our approach can improve matching results by filtering out inexpert matchers.
