Paper Title

AgAsk: An Agent to Help Answer Farmer's Questions From Scientific Documents

Authors

Bevan Koopman, Ahmed Mourad, Hang Li, Anton van der Vegt, Shengyao Zhuang, Simon Gibson, Yash Dang, David Lawrence, Guido Zuccon

Abstract

Decisions in agriculture are increasingly data-driven; however, valuable agricultural knowledge is often locked away in free-text reports, manuals and journal articles. Specialised search systems are needed that can mine agricultural information to provide relevant answers to users' questions. This paper presents AgAsk, an agent able to answer natural language agriculture questions by mining scientific documents. We carefully survey and analyse farmers' information needs. On the basis of these needs we release an information retrieval test collection comprising real questions, a large collection of scientific documents split into passages, and ground truth relevance assessments indicating which passages are relevant to each question. We implement and evaluate a number of information retrieval models to answer farmers' questions, including two state-of-the-art neural ranking models. We show that neural rankers are highly effective at matching passages to questions in this context. Finally, we propose a deployment architecture for AgAsk that includes a client based on the Telegram messaging platform and a retrieval model deployed on commodity hardware. The test collection we provide is intended to stimulate more research into methods for matching natural language questions to answers in scientific documents. While the retrieval models were evaluated in the agriculture domain, they are generalisable and of interest to others working on similar problems. The test collection is available at: https://github.com/ielab/agvaluate
