Paper Title

Information Extraction in Low-Resource Scenarios: Survey and Perspective

Authors

Deng, Shumin; Ma, Yubo; Zhang, Ningyu; Cao, Yixin; Hooi, Bryan

Abstract

Information Extraction (IE) seeks to derive structured information from unstructured texts, and it often faces challenges in low-resource scenarios due to data scarcity and unseen classes. This paper presents a review of neural approaches to low-resource IE from traditional and LLM-based perspectives, systematically categorizing them into a fine-grained taxonomy. We then conduct an empirical study of LLM-based methods compared with previous state-of-the-art models, and find that (1) well-tuned LMs are still predominant; (2) tuning open-resource LLMs and in-context learning (ICL) with the GPT family are promising in general; (3) the optimal LLM-based technical solution for low-resource IE can be task-dependent. In addition, we discuss low-resource IE with LLMs, highlight promising applications, and outline potential research directions. This survey aims to foster understanding of this field, inspire new ideas, and encourage widespread applications in both academia and industry.
