Paper title
A Benchmark for Lease Contract Review
Paper authors
Paper abstract
Extracting entities and other useful information from legal contracts is an important task whose automation can help legal professionals perform contract reviews more efficiently and reduce the associated risks. In this paper, we tackle the problem of detecting two types of elements that play an important role in contract review, namely entities and red flags. The latter are terms or sentences indicating some danger or other potentially problematic situation for one or more of the signing parties. We focus on supporting the review of lease agreements, a contract type that has received little attention in the legal information extraction literature, and we define the types of entities and red flags needed for that task. We release a new benchmark dataset of 179 lease agreement documents that we have manually annotated with the entities and red flags they contain, and which can be used to train and test relevant extraction algorithms. Finally, we release a new language model, called ALeaseBERT, pre-trained on this dataset and fine-tuned for the detection of the aforementioned elements, providing a baseline for further research.