Paper Title

Systematic Investigation of Strategies Tailored for Low-Resource Settings for Low-Resource Dependency Parsing

Authors

Jivnesh Sandhan, Laxmidhar Behera, Pawan Goyal

Abstract

In this work, we focus on low-resource dependency parsing for multiple languages. Several strategies are tailored to enhance performance in low-resource scenarios. While these are well-known to the community, it is not trivial to select the best-performing combination of these strategies for a low-resource language that we are interested in, and not much attention has been given to measuring the efficacy of these strategies. We experiment with 5 low-resource strategies for our ensembled approach on 7 Universal Dependency (UD) low-resource languages. Our exhaustive experimentation on these languages supports the effective improvements for languages not covered in pretrained models. We show a successful application of the ensembled system on a truly low-resource language Sanskrit. The code and data are available at: https://github.com/Jivnesh/SanDP
