Paper Title

HinglishNLP: Fine-tuned Language Models for Hinglish Sentiment Detection

Authors

Meghana Bhange, Nirant Kasliwal

Abstract

Sentiment analysis for code-mixed social media text continues to be an under-explored area. This work adds two common approaches: fine-tuning large transformer models and sample-efficient methods like ULMFiT. Prior work demonstrates the efficacy of classical ML methods for polarity detection. Fine-tuned general-purpose language representation models, such as those of the BERT family, are benchmarked along with classical machine learning and ensemble methods. We show that NB-SVM beats RoBERTa by 6.2% (relative) F1. The best-performing model is a majority-vote ensemble, which achieves an F1 of 0.707. The leaderboard submission was made under the CodaLab username nirantk, with an F1 of 0.689.
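
The majority-vote ensemble mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the three label values, the example predictions, and the tie-breaking rule (prefer the earliest-listed model) are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one list by majority vote.

    `predictions` is a list of lists: one list of labels per model,
    all of the same length. Ties are broken in favour of the model
    listed first (an assumption; the abstract does not specify a rule).
    """
    if not predictions:
        raise ValueError("need at least one model's predictions")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all models must predict the same number of items")

    voted = []
    for labels in zip(*predictions):  # labels for one example, one per model
        counts = Counter(labels)
        top = max(counts.values())
        # First label (in model order) that reaches the top count.
        voted.append(next(label for label in labels if counts[label] == top))
    return voted


# Hypothetical predictions from three models on four examples.
nbsvm = ["positive", "negative", "neutral", "positive"]
roberta = ["positive", "neutral", "neutral", "negative"]
ulmfit = ["negative", "negative", "neutral", "neutral"]

print(majority_vote([nbsvm, roberta, ulmfit]))
```

In the last example all three models disagree, so the sketch falls back to the first model's label; a real ensemble might instead break ties using model confidence or validation F1.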
