HinglishNLP: Fine-tuned Language Models for Hinglish Sentiment Detection

Paper title

HinglishNLP: Fine-tuned Language Models for Hinglish Sentiment Detection

Authors

Meghana Bhange, Nirant Kasliwal

Abstract

Sentiment analysis for code-mixed social media text remains an under-explored area. This work adds two common approaches: fine-tuning large transformer models and sample-efficient methods such as ULMFiT. Prior work demonstrates the efficacy of classical ML methods for polarity detection. Fine-tuned general-purpose language representation models, such as those of the BERT family, are benchmarked alongside classical machine learning and ensemble methods. We show that NB-SVM beats RoBERTa by 6.2% (relative) F1. The best-performing model is a majority-vote ensemble, which achieves an F1 of 0.707. The leaderboard submission was made under the CodaLab username nirantk, with an F1 of 0.689.
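
To make the majority-vote ensemble mentioned in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say which models enter the ensemble or how ties are resolved, so the function name `majority_vote`, the three model names, the label set, and the tie-breaking rule below are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model label lists into one label per example.

    predictions[m][i] is the label that model m assigns to example i.
    Tie-breaking is an assumption (the abstract does not specify it):
    among the labels with the highest count, the one proposed by the
    earliest-listed model wins.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_examples = len(predictions[0])
    if any(len(p) != n_examples for p in predictions):
        raise ValueError("all models must predict the same number of examples")

    combined = []
    # zip(*predictions) yields, for each example, the tuple of votes in model order.
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first vote (in model order) whose label reaches the top count.
        combined.append(next(label for label in votes if counts[label] == top))
    return combined


if __name__ == "__main__":
    # Hypothetical predictions from three models on four examples.
    nb_svm = ["positive", "negative", "neutral", "positive"]
    roberta = ["positive", "neutral", "neutral", "negative"]
    ulmfit = ["negative", "negative", "neutral", "neutral"]
    print(majority_vote([nb_svm, roberta, ulmfit]))
```

In the last example all three models disagree, so the label from the first-listed model is kept. Listing the strongest single model first is one reasonable design choice, though the paper may have handled ties differently.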
