Paper Title
Semi-Supervised Learning Approach to Discover Enterprise User Insights from Feedback and Support
Paper Authors
Paper Abstract
With the evolution of the cloud and a customer-centric culture, we inherently accumulate huge repositories of textual reviews, feedback, and support data. This has driven enterprises to seek and research engagement patterns, user network analysis, topic detection, etc. However, a large amount of manual work is still necessary to mine actionable outcomes from this data. In this paper, we propose and develop an innovative semi-supervised learning approach that uses deep learning and topic modeling to better understand the voice of the user. The approach combines a BERT-based multiclass classification algorithm built with supervised learning and a novel Probabilistic and Semantic Hybrid Topic Inference (PSHTI) model built with unsupervised learning, with the aim of automatically identifying the main topics or areas, as well as the sub-topics, in textual feedback and support data. There are three major breakthroughs: 1. With the advancement of deep learning technology, there have been tremendous innovations in the NLP field, yet traditional topic modeling, as one NLP application, lags behind the tide of deep learning. From a methodological and technical perspective, we adopt transfer learning to fine-tune a BERT-based multiclass classification system to categorize the main topics, and then use the novel PSHTI model to infer the sub-topics under each predicted main topic. 2. Traditional topic models and clustering methods based on unsupervised learning have difficulty automatically generating meaningful topic labels, whereas our system maps the top words of a topic to self-help issues by using domain knowledge about the product gathered through web crawling. 3. This work provides a prominent showcase of applying state-of-the-art methodology in real production to shed light on user insights and drive business investment priorities.
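The second point, mapping a topic's top words to self-help issues, can be illustrated with a minimal sketch. The abstract does not say how the matching is done, so the Jaccard word-overlap score below is an assumption chosen for illustration, not the PSHTI method itself. The function names `tokenize` and `label_topic`, the issue titles, and the top words are all hypothetical; in practice the titles would come from crawled product help pages.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def label_topic(top_words, issue_titles):
    """Pick the self-help issue title that best matches a topic's top words.

    Assumption: similarity is plain Jaccard overlap between the top words
    and the title tokens. Returns None when no title shares any word.
    """
    topic = {w.lower() for w in top_words}
    best_title, best_score = None, 0.0
    for title in issue_titles:
        tokens = tokenize(title)
        union = topic | tokens
        score = len(topic & tokens) / len(union) if union else 0.0
        if score > best_score:
            best_title, best_score = title, score
    return best_title


# Hypothetical self-help issue titles, standing in for web-crawled help pages.
issue_titles = [
    "Reset your account password",
    "Export a document to PDF",
    "Cancel a subscription and request a refund",
]

# Hypothetical top words of one sub-topic inferred by a topic model.
top_words = ["refund", "subscription", "cancel", "charge"]

print(label_topic(top_words, issue_titles))
# prints: Cancel a subscription and request a refund
```

Word overlap is only the simplest possible stand-in. Given the "semantic" part of the model's name, an embedding-based similarity might be closer in spirit, but that is likewise a guess rather than something the abstract states.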