Paper title
Automatic tagging of knowledge points for K12 math problems
Paper authors
Paper abstract
Automatic tagging of knowledge points for practice problems underpins the management of question banks and the automation and intelligence of education, so research on automatic tagging techniques for practice problems is of great practical significance. However, few studies address the automatic tagging of knowledge points for math problems. Compared with general texts, math texts have more complex structures and semantics because they contain unique elements such as symbols and formulas. It is therefore difficult to meet the accuracy requirements of knowledge-point prediction by directly applying general-domain text classification techniques. This paper takes K12 math problems as the research object and proposes the LABS model, which is based on label-semantic attention and multi-label smoothing combining textual features, to improve the automatic tagging of knowledge points for math problems. The model combines general-domain text classification techniques with the unique features of math texts. The results show that models using either label-semantic attention or multi-label smoothing outperform the traditional BiLSTM model on precision, recall, and F1-score, while the LABS model, which uses both, performs best. This indicates that label information can guide the neural network to extract meaningful information from the problem text, which improves the model's text classification performance. Moreover, multi-label smoothing combining textual features can fully explore the relationship between text and labels, improving both the model's ability to predict on new data and its classification accuracy.
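The abstract does not give the formulas behind the two components it names, so the following is only a minimal illustrative sketch, not the LABS model itself. It assumes that label-semantic attention can be approximated by dot-product attention between label embeddings and token hidden states (for example, BiLSTM outputs), and it substitutes the generic binary label-smoothing rule `y * (1 - eps) + 0.5 * eps` for the paper's multi-label smoothing, which additionally uses textual features that are not reproduced here. All function names and parameters (`label_attention`, `smooth_multi_hot`, `eps`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def label_attention(hidden, label_emb):
    """Return one text vector per label (assumed dot-product attention).

    hidden:    list of token hidden states, each a list of d floats
    label_emb: list of label embeddings, each a list of d floats
    """
    dim = len(hidden[0])
    label_vectors = []
    for label in label_emb:
        # Attention weights of this label over all tokens.
        weights = softmax([dot(label, h) for h in hidden])
        # Weighted sum of token states gives a label-specific representation.
        label_vectors.append(
            [sum(w * h[j] for w, h in zip(weights, hidden)) for j in range(dim)]
        )
    return label_vectors


def smooth_multi_hot(target, eps=0.1):
    """Generic binary label smoothing applied to a multi-hot target.

    This is a stand-in: the paper's variant also uses textual features.
    """
    return [t * (1.0 - eps) + 0.5 * eps for t in target]
```

In this sketch, a label whose embedding aligns strongly with one token's hidden state places nearly all of its attention weight on that token, which is one way label information could steer what is extracted from the problem text. The smoothed targets would then replace the hard 0/1 targets in a per-label binary cross-entropy loss.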