Paper Title

Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi

Authors

Aryaman Arora, Luke Gessler, Nathan Schneider

Abstract

Hindi grapheme-to-phoneme (G2P) conversion is mostly trivial, with one exception: whether a schwa represented in the orthography is pronounced or unpronounced (deleted). Previous work has attempted to predict schwa deletion in a rule-based fashion using prosodic or phonetic analysis. We present the first statistical schwa deletion classifier for Hindi, which relies solely on the orthography as input and outperforms previous approaches. We trained our model on a newly compiled pronunciation lexicon extracted from various online dictionaries. Our best Hindi model achieves state-of-the-art performance, and also achieves good performance on a closely related language, Punjabi, without modification.
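
To make the problem framing concrete, the sketch below finds each orthographic schwa in a Devanagari word and builds a character-window feature dictionary around it, the kind of orthography-only input a binary "pronounced vs. deleted" classifier could consume. This is an illustration only: the abstract does not specify the features or the model, so the function names (schwa_positions, window_features), the window width, and the "#" padding symbol are all assumptions, and the consonant and vowel-sign ranges cover only the core Devanagari block.

```python
# Illustrative sketch, not the authors' implementation.
# Assumption: a consonant carries an orthographic (inherent) schwa when it is
# not followed by a dependent vowel sign or a virama.

CONSONANTS = {chr(c) for c in range(0x0915, 0x093A)}   # U+0915 (ka) .. U+0939 (ha)
VOWEL_SIGNS = {chr(c) for c in range(0x093E, 0x094D)}  # U+093E .. U+094C
VIRAMA = "\u094D"
NUKTA = "\u093C"


def schwa_positions(word):
    """Return indices of consonants that carry an orthographic schwa."""
    positions = []
    for i, ch in enumerate(word):
        if ch not in CONSONANTS:
            continue
        j = i + 1
        # A nukta modifies the consonant itself, so look past it.
        if j < len(word) and word[j] == NUKTA:
            j += 1
        nxt = word[j] if j < len(word) else ""
        if nxt not in VOWEL_SIGNS and nxt != VIRAMA:
            positions.append(i)
    return positions


def window_features(word, i, width=2):
    """Characters at offsets -width..+width around index i, padded with '#'."""
    feats = {}
    for off in range(-width, width + 1):
        k = i + off
        feats[f"c{off:+d}"] = word[k] if 0 <= k < len(word) else "#"
    return feats


if __name__ == "__main__":
    for w in ["कमल", "नमकीन"]:
        for pos in schwa_positions(w):
            print(w, pos, window_features(w, pos))
```

Each feature dictionary would be paired with a label from a pronunciation lexicon (schwa kept or deleted) and passed to any standard classifier; which classifier to use is left open here because the abstract does not name one.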
