Paper Title

EMMT: A simultaneous eye-tracking, 4-electrode EEG and audio corpus for multi-modal reading and translation scenarios

Authors

Sunit Bhattacharya, Věra Kloudová, Vilém Zouhar, Ondřej Bojar

Abstract

We present the Eyetracked Multi-Modal Translation (EMMT) corpus, a dataset containing monocular eye movement recordings, audio, and 4-electrode electroencephalogram (EEG) data of 43 participants. The objective was to collect cognitive signals as responses from participants engaged in a number of language-intensive tasks involving different text-image stimuli settings when translating from English to Czech. Each participant was exposed to 32 text-image stimulus pairs and asked to (1) read the English sentence, (2) translate it into Czech, (3) consult the image, and (4) translate again, either updating or repeating the previous translation. The text stimuli consisted of 200 unique sentences containing 616 unique words, coupled with 200 unique images as the visual stimuli. The recordings were collected over a two-week period, and all participants included in the study were Czech natives with strong English skills. Due to the nature of the tasks involved in the study and the relatively large number of participants, the corpus is well suited for research in Translation Process Studies and Cognitive Sciences, among other disciplines.
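The abstract describes a four-step protocol per stimulus pair (read, translate, consult the image, translate again) with synchronized gaze, EEG, and audio streams. The sketch below is a minimal, hypothetical way one might model a single trial in Python; all class and field names (Trial, GazeSample, EEGSample, etc.) are illustrative assumptions and do not reflect the actual file layout of the released EMMT corpus.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical per-trial record structure for the protocol described in the abstract.
# The real EMMT release may organize its eye-tracking, EEG, and audio data differently.

@dataclass
class GazeSample:
    timestamp_ms: float        # time since trial onset
    x: float                   # horizontal gaze position (pixels)
    y: float                   # vertical gaze position (pixels)

@dataclass
class EEGSample:
    timestamp_ms: float
    channels: List[float]      # readings from the 4 electrodes (microvolts)

@dataclass
class Trial:
    participant_id: str
    sentence_en: str                   # English source sentence (text stimulus)
    image_path: str                    # associated visual stimulus
    sight_translation_cs: str          # spoken Czech translation before seeing the image
    final_translation_cs: str          # translation after consulting the image
    gaze: List[GazeSample] = field(default_factory=list)
    eeg: List[EEGSample] = field(default_factory=list)
    audio_path: Optional[str] = None   # recording of the spoken translations

# Fabricated example trial, purely for illustration of the four-step protocol.
trial = Trial(
    participant_id="P01",
    sentence_en="A dog is running on the beach.",
    image_path="stimuli/images/0001.jpg",
    sight_translation_cs="Pes běží po pláži.",
    final_translation_cs="Pes běží po pláži.",
)
print(trial.participant_id, trial.sentence_en)
```

Grouping all modalities under a single trial record mirrors how the protocol interleaves them per stimulus pair; in practice the corpus streams would likely be stored as separate time-aligned files rather than in-memory lists.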
