Paper Title

An LSTM Based Architecture to Relate Speech Stimulus to EEG

Paper Authors

Mohammad Jalilpour Monesi, Bernd Accou, Jair Montoya-Martinez, Tom Francart, Hugo Van Hamme

Abstract

Modeling the relationship between natural speech and a recorded electroencephalogram (EEG) helps us understand how the brain processes speech and has various applications in neuroscience and brain-computer interfaces. In this context, so far mainly linear models have been used. However, the decoding performance of the linear model is limited due to the complex and highly non-linear nature of the auditory processing in the human brain. We present a novel Long Short-Term Memory (LSTM)-based architecture as a non-linear model for the classification problem of whether a given pair of (EEG, speech envelope) correspond to each other or not. The model maps short segments of the EEG and the envelope to a common embedding space using a CNN in the EEG path and an LSTM in the speech path. The latter also compensates for the brain response delay. In addition, we use transfer learning to fine-tune the model for each subject. The mean classification accuracy of the proposed model reaches 85%, which is significantly higher than that of a state-of-the-art Convolutional Neural Network (CNN)-based model (73%) and the linear model (69%).
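
To make the match/mismatch setup concrete, here is a minimal PyTorch sketch assuming illustrative layer sizes, segment lengths, and a cosine-similarity readout; the paper's exact configuration is not given in the abstract, so every hyperparameter below is an assumption. A CNN path embeds a short EEG segment, an LSTM path embeds the corresponding speech envelope, and the two embeddings are compared to classify the (EEG, envelope) pair as matched or mismatched.

```python
# Hypothetical sketch of the match/mismatch classifier described in the abstract.
# Layer sizes, kernel widths, and the similarity readout are illustrative assumptions.
import torch
import torch.nn as nn


class MatchMismatchModel(nn.Module):
    def __init__(self, eeg_channels=64, emb_dim=32):
        super().__init__()
        # EEG path: 1-D convolutions over time, pooled to a fixed-size embedding.
        self.eeg_cnn = nn.Sequential(
            nn.Conv1d(eeg_channels, 64, kernel_size=8), nn.ReLU(),
            nn.Conv1d(64, emb_dim, kernel_size=8), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),          # -> (batch, emb_dim, 1)
        )
        # Speech path: an LSTM over the envelope; its recurrence can absorb
        # the brain-response delay mentioned in the abstract.
        self.env_lstm = nn.LSTM(input_size=1, hidden_size=emb_dim, batch_first=True)
        # Maps the similarity score to a matched/mismatched probability.
        self.classifier = nn.Linear(1, 1)

    def forward(self, eeg, envelope):
        # eeg: (batch, eeg_channels, time); envelope: (batch, time, 1)
        eeg_emb = self.eeg_cnn(eeg).squeeze(-1)            # (batch, emb_dim)
        _, (h_n, _) = self.env_lstm(envelope)
        env_emb = h_n[-1]                                   # (batch, emb_dim)
        sim = nn.functional.cosine_similarity(eeg_emb, env_emb, dim=-1)
        return torch.sigmoid(self.classifier(sim.unsqueeze(-1)))  # P(matched)


# Example: 5-second segments at 64 Hz (320 samples), 64 EEG channels.
model = MatchMismatchModel()
eeg = torch.randn(8, 64, 320)
env = torch.randn(8, 320, 1)
print(model(eeg, env).shape)  # torch.Size([8, 1])
```

In this sketch the speech path needs no explicit time shift: letting the LSTM carry state over the envelope is one way to account for the delay between the stimulus and the brain response, which is what the abstract attributes to the LSTM.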
