for: This paper analyzes the neural mechanisms underlying human speech comprehension, using match-mismatch classification of speech stimuli and the corresponding neural responses.
methods: The paper uses a network of convolution layers to process both speech and EEG signals, followed by a word boundary-based average pooling and a recurrent layer to incorporate inter-word context.
results: The experiments show that the match-mismatch classification accuracy improves significantly, to 93% on a publicly available speech-EEG data set, compared with the 65-75% achieved by previous efforts on this task.
Abstract
Recent studies have shown that the neural mechanisms underlying human speech comprehension can be analyzed using match-mismatch classification of the speech stimulus and the neural response. However, such studies have been conducted on fixed-duration segments, without accounting for the discrete processing of speech in the brain. In this work, we establish that word boundary information plays a significant role in sentence processing by relating EEG to its speech input. We process the speech and the EEG signals using a network of convolution layers. A word boundary-based average pooling is then performed on the representations, and the inter-word context is incorporated using a recurrent layer. The experiments show that the match-mismatch classification accuracy improves significantly, to 93% on a publicly available speech-EEG data set, while previous efforts achieved an accuracy of 65-75% on this task.
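The word boundary-based average pooling step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `word_boundary_average_pool`, the plain-list feature format, the end-exclusive `(start, end)` frame indices, and the toy values are all assumptions made for illustration. The idea it shows is that frame-level features of varying duration are reduced to one vector per word, so that the speech and EEG representations can be compared word by word before a recurrent layer adds inter-word context (the recurrent layer is not shown here).

```python
from typing import List, Sequence, Tuple


def word_boundary_average_pool(
    frames: Sequence[Sequence[float]],
    boundaries: Sequence[Tuple[int, int]],
) -> List[List[float]]:
    """Average frame-level feature vectors within each word span.

    frames: T feature vectors of equal dimension D (e.g. convolution outputs).
    boundaries: hypothetical (start, end) frame indices per word, end exclusive.
    Returns one D-dimensional mean vector per word.
    """
    pooled = []
    for start, end in boundaries:
        # Reject empty, reversed, or out-of-range spans.
        if not 0 <= start < end <= len(frames):
            raise ValueError(f"invalid word span ({start}, {end})")
        span = frames[start:end]
        dim = len(span[0])
        # Mean over the frames of this word, computed per feature dimension.
        pooled.append([sum(f[d] for f in span) / len(span) for d in range(dim)])
    return pooled


# Toy example (made-up values): 6 frames of 2-dimensional features, 3 words.
frames = [[1.0, 0.0], [3.0, 2.0], [2.0, 2.0], [4.0, 4.0], [6.0, 6.0], [5.0, 1.0]]
boundaries = [(0, 2), (2, 5), (5, 6)]
word_vectors = word_boundary_average_pool(frames, boundaries)
print(word_vectors)  # [[2.0, 1.0], [4.0, 4.0], [5.0, 1.0]]
```

Applying the same boundaries to the speech and EEG feature sequences yields two sequences of equal length (one entry per word), regardless of how long each word lasts.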
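Abstract (translated from Chinese)
Recent studies have shown that the neural mechanisms underlying human speech comprehension can be analyzed through match-mismatch classification of the speech stimulus and the neural response. However, these studies have typically been conducted on fixed-duration segments, without accounting for the brain's discrete processing of speech. In this work, we show that word boundary information plays an important role in sentence processing, and we process the speech and EEG signals with a neural network. We then apply word boundary-based average pooling to the representations and incorporate inter-word context through a recurrent layer. The experimental results show that our model's accuracy reaches 93% on a publicly available speech-EEG data set, whereas previous efforts achieved only 65-75%.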