eess.AS - 2023-11-28

Study of speaker localization under dynamic and reverberant environments

  • paper_url: http://arxiv.org/abs/2311.16927
  • repo_url: None
  • paper_authors: Daniel A. Mitchell, Boaz Rafaely
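  • for: This study investigates speaker localization in noisy and reverberant environments.
  • methods: The study evaluates a speaker localization algorithm on recordings from a glasses-mounted microphone array, analyzing its performance in dynamic environments and proposing improvements.
  • results: The algorithm performs well in static environments, but its performance degrades substantially on the EasyCom dataset. The study also proposes methods for improvement.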
    Abstract Speaker localization in a reverberant environment is a fundamental problem in audio signal processing. Many solutions have been developed to tackle this problem. However, previous algorithms typically assume a stationary environment in which both the microphone array and the sound sources are not moving. With the emergence of wearable microphone arrays, acoustic scenes have become dynamic with moving sources and arrays. This calls for algorithms that perform well in dynamic environments. In this article, we study the performance of a speaker localization algorithm in such an environment. The study is based on the recently published EasyCom speech dataset recorded in reverberant and noisy environments using a wearable array on glasses. Although the localization algorithm performs well in static environments, its performance degraded substantially when used on the EasyCom dataset. The paper presents performance analysis and proposes methods for improvement.