cs.SD - 2023-08-30

A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis

  • paper_url: http://arxiv.org/abs/2308.15422
  • repo_url: None
  • paper_authors: Ben Hayes, Jordie Shier, György Fazekas, Andrew McPherson, Charalampos Saitis
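  • for: This paper reviews differentiable digital signal processing (DDSP) as applied to music and speech synthesis.
  • methods: The paper surveys techniques in which loss-function gradients are backpropagated through digital signal processors so that they can be integrated into neural networks, and gives an overview of the signal processing operations that have been implemented differentiably.
  • results: The paper catalogues applications to tasks including music performance rendering, sound matching, and voice transformation, and discusses the motivations for and implications of the approach. It also highlights open challenges, including optimisation pathologies, robustness to real-world conditions, and design trade-offs, and discusses directions for future research.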
    Abstract The term "differentiable digital signal processing" describes a family of techniques in which loss function gradients are backpropagated through digital signal processors, facilitating their integration into neural networks. This article surveys the literature on differentiable audio signal processing, focusing on its use in music & speech synthesis. We catalogue applications to tasks including music performance rendering, sound matching, and voice transformation, discussing the motivations for and implications of the use of this methodology. This is accompanied by an overview of digital signal processing operations that have been implemented differentiably. Finally, we highlight open challenges, including optimisation pathologies, robustness to real-world conditions, and design trade-offs, and discuss directions for future research.
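    Example: The abstract's central idea, backpropagating a loss gradient through a signal processor, can be illustrated with a very small sketch. The code below is not taken from the paper. It assumes a single sinusoidal oscillator with a known, fixed frequency, and recovers only its amplitude by gradient descent on a mean squared error. The gradient is derived by hand rather than by an automatic differentiation library, and every name and constant (`SAMPLE_RATE`, `FREQ`, `oscillator`, `fit_amplitude`, the learning rate) is hypothetical and chosen for illustration.

```python
import math

SAMPLE_RATE = 16000  # Hz (assumed for illustration)
FREQ = 440.0         # Hz, treated as known and fixed in this sketch
N = 256              # number of samples per signal


def oscillator(amplitude):
    """Sinusoidal oscillator: y[n] = amplitude * sin(2*pi*FREQ*n / SAMPLE_RATE)."""
    return [
        amplitude * math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE)
        for n in range(N)
    ]


def loss_and_grad(amplitude, target):
    """Mean squared error against `target` and its derivative w.r.t. amplitude."""
    y = oscillator(amplitude)
    basis = oscillator(1.0)  # dy[n]/d(amplitude): the unit-amplitude sinusoid
    loss = sum((yi - ti) ** 2 for yi, ti in zip(y, target)) / N
    # Chain rule: dL/da = mean(2 * (y - target) * dy/da)
    grad = sum(2 * (yi - ti) * bi for yi, ti, bi in zip(y, target, basis)) / N
    return loss, grad


def fit_amplitude(target, init=0.1, lr=0.5, steps=200):
    """Plain gradient descent on the oscillator's amplitude parameter."""
    a = init
    for _ in range(steps):
        _, g = loss_and_grad(a, target)
        a -= lr * g
    return a


target = oscillator(0.8)          # synthetic target with a known amplitude
estimate = fit_amplitude(target)  # should approach 0.8
final_loss, _ = loss_and_grad(estimate, target)
```

    In this setting the loss is quadratic in the amplitude, so plain gradient descent converges reliably. This sketch says nothing about harder parameters or about the optimisation pathologies the abstract lists as open challenges.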