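results: The experiments show that the method substantially outperforms a set of selected text-matching baselines and performs comparably to the most advanced language model (GPT-4), while being more cost-effective when matching public comment letters to regulator responses at large scale.
Abstract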
U.S. federal regulators receive over one million comment letters each year from businesses, interest groups, and members of the public, all advocating for changes to proposed regulations. These comments are believed to have wide-ranging impacts on public policy. However, measuring the impact of specific comments is challenging because regulators are required to respond to comments but do not have to specify which comments they are addressing. In this paper, we propose a simple yet effective solution to this problem: an iterative contrastive method that trains a neural model to match text from public comments to responses written by regulators. We demonstrate that our proposal substantially outperforms a set of selected text-matching baselines on a human-annotated test set. Furthermore, it delivers performance comparable to the most advanced large language model (i.e., GPT-4) and is more cost-effective when matching comments to regulator responses at larger scale.
OpusCleaner and OpusTrainer, open source toolkits for training Machine Translation and Large language models
paper_authors: Nikolay Bogoychev, Jelmer van der Linde, Graeme Nail, Barry Haddow, Jaume Zaragoza-Bernabeu, Gema Ramírez-Sánchez, Lukas Weymann, Tudor Nicolae Mateiu, Jindřich Helcl, Mikko Aulamo
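results: Using the two tools, the authors show how to build high-quality machine translation models that are robust to noisy user input, as well as multilingual models and terminology-aware models.
Abstract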
Developing high-quality machine translation systems is a labour-intensive, challenging and confusing process for newcomers to the field. We present a pair of tools, OpusCleaner and OpusTrainer, that aim to simplify the process, reduce the amount of work and lower the entry barrier for newcomers. OpusCleaner is a data downloading, cleaning, and preprocessing toolkit. It is designed to allow researchers to quickly download, visualise and preprocess bilingual (or monolingual) data that comes from many different sources, each with different quality, issues, and unique filtering/preprocessing requirements. OpusTrainer is a data scheduling and data augmenting tool aimed at building large-scale, robust machine translation systems and large language models. It features deterministic data mixing from many different sources, on-the-fly data augmentation and more. Using these tools, we showcase how to create high-quality machine translation models that are robust to noisy user input, as well as multilingual models and terminology-aware models.
Custom Data Augmentation for low resource ASR using Bark and Retrieval-Based Voice Conversion
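results: Both methodologies contribute to the advancement of ASR technology and offer a pathway to high-quality, personalized voice generation for a range of applications.
Abstract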
This paper proposes two innovative methodologies to construct customized Common Voice datasets for low-resource languages like Hindi. The first methodology leverages Bark, a transformer-based text-to-audio model developed by Suno, and incorporates Meta's enCodec and a pre-trained HuBert model to enhance Bark's performance. The second methodology employs Retrieval-Based Voice Conversion (RVC) and uses the Ozen toolkit for data preparation. Both methodologies contribute to the advancement of ASR technology and offer valuable insights into addressing the challenges of constructing customized Common Voice datasets for under-resourced languages. Furthermore, they provide a pathway to achieving high-quality, personalized voice generation for a range of applications.
Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR
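methods: The authors use an existing hybrid ASR system to produce triphone alignments of the training audio, then create a cross-entropy loss at a certain layer of the encoder.
results: Placing the weak alignment supervision with a label smoothing parameter of 0.5 at the third encoder layer outperforms one-hot cross-entropy losses and CTC losses with loss weighting, and yields about 5% relative WER reduction on the TED-LIUM 2 dataset.
Abstract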
In this paper, we aim to create weak alignment supervision to aid end-to-end modeling. To this end, we use an existing hybrid ASR system to produce triphone alignments of the training audio. We then create a cross-entropy loss at a certain layer of the encoder using the derived alignments. In contrast to the general one-hot cross-entropy losses with or without loss weighting, here we use a cross-entropy loss with a label smoothing parameter to regularize the supervision. As a comparison, we also conduct experiments with one-hot cross-entropy losses and CTC losses with loss weighting. The results show that placing the weak alignment supervision with a label smoothing parameter of 0.5 at the third encoder layer outperforms the other two approaches and leads to about 5% relative WER reduction on the TED-LIUM 2 dataset over the baseline. We see similar improvements when applying the method out-of-the-box to a Tagalog end-to-end ASR system.
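The contrast between a one-hot and a label-smoothed cross-entropy target can be sketched as follows. This is only an illustration in plain Python: the function names are hypothetical, and the smoothing variant (mass `1 - smoothing` on the aligned class, the remainder spread uniformly over the other classes) is an assumption, since the abstract does not say which variant is used.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def smoothed_targets(target, num_classes, smoothing):
    """Assumed variant: 1 - smoothing on the aligned class,
    the rest shared equally by the remaining classes."""
    off_value = smoothing / (num_classes - 1)
    return [1.0 - smoothing if k == target else off_value
            for k in range(num_classes)]


def label_smoothed_ce(logits, target, smoothing=0.5):
    """Cross-entropy between the smoothed target and the model distribution.
    With smoothing=0.0 this reduces to the one-hot cross-entropy."""
    log_probs = log_softmax(logits)
    q = smoothed_targets(target, len(logits), smoothing)
    return -sum(qk * lp for qk, lp in zip(q, log_probs))


# Toy frame: 3 hypothetical triphone classes, alignment says class 2.
frame_logits = [1.0, 2.0, 3.0]
one_hot_loss = label_smoothed_ce(frame_logits, target=2, smoothing=0.0)
weak_loss = label_smoothed_ce(frame_logits, target=2, smoothing=0.5)
```

With a smoothing value of 0.5, half of the target mass is left to classes other than the aligned one, which is one way to read "weak" supervision: the alignment guides the layer without being treated as exact.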
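results: The approach generates texts that convey identical information in English and French, which matters where strict and simultaneous bilingualism is required. A brief comparison with the output of a GPT instance is also given.
Abstract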
This document illustrates the use of pyrealb for generating two parallel texts (English and French) from a single source of data. The data selection and text organisation processes are shared between the two languages. Only language-dependent word and phrasing choices are distinct processes. The realized texts thus convey identical information in both languages without the risk of being lost in translation. This is especially important in cases where strict and simultaneous bilingualism is required. We first present the types of applications targeted by this approach and how the pyrealb English and French realizer can be used for achieving this goal in a natural way. We describe an object-oriented organization to ensure a convenient realization in both languages. To illustrate the process, different types of applications are then briefly sketched with links to the source code. A brief comparison of the text generation is given with the output of an instance of a GPT.
One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space
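results: The paper reports that the algorithm can handle large language models efficiently in streaming applications while avoiding the memory cost of exact Key/Value storage. In particular, memory usage stays low as the text length grows.
Abstract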
Deploying Large Language Models (LLMs) in streaming applications that involve long contexts, particularly for extended dialogues and text analysis, is of paramount importance but presents two significant challenges. Firstly, the memory consumption is substantial during the decoding phase due to the caching of Key and Value states (KV) of previous tokens. Secondly, attention computation is time-consuming with a time complexity of $O(n^2)$ for the generation of each token. At the recent OpenAI DevDay (Nov 6, 2023), OpenAI released a new model that is able to support a 128K-long document. In our paper, we focus on memory efficiency when the context length $n$ is much greater than 128K ($n \gg 2^d$). Considering a single-layer self-attention with Query, Key, and Value matrices $Q, K, V \in \mathbb{R}^{n \times d}$, the polynomial method approximates the attention output $T \in \mathbb{R}^{n \times d}$. It accomplishes this by constructing $U_1, U_2 \in \mathbb{R}^{n \times t}$ to expedite the computation of attention ${\sf Attn}(Q, K, V)$ within $n^{1+o(1)}$ time. Despite this, storing the Key and Value matrices $K, V \in \mathbb{R}^{n \times d}$ still necessitates $O(n d)$ space, leading to significant memory usage. In response to these challenges, we introduce a new algorithm that reads the data in only one pass, in streaming fashion. This method employs sublinear space $o(n)$ to store three sketch matrices, alleviating the need for exact $K, V$ storage. Notably, our algorithm exhibits exceptional memory-efficient performance with super-long tokens. As the token length $n$ increases, our error guarantee diminishes while the memory usage remains nearly constant. This unique attribute underscores the potential of our technique in efficiently handling LLMs in streaming applications.
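To make the $O(nd)$ storage cost concrete, the sketch below shows the exact baseline that the abstract describes as the problem: a Key/Value cache that grows by $2d$ numbers per token, and softmax attention of one query over everything cached. It is not the paper's sketching algorithm. The class and function names are hypothetical, and the $1/\sqrt{d}$ scaling is the common convention, assumed here rather than taken from the abstract.

```python
import math


class KVCache:
    """Exact cache: keeps every past key and value vector."""

    def __init__(self):
        self.keys = []
        self.values = []

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)

    def num_floats(self):
        # Grows linearly with the number of cached tokens: 2 * n * d.
        return sum(len(k) for k in self.keys) + sum(len(v) for v in self.values)


def attend(query, keys, values):
    """Exact softmax attention of one query over all cached keys/values.
    The 1/sqrt(d) scaling is an assumed convention."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]


cache = KVCache()
cache.append([0.0, 0.0], [1.0, 2.0])
cache.append([0.0, 0.0], [3.0, 4.0])
output = attend([1.0, 0.0], cache.keys, cache.values)
```

Each decoding step touches all $n$ cached entries, which is where both the per-sequence quadratic time and the linear memory come from.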
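results: The authors find that a multilingual neural machine translation (MNMT) model based on language relatedness improves Ge'ez machine translation, and that few-shot translation with the GPT-3.5 large language model also reaches a remarkable BLEU score. However, the finetuned NLLB-200 model performs poorly with only 4k training samples.
Abstract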
Machine translation (MT) for low-resource languages such as Ge'ez, an ancient language that is no longer spoken in daily life, faces challenges such as out-of-vocabulary words, domain mismatches, and lack of sufficient labeled training data. In this work, we explore various methods to improve Ge'ez MT, including transfer learning from related languages, optimizing shared vocabulary and token segmentation approaches, finetuning large pre-trained models, and using large language models (LLMs) for few-shot translation with fuzzy matches. We develop a multilingual neural machine translation (MNMT) model based on language relatedness, which brings an average performance improvement of about 4 BLEU compared to standard bilingual models. We also attempt to finetune the NLLB-200 model, one of the most advanced translation models available today, but find that it performs poorly with only 4k training samples for Ge'ez. Furthermore, we experiment with using GPT-3.5, a state-of-the-art LLM, for few-shot translation with fuzzy matches, which leverages embedding similarity-based retrieval to find context examples from a parallel corpus. We observe that GPT-3.5 achieves a remarkable BLEU score of 9.2 with no initial knowledge of Ge'ez, but this is still lower than the MNMT baseline of 15.2. Our work provides insights into the potential and limitations of different approaches for low-resource and ancient language MT.
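The "few-shot translation with fuzzy matches" step can be pictured as nearest-neighbour retrieval over a parallel corpus followed by prompt assembly. The sketch below is a minimal illustration, not the authors' pipeline: the function names, the prompt layout, the toy two-dimensional embeddings, and the choice of cosine similarity are all assumptions, since the abstract only says "embedding similarity-based retrieval".

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_examples(query_vec, corpus, k=2):
    """corpus: list of (source, target, embedding) triples.
    Returns the k (source, target) pairs closest to the query."""
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item[2]),
                    reverse=True)
    return [(src, tgt) for src, tgt, _ in ranked[:k]]


def build_prompt(query, examples):
    """Hypothetical prompt layout: retrieved pairs first, then the query."""
    blocks = [f"Source: {src}\nTranslation: {tgt}" for src, tgt in examples]
    blocks.append(f"Source: {query}\nTranslation:")
    return "\n\n".join(blocks)


# Placeholder strings and toy embeddings; real ones would come from an encoder.
toy_corpus = [
    ("src-1", "tgt-1", [1.0, 0.0]),
    ("src-2", "tgt-2", [0.0, 1.0]),
    ("src-3", "tgt-3", [0.9, 0.1]),
]
examples = retrieve_examples([1.0, 0.0], toy_corpus, k=2)
prompt = build_prompt("src-new", examples)
```

The retrieved pairs act as in-context examples, so the LLM sees translations of sentences similar to the one it is asked to translate.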
paper_authors: Francesco Paissan, Elisabetta Farella
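for: Reducing the complexity of contrastive language-audio pretrained models to obtain an efficient model for audio and speech processing.
methods: A unimodal distillation loss derived from first principles, combined with pruning to reduce the dimensionality of the shared multimodal latent space.
results: tinyCLAP uses only 6% of the original Microsoft CLAP parameters, with a drop of less than 5% in zero-shot classification performance across three sound event detection datasets.
Abstract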
Contrastive Language-Audio Pretraining (CLAP) has become crucially important in the field of audio and speech processing. Its applications range from sound event detection to text-to-audio generation. However, one of its main limitations is the considerable amount of data required for training and the overall computational complexity during inference. This paper investigates how we can reduce the complexity of contrastive language-audio pre-trained models, yielding an efficient model that we call tinyCLAP. We derive a unimodal distillation loss from first principles and explore how the dimensionality of the shared, multimodal latent space can be reduced via pruning. TinyCLAP uses only 6% of the original Microsoft CLAP parameters with a minimal reduction (less than 5%) in zero-shot classification performance across the three sound event detection datasets on which it was tested.
Analysing the Impact of Removing Infrequent Words on Topic Quality in LDA Models
results: The results show that removing infrequent words improves the quality of the estimated topics, and that a considerable share of the vocabulary can be removed.
Abstract
An initial procedure in text-as-data applications is text preprocessing. One of the typical steps, which can substantially facilitate computations, consists in removing infrequent words believed to provide limited information about the corpus. Despite the popularity of vocabulary pruning, few guidelines on how to implement it are available in the literature. The aim of the paper is to fill this gap by examining the effects of removing infrequent words on the quality of topics estimated using Latent Dirichlet Allocation. The analysis is based on Monte Carlo experiments taking into account different criteria for infrequent term removal and various evaluation metrics. The results indicate that pruning is beneficial and that the share of vocabulary which might be eliminated can be quite considerable.
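One common form of vocabulary pruning can be sketched as follows. The abstract mentions several removal criteria without listing them, so the minimum document frequency threshold used here is an assumption, and the function name and return values are hypothetical.

```python
from collections import Counter


def prune_vocabulary(docs, min_doc_freq=2):
    """Drop words that occur in fewer than `min_doc_freq` documents.

    docs: list of tokenized documents (lists of strings).
    Returns the pruned documents, the kept vocabulary, and the share of
    the original vocabulary that was removed.
    """
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each word once per document

    kept = {word for word, count in doc_freq.items() if count >= min_doc_freq}
    pruned = [[word for word in doc if word in kept] for doc in docs]
    removed_share = 1 - len(kept) / len(doc_freq) if doc_freq else 0.0
    return pruned, kept, removed_share


toy_docs = [["a", "b"], ["a", "c"], ["a", "b", "d"]]
pruned_docs, kept_vocab, removed_share = prune_vocabulary(toy_docs, min_doc_freq=2)
```

In this toy corpus half of the vocabulary disappears while most tokens survive, because rare words are numerous but account for few occurrences. The pruned documents would then be passed to whatever LDA implementation is in use.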
SER_AMPEL: A multi-source dataset for SER of Italian older adults
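results: The paper presents evidence of the need for the proposed dataset and reports preliminary classification results on a subset of it, discussing the critical issues of SER.
Abstract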
In this paper, SER_AMPEL, a multi-source dataset for speech emotion recognition (SER), is presented. The peculiarity of the dataset is that it is collected with the aim of providing a reference for speech emotion recognition in older Italian adults. The dataset is collected following different protocols, in particular acted conversations extracted from movies and TV series, and recordings of natural conversations in which the emotions are elicited by suitable questions. The evidence of the need for such a dataset emerges from the analysis of the state of the art. Preliminary considerations on the critical issues of SER are reported by analyzing the classification results on a subset of the proposed dataset.
Controlled Text Generation via Language Model Arithmetic
paper_authors: Jasper Dekoninck, Marc Fischer, Luca Beurer-Kellner, Martin Vechev
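for: The paper proposes a new inference framework for customizing large language models (LLMs) in a wider range of settings.
methods: The paper introduces an inference method called "model arithmetic", which composes and biases LLMs without retraining the model or relying on highly specific datasets. It also allows more precise control of generated text than direct prompting and prior controlled text generation (CTG) techniques.
results: According to the paper, model arithmetic allows fine-grained control of generated text while outperforming the state of the art on the task of toxicity reduction.
Abstract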
As Large Language Models (LLMs) are deployed more widely, customization with respect to vocabulary, style and character becomes more important. In this work we introduce model arithmetic, a novel inference framework for composing and biasing LLMs without the need for model (re)training or highly specific datasets. In addition, the framework allows for more precise control of generated text than direct prompting and prior controlled text generation (CTG) techniques. Using model arithmetic, we can express prior CTG techniques as simple formulas and naturally extend them to new and more effective formulations. Further, we show that speculative sampling, a technique for efficient LLM sampling, extends to our setting. This enables highly efficient text generation with multiple composed models with only marginal overhead over a single model. Our empirical evaluation demonstrates that model arithmetic allows fine-grained control of generated text while outperforming state-of-the-art on the task of toxicity reduction.
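One way to picture "composing and biasing" models at inference time is a weighted combination of their next-token log-probabilities, renormalized into a single distribution. The sketch below assumes that reading. The function name, the toy two-token vocabulary, and the specific weights are hypothetical, and the abstract itself does not spell out the formulas.

```python
import math


def combine(log_prob_dists, weights):
    """Weighted sum of per-model next-token log-probabilities,
    renormalized with a softmax. All models share one vocabulary."""
    vocab = log_prob_dists[0].keys()
    scores = {
        token: sum(w * dist[token] for w, dist in zip(weights, log_prob_dists))
        for token in vocab
    }
    m = max(scores.values())  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return {token: math.exp(s - log_z) for token, s in scores.items()}


# Toy next-token distributions over a two-word vocabulary.
base_model = {"nice": math.log(0.5), "rude": math.log(0.5)}
toxic_model = {"nice": math.log(0.1), "rude": math.log(0.9)}

# Assumed formula: base model minus half of the toxic model.
steered = combine([base_model, toxic_model], weights=[1.0, -0.5])
unchanged = combine([base_model], weights=[1.0])
```

A negative weight pushes probability away from tokens the second model favours, which is the intuition behind using such a formula for toxicity reduction. No model is retrained, since only the output distributions are combined.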
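results: The study runs a series of experiments across different datasets and evaluation metrics to validate the feasibility and effectiveness of the DP-NMT framework.
Abstract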
Neural machine translation (NMT) is a widely popular text generation task, yet there is a considerable research gap in the development of privacy-preserving NMT models, despite significant data privacy concerns for NMT systems. Differentially private stochastic gradient descent (DP-SGD) is a popular method for training machine learning models with concrete privacy guarantees; however, the implementation specifics of training a model with DP-SGD are not always clarified in existing models, with differing software libraries used and code bases not always being public, leading to reproducibility issues. To tackle this, we introduce DP-NMT, an open-source framework for carrying out research on privacy-preserving NMT with DP-SGD, bringing together numerous models, datasets, and evaluation metrics in one systematic software package. Our goal is to provide a platform for researchers to advance the development of privacy-preserving NMT systems, keeping the specific details of the DP-SGD algorithm transparent and intuitive to implement. We run a set of experiments on datasets from both general and privacy-related domains to demonstrate our framework in use. We make our framework publicly available and welcome feedback from the community.
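The implementation details that DP-SGD leaves open are easier to see in code. Below is a minimal sketch of one generic DP-SGD update in plain Python: clip each per-example gradient, sum, add Gaussian noise scaled to the clipping norm, then average. It illustrates the general algorithm, not the DP-NMT code base. The function names and the flat-list parameter layout are assumptions, and no privacy accounting is included.

```python
import math
import random


def clip(grad, max_norm):
    """Scale a per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One DP-SGD update on a flat parameter list."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    batch_size = len(clipped)
    noisy_mean = []
    for j in range(len(params)):
        total = sum(g[j] for g in clipped)
        # Gaussian noise with standard deviation noise_multiplier * max_norm.
        total += rng.gauss(0.0, noise_multiplier * max_norm)
        noisy_mean.append(total / batch_size)
    return [p - lr * g for p, g in zip(params, noisy_mean)]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.0, 0.0]]
# noise_multiplier=0.0 disables the noise so the arithmetic is easy to follow.
updated = dp_sgd_step([0.0, 0.0], grads, lr=1.0, max_norm=1.0,
                      noise_multiplier=0.0, rng=rng)
```

Choices such as whether noise is added before or after averaging, and how the batch is sampled, are exactly the kind of implementation specifics that affect reproducibility.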
results: CorPipe surpasses the other participants in CRAC 2023 by a large margin of 4.5 percentage points.
Abstract
We present CorPipe, the winning entry to the CRAC 2023 Shared Task on Multilingual Coreference Resolution. Our system is an improved version of our earlier multilingual coreference pipeline, and it surpasses other participants by a large margin of 4.5 percentage points. CorPipe first performs mention detection, followed by coreference linking via an antecedent-maximization approach on the retrieved spans. Both tasks are trained jointly on all available corpora using a shared pretrained language model. Our main improvements comprise inputs larger than 512 subwords and changing the mention decoding to support ensembling. The source code is available at https://github.com/ufal/crac2023-corpipe.
Average Token Delay: A Duration-aware Latency Metric for Simultaneous Translation
results: In the experiments, ATD had the highest correlation with EVS among the baseline latency metrics under most conditions.
Abstract
Simultaneous translation is a task in which the translation begins before the end of an input speech segment. Its evaluation should be conducted based on latency in addition to quality, and for users, the smallest possible amount of latency is preferable. Most existing metrics measure latency based on the start timings of partial translations and ignore their duration. This means such metrics do not penalize the latency caused by long translation output, which delays the comprehension of users and subsequent translations. In this work, we propose a novel latency evaluation metric for simultaneous translation called \emph{Average Token Delay} (ATD) that focuses on the duration of partial translations. We demonstrate its effectiveness through analyses simulating user-side latency based on Ear-Voice Span (EVS). In our experiment, ATD had the highest correlation with EVS among baseline latency metrics under most conditions.