2023-10-07

cs.LG

cs.LG - 2023-10-07

Transferable Deep Clustering Model

paper_url: http://arxiv.org/abs/2310.04946
repo_url: https://github.com/cirosantilli/china-dictatorship
paper_authors: Zheng Zhang, Liang Zhao
for: 本研究旨在提出一种可传递的深度划分模型，以便根据源领域中获得的知识自动调整目标领域中的划分结果。
methods: 我们提出了一种新的注意力模块，该模块可以自动调整划分中心点，根据样本与划分中心点之间的关系。此外，我们还证明了我们的模型比一些经典的划分算法，如k-means或GMM更有力。
results: 我们在实验中对真实 dataset 进行了测试，结果表明我们的提出的传输学习框架可以显著提高目标领域中的性能，同时降低计算成本。

Abstract
Deep learning has shown remarkable success in the field of clustering recently. However, how to transfer a trained clustering model on a source domain to a target domain by leveraging the acquired knowledge to guide the clustering process remains challenging. Existing deep clustering methods often lack generalizability to new domains because they typically learn a group of fixed cluster centroids, which may not be optimal for the new domain distributions. In this paper, we propose a novel transferable deep clustering model that can automatically adapt the cluster centroids according to the distribution of data samples. Rather than learning a fixed set of centroids, our approach introduces a novel attention-based module that can adapt the centroids by measuring their relationship with samples. In addition, we theoretically show that our model is strictly more powerful than some classical clustering algorithms such as k-means or Gaussian Mixture Model (GMM). Experimental results on both synthetic and real-world datasets demonstrate the effectiveness and efficiency of our proposed transfer learning framework, which significantly improves the performance on target domain and reduces the computational cost.

摘要
深度学习在 clustering 领域最近显示出惊人的成功。然而，如何通过获得的知识来导引 clustering 过程中的转移还是一个挑战。现有的深度 clustering 方法通常缺乏对新领域的泛化能力，因为它们通常学习一组固定的集群中心点，这些中心点可能不适合新领域的数据分布。在这篇文章中，我们提出了一种新的可传递深度 clustering 模型，可以自动调整集群中心点根据数据样本的分布。而不是学习固定的集群中心点，我们的方法引入了一种新的注意力基于模块，可以通过测量中心点和样本之间的关系来调整中心点。此外，我们也证明了我们的模型比一些经典的 clustering 算法，如 k-means 或 Gaussian Mixture Model (GMM) 更加强大。实验结果表明，我们的提出的转移学习框架在目标领域中显著提高性能，同时降低计算成本。

Beyond Text: A Deep Dive into Large Language Models’ Ability on Understanding Graph Data

paper_url: http://arxiv.org/abs/2310.04944
repo_url: None
paper_authors: Yuntong Hu, Zheng Zhang, Liang Zhao
for: 本研究旨在评估大语言模型（LLM）在不同的图数据预测任务中的表现，以及是否可以利用图结构提高性能。
methods: 本研究使用了多种示例和任务/数据集选择方式，对 LLM 的表现进行了分析和比较，以评估它们是否可以正确地理解和利用图结构。
results: 研究发现 LLM 在图数据预测任务中的表现有限，特别是在图结构更复杂的任务中。然而， LLM 仍然可以在某些任务中提供高性能，特别是在使用特定的示例和任务/数据集时。这些发现可以帮助我们更好地理解 LLM 在图分析中的能力和局限性。

Abstract
Large language models (LLMs) have achieved impressive performance on many natural language processing tasks. However, their capabilities on graph-structured data remain relatively unexplored. In this paper, we conduct a series of experiments benchmarking leading LLMs on diverse graph prediction tasks spanning node, edge, and graph levels. We aim to assess whether LLMs can effectively process graph data and leverage topological structures to enhance performance, compared to specialized graph neural networks. Through varied prompt formatting and task/dataset selection, we analyze how well LLMs can interpret and utilize graph structures. By comparing LLMs' performance with specialized graph models, we offer insights into the strengths and limitations of employing LLMs for graph analytics. Our findings provide insights into LLMs' capabilities and suggest avenues for further exploration in applying them to graph analytics.

摘要
大型语言模型（LLMs）已经在许多自然语言处理任务中表现出色。然而，它们对图结构数据的能力仍然相当未为人所知。在这篇论文中，我们进行了一系列的实验，测试领先的 LLMs 在多种图预测任务中表现。我们想要评估 LLMS 能否有效地处理图数据，并利用图结构来提高性能，与特殊化图 ней罗网络相比。透过不同的提示格式和任务/数据选择，我们分析了 LLMS 如何理解和利用图结构。通过与特殊化图模型比较 LLMS 的表现，我们提供了关于 LLMS 的优点和局限性，以及在应用它们到图分析方面的可能性。我们的发现可以帮助我们更好地理解 LLMS 的能力，并提供未来应用它们到图分析的方向。

Large Language Models for Spatial Trajectory Patterns Mining

paper_url: http://arxiv.org/abs/2310.04942
repo_url: None
paper_authors: Zheng Zhang, Hossein Amiri, Zhenke Liu, Andreas Züfle, Liang Zhao
for: 这 paper 用于评估大型自然语言模型 (LLMs) 是否可以检测人类空间轨迹异常行为。
methods: 这 paper 使用了 GPT-4 和 Claude-2 等 LLMs，并对它们进行了比较，以评估它们在检测异常行为方面的表现。
results: 研究发现，LLMs 可以达到一定的异常检测性能，而不需要特定的cue。此外，在提供 contextual clues 的情况下，LLMs 的预测效果可以进一步提高。此外，LLMs 还可以提供可读的解释，从而提高了透明度。

Abstract
Identifying anomalous human spatial trajectory patterns can indicate dynamic changes in mobility behavior with applications in domains like infectious disease monitoring and elderly care. Recent advancements in large language models (LLMs) have demonstrated their ability to reason in a manner akin to humans. This presents significant potential for analyzing temporal patterns in human mobility. In this paper, we conduct empirical studies to assess the capabilities of leading LLMs like GPT-4 and Claude-2 in detecting anomalous behaviors from mobility data, by comparing to specialized methods. Our key findings demonstrate that LLMs can attain reasonable anomaly detection performance even without any specific cues. In addition, providing contextual clues about potential irregularities could further enhances their prediction efficacy. Moreover, LLMs can provide reasonable explanations for their judgments, thereby improving transparency. Our work provides insights on the strengths and limitations of LLMs for human spatial trajectory analysis.

摘要
检测人类空间轨迹异常 patrtern可以指示人们的 mobilty 行为发生了动态变化，有应用于各种领域，如传染病监测和老年人护理。 latest advancements in large language models (LLMs) 表明它们可以像人类一样思考。这种可能性具有检测人类空间轨迹异常行为的潜在应用。在这篇论文中，我们进行了实验研究，以评估主流 LLMs 如 GPT-4 和 Claude-2 在分析人类空间轨迹数据中的异常行为检测能力，并与专门的方法进行比较。我们的主要发现表明 LLMS 可以获得合理的异常行为检测性能，而无需任何特定的提示。此外，在提供Contextual 提示后，LLMS 的预测效果可以进一步提高。此外，LLMS 可以提供合理的解释，从而提高透明度。我们的工作提供了 LLMS 对人类空间轨迹分析的强点和局限性。

Statistical Guarantees for Variational Autoencoders using PAC-Bayesian Theory

paper_url: http://arxiv.org/abs/2310.04935
repo_url: https://github.com/diarra2339/pac-bayes-vae
paper_authors: Sokhna Diarra Mbacke, Florence Clerc, Pascal Germain
for: 这篇论文是为了提供关于Variational Autoencoders（VAEs）的理论保证。
methods: 这篇论文使用了PAC-Bayesian理论来提供关于VAEs的统计保证。
results: 这篇论文提供了对VAEs的重建损失的泛化保证，以及输入和生成模型之间的距离的Upper bound。

Abstract
Since their inception, Variational Autoencoders (VAEs) have become central in machine learning. Despite their widespread use, numerous questions regarding their theoretical properties remain open. Using PAC-Bayesian theory, this work develops statistical guarantees for VAEs. First, we derive the first PAC-Bayesian bound for posterior distributions conditioned on individual samples from the data-generating distribution. Then, we utilize this result to develop generalization guarantees for the VAE's reconstruction loss, as well as upper bounds on the distance between the input and the regenerated distributions. More importantly, we provide upper bounds on the Wasserstein distance between the input distribution and the distribution defined by the VAE's generative model.

摘要
以下是文本的简化中文翻译：自它们的创始以来，变量自动编码器（VAEs）在机器学习中变得中心。尽管广泛使用，但还有许多关于它们理论性质的问题未解决。使用PAC-Bayesian理论，这项工作提供了统计保证 для VAEs。首先，我们 derivate了对各个样本的数据生成分布 conditioned posterior distributions的首个PAC-Bayesian bound。然后，我们利用这个结果，为VAE的重建损失提供了总体化保证，以及输入和重新生成分布之间的距离的 Upper bound。更重要的是，我们提供了输入分布和VAE的生成模型定义的 Wasserstein距离的Upper bound。

Crystal-GFN: sampling crystals with desirable properties and constraints

paper_url: http://arxiv.org/abs/2310.04925
repo_url: https://github.com/alexhernandezgarcia/gflownet
paper_authors: Mila AI4Science, Alex Hernandez-Garcia, Alexandre Duval, Alexandra Volokhova, Yoshua Bengio, Divya Sharma, Pierre Luc Carrier, Michał Koziarski, Victor Schmidt
for: 加速材料发现可以帮助减轻气候危机。发现新的固体晶体，如电解质、离子导电器或太阳能电池，可以提高可再生能源生产和储存的效率。
methods: 本文引入了Crystal-GFlowNet，一种基于晶体结构的生成模型。该模型顺序样本晶体的组成物质、空GROUP和晶格参数，并可以采用任何可用的预测模型来对晶体结构的物理和几何约束进行约束。
results: 通过使用新的代理模型训练后，Crystal-GFlowNet能够样本出低能formation crystal structure。

Abstract
Accelerating material discovery holds the potential to greatly help mitigate the climate crisis. Discovering new solid-state crystals such as electrocatalysts, ionic conductors or photovoltaics can have a crucial impact, for instance, in improving the efficiency of renewable energy production and storage. In this paper, we introduce Crystal-GFlowNet, a generative model of crystal structures that sequentially samples a crystal's composition, space group and lattice parameters. This domain-inspired approach enables the flexible incorporation of physical and geometrical constraints, as well as the use of any available predictive model of a desired property as an objective function. We evaluate the capabilities of Crystal-GFlowNet by using as objective the formation energy of a crystal structure, as predicted by a new proxy model trained on MatBench. The results demonstrate that Crystal-GFlowNet is able to sample diverse crystals with low formation energy.

摘要
加速材料发现可能帮助减轻气候危机。发现新的固体晶体，如电化学catalysts、离子导电体或光伏材料，可以有重要影响，例如提高可再生能源生产和存储的效率。在这篇论文中，我们介绍了Crystal-GFlowNet，一种生成模型的晶体结构，可以顺序样本晶体的化学组成、空间群和晶格参数。这种域名称的approach允许采用物理和几何约束，以及使用任何可用的预测模型的所需性能为目标函数。我们通过使用新的proxy模型，训练于MatBench，来评估Crystal-GFlowNet的能力。结果表明，Crystal-GFlowNet能够样本低能formation晶体。

The Conditional Prediction Function: A Novel Technique to Control False Discovery Rate for Complex Models

paper_url: http://arxiv.org/abs/2310.04919
repo_url: None
paper_authors: Yushu Shi, Michael Martens
for: 这个研究的目的是确定哪些变量与结果相关，并在大量可能的预测器中进行选择，以控制false discovery rate（FDR）。
methods: 这个研究使用knockoff检查来进行变量选择，并使用conditional prediction function（CPF）来评估预测模型与结果的关系。
results: 这个研究显示，使用CPF statistics可以提供更高的权威性，并且可以捕捉非线性关系between predictors和结果。实际应用中，这种方法可以帮助选择 truly prognostic variables，并且可以提高预测模型的准确性。

Abstract
In modern scientific research, the objective is often to identify which variables are associated with an outcome among a large class of potential predictors. This goal can be achieved by selecting variables in a manner that controls the the false discovery rate (FDR), the proportion of irrelevant predictors among the selections. Knockoff filtering is a cutting-edge approach to variable selection that provides FDR control. Existing knockoff statistics frequently employ linear models to assess relationships between features and the response, but the linearity assumption is often violated in real world applications. This may result in poor power to detect truly prognostic variables. We introduce a knockoff statistic based on the conditional prediction function (CPF), which can pair with state-of-art machine learning predictive models, such as deep neural networks. The CPF statistics can capture the nonlinear relationships between predictors and outcomes while also accounting for correlation between features. We illustrate the capability of the CPF statistics to provide superior power over common knockoff statistics with continuous, categorical, and survival outcomes using repeated simulations. Knockoff filtering with the CPF statistics is demonstrated using (1) a residential building dataset to select predictors for the actual sales prices and (2) the TCGA dataset to select genes that are correlated with disease staging in lung cancer patients.

摘要
现代科学研究中的目标Frequently是确定一个大类中的变量与结果相关的变量。这个目标可以通过控制False Discovery Rate（FDR）来实现。Knockoff filtering是一种先进的变量选择方法，它提供了FDR控制。现有的knockoff统计 часто使用线性模型来评估特征和响应之间的关系，但在实际应用中，这种假设可能被打砸。这会导致捕捉真正预测变量的能力强化。我们介绍了基于条件预测函数（CPF）的knockoff统计，它可以与当前的机器学习预测模型结合使用，如深度神经网络。CPF统计可以捕捉特征和结果之间的非线性关系，同时也考虑特征之间的相关性。我们通过重复的 simulations 示例，展示了CPF统计在不同类型的输出（连续、分类、生存时间）上的优异能力。knockoff filtering with CPF statistics是使用（1）一个住宅建筑数据集来选择实际销售价格的预测变量，（2）TCGA数据集来选择与肺癌级别相关的基因。

Tight Certified Robustness via Min-Max Representations of ReLU Neural Networks

paper_url: http://arxiv.org/abs/2310.04916
repo_url: None
paper_authors: Brendon G. Anderson, Samuel Pfrommer, Somayeh Sojoudi
for: 这个研究旨在提供对神经网络系统的可靠部署需要严格的Robustness保证。
methods: 这篇论文使用了一种对征测函数的凸形式表示，并使用了最近的分布式 robust optimization的结果来解决这个问题。
results: 这篇论文可以获得严格的Robustness保证，并且可以解决问题中的原始非数据问题。实验结果显示了这个方法的有效性。

Abstract
The reliable deployment of neural networks in control systems requires rigorous robustness guarantees. In this paper, we obtain tight robustness certificates over convex attack sets for min-max representations of ReLU neural networks by developing a convex reformulation of the nonconvex certification problem. This is done by "lifting" the problem to an infinite-dimensional optimization over probability measures, leveraging recent results in distributionally robust optimization to solve for an optimal discrete distribution, and proving that solutions of the original nonconvex problem are generated by the discrete distribution under mild boundedness, nonredundancy, and Slater conditions. As a consequence, optimal (worst-case) attacks against the model may be solved for exactly. This contrasts prior state-of-the-art that either requires expensive branch-and-bound schemes or loose relaxation techniques. Experiments on robust control and MNIST image classification examples highlight the benefits of our approach.

摘要
要确保神经网络在控制系统中可靠地部署，需要强制性的Robustness保证。在这篇论文中，我们获得了对凸攻击集的紧张Robustness证明。这是通过“升级”问题到无穷维度优化中的概率分布来实现的，利用最近的分布式Robust优化技术解决优化问题，并证明原始非 convex问题的解是由离散分布下的解决方案生成的，只要满足某些轻度的约束条件。因此，我们可以解决最优（最差）攻击问题。这与之前的状态艺术技术不同，需要昂贵的分支和缓冲 schemes或者笔势的放松技术。实验表明，我们的方法在Robust控制和MNIST图像识别中具有优异的表现。

A Dual Latent State Learning Approach: Exploiting Regional Network Similarities for QoS Prediction

paper_url: http://arxiv.org/abs/2310.05988
repo_url: None
paper_authors: Ziliang Wang, Xiaohong Zhang, Meng Yan
for: 本文是为了提高服务质量（QoS）预测的精度，而设计的一种深度学习框架。
methods: 本文使用的方法包括建立两个区域网络幽默状态（城市网络幽默状态和AS网络幽默状态），并使用改进的惯性损失函数来解决数据稀缺和标签不均匀问题。
results: 经过实验表明，本文的方法在实际的QoS数据集上表现较为出色，超过了现有的状态艺术方法。

Abstract
Individual objects, whether users or services, within a specific region often exhibit similar network states due to their shared origin from the same city or autonomous system (AS). Despite this regional network similarity, many existing techniques overlook its potential, resulting in subpar performance arising from challenges such as data sparsity and label imbalance. In this paper, we introduce the regional-based dual latent state learning network(R2SL), a novel deep learning framework designed to overcome the pitfalls of traditional individual object-based prediction techniques in Quality of Service (QoS) prediction. Unlike its predecessors, R2SL captures the nuances of regional network behavior by deriving two distinct regional network latent states: the city-network latent state and the AS-network latent state. These states are constructed utilizing aggregated data from common regions rather than individual object data. Furthermore, R2SL adopts an enhanced Huber loss function that adjusts its linear loss component, providing a remedy for prevalent label imbalance issues. To cap off the prediction process, a multi-scale perception network is leveraged to interpret the integrated feature map, a fusion of regional network latent features and other pertinent information, ultimately accomplishing the QoS prediction. Through rigorous testing on real-world QoS datasets, R2SL demonstrates superior performance compared to prevailing state-of-the-art methods. Our R2SL approach ushers in an innovative avenue for precise QoS predictions by fully harnessing the regional network similarities inherent in objects.

摘要
各个对象，无论是用户或服务，在特定区域 often exhibit 相似的网络状态，这是因为它们来自同一个城市或自治系统（AS）的共同起源。 DESPITE THIS REGIONAL NETWORK SIMILARITY， many existing techniques overlook 其潜在可能性，导致subpar performance due to challenges such as data sparsity and label imbalance. In this paper, we introduce the regional-based dual latent state learning network (R2SL), a novel deep learning framework designed to overcome the pitfalls of traditional individual object-based prediction techniques in Quality of Service (QoS) prediction. Unlike its predecessors, R2SL captures the nuances of regional network behavior by deriving two distinct regional network latent states: the city-network latent state and the AS-network latent state. These states are constructed utilizing aggregated data from common regions rather than individual object data. Furthermore, R2SL adopts an enhanced Huber loss function that adjusts its linear loss component, providing a remedy for prevalent label imbalance issues. To cap off the prediction process, a multi-scale perception network is leveraged to interpret the integrated feature map, a fusion of regional network latent features and other pertinent information, ultimately accomplishing the QoS prediction. Through rigorous testing on real-world QoS datasets, R2SL demonstrates superior performance compared to prevailing state-of-the-art methods. Our R2SL approach ushers in an innovative avenue for precise QoS predictions by fully harnessing the regional network similarities inherent in objects.

Regret Analysis of Repeated Delegated Choice

paper_url: http://arxiv.org/abs/2310.04884
repo_url: None
paper_authors: MohammadTaghi Hajiaghayi, Mohammad Mahdavi, Keivan Rezaei, Suho Shin
for: 这个研究是关于一个重复委托选择问题的，这是 Kleinberg 和 Kleinberg 的在线学习变体（EC’18）的第一个考虑。在这个模型中，一位主任与一位代理人进行重复互动，以寻找高效的解决方案。每个解决方案都可以为主任和代理人带来不同的利用价值，并且代理人可以在自私的方式下提议解决方案以 Maximize 自己的利用价值。为了缓解这种行为，主任公布一个合法的集合，以排除一些解决方案。但是，主任没有任何关于解决方案的分布信息。因此，主任会在不同的征文中动态公布不同的合法集合，以高效地学习分布。主任的目标是在审核后比Optimal 合法集合减少征文 regret。
methods: 我们explore 两个维度的问题设置：代理人是否做出myopic 选择，以及解决方案是否具有 deterministic 或 stochastic 的利用价值。我们的分析主要描述了一些情况下，主任可以在不同的征文中减少 regret，并且 shed 光于不同的征文中的升降。
results: 我们的分析主要描述了一些情况下，主任可以在不同的征文中减少 regret，并且 shed 光于不同的征文中的升降。

Abstract
We present a study on a repeated delegated choice problem, which is the first to consider an online learning variant of Kleinberg and Kleinberg, EC'18. In this model, a principal interacts repeatedly with an agent who possesses an exogenous set of solutions to search for efficient ones. Each solution can yield varying utility for both the principal and the agent, and the agent may propose a solution to maximize its own utility in a selfish manner. To mitigate this behavior, the principal announces an eligible set which screens out a certain set of solutions. The principal, however, does not have any information on the distribution of solutions in advance. Therefore, the principal dynamically announces various eligible sets to efficiently learn the distribution. The principal's objective is to minimize cumulative regret compared to the optimal eligible set in hindsight. We explore two dimensions of the problem setup, whether the agent behaves myopically or strategizes across the rounds, and whether the solutions yield deterministic or stochastic utility. Our analysis mainly characterizes some regimes under which the principal can recover the sublinear regret, thereby shedding light on the rise and fall of the repeated delegation procedure in various regimes.

摘要
我们介绍了一项研究，探讨了一种循环委托问题，这是首次考虑了在克林伯格和克林伯格（EC'18）的在线学习变体中。在这个模型中，一位主人与一位代理人进行了多次互动，以寻找高效的解决方案。每个解决方案可以为主人和代理人带来不同的利用价值，而代理人可能会为自己的利益而选择解决方案。为了缓解这种行为，主人公布了合法的集合，从而排除一部分解决方案。然而，主人没有任何关于解决方案的分布的信息。因此，主人在不断更新的集合中 dynamically 宣布了多个合法集合，以有效地学习分布。主人的目标是在后知之前对合法集合进行最小化的恨 regret。我们探讨了两个维度的问题设置：代理人是否偏爱短期目标，以及解决方案是否具有 deterministic 或 stochastic 的利用价值。我们的分析主要描述了一些情况下，主人可以在不同的环境下重新恢复子线性的恨 regret，从而为各种情况下的循环委托过程带来更好的理解。

Randomized Sparse Neural Galerkin Schemes for Solving Evolution Equations with Deep Networks

paper_url: http://arxiv.org/abs/2310.04867
repo_url: https://github.com/julesberman/rsng
paper_authors: Jules Berman, Benjamin Peherstorfer
for: 这项研究的目的是为了解决时间依赖的偏微分方程的解场的近似问题，以保持 causality 和其他物理性质。
methods: 该研究使用了神经网络随时序顺序训练，以 aproximate 时间依赖的偏微分方程的解场。在训练过程中，采用了随机杂化 sparse 网络参数的更新方法，以避免当地过拟合和error 快速积累。
results: 实验表明，提案的方法在各种时间演化方程中比较精度和效率，可以在固定计算预算下提高精度至少两个数量级，并在固定精度下提高速度至少两个数量级。

Abstract
Training neural networks sequentially in time to approximate solution fields of time-dependent partial differential equations can be beneficial for preserving causality and other physics properties; however, the sequential-in-time training is numerically challenging because training errors quickly accumulate and amplify over time. This work introduces Neural Galerkin schemes that update randomized sparse subsets of network parameters at each time step. The randomization avoids overfitting locally in time and so helps prevent the error from accumulating quickly over the sequential-in-time training, which is motivated by dropout that addresses a similar issue of overfitting due to neuron co-adaptation. The sparsity of the update reduces the computational costs of training without losing expressiveness because many of the network parameters are redundant locally at each time step. In numerical experiments with a wide range of evolution equations, the proposed scheme with randomized sparse updates is up to two orders of magnitude more accurate at a fixed computational budget and up to two orders of magnitude faster at a fixed accuracy than schemes with dense updates.

摘要
培训神经网络Sequentially在时间上来 aproximate解析场的时间依赖partial differential equations可以保持 causality和其他物理性质;然而，逐步在时间上培训是数学上具有挑战性，因为培训错误快速积累和增强在时间上。这项工作提出了神经加尔金方案，在每个时间步骤中随机选择神经网络参数的 subset。随机性可以避免在时间上的局部适应，从而防止培训错误快速积累，这与Dropout的解决方案类似。神经网络参数的稀疏更新可以降低培训计算成本，保持表达能力，因为在每个时间步骤中的神经网络参数具有重复性。在一系列的演化方程中的数学实验中，提出的方案与随机 sparse更新比较其他方案在 fix computational budget 下具有更高的准确性，并且在 fix accuracy 下具有更高的计算效率，可以达到两个数量级。

Universal Graph Random Features

paper_url: http://arxiv.org/abs/2310.04859
repo_url: https://github.com/djdprogramming/adfa2
paper_authors: Isaac Reid, Krzysztof Choromanski, Eli Berger, Adrian Weller
for: 本研究的目的是提出一种新的随机游走算法，用于无偏估计图上任意函数的权重邻接矩阵。
methods: 该算法使用随机游走模块化函数，可以在图中实现估计。它的时间复杂度为乘数减少，可以处理更大的图。此外，该算法可以轻松分布在多台机器上，实现大规模学习。
results: 研究人员通过实验和理论分析表明，该算法可以提供更高质量的估计或高效、扩展性的学习。具体来说，该算法可以实现点精度估计、非同Homogeneous图ordinary differential equations、节点划分和kernel regression等任务。

Abstract
We propose a novel random walk-based algorithm for unbiased estimation of arbitrary functions of a weighted adjacency matrix, coined universal graph random features (u-GRFs). This includes many of the most popular examples of kernels defined on the nodes of a graph. Our algorithm enjoys subquadratic time complexity with respect to the number of nodes, overcoming the notoriously prohibitive cubic scaling of exact graph kernel evaluation. It can also be trivially distributed across machines, permitting learning on much larger networks. At the heart of the algorithm is a modulation function which upweights or downweights the contribution from different random walks depending on their lengths. We show that by parameterising it with a neural network we can obtain u-GRFs that give higher-quality kernel estimates or perform efficient, scalable kernel learning. We provide robust theoretical analysis and support our findings with experiments including pointwise estimation of fixed graph kernels, solving non-homogeneous graph ordinary differential equations, node clustering and kernel regression on triangular meshes.

摘要
我们提出了一种新的随机步骤基本算法，用于不偏估任意函数权重图adjacency矩阵中的函数，称为universal graph random features（u-GRF）。这包括许多最流行的节点上定义的kernels的示例。我们的算法在节点数量的减少方面具有下方几何时间复杂度，超越了精确图kernels评估的不orioius cubic scaling。它还可以轻松分布在机器上，允许学习更大的网络。algorithm的核心是一个modulation函数，该函数根据不同的随机步骤长度而增加或减少其贡献。我们示出了使用神经网络参数化这个函数可以得到更高质量的kernel估计或进行高效、可扩展的kernel学习。我们提供了robust的理论分析，并通过包括固定图kernels的点约估、非同Homogeneous图ordinary differential equations、节点划分和kernel regression on triangular meshes等实验来支持我们的发现。

LIPEx – Locally Interpretable Probabilistic Explanations – To Look Beyond The True Class

paper_url: http://arxiv.org/abs/2310.04856
repo_url: None
paper_authors: Hongbo Zhu, Angelo Cangelosi, Procheta Sen, Anirbit Mukherjee
for: 这 paper 的目的是提出一种新的干扰基于的多类解释框架，LIPEx（本地可解释概率解释）。
methods: 这 paper 使用了一种新的方法，即通过在概率分布空间进行回归来定义解释，并通过HELLINGER距离来衡量解释的准确性。
results: 实验表明，LIPEx 可以不仅在本地复制 Complex 分类模型输出的概率分布，还可以提供每个特征对预测概率的解释，并且在文本和图像数据上进行了ablation 测试，显示 LIPEx 在隐藏特征 elimination 方面比其他 saliency-based 或 feature importance-based XAI 方法更加有效。

Abstract
In this work, we instantiate a novel perturbation-based multi-class explanation framework, LIPEx (Locally Interpretable Probabilistic Explanation). We demonstrate that LIPEx not only locally replicates the probability distributions output by the widely used complex classification models but also provides insight into how every feature deemed to be important affects the prediction probability for each of the possible classes. We achieve this by defining the explanation as a matrix obtained via regression with respect to the Hellinger distance in the space of probability distributions. Ablation tests on text and image data, show that LIPEx-guided removal of important features from the data causes more change in predictions for the underlying model than similar tests on other saliency-based or feature importance-based XAI methods. It is also shown that compared to LIME, LIPEx is much more data efficient in terms of the number of perturbations needed for reliable evaluation of the explanation.

摘要
在这个工作中，我们实现了一种新的扰动基于多类解释框架LIPEx（本地可解释概率解释）。我们示出了LIPEx不仅可以在广泛使用的复杂分类模型输出的概率分布上重建本地的概率分布，而且还提供了每个被识别为重要的特征对预测概率的影响的理解。我们实现了这一点通过定义解释为基于地LLOyd Distance在概率分布空间上进行回归的矩阵。我们在文本和图像数据上进行了ablation测试，显示LIPEx引导的删除重要特征从数据中引入的改变对下面模型的预测更大。此外，与LIME相比，LIPEx在数据效率方面远胜一步，需要的扰动数量远少于其他质量基于或特征重要性基于XAI方法。

Epsilon non-Greedy: A Bandit Approach for Unbiased Recommendation via Uniform Data

paper_url: http://arxiv.org/abs/2310.04855
repo_url: None
paper_authors: S. M. F. Sani, Seyed Abbas Hosseini, Hamid R. Rabiee
for: 降低推荐系统中的自回给舌径偏见，并提高推荐系统的多样性和准确性。
methods: 提出一个框架，使用小量的均匀收集的数据来学习无偏观估计器，并专注于生成改进的训练数据 для后续的训练过程。
results: 透过实验显示了与现有的减偏方法相比，提出的模型具有更高的多样性和准确性。

Abstract
Often, recommendation systems employ continuous training, leading to a self-feedback loop bias in which the system becomes biased toward its previous recommendations. Recent studies have attempted to mitigate this bias by collecting small amounts of unbiased data. While these studies have successfully developed less biased models, they ignore the crucial fact that the recommendations generated by the model serve as the training data for subsequent training sessions. To address this issue, we propose a framework that learns an unbiased estimator using a small amount of uniformly collected data and focuses on generating improved training data for subsequent training iterations. To accomplish this, we view recommendation as a contextual multi-arm bandit problem and emphasize on exploring items that the model has a limited understanding of. We introduce a new offline sequential training schema that simulates real-world continuous training scenarios in recommendation systems, offering a more appropriate framework for studying self-feedback bias. We demonstrate the superiority of our model over state-of-the-art debiasing methods by conducting extensive experiments using the proposed training schema.

摘要
通常，推荐系统会使用连续训练，导致一个自适应循环偏见，其中系统会偏好前一些推荐。近期研究尝试了消除这种偏见，通过收集一些无偏见数据。although these studies have successfully developed less biased models, they ignore the crucial fact that the recommendations generated by the model serve as the training data for subsequent training sessions. To address this issue, we propose a framework that learns an unbiased estimator using a small amount of uniformly collected data and focuses on generating improved training data for subsequent training iterations. To accomplish this, we view recommendation as a contextual multi-arm bandit problem and emphasize on exploring items that the model has a limited understanding of. We introduce a new offline sequential training schema that simulates real-world continuous training scenarios in recommendation systems, offering a more appropriate framework for studying self-feedback bias. We demonstrate the superiority of our model over state-of-the-art debiasing methods by conducting extensive experiments using the proposed training schema.Here's the translation breakdown:* 通常 (tōng zhì) - usually* 推荐系统 (pù jiào xì tǒng) - recommendation system* employs (yǐn) - employs* 连续训练 (lián jiè xùn) - continuous training* 自适应循环 (zì shì bìng xún) - self-feedback loop* 偏见 (pēn jiàn) - bias* 前一些 (qián yī xiē) - previous* 推荐 (pù jiào) - recommendations* 收集 (shōu jié) - collect* 无偏见数据 (wú pēn jiàn shù) - unbiased data* 近期研究 (jìn qī yán jí) - recent studies* successfully (gōng cháng) - successfully* developed (fāng zhì) - developed* less biased models (liǎo pēn jiàn mó delè) - less biased models* ignore (yì qù) - ignore* crucial fact (zhì zhèng shí) - crucial fact* 系统会偏好前一些推荐 (xiàng zhì bìng hǎo qián yī xiē pù jiào) - the system will have a preference for previous recommendations* To address this issue (dāng zhèng yì bù) - to address this issue* 我们提出一个框架 (wǒ men tím zhāng yī jīng) - we propose a framework* 学习一个无偏见估计器 (xué xí yī jī without bias estimator) - learn an unbiased estimator* 使用一小量 uniformly collected data (fù yǐn yī xiǎo liàng uniform collected data) - use a small amount of uniformly collected data* 并对后续训练进行改进 (bìng duì hòu xiù xùn zhì gòng gòng) - and improve subsequent training* To accomplish this (dāng zhèng yì bù) - to accomplish this* 我们视推荐为 contextual multi-arm bandit problem (wǒ men wèi pù jiào as contextual multi-arm bandit problem) - we view recommendation as a contextual multi-arm bandit problem* 强调在模型有限理解的 item (qiáng dǎo zài mó delè yǒu xiǎn lǐ yǐn de item) - emphasize on exploring items that the model has a limited understanding of* 我们引入一种新的 offline sequential training schema (wǒ men yìn rù yī xiāng xīn de offline sequential training schema) - we introduce a new offline sequential training schema* 这种方法可以更好地 simulate real-world continuous training scenarios (zhè zhòng fāng yì kěn hǎo de chéng shí yī jiàn shì zhèng) - this method can better simulate real-world continuous training scenarios* 我们通过对比 experiment results (wǒ men tōng zhì yǐn bǐ experiment results) - we demonstrate the superiority of our model over state-of-the-art debiasing methods by conducting extensive experiments using the proposed training schema.

Repelling Random Walks

paper_url: http://arxiv.org/abs/2310.04854
repo_url: None
paper_authors: Isaac Reid, Eli Berger, Krzysztof Choromanski, Adrian Weller
for: 提高图形基于采样的效率，使统计估计更集中。
methods: 使用吸引相互作用的ensemble，使每个步骤的概率不变，从而更好地探索图形。
results: 在不同的设置中，包括图kernels估计、PageRank向量和图лет集中心估计等，证明了repelling random walks的效果。提供了详细的实验评估和理论保证。

Abstract
We present a novel quasi-Monte Carlo mechanism to improve graph-based sampling, coined repelling random walks. By inducing correlations between the trajectories of an interacting ensemble such that their marginal transition probabilities are unmodified, we are able to explore the graph more efficiently, improving the concentration of statistical estimators whilst leaving them unbiased. The mechanism has a trivial drop-in implementation. We showcase the effectiveness of repelling random walks in a range of settings including estimation of graph kernels, the PageRank vector and graphlet concentrations. We provide detailed experimental evaluation and robust theoretical guarantees. To our knowledge, repelling random walks constitute the first rigorously studied quasi-Monte Carlo scheme correlating the directions of walkers on a graph, inviting new research in this exciting nascent domain.

摘要
我团队提出了一种新的 quasi-Monte Carlo 机制，称为吸引随机步行（repelling random walks），用于改进基于图的采样。我们通过控制互动 ensemble 的轨迹相互关联，使其 marginal transition probabilities 不受变化，从而更好地探索图，提高统计估计器的集中程度，保持它们无偏性。该机制具有轻松实现的 Drop-in 实现。我们在多个设置中展示了 repelling random walks 的效果，包括图kernels 的估计、PageRank вектор和 graphlet 的吸引度。我们提供了详细的实验评估和坚实的理论保证。我们认为，repelling random walks 是首个正式研究基于图的 quasi-Monte Carlo 机制，吸引了新的研究在这一领域。

HyperSINDy: Deep Generative Modeling of Nonlinear Stochastic Governing Equations

paper_url: http://arxiv.org/abs/2310.04832
repo_url: None
paper_authors: Mozes Jacobs, Bingni W. Brunton, Steven L. Brunton, J. Nathan Kutz, Ryan V. Raut
for: 该 paper 是为了探索数据驱动的 governing differential equations 的开放前ier。
methods: 该 paper 使用了 HyperSINDy 框架，该框架是一种基于 deep generative model 的 sparse governing equations 模型，通过 variational encoder 和 hypernetwork 来学习 differential equation 的参数。
results: 该 paper 实验表明，HyperSINDy 可以准确地回归真实的杂性 governing equations，并且学习出来的杂性几乎与数据中的杂性相同，同时也提供了高维系统中的uncertainty quantification。

Abstract
The discovery of governing differential equations from data is an open frontier in machine learning. The sparse identification of nonlinear dynamics (SINDy) \citep{brunton_discovering_2016} framework enables data-driven discovery of interpretable models in the form of sparse, deterministic governing laws. Recent works have sought to adapt this approach to the stochastic setting, though these adaptations are severely hampered by the curse of dimensionality. On the other hand, Bayesian-inspired deep learning methods have achieved widespread success in high-dimensional probabilistic modeling via computationally efficient approximate inference techniques, suggesting the use of these techniques for efficient stochastic equation discovery. Here, we introduce HyperSINDy, a framework for modeling stochastic dynamics via a deep generative model of sparse governing equations whose parametric form is discovered from data. HyperSINDy employs a variational encoder to approximate the distribution of observed states and derivatives. A hypernetwork \citep{ha_hypernetworks_2016} transforms samples from this distribution into the coefficients of a differential equation whose sparse form is learned simultaneously using a trainable binary mask \citep{louizos_learning_2018}. Once trained, HyperSINDy generates stochastic dynamics via a differential equation whose coefficients are driven by a Gaussian white noise. In experiments, HyperSINDy accurately recovers ground truth stochastic governing equations, with learned stochasticity scaling to match that of the data. Finally, HyperSINDy provides uncertainty quantification that scales to high-dimensional systems. Taken together, HyperSINDy offers a promising framework for model discovery and uncertainty quantification in real-world systems, integrating sparse equation discovery methods with advances in statistical machine learning and deep generative modeling.

摘要
发现数据中的束缚方程是机器学习的开放前ier。SINDy框架（\cite{brunton_discovering_2016}）使得数据驱动的发现可以获得可读的模型，这些模型是稀疏的、决定的 governing laws。然而，在Stochastic setting中，这些方法受到维度约束的困扰。与此同时， bayesian-inspired deep learning方法在高维probabilistic模型中得到了广泛的成功，这些方法通过计算效率高的approximate inference技术来实现。在这种情况下，我们引入了HyperSINDy框架，它通过深度生成模型来模型 Stochastic dynamics，并使用可变的binary mask来学习稀疏的 governing equations。一旦训练完成，HyperSINDy可以生成Stochastic dynamics，其中的koefficients是由Gaussian white noise驱动的。在实验中，HyperSINDy能够准确地回归真实的Stochastic governing equations，并且学习到的Stochasticity可以与数据中的Stochasticity相匹配。此外，HyperSINDy还提供了可靠的uncertainty quantification，可以应用于高维系统。综上所述，HyperSINDy表示了一种有前途的模型发现和uncertainty quantification框架，它将SINDy框架与统计机器学习和深度生成模型相结合。

Critique Ability of Large Language Models

paper_url: http://arxiv.org/abs/2310.04815
repo_url: None
paper_authors: Liangchen Luo, Zi Lin, Yinxiao Liu, Lei Shu, Yun Zhu, Jingbo Shang, Lei Meng
for: 这项研究旨在探讨大语言模型（LLM）是否能够提供准确的批判。
methods: 该研究使用了一个统一的评估框架，称为CriticBench，来评估 LLM 的批判能力。 CriticBench 包括了 3000 个高质量的自然语言问题和对应的模型回答，并对这些回答进行了注释。
results: 研究发现，大多数 LLM 在批判任务中表现不佳，而且自我批判是特别困难的。 Even top-performing LLMs 在自我批判任务中表现不满足。此外，模型在不确定性最高的问题上的批判精度较低。为解决这个问题，研究人员提出了一种简单 yet effective 的基线方案，称为自我检查，可以使用自我批判来提高任务性能。

Abstract
Critical thinking is essential for rational decision-making and problem-solving. This skill hinges on the ability to provide precise and reasoned critiques and is a hallmark of human intelligence. In the era of large language models (LLMs), this study explores the ability of LLMs to deliver accurate critiques across various tasks. We are interested in this topic as a capable critic model could not only serve as a reliable evaluator, but also as a source of supervised signals for model tuning. Particularly, if a model can self-critique, it has the potential for autonomous self-improvement. To examine this, we introduce a unified evaluation framework for assessing the critique abilities of LLMs. We develop a benchmark called CriticBench, which comprises 3K high-quality natural language queries and corresponding model responses; and annotate the correctness of these responses. The benchmark cover tasks such as math problem-solving, code completion, and question answering. We evaluate multiple LLMs on the collected dataset and our analysis reveals several noteworthy insights: (1) Critique is generally challenging for most LLMs, and this capability often emerges only when models are sufficiently large. (2) In particular, self-critique is especially difficult. Even top-performing LLMs struggle to achieve satisfactory performance. (3) Models tend to have lower critique accuracy on problems where they are most uncertain. To this end, we introduce a simple yet effective baseline named self-check, which leverages self-critique to improve task performance for various models. We hope this study serves as an initial exploration into understanding the critique abilities of LLMs, and aims to inform future research, including the development of more proficient critic models and the application of critiques across diverse tasks.

摘要
<>translate_language Simplified Chinese;critical_thinking 是必备的，以便进行 rational decision-making 和 problem-solving。这种技能取决于能够提供精准和理据的批评，是人类智能的标志。在大语言模型（LLM）时代，这项研究探讨了 LLM 是否能够提供准确的批评。我们对这个话题感兴趣，因为一个能够自我批评的模型不仅可以成为可靠的评估器，而且还可以用于自主改进。为了探讨这一点，我们提出了一个统一的评估框架，用于评估 LLM 的批评能力。我们开发了一个名为 CriticBench 的benchmark，该benchmark包括 3K 个高质量的自然语言问题和对应的模型回答；并对这些回答进行了正确性的标注。该benchmark覆盖了数学问题解决、代码完成和问答等任务。我们对收集到的数据进行了多个 LLM 的评估，我们的分析发现了一些有趣的发现：1. 批评通常是 LLM 最大的挑战之一，并且这种能力通常只在模型够大时出现。2. 特别是自我批评是 LLM 最大的挑战之一。even top-performing LLMs 很难达到满意的性能。3. 模型在它们最不确定的问题上的批评精度通常较低。为了解决这一问题，我们提出了一种简单 yet effective 的基线名为 self-check，它利用自我批评来提高任务性能。我们希望这项研究能够作为 LLM 批评能力的初步探讨，并希望这项研究能够导向未来的研究，包括开发更具批评能力的模型和在多种任务上应用批评。

Applications of Littlestone dimension to query learning and to compression

paper_url: http://arxiv.org/abs/2310.04812
repo_url: None
paper_authors: Hunter Chase, James Freitag, Lev Reyzin
for: 这个论文提供了几种利用Littlestone维度的应用，包括对\cite{angluin2017power}模型的扩展，以及对无穷域概念类型的扩展。
methods: 该论文使用了Littlestone维度来扩展\cite{angluin2017power}模型，并使用随机counterexample来学习。它还使用了扩展$d$-压缩算法来提高结果。
results: 该论文提出了一个强版的\cite{floyd1995sample} conjecture，证明了Littlestone维度与扩展$d$-压缩算法之间的关系。

Abstract
In this paper we give several applications of Littlestone dimension. The first is to the model of \cite{angluin2017power}, where we extend their results for learning by equivalence queries with random counterexamples. Second, we extend that model to infinite concept classes with an additional source of randomness. Third, we give improved results on the relationship of Littlestone dimension to classes with extended $d$-compression schemes, proving a strong version of a conjecture of \cite{floyd1995sample} for Littlestone dimension.

摘要
在这篇论文中，我们提供了一些对带纹度（Littlestone dimension）的应用。第一个是对《\cite{angluin2017power}》中的学习模型进行扩展，使其可以通过 equipollence 查询和随机反例来学习。第二，我们将这个模型扩展到无穷无数的概念类型，并添加了随机性作为另一种来源。第三，我们提供了对带纹度与 extended $d$-压缩表示法之间的关系的改进结果，证明了带纹度的强版推测。

Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning

paper_url: http://arxiv.org/abs/2310.04796
repo_url: None
paper_authors: Jiayu Chen, Zelai Xu, Yunfei Li, Chao Yu, Jiaming Song, Huazhong Yang, Fei Fang, Yu Wang, Yi Wu
for: 提高复杂零点游戏中多智能机器人学习（MARL）的效率，使其更加快速地学习 Nash 平衡（NE）。
methods: 使用 curriculum learning 方法，采用适应性的初始状态分布，并采用 particle-based state sampler 来生成子游戏。
results: 在 particle-world 环境和 Google Research Football 环境中，SACL 生成了更强的策略，并在复杂的 hide-and-seek quadrant 环境中生成了所有四个 emergent stage，使用的样本数量只是 MAPPO 自我玩家的一半。

Abstract
Learning Nash equilibrium (NE) in complex zero-sum games with multi-agent reinforcement learning (MARL) can be extremely computationally expensive. Curriculum learning is an effective way to accelerate learning, but an under-explored dimension for generating a curriculum is the difficulty-to-learn of the subgames -- games induced by starting from a specific state. In this work, we present a novel subgame curriculum learning framework for zero-sum games. It adopts an adaptive initial state distribution by resetting agents to some previously visited states where they can quickly learn to improve performance. Building upon this framework, we derive a subgame selection metric that approximates the squared distance to NE values and further adopt a particle-based state sampler for subgame generation. Integrating these techniques leads to our new algorithm, Subgame Automatic Curriculum Learning (SACL), which is a realization of the subgame curriculum learning framework. SACL can be combined with any MARL algorithm such as MAPPO. Experiments in the particle-world environment and Google Research Football environment show SACL produces much stronger policies than baselines. In the challenging hide-and-seek quadrant environment, SACL produces all four emergent stages and uses only half the samples of MAPPO with self-play. The project website is at https://sites.google.com/view/sacl-rl.

摘要
学习奈什均衡（NE）在复杂的零点游戏中的多智能体学习（MARL）可以非常 computationally expensive。课程学习是一种有效的加速学习方法，但是一个未经探索的维度是难度学习子游戏 -- 由特定状态开始的游戏。在这项工作中，我们提出了一种新的子游戏课程学习框架 для零点游戏。它采用了一种自适应的初始状态分布，通过将代理人重置到之前访问过的状态中，以便快速提高性能。基于这个框架，我们 derivate了一个子游戏选择度量，该度量约等于均衡值的平方距离。其后，我们采用一种粒子基本的状态采样器来生成子游戏。将这些技术集成到我们的新算法中，我们称之为Subgame Automatic Curriculum Learning（SACL）。SACL可以与任何MARL算法结合，如MAPPO。实验在团子环境和Google研究足球环境中表明，SACL生成了强一点的策略，并且在具有挑战性的隐藏和找猎 quadrant环境中生成了所有的四个 Emergent 阶段，并且只使用了半个MAPPO的自我玩家样本。项目网站的地址是https://sites.google.com/view/sacl-rl。

Conditional Diffusion Model for Target Speaker Extraction

paper_url: http://arxiv.org/abs/2310.04791
repo_url: None
paper_authors: Theodor Nguyen, Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C Woodland
For: 提出了一种基于分布式生成模型的目标 speaker 提取方法（DiffSpEx），用于提取混合来源中的目标 speaker。* Methods: 使用了连续时间的随机扩散过程在复杂短时间傅立叶Transform领域，从目标 speaker 源开始，向 Gaussian 分布中心的混合来源 converges。在反时间过程中，使用了参数化的得分函数，根据目标 speaker 嵌入来提取目标 speaker。* Results: 在WSJ0-2mix dataset 上运行，实现了 SI-SDR 12.9 dB 和 NISQA 分数 3.56。此外，我们还示出了在特定 speaker 的 fine-tuning 可以进一步提高性能，启用了个性化目标 speaker 提取。

Abstract
We propose DiffSpEx, a generative target speaker extraction method based on score-based generative modelling through stochastic differential equations. DiffSpEx deploys a continuous-time stochastic diffusion process in the complex short-time Fourier transform domain, starting from the target speaker source and converging to a Gaussian distribution centred on the mixture of sources. For the reverse-time process, a parametrised score function is conditioned on a target speaker embedding to extract the target speaker from the mixture of sources. We utilise ECAPA-TDNN target speaker embeddings and condition the score function alternately on the SDE time embedding and the target speaker embedding. The potential of DiffSpEx is demonstrated with the WSJ0-2mix dataset, achieving an SI-SDR of 12.9 dB and a NISQA score of 3.56. Moreover, we show that fine-tuning a pre-trained DiffSpEx model to a specific speaker further improves performance, enabling personalisation in target speaker extraction.

摘要
我们提出了DiffSpEx，一种基于得分型生成模型的目标说话人提取方法。DiffSpEx通过在短时 Fourier transform 领域中使用概率扩散过程来实现连续时间的生成，从目标说话人源开始，向 Gaussian 分布中心点转化。在反时间过程中，我们使用 Parametrized 得分函数，通过 conditioning target speaker embedding来提取目标说话人。我们使用 ECAPA-TDNN 目标说话人嵌入，并在 SDE 时间嵌入和目标说话人嵌入之间交互 conditioning。我们在 WSJ0-2mix 数据集上实现了 SI-SDR 12.9 dB 和 NISQA 分数 3.56。此外，我们还证明了在预训练 DiffSpEx 模型后进行个性化训练可以进一步提高性能，实现目标说话人个性化提取。

Online Corrupted User Detection and Regret Minimization

paper_url: http://arxiv.org/abs/2310.04768
repo_url: https://github.com/JizeXie/Online-Corrupted-User-Detection-and-Regret-Minimization
paper_authors: Zhiyong Wang, Jize Xie, Tong Yu, Shuai Li, John C. S. Lui
for: 学习和识别在线恶意用户的多用户场景中的恶意用户
methods: 提出了一种新的带刺恶意抑制算法RCLUB-WCU，以及一种基于RCLUB-WCU的在线检测算法OCCUD
results: 对比前一代算法，其在多用户场景下的性能和检测精度均达到了新的高水平。

Abstract
In real-world online web systems, multiple users usually arrive sequentially into the system. For applications like click fraud and fake reviews, some users can maliciously perform corrupted (disrupted) behaviors to trick the system. Therefore, it is crucial to design efficient online learning algorithms to robustly learn from potentially corrupted user behaviors and accurately identify the corrupted users in an online manner. Existing works propose bandit algorithms robust to adversarial corruption. However, these algorithms are designed for a single user, and cannot leverage the implicit social relations among multiple users for more efficient learning. Moreover, none of them consider how to detect corrupted users online in the multiple-user scenario. In this paper, we present an important online learning problem named LOCUD to learn and utilize unknown user relations from disrupted behaviors to speed up learning, and identify the corrupted users in an online setting. To robustly learn and utilize the unknown relations among potentially corrupted users, we propose a novel bandit algorithm RCLUB-WCU. To detect the corrupted users, we devise a novel online detection algorithm OCCUD based on RCLUB-WCU's inferred user relations. We prove a regret upper bound for RCLUB-WCU, which asymptotically matches the lower bound with respect to $T$ up to logarithmic factors, and matches the state-of-the-art results in degenerate cases. We also give a theoretical guarantee for the detection accuracy of OCCUD. With extensive experiments, our methods achieve superior performance over previous bandit algorithms and high corrupted user detection accuracy.

摘要
在现实世界上的在线网络系统中，多个用户通常会sequentially进入系统。对于应用程序如click fraud和假评论，一些用户可能会有恶意的行为来欺骗系统。因此，设计高效的在线学习算法是非常重要的，以robustly从潜在的损害用户行为中学习，并快速确定损害用户。现有的工作提出了对抗恶意损害的bandit算法。然而，这些算法只适用于单个用户，无法利用多个用户之间的隐式社交关系以更高效地学习。另外，它们没有考虑在多用户场景中如何在线检测损害用户。在这篇论文中，我们提出了一个重要的在线学习问题，named LOCUD，以学习和利用不确定用户关系从损害行为中快速学习，并在线确定损害用户。为了robustly学习和利用不确定用户之间的关系，我们提出了一种新的bandit算法RCLUB-WCU。为了在线检测损害用户，我们设计了一种基于RCLUB-WCU的推理用户关系的检测算法OCCUD。我们证明了RCLUB-WCU的 regretUpper bound，其在T approached to infinity asymptotically与低 bound相匹配，并与现状最佳结果在不良情况下匹配。我们还给出了检测精度的理论保证。通过广泛的实验，我们的方法在前一代bandit算法之上 achieved superior performance和高的损害用户检测精度。

Robust Low-Rank Matrix Completion via a New Sparsity-Inducing Regularizer

paper_url: http://arxiv.org/abs/2310.04762
repo_url: None
paper_authors: Zhi-Yong Wang, Hing Cheung So, Abdelhak M. Zoubir
for: 这篇论文提出了一种新的损失函数，即混合普通-Welsch（HOW）函数，以及一种基于HOW函数的稀疏化规范。
methods: 该论文提出了一种新的规范，即启用HOW函数的规范，并证明了这个规范是 quasi-convex 的。此外，该论文还提出了一种基于 alternating direction method of multipliers 的高效算法。
results: 实验结果表明，相比非整数正则化函数（如 lp-norm 函数），提出的规范在矩阵完成问题中表现更加优秀，并且可以在 synthetic 和实际世界数据上达到更高的Restoration性能。

Abstract
This paper presents a novel loss function referred to as hybrid ordinary-Welsch (HOW) and a new sparsity-inducing regularizer associated with HOW. We theoretically show that the regularizer is quasiconvex and that the corresponding Moreau envelope is convex. Moreover, the closed-form solution to its Moreau envelope, namely, the proximity operator, is derived. Compared with nonconvex regularizers like the lp-norm with 0

摘要

Unit Commitment Predictor With a Performance Guarantee: A Support Vector Machine Classifier

paper_url: http://arxiv.org/abs/2310.08601
repo_url: None
paper_authors: Farzaneh Pourahmadi, Jalal Kazempour
for: 这篇论文的目的是提供一个实用的解决方案，以帮助系统操作人员在短时间内解决大规模的发电单位调度问题。
methods: 这篇论文使用了学习和预测核心发电单位的启用/停用决策，以实现系统操作人员可以快速整合启用数据，并且提供了一个稳定的性能保证。
results: 根据IEEE 6-bus和118-bus测试系统，这篇论文的结果显示，使用kernelized SVM核函数预测器，并且适当地调整正规化，可以实现更好的性能，提高computational time的速度。此外，如果时间限制很紧，即使发电单位调度问题无法在时间限制内解决，但是可以使用暖启动的方法，仍可以在时间限制内解决问题。

Abstract
The system operators usually need to solve large-scale unit commitment problems within limited time frame for computation. This paper provides a pragmatic solution, showing how by learning and predicting the on/off commitment decisions of conventional units, there is a potential for system operators to warm start their solver and speed up their computation significantly. For the prediction, we train linear and kernelized support vector machine classifiers, providing an out-of-sample performance guarantee if properly regularized, converting to distributionally robust classifiers. For the unit commitment problem, we solve a mixed-integer second-order cone problem. Our results based on the IEEE 6-bus and 118-bus test systems show that the kernelized SVM with proper regularization outperforms other classifiers, reducing the computational time by a factor of 1.7. In addition, if there is a tight computational limit, while the unit commitment problem without warm start is far away from the optimal solution, its warmly started version can be solved to optimality within the time limit.

摘要
系统运维人员通常需要在有限时间内解决大规模单位承诺问题。这篇论文提供了一种实用的解决方案，显示通过学习和预测传统单位的启动和停止决策，有可能为系统运维人员提供快速的计算。为预测，我们训练了线性和核kernel支持向量机分类器，提供了外样性能 garantía，转化为分布robust分类器。对于单位承诺问题，我们解决了杂合二次 cone 问题。我们的结果基于IEEE 6-bus和118-bus测试系统表明，使用适当规则化的核kernel SVM可以超越其他分类器，降低计算时间约为1.7倍。此外，如果计算时间有紧张的限制，而单位承诺问题无暖启动情况下的计算结果远离优解，而暖启动后的计算结果可以在有限时间内达到优解。

Digital Twin Assisted Deep Reinforcement Learning for Online Optimization of Network Slicing Admission Control

paper_url: http://arxiv.org/abs/2310.09299
repo_url: None
paper_authors: Zhenyu Tao, Wei Xu, Xiaohu You
for:The paper aims to address the initial instability issue of Deep Reinforcement Learning (DRL) models in admission control for network slicing in 5G and beyond networks.methods:The proposed solution uses a digital twin (DT) assisted DRL approach, which involves formulating the admission decision-making process as a semi-Markov decision process and simplifying it into an equivalent discrete-time Markov decision process. The DT is established through supervised learning and employed to assist the training phase of the DRL model.results:The DT-assisted DRL model achieved over 40% increase in resource utilization compared to the directly trained state-of-the-art Dueling-DQN and over 20% compared to the directly trained DRL model during initial training, while preserving the model’s capacity to optimize long-term rewards.Here’s the Simplified Chinese version of the information:for:本研究旨在解决 Deep Reinforcement Learning (DRL) 模型在网络压力控制中初始稳定性问题。methods:提议的解决方案使用了数字双子 (DT) 助け的 DRL 方法，其中将招召决策过程模型化为半Markov决策过程，并将其简化为等效的 discrete-time Markov 决策过程。 DT 通过监督学习建立，并用于 DRL 模型训练阶段的助手。results:DT 助け的 DRL 模型在初始训练阶段实现了资源利用率提高超过 40%，比直接训练的 state-of-the-art Dueling-DQN 和直接训练的 DRL 模型提高了超过 20%。此外，该模型还能保持长期奖励优化的能力。

Abstract
The proliferation of diverse network services in 5G and beyond networks has led to the emergence of network slicing technologies. Among these, admission control plays a crucial role in achieving specific optimization goals through the selective acceptance of service requests. Although Deep Reinforcement Learning (DRL) forms the foundation in many admission control approaches for its effectiveness and flexibility, the initial instability of DRL models hinders their practical deployment in real-world networks. In this work, we propose a digital twin (DT) assisted DRL solution to address this issue. Specifically, we first formulate the admission decision-making process as a semi-Markov decision process, which is subsequently simplified into an equivalent discrete-time Markov decision process to facilitate the implementation of DRL methods. The DT is established through supervised learning and employed to assist the training phase of the DRL model. Extensive simulations show that the DT-assisted DRL model increased resource utilization by over 40\% compared to the directly trained state-of-the-art Dueling-DQN and over 20\% compared to our directly trained DRL model during initial training. This improvement is achieved while preserving the model's capacity to optimize the long-term rewards.

摘要
“5G以及以后的网络服务多样性的兴盛，导致网络排他控制技术的出现。其中，入选控制具有重要的优化目标 дости度，通过选择性地接受服务请求。深度循环学习（DRL）在许多入选控制方法中扮演了基础，因为它的效果和灵活性。但是，DRL模型的初始不稳定性限制了它们在实际网络中的实际应用。在这个工作中，我们提出了一个“数字双子”（DT）协助DRL解决方案。具体来说，我们首先将入选决策过程转换为半markt状况过程，然后将其简化为可以实现DRL方法的类似碎时间Markov做决策过程。DT通过监督学习被建立，并用于协助DRL模型训练阶段。实际验证表明，DT协助DRL模型可以提高资源利用率高于40%，比直接训练的现有Dueling-DQN和我们直接训练的DRL模型在初期训练阶段。这个提高是同时保持模型的长期回报优化能力。”

Parameter Efficient Multi-task Model Fusion with Partial Linearization

paper_url: http://arxiv.org/abs/2310.04742
repo_url: None
paper_authors: Anke Tang, Li Shen, Yong Luo, Yibing Zhan, Han Hu, Bo Du, Yixin Chen, Dacheng Tao
for: 这个研究的目的是提高多任务模型融合的效率和数据� astriction，使得可以更好地搭配不同任务的模型。
methods: 这个研究使用了一种新的方法，即只部分线性化adapter模组，然后将不同任务的模型组合成一个多任务模型。这个方法可以充分利用模型融合的优点，同时仍然能够实现精确的 fine-tuning 和测试。
results: 实验结果显示，这个方法可以更好地融合多个任务，并且在增加任务数量时表现更好。相比于标准的参数有效率调整方法，这个方法可以更好地搭配不同任务的模型，并且可以更好地减少调整的参数数量。

Abstract
Large pre-trained models have enabled significant advances in machine learning and served as foundation components. Model fusion methods, such as task arithmetic, have been proven to be powerful and scalable to incorporate fine-tuned weights from different tasks into a multi-task model. However, efficiently fine-tuning large pre-trained models on multiple downstream tasks remains challenging, leading to inefficient multi-task model fusion. In this work, we propose a novel method to improve multi-task fusion for parameter-efficient fine-tuning techniques like LoRA fine-tuning. Specifically, our approach partially linearizes only the adapter modules and applies task arithmetic over the linearized adapters. This allows us to leverage the the advantages of model fusion over linearized fine-tuning, while still performing fine-tuning and inference efficiently. We demonstrate that our partial linearization technique enables a more effective fusion of multiple tasks into a single model, outperforming standard adapter tuning and task arithmetic alone. Experimental results demonstrate the capabilities of our proposed partial linearization technique to effectively construct unified multi-task models via the fusion of fine-tuned task vectors. We evaluate performance over an increasing number of tasks and find that our approach outperforms standard parameter-efficient fine-tuning techniques. The results highlight the benefits of partial linearization for scalable and efficient multi-task model fusion.

摘要
具体来说，我们的方法只是部分 Linearize 适配器模块，然后对Linearized的适配器进行任务加法。这样可以利用模型融合的优势，同时仍然可以进行 fine-tuning 和推理高效。我们的partial linearization technique可以更好地将多个任务融合到一个单独的模型中，超过标准适配器调整和任务加法alone。我们的方法可以更有效地构建多任务模型，并且在增加任务数量时表现更好。我们的实验结果表明，我们的partial linearization technique可以更好地实现多任务模型的融合，并且可以高效地进行 fine-tuning 和推理。我们在增加任务数量时评估了我们的方法，并发现它在标准参数有效 fine-tuning 技术的基础上表现更好。这些结果表明，我们的partial linearization technique可以为可扩展和高效的多任务模型融合提供了很好的 benefits。

Subspace Identification for Multi-Source Domain Adaptation

paper_url: http://arxiv.org/abs/2310.04723
repo_url: None
paper_authors: Zijian Li, Ruichu Cai, Guangyi Chen, Boyang Sun, Zhifeng Hao, Kun Zhang
for: 提高多源频道适应性（MSDA）方法的可靠性和效能，使其能够在实际应用中更好地适应不同频道的变化。
methods: 基于subspace标识理论，开发了一种名为SIG模型，该模型通过变分推断来实现适应。此外，SIG模型还包括了类快捷适应，以适应目标频道的变化。
results: 对多个 benchmark 数据集进行实验，研究发现，SIG模型比现有的 MSDA 技术表现更高效和可靠。

Abstract
Multi-source domain adaptation (MSDA) methods aim to transfer knowledge from multiple labeled source domains to an unlabeled target domain. Although current methods achieve target joint distribution identifiability by enforcing minimal changes across domains, they often necessitate stringent conditions, such as an adequate number of domains, monotonic transformation of latent variables, and invariant label distributions. These requirements are challenging to satisfy in real-world applications. To mitigate the need for these strict assumptions, we propose a subspace identification theory that guarantees the disentanglement of domain-invariant and domain-specific variables under less restrictive constraints regarding domain numbers and transformation properties, thereby facilitating domain adaptation by minimizing the impact of domain shifts on invariant variables. Based on this theory, we develop a Subspace Identification Guarantee (SIG) model that leverages variational inference. Furthermore, the SIG model incorporates class-aware conditional alignment to accommodate target shifts where label distributions change with the domains. Experimental results demonstrate that our SIG model outperforms existing MSDA techniques on various benchmark datasets, highlighting its effectiveness in real-world applications.

摘要
多源领域适应（MSDA）方法目的是将多个标注源领域中的知识传递到无标注目标领域。现有的方法通过保证领域之间的最小变化来实现目标共享分布的准确性，但这些条件frequently是在实际应用中困难或不可能满足。为了缓解这些严格的假设，我们提出了一种子空间识别理论，该理论 garanties the disentanglement of domain-invariant and domain-specific variables under less restrictive constraints regarding domain numbers and transformation properties, thereby facilitating domain adaptation by minimizing the impact of domain shifts on invariant variables。基于这种理论，我们开发了一个Subspace Identification Guarantee（SIG）模型，该模型通过变量极限推理来实现。此外，SIG模型还包括类感 conditional alignment，以适应目标频谱中的变化。实验结果表明，我们的SIG模型在多个 benchmark dataset上表现出色，超过了现有的 MSDA 技术，这有力地表明其在实际应用中的效果。

Offline Imitation Learning with Variational Counterfactual Reasoning

paper_url: http://arxiv.org/abs/2310.04706
repo_url: None
paper_authors: Bowei He, Zexu Sun, Jinxin Liu, Shuai Zhang, Xu Chen, Chen Ma
for: 提高offlineimitasion学习（IL） Agent的灵活性和泛化能力，使其能够在缺乏专家数据的情况下学习优化的专家行为策略。
methods: 我们提出了一种名为OILCA的框架，利用可识别的变量自动编码器生成对应的 counterfactual 样本，并对其进行了理论分析和实验 validate。
results: 我们的方法在具有不同环境变化和缺乏专家数据的情况下，在 \textsc{DeepMind Control Suite} 和 \textsc{CausalWorld} 测试集上显著超过了多种基eline。

Abstract
In offline Imitation Learning (IL), an agent aims to learn an optimal expert behavior policy without additional online environment interactions. However, in many real-world scenarios, such as robotics manipulation, the offline dataset is collected from suboptimal behaviors without rewards. Due to the scarce expert data, the agents usually suffer from simply memorizing poor trajectories and are vulnerable to the variations in the environments, lacking the capability of generalizing to new environments. To effectively remove spurious features that would otherwise bias the agent and hinder generalization, we propose a framework named \underline{O}ffline \underline{I}mitation \underline{L}earning with \underline{C}ounterfactual data \underline{A}ugmentation (OILCA). In particular, we leverage the identifiable variational autoencoder to generate \textit{counterfactual} samples. We theoretically analyze the counterfactual identification and the improvement of generalization. Moreover, we conduct extensive experiments to demonstrate that our approach significantly outperforms various baselines on both \textsc{DeepMind Control Suite} benchmark for in-distribution robustness and \textsc{CausalWorld} benchmark for out-of-distribution generalization.

摘要
Offline Imitation Learning (IL) 是一种尝试学习最佳专家行为策略的 Agent，无需在线环境交互。然而，在许多实际场景中，如机器人操作， Offline 数据集是由不优化的行为而收集的。由于专家数据稀缺， Agent 通常会记忆低质量的轨迹，感受不到环境变化，缺乏新环境泛化能力。为了有效地除掉干扰特征，防止 Agent 受到环境变化的影响，我们提出了名为 OILCA 的框架。具体来说，我们利用可识别性的变分自动编码器来生成 "counterfactual" 样本。我们 theoretically 分析了 counterfactual 识别和改进的泛化性。此外，我们进行了广泛的实验，以示我们的方法在各种基eline上出perform 许多基eline。Specifically, we use an identifiable variational autoencoder to generate counterfactual samples, which are used to remove spurious features that would otherwise bias the agent and hinder generalization. We theoretically analyze the counterfactual identification and the improvement of generalization. Moreover, we conduct extensive experiments to demonstrate that our approach significantly outperforms various baselines on both in-distribution robustness and out-of-distribution generalization.In particular, we use an identifiable variational autoencoder to generate counterfactual samples, which are used to remove spurious features that would otherwise bias the agent and hinder generalization. We theoretically analyze the counterfactual identification and the improvement of generalization. Moreover, we conduct extensive experiments to demonstrate that our approach significantly outperforms various baselines on both in-distribution robustness and out-of-distribution generalization.

paper_url: http://arxiv.org/abs/2310.04701
repo_url: https://github.com/alipay/microservice_system_twin_graph_based_anomaly_detection
paper_authors: Jun Huang, Yang Yang, Hang Yu, Jianguo Li, Xiao Zheng
for: 这个论文旨在提出一种基于图像学的 semi-supervised 方法，用于检测微服务系统中的异常。
methods: 论文使用了多种数据模式（指标、日志和跟踪）的混合学习方法，通过具有异常检测和模式识别功能的 transformer 神经网络来检测异常。
results: 论文通过实验表明，该方法可以准确地检测微服务系统中的异常，并且可以提供实时的异常检测结果。

Abstract
Microservice architecture has sprung up over recent years for managing enterprise applications, due to its ability to independently deploy and scale services. Despite its benefits, ensuring the reliability and safety of a microservice system remains highly challenging. Existing anomaly detection algorithms based on a single data modality (i.e., metrics, logs, or traces) fail to fully account for the complex correlations and interactions between different modalities, leading to false negatives and false alarms, whereas incorporating more data modalities can offer opportunities for further performance gain. As a fresh attempt, we propose in this paper a semi-supervised graph-based anomaly detection method, MSTGAD, which seamlessly integrates all available data modalities via attentive multi-modal learning. First, we extract and normalize features from the three modalities, and further integrate them using a graph, namely MST (microservice system twin) graph, where each node represents a service instance and the edge indicates the scheduling relationship between different service instances. The MST graph provides a virtual representation of the status and scheduling relationships among service instances of a real-world microservice system. Second, we construct a transformer-based neural network with both spatial and temporal attention mechanisms to model the inter-correlations between different modalities and temporal dependencies between the data points. This enables us to detect anomalies automatically and accurately in real-time. The source code of MSTGAD is publicly available at https://github.com/alipay/microservice_system_twin_graph_based_anomaly_detection.

摘要
干脆服务架构在近年来为企业应用程序管理而出现，因为它可以独立部署和缩放服务。然而，确保微服务系统的可靠性和安全性仍然非常困难。现有的异常检测算法基于单一数据模式（例如指标、日志或跟踪）无法完全考虑微服务系统中不同模式之间的复杂相关性和互动，导致假阳性和假警示，而在包含更多数据模式时可以获得更高的性能提升。在这篇论文中，我们提出了一种基于图的半监督异常检测方法，称为MSTGAD，它可以兼容所有可用的数据模式，并通过注意力多模式学习来协调异常检测。首先，我们从三种数据模式中提取和 норализова特征，并使用图，即MST（微服务系统双）图，其中每个节点表示一个服务实例，两个节点之间的边表示这些服务实例之间的调度关系。MST图为实际微服务系统中服务实例的状态和调度关系提供虚拟表示。其次，我们构建了基于变换器的神经网络，其中包括空间和时间注意力机制，以模型不同模式之间的相互关系和时间关系。这使得我们可以在实时中自动和准确地检测异常。MSTGAD的源代码可以在https://github.com/alipay/microservice_system_twin_graph_based_anomaly_detection上获取。

Tight Rates in Supervised Outlier Transfer Learning

paper_url: http://arxiv.org/abs/2310.04686
repo_url: None
paper_authors: Mohammadreza M. Kalan, Samory Kpotufe
for: 本研究旨在探讨在异常检测中如何传输知识，具体来说是在异常检测任务中使用相似 yet imperfect 的异常数据来传输信息。
methods: 本研究采用了传统的Neyman-Pearson分类框架，并假设有访问一些相关 yet imperfect 的异常数据。我们的主要结果是：我们首先确定了异常检测问题下的信息理论上限，并证明这些上限是可实现的，并且可以通过适应过程来实现。
results: 我们的研究结果显示，与传统的平衡分类不同，在异常检测任务中，不同的源数据可以提供大量的信息，从而实现快速的知识传输。

Abstract
A critical barrier to learning an accurate decision rule for outlier detection is the scarcity of outlier data. As such, practitioners often turn to the use of similar but imperfect outlier data from which they might transfer information to the target outlier detection task. Despite the recent empirical success of transfer learning approaches in outlier detection, a fundamental understanding of when and how knowledge can be transferred from a source to a target outlier detection task remains elusive. In this work, we adopt the traditional framework of Neyman-Pearson classification -- which formalizes supervised outlier detection -- with the added assumption that one has access to some related but imperfect outlier data. Our main results are as follows: We first determine the information-theoretic limits of the problem under a measure of discrepancy that extends some existing notions from traditional balanced classification; interestingly, unlike in balanced classification, seemingly very dissimilar sources can provide much information about a target, thus resulting in fast transfer. We then show that, in principle, these information-theoretic limits are achievable by adaptive procedures, i.e., procedures with no a priori information on the discrepancy between source and target outlier distributions.

摘要
一个重要的障碍在精准地学习异常检测的准确决策规则是异常数据的罕见性。因此，实践者们经常会使用类似 pero 不完全的异常数据，从而将信息传递到目标异常检测任务中。虽然latest empirical success of transfer learning approaches in outlier detection中有一定的成功，但是一个基本的理解当when和如何从源到目标异常检测任务中传递知识仍然是 unclear。在这个工作中，我们采用传统的Neyman-Pearson分类框架--这个框架将supervised outlier detection formalized--并假设有一些相关的 pero 不完全的异常数据。我们的主要结果如下：我们首先确定了异常检测问题下的信息论限制，这个限制是基于扩展一些现有的传统平衡分类的不同概念的一种度量。有趣的是，在平衡分类中，看起来非常不相似的源数据可以提供大量的信息给目标，因此可以快速传递。我们然后表明，在理论上，这些信息论限制是可以实现的，即可以通过适应程序来实现。这些适应程序没有对异常分布的先验信息，即可以在不知道源和目标异常分布之间的差异情况下进行学习。

Surgical Gym: A high-performance GPU-based platform for reinforcement learning with surgical robots

paper_url: http://arxiv.org/abs/2310.04676
repo_url: https://github.com/samuelschmidgall/surgicalgym
paper_authors: Samuel Schmidgall, Axel Krieger, Jason Eshraghian
for: 这 paper 是为了提高 робоット助成手术的精度、效率和非侵入性而写的。
methods: 这 paper 使用的方法是深度强化学习方法，以实现手术自动化。
results: 这 paper 的结果表明，使用 Surgical Gym 平台可以减少手术结果的变化和风险。训练时间也比前一代平台快了100-5000倍。

Abstract
Recent advances in robot-assisted surgery have resulted in progressively more precise, efficient, and minimally invasive procedures, sparking a new era of robotic surgical intervention. This enables doctors, in collaborative interaction with robots, to perform traditional or minimally invasive surgeries with improved outcomes through smaller incisions. Recent efforts are working toward making robotic surgery more autonomous which has the potential to reduce variability of surgical outcomes and reduce complication rates. Deep reinforcement learning methodologies offer scalable solutions for surgical automation, but their effectiveness relies on extensive data acquisition due to the absence of prior knowledge in successfully accomplishing tasks. Due to the intensive nature of simulated data collection, previous works have focused on making existing algorithms more efficient. In this work, we focus on making the simulator more efficient, making training data much more accessible than previously possible. We introduce Surgical Gym, an open-source high performance platform for surgical robot learning where both the physics simulation and reinforcement learning occur directly on the GPU. We demonstrate between 100-5000x faster training times compared with previous surgical learning platforms. The code is available at: https://github.com/SamuelSchmidgall/SurgicalGym.

摘要

Modeling non-uniform uncertainty in Reaction Prediction via Boosting and Dropout

paper_url: http://arxiv.org/abs/2310.04674
repo_url: None
paper_authors: Taicheng Guo, Changsheng Ma, Xiuying Chen, Bozhao Nan, Kehan Guo, Shichao Pei, Nitesh V. Chawla, Olaf Wiest, Xiangliang Zhang
for: 预测化学反应的精度性能，尤其是考虑到反应过程中的不确定性。
methods: 提出了一种基于Variational Autoencoder（VAE）框架的方法，通过采用不同的dropout和boosting Ensemble来模型化不确定性。
results: 实验结果表明，提出的方法在USPTO-MIT大型反应预测 benchmark 上表现出色，比基elines有更好的精度性能。

Abstract
Reaction prediction has been recognized as a critical task in synthetic chemistry, where the goal is to predict the outcome of a reaction based on the given reactants. With the widespread adoption of generative models, the Variational Autoencoder(VAE) framework has typically been employed to tackle challenges in reaction prediction, where the reactants are encoded as a condition for the decoder, which then generates the product. Despite effectiveness, these conditional VAE (CVAE) models still fail to adequately account for the inherent uncertainty in reaction prediction, which primarily stems from the stochastic reaction process. The principal limitations are twofold. Firstly, in these CVAE models, the prior is independent of the reactants, leading to a default wide and assumed uniform distribution variance of the generated product. Secondly, reactants with analogous molecular representations are presumed to undergo similar electronic transition processes, thereby producing similar products. This hinders the ability to model diverse reaction mechanisms effectively. Since the variance in outcomes is inherently non-uniform, we are thus motivated to develop a framework that generates reaction products with non-uniform uncertainty. Firstly, we eliminate the latent variable in previous CVAE models to mitigate uncontrol-label noise. Instead, we introduce randomness into product generation via boosting to ensemble diverse models and cover the range of potential outcomes, and through dropout to secure models with minor variations. Additionally, we design a ranking method to union the predictions from boosting and dropout, prioritizing the most plausible products. Experimental results on the largest reaction prediction benchmark USPTO-MIT show the superior performance of our proposed method in modeling the non-uniform uncertainty compared to baselines.

摘要
<>translate text into Simplified ChineseReaction prediction has been recognized as a critical task in synthetic chemistry, where the goal is to predict the outcome of a reaction based on the given reactants. With the widespread adoption of generative models, the Variational Autoencoder(VAE) framework has typically been employed to tackle challenges in reaction prediction, where the reactants are encoded as a condition for the decoder, which then generates the product. Despite effectiveness, these conditional VAE (CVAE) models still fail to adequately account for the inherent uncertainty in reaction prediction, which primarily stems from the stochastic reaction process. The principal limitations are twofold. Firstly, in these CVAE models, the prior is independent of the reactants, leading to a default wide and assumed uniform distribution variance of the generated product. Secondly, reactants with analogous molecular representations are presumed to undergo similar electronic transition processes, thereby producing similar products. This hinders the ability to model diverse reaction mechanisms effectively. Since the variance in outcomes is inherently non-uniform, we are thus motivated to develop a framework that generates reaction products with non-uniform uncertainty. Firstly, we eliminate the latent variable in previous CVAE models to mitigate uncontrol-label noise. Instead, we introduce randomness into product generation via boosting to ensemble diverse models and cover the range of potential outcomes, and through dropout to secure models with minor variations. Additionally, we design a ranking method to union the predictions from boosting and dropout, prioritizing the most plausible products. Experimental results on the largest reaction prediction benchmark USPTO-MIT show the superior performance of our proposed method in modeling the non-uniform uncertainty compared to baselines.中文简体版：<>将文本翻译成中文简体版化学synthesis中的反应预测Task被认为是一个关键任务，目标是根据给定的reactants预测反应的结果。通过大量采用生成模型，Variational Autoencoder（VAE）框架通常被用来解决反应预测挑战，其中reactants被用作生成产品的condition。尽管有效，但这些conditional VAE（CVAE）模型仍然无法准确地考虑反应预测中的内在不确定性，主要来自反应过程的随机性。主要的限制有两点：firstly，在这些CVAE模型中，假设是独立的reactants，导致生成产品的default宽度和假设的uniform分布变iance。secondly，reactantswith相似分子表示会经历类似的电子过程，因此生成相似的产品。这会限制我们模型多样化的反应机理。由于结果的 variance是非常不均匀的，我们因此受动于开发一个框架，可以生成反应产品中的非均匀不确定性。firstly，我们在前一代CVAE模型中消除latent variable，以避免Label noise。相反，我们通过boosting ensemble多种模型，覆盖potential outcome的范围，并通过dropout保持模型中的小变化。此外，我们设计了一种排名方法，以union boosting和dropout的预测结果，优先级最有可能的产品。实验结果表明，我们提议的方法在USPTO-MIT最大反应预测benchmark上表现出优于基线。

Oracle Efficient Algorithms for Groupwise Regret

paper_url: http://arxiv.org/abs/2310.04652
repo_url: None
paper_authors: Krishna Acharya, Eshwar Ram Arunachaleswaran, Sampath Kannan, Aaron Roth, Juba Ziani
for: 这个论文是为了解决在线预测问题，在每个时间步 $t$ 时，一个个体 $x_t$ 出现，我们需要预测其标签。每个个体都有不同的群体特征，例如年龄、性别、种族等，这些群体可能 intersect。我们的目标是在每个子序列中 simultaneously 获得群体不同的 regret guarantee。
methods: 我们使用了一种简单修改的睡眠专家技术，将问题转化为不含群体考虑的外在 regret 问题。我们的方法可以在大型模型集上实现高效，并且具有oracle-efficient 性。
results: 我们的算法可以在多个群体间 simultaneously 获得 regret guarantee，并且在实际数据上进行了广泛的实验，证明了我们的算法在各个群体中都具有显著的预测错误改善。

Abstract
We study the problem of online prediction, in which at each time step $t$, an individual $x_t$ arrives, whose label we must predict. Each individual is associated with various groups, defined based on their features such as age, sex, race etc., which may intersect. Our goal is to make predictions that have regret guarantees not just overall but also simultaneously on each sub-sequence comprised of the members of any single group. Previous work such as [Blum & Lykouris] and [Lee et al] provide attractive regret guarantees for these problems; however, these are computationally intractable on large model classes. We show that a simple modification of the sleeping experts technique of [Blum & Lykouris] yields an efficient reduction to the well-understood problem of obtaining diminishing external regret absent group considerations. Our approach gives similar regret guarantees compared to [Blum & Lykouris]; however, we run in time linear in the number of groups, and are oracle-efficient in the hypothesis class. This in particular implies that our algorithm is efficient whenever the number of groups is polynomially bounded and the external-regret problem can be solved efficiently, an improvement on [Blum & Lykouris]'s stronger condition that the model class must be small. Our approach can handle online linear regression and online combinatorial optimization problems like online shortest paths. Beyond providing theoretical regret bounds, we evaluate this algorithm with an extensive set of experiments on synthetic data and on two real data sets -- Medical costs and the Adult income dataset, both instantiated with intersecting groups defined in terms of race, sex, and other demographic characteristics. We find that uniformly across groups, our algorithm gives substantial error improvements compared to running a standard online linear regression algorithm with no groupwise regret guarantees.

摘要
我们研究在线预测问题上，每个时间步骤 $t$ 时，一个个体 $x_t$ 到来，我们必须预测其标签。每个个体都与不同的群体相关，根据其特征（如年龄、性别、种族等），这些群体可能交叉。我们的目标是获得不同群体的同时 regret guarantee，而不是单纯的总 regret guarantee。先前的研究，如 [Blum & Lykouris] 和 [Lee et al] 提供了吸引人的 regret guarantee，但这些方法 computationally intractable 在大型模型集上。我们展示了一个简单的修改 later 的 sleeping experts 技术，可以将这个问题转换为 absent groupe considerations 的问题。我们的方法具有相似的 regret guarantee，但它们在number of groups 是线性增长的情况下执行，并且是 oracle-efficient 在假设集上。这意味着我们的算法是当number of groups 是 polynomially bounded 且 external-regret problem 可以有效解决时，才会高效。我们的方法可以处理在线 Linear Regression 和 online combinatorial optimization 问题。我们不仅提供了理论上的 regret bound，还进行了广泛的实验，评估我们的算法在 synthetic data 和 Medical costs 和 Adult income 两个真实数据集上的表现。我们发现在所有群体中，我们的算法均匀地提高了错误值。

NPEFF: Non-Negative Per-Example Fisher Factorization

paper_url: http://arxiv.org/abs/2310.04649
repo_url: https://github.com/mmatena/npeff_ref
paper_authors: Michael Matena, Colin Raffel
for: 本研究旨在解释深度学习模型的预测结果，提供一种可靠的解释方法NPEFF，可应用于任何端到端可微分模型。
methods: 本研究使用的NPEFF方法基于Shared Characteristic Processing（SCP）的原理，即处理共同特征的模型参数 subset 的特征。我们对每个例子的 Fisher 信息矩阵进行分解，并将每个分解组件表示为非负向量或rank-1正semidefinite矩阵。
results: 我们通过语言和视觉模型的实验表明，NPEFF可以恢复有意义的参数空间表示，并且这些表示与模型的处理有直接的连接。此外，我们还证明了NPEFF可以探测和修复模型中的潜在错误假设。我们发布了我们的代码，以便进行基于NPEFF的研究。

Abstract
As deep learning models are deployed in more and more settings, it becomes increasingly important to be able to understand why they produce a given prediction, but interpretation of these models remains a challenge. In this paper, we introduce a novel interpretability method called NPEFF that is readily applicable to any end-to-end differentiable model. It operates on the principle that processing of a characteristic shared across different examples involves a specific subset of model parameters. We perform NPEFF by decomposing each example's Fisher information matrix as a non-negative sum of components. These components take the form of either non-negative vectors or rank-1 positive semi-definite matrices depending on whether we are using diagonal or low-rank Fisher representations, respectively. For the latter form, we introduce a novel and highly scalable algorithm. We demonstrate that components recovered by NPEFF have interpretable tunings through experiments on language and vision models. Using unique properties of NPEFF's parameter-space representations, we ran extensive experiments to verify that the connections between directions in parameters space and examples recovered by NPEFF actually reflect the model's processing. We further demonstrate NPEFF's ability to uncover the actual processing strategies used by a TRACR-compiled model. We further explore a potential application of NPEFF in uncovering and correcting flawed heuristics used by a model. We release our code to facilitate research using NPEFF.

摘要
深度学习模型在更多的场景中部署，理解这些模型的预测结果变得越来越重要。然而，模型解释仍然是一个挑战。在这篇论文中，我们介绍了一种新的解释方法called NPEFF，可以应用于任何端到端可导模型。它基于处理特征 shared across不同的示例 involve特定的模型参数子集的原理。我们通过分解每个示例的 Fisher信息矩阵为非负总和组件来实现NPEFF。这些组件可以是非负向量或 Rank-1正semidefinite矩阵，哪怕我们使用 диагональ或低级 Fisher表示。对于后者，我们引入了一种新的并高度可扩展的算法。我们通过实验表示，NPEFF recovered的组件具有可解释的调整。我们还证明了NPEFF的参数空间表示具有unique的性质，可以用来检验模型的处理策略。此外，我们还探讨了NPEFF可以用于探测和更正模型使用的恶劣假设的可能性。我们发布了我们的代码，以便研究人员可以通过NPEFF进行研究。

2023-10-07

Transferable Deep Clustering Model

Beyond Text: A Deep Dive into Large Language Models’ Ability on Understanding Graph Data

Large Language Models for Spatial Trajectory Patterns Mining

Statistical Guarantees for Variational Autoencoders using PAC-Bayesian Theory

Crystal-GFN: sampling crystals with desirable properties and constraints

The Conditional Prediction Function: A Novel Technique to Control False Discovery Rate for Complex Models

Tight Certified Robustness via Min-Max Representations of ReLU Neural Networks

A Dual Latent State Learning Approach: Exploiting Regional Network Similarities for QoS Prediction

Regret Analysis of Repeated Delegated Choice

Randomized Sparse Neural Galerkin Schemes for Solving Evolution Equations with Deep Networks

Universal Graph Random Features

LIPEx – Locally Interpretable Probabilistic Explanations – To Look Beyond The True Class

Epsilon non-Greedy: A Bandit Approach for Unbiased Recommendation via Uniform Data

Repelling Random Walks

HyperSINDy: Deep Generative Modeling of Nonlinear Stochastic Governing Equations

Critique Ability of Large Language Models

Applications of Littlestone dimension to query learning and to compression

Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning

Conditional Diffusion Model for Target Speaker Extraction

Online Corrupted User Detection and Regret Minimization

Robust Low-Rank Matrix Completion via a New Sparsity-Inducing Regularizer

Unit Commitment Predictor With a Performance Guarantee: A Support Vector Machine Classifier

Digital Twin Assisted Deep Reinforcement Learning for Online Optimization of Network Slicing Admission Control

Parameter Efficient Multi-task Model Fusion with Partial Linearization

Subspace Identification for Multi-Source Domain Adaptation

Offline Imitation Learning with Variational Counterfactual Reasoning

Twin Graph-based Anomaly Detection via Attentive Multi-Modal Learning for Microservice System

Tight Rates in Supervised Outlier Transfer Learning

Surgical Gym: A high-performance GPU-based platform for reinforcement learning with surgical robots

Modeling non-uniform uncertainty in Reaction Prediction via Boosting and Dropout

Oracle Efficient Algorithms for Groupwise Regret

NPEFF: Non-Negative Per-Example Fisher Factorization