cs.AI - 2023-07-29

Marrying Dialogue Systems with Data Visualization: Interactive Data Visualization Generation from Natural Language Conversations

  • paper_url: http://arxiv.org/abs/2307.16013
  • repo_url: None
  • paper_authors: Yuanfeng Song, Xuefang Zhao, Raymond Chi-Wing Wong
  • for: This work aims to lower the barrier of using data visualization (DV) systems by automating DV tasks, such as natural language question (NLQ) to visualization translation (formally called text-to-vis).
  • methods: The paper proposes a new task named CoVis, short for Conversational text-to-Visualization, which constructs DVs through a series of interactions between the user and the system.
  • results: The authors build a benchmark dataset named Dial-NVBench and propose a multi-modal neural network named MMCoVisNet that answers DV-related queries. MMCoVisNet first fully understands the dialogue context and then uses adaptive decoders to provide the appropriate replies. Experimental results show that MMCoVisNet outperforms the baselines and achieves state-of-the-art performance.
    Abstract Data visualization (DV) has become the prevailing tool in the market due to its effectiveness in illustrating insights in vast amounts of data. To lower the barrier of using DVs, automatic DV tasks, such as natural language question (NLQ) to visualization translation (formally called text-to-vis), have been investigated in the research community. However, text-to-vis assumes the NLQ to be well-organized and expressed in a single sentence, whereas in real-world settings, complex DV is needed through consecutive exchanges between the DV system and the users. In this paper, we propose a new task named CoVis, short for Conversational text-to-Visualization, aiming at constructing DVs through a series of interactions between users and the system. Since this task has not been studied in the literature, we first build a benchmark dataset named Dial-NVBench, including dialogue sessions with a sequence of queries from a user and responses from the system. Then, we propose a multi-modal neural network named MMCoVisNet to answer these DV-related queries. In particular, MMCoVisNet first fully understands the dialogue context and determines the corresponding responses. Then, it uses adaptive decoders to provide the appropriate replies: (i) a straightforward text decoder is used to produce general responses, (ii) an SQL-form decoder is applied to synthesize data querying responses, and (iii) a DV-form decoder tries to construct the appropriate DVs. We comparatively evaluate MMCoVisNet with other baselines over our proposed benchmark dataset. Experimental results validate that MMCoVisNet performs better than existing baselines and achieves state-of-the-art performance.
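As a rough illustration of the adaptive-decoder idea in the abstract, the sketch below routes a dialogue context to one of three decoders (text, SQL-form, DV-form). All names and the toy classifier are hypothetical stand-ins, not the paper's actual components.

```python
# Minimal sketch of adaptive-decoder routing; names are hypothetical, not from the paper.

def route_response(dialogue_context, response_type_classifier, decoders):
    """Pick a decoder (text, SQL-form, or DV-form) based on the dialogue context."""
    response_type = response_type_classifier(dialogue_context)  # "text" | "sql" | "vis"
    return decoders[response_type](dialogue_context)

# Toy stand-ins for the learned components:
classifier = lambda ctx: "sql" if "how many" in ctx.lower() else "text"
decoders = {
    "text": lambda ctx: "Sure, which table should I look at?",
    "sql":  lambda ctx: "SELECT COUNT(*) FROM sales;",
    "vis":  lambda ctx: {"mark": "bar", "x": "month", "y": "revenue"},
}
print(route_response("How many rows are in the sales table?", classifier, decoders))
```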

RoCar: A Relationship Network-based Evaluation Method to Large Language Models

  • paper_url: http://arxiv.org/abs/2307.15997
  • repo_url: https://github.com/neu-datamining/rocar
  • paper_authors: Ming Wang, Wenfang Wu, Chongyun Gao, Daling Wang, Shi Feng, Yifei Zhang
  • for: Evaluating the capabilities of large language models (LLMs).
  • methods: Uses defined basic schemas to randomly construct a task graph, then generates natural language evaluation tasks from the graph to evaluate the reasoning and memory abilities of LLMs.
  • results: The very large randomness of the task construction process ensures that none of the tested LLMs has directly learned the evaluation tasks, guaranteeing the fairness of the evaluation method.
    Abstract Large language models (LLMs) have received increasing attention. However, due to the complexity of their capabilities, how to rationally evaluate the capabilities of LLMs is still a task to be solved. We propose the RoCar method, which utilizes the defined basic schemas to randomly construct a task graph and generates natural language evaluation tasks based on the task graph to evaluate the reasoning and memory abilities of LLMs respectively. Due to the very large randomness of the task construction process, it is possible to ensure that none of the LLMs to be tested has directly learned the evaluation tasks, guaranteeing the fairness of the evaluation method.
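To make the task-graph construction concrete, here is a minimal sketch of randomly assembling a relationship graph from basic schemas and rendering it as a natural-language evaluation task. The schemas and templates are illustrative assumptions, not RoCar's actual ones.

```python
# Illustrative sketch: random relationship task graph -> natural-language task.
import random

PEOPLE = ["Alice", "Bob", "Carol", "Dave", "Eve"]
RELATIONS = ["friend of", "colleague of", "neighbor of"]

def random_task_graph(n_edges=4, seed=None):
    rng = random.Random(seed)
    edges = []
    for _ in range(n_edges):
        a, b = rng.sample(PEOPLE, 2)
        edges.append((a, rng.choice(RELATIONS), b))
    return edges

def to_evaluation_task(edges):
    facts = ". ".join(f"{a} is a {rel} {b}" for a, rel, b in edges)
    a, rel, b = edges[0]
    question = f"Based on the facts above, who is a {rel} {b}?"
    return f"{facts}.\n{question}", a  # (toy: assumes the first fact's subject is the unique answer)

prompt, answer = to_evaluation_task(random_task_graph(seed=0))
print(prompt, "\nExpected:", answer)
```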

UPFL: Unsupervised Personalized Federated Learning towards New Clients

  • paper_url: http://arxiv.org/abs/2307.15994
  • repo_url: None
  • paper_authors: Tiandi Ye, Cen Chen, Yinggui Wang, Xiang Li, Ming Gao
  • for: addressing the challenge of providing personalized models for new clients in federated learning settings
  • methods: extends adaptive risk minimization technique to unsupervised personalized federated learning, with two optimization strategies (proxy regularization and early-stopping) and a knowledge distillation loss specifically designed for FedTTA
  • results: extensive experiments on five datasets against eleven baselines demonstrate the effectiveness of the proposed FedTTA and its variants
    Abstract Personalized federated learning has gained significant attention as a promising approach to address the challenge of data heterogeneity. In this paper, we address a relatively unexplored problem in federated learning. When a federated model has been trained and deployed, and an unlabeled new client joins, providing a personalized model for the new client becomes a highly challenging task. To address this challenge, we extend the adaptive risk minimization technique into the unsupervised personalized federated learning setting and propose our method, FedTTA. We further improve FedTTA with two simple yet effective optimization strategies: enhancing the training of the adaptation model with proxy regularization and early-stopping the adaptation through entropy. Moreover, we propose a knowledge distillation loss specifically designed for FedTTA to address the device heterogeneity. Extensive experiments on five datasets against eleven baselines demonstrate the effectiveness of our proposed FedTTA and its variants. The code is available at: https://github.com/anonymous-federated-learning/code.
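A minimal sketch of one of the two optimization strategies named in the abstract, entropy-based early stopping of test-time adaptation. The concrete stopping rule (halt when the mean prediction entropy plateaus) is an assumption; the paper only states that adaptation is early-stopped through entropy.

```python
# Sketch of entropy-based early stopping for unsupervised adaptation (assumed rule).
import torch
import torch.nn.functional as F

def mean_entropy(logits):
    p = F.softmax(logits, dim=-1)
    return -(p * torch.log(p + 1e-8)).sum(dim=-1).mean()

def adapt_with_entropy_stop(model, x, optimizer, max_steps=50, patience=3):
    best, bad_steps = float("inf"), 0
    for _ in range(max_steps):
        ent = mean_entropy(model(x))
        optimizer.zero_grad()
        ent.backward()               # minimize prediction entropy on unlabeled data
        optimizer.step()
        if ent.item() < best - 1e-4:
            best, bad_steps = ent.item(), 0
        else:
            bad_steps += 1
            if bad_steps >= patience:  # entropy plateaued: stop adapting
                break
    return model

# Toy demo: adapt a linear classifier on random unlabeled data.
model = torch.nn.Linear(8, 3)
x = torch.randn(32, 8)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
adapt_with_entropy_stop(model, x, opt)
```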

Ultrasound Image Reconstruction with Denoising Diffusion Restoration Models

  • paper_url: http://arxiv.org/abs/2307.15990
  • repo_url: https://github.com/yuxin-zhang-jasmine/drus-v1
  • paper_authors: Yuxin Zhang, Clément Huneau, Jérôme Idier, Diana Mateus
  • for: This paper addresses the ultrasound image reconstruction problem, using learned priors to improve reconstruction quality.
  • methods: The paper relies on learned priors under the framework of Denoising Diffusion Restoration Models (DDRM), proposes two adaptations of DDRM to ultrasound inverse problem models, DRUS and WDRUS, and tests them on synthetic and PICMUS data.
  • results: From a single plane wave, the method achieves image quality comparable to or better than DAS and state-of-the-art methods. The code is available at https://github.com/Yuxin-Zhang-Jasmine/DRUS-v1.
    Abstract Ultrasound image reconstruction can be approximately cast as a linear inverse problem that has traditionally been solved with penalized optimization using the $l_1$ or $l_2$ norm, or wavelet-based terms. However, such regularization functions often struggle to balance the sparsity and the smoothness. A promising alternative is using learned priors to make the prior knowledge closer to reality. In this paper, we rely on learned priors under the framework of Denoising Diffusion Restoration Models (DDRM), initially conceived for restoration tasks with natural images. We propose and test two adaptations of DDRM to ultrasound inverse problem models, DRUS and WDRUS. Our experiments on synthetic and PICMUS data show that from a single plane wave our method can achieve image quality comparable to or better than DAS and state-of-the-art methods. The code is available at: https://github.com/Yuxin-Zhang-Jasmine/DRUS-v1.

Freespace Optical Flow Modeling for Automated Driving

  • paper_url: http://arxiv.org/abs/2307.15989
  • repo_url: None
  • paper_authors: Yi Feng, Ruge Zhang, Jiayuan Du, Qijun Chen, Rui Fan
  • for: This paper proposes a new method for visual perception in automated driving, specifically modeling optical flow in the driving environment.
  • methods: The paper models optical flow in the collision-free space (also referred to as the drivable area, or simply "freespace") of a 3D driving environment, making full use of the environmental information and geometric constraints available there.
  • results: Extensive experiments on several public datasets demonstrate the high accuracy and robustness of the optical flow model. The model also supports a range of applications within automated driving, such as freespace detection and vehicle localization, and the authors have made the source code publicly available.
    Abstract Optical flow and disparity are two informative visual features for autonomous driving perception. They have been used for a variety of applications, such as obstacle and lane detection. The concept of "U-V-Disparity" has been widely explored in the literature, while its counterpart in optical flow has received relatively little attention. Traditional motion analysis algorithms estimate optical flow by matching correspondences between two successive video frames, which limits the full utilization of environmental information and geometric constraints. Therefore, we propose a novel strategy to model optical flow in the collision-free space (also referred to as drivable area or simply freespace) for intelligent vehicles, with the full utilization of geometry information in a 3D driving environment. We provide explicit representations of optical flow and deduce the quadratic relationship between the optical flow component and the vertical coordinate. Through extensive experiments on several public datasets, we demonstrate the high accuracy and robustness of our model. Additionally, our proposed freespace optical flow model boasts a diverse array of applications within the realm of automated driving, providing a geometric constraint in freespace detection, vehicle localization, and more. We have made our source code publicly available at https://mias.group/FSOF.
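The abstract's key technical claim is a quadratic relationship between an optical flow component and the vertical image coordinate in the freespace. The sketch below fits that quadratic by least squares on synthetic data; variable names are illustrative.

```python
# Sketch: least-squares fit of the stated quadratic flow-vs-row relationship.
import numpy as np

def fit_quadratic_flow(v, flow):
    """Fit flow(v) ~ a*v**2 + b*v + c over freespace pixels; returns (a, b, c)."""
    A = np.stack([v**2, v, np.ones_like(v)], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, flow, rcond=None)
    return coeffs

# Toy check with synthetic data following the assumed model:
v = np.linspace(100, 400, 300)                    # image rows inside the freespace
flow = 2e-4 * v**2 - 0.05 * v + 3.0 + np.random.normal(0, 0.1, v.shape)
print(fit_quadratic_flow(v, flow))                # close to [2e-4, -0.05, 3.0]
```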

You Can Backdoor Personalized Federated Learning

  • paper_url: http://arxiv.org/abs/2307.15971
  • repo_url: None
  • paper_authors: Tiandi Ye, Cen Chen, Yinggui Wang, Xiang Li, Ming Gao
  • for: This paper focuses on backdoor attacks in personalized federated learning (pFL) scenarios, where each client constructs a personalized model based on its local data.
  • methods: The paper proposes three backdoor attack methods: BapFL, BapFL+, and Gen-BapFL, which can effectively attack pFL methods by maintaining clean local parameters while implanting the backdoor into the global parameters, and by introducing Gaussian noise to the local parameters.
  • results: The paper demonstrates the effectiveness of the proposed attack methods against two classic pFL methods with partial model-sharing, FedPer and LG-FedAvg, on four FL benchmark datasets. Additionally, the paper assesses the defense efficacy of various defense strategies against the proposed attacks and finds that Gradient Norm-Clipping is particularly effective.
    Abstract Backdoor attacks pose a significant threat to the security of federated learning systems. However, existing research primarily focuses on backdoor attacks and defenses within the generic FL scenario, where all clients collaborate to train a single global model. Qin et al. (2023) conduct the first study of backdoor attacks in the personalized federated learning (pFL) scenario, where each client constructs a personalized model based on its local data. Notably, the study demonstrates that pFL methods with partial model-sharing can significantly boost robustness against backdoor attacks. In this paper, we whistleblow that pFL methods with partial model-sharing are still vulnerable to backdoor attacks in the absence of any defense. We propose three backdoor attack methods: BapFL, BapFL+, and Gen-BapFL, and we empirically demonstrate that they can effectively attack the pFL methods. Specifically, the key principle of BapFL lies in maintaining clean local parameters while implanting the backdoor into the global parameters. BapFL+ generalizes the attack success to benign clients by introducing Gaussian noise to the local parameters. Furthermore, we assume the collaboration of malicious clients and propose Gen-BapFL, which leverages meta-learning techniques to further enhance attack generalization. We evaluate our proposed attack methods against two classic pFL methods with partial model-sharing, FedPer and LG-FedAvg. Extensive experiments on four FL benchmark datasets demonstrate the effectiveness of our proposed attack methods. Additionally, we assess the defense efficacy of various defense strategies against our proposed attacks and find that Gradient Norm-Clipping is particularly effective. It is crucial to note that pFL method is not always secure in the presence of backdoor attacks, and we hope to inspire further research on attack and defense in pFL scenarios.

Graph Condensation for Inductive Node Representation Learning

  • paper_url: http://arxiv.org/abs/2307.15967
  • repo_url: None
  • paper_authors: Xinyi Gao, Tong Chen, Yilong Zang, Wentao Zhang, Quoc Viet Hung Nguyen, Kai Zheng, Hongzhi Yin
  • for: Improving the computational efficiency of Graph Neural Networks (GNNs) on large-scale graphs, so that they can be used across diverse applications.
  • methods: Graph condensation is used to construct a small synthetic graph for training GNNs, and a one-to-many node mapping is learned so that new nodes can perform message passing directly on the synthetic graph.
  • results: On the Reddit dataset, MCond achieves up to 121.5x inference speedup and 55.9x reduction in storage requirements compared with counterparts based on the original graph.
    Abstract Graph neural networks (GNNs) encounter significant computational challenges when handling large-scale graphs, which severely restricts their efficacy across diverse applications. To address this limitation, graph condensation has emerged as a promising technique, which constructs a small synthetic graph for efficiently training GNNs while retaining performance. However, due to the topology structure among nodes, graph condensation is limited to condensing only the observed training nodes and their corresponding structure, thus lacking the ability to effectively handle the unseen data. Consequently, the original large graph is still required in the inference stage to perform message passing to inductive nodes, resulting in substantial computational demands. To overcome this issue, we propose mapping-aware graph condensation (MCond), explicitly learning the one-to-many node mapping from original nodes to synthetic nodes to seamlessly integrate new nodes into the synthetic graph for inductive representation learning. This enables direct information propagation on the synthetic graph, which is much more efficient than on the original large graph. Specifically, MCond employs an alternating optimization scheme with innovative loss terms from transductive and inductive perspectives, facilitating the mutual promotion between graph condensation and node mapping learning. Extensive experiments demonstrate the efficacy of our approach in inductive inference. On the Reddit dataset, MCond achieves up to 121.5x inference speedup and 55.9x reduction in storage requirements compared with counterparts based on the original graph.
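A minimal sketch of the one-to-many node mapping idea: an inductive node is represented by a learned soft assignment over synthetic nodes, so message passing runs on the small synthetic graph only. The softmax parameterization and the toy GNN stand-in are assumptions, not MCond's exact design.

```python
# Sketch: soft mapping from a new node onto condensed (synthetic) nodes.
import torch

n_synthetic, dim = 64, 128
synthetic_feats = torch.randn(n_synthetic, dim)                    # condensed graph features
mapping_logits = torch.nn.Parameter(torch.zeros(1, n_synthetic))   # learned, one new node

def embed_new_node(gnn_on_synthetic):
    weights = torch.softmax(mapping_logits, dim=-1)    # one-to-many soft mapping
    synthetic_embs = gnn_on_synthetic(synthetic_feats) # propagate on the small graph only
    return weights @ synthetic_embs                    # pull the embedding to the new node

# Any GNN over the synthetic graph can stand in here; identity keeps the sketch runnable:
print(embed_new_node(lambda h: h).shape)               # torch.Size([1, 128])
```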

Towards the Visualization of Aggregated Class Activation Maps to Analyse the Global Contribution of Class Features

  • paper_url: http://arxiv.org/abs/2308.00710
  • repo_url: None
  • paper_authors: Igor Cherepanov, David Sessler, Alex Ulmer, Hendrik Lücke-Tieke, Jörn Kohlhammer
  • for: This paper aims to explain the decision-making of deep learning (DL) classification models so that they can be used in risk-sensitive applications.
  • methods: The authors extend the recent Class Activation Maps (CAMs) method, which visualizes the importance of each feature of a data sample for the classification decision. CAMs from multiple samples are aggregated into a global explanation view, in which each feature is shown as a square glyph encoding its classification impact (color) and the variability of that impact across samples (size of the filled square).
  • results: The visual representation helps analysts identify the features of high-dimensional data that are important for the DL model's decisions, and an interactive histogram lets them filter samples and refine the CAM to analyze interesting features in detail.
    Abstract Deep learning (DL) models achieve remarkable performance in classification tasks. However, models with high complexity can not be used in many risk-sensitive applications unless a comprehensible explanation is presented. Explainable artificial intelligence (xAI) focuses on the research to explain the decision-making of AI systems like DL. We extend a recent method of Class Activation Maps (CAMs) which visualizes the importance of each feature of a data sample contributing to the classification. In this paper, we aggregate CAMs from multiple samples to show a global explanation of the classification for semantically structured data. The aggregation allows the analyst to make sophisticated assumptions and analyze them with further drill-down visualizations. Our visual representation for the global CAM illustrates the impact of each feature with a square glyph containing two indicators. The color of the square indicates the classification impact of this feature. The size of the filled square describes the variability of the impact between single samples. For interesting features that require further analysis, a detailed view is necessary that provides the distribution of these values. We propose an interactive histogram to filter samples and refine the CAM to show relevant samples only. Our approach allows an analyst to detect important features of high-dimensional data and derive adjustments to the AI model based on our global explanation visualization.
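A minimal sketch of the aggregation step behind the glyphs described in the abstract: the mean CAM value per feature drives the glyph color, and the per-feature variability across samples drives the size of the filled square. The normalization is an illustrative choice.

```python
# Sketch: aggregate per-sample CAMs into the two glyph indicators.
import numpy as np

def aggregate_cams(cams):
    """cams: (n_samples, n_features) per-sample class activation values."""
    mean_impact = cams.mean(axis=0)            # drives glyph color
    variability = cams.std(axis=0)             # drives filled-square size
    size = variability / (variability.max() + 1e-12)  # normalize to [0, 1] for sizing
    return mean_impact, size

cams = np.random.rand(500, 10)                 # 500 samples, 10 features
color, size = aggregate_cams(cams)
print(color.round(2), size.round(2))
```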

The effect of network topologies on fully decentralized learning: a preliminary investigation

  • paper_url: http://arxiv.org/abs/2307.15947
  • repo_url: None
  • paper_authors: Luigi Palmieri, Lorenzo Valerio, Chiara Boldrini, Andrea Passarella
  • for: This paper studies how the network topology connecting the nodes of a decentralized machine learning system affects model training and performance.
  • methods: Machine learning models are trained through direct collaboration between nodes, and the impact of different network topologies on this process is investigated.
  • results: The study finds that even weak connectivity among network components, while sufficient for spreading information, may not be sufficient for spreading knowledge. Hubs play a more significant role than leaves in spreading knowledge, with the effect depending on the degree distribution, and tightly knit communities severely hinder knowledge spread.
    Abstract In a decentralized machine learning system, data is typically partitioned among multiple devices or nodes, each of which trains a local model using its own data. These local models are then shared and combined to create a global model that can make accurate predictions on new data. In this paper, we start exploring the role of the network topology connecting nodes on the performance of a Machine Learning model trained through direct collaboration between nodes. We investigate how different types of topologies impact the "spreading of knowledge", i.e., the ability of nodes to incorporate in their local model the knowledge derived by learning patterns in data available in other nodes across the networks. Specifically, we highlight the different roles in this process of more or less connected nodes (hubs and leaves), as well as that of macroscopic network properties (primarily, degree distribution and modularity). Among others, we show that, while it is known that even weak connectivity among network components is sufficient for information spread, it may not be sufficient for knowledge spread. More intuitively, we also find that hubs have a more significant role than leaves in spreading knowledge, although this manifests itself not only for heavy-tailed distributions but also when "hubs" have only moderately more connections than leaves. Finally, we show that tightly knit communities severely hinder knowledge spread.

A Theory for Emergence of Complex Skills in Language Models

  • paper_url: http://arxiv.org/abs/2307.15936
  • repo_url: https://github.com/dia2018/What-is-the-Difference-Between-AI-and-Machine-Learning
  • paper_authors: Sanjeev Arora, Anirudh Goyal
  • for: This work aims to explain the emergence of new skills in language models when their parameter set and training corpora are scaled up, a phenomenon whose mechanism is still poorly understood.
  • methods: The analysis uses the well-known (and empirical) Scaling Laws of LLMs together with a simple statistical framework.
  • results: The study relates the cross-entropy loss of LLMs to competence on the basic skills underlying language tasks, and shows that the Scaling Laws imply a strong form of inductive bias that allows pre-trained models to learn new skills very efficiently. For example, competence at tasks involving $k$-tuples of skills emerges essentially at the same scaling and rate as competence on the elementary skills themselves.
    Abstract A major driver of AI products today is the fact that new skills emerge in language models when their parameter set and training corpora are scaled up. This phenomenon is poorly understood, and a mechanistic explanation via mathematical analysis of gradient-based training seems difficult. The current paper takes a different approach, analysing emergence using the famous (and empirical) Scaling Laws of LLMs and a simple statistical framework. Contributions include: (a) A statistical framework that relates cross-entropy loss of LLMs to competence on the basic skills that underlie language tasks. (b) Mathematical analysis showing that the Scaling Laws imply a strong form of inductive bias that allows the pre-trained model to learn very efficiently. We informally call this {\em slingshot generalization} since naively viewed it appears to give competence levels at skills that violate usual generalization theory. (c) A key example of slingshot generalization, that competence at executing tasks involving $k$-tuples of skills emerges essentially at the same scaling and same rate as competence on the elementary skills themselves.

Language models as master equation solvers

  • paper_url: http://arxiv.org/abs/2308.02514
  • repo_url: https://github.com/Aryia-Behroziuan/References
  • paper_authors: Chuanbo Liu, Jin Wang
  • for: Solving master equations, the fundamental equations for modeling stochastic dynamical systems.
  • methods: Language models are repurposed as a machine learning approach: a prompt-based neural network maps rate parameters, initial conditions, and time values directly to the state joint probability distribution that exactly matches the input contexts.
  • results: Applied to representative examples, the approach shows high accuracy for both multi-module and high-dimensional systems, and the trained network extrapolates to unseen data, suggesting that a single pretrained large model could solve any master equation.
    Abstract Master equations are of fundamental importance in modeling stochastic dynamical systems. However, solving master equations is challenging due to the exponential increase in the number of possible states or trajectories with the dimension of the state space. In this study, we propose repurposing language models as a machine learning approach to solve master equations. We design a prompt-based neural network to map rate parameters, initial conditions, and time values directly to the state joint probability distribution that exactly matches the input contexts. In this way, we approximate the solution of the master equation in its most general form. We train the network using the policy gradient algorithm within the reinforcement learning framework, with feedback rewards provided by a set of variational autoregressive models. By applying this approach to representative examples, we observe high accuracy for both multi-module and high-dimensional systems. The trained network also exhibits extrapolating ability, extending its predictability to unseen data. Our findings establish the connection between language models and master equations, highlighting the possibility of using a single pretrained large model to solve any master equation.
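For reference, this is the classical computation the paper's language model is trained to replace: solving a small master equation dP/dt = QP by matrix exponentiation. The two-state system and its rates are illustrative.

```python
# Sketch: classical solve of a tiny 2-state master equation dP/dt = Q P.
import numpy as np
from scipy.linalg import expm

k_on, k_off = 0.5, 0.2                        # rate parameters (illustrative)
Q = np.array([[-k_on, k_off],
              [ k_on, -k_off]])               # generator of the 2-state chain
P0 = np.array([1.0, 0.0])                     # initial condition
for t in (0.0, 1.0, 5.0):
    print(t, expm(Q * t) @ P0)                # state probability distribution at time t
```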

ATESA-BÆRT: A Heterogeneous Ensemble Learning Model for Aspect-Based Sentiment Analysis

  • paper_url: http://arxiv.org/abs/2307.15920
  • repo_url: None
  • paper_authors: Elena-Simona Apostol, Alin-Georgian Pisică, Ciprian-Octavian Truică
  • for: This work aims to improve the analysis of online reviews by determining user opinions about different aspects of products and services.
  • methods: The paper proposes ATESA-BÆRT, a heterogeneous ensemble learning model that splits aspect-based sentiment analysis into two sub-tasks, Aspect Term Extraction and Aspect Term Sentiment Analysis, and applies argmax multi-class classification over six transformer-based learners for each sub-task.
  • results: Initial experiments on two datasets show that the model outperforms current state-of-the-art solutions while handling reviews that mention multiple aspects.
    Abstract The increasing volume of online reviews has made possible the development of sentiment analysis models for determining the opinion of customers regarding different products and services. Until now, sentiment analysis has proven to be an effective tool for determining the overall polarity of reviews. To improve the granularity at the aspect level for a better understanding of the service or product, the task of aspect-based sentiment analysis aims to first identify aspects and then determine the user's opinion about them. The complexity of this task lies in the fact that the same review can present multiple aspects, each with its own polarity. Current solutions have poor performance on such data. We address this problem by proposing ATESA-B{\AE}RT, a heterogeneous ensemble learning model for Aspect-Based Sentiment Analysis. Firstly, we divide our problem into two sub-tasks, i.e., Aspect Term Extraction and Aspect Term Sentiment Analysis. Secondly, we use the \textit{argmax} multi-class classification on six transformers-based learners for each sub-task. Initial experiments on two datasets prove that ATESA-B{\AE}RT outperforms current state-of-the-art solutions while solving the many aspects problem.
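A minimal sketch of an ensemble decision step consistent with the abstract: class probabilities from six learners are pooled and the argmax is taken, applied per sub-task. Averaging before the argmax is an assumption; the paper states only that argmax multi-class classification is used over six transformer-based learners.

```python
# Sketch: pool probabilities from several learners, then take the argmax per example.
import numpy as np

def ensemble_argmax(prob_list):
    """prob_list: list of (n_examples, n_classes) probability arrays, one per learner."""
    avg = np.mean(prob_list, axis=0)
    return avg.argmax(axis=1)

# Six toy learners voting on 4 examples with 3 sentiment classes:
rng = np.random.default_rng(0)
learners = [rng.dirichlet(np.ones(3), size=4) for _ in range(6)]
print(ensemble_argmax(learners))   # predicted class index per example
```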

Opportunistic Air Quality Monitoring and Forecasting with Expandable Graph Neural Networks

  • paper_url: http://arxiv.org/abs/2307.15916
  • repo_url: None
  • paper_authors: Jingwei Zuo, Wenbin Li, Michele Baldo, Hakim Hacid
  • for: This work proposes an expandable graph attention network (EGAT) model that digests air quality data collected from infrastructures with different spatial structures, improving the flexibility and accuracy of air quality forecasting.
  • methods: The EGAT model handles data collected from existing and newly-added infrastructures with different spatial structures, and can be embedded into any air quality forecasting model to cover scenarios with evolving spatial structures.
  • results: The proposal is validated over real air quality data from PurpleAir, showing that EGAT improves the flexibility and accuracy of air quality forecasting.
    Abstract Air Quality Monitoring and Forecasting has been a popular research topic in recent years. Recently, data-driven approaches for air quality forecasting have garnered significant attention, owing to the availability of well-established data collection facilities in urban areas. Fixed infrastructures, typically deployed by national institutes or tech giants, often fall short in meeting the requirements of diverse personalized scenarios, e.g., forecasting in areas without any existing infrastructure. Consequently, smaller institutes or companies with limited budgets are compelled to seek tailored solutions by introducing more flexible infrastructures for data collection. In this paper, we propose an expandable graph attention network (EGAT) model, which digests data collected from existing and newly-added infrastructures, with different spatial structures. Additionally, our proposal can be embedded into any air quality forecasting models, to apply to the scenarios with evolving spatial structures. The proposal is validated over real air quality data from PurpleAir.

Moisesdb: A dataset for source separation beyond 4-stems

  • paper_url: http://arxiv.org/abs/2307.15913
  • repo_url: https://github.com/moises-ai/moises-db
  • paper_authors: Igor Pereira, Felipe Araújo, Filip Korzeniowski, Richard Vogl
  • for: This work introduces the MoisesDB dataset for musical source separation, to facilitate building and evaluating fine-grained source separation systems.
  • methods: The dataset organizes audio sources into a two-level hierarchical taxonomy of stems, and an easy-to-use Python library is provided to download, process, and use MoisesDB.
  • results: The paper provides baseline results for open-source separation models at varying granularities (four, five, and six stems), along with thorough documentation and analysis of the dataset contents.
    Abstract In this paper, we introduce the MoisesDB dataset for musical source separation. It consists of 240 tracks from 45 artists, covering twelve musical genres. For each song, we provide its individual audio sources, organized in a two-level hierarchical taxonomy of stems. This will facilitate building and evaluating fine-grained source separation systems that go beyond the limitation of using four stems (drums, bass, other, and vocals) due to lack of data. To facilitate the adoption of this dataset, we publish an easy-to-use Python library to download, process and use MoisesDB. Alongside a thorough documentation and analysis of the dataset contents, this work provides baseline results for open-source separation models for varying separation granularities (four, five, and six stems), and discuss their results.
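To illustrate what a two-level stem taxonomy looks like, here is a toy hierarchy and a helper that collapses it to either a coarse stem view or the fine-grained sources. The category names here are assumptions; the real taxonomy ships with the dataset and its Python library.

```python
# Illustrative two-level stem taxonomy (category names are assumed, not MoisesDB's).
TAXONOMY = {
    "drums":  ["kick", "snare", "cymbals"],
    "bass":   ["bass_guitar", "bass_synth"],
    "vocals": ["lead_vocals", "backing_vocals"],
    "guitar": ["acoustic_guitar", "electric_guitar"],
}

def flatten(taxonomy, level=2):
    """Collapse the hierarchy to coarse stems (level=1) or fine sources (level=2)."""
    if level == 1:
        return list(taxonomy)
    return [sub for subs in taxonomy.values() for sub in subs]

print(flatten(TAXONOMY, level=1))   # coarse, 4-stem-style view
print(flatten(TAXONOMY, level=2))   # fine-grained view
```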

Reinforcement Learning Under Probabilistic Spatio-Temporal Constraints with Time Windows

  • paper_url: http://arxiv.org/abs/2307.15910
  • repo_url: None
  • paper_authors: Xiaoshan Lin, Abbasali Koochakzadeh, Yasin Yazicioglu, Derya Aksaray
  • for: This paper proposes an automata-theoretic approach for reinforcement learning (RL) under complex spatio-temporal constraints with time windows.
  • methods: The problem is formulated as a Markov decision process under a bounded temporal logic constraint, which is translated into a total automaton; "unsafe" actions are avoided based on available prior information about the transition probabilities, namely a pair of upper and lower bounds for each transition probability.
  • results: The paper provides theoretical guarantees on the resulting probability of constraint satisfaction, together with numerical results for a scenario in which a robot explores the environment while fulfilling periodic pick-up and delivery tasks encoded as temporal logic constraints.
    Abstract We propose an automata-theoretic approach for reinforcement learning (RL) under complex spatio-temporal constraints with time windows. The problem is formulated using a Markov decision process under a bounded temporal logic constraint. Different from existing RL methods that can eventually learn optimal policies satisfying such constraints, our proposed approach enforces a desired probability of constraint satisfaction throughout learning. This is achieved by translating the bounded temporal logic constraint into a total automaton and avoiding "unsafe" actions based on the available prior information regarding the transition probabilities, i.e., a pair of upper and lower bounds for each transition probability. We provide theoretical guarantees on the resulting probability of constraint satisfaction. We also provide numerical results in a scenario where a robot explores the environment to discover high-reward regions while fulfilling some periodic pick-up and delivery tasks that are encoded as temporal logic constraints.
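A minimal sketch of the "avoid unsafe actions" mechanism: given a pair of bounds per transition probability, any action whose worst-case probability of reaching a constraint-violating state exceeds a budget is masked out. The data structures and threshold are assumptions, not the paper's exact construction.

```python
# Sketch: pessimistic action masking from transition-probability bounds.
def safe_actions(state, actions, p_bounds, unsafe_states, budget=0.05):
    """p_bounds[(s, a, s_next)] = (lower, upper) bound on the transition probability."""
    allowed = []
    for a in actions:
        worst_case_risk = sum(
            p_bounds[(state, a, s2)][1]            # use the upper bound: pessimistic
            for s2 in unsafe_states
            if (state, a, s2) in p_bounds
        )
        if worst_case_risk <= budget:
            allowed.append(a)
    return allowed

p_bounds = {("s0", "go", "bad"): (0.01, 0.08), ("s0", "wait", "bad"): (0.0, 0.02)}
print(safe_actions("s0", ["go", "wait"], p_bounds, unsafe_states={"bad"}))  # ['wait']
```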

UniBriVL: Robust Universal Representation and Generation of Audio Driven Diffusion Models

  • paper_url: http://arxiv.org/abs/2307.15898
  • repo_url: None
  • paper_authors: Sen Fang, Bowen Gao, Yangjian Wu, Jingwen Cai, Teik Toe Teoh
  • for: This paper proposes a universal language representation learning method that bridges vision and language, enabling the development of multimodal applications.
  • methods: The method, based on Bridging-Vision-and-Language (BriVL), embeds audio, image, and text into a shared space, addressing major challenges in robust language (text and audio) representation learning while effectively capturing the correlation between audio and image.
  • results: Experiments show that UniBriVL performs well on downstream tasks and can choose and generate appropriate images from audio, with potential applications in speech recognition, music signal processing, and captioning systems.
    Abstract Multimodal large models have been recognized for their advantages in various performance and downstream tasks. The development of these models is crucial towards achieving general artificial intelligence in the future. In this paper, we propose a novel universal language representation learning method called UniBriVL, which is based on Bridging-Vision-and-Language (BriVL). Universal BriVL embeds audio, image, and text into a shared space, enabling the realization of various multimodal applications. Our approach addresses major challenges in robust language (both text and audio) representation learning and effectively captures the correlation between audio and image. Additionally, we demonstrate the qualitative evaluation of the generated images from UniBriVL, which serves to highlight the potential of our approach in creating images from audio. Overall, our experimental results demonstrate the efficacy of UniBriVL in downstream tasks and its ability to choose appropriate images from audio. The proposed approach has the potential for various applications such as speech recognition, music signal processing, and captioning systems.

A new Gradient TD Algorithm with only One Step-size: Convergence Rate Analysis using $L$-$\lambda$ Smoothness

  • paper_url: http://arxiv.org/abs/2307.15892
  • repo_url: None
  • paper_authors: Hengshuai Yao
  • for: This paper concerns Gradient TD (GTD) algorithms, the first $O(d)$ algorithms ($d$ is the number of features) with convergence guarantees for off-policy learning with linear function approximation, and the convergence rates that can be proved for them.
  • methods: Existing GTD algorithms have two step-size parameters that are difficult to tune; the paper proposes a truly single-time-scale GTD algorithm with only one step-size parameter that minimizes the Norm of Expected TD Update (NEU) objective, with proofs based on a generalization of expected smoothness called $L$-$\lambda$ smoothness.
  • results: The new algorithm, Impression GTD, converges at least as fast as $O(1/t)$, and under $L$-$\lambda$ smoothness it converges at a linear rate. The convergence rates of three other GTD algorithms are proved in the same generic GTD framework, and experiments on Random walks, the Boyan chain, and the Baird counterexample show that Impression GTD converges much faster than existing GTD algorithms.
    Abstract Gradient Temporal Difference (GTD) algorithms (Sutton et al., 2008, 2009) are the first $O(d)$ ($d$ is the number of features) algorithms that have convergence guarantees for off-policy learning with linear function approximation. Liu et al. (2015) and Dalal et al. (2018) proved the convergence rates of GTD, GTD2 and TDC are $O(t^{-\alpha/2})$ for some $\alpha \in (0,1)$. This bound is tight (Dalal et al., 2020), and slower than $O(1/\sqrt{t})$. GTD algorithms also have two step-size parameters, which are difficult to tune. In literature, there is a "single-time-scale" formulation of GTD. However, this formulation still has two step-size parameters. This paper presents a truly single-time-scale GTD algorithm for minimizing the Norm of Expected td Update (NEU) objective, and it has only one step-size parameter. We prove that the new algorithm, called Impression GTD, converges at least as fast as $O(1/t)$. Furthermore, based on a generalization of the expected smoothness (Gower et al. 2019), called $L$-$\lambda$ smoothness, we are able to prove that the new GTD converges even faster, in fact, with a linear rate. Our rate actually also improves Gower et al.'s result with a tighter bound under a weaker assumption. Besides Impression GTD, we also prove the rates of three other GTD algorithms, one by Yao and Liu (2008), another called A-transpose-TD (Sutton et al., 2008), and a counterpart of A-transpose-TD. The convergence rates of all the four GTD algorithms are proved in a single generic GTD framework to which $L$-$\lambda$ smoothness applies. Empirical results on Random walks, Boyan chain, and Baird counterexample show that Impression GTD converges much faster than existing GTD algorithms for both on-policy and off-policy learning problems, with well-performing step-sizes in a big range.
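A minimal sketch of the NEU (Norm of Expected TD Update) objective that Impression GTD minimizes, estimated on a batch under linear function approximation, together with a single-step-size gradient step. This is a schematic rendering of the objective, not the paper's exact update.

```python
# Sketch: batch estimate of NEU(w) = || E[delta * phi] ||^2 and its gradient.
import numpy as np

def neu_and_gradient(w, phi, phi_next, r, gamma=0.99):
    delta = r + gamma * phi_next @ w - phi @ w          # TD errors, shape (batch,)
    g = (delta[:, None] * phi).mean(axis=0)             # estimate of E[delta * phi]
    # Jacobian J[k, j] = d g_k / d w_j; gradient of ||g||^2 is 2 * J^T g
    J = (phi[:, :, None] * (gamma * phi_next - phi)[:, None, :]).mean(axis=0)
    return g @ g, 2 * J.T @ g

rng = np.random.default_rng(1)
w = np.zeros(5)
phi, phi_next = rng.normal(size=(32, 5)), rng.normal(size=(32, 5))
r = rng.normal(size=32)
for _ in range(200):                                    # one step-size, no second iterate
    neu, grad = neu_and_gradient(w, phi, phi_next, r)
    w -= 0.05 * grad
print(round(float(neu), 6))
```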

Point Annotation Probability Map: Towards Dense Object Counting by Tolerating Annotation Noise

  • paper_url: http://arxiv.org/abs/2308.00530
  • repo_url: None
  • paper_authors: Yuehai Chen
  • for: This work aims to improve the accuracy and robustness of counting objects in crowded scenes.
  • methods: Dense object counting is usually formulated as a Gaussian density regression problem, which may not properly account for the annotation noise introduced by the human annotation process. To gain robustness, a generalized Gaussian distribution (GGD) function with a tunable bandwidth and shape parameter is used to form the learning target, a point annotation probability map (PAPM), in both a hand-designed variant (HD-PAPM) and an adaptively learned variant (AL-PAPM).
  • results: The proposed methods show higher accuracy and robustness against annotation noise than prior approaches. AL-PAPM uses an effective transport cost function based on GGD within an optimal transport framework, so that a better PAPM representation is learned end-to-end from point annotations.
    Abstract Counting objects in crowded scenes remains a challenge to computer vision. The current deep learning based approach often formulate it as a Gaussian density regression problem. Such a brute-force regression, though effective, may not consider the annotation noise properly which arises from the human annotation process and may lead to different distributions. We conjecture that it would be beneficial to consider the annotation noise in the dense object counting task. To obtain strong robustness against annotation noise, generalized Gaussian distribution (GGD) function with a tunable bandwidth and shape parameter is exploited to form the learning target point annotation probability map, PAPM. Specifically, we first present a hand-designed PAPM method (HD-PAPM), in which we design a function based on GGD to tolerate the annotation noise. For end-to-end training, the hand-designed PAPM may not be optimal for the particular network and dataset. An adaptively learned PAPM method (AL-PAPM) is proposed. To improve the robustness to annotation noise, we design an effective transport cost function based on GGD. With such transport cost constraints, a better PAPM presentation could be adaptively learned with an optimal transport framework from point annotation in an end-to-end manner. Extensive experiments show the superiority of our proposed methods.
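A minimal sketch of forming a point annotation probability map with a generalized Gaussian kernel whose bandwidth (alpha) and shape (beta) are tunable; beta = 2 recovers the usual Gaussian target. The normalization that preserves the object count is an illustrative choice.

```python
# Sketch: render a PAPM by dropping a GGD kernel exp(-(d/alpha)**beta) on each point.
import numpy as np

def ggd_papm(points, shape_hw, alpha=4.0, beta=1.5):
    H, W = shape_hw
    yy, xx = np.mgrid[0:H, 0:W]
    papm = np.zeros((H, W))
    for (py, px) in points:
        d = np.sqrt((yy - py) ** 2 + (xx - px) ** 2)
        papm += np.exp(-((d / alpha) ** beta))
    return papm / (papm.sum() + 1e-12) * len(points)   # keep total mass = object count

papm = ggd_papm([(20, 20), (40, 50)], (64, 64))
print(papm.sum())   # close to 2.0
```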

Recent neutrino oscillation result with the IceCube experiment

  • paper_url: http://arxiv.org/abs/2307.15855
  • repo_url: None
  • paper_authors: Shiqi Yu, Jessie Micallef
  • for: Detecting TeV neutrino emissions from astrophysical sources.
  • methods: Convolutional Neural Networks are used to reconstruct neutrino interactions in the DeepCore subdetector.
  • results: The recent IceCube result from the atmospheric muon neutrino disappearance analysis is presented and compared to existing worldwide measurements.
    Abstract The IceCube South Pole Neutrino Observatory is a Cherenkov detector instrumented in a cubic kilometer of ice at the South Pole. IceCube's primary scientific goal is the detection of TeV neutrino emissions from astrophysical sources. At the lower center of the IceCube array, there is a subdetector called DeepCore, which has a denser configuration that makes it possible to lower the energy threshold of IceCube and observe GeV-scale neutrinos, opening the window to atmospheric neutrino oscillations studies. Advances in physics sensitivity have recently been achieved by employing Convolutional Neural Networks to reconstruct neutrino interactions in the DeepCore detector. In this contribution, the recent IceCube result from the atmospheric muon neutrino disappearance analysis using the CNN-reconstructed neutrino sample is presented and compared to the existing worldwide measurements.

Dimensionless Policies based on the Buckingham $\pi$ Theorem: Is it a good way to Generalize Numerical Results?

  • paper_url: http://arxiv.org/abs/2307.15852
  • repo_url: None
  • paper_authors: Alexandre Girard
  • for: This paper addresses a motion control problem, swinging up a torque-limited inverted pendulum, and asks when numerical results for it can be generalized.
  • methods: Optimal control laws are computed numerically, and the problem formulation is rewritten with dimensionless variables so that a controller generated for one specific system can be reused.
  • results: The study shows that an optimal controller generated numerically for a specific system can be reused on the sub-space of dimensionally similar systems. It also introduces the concept of a "regime", a region in the space of context variables that can help relax the condition on dimensional similarity, and discusses how dimensionally scaling the inputs and outputs of a context-specific policy is equivalent to substituting the new system parameters into an analytical equation for dimensionally similar systems.
    Abstract Yes if the context, the list of variables defining the motion control problem, is dimensionally similar. Here we show that by modifying the problem formulation using dimensionless variables, we can re-use the optimal control law generated numerically for a specific system to a sub-space of dimensionally similar systems. This is demonstrated, with numerically generated optimal controllers, for the classic motion control problem of swinging-up a torque-limited inverted pendulum. We also discuss the concept of regime, a region in the space of context variables, that can help relax the condition on dimensional similarity. Furthermore, we discuss how applying dimensional scaling of the input and output of a context-specific policy is equivalent to substituting the new systems parameters in an analytical equation for dimensionally similar systems. It remains to be seen if this approach can also help generalize policies for more complex high-dimensional problems.
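A minimal sketch of the dimensional-scaling transfer the abstract describes, for the pendulum case: the state is nondimensionalized, the source policy is queried, and the output torque is rescaled. The characteristic quantities (m·g·l for torque, sqrt(g/l) for rates) follow the standard Buckingham-pi nondimensionalization of a pendulum and are assumptions about the paper's exact scaling.

```python
# Sketch: transfer a pendulum policy between dimensionally similar systems.
import numpy as np

def transfer_policy(policy_star, m_star, l_star, m, l, g=9.81):
    """Wrap a policy tuned for (m*, l*) so it acts on a similar system (m, l)."""
    omega_star, omega = np.sqrt(g / l_star), np.sqrt(g / l)
    def policy(theta, theta_dot):
        theta_dot_star = theta_dot / omega * omega_star        # match dimensionless rate
        tau_star = policy_star(theta, theta_dot_star)          # query the source policy
        return tau_star * (m * g * l) / (m_star * g * l_star)  # rescale the torque
    return policy

# Toy source policy (energy-shaping flavor) tuned for a 1 kg, 1 m pendulum:
src = lambda th, thd: 2.0 * np.sign(thd * np.cos(th))
new_policy = transfer_policy(src, m_star=1.0, l_star=1.0, m=0.5, l=2.0)
print(new_policy(0.3, 1.0))
```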

Comprehensive Algorithm Portfolio Evaluation using Item Response Theory

  • paper_url: http://arxiv.org/abs/2307.15850
  • repo_url: https://github.com/sevvandi/airt-scripts
  • paper_authors: Sevvandi Kandanaarachchi, Kate Smith-Miles
  • for: Evaluating machine learning algorithm performance, including characteristics such as algorithm consistency and anomalousness across datasets.
  • methods: A modified framework based on Item Response Theory (IRT), obtained through a novel inversion and reinterpretation of the traditional IRT model, evaluates a portfolio of algorithms across a repository of datasets without requiring additional dataset feature computations.
  • results: The framework provides a simple, explainable tool for evaluating algorithm portfolios; the explainable nature of the IRT parameters yields an increased understanding of algorithm strengths and weaknesses across datasets.
    Abstract Item Response Theory (IRT) has been proposed within the field of Educational Psychometrics to assess student ability as well as test question difficulty and discrimination power. More recently, IRT has been applied to evaluate machine learning algorithm performance on a single classification dataset, where the student is now an algorithm, and the test question is an observation to be classified by the algorithm. In this paper we present a modified IRT-based framework for evaluating a portfolio of algorithms across a repository of datasets, while simultaneously eliciting a richer suite of characteristics - such as algorithm consistency and anomalousness - that describe important aspects of algorithm performance. These characteristics arise from a novel inversion and reinterpretation of the traditional IRT model without requiring additional dataset feature computations. We test this framework on algorithm portfolios for a wide range of applications, demonstrating the broad applicability of this method as an insightful algorithm evaluation tool. Furthermore, the explainable nature of IRT parameters yield an increased understanding of algorithm portfolios.
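For context, the sketch below shows the standard two-parameter logistic IRT model that the framework inverts and reinterprets, with algorithms playing the role of students and datasets the role of test items. Only the classical form is shown; the paper's modified model differs.

```python
# Sketch: classical 2-parameter logistic (2PL) IRT model.
import numpy as np

def irt_2pl(ability, difficulty, discrimination):
    """P(correct) under the two-parameter logistic IRT model."""
    return 1.0 / (1.0 + np.exp(-discrimination * (ability - difficulty)))

# An algorithm of ability 1.2 evaluated on datasets of varying difficulty:
difficulties = np.array([-1.0, 0.0, 1.0, 2.0])
print(irt_2pl(1.2, difficulties, discrimination=1.5).round(3))
```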

Primitive Skill-based Robot Learning from Human Evaluative Feedback

  • paper_url: http://arxiv.org/abs/2307.15801
  • repo_url: None
  • paper_authors: Ayano Hiranaka, Minjune Hwang, Sharon Lee, Chen Wang, Li Fei-Fei, Jiajun Wu, Ruohan Zhang
  • for: Improving the sample efficiency and safety of RL algorithms for long-horizon robot manipulation tasks in real-world environments.
  • methods: The SEED framework combines reinforcement learning from human feedback (RLHF) with primitive skill-based reinforcement learning; both approaches are effective at addressing sparse-reward issues and the complexities of long-horizon tasks.
  • results: Extensive experiments on five manipulation tasks of varying complexity show that SEED significantly outperforms state-of-the-art RL algorithms in sample efficiency and safety, and requires substantially less human effort than other RLHF methods.
    Abstract Reinforcement learning (RL) algorithms face significant challenges when dealing with long-horizon robot manipulation tasks in real-world environments due to sample inefficiency and safety issues. To overcome these challenges, we propose a novel framework, SEED, which leverages two approaches: reinforcement learning from human feedback (RLHF) and primitive skill-based reinforcement learning. Both approaches are particularly effective in addressing sparse reward issues and the complexities involved in long-horizon tasks. By combining them, SEED reduces the human effort required in RLHF and increases safety in training robot manipulation with RL in real-world settings. Additionally, parameterized skills provide a clear view of the agent's high-level intentions, allowing humans to evaluate skill choices before they are executed. This feature makes the training process even safer and more efficient. To evaluate the performance of SEED, we conducted extensive experiments on five manipulation tasks with varying levels of complexity. Our results show that SEED significantly outperforms state-of-the-art RL algorithms in sample efficiency and safety. In addition, SEED also exhibits a substantial reduction of human effort compared to other RLHF methods. Further details and video results can be found at https://seediros23.github.io/.

Summaries, Highlights, and Action items: Design, implementation and evaluation of an LLM-powered meeting recap system

  • paper_url: http://arxiv.org/abs/2307.15793
  • repo_url: None
  • paper_authors: Sumit Asthana, Sagih Hilleli, Pengcheng He, Aaron Halfaker
  • for: This paper aims to improve the experience of meetings in online computer-mediated spaces by using large language models for meeting recap, reducing individuals' meeting load and increasing the clarity and alignment of meeting outputs.
  • methods: The authors design and implement a meeting recap system built on LLM-based dialogue summarization, with two salient recap representations: important highlights and a structured, hierarchical minutes view.
  • results: An evaluation with seven users in the context of their work meetings shows promise for LLM-based meeting recap and a need for both representations in different contexts, while also revealing gaps in personal relevance and summarization quality; a high-quality recap enables collaboration opportunities such as a shared recap document.
    Abstract Meetings play a critical infrastructural role in the coordination of work. In recent years, due to shift to hybrid and remote work, more meetings are moving to online Computer Mediated Spaces. This has led to new problems (e.g. more time spent in less engaging meetings) and new opportunities (e.g. automated transcription/captioning and recap support). Recent advances in large language models (LLMs) for dialog summarization have the potential to improve the experience of meetings by reducing individuals' meeting load and increasing the clarity and alignment of meeting outputs. Despite this potential, they face technological limitation due to long transcripts and inability to capture diverse recap needs based on user's context. To address these gaps, we design, implement and evaluate in-context a meeting recap system. We first conceptualize two salient recap representations -- important highlights, and a structured, hierarchical minutes view. We develop a system to operationalize the representations with dialogue summarization as its building blocks. Finally, we evaluate the effectiveness of the system with seven users in the context of their work meetings. Our findings show promise in using LLM-based dialogue summarization for meeting recap and the need for both representations in different contexts. However, we find that LLM-based recap still lacks an understanding of whats personally relevant to participants, can miss important details, and mis-attributions can be detrimental to group dynamics. We identify collaboration opportunities such as a shared recap document that a high quality recap enables. We report on implications for designing AI systems to partner with users to learn and improve from natural interactions to overcome the limitations related to personal relevance and summarization quality.

SAFE: Saliency-Aware Counterfactual Explanations for DNN-based Automated Driving Systems

  • paper_url: http://arxiv.org/abs/2307.15786
  • repo_url: None
  • paper_authors: Amir Samadi, Amir Shirian, Konstantinos Koufos, Kurt Debattista, Mehrdad Dianati
  • for: This paper proposes a new approach to counterfactual (CF) explanations, which generates more informative CF examples to better understand the decision-making process of black-box models.
  • methods: The proposed method uses saliency maps to generate CF examples and evaluates their usefulness.
  • results: Experimental results show that the proposed CF explanation method generates more informative CF examples and helps explain the decision-making process of black-box models.
    Abstract A CF explainer identifies the minimum modifications in the input that would alter the model's output to its complement. In other words, a CF explainer computes the minimum modifications required to cross the model's decision boundary. Current deep generative CF models often work with user-selected features rather than focusing on the discriminative features of the black-box model. Consequently, such CF examples may not necessarily lie near the decision boundary, thereby contradicting the definition of CFs. To address this issue, we propose in this paper a novel approach that leverages saliency maps to generate more informative CF explanations. Source codes are available at: https://github.com/Amir-Samadi//Saliency_Aware_CF.
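A minimal sketch of a saliency-guided counterfactual search consistent with the abstract: the perturbation is restricted to the input regions the gradient saliency marks as discriminative, iterating until the decision boundary is crossed. The mask-and-step loop is a schematic assumption, not the paper's exact algorithm.

```python
# Sketch: edit only high-saliency pixels until the prediction flips to the target class.
import torch

def saliency_guided_cf(model, x, target_class, steps=100, lr=0.05, top_frac=0.1):
    x_cf = x.clone().requires_grad_(True)
    for _ in range(steps):
        out = model(x_cf)
        loss = -out[0, target_class]                 # push toward the complement class
        grad, = torch.autograd.grad(loss, x_cf)
        saliency = grad.abs()
        thresh = saliency.flatten().quantile(1 - top_frac)
        mask = (saliency >= thresh).float()          # restrict edits to salient regions
        with torch.no_grad():
            x_cf -= lr * grad * mask
        if model(x_cf).argmax(dim=1).item() == target_class:
            break                                    # crossed the decision boundary
    return x_cf.detach()

# Toy demo with a random linear classifier on an 8x8 "image":
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 2))
x = torch.randn(1, 3, 8, 8)
x_cf = saliency_guided_cf(model, x, target_class=1)
print((x_cf - x).abs().mean())
```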

Spherical and Hyperbolic Toric Topology-Based Codes On Graph Embedding for Ising MRF Models: Classical and Quantum Topology Machine Learning

  • paper_url: http://arxiv.org/abs/2307.15778
  • repo_url: https://github.com/Lcrypto/Topology-Signal-Processing
  • paper_authors: Vasiliy Usatyuk, Sergey Egorov, Denis Sapozhnikov
  • for: This paper explores the application of information geometry to describe the ground states of Ising models.
  • methods: The approach uses parity-check matrices of cyclic and quasi-cyclic LDPC codes on toric and spherical topologies, establishing a connection between machine learning and error-correcting coding through automorphisms and the size of the circulant of the quasi-cyclic code.
  • results: The work proposes new embedding methods based on trapping sets, relates state-of-the-art DNN architectures to specific block and convolutional LDPC codes, and uses statistical physics and number geometry to optimize error-correcting codes, with implications for DNN architecture design, efficient hardware design, and materials science.
    Abstract The paper introduces the application of information geometry to describe the ground states of Ising models. This is achieved by utilizing parity-check matrices of cyclic and quasi-cyclic codes on toric and spherical topologies. The approach establishes a connection between machine learning and error-correcting coding, specifically in terms of automorphism and the size of the circulant of the quasi-cyclic code. This proposed approach has implications for the development of new embedding methods based on trapping sets. Statistical physics and number geometry are utilized to optimize error-correcting codes, leading to these embedding and sparse factorization methods. The paper establishes a direct connection between DNN architecture and error-correcting coding by demonstrating how state-of-the-art DNN architectures (ChordMixer, Mega, Mega-chunk, CDIL, ...) from the long-range arena can be equivalent to specific types (Cage-graph, Repeat Accumulate) of block and convolutional LDPC codes. QC codes correspond to certain types of chemical elements, with the carbon element being represented by the mixed automorphism Shu-Lin-Fossorier QC-LDPC code. The Quantum Approximate Optimization Algorithm (QAOA) used in the Sherrington-Kirkpatrick Ising model can be seen as analogous to the back-propagation loss function landscape in training DNNs. This similarity creates a comparable problem with TS pseudo-codeword, resembling the belief propagation method. Additionally, the layer depth in QAOA correlates to the number of decoding belief propagation iterations in the Wiberg decoding tree. Overall, this work has the potential to advance multiple fields, from Information Theory, DNN architecture design (sparse and structured prior graph topology), efficient hardware design for Quantum and Classical DPU/TPU (graph, quantize and shift register architect.) to Materials Science and beyond.

Select and Augment: Enhanced Dense Retrieval Knowledge Graph Augmentation

  • paper_url: http://arxiv.org/abs/2307.15776
  • repo_url: None
  • paper_authors: Micheal Abaho, Yousef H. Alfaifi
  • for: Improving performance on knowledge graph (KG) tasks such as link prediction
  • methods: A multi-task framework that jointly selects a set of text descriptions relevant to KG entities and aligns or augments the KG embeddings with those descriptions
  • results: On link prediction, gains of 5.5% and 3.5% in Mean Reciprocal Rank (MRR) and Hits@10 scores, respectively, over text-enhanced KG augmentation methods using traditional CNNs
    Abstract Injecting textual information into knowledge graph (KG) entity representations has been a worthwhile expedition in terms of improving performance in KG oriented tasks within the NLP community. External knowledge often adopted to enhance KG embeddings ranges from semantically rich lexical dependency parsed features to a set of relevant key words to entire text descriptions supplied from an external corpus such as wikipedia and many more. Despite the gains this innovation (Text-enhanced KG embeddings) has made, the proposal in this work suggests that it can be improved even further. Instead of using a single text description (which would not sufficiently represent an entity because of the inherent lexical ambiguity of text), we propose a multi-task framework that jointly selects a set of text descriptions relevant to KG entities as well as align or augment KG embeddings with text descriptions. Different from prior work that plugs formal entity descriptions declared in knowledge bases, this framework leverages a retriever model to selectively identify richer or highly relevant text descriptions to use in augmenting entities. Furthermore, the framework treats the number of descriptions to use in augmentation process as a parameter, which allows the flexibility of enumerating across several numbers before identifying an appropriate number. Experiment results for Link Prediction demonstrate a 5.5% and 3.5% percentage increase in the Mean Reciprocal Rank (MRR) and Hits@10 scores respectively, in comparison to text-enhanced knowledge graph augmentation methods using traditional CNNs.
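
    A minimal sketch of the select-then-augment idea: score candidate descriptions against the entity embedding, keep the top n (the tunable number-of-descriptions parameter the paper mentions), and fuse. The sigmoid-gated fusion and all tensor shapes are assumptions; the paper learns selection and alignment jointly in a multi-task setup:

```python
import torch
import torch.nn.functional as F

def select_and_augment(ent_emb, desc_embs, n_desc=3):
    """Select the n most relevant text descriptions for a KG entity and
    fuse them into its embedding via a simple sigmoid gate."""
    scores = F.cosine_similarity(ent_emb.unsqueeze(0), desc_embs)  # (num_cands,)
    top = scores.topk(min(n_desc, desc_embs.size(0))).indices
    text_view = desc_embs[top].mean(dim=0)
    gate = torch.sigmoid((ent_emb * text_view).sum())
    return gate * ent_emb + (1 - gate) * text_view

entity = torch.randn(128)              # KG embedding of one entity
candidates = torch.randn(20, 128)      # retriever-encoded description candidates
augmented = select_and_augment(entity, candidates, n_desc=3)
```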

The Hydra Effect: Emergent Self-repair in Language Model Computations

  • paper_url: http://arxiv.org/abs/2307.15771
  • repo_url: None
  • paper_authors: Thomas McGrath, Matthew Rahtz, Janos Kramar, Vladimir Mikulik, Shane Legg
  • for: Investigating the internal structure of language model computations using causal analysis.
  • methods: Ablation studies probing how the layers of a language model interact.
  • results: Language model layers exhibit adaptive computation and counterbalancing: ablating one attention layer causes another to compensate (the "Hydra effect"), while late MLP layers downregulate the maximum-likelihood token. Both effects appear even in models trained without dropout.
    Abstract We investigate the internal structure of language model computations using causal analysis and demonstrate two motifs: (1) a form of adaptive computation where ablations of one attention layer of a language model cause another layer to compensate (which we term the Hydra effect) and (2) a counterbalancing function of late MLP layers that act to downregulate the maximum-likelihood token. Our ablation studies demonstrate that language model layers are typically relatively loosely coupled (ablations to one layer only affect a small number of downstream layers). Surprisingly, these effects occur even in language models trained without any form of dropout. We analyse these effects in the context of factual recall and consider their implications for circuit-level attribution in language models.
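
    The ablation methodology lends itself to a short sketch: zero one attention sub-layer's output with a forward hook and compare logits against the clean run; if downstream layers compensate (the Hydra effect), the total effect is smaller than the layer's direct contribution. The module path and tuple handling below assume a GPT-2-style HuggingFace model and are not the authors' code:

```python
import torch

def zero_output_hook(module, inputs, output):
    """Replace a sub-layer's output with zeros so that, through the
    residual connection, the layer contributes nothing."""
    if isinstance(output, tuple):
        return (torch.zeros_like(output[0]),) + output[1:]
    return torch.zeros_like(output)

@torch.no_grad()
def ablation_effect(model, input_ids, attn_module, token=-1):
    """Total effect of ablating one attention layer on the final-token
    logits; comparing this to the layer's direct effect reveals
    compensation by downstream layers."""
    clean = model(input_ids).logits[0, token]
    handle = attn_module.register_forward_hook(zero_output_hook)
    ablated = model(input_ids).logits[0, token]
    handle.remove()
    return (clean - ablated).norm().item()

# e.g. ablation_effect(model, ids, model.transformer.h[12].attn)
```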

CHATREPORT: Democratizing Sustainability Disclosure Analysis through LLM-based Tools

  • paper_url: http://arxiv.org/abs/2307.15770
  • repo_url: https://github.com/edisonni-hku/chatreport
  • paper_authors: Jingwei Ni, Julia Bingler, Chiara Colesanti-Senni, Mathias Kraus, Glen Gostlow, Tobias Schimanski, Dominik Stammbach, Saeid Ashraf Vaghefi, Qian Wang, Nicolas Webersinke, Tobias Wekhof, Tingyu Yu, Markus Leippold
  • for: The paper aims to provide a novel LLM-based system for automating the analysis of corporate sustainability reports, with the goal of improving transparency and stakeholder empowerment.
  • methods: The system, called ChatReport, uses large language models (LLMs) to analyze sustainability reports and generate analyses, while addressing two key challenges: hallucination and the inefficiency of involving domain experts in the development loop.
  • results: The authors provide a methodology, annotated datasets, and generated analyses of 1015 reports to demonstrate the effectiveness of ChatReport. The results show that the system can provide accurate and traceable analyses of sustainability reports, empowering stakeholders and improving transparency in sustainability reporting.
    Abstract In the face of climate change, are companies really taking substantial steps toward more sustainable operations? A comprehensive answer lies in the dense, information-rich landscape of corporate sustainability reports. However, the sheer volume and complexity of these reports make human analysis very costly. Therefore, only a few entities worldwide have the resources to analyze these reports at scale, which leads to a lack of transparency in sustainability reporting. Empowering stakeholders with LLM-based automatic analysis tools can be a promising way to democratize sustainability report analysis. However, developing such tools is challenging due to (1) the hallucination of LLMs and (2) the inefficiency of bringing domain experts into the AI development loop. In this paper, we introduce ChatReport, a novel LLM-based system to automate the analysis of corporate sustainability reports, addressing existing challenges by (1) making the answers traceable to reduce the harm of hallucination and (2) actively involving domain experts in the development loop. We make our methodology, annotated datasets, and generated analyses of 1015 reports publicly available.
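
    The traceability mechanism can be sketched as retrieval-augmented answering where the model may only cite retrieved report chunks by id, so every claim can be checked against its source. `embed` and `llm` below are placeholder callables, not ChatReport's actual components:

```python
import numpy as np

def answer_with_sources(question, chunks, embed, llm, top_k=4):
    """Answer a question about a sustainability report using only the
    top-k retrieved chunks, and return the chunk ids so the answer
    stays traceable to the source text."""
    q = embed(question)
    doc_vecs = [embed(c) for c in chunks]
    sims = [float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d) + 1e-9))
            for d in doc_vecs]
    top = sorted(range(len(chunks)), key=lambda i: sims[i], reverse=True)[:top_k]
    context = "\n".join(f"[{i}] {chunks[i]}" for i in top)
    prompt = ("Answer using ONLY the sources below, citing them as [id]. "
              "If they are insufficient, say so.\n\n"
              f"{context}\n\nQuestion: {question}")
    return llm(prompt), top
```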

Goodness-of-Fit of Attributed Probabilistic Graph Generative Models

  • paper_url: http://arxiv.org/abs/2308.03773
  • repo_url: None
  • paper_authors: Pablo Robles-Granda, Katherine Tsai, Oluwasanmi Koyejo
  • for: Assessing the goodness of fit of probabilistic generative models of graphs.
  • methods: Goodness of fit is defined via the mean square contingency coefficient for random binary networks, with a procedure for verifying the quality of the structure of a learned attributed graph.
  • results: The criteria are applied to verify the representation capability of a probabilistic generative model across several popular types of graph models.
    Abstract Probabilistic generative models of graphs are important tools that enable representation and sampling. Many recent works have created probabilistic models of graphs that are capable of representing not only entity interactions but also their attributes. However, given a generative model of random attributed graph(s), the general conditions that establish goodness of fit are not clear a-priori. In this paper, we define goodness of fit in terms of the mean square contingency coefficient for random binary networks. For this statistic, we outline a procedure for assessing the quality of the structure of a learned attributed graph by ensuring that the discrepancy of the mean square contingency coefficient (constant, or random) is minimal with high probability. We apply these criteria to verify the representation capability of a probabilistic generative model for various popular types of graph models.
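
    The statistic itself is simple enough to state as code. For two binary edge-indicator vectors, the mean square contingency coefficient is φ² = χ²/n computed from their 2×2 contingency table. Pairing an observed adjacency with a model-sampled one, as below, is an illustrative use, not the paper's full procedure:

```python
import numpy as np

def phi_squared(a, b):
    """Mean square contingency coefficient (phi^2 = chi^2 / n) between
    two binary vectors via their 2x2 contingency table."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    n11 = np.sum(a & b);  n10 = np.sum(a & ~b)
    n01 = np.sum(~a & b); n00 = np.sum(~a & ~b)
    denom = float((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return 0.0 if denom == 0 else float(n11 * n00 - n10 * n01) ** 2 / denom

observed = np.random.rand(30, 30) < 0.1   # toy observed adjacency
sampled = np.random.rand(30, 30) < 0.1    # toy model-sampled adjacency
print(phi_squared(observed.ravel(), sampled.ravel()))
```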

Lessons in Reproducibility: Insights from NLP Studies in Materials Science

  • paper_url: http://arxiv.org/abs/2307.15759
  • repo_url: None
  • paper_authors: Xiangyun Lei, Edward Kim, Viktoriia Baibakova, Shijing Sun
  • for: A reproducibility analysis of two pioneering papers on NLP in materials science: "Machine-learned and codified synthesis parameters of oxide materials" and "Unsupervised word embeddings capture latent knowledge from materials science literature".
  • methods: Both papers provide thorough workflows, tidy and well-documented codebases, and clear guidance for model evaluation, making it feasible to replicate their results and partially reproduce their findings.
  • results: The analysis shows that both papers set commendable standards for future materials science publications, while also highlighting room for improvement: providing access to training data where copyright permits, greater transparency about model architecture and the training process, and specifying software dependency versions. A cross-comparison of the two papers' word embedding models further reveals that some key reproducibility and cross-compatibility differences are attributable to design choices outside the models themselves.
    Abstract Natural Language Processing (NLP), a cornerstone field within artificial intelligence, has been increasingly utilized in the field of materials science literature. Our study conducts a reproducibility analysis of two pioneering works within this domain: "Machine-learned and codified synthesis parameters of oxide materials" by Kim et al., and "Unsupervised word embeddings capture latent knowledge from materials science literature" by Tshitoyan et al. We aim to comprehend these studies from a reproducibility perspective, acknowledging their significant influence on the field of materials informatics, rather than critiquing them. Our study indicates that both papers offered thorough workflows, tidy and well-documented codebases, and clear guidance for model evaluation. This makes it easier to replicate their results successfully and partially reproduce their findings. In doing so, they set commendable standards for future materials science publications to aspire to. However, our analysis also highlights areas for improvement such as to provide access to training data where copyright restrictions permit, more transparency on model architecture and the training process, and specifications of software dependency versions. We also cross-compare the word embedding models between papers, and find that some key differences in reproducibility and cross-compatibility are attributable to design choices outside the bounds of the models themselves. In summary, our study appreciates the benchmark set by these seminal papers while advocating for further enhancements in research reproducibility practices in the field of NLP for materials science. This balance of understanding and continuous improvement will ultimately propel the intersecting domains of NLP and materials science literature into a future of exciting discoveries.
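
    Two of the gaps flagged above — unspecified dependency versions and opaque training setups — are cheap to close. A minimal sketch of recording both at the start of a run; the file name and package list are illustrative:

```python
import importlib.metadata
import json
import random

import numpy as np
import torch

def snapshot_run(packages, seed=42, path="run_manifest.json"):
    """Fix RNG seeds and record exact dependency versions so a training
    run can be re-created later."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    manifest = {"seed": seed,
                "versions": {p: importlib.metadata.version(p) for p in packages}}
    with open(path, "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest

snapshot_run(["numpy", "torch"])
```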

Uncertainty in Natural Language Generation: From Theory to Applications

  • paper_url: http://arxiv.org/abs/2307.15703
  • repo_url: https://github.com/Rastaman4e/-1
  • paper_authors: Joris Baan, Nico Daheim, Evgenia Ilia, Dennis Ulmer, Haau-Sing Li, Raquel Fernández, Barbara Plank, Rico Sennrich, Chrysoula Zerva, Wilker Aziz
  • for: Making natural language generation (NLG) systems more trustworthy and reliable, so they serve diverse users well.
  • methods: A principled treatment of uncertainty in NLG: the fundamental theory, frameworks, and vocabulary needed to represent uncertainty, plus a linguistic characterization of its main sources and a two-dimensional taxonomy.
  • results: The paper argues that handling uncertainty helps create systems and evaluation protocols better aligned with user needs, and highlights promising research directions that exploit uncertainty for decoding, controllable generation, self-assessment, selective answering, active learning, and more.
    Abstract Recent advances of powerful Language Models have allowed Natural Language Generation (NLG) to emerge as an important technology that can not only perform traditional tasks like summarisation or translation, but also serve as a natural language interface to a variety of applications. As such, it is crucial that NLG systems are trustworthy and reliable, for example by indicating when they are likely to be wrong; and supporting multiple views, backgrounds and writing styles -- reflecting diverse human sub-populations. In this paper, we argue that a principled treatment of uncertainty can assist in creating systems and evaluation protocols better aligned with these goals. We first present the fundamental theory, frameworks and vocabulary required to represent uncertainty. We then characterise the main sources of uncertainty in NLG from a linguistic perspective, and propose a two-dimensional taxonomy that is more informative and faithful than the popular aleatoric/epistemic dichotomy. Finally, we move from theory to applications and highlight exciting research directions that exploit uncertainty to power decoding, controllable generation, self-assessment, selective answering, active learning and more.
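
    One of the listed applications, selective answering, has a particularly direct reading: sample the generator several times, estimate uncertainty from the spread of the answers, and abstain above a threshold. The exact-match entropy below ignores paraphrase and is only a sketch of the idea; `generate` is a placeholder sampler:

```python
import math
from collections import Counter

def selective_answer(generate, prompt, n_samples=10, max_entropy=1.0):
    """Sample the model repeatedly, compute the entropy of the empirical
    answer distribution, and abstain when uncertainty is too high."""
    samples = [generate(prompt) for _ in range(n_samples)]
    counts = Counter(samples)
    probs = [c / n_samples for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    if entropy > max_entropy:
        return None, entropy          # abstain rather than risk a wrong answer
    return counts.most_common(1)[0][0], entropy
```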

AI for Anticipatory Action: Moving Beyond Climate Forecasting

  • paper_url: http://arxiv.org/abs/2307.15727
  • repo_url: None
  • paper_authors: Benjamin Q. Huynh, Mathew V. Kiang
  • for: An overview of the shift from climate forecasting to anticipatory action, and a review of machine learning's role in that shift.
  • methods: The paper reviews applications of machine learning models to climate forecasting and identifies methodological gaps in making such models support anticipatory action.
  • results: Machine learning can uniquely contribute to advancing disaster response for the populations most vulnerable to climate change, but further work is needed to close the remaining methodological gaps.
    Abstract Disaster response agencies have been shifting from a paradigm of climate forecasting towards one of anticipatory action: assessing not just what the climate will be, but how it will impact specific populations, thereby enabling proactive response and resource allocation. Machine learning models are becoming exceptionally powerful at climate forecasting, but methodological gaps remain in terms of facilitating anticipatory action. Here we provide an overview of anticipatory action, review relevant applications of machine learning, identify common challenges, and highlight areas where machine learning can uniquely contribute to advancing disaster response for populations most vulnerable to climate change.

A supervised hybrid quantum machine learning solution to the emergency escape routing problem

  • paper_url: http://arxiv.org/abs/2307.15682
  • repo_url: None
  • paper_authors: Nathan Haboury, Mo Kordzanganeh, Sebastian Schmitt, Ayush Joshi, Igor Tokarev, Lukas Abdallah, Andrii Kurkin, Basil Kyriacou, Alexey Melnikov
  • for: Using supervised hybrid quantum machine learning to optimize emergency evacuation plans during natural disasters.
  • methods: A novel hybrid supervised learning approach that runs a quantum FiLM neural network in parallel with a classical one, trained on a hypothetical city graph.
  • results: Combining the quantum and classical FiLM networks increases the overall model's expressivity and improves accuracy on the navigation task by 7% over the purely classical supervised approach.
    Abstract Managing the response to natural disasters effectively can considerably mitigate their devastating impact. This work explores the potential of using supervised hybrid quantum machine learning to optimize emergency evacuation plans for cars during natural disasters. The study focuses on earthquake emergencies and models the problem as a dynamic computational graph where an earthquake damages an area of a city. The residents seek to evacuate the city by reaching the exit points where traffic congestion occurs. The situation is modeled as a shortest-path problem on an uncertain and dynamically evolving map. We propose a novel hybrid supervised learning approach and test it on hypothetical situations on a concrete city graph. This approach uses a novel quantum feature-wise linear modulation (FiLM) neural network parallel to a classical FiLM network to imitate Dijkstra's node-wise shortest path algorithm on a deterministic dynamic graph. Adding the quantum neural network in parallel increases the overall model's expressivity by splitting the dataset's harmonic and non-harmonic features between the quantum and classical components. The hybrid supervised learning agent is trained on a dataset of Dijkstra's shortest paths and can successfully learn the navigation task. The hybrid quantum network improves over the purely classical supervised learning approach by 7% in accuracy. We show that the quantum part has a significant contribution of 45.(3)% to the prediction and that the network could be executed on an ion-based quantum computer. The results demonstrate the potential of supervised hybrid quantum machine learning in improving emergency evacuation planning during natural disasters.
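
    The parallel-branch design can be sketched as a FiLM layer whose modulation parameters come from two branches. The quantum branch is stubbed here with a classical sine feature map purely for illustration; in the paper it is a quantum circuit, and the split of harmonic vs. non-harmonic features between branches is learned:

```python
import torch
import torch.nn as nn

class HybridFiLM(nn.Module):
    """FiLM conditioning whose (gamma, beta) parameters are produced by
    two parallel branches, imitating the paper's quantum/classical split."""
    def __init__(self, cond_dim, feat_dim):
        super().__init__()
        self.classical = nn.Linear(cond_dim, 2 * feat_dim)
        self.quantum_stub = nn.Linear(cond_dim, 2 * feat_dim)  # stands in for a quantum circuit

    def forward(self, features, condition):
        gamma_c, beta_c = self.classical(condition).chunk(2, dim=-1)
        gamma_q, beta_q = torch.sin(self.quantum_stub(condition)).chunk(2, dim=-1)
        # Feature-wise linear modulation with contributions from both branches.
        return (gamma_c + gamma_q) * features + (beta_c + beta_q)

film = HybridFiLM(cond_dim=16, feat_dim=64)
out = film(torch.randn(8, 64), torch.randn(8, 16))   # (batch, feat_dim)
```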

Benchmarking Anomaly Detection System on various Jetson Edge Devices

  • paper_url: http://arxiv.org/abs/2307.16834
  • repo_url: None
  • paper_authors: Hoang Viet Pham, Thinh Gia Tran, Chuong Dinh Le, An Dinh Le, Hien Bich Vo
  • for: Capturing abnormal events from surveillance video to enhance citizen safety and well-being.
  • methods: An end-to-end crime-scene anomaly detection system built with EdgeAI technology, applying weakly supervised Robust Temporal Feature Magnitude Learning (RTFM).
  • results: The anomaly detection model is competitive with other state-of-the-art algorithms; the AI system is tested and deployed on multiple Jetson edge devices, and the paper shares deployment experience using Docker for system performance enhancement.
    Abstract Capturing the abnormal event from surveillance videos enhances the safety and well-being of the citizens. The application of EdgeAI (Edge computing-based Artificial Intelligent ) meets the strict latency requirements for security. In this paper, we apply weakly supervised video anomaly detection called Robust Temporal Feature Magnitude Learning (RTFM) to an end-to-end crime-scene anomaly detection system from the surveillance cameras with the help of edge computing technology. The system is tested directly on multiple Jetson edge devices combined with TensorRT as the software developer kit from NVIDIA for system performance enhancement. The experience of an AI-based system deployment on various Jetson Edge devices with Docker technology is also provided. The anomaly detection model yields competitive results compared to other state-of-the-art (SOTA) algorithms on available datasets such as UCF-Crime and UIT VNAnomaly. The approach system reaches 47.56 frames per second (FPS) inference speed on a Jetson edge device with only 3.11 GB RAM usage total. We also discover the promising Jetson device that the AI system achieves 15% better performance than the previous version of Jetson devices while consuming 50% less energy power.
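
    The headline numbers (FPS per device) come down to a timing loop around inference. A minimal sketch of how such a throughput measurement is typically done; the input shape and iteration counts are illustrative, and the paper additionally uses TensorRT-optimized engines:

```python
import time
import torch

@torch.no_grad()
def benchmark_fps(model, input_shape=(1, 3, 224, 224), warmup=20, iters=200):
    """Average frames-per-second of single-image inference."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()
    x = torch.randn(*input_shape, device=device)
    for _ in range(warmup):               # let clocks and caches settle
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()          # timing must wait for the GPU
    start = time.perf_counter()
    for _ in range(iters):
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    return iters / (time.perf_counter() - start)
```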

Case Studies of Causal Discovery from IT Monitoring Time Series

  • paper_url: http://arxiv.org/abs/2307.15678
  • repo_url: https://github.com/Aryia-Behroziuan/References
  • paper_authors: Ali Aït-Bachir, Charles K. Assaad, Christophe de Bignicourt, Emilie Devijver, Simon Ferreira, Eric Gaussier, Hosein Mohanna, Lei Zan
  • for: Applying causal discovery algorithms to IT monitoring data to uncover causal relations within the system.
  • methods: Case studies run several causal discovery algorithms, including the PC algorithm, FCI, and the Causal Additive Model (CAM), on different IT monitoring datasets.
  • results: Causal discovery helps recover causal relations between system components, but challenges remain, such as misaligned time series, sleeping time series, timestamp errors, and missing values.
    Abstract Information technology (IT) systems are vital for modern businesses, handling data storage, communication, and process automation. Monitoring these systems is crucial for their proper functioning and efficiency, as it allows collecting extensive observational time series data for analysis. The interest in causal discovery is growing in IT monitoring systems as knowing causal relations between different components of the IT system helps in reducing downtime, enhancing system performance and identifying root causes of anomalies and incidents. It also allows proactive prediction of future issues through historical data analysis. Despite its potential benefits, applying causal discovery algorithms on IT monitoring data poses challenges, due to the complexity of the data. For instance, IT monitoring data often contains misaligned time series, sleeping time series, timestamp errors and missing values. This paper presents case studies on applying causal discovery algorithms to different IT monitoring datasets, highlighting benefits and ongoing challenges.
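
    The data issues named above (misaligned series, timestamp errors, missing values) are typically handled before any discovery algorithm runs. A sketch of such preprocessing with pandas; the resampling frequency and gap-filling limit are illustrative choices, not the paper's pipeline:

```python
import pandas as pd

def align_monitoring_series(series_by_metric, freq="1min"):
    """Put misaligned IT monitoring series on a shared clock: sort, drop
    duplicate timestamps, resample to a common frequency, and bridge
    only short gaps so long 'sleeping' stretches stay visible as NaN."""
    aligned = {}
    for name, s in series_by_metric.items():
        s = s.sort_index()
        s = s[~s.index.duplicated(keep="first")]   # timestamp errors/duplicates
        aligned[name] = s.resample(freq).mean()
    return pd.concat(aligned, axis=1).ffill(limit=5)
```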

Scaling Data Generation in Vision-and-Language Navigation

  • paper_url: http://arxiv.org/abs/2307.15644
  • repo_url: https://github.com/wz0919/scalevln
  • paper_authors: Zun Wang, Jialu Li, Yicong Hong, Yi Wang, Qi Wu, Mohit Bansal, Stephen Gould, Hao Tan, Yu Qiao
  • for: Improving the performance of language-guided navigation agents in generic (unseen) environments.
  • methods: Large-scale data generation from 1200+ photo-realistic environments in the HM3D and Gibson datasets plus web resources, synthesizing 4.9 million instruction-trajectory pairs used for pre-training and fine-tuning.
  • results: The generated data pushes an existing agent's performance to a new best (+11% absolute over the previous SoTA, reaching an 80% single-run success rate on the R2R test split), reduces the seen-unseen generalization gap to under 1% (versus 8% previously), and enables different models to reach new state-of-the-art navigation results on CVDN, REVERIE, and R2R in continuous environments.
    Abstract Recent research in language-guided visual navigation has demonstrated a significant demand for the diversity of traversable environments and the quantity of supervision for training generalizable agents. To tackle the common data scarcity issue in existing vision-and-language navigation datasets, we propose an effective paradigm for generating large-scale data for learning, which applies 1200+ photo-realistic environments from HM3D and Gibson datasets and synthesizes 4.9 million instruction trajectory pairs using fully-accessible resources on the web. Importantly, we investigate the influence of each component in this paradigm on the agent's performance and study how to adequately apply the augmented data to pre-train and fine-tune an agent. Thanks to our large-scale dataset, the performance of an existing agent can be pushed up (+11% absolute with regard to previous SoTA) to a significantly new best of 80% single-run success rate on the R2R test split by simple imitation learning. The long-lasting generalization gap between navigating in seen and unseen environments is also reduced to less than 1% (versus 8% in the previous best method). Moreover, our paradigm also facilitates different models to achieve new state-of-the-art navigation results on CVDN, REVERIE, and R2R in continuous environments.
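
    The "simple imitation learning" that consumes these pairs reduces to behavior cloning on expert actions. A minimal sketch of one training step; the agent interface, batch layout, and padding convention are placeholders for a real VLN policy:

```python
import torch
import torch.nn.functional as F

def imitation_step(agent, optimizer, batch):
    """One behavior-cloning step: predict the expert action at every
    trajectory step of a synthesized instruction-trajectory pair."""
    logits = agent(batch["instruction"], batch["observations"])  # (B, T, num_actions)
    loss = F.cross_entropy(
        logits.flatten(0, 1),                # (B*T, num_actions)
        batch["expert_actions"].flatten(),   # (B*T,)
        ignore_index=-1,                     # skip padded trajectory steps
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```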