cs.AI - 2023-07-13

IR Design for Application-Specific Natural Language: A Case Study on Traffic Data

  • paper_url: http://arxiv.org/abs/2307.06983
  • repo_url: None
  • paper_authors: Wei Hu, Xuhong Wang, Ding Wang, Shengyue Yao, Zuqiu Mao, Li Li, Fei-Yue Wang, Yilun Lin
  • for: This paper proposes an intermediate representation (IR) design for Application-Specific Natural Language (ASNL), aiming to improve the performance of traffic data processing.
  • methods: The paper analyzes and processes ASNL data and designs an IR that uniformly converts transportation data into a graph data format.
  • results: Experiments show that, for standard data query operations, the proposed IR design improves processing speed by more than forty times compared with direct use of standard XML-format data (a toy illustration of the gap follows the abstract).
    Abstract In the realm of software applications in the transportation industry, Domain-Specific Languages (DSLs) have enjoyed widespread adoption due to their ease of use and various other benefits. With the ceaseless progress in computer performance and the rapid development of large-scale models, the possibility of programming using natural language in specified applications - referred to as Application-Specific Natural Language (ASNL) - has emerged. ASNL exhibits greater flexibility and freedom, which, in turn, leads to an increase in computational complexity for parsing and a decrease in processing performance. To tackle this issue, our paper advances a design for an intermediate representation (IR) that caters to ASNL and can uniformly process transportation data into graph data format, improving data processing performance. Experimental comparisons reveal that in standard data query operations, our proposed IR design can achieve a speed improvement of over forty times compared to direct usage of standard XML format data.
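To make the access-pattern gap concrete, below is a stdlib-only Python toy contrasting the two storage styles: every XML query re-parses and scans the document, while the graph form is an adjacency map built once. The element names and synthetic road network are invented for this sketch; it is not the paper's IR.

```python
# Stdlib-only toy: repeated XML parsing/scanning vs. a one-time graph conversion.
import time
import xml.etree.ElementTree as ET
from collections import defaultdict

# Synthetic road network: 10,000 directed segments.
segments = [(f"n{i}", f"n{(i * 7 + 1) % 5000}", 30 + i % 60) for i in range(10_000)]
xml_doc = "<net>" + "".join(
    f'<seg src="{s}" dst="{d}" speed="{v}"/>' for s, d, v in segments) + "</net>"

def query_xml(node):
    root = ET.fromstring(xml_doc)               # re-parse on every query
    return [e.get("dst") for e in root if e.get("src") == node]

adj = defaultdict(list)                         # one-time graph conversion
for s, d, _ in segments:
    adj[s].append(d)

def query_graph(node):
    return adj[node]                            # constant-time lookup

t0 = time.perf_counter(); [query_xml(f"n{i}") for i in range(100)]
t_xml = time.perf_counter() - t0
t0 = time.perf_counter(); [query_graph(f"n{i}") for i in range(100)]
t_graph = time.perf_counter() - t0
print(f"XML scan: {t_xml:.3f}s   graph lookup: {t_graph:.6f}s")
```

The speedup here comes purely from avoiding repeated parsing and scanning; the paper's IR layers a uniform graph schema for traffic data on top of this effect.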

A Causal Framework to Unify Common Domain Generalization Approaches

  • paper_url: http://arxiv.org/abs/2307.06825
  • repo_url: None
  • paper_authors: Nevin L. Zhang, Kaican Li, Han Gao, Weiyan Xie, Zhi Lin, Zhenguo Li, Luning Wang, Yongxiang Huang
  • for: This paper studies domain generalization (DG), which aims to develop machine learning models that generalize well to new domains.
  • methods: The paper proposes a causal framework for domain generalization and uses it to interpret and analyze common DG approaches.
  • results: The paper offers a new understanding of DG, explaining how different DG methods relate to one another and their relative advantages and limitations, helping researchers grasp the underlying principles and develop more effective approaches to this important problem.
    Abstract Domain generalization (DG) is about learning models that generalize well to new domains that are related to, but different from, the training domain(s). It is a fundamental problem in machine learning and has attracted much attention in recent years. A large number of approaches have been proposed. Different approaches are motivated from different perspectives, making it difficult to gain an overall understanding of the area. In this paper, we propose a causal framework for domain generalization and present an understanding of common DG approaches in the framework. Our work sheds new light on the following questions: (1) What are the key ideas behind each DG method? (2) Why is it expected to improve generalization to new domains theoretically? (3) How are different DG methods related to each other and what are relative advantages and limitations? By providing a unified perspective on DG, we hope to help researchers better understand the underlying principles and develop more effective approaches for this critical problem in machine learning.

TinyMetaFed: Efficient Federated Meta-Learning for TinyML

  • paper_url: http://arxiv.org/abs/2307.06822
  • repo_url: None
  • paper_authors: Haoyu Ren, Xue Li, Darko Anicic, Thomas A. Runkler
  • for: This work explores using federated meta-learning to bring machine learning to resource-constrained Tiny Machine Learning (TinyML) devices.
  • methods: The work proposes TinyMetaFed, a model-agnostic meta-learning framework for collaborative training on TinyML devices. TinyMetaFed saves communication cost and protects privacy through partial local reconstruction and Top-P% selective communication (sketched below), improves computational efficiency via online learning, and handles client heterogeneity through few-shot learning.
  • results: Across three TinyML use cases, TinyMetaFed significantly reduces energy consumption and communication overhead, accelerates convergence, and stabilizes the training process.
    Abstract The field of Tiny Machine Learning (TinyML) has made substantial advancements in democratizing machine learning on low-footprint devices, such as microcontrollers. The prevalence of these miniature devices raises the question of whether aggregating their knowledge can benefit TinyML applications. Federated meta-learning is a promising answer to this question, as it addresses the scarcity of labeled data and heterogeneous data distribution across devices in the real world. However, deploying TinyML hardware faces unique resource constraints, making existing methods impractical due to energy, privacy, and communication limitations. We introduce TinyMetaFed, a model-agnostic meta-learning framework suitable for TinyML. TinyMetaFed facilitates collaborative training of a neural network initialization that can be quickly fine-tuned on new devices. It offers communication savings and privacy protection through partial local reconstruction and Top-P% selective communication, computational efficiency via online learning, and robustness to client heterogeneity through few-shot learning. The evaluations on three TinyML use cases demonstrate that TinyMetaFed can significantly reduce energy consumption and communication overhead, accelerate convergence, and stabilize the training process.
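A minimal numpy sketch of the Top-P% selective communication idea named in the abstract: a client uploads only the largest-magnitude fraction P of its parameter update. The function names and the sparse merge rule are our assumptions, not the TinyMetaFed implementation.

```python
import numpy as np

def top_p_update(local_weights, global_weights, p=0.1):
    """Return (indices, values) for the top-p fraction of the update by |delta|."""
    delta = (local_weights - global_weights).ravel()
    k = max(1, int(p * delta.size))
    idx = np.argpartition(np.abs(delta), -k)[-k:]   # indices of k largest |delta|
    return idx, delta[idx]

def server_apply(global_weights, idx, values):
    updated = global_weights.ravel().copy()
    updated[idx] += values                          # sparse merge on the server
    return updated.reshape(global_weights.shape)

rng = np.random.default_rng(0)
g = rng.normal(size=(64, 32))
local = g + rng.normal(scale=0.01, size=g.shape)    # weights after local fine-tuning
idx, vals = top_p_update(local, g, p=0.1)
print(f"sent {idx.size} of {g.size} values ({idx.size / g.size:.0%} of full payload)")
g_new = server_apply(g, idx, vals)
```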

Negated Complementary Commonsense using Large Language Models

  • paper_url: http://arxiv.org/abs/2307.06794
  • repo_url: https://github.com/navidre/negated_complementary_commonsense
  • paper_authors: Navid Rezaei, Marek Z. Reformat
  • for: This paper aims to address large language models' performance issues on out-of-ordinary questions.
  • methods: The study proposes a model-agnostic method to improve performance in negated complementary scenarios.
  • results: The proposed method outperforms GPT-3's few-shot generation by more than 11 points and highlights the significance of studying large language models' responses to negated complementary questions.
    Abstract Larger language models, such as GPT-3, have shown to be excellent in many tasks. However, we demonstrate that out-of-ordinary questions can throw the model off guard. This work focuses on finding answers to negated complementary questions in commonsense scenarios. We illustrate how such questions adversely affect the model responses. We propose a model-agnostic methodology to improve the performance in negated complementary scenarios. Our method outperforms few-shot generation from GPT-3 (by more than 11 points) and, more importantly, highlights the significance of studying the response of large language models in negated complementary questions. The code, data, and experiments are available under: https://github.com/navidre/negated_complementary_commonsense.

Towards Ordinal Data Science

  • paper_url: http://arxiv.org/abs/2307.09477
  • repo_url: None
  • paper_authors: Gerd Stumme, Dominik Dürrschnabel, Tom Hanika
  • for: This work aims to establish Ordinal Data Science as a new research direction, concerned with computing and reasoning over ordinal structures.
  • methods: The work discusses various means for measuring and "calculating" with ordinal structures, a specific class of directed graphs, and shows how to infer knowledge from them.
  • results: The work argues that, through cross-fertilization with other machine learning and knowledge representation methods and applications across many disciplines, Ordinal Data Science offers new perspectives and tools for analyzing complex data.
    Abstract Order is one of the main instruments to measure the relationship between objects in (empirical) data. However, compared to methods that use numerical properties of objects, the amount of ordinal methods developed is rather small. One reason for this is the limited availability of computational resources in the last century that would have been required for ordinal computations. Another reason -- particularly important for this line of research -- is that order-based methods are often seen as too mathematically rigorous for applying them to real-world data. In this paper, we will therefore discuss different means for measuring and 'calculating' with ordinal structures -- a specific class of directed graphs -- and show how to infer knowledge from them. Our aim is to establish Ordinal Data Science as a fundamentally new research agenda. Besides cross-fertilization with other cornerstone machine learning and knowledge representation methods, a broad range of disciplines will benefit from this endeavor, including, psychology, sociology, economics, web science, knowledge engineering, scientometrics.

Layered controller synthesis for dynamic multi-agent systems

  • paper_url: http://arxiv.org/abs/2307.06758
  • repo_url: None
  • paper_authors: Emily Clement, Nicolas Perrin-Gilbert, Philipp Schlehuber-Caissier
  • for: This paper presents a layered approach to the multi-agent control problem, decomposed into three stages, each building on the results of the previous one.
  • methods: The first stage computes a high-level plan over a coarse abstraction of the system using parametric timed automata augmented with stopwatches, which efficiently model simplified dynamics; the second stage, based on an SMT formulation, mainly handles the combinatorial aspects of the problem and yields a more dynamically accurate solution.
  • results: The SWA-SMT solutions serve as the initial training dataset for the final stage, which trains a neural network control policy with reinforcement learning; experiments show the initial dataset is crucial to the method's overall success.
    Abstract In this paper we present a layered approach for multi-agent control problem, decomposed into three stages, each building upon the results of the previous one. First, a high-level plan for a coarse abstraction of the system is computed, relying on parametric timed automata augmented with stopwatches as they allow to efficiently model simplified dynamics of such systems. In the second stage, the high-level plan, based on SMT-formulation, mainly handles the combinatorial aspects of the problem and provides a more dynamically accurate solution. These stages are collectively referred to as the SWA-SMT solver. They are correct by construction but lack a crucial feature: they cannot be executed in real time. To overcome this, we use SWA-SMT solutions as the initial training dataset for our last stage, which aims at obtaining a neural network control policy. We use reinforcement learning to train the policy, and show that the initial dataset is crucial for the overall success of the method.

Extended Graph Assessment Metrics for Graph Neural Networks

  • paper_url: http://arxiv.org/abs/2307.10112
  • repo_url: None
  • paper_authors: Tamara T. Mueller, Sophie Starck, Leonhard F. Feiner, Kyriaki-Margarita Bintsi, Daniel Rueckert, Georgios Kaissis
  • for: This work targets the use of graph neural networks (GNNs) for medical downstream tasks in which patient cohorts are restructured into so-called population graphs.
  • methods: The work introduces extended graph assessment metrics (GAMs) for evaluating graph structures, focusing on two GAMs, homophily and cross-class neighbourhood similarity (CCNS), extended to regression tasks and continuous adjacency matrices (a sketch follows the abstract).
  • results: The study finds that these metrics correlate with model performance on different medical population graphs and under different learning settings.
    Abstract When re-structuring patient cohorts into so-called population graphs, initially independent data points can be incorporated into one interconnected graph structure. This population graph can then be used for medical downstream tasks using graph neural networks (GNNs). The construction of a suitable graph structure is a challenging step in the learning pipeline that can have severe impact on model performance. To this end, different graph assessment metrics have been introduced to evaluate graph structures. However, these metrics are limited to classification tasks and discrete adjacency matrices, only covering a small subset of real-world applications. In this work, we introduce extended graph assessment metrics (GAMs) for regression tasks and continuous adjacency matrices. We focus on two GAMs in specific: \textit{homophily} and \textit{cross-class neighbourhood similarity} (CCNS). We extend the notion of GAMs to more than one hop, define homophily for regression tasks, as well as continuous adjacency matrices, and propose a light-weight CCNS distance for discrete and continuous adjacency matrices. We show the correlation of these metrics with model performance on different medical population graphs and under different learning settings.
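Below is a small numpy sketch of one plausible reading of the extended metrics: weighted node homophily over a continuous adjacency matrix, plus a regression variant that swaps label equality for label similarity. The exact definitions in the paper may differ.

```python
import numpy as np

def homophily(A, y):
    """Weighted homophily for discrete labels y over a (possibly continuous) adjacency A."""
    same = (y[:, None] == y[None, :]).astype(float)
    return (A * same).sum() / A.sum()

def homophily_regression(A, y):
    """Regression analogue: edges joining nodes with similar targets score high."""
    dist = np.abs(y[:, None] - y[None, :])
    sim = 1.0 - dist / (dist.max() + 1e-12)
    return (A * sim).sum() / A.sum()

rng = np.random.default_rng(0)
A = rng.random((50, 50)) * (rng.random((50, 50)) < 0.1)  # sparse continuous adjacency
A = (A + A.T) / 2
y_cls = rng.integers(0, 3, size=50)
y_reg = rng.normal(size=50)
print(homophily(A, y_cls), homophily_regression(A, y_reg))
```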

Learning Multiple Coordinated Agents under Directed Acyclic Graph Constraints

  • paper_url: http://arxiv.org/abs/2307.07529
  • repo_url: None
  • paper_authors: Jaeyeon Jang, Diego Klabjan, Han Liu, Nital S. Patel, Xiuqi Li, Balakrishnan Ananthanarayanan, Husam Dauod, Tzung-Han Juang
  • for: This paper addresses multi-agent reinforcement learning (MARL) for multiple coordinated agents under directed acyclic graph (DAG) constraints.
  • methods: The method explicitly exploits the DAG structure between agents for more effective learning. It proposes a surrogate value function based on a MARL model with synthetic rewards (MARLM-SR) and proves it is a lower bound of the optimal value function. Computationally, a practical training algorithm introduces a leader agent together with a reward generator and distributor agent to guide the decomposed follower agents in exploring the parameter space.
  • results: On four DAG environments, including a real-world scheduling environment from one of Intel's high-volume packaging and test factories, the method outperforms non-DAG approaches.
    Abstract This paper proposes a novel multi-agent reinforcement learning (MARL) method to learn multiple coordinated agents under directed acyclic graph (DAG) constraints. Unlike existing MARL approaches, our method explicitly exploits the DAG structure between agents to achieve more effective learning performance. Theoretically, we propose a novel surrogate value function based on a MARL model with synthetic rewards (MARLM-SR) and prove that it serves as a lower bound of the optimal value function. Computationally, we propose a practical training algorithm that exploits new notion of leader agent and reward generator and distributor agent to guide the decomposed follower agents to better explore the parameter space in environments with DAG constraints. Empirically, we exploit four DAG environments including a real-world scheduling for one of Intel's high volume packaging and test factory to benchmark our methods and show it outperforms the other non-DAG approaches.

Vehicle Dispatching and Routing of On-Demand Intercity Ride-Pooling Services: A Multi-Agent Hierarchical Reinforcement Learning Approach

  • paper_url: http://arxiv.org/abs/2307.06742
  • repo_url: None
  • paper_authors: Jinhua Si, Fang He, Xi Lin, Xindi Tang
  • for: To improve the transportation efficiency and service quality of city clusters through intercity ride-pooling.
  • methods: A multi-agent feudal reinforcement learning model combined with an adaptive large neighborhood search heuristic (an ALNS sketch follows the abstract).
  • results: Improved average daily system profit and order fulfillment ratio.
    Abstract The integrated development of city clusters has given rise to an increasing demand for intercity travel. Intercity ride-pooling service exhibits considerable potential in upgrading traditional intercity bus services by implementing demand-responsive enhancements. Nevertheless, its online operations suffer the inherent complexities due to the coupling of vehicle resource allocation among cities and pooled-ride vehicle routing. To tackle these challenges, this study proposes a two-level framework designed to facilitate online fleet management. Specifically, a novel multi-agent feudal reinforcement learning model is proposed at the upper level of the framework to cooperatively assign idle vehicles to different intercity lines, while the lower level updates the routes of vehicles using an adaptive large neighborhood search heuristic. Numerical studies based on the realistic dataset of Xiamen and its surrounding cities in China show that the proposed framework effectively mitigates the supply and demand imbalances, and achieves significant improvement in both the average daily system profit and order fulfillment ratio.
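The lower level's adaptive large neighborhood search admits a compact sketch: destroy and repair operators are drawn with weights that adapt to past success. The toy objective and all operators below are invented; the paper's operators act on pooled-ride vehicle routes.

```python
import random

stops = list(range(20))

def objective(route):
    return sum(abs(route[i] - route[i + 1]) for i in range(len(route) - 1))

def random_removal(route, rng):
    return [s for s in route if rng.random() > 0.2]      # destroy: drop ~20% of stops

def segment_removal(route, rng):
    if len(route) < 4:
        return list(route)
    i = rng.randrange(len(route) - 3)
    return route[:i] + route[i + 3:]                     # destroy: drop a 3-stop segment

def greedy_insert(route, rng):
    route = list(route)
    for s in set(stops) - set(route):
        pos = min(range(len(route) + 1),
                  key=lambda i: objective(route[:i] + [s] + route[i:]))
        route.insert(pos, s)                             # repair: cheapest insertion
    return route

def alns(initial, destroyers, repairers, iters=300, seed=0):
    rng = random.Random(seed)
    weights = {op: 1.0 for op in destroyers}
    best = cur = initial
    for _ in range(iters):
        ops = list(weights)
        destroy = rng.choices(ops, weights=[weights[o] for o in ops])[0]
        repair = rng.choice(repairers)
        cand = repair(destroy(cur, rng), rng)
        reward = 2.0 if objective(cand) < objective(cur) else 0.5
        weights[destroy] = 0.8 * weights[destroy] + 0.2 * reward  # adapt weights
        if objective(cand) < objective(cur):
            cur = cand
        if objective(cand) < objective(best):
            best = cand
    return best

start = random.Random(1).sample(stops, len(stops))
best = alns(start, [random_removal, segment_removal], [greedy_insert])
print(objective(start), "->", objective(best))
```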

MPR-Net: Multi-Scale Pattern Reproduction Guided Universality Time Series Interpretable Forecasting

  • paper_url: http://arxiv.org/abs/2307.06736
  • repo_url: https://github.com/coding-loong/MPR-Net
  • paper_authors: Tianlong Zhao, Xiang Ma, Xuemei Li, Caiming Zhang
  • for: Time series forecasting, improving both predictive performance and interpretability.
  • methods: The paper proposes a forecasting model built on convolution and deconvolution: it adaptively decomposes multi-scale historical series patterns, extends them forward using prior knowledge of pattern reproduction, and reconstructs future patterns into a future series (a minimal sketch follows the abstract).
  • results: Extensive experiments on more than ten real datasets show state-of-the-art forecasting performance with good generalization and robustness, while keeping the forecasting process interpretable.
    Abstract Time series forecasting has received wide interest from existing research due to its broad applications and inherent challenges. The research challenge lies in identifying effective patterns in historical series and applying them to future forecasting. Advanced models based on point-wise connected MLP and Transformer architectures have strong fitting power, but their secondary computational complexity limits practicality. Additionally, those structures inherently disrupt the temporal order, reducing the information utilization and making the forecasting process uninterpretable. To solve these problems, this paper proposes a forecasting model, MPR-Net. It first adaptively decomposes multi-scale historical series patterns using convolution operation, then constructs a pattern extension forecasting method based on the prior knowledge of pattern reproduction, and finally reconstructs future patterns into future series using deconvolution operation. By leveraging the temporal dependencies present in the time series, MPR-Net not only achieves linear time complexity, but also makes the forecasting process interpretable. By carrying out sufficient experiments on more than ten real data sets of both short and long term forecasting tasks, MPR-Net achieves the state of the art forecasting performance, as well as good generalization and robustness performance.
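A minimal PyTorch sketch of the decompose-then-reconstruct pattern the abstract describes: strided Conv1d layers extract patterns at several scales, and matching ConvTranspose1d layers map them back to a series. Layer sizes are illustrative; this is not the released MPR-Net architecture.

```python
import torch
import torch.nn as nn

class MultiScaleDecompose(nn.Module):
    def __init__(self, scales=(2, 4, 8), channels=16):
        super().__init__()
        self.enc = nn.ModuleList(
            nn.Conv1d(1, channels, kernel_size=s, stride=s) for s in scales)
        self.dec = nn.ModuleList(
            nn.ConvTranspose1d(channels, 1, kernel_size=s, stride=s) for s in scales)

    def forward(self, x):                        # x: (batch, 1, length)
        patterns = [e(x) for e in self.enc]      # one pattern map per scale
        recon = sum(d(p) for d, p in zip(self.dec, patterns)) / len(patterns)
        return patterns, recon

x = torch.randn(4, 1, 96)                        # length divisible by all scales
patterns, recon = MultiScaleDecompose()(x)
print([p.shape for p in patterns], recon.shape)
```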

GRAN is superior to GraphRNN: node orderings, kernel- and graph embeddings-based metrics for graph generators

  • paper_url: http://arxiv.org/abs/2307.06709
  • repo_url: https://github.com/otouat/gnnevaluationmetrics
  • paper_authors: Ousmane Touat, Julian Stier, Pierre-Edouard Portier, Michael Granitzer
  • for: Evaluation and selection of graph generative models, and improvement of such models.
  • methods: The paper studies kernel-based metrics on distributions of graph invariants, as well as manifold- and kernel-based metrics in graph embedding space, and uses them to compare two well-known graph generators, GRAN and GraphRNN.
  • results: GRAN is found superior to GraphRNN, and the proposed adaptation of GraphRNN with a depth-first search node ordering is effective for small graphs (a DFS-ordering sketch follows the abstract).
    Abstract A wide variety of generative models for graphs have been proposed. They are used in drug discovery, road networks, neural architecture search, and program synthesis. Generating graphs has theoretical challenges, such as isomorphic representations -- evaluating how well a generative model performs is difficult. Which model to choose depending on the application domain? We extensively study kernel-based metrics on distributions of graph invariants and manifold-based and kernel-based metrics in graph embedding space. Manifold-based metrics outperform kernel-based metrics in embedding space. We use these metrics to compare GraphRNN and GRAN, two well-known generative models for graphs, and unveil the influence of node orderings. It shows the superiority of GRAN over GraphRNN - further, our proposed adaptation of GraphRNN with a depth-first search ordering is effective for small-sized graphs. A guideline on good practices regarding dataset selection and node feature initialization is provided. Our work is accompanied by open-source code and reproducible experiments.
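The depth-first search ordering used in the GraphRNN adaptation is easy to illustrate: relabel nodes by DFS discovery order and permute the adjacency matrix accordingly before sequential generation. A stdlib/numpy sketch (the graph is a toy example):

```python
import numpy as np

def dfs_order(adj, start=0):
    """Depth-first discovery order over an adjacency-matrix graph."""
    n = len(adj)
    order, seen, stack = [], set(), [start]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        order.append(u)
        stack.extend(v for v in range(n - 1, -1, -1) if adj[u][v] and v not in seen)
    order += [v for v in range(n) if v not in seen]   # disconnected remainder
    return order

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 0, 0, 0]])
perm = dfs_order(A)
A_dfs = A[np.ix_(perm, perm)]   # adjacency under the DFS ordering
print(perm, A_dfs, sep="\n")
```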

S-HR-VQVAE: Sequential Hierarchical Residual Learning Vector Quantized Variational Autoencoder for Video Prediction

  • paper_url: http://arxiv.org/abs/2307.06701
  • repo_url: None
  • paper_authors: Mohammad Adiban, Kalin Stefanov, Sabato Marco Siniscalchi, Giampiero Salvi
  • for: Video prediction. The paper proposes a model combining (i) the authors' recently proposed hierarchical residual vector quantized variational autoencoder (HR-VQVAE) and (ii) a novel spatiotemporal PixelCNN (ST-PixelCNN), referred to as the sequential hierarchical residual learning vector quantized variational autoencoder (S-HR-VQVAE).
  • methods: The approach pairs HR-VQVAE's parsimonious modeling of still images with ST-PixelCNN's handling of spatiotemporal information, addressing the chief challenges of video prediction: learning spatiotemporal information, handling high-dimensional data, combating blurry prediction, and implicitly modeling physical characteristics.
  • results: Extensive experiments on the KTH Human Action and Moving-MNIST tasks show that the model compares favorably with top video prediction techniques in both quantitative and qualitative evaluations despite a much smaller model size. A novel training method jointly estimating the HR-VQVAE and ST-PixelCNN parameters further boosts performance.
    Abstract We address the video prediction task by putting forth a novel model that combines (i) our recently proposed hierarchical residual vector quantized variational autoencoder (HR-VQVAE), and (ii) a novel spatiotemporal PixelCNN (ST-PixelCNN). We refer to this approach as a sequential hierarchical residual learning vector quantized variational autoencoder (S-HR-VQVAE). By leveraging the intrinsic capabilities of HR-VQVAE at modeling still images with a parsimonious representation, combined with the ST-PixelCNN's ability at handling spatiotemporal information, S-HR-VQVAE can better deal with chief challenges in video prediction. These include learning spatiotemporal information, handling high dimensional data, combating blurry prediction, and implicit modeling of physical characteristics. Extensive experimental results on the KTH Human Action and Moving-MNIST tasks demonstrate that our model compares favorably against top video prediction techniques both in quantitative and qualitative evaluations despite a much smaller model size. Finally, we boost S-HR-VQVAE by proposing a novel training method to jointly estimate the HR-VQVAE and ST-PixelCNN parameters.

Short Boolean Formulas as Explanations in Practice

  • paper_url: http://arxiv.org/abs/2307.06971
  • repo_url: None
  • paper_authors: Reijo Jaakkola, Tomi Janhunen, Antti Kuusisto, Masood Feyzbakhsh Rankooh, Miikka Vilander
  • for: This paper explains attributes in a data model based on unary relations.
  • methods: The paper uses short Boolean formulas as explanations: an explanation of length k is a Boolean formula of length k minimizing the error with respect to the target attribute, computed in practice via an encoding in Answer Set Programming (a toy version of the selection criterion follows the abstract).
  • results: The paper gives novel quantitative bounds for the expected error and studies three concrete datasets. The most accurate formulas reach errors similar to other methods, but due to overfitting they are not necessarily ideal explanations, so cross validation is used to pick a suitable explanation length; the resulting shorter formulas avoid overfitting while remaining reasonably accurate and, importantly, human-interpretable.
    Abstract We investigate explainability via short Boolean formulas in the data model based on unary relations. As an explanation of length k, we take a Boolean formula of length k that minimizes the error with respect to the target attribute to be explained. We first provide novel quantitative bounds for the expected error in this scenario. We then also demonstrate how the setting works in practice by studying three concrete data sets. In each case, we calculate explanation formulas of different lengths using an encoding in Answer Set Programming. The most accurate formulas we obtain achieve errors similar to other methods on the same data sets. However, due to overfitting, these formulas are not necessarily ideal explanations, so we use cross validation to identify a suitable length for explanations. By limiting to shorter formulas, we obtain explanations that avoid overfitting but are still reasonably accurate and also, importantly, human interpretable.
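A toy version of the selection criterion: among candidate Boolean formulas of bounded length, pick the one with minimum empirical error against the target attribute. The paper encodes this search in Answer Set Programming; here we simply enumerate conjunctions of two literals over random binary data.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 5)).astype(bool)   # 5 unary attributes
target = X[:, 0] & ~X[:, 3]                          # hidden ground-truth formula

def literals(X):
    for j in range(X.shape[1]):
        yield (f"x{j}", X[:, j])
        yield (f"!x{j}", ~X[:, j])

# Enumerate all length-2 conjunctions and keep the one with minimum error.
best = min(
    (np.mean((a & b) != target), f"{na} & {nb}")
    for (na, a), (nb, b) in itertools.combinations(list(literals(X)), 2)
)
print(f"best length-2 formula: {best[1]}  error={best[0]:.3f}")
```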

IntelliGraphs: Datasets for Benchmarking Knowledge Graph Generation

  • paper_url: http://arxiv.org/abs/2307.06698
  • repo_url: https://github.com/thiviyant/intelligraphs
  • paper_authors: Thiviyan Thanapalasingam, Emile van Krieken, Peter Bloem, Paul Groth
  • for: This work introduces a new knowledge graph reasoning task, subgraph inference, in which a model must generate likely and semantically valid subgraphs.
  • methods: The work releases IntelliGraphs, a set of five new knowledge graph datasets with semantics expressed as logical rules, along with the dataset generator, and proposes four baseline models, three of which build on traditional KGE models.
  • results: The study finds that traditional KGE models cannot capture the semantics underlying knowledge graphs, motivating the new task as a benchmark to encourage machine learning models that emphasize semantic understanding.
    Abstract Knowledge Graph Embedding (KGE) models are used to learn continuous representations of entities and relations. A key task in the literature is predicting missing links between entities. However, Knowledge Graphs are not just sets of links but also have semantics underlying their structure. Semantics is crucial in several downstream tasks, such as query answering or reasoning. We introduce the subgraph inference task, where a model has to generate likely and semantically valid subgraphs. We propose IntelliGraphs, a set of five new Knowledge Graph datasets. The IntelliGraphs datasets contain subgraphs with semantics expressed in logical rules for evaluating subgraph inference. We also present the dataset generator that produced the synthetic datasets. We designed four novel baseline models, which include three models based on traditional KGEs. We evaluate their expressiveness and show that these models cannot capture the semantics. We believe this benchmark will encourage the development of machine learning models that emphasize semantic understanding.

Reinforcement Learning for Syntax-Guided Synthesis

  • paper_url: http://arxiv.org/abs/2307.09564
  • repo_url: None
  • paper_authors: Julian Parsert, Elizabeth Polgreen
  • for: This paper proposes a reinforcement-learning-guided synthesis algorithm based on Monte-Carlo Tree Search (MCTS) for Syntax-Guided Synthesis (SyGuS) problems.
  • methods: The algorithm combines learned policy and value functions with the upper confidence bound for trees to balance exploration and exploitation (a selection-rule sketch follows the abstract), and embeds the search in a reinforcement learning loop that iteratively improves boosted-tree policy and value estimators. Training data is generated automatically via anti-unification of existing first-order satisfiability problems.
  • results: The learned policy and value functions improve synthesis performance over a baseline enumerator by more than 26 percentage points on the training and testing sets; the resulting tool outperforms state-of-the-art tools such as CVC5 on the training set and performs comparably on the testing set.
    Abstract Program synthesis is the task of automatically generating code based on a specification. In Syntax-Guided Synthesis(SyGuS) this specification is a combination of a syntactic template and a logical formula, and any generated code is proven to satisfy both. Techniques like SyGuS are critical to guaranteeing correct synthesis results. Despite the proliferation of machine learning in other types of program synthesis, state-of-the-art techniques in SyGuS are still driven by automated reasoning tools and simple enumeration. We hypothesize this is for two reasons: first the complexity of the search problem, and second the relatively small data sets available. In this work, we tackle these challenges by framing general SyGuS problems as a tree-search, and present a reinforcement learning guided synthesis algorithm for SyGuS based on Monte-Carlo Tree Search (MCTS). Our algorithm incorporates learned policy and value functions combined with the upper confidence bound for trees to balance exploration and exploitation. We incorporate this search procedure in a reinforcement learning setup in order to iteratively improve our policy and value estimators which are based on boosted tree models. To address the scarcity of training data, we present a method for automatically generating training data for SyGuS based on \emph{anti-unification} of existing first-order satisfiability problems, which we use to train our MCTS policy. We implement and evaluate this setup and demonstrate that learned policy and value improve the synthesis performance over a baseline enumerator by over 26 percentage points in the training and testing sets. With these results our tool outperforms state-of-the-art tools such as CVC5 on the training set and performs comparably on the testing set. We make our data set publicly available, enabling further application of machine learning methods to the SyGuS problem.
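One selection rule consistent with "learned policy and value functions combined with the upper confidence bound for trees" is a PUCT-style score, sketched below. The class and field names are ours, and the paper's implementation details may differ.

```python
import math

class Node:
    def __init__(self, prior):
        self.prior = prior        # policy-network probability for this action
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}        # action -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    total = sum(ch.visits for ch in node.children.values())
    def score(ch):
        # Exploitation (value estimate) + policy-guided exploration bonus.
        u = c_puct * ch.prior * math.sqrt(total + 1) / (1 + ch.visits)
        return ch.q() + u
    return max(node.children.items(), key=lambda kv: score(kv[1]))

root = Node(prior=1.0)
root.children = {"expand-rule-A": Node(0.7), "expand-rule-B": Node(0.3)}
action, child = select_child(root)
print("selected grammar expansion:", action)
```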

Towards Ubiquitous Semantic Metaverse: Challenges, Approaches, and Opportunities

  • paper_url: http://arxiv.org/abs/2307.06687
  • repo_url: None
  • paper_authors: Kai Li, Billy Pik Lik Lau, Xin Yuan, Wei Ni, Mohsen Guizani, Chau Yuen
  • for: This survey examines how a ubiquitous semantic Metaverse can deliver more intelligent, personalized, and context-aware interactive experiences for augmented reality (AR) and virtual reality (VR) users.
  • methods: The survey reviews the techniques behind four fundamental system components, namely artificial intelligence (AI), spatio-temporal data representation (STDR), semantic Internet of Things (SIoT), and semantic-enhanced digital twin (SDT), and their applications in the ubiquitous semantic Metaverse.
  • results: The survey outlines challenges for building the future ubiquitous semantic Metaverse, including scalability and interoperability, privacy and security, performance measurement and standardization, and ethical considerations and responsible AI.
    Abstract In recent years, ubiquitous semantic Metaverse has been studied to revolutionize immersive cyber-virtual experiences for augmented reality (AR) and virtual reality (VR) users, which leverages advanced semantic understanding and representation to enable seamless, context-aware interactions within mixed-reality environments. This survey focuses on the intelligence and spatio-temporal characteristics of four fundamental system components in ubiquitous semantic Metaverse, i.e., artificial intelligence (AI), spatio-temporal data representation (STDR), semantic Internet of Things (SIoT), and semantic-enhanced digital twin (SDT). We thoroughly survey the representative techniques of the four fundamental system components that enable intelligent, personalized, and context-aware interactions with typical use cases of the ubiquitous semantic Metaverse, such as remote education, work and collaboration, entertainment and socialization, healthcare, and e-commerce marketing. Furthermore, we outline the opportunities for constructing the future ubiquitous semantic Metaverse, including scalability and interoperability, privacy and security, performance measurement and standardization, as well as ethical considerations and responsible AI. Addressing those challenges is important for creating a robust, secure, and ethically sound system environment that offers engaging immersive experiences for the users and AR/VR applications.

Machine Learning-Assisted Pattern Recognition Algorithms for Estimating Ultimate Tensile Strength in Fused Deposition Modeled Polylactic Acid Specimens

  • paper_url: http://arxiv.org/abs/2307.06970
  • repo_url: None
  • paper_authors: Akshansh Mishra, Vijaykumar S Jatti
  • for: This study uses supervised machine learning to estimate the Ultimate Tensile Strength (UTS) of polylactic acid (PLA) specimens fabricated with the Fused Deposition Modeling (FDM) process.
  • methods: Four supervised classification algorithms, namely Logistic Classification, Gradient Boosting Classification, Decision Tree, and K-Nearest Neighbor, are assessed for predicting the specimens' UTS (a comparison sketch on synthetic data follows the abstract).
  • results: The Decision Tree and K-Nearest Neighbor algorithms both achieve an F1 score of 0.71, but KNN attains a higher AUC score of 0.79, showing better separation of the two UTS classes and making it the preferred choice in this study.
    Abstract In this study, we investigate the application of supervised machine learning algorithms for estimating the Ultimate Tensile Strength (UTS) of Polylactic Acid (PLA) specimens fabricated using the Fused Deposition Modeling (FDM) process. A total of 31 PLA specimens were prepared, with Infill Percentage, Layer Height, Print Speed, and Extrusion Temperature serving as input parameters. The primary objective was to assess the accuracy and effectiveness of four distinct supervised classification algorithms, namely Logistic Classification, Gradient Boosting Classification, Decision Tree, and K-Nearest Neighbor, in predicting the UTS of the specimens. The results revealed that while the Decision Tree and K-Nearest Neighbor algorithms both achieved an F1 score of 0.71, the KNN algorithm exhibited a higher Area Under the Curve (AUC) score of 0.79, outperforming the other algorithms. This demonstrates the superior ability of the KNN algorithm in differentiating between the two classes of ultimate tensile strength within the dataset, rendering it the most favorable choice for classification in the context of this research. This study represents the first attempt to estimate the UTS of PLA specimens using machine learning-based classification algorithms, and the findings offer valuable insights into the potential of these techniques in improving the performance and accuracy of predictive models in the domain of additive manufacturing.
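Since the 31-specimen dataset is not included in this listing, here is a hedged re-creation of the comparison on synthetic data with four input features and a binary target, reporting F1 and AUC for KNN and a decision tree. Requires scikit-learn; all hyperparameters are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score, roc_auc_score

# Stand-in for (Infill Percentage, Layer Height, Print Speed, Extrusion Temperature).
X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("KNN", KNeighborsClassifier(n_neighbors=5)),
                  ("DecisionTree", DecisionTreeClassifier(random_state=0))]:
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    proba = clf.predict_proba(X_te)[:, 1]
    print(f"{name}: F1={f1_score(y_te, pred):.2f} "
          f"AUC={roc_auc_score(y_te, proba):.2f}")
```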

Explainable Artificial Intelligence driven mask design for self-supervised seismic denoising

  • paper_url: http://arxiv.org/abs/2307.06682
  • repo_url: None
  • paper_authors: Claire Birnie, Matteo Ravasi
  • for: To improve the fidelity and reliability of seismic data by suppressing coherent noise as early and efficiently as possible.
  • methods: A self-supervised denoising network is combined with explainable-AI analysis of its Jacobian matrix: averaging the Jacobian contributions over a number of randomly selected input pixels indicates the most effective blind mask for suppressing the noise present in the data, removing the need for prior knowledge of the noise statistics (a minimal sketch follows the abstract).
  • results: On realistic synthetic examples and two field datasets, the fully automated procedure suppresses both trace-wise noise and colored noise effectively, without clean training labels or prior knowledge.
    Abstract The presence of coherent noise in seismic data leads to errors and uncertainties, and as such it is paramount to suppress noise as early and efficiently as possible. Self-supervised denoising circumvents the common requirement of deep learning procedures of having noisy-clean training pairs. However, self-supervised coherent noise suppression methods require extensive knowledge of the noise statistics. We propose the use of explainable artificial intelligence approaches to see inside the black box that is the denoising network and use the gained knowledge to replace the need for any prior knowledge of the noise itself. This is achieved in practice by leveraging bias-free networks and the direct linear link between input and output provided by the associated Jacobian matrix; we show that a simple averaging of the Jacobian contributions over a number of randomly selected input pixels, provides an indication of the most effective mask to suppress noise present in the data. The proposed method therefore becomes a fully automated denoising procedure requiring no clean training labels or prior knowledge. Realistic synthetic examples with noise signals of varying complexities, ranging from simple time-correlated noise to complex pseudo rig noise propagating at the velocity of the ocean, are used to validate the proposed approach. Its automated nature is highlighted further by an application to two field datasets. Without any substantial pre-processing or any knowledge of the acquisition environment, the automatically identified blind-masks are shown to perform well in suppressing both trace-wise noise in common shot gathers from the Volve marine dataset and colored noise in post stack seismic images from a land seismic survey.
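A compact PyTorch sketch of the core idea: average the absolute Jacobian rows of a bias-free denoiser at randomly chosen output pixels to expose the input neighbourhood that drives a prediction, i.e. a candidate blind mask. The network below is a stand-in, not the paper's model, and the re-centering step is our assumption about how rows are aligned before averaging.

```python
import torch
import torch.nn as nn

net = nn.Sequential(                          # bias-free toy denoiser
    nn.Conv2d(1, 8, 3, padding=1, bias=False), nn.ReLU(),
    nn.Conv2d(8, 1, 3, padding=1, bias=False))

x = torch.randn(1, 1, 32, 32, requires_grad=True)
jac_sum = torch.zeros(32, 32)
pixels = torch.randint(4, 28, (16, 2)).tolist()   # random interior output pixels

for i, j in pixels:
    if x.grad is not None:
        x.grad.zero_()
    out = net(x)
    out[0, 0, i, j].backward()                    # one Jacobian row of the denoiser
    # Re-centre the row on the queried pixel before averaging.
    jac_sum += torch.roll(x.grad[0, 0].abs(), shifts=(16 - i, 16 - j), dims=(0, 1))

mask = (jac_sum / len(pixels)) > 1e-6             # candidate blind-spot mask
print("active mask entries:", int(mask.sum()))
```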

Real-time Percussive Technique Recognition and Embedding Learning for the Acoustic Guitar

  • paper_url: http://arxiv.org/abs/2307.07426
  • repo_url: https://github.com/iamtheband/martelloni_et_al_ismir2023
  • paper_authors: Andrea Martelloni, Andrew P McPherson, Mathieu Barthet
  • for: This work aims to augment percussive fingerstyle acoustic guitar performance through real-time music information retrieval (RT-MIR).
  • methods: The authors develop real-time guitar body percussion recognition and embedding learning techniques based on convolutional neural networks (CNNs) and on CNNs jointly trained with variational autoencoders (VAEs).
  • results: The VAEs yield improved class separation over plain CNNs, and the embedding quality could support control intimacy and rich interaction when the latent space drives an external synthesis engine; generalization to different datasets remains an open design challenge.
    Abstract Real-time music information retrieval (RT-MIR) has much potential to augment the capabilities of traditional acoustic instruments. We develop RT-MIR techniques aimed at augmenting percussive fingerstyle, which blends acoustic guitar playing with guitar body percussion. We formulate several design objectives for RT-MIR systems for augmented instrument performance: (i) causal constraint, (ii) perceptually negligible action-to-sound latency, (iii) control intimacy support, (iv) synthesis control support. We present and evaluate real-time guitar body percussion recognition and embedding learning techniques based on convolutional neural networks (CNNs) and CNNs jointly trained with variational autoencoders (VAEs). We introduce a taxonomy of guitar body percussion based on hand part and location. We follow a cross-dataset evaluation approach by collecting three datasets labelled according to the taxonomy. The embedding quality of the models is assessed using KL-Divergence across distributions corresponding to different taxonomic classes. Results indicate that the networks are strong classifiers especially in a simplified 2-class recognition task, and the VAEs yield improved class separation compared to CNNs as evidenced by increased KL-Divergence across distributions. We argue that the VAE embedding quality could support control intimacy and rich interaction when the latent space's parameters are used to control an external synthesis engine. Further design challenges around generalisation to different datasets have been identified.

PatchSorter: A High Throughput Deep Learning Digital Pathology Tool for Object Labeling

  • paper_url: http://arxiv.org/abs/2307.07528
  • repo_url: None
  • paper_authors: Cedric Walker, Tasneem Talawalla, Robert Toth, Akhil Ambekar, Kien Rea, Oswin Chamian, Fan Fan, Sabina Berezowska, Sven Rottenberg, Anant Madabhushi, Marie Maillard, Laura Barisoni, Hugo Mark Horlings, Andrew Janowczyk
  • for: This paper supports the discovery of patterns associated with diagnosis, prognosis, and therapy response in digital pathology images.
  • methods: The authors release PatchSorter, an open-source labeling tool that integrates deep learning with an intuitive web interface.
  • results: Using >100,000 objects, the paper demonstrates a >7x improvement in labels per second over unaided labeling, with minimal impact on labeling accuracy, enabling high-throughput labeling of large datasets.
    Abstract The discovery of patterns associated with diagnosis, prognosis, and therapy response in digital pathology images often requires intractable labeling of large quantities of histological objects. Here we release an open-source labeling tool, PatchSorter, which integrates deep learning with an intuitive web interface. Using >100,000 objects, we demonstrate a >7x improvement in labels per second over unaided labeling, with minimal impact on labeling accuracy, thus enabling high-throughput labeling of large datasets.

DeepIPCv2: LiDAR-powered Robust Environmental Perception and Navigational Control for Autonomous Vehicle

  • paper_url: http://arxiv.org/abs/2307.06647
  • repo_url: https://github.com/oskarnatan/deepipcv2
  • paper_authors: Oskar Natan, Jun Miura
  • for: DeepIPCv2 is an autonomous driving model that perceives the environment with a LiDAR sensor for more robust drivability, especially under poor illumination where the scene is not clearly visible.
  • methods: DeepIPCv2 takes LiDAR point clouds as its main perception input; because point clouds are unaffected by illumination changes, they provide a clear observation of the surroundings in any condition, giving the controller module stable features for estimating navigational control.
  • results: In tests predicting driving records and in real automated driving under three different conditions, DeepIPCv2 achieves the best drivability in all scenarios; ablation and comparative studies against recent models corroborate its performance.
    Abstract We present DeepIPCv2, an autonomous driving model that perceives the environment using a LiDAR sensor for more robust drivability, especially when driving under poor illumination conditions where everything is not clearly visible. DeepIPCv2 takes a set of LiDAR point clouds as the main perception input. Since point clouds are not affected by illumination changes, they can provide a clear observation of the surroundings no matter what the condition is. This results in a better scene understanding and stable features provided by the perception module to support the controller module in estimating navigational control properly. To evaluate its performance, we conduct several tests by deploying the model to predict a set of driving records and perform real automated driving under three different conditions. We also conduct ablation and comparative studies with some recent models to justify its performance. Based on the experimental results, DeepIPCv2 shows a robust performance by achieving the best drivability in all driving scenarios. Furthermore, we will upload the codes to https://github.com/oskarnatan/DeepIPCv2.

Image Transformation Sequence Retrieval with General Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2307.06630
  • repo_url: None
  • paper_authors: Enrique Mas-Candela, Antonio Ríos-Vila, Jorge Calvo-Zaragoza
  • for: This paper introduces the novel Image Transformation Sequence Retrieval (ITSR) task, in which a model must retrieve the sequence of transformations between a given source image and target image.
  • methods: Given characteristics of the task such as the multiplicity of correct sequences and the correlation between consecutive steps, the authors propose a solution based on general model-based reinforcement learning, combining Monte Carlo Tree Search (MCTS) with a deep neural network.
  • results: Benchmarks on synthetic and real domains show that a model trained with MCTS outperforms its supervised counterpart in both the simplest and the most complex cases.
    Abstract In this work, the novel Image Transformation Sequence Retrieval (ITSR) task is presented, in which a model must retrieve the sequence of transformations between two given images that act as source and target, respectively. Given certain characteristics of the challenge such as the multiplicity of a correct sequence or the correlation between consecutive steps of the process, we propose a solution to ITSR using a general model-based Reinforcement Learning such as Monte Carlo Tree Search (MCTS), which is combined with a deep neural network. Our experiments provide a benchmark in both synthetic and real domains, where the proposed approach is compared with supervised training. The results report that a model trained with MCTS is able to outperform its supervised counterpart in both the simplest and the most complex cases. Our work draws interesting conclusions about the nature of ITSR and its associated challenges.

SecureFalcon: The Next Cyber Reasoning System for Cyber Security

  • paper_url: http://arxiv.org/abs/2307.06616
  • repo_url: None
  • paper_authors: Mohamed Amine Ferrag, Ammar Battah, Norbert Tihanyi, Merouane Debbah, Thierry Lestable, Lucas C. Cordeiro
  • for: This work aims to improve the accuracy and efficiency of software vulnerability detection using large language models (LLMs).
  • methods: The study fine-tunes FalconLLM for cybersecurity, introducing SecureFalcon, a model trained to differentiate vulnerable from non-vulnerable C code samples; a new training dataset, FormAI, is constructed using generative AI and formal verification to evaluate performance.
  • results: SecureFalcon achieves a 94% accuracy rate in detecting software vulnerabilities, indicating significant potential to redefine vulnerability detection methods in cybersecurity.
    Abstract Software vulnerabilities leading to various detriments such as crashes, data loss, and security breaches, significantly hinder the quality, affecting the market adoption of software applications and systems. Although traditional methods such as automated software testing, fault localization, and repair have been intensively studied, static analysis tools are most commonly used and have an inherent false positives rate, posing a solid challenge to developer productivity. Large Language Models (LLMs) offer a promising solution to these persistent issues. Among these, FalconLLM has shown substantial potential in identifying intricate patterns and complex vulnerabilities, hence crucial in software vulnerability detection. In this paper, for the first time, FalconLLM is being fine-tuned for cybersecurity applications, thus introducing SecureFalcon, an innovative model architecture built upon FalconLLM. SecureFalcon is trained to differentiate between vulnerable and non-vulnerable C code samples. We build a new training dataset, FormAI, constructed thanks to Generative Artificial Intelligence (AI) and formal verification to evaluate its performance. SecureFalcon achieved an impressive 94% accuracy rate in detecting software vulnerabilities, emphasizing its significant potential to redefine software vulnerability detection methods in cybersecurity.

Introducing Foundation Models as Surrogate Models: Advancing Towards More Practical Adversarial Attacks

  • paper_url: http://arxiv.org/abs/2307.06608
  • repo_url: None
  • paper_authors: Jiaming Zhang, Jitao Sang, Qi Yi, Changsheng Xu
  • for: This paper studies adversarial attacks in the no-box setting, examining how the surrogate model selection process shapes attack effectiveness.
  • methods: The work recasts the adversarial attack as a downstream task, adopts foundation models as surrogate models, fine-tunes them with a margin-based loss strategy on target images, and mounts attacks with the Fast Gradient Sign Method (FGSM) (an FGSM sketch follows the abstract).
  • results: Despite using only the basic FGSM algorithm, the approach outstrips the performance of other, more convoluted attack algorithms.
    Abstract Recently, the no-box adversarial attack, in which the attacker lacks access to the model's architecture, weights, and training data, has become the most practical and challenging attack setup. However, there is an unawareness of the potential and flexibility inherent in the surrogate model selection process in the no-box setting. Inspired by the burgeoning interest in utilizing foundational models to address downstream tasks, this paper adopts an innovative idea that 1) recasting adversarial attack as a downstream task. Specifically, image noise generation to meet the emerging trend and 2) introducing foundational models as surrogate models. Harnessing the concept of non-robust features, we elaborate on two guiding principles for surrogate model selection to explain why the foundational model is an optimal choice for this role. However, paradoxically, we observe that these foundational models underperform. Analyzing this unexpected behavior within the feature space, we attribute the lackluster performance of foundational models (e.g., CLIP) to their significant representational capacity and, conversely, their lack of discriminative prowess. To mitigate this issue, we propose the use of a margin-based loss strategy for the fine-tuning of foundational models on target images. The experimental results verify that our approach, which employs the basic Fast Gradient Sign Method (FGSM) attack algorithm, outstrips the performance of other, more convoluted algorithms. We conclude by advocating for the research community to consider surrogate models as crucial determinants in the effectiveness of adversarial attacks in no-box settings. The implications of our work bear relevance for improving the efficacy of such adversarial attacks and the overall robustness of AI systems.
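For reference, a standard FGSM sketch (the attack algorithm the abstract builds on) with a placeholder surrogate model; fine-tuning a foundation model with a margin-based loss is the paper's contribution and is not reproduced here.

```python
import torch
import torch.nn as nn

def fgsm(model, x, y, eps=8 / 255, loss_fn=nn.CrossEntropyLoss()):
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), y)
    loss.backward()
    # One signed-gradient step, then clamp back to the valid image range.
    x_adv = (x + eps * x.grad.sign()).clamp(0, 1)
    return x_adv.detach()

surrogate = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in
x = torch.rand(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
x_adv = fgsm(surrogate, x, y)
print("max perturbation:", (x_adv - x).abs().max().item())
```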

Is Task-Agnostic Explainable AI a Myth?

  • paper_url: http://arxiv.org/abs/2307.06963
  • repo_url: None
  • paper_authors: Alicja Chaszczewicz
  • for: To provide a framework unifying the challenges of contemporary explainable AI (XAI), showing that while XAI methods produce supplementary and potentially useful output, their conceptual and technical limitations frequently turn the methods themselves into black boxes.
  • methods: The study examines three XAI research avenues spanning image, textual, and graph data, covering saliency, attention, and graph-type explainers. Despite the varying contexts and timeframes of these cases, the same persistent roadblocks emerge, pointing to the need for a conceptual breakthrough in the field.
  • results: The study finds that XAI methods' outputs can be incompatible with application tasks, and that their conceptual and technical limitations often render the methods themselves opaque.
    Abstract Our work serves as a framework for unifying the challenges of contemporary explainable AI (XAI). We demonstrate that while XAI methods provide supplementary and potentially useful output for machine learning models, researchers and decision-makers should be mindful of their conceptual and technical limitations, which frequently result in these methods themselves becoming black boxes. We examine three XAI research avenues spanning image, textual, and graph data, covering saliency, attention, and graph-type explainers. Despite the varying contexts and timeframes of the mentioned cases, the same persistent roadblocks emerge, highlighting the need for a conceptual breakthrough in the field to address the challenge of compatibility between XAI methods and application tasks.

EFL Students’ Attitudes and Contradictions in a Machine-in-the-loop Activity System

  • paper_url: http://arxiv.org/abs/2307.13699
  • repo_url: None
  • paper_authors: David James Woo, Hengky Susanto, Kai Guo
  • for: This study investigates the attitudes and contradictions of English as a foreign language (EFL) students towards machine-in-the-loop writing, in which artificial intelligence (AI) suggests ideas during composition.
  • methods: Applying Activity Theory, the study analyzes the responses of 67 students from four Hong Kong secondary schools to an open-ended question about their feelings on writing with AI.
  • results: Most students expressed positive attitudes, with some negative or mixed feelings. A thematic analysis traced contradictions between students and AI to AI inadequacies, students' balancing of enthusiasm with preference, and their striving for language autonomy.
    Abstract This study applies Activity Theory and investigates the attitudes and contradictions of 67 English as a foreign language (EFL) students from four Hong Kong secondary schools towards machine-in-the-loop writing, where artificial intelligence (AI) suggests ideas during composition. Students answered an open-ended question about their feelings on writing with AI. Results revealed mostly positive attitudes, with some negative or mixed feelings. From a thematic analysis, contradictions or points of tension between students and AI stemmed from AI inadequacies, students' balancing enthusiasm with preference, and their striving for language autonomy. The research highlights the benefits and challenges of implementing machine-in-the-loop writing in EFL classrooms, suggesting educators align activity goals with students' values, language abilities, and AI capabilities to enhance students' activity systems.

RVD: A Handheld Device-Based Fundus Video Dataset for Retinal Vessel Segmentation

  • paper_url: http://arxiv.org/abs/2307.06577
  • repo_url: None
  • paper_authors: MD Wahiduzzaman Khan, Hongwei Sheng, Hu Zhang, Heming Du, Sen Wang, Minas Theodore Coroneo, Farshid Hajati, Sahar Shariflou, Michael Kalloniatis, Jack Phu, Ashish Agar, Zi Huang, Mojtaba Golzan, Xin Yu
  • for: This work aims to advance retinal vessel segmentation, in particular by using fundus videos captured with handheld devices to enable richer segmentation models.
  • methods: The dataset provides three levels of spatial annotation (binary vessel masks, general vein-artery masks, and fine-grained vein-artery masks) plus temporal annotations capturing vessel pulsation characteristics, supporting segmentation as well as detection of ocular diseases that require fine-grained recognition of hemodynamic fluctuation.
  • results: The work releases the first video-based retinal dataset: 635 smartphone-based fundus videos from 415 patients aged 50 to 75 across four clinics, together with evaluation metrics and benchmark results that reflect both the dataset's richness and the challenges it poses for vessel segmentation.
    Abstract Retinal vessel segmentation is generally grounded in image-based datasets collected with bench-top devices. The static images naturally lose the dynamic characteristics of retina fluctuation, resulting in diminished dataset richness, and the usage of bench-top devices further restricts dataset scalability due to its limited accessibility. Considering these limitations, we introduce the first video-based retinal dataset by employing handheld devices for data acquisition. The dataset comprises 635 smartphone-based fundus videos collected from four different clinics, involving 415 patients from 50 to 75 years old. It delivers comprehensive and precise annotations of retinal structures in both spatial and temporal dimensions, aiming to advance the landscape of vasculature segmentation. Specifically, the dataset provides three levels of spatial annotations: binary vessel masks for overall retinal structure delineation, general vein-artery masks for distinguishing the vein and artery, and fine-grained vein-artery masks for further characterizing the granularities of each artery and vein. In addition, the dataset offers temporal annotations that capture the vessel pulsation characteristics, assisting in detecting ocular diseases that require fine-grained recognition of hemodynamic fluctuation. In application, our dataset exhibits a significant domain shift with respect to data captured by bench-top devices, thus posing great challenges to existing methods. In the experiments, we provide evaluation metrics and benchmark results on our dataset, reflecting both the potential and challenges it offers for vessel segmentation tasks. We hope this challenging dataset would significantly contribute to the development of eye disease diagnosis and early prevention.
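The binary vessel masks in the dataset pair naturally with standard overlap metrics. As a minimal sketch of how a segmentation prediction might be scored against an annotation (illustrative only, not the dataset's official evaluation code; the random masks are stand-ins), Dice and IoU can be computed as:

```python
import numpy as np

def dice_and_iou(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8):
    """Compute Dice and IoU between two binary vessel masks of equal shape."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dice = (2 * inter + eps) / (pred.sum() + gt.sum() + eps)
    iou = (inter + eps) / (np.logical_or(pred, gt).sum() + eps)
    return float(dice), float(iou)

# Random masks standing in for a model prediction and a ground-truth annotation.
rng = np.random.default_rng(0)
pred = rng.random((512, 512)) > 0.5
gt = rng.random((512, 512)) > 0.5
print(dice_and_iou(pred, gt))
```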

Regression-Oriented Knowledge Distillation for Lightweight Ship Orientation Angle Prediction with Optical Remote Sensing Images

  • paper_url: http://arxiv.org/abs/2307.06566
  • repo_url: https://github.com/ubcdingxin/soap-kd
  • paper_authors: Zhan Shi, Xin Ding, Peng Ding, Chun Yang, Ru Huang, Xiaoxuan Song
  • for: Ship orientation angle prediction (SOAP) from optical remote sensing images is an important image processing task that typically relies on deep convolutional neural networks (CNNs) for accurate predictions. This paper proposes a new framework that reduces the size and computational cost of SOAP models without sacrificing prediction accuracy.
  • methods: The paper designs a new SOAP model, Mobile-SOAP, based on MobileNetV2, which matches state-of-the-art prediction accuracy. It further derives four tiny SOAP models by replacing the convolutional blocks in Mobile-SOAP with four small-scale networks, and transfers knowledge from Mobile-SOAP to them via a knowledge distillation framework, SOAP-KD, consisting of a feature-based guidance loss and an optimized synthetic samples-based transfer mechanism.
  • results: Extensive experiments on the FGSC-23 dataset confirm the superiority of Mobile-SOAP over existing models and show that SOAP-KD improves the prediction performance of the four specially designed tiny models. Notably, with SOAP-KD, the test mean absolute error of the ShuffleNetV2x1.0-based model is only 8% higher than Mobile-SOAP's, while its parameter count and multiply-accumulate operations (MACs) are 61.6% and 60.8% lower, respectively.
    Abstract Ship orientation angle prediction (SOAP) with optical remote sensing images is an important image processing task, which often relies on deep convolutional neural networks (CNNs) to make accurate predictions. This paper proposes a novel framework to reduce the model sizes and computational costs of SOAP models without harming prediction accuracy. First, a new SOAP model called Mobile-SOAP is designed based on MobileNetV2, achieving state-of-the-art prediction accuracy. Four tiny SOAP models are also created by replacing the convolutional blocks in Mobile-SOAP with four small-scale networks, respectively. Then, to transfer knowledge from Mobile-SOAP to four lightweight models, we propose a novel knowledge distillation (KD) framework termed SOAP-KD consisting of a novel feature-based guidance loss and an optimized synthetic samples-based knowledge transfer mechanism. Lastly, extensive experiments on the FGSC-23 dataset confirm the superiority of Mobile-SOAP over existing models and also demonstrate the effectiveness of SOAP-KD in improving the prediction performance of four specially designed tiny models. Notably, by using SOAP-KD, the test mean absolute error of the ShuffleNetV2x1.0-based model is only 8% higher than that of Mobile-SOAP, but its number of parameters and multiply-accumulate operations (MACs) are respectively 61.6% and 60.8% less.
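The feature-based guidance idea behind SOAP-KD can be illustrated with a generic regression-distillation objective. The PyTorch sketch below combines an L1 label loss on angles with an MSE feature-matching term; the toy `TinySOAP` backbone, the projection layer, and the `alpha` weighting are assumptions for illustration, not the paper's actual loss or its synthetic-sample transfer mechanism:

```python
import torch
import torch.nn as nn

class TinySOAP(nn.Module):
    """Toy stand-in for a SOAP backbone: returns (features, predicted angle)."""
    def __init__(self, feat_dim):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, 1)
    def forward(self, x):
        f = self.backbone(x)
        return f, self.head(f).squeeze(-1)

def soap_kd_loss(student, teacher, proj, images, angles, alpha=0.5):
    """Regression distillation step: label loss on angles plus an assumed
    feature-matching term pulling student features toward the teacher's."""
    with torch.no_grad():
        t_feat, _ = teacher(images)
    s_feat, s_pred = student(images)
    label_loss = nn.functional.l1_loss(s_pred, angles)
    feat_loss = nn.functional.mse_loss(proj(s_feat), t_feat)  # guidance term
    return label_loss + alpha * feat_loss

teacher, student = TinySOAP(64), TinySOAP(16)
proj = nn.Linear(16, 64)  # project student features to the teacher's dimension
images, angles = torch.randn(8, 3, 32, 32), torch.rand(8) * 360
loss = soap_kd_loss(student, teacher, proj, images, angles)
loss.backward()
print(float(loss))
```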

Prescriptive Process Monitoring Under Resource Constraints: A Reinforcement Learning Approach

  • paper_url: http://arxiv.org/abs/2307.06564
  • repo_url: https://github.com/mshoush/rl-prescriptive-monitoring
  • paper_authors: Mahmoud Shoush, Marlon Dumas
  • for: This paper aims to improve business process performance by triggering interventions at runtime, thereby increasing the probability of positive case outcomes.
  • methods: The paper uses reinforcement learning to learn intervention policies through trial and error.
  • results: The paper shows that under resource constraints, prediction uncertainty and resource utilization cannot be ignored without risking suboptimal intervention decisions. It proposes using conformal prediction to account for the uncertainty of the predictions, so that the reinforcement learning agent can learn better intervention policies. An evaluation on real-life datasets shows that explicitly modeling uncertainty helps the agent converge to policies with higher net intervention gain.
    Abstract Prescriptive process monitoring methods seek to optimize the performance of business processes by triggering interventions at runtime, thereby increasing the probability of positive case outcomes. These interventions are triggered according to an intervention policy. Reinforcement learning has been put forward as an approach to learning intervention policies through trial and error. Existing approaches in this space assume that the number of resources available to perform interventions in a process is unlimited, an unrealistic assumption in practice. This paper argues that, in the presence of resource constraints, a key dilemma in the field of prescriptive process monitoring is to trigger interventions based not only on predictions of their necessity, timeliness, or effect but also on the uncertainty of these predictions and the level of resource utilization. Indeed, committing scarce resources to an intervention when the necessity or effects of this intervention are highly uncertain may intuitively lead to suboptimal intervention effects. Accordingly, the paper proposes a reinforcement learning approach for prescriptive process monitoring that leverages conformal prediction techniques to consider the uncertainty of the predictions upon which an intervention decision is based. An evaluation using real-life datasets demonstrates that explicitly modeling uncertainty using conformal predictions helps reinforcement learning agents converge towards policies with higher net intervention gain
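The interplay between conformal prediction and intervention decisions can be sketched as follows: split-conformal calibration turns the classifier's probabilities into prediction sets, and a scarce resource is committed only when the set is the confident singleton "intervention needed". The gating rule and the resource check are illustrative assumptions, not the paper's learned policy:

```python
import numpy as np

def conformal_qhat(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal calibration: quantile of nonconformity scores 1 - p(true y)."""
    scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    n = len(scores)
    k = min(int(np.ceil((n + 1) * (1 - alpha))) - 1, n - 1)
    return np.sort(scores)[k]

def should_intervene(case_probs, qhat, free_resources):
    """Commit a scarce resource only when the conformal prediction set is the
    singleton {intervention needed} (class 1) and a resource is free."""
    pred_set = [c for c, p in enumerate(case_probs) if 1.0 - p <= qhat]
    return free_resources > 0 and pred_set == [1]

# Toy calibration data: predicted class probabilities and true labels.
rng = np.random.default_rng(0)
cal_probs = rng.dirichlet([2, 2], size=200)
cal_labels = rng.integers(0, 2, size=200)
qhat = conformal_qhat(cal_probs, cal_labels)
print(should_intervene(np.array([0.05, 0.95]), qhat, free_resources=3))
```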

Copy Is All You Need

  • paper_url: http://arxiv.org/abs/2307.06962
  • repo_url: https://github.com/gmftbygmftby/copyisallyouneed
  • paper_authors: Tian Lan, Deng Cai, Yan Wang, Heyan Huang, Xian-Ling Mao
  • for: This paper proposes a copy-based approach to text generation that improves both generation quality and efficiency.
  • methods: The method computes contextualized representations of meaningful text segments in a collection, indexes them with efficient vector search toolkits, and decomposes text generation into a series of copy-and-paste operations: at each step, a suitable text span is retrieved from the collection rather than selected from a standalone vocabulary.
  • results: Experiments on the standard language modeling benchmark (WikiText-103) show better generation quality under both automatic and human evaluation, with inference efficiency comparable to token-level autoregressive models. The approach also enables domain adaptation by simply switching to a domain-specific text collection, and further gains come from scaling up the text collection, both without additional training.
    Abstract The dominant text generation models compose the output by sequentially selecting words from a fixed vocabulary. In this paper, we formulate text generation as progressively copying text segments (e.g., words or phrases) from an existing text collection. We compute the contextualized representations of meaningful text segments and index them using efficient vector search toolkits. The task of text generation is then decomposed into a series of copy-and-paste operations: at each time step, we seek suitable text spans from the text collection rather than selecting from a standalone vocabulary. Experiments on the standard language modeling benchmark (WikiText-103) show that our approach achieves better generation quality according to both automatic and human evaluations. Besides, its inference efficiency is comparable to token-level autoregressive models thanks to the reduction of decoding steps. We also show that our approach allows for effective domain adaptation by simply switching to domain-specific text collection without extra training. Finally, we observe that our approach attains additional performance gains by simply scaling up to larger text collections, again without further training. (Our source codes are publicly available at https://github.com/gmftbyGMFTBY/Copyisallyouneed.)
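The copy-and-paste decomposition can be mimicked with a toy phrase table and brute-force cosine search; the paper instead uses trained contextualized segment representations and an efficient vector-search index, so the random encoder below is purely a stand-in:

```python
import numpy as np

# Toy phrase table: (phrase, embedding) pairs standing in for the contextualized
# segment representations the paper indexes with a vector-search toolkit.
phrases = ["the cat", "sat on", "the mat", "ran away"]
rng = np.random.default_rng(0)
phrase_emb = rng.normal(size=(len(phrases), 8))
phrase_emb /= np.linalg.norm(phrase_emb, axis=1, keepdims=True)

def encode_prefix(prefix: str) -> np.ndarray:
    """Stand-in prefix encoder; a real system would use a trained model."""
    vec = rng.normal(size=8)
    return vec / np.linalg.norm(vec)

def generate(prefix: str, steps: int = 3) -> str:
    out = prefix
    for _ in range(steps):
        query = encode_prefix(out)
        scores = phrase_emb @ query    # cosine similarity (unit vectors)
        best = int(np.argmax(scores))  # copy the best-matching segment
        out = f"{out} {phrases[best]}"
    return out

print(generate("once upon a time"))
```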

On the Effective Horizon of Inverse Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2307.06541
  • repo_url: None
  • paper_authors: Yiqing Xu, Finale Doshi-Velez, David Hsu
  • for: This paper studies inverse reinforcement learning (IRL) algorithms and, in particular, how the planning horizon used to compute an approximately optimal policy affects matching expert demonstrations.
  • methods: The paper analyzes IRL algorithms that use forward reinforcement learning or planning over a given time horizon to compute an approximately optimal policy for a hypothesized reward function, which is then matched against expert demonstrations.
  • results: The analysis shows that an effective time horizon shorter than the ground-truth value often produces better results faster. The phenomenon is explained by the horizon controlling the complexity of the induced policy class and mitigating overfitting with limited data, which also motivates learning the reward and the effective horizon jointly.
    Abstract Inverse reinforcement learning (IRL) algorithms often rely on (forward) reinforcement learning or planning over a given time horizon to compute an approximately optimal policy for a hypothesized reward function and then match this policy with expert demonstrations. The time horizon plays a critical role in determining both the accuracy of reward estimate and the computational efficiency of IRL algorithms. Interestingly, an effective time horizon shorter than the ground-truth value often produces better results faster. This work formally analyzes this phenomenon and provides an explanation: the time horizon controls the complexity of an induced policy class and mitigates overfitting with limited data. This analysis leads to a principled choice of the effective horizon for IRL. It also prompts us to reexamine the classic IRL formulation: it is more natural to learn jointly the reward and the effective horizon together rather than the reward alone with a given horizon. Our experimental results confirm the theoretical analysis.
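The role of the effective horizon can be made concrete with finite-horizon value iteration, the kind of inner planning step an IRL loop might use: a shorter horizon H restricts the policy class the planner can induce. A tabular sketch on a toy MDP (the dynamics and rewards are random stand-ins):

```python
import numpy as np

def finite_horizon_policy(P, R, H):
    """Finite-horizon value iteration (H >= 1).
    P: (A, S, S) transition tensor, R: (S, A) reward, H: effective horizon.
    Returns the greedy first-step policy induced by planning H steps ahead."""
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(H):
        Q = R + np.einsum("ast,t->sa", P, V)  # Q[s,a] = R[s,a] + sum_t P[a,s,t] V[t]
        V = Q.max(axis=1)
    return Q.argmax(axis=1)

# Toy 3-state, 2-action MDP: shorter H induces a simpler policy class.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(3), size=(2, 3))  # P[a, s, :] is a next-state distribution
R = rng.random((3, 2))
for H in (1, 3, 10):
    print(H, finite_horizon_policy(P, R, H))
```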

Artificial Intelligence for Drug Discovery: Are We There Yet?

  • paper_url: http://arxiv.org/abs/2307.06521
  • repo_url: None
  • paper_authors: Catrin Hasselgren, Tudor I. Oprea
  • for: The paper is written to discuss the use of artificial intelligence (AI) in the three pillars of drug discovery, including diseases, targets, and therapeutic modalities, with a focus on small molecule drugs.
  • methods: The paper uses a variety of AI technologies, such as generative chemistry, machine learning, and multi-property optimization, to accelerate drug discovery and reduce costs.
  • results: The paper highlights several compounds that have entered clinical trials using AI-driven drug discovery methods, and emphasizes the need for careful vetting of known information to address the reproducibility crisis in the field.
    Abstract Drug discovery is adapting to novel technologies such as data science, informatics, and artificial intelligence (AI) to accelerate effective treatment development while reducing costs and animal experiments. AI is transforming drug discovery, as indicated by increasing interest from investors, industrial and academic scientists, and legislators. Successful drug discovery requires optimizing properties related to pharmacodynamics, pharmacokinetics, and clinical outcomes. This review discusses the use of AI in the three pillars of drug discovery: diseases, targets, and therapeutic modalities, with a focus on small molecule drugs. AI technologies, such as generative chemistry, machine learning, and multi-property optimization, have enabled several compounds to enter clinical trials. The scientific community must carefully vet known information to address the reproducibility crisis. The full potential of AI in drug discovery can only be realized with sufficient ground truth and appropriate human intervention at later pipeline stages.

Leveraging Contextual Counterfactuals Toward Belief Calibration

  • paper_url: http://arxiv.org/abs/2307.06513
  • repo_url: None
  • paper_authors: Qiuyi, Zhang, Michael S. Lee, Sherol Chen
  • for: This work aims to better calibrate the human beliefs and values that are incorporated into AI systems during alignment.
  • methods: The work segments belief diversity into two categories: subjectivity (variation across individuals within a population) and epistemic uncertainty (variation within an individual across contexts). It then introduces a "belief calibration cycle" framework that calibrates this diversity of beliefs with context-driven counterfactual reasoning via multi-objective optimization.
  • results: On a toy dataset for credit decisions, the framework finds a Pareto frontier of clustered optimal belief strengths that generalize across different contexts.
    Abstract Beliefs and values are increasingly being incorporated into our AI systems through alignment processes, such as carefully curating data collection principles or regularizing the loss function used for training. However, the meta-alignment problem is that these human beliefs are diverse and not aligned across populations; furthermore, the implicit strength of each belief may not be well calibrated even among humans, especially when trying to generalize across contexts. Specifically, in high regret situations, we observe that contextual counterfactuals and recourse costs are particularly important in updating a decision maker's beliefs and the strengths to which such beliefs are held. Therefore, we argue that including counterfactuals is key to an accurate calibration of beliefs during alignment. To do this, we first segment belief diversity into two categories: subjectivity (across individuals within a population) and epistemic uncertainty (within an individual across different contexts). By leveraging our notion of epistemic uncertainty, we introduce `the belief calibration cycle' framework to more holistically calibrate this diversity of beliefs with context-driven counterfactual reasoning by using a multi-objective optimization. We empirically apply our framework for finding a Pareto frontier of clustered optimal belief strengths that generalize across different contexts, demonstrating its efficacy on a toy dataset for credit decisions.
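The multi-objective step can be illustrated with a plain Pareto filter over candidate belief strengths scored on several contexts; the scores below are random stand-ins, and the paper's actual optimization is richer than this filter:

```python
import numpy as np

def pareto_front(scores: np.ndarray) -> np.ndarray:
    """Return indices of non-dominated rows (higher is better on every objective)."""
    n = len(scores)
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        if keep[i]:
            # Drop every candidate strictly dominated by candidate i.
            worse = np.all(scores <= scores[i], axis=1) & np.any(scores < scores[i], axis=1)
            keep[worse] = False
    return np.where(keep)[0]

# Candidate belief strengths evaluated on two contexts (toy random scores).
rng = np.random.default_rng(1)
scores = rng.random((20, 2))
print(pareto_front(scores))
```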

Improving Nonalcoholic Fatty Liver Disease Classification Performance With Latent Diffusion Models

  • paper_url: http://arxiv.org/abs/2307.06507
  • repo_url: None
  • paper_authors: Romain Hardy, Cornelia Ilin, Joe Klepich, Ryan Mitchell, Steve Hall, Jericho Villareal
  • for: This study explores how integrating deep learning with clinical expertise can address healthcare challenges and give clinicians improved diagnostic tools.
  • methods: The study combines synthetic images generated with latent diffusion models with real images to improve nonalcoholic fatty liver disease (NAFLD) classification.
  • results: Diffusion-generated images improve NAFLD classification and are of higher quality than images generated by generative adversarial networks (GANs), with a best Inception Score (IS) of 1.90 (vs. 1.67 for GANs) and a best Fréchet Inception Distance (FID) of 69.45 (vs. 99.53 for GANs). With a partially frozen CNN backbone (EfficientNet v1), the synthetic augmentation method reaches a maximum image-level ROC AUC of 0.904 on the NAFLD prediction task.
    Abstract Integrating deep learning with clinical expertise holds great potential for addressing healthcare challenges and empowering medical professionals with improved diagnostic tools. However, the need for annotated medical images is often an obstacle to leveraging the full power of machine learning models. Our research demonstrates that by combining synthetic images, generated using diffusion models, with real images, we can enhance nonalcoholic fatty liver disease (NAFLD) classification performance. We evaluate the quality of the synthetic images by comparing two metrics: Inception Score (IS) and Fréchet Inception Distance (FID), computed on diffusion-generated images and generative adversarial networks (GANs)-generated images. Our results show superior performance for the diffusion-generated images, with a maximum IS score of $1.90$ compared to $1.67$ for GANs, and a minimum FID score of $69.45$ compared to $99.53$ for GANs. Utilizing a partially frozen CNN backbone (EfficientNet v1), our synthetic augmentation method achieves a maximum image-level ROC AUC of $0.904$ on a NAFLD prediction task.
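Synthetic augmentation of this kind reduces to concatenating real and generated samples before fitting the classification head. A minimal sketch, where random features stand in for embeddings from a partially frozen backbone such as EfficientNet and logistic regression stands in for the classification head:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Stand-ins: backbone features for real, diffusion-generated, and test samples.
rng = np.random.default_rng(0)
X_real, y_real = rng.normal(size=(200, 64)), rng.integers(0, 2, 200)
X_synth, y_synth = rng.normal(size=(200, 64)), rng.integers(0, 2, 200)
X_test, y_test = rng.normal(size=(100, 64)), rng.integers(0, 2, 100)

# Synthetic augmentation: train the head on real + synthetic features together.
X_train = np.concatenate([X_real, X_synth])
y_train = np.concatenate([y_real, y_synth])
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("ROC AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```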

Hybrid Control Policy for Artificial Pancreas via Ensemble Deep Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2307.06501
  • repo_url: None
  • paper_authors: Wenzhou Lv, Tianyu Wu, Luolin Xiong, Liang Wu, Jian Zhou, Yang Tang, Feng Qian
  • for: This study aims to provide closed-loop glucose control for people with type 1 diabetes mellitus (T1DM).
  • methods: It proposes a hybrid control policy, HyCPAP, that combines model predictive control (MPC) with ensemble deep reinforcement learning (DRL), fusing the safety and stability of MPC with the personalization and adaptability of DRL. Meta-learning techniques are further incorporated so the controller can quickly adapt to new patients from limited available data.
  • results: Extensive experiments with the FDA-accepted UVA/Padova T1DM simulator across three scenarios show that the approach achieves the highest percentage of time in the desired euglycemic range and the fewest hypoglycemia events.
    Abstract Objective: The artificial pancreas (AP) has shown promising potential in achieving closed-loop glucose control for individuals with type 1 diabetes mellitus (T1DM). However, designing an effective control policy for the AP remains challenging due to the complex physiological processes, delayed insulin response, and inaccurate glucose measurements. While model predictive control (MPC) offers safety and stability through the dynamic model and safety constraints, it lacks individualization and is adversely affected by unannounced meals. Conversely, deep reinforcement learning (DRL) provides personalized and adaptive strategies but faces challenges with distribution shifts and substantial data requirements. Methods: We propose a hybrid control policy for the artificial pancreas (HyCPAP) to address the above challenges. HyCPAP combines an MPC policy with an ensemble DRL policy, leveraging the strengths of both policies while compensating for their respective limitations. To facilitate faster deployment of AP systems in real-world settings, we further incorporate meta-learning techniques into HyCPAP, leveraging previous experience and patient-shared knowledge to enable fast adaptation to new patients with limited available data. Results: We conduct extensive experiments using the FDA-accepted UVA/Padova T1DM simulator across three scenarios. Our approaches achieve the highest percentage of time spent in the desired euglycemic range and the lowest occurrences of hypoglycemia. Conclusion: The results clearly demonstrate the superiority of our methods for closed-loop glucose management in individuals with T1DM. Significance: The study presents novel control policies for AP systems, affirming the great potential of proposed methods for efficient closed-loop glucose control.
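One plausible reading of the MPC/DRL fusion is an uncertainty-gated switch: trust the DRL ensemble when its members agree, and fall back to the safer MPC dose when they disagree. The gating rule, threshold, and toy policies below are assumptions for illustration, not HyCPAP's actual combination scheme:

```python
import numpy as np

def hybrid_insulin_dose(state, mpc_policy, drl_ensemble, disagreement_threshold=0.2):
    """Fuse MPC and ensemble-DRL actions: use the DRL mean when the ensemble
    agrees, fall back to the MPC dose when it does not (assumed gating rule)."""
    drl_doses = np.array([pi(state) for pi in drl_ensemble])
    if drl_doses.std() > disagreement_threshold:  # high epistemic uncertainty
        return mpc_policy(state)
    return float(drl_doses.mean())

# Toy policies standing in for a tuned MPC controller and a trained DRL ensemble.
mpc = lambda s: 0.5
ensemble = [lambda s, b=b: 0.6 + 0.01 * b for b in range(5)]
print(hybrid_insulin_dose(None, mpc, ensemble))
```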

AutoHint: Automatic Prompt Optimization with Hint Generation

  • paper_url: http://arxiv.org/abs/2307.07415
  • repo_url: None
  • paper_authors: Hong Sun, Xue Li, Yinchuan Xu, Youkow Homma, Qi Cao, Min Wu, Jian Jiao, Denis Charles
  • for: This paper presents AutoHint, a framework for automatically enriching and optimizing prompts for large language models (LLMs).
  • methods: The framework derives enriched instructions, called hints, from input-output demonstrations: starting from an initial prompt, an LLM deduces per-sample hints from incorrect predictions, and these are summarized and added back to the prompt.
  • results: Experiments on the BIG-Bench Instruction Induction dataset show that AutoHint significantly boosts accuracy across multiple tasks for both zero-shot and few-shot prompts.
    Abstract This paper presents AutoHint, a novel framework for automatic prompt engineering and optimization for Large Language Models (LLM). While LLMs have demonstrated remarkable ability in achieving high-quality annotation in various tasks, the key to applying this ability to specific tasks lies in developing high-quality prompts. Thus we propose a framework to inherit the merits of both in-context learning and zero-shot learning by incorporating enriched instructions derived from input-output demonstrations to optimize original prompt. We refer to the enrichment as the hint and propose a framework to automatically generate the hint from labeled data. More concretely, starting from an initial prompt, our method first instructs a LLM to deduce new hints for selected samples from incorrect predictions, and then summarizes from per-sample hints and adds the results back to the initial prompt to form a new, enriched instruction. The proposed method is evaluated on the BIG-Bench Instruction Induction dataset for both zero-shot and few-short prompts, where experiments demonstrate our method is able to significantly boost accuracy for multiple tasks.
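The hint loop can be sketched in a few lines: collect hints from incorrect predictions, summarize them, and fold the summary back into the prompt. The prompt strings and the `llm` completion function are hypothetical stand-ins, not the paper's exact templates:

```python
def autohint_round(llm, prompt, examples):
    """One AutoHint-style round (sketch): derive hints from errors, fold a
    summary back into the prompt. `llm` is a hypothetical completion function."""
    hints = []
    for x, y in examples:
        pred = llm(f"{prompt}\nInput: {x}\nOutput:").strip()
        if pred != y:  # only incorrect predictions yield hints
            hints.append(llm(
                f"The instruction was: {prompt}\nInput: {x}\nExpected: {y}\n"
                f"Got: {pred}\nWrite a short hint that would fix this error."
            ))
    if not hints:
        return prompt
    summary = llm("Summarize these hints into one concise instruction:\n" + "\n".join(hints))
    return f"{prompt}\nHint: {summary}"

# Trivial stub standing in for a real LLM call (e.g. an API client).
stub = lambda text: "answer"
print(autohint_round(stub, "Classify the sentiment.", [("great movie", "positive")]))
```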

Microbial Genetic Algorithm-based Black-box Attack against Interpretable Deep Learning Systems

  • paper_url: http://arxiv.org/abs/2307.06496
  • repo_url: None
  • paper_authors: Eldor Abdukhamidov, Mohammed Abuhamad, Simon S. Woo, Eric Chan-Tin, Tamer Abuhmed
  • for: This paper focuses on the vulnerability of interpretable deep learning systems (IDLSes) to black-box attacks, and proposes a Query-efficient Score-based black-box attack (QuScore) that succeeds despite the limited access the black-box setting affords.
  • methods: QuScore combines transfer-based and score-based methods with an effective microbial genetic algorithm to reduce the number of queries needed for successful attacks.
  • results: The attack achieves a success rate between 95% and 100% and transferability with an average success rate of 69% on the ImageNet and CIFAR datasets. The generated adversarial examples have attribution maps that resemble benign samples, and the attack is resilient to various preprocessing defense techniques.
    Abstract Deep learning models are susceptible to adversarial samples in white and black-box environments. Although previous studies have shown high attack success rates, coupling DNN models with interpretation models could offer a sense of security when a human expert is involved, who can identify whether a given sample is benign or malicious. However, in white-box environments, interpretable deep learning systems (IDLSes) have been shown to be vulnerable to malicious manipulations. In black-box settings, as access to the components of IDLSes is limited, it becomes more challenging for the adversary to fool the system. In this work, we propose a Query-efficient Score-based black-box attack against IDLSes, QuScore, which requires no knowledge of the target model and its coupled interpretation model. QuScore is based on transfer-based and score-based methods by employing an effective microbial genetic algorithm. Our method is designed to reduce the number of queries necessary to carry out successful attacks, resulting in a more efficient process. By continuously refining the adversarial samples created based on feedback scores from the IDLS, our approach effectively navigates the search space to identify perturbations that can fool the system. We evaluate the attack's effectiveness on four CNN models (Inception, ResNet, VGG, DenseNet) and two interpretation models (CAM, Grad), using both ImageNet and CIFAR datasets. Our results show that the proposed approach is query-efficient with a high attack success rate that can reach between 95% and 100% and transferability with an average success rate of 69% in the ImageNet and CIFAR datasets. Our attack method generates adversarial examples with attribution maps that resemble benign samples. We have also demonstrated that our attack is resilient against various preprocessing defense techniques and can easily be transferred to different DNN models.
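The microbial genetic algorithm at the core of the attack is a simple steady-state GA: two random individuals hold a tournament, and the loser copies part of the winner's genes and mutates. A sketch of using it to evolve a bounded perturbation against a black-box score; the score function, bounds, and hyperparameters are illustrative assumptions:

```python
import numpy as np

def microbial_ga_attack(score_fn, dim, pop_size=20, steps=500,
                        rec_rate=0.5, mut_scale=0.05, eps=0.1):
    """Microbial GA sketch: in each tournament the loser copies part of the
    winner's genes and mutates. `score_fn` is the black-box attack score
    (higher = more adversarial); perturbations are clipped to an L-inf ball."""
    rng = np.random.default_rng(0)
    pop = rng.uniform(-eps, eps, size=(pop_size, dim))
    for _ in range(steps):
        i, j = rng.choice(pop_size, size=2, replace=False)
        win, lose = (i, j) if score_fn(pop[i]) >= score_fn(pop[j]) else (j, i)
        copy_mask = rng.random(dim) < rec_rate         # infection: copy winner genes
        pop[lose][copy_mask] = pop[win][copy_mask]
        pop[lose] += mut_scale * rng.normal(size=dim)  # mutation
        pop[lose] = np.clip(pop[lose], -eps, eps)
    best = max(range(pop_size), key=lambda k: score_fn(pop[k]))
    return pop[best]

# Toy score standing in for feedback queried from the target IDLS.
print(microbial_ga_attack(lambda z: -np.abs(z - 0.05).sum(), dim=8, steps=200)[:4])
```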

Misclassification in Automated Content Analysis Causes Bias in Regression. Can We Fix It? Yes We Can!

  • paper_url: http://arxiv.org/abs/2307.06483
  • repo_url: None
  • paper_authors: Nathan TeBlunthuis, Valerie Hase, Chung-Hong Chan
  • for: The paper is written for communication scholars and researchers who use automated classifiers (ACs) in their studies, and it aims to provide a better understanding of the limitations and potential biases of these classifiers.
  • methods: The paper uses a systematic literature review and Monte Carlo simulations to investigate misclassification bias in ACs, and it introduces and tests several methods for correcting this bias.
  • results: The paper finds that existing statistical methods can be used to correct misclassification bias, and it recommends a new error correction method that is versatile and efficient. The results suggest that ACs can be useful for measurement with careful study design and appropriate error correction methods.
    Abstract Automated classifiers (ACs), often built via supervised machine learning (SML), can categorize large, statistically powerful samples of data ranging from text to images and video, and have become widely popular measurement devices in communication science and related fields. Despite this popularity, even highly accurate classifiers make errors that cause misclassification bias and misleading results in downstream analyses-unless such analyses account for these errors. As we show in a systematic literature review of SML applications, communication scholars largely ignore misclassification bias. In principle, existing statistical methods can use "gold standard" validation data, such as that created by human annotators, to correct misclassification bias and produce consistent estimates. We introduce and test such methods, including a new method we design and implement in the R package misclassificationmodels, via Monte Carlo simulations designed to reveal each method's limitations, which we also release. Based on our results, we recommend our new error correction method as it is versatile and efficient. In sum, automated classifiers, even those below common accuracy standards or making systematic misclassifications, can be useful for measurement with careful study design and appropriate error correction methods.
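The principle behind such corrections can be shown with the classic misclassification-corrected proportion estimator: estimate the classifier's sensitivity and specificity on gold-standard validation data, then invert the error process. This illustrates the idea only, not the algorithm in the `misclassificationmodels` R package:

```python
import numpy as np

def corrected_proportion(pred_labels, val_pred, val_gold):
    """Rogan-Gladen-style correction: estimate sensitivity/specificity on
    gold-standard validation data, then de-bias the naive predicted proportion."""
    sens = np.mean(val_pred[val_gold == 1] == 1)
    spec = np.mean(val_pred[val_gold == 0] == 0)
    p_naive = np.mean(pred_labels)
    # Invert E[p_naive] = sens * p + (1 - spec) * (1 - p) for the true prevalence p.
    p_true = (p_naive + spec - 1) / (sens + spec - 1)
    return float(np.clip(p_true, 0, 1))

rng = np.random.default_rng(0)
gold = rng.integers(0, 2, 1000)
noisy = np.where(rng.random(1000) < 0.8, gold, 1 - gold)  # an 80%-accurate classifier
print(corrected_proportion(noisy, noisy[:200], gold[:200]))
```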

Efficiently-Verifiable Strong Uniquely Solvable Puzzles and Matrix Multiplication

  • paper_url: http://arxiv.org/abs/2307.06463
  • repo_url: https://bitbucket.org/paraphase/matmult-v2
  • paper_authors: Matthew Anderson, Vu Le
  • for: The paper is written for developing fast matrix multiplication algorithms.
  • methods: The paper uses the Cohn-Umans framework and introduces a new subclass of strong uniquely solvable puzzles (SUSPs) called simplifiable SUSPs.
  • results: The paper shows that individual simplifiable SUSPs can achieve the same strength of bounds on the matrix multiplication exponent $\omega$ as infinite families of SUSPs, and reports on the construction, by computer search, of larger SUSPs than previously known for small width, which strengthens the upper bound on the matrix multiplication exponent obtainable via this computational approach from $2.66$ to $2.505$.
    Abstract We advance the Cohn-Umans framework for developing fast matrix multiplication algorithms. We introduce, analyze, and search for a new subclass of strong uniquely solvable puzzles (SUSP), which we call simplifiable SUSPs. We show that these puzzles are efficiently verifiable, which remains an open question for general SUSPs. We also show that individual simplifiable SUSPs can achieve the same strength of bounds on the matrix multiplication exponent $\omega$ that infinite families of SUSPs can. We report on the construction, by computer search, of larger SUSPs than previously known for small width. This, combined with our tighter analysis, strengthens the upper bound on the matrix multiplication exponent from $2.66$ to $2.505$ obtainable via this computational approach, and nears the results of the handcrafted constructions of Cohn et al.

ACTI at EVALITA 2023: Overview of the Conspiracy Theory Identification Task

  • paper_url: http://arxiv.org/abs/2307.06954
  • repo_url: None
  • paper_authors: Giuseppe Russo, Niklas Stoehr, Manoel Horta Ribeiro
  • for: This study aims to identify conspiratorial content and specific conspiracy theory categories.
  • methods: This study uses large language models to identify conspiratorial content and specific conspiracy theory categories.
  • results: The study finds that large language models can effectively identify conspiratorial content and specific conspiracy theory categories.
    Abstract Conspiracy Theory Identication task is a new shared task proposed for the first time at the Evalita 2023. The ACTI challenge, based exclusively on comments published on conspiratorial channels of telegram, is divided into two subtasks: (i) Conspiratorial Content Classification: identifying conspiratorial content and (ii) Conspiratorial Category Classification about specific conspiracy theory classification. A total of fifteen teams participated in the task for a total of 81 submissions. We illustrate the best performing approaches were based on the utilization of large language models. We finally draw conclusions about the utilization of these models for counteracting the spreading of misinformation in online platforms.

No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models

  • paper_url: http://arxiv.org/abs/2307.06440
  • repo_url: https://github.com/jeankaddour/notrainnogain
  • paper_authors: Jean Kaddour, Oscar Key, Piotr Nawrot, Pasquale Minervini, Matt J. Kusner
  • for: This work revisits efficient training algorithms for Transformer-based language models, motivated by the rapid growth in the computation required to train them.
  • methods: It evaluates three categories of efficient training algorithms: dynamic architectures (layer stacking, layer dropping), batch selection (selective backprop, RHO loss), and efficient optimizers (Lion, Sophia).
  • results: When pre-training BERT and T5 with a fixed computation budget, the training, validation, and downstream gains of these methods vanish relative to a baseline with a fully decayed learning rate. The paper also proposes an evaluation protocol that enables computation to be carried out on arbitrary machines by mapping all computation time to a reference machine (reference system time).
    Abstract The computation necessary for training Transformer-based language models has skyrocketed in recent years. This trend has motivated research on efficient training algorithms designed to improve training, validation, and downstream performance faster than standard training. In this work, we revisit three categories of such algorithms: dynamic architectures (layer stacking, layer dropping), batch selection (selective backprop, RHO loss), and efficient optimizers (Lion, Sophia). When pre-training BERT and T5 with a fixed computation budget using such methods, we find that their training, validation, and downstream gains vanish compared to a baseline with a fully-decayed learning rate. We define an evaluation protocol that enables computation to be done on arbitrary machines by mapping all computation time to a reference machine which we call reference system time. We discuss the limitations of our proposed protocol and release our code to encourage rigorous research in efficient training procedures: https://github.com/JeanKaddour/NoTrainNoGain.
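The reference-system-time idea can be approximated by timing a fixed calibration workload on both the local and the reference machine and rescaling measured wall-clock times by the ratio. The dense-matmul workload below is an arbitrary stand-in; the released code defines the actual protocol:

```python
import time
import numpy as np

def benchmark_seconds(n=2000, reps=5):
    """Time a fixed calibration workload (dense matmul) on this machine."""
    a = np.random.default_rng(0).normal(size=(n, n))
    t0 = time.perf_counter()
    for _ in range(reps):
        a @ a
    return (time.perf_counter() - t0) / reps

def to_reference_system_time(measured_seconds, reference_benchmark_seconds):
    """Rescale a measured training time to the reference machine's clock."""
    return measured_seconds * reference_benchmark_seconds / benchmark_seconds()

# e.g. a 3600 s run, with the reference machine taking 0.9 s on the workload:
print(to_reference_system_time(3600.0, 0.9))
```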

Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events

  • paper_url: http://arxiv.org/abs/2307.06439
  • repo_url: None
  • paper_authors: Yu Gu, Sheng Zhang, Naoto Usuyama, Yonas Woldesenbet, Cliff Wong, Praneeth Sanapathi, Mu Wei, Naveen Valluri, Erika Strandberg, Tristan Naumann, Hoifung Poon
  • for: This work studies how large language models (LLMs), such as GPT-4, can be used to scale biomedical knowledge curation for healthcare applications.
  • methods: It distills LLMs into task-specific student models through self-supervised learning, attaining substantial gains over out-of-box LLMs, with added advantages in cost, efficiency, and white-box model access.
  • results: A GPT-3.5-distilled PubMedBERT model attains accuracy comparable to supervised state-of-the-art models on standard adverse drug event (ADE) extraction without using any labeled data; despite being over 1,000 times smaller, it outperforms its teacher GPT-3.5 by over 6 absolute F1 points and GPT-4 by over 5.
    Abstract Large language models (LLMs), such as GPT-4, have demonstrated remarkable capabilities across a wide range of tasks, including health applications. In this paper, we study how LLMs can be used to scale biomedical knowledge curation. We find that while LLMs already possess decent competency in structuring biomedical text, by distillation into a task-specific student model through self-supervised learning, substantial gains can be attained over out-of-box LLMs, with additional advantages such as cost, efficiency, and white-box model access. We conduct a case study on adverse drug event (ADE) extraction, which is an important area for improving care. On standard ADE extraction evaluation, a GPT-3.5 distilled PubMedBERT model attained comparable accuracy as supervised state-of-the-art models without using any labeled data. Despite being over 1,000 times smaller, the distilled model outperformed its teacher GPT-3.5 by over 6 absolute points in F1 and GPT-4 by over 5 absolute points. Ablation studies on distillation model choice (e.g., PubMedBERT vs BioGPT) and ADE extraction architecture shed light on best practice for biomedical knowledge extraction. Similar gains were attained by distillation for other standard biomedical knowledge extraction tasks such as gene-disease associations and protected health information, further illustrating the promise of this approach.
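The distillation recipe reduces to: have the teacher LLM produce silver annotations on unlabeled text, then fine-tune the compact student on them. A schematic sketch with hypothetical `llm_annotate` and `train_step` interfaces (neither is the paper's API):

```python
def distill_ade_student(llm_annotate, train_step, unlabeled_texts, epochs=3):
    """Self-supervised distillation sketch: a teacher LLM (e.g. GPT-3.5) produces
    silver drug/adverse-event span annotations for unlabeled text, and a compact
    student (e.g. a PubMedBERT tagger) is fine-tuned on them -- no gold labels."""
    silver = [(text, llm_annotate(text)) for text in unlabeled_texts]
    for _ in range(epochs):
        for text, spans in silver:
            train_step(text, spans)

# Stubs standing in for the teacher call and the student's training step.
distill_ade_student(lambda t: [("naproxen", "DRUG")],
                    lambda t, s: None,
                    ["Patient developed rash after naproxen."])
```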

SSVEP-Based BCI Wheelchair Control System

  • paper_url: http://arxiv.org/abs/2307.08703
  • repo_url: None
  • paper_authors: Ce Zhou
  • for: This project aims to help disabled people, especially those paralyzed by motor disabilities, improve their quality of life.
  • methods: The system is based on the Steady-State Visual Evoked Potential (SSVEP). EEG signals are processed in Matlab with a Butterworth bandpass filter (for preprocessing) and the Fast Fourier Transform (FFT) (for feature extraction), and a harmonics-based classification method is applied; the visual stimulator combines LED flickers with LCD information displays on one panel, driven by microcontrollers.
  • results: Experiments with subjects of different races and ages show that the system is easy to operate and can achieve approximately a minimum 1-second time delay, suggesting that this SSVEP-based BCI-controlled wheelchair has strong potential for future use.
    Abstract A brain-computer interface (BCI) is a system that allows a person to communicate or control the surroundings without depending on the brain's normal output pathways of peripheral nerves and muscles. A lot of successful applications have arisen utilizing the advantages of BCI to assist disabled people with so-called assistive technology. Considering using BCI has fewer limitations and huge potential, this project has been proposed to control the movement of an electronic wheelchair via brain signals. The goal of this project is to help disabled people, especially paralyzed people suffering from motor disabilities, improve their life qualities. In order to realize the project stated above, Steady-State Visual Evoked Potential (SSVEP) is involved. It can be easily elicited in the visual cortical with the same frequency as the one is being focused by the subject. There are two important parts in this project. One is to process the EEG signals and another one is to make a visual stimulator using hardware. The EEG signals are processed in Matlab using the algorithm of Butterworth Infinite Impulse Response (IIR) bandpass filter (for preprocessing) and Fast Fourier Transform (FFT) (for feature extraction). Besides, a harmonics-based classification method is proposed and applied in the classification part. Moreover, the design of the visual stimulator combines LEDs as flickers and LCDs as information displayers on one panel. Microcontrollers are employed to control the SSVEP visual stimuli panel. This project is evaluated by subjects with different races and ages. Experimental results show the system is easy to be operated and it can achieve approximately a minimum 1-second time delay. So it demonstrates that this SSVEP-based BCI-controlled wheelchair has a huge potential to be applied to disabled people in the future.
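Although the paper implements the pipeline in Matlab, the same preprocessing, feature extraction, and harmonics-based classification can be sketched with SciPy; the sampling rate, target flicker frequencies, and filter band below are assumptions:

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 250.0                         # assumed EEG sampling rate (Hz)
TARGETS = [8.0, 10.0, 12.0, 15.0]  # assumed flicker frequencies for 4 commands

def bandpass(eeg, low=5.0, high=40.0, order=4):
    """Butterworth IIR bandpass, applied zero-phase via filtfilt."""
    b, a = butter(order, [low / (FS / 2), high / (FS / 2)], btype="band")
    return filtfilt(b, a, eeg)

def classify_ssvep(eeg, n_harmonics=2):
    """Pick the target whose fundamental + harmonics carry the most FFT power."""
    x = bandpass(eeg)
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1 / FS)
    def band_power(f):
        return sum(spectrum[np.argmin(np.abs(freqs - k * f))]
                   for k in range(1, n_harmonics + 1))
    scores = [band_power(f) for f in TARGETS]
    return TARGETS[int(np.argmax(scores))]

# Synthetic 2-second trial flickering at 12 Hz.
t = np.arange(0, 2, 1 / FS)
trial = np.sin(2 * np.pi * 12 * t) + 0.5 * np.random.default_rng(0).normal(size=t.size)
print(classify_ssvep(trial))  # expected: 12.0
```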

Designing Behavior Trees from Goal-Oriented LTLf Formulas

  • paper_url: http://arxiv.org/abs/2307.06399
  • repo_url: None
  • paper_authors: Aadesh Neupane, Michael A. Goodrich
  • for: This paper shows how to turn goals specified in a subset of finite trace Linear Temporal Logic (LTL) into a behavior tree (BT) that guarantees successful traces satisfy the LTL goal.
  • methods: Useful LTL formulas for achievement goals are derived from an achievement-oriented task mission grammar, yielding missions composed of tasks combined with LTL operators, which are then converted into BTs.
  • results: Constructing BTs from LTL formulas yields a relaxed behavior synthesis problem in which a wide range of planners can implement the BT's action nodes, and any successful trace they induce satisfies the corresponding LTL formula. The approach is demonstrated by exploring the alignment between two planners and LTL goals, and by solving a sequential key-door problem for a Fetch robot.
    Abstract Temporal logic can be used to formally specify autonomous agent goals, but synthesizing planners that guarantee goal satisfaction can be computationally prohibitive. This paper shows how to turn goals specified using a subset of finite trace Linear Temporal Logic (LTL) into a behavior tree (BT) that guarantees that successful traces satisfy the LTL goal. Useful LTL formulas for achievement goals can be derived using achievement-oriented task mission grammars, leading to missions made up of tasks combined using LTL operators. Constructing BTs from LTL formulas leads to a relaxed behavior synthesis problem in which a wide range of planners can implement the action nodes in the BT. Importantly, any successful trace induced by the planners satisfies the corresponding LTL formula. The usefulness of the approach is demonstrated in two ways: a) exploring the alignment between two planners and LTL goals, and b) solving a sequential key-door problem for a Fetch robot.
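The flavor of the construction can be conveyed with a minimal behavior tree: Sequence and Fallback nodes with common BT tick semantics, plus planner-implemented action nodes. The mapping shown here, an achievement goal "eventually key, then eventually door" becoming Sequence(Fallback(key?, get_key), Fallback(door?, open_door)), is a simplification of the paper's construction:

```python
SUCCESS, FAILURE, RUNNING = "SUCCESS", "FAILURE", "RUNNING"

class Sequence:                      # ticks children in order; fails fast
    def __init__(self, *children): self.children = children
    def tick(self, state):
        for c in self.children:
            s = c.tick(state)
            if s != SUCCESS:
                return s
        return SUCCESS

class Fallback:                      # returns first non-FAILURE child status
    def __init__(self, *children): self.children = children
    def tick(self, state):
        for c in self.children:
            s = c.tick(state)
            if s != FAILURE:
                return s
        return FAILURE

class Condition:
    def __init__(self, pred): self.pred = pred
    def tick(self, state): return SUCCESS if self.pred(state) else FAILURE

class Action:                        # a planner would implement this node
    def __init__(self, effect): self.effect = effect
    def tick(self, state): self.effect(state); return RUNNING

bt = Sequence(
    Fallback(Condition(lambda s: s["key"]), Action(lambda s: s.update(key=True))),
    Fallback(Condition(lambda s: s["door"]), Action(lambda s: s.update(door=True))),
)
state = {"key": False, "door": False}
while bt.tick(state) != SUCCESS:
    pass
print(state)
```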

Rethinking Answer Set Programming Templates

  • paper_url: http://arxiv.org/abs/2307.06382
  • repo_url: None
  • paper_authors: Mario Alviano, Giovambattista Ianni, Francesco Pacenza, Jessica Zangari
  • for: This work helps developers cope with the complexity of software development by materializing domain invariants in code, reducing implicit assumptions and risk.
  • methods: It introduces a notion of template for Answer Set Programming that, besides the don't repeat yourself principle, enforces locality of some predicates through a simple naming convention: local predicates are mapped to the usual global namespace of mainstream engines using universally unique identifiers to avoid name clashes.
  • results: Local predicates can enforce invariants on the expected outcome of a template in a possibly empty application context, independently of other rules added to that context, so transpiled template applications can be processed by mainstream engines and safely shared with other knowledge designers, even ones with zero knowledge of templates.
    Abstract In imperative programming, the Domain-Driven Design methodology helps in coping with the complexity of software development by materializing in code the invariants of a domain of interest. Code is cleaner and more secure because any implicit assumption is removed in favor of invariants, thus enabling a fail fast mindset and the immediate reporting of unexpected conditions. This article introduces a notion of template for Answer Set Programming that, in addition to the don't repeat yourself principle, enforces locality of some predicates by means of a simple naming convention. Local predicates are mapped to the usual global namespace adopted by mainstream engines, using universally unique identifiers to avoid name clashes. This way, local predicates can be used to enforce invariants on the expected outcome of a template in a possibly empty context of application, independently by other rules that can be added to such a context. Template applications transpiled this way can be processed by mainstream engines and safely shared with other knowledge designers, even when they have zero knowledge of templates.
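The naming convention can be emulated with a toy transpiler pass that rewrites local predicates to universally unique global names. The leading `__` marker and the regex are illustrative assumptions, not the paper's actual template syntax:

```python
import re
import uuid

def mangle_local_predicates(asp_rules: str) -> str:
    """Rewrite predicates following an assumed local-marker convention
    (leading '__') to universally unique global names, avoiding clashes
    when the template is applied in different contexts."""
    mapping = {}
    def rename(match):
        name = match.group(0)
        if name not in mapping:
            mapping[name] = f"local_{uuid.uuid4().hex}"
        return mapping[name]
    return re.sub(r"\b__\w+", rename, asp_rules)

template = """
__reach(X) :- start(X).
__reach(Y) :- __reach(X), edge(X, Y).
connected :- __reach(goal).
"""
print(mangle_local_predicates(template))
```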

Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation

  • paper_url: http://arxiv.org/abs/2307.06333
  • repo_url: None
  • paper_authors: Andi Peng, Aviv Netanyahu, Mark Ho, Tianmin Shu, Andreea Bobu, Julie Shah, Pulkit Agrawal
  • for: The goal is to adapt robot policies to individual users at test time so that agents better match personalized task objectives.
  • methods: The framework generates counterfactual demonstrations that let users quickly identify task-relevant and task-irrelevant concepts; the identified task-irrelevant concepts are then used for data augmentation to adapt the policy to the user's objectives.
  • results: Experiments on discrete and continuous control tasks with real human users show that the framework helps users understand agent failures, reduces the number of demonstrations required for fine-tuning, and aligns the agent to individual user task preferences.
    Abstract Policies often fail due to distribution shift -- changes in the state and reward that occur when a policy is deployed in new environments. Data augmentation can increase robustness by making the model invariant to task-irrelevant changes in the agent's observation. However, designers don't know which concepts are irrelevant a priori, especially when different end users have different preferences about how the task is performed. We propose an interactive framework to leverage feedback directly from the user to identify personalized task-irrelevant concepts. Our key idea is to generate counterfactual demonstrations that allow users to quickly identify possible task-relevant and irrelevant concepts. The knowledge of task-irrelevant concepts is then used to perform data augmentation and thus obtain a policy adapted to personalized user objectives. We present experiments validating our framework on discrete and continuous control tasks with real human users. Our method (1) enables users to better understand agent failure, (2) reduces the number of demonstrations required for fine-tuning, and (3) aligns the agent to individual user task preferences.

Budgeting Counterfactual for Offline RL

  • paper_url: http://arxiv.org/abs/2307.06328
  • repo_url: None
  • paper_authors: Yao Liu, Pratik Chaudhari, Rasool Fakoor
  • for: This paper addresses offline reinforcement learning with limited data, where decisions hinge on counterfactual reasoning about actions outside the dataset.
  • methods: It explicitly bounds the number of out-of-distribution actions taken during training, using dynamic programming to decide where to deviate from the behavior policy and where not to, balancing the potential gains of out-of-distribution actions against the risk of extrapolation errors.
  • results: The method is justified theoretically via the constrained optimality of the fixed-point solution to its Q-update rules, and empirically it outperforms state-of-the-art offline RL methods on tasks in the widely used D4RL benchmarks.
    Abstract The main challenge of offline reinforcement learning, where data is limited, arises from a sequence of counterfactual reasoning dilemmas within the realm of potential actions: What if we were to choose a different course of action? These circumstances frequently give rise to extrapolation errors, which tend to accumulate exponentially with the problem horizon. Hence, it becomes crucial to acknowledge that not all decision steps are equally important to the final outcome, and to budget the number of counterfactual decisions a policy make in order to control the extrapolation. Contrary to existing approaches that use regularization on either the policy or value function, we propose an approach to explicitly bound the amount of out-of-distribution actions during training. Specifically, our method utilizes dynamic programming to decide where to extrapolate and where not to, with an upper bound on the decisions different from behavior policy. It balances between the potential for improvement from taking out-of-distribution actions and the risk of making errors due to extrapolation. Theoretically, we justify our method by the constrained optimality of the fixed point solution to our $Q$ updating rules. Empirically, we show that the overall performance of our method is better than the state-of-the-art offline RL methods on tasks in the widely-used D4RL benchmarks.
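The budgeting idea admits a compact tabular sketch: augment the value function with a deviation budget b, where following the behavior policy is free and deviating (a greedy max over all actions) consumes one unit. This simplifies the paper's formulation to bare dynamic programming:

```python
import numpy as np

def budgeted_value_iteration(P, R, behavior, budget, gamma=0.95, iters=200):
    """Tabular sketch of budgeting counterfactuals: V[s, b] is the value with at
    most b remaining deviations from the behavior policy. Deviating (greedy max
    over all actions) consumes one budget unit; the behavior action is free."""
    A, S, _ = P.shape
    V = np.zeros((S, budget + 1))
    for _ in range(iters):
        for b in range(budget + 1):
            follow = np.array([R[s, behavior[s]] + gamma * P[behavior[s], s] @ V[:, b]
                               for s in range(S)])
            if b == 0:
                V[:, b] = follow
            else:
                q_dev = R + gamma * np.einsum("ast,t->sa", P, V[:, b - 1])
                V[:, b] = np.maximum(follow, q_dev.max(axis=1))
    return V

rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(4), size=(3, 4))  # 3 actions, 4 states
R = rng.random((4, 3))
behavior = rng.integers(0, 3, size=4)
print(budgeted_value_iteration(P, R, behavior, budget=2)[:, -1])
```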

FDAPT: Federated Domain-adaptive Pre-training for Language Models

  • paper_url: http://arxiv.org/abs/2307.06933
  • repo_url: None
  • paper_authors: Lekang Jiang, Filip Svoboda, Nicholas D. Lane
  • for: This paper studies how to combine domain-adaptive pre-training (DAPT) with federated learning (FL) to improve model adaptation while preserving data privacy.
  • methods: It conducts the first comprehensive empirical study of federated domain-adaptive pre-training (FDAPT).
  • results: FDAPT maintains downstream task performance competitive with the centralized baseline in both IID and non-IID settings. A new algorithm, Frozen Federated Domain-adaptive Pre-training (FFDAPT), improves computational efficiency by 12.1% on average while keeping downstream performance similar to standard FDAPT, with fluctuations under 1%.
    Abstract Combining Domain-adaptive Pre-training (DAPT) with Federated Learning (FL) can enhance model adaptation by leveraging more sensitive and distributed data while preserving data privacy. However, few studies have focused on this method. Therefore, we conduct the first comprehensive empirical study to evaluate the performance of Federated Domain-adaptive Pre-training (FDAPT). We demonstrate that FDAPT can maintain competitive downstream task performance to the centralized baseline in both IID and non-IID situations. Furthermore, we propose a novel algorithm, Frozen Federated Domain-adaptive Pre-training (FFDAPT). FFDAPT improves the computational efficiency by 12.1% on average and exhibits similar downstream task performance to standard FDAPT, with general performance fluctuations remaining less than 1%. Finally, through a critical evaluation of our work, we identify promising future research directions for this new research area.
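At its core, FDAPT is federated averaging of locally adapted language-model weights; FFDAPT-style freezing can be mimicked by exempting some parameters from averaging. The sketch below makes both moves explicit, with the frozen set chosen arbitrarily (the actual FFDAPT recipe is described in the paper):

```python
import numpy as np

def fedavg(client_weights, client_sizes, frozen=(), global_weights=None):
    """Weighted FedAvg over per-client parameter dicts {name: ndarray}.
    Parameters listed in `frozen` keep the global value instead of being
    averaged -- a rough stand-in for FFDAPT-style freezing (an assumption)."""
    total = sum(client_sizes)
    agg = {}
    for name in client_weights[0]:
        if name in frozen and global_weights is not None:
            agg[name] = global_weights[name]
        else:
            agg[name] = sum(w[name] * (n / total)
                            for w, n in zip(client_weights, client_sizes))
    return agg

# Two clients with locally adapted embedding and encoder parameters.
g = {"embed": np.zeros(4), "layer1": np.zeros(4)}
c1 = {"embed": np.ones(4), "layer1": np.ones(4)}
c2 = {"embed": 3 * np.ones(4), "layer1": 3 * np.ones(4)}
print(fedavg([c1, c2], [100, 300], frozen={"embed"}, global_weights=g))
```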

Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution

  • paper_url: http://arxiv.org/abs/2307.06304
  • repo_url: https://github.com/Natyren/NaViT
  • paper_authors: Mostafa Dehghani, Basil Mustafa, Josip Djolonga, Jonathan Heek, Matthias Minderer, Mathilde Caron, Andreas Steiner, Joan Puigcerver, Robert Geirhos, Ibrahim Alabdulmohsin, Avital Oliver, Piotr Padlewski, Alexey Gritsenko, Mario Lučić, Neil Houlsby
  • for: This work challenges the conventional practice of resizing images to a fixed resolution before processing, proposing NaViT (Native Resolution ViT), a model that handles input images of arbitrary resolution and aspect ratio.
  • methods: NaViT uses sequence packing during training to process inputs of arbitrary resolution and aspect ratio. The work also leverages variable sequence lengths during training to improve large-scale supervised and contrastive image-text pretraining.
  • results: In experiments, NaViT achieves higher training and inference efficiency and stronger performance across a range of image tasks. In addition, NaViT better preserves the texture and shape information of images, leading to improved results on robustness and fairness benchmarks.
    Abstract The ubiquitous and demonstrably suboptimal choice of resizing images to a fixed resolution before processing them with computer vision models has not yet been successfully challenged. However, models such as the Vision Transformer (ViT) offer flexible sequence-based modeling, and hence varying input sequence lengths. We take advantage of this with NaViT (Native Resolution ViT) which uses sequence packing during training to process inputs of arbitrary resolutions and aspect ratios. Alongside flexible model usage, we demonstrate improved training efficiency for large-scale supervised and contrastive image-text pretraining. NaViT can be efficiently transferred to standard tasks such as image and video classification, object detection, and semantic segmentation and leads to improved results on robustness and fairness benchmarks. At inference time, the input resolution flexibility can be used to smoothly navigate the test-time cost-performance trade-off. We believe that NaViT marks a departure from the standard, CNN-designed, input and modelling pipeline used by most computer vision models, and represents a promising direction for ViTs.
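The core mechanism, sequence packing, can be sketched in a few lines: patch-token sequences from images kept at their native resolutions are greedily packed into fixed-length rows, and per-token example ids yield an attention mask so tokens attend only within their own image. The first-fit packing policy and all names below are illustrative assumptions, not NaViT's actual implementation.

```python
# Minimal sketch of NaViT-style sequence packing for variable-resolution
# images. Names and the greedy first-fit policy are illustrative assumptions.
import numpy as np

def patchify_len(h, w, patch=16):
    """Number of patch tokens for an image kept at its native resolution."""
    return (h // patch) * (w // patch)

def pack_examples(image_sizes, seq_len=256, patch=16):
    """Greedy first-fit packing of variable-length token sequences.

    Returns a list of rows; each row is a list of (example_index, n_tokens).
    Assumes every image fits in seq_len (a real pipeline would resize/drop).
    """
    rows = []
    for idx, (h, w) in enumerate(image_sizes):
        n = patchify_len(h, w, patch)
        for row in rows:
            if sum(t for _, t in row) + n <= seq_len:
                row.append((idx, n))
                break
        else:
            rows.append([(idx, n)])
    return rows

def attention_mask(row, seq_len=256):
    """Boolean (seq_len, seq_len) mask: tokens attend only within their image."""
    ids = np.full(seq_len, -1)  # -1 marks padding positions
    pos = 0
    for idx, n in row:
        ids[pos:pos + n] = idx
        pos += n
    return (ids[:, None] == ids[None, :]) & (ids[:, None] != -1)
```

For example, images of sizes 224x224, 64x96, and 160x128 yield 196-, 24-, and 80-token sequences; with `seq_len=256` the first two pack into one row and the third starts a new one, so no image is ever resized to fit.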

Instruction Mining: High-Quality Instruction Data Selection for Large Language Models

  • paper_url: http://arxiv.org/abs/2307.06290
  • repo_url: None
  • paper_authors: Yihan Cao, Yanbin Kang, Lichao Sun
  • for: Improving language models’ ability to understand and respond to human instructions.
  • methods: Uses specific natural-language indicators to evaluate the quality of instruction-following data, and studies the relationship between these indicators and data quality through extensive finetuning experiments.
  • results: InstructMining is applied to select and evaluate instruction-following data, successfully picking out relatively high-quality samples, and its performance is validated by selecting from unseen datasets. Compared to models finetuned on unfiltered datasets, models finetuned on InstructMining-selected data perform better in 42.5% of cases.
    Abstract Large language models typically undergo two training stages, pretraining and finetuning. Despite that large-scale pretraining endows the model with strong capabilities to generate natural language responses, these pretrained models can still fail to understand human instructions at times. To enhance language models' ability of interpreting and responding to instructions, instruction finetuning has emerged as a critical method in this area. Recent studies found that large language models can be finetuned to perform well even with a small amount of high-quality instruction-following data. However, the selection of high-quality datasets for finetuning language models still lacks clear guidelines to follow. In this paper, we propose InstructMining, a linear rule for evaluating instruction-following data quality. We formulate InstructMining using specific natural language indicators. To investigate the relationship between data quality and these indicators, we further conduct extensive finetuning experiments. The experiment results are then applied to estimating parameters in InstructMining. To further investigate its performance, we use InstructMining to select high-quality data from unseen datasets. Results demonstrate that InstructMining can help select relatively high-quality samples from various instruction-following datasets. Compared to models finetuned on unfiltered datasets, models finetuned on InstructMining selected datasets perform better on 42.5% cases.
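The shape of the linear rule can be sketched as follows: each instruction-response pair is mapped to a vector of natural-language indicators, a linear function of those indicators predicts quality (here, an estimated loss where lower is better), and the best-scoring fraction of the pool is kept for finetuning. The indicator set, weights, and helper names are hypothetical stand-ins; the paper estimates its actual parameters from finetuning experiments.

```python
# Hedged sketch of a linear data-quality rule for instruction finetuning.
# The indicators and weights below are hypothetical placeholders, not the
# paper's fitted estimates.
import numpy as np

def indicators(example):
    """Map one {"instruction", "response"} pair to a feature vector.

    Real indicators would include quantities such as perplexity under a
    reference model or a reward-model score; these trivially computable
    stand-ins keep the sketch self-contained.
    """
    text = example["instruction"] + " " + example["response"]
    words = text.split()
    return np.array([
        len(words),                             # overall length
        len(set(words)) / max(len(words), 1),   # lexical diversity
        len(example["response"]) / max(len(example["instruction"]), 1),
    ])

def select_top(dataset, weights, bias=0.0, keep_ratio=0.2):
    """Score = weights . indicators + bias (predicted loss, lower is better);
    keep the best-scoring fraction of the dataset."""
    scores = np.array([weights @ indicators(ex) + bias for ex in dataset])
    k = max(1, int(len(dataset) * keep_ratio))
    order = np.argsort(scores)  # ascending: lowest predicted loss first
    return [dataset[i] for i in order[:k]]
```

Because the rule is linear, ranking a large candidate pool costs almost nothing compared to the finetuning runs it replaces, which is what makes this kind of data selection practical at scale.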