cs.AI - 2023-07-25

Argument Attribution Explanations in Quantitative Bipolar Argumentation Frameworks (Technical Report)

  • paper_url: http://arxiv.org/abs/2307.13582
  • repo_url: None
  • paper_authors: Xiang Yin, Nico Potyka, Francesca Toni
  • for: This paper aims to explain the quantitative reasoning outcomes of Argumentation Frameworks (AFs), and specifically of topic arguments in Quantitative Bipolar Argumentation Frameworks (QBAFs).
  • methods: The paper proposes a novel theory of Argument Attribution Explanations (AAEs), adapting the spirit of feature attribution from machine learning to explain arguments in AFs.
  • results: The paper demonstrates the applicability of AAEs through two case studies: fake news detection and movie recommender systems.
    Abstract Argumentative explainable AI has been advocated by several in recent years, with an increasing interest on explaining the reasoning outcomes of Argumentation Frameworks (AFs). While there is a considerable body of research on qualitatively explaining the reasoning outcomes of AFs with debates/disputes/dialogues in the spirit of extension-based semantics, explaining the quantitative reasoning outcomes of AFs under gradual semantics has not received much attention, despite widespread use in applications. In this paper, we contribute to filling this gap by proposing a novel theory of Argument Attribution Explanations (AAEs) by incorporating the spirit of feature attribution from machine learning in the context of Quantitative Bipolar Argumentation Frameworks (QBAFs): whereas feature attribution is used to determine the influence of features towards outputs of machine learning models, AAEs are used to determine the influence of arguments towards topic arguments of interest. We study desirable properties of AAEs, including some new ones and some partially adapted from the literature to our setting. To demonstrate the applicability of our AAEs in practice, we conclude by carrying out two case studies in the scenarios of fake news detection and movie recommender systems.
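The abstract does not spell out the AAE definition, but the idea can be made concrete with a toy sketch: compute argument strengths under a gradual semantics (DF-QuAD here, as one common choice) and score an argument's influence on a topic argument by the strength change its removal causes. The removal-based scoring rule and all names below are illustrative assumptions, not the paper's method.

```python
# Toy QBAF sketch: DF-QuAD strengths plus removal-based attribution.
def df_quad_strength(base, attackers, supporters):
    """Aggregate attacker/supporter strengths into a strength in [0, 1]."""
    def combine(vals):
        r = 0.0
        for v in vals:
            r = r + v - r * v  # probabilistic sum
        return r
    va, vs = combine(attackers), combine(supporters)
    return base + (vs - va) * (1 - base if vs >= va else base)

def strengths(qbaf):
    """qbaf: {arg: (base_score, [attackers], [supporters])}; acyclic, so
    iterating len(qbaf) times reaches the fixpoint."""
    s = {a: b for a, (b, _, _) in qbaf.items()}
    for _ in range(len(qbaf)):
        s = {a: df_quad_strength(b, [s[x] for x in att], [s[x] for x in sup])
             for a, (b, att, sup) in qbaf.items()}
    return s

def attribution(qbaf, topic, arg):
    """Influence of `arg` on `topic`: strength drop when `arg` is removed."""
    reduced = {a: (b, [x for x in att if x != arg], [x for x in sup if x != arg])
               for a, (b, att, sup) in qbaf.items() if a != arg}
    return strengths(qbaf)[topic] - strengths(reduced)[topic]

# Topic "a" is supported by "b" and attacked by "c".
qbaf = {"a": (0.5, ["c"], ["b"]), "b": (0.7, [], []), "c": (0.4, [], [])}
print({x: round(attribution(qbaf, "a", x), 3) for x in ["b", "c"]})
# supporter "b" gets a positive score, attacker "c" a negative one
```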

Reinterpreting survival analysis in the universal approximator age

  • paper_url: http://arxiv.org/abs/2307.13579
  • repo_url: https://github.com/sdittmer/survival_analysis_sumo_plus_plus
  • paper_authors: Sören Dittmer, Michael Roberts, Jacobus Preller, AIX COVNET, James H. F. Rudd, John A. D. Aston, Carola-Bibiane Schönlieb
  • for: This work aims to provide the tools needed to fully harness the potential of survival analysis in deep learning.
  • methods: The work contributes a new loss function, evaluation metrics, and the first universal approximating network that provably produces survival curves without numeric integration.
  • results: A large numerical study shows that the new loss function and model outperform other approaches.
    Abstract Survival analysis is an integral part of the statistical toolbox. However, while most domains of classical statistics have embraced deep learning, survival analysis only recently gained some minor attention from the deep learning community. This recent development is likely in part motivated by the COVID-19 pandemic. We aim to provide the tools needed to fully harness the potential of survival analysis in deep learning. On the one hand, we discuss how survival analysis connects to classification and regression. On the other hand, we provide technical tools. We provide a new loss function, evaluation metrics, and the first universal approximating network that provably produces survival curves without numeric integration. We show that the loss function and model outperform other approaches using a large numerical study.
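As a hedged illustration of "survival curves without numeric integration": one generic way to get a closed-form curve is a network constrained to be monotone in time, so that S(t|x) = 1 - F(t|x) can be read off directly. The tiny numpy model below is an assumption-laden stand-in, not the paper's architecture.

```python
# Monotone-in-t toy model: positive time-path weights make F increasing in t,
# so the survival curve S = 1 - F needs no integration.
import numpy as np

rng = np.random.default_rng(0)
Wx = rng.normal(size=(3, 8))        # covariate weights, unconstrained
wt_raw = rng.normal(size=(1, 8))    # time weights, mapped to positive below
v_raw = rng.normal(size=(8, 1))

def survival(t, x):
    wt = np.logaddexp(0.0, wt_raw)           # softplus -> positive
    v = np.logaddexp(0.0, v_raw)
    h = np.tanh(x @ Wx + t * wt)             # increasing in t (wt > 0, tanh monotone)
    F = 1.0 / (1.0 + np.exp(-(h @ v)))       # CDF estimate, increasing in t (v > 0)
    return 1.0 - F                           # survival curve, decreasing in t

x = rng.normal(size=(1, 3))
print([round(float(survival(t, x)[0, 0]), 3) for t in (0.0, 1.0, 5.0)])
```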

A Dual-mode Local Search Algorithm for Solving the Minimum Dominating Set Problem

  • paper_url: http://arxiv.org/abs/2307.16815
  • repo_url: None
  • paper_authors: Enqiang Zhu, Yu Zhang, Shengzhi Wang, Darren Strash, Chanjuan Liu
  • for: Solving the minimum dominating set (MinDS) problem on graphs, i.e., finding a smallest set $D$ of vertices such that every vertex not in $D$ is adjacent to at least one vertex in $D$.
  • methods: The authors propose an efficient local search algorithm (DmDS) that probabilistically chooses between two distinct vertex-swapping schemes. They further introduce a vertex selection criterion based on the frequency of vertices added to solutions, to resolve the tie-breaking cases that limit other algorithms, and a new strategy that improves initial solution quality via a greedy construction integrated with perturbation.
  • results: DmDS is compared against state-of-the-art algorithms on seven datasets comprising 346 instances (or families) with up to tens of millions of vertices. Experimental results show that DmDS achieves the best accuracy on almost all instances and finds much better solutions on a broad range of large real-world graphs.
    Abstract Given a graph, the minimum dominating set (MinDS) problem is to identify a smallest set $D$ of vertices such that every vertex not in $D$ is adjacent to at least one vertex in $D$. The MinDS problem is a classic $\mathcal{NP}$-hard problem and has been extensively studied because of its many disparate applications in network analysis. To solve this problem efficiently, many heuristic approaches have been proposed to obtain a good solution within an acceptable time limit. However, existing MinDS heuristic algorithms are always limited by various tie-breaking cases when selecting vertices, which slows down the effectiveness of the algorithms. In this paper, we design an efficient local search algorithm for the MinDS problem, named DmDS -- a dual-mode local search framework that probabilistically chooses between two distinct vertex-swapping schemes. We further address limitations of other algorithms by introducing vertex selection criterion based on the frequency of vertices added to solutions to address tie-breaking cases, and a new strategy to improve the quality of the initial solution via a greedy-based strategy integrated with perturbation. We evaluate DmDS against the state-of-the-art algorithms on seven datasets, consisting of 346 instances (or families) with up to tens of millions of vertices. Experimental results show that DmDS obtains the best performance in accuracy for almost all instances and finds much better solutions than state-of-the-art MinDS algorithms on a broad range of large real-world graphs.
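A minimal sketch of two ingredients the abstract names, a greedy initial solution and frequency-based tie-breaking, assuming a simple adjacency-set graph representation; DmDS's dual-mode swap schemes and perturbation are not reproduced here.

```python
# Greedy dominating set with frequency-based tie-breaking (illustrative only).
def greedy_dominating_set(adj, freq=None):
    """adj: {v: set(neighbors)}. Pick vertices covering the most undominated
    vertices; break ties by preferring vertices added less often before."""
    freq = freq or {v: 0 for v in adj}
    undominated, D = set(adj), set()
    while undominated:
        # coverage gain = undominated vertices in the closed neighborhood
        gain = lambda v: len(({v} | adj[v]) & undominated)
        best = max(adj.keys() - D, key=lambda v: (gain(v), -freq[v]))
        D.add(best)
        freq[best] += 1
        undominated -= {best} | adj[best]
    return D

# Toy graph: the path 0-1-2-3-4; {1, 3} dominates it.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(greedy_dominating_set(adj))
```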

The Impact of Imperfect XAI on Human-AI Decision-Making

  • paper_url: http://arxiv.org/abs/2307.13566
  • repo_url: None
  • paper_authors: Katelyn Morrison, Philipp Spitzer, Violet Turri, Michelle Feng, Niklas Kühl, Adam Perer
  • for: This study examines how humans handle incorrect explanations during human-AI collaboration, with the aim of improving the effectiveness of such collaboration.
  • methods: The authors conducted a mixed-methods user study with 136 participants to evaluate how incorrect explanations influence humans' decision-making in a bird species identification task.
  • results: The study finds that imperfect explanations affect humans' reliance on AI and human-AI team performance, and that an explanation's level of assertiveness also influences decision-making behavior. These findings clarify the impact of imperfect XAI on human-AI collaboration and yield guidelines for designing human-AI collaboration systems.
    Abstract Explainability techniques are rapidly being developed to improve human-AI decision-making across various cooperative work settings. Consequently, previous research has evaluated how decision-makers collaborate with imperfect AI by investigating appropriate reliance and task performance with the aim of designing more human-centered computer-supported collaborative tools. Several human-centered explainable AI (XAI) techniques have been proposed in hopes of improving decision-makers' collaboration with AI; however, these techniques are grounded in findings from previous studies that primarily focus on the impact of incorrect AI advice. Few studies acknowledge the possibility for the explanations to be incorrect even if the AI advice is correct. Thus, it is crucial to understand how imperfect XAI affects human-AI decision-making. In this work, we contribute a robust, mixed-methods user study with 136 participants to evaluate how incorrect explanations influence humans' decision-making behavior in a bird species identification task taking into account their level of expertise and an explanation's level of assertiveness. Our findings reveal the influence of imperfect XAI and humans' level of expertise on their reliance on AI and human-AI team performance. We also discuss how explanations can deceive decision-makers during human-AI collaboration. Hence, we shed light on the impacts of imperfect XAI in the field of computer-supported cooperative work and provide guidelines for designers of human-AI collaboration systems.

Decision-Focused Learning: Foundations, State of the Art, Benchmark and Future Opportunities

  • paper_url: http://arxiv.org/abs/2307.13565
  • repo_url: https://github.com/predopt/predopt-benchmarks
  • paper_authors: Jayanta Mandi, James Kotary, Senne Berden, Maxime Mulamba, Victor Bucarey, Tias Guns, Ferdinando Fioretto
  • for: This paper surveys decision-focused learning (DFL), an emerging machine learning paradigm that integrates prediction and optimization in an end-to-end system trained to make optimal decisions under uncertainty.
  • methods: The paper reviews the techniques devised to integrate machine learning and optimization models, introduces a taxonomy of DFL methods distinguished by their unique characteristics, and proposes suitable benchmark datasets and tasks for DFL.
  • results: The paper conducts an extensive empirical evaluation of DFL methods and provides valuable insights into current and potential future avenues of DFL research.
    Abstract Decision-focused learning (DFL) is an emerging paradigm in machine learning which trains a model to optimize decisions, integrating prediction and optimization in an end-to-end system. This paradigm holds the promise to revolutionize decision-making in many real-world applications which operate under uncertainty, where the estimation of unknown parameters within these decision models often becomes a substantial roadblock. This paper presents a comprehensive review of DFL. It provides an in-depth analysis of the various techniques devised to integrate machine learning and optimization models, introduces a taxonomy of DFL methods distinguished by their unique characteristics, and conducts an extensive empirical evaluation of these methods proposing suitable benchmark dataset and tasks for DFL. Finally, the study provides valuable insights into current and potential future avenues in DFL research.
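To make the predict-and-optimize integration concrete, here is a hedged sketch of one well-known loss from this literature, the SPO+ subgradient, on a toy problem with a linear predictor and a small discrete decision set. The data, model, and candidate decisions are invented for illustration and are not from the survey.

```python
# SPO+ sketch: train the cost predictor with the subgradient
# 2 * (w*(c) - w*(2*c_hat - c)), where w*(.) is the optimal decision.
import numpy as np

rng = np.random.default_rng(1)
W_candidates = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0]], float)  # feasible decisions

def w_star(c):
    """Minimize c^T w over the candidate decisions."""
    return W_candidates[np.argmin(W_candidates @ c)]

theta = rng.normal(size=(3, 3))          # linear predictor: c_hat = theta @ x
for step in range(200):
    x = rng.normal(size=3)
    c = (x + 1.0) ** 2                                  # true costs (toy rule)
    c_hat = theta @ x
    grad_c = 2 * (w_star(c) - w_star(2 * c_hat - c))    # SPO+ subgradient w.r.t. c_hat
    theta -= 0.05 * np.outer(grad_c, x)                 # chain rule through c_hat
print("regret on one sample:", w_star(theta @ x) @ c - w_star(c) @ c)
```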

On Solving the Rubik’s Cube with Domain-Independent Planners Using Standard Representations

  • paper_url: http://arxiv.org/abs/2307.13552
  • repo_url: None
  • paper_authors: Bharath Muppasani, Vishal Pallagani, Biplav Srivastava, Forest Agostinelli
  • for: This paper presents the first representation of the Rubik's Cube puzzle in the PDDL language, making the domain more accessible to PDDL planners, competitions, and knowledge engineering tools, and more human-readable.
  • methods: The paper bridges across existing approaches and compares performance, including the DeepCubeA solver with a custom representation, the Scorpion planner with a State-Action-Space+ (SAS+) representation and pattern database heuristics, and FastDownward with the PDDL representation and the FF heuristic.
  • results: Experiments reveal trade-offs between representational choice and plan optimality: DeepCubeA (trained with 12 RC actions) solves all problems across varying complexities, but only 78.5% of its plans are optimal; Scorpion with the SAS+ representation solves 61.50% of the problems optimally; and FastDownward with the PDDL representation and FF heuristic solves 56.50% of the problems, of which 79.64% of the generated plans are optimal.
    Abstract Rubik's Cube (RC) is a well-known and computationally challenging puzzle that has motivated AI researchers to explore efficient alternative representations and problem-solving methods. The ideal situation for planning here is that a problem be solved optimally and efficiently represented in a standard notation using a general-purpose solver and heuristics. The fastest solver today for RC is DeepCubeA with a custom representation, and another approach is with Scorpion planner with State-Action-Space+ (SAS+) representation. In this paper, we present the first RC representation in the popular PDDL language so that the domain becomes more accessible to PDDL planners, competitions, and knowledge engineering tools, and is more human-readable. We then bridge across existing approaches and compare performance. We find that in one comparable experiment, DeepCubeA (trained with 12 RC actions) solves all problems with varying complexities, albeit only 78.5% are optimal plans. For the same problem set, Scorpion with SAS+ representation and pattern database heuristics solves 61.50% problems optimally, while FastDownward with PDDL representation and FF heuristic solves 56.50% problems, out of which 79.64% of the plans generated were optimal. Our study provides valuable insights into the trade-offs between representational choice and plan optimality that can help researchers design future strategies for challenging domains combining general-purpose solving methods (planning, reinforcement learning), heuristics, and representations (standard or custom).
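As a toy illustration of modeling puzzle moves as explicit state transitions, the kind of standard encoding a PDDL model spells out, the sketch below applies a permutation action to a sticker-state vector. The 8-sticker "face" is invented and far smaller than the real cube.

```python
# Moves as permutations over a state vector (illustrative, not the paper's model).
def apply_move(state, perm):
    """Each move sends the sticker at position i to position perm[i]."""
    out = list(state)
    for i, p in enumerate(perm):
        out[p] = state[i]
    return tuple(out)

# A quarter-turn cycling four stickers; four applications give the identity.
turn = [2, 0, 3, 1, 4, 5, 6, 7]
s = tuple(range(8))
t = s
for _ in range(4):
    t = apply_move(t, turn)
assert t == s
print(apply_move(s, turn))
```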

A Planning Ontology to Represent and Exploit Planning Knowledge for Performance Efficiency

  • paper_url: http://arxiv.org/abs/2307.13549
  • repo_url: None
  • paper_authors: Bharath Muppasani, Vishal Pallagani, Biplav Srivastava, Raghava Mutharaju, Michael N. Huhns, Vignesh Narayanan
  • for: This study addresses automated planning, i.e., finding a sequence of actions that moves an agent from an initial state of the world to a desired goal state.
  • methods: The authors construct a planning ontology from International Planning Competition (IPC) data on planning domains and planners, and demonstrate through experiments in two use cases that the ontology can help select promising planners and improve their performance using macros.
  • results: Experimental results show that the planning ontology enables the selection of suitable planners for a domain and improves their performance. The authors also make the planning ontology and associated resources available to the community to promote further research.
    Abstract Ontologies are known for their ability to organize rich metadata, support the identification of novel insights via semantic queries, and promote reuse. In this paper, we consider the problem of automated planning, where the objective is to find a sequence of actions that will move an agent from an initial state of the world to a desired goal state. We hypothesize that given a large number of available planners and diverse planning domains; they carry essential information that can be leveraged to identify suitable planners and improve their performance for a domain. We use data on planning domains and planners from the International Planning Competition (IPC) to construct a planning ontology and demonstrate via experiments in two use cases that the ontology can lead to the selection of promising planners and improving their performance using macros - a form of action ordering constraints extracted from planning ontology. We also make the planning ontology and associated resources available to the community to promote further research.
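One simple way to picture the macro idea (action-ordering constraints mined from plans) is to count frequent consecutive action pairs across example plans. The plans and threshold below are made up, and the paper's ontology-driven extraction is not shown.

```python
# Candidate macros as frequent consecutive action pairs (toy sketch).
from collections import Counter

plans = [
    ["pick", "move", "drop", "move"],
    ["pick", "move", "drop", "pick", "move", "drop"],
]
pairs = Counter((a, b) for plan in plans for a, b in zip(plan, plan[1:]))
macros = [pair for pair, n in pairs.items() if n >= 2]
print(macros)   # e.g. [('pick', 'move'), ('move', 'drop')]
```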

Group Activity Recognition in Computer Vision: A Comprehensive Review, Challenges, and Future Perspectives

  • paper_url: http://arxiv.org/abs/2307.13541
  • repo_url: None
  • paper_authors: Chuanchuan Wang, Ahmad Sufril Azlan Mohamed
  • for: This survey aims to advance group activity recognition, i.e., recognizing activities through group relationships, with a specific focus on global interactivity and activities.
  • methods: The survey reviews a wide range of approaches, from traditional methodologies, spatial-structure-based methods, descriptors, and non-deep-learning methods to hierarchical recurrent neural networks (HRNN), relationship models, and attention mechanisms, and presents the relational network and relational architectures for each module.
  • results: The survey compares group activity recognition methods against state-of-the-art technologies, summarizes the existing challenges, provides comprehensive guidance for newcomers, and reviews emerging perspectives to explore new directions and possibilities.
    Abstract Group activity recognition is a hot topic in computer vision. Recognizing activities through group relationships plays a vital role in group activity recognition. It holds practical implications in various scenarios, such as video analysis, surveillance, automatic driving, and understanding social activities. The model's key capabilities encompass efficiently modeling hierarchical relationships within a scene and accurately extracting distinctive spatiotemporal features from groups. Given this technology's extensive applicability, identifying group activities has garnered significant research attention. This work examines the current progress in technology for recognizing group activities, with a specific focus on global interactivity and activities. Firstly, we comprehensively review the pertinent literature and various group activity recognition approaches, from traditional methodologies to the latest methods based on spatial structure, descriptors, non-deep learning, hierarchical recurrent neural networks (HRNN), relationship models, and attention mechanisms. Subsequently, we present the relational network and relational architectures for each module. Thirdly, we investigate methods for recognizing group activity and compare their performance with state-of-the-art technologies. We summarize the existing challenges and provide comprehensive guidance for newcomers to understand group activity recognition. Furthermore, we review emerging perspectives in group activity recognition to explore new directions and possibilities.

Spectrum-guided Multi-granularity Referring Video Object Segmentation

  • paper_url: http://arxiv.org/abs/2307.13537
  • repo_url: https://github.com/bo-miao/sgmg
  • paper_authors: Bo Miao, Mohammed Bennamoun, Yongsheng Gao, Ajmal Mian
  • for: This paper addresses the feature drift problem in existing referring video object segmentation (R-VOS) techniques, with the goal of improving segmentation quality.
  • methods: The paper proposes a Spectrum-guided Multi-granularity (SgMg) approach that performs direct segmentation on the encoded features and employs visual details to further optimize the masks, together with Spectrum-guided Cross-modal Fusion (SCF), which performs intra-frame global interactions in the spectral domain for effective multimodal representation.
  • results: Experiments show that SgMg achieves state-of-the-art performance on four video benchmark datasets, outperforming the nearest competitor by 2.8% points on Ref-YouTube-VOS. An extension of SgMg further enables multi-object R-VOS, running about 3 times faster while maintaining satisfactory performance.
    Abstract Current referring video object segmentation (R-VOS) techniques extract conditional kernels from encoded (low-resolution) vision-language features to segment the decoded high-resolution features. We discovered that this causes significant feature drift, which the segmentation kernels struggle to perceive during the forward computation. This negatively affects the ability of segmentation kernels. To address the drift problem, we propose a Spectrum-guided Multi-granularity (SgMg) approach, which performs direct segmentation on the encoded features and employs visual details to further optimize the masks. In addition, we propose Spectrum-guided Cross-modal Fusion (SCF) to perform intra-frame global interactions in the spectral domain for effective multimodal representation. Finally, we extend SgMg to perform multi-object R-VOS, a new paradigm that enables simultaneous segmentation of multiple referred objects in a video. This not only makes R-VOS faster, but also more practical. Extensive experiments show that SgMg achieves state-of-the-art performance on four video benchmark datasets, outperforming the nearest competitor by 2.8% points on Ref-YouTube-VOS. Our extended SgMg enables multi-object R-VOS, runs about 3 times faster while maintaining satisfactory performance. Code is available at https://github.com/bo-miao/SgMg.
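A hedged sketch of the general shape of spectrum-guided fusion: move visual features to the frequency domain, modulate them with a text-derived per-channel filter, and transform back. The shapes and the sigmoid gating rule are invented for illustration and are not SCF's actual design.

```python
# Frequency-domain modulation of visual features by a language embedding.
import numpy as np

rng = np.random.default_rng(0)
visual = rng.normal(size=(16, 16, 8))        # H x W x C feature map
text = rng.normal(size=(8,))                 # pooled language embedding, one gate per channel

spec = np.fft.fft2(visual, axes=(0, 1))      # per-channel 2D spectrum
gate = 1.0 / (1.0 + np.exp(-text))           # sigmoid gate in (0, 1)
fused = np.fft.ifft2(spec * gate, axes=(0, 1)).real
print(fused.shape)                           # (16, 16, 8)
```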

Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection

  • paper_url: http://arxiv.org/abs/2307.13529
  • repo_url: None
  • paper_authors: Yichao Cao, Xiu Su, Qingfei Tang, Feng Yang, Shan You, Xiaobo Lu, Chang Xu
  • for: Improving the accuracy of Human-Object Interaction (HOI) detection, in which visual models must address the complex interactive relationships between humans and objects.
  • methods: The paper presents a systematic and unified framework (RmLR) that enhances HOI detection with structured text knowledge. It first analyzes the loss of interaction information in two-stage HOI detectors and proposes a re-mining strategy to generate more comprehensive visual representations. It then designs fine-grained sentence- and word-level alignment and knowledge transfer strategies to address the many-to-many matching problem between multiple interactions and multiple texts, alleviating the matching confusion that arises when multiple interactions occur simultaneously.
  • results: Experimental results show that the approach achieves state-of-the-art performance on public benchmarks. The authors further analyze the effects of the different components to provide insights into the method's efficacy.
    Abstract Human-Object Interaction (HOI) detection is a challenging computer vision task that requires visual models to address the complex interactive relationship between humans and objects and predict HOI triplets. Despite the challenges posed by the numerous interaction combinations, they also offer opportunities for multimodal learning of visual texts. In this paper, we present a systematic and unified framework (RmLR) that enhances HOI detection by incorporating structured text knowledge. Firstly, we qualitatively and quantitatively analyze the loss of interaction information in the two-stage HOI detector and propose a re-mining strategy to generate more comprehensive visual representation.Secondly, we design more fine-grained sentence- and word-level alignment and knowledge transfer strategies to effectively address the many-to-many matching problem between multiple interactions and multiple texts.These strategies alleviate the matching confusion problem that arises when multiple interactions occur simultaneously, thereby improving the effectiveness of the alignment process. Finally, HOI reasoning by visual features augmented with textual knowledge substantially improves the understanding of interactions. Experimental results illustrate the effectiveness of our approach, where state-of-the-art performance is achieved on public benchmarks. We further analyze the effects of different components of our approach to provide insights into its efficacy.
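At a high level, fine-grained alignment scores every (interaction feature, word embedding) pair; the minimal cosine-similarity sketch below reduces the many-to-many matching to an argmax for illustration, with all dimensions invented.

```python
# Cosine-similarity alignment of interaction features to word embeddings.
import numpy as np

rng = np.random.default_rng(2)
inter = rng.normal(size=(4, 16))   # 4 detected interactions
words = rng.normal(size=(6, 16))   # 6 word embeddings

def normalize(m):
    return m / np.linalg.norm(m, axis=1, keepdims=True)

sim = normalize(inter) @ normalize(words).T   # cosine similarities, 4 x 6
print(sim.argmax(axis=1))                     # best-matching word per interaction
```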

FacTool: Factuality Detection in Generative AI – A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios

  • paper_url: http://arxiv.org/abs/2307.13528
  • repo_url: https://github.com/gair-nlp/factool
  • paper_authors: I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng, Chunting Zhou, Junxian He, Graham Neubig, Pengfei Liu
  • for: Detecting factual errors in text produced by generative models.
  • methods: The paper proposes FacTool, a task- and domain-agnostic framework for detecting factual errors in texts generated by large language models (e.g., ChatGPT).
  • results: Experiments on four different tasks (knowledge-based QA, code generation, mathematical reasoning, and scientific literature review) show the efficacy of the proposed method.
    Abstract The emergence of generative pre-trained models has facilitated the synthesis of high-quality text, but it has also posed challenges in identifying factual errors in the generated text. In particular: (1) A wider range of tasks now face an increasing risk of containing factual errors when handled by generative models. (2) Generated texts tend to be lengthy and lack a clearly defined granularity for individual facts. (3) There is a scarcity of explicit evidence available during the process of fact checking. With the above challenges in mind, in this paper, we propose FacTool, a task and domain agnostic framework for detecting factual errors of texts generated by large language models (e.g., ChatGPT). Experiments on four different tasks (knowledge-based QA, code generation, mathematical reasoning, and scientific literature review) show the efficacy of the proposed method. We release the code of FacTool associated with ChatGPT plugin interface at https://github.com/GAIR-NLP/factool .
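The framework's structure can be sketched as a three-stage pipeline: claim extraction, evidence gathering, and a verdict. The stubs below stand in for the LLM prompts and external tools; the real implementation lives in the linked repository.

```python
# Skeleton of a tool-augmented factuality pipeline (stubs, not FacTool's code).
def extract_claims(text):
    """Stub: in practice an LLM splits generated text into atomic claims."""
    return [s.strip() for s in text.split(".") if s.strip()]

def gather_evidence(claim):
    """Stub: in practice a search engine, interpreter, or API is queried."""
    return ["(retrieved evidence snippet for: %s)" % claim]

def verdict(claim, evidence):
    """Stub: in practice an LLM judges the claim against the evidence."""
    return {"claim": claim, "evidence": evidence, "factual": None}

generated = "The Eiffel Tower is in Paris. It was finished in 1889."
print([verdict(c, gather_evidence(c)) for c in extract_claims(generated)])
```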

An Empirical Study on Fairness Improvement with Multiple Protected Attributes

  • paper_url: http://arxiv.org/abs/2308.01923
  • repo_url: None
  • paper_authors: Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Mark Harman
  • for: This paper targets fairness improvement with respect to multiple protected attributes, whereas existing research mostly improves fairness for a single protected attribute at a time.
  • methods: The paper conducts an extensive study of 11 state-of-the-art fairness improvement methods, analyzing their effectiveness across different datasets, metrics, and machine learning models when multiple protected attributes are considered.
  • results: The study finds that improving fairness for a single protected attribute can largely decrease fairness regarding unconsidered protected attributes, in up to 88.3% of scenarios (57.5% on average). Surprisingly, there is little difference in accuracy loss between the single- and multiple-attribute settings, so accuracy can be maintained; however, the effects on precision and recall when handling multiple protected attributes are about 5 times and 8 times those of a single attribute. These results imply that reporting only accuracy as the ML performance metric, as is currently common in the literature, is inadequate.
    Abstract Existing research mostly improves the fairness of Machine Learning (ML) software regarding a single protected attribute at a time, but this is unrealistic given that many users have multiple protected attributes. This paper conducts an extensive study of fairness improvement regarding multiple protected attributes, covering 11 state-of-the-art fairness improvement methods. We analyze the effectiveness of these methods with different datasets, metrics, and ML models when considering multiple protected attributes. The results reveal that improving fairness for a single protected attribute can largely decrease fairness regarding unconsidered protected attributes. This decrease is observed in up to 88.3% of scenarios (57.5% on average). More surprisingly, we find little difference in accuracy loss when considering single and multiple protected attributes, indicating that accuracy can be maintained in the multiple-attribute paradigm. However, the effect on precision and recall when handling multiple protected attributes is about 5 times and 8 times that of a single attribute. This has important implications for future fairness research: reporting only accuracy as the ML performance metric, which is currently common in the literature, is inadequate.
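The measurement point generalizes to a few lines: check a fairness metric on every protected attribute, not only the one that was mitigated. The sketch below uses statistical parity difference on synthetic data, chosen as one common metric; the paper studies several.

```python
# Statistical parity difference per protected attribute (synthetic data).
import numpy as np

rng = np.random.default_rng(3)
n = 1000
sex = rng.integers(0, 2, n)
race = rng.integers(0, 2, n)
pred = (rng.random(n) < 0.4 + 0.2 * sex).astype(int)   # biased w.r.t. sex only

def spd(pred, group):
    """P(pred=1 | group=1) - P(pred=1 | group=0)."""
    return pred[group == 1].mean() - pred[group == 0].mean()

print("SPD sex: %.3f  SPD race: %.3f" % (spd(pred, sex), spd(pred, race)))
```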

Zshot: An Open-source Framework for Zero-Shot Named Entity Recognition and Relation Extraction

  • paper_url: http://arxiv.org/abs/2307.13497
  • repo_url: None
  • paper_authors: Gabriele Picco, Marcos Martínez Galindo, Alberto Purpura, Leopold Fuchs, Vanessa López, Hoang Thanh Lam
  • for: This work provides a framework that lets researchers compare multiple state-of-the-art zero-shot learning (ZSL) methods against standard benchmark datasets.
  • methods: The framework builds on large pretrained language models and the novel methods they have enabled, which have led to substantial improvements in ZSL performance.
  • results: The work presents Zshot, a new ZSL framework with an extendible and evaluable API designed for production under the standard SpaCy NLP pipeline, along with enhancements such as pipeline ensembling for boosting accuracy and visualization utilities available as a SpaCy extension.
    Abstract The Zero-Shot Learning (ZSL) task pertains to the identification of entities or relations in texts that were not seen during training. ZSL has emerged as a critical research area due to the scarcity of labeled data in specific domains, and its applications have grown significantly in recent years. With the advent of large pretrained language models, several novel methods have been proposed, resulting in substantial improvements in ZSL performance. There is a growing demand, both in the research community and industry, for a comprehensive ZSL framework that facilitates the development and accessibility of the latest methods and pretrained models.In this study, we propose a novel ZSL framework called Zshot that aims to address the aforementioned challenges. Our primary objective is to provide a platform that allows researchers to compare different state-of-the-art ZSL methods with standard benchmark datasets. Additionally, we have designed our framework to support the industry with readily available APIs for production under the standard SpaCy NLP pipeline. Our API is extendible and evaluable, moreover, we include numerous enhancements such as boosting the accuracy with pipeline ensembling and visualization utilities available as a SpaCy extension.

Duet: efficient and scalable hybriD neUral rElation undersTanding

  • paper_url: http://arxiv.org/abs/2307.13494
  • repo_url: https://github.com/GIS-PuppetMaster/Duet
  • paper_authors: Kaixin Zhang, Hongzhi Wang, Yabin Lu, Ziqi Li, Chang Shu, Yu Yan, Donghua Yang
  • for: Cardinality estimation, particularly on high-cardinality and high-dimensional tables, where training cost, scalability, instability, and long-tailed distributions limit the practical application of learned estimators.
  • methods: The paper introduces predicate information into the autoregressive model and proposes Duet, a stable, efficient, and scalable hybrid method that estimates cardinality directly without sampling or any non-differentiable process, reducing inference complexity from O(n) to O(1) compared to Naru and UAE while achieving higher accuracy on high-cardinality and high-dimensional tables.
  • results: Experimental results show that Duet achieves all of the design goals above and is much more practical, even attaining a lower inference cost on CPU than most learned methods incur on GPU.
    Abstract Learned cardinality estimation methods have achieved high precision compared to traditional methods. Among learned methods, query-driven approaches face the data and workload drift problem for a long time. Although both query-driven and hybrid methods are proposed to avoid this problem, even the state-of-the-art of them suffer from high training and estimation costs, limited scalability, instability, and long-tailed distribution problem on high cardinality and high-dimensional tables, which seriously affects the practical application of learned cardinality estimators. In this paper, we prove that most of these problems are directly caused by the widely used progressive sampling. We solve this problem by introducing predicates information into the autoregressive model and propose Duet, a stable, efficient, and scalable hybrid method to estimate cardinality directly without sampling or any non-differentiable process, which can not only reduces the inference complexity from O(n) to O(1) compared to Naru and UAE but also achieve higher accuracy on high cardinality and high-dimensional tables. Experimental results show that Duet can achieve all the design goals above and be much more practical and even has a lower inference cost on CPU than that of most learned methods on GPU.
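A hedged sketch of the autoregressive factorization behind such estimators: selectivity as a product of per-column conditionals restricted by the predicates, sel = P(pred1(c1)) * P(pred2(c2) | pred1(c1)). Counting over a toy table stands in for the neural conditionals, and Duet's O(1) predicate-conditioned inference is not reproduced.

```python
# Autoregressive selectivity estimate via counted conditionals (toy sketch).
from collections import Counter

table = [(1, "a"), (1, "b"), (2, "b"), (2, "b"), (3, "a")]

def selectivity(pred1, pred2):
    c1 = Counter(r[0] for r in table)
    p1 = sum(n for v, n in c1.items() if pred1(v)) / len(table)
    rows1 = [r for r in table if pred1(r[0])]
    p2 = sum(1 for r in rows1 if pred2(r[1])) / max(len(rows1), 1)
    return p1 * p2

est = selectivity(lambda v: v <= 2, lambda v: v == "b")
print("estimated cardinality:", est * len(table))   # exact here: 3 rows match
```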

Integrating processed-based models and machine learning for crop yield prediction

  • paper_url: http://arxiv.org/abs/2307.13466
  • repo_url: None
  • paper_authors: Michiel G. J. Kallenberg, Bernardo Maestrini, Ron van Bree, Paul Ravensbergen, Christos Pylianidis, Frits van Evert, Ioannis N. Athanasiadis
  • for: Crop yield prediction, specifically potato yield prediction.
  • methods: A hybrid meta-modeling approach: a crop growth model generates synthetic data for (pre)training a convolutional neural net, which is then fine-tuned with observational data.
  • results: The meta-modeling approach outperforms a purely data-driven baseline in silico and is competitive with the crop growth model on real-world data, but further validation on extensive real-world datasets is needed to establish its practical effectiveness.
    Abstract Crop yield prediction typically involves the utilization of either theory-driven process-based crop growth models, which have proven to be difficult to calibrate for local conditions, or data-driven machine learning methods, which are known to require large datasets. In this work we investigate potato yield prediction using a hybrid meta-modeling approach. A crop growth model is employed to generate synthetic data for (pre)training a convolutional neural net, which is then fine-tuned with observational data. When applied in silico, our meta-modeling approach yields better predictions than a baseline comprising a purely data-driven approach. When tested on real-world data from field trials (n=303) and commercial fields (n=77), the meta-modeling approach yields competitive results with respect to the crop growth model. In the latter set, however, both models perform worse than a simple linear regression with a hand-picked feature set and dedicated preprocessing designed by domain experts. Our findings indicate the potential of meta-modeling for accurate crop yield prediction; however, further advancements and validation using extensive real-world datasets is recommended to solidify its practical effectiveness.
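The hybrid recipe reduces to a pretrain/fine-tune skeleton: fit on plentiful synthetic yields from a process model, then adapt to scarce observations. The linear model and toy simulator below are stand-ins for the paper's crop growth model and convolutional net.

```python
# Pretrain on synthetic process-model data, fine-tune on observations.
import numpy as np

rng = np.random.default_rng(4)
def crop_model(x):                    # stand-in for the process-based simulator
    return 3.0 * x[:, 0] - 1.5 * x[:, 1]

def fit_step(w, X, y, lr):
    return w - lr * X.T @ (X @ w - y) / len(y)

Xs, Xo = rng.normal(size=(5000, 2)), rng.normal(size=(40, 2))
ys = crop_model(Xs)                                 # plentiful synthetic yields
yo = crop_model(Xo) + 0.8 * Xo[:, 0] * Xo[:, 1]     # observations deviate from theory

w = np.zeros(2)
for _ in range(500): w = fit_step(w, Xs, ys, 0.1)   # pretrain on synthetic data
for _ in range(50): w = fit_step(w, Xo, yo, 0.05)   # fine-tune on observations
print(w)
```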

Unlocking the Emotional World of Visual Media: An Overview of the Science, Research, and Impact of Understanding Emotion

  • paper_url: http://arxiv.org/abs/2307.13463
  • repo_url: None
  • paper_authors: James Z. Wang, Sicheng Zhao, Chenyan Wu, Reginald B. Adams, Michelle G. Newman, Tal Shafir, Rachelle Tsachor
  • for: This article surveys the development of artificial emotional intelligence technology in computing and robotics and how it is transforming research in computer vision.
  • methods: The article draws on insights from psychology, engineering, and the arts to provide a comprehensive, multidisciplinary overview of emotion analysis in visual media.
  • results: The article identifies the challenges and limitations of automatically understanding evoked or expressed emotions in visual media and delineates pivotal directions for future research.
    Abstract The emergence of artificial emotional intelligence technology is revolutionizing the fields of computers and robotics, allowing for a new level of communication and understanding of human behavior that was once thought impossible. While recent advancements in deep learning have transformed the field of computer vision, automated understanding of evoked or expressed emotions in visual media remains in its infancy. This foundering stems from the absence of a universally accepted definition of "emotion", coupled with the inherently subjective nature of emotions and their intricate nuances. In this article, we provide a comprehensive, multidisciplinary overview of the field of emotion analysis in visual media, drawing on insights from psychology, engineering, and the arts. We begin by exploring the psychological foundations of emotion and the computational principles that underpin the understanding of emotions from images and videos. We then review the latest research and systems within the field, accentuating the most promising approaches. We also discuss the current technological challenges and limitations of emotion analysis, underscoring the necessity for continued investigation and innovation. We contend that this represents a "Holy Grail" research problem in computing and delineate pivotal directions for future inquiry. Finally, we examine the ethical ramifications of emotion-understanding technologies and contemplate their potential societal impacts. Overall, this article endeavors to equip readers with a deeper understanding of the domain of emotion analysis in visual media and to inspire further research and development in this captivating and rapidly evolving field.

Fundamental causal bounds of quantum random access memories

  • paper_url: http://arxiv.org/abs/2307.13460
  • repo_url: None
  • paper_authors: Yunfei Wang, Yuri Alexeev, Liang Jiang, Frederic T. Chong, Junyu Liu
  • for: This study explores the limits that quantum physics principles impose on quantum random access memory (QRAM), in order to assess the long-term performance of quantum computing applications in data science.
  • methods: The study employs relativistic quantum field theory and Lieb-Robinson bounds in quantum many-body systems to probe the intrinsic causal constraints on rapid quantum memories.
  • results: For a hardware-efficient QRAM design in hybrid quantum acoustic systems, QRAM can accommodate up to $\mathcal{O}(10^7)$ logical qubits in one dimension, $\mathcal{O}(10^{15})$ to $\mathcal{O}(10^{20})$ in various 2D architectures, and $\mathcal{O}(10^{24})$ in three dimensions. The authors contend that this causality bound applies broadly to other quantum hardware systems, highlighting the impact of fundamental physics constraints on long-term quantum computing performance and suggesting quantum memory designs for performance enhancement.
    Abstract Quantum devices should operate in adherence to quantum physics principles. Quantum random access memory (QRAM), a fundamental component of many essential quantum algorithms for tasks such as linear algebra, data search, and machine learning, is often proposed to offer $\mathcal{O}(\log N)$ circuit depth for $\mathcal{O}(N)$ data size, given $N$ qubits. However, this claim appears to breach the principle of relativity when dealing with a large number of qubits in quantum materials interacting locally. In our study we critically explore the intrinsic bounds of rapid quantum memories based on causality, employing the relativistic quantum field theory and Lieb-Robinson bounds in quantum many-body systems. In this paper, we consider a hardware-efficient QRAM design in hybrid quantum acoustic systems. Assuming clock cycle times of approximately $10^{-3}$ seconds and a lattice spacing of about 1 micrometer, we show that QRAM can accommodate up to $\mathcal{O}(10^7)$ logical qubits in 1 dimension, $\mathcal{O}(10^{15})$ to $\mathcal{O}(10^{20})$ in various 2D architectures, and $\mathcal{O}(10^{24})$ in 3 dimensions. We contend that this causality bound broadly applies to other quantum hardware systems. Our findings highlight the impact of fundamental quantum physics constraints on the long-term performance of quantum computing applications in data science and suggest potential quantum memory designs for performance enhancement.
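As a back-of-envelope reading of the quoted numbers (our reconstruction, not the paper's derivation), a causal signal velocity $v$ limits how many lattice sites are reachable within one clock cycle $\tau$:

```latex
N \;\lesssim\; \Big(\frac{v\,\tau}{a}\Big)^{D},
\qquad
\frac{v\,\tau}{a} \;\approx\; \frac{(10^{4}\,\mathrm{m/s})\,(10^{-3}\,\mathrm{s})}{10^{-6}\,\mathrm{m}} \;=\; 10^{7}.
```

With an assumed acoustic-scale velocity $v \sim 10^{4}$ m/s, this gives $N \sim 10^{7}$ in 1D and $N \sim 10^{21}$ in 3D, matching the abstract's scales up to the choice of $v$.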

Monte-Carlo Tree Search for Multi-Agent Pathfinding: Preliminary Results

  • paper_url: http://arxiv.org/abs/2307.13453
  • repo_url: None
  • paper_authors: Yelisey Pitanov, Alexey Skrynnik, Anton Andreychuk, Konstantin Yakovlev, Aleksandr Panov
  • for: This paper studies Multi-agent Pathfinding, in which a set of agents is confined to a graph, each agent is assigned unique start and goal vertices, and the task is to find a set of collision-free paths (one per agent) such that each agent reaches its goal.
  • methods: The authors use Monte-Carlo Tree Search (MCTS) to solve the problem. Although MCTS has demonstrated superior performance in domains such as antagonistic games (e.g., Go and Chess), its application to multi-agent pathfinding has not been well studied. The authors introduce an MCTS variant tailored to this problem: the reward computation uses individual paths to assist agents with goal-reaching behavior, while leaving them freedom to deviate from those paths when needed to avoid collisions, and a dedicated decomposition technique reduces the branching factor of the tree search.
  • results: Empirically, the suggested method outperforms a baseline planning algorithm that invokes heuristic search (e.g., A*) at each re-planning step.
    Abstract In this work we study a well-known and challenging problem of Multi-agent Pathfinding, when a set of agents is confined to a graph, each agent is assigned a unique start and goal vertices and the task is to find a set of collision-free paths (one for each agent) such that each agent reaches its respective goal. We investigate how to utilize Monte-Carlo Tree Search (MCTS) to solve the problem. Although MCTS was shown to demonstrate superior performance in a wide range of problems like playing antagonistic games (e.g. Go, Chess etc.), discovering faster matrix multiplication algorithms etc., its application to the problem at hand was not well studied before. To this end we introduce an original variant of MCTS, tailored to multi-agent pathfinding. The crux of our approach is how the reward, that guides MCTS, is computed. Specifically, we use individual paths to assist the agents with the the goal-reaching behavior, while leaving them freedom to get off the track if it is needed to avoid collisions. We also use a dedicated decomposition technique to reduce the branching factor of the tree search procedure. Empirically we show that the suggested method outperforms the baseline planning algorithm that invokes heuristic search, e.g. A*, at each re-planning step.
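A compact, flat (one-level) Monte-Carlo sketch of the ingredients the abstract describes: UCB action selection, random rollouts, and a reward that credits each agent's progress toward its own goal while penalizing collisions. The toy two-agent corridor and all constants are invented; the paper builds a full search tree with a decomposition technique not shown here.

```python
# Flat Monte-Carlo action selection for a toy two-agent corridor.
import math, random

SIZE, GOALS = 5, (4, 0)                  # agents start at 0 and 4, swap ends

def step(state, moves):                  # moves in {-1, 0, 1} per agent
    nxt = tuple(min(max(p + m, 0), SIZE - 1) for p, m in zip(state, moves))
    if nxt[0] == nxt[1] or (nxt[0], nxt[1]) == (state[1], state[0]):
        return state, -1.0               # vertex/edge collision: stay, penalize
    # reward = each agent's progress toward its own goal
    return nxt, sum(abs(p - g) - abs(q - g) for p, q, g in zip(state, nxt, GOALS))

ACTIONS = [(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)]

def rollout(state, depth=10):
    total = 0.0
    for _ in range(depth):
        state, r = step(state, random.choice(ACTIONS))
        total += r
    return total

def search(state, iters=2000, c=1.4):
    N = {a: 0 for a in ACTIONS}; Q = {a: 0.0 for a in ACTIONS}
    for t in range(1, iters + 1):
        a = max(ACTIONS, key=lambda a: float("inf") if N[a] == 0
                else Q[a] / N[a] + c * math.sqrt(math.log(t) / N[a]))
        nxt, r = step(state, a)
        N[a] += 1; Q[a] += r + rollout(nxt)
    return max(ACTIONS, key=lambda a: N[a])

print(search((0, 4)))   # likely (1, -1): both agents advance toward their goals
```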

A behavioural transformer for effective collaboration between a robot and a non-stationary human

  • paper_url: http://arxiv.org/abs/2307.13447
  • repo_url: None
  • paper_authors: Ruaridh Mon-Williams, Theodoros Stouraitis, Sethu Vijayakumar
  • for: This paper aims to address the challenges of human-robot collaboration in non-stationary environments, where human behavior changes over time.
  • methods: The authors propose a principled meta-learning framework and develop a conditional transformer called Behaviour-Transform (BeTrans) to adapt to new human agents with non-stationary behaviors.
  • results: BeTrans effectively collaborates with simulated human agents and adapts faster to non-stationary simulated human agents than state-of-the-art techniques.
    Abstract A key challenge in human-robot collaboration is the non-stationarity created by humans due to changes in their behaviour. This alters environmental transitions and hinders human-robot collaboration. We propose a principled meta-learning framework to explore how robots could better predict human behaviour, and thereby deal with issues of non-stationarity. On the basis of this framework, we developed Behaviour-Transform (BeTrans). BeTrans is a conditional transformer that enables a robot agent to adapt quickly to new human agents with non-stationary behaviours, due to its notable performance with sequential data. We trained BeTrans on simulated human agents with different systematic biases in collaborative settings. We used an original customisable environment to show that BeTrans effectively collaborates with simulated human agents and adapts faster to non-stationary simulated human agents than SOTA techniques.

On the Learning Dynamics of Attention Networks

  • paper_url: http://arxiv.org/abs/2307.13421
  • repo_url: https://github.com/vashisht-rahul/on-the-learning-dynamics-of-attention-networks
  • paper_authors: Rahul Vashisht, Harish G. Ramaswamy
  • for: This work investigates how the three standard loss functions used to train attention models -- soft attention, hard attention, and latent variable marginal likelihood (LVML) attention -- shape what the models learn.
  • methods: The authors analyze attention models trained with these three loss functions, observe a unique signature of each paradigm, and derive closed-form expressions for the parameter trajectory under gradient flow in a simple setting.
  • results: The loss functions lead to distinct dynamics and final results: with the soft attention loss, the focus model improves quickly at initialization and splutters later on, whereas the hard attention loss behaves in the opposite fashion. Based on these observations, the authors propose a simple hybrid approach that combines the advantages of the different loss functions and demonstrate it on a collection of semi-synthetic and real-world datasets.
    Abstract Attention models are typically learned by optimizing one of three standard loss functions that are variously called -- soft attention, hard attention, and latent variable marginal likelihood (LVML) attention. All three paradigms are motivated by the same goal of finding two models -- a `focus' model that `selects' the right \textit{segment} of the input and a `classification' model that processes the selected segment into the target label. However, they differ significantly in the way the selected segments are aggregated, resulting in distinct dynamics and final results. We observe a unique signature of models learned using these paradigms and explain this as a consequence of the evolution of the classification model under gradient descent when the focus model is fixed. We also analyze these paradigms in a simple setting and derive closed-form expressions for the parameter trajectory under gradient flow. With the soft attention loss, the focus model improves quickly at initialization and splutters later on. On the other hand, hard attention loss behaves in the opposite fashion. Based on our observations, we propose a simple hybrid approach that combines the advantages of the different loss functions and demonstrates it on a collection of semi-synthetic and real-world datasets
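Reconstructed in standard notation (ours, not copied from the paper), with focus-model weights $\alpha_i$ over input segments $x_i$, the three objectives can be written as:

```latex
\mathcal{L}_{\mathrm{soft}} = \ell\Big(f\big(\textstyle\sum_i \alpha_i\, g(x_i)\big),\, y\Big),
\qquad
\mathcal{L}_{\mathrm{hard}} = \mathbb{E}_{i \sim \alpha}\Big[\ell\big(f(g(x_i)),\, y\big)\Big],
\qquad
\mathcal{L}_{\mathrm{LVML}} = -\log \textstyle\sum_i \alpha_i\, p(y \mid x_i).
```

Soft attention aggregates segment features before classification, hard attention classifies a sampled segment, and LVML marginalizes the latent segment choice.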

Synthesis of Procedural Models for Deterministic Transition Systems

  • paper_url: http://arxiv.org/abs/2307.14368
  • repo_url: None
  • paper_authors: Javier Segovia-Aguas, Jonathan Ferrer-Mestres, Sergio Jiménez
  • for: This paper introduces a general approach for synthesizing procedural models of the state-transitions of a given discrete system.
  • methods: The approach performs a combinatorial search in the space of well-structured terminating programs that can be built on a Random-Access Machine (RAM) with a minimalist instruction set and a finite amount of memory, guided by functions that assess the complexity of candidate programs and their fitness to a given input set of example state-transitions.
  • results: The approach accepts different target languages for modeling state-transitions; different model acquisition tasks, such as the synthesis of STRIPS action models or the update rule of a cellular automaton, fit as particular instances.
    Abstract This paper introduces a general approach for synthesizing procedural models of the state-transitions of a given discrete system. The approach is general in that it accepts different target languages for modeling the state-transitions of a discrete system; different model acquisition tasks with different target languages, such as the synthesis of STRIPS action models, or the update rule of a cellular automaton, fit as particular instances of our general approach. We follow an inductive approach to synthesis meaning that a set of examples of state-transitions, represented as (pre-state, action, post-state) tuples, are given as input. The goal is to synthesize a structured program that, when executed on a given pre-state, outputs its associated post-state. Our synthesis method implements a combinatorial search in the space of well-structured terminating programs that can be built using a Random-Access Machine (RAM), with a minimalist instruction set, and a finite amount of memory. The combinatorial search is guided with functions that asses the complexity of the candidate programs, as well as their fitness to the given input set of examples.
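A toy sketch of the inductive setting: enumerate short programs over a minimalist instruction set and return one consistent with all (pre-state, post-state) examples. The two-register state, the three instructions, and the examples are invented; the paper searches structured programs with control flow, guided by complexity and fitness functions.

```python
# Enumerative synthesis over a tiny instruction set (illustrative only).
from itertools import product

INSTRUCTIONS = {                       # state is a pair of registers (r0, r1)
    "inc0": lambda s: (s[0] + 1, s[1]),
    "inc1": lambda s: (s[0], s[1] + 1),
    "swap": lambda s: (s[1], s[0]),
}

def run(program, state):
    for op in program:
        state = INSTRUCTIONS[op](state)
    return state

examples = [((0, 0), (1, 1)), ((2, 5), (6, 3))]  # generated by: swap, then inc both

def synthesize(max_len=3):
    for n in range(1, max_len + 1):
        for prog in product(INSTRUCTIONS, repeat=n):
            if all(run(prog, pre) == post for pre, post in examples):
                return prog

# Finds ('inc0', 'inc1', 'swap'): consistent with both examples, though not
# the rule used to generate them.
print(synthesize())
```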

A short review of the main concerns in A.I. development and application within the public sector supported by NLP and TM

  • paper_url: http://arxiv.org/abs/2308.02042
  • repo_url: None
  • paper_authors: Carlos Ferreira
  • for: This study aims to capture research trends around data privacy, ethics, interpretability, explainability, trustworthiness, and fairness in AI applications within the public sector.
  • methods: Supported by fundamental concepts of NLP and Text Mining, the study queried and analyzed conference papers published in the ACM Digital Library and IEEE Xplore over the last two years to retrieve relevant information.
  • results: The results show that fairness was the most frequent concern, data privacy was the least prominent topic (although embedded in most articles), and trustworthiness was the most prominent.
    Abstract Artificial Intelligence is not a new subject, and business, industry and public sectors have used it in different ways and contexts and considering multiple concerns. This work reviewed research papers published in ACM Digital Library and IEEE Xplore conference proceedings in the last two years supported by fundamental concepts of Natural Language Processing (NLP) and Text Mining (TM). The objective was to capture insights regarding data privacy, ethics, interpretability, explainability, trustworthiness, and fairness in the public sector. The methodology has saved analysis time and could retrieve papers containing relevant information. The results showed that fairness was the most frequent concern. The least prominent topic was data privacy (although embedded in most articles), while the most prominent was trustworthiness. Finally, gathering helpful insights about those concerns regarding A.I. applications in the public sector was also possible.

Towards Bridging the Digital Language Divide

  • paper_url: http://arxiv.org/abs/2307.13405
  • repo_url: None
  • paper_authors: Gábor Bella, Paula Helm, Gertraud Koch, Fausto Giunchiglia
  • for: Helping language technology extend its coverage to under-resourced languages.
  • methods: Revising the research and development methodologies behind language technology in order to reduce linguistic bias.
  • results: A new initiative that improves the diversity-awareness and accuracy of language technology through eye-level collaboration with local language communities.
    Abstract It is a well-known fact that current AI-based language technology -- language models, machine translation systems, multilingual dictionaries and corpora -- focuses on the world's 2-3% most widely spoken languages. Recent research efforts have attempted to expand the coverage of AI technology to `under-resourced languages.' The goal of our paper is to bring attention to a phenomenon that we call linguistic bias: multilingual language processing systems often exhibit a hardwired, yet usually involuntary and hidden representational preference towards certain languages. Linguistic bias is manifested in uneven per-language performance even in the case of similar test conditions. We show that biased technology is often the result of research and development methodologies that do not do justice to the complexity of the languages being represented, and that can even become ethically problematic as they disregard valuable aspects of diversity as well as the needs of the language communities themselves. As our attempt at building diversity-aware language resources, we present a new initiative that aims at reducing linguistic bias through both technological design and methodology, based on an eye-level collaboration with local communities.

Predicting Code Coverage without Execution

  • paper_url: http://arxiv.org/abs/2307.13383
  • repo_url: https://github.com/microsoft/coverage-eval
  • paper_authors: Michele Tufano, Shubham Chandel, Anisha Agarwal, Neel Sundaresan, Colin Clement
  • for: This paper evaluates how well large language models (LLMs) understand code execution and proposes a novel benchmark task: Code Coverage Prediction.
  • methods: The task asks a model to determine which lines of a method are executed by a given test case and inputs; machine learning could thereby amortize the resource-intensive process of computing coverage, requiring only the source code context. The authors curate and release the COVERAGEEVAL dataset by executing tests and code from the HumanEval dataset and collecting code coverage information.
  • results: The paper reports the performance of four state-of-the-art LLMs used for code-related tasks -- OpenAI's GPT-4 and GPT-3.5-Turbo, Google's BARD, and Anthropic's Claude -- on the task, and argues that code coverage is valuable both as a metric and as a pre-training data source for overall LLM performance on software engineering tasks.
    Abstract Code coverage is a widely used metric for quantifying the extent to which program elements, such as statements or branches, are executed during testing. Calculating code coverage is resource-intensive, requiring code building and execution with additional overhead for the instrumentation. Furthermore, computing coverage of any snippet of code requires the whole program context. Using Machine Learning to amortize this expensive process could lower the cost of code coverage by requiring only the source code context, and the task of code coverage prediction can be a novel benchmark for judging the ability of models to understand code. We propose a novel benchmark task called Code Coverage Prediction for Large Language Models (LLMs). We formalize this task to evaluate the capability of LLMs in understanding code execution by determining which lines of a method are executed by a given test case and inputs. We curate and release a dataset we call COVERAGEEVAL by executing tests and code from the HumanEval dataset and collecting code coverage information. We report the performance of four state-of-the-art LLMs used for code-related tasks, including OpenAI's GPT-4 and GPT-3.5-Turbo, Google's BARD, and Anthropic's Claude, on the Code Coverage Prediction task. Finally, we argue that code coverage as a metric and pre-training data source are valuable for overall LLM performance on software engineering tasks.
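
As a rough illustration of how per-line ground-truth coverage can be collected before comparing it against an LLM's predicted line marks, the Python sketch below uses `sys.settrace`; it is our own minimal stand-in, not the paper's pipeline, and `method` is a made-up example function.

```python
import sys

def record_coverage(func, *args):
    """Record which relative line numbers of `func` execute for the given
    inputs, a stand-in for the instrumented execution behind COVERAGEEVAL."""
    executed = set()
    code = func.__code__
    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is code:
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return executed

def method(x):          # hypothetical "method under test"
    if x > 0:
        return x
    return -x

print(record_coverage(method, 5))   # {1, 2}: the if-line and the first return
```

An LLM's predicted per-line coverage marks could then be scored line-by-line against this set.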

Empower Your Model with Longer and Better Context Comprehension

  • paper_url: http://arxiv.org/abs/2307.13365
  • repo_url: https://github.com/yileijin/attention-transition
  • paper_authors: Yifei Gao, Lei Wang, Jun Fang, Longhua Hu, Jun Cheng
  • for: To improve LLMs' comprehension of longer and more complex contexts with relatively small models, so that they can be better applied in real-world scenarios.
  • methods: Proposes a new technique called Attention Transition, which strengthens information transfer inside the model so that it comprehends longer contexts without additional training or any impact on generation fluency.
  • results: Experiments on the challenging XSum dataset with the LLaMa-7b model and context lengths of 800 to 1900 tokens show substantial improvements over the original generations, as evaluated by GPT-4, demonstrating the effectiveness of Attention Transition.
    Abstract Recently, with the emergence of numerous Large Language Models (LLMs), the implementation of AI has entered a new era. Irrespective of these models' own capacity and structure, there is a growing demand for LLMs to possess enhanced comprehension of longer and more complex contexts with relatively smaller sizes. Models often encounter an upper limit when processing sequences of sentences that extend beyond their comprehension capacity and result in off-topic or even chaotic responses. While several recent works attempt to address this issue in various ways, they rarely focus on "why models are unable to compensate or strengthen their capabilities on their own". In this paper, we thoroughly investigate the nature of information transfer within LLMs and propose a novel technique called Attention Transition. This technique empowers models to achieve longer and better context comprehension with minimal additional training or impact on generation fluency. Our experiments are conducted on the challenging XSum dataset using LLaMa-7b model with context token length ranging from 800 to 1900. Results demonstrate that we achieve substantial improvements compared with the original generation results evaluated by GPT4.

Do humans and Convolutional Neural Networks attend to similar areas during scene classification: Effects of task and image type

  • paper_url: http://arxiv.org/abs/2307.13345
  • repo_url: None
  • paper_authors: Romy Müller, Marcel Duerschmidt, Julian Ullrich, Carsten Knoll, Sascha Weber, Steffen Seitz
  • for: This study investigates which factors determine whether deep learning models such as convolutional neural networks (CNNs) attend to the same image areas as humans; previous studies focused on technological factors and said little about the factors that shape human attention.
  • methods: Human attention maps were elicited with tasks of varying intentionality (spontaneous gaze during categorization, intentional gaze-pointing, and manual area selection) on different image types: singular salient objects, indoor scenes consisting of object arrangements, and landscapes without distinct objects defining the category.
  • results: The influence of the human task strongly depended on image type. For objects, manual selection produced the maps most similar to CNN attention, while the specific eye-movement task had little impact; for indoor scenes, spontaneous gaze produced the least similarity; for landscapes, similarity was equally low across all tasks. These results highlight the importance of taking human factors into account when comparing the attention of humans and CNNs.
    Abstract Deep Learning models like Convolutional Neural Networks (CNN) are powerful image classifiers, but what factors determine whether they attend to similar image areas as humans do? While previous studies have focused on technological factors, little is known about the role of factors that affect human attention. In the present study, we investigated how the tasks used to elicit human attention maps interact with image characteristics in modulating the similarity between humans and CNN. We varied the intentionality of human tasks, ranging from spontaneous gaze during categorization over intentional gaze-pointing up to manual area selection. Moreover, we varied the type of image to be categorized, using either singular, salient objects, indoor scenes consisting of object arrangements, or landscapes without distinct objects defining the category. The human attention maps generated in this way were compared to the CNN attention maps revealed by explainable artificial intelligence (Grad-CAM). The influence of human tasks strongly depended on image type: For objects, human manual selection produced maps that were most similar to CNN, while the specific eye movement task has little impact. For indoor scenes, spontaneous gaze produced the least similarity, while for landscapes, similarity was equally low across all human tasks. To better understand these results, we also compared the different human attention maps to each other. Our results highlight the importance of taking human factors into account when comparing the attention of humans and CNN.

Analyzing Chain-of-Thought Prompting in Large Language Models via Gradient-based Feature Attributions

  • paper_url: http://arxiv.org/abs/2307.13339
  • repo_url: None
  • paper_authors: Skyler Wu, Eric Meng Shen, Charumathi Badrinath, Jiaqi Ma, Himabindu Lakkaraju
  • for: To explain why chain-of-thought (CoT) prompting empirically improves the accuracy of large language models (LLMs) on question answering tasks.
  • methods: Uses gradient-based feature attribution methods, which produce saliency scores capturing the influence of input tokens on model output, to probe several open-source LLMs.
  • results: CoT prompting does not increase the saliency of semantically relevant tokens in the prompt compared to standard few-shot prompting, but it does increase the robustness of saliency scores to question perturbations and variations in model output.
    Abstract Chain-of-thought (CoT) prompting has been shown to empirically improve the accuracy of large language models (LLMs) on various question answering tasks. While understanding why CoT prompting is effective is crucial to ensuring that this phenomenon is a consequence of desired model behavior, little work has addressed this; nonetheless, such an understanding is a critical prerequisite for responsible model deployment. We address this question by leveraging gradient-based feature attribution methods which produce saliency scores that capture the influence of input tokens on model output. Specifically, we probe several open-source LLMs to investigate whether CoT prompting affects the relative importances they assign to particular input tokens. Our results indicate that while CoT prompting does not increase the magnitude of saliency scores attributed to semantically relevant tokens in the prompt compared to standard few-shot prompting, it increases the robustness of saliency scores to question perturbations and variations in model output.
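
The attribution machinery the paper relies on can be sketched in a few lines: a gradient-times-input saliency score per token, computed here on a hypothetical toy embedding-plus-linear stand-in rather than a real LLM.

```python
import torch

vocab, dim = 100, 32
emb = torch.nn.Embedding(vocab, dim)
head = torch.nn.Linear(dim, vocab)

tokens = torch.tensor([3, 17, 42, 8])           # made-up prompt token ids
x = emb(tokens).detach().requires_grad_(True)   # differentiate w.r.t. embeddings
logits = head(x.mean(dim=0))                    # stand-in for a full LLM forward pass
logits[7].backward()                            # logit of the produced answer token

saliency = (x.grad * x).norm(dim=-1)            # gradient-x-input score per token
print(saliency)
```

The study's question then becomes whether such per-token scores shift or stabilize when the prompt is CoT-formatted versus standard few-shot.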

The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation

  • paper_url: http://arxiv.org/abs/2307.13332
  • repo_url: None
  • paper_authors: Philip Amortila, Nan Jiang, Csaba Szepesvári
  • for: This paper studies approximation factors in reinforcement learning (RL), i.e., how the misspecification error of function approximation is multiplied in off-policy value function estimation.
  • methods: Analyzes linear off-policy value function estimation across a broad spectrum of settings, including the weighted $L_2$-norm (weighted by the offline state distribution), the $L_\infty$ norm, the presence vs. absence of state aliasing, and full vs. partial coverage of the state space.
  • results: The paper establishes the optimal asymptotic approximation factors (up to constants) for all these settings; the bounds identify two instance-dependent factors for the $L_2(\mu)$ norm and only one for the $L_\infty$ norm, which dictate the hardness of off-policy evaluation under misspecification.
    Abstract Theoretical guarantees in reinforcement learning (RL) are known to suffer multiplicative blow-up factors with respect to the misspecification error of function approximation. Yet, the nature of such approximation factors -- especially their optimal form in a given learning problem -- is poorly understood. In this paper we study this question in linear off-policy value function estimation, where many open questions remain. We study the approximation factor in a broad spectrum of settings, such as with the weighted $L_2$-norm (where the weighting is the offline state distribution), the $L_\infty$ norm, the presence vs. absence of state aliasing, and full vs. partial coverage of the state space. We establish the optimal asymptotic approximation factors (up to constants) for all of these settings. In particular, our bounds identify two instance-dependent factors for the $L_2(\mu)$ norm and only one for the $L_\infty$ norm, which are shown to dictate the hardness of off-policy evaluation under misspecification.
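
For intuition, an approximation factor is the multiplicative blow-up relating an estimator's error to the best error achievable within the function class. The notation below is our own illustrative sketch, not quoted from the paper:

```latex
% Our illustrative notation, not quoted from the paper.
% Weighted L2 norm under the offline state distribution \mu:
\|v\|_{2,\mu} = \Big( \textstyle\sum_{s} \mu(s)\, v(s)^2 \Big)^{1/2}

% An estimator \hat{v} of v^\pi attains approximation factor \alpha when
\|\hat{v} - v^\pi\|_{2,\mu} \;\le\; \alpha \cdot \inf_{v \in \mathcal{F}} \|v - v^\pi\|_{2,\mu}
% (up to additive statistical error); the paper characterizes the optimal
% achievable \alpha under the L2(\mu) and L-infinity norms, with and
% without state aliasing and with full or partial state coverage.
```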

2-Level Reinforcement Learning for Ships on Inland Waterways

  • paper_url: http://arxiv.org/abs/2307.16769
  • repo_url: https://github.com/marwaltz/tud_rl
  • paper_authors: Martin Waltz, Niklas Paulig, Ostap Okhrin
  • for: Controlling autonomous surface vehicles (ASVs) on inland waterways (IWs) with deep reinforcement learning (DRL).
  • methods: The framework comprises two levels, each with its own DRL agent: a high-level local path planning (LPP) unit that plans a path while accounting for nearby vessels, traffic rules, and the geometry of the waterway, and a low-level path following (PF) unit responsible for actuator control that accounts for shallow-water effects on the vessel and the environmental forces of winds, waves, and currents.
  • results: Both agents were thoroughly validated in simulation, using the lower Elbe in northern Germany as an example case and real AIS trajectories to model the behavior of other ships.
    Abstract This paper proposes a realistic modularized framework for controlling autonomous surface vehicles (ASVs) on inland waterways (IWs) based on deep reinforcement learning (DRL). The framework comprises two levels: a high-level local path planning (LPP) unit and a low-level path following (PF) unit, each consisting of a DRL agent. The LPP agent is responsible for planning a path under consideration of nearby vessels, traffic rules, and the geometry of the waterway. We thereby leverage a recently proposed spatial-temporal recurrent neural network architecture, which is transferred to continuous action spaces. The PF agent is responsible for low-level actuator control while accounting for shallow water influences on the marine craft and the environmental forces winds, waves, and currents. Both agents are thoroughly validated in simulation, employing the lower Elbe in northern Germany as an example case and using real AIS trajectories to model the behavior of other ships.

Learning Autonomous Ultrasound via Latent Task Representation and Robotic Skills Adaptation

  • paper_url: http://arxiv.org/abs/2307.13323
  • repo_url: None
  • paper_authors: Xutian Deng, Junnan Jiang, Wen Cheng, Miao Li
  • for: To improve the accuracy and efficiency of autonomous robotic ultrasound scanning.
  • methods: A latent task representation that merges multimodal ultrasound skills (clinically demonstrated ultrasound images, probe orientations, and contact forces) into a low-dimensional probability model through a fully self-supervised framework, combined with an adaptive optimizer that fine-tunes unstable predictions toward stable, high-confidence ones at run time.
  • results: Experiments show that the proposed method can generate complex ultrasound strategies for diverse populations and achieves significantly better quantitative results than the authors' previous method.
    Abstract As medical ultrasound is becoming a prevailing examination approach nowadays, robotic ultrasound systems can facilitate the scanning process and prevent professional sonographers from repetitive and tedious work. Despite the recent progress, it is still a challenge to enable robots to autonomously accomplish the ultrasound examination, which is largely due to the lack of a proper task representation method, and also an adaptation approach to generalize learned skills across different patients. To solve these problems, we propose the latent task representation and the robotic skills adaptation for autonomous ultrasound in this paper. During the offline stage, the multimodal ultrasound skills are merged and encapsulated into a low-dimensional probability model through a fully self-supervised framework, which takes clinically demonstrated ultrasound images, probe orientations, and contact forces into account. During the online stage, the probability model will select and evaluate the optimal prediction. For unstable singularities, the adaptive optimizer fine-tunes them to near and stable predictions in high-confidence regions. Experimental results show that the proposed approach can generate complex ultrasound strategies for diverse populations and achieve significantly better quantitative results than our previous method.

Towards Integrated Traffic Control with Operating Decentralized Autonomous Organization

  • paper_url: http://arxiv.org/abs/2308.03769
  • repo_url: None
  • paper_authors: Shengyue Yao, Jingru Yu, Yi Yu, Jia Xu, Xingyuan Dai, Honghai Li, Fei-Yue Wang, Yilun Lin
  • for: To enable integrated control of intelligent traffic systems (ITS) that considers plentiful heterogeneous intelligent agents while achieving both optimality and scalability.
  • methods: An integrated control method based on the Decentralized Autonomous Organization (DAO) framework that reaches a global consensus on energy consumption efficiency (ECE) while optimizing the local objectives of all involved agents through a consensus and incentive mechanism; an operation algorithm additionally addresses the structural rigidity of DAOs by identifying critical agents to execute the smart contract.
  • results: Numerical experiments show that the controlled agents reach consensus on the global objective faster, with improved local objectives, than under existing decentralized control methods, indicating strong potential for integrated ITS control.
    Abstract With the growing complexity of the intelligent traffic system (ITS), an integrated control of ITS that is capable of considering plentiful heterogeneous intelligent agents is desired. However, existing control methods based on the centralized or the decentralized scheme have not demonstrated the ability to address optimality and scalability simultaneously. To address this issue, we propose an integrated control method based on the framework of Decentralized Autonomous Organization (DAO). The proposed method achieves a global consensus on energy consumption efficiency (ECE), while optimizing the local objectives of all involved intelligent agents, through a consensus and incentive mechanism. Furthermore, an operation algorithm is proposed regarding the issue of structural rigidity in DAO. Specifically, the proposed operation approach identifies critical agents to execute the smart contract in DAO, which ultimately extends the capability of DAO-based control. In addition, a numerical experiment is designed to examine the performance of the proposed method. The experiment results indicate that the controlled agents can achieve a consensus faster on the global objective with improved local objectives by the proposed method, compared to existing decentralized control methods. In general, the proposed method shows great potential for developing an integrated control system in the ITS.

Word Sense Disambiguation as a Game of Neurosymbolic Darts

  • paper_url: http://arxiv.org/abs/2307.16663
  • repo_url: None
  • paper_authors: Tiansi Dong, Rafet Sifa
  • for: To propose a novel neurosymbolic method for word sense disambiguation (WSD), one of the hardest tasks in natural language understanding and knowledge engineering.
  • methods: A neurosymbolic sense embedding in the form of a configuration of nested balls (n-balls) in n-dimensional space: each ball's centre point preserves the word embedding, inclusion relations among balls precisely encode symbolic hypernym relations among senses, and simple logical deduction over sense embeddings becomes possible; a Transformer is trained to map a contextualized word embedding to its sense ball, like shooting darts into a dartboard.
  • results: Using pre-trained n-ball embeddings that cover around 70% of the training data and 75% of the testing data in the benchmark WSD corpus, F1 scores range from 90.1% to 100.0% across all six groups of test sets, suggesting the method can break the 80% glass ceiling of deep-learning approaches.
    Abstract Word Sense Disambiguation (WSD) is one of the hardest tasks in natural language understanding and knowledge engineering. The glass ceiling of an 80% F1 score was recently reached through supervised deep learning, enriched by a variety of knowledge graphs. Here, we propose a novel neurosymbolic methodology that is able to push the F1 score above 90%. The core of our methodology is a neurosymbolic sense embedding, in terms of a configuration of nested balls in n-dimensional space. The centre point of a ball well-preserves the word embedding, which partially fixes the locations of balls. Inclusion relations among balls precisely encode symbolic hypernym relations among senses, and enable simple logic deduction among sense embeddings, which could not be realised before. We trained a Transformer to learn the mapping from a contextualized word embedding to its sense ball embedding, just like playing the game of darts (a game of shooting darts into a dartboard). A series of experiments was conducted using pre-trained n-ball embeddings, which cover around 70% of the training data and 75% of the testing data in the benchmark WSD corpus. The F1 scores in experiments range from 90.1% to 100.0% across all six groups of test data-sets (each group has 4 testing data-sets with different sizes of n-ball embeddings). Our novel neurosymbolic methodology has the potential to break the ceiling of deep-learning approaches for WSD. Limitations and extensions of our current work are listed.
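
The geometric core is easy to state in code: a sense ball is a (centre, radius) pair, and containment of one ball in another encodes hypernymy. Centres and radii below are invented for illustration:

```python
import numpy as np

def inside(ball_a, ball_b, eps=1e-9):
    """Ball a is contained in ball b iff dist(centers) + r_a <= r_b,
    which is how nested n-balls can encode 'sense a IS-A sense b'."""
    (ca, ra), (cb, rb) = ball_a, ball_b
    return np.linalg.norm(ca - cb) + ra <= rb + eps

dog    = (np.array([1.0, 1.0]), 0.5)   # made-up 2D sense balls
animal = (np.array([1.2, 0.9]), 2.0)
plant  = (np.array([-3.0, 0.0]), 2.0)

print(inside(dog, animal))   # True  -> dog IS-A animal
print(inside(dog, plant))    # False
# Transitivity comes for free: a inside b and b inside c imply a inside c,
# which is the "simple logic deduction" the embedding enables.
```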

Imperceptible Physical Attack against Face Recognition Systems via LED Illumination Modulation

  • paper_url: http://arxiv.org/abs/2307.13294
  • repo_url: None
  • paper_authors: Junbin Fang, Canjian Jiang, You Jiang, Puxi Lin, Zhaojie Chen, Yujing Sun, Siu-Ming Yiu, Zoe L. Jiang
  • for: To propose a practical, executable, inconspicuous and low-computational adversarial attack on data-driven face recognition systems, based on LED illumination modulation.
  • methods: The attack generates luminance changes that are imperceptible to human eyes through fast intensity modulation of scene LED illumination, and exploits the rolling shutter effect of CMOS image sensors to implant luminance perturbations into the captured face images.
  • results: Denial-of-service (DoS) attack success rates against the well-known face detection models Dlib, MTCNN and RetinaFace reach 97.67%, 100% and 100%, respectively, and dodging attack success rates against the face verification models Dlib, FaceNet and ArcFace all reach 100%.
    Abstract Although face recognition starts to play an important role in our daily life, we need to pay attention that data-driven face recognition vision systems are vulnerable to adversarial attacks. However, the current two categories of adversarial attacks, namely digital attacks and physical attacks, both have drawbacks, with the former impractical and the latter conspicuous, computationally heavy and hard to execute. To address these issues, we propose a practical, executable, inconspicuous and low-computational adversarial attack based on LED illumination modulation. To fool the systems, the proposed attack generates luminance changes imperceptible to human eyes through fast intensity modulation of scene LED illumination and uses the rolling shutter effect of CMOS image sensors in face recognition systems to implant luminance information perturbation into the captured face images. In summary, we present a denial-of-service (DoS) attack for face detection and a dodging attack for face verification. We also evaluate their effectiveness against well-known face detection models, Dlib, MTCNN and RetinaFace, and face verification models, Dlib, FaceNet, and ArcFace. The extensive experiments show that the success rates of DoS attacks against face detection models reach 97.67%, 100%, and 100%, respectively, and the success rates of dodging attacks against all face verification models reach 100%.
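
To make the rolling-shutter mechanism concrete, the sketch below simulates, in a purely illustrative way and with made-up parameter values, how fast sinusoidal LED modulation shows up as horizontal luminance bands in an image captured row by row:

```python
import numpy as np

def rolling_shutter_bands(image, mod_freq_hz, line_time_s=1e-5, depth=0.05):
    """Simulate the banding a rolling-shutter CMOS sensor records under fast
    sinusoidal LED intensity modulation: each image row is exposed at a
    slightly different time, so the temporal modulation appears as
    horizontal stripes. All parameter values here are illustrative."""
    row_times = np.arange(image.shape[0]) * line_time_s
    gain = 1.0 + depth * np.sin(2 * np.pi * mod_freq_hz * row_times)
    return np.clip(image * gain[:, None], 0.0, 1.0)

face = np.full((480, 640), 0.5)                  # stand-in grayscale image
perturbed = rolling_shutter_bands(face, mod_freq_hz=3000)
```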

Reinforcement Learning-based Adaptation and Scheduling Methods for Multi-source DASH

  • paper_url: http://arxiv.org/abs/2308.11621
  • repo_url: https://github.com/ntnghia1908/Master_Thesis
  • paper_authors: Nghia T. Nguyen, Long Luu, Phuong L. Vo, Thi Thanh Sang Nguyen, Cuong T. Do, Ngoc-thanh Nguyen
  • for: This paper studies optimizing the quality of experience (QoE) for multi-source video streaming, where video chunks may arrive out of order due to differing network-path conditions.
  • methods: Two RL algorithms for multi-source streaming are proposed: RL-based adaptation with greedy scheduling (RLAGS) and RL-based adaptation and scheduling (RLAS), together with a simulation environment for training and evaluation.
  • results: Extensive simulations with real-trace data validate the efficiency of the proposed algorithms.
    Abstract Dynamic adaptive streaming over HTTP (DASH) has been widely used in video streaming recently. In DASH, the client downloads video chunks in order from a server. The rate adaptation function at the video client enhances the user's quality-of-experience (QoE) by choosing a suitable quality level for each video chunk to download based on the network condition. Today networks such as content delivery networks, edge caching networks, content-centric networks,... usually replicate video contents on multiple cache nodes. We study video streaming from multiple sources in this work. In multi-source streaming, video chunks may arrive out of order due to different conditions of the network paths. Hence, to guarantee a high QoE, the video client needs not only rate adaptation but also chunk scheduling. Reinforcement learning (RL) has emerged as the state-of-the-art control method in various fields in recent years. This paper proposes two algorithms for streaming from multiple sources: RL-based adaptation with greedy scheduling (RLAGS) and RL-based adaptation and scheduling (RLAS). We also build a simulation environment for training and evaluating. The efficiency of the proposed algorithms is proved via extensive simulations with real-trace data.
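
The reward an adaptation agent optimizes in such work typically trades off chunk quality, rebuffering, and smoothness. The sketch below shows one common QoE-style formulation with illustrative weights (in the spirit of MPC/Pensieve-style objectives), not necessarily the exact reward used in this paper:

```python
def qoe_reward(bitrate, prev_bitrate, rebuffer_s,
               w_quality=1.0, w_rebuffer=4.3, w_smooth=1.0):
    """Per-chunk QoE-style reward: reward quality, penalize rebuffering
    time and quality switches. Weights are illustrative only."""
    return (w_quality * bitrate
            - w_rebuffer * rebuffer_s
            - w_smooth * abs(bitrate - prev_bitrate))

# Example: a 2.5 Mbps chunk after a 1.2 Mbps chunk, with 0.3 s of rebuffering.
print(qoe_reward(bitrate=2.5, prev_bitrate=1.2, rebuffer_s=0.3))
```

In the multi-source setting, the scheduling component additionally decides which path downloads which chunk so that out-of-order arrivals do not stall playback.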

Curvature-based Transformer for Molecular Property Prediction

  • paper_url: http://arxiv.org/abs/2307.13275
  • repo_url: None
  • paper_authors: Yili Chen, Zhengyu Li, Zheng Wan, Hui Yu, Xian Wei
  • for: To improve molecular property prediction for AI-based drug design.
  • methods: Introduces a discretization of Ricci curvature into Graph Transformer models: the curvature of the molecular graph is added to node features as positional encoding during attention-score calculation, so structural information is injected without changing the original network architecture, and the approach can potentially be extended to other models.
  • results: Experiments on chemical molecular datasets including PCQM4M-LST and MoleculeNet, compared against models such as Uni-Mol and Graphormer, achieve state-of-the-art results and show that the discretized Ricci curvature reflects structural and functional relationships while describing the local geometry of molecular graph data.
    Abstract The prediction of molecular properties is one of the most important and challenging tasks in the field of artificial intelligence-based drug design. Among the current mainstream methods, the most commonly used feature representations for training DNN models are based on SMILES and molecular graphs; although these representations are concise and effective, they also limit the ability to capture spatial information. In this work, we propose the Curvature-based Transformer to improve the ability of Graph Transformer neural network models to extract structural information from molecular graph data by introducing a discretization of Ricci curvature. To embed the curvature in the model, we add the curvature information of the graph as positional encoding to the node features during the attention-score calculation. This method can introduce curvature information from graph data without changing the original network architecture, and it has the potential to be extended to other models. We performed experiments on chemical molecular datasets including PCQM4M-LST and MoleculeNet, and compared with models such as Uni-Mol and Graphormer; the results show that this method can achieve state-of-the-art results. It is shown that the discretized Ricci curvature also reflects the structural and functional relationship while describing the local geometry of the graph molecular data.
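
One common discretization, the combinatorial Forman-Ricci curvature of an edge (used here in its simplest unweighted, triangle-free form, which may differ from the paper's exact choice), can be computed and attached to node features in a few lines:

```python
import networkx as nx
import numpy as np

def forman_curvature_features(G):
    """Per-edge Forman-Ricci curvature F(u, v) = 4 - deg(u) - deg(v)
    (simplest unweighted variant), aggregated per node as the mean over
    incident edges, to be appended to node features as a curvature-based
    positional encoding before attention-score calculation."""
    edge_curv = {(u, v): 4 - G.degree(u) - G.degree(v) for u, v in G.edges()}
    node_curv = {n: float(np.mean([c for e, c in edge_curv.items() if n in e] or [0.0]))
                 for n in G.nodes()}
    return node_curv

G = nx.cycle_graph(6)                      # toy "molecular" ring
print(forman_curvature_features(G))        # all zeros for a 6-ring of degree-2 nodes
```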

Unbiased Weight Maximization

  • paper_url: http://arxiv.org/abs/2307.13270
  • repo_url: None
  • paper_authors: Stephen Chung
  • for: To propose a biologically plausible method for training artificial neural networks (ANNs) in which each unit is treated as a stochastic reinforcement learning (RL) agent, so the network is viewed as a team of agents, aligning more closely with biologically observed forms of synaptic plasticity.
  • methods: Builds on the REINFORCE local learning rule and on Weight Maximization, which replaces each hidden unit's reward signal with the norm of its outgoing weights, so every hidden unit maximizes that norm instead of the global reward signal.
  • results: The paper analyzes the theoretical properties of Weight Maximization and proposes a variant, Unbiased Weight Maximization, an unbiased learning rule that increases learning speed and improves asymptotic performance; to the authors' knowledge, it is the first learning rule for a network of Bernoulli-logistic units that is unbiased and scales well with the number of units in terms of learning speed.
    Abstract A biologically plausible method for training an Artificial Neural Network (ANN) involves treating each unit as a stochastic Reinforcement Learning (RL) agent, thereby considering the network as a team of agents. Consequently, all units can learn via REINFORCE, a local learning rule modulated by a global reward signal, which aligns more closely with biologically observed forms of synaptic plasticity. Nevertheless, this learning method is often slow and scales poorly with network size due to inefficient structural credit assignment, since a single reward signal is broadcast to all units without considering individual contributions. Weight Maximization, a proposed solution, replaces a unit's reward signal with the norm of its outgoing weight, thereby allowing each hidden unit to maximize the norm of the outgoing weight instead of the global reward signal. In this research report, we analyze the theoretical properties of Weight Maximization and propose a variant, Unbiased Weight Maximization. This new approach provides an unbiased learning rule that increases learning speed and improves asymptotic performance. Notably, to our knowledge, this is the first learning rule for a network of Bernoulli-logistic units that is unbiased and scales well with the number of network's units in terms of learning speed.
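
The baseline these papers build on, REINFORCE with a single global reward broadcast to a team of Bernoulli-logistic units, fits in a short sketch. The toy task and hyperparameters below are made up, and Weight Maximization would replace the global reward for hidden units with a signal derived from the norm of their outgoing weights:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
W1 = rng.normal(0.0, 0.5, (n_hid, n_in))   # input -> hidden weights
w2 = rng.normal(0.0, 0.5, n_hid)           # hidden -> output weights
lr, baseline = 0.1, 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(2000):
    x = rng.integers(0, 2, n_in).astype(float)
    target = float(x[0] != x[1])                  # made-up task: XOR of two bits
    p_h = sigmoid(W1 @ x)
    h = (rng.random(n_hid) < p_h).astype(float)   # stochastic Bernoulli-logistic units
    y = sigmoid(w2 @ h)
    r = -(y - target) ** 2                        # one global scalar reward for all units
    # REINFORCE: each hidden unit ascends (r - baseline) * d log Pr(h|x) / dW
    W1 += lr * (r - baseline) * np.outer(h - p_h, x)
    w2 += lr * (target - y) * h                   # output unit: simple error-driven rule
    baseline += 0.05 * (r - baseline)             # running baseline reduces variance
```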

LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

  • paper_url: http://arxiv.org/abs/2307.13269
  • repo_url: https://github.com/sail-sg/lorahub
  • paper_authors: Chengsong Huang, Qian Liu, Bill Yuchen Lin, Tianyu Pang, Chao Du, Min Lin
  • for: To investigate the composability of LoRA (low-rank adaptation) modules for cross-task generalization, i.e., adaptability to new tasks.
  • methods: Proposes LoraHub, a strategic framework for purposively assembling LoRA modules trained on diverse given tasks, so as to achieve adaptable performance on unseen tasks with just a few examples and without human expertise; the composition requires neither additional model parameters nor gradients.
  • results: On the Big-Bench Hard (BBH) benchmark, LoraHub can effectively mimic the performance of in-context learning in few-shot scenarios, without requiring in-context examples alongside each inference input.
    Abstract Low-rank adaptations (LoRA) are often employed to fine-tune large language models (LLMs) for new tasks. This paper investigates LoRA composability for cross-task generalization and introduces LoraHub, a strategic framework devised for the purposive assembly of LoRA modules trained on diverse given tasks, with the objective of achieving adaptable performance on unseen tasks. With just a few examples from a novel task, LoraHub enables the fluid combination of multiple LoRA modules, eradicating the need for human expertise. Notably, the composition requires neither additional model parameters nor gradients. Our empirical results, derived from the Big-Bench Hard (BBH) benchmark, suggest that LoraHub can effectively mimic the performance of in-context learning in few-shot scenarios, excluding the necessity of in-context examples alongside each inference input. A significant contribution of our research is the fostering of a community for LoRA, where users can share their trained LoRA modules, thereby facilitating their application to new tasks. We anticipate this resource will widen access to and spur advancements in general intelligence as well as LLMs in production. Code will be available at https://github.com/sail-sg/lorahub.
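
The composition of LoRA modules can be sketched as a weighted sum of their low-rank deltas, with mixing weights found by gradient-free search. Everything below (shapes, the random-search loop standing in for a CMA-ES-style optimizer, the placeholder loss) is illustrative rather than LoraHub's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, k = 64, 8, 3                    # model dim, LoRA rank, number of modules
W0 = rng.normal(size=(d, d))          # frozen base weight matrix
modules = [(rng.normal(size=(d, r)), rng.normal(size=(r, d))) for _ in range(k)]

def compose(weights):
    """Merge k LoRA modules into one weight: W0 + sum_i w_i * (B_i @ A_i)."""
    delta = sum(w * (B @ A) for w, (B, A) in zip(weights, modules))
    return W0 + delta

def few_shot_loss(W):
    """Placeholder for evaluating the merged model on a few novel-task examples."""
    return float(np.linalg.norm(W))

best_w, best_loss = None, np.inf
for _ in range(200):                  # gradient-free random search over mixings
    w = rng.normal(size=k)
    loss = few_shot_loss(compose(w))
    if loss < best_loss:
        best_w, best_loss = w, loss
```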

Federated Split Learning with Only Positive Labels for resource-constrained IoT environment

  • paper_url: http://arxiv.org/abs/2307.13266
  • repo_url: None
  • paper_authors: Praveen Joshi, Chandra Thapa, Mohammed Hasanuzzaman, Ted Scully, Haithem Afli
  • for: To improve data privacy for IoT devices and the efficiency of model training in distributed collaborative machine learning.
  • methods: Splitfed learning with positive labels (SFPL): a random shuffling function is applied to the smashed data received from clients before it is supplied to the server for model training, and local batch normalization is incorporated into the client-side model portion during inference.
  • results: SFPL outperforms SFL by factors of 51.54 and 32.57 for ResNet-56 and ResNet-32 on CIFAR-100, and by factors of 9.23 and 8.52 for ResNet-32 and ResNet-8 on CIFAR-10.
    Abstract Distributed collaborative machine learning (DCML) is a promising method in the Internet of Things (IoT) domain for training deep learning models, as data is distributed across multiple devices. A key advantage of this approach is that it improves data privacy by removing the necessity for the centralized aggregation of raw data but also empowers IoT devices with low computational power. Among various techniques in a DCML framework, federated split learning, known as splitfed learning (SFL), is the most suitable for efficient training and testing when devices have limited computational capabilities. Nevertheless, when resource-constrained IoT devices have only positive labeled data, multiclass classification deep learning models in SFL fail to converge or provide suboptimal results. To overcome these challenges, we propose splitfed learning with positive labels (SFPL). SFPL applies a random shuffling function to the smashed data received from clients before supplying it to the server for model training. Additionally, SFPL incorporates the local batch normalization for the client-side model portion during the inference phase. Our results demonstrate that SFPL outperforms SFL: (i) by factors of 51.54 and 32.57 for ResNet-56 and ResNet-32, respectively, with the CIFAR-100 dataset, and (ii) by factors of 9.23 and 8.52 for ResNet-32 and ResNet-8, respectively, with CIFAR-10 dataset. Overall, this investigation underscores the efficacy of the proposed SFPL framework in DCML.
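
A minimal sketch of the client-side shuffling step, assuming a hypothetical cut layer producing 128-dimensional smashed activations; the real SFPL pipeline wraps this inside split-model training between client and server:

```python
import torch

def shuffle_smashed(smashed: torch.Tensor, labels: torch.Tensor):
    """Randomly permute the batch dimension of smashed activations (and the
    matching labels) before they leave the client, standing in for SFPL's
    shuffling step."""
    perm = torch.randperm(smashed.size(0))
    return smashed[perm], labels[perm]

smashed = torch.randn(32, 128)                 # activations at the cut layer
labels = torch.ones(32, dtype=torch.long)      # positive-only labels in this setting
smashed, labels = shuffle_smashed(smashed, labels)
# The shuffled pair is then sent to the server-side model portion for training.
```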

Structural Credit Assignment with Coordinated Exploration

  • paper_url: http://arxiv.org/abs/2307.13256
  • repo_url: None
  • paper_authors: Stephen Chung
  • for: Training artificial neural networks (ANNs) with a biologically plausible method.
  • methods: Each unit is treated as a stochastic reinforcement learning (RL) agent trained with the local REINFORCE rule modulated by a global reward signal, which aligns more closely with biologically observed forms of synaptic plasticity; Boltzmann machines or a recurrent network are proposed for coordinated exploration, and the negative phase typically needed to train Boltzmann machines can be removed, yielding learning rules similar to the reward-modulated Hebbian rule.
  • results: Coordinated exploration significantly exceeds independent exploration in training speed for multiple stochastic and discrete units based on REINFORCE, even surpassing straight-through estimator (STE) backpropagation.
    Abstract A biologically plausible method for training an Artificial Neural Network (ANN) involves treating each unit as a stochastic Reinforcement Learning (RL) agent, thereby considering the network as a team of agents. Consequently, all units can learn via REINFORCE, a local learning rule modulated by a global reward signal, which aligns more closely with biologically observed forms of synaptic plasticity. However, this learning method tends to be slow and does not scale well with the size of the network. This inefficiency arises from two factors impeding effective structural credit assignment: (i) all units independently explore the network, and (ii) a single reward is used to evaluate the actions of all units. Accordingly, methods aimed at improving structural credit assignment can generally be classified into two categories. The first category includes algorithms that enable coordinated exploration among units, such as MAP propagation. The second category encompasses algorithms that compute a more specific reward signal for each unit within the network, like Weight Maximization and its variants. In this research report, our focus is on the first category. We propose the use of Boltzmann machines or a recurrent network for coordinated exploration. We show that the negative phase, which is typically necessary to train Boltzmann machines, can be removed. The resulting learning rules are similar to the reward-modulated Hebbian learning rule. Experimental results demonstrate that coordinated exploration significantly exceeds independent exploration in training speed for multiple stochastic and discrete units based on REINFORCE, even surpassing straight-through estimator (STE) backpropagation.

GaPro: Box-Supervised 3D Point Cloud Instance Segmentation Using Gaussian Processes as Pseudo Labelers

  • paper_url: http://arxiv.org/abs/2307.13251
  • repo_url: https://github.com/vinairesearch/gapro
  • paper_authors: Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen
  • for: Instance segmentation on 3D point clouds (3DIS) under weak supervision, since annotating dense ground-truth instance masks is tedious and expensive.
  • methods: GaPro, a two-step approach: pseudo labels are generated from axis-aligned 3D bounding-box annotations using a Gaussian Process that resolves ambiguities where boxes overlap and attaches uncertainty values to the pseudo instance masks, and a 3DIS network is then trained on the resulting labels; a self-training strategy further improves performance.
  • results: Experiments show that GaPro outperforms previous weakly supervised 3D instance segmentation methods and is competitive with state-of-the-art fully supervised ones; moreover, various fully supervised methods can be adapted to the weak-supervision task by training on the generated pseudo labels.
    Abstract Instance segmentation on 3D point clouds (3DIS) is a longstanding challenge in computer vision, where state-of-the-art methods are mainly based on full supervision. As annotating ground truth dense instance masks is tedious and expensive, solving 3DIS with weak supervision has become more practical. In this paper, we propose GaPro, a new instance segmentation for 3D point clouds using axis-aligned 3D bounding box supervision. Our two-step approach involves generating pseudo labels from box annotations and training a 3DIS network with the resulting labels. Additionally, we employ the self-training strategy to improve the performance of our method further. We devise an effective Gaussian Process to generate pseudo instance masks from the bounding boxes and resolve ambiguities when they overlap, resulting in pseudo instance masks with their uncertainty values. Our experiments show that GaPro outperforms previous weakly supervised 3D instance segmentation methods and has competitive performance compared to state-of-the-art fully supervised ones. Furthermore, we demonstrate the robustness of our approach, where we can adapt various state-of-the-art fully supervised methods to the weak supervision task by using our pseudo labels for training. The source code and trained models are available at https://github.com/VinAIResearch/GaPro.
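
The starting point for such pseudo-labeling can be sketched as plain box membership, with overlapping regions flagged for the Gaussian Process to resolve; the box format and sentinel values below are our own assumptions:

```python
import numpy as np

def box_pseudo_labels(points, boxes):
    """Assign pseudo instance labels from axis-aligned boxes.
    points: (N, 3); boxes: (K, 6) as (xmin, ymin, zmin, xmax, ymax, zmax).
    Points inside exactly one box get that box's id; points inside several
    boxes are marked ambiguous (-2), the cases GaPro's Gaussian Process
    resolves; points in no box are background (-1)."""
    inside = np.stack([
        np.all((points >= b[:3]) & (points <= b[3:]), axis=1) for b in boxes
    ], axis=1)                               # (N, K) membership matrix
    counts = inside.sum(axis=1)
    labels = np.full(len(points), -1, dtype=int)
    labels[counts == 1] = inside[counts == 1].argmax(axis=1)
    labels[counts > 1] = -2
    return labels

pts = np.random.rand(1000, 3)
boxes = np.array([[0.0, 0.0, 0.0, 0.5, 0.5, 0.5],
                  [0.4, 0.4, 0.4, 0.9, 0.9, 0.9]])
print(np.unique(box_pseudo_labels(pts, boxes)))
```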

RoSAS: Deep Semi-Supervised Anomaly Detection with Contamination-Resilient Continuous Supervision

  • paper_url: http://arxiv.org/abs/2307.13239
  • repo_url: https://github.com/xuhongzuo/rosas
  • paper_authors: Hongzuo Xu, Yijie Wang, Guansong Pang, Songlei Jian, Ning Liu, Yongjun Wang
  • for: To address two limitations of semi-supervised anomaly detection methods: 1) unlabeled anomalies (anomaly contamination) may mislead the learning process when all unlabeled data are employed as inliers for training, and 2) only discrete supervision (e.g., binary or ordinal labels) is exploited, leading to suboptimal learning of anomaly scores that essentially follow a continuous distribution.
  • methods: A novel semi-supervised method with contamination-resilient continuous supervisory signals: a mass interpolation method diffuses the abnormality of labeled anomalies to create new samples labeled with continuous abnormal degrees, the contaminated area is covered by samples generated from combinations of correctly labeled data, and a feature-learning-based objective serves as an optimization constraint that regularizes the network for further robustness against contamination.
  • results: On 11 real-world datasets, the method outperforms state-of-the-art competitors by 20%-30% in AUC-PR and obtains more robust, superior performance across different anomaly contamination levels and numbers of labeled anomalies.
    Abstract Semi-supervised anomaly detection methods leverage a few anomaly examples to yield drastically improved performance compared to unsupervised models. However, they still suffer from two limitations: 1) unlabeled anomalies (i.e., anomaly contamination) may mislead the learning process when all the unlabeled data are employed as inliers for model training; 2) only discrete supervision information (such as binary or ordinal data labels) is exploited, which leads to suboptimal learning of anomaly scores that essentially take on a continuous distribution. Therefore, this paper proposes a novel semi-supervised anomaly detection method, which devises \textit{contamination-resilient continuous supervisory signals}. Specifically, we propose a mass interpolation method to diffuse the abnormality of labeled anomalies, thereby creating new data samples labeled with continuous abnormal degrees. Meanwhile, the contaminated area can be covered by new data samples generated via combinations of data with correct labels. A feature learning-based objective is added to serve as an optimization constraint to regularize the network and further enhance the robustness w.r.t. anomaly contamination. Extensive experiments on 11 real-world datasets show that our approach significantly outperforms state-of-the-art competitors by 20%-30% in AUC-PR and obtains more robust and superior performance in settings with different anomaly contamination levels and varying numbers of labeled anomalies. The source code is available at https://github.com/xuhongzuo/rosas/.
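
A simplified reading of the mass interpolation step, under the assumption that an interpolant between a labeled anomaly and an unlabeled sample inherits a continuous abnormal degree equal to its mixing coefficient (the paper's exact scheme may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

def mass_interpolate(x_anom, x_unlabeled, n_new):
    """Create new samples with continuous abnormal degrees by interpolating
    between labeled anomalies and unlabeled data (treated as inliers)."""
    i = rng.integers(0, len(x_anom), n_new)
    j = rng.integers(0, len(x_unlabeled), n_new)
    lam = rng.uniform(0.0, 1.0, n_new)[:, None]
    x_new = lam * x_anom[i] + (1.0 - lam) * x_unlabeled[j]
    y_new = lam.ravel()          # continuous abnormal degree in [0, 1]
    return x_new, y_new

x_anom = rng.normal(5.0, 1.0, size=(10, 8))   # a few labeled anomalies
x_unl = rng.normal(0.0, 1.0, size=(500, 8))   # unlabeled, mostly normal data
x_aug, y_aug = mass_interpolate(x_anom, x_unl, 200)
```

Training a regressor on such continuous targets is what lets the anomaly scores follow a continuous distribution instead of collapsing to binary labels.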

Multilevel Large Language Models for Everyone

  • paper_url: http://arxiv.org/abs/2307.13221
  • repo_url: None
  • paper_authors: Yuanhao Gong
  • for: To link large language models into a larger map so that higher-level functionality can be achieved based on the user's personal input and information from the internet.
  • methods: Inspired by the human brain, where specific cortical regions handle specific low-level functions and jointly achieve more complex high-level functions, the paper connects generic and field-specific large language models into multilevel models comprising global-level, field-level and user-level models; the user-level models run on local machines for efficient response and to protect the user's privacy.
  • results: The proposed multilevel models reduce redundancy and perform better than single-level models, and the idea can be applied in natural language processing, computer vision tasks, professional assistants, business and healthcare.
    Abstract Large language models have made significant progress in the past few years. However, they are either generic or field specific, splitting the community into different groups. In this paper, we unify these large language models into a larger map, where the generic and specific models are linked together and can improve each other, based on the user's personal input and information from the internet. The idea of linking several large language models together is inspired by the functionality of the human brain. Specific regions on the brain cortex are specific for certain low-level functionality, and these regions can jointly work together to achieve more complex high-level functionality. Such behavior of the human brain cortex sheds light on how to design multilevel large language models that contain global-level, field-level and user-level models. The user-level models run on local machines to achieve efficient response and protect the user's privacy. Such multilevel models reduce some redundancy and perform better than single-level models. The proposed multilevel idea can be applied in various applications, such as natural language processing, computer vision tasks, professional assistants, business and healthcare.

One for Multiple: Physics-informed Synthetic Data Boosts Generalizable Deep Learning for Fast MRI Reconstruction

  • paper_url: http://arxiv.org/abs/2307.13220
  • repo_url: https://github.com/wangziblake/pisf
  • paper_authors: Zi Wang, Xiaotong Yu, Chengyan Wang, Weibo Chen, Jiazheng Wang, Ying-Hua Chu, Hongwei Sun, Rushuai Li, Peiyong Li, Fan Yang, Haiwei Han, Taishan Kang, Jianzhong Lin, Chen Yang, Shufu Chang, Zhang Shi, Sha Hua, Yan Li, Juan Hu, Liuhong Zhu, Jianjun Zhou, Meijing Lin, Jiefeng Guo, Congbo Cai, Zhong Chen, Di Guo, Xiaobo Qu
  • for: To reduce the prolonged scan time of magnetic resonance imaging (MRI) by using deep learning (DL) for image reconstruction.
  • methods: A Physics-Informed Synthetic data learning framework (PISF) that enables generalizable DL for multi-scenario MRI reconstruction using a single trained model; for a 2D image, reconstruction is separated into many 1D basic problems and starts with 1D data synthesis to facilitate generalization.
  • results: Training DL models on synthetic data, integrated with enhanced learning techniques, achieves in vivo MRI reconstruction comparable to or better than models trained on matched realistic datasets, reducing the demand for real-world MRI data by up to 96%; PISF also shows impressive generalizability in multi-vendor multi-center imaging, and its adaptability to patients was verified through evaluations by 10 experienced doctors.
    Abstract Magnetic resonance imaging (MRI) is a principal radiological modality that provides radiation-free, abundant, and diverse information about the whole human body for medical diagnosis, but suffers from prolonged scan time. The scan time can be significantly reduced through k-space undersampling but the introduced artifacts need to be removed in image reconstruction. Although deep learning (DL) has emerged as a powerful tool for image reconstruction in fast MRI, its potential in multiple imaging scenarios remains largely untapped. This is because not only collecting large-scale and diverse realistic training data is generally costly and privacy-restricted, but also existing DL methods are hard to handle the practically inevitable mismatch between training and target data. Here, we present a Physics-Informed Synthetic data learning framework for Fast MRI, called PISF, which is the first to enable generalizable DL for multi-scenario MRI reconstruction using solely one trained model. For a 2D image, the reconstruction is separated into many 1D basic problems and starts with the 1D data synthesis, to facilitate generalization. We demonstrate that training DL models on synthetic data, integrated with enhanced learning techniques, can achieve comparable or even better in vivo MRI reconstruction compared to models trained on a matched realistic dataset, reducing the demand for real-world MRI data by up to 96%. Moreover, our PISF shows impressive generalizability in multi-vendor multi-center imaging. Its excellent adaptability to patients has been verified through 10 experienced doctors' evaluations. PISF provides a feasible and cost-effective way to markedly boost the widespread usage of DL in various fast MRI applications, while freeing from the intractable ethical and practical considerations of in vivo human data acquisitions.

Adversarial Deep Hedging: Learning to Hedge without Price Process Modeling

  • paper_url: http://arxiv.org/abs/2307.13217
  • repo_url: None
  • paper_authors: Masanori Hirano, Kentaro Minami, Kentaro Imajo
  • for: To extend the deep hedging framework for derivative hedging in incomplete markets, where machine learning can address market frictions and other realistic market conditions that are challenging for traditional mathematical finance.
  • methods: A new framework, Adversarial Deep Hedging, inspired by adversarial learning: a hedger and a generator, which respectively model the hedging strategy and the underlying asset process, are trained in an adversarial manner, so that a robust hedger can be learned without explicitly modeling the underlying asset price process.
  • results: Numerical experiments show that the proposed method achieves performance competitive with models that assume explicit underlying asset processes, across various real market data.
    Abstract Deep hedging is a deep-learning-based framework for derivative hedging in incomplete markets. The advantage of deep hedging lies in its ability to handle various realistic market conditions, such as market frictions, which are challenging to address within the traditional mathematical finance framework. Since deep hedging relies on market simulation, the underlying asset price process model is crucial. However, existing literature on deep hedging often relies on traditional mathematical finance models, e.g., Brownian motion and stochastic volatility models, and discovering effective underlying asset models for deep hedging learning has been a challenge. In this study, we propose a new framework called adversarial deep hedging, inspired by adversarial learning. In this framework, a hedger and a generator, which respectively model the hedging strategy and the underlying asset process, are trained in an adversarial manner. The proposed method enables learning a robust hedger without explicitly modeling the underlying asset process. Through numerical experiments, we demonstrate that our proposed method achieves competitive performance to models that assume explicit underlying asset processes across various real market data.
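
The min-max structure can be sketched as two networks trained against each other: a generator producing price paths and a hedger minimizing squared hedging error on them. Everything here (architectures, the call payoff, the return model) is a toy stand-in, not the paper's setup:

```python
import torch
import torch.nn as nn

T, batch = 30, 64
hedger = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
generator = nn.GRU(input_size=1, hidden_size=16, batch_first=True)
gen_head = nn.Linear(16, 1)
opt_h = torch.optim.Adam(hedger.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(list(generator.parameters()) + list(gen_head.parameters()), lr=1e-3)

def hedging_loss():
    z = torch.randn(batch, T, 1)                          # noise driving the generator
    h, _ = generator(z)
    returns = 0.05 * torch.tanh(gen_head(h)).squeeze(-1)  # generated log-returns
    prices = 100 * torch.exp(returns.cumsum(dim=1))
    pnl = torch.zeros(batch)
    for t in range(T - 1):
        state = torch.stack([prices[:, t] / 100,
                             torch.full((batch,), (T - t) / T)], dim=1)
        delta = hedger(state).squeeze(-1)                 # hedge ratio at time t
        pnl = pnl + delta * (prices[:, t + 1] - prices[:, t])
    payoff = torch.clamp(prices[:, -1] - 100, min=0)      # short a call struck at 100
    return ((payoff - pnl) ** 2).mean()                   # squared hedging error

for step in range(200):
    opt_h.zero_grad(); hedging_loss().backward(); opt_h.step()      # hedger minimizes
    opt_g.zero_grad(); (-hedging_loss()).backward(); opt_g.step()   # generator maximizes
```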

FedMEKT: Distillation-based Embedding Knowledge Transfer for Multimodal Federated Learning

  • paper_url: http://arxiv.org/abs/2307.13214
  • repo_url: None
  • paper_authors: Huy Q. Le, Minh N. H. Nguyen, Chu Myaet Thwal, Yu Qiao, Chaoning Zhang, Choong Seon Hong
  • for: To propose a multimodal federated learning (FL) framework in which multiple clients collaboratively train a generalized global model without sharing their private data, since most existing FL systems target single-modal data and rely on labeled data at the client side.
  • methods: A semi-supervised learning approach that leverages representations from different modalities, plus a distillation-based multimodal embedding knowledge transfer mechanism, FedMEKT, that lets the server and clients exchange the joint knowledge of their learning models extracted from a small multimodal proxy dataset; the framework comprises local multimodal autoencoder learning, generalized multimodal autoencoder construction, and generalized classifier learning.
  • results: Extensive experiments on three multimodal human activity recognition datasets show that FedMEKT achieves superior global encoder performance under linear evaluation, protects user privacy for personal data and model parameters, and demands less communication cost than other baselines.
    Abstract Federated learning (FL) enables a decentralized machine learning paradigm for multiple clients to collaboratively train a generalized global model without sharing their private data. Most existing works simply propose typical FL systems for single-modal data, thus limiting its potential on exploiting valuable multimodal data for future personalized applications. Furthermore, the majority of FL approaches still rely on the labeled data at the client side, which is limited in real-world applications due to the inability of self-annotation from users. In light of these limitations, we propose a novel multimodal FL framework that employs a semi-supervised learning approach to leverage the representations from different modalities. Bringing this concept into a system, we develop a distillation-based multimodal embedding knowledge transfer mechanism, namely FedMEKT, which allows the server and clients to exchange the joint knowledge of their learning models extracted from a small multimodal proxy dataset. Our FedMEKT iteratively updates the generalized global encoders with the joint embedding knowledge from the participating clients. Thereby, to address the modality discrepancy and labeled data constraint in existing FL systems, our proposed FedMEKT comprises local multimodal autoencoder learning, generalized multimodal autoencoder construction, and generalized classifier learning. Through extensive experiments on three multimodal human activity recognition datasets, we demonstrate that FedMEKT achieves superior global encoder performance on linear evaluation and guarantees user privacy for personal data and model parameters while demanding less communication cost than other baselines.

Gait Cycle-Inspired Learning Strategy for Continuous Prediction of Knee Joint Trajectory from sEMG

  • paper_url: http://arxiv.org/abs/2307.13209
  • repo_url: None
  • paper_authors: Xueming Fu, Hao Zheng, Luyan Liu, Wenjuan Zhong, Haowen Liu, Wenxuan Xiong, Yuyang Zhang, Yifeng Chen, Dong Wei, Mingjie Dong, Yefeng Zheng, Mingming Zhang
  • for: Predicting lower-limb motion intent, which is vital for controlling exoskeleton robots and prosthetic limbs.
  • methods: A model integrating two gait-cycle-inspired learning strategies: 1) knee joint angles are decoupled into motion patterns, which show low variability across individuals, and amplitudes, which show high variability, learned by separate network entities to capture both common and personalized gait features; 2) muscle principal activation masks extracted from gait cycles in a prolonged walk filter out walking-irrelevant components from raw sEMG and provide auxiliary guidance toward gait-related features.
  • results: The model predicts knee angles 50 ms ahead of time with an average root mean square error (RMSE) of 3.03 (0.49) degrees, to the authors' knowledge the best performance reported in the related literature, reducing RMSE by at least 9.5%.
    Abstract Predicting lower limb motion intent is vital for controlling exoskeleton robots and prosthetic limbs. Surface electromyography (sEMG) has attracted increasing attention in recent years as it enables ahead-of-time prediction of motion intentions before actual movement. However, the estimation performance of human joint trajectory remains a challenging problem due to the inter- and intra-subject variations. The former is related to physiological differences (such as height and weight) and preferred walking patterns of individuals, while the latter is mainly caused by irregular and gait-irrelevant muscle activity. This paper proposes a model integrating two gait cycle-inspired learning strategies to mitigate the challenge of predicting human knee joint trajectory. The first strategy is to decouple knee joint angles into motion patterns and amplitudes: the former exhibit low variability while the latter show high variability among individuals. By learning through separate network entities, the model manages to capture both the common and personalized gait features. In the second, muscle principal activation masks are extracted from gait cycles in a prolonged walk. These masks are used to filter out components unrelated to walking from raw sEMG and provide auxiliary guidance to capture more gait-related features. Experimental results indicate that our model can predict knee angles with an average root mean square error (RMSE) of 3.03 (0.49) degrees, 50 ms ahead of time. To our knowledge, this is the best performance reported in the relevant literature, with RMSE reduced by at least 9.5%.
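
The first strategy reduces to a reversible normalization per gait cycle; a minimal sketch, with a made-up knee-angle trajectory:

```python
import numpy as np

def decouple_gait_cycle(angles):
    """Split one gait cycle of knee angles into an amplitude scalar (high
    inter-subject variability) and a motion pattern normalized to [0, 1]
    (low inter-subject variability), plus the offset for reconstruction."""
    amplitude = angles.max() - angles.min()
    offset = angles.min()
    pattern = (angles - offset) / amplitude
    return pattern, amplitude, offset

t = np.linspace(0, 1, 101)                       # one normalized gait cycle
angles = 30 + 25 * np.sin(2 * np.pi * t) ** 2    # toy knee-angle trajectory (deg)
pattern, amplitude, offset = decouple_gait_cycle(angles)
reconstructed = pattern * amplitude + offset     # exact inverse of the decoupling
assert np.allclose(reconstructed, angles)
```

Separate networks can then predict the pattern and the amplitude, with the final trajectory recovered by the inverse transform.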

Federated Distributionally Robust Optimization with Non-Convex Objectives: Algorithm and Analysis

  • paper_url: http://arxiv.org/abs/2307.14364
  • repo_url: None
  • paper_authors: Yang Jiao, Kai Yang, Dongjin Song
  • for: To solve federated distributionally robust optimization (FDRO) under asynchronous updating in distributed environments, while effectively leveraging the prior distribution and properly adjusting the degree of robustness across scenarios.
  • methods: An asynchronous distributed algorithm, ASPIRE (Asynchronous Single-looP alternatIve gRadient projEction), with the itErative Active SEt method (EASE); in addition, a new uncertainty set, the constrained D-norm uncertainty set, is developed to effectively leverage the prior distribution and flexibly control the degree of robustness.
  • results: Theoretical analysis shows that the proposed algorithm is guaranteed to converge, with the iteration complexity also analyzed; extensive empirical studies on real-world datasets demonstrate fast convergence, robustness against data heterogeneity and malicious attacks, and a controllable tradeoff between robustness and performance.
    Abstract Distributionally Robust Optimization (DRO), which aims to find an optimal decision that minimizes the worst case cost over the ambiguity set of probability distribution, has been widely applied in diverse applications, e.g., network behavior analysis, risk management, etc. However, existing DRO techniques face three key challenges: 1) how to deal with the asynchronous updating in a distributed environment; 2) how to leverage the prior distribution effectively; 3) how to properly adjust the degree of robustness according to different scenarios. To this end, we propose an asynchronous distributed algorithm, named Asynchronous Single-looP alternatIve gRadient projEction (ASPIRE) algorithm with the itErative Active SEt method (EASE) to tackle the federated distributionally robust optimization (FDRO) problem. Furthermore, a new uncertainty set, i.e., constrained D-norm uncertainty set, is developed to effectively leverage the prior distribution and flexibly control the degree of robustness. Finally, our theoretical analysis elucidates that the proposed algorithm is guaranteed to converge and the iteration complexity is also analyzed. Extensive empirical studies on real-world datasets demonstrate that the proposed method can not only achieve fast convergence, and remain robust against data heterogeneity as well as malicious attacks, but also tradeoff robustness with performance.
    摘要 分布鲁棒优化(DRO)旨在找到在概率分布模糊集上最小化最坏情况成本的最优决策,已被广泛应用于网络行为分析、风险管理等多种场景。然而,现有 DRO 技术面临三大挑战:1)如何应对分布式环境中的异步更新;2)如何有效利用先验分布;3)如何根据不同场景恰当调整鲁棒程度。为此,我们提出了一种异步分布式算法 ASPIRE(Asynchronous Single-looP alternatIve gRadient projEction),并结合迭代活动集方法(EASE)来求解联邦分布鲁棒优化(FDRO)问题。此外,我们开发了一种新的不确定集——受约束的 D-norm 不确定集,以有效利用先验分布并灵活控制鲁棒程度。理论分析表明所提算法保证收敛,并给出了迭代复杂度分析。在真实数据集上的大量实验表明,该方法不仅收敛速度快、对数据异质性和恶意攻击保持鲁棒,还能在鲁棒性与性能之间进行权衡。
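
Schematically, the FDRO problem described above couples a min-max objective with an uncertainty set built around a prior distribution. In our notation (not the paper's exact formulation), with client losses $\ell_i$ and the probability simplex $\Delta_N$:

```latex
\min_{x}\; \max_{p \in \mathcal{P}}\; \sum_{i=1}^{N} p_i\, \ell_i(x),
\qquad
\mathcal{P} \;=\; \bigl\{\, p \in \Delta_N \;:\; \|p - p^{\mathrm{prior}}\|_{D} \le \rho \,\bigr\}.
```

Here $p^{\mathrm{prior}}$ encodes the prior distribution and $\rho$ tunes the degree of robustness, in the spirit of the constrained D-norm uncertainty set.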

Blockchain-based Optimized Client Selection and Privacy Preserved Framework for Federated Learning

  • paper_url: http://arxiv.org/abs/2308.04442
  • repo_url: None
  • paper_authors: Attia Qammar, Abdenacer Naouri, Jianguo Ding, Huansheng Ning
  • for: 这个研究旨在提出一个基于区块链的优化客户端选择和隐私保证的联边学习框架,以解决传统联边学习结构中的单点失灵攻击和随机选择客户端对模型训练的影响。
  • methods: 我们提出了三种智能合约:1)客户端注册合约、2)前向拍卖合约来选择优化客户端进行联邦学习模型训练、3)支付和奖励合约。另外,我们还实现了全同态加密(CKKS)方法,以保证在传输本地模型更新时,数据的隐私不会被泄露。
  • results: 我们在 benchmark 数据集上评估了我们的提案,并与现有的研究进行比较。结果显示,我们的方法可以实现高精度率和隐私保证的联边学习框架,并且具有分散的自然 caracteristics。
    Abstract Federated learning is a distributed mechanism that trains large-scale neural network models with the participation of multiple clients while data remains on their devices, sharing only the local model updates. With this feature, federated learning is considered a secure solution for data privacy issues. However, the typical FL structure relies on a client-server architecture, which exposes a single point of failure (SPoF), and the random selection of clients for model training compromises model accuracy. Furthermore, adversaries attempt inference attacks, i.e., privacy attacks such as gradient leakage. In this context, we propose a blockchain-based optimized client selection and privacy-preserving framework. We design three kinds of smart contracts: 1) registration of clients, 2) forward bidding to select optimized clients for FL model training, and 3) payment settlement and reward smart contracts. Moreover, fully homomorphic encryption with the Cheon, Kim, Kim, and Song (CKKS) method is applied before transmitting the local model updates to the server. Finally, we evaluated our proposed method on a benchmark dataset and compared it with state-of-the-art studies. Consequently, we achieved a higher accuracy rate and a privacy-preserving FL framework with a decentralized nature.
    摘要 联邦学习是一种分布式机制,通过多个客户端参与训练大规模神经网络模型,数据保留在客户端设备上,只共享本地模型更新。凭借这一特点,联邦学习被视为解决数据隐私问题的安全方案。然而,典型的联邦学习结构依赖客户端-服务器模式,导致单点失效(SPoF)攻击,而随机选择客户端参与模型训练会损害模型精度。此外,攻击者还会尝试推理攻击,即导致梯度泄露的隐私攻击。在此背景下,我们提出了基于区块链的优化客户端选择与隐私保护框架。我们设计了三种智能合约:1)客户端注册合约;2)通过前向竞价选择参与联邦学习模型训练的优化客户端的合约;3)支付结算与奖励合约。此外,在向服务器传输本地模型更新之前,我们采用 Cheon、Kim、Kim 和 Song(CKKS)方法实现全同态加密。最后,我们在基准数据集上评估了所提方法,并与最新研究进行了比较,实现了精度更高、保护隐私且具有去中心化特性的联邦学习框架。
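
A minimal sketch of encrypting a local model update with the CKKS scheme before transmission, using the open-source TenSEAL library as a stand-in (the paper does not specify its implementation; the parameter choices here are illustrative):

```python
import tenseal as ts

# CKKS context; poly_modulus_degree and moduli sizes are illustrative defaults.
ctx = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
ctx.global_scale = 2**40
ctx.generate_galois_keys()

local_update = [0.12, -0.03, 0.48]              # toy model-update vector
enc_update = ts.ckks_vector(ctx, local_update)  # encrypted before upload

# CKKS supports arithmetic on ciphertexts, so a server could aggregate
# encrypted updates without ever seeing the plaintext values.
enc_sum = enc_update + enc_update
print(enc_sum.decrypt())  # ~[0.24, -0.06, 0.96] (approximate scheme)
```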

Knowledge-enhanced Neuro-Symbolic AI for Cybersecurity and Privacy

  • paper_url: http://arxiv.org/abs/2308.02031
  • repo_url: None
  • paper_authors: Aritran Piplai, Anantaa Kotal, Seyedreza Mohseni, Manas Gaur, Sudip Mittal, Anupam Joshi
  • for: 该论文旨在探讨如何使用神经网络和符号知识图来提高人类可理解性和安全性在人工智能系统中。
  • methods: 该论文使用了神经网络和符号知识图的组合方法,以提高对复杂数据空间的探索和学习,同时保持可理解性和安全性。
  • results: 该论文表明,通过神经网络和符号知识图的组合,可以在Cybersecurity和隐私等高度需要人工智能可解释性的领域中提高AI系统的准确性和安全性。
    Abstract Neuro-Symbolic Artificial Intelligence (AI) is an emerging and quickly advancing field that combines the subsymbolic strengths of (deep) neural networks and explicit, symbolic knowledge contained in knowledge graphs to enhance explainability and safety in AI systems. This approach addresses a key criticism of current generation systems, namely their inability to generate human-understandable explanations for their outcomes and ensure safe behaviors, especially in scenarios with "unknown unknowns" (e.g. cybersecurity, privacy). The integration of neural networks, which excel at exploring complex data spaces, and symbolic knowledge graphs, which represent domain knowledge, allows AI systems to reason, learn, and generalize in a manner understandable to experts. This article describes how applications in cybersecurity and privacy, two most demanding domains in terms of the need for AI to be explainable while being highly accurate in complex environments, can benefit from Neuro-Symbolic AI.
    摘要 neural network 和 symbolic knowledge graph 的结合,即 Neuro-Symbolic AI,是一个快速发展的领域,它可以提高 AI 系统的解释性和安全性。这种方法可以解决现有系统的一个批评,即无法生成人类理解的解释,特别是在“未知未知”(如隐私、安全)的场景下。 neural network 可以很好地探索复杂数据空间,而 symbolic knowledge graph 可以表示领域知识,这使得 AI 系统可以由专家理解的方式进行推理、学习和泛化。本文介绍了如何通过 Neuro-Symbolic AI 应用于隐私和安全领域,这两个领域对 AI 系统的解释性和高精度性有特别高的需求。

Counterfactual Explanation Policies in RL

  • paper_url: http://arxiv.org/abs/2307.13192
  • repo_url: None
  • paper_authors: Shripad V. Deshmukh, Srivatsan R, Supriti Vijay, Jayakumar Subramanian, Chirag Agarwal
  • for: 这个论文的目的是解释RL策略的可解释性,并提供一种基于对比的策略分析方法。
  • methods: 该论文使用了对比方法,将策略视为一种可变的对比,并通过对比来分析策略的改进。
  • results: 实验结果表明,COUNTERPOL可以生成有用的对比解释,帮助分析RL策略的性能改进。 论文在五个不同的RL环境中进行了广泛的实验,并证明了对比解释的实用性。
    Abstract As Reinforcement Learning (RL) agents are increasingly employed in diverse decision-making problems using reward preferences, it becomes important to ensure that policies learned by these frameworks in mapping observations to a probability distribution of the possible actions are explainable. However, there is little to no work in the systematic understanding of these complex policies in a contrastive manner, i.e., what minimal changes to the policy would improve/worsen its performance to a desired level. In this work, we present COUNTERPOL, the first framework to analyze RL policies using counterfactual explanations in the form of minimal changes to the policy that lead to the desired outcome. We do so by incorporating counterfactuals in supervised learning in RL with the target outcome regulated using desired return. We establish a theoretical connection between Counterpol and widely used trust region-based policy optimization methods in RL. Extensive empirical analysis shows the efficacy of COUNTERPOL in generating explanations for (un)learning skills while keeping close to the original policy. Our results on five different RL environments with diverse state and action spaces demonstrate the utility of counterfactual explanations, paving the way for new frontiers in designing and developing counterfactual policies.
    摘要 随着强化学习(RL)智能体越来越多地借助奖励偏好被用于各种决策问题,确保这些框架所学到的、将观测映射到可能动作概率分布的策略具有可解释性变得愈发重要。然而,目前几乎没有工作以对比的方式系统地理解这些复杂策略,即:对策略做出哪些最小改动能将其性能提升或降低到期望水平。在这项工作中,我们提出了 COUNTERPOL,这是第一个利用反事实解释(即达到期望结果的最小策略改动)来分析 RL 策略的框架。我们通过在 RL 的监督学习中引入反事实,并用期望回报调控目标结果来实现这一点。我们建立了 Counterpol 与 RL 中广泛使用的信任域策略优化方法之间的理论联系。大量实证分析表明,COUNTERPOL 能在保持与原策略接近的同时,为(取消)学习技能生成有效的解释。我们在五个具有不同状态和动作空间的 RL 环境上的结果展示了反事实解释的实用性,为设计和开发反事实策略开辟了新的前沿。
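
The core idea, finding a minimal policy change that reaches a desired return, can be sketched as a small optimization problem. Everything below (the linear return estimator `w`, the loss weights) is a toy assumption, not COUNTERPOL's actual formulation:

```python
import torch
import torch.nn.functional as F

S, A = 10, 4                              # toy state/action space
policy_logits = torch.randn(S, A)         # original policy
w = torch.randn(A)                        # stand-in linear return model

def est_return(logits):
    # Expected "return" of a policy under the toy linear model.
    return (F.softmax(logits, dim=-1) @ w).mean()

target = est_return(policy_logits) + 1.0  # desired return level
cf_logits = policy_logits.clone().requires_grad_(True)
opt = torch.optim.Adam([cf_logits], lr=0.1)

for _ in range(300):
    opt.zero_grad()
    # Stay close to the original policy (minimal change) ...
    proximity = F.kl_div(F.log_softmax(cf_logits, dim=-1),
                         F.softmax(policy_logits, dim=-1),
                         reduction="batchmean")
    # ... while steering the estimated return toward the target.
    loss = (est_return(cf_logits) - target) ** 2 + 0.1 * proximity
    loss.backward()
    opt.step()
```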

Digital Emotion Regulation on Social Media

  • paper_url: http://arxiv.org/abs/2307.13187
  • repo_url: None
  • paper_authors: Akriti Verma, Shama Islam, Valeh Moghaddam, Adnan Anwar
  • for: 这篇论文主要是关于如何利用数字技术来调节情绪 state,以支持伦理的技术设计、开发和部署。
  • methods: 论文使用了社交媒体应用程序的不同特性和功能来描述不同阶段的情绪调节过程。
  • results: 研究发现了不同社交媒体应用程序在情绪调节过程各阶段中的应用,并综述了近期关于社交媒体情绪调节干预措施的研究。
    Abstract Emotion regulation is the process of consciously altering one's affective state, that is the underlying emotional state such as happiness, confidence, guilt, anger etc. The ability to effectively regulate emotions is necessary for functioning efficiently in everyday life. Today, the pervasiveness of digital technology is being purposefully employed to modify our affective states, a process known as digital emotion regulation. Understanding digital emotion regulation can help support the rise of ethical technology design, development, and deployment. This article presents an overview of digital emotion regulation in social media applications, as well as a synthesis of recent research on emotion regulation interventions for social media. We share our findings from analysing state-of-the-art literature on how different social media applications are utilised at different stages in the process of emotion regulation.
    摘要 情绪调节是指有意识地改变自己的情绪状态,包括快乐、自信、内疚、愤怒等内在情绪。有效地调节情绪是高效进行日常生活所必需的。如今,无处不在的数字技术正被有意地用于改变我们的情绪状态,这一过程被称为数字情绪调节。了解数字情绪调节有助于推动伦理技术的设计、开发和部署。本文概述了社交媒体应用中的数字情绪调节,并综合了近期针对社交媒体的情绪调节干预研究。我们分析了最新文献,描述了不同的社交媒体应用程序在情绪调节过程的不同阶段中如何被使用。

Opinion Mining Using Population-tuned Generative Language Models

  • paper_url: http://arxiv.org/abs/2307.13173
  • repo_url: None
  • paper_authors: Allmin Susaiyah, Abhinay Pandya, Aki Härmä
  • for: 用于挖掘文本收集中的意见
  • methods: 使用生成语言模型,通过特定的方法和数据集进行训练
  • results: 可以学习和传递意见到semantic类,保持极性分布
    Abstract We present a novel method for mining opinions from text collections using generative language models trained on data collected from different populations. We describe the basic definitions, methodology and a generic algorithm for opinion insight mining. We demonstrate the performance of our method in an experiment where a pre-trained generative model is fine-tuned using specifically tailored content with unnatural and fully annotated opinions. We show that our approach can learn and transfer the opinions to the semantic classes while maintaining the proportion of polarisation. Finally, we demonstrate the usage of an insight mining system to scale up the discovery of opinion insights from a real text corpus.
    摘要 我们提出了一种利用生成语言模型从文本集合中挖掘意见的新方法,这些模型在来自不同人群的数据上训练。我们描述了意见洞察挖掘的基本定义、方法论和一个通用算法。在一项实验中,我们使用特别定制、带有非自然且完全标注意见的内容对预训练生成模型进行微调,以展示该方法的性能。我们表明,该方法能够学习意见并将其迁移到语义类别,同时保持极性比例。最后,我们展示了利用洞察挖掘系统从真实文本语料中规模化发现意见洞察。

Investigating the Robustness of Sequential Recommender Systems Against Training Data Perturbations: an Empirical Study

  • paper_url: http://arxiv.org/abs/2307.13165
  • repo_url: None
  • paper_authors: Filippo Betello, Federico Siciliano, Pushkar Mishra, Fabrizio Silvestri
  • for: 这个论文旨在研究Sequential Recommender Systems(SRSs)在训练数据中的稳定性,具体来说是研究在 temporally ordered sequence 中移除items的影响。
  • methods: 这个论文使用了两种不同的SRS模型,在多个数据集上进行了评估,使用了Normalized Discounted Cumulative Gain(NDCG)和Rank Sensitivity List metric来衡量性能。
  • results: 研究发现,在序列中移除items的末端位置会导致性能下降,NDCG下降可达60%,而从开头或中间位置移除items没有显著影响。这些发现表明考虑训练数据中items的位置是重要的,这将有助于设计更加稳定的SRSs。
    Abstract Sequential Recommender Systems (SRSs) have been widely used to model user behavior over time, but their robustness in the face of perturbations to training data is a critical issue. In this paper, we conduct an empirical study to investigate the effects of removing items at different positions within a temporally ordered sequence. We evaluate two different SRS models on multiple datasets, measuring their performance using Normalized Discounted Cumulative Gain (NDCG) and Rank Sensitivity List metrics. Our results demonstrate that removing items at the end of the sequence significantly impacts performance, with NDCG decreasing up to 60\%, while removing items from the beginning or middle has no significant effect. These findings highlight the importance of considering the position of the perturbed items in the training data and shall inform the design of more robust SRSs.
    摘要 顺序推荐系统(SRS)被广泛用于建模用户随时间的行为,但其在训练数据受扰动时的鲁棒性是一个关键问题。本文通过实证研究考察了在时间排序序列中不同位置移除物品的影响。我们在多个数据集上评估了两种不同的 SRS 模型,使用归一化折损累计增益(NDCG)和 Rank Sensitivity List 指标衡量其性能。结果表明,移除序列末端的物品会显著影响性能,NDCG 最多下降 60%,而从序列开头或中间移除物品则没有显著影响。这些发现凸显了考虑训练数据中受扰动物品位置的重要性,并将为设计更鲁棒的 SRS 提供参考。
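
The perturbation protocol studied above is easy to reproduce: remove k items from the beginning, middle, or end of each training sequence, retrain, and compare metrics. A minimal sketch (function names are ours):

```python
def remove_items(seq, position, k=1):
    """Drop k items from a temporally ordered interaction sequence."""
    if position == "beginning":
        return seq[k:]
    if position == "middle":
        mid = len(seq) // 2
        return seq[:mid - k // 2] + seq[mid - k // 2 + k:]
    if position == "end":
        return seq[:-k]
    raise ValueError(position)

history = ["item_a", "item_b", "item_c", "item_d", "item_e"]
for pos in ("beginning", "middle", "end"):
    print(pos, remove_items(history, pos, k=1))
# Retraining on each perturbed variant and comparing NDCG isolates
# the effect of the removal position reported in the paper.
```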

Improving Primary Healthcare Workflow Using Extreme Summarization of Scientific Literature Based on Generative AI

  • paper_url: http://arxiv.org/abs/2307.15715
  • repo_url: None
  • paper_authors: Gregor Stiglic, Leon Kopitar, Lucija Gosak, Primoz Kocbek, Zhe He, Prithwish Chakraborty, Pablo Meyer, Jiang Bian
  • for: 这个研究的目的是探究生成人工智能在减轻医疗专业人员的认知压力方面的潜力,以便更好地帮助他们保持最新的科学文献知识。
  • methods: 这个研究使用了生成型人工智能技术,基于大规模语言模型,对科学报告摘要进行概括。
  • results: 研究结果表明,使用生成型人工智能进行文献综述是高效的,可以减少医疗专业人员查找科学文献所需的时间。然而,研究还发现,当全文摘要不可用时,提取知识的准确性会下降。这种颠覆性技术有潜力大幅减少医疗专业人员跟进最新科学文献所需的时间,但要帮助他们准确理解其中的知识,仍需进一步发展。
    Abstract Primary care professionals struggle to keep up to date with the latest scientific literature critical in guiding evidence-based practice related to their daily work. To help solve the above-mentioned problem, we employed generative artificial intelligence techniques based on large-scale language models to summarize abstracts of scientific papers. Our objective is to investigate the potential of generative artificial intelligence in diminishing the cognitive load experienced by practitioners, thus exploring its ability to alleviate mental effort and burden. The study participants were provided with two use cases related to preventive care and behavior change, simulating a search for new scientific literature. The study included 113 university students from Slovenia and the United States randomized into three distinct study groups. The first group was assigned to the full abstracts. The second group was assigned to the short abstracts generated by AI. The third group had the option to select a full abstract in addition to the AI-generated short summary. Each use case study included ten retrieved abstracts. Our research demonstrates that the use of generative AI for literature review is efficient and effective. The time needed to answer questions related to the content of abstracts was significantly lower in groups two and three compared to the first group using full abstracts. The results, however, also show significantly lower accuracy in extracted knowledge in cases where full abstract was not available. Such a disruptive technology could significantly reduce the time required for healthcare professionals to keep up with the most recent scientific literature; nevertheless, further developments are needed to help them comprehend the knowledge accurately.
    摘要 基层医疗专业人员难以跟上指导循证实践所需的最新科学文献。为帮助解决上述问题,我们采用基于大规模语言模型的生成式人工智能技术对科学论文摘要进行概括。我们的目标是考察生成式人工智能在减轻从业者认知负荷方面的潜力,探索其缓解脑力负担的能力。研究参与者获得了与预防保健和行为改变相关的两个用例,模拟检索新的科学文献。研究纳入了来自斯洛文尼亚和美国的 113 名大学生,随机分为三组:第一组阅读完整摘要;第二组阅读由 AI 生成的短摘要;第三组除 AI 生成的短摘要外还可选择查看完整摘要。每个用例包含十篇检索到的摘要。研究表明,使用生成式 AI 进行文献综述是高效且有效的:与使用完整摘要的第一组相比,第二组和第三组回答与摘要内容相关问题所需的时间显著更短。但结果也显示,在无法查看完整摘要的情况下,提取知识的准确性显著降低。这类颠覆性技术有望大幅减少医疗专业人员跟进最新科学文献所需的时间;不过,仍需进一步发展以帮助他们准确理解其中的知识。

Why Don’t You Clean Your Glasses? Perception Attacks with Dynamic Optical Perturbations

  • paper_url: http://arxiv.org/abs/2307.13131
  • repo_url: None
  • paper_authors: Yi Han, Matthew Chan, Eric Wengrowski, Zhuohuan Li, Nils Ole Tippenhauer, Mani Srivastava, Saman Zonouz, Luis Garcia
  • for: 这篇论文的目的是研究针对自主系统中机器学习模型的攻击,以及这些模型在物理世界中受到的攻击。
  • methods: 该论文使用了一种名为“EvilEye”的中间人感知攻击,利用透明显示屏生成动态物理对抗样本。这种攻击利用相机的光学特性,在不同的照明条件下引起误分类。
  • results: 实验表明,EvilEye生成的对抗样本在环境噪声和自主系统的动态变化下表现得非常稳定,可以高效绕过当前物理世界攻击检测框架。此外,EvilEye可以针对不同的物体实现高度的攻击成功率。
    Abstract Camera-based autonomous systems that emulate human perception are increasingly being integrated into safety-critical platforms. Consequently, an established body of literature has emerged that explores adversarial attacks targeting the underlying machine learning models. Adapting adversarial attacks to the physical world is desirable for the attacker, as this removes the need to compromise digital systems. However, the real world poses challenges related to the "survivability" of adversarial manipulations given environmental noise in perception pipelines and the dynamicity of autonomous systems. In this paper, we take a sensor-first approach. We present EvilEye, a man-in-the-middle perception attack that leverages transparent displays to generate dynamic physical adversarial examples. EvilEye exploits the camera's optics to induce misclassifications under a variety of illumination conditions. To generate dynamic perturbations, we formalize the projection of a digital attack into the physical domain by modeling the transformation function of the captured image through the optical pipeline. Our extensive experiments show that EvilEye's generated adversarial perturbations are much more robust across varying environmental light conditions relative to existing physical perturbation frameworks, achieving a high attack success rate (ASR) while bypassing state-of-the-art physical adversarial detection frameworks. We demonstrate that the dynamic nature of EvilEye enables attackers to adapt adversarial examples across a variety of objects with a significantly higher ASR compared to state-of-the-art physical world attack frameworks. Finally, we discuss mitigation strategies against the EvilEye attack.
    摘要 模拟人类感知的基于摄像头的自主系统正日益被集成到安全关键平台中。因此,已经形成了一批探讨针对底层机器学习模型的对抗攻击的文献。对攻击者而言,将对抗攻击适配到物理世界是有利的,因为这样就无需入侵数字系统。然而,物理世界带来了相关挑战:感知管线中的环境噪声以及自主系统的动态性都会影响对抗操纵的“生存性”。在本文中,我们采取传感器优先的方法,提出了EvilEye——一种利用透明显示屏生成动态物理对抗样本的中间人感知攻击。EvilEye利用相机的光学特性,在多种照明条件下引起误分类。为了生成动态扰动,我们通过对光学管线中被捕获图像的变换函数建模,将数字攻击投射到物理域。我们的大量实验表明,相较于现有的物理扰动框架,EvilEye生成的对抗扰动在不同环境光照条件下更加鲁棒,在绕过最先进物理对抗检测框架的同时实现了较高的攻击成功率(ASR)。我们证明,EvilEye的动态特性使攻击者能够将对抗样本适配到多种物体上,其ASR显著高于最先进的物理世界攻击框架。最后,我们讨论了针对EvilEye攻击的防御策略。

A Hybrid Machine Learning Model for Classifying Gene Mutations in Cancer using LSTM, BiLSTM, CNN, GRU, and GloVe

  • paper_url: http://arxiv.org/abs/2307.14361
  • repo_url: None
  • paper_authors: Sanad Aburass, Osama Dorgham, Jamil Al Shaqsi
  • for: 这种研究是为了使用Kaggle的个性化医疗:再定义癌症治疗数据集来分类基因突变。
  • methods: 这个模型使用了LSTM、BiLSTM、CNN、GRU和GloVe ensemble模型来实现这一目标。
  • results: 这个模型在准确率、精度、召回率、F1分数和均方误差方面均优于所有其他模型,并且需要更少的训练时间,是性能和效率的完美结合。
    Abstract This study presents an ensemble model combining LSTM, BiLSTM, CNN, GRU, and GloVe to classify gene mutations using Kaggle's Personalized Medicine: Redefining Cancer Treatment dataset. The results were compared against well-known transformers such as BERT, Electra, Roberta, XLNet, Distilbert, and their LSTM ensembles. Our model outperformed all other models in terms of accuracy, precision, recall, F1 score, and Mean Squared Error. Surprisingly, it also needed less training time, resulting in a perfect combination of performance and efficiency. This study demonstrates the utility of ensemble models for difficult tasks such as gene mutation classification.
    摘要 本研究提出了一种结合 LSTM、BiLSTM、CNN、GRU 和 GloVe 的集成模型,用于在 Kaggle 的 Personalized Medicine: Redefining Cancer Treatment 数据集上对基因突变进行分类。结果与 BERT、Electra、Roberta、XLNet、Distilbert 等知名 Transformer 模型及其 LSTM 集成进行了比较。我们的模型在准确率、精度、召回率、F1 分数和均方误差方面均优于所有其他模型。令人惊讶的是,它所需的训练时间也更少,实现了性能与效率的完美结合。本研究展示了集成模型在基因突变分类等困难任务中的实用性。
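
A minimal Keras sketch of the multi-branch architecture the paper combines (LSTM, BiLSTM, CNN, and GRU over shared GloVe-style embeddings); the layer sizes, the 9-class output, and the randomly initialized embedding are illustrative assumptions:

```python
from tensorflow.keras import Input, Model, layers

VOCAB, MAXLEN, EMB_DIM, NUM_CLASSES = 20000, 200, 100, 9

inp = Input(shape=(MAXLEN,))
# In the paper the embedding would be initialized from GloVe vectors;
# a trainable random embedding keeps this sketch self-contained.
emb = layers.Embedding(VOCAB, EMB_DIM)(inp)

branches = [
    layers.LSTM(64)(emb),
    layers.Bidirectional(layers.LSTM(64))(emb),
    layers.GlobalMaxPooling1D()(layers.Conv1D(64, 5, activation="relu")(emb)),
    layers.GRU(64)(emb),
]
merged = layers.concatenate(branches)
out = layers.Dense(NUM_CLASSES, activation="softmax")(merged)

model = Model(inp, out)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```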

Deep Bradley-Terry Rating: Quantifying Properties from Comparisons

  • paper_url: http://arxiv.org/abs/2307.13709
  • repo_url: None
  • paper_authors: Satoru Fujii
  • for: 该论文旨在解决实际世界中不直接可观察的许多属性的难题,通过使用grade human scores作为目标标签进行训练。
  • methods: 该论文提出了一种名为深度布莱德利-泰勒评分(DBTR)的新机器学习框架,该框架将布莱德利-泰勒模型 integrates into neural network structure,并在不平等环境下进行扩展。
  • results: 经过实验分析,DBTR成功地学习和估计所需的属性。
    Abstract Many properties in the real world can't be directly observed, making them difficult to learn. To deal with this challenging problem, prior works have primarily focused on estimating those properties by using graded human scores as the target label in the training. Meanwhile, rating algorithms based on the Bradley-Terry model are extensively studied to evaluate the competitiveness of players based on their match history. In this paper, we introduce the Deep Bradley-Terry Rating (DBTR), a novel machine learning framework designed to quantify and evaluate properties of unknown items. Our method seamlessly integrates the Bradley-Terry model into the neural network structure. Moreover, we generalize this architecture further to asymmetric environments with unfairness, a condition more commonly encountered in real-world settings. Through experimental analysis, we demonstrate that DBTR successfully learns to quantify and estimate desired properties.
    摘要 现实世界中的许多属性无法被直接观察,因而难以学习。为了解决这一难题,以往工作主要通过使用分级人工评分作为目标标签进行训练来估计这些属性。与此同时,基于 Bradley-Terry 模型的评分算法被广泛研究,用于根据比赛历史评估选手的竞技水平。在这篇论文中,我们介绍了深度 Bradley-Terry 评分(DBTR),一种用于量化和评估未知物品属性的新型机器学习框架。我们的方法将 Bradley-Terry 模型无缝集成到神经网络结构中,并进一步将该架构推广到存在不公平性的非对称环境,这种情况在现实世界中更为常见。通过实验分析,我们证明 DBTR 能成功学习量化和估计所需的属性。
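
The Bradley-Terry model scores each item and turns score differences into win probabilities; integrating it into a network amounts to learning the score function from comparison outcomes. A minimal PyTorch sketch (the architecture and names are ours, not the paper's DBTR):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PairwiseRater(nn.Module):
    """Learns a scalar rating s(x); P(a beats b) = sigmoid(s(a) - s(b))."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, a, b):
        return torch.sigmoid(self.score(a) - self.score(b))

net = PairwiseRater(dim=8)
a, b = torch.randn(256, 8), torch.randn(256, 8)   # toy item features
wins = (torch.rand(256, 1) > 0.5).float()         # toy comparison outcomes

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = F.binary_cross_entropy(net(a, b), wins)
    loss.backward()
    opt.step()
# After training, net.score(x) quantifies the latent property of item x.
```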

Getting pwn’d by AI: Penetration Testing with Large Language Models

  • paper_url: http://arxiv.org/abs/2308.00121
  • repo_url: https://github.com/ipa-lab/hackingBuddyGPT
  • paper_authors: Andreas Happe, Jürgen Cito
  • for: 该论文探讨了使用大语言模型(如GPT3.5)来补充安全测试人员,以增强安全测试的效率和质量。
  • methods: 论文在两种使用场景中采用大语言模型:高层安全测试任务规划与低层漏洞搜寻,并与易受攻击的虚拟机之间实现了闭环反馈,让LLM分析机器状态并提出可自动执行的攻击方式。
  • results: 论文初步结果显示,使用大语言模型可以帮助提高安全测试的效率和质量,并且可以帮助找到一些潜在的漏洞。但是,论文还需要进一步的改进和优化。
    Abstract The field of software security testing, more specifically penetration testing, is an activity that requires high levels of expertise and involves many manual testing and analysis steps. This paper explores the potential usage of large-language models, such as GPT3.5, to augment penetration testers with AI sparring partners. We explore the feasibility of supplementing penetration testers with AI models for two distinct use cases: high-level task planning for security testing assignments and low-level vulnerability hunting within a vulnerable virtual machine. For the latter, we implemented a closed-feedback loop between LLM-generated low-level actions with a vulnerable virtual machine (connected through SSH) and allowed the LLM to analyze the machine state for vulnerabilities and suggest concrete attack vectors which were automatically executed within the virtual machine. We discuss promising initial results, detail avenues for improvement, and close deliberating on the ethics of providing AI-based sparring partners.
    摘要 软件安全测试领域,特别是渗透测试,需要高水平的专业知识,并涉及许多手动测试和分析步骤。这篇论文探讨使用大型语言模型(如GPT3.5)为渗透测试人员提供AI陪练的可能性。我们探讨在两个不同的用例中使用AI模型补充渗透测试人员:一是安全测试任务的高层任务规划,二是在易受攻击的虚拟机中进行低层漏洞搜寻。对于后者,我们实现了LLM生成的低层操作与虚拟机(通过SSH连接)之间的闭环反馈,让LLM分析机器状态以寻找漏洞并提出具体的攻击向量,这些攻击向量随后在虚拟机中自动执行。我们讨论了有前景的初步结果,详细描述了改进方向,并最后探讨了提供基于AI的陪练伙伴所涉及的伦理问题。

An Explainable Geometric-Weighted Graph Attention Network for Identifying Functional Networks Associated with Gait Impairment

  • paper_url: http://arxiv.org/abs/2307.13108
  • repo_url: https://github.com/favour-nerrise/xgw-gat
  • paper_authors: Favour Nerrise, Qingyu Zhao, Kathleen L. Poston, Kilian M. Pohl, Ehsan Adeli
  • for: 这个研究的目的是为了更好地理解帕金森病的运动症状进展,以开发更有效和个性化的治疗方法。
  • methods: 这个研究使用了一种可解释的、几何的、加权图注意力神经网络(xGW-GAT),用于预测帕金森病患者的步态障碍程度。
  • results: xGW-GAT模型可以从静息态功能MRI数据中识别与步态障碍相关的功能连接模式,并且可以提供可解释的功能子网络,为帕金森病患者的运动障碍提供解释。
    Abstract One of the hallmark symptoms of Parkinson's Disease (PD) is the progressive loss of postural reflexes, which eventually leads to gait difficulties and balance problems. Identifying disruptions in brain function associated with gait impairment could be crucial in better understanding PD motor progression, thus advancing the development of more effective and personalized therapeutics. In this work, we present an explainable, geometric, weighted-graph attention neural network (xGW-GAT) to identify functional networks predictive of the progression of gait difficulties in individuals with PD. xGW-GAT predicts the multi-class gait impairment on the MDS Unified PD Rating Scale (MDS-UPDRS). Our computational- and data-efficient model represents functional connectomes as symmetric positive definite (SPD) matrices on a Riemannian manifold to explicitly encode pairwise interactions of entire connectomes, based on which we learn an attention mask yielding individual- and group-level explainability. Applied to our resting-state functional MRI (rs-fMRI) dataset of individuals with PD, xGW-GAT identifies functional connectivity patterns associated with gait impairment in PD and offers interpretable explanations of functional subnetworks associated with motor impairment. Our model successfully outperforms several existing methods while simultaneously revealing clinically-relevant connectivity patterns. The source code is available at https://github.com/favour-nerrise/xGW-GAT .
    摘要 帕金森病(PD)的标志性症状之一是姿势反射的进行性丧失,最终导致步态困难和平衡问题。识别与步态障碍相关的脑功能紊乱,对于更好地理解 PD 运动症状的进展至关重要,从而推动更有效、更个性化治疗的开发。在这项工作中,我们提出了一种可解释的、几何的、加权图注意力神经网络(xGW-GAT),用于识别可预测 PD 患者步态困难进展的功能网络。xGW-GAT 预测 MDS 统一帕金森病评定量表(MDS-UPDRS)上的多类步态障碍等级。我们的计算和数据高效的模型将功能连接组表示为黎曼流形上的对称正定(SPD)矩阵,以显式编码整个连接组的成对交互,并在此基础上学习一个注意力掩码,从而提供个体和群体层面的可解释性。应用于 PD 患者的静息态功能磁共振(rs-fMRI)数据集,xGW-GAT 识别出与 PD 步态障碍相关的功能连接模式,并对与运动障碍相关的功能子网络给出可解释的说明。我们的模型在优于多种现有方法的同时,揭示了具有临床意义的连接模式。源代码见 https://github.com/favour-nerrise/xGW-GAT 。

How to use LLMs for Text Analysis

  • paper_url: http://arxiv.org/abs/2307.13106
  • repo_url: https://github.com/cssmodels/howtousellms
  • paper_authors: Petter Törnberg
  • for: 这篇论文是用于介绍大语言模型(LLM)在社会科学中的应用。
  • methods: 论文使用Python语言和API进行文本分析,包括文本标注和分类、情感分析和批判话语分析等多种任务。
  • results: 论文通过使用LLM分析政治文本,并展示其超越了现有的最先进方法。
    Abstract This guide introduces Large Language Models (LLM) as a highly versatile text analysis method within the social sciences. As LLMs are easy-to-use, cheap, fast, and applicable on a broad range of text analysis tasks, ranging from text annotation and classification to sentiment analysis and critical discourse analysis, many scholars believe that LLMs will transform how we do text analysis. This how-to guide is aimed at students and researchers with limited programming experience, and offers a simple introduction to how LLMs can be used for text analysis in your own research project, as well as advice on best practices. We will go through each of the steps of analyzing textual data with LLMs using Python: installing the software, setting up the API, loading the data, developing an analysis prompt, analyzing the text, and validating the results. As an illustrative example, we will use the challenging task of identifying populism in political texts, and show how LLMs move beyond the existing state-of-the-art.
    摘要 本指南介绍大语言模型(LLM)作为社会科学中用途极广的文本分析方法。由于 LLM 易于使用、成本低、速度快,并且适用于从文本标注和分类到情感分析和批判话语分析等广泛的文本分析任务,许多学者认为 LLM 将改变我们进行文本分析的方式。本指南面向编程经验有限的学生和研究人员,简要介绍如何在自己的研究项目中使用 LLM 进行文本分析,并给出最佳实践建议。我们将逐步讲解使用 Python 借助 LLM 分析文本数据的各个步骤:安装软件、设置 API、加载数据、编写分析提示、分析文本以及验证结果。作为示例,我们将使用在政治文本中识别民粹主义这一具有挑战性的任务,并展示 LLM 如何超越现有的最先进方法。
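
A minimal sketch of the prompt-based annotation loop the guide describes, using the OpenAI Python client; the model name, prompt wording, and label set are illustrative assumptions:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = (
    "You are an expert coder of political texts. Label the following text "
    "as 'populist' or 'not populist'. Reply with the label only.\n\nText: {text}"
)

def annotate(text, model="gpt-4o-mini"):  # model name is an assumption
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT.format(text=text)}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()

texts = ["The corrupt elite has betrayed the ordinary people.",
         "The committee will reconvene on Thursday."]
labels = [annotate(t) for t in texts]
# Validate against a hand-coded subset before scaling up, as the guide advises.
```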

Contrastive Example-Based Control

  • paper_url: http://arxiv.org/abs/2307.13101
  • repo_url: https://github.com/khatch31/laeo
  • paper_authors: Kyle Hatch, Benjamin Eysenbach, Rafael Rafailov, Tianhe Yu, Ruslan Salakhutdinov, Sergey Levine, Chelsea Finn
  • for: 这篇论文的目的是提出一种基于示例的控制方法,可以在不使用奖励函数的情况下学习Q值函数。
  • methods: 这种方法使用数据驱动的方式,从转移动态和高回报状态中学习一个隐式模型,而不是直接学习奖励函数。
  • results: 对比基线方法,这种方法在多种基于状态和基于图像的离线控制任务中表现出色,并且随数据集规模增大显示出更好的稳定性和扩展性。
    Abstract While many real-world problems might benefit from reinforcement learning, these problems rarely fit into the MDP mold: interacting with the environment is often expensive and specifying reward functions is challenging. Motivated by these challenges, prior work has developed data-driven approaches that learn entirely from samples from the transition dynamics and examples of high-return states. These methods typically learn a reward function from high-return states, use that reward function to label the transitions, and then apply an offline RL algorithm to these transitions. While these methods can achieve good results on many tasks, they can be complex, often requiring regularization and temporal difference updates. In this paper, we propose a method for offline, example-based control that learns an implicit model of multi-step transitions, rather than a reward function. We show that this implicit model can represent the Q-values for the example-based control problem. Across a range of state-based and image-based offline control tasks, our method outperforms baselines that use learned reward functions; additional experiments demonstrate improved robustness and scaling with dataset size.
    摘要 虽然许多实际问题可以从强化学习中受益,但这些问题很少符合 MDP 模式:与环境交互往往代价高昂,而指定奖励函数也很困难。受这些挑战的推动,已有工作开发了完全基于数据的方法,即仅从转移动态的样本和高回报状态的示例中学习。这些方法通常从高回报状态学习一个奖励函数,用该奖励函数标注转移,再对这些转移应用离线 RL 算法。虽然这些方法能在许多任务上取得良好结果,但它们可能较为复杂,通常需要正则化和时间差分更新。在本文中,我们提出了一种离线的、基于示例的控制方法,它学习多步转移的隐式模型,而非奖励函数。我们证明该隐式模型可以表示基于示例的控制问题中的 Q 值。在一系列基于状态和基于图像的离线控制任务上,我们的方法优于使用学习得到的奖励函数的基线;附加实验还展示了更好的鲁棒性以及随数据集规模的可扩展性。

Comparative Analysis of Drug-GPT and ChatGPT LLMs for Healthcare Insights: Evaluating Accuracy and Relevance in Patient and HCP Contexts

  • paper_url: http://arxiv.org/abs/2307.16850
  • repo_url: None
  • paper_authors: Giorgos Lysandrou, Roma English Owen, Kirsty Mursec, Grant Le Brun, Elizabeth A. L. Fairley
  • for: 这个研究旨在比较三个生成预训练变换器(GPT)解决方案在问答(Q&A) Setting中的表现:Drug-GPT 3、Drug-GPT 4 和 ChatGPT,以医疗应用场景为背景。研究的目的是确定哪一个模型可以在涉及到患有过敏性皮肤炎(AD)患者经验和医疗专业人员(HCP)关于糖尿病讨论中提供最准确和有 relevance 的答案。
  • methods: 这个研究使用了三个GPT模型:Drug-GPT 3、Drug-GPT 4 和 ChatGPT,通过精心编辑的患者和医疗专业人员社交媒体和讨论区域的数据来支持这三个模型。
  • results: 研究结果表明,三个模型都能生成有 relevance 和准确的答案,但Drug-GPT 3 和 Drug-GPT 4 通过使用专门编辑的患者和医疗专业人员社交媒体和讨论区域数据,为患者和医疗专业人员提供了更加有target 和深入的报告。ChatGPT 是一个更通用的模型,可以为读者提供高度概括的了解这些主题,但可能缺乏Drug-GPT模型所具备的深度和个人经验。
    Abstract This study presents a comparative analysis of three Generative Pre-trained Transformer (GPT) solutions in a question and answer (Q&A) setting: Drug-GPT 3, Drug-GPT 4, and ChatGPT, in the context of healthcare applications. The objective is to determine which model delivers the most accurate and relevant information in response to prompts related to patient experiences with atopic dermatitis (AD) and healthcare professional (HCP) discussions about diabetes. The results demonstrate that while all three models are capable of generating relevant and accurate responses, Drug-GPT 3 and Drug-GPT 4, which are supported by curated datasets of patient and HCP social media and message board posts, provide more targeted and in-depth insights. ChatGPT, a more general-purpose model, generates broader and more general responses, which may be valuable for readers seeking a high-level understanding of the topics but may lack the depth and personal insights found in the answers generated by the specialized Drug-GPT models. This comparative analysis highlights the importance of considering the language model's perspective, depth of knowledge, and currency when evaluating the usefulness of generated information in healthcare applications.
    摘要 本研究对三种生成式预训练 Transformer(GPT)方案在问答(Q&A)场景中的表现进行了比较分析:Drug-GPT 3、Drug-GPT 4 和 ChatGPT,背景为医疗应用。目标是确定哪个模型在回应有关特应性皮炎(AD)患者经历和医疗专业人员(HCP)糖尿病讨论的提示时,能提供最准确、最相关的信息。结果表明,三个模型都能生成相关且准确的回应,但由精心筛选的患者和 HCP 社交媒体及留言板数据支持的 Drug-GPT 3 和 Drug-GPT 4 能提供更有针对性、更深入的见解。ChatGPT 作为更通用的模型,生成的回应更宽泛,适合希望高层次了解这些主题的读者,但可能缺乏专用 Drug-GPT 模型所提供的深度和个人见解。这一比较分析凸显了在评估医疗应用中生成信息的有用性时,考虑语言模型的视角、知识深度和时效性的重要性。

Making Metadata More FAIR Using Large Language Models

  • paper_url: http://arxiv.org/abs/2307.13085
  • repo_url: None
  • paper_authors: Sowmya S. Sundaram, Mark A. Musen
  • for: 这篇论文是为了解决实验数据中的metadata问题,尤其是对于不同的metadata数据进行比较和分组。
  • methods: 这篇论文使用自然语言处理(NLP)技术,开发了一个名为FAIRMetaText的应用程序,可以比较metadata中的自然语言描述,并提供一个数学性相似度的衡量方法,以便对metadata进行分组或找到相似的替代词。
  • results: 这篇论文透过对公开 available的研究artifacts进行详细的研究,证明了FAIRMetaText的算法可以大幅提高metadata相关的任务,包括搜寻、分组和替代词等。
    Abstract With the global increase in experimental data artifacts, harnessing them in a unified fashion leads to a major stumbling block - bad metadata. To bridge this gap, this work presents a Natural Language Processing (NLP) informed application, called FAIRMetaText, that compares metadata. Specifically, FAIRMetaText analyzes the natural language descriptions of metadata and provides a mathematical similarity measure between two terms. This measure can then be utilized for analyzing varied metadata, by suggesting terms for compliance or grouping similar terms for identification of replaceable terms. The efficacy of the algorithm is presented qualitatively and quantitatively on publicly available research artifacts and demonstrates large gains across metadata related tasks through an in-depth study of a wide variety of Large Language Models (LLMs). This software can drastically reduce the human effort in sifting through various natural language metadata while employing several experimental datasets on the same topic.
    摘要 随着全球实验数据制品的增加,以统一方式利用这些数据面临一个主要障碍——劣质的元数据。为弥合这一差距,本文提出了一个基于自然语言处理(NLP)的应用程序 FAIRMetaText,用于比较元数据。具体而言,FAIRMetaText 分析元数据的自然语言描述,并给出两个术语之间的数学相似度度量。该度量可用于分析各类元数据:为合规性建议术语,或将相似术语分组以识别可替换的术语。我们在公开可用的研究数据制品上定性和定量地展示了该算法的有效性,并通过对多种大语言模型(LLM)的深入研究,证明其在元数据相关任务上带来显著提升。该软件可以大幅减少在同一主题的多个实验数据集之间人工筛查各类自然语言元数据所需的工作量。
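
A minimal sketch of computing a similarity measure between metadata field descriptions with sentence embeddings; the sentence-transformers model used here is a stand-in, not necessarily what FAIRMetaText uses:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embedding model

terms = ["body mass index", "BMI", "systolic blood pressure"]
emb = model.encode(terms, convert_to_tensor=True)
sim = util.cos_sim(emb, emb)  # pairwise mathematical similarity

# High-similarity pairs are candidates for grouping or term replacement.
for i in range(len(terms)):
    for j in range(i + 1, len(terms)):
        print(f"{terms[i]!r} vs {terms[j]!r}: {float(sim[i][j]):.2f}")
```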

Fairness Under Demographic Scarce Regime

  • paper_url: http://arxiv.org/abs/2307.13081
  • repo_url: None
  • paper_authors: Patrik Joslin Kenfack, Samira Ebrahimi Kahou, Ulrich Aïvodji
  • for: 提高模型的公平性和准确性之间的贸易offs
  • methods: 引入不确定性认识,并在具有最低不确定性的样本上遵循公平性约束
  • results: 比 класси型的属性分类器更好地平衡公平性和准确性,并且在一些实际场景下超过使用真实敏感属性的模型。
    Abstract Most existing works on fairness assume the model has full access to demographic information. However, there exist scenarios where demographic information is partially available, because a record was not maintained throughout data collection or due to privacy reasons. This setting is known as the demographic scarce regime. Prior research has shown that training an attribute classifier to replace the missing sensitive attributes (proxy) can still improve fairness. However, the use of proxy-sensitive attributes worsens fairness-accuracy trade-offs compared to true sensitive attributes. To address this limitation, we propose a framework to build attribute classifiers that achieve better fairness-accuracy trade-offs. Our method introduces uncertainty awareness in the attribute classifier and enforces fairness on samples with demographic information inferred with the lowest uncertainty. We show empirically that enforcing fairness constraints on samples with uncertain sensitive attributes is detrimental to fairness and accuracy. Our experiments on two datasets showed that the proposed framework yields models with significantly better fairness-accuracy trade-offs compared to classic attribute classifiers. Surprisingly, our framework outperforms models trained with constraints on the true sensitive attributes.
    摘要 大多数现有的公平性研究假设模型拥有完整的人口统计信息。然而,在某些场景中,由于数据采集时未记录或出于隐私原因,人口统计信息只有部分可用,这种情况被称为人口统计稀缺状态。先前的研究表明,训练一个属性分类器来代替缺失的敏感属性(代理属性)仍可提升公平性。然而,与真实敏感属性相比,使用代理敏感属性会恶化公平性与准确率之间的权衡。为解决这一局限,我们提出了一个框架,用于构建能够取得更好公平性-准确率权衡的属性分类器。我们的方法在属性分类器中引入不确定性意识,并仅对推断不确定性最低的人口统计信息样本施加公平性约束。我们的实验表明,对敏感属性不确定的样本强制施加公平性约束会损害公平性和准确率。在两个数据集上的实验显示,所提框架得到的模型在公平性-准确率权衡上显著优于经典属性分类器。令人惊讶的是,我们的框架甚至优于以真实敏感属性施加约束训练的模型。
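
A minimal sketch of the uncertainty-aware idea: train a proxy attribute classifier on the small labeled subset, then enforce the fairness constraint only on samples whose inferred attribute has low predictive uncertainty. The data and threshold below are toy assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(200, 5))            # subset with recorded attributes
a_lab = rng.integers(0, 2, 200)
X_unlab = rng.normal(size=(1000, 5))         # attributes missing here

proxy = LogisticRegression().fit(X_lab, a_lab)
proba = proxy.predict_proba(X_unlab)

# Predictive entropy as the uncertainty measure.
entropy = -(proba * np.log(proba + 1e-12)).sum(axis=1)
confident = entropy < np.quantile(entropy, 0.5)   # toy threshold

a_hat = proxy.predict(X_unlab)
# Downstream, a fairness penalty (e.g. a demographic-parity gap) would be
# computed only over X_unlab[confident] using the inferred a_hat[confident].
```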

Adaptive Certified Training: Towards Better Accuracy-Robustness Tradeoffs

  • paper_url: http://arxiv.org/abs/2307.13078
  • repo_url: None
  • paper_authors: Zhakshylyk Nurlanov, Frank R. Schmidt, Florian Bernard
  • for: 本研究旨在提高深度学习模型的可靠性,尤其是在实际应用中。
  • methods: 我们提出了一种基于适应证明半径的新训练方法,以提高模型的标准准确率和鲁棒性。
  • results: 我们在MNIST、CIFAR-10和TinyImageNet datasets上进行了实验,并证明了我们的方法可以提高模型的鲁棒性,并且在标准准确率保持不变的情况下提高模型的鲁棒性。特别是在CIFAR-10和TinyImageNet上,我们的方法可以提高模型的鲁棒性至多两倍,并且在同等标准准确率水平下。
    Abstract As deep learning models continue to advance and are increasingly utilized in real-world systems, the issue of robustness remains a major challenge. Existing certified training methods produce models that achieve high provable robustness guarantees at certain perturbation levels. However, the main problem of such models is a dramatically low standard accuracy, i.e. accuracy on clean unperturbed data, that makes them impractical. In this work, we consider a more realistic perspective of maximizing the robustness of a model at certain levels of (high) standard accuracy. To this end, we propose a novel certified training method based on a key insight that training with adaptive certified radii helps to improve both the accuracy and robustness of the model, advancing state-of-the-art accuracy-robustness tradeoffs. We demonstrate the effectiveness of the proposed method on MNIST, CIFAR-10, and TinyImageNet datasets. Particularly, on CIFAR-10 and TinyImageNet, our method yields models with up to two times higher robustness, measured as an average certified radius of a test set, at the same levels of standard accuracy compared to baseline approaches.
    摘要 随着深度学习模型不断发展并日益应用于现实系统,鲁棒性问题仍是一大挑战。现有的认证训练方法可以得到在特定扰动水平下具有较高可证明鲁棒性保证的模型,但其主要问题是标准准确率(即在干净、未扰动数据上的准确率)非常低,导致模型不实用。本文从更现实的角度出发,考虑在保持(较高)标准准确率的前提下最大化模型的鲁棒性。为此,我们提出了一种基于自适应认证半径的新型认证训练方法,其关键洞见是:使用自适应认证半径进行训练有助于同时提升模型的准确率和鲁棒性,从而推进最先进的准确率-鲁棒性权衡。我们在 MNIST、CIFAR-10 和 TinyImageNet 数据集上验证了该方法的有效性。特别是在 CIFAR-10 和 TinyImageNet 上,在相同标准准确率水平下,我们的方法得到的模型的鲁棒性(以测试集平均认证半径衡量)比基线方法最高提升两倍。

LLM-Rec: Personalized Recommendation via Prompting Large Language Models

  • paper_url: http://arxiv.org/abs/2307.15780
  • repo_url: None
  • paper_authors: Hanjia Lyu, Song Jiang, Hanqing Zeng, Qifan Wang, Si Zhang, Ren Chen, Chris Leung, Jiajie Tang, Yinglong Xia, Jiebo Luo
  • for: 提高个性化推荐性能
  • methods: 使用大语言模型(LLM)输入增强strategies,包括基本提示、推荐驱动提示、参与度引导提示和推荐驱动+参与度引导提示
  • results: 结合LLM生成的增强输入文本后,个性化推荐性能得到提高;推荐驱动和参与度引导提示策略能够引导LLM理解全局和局部的物品特征。
    Abstract We investigate various prompting strategies for enhancing personalized recommendation performance with large language models (LLMs) through input augmentation. Our proposed approach, termed LLM-Rec, encompasses four distinct prompting strategies: (1) basic prompting, (2) recommendation-driven prompting, (3) engagement-guided prompting, and (4) recommendation-driven + engagement-guided prompting. Our empirical experiments show that incorporating the augmented input text generated by LLM leads to improved recommendation performance. Recommendation-driven and engagement-guided prompting strategies are found to elicit LLM's understanding of global and local item characteristics. This finding highlights the importance of leveraging diverse prompts and input augmentation techniques to enhance the recommendation capabilities with LLMs.
    摘要 我们研究了多种提示策略,以通过输入增强提升大语言模型(LLM)的个性化推荐性能。我们提出的方法称为LLM-Rec,包括四种不同的提示策略:(1)基础提示,(2)推荐驱动提示,(3)参与度引导提示,以及(4)推荐驱动+参与度引导提示。我们的实验表明,将LLM生成的增强输入文本整合到推荐系统中可以提高推荐性能。推荐驱动和参与度引导提示策略能够引导LLM理解全局和局部的物品特征。这一发现强调了利用多样化提示和输入增强技术来提升LLM推荐能力的重要性。
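
The four prompting strategies are easy to picture as templates over an item description; the wording below is illustrative, not the paper's exact prompts:

```python
item = "A cozy mystery novel set in a small coastal town."

basic = f"Describe this item: {item}"
rec_driven = f"Describe this item so it can be recommended to users: {item}"
engagement_guided = (
    f"Considering items that users often engage with together, "
    f"summarize what this item offers: {item}"
)
rec_plus_engagement = (
    f"Describe this item for recommendation, highlighting aspects that "
    f"similar, frequently co-engaged items share: {item}"
)
# Each LLM-augmented text is fed to the recommender alongside the original
# description, which is what improves performance in the paper's experiments.
```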

Personalized Category Frequency prediction for Buy It Again recommendations

  • paper_url: http://arxiv.org/abs/2308.01195
  • repo_url: None
  • paper_authors: Amit Pande, Kunal Ghosh, Rankyung Park
  • for: 这个论文的目的是提出一种基于个性化类别的推荐系统,以帮助零售商提高用户体验和网站参与度。
  • methods: 该论文提出了一种叫做层次PCIC模型,它包括个性化类别模型(PC模型)和个性化类别下的项模型(IC模型)。PC模型生成了个性化的类别列表,IC模型排名类别下的项目。这个模型使用生存分布模型和时间序列模型来捕捉产品的通用消耗率和趋势。
  • results: 相比十二个基准模型,PCIC提高了NDCG达16%,同时提高了回归率约2%。PCIC可以在大规模数据集上进行批量训练(耗时8个小时),并在一家大型零售商的官方网站上进行AB测试,导致用户参与度得到了显著提高。
    Abstract Buy It Again (BIA) recommendations are crucial to retailers to help improve user experience and site engagement by suggesting items that customers are likely to buy again based on their own repeat purchasing patterns. Most existing BIA studies analyze guests' personalized behavior at item granularity. A category-based model may be more appropriate in such scenarios. We propose a recommendation system called a hierarchical PCIC model that consists of a personalized category model (PC model) and a personalized item model within categories (IC model). The PC model generates a personalized list of categories that customers are likely to purchase again. The IC model ranks items within categories that guests are likely to consume within a category. The hierarchical PCIC model captures the general consumption rate of products using survival models. Trends in consumption are captured using time series models. Features derived from these models are used in training a category-grained neural network. We compare PCIC to twelve existing baselines on four standard open datasets. PCIC improves NDCG up to 16 percent while improving recall by around 2 percent. We were able to scale and train (over 8 hours) PCIC on a large dataset of 100M guests and 3M items, where repeat categories of a guest outnumber repeat items. PCIC was deployed and AB tested on the site of a major retailer, leading to significant gains in guest engagement.
    摘要 Buy It Again(BIA)建议对零售商非常重要,可以帮助提高用户体验和网站参与度,通过建议客户可能会再次购买的商品,基于客户的重复购买模式。大多数现有的BIA研究分析客人个性化行为的项目粒度。我们提出了一种推荐系统,即层次PCIC模型,它包括个性化类别模型(PC模型)和个性化类别内项模型(IC模型)。PC模型生成了客户可能会购买的个性化类别列表。IC模型在类别内排名客户可能会消耗的项目。层次PCIC模型捕捉了产品的总消耗率,使用生存模型记录时间序列模型。这些模型中的特征被用于训练类别粒度的神经网络。我们与12个基准模型进行比较,PCIC提高了NDCG达16%,同时提高了回归率约2%。我们可以在8小时内扩展和训练PCIC模型(100万客户和300万项目),并在一家大型零售商的官方网站上部署PCIC模型,导致用户参与度显著增长。

Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation

  • paper_url: http://arxiv.org/abs/2307.12983
  • repo_url: https://github.com/Improbable-AI/pql
  • paper_authors: Zechu Li, Tao Chen, Zhang-Wei Hong, Anurag Ajay, Pulkit Agrawal
  • for: 这个论文是为了提高复杂任务的强化学习效率,特别是利用高性能的GPU加速器进行数据采集和训练。
  • methods: 这个论文使用了并行$Q$-学习(PQL)算法,该算法可以并行采集数据、学习策略和价值函数,从而提高强化学习的效率。
  • results: 该论文通过实验表明,使用PQL算法可以在短短的wall-clock时间内完成复杂任务的强化学习训练,并且能够保持偏离策略的数据效率。此外,论文还 investigate了强化学习学习速度的关键因素。
    Abstract Reinforcement learning is time-consuming for complex tasks due to the need for large amounts of training data. Recent advances in GPU-based simulation, such as Isaac Gym, have sped up data collection thousands of times on a commodity GPU. Most prior works used on-policy methods like PPO due to their simplicity and ease of scaling. Off-policy methods are more data efficient but challenging to scale, resulting in a longer wall-clock training time. This paper presents a Parallel $Q$-Learning (PQL) scheme that outperforms PPO in wall-clock time while maintaining superior sample efficiency of off-policy learning. PQL achieves this by parallelizing data collection, policy learning, and value learning. Different from prior works on distributed off-policy learning, such as Apex, our scheme is designed specifically for massively parallel GPU-based simulation and optimized to work on a single workstation. In experiments, we demonstrate that $Q$-learning can be scaled to \textit{tens of thousands of parallel environments} and investigate important factors affecting learning speed. The code is available at https://github.com/Improbable-AI/pql.
    摘要 由于复杂任务需要大量训练数据,强化学习十分耗时。基于 GPU 的仿真(如 Isaac Gym)的最新进展使数据收集在普通 GPU 上加速了数千倍。以往工作大多因其简单和易于扩展而采用 PPO 等同策略方法。异策略方法的数据效率更高,但难以扩展,导致更长的真实训练时间。本文提出了并行 Q 学习(PQL)方案,在保持异策略学习更优样本效率的同时,在真实时间上优于 PPO。PQL 通过并行化数据收集、策略学习和价值学习来实现这一点。与 Apex 等以往分布式异策略学习工作不同,我们的方案专为大规模并行 GPU 仿真设计,并针对单台工作站进行了优化。实验表明,Q 学习可以扩展到数万个并行环境,我们还考察了影响学习速度的重要因素。代码见 https://github.com/Improbable-AI/pql 。
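
The parallelism can be illustrated with vectorized tabular Q-learning over thousands of toy environments at once; this is a schematic of the data-collection side, not the PQL algorithm itself:

```python
import numpy as np

N, S, A = 4096, 16, 4                      # parallel envs, states, actions
rng = np.random.default_rng(0)
P = rng.integers(0, S, size=(S, A))        # toy deterministic transitions
R = rng.normal(size=(S, A))                # toy rewards
Q = np.zeros((S, A))
s = rng.integers(0, S, size=N)             # one state per parallel env

alpha, gamma, eps = 0.1, 0.99, 0.1
for _ in range(1000):
    # Vectorized epsilon-greedy action selection across all N envs.
    a = np.where(rng.random(N) < eps, rng.integers(0, A, N), Q[s].argmax(1))
    s2, r = P[s, a], R[s, a]
    # One TD update per env, applied in a single vectorized step.
    np.add.at(Q, (s, a), alpha * (r + gamma * Q[s2].max(1) - Q[s, a]))
    s = s2
```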

3D-LLM: Injecting the 3D World into Large Language Models

  • paper_url: http://arxiv.org/abs/2307.12981
  • repo_url: https://github.com/UMass-Foundation-Model/3D-LLM
  • paper_authors: Yining Hong, Haoyu Zhen, Peihao Chen, Shuhong Zheng, Yilun Du, Zhenfang Chen, Chuang Gan
  • for: 这篇论文提出了一种新的3D语言模型家族(3D-LLM),可以将3D点云及其特征作为输入,并执行多种3D相关任务。
  • methods: 论文使用三种提示机制收集了超过30万条3D-语言数据,覆盖图像描述、密集描述、3D问答、任务分解、3D定位、3D辅助对话、导航等任务;利用3D特征提取器从渲染的多视图图像中获取3D特征,以2D VLM为骨干训练3D-LLM,并引入3D定位机制以更好地捕捉3D空间信息。
  • results: 提出的3D-LLM在ScanQA数据集上超过最先进基线,BLEU-1分数高出9%;在自建数据集上的3D描述、任务组合和3D辅助对话任务中优于2D VLM,定性示例还表明模型能完成现有LLM和VLM能力范围之外的任务。
    Abstract Large language models (LLMs) and Vision-Language Models (VLMs) have been proven to excel at multiple tasks, such as commonsense reasoning. Powerful as these models can be, they are not grounded in the 3D physical world, which involves richer concepts such as spatial relationships, affordances, physics, layout, and so on. In this work, we propose to inject the 3D world into large language models and introduce a whole new family of 3D-LLMs. Specifically, 3D-LLMs can take 3D point clouds and their features as input and perform a diverse set of 3D-related tasks, including captioning, dense captioning, 3D question answering, task decomposition, 3D grounding, 3D-assisted dialog, navigation, and so on. Using three types of prompting mechanisms that we design, we are able to collect over 300k 3D-language data covering these tasks. To efficiently train 3D-LLMs, we first utilize a 3D feature extractor that obtains 3D features from rendered multi- view images. Then, we use 2D VLMs as our backbones to train our 3D-LLMs. By introducing a 3D localization mechanism, 3D-LLMs can better capture 3D spatial information. Experiments on ScanQA show that our model outperforms state-of-the-art baselines by a large margin (e.g., the BLEU-1 score surpasses state-of-the-art score by 9%). Furthermore, experiments on our held-in datasets for 3D captioning, task composition, and 3D-assisted dialogue show that our model outperforms 2D VLMs. Qualitative examples also show that our model could perform more tasks beyond the scope of existing LLMs and VLMs. Project Page: : https://vis-www.cs.umass.edu/3dllm/.
    摘要 大型语言模型(LLM)和视觉语言模型(VLM)已被证明能在常识推理等多种任务上表现出色。然而,这些模型再强大,也没有扎根于3D物理世界,而后者涉及空间关系、可供性、物理、布局等更丰富的概念。在这项工作中,我们提议将3D世界注入大型语言模型,并提出全新的3D-LLM模型家族。具体而言,3D-LLM可以将3D点云及其特征作为输入,执行一系列3D相关任务,包括图像描述、密集描述、3D问答、任务分解、3D定位、3D辅助对话、导航等。通过我们设计的三种提示机制,我们收集了覆盖这些任务的超过30万条3D-语言数据。为高效训练3D-LLM,我们首先利用3D特征提取器从渲染的多视图图像中获取3D特征,然后以2D VLM作为骨干训练3D-LLM。通过引入3D定位机制,3D-LLM能更好地捕捉3D空间信息。ScanQA实验结果显示,我们的模型大幅超过最先进基线(例如BLEU-1分数高出最先进水平9%)。此外,在我们自建的3D描述、任务组合和3D辅助对话数据集上的实验表明,我们的模型优于2D VLM。定性示例也表明,我们的模型能完成现有LLM和VLM能力范围之外的更多任务。项目页面:https://vis-www.cs.umass.edu/3dllm/ 。

A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2307.12968
  • repo_url: https://github.com/ben-eysenbach/ac-connection
  • paper_authors: Benjamin Eysenbach, Matthieu Geist, Sergey Levine, Ruslan Salakhutdinov
  • for: 这个论文的目的是解释一些离线RL算法的正则化方法,以提高其性能。
  • methods: 这个论文使用了一些常用的离线RL算法,如CQL和一步RL,并对它们进行了正则化。
  • results: 研究发现,使用正则化系数为1的多步critic正则化方法得到的策略与一步RL相同。一步RL简单稳定,但可能限制渐近性能;critic正则化通常需要更多计算资源,但具有吸引人的下界保证。实验验证了该分析对实际离线RL方法(CQL和一步RL)做出的可检验预测。
    Abstract As with any machine learning problem with limited data, effective offline RL algorithms require careful regularization to avoid overfitting. One-step methods perform regularization by doing just a single step of policy improvement, while critic regularization methods do many steps of policy improvement with a regularized objective. These methods appear distinct. One-step methods, such as advantage-weighted regression and conditional behavioral cloning, truncate policy iteration after just one step. This "early stopping" makes one-step RL simple and stable, but can limit its asymptotic performance. Critic regularization typically requires more compute but has appealing lower-bound guarantees. In this paper, we draw a close connection between these methods: applying a multi-step critic regularization method with a regularization coefficient of 1 yields the same policy as one-step RL. While practical implementations violate our assumptions and critic regularization is typically applied with smaller regularization coefficients, our experiments nevertheless show that our analysis makes accurate, testable predictions about practical offline RL methods (CQL and one-step RL) with commonly-used hyperparameters. Our results do not imply that every problem can be solved with a single step of policy improvement, but rather that one-step RL might be competitive with critic regularization on RL problems that demand strong regularization.
    摘要 “与有限数据的机器学习问题相似,有效的离线RL算法需要仔细的规则化以避免过拟合。一步方法在做出一步策略改进后就结束,而批处规则化方法则在多个步骤策略改进中使用规则化目标。这些方法看起来很不同。一步方法,如偏好权重回归和 conditional behavioral cloning,在做出一步策略改进后就结束。这种``早期停止''使得一步RL简单和稳定,但可能限制其极限性能。批处规则化通常需要更多的计算资源,但它具有吸引人的下界保证。在这篇论文中,我们将一步和批处规则化方法之间 Draw a close connection:在应用多步批处规则化方法时,使用规则化系数为1就等于一步RL。虽然实践中的假设不符合我们的假设,但我们的实验表明,我们的分析对实际的离线RL方法(CQL和一步RL)的实现进行了准确和可靠的预测。我们的结果表明,每个问题都可以通过一步策略改进来解决,但是一步RL可能与批处规则化在RL问题上具有强规则化的情况下竞争。”

Enhancing image captioning with depth information using a Transformer-based framework

  • paper_url: http://arxiv.org/abs/2308.03767
  • repo_url: None
  • paper_authors: Aya Mahmoud Ahmed, Mohamed Yousef, Khaled F. Hussain, Yousef Bassyouni Mahdy
  • for: 本文研究将深度信息与 RGB 图像结合是否能增强图像描述任务并生成更好的描述。
  • methods: 我们提出了一个基于 Transformer 架构的框架,对 RGB 图像及其对应的深度图进行联合描述。
  • results: 我们在 NYU-v2 和 Stanford 图像段落描述数据集上实现了描述性能的提升,并提出了一个更正版的 NYU-v2 数据集。
    Abstract Captioning images is a challenging scene-understanding task that connects computer vision and natural language processing. While image captioning models have been successful in producing excellent descriptions, the field has primarily focused on generating a single sentence for 2D images. This paper investigates whether integrating depth information with RGB images can enhance the captioning task and generate better descriptions. For this purpose, we propose a Transformer-based encoder-decoder framework for generating a multi-sentence description of a 3D scene. The RGB image and its corresponding depth map are provided as inputs to our framework, which combines them to produce a better understanding of the input scene. Depth maps could be ground truth or estimated, which makes our framework widely applicable to any RGB captioning dataset. We explored different fusion approaches to fuse RGB and depth images. The experiments are performed on the NYU-v2 dataset and the Stanford image paragraph captioning dataset. During our work with the NYU-v2 dataset, we found inconsistent labeling that prevents the benefit of using depth information to enhance the captioning task. The results were even worse than using RGB images only. As a result, we propose a cleaned version of the NYU-v2 dataset that is more consistent and informative. Our results on both datasets demonstrate that the proposed framework effectively benefits from depth information, whether it is ground truth or estimated, and generates better captions. Code, pre-trained models, and the cleaned version of the NYU-v2 dataset will be made publically available.
    摘要 为图像生成描述是一项连接计算机视觉与自然语言处理的、具有挑战性的场景理解任务。尽管图像描述模型已能生成出色的描述,该领域仍主要集中在为2D图像生成单个句子。本文研究将深度信息与RGB图像结合是否能增强描述任务并生成更好的描述。为此,我们提出了一个基于Transformer的编码器-解码器框架,用于生成3D场景的多句描述。RGB图像及其对应的深度图作为框架输入,两者结合以更好地理解输入场景。深度图可以是真实值或估计值,这使我们的框架可广泛应用于任何RGB描述数据集。我们探索了融合RGB与深度图像的不同方法。实验在NYU-v2数据集和Stanford图像段落描述数据集上进行。在使用NYU-v2数据集的过程中,我们发现其标注不一致,妨碍了利用深度信息增强描述任务,结果甚至比仅使用RGB图像更差。因此,我们提出了一个更一致、更有信息量的NYU-v2清洗版本。在两个数据集上的结果表明,无论深度信息是真实值还是估计值,所提框架都能有效地从中受益并生成更好的描述。代码、预训练模型和NYU-v2清洗版本将公开发布。

RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment

  • paper_url: http://arxiv.org/abs/2307.12950
  • repo_url: https://github.com/facebookresearch/rlcd
  • paper_authors: Kevin Yang, Dan Klein, Asli Celikyilmaz, Nanyun Peng, Yuandong Tian
  • for: Develops a method for aligning language models to follow natural language principles without human feedback
  • methods: Trains a preference model on simulated preference pairs, each containing a higher-quality and a lower-quality example generated from contrasting positive and negative prompts, then improves the base unaligned language model via reinforcement learning
  • results: Outperforms RLAIF (Bai et al., 2022b) and context distillation (Huang et al., 2022) baselines across three diverse alignment tasks (harmlessness, helpfulness, and story outline generation), at both 7B and 30B model scales for preference data simulation
    Abstract We propose Reinforcement Learning from Contrast Distillation (RLCD), a method for aligning language models to follow natural language principles without using human feedback. RLCD trains a preference model using simulated preference pairs that contain both a high-quality and low-quality example, generated using contrasting positive and negative prompts. The preference model is then used to improve a base unaligned language model via reinforcement learning. Empirically, RLCD outperforms RLAIF (Bai et al., 2022b) and context distillation (Huang et al., 2022) baselines across three diverse alignment tasks--harmlessness, helpfulness, and story outline generation--and on both 7B and 30B model scales for preference data simulation.
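
The core data-generation step is simple enough to sketch. Below is a minimal, hypothetical illustration of simulating one preference pair from contrasting prompts; the prompt wording, the `generate` callable, and the pair format are assumptions, not the released facebookresearch/rlcd code. The resulting (chosen, rejected) pairs would train the preference model that later supplies rewards for RL fine-tuning.

```python
def simulate_preference_pair(generate, instruction):
    """`generate` is any text-generation callable, e.g. an LLM API.
    The positive prompt is labeled preferred by construction, so no
    human (or AI) judge is needed to score the pair."""
    positive = f"(helpful, honest response) {instruction}"
    negative = f"(unhelpful, evasive response) {instruction}"
    o_plus = generate(positive)     # preferred continuation
    o_minus = generate(negative)    # dispreferred continuation
    return {"prompt": instruction, "chosen": o_plus, "rejected": o_minus}

# Toy stand-in generator so the sketch runs end to end.
fake_llm = lambda p: f"[completion conditioned on: {p!r}]"
pair = simulate_preference_pair(fake_llm, "How do I fix a flat tire?")
print(pair["chosen"])
```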

On Privileged and Convergent Bases in Neural Network Representations

  • paper_url: http://arxiv.org/abs/2307.12941
  • repo_url: None
  • paper_authors: Davis Brown, Nikhil Vyas, Yamini Bansal
  • for: Investigates whether the representations learned by neural networks possess a privileged and convergent basis
  • methods: Examines the significance of the feature directions represented by individual neurons, comparing the bases of networks trained with identical parameters but different random initializations
  • results: Finds that neural representations are not fully rotation-invariant, and that even wide networks trained from different initializations do not converge to a unique basis; basis correlation increases significantly when a few early layers are frozen identically
    Abstract In this study, we investigate whether the representations learned by neural networks possess a privileged and convergent basis. Specifically, we examine the significance of feature directions represented by individual neurons. First, we establish that arbitrary rotations of neural representations cannot be inverted (unlike linear networks), indicating that they do not exhibit complete rotational invariance. Subsequently, we explore the possibility of multiple bases achieving identical performance. To do this, we compare the bases of networks trained with the same parameters but with varying random initializations. Our study reveals two findings: (1) Even in wide networks such as WideResNets, neural networks do not converge to a unique basis; (2) Basis correlation increases significantly when a few early layers of the network are frozen identically. Furthermore, we analyze Linear Mode Connectivity, which has been studied as a measure of basis correlation. Our findings give evidence that while Linear Mode Connectivity improves with increased network width, this improvement is not due to an increase in basis correlation.
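
A minimal sketch of the kind of measurement involved (illustrative; the paper's exact metric may differ): correlate each neuron of one trained network with its best-matching neuron in an independently trained one on a shared probe set. Identical bases score near 1, while a randomly rotated copy of the same representation scores much lower, which is the sense in which a basis can be "privileged".

```python
import numpy as np

def basis_correlation(acts_a, acts_b):
    """Match each neuron in network A to its best-correlated neuron in
    network B over a shared probe set, and report the mean correlation.
    acts_*: (num_examples, num_neurons) activations on identical inputs."""
    a = (acts_a - acts_a.mean(0)) / (acts_a.std(0) + 1e-8)
    b = (acts_b - acts_b.mean(0)) / (acts_b.std(0) + 1e-8)
    corr = a.T @ b / len(a)                   # Pearson correlation matrix
    return np.abs(corr).max(axis=1).mean()    # best match per A-neuron

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 64))
print(basis_correlation(x, x))                               # ~1.0: same basis
print(basis_correlation(x, x @ rng.normal(size=(64, 64))))   # rotated: lower
```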

Rule By Example: Harnessing Logical Rules for Explainable Hate Speech Detection

  • paper_url: http://arxiv.org/abs/2307.12935
  • repo_url: https://github.com/chrisisking/rule-by-example
  • paper_authors: Christopher Clarke, Matthew Hall, Gaurav Mittal, Ye Yu, Sandra Sajeev, Jason Mars, Mei Chen
  • for: Addresses the challenge of modern online content moderation, where deep models outperform rule-based heuristics but lack the transparency and explainability that rules provide
  • methods: Proposes Rule By Example (RBE), a novel exemplar-based contrastive learning approach that learns rich embedding representations from logical rules
  • results: Outperforms state-of-the-art deep learning classifiers as well as rule-based approaches in both supervised and unsupervised settings on 3 popular hate speech classification datasets, while providing explainable model predictions via rule-grounding
    Abstract Classic approaches to content moderation typically apply a rule-based heuristic approach to flag content. While rules are easily customizable and intuitive for humans to interpret, they are inherently fragile and lack the flexibility or robustness needed to moderate the vast amount of undesirable content found online today. Recent advances in deep learning have demonstrated the promise of using highly effective deep neural models to overcome these challenges. However, despite the improved performance, these data-driven models lack transparency and explainability, often leading to mistrust from everyday users and a lack of adoption by many platforms. In this paper, we present Rule By Example (RBE): a novel exemplar-based contrastive learning approach for learning from logical rules for the task of textual content moderation. RBE is capable of providing rule-grounded predictions, allowing for more explainable and customizable predictions compared to typical deep learning-based approaches. We demonstrate that our approach is capable of learning rich rule embedding representations using only a few data examples. Experimental results on 3 popular hate speech classification datasets show that RBE is able to outperform state-of-the-art deep learning classifiers as well as the use of rules in both supervised and unsupervised settings while providing explainable model predictions via rule-grounding.
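
As a hedged sketch of exemplar-based contrast with rules (the pairing scheme and sizes are assumptions, not the released rule-by-example code): embed texts and logical rules in a shared space, pull each text toward the rule that covers it, and classify at inference by the nearest rule embedding, which simultaneously names the rule that grounds the prediction.

```python
import torch
import torch.nn.functional as F

def rule_grounding_loss(text_emb, rule_emb, rule_labels, tau=0.1):
    """Contrastive sketch: each text embedding is attracted to the
    embedding of the rule that covers it and repelled from the others.
    text_emb: (B, d); rule_emb: (R, d); rule_labels: (B,) rule indices."""
    logits = F.normalize(text_emb, dim=-1) @ F.normalize(rule_emb, dim=-1).T
    return F.cross_entropy(logits / tau, rule_labels)

text = torch.randn(8, 128, requires_grad=True)   # stand-in text encoder output
rules = torch.randn(5, 128)                      # stand-in rule encoder output
loss = rule_grounding_loss(text, rules, torch.randint(0, 5, (8,)))
loss.backward()
print(float(loss))
```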

Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning

  • paper_url: http://arxiv.org/abs/2307.12933
  • repo_url: None
  • paper_authors: Chuming Li, Ruonan Jia, Jie Liu, Yinmin Zhang, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang
  • for: Proposes a model-based reinforcement learning algorithm that improves the efficiency of continuous control tasks
  • methods: Extends the policy improvement step of Soft Actor-Critic by distilling optimized action sequences from model-based planning into the policy, updating the policy jointly over multiple future time steps
  • results: The resulting algorithm, MPDP, achieves better sample efficiency and asymptotic performance than both model-free and model-based planning algorithms on six MuJoCo continuous control benchmark tasks
    Abstract Model-based reinforcement learning (RL) has demonstrated remarkable successes on a range of continuous control tasks due to its high sample efficiency. To save the computation cost of conducting planning online, recent practices tend to distill optimized action sequences into an RL policy during the training phase. Although the distillation can incorporate both the foresight of planning and the exploration ability of RL policies, the theoretical understanding of these methods is yet unclear. In this paper, we extend the policy improvement step of Soft Actor-Critic (SAC) by developing an approach to distill from model-based planning to the policy. We then demonstrate that such an approach of policy improvement has a theoretical guarantee of monotonic improvement and convergence to the maximum value defined in SAC. We discuss effective design choices and implement our theory as a practical algorithm -- Model-based Planning Distilled to Policy (MPDP) -- that updates the policy jointly over multiple future time steps. Extensive experiments show that MPDP achieves better sample efficiency and asymptotic performance than both model-free and model-based planning algorithms on six continuous control benchmark tasks in MuJoCo.
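
A minimal sketch of distilling planned action sequences into a policy over multiple future steps. Everything here, including the toy dynamics model and the squared-error distillation loss, is an illustrative stand-in rather than the MPDP algorithm: the policy is regressed onto the planner's optimized actions at each step of an imagined rollout, so a single update touches several future time steps jointly.

```python
import torch
import torch.nn as nn

def distill_plan_to_policy(policy, model, planner, s0, horizon=5):
    """policy: s -> action; model: (s, a) -> next state;
    planner: s0 -> (horizon, act_dim) optimized action sequence."""
    plan = planner(s0)
    s, loss = s0, 0.0
    for t in range(horizon):
        loss = loss + ((policy(s) - plan[t]) ** 2).mean()  # multi-step loss
        s = model(s, plan[t])                              # imagined rollout
    return loss / horizon

state_dim, act_dim = 4, 2
policy = nn.Sequential(nn.Linear(state_dim, 32), nn.Tanh(), nn.Linear(32, act_dim))
model = lambda s, a: s + 0.1 * torch.cat([a, -a], dim=-1)  # toy dynamics
planner = lambda s: torch.zeros(5, act_dim)                # toy plan
loss = distill_plan_to_policy(policy, model, planner, torch.randn(3, state_dim))
loss.backward()   # gradients flow to the policy at every rollout step
```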

Contextual Bandits and Imitation Learning via Preference-Based Active Queries

  • paper_url: http://arxiv.org/abs/2307.12926
  • repo_url: None
  • paper_authors: Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu
  • for: Studies contextual bandits and imitation learning where the learner receives no direct reward for executed actions and can instead actively query an expert each round to compare two actions and obtain noisy preference feedback; the goal is to minimize both the regret of the executed actions and the number of comparison queries
  • methods: Proposes an algorithm that leverages an online regression oracle over a function class representing the expert's preference model, using it both to select actions and to decide when to query. In the contextual bandit setting it achieves a regret bound of $O(\min\{\sqrt{T}, d/\Delta\})$, where $T$ is the number of interactions, $d$ is the eluder dimension of the function class, and $\Delta$ is the minimum preference of the optimal action over any suboptimal action across all contexts; the algorithm requires no knowledge of $\Delta$, matches the regret achievable in the standard contextual bandit setting with observed rewards, and makes only $O(\min\{T, d^2/\Delta^2\})$ expert queries
  • results: Extends these regret and query-complexity guarantees to the imitation learning setting, where the algorithm can even learn to outperform a suboptimal expert, highlighting a practical advantage of preference-based feedback
    Abstract We consider the problem of contextual bandits and imitation learning, where the learner lacks direct knowledge of the executed action's reward. Instead, the learner can actively query an expert at each round to compare two actions and receive noisy preference feedback. The learner's objective is two-fold: to minimize the regret associated with the executed actions, while simultaneously, minimizing the number of comparison queries made to the expert. In this paper, we assume that the learner has access to a function class that can represent the expert's preference model under appropriate link functions, and provide an algorithm that leverages an online regression oracle with respect to this function class for choosing its actions and deciding when to query. For the contextual bandit setting, our algorithm achieves a regret bound that combines the best of both worlds, scaling as $O(\min\{\sqrt{T}, d/\Delta\})$, where $T$ represents the number of interactions, $d$ represents the eluder dimension of the function class, and $\Delta$ represents the minimum preference of the optimal action over any suboptimal action under all contexts. Our algorithm does not require the knowledge of $\Delta$, and the obtained regret bound is comparable to what can be achieved in the standard contextual bandits setting where the learner observes reward signals at each round. Additionally, our algorithm makes only $O(\min\{T, d^2/\Delta^2\})$ queries to the expert. We then extend our algorithm to the imitation learning setting, where the learning agent engages with an unknown environment in episodes of length $H$ each, and provide similar guarantees for regret and query complexity. Interestingly, our algorithm for imitation learning can even learn to outperform the underlying expert, when it is suboptimal, highlighting a practical benefit of preference-based feedback in imitation learning.
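
A hedged sketch of the action/query trade-off. The ensemble stand-in for the online-regression version space and the disagreement threshold are assumptions, not the paper's algorithm: act greedily for free when all plausible preference models agree on the best action, and spend an expert comparison only when they disagree.

```python
import numpy as np

rng = np.random.default_rng(0)

def choose_and_maybe_query(ensemble, context, actions, expert, margin=0.1):
    """ensemble: list of f(context, action) -> estimated preference score.
    Returns (chosen action, whether the expert was queried)."""
    scores = np.array([[f(context, a) for a in actions] for f in ensemble])
    mean = scores.mean(0)
    best, runner_up = mean.argmax(), mean.argsort()[-2]
    gap = scores[:, best] - scores[:, runner_up]
    if gap.min() < margin:                 # version space disagrees: query
        return expert(actions[best], actions[runner_up]), True
    return actions[best], False            # confident: no expert call

# Toy linear preference models; each lambda draws its own weight vector.
ensemble = [lambda c, a, w=rng.normal(size=3): w @ (c * a) for _ in range(5)]
ctx, acts = rng.normal(size=3), [rng.normal(size=3) for _ in range(4)]
expert = lambda a, b: a                    # stand-in preference oracle
print(choose_and_maybe_query(ensemble, ctx, acts, expert))
```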

Hierarchical Skeleton Meta-Prototype Contrastive Learning with Hard Skeleton Mining for Unsupervised Person Re-Identification

  • paper_url: http://arxiv.org/abs/2307.12917
  • repo_url: https://github.com/kali-hac/hi-mpc
  • paper_authors: Haocong Rao, Cyril Leung, Chunyan Miao
  • for: Proposes an unsupervised skeleton-based person re-identification (re-ID) method, Hierarchical skeleton Meta-Prototype Contrastive learning (Hi-MPC) with Hard Skeleton Mining (HSM), for person re-ID with unlabeled 3D skeletons
  • methods: First constructs hierarchical representations that model coarse-to-fine body and motion features at the joint, component, and limb levels; then clusters and contrasts the most typical skeleton features ("prototypes") across levels via hierarchical meta-prototype contrastive learning, and devises a hard skeleton mining mechanism to adaptively infer the informative importance of each skeleton
  • results: Extensive evaluations on five datasets show the approach outperforms state-of-the-art skeleton-based methods and generalizes to cross-view person re-ID and RGB-based scenarios with estimated skeletons
    Abstract With rapid advancements in depth sensors and deep learning, skeleton-based person re-identification (re-ID) models have recently achieved remarkable progress with many advantages. Most existing solutions learn single-level skeleton features from body joints with the assumption of equal skeleton importance, while they typically lack the ability to exploit more informative skeleton features from various levels such as limb level with more global body patterns. The label dependency of these methods also limits their flexibility in learning more general skeleton representations. This paper proposes a generic unsupervised Hierarchical skeleton Meta-Prototype Contrastive learning (Hi-MPC) approach with Hard Skeleton Mining (HSM) for person re-ID with unlabeled 3D skeletons. Firstly, we construct hierarchical representations of skeletons to model coarse-to-fine body and motion features from the levels of body joints, components, and limbs. Then a hierarchical meta-prototype contrastive learning model is proposed to cluster and contrast the most typical skeleton features ("prototypes") from different-level skeletons. By converting original prototypes into meta-prototypes with multiple homogeneous transformations, we induce the model to learn the inherent consistency of prototypes to capture more effective skeleton features for person re-ID. Furthermore, we devise a hard skeleton mining mechanism to adaptively infer the informative importance of each skeleton, so as to focus on harder skeletons to learn more discriminative skeleton representations. Extensive evaluations on five datasets demonstrate that our approach outperforms a wide variety of state-of-the-art skeleton-based methods. We further show the general applicability of our method to cross-view person re-ID and RGB-based scenarios with estimated skeletons.
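
The sketch below illustrates one level of prototype contrast with a hard-mining weight; the weighting rule and dimensions are assumptions for illustration, not the released hi-mpc code. Features are contrasted against cluster prototypes, and skeletons far from their own prototype receive larger loss weights, so the encoder focuses on harder samples.

```python
import torch
import torch.nn.functional as F

def hard_mined_prototype_loss(feats, proto, labels, tau=0.07):
    """feats: (B, d) skeleton features; proto: (K, d) cluster prototypes;
    labels: (B,) cluster assignments. Each feature is contrasted against
    all prototypes, with harder samples (low cosine similarity to their
    own prototype) up-weighted."""
    logits = F.normalize(feats, dim=-1) @ F.normalize(proto, dim=-1).T / tau
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    with torch.no_grad():
        hardness = 1 - F.cosine_similarity(feats, proto[labels])   # in [0, 2]
        weights = hardness / (hardness.sum() + 1e-8) * len(feats)  # mean ~1
    return (weights * per_sample).mean()

loss = hard_mined_prototype_loss(torch.randn(16, 64, requires_grad=True),
                                 torch.randn(8, 64),
                                 torch.randint(0, 8, (16,)))
loss.backward()
```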

Consensus-based Participatory Budgeting for Legitimacy: Decision Support via Multi-agent Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2307.12915
  • repo_url: None
  • paper_authors: Srijoni Majumdar, Evangelos Pournaras
  • for: Improves the legitimacy of participatory budgeting, aiming for a fairer and more inclusive allocation of public funds
  • methods: Introduces a novel iterative consensus-based participatory budgeting process, with decision support via a multi-agent reinforcement learning approach that assists voters in interacting to reach viable compromises
  • results: Extensive experiments with real-world participatory budgeting data from Poland show that consensus is reachable, efficient, and robust, with the required compromise comparable to that of existing voting aggregation methods that promote fairness and inclusion without attaining consensus
    Abstract The legitimacy of bottom-up democratic processes for the distribution of public funds by policy-makers is challenging and complex. Participatory budgeting is such a process, where voting outcomes may not always be fair or inclusive. Deliberation for which project ideas to put for voting and choose for implementation lack systematization and do not scale. This paper addresses these grand challenges by introducing a novel and legitimate iterative consensus-based participatory budgeting process. Consensus is designed to be a result of decision support via an innovative multi-agent reinforcement learning approach. Voters are assisted to interact with each other to make viable compromises. Extensive experimental evaluation with real-world participatory budgeting data from Poland reveal striking findings: Consensus is reachable, efficient and robust. Compromise is required, which is though comparable to the one of existing voting aggregation methods that promote fairness and inclusion without though attaining consensus.
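
As a loose, hypothetical illustration of reinforcement-learning-assisted compromise (a toy bandit formulation, not the paper's decision-support system): each voter repeatedly champions one project, and a reward mixing private utility with the round's level of agreement nudges the group toward a common choice.

```python
import numpy as np

rng = np.random.default_rng(0)
n_voters, n_projects, rounds, lr, eps = 8, 5, 2000, 0.1, 0.1
utility = rng.random((n_voters, n_projects))   # private preferences
q = utility.copy()                             # start from own utilities

for _ in range(rounds):
    picks = np.array([
        rng.integers(n_projects) if rng.random() < eps else q[v].argmax()
        for v in range(n_voters)
    ])
    agreement = np.bincount(picks, minlength=n_projects).max() / n_voters
    for v, a in enumerate(picks):
        # Reward trades off personal utility against collective agreement.
        r = 0.5 * utility[v, a] + 0.5 * agreement
        q[v, a] += lr * (r - q[v, a])

print("final picks:", q.argmax(1))   # agents often coordinate on one project
```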

Graph Neural Networks For Mapping Variables Between Programs – Extended Version

  • paper_url: http://arxiv.org/abs/2307.13014
  • repo_url: https://github.com/pmorvalho/ecai23-gnns-for-mapping-variables-between-programs
  • paper_authors: Pedro Orvalho, Jelle Piepenbrock, Mikoláš Janota, Vasco Manquinho
  • for: Improves program comparison by using graph neural networks (GNNs) to map the sets of variables between two programs
  • methods: Applies GNNs over the two programs' abstract syntax trees (ASTs) to learn the variable mapping
  • results: Correctly maps 83% of the evaluation dataset of 4166 incorrect/correct program pairs; a state-of-the-art program repair approach that depends heavily on program structure repairs about 72% of incorrect programs, whereas the proposed approach, based solely on variable mappings, repairs around 88.5%
    Abstract Automated program analysis is a pivotal research domain in many areas of Computer Science -- Formal Methods and Artificial Intelligence, in particular. Due to the undecidability of the problem of program equivalence, comparing two programs is highly challenging. Typically, in order to compare two programs, a relation between both programs' sets of variables is required. Thus, mapping variables between two programs is useful for a panoply of tasks such as program equivalence, program analysis, program repair, and clone detection. In this work, we propose using graph neural networks (GNNs) to map the set of variables between two programs based on both programs' abstract syntax trees (ASTs). To demonstrate the strength of variable mappings, we present three use-cases of these mappings on the task of program repair to fix well-studied and recurrent bugs among novice programmers in introductory programming assignments (IPAs). Experimental results on a dataset of 4166 pairs of incorrect/correct programs show that our approach correctly maps 83% of the evaluation dataset. Moreover, our experiments show that the current state-of-the-art on program repair, greatly dependent on the programs' structure, can only repair about 72% of the incorrect programs. In contrast, our approach, which is solely based on variable mappings, can repair around 88.5%.
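
A minimal sketch of the overall recipe; the hand-rolled message-passing layer, sizes, and similarity readout are assumptions, not the authors' model. Embed AST nodes with a GNN, take the embeddings of the variable nodes from each program, and read the mapping off a similarity matrix; training would supervise this matrix with known variable mappings.

```python
import torch
import torch.nn as nn

class TinyGNN(nn.Module):
    """Mean-aggregation message passing over an AST adjacency matrix."""
    def __init__(self, n_node_types, dim=64, steps=3):
        super().__init__()
        self.embed = nn.Embedding(n_node_types, dim)
        self.msg = nn.Linear(dim, dim)
        self.steps = steps

    def forward(self, node_types, adj):
        """node_types: (N,) AST node-type ids; adj: (N, N) adjacency."""
        h = self.embed(node_types)
        deg = adj.sum(1, keepdim=True).clamp(min=1)
        for _ in range(self.steps):
            h = torch.relu(h + self.msg(adj @ h / deg))
        return h

def map_variables(gnn, ast_a, adj_a, vars_a, ast_b, adj_b, vars_b):
    """vars_*: indices of variable nodes; returns best match in B per A var."""
    ha, hb = gnn(ast_a, adj_a)[vars_a], gnn(ast_b, adj_b)[vars_b]
    sim = nn.functional.normalize(ha, dim=1) @ nn.functional.normalize(hb, dim=1).T
    return sim.argmax(1)   # supervise `sim` with known mappings at train time

gnn = TinyGNN(n_node_types=50)
a_types, b_types = torch.randint(0, 50, (10,)), torch.randint(0, 50, (12,))
adj_a, adj_b = torch.rand(10, 10).round(), torch.rand(12, 12).round()
print(map_variables(gnn, a_types, adj_a, [1, 4], b_types, adj_b, [0, 3, 7]))
```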

Towards a Visual-Language Foundation Model for Computational Pathology

  • paper_url: http://arxiv.org/abs/2307.12914
  • repo_url: None
  • paper_authors: Ming Y. Lu, Bowen Chen, Drew F. K. Williamson, Richard J. Chen, Ivy Liang, Tong Ding, Guillaume Jaume, Igor Odintsov, Andrew Zhang, Long Phi Le, Georg Gerber, Anil V Parwani, Faisal Mahmood
  • for: Proposes a visual-language foundation model for computational pathology that generalizes across diagnostic and descriptive tasks in histopathology
  • methods: Task-agnostic pretraining on diverse sources of histopathology images and biomedical text, notably over 1.17 million image-caption pairs, via contrastive learning
  • results: Evaluated on 13 diverse benchmarks, CONCH transfers to a wide range of downstream tasks and achieves state-of-the-art performance on histology image classification, segmentation, captioning, and text-to-image and image-to-text retrieval
    Abstract The accelerated adoption of digital pathology and advances in deep learning have enabled the development of powerful models for various pathology tasks across a diverse array of diseases and patient cohorts. However, model training is often difficult due to label scarcity in the medical domain and the model's usage is limited by the specific task and disease for which it is trained. Additionally, most models in histopathology leverage only image data, a stark contrast to how humans teach each other and reason about histopathologic entities. We introduce CONtrastive learning from Captions for Histopathology (CONCH), a visual-language foundation model developed using diverse sources of histopathology images, biomedical text, and notably over 1.17 million image-caption pairs via task-agnostic pretraining. Evaluated on a suite of 13 diverse benchmarks, CONCH can be transferred to a wide range of downstream tasks involving either or both histopathology images and text, achieving state-of-the-art performance on histology image classification, segmentation, captioning, text-to-image and image-to-text retrieval. CONCH represents a substantial leap over concurrent visual-language pretrained systems for histopathology, with the potential to directly facilitate a wide array of machine learning-based workflows requiring minimal or no further supervised fine-tuning.
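
The pretraining objective family is well known and easy to sketch: a CLIP-style symmetric contrastive loss over matched image-caption pairs. This is a generic illustration of contrastive visual-language alignment, not necessarily CONCH's exact objective or architecture.

```python
import torch
import torch.nn.functional as F

def image_caption_contrastive_loss(img_emb, txt_emb, tau=0.07):
    """Symmetric InfoNCE: row i of each batch is a matched histology
    image/caption pair; every other row in the batch is a negative."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.T / tau
    targets = torch.arange(len(logits))
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))

loss = image_caption_contrastive_loss(torch.randn(32, 512, requires_grad=True),
                                      torch.randn(32, 512))
loss.backward()
print(float(loss))
```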

GridMM: Grid Memory Map for Vision-and-Language Navigation

  • paper_url: http://arxiv.org/abs/2307.12907
  • repo_url: https://github.com/mrzihan/gridmm
  • paper_authors: Zihan Wang, Xiangyang Li, Jiahao Yang, Yeqi Liu, Shuqiang Jiang
  • for: Proposes a new approach to vision-and-language navigation (VLN), in which an agent navigates a 3D environment following natural language instructions
  • methods: Builds a top-down egocentric, dynamically growing Grid Memory Map (GridMM) to structure the visited environment, and proposes an instruction relevance aggregation method to capture fine-grained visual clues within each grid region
  • results: Extensive experiments on the REVERIE, R2R, and SOON datasets in discrete environments, and on R2R-CE in continuous environments, demonstrate the superiority of the proposed method
    Abstract Vision-and-language navigation (VLN) enables the agent to navigate to a remote location following the natural language instruction in 3D environments. To represent the previously visited environment, most approaches for VLN implement memory using recurrent states, topological maps, or top-down semantic maps. In contrast to these approaches, we build the top-down egocentric and dynamically growing Grid Memory Map (i.e., GridMM) to structure the visited environment. From a global perspective, historical observations are projected into a unified grid map in a top-down view, which can better represent the spatial relations of the environment. From a local perspective, we further propose an instruction relevance aggregation method to capture fine-grained visual clues in each grid region. Extensive experiments are conducted on both the REVERIE, R2R, SOON datasets in the discrete environments, and the R2R-CE dataset in the continuous environments, showing the superiority of our proposed method.
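
A hedged sketch of the central data structure; the cell size, world bounds, and mean pooling are assumptions, not the GridMM implementation. Each visual feature observed along the trajectory is projected into the top-down grid cell under its world position, with repeated visits to a cell averaged; the flattened cells then serve as memory for the navigation policy, optionally re-weighted by instruction relevance.

```python
import torch

def update_grid_memory(grid, positions, feats, bounds=(-10.0, 10.0)):
    """grid: (G, G, d) running feature sums; positions: (N, 2) world (x, y)
    coordinates of observations; feats: (N, d) visual features.
    Returns the mean-pooled grid memory map."""
    G = grid.shape[0]
    lo, hi = bounds
    cells = ((positions - lo) / (hi - lo) * G).long().clamp(0, G - 1)
    counts = torch.zeros(G, G)
    for (i, j), f in zip(cells, feats):
        grid[i, j] += f          # drop the feature into its top-down cell
        counts[i, j] += 1
    return grid / counts.clamp(min=1).unsqueeze(-1)

grid = torch.zeros(16, 16, 32)
mem = update_grid_memory(grid, torch.rand(100, 2) * 20 - 10, torch.randn(100, 32))
print(mem.shape)   # (16, 16, 32); flatten the cells to feed the navigator
```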