cs.AI - 2023-10-18

Learning to Solve Climate Sensor Placement Problems with a Transformer

  • paper_url: http://arxiv.org/abs/2310.12387
  • repo_url: None
  • paper_authors: Chen Wang, Victoria Huang, Gang Chen, Hui Ma, Bryce Chen, Jochen Schmidt
  • for: The paper proposes a deep-learning-based sensor placement approach for environmental monitoring and disaster management.
  • methods: Deep reinforcement learning is used to automatically learn improvement heuristics (optimization policies) for the sensor placement problem, with an actor-critic algorithm training the policy network.
  • results: Compared with several state-of-the-art methods, the approach produces higher-quality solutions and is more efficient.
    Abstract The optimal placement of sensors for environmental monitoring and disaster management is a challenging problem due to its NP-hard nature. Traditional methods for sensor placement involve exact, approximation, or heuristic approaches, with the latter being the most widely used. However, heuristic methods are limited by expert intuition and experience. Deep learning (DL) has emerged as a promising approach for generating heuristic algorithms automatically. In this paper, we introduce a novel sensor placement approach focused on learning improvement heuristics using deep reinforcement learning (RL) methods. Our approach leverages an RL formulation for learning improvement heuristics, driven by an actor-critic algorithm for training the policy network. We compare our method with several state-of-the-art approaches by conducting comprehensive experiments, demonstrating the effectiveness and superiority of our proposed approach in producing high-quality solutions. Our work presents a promising direction for applying advanced DL and RL techniques to challenging climate sensor placement problems.
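The abstract describes learning an improvement heuristic with deep RL: a policy, trained with an actor-critic method, repeatedly modifies a candidate placement rather than constructing one from scratch. The toy sketch below illustrates that general idea with a swap-based improvement move and a simple coverage objective; the environment, policy architecture, and reward definition are illustrative assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn as nn

# Toy instance: pick k of n candidate sites to cover m demand points
# (hypothetical objective; the paper's climate objective is not specified here).
torch.manual_seed(0)
n, m, k = 30, 100, 8
sites, demand = torch.rand(n, 2), torch.rand(m, 2)

def coverage(mask, radius=0.2):
    d = torch.cdist(demand, sites[mask])                  # distances to selected sites
    return (d.min(dim=1).values < radius).float().mean()

policy = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 1))
critic = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.Adam(list(policy.parameters()) + list(critic.parameters()), lr=1e-3)

mask = torch.zeros(n, dtype=torch.bool)
mask[:k] = True                                           # arbitrary initial placement
for step in range(500):
    feats = torch.cat([sites, mask.float().unsqueeze(1)], dim=1)
    scores = policy(feats).squeeze(1)
    drop = torch.distributions.Categorical(logits=scores[mask])   # site to remove
    add = torch.distributions.Categorical(logits=scores[~mask])   # site to insert
    di, ai = drop.sample(), add.sample()
    new_mask = mask.clone()
    new_mask[mask.nonzero()[di]] = False
    new_mask[(~mask).nonzero()[ai]] = True
    reward = coverage(new_mask) - coverage(mask)          # improvement reward
    value = critic(coverage(mask).view(1, 1)).squeeze()   # critic baseline
    logp = drop.log_prob(di) + add.log_prob(ai)
    loss = -(reward - value.detach()) * logp + (value - reward) ** 2
    opt.zero_grad(); loss.backward(); opt.step()
    if reward > 0:
        mask = new_mask                                   # accept improving moves
```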

Online Learning and Planning in Cognitive Hierarchies

  • paper_url: http://arxiv.org/abs/2310.12386
  • repo_url: https://github.com/Aryia-Behroziuan/References
  • paper_authors: Bernhard Hengst, Maurice Pagnucco, David Rajaratnam, Claude Sammut, Michael Thielscher
  • for: This work addresses the challenge that complex robot behaviour requires integrating multiple robotic and AI techniques and components while ensuring global properties and behaviours.
  • methods: A formal framework is used to model the complex integration and reasoning behaviours of robotic systems, from symbolic planning through to online learning of policies and transition systems.
  • results: By extending the formal framework of Clark et al. (2016), the authors model the integrated reasoning behaviours of complex robotic systems; the new framework also allows more flexible modelling of the interactions between different reasoning components.
    Abstract Complex robot behaviour typically requires the integration of multiple robotic and Artificial Intelligence (AI) techniques and components. Integrating such disparate components into a coherent system, while also ensuring global properties and behaviours, is a significant challenge for cognitive robotics. Using a formal framework to model the interactions between components can be an important step in dealing with this challenge. In this paper we extend an existing formal framework [Clark et al., 2016] to model complex integrated reasoning behaviours of robotic systems; from symbolic planning through to online learning of policies and transition systems. Furthermore the new framework allows for a more flexible modelling of the interactions between different reasoning components.

Solving Hard Analogy Questions with Relation Embedding Chains

  • paper_url: http://arxiv.org/abs/2310.12379
  • repo_url: https://github.com/niteshroyal/solvinghardanalogyquestions
  • paper_authors: Nitesh Kumar, Steven Schockaert
  • for: The goal is to model the relations between concepts as paths whose edges carry relation embeddings, combining the advantages of both representations.
  • methods: The study draws on knowledge graphs (KGs) such as ConceptNet, which model the relation between two concepts as a set of paths but are limited to a fixed set of relation types and are often noisy and incomplete, and on relation embeddings distilled from a fine-tuned language model, which are less suitable for indirectly related words and structured domain knowledge.
  • results: The proposed method combines paths with relation embeddings and is empirically shown to be useful for solving hard analogy questions.
    Abstract Modelling how concepts are related is a central topic in Lexical Semantics. A common strategy is to rely on knowledge graphs (KGs) such as ConceptNet, and to model the relation between two concepts as a set of paths. However, KGs are limited to a fixed set of relation types, and they are incomplete and often noisy. Another strategy is to distill relation embeddings from a fine-tuned language model. However, this is less suitable for words that are only indirectly related and it does not readily allow us to incorporate structured domain knowledge. In this paper, we aim to combine the best of both worlds. We model relations as paths but associate their edges with relation embeddings. The paths are obtained by first identifying suitable intermediate words and then selecting those words for which informative relation embeddings can be obtained. We empirically show that our proposed representations are useful for solving hard analogy questions.

ClusT3: Information Invariant Test-Time Training

  • paper_url: http://arxiv.org/abs/2310.12345
  • repo_url: https://github.com/dosowiechi/clust3
  • paper_authors: Gustavo A. Vargas Hakim, David Osowiechi, Mehrdad Noori, Milad Cheraghalikhani, Ismail Ben Ayed, Christian Desrosiers
  • for: Improving the robustness of deep learning models to domain shifts at test time.
  • methods: An unsupervised test-time training technique based on maximizing the mutual information between multi-scale feature maps and a discrete latent representation; the objective is trained jointly with the main task as an auxiliary clustering task and then reused as a self-supervised proxy task for adaptation at test time.
  • results: Experimental results show competitive classification performance on several popular test-time adaptation benchmarks.
    Abstract Deep Learning models have shown remarkable performance in a broad range of vision tasks. However, they are often vulnerable against domain shifts at test-time. Test-time training (TTT) methods have been developed in an attempt to mitigate these vulnerabilities, where a secondary task is solved at training time simultaneously with the main task, to be later used as an self-supervised proxy task at test-time. In this work, we propose a novel unsupervised TTT technique based on the maximization of Mutual Information between multi-scale feature maps and a discrete latent representation, which can be integrated to the standard training as an auxiliary clustering task. Experimental results demonstrate competitive classification performance on different popular test-time adaptation benchmarks.
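The auxiliary objective described in the abstract, maximizing mutual information between feature maps and a discrete latent representation, can be instantiated with the standard information-maximization loss over soft cluster assignments: maximize the entropy of the marginal assignment while minimizing the per-location assignment entropy. The sketch below shows that objective for a single feature map; the 1x1-convolution projection head, number of clusters, and loss weighting are assumptions, not necessarily the authors' exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClusterHead(nn.Module):
    """Projects a feature map to soft cluster assignments (1x1 conv head)."""
    def __init__(self, channels: int, n_clusters: int = 10):
        super().__init__()
        self.proj = nn.Conv2d(channels, n_clusters, kernel_size=1)

    def forward(self, feats):                      # feats: (B, C, H, W)
        logits = self.proj(feats)                  # (B, K, H, W)
        return F.softmax(logits, dim=1)

def info_max_loss(assignments, eps=1e-8):
    """Negative mutual information between inputs and cluster assignments,
    estimated as mean per-location entropy minus the entropy of the marginal."""
    p = assignments.permute(0, 2, 3, 1).reshape(-1, assignments.size(1))   # (N, K)
    cond_entropy = -(p * (p + eps).log()).sum(dim=1).mean()                # H(Y|X)
    marginal = p.mean(dim=0)
    marg_entropy = -(marginal * (marginal + eps).log()).sum()              # H(Y)
    return cond_entropy - marg_entropy                                     # = -I(X; Y)

# Joint training (sketch): total = task_loss + lambda_aux * info_max_loss(head(feats))
# At test time, only the clustering loss is minimized to adapt the encoder.
```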

Eliminating Reasoning via Inferring with Planning: A New Framework to Guide LLMs’ Non-linear Thinking

  • paper_url: http://arxiv.org/abs/2310.12342
  • repo_url: None
  • paper_authors: Yongqi Tong, Yifan Wang, Dawei Li, Sizhe Wang, Zi Lin, Simeng Han, Jingbo Shang
  • for: This work aims to strengthen the high-level reasoning abilities of large language models (LLMs) by emulating the mixture of linear and non-linear thinking found in human cognition.
  • methods: A new prompting method, Inferential Exclusion Prompting (IEP), combines elimination and inference so that LLMs can better emulate non-linear human thinking.
  • results: IEP consistently outperforms Chain-of-Thought (CoT) prompting across a variety of tasks and can be combined with CoT to further improve performance; the work also introduces the Mental-Ability Reasoning Benchmark (MARB) for evaluating the logical and verbal reasoning abilities of LLMs.
    Abstract Chain-of-Thought(CoT) prompting and its variants explore equipping large language models (LLMs) with high-level reasoning abilities by emulating human-like linear cognition and logic. However, the human mind is complicated and mixed with both linear and nonlinear thinking. In this work, we propose \textbf{I}nferential \textbf{E}xclusion \textbf{P}rompting (IEP), a novel prompting that combines the principles of elimination and inference in order to guide LLMs to think non-linearly. IEP guides LLMs to plan and then utilize Natural Language Inference (NLI) to deduce each possible solution's entailment relation with context, commonsense, or facts, therefore yielding a broader perspective by thinking back for inferring. This forward planning and backward eliminating process allows IEP to better simulate the complex human thinking processes compared to other CoT-based methods, which only reflect linear cognitive processes. We conducted a series of empirical studies and have corroborated that IEP consistently outperforms CoT across various tasks. Additionally, we observe that integrating IEP and CoT further improves the LLMs' performance on certain tasks, highlighting the necessity of equipping LLMs with mixed logic processes. Moreover, to better evaluate comprehensive features inherent in human logic, we introduce \textbf{M}ental-\textbf{A}bility \textbf{R}easoning \textbf{B}enchmark (MARB). The benchmark comprises six novel subtasks with a total of 9,115 questions, among which 1,685 are developed with hand-crafted rationale references. We believe both \textsc{IEP} and \textsc{MARB} can serve as a promising direction for unveiling LLMs' logic and verbal reasoning abilities and drive further advancements. \textsc{MARB} will be available at ~\texttt{anonymity link} soon.
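IEP, as described, first plans a set of candidate answers and then eliminates candidates by checking with natural language inference (NLI) whether each one is entailed by the context. A minimal sketch of that plan-then-eliminate loop follows; the prompt wording and the `llm` / `nli` callables are hypothetical placeholders, not the authors' actual prompts or models.

```python
def inferential_exclusion(question: str, context: str, llm, nli, keep_threshold=0.5):
    """Plan candidate answers, then eliminate those not entailed by the context.

    `llm(prompt)` is assumed to return generated text and `nli(premise, hypothesis)`
    an entailment probability; both are hypothetical interfaces for illustration.
    """
    # 1) Forward planning: ask the model to enumerate plausible candidate answers.
    plan_prompt = (
        f"Question: {question}\n"
        "List the plausible candidate answers, one per line, without explanation."
    )
    candidates = [c.strip() for c in llm(plan_prompt).splitlines() if c.strip()]

    # 2) Backward elimination: keep candidates whose statement is entailed by the
    #    context (or commonsense/facts), discard the rest.
    surviving = []
    for cand in candidates:
        hypothesis = f"The answer to the question '{question}' is {cand}."
        if nli(premise=context, hypothesis=hypothesis) >= keep_threshold:
            surviving.append(cand)

    # 3) Fall back to the full candidate list if everything was eliminated.
    return surviving or candidates
```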

Opportunities for Adaptive Experiments to Enable Continuous Improvement that Trades-off Instructor and Researcher Incentives

  • paper_url: http://arxiv.org/abs/2310.12324
  • repo_url: None
  • paper_authors: Ilya Musabirov, Angela Zavaleta-Bernuy, Pan Chen, Michael Liut, Joseph Jay Williams
  • for: The paper explores how machine-learning-driven adaptive experimentation can enable continuous improvement of higher-education courses.
  • methods: As different arms/conditions are deployed to students, machine learning algorithms analyse the incoming data to identify the most promising conditions, which are then dynamically deployed to future students.
  • results: A case study comparing traditional and adaptive experimentation with self-explanation prompts in a CS1 course suggests that adaptive experiments can better support students' needs and improve learning outcomes.
    Abstract Randomized experimental comparisons of alternative pedagogical strategies could provide useful empirical evidence in instructors' decision-making. However, traditional experiments do not have a clear and simple pathway to using data rapidly to try to increase the chances that students in an experiment get the best conditions. Drawing inspiration from the use of machine learning and experimentation in product development at leading technology companies, we explore how adaptive experimentation might help in continuous course improvement. In adaptive experiments, as different arms/conditions are deployed to students, data is analyzed and used to change the experience for future students. This can be done using machine learning algorithms to identify which actions are more promising for improving student experience or outcomes. This algorithm can then dynamically deploy the most effective conditions to future students, resulting in better support for students' needs. We illustrate the approach with a case study providing a side-by-side comparison of traditional and adaptive experimentation of self-explanation prompts in online homework problems in a CS1 course. This provides a first step in exploring the future of how this methodology can be useful in bridging research and practice in doing continuous improvement.
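Adaptive experiments of the kind described typically assign incoming students to conditions with a bandit-style algorithm, so that better-performing arms are deployed more often as evidence accumulates. The abstract does not name the specific algorithm used, so the sketch below uses Beta-Bernoulli Thompson sampling with binary outcomes purely as an illustration of the adaptive deployment loop.

```python
import random

class ThompsonSamplingExperiment:
    """Beta-Bernoulli Thompson sampling over experimental conditions (arms)."""
    def __init__(self, arms):
        self.arms = list(arms)
        self.successes = {a: 1 for a in self.arms}   # Beta(1, 1) priors
        self.failures = {a: 1 for a in self.arms}

    def assign(self):
        """Sample a plausible success rate per arm and deploy the best one."""
        draws = {a: random.betavariate(self.successes[a], self.failures[a])
                 for a in self.arms}
        return max(draws, key=draws.get)

    def record(self, arm, success: bool):
        if success:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1

# Usage: each new student gets exp.assign(); once their outcome (e.g. completing
# the homework problem) is observed, exp.record(arm, outcome) updates the posterior.
exp = ThompsonSamplingExperiment(["no_prompt", "self_explanation_prompt"])
arm = exp.assign()
exp.record(arm, success=True)
```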

The Sentiment Problem: A Critical Survey towards Deconstructing Sentiment Analysis

  • paper_url: http://arxiv.org/abs/2310.12318
  • repo_url: None
  • paper_authors: Pranav Narayanan Venkit, Mukund Srinath, Sanjana Gautam, Saranya Venkatraman, Vipul Gupta, Rebecca J. Passonneau, Shomir Wilson
  • for: The study critically examines the applications, models, and datasets of sentiment analysis (SA) across diverse sociotechnical systems.
  • methods: The authors review 189 peer-reviewed papers to examine how SA is applied and modelled in different domains and what issues arise in its datasets.
  • results: The survey finds that sentiment is defined and applied inconsistently across domains such as finance, government, and medicine, leading to potential challenges and biases; to address this, the authors propose an ethics sheet to guide practitioners toward equitable use of SA.
    Abstract We conduct an inquiry into the sociotechnical aspects of sentiment analysis (SA) by critically examining 189 peer-reviewed papers on their applications, models, and datasets. Our investigation stems from the recognition that SA has become an integral component of diverse sociotechnical systems, exerting influence on both social and technical users. By delving into sociological and technological literature on sentiment, we unveil distinct conceptualizations of this term in domains such as finance, government, and medicine. Our study exposes a lack of explicit definitions and frameworks for characterizing sentiment, resulting in potential challenges and biases. To tackle this issue, we propose an ethics sheet encompassing critical inquiries to guide practitioners in ensuring equitable utilization of SA. Our findings underscore the significance of adopting an interdisciplinary approach to defining sentiment in SA and offer a pragmatic solution for its implementation.

A Unifying Framework for Learning Argumentation Semantics

  • paper_url: http://arxiv.org/abs/2310.12309
  • repo_url: None
  • paper_authors: Zlatina Mileva, Antonis Bikakis, Fabio Aurelio D’Asaro, Mark Law, Alessandra Russo
  • for: This paper studies argumentation in AI and aims to provide an interpretable framework for learning the acceptability semantics of arguments, for use in human-machine dialogues.
  • methods: An Inductive Logic Programming approach is used to learn, in an interpretable way, the acceptability semantics of several abstract and structured argumentation frameworks.
  • results: An empirical evaluation shows that the framework outperforms existing argumentation solvers, opening up new research directions in formal argumentation and human-machine dialogues.
    Abstract Argumentation is a very active research field of Artificial Intelligence concerned with the representation and evaluation of arguments used in dialogues between humans and/or artificial agents. Acceptability semantics of formal argumentation systems define the criteria for the acceptance or rejection of arguments. Several software systems, known as argumentation solvers, have been developed to compute the accepted/rejected arguments using such criteria. These include systems that learn to identify the accepted arguments using non-interpretable methods. In this paper we present a novel framework, which uses an Inductive Logic Programming approach to learn the acceptability semantics for several abstract and structured argumentation frameworks in an interpretable way. Through an empirical evaluation we show that our framework outperforms existing argumentation solvers, thus opening up new future research directions in the area of formal argumentation and human-machine dialogues.

Preference Optimization for Molecular Language Models

  • paper_url: http://arxiv.org/abs/2310.12304
  • repo_url: https://github.com/harmonic-discovery/pref-opt-for-mols
  • paper_authors: Ryan Park, Ryan Theisen, Navriti Sahni, Marcel Patek, Anna Cichońska, Rayees Rahman
  • for: Generating novel chemical structures with molecular language models.
  • methods: Fine-tuning with Direct Preference Optimization (DPO).
  • results: A simple, efficient, and highly effective way to align generated molecules with chemist preferences.
    Abstract Molecular language modeling is an effective approach to generating novel chemical structures. However, these models do not \emph{a priori} encode certain preferences a chemist may desire. We investigate the use of fine-tuning using Direct Preference Optimization to better align generated molecules with chemist preferences. Our findings suggest that this approach is simple, efficient, and highly effective.
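Direct Preference Optimization fine-tunes a policy directly on preference pairs, without training a separate reward model. The standard DPO objective is sketched below for pairs of preferred/rejected molecules (e.g., SMILES strings); the β value and the dummy log-likelihoods are generic illustrations, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO objective on sequence log-likelihoods.

    *_w are log-probabilities of the preferred (chosen) molecules and *_l of the
    rejected ones, under the trainable policy and a frozen reference model.
    """
    policy_margin = policy_logp_w - policy_logp_l
    ref_margin = ref_logp_w - ref_logp_l
    # Push the policy to prefer chosen over rejected by more than the reference does.
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Dummy per-sequence log-likelihoods for a batch of 4 preference pairs.
batch = [torch.randn(4) for _ in range(4)]
loss = dpo_loss(*batch)
```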

Document-Level Language Models for Machine Translation

  • paper_url: http://arxiv.org/abs/2310.12303
  • repo_url: https://github.com/Sfedfcv/redesigned-pancake
  • paper_authors: Frithjof Petrick, Christian Herold, Pavel Petrushkov, Shahram Khadivi, Hermann Ney
  • for: Making machine translation systems context-aware at the document level so that they better capture document meaning and structure.
  • methods: An existing sentence-level translation model is combined with a document-level language model, using novel weighting techniques that make the combination more flexible and reduce computational overhead.
  • results: In a comprehensive evaluation on four diverse translation tasks, the extensions substantially improve document-targeted scores while being more computationally efficient; back-translation still gives even better results but requires re-training the translation system, and an exploration of fusing large language models suggests strong potential for model combination.
    Abstract Despite the known limitations, most machine translation systems today still operate on the sentence-level. One reason for this is, that most parallel training data is only sentence-level aligned, without document-level meta information available. In this work, we set out to build context-aware translation systems utilizing document-level monolingual data instead. This can be achieved by combining any existing sentence-level translation model with a document-level language model. We improve existing approaches by leveraging recent advancements in model combination. Additionally, we propose novel weighting techniques that make the system combination more flexible and significantly reduce computational overhead. In a comprehensive evaluation on four diverse translation tasks, we show that our extensions improve document-targeted scores substantially and are also computationally more efficient. However, we also find that in most scenarios, back-translation gives even better results, at the cost of having to re-train the translation system. Finally, we explore language model fusion in the light of recent advancements in large language models. Our findings suggest that there might be strong potential in utilizing large language models via model combination.
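A common way to combine a sentence-level translation model with a document-level language model is a weighted (log-linear) combination of their scores during decoding. The sketch below shows such scoring for one candidate translation given its document context; the single fixed λ only illustrates the general idea and is not the paper's proposed weighting techniques, and the `mt_model` / `doc_lm` interfaces are hypothetical.

```python
def combined_score(candidate, source_sentence, doc_context, mt_model, doc_lm, lam=0.3):
    """Log-linear fusion of a sentence-level MT model and a document-level LM.

    `mt_model.logprob(y, x)` and `doc_lm.logprob(y, context)` are assumed to
    return the log-probability of the full candidate sequence; both interfaces
    are hypothetical placeholders used only to illustrate the combination.
    """
    mt_lp = mt_model.logprob(candidate, source_sentence)   # translation adequacy
    lm_lp = doc_lm.logprob(candidate, doc_context)         # document-level coherence
    # A larger lam puts more weight on document-level context.
    return (1.0 - lam) * mt_lp + lam * lm_lp

# During beam search, candidates would be re-ranked (or scored incrementally)
# with combined_score instead of the MT model's score alone.
```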

Jorge: Approximate Preconditioning for GPU-efficient Second-order Optimization

  • paper_url: http://arxiv.org/abs/2310.12298
  • repo_url: None
  • paper_authors: Siddharth Singh, Zachary Sating, Abhinav Bhatele
  • for: Proposing an efficient second-order optimizer that improves the training efficiency and performance of deep learning models.
  • methods: The optimizer, named Jorge, eliminates the expensive matrix inverses of the preconditioning step by approximating the preconditioner computation, making it efficient on GPUs; its hyperparameters can be determined directly from a well-tuned SGD baseline.
  • results: Experiments show that Jorge outperforms state-of-the-art optimizers such as SGD, AdamW, and Shampoo across multiple deep learning models, in both sample efficiency and wall-clock time.
    Abstract Despite their better convergence properties compared to first-order optimizers, second-order optimizers for deep learning have been less popular due to their significant computational costs. The primary efficiency bottleneck in such optimizers is matrix inverse calculations in the preconditioning step, which are expensive to compute on GPUs. In this paper, we introduce Jorge, a second-order optimizer that promises the best of both worlds -- rapid convergence benefits of second-order methods, and high computational efficiency typical of first-order methods. We address the primary computational bottleneck of computing matrix inverses by completely eliminating them using an approximation of the preconditioner computation. This makes Jorge extremely efficient on GPUs in terms of wall-clock time. Further, we describe an approach to determine Jorge's hyperparameters directly from a well-tuned SGD baseline, thereby significantly minimizing tuning efforts. Our empirical evaluations demonstrate the distinct advantages of using Jorge, outperforming state-of-the-art optimizers such as SGD, AdamW, and Shampoo across multiple deep learning models, both in terms of sample efficiency and wall-clock time.

Fact-based Agent modeling for Multi-Agent Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2310.12290
  • repo_url: https://github.com/Aryia-Behroziuan/References
  • paper_authors: Baofu Fang, Caiming Zheng, Hao Wang
  • for: Enabling agent modeling in unknown scenarios so that agents in multi-agent systems can cooperate and interact more effectively.
  • methods: A fact-based belief inference (FBI) network models other agents in a partially observable environment using only local information; the rewards and observations obtained after taking actions (the "facts") serve as the reconstruction targets of a variational autoencoder (VAE) that learns policy representations of other agents.
  • results: On Multi-agent Particle Environments (MPE), FAM improves the efficiency of agent policy learning over baseline methods and achieves higher returns in complex competitive-cooperative mixed scenarios.
    Abstract In multi-agent systems, agents need to interact and collaborate with other agents in environments. Agent modeling is crucial to facilitate agent interactions and make adaptive cooperation strategies. However, it is challenging for agents to model the beliefs, behaviors, and intentions of other agents in non-stationary environment where all agent policies are learned simultaneously. In addition, the existing methods realize agent modeling through behavior cloning which assume that the local information of other agents can be accessed during execution or training. However, this assumption is infeasible in unknown scenarios characterized by unknown agents, such as competition teams, unreliable communication and federated learning due to privacy concerns. To eliminate this assumption and achieve agent modeling in unknown scenarios, Fact-based Agent modeling (FAM) method is proposed in which fact-based belief inference (FBI) network models other agents in partially observable environment only based on its local information. The reward and observation obtained by agents after taking actions are called facts, and FAM uses facts as reconstruction target to learn the policy representation of other agents through a variational autoencoder. We evaluate FAM on various Multiagent Particle Environment (MPE) and compare the results with several state-of-the-art MARL algorithms. Experimental results show that compared with baseline methods, FAM can effectively improve the efficiency of agent policy learning by making adaptive cooperation strategies in multi-agent reinforcement learning tasks, while achieving higher returns in complex competitive-cooperative mixed scenarios.
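The fact-based modeling described here trains a VAE whose reconstruction targets are the "facts" (the rewards and observations obtained after acting), so the latent code becomes a representation of other agents' policies learned purely from local information. A minimal sketch of such a VAE and its loss is shown below; the network sizes and the exact input/target layout are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactVAE(nn.Module):
    """Encodes local observations into a latent policy representation and
    reconstructs the 'facts' (e.g. next observation and reward)."""
    def __init__(self, obs_dim, fact_dim, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.mu = nn.Linear(64, latent_dim)
        self.logvar = nn.Linear(64, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                     nn.Linear(64, fact_dim))

    def forward(self, local_obs):
        h = self.encoder(local_obs)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterisation
        return self.decoder(z), mu, logvar

def fam_loss(model, local_obs, facts, beta=1.0):
    recon, mu, logvar = model(local_obs)
    recon_loss = F.mse_loss(recon, facts)                      # reconstruct the facts
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
    return recon_loss + beta * kl

# The latent z can then be fed to the RL policy alongside the agent's own observation.
```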

Enhancing the Performance of Automated Grade Prediction in MOOC using Graph Representation Learning

  • paper_url: http://arxiv.org/abs/2310.12281
  • repo_url: https://github.com/dsaatusu/mooper_grade_prediction
  • paper_authors: Soheila Farokhi, Aswani Yaramala, Jiangtao Huang, Muhammad F. A. Khan, Xiaojun Qi, Hamid Karimi
  • for: Enhancing the performance of predictive machine learning models for student assignment grade prediction in MOOCs.
  • methods: Graph embedding techniques extract latent structural information encoded in the interactions between entities in the MOOC dataset, without requiring ground-truth labels.
  • results: Structural features significantly improve the predictive performance of downstream assessment tasks; the code and data are available at https://github.com/DSAatUSU/MOOPer_grade_prediction.
    Abstract In recent years, Massive Open Online Courses (MOOCs) have gained significant traction as a rapidly growing phenomenon in online learning. Unlike traditional classrooms, MOOCs offer a unique opportunity to cater to a diverse audience from different backgrounds and geographical locations. Renowned universities and MOOC-specific providers, such as Coursera, offer MOOC courses on various subjects. Automated assessment tasks like grade and early dropout predictions are necessary due to the high enrollment and limited direct interaction between teachers and learners. However, current automated assessment approaches overlook the structural links between different entities involved in the downstream tasks, such as the students and courses. Our hypothesis suggests that these structural relationships, manifested through an interaction graph, contain valuable information that can enhance the performance of the task at hand. To validate this, we construct a unique knowledge graph for a large MOOC dataset, which will be publicly available to the research community. Furthermore, we utilize graph embedding techniques to extract latent structural information encoded in the interactions between entities in the dataset. These techniques do not require ground truth labels and can be utilized for various tasks. Finally, by combining entity-specific features, behavioral features, and extracted structural features, we enhance the performance of predictive machine learning models in student assignment grade prediction. Our experiments demonstrate that structural features can significantly improve the predictive performance of downstream assessment tasks. The code and data are available in \url{https://github.com/DSAatUSU/MOOPer_grade_prediction}
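The pipeline described learns unsupervised embeddings of the interaction graph and combines them with entity-specific and behavioral features before fitting a supervised grade predictor. The sketch below illustrates that combination with node2vec (from the `node2vec` package) and a gradient-boosting classifier; the embedding method, the toy features, and the classifier are assumptions and not necessarily what the linked repository uses.

```python
import numpy as np
import networkx as nx
from node2vec import Node2Vec                       # pip install node2vec
from sklearn.ensemble import GradientBoostingClassifier

# Interaction graph: students connected to the problems they attempted.
G = nx.Graph()
G.add_edges_from([("student_1", "problem_7"), ("student_1", "problem_9"),
                  ("student_2", "problem_7"), ("student_2", "problem_9")])

# Unsupervised structural embeddings -- no ground-truth labels are required.
embeddings = Node2Vec(G, dimensions=32, walk_length=10, num_walks=50,
                      workers=1).fit(window=5).wv

def features(student, problem, behavioral):
    """Concatenate structural embeddings with behavioral/entity features."""
    return np.concatenate([embeddings[student], embeddings[problem], behavioral])

X = np.stack([features("student_1", "problem_7", np.array([3.0, 120.0])),  # attempts, seconds
              features("student_2", "problem_9", np.array([1.0, 45.0]))])
y = np.array([1, 0])                                # e.g. pass/fail grade label
clf = GradientBoostingClassifier().fit(X, y)
```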

An Image is Worth Multiple Words: Learning Object Level Concepts using Multi-Concept Prompt Learning

  • paper_url: http://arxiv.org/abs/2310.12274
  • repo_url: https://github.com/lxasqjc/mcpl
  • paper_authors: Chen Jin, Ryutaro Tanno, Amrutha Saseendran, Tom Diethe, Philip Teare
  • for: Learning new "words" that represent object-level concepts in an image so they can be integrated into natural language sentences to generate novel synthesised images.
  • methods: A Multi-Concept Prompt Learning (MCPL) framework learns multiple new "words" simultaneously from a single sentence-image pair; three regularisation techniques improve word-concept correlation: Attention Masking (AttnMask), Prompts Contrastive Loss (PromptCL), and Bind adjective (Bind adj.).
  • results: Extensive quantitative comparisons via image generation, editing, and attention visualisation show that the method learns more semantically disentangled concepts with enhanced word-concept correlation; a new dataset and evaluation protocol tailored to this task of learning object-level concepts are also introduced.
    Abstract Textural Inversion, a prompt learning method, learns a singular embedding for a new "word" to represent image style and appearance, allowing it to be integrated into natural language sentences to generate novel synthesised images. However, identifying and integrating multiple object-level concepts within one scene poses significant challenges even when embeddings for individual concepts are attainable. This is further confirmed by our empirical tests. To address this challenge, we introduce a framework for Multi-Concept Prompt Learning (MCPL), where multiple new "words" are simultaneously learned from a single sentence-image pair. To enhance the accuracy of word-concept correlation, we propose three regularisation techniques: Attention Masking (AttnMask) to concentrate learning on relevant areas; Prompts Contrastive Loss (PromptCL) to separate the embeddings of different concepts; and Bind adjective (Bind adj.) to associate new "words" with known words. We evaluate via image generation, editing, and attention visualisation with diverse images. Extensive quantitative comparisons demonstrate that our method can learn more semantically disentangled concepts with enhanced word-concept correlation. Additionally, we introduce a novel dataset and evaluation protocol tailored for this new task of learning object-level concepts.

Tailoring Adversarial Attacks on Deep Neural Networks for Targeted Class Manipulation Using DeepFool Algorithm

  • paper_url: http://arxiv.org/abs/2310.13019
  • repo_url: None
  • paper_authors: S. M. Fazle Rabby Labib, Joyanta Jyoti Mondal, Meem Arafat Manab
  • for: Proposing Targeted DeepFool, an adversarial attack that targets specific classes for misclassification, in order to study and improve the robustness of deep neural networks (DNNs).
  • methods: The DeepFool algorithm is augmented with a targeted objective, and a minimum confidence score hyperparameter is introduced for added flexibility.
  • results: Experiments demonstrate the effectiveness and efficiency of the method across different deep neural network architectures while preserving image integrity as much as possible; AlexNet and the state-of-the-art Vision Transformer are found to be highly robust to getting fooled.
    Abstract Deep neural networks (DNNs) have significantly advanced various domains, but their vulnerability to adversarial attacks poses serious concerns. Understanding these vulnerabilities and developing effective defense mechanisms is crucial. DeepFool, an algorithm proposed by Moosavi-Dezfooli et al. (2016), finds minimal perturbations to misclassify input images. However, DeepFool lacks a targeted approach, making it less effective in specific attack scenarios. Also, in previous related works, researchers primarily focus on success, not considering how much an image is getting distorted; the integrity of the image quality, and the confidence level to misclassifying. So, in this paper, we propose Targeted DeepFool, an augmented version of DeepFool that allows targeting specific classes for misclassification. We also introduce a minimum confidence score requirement hyperparameter to enhance flexibility. Our experiments demonstrate the effectiveness and efficiency of the proposed method across different deep neural network architectures while preserving image integrity as much as possible. Results show that one of the deep convolutional neural network architectures, AlexNet, and one of the state-of-the-art model Vision Transformer exhibit high robustness to getting fooled. Our code will be made public when publishing the paper.
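A targeted variant of DeepFool linearizes the classifier at each step and moves toward the decision boundary of the chosen target class, stopping once the model predicts that class with at least the requested confidence. The sketch below follows that recipe; the overshoot constant, stopping rule, and interface are illustrative assumptions rather than the paper's exact algorithm.

```python
import torch
import torch.nn.functional as F

def targeted_deepfool(model, x, target, min_conf=0.5, max_iter=50, overshoot=0.02):
    """Sketch of a targeted, confidence-thresholded DeepFool-style attack.

    Iteratively linearises the classifier and steps toward the decision boundary
    of `target`, stopping once the model predicts `target` with softmax
    confidence of at least `min_conf`. Assumes x has shape (1, C, H, W).
    """
    x_adv = x.clone().detach()
    for _ in range(max_iter):
        x_adv.requires_grad_(True)
        logits = model(x_adv)
        probs = F.softmax(logits, dim=1)
        if logits.argmax(dim=1).item() == target and probs[0, target] >= min_conf:
            break
        # Compare the target class against the strongest competing class.
        top2 = logits[0].topk(2).indices.tolist()
        rival = top2[1] if top2[0] == target else top2[0]
        grad_rival = torch.autograd.grad(logits[0, rival], x_adv, retain_graph=True)[0]
        grad_target = torch.autograd.grad(logits[0, target], x_adv)[0]
        w = grad_target - grad_rival                       # linearised boundary normal
        f = (logits[0, rival] - logits[0, target]).item()  # logit gap to close
        r = (abs(f) / (w.norm() ** 2 + 1e-8)) * w          # minimal step to the boundary
        x_adv = (x_adv + (1 + overshoot) * r).detach()
    return x_adv
```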

A Unified Approach to Domain Incremental Learning with Memory: Theory and Algorithm

  • paper_url: http://arxiv.org/abs/2310.12244
  • repo_url: None
  • paper_authors: Haizhou Shi, Hao Wang
  • for: Proposing a unified framework for domain incremental learning, in which a model adapts to a sequence of domains with access to only a small subset of data (i.e., memory) from previous domains.
  • methods: The Unified Domain Incremental Learning (UDIL) framework unifies various existing methods, which correspond to its generalization bound with different fixed coefficients; UDIL instead adapts these coefficients during training, always achieving the tightest bound.
  • results: Experiments show that UDIL outperforms state-of-the-art domain incremental learning methods on both synthetic and real-world datasets.
    Abstract Domain incremental learning aims to adapt to a sequence of domains with access to only a small subset of data (i.e., memory) from previous domains. Various methods have been proposed for this problem, but it is still unclear how they are related and when practitioners should choose one method over another. In response, we propose a unified framework, dubbed Unified Domain Incremental Learning (UDIL), for domain incremental learning with memory. Our UDIL **unifies** various existing methods, and our theoretical analysis shows that UDIL always achieves a tighter generalization error bound compared to these methods. The key insight is that different existing methods correspond to our bound with different **fixed** coefficients; based on insights from this unification, our UDIL allows **adaptive** coefficients during training, thereby always achieving the tightest bound. Empirical results show that our UDIL outperforms the state-of-the-art domain incremental learning methods on both synthetic and real-world datasets. Code will be available at https://github.com/Wang-ML-Lab/unified-continual-learning.

Few-Shot In-Context Imitation Learning via Implicit Graph Alignment

  • paper_url: http://arxiv.org/abs/2310.12238
  • repo_url: None
  • paper_authors: Vitalis Vosylius, Edward Johns
  • for: Enabling a robot, given a few demonstrations of a task across a few different objects, to perform the same task on new, previously unseen objects.
  • methods: Imitation learning is formulated as a conditional alignment problem between graph representations of objects, which captures the task-relevant relationships between new objects and those in the demonstrations.
  • results: Experiments show the method is highly effective for few-shot learning of several real-world, everyday tasks and outperforms baselines; videos are available on the project webpage (https://www.robot-learning.uk/implicit-graph-alignment).
    Abstract Consider the following problem: given a few demonstrations of a task across a few different objects, how can a robot learn to perform that same task on new, previously unseen objects? This is challenging because the large variety of objects within a class makes it difficult to infer the task-relevant relationship between the new objects and the objects in the demonstrations. We address this by formulating imitation learning as a conditional alignment problem between graph representations of objects. Consequently, we show that this conditioning allows for in-context learning, where a robot can perform a task on a set of new objects immediately after the demonstrations, without any prior knowledge about the object class or any further training. In our experiments, we explore and validate our design choices, and we show that our method is highly effective for few-shot learning of several real-world, everyday tasks, whilst outperforming baselines. Videos are available on our project webpage at https://www.robot-learning.uk/implicit-graph-alignment.

An Eager Satisfiability Modulo Theories Solver for Algebraic Datatypes

  • paper_url: http://arxiv.org/abs/2310.12234
  • repo_url: None
  • paper_authors: Amar Shah, Federico Mora, Sanjit A. Seshia
  • for: Proposing a new satisfiability modulo theories (SMT) solver for automated reasoning about algebraic data types (ADTs).
  • methods: The solver takes an eager approach: it reduces ADT queries to the simpler theory of uninterpreted functions (UF) and then uses an existing solver on the reduced query.
  • results: The authors prove the soundness and completeness of the approach and show that it outperforms the state of the art on existing benchmarks as well as on a new, more challenging benchmark set from the planning domain.
    Abstract Algebraic data types (ADTs) are a construct classically found in functional programming languages that capture data structures like enumerated types, lists, and trees. In recent years, interest in ADTs has increased. For example, popular programming languages, like Python, have added support for ADTs. Automated reasoning about ADTs can be done using satisfiability modulo theories (SMT) solving, an extension of the Boolean satisfiability problem with constraints over first-order structures. Unfortunately, SMT solvers that support ADTs do not scale as state-of-the-art approaches all use variations of the same \emph{lazy} approach. In this paper, we present an SMT solver that takes a fundamentally different approach, an \emph{eager} approach. Specifically, our solver reduces ADT queries to a simpler logical theory, uninterpreted functions (UF), and then uses an existing solver on the reduced query. We prove the soundness and completeness of our approach and demonstrate that it outperforms the state-of-theart on existing benchmarks, as well as a new, more challenging benchmark set from the planning domain.

Probabilistic Sampling of Balanced K-Means using Adiabatic Quantum Computing

  • paper_url: http://arxiv.org/abs/2310.12153
  • repo_url: None
  • paper_authors: Jan-Nico Zaech, Martin Danelljan, Luc Van Gool
  • for: This paper explores how adiabatic quantum computing (AQC) can be used for discrete, often NP-hard optimization problems, here probabilistic balanced k-means clustering.
  • methods: Whereas existing AQC approaches keep only the best measurement and discard the rest as noise, this work uses the non-optimal measurements to compute calibrated posterior probabilities for balanced k-means clustering at little additional compute cost.
  • results: Demonstrated on a D-Wave AQC with synthetic and real data, the approach makes it possible to identify ambiguous solutions and data points.
    Abstract Adiabatic quantum computing (AQC) is a promising quantum computing approach for discrete and often NP-hard optimization problems. Current AQCs allow to implement problems of research interest, which has sparked the development of quantum representations for many machine learning and computer vision tasks. Despite requiring multiple measurements from the noisy AQC, current approaches only utilize the best measurement, discarding information contained in the remaining ones. In this work, we explore the potential of using this information for probabilistic balanced k-means clustering. Instead of discarding non-optimal solutions, we propose to use them to compute calibrated posterior probabilities with little additional compute cost. This allows us to identify ambiguous solutions and data points, which we demonstrate on a D-Wave AQC on synthetic and real data.
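Instead of keeping only the lowest-energy sample from the annealer, the approach described uses the full set of returned samples. One simple way to turn those samples into calibrated assignment probabilities is to weight each sampled clustering by a Boltzmann factor of its energy and average the point-cluster indicators, as sketched below; the weighting scheme and temperature are illustrative assumptions, not the paper's exact calibration.

```python
import numpy as np

def assignment_posterior(samples, energies, n_clusters, temperature=1.0):
    """Estimate per-point cluster probabilities from many annealer samples.

    `samples` is an (S, N) array of cluster labels for N data points over S reads,
    `energies` the corresponding (S,) objective values returned by the solver.
    """
    energies = np.asarray(energies, dtype=float)
    weights = np.exp(-(energies - energies.min()) / temperature)   # Boltzmann weights
    weights /= weights.sum()

    n_points = samples.shape[1]
    posterior = np.zeros((n_points, n_clusters))
    for w, labels in zip(weights, samples):
        posterior[np.arange(n_points), labels] += w
    return posterior        # rows sum to 1; high-entropy rows flag ambiguous points

samples = np.array([[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1]])
energies = [-10.0, -9.5, -10.0]
probs = assignment_posterior(samples, energies, n_clusters=2)
```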

Fairer and More Accurate Tabular Models Through NAS

  • paper_url: http://arxiv.org/abs/2310.12145
  • repo_url: None
  • paper_authors: Richeek Das, Samuel Dooley
  • for: Making deep learning models on tabular data fairer by updating the model's architecture and training hyperparameters, rather than patching an existing model.
  • methods: Multi-objective Neural Architecture Search (NAS) and Hyperparameter Optimization (HPO) are used to jointly optimize architectures and training hyperparameters under a constraint on both accuracy and fairness.
  • results: Models optimized solely for accuracy often fail to address fairness, whereas jointly optimizing for accuracy and fairness yields architectures that Pareto-dominate state-of-the-art bias mitigation methods in fairness, accuracy, or both.
    Abstract Making models algorithmically fairer in tabular data has been long studied, with techniques typically oriented towards fixes which usually take a neural model with an undesirable outcome and make changes to how the data are ingested, what the model weights are, or how outputs are processed. We employ an emergent and different strategy where we consider updating the model's architecture and training hyperparameters to find an entirely new model with better outcomes from the beginning of the debiasing procedure. In this work, we propose using multi-objective Neural Architecture Search (NAS) and Hyperparameter Optimization (HPO) in the first application to the very challenging domain of tabular data. We conduct extensive exploration of architectural and hyperparameter spaces (MLP, ResNet, and FT-Transformer) across diverse datasets, demonstrating the dependence of accuracy and fairness metrics of model predictions on hyperparameter combinations. We show that models optimized solely for accuracy with NAS often fail to inherently address fairness concerns. We propose a novel approach that jointly optimizes architectural and training hyperparameters in a multi-objective constraint of both accuracy and fairness. We produce architectures that consistently Pareto dominate state-of-the-art bias mitigation methods either in fairness, accuracy or both, all of this while being Pareto-optimal over hyperparameters achieved through single-objective (accuracy) optimization runs. This research underscores the promise of automating fairness and accuracy optimization in deep learning models.

Getting aligned on representational alignment

  • paper_url: http://arxiv.org/abs/2310.13018
  • repo_url: None
  • paper_authors: Ilia Sucholutsky, Lukas Muttenthaler, Adrian Weller, Andi Peng, Andreea Bobu, Been Kim, Bradley C. Love, Erin Grant, Jascha Achterberg, Joshua B. Tenenbaum, Katherine M. Collins, Katherine L. Hermann, Kerem Oktar, Klaus Greff, Martin N. Hebart, Nori Jacoby, Qiuyi, Zhang, Raja Marjieh, Robert Geirhos, Sherol Chen, Simon Kornblith, Sunayana Rane, Talia Konkle, Thomas P. O’Connell, Thomas Unterthiner, Andrew K. Lampinen, Klaus-Robert Müller, Mariya Toneva, Thomas L. Griffiths
  • for: The paper aims to improve communication between research communities studying representational alignment in cognitive science, neuroscience, and machine learning, by proposing a unifying framework that can serve as a common language for these fields.
  • methods: The paper surveys the literature from these fields and demonstrates how prior work fits into the proposed framework.
  • results: The paper identifies open problems in representational alignment where progress can benefit all three fields, and hopes to catalyze cross-disciplinary collaboration and accelerate progress for all communities studying and developing information processing systems.
    Abstract Biological and artificial information processing systems form representations of the world that they can use to categorize, reason, plan, navigate, and make decisions. To what extent do the representations formed by these diverse systems agree? Can diverging representations still lead to the same behaviors? And how can systems modify their representations to better match those of another system? These questions pertaining to the study of \textbf{\emph{representational alignment} are at the heart of some of the most active research areas in contemporary cognitive science, neuroscience, and machine learning. Unfortunately, there is limited knowledge-transfer between research communities interested in representational alignment, and much of the progress in one field ends up being rediscovered independently in another, when greater cross-field communication would be advantageous. To improve communication between fields, we propose a unifying framework that can serve as a common language between researchers studying representational alignment. We survey the literature from the fields of cognitive science, neuroscience, and machine learning, and demonstrate how prior work fits into this framework. Finally, we lay out open problems in representational alignment where progress can benefit all three fields. We hope that our work can catalyze cross-disciplinary collaboration and accelerate progress for all communities studying and developing information processing systems. We note that this is a working paper and encourage readers to reach out with their suggestions for future revisions.

A comprehensible analysis of the efficacy of Ensemble Models for Bug Prediction

  • paper_url: http://arxiv.org/abs/2310.12133
  • repo_url: None
  • paper_authors: Ingrid Marçal, Rogério Eduardo Garcia
  • for: Comparing and analysing the efficacy of AI-based approaches for predicting the probability that a Java class is buggy.
  • methods: Single AI models and ensemble AI models are trained and evaluated on two open-source Java components from the Apache Commons Project.
  • results: Experimental findings indicate that ensembles of AI models can outperform individual AI models at bug prediction; the paper also offers insight into the factors that contribute to the ensemble's enhanced performance.
    Abstract The correctness of software systems is vital for their effective operation. It makes discovering and fixing software bugs an important development task. The increasing use of Artificial Intelligence (AI) techniques in Software Engineering led to the development of a number of techniques that can assist software developers in identifying potential bugs in code. In this paper, we present a comprehensible comparison and analysis of the efficacy of two AI-based approaches, namely single AI models and ensemble AI models, for predicting the probability of a Java class being buggy. We used two open-source Apache Commons Project's Java components for training and evaluating the models. Our experimental findings indicate that the ensemble of AI models can outperform the results of applying individual AI models. We also offer insight into the factors that contribute to the enhanced performance of the ensemble AI model. The presented results demonstrate the potential of using ensemble AI models to enhance bug prediction results, which could ultimately result in more reliable software systems.
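An ensemble of the kind evaluated here can be assembled by combining several base classifiers over code metrics and letting them vote on the bug probability. The sketch below uses scikit-learn's soft-voting ensemble as a generic illustration; the specific base models and metrics used in the paper are not given in the abstract, so these (and the synthetic data) are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical per-class software metrics (e.g. LOC, cyclomatic complexity, coupling)
# and a binary "buggy" label; a real study would load these from the bug dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=200) > 0).astype(int)

ensemble = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100)),
                ("lr", LogisticRegression(max_iter=1000)),
                ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500))],
    voting="soft",          # average predicted probabilities across base models
)
scores = cross_val_score(ensemble, X, y, cv=5, scoring="roc_auc")
print(scores.mean())
```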

DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning

  • paper_url: http://arxiv.org/abs/2310.12128
  • repo_url: https://github.com/aszala/DiagrammerGPT
  • paper_authors: Abhay Zala, Han Lin, Jaemin Cho, Mohit Bansal
  • for: Addressing the inability of existing text-to-image (T2I) generation models to generate diagrams.
  • methods: A two-stage text-to-diagram generation framework leverages the layout-guidance capabilities of large language models (LLMs) to generate more accurate open-domain, open-platform diagrams.
  • results: Using LLMs to generate and iteratively refine "diagram plans" (in a planner-auditor feedback loop), and then rendering the diagrams with DiagramGLIGEN and a text-label rendering module, yields more accurate diagrams than existing T2I models.
    Abstract Text-to-image (T2I) generation has seen significant growth over the past few years. Despite this, there has been little work on generating diagrams with T2I models. A diagram is a symbolic/schematic representation that explains information using structurally rich and spatially complex visualizations (e.g., a dense combination of related objects, text labels, directional arrows, connection lines, etc.). Existing state-of-the-art T2I models often fail at diagram generation because they lack fine-grained object layout control when many objects are densely connected via complex relations such as arrows/lines and also often fail to render comprehensible text labels. To address this gap, we present DiagrammerGPT, a novel two-stage text-to-diagram generation framework that leverages the layout guidance capabilities of LLMs (e.g., GPT-4) to generate more accurate open-domain, open-platform diagrams. In the first stage, we use LLMs to generate and iteratively refine 'diagram plans' (in a planner-auditor feedback loop) which describe all the entities (objects and text labels), their relationships (arrows or lines), and their bounding box layouts. In the second stage, we use a diagram generator, DiagramGLIGEN, and a text label rendering module to generate diagrams following the diagram plans. To benchmark the text-to-diagram generation task, we introduce AI2D-Caption, a densely annotated diagram dataset built on top of the AI2D dataset. We show quantitatively and qualitatively that our DiagrammerGPT framework produces more accurate diagrams, outperforming existing T2I models. We also provide comprehensive analysis including open-domain diagram generation, vector graphic diagram generation in different platforms, human-in-the-loop diagram plan editing, and multimodal planner/auditor LLMs (e.g., GPT-4Vision). We hope our work can inspire further research on diagram generation via T2I models and LLMs.

SHARCS: Efficient Transformers through Routing with Dynamic Width Sub-networks

  • paper_url: http://arxiv.org/abs/2310.12126
  • repo_url: None
  • paper_authors: Mohammadreza Salehi, Sachin Mehta, Aditya Kusupati, Ali Farhadi, Hannaneh Hajishirzi
  • for: Improving the trade-off between inference efficiency and accuracy of Transformer networks.
  • methods: SHARCS performs sample-adaptive inference by training a router on any transformer network that directs different samples to sub-networks of varying widths; it generalizes across architectures and can also be applied to compressed and efficient transformer encoders.
  • results: SHARCS improves inference speed while preserving accuracy, providing up to a 2x speed-up with an insignificant drop in accuracy.
    Abstract We introduce SHARCS for adaptive inference that takes into account the hardness of input samples. SHARCS can train a router on any transformer network, enabling the model to direct different samples to sub-networks with varying widths. Our experiments demonstrate that: (1) SHARCS outperforms or complements existing per-sample adaptive inference methods across various classification tasks in terms of accuracy vs. FLOPs; (2) SHARCS generalizes across different architectures and can be even applied to compressed and efficient transformer encoders to further improve their efficiency; (3) SHARCS can provide a 2 times inference speed up at an insignificant drop in accuracy.
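The abstract describes training a router that sends each sample to a sub-network of an appropriate width depending on its hardness. The sketch below illustrates one way such width routing can be wired into a transformer-style feed-forward block by slicing its hidden dimension; the router input, width choices, and slicing scheme are assumptions rather than SHARCS's exact design, and training the router itself is omitted.

```python
import torch
import torch.nn as nn

class WidthRoutedFFN(nn.Module):
    """Feed-forward block whose hidden width is chosen per sample by a router."""
    def __init__(self, d_model=256, d_hidden=1024, widths=(0.25, 0.5, 1.0)):
        super().__init__()
        self.widths = widths
        self.fc1 = nn.Linear(d_model, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_model)
        self.router = nn.Linear(d_model, len(widths))    # predicts a width index

    def forward(self, x):                                # x: (batch, seq, d_model)
        pooled = x.mean(dim=1)                           # sample-level summary
        width_idx = self.router(pooled).argmax(dim=-1)   # hard routing decision
        outputs = []
        for i, sample in enumerate(x):                   # per-sample sub-network width
            h = int(self.fc1.out_features * self.widths[width_idx[i]])
            hidden = torch.relu(sample @ self.fc1.weight[:h].T + self.fc1.bias[:h])
            outputs.append(hidden @ self.fc2.weight[:, :h].T + self.fc2.bias)
        return torch.stack(outputs)

ffn = WidthRoutedFFN()
out = ffn(torch.randn(4, 16, 256))                       # -> (4, 16, 256)
```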

A Cautionary Tale: On the Role of Reference Data in Empirical Privacy Defenses

  • paper_url: http://arxiv.org/abs/2310.12112
  • repo_url: None
  • paper_authors: Caelin G. Kaplan, Chuan Xu, Othmane Marfoq, Giovanni Neglia, Anderson Santana de Oliveira
  • for: This paper focuses on developing effective privacy-preserving machine learning methods that can provide satisfactory levels of training data privacy without significantly compromising model utility.
  • methods: The proposed method is based on an empirical risk minimization approach with a constraint on the generalization error, which is evaluated as a weighted empirical risk minimization (WERM) over the training and reference datasets.
  • results: The proposed method outperforms existing state-of-the-art empirical privacy defenses using reference data for nearly all relative privacy levels of reference and training data, and demonstrates the importance of considering the triad of model utility, training data privacy, and reference data privacy when comparing privacy defenses.
    Abstract Within the realm of privacy-preserving machine learning, empirical privacy defenses have been proposed as a solution to achieve satisfactory levels of training data privacy without a significant drop in model utility. Most existing defenses against membership inference attacks assume access to reference data, defined as an additional dataset coming from the same (or a similar) underlying distribution as training data. Despite the common use of reference data, previous works are notably reticent about defining and evaluating reference data privacy. As gains in model utility and/or training data privacy may come at the expense of reference data privacy, it is essential that all three aspects are duly considered. In this paper, we first examine the availability of reference data and its privacy treatment in previous works and demonstrate its necessity for fairly comparing defenses. Second, we propose a baseline defense that enables the utility-privacy tradeoff with respect to both training and reference data to be easily understood. Our method is formulated as an empirical risk minimization with a constraint on the generalization error, which, in practice, can be evaluated as a weighted empirical risk minimization (WERM) over the training and reference datasets. Although we conceived of WERM as a simple baseline, our experiments show that, surprisingly, it outperforms the most well-studied and current state-of-the-art empirical privacy defenses using reference data for nearly all relative privacy levels of reference and training data. Our investigation also reveals that these existing methods are unable to effectively trade off reference data privacy for model utility and/or training data privacy. Overall, our work highlights the need for a proper evaluation of the triad model utility / training data privacy / reference data privacy when comparing privacy defenses.
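The proposed baseline is a weighted empirical risk minimization (WERM) over the training and reference datasets, so a single weight controls the utility-privacy trade-off with respect to both datasets. A minimal PyTorch training-step sketch is shown below; the model, loss, and weight value are placeholders.

```python
import torch

def werm_step(model, criterion, optimizer, train_batch, ref_batch, w=0.7):
    """One step of weighted empirical risk minimization (WERM).

    `w` trades off the empirical risk on the private training data against the
    risk on the reference data; w=1 recovers ordinary training on private data.
    """
    x_tr, y_tr = train_batch
    x_ref, y_ref = ref_batch
    loss = w * criterion(model(x_tr), y_tr) + (1 - w) * criterion(model(x_ref), y_ref)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```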

DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification

  • paper_url: http://arxiv.org/abs/2310.12111
  • repo_url: None
  • paper_authors: Yuanyuan Wang, Yang Zhang, Zhiyong Wu, Zhihan Yang, Tao Wei, Kun Zou, Helen Meng
  • for: 提高深度神经网络模型的总化能力和鲁棒性,通过数据扩充来提高speaker认证模型的性能。
  • methods: 提出了一种新的困难意识 semantic 数据扩充(DASA)方法,通过对话者嵌入空间中的 semantic 方向进行偏移来生成多样化的训练样本,同时保持采样的计算成本很低。
  • results: 实验表明,所提方法可以带来显著的性能提升,最佳结果在CN-Celeb评测集上实现了EER指标14.6%的相对下降。
    Abstract Data augmentation is vital to the generalization ability and robustness of deep neural networks (DNNs) models. Existing augmentation methods for speaker verification manipulate the raw signal, which are time-consuming and the augmented samples lack diversity. In this paper, we present a novel difficulty-aware semantic augmentation (DASA) approach for speaker verification, which can generate diversified training samples in speaker embedding space with negligible extra computing cost. Firstly, we augment training samples by perturbing speaker embeddings along semantic directions, which are obtained from speaker-wise covariance matrices. Secondly, accurate covariance matrices are estimated from robust speaker embeddings during training, so we introduce difficultyaware additive margin softmax (DAAM-Softmax) to obtain optimal speaker embeddings. Finally, we assume the number of augmented samples goes to infinity and derive a closed-form upper bound of the expected loss with DASA, which achieves compatibility and efficiency. Extensive experiments demonstrate the proposed approach can achieve a remarkable performance improvement. The best result achieves a 14.6% relative reduction in EER metric on CN-Celeb evaluation set.
    摘要 数据增强对深度神经网络(DNN)模型的泛化能力和鲁棒性至关重要。现有的说话人验证数据增强方法直接操作原始信号,耗时较长,且增强样本缺乏多样性。本文提出一种新的困难感知语义增强(DASA)方法,能够在说话人嵌入空间中以几乎可忽略的额外计算代价生成多样化的训练样本。首先,我们沿着从各说话人协方差矩阵中获得的语义方向扰动说话人嵌入,从而增强训练样本。其次,为了在训练中从鲁棒的说话人嵌入估计准确的协方差矩阵,我们引入困难感知加性间隔softmax(DAAM-Softmax)以获得最优的说话人嵌入。最后,我们假设增强样本数量趋于无穷,推导出DASA期望损失的闭式上界,兼顾了兼容性与效率。大量实验表明,所提方法能够带来显著的性能提升,最佳结果在CN-Celeb评测集上实现了EER指标14.6%的相对下降。
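A minimal sketch of the augmentation idea above: embeddings are perturbed along directions drawn from per-speaker covariance matrices. The actual DASA estimates these covariances from robust embeddings trained with DAAM-Softmax and optimizes a closed-form upper bound of the expected loss rather than sampling explicitly; the scale `lam` and the 192-dimensional embeddings used here are illustrative assumptions.

```python
import numpy as np

def dasa_augment(embeddings, speaker_ids, lam=0.5, rng=None):
    """Covariance-based semantic augmentation in embedding space (sketch).

    For each speaker, estimate a covariance matrix from that speaker's
    embeddings and draw additive perturbations from N(0, lam * Sigma_s),
    so augmented samples move along plausible intra-speaker directions.
    """
    rng = rng or np.random.default_rng(0)
    embeddings = np.asarray(embeddings, dtype=float)
    augmented = embeddings.copy()
    for spk in np.unique(speaker_ids):
        idx = np.where(speaker_ids == spk)[0]
        emb_s = embeddings[idx]
        # Small ridge keeps the estimated covariance positive semi-definite.
        sigma = np.cov(emb_s, rowvar=False) + 1e-6 * np.eye(emb_s.shape[1])
        noise = rng.multivariate_normal(
            mean=np.zeros(emb_s.shape[1]), cov=lam * sigma, size=len(idx))
        augmented[idx] = emb_s + noise
    return augmented

# Toy usage: 3 speakers, 192-dim embeddings (dimension chosen arbitrarily).
rng = np.random.default_rng(1)
emb = rng.normal(size=(30, 192))
spk = np.repeat([0, 1, 2], 10)
emb_aug = dasa_augment(emb, spk, lam=0.5, rng=rng)
```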

Quality Diversity through Human Feedback

  • paper_url: http://arxiv.org/abs/2310.12103
  • repo_url: None
  • paper_authors: Li Ding, Jenny Zhang, Jeff Clune, Lee Spector, Joel Lehman
  • for: 借助人类反馈提升基础模型在定性任务上的性能
  • methods: 结合人工反馈来推导多样性度量
  • results: 比现有多样性算法提高自动多样性发现能力,并与人工定义多样性度量匹配搜索能力
    Abstract Reinforcement learning from human feedback (RLHF) has exhibited the potential to enhance the performance of foundation models for qualitative tasks. Despite its promise, its efficacy is often restricted when conceptualized merely as a mechanism to maximize learned reward models of averaged human preferences, especially in areas such as image generation which demand diverse model responses. Meanwhile, quality diversity (QD) algorithms, dedicated to seeking diverse, high-quality solutions, are often constrained by the dependency on manually defined diversity metrics. Interestingly, such limitations of RLHF and QD can be overcome by blending insights from both. This paper introduces Quality Diversity through Human Feedback (QDHF), which employs human feedback for inferring diversity metrics, expanding the applicability of QD algorithms. Empirical results reveal that QDHF outperforms existing QD methods regarding automatic diversity discovery, and matches the search capabilities of QD with human-constructed metrics. Notably, when deployed for a latent space illumination task, QDHF markedly enhances the diversity of images generated by a Diffusion model. The study concludes with an in-depth analysis of QDHF's sample efficiency and the quality of its derived diversity metrics, emphasizing its promise for enhancing exploration and diversity in optimization for complex, open-ended tasks.
    摘要 基于人类反馈的强化学习(RLHF)已展现出提升基础模型在定性任务上表现的潜力。然而,当它仅被视为最大化基于平均人类偏好学习得到的奖励模型的机制时,其效果往往受限,尤其是在图像生成等需要多样化模型响应的领域。与此同时,致力于寻找多样且高质量解的质量多样性(QD)算法,通常受制于人工定义的多样性度量。有趣的是,融合两者的思想可以克服RLHF和QD各自的局限。本文提出基于人类反馈的质量多样性方法(QDHF),利用人类反馈来推断多样性度量,从而扩展QD算法的适用范围。实验结果表明,QDHF在自动多样性发现方面优于现有QD方法,并能达到使用人工构造度量的QD的搜索能力。特别地,在隐空间照明任务中,QDHF显著提升了扩散模型生成图像的多样性。研究最后深入分析了QDHF的样本效率及其推断出的多样性度量的质量,强调其在复杂开放任务的优化中提升探索与多样性的潜力。

Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling

  • paper_url: http://arxiv.org/abs/2310.12100
  • repo_url: None
  • paper_authors: Yaqing Wang, Jialin Wu, Tanmaya Dabral, Jiageng Zhang, Geoff Brown, Chun-Ta Lu, Frederick Liu, Yi Liang, Bo Pang, Michael Bendersky, Radu Soricut
  • for: 这篇论文的目的是探讨如何实现具有优秀表现的大型语言模型(LLMs)和视觉语言模型(VLMs)的实际应用,以及如何在不需要专门设计的任务下进行适应和服务。
  • methods: 这篇论文将参数高效微调(PEFT)技术分为两类:侵入式PEFT和非侵入式PEFT。侵入式PEFT直接改变模型内部架构,更为灵活,但会给训练和部署引入显著的复杂性;非侵入式PEFT则保持模型内部架构不变,仅对输入嵌入等模型外部参数进行适应,实现更为简单。论文提出了一种名为AdaLink的非侵入式PEFT技术,它在多种任务上取得了与最先进的侵入式PEFT(LoRA)和全模型微调(FT)相当的表现。
  • results: 结果显示,AdaLink在纯文本和多模态任务上均取得了与最先进侵入式PEFT(LoRA)和全模型微调(FT)相当的表现。实验同时考虑了参数规模的扩展以及不同训练设置(含与不含指令微调)。
    Abstract Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on a wide range of tasks by scaling up parameter counts from O(10^9) to O(10^{12}) levels and further beyond. These large scales make it impossible to adapt and deploy fully specialized models given a task of interest. Parameter-efficient fine-tuning (PEFT) emerges as a promising direction to tackle the adaptation and serving challenges for such large models. We categorize PEFT techniques into two types: intrusive and non-intrusive. Intrusive PEFT techniques directly change a model's internal architecture. Though more flexible, they introduce significant complexities for training and serving. Non-intrusive PEFT techniques leave the internal architecture unchanged and only adapt model-external parameters, such as embeddings for input. In this work, we describe AdaLink as a non-intrusive PEFT technique that achieves competitive performance compared to SoTA intrusive PEFT (LoRA) and full model fine-tuning (FT) on various tasks. We evaluate using both text-only and multimodal tasks, with experiments that account for both parameter-count scaling and training regime (with and without instruction tuning).
    摘要 大型语言模型(LLM)和视觉语言模型(VLM)通过将参数规模从十亿级扩展到万亿级甚至更高,在广泛的任务上展现出优异的性能。如此庞大的规模使得针对特定任务适配并部署完全专用的模型变得不可行。参数高效微调(PEFT)因此成为应对此类大模型适配与部署挑战的一个有前景的方向。我们将PEFT技术分为两类:侵入式与非侵入式。侵入式PEFT直接修改模型的内部架构,虽然更灵活,但给训练和部署带来了显著的复杂性;非侵入式PEFT保持内部架构不变,仅适配模型外部参数,例如输入的嵌入。在这项工作中,我们提出AdaLink这一非侵入式PEFT技术,在多种任务上取得了与最先进的侵入式PEFT(LoRA)和全模型微调(FT)相当的性能。我们在纯文本和多模态任务上进行了评估,实验同时考虑了参数规模扩展和训练设置(含与不含指令微调)。
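A hedged sketch of the non-intrusive, input-centric idea above: the backbone and its embedding table stay frozen, and only a small bottleneck module applied to the input embeddings is trained. The module names, sizes, and the stand-in Transformer backbone below are illustrative assumptions; the paper's AdaLink adapts the input embeddings of large pretrained LLMs/VLMs rather than this toy encoder.

```python
import torch
import torch.nn as nn

class InputAdapter(nn.Module):
    """Bottleneck adapter applied to input embeddings only (non-intrusive)."""
    def __init__(self, d_model=768, bottleneck=16):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        nn.init.zeros_(self.up.weight)   # start as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, x):                # x: (batch, seq, d_model)
        return x + self.up(torch.relu(self.down(x)))

class AdaptedModel(nn.Module):
    """Wraps a frozen backbone; only the input adapter receives gradients."""
    def __init__(self, embed, backbone, d_model=768):
        super().__init__()
        self.embed, self.backbone = embed, backbone
        for p in list(embed.parameters()) + list(backbone.parameters()):
            p.requires_grad_(False)
        self.adapter = InputAdapter(d_model)

    def forward(self, token_ids):
        return self.backbone(self.adapter(self.embed(token_ids)))

# Toy usage with stand-in modules (a real setup would use a pretrained LLM/VLM).
embed = nn.Embedding(1000, 768)
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(768, 8, batch_first=True), num_layers=2)
model = AdaptedModel(embed, backbone)
out = model(torch.randint(0, 1000, (2, 16)))
```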

Position Interpolation Improves ALiBi Extrapolation

  • paper_url: http://arxiv.org/abs/2310.13017
  • repo_url: None
  • paper_authors: Faisal Al-Khateeb, Nolan Dey, Daria Soboleva, Joel Hestness
  • for: 帮助使用旋转位置嵌入(RoPE)的预训练模型外推到更长的序列长度。
  • methods: 使用线性位置插值来扩展采用线性偏置注意力(ALiBi)的模型的外推范围。
  • results: 位置插值显著提高了模型在上游语言建模以及下游摘要和检索任务中的外推能力。
    Abstract Linear position interpolation helps pre-trained models using rotary position embeddings (RoPE) to extrapolate to longer sequence lengths. We propose using linear position interpolation to extend the extrapolation range of models using Attention with Linear Biases (ALiBi). We find position interpolation significantly improves extrapolation capability on upstream language modelling and downstream summarization and retrieval tasks.
    摘要 线性位置插值可以帮助使用旋转位置嵌入(RoPE)的预训练模型外推到更长的序列长度。我们提出使用线性位置插值来扩展采用线性偏置注意力(ALiBi)的模型的外推范围。我们发现,位置插值显著提升了模型在上游语言建模以及下游摘要和检索任务上的外推能力。
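A small sketch of the combination described above: ALiBi penalizes attention logits in proportion to the relative distance between positions, and linear position interpolation rescales those distances by train_length / evaluation_length so that longer evaluation contexts fall back into the distance range seen during training. The geometric slope schedule follows the common ALiBi choice; a causal mask would be added separately for autoregressive models, and all names here are illustrative.

```python
import torch

def alibi_bias(seq_len, num_heads, scale=1.0):
    """ALiBi attention bias with optional linear position interpolation.

    Standard ALiBi adds -m_h * |i - j| to the attention logits, with per-head
    slopes m_h = 2^(-8h/H). Position interpolation shrinks relative distances
    by `scale` (e.g. train_len / eval_len) before the bias is applied.
    """
    slopes = torch.tensor([2 ** (-8 * (h + 1) / num_heads) for h in range(num_heads)])
    pos = torch.arange(seq_len)
    dist = (pos[None, :] - pos[:, None]).abs().float()      # |i - j|
    return -slopes[:, None, None] * dist[None] * scale       # (heads, L, L)

# Model trained with context 1024, evaluated at 4096:
bias = alibi_bias(seq_len=4096, num_heads=8, scale=1024 / 4096)
# `bias` is added to the attention logits before the softmax.
```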

Unveiling the Siren’s Song: Towards Reliable Fact-Conflicting Hallucination Detection

  • paper_url: http://arxiv.org/abs/2310.12086
  • repo_url: https://github.com/zjunlp/factchd
  • paper_authors: Xiang Chen, Duanzheng Song, Honghao Gui, Chengxi Wang, Ningyu Zhang, Fei Huang, Chengfei Lv, Dan Zhang, Huajun Chen
  • for: The paper is written for evaluating the factuality of text generated by large language models (LLMs) and developing a benchmark for detecting fact-conflicting hallucinations in these models.
  • methods: The paper introduces a new benchmark called FactCHD, which assimilates a large-scale dataset of factuality patterns and incorporates fact-based chains of evidence to facilitate comprehensive factual reasoning. The authors also present a new method called TRUTH-TRIANGULATOR, which synthesizes reflective considerations by tool-enhanced ChatGPT and LoRA-tuning based on Llama2 to yield more credible detection.
  • results: The paper demonstrates the effectiveness of the FactCHD benchmark and shows that current methods fall short of faithfully detecting factual errors. The authors also present results from using TRUTH-TRIANGULATOR, which shows improved detection performance compared to existing methods.
    Abstract Large Language Models (LLMs), such as ChatGPT/GPT-4, have garnered widespread attention owing to their myriad of practical applications, yet their adoption has been constrained by issues of fact-conflicting hallucinations across web platforms. The assessment of factuality in text, produced by LLMs, remains inadequately explored, extending not only to the judgment of vanilla facts but also encompassing the evaluation of factual errors emerging in complex inferential tasks like multi-hop, and etc. In response, we introduce FactCHD, a fact-conflicting hallucination detection benchmark meticulously designed for LLMs. Functioning as a pivotal tool in evaluating factuality within "Query-Respons" contexts, our benchmark assimilates a large-scale dataset, encapsulating a broad spectrum of factuality patterns, such as vanilla, multi-hops, comparison, and set-operation patterns. A distinctive feature of our benchmark is its incorporation of fact-based chains of evidence, thereby facilitating comprehensive and conducive factual reasoning throughout the assessment process. We evaluate multiple LLMs, demonstrating the effectiveness of the benchmark and current methods fall short of faithfully detecting factual errors. Furthermore, we present TRUTH-TRIANGULATOR that synthesizes reflective considerations by tool-enhanced ChatGPT and LoRA-tuning based on Llama2, aiming to yield more credible detection through the amalgamation of predictive results and evidence. The benchmark dataset and source code will be made available in https://github.com/zjunlp/FactCHD.
    摘要 大型语言模型(LLM),如ChatGPT/GPT-4,因其众多实际应用而受到广泛关注,但其普及受到了网络平台上事实冲突幻觉问题的限制。评估LLM生成文本的事实真实性仍然探讨不足,这不仅涉及对简单事实的判断,还包括对多跳推理等复杂推断任务中出现的事实错误的评估。为此,我们提出了FactCHD,一个专为LLM设计的事实冲突幻觉检测基准。作为评估"查询-回答"情境下事实真实性的重要工具,我们的基准集成了一个大规模数据集,涵盖简单、多跳、比较和集合运算等多种事实真实性模式。该基准的一个特点是引入了基于事实的证据链,以便在评估过程中进行全面而有效的事实推理。我们对多个LLM进行了评估,证明了该基准的有效性,同时表明现有方法尚无法可靠地检测事实错误。此外,我们还提出了TRUTH-TRIANGULATOR,它综合了工具增强的ChatGPT与基于Llama2的LoRA微调所产生的反思性判断,通过融合预测结果与证据来获得更可信的检测。基准数据集和源代码将在 https://github.com/zjunlp/FactCHD 发布。

DHOT-GM: Robust Graph Matching Using A Differentiable Hierarchical Optimal Transport Framework

  • paper_url: http://arxiv.org/abs/2310.12081
  • repo_url: None
  • paper_authors: Haoran Cheng, Dixin Luo, Hongteng Xu
  • for: 本研究旨在提出一种新的图匹配方法,以更好地利用图中的多模态信息,提高图匹配的精度和鲁棒性。
  • methods: 本方法基于可微的层次最优传输(HOT)框架,利用不同模态的信息进行图匹配。具体来说,我们将每个图表示为一组关系矩阵,每个矩阵对应图中一种模态的信息;然后枚举两个图之间的关系矩阵对并进行匹配,用最优传输距离衡量匹配结果。
  • results: 在多个图匹配任务上的实验表明,该方法比现有方法更加有效和稳健;由于层次最优传输距离可微,各关系矩阵的重要性还可以自适应调整。
    Abstract Graph matching is one of the most significant graph analytic tasks in practice, which aims to find the node correspondence across different graphs. Most existing approaches rely on adjacency matrices or node embeddings when matching graphs, whose performances are often sub-optimal because of not fully leveraging the multi-modal information hidden in graphs, such as node attributes, subgraph structures, etc. In this study, we propose a novel and effective graph matching method based on a differentiable hierarchical optimal transport (HOT) framework, called DHOT-GM. Essentially, our method represents each graph as a set of relational matrices corresponding to the information of different modalities. Given two graphs, we enumerate all relational matrix pairs and obtain their matching results, and accordingly, infer the node correspondence by the weighted averaging of the matching results. This method can be implemented as computing the HOT distance between the two graphs -- each matching result is an optimal transport plan associated with the Gromov-Wasserstein (GW) distance between two relational matrices, and the weights of all matching results are the elements of an upper-level optimal transport plan defined on the matrix sets. We propose a bi-level optimization algorithm to compute the HOT distance in a differentiable way, making the significance of the relational matrices adjustable. Experiments on various graph matching tasks demonstrate the superiority and robustness of our method compared to state-of-the-art approaches.
    摘要 图匹配是实践中最重要的图分析任务之一,旨在找到不同图之间的节点对应关系。现有方法大多基于邻接矩阵或节点嵌入来匹配图,由于未能充分利用图中隐藏的多模态信息(如节点属性、子图结构等),其性能往往欠佳。在本研究中,我们提出了一种基于可微层次最优传输(HOT)框架的新型有效图匹配方法,称为DHOT-GM。本方法将每个图表示为一组对应不同模态信息的关系矩阵。给定两个图,我们枚举所有关系矩阵对并得到其匹配结果,进而通过对匹配结果加权平均来推断节点对应关系。该方法可以视为计算两个图之间的HOT距离:每个匹配结果是与两个关系矩阵之间的Gromov-Wasserstein(GW)距离相关联的最优传输方案,而所有匹配结果的权重则是定义在矩阵集合上的上层最优传输方案的元素。我们提出了一种双层优化算法,以可微的方式计算HOT距离,使各关系矩阵的重要性可调。在多种图匹配任务上的实验表明,与最先进方法相比,我们的方法更加优越和稳健。
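The sketch below illustrates only the building block: Gromov-Wasserstein couplings between relational matrices of two graphs, averaged to score node correspondences. It assumes the POT library (`pip install pot`) for the GW solver, and the fixed uniform weights are a simplification; in DHOT-GM the weights over matrix pairs are learned as an upper-level optimal transport plan inside a differentiable bi-level scheme.

```python
import numpy as np
import ot  # POT: Python Optimal Transport

def match_graphs(mats_a, mats_b, weights=None):
    """Multi-modal graph matching via averaged GW transport plans (sketch).

    `mats_a` / `mats_b` are lists of relational matrices (adjacency,
    attribute-similarity, ...) for graphs A and B. Each cross-modal pair
    yields a GW coupling; couplings are averaged with `weights` to score
    node correspondences.
    """
    na, nb = mats_a[0].shape[0], mats_b[0].shape[0]
    p, q = np.full(na, 1 / na), np.full(nb, 1 / nb)   # uniform node masses
    pairs = [(A, B) for A in mats_a for B in mats_b]
    if weights is None:
        weights = np.full(len(pairs), 1 / len(pairs))
    corr = np.zeros((na, nb))
    for w, (A, B) in zip(weights, pairs):
        T = ot.gromov.gromov_wasserstein(A, B, p, q, loss_fun='square_loss')
        corr += w * T
    return corr.argmax(axis=1)        # predicted match for each node of A

# Toy usage: two 5-node graphs, one modality each (adjacency only).
A = np.array([[0,1,1,0,0],[1,0,1,0,0],[1,1,0,1,0],[0,0,1,0,1],[0,0,0,1,0]], float)
perm = np.array([2, 0, 1, 4, 3])
B = A[perm][:, perm]
print(match_graphs([A], [B]))
```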

Black-Box Training Data Identification in GANs via Detector Networks

  • paper_url: http://arxiv.org/abs/2310.12063
  • repo_url: None
  • paper_authors: Lukman Olagoke, Salil Vadhan, Seth Neel
  • for: 本研究探讨了使用生成对抗网络(GAN)时的隐私问题,特别是在黑盒Setting下(即只有 generator 的样本)。
  • methods: 我们提出了一系列的会员推测攻击,包括一种名为“检测器”的攻击,它通过训练一个第二个网络来评估样本的生成者生成的可能性。
  • results: 我们在多种图像和表格数据集上,以及不同的攻击和 GAN 架构上,发现了非常有趣的隐私攻击。然而,与其他生成和分类模型相比,GAN 的攻击成功率仍然相对较低。这留下了一个有趣的问题:是 GAN 更加隐私,或者需要更强的攻击?
    Abstract Since their inception Generative Adversarial Networks (GANs) have been popular generative models across images, audio, video, and tabular data. In this paper we study whether given access to a trained GAN, as well as fresh samples from the underlying distribution, if it is possible for an attacker to efficiently identify if a given point is a member of the GAN's training data. This is of interest for both reasons related to copyright, where a user may want to determine if their copyrighted data has been used to train a GAN, and in the study of data privacy, where the ability to detect training set membership is known as a membership inference attack. Unlike the majority of prior work this paper investigates the privacy implications of using GANs in black-box settings, where the attack only has access to samples from the generator, rather than access to the discriminator as well. We introduce a suite of membership inference attacks against GANs in the black-box setting and evaluate our attacks on image GANs trained on the CIFAR10 dataset and tabular GANs trained on genomic data. Our most successful attack, called The Detector, involve training a second network to score samples based on their likelihood of being generated by the GAN, as opposed to a fresh sample from the distribution. We prove under a simple model of the generator that the detector is an approximately optimal membership inference attack. Across a wide range of tabular and image datasets, attacks, and GAN architectures, we find that adversaries can orchestrate non-trivial privacy attacks when provided with access to samples from the generator. At the same time, the attack success achievable against GANs still appears to be lower compared to other generative and discriminative models; this leaves the intriguing open question of whether GANs are in fact more private, or if it is a matter of developing stronger attacks.
    摘要 自它们的出现以来,生成对抗网络(GANs)已成为图像、音频、视频和表格数据上广泛使用的生成模型。在这篇论文中,我们研究了给定一个已经训练过GAN的攻击者,以及新的样本从下面分布中获得的情况下,是否可以高效地判断一个点是否属于GAN的训练数据。这对于版权和数据隐私具有重要的意义,因为用户可能想要确定他们的版权数据是否被用来训练GAN,而且在数据隐私方面,能够检测训练集成员是一种称为会员推理攻击的能力。与大多数前期工作不同,本文 investigate GANs在黑盒设置下的隐私问题,攻击者只有Generator的样本而不具有Discriminator的访问权。我们介绍了一组黑盒成员推理攻击,并对图像GAN在CIFAR10数据集和表格GAN在生物数据集进行了评估。我们最成功的攻击方法叫做检测器,它通过训练一个第二个网络来评估样本是否由GAN生成,而不是一个新的样本从分布中。我们证明在一个简单的生成器模型下,检测器是一种相对优化的会员推理攻击。在各种图像和表格数据集、攻击和GAN架构下,我们发现攻击者可以通过Generator的样本进行非常复杂的隐私攻击。尽管GANs在隐私方面的攻击仍然比其他生成和判断模型低,但这仍然留下了一个惊喜的问题:GANs是否更安全,或者是需要更强的攻击。
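A toy sketch of the detector-style attack described above, using synthetic feature vectors and a logistic-regression detector in place of the paper's neural networks over images; the data generator and dimensions are invented for illustration. The mechanics are the point: a second model is fit to separate generator samples from fresh samples, and its "GAN-likeness" score on candidate points serves as the membership signal.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def detector_attack(gan_samples, fresh_samples, candidates, member_labels):
    """Black-box membership inference against a GAN via a detector network."""
    X = np.vstack([gan_samples, fresh_samples])
    y = np.concatenate([np.ones(len(gan_samples)), np.zeros(len(fresh_samples))])
    detector = LogisticRegression(max_iter=2000).fit(X, y)
    scores = detector.predict_proba(candidates)[:, 1]   # higher = more GAN-like
    return roc_auc_score(member_labels, scores)

# Synthetic stand-ins: members are drawn closer to the generator's distribution
# to mimic a GAN that has fit its training data more tightly than fresh data.
rng = np.random.default_rng(0)
gan = rng.normal(0.1, 1.0, size=(500, 20))      # generator outputs
fresh = rng.normal(0.0, 1.0, size=(500, 20))    # held-out real samples
members = rng.normal(0.1, 1.0, size=(100, 20))
non_members = rng.normal(0.0, 1.0, size=(100, 20))
cands = np.vstack([members, non_members])
labels = np.concatenate([np.ones(100), np.zeros(100)])
print("attack AUC:", detector_attack(gan, fresh, cands, labels))
```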

Machine Learning-based Nutrient Application’s Timeline Recommendation for Smart Agriculture: A Large-Scale Data Mining Approach

  • paper_url: http://arxiv.org/abs/2310.12052
  • repo_url: None
  • paper_authors: Usama Ikhlaq, Tahar Kechadi
  • for: 这项研究旨在提供一种可预测肥料应用量的解决方案,以便更好地管理投用肥料,降低成本、保护环境。
  • methods: 该研究使用大规模不同数据类型的数据集,通过分析这些数据来预测肥料应用量。研究还涉及到肥料应用量和天气数据的相互作用对作物产量的影响。
  • results: 研究发现,基于天气和土壤特点进行调整的肥料应用量可以提高作物产量,同时降低肥料投用量。该方法也被证明可靠和可扩展。
    Abstract This study addresses the vital role of data analytics in monitoring fertiliser applications in crop cultivation. Inaccurate fertiliser application decisions can lead to costly consequences, hinder food production, and cause environmental harm. We propose a solution to predict nutrient application by determining required fertiliser quantities for an entire season. The proposed solution recommends adjusting fertiliser amounts based on weather conditions and soil characteristics to promote cost-effective and environmentally friendly agriculture. The collected dataset is high-dimensional and heterogeneous. Our research examines large-scale heterogeneous datasets in the context of the decision-making process, encompassing data collection and analysis. We also study the impact of fertiliser applications combined with weather data on crop yield, using the winter wheat crop as a case study. By understanding local contextual and geographic factors, we aspire to stabilise or even reduce the demand for agricultural nutrients while enhancing crop development. The proposed approach is proven to be efficient and scalable, as it is validated using a real-world and large dataset.
    摘要 本研究探讨了数据分析在作物种植施肥监测中的重要作用。不准确的施肥决策可能带来高昂的代价,阻碍粮食生产并造成环境危害。我们提出了一种通过确定整个生长季所需肥料用量来预测养分施用的解决方案,该方案建议根据天气条件和土壤特性调整施肥量,以促进经济且环保的农业生产。所收集的数据集是高维且异构的。我们的研究在决策过程的背景下考察了大规模异构数据集,涵盖数据收集与分析,并以冬小麦为案例研究了施肥量与天气数据相结合对作物产量的影响。通过理解当地背景和地理因素,我们希望在促进作物生长的同时稳定乃至降低对农业养分的需求。该方法在真实的大规模数据集上得到了验证,证明其高效且可扩展。

Is Channel Independent strategy optimal for Time Series Forecasting?

  • paper_url: http://arxiv.org/abs/2310.17658
  • repo_url: None
  • paper_authors: Yuan Peiwen, Zhu Changsheng
  • for: 这篇论文是为了探讨适用于长期时间序列预测的不同模型。
  • methods: 这篇论文提出了一种简单 yet effective的策略called Channel Self-Clustering (CSC),用于线性模型。此外,它还提出了Channel Rearrangement (CR)方法,用于深度模型。
  • results: 这篇论文的实验结果显示,CSC策略可以提高CI策略的性能,同时减少参数的数量,例如在电力集成数据集上减少了10倍以上。CR方法也可以与基准模型竞争。此外,论文还讨论了是否使用历史时间序列中的同一个通道的历史值来预测未来值。
    Abstract There has been an emergence of various models for long-term time series forecasting. Recent studies have demonstrated that a single linear layer, using Channel Dependent (CD) or Channel Independent (CI) modeling, can even outperform a large number of sophisticated models. However, current research primarily considers CD and CI as two complementary yet mutually exclusive approaches, unable to harness these two extremes simultaneously. And it is also a challenging issue that both CD and CI are static strategies that cannot be determined to be optimal for a specific dataset without extensive experiments. In this paper, we reconsider whether the current CI strategy is the best solution for time series forecasting. First, we propose a simple yet effective strategy called CSC, which stands for $\mathbf{C}$hannel $\mathbf{S}$elf-$\mathbf{C}$lustering strategy, for linear models. Our Channel Self-Clustering (CSC) enhances CI strategy's performance improvements while reducing parameter size, for exmpale by over 10 times on electricity dataset, and significantly cutting training time. Second, we further propose Channel Rearrangement (CR), a method for deep models inspired by the self-clustering. CR attains competitive performance against baselines. Finally, we also discuss whether it is best to forecast the future values using the historical values of the same channel as inputs. We hope our findings and methods could inspire new solutions beyond CD/CI.
    摘要 有些新的模型已经出现了用于长期时间序预测。最近的研究表明,一个单一的线性层,使用通道依赖(CD)或通道独立(CI)的方法,可以超越许多复杂的模型。然而,当前的研究主要考虑CD和CI为两种 complementary yet mutually exclusive的方法,无法同时挖掘这两种极点。此外,CD和CI都是静态策略,无法确定是特定数据集的优化方法。在这篇论文中,我们重新考虑现在CI策略是时间序预测的最佳解决方案。首先,我们提出了一种简单 yet effective的策略,称为$\mathbf{C}$hannel $\mathbf{S}$elf-$\mathbf{C}$lustering(CSC)策略,用于线性模型。我们的通道自我凝聚(CSC)策略可以提高CI策略的性能改进,同时减少参数大小,例如在电力数据集上减少了10倍以上。其次,我们还提出了一种受到自我凝聚启发的深度模型策略,称为通道重新排序(CR)策略。CR策略可以与基elines相比。最后,我们还讨论了是否应该使用历史值的同一个通道作为预测未来值的输入。我们希望我们的发现和方法可以激发新的解决方案 beyond CD/CI。
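A rough sketch of the channel-clustering idea for linear forecasters. Channels are grouped here with k-means over their correlation profiles, a stand-in for the paper's self-clustering (which is learned during training), and one linear lookback-to-horizon map is shared within each cluster, cutting parameters relative to a fully channel-independent model; all sizes and names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_csc_linear(series, lookback, horizon, n_clusters):
    """Channel Self-Clustering (CSC) for linear forecasters (sketch)."""
    T, C = series.shape
    corr = np.corrcoef(series.T)                       # (C, C) channel profiles
    labels = KMeans(n_clusters, n_init=10, random_state=0).fit_predict(corr)
    models = {}
    for k in range(n_clusters):
        X, Y = [], []
        for c in np.where(labels == k)[0]:             # pool windows per cluster
            for t in range(T - lookback - horizon + 1):
                X.append(series[t:t + lookback, c])
                Y.append(series[t + lookback:t + lookback + horizon, c])
        W, *_ = np.linalg.lstsq(np.array(X), np.array(Y), rcond=None)
        models[k] = W                                   # (lookback, horizon)
    return labels, models

def predict_csc(series, labels, models, lookback):
    last = series[-lookback:]                           # (lookback, C)
    return np.stack(
        [last[:, c] @ models[labels[c]] for c in range(series.shape[1])], axis=1)

# Toy usage: 8 channels, 400 steps, 3 channel clusters.
rng = np.random.default_rng(0)
data = rng.normal(size=(400, 8)).cumsum(axis=0)
labels, models = fit_csc_linear(data, lookback=48, horizon=12, n_clusters=3)
forecast = predict_csc(data, labels, models, lookback=48)   # (12, 8)
```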

A General Theoretical Paradigm to Understand Learning from Human Preferences

  • paper_url: http://arxiv.org/abs/2310.12036
  • repo_url: None
  • paper_authors: Mohammad Gheshlaghi Azar, Mark Rowland, Bilal Piot, Daniel Guo, Daniele Calandriello, Michal Valko, Rémi Munos
  • for: 本文旨在理解现代学习从人类偏好中学习(RLHF)的实际算法。
  • methods: 本文使用直接偏好优化(DPO)方法,并进行了深入的理论分析。
  • results: 研究发现,使用新的通用目标函数$\Psi$PO可以减少RLHF和DPO中的两个重要假设,并且可以提供更多的性能保证。此外,在一些示例中,使用 $\Psi$PO 可以实现更高的效率和更好的表现。
    Abstract The prevalent deployment of learning from human preferences through reinforcement learning (RLHF) relies on two important approximations: the first assumes that pairwise preferences can be substituted with pointwise rewards. The second assumes that a reward model trained on these pointwise rewards can generalize from collected data to out-of-distribution data sampled by the policy. Recently, Direct Preference Optimisation (DPO) has been proposed as an approach that bypasses the second approximation and learn directly a policy from collected data without the reward modelling stage. However, this method still heavily relies on the first approximation. In this paper we try to gain a deeper theoretical understanding of these practical algorithms. In particular we derive a new general objective called $\Psi$PO for learning from human preferences that is expressed in terms of pairwise preferences and therefore bypasses both approximations. This new general objective allows us to perform an in-depth analysis of the behavior of RLHF and DPO (as special cases of $\Psi$PO) and to identify their potential pitfalls. We then consider another special case for $\Psi$PO by setting $\Psi$ simply to Identity, for which we can derive an efficient optimisation procedure, prove performance guarantees and demonstrate its empirical superiority to DPO on some illustrative examples.
    摘要 目前广泛部署的基于人类反馈的强化学习(RLHF)依赖两个重要近似:其一假设成对偏好可以用逐点奖励来替代;其二假设在这些逐点奖励上训练的奖励模型能够从已收集的数据泛化到策略采样得到的分布外数据。最近提出的直接偏好优化(DPO)绕过了第二个近似,不经过奖励建模阶段而直接从收集的数据中学习策略,但它仍然严重依赖第一个近似。本文试图对这些实用算法建立更深入的理论理解。特别地,我们推导出一个新的、以成对偏好表述的通用目标$\Psi$PO,从而同时绕过上述两个近似。借助这一通用目标,我们对RLHF和DPO(作为$\Psi$PO的特例)的行为进行了深入分析,并指出了它们潜在的缺陷。随后我们考虑将$\Psi$设为恒等映射的另一特例,为其推导出高效的优化过程,证明了性能保证,并在若干示例上展示了其相对DPO的经验优势。
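The two special cases discussed above can be written compactly as losses over summed log-probabilities of the preferred/rejected responses under the policy and a frozen reference model. The DPO form is the standard one; the second function is a sketch of the identity-ΨPO (IPO) objective, which regresses the log-likelihood-ratio margin towards 1/(2τ) instead of pushing it through a learned pointwise reward. Variable names and the example values are illustrative.

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss on a batch of preference pairs.
    logp_* are summed log-probs of preferred (w) / rejected (l) responses
    under the policy; ref_logp_* under the frozen reference model."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -F.logsigmoid(beta * margin).mean()

def ipo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, tau=0.1):
    """Identity-PsiPO (IPO) sketch: squared regression of the log-ratio
    margin towards 1/(2*tau), avoiding DPO's pointwise-reward assumption."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return ((margin - 1.0 / (2.0 * tau)) ** 2).mean()

# Toy usage on made-up log-probabilities for 4 preference pairs.
logp_w, logp_l = torch.tensor([-5., -7., -6., -4.]), torch.tensor([-9., -8., -6.5, -10.])
ref_w, ref_l = torch.tensor([-6., -7.5, -6., -5.]), torch.tensor([-8., -8., -7., -9.])
print(dpo_loss(logp_w, logp_l, ref_w, ref_l), ipo_loss(logp_w, logp_l, ref_w, ref_l))
```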

SegmATRon: Embodied Adaptive Semantic Segmentation for Indoor Environment

  • paper_url: http://arxiv.org/abs/2310.12031
  • repo_url: https://github.com/wingrune/segmatron
  • paper_authors: Tatiana Zemskova, Margarita Kichik, Dmitry Yudin, Aleksei Staroverov, Aleksandr Panov
  • for: 这篇论文提出了一种自适应Transformer模型,用于具身(embodied)图像语义分割。
  • methods: 该方法使用混合多组件损失函数,在对多张图像进行推理时自适应地调整模型权重。
  • results: 研究表明,通过智能体在室内环境中的动作获取额外图像,可以提高语义分割的质量。
    Abstract This paper presents an adaptive transformer model named SegmATRon for embodied image semantic segmentation. Its distinctive feature is the adaptation of model weights during inference on several images using a hybrid multicomponent loss function. We studied this model on datasets collected in the photorealistic Habitat and the synthetic AI2-THOR Simulators. We showed that obtaining additional images using the agent's actions in an indoor environment can improve the quality of semantic segmentation. The code of the proposed approach and datasets are publicly available at https://github.com/wingrune/SegmATRon.
    摘要 这篇论文提出了一种名为SegmATRon的自适应Transformer模型,用于具身图像语义分割。其特点是在对多张图像进行推理的过程中,利用混合多组件损失函数对模型权重进行自适应调整。我们在照片级真实感的Habitat模拟器和合成的AI2-THOR模拟器中采集的数据集上研究了该模型,结果表明利用智能体在室内环境中的动作获取额外图像能够提高语义分割的质量。所提方法的代码和数据集公开于 https://github.com/wingrune/SegmATRon。

Multi-view Contrastive Learning for Entity Typing over Knowledge Graphs

  • paper_url: http://arxiv.org/abs/2310.12008
  • repo_url: https://github.com/zhiweihu1103/et-mclet
  • paper_authors: Zhiwei Hu, Víctor Gutiérrez-Basulto, Zhiliang Xiang, Ru Li, Jeff Z. Pan
  • for: 本文旨在提出一种新的知识 graphs 实体类型推断方法(MCLET),以更好地编码知识图中实体和类型的Semantic知识。
  • methods: MCLET 方法包括三个模块:一、多视图生成和编码模块,使实体和类型的抽象信息从不同视图得到更好的表示;二、相互视图对比学习模块,使不同视图之间的表示进行协同改进;三、实体类型预测模块,通过多头注意力和混合专家策略来预测缺失的实体类型。
  • results: compared to the state-of-the-art, MCLET 方法在实验中显示出了强大的表现。
    Abstract Knowledge graph entity typing (KGET) aims at inferring plausible types of entities in knowledge graphs. Existing approaches to KGET focus on how to better encode the knowledge provided by the neighbors and types of an entity into its representation. However, they ignore the semantic knowledge provided by the way in which types can be clustered together. In this paper, we propose a novel method called Multi-view Contrastive Learning for knowledge graph Entity Typing (MCLET), which effectively encodes the coarse-grained knowledge provided by clusters into entity and type embeddings. MCLET is composed of three modules: i) Multi-view Generation and Encoder module, which encodes structured information from entity-type, entity-cluster and cluster-type views; ii) Cross-view Contrastive Learning module, which encourages different views to collaboratively improve view-specific representations of entities and types; iii) Entity Typing Prediction module, which integrates multi-head attention and a Mixture-of-Experts strategy to infer missing entity types. Extensive experiments show the strong performance of MCLET compared to the state-of-the-art
    摘要 知识图Entity类型推断(KGET)目标在于推断知识图中实体的可能性类型。现有的KGET方法主要关注如何更好地编码实体周围的知识和类型到其表示中。然而,它们忽略了实体和类型之间的 semantics 知识,即类型之间的聚合知识。本文提出了一种新的方法called Multi-view Contrastive Learning for knowledge graph Entity Typing(MCLET),它可以有效地编码实体和类型的聚合知识到实体和类型表示中。MCLET包括以下三个模块:1. 多视图生成和编码模块(Multi-view Generation and Encoder module):编码实体-类型、实体-团队和团队-类型的结构信息。2. 交叉视图对比学习模块(Cross-view Contrastive Learning module):鼓励不同视图之间的信息进行协同改进视图特定的实体和类型表示。3. 实体类型预测模块(Entity Typing Prediction module):通过多头注意力和 Mixture-of-Experts 策略来预测缺失的实体类型。广泛的实验表明MCLET在比较顶尖方法的情况下显示出了强大的表现。

KI-PMF: Knowledge Integrated Plausible Motion Forecasting

  • paper_url: http://arxiv.org/abs/2310.12007
  • repo_url: None
  • paper_authors: Abhishek Vivekanandan, Ahmed Abouelazm, Philip Schörner, J. Marius Zöllner
  • for: 预测交通actor的运动轨迹,以实现自动驾驶车辆大规模部署。
  • methods: 引入非 Parametric 剪除层和注意力层,以整合定义的知识优化。
  • results: 实现了遵循物理法律和驾驶环境几何学的轨迹预测,提供了安全可靠的运动预测结果,是实现自动驾驶车辆安全有效的关键。
    Abstract Accurately forecasting the motion of traffic actors is crucial for the deployment of autonomous vehicles at a large scale. Current trajectory forecasting approaches primarily concentrate on optimizing a loss function with a specific metric, which can result in predictions that do not adhere to physical laws or violate external constraints. Our objective is to incorporate explicit knowledge priors that allow a network to forecast future trajectories in compliance with both the kinematic constraints of a vehicle and the geometry of the driving environment. To achieve this, we introduce a non-parametric pruning layer and attention layers to integrate the defined knowledge priors. Our proposed method is designed to ensure reachability guarantees for traffic actors in both complex and dynamic situations. By conditioning the network to follow physical laws, we can obtain accurate and safe predictions, essential for maintaining autonomous vehicles' safety and efficiency in real-world settings.In summary, this paper presents concepts that prevent off-road predictions for safe and reliable motion forecasting by incorporating knowledge priors into the training process.
    摘要 准确预测交通参与者的运动,对于自动驾驶车辆的大规模部署至关重要。当前的轨迹预测方法主要集中于针对特定指标优化损失函数,这可能导致预测结果不符合物理规律或违反外部约束。我们的目标是引入显式的知识先验,使网络在预测未来轨迹时同时遵循车辆的运动学约束和驾驶环境的几何结构。为实现这一目标,我们引入非参数化剪枝层和注意力层来整合所定义的知识先验。所提方法旨在复杂和动态情形下保证交通参与者的可达性。通过让网络遵循物理规律,我们可以获得准确且安全的预测,这对于在真实场景中保持自动驾驶车辆的安全与效率至关重要。总之,本文通过在训练过程中融入知识先验,提出了避免偏离道路预测、实现安全可靠运动预测的方法。

Sociotechnical Safety Evaluation of Generative AI Systems

  • paper_url: http://arxiv.org/abs/2310.11986
  • repo_url: None
  • paper_authors: Laura Weidinger, Maribeth Rauh, Nahema Marchal, Arianna Manzini, Lisa Anne Hendricks, Juan Mateos-Garcia, Stevie Bergman, Jackie Kay, Conor Griffin, Ben Bariach, Iason Gabriel, Verena Rieser, William Isaac
  • for: 评估生成AI系统的安全性
  • methods: 提出三层框架,包括能力评估、系统安全原则和人类互动的评估
  • results: 发现现有评估缺陷,并提出了解决方案,包括实践步骤和不同角色的责任
    Abstract Generative AI systems produce a range of risks. To ensure the safety of generative AI systems, these risks must be evaluated. In this paper, we make two main contributions toward establishing such evaluations. First, we propose a three-layered framework that takes a structured, sociotechnical approach to evaluating these risks. This framework encompasses capability evaluations, which are the main current approach to safety evaluation. It then reaches further by building on system safety principles, particularly the insight that context determines whether a given capability may cause harm. To account for relevant context, our framework adds human interaction and systemic impacts as additional layers of evaluation. Second, we survey the current state of safety evaluation of generative AI systems and create a repository of existing evaluations. Three salient evaluation gaps emerge from this analysis. We propose ways forward to closing these gaps, outlining practical steps as well as roles and responsibilities for different actors. Sociotechnical safety evaluation is a tractable approach to the robust and comprehensive safety evaluation of generative AI systems.
    摘要 生成AI系统的应用涉及到一系列的风险。为确保生成AI系统的安全,这些风险必须进行评估。在这篇论文中,我们提出了两个主要贡献,以便建立这些评估。首先,我们提议一种三层结构的框架,它采用一种结构化的社会技术方法来评估这些风险。这个框架包括功能评估,这是当前主要的安全评估方法。然后,我们又基于系统安全原则,尤其是认为上下文决定了一个给定的功能是否会带来害。为了考虑相关的上下文,我们的框架添加了人机交互和系统影响作为其他两个层次评估。其次,我们对生成AI系统的安全评估状况进行了调查,并创建了一个库存的评估。三个突出的评估漏洞出现在这种分析中。我们提出了方法来填充这些漏洞,并详细介绍了不同角色的角色和责任。社会技术安全评估是一种可行的方法,以确保生成AI系统的安全评估是全面和可靠的。

InfoDiffusion: Information Entropy Aware Diffusion Process for Non-Autoregressive Text Generation

  • paper_url: http://arxiv.org/abs/2310.11976
  • repo_url: https://github.com/rzhwang/infodiffusion
  • paper_authors: Renzhi Wang, Jing Li, Piji Li
  • for: 提高文本生成质量和多样性,尝试 bridge 人类自然文本生成过程和当前扩散模型的生成过程之间的差距。
  • methods: InfoDiffusion 使用 “keyinfo-first” 生成策略,并在不同文本信息量下应用噪音调度。另外,InfoDiffusion 还结合了自我条件和新提出的部分噪音模型结构。
  • results: InfoDiffusion 在生成质量和多样性方面表现出色,同时 sampling efficiency 也高于基eline模型。
    Abstract Diffusion models have garnered considerable interest in the field of text generation. Several studies have explored text diffusion models with different structures and applied them to various tasks, including named entity recognition and summarization. However, there exists a notable disparity between the "easy-first" text generation process of current diffusion models and the "keyword-first" natural text generation process of humans, which has received limited attention. To bridge this gap, we propose InfoDiffusion, a non-autoregressive text diffusion model. Our approach introduces a "keyinfo-first" generation strategy and incorporates a noise schedule based on the amount of text information. In addition, InfoDiffusion combines self-conditioning with a newly proposed partially noising model structure. Experimental results show that InfoDiffusion outperforms the baseline model in terms of generation quality and diversity, as well as exhibiting higher sampling efficiency.
    摘要 扩散模型在文本生成领域已经引起了广泛的关注。多项研究探索了不同结构的文本扩散模型,并将其应用于命名实体识别和摘要等任务。然而,现有扩散模型"先易后难"的文本生成过程,与人类"先关键词后其余"的自然文本生成过程之间存在显著差距,这一点很少得到关注。为弥合这一差距,我们提出了InfoDiffusion,一种非自回归文本扩散模型。我们的方法采用"关键信息优先"的生成策略,并引入基于文本信息量的噪声调度。此外,InfoDiffusion还结合了自条件机制与新提出的部分加噪模型结构。实验结果表明,InfoDiffusion在生成质量和多样性方面均优于基线模型,同时具有更高的采样效率。

Improving Generalization of Alignment with Human Preferences through Group Invariant Learning

  • paper_url: http://arxiv.org/abs/2310.11971
  • repo_url: None
  • paper_authors: Rui Zheng, Wei Shen, Yuan Hua, Wenbin Lai, Shihan Dou, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Haoran Huang, Tao Gui, Qi Zhang, Xuanjing Huang
  • for: This paper aims to improve the stability and generalization of AI assistants based on language models (LLMs) by proposing a novel approach to Reinforcement Learning from Human Feedback (RLHF).
  • methods: The proposed approach uses a combination of data classification and adaptive exploration to learn a consistent policy across various domains. It deliberately maximizes performance variance and allocates more learning capacity to challenging data.
  • results: The experimental results show that the proposed approach significantly enhances training stability and model generalization, outperforming traditional RL methods that exploit shortcuts and overlook challenging samples.
    Abstract The success of AI assistants based on language models (LLMs) hinges crucially on Reinforcement Learning from Human Feedback (RLHF), which enables the generation of responses more aligned with human preferences. As universal AI assistants, there's a growing expectation for them to perform consistently across various domains. However, previous work shows that Reinforcement Learning (RL) often exploits shortcuts to attain high rewards and overlooks challenging samples. This focus on quick reward gains undermines both the stability in training and the model's ability to generalize to new, unseen data. In this work, we propose a novel approach that can learn a consistent policy via RL across various data groups or domains. Given the challenges associated with acquiring group annotations, our method automatically classifies data into different groups, deliberately maximizing performance variance. Then, we optimize the policy to perform well on challenging groups. Lastly, leveraging the established groups, our approach adaptively adjusts the exploration space, allocating more learning capacity to more challenging data and preventing the model from over-optimizing on simpler data. Experimental results indicate that our approach significantly enhances training stability and model generalization.
    摘要 成功的语言模型基于AI助手(LLMs)取决于人类反馈学习(RLHF),它使得生成响应更加与人类偏好相align。作为普遍的AI助手,人们对它们的表现具有增加的期望,但previous work表明,使用学习策略(RL)时,经常利用短cut掉到达高奖励,而忽略了复杂的样本。这种围绕快速奖励的专注会使训练不稳定并导致模型无法在新、未看到的数据上Generalize。在这项工作中,我们提出了一种新的方法,可以通过RL在不同的数据组或领域中学习一个一致的策略。由于获得组注释的困难,我们的方法会自动将数据分类为不同的组,故意增加性能的变化。然后,我们会优化策略,以便在复杂的组中表现好。最后,我们通过已有的组来适应性地调整探索空间,将更多的学习资源分配给更加复杂的数据,避免模型在简单的数据上过度优化。实验结果表明,我们的方法可以显著提高训练稳定性和模型泛化。

A Multi-Scale Decomposition MLP-Mixer for Time Series Analysis

  • paper_url: http://arxiv.org/abs/2310.11959
  • repo_url: https://github.com/zshhans/msd-mixer
  • paper_authors: Shuhan Zhong, Sizhe Song, Guanyao Li, Weipeng Zhuo, Yang Liu, S. -H. Gary Chan
  • for: 这篇论文是为了提出一种基于多尺度分解和多层混合的深度学习方法,以更好地处理时间序列数据的特殊特点和复杂多尺度时间变化。
  • methods: 这篇论文使用了一种叫做Multi-Scale Decomposition MLP-Mixer(MSD-Mixer)的方法,它可以输出不同层次的时间序列分解成分,并将这些分解成分与不同层次的时间序列进行混合,以捕捉时间序列的多尺度特征和互相关联。
  • results: 这篇论文的实验结果显示,使用MSD-Mixer方法可以在不同的时间序列分析任务(长期和短期预测、填充、侦测异常和分类)中,与其他现有的任务特定和任务通用方法相比,实现了更好的性能。
    Abstract Time series data, often characterized by unique composition and complex multi-scale temporal variations, requires special consideration of decomposition and multi-scale modeling in its analysis. Existing deep learning methods on this best fit to only univariate time series, and have not sufficiently accounted for sub-series level modeling and decomposition completeness. To address this, we propose MSD-Mixer, a Multi-Scale Decomposition MLP-Mixer which learns to explicitly decompose the input time series into different components, and represents the components in different layers. To handle multi-scale temporal patterns and inter-channel dependencies, we propose a novel temporal patching approach to model the time series as multi-scale sub-series, i.e., patches, and employ MLPs to mix intra- and inter-patch variations and channel-wise correlations. In addition, we propose a loss function to constrain both the magnitude and autocorrelation of the decomposition residual for decomposition completeness. Through extensive experiments on various real-world datasets for five common time series analysis tasks (long- and short-term forecasting, imputation, anomaly detection, and classification), we demonstrate that MSD-Mixer consistently achieves significantly better performance in comparison with other state-of-the-art task-general and task-specific approaches.
    摘要 时间序列数据,经常具有独特的组成和复杂的多尺度时间变化,需要特殊地对它进行分解和多尺度模型化的分析。现有的深度学习方法只适用于单变量时间序列,并未充分考虑分解完整性和子序列水平的模型化。为解决这个问题,我们提议了MSD-Mixer,一种多尺度分解MLP-Mixer,可以显式地将输入时间序列分解成不同的组成部分,并将这些部分在不同层中表示。同时,我们提出了一种新的时间补丁方法,可以模型时间序列为多尺度子序列,即补丁,并使用MLP来混合内部和外部补丁的变化和通道之间的相关性。此外,我们还提出了一种约束分解 оста异量和自相关的损失函数,以确保分解完整性。经过对各种实际世界数据集的多种时间序列分析任务(长期和短期预测、填充、异常检测和分类)的广泛实验,我们示示了MSD-Mixer在与其他当前领域的任务特定和任务通用方法相比,具有显著更好的表现。
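A minimal sketch of one mixing block in the spirit described above: the series is split into patches, and MLPs mix variation within patches, across patches, and across channels. The real MSD-Mixer stacks such blocks with different patch sizes, outputs a decomposition component per layer, and adds the residual magnitude/autocorrelation losses, all of which are omitted here; layer names and sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PatchMixerBlock(nn.Module):
    """One multi-scale patch + MLP-mixing block (sketch)."""
    def __init__(self, seq_len, n_channels, patch_size, hidden=64):
        super().__init__()
        assert seq_len % patch_size == 0
        self.p, self.n = patch_size, seq_len // patch_size
        self.intra = nn.Sequential(nn.Linear(patch_size, hidden), nn.GELU(),
                                   nn.Linear(hidden, patch_size))
        self.inter = nn.Sequential(nn.Linear(self.n, hidden), nn.GELU(),
                                   nn.Linear(hidden, self.n))
        self.chan = nn.Sequential(nn.Linear(n_channels, hidden), nn.GELU(),
                                  nn.Linear(hidden, n_channels))

    def forward(self, x):                       # x: (batch, seq_len, channels)
        b, t, c = x.shape
        z = x.reshape(b, self.n, self.p, c)      # group timesteps into patches
        z = z + self.intra(z.transpose(2, 3)).transpose(2, 3)   # mix within patches
        z = z + self.inter(z.transpose(1, 3)).transpose(1, 3)   # mix across patches
        z = z + self.chan(z)                                    # mix across channels
        return z.reshape(b, t, c)

# Toy usage: batch of 2 series, length 96, 7 channels, patch size 8.
block = PatchMixerBlock(seq_len=96, n_channels=7, patch_size=8)
out = block(torch.randn(2, 96, 7))
```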

Too Good To Be True: performance overestimation in (re)current practices for Human Activity Recognition

  • paper_url: http://arxiv.org/abs/2310.11950
  • repo_url: None
  • paper_authors: Andrés Tello, Victoria Degeler, Alexander Lazovik
  • for: 这篇论文旨在提醒科学界关于人活动识别(HAR)研究中的准确度过估问题,具体来说是因为数据 segmentation和评估方法的偏见导致的。
  • methods: 这篇论文使用滑动窗口方法进行数据 segmentation,并使用标准随机k-fold交叉验证法,这些方法是当今HAR研究中最常用的,但可能导致偏见的结果。
  • results: 这篇论文显示,这些有偏的评估方法会高估准确率,使采用正确无偏方法(报告较低准确率)的工作更难发表;并且无论使用何种方法或数据集,该问题都持续存在。
    Abstract Today, there are standard and well established procedures within the Human Activity Recognition (HAR) pipeline. However, some of these conventional approaches lead to accuracy overestimation. In particular, sliding windows for data segmentation followed by standard random k-fold cross validation, produce biased results. An analysis of previous literature and present-day studies, surprisingly, shows that these are common approaches in state-of-the-art studies on HAR. It is important to raise awareness in the scientific community about this problem, whose negative effects are being overlooked. Otherwise, publications of biased results lead to papers that report lower accuracies, with correct unbiased methods, harder to publish. Several experiments with different types of datasets and different types of classification models allow us to exhibit the problem and show it persists independently of the method or dataset.
    摘要 今天,人活动识别(HAR)管道中有标准化和确定的程序。然而,这些传统方法会导致准确性过估。具体来说,使用滑块窗口进行数据分割,然后使用标准随机k-叶值验证,会产生偏见结果。历史和当代研究的分析表明,这些是现代HAR研究中最常见的方法。重要的是,我们需要在科学社区中启示这个问题,以避免这种偏见的负面影响。否则,使用偏见方法的出版物将导致准确方法更难于发表。我们通过不同的数据集和不同的分类模型进行了多个实验,证明了这个问题的存在,并证明它独立于方法或数据集。
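The snippet below reproduces the evaluation pitfall on toy data: overlapping sliding windows are shuffled into random k-fold splits, so near-duplicate windows from the same subject land in both train and test folds, whereas subject-wise GroupKFold keeps them apart. The signal generator, window sizes, and classifier are arbitrary stand-ins; the point is only the contrast between the two cross-validation protocols.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, KFold, GroupKFold

def segment(signal, labels, subjects, win=128, step=64):
    """Sliding-window segmentation; overlapping windows share most samples,
    which is what leaks information when windows are shuffled into folds."""
    X, y, g = [], [], []
    for s in range(0, len(signal) - win, step):
        X.append(signal[s:s + win].ravel())
        y.append(labels[s + win // 2])
        g.append(subjects[s + win // 2])
    return np.array(X), np.array(y), np.array(g)

# Toy tri-axial signal: activity-dependent level + subject-specific offset + noise.
rng = np.random.default_rng(0)
subj = np.repeat(np.arange(10), 600)            # 10 recording subjects
lab = np.repeat(rng.integers(0, 4, 60), 100)    # one activity per 100-sample bout
sig = lab[:, None] * 0.3 + subj[:, None] * 0.2 + rng.normal(size=(6000, 3))
X, y, groups = segment(sig, lab, subj)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
biased = cross_val_score(clf, X, y, cv=KFold(5, shuffle=True, random_state=0))
unbiased = cross_val_score(clf, X, y, cv=GroupKFold(5), groups=groups)
print("random k-fold (typically optimistic):", biased.mean())
print("subject-wise folds                  :", unbiased.mean())
```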

A Benchmark for Semi-Inductive Link Prediction in Knowledge Graphs

  • paper_url: http://arxiv.org/abs/2310.11917
  • repo_url: None
  • paper_authors: Adrian Kochsiek, Rainer Gemulla
  • for: 本文旨在提出和评估大规模知识图(KG)中 semi-inductive link prediction(LP)模型的一个大规模 benchmark。
  • methods: 本文以 Wikidata5M 为基础构建基准,提供直推式、k-shot 和 0-shot 三类 LP 任务,每类任务的可用信息从仅有 KG 结构,到包含文本提及,再到包含实体的详细描述不等。
  • results: 小规模实验结果表明,在长尾实体上,半归纳 LP 的性能在所有实验中都远低于直推式 LP 的性能。该基准为进一步研究在半归纳 LP 模型中融合文本与上下文信息提供了测试平台。
    Abstract Semi-inductive link prediction (LP) in knowledge graphs (KG) is the task of predicting facts for new, previously unseen entities based on context information. Although new entities can be integrated by retraining the model from scratch in principle, such an approach is infeasible for large-scale KGs, where retraining is expensive and new entities may arise frequently. In this paper, we propose and describe a large-scale benchmark to evaluate semi-inductive LP models. The benchmark is based on and extends Wikidata5M: It provides transductive, k-shot, and 0-shot LP tasks, each varying the available information from (i) only KG structure, to (ii) including textual mentions, and (iii) detailed descriptions of the entities. We report on a small study of recent approaches and found that semi-inductive LP performance is far from transductive performance on long-tail entities throughout all experiments. The benchmark provides a test bed for further research into integrating context and textual information in semi-inductive LP models.
    摘要 知识图(KG)中的半归纳链接预测(LP)任务是基于上下文信息,为此前未见过的新实体预测事实。虽然原则上可以通过从头重新训练模型来纳入新实体,但对大规模知识图而言这种做法不可行:重新训练代价高昂,而新实体又可能频繁出现。在这篇论文中,我们提出并描述了一个用于评估半归纳LP模型的大规模基准。该基准基于并扩展了Wikidata5M,提供直推式、k-shot和0-shot三类LP任务,各任务可用的信息从(i)仅有KG结构,到(ii)包含文本提及,再到(iii)包含实体的详细描述逐级递增。我们对若干最新方法进行了小规模研究,发现在所有实验中,长尾实体上的半归纳LP性能都远逊于直推式LP性能。该基准为进一步研究如何在半归纳LP模型中融合上下文与文本信息提供了测试平台。

Analyze Mass Spectrometry data with Artificial Intelligence to assist the understanding of past habitability of Mars and provide insights for future missions

  • paper_url: http://arxiv.org/abs/2310.11888
  • repo_url: https://github.com/ioannisnasios/marsspectrometry2_gaschromatography
  • paper_authors: Ioannis Nasios
  • for: 这个研究用于检测古代火星是否可居住,但同时这种方法也可以应用于我们太阳系中的任何天体。
  • methods: 这个研究使用人工智能分析火星气相学数据,包括演化气相分析(EGA-MS)和气chromatography(GC-MS)两种技术,以确定古代火星样本中的特定化学物质。
  • results: 研究表明EGA-MS和GC-MS数据可以用于描述外星物质的化学成分,并且提供了一种可靠的方法来分析这些数据。
    Abstract This paper presents an application of artificial intelligence on mass spectrometry data for detecting habitability potential of ancient Mars. Although data was collected for planet Mars the same approach can be replicated for any terrestrial object of our solar system. Furthermore, proposed methodology can be adapted to any domain that uses mass spectrometry. This research is focused in data analysis of two mass spectrometry techniques, evolved gas analysis (EGA-MS) and gas chromatography (GC-MS), which are used to identify specific chemical compounds in geological material samples. The study demonstrates the applicability of EGA-MS and GC-MS data to extra-terrestrial material analysis. Most important features of proposed methodology includes square root transformation of mass spectrometry values, conversion of raw data to 2D sprectrograms and utilization of specific machine learning models and techniques to avoid overfitting on relative small datasets. Both EGA-MS and GC-MS datasets come from NASA and two machine learning competitions that the author participated and exploited. Complete running code for the GC-MS dataset/competition is available at GitHub.1 Raw training mass spectrometry data include [0, 1] labels of specific chemical compounds, selected to provide valuable insights and contribute to our understanding of the potential past habitability of Mars.
    摘要 本文介绍了一种将人工智能应用于质谱数据、以检测古代火星宜居潜力的方法。尽管数据采集自火星,同样的思路也可以推广到太阳系中的任何类地天体,所提出的方法还可适用于任何使用质谱的领域。本研究聚焦于两种质谱技术的数据分析:逸出气体分析(EGA-MS)和气相色谱(GC-MS),二者用于识别地质材料样品中的特定化合物。研究展示了EGA-MS与GC-MS数据在地外物质分析中的适用性。所提方法的主要特点包括对质谱数值做平方根变换、将原始数据转换为二维谱图,以及利用特定的机器学习模型与技巧来避免在相对较小的数据集上过拟合。EGA-MS与GC-MS数据集均来自NASA以及作者参加并利用的两场机器学习竞赛。GC-MS数据集/竞赛的完整运行代码已发布在GitHub。原始训练质谱数据带有特定化合物的[0, 1]标签,这些化合物经挑选以提供有价值的见解,有助于我们理解火星过去的宜居潜力。
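A hedged sketch of the preprocessing steps named in the abstract (square-root transform and conversion of raw readings to 2D spectrograms): raw (time, m/z, abundance) triples are square-root transformed and binned onto a fixed time-by-m/z grid. The bin counts, normalization, and synthetic readings are illustrative assumptions; the actual pipeline and label handling live in the linked repository.

```python
import numpy as np

def to_spectrogram(time, mz, abundance, n_time_bins=64, n_mz_bins=128):
    """Square-root transform raw mass-spectrometry readings and bin them
    into a fixed-size 2D "spectrogram" (time x m/z) for image-style models."""
    values = np.sqrt(np.maximum(abundance, 0.0))           # compress dynamic range
    t_edges = np.linspace(time.min(), time.max(), n_time_bins + 1)
    m_edges = np.linspace(mz.min(), mz.max(), n_mz_bins + 1)
    grid, _, _ = np.histogram2d(time, mz, bins=[t_edges, m_edges], weights=values)
    counts, _, _ = np.histogram2d(time, mz, bins=[t_edges, m_edges])
    grid = np.divide(grid, counts, out=np.zeros_like(grid), where=counts > 0)
    return grid / (grid.max() + 1e-9)                       # (n_time_bins, n_mz_bins)

# Toy usage with synthetic EGA-MS-like readings.
rng = np.random.default_rng(0)
t = rng.uniform(0, 100, 5000)        # evolving temperature / time
m = rng.uniform(10, 250, 5000)       # m/z values
a = rng.exponential(1e4, 5000)       # ion abundances
spec = to_spectrogram(t, m, a)
```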

From Neural Activations to Concepts: A Survey on Explaining Concepts in Neural Networks

  • paper_url: http://arxiv.org/abs/2310.11884
  • repo_url: None
  • paper_authors: Jae Hee Lee, Sergio Lanza, Stefan Wermter
  • for: 本文主要用于探讨现代神经网络中概念解释的方法。
  • methods: 本文使用了多种方法来解释神经网络中的概念,包括特征重要性分析、模型解释和概念映射等。
  • results: 本文通过对多种神经网络模型进行分析,发现了一些有用的概念解释方法,并且提出了一些可能的应用场景。这些结果可能为实现基于可解释概念的神经网络和符号学AI做出了重要贡献。
    Abstract In this paper, we review recent approaches for explaining concepts in neural networks. Concepts can act as a natural link between learning and reasoning: once the concepts are identified that a neural learning system uses, one can integrate those concepts with a reasoning system for inference or use a reasoning system to act upon them to improve or enhance the learning system. On the other hand, knowledge can not only be extracted from neural networks but concept knowledge can also be inserted into neural network architectures. Since integrating learning and reasoning is at the core of neuro-symbolic AI, the insights gained from this survey can serve as an important step towards realizing neuro-symbolic AI based on explainable concepts.
    摘要 在这篇论文中,我们综述了解释神经网络中概念的最新方法。概念可以作为连接学习与推理的自然纽带:一旦确定了神经学习系统所使用的概念,就可以将这些概念与推理系统集成以进行推断,或利用推理系统对其施加作用,从而改进或增强学习系统。另一方面,不仅可以从神经网络中提取知识,也可以将概念知识注入神经网络架构之中。由于融合学习与推理正是神经符号AI的核心,本综述所获得的洞见可以成为迈向基于可解释概念的神经符号AI的重要一步。

AI Nushu: An Exploration of Language Emergence in Sisterhood -Through the Lens of Computational Linguistics

  • paper_url: http://arxiv.org/abs/2310.11870
  • repo_url: None
  • paper_authors: Yuqian Sun, Yuying Tang, Ze Gao, Zhijun Pan, Chuyan Xu, Yurou Chen, Kejiang Qian, Zhigang Wang, Tristan Braud, Chang Hee Lee, Ali Asadipour
  • for: 这篇论文旨在探讨一种基于女性文化遗产的人工智能语言系统,即“AI Nushu”。
  • methods: 该论文使用了人工智能技术,将中文词典和女性文字资料库训练两个人工智能代理人,以便共同创建一种标准写作系统,用于编码中文。
  • results: 该研究提供了一种搭建在人工智能技术和中国文化遗产之上的艺术解读,以及一种将女性视角与计算语言学融合的新的视角。
    Abstract This paper presents "AI Nushu," an emerging language system inspired by Nushu (women's scripts), the unique language created and used exclusively by ancient Chinese women who were thought to be illiterate under a patriarchal society. In this interactive installation, two artificial intelligence (AI) agents are trained in the Chinese dictionary and the Nushu corpus. By continually observing their environment and communicating, these agents collaborate towards creating a standard writing system to encode Chinese. It offers an artistic interpretation of the creation of a non-western script from a computational linguistics perspective, integrating AI technology with Chinese cultural heritage and a feminist viewpoint.
    摘要 本文介绍了"AI女书"(AI Nushu),一个受女书启发而生成的新兴语言系统。女书是古代中国女性在父权社会中被视为"不识字"的情况下创造并专属使用的独特文字。在这一交互式装置中,两个人工智能(AI)智能体基于中文字典与女书语料进行训练,通过持续观察环境并相互交流,协作创造出一套用于编码中文的标准书写系统。该作品从计算语言学的视角对非西方文字的诞生做出了艺术诠释,将AI技术与中国文化遗产及女性主义视角相融合。

Enhancing Genetic Improvement Mutations Using Large Language Models

  • paper_url: http://arxiv.org/abs/2310.19813
  • repo_url: None
  • paper_authors: Alexander E. I. Brownlee, James Callan, Karine Even-Mendoza, Alina Geiger, Carol Hanna, Justyna Petke, Federica Sarro, Dominik Sobania
  • for: 本研究探讨了使用大语言模型(LLM)进行生成改进(Genetic Improvement,GI)的搜索技术。
  • methods: 本研究使用了OpenAI的API来生成JCodec工具的编辑。研究采用了5种不同的编辑类型进行随机抽样。
  • results: 研究发现,使用LLM生成的编辑可以提高单元测试通过率达到75%,但找到最佳改进的patch通常是通过标准插入编辑。此外,LLM增强的GI可以找到许多改进patch,但是最佳改进patch是通过标准GI找到的。
    Abstract Large language models (LLMs) have been successfully applied to software engineering tasks, including program repair. However, their application in search-based techniques such as Genetic Improvement (GI) is still largely unexplored. In this paper, we evaluate the use of LLMs as mutation operators for GI to improve the search process. We expand the Gin Java GI toolkit to call OpenAI's API to generate edits for the JCodec tool. We randomly sample the space of edits using 5 different edit types. We find that the number of patches passing unit tests is up to 75% higher with LLM-based edits than with standard Insert edits. Further, we observe that the patches found with LLMs are generally less diverse compared to standard edits. We ran GI with local search to find runtime improvements. Although many improving patches are found by LLM-enhanced GI, the best improving patch was found by standard GI.
    摘要 大型语言模型(LLM)已成功应用于软件工程任务,包括程序修复。然而,它们在基于搜索的技术,如遗传改进(GI)中的应用仍然是未知之地。在这篇论文中,我们评估了使用 LLM 作为 GI 中的变异运算来改善搜索过程。我们扩展了 Gin Java GI 工具包,以调用 OpenAI 的 API 生成 JCodec 工具中的修改。我们随机采样了修改空间,使用 5 种不同的修改类型。我们发现,使用 LLM 生成的修改可以提高单元测试通过率达到 75%,而且发现 LLM 生成的修改通常比标准插入修改更加稳定。此外,我们发现使用 LLM 增强 GI 可以找到更好的改进补丁,但最佳改进补丁仍然由标准 GI 找到。
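A rough sketch of what an LLM-backed mutation operator can look like inside a GI loop. It assumes the `openai` Python package (v1+) and an API key in the environment; the prompt wording, model name, and temperature are illustrative rather than the exact settings used with Gin and JCodec in the paper. Each returned edit would still be compiled and run against the unit tests like any other mutation.

```python
# Hypothetical LLM-based mutation operator for genetic improvement (sketch).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def llm_mutate(snippet: str, language: str = "Java") -> str:
    """Ask the LLM for a small edit of a code snippet, to be used as one
    mutation inside a GI search loop alongside classic insert/delete/swap."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",          # illustrative model choice
        temperature=0.7,
        messages=[
            {"role": "system",
             "content": f"You improve {language} code. Reply with code only."},
            {"role": "user",
             "content": "Propose a small edit of this method that may improve "
                        "its runtime without changing behaviour:\n\n" + snippet},
        ],
    )
    return response.choices[0].message.content
```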

The Value-Sensitive Conversational Agent Co-Design Framework

  • paper_url: http://arxiv.org/abs/2310.11848
  • repo_url: None
  • paper_authors: Malak Sadek, Rafael A. Calvo, Celine Mougenot
  • for: 本研究旨在提出一个价值敏感对话代理(VSCA)框架,实现对价值敏感对话代理的共同设计(co-design)。
  • methods: 本研究使用了以前的研究中所识别的需求,以及一个实用的框架,包括一个设计工具组。
  • results: 本研究提出了一个评估协议,以评估框架和设计工具组在设计工作室中的效果。
    Abstract Conversational agents (CAs) are gaining traction in both industry and academia, especially with the advent of generative AI and large language models. As these agents are used more broadly by members of the general public and take on a number of critical use cases and social roles, it becomes important to consider the values embedded in these systems. This consideration includes answering questions such as 'whose values get embedded in these agents?' and 'how do those values manifest in the agents being designed?' Accordingly, the aim of this paper is to present the Value-Sensitive Conversational Agent (VSCA) Framework for enabling the collaborative design (co-design) of value-sensitive CAs with relevant stakeholders. Firstly, requirements for co-designing value-sensitive CAs which were identified in previous works are summarised here. Secondly, the practical framework is presented and discussed, including its operationalisation into a design toolkit. The framework facilitates the co-design of three artefacts that elicit stakeholder values and have a technical utility to CA teams to guide CA implementation, enabling the creation of value-embodied CA prototypes. Finally, an evaluation protocol for the framework is proposed where the effects of the framework and toolkit are explored in a design workshop setting to evaluate both the process followed and the outcomes produced.
    摘要 对话代理(CA)在工业和学术界受到推广,特别是在生成AI和大型自然语言模型的出现后。这些代理在公众中更加广泛使用,扮演许多重要的使用案和社会角色,因此需要考虑这些系统中嵌入的价值。因此,本文的目的是提出价值敏感对话代理(VSCA)框架,帮助专业人员和重要参与者在一起设计价值敏感CA。首先,以前的研究中所识别出的实现值敏感CA的需求简述了一下。其次,实际的框架被提出来,并讨论了它的实现方式。这个框架包括三个展示实物,吸引参与者的价值,并对CA团队提供技术实用性,帮助创建具有价值的CA原型。最后,为这个框架和工具组提出评估协议,以评估这个框架和工具组在设计工作室中的影响,以及它们创造的结果。

Masked Pretraining for Multi-Agent Decision Making

  • paper_url: http://arxiv.org/abs/2310.11846
  • repo_url: None
  • paper_authors: Jie Liu, Yinmin Zhang, Chuming Li, Chao Yang, Yaodong Yang, Yu Liu, Wanli Ouyang
  • for: 这篇论文主要针对多智能体决策问题,旨在建立一个通用智能体可以在零情况下完成决策。
  • methods: 该论文提出了一种基于trasformer架构的MaskMA模型,通过面具学习策略来解决多智能体设置下的困难。此外,该模型还实现了一种通用行为表示,可以在不同的智能体数量和动作空间下进行扩展。
  • results: 实验结果表明,使用MaskMA模型可以在11个训练地图上进行零情况下的赢利率达77.8%,并且在其他类型的下游任务中表现良好(如多策略协作和随机团队游戏)。
    Abstract Building a single generalist agent with zero-shot capability has recently sparked significant advancements in decision-making. However, extending this capability to multi-agent scenarios presents challenges. Most current works struggle with zero-shot capabilities, due to two challenges particular to the multi-agent settings: a mismatch between centralized pretraining and decentralized execution, and varying agent numbers and action spaces, making it difficult to create generalizable representations across diverse downstream tasks. To overcome these challenges, we propose a \textbf{Mask}ed pretraining framework for \textbf{M}ulti-\textbf{a}gent decision making (MaskMA). This model, based on transformer architecture, employs a mask-based collaborative learning strategy suited for decentralized execution with partial observation. Moreover, MaskMA integrates a generalizable action representation by dividing the action space into actions toward self-information and actions related to other entities. This flexibility allows MaskMA to tackle tasks with varying agent numbers and thus different action spaces. Extensive experiments in SMAC reveal MaskMA, with a single model pretrained on 11 training maps, can achieve an impressive 77.8% zero-shot win rate on 60 unseen test maps by decentralized execution, while also performing effectively on other types of downstream tasks (\textit{e.g.,} varied policies collaboration and ad hoc team play).
    摘要

Brain decoding: toward real-time reconstruction of visual perception

  • paper_url: http://arxiv.org/abs/2310.19812
  • repo_url: None
  • paper_authors: Yohann Benchetrit, Hubert Banville, Jean-Rémi King
  • for: 这研究旨在实时解oding brain activity中的视觉过程
  • methods: 使用 magnetoencephalography (MEG) 和一种基于嵌入学习的解码模型
  • results: 1. MEG decoder 可以7倍提高图像检索率; 2. 脑响应图像具有高级视觉特征; 3. 图像检索和生成都表明MEG信号主要含有高级视觉特征, 而7T fMRI 则捕捉低级视觉特征。
    Abstract In the past five years, the use of generative and foundational AI systems has greatly improved the decoding of brain activity. Visual perception, in particular, can now be decoded from functional Magnetic Resonance Imaging (fMRI) with remarkable fidelity. This neuroimaging technique, however, suffers from a limited temporal resolution ($\approx$0.5 Hz) and thus fundamentally constrains its real-time usage. Here, we propose an alternative approach based on magnetoencephalography (MEG), a neuroimaging device capable of measuring brain activity with high temporal resolution ($\approx$5,000 Hz). For this, we develop an MEG decoding model trained with both contrastive and regression objectives and consisting of three modules: i) pretrained embeddings obtained from the image, ii) an MEG module trained end-to-end and iii) a pretrained image generator. Our results are threefold: Firstly, our MEG decoder shows a 7X improvement of image-retrieval over classic linear decoders. Second, late brain responses to images are best decoded with DINOv2, a recent foundational image model. Third, image retrievals and generations both suggest that MEG signals primarily contain high-level visual features, whereas the same approach applied to 7T fMRI also recovers low-level features. Overall, these results provide an important step towards the decoding - in real time - of the visual processes continuously unfolding within the human brain.
    摘要 在过去五年,基于生成和基础AI系统的使用已经大幅提高了脑动力的解码。视觉认知特别是可以通过功能磁共振成像(fMRI)进行高度准确的解码。然而,这种神经成像技术受到时间分辨率的限制(约为0.5Hz),因此在实时应用中受到极大的限制。我们提出了一种备选方案,基于磁共振成像(MEG),这种神经成像设备可以在高时间分辨率(约为5000Hz)下测量脑动力。为此,我们开发了一个基于MEG的解码模型,其包括三个模块:i)预训练的嵌入,ii)基于MEG的练习结构,iii)预训练的图像生成器。我们的结果如下:首先,我们的MEG解码器与 классические线性解码器相比,图像检索的性能提高了7倍。其次,对于图像的晚期响应,DINOv2,一种最新的基础图像模型,表现最佳。最后,图像检索和生成都表明MEG信号主要含有高级视觉特征,而使用7T fMRI也能够恢复低级特征。总之,这些结果为实时解码人类大脑中不断发展的视觉过程提供了重要的一步。

Classification Aggregation without Unanimity

  • paper_url: http://arxiv.org/abs/2310.11841
  • repo_url: None
  • paper_authors: Olivier Cailloux, Matthieu Hervouin, Ali I. Ozkes, M. Remzi Sanver
  • for: 这篇论文主要针对 классификация聚合函数的研究。
  • methods: 论文使用了 dictatorship 来描述每个公民独立的 классификация聚合函数。
  • results: 论文显示了每个独立和不同的类别聚合函数都是 dictatorship,这与 Maniquet 和 Mongin (2016)的结果相同。此外,论文还提出了一种新的证明方法,可以涵盖两个类别的情况,除非对象的数量也是两个。最后,论文还列出了两个类别和两个对象的所有独立和一致的 классификация聚合函数。
    Abstract A classification is a surjective mapping from a set of objects to a set of categories. A classification aggregation function aggregates every vector of classifications into a single one. We show that every citizen sovereign and independent classification aggregation function is essentially a dictatorship. This impossibility implies an earlier result of Maniquet and Mongin (2016), who show that every unanimous and independent classification aggregation function is a dictatorship. The relationship between the two impossibilities is reminiscent to the relationship between Wilson's and Arrow's impossibilities in preference aggregation. Moreover, while the Maniquet-Mongin impossibility rests on the existence of at least three categories, we propose an alternative proof technique that covers the case of two categories, except when the number of objects is also two. We also identify all independent and unanimous classification aggregation functions for the case of two categories and two objects.
    摘要 一种分类是从对象集合到类别集合的满射。分类聚合函数将每个分类向量聚合为单一的分类。我们证明,每个满足公民主权且独立的分类聚合函数本质上都是独裁。这一不可能性结果蕴含了 Maniquet 和 Mongin (2016) 的早期结果,他们证明每个一致且独立的分类聚合函数都是独裁。这两个不可能性结果之间的关系,类似于偏好聚合中 Wilson 不可能性与 Arrow 不可能性之间的关系。此外,Maniquet-Mongin 的不可能性依赖于至少存在三个类别,而我们提出的另一种证明技巧可以覆盖两个类别的情况(对象数量也为两个时除外)。我们还刻画了两个类别、两个对象情形下所有独立且一致的分类聚合函数。
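
For readers who want the objects above pinned down, here is one conventional way to state the definitions the abstract refers to; the notation is ours and is meant only as a reading aid, not as the paper's exact formalization.

```latex
% Notation (ours): objects X, categories C with |C| >= 2; a classification is a
% surjection f : X ->> C, and a profile is an n-tuple (f_1, ..., f_n).
\begin{align*}
&F : \{\, f : X \twoheadrightarrow C \,\}^{\,n} \to C^{X}
    && \text{(classification aggregation function)} \\
&\textbf{Independence:} \quad F(f_1,\dots,f_n)(x) \text{ depends only on } (f_1(x),\dots,f_n(x)) \\
&\textbf{Citizen sovereignty:} \quad \forall\, g : X \twoheadrightarrow C,\ \exists\,(f_1,\dots,f_n):\ F(f_1,\dots,f_n) = g \\
&\textbf{Unanimity:} \quad F(f,\dots,f) = f \\
&\textbf{Dictatorship:} \quad \exists\, i\ \forall\,(f_1,\dots,f_n):\ F(f_1,\dots,f_n) = f_i
\end{align*}
```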

IntentDial: An Intent Graph based Multi-Turn Dialogue System with Reasoning Path Visualization

  • paper_url: http://arxiv.org/abs/2310.11818
  • repo_url: None
  • paper_authors: Zengguang Hao, Jie Zhang, Binxia Xu, Yafang Wang, Gerard de Melo, Xiaolong Li
  • for: 本研究旨在提高对话系统的听众感知和响应能力,使其能够更好地理解用户的意图和需求。
  • methods: 该研究提出了一种基于意图图的多轮对话系统,利用强化学习从动态构建、可扩展的意图图中识别意图元素和标准查询。此外,还提供了可视化组件,以便监视对话中每个轮次的即时推理路径。
  • results: 该研究通过实验证明了该系统的可行性和效果,并且可以帮助提高对话系统的实际应用。
    Abstract Intent detection and identification from multi-turn dialogue has become a widely explored technique in conversational agents, for example, voice assistants and intelligent customer services. The conventional approaches typically cast the intent mining process as a classification task. Although neural classifiers have proven adept at such classification tasks, the issue of neural network models often impedes their practical deployment in real-world settings. We present a novel graph-based multi-turn dialogue system called IntentDial, which identifies a user's intent by identifying intent elements and a standard query from a dynamically constructed and extensible intent graph using reinforcement learning. In addition, we provide visualization components to monitor the immediate reasoning path for each turn of a dialogue, which greatly facilitates further improvement of the system.
    摘要 对话机器人中的意图检测和识别已经得到广泛研究,例如语音助手和智能客服。传统方法通常将意图挖掘过程视为一个分类任务。虽然神经网络分类器在这类任务中表现出色,但其固有问题往往阻碍其在实际场景中的部署。我们提出了一种基于图的多轮对话系统 IntentDial,它通过强化学习从动态构建且可扩展的意图图中识别意图元素和标准查询,从而确定用户意图。此外,我们还提供了可视化组件,用于监测对话中每个轮次的即时推理路径,这对系统的进一步改进很有帮助。

Conservative Predictions on Noisy Financial Data

  • paper_url: http://arxiv.org/abs/2310.11815
  • repo_url: None
  • paper_authors: Omkar Nabar, Gautam Shroff
  • for: 在噪声较大的金融市场数据上预测固定期限收益(fixed-term returns),同时降低交易风险
  • methods: 使用传统 MLP 和可微分决策树,在合成数据和真实金融市场数据上预测固定期限收益;训练时按级联方式学习一系列模型,每个模型只在前序模型不确定的数据上训练,测试时仅在模型有把握的数据子集上给出预测
  • results: 该方法可以在更低的风险水平下获得更高的总收益;论文还提出了衡量每笔交易平均收益以及经下行风险调整后收益的实用指标,两者均有显著提升。
    Abstract Price movements in financial markets are well known to be very noisy. As a result, even if there are, on occasion, exploitable patterns that could be picked up by machine-learning algorithms, these are obscured by feature and label noise rendering the predictions less useful, and risky in practice. Traditional rule-learning techniques developed for noisy data, such as CN2, would seek only high precision rules and refrain from making predictions where their antecedents did not apply. We apply a similar approach, where a model abstains from making a prediction on data points that it is uncertain on. During training, a cascade of such models are learned in sequence, similar to rule lists, with each model being trained only on data on which the previous model(s) were uncertain. Similar pruning of data takes place at test-time, with (higher accuracy) predictions being made albeit only on a fraction (support) of test-time data. In a financial prediction setting, such an approach allows decisions to be taken only when the ensemble model is confident, thereby reducing risk. We present results using traditional MLPs as well as differentiable decision trees, on synthetic data as well as real financial market data, to predict fixed-term returns using commonly used features. We submit that our approach is likely to result in better overall returns at a lower level of risk. In this context we introduce an utility metric to measure the average gain per trade, as well as the return adjusted for downside risk, both of which are improved significantly by our approach.
    摘要 金融市场的价格变化噪声很大。因此,即使偶尔存在可被机器学习算法捕捉的可利用模式,这些模式也会被特征噪声和标签噪声所掩盖,使预测的实用性下降、实际操作风险较高。针对噪声数据发展起来的传统规则学习技术(如 CN2)只寻找高精度规则,并在规则前件不适用时放弃预测。我们采用类似的思路:模型在其不确定的数据点上选择弃权。训练时,按顺序学习一个级联的模型序列(类似规则列表),每个模型只在之前模型不确定的数据上训练;测试时也进行类似的数据筛选,只在一部分测试数据(支持集)上给出精度更高的预测。在金融预测场景下,这种方法只在集成模型有把握时才做出决策,从而降低风险。我们使用传统 MLP 以及可微分决策树,在合成数据和真实金融市场数据上,基于常用特征预测固定期限收益。我们认为该方法有望在更低的风险水平下带来更好的总收益。为此,我们引入了衡量每笔交易平均收益的实用指标,以及经下行风险调整后的收益指标,两者在我们的方法下均有显著提升。
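
A minimal sketch of the abstain-and-cascade idea described above, using scikit-learn MLPs; the confidence threshold, number of stages, and the use of max predicted probability as the uncertainty measure are illustrative assumptions rather than the paper's exact design.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

class AbstainingCascade:
    """Each stage is trained only on the points earlier stages were unsure
    about; at test time a point gets a prediction only if some stage is
    confident, otherwise the ensemble abstains (i.e. no trade is taken)."""

    def __init__(self, n_stages=3, threshold=0.7):
        self.threshold = threshold
        self.n_stages = n_stages
        self.stages = []

    def fit(self, X, y):
        undecided = np.ones(len(X), dtype=bool)
        for _ in range(self.n_stages):
            if not undecided.any():
                break
            clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500)
            clf.fit(X[undecided], y[undecided])
            self.stages.append(clf)
            conf = clf.predict_proba(X[undecided]).max(axis=1) >= self.threshold
            idx = np.where(undecided)[0]
            undecided[idx[conf]] = False      # confident points are "used up"
        return self

    def predict(self, X):
        pred = np.full(len(X), -1)            # -1 == abstain
        undecided = np.ones(len(X), dtype=bool)
        for clf in self.stages:
            if not undecided.any():
                break
            proba = clf.predict_proba(X[undecided])
            conf = proba.max(axis=1) >= self.threshold
            idx = np.where(undecided)[0]
            pred[idx[conf]] = clf.classes_[proba.argmax(axis=1)[conf]]
            undecided[idx[conf]] = False
        return pred
```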

Learning and Discovering Quantum Properties with Multi-Task Neural Networks

  • paper_url: http://arxiv.org/abs/2310.11807
  • repo_url: None
  • paper_authors: Ya-Dong Wu, Yan Zhu, Yuexuan Wang, Giulio Chiribella
  • for: 用深度神经网络预测量子态的性质从限制的测量数据中。
  • methods: 开发了一种多任务网络模型,可同时预测多种量子性质,既包括量子可观测量的期望值,也包括量子态的一般非线性函数,如纠缠熵和多体拓扑不变量。
  • results: 发现在给定性质集合上训练的模型还能发现该集合之外的新性质。多任务训练还使模型能够从局部测量推断多体量子系统的全局性质、对对称保护拓扑物相进行分类,并发现不同相之间未知的相边界。
    Abstract Deep neural networks are a powerful tool for predicting properties of quantum states from limited measurement data. Here we develop a network model that can simultaneously predict multiple quantum properties, including not only expectation values of quantum observables, but also general nonlinear functions of the quantum state, like entanglement entropies and many-body topological invariants. Remarkably, we find that a model trained on a given set of properties can also discover new properties outside that set. Multi-purpose training also enables the model to infer global properties of many-body quantum systems from local measurements, to classify symmetry protected topological phases of matter, and to discover unknown boundaries between different phases.
    摘要 深度神经网络是一种强大的工具,可以从有限的测量数据中预测量子态的性质。在这里,我们开发了一种网络模型,可以同时预测多种量子性质,不仅包括量子可观测量的期望值,还包括量子态的一般非线性函数,如纠缠熵和多体拓扑不变量。值得注意的是,我们发现基于给定性质集合训练的模型还能发现该集合之外的新性质。多任务训练还使模型能够从局部测量推断多体量子系统的全局性质、对对称保护拓扑物相进行分类,并发现不同相之间未知的相边界。

Auction-Based Scheduling

  • paper_url: http://arxiv.org/abs/2310.11798
  • repo_url: https://github.com/Abarjag/Abarja
  • paper_authors: Guy Avni, Kaushik Mallik, Suman Sadhukhan
  • for: 该论文旨在解决包含多个、部分相互矛盾目标的顺序决策任务。
  • methods: 该方法使用拍卖机制来组合多目标决策问题中的策略:每个目标由一个独立的策略负责,策略可以独立创建、修改和替换,并在每一步从各自的有限预算中出价竞争调度权。
  • results: 有限预算保证了长期的调度公平性。在有限图上带有两个时序目标的路径规划问题中,该方法给出了去中心化算法,用于合成一对策略、它们的初始预算以及出价策略;对于可达性目标,当所有顶点的出度至多为 2 时,去中心化的 assume-admissible 合成总是可行的。
    Abstract Many sequential decision-making tasks require satisfaction of multiple, partially contradictory objectives. Existing approaches are monolithic, namely all objectives are fulfilled using a single policy, which is a function that selects a sequence of actions. We present auction-based scheduling, a modular framework for multi-objective decision-making problems. Each objective is fulfilled using a separate policy, and the policies can be independently created, modified, and replaced. Understandably, different policies with conflicting goals may choose conflicting actions at a given time. In order to resolve conflicts, and compose policies, we employ a novel auction-based mechanism. We allocate a bounded budget to each policy, and at each step, the policies simultaneously bid from their available budgets for the privilege of being scheduled and choosing an action. Policies express their scheduling urgency using their bids and the bounded budgets ensure long-run scheduling fairness. We lay the foundations of auction-based scheduling using path planning problems on finite graphs with two temporal objectives. We present decentralized algorithms to synthesize a pair of policies, their initially allocated budgets, and bidding strategies. We consider three categories of decentralized synthesis problems, parameterized by the assumptions that the policies make on each other: (a) strong synthesis, with no assumptions and strongest guarantees, (b) assume-admissible synthesis, with weakest rationality assumptions, and (c) assume-guarantee synthesis, with explicit contract-based assumptions. For reachability objectives, we show that, surprisingly, decentralized assume-admissible synthesis is always possible when the out-degrees of all vertices are at most two.
    摘要 许多顺序决策任务需满足多个、部分矛盾的目标。现有的方法都是单一的,即所有目标都是通过单一策略(一个函数选择一系列动作)来满足。我们介绍了拍卖机制来解决这类决策问题。在这种机制下,每个目标都是通过一个分离的策略来满足,这些策略可以独立创建、修改和替换。当不同的策略有冲突目标时,我们使用一种新的拍卖机制来解决冲突。我们为每个策略分配一个固定预算,并在每步中让各策略同时从其可用预算中竞拍为执行动作的权利。策略通过竞拍价格表达其排期优先级,并且固定预算确保长期排期公平。我们在路径规划问题上建立了拍卖机制的基础,并提出了三种分类的分解问题:强化合理化(strong synthesis)、弱合理化(assume-admissible synthesis)和合理合同(assume-guarantee synthesis)。对于可达性目标,我们发现了一个意外的结论:在所有顶点出度都不大于2时,分解问题总是可能的。
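
The core mechanism is easiest to see in code. The sketch below shows one bidding round for a list of policies; the particular payment rule (the winner pays its bid, which is redistributed to the others) is only one plausible instantiation and is our assumption, as are the `.bid`/`.act` interfaces.

```python
def auction_round(policies, budgets, state):
    """One scheduling step: every policy bids from its remaining budget and
    the highest bidder gets to choose the action. Redistributing the winner's
    payment keeps the total budget bounded, so no policy is starved forever."""
    bids = [min(p.bid(state, b), b) for p, b in zip(policies, budgets)]
    winner = max(range(len(policies)), key=bids.__getitem__)
    payment = bids[winner]
    budgets[winner] -= payment
    for i in range(len(budgets)):
        if i != winner:
            budgets[i] += payment / (len(budgets) - 1)
    return policies[winner].act(state), winner, budgets
```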

Solving the multiplication problem of a large language model system using a graph-based method

  • paper_url: http://arxiv.org/abs/2310.13016
  • repo_url: None
  • paper_authors: Turker Tuncer, Sengul Dogan, Mehmet Baygin, Prabal Datta Barua, Abdul Hafeez-Baig, Ru-San Tan, Subrata Chakraborty, U. Rajendra Acharya
  • for: 解决 chatGPT 模型中的乘法问题,提高其数学运算精度。
  • methods: 基于图结构的乘法算法,通过引入 10^k 操作符来模拟人类式的数值运算。
  • results: 在 1,000,000 次大数乘法任务上达到了 100% 的准确率,成功解决了 GPT 类模型的乘法难题。
    Abstract The generative pre-trained transformer (GPT)-based chatbot software ChatGPT possesses excellent natural language processing capabilities but is inadequate for solving arithmetic problems, especially multiplication. Its GPT structure uses a computational graph for multiplication, which has limited accuracy beyond simple multiplication operations. We developed a graph-based multiplication algorithm that emulated human-like numerical operations by incorporating a 10k operator, where k represents the maximum power to base 10 of the larger of two input numbers. Our proposed algorithm attained 100% accuracy for 1,000,000 large number multiplication tasks, effectively solving the multiplication challenge of GPT-based and other large language models. Our work highlights the importance of blending simple human insights into the design of artificial intelligence algorithms. Keywords: Graph-based multiplication; ChatGPT; Multiplication problem
    摘要 基于生成式预训练变换器(GPT)的聊天机器人软件 ChatGPT 具有出色的自然语言处理能力,但在算术问题(尤其是乘法)上表现不足。其 GPT 结构在乘法上使用的计算图,在简单乘法之外的运算中精度有限。我们开发了一种基于图的乘法算法,通过引入 10^k 操作符(其中 k 表示两个输入数中较大者以 10 为底的最大幂次)来模拟人类式的数值运算。所提出的算法在 1,000,000 次大数乘法任务中达到了 100% 的准确率,有效解决了基于 GPT 的大语言模型以及其他大语言模型的乘法难题。我们的工作强调了将简单的人类洞察融入人工智能算法设计的重要性。关键词:基于图的乘法;ChatGPT;乘法问题
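
Although the paper's exact graph construction is not reproduced here, the underlying idea of decomposing operands into digit-times-power-of-ten terms and summing partial products can be sketched as follows; this schoolbook reconstruction is ours, not the authors' implementation.

```python
def decompose(n):
    """Split a non-negative integer into (digit, power-of-ten) terms,
    e.g. 372 -> [(3, 100), (7, 10), (2, 1)]."""
    digits = [int(d) for d in str(n)]
    return [(d, 10 ** (len(digits) - 1 - i)) for i, d in enumerate(digits)]

def long_multiply(a, b):
    """Multiply by expanding both operands into digit * 10^k terms and summing
    the partial products -- the 'human-like' procedure that a graph of simple
    operators (single-digit products and powers of ten) can emulate."""
    total = 0
    for da, pa in decompose(a):
        for db, pb in decompose(b):
            total += (da * db) * (pa * pb)   # single-digit product, shifted
    return total

assert long_multiply(123456789, 987654321) == 123456789 * 987654321
```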

Telecom AI Native Systems in the Age of Generative AI – An Engineering Perspective

  • paper_url: http://arxiv.org/abs/2310.11770
  • repo_url: None
  • paper_authors: Ricardo Britto, Timothy Murphy, Massimo Iovene, Leif Jonsson, Melike Erol-Kantarci, Benedek Kovács
  • for: The paper explores the integration of foundational models (FMs) in the telecommunications industry, with a focus on the concept of “AI native telco” and the engineering considerations and challenges associated with implementing FMs in the software life cycle.
  • methods: The paper discusses the use of FMs in natural language processing tasks and content generation, and highlights the need for AI native-first approaches to fully leverage the potential of FMs in the telecom industry.
  • results: The paper emphasizes the enormous potential of FMs in revolutionizing how we interact with software products and services in the telecom industry, but also acknowledges the need for careful consideration of ethical, regulatory, and operational challenges to ensure the successful integration of FMs in mission-critical telecom contexts.
    Abstract The rapid advancements in Artificial Intelligence (AI), particularly in generative AI and foundational models (FMs), have ushered in transformative changes across various industries. Large language models (LLMs), a type of FM, have demonstrated their prowess in natural language processing tasks and content generation, revolutionizing how we interact with software products and services. This article explores the integration of FMs in the telecommunications industry, shedding light on the concept of AI native telco, where AI is seamlessly woven into the fabric of telecom products. It delves into the engineering considerations and unique challenges associated with implementing FMs into the software life cycle, emphasizing the need for AI native-first approaches. Despite the enormous potential of FMs, ethical, regulatory, and operational challenges require careful consideration, especially in mission-critical telecom contexts. As the telecom industry seeks to harness the power of AI, a comprehensive understanding of these challenges is vital to thrive in a fiercely competitive market.
    摘要 人工智能(AI)的快速进步,特别是生成式AI和基础模型(FM),已经在各行各业带来了变革。大语言模型(LLM)作为一类基础模型,在自然语言处理任务和内容生成方面表现出色,改变了我们与软件产品和服务交互的方式。本文探讨了基础模型在电信行业中的整合,阐述了"AI 原生电信(AI native telco)"这一概念,即把 AI 无缝融入电信产品之中。文章还讨论了在软件生命周期中落地基础模型的工程考量与独特挑战,强调了 AI 原生优先(AI native-first)的方法。尽管基础模型潜力巨大,但伦理、监管和运营方面的挑战仍需审慎对待,尤其是在关键任务的电信场景中。电信行业若要利用 AI 的力量,就必须全面理解这些挑战,才能在竞争激烈的市场中立足。

Stranger Danger! Cross-Community Interactions with Fringe Users Increase the Growth of Fringe Communities on Reddit

  • paper_url: http://arxiv.org/abs/2310.12186
  • repo_url: None
  • paper_authors: Giuseppe Russo, Manoel Horta Ribeiro, Robert West
  • for: 该研究旨在解释宣扬阴谋论和极端思想的边缘社区在主流平台上快速增长的机制。
  • methods: 该研究使用基于文本的因果推断技术,分析边缘互动(fringe-interactions)对 Reddit 上边缘社区增长的影响。
  • results: 研究发现,边缘社区成员与非成员之间的互动能够吸引新成员加入这些社区:接受此类互动的用户比相似的匹配用户加入边缘社区的可能性高出最多 4.2 个百分点。这一效应受社区特点(如左倾或右倾)以及互动语言的影响;使用恶毒语言的互动比非恶毒互动吸引新成员的可能性高出 5 个百分点。对非边缘社区(如 r/climatechange、r/NBA、r/leagueoflegends)重复该分析未发现此效应。总体而言,这些发现表明,限制边缘互动可能会减缓主流平台上边缘社区的增长。
    Abstract Fringe communities promoting conspiracy theories and extremist ideologies have thrived on mainstream platforms, raising questions about the mechanisms driving their growth. Here, we hypothesize and study a possible mechanism: new members may be recruited through fringe-interactions: the exchange of comments between members and non-members of fringe communities. We apply text-based causal inference techniques to study the impact of fringe-interactions on the growth of three prominent fringe communities on Reddit: r/Incel, r/GenderCritical, and r/The_Donald. Our results indicate that fringe-interactions attract new members to fringe communities. Users who receive these interactions are up to 4.2 percentage points (pp) more likely to join fringe communities than similar, matched users who do not. This effect is influenced by 1) the characteristics of communities where the interaction happens (e.g., left vs. right-leaning communities) and 2) the language used in the interactions. Interactions using toxic language have a 5pp higher chance of attracting newcomers to fringe communities than non-toxic interactions. We find no effect when repeating this analysis by replacing fringe (r/Incel, r/GenderCritical, and r/The_Donald) with non-fringe communities (r/climatechange, r/NBA, r/leagueoflegends), suggesting this growth mechanism is specific to fringe communities. Overall, our findings suggest that curtailing fringe-interactions may reduce the growth of fringe communities on mainstream platforms.
    摘要 宣扬阴谋论和极端思想的边缘社区在主流平台上蓬勃发展,这引发了关于其增长机制的疑问。我们提出并研究了一种可能的机制:新成员可能通过边缘互动(即边缘社区成员与非成员之间的评论交流)被招募进来。我们使用基于文本的因果推断技术,研究边缘互动对 Reddit 上三个著名边缘社区(r/Incel、r/GenderCritical 和 r/The_Donald)增长的影响。结果表明,边缘互动会吸引新成员加入边缘社区:接受这些互动的用户比相似的匹配用户加入边缘社区的可能性高出最多 4.2 个百分点。这一效应受到两方面影响:1)互动发生所在社区的特点(如左倾或右倾社区);2)互动所使用的语言——使用恶毒语言的互动比非恶毒互动吸引新成员的可能性高出 5 个百分点。当我们把边缘社区替换为非边缘社区(r/climatechange、r/NBA、r/leagueoflegends)重复该分析时,没有发现这种效应,说明这一增长机制是边缘社区特有的。总体而言,我们的发现表明,限制边缘互动可能会减缓主流平台上边缘社区的增长。

Estimating Material Properties of Interacting Objects Using Sum-GP-UCB

  • paper_url: http://arxiv.org/abs/2310.11749
  • repo_url: None
  • paper_authors: M. Yunus Seker, Oliver Kroemer
  • for: 从观察数据中估计物体的材质与动力学属性
  • methods: 采用贝叶斯优化方法确定物体参数
  • results: 能够有效地进行逐步学习,不需要重新评估已有观察数据的奖励值
    Abstract Robots need to estimate the material and dynamic properties of objects from observations in order to simulate them accurately. We present a Bayesian optimization approach to identifying the material property parameters of objects based on a set of observations. Our focus is on estimating these properties based on observations of scenes with different sets of interacting objects. We propose an approach that exploits the structure of the reward function by modeling the reward for each observation separately and using only the parameters of the objects in that scene as inputs. The resulting lower-dimensional models generalize better over the parameter space, which in turn results in a faster optimization. To speed up the optimization process further, and reduce the number of simulation runs needed to find good parameter values, we also propose partial evaluations of the reward function, wherein the selected parameters are only evaluated on a subset of real world evaluations. The approach was successfully evaluated on a set of scenes with a wide range of object interactions, and we showed that our method can effectively perform incremental learning without resetting the rewards of the gathered observations.
    摘要 Robots需要估算物体的物理和动态性质从观察数据中,以便准确模拟。我们提出了一种 bayesian优化方法,用于根据观察数据中的物体参数进行物体物理性质的估算。我们的注重点是基于不同交互对象的场景中的观察数据进行估算。我们提议利用奖励函数的结构,将奖励函数分割成每个观察中的奖励模型,并只使用场景中的对象参数作为输入。这将导致更好的维度减少,从而更快地优化。为了进一步加速优化过程,并减少需要进行实际评估的运行次数,我们还提议使用部分评估奖励函数,选择的参数只在一部分实际评估中进行评估。我们成功地应用了这种方法在一组具有多种对象交互的场景中,并证明了我们的方法可以进行逐步学习而不需要重置观察得到的奖励。
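
A toy sketch of the "one reward model per observed scene" idea: each scene gets its own Gaussian process over only the parameters of the objects appearing in that scene, and candidate parameter vectors are scored by summing per-scene UCB values. The scene slices, kernel, beta coefficient, and placeholder rewards below are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def sum_gp_ucb(scene_gps, scene_param_cols, candidates, beta=2.0):
    """Score candidates by summing per-scene UCB values; each scene's GP only
    sees the parameters of the objects in that scene, keeping every surrogate
    model low-dimensional and easier to generalize."""
    score = np.zeros(len(candidates))
    for gp, cols in zip(scene_gps, scene_param_cols):
        mu, sigma = gp.predict(candidates[:, cols], return_std=True)
        score += mu + beta * sigma
    return candidates[np.argmax(score)]

rng = np.random.default_rng(0)
params = rng.uniform(size=(20, 6))                  # 6 material parameters tried so far
scene_param_cols = [[0, 1], [2, 3, 4], [1, 5]]      # which objects appear in each scene
scene_gps = [GaussianProcessRegressor().fit(params[:, cols], rng.normal(size=20))
             for cols in scene_param_cols]          # placeholder per-scene rewards
next_params = sum_gp_ucb(scene_gps, scene_param_cols, rng.uniform(size=(200, 6)))
```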

Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning

  • paper_url: http://arxiv.org/abs/2310.11731
  • repo_url: None
  • paper_authors: Jianlan Luo, Perry Dong, Jeffrey Wu, Aviral Kumar, Xinyang Geng, Sergey Levine
  • for: 这篇论文旨在提出一种适应量化动作的策略,以提高在离线学习中的RL性能。
  • methods: 该论文提出了一种基于 VQ-VAE 的状态条件动作量化方法,以避免对动作空间进行朴素离散化所带来的指数爆炸。
  • results: 将所提出的自适应量化方案与多种离线 RL 方法(如 IQL、CQL 和 BRAC)结合,在基准任务上取得了性能提升,并在 Robomimic 环境中的一系列长时程复杂机器人操作任务上进行了验证;结果显示,离散化后的离线 RL 算法相对其连续动作版本可将策略性能提高 2-3 倍。
    Abstract The offline reinforcement learning (RL) paradigm provides a general recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data. While policy constraints, conservatism, and other methods for mitigating distributional shifts have made offline reinforcement learning more effective, the continuous action setting often necessitates various approximations for applying these techniques. Many of these challenges are greatly alleviated in discrete action settings, where offline RL constraints and regularizers can often be computed more precisely or even exactly. In this paper, we propose an adaptive scheme for action quantization. We use a VQ-VAE to learn state-conditioned action quantization, avoiding the exponential blowup that comes with na\"ive discretization of the action space. We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme. We further validate our approach on a set of challenging long-horizon complex robotic manipulation tasks in the Robomimic environment, where our discretized offline RL algorithms are able to improve upon their continuous counterparts by 2-3x. Our project page is at https://saqrl.github.io/
    摘要 离线强化学习(offline RL)范式提供了一种通用方法,可以将静态行为数据集转化为性能优于数据采集策略的策略。虽然策略约束、保守性以及其他缓解分布偏移的方法使离线强化学习更为有效,但连续动作设置通常需要各种近似才能应用这些技术。在离散动作设置下,许多此类挑战可以大大缓解,因为离线 RL 的约束和正则项往往可以更精确甚至精确地计算。在本文中,我们提出了一种自适应的动作量化方案。我们使用 VQ-VAE 学习以状态为条件的动作量化,从而避免对动作空间进行朴素离散化所带来的指数爆炸。我们证明,多个最新的离线 RL 方法(如 IQL、CQL 和 BRAC)与我们提出的离散化方案结合后,在基准任务上性能均有提升。我们还在 Robomimic 环境中一组具有长时程的复杂机器人操作任务上验证了该方法,离散化后的离线 RL 算法相对其连续动作版本可将性能提高 2-3 倍。项目页面:https://saqrl.github.io/
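
To make the state-conditioned action quantization concrete, here is a minimal VQ-style module in PyTorch; the layer sizes, codebook size, and straight-through estimator follow generic VQ-VAE practice and are not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class StateConditionedActionQuantizer(nn.Module):
    """Encode (state, action) into a latent, snap it to the nearest codebook
    entry (the discrete action id), and decode a continuous action back,
    conditioned on the state."""

    def __init__(self, state_dim, action_dim, n_codes=64, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, latent_dim))
        self.codebook = nn.Embedding(n_codes, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(state_dim + latent_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim))

    def forward(self, state, action):
        z = self.encoder(torch.cat([state, action], dim=-1))
        dists = torch.cdist(z, self.codebook.weight)        # (B, n_codes)
        code_ids = dists.argmin(dim=-1)                      # discrete action ids
        z_q = self.codebook(code_ids)
        z_st = z + (z_q - z).detach()                        # straight-through gradient
        recon = self.decoder(torch.cat([state, z_st], dim=-1))
        return recon, code_ids, z, z_q
```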

Federated Heterogeneous Graph Neural Network for Privacy-preserving Recommendation

  • paper_url: http://arxiv.org/abs/2310.11730
  • repo_url: None
  • paper_authors: Bo Yan, Yang Cao, Haoyu Wang, Wenchuan Yang, Junping Du, Chuan Shi
  • for: This paper proposes a federated heterogeneous graph neural network (FedHGNN) based framework for recommendation, which can collaboratively train a recommendation model on distributed Heterogeneous Information Networks (HINs) without leaking user privacy.
  • methods: The paper formalizes a privacy definition based on differential privacy for HIN-based federated recommendation, and elaborately designs a semantic-preserving user interactions publishing method to recover the broken meta-path based semantics caused by distributed data storage.
  • results: The proposed FedHGNN model outperforms existing methods by a large margin (up to 34% in HR@10 and 42% in NDCG@10) under an acceptable privacy budget, as demonstrated through extensive experiments on three datasets.
    Abstract Heterogeneous information network (HIN), which contains rich semantics depicted by meta-paths, has become a powerful tool to alleviate data sparsity in recommender systems. Existing HIN-based recommendations hold the data centralized storage assumption and conduct centralized model training. However, the real-world data is often stored in a distributed manner for privacy concerns, resulting in the failure of centralized HIN-based recommendations. In this paper, we suggest the HIN is partitioned into private HINs stored in the client side and shared HINs in the server. Following this setting, we propose a federated heterogeneous graph neural network (FedHGNN) based framework, which can collaboratively train a recommendation model on distributed HINs without leaking user privacy. Specifically, we first formalize the privacy definition in the light of differential privacy for HIN-based federated recommendation, which aims to protect user-item interactions of private HIN as well as user's high-order patterns from shared HINs. To recover the broken meta-path based semantics caused by distributed data storage and satisfy the proposed privacy, we elaborately design a semantic-preserving user interactions publishing method, which locally perturbs user's high-order patterns as well as related user-item interactions for publishing. After that, we propose a HGNN model for recommendation, which conducts node- and semantic-level aggregations to capture recovered semantics. Extensive experiments on three datasets demonstrate our model outperforms existing methods by a large margin (up to 34% in HR@10 and 42% in NDCG@10) under an acceptable privacy budget.
    摘要 异构信息网络(HIN)通过元路径刻画丰富的语义,已成为缓解推荐系统数据稀疏问题的有力工具。现有的基于 HIN 的推荐方法假设数据集中存储,并进行集中式模型训练。然而,出于隐私考虑,现实中的数据往往以分布式方式存储,导致集中式的 HIN 推荐方法难以适用。本文建议将 HIN 划分为存储在客户端的私有 HIN 和存储在服务器端的共享 HIN。在此设定下,我们提出了一种联邦异构图神经网络(FedHGNN)框架,可以在分布式 HIN 上协同训练推荐模型而不泄露用户隐私。具体而言,我们首先基于差分隐私形式化了面向 HIN 的联邦推荐隐私定义,旨在保护私有 HIN 中的用户-物品交互以及来自共享 HIN 的用户高阶模式。为恢复因分布式存储而破坏的基于元路径的语义并满足所提隐私定义,我们精心设计了一种保持语义的用户交互发布方法,在本地对用户的高阶模式及相关用户-物品交互进行扰动后再发布。随后,我们提出了一种用于推荐的 HGNN 模型,通过节点级和语义级聚合来捕捉恢复后的语义。在三个数据集上的大量实验表明,在可接受的隐私预算下,我们的模型大幅优于现有方法(HR@10 最高提升 34%,NDCG@10 最高提升 42%)。
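
The local-perturbation step can be illustrated with plain randomized response over a user's binary interaction vector before it leaves the client. Note this is only the textbook local-DP mechanism, shown here as a simplified stand-in; the paper's semantic-preserving publishing method additionally perturbs high-order (meta-path) patterns.

```python
import numpy as np

def randomized_response(interaction_vector, epsilon):
    """Flip each bit of a user's interaction vector with probability
    1 / (exp(eps) + 1), which satisfies per-bit epsilon-local differential
    privacy before the vector is published to the server."""
    p_keep = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    flip = np.random.rand(len(interaction_vector)) > p_keep
    return np.where(flip, 1 - interaction_vector, interaction_vector)

user_items = np.array([1, 0, 0, 1, 0, 1])            # 1 = interacted with item
print(randomized_response(user_items, epsilon=2.0))
```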

Uncertainty in Automated Ontology Matching: Lessons Learned from an Empirical Experimentation

  • paper_url: http://arxiv.org/abs/2310.11723
  • repo_url: None
  • paper_authors: Inès Osman, Salvatore F. Pileggi, Sadok Ben Yahia
  • for: 本研究旨在从应用角度考察自动本体匹配,以提升数据集之间的链接与语义集成能力。
  • methods: 本研究采用自底向上的知识构建方法,并借助现有的自动本体匹配工具在真实数据上进行了实验。
  • results: 实验结果表明,自动匹配过程中存在较大的不确定性,而设计得当的半监督方法则显示出更好的可靠性。
    Abstract Data integration is considered a classic research field and a pressing need within the information science community. Ontologies play a critical role in such a process by providing well-consolidated support to link and semantically integrate datasets via interoperability. This paper approaches data integration from an application perspective, looking at techniques based on ontology matching. An ontology-based process may only be considered adequate by assuming manual matching of different sources of information. However, since the approach becomes unrealistic once the system scales up, automation of the matching process becomes a compelling need. Therefore, we have conducted experiments on actual data with the support of existing tools for automatic ontology matching from the scientific community. Even considering a relatively simple case study (i.e., the spatio-temporal alignment of global indicators), outcomes clearly show significant uncertainty resulting from errors and inaccuracies along the automated matching process. More concretely, this paper aims to test on real-world data a bottom-up knowledge-building approach, discuss the lessons learned from the experimental results of the case study, and draw conclusions about uncertainty and uncertainty management in an automated ontology matching process. While the most common evaluation metrics clearly demonstrate the unreliability of fully automated matching solutions, properly designed semi-supervised approaches seem to be mature for a more generalized application.
    摘要 数据集成被视为信息科学领域的经典研究方向和紧迫需求。本体(ontology)在这一过程中发挥着关键作用,为通过互操作性链接并在语义上集成数据集提供了成熟的支撑。本文从应用角度出发,研究基于本体匹配的数据集成技术。只有在假设对不同信息源进行人工匹配的前提下,基于本体的流程才可能被认为是充分的;然而,一旦系统规模扩大,这种做法便不再现实,匹配过程的自动化因此成为迫切需求。为此,我们借助学术界现有的自动本体匹配工具,在真实数据上进行了实验。即便是在相对简单的案例(全球指标的时空对齐)中,结果也清楚地显示,自动匹配过程中的错误与不精确会带来显著的不确定性。更具体地说,本文旨在在真实数据上测试一种自底向上的知识构建方法,讨论案例研究实验结果带来的经验教训,并就自动本体匹配过程中的不确定性及其管理给出结论。尽管最常用的评价指标清楚地表明全自动匹配方案并不可靠,但经过恰当设计的半监督方法似乎已经足够成熟,可以得到更广泛的应用。

  • paper_url: http://arxiv.org/abs/2310.11722
  • repo_url: None
  • paper_authors: Yaxin Fan, Feng Jiang, Peifeng Li, Haizhou Li
  • for: This paper aims to evaluate the ability of large language models (LLMs) to provide accurate and factual suggestions for user self-diagnosis queries.
  • methods: The authors constructed a benchmark of common atomic knowledge in user self-diagnosis queries and evaluated both generic and specialized LLMs on this benchmark. They also performed error analysis and explored different types of data for fine-tuning specialized LLMs.
  • results: The results showed that generic LLMs perform better than specialized LLMs in terms of atomic knowledge and instruction-following ability, and that distilled data can benefit LLMs most. Additionally, the authors found that both generic and specialized LLMs are sycophantic, meaning they tend to cater to users’ claims when it comes to unknown knowledge.
    Abstract Large Language Models (LLMs) have the potential to revolutionize the way users self-diagnose through search engines by offering direct and efficient suggestions. Recent studies primarily focused on the quality of LLMs evaluated by GPT-4 or their ability to pass medical exams, no studies have quantified the extent of health-related atomic knowledge stored in LLMs' memory, which is the basis of LLMs to provide more factual suggestions. In this paper, we first constructed a benchmark, including the most common types of atomic knowledge in user self-diagnosis queries, with 17 atomic types and a total of 14, 048 pieces of atomic knowledge. Then, we evaluated both generic and specialized LLMs on the benchmark. The experimental results showcased that generic LLMs perform better than specialized LLMs in terms of atomic knowledge and instruction-following ability. Error analysis revealed that both generic and specialized LLMs are sycophantic, e.g., always catering to users' claims when it comes to unknown knowledge. Besides, generic LLMs showed stronger safety, which can be learned by specialized LLMs through distilled data. We further explored different types of data commonly adopted for fine-tuning specialized LLMs, i.e., real-world, semi-distilled, and distilled data, and found that distilled data can benefit LLMs most.
    摘要 大语言模型(LLM)有望革新用户通过搜索引擎进行自我诊断的方式,提供直接而高效的建议。近期研究主要关注由 GPT-4 评估的 LLM 质量或其能否通过医学考试,但尚无研究量化 LLM 记忆中存储了多少健康相关的原子知识,而这正是 LLM 提供更符合事实的建议的基础。本文首先构建了一个基准,涵盖用户自我诊断查询中最常见的原子知识类型,共 17 种类型、14,048 条原子知识。随后,我们在该基准上评估了通用和专用的 LLM。实验结果表明,通用 LLM 在原子知识和指令遵循能力上均优于专用 LLM。错误分析显示,通用和专用 LLM 都存在阿谀倾向,即在涉及未知知识时总是迎合用户的说法。此外,通用 LLM 表现出更强的安全性,专用 LLM 可以通过蒸馏数据学习这一点。我们进一步探讨了微调专用 LLM 常用的几类数据,即真实世界数据、半蒸馏数据和蒸馏数据,发现蒸馏数据对 LLM 的帮助最大。

Enhancing Low-resource Fine-grained Named Entity Recognition by Leveraging Coarse-grained Datasets

  • paper_url: http://arxiv.org/abs/2310.11715
  • repo_url: https://github.com/sue991/cofiner
  • paper_authors: Su Ah Lee, Seokjin Oh, Woohwan Jung
  • for: 本文提出了一种解决Named Entity Recognition(NER)缺乏精细标注数据的问题,特别是在细化NER场景下。
  • methods: 本文使用了现有的粗化标注数据,并提出了一种叫做Fine-to-Coarse(F2C)映射矩阵来利用粗化和细化实体之间的层次结构。此外,本文还提出了一种矛盾检测方法,以避免粗化实体与细化实体之间的矛盾。
  • results: 实验结果表明,我们的方法在只有少量细化标注时比$K$-shot学习和监督学习方法表现更好。
    Abstract Named Entity Recognition (NER) frequently suffers from the problem of insufficient labeled data, particularly in fine-grained NER scenarios. Although $K$-shot learning techniques can be applied, their performance tends to saturate when the number of annotations exceeds several tens of labels. To overcome this problem, we utilize existing coarse-grained datasets that offer a large number of annotations. A straightforward approach to address this problem is pre-finetuning, which employs coarse-grained data for representation learning. However, it cannot directly utilize the relationships between fine-grained and coarse-grained entities, although a fine-grained entity type is likely to be a subcategory of a coarse-grained entity type. We propose a fine-grained NER model with a Fine-to-Coarse(F2C) mapping matrix to leverage the hierarchical structure explicitly. In addition, we present an inconsistency filtering method to eliminate coarse-grained entities that are inconsistent with fine-grained entity types to avoid performance degradation. Our experimental results show that our method outperforms both $K$-shot learning and supervised learning methods when dealing with a small number of fine-grained annotations.
    摘要 命名实体识别(NER)经常面临标注数据不足的问题,在细粒度 NER 场景下尤为突出。虽然可以使用 $K$-shot 学习技术,但当标注数量超过几十条时其性能往往趋于饱和。为克服这一问题,我们利用提供大量标注的现有粗粒度数据集。一种直接的做法是预微调(pre-finetuning),即利用粗粒度数据进行表示学习;然而,尽管细粒度实体类型很可能是某个粗粒度实体类型的子类别,这种做法无法直接利用细粒度与粗粒度实体之间的关系。我们提出了一种带有细到粗(F2C)映射矩阵的细粒度 NER 模型,以显式利用这种层次结构。此外,我们提出了一种不一致过滤方法,剔除与细粒度实体类型不一致的粗粒度实体,以避免性能下降。实验结果表明,在细粒度标注数量较少时,我们的方法优于 $K$-shot 学习和监督学习方法。
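
A small sketch of how a fine-to-coarse (F2C) mapping matrix can let a fine-grained tagger learn from coarse-grained labels, and how inconsistent coarse entities can be filtered. The label sets below are hypothetical, and the exact losses used in the paper may differ.

```python
import torch
import torch.nn.functional as F

# Hypothetical label sets for illustration (not the paper's ontology).
fine_labels = ["athlete", "politician", "city", "country"]
coarse_labels = ["person", "location"]
# F2C[i, j] = 1 if fine label i is a subcategory of coarse label j.
F2C = torch.tensor([[1., 0.],
                    [1., 0.],
                    [0., 1.],
                    [0., 1.]])

def coarse_loss_from_fine_logits(fine_logits, coarse_targets):
    """Train on coarse-grained annotations by projecting fine-grained
    probabilities through the F2C mapping matrix."""
    fine_probs = fine_logits.softmax(dim=-1)          # (B, |fine|)
    coarse_probs = fine_probs @ F2C                   # (B, |coarse|)
    return F.nll_loss(torch.log(coarse_probs + 1e-9), coarse_targets)

def filter_inconsistent(coarse_pred, fine_pred):
    """Keep only coarse entities whose predicted fine type maps to them."""
    return F2C[fine_pred, coarse_pred] > 0            # boolean keep-mask

fine_logits = torch.randn(3, len(fine_labels))
loss = coarse_loss_from_fine_logits(fine_logits, torch.tensor([0, 1, 0]))
```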

Learning Co-Speech Gesture for Multimodal Aphasia Type Detection

  • paper_url: http://arxiv.org/abs/2310.11710
  • repo_url: https://github.com/dsail-skku/multimodal-aphasia-type-detection_emnlp_2023
  • paper_authors: Daeun Lee, Sejung Son, Hyolim Jeon, Seungbae Kim, Jinyoung Han
  • for: 该研究旨在为失语症患者提供有效的诊断支持,具体而言是利用多模态图神经网络识别不同的失语症类型。
  • methods: 该研究使用多模态图神经网络,通过学习语音与手势模式之间的关联,生成对手势信息敏感的文本表示,从而准确识别不同的失语症类型。
  • results: 与现有方法相比,该研究取得了最先进的结果(F1 84.2%),并且显示手势特征优于声学特征,凸显了手势表达在失语症类型检测中的重要性。
    Abstract Aphasia, a language disorder resulting from brain damage, requires accurate identification of specific aphasia types, such as Broca's and Wernicke's aphasia, for effective treatment. However, little attention has been paid to developing methods to detect different types of aphasia. Recognizing the importance of analyzing co-speech gestures for distinguish aphasia types, we propose a multimodal graph neural network for aphasia type detection using speech and corresponding gesture patterns. By learning the correlation between the speech and gesture modalities for each aphasia type, our model can generate textual representations sensitive to gesture information, leading to accurate aphasia type detection. Extensive experiments demonstrate the superiority of our approach over existing methods, achieving state-of-the-art results (F1 84.2\%). We also show that gesture features outperform acoustic features, highlighting the significance of gesture expression in detecting aphasia types. We provide the codes for reproducibility purposes.
    摘要 失语症是由脑损伤引起的语言障碍,有效治疗需要准确识别具体的失语症类型,例如布洛卡失语症和韦尼克失语症。然而,针对不同失语症类型检测方法的研究一直较少。鉴于分析伴随言语的手势对区分失语症类型的重要性,我们提出了一种多模态图神经网络,利用语音及其对应的手势模式进行失语症类型检测。通过学习每种失语症类型下语音模态与手势模态之间的关联,我们的模型能够生成对手势信息敏感的文本表示,从而实现准确的失语症类型检测。大量实验表明,我们的方法优于现有方法,达到了最先进的结果(F1 84.2%)。我们还发现手势特征优于声学特征,凸显了手势表达在失语症类型检测中的重要性。我们公开了代码以便复现。

Live Graph Lab: Towards Open, Dynamic and Real Transaction Graphs with NFT

  • paper_url: http://arxiv.org/abs/2310.11709
  • repo_url: None
  • paper_authors: Zhen Zhang, Bingqiao Luo, Shengliang Lu, Bingsheng He
  • for: This paper is written for investigating the properties of the Non-fungible tokens (NFTs) ecosystem from a temporal graph analysis perspective.
  • methods: The paper uses a live graph with NFT transaction network, which is obtained by downloading and parsing the NFT transaction activities. The authors also use a series of measurements to understand the properties of the NFT ecosystem and compare it with social, citation, and web networks.
  • results: The paper provides new observations and insights into the characteristics of the emerging NFT ecosystem, including its dynamics and properties. The authors also study machine learning models in this live graph to enrich the current datasets and provide new opportunities for the graph community.
    Abstract Numerous studies have been conducted to investigate the properties of large-scale temporal graphs. Despite the ubiquity of these graphs in real-world scenarios, it's usually impractical for us to obtain the whole real-time graphs due to privacy concerns and technical limitations. In this paper, we introduce the concept of {\it Live Graph Lab} for temporal graphs, which enables open, dynamic and real transaction graphs from blockchains. Among them, Non-fungible tokens (NFTs) have become one of the most prominent parts of blockchain over the past several years. With more than \$40 billion market capitalization, this decentralized ecosystem produces massive, anonymous and real transaction activities, which naturally forms a complicated transaction network. However, there is limited understanding about the characteristics of this emerging NFT ecosystem from a temporal graph analysis perspective. To mitigate this gap, we instantiate a live graph with NFT transaction network and investigate its dynamics to provide new observations and insights. Specifically, through downloading and parsing the NFT transaction activities, we obtain a temporal graph with more than 4.5 million nodes and 124 million edges. Then, a series of measurements are presented to understand the properties of the NFT ecosystem. Through comparisons with social, citation, and web networks, our analyses give intriguing findings and point out potential directions for future exploration. Finally, we also study machine learning models in this live graph to enrich the current datasets and provide new opportunities for the graph community. The source codes and dataset are available at https://livegraphlab.github.io.
    摘要 已有大量研究探讨大规模时序图的性质。尽管这类图在现实场景中无处不在,但由于隐私问题和技术限制,我们通常无法获取完整的实时图。在本文中,我们提出了面向时序图的 Live Graph Lab 概念,利用区块链提供开放、动态且真实的交易图。其中,非同质化代币(NFT)在过去几年已成为区块链最重要的组成部分之一。这一去中心化生态系统的市值超过 400 亿美元,产生了海量、匿名且真实的交易活动,自然形成了复杂的交易网络。然而,目前从时序图分析的角度对这一新兴 NFT 生态系统特性的理解仍然有限。为弥补这一差距,我们以 NFT 交易网络实例化了一个实时图,并研究其动态特性,以提供新的观察与洞见。具体而言,通过下载并解析 NFT 交易活动,我们得到了一个包含超过 450 万个节点和 1.24 亿条边的时序图。随后,我们通过一系列度量来理解 NFT 生态系统的性质。通过与社交网络、引文网络和 Web 网络的比较,我们的分析给出了有趣的发现,并指出了未来探索的潜在方向。最后,我们还在这个实时图上研究了机器学习模型,以丰富现有数据集并为图学习社区提供新的机遇。源代码和数据集可在 https://livegraphlab.github.io 获取。

A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge

  • paper_url: http://arxiv.org/abs/2310.11703
  • repo_url: None
  • paper_authors: Yikun Han, Chunjiang Liu, Pengfei Wang
  • for: 本文旨在为Vector数据库的高维数据存储提供一个概述,以及相关的近似搜索算法的概述。
  • methods: 本文使用分类法概述了现有的Vector数据库架构,并对近似搜索问题的解决方法进行了分类,包括 hash-based、tree-based、graph-based 和 quantization-based 等方法。
  • results: 本文提供了现有Vector数据库的挑战和大语言模型的组合,以及它们在新的可能性领域中的应用。
    Abstract A vector database is used to store high-dimensional data that cannot be characterized by traditional DBMS. Although there are not many articles describing existing or introducing new vector database architectures, the approximate nearest neighbor search problem behind vector databases has been studied for a long time, and considerable related algorithmic articles can be found in the literature. This article attempts to comprehensively review relevant algorithms to provide a general understanding of this booming research area. The basis of our framework categorises these studies by the approach of solving ANNS problem, respectively hash-based, tree-based, graph-based and quantization-based approaches. Then we present an overview of existing challenges for vector databases. Lastly, we sketch how vector databases can be combined with large language models and provide new possibilities.
    摘要 向量数据库用于存储传统 DBMS 无法刻画的高维数据。虽然描述现有向量数据库架构或提出新架构的文章不多,但向量数据库背后的近似最近邻搜索(ANNS)问题已被研究多年,文献中可以找到大量相关算法文章。本文尝试对相关算法进行全面综述,以帮助读者整体理解这一蓬勃发展的研究领域。我们的框架按解决 ANNS 问题的途径对这些研究进行分类,分别为基于哈希、基于树、基于图和基于量化的方法。随后,我们概述了向量数据库当前面临的挑战。最后,我们探讨了向量数据库如何与大语言模型结合,并由此带来新的可能性。
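
As a concrete taste of the hash-based family in the taxonomy above, here is a toy random-hyperplane LSH index; real vector databases use far more engineered variants (multiple tables, multi-probe search, quantization-based re-ranking), so treat this as a didactic sketch only.

```python
import numpy as np

class RandomHyperplaneLSH:
    """Vectors are bucketed by the sign pattern of random projections; a query
    only scans its own bucket, trading exactness for speed."""

    def __init__(self, dim, n_bits=12, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_bits, dim))
        self.buckets = {}

    def _key(self, v):
        return tuple((self.planes @ v > 0).astype(int))

    def add(self, idx, v):
        self.buckets.setdefault(self._key(v), []).append((idx, v))

    def query(self, q, k=5):
        candidates = self.buckets.get(self._key(q), [])
        candidates.sort(key=lambda item: np.linalg.norm(item[1] - q))
        return [idx for idx, _ in candidates[:k]]

index = RandomHyperplaneLSH(dim=64)
data = np.random.default_rng(1).normal(size=(1000, 64))
for i, v in enumerate(data):
    index.add(i, v)
print(index.query(data[0]))      # the result should contain 0 itself
```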

Runner re-identification from single-view video in the open-world setting

  • paper_url: http://arxiv.org/abs/2310.11700
  • repo_url: None
  • paper_authors: Tomohiro Suzuki, Kazushi Tsutsui, Kazuya Takeda, Keisuke Fujii
  • for: 该论文旨在解决开放世界设定下单视角体育视频中运动员重识别的问题,以实现自动化视频分析。
  • methods: 该方法使用预训练的 YOLOv8 与微调后的 EfficientNet 检测运动员,用 ByteTrack 进行跟踪,并利用无监督的门控循环单元(GRU)自编码器模型提取图像特征,同时结合跑动序列图像的动态特征来提高重识别精度。
  • results: 在一个跑步训练视频数据集上的实验表明,该方法的重识别准确率高于一种最先进的无监督重识别模型,并验证了无监督跑动动态特征提取器的有效性。该运动员重识别系统可用于跑步视频的自动分析。
    Abstract In many sports, player re-identification is crucial for automatic video processing and analysis. However, most of the current studies on player re-identification in multi- or single-view sports videos focus on re-identification in the closed-world setting using labeled image dataset, and player re-identification in the open-world setting for automatic video analysis is not well developed. In this paper, we propose a runner re-identification system that directly processes single-view video to address the open-world setting. In the open-world setting, we cannot use labeled dataset and have to process video directly. The proposed system automatically processes raw video as input to identify runners, and it can identify runners even when they are framed out multiple times. For the automatic processing, we first detect the runners in the video using the pre-trained YOLOv8 and the fine-tuned EfficientNet. We then track the runners using ByteTrack and detect their shoes with the fine-tuned YOLOv8. Finally, we extract the image features of the runners using an unsupervised method using the gated recurrent unit autoencoder model. To improve the accuracy of runner re-identification, we use dynamic features of running sequence images. We evaluated the system on a running practice video dataset and showed that the proposed method identified runners with higher accuracy than one of the state-of-the-art models in unsupervised re-identification. We also showed that our unsupervised running dynamic feature extractor was effective for runner re-identification. Our runner re-identification system can be useful for the automatic analysis of running videos.
    摘要 在多种运动中,玩家重新认定是自动视频处理和分析的关键。然而,当前大多数player重新认定在多视图或单视图运动视频中的研究都集中在关闭世界设定下使用标注图像集,而在开放世界设定下的自动视频分析中player重新认定还未得到充分开发。在这篇论文中,我们提出了一个runner重新认定系统,Directly处理单视图视频来解决开放世界设定。在开放世界设定下,我们不能使用标注集和处理视频 directly。提出的系统可以自动处理原始视频,并在多次框架外重新认定运动员。为了实现自动处理,我们首先在视频中检测运动员使用预训练的YOLOv8和精度调整的EfficientNet。然后,我们使用ByteTrack跟踪运动员,并使用精度调整的YOLOv8检测他们的鞋。最后,我们使用无监督方法使用闭环回归自适应模型提取运动员的图像特征。为了提高运动员重新认定的准确性,我们使用运动序列图像的动态特征。我们对一个跑步练习视频数据集进行评估,并显示了我们提出的方法可以在无监督下高度准确地重新认定运动员,并且我们的无监督跑动特征提取器是 runner重新认定中有效的。我们的runner重新认定系统可以对运动视频进行自动分析。

Architectural Implications of GNN Aggregation Programming Abstractions

  • paper_url: http://arxiv.org/abs/2310.12184
  • repo_url: None
  • paper_authors: Yingjie Qi, Jianlei Yang, Ao Zhou, Tong Qiao, Chunming Hu
  • for: 本研究旨在对现有的图数据处理抽象进行全面的评估和分析,以便为未来的图神经网络加速提供依据。
  • methods: 本研究使用现有的图数据处理抽象,并对其进行了详细的特性分析(characterization)研究,以确定哪些抽象更有效率。
  • results: 研究发现,使用不同的抽象方法可以得到不同的性能和效率。同时,对于某些特定的图数据处理任务,某些抽象方法可以显著提高性能。
    Abstract Graph neural networks (GNNs) have gained significant popularity due to the powerful capability to extract useful representations from graph data. As the need for efficient GNN computation intensifies, a variety of programming abstractions designed for optimizing GNN Aggregation have emerged to facilitate acceleration. However, there is no comprehensive evaluation and analysis upon existing abstractions, thus no clear consensus on which approach is better. In this letter, we classify existing programming abstractions for GNN Aggregation by the dimension of data organization and propagation method. By constructing these abstractions on a state-of-the-art GNN library, we perform a thorough and detailed characterization study to compare their performance and efficiency, and provide several insights on future GNN acceleration based on our analysis.
    摘要 格raph神经网络(GNNs)已经吸引了广泛的关注,因为它们可以从图数据中提取有用的表示。随着GNN计算的需求越来越高,各种优化GNN聚合的编程封装出现了,以便加速。然而,现有的编程封装没有得到全面的评估和分析,因此没有明确的共识,哪个方法更好。在这封信中,我们将现有的GNN聚合编程封装分类为数据组织维度和传播方法。通过在当今的GNN库上构建这些封装,我们进行了详细的性能和效率Characterization研究,并提供了一些关于未来GNN加速的反思。

Quantum Acceleration of Infinite Horizon Average-Reward Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2310.11684
  • repo_url: None
  • paper_authors: Bhargav Ganguly, Vaneet Aggarwal
  • for: 该论文探讨了量子加速在求解无限时域马尔可夫决策过程(MDP)、提升平均奖励方面的潜力。
  • methods: 我们提出了一种新颖的量子框架,让智能体与未知 MDP 交互,从而扩展了传统的交互范式。我们的方法基于一种乐观驱动的表格型强化学习算法,利用智能体通过高效量子均值估计技术获得的量子信号。
  • results: 我们通过严格的理论分析证明,量子均值估计的优势为无限时域强化学习带来了指数级的遗憾界改进;具体而言,所提出的量子算法可以达到 $\tilde{\mathcal{O}}(1)$ 的遗憾界,显著优于经典算法的 $\tilde{\mathcal{O}}(\sqrt{T})$ 界。
    Abstract This paper investigates the potential of quantum acceleration in addressing infinite horizon Markov Decision Processes (MDPs) to enhance average reward outcomes. We introduce an innovative quantum framework for the agent's engagement with an unknown MDP, extending the conventional interaction paradigm. Our approach involves the design of an optimism-driven tabular Reinforcement Learning algorithm that harnesses quantum signals acquired by the agent through efficient quantum mean estimation techniques. Through thorough theoretical analysis, we demonstrate that the quantum advantage in mean estimation leads to exponential advancements in regret guarantees for infinite horizon Reinforcement Learning. Specifically, the proposed Quantum algorithm achieves a regret bound of $\tilde{\mathcal{O}}(1)$, a significant improvement over the $\tilde{\mathcal{O}}(\sqrt{T})$ bound exhibited by classical counterparts.

Descriptive Knowledge Graph in Biomedical Domain

  • paper_url: http://arxiv.org/abs/2310.11681
  • repo_url: None
  • paper_authors: Kerui Zhu, Jie Huang, Kevin Chen-Chuan Chang
  • for: 该论文旨在提供一种自动抽取和生成有用和描述性句子的系统,以便有效地搜索生物医学知识。
  • methods: 该系统使用ChatGPT和一个精度调整的关系合成模型,自动生成有用和可靠的描述句子,从而减少了人类阅读努力。
  • results: 该系统可以帮助研究人员轻松地获得高级知识和详细参考,并且可以交互地循序搜索到有关的信息。在COVID-19研究中,该系统得到了广泛的应用,如药物重用和文献筛选。
    Abstract We present a novel system that automatically extracts and generates informative and descriptive sentences from the biomedical corpus and facilitates the efficient search for relational knowledge. Unlike previous search engines or exploration systems that retrieve unconnected passages, our system organizes descriptive sentences as a relational graph, enabling researchers to explore closely related biomedical entities (e.g., diseases treated by a chemical) or indirectly connected entities (e.g., potential drugs for treating a disease). Our system also uses ChatGPT and a fine-tuned relation synthesis model to generate concise and reliable descriptive sentences from retrieved information, reducing the need for extensive human reading effort. With our system, researchers can easily obtain both high-level knowledge and detailed references and interactively steer to the information of interest. We spotlight the application of our system in COVID-19 research, illustrating its utility in areas such as drug repurposing and literature curation.
    摘要 我们提出了一种新型系统,能够自动从生物医学语料库中抽取并生成信息丰富的描述性句子,便于高效检索关系知识。与以往检索互不相关段落的搜索引擎或探索系统不同,我们的系统将描述性句子组织成关系图,使研究人员可以探索密切相关的生物医学实体(例如某种化学物质所治疗的疾病)或间接关联的实体(例如治疗某种疾病的潜在药物)。系统还利用 ChatGPT 和一个经过微调的关系合成模型,从检索到的信息中生成简洁可靠的描述性句子,减少了大量的人工阅读负担。借助该系统,研究人员可以轻松获得高层次知识和详细参考,并交互式地导航到感兴趣的信息。我们展示了该系统在 COVID-19 研究中的应用,说明其在药物重定位和文献整理等方面的用途。

Using Experience Classification for Training Non-Markovian Tasks

  • paper_url: http://arxiv.org/abs/2310.11678
  • repo_url: None
  • paper_authors: Ruixuan Miao, Xu Lu, Cong Tian, Bin Yu, Zhenhua Duan
  • for: 解决实际任务中的非Markovian任务,即奖励不仅基于当前状态,而且基于状态历史。
  • methods: 提出一种新的强化学习方法,利用线性时间逻辑LTL$_f$编码到Markov决策过程中,以便利用先进的RL算法。
  • results: 通过在多个 benchmark 问题中实践,证明了我们的方法的可行性和效果。
    Abstract Unlike the standard Reinforcement Learning (RL) model, many real-world tasks are non-Markovian, whose rewards are predicated on state history rather than solely on the current state. Solving a non-Markovian task, frequently applied in practical applications such as autonomous driving, financial trading, and medical diagnosis, can be quite challenging. We propose a novel RL approach to achieve non-Markovian rewards expressed in temporal logic LTL$_f$ (Linear Temporal Logic over Finite Traces). To this end, an encoding of linear complexity from LTL$_f$ into MDPs (Markov Decision Processes) is introduced to take advantage of advanced RL algorithms. Then, a prioritized experience replay technique based on the automata structure (semantics equivalent to LTL$_f$ specification) is utilized to improve the training process. We empirically evaluate several benchmark problems augmented with non-Markovian tasks to demonstrate the feasibility and effectiveness of our approach.
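
The standard way to make an LTL$_f$ objective Markovian, which the linear-size encoding above builds on, is to run an automaton for the formula alongside the environment and reward satisfaction at episode end. Below is a gym-style sketch with a toy two-proposition DFA; the formula, the `labels` field in `info`, and the 4-tuple `step` interface are illustrative assumptions, not the paper's implementation.

```python
class ProductEnv:
    """Pair the environment state with the state of a DFA equivalent to the
    LTL_f formula, so the non-Markovian reward becomes Markovian in the
    product space. The tiny DFA here encodes "eventually reach the goal and
    never hit a hazard"."""

    def __init__(self, env):
        self.env = env
        # DFA states: 0 = goal not yet reached, 1 = satisfied, 2 = violated (sink).
        self.dfa_state = 0

    def _dfa_step(self, labels):
        if self.dfa_state == 2 or "hazard" in labels:
            self.dfa_state = 2
        elif "goal" in labels:
            self.dfa_state = 1

    def reset(self):
        self.dfa_state = 0
        return (self.env.reset(), self.dfa_state)

    def step(self, action):
        obs, _, done, info = self.env.step(action)          # classic gym 4-tuple assumed
        self._dfa_step(info.get("labels", set()))           # atomic propositions this step
        reward = 1.0 if done and self.dfa_state == 1 else 0.0
        return (obs, self.dfa_state), reward, done, info
```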

Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes

  • paper_url: http://arxiv.org/abs/2310.11677
  • repo_url: None
  • paper_authors: Washim Uddin Mondal, Vaneet Aggarwal
  • for: 这个论文是关于设计高效采样学习算法的研究,特别是针对无穷 horizon 折扣奖励Markov决策过程。
  • methods: 该算法使用加速的随机梯度下降过程来获得自然策略梯度。
  • results: 该算法可以达到 $\mathcal{O}(\epsilon^{-2})$ 的样本复杂度和 $\mathcal{O}(\epsilon^{-1})$ 的迭代复杂度,将现有最优样本复杂度改进了 $\log(\frac{1}{\epsilon})$ 因子。此外,该算法不需要"重要性采样(IS)权重方差有上界"这一无法验证的假设。在无 Hessian 且无 IS 的算法类别中,ANPG 将已知最优样本复杂度改进了 $\mathcal{O}(\epsilon^{-\frac{1}{2}})$ 倍,同时达到了这类算法最优的迭代复杂度。
    Abstract We consider the problem of designing sample efficient learning algorithms for infinite horizon discounted reward Markov Decision Process. Specifically, we propose the Accelerated Natural Policy Gradient (ANPG) algorithm that utilizes an accelerated stochastic gradient descent process to obtain the natural policy gradient. ANPG achieves $\mathcal{O}(\epsilon^{-2})$ sample complexity and $\mathcal{O}(\epsilon^{-1})$ iteration complexity with general parameterization where $\epsilon$ defines the optimality error. This improves the state-of-the-art sample complexity by a $\log(\frac{1}{\epsilon})$ factor. ANPG is a first-order algorithm and unlike some existing literature, does not require the unverifiable assumption that the variance of importance sampling (IS) weights is upper bounded. In the class of Hessian-free and IS-free algorithms, ANPG beats the best-known sample complexity by a factor of $\mathcal{O}(\epsilon^{-\frac{1}{2}})$ and simultaneously matches their state-of-the-art iteration complexity.
    摘要 我们考虑为无限时域折扣奖励马尔可夫决策过程设计样本高效的学习算法。具体而言,我们提出了加速自然策略梯度(ANPG)算法,它利用加速随机梯度下降过程来获得自然策略梯度。在一般参数化下,ANPG 实现了 $\mathcal{O}(\epsilon^{-2})$ 的样本复杂度和 $\mathcal{O}(\epsilon^{-1})$ 的迭代复杂度,其中 $\epsilon$ 表示最优性误差。这将现有最优样本复杂度改进了 $\log(\frac{1}{\epsilon})$ 因子。ANPG 是一阶算法,与部分已有文献不同,它不需要"重要性采样(IS)权重方差有上界"这一无法验证的假设。在无 Hessian 且无 IS 的算法类别中,ANPG 将已知最优样本复杂度改进了 $\mathcal{O}(\epsilon^{-\frac{1}{2}})$ 倍,同时达到了这类算法最优的迭代复杂度。

PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly Detection

  • paper_url: http://arxiv.org/abs/2310.11676
  • repo_url: https://github.com/campanulabells/prem-gad
  • paper_authors: Junjun Pan, Yixin Liu, Yizhen Zheng, Shirui Pan
  • for: 本研究旨在提高图结构数据中节点级异常检测的效率,并提供一种简单而有效的方法来实现这一目标。
  • methods: 该方法称为 PREM,包括两个模块:预处理模块和自我-邻居(ego-neighbor)匹配模块。PREM 无需消息传递传播,仅使用简单的对比损失函数,从而大幅提高训练速度并降低内存占用。
  • results: 在五个真实世界数据集上的严格评估表明了 PREM 的鲁棒性和有效性。特别是在 ACM 数据集上,与最高效的基线方法相比,PREM 的 AUC 提升了 5%,训练速度提高了 9 倍,并显著降低了内存占用。
    Abstract Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes from graph-structured data in various domains such as medicine, social networks, and e-commerce. However, challenges have arisen due to the diversity of anomalies and the dearth of labeled data. Existing methodologies - reconstruction-based and contrastive learning - while effective, often suffer from efficiency issues, stemming from their complex objectives and elaborate modules. To improve the efficiency of GAD, we introduce a simple method termed PREprocessing and Matching (PREM for short). Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities. Comprising two modules - a pre-processing module and an ego-neighbor matching module - PREM eliminates the necessity for message-passing propagation during training, and employs a simple contrastive loss, leading to considerable reductions in training time and memory usage. Moreover, through rigorous evaluations of five real-world datasets, our method demonstrated robustness and effectiveness. Notably, when validated on the ACM dataset, PREM achieved a 5% improvement in AUC, a 9-fold increase in training speed, and sharply reduce memory usage compared to the most efficient baseline.
    摘要 节点级图异常检测(GAD)在医疗、社交网络和电子商务等领域中对识别图结构数据中的异常节点至关重要。然而,异常的多样性以及标注数据的匮乏带来了挑战。现有的方法(基于重构的方法和对比学习方法)虽然有效,却常因目标函数复杂、模块繁琐而存在效率问题。为提高 GAD 的效率,我们提出了一种简单的方法,称为 PREprocessing and Matching(简称 PREM)。该方法简化了 GAD 流程,在保持强大异常检测能力的同时降低了时间和内存消耗。PREM 由两个模块组成:预处理模块和自我-邻居(ego-neighbor)匹配模块。它在训练期间无需消息传递传播,并采用简单的对比损失,从而大幅减少训练时间和内存占用。此外,通过在五个真实世界数据集上的严格评估,我们的方法展示了鲁棒性和有效性。值得一提的是,在 ACM 数据集上,与最高效的基线相比,PREM 的 AUC 提升了 5%,训练速度提高了 9 倍,并显著降低了内存占用。

Prototype-based HyperAdapter for Sample-Efficient Multi-task Tuning

  • paper_url: http://arxiv.org/abs/2310.11670
  • repo_url: https://github.com/bumble666/pha
  • paper_authors: Hao Zhao, Jie Fu, Zhaofeng He
  • for: 这个研究是为了提高预训练语言模型的扩展性和数据效率。
  • methods: 这篇论文使用了参数效率的精致调整(PEFT)方法,并提出了一个名为实例紧密抽象(PHA)的新框架,它使用了适应器调整和超级网络来生成条件模组。
  • results: 这篇论文的实验结果显示,PHA方法在多任务学习和几少例转移学习中比较其他强基eline方法表现更好,尤其是当资料量变少时。
    Abstract Parameter-efficient fine-tuning (PEFT) has shown its effectiveness in adapting the pre-trained language models to downstream tasks while only updating a small number of parameters. Despite the success, most existing methods independently adapt to each task without considering knowledge transfer between tasks and are limited to low-data regimes. To overcome this issue, we propose Prototype-based HyperAdapter (PHA), a novel framework built on the adapter-tuning and hypernetwork. It introduces an instance-dense retriever and a prototypical hypernetwork to generate the conditional modules in a sample-efficient manner. This leads to comparable performance improvements against existing PEFT methods on multi-task learning and few-shot transfer learning. More importantly, when the available data size gets smaller, our method outperforms other strong baselines by a large margin. Based on our extensive empirical experiments across various datasets, we demonstrate that PHA strikes a better trade-off between trainable parameters, accuracy on stream tasks, and sample efficiency.

SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents

  • paper_url: http://arxiv.org/abs/2310.11667
  • repo_url: None
  • paper_authors: Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Haofei Yu, Zhengyang Qi, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, Maarten Sap
  • for: 评估人工智能系统的社会智能能力
  • methods: 使用LLM-based agents和人类角色扮演者进行社会交互 scenario,并使用SOTOPIA-Eval评估框架评估模型的表现
  • results: 发现GPT-4在SOTOPIA-hard subsets中表现较差,其社交常识理解和战略通信技能受限,而人类则在这些 subsets中表现出优异的社会智能能力。
    Abstract Humans are social beings; we pursue social goals in our daily interactions, which is a crucial aspect of social intelligence. Yet, AI systems' abilities in this realm remain elusive. We present SOTOPIA, an open-ended environment to simulate complex social interactions between artificial agents and evaluate their social intelligence. In our environment, agents role-play and interact under a wide variety of scenarios; they coordinate, collaborate, exchange, and compete with each other to achieve complex social goals. We simulate the role-play interaction between LLM-based agents and humans within this task space and evaluate their performance with a holistic evaluation framework called SOTOPIA-Eval. With SOTOPIA, we find significant differences between these models in terms of their social intelligence, and we identify a subset of SOTOPIA scenarios, SOTOPIA-hard, that is generally challenging for all models. We find that on this subset, GPT-4 achieves a significantly lower goal completion rate than humans and struggles to exhibit social commonsense reasoning and strategic communication skills. These findings demonstrate SOTOPIA's promise as a general platform for research on evaluating and improving social intelligence in artificial agents.
    摘要 人类是社交生物,我们在日常互动中追求社交目标,这是人工智能系统的能力领域中的一个关键方面。然而,人工智能系统在这个领域的能力仍然尚未得到解释。我们提出了SOTOPIA,一个开放式环境,用于模拟人工智能代理人在复杂社交交互中的表现。在我们的环境中,代理人扮演和互动,在多种情况下协同合作、交换和竞争以完成复杂社交目标。我们在这个任务空间中模拟了LLM基于代理人和人类之间的角色扮演互动,并使用SOTOPIA-Eval全面评价框架进行评估。与SOTOPIA的使用,我们发现了不同的人工智能模型在社交智能方面存在显著差异,并确定了一个通用难度集合(SOTOPIA-hard),该集合对所有模型都是挑战性的。我们发现在这个集合中,GPT-4的目标完成率远低于人类,并且它很难展现社交感知和战略通信技能。这些发现表明SOTOPIA的潜在价值,作为一个通用的人工智能社交评价和改进平台。

Hetero$^2$Net: Heterophily-aware Representation Learning on Heterogenerous Graphs

  • paper_url: http://arxiv.org/abs/2310.11664
  • repo_url: None
  • paper_authors: Jintang Li, Zheng Wei, Jiawang Dan, Jing Zhou, Yuchang Zhu, Ruofan Wu, Baokun Wang, Zhang Zhen, Changhua Meng, Hong Jin, Zibin Zheng, Liang Chen
  • for: 本研究旨在 investigating the heterophily properties in heterogeneous graphs, and developing a heterophily-aware graph neural network (HGNN) to effectively handle more complex heterogeneous graphs.
  • methods: 我们使用 metapaths to identify the heterophily in heterogeneous graphs, and propose two practical metrics to quantitatively describe the levels of heterophily. We also introduce Hetero$^2$Net, a heterophily-aware HGNN that incorporates both masked metapath prediction and masked label prediction tasks to effectively handle both homophilic and heterophilic heterogeneous graphs.
  • results: 我们在 five real-world heterogeneous graph benchmarks with varying levels of heterophily 上 evaluate the performance of Hetero$^2$Net, and demonstrate that it outperforms strong baselines in the semi-supervised node classification task, providing valuable insights into effectively handling more complex heterogeneous graphs.
    Abstract Real-world graphs are typically complex, exhibiting heterogeneity in the global structure, as well as strong heterophily within local neighborhoods. While a growing body of literature has revealed the limitations of common graph neural networks (GNNs) in handling homogeneous graphs with heterophily, little work has been conducted on investigating the heterophily properties in the context of heterogeneous graphs. To bridge this research gap, we identify the heterophily in heterogeneous graphs using metapaths and propose two practical metrics to quantitatively describe the levels of heterophily. Through in-depth investigations on several real-world heterogeneous graphs exhibiting varying levels of heterophily, we have observed that heterogeneous graph neural networks (HGNNs), which inherit many mechanisms from GNNs designed for homogeneous graphs, fail to generalize to heterogeneous graphs with heterophily or low level of homophily. To address the challenge, we present Hetero$^2$Net, a heterophily-aware HGNN that incorporates both masked metapath prediction and masked label prediction tasks to effectively and flexibly handle both homophilic and heterophilic heterogeneous graphs. We evaluate the performance of Hetero$^2$Net on five real-world heterogeneous graph benchmarks with varying levels of heterophily. The results demonstrate that Hetero$^2$Net outperforms strong baselines in the semi-supervised node classification task, providing valuable insights into effectively handling more complex heterogeneous graphs.
    Summary: Real-world graphs are typically complex, exhibiting heterogeneity in their global structure as well as strong heterophily within local neighborhoods, yet little work has examined heterophily in the context of heterogeneous graphs. To address this gap, we identify heterophily in heterogeneous graphs using metapaths and propose two practical metrics to quantitatively describe its level. Through in-depth investigations on several real-world heterogeneous graphs with varying levels of heterophily, we find that existing heterogeneous graph neural networks (HGNNs) fail to generalize to heterogeneous graphs with heterophily or low levels of homophily. To address this challenge, we present Hetero$^2$Net, a heterophily-aware HGNN that incorporates both masked metapath prediction and masked label prediction tasks to effectively and flexibly handle both homophilic and heterophilic heterogeneous graphs. Evaluated on five real-world heterogeneous graph benchmarks with varying levels of heterophily, Hetero$^2$Net outperforms strong baselines in the semi-supervised node classification task, providing valuable insights into effectively handling more complex heterogeneous graphs.
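
The digest does not give the paper's exact heterophily metrics, so the following is only a plausible sketch under my own assumptions: for a chosen metapath, it measures the fraction of metapath-connected endpoint pairs whose labels differ (higher means more heterophilic).

```python
import networkx as nx

def metapath_heterophily(g: nx.Graph, metapath: list[str], labels: dict) -> float:
    """Fraction of metapath-connected endpoint pairs whose labels differ.

    `g` is a NetworkX graph whose nodes carry a 'type' attribute; `labels` maps
    labeled nodes to their class; `metapath` is a node-type sequence such as
    ['author', 'paper', 'author']. Illustrative only, not the paper's definition.
    """
    def walks_from(node, remaining):
        # Enumerate endpoints of walks that follow the remaining type sequence.
        if not remaining:
            yield node
            return
        for nbr in g.neighbors(node):
            if g.nodes[nbr].get("type") == remaining[0]:
                yield from walks_from(nbr, remaining[1:])

    diff = total = 0
    for u, data in g.nodes(data=True):
        if data.get("type") != metapath[0] or u not in labels:
            continue
        for v in set(walks_from(u, metapath[1:])):
            if v == u or v not in labels:
                continue
            total += 1
            diff += labels[u] != labels[v]
    return diff / total if total else 0.0
```

A corresponding homophily score is simply one minus this value; the metrics actually proposed in the paper may differ in normalization and in how metapath instances are counted.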

Cloud-Magnetic Resonance Imaging System: In the Era of 6G and Artificial Intelligence

  • paper_url: http://arxiv.org/abs/2310.11641
  • repo_url: None
  • paper_authors: Yirong Zhou, Yanhuang Wu, Yuhan Su, Jing Li, Jianyun Cai, Yongfu You, Di Guo, Xiaobo Qu
  • for: Address the massive volume of imaging data that medical institutions generate every year, and improve diagnostic accuracy and work efficiency.
  • methods: Integrate distributed cloud computing, 6G bandwidth, edge computing, federated learning, and blockchain technology.
  • results: Improved data storage security, transmission speed, AI algorithm maintenance, hardware upgrading, and cross-institutional medical collaboration.
    Abstract Magnetic Resonance Imaging (MRI) plays an important role in medical diagnosis, generating petabytes of image data annually in large hospitals. This voluminous data stream requires a significant amount of network bandwidth and extensive storage infrastructure. Additionally, local data processing demands substantial manpower and hardware investments. Data isolation across different healthcare institutions hinders cross-institutional collaboration in clinics and research. In this work, we anticipate an innovative MRI system and its four generations that integrate emerging distributed cloud computing, 6G bandwidth, edge computing, federated learning, and blockchain technology. This system is called Cloud-MRI, aiming at solving the problems of MRI data storage security, transmission speed, AI algorithm maintenance, hardware upgrading, and collaborative work. The workflow commences with the transformation of k-space raw data into the standardized Imaging Society for Magnetic Resonance in Medicine Raw Data (ISMRMRD) format. Then, the data are uploaded to the cloud or edge nodes for fast image reconstruction, neural network training, and automatic analysis. Then, the outcomes are seamlessly transmitted to clinics or research institutes for diagnosis and other services. The Cloud-MRI system will save the raw imaging data, reduce the risk of data loss, facilitate inter-institutional medical collaboration, and finally improve diagnostic accuracy and work efficiency.
    Summary: The Cloud-MRI workflow begins by transforming k-space raw data into the standardized ISMRMRD raw data format. The data are then uploaded to cloud or edge nodes for fast image reconstruction (a minimal reconstruction sketch follows this entry), neural network training, and automatic analysis, and the outcomes are seamlessly transmitted to clinics or research institutes for diagnosis and other services. The Cloud-MRI system preserves the raw imaging data, reduces the risk of data loss, facilitates inter-institutional medical collaboration, and ultimately improves diagnostic accuracy and work efficiency.
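
The first computational step on the cloud/edge side of this workflow is image reconstruction from k-space. The snippet below is a generic, minimal sketch of that step for fully sampled Cartesian 2D multi-coil data using an inverse FFT and root-sum-of-squares coil combination; the ISMRMRD conversion, upload, federated training, and blockchain components of Cloud-MRI are not modeled, and the array layout is an assumption of this sketch.

```python
import numpy as np

def reconstruct_2d(kspace: np.ndarray) -> np.ndarray:
    """Reconstruct a magnitude image from fully sampled 2D Cartesian k-space.

    `kspace` is a complex array of shape (coils, ky, kx). This is a textbook
    pipeline, not the Cloud-MRI system's actual reconstruction service.
    """
    # Centered 2D inverse FFT per coil: k-space -> coil images.
    coil_imgs = np.fft.fftshift(
        np.fft.ifft2(np.fft.ifftshift(kspace, axes=(-2, -1)), axes=(-2, -1)),
        axes=(-2, -1),
    )
    # Root-sum-of-squares coil combination -> magnitude image.
    return np.sqrt(np.sum(np.abs(coil_imgs) ** 2, axis=0))

# Example with synthetic data (stands in for data read from an ISMRMRD file):
if __name__ == "__main__":
    fake_kspace = np.random.randn(8, 256, 256) + 1j * np.random.randn(8, 256, 256)
    image = reconstruct_2d(fake_kspace)
    print(image.shape)  # (256, 256)
```

In the Cloud-MRI design, this kind of reconstruction would run on cloud or edge nodes after the ISMRMRD-formatted data are uploaded.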

A Symbolic Language for Interpreting Decision Trees

  • paper_url: http://arxiv.org/abs/2310.11636
  • repo_url: https://github.com/diegoemilio01/a-symbolic-language-for-interpreting-decision-trees
  • paper_authors: Marcelo Arenas, Pablo Barcelo, Diego Bustamente, Jose Caraball, Bernardo Subercaseaux
  • for: This paper builds on recent developments in formal explainable AI, examines interpretability questions for decision trees, and considers the different interpretability queries and the methods proposed to handle them.
  • methods: The paper introduces StratiFOILed, a carefully constructed fragment of first-order logic that can express many post-hoc explanations, including local ones (e.g., abductive and contrastive explanations) and global ones (e.g., feature relevancy).
  • results: The paper proposes ExplainDT, a symbolic language for interpreting decision trees that lets end-users tailor queries to their needs. StratiFOILed queries can be written as a Boolean combination of NP problems and therefore evaluated in practice with a constant number of calls to a SAT solver (a toy abductive-explanation check is sketched after this entry).
    Abstract The recent development of formal explainable AI has disputed the folklore claim that "decision trees are readily interpretable models", showing different interpretability queries that are computationally hard on decision trees, as well as proposing different methods to deal with them in practice. Nonetheless, no single explainability query or score works as a "silver bullet" that is appropriate for every context and end-user. This naturally suggests the possibility of "interpretability languages" in which a wide variety of queries can be expressed, giving control to the end-user to tailor queries to their particular needs. In this context, our work presents ExplainDT, a symbolic language for interpreting decision trees. ExplainDT is rooted in a carefully constructed fragment of first-order logic that we call StratiFOILed. StratiFOILed balances expressiveness and complexity of evaluation, allowing for the computation of many post-hoc explanations--both local (e.g., abductive and contrastive explanations) and global ones (e.g., feature relevancy)--while remaining in the Boolean Hierarchy over NP. Furthermore, StratiFOILed queries can be written as a Boolean combination of NP-problems, thus allowing us to evaluate them in practice with a constant number of calls to a SAT solver. On the theoretical side, our main contribution is an in-depth analysis of the expressiveness and complexity of StratiFOILed, while on the practical side, we provide an optimized implementation for encoding StratiFOILed queries as propositional formulas, together with an experimental study on its efficiency.
    Summary: Recent developments in formal explainable AI have challenged the folklore claim that decision trees are readily interpretable models, showing that several interpretability queries are computationally hard on decision trees and proposing different methods to handle them in practice. However, no single explainability query or score is appropriate for every context and end-user, which naturally suggests interpretability languages in which users can tailor queries to their own needs. In this context, we present ExplainDT, a symbolic language for interpreting decision trees. ExplainDT is rooted in StratiFOILed, a carefully constructed fragment of first-order logic that balances expressiveness and evaluation complexity: it can compute many post-hoc explanations, both local (e.g., abductive and contrastive explanations) and global (e.g., feature relevancy), while remaining in the Boolean Hierarchy over NP. Moreover, StratiFOILed queries can be written as a Boolean combination of NP problems and thus evaluated in practice with a constant number of calls to a SAT solver. On the theoretical side, the main contribution is an in-depth analysis of the expressiveness and complexity of StratiFOILed; on the practical side, the paper provides an optimized encoding of StratiFOILed queries as propositional formulas, together with an experimental study of its efficiency.
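
To make the kind of query ExplainDT targets concrete, here is a small, self-contained sketch (a toy formulation of my own, not the paper's StratiFOILed encoding): it brute-forces whether a subset of fixed features is a sufficient reason (an abductive explanation) for a decision tree's prediction over Boolean features, i.e., whether every completion of the unfixed features yields the same class. The paper instead compiles such queries into propositional formulas and answers them with a constant number of SAT-solver calls.

```python
from itertools import product

# A toy decision tree over Boolean features, encoded as nested tuples:
# (feature_index, left_subtree, right_subtree) for internal nodes, or a class label at leaves.
TREE = (0, (1, "reject", "accept"), (2, "accept", "reject"))

def classify(tree, x):
    while isinstance(tree, tuple):
        feat, left, right = tree
        tree = right if x[feat] else left
    return tree

def is_sufficient_reason(tree, instance, fixed, n_features):
    """True iff fixing `fixed` features to their values in `instance` forces the prediction.

    Brute force over all completions; a SAT-based approach would instead encode the
    negation ("some completion changes the class") as a propositional formula and
    ask a solver, which scales far beyond this exhaustive check.
    """
    target = classify(tree, instance)
    free = [i for i in range(n_features) if i not in fixed]
    for values in product([0, 1], repeat=len(free)):
        x = dict(zip(free, values))
        x.update({i: instance[i] for i in fixed})
        if classify(tree, x) != target:
            return False
    return True

if __name__ == "__main__":
    instance = {0: 1, 1: 0, 2: 0}        # full assignment to 3 Boolean features
    print(classify(TREE, instance))       # -> "accept"
    print(is_sufficient_reason(TREE, instance, fixed={0, 2}, n_features=3))  # -> True
```

Running the example prints "accept" and True, confirming that fixing features 0 and 2 alone already determines the prediction for this toy tree.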