cs.AI - 2023-07-26

Improving International Climate Policy via Mutually Conditional Binding Commitments

  • paper_url: http://arxiv.org/abs/2307.14267
  • repo_url: None
  • paper_authors: Jobst Heitzig, Jörg Oechssler, Christoph Pröschel, Niranjana Ragavan, Yat Long Lo
  • for: This paper addresses a key challenge facing the Paris Agreement: the unconditional nature of most Nationally Determined Contributions (NDCs), which leads to free-riding among major emitters and a lack of concrete conditionality in the NDCs.
  • methods: The paper proposes a decentralized, bottom-up decision mechanism, the Conditional Commitment Mechanism, inspired by the National Popular Vote Interstate Compact, which offers flexibility and incentives for early adopters.
  • results: The paper presents an overview of the mechanism, its performance in the AI4ClimateCooperation challenge, and potential real-world implementation aspects.
    Abstract The Paris Agreement, considered a significant milestone in climate negotiations, has faced challenges in effectively addressing climate change due to the unconditional nature of most Nationally Determined Contributions (NDCs). This has resulted in a prevalence of free-riding behavior among major polluters and a lack of concrete conditionality in NDCs. To address this issue, we propose the implementation of a decentralized, bottom-up approach called the Conditional Commitment Mechanism. This mechanism, inspired by the National Popular Vote Interstate Compact, offers flexibility and incentives for early adopters, aiming to formalize conditional cooperation in international climate policy. In this paper, we provide an overview of the mechanism, its performance in the AI4ClimateCooperation challenge, and discuss potential real-world implementation aspects. Prior knowledge of the climate mitigation collective action problem, basic economic principles, and game theory concepts are assumed.

Improving International Climate Policy via Mutually Conditional Binding Commitments

  • paper_url: http://arxiv.org/abs/2307.14266
  • repo_url: None
  • paper_authors: Jobst Heitzig, Jörg Oechssler, Christoph Pröschel, Niranjana Ragavan, Richie YatLong Lo
  • for: Improving the realism of international climate policy negotiation modeling.
  • methods: Enhancements to the RICE-N simulation and multi-agent reinforcement learning framework, building on the Conditional Commitments Mechanism (CCF mechanism).
  • results: Suggestions for bridging the gap between simulation and reality, strengthening coordination, incorporating social factors, and improving the underlying reinforcement learning algorithm, so as to make international climate policy decision-making more effective and feasible.
    Abstract This paper proposes enhancements to the RICE-N simulation and multi-agent reinforcement learning framework to improve the realism of international climate policy negotiations. Acknowledging the framework's value, we highlight the necessity of significant enhancements to address the diverse array of factors in modeling climate negotiations. Building upon our previous work on the "Conditional Commitments Mechanism" (CCF mechanism) we discuss ways to bridge the gap between simulation and reality. We suggest the inclusion of a recommender or planner agent to enhance coordination, address the Real2Sim gap by incorporating social factors and non-party stakeholder sub-agents, and propose enhancements to the underlying Reinforcement Learning solution algorithm. These proposed improvements aim to advance the evaluation and formulation of negotiation protocols for more effective international climate policy decision-making in RICE-N. However, further experimentation and testing are required to determine the implications and effectiveness of these suggestions.

The flow of ideas in word embeddings

  • paper_url: http://arxiv.org/abs/2307.16819
  • repo_url: None
  • paper_authors: Debayan Dasgupta
  • for: investigate the similarity-based flow of ideas in language models
  • methods: adopts microrheology tools and random walker in word embeddings
  • results: shows signatures of anomalous diffusion and potential association with creativity
    Abstract The flow of ideas has been extensively studied by physicists, psychologists, and machine learning engineers. This paper adopts specific tools from microrheology to investigate the similarity-based flow of ideas. We introduce a random walker in word embeddings and study its behavior. Such similarity-mediated random walks through the embedding space show signatures of anomalous diffusion commonly observed in complex structured systems such as biological cells and complex fluids. The paper concludes by proposing the application of popular tools employed in the study of random walks and diffusion of particles under Brownian motion to assess quantitatively the incorporation of diverse ideas in a document. Overall, this paper presents a self-referenced method combining microrheology and machine learning concepts to explore the meandering tendencies of language models and their potential association with creativity.
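
A minimal self-contained sketch of the core device described above: a similarity-mediated random walker over word embeddings, with a mean-squared-displacement (MSD) fit used to probe for anomalous diffusion. The random embedding matrix, neighbor rule, and softmax temperature are our assumptions for illustration; the paper works with real word embeddings.

```python
# Similarity-mediated random walk in an embedding space, plus an MSD fit.
import numpy as np

rng = np.random.default_rng(0)
vocab, dim = 5000, 50
emb = rng.normal(size=(vocab, dim))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # unit-normalize rows

def step(idx, k=10, temp=0.1):
    """Move to one of the k nearest neighbors, sampled by softmax of cosine similarity."""
    sims = emb @ emb[idx]
    sims[idx] = -np.inf                     # exclude self
    nbrs = np.argpartition(-sims, k)[:k]    # k most similar words
    p = np.exp(sims[nbrs] / temp)
    return rng.choice(nbrs, p=p / p.sum())

# Walk and record positions in embedding space.
T, pos = 2000, [0]
for _ in range(T):
    pos.append(step(pos[-1]))
traj = emb[pos]

# MSD(tau) ~ tau^alpha; alpha != 1 signals anomalous diffusion.
taus = np.arange(1, 200)
msd = np.array([np.mean(np.sum((traj[t:] - traj[:-t]) ** 2, axis=1)) for t in taus])
alpha = np.polyfit(np.log(taus), np.log(msd), 1)[0]
print(f"estimated diffusion exponent alpha = {alpha:.2f}")
```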

Visual Saliency Detection in Advanced Driver Assistance Systems

  • paper_url: http://arxiv.org/abs/2308.03770
  • repo_url: None
  • paper_authors: Francesco Rundo, Michael Sebastian Rundo, Concetto Spampinato
  • for: This paper proposes an intelligent system that jointly estimates the driver's attention level and the saliency of the driving scene.
  • methods: The system combines a 3D deep network for semantic segmentation, a dedicated 1D temporal deep convolutional network for classifying the driver's PPG signal, and a hardware accelerator on an embedded platform.
  • results: Experimental results show that the system reliably assesses driver attentiveness against the saliency-based scene classification, improving driving safety.
    Abstract Visual Saliency refers to the innate human mechanism of focusing on and extracting important features from the observed environment. Recently, there has been a notable surge of interest in the field of automotive research regarding the estimation of visual saliency. While operating a vehicle, drivers naturally direct their attention towards specific objects, employing brain-driven saliency mechanisms that prioritize certain elements over others. In this investigation, we present an intelligent system that combines a drowsiness detection system for drivers with a scene comprehension pipeline based on saliency. To achieve this, we have implemented a specialized 3D deep network for semantic segmentation, which has been pretrained and tailored for processing the frames captured by an automotive-grade external camera. The proposed pipeline was hosted on an embedded platform utilizing the STA1295 core, featuring ARM A7 dual-cores, and embeds an hardware accelerator. Additionally, we employ an innovative biosensor embedded on the car steering wheel to monitor the driver drowsiness, gathering the PhotoPlethysmoGraphy (PPG) signal of the driver. A dedicated 1D temporal deep convolutional network has been devised to classify the collected PPG time-series, enabling us to assess the driver level of attentiveness. Ultimately, we compare the determined attention level of the driver with the corresponding saliency-based scene classification to evaluate the overall safety level. The efficacy of the proposed pipeline has been validated through extensive experimental results.
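
As a concrete illustration of the PPG-classification stage, here is a hedged sketch of a 1D temporal CNN that maps a PPG window to attentiveness classes. The layer sizes, window length, and two-class output are our assumptions; the paper does not publish its exact architecture.

```python
import torch
import torch.nn as nn

class PPG1DNet(nn.Module):
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),           # collapse the time axis
        )
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):                      # x: (batch, 1, window)
        return self.head(self.features(x).squeeze(-1))

# One batch of PPG windows -> drowsy/alert logits.
model = PPG1DNet()
logits = model(torch.randn(8, 1, 1024))
print(logits.shape)                            # torch.Size([8, 2])
```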

A New Perspective on Evaluation Methods for Explainable Artificial Intelligence (XAI)

  • paper_url: http://arxiv.org/abs/2307.14246
  • repo_url: None
  • paper_authors: Timo Speith, Markus Langer
  • for: This paper examines the significance of Explainable Artificial Intelligence (XAI) in Requirements Engineering (RE) and the impact of explainability on system quality.
  • methods: A critical examination of the supposed trade-off between explainability and performance, proposing a nuanced approach to ease this tension.
  • results: The paper argues that, depending on resource availability, domain characteristics, and risk, explainability and performance can be balanced rather than simply traded off, providing a foundation for future research and best practices to advance RE for AI.
    Abstract Within the field of Requirements Engineering (RE), the increasing significance of Explainable Artificial Intelligence (XAI) in aligning AI-supported systems with user needs, societal expectations, and regulatory standards has garnered recognition. In general, explainability has emerged as an important non-functional requirement that impacts system quality. However, the supposed trade-off between explainability and performance challenges the presumed positive influence of explainability. If meeting the requirement of explainability entails a reduction in system performance, then careful consideration must be given to which of these quality aspects takes precedence and how to compromise between them. In this paper, we critically examine the alleged trade-off. We argue that it is best approached in a nuanced way that incorporates resource availability, domain characteristics, and considerations of risk. By providing a foundation for future research and best practices, this work aims to advance the field of RE for AI.

Revisiting the Performance-Explainability Trade-Off in Explainable Artificial Intelligence (XAI)

  • paper_url: http://arxiv.org/abs/2307.14239
  • repo_url: None
  • paper_authors: Barnaby Crook, Maximilian Schlüter, Timo Speith
  • for: This paper aims to advance the field of Requirements Engineering (RE) for Artificial Intelligence (AI) by critically examining the supposed trade-off between explainability and performance.
  • methods: The paper argues that the trade-off between explainability and performance should be approached in a nuanced way that incorporates resource availability, domain characteristics, and considerations of risk.
  • results: The paper provides a foundation for future research and best practices in RE for AI, with the goal of advancing the field.
    Abstract Within the field of Requirements Engineering (RE), the increasing significance of Explainable Artificial Intelligence (XAI) in aligning AI-supported systems with user needs, societal expectations, and regulatory standards has garnered recognition. In general, explainability has emerged as an important non-functional requirement that impacts system quality. However, the supposed trade-off between explainability and performance challenges the presumed positive influence of explainability. If meeting the requirement of explainability entails a reduction in system performance, then careful consideration must be given to which of these quality aspects takes precedence and how to compromise between them. In this paper, we critically examine the alleged trade-off. We argue that it is best approached in a nuanced way that incorporates resource availability, domain characteristics, and considerations of risk. By providing a foundation for future research and best practices, this work aims to advance the field of RE for AI.

UnScientify: Detecting Scientific Uncertainty in Scholarly Full Text

  • paper_url: http://arxiv.org/abs/2307.14236
  • repo_url: None
  • paper_authors: Panggih Kusuma Ningrum, Philipp Mayr, Iana Atanassova
  • for: The goal is an interactive system for detecting scientific uncertainty in scholarly full text.
  • methods: The system uses a weakly supervised technique with a fine-grained annotation scheme to identify verbally formulated uncertainty at the sentence level; its pipeline combines pattern matching, complex-sentence checking, and authorial reference checking.
  • results: The system automates the labeling and annotation of scientific-uncertainty markers, covering different types of scientific uncertainty, for applications such as information retrieval, text mining, and scholarly document processing; it also produces interpretable results that aid comprehension of the identified instances.
    Abstract This demo paper presents UnScientify, an interactive system designed to detect scientific uncertainty in scholarly full text. The system utilizes a weakly supervised technique that employs a fine-grained annotation scheme to identify verbally formulated uncertainty at the sentence level in scientific texts. The pipeline for the system includes a combination of pattern matching, complex sentence checking, and authorial reference checking. Our approach automates labeling and annotation tasks for scientific uncertainty identification, taking into account different types of scientific uncertainty, that can serve various applications such as information retrieval, text mining, and scholarly document processing. Additionally, UnScientify provides interpretable results, aiding in the comprehension of identified instances of scientific uncertainty in text.
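
To make the pattern-matching stage concrete, below is a toy sketch that flags sentences containing hedging cues. The cue list and regular expressions are illustrative assumptions on our part, not the UnScientify pattern set.

```python
# Toy hedging-cue matcher: flag sentences with verbally formulated uncertainty.
import re

HEDGE_CUES = [
    r"\bmay\b", r"\bmight\b", r"\bcould\b", r"\bpossibl[ye]\b",
    r"\bsuggests?\b", r"\bappears? to\b", r"\bit is likely\b",
]
pattern = re.compile("|".join(HEDGE_CUES), re.IGNORECASE)

def flag_uncertain(sentences):
    """Return (sentence, matched_cues) pairs for sentences containing hedging cues."""
    out = []
    for s in sentences:
        cues = pattern.findall(s)
        if cues:
            out.append((s, cues))
    return out

doc = [
    "The results suggest that the effect may be driven by sampling bias.",
    "We trained the model for 10 epochs.",
]
for sent, cues in flag_uncertain(doc):
    print(cues, "->", sent)
```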

Non-Linear Self Augmentation Deep Pipeline for Cancer Treatment outcome Prediction

  • paper_url: http://arxiv.org/abs/2307.14398
  • repo_url: None
  • paper_authors: Francesco Rundo, Concetto Spampinato, Michael Rundo
  • for: The goal is to improve the prediction of immunotherapy treatment outcomes so that eligible patients can be selected more effectively.
  • methods: A novel non-linear cellular architecture combined with a deep downstream classifier that selects and enhances 2D features extracted from chest-abdomen CT images, designed to integrate with an advanced embedded Point of Care system.
  • results: The approach achieves an overall accuracy of approximately 93% in a case study on Metastatic Urothelial Carcinoma (mUC), a particularly aggressive form of cancer.
    Abstract Immunotherapy emerges as promising approach for treating cancer. Encouraging findings have validated the efficacy of immunotherapy medications in addressing tumors, resulting in prolonged survival rates and notable reductions in toxicity compared to conventional chemotherapy methods. However, the pool of eligible patients for immunotherapy remains relatively small, indicating a lack of comprehensive understanding regarding the physiological mechanisms responsible for favorable treatment response in certain individuals while others experience limited benefits. To tackle this issue, the authors present an innovative strategy that harnesses a non-linear cellular architecture in conjunction with a deep downstream classifier. This approach aims to carefully select and enhance 2D features extracted from chest-abdomen CT images, thereby improving the prediction of treatment outcomes. The proposed pipeline has been meticulously designed to seamlessly integrate with an advanced embedded Point of Care system. In this context, the authors present a compelling case study focused on Metastatic Urothelial Carcinoma (mUC), a particularly aggressive form of cancer. Performance evaluation of the proposed approach underscores its effectiveness, with an impressive overall accuracy of approximately 93%

Sources of Opacity in Computer Systems: Towards a Comprehensive Taxonomy

  • paper_url: http://arxiv.org/abs/2307.14232
  • repo_url: None
  • paper_authors: Sara Mann, Barnaby Crook, Lena Kästner, Astrid Schomäcker, Timo Speith
  • for: This work aims to improve the transparency of modern computer systems, particularly in domains where desiderata such as fairness or accountability are crucial.
  • methods: The work proposes a taxonomy of eight sources of opacity falling into three main categories: architectural, analytical, and socio-technical, with initial practical suggestions for addressing each source.
  • results: The result is a context-dependent taxonomy that helps requirements engineers and other practitioners understand the sources of opacity prevalent in a given context and select or develop appropriate strategies for overcoming them.
    Abstract Modern computer systems are ubiquitous in contemporary life yet many of them remain opaque. This poses significant challenges in domains where desiderata such as fairness or accountability are crucial. We suggest that the best strategy for achieving system transparency varies depending on the specific source of opacity prevalent in a given context. Synthesizing and extending existing discussions, we propose a taxonomy consisting of eight sources of opacity that fall into three main categories: architectural, analytical, and socio-technical. For each source, we provide initial suggestions as to how to address the resulting opacity in practice. The taxonomy provides a starting point for requirements engineers and other practitioners to understand contextually prevalent sources of opacity, and to select or develop appropriate strategies for overcoming them.

Explore the possibility of advancing climate negotiations on the basis of regional trade organizations: A study based on RICE-N

  • paper_url: http://arxiv.org/abs/2307.14226
  • repo_url: None
  • paper_authors: Wubo Dai
  • for: This paper aims to provide new theoretical support for climate negotiations, given the currently unclear prospects for international cooperation.
  • methods: The paper uses deep learning to build an agent-based model (ABM) and, building on the RICE-N model, simulates climate negotiations organized around existing trade groups.
  • results: Simulation results show that the scheme has good prospects.
    Abstract Climate issues have become more and more important now. Although global governments have made some progress, we are still facing the truth that the prospect of international cooperation is not clear at present. Due to the limitations of the Integrated assessment models (IAMs) model, it is difficult to simulate the dynamic negotiation process. Therefore, using deep learning to build a new agents based model (ABM) might can provide new theoretical support for climate negotiations. Building on the RICE-N model, this work proposed an approach to climate negotiations based on existing trade groups. Simulation results show that the scheme has a good prospect.

AI and Education: An Investigation into the Use of ChatGPT for Systems Thinking

  • paper_url: http://arxiv.org/abs/2307.14206
  • repo_url: None
  • paper_authors: Holger Arndt
  • for: This exploratory study examines whether the artificial intelligence tool ChatGPT can support systems thinking (ST) across various subjects.
  • methods: The study uses both general and subject-specific prompts to assess the accuracy, helpfulness, and reliability of ChatGPT's responses across different versions of the tool.
  • results: ChatGPT provided largely correct and very helpful responses in various subjects, suggesting its potential for enhancing ST skills; occasional inaccuracies mean users must remain critical. Despite some limitations, the study suggests that, with careful use and attention to its idiosyncrasies, ChatGPT can be a valuable tool for teaching and learning ST.
    Abstract This exploratory study investigates the potential of the artificial intelligence tool, ChatGPT, to support systems thinking (ST) in various subjects. Using both general and subject specific prompts, the study assesses the accuracy, helpfulness, and reliability of ChatGPT's responses across different versions of the tool. The results indicate that ChatGPT can provide largely correct and very helpful responses in various subjects, demonstrating its potential as a tool for enhancing ST skills. However, occasional inaccuracies highlight the need for users to remain critical of ChatGPT's responses. Despite some limitations, this study suggests that with careful use and attention to its idiosyncrasies, ChatGPT can be a valuable tool for teaching and learning ST.

Unveiling Security, Privacy, and Ethical Concerns of ChatGPT

  • paper_url: http://arxiv.org/abs/2307.14192
  • repo_url: None
  • paper_authors: Xiaodong Wu, Ran Duan, Jianbing Ni
  • for: This study examines how ChatGPT is applied across domains, together with its security, privacy, and ethical implications.
  • methods: The study traces the upgrade path from GPT-1 to GPT-4, covering the techniques involved (such as topic modeling and reinforcement learning) and discussing the model's features, limitations, and potential applications.
  • results: The study highlights the potential risks of integrating ChatGPT into daily life, covering security, privacy, and ethics, and identifies open problems in these areas.
    Abstract This paper delves into the realm of ChatGPT, an AI-powered chatbot that utilizes topic modeling and reinforcement learning to generate natural responses. Although ChatGPT holds immense promise across various industries, such as customer service, education, mental health treatment, personal productivity, and content creation, it is essential to address its security, privacy, and ethical implications. By exploring the upgrade path from GPT-1 to GPT-4, discussing the model's features, limitations, and potential applications, this study aims to shed light on the potential risks of integrating ChatGPT into our daily lives. Focusing on security, privacy, and ethics issues, we highlight the challenges these concerns pose for widespread adoption. Finally, we analyze the open problems in these areas, calling for concerted efforts to ensure the development of secure and ethically sound large language models.

LOIS: Looking Out of Instance Semantics for Visual Question Answering

  • paper_url: http://arxiv.org/abs/2307.14142
  • repo_url: None
  • paper_authors: Siyu Zhang, Yeming Chen, Yaoru Sun, Fang Wang, Haibo Shi, Haoran Wang
  • for: The goal is to improve the reasoning ability of visual question answering (VQA) models, especially their understanding of the relationships between object semantics in images.
  • methods: A model framework without bounding boxes, termed Looking Out of Instance Semantics (LOIS), produces finer-grained feature descriptions of visual facts; two relation attention modules, 1) intra-modality and 2) inter-modality, resolve the label ambiguity caused by instance masks and infer answers from the different multi-view features.
  • results: Experiments on four benchmark VQA datasets show that the proposed method performs favorably in improving visual reasoning capability.
    Abstract Visual question answering (VQA) has been intensively studied as a multimodal task that requires effort in bridging vision and language to infer answers correctly. Recent attempts have developed various attention-based modules for solving VQA tasks. However, the performance of model inference is largely bottlenecked by visual processing for semantics understanding. Most existing detection methods rely on bounding boxes, remaining a serious challenge for VQA models to understand the causal nexus of object semantics in images and correctly infer contextual information. To this end, we propose a finer model framework without bounding boxes in this work, termed Looking Out of Instance Semantics (LOIS) to tackle this important issue. LOIS enables more fine-grained feature descriptions to produce visual facts. Furthermore, to overcome the label ambiguity caused by instance masks, two types of relation attention modules: 1) intra-modality and 2) inter-modality, are devised to infer the correct answers from the different multi-view features. Specifically, we implement a mutual relation attention module to model sophisticated and deeper visual semantic relations between instance objects and background information. In addition, our proposed attention model can further analyze salient image regions by focusing on important word-related questions. Experimental results on four benchmark VQA datasets prove that our proposed method has favorable performance in improving visual reasoning capability.

Piecewise-Stationary Combinatorial Semi-Bandit with Causally Related Rewards

  • paper_url: http://arxiv.org/abs/2307.14138
  • repo_url: None
  • paper_authors: Behzad Nourani-Koliji, Steven Bilaj, Amir Rezaei Balef, Setareh Maghsudi
  • for: To solve the piecewise-stationary combinatorial semi-bandit problem with causally related rewards, handling distribution changes and causal relationships in a non-stationary environment.
  • methods: An Upper Confidence Bound (UCB) based algorithm with an adaptive change-point detector based on the Generalized Likelihood Ratio (GLR) test, plus a newly introduced group restart strategy for structured environments.
  • results: A regret upper bound that reflects the effect of the number of structural and distribution changes on performance, and superior performance over the benchmarks in real-world scenarios.
    Abstract We study the piecewise stationary combinatorial semi-bandit problem with causally related rewards. In our nonstationary environment, variations in the base arms' distributions, causal relationships between rewards, or both, change the reward generation process. In such an environment, an optimal decision-maker must follow both sources of change and adapt accordingly. The problem becomes aggravated in the combinatorial semi-bandit setting, where the decision-maker only observes the outcome of the selected bundle of arms. The core of our proposed policy is the Upper Confidence Bound (UCB) algorithm. We assume the agent relies on an adaptive approach to overcome the challenge. More specifically, it employs a change-point detector based on the Generalized Likelihood Ratio (GLR) test. Besides, we introduce the notion of group restart as a new alternative restarting strategy in the decision making process in structured environments. Finally, our algorithm integrates a mechanism to trace the variations of the underlying graph structure, which captures the causal relationships between the rewards in the bandit setting. Theoretically, we establish a regret upper bound that reflects the effects of the number of structural- and distribution changes on the performance. The outcome of our numerical experiments in real-world scenarios exhibits applicability and superior performance of our proposal compared to the state-of-the-art benchmarks.
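
To illustrate the change-detection component, here is a hedged sketch of a Bernoulli GLR change-point test of the kind the policy builds on: scan all split points of a reward stream and flag a change when the likelihood-ratio statistic exceeds a threshold. The threshold and stream are illustrative; the paper couples this detector with restarts of the bandit indices.

```python
import numpy as np

def bern_kl(p, q, eps=1e-9):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p, q = np.clip(p, eps, 1 - eps), np.clip(q, eps, 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def glr_change_detected(rewards, threshold=6.0):
    """True if some split s makes the two-segment model much more likely."""
    x = np.asarray(rewards, dtype=float)
    n, mu = len(x), np.mean(rewards)
    for s in range(1, n):
        mu1, mu2 = x[:s].mean(), x[s:].mean()
        stat = s * bern_kl(mu1, mu) + (n - s) * bern_kl(mu2, mu)
        if stat > threshold:
            return True
    return False

rng = np.random.default_rng(1)
stream = np.concatenate([rng.binomial(1, 0.2, 300), rng.binomial(1, 0.8, 100)])
print(glr_change_detected(stream))   # True: the mean jumps at t=300
```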

Developing and Evaluating Tiny to Medium-Sized Turkish BERT Models

  • paper_url: http://arxiv.org/abs/2307.14134
  • repo_url: None
  • paper_authors: Himmet Toprak Kesgin, Muzaffer Kaan Yuce, Mehmet Fatih Amasyali
  • for: This paper aims to bridge the research gap in less-resourced languages by introducing and evaluating tiny, mini, small, and medium-sized uncased Turkish BERT models.
  • methods: The authors trained these models on a diverse dataset encompassing over 75GB of text from multiple sources and tested them on several tasks, including mask prediction, sentiment analysis, news classification, and zero-shot classification.
  • results: Despite their smaller size, the models exhibited robust performance, including on zero-shot tasks, while ensuring computational efficiency and faster execution times.
    Abstract This study introduces and evaluates tiny, mini, small, and medium-sized uncased Turkish BERT models, aiming to bridge the research gap in less-resourced languages. We trained these models on a diverse dataset encompassing over 75GB of text from multiple sources and tested them on several tasks, including mask prediction, sentiment analysis, news classification, and, zero-shot classification. Despite their smaller size, our models exhibited robust performance, including zero-shot task, while ensuring computational efficiency and faster execution times. Our findings provide valuable insights into the development and application of smaller language models, especially in the context of the Turkish language.
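
As a usage illustration, the sketch below probes such a model through the Hugging Face fill-mask pipeline, one of the tasks evaluated above. The model identifier is a placeholder, not the authors' checkpoint name; substitute the actual released model.

```python
from transformers import pipeline

# NOTE: hypothetical model id -- replace with the authors' published checkpoint.
fill = pipeline("fill-mask", model="path/to/turkish-tiny-bert-uncased")

# "Türkiye'nin başkenti [MASK]." = "The capital of Turkey is [MASK]."
for pred in fill("Türkiye'nin başkenti [MASK].")[:3]:
    print(f"{pred['token_str']:>12}  p={pred['score']:.3f}")
```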

A semantics-driven methodology for high-quality image annotation

  • paper_url: http://arxiv.org/abs/2307.14119
  • repo_url: None
  • paper_authors: Fausto Giunchiglia, Mayukh Bagchi, Xiaolei Diao
  • for: The aim is a methodology grounded in natural language processing, knowledge representation, and computer vision that reduces the role of subjective judgment in image annotation.
  • methods: The methodology exploits the WordNet lexico-semantic hierarchy to provide the meaning of natural language labels and drives image annotation based on the objects and visual properties they depict.
  • results: The methodology is validated on images populating a subset of the ImageNet hierarchy, showing reduced reliance on annotators' subjective choices.
    Abstract Recent work in Machine Learning and Computer Vision has highlighted the presence of various types of systematic flaws inside ground truth object recognition benchmark datasets. Our basic tenet is that these flaws are rooted in the many-to-many mappings which exist between the visual information encoded in images and the intended semantics of the labels annotating them. The net consequence is that the current annotation process is largely under-specified, thus leaving too much freedom to the subjective judgment of annotators. In this paper, we propose vTelos, an integrated Natural Language Processing, Knowledge Representation, and Computer Vision methodology whose main goal is to make explicit the (otherwise implicit) intended annotation semantics, thus minimizing the number and role of subjective choices. A key element of vTelos is the exploitation of the WordNet lexico-semantic hierarchy as the main means for providing the meaning of natural language labels and, as a consequence, for driving the annotation of images based on the objects and the visual properties they depict. The methodology is validated on images populating a subset of the ImageNet hierarchy.
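
A brief sketch of the kind of WordNet lookup the methodology relies on: anchoring a label to a disambiguated sense and walking its hypernym chain. It requires the NLTK WordNet corpus (nltk.download("wordnet")); the label is illustrative.

```python
from nltk.corpus import wordnet as wn

synset = wn.synsets("laptop", pos=wn.NOUN)[0]   # disambiguated sense of the label
print(synset.definition())

# Walk up the hypernym chain: laptop -> portable computer -> ... -> entity
node = synset
while node.hypernyms():
    node = node.hypernyms()[0]
    print(" ->", node.name())
```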

GraphRNN Revisited: An Ablation Study and Extensions for Directed Acyclic Graphs

  • paper_url: http://arxiv.org/abs/2307.14109
  • repo_url: None
  • paper_authors: Taniya Das, Mark Koch, Maya Ravichandran, Nikhil Khatri
  • for: Learning generative models for graphs.
  • methods: A reproduced implementation of the GraphRNN deep learning architecture, evaluated against baseline models with new metrics, together with an ablation study.
  • results: 1) The reproduction confirms the results of You et al., and the ablation shows that the BFS traversal used to collapse representations of isomorphic graphs contributes significantly to model performance; 2) replacing the BFS traversal with a topological sort extends GraphRNN to directed acyclic graphs, with significant improvement over a directed-multiclass variant on a real-world dataset.
    Abstract GraphRNN is a deep learning-based architecture proposed by You et al. for learning generative models for graphs. We replicate the results of You et al. using a reproduced implementation of the GraphRNN architecture and evaluate this against baseline models using new metrics. Through an ablation study, we find that the BFS traversal suggested by You et al. to collapse representations of isomorphic graphs contributes significantly to model performance. Additionally, we extend GraphRNN to generate directed acyclic graphs by replacing the BFS traversal with a topological sort. We demonstrate that this method improves significantly over a directed-multiclass variant of GraphRNN on a real-world dataset.
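
The ordering choice is the crux of the extension, so here is a small sketch contrasting the two node orderings: BFS for undirected graphs (as in GraphRNN) versus a topological sort for DAGs (the proposed replacement). The toy graphs are ours, not the paper's data.

```python
import networkx as nx

# Undirected graph: a BFS order collapses many isomorphic adjacency matrices.
g = nx.cycle_graph(5)
bfs_order = list(nx.bfs_tree(g, source=0))
print("BFS order:", bfs_order)

# DAG: a topological sort guarantees every edge points "forward" in the
# sequence, so an autoregressive model only predicts edges to earlier
# nodes -- acyclicity of the generated graph comes for free.
dag = nx.DiGraph([(0, 1), (0, 2), (1, 3), (2, 3)])
topo_order = list(nx.topological_sort(dag))
print("Topological order:", topo_order)
```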

Actions Speak What You Want: Provably Sample-Efficient Reinforcement Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks

  • paper_url: http://arxiv.org/abs/2307.14085
  • repo_url: None
  • paper_authors: Siyu Chen, Mengdi Wang, Zhuoran Yang
  • for: The paper is written for learning a Quantal Stackelberg Equilibrium (QSE) in an episodic Markov game with a leader-follower structure.
  • methods: The paper uses reinforcement learning (RL) and entropy-regularized policy optimization to solve the leader’s decision-making problem. The authors propose sample-efficient algorithms for both the online and offline settings, based on maximum likelihood estimation and model-free or model-based RL.
  • results: The paper achieves sublinear regret upper bounds for the leader’s decision-making problem, and also quantifies the uncertainty of the estimators. The authors propose optimistic and pessimistic algorithms for online and offline settings, and show that their algorithms are computationally efficient when specialized to the linear and myopic setting.
    Abstract We study reinforcement learning (RL) for learning a Quantal Stackelberg Equilibrium (QSE) in an episodic Markov game with a leader-follower structure. In specific, at the outset of the game, the leader announces her policy to the follower and commits to it. The follower observes the leader's policy and, in turn, adopts a quantal response policy by solving an entropy-regularized policy optimization problem induced by leader's policy. The goal of the leader is to find her optimal policy, which yields the optimal expected total return, by interacting with the follower and learning from data. A key challenge of this problem is that the leader cannot observe the follower's reward, and needs to infer the follower's quantal response model from his actions against leader's policies. We propose sample-efficient algorithms for both the online and offline settings, in the context of function approximation. Our algorithms are based on (i) learning the quantal response model via maximum likelihood estimation and (ii) model-free or model-based RL for solving the leader's decision making problem, and we show that they achieve sublinear regret upper bounds. Moreover, we quantify the uncertainty of these estimators and leverage the uncertainty to implement optimistic and pessimistic algorithms for online and offline settings. Besides, when specialized to the linear and myopic setting, our algorithms are also computationally efficient. Our theoretical analysis features a novel performance-difference lemma which incorporates the error of quantal response model, which might be of independent interest.
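
For readers unfamiliar with the quantal response model, the follower's entropy-regularized policy optimization has a closed-form solution: a softmax over the follower's action values. In our notation (an assumption; the paper's formulation is more general), with regularization strength $\eta > 0$:

$$\pi_{\mathrm{f}}(a \mid s) \;=\; \operatorname*{argmax}_{\pi}\left\{ \mathbb{E}_{a \sim \pi}\!\left[Q_{\mathrm{f}}(s, a)\right] + \tfrac{1}{\eta}\,\mathcal{H}(\pi) \right\} \;=\; \frac{\exp\!\left(\eta\, Q_{\mathrm{f}}(s, a)\right)}{\sum_{a'} \exp\!\left(\eta\, Q_{\mathrm{f}}(s, a')\right)}$$

This is why the leader can infer the follower's model from observed actions alone: the follower's action frequencies expose $Q_{\mathrm{f}}$ up to the scale $\eta$, which is what the maximum likelihood estimation step exploits.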

Learning to simulate partially known spatio-temporal dynamics with trainable difference operators

  • paper_url: http://arxiv.org/abs/2307.14395
  • repo_url: None
  • paper_authors: Xiang Huang, Zhuoyuan Li, Hongsheng Liu, Zidong Wang, Hongye Zhou, Bin Dong, Bei Hua
  • for: Simulating spatio-temporal dynamics with neural networks has attracted much attention in recent years, but most existing methods are purely data-driven black-box models with limited accuracy and interpretability.
  • methods: A new hybrid architecture, PDE-Net++, combines trainable difference operators with a black-box model and explicitly embeds partial prior knowledge of the underlying PDEs; two variants of the difference layers are proposed: the trainable flipping difference layer (TFDL) and the trainable dynamic difference layer (TDDL).
  • results: Numerous numerical experiments show that PDE-Net++ has higher prediction accuracy and better extrapolation performance than black-box models.
    Abstract Recently, using neural networks to simulate spatio-temporal dynamics has received a lot of attention. However, most existing methods adopt pure data-driven black-box models, which have limited accuracy and interpretability. By combining trainable difference operators with black-box models, we propose a new hybrid architecture explicitly embedded with partial prior knowledge of the underlying PDEs named PDE-Net++. Furthermore, we introduce two distinct options called the trainable flipping difference layer (TFDL) and the trainable dynamic difference layer (TDDL) for the difference operators. Numerous numerical experiments have demonstrated that PDE-Net++ has superior prediction accuracy and better extrapolation performance than black-box models.
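
To convey the idea of a trainable difference operator, here is a hedged sketch: a convolution whose kernel is initialized to a finite-difference stencil and then fine-tuned. This shows the general mechanism only; the paper's TFDL/TDDL layers impose additional structure not reproduced here.

```python
import torch
import torch.nn as nn

class TrainableDiff(nn.Module):
    def __init__(self, dx: float = 1.0):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=3, padding=1, bias=False)
        # Initialize to the central difference for d/dx: (u[i+1] - u[i-1]) / (2 dx)
        stencil = torch.tensor([-1.0, 0.0, 1.0]) / (2 * dx)
        with torch.no_grad():
            self.conv.weight.copy_(stencil.view(1, 1, 3))

    def forward(self, u):                       # u: (batch, 1, nx)
        return self.conv(u)

# Sanity check on u(x) = x with grid spacing 0.1: derivative ~1 away from boundaries.
x = torch.linspace(0, 1, 11).view(1, 1, -1)
print(TrainableDiff(dx=0.1)(x)[0, 0, 1:-1])
```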

Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching

  • paper_url: http://arxiv.org/abs/2307.14071
  • repo_url: None
  • paper_authors: Junpeng Jing, Jiankun Li, Pengfei Xiong, Jiangyu Liu, Shuaicheng Liu, Yichen Guo, Xin Deng, Mai Xu, Lai Jiang, Leonid Sigal
  • for: To make stereo matching robust and reliable across different scenarios so that a single fixed model can be applied in the real world.
  • methods: A novel Uncertainty Guided Adaptive Correlation (UGAC) module uses variance-based uncertainty estimation to adaptively adjust the sampling area during the warping operation, and improves the traditional non-parametric warping with learnable, position-specific weights.
  • results: Without any retraining, the method achieves state-of-the-art performance on the ETH3D, KITTI, and Middlebury datasets with the same fixed model; a lightweight UGAC-based model with only 0.6M parameters, designed for real-time applications, also outperforms other methods on KITTI benchmarks.
    Abstract Correlation based stereo matching has achieved outstanding performance, which pursues cost volume between two feature maps. Unfortunately, current methods with a fixed model do not work uniformly well across various datasets, greatly limiting their real-world applicability. To tackle this issue, this paper proposes a new perspective to dynamically calculate correlation for robust stereo matching. A novel Uncertainty Guided Adaptive Correlation (UGAC) module is introduced to robustly adapt the same model for different scenarios. Specifically, a variance-based uncertainty estimation is employed to adaptively adjust the sampling area during warping operation. Additionally, we improve the traditional non-parametric warping with learnable parameters, such that the position-specific weights can be learned. We show that by empowering the recurrent network with the UGAC module, stereo matching can be exploited more robustly and effectively. Extensive experiments demonstrate that our method achieves state-of-the-art performance over the ETH3D, KITTI, and Middlebury datasets when employing the same fixed model over these datasets without any retraining procedure. To target real-time applications, we further design a lightweight model based on UGAC, which also outperforms other methods over KITTI benchmarks with only 0.6 M parameters.

Hypergraph Isomorphism Computation

  • paper_url: http://arxiv.org/abs/2307.14394
  • repo_url: None
  • paper_authors: Yifan Feng, Jiashu Han, Shihui Ying, Yue Gao
  • for: This paper aims to capture high-order structural information by solving the hypergraph isomorphism test problem, which cannot be addressed directly with graph isomorphism methods.
  • methods: The paper generalizes the Weisfeiler-Lehman test from graphs to hypergraphs, and builds on it a general hypergraph Weisfeiler-Lehman kernel framework with two instances: the Hypergraph Weisfeiler-Lehman Subtree Kernel and the Hypergraph Weisfeiler-Lehman Hyperedge Kernel.
  • results: On 12 hypergraph classification datasets, the proposed methods show significant improvements over other typical kernel-based methods, and run over 80 times faster than the second-best method when handling complex hypergraph structures.
    Abstract The isomorphism problem is a fundamental problem in network analysis, which involves capturing both low-order and high-order structural information. In terms of extracting low-order structural information, graph isomorphism algorithms analyze the structural equivalence to reduce the solver space dimension, which demonstrates its power in many applications, such as protein design, chemical pathways, and community detection. For the more commonly occurring high-order relationships in real-life scenarios, the problem of hypergraph isomorphism, which effectively captures these high-order structural relationships, cannot be straightforwardly addressed using graph isomorphism methods. Besides, the existing hypergraph kernel methods may suffer from high memory consumption or inaccurate sub-structure identification, thus yielding sub-optimal performance. In this paper, to address the abovementioned problems, we first propose the hypergraph Weisfeiler-Lehman test algorithm for the hypergraph isomorphism test problem by generalizing the Weisfeiler-Lehman test algorithm from graphs to hypergraphs. Secondly, based on the presented algorithm, we propose a general hypergraph Weisfeiler-Lehman kernel framework and implement two instances, which are Hypergraph Weisfeiler-Lehman Subtree Kernel and Hypergraph Weisfeiler-Lehman Hyperedge Kernel. In order to fulfill our research objectives, a comprehensive set of experiments was meticulously designed, including seven graph classification datasets and 12 hypergraph classification datasets. Results on hypergraph classification datasets show significant improvements compared to other typical kernel-based methods, which demonstrates the effectiveness of the proposed methods. In our evaluation, we found that our proposed methods outperform the second-best method in terms of runtime, running over 80 times faster when handling complex hypergraph structures.
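
A minimal sketch of one Weisfeiler-Lehman-style refinement round generalized to hypergraphs: each hyperedge is colored by the multiset of its members' colors, and each node's new color hashes its old color together with the multiset of colors of its incident hyperedges. This follows the spirit of the paper's algorithm, not its exact formulation.

```python
from collections import defaultdict

def hwl_round(node_colors, hyperedges):
    """One refinement round. hyperedges: list of frozensets of node ids."""
    # Color each hyperedge by the sorted multiset of its members' colors.
    edge_colors = [hash(tuple(sorted(node_colors[v] for v in e))) for e in hyperedges]
    # Collect incident-edge colors per node, then refine node colors.
    incident = defaultdict(list)
    for e, c in zip(hyperedges, edge_colors):
        for v in e:
            incident[v].append(c)
    return {v: hash((node_colors[v], tuple(sorted(incident[v])))) for v in node_colors}

H = [frozenset({0, 1, 2}), frozenset({2, 3})]
colors = {v: 0 for v in range(4)}           # uniform initial coloring
for _ in range(2):                          # two refinement rounds
    colors = hwl_round(colors, H)
print(colors)                               # nodes 0 and 1 remain color-equivalent
```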

Acceptable risks in Europe’s proposed AI Act: Reasonableness and other principles for deciding how much risk management is enough

  • paper_url: http://arxiv.org/abs/2308.02047
  • repo_url: None
  • paper_authors: Henry Fraser, Jose-Miguel Bello y Villarino
  • for: The paper evaluates the European Commission's proposed AI Act's approach to risk management and risk acceptability for high-risk AI systems that pose risks to fundamental rights and safety.
  • methods: The paper critiques the Act's provisions on risk acceptability, arguing that, especially if interpreted narrowly, they are unworkable and promote neither a proportionate regulatory burden nor trustworthiness.
  • results: The paper argues that the European Parliament's recent draft amendments to the risk management provisions, which introduce "reasonableness" and cost-benefit analysis, are more workable and better balance the goals of proportionality and trustworthiness; it also stresses that risk acceptability judgments need a firm foundation of civic legitimacy, including detailed guidance or involvement from regulators and meaningful input from affected stakeholders.
    Abstract This paper critically evaluates the European Commission's proposed AI Act's approach to risk management and risk acceptability for high-risk AI systems that pose risks to fundamental rights and safety. The Act aims to promote "trustworthy" AI with a proportionate regulatory burden. Its provisions on risk acceptability require residual risks from high-risk systems to be reduced or eliminated "as far as possible", having regard to the "state of the art". This criterion, especially if interpreted narrowly, is unworkable and promotes neither proportionate regulatory burden, nor trustworthiness. By contrast the Parliament's most recent draft amendments to the risk management provisions introduce "reasonableness", cost-benefit analysis, and are more transparent about the value-laden and contextual nature of risk acceptability judgements. This paper argues that the Parliament's approach is more workable, and better balances the goals of proportionality and trustworthiness. It explains what reasonableness in risk acceptability judgments would entail, drawing on principles from negligence law and European medical devices regulation. And it contends that the approach to risk acceptability judgments need a firm foundation of civic legitimacy: including detailed guidance or involvement from regulators, and meaningful input from affected stakeholders.

Open Image Content Disarm And Reconstruction

  • paper_url: http://arxiv.org/abs/2307.14057
  • repo_url: None
  • paper_authors: Eli Belkind, Ran Dubin, Amit Dvir
  • for: This work proposes an Image Content Disarm and Reconstruction (ICDR) system to prevent malware from using images to hide malicious scripts or sensitive data.
  • methods: The system takes a zero-trust approach, analyzing and sanitizing image files to disable or remove hidden malware and embedded data.
  • results: Experiments show that the ICDR system removes potential malware and hidden data from image files while maintaining high image quality and file usability.
    Abstract With the advance in malware technology, attackers create new ways to hide their malicious code from antivirus services. One way to obfuscate an attack is to use common files as cover to hide the malicious scripts, so the malware will look like a legitimate file. Although cutting-edge Artificial Intelligence and content signature exist, evasive malware successfully bypasses next-generation malware detection using advanced methods like steganography. Some of the files commonly used to hide malware are image files (e.g., JPEG). In addition, some malware use steganography to hide malicious scripts or sensitive data in images. Steganography in images is difficult to detect even with specialized tools. Image-based attacks try to attack the user's device using malicious payloads or utilize image steganography to hide sensitive data inside legitimate images and leak it outside the user's device. Therefore in this paper, we present a novel Image Content Disarm and Reconstruction (ICDR). Our ICDR system removes potential malware, with a zero trust approach, while maintaining high image quality and file usability. By extracting the image data, removing it from the rest of the file, and manipulating the image pixels, it is possible to disable or remove the hidden malware inside the file.
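
The following is an illustrative sketch of the core disarm step: decode an image to raw pixels and re-encode it into a fresh file, discarding metadata, appended payloads, and any structure outside the pixel array. A real CDR pipeline (including the paper's ICDR) goes further, for example by manipulating pixels to break steganography; the file paths below are placeholders.

```python
from PIL import Image

def disarm_image(src_path: str, dst_path: str) -> None:
    with Image.open(src_path) as img:
        pixels = img.convert("RGB")        # keep only the decoded pixel data
    clean = Image.new("RGB", pixels.size)  # fresh image object, no metadata carried over
    clean.putdata(list(pixels.getdata()))
    clean.save(dst_path, format="PNG")     # re-encode the file from scratch

disarm_image("suspicious.jpg", "clean.png")
```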

One-Nearest Neighborhood Guides Inlier Estimation for Unsupervised Point Cloud Registration

  • paper_url: http://arxiv.org/abs/2307.14019
  • repo_url: None
  • paper_authors: Yongzhe Yuan, Yue Wu, Maoguo Gong, Qiguang Miao, A. K. Qin
  • for: 这篇论文是为了提高无监督点云注册方法的精度,尤其是在部分重叠的场景下,而设计的。
  • methods: 该论文提出了一种有效的无监督点云注册方法,通过捕捉源点云和其相对参照点云的几何结构一致性来提高准确率。
  • results: 该论文通过实验证明了该方法的效iveness,并且在Synthetic和实际数据集上都达到了优秀的结果。
    Abstract The precision of unsupervised point cloud registration methods is typically limited by the lack of reliable inlier estimation and self-supervised signal, especially in partially overlapping scenarios. In this paper, we propose an effective inlier estimation method for unsupervised point cloud registration by capturing geometric structure consistency between the source point cloud and its corresponding reference point cloud copy. Specifically, to obtain a high quality reference point cloud copy, an One-Nearest Neighborhood (1-NN) point cloud is generated by input point cloud. This facilitates matching map construction and allows for integrating dual neighborhood matching scores of 1-NN point cloud and input point cloud to improve matching confidence. Benefiting from the high quality reference copy, we argue that the neighborhood graph formed by inlier and its neighborhood should have consistency between source point cloud and its corresponding reference copy. Based on this observation, we construct transformation-invariant geometric structure representations and capture geometric structure consistency to score the inlier confidence for estimated correspondences between source point cloud and its reference copy. This strategy can simultaneously provide the reliable self-supervised signal for model optimization. Finally, we further calculate transformation estimation by the weighted SVD algorithm with the estimated correspondences and corresponding inlier confidence. We train the proposed model in an unsupervised manner, and extensive experiments on synthetic and real-world datasets illustrate the effectiveness of the proposed method.
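
The final transformation step is standard enough to sketch: the weighted SVD solution (weighted Kabsch/Procrustes) for the rigid transform, given correspondences and inlier confidences. This is the generic formulation; the paper's pipeline supplies the correspondences and weights.

```python
import numpy as np

def weighted_kabsch(src, dst, w):
    """Rigid (R, t) minimizing sum_i w_i * ||R @ src_i + t - dst_i||^2."""
    w = w / w.sum()
    mu_s, mu_d = w @ src, w @ dst                    # weighted centroids
    H = (src - mu_s).T @ np.diag(w) @ (dst - mu_d)   # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))           # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, mu_d - R @ mu_s

rng = np.random.default_rng(0)
src = rng.normal(size=(100, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
dst = src @ R_true.T + np.array([1.0, -2.0, 0.5])
R, t = weighted_kabsch(src, dst, np.ones(100))
print(np.allclose(R, R_true, atol=1e-6))             # True
```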

ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution

  • paper_url: http://arxiv.org/abs/2307.14010
  • repo_url: None
  • paper_authors: Mingjin Zhang, Chi Zhang, Qiming Zhang, Jie Guo, Xinbo Gao, Jing Zhang
  • for: To restore a high-resolution hyperspectral image from a low-resolution observation.
  • methods: A Transformer network with an iterative refining structure and a new spectral-friendly similarity metric, the spectral correlation coefficient (SCC), which replaces the original attention matrix and enables an efficient kernelizable self-attention (ESSA) with linear complexity.
  • results: Generates more natural high-resolution images and achieves excellent visual and quantitative results without pretraining on large-scale datasets.
    Abstract Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation. However, the prevailing CNN-based approaches have shown limitations in building long-range dependencies and capturing interaction information between spectral features. This results in inadequate utilization of spectral information and artifacts after upsampling. To address this issue, we propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure. Specifically, we first introduce a robust and spectral-friendly similarity metric, \ie, the spectral correlation coefficient of the spectrum (SCC), to replace the original attention matrix and incorporates inductive biases into the model to facilitate training. Built upon it, we further utilize the kernelizable attention technique with theoretical support to form a novel efficient SCC-kernel-based self-attention (ESSA) and reduce attention computation to linear complexity. ESSA enlarges the receptive field for features after upsampling without bringing much computation and allows the model to effectively utilize spatial-spectral information from different scales, resulting in the generation of more natural high-resolution images. Without the need for pretraining on large-scale datasets, our experiments demonstrate ESSA's effectiveness in both visual quality and quantitative results.
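
As a hedged illustration of the similarity metric, the sketch below computes a spectral correlation coefficient between two per-pixel spectra as a Pearson correlation along the band axis. We assume this standard definition; the paper builds its kernelized attention on top of such a measure.

```python
import numpy as np

def scc(a: np.ndarray, b: np.ndarray) -> float:
    """Correlation of two spectra (1D arrays over the band axis)."""
    a, b = a - a.mean(), b - b.mean()
    return float((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

bands = np.linspace(0, 1, 31)
print(scc(np.sin(bands), 2 * np.sin(bands) + 0.5))  # ~1.0: same spectral shape
```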

DPBERT: Efficient Inference for BERT based on Dynamic Planning

  • paper_url: http://arxiv.org/abs/2308.00108
  • repo_url: None
  • paper_authors: Weixin Wu, Hankz Hankui Zhuo
  • for: This work aims to make BERT practical on mobile devices with limited computing power, addressing the weakness of existing input-adaptive inference methods, which fail to take full advantage of BERT's structure.
  • methods: A novel fine-tuning strategy, Dynamic Planning in BERT, accelerates inference by selecting a subsequence of the backbone's transformer layers as the computational path for each input sample; a planning module added to the original BERT model decides whether each layer is included or bypassed.
  • results: Experiments on the GLUE benchmark show that the method reduces latency to 75% while maintaining 98% accuracy, a better accuracy-speed trade-off than state-of-the-art input-adaptive methods.
    Abstract Large-scale pre-trained language models such as BERT have contributed significantly to the development of NLP. However, those models require large computational resources, making it difficult to be applied to mobile devices where computing power is limited. In this paper we aim to address the weakness of existing input-adaptive inference methods which fail to take full advantage of the structure of BERT. We propose Dynamic Planning in BERT, a novel fine-tuning strategy that can accelerate the inference process of BERT through selecting a subsequence of transformer layers list of backbone as a computational path for an input sample. To do this, our approach adds a planning module to the original BERT model to determine whether a layer is included or bypassed during inference. Experimental results on the GLUE benchmark exhibit that our method reduces latency to 75\% while maintaining 98\% accuracy, yielding a better accuracy-speed trade-off compared to state-of-the-art input-adaptive methods.
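
The sketch below conveys the layer-skipping idea in miniature: a small gate scores each transformer layer for the current input, and the layer is bypassed when the gate declines it. DPBERT's actual planning module and training procedure are more involved; all names and sizes here are illustrative.

```python
import torch
import torch.nn as nn

class GatedEncoder(nn.Module):
    def __init__(self, n_layers: int = 12, dim: int = 256):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
            for _ in range(n_layers)
        )
        self.gates = nn.ModuleList(nn.Linear(dim, 1) for _ in range(n_layers))

    def forward(self, x):                                # x: (batch, seq, dim)
        used = 0
        for layer, gate in zip(self.layers, self.gates):
            keep = torch.sigmoid(gate(x.mean(dim=1))) > 0.5  # per-input decision
            if keep.any():                               # simplistic batch-level skip
                x = layer(x)
                used += 1
        return x, used

x = torch.randn(2, 16, 256)
out, n_used = GatedEncoder()(x)
print(out.shape, "layers used:", n_used)
```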

How User Language Affects Conflict Fatality Estimates in ChatGPT

  • paper_url: http://arxiv.org/abs/2308.00072
  • repo_url: None
  • paper_authors: Daniel Kazenwadel, Christoph V. Steinert
  • for: This study investigates whether OpenAI's ChatGPT language model reproduces biases present in its language-specific training data.
  • methods: An automated GPT-3.5 query procedure asks about casualties of specific airstrikes, in Hebrew and Arabic for the Israeli-Palestinian conflict and in Turkish and Kurdish for the Turkish-Kurdish conflict.
  • results: GPT-3.5 provides 27±11 percent lower fatality estimates when queried in the language of the attacker than in the language of the targeted group; evasive answers denying the attacks further increase the discrepancy. This language bias could amplify existing media biases and information bubbles, ultimately reinforcing conflicts.
    Abstract OpenAI's ChatGPT language model has gained popularity as a powerful tool for complex problem-solving and information retrieval. However, concerns arise about the reproduction of biases present in the language-specific training data. In this study, we address this issue in the context of the Israeli-Palestinian and Turkish-Kurdish conflicts. Using GPT-3.5, we employed an automated query procedure to inquire about casualties in specific airstrikes, in both Hebrew and Arabic for the former conflict and Turkish and Kurdish for the latter. Our analysis reveals that GPT-3.5 provides 27$\pm$11 percent lower fatality estimates when queried in the language of the attacker than in the language of the targeted group. Evasive answers denying the existence of such attacks further increase the discrepancy, creating a novel bias mechanism not present in regular search engines. This language bias has the potential to amplify existing media biases and contribute to information bubbles, ultimately reinforcing conflicts.

Dual-Space Attacks against Random-Walk-based Anomaly Detection

  • paper_url: http://arxiv.org/abs/2307.14387
  • repo_url: https://github.com/yuni-lai/dualattackrw
  • paper_authors: Yuni Lai, Marcin Waniek, Yulin Zhu, Liying Li, Jingwen Wu, Tomasz P. Michalak, Talal Rahwan, Kai Zhou
  • for: This paper investigates the two attack surfaces of Random-Walk-based Anomaly Detection (RWAD): graph-space attacks and feature-space attacks.
  • methods: Practical dual-space attacks: after proving that attacking RWAD is NP-hard, the graph-space attack is formulated as a bi-level optimization problem and solved either by alternative iteration (alterI-attack) or by using the closed-form solution of the random walk model (cf-attack); the graph-space results then guide more powerful feature-space attacks (graph-guided attacks).
  • results: Experiments show the proposed attacks effectively let target nodes evade RWAD with a limited attack budget; in black-box transfer experiments, the feature-space attack significantly decreases the anomaly scores of target nodes.
    Abstract Random Walks-based Anomaly Detection (RWAD) is commonly used to identify anomalous patterns in various applications. An intriguing characteristic of RWAD is that the input graph can either be pre-existing or constructed from raw features. Consequently, there are two potential attack surfaces against RWAD: graph-space attacks and feature-space attacks. In this paper, we explore this vulnerability by designing practical dual-space attacks, investigating the interplay between graph-space and feature-space attacks. To this end, we conduct a thorough complexity analysis, proving that attacking RWAD is NP-hard. Then, we proceed to formulate the graph-space attack as a bi-level optimization problem and propose two strategies to solve it: alternative iteration (alterI-attack) or utilizing the closed-form solution of the random walk model (cf-attack). Finally, we utilize the results from the graph-space attacks as guidance to design more powerful feature-space attacks (i.e., graph-guided attacks). Comprehensive experiments demonstrate that our proposed attacks are effective in enabling the target nodes to evade RWAD with a limited attack budget. In addition, we conduct transfer attack experiments in a black-box setting, which show that our feature attack significantly decreases the anomaly scores of target nodes. Our study opens the door to studying the dual-space attack against graph anomaly detection in which the graph space relies on the feature space.
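For orientation, here is a generic random-walk-with-restart scorer of the kind RWAD methods build on; this is background, not the paper's detector or its attacks. Nodes rarely visited by the walk receive low connectivity scores and can be flagged as anomalous.

```python
import numpy as np

def rwr_scores(adjacency, restart=0.15, iters=100):
    """Visit probabilities of a random walk with restart on a graph given as
    an adjacency matrix; low scores suggest weakly connected (anomalous) nodes."""
    A = np.asarray(adjacency, dtype=float)
    P = A / A.sum(axis=1, keepdims=True)        # row-stochastic transitions
    n = len(A)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = restart / n + (1.0 - restart) * r @ P
    return r

A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]])
print(rwr_scores(A))  # the degree-1 node receives the lowest score
```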

Controlling the Latent Space of GANs through Reinforcement Learning: A Case Study on Task-based Image-to-Image Translation

  • paper_url: http://arxiv.org/abs/2307.13978
  • repo_url: None
  • paper_authors: Mahyar Abbasian, Taha Rajabzadeh, Ahmadreza Moradipari, Seyed Amir Hossein Aqajari, Hongsheng Lu, Amir Rahmani
  • for: This paper aims to address the challenge of exerting control over the generation process of Generative Adversarial Networks (GANs) by integrating a reinforcement learning (RL) agent with a latent-space GAN (l-GAN).
  • methods: The proposed methodology utilizes an actor-critic RL agent with a meticulously designed reward policy to acquire proficiency in navigating the latent space of the l-GAN and generating outputs based on specified tasks.
  • results: The authors conducted a series of experiments employing the MNIST dataset, including arithmetic addition as an illustrative task, and the outcomes serve to validate their methodology.
    Abstract Generative Adversarial Networks (GAN) have emerged as a formidable AI tool to generate realistic outputs based on training datasets. However, the challenge of exerting control over the generation process of GANs remains a significant hurdle. In this paper, we propose a novel methodology to address this issue by integrating a reinforcement learning (RL) agent with a latent-space GAN (l-GAN), thereby facilitating the generation of desired outputs. More specifically, we have developed an actor-critic RL agent with a meticulously designed reward policy, enabling it to acquire proficiency in navigating the latent space of the l-GAN and generating outputs based on specified tasks. To substantiate the efficacy of our approach, we have conducted a series of experiments employing the MNIST dataset, including arithmetic addition as an illustrative task. The outcomes of these experiments serve to validate our methodology. Our pioneering integration of an RL agent with a GAN model represents a novel advancement, holding great potential for enhancing generative networks in the future.
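A bare-bones sketch of the environment-step view of latent-space control (the displacement-style action and all names are assumptions; the actor-critic training loop is omitted): the agent's action perturbs the latent code, the generator renders it, and a task-specific reward scores the output.

```python
import torch

def latent_step(z, action, generator, reward_fn):
    """One step in the latent-space MDP: state = latent code z, action = a
    small displacement, reward = task score of the generated image."""
    z_next = z + action
    return z_next, reward_fn(generator(z_next))

# Toy usage with stand-in generator and reward:
g = lambda z: z.tanh()               # placeholder for the frozen l-GAN generator
r = lambda img: -img.abs().mean()    # placeholder task reward
z0 = torch.randn(1, 64)
z1, reward = latent_step(z0, 0.1 * torch.randn_like(z0), g, r)
```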

Understanding Deep Neural Networks via Linear Separability of Hidden Layers

  • paper_url: http://arxiv.org/abs/2307.13962
  • repo_url: None
  • paper_authors: Chao Zhang, Xinyu Chen, Wensheng Li, Lixue Liu, Wei Wu, Dacheng Tao
  • for: This study uses linear separability to characterize deep neural networks.
  • methods: The authors propose Minkowski difference based linear separability measures (MD-LSMs) to evaluate the degree of linear separability between two point sets, show that weight updates which increase the linear separability of hidden layer outputs also yield better training performance (and vice versa), and study how the activation function and network size (width and depth) affect hidden-layer separability.
  • results: Numerical experiments validate these findings on popular deep networks, including MLP, CNN, DBN, ResNet, VGGNet, AlexNet, ViT, and GoogLeNet.
    Abstract In this paper, we measure the linear separability of hidden layer outputs to study the characteristics of deep neural networks. In particular, we first propose Minkowski difference based linear separability measures (MD-LSMs) to evaluate the linear separability degree of two points sets. Then, we demonstrate that there is a synchronicity between the linear separability degree of hidden layer outputs and the network training performance, i.e., if the updated weights can enhance the linear separability degree of hidden layer outputs, the updated network will achieve a better training performance, and vice versa. Moreover, we study the effect of activation function and network size (including width and depth) on the linear separability of hidden layers. Finally, we conduct the numerical experiments to validate our findings on some popular deep networks including multilayer perceptron (MLP), convolutional neural network (CNN), deep belief network (DBN), ResNet, VGGNet, AlexNet, vision transformer (ViT) and GoogLeNet.
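The paper's MD-LSMs are defined via Minkowski differences; as a rough stand-in (not the paper's measure), the degree of linear separability of a hidden layer can be probed by the training accuracy of a linear classifier on its activations:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def linear_separability_proxy(h_class0, h_class1):
    """Fit a linear probe on two classes of hidden activations; a training
    accuracy of 1.0 means the two point sets are linearly separable."""
    X = np.vstack([h_class0, h_class1])
    y = np.r_[np.zeros(len(h_class0)), np.ones(len(h_class1))]
    return LogisticRegression(max_iter=1000).fit(X, y).score(X, y)

rng = np.random.default_rng(0)
print(linear_separability_proxy(rng.normal(0, 1, (50, 8)),
                                rng.normal(3, 1, (50, 8))))  # ~1.0
```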

Flexible Differentially Private Vertical Federated Learning with Adaptive Feature Embeddings

  • paper_url: http://arxiv.org/abs/2308.02362
  • repo_url: None
  • paper_authors: Yuxi Mi, Hongquan Liu, Yewei Xia, Yiheng Sun, Jihong Guan, Shuigeng Zhou
  • for: This work addresses the imperfect privacy protection of vertical federated learning (VFL), where shared feature embeddings may leak sensitive information under privacy attacks.
  • methods: Under differential privacy (DP), a flexible and generic approach decouples the privacy and task-utility goals and addresses them successively: norm clipping on shared feature embeddings yields a rigorous privacy guarantee, and adaptive adjustments to the scale and distribution of the embeddings then recover task utility without compromising the DP mechanism.
  • results: Extensive experiments substantiate that the proposed VFL-AFE framework defends against privacy attacks while retaining favorable task utility.
    Abstract The emergence of vertical federated learning (VFL) has stimulated concerns about the imperfection in privacy protection, as shared feature embeddings may reveal sensitive information under privacy attacks. This paper studies the delicate equilibrium between data privacy and task utility goals of VFL under differential privacy (DP). To address the generality issue of prior arts, this paper advocates a flexible and generic approach that decouples the two goals and addresses them successively. Specifically, we initially derive a rigorous privacy guarantee by applying norm clipping on shared feature embeddings, which is applicable across various datasets and models. Subsequently, we demonstrate that task utility can be optimized via adaptive adjustments on the scale and distribution of feature embeddings in an accuracy-appreciative way, without compromising established DP mechanisms. We concretize our observation into the proposed VFL-AFE framework, which exhibits effectiveness against privacy attacks and the capacity to retain favorable task utility, as substantiated by extensive experiments.
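A minimal sketch of the norm-clipping step the abstract describes, with Gaussian-mechanism noise added to the clipped embeddings; the clipping bound, the noise calibration, and the subsequent adaptive rescaling are assumptions not specified here.

```python
import torch

def clip_and_perturb(embeddings, clip_norm=1.0, sigma=0.5):
    """Clip each shared feature embedding to L2 norm <= clip_norm, then add
    Gaussian noise scaled by the clipping bound (standard Gaussian mechanism)."""
    norms = embeddings.norm(dim=1, keepdim=True).clamp(min=1e-12)
    clipped = embeddings * (clip_norm / norms).clamp(max=1.0)
    return clipped + torch.randn_like(clipped) * (sigma * clip_norm)

released = clip_and_perturb(torch.randn(32, 128))  # what a party shares upward
```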

How Does Diffusion Influence Pretrained Language Models on Out-of-Distribution Data?

  • paper_url: http://arxiv.org/abs/2307.13949
  • repo_url: https://github.com/maybelizzy/diffusion_ood_robustness
  • paper_authors: Huazheng Wang, Daixuan Cheng, Haifeng Sun, Jingyu Wang, Qi Qi, Jianxin Liao, Jing Wang, Cong Liu
  • for: This study examines how diffusion influences pretrained language models (PLMs) on out-of-distribution (OOD) data.
  • methods: The authors analyze the forward diffusion (noising) and reverse denoising processes, measuring reconstruction loss on OOD data across different training parameters and data statistics on eight datasets.
  • results: Finetuning PLMs with diffusion degrades their ability to reconstruct OOD data, yet diffusion models effectively detect OOD samples, achieving state-of-the-art performance on most datasets with absolute accuracy improvements of up to 18%. Overall, these results indicate that diffusion reduces the OOD robustness of PLMs.
    Abstract Transformer-based pretrained language models (PLMs) have achieved great success in modern NLP. An important advantage of PLMs is good out-of-distribution (OOD) robustness. Recently, diffusion models have attracted a lot of work to apply diffusion to PLMs. It remains under-explored how diffusion influences PLMs on OOD data. The core of diffusion models is a forward diffusion process which gradually applies Gaussian noise to inputs, and a reverse denoising process which removes noise. The noised input reconstruction is a fundamental ability of diffusion models. We directly analyze OOD robustness by measuring the reconstruction loss, including testing the abilities to reconstruct OOD data, and to detect OOD samples. Experiments are conducted by analyzing different training parameters and data statistical features on eight datasets. It shows that finetuning PLMs with diffusion degrades the reconstruction ability on OOD data. The comparison also shows that diffusion models can effectively detect OOD samples, achieving state-of-the-art performance in most of the datasets with an absolute accuracy improvement up to 18%. These results indicate that diffusion reduces OOD robustness of PLMs.
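A schematic of the reconstruction-based scoring the study relies on (the ε-prediction parameterization, shapes, and toy schedule are assumptions): noise an input representation with the forward process, ask the denoiser to recover the noise, and use the prediction error, which is larger for OOD samples.

```python
import torch

def forward_diffuse(x0, t, alphas_cumprod):
    """q(x_t | x_0): x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    abar = alphas_cumprod[t].view(-1, 1)
    eps = torch.randn_like(x0)
    return abar.sqrt() * x0 + (1 - abar).sqrt() * eps, eps

def ood_score(x0, denoiser, t, alphas_cumprod):
    """Per-sample noise-prediction error; higher scores suggest OOD inputs."""
    xt, eps = forward_diffuse(x0, t, alphas_cumprod)
    return ((denoiser(xt, t) - eps) ** 2).mean(dim=1)

abar = torch.linspace(0.999, 0.1, steps=1000)       # toy noise schedule
x = torch.randn(4, 32)                              # e.g. sentence embeddings
t = torch.full((4,), 500, dtype=torch.long)
scores = ood_score(x, lambda xt, t: torch.zeros_like(xt), t, abar)  # dummy denoiser
```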

Learning-based Control for PMSM Using Distributed Gaussian Processes with Optimal Aggregation Strategy

  • paper_url: http://arxiv.org/abs/2307.13945
  • repo_url: None
  • paper_authors: Zhenxiao Yin, Xiaobing Dai, Zewen Yang, Yang Shen, Georges Hattab, Hang Zhao
  • for: This paper proposes a control-aware aggregation strategy for distributed Gaussian process regression (GPR), grounded in Lyapunov stability theory, for precise control of permanent magnet synchronous motors (PMSMs).
  • methods: The strategy models the unknown part of the system with distributed GPR and exclusively leverages the posterior mean, avoiding the computationally intensive posterior variance.
  • results: Simulations demonstrate the strategy's effectiveness, and its straightforward calculation lends itself to seamless implementation in high-frequency PMSM control.
    Abstract The growing demand for accurate control in varying and unknown environments has sparked a corresponding increase in the requirements for power supply components, including permanent magnet synchronous motors (PMSMs). To infer the unknown part of the system, machine learning techniques are widely employed, especially Gaussian process regression (GPR) due to its flexibility of continuous system modeling and its guaranteed performance. For practical implementation, distributed GPR is adopted to alleviate the high computational complexity. However, the study of distributed GPR from a control perspective remains an open problem. In this paper, a control-aware optimal aggregation strategy of distributed GPR for PMSMs is proposed based on the Lyapunov stability theory. This strategy exclusively leverages the posterior mean, thereby obviating the need for computationally intensive calculations associated with posterior variance in alternative approaches. Moreover, the straightforward calculation process of our proposed strategy lends itself to seamless implementation in high-frequency PMSM control. The effectiveness of the proposed strategy is demonstrated in the simulations.
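A sketch of posterior-mean-only aggregation with scikit-learn GPs; the equal weights stand in for the paper's optimal, Lyapunov-derived aggregation, which is not reproduced here. Each local model trains on its own data shard, and queries average the local posterior means, so no posterior variances are computed.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(200)

shards = np.array_split(rng.permutation(200), 4)      # disjoint local datasets
models = [GaussianProcessRegressor(alpha=1e-2).fit(X[s], y[s]) for s in shards]

def aggregated_mean(x_query, weights=None):
    """Aggregate local posterior means only (no variance computation)."""
    means = np.stack([m.predict(x_query) for m in models])
    w = np.full(len(models), 1.0 / len(models)) if weights is None else weights
    return w @ means

print(aggregated_mean(np.array([[0.5]])))             # ~ sin(0.5)
```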

Entropy Neural Estimation for Graph Contrastive Learning

  • paper_url: http://arxiv.org/abs/2307.13944
  • repo_url: https://github.com/kunzhan/M-ILBO
  • paper_authors: Yixuan Ma, Xiaolin Zhang, Peng Zhang, Kun Zhan
  • for: This paper proposes a contrastive learning method for graphs, aimed at extracting distinguishable high-level representations of nodes.
  • methods: The dataset's entropy is approximated by maximizing a lower bound of the mutual information across views of the graph, estimated by a neural network; a simple yet effective subset sampling strategy builds the views, and positive and negative pairs for the contrastive loss are selected using cross-view similarity scores.
  • results: With an additional cross-view consistency constraint on the learned representations, the approach achieves competitive performance against state-of-the-art methods on seven graph benchmarks.
    Abstract Contrastive learning on graphs aims at extracting distinguishable high-level representations of nodes. In this paper, we theoretically illustrate that the entropy of a dataset can be approximated by maximizing the lower bound of the mutual information across different views of a graph, \ie, entropy is estimated by a neural network. Based on this finding, we propose a simple yet effective subset sampling strategy to contrast pairwise representations between views of a dataset. In particular, we randomly sample nodes and edges from a given graph to build the input subset for a view. Two views are fed into a parameter-shared Siamese network to extract the high-dimensional embeddings and estimate the information entropy of the entire graph. For the learning process, we propose to optimize the network using two objectives, simultaneously. Concretely, the input of the contrastive loss function consists of positive and negative pairs. Our selection strategy of pairs is different from previous works and we present a novel strategy to enhance the representation ability of the graph encoder by selecting nodes based on cross-view similarities. We enrich the diversity of the positive and negative pairs by selecting highly similar samples and totally different data with the guidance of cross-view similarity scores, respectively. We also introduce a cross-view consistency constraint on the representations generated from the different views. This objective guarantees the learned representations are consistent across views from the perspective of the entire graph. We conduct extensive experiments on seven graph benchmarks, and the proposed approach achieves competitive performance compared to the current state-of-the-art methods. The source code will be publicly released once this paper is accepted.
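A tiny sketch of the subset-sampling step that builds one view (sampling fractions are illustrative): randomly keep a subset of nodes, then a subset of the edges among them; two such views feed the parameter-shared Siamese encoder.

```python
import random

def sample_view(nodes, edges, node_frac=0.8, edge_frac=0.8, seed=None):
    """Build one graph view by sampling nodes, then edges among kept nodes."""
    rnd = random.Random(seed)
    kept = set(rnd.sample(sorted(nodes), int(node_frac * len(nodes))))
    candidates = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, rnd.sample(candidates, int(edge_frac * len(candidates)))

nodes, edges = range(10), [(i, (i + 1) % 10) for i in range(10)]
view_a = sample_view(nodes, edges, seed=1)
view_b = sample_view(nodes, edges, seed=2)
```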

Stability of Multi-Agent Learning: Convergence in Network Games with Many Players

  • paper_url: http://arxiv.org/abs/2307.13922
  • repo_url: None
  • paper_authors: Aamal Hussain, Dan Leonte, Francesco Belardinelli, Georgios Piliouras
  • for: To study the complex dynamics of multi-agent learning in many-player games.
  • methods: The authors analyze Q-learning dynamics and derive a sufficient condition for convergence to a unique equilibrium in any network game, a condition that depends on the pairwise interactions and the network structure.
  • results: The condition is explicitly independent of the total number of agents: under suitable network conditions, stable learning dynamics can be achieved with an arbitrary number of agents, as evaluated on representative network games.
    Abstract The behaviour of multi-agent learning in many player games has been shown to display complex dynamics outside of restrictive examples such as network zero-sum games. In addition, it has been shown that convergent behaviour is less likely to occur as the number of players increase. To make progress in resolving this problem, we study Q-Learning dynamics and determine a sufficient condition for the dynamics to converge to a unique equilibrium in any network game. We find that this condition depends on the nature of pairwise interactions and on the network structure, but is explicitly independent of the total number of agents in the game. We evaluate this result on a number of representative network games and show that, under suitable network conditions, stable learning dynamics can be achieved with an arbitrary number of agents.

HyperFed: Hyperbolic Prototypes Exploration with Consistent Aggregation for Non-IID Data in Federated Learning

  • paper_url: http://arxiv.org/abs/2307.14384
  • repo_url: None
  • paper_authors: Xinting Liao, Weiming Liu, Chaochao Chen, Pengyang Zhou, Huabin Zhu, Yanchao Tan, Jun Wang, Yue Qi
  • for: To improve federated learning (FL) performance on non-identically and independently distributed (non-IID) client data.
  • methods: Hyperbolic prototype Tammes initialization (HPTI), hyperbolic prototype learning (HPL), and consistent aggregation (CA).
  • results: Extensive studies on four datasets show that HyperFed effectively improves FL performance under the non-IID setting.
    Abstract Federated learning (FL) collaboratively models user data in a decentralized way. However, in the real world, non-identical and independent data distributions (non-IID) among clients hinder the performance of FL due to three issues, i.e., (1) the class statistics shifting, (2) the insufficient hierarchical information utilization, and (3) the inconsistency in aggregating clients. To address the above issues, we propose HyperFed which contains three main modules, i.e., hyperbolic prototype Tammes initialization (HPTI), hyperbolic prototype learning (HPL), and consistent aggregation (CA). Firstly, HPTI in the server constructs uniformly distributed and fixed class prototypes, and shares them with clients to match class statistics, further guiding consistent feature representation for local clients. Secondly, HPL in each client captures the hierarchical information in local data with the supervision of shared class prototypes in the hyperbolic model space. Additionally, CA in the server mitigates the impact of the inconsistent deviations from clients to server. Extensive studies of four datasets prove that HyperFed is effective in enhancing the performance of FL under the non-IID set.

Embedding Democratic Values into Social Media AIs via Societal Objective Functions

  • paper_url: http://arxiv.org/abs/2307.13912
  • repo_url: None
  • paper_authors: Chenyan Jia, Michelle S. Lam, Minh Chau Mai, Jeff Hancock, Michael S. Bernstein
  • for: This work develops AI systems grounded in social-science theory and methods to mitigate partisan animosity on social media.
  • methods: Established, vetted social-science constructs are translated into AI objective functions, termed societal objective functions, by converting survey instruments and qualitative codebooks into detailed prompts for large language models.
  • results: The resulting model scores how strongly a social media post promotes anti-democratic attitudes. In Study 1, feeds reranked with manually annotated scores (alpha=.895) reduced participants' partisan animosity (removal d=.20, downranking d=.25) without compromising experience or engagement; in Study 2, the democratic attitude model agreed strongly with the manual labels (rho=.75); and in Study 3, downranking with the model replicated the effect (d=.25).
    Abstract Can we design artificial intelligence (AI) systems that rank our social media feeds to consider democratic values such as mitigating partisan animosity as part of their objective functions? We introduce a method for translating established, vetted social scientific constructs into AI objective functions, which we term societal objective functions, and demonstrate the method with application to the political science construct of anti-democratic attitudes. Traditionally, we have lacked observable outcomes to use to train such models, however, the social sciences have developed survey instruments and qualitative codebooks for these constructs, and their precision facilitates translation into detailed prompts for large language models. We apply this method to create a democratic attitude model that estimates the extent to which a social media post promotes anti-democratic attitudes, and test this democratic attitude model across three studies. In Study 1, we first test the attitudinal and behavioral effectiveness of the intervention among US partisans (N=1,380) by manually annotating (alpha=.895) social media posts with anti-democratic attitude scores and testing several feed ranking conditions based on these scores. Removal (d=.20) and downranking feeds (d=.25) reduced participants' partisan animosity without compromising their experience and engagement. In Study 2, we scale up the manual labels by creating the democratic attitude model, finding strong agreement with manual labels (rho=.75). Finally, in Study 3, we replicate Study 1 using the democratic attitude model instead of manual labels to test its attitudinal and behavioral impact (N=558), and again find that the feed downranking using the societal objective function reduced partisan animosity (d=.25). This method presents a novel strategy to draw on social science theory and methods to mitigate societal harms in social media AIs.
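A stripped-down sketch of ranking under a societal objective function (field names and the mixing weight are illustrative): the usual engagement objective is traded off against the model-estimated anti-democratic attitude score, which is what feed downranking amounts to.

```python
def rerank(posts, lam=0.5):
    """posts: dicts with 'engagement' and 'attitude' scores in [0, 1], where
    'attitude' estimates how strongly the post promotes anti-democratic
    attitudes; larger lam downranks such posts more aggressively."""
    return sorted(posts, key=lambda p: p["engagement"] - lam * p["attitude"],
                  reverse=True)

feed = rerank([{"id": 1, "engagement": 0.9, "attitude": 0.8},
               {"id": 2, "engagement": 0.6, "attitude": 0.1}])  # id 2 ranks first
```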

Robustness Verification of Deep Neural Networks using Star-Based Reachability Analysis with Variable-Length Time Series Input

  • paper_url: http://arxiv.org/abs/2307.13907
  • repo_url: None
  • paper_authors: Neelanjana Pal, Diego Manzanas Lopez, Taylor T Johnson
  • for: This paper validates and verifies the robustness of neural-network-based time-series analytics for anomaly detection and predictive maintenance.
  • methods: Time series regression NNs (TSRegNNs) are verified with set-based formal methods, using variable-length input data to streamline input manipulation and enhance the generalizability of the network architecture.
  • results: Robustness is checked via star-based reachability analysis on two PHM case studies (SOC estimation of a Lithium-ion battery and RUL estimation of a turbine engine), with several performance measures quantifying how bounded input perturbations affect network outputs, i.e., future outcomes.
    Abstract Data-driven, neural network (NN) based anomaly detection and predictive maintenance are emerging research areas. NN-based analytics of time-series data offer valuable insights into past behaviors and estimates of critical parameters like remaining useful life (RUL) of equipment and state-of-charge (SOC) of batteries. However, input time series data can be exposed to intentional or unintentional noise when passing through sensors, necessitating robust validation and verification of these NNs. This paper presents a case study of the robustness verification approach for time series regression NNs (TSRegNN) using set-based formal methods. It focuses on utilizing variable-length input data to streamline input manipulation and enhance network architecture generalizability. The method is applied to two data sets in the Prognostics and Health Management (PHM) application areas: (1) SOC estimation of a Lithium-ion battery and (2) RUL estimation of a turbine engine. The NNs' robustness is checked using star-based reachability analysis, and several performance measures evaluate the effect of bounded perturbations in the input on network outputs, i.e., future outcomes. Overall, the paper offers a comprehensive case study for validating and verifying NN-based analytics of time-series data in real-world applications, emphasizing the importance of robustness testing for accurate and reliable predictions, especially considering the impact of noise on future outcomes.

Data Augmentation for Neural Machine Translation using Generative Language Model

  • paper_url: http://arxiv.org/abs/2307.16833
  • repo_url: None
  • paper_authors: Seokjin Oh, Su ah Lee, Woohwan Jung
  • for: To improve machine translation performance by addressing the scarcity of large parallel corpora in neural machine translation.
  • methods: Prompt-based data augmentation leverages large-scale language models such as ChatGPT to generate a synthetic parallel corpus at no additional model-training cost; three prompting methods are compared, and two metrics assess the diversity of the generated data.
  • results: The approach improves the unaugmented baseline by 0.68 BLEU.
    Abstract Despite the rapid growth in model architecture, the scarcity of large parallel corpora remains the main bottleneck in Neural Machine Translation. Data augmentation is a technique that enhances the performance of data-hungry models by generating synthetic data instead of collecting new ones. We explore prompt-based data augmentation approaches that leverage large-scale language models such as ChatGPT. To create a synthetic parallel corpus, we compare 3 methods using different prompts. We employ two assessment metrics to measure the diversity of the generated synthetic data. This approach requires no further model training cost, which is mandatory in other augmentation methods like back-translation. The proposed method improves the unaugmented baseline by 0.68 BLEU score.
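An illustrative prompt template for eliciting synthetic parallel sentences from an LLM; the paper compares three prompting methods, and this template, the language pair, and the delimiter are assumptions rather than the paper's prompts.

```python
def augmentation_prompt(seed_sentences, src="German", tgt="English", n_new=5):
    """Ask the LLM for new source-language sentences plus their translations."""
    examples = "\n".join(f"- {s}" for s in seed_sentences)
    return (
        f"Here are {src} sentences from a translation corpus:\n{examples}\n\n"
        f"Write {n_new} new, diverse {src} sentences in a similar style and "
        f"translate each into {tgt}. Use the format: {src} ||| {tgt}"
    )

prompt = augmentation_prompt(["Das Wetter ist heute schön."])
# Send `prompt` to the LLM and split each returned line on ' ||| ' to obtain
# synthetic (source, target) pairs for augmenting the parallel corpus.
```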

FinTree: Financial Dataset Pretrain Transformer Encoder for Relation Extraction

  • paper_url: http://arxiv.org/abs/2307.13900
  • repo_url: None
  • paper_authors: Hyunjong Ok
  • for: FinTree is written for financial relation extraction tasks, specifically to improve the accuracy of relation predictions between two given entities.
  • methods: FinTree uses a pre-trained encoder language model, with a novel structure that predicts a masked token instead of the conventional [CLS] token, inspired by the Pattern Exploiting Training methodology. The model is trained with a unique input pattern to provide contextual and positional information about the entities of interest, and a post-processing step ensures accurate predictions in line with the entity types.
  • results: FinTree outperforms on the REFinD, a large-scale financial relation extraction dataset.
    Abstract We present FinTree, Financial Dataset Pretrain Transformer Encoder for Relation Extraction. Utilizing an encoder language model, we further pretrain FinTree on the financial dataset, adapting the model in financial domain tasks. FinTree stands out with its novel structure that predicts a masked token instead of the conventional [CLS] token, inspired by the Pattern Exploiting Training methodology. This structure allows for more accurate relation predictions between two given entities. The model is trained with a unique input pattern to provide contextual and positional information about the entities of interest, and a post-processing step ensures accurate predictions in line with the entity types. Our experiments demonstrate that FinTree outperforms on the REFinD, a large-scale financial relation extraction dataset. The code and pretrained models are available at https://github.com/HJ-Ok/FinTree.
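A sketch of a PET-style input pattern of the kind the abstract describes (the exact pattern wording and the post-processing are assumptions): the model fills the masked token with a verbalized relation between the two entities instead of classifying from a [CLS] embedding.

```python
def build_pattern(sentence, entity1, entity2, mask_token="[MASK]"):
    """Pattern-Exploiting-Training style input for relation extraction."""
    return (f"{sentence} The relation between {entity1} and {entity2} "
            f"is {mask_token}.")

x = build_pattern("Acme Corp acquired Beta Ltd for $2B.", "Acme Corp", "Beta Ltd")
# A masked-LM head then scores relation verbalizers (e.g. 'acquirer') at the
# masked position, and a post-processing step keeps predictions consistent
# with the entity types.
```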

Regularizing Neural Networks with Meta-Learning Generative Models

  • paper_url: http://arxiv.org/abs/2307.13899
  • repo_url: None
  • paper_authors: Shin’ya Yamaguchi, Daiki Chijiwa, Sekitoshi Kanai, Atsutoshi Kumagai, Hisashi Kashima
  • for: To improve generative data augmentation for deep learning.
  • methods: Meta generative regularization (MGR) uses synthetic samples from generative models in a regularization term for the feature extractor rather than in the loss function, with the samples dynamically determined via meta-learning to minimize validation losses.
  • results: Experiments on six datasets show that MGR avoids the performance degradation of naive generative data augmentation and stably outperforms baselines, particularly when datasets are smaller.
    Abstract This paper investigates methods for improving generative data augmentation for deep learning. Generative data augmentation leverages the synthetic samples produced by generative models as an additional dataset for classification with small dataset settings. A key challenge of generative data augmentation is that the synthetic data contain uninformative samples that degrade accuracy. This is because the synthetic samples do not perfectly represent class categories in real data and uniform sampling does not necessarily provide useful samples for tasks. In this paper, we present a novel strategy for generative data augmentation called meta generative regularization (MGR). To avoid the degradation of generative data augmentation, MGR utilizes synthetic samples in the regularization term for feature extractors instead of in the loss function, e.g., cross-entropy. These synthetic samples are dynamically determined to minimize the validation losses through meta-learning. We observed that MGR can avoid the performance degradation of naïve generative data augmentation and boost the baselines. Experiments on six datasets showed that MGR is effective particularly when datasets are smaller and stably outperforms baselines.

AI4GCC - Team: Below Sea Level: Critiques and Improvements

  • paper_url: http://arxiv.org/abs/2307.13894
  • repo_url: None
  • paper_authors: Bram Renting, Phillip Wozny, Robert Loftin, Claudia Wieners, Erman Acar
  • for: To assess how the economic impacts of climate change are modeled for policy evaluation.
  • methods: A critical analysis of the integrated assessment model (IAM) RICE-N.
  • results: The paper identifies key issues with RICE-N, including action masking and irrelevant actions, suggests improvements such as utilizing tariff revenue and penalizing overproduction, and critiques the overly optimistic damage functions and unrealistic abatement cost functions common to IAMs.
    Abstract We present a critical analysis of the simulation framework RICE-N, an integrated assessment model (IAM) for evaluating the impacts of climate change on the economy. We identify key issues with RICE-N, including action masking and irrelevant actions, and suggest improvements such as utilizing tariff revenue and penalizing overproduction. We also critically engage with features of IAMs in general, namely overly optimistic damage functions and unrealistic abatement cost functions. Our findings contribute to the ongoing efforts to further develop the RICE-N framework in an effort to improve the simulation, making it more useful as an inspiration for policymakers.

Dynamic Grouping for Climate Change Negotiation: Facilitating Cooperation and Balancing Interests through Effective Strategies

  • paper_url: http://arxiv.org/abs/2307.13893
  • repo_url: None
  • paper_authors: Yu Qin, Duo Zhang, Yuren Pang
  • for: This paper proposes a dynamic grouping negotiation model for climate mitigation, based on real-world business and political negotiation protocols, to promote effective cooperation among stakeholders toward global climate objectives.
  • methods: The model proceeds in three stages: group formation and updates, intra-group negotiation, and inter-group negotiation, using a group-forming method and updating strategy to address the complexities and imbalances of multi-region climate negotiations.
  • results: Demonstrating the negotiation model within the RICE-N framework illustrates a promising approach for facilitating international cooperation on climate change mitigation.
    Abstract In this paper, we propose a dynamic grouping negotiation model for climate mitigation based on real-world business and political negotiation protocols. Within the AI4GCC competition framework, we develop a three-stage process: group formation and updates, intra-group negotiation, and inter-group negotiation. Our model promotes efficient and effective cooperation between various stakeholders to achieve global climate change objectives. By implementing a group-forming method and group updating strategy, we address the complexities and imbalances in multi-region climate negotiations. Intra-group negotiations ensure that all members contribute to mitigation efforts, while inter-group negotiations use the proposal-evaluation framework to set mitigation and savings rates. We demonstrate our negotiation model within the RICE-N framework, illustrating a promising approach for facilitating international cooperation on climate change mitigation.

AI4GCC-Team – Below Sea Level: Score and Real World Relevance

  • paper_url: http://arxiv.org/abs/2307.13892
  • repo_url: None
  • paper_authors: Phillip Wozny, Bram Renting, Robert Loftin, Claudia Wieners, Erman Acar
  • for: The paper is written to address the challenges of carbon leakage in the context of the RICE-N climate-economic simulation, with the goal of achieving a comparable temperature rise to RCP 3.4/4.5 and SSP 2.
  • methods: The paper proposes a negotiation protocol inspired by the Carbon Border Adjustment Mechanism (CBAM) and Climate Clubs (CC), and demonstrates the effectiveness of this approach through simulations.
  • results: The proposed protocol yields a temperature rise comparable to RCP 3.4/4.5 and SSP 2, and the paper analyzes its World Trade Organization compliance, administrative and political feasibility, and ethical concerns. It also acknowledges the risk of hurting the least developed countries and suggests specific corrective measures, such as technology sharing and wealth redistribution, to avoid exacerbating existing inequalities.
    Abstract As our submission for track three of the AI for Global Climate Cooperation (AI4GCC) competition, we propose a negotiation protocol for use in the RICE-N climate-economic simulation. Our proposal seeks to address the challenges of carbon leakage through methods inspired by the Carbon Border Adjustment Mechanism (CBAM) and Climate Clubs (CC). We demonstrate the effectiveness of our approach by comparing simulated outcomes to representative concentration pathways (RCP) and shared socioeconomic pathways (SSP). Our protocol results in a temperature rise comparable to RCP 3.4/4.5 and SSP 2. Furthermore, we provide an analysis of our protocol's World Trade Organization compliance, administrative and political feasibility, and ethical concerns. We recognize that our proposal risks hurting the least developed countries, and we suggest specific corrective measures to avoid exacerbating existing inequalities, such as technology sharing and wealth redistribution. Future research should improve the RICE-N tariff mechanism and implement actions allowing for the aforementioned corrective measures.

Dynamic Grouping for Climate Change Negotiation: Facilitating Cooperation and Balancing Interests through Effective Strategies

  • paper_url: http://arxiv.org/abs/2307.13886
  • repo_url: None
  • paper_authors: Duo Zhang, Yuren Pang, Yu Qin
  • for: This paper aims to improve the accuracy and effectiveness of climate change negotiation models by addressing limitations in the current framework.
  • methods: The paper explores five critical aspects of geographical impacts and refines the utility and rewards framework to better account for heterogeneity and historical/cultural factors.
  • results: By addressing these limitations, the paper aims to enhance the accuracy and effectiveness of climate change negotiation models, enabling policymakers and stakeholders to devise targeted and appropriate strategies for tackling climate change at both regional and global levels.
    Abstract The current framework for climate change negotiation models presents several limitations that warrant further research and development. In this track, we discuss mainly two key areas for improvement, focusing on the geographical impacts and utility framework. In the aspects of geographical impacts, we explore five critical aspects: (1) the shift from local to global impact, (2) variability in climate change effects across regions, (3) heterogeneity in geographical location and political structures, (4) collaborations between adjacent nations, and (5) the importance of including historical and cultural factors influencing climate negotiations. Furthermore, we emphasize the need to refine the utility and rewards framework to reduce homogeneity and the overestimation of climate mitigation, by integrating the positive effects of saving rates into the reward function and accounting for heterogeneity among all regions. By addressing these limitations, we hope to enhance the accuracy and effectiveness of climate change negotiation models, enabling policymakers and stakeholders to devise targeted and appropriate strategies to tackle climate change at both regional and global levels.

WebArena: A Realistic Web Environment for Building Autonomous Agents

  • paper_url: http://arxiv.org/abs/2307.13854
  • repo_url: https://github.com/web-arena-x/webarena
  • paper_authors: Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig
  • for: To build a highly realistic and reproducible environment in which autonomous agents carry out everyday tasks via natural language commands.
  • methods: The environment hosts fully functional websites from four common domains (e-commerce, social forum discussions, collaborative software development, and content management), enriched with tools such as a map and external knowledge bases such as user manuals; several autonomous agents are implemented, integrating recent techniques such as reasoning before acting.
  • results: Solving complex tasks remains challenging: the best GPT-4-based agent achieves an end-to-end task success rate of only 10.59%, showing that current language models are far from succeeding at these real-life tasks.
    Abstract With generative AI advances, the exciting potential for autonomous agents to manage daily tasks via natural language commands has emerged. However, current agents are primarily created and tested in simplified synthetic environments, substantially limiting real-world scenario representation. In this paper, we build an environment for agent command and control that is highly realistic and reproducible. Specifically, we focus on agents that perform tasks on websites, and we create an environment with fully functional websites from four common domains: e-commerce, social forum discussions, collaborative software development, and content management. Our environment is enriched with tools (e.g., a map) and external knowledge bases (e.g., user manuals) to encourage human-like task-solving. Building upon our environment, we release a set of benchmark tasks focusing on evaluating the functional correctness of task completions. The tasks in our benchmark are diverse, long-horizon, and are designed to emulate tasks that humans routinely perform on the internet. We design and implement several autonomous agents, integrating recent techniques such as reasoning before acting. The results demonstrate that solving complex tasks is challenging: our best GPT-4-based agent only achieves an end-to-end task success rate of 10.59%. These results highlight the need for further development of robust agents, show that current state-of-the-art LMs are far from perfect performance on these real-life tasks, and demonstrate that WebArena can be used to measure such progress. Our code, data, environment reproduction resources, and video demonstrations are publicly available at https://webarena.dev/.

MAEA: Multimodal Attribution for Embodied AI

  • paper_url: http://arxiv.org/abs/2307.13850
  • repo_url: None
  • paper_authors: Vidhi Jain, Jayant Sravan Tamarapalli, Sahiti Yerramilli, Yonatan Bisk
  • for: This paper studies multimodal perception for embodied AI, where inputs may contain highly complementary as well as redundant information for the task.
  • methods: Attribution analysis disentangles the contributions of visual, language, and previous-action inputs across different policies trained on the ALFRED dataset, enabling failure scenarios to be ranked and grouped and modeling and dataset biases to be investigated.
  • results: The proposed MAEA framework computes global attributions per modality for any differentiable policy, and attributions further enable lower-level behavior analysis of language and visual inputs.
    Abstract Understanding multimodal perception for embodied AI is an open question because such inputs may contain highly complementary as well as redundant information for the task. A relevant direction for multimodal policies is understanding the global trends of each modality at the fusion layer. To this end, we disentangle the attributions for visual, language, and previous action inputs across different policies trained on the ALFRED dataset. Attribution analysis can be utilized to rank and group the failure scenarios, investigate modeling and dataset biases, and critically analyze multimodal EAI policies for robustness and user trust before deployment. We present MAEA, a framework to compute global attributions per modality of any differentiable policy. In addition, we show how attributions enable lower-level behavior analysis in EAI policies for language and visual attributions.
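A compact gradient-times-input sketch of per-modality attribution (the attribution method and interfaces are assumptions; MAEA itself targets any differentiable policy): each modality's contribution to the policy output is summarized by the mass of its input-gradient products.

```python
import torch

def modality_attributions(policy, inputs):
    """inputs: dict modality -> tensor; policy: callable returning a scalar.
    Returns the |gradient * input| mass attributed to each modality."""
    for t in inputs.values():
        t.requires_grad_(True)
    policy(inputs).backward()
    return {k: (t.grad * t).abs().sum().item() for k, t in inputs.items()}

obs = {"vision": torch.randn(1, 16), "language": torch.randn(1, 8),
       "prev_action": torch.randn(1, 4)}
toy_policy = lambda d: sum(t.sum() for t in d.values())   # stand-in policy
print(modality_attributions(toy_policy, obs))
```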

Scaling Integer Arithmetic in Probabilistic Programs

  • paper_url: http://arxiv.org/abs/2307.13837
  • repo_url: None
  • paper_authors: William X. Cao, Poorva Garg, Ryan Tjoa, Steven Holtzen, Todd Millstein, Guy Van den Broeck
  • for: This paper addresses distributions over integers in probabilistic programming languages (PPLs).
  • methods: A binary encoding strategy for discrete distributions exploits the rich logical structure of integer operations such as summation and comparison, combined with knowledge compilation for exact probabilistic inference.
  • results: The approach scales exact inference with arithmetic to much larger integer distributions than enumeration-, sampling-, or differentiation-based strategies.
    Abstract Distributions on integers are ubiquitous in probabilistic modeling but remain challenging for many of today's probabilistic programming languages (PPLs). The core challenge comes from discrete structure: many of today's PPL inference strategies rely on enumeration, sampling, or differentiation in order to scale, which fail for high-dimensional complex discrete distributions involving integers. Our insight is that there is structure in arithmetic that these approaches are not using. We present a binary encoding strategy for discrete distributions that exploits the rich logical structure of integer operations like summation and comparison. We leverage this structured encoding with knowledge compilation to perform exact probabilistic inference, and show that this approach scales to much larger integer distributions with arithmetic.
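To make the encoding concrete, a toy sketch (independent bit probabilities are an illustrative assumption, and this sketch enumerates assignments for clarity, whereas the paper compiles the Boolean circuit with knowledge compilation to avoid exactly this enumeration): each integer random variable becomes a tuple of Bernoulli bits, and addition becomes a ripple-carry Boolean circuit.

```python
from itertools import product

def add_bits(a, b):
    """Ripple-carry addition over bit tuples (least significant bit first)."""
    out, carry = [], 0
    for x, y in zip(a, b):
        out.append(x ^ y ^ carry)
        carry = (x & y) | (carry & (x ^ y))
    out.append(carry)
    return tuple(out)

def sum_distribution(p_a, p_b):
    """Exact distribution of A + B when each variable's bits are independent
    Bernoullis with P(bit i = 1) given by p_a[i] and p_b[i]."""
    dist, n = {}, len(p_a)
    for a in product((0, 1), repeat=n):
        for b in product((0, 1), repeat=n):
            w = 1.0
            for i in range(n):
                w *= p_a[i] if a[i] else 1 - p_a[i]
                w *= p_b[i] if b[i] else 1 - p_b[i]
            s = sum(bit << i for i, bit in enumerate(add_bits(a, b)))
            dist[s] = dist.get(s, 0.0) + w
    return dist

print(sum_distribution([0.5, 0.5], [0.5, 0.5]))  # triangular over 0..6
```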

Offline Reinforcement Learning with On-Policy Q-Function Regularization

  • paper_url: http://arxiv.org/abs/2307.13824
  • repo_url: None
  • paper_authors: Laixi Shi, Robert Dadashi, Yuejie Chi, Pablo Samuel Castro, Matthieu Geist
  • for: This work tackles the (potentially catastrophic) extrapolation error in offline reinforcement learning (RL) induced by the distribution shift between the history dataset and the desired policy.
  • methods: Instead of regularizing toward the behavior policy, which is hard to estimate reliably, the learning policy is regularized toward the Q-function of the behavior policy, which can be estimated more easily via a SARSA-style estimate; two algorithms exploit this estimated Q-function through regularization.
  • results: Both algorithms exhibit strong performance on the D4RL benchmarks.
    Abstract The core challenge of offline reinforcement learning (RL) is dealing with the (potentially catastrophic) extrapolation error induced by the distribution shift between the history dataset and the desired policy. A large portion of prior work tackles this challenge by implicitly/explicitly regularizing the learning policy towards the behavior policy, which is hard to estimate reliably in practice. In this work, we propose to regularize towards the Q-function of the behavior policy instead of the behavior policy itself, under the premise that the Q-function can be estimated more reliably and easily by a SARSA-style estimate and handles the extrapolation error more straightforwardly. We propose two algorithms taking advantage of the estimated Q-function through regularizations, and demonstrate they exhibit strong performance on the D4RL benchmarks.
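A schematic critic loss showing how a SARSA-style estimate of the behavior policy's Q-function can enter as a regularizer; the penalty form, the simplified one-step target, and the tensor layout are assumptions, and the paper's two concrete algorithms are not reproduced here.

```python
import torch

def regularized_critic_loss(q, q_behavior, batch, gamma=0.99, lam=1.0):
    """TD loss plus a penalty keeping the learned Q-function close to the
    SARSA-style behavior estimate. `batch` holds offline (s, a, r, s2, a2)
    tensors; q and q_behavior map (states, actions) -> value tensors."""
    s, a, r, s2, a2 = batch
    with torch.no_grad():
        target = r + gamma * q(s2, a2)      # simplified one-step bootstrap
    td = ((q(s, a) - target) ** 2).mean()
    reg = ((q(s, a) - q_behavior(s, a).detach()) ** 2).mean()
    return td + lam * reg
```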

Fitting Auditory Filterbanks with Multiresolution Neural Networks

  • paper_url: http://arxiv.org/abs/2307.13821
  • repo_url: https://github.com/lostanlen/lostanlen2023waspaa
  • paper_authors: Vincent Lostanlen, Daniel Haider, Han Han, Mathieu Lagrange, Peter Balazs, Martin Ehler
  • for: This paper addresses the dilemma between nonparametric and parametric approaches in waveform-based deep learning for audio.
  • methods: The multiresolution neural network (MuReNN) trains separate convolutional operators over the octave subbands of a discrete wavelet transform (DWT), so the receptive fields of the learnable convolutions dilate with the exponentially growing scale of DWT atoms across octaves.
  • results: Fitting the magnitude responses of well-established auditory filterbanks (Gammatone for speech, CQT for music, and third-octave for urban sounds) via knowledge distillation, MuReNN reaches state-of-the-art performance on all three optimization problems, compared to convnets and Gabor convolutions.
    Abstract Waveform-based deep learning faces a dilemma between nonparametric and parametric approaches. On one hand, convolutional neural networks (convnets) may approximate any linear time-invariant system; yet, in practice, their frequency responses become more irregular as their receptive fields grow. On the other hand, a parametric model such as LEAF is guaranteed to yield Gabor filters, hence an optimal time-frequency localization; yet, this strong inductive bias comes at the detriment of representational capacity. In this paper, we aim to overcome this dilemma by introducing a neural audio model, named multiresolution neural network (MuReNN). The key idea behind MuReNN is to train separate convolutional operators over the octave subbands of a discrete wavelet transform (DWT). Since the scale of DWT atoms grows exponentially between octaves, the receptive fields of the subsequent learnable convolutions in MuReNN are dilated accordingly. For a given real-world dataset, we fit the magnitude response of MuReNN to that of a well-established auditory filterbank: Gammatone for speech, CQT for music, and third-octave for urban sounds, respectively. This is a form of knowledge distillation (KD), in which the filterbank ''teacher'' is engineered by domain knowledge while the neural network ''student'' is optimized from data. We compare MuReNN to the state of the art in terms of goodness of fit after KD on a hold-out set and in terms of Heisenberg time-frequency localization. Compared to convnets and Gabor convolutions, we find that MuReNN reaches state-of-the-art performance on all three optimization problems.
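A rough sketch of the MuReNN idea with PyWavelets and PyTorch (the wavelet, decomposition depth, and per-band channel counts are assumptions): the DWT splits the waveform into octave subbands, and each subband gets its own learnable 1-D convolution, so effective receptive fields dilate with the octave.

```python
import pywt
import torch
import torch.nn as nn

class MuReNNSketch(nn.Module):
    def __init__(self, levels=4, out_channels=4, kernel=9):
        super().__init__()
        self.levels = levels
        # One learnable conv per subband: [approx_L, detail_L, ..., detail_1].
        self.convs = nn.ModuleList(
            nn.Conv1d(1, out_channels, kernel, padding=kernel // 2)
            for _ in range(levels + 1))

    def forward(self, waveform):  # waveform: 1-D array-like
        coeffs = pywt.wavedec(waveform, "db2", level=self.levels)
        return [conv(torch.as_tensor(c, dtype=torch.float32).view(1, 1, -1))
                for c, conv in zip(coeffs, self.convs)]

subbands = MuReNNSketch()(torch.randn(1024).numpy())  # list of per-band maps
```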

ForestMonkey: Toolkit for Reasoning with AI-based Defect Detection and Classification Models

  • paper_url: http://arxiv.org/abs/2307.13815
  • repo_url: None
  • paper_authors: Jiajun Zhang, Georgina Cosma, Sarah Bugby, Jason Watkins
  • for: This paper introduces Forest Monkey (FM), a toolkit for reasoning about the outputs of any AI-based defect detection and/or classification model with data explainability, producing charts and a text report that illustrate the reasoning results and suggest possible improvements.
  • methods: The toolkit extracts features from predictions to reasoning targets and from images to defect characteristics, and applies a decision-tree-based AI-Reasoner.
  • results: The paper evaluates the toolkit's time performance on four AI models with different datasets and provides a tutorial guiding users through reasoning tasks with FM.
    Abstract Artificial intelligence (AI) reasoning and explainable AI (XAI) tasks have gained popularity recently, enabling users to explain the predictions or decision processes of AI models. This paper introduces Forest Monkey (FM), a toolkit designed to reason the outputs of any AI-based defect detection and/or classification model with data explainability. Implemented as a Python package, FM takes input in the form of dataset folder paths (including original images, ground truth labels, and predicted labels) and provides a set of charts and a text file to illustrate the reasoning results and suggest possible improvements. The FM toolkit consists of processes such as feature extraction from predictions to reasoning targets, feature extraction from images to defect characteristics, and a decision tree-based AI-Reasoner. Additionally, this paper investigates the time performance of the FM toolkit when applied to four AI models with different datasets. Lastly, a tutorial is provided to guide users in performing reasoning tasks using the FM toolkit.

Speech representation learning: Learning bidirectional encoders with single-view, multi-view, and multi-task methods

  • paper_url: http://arxiv.org/abs/2308.00129
  • repo_url: None
  • paper_authors: Qingming Tang
  • for: This thesis aims to improve downstream prediction tasks on sequence data over time or space by using learned representations.
  • methods: Beyond supervised learning with auxiliary losses, the thesis studies unsupervised, semi-supervised, and multi-view learning, exploring multiple approaches to representation learning for speech.
  • results: The result is a broad, pre-Transformer-era study of speech representation learning across multiple learning settings; although it focuses on speech, the methods can also be applied to other domains.
    Abstract This thesis focuses on representation learning for sequence data over time or space, aiming to improve downstream sequence prediction tasks by using the learned representations. Supervised learning has been the most dominant approach for training deep neural networks for learning good sequential representations. However, one limiting factor to scale supervised learning is the lack of enough annotated data. Motivated by this challenge, it is natural to explore representation learning methods that can utilize large amounts of unlabeled and weakly labeled data, as well as an additional data modality. I describe my broad study of representation learning for speech data. Unlike most other works that focus on a single learning setting, this thesis studies multiple settings: supervised learning with auxiliary losses, unsupervised learning, semi-supervised learning, and multi-view learning. Besides different learning problems, I also explore multiple approaches for representation learning. Though I focus on speech data, the methods described in this thesis can also be applied to other domains. Overall, the field of representation learning is developing rapidly. State-of-the-art results on speech related tasks are typically based on Transformers pre-trained with large-scale self-supervised learning, which aims to learn generic representations that can benefit multiple downstream tasks. Since 2020, large-scale pre-training has been the de facto choice to achieve good performance. This delayed thesis does not attempt to summarize and compare with the latest results on speech representation learning; instead, it presents a unique study on speech representation learning before the Transformer era, that covers multiple learning settings. Some of the findings in this thesis can still be useful today.

How to Scale Your EMA

  • paper_url: http://arxiv.org/abs/2307.13813
  • repo_url: None
  • paper_authors: Dan Busbridge, Jason Ramapuram, Pierre Ablin, Tatiana Likhomanenko, Eeshan Gunesh Dhekane, Xavier Suau, Russ Webb
  • for: To preserve training dynamics across batch sizes in the presence of a model Exponential Moving Average (EMA), enabling the trade-off between batch size and wall-clock time.
  • methods: The paper derives a scaling rule for optimization with model EMAs, complementing the linear learning-rate scaling rule; the model EMA improves the robustness and generalization of supervised learning, stabilizes pseudo-labeling, and provides a learning signal for self-supervised learning (SSL).
  • results: The rule holds across a range of architectures, optimizers, and data modalities, including settings where the EMA contributes to optimizing the target model; for SSL, BYOL trains up to batch size 24,576 without sacrificing performance, optimally a 6x wall-clock time reduction.
    Abstract Preserving training dynamics across batch sizes is an important tool for practical machine learning as it enables the trade-off between batch size and wall-clock time. This trade-off is typically enabled by a scaling rule, for example, in stochastic gradient descent, one should scale the learning rate linearly with the batch size. Another important tool for practical machine learning is the model Exponential Moving Average (EMA), which is a model copy that does not receive gradient information, but instead follows its target model with some momentum. This model EMA can improve the robustness and generalization properties of supervised learning, stabilize pseudo-labeling, and provide a learning signal for Self-Supervised Learning (SSL). Prior works have treated the model EMA separately from optimization, leading to different training dynamics across batch sizes and lower model performance. In this work, we provide a scaling rule for optimization in the presence of model EMAs and demonstrate its validity across a range of architectures, optimizers, and data modalities. We also show the rule's validity where the model EMA contributes to the optimization of the target model, enabling us to train EMA-based pseudo-labeling and SSL methods at small and large batch sizes. For SSL, we enable training of BYOL up to batch size 24,576 without sacrificing performance, optimally a 6× wall-clock time reduction.
    摘要 在不同批大小下保持训练动态一致是实用机器学习的重要工具,它使批大小与 wall-clock 时间之间的权衡成为可能。这种权衡通常通过缩放规则实现:例如在随机梯度下降中,应将学习率随批大小线性缩放。模型指数移动平均(EMA)是另一种重要工具:它是一份不接收梯度信息、而是以一定动量跟随目标模型的模型副本。模型 EMA 可以提升监督学习的鲁棒性和泛化能力,稳定伪标注,并为自监督学习(SSL)提供学习信号。先前的工作将模型 EMA 与优化分开处理,导致不同批大小下训练动态不一致、模型性能下降。在这项工作中,我们给出了存在模型 EMA 时的优化缩放规则,并在多种架构、优化器和数据模态上验证了其有效性。我们还展示了该规则在模型 EMA 参与目标模型优化时同样成立,从而使基于 EMA 的伪标注和 SSL 方法能够在小批量和大批量下训练。对于 SSL,我们实现了在批大小高达 24,576 时训练 BYOL 而不损失性能,最优情况下将 wall-clock 时间缩减 6 倍。
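
To make the scaling rule concrete, below is a minimal PyTorch sketch of the EMA update together with the batch-size scaling of the learning rate and EMA momentum. The momentum scaling rho → rho**kappa reflects a common reading of the paper's EMA Scaling Rule; the toy model and hyperparameter values are illustrative assumptions, not the authors' recipe.

```python
# Minimal sketch: EMA update plus the batch-size scaling of LR and momentum.
# Assumption: when the batch size grows by a factor kappa, the learning rate
# is scaled linearly (SGD rule) and the EMA momentum rho as rho**kappa, so
# that (target, EMA) trajectories match across batch sizes.

import copy
import torch

def scale_hyperparams(lr, rho, kappa):
    """Scale optimizer LR linearly and EMA momentum exponentially."""
    return lr * kappa, rho ** kappa

@torch.no_grad()
def ema_update(ema_model, model, rho):
    """Classic EMA: ema <- rho * ema + (1 - rho) * target."""
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(rho).add_(p, alpha=1.0 - rho)

# Illustrative usage at batch size 256 vs. a base recipe tuned at 32:
base_lr, base_rho, kappa = 0.1, 0.999, 256 / 32
lr, rho = scale_hyperparams(base_lr, base_rho, kappa)

model = torch.nn.Linear(10, 10)
ema_model = copy.deepcopy(model)
opt = torch.optim.SGD(model.parameters(), lr=lr)

x = torch.randn(256, 10)
loss = model(x).pow(2).mean()
loss.backward()
opt.step()
ema_update(ema_model, model, rho)
```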

When Multi-Task Learning Meets Partial Supervision: A Computer Vision Review

  • paper_url: http://arxiv.org/abs/2307.14382
  • repo_url: None
  • paper_authors: Maxime Fontana, Michael Spratling, Miaojing Shi
  • for: 这篇论文综述多任务学习(MTL),即同时学习多个任务并利用任务之间的关系,以降低内存需求和推理时间。
  • methods: 论文首先介绍传统的 MTL 方法,包括用于在任务间迁移知识的各种参数共享技术;随后讨论由多个目标函数构成的多目标优化方案及其带来的挑战。
  • results: 论文介绍了应用于 MTL 的部分监督方法如何应对多目标优化中的挑战,并给出了可用的数据集、工具和基准测试结果,以评估这些方法的性能。
    Abstract Multi-Task Learning (MTL) aims to learn multiple tasks simultaneously while exploiting their mutual relationships. By using shared resources to simultaneously calculate multiple outputs, this learning paradigm has the potential to have lower memory requirements and inference times compared to the traditional approach of using separate methods for each task. Previous work in MTL has mainly focused on fully-supervised methods, as task relationships can not only be leveraged to lower the level of data-dependency of those methods but they can also improve performance. However, MTL introduces a set of challenges due to a complex optimisation scheme and a higher labeling requirement. This review focuses on how MTL could be utilised under different partial supervision settings to address these challenges. First, this review analyses how MTL traditionally uses different parameter sharing techniques to transfer knowledge in between tasks. Second, it presents the different challenges arising from such a multi-objective optimisation scheme. Third, it introduces how task groupings can be achieved by analysing task relationships. Fourth, it focuses on how partially supervised methods applied to MTL can tackle the aforementioned challenges. Lastly, this review presents the available datasets, tools and benchmarking results of such methods.
    摘要 多任务学习(MTL)旨在同时学习多个任务并利用任务之间的相互关系。通过共享资源同时计算多个输出,这一学习范式相比为每个任务使用独立方法的传统做法,有望降低内存需求和推理时间。以往的 MTL 工作主要集中在全监督方法上,因为任务关系不仅可以降低方法对数据的依赖程度,还能提升性能。然而,MTL 也带来了一系列挑战,包括复杂的优化方案和更高的标注要求。本综述首先分析了 MTL 传统上如何利用不同的参数共享技术在任务间迁移知识;其次阐述了这种多目标优化方案带来的各类挑战;第三,介绍了如何通过分析任务关系实现任务分组;第四,重点讨论了应用于 MTL 的部分监督方法如何应对上述挑战;最后,综述给出了可用的数据集、工具以及这些方法的基准测试结果。
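
As a concrete reference for the parameter-sharing discussion above, here is a minimal PyTorch sketch of "hard" parameter sharing, the classic MTL transfer mechanism: one shared encoder feeds task-specific heads, and per-task losses are summed. The architecture and unit loss weights are illustrative assumptions, not the review's prescription.

```python
# Hard parameter sharing: a shared encoder plus one head per task.
import torch
import torch.nn as nn

class HardSharingMTL(nn.Module):
    def __init__(self, in_dim, hidden, task_dims):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(hidden, d) for d in task_dims)

    def forward(self, x):
        z = self.encoder(x)                 # shared representation
        return [head(z) for head in self.heads]

model = HardSharingMTL(in_dim=32, hidden=64, task_dims=[10, 1])
x = torch.randn(8, 32)
cls_logits, reg_out = model(x)              # a classification and a regression task
loss = nn.functional.cross_entropy(cls_logits, torch.randint(0, 10, (8,))) \
     + nn.functional.mse_loss(reg_out.squeeze(-1), torch.randn(8))
loss.backward()  # gradients from both tasks flow into the shared encoder
```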

EdgeConvEns: Convolutional Ensemble Learning for Edge Intelligence

  • paper_url: http://arxiv.org/abs/2307.14381
  • repo_url: None
  • paper_authors: Ilkay Sikdokur, İnci M. Baytaş, Arda Yurdakul
  • for: 本研究旨在将深度学习模型部署到计算能力有限的边缘网络中,以提升边缘设备的学习能力和预测性能。
  • methods: 本研究提出了一种卷积集成学习方法 EdgeConvEns,可在计算能力各异的边缘设备上训练弱模型,并在中央服务器上对这些模型进行集成以获得更好的预测性能。
  • results: 实验结果表明,EdgeConvEns 能够在多种训练场景下超越当前最佳性能,且所需的网络通信次数更少、传输的数据量更小。
    Abstract Deep edge intelligence aims to deploy deep learning models that demand computationally expensive training in the edge network with limited computational power. Moreover, many deep edge intelligence applications require handling distributed data that cannot be transferred to a central server due to privacy concerns. Decentralized learning methods, such as federated learning, offer solutions where models are learned collectively by exchanging learned weights. However, they often require complex models that edge devices may not handle and multiple rounds of network communication to achieve state-of-the-art performances. This study proposes a convolutional ensemble learning approach, coined EdgeConvEns, that facilitates training heterogeneous weak models on edge and learning to ensemble them where data on edge are heterogeneously distributed. Edge models are implemented and trained independently on Field-Programmable Gate Array (FPGA) devices with various computational capacities. Learned data representations are transferred to a central server where the ensemble model is trained with the learned features received from the edge devices to boost the overall prediction performance. Extensive experiments demonstrate that the EdgeConvEns can outperform the state-of-the-art performance with fewer communications and less data in various training scenarios.
    摘要 深度边缘智能旨在将需要昂贵计算训练的深度学习模型部署到计算能力有限的边缘网络中。此外,许多深度边缘智能应用需要处理分布式数据,而出于隐私考虑,这些数据无法传输到中央服务器。以联邦学习为代表的去中心化学习方法通过交换学习到的权重来共同训练模型,但它们往往需要边缘设备难以承受的复杂模型,并且需要多轮网络通信才能达到最佳性能。本研究提出了一种卷积集成学习方法 EdgeConvEns,可在边缘上训练异构的弱模型,并在边缘数据异构分布的情况下学习将其集成。边缘模型在具有不同计算能力的现场可编程门阵列(FPGA)设备上独立实现和训练,学习到的数据表示被传输到中央服务器,服务器利用来自边缘设备的特征训练集成模型,以提升整体预测性能。大量实验表明,EdgeConvEns 能够在多种训练场景下以更少的通信和数据超越当前最佳性能。
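
A rough structural sketch of the pipeline the abstract describes: heterogeneous weak convolutional models stand in for edge devices, and a central head is trained on their concatenated features. Model sizes and the linear ensemble head are assumptions; the paper trains its edge models on FPGAs and learns the ensemble differently.

```python
# Heterogeneous weak edge models feeding a server-side ensemble head.
import torch
import torch.nn as nn

# Weak edge models with different capacities (channel counts).
edge_models = [
    nn.Sequential(nn.Conv2d(3, c, 3, padding=1), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten())
    for c in (4, 8, 16)
]

def edge_features(x):
    # Each "edge device" extracts its own learned representation;
    # only these features, not raw data, are sent to the server.
    with torch.no_grad():
        return torch.cat([m(x) for m in edge_models], dim=1)

# Central server: ensemble head trained on the concatenated features.
ensemble_head = nn.Linear(4 + 8 + 16, 10)

x = torch.randn(32, 3, 32, 32)            # a local batch of images
logits = ensemble_head(edge_features(x))  # server-side prediction
```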

A large language model-assisted education tool to provide feedback on open-ended responses

  • paper_url: http://arxiv.org/abs/2308.02439
  • repo_url: https://github.com/KordingLab/llm4teach-freetext-server
  • paper_authors: Jordan K. Matelsky, Felipe Parodi, Tony Liu, Richard D. Lange, Konrad P. Kording
  • for: 这篇论文提出了一种为开放式问题的学生回答提供自动反馈的工具,帮助教师快速给出个性化反馈,从而提升学生学习效果和教学方法。
  • methods: 该工具使用大型语言模型(LLMs),在教师定义的标准指导下,自动为开放式问题的学生回答生成反馈。
  • results: 该工具能够快速提供个性化反馈,帮助学生及时检验所学知识并发现需要改进的领域。
    Abstract Open-ended questions are a favored tool among instructors for assessing student understanding and encouraging critical exploration of course material. Providing feedback for such responses is a time-consuming task that can lead to overwhelmed instructors and decreased feedback quality. Many instructors resort to simpler question formats, like multiple-choice questions, which provide immediate feedback but at the expense of personalized and insightful comments. Here, we present a tool that uses large language models (LLMs), guided by instructor-defined criteria, to automate responses to open-ended questions. Our tool delivers rapid personalized feedback, enabling students to quickly test their knowledge and identify areas for improvement. We provide open-source reference implementations both as a web application and as a Jupyter Notebook widget that can be used with instructional coding or math notebooks. With instructor guidance, LLMs hold promise to enhance student learning outcomes and elevate instructional methodologies.
    摘要 开放式问题是教师评估学生理解程度、鼓励对课程材料进行批判性探究时常用的工具。为此类回答提供反馈十分耗时,容易让教师不堪重负并导致反馈质量下降。许多教师因此转向更简单的题型(如多选题),虽然能提供即时反馈,却牺牲了个性化和有深度的评语。在此,我们提出一种工具,使用大型语言模型(LLMs),在教师定义的标准指导下,自动为开放式问题的回答生成反馈。该工具能快速提供个性化反馈,使学生能够及时检验知识并发现改进方向。我们提供了开源参考实现,包括一个 Web 应用和一个可配合教学编程或数学笔记本使用的 Jupyter Notebook 小组件。在教师的引导下,LLMs 有望提升学生的学习成果并改进教学方法。
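
A hedged sketch of the grading loop such a tool needs, assuming the 2023-era `openai` Python package (with an API key configured). The prompt wording and criteria below are invented for illustration and are not the tool's actual templates; those live in the linked llm4teach-freetext-server repository.

```python
# Grade a student's open-ended answer against instructor-defined criteria.
import openai  # assumes openai<1.0 and openai.api_key is set

def feedback(question: str, criteria: str, student_answer: str) -> str:
    prompt = (
        "You are a teaching assistant. Grade the following response.\n"
        f"Question: {question}\n"
        f"Instructor criteria: {criteria}\n"
        f"Student answer: {student_answer}\n"
        "Give brief, constructive feedback and note missing points."
    )
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp["choices"][0]["message"]["content"]

print(feedback(
    question="Why does gradient descent use the negative gradient?",
    criteria="Mentions steepest descent and the role of the learning rate.",
    student_answer="Because the gradient points uphill.",
))
```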

Is GPT a Computational Model of Emotion? Detailed Analysis

  • paper_url: http://arxiv.org/abs/2307.13779
  • repo_url: None
  • paper_authors: Ala N. Tak, Jonathan Gratch
  • for: 这篇论文探讨 GPT 家族大语言模型的情感理解能力。
  • methods: 论文首先研究 GPT 如何对自传式记忆进行情感推理,然后系统地改变情境的各个方面,以考察其对情绪强度和应对倾向的影响。
  • results: 研究发现,在不使用提示工程的情况下,GPT 的预测与人类给出的评估和情感标签高度一致;但 GPT 在预测情绪强度和应对反应方面存在困难。GPT-4 在第一项研究中表现最佳,但在第二项研究中表现欠佳,尽管经过少量提示工程后结果有明显改善。这些研究提出了如何有效利用这些模型的长处并弥补其短处(尤其是回答的不稳定性)的问题。
    Abstract This paper investigates the emotional reasoning abilities of the GPT family of large language models via a component perspective. The paper first examines how the model reasons about autobiographical memories. Second, it systematically varies aspects of situations to impact emotion intensity and coping tendencies. Even without the use of prompt engineering, it is shown that GPT's predictions align significantly with human-provided appraisals and emotional labels. However, GPT faces difficulties predicting emotion intensity and coping responses. GPT-4 showed the highest performance in the initial study but fell short in the second, despite providing superior results after minor prompt engineering. This assessment brings up questions on how to effectively employ the strong points and address the weak areas of these models, particularly concerning response variability. These studies underscore the merits of evaluating models from a componential perspective.
    摘要 本文从组件视角考察 GPT 系列大语言模型的情感推理能力。论文首先检验模型如何对自传式记忆进行推理;其次,系统地改变情境的各个方面,以影响情绪强度和应对倾向。研究表明,即使不使用提示工程,GPT 的预测也与人类给出的评估和情感标签显著一致;但 GPT 在预测情绪强度和应对反应方面存在困难。GPT-4 在第一项研究中表现最佳,但在第二项研究中表现不佳,尽管经过少量提示工程后给出了更好的结果。这一评估引出了如何有效发挥这些模型的长处并解决其短处(尤其是回答的不稳定性)的问题。这些研究也印证了从组件视角评估模型的价值。

An Empirical Study on Bugs Inside PyTorch: A Replication Study

  • paper_url: http://arxiv.org/abs/2307.13777
  • repo_url: None
  • paper_authors: Sharon Chee Yin Ho, Vahid Majdinasab, Mohayeminul Islam, Diego Elias Costa, Emad Shihab, Foutse Khomh, Sarah Nadi, Muhammad Raza
  • for: 本研究旨在探讨PyTorch库中的bug标识和修复过程,以便更好地理解深度学习库中bug的特点和影响。
  • methods: 本研究采用了对PyTorch库的开发过程中发现的bug进行分析,并对bug的原因和表现特征进行描述,以及分析bug修复的方法。
  • results: 研究发现,PyTorch库中的bug更像传统软件项目中的bug,而不是深度学习特有的问题。此外,本研究还对TensorFlow库的bug标识和修复过程进行了比较,探讨了两个库之间的相似性和差异。
    Abstract Software systems are increasingly relying on deep learning components, due to their remarkable capability of identifying complex data patterns and powering intelligent behaviour. A core enabler of this change in software development is the availability of easy-to-use deep learning libraries. Libraries like PyTorch and TensorFlow empower a large variety of intelligent systems, offering a multitude of algorithms and configuration options, applicable to numerous domains of systems. However, bugs in those popular deep learning libraries also may have dire consequences for the quality of systems they enable; thus, it is important to understand how bugs are identified and fixed in those libraries. Inspired by a study of Jia et al., which investigates the bug identification and fixing process at TensorFlow, we characterize bugs in the PyTorch library, a very popular deep learning framework. We investigate the causes and symptoms of bugs identified during PyTorch's development, and assess their locality within the project, and extract patterns of bug fixes. Our results highlight that PyTorch bugs are more like traditional software projects bugs, than related to deep learning characteristics. Finally, we also compare our results with the study on TensorFlow, highlighting similarities and differences across the bug identification and fixing process.
    摘要 由于深度学习组件在识别复杂数据模式、支撑智能行为方面能力出众,软件系统正越来越多地依赖它们。推动这一软件开发变革的核心因素是易用的深度学习库:PyTorch 和 TensorFlow 等库支撑着种类繁多的智能系统,为众多领域提供了大量算法和配置选项。然而,这些流行深度学习库中的 bug 也可能给其支撑的系统质量带来严重后果,因此理解这些库中 bug 的识别与修复过程十分重要。受 Jia 等人对 TensorFlow 中 bug 识别与修复过程研究的启发,我们对非常流行的深度学习框架 PyTorch 中的 bug 进行了刻画。我们调查了 PyTorch 开发过程中发现的 bug 的成因与症状,评估了它们在项目中的位置,并提取了 bug 修复的模式。结果表明,PyTorch 的 bug 更接近传统软件项目的 bug,而非深度学习所特有的问题。最后,我们还将结果与 TensorFlow 的研究进行了对比,指出了两者在 bug 识别与修复过程上的异同。

Combating the Curse of Multilinguality in Cross-Lingual WSD by Aligning Sparse Contextualized Word Representations

  • paper_url: http://arxiv.org/abs/2307.13776
  • repo_url: https://github.com/begab/sparsity_makes_sense
  • paper_authors: Gábor Berend
  • for: 本研究旨在将大型预训练单语言语言模型与上下文化映射机制相结合,用于跨语言零样本词义消歧(WSD)。
  • methods: 本研究通过词典学习过程获取稀疏的上下文化词表示,并结合大型预训练单语言语言模型进行跨语言零样本词义消歧。
  • results: 实验结果表明,上述改进在 17 种类型学上多样的目标语言上带来了显著提升,平均 F1 分数从 62.0 提高到 68.5(提升近 6.5 个点)。
    Abstract In this paper, we advocate for using large pre-trained monolingual language models in cross lingual zero-shot word sense disambiguation (WSD) coupled with a contextualized mapping mechanism. We also report rigorous experiments that illustrate the effectiveness of employing sparse contextualized word representations obtained via a dictionary learning procedure. Our experimental results demonstrate that the above modifications yield a significant improvement of nearly 6.5 points of increase in the average F-score (from 62.0 to 68.5) over a collection of 17 typologically diverse set of target languages. We release our source code for replicating our experiments at https://github.com/begab/sparsity_makes_sense.
    摘要 在这篇论文中,我们主张在跨语言零样本词义消歧(WSD)中使用大型预训练单语言语言模型,并结合上下文化映射机制。我们还报告了严格的实验,说明了使用通过词典学习过程获得的稀疏上下文化词表示的有效性。实验结果表明,上述改进在 17 种类型学上多样的目标语言集合上带来了近 6.5 个点的显著提升(平均 F 分数从 62.0 提高到 68.5)。我们在 https://github.com/begab/sparsity_makes_sense 上发布了源代码,以便复现我们的实验。
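
To illustrate the dictionary-learning step the abstract refers to, the sketch below decomposes stand-in contextualized embeddings into sparse codes with scikit-learn. The dictionary size, sparsity penalty, and random vectors are illustrative assumptions; real embeddings would come from a pretrained monolingual model.

```python
# Sparse coding of contextual embeddings over a learned dictionary.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))        # stand-in for contextual embeddings

dl = DictionaryLearning(
    n_components=128,                 # overcomplete dictionary
    transform_algorithm="lasso_lars",
    transform_alpha=0.1,              # sparsity strength (illustrative)
    max_iter=10,
    random_state=0,
)
codes = dl.fit_transform(X)           # sparse contextualized representations
print(codes.shape, np.mean(codes != 0))  # dimensionality and code density
```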

E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning

  • paper_url: http://arxiv.org/abs/2307.13770
  • repo_url: https://github.com/chenghan111/e2vpt
  • paper_authors: Cheng Han, Qifan Wang, Yiming Cui, Zhiwen Cao, Wenguan Wang, Siyuan Qi, Dongfang Liu
  • for: 这篇论文旨在提出一种有效且高效的大规模 Transformer 模型适配方法,以减少微调中的可调参数数量。
  • methods: 论文使用参数高效学习技术,在自注意力层和输入层分别引入可学习的键值提示(key-value prompts)和视觉提示(visual prompts),以提升模型的适配能力;此外还提出了提示剪枝流程,系统地删除低重要性的提示,以提高模型效率。
  • results: 实验结果显示,该方法在两个基准上优于多个现有基线,且仅使用模型约 0.32% 的参数量。
    Abstract As the size of transformer-based models continues to grow, fine-tuning these large-scale pretrained vision models for new tasks has become increasingly parameter-intensive. Parameter-efficient learning has been developed to reduce the number of tunable parameters during fine-tuning. Although these methods show promising results, there is still a significant performance gap compared to full fine-tuning. To address this challenge, we propose an Effective and Efficient Visual Prompt Tuning (E^2VPT) approach for large-scale transformer-based model adaptation. Specifically, we introduce a set of learnable key-value prompts and visual prompts into self-attention and input layers, respectively, to improve the effectiveness of model fine-tuning. Moreover, we design a prompt pruning procedure to systematically prune low importance prompts while preserving model performance, which largely enhances the model's efficiency. Empirical results demonstrate that our approach outperforms several state-of-the-art baselines on two benchmarks, with considerably low parameter usage (e.g., 0.32% of model parameters on VTAB-1k). Our code is available at https://github.com/ChengHan111/E2VPT.
    摘要 随着基于 Transformer 的模型规模不断增长,为新任务微调这些大规模预训练视觉模型变得越来越耗费参数。参数高效学习旨在减少微调过程中的可调参数数量。尽管这些方法展现出有前景的结果,但与完全微调相比仍存在明显的性能差距。为了解决这一挑战,我们提出了一种有效且高效的视觉提示微调方法(E^2VPT),用于大规模 Transformer 模型的适配。具体而言,我们在自注意力层和输入层分别引入一组可学习的键值提示和视觉提示,以提升模型微调的有效性。此外,我们设计了提示剪枝流程,在保持模型性能的同时系统地剪除低重要性的提示,大幅提升了模型效率。实验结果表明,我们的方法在两个基准上优于多个最先进的基线,且参数使用量极低(例如在 VTAB-1k 上仅为模型参数的 0.32%)。代码见 https://github.com/ChengHan111/E2VPT。
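
The sketch below shows the input-level half of the idea: learnable prompt tokens prepended to a frozen encoder's input, with only the prompts and classifier head trained. The key-value prompts inside self-attention and the pruning procedure that distinguish E^2VPT are omitted; all dimensions are illustrative.

```python
# Input-level visual prompt tuning on a frozen transformer encoder.
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    def __init__(self, dim=64, n_prompts=8, n_classes=10):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        for p in self.backbone.parameters():
            p.requires_grad = False           # frozen pretrained weights
        self.prompts = nn.Parameter(torch.randn(1, n_prompts, dim) * 0.02)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, tokens):                # tokens: (B, N, dim) patch embeddings
        b = tokens.size(0)
        x = torch.cat([self.prompts.expand(b, -1, -1), tokens], dim=1)
        return self.head(self.backbone(x).mean(dim=1))

model = PromptedEncoder()
logits = model(torch.randn(4, 16, 64))        # 4 images, 16 patch tokens each
```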

ClusterSeq: Enhancing Sequential Recommender Systems with Clustering based Meta-Learning

  • paper_url: http://arxiv.org/abs/2307.13766
  • repo_url: None
  • paper_authors: Mohammmadmahdi Maheri, Reza Abdollahzadeh, Bardia Mohammadi, Mina Rafiei, Jafar Habibi, Hamid R. Rabiee
  • for: 解决用户冷启动问题,提高顺序推荐系统的效果。
  • methods: 基于聚类的元学习顺序推荐系统(ClusterSeq),利用用户序列中的动态信息提高物品预测精度。
  • results: 与多种最先进的元学习推荐器相比,ClusterSeq 显示出更高的预测精度,尤其是对“小用户”的预测。
    Abstract In practical scenarios, the effectiveness of sequential recommendation systems is hindered by the user cold-start problem, which arises due to limited interactions for accurately determining user preferences. Previous studies have attempted to address this issue by combining meta-learning with user and item-side information. However, these approaches face inherent challenges in modeling user preference dynamics, particularly for "minor users" who exhibit distinct preferences compared to more common or "major users." To overcome these limitations, we present a novel approach called ClusterSeq, a Meta-Learning Clustering-Based Sequential Recommender System. ClusterSeq leverages dynamic information in the user sequence to enhance item prediction accuracy, even in the absence of side information. This model preserves the preferences of minor users without being overshadowed by major users, and it capitalizes on the collective knowledge of users within the same cluster. Extensive experiments conducted on various benchmark datasets validate the effectiveness of ClusterSeq. Empirical results consistently demonstrate that ClusterSeq outperforms several state-of-the-art meta-learning recommenders. Notably, compared to existing meta-learning methods, our proposed approach achieves a substantial improvement of 16-39% in Mean Reciprocal Rank (MRR).
    摘要 在实际应用场景中,顺序推荐系统的效果受到用户冷启动问题的制约:用户与物品的交互有限,难以准确确定用户偏好。先前的研究尝试将元学习与用户侧和物品侧信息相结合来解决这一问题,但这些方法在建模用户偏好动态方面存在固有困难,尤其是对偏好与常见“主要用户”明显不同的“次要用户”。为了克服这些局限,我们提出了一种新方法 ClusterSeq,即基于聚类的元学习顺序推荐系统。ClusterSeq 利用用户序列中的动态信息来提升物品预测精度,即使在没有侧信息的情况下也是如此。该模型保留了次要用户的偏好,使其不被主要用户掩盖,并利用同一聚类内用户的集体知识。在多个基准数据集上的大量实验验证了 ClusterSeq 的有效性:实验结果一致表明,ClusterSeq 优于多种最先进的元学习推荐器,与现有元学习方法相比,在平均倒数排名(MRR)上取得了 16-39% 的显著提升。
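
A minimal sketch of the clustering ingredient such a system relies on: summarize each user's interaction sequence into an embedding and cluster users, so that sparse-history users can borrow cluster-level knowledge. Mean-pooling and KMeans are stand-in assumptions for brevity, not the paper's exact components.

```python
# Cluster users by an embedding of their interaction sequence.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
item_emb = rng.normal(size=(1000, 32))                  # item embedding table

def user_embedding(item_ids):
    return item_emb[item_ids].mean(axis=0)              # mean-pool the sequence

user_seqs = [rng.integers(0, 1000, size=rng.integers(3, 20)) for _ in range(500)]
U = np.stack([user_embedding(s) for s in user_seqs])

clusters = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(U)
# Minor users keep their own preference signal but can share their
# cluster's collective statistics when their own history is too short.
print(np.bincount(clusters))
```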

Implicitly Normalized Explicitly Regularized Density Estimation

  • paper_url: http://arxiv.org/abs/2307.13763
  • repo_url: None
  • paper_authors: Mark Kozdoba, Binyamin Perets, Shie Mannor
  • for: 本文提出了一种新的非参数密度估计方法,该方法基于对密度的 Sobolev 范数进行正则化。
  • methods: 与核密度估计不同,该方法使模型的偏差清晰且可解释。虽然所关联的核不存在闭式解析形式,但可以通过采样来近似。相应的优化问题是非凸的,标准梯度方法表现不佳;但论文表明,通过适当的初始化并使用自然梯度,可以获得表现良好的解。
  • results: 该方法得到的是未归一化的密度,因而无法使用对数似然进行交叉验证;但论文表明可以改用基于 Fisher 散度的 Score Matching 方法完成这一任务。在最新的综合异常检测基准 ADBench 上的评估显示,该方法在 15 种以上算法中排名第二。
    Abstract We propose a new approach to non-parametric density estimation, that is based on regularizing a Sobolev norm of the density. This method is provably different from Kernel Density Estimation, and makes the bias of the model clear and interpretable. While there is no closed analytic form for the associated kernel, we show that one can approximate it using sampling. The optimization problem needed to determine the density is non-convex, and standard gradient methods do not perform well. However, we show that with an appropriate initialization and using natural gradients, one can obtain well performing solutions. Finally, while the approach provides unnormalized densities, which prevents the use of log-likelihood for cross validation, we show that one can instead adapt Fisher Divergence based Score Matching methods for this task. We evaluate the resulting method on the comprehensive recent Anomaly Detection benchmark suite, ADBench, and find that it ranks second best, among more than 15 algorithms.
    摘要 我们提出了一种新的非参数密度估计方法,基于对密度的 Sobolev 范数进行正则化。可以证明,该方法不同于核密度估计,并使模型的偏差清晰且可解释。虽然所关联的核没有闭式解析形式,但我们表明可以通过采样来近似它。确定密度所需求解的优化问题是非凸的,标准梯度方法表现不佳;然而我们表明,通过适当的初始化并使用自然梯度,可以获得表现良好的解。最后,虽然该方法给出的是未归一化的密度,无法使用对数似然进行交叉验证,但我们表明可以改用基于 Fisher 散度的 Score Matching 方法完成这一任务。我们在最新的综合异常检测基准 ADBench 上评估了所得方法,它在 15 种以上算法中排名第二。
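
Since the paper evaluates unnormalized densities with Fisher-divergence-based score matching, here is a compact sketch of the Hyvärinen score matching objective on a 1-D toy problem. It needs only the unnormalized log-density, which is exactly what makes it usable for cross-validation here; the Gaussian family below is purely illustrative.

```python
# Hyvarinen score matching in 1-D:
#   J = E_x[ 0.5 * (d/dx log p(x))**2 + d^2/dx^2 log p(x) ],
# computable for unnormalized log-densities (constants drop out).
import torch

def hyvarinen_score(log_p, x):
    x = x.detach().requires_grad_(True)
    s = torch.autograd.grad(log_p(x).sum(), x, create_graph=True)[0]   # score
    ds = torch.autograd.grad(s.sum(), x, create_graph=True)[0]         # its derivative
    return (0.5 * s**2 + ds).mean()

# Unnormalized log-density of N(0, sigma^2); the constant offset is irrelevant.
def make_log_p(sigma):
    return lambda x: -0.5 * (x / sigma) ** 2

data = torch.randn(10_000)              # samples from N(0, 1)
for sigma in (0.5, 1.0, 2.0):
    print(sigma, hyvarinen_score(make_log_p(sigma), data).item())
# The objective is minimized near sigma = 1, the true model.
```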

Training-based Model Refinement and Representation Disagreement for Semi-Supervised Object Detection

  • paper_url: http://arxiv.org/abs/2307.13755
  • repo_url: None
  • paper_authors: Seyed Mojtaba Marvasti-Zadeh, Nilanjan Ray, Nadir Erbilgin
  • for: 通过利用有限的标注数据和大量无标注数据进行半监督目标检测,提高现有目标检测器的性能和泛化能力。
  • methods: 提出了一种新的基于训练的模型精炼(TMR)阶段和一种简单而有效的表示分歧(RD)策略,以解决经典 EMA 策略的局限以及教师-学生模型在训练后期趋同(失去各自独特性)的问题。
  • results: 与现有 SSOD 方法相比,所提方法在 COCO-standard、COCO-additional 和 Pascal VOC 数据集上取得了更高的性能;具体而言,相比基线 Unbiased-Teacher-v2(及 Unbiased-Teacher-v1),平均 mAP 分别提升 2.23、2.1 和 3.36(及 2.07、1.9 和 3.27)。
    Abstract Semi-supervised object detection (SSOD) aims to improve the performance and generalization of existing object detectors by utilizing limited labeled data and extensive unlabeled data. Despite many advances, recent SSOD methods are still challenged by inadequate model refinement using the classical exponential moving average (EMA) strategy, the consensus of Teacher-Student models in the latter stages of training (i.e., losing their distinctiveness), and noisy/misleading pseudo-labels. This paper proposes a novel training-based model refinement (TMR) stage and a simple yet effective representation disagreement (RD) strategy to address the limitations of classical EMA and the consensus problem. The TMR stage of Teacher-Student models optimizes the lightweight scaling operation to refine the model's weights and prevent overfitting or forgetting learned patterns from unlabeled data. Meanwhile, the RD strategy helps keep these models diverged to encourage the student model to explore complementary representations. Our approach can be integrated into established SSOD methods and is empirically validated using two baseline methods, with and without cascade regression, to generate more reliable pseudo-labels. Extensive experiments demonstrate the superior performance of our approach over state-of-the-art SSOD methods. Specifically, the proposed approach outperforms the baseline Unbiased-Teacher-v2 (& Unbiased-Teacher-v1) method by an average mAP margin of 2.23, 2.1, and 3.36 (& 2.07, 1.9, and 3.27) on COCO-standard, COCO-additional, and Pascal VOC datasets, respectively.
    摘要 半监督目标检测(SSOD)旨在利用有限的标注数据和大量无标注数据,提高现有目标检测器的性能和泛化能力。尽管已有诸多进展,近期的 SSOD 方法仍面临以下挑战:经典指数移动平均(EMA)策略带来的模型精炼不足、教师-学生模型在训练后期趋同(失去各自独特性),以及含噪或误导性的伪标签。本文提出了一种新的基于训练的模型精炼(TMR)阶段和一种简单而有效的表示分歧(RD)策略,以应对经典 EMA 的局限和趋同问题。教师-学生模型的 TMR 阶段通过优化轻量级缩放操作来精炼模型权重,防止过拟合或遗忘从无标注数据中学到的模式;RD 策略则保持两个模型的差异性,鼓励学生模型探索互补的表示。我们的方法可以集成到现有 SSOD 方法中,并在有无级联回归的两种基线方法上进行了实证验证,以生成更可靠的伪标签。大量实验表明,我们的方法优于最先进的 SSOD 方法:相比基线 Unbiased-Teacher-v2(及 Unbiased-Teacher-v1),在 COCO-standard、COCO-additional 和 Pascal VOC 数据集上分别取得 2.23、2.1 和 3.36(及 2.07、1.9 和 3.27)的平均 mAP 提升。

Benchmarking and Analyzing Generative Data for Visual Recognition

  • paper_url: http://arxiv.org/abs/2307.13697
  • repo_url: https://github.com/Luodian/GenBench
  • paper_authors: Bo Li, Haotian Liu, Liangyu Chen, Yong Jae Lee, Chunyuan Li, Ziwei Liu
  • for: 本研究探讨大型预训练生成模型在视觉识别中的潜在作用,主要比较三种不同的数据来源(生成、检索、原始)。
  • methods: 我们提出了一个广泛的基准套件 GenBench,包含 22 个数据集和 2548 个类别,用于评估生成数据在各类视觉识别任务中的表现;并提出了一个无需训练的指标 CLER,用于在训练前评估生成数据对识别任务的效果。
  • results: 研究发现,生成数据在许多视觉识别任务中表现出独特的优势,并且可以通过文本反演(Textual Inversion)注入外部知识来提升性能。
    Abstract Advancements in large pre-trained generative models have expanded their potential as effective data generators in visual recognition. This work delves into the impact of generative images, primarily comparing paradigms that harness external data (i.e., generative vs. retrieval vs. original). Our key contributions are: 1) GenBench Construction: We devise GenBench, a broad benchmark comprising 22 datasets with 2548 categories, to appraise generative data across various visual recognition tasks. 2) CLER Score: To address the insufficient correlation of existing metrics (e.g., FID, CLIP score) with downstream recognition performance, we propose CLER, a training-free metric indicating generative data's efficiency for recognition tasks prior to training. 3) New Baselines: Comparisons of generative data with retrieved data from the same external pool help to elucidate the unique traits of generative data. 4) External Knowledge Injection: By fine-tuning special token embeddings for each category via Textual Inversion, performance improves across 17 datasets, except when dealing with low-resolution reference images. Our exhaustive benchmark and analysis spotlight generative data's promise in visual recognition, while identifying key challenges for future investigation.
    摘要 大型预训练生成模型的进步扩展了它们作为视觉识别中有效数据生成器的潜力。本工作深入探讨生成图像的影响,主要比较利用外部数据的不同范式(即生成 vs 检索 vs 原始)。我们的主要贡献包括:1)GenBench 的构建:我们设计了包含 22 个数据集、2548 个类别的广泛基准 GenBench,用于在各类视觉识别任务中评估生成数据。2)CLER 分数:针对现有指标(如 FID、CLIP 分数)与下游识别性能相关性不足的问题,我们提出了 CLER,一种无需训练的指标,可在训练前衡量生成数据对识别任务的效率。3)新的基线:将生成数据与来自同一外部数据池的检索数据进行比较,有助于阐明生成数据的独特性质。4)外部知识注入:通过文本反演(Textual Inversion)为每个类别微调特殊 token 嵌入,在 17 个数据集上提升了性能,但在处理低分辨率参考图像时例外。我们详尽的基准和分析凸显了生成数据在视觉识别中的前景,同时指出了未来研究的关键挑战。

Foundational Models Defining a New Era in Vision: A Survey and Outlook

  • paper_url: http://arxiv.org/abs/2307.13721
  • repo_url: https://github.com/awaisrauf/awesome-cv-foundational-models
  • paper_authors: Muhammad Awais, Muzammal Naseer, Salman Khan, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, Fahad Shahbaz Khan
  • for: foundational models for computer vision tasks, such as segmentation, object detection, and image/video captioning, are reviewed in this paper.
  • methods: the paper discusses various architecture designs, training objectives, pre-training datasets, fine-tuning mechanisms, and prompting patterns used in foundational models.
  • results: the paper reviews recent developments in foundational models and their applications in computer vision tasks, including their ability to generalize to new scenes and tasks, their contextual understanding, and their limitations in real-world environments.
    Abstract Vision systems to see and reason about the compositional nature of visual scenes are fundamental to understanding our world. The complex relations between objects and their locations, ambiguities, and variations in the real-world environment can be better described in human language, naturally governed by grammatical rules and other modalities such as audio and depth. The models learned to bridge the gap between such modalities coupled with large-scale training data facilitate contextual reasoning, generalization, and prompt capabilities at test time. These models are referred to as foundational models. The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene or manipulating the robot's behavior through language instructions. In this survey, we provide a comprehensive review of such emerging foundational models, including typical architecture designs to combine different modalities (vision, text, audio, etc), training objectives (contrastive, generative), pre-training datasets, fine-tuning mechanisms, and the common prompting patterns; textual, visual, and heterogeneous. We discuss the open challenges and research directions for foundational models in computer vision, including difficulties in their evaluations and benchmarking, gaps in their real-world understanding, limitations of their contextual understanding, biases, vulnerability to adversarial attacks, and interpretability issues. We review recent developments in this field, covering a wide range of applications of foundation models systematically and comprehensively. A comprehensive list of foundational models studied in this work is available at \url{https://github.com/awaisrauf/Awesome-CV-Foundational-Models}.
    摘要 视觉系统能够理解和描述视觉场景的compositional性是理解我们世界的基本要求。实际环境中 объектов和他们的位置之间的复杂关系、歧义和变化可以更好地用人类语言来描述,这些语言自然受到语法规则和其他模态的限制。通过大规模的训练数据和模型学习,可以bridge这些模式之间的差异,实现上下文理解、泛化和提示能力。这些模型被称为基础模型。基础模型的输出可以通过人提供的提示进行修改,例如提供 bounding box 来 segment particular object,或者通过问题提问来进行互动对话,或者通过语言指令来控制机器人的行为。在这篇评论中,我们提供了基础模型的广泛和系统性的 Review,包括不同模式结合(视觉、文本、音频等)、训练目标(对比、生成)、预训练数据集、练习机制和常见的提示模式(文本、视觉、混合)。我们还讨论了基础模型在计算机视觉领域的开放挑战和研究方向,包括评价和测试 benchmarking 困难、实际世界理解的差距、上下文理解的局限性、偏见、攻击性和可读性问题。我们还综述了该领域最新的发展,涵盖了基础模型的各种应用,从系统性和完整性来评价。基础模型的完整列表可以在 \url{https://github.com/awaisrauf/Awesome-CV-Foundational-Models} 上查看。

Composite Diffusion | whole >= Σparts

  • paper_url: http://arxiv.org/abs/2307.13720
  • repo_url: None
  • paper_authors: Vikram Jamwal, Ramaneswaran S
  • for: 这篇论文旨在提供一种基于文本扩散模型的高质量图像生成方法,帮助艺术家和平面设计师更好地控制图像的空间布局。
  • methods: 该方法(Composite Diffusion)让艺术家通过自由形式的分割布局将多个子场景组合成完整图像;在此过程中,艺术家可以用自然语言描述每个子场景的内容,并可通过参考图像或线稿、涂鸦、人体姿态、边缘图等控制输入来调整子场景的生成、组合与协调。
  • results: 结果表明,该方法能够生成高质量图像,并使艺术家获得更强的空间、语义和创作控制;针对现有图像质量指标无法全面评估合成图像的问题,论文还提出了与合成生成特别相关的新质量标准。
    Abstract For an artist or a graphic designer, the spatial layout of a scene is a critical design choice. However, existing text-to-image diffusion models provide limited support for incorporating spatial information. This paper introduces Composite Diffusion as a means for artists to generate high-quality images by composing from the sub-scenes. The artists can specify the arrangement of these sub-scenes through a flexible free-form segment layout. They can describe the content of each sub-scene primarily using natural text and additionally by utilizing reference images or control inputs such as line art, scribbles, human pose, canny edges, and more. We provide a comprehensive and modular method for Composite Diffusion that enables alternative ways of generating, composing, and harmonizing sub-scenes. Further, we wish to evaluate the composite image for effectiveness in both image quality and achieving the artist's intent. We argue that existing image quality metrics lack a holistic evaluation of image composites. To address this, we propose novel quality criteria especially relevant to composite generation. We believe that our approach provides an intuitive method of art creation. Through extensive user surveys, quantitative and qualitative analysis, we show how it achieves greater spatial, semantic, and creative control over image generation. In addition, our methods do not need to retrain or modify the architecture of the base diffusion models and can work in a plug-and-play manner with the fine-tuned models.
    摘要 对艺术家或平面设计师而言,场景的空间布局是关键的设计选择。然而,现有的文本到图像扩散模型对空间信息的支持有限。本文提出 Composite Diffusion,让艺术家通过组合子场景来生成高质量图像。艺术家可以通过灵活的自由形式分割布局指定这些子场景的排布,主要用自然语言描述每个子场景的内容,还可以借助参考图像或线稿、涂鸦、人体姿态、边缘图等控制输入。我们为 Composite Diffusion 提供了一套完整且模块化的方法,支持生成、组合与协调子场景的多种方式。此外,我们希望从图像质量和实现艺术家意图两方面评估合成图像的效果。我们认为现有图像质量指标缺乏对合成图像的整体评估,为此提出了与合成生成特别相关的新质量标准。我们相信该方法提供了一种直观的艺术创作方式。通过大量用户调查以及定量和定性分析,我们展示了它如何在图像生成上实现更强的空间、语义和创作控制。此外,我们的方法无需重新训练或修改基础扩散模型的架构,可与微调后的模型即插即用。

The Visual Language of Fabrics

  • paper_url: http://arxiv.org/abs/2307.13681
  • repo_url: None
  • paper_authors: Valentin Deschaintre, Julia Guerrero-Viu, Diego Gutierrez, Tamy Boubekeur, Belen Masia
  • for: 本研究准备了一个名为text2fabric的新数据集,该数据集将自然语言描述与不同的织物材质图像联系起来。
  • methods: 研究人员使用自然语言描述来描述织物的外观,并分析了数据集,从中提取了一个紧凑的词汇、属性和结构,以便更好地理解人们如何描述织物。
  • results: 研究人员通过使用text2fabric数据集,可以准确地理解织物的描述,并且可以使用这些描述来特化大型视觉语言模型,例如CLIP,以创建一个有意义的潜在空间,并提高物料检索和自动标注等应用。
    Abstract We introduce text2fabric, a novel dataset that links free-text descriptions to various fabric materials. The dataset comprises 15,000 natural language descriptions associated to 3,000 corresponding images of fabric materials. Traditionally, material descriptions come in the form of tags/keywords, which limits their expressivity, induces pre-existing knowledge of the appropriate vocabulary, and ultimately leads to a chopped description system. Therefore, we study the use of free-text as a more appropriate way to describe material appearance, taking the use case of fabrics as a common item that non-experts may often deal with. Based on the analysis of the dataset, we identify a compact lexicon, set of attributes and key structure that emerge from the descriptions. This allows us to accurately understand how people describe fabrics and draw directions for generalization to other types of materials. We also show that our dataset enables specializing large vision-language models such as CLIP, creating a meaningful latent space for fabric appearance, and significantly improving applications such as fine-grained material retrieval and automatic captioning.
    摘要 我们介绍text2fabric数据集,这是一个新的数据集,将自然语言描述与各种织物材质相关联。该数据集包含15,000个自然语言描述和3,000个相应的织物图像。传统上,材质描述通常以标签/关键词的形式出现,这限制了其表达能力,需要先采用适当的词汇库,并最终导致描述系统被剪辑。因此,我们研究使用自然语言来更好地描述材质外观,以织物作为非专家通常处理的常见物品为例。基于数据集的分析,我们标识出了一个紧凑的词汇集、属性集和关键结构,这些元素允许我们准确地理解人们如何描述织物,并提供了泛化到其他材质的方向。此外,我们还示出了使用text2fabric数据集可以特化大型视觉语言模型,创造出meaningful的织物外观空间,并显著提高了材质 Retrieval和自动标题等应用。

How Can Large Language Models Help Humans in Design and Manufacturing?

  • paper_url: http://arxiv.org/abs/2307.14377
  • repo_url: None
  • paper_authors: Liane Makatura, Michael Foshey, Bohan Wang, Felix HähnLein, Pingchuan Ma, Bolei Deng, Megan Tjandrasuwita, Andrew Spielberg, Crystal Elaine Owens, Peter Yichen Chen, Allan Zhao, Amy Zhu, Wil J Norton, Edward Gu, Joshua Jacob, Yifei Li, Adriana Schulz, Wojciech Matusik
  • for: investigate the application of Large Language Models (LLMs) in generative design across the entire design and manufacturing workflow.
  • methods: convert text-based prompts into design specifications, transform designs into manufacturing instructions, produce design spaces and variations, compute design performance, and search for designs based on performance.
  • results: highlight both the benefits and limitations of current LLMs through a series of examples, with the goal of catalyzing continued improvement and progression of these models.
    Abstract The advancement of Large Language Models (LLMs), including GPT-4, provides exciting new opportunities for generative design. We investigate the application of this tool across the entire design and manufacturing workflow. Specifically, we scrutinize the utility of LLMs in tasks such as: converting a text-based prompt into a design specification, transforming a design into manufacturing instructions, producing a design space and design variations, computing the performance of a design, and searching for designs predicated on performance. Through a series of examples, we highlight both the benefits and the limitations of the current LLMs. By exposing these limitations, we aspire to catalyze the continued improvement and progression of these models.
    摘要 大语言模型(LLM)的发展,包括GPT-4,为生成设计带来了新的机遇。我们对整个设计和生产工作流程中的应用进行调查。具体来说,我们分析LLM在以下任务中的用途:将文本提示转换成设计规范,将设计转换成生产指令,生成设计空间和设计变化,计算设计的性能,以及基于性能搜索设计。通过一些例子,我们显示了当前LLM的优势和局限性。通过暴露这些局限性,我们希望能够促进这些模型的持续改进和进步。

FedDRL: A Trustworthy Federated Learning Model Fusion Method Based on Staged Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2307.13716
  • repo_url: None
  • paper_authors: Leiming Chen, Cihao Dong, Sibo Qiao, Ziling Huang, Kai Wang, Yuming Nie, Zhaoxiang Hou, Cheewei Tan
  • for: 解决 federated learning 中 client 模型质量不均匀和恶意上传模型导致全局模型精度下降的问题。
  • methods: 提出了一种基于 reinforcement learning 的模型融合方法,包括两个阶段:第一阶段是过滤恶意模型并选择可信客户端模型参与融合,第二阶段是自适应调整可信客户端模型的权重并进行最佳全局模型融合。
  • results: 论文定义了五种模型融合场景并与两种基线算法进行比较;结果表明,该算法在保持精度的同时,可靠性高于基线算法。
    Abstract Traditional federated learning uses the number of samples to calculate the weights of each client model and uses this fixed weight value to fusion the global model. However, in practical scenarios, each client's device and data heterogeneity leads to differences in the quality of each client's model. Thus the contribution to the global model is not wholly determined by the sample size. In addition, if clients intentionally upload low-quality or malicious models, using these models for aggregation will lead to a severe decrease in global model accuracy. Traditional federated learning algorithms do not address these issues. To solve this probelm, we propose FedDRL, a model fusion approach using reinforcement learning based on a two staged approach. In the first stage, Our method could filter out malicious models and selects trusted client models to participate in the model fusion. In the second stage, the FedDRL algorithm adaptively adjusts the weights of the trusted client models and aggregates the optimal global model. We also define five model fusion scenarios and compare our method with two baseline algorithms in those scenarios. The experimental results show that our algorithm has higher reliability than other algorithms while maintaining accuracy.
    摘要 传统的联邦学习根据样本数计算每个客户端模型的权重,并使用固定的权重值融合全局模型。然而在实际场景中,各客户端的设备和数据异构性会导致客户端模型质量参差不齐,因此对全局模型的贡献并不完全取决于样本数量。此外,如果客户端故意上传低质量或恶意模型,使用这些模型进行聚合会严重降低全局模型的准确率。传统联邦学习算法无法解决这些问题。为此,我们提出了 FedDRL,一种基于分阶段强化学习的模型融合方法。在第一阶段,该方法过滤恶意模型,并挑选可信的客户端模型参与融合;在第二阶段,FedDRL 算法自适应地调整可信客户端模型的权重,并聚合出最优的全局模型。我们还定义了五种模型融合场景,并在这些场景中与两种基线算法进行比较。实验结果表明,我们的算法在保持精度的同时具有比其他算法更高的可靠性。
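
A much-simplified sketch of the two-stage fusion idea: filter suspicious client updates, then aggregate the rest with weights. FedDRL learns both stages with staged reinforcement learning; the distance-based filter and uniform weights below are stand-ins purely for illustration.

```python
# Stage 1: filter outlier client updates; stage 2: weighted aggregation.
import numpy as np

def fuse(client_weights, trust_threshold=2.0):
    W = np.stack(client_weights)                      # (n_clients, n_params)
    center = np.median(W, axis=0)
    dist = np.linalg.norm(W - center, axis=1)
    keep = dist < trust_threshold * np.median(dist)   # stage 1: trust filter
    alphas = np.ones(keep.sum()) / keep.sum()         # stage 2: fusion weights
    return alphas @ W[keep], keep

rng = np.random.default_rng(0)
honest = [rng.normal(0, 0.1, 100) for _ in range(9)]
malicious = [rng.normal(5, 0.1, 100)]                 # a poisoned update
global_model, kept = fuse(honest + malicious)
print(kept)   # the outlier client is excluded from fusion
```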

Towards an AI Accountability Policy

  • paper_url: http://arxiv.org/abs/2307.13658
  • repo_url: None
  • paper_authors: Przemyslaw Grabowicz, Nicholas Perello, Yair Zick
  • for: 这份白皮书是对美国国家电信与信息管理局(NTIA)发布的“AI 问责政策征求意见”的回应。
  • methods: 白皮书为制定 AI 问责政策提供了一系列相互关联的建议。
  • results: 这些建议有助于建立一个可靠、可信、可追责的 AI 问责政策制度。
    Abstract This white paper is a response to the "AI Accountability Policy Request for Comments" by the National Telecommunications and Information Administration of the United States. The question numbers for which comments were requested are provided in superscripts at the end of key sentences answering the respective questions. The white paper offers a set of interconnected recommendations for an AI accountability policy.
    摘要 这份白皮书是对美国国家电信与信息管理局(NTIA)发布的“AI 问责政策征求意见”的回应。回答各问题的关键句末尾以上标形式标注了所回应的问题编号。本白皮书为 AI 问责政策提出了一系列相互关联的建议。

QuickQual: Lightweight, convenient retinal image quality scoring with off-the-shelf pretrained models

  • paper_url: http://arxiv.org/abs/2307.13646
  • repo_url: https://github.com/justinengelmann/quickqual
  • paper_authors: Justin Engelmann, Amos Storkey, Miguel O. Bernabeu
  • for: 这篇论文旨在提出一种新的视网膜图像质量评分(RIQS)方法,以改进当前主流的基于深度学习(DL)的眼底图像质量评分方案。
  • methods: 该方法(QuickQual)由一个经 ImageNet 预训练的现成 Densenet121 骨干网络和一个支持向量机(SVM)分类器组成。
  • results: 该方法在 EyeQ 上达到了新的最先进水平(Accuracy:88.50%,AUC:0.9687),表明 RIQS 可以通过在自然图像上学习的通用感知特征来解决,而无需在大量眼底图像上训练深度学习模型。
    Abstract Image quality remains a key problem for both traditional and deep learning (DL)-based approaches to retinal image analysis, but identifying poor quality images can be time consuming and subjective. Thus, automated methods for retinal image quality scoring (RIQS) are needed. The current state-of-the-art is MCFNet, composed of three Densenet121 backbones each operating in a different colour space. MCFNet, and the EyeQ dataset released by the same authors, was a huge step forward for RIQS. We present QuickQual, a simple approach to RIQS, consisting of a single off-the-shelf ImageNet-pretrained Densenet121 backbone plus a Support Vector Machine (SVM). QuickQual performs very well, setting a new state-of-the-art for EyeQ (Accuracy: 88.50% vs 88.00% for MCFNet; AUC: 0.9687 vs 0.9588). This suggests that RIQS can be solved with generic perceptual features learned on natural images, as opposed to requiring DL models trained on large amounts of fundus images. Additionally, we propose a Fixed Prior linearisation scheme, that converts EyeQ from a 3-way classification to a continuous logistic regression task. For this task, we present a second model, QuickQual MEga Minified Estimator (QuickQual-MEME), that consists of only 10 parameters on top of an off-the-shelf Densenet121 and can distinguish between gradable and ungradable images with an accuracy of 89.18% (AUC: 0.9537). Code and model are available on GitHub: https://github.com/justinengelmann/QuickQual . QuickQual is so lightweight, that the entire inference code (and even the parameters for QuickQual-MEME) is already contained in this paper.
    摘要 图像质量始终是传统方法与基于深度学习(DL)的视网膜图像分析方法共同面临的关键问题,但人工识别低质量图像既耗时又主观,因此需要自动化的视网膜图像质量评分(RIQS)方法。目前的最先进方法是 MCFNet,由三个分别在不同色彩空间运行的 Densenet121 骨干网络组成;MCFNet 及同一作者发布的 EyeQ 数据集是 RIQS 领域的一大进步。我们提出 QuickQual,一种简单的 RIQS 方法,仅由一个现成的 ImageNet 预训练 Densenet121 骨干网络加一个支持向量机(SVM)组成。QuickQual 表现出色,在 EyeQ 上创下新的最先进水平(Accuracy:88.50% vs MCFNet 的 88.00%;AUC:0.9687 vs 0.9588)。这表明 RIQS 可以用在自然图像上学习的通用感知特征解决,而无需在大量眼底图像上训练 DL 模型。此外,我们提出了一种固定先验线性化方案,将 EyeQ 从三分类任务转化为连续的逻辑回归任务;针对该任务,我们提出了第二个模型 QuickQual MEga Minified Estimator(QuickQual-MEME),它在现成 Densenet121 之上仅含 10 个参数,即可以 89.18% 的准确率(AUC:0.9537)区分可评级与不可评级图像。代码与模型见 GitHub:https://github.com/justinengelmann/QuickQual 。QuickQual 如此轻量,以至于完整的推理代码(甚至 QuickQual-MEME 的参数)都已包含在论文中。
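
The recipe is simple enough to sketch directly from the abstract: frozen ImageNet-pretrained Densenet121 features plus an SVM. Input sizes, labels, and SVM hyperparameters below are illustrative stand-ins; see the linked repository for the authors' exact pipeline.

```python
# Frozen Densenet121 features + SVM, in the spirit of QuickQual.
import torch
import torchvision.models as models
from sklearn.svm import SVC

backbone = models.densenet121(weights="IMAGENET1K_V1")
backbone.classifier = torch.nn.Identity()     # keep the 1024-d pooled features
backbone.eval()

@torch.no_grad()
def features(images):                         # images: (B, 3, 224, 224)
    return backbone(images).numpy()

# Stand-in data: real use would load preprocessed fundus images and labels.
X_train = features(torch.randn(16, 3, 224, 224))
y_train = [0, 1] * 8                          # binary stand-in; EyeQ is 3-way
clf = SVC(kernel="rbf", probability=True).fit(X_train, y_train)
print(clf.predict(features(torch.randn(2, 3, 224, 224))))
```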

Safety Margins for Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2307.13642
  • repo_url: None
  • paper_authors: Alexander Grushin, Walt Woods, Alvaro Velasquez, Simon Khan
  • for: 本研究旨在定量识别自主控制器何时即将进入不安全状态,以便及时引入人工监督,例如在货运应用中。
  • methods: 本研究将智能体所处状态的真实危急程度定义为执行若干随机动作后奖励的平均下降量,并利用可实时计算的代理危急度量来推导安全裕度。
  • results: 通过评估 APE-X 和 A3C 在 Atari 环境中学习的策略,研究发现安全裕度能够直接反映智能体的危急程度,并且当智能体接近失败状态时,安全裕度会逐渐减小。
    Abstract Any autonomous controller will be unsafe in some situations. The ability to quantitatively identify when these unsafe situations are about to occur is crucial for drawing timely human oversight in, e.g., freight transportation applications. In this work, we demonstrate that the true criticality of an agent's situation can be robustly defined as the mean reduction in reward given some number of random actions. Proxy criticality metrics that are computable in real-time (i.e., without actually simulating the effects of random actions) can be compared to the true criticality, and we show how to leverage these proxy metrics to generate safety margins, which directly tie the consequences of potentially incorrect actions to an anticipated loss in overall performance. We evaluate our approach on learned policies from APE-X and A3C within an Atari environment, and demonstrate how safety margins decrease as agents approach failure states. The integration of safety margins into programs for monitoring deployed agents allows for the real-time identification of potentially catastrophic situations.
    摘要 任何自主控制器都会在某些情况下不安全。能够定量识别这些不安全情况何时即将发生,对于及时引入人工监督(例如在货运应用中)至关重要。在这项工作中,我们证明智能体所处状态的真实危急程度可以稳健地定义为执行若干随机动作后奖励的平均下降量。可实时计算(即无需实际模拟随机动作的影响)的代理危急指标可与真实危急程度进行比较;我们展示了如何利用这些代理指标生成安全裕度,将潜在错误动作的后果与预期的整体性能损失直接关联起来。我们在 Atari 环境中对 APE-X 和 A3C 学到的策略评估了该方法,并展示了智能体接近失败状态时安全裕度如何减小。将安全裕度集成到已部署智能体的监控程序中,可以实时识别潜在的灾难性情况。
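
A sketch of the paper's notion of true criticality: the expected return lost when the agent is forced to take n random actions instead of following its policy. The toy environment and policy are invented stand-ins; computing this quantity requires simulation, which is why the paper pairs it with cheap run-time proxy metrics.

```python
# True criticality = mean return drop under n initial random actions.
import random
import statistics

class ChainEnv:
    """Toy corridor: reach position 5 for reward 1, within 8 steps."""
    actions = (-1, +1)
    def reset(self):
        self.pos, self.t = 0, 0
        return self.pos
    def step(self, action):
        self.pos += action
        self.t += 1
        done = self.pos == 5 or self.t >= 8
        return self.pos, float(self.pos == 5), done

def rollout_return(env, policy, random_steps=0):
    state, total, done = env.reset(), 0.0, False
    while not done:
        if random_steps > 0:
            action, random_steps = random.choice(env.actions), random_steps - 1
        else:
            action = policy(state)
        state, reward, done = env.step(action)
        total += reward
    return total

def true_criticality(env, policy, n_random=5, trials=100):
    base = statistics.mean(rollout_return(env, policy) for _ in range(trials))
    perturbed = statistics.mean(
        rollout_return(env, policy, n_random) for _ in range(trials))
    return base - perturbed   # large value => the start state is critical

env = ChainEnv()
policy = lambda s: +1                  # always move right
print(true_criticality(env, policy))   # expected margin lost to random actions
```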

GPT-3 Models are Few-Shot Financial Reasoners

  • paper_url: http://arxiv.org/abs/2307.13617
  • repo_url: None
  • paper_authors: Raul Salles de Padua, Imran Qureshi, Mustafa U. Karakaplan
  • for: 本文旨在评估预训练语言模型(尤其是 GPT-3)回答金融问题的性能。
  • methods: 本文结合检索器与逻辑引擎来回答金融问题,并实验了多种提示工程与微调方案。
  • results: 作者发现,独立的检索模型和逻辑引擎是在金融问答任务中取得最先进性能的关键组件;他们在 GPT-3 上改进的提示工程方法无需任何微调即可达到接近最先进的准确率。
    Abstract Financial analysis is an important tool for evaluating company performance. Practitioners work to answer financial questions to make profitable investment decisions, and use advanced quantitative analyses to do so. As a result, Financial Question Answering (QA) is a question answering task that requires deep reasoning about numbers. Furthermore, it is unknown how well pre-trained language models can reason in the financial domain. The current state-of-the-art requires a retriever to collect relevant facts about the financial question from the text and a generator to produce a valid financial program and a final answer. However, recently large language models like GPT-3 have achieved state-of-the-art performance on wide variety of tasks with just a few shot examples. We run several experiments with GPT-3 and find that a separate retrieval model and logic engine continue to be essential components to achieving SOTA performance in this task, particularly due to the precise nature of financial questions and the complex information stored in financial documents. With this understanding, our refined prompt-engineering approach on GPT-3 achieves near SOTA accuracy without any fine-tuning.
    摘要 金融分析是评估公司业绩的重要工具。从业者通过回答金融问题来做出可盈利的投资决策,并为此使用先进的定量分析。因此,金融问答(QA)是一项需要对数字进行深入推理的问答任务。此外,预训练语言模型在金融领域的推理能力尚不明确。当前的最先进方法需要一个检索器从文本中收集与金融问题相关的事实,以及一个生成器来产生有效的金融程序和最终答案。而最近,GPT-3 等大型语言模型仅凭少量示例就在各类任务上取得了最先进的性能。我们用 GPT-3 进行了多项实验,发现独立的检索模型和逻辑引擎仍是在该任务上取得最先进性能的关键组件,这主要是因为金融问题的精确性以及金融文档中信息的复杂性。基于这一认识,我们在 GPT-3 上改进的提示工程方法无需任何微调即可达到接近最先进的准确率。
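
A toy sketch of the retrieve-then-reason pattern the abstract argues is still necessary: a lightweight retriever selects relevant facts, and a prompt asks the model to emit a small program for a separate logic engine to execute. The retriever, prompt, facts, and program format are all illustrative assumptions.

```python
# Retrieve relevant facts, then assemble a few-shot reasoning prompt.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

facts = [
    "Revenue in 2020 was $120M.",
    "Revenue in 2019 was $100M.",
    "Headcount grew 12% in 2020.",
]
question = "What was the revenue growth from 2019 to 2020?"

vec = TfidfVectorizer().fit(facts + [question])
sims = cosine_similarity(vec.transform([question]), vec.transform(facts))[0]
context = [facts[i] for i in sims.argsort()[::-1][:2]]   # top-2 facts

prompt = "Facts:\n" + "\n".join(context) + f"\nQ: {question}\nProgram:"
# The LLM would be asked to output e.g. `divide(subtract(120, 100), 100)`,
# which a separate logic engine then executes to obtain 20%.
print(prompt)
```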

Team Intro to AI team8 at CoachAI Badminton Challenge 2023: Advanced ShuttleNet for Shot Predictions

  • paper_url: http://arxiv.org/abs/2307.13715
  • repo_url: None
  • paper_authors: Shih-Hong Chen, Pin-Hsuan Chou, Yong-Fu Liu, Chien-An Han
  • for: 通过利用过去的击球信息,提升现有框架 ShuttleNet 在预测羽毛球击球类型与落点方面的性能。
  • methods: 利用过去的击球信息改进 ShuttleNet 的预测。
  • results: 在 IJCAI 2023 的 CoachAI Badminton Challenge 中取得了显著优于基线的结果,并最终夺得赛事冠军。
    Abstract In this paper, our objective is to improve the performance of the existing framework ShuttleNet in predicting badminton shot types and locations by leveraging past strokes. We participated in the CoachAI Badminton Challenge at IJCAI 2023 and achieved significantly better results compared to the baseline. Ultimately, our team achieved the first position in the competition and we made our code available.
    摘要 在这篇论文中,我们的目标是通过利用过去的击球信息,提升现有框架 ShuttleNet 在预测羽毛球击球类型与落点方面的性能。我们参加了 IJCAI 2023 的 CoachAI Badminton Challenge,取得了显著优于基线的结果。最终,我们的团队夺得比赛第一名,并公开了代码。