cs.AI - 2023-11-23

Data-Driven Risk Modeling for Infrastructure Projects Using Artificial Intelligence Techniques

  • paper_url: http://arxiv.org/abs/2311.14203
  • repo_url: None
  • paper_authors: Abdolmajid Erfani
  • for: The paper aims to improve the traditional expert-based approach to identifying and evaluating project risks in large infrastructure projects, and to provide a data-driven framework for risk management.
  • methods: The paper uses historical data and artificial intelligence techniques to automatically identify risks and evaluate the quality of early risk registers and risk assessments.
  • results: The study examines the evolution of risks over time and compares the effectiveness of risk identification and assessment in the initial phase versus project execution. The results provide insights into how project teams can improve their risk management practices and enhance the success of large infrastructure projects.
    Abstract Managing project risk is a key part of the successful implementation of any large project and is widely recognized as a best practice for public agencies to deliver infrastructure. The conventional method of identifying and evaluating project risks involves getting input from subject matter experts at risk workshops in the early phases of a project. As a project moves through its life cycle, these identified risks and their assessments evolve. Some risks are realized to become issues, some are mitigated, and some are retired as no longer important. Despite the value provided by conventional expert-based approaches, several challenges remain due to the time-consuming and expensive processes involved. Moreover, little is known about how risks evolve from ex-ante to ex-post over time. How well does the project team identify and evaluate risks in the initial phase compared to what happens during project execution? Using historical data and artificial intelligence techniques, this study addressed these limitations by introducing a data-driven framework to identify risks automatically and to examine the quality of early risk registers and risk assessments. Risk registers from more than 70 U.S. major transportation projects form the input dataset.
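  • code sketch: The paper releases no code; as a hedged illustration of the data-driven idea, the snippet below clusters free-text risk-register entries with TF-IDF features and k-means so that recurring risk themes surface automatically. The sample entries and the choice of clustering algorithm are assumptions, not the authors' pipeline.

```python
# Hypothetical sketch: grouping risk-register entries into recurring themes.
# The sample texts and the TF-IDF + k-means pipeline are illustrative
# assumptions, not the authors' actual framework.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

risk_register = [
    "Utility relocation delays on the northern corridor",
    "Right-of-way acquisition costs exceed estimates",
    "Unexpected subsurface soil conditions near bridge piers",
    "Permitting delays from environmental review",
    "Utility conflicts discovered during excavation",
    "Escalation of steel and concrete prices",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(risk_register)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
for label, text in sorted(zip(kmeans.labels_, risk_register)):
    print(label, text)
```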

The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024

  • paper_url: http://arxiv.org/abs/2311.14762
  • repo_url: None
  • paper_authors: Benjamin Kiefer, Lojze Žust, Matej Kristan, Janez Perš, Matija Teršek, Arnold Wiliem, Martin Messmer, Cheng-Yen Yang, Hsiang-Wei Huang, Zhongyu Jiang, Heng-Cheng Kuo, Jie Mei, Jenq-Neng Hwang, Daniel Stadler, Lars Sommer, Kaer Huang, Aiguo Zheng, Weitu Chong, Kanokphan Lertniphonphan, Jun Xie, Feng Chen, Jian Li, Zhepeng Wang, Luca Zedda, Andrea Loddo, Cecilia Di Ruberto, Tuan-Anh Vu, Hai Nguyen-Truong, Tan-Sang Ha, Quan-Dung Pham, Sai-Kit Yeung, Yuan Feng, Nguyen Thanh Thien, Lixin Tian, Sheng-Yao Kuan, Yuan-Hao Ho, Angel Bueno Rodriguez, Borja Carrillo-Perez, Alexander Klein, Antje Alex, Yannik Steiniger, Felix Sattler, Edgardo Solano-Carrillo, Matej Fabijanić, Magdalena Šumunec, Nadir Kapetanović, Andreas Michel, Wolfgang Gross, Martin Weinmann
  • for: The workshop addresses maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV).
  • methods: Three challenge categories are considered: (i) UAV-based maritime object tracking with re-identification, (ii) USV-based maritime obstacle segmentation and detection, and (iii) USV-based maritime boat tracking. The obstacle segmentation and detection category comprises three sub-challenges, including a new embedded challenge on efficient inference on real-world embedded devices.
  • results: The report offers a comprehensive overview of the findings from the challenges, with statistical and qualitative analyses evaluating trends from over 195 submissions. All datasets, evaluation code, and the leaderboard are publicly available at https://macvi.org/workshop/macvi24.
    Abstract The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024 addresses maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV). Three challenge categories are considered: (i) UAV-based Maritime Object Tracking with Re-identification, (ii) USV-based Maritime Obstacle Segmentation and Detection, (iii) USV-based Maritime Boat Tracking. The USV-based Maritime Obstacle Segmentation and Detection features three sub-challenges, including a new embedded challenge addressing efficient inference on real-world embedded devices. This report offers a comprehensive overview of the findings from the challenges. We provide both statistical and qualitative analyses, evaluating trends from over 195 submissions. All datasets, evaluation code, and the leaderboard are available to the public at https://macvi.org/workshop/macvi24.

Appearance-based gaze estimation enhanced with synthetic images using deep neural networks

  • paper_url: http://arxiv.org/abs/2311.14175
  • repo_url: https://github.com/flakeua/bachelorsthesis
  • paper_authors: Dmytro Herashchenko, Igor Farkaš
  • for: This work studies appearance-based gaze estimation for human-robot interaction, enabling robots to read and predict human behavior.
  • methods: An artificial-neural-network-based modular system estimates gaze from separately cropped eye images, building on existing well-functioning components for face detection (RetinaFace) and head pose estimation (6DRepNet). The method requires no special hardware or infrared filters, only a standard RGB camera.
  • results: Using the MetaHuman tool, the authors generated a publicly available synthetic dataset of more than 57,000 human faces. Training on this dataset together with the standard Columbia Gaze dataset brought the mean average error in eye pitch and yaw below two degrees, which compares favourably to related methods. Feasibility was further verified in a real-world setting through preliminary tests with the built-in 4K camera in the NICO semi-humanoid robot's eye.
    Abstract Human eye gaze estimation is an important cognitive ingredient for successful human-robot interaction, enabling the robot to read and predict human behavior. We approach this problem using artificial neural networks and build a modular system estimating gaze from separately cropped eyes, taking advantage of existing well-functioning components for face detection (RetinaFace) and head pose estimation (6DRepNet). Our proposed method does not require any special hardware or infrared filters but uses a standard notebook-builtin RGB camera, as often approached with appearance-based methods. Using the MetaHuman tool, we also generated a large synthetic dataset of more than 57,000 human faces and made it publicly available. The inclusion of this dataset (with eye gaze and head pose information) on top of the standard Columbia Gaze dataset into training the model led to better accuracy with a mean average error below two degrees in eye pitch and yaw directions, which compares favourably to related methods. We also verified the feasibility of our model by its preliminary testing in real-world setting using the builtin 4K camera in NICO semi-humanoid robot's eye.
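  • code sketch: The summary gives only the modular design, so the snippet below is a minimal PyTorch sketch of an appearance-based gaze regressor: a small CNN encodes a cropped eye image, the head pose (e.g. from 6DRepNet) is concatenated, and an MLP regresses gaze pitch/yaw. All layer sizes and the fusion scheme are assumptions.

```python
# Minimal sketch of an appearance-based gaze regressor. Layer sizes and the
# feature-fusion scheme are illustrative assumptions, not the paper's design.
import torch
import torch.nn as nn

class GazeRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(            # encodes a cropped eye image
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(               # fuses eye features + head pose
            nn.Linear(32 + 3, 64), nn.ReLU(),
            nn.Linear(64, 2),                    # outputs (pitch, yaw)
        )

    def forward(self, eye_image, head_pose):
        feats = self.encoder(eye_image)
        return self.head(torch.cat([feats, head_pose], dim=1))

model = GazeRegressor()
eye = torch.randn(4, 3, 36, 60)   # batch of cropped eye images
pose = torch.randn(4, 3)          # head pose angles, e.g. from 6DRepNet
print(model(eye, pose).shape)     # torch.Size([4, 2])
```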

Evaluating GPT-4’s Vision Capabilities on Brazilian University Admission Exams

  • paper_url: http://arxiv.org/abs/2311.14169
  • repo_url: https://github.com/piresramon/gpt-4-enem
  • paper_authors: Ramon Pires, Thales Sales Almeida, Hugo Abonizio, Rodrigo Nogueira
  • for: This study evaluates language models on university entrance exams within a comprehensive framework that incorporates both textual and visual elements.
  • methods: The authors evaluate GPT-4 on the two most recent editions of the Exame Nacional do Ensino Médio (ENEM), the main standardized entrance examination adopted by Brazilian universities, comparing the direct use of images against text captions transcribing the visual content.
  • results: GPT-4 shows human-comparable performance on complex multidisciplinary questions, and text captions transcribing visual content outperform the direct use of images, suggesting the vision model has room for improvement. Mathematical questions remain a challenge for these state-of-the-art models. Code and data are available at https://github.com/piresramon/gpt-4-enem.
    Abstract Recent advancements in language models have showcased human-comparable performance in academic entrance exams. However, existing studies often overlook questions that require the integration of visual comprehension, thus compromising the full spectrum and complexity inherent in real-world scenarios. To address this gap, we present a comprehensive framework to evaluate language models on entrance exams, which incorporates both textual and visual elements. We evaluate the two most recent editions of Exame Nacional do Ensino Médio (ENEM), the main standardized entrance examination adopted by Brazilian universities. Our study not only reaffirms the capabilities of GPT-4 as the state of the art for handling complex multidisciplinary questions, but also pioneers in offering a realistic assessment of multimodal language models on Portuguese examinations. One of the highlights is that text captions transcribing visual content outperform the direct use of images, suggesting that the vision model has room for improvement. Yet, despite improvements afforded by images or captions, mathematical questions remain a challenge for these state-of-the-art models. The code and data used on experiments are available at https://github.com/piresramon/gpt-4-enem.

Variational Annealing on Graphs for Combinatorial Optimization

  • paper_url: http://arxiv.org/abs/2311.14156
  • repo_url: https://github.com/VariationalCOgrammer/VAG-CO
  • paper_authors: Sebastian Sanokowski, Wilhelm Berghammer, Sepp Hochreiter, Sebastian Lehner
  • for: Solving combinatorial optimization (CO) problems with probabilistic approaches, whose common assumption of statistically independent solution variables imposes performance limitations on difficult problem instances.
  • methods: introduces subgraph tokenization to represent the configuration of solution variables with a single token, alleviating the drawback of long sequential sampling procedure, and uses annealed entropy regularization to ensure efficient and stable learning.
  • results: demonstrates superior performance on many popular CO problems compared to traditional probabilistic approaches, and provides theoretical motivation for the annealed entropy regularization.
    Abstract Several recent unsupervised learning methods use probabilistic approaches to solve combinatorial optimization (CO) problems based on the assumption of statistically independent solution variables. We demonstrate that this assumption imposes performance limitations in particular on difficult problem instances. Our results corroborate that an autoregressive approach which captures statistical dependencies among solution variables yields superior performance on many popular CO problems. We introduce subgraph tokenization in which the configuration of a set of solution variables is represented by a single token. This tokenization technique alleviates the drawback of the long sequential sampling procedure which is inherent to autoregressive methods without sacrificing expressivity. Importantly, we theoretically motivate an annealed entropy regularization and show empirically that it is essential for efficient and stable learning.
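  • code sketch: A minimal sketch of the annealed entropy regularization idea, written as a REINFORCE objective on the free energy F(x) = E(x) + T·log p(x) with the temperature T annealed to zero; the linear schedule and toy inputs are assumptions, not the paper's exact formulation.

```python
# Sketch of annealed entropy regularization for a probabilistic CO solver,
# trained with REINFORCE on F(x) = E(x) + T * log p(x). Annealing T -> 0 moves
# from entropy-encouraged exploration to pure energy minimization. The linear
# schedule and toy quantities are illustrative assumptions.
import torch

def annealed_reinforce_loss(log_probs, energies, step, total_steps, t_start=1.0):
    # log_probs: log p(x) of each sampled solution (requires grad)
    # energies:  CO objective value E(x) of each sample
    temperature = t_start * max(0.0, 1.0 - step / total_steps)  # linear annealing
    free_energy = energies + temperature * log_probs.detach()   # F(x) = E + T*log p
    baseline = free_energy.mean()                               # variance reduction
    return ((free_energy - baseline) * log_probs).mean()

log_probs = torch.randn(64, requires_grad=True)  # stand-in for model outputs
energies = torch.randn(64)
loss = annealed_reinforce_loss(log_probs, energies, step=100, total_steps=1000)
loss.backward()
```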

Tube-NeRF: Efficient Imitation Learning of Visuomotor Policies from MPC using Tube-Guided Data Augmentation and NeRFs

  • paper_url: http://arxiv.org/abs/2311.14153
  • repo_url: None
  • paper_authors: Andrea Tagliabue, Jonathan P. How
  • for: This work aims to efficiently learn robust visuomotor policies for multirotors by combining imitation learning (IL) with a robust model predictive controller (MPC) and a data augmentation (DA) strategy, improving control accuracy and robustness.
  • methods: The proposed DA strategy, named Tube-NeRF, leverages Neural Radiance Fields (NeRFs) to generate novel synthetic images and uses properties of the robust MPC (the tube) to select relevant views and efficiently compute the corresponding actions.
  • results: The learned visuomotor policy achieves accurate localization and low tracking errors on a real multirotor despite large disturbances, with an 80-fold increase in demonstration efficiency and a 50% reduction in training time over current IL methods.
    Abstract Imitation learning (IL) can train computationally-efficient sensorimotor policies from a resource-intensive Model Predictive Controller (MPC), but it often requires many samples, leading to long training times or limited robustness. To address these issues, we combine IL with a variant of robust MPC that accounts for process and sensing uncertainties, and we design a data augmentation (DA) strategy that enables efficient learning of vision-based policies. The proposed DA method, named Tube-NeRF, leverages Neural Radiance Fields (NeRFs) to generate novel synthetic images, and uses properties of the robust MPC (the tube) to select relevant views and to efficiently compute the corresponding actions. We tailor our approach to the task of localization and trajectory tracking on a multirotor, by learning a visuomotor policy that generates control actions using images from the onboard camera as only source of horizontal position. Our evaluations numerically demonstrate learning of a robust visuomotor policy with an 80-fold increase in demonstration efficiency and a 50% reduction in training time over current IL methods. Additionally, our policies successfully transfer to a real multirotor, achieving accurate localization and low tracking errors despite large disturbances, with an onboard inference time of only 1.5 ms.
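  • code sketch: A hedged sketch of tube-guided data augmentation: states are sampled inside the tube around the nominal trajectory, a synthetic view is rendered for each (a NeRF in the paper, a stub here), and the label is the ancillary tube controller's action. The gain K, tube geometry, and rendering stub are assumptions.

```python
# Sketch of tube-guided data augmentation: sample perturbed states inside the
# robust-MPC tube, render a synthetic view for each, and label it with the
# ancillary tube controller's action u = u_nom + K (x_nom - x). The gain K,
# tube radius, and rendering stub are illustrative assumptions.
import numpy as np

def tube_augment(x_nom, u_nom, K, tube_radius, render_fn, n_samples=8, rng=None):
    rng = rng or np.random.default_rng(0)
    samples = []
    for _ in range(n_samples):
        dx = rng.uniform(-tube_radius, tube_radius, size=x_nom.shape)
        x = x_nom + dx                       # perturbed state, still in the tube
        u = u_nom + K @ (x_nom - x)          # ancillary controller's action
        samples.append((render_fn(x), u))    # synthetic image + action label
    return samples

x_nom, u_nom = np.zeros(4), np.zeros(2)
K = np.ones((2, 4)) * 0.5
fake_render = lambda x: np.zeros((64, 64, 3))   # stand-in for NeRF rendering
data = tube_augment(x_nom, u_nom, K, tube_radius=0.1, render_fn=fake_render)
print(len(data), data[0][1])
```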

Machine Learning For An Explainable Cost Prediction of Medical Insurance

  • paper_url: http://arxiv.org/abs/2311.14139
  • repo_url: None
  • paper_authors: Ugochukwu Orji, Elochukwu Ukwandu
  • for: This study aims to help policymakers, insurers, and potential medical insurance buyers make more informed decisions when selecting policies that meet their specific needs.
  • methods: Three regression-based ensemble machine learning models (Extreme Gradient Boosting, Gradient-Boosting Machine, and Random Forest) are used to predict medical insurance costs, and explainable AI methods (SHAP and ICE plots) are used to discover and explain the key factors that determine premium prices.
  • results: All models performed well, but XGBoost achieved the best overall performance at the cost of more computational resources, while RF recorded a lower prediction error and consumed far fewer computing resources than XGBoost. Comparing the two XAI methods, the ICE plots showed the interactions between variables in more detail, whereas the SHAP analysis was more high-level.
    Abstract Predictive modeling in healthcare continues to be an active actuarial research topic as more insurance companies aim to maximize the potential of Machine Learning approaches to increase their productivity and efficiency. In this paper, the authors deployed three regression-based ensemble ML models that combine variations of decision trees through Extreme Gradient Boosting, Gradient-Boosting Machine, and Random Forest methods in predicting medical insurance costs. The Explainable Artificial Intelligence methods SHapley Additive exPlanations and Individual Conditional Expectation plots were deployed to discover and explain the key determinant factors that influence medical insurance premium prices in the dataset. The dataset used comprised 986 records and is publicly available in the KAGGLE repository. The models were evaluated using four performance evaluation metrics: R-squared, Mean Absolute Error, Root Mean Squared Error, and Mean Absolute Percentage Error. The results show that all models produced impressive outcomes; however, the XGBoost model achieved a better overall performance although it also expended more computational resources, while the RF model recorded a lesser prediction error and consumed far fewer computing resources than the XGBoost model. Furthermore, we compared the outcomes of both XAI methods in identifying the key determinant features that influenced the premium prices for each model. Whereas both XAI methods produced similar outcomes, the ICE plots showed the interactions between each variable in more detail than the SHAP analysis, which seemed to be more high-level. It is the aim of the authors that the contributions of this study will help policymakers, insurers, and potential medical insurance buyers in their decision-making process for selecting the right policies that meet their specific needs.
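  • code sketch: A minimal sklearn sketch of the kind of pipeline described: fit a random-forest regressor on tabular features and report the four metrics used in the paper. The synthetic data stands in for the public Kaggle dataset of 986 records.

```python
# Sketch of the paper's evaluation setup with one of its three ensemble models
# (Random Forest); the synthetic regression data is an assumption standing in
# for the 986-record Kaggle insurance dataset.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import (mean_absolute_error, mean_absolute_percentage_error,
                             mean_squared_error, r2_score)
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=986, n_features=6, noise=10.0, random_state=0)
y = y - y.min() + 1000.0  # shift to positive "premium" values so MAPE is meaningful
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)
print("R2  :", r2_score(y_te, pred))
print("MAE :", mean_absolute_error(y_te, pred))
print("RMSE:", mean_squared_error(y_te, pred) ** 0.5)
print("MAPE:", mean_absolute_percentage_error(y_te, pred))
```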

Byzantine Robustness and Partial Participation Can Be Achieved Simultaneously: Just Clip Gradient Differences

  • paper_url: http://arxiv.org/abs/2311.14127
  • repo_url: None
  • paper_authors: Grigory Malinovsky, Peter Richtárik, Samuel Horváth, Eduard Gorbunov
  • for: This paper proposes a Byzantine-robust mechanism for distributed learning, improving the reliability and accuracy of model training when some participants are unreliable or malicious.
  • methods: The proposed distributed method supports client sampling and uses gradient clipping to control stochastic gradient differences in recursive variance reduction; communication compression is also incorporated to enhance communication efficiency.
  • results: Under quite general assumptions, the method is proven to achieve convergence rates matching existing state-of-the-art (SOTA) theoretical results.
    Abstract Distributed learning has emerged as a leading paradigm for training large machine learning models. However, in real-world scenarios, participants may be unreliable or malicious, posing a significant challenge to the integrity and accuracy of the trained models. Byzantine fault tolerance mechanisms have been proposed to address these issues, but they often assume full participation from all clients, which is not always practical due to the unavailability of some clients or communication constraints. In our work, we propose the first distributed method with client sampling and provable tolerance to Byzantine workers. The key idea behind the developed method is the use of gradient clipping to control stochastic gradient differences in recursive variance reduction. This allows us to bound the potential harm caused by Byzantine workers, even during iterations when all sampled clients are Byzantine. Furthermore, we incorporate communication compression into the method to enhance communication efficiency. Under quite general assumptions, we prove convergence rates for the proposed method that match the existing state-of-the-art (SOTA) theoretical results.
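  • code sketch: A small sketch of the core idea: clip the difference between each client's gradient and a reference gradient before aggregation, so a Byzantine client can shift the update by at most the clipping norm. The aggregation rule and constants are assumptions; the paper's method embeds this inside recursive variance reduction.

```python
# Sketch of clipped gradient differences: even a client that sends arbitrary
# values contributes a bounded perturbation. Constants and the plain averaging
# are illustrative assumptions, not the paper's full algorithm.
import torch

def clip_to_norm(v, max_norm):
    norm = v.norm()
    return v if norm <= max_norm else v * (max_norm / norm)

def robust_aggregate(client_grads, reference_grad, max_norm=1.0):
    # each client contributes a *clipped* gradient difference
    diffs = [clip_to_norm(g - reference_grad, max_norm) for g in client_grads]
    return reference_grad + torch.stack(diffs).mean(dim=0)

reference = torch.zeros(10)
honest = [torch.randn(10) * 0.1 for _ in range(9)]
byzantine = [torch.full((10,), 1e6)]            # adversarial client
update = robust_aggregate(honest + byzantine, reference)
print(update.norm())                            # stays bounded by max_norm
```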

Towards Auditing Large Language Models: Improving Text-based Stereotype Detection

  • paper_url: http://arxiv.org/abs/2311.14126
  • repo_url: None
  • paper_authors: Wu Zekun, Sahan Bulathwela, Adriano Soares Koshiyama
  • for: This paper aims to audit and evaluate stereotype bias in large language models.
  • methods: The work introduces the Multi-Grain Stereotype Dataset together with a novel multi-class stereotype classifier for English text, and uses several explainable AI tools to evaluate and detect stereotype bias in large language models.
  • results: Experiments show that training the classifier in a multi-class setting outperforms the one-vs-all binary counterpart, and that the classifier can accurately detect and assess stereotype bias in large language models.
    Abstract Large Language Models (LLM) have made significant advances in the recent past becoming more mainstream in Artificial Intelligence (AI) enabled human-facing applications. However, LLMs often generate stereotypical output inherited from historical data, amplifying societal biases and raising ethical concerns. This work introduces i) the Multi-Grain Stereotype Dataset, which includes 52,751 instances of gender, race, profession and religion stereotypic text and ii) a novel stereotype classifier for English text. We design several experiments to rigorously test the proposed model trained on the novel dataset. Our experiments show that training the model in a multi-class setting can outperform the one-vs-all binary counterpart. Consistent feature importance signals from different eXplainable AI tools demonstrate that the new model exploits relevant text features. We utilise the newly created model to assess the stereotypic behaviour of the popular GPT family of models and observe the reduction of bias over time. In summary, our work establishes a robust and practical framework for auditing and evaluating the stereotypic bias in LLM.
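  • code sketch: A minimal sketch of a multi-class stereotype classifier of the kind the paper trains, using TF-IDF features and logistic regression; the toy texts and labels are invented illustrations, not dataset samples.

```python
# Sketch of a multi-class stereotype classifier; the feature extractor and
# classifier choice are assumptions, and the examples are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "women are too emotional to lead",         # gender
    "men are naturally better with machines",  # gender
    "engineers have no social skills",         # profession
    "salespeople will say anything",           # profession
    "members of that faith are intolerant",    # religion
    "that religion makes people violent",      # religion
]
labels = ["gender", "gender", "profession", "profession", "religion", "religion"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict(["nurses are always women"]))
```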

Scalable AI Safety via Doubly-Efficient Debate

  • paper_url: http://arxiv.org/abs/2311.14125
  • repo_url: https://github.com/google-deepmind/debate
  • paper_authors: Jonah Brown-Cohen, Geoffrey Irving, Georgios Piliouras
  • for: This paper proposes a debate method for detecting misaligned behavior in AI systems whose tasks are too complicated for humans to judge directly.
  • methods: A new set of debate protocols is designed in which the honest strategy can always succeed using a simulation of only a polynomial number of steps, while still verifying the alignment of stochastic AI systems.
  • results: The protocols let the honest strategy win even when the dishonest strategy is allowed exponentially many simulation steps, enabling verification of alignment for stochastic AI systems.
    Abstract The emergence of pre-trained AI systems with powerful capabilities across a diverse and ever-increasing set of complex domains has raised a critical challenge for AI safety as tasks can become too complicated for humans to judge directly. Irving et al. [2018] proposed a debate method in this direction with the goal of pitting the power of such AI models against each other until the problem of identifying (mis)-alignment is broken down into a manageable subtask. While the promise of this approach is clear, the original framework was based on the assumption that the honest strategy is able to simulate deterministic AI systems for an exponential number of steps, limiting its applicability. In this paper, we show how to address these challenges by designing a new set of debate protocols where the honest strategy can always succeed using a simulation of a polynomial number of steps, whilst being able to verify the alignment of stochastic AI systems, even when the dishonest strategy is allowed to use exponentially many simulation steps.

A density estimation perspective on learning from pairwise human preferences

  • paper_url: http://arxiv.org/abs/2311.14115
  • repo_url: None
  • paper_authors: Vincent Dumoulin, Daniel D. Johnson, Pablo Samuel Castro, Hugo Larochelle, Yann Dauphin
  • for: This work studies learning from human feedback (LHF) for training large language models (LLMs), specifically learning a reward function from pairwise preference data, with the LLM treated as a policy adapted to maximize the reward under additional regularization constraints.
  • methods: The work proposes an alternative interpretation centered on the generative process for pairwise preferences, treating LHF as a density estimation problem, with theoretical and empirical results for a family of generative processes defined via preference behavior distribution equations.
  • results: For such generative processes, training a reward function on pairwise preferences effectively models an annotator's implicit preference distribution; however, under "annotator misspecification", wrong modeling assumptions about annotator behavior yield poorly-adapted models, suggesting difficulty in learning from a population of annotators with diverse viewpoints.
    Abstract Learning from human feedback (LHF) -- and in particular learning from pairwise preferences -- has recently become a crucial ingredient in training large language models (LLMs), and has been the subject of much research. Most recent works frame it as a reinforcement learning problem, where a reward function is learned from pairwise preference data and the LLM is treated as a policy which is adapted to maximize the rewards, often under additional regularization constraints. We propose an alternative interpretation which centers on the generative process for pairwise preferences and treats LHF as a density estimation problem. We provide theoretical and empirical results showing that for a family of generative processes defined via preference behavior distribution equations, training a reward function on pairwise preferences effectively models an annotator's implicit preference distribution. Finally, we discuss and present findings on "annotator misspecification" -- failure cases where wrong modeling assumptions are made about annotator behavior, resulting in poorly-adapted models -- suggesting that approaches that learn from pairwise human preferences could have trouble learning from a population of annotators with diverse viewpoints.
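  • code sketch: A sketch of the reward-learning step the paper analyses: under a Bradley-Terry style generative process, fitting a reward r to pairwise preferences with the loss -log σ(r(winner) - r(loser)) amounts to density estimation of the annotator's implicit preference distribution. The tiny reward network and random features are assumptions.

```python
# Sketch of reward learning from pairwise preferences under a Bradley-Terry
# style model; the network size and random "response features" are assumptions.
import torch
import torch.nn.functional as F

reward = torch.nn.Sequential(torch.nn.Linear(8, 32), torch.nn.ReLU(),
                             torch.nn.Linear(32, 1))
optim = torch.optim.Adam(reward.parameters(), lr=1e-3)

winners = torch.randn(64, 8)   # features of preferred responses
losers = torch.randn(64, 8)    # features of rejected responses

for _ in range(100):
    margin = reward(winners) - reward(losers)
    loss = -F.logsigmoid(margin).mean()   # Bradley-Terry negative log-likelihood
    optim.zero_grad(); loss.backward(); optim.step()
print(float(loss))
```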

When is Off-Policy Evaluation Useful? A Data-Centric Perspective

  • paper_url: http://arxiv.org/abs/2311.14110
  • repo_url: None
  • paper_authors: Hao Sun, Alex J. Chan, Nabeel Seedat, Alihan Hüyük, Mihaela van der Schaar
  • for: Evaluating the value of a hypothetical target policy using only logged data is an important yet challenging task.
  • methods: The paper proposes DataCOPE, a data-centric framework for evaluating off-policy evaluation (OPE) problems. DataCOPE forecasts the performance of OPE algorithms without access to the environment, which is especially useful before real-world deployment where evaluating OPE is impossible, and identifies the sub-groups in the dataset where OPE can be inaccurate.
  • results: DataCOPE can evaluate both machine-learning and human expert policies such as clinical guidelines; an empirical analysis in logged contextual bandit settings on healthcare datasets confirms this ability.
    Abstract Evaluating the value of a hypothetical target policy with only a logged dataset is important but challenging. On the one hand, it brings opportunities for safe policy improvement under high-stakes scenarios like clinical guidelines. On the other hand, such opportunities raise a need for precise off-policy evaluation (OPE). While previous work on OPE focused on improving the algorithm in value estimation, in this work, we emphasize the importance of the offline dataset, hence putting forward a data-centric framework for evaluating OPE problems. We propose DataCOPE, a data-centric framework for evaluating OPE, that answers the questions of whether and to what extent we can evaluate a target policy given a dataset. DataCOPE (1) forecasts the overall performance of OPE algorithms without access to the environment, which is especially useful before real-world deployment where evaluating OPE is impossible; (2) identifies the sub-group in the dataset where OPE can be inaccurate; (3) permits evaluations of datasets or data-collection strategies for OPE problems. Our empirical analysis of DataCOPE in the logged contextual bandit settings using healthcare datasets confirms its ability to evaluate both machine-learning and human expert policies like clinical guidelines.
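  • code sketch: DataCOPE asks when off-policy estimates can be trusted; the snippet below shows only the underlying estimator it evaluates, an inverse propensity scoring (IPS) estimate of a target policy's value from logged contextual-bandit data. The logging policy, target policy, and data are toy assumptions.

```python
# Sketch of IPS off-policy evaluation in a logged contextual-bandit setting;
# all policies and data here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, n_actions = 1000, 3
contexts = rng.normal(size=(n, 4))
logged_actions = rng.integers(0, n_actions, size=n)
logged_propensity = np.full(n, 1.0 / n_actions)      # uniform logging policy
rewards = rng.binomial(1, 0.5, size=n).astype(float)

def target_policy_prob(context, action):
    # hypothetical target policy: prefers action 0 when the first feature > 0
    preferred = 0 if context[0] > 0 else 1
    return 0.8 if action == preferred else 0.2 / (n_actions - 1)

weights = np.array([target_policy_prob(c, a)
                    for c, a in zip(contexts, logged_actions)])
ips_value = np.mean(weights / logged_propensity * rewards)
print("IPS estimate of target policy value:", ips_value)
```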

Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training

  • paper_url: http://arxiv.org/abs/2311.14109
  • repo_url: https://github.com/chengtan9907/mc-cot
  • paper_authors: Cheng Tan, Jingxuan Wei, Zhangyang Gao, Linzhuang Sun, Siyuan Li, Xihong Yang, Stan Z. Li
  • for: Improving the performance of multimodal reasoning models by raising answer accuracy through higher-quality rationales.
  • methods: A self-consistency training strategy, MC-CoT, generates multiple rationales and answers and selects the most accurate through a voting process, improving rationale quality and answer accuracy.
  • results: Experiments on multiple benchmarks show that MC-CoT significantly improves the performance of multimodal reasoning models; even smaller base models equipped with the approach can reach performance comparable to that of larger models.
    Abstract Multimodal reasoning is a challenging task that requires models to reason across multiple modalities to answer questions. Existing approaches have made progress by incorporating language and visual modalities into a two-stage reasoning framework, separating rationale generation from answer inference. However, these approaches often fall short due to the inadequate quality of the generated rationales. In this work, we delve into the importance of rationales in model reasoning. We observe that when rationales are completely accurate, the model's accuracy significantly improves, highlighting the need for high-quality rationale generation. Motivated by this, we propose MC-CoT, a self-consistency training strategy that generates multiple rationales and answers, subsequently selecting the most accurate through a voting process. This approach not only enhances the quality of generated rationales but also leads to more accurate and robust answers. Through extensive experiments, we demonstrate that our approach significantly improves model performance across various benchmarks. Remarkably, we show that even smaller base models, when equipped with our proposed approach, can achieve results comparable to those of larger models, illustrating the potential of our approach in harnessing the power of rationales for improved multimodal reasoning. The code is available at https://github.com/chengtan9907/mc-cot.
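  • code sketch: A minimal sketch of MC-CoT-style self-consistency at inference: sample several rationale/answer pairs, majority-vote the answer, and keep the rationales consistent with it. The sampler is a stub for a stochastic multimodal model.

```python
# Sketch of self-consistency voting over sampled rationale/answer pairs; the
# sampler below is a stand-in assumption for a real multimodal reasoning model.
import random
from collections import Counter

def sample_rationale_and_answer(question):
    # stand-in for stochastic model sampling (temperature > 0)
    answer = random.choice(["A", "A", "A", "B"])  # noisy but mostly correct
    return f"because ... therefore {answer}", answer

def self_consistent_answer(question, n_samples=9):
    samples = [sample_rationale_and_answer(question) for _ in range(n_samples)]
    votes = Counter(answer for _, answer in samples)
    best_answer, _ = votes.most_common(1)[0]
    # keep only the rationales consistent with the voted answer
    rationales = [r for r, a in samples if a == best_answer]
    return best_answer, rationales

random.seed(0)
answer, rationales = self_consistent_answer("Which option is correct?")
print(answer, len(rationales))
```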

Auditing and Mitigating Cultural Bias in LLMs

  • paper_url: http://arxiv.org/abs/2311.14096
  • repo_url: None
  • paper_authors: Yan Tao, Olga Viberg, Ryan S. Baker, Rene F. Kizilcec
  • for: This study audits large language models for cultural bias and evaluates country-specific prompting as a mitigation strategy.
  • methods: The researchers compare large language model responses with nationally representative survey data to probe the cultural values embedded in these models.
  • results: GPT-4, 3.5, and 3 exhibit cultural values resembling English-speaking and Protestant European countries; the mitigation strategy reduces cultural bias in recent models for some countries/territories, but not for all.
    Abstract Culture fundamentally shapes people's reasoning, behavior, and communication. Generative artificial intelligence (AI) technologies may cause a shift towards a dominant culture. As people increasingly use AI to expedite and even automate various professional and personal tasks, cultural values embedded in AI models may bias authentic expression. We audit large language models for cultural bias, comparing their responses to nationally representative survey data, and evaluate country-specific prompting as a mitigation strategy. We find that GPT-4, 3.5 and 3 exhibit cultural values resembling English-speaking and Protestant European countries. Our mitigation strategy reduces cultural bias in recent models but not for all countries/territories. To avoid cultural bias in generative AI, especially in high-stakes contexts, we suggest using culture matching and ongoing cultural audits.
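  • code sketch: A hedged sketch of the audit logic: score the model's answers to survey items against nationally representative responses, country by country, and treat the smallest distance as the model's cultural leaning. Countries, items, and scores are fabricated placeholders, not study data.

```python
# Sketch of a cultural-bias audit: distance between model responses and
# per-country survey means. All numbers and countries are fabricated
# placeholders purely to show the computation.
import numpy as np

survey = {  # mean responses per country on a few cultural-value items
    "US": np.array([0.8, 0.3, 0.6]),
    "SE": np.array([0.9, 0.2, 0.7]),
    "JP": np.array([0.4, 0.7, 0.5]),
}
model_responses = np.array([0.85, 0.25, 0.65])  # model's answers to same items

for country, values in survey.items():
    distance = np.linalg.norm(model_responses - values)
    print(f"{country}: distance {distance:.3f}")
# The model is "culturally closest" to the countries with the smallest distance;
# country-specific prompting is then evaluated by recomputing these distances.
```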

PortfolioMentor: Multimodal Generative AI Companion for Learning and Crafting Interactive Digital Art Portfolios

  • paper_url: http://arxiv.org/abs/2311.14091
  • repo_url: None
  • paper_authors: Tao Long, Weirui Peng
  • for: This paper is written for design students who struggle to translate their creative ideas into tangible codes and designs, with a focus on providing tailored resources and academic support for non-technical art students.
  • methods: The paper presents a coding companion chatbot called PortfolioMentor, which guides and collaborates with students through proactive suggestions and responsible Q&As for learning, inspiration, and support.
  • results: The system synthesizes the artist’s visions, visual illustrations, audio or music suggestions, click-scroll effects, and creative vision conceptualization into a polished interactive digital portfolio.
    Abstract Digital art portfolios serve as impactful mediums for artists to convey their visions, weaving together visuals, audio, interactions, and narratives. However, without technical backgrounds, design students often find it challenging to translate creative ideas into tangible codes and designs, given the lack of tailored resources for the non-technical, academic support in art schools, and a comprehensive guiding tool throughout the mentally demanding process. Recognizing the role of companionship in code learning and leveraging generative AI models' capabilities in supporting creative tasks, we present PortfolioMentor, a coding companion chatbot for IDEs. This tool guides and collaborates with students through proactive suggestions and responsible Q&As for learning, inspiration, and support. In detail, the system starts with the understanding of the task and artist's visions, follows the co-creation of visual illustrations, audio or music suggestions and files, click-scroll effects for interactions, and creative vision conceptualization, and finally synthesizes these facets into a polished interactive digital portfolio.

AI-Generated Images Introduce Invisible Relevance Bias to Text-Image Retrieval

  • paper_url: http://arxiv.org/abs/2311.14084
  • repo_url: https://github.com/xsc1234/Invisible-Relevance-Bias
  • paper_authors: Shicheng Xu, Danyang Hou, Liang Pang, Jingcheng Deng, Jun Xu, Huawei Shen, Xueqi Cheng
  • for: This paper investigates the issue of invisible relevance bias in cross-modal retrieval, particularly when AI-generated images are present in the search results.
  • methods: The authors construct a benchmark to explore the existence of the bias and conduct extensive experiments to reveal that AI-generated images introduce an invisible relevance bias to text-image retrieval models.
  • results: The study shows that text-image retrieval models tend to rank AI-generated images higher than real images, even though the AI-generated images do not exhibit more visually relevant features to the query. The inclusion of AI-generated images in the training data of the retrieval models exacerbates the invisible relevance bias.
    Abstract With the advancement of generation models, AI-generated content (AIGC) is becoming more realistic, flooding the Internet. A recent study suggests that this phenomenon has elevated the issue of source bias in text retrieval for web searches. Specifically, neural retrieval models tend to rank generated texts higher than human-written texts. In this paper, we extend the study of this bias to cross-modal retrieval. Firstly, we successfully construct a suitable benchmark to explore the existence of the bias. Subsequent extensive experiments on this benchmark reveal that AI-generated images introduce an invisible relevance bias to text-image retrieval models. Specifically, our experiments show that text-image retrieval models tend to rank the AI-generated images higher than the real images, even though the AI-generated images do not exhibit more visually relevant features to the query than real images. This invisible relevance bias is prevalent across retrieval models with varying training data and architectures. Furthermore, our subsequent exploration reveals that the inclusion of AI-generated images in the training data of the retrieval models exacerbates the invisible relevance bias. The above phenomenon triggers a vicious cycle, which makes the invisible relevance bias become more and more serious. To elucidate the potential causes of invisible relevance and address the aforementioned issues, we introduce an effective training method aimed at alleviating the invisible relevance bias. Subsequently, we apply our proposed debiasing method to retroactively identify the causes of invisible relevance, revealing that the AI-generated images induce the image encoder to embed additional information into their representation. This information exhibits a certain consistency across generated images with different semantics and can make the retriever estimate a higher relevance score.

Learning Saliency From Fixations

  • paper_url: http://arxiv.org/abs/2311.14073
  • repo_url: None
  • paper_authors: Yasser Abdelaziz Dahou Djilali, Kevin McGuiness, Noel O’Connor
  • for: Predicting saliency (fixation points) in images.
  • methods: A transformer encoder-decoder with parallel decoding learns saliency directly from fixation maps, treating saliency prediction as a direct set prediction problem with a global loss that enforces unique fixation predictions through bipartite matching.
  • results: The approach achieves metric scores on par with state-of-the-art methods on the Salicon and MIT300 benchmarks.
    Abstract We present a novel approach for saliency prediction in images, leveraging parallel decoding in transformers to learn saliency solely from fixation maps. Models typically rely on continuous saliency maps, to overcome the difficulty of optimizing for the discrete fixation map. We attempt to replicate the experimental setup that generates saliency datasets. Our approach treats saliency prediction as a direct set prediction problem, via a global loss that enforces unique fixations prediction through bipartite matching and a transformer encoder-decoder architecture. By utilizing a fixed set of learned fixation queries, the cross-attention reasons over the image features to directly output the fixation points, distinguishing it from other modern saliency predictors. Our approach, named Saliency TRansformer (SalTR), achieves metric scores on par with state-of-the-art approaches on the Salicon and MIT300 benchmarks.
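  • code sketch: A minimal sketch of the set-prediction loss behind this kind of training: match predicted fixation points to ground-truth fixations with a bipartite (Hungarian) assignment, then penalize matched distances. The full method adds learned queries and classification terms; the points here are toy values.

```python
# Sketch of bipartite matching between predicted and ground-truth fixations;
# toy coordinates, and only the L2 cost term of a full set-prediction loss.
import numpy as np
from scipy.optimize import linear_sum_assignment

pred = np.array([[0.1, 0.2], [0.5, 0.5], [0.9, 0.8]])  # predicted (x, y) fixations
gt = np.array([[0.52, 0.48], [0.12, 0.18]])            # ground-truth fixations

cost = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)  # pairwise L2
rows, cols = linear_sum_assignment(cost)   # unique one-to-one assignment
loss = cost[rows, cols].mean()
print(list(zip(rows, cols)), loss)
```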

Point2RBox: Combine Knowledge from Synthetic Visual Patterns for End-to-end Oriented Object Detection with Single Point Supervision

  • paper_url: http://arxiv.org/abs/2311.14758
  • repo_url: https://github.com/open-mmlab/mmrotate
  • paper_authors: Yu Yi, Xue Yang, Qingyun Li, Feipeng Da, Junchi Yan, Jifeng Dai, Yu Qiao
  • for: This paper focuses on oriented object detection (OOD) under single point supervision, a more challenging yet label-efficient setting than prior weakly-supervised work that learns rotated boxes (RBoxes) from horizontal boxes (HBoxes).
  • methods: Two principles are proposed: 1) synthetic pattern knowledge combination: by sampling around each labelled point on the image, object features are transferred to synthetic visual patterns with known bounding boxes, providing the knowledge for box regression; 2) transform self-supervision: with a transformed input image (e.g. scaled/rotated), the output RBoxes are trained to follow the same transformation so that the network perceives the relative size/rotation between objects.
  • results: Using a lightweight paradigm, the method achieves competitive performance among point-supervised alternatives: 41.05%/27.62%/80.01% on the DOTA/DIOR/HRSC datasets.
    Abstract With the rapidly increasing demand for oriented object detection (OOD), recent research involving weakly-supervised detectors for learning rotated box (RBox) from the horizontal box (HBox) has attracted more and more attention. In this paper, we explore a more challenging yet label-efficient setting, namely single point-supervised OOD, and present our approach called Point2RBox. Specifically, we propose to leverage two principles: 1) Synthetic pattern knowledge combination: By sampling around each labelled point on the image, we transfer the object feature to synthetic visual patterns with the known bounding box to provide the knowledge for box regression. 2) Transform self-supervision: With a transformed input image (e.g. scaled/rotated), the output RBoxes are trained to follow the same transformation so that the network can perceive the relative size/rotation between objects. The detector is further enhanced by a few devised techniques to cope with peripheral issues, e.g. the anchor/layer assignment as the size of the object is not available in our point supervision setting. To our best knowledge, Point2RBox is the first end-to-end solution for point-supervised OOD. In particular, our method uses a lightweight paradigm, yet it achieves a competitive performance among point-supervised alternatives, 41.05%/27.62%/80.01% on DOTA/DIOR/HRSC datasets.
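  • code sketch: A sketch of the transform self-supervision principle: predict box angles on an image and on a rotated copy, and require the predictions to shift by exactly the applied rotation. The angle predictor and rotation op are stubs; the paper pairs this with synthetic-pattern supervision for full box regression.

```python
# Sketch of a rotation-consistency loss for transform self-supervision; the
# predictor, rotation stub, and shapes are illustrative assumptions.
import math
import torch

def rotate(image, delta):
    # placeholder rotation op; only a 90-degree rotation is supported here
    if abs(delta - math.pi / 2) < 1e-6:
        return torch.rot90(image, 1, dims=(-2, -1))
    return image

def rotation_consistency_loss(predict_angles, image, delta):
    theta = predict_angles(image)                     # angles on the original view
    theta_rot = predict_angles(rotate(image, delta))  # angles on the rotated view
    # wrap the angular error into (-pi, pi] before penalizing
    err = (theta_rot - theta - delta + math.pi) % (2 * math.pi) - math.pi
    return err.abs().mean()

net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 5))
img = torch.randn(1, 3, 32, 32)
loss = rotation_consistency_loss(net, img, delta=math.pi / 2)
loss.backward()
print(float(loss))
```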

PointOBB: Learning Oriented Object Detection via Single Point Supervision

  • paper_url: http://arxiv.org/abs/2311.14757
  • repo_url: https://github.com/luo-z13/pointobb
  • paper_authors: Junwei Luo, Xue Yang, Yi Yu, Qingyun Li, Junchi Yan, Yansheng Li
  • for: This work proposes PointOBB, the first single point-based oriented bounding box (OBB) generation method for oriented object detection in aerial images.
  • methods: PointOBB operates through the collaborative use of three distinctive views: an original view, a resized view, and a rotated/flipped (rot/flp) view. A scale augmentation module uses a Scale-Sensitive Consistency (SSC) loss to enhance the network's perception of object scale, and an angle acquisition module uses self-supervised learning to predict object angles, combined with a scale-guided Dense-to-Sparse (DS) matching strategy for aggregating dense angles corresponding to sparse objects.
  • results: Experiments on the DIOR-R and DOTA-v1.0 datasets show that PointOBB achieves promising performance and significantly outperforms potential point-supervised baselines.
    Abstract Single point-supervised object detection is gaining attention due to its cost-effectiveness. However, existing approaches focus on generating horizontal bounding boxes (HBBs) while ignoring oriented bounding boxes (OBBs) commonly used for objects in aerial images. This paper proposes PointOBB, the first single Point-based OBB generation method, for oriented object detection. PointOBB operates through the collaborative utilization of three distinctive views: an original view, a resized view, and a rotated/flipped (rot/flp) view. Upon the original view, we leverage the resized and rot/flp views to build a scale augmentation module and an angle acquisition module, respectively. In the former module, a Scale-Sensitive Consistency (SSC) loss is designed to enhance the deep network's ability to perceive the object scale. For accurate object angle predictions, the latter module incorporates self-supervised learning to predict angles, which is associated with a scale-guided Dense-to-Sparse (DS) matching strategy for aggregating dense angles corresponding to sparse objects. The resized and rot/flp views are switched using a progressive multi-view switching strategy during training to achieve coupled optimization of scale and angle. Experimental results on the DIOR-R and DOTA-v1.0 datasets demonstrate that PointOBB achieves promising performance, and significantly outperforms potential point-supervised baselines.

Shortcut Bias Mitigation via Ensemble Diversity Using Diffusion Probabilistic Models

  • paper_url: http://arxiv.org/abs/2311.16176
  • repo_url: None
  • paper_authors: Luca Scimeca, Alexander Rubinstein, Damien Teney, Seong Joon Oh, Armand Mihai Nicolicioiu, Yoshua Bengio
  • for: Mitigating the simplicity bias induced by spurious correlations in the data, so that models learn multiple predictive features instead of relying on erroneous, easy-to-learn cues.
  • methods: An ensemble diversification framework exploits Diffusion Probabilistic Models (DPMs) to generate synthetic counterfactuals, increasing model diversity via ensemble disagreement.
  • results: At particular training intervals, DPMs generate images with novel feature combinations even when trained on images with correlated input features; DPM-guided diversification removes dependence on primary shortcut cues without additional supervised signals, improving generalization and diversification.
    Abstract Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to a phenomenon known as simplicity bias, where a model relies on erroneous, easy-to-learn cues while ignoring reliable ones. In this work, we propose an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs) for shortcut bias mitigation. We show that at particular training intervals, DPMs can generate images with novel feature combinations, even when trained on images displaying correlated input features. We leverage this crucial property to generate synthetic counterfactuals to increase model diversity via ensemble disagreement. We show that DPM-guided diversification is sufficient to remove dependence on primary shortcut cues, without a need for additional supervised signals. We further empirically quantify its efficacy on several diversification objectives, and finally show improved generalization and diversification performance on par with prior work that relies on auxiliary data collection.
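  • code sketch: A hedged sketch of diversification via ensemble disagreement: on DPM-generated counterfactuals, ensemble members are pushed to disagree so each latches onto a different predictive cue. The symmetric-KL disagreement objective below is one simple instantiation, an assumption rather than the paper's exact loss.

```python
# Sketch of an ensemble-disagreement objective on synthetic counterfactuals;
# the symmetric-KL form and toy inputs are illustrative assumptions.
import torch
import torch.nn.functional as F

def pairwise_disagreement(logits_list):
    # mean symmetric KL between ensemble members' predictive distributions
    total, count = 0.0, 0
    for i in range(len(logits_list)):
        for j in range(i + 1, len(logits_list)):
            p = F.log_softmax(logits_list[i], dim=-1)
            q = F.log_softmax(logits_list[j], dim=-1)
            total = total + F.kl_div(p, q, log_target=True, reduction="batchmean") \
                          + F.kl_div(q, p, log_target=True, reduction="batchmean")
            count += 2
    return total / count

members = [torch.nn.Linear(16, 4) for _ in range(3)]
counterfactuals = torch.randn(8, 16)      # stand-in for DPM-generated images
logits = [m(counterfactuals) for m in members]
loss = -pairwise_disagreement(logits)     # minimize => maximize disagreement
loss.backward()
print(float(loss))
```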

Task-Distributionally Robust Data-Free Meta-Learning

  • paper_url: http://arxiv.org/abs/2311.14756
  • repo_url: None
  • paper_authors: Zixuan Hu, Li Shen, Zhenyi Wang, Yongxian Wei, Baoyuan Wu, Chun Yuan, Dacheng Tao
  • for: This paper aims to make data-free meta-learning (DFML) robust in practice, so that new tasks can be learned efficiently from multiple pre-trained models without their original training data.
  • methods: A robust DFML framework meta-learns from a pseudo task distribution diversified through task interpolation within a compact task-memory buffer, stabilizing the task distribution, and incorporates an automated model selection mechanism that parameterizes each model's reliability as a learnable weight optimized with a policy gradient algorithm during meta-training.
  • results: Experiments across various datasets demonstrate the framework's effectiveness in mitigating Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC), underscoring its potential to improve DFML in real-world scenarios.
    Abstract Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data. Existing inversion-based DFML methods construct pseudo tasks from a learnable dataset, which is inversely generated from the pre-trained model pool. For the first time, we reveal two major challenges hindering their practical deployments: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC). TDS leads to a biased meta-learner because of the skewed task distribution towards newly generated tasks. TDC occurs when untrusted models characterized by misleading labels or poor quality pollute the task distribution. To tackle these issues, we introduce a robust DFML framework that ensures task distributional robustness. We propose to meta-learn from a pseudo task distribution, diversified through task interpolation within a compact task-memory buffer. This approach reduces the meta-learner's overreliance on newly generated tasks by maintaining consistent performance across a broader range of interpolated memory tasks, thus ensuring its generalization for unseen tasks. Additionally, our framework seamlessly incorporates an automated model selection mechanism into the meta-training phase, parameterizing each model's reliability as a learnable weight. This is optimized with a policy gradient algorithm inspired by reinforcement learning, effectively addressing the non-differentiable challenge posed by model selection. Comprehensive experiments across various datasets demonstrate the framework's effectiveness in mitigating TDS and TDC, underscoring its potential to improve DFML in real-world scenarios.

Towards Explainable Strategy Templates using NLP Transformers

  • paper_url: http://arxiv.org/abs/2311.14061
  • repo_url: None
  • paper_authors: Pallavi Bagga, Kostas Stathis
  • for: bridges the gap between mathematical heuristic strategies learned from Deep Reinforcement Learning (DRL) and comprehensible, natural language explanations, making these strategies more accessible to non-experts.
  • methods: leverages traditional Natural Language Processing (NLP) techniques and Large Language Models (LLMs) equipped with Transformers to transform parts of DRL strategies into user-friendly, human-like English narratives.
  • results: presents a top-level algorithm that involves parsing mathematical expressions of strategy templates, semantically interpreting variables and structures, generating rule-based primary explanations, and utilizing a Generative Pre-trained Transformer (GPT) model to refine and contextualize these explanations, with subsequent customization for varied audiences and meticulous validation processes in an example illustrating the approach’s applicability and potential.
    Abstract This paper bridges the gap between mathematical heuristic strategies learned from Deep Reinforcement Learning (DRL) in automated agent negotiation, and comprehensible, natural language explanations. Our aim is to make these strategies more accessible to non-experts. By leveraging traditional Natural Language Processing (NLP) techniques and Large Language Models (LLMs) equipped with Transformers, we outline how parts of DRL strategies composed of parts within strategy templates can be transformed into user-friendly, human-like English narratives. To achieve this, we present a top-level algorithm that involves parsing mathematical expressions of strategy templates, semantically interpreting variables and structures, generating rule-based primary explanations, and utilizing a Generative Pre-trained Transformer (GPT) model to refine and contextualize these explanations. Subsequent customization for varied audiences and meticulous validation processes in an example illustrate the applicability and potential of this approach.

Identification for Tree-shaped Structural Causal Models in Polynomial Time

  • paper_url: http://arxiv.org/abs/2311.14058
  • repo_url: None
  • paper_authors: Aaryan Gupta, Markus Bläser
  • for: This paper studies linear structural causal models (SCMs) and the problem of identifying causal parameters from correlations between the nodes.
  • methods: The paper presents a randomized polynomial-time algorithm for the identification problem in tree-shaped SCMs, improving on the previous PSPACE-algorithm, and derives fractional affine square root terms of polynomials (FASTPs) for the corresponding parameters.
  • results: For every structural parameter, the algorithm decides whether it is generically identifiable, generically 2-identifiable, or generically unidentifiable, providing one or two FASTPs in the first two cases.
    Abstract Linear structural causal models (SCMs) are used to express and analyse the relationships between random variables. Direct causal effects are represented as directed edges and confounding factors as bidirected edges. Identifying the causal parameters from correlations between the nodes is an open problem in artificial intelligence. In this paper, we study SCMs whose directed component forms a tree. Van der Zander et al. (AISTATS'22, PMLR 151, pp. 6770--6792, 2022) give a PSPACE-algorithm for the identification problem in this case, which is a significant improvement over the general Gr\"obner basis approach, which has doubly-exponential time complexity in the number of structural parameters. In this work, we present a randomized polynomial-time algorithm, which solves the identification problem for tree-shaped SCMs. For every structural parameter, our algorithms decides whether it is generically identifiable, generically 2-identifiable, or generically unidentifiable. (No other cases can occur.) In the first two cases, it provides one or two fractional affine square root terms of polynomials (FASTPs) for the corresponding parameter, respectively.

Assessing the Impact of Noise on Quantum Neural Networks: An Experimental Analysis

  • paper_url: http://arxiv.org/abs/2311.14057
  • repo_url: None
  • paper_authors: Erik B. Terres Escudero, Danel Arias Alamo, Oier Mentxaka Gómez, Pablo García Bringas
  • for: This paper studies how various noise models affect the performance of quantum neural networks (QNNs).
  • methods: The Mottonen state preparation algorithm is examined under various noise models, and the degradation of quantum states as they pass through multiple layers of QNNs is analysed, together with the effect of noise on pre-trained QNNs.
  • results: Noise degrades QNN performance, demonstrating the challenge that noise models pose for quantum computing and the importance of stability and noise-correction measures when developing QNNs.
    Abstract In the race towards quantum computing, the potential benefits of quantum neural networks (QNNs) have become increasingly apparent. However, Noisy Intermediate-Scale Quantum (NISQ) processors are prone to errors, which poses a significant challenge for the execution of complex algorithms or quantum machine learning. To ensure the quality and security of QNNs, it is crucial to explore the impact of noise on their performance. This paper provides a comprehensive analysis of the impact of noise on QNNs, examining the Mottonen state preparation algorithm under various noise models and studying the degradation of quantum states as they pass through multiple layers of QNNs. Additionally, the paper evaluates the effect of noise on the performance of pre-trained QNNs and highlights the challenges posed by noise models in quantum computing. The findings of this study have significant implications for the development of quantum software, emphasizing the importance of prioritizing stability and noise-correction measures when developing QNNs to ensure reliable and trustworthy results. This paper contributes to the growing body of literature on quantum computing and quantum machine learning, providing new insights into the impact of noise on QNNs and paving the way towards the development of more robust and efficient quantum algorithms.

PrivateLoRA For Efficient Privacy Preserving LLM

  • paper_url: http://arxiv.org/abs/2311.14030
  • repo_url: None
  • paper_authors: Yiming Wang, Yu Lin, Xiaodong Zeng, Guannan Zhang
  • for: Providing an efficient, privacy-preserving Large Language Model (LLM) service paradigm that lets edge devices deliver personalized AI experiences.
  • methods: The proposed paradigm distributes privacy-sensitive computation on edge devices and shared computation in the cloud, transmitting only activations between the two to preserve data locality; the core innovation, PrivateLoRA, exploits the low rank of residual activations to achieve over 95% communication reduction.
  • results: Under standard 5G networks, PrivateLoRA achieves throughput over 300% of device-only solutions for 7B models and over 80% of an A100 GPU for 33B models, while providing tuning performance comparable to LoRA for advanced personalization.
    Abstract End users face a choice between privacy and efficiency in current Large Language Model (LLM) service paradigms. In cloud-based paradigms, users are forced to compromise data locality for generation quality and processing speed. Conversely, edge device paradigms maintain data locality but fail to deliver satisfactory performance. In this work, we propose a novel LLM service paradigm that distributes privacy-sensitive computation on edge devices and shared computation in the cloud. Only activations are transmitted between the central cloud and edge devices to ensure data locality. Our core innovation, PrivateLoRA, addresses the challenging communication overhead by exploiting the low rank of residual activations, achieving over 95% communication reduction. Consequently, PrivateLoRA effectively maintains data locality and is extremely resource efficient. Under standard 5G networks, PrivateLoRA achieves throughput over 300% of device-only solutions for 7B models and over 80% of an A100 GPU for 33B models. PrivateLoRA also provides tuning performance comparable to LoRA for advanced personalization. Our approach democratizes access to state-of-the-art generative AI for edge devices, paving the way for more tailored LLM experiences for the general public. To our knowledge, our proposed framework is the first efficient and privacy-preserving LLM solution in the literature.
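A minimal sketch of the idea, with shapes and component placement assumed for illustration (the released implementation may split things differently): the frozen backbone and shared rank projections live in the cloud, a tiny private r x r map lives on the device, and only rank-r activations cross the link in each direction.

```python
# Hedged sketch of the residual-activation split (placement is an assumption).
import torch
import torch.nn as nn

d_model, rank = 4096, 8                        # rank << d_model

W = nn.Linear(d_model, d_model, bias=False)    # cloud: frozen backbone weight
A = nn.Linear(d_model, rank, bias=False)       # cloud: shared down-projection
B = nn.Linear(rank, d_model, bias=False)       # cloud: shared up-projection
M = nn.Linear(rank, rank, bias=False)          # edge: private, trainable map

h = torch.randn(1, d_model)    # hidden state inside the cloud forward pass
z_down = A(h)                  # cloud -> edge: only r floats per token
z_up = M(z_down)               # edge-only personalization step
y = W(h) + B(z_up)             # edge -> cloud: r floats back, residual merge
print(2 * rank / d_model)      # round-trip payload vs. full activations (~0.4%)
```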

Continual Learning of Diffusion Models with Generative Distillation

  • paper_url: http://arxiv.org/abs/2311.14028
  • repo_url: https://github.com/atenrev/difussion_continual_learning
  • paper_authors: Sergi Masip, Pau Rodriguez, Tinne Tuytelaars, Gido M. van de Ven
  • for: To address the problem of continually learning diffusion models.
  • methods: Proposes Generative Distillation, which distils the entire reverse process of a diffusion model to improve its continual learning performance.
  • results: Shows that Generative Distillation significantly improves the continual learning performance of generative replay with only a moderate increase in computational cost.
    Abstract Diffusion models are powerful generative models that achieve state-of-the-art performance in tasks such as image synthesis. However, training them demands substantial amounts of data and computational resources. Continual learning would allow for incrementally learning new tasks and accumulating knowledge, thus reusing already trained models would be possible. One potentially suitable approach is generative replay, where a copy of a generative model trained on previous tasks produces synthetic data that are interleaved with data from the current task. However, standard generative replay applied to diffusion models results in a catastrophic loss in denoising capabilities. In this paper, we propose generative distillation, an approach that distils the entire reverse process of a diffusion model. We demonstrate that our approach significantly improves the continual learning performance of generative replay with only a moderate increase in the computational costs.
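A minimal sketch of the training step, assuming a flattened data layout, a linear beta schedule, and a `teacher.sample(n)` API (all illustrative): the student learns the current task with the usual denoising loss, while past-task knowledge is preserved by matching the frozen teacher's noise predictions on noised versions of its own samples, distilling the reverse process rather than merely replaying samples.

```python
import torch
import torch.nn.functional as F

# Standard closed-form forward noising q(x_t | x_0), linear beta schedule.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0, t, noise):
    a = alpha_bar[t].view(-1, 1)               # x flattened to (B, D) for brevity
    return a.sqrt() * x0 + (1 - a).sqrt() * noise

def distill_step(student, teacher, x_cur):
    """teacher: frozen copy trained on past tasks; .sample(n) is an assumed API."""
    t = torch.randint(0, T, (x_cur.size(0),))
    # Current task: ordinary denoising objective.
    noise = torch.randn_like(x_cur)
    loss_new = F.mse_loss(student(q_sample(x_cur, t, noise), t), noise)
    # Past tasks: distil the teacher's reverse process on its own samples,
    # instead of treating those samples as ordinary training data.
    x_replay = teacher.sample(n=x_cur.size(0))
    xr_t = q_sample(x_replay, t, torch.randn_like(x_replay))
    with torch.no_grad():
        target = teacher(xr_t, t)
    return loss_new + F.mse_loss(student(xr_t, t), target)
```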

When Side-Channel Attacks Break the Black-Box Property of Embedded Artificial Intelligence

  • paper_url: http://arxiv.org/abs/2311.14005
  • repo_url: None
  • paper_authors: Benoit Coqueret, Mathieu Carbone, Olivier Sentieys, Gabriel Zaid
  • for: To study black-box attack strategies in a genuine black-box setting, where an attacker with no access to the network extracts its outputs through an electromagnetic side-channel attack.
  • methods: Proposes an architecture-agnostic attack that combines hardware and software techniques: an electromagnetic side-channel attack extracts the logits for a given input, which are then used to estimate gradients and produce state-of-the-art adversarial examples.
  • results: Experiments show that the attack efficiently generates adversarial examples in a true black-box setting and works across different network architectures.
    Abstract Artificial intelligence, and specifically deep neural networks (DNNs), has rapidly emerged in the past decade as the standard for several tasks from specific advertising to object detection. The performance offered has led DNN algorithms to become a part of critical embedded systems, requiring both efficiency and reliability. In particular, DNNs are subject to malicious examples designed in a way to fool the network while being undetectable to the human observer: the adversarial examples. While previous studies propose frameworks to implement such attacks in black-box settings, those often rely on the hypothesis that the attacker has access to the logits of the neural network, breaking the assumption of the traditional black box. In this paper, we investigate a real black-box scenario where the attacker has no access to the logits. In particular, we propose an architecture-agnostic attack which solves this constraint by extracting the logits. Our method combines hardware and software attacks, by performing a side-channel attack that exploits electromagnetic leakages to extract the logits for a given input, allowing an attacker to estimate the gradients and produce state-of-the-art adversarial examples to fool the targeted neural network. Through this example of adversarial attack, we demonstrate the effectiveness of logits extraction using a side channel as a first step for more general attack frameworks requiring either the logits or the confidence scores.
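To make the attack flow concrete, here is a hedged, generic sketch (not the paper's tooling): once the side channel provides a logits oracle, gradients can be estimated by symmetric finite differences over random directions, ZOO-style, and used for an FGSM-like perturbation.

```python
# Hedged sketch: black-box adversarial example from a side-channel logits oracle.
import numpy as np

def adversarial_from_logits(logits_fn, x, true_label, eps=0.03,
                            sigma=1e-3, n_dirs=64, rng=np.random.default_rng(0)):
    """logits_fn: oracle backed by the side-channel extraction (assumed)."""
    def margin(z):                       # loss to increase: best other - true
        l = logits_fn(z)
        return np.max(np.delete(l, true_label)) - l[true_label]
    g = np.zeros_like(x)
    for _ in range(n_dirs):              # random-direction gradient estimate
        u = rng.standard_normal(x.shape)
        g += (margin(x + sigma * u) - margin(x - sigma * u)) / (2 * sigma) * u
    return np.clip(x + eps * np.sign(g / n_dirs), 0.0, 1.0)
```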

Direct Preference-Based Evolutionary Multi-Objective Optimization with Dueling Bandit

  • paper_url: http://arxiv.org/abs/2311.14003
  • repo_url: None
  • paper_authors: Tian Huang, Ke Li
  • for: To propose a multi-objective evolutionary algorithm framework based on direct preference learning for solving optimization problems.
  • methods: Uses an active dueling bandit algorithm that learns the user's preferences directly from human feedback, sidestepping the need to compute a fitness function.
  • results: Experiments show that the method adapts quickly to user preferences and enables efficient search within multi-objective evolutionary algorithms.
    Abstract Optimization problems find widespread use in both single-objective and multi-objective scenarios. In practical applications, users aspire for solutions that converge to the region of interest (ROI) along the Pareto front (PF). While the conventional approach involves approximating a fitness function or an objective function to reflect user preferences, this paper explores an alternative avenue. Specifically, we aim to discover a method that sidesteps the need for calculating the fitness function, relying solely on human feedback. Our proposed approach entails conducting direct preference learning facilitated by an active dueling bandit algorithm. The experimental phase is structured into three sessions. Firstly, we assess the performance of our active dueling bandit algorithm. Secondly, we implement our proposed method within the context of Multi-objective Evolutionary Algorithms (MOEAs). Finally, we deploy our method in a practical problem, specifically in protein structure prediction (PSP). This research presents a novel interactive preference-based MOEA framework that not only addresses the limitations of traditional techniques but also unveils new possibilities for optimization problems.
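The sketch below shows a generic UCB-style dueling-bandit loop (the paper's exact algorithm may differ): a human oracle answers pairwise duels, and optimistic win-rate estimates decide which duel to ask next, so no fitness function is ever computed.

```python
# Hedged sketch of a dueling-bandit selection loop driven by human feedback.
import numpy as np

def dueling_select(candidates, prefer, rounds=200, alpha=0.6):
    """prefer(i, j) -> True if the human prefers candidates[i] to candidates[j]."""
    K = len(candidates)
    wins = np.zeros((K, K)); n = np.zeros((K, K))
    for t in range(1, rounds + 1):
        ucb = wins / np.maximum(n, 1) + np.sqrt(alpha * np.log(t + 1) / np.maximum(n, 1))
        np.fill_diagonal(ucb, -np.inf)          # never duel against oneself
        i = int(np.argmax(ucb.max(axis=1)))     # optimistic champion
        j = int(np.argmax(ucb[i]))              # its strongest challenger
        if prefer(i, j): wins[i, j] += 1
        else:            wins[j, i] += 1
        n[i, j] += 1; n[j, i] += 1
    return int(np.argmax((wins / np.maximum(n, 1)).mean(axis=1)))  # Copeland-style winner
```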

Learning Dynamic Selection and Pricing of Out-of-Home Deliveries

  • paper_url: http://arxiv.org/abs/2311.13983
  • repo_url: https://github.com/frakkerman/ooh_code
  • paper_authors: Fabian Akkerman, Peter Dieter, Martijn Mes
  • for: To address the external factors that hurt last-mile logistics profitability, namely failed home deliveries, traffic congestion, and long handling times, which contribute up to 28% of total delivery costs and 25% of emissions.
  • methods: Proposes Dynamic Selection and Pricing of OOH (DSPO), an AI-based algorithmic pipeline that models customer choice behavior using a novel spatial-temporal state encoding fed into a convolutional neural network.
  • results: Benchmarked against three state-of-the-art approaches, DSPO saves 20.8% in costs compared to a situation without OOH locations, 8.1% compared to a static selection and pricing policy, and 4.6% compared to a state-of-the-art demand management benchmark. The findings suggest practitioners should adopt dynamic selection and pricing policies as OOH delivery gains a larger market share.
    Abstract Home delivery failures, traffic congestion, and relatively large handling times have a negative impact on the profitability of last-mile logistics. These external factors contribute to up to $28\%$ of the overall costs and $25\%$ of emissions for the home delivery supply chain. A potential solution, showing annual growth rates up to $36\%$, is the delivery to parcel lockers or parcel shops, denoted by out-of-home (OOH) delivery. In the academic literature, models of customer behavior with respect to OOH delivery were so far limited to deterministic settings, contrasting with the stochastic nature of actual customer choices. We model the sequential decision-making problem of which OOH location to offer against what incentive for each incoming customer, taking into account future customer arrivals and choices. We propose Dynamic Selection and Pricing of OOH (DSPO), an algorithmic pipeline that uses a novel spatial-temporal state encoding as input to a convolutional neural network. We demonstrate the performance of our method by benchmarking it against three state-of-the-art approaches. Our extensive numerical study, guided by real-world data, reveals that DSPO can save $20.8\%$ in costs compared to a situation without OOH locations, $8.1\%$ compared to a static selection and pricing policy, and $4.6\%$ compared to a state-of-the-art demand management benchmark. We provide comprehensive insights into the complex interplay between OOH delivery dynamics and customer behavior influenced by pricing strategies. The implications of our findings suggest practitioners to adopt dynamic selection and pricing policies as OOH delivery gains a larger market share.
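As a structural sketch only (channel semantics, grid size, and head design are assumptions), the policy below mirrors the described pipeline: a spatial-temporal state tensor goes through a small CNN, which scores both which OOH location to offer and which incentive to attach.

```python
# Hedged sketch of a DSPO-style policy network; layer sizes are illustrative.
import torch
import torch.nn as nn

class DSPOPolicy(nn.Module):
    def __init__(self, n_channels=4, grid=16, n_prices=5):
        super().__init__()
        # Channels might encode: customer location, OOH locations,
        # remaining capacity, time-of-day (illustrative choices).
        self.conv = nn.Sequential(
            nn.Conv2d(n_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(), nn.Flatten())
        self.loc_head = nn.Linear(32 * grid * grid, grid * grid)  # which locker
        self.price_head = nn.Linear(32 * grid * grid, n_prices)   # which incentive

    def forward(self, state):                  # state: (B, C, grid, grid)
        z = self.conv(state)
        return self.loc_head(z), self.price_head(z)

policy = DSPOPolicy()
loc_logits, price_logits = policy(torch.randn(1, 4, 16, 16))
```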

Probabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex Questions

  • paper_url: http://arxiv.org/abs/2311.13982
  • repo_url: https://github.com/Neo-Zhangjiajie/ProbTree
  • paper_authors: Shulin Cao, Jiajie Zhang, Jiaxin Shi, Xin Lv, Zijun Yao, Qi Tian, Juanzi Li, Lei Hou
  • for: To improve large language models' (LLMs') ability to answer knowledge-intensive complex questions with chain-of-thought (CoT) style reasoning.
  • methods: Proposes Probabilistic Tree-of-thought Reasoning (ProbTree), which translates a complex question into a query tree and solves it from leaf to root while weighing the confidence of both question decomposition and answering.
  • results: On three complex QA datasets, ProbTree outperforms the previous state-of-the-art, demonstrating the effectiveness of probabilistic tree-of-thought reasoning.
    Abstract Large language models (LLMs) are capable of answering knowledge-intensive complex questions with chain-of-thought (CoT) reasoning. However, they tend to generate factually incorrect reasoning steps when the required knowledge is not available or up-to-date in models' parameters. Recent works turn to retrieving external knowledge to augment CoT reasoning. Despite being promising, these chain-based methods suffer from: 1) Negative retrieval. Unnecessary or incorrect retrieval may mislead the reasoning; 2) Limited sight. Lacking the ability to look backward or forward, a local error in one step will propagate along the chain. In this paper, we propose a novel approach: Probabilistic Tree-of-thought Reasoning (ProbTree). First, LLMs translate a complex question into a query tree, in which each non-root node denotes a sub-question of its parent node. Then, probabilistic reasoning is conducted over the tree, by solving questions from leaf to root considering the confidence of both question decomposing and answering. During reasoning, for leaf nodes, LLMs choose a more confident answer from Closed-book QA that employs parametric knowledge and Open-book QA that employs retrieved external knowledge, thus eliminating the negative retrieval problem. For non-leaf nodes, with the hierarchical structure, LLMs have broader sights and are able to globally reason with the information from child nodes, thus recovering from local errors. The experiments on three Complex QA datasets under the open-domain setting show that our approach outperforms SOTA methods significantly, demonstrating the effect of probabilistic tree-of-thought reasoning.
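A compact sketch of the control flow, with the data structures and confidence scoring assumed for illustration: leaves are answered first, and every node keeps whichever of closed-book QA, open-book QA, or child-aggregated reasoning is most confident.

```python
# Hedged sketch of tree-of-thought reasoning with confidence-weighted answers.
from dataclasses import dataclass, field

@dataclass
class Node:
    question: str
    children: list = field(default_factory=list)   # sub-questions of this node

def solve(node, closed_qa, open_qa, aggregate):
    """closed_qa / open_qa / aggregate return (answer, log_confidence) — assumed APIs."""
    candidates = [closed_qa(node.question),        # parametric knowledge
                  open_qa(node.question)]          # retrieved external knowledge
    if node.children:
        sub = [solve(c, closed_qa, open_qa, aggregate) for c in node.children]
        # Parent confidence combines its own step with the children's scores.
        ans, conf = aggregate(node.question, [a for a, _ in sub])
        candidates.append((ans, conf + sum(c for _, c in sub)))
    return max(candidates, key=lambda ac: ac[1])   # most confident answer wins
```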

Cluster trajectory of SOFA score in predicting mortality in sepsis

  • paper_url: http://arxiv.org/abs/2311.17066
  • repo_url: None
  • paper_authors: Yuhe Ke, Matilda Swee Sun Tang, Celestine Jia Ling Loh, Hairil Rizal Abdullah, Nicholas Brian Shannon
  • For: To investigate the relationship between dynamic changes in SOFA scores over the first 72 hours of ICU admission and patient outcomes.
  • Methods: Group-based trajectory modelling with dynamic time warping and k-means clustering is used to identify distinct trajectory patterns in dynamic SOFA scores.
  • Results: The cluster with persistently elevated scores (Cluster D) had the highest ICU and hospital mortality and the longest ICU and hospital stays. Discharge rates from ICU were similar for Clusters A and B, while Cluster C had initially comparable rates but a slower transition to the ward.
    Abstract Objective: Sepsis is a life-threatening condition. Sequential Organ Failure Assessment (SOFA) score is commonly used to assess organ dysfunction and predict ICU mortality, but it is taken as a static measurement and fails to capture dynamic changes. This study aims to investigate the relationship between dynamic changes in SOFA scores over the first 72 hours of ICU admission and patient outcomes. Design, setting, and participants: 3,253 patients in the Medical Information Mart for Intensive Care IV database who met the sepsis-3 criteria and were admitted from the emergency department with at least 72 hours of ICU admission and full-active resuscitation status were analysed. Group-based trajectory modelling with dynamic time warping and k-means clustering identified distinct trajectory patterns in dynamic SOFA scores. They were subsequently compared using Python. Main outcome measures: Outcomes including hospital and ICU mortality, length of stay in hospital and ICU, and readmission during hospital stay, were collected. Discharge time from ICU to wards and cut-offs at 7-day and 14-day were taken. Results: Four clusters were identified: A (consistently low SOFA scores), B (rapid increase followed by a decline in SOFA scores), C (higher baseline scores with gradual improvement), and D (persistently elevated scores). Cluster D had the longest ICU and hospital stays, highest ICU and hospital mortality. Discharge rates from ICU were similar for Clusters A and B, while Cluster C had initially comparable rates but a slower transition to ward. Conclusion: Monitoring dynamic changes in SOFA score is valuable for assessing sepsis severity and treatment responsiveness.
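A minimal sketch of the trajectory-clustering step using tslearn (the sampling interval and preprocessing are assumptions; the study's exact pipeline may differ):

```python
# Hedged sketch: DTW-based k-means over 72-hour SOFA-score trajectories.
import numpy as np
from tslearn.clustering import TimeSeriesKMeans

# sofa: (n_patients, n_timepoints, 1), e.g. a score every 6h for 72h -> 12 points
rng = np.random.default_rng(0)
sofa = rng.integers(0, 15, size=(300, 12, 1)).astype(float)

km = TimeSeriesKMeans(n_clusters=4, metric="dtw", random_state=0)
labels = km.fit_predict(sofa)      # cluster A..D assignment per patient
print(np.bincount(labels))         # cluster sizes; compare outcomes per cluster
```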

Deep Interactive Segmentation of Medical Images: A Systematic Review and Taxonomy

  • paper_url: http://arxiv.org/abs/2311.13964
  • repo_url: None
  • paper_authors: Zdravko Marinov, Paul F. Jäger, Jan Egger, Jens Kleesiek, Rainer Stiefelhagen
  • for: Provides a comprehensive overview of the emerging field of interactive segmentation in medical image analysis, with a focus on deep learning-based approaches.
  • methods: Reviews 121 methods proposed in the medical imaging domain, providing a systematic analysis of current practices through a comprehensive taxonomy.
  • results: Identifies challenges and opportunities in the field, notably the difficulty of comparing methods due to the lack of standardized baselines and benchmarks, and highlights the need for further research in this area.
    Abstract Interactive segmentation is a crucial research area in medical image analysis aiming to boost the efficiency of costly annotations by incorporating human feedback. This feedback takes the form of clicks, scribbles, or masks and allows for iterative refinement of the model output so as to efficiently guide the system towards the desired behavior. In recent years, deep learning-based approaches have propelled results to a new level causing a rapid growth in the field with 121 methods proposed in the medical imaging domain alone. In this review, we provide a structured overview of this emerging field featuring a comprehensive taxonomy, a systematic review of existing methods, and an in-depth analysis of current practices. Based on these contributions, we discuss the challenges and opportunities in the field. For instance, we find that there is a severe lack of comparison across methods which needs to be tackled by standardized baselines and benchmarks.

Human Machine Co-Creation. A Complementary Cognitive Approach to Creative Character Design Process Using GANs

  • paper_url: http://arxiv.org/abs/2311.13960
  • repo_url: None
  • paper_authors: Mohammad Lataifeh, Xavier A Carrascoa, Ashraf M Elnagara, Naveed Ahmeda, Imran Junejo
  • for: To create a complementary co-design process between humans and machines that augments character designers' abilities to visualize and create new characters for multimedia projects such as games and animation.
  • methods: Uses Generative Adversarial Networks (GANs) to generate new visual content; the machine-generated concepts serve as a launching platform for character designers to conceptualize new characters.
  • results: The discussed results substantiate the value of the proposed co-creation framework and elucidate how the generated concepts act as cognitive substances that interact with designers' competencies in a versatile manner to influence the creative process of conceptualizing novel characters.
    Abstract Recent advances in Generative Adversarial Network (GAN) applications continue to attract the attention of researchers in different fields. In such a framework, two neural networks compete adversarially to generate new visual contents indistinguishable from the original dataset. The objective of this research is to create a complementary co-design process between humans and machines to augment character designers' abilities in visualizing and creating new characters for multimedia projects such as games and animation. Driven by design cognitive scaffolding, the proposed approach aims to inform the process of perceiving, knowing, and making. The machine-generated concepts are used as a launching platform for character designers to conceptualize new characters. A labelled dataset of 22,000 characters was developed for this work and deployed using different GANs to evaluate the one most suited to the context, followed by a mixed-methods evaluation of the machine output and human derivations. The discussed results substantiate the value of the proposed co-creation framework and elucidate how the generated concepts are used as cognitive substances that interact with designers' competencies in a versatile manner to influence the creative processes of conceptualizing novel characters.

Learning Uniform Clusters on Hypersphere for Deep Graph-level Clustering

  • paper_url: http://arxiv.org/abs/2311.13953
  • repo_url: None
  • paper_authors: Mengling Hu, Chaochao Chen, Weiming Liu, Xinyi Zhang, Xinting Liao, Xiaolin Zheng
  • for: To propose a novel deep graph-level clustering method (UDGC) that achieves a more uniform cluster distribution and mitigates cluster collapse.
  • methods: UDGC uses Augmentation-Consensus Optimal Transport (ACOT) to generate uniformly distributed and reliable pseudo labels, contrastive learning to scatter the clusters on the unit hypersphere, and Center Alignment Optimal Transport (CAOT) to guide the model toward better parameters and stronger cluster performance.
  • results: Experiments on eight well-known datasets show that UDGC significantly outperforms state-of-the-art models, achieving a more uniform cluster distribution while avoiding cluster collapse.
    Abstract Graph clustering has been widely studied in recent years. However, most existing graph clustering methods focus on node-level clustering, i.e., grouping nodes in a single graph into clusters. In contrast, graph-level clustering, i.e., grouping multiple graphs into clusters, remains largely unexplored. Graph-level clustering is critical in a variety of real-world applications, such as property prediction for molecules and community analysis in social networks. However, graph-level clustering is challenging due to the insufficient discriminability of graph-level representations, which makes deep clustering more likely to obtain degenerate solutions (cluster collapse). To address the issue, we propose a novel deep graph-level clustering method called Uniform Deep Graph Clustering (UDGC). UDGC assigns instances evenly to different clusters and then scatters those clusters on the unit hypersphere, leading to a more uniform cluster-level distribution and less cluster collapse. Specifically, we first propose Augmentation-Consensus Optimal Transport (ACOT) for generating uniformly distributed and reliable pseudo labels for partitioning clusters. Then we adopt contrastive learning to scatter those clusters. Besides, we propose Center Alignment Optimal Transport (CAOT) for guiding the model to learn better parameters, which further promotes clustering performance. Our empirical study on eight well-known datasets demonstrates that UDGC significantly outperforms the state-of-the-art models.
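The equipartition intuition behind ACOT can be sketched with a generic Sinkhorn-Knopp step (a loose stand-in, not the paper's exact formulation): balancing the assignment matrix between hypersphere-normalized embeddings and cluster centers pushes pseudo-labels toward a uniform cluster distribution, which is what counteracts collapse.

```python
# Hedged sketch: uniform pseudo-labels via Sinkhorn balancing on the hypersphere.
import torch
import torch.nn.functional as F

def uniform_pseudo_labels(emb, centers, eps=0.05, iters=3):
    emb = F.normalize(emb, dim=1); centers = F.normalize(centers, dim=1)
    scores = emb @ centers.T                 # cosine similarities, shape (N, K)
    Q = torch.exp(scores / eps)
    for _ in range(iters):                   # Sinkhorn-Knopp balancing
        Q = Q / Q.sum(dim=0, keepdim=True)   # roughly equal mass per cluster
        Q = Q / Q.sum(dim=1, keepdim=True)   # one unit of mass per sample
    return Q.argmax(dim=1)                   # hard pseudo-label per graph

labels = uniform_pseudo_labels(torch.randn(128, 32), torch.randn(10, 32))
print(torch.bincount(labels))                # counts spread across all clusters
```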

Parameter Exchange for Robust Dynamic Domain Generalization

  • paper_url: http://arxiv.org/abs/2311.13928
  • repo_url: https://github.com/metavisionlab/pe
  • paper_authors: Luojun Lin, Zhifeng Shen, Zhishu Sun, Yuanlong Yu, Lei Zhang, Weijie Chen
  • For: To improve the generalization ability of dynamic domain generalization (DDG) models on unknown target domains by disentangling the static and dynamic components more thoroughly from an optimization perspective.
  • Methods: The proposed Parameter Exchange (PE) method perturbs the combination of the static and dynamic components, enabling the static component to learn domain-invariant features more comprehensively while the dynamic component focuses on adaptive domain-specific features. The model is optimized jointly with gradients from both the perturbed and non-perturbed forward passes.
  • Results: Extensive experiments show that PE can be easily plugged into existing dynamic networks to improve their generalization ability without bells and whistles, resisting agnostic domain shifts and improving self-adaptability on unknown target domains.
    Abstract Agnostic domain shift is the main reason for model degradation on unknown target domains, which brings an urgent need to develop Domain Generalization (DG). Recent advances in DG use dynamic networks to achieve training-free adaptation on the unknown target domains, termed Dynamic Domain Generalization (DDG), which compensates for the lack of self-adaptability in static models with fixed weights. The parameters of dynamic networks can be decoupled into a static and a dynamic component, which are designed to learn domain-invariant and domain-specific features, respectively. Building on existing work, we try to push the limits of DDG by disentangling the static and dynamic components more thoroughly from an optimization perspective. Our main consideration is that we can enable the static component to learn domain-invariant features more comprehensively by augmenting the domain-specific information. As a result, the more comprehensive domain-invariant features learned by the static component can then enforce the dynamic component to focus more on learning adaptive domain-specific features. To this end, we propose a simple yet effective Parameter Exchange (PE) method to perturb the combination between the static and dynamic components. We optimize the model using the gradients from both the perturbed and non-perturbed feed-forward passes jointly to implicitly achieve the aforementioned disentanglement. In this way, the two components can be optimized in a mutually-beneficial manner, which can resist the agnostic domain shifts and improve the self-adaptability on the unknown target domain. Extensive experiments show that PE can be easily plugged into existing dynamic networks to improve their generalization ability without bells and whistles.
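A minimal sketch of a PE-style training step (the exchange granularity and the dynamic head's API are assumptions): one ordinary forward pass, one pass with the per-sample dynamic parameters shuffled across the batch, and a joint backward over both losses.

```python
# Hedged sketch; dyn_head.generate / dyn_head.apply are assumed interfaces.
import torch
import torch.nn.functional as F

def pe_step(static_net, dyn_head, x, y, optimizer):
    feats = static_net(x)                      # domain-invariant features
    theta = dyn_head.generate(feats)           # per-sample dynamic params (assumed)
    loss_plain = F.cross_entropy(dyn_head.apply(feats, theta), y)

    perm = torch.randperm(x.size(0))           # exchange dynamic params across samples
    loss_perturbed = F.cross_entropy(dyn_head.apply(feats, theta[perm]), y)

    optimizer.zero_grad()
    (loss_plain + loss_perturbed).backward()   # joint gradients disentangle components
    optimizer.step()
```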

A DRL solution to help reduce the cost in waiting time of securing a traffic light for cyclists

  • paper_url: http://arxiv.org/abs/2311.13905
  • repo_url: https://github.com/LucasMagnana/A-DRL-solution-to-help-reduce-the-cost-in-waiting-time-of-securing-a-traffic-light-for-cyclists.
  • paper_authors: Lucas Magnana, Hervé Rivano, Nicolas Chiabaut
  • for: To reduce cyclists' waiting time by optimizing a traffic light's green-phase cycle with deep reinforcement learning.
  • methods: A deep reinforcement learning algorithm adapts the green-phase cycle to the traffic and is compared against an actuated traffic light control algorithm using vehicle counter data over whole days.
  • results: The DRL approach achieves better minimization of vehicle waiting time at almost all hours and is robust to moderate changes in bike traffic.
    Abstract Cyclists prefer to use infrastructure that separates them from motorized traffic. Using a traffic light to segregate car and bike flows, with the addition of bike-specific green phases, is a lightweight and cheap solution that can be deployed dynamically to assess the opportunity of a heavier infrastructure such as a separate bike lane. To compensate for the increased waiting time induced by these new phases, we introduce in this paper a deep reinforcement learning solution that adapts the green phase cycle of a traffic light to the traffic. Vehicle counter data are used to compare the DRL approach with the actuated traffic light control algorithm over whole days. Results show that DRL achieves better minimization of vehicle waiting time at almost all hours. Our DRL approach is also robust to moderate changes in bike traffic. The code of this paper is available at https://github.com/LucasMagnana/A-DRL-solution-to-help-reduce-the-cost-in-waiting-time-of-securing-a-traffic-light-for-cyclists.
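For concreteness, a hedged sketch of what such a controller can look like (the state and action design are assumptions, not the repository's exact code): a small Q-network reads counter-derived features each step and chooses between holding and switching the green phase.

```python
# Hedged sketch of a phase-control Q-network; trainable with standard DQN.
import torch
import torch.nn as nn

class PhasePolicy(nn.Module):
    """State: [cars_waiting, bikes_waiting, phase_one_hot..., time_in_phase]."""
    def __init__(self, state_dim=6, n_actions=2):   # 0 = keep phase, 1 = switch
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))
    def forward(self, s):
        return self.net(s)                           # Q-values per action

q = PhasePolicy()
state = torch.tensor([[4., 2., 1., 0., 0., 12.]])    # illustrative counter reading
action = q(state).argmax(dim=1)                      # greedy control step
# The reward would be the negative waiting time accumulated since the last step.
```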

General Phrase Debiaser: Debiasing Masked Language Models at a Multi-Token Level

  • paper_url: http://arxiv.org/abs/2311.13892
  • repo_url: https://github.com/bingkangshi/general-phrase-debiaser
  • paper_authors: Bingkang Shi, Xiaodan Zhang, Dehan Kong, Yulei Wu, Zongzhen Liu, Honglei Lyu, Longtao Huang
  • for: To mitigate social biases and unwelcome stereotypes in language models.
  • methods: Proposes an automatic multi-token debiasing pipeline, the General Phrase Debiaser, which mitigates biases in masked language models at both the word and phrase level, combining a phrase-filter stage that mines stereotypical phrases from Wikipedia pages with a model-debias stage.
  • results: Achieves state-of-the-art results on standard datasets and metrics, significantly reducing gender bias on careers and across multiple disciplines for models of varying parameter sizes.
    Abstract The social biases and unwelcome stereotypes revealed by pretrained language models are becoming obstacles to their application. Compared to numerous debiasing methods targeting word level, there has been relatively less attention on biases present at phrase level, limiting the performance of debiasing in discipline domains. In this paper, we propose an automatic multi-token debiasing pipeline called \textbf{General Phrase Debiaser}, which is capable of mitigating phrase-level biases in masked language models. Specifically, our method consists of a \textit{phrase filter stage} that generates stereotypical phrases from Wikipedia pages as well as a \textit{model debias stage} that can debias models at the multi-token level to tackle bias challenges on phrases. The latter searches for prompts that trigger model's bias, and then uses them for debiasing. State-of-the-art results on standard datasets and metrics show that our approach can significantly reduce gender biases on both career and multiple disciplines, across models with varying parameter sizes.
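A simple probe in the spirit of the phrase-filter stage (illustrative only; the paper's scoring is more involved): compare a masked LM's gendered fill-in probabilities around a candidate phrase to estimate its stereotype direction.

```python
# Hedged sketch: phrase-level gender-bias probe for a masked language model.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def phrase_bias(phrase):
    preds = fill(f"[MASK] works in {phrase}.", targets=["he", "she"])
    p = {d["token_str"]: d["score"] for d in preds}
    return p.get("he", 0.0) - p.get("she", 0.0)      # > 0 leans male

for ph in ["mechanical engineering", "early childhood education"]:
    print(ph, round(phrase_bias(ph), 4))
```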

Can Physics Informed Neural Operators Self Improve?

  • paper_url: http://arxiv.org/abs/2311.13885
  • repo_url: None
  • paper_authors: Ritam Majumdar, Amey Varhade, Shirish Karande, Lovekesh Vig
  • for: To explore the application of self-training techniques to Fourier Neural Operators (FNOs).
  • methods: Trains FNOs via self-training, using the model's own predictions as pseudo-labels alongside a physics loss.
  • results: With self-training, FNOs trained exclusively with a physics loss approach the accuracy of counterparts trained with both data and physics losses on the canonical 1D-Burgers and 2D-Darcy problems; pseudo-labels can be used without training to convergence in each iteration, yielding schedules that improve on PINO's baseline accuracy and time.
    Abstract Self-training techniques have shown remarkable value across many deep learning models and tasks. However, such techniques remain largely unexplored when considered in the context of learning fast solvers for systems of partial differential equations (Eg: Neural Operators). In this work, we explore the use of self-training for Fourier Neural Operators (FNO). Neural Operators emerged as a data driven technique, however, data from experiments or traditional solvers is not always readily available. Physics Informed Neural Operators (PINO) overcome this constraint by utilizing a physics loss for the training, however the accuracy of PINO trained without data does not match the performance obtained by training with data. In this work we show that self-training can be used to close this gap in performance. We examine canonical examples, namely the 1D-Burgers and 2D-Darcy PDEs, to showcase the efficacy of self-training. Specifically, FNOs, when trained exclusively with physics loss through self-training, approach 1.07x for Burgers and 1.02x for Darcy, compared to FNOs trained with both data and physics loss. Furthermore, we discover that pseudo-labels can be used for self-training without necessarily training to convergence in each iteration. A consequence of this is that we are able to discover self-training schedules that improve upon the baseline performance of PINO in terms of accuracy as well as time.
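A minimal sketch of the self-training loop under stated assumptions (`physics_residual` stands for the PDE residual operator; the authors' schedules are richer): the model labels its own inputs, and those pseudo-labels enter as an extra data term, with no requirement to train to convergence each round.

```python
# Hedged sketch: physics-loss-only self-training with pseudo-labels.
import torch
import torch.nn.functional as F

def self_train(fno, inputs, physics_residual, optimizer, rounds=5, steps=100):
    for _ in range(rounds):
        with torch.no_grad():
            pseudo = fno(inputs)                  # current predictions as labels
        for _ in range(steps):                    # partial training per round
            pred = fno(inputs)
            loss = physics_residual(pred, inputs).pow(2).mean() \
                   + F.mse_loss(pred, pseudo)     # physics + pseudo-label terms
            optimizer.zero_grad(); loss.backward(); optimizer.step()
```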

Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach

  • paper_url: http://arxiv.org/abs/2311.13884
  • repo_url: None
  • paper_authors: Bin Zhang, Hangyu Mao, Jingqing Ruan, Ying Wen, Yang Li, Shao Zhang, Zhiwei Xu, Dapeng Li, Ziyue Li, Rui Zhao, Lijuan Li, Guoliang Fan
  • for: To improve the coordination and decision-making capabilities of large language models (LLMs) in large-scale multi-agent systems.
  • methods: Drawing on the actor-critic framework from multi-agent reinforcement learning, develops a modular and token-efficient solution that addresses the challenges posed by LLMs and multi-agent systems.
  • results: Evaluations on system resource allocation and robot grid transportation demonstrate considerable advantages of the proposed approach.
    Abstract The significant advancements in large language models (LLMs) have presented novel opportunities for tackling planning and decision-making within multi-agent systems. However, as the number of agents increases, the issues of hallucination in LLMs and coordination in multi-agent systems (MAS) have become increasingly pronounced. Additionally, the efficient utilization of tokens becomes a critical consideration when employing LLMs to facilitate the interactions of large numbers of agents. In this paper, we present a novel framework aimed at enhancing coordination and decision-making capabilities of LLMs within large-scale multi-agent environments. Our approach draws inspiration from the actor-critic framework employed in multi-agent reinforcement learning, and we develop a modular and token-efficient solution that effectively addresses challenges presented by LLMs and MAS. Through evaluations conducted in experiments involving system resource allocation and robot grid transportation, we demonstrate the considerable advantages afforded by our proposed approach.
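A bare-bones sketch of the orchestration idea (the prompting scheme is an assumption and `llm` stands for any chat-completion call): per-agent actor prompts keep token usage modular, while a single critic pass evaluates and revises the joint plan.

```python
# Hedged sketch of an actor-critic arrangement over LLM agents.
def step(llm, observations):
    # Actors: one short, independent prompt per agent keeps token cost modular.
    actions = [llm(f"You are agent {i}. Observation: {obs}\nPropose one action.")
               for i, obs in enumerate(observations)]
    # Critic: a single pass scores the joint plan and resolves conflicts.
    critique = llm("You are the critic. Joint proposal:\n"
                   + "\n".join(f"{i}: {a}" for i, a in enumerate(actions))
                   + "\nScore the joint plan and rewrite any conflicting action.")
    return actions, critique       # execute the critic-revised plan downstream
```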

A Multi-solution Study on GDPR AI-enabled Completeness Checking of DPAs

  • paper_url: http://arxiv.org/abs/2311.13881
  • repo_url: None
  • paper_authors: Muhammad Ilyas Azeem, Sallam Abualhaija
  • for: To provide automated support for checking the completeness of data processing agreements (DPAs) against the EU General Data Protection Regulation (GDPR).
  • methods: Pursues ten alternative solutions enabled by different technologies: traditional machine learning, deep learning, language models, and few-shot learning.
  • results: The best-performing solutions, based on the pre-trained BERT and RoBERTa language models, reach F2 scores of 86.7% and 89.7%; alternatives based on deep learning (e.g., BiLSTM) and few-shot learning (e.g., SetFit) achieve comparable accuracy while being more efficient to develop.
    Abstract Specifying legal requirements for software systems to ensure their compliance with the applicable regulations is a major concern to requirements engineering (RE). Personal data which is collected by an organization is often shared with other organizations to perform certain processing activities. In such cases, the General Data Protection Regulation (GDPR) requires issuing a data processing agreement (DPA) which regulates the processing and further ensures that personal data remains protected. Violating GDPR can lead to huge fines reaching billions of Euros. Software systems involving personal data processing must adhere to the legal obligations stipulated in GDPR and outlined in DPAs. Requirements engineers can elicit from DPAs legal requirements for regulating the data processing activities in software systems. Checking the completeness of a DPA according to the GDPR provisions is therefore an essential prerequisite to ensure that the elicited requirements are complete. Analyzing DPAs entirely manually is time-consuming and requires adequate legal expertise. In this paper, we propose an automation strategy to address the completeness checking of DPAs against GDPR. Specifically, we pursue ten alternative solutions which are enabled by different technologies, namely traditional machine learning, deep learning, language modeling, and few-shot learning. The goal of our work is to empirically examine how these different technologies fare in the legal domain. We computed the F2 score on a set of 30 real DPAs. Our evaluation shows that the best-performing solutions, based on the pre-trained BERT and RoBERTa language models, yield F2 scores of 86.7% and 89.7%. Our analysis further shows that other alternative solutions based on deep learning (e.g., BiLSTM) and few-shot learning (e.g., SetFit) can achieve comparable accuracy, yet are more efficient to develop.
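For reference, the F2 score used in the evaluation weights recall four times as heavily as precision (beta = 2), which fits completeness checking, where missing a required GDPR provision is costlier than a false alarm:

```python
# F2 on toy labels: 1 = provision present in the DPA, per the paper's metric.
from sklearn.metrics import fbeta_score

y_true = [1, 1, 0, 1, 0, 1, 1, 0]   # ground-truth provision presence
y_pred = [1, 0, 0, 1, 1, 1, 1, 0]   # classifier output
print(fbeta_score(y_true, y_pred, beta=2))   # recall-weighted harmonic mean
```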

Minimizing Factual Inconsistency and Hallucination in Large Language Models

  • paper_url: http://arxiv.org/abs/2311.13878
  • repo_url: None
  • paper_authors: Muneeswaran I, Shreya Saxena, Siva Prasad, M V Sai Prakash, Advaith Shankar, Varun V, Vishal Vaddina, Saisubramaniam Gopalakrishnan
  • for: To improve the reliability and trustworthiness of large language models (LLMs) for effective use in domains such as healthcare, education, and finance.
  • methods: Proposes a multi-stage framework that first generates a rationale, verifies and refines incorrect rationales, and then uses the rationale together with contextual references to generate the answer.
  • results: In the life sciences domain, the framework improves the quality of responses to drug-related inquiries, making OpenAI GPT-3.5-turbo 14-25% more faithful and 16-22% more accurate on two datasets; fine-tuning smaller open-access LLMs on framework-based samples improves their accuracy by 33-42%, competing with RAG on commercial models.
    Abstract Large Language Models (LLMs) are widely used in critical fields such as healthcare, education, and finance due to their remarkable proficiency in various language-related tasks. However, LLMs are prone to generating factually incorrect responses or "hallucinations," which can lead to a loss of credibility and trust among users. To address this issue, we propose a multi-stage framework that generates the rationale first, verifies and refines incorrect ones, and uses them as supporting references to generate the answer. The generated rationale enhances the transparency of the answer and our framework provides insights into how the model arrived at this answer, by using this rationale and the references to the context. In this paper, we demonstrate its effectiveness in improving the quality of responses to drug-related inquiries in the life sciences industry. Our framework improves traditional Retrieval Augmented Generation (RAG) by enabling OpenAI GPT-3.5-turbo to be 14-25% more faithful and 16-22% more accurate on two datasets. Furthermore, fine-tuning samples based on our framework improves the accuracy of smaller open-access LLMs by 33-42% and competes with RAG on commercial models.
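A hedged sketch of the multi-stage flow (`llm` and `retrieve` are generic stand-ins, not the paper's components): generate the rationale first, verify it against retrieved context, refine if needed, and only then answer with the rationale and references in hand.

```python
# Hedged sketch: rationale-first generation with a verify-and-refine pass.
def answer_with_rationale(llm, retrieve, question):
    context = retrieve(question)                                  # source passages
    rationale = llm(f"Context:\n{context}\nQuestion: {question}\n"
                    "Write the step-by-step rationale only.")
    verdict = llm(f"Context:\n{context}\nRationale:\n{rationale}\n"
                  "List any claim unsupported by the context, else say OK.")
    if verdict.strip() != "OK":                                   # refine stage
        rationale = llm(f"Revise the rationale, fixing: {verdict}\n{rationale}")
    return llm(f"Context:\n{context}\nRationale:\n{rationale}\n"
               f"Question: {question}\nAnswer, citing the context.")
```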

Legal Requirements Analysis

  • paper_url: http://arxiv.org/abs/2311.13871
  • repo_url: https://github.com/yty3805595/typhoon_prediction
  • paper_authors: Sallam Abualhaija, Marcello Ceci, Lionel Briand
  • for: Focuses on the requirements engineering (RE) phase of software development and on how legal requirements can be elicited from regulations and legal agreements.
  • methods: Explores a variety of methods for analyzing legal requirements, including creating machine-analyzable representations of regulations and surveying existing automated means for compliance verification, while reflecting on the current challenges of legal requirements analysis.
  • results: Exemplifies these legal requirements analysis methods on GDPR.
    Abstract Modern software has been an integral part of everyday activities in many disciplines and application contexts. Introducing intelligent automation by leveraging artificial intelligence (AI) led to break-throughs in many fields. The effectiveness of AI can be attributed to several factors, among which is the increasing availability of data. Regulations such as the general data protection regulation (GDPR) in the European Union (EU) are introduced to ensure the protection of personal data. Software systems that collect, process, or share personal data are subject to compliance with such regulations. Developing compliant software depends heavily on addressing legal requirements stipulated in applicable regulations, a central activity in the requirements engineering (RE) phase of the software development process. RE is concerned with specifying and maintaining requirements of a system-to-be, including legal requirements. Legal agreements which describe the policies organizations implement for processing personal data can provide an additional source to regulations for eliciting legal requirements. In this chapter, we explore a variety of methods for analyzing legal requirements and exemplify them on GDPR. Specifically, we describe possible alternatives for creating machine-analyzable representations from regulations, survey the existing automated means for enabling compliance verification against regulations, and further reflect on the current challenges of legal requirements analysis.

Challenges of Large Language Models for Mental Health Counseling

  • paper_url: http://arxiv.org/abs/2311.13857
  • repo_url: None
  • paper_authors: Neo Christopher Chung, George Dyer, Lennart Brocki
  • for: To examine the application of artificial intelligence (AI) to mental health counseling and the major challenges it raises.
  • methods: Considers the use of large language models (LLMs) to support or provide psychological counseling, examining model hallucination, interpretability, bias, privacy, and clinical effectiveness.
  • results: Identifies substantial challenges for LLMs in mental health counseling, but argues that, if these pitfalls are carefully navigated, AI holds great promise for improving mental health care.
    Abstract The global mental health crisis is looming with a rapid increase in mental disorders, limited resources, and the social stigma of seeking treatment. As the field of artificial intelligence (AI) has witnessed significant advancements in recent years, large language models (LLMs) capable of understanding and generating human-like text may be used in supporting or providing psychological counseling. However, the application of LLMs in the mental health domain raises concerns regarding the accuracy, effectiveness, and reliability of the information provided. This paper investigates the major challenges associated with the development of LLMs for psychological counseling, including model hallucination, interpretability, bias, privacy, and clinical effectiveness. We explore potential solutions to these challenges that are practical and applicable to the current paradigm of AI. From our experience in developing and deploying LLMs for mental health, AI holds a great promise for improving mental health care, if we can carefully navigate and overcome pitfalls of LLMs.

A Cross Attention Approach to Diagnostic Explainability using Clinical Practice Guidelines for Depression

  • paper_url: http://arxiv.org/abs/2311.13852
  • repo_url: None
  • paper_authors: Sumit Dalal, Deepa Tilwani, Manas Gaur, Sarika Jain, Valerie Shalin, Amit Seth
  • for: This paper aims to address the lack of explainability in Artificial Intelligence-powered analysis of unstructured clinical dialogue, specifically in the context of mental health (MH) and depression diagnosis.
  • methods: The authors develop a method called ProcesS knowledge-infused cross ATtention (PSAT) that incorporates clinical practice guidelines (CPGs) when computing attention, which enables the model to provide clinician-understandable explanations for classification.
  • results: The authors evaluate PSAT on three expert-curated datasets related to depression and demonstrate its ability to provide application-relevant explainability, surpassing the performance of nine baseline models. Additionally, the authors show that PSAT can provide explanations where other baselines fall short.
    Abstract The lack of explainability using relevant clinical knowledge hinders the adoption of Artificial Intelligence-powered analysis of unstructured clinical dialogue. A wealth of relevant, untapped Mental Health (MH) data is available in online communities, providing the opportunity to address the explainability problem with substantial potential impact as a screening tool for both online and offline applications. We develop a method to enhance attention in popular transformer models and generate clinician-understandable explanations for classification by incorporating external clinical knowledge. Inspired by how clinicians rely on their expertise when interacting with patients, we leverage relevant clinical knowledge to model patient inputs, providing meaningful explanations for classification. This will save manual review time and engender trust. We develop such a system in the context of MH using clinical practice guidelines (CPG) for diagnosing depression, a mental health disorder of global concern. We propose an application-specific language model called ProcesS knowledge-infused cross ATtention (PSAT), which incorporates CPGs when computing attention. Through rigorous evaluation on three expert-curated datasets related to depression, we demonstrate application-relevant explainability of PSAT. PSAT also surpasses the performance of nine baseline models and can provide explanations where other baselines fall short. We transform a CPG resource focused on depression, such as the Patient Health Questionnaire (e.g. PHQ-9) and related questions, into a machine-readable ontology using SNOMED-CT. With this resource, PSAT enhances the ability of models like GPT-3.5 to generate application-relevant explanations.
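A structural sketch of the knowledge-infused cross-attention idea (dimensions and the PHQ-9 framing are assumptions): patient-text tokens attend over guideline-concept embeddings, so the attention weights themselves serve as clinician-readable evidence for the classification.

```python
# Hedged sketch: cross-attention from patient text to clinical-guideline items.
import torch
import torch.nn as nn

class GuidelineCrossAttention(nn.Module):
    def __init__(self, d=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d, n_heads, batch_first=True)
    def forward(self, text_tokens, cpg_concepts):
        # text_tokens: (B, T, d); cpg_concepts: (B, 9, d), e.g. PHQ-9 items
        out, weights = self.attn(text_tokens, cpg_concepts, cpg_concepts)
        return out, weights    # weights: which guideline item each token used

layer = GuidelineCrossAttention()
out, w = layer(torch.randn(1, 20, 256), torch.randn(1, 9, 256))
print(w.shape)                 # (1, 20, 9): per-token evidence over items
```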

Touring sampling with pushforward maps

  • paper_url: http://arxiv.org/abs/2311.13845
  • repo_url: None
  • paper_authors: Vivien Cabannes, Charles Arnal
  • for: To provide a theoretically grounded review and organization of sampling approaches for generative modeling.
  • methods: Reviews and compares many sampling methods, including diffusion models, MCMC, and variational approaches.
  • results: By revealing links between existing methods, the survey may help overcome current challenges in sampling with diffusion models, such as long inference times due to diffusion simulation and limited diversity in generated samples.
    Abstract The number of sampling methods could be daunting for a practitioner looking to cast powerful machine learning methods to their specific problem. This paper takes a theoretical stance to review and organize many sampling approaches in the ``generative modeling'' setting, where one wants to generate new data that are similar to some training examples. By revealing links between existing methods, it might prove useful to overcome some of the current challenges in sampling with diffusion models, such as long inference time due to diffusion simulation, or the lack of diversity in generated samples.
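The organizing "pushforward" view can be made concrete with the oldest special case, inverse-CDF sampling, where a map T pushes an easy base law onto the target; normalizing flows and diffusion samplers fit the same mold with learned maps.

```python
# Worked example: the pushforward T(Z) with T the inverse CDF of Exp(1).
import numpy as np

rng = np.random.default_rng(0)
z = rng.uniform(size=100_000)      # base samples Z ~ U(0, 1)
x = -np.log1p(-z)                  # T(z) = -log(1 - z), inverse CDF of Exp(1)
print(x.mean(), x.var())           # both ~1, matching the Exp(1) target
```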

Exact Combinatorial Optimization with Temporo-Attentional Graph Neural Networks

  • paper_url: http://arxiv.org/abs/2311.13843
  • repo_url: None
  • paper_authors: Mehdi Seyfi, Amin Banitalebi-Dehkordi, Zirui Zhou, Yong Zhang
  • for: The paper aims to improve the performance of the Branch and Bound algorithm by incorporating machine learning models.
  • methods: The paper uses two mechanisms to improve the B&B algorithm: temporal features and bipartite graph attention.
  • results: Experimental results show that the B&B algorithm’s performance is improved when these two mechanisms are used, and it can achieve optimal performance on several standard datasets.
    Abstract Combinatorial optimization finds an optimal solution within a discrete set of variables and constraints. The field has seen tremendous progress both in research and industry. With the success of deep learning in the past decade, a recent trend in combinatorial optimization has been to improve state-of-the-art combinatorial optimization solvers by replacing key heuristic components with machine learning (ML) models. In this paper, we investigate two essential aspects of machine learning algorithms for combinatorial optimization: temporal characteristics and attention. We argue that for the task of variable selection in the branch-and-bound (B&B) algorithm, incorporating the temporal information as well as the bipartite graph attention improves the solver's performance. We support our claims with intuitions and numerical results over several standard datasets used in the literature and competitions. Code is available at: https://developer.huaweicloud.com/develop/aigallery/notebook/detail?id=047c6cf2-8463-40d7-b92f-7b2ca998e935
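A plain-tensor sketch of one bipartite-attention round (a stand-in for the paper's model, written without graph libraries): variable-node embeddings attend over the constraint nodes they appear in, the kind of message passing used to score branching candidates in B&B.

```python
# Hedged sketch: one attention round over the variable-constraint bipartite graph.
import torch
import torch.nn.functional as F

def bipartite_attention(var_h, con_h, edges):
    """edges: (E, 2) pairs (var_idx, con_idx) from the constraint matrix."""
    v, c = edges[:, 0], edges[:, 1]
    logits = (var_h[v] * con_h[c]).sum(-1) / var_h.size(-1) ** 0.5
    alpha = torch.zeros_like(logits)
    for i in range(var_h.size(0)):                 # small-instance version:
        m = v == i                                 # softmax per variable over
        if m.any():                                # its incident constraints
            alpha[m] = F.softmax(logits[m], dim=0)
    out = torch.zeros_like(var_h)
    out.index_add_(0, v, alpha.unsqueeze(-1) * con_h[c])
    return out                                     # updated variable states

var_h, con_h = torch.randn(5, 16), torch.randn(3, 16)
edges = torch.tensor([[0, 0], [0, 1], [1, 1], [2, 2], [3, 0], [4, 2]])
print(bipartite_attention(var_h, con_h, edges).shape)   # torch.Size([5, 16])
```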

HypUC: Hyperfine Uncertainty Calibration with Gradient-boosted Corrections for Reliable Regression on Imbalanced Electrocardiograms

  • paper_url: http://arxiv.org/abs/2311.13821
  • repo_url: None
  • paper_authors: Uddeshya Upadhyay, Sairam Bade, Arjun Puranik, Shahir Asfahan, Melwin Babu, Francisco Lopez-Jimenez, Samuel J. Asirvatham, Ashim Prasad, Ajit Rajasekharan, Samir Awasthi, Rakesh Barve
  • for: To address dataset imbalance when regressing continuous-valued clinical targets from medical time series.
  • methods: Processes medical time series with deep neural networks (DNNs) and proposes a kernel density-based technique, within a probabilistic regression framework with uncertainty calibration, to tackle the imbalance.
  • results: Evaluated on a large, diverse real-world ECG dataset, the method outperforms several conventional baselines on various diagnostic tasks.
    Abstract The automated analysis of medical time series, such as the electrocardiogram (ECG), electroencephalogram (EEG), pulse oximetry, etc, has the potential to serve as a valuable tool for diagnostic decisions, allowing for remote monitoring of patients and more efficient use of expensive and time-consuming medical procedures. Deep neural networks (DNNs) have been demonstrated to process such signals effectively. However, previous research has primarily focused on classifying medical time series rather than attempting to regress the continuous-valued physiological parameters central to diagnosis. One significant challenge in this regard is the imbalanced nature of the dataset, as a low prevalence of abnormal conditions can lead to heavily skewed data that results in inaccurate predictions and a lack of certainty in such predictions when deployed. To address these challenges, we propose HypUC, a framework for imbalanced probabilistic regression in medical time series, making several contributions. (i) We introduce a simple kernel density-based technique to tackle the imbalanced regression problem with medical time series. (ii) Moreover, we employ a probabilistic regression framework that allows uncertainty estimation for the predicted continuous values. (iii) We also present a new approach to calibrate the predicted uncertainty further. (iv) Finally, we demonstrate a technique to use calibrated uncertainty estimates to improve the predicted continuous value and show the efficacy of the calibrated uncertainty estimates to flag unreliable predictions. HypUC is evaluated on a large, diverse, real-world dataset of ECGs collected from millions of patients, outperforming several conventional baselines on various diagnostic tasks, suggesting a potential use-case for the reliable clinical deployment of deep learning models.
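One simple reading of the kernel-density idea, sketched under assumptions (the paper's full pipeline adds probabilistic heads and uncertainty calibration): weight the regression loss by the inverse density of the continuous target so rare abnormal values are not drowned out.

```python
# Hedged sketch: inverse-density loss weights for imbalanced regression targets.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(70, 8, 5000),     # common readings
                    rng.normal(140, 10, 50)])    # rare abnormal ones
w = 1.0 / gaussian_kde(y)(y)                     # inverse-density weights
w /= w.mean()                                    # keep the overall loss scale
# weighted loss: (w * (y_pred - y) ** 2).mean() instead of plain MSE
print(w[:5000].mean(), w[5000:].mean())          # rare targets weigh far more
```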

Fairness-Aware Domain Generalization under Covariate and Dependence Shifts

  • paper_url: http://arxiv.org/abs/2311.13816
  • repo_url: None
  • paper_authors: Chen Zhao, Kai Jiang, Xintao Wu, Haoliang Wang, Latifur Khan, Christan Grant, Feng Chen
  • for: This work addresses domain shifts while simultaneously considering model fairness, in the context of generalizing an invariant classifier to unseen target domains.
  • methods: It proposes a novel domain generalization approach that accounts for two types of distribution shift, covariate shift and dependence shift, using a transformation model that generates data in synthetic domains to learn a fairness-aware invariant classifier.
  • results: Extensive experiments on four benchmark datasets show the method outperforms state-of-the-art approaches in unseen domains.
    Abstract Achieving the generalization of an invariant classifier from source domains to shifted target domains while simultaneously considering model fairness is a substantial and complex challenge in machine learning. Existing domain generalization research typically attributes domain shifts to concept shift, which relates to alterations in class labels, and covariate shift, which pertains to variations in data styles. In this paper, by introducing another form of distribution shift, known as dependence shift, which involves variations in fair dependence patterns across domains, we propose a novel domain generalization approach that addresses domain shifts by considering both covariate and dependence shifts. We assert that an underlying transformation model exists that can transform data from one domain to another. By generating data in synthetic domains through this model, a fairness-aware invariant classifier is learned that enforces both model accuracy and fairness in unseen domains. Extensive empirical studies on four benchmark datasets demonstrate that our approach surpasses state-of-the-art methods.
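A minimal sketch of the kind of objective such a framework might optimize, assuming a demographic-parity-style fairness penalty averaged over (synthetic) domains; the paper's actual transformation model and fairness criterion may differ:

```python
import torch
import torch.nn.functional as F

def demographic_parity_gap(probs, s):
    # Difference in mean predicted positive rate between sensitive groups.
    return (probs[s == 1].mean() - probs[s == 0].mean()).abs()

def invariant_fair_loss(model, domain_batches, lam=1.0):
    # domain_batches: list of (x, y, s) tuples, one per (synthetic) domain;
    # some domains would come from a learned transformation model that
    # simulates covariate/dependence shifts.
    losses = []
    for x, y, s in domain_batches:
        logits = model(x).squeeze(-1)
        acc = F.binary_cross_entropy_with_logits(logits, y.float())
        fair = demographic_parity_gap(torch.sigmoid(logits), s)
        losses.append(acc + lam * fair)
    return torch.stack(losses).mean()
```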

Mechanical Characterization and Inverse Design of Stochastic Architected Metamaterials Using Neural Operators

  • paper_url: http://arxiv.org/abs/2311.13812
  • repo_url: None
  • paper_authors: Hanxun Jin, Enrui Zhang, Boyu Zhang, Sridhar Krishnaswamy, George Em Karniadakis, Horacio D. Espinosa
  • for: This work explores how machine learning (ML) can be used to design architected materials with properties beyond what lab-based trial-and-error methods can achieve.
  • methods: It uses a deep operator network (DeepONet) to learn the relationship between a metamaterial's complete microstructure and its mechanical response directly from sparse but high-quality in situ experimental data.
  • results: On spinodal microstructures printed via two-photon lithography, prediction errors for mechanical responses fall within 5-10%, enabling inverse design of structures with target nonlinear mechanical behaviors even under data scarcity.
    Abstract Machine learning (ML) is emerging as a transformative tool for the design of architected materials, offering properties that far surpass those achievable through lab-based trial-and-error methods. However, a major challenge in current inverse design strategies is their reliance on extensive computational and/or experimental datasets, which becomes particularly problematic for designing micro-scale stochastic architected materials that exhibit nonlinear mechanical behaviors. Here, we introduce a new end-to-end scientific ML framework, leveraging deep neural operators (DeepONet), to directly learn the relationship between the complete microstructure and mechanical response of architected metamaterials from sparse but high-quality in situ experimental data. The approach facilitates the inverse design of structures tailored to specific nonlinear mechanical behaviors. Results obtained from spinodal microstructures, printed using two-photon lithography, reveal that the prediction error for mechanical responses is within a range of 5 - 10%. Our work underscores that by employing neural operators with advanced micro-mechanics experimental techniques, the design of complex micro-architected materials with desired properties becomes feasible, even in scenarios constrained by data scarcity. Our work marks a significant advancement in the field of materials-by-design, potentially heralding a new era in the discovery and development of next-generation metamaterials with unparalleled mechanical characteristics derived directly from experimental insights.
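For readers unfamiliar with deep operator networks, a minimal DeepONet skeleton is sketched below; the input/output choices (flattened microstructure in, response at a queried point out) are illustrative assumptions, not the authors' exact architecture:

```python
import torch.nn as nn

class DeepONet(nn.Module):
    """Minimal DeepONet: a branch net encodes the input function (here a
    flattened microstructure image) and a trunk net encodes the query point
    (e.g., an applied strain); their inner product predicts the response
    (e.g., stress) at that query."""
    def __init__(self, in_dim, coord_dim, width=128, p=64):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(in_dim, width), nn.ReLU(),
                                    nn.Linear(width, p))
        self.trunk = nn.Sequential(nn.Linear(coord_dim, width), nn.ReLU(),
                                   nn.Linear(width, p))

    def forward(self, u, y):
        # u: (batch, in_dim) input functions; y: (batch, coord_dim) query points
        return (self.branch(u) * self.trunk(y)).sum(dim=-1, keepdim=True)
```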

Education distillation: getting student models to learn in schools

  • paper_url: http://arxiv.org/abs/2311.13811
  • repo_url: None
  • paper_authors: Ling Feng, Danyang Li, Tianhao Wu, Xuliang Duan
  • for: This work focuses on transferring knowledge into lower-grade (fragmented) student models to make model compression via distillation more effective.
  • methods: It introduces dynamic incremental learning into knowledge distillation: fragmented student models, split from the complete student model, serve as lower-grade models that deepen alongside designed teaching reference layers while learning and distilling from more teacher models as the grade level rises.
  • results: Combining the education distillation strategy with standard distillation algorithms outperforms the single distillation algorithms on the public CIFAR100, Caltech256, and Food-101 datasets.
    Abstract Knowledge distillation is a method for model compression, and existing knowledge distillation techniques focus on improving the distillation algorithm to enhance distillation efficiency. This paper introduces dynamic incremental learning into knowledge distillation and proposes an education distillation strategy. Specifically, fragmented student models, divided from the complete student model, serve as lower-grade models. As the grade level rises, the fragmented student models deepen in conjunction with designed teaching reference layers while learning and distilling from more teacher models. Moving from lower to higher grades, the fragmented student models are gradually integrated into a complete target student model, and their performance improves stage by stage. Education distillation strategies combined with distillation algorithms outperform single distillation algorithms on the public datasets CIFAR100, Caltech256, and Food-101.
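The education distillation strategy builds on the standard distillation objective; a minimal sketch of that underlying loss (Hinton-style, with assumed temperature and mixing parameters) is shown below, while the staged training of fragmented student models is not:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft-target term: KL divergence between temperature-softened teacher
    # and student distributions (scaled by T^2, per Hinton et al.).
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    # Hard-target term: ordinary cross-entropy on ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```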

Bridging Classical and Quantum Machine Learning: Knowledge Transfer From Classical to Quantum Neural Networks Using Knowledge Distillation

  • paper_url: http://arxiv.org/abs/2311.13810
  • repo_url: None
  • paper_authors: Mohammad Junayed Hasan, M. R. C. Mahdy
  • for: This paper aims to bridge the gap between classical machine learning and emergent quantum computing techniques by transferring knowledge from classical to quantum neural networks using knowledge distillation.
  • methods: The paper introduces a new method of transfer learning through classical-quantum integration using knowledge distillation, where classical convolutional neural network (CNN) architectures like LeNet and AlexNet serve as teacher networks to train student quantum models.
  • results: The approach yields significant performance improvements for the quantum models by solely depending on classical CNNs, with quantum models achieving an average accuracy improvement of 0.80% on the MNIST dataset and 5.40% on the more complex Fashion MNIST dataset.
    Abstract Very recently, studies have shown that quantum neural networks surpass classical neural networks in tasks like image classification when a similar number of learnable parameters is used. However, the development and optimization of quantum models are currently hindered by issues such as qubit instability and limited qubit availability, leading to error-prone systems with weak performance. In contrast, classical models can exhibit high performance owing to substantial resource availability. As a result, more studies have been focusing on hybrid classical-quantum integration. One line of research particularly focuses on transfer learning through classical-quantum integration or quantum-quantum approaches. Unlike previous studies, this paper introduces a new method to transfer knowledge from classical to quantum neural networks using knowledge distillation, effectively bridging the gap between classical machine learning and emergent quantum computing techniques. We adapt classical convolutional neural network (CNN) architectures like LeNet and AlexNet to serve as teacher networks, facilitating the training of student quantum models by sending supervisory signals during backpropagation through KL-divergence. The approach yields significant performance improvements for the quantum models by solely depending on classical CNNs, with quantum models achieving an average accuracy improvement of 0.80% on the MNIST dataset and 5.40% on the more complex Fashion MNIST dataset. Applying this technique eliminates the cumbersome training of huge quantum models for transfer learning in resource-constrained settings and enables re-using existing pre-trained classical models to improve performance. Thus, this study paves the way for future research in quantum machine learning (QML) by positioning knowledge distillation as a core technique for advancing QML applications.
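A hedged sketch of what a classical-to-quantum distillation setup could look like using PennyLane's Torch integration; the circuit layout, qubit count, and layer choices are illustrative assumptions rather than the paper's architecture:

```python
import pennylane as qml
import torch.nn as nn

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch")
def circuit(inputs, weights):
    # Encode classical features as rotation angles, entangle, read out Z-expectations.
    qml.AngleEmbedding(inputs, wires=range(n_qubits))
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

weight_shapes = {"weights": (2, n_qubits)}  # two entangling layers

quantum_student = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, n_qubits),                # compress MNIST pixels to 4 features
    qml.qnn.TorchLayer(circuit, weight_shapes),  # variational quantum layer
    nn.Linear(n_qubits, 10),                     # class logits
)
# Training would minimize a KL-based distillation loss between these logits
# and temperature-softened logits from a pre-trained LeNet/AlexNet teacher.
```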

Enhancing Intrusion Detection In Internet Of Vehicles Through Federated Learning

  • paper_url: http://arxiv.org/abs/2311.13800
  • repo_url: None
  • paper_authors: Abhishek Sebastian, Pragna R, Sudhakaran G, Renjith P N, Leela Karthikeyan H
  • for: This work proposes a federated learning-based intrusion detection method for the Internet of Vehicles (IoV).
  • methods: The framework uses SMOTE to handle class imbalance, outlier detection to identify and remove abnormal observations, and hyperparameter tuning to optimize model performance.
  • results: Evaluations with various performance metrics, additional datasets, and comparisons against conventional classifiers show the proposed method detects intrusions effectively while protecting sensitive data.
    Abstract Federated learning is a decentralized machine learning technique that allows multiple parties to collaborate and learn a shared model without sharing their raw data. Our paper proposes a federated learning framework for intrusion detection in the Internet of Vehicles (IoV) using the CIC-IDS 2017 dataset. The proposed framework employs SMOTE for handling class imbalance, outlier detection for identifying and removing abnormal observations, and hyperparameter tuning to optimize the model's performance. The authors evaluated the proposed framework using various performance metrics and demonstrated its effectiveness in detecting intrusions on other datasets (KDD-Cup 99 and UNSW-NB-15) and against conventional classifiers. Furthermore, the proposed framework can protect sensitive data while achieving high intrusion detection performance.
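A minimal sketch of two of the framework's ingredients, per-client SMOTE oversampling and federated averaging; the rest of the pipeline (outlier removal, hyperparameter tuning, model architecture) is omitted and the function names are illustrative:

```python
import numpy as np
from imblearn.over_sampling import SMOTE

def preprocess_client(X, y):
    # Per-client preprocessing: oversample minority attack classes with SMOTE
    # so a client's local model is not dominated by benign traffic.
    return SMOTE().fit_resample(X, y)

def fed_avg(client_params, client_sizes):
    # FedAvg: aggregate client model parameters as a size-weighted average.
    total = sum(client_sizes)
    n_layers = len(client_params[0])
    return [
        sum(p[i] * (n / total) for p, n in zip(client_params, client_sizes))
        for i in range(n_layers)
    ]
```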

Scalable AI Generative Content for Vehicular Network Semantic Communication

  • paper_url: http://arxiv.org/abs/2311.13782
  • repo_url: None
  • paper_authors: Hao Feng, Yi Yang, Zhu Han
  • for: Improving driving safety by detecting potentially dangerous vehicles in a driver's blind spots.
  • methods: An encoder-decoder architecture converts images into textual representations and reconstructs them into quality-acceptable images, with reinforcement learning used to enhance the reliability of the generated content.
  • results: The method surpasses baselines in perceiving vehicles in blind spots, effectively compresses communication data, and can integrate auxiliary information when bandwidth allows.
    Abstract Perceiving vehicles in a driver's blind spot is vital for safe driving. The detection of potentially dangerous vehicles in these blind spots can benefit from vehicular network semantic communication technology. However, efficient semantic communication involves a trade-off between accuracy and delay, especially in bandwidth-limited situations. This paper unveils a scalable Artificial Intelligence Generated Content (AIGC) system that leverages an encoder-decoder architecture. This system converts images into textual representations and reconstructs them into quality-acceptable images, optimizing transmission for vehicular network semantic communication. Moreover, when bandwidth allows, auxiliary information is integrated. The encoder-decoder aims to maintain semantic equivalence with the original images across various tasks. Then the proposed approach employs reinforcement learning to enhance the reliability of the generated contents. Experimental results suggest that the proposed method surpasses the baseline in perceiving vehicles in blind spots and effectively compresses communication data. While this method is specifically designed for driving scenarios, this encoder-decoder architecture also holds potential for wide use across various semantic communication scenarios.
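Conceptually, the pipeline might look like the sketch below, where `captioner`, `generator`, and `channel` are hypothetical placeholders for the system's image-to-text encoder, text-to-image decoder, and transmission link:

```python
def transmit(image, captioner, generator, channel, extras=None):
    # Semantic encoder: compress the camera frame into a short text description.
    text = captioner(image)
    payload = {"text": text}
    if extras is not None:
        payload["extras"] = extras   # attach auxiliary info when bandwidth allows
    received = channel(payload)      # low-bitrate transmission of the text payload
    # Semantic decoder: reconstruct a quality-acceptable image from the text.
    return generator(received["text"])
```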

Archiving Body Movements: Collective Generation of Chinese Calligraphy

  • paper_url: http://arxiv.org/abs/2311.13770
  • repo_url: None
  • paper_authors: Aven Le Zhou, Jiayi Ye, Tianchen Liu, Kang Zhang
  • for: This paper examines body movements in oriental calligraphy and how calligraphy principles can be applied to stimulate and archive body movements.
  • methods: Through an artwork (Wushu), the authors use an interactive and generative approach that engages the audience's bodily participation and converts those movements into generated calligraphy.
  • results: The work shows that body movements can be archived as a compendium of generated calligraphy, with the audience acting as both writers and readers in a cyclical process that can motivate further attention to and discussion of Chinese characters and calligraphy.
    Abstract As a communication channel, body movements have been widely explored in behavioral studies and kinesics. Performing and visual arts share the same interests but focus on documenting and representing human body movements, such as for dance notation and visual work creation. This paper investigates body movements in oriental calligraphy and how to apply calligraphy principles to stimulate and archive body movements. Through an artwork (Wushu), the authors experiment with an interactive and generative approach to engage the audience's bodily participation and archive the body movements as a compendium of generated calligraphy. The audience assumes the role of both writers and readers; creating ("writing") and appreciating ("reading") the generated calligraphy becomes a cyclical process within this infinite "Book," which can motivate further attention and discussions concerning Chinese characters and calligraphy.

3D-MIR: A Benchmark and Empirical Study on 3D Medical Image Retrieval in Radiology

  • paper_url: http://arxiv.org/abs/2311.13752
  • repo_url: None
  • paper_authors: Asma Ben Abacha, Alberto Santamaria-Pang, Ho Hin Lee, Jameson Merkow, Qin Cai, Surya Teja Devarakonda, Abdullah Islam, Julia Gong, Matthew P. Lungren, Thomas Lin, Noel C Codella, Ivan Tarapov
  • for: This paper investigates 3D medical image retrieval (3D-MIR) in radiology and introduces a new benchmark for evaluating it.
  • methods: It explores a diverse set of search strategies that use aggregated 2D slices, 3D volumes, and multi-modal embeddings from popular multi-modal foundation models as queries.
  • results: The paper provides quantitative and qualitative assessments of each strategy, along with an in-depth discussion offering insights for future research.
    Abstract The increasing use of medical imaging in healthcare settings presents a significant challenge due to the growing workload for radiologists, yet it also offers an opportunity to enhance healthcare outcomes if effectively leveraged. 3D image retrieval holds potential to reduce radiologist workloads by enabling clinicians to efficiently search through diagnostically similar or otherwise relevant cases, resulting in faster and more precise diagnoses. However, the field of 3D medical image retrieval is still emerging, lacking established evaluation benchmarks, comprehensive datasets, and thorough studies. This paper attempts to bridge this gap by introducing a novel benchmark for 3D Medical Image Retrieval (3D-MIR) that encompasses four different anatomies imaged with computed tomography. Using this benchmark, we explore a diverse set of search strategies that use aggregated 2D slices, 3D volumes, and multi-modal embeddings from popular multi-modal foundation models as queries. Quantitative and qualitative assessments of each approach are provided alongside an in-depth discussion that offers insight for future research. To promote the advancement of this field, our benchmark, dataset, and code are made publicly available.
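A toy sketch of the aggregated-2D-slice retrieval strategy, assuming precomputed slice embeddings and cosine similarity; the benchmark compares several such strategies, and this is only one illustrative variant:

```python
import numpy as np

def volume_embedding(slice_embeddings):
    # Aggregate per-slice 2D embeddings into one 3D-volume descriptor;
    # mean pooling is one simple aggregation strategy among many.
    return slice_embeddings.mean(axis=0)

def retrieve(query_vec, index_vecs, k=5):
    # Cosine-similarity nearest neighbours over the indexed volumes.
    q = query_vec / np.linalg.norm(query_vec)
    idx = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    scores = idx @ q
    return np.argsort(-scores)[:k]
```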

Towards Transferable Multi-modal Perception Representation Learning for Autonomy: NeRF-Supervised Masked AutoEncoder

  • paper_url: http://arxiv.org/abs/2311.13750
  • repo_url: None
  • paper_authors: Xiaohao Xu
  • for: This paper proposes a Neural Radiance Field (NeRF)-supervised self-supervised pre-training framework for transferable multi-modal perception representation learning.
  • methods: The framework, a NeRF-Supervised Masked AutoEncoder (NS-MAE), renders multi-modal embeddings, extracted from corrupted Lidar and image inputs and conditioned on view directions and locations, into projected multi-modal feature maps via neural rendering; the original multi-modal signals then serve as reconstruction targets for self-supervised representation learning.
  • results: The learned representations transfer well to diverse multi-modal and single-modal (camera-only and Lidar-only) perception models on 3D object detection and BEV map segmentation with varying amounts of fine-tuning labeled data, and the approach benefits from the synergy between masked autoencoding and neural radiance fields.
    Abstract This work proposes a unified self-supervised pre-training framework for transferable multi-modal perception representation learning via masked multi-modal reconstruction in Neural Radiance Field (NeRF), namely NeRF-Supervised Masked AutoEncoder (NS-MAE). Specifically, conditioned on certain view directions and locations, multi-modal embeddings extracted from corrupted multi-modal input signals, i.e., Lidar point clouds and images, are rendered into projected multi-modal feature maps via neural rendering. Then, original multi-modal signals serve as reconstruction targets for the rendered multi-modal feature maps to enable self-supervised representation learning. Extensive experiments show that the representation learned via NS-MAE shows promising transferability for diverse multi-modal and single-modal (camera-only and Lidar-only) perception models on diverse 3D perception downstream tasks (3D object detection and BEV map segmentation) with diverse amounts of fine-tuning labeled data. Moreover, we empirically find that NS-MAE enjoys the synergy of both the mechanism of masked autoencoder and neural radiance field. Our code shall be released upon acceptance.
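As an illustration of the masked-autoencoder half of NS-MAE, a generic MAE-style random masking step is sketched below; the NeRF-based rendering of reconstruction targets, which is the paper's contribution, is not shown:

```python
import torch

def random_mask(tokens, mask_ratio=0.75):
    # MAE-style random masking: keep a small visible subset of patch tokens;
    # a decoder later reconstructs the full signal from the visible subset.
    B, N, D = tokens.shape
    n_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N)                  # random score per token
    keep = noise.argsort(dim=1)[:, :n_keep]   # indices of visible tokens
    visible = torch.gather(tokens, 1, keep.unsqueeze(-1).expand(-1, -1, D))
    return visible, keep
```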

Security and Privacy Challenges in Deep Learning Models

  • paper_url: http://arxiv.org/abs/2311.13744
  • repo_url: https://github.com/maheravi92/Data-Science
  • paper_authors: Gopichandh Golla
  • for: This work surveys the attacks that deep learning models face at different stages of their lifecycle and how these attacks compromise model security and data privacy.
  • methods: It examines several attack classes, including model extraction attacks, model inversion attacks, adversarial attacks, and data poisoning attacks.
  • results: The survey finds that deep learning models are vulnerable at every lifecycle stage, threatening both model security and data privacy, and identifies defensive strategies that can reduce these risks.
    Abstract Deep learning models have achieved great success in multiple fields, from autonomous driving to medical diagnosis, expanding the abilities of artificial intelligence by offering solutions to complex problems that were previously very difficult to solve. Despite this success, research has shown that deep learning models can be subjected to attacks that compromise model security and the data privacy of deep neural networks, at various stages of their lifecycle. During the testing phase, attackers can exploit vulnerabilities through attacks such as model extraction attacks, model inversion attacks, and adversarial attacks. Model extraction attacks aim to reverse-engineer a trained deep learning model, with the primary objective of revealing its architecture and parameters. Model inversion attacks aim to compromise the privacy of the data used to train the model: by analyzing the model's responses and predictions, attackers attempt to reconstruct sensitive training data, compromising the model's data privacy. Adversarial attacks, mainly employed against computer vision models, subtly alter input data so that it looks normal but misleads the model into confidently making incorrect predictions; such attacks can occur during both the evaluation and training phases. Data poisoning attacks add harmful data to the training set, disrupting the learning process and reducing the reliability of the deep learning model.
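As a concrete example of the adversarial-attack class discussed above, here is a minimal Fast Gradient Sign Method (FGSM) sketch, one of the standard attacks in this literature (the epsilon value is an arbitrary illustrative choice):

```python
import torch

def fgsm_attack(model, x, y, eps=0.03):
    # Fast Gradient Sign Method: perturb the input in the direction that
    # maximally increases the loss, producing an adversarial example that
    # looks nearly identical to the original but flips the prediction.
    x = x.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```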

FinMe: A Performance-Enhanced Large Language Model Trading Agent with Layered Memory and Character Design

  • paper_url: http://arxiv.org/abs/2311.13743
  • repo_url: None
  • paper_authors: Yangyang Yu, Haohang Li, Zhi Chen, Yuechen Jiang, Yang Li, Denghui Zhang, Rong Liu, Jordan W. Suchow, Khaldoun Khashanah
  • for: This paper develops an autonomous trading agent based on a large language model (LLM) to improve the efficiency and robustness of investment decisions.
  • methods: It proposes a new agent framework, FinMe, with three core modules: Profiling, which outlines the agent's characteristics; Memory, which uses layered processing to help the agent assimilate hierarchical financial data; and Decision-making, which converts insights from memory into investment decisions.
  • results: Compared with other algorithmic agents on a large real-world financial dataset, FinMe shows leading trading performance, and by self-evolving its professional knowledge and adapting to new investment cues it continuously improves trading returns.
    Abstract Recent advancements in Large Language Models (LLMs) have exhibited notable efficacy in question-answering (QA) tasks across diverse domains. Their prowess in integrating extensive web knowledge has fueled interest in developing LLM autonomous agents. While LLMs are efficient at decoding human instructions and deriving solutions by holistically processing historical inputs, transitioning to purpose-driven agents requires a supplementary rational architecture to process multi-source information, establish reasoning chains, and prioritize critical tasks. Addressing this, we introduce FinMe, a novel LLM-based agent framework devised for financial decision-making, encompassing three core modules: Profiling, to outline the agent's characteristics; Memory, with layered processing, to aid the agent in assimilating realistic hierarchical financial data; and Decision-making, to convert insights gained from memories into investment decisions. Notably, FinMe's memory module aligns closely with the cognitive structure of human traders, offering robust interpretability and real-time tuning. Its adjustable cognitive span allows for the retention of critical information beyond human perceptual limits, thereby enhancing trading outcomes. This framework enables the agent to self-evolve its professional knowledge, react agilely to new investment cues, and continuously refine trading decisions in the volatile financial environment. We first compare FinMe with various algorithmic agents on a scalable real-world financial dataset, underscoring its leading trading performance in stocks and funds. We then fine-tune the agent's perceptual spans to achieve significant trading performance. Collectively, FinMe presents a cutting-edge LLM agent framework for automated trading, boosting cumulative investment returns.
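To make the layered-memory idea concrete, here is a toy illustration in which items decay at layer-specific rates; this is an assumption-laden sketch, not FinMe's actual memory design:

```python
import math
import time

class LayeredMemory:
    # Toy layered memory: items decay at layer-specific rates, so
    # short-term observations fade quickly while long-term knowledge persists.
    def __init__(self):
        self.decay = {"short": 0.1, "mid": 0.01, "long": 0.001}  # per second
        self.events = []  # (timestamp, layer, text, importance)

    def add(self, layer, text, importance=1.0):
        self.events.append((time.time(), layer, text, importance))

    def recall(self, k=3):
        # Score each memory by importance discounted by its age-based decay,
        # and return the k highest-scoring items.
        now = time.time()
        scored = [(imp * math.exp(-self.decay[layer] * (now - ts)), text)
                  for ts, layer, text, imp in self.events]
        return [text for _, text in sorted(scored, reverse=True)[:k]]
```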

OASIS: Offsetting Active Reconstruction Attacks in Federated Learning

  • paper_url: http://arxiv.org/abs/2311.13739
  • repo_url: None
  • paper_authors: Tre’ R. Jeter, Truc Nguyen, Raed Alharbi, My T. Thai
  • for: The paper is written to address the challenge of active reconstruction attacks in Federated Learning (FL), which can compromise user privacy and threaten the security of model training.
  • methods: The proposed defense mechanism is based on image augmentation, which is used to undermine the attack principle of gradient inversion.
  • results: Comprehensive evaluations demonstrate the efficacy of OASIS, highlighting its feasibility as a solution to protect user privacy and enhance model training efficiency in FL.
    Abstract Federated Learning (FL) has garnered significant attention for its potential to protect user privacy while enhancing model training efficiency. However, recent research has demonstrated that FL protocols can be easily compromised by active reconstruction attacks executed by dishonest servers. These attacks involve the malicious modification of global model parameters, allowing the server to obtain a verbatim copy of users' private data by inverting their gradient updates. Tackling this class of attack remains a crucial challenge due to the strong threat model. In this paper, we propose OASIS, a defense mechanism based on image augmentation that effectively counteracts active reconstruction attacks while preserving model performance. We first uncover the core principle of gradient inversion that enables these attacks and theoretically identify the main conditions by which the defense can be robust regardless of the attack strategies. We then construct OASIS with image augmentation showing that it can undermine the attack principle. Comprehensive evaluations demonstrate the efficacy of OASIS highlighting its feasibility as a solution.
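A hedged sketch of how an augmentation-based defense could be wired into a client's local update, so that gradient inversion recovers only transformed views of the data; the specific transforms and the training loop are illustrative assumptions, not OASIS's actual mechanism:

```python
import torch
import torchvision.transforms as T

augment = T.Compose([
    T.RandomResizedCrop(32, scale=(0.8, 1.0)),
    T.RandomHorizontalFlip(),
    T.ColorJitter(0.2, 0.2),
])

def local_update(model, x, y):
    # Compute the client's gradients on augmented views rather than raw
    # images, so a dishonest server inverting the gradients recovers only
    # randomly transformed data instead of verbatim private samples.
    x_aug = torch.stack([augment(img) for img in x])
    loss = torch.nn.functional.cross_entropy(model(x_aug), y)
    grads = torch.autograd.grad(loss, model.parameters())
    return [g.detach() for g in grads]
```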