cs.AI - 2023-09-10

Collecting Visually-Grounded Dialogue with A Game Of Sorts

  • paper_url: http://arxiv.org/abs/2309.05162
  • repo_url: https://github.com/willemsenbram/a-game-of-sorts
  • paper_authors: Bram Willemsen, Dmytro Kalpakchi, Gabriel Skantze
  • for: This paper examines how referring expressions are produced and grounded in (situated) dialogue.
  • methods: A collaborative image ranking task, a grounded agreement game called "A Game Of Sorts", is introduced to study the difficulty and complexity of referring in conversation.
  • results: In this collaborative interaction, participants reach agreement through discussion and negotiation rather than by exchanging minimally-specified referring expressions; over the course of these discussions they jointly establish the meaning and features of the referents.
    Abstract An idealized, though simplistic, view of the referring expression production and grounding process in (situated) dialogue assumes that a speaker must merely appropriately specify their expression so that the target referent may be successfully identified by the addressee. However, referring in conversation is a collaborative process that cannot be aptly characterized as an exchange of minimally-specified referring expressions. Concerns have been raised regarding assumptions made by prior work on visually-grounded dialogue that reveal an oversimplified view of conversation and the referential process. We address these concerns by introducing a collaborative image ranking task, a grounded agreement game we call "A Game Of Sorts". In our game, players are tasked with reaching agreement on how to rank a set of images given some sorting criterion through a largely unrestricted, role-symmetric dialogue. By putting emphasis on the argumentation in this mixed-initiative interaction, we collect discussions that involve the collaborative referential process. We describe results of a small-scale data collection experiment with the proposed task. All discussed materials, which includes the collected data, the codebase, and a containerized version of the application, are publicly available.

Faster, Lighter, More Accurate: A Deep Learning Ensemble for Content Moderation

  • paper_url: http://arxiv.org/abs/2309.05150
  • repo_url: None
  • paper_authors: Mohammad Hosseini, Mahmudul Hasan
  • for: addresses the increasing need for efficient and accurate content moderation
  • methods: combines simple visual features with a lightweight ensemble of models
  • results: achieves significant improvements in prediction accuracy with 7.64x faster inference and lower computation cost compared to popular deep learning models such as ResNet-50.
    Abstract To address the increasing need for efficient and accurate content moderation, we propose an efficient and lightweight deep classification ensemble structure. Our approach is based on a combination of simple visual features, designed for high-accuracy classification of violent content with low false positives. Our ensemble architecture utilizes a set of lightweight models with narrowed-down color features, and we apply it to both images and videos. We evaluated our approach using a large dataset of explosion and blast contents and compared its performance to popular deep learning models such as ResNet-50. Our evaluation results demonstrate significant improvements in prediction accuracy, while benefiting from 7.64x faster inference and lower computation cost. While our approach is tailored to explosion detection, it can be applied to other similar content moderation and violence detection use cases as well. Based on our experiments, we propose a "think small, think many" philosophy in classification scenarios. We argue that transforming a single, large, monolithic deep model into a verification-based step model ensemble of multiple small, simple, and lightweight models with narrowed-down visual features can possibly lead to predictions with higher accuracy.
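
The "think small, think many" idea can be pictured as a verification cascade in which cheap models screen most content and later stages only confirm positives. The sketch below is a hypothetical illustration; the helper names, sklearn-style classifiers, and thresholds are assumptions, not the paper's implementation:

```python
import numpy as np

def cascade_predict(models, thresholds, features_per_stage):
    """Hypothetical verification cascade: a sample is flagged as violent/explosive only
    if every lightweight stage agrees; cheap early stages reject most benign content,
    which keeps false positives and average compute low."""
    for model, thr, feats in zip(models, thresholds, features_per_stage):
        # each `model` is assumed to be an sklearn-style binary classifier fitted on
        # narrowed-down color features (e.g., a restricted color histogram)
        if model.predict_proba(np.asarray(feats).reshape(1, -1))[0, 1] < thr:
            return 0  # rejected early: not flagged
    return 1          # all stages verified: flagged
```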

Representation Learning in Low-rank Slate-based Recommender Systems

  • paper_url: http://arxiv.org/abs/2309.08622
  • repo_url: None
  • paper_authors: Yijia Dai, Wen Sun
  • for: Improve long-term user engagement through reinforcement learning-based recommender systems.
  • methods: A sample-efficient representation learning algorithm that uses the standard slate recommendation setup and treats the problem as online RL with low-rank Markov decision processes (MDPs).
  • results: Together with a recommender simulation environment built for the proposed setup and sampling method, the approach enables sample-efficient learning and exploration.
    Abstract Reinforcement learning (RL) in recommendation systems offers the potential to optimize recommendations for long-term user engagement. However, the environment often involves large state and action spaces, which makes it hard to efficiently learn and explore. In this work, we propose a sample-efficient representation learning algorithm, using the standard slate recommendation setup, to treat this as an online RL problem with low-rank Markov decision processes (MDPs). We also construct the recommender simulation environment with the proposed setup and sampling method.

Outlier Robust Adversarial Training

  • paper_url: http://arxiv.org/abs/2309.05145
  • repo_url: https://github.com/discovershu/orat
  • paper_authors: Shu Hu, Zhenhuan Yang, Xin Wang, Yiming Ying, Siwei Lyu
  • for: This paper proposes a supervised learning approach that is simultaneously robust to outliers in the training data and to adversarial attacks at inference time, improving model reliability and robustness.
  • methods: Outlier Robust Adversarial Training (ORAT), based on a bi-level optimization formulation of adversarial training with a robust rank-based loss function.
  • results: Experiments show that ORAT effectively handles training data containing outliers under adversarial attack; theoretically, its learning objective satisfies $\mathcal{H}$-consistency and admits uniform convergence rates with high probability.
    Abstract Supervised learning models are challenged by the intrinsic complexities of training data such as outliers and minority subpopulations and intentional attacks at inference time with adversarial samples. While traditional robust learning methods and the recent adversarial training approaches are designed to handle each of the two challenges, to date, no work has been done to develop models that are robust with regard to the low-quality training data and the potential adversarial attack at inference time simultaneously. It is for this reason that we introduce Outlier Robust Adversarial Training (ORAT) in this work. ORAT is based on a bi-level optimization formulation of adversarial training with a robust rank-based loss function. Theoretically, we show that the learning objective of ORAT satisfies the $\mathcal{H}$-consistency in binary classification, which establishes it as a proper surrogate to adversarial 0/1 loss. Furthermore, we analyze its generalization ability and provide uniform convergence rates in high probability. ORAT can be optimized with a simple algorithm. Experimental evaluations on three benchmark datasets demonstrate the effectiveness and robustness of ORAT in handling outliers and adversarial attacks. Our code is available at https://github.com/discovershu/ORAT.
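
To make the idea concrete, here is a minimal sketch of adversarial training with a rank-based loss that discards the largest per-sample adversarial losses, treating them as likely outliers. It uses one-step FGSM as a stand-in for the inner maximization; the paper's actual bi-level formulation, loss definition, and optimization algorithm are not reproduced here:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps):
    """One-step FGSM adversarial example, a stand-in for the inner maximization."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    return (x_adv + eps * grad.sign()).detach()

def ranked_adversarial_loss(model, x, y, eps, k1, k2):
    """Average the per-sample adversarial losses whose rank lies in [k1, k2),
    dropping the k1 largest losses that are most likely caused by outliers."""
    x_adv = fgsm_attack(model, x, y, eps)
    losses = F.cross_entropy(model(x_adv), y, reduction="none")
    sorted_losses, _ = torch.sort(losses, descending=True)
    return sorted_losses[k1:k2].mean()
```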

Large Language Models for Difficulty Estimation of Foreign Language Content with Application to Language Learning

  • paper_url: http://arxiv.org/abs/2309.05142
  • repo_url: None
  • paper_authors: Michalis Vlachos, Mircea Lungu, Yash Raj Shrestha, Johannes-Rudolf David
  • for: Help foreign language learners improve their proficiency by identifying content on topics they are interested in that closely matches their proficiency level.
  • methods: Use large language models to discover content on topics the learner cares about, estimate the linguistic difficulty of content more precisely than traditional readability measures, and provide both textual and video-based content.
  • results: A language-learning solution that adapts to the learner's interests and learning objectives, helping learners stay engaged and motivated throughout the language-learning process.
    Abstract We use large language models to aid learners enhance proficiency in a foreign language. This is accomplished by identifying content on topics that the user is interested in, and that closely align with the learner's proficiency level in that foreign language. Our work centers on French content, but our approach is readily transferable to other languages. Our solution offers several distinctive characteristics that differentiate it from existing language-learning solutions, such as, a) the discovery of content across topics that the learner cares about, thus increasing motivation, b) a more precise estimation of the linguistic difficulty of the content than traditional readability measures, and c) the availability of both textual and video-based content. The linguistic complexity of video content is derived from the video captions. It is our aspiration that such technology will enable learners to remain engaged in the language-learning process by continuously adapting the topics and the difficulty of the content to align with the learners' evolving interests and learning objectives.

Signal Temporal Logic Neural Predictive Control

  • paper_url: http://arxiv.org/abs/2309.05131
  • repo_url: None
  • paper_authors: Yue Meng, Chuchu Fan
  • for: This work aims to systematically and reliably satisfy safety and temporal requirements in long-horizon robotic tasks.
  • methods: A neural network controller is learned directly, via reinforcement learning, to satisfy requirements specified in Signal Temporal Logic (STL). The controller is trained to maximize the STL robustness score and, similar to Model Predictive Control (MPC), predicts a trajectory within a planning horizon at deployment to ensure the STL requirement is satisfied.
  • results: Across six tasks, the method with a backup policy outperforms classical methods (MPC, STL solvers) and model-free and model-based RL methods in STL satisfaction rate, especially on tasks with complex STL specifications, while being 10x-100x faster than the classical methods.
    Abstract Ensuring safety and meeting temporal specifications are critical challenges for long-term robotic tasks. Signal temporal logic (STL) has been widely used to systematically and rigorously specify these requirements. However, traditional methods of finding the control policy under those STL requirements are computationally complex and not scalable to high-dimensional or systems with complex nonlinear dynamics. Reinforcement learning (RL) methods can learn the policy to satisfy the STL specifications via hand-crafted or STL-inspired rewards, but might encounter unexpected behaviors due to ambiguity and sparsity in the reward. In this paper, we propose a method to directly learn a neural network controller to satisfy the requirements specified in STL. Our controller learns to roll out trajectories to maximize the STL robustness score in training. In testing, similar to Model Predictive Control (MPC), the learned controller predicts a trajectory within a planning horizon to ensure the satisfaction of the STL requirement in deployment. A backup policy is designed to ensure safety when our controller fails. Our approach can adapt to various initial conditions and environmental parameters. We conduct experiments on six tasks, where our method with the backup policy outperforms the classical methods (MPC, STL-solver), model-free and model-based RL methods in STL satisfaction rate, especially on tasks with complex STL specifications while being 10X-100X faster than the classical methods.
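
As a concrete example of the STL robustness score being maximized, the sketch below evaluates two simple temporal operators on a sampled trajectory (positive robustness means the specification is satisfied, and its magnitude is the margin). This illustrates standard STL quantitative semantics, not the paper's controller:

```python
import numpy as np

def robustness_globally_above(signal, threshold):
    """Robustness of G(signal > threshold): the worst-case margin over the horizon."""
    return np.min(signal - threshold)

def robustness_eventually_above(signal, threshold):
    """Robustness of F(signal > threshold): the best-case margin over the horizon."""
    return np.max(signal - threshold)

trajectory = np.array([0.9, 1.2, 1.5, 1.1, 0.8])
print(robustness_globally_above(trajectory, 1.0))    # -0.2 -> "always above" violated
print(robustness_eventually_above(trajectory, 1.0))  #  0.5 -> "eventually above" satisfied
```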

The online learning architecture with edge computing for high-level control for assisting patients

  • paper_url: http://arxiv.org/abs/2309.05130
  • repo_url: None
  • paper_authors: Yue Shi, Yihui Zhao
  • for: This work aims to improve mobility and rehabilitation prospects for individuals with lower-limb impairments caused by conditions such as spinal cord injuries, strokes, and degenerative diseases.
  • methods: An online adversarial learning architecture integrated with edge computing processes sensor data in real time to provide high-level lower-limb exoskeleton control.
  • results: Experiments demonstrate improved control accuracy and adaptability as well as enhanced quality-of-service (QoS) metrics, indicating that combining online adversarial learning with edge computing offers a robust and efficient approach for next-generation lower-limb exoskeleton control systems.
    Abstract The prevalence of mobility impairments due to conditions such as spinal cord injuries, strokes, and degenerative diseases is on the rise globally. Lower-limb exoskeletons have been increasingly recognized as a viable solution for enhancing mobility and rehabilitation for individuals with such impairments. However, existing exoskeleton control systems often suffer from limitations such as latency, lack of adaptability, and computational inefficiency. To address these challenges, this paper introduces a novel online adversarial learning architecture integrated with edge computing for high-level lower-limb exoskeleton control. In the proposed architecture, sensor data from the user is processed in real-time through edge computing nodes, which then interact with an online adversarial learning model. This model adapts to the user's specific needs and controls the exoskeleton with minimal latency. Experimental evaluations demonstrate significant improvements in control accuracy and adaptability, as well as enhanced quality-of-service (QoS) metrics. These findings indicate that the integration of online adversarial learning with edge computing offers a robust and efficient approach for the next generation of lower-limb exoskeleton control systems.

WIP: Development of a Student-Centered Personalized Learning Framework to Advance Undergraduate Robotics Education

  • paper_url: http://arxiv.org/abs/2309.05124
  • repo_url: None
  • paper_authors: Ponkoj Chandra Shill, Rui Wu, Hossein Jamali, Bryan Hutchins, Sergiu Dascalu, Frederick C. Harris, David Feil-Seifer
  • for: Provide a personalized learning environment for robotics students, addressing the scarcity of skilled robotics instructors, particularly at community colleges, and the high cost of training equipment.
  • methods: Develop a web-based robotics instruction system compatible with inexpensive hardware so that freely distributed teaching materials can enable more robotics courses at both two- and four-year schools and universities; course materials are designed as small units with a hierarchical dependency tree so students can customize their course of study around skills they have already mastered.
  • results: In an evaluation of a five-module mini-course, students reported a positive experience with the online content and scored it highly on relatedness, mastery, and autonomy, indicating strong motivational potential for this approach.
    Abstract This paper presents a work-in-progress on a learn-ing system that will provide robotics students with a personalized learning environment. This addresses both the scarcity of skilled robotics instructors, particularly in community colleges and the expensive demand for training equipment. The study of robotics at the college level represents a wide range of interests, experiences, and aims. This project works to provide students the flexibility to adapt their learning to their own goals and prior experience. We are developing a system to enable robotics instruction through a web-based interface that is compatible with less expensive hardware. Therefore, the free distribution of teaching materials will empower educators. This project has the potential to increase the number of robotics courses offered at both two- and four-year schools and universities. The course materials are being designed with small units and a hierarchical dependency tree in mind; students will be able to customize their course of study based on the robotics skills they have already mastered. We present an evaluation of a five module mini-course in robotics. Students indicated that they had a positive experience with the online content. They also scored the experience highly on relatedness, mastery, and autonomy perspectives, demonstrating strong motivation potential for this approach.

High Fidelity Fast Simulation of Human in the Loop Human in the Plant (HIL-HIP) Systems

  • paper_url: http://arxiv.org/abs/2309.06558
  • repo_url: None
  • paper_authors: Ayan Banerjee, Payal Kamboj, Aranyak Maity, Riya Sudhakar Salian, Sandeep K. S. Gupta
  • for: This paper studies the nonlinear simulation problems that arise when wireless mobile networks are integrated with human-in-the-loop, human-in-the-plant (HIL-HIP) physical systems under dynamic contexts.
  • methods: Time variance is handled by deriving a series of piecewise linear time-invariant simulations (PLIS) over intervals, which are then concatenated in the time domain.
  • results: The PLIS approach achieves accurate simulation with more than a 2.1x speedup over a nonlinear system simulation on the given dataset.
    Abstract Non-linearities in simulation arise from the time variance in wireless mobile networks when integrated with human in the loop, human in the plant (HIL-HIP) physical systems under dynamic contexts, leading to simulation slowdown. Time variance is handled by deriving a series of piece wise linear time invariant simulations (PLIS) in intervals, which are then concatenated in time domain. In this paper, we conduct a formal analysis of the impact of discretizing time-varying components in wireless network-controlled HIL-HIP systems on simulation accuracy and speedup, and evaluate trade-offs with reliable guarantees. We develop an accurate simulation framework for an artificial pancreas wireless network system that controls blood glucose in Type 1 Diabetes patients with time varying properties such as physiological changes associated with psychological stress and meal patterns. PLIS approach achieves accurate simulation with greater than 2.1 times speedup than a non-linear system simulation for the given dataset.
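
The core PLIS idea, freezing the time-varying dynamics on each interval, simulating each frozen piece as an LTI system, and concatenating the pieces in time, can be illustrated with a toy linear system. The function and system below are assumptions for illustration, not the paper's artificial pancreas model:

```python
import numpy as np

def simulate_plis(A_of_t, x0, t_grid, num_intervals):
    """Piecewise linear time-invariant simulation: split the horizon into intervals,
    hold the time-varying matrix A(t) constant within each interval, integrate each
    frozen piece (forward Euler here), and concatenate the pieces in the time domain."""
    xs = [np.asarray(x0, dtype=float)]
    step_chunks = np.array_split(np.arange(len(t_grid) - 1), num_intervals)
    for chunk in step_chunks:
        A_frozen = A_of_t(t_grid[chunk[0]])          # piecewise-constant dynamics
        for k in chunk:
            dt = t_grid[k + 1] - t_grid[k]
            xs.append(xs[-1] + dt * (A_frozen @ xs[-1]))
    return np.array(xs)

# toy time-varying system dx/dt = A(t) x with a slowly drifting coefficient
A = lambda t: np.array([[-0.5 - 0.1 * np.sin(t), 0.0], [0.3, -1.0]])
trajectory = simulate_plis(A, [1.0, 0.0], np.linspace(0.0, 10.0, 501), num_intervals=10)
```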

A compendium of data sources for data science, machine learning, and artificial intelligence

  • paper_url: http://arxiv.org/abs/2309.05682
  • repo_url: None
  • paper_authors: Paul Bilokon, Oleksandr Bilokon, Saeed Amen
  • for: Provide a compendium of data sources for data science, machine learning, and artificial intelligence to support practitioners across application areas.
  • methods: Compile and briefly describe data sources across multiple application areas, including finance and economics, legal (laws and regulations), life sciences (medicine and drug discovery), news sentiment and social media, retail and ecommerce, satellite imagery, shipping and logistics, and sports.
  • results: An inevitably incomplete but broad list of data sources that can benefit data scientists and machine learning experts of all levels of seniority.
    Abstract Recent advances in data science, machine learning, and artificial intelligence, such as the emergence of large language models, are leading to an increasing demand for data that can be processed by such models. While data sources are application-specific, and it is impossible to produce an exhaustive list of such data sources, it seems that a comprehensive, rather than complete, list would still benefit data scientists and machine learning experts of all levels of seniority. The goal of this publication is to provide just such an (inevitably incomplete) list -- or compendium -- of data sources across multiple areas of applications, including finance and economics, legal (laws and regulations), life sciences (medicine and drug discovery), news sentiment and social media, retail and ecommerce, satellite imagery, and shipping and logistics, and sports.

Deep Learning-Aided Subspace-Based DOA Recovery for Sparse Arrays

  • paper_url: http://arxiv.org/abs/2309.05109
  • repo_url: None
  • paper_authors: Yoav Amiel, Dor H. Shmuel, Nir Shlezinger, Wasim Huleihel
  • for: This work develops a deep learning-aided subspace method for direction-of-arrival (DoA) estimation with sparse, miscalibrated arrays.
  • methods: A dedicated deep network (Sparse-SubspaceNet) learns from data how to compute a surrogate virtual array covariance that is divisible into distinguishable subspaces.
  • results: The approach copes with coherent sources and miscalibrated sparse arrays while preserving the interpretability and suitability of model-based subspace DoA estimators.
    Abstract Sparse arrays enable resolving more direction of arrivals (DoAs) than antenna elements using non-uniform arrays. This is typically achieved by reconstructing the covariance of a virtual large uniform linear array (ULA), which is then processed by subspace DoA estimators. However, these method assume that the signals are non-coherent and the array is calibrated; the latter often challenging to achieve in sparse arrays, where one cannot access the virtual array elements. In this work, we propose Sparse-SubspaceNet, which leverages deep learning to enable subspace-based DoA recovery from sparse miscallibrated arrays with coherent sources. Sparse- SubspaceNet utilizes a dedicated deep network to learn from data how to compute a surrogate virtual array covariance that is divisible into distinguishable subspaces. By doing so, we learn to cope with coherent sources and miscalibrated sparse arrays, while preserving the interpretability and the suitability of model-based subspace DoA estimators.
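
For context, the model-based subspace estimator that such a surrogate covariance would feed is in the spirit of classic MUSIC: eigendecompose the covariance, keep the noise subspace, and scan steering vectors for spectral peaks. A standard MUSIC sketch for a uniform linear array is shown below; it is the conventional baseline, not the paper's learned surrogate covariance:

```python
import numpy as np

def music_spectrum(R, num_sources, num_sensors, d_over_lambda=0.5):
    """Classic MUSIC pseudo-spectrum for a uniform linear array, given covariance R."""
    grid = np.deg2rad(np.arange(-90.0, 90.0, 0.5))
    eigenvalues, eigenvectors = np.linalg.eigh(R)    # eigenvalues in ascending order
    noise_subspace = eigenvectors[:, : num_sensors - num_sources]
    k = np.arange(num_sensors)
    spectrum = []
    for theta in grid:
        a = np.exp(-2j * np.pi * d_over_lambda * k * np.sin(theta))  # steering vector
        spectrum.append(1.0 / (np.linalg.norm(noise_subspace.conj().T @ a) ** 2))
    return np.rad2deg(grid), np.array(spectrum)      # peaks indicate estimated DoAs
```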

AGent: A Novel Pipeline for Automatically Creating Unanswerable Questions

  • paper_url: http://arxiv.org/abs/2309.05103
  • repo_url: https://github.com/sonqt/agent-unanswerable
  • paper_authors: Son Quoc Tran, Gia-Huy Do, Phong Nguyen-Thuan Do, Matt Kretchmar, Xinya Du
  • for: Improve extractive question answering (EQA) by training models with automatically created unanswerable questions, helping them avoid extracting misleading or incorrect answers for queries that lack valid responses.
  • methods: The AGent pipeline automatically creates unanswerable questions by re-matching a question with a context that lacks the information needed for a correct answer.
  • results: EQA models fine-tuned on unanswerable questions created by AGent achieve performance comparable to models fine-tuned on the SQuAD 2.0 dataset across multiple EQA benchmarks.
    Abstract The development of large high-quality datasets and high-performing models have led to significant advancements in the domain of Extractive Question Answering (EQA). This progress has sparked considerable interest in exploring unanswerable questions within the EQA domain. Training EQA models with unanswerable questions helps them avoid extracting misleading or incorrect answers for queries that lack valid responses. However, manually annotating unanswerable questions is labor-intensive. To address this, we propose AGent, a novel pipeline that automatically creates new unanswerable questions by re-matching a question with a context that lacks the necessary information for a correct answer. In this paper, we demonstrate the usefulness of this AGent pipeline by creating two sets of unanswerable questions from answerable questions in SQuAD and HotpotQA. These created question sets exhibit low error rates. Additionally, models fine-tuned on these questions show comparable performance with those fine-tuned on the SQuAD 2.0 dataset on multiple EQA benchmarks.
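
A heavily simplified illustration of the re-matching idea is sketched below: pair each question with a lexically similar context that is not its gold context, so the resulting pair is likely unanswerable. The TF-IDF retrieval and the absence of filtering or verification steps are assumptions of this sketch, not the actual AGent pipeline:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rematch_unanswerable(questions, contexts, gold_context_ids, top_k=1):
    """For each question, pick the most similar context that is NOT its gold context,
    yielding (question, context) pairs that are likely unanswerable."""
    vectorizer = TfidfVectorizer().fit(questions + contexts)
    sims = cosine_similarity(vectorizer.transform(questions), vectorizer.transform(contexts))
    pairs = []
    for i, question in enumerate(questions):
        ranked = sims[i].argsort()[::-1]
        picked = [j for j in ranked if j != gold_context_ids[i]][:top_k]
        pairs.extend((question, contexts[j]) for j in picked)
    return pairs
```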

Exploring Social Choice Mechanisms for Recommendation Fairness in SCRUF

  • paper_url: http://arxiv.org/abs/2309.08621
  • repo_url: https://github.com/that-recsys-lab/scruf_facctrec_2023
  • paper_authors: Amanda Aird, Cassidy All, Paresha Farastu, Elena Stefancova, Joshua Sun, Nicholas Mattei, Robin Burke
  • for: This paper addresses fairness in recommender systems, in particular the tension among multiple, competing fairness concerns.
  • methods: The fairness problem is formalized using social choice theory within a multi-agent architecture of fairness concerns, and a range of choice and allocation mechanisms for reconciling competing concerns is considered.
  • results: Experiments on real and synthetic data show that different classes of choice and allocation mechanisms yield different but consistent fairness/accuracy tradeoffs, and the multi-agent formulation offers flexibility in adapting to user population dynamics.
    Abstract Fairness problems in recommender systems often have a complexity in practice that is not adequately captured in simplified research formulations. A social choice formulation of the fairness problem, operating within a multi-agent architecture of fairness concerns, offers a flexible and multi-aspect alternative to fairness-aware recommendation approaches. Leveraging social choice allows for increased generality and the possibility of tapping into well-studied social choice algorithms for resolving the tension between multiple, competing fairness concerns. This paper explores a range of options for choice mechanisms in multi-aspect fairness applications using both real and synthetic data and shows that different classes of choice and allocation mechanisms yield different but consistent fairness / accuracy tradeoffs. We also show that a multi-agent formulation offers flexibility in adapting to user population dynamics.
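
As one concrete example of the kind of social choice mechanism such a formulation can tap into, a Borda count can aggregate the base recommender's ranking with re-rankings proposed by several fairness agents. The sketch below is only an illustrative mechanism and agent setup, not the specific mechanisms evaluated in SCRUF:

```python
def borda_aggregate(rankings):
    """Aggregate several rankings of the same items with a Borda count:
    an item in position p of an n-item ranking earns n - 1 - p points."""
    n = len(rankings[0])
    scores = {}
    for ranking in rankings:
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - 1 - position)
    return sorted(scores, key=scores.get, reverse=True)

final_ranking = borda_aggregate([
    ["a", "b", "c", "d"],   # accuracy-oriented base recommender
    ["c", "a", "d", "b"],   # hypothetical provider-fairness agent
    ["b", "c", "a", "d"],   # hypothetical second fairness concern
])
```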

Variance Reduction of Resampling for Sequential Monte Carlo

  • paper_url: http://arxiv.org/abs/2309.08620
  • repo_url: https://github.com/986876245/variance-reduction-for-smc
  • paper_authors: Xiongming Dai, Gerald Baumgartner
  • for: This paper proposes a resampling scheme that replaces low-weight particles in sequential Monte Carlo, so that hidden Markov models can be approximated more quickly and accurately, especially in the nonlinear case.
  • methods: A repetitive deterministic domain with median ergodicity is used for resampling.
  • results: The method achieves the lowest variances among the compared resampling methods, and with a deterministic domain of size M much smaller than the population size N it is faster than the state of the art in both linear and nonlinear hidden Markov model experiments.
    Abstract A resampling scheme provides a way to switch low-weight particles for sequential Monte Carlo with higher-weight particles representing the objective distribution. The less the variance of the weight distribution is, the more concentrated the effective particles are, and the quicker and more accurate it is to approximate the hidden Markov model, especially for the nonlinear case. We propose a repetitive deterministic domain with median ergodicity for resampling and have achieved the lowest variances compared to the other resampling methods. As the size of the deterministic domain $M\ll N$ (the size of population), given a feasible size of particles, our algorithm is faster than the state of the art, which is verified by theoretical deduction and experiments of a hidden Markov model in both the linear and non-linear cases.
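
For reference, the sketch below contrasts two standard resampling schemes, multinomial and systematic, by the variance of their offspring counts. The paper's repetitive deterministic domain with median ergodicity is a different, lower-variance scheme that is not reproduced here:

```python
import numpy as np

def multinomial_resample(weights, rng):
    return rng.choice(len(weights), size=len(weights), p=weights)

def systematic_resample(weights, rng):
    n = len(weights)
    positions = (rng.uniform() + np.arange(n)) / n       # one shared random offset
    return np.searchsorted(np.cumsum(weights), positions)

rng = np.random.default_rng(0)
w = rng.dirichlet(np.ones(1000))                          # toy normalized particle weights
counts_multinomial = np.bincount(multinomial_resample(w, rng), minlength=len(w))
counts_systematic = np.bincount(systematic_resample(w, rng), minlength=len(w))
print(counts_multinomial.var(), counts_systematic.var())  # systematic is markedly lower
```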

Neural-Hidden-CRF: A Robust Weakly-Supervised Sequence Labeler

  • paper_url: http://arxiv.org/abs/2309.05086
  • repo_url: https://github.com/junchenzhi/neural-hidden-crf
  • paper_authors: Zhijun Chen, Hailong Sun, Wanhao Zhang, Chunyi Xu, Qianren Mao, Pengpeng Chen
  • for: Solve the weakly-supervised sequence labeling problem.
  • methods: Neural-Hidden-CRF, a neuralized undirected graphical model with a hidden CRF layer, models the word sequence, latent ground-truth sequence, and weak label sequences from a global perspective, drawing on BERT or other deep models for rich contextual semantics.
  • results: New state-of-the-art results on one crowdsourcing benchmark and three weak-supervision benchmarks, outperforming the recent advanced model CHMM by 2.80 F1 points and 2.23 F1 points in average generalization and inference performance, respectively.
    Abstract We propose a neuralized undirected graphical model called Neural-Hidden-CRF to solve the weakly-supervised sequence labeling problem. Under the umbrella of probabilistic undirected graph theory, the proposed Neural-Hidden-CRF embedded with a hidden CRF layer models the variables of word sequence, latent ground truth sequence, and weak label sequence with the global perspective that undirected graphical models particularly enjoy. In Neural-Hidden-CRF, we can capitalize on the powerful language model BERT or other deep models to provide rich contextual semantic knowledge to the latent ground truth sequence, and use the hidden CRF layer to capture the internal label dependencies. Neural-Hidden-CRF is conceptually simple and empirically powerful. It obtains new state-of-the-art results on one crowdsourcing benchmark and three weak-supervision benchmarks, including outperforming the recent advanced model CHMM by 2.80 F1 points and 2.23 F1 points in average generalization and inference performance, respectively.
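
The CRF layer that captures internal label dependencies rests on the standard linear-chain forward algorithm. A minimal sketch of that building block is given below; the paper's hidden CRF additionally models latent ground-truth and weak label sequences, which this sketch does not:

```python
import torch

def crf_log_partition(emissions, transitions):
    """Forward algorithm for a linear-chain CRF.
    emissions: (seq_len, num_tags) unary scores; transitions[i, j]: score of tag i -> j."""
    alpha = emissions[0]
    for t in range(1, emissions.size(0)):
        alpha = torch.logsumexp(alpha.unsqueeze(1) + transitions, dim=0) + emissions[t]
    return torch.logsumexp(alpha, dim=0)

def crf_sequence_score(emissions, transitions, tags):
    """Unnormalized score of one tag sequence (numerator of the CRF likelihood)."""
    score = emissions[0, tags[0]]
    for t in range(1, len(tags)):
        score = score + transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]
    return score

# negative log-likelihood of a gold sequence:
# nll = crf_log_partition(e, T) - crf_sequence_score(e, T, gold_tags)
```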

An Appraisal-Based Chain-Of-Emotion Architecture for Affective Language Model Game Agents

  • paper_url: http://arxiv.org/abs/2309.05076
  • repo_url: None
  • paper_authors: Maximilian Croissant, Madeleine Frister, Guy Schofield, Cade McCall
  • for: This research addresses challenges in building believable, natural, and interactive digital agents, specifically agents that effectively simulate human emotions.
  • methods: Large language models (LLMs) are used to tap common patterns in situational appraisal, and a new chain-of-emotion architecture for emotion simulation in video games, based on psychological appraisal research, is proposed and tested.
  • results: The chain-of-emotion architecture outperforms standard LLM architectures on a range of user experience and content analysis metrics, providing early evidence for how to construct and test affective agents based on cognitive processes represented in language models.
    Abstract The development of believable, natural, and interactive digital artificial agents is a field of growing interest. Theoretical uncertainties and technical barriers present considerable challenges to the field, particularly with regards to developing agents that effectively simulate human emotions. Large language models (LLMs) might address these issues by tapping common patterns in situational appraisal. In three empirical experiments, this study tests the capabilities of LLMs to solve emotional intelligence tasks and to simulate emotions. It presents and evaluates a new chain-of-emotion architecture for emotion simulation within video games, based on psychological appraisal research. Results show that it outperforms standard LLM architectures on a range of user experience and content analysis metrics. This study therefore provides early evidence of how to construct and test affective agents based on cognitive processes represented in language models.

Chebyshev Particles

  • paper_url: http://arxiv.org/abs/2309.06373
  • repo_url: https://github.com/986876245/chebyshevparticles
  • paper_authors: Xiongming Dai, Gerald Baumgartner
  • for: This paper targets parameter inference for hidden Markov models, particularly in settings where the Monte Carlo sampler struggles with the curse of dimensionality.
  • methods: A new criterion, maximizing the weighted Riesz polarization quantity, discretizes rectifiable submanifolds via pairwise interaction; the resulting Chebyshev particles are embedded into sequential MCMC, a sampler with a high acceptance ratio that proposes only a few evaluations.
  • results: Experiments demonstrate high performance for parameter inference in a linear Gaussian state-space model with synthetic data and a nonlinear stochastic volatility model with real-world data.
    Abstract Markov chain Monte Carlo (MCMC) provides a feasible method for inferring Hidden Markov models, however, it is often computationally prohibitive, especially constrained by the curse of dimensionality, as the Monte Carlo sampler traverses randomly taking small steps within uncertain regions in the parameter space. We are the first to consider the posterior distribution of the objective as a mapping of samples in an infinite-dimensional Euclidean space where deterministic submanifolds are embedded and propose a new criterion by maximizing the weighted Riesz polarization quantity, to discretize rectifiable submanifolds via pairwise interaction. We study the characteristics of Chebyshev particles and embed them into sequential MCMC, a novel sampler with a high acceptance ratio that proposes only a few evaluations. We have achieved high performance from the experiments for parameter inference in a linear Gaussian state-space model with synthetic data and a non-linear stochastic volatility model with real-world data.
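
For intuition about the pairwise-interaction objective, the sketch below computes the standard weighted Riesz s-energy of a point configuration. The paper maximizes a related weighted Riesz polarization quantity, so this is only an illustration of the kind of pairwise functional involved, under assumed notation:

```python
import numpy as np

def weighted_riesz_energy(points, weights, s=2.0):
    """Weighted Riesz s-energy via pairwise interaction:
    sum over i != j of w_i * w_j / ||x_i - x_j||^s.
    Low-energy configurations spread weighted points evenly over the underlying set."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)                 # exclude the i == j terms
    return np.sum(np.outer(weights, weights) / dists ** s)
```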

Spatiotemporal Graph Neural Networks with Uncertainty Quantification for Traffic Incident Risk Prediction

  • paper_url: http://arxiv.org/abs/2309.05072
  • repo_url: https://github.com/sttdanonymous/sttd
  • paper_authors: Xiaowei Gao, Xinke Jiang, Dingyi Zhuang, Huanfa Chen, Shenhao Wang, James Haworth
  • for: Predicting traffic incident risk at granular spatiotemporal levels is challenging: the data are dominated by zero values (no incidents) with sporadic high-risk values for severe incidents, and most existing models, especially deep learning methods, focus solely on estimating risk values while ignoring the uncertainty arising from the inherently unpredictable nature of incidents.
  • methods: The Spatiotemporal Zero-Inflated Tweedie Graph Neural Networks (STZITD-GNNs) merge the reliability of traditional statistical models with the flexibility of graph neural networks to quantify the uncertainty of road-level traffic incident risk. A compound Tweedie model uses a Poisson distribution for risk frequency and a Gamma distribution for incident severity, and a zero-inflated component identifies non-incident risk scenarios.
  • results: Experiments on real-world traffic data from London, UK show that the model outperforms current benchmarks over both short (7-day) and extended (14-day) timeframes, not only in accuracy but also in curtailing uncertainty, yielding more robust predictions.
    Abstract Predicting traffic incident risks at granular spatiotemporal levels is challenging. The datasets predominantly feature zero values, indicating no incidents, with sporadic high-risk values for severe incidents. Notably, a majority of current models, especially deep learning methods, focus solely on estimating risk values, overlooking the uncertainties arising from the inherently unpredictable nature of incidents. To tackle this challenge, we introduce the Spatiotemporal Zero-Inflated Tweedie Graph Neural Networks (STZITD-GNNs). Our model merges the reliability of traditional statistical models with the flexibility of graph neural networks, aiming to precisely quantify uncertainties associated with road-level traffic incident risks. This model strategically employs a compound model from the Tweedie family, as a Poisson distribution to model risk frequency and a Gamma distribution to account for incident severity. Furthermore, a zero-inflated component helps to identify the non-incident risk scenarios. As a result, the STZITD-GNNs effectively capture the dataset's skewed distribution, placing emphasis on infrequent but impactful severe incidents. Empirical tests using real-world traffic data from London, UK, demonstrate that our model excels beyond current benchmarks. The forte of STZITD-GNN resides not only in its accuracy but also in its adeptness at curtailing uncertainties, delivering robust predictions over short (7 days) and extended (14 days) timeframes.
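
One familiar building block behind the Poisson-frequency/Gamma-severity formulation is the Tweedie deviance loss used for zero-heavy, right-skewed targets. A minimal sketch is shown below; the paper's model parameterizes frequency and severity separately and adds an explicit zero-inflated component, none of which this single-parameter loss captures:

```python
import torch

def tweedie_loss(y, mu, p=1.5):
    """Tweedie deviance-style loss for 1 < p < 2 (compound Poisson-Gamma regime),
    up to terms that do not depend on the prediction; mu must be strictly positive
    (e.g., produced by an exp or softplus output head)."""
    return torch.mean(-y * mu.pow(1 - p) / (1 - p) + mu.pow(2 - p) / (2 - p))
```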

Chasing the Intruder: A Reinforcement Learning Approach for Tracking Intruder Drones

  • paper_url: http://arxiv.org/abs/2309.05070
  • repo_url: None
  • paper_authors: Shivam Kainth, Subham Sahoo, Rajtilak Pal, Shashi Shekhar Jha
  • for: This paper addresses the problem of identifying and tracking intruder drones that are misused to spy on or intrude into restricted or private airspace.
  • methods: A reinforcement learning-based approach in which a chaser drone learns a control policy for chasing the intruder, using computer vision techniques interleaved with the policy learning framework; the system is implemented with ROS and Gazebo together with an Ardupilot-based flight controller.
  • results: Experiments show that the learned policy converges to identify and track the intruder drone and is robust to changes in the intruder's speed or orientation.
    Abstract Drones are becoming versatile in a myriad of applications. This has led to the use of drones for spying and intruding into the restricted or private air spaces. Such foul use of drone technology is dangerous for the safety and security of many critical infrastructures. In addition, due to the varied low-cost design and agility of the drones, it is a challenging task to identify and track them using the conventional radar systems. In this paper, we propose a reinforcement learning based approach for identifying and tracking any intruder drone using a chaser drone. Our proposed solution uses computer vision techniques interleaved with the policy learning framework of reinforcement learning to learn a control policy for chasing the intruder drone. The whole system has been implemented using ROS and Gazebo along with the Ardupilot based flight controller. The results show that the reinforcement learning based policy converges to identify and track the intruder drone. Further, the learnt policy is robust with respect to the change in speed or orientation of the intruder drone.

Federated Learning Incentive Mechanism under Buyers’ Auction Market

  • paper_url: http://arxiv.org/abs/2309.05063
  • repo_url: None
  • paper_authors: Jiaxi Yang, Zihao Guo, Sheng Cao, Cuifang Zhao, Li-Chuan Tsai
  • for: This paper examines how auction-based federated learning (AFL) enables open collaboration between data owners and data consumers in a buyers' market.
  • methods: The procurement auction framework is adapted to explain pricing behavior under a buyers' market, starting from a basic setting with complete information and extending to the case where sellers' information is not fully observable; a blockchain-based reputation mechanism is used to select clients with high reliability and data quality and to prevent external attacks.
  • results: Experimental results demonstrate the effectiveness of the approach.
    Abstract Auction-based Federated Learning (AFL) enables open collaboration among self-interested data consumers and data owners. Existing AFL approaches are commonly under the assumption of sellers' market in that the service clients as sellers are treated as scarce resources so that the aggregation servers as buyers need to compete the bids. Yet, as the technology progresses, an increasing number of qualified clients are now capable of performing federated learning tasks, leading to shift from sellers' market to a buyers' market. In this paper, we shift the angle by adapting the procurement auction framework, aiming to explain the pricing behavior under buyers' market. Our modeling starts with basic setting under complete information, then move further to the scenario where sellers' information are not fully observable. In order to select clients with high reliability and data quality, and to prevent from external attacks, we utilize a blockchain-based reputation mechanism. The experimental results validate the effectiveness of our approach.

Machine Learning for maximizing the memristivity of single and coupled quantum memristors

  • paper_url: http://arxiv.org/abs/2309.05062
  • repo_url: None
  • paper_authors: Carlos Hernani-Morales, Gabriel Alvarado, Francisco Albarrán-Arriagada, Yolanda Vives-Gilabert, Enrique Solano, José D. Martín-Guerrero
  • for: Characterize the memristive properties of single and coupled quantum memristors.
  • methods: Machine learning (ML) methods are applied to characterize the memristive behavior of single and coupled quantum memristors.
  • results: Maximizing the memristivity leads to large values in the degree of entanglement between two quantum memristors, revealing the close relationship between quantum correlations and memory and strengthening the case for quantum memristors as key components of neuromorphic quantum computing.
    Abstract We propose machine learning (ML) methods to characterize the memristive properties of single and coupled quantum memristors. We show that maximizing the memristivity leads to large values in the degree of entanglement of two quantum memristors, unveiling the close relationship between quantum correlations and memory. Our results strengthen the possibility of using quantum memristors as key components of neuromorphic quantum computing.

Decolonial AI Alignment: Viśesadharma, Argument, and Artistic Expression

  • paper_url: http://arxiv.org/abs/2309.05030
  • repo_url: None
  • paper_authors: Kush R. Varshney
  • for: This work seeks to decolonialize artificial intelligence (AI) alignment so that it accommodates a broader range of cultures and values.
  • methods: Three proposals for reducing coloniality in alignment: (a) changing the base moral philosophy from Western philosophy to dharma, (b) permitting traditions of argument and pluralism in alignment technologies, and (c) expanding the epistemology of values beyond instructions or commandments given in natural language.
  • results: These proposals offer a path toward decolonializing AI alignment, making it more responsive to diverse cultures and value systems.
    Abstract Prior work has explicated the coloniality of artificial intelligence (AI) development and deployment. One process that that work has not engaged with much is alignment: the tuning of large language model (LLM) behavior to be in line with desired values based on fine-grained human feedback. In addition to other practices, colonialism has a history of altering the beliefs and values of colonized peoples; this history is recapitulated in current LLM alignment practices. We suggest that AI alignment be decolonialized using three proposals: (a) changing the base moral philosophy from Western philosophy to dharma, (b) permitting traditions of argument and pluralism in alignment technologies, and (c) expanding the epistemology of values beyond instructions or commandments given in natural language.

VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching

  • paper_url: http://arxiv.org/abs/2309.05027
  • repo_url: None
  • paper_authors: Yiwei Guo, Chenpeng Du, Ziyang Ma, Xie Chen, Kai Yu
  • for: Improve the efficiency of text-to-speech synthesis as an alternative to conventional diffusion models.
  • methods: VoiceFlow, an acoustic model that uses a rectified flow matching algorithm to achieve high synthesis quality with a limited number of sampling steps.
  • results: Subjective and objective evaluations on single- and multi-speaker corpora show that VoiceFlow's synthesis quality clearly surpasses that of its diffusion counterpart.
    Abstract Although diffusion models in text-to-speech have become a popular choice due to their strong generative ability, the intrinsic complexity of sampling from diffusion models harms their efficiency. Alternatively, we propose VoiceFlow, an acoustic model that utilizes a rectified flow matching algorithm to achieve high synthesis quality with a limited number of sampling steps. VoiceFlow formulates the process of generating mel-spectrograms into an ordinary differential equation conditional on text inputs, whose vector field is then estimated. The rectified flow technique then effectively straightens its sampling trajectory for efficient synthesis. Subjective and objective evaluations on both single and multi-speaker corpora showed the superior synthesis quality of VoiceFlow compared to the diffusion counterpart. Ablation studies further verified the validity of the rectified flow technique in VoiceFlow.
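
The core training objective behind (rectified) flow matching is simple to state: regress the model's predicted velocity onto the straight-line velocity between a noise sample and a data sample. The sketch below shows that loss in isolation; VoiceFlow itself operates on mel-spectrograms conditioned on text and includes a flow rectification stage, and the `model(x_t, t, cond)` signature is an assumption:

```python
import torch

def flow_matching_loss(model, x1, cond):
    """Flow matching with straight (rectified-flow style) paths: sample noise x0 and a
    time t, move to the interpolated point x_t, and regress the predicted velocity
    field onto the constant target velocity x1 - x0."""
    x0 = torch.randn_like(x1)
    t = torch.rand(x1.size(0), *([1] * (x1.dim() - 1)), device=x1.device)
    x_t = (1.0 - t) * x0 + t * x1
    target_velocity = x1 - x0
    predicted_velocity = model(x_t, t, cond)
    return torch.mean((predicted_velocity - target_velocity) ** 2)
```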

FOLLOWUPQG: Towards Information-Seeking Follow-up Question Generation

  • paper_url: http://arxiv.org/abs/2309.05007
  • repo_url: None
  • paper_authors: Yan Meng, Liangming Pan, Yixin Cao, Min-Yen Kan
  • for: This study introduces the real-world information-seeking follow-up question generation (FQG) task, which aims to generate follow-up questions that seek a deeper understanding of an initial question and answer.
  • methods: The authors construct FOLLOWUPQG, a dataset of over 3K (initial question, answer, follow-up question) tuples collected from a Reddit forum that provides layman-friendly explanations for open-ended questions, and evaluate current question generation models on it.
  • results: Existing question generation models can produce adequate follow-up questions, but they fall well short of human-raised questions in informativeness and complexity, validating FOLLOWUPQG as a challenging benchmark.
    Abstract Humans ask follow-up questions driven by curiosity, which reflects a creative human cognitive process. We introduce the task of real-world information-seeking follow-up question generation (FQG), which aims to generate follow-up questions seeking a more in-depth understanding of an initial question and answer. We construct FOLLOWUPQG, a dataset of over 3K real-world (initial question, answer, follow-up question) tuples collected from a Reddit forum providing layman-friendly explanations for open-ended questions. In contrast to existing datasets, questions in FOLLOWUPQG use more diverse pragmatic strategies to seek information, and they also show higher-order cognitive skills (such as applying and relating). We evaluate current question generation models on their efficacy for generating follow-up questions, exploring how to generate specific types of follow-up questions based on step-by-step demonstrations. Our results validate FOLLOWUPQG as a challenging benchmark, as model-generated questions are adequate but far from human-raised questions in terms of informativeness and complexity.

RGAT: A Deeper Look into Syntactic Dependency Information for Coreference Resolution

  • paper_url: http://arxiv.org/abs/2309.04977
  • repo_url: https://github.com/qingtian5/rgat_with_bert
  • paper_authors: Yuan Meng, Xuhao Pan, Jun Chang, Yue Wang
  • for: This paper studies how syntactic dependency information can be combined with contextual information to solve coreference resolution.
  • methods: An end-to-end resolver combines pre-trained BERT with a Syntactic Relation Graph Attention Network (RGAT) to take a deeper look at the role of syntactic dependency information. The RGAT model is first proposed and used to understand the syntactic dependency graph and learn better task-specific syntactic embeddings; an integrated architecture then combines BERT embeddings and syntactic embeddings into blended representations for the downstream task.
  • results: On the public Gendered Ambiguous Pronouns (GAP) dataset, with supervised learning of the syntactic dependency graph and without fine-tuning the entire BERT, the F1 score of the previous best model (RGCN-with-BERT) improves from 80.3% to 82.5%, compared with 78.5% for single BERT embeddings; results on the public OntoNotes 5.0 dataset likewise improve when the syntactic dependency information learned by RGAT is incorporated.
    Abstract Although syntactic information is beneficial for many NLP tasks, combining it with contextual information between words to solve the coreference resolution problem needs to be further explored. In this paper, we propose an end-to-end parser that combines pre-trained BERT with a Syntactic Relation Graph Attention Network (RGAT) to take a deeper look into the role of syntactic dependency information for the coreference resolution task. In particular, the RGAT model is first proposed, then used to understand the syntactic dependency graph and learn better task-specific syntactic embeddings. An integrated architecture incorporating BERT embeddings and syntactic embeddings is constructed to generate blending representations for the downstream task. Our experiments on a public Gendered Ambiguous Pronouns (GAP) dataset show that with the supervision learning of the syntactic dependency graph and without fine-tuning the entire BERT, we increased the F1-score of the previous best model (RGCN-with-BERT) from 80.3% to 82.5%, compared to the F1-score by single BERT embeddings from 78.5% to 82.5%. Experimental results on another public dataset - OntoNotes 5.0 demonstrate that the performance of the model is also improved by incorporating syntactic dependency information learned from RGAT.
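
To give a feel for attention over a labeled dependency graph, here is a deliberately simple single-head layer in which each edge's score also depends on an embedding of its dependency relation. The class name, shapes, and scoring function are illustrative assumptions and not the paper's RGAT architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleRelationalGraphAttention(nn.Module):
    """One attention head over a dependency graph where edge scores also use an
    embedding of the dependency relation label (toy sketch, loop-based for clarity)."""
    def __init__(self, in_dim, out_dim, num_relations):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        self.rel_emb = nn.Embedding(num_relations, out_dim)
        self.attn = nn.Linear(3 * out_dim, 1)

    def forward(self, x, edges, rel_ids):
        # x: (num_nodes, in_dim); edges: list of (src, dst); rel_ids: relation id per edge
        h = self.proj(x)
        out = torch.zeros_like(h)
        for dst in range(h.size(0)):
            incoming = [k for k, (_, d) in enumerate(edges) if d == dst]
            if not incoming:
                out[dst] = h[dst]
                continue
            srcs = torch.tensor([edges[k][0] for k in incoming])
            rels = self.rel_emb(torch.tensor([rel_ids[k] for k in incoming]))
            feats = torch.cat([h[dst].expand(len(incoming), -1), h[srcs], rels], dim=-1)
            alpha = F.softmax(self.attn(feats).squeeze(-1), dim=0)
            out[dst] = (alpha.unsqueeze(-1) * (h[srcs] + rels)).sum(dim=0)
        return out
```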

AVARS – Alleviating Unexpected Urban Road Traffic Congestion using UAVs

  • paper_url: http://arxiv.org/abs/2309.04976
  • repo_url: https://github.com/guojyjy/avars
  • paper_authors: Jiaying Guo, Michael R. Jones, Soufiene Djahel, Shen Wang
  • for: Monitor traffic in real time and quickly apply suitable traffic signal control to alleviate unexpected urban road traffic congestion.
  • methods: A deep reinforcement learning (DRL) algorithm controls the traffic lights, while unmanned aerial vehicles (UAVs) with onboard cameras provide high-frequency, high-resolution traffic data for real-time monitoring.
  • results: Simulations show that AVARS can respond quickly to unexpected congestion and recover traffic back to its original un-congested level within the typical battery life of a UAV.
    Abstract Reducing unexpected urban traffic congestion caused by en-route events (e.g., road closures, car crashes, etc.) often requires fast and accurate reactions to choose the best-fit traffic signals. Traditional traffic light control systems, such as SCATS and SCOOT, are not efficient as their traffic data provided by induction loops has a low update frequency (i.e., longer than 1 minute). Moreover, the traffic light signal plans used by these systems are selected from a limited set of candidate plans pre-programmed prior to unexpected events' occurrence. Recent research demonstrates that camera-based traffic light systems controlled by deep reinforcement learning (DRL) algorithms are more effective in reducing traffic congestion, in which the cameras can provide high-frequency high-resolution traffic data. However, these systems are costly to deploy in big cities due to the excessive potential upgrades required to road infrastructure. In this paper, we argue that Unmanned Aerial Vehicles (UAVs) can play a crucial role in dealing with unexpected traffic congestion because UAVs with onboard cameras can be economically deployed when and where unexpected congestion occurs. Then, we propose a system called "AVARS" that explores the potential of using UAVs to reduce unexpected urban traffic congestion using DRL-based traffic light signal control. This approach is validated on a widely used open-source traffic simulator with practical UAV settings, including its traffic monitoring ranges and battery lifetime. Our simulation results show that AVARS can effectively recover the unexpected traffic congestion in Dublin, Ireland, back to its original un-congested level within the typical battery life duration of a UAV.

Continual Robot Learning using Self-Supervised Task Inference

  • paper_url: http://arxiv.org/abs/2309.04974
  • repo_url: None
  • paper_authors: Muhammad Burhan Hafez, Stefan Wermter
  • for: This work aims to endow robots with the human ability to continually learn a growing set of skills over a lifetime rather than mastering single tasks.
  • methods: A self-supervised task inference approach learns action and intention embeddings from self-organization of the observed movement and effect parts of unlabeled demonstrations, and a higher-level behavior embedding from self-organization of the joint action-intention embeddings; a Task Inference Network (TINet) maps an unlabeled demonstration to its nearest behavior embedding, which serves as the task representation for a multi-task policy trained with reinforcement learning.
  • results: On a humanoid robot, the approach outperforms multi-task learning baselines in both fixed-set and continual multi-task learning settings, with the gap more pronounced in the challenging continual setting; it can infer tasks from incomplete demonstrations and generalizes to unseen tasks from a single demonstration.
    Abstract Endowing robots with the human ability to learn a growing set of skills over the course of a lifetime as opposed to mastering single tasks is an open problem in robot learning. While multi-task learning approaches have been proposed to address this problem, they pay little attention to task inference. In order to continually learn new tasks, the robot first needs to infer the task at hand without requiring predefined task representations. In this paper, we propose a self-supervised task inference approach. Our approach learns action and intention embeddings from self-organization of the observed movement and effect parts of unlabeled demonstrations and a higher-level behavior embedding from self-organization of the joint action-intention embeddings. We construct a behavior-matching self-supervised learning objective to train a novel Task Inference Network (TINet) to map an unlabeled demonstration to its nearest behavior embedding, which we use as the task representation. A multi-task policy is built on top of the TINet and trained with reinforcement learning to optimize performance over tasks. We evaluate our approach in the fixed-set and continual multi-task learning settings with a humanoid robot and compare it to different multi-task learning baselines. The results show that our approach outperforms the other baselines, with the difference being more pronounced in the challenging continual learning setting, and can infer tasks from incomplete demonstrations. Our approach is also shown to generalize to unseen tasks based on a single demonstration in one-shot task generalization experiments.
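
At inference time, the task-inference step reduces to a nearest-neighbor lookup in behavior-embedding space. The sketch below shows that lookup only; how TINet and the behavior embeddings are trained with the behavior-matching self-supervised objective is not reproduced here:

```python
import numpy as np

def infer_task(demo_embedding, behavior_bank):
    """Return the index of the behavior embedding closest (by cosine similarity) to the
    demonstration embedding; that index serves as the inferred task representation."""
    demo = demo_embedding / np.linalg.norm(demo_embedding)
    bank = behavior_bank / np.linalg.norm(behavior_bank, axis=1, keepdims=True)
    return int(np.argmax(bank @ demo))
```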

Prefix-diffusion: A Lightweight Diffusion Model for Diverse Image Captioning

  • paper_url: http://arxiv.org/abs/2309.04965
  • repo_url: None
  • paper_authors: Guisheng Liu, Yi Li, Zhengcong Fei, Haiyan Fu, Xiangyang Luo, Yanqing Guo
  • for: Improve the diversity of generated image captions while keeping the model lightweight.
  • methods: Prefix-diffusion, a lightweight image captioning network combined with continuous diffusion, injects prefix image embeddings into the denoising process; a pre-trained model extracts image features and an extra mapping network keeps the number of trainable parameters small.
  • results: The model generates diverse captions with relatively few trainable parameters while maintaining the fluency and relevance of the captions.
    Abstract While impressive performance has been achieved in image captioning, the limited diversity of the generated captions and the large parameter scale remain major barriers to the real-word application of these systems. In this work, we propose a lightweight image captioning network in combination with continuous diffusion, called Prefix-diffusion. To achieve diversity, we design an efficient method that injects prefix image embeddings into the denoising process of the diffusion model. In order to reduce trainable parameters, we employ a pre-trained model to extract image features and further design an extra mapping network. Prefix-diffusion is able to generate diverse captions with relatively less parameters, while maintaining the fluency and relevance of the captions benefiting from the generative capabilities of the diffusion model. Our work paves the way for scaling up diffusion models for image captioning, and achieves promising performance compared with recent approaches.
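
One way to picture "injecting prefix image embeddings into the denoising process" is to prepend a short learned prefix derived from the image feature to the noisy caption embeddings, so every denoising step can attend to the image. The toy module below illustrates that wiring; the dimensions, prefix length, and Transformer backbone are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class PrefixConditionedDenoiser(nn.Module):
    """Toy denoiser: a learned prefix computed from the image feature is prepended to
    the noisy caption embeddings before a Transformer encoder processes the sequence."""
    def __init__(self, dim=512, prefix_len=4, nhead=8, num_layers=4):
        super().__init__()
        self.prefix_proj = nn.Linear(dim, prefix_len * dim)
        self.prefix_len, self.dim = prefix_len, dim
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=nhead, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(dim, dim)

    def forward(self, noisy_caption_emb, image_feature):
        # noisy_caption_emb: (B, L, dim); image_feature: (B, dim) from a frozen encoder
        prefix = self.prefix_proj(image_feature).view(-1, self.prefix_len, self.dim)
        hidden = self.backbone(torch.cat([prefix, noisy_caption_emb], dim=1))
        return self.head(hidden[:, self.prefix_len:])   # predictions for caption positions
```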

Multi-document Summarization: A Comparative Evaluation

  • paper_url: http://arxiv.org/abs/2309.04951
  • repo_url: None
  • paper_authors: Kushan Hewapathirana, Nisansa de Silva, C. D. Athuraliya
  • for: Evaluating state-of-the-art multi-document summarization (MDS) models on datasets from different domains and examining the limitations of existing models to determine future research directions.
  • methods: An extensive literature review, followed by an analysis of the PRIMERA and PEGASUS models on the BigSurvey-MDS and MS$^2$ datasets.
  • results: The general-purpose pre-trained LED model outperforms PRIMERA and PEGASUS on the MS$^2$ dataset, with ROUGE scores used to evaluate the models across datasets; these findings inform future MDS research and the development of accurate, robust models for domains with complex data.
    Abstract This paper is aimed at evaluating state-of-the-art models for Multi-document Summarization (MDS) on different types of datasets in various domains and investigating the limitations of existing models to determine future research directions. To address this gap, we conducted an extensive literature review to identify state-of-the-art models and datasets. We analyzed the performance of PRIMERA and PEGASUS models on BigSurvey-MDS and MS$^2$ datasets, which posed unique challenges due to their varied domains. Our findings show that the General-Purpose Pre-trained Model LED outperforms PRIMERA and PEGASUS on the MS$^2$ dataset. We used the ROUGE score as a performance metric to evaluate the identified models on different datasets. Our study provides valuable insights into the models' strengths and weaknesses, as well as their applicability in different domains. This work serves as a reference for future MDS research and contributes to the development of accurate and robust models which can be utilized on demanding datasets with academically and/or scientifically complex data as well as generalized, relatively simple datasets.
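For readers unfamiliar with the metric, the snippet below shows how ROUGE scores of the kind reported here can be computed with the open-source `rouge-score` package; it is a generic example with made-up sentences, not the authors' evaluation pipeline.

```python
# pip install rouge-score
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "The study compares multi-document summarization models across domains."
candidate = "The paper compares summarization models for multiple documents across domains."

scores = scorer.score(reference, candidate)
for name, score in scores.items():
    print(f"{name}: precision={score.precision:.3f} "
          f"recall={score.recall:.3f} f1={score.fmeasure:.3f}")
```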

Knowledge-based Refinement of Scientific Publication Knowledge Graphs

  • paper_url: http://arxiv.org/abs/2309.05681
  • repo_url: None
  • paper_authors: Siwen Yan, Phillip Odom, Sriraam Natarajan
  • for: Addressing the authorship attribution problem by framing it as knowledge graph construction and refinement.
  • methods: Learning probabilistic logic models as relational regression trees with functional gradient boosting, refined under human guidance through advice given as first-order clauses.
  • results: Human knowledge is shown to improve both the accuracy and the interpretability of authorship predictions across seven authorship domains.
    Abstract We consider the problem of identifying authorship by posing it as knowledge graph construction and refinement. To this end, we model the problem as learning a probabilistic logic model in the presence of human guidance (knowledge-based learning). Specifically, we learn relational regression trees using functional gradient boosting that outputs explainable rules. To incorporate human knowledge, advice in the form of first-order clauses is injected to refine the trees. We demonstrate the usefulness of human knowledge both quantitatively and qualitatively in seven authorship domains.
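The following toy sketch conveys the flavor of knowledge-based functional gradient boosting, using propositional scikit-learn regression trees as stand-ins for relational regression trees; the advice mechanism shown (an extra pseudo-gradient on examples covered by an expert clause) is a simplification assumed for illustration, not the paper's exact algorithm.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def boost_with_advice(X, y, advice_mask, n_trees=20, advice_weight=0.5):
    """y in {0,1}; advice_mask marks examples an expert clause labels positive."""
    trees, psi = [], np.zeros(len(y))
    for _ in range(n_trees):
        gradient = y - sigmoid(psi)                  # functional gradient of the log-likelihood
        gradient = gradient + advice_weight * advice_mask  # nudge predictions toward the advice
        tree = DecisionTreeRegressor(max_depth=3).fit(X, gradient)
        trees.append(tree)
        psi += tree.predict(X)                       # add the new tree to the ensemble
    return trees

# Toy usage: 100 examples, 5 features, advice covering the first 10 examples.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] > 0).astype(float)
advice = np.zeros(100)
advice[:10] = 1.0
model = boost_with_advice(X, y, advice)
```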

MFPNet: Multi-scale Feature Propagation Network For Lightweight Semantic Segmentation

  • paper_url: http://arxiv.org/abs/2309.04914
  • repo_url: None
  • paper_authors: Guoan Xu, Wenjing Jia, Tao Wu, Ligeng Chen
  • for: Advancing lightweight semantic segmentation (partitioning an image into semantic classes), an area where progress has been comparatively slow relative to research on large-scale models.
  • methods: A novel lightweight segmentation architecture, the Multi-scale Feature Propagation Network (MFPNet), built from symmetrical residual blocks with flexible bottleneck residual modules (BRMs) to capture deep and rich multi-scale semantic context; Graph Convolutional Networks (GCNs) propagate multi-scale features between the BRM blocks.
  • results: Superior segmentation results on benchmark datasets.
    Abstract In contrast to the abundant research focusing on large-scale models, the progress in lightweight semantic segmentation appears to be advancing at a comparatively slower pace. However, existing compact methods often suffer from limited feature representation capability due to the shallowness of their networks. In this paper, we propose a novel lightweight segmentation architecture, called Multi-scale Feature Propagation Network (MFPNet), to address the dilemma. Specifically, we design a robust Encoder-Decoder structure featuring symmetrical residual blocks that consist of flexible bottleneck residual modules (BRMs) to explore deep and rich multi-scale semantic context. Furthermore, taking benefit from their capacity to model latent long-range contextual relationships, we leverage Graph Convolutional Networks (GCNs) to facilitate multi-scale feature propagation between the BRM blocks. When evaluated on benchmark datasets, our proposed approach shows superior segmentation results.
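A bottleneck residual module of the kind described can be sketched in PyTorch as below; the channel sizes, dilation rate, and module name are assumptions rather than the authors' exact MFPNet definition.

```python
import torch
import torch.nn as nn

class BottleneckResidualModule(nn.Module):
    """1x1 reduce -> 3x3 (optionally dilated) conv -> 1x1 expand, with a skip connection."""
    def __init__(self, channels: int, reduction: int = 4, dilation: int = 1):
        super().__init__()
        mid = channels // reduction
        self.block = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, kernel_size=3, padding=dilation,
                      dilation=dilation, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(x + self.block(x))   # residual connection keeps the input pathway

# Example: a 64-channel feature map; a larger dilation enlarges the receptive field.
feat = torch.randn(1, 64, 64, 128)
out = BottleneckResidualModule(64, dilation=2)(feat)
```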

A Review of Machine Learning-based Security in Cloud Computing

  • paper_url: http://arxiv.org/abs/2309.04911
  • repo_url: https://github.com/jettbrains/-L-
  • paper_authors: Aptin Babaei, Parham M. Kebria, Mohsen Moradi Dalvand, Saeid Nahavandi
  • for: Providing a comprehensive overview of the current state of machine learning (ML) for cloud computing security, examining the characteristics and effectiveness of different ML algorithms and their potential limitations.
  • methods: A review of recent ML algorithms, including classification, regression, and clustering, together with their characteristics and application scenarios.
  • results: Identifies applications of ML in cloud security, such as attack detection, data analysis, and threat awareness, and discusses the effectiveness and limitations of these algorithms.
    Abstract Cloud Computing (CC) is revolutionizing the way IT resources are delivered to users, allowing them to access and manage their systems with increased cost-effectiveness and simplified infrastructure. However, with the growth of CC comes a host of security risks, including threats to availability, integrity, and confidentiality. To address these challenges, Machine Learning (ML) is increasingly being used by Cloud Service Providers (CSPs) to reduce the need for human intervention in identifying and resolving security issues. With the ability to analyze vast amounts of data, and make high-accuracy predictions, ML can transform the way CSPs approach security. In this paper, we will explore some of the most recent research in the field of ML-based security in Cloud Computing. We will examine the features and effectiveness of a range of ML algorithms, highlighting their unique strengths and potential limitations. Our goal is to provide a comprehensive overview of the current state of ML in cloud security and to shed light on the exciting possibilities that this emerging field has to offer.
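As a generic example of the kind of ML-based attack detection the review surveys (not drawn from the paper itself), the snippet below uses an unsupervised Isolation Forest to flag anomalous request patterns; the traffic features are hypothetical.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Columns: requests/min, failed logins/min, bytes transferred (normalized).
normal_traffic = rng.normal(loc=[100, 1, 0.5], scale=[10, 0.5, 0.1], size=(1000, 3))
suspicious = np.array([[400, 30, 3.0],   # brute-force-like burst
                       [50, 25, 0.1]])   # repeated failed logins

detector = IsolationForest(contamination=0.01, random_state=0).fit(normal_traffic)
print(detector.predict(suspicious))      # -1 marks samples the model considers anomalous
```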

Effective Real Image Editing with Accelerated Iterative Diffusion Inversion

  • paper_url: http://arxiv.org/abs/2309.04907
  • repo_url: None
  • paper_authors: Zhihong Pan, Riccardo Gherardi, Xiufeng Xie, Stephen Huang
  • for: Proposing an efficient method for editing real images with modern generative models.
  • methods: An Accelerated Iterative Diffusion Inversion (AIDI) procedure combined with a novel blended guidance technique to improve reconstruction accuracy for editing.
  • results: Compared with other diffusion-inversion approaches, the method is more robust and efficient in the 10- and 20-step regimes and does not require large classifier-free guidance during inversion.
    Abstract Despite all recent progress, it is still challenging to edit and manipulate natural images with modern generative models. When using a Generative Adversarial Network (GAN), one major hurdle is the inversion process that maps a real image to its corresponding noise vector in the latent space, since reconstructing an image is necessary in order to edit its contents. Likewise for Denoising Diffusion Implicit Models (DDIM), the linearization assumption in each inversion step makes the whole deterministic inversion process unreliable. Existing approaches that have tackled the problem of inversion stability often incur significant trade-offs in computational efficiency. In this work we propose an Accelerated Iterative Diffusion Inversion method, dubbed AIDI, that significantly improves reconstruction accuracy with minimal additional overhead in space and time complexity. By using a novel blended guidance technique, we show that effective results can be obtained on a large range of image editing tasks without large classifier-free guidance in inversion. Furthermore, when compared with other diffusion-inversion-based works, our proposed process is shown to be more robust for fast image editing in the 10- and 20-diffusion-step regimes.
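To illustrate why an iterative inversion step can improve on the usual linearization, here is a schematic fixed-point iteration for a single DDIM inversion step; `eps_model`, the alpha-bar schedule values, and the number of iterations are simplified stand-ins, not the exact AIDI algorithm.

```python
import torch

@torch.no_grad()
def invert_step(x_t, t, t_next, eps_model, alpha, alpha_next, n_iters=3):
    """Map x_t to the noisier x_{t_next} by iterating the implicit DDIM relation.

    alpha and alpha_next are the cumulative alpha-bar values at t and t_next (tensors).
    """
    x_next = x_t                                      # initial guess = the usual linearization
    for _ in range(n_iters):
        eps = eps_model(x_next, t_next)               # re-estimate the noise at the current guess
        x0 = (x_t - (1 - alpha).sqrt() * eps) / alpha.sqrt()
        x_next = alpha_next.sqrt() * x0 + (1 - alpha_next).sqrt() * eps
    return x_next

# Toy usage with a dummy noise predictor standing in for a trained diffusion UNet.
eps_model = lambda x, t: torch.zeros_like(x)
x = torch.randn(1, 4, 64, 64)
x_noisier = invert_step(x, t=10, t_next=20, eps_model=eps_model,
                        alpha=torch.tensor(0.9), alpha_next=torch.tensor(0.8))
```

Each iteration re-evaluates the noise prediction at the current estimate of the noisier latent, rather than assuming the prediction at the cleaner latent still holds, which is the source of the linearization error in plain DDIM inversion.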