cs.AI - 2023-09-16

Interactively Teaching an Inverse Reinforcement Learner with Limited Feedback

  • paper_url: http://arxiv.org/abs/2309.09095
  • repo_url: https://github.com/rzayanov/irl-teaching-limited-feedback
  • paper_authors: Rustam Zayanov, Francisco S. Melo, Manuel Lopes
  • for: This work studies teaching via demonstrations in sequential decision-making tasks, particularly when the teacher has no access to the learner's model and policy.
  • methods: The teacher uses inverse reinforcement learning and active learning, selecting the starting states and inferring the learner's policy to solve the teaching problem.
  • results: The proposed algorithm is tested in a synthetic car-driving environment and shown to be an effective solution when the learner's feedback is limited.
    Abstract We study the problem of teaching via demonstrations in sequential decision-making tasks. In particular, we focus on the situation when the teacher has no access to the learner's model and policy, and the feedback from the learner is limited to trajectories that start from states selected by the teacher. The necessity to select the starting states and infer the learner's policy creates an opportunity for using the methods of inverse reinforcement learning and active learning by the teacher. In this work, we formalize the teaching process with limited feedback and propose an algorithm that solves this teaching problem. The algorithm uses a modified version of the active value-at-risk method to select the starting states, a modified maximum causal entropy algorithm to infer the policy, and the difficulty score ratio method to choose the teaching demonstrations. We test the algorithm in a synthetic car driving environment and conclude that the proposed algorithm is an effective solution when the learner's feedback is limited.
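
To make the policy-inference step concrete, the sketch below implements the soft value iteration at the core of maximum causal entropy IRL, which the teacher can use to turn a reward estimate into a stochastic learner policy. The random 5-state MDP, discount factor, and iteration count are illustrative assumptions, not the paper's car-driving environment or its exact modified algorithm.

```python
import numpy as np

def max_causal_ent_policy(P, R, gamma=0.95, iters=200):
    """Soft value iteration: P is (S, A, S) transitions, R is (S, A) rewards."""
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        Q = R + gamma * (P @ V)                 # soft state-action values
        V = np.log(np.exp(Q).sum(axis=1))       # log-sum-exp "soft max" over actions
    return np.exp(Q - V[:, None])               # stochastic policy pi(a|s)

rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(5), size=(5, 3))      # random 5-state, 3-action MDP
R = rng.normal(size=(5, 3))                     # stand-in for the inferred reward
pi = max_causal_ent_policy(P, R)
print(pi.round(3))                              # each row sums to 1
```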

RMDM: A Multilabel Fakenews Dataset for Vietnamese Evidence Verification

  • paper_url: http://arxiv.org/abs/2309.09071
  • repo_url: None
  • paper_authors: Hai-Long Nguyen, Thi-Kieu-Trang Pham, Thai-Son Le, Tan-Minh Nguyen, Thi-Hai-Yen Vuong, Ha-Thanh Nguyen
  • for: This work assesses the performance of large language models (LLMs) in verifying electronic information in legal contexts, focusing on fake news as potential input for electronic evidence.
  • methods: The study introduces a new and challenging multilabel Vietnamese dataset (RMDM) with four labels: real, mis, dis, and mal, representing real information, misinformation, disinformation, and mal-information.
  • results: Preliminary tests with GPT-based and BERT-based models show varying performance across labels, indicating that the dataset effectively challenges language models' ability to verify such information; verifying electronic information related to legal contexts remains a difficult problem that warrants further research toward more reliable AI models.
    Abstract In this study, we present a novel and challenging multilabel Vietnamese dataset (RMDM) designed to assess the performance of large language models (LLMs), in verifying electronic information related to legal contexts, focusing on fake news as potential input for electronic evidence. The RMDM dataset comprises four labels: real, mis, dis, and mal, representing real information, misinformation, disinformation, and mal-information, respectively. By including these diverse labels, RMDM captures the complexities of differing fake news categories and offers insights into the abilities of different language models to handle various types of information that could be part of electronic evidence. The dataset consists of a total of 1,556 samples, with 389 samples for each label. Preliminary tests on the dataset using GPT-based and BERT-based models reveal variations in the models' performance across different labels, indicating that the dataset effectively challenges the ability of various language models to verify the authenticity of such information. Our findings suggest that verifying electronic information related to legal contexts, including fake news, remains a difficult problem for language models, warranting further attention from the research community to advance toward more reliable AI models for potential legal applications.
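
Since the headline finding is per-label performance variation, a minimal evaluation sketch follows: per-class precision and recall over RMDM's four labels. The predictions are random placeholders standing in for GPT- or BERT-based model outputs.

```python
import numpy as np
from sklearn.metrics import classification_report

LABELS = ["real", "mis", "dis", "mal"]           # RMDM's four classes, 389 samples each
rng = np.random.default_rng(0)
y_true = np.repeat(np.arange(4), 389)            # 4 x 389 = 1,556 samples
y_pred = rng.integers(0, 4, size=y_true.size)    # placeholder model predictions

print(classification_report(y_true, y_pred, target_names=LABELS, digits=3))
```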

  • paper_url: http://arxiv.org/abs/2309.09070
  • repo_url: None
  • paper_authors: Tan-Minh Nguyen, Xuan-Hoa Nguyen, Ngoc-Duy Mai, Minh-Quan Hoang, Van-Huan Nguyen, Hoang-Viet Nguyen, Ha-Thanh Nguyen, Thi-Hai-Yen Vuong
  • for: This work aims to enhance legal task performance by integrating classical statistical models and pre-trained language models (PLMs).
  • methods: For document retrieval, a pre-processing step overcomes input limitations and learning-to-rank methods consolidate features from various models; the question-answering task is split into sentence classification and answer extraction, with distinct state-of-the-art systems developed for each sub-task using both classical statistical models and pre-trained language models.
  • results: Experimental results demonstrate the promising potential of the proposed methodology in the competition.
    Abstract This paper describes the NOWJ1 Team's approach for the Automated Legal Question Answering Competition (ALQAC) 2023, which focuses on enhancing legal task performance by integrating classical statistical models and Pre-trained Language Models (PLMs). For the document retrieval task, we implement a pre-processing step to overcome input limitations and apply learning-to-rank methods to consolidate features from various models. The question-answering task is split into two sub-tasks: sentence classification and answer extraction. We incorporate state-of-the-art models to develop distinct systems for each sub-task, utilizing both classic statistical models and pre-trained Language Models. Experimental results demonstrate the promising potential of our proposed methodology in the competition.

GenDOM: Generalizable One-shot Deformable Object Manipulation with Parameter-Aware Policy

  • paper_url: http://arxiv.org/abs/2309.09051
  • repo_url: None
  • paper_authors: So Kuroki, Jiaxian Guo, Tatsuya Matsushima, Takuya Okubo, Masato Kobayashi, Yuya Ikeda, Ryosuke Takanami, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa
  • for: A framework that can achieve one-shot deformable object manipulation.
  • methods: The policy is conditioned on deformable object parameters and trained with a diverse range of simulated deformable objects, so that it can adapt to different objects.
  • results: Empirical validations show that the method can manipulate different objects with a single demonstration and significantly outperforms the baseline in both simulation and real-world environments.
    Abstract Due to the inherent uncertainty in their deformability during motion, previous methods in deformable object manipulation, such as rope and cloth, often required hundreds of real-world demonstrations to train a manipulation policy for each object, which hinders their applications in our ever-changing world. To address this issue, we introduce GenDOM, a framework that allows the manipulation policy to handle different deformable objects with only a single real-world demonstration. To achieve this, we augment the policy by conditioning it on deformable object parameters and training it with a diverse range of simulated deformable objects so that the policy can adjust actions based on different object parameters. At the time of inference, given a new object, GenDOM can estimate the deformable object parameters with only a single real-world demonstration by minimizing the disparity between the grid density of point clouds of real-world demonstrations and simulations in a differentiable physics simulator. Empirical validations on both simulated and real-world object manipulation setups clearly show that our method can manipulate different objects with a single demonstration and significantly outperforms the baseline in both environments (a 62% improvement for in-domain ropes and a 15% improvement for out-of-distribution ropes in simulation, as well as a 26% improvement for ropes and a 50% improvement for cloths in the real world), demonstrating the effectiveness of our approach in one-shot deformable object manipulation.
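
A rough sketch of the parameter-estimation idea follows: gradient descent on object parameters so that the grid density of a simulated point cloud matches that of a single real demonstration. The toy "simulator", grid resolution, and soft-binning bandwidth below are illustrative assumptions, not GenDOM's differentiable physics simulator.

```python
import torch

def grid_density(points, n=8, sigma=0.1):
    """Differentiable soft occupancy of a 2-D point cloud on an n x n grid over [0, 1]^2."""
    xs = torch.linspace(0.0, 1.0, n)
    centers = torch.cartesian_prod(xs, xs)                      # (n*n, 2) cell centers
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return torch.exp(-d2 / (2 * sigma ** 2)).sum(0)             # soft point count per cell

def toy_sim(theta, template):
    """Stand-in 'simulator': object parameters theta stretch a template point cloud."""
    return template * theta

template = torch.rand(200, 2) * 0.5
real_cloud = toy_sim(torch.tensor([1.4, 0.8]), template)        # single "real" demonstration
real_density = grid_density(real_cloud).detach()

theta = torch.ones(2, requires_grad=True)                       # unknown object parameters
opt = torch.optim.Adam([theta], lr=0.05)
for _ in range(300):
    loss = ((grid_density(toy_sim(theta, template)) - real_density) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
print("estimated parameters:", theta.detach().numpy())          # should move toward [1.4, 0.8]
```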

Generative AI-Driven Storytelling: A New Era for Marketing

  • paper_url: http://arxiv.org/abs/2309.09048
  • repo_url: None
  • paper_authors: Marko Vidrih, Shiva Mayahi
  • for: This paper explores the transformative power of Generative AI-driven storytelling in marketing.
  • methods: The paper uses real-world examples from industry leaders like Google, Netflix, and Stitch Fix to illustrate how this technology personalizes consumer experiences and navigates challenges.
  • results: The paper describes future directions and recommendations for generative AI-driven storytelling, including prospective applications such as real-time personalized storytelling, immersive storytelling experiences, and social media storytelling.
    Abstract This paper delves into the transformative power of Generative AI-driven storytelling in the realm of marketing. Generative AI, distinct from traditional machine learning, offers the capability to craft narratives that resonate with consumers on a deeply personal level. Through real-world examples from industry leaders like Google, Netflix and Stitch Fix, we elucidate how this technology shapes marketing strategies, personalizes consumer experiences, and navigates the challenges it presents. The paper also explores future directions and recommendations for generative AI-driven storytelling, including prospective applications such as real-time personalized storytelling, immersive storytelling experiences, and social media storytelling. By shedding light on the potential and impact of generative AI-driven storytelling in marketing, this paper contributes to the understanding of this cutting-edge approach and its transformative power in the field of marketing.

A store-and-forward cloud-based telemonitoring system for automatic assessing dysarthria evolution in neurological diseases from video-recording analysis

  • paper_url: http://arxiv.org/abs/2309.09038
  • repo_url: None
  • paper_authors: Lucia Migliorelli, Daniele Berardini, Kevin Cela, Michela Coccia, Laura Villani, Emanuele Frontoni, Sara Moccia
  • for: This study aims to provide a remote telemonitoring system to support clinicians in monitoring the evolution of dysarthria in patients with neurological diseases.
  • methods: The system uses a convolutional neural network (CNN) to analyze video recordings of individuals with dysarthria, locating facial landmarks as a prior for assessing orofacial functions related to speech.
  • results: The proposed CNN achieved a normalized mean error of 1.79 on localizing facial landmarks when tested on a public dataset, and showed promising outcomes in a real-life scenario with 11 bulbar-onset ALS subjects.
    Abstract Background and objectives: Patients suffering from neurological diseases may develop dysarthria, a motor speech disorder affecting the execution of speech. Close and quantitative monitoring of dysarthria evolution is crucial for enabling clinicians to promptly implement patient management strategies and maximizing effectiveness and efficiency of communication functions in terms of restoring, compensating or adjusting. In the clinical assessment of orofacial structures and functions, at rest or during speech and non-speech movements, a qualitative evaluation is usually performed, through visual observation. Methods: To overcome limitations posed by qualitative assessments, this work presents a store-and-forward self-service telemonitoring system that integrates, within its cloud architecture, a convolutional neural network (CNN) for analyzing video recordings acquired by individuals with dysarthria. This architecture, called facial landmark Mask RCNN, aims at locating facial landmarks as a prior for assessing the orofacial functions related to speech and examining dysarthria evolution in neurological diseases. Results: When tested on the Toronto NeuroFace dataset, a publicly available annotated dataset of video recordings from patients with amyotrophic lateral sclerosis (ALS) and stroke, the proposed CNN achieved a normalized mean error equal to 1.79 on localizing the facial landmarks. We also tested our system in a real-life scenario on 11 bulbar-onset ALS subjects, obtaining promising outcomes in terms of facial landmark position estimation. Discussion and conclusions: This preliminary study represents a relevant step towards the use of remote tools to support clinicians in monitoring the evolution of dysarthria.
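
For reference, the sketch below shows how the normalized mean error (NME) quoted in the Results is typically computed for landmark localization; the normalizing distance used by the Toronto NeuroFace benchmark is not specified here, so inter-ocular distance and the dummy landmarks are illustrative assumptions.

```python
import numpy as np

def nme(pred, gt, norm_dist):
    """Mean Euclidean landmark error divided by a normalizing distance."""
    return np.linalg.norm(pred - gt, axis=1).mean() / norm_dist

rng = np.random.default_rng(0)
gt = rng.uniform(0, 256, size=(68, 2))             # dummy ground-truth landmarks (pixels)
pred = gt + rng.normal(scale=2.0, size=gt.shape)   # dummy predicted landmarks
interocular = np.linalg.norm(gt[36] - gt[45])      # assumed normalizer (outer eye corners)
print(f"NME = {nme(pred, gt, interocular):.3f}")
```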

Improve Deep Forest with Learnable Layerwise Augmentation Policy Schedule

  • paper_url: http://arxiv.org/abs/2309.09030
  • repo_url: https://github.com/dbsxfz/augdf
  • paper_authors: Hongyu Zhu, Sichu Liang, Wentao Hu, Fang-Qi Li, Yali yuan, Shi-Lin Wang, Guang Cheng
  • for: Improve the performance and generalizability of Deep Forest models for better processing and classification of tabular data.
  • methods: A learnable, layerwise data augmentation policy schedule, including the Cut Mix for Tabular data (CMT) technique, with a population-based search algorithm to tune the augmentation intensity of each layer.
  • results: Sets new state-of-the-art results on multiple tabular classification tasks, outperforming tree ensembles, deep forests, deep neural networks, and AutoML competitors.
    Abstract As a modern ensemble technique, Deep Forest (DF) employs a cascading structure to construct deep models, providing stronger representational power compared to traditional decision forests. However, its greedy multi-layer learning procedure is prone to overfitting, limiting model effectiveness and generalizability. This paper presents an optimized Deep Forest, featuring learnable, layerwise data augmentation policy schedules. Specifically, We introduce the Cut Mix for Tabular data (CMT) augmentation technique to mitigate overfitting and develop a population-based search algorithm to tailor augmentation intensity for each layer. Additionally, we propose to incorporate outputs from intermediate layers into a checkpoint ensemble for more stable performance. Experimental results show that our method sets new state-of-the-art (SOTA) benchmarks in various tabular classification tasks, outperforming shallow tree ensembles, deep forests, deep neural network, and AutoML competitors. The learned policies also transfer effectively to Deep Forest variants, underscoring its potential for enhancing non-differentiable deep learning modules in tabular signal processing.
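
The sketch below illustrates a CutMix-style augmentation for tabular rows, i.e., the basic operation behind the CMT technique named in the abstract; the paper's actual CMT and its learned per-layer intensity schedule are more involved, so the mixing rule here is an assumption.

```python
import numpy as np

def cutmix_tabular(X, y, mix_ratio=0.3, seed=0):
    """Replace a random subset of columns with a partner row's values; mix labels accordingly."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    partner = rng.permutation(n)                       # partner row for each row
    n_swap = max(1, int(round(mix_ratio * d)))
    cols = rng.choice(d, size=n_swap, replace=False)
    X_new = X.copy()
    X_new[:, cols] = X[partner][:, cols]               # paste partner's feature columns
    lam = 1.0 - n_swap / d                             # fraction of original row kept
    y_onehot = np.eye(int(y.max()) + 1)[y]
    y_soft = lam * y_onehot + (1 - lam) * y_onehot[partner]
    return X_new, y_soft

X = np.random.default_rng(1).normal(size=(6, 10))
y = np.array([0, 1, 0, 1, 2, 2])
X_aug, y_aug = cutmix_tabular(X, y, mix_ratio=0.3)
print(y_aug.round(2))                                  # soft labels reflect the mixed rows
```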

Earth Virtualization Engines – A Technical Perspective

  • paper_url: http://arxiv.org/abs/2309.09002
  • repo_url: None
  • paper_authors: Torsten Hoefler, Bjorn Stevens, Andreas F. Prein, Johanna Baehr, Thomas Schulthess, Thomas F. Stocker, John Taylor, Daniel Klocke, Pekka Manninen, Piers M. Forster, Tobias Kölling, Nicolas Gruber, Hartwig Anzt, Claudia Frauen, Florian Ziemen, Milan Klöwer, Karthik Kashinath, Christoph Schär, Oliver Fuhrer, Bryan N. Lawrence
  • for: Improving our ability to cope with climate change.
  • methods: Combine high-resolution physics-based models with machine learning techniques to improve the fidelity, efficiency, and interpretability of climate projections.
  • results: Enables access to and analysis of high-resolution climate data, making an important contribution to climate-change research and response.
    Abstract Participants of the Berlin Summit on Earth Virtualization Engines (EVEs) discussed ideas and concepts to improve our ability to cope with climate change. EVEs aim to provide interactive and accessible climate simulations and data for a wide range of users. They combine high-resolution physics-based models with machine learning techniques to improve the fidelity, efficiency, and interpretability of climate projections. At their core, EVEs offer a federated data layer that enables simple and fast access to exabyte-sized climate data through simple interfaces. In this article, we summarize the technical challenges and opportunities for developing EVEs, and argue that they are essential for addressing the consequences of climate change.

Deliberative Context-Aware Ambient Intelligence System for Assisted Living Homes

  • paper_url: http://arxiv.org/abs/2309.08984
  • repo_url: None
  • paper_authors: Mohannad Babli, Jaime A Rincon, Eva Onaindia, Carlos Carrascosa, Vicente Julian
  • for: This paper proposes a deliberation architecture for an ambient intelligence healthcare application that comforts stressed seniors suffering from negative emotions in an assisted living home.
  • methods: The architecture uses a deliberation function to achieve context-aware dynamic human-robot interaction, perception, planning capabilities, reactivity, and awareness of the environment's dynamic nature; experimental case studies are conducted to demonstrate the approach's behavior and validity.
  • results: Experiments in a simulated assisted-living-home scenario show that the proposed deliberation function effectively achieves its deliberative objectives.
    Abstract Monitoring wellbeing and stress is one of the problems covered by ambient intelligence, as stress is a significant cause of human illnesses directly affecting our emotional state. The primary aim was to propose a deliberation architecture for an ambient intelligence healthcare application. The architecture provides a plan for comforting stressed seniors suffering from negative emotions in an assisted living home and executes the plan considering the environment's dynamic nature. Literature was reviewed to identify the convergence between deliberation and ambient intelligence and the latter's latest healthcare trends. A deliberation function was designed to achieve context-aware dynamic human-robot interaction, perception, planning capabilities, reactivity, and context-awareness with regard to the environment. A number of experimental case studies in a simulated assisted living home scenario were conducted to demonstrate the approach's behavior and validity. The proposed methods were validated to show classification accuracy. The validation showed that the deliberation function has effectively achieved its deliberative objectives.

Accelerating In-Browser Deep Learning Inference on Diverse Edge Clients through Just-in-Time Kernel Optimizations

  • paper_url: http://arxiv.org/abs/2309.08978
  • repo_url: None
  • paper_authors: Fucheng Jia, Shiqi Jiang, Ting Cao, Wei Cui, Tianrui Xia, Xu Cao, Yuanchun Li, Deyu Zhang, Ju Ren, Yunxin Liu, Lili Qiu, Mao Yang
  • for: This paper presents the first in-browser deep learning (DL) inference system with just-in-time kernel generation, aimed at improving in-browser DL inference performance.
  • methods: Two novel web programming techniques, Tensor-Web Compiling Co-Design and Web-Specific Lite Kernel Optimization Space Design, enable just-in-time auto-generation of optimized kernels for both CPUs and GPUs while significantly reducing kernel generation time and maintaining or even improving performance.
  • results: Evaluated on modern transformer models, nn-JIT.web achieves up to 8.2x speedups across a range of client devices, including mainstream CPUs and GPUs from ARM, Intel, AMD, and Nvidia.
    Abstract Web applications are increasingly becoming the primary platform for AI service delivery, making in-browser deep learning (DL) inference more prominent. However, current in-browser inference systems fail to effectively utilize advanced web programming techniques and customize kernels for various client devices, leading to suboptimal performance. To address the issues, this paper presents the first in-browser inference system, nn-JIT.web, which enables just-in-time (JIT) auto-generation of optimized kernels for both CPUs and GPUs during inference. The system achieves this by using two novel web programming techniques that can significantly reduce kernel generation time, compared to other tensor compilers such as TVM, while maintaining or even improving performance. The first technique, Tensor-Web Compiling Co-Design, lowers compiling costs by unifying tensor and web compiling and eliminating redundant and ineffective compiling passes. The second technique, Web-Specific Lite Kernel Optimization Space Design, reduces kernel tuning costs by focusing on web programming requirements and efficient hardware resource utilization, limiting the optimization space to only dozens. nn-JIT.web is evaluated for modern transformer models on a range of client devices, including the mainstream CPUs and GPUs from ARM, Intel, AMD and Nvidia. Results show that nn-JIT.web can achieve up to 8.2x faster within 30 seconds compared to the baselines across various models.

Data-driven Reachability using Christoffel Functions and Conformal Prediction

  • paper_url: http://arxiv.org/abs/2309.08976
  • repo_url: None
  • paper_authors: Abdelmouaiz Tebjou, Goran Frehse, Faïcel Chamroukhi
  • for: This work proposes a data-driven method for estimating the set of states reachable by a dynamical system (the reach set) without requiring an exact model of the system dynamics.
  • methods: The reach set is approximated using Christoffel functions, combined with statistical guarantees from conformal prediction to improve sample efficiency and relax some assumptions.
  • results: The approach improves sample efficiency and robustness, and is robust to outliers in the training and calibration sets.
    Abstract An important mathematical tool in the analysis of dynamical systems is the approximation of the reach set, i.e., the set of states reachable after a given time from a given initial state. This set is difficult to compute for complex systems even if the system dynamics are known and given by a system of ordinary differential equations with known coefficients. In practice, parameters are often unknown and mathematical models difficult to obtain. Data-based approaches are promised to avoid these difficulties by estimating the reach set based on a sample of states. If a model is available, this training set can be obtained through numerical simulation. In the absence of a model, real-life observations can be used instead. A recently proposed approach for data-based reach set approximation uses Christoffel functions to approximate the reach set. Under certain assumptions, the approximation is guaranteed to converge to the true solution. In this paper, we improve upon these results by notably improving the sample efficiency and relaxing some of the assumptions by exploiting statistical guarantees from conformal prediction with training and calibration sets. In addition, we exploit an incremental way to compute the Christoffel function to avoid the calibration set while maintaining the statistical convergence guarantees. Furthermore, our approach is robust to outliers in the training and calibration set.
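
To ground the Christoffel-function idea, the sketch below builds the empirical inverse Christoffel function from sampled reachable states and classifies query states by a sublevel-set test. The polynomial degree, regularization, and quantile threshold (which the paper instead calibrates via conformal prediction) are illustrative assumptions.

```python
import numpy as np

def monomials(X):
    """Degree-<=2 monomial basis of 2-D points: [1, x, y, x^2, xy, y^2]."""
    x, y = X[:, 0], X[:, 1]
    return np.stack([np.ones_like(x), x, y, x**2, x * y, y**2], axis=1)

def fit_inverse_christoffel(samples, reg=1e-8):
    V = monomials(samples)                               # (N, k) basis evaluations
    M = V.T @ V / len(samples)                           # empirical moment matrix
    M_inv = np.linalg.inv(M + reg * np.eye(M.shape[0]))
    return lambda X: np.einsum("ik,kl,il->i", monomials(X), M_inv, monomials(X))

rng = np.random.default_rng(0)
reach_samples = rng.normal(0.0, 0.3, size=(500, 2))      # stand-in for sampled reachable states
Lambda = fit_inverse_christoffel(reach_samples)
threshold = np.quantile(Lambda(reach_samples), 0.95)     # simple calibration-style cutoff
queries = np.array([[0.0, 0.0], [2.0, 2.0]])
print(Lambda(queries) <= threshold)                      # expected: [ True False ]
```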

Multiagent Reinforcement Learning with an Attention Mechanism for Improving Energy Efficiency in LoRa Networks

  • paper_url: http://arxiv.org/abs/2309.08965
  • repo_url: None
  • paper_authors: Xu Zhang, Ziqi Lin, Shimin Gong, Bo Gu, Dusit Niyato
  • for: Improving the energy efficiency (EE) of LoRa networks for the Industrial Internet of Things (IIoT).
  • methods: An analytical model is first formulated to compute the system EE of a LoRa network; a multiagent reinforcement learning algorithm (MALoRa) with an attention mechanism then optimizes the transmission-parameter assignment (spreading factor and transmission power) of each end device (ED) to maximize system EE.
  • results: Simulation results show that MALoRa significantly improves system EE compared with baseline algorithms, at the cost of an acceptable degradation in packet delivery rate (PDR).
    Abstract Long Range (LoRa) wireless technology, characterized by low power consumption and a long communication range, is regarded as one of the enabling technologies for the Industrial Internet of Things (IIoT). However, as the network scale increases, the energy efficiency (EE) of LoRa networks decreases sharply due to severe packet collisions. To address this issue, it is essential to appropriately assign transmission parameters such as the spreading factor and transmission power for each end device (ED). However, due to the sporadic traffic and low duty cycle of LoRa networks, evaluating the system EE performance under different parameter settings is time-consuming. Therefore, we first formulate an analytical model to calculate the system EE. On this basis, we propose a transmission parameter allocation algorithm based on multiagent reinforcement learning (MALoRa) with the aim of maximizing the system EE of LoRa networks. Notably, MALoRa employs an attention mechanism to guide each ED to better learn how much ''attention'' should be given to the parameter assignments for relevant EDs when seeking to improve the system EE. Simulation results demonstrate that MALoRa significantly improves the system EE compared with baseline algorithms with an acceptable degradation in packet delivery rate (PDR).
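
The back-of-the-envelope sketch below spells out the energy-efficiency objective being maximized: delivered bits divided by transmission energy over all end devices. The airtime model (a fixed symbol count with symbol time 2^SF / BW) and all numbers are simplifying assumptions, not the paper's analytical EE model.

```python
import numpy as np

BW = 125e3                 # LoRa bandwidth (Hz)
N_SYMBOLS = 40             # assumed symbols per packet (preamble + header + payload)
PAYLOAD_BITS = 20 * 8      # assumed 20-byte payload

def system_ee(sf, tx_power_mw, pdr):
    """Per-ED spreading factors, transmit powers and PDRs -> system EE in bits per joule."""
    airtime = N_SYMBOLS * (2.0 ** sf) / BW          # seconds per packet (symbol time = 2^SF / BW)
    energy = tx_power_mw * 1e-3 * airtime           # joules per packet
    delivered_bits = pdr * PAYLOAD_BITS             # expected delivered bits per packet
    return delivered_bits.sum() / energy.sum()

sf = np.array([7, 8, 9, 10])                        # spreading factors of 4 end devices
tx_power_mw = np.array([25.0, 25.0, 50.0, 100.0])   # transmit powers (mW)
pdr = np.array([0.95, 0.90, 0.85, 0.80])            # packet delivery ratios
print(f"system EE ~ {system_ee(sf, tx_power_mw, pdr):.1f} bits/J")
```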

Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca

  • paper_url: http://arxiv.org/abs/2309.08958
  • repo_url: None
  • paper_authors: Pinzhen Chen, Shaoxiong Ji, Nikolay Bogoychev, Barry Haddow, Kenneth Heafield
  • for: This work investigates how foundational large language models (LLMs) can be instruction-tuned to develop open-ended question-answering capability, facilitating applications such as AI assistants.
  • methods: The Alpaca dataset and its machine translations form multilingual training data, which is used to tune LLMs through low-rank adaptation and full-parameter training.
  • results: Multilingual tuning is not crucial for an LLM's English performance but is key to its robustness in a multilingual environment; with a fixed budget, a multilingual instruction-tuned model trained merely on downsampled data can match monolingual models trained separately for each language, offering guidance for expanding language support under constrained computational resources.
    Abstract Foundational large language models (LLMs) can be instruction-tuned to develop open-ended question-answering capability, facilitating applications such as the creation of AI assistants. While such efforts are often carried out in a single language, building on prior research, we empirically analyze cost-efficient approaches of monolingual and multilingual tuning, shedding light on the efficacy of LLMs in responding to queries across monolingual and multilingual contexts. Our study employs the Alpaca dataset and machine translations of it to form multilingual training data, which is then used to tune LLMs through low-rank adaptation and full-parameter training. Comparisons reveal that multilingual tuning is not crucial for an LLM's English performance, but is key to its robustness in a multilingual environment. With a fixed budget, a multilingual instruction-tuned model, merely trained on downsampled data, can be as powerful as training monolingual models for each language. Our findings serve as a guide for expanding language support through instruction tuning with constrained computational resources.
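
A minimal sketch of the low-rank adaptation setup used for the parameter-efficient tuning variant is shown below, using the Hugging Face peft library; "gpt2" is a small stand-in checkpoint (the paper tunes larger LLaMA-style models, whose attention modules are typically named q_proj/v_proj rather than c_attn), and the rank and dropout values are illustrative.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")     # small stand-in checkpoint
lora_cfg = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["c_attn"],                           # gpt2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()                       # only the small adapter matrices train
# The multilingual variant swaps the Alpaca instruction data for its machine
# translations (possibly downsampled per language) in an otherwise unchanged training loop.
```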

Cross-Lingual Knowledge Editing in Large Language Models

  • paper_url: http://arxiv.org/abs/2309.08952
  • repo_url: None
  • paper_authors: Jiaan Wang, Yunlong Liang, Zengkui Sun, Yuxuan Cao, Jiarong Xu
  • for: This work investigates the cross-lingual effect of knowledge editing in large language models.
  • methods: A large-scale cross-lingual synthetic dataset is collected by translating ZsRE from English to Chinese; various knowledge editing methods are applied in English and the edited models are evaluated in Chinese, and vice versa.
  • results: The evaluation covers reliability, generality, locality, and portability, revealing differences between the source and target languages; the inconsistent behaviors of the edited models and their specific challenges are also analyzed.
    Abstract Knowledge editing aims to change language models' performance on several special cases (i.e., editing scope) by infusing the corresponding expected knowledge into them. With the recent advancements in large language models (LLMs), knowledge editing has been shown as a promising technique to adapt LLMs to new knowledge without retraining from scratch. However, most of the previous studies neglect the multi-lingual nature of some main-stream LLMs (e.g., LLaMA, ChatGPT and GPT-4), and typically focus on monolingual scenarios, where LLMs are edited and evaluated in the same language. As a result, it is still unknown the effect of source language editing on a different target language. In this paper, we aim to figure out this cross-lingual effect in knowledge editing. Specifically, we first collect a large-scale cross-lingual synthetic dataset by translating ZsRE from English to Chinese. Then, we conduct English editing on various knowledge editing methods covering different paradigms, and evaluate their performance in Chinese, and vice versa. To give deeper analyses of the cross-lingual effect, the evaluation includes four aspects, i.e., reliability, generality, locality and portability. Furthermore, we analyze the inconsistent behaviors of the edited models and discuss their specific challenges.

Universal Metric Learning with Parameter-Efficient Transfer Learning

  • paper_url: http://arxiv.org/abs/2309.08944
  • repo_url: None
  • paper_authors: Sungyeon Kim, Donghyun Kim, Suha Kwak
  • for: This paper introduces Universal Metric Learning (UML), a new metric learning paradigm that learns a unified distance metric capturing relations across multiple heterogeneous data distributions.
  • methods: The proposed Parameter-efficient Universal Metric leArning (PUMA) consists of a pre-trained frozen model and two additional modules, a stochastic adapter and a prompt pool, which capture dataset-specific knowledge while avoiding bias towards dominant distributions.
  • results: On a newly compiled universal metric learning benchmark of 8 datasets, PUMA outperforms state-of-the-art dataset-specific models while using about 69 times fewer trainable parameters.
    Abstract A common practice in metric learning is to train and test an embedding model for each dataset. This dataset-specific approach fails to simulate real-world scenarios that involve multiple heterogeneous distributions of data. In this regard, we introduce a novel metric learning paradigm, called Universal Metric Learning (UML), which learns a unified distance metric capable of capturing relations across multiple data distributions. UML presents new challenges, such as imbalanced data distribution and bias towards dominant distributions. To address these challenges, we propose Parameter-efficient Universal Metric leArning (PUMA), which consists of a pre-trained frozen model and two additional modules, stochastic adapter and prompt pool. These modules enable to capture dataset-specific knowledge while avoiding bias towards dominant distributions. Additionally, we compile a new universal metric learning benchmark with a total of 8 different datasets. PUMA outperformed the state-of-the-art dataset-specific models while using about 69 times fewer trainable parameters.

An Unified Search and Recommendation Foundation Model for Cold-Start Scenario

  • paper_url: http://arxiv.org/abs/2309.08939
  • repo_url: None
  • paper_authors: Yuqi Gong, Xichen Ding, Yehui Su, Kaiming Shen, Zhongyi Liu, Guannan Zhang
  • for: This paper proposes a multi-domain foundation model for search and recommendation systems to improve performance and flexibility, particularly in cold-start scenarios.
  • methods: A large language model (LLM) extracts domain-invariant text features, and Aspect Gating Fusion merges ID features, domain-invariant text features, and task-specific heterogeneous sparse features into query and item representations; samples from multiple search and recommendation scenarios are trained jointly with a Domain Adaptive Multi-Task module to obtain the multi-domain foundation model.
  • results: Applied to cold-start scenarios in a pretrain-finetune manner, the S&R Multi-Domain Foundation model outperforms other SOTA transfer learning methods and has been successfully deployed in the Alipay Mobile Application's online services, such as content query recommendation and service card recommendation.
    Abstract In modern commercial search engines and recommendation systems, data from multiple domains is available to jointly train the multi-domain model. Traditional methods train multi-domain models in the multi-task setting, with shared parameters to learn the similarity of multiple tasks, and task-specific parameters to learn the divergence of features, labels, and sample distributions of individual tasks. With the development of large language models, LLM can extract global domain-invariant text features that serve both search and recommendation tasks. We propose a novel framework called S\&R Multi-Domain Foundation, which uses LLM to extract domain invariant features, and Aspect Gating Fusion to merge the ID feature, domain invariant text features and task-specific heterogeneous sparse features to obtain the representations of query and item. Additionally, samples from multiple search and recommendation scenarios are trained jointly with Domain Adaptive Multi-Task module to obtain the multi-domain foundation model. We apply the S\&R Multi-Domain foundation model to cold start scenarios in the pretrain-finetune manner, which achieves better performance than other SOTA transfer learning methods. The S\&R Multi-Domain Foundation model has been successfully deployed in Alipay Mobile Application's online services, such as content query recommendation and service card recommendation, etc.

A Novel Neural-symbolic System under Statistical Relational Learning

  • paper_url: http://arxiv.org/abs/2309.08931
  • repo_url: None
  • paper_authors: Dongran Yu, Xueyan Liu, Shirui Pan, Anchen Li, Bo Yang
  • for: The paper aims to develop a cognitive model that can exhibit human-like intellectual capabilities through neural-symbolic systems, which combine the strengths of deep learning and symbolic reasoning.
  • methods: The proposed method is a general bi-level probabilistic graphical reasoning framework called GBPGR, which leverages statistical relational learning to effectively integrate deep learning models and symbolic reasoning in a mutually beneficial manner.
  • results: The approach achieves high performance and exhibits effective generalization in both transductive and inductive tasks, as demonstrated through extensive experiments.
    Abstract A key objective in field of artificial intelligence is to develop cognitive models that can exhibit human-like intellectual capabilities. One promising approach to achieving this is through neural-symbolic systems, which combine the strengths of deep learning and symbolic reasoning. However, current approaches in this area have been limited in their combining way, generalization and interpretability. To address these limitations, we propose a general bi-level probabilistic graphical reasoning framework called GBPGR. This framework leverages statistical relational learning to effectively integrate deep learning models and symbolic reasoning in a mutually beneficial manner. In GBPGR, the results of symbolic reasoning are utilized to refine and correct the predictions made by the deep learning models. At the same time, the deep learning models assist in enhancing the efficiency of the symbolic reasoning process. Through extensive experiments, we demonstrate that our approach achieves high performance and exhibits effective generalization in both transductive and inductive tasks.

DOMAIN: MilDly COnservative Model-BAsed OfflINe Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2309.08925
  • repo_url: None
  • paper_authors: Xiao-Yin Liu, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Zhen-Qiu Feng, Hao Li, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Zeng-Guang Hou
  • for: This paper proposes a new model-based reinforcement learning algorithm called DOMAIN to address the problem of distribution shift in offline RL.
  • methods: The DOMAIN algorithm uses adaptive sampling of model samples to adjust the model data penalty and does not rely on model uncertainty estimation.
  • results: The paper shows that the DOMAIN algorithm is less conservative than previous model-based offline RL algorithms and achieves better performance than other RL algorithms on tasks that require generalization.
    Abstract Model-based reinforcement learning (RL), which learns environment model from offline dataset and generates more out-of-distribution model data, has become an effective approach to the problem of distribution shift in offline RL. Due to the gap between the learned and actual environment, conservatism should be incorporated into the algorithm to balance accurate offline data and imprecise model data. The conservatism of current algorithms mostly relies on model uncertainty estimation. However, uncertainty estimation is unreliable and leads to poor performance in certain scenarios, and the previous methods ignore differences between the model data, which brings great conservatism. Therefore, this paper proposes a milDly cOnservative Model-bAsed offlINe RL algorithm (DOMAIN) without estimating model uncertainty to address the above issues. DOMAIN introduces adaptive sampling distribution of model samples, which can adaptively adjust the model data penalty. In this paper, we theoretically demonstrate that the Q value learned by the DOMAIN outside the region is a lower bound of the true Q value, the DOMAIN is less conservative than previous model-based offline RL algorithms and has the guarantee of security policy improvement. The results of extensive experiments show that DOMAIN outperforms prior RL algorithms on the D4RL dataset benchmark, and achieves better performance than other RL algorithms on tasks that require generalization.

Exploration of TPUs for AI Applications

  • paper_url: http://arxiv.org/abs/2309.08918
  • repo_url: None
  • paper_authors: Diego Sanmartín Carrión, Vera Prohaska
  • for: This paper surveys Tensor Processing Units (TPUs), Google's specialized hardware accelerators for deep learning, with a focus on AI applications and their implementation in edge computing.
  • methods: The paper first provides an overview of TPUs, including their design in relation to neural networks, general architecture, compilation techniques, and supporting frameworks; it then compares Cloud and Edge TPU performance against other counterpart chip architectures.
  • results: TPUs can provide significant performance improvements in both cloud and edge computing; further research is needed to deploy more architectures on the Edge TPU and to develop more robust comparisons in edge computing.
    Abstract Tensor Processing Units (TPUs) are specialized hardware accelerators for deep learning developed by Google. This paper explores the performance of TPU with a focus on AI and its implementation in edge computing. It first provides an overview of TPUs, specifically their design in relation to neural networks, their general architecture, compilation techniques and supporting frameworks. Furthermore, we provide a comparative analysis of Cloud and Edge TPU performance against other counterpart chip architectures. It is then discussed how TPUs can be used to speed up AI workloads. The results show that TPUs can provide significant performance improvements both in cloud and edge computing. Additionally, we address the need for further research for the deployment of more architectures in the Edge TPU, as well as the need for the development of more robust comparisons in edge computing.

Bidirectional Graph GAN: Representing Brain Structure-Function Connections for Alzheimer’s Disease

  • paper_url: http://arxiv.org/abs/2309.08916
  • repo_url: None
  • paper_authors: Shuqiang Wang, Chen Ding
  • for: This work investigates the relationship between brain structure and function to help reveal the pathogenesis of brain diseases such as Alzheimer's disease (AD).
  • methods: A bidirectional graph generative adversarial network (BGGAN) represents brain structure-function connections; an InnerGCN module lets the generators use features of direct and indirect brain regions to learn the mapping between the structural and functional domains, and a Balancer module counterpoises the optimization between generators and discriminators.
  • results: Experiments on the ADNI dataset show that both the generated structural and functional connections improve AD identification accuracy; moreover, the relationship between brain structure and function is not a complete one-to-one correspondence: brain structure is the basis of brain function, and strong structural connections are almost always accompanied by strong functional connections.
    Abstract The relationship between brain structure and function is critical for revealing the pathogenesis of brain disease, including Alzheimer's disease (AD). However, it is a great challenge to map brain structure-function connections due to various reasons. In this work, a bidirectional graph generative adversarial networks (BGGAN) is proposed to represent brain structure-function connections. Specifically, by designing a module incorporating inner graph convolution network (InnerGCN), the generators of BGGAN can employ features of direct and indirect brain regions to learn the mapping function between structural domain and functional domain. Besides, a new module named Balancer is designed to counterpoise the optimization between generators and discriminators. By introducing the Balancer into BGGAN, both the structural generator and functional generator can not only alleviate the issue of mode collapse but also learn complementarity of structural and functional features. Experimental results using ADNI datasets show that the both the generated structure connections and generated function connections can improve the identification accuracy of AD. More importantly, based the proposed model, it is found that the relationship between brain structure and function is not a complete one-to-one correspondence. Brain structure is the basis of brain function. The strong structural connections are almost accompanied by strong functional connections.

A Statistical Turing Test for Generative Models

  • paper_url: http://arxiv.org/abs/2309.08913
  • repo_url: None
  • paper_authors: Hayden Helm, Carey E. Priebe, Weiwei Yang
  • for: This work quantifies the difference between the distributions of human- and machine-generated content, enabling evaluation of whether generative models exhibit human-like capabilities.
  • methods: A framework in the language of statistical pattern recognition is introduced; current evaluation methods are described within this framework, which is then used to evaluate generative models.
  • results: Current generative models show improved performance under this evaluation, but a gap from the distribution of human-generated content remains.
    Abstract The emergence of human-like abilities of AI systems for content generation in domains such as text, audio, and vision has prompted the development of classifiers to determine whether content originated from a human or a machine. Implicit in these efforts is an assumption that the generation properties of a human are different from that of the machine. In this work, we provide a framework in the language of statistical pattern recognition that quantifies the difference between the distributions of human and machine-generated content conditioned on an evaluation context. We describe current methods in the context of the framework and demonstrate how to use the framework to evaluate the progression of generative models towards human-like capabilities, among many axes of analysis.
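
One simple instance of quantifying the difference between human and machine content distributions is a classifier two-sample test, sketched below: held-out accuracy near 0.5 means the distributions are hard to distinguish in that evaluation context. This illustrates the general idea rather than the paper's exact framework, and the 2-D Gaussian features are placeholders for real document embeddings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
human = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))      # placeholder "human" features
machine = rng.normal(loc=0.4, scale=1.0, size=(1000, 2))    # placeholder "machine" features
X = np.vstack([human, machine])
y = np.r_[np.zeros(1000), np.ones(1000)]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
acc = LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te)
print(f"held-out accuracy = {acc:.2f}  (0.5 would mean indistinguishable)")
```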

V2CE: Video to Continuous Events Simulator

  • paper_url: http://arxiv.org/abs/2309.08891
  • repo_url: None
  • paper_authors: Zhongyang Zhang, Shuyang Cui, Kaidong Chai, Haowen Yu, Subhasis Dasgupta, Upal Mahbub, Tauhidur Rahman
  • for: This paper proposes a method for converting video into event streams for Dynamic Vision Sensors (DVS), improving DVS-based performance on computer vision tasks.
  • methods: A video-to-events conversion method designed from multiple perspectives, with carefully designed losses that improve the quality of generated event voxels, together with a local dynamic-aware timestamp inference strategy that recovers event timestamps in a continuous fashion and eliminates the temporal layering problem.
  • results: Rigorous validation with quantified metrics at all stages of the pipeline shows the method achieves the highest accuracy, establishing it as the current state-of-the-art (SOTA).
    Abstract Dynamic Vision Sensor (DVS)-based solutions have recently garnered significant interest across various computer vision tasks, offering notable benefits in terms of dynamic range, temporal resolution, and inference speed. However, as a relatively nascent vision sensor compared to Active Pixel Sensor (APS) devices such as RGB cameras, DVS suffers from a dearth of ample labeled datasets. Prior efforts to convert APS data into events often grapple with issues such as a considerable domain shift from real events, the absence of quantified validation, and layering problems within the time axis. In this paper, we present a novel method for video-to-events stream conversion from multiple perspectives, considering the specific characteristics of DVS. A series of carefully designed losses helps enhance the quality of generated event voxels significantly. We also propose a novel local dynamic-aware timestamp inference strategy to accurately recover event timestamps from event voxels in a continuous fashion and eliminate the temporal layering problem. Results from rigorous validation through quantified metrics at all stages of the pipeline establish our method unquestionably as the current state-of-the-art (SOTA).

GCL: Gradient-Guided Contrastive Learning for Medical Image Segmentation with Multi-Perspective Meta Labels

  • paper_url: http://arxiv.org/abs/2309.08888
  • repo_url: None
  • paper_authors: Yixuan Wu, Jintai Chen, Jiahuan Yan, Yiheng Zhu, Danny Z. Chen, Jian Wu
  • for: Reducing the cost of annotating medical images by making segmentation pre-training more annotation-efficient.
  • methods: A Gradient Mitigator method systematically unifies multi-perspective meta labels, resolving the "semantic contradiction" between them, and a Gradient Filter method dynamically screens the pixel pairs with the most discriminating power based on the magnitude of their gradients.
  • results: Experiments on four medical image segmentation datasets show that the new method, GCL, learns informative image representations, considerably boosts segmentation performance with limited labels, and generalizes well to out-of-distribution datasets.
    Abstract Since annotating medical images for segmentation tasks commonly incurs expensive costs, it is highly desirable to design an annotation-efficient method to alleviate the annotation burden. Recently, contrastive learning has exhibited a great potential in learning robust representations to boost downstream tasks with limited labels. In medical imaging scenarios, ready-made meta labels (i.e., specific attribute information of medical images) inherently reveal semantic relationships among images, which have been used to define positive pairs in previous work. However, the multi-perspective semantics revealed by various meta labels are usually incompatible and can incur intractable "semantic contradiction" when combining different meta labels. In this paper, we tackle the issue of "semantic contradiction" in a gradient-guided manner using our proposed Gradient Mitigator method, which systematically unifies multi-perspective meta labels to enable a pre-trained model to attain a better high-level semantic recognition ability. Moreover, we emphasize that the fine-grained discrimination ability is vital for segmentation-oriented pre-training, and develop a novel method called Gradient Filter to dynamically screen pixel pairs with the most discriminating power based on the magnitude of gradients. Comprehensive experiments on four medical image segmentation datasets verify that our new method GCL: (1) learns informative image representations and considerably boosts segmentation performance with limited labels, and (2) shows promising generalizability on out-of-distribution datasets.
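
The sketch below gives one simple reading of gradient-based pair screening: for a softmax-style contrastive loss over pair similarities, the gradient with respect to each similarity logit is (probability minus target), so its magnitude ranks how strongly each pixel pair currently drives learning, and only the top-k pairs are kept. The paper's actual Gradient Filter criterion may differ.

```python
import torch
import torch.nn.functional as F

def top_k_pairs_by_gradient(similarities, positive_idx, k):
    """similarities: (n_anchors, n_candidates) logits; returns indices of the k most informative pairs."""
    probs = similarities.softmax(dim=1)
    targets = F.one_hot(positive_idx, num_classes=similarities.shape[1]).float()
    grad_mag = (probs - targets).abs()                   # |dL/d logit| for a softmax CE loss
    flat_idx = grad_mag.flatten().topk(k).indices
    rows = torch.div(flat_idx, grad_mag.shape[1], rounding_mode="floor")
    cols = flat_idx % grad_mag.shape[1]
    return rows, cols

sims = torch.randn(4, 10)                                # 4 anchor pixels, 10 candidates each
pos = torch.tensor([0, 3, 5, 7])                         # index of each anchor's positive pair
anchors, candidates = top_k_pairs_by_gradient(sims, pos, k=5)
print(list(zip(anchors.tolist(), candidates.tolist())))
```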

Solving Satisfiability Modulo Counting for Symbolic and Statistical AI Integration With Provable Guarantees

  • paper_url: http://arxiv.org/abs/2309.08883
  • repo_url: None
  • paper_authors: Jinzhao Li, Nan Jiang, Yexiang Xue
  • for: This paper addresses Satisfiability Modulo Counting (SMC), a class of problems at the intersection of symbolic and statistical artificial intelligence.
  • methods: XOR-SMC, a polynomial algorithm with access to NP-oracles, solves highly intractable SMC problems with constant approximation guarantees by replacing the model counting in SMC with SAT formulae subject to randomized XOR constraints.
  • results: Experiments on important SMC problems in AI for social good show that XOR-SMC finds solutions close to the true optimum, outperforming baselines that struggle to approximate the intractable model counting in SMC.
    Abstract Satisfiability Modulo Counting (SMC) encompasses problems that require both symbolic decision-making and statistical reasoning. Its general formulation captures many real-world problems at the intersection of symbolic and statistical Artificial Intelligence. SMC searches for policy interventions to control probabilistic outcomes. Solving SMC is challenging because of its highly intractable nature ($\text{NP}^{\text{PP}}$-complete), incorporating statistical inference and symbolic reasoning. Previous research on SMC solving lacks provable guarantees and/or suffers from sub-optimal empirical performance, especially when combinatorial constraints are present. We propose XOR-SMC, a polynomial algorithm with access to NP-oracles, to solve highly intractable SMC problems with constant approximation guarantees. XOR-SMC transforms the highly intractable SMC into satisfiability problems, by replacing the model counting in SMC with SAT formulae subject to randomized XOR constraints. Experiments on solving important SMC problems in AI for social good demonstrate that XOR-SMC finds solutions close to the true optimum, outperforming several baselines which struggle to find good approximations for the intractable model counting in SMC.
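
The toy example below illustrates the XOR-counting mechanism that lets SAT queries stand in for model counting: each random parity constraint keeps any fixed assignment with probability 1/2, so staying satisfiable after m such constraints suggests on the order of 2^m models. Brute-force enumeration over a tiny example formula replaces the NP-oracle (SAT solver) calls of the real algorithm.

```python
import itertools
import random

def phi(x):                            # example formula over 6 Boolean variables: x0 OR x1
    return x[0] or x[1]

def random_xor(n, rng):
    """Random parity constraint: XOR of a random subset of variables must equal a random bit."""
    subset = [i for i in range(n) if rng.random() < 0.5]
    b = rng.random() < 0.5
    return lambda x: (sum(x[i] for i in subset) % 2 == 1) == b

n, rng = 6, random.Random(0)
models = [x for x in itertools.product([0, 1], repeat=n) if phi(x)]
print("true model count:", len(models))                  # 48 of the 64 assignments

for m in range(1, 7):
    xors = [random_xor(n, rng) for _ in range(m)]
    survivors = [x for x in models if all(c(x) for c in xors)]
    print(f"{m} XOR constraints -> {len(survivors)} surviving models")   # roughly halves each time
```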

ChatGPT-4 with Code Interpreter can be used to solve introductory college-level vector calculus and electromagnetism problems

  • paper_url: http://arxiv.org/abs/2309.08881
  • repo_url: None
  • paper_authors: Tanuj Kumar, Mikhail A. Kats
  • for: This paper evaluates ChatGPT 3.5, 4, and 4 with Code Interpreter on college-level engineering-math and electromagnetism problems, such as those given to sophomore electrical engineering majors.
  • methods: A set of 13 problems was selected, and each was solved multiple times using a fresh ChatGPT instance (chat) each time.
  • results: ChatGPT-4 with Code Interpreter satisfactorily solved most of the tested problems most of the time, a major improvement over ChatGPT-3.5 and ChatGPT-4 without Code Interpreter.
    Abstract We evaluated ChatGPT 3.5, 4, and 4 with Code Interpreter on a set of college-level engineering-math and electromagnetism problems, such as those often given to sophomore electrical engineering majors. We selected a set of 13 problems, and had ChatGPT solve them multiple times, using a fresh instance (chat) each time. We found that ChatGPT-4 with Code Interpreter was able to satisfactorily solve most problems we tested most of the time -- a major improvement over the performance of ChatGPT-4 (or 3.5) without Code Interpreter. The performance of ChatGPT was observed to be somewhat stochastic, and we found that solving the same problem N times in new ChatGPT instances and taking the most-common answer was an effective strategy. Based on our findings and observations, we provide some recommendations for instructors and students of classes at this level.
    摘要 我们在一组大学水平的工程数学与电磁学问题上评测了 ChatGPT 3.5、4 以及带 Code Interpreter 的 ChatGPT 4,这类问题常见于电气工程专业二年级课程。我们选取了 13 道题,让 ChatGPT 多次求解,每次使用全新的对话实例。我们发现,带 Code Interpreter 的 ChatGPT-4 在大多数情况下能够令人满意地解出我们测试的大部分题目,相比不带 Code Interpreter 的 ChatGPT-4(或 3.5)有明显提升。ChatGPT 的表现带有一定的随机性;我们发现对同一道题在新的实例中求解 N 次并取最常见答案是一种有效的策略。基于这些发现和观察,我们为该层次课程的教师和学生提出了一些建议。
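
The "solve the same problem N times in fresh chats and take the most common answer" strategy the authors recommend is easy to script. A minimal sketch follows; `ask_model` is a placeholder for whatever chat-API call is used, and the normalization step is an assumption.

```python
from collections import Counter

def most_common_answer(ask_model, question, n=5, normalize=lambda s: s.strip().lower()):
    """Ask a fresh model instance n times and return the modal answer.

    ask_model : placeholder callable, prompt string -> answer string
                (e.g. a wrapper around a chat API that opens a new conversation each call)
    Returns the most frequent normalized answer and its agreement rate."""
    answers = [normalize(ask_model(question)) for _ in range(n)]
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / n

# usage sketch:
# best, agreement = most_common_answer(ask_model, "A 2 uC charge sits 3 m from ...", n=7)
```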

Data-Driven H-infinity Control with a Real-Time and Efficient Reinforcement Learning Algorithm: An Application to Autonomous Mobility-on-Demand Systems

  • paper_url: http://arxiv.org/abs/2309.08880
  • repo_url: None
  • paper_authors: Ali Aalipour, Alireza Khani
  • for: 这篇论文旨在开发一种基于Q学习的无模型实时控制算法,用于解决线性离散时间系统的H$_{\infty}$控制问题。
  • methods: 所提算法使用Q学习方法,在线性离散时间系统上实现实时且数据高效的控制,计算复杂度从文献中的$\mathcal{O}(\underline{q}^3)$降低至$\mathcal{O}(\underline{q}^2)$。
  • results: 仿真研究显示,所提算法可以实现实时且数据高效的控制,且不需要初始稳定策略。将该算法应用于一个实际的自主移动出行需求(AMoD)系统案例,取得了良好的效果。
    Abstract Reinforcement learning (RL) is a class of artificial intelligence algorithms being used to design adaptive optimal controllers through online learning. This paper presents a model-free, real-time, data-efficient Q-learning-based algorithm to solve the H$_{\infty}$ control of linear discrete-time systems. The computational complexity is shown to reduce from $\mathcal{O}(\underline{q}^3)$ in the literature to $\mathcal{O}(\underline{q}^2)$ in the proposed algorithm, where $\underline{q}$ is quadratic in the sum of the size of state variables, control inputs, and disturbance. An adaptive optimal controller is designed and the parameters of the action and critic networks are learned online without the knowledge of the system dynamics, making the proposed algorithm completely model-free. Also, a sufficient probing noise is only needed in the first iteration and does not affect the proposed algorithm. With no need for an initial stabilizing policy, the algorithm converges to the closed-form solution obtained by solving the Riccati equation. A simulation study is performed by applying the proposed algorithm to real-time control of an autonomous mobility-on-demand (AMoD) system for a real-world case study to evaluate the effectiveness of the proposed algorithm.
    摘要 强化学习(RL)是一类通过在线学习来设计自适应最优控制器的人工智能算法。本文提出一种无模型、实时、数据高效的基于Q学习的算法,用于求解线性离散时间系统的H$_{\infty}$控制问题。与文献中的$\mathcal{O}(\underline{q}^3)$相比,所提算法的计算复杂度降低为$\mathcal{O}(\underline{q}^2)$,其中$\underline{q}$与状态变量、控制输入和扰动的维数之和成二次关系。我们设计了自适应最优控制器,并在不知道系统动力学的情况下在线学习动作网络和评价网络的参数,因此所提算法完全无模型。此外,仅在第一次迭代时需要足够的探测噪声,且其不影响所提算法。算法无需初始稳定策略即可收敛到由求解Riccati方程得到的闭式解。我们将所提算法应用于一个实际案例中自主移动出行需求(AMoD)系统的实时控制,通过仿真研究评估了算法的有效性。
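
For readers unfamiliar with Q-learning for H-infinity control, the sketch below shows the generic batch least-squares fit of a quadratic Q-function Q(z) = z'Hz over z = [x; u; w], and how the feedback and worst-case-disturbance gains are read off the fitted H via the standard zero-sum LQ first-order conditions. This is the textbook O(q^3) step, not the paper's O(q^2) recursive update, and all names and shapes are illustrative assumptions.

```python
import numpy as np

def lstsq_q_update(Z, targets):
    """Fit a quadratic Q-function Q(z) = z' H z from rollout data.

    Z       : (T, q) stacked samples z_k = [x_k; u_k; w_k]
    targets : (T,) Bellman targets, e.g. stage cost plus Q of the next sample
              under the current policy
    Plain batch least squares on vec(z z') features; the paper's cheaper
    recursive update is not reproduced here."""
    T, q = Z.shape
    Phi = np.stack([np.outer(z, z).ravel() for z in Z])   # (T, q*q)
    h, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
    H = h.reshape(q, q)
    return 0.5 * (H + H.T)                                 # symmetrize

def gains_from_H(H, n_x, n_u, n_w):
    """Recover the control gain K (u = -K x) and worst-case disturbance gain
    L (w = L x) by solving the joint stationarity conditions of the
    quadratic sub-game in (u, w)."""
    Hxu = H[:n_x, n_x:n_x + n_u]
    Hxw = H[:n_x, n_x + n_u:]
    Huu = H[n_x:n_x + n_u, n_x:n_x + n_u]
    Huw = H[n_x:n_x + n_u, n_x + n_u:]
    Hww = H[n_x + n_u:, n_x + n_u:]
    S = np.block([[Huu, Huw], [Huw.T, Hww]])
    rhs = np.vstack([Hxu.T, Hxw.T])
    KL = np.linalg.solve(S, rhs)       # stacked [K; -L]
    return KL[:n_u], -KL[n_u:]
```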

PDFTriage: Question Answering over Long, Structured Documents

  • paper_url: http://arxiv.org/abs/2309.08872
  • repo_url: None
  • paper_authors: Jon Saad-Falcon, Joe Barrow, Alexa Siu, Ani Nenkova, Ryan A. Rossi, Franck Dernoncourt
  • for: 本研究旨在解决大语言模型(LLM)在文档问答(QA)中遇到的问题,即当文档不能适应LLM的小上下文长度时。
  • methods: 本研究提议一种名为PDFTriage的方法,可以基于结构或内容来检索文档上下文。
  • results: 我们的实验显示,使用PDFTriage可以在多种问题类型上提升文档QA的效果,而现有的检索增强LLM在这些问题上表现不佳。此外,我们还发布了一个包含10类问题类型、覆盖80个结构化文档的900多个人工生成问题的数据集,以促进对这一基本问题的进一步研究。
    Abstract Large Language Models (LLMs) have issues with document question answering (QA) in situations where the document is unable to fit in the small context length of an LLM. To overcome this issue, most existing works focus on retrieving the relevant context from the document, representing them as plain text. However, documents such as PDFs, web pages, and presentations are naturally structured with different pages, tables, sections, and so on. Representing such structured documents as plain text is incongruous with the user's mental model of these documents with rich structure. When a system has to query the document for context, this incongruity is brought to the fore, and seemingly trivial questions can trip up the QA system. To bridge this fundamental gap in handling structured documents, we propose an approach called PDFTriage that enables models to retrieve the context based on either structure or content. Our experiments demonstrate the effectiveness of the proposed PDFTriage-augmented models across several classes of questions where existing retrieval-augmented LLMs fail. To facilitate further research on this fundamental problem, we release our benchmark dataset consisting of 900+ human-generated questions over 80 structured documents from 10 different categories of question types for document QA.
    摘要 当文档无法放入大语言模型(LLM)较小的上下文长度时,LLM在文档问答(QA)中会遇到困难。为克服这一问题,现有工作大多聚焦于从文档中检索相关上下文,并将其表示为纯文本。然而,PDF、网页和演示文稿等文档天然具有页面、表格、章节等结构。将这类结构化文档表示为纯文本,与用户对这些富结构文档的心智模型不符。当系统需要向文档查询上下文时,这种不一致会凸显出来,看似简单的问题也可能难倒QA系统。为弥合处理结构化文档的这一根本差距,我们提出PDFTriage方法,使模型能够基于结构或内容来检索上下文。实验表明,在多类问题上,经PDFTriage增强的模型优于现有的检索增强LLM。为促进对这一基本问题的进一步研究,我们发布了基准数据集,包含来自10类问题类型、覆盖80个结构化文档的900多个人工生成问题。
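
A toy illustration of structure-aware retrieval in the spirit of PDFTriage is sketched below. The element schema, the `fetch_*` actions, and the prompt format are assumptions made for this example; the paper defines its own structured representation and retrieval functions.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Element:
    kind: str    # e.g. "section", "table"
    title: str
    page: int
    text: str

class StructuredDoc:
    """Toy structured-document store; the fetch_* actions are hypothetical names."""
    def __init__(self, elements: List[Element]):
        self.elements = elements

    def outline(self) -> str:
        return "\n".join(f"[{e.kind}] p.{e.page} {e.title}" for e in self.elements)

    def fetch_pages(self, start: int, end: int) -> List[str]:
        return [e.text for e in self.elements if start <= e.page <= end]

    def fetch_by_title(self, kind: str, title: str) -> List[str]:
        return [e.text for e in self.elements
                if e.kind == kind and title.lower() in e.title.lower()]

def triage_answer(llm: Callable[[str], str], doc: StructuredDoc, question: str) -> str:
    """Two stages: (1) show only the outline and let the model pick a retrieval
    action, (2) answer from the fetched context.  `llm` is a placeholder callable."""
    action = llm(f"Outline:\n{doc.outline()}\nQuestion: {question}\n"
                 "Reply with one of: pages START END | section TITLE | table TITLE")
    parts = action.split()
    if parts and parts[0] == "pages" and len(parts) >= 3:
        ctx = doc.fetch_pages(int(parts[1]), int(parts[2]))
    elif parts and parts[0] == "table":
        ctx = doc.fetch_by_title("table", " ".join(parts[1:]))
    else:
        ctx = doc.fetch_by_title("section", " ".join(parts[1:]))
    return llm("Context:\n" + "\n\n".join(ctx) + f"\nQuestion: {question}")
```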

MHLAT: Multi-hop Label-wise Attention Model for Automatic ICD Coding

  • paper_url: http://arxiv.org/abs/2309.08868
  • repo_url: None
  • paper_authors: Junwen Duan, Han Jiang, Ying Yu
  • for: 国际疾病分类(ICD)编码任务是为临床记录分配相应的ICD诊断代码。
  • methods: 我们提出了一种简单而有效的多跳标签级注意力模型(MHLAT),利用多跳标签级注意力获得更精确、更有信息量的表示。
  • results: 在三个MIMIC基准数据集上的大量实验表明,我们的方法在全部七项指标上取得显著更优或具有竞争力的表现,且需要优化的参数更少。
    Abstract International Classification of Diseases (ICD) coding is the task of assigning ICD diagnosis codes to clinical notes. This can be challenging given the large quantity of labels (nearly 9,000) and lengthy texts (up to 8,000 tokens). However, unlike the single-pass reading process in previous works, humans tend to read the text and label definitions again to get more confident answers. Moreover, although pretrained language models have been used to address these problems, they suffer from huge memory usage. To address the above problems, we propose a simple but effective model called the Multi-Hop Label-wise ATtention (MHLAT), in which multi-hop label-wise attention is deployed to get more precise and informative representations. Extensive experiments on three benchmark MIMIC datasets indicate that our method achieves significantly better or competitive performance on all seven metrics, with much fewer parameters to optimize.
    摘要 国际疾病分类(ICD)编码任务是为临床记录分配ICD诊断代码。由于标签数量庞大(近9,000个)且文本很长(最多8,000个token),该任务颇具挑战。与以往工作中的单遍阅读过程不同,人类往往会反复阅读文本和标签定义,以得到更有把握的答案。此外,尽管预训练语言模型已被用于解决这些问题,但其内存占用巨大。针对上述问题,我们提出一种简单而有效的多跳标签级注意力模型(MHLAT),通过多跳标签级注意力获得更精确、更有信息量的表示。在三个MIMIC基准数据集上的大量实验表明,我们的方法在全部七项指标上取得显著更优或具有竞争力的表现,且需要优化的参数少得多。
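
A shapes-only sketch of multi-hop label-wise attention is given below. It omits learned projections and uses a simple residual refresh of the label queries, so it should be read as an illustration of the mechanism rather than the paper's exact parameterization.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_hop_label_attention(H, Q, hops=2):
    """Toy multi-hop label-wise attention (no learned projections).

    H : (T, d) token representations of one clinical note
    Q : (L, d) one query vector per ICD label
    Each hop attends over the tokens with the current label queries and then
    refreshes the queries with what was just read.
    Returns (L, d) label-specific document representations."""
    V = Q.copy()
    for _ in range(hops):
        att = softmax(V @ H.T / np.sqrt(H.shape[1]), axis=-1)   # (L, T) attention per label
        read = att @ H                                           # (L, d) label-wise read-out
        V = V + read                                             # residual refresh of label queries
    return V

# usage sketch: per-label logits could then be scored as (multi_hop_label_attention(H, label_emb) * W).sum(-1) + b
```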

Trajectory Tracking Control of Skid-Steering Mobile Robots with Slip and Skid Compensation using Sliding-Mode Control and Deep Learning

  • paper_url: http://arxiv.org/abs/2309.08863
  • repo_url: None
  • paper_authors: Payam Nourizadeh, Fiona J Stevens McFadden, Will N Browne
  • for: 这篇研究旨在为户外环境中的滑移转向移动机器人提供一种可在线实时运行的轨迹跟踪控制系统,以减少机器人在不可预测环境中的跟踪误差。
  • methods: 本研究使用滑模控制技术设计了鲁棒的轨迹跟踪系统,并将两个先前开发的深度学习模型 [1], [2] 集成到控制反馈回路中,实时估计机器人的打滑和不期望的侧滑,并将估计值输入补偿器。
  • results: 实验结果显示,所提出的带打滑与侧滑补偿器的控制器可将轨迹跟踪系统的性能提升超过27%。
    Abstract Slip and skid compensation is crucial for mobile robots' navigation in outdoor environments and uneven terrains. In addition to the general slipping and skidding hazards for mobile robots in outdoor environments, slip and skid cause uncertainty for the trajectory tracking system and put the validity of stability analysis at risk. Despite research in this field, having a real-world feasible online slip and skid compensation is still challenging due to the complexity of wheel-terrain interaction in outdoor environments. This paper presents a novel trajectory tracking technique with real-world feasible online slip and skid compensation at the vehicle-level for skid-steering mobile robots in outdoor environments. The sliding mode control technique is utilized to design a robust trajectory tracking system to be able to consider the parameter uncertainty of this type of robot. Two previously developed deep learning models [1], [2] are integrated into the control feedback loop to estimate the robot's slipping and undesired skidding and feed the compensator in a real-time manner. The main advantages of the proposed technique are (1) considering two slip-related parameters rather than the conventional three slip parameters at the wheel-level, and (2) having an online real-world feasible slip and skid compensator to be able to reduce the tracking errors in unforeseen environments. The experimental results show that the proposed controller with the slip and skid compensator improves the performance of the trajectory tracking system by more than 27%.
    摘要 打滑与侧滑补偿对移动机器人在户外环境和不平整地形中的导航至关重要。除了户外环境中常见的打滑与侧滑危险之外,打滑和侧滑还会给轨迹跟踪系统带来不确定性,并危及稳定性分析的有效性。尽管该领域已有不少研究,但由于户外环境中轮地交互的复杂性,实现现实可行的在线打滑与侧滑补偿仍然具有挑战性。本文针对户外环境中的滑移转向移动机器人,提出一种在车辆层面进行现实可行的在线打滑与侧滑补偿的新型轨迹跟踪技术。我们利用滑模控制技术设计鲁棒的轨迹跟踪系统,以应对此类机器人的参数不确定性。两个先前开发的深度学习模型 [1], [2] 被集成到控制反馈回路中,实时估计机器人的打滑和不期望的侧滑,并将估计值提供给补偿器。所提技术的主要优点是:(1)在车辆层面仅考虑两个与滑动相关的参数,而非传统轮级的三个滑动参数;(2)具有在线、现实可行的打滑与侧滑补偿器,能够在未知环境中降低跟踪误差。实验结果表明,带打滑与侧滑补偿器的所提控制器将轨迹跟踪系统的性能提升超过27%。
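
As a rough illustration of how a learned slip/skid estimate can be folded into a sliding-mode law, consider the single-degree-of-freedom sketch below. The gains, the boundary-layer saturation, and the additive compensation term are assumptions made for illustration; the paper's controller operates at the vehicle level on the full skid-steering dynamics.

```python
import numpy as np

def sliding_mode_with_compensation(e, e_dot, slip_hat, skid_hat,
                                   lam=1.0, k=0.8, phi=0.05):
    """Toy 1-DOF sliding-mode tracking law with additive slip/skid compensation.

    e, e_dot : tracking error and its rate (e.g., lateral error)
    slip_hat : slip estimate from a learned model (assumed available)
    skid_hat : skid estimate from a learned model (assumed available)
    lam      : sliding-surface slope, s = e_dot + lam * e
    k, phi   : switching gain and boundary-layer width (saturation instead of
               sign() to reduce chattering)
    Returns a corrective command; the real controller is vehicle-level and MIMO."""
    s = e_dot + lam * e
    sat = np.clip(s / phi, -1.0, 1.0)
    u_sm = -lam * e_dot - k * sat          # nominal sliding-mode term
    u_comp = -(slip_hat + skid_hat)        # feed-forward cancellation of estimated slip/skid
    return u_sm + u_comp
```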

Emerging Approaches for THz Array Imaging: A Tutorial Review and Software Tool

  • paper_url: http://arxiv.org/abs/2309.08844
  • repo_url: None
  • paper_authors: Josiah W. Smith, Murat Torlak
  • for: 该文章旨在综述近场太赫兹(THz)频段下的合成孔径雷达(SAR)成像系统与算法。
  • methods: 文章回顾了经典的和数据驱动的THz SAR算法,重点介绍结合信号处理与机器学习技术的新兴算法,涵盖安全应用中的目标检测和SAR图像超分辨等任务。
  • results: 文章讨论了相关问题、挑战与未来研究方向,包括系统与算法的基准标准化、采用最新的深度学习技术、面向信号处理优化的机器学习,以及混合数据驱动的信号处理算法。
    Abstract Accelerated by the increasing attention drawn by 5G, 6G, and Internet of Things applications, communication and sensing technologies have rapidly evolved from millimeter-wave (mmWave) to terahertz (THz) in recent years. Enabled by significant advancements in electromagnetic (EM) hardware, mmWave and THz frequency regimes spanning 30 GHz to 300 GHz and 300 GHz to 3000 GHz, respectively, can be employed for a host of applications. The main feature of THz systems is high-bandwidth transmission, enabling ultra-high-resolution imaging and high-throughput communications; however, challenges in both the hardware and algorithmic arenas remain for the ubiquitous adoption of THz technology. Spectra comprising mmWave and THz frequencies are well-suited for synthetic aperture radar (SAR) imaging at sub-millimeter resolutions for a wide spectrum of tasks like material characterization and nondestructive testing (NDT). This article provides a tutorial review of systems and algorithms for THz SAR in the near-field with an emphasis on emerging algorithms that combine signal processing and machine learning techniques. As part of this study, an overview of classical and data-driven THz SAR algorithms is provided, focusing on object detection for security applications and SAR image super-resolution. We also discuss relevant issues, challenges, and future research directions for emerging algorithms and THz SAR, including standardization of system and algorithm benchmarking, adoption of state-of-the-art deep learning techniques, signal processing-optimized machine learning, and hybrid data-driven signal processing algorithms...
    摘要 在5G、6G和物联网应用日益受到关注的推动下,通信与感知技术近年来迅速从毫米波(mmWave)发展到太赫兹(THz)。得益于电磁(EM)硬件的重大进步,毫米波与太赫兹频段(分别为30 GHz至300 GHz和300 GHz至3000 GHz)可用于众多应用。THz系统的主要特点是高带宽传输,使超高分辨率成像和高吞吐量通信成为可能;然而,要让THz技术得到普遍采用,硬件与算法两方面仍存在挑战。由毫米波和太赫兹频率组成的频谱非常适合亚毫米分辨率的合成孔径雷达(SAR)成像,可用于材料表征和无损检测(NDT)等多种任务。本文对近场THz SAR的系统与算法进行了教程式综述,重点介绍结合信号处理与机器学习技术的新兴算法。作为研究的一部分,本文概述了经典的和数据驱动的THz SAR算法,聚焦于安全应用中的目标检测与SAR图像超分辨。我们还讨论了新兴算法与THz SAR的相关问题、挑战和未来研究方向,包括系统与算法的基准标准化、采用最新的深度学习技术、面向信号处理优化的机器学习,以及混合数据驱动的信号处理算法……
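
Among the classical image-formation methods such a tutorial covers, back-projection is the most direct. A naive monostatic, single-depth-slice version is sketched below purely for orientation; array geometry, calibration, and the efficient Fourier-domain algorithms the paper reviews are omitted, and all parameter names are assumptions.

```python
import numpy as np

def backprojection_slice(echo, ant_xy, freqs, grid_xy, z0, c=3e8):
    """Naive near-field monostatic back-projection onto one plane at depth z0.

    echo    : (A, F) complex samples, one row per antenna position, one column per frequency
    ant_xy  : (A, 2) antenna positions in the aperture plane
    freqs   : (F,) stepped frequencies in Hz
    grid_xy : (P, 2) image pixel positions in the target plane
    Each pixel is focused by compensating the two-way phase exp(-j*4*pi*f*d/c)."""
    img = np.zeros(len(grid_xy), dtype=complex)
    for a, (xa, ya) in enumerate(ant_xy):
        d = np.sqrt((grid_xy[:, 0] - xa) ** 2 + (grid_xy[:, 1] - ya) ** 2 + z0 ** 2)
        for fi, f in enumerate(freqs):
            img += echo[a, fi] * np.exp(1j * 4 * np.pi * f / c * d)   # matched-filter phase
    return np.abs(img)
```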

Bias and Fairness in Chatbots: An Overview

  • paper_url: http://arxiv.org/abs/2309.08836
  • repo_url: None
  • paper_authors: Jintang Xue, Yun-Cheng Wang, Chengwei Wei, Xiaofeng Liu, Jonghye Woo, C. -C. Jay Kuo
  • for: 本研究旨在提供一份对聊天机器人系统偏见和公平性的全面综述,以帮助开发者更好地设计和实现公平和无偏见的聊天机器人系统。
  • methods: 本研究使用了大量的文献综述和分析方法,检视了聊天机器人系统的历史和类别,分析了偏见的来源和应用中的可能的危害,并考虑了设计公平和无偏见的聊天机器人系统的因素。
  • results: 本研究结果表明,现代聊天机器人系统具有更高的功能和应用前景,但也存在偏见和公平性的担忧。通过分析偏见的来源和应用中的影响,以及考虑设计公平和无偏见的因素,可以更好地设计和实现公平和无偏见的聊天机器人系统。
    Abstract Chatbots have been studied for more than half a century. With the rapid development of natural language processing (NLP) technologies in recent years, chatbots using large language models (LLMs) have received much attention nowadays. Compared with traditional ones, modern chatbots are more powerful and have been used in real-world applications. There are however, bias and fairness concerns in modern chatbot design. Due to the huge amounts of training data, extremely large model sizes, and lack of interpretability, bias mitigation and fairness preservation of modern chatbots are challenging. Thus, a comprehensive overview on bias and fairness in chatbot systems is given in this paper. The history of chatbots and their categories are first reviewed. Then, bias sources and potential harms in applications are analyzed. Considerations in designing fair and unbiased chatbot systems are examined. Finally, future research directions are discussed.
    摘要 聊天机器人已经被研究了半个多世纪。随着自然语言处理(NLP)技术近年来的快速发展,基于大型语言模型(LLM)的聊天机器人如今受到广泛关注。与传统聊天机器人相比,现代聊天机器人功能更强大,并已用于实际应用。然而,现代聊天机器人的设计中存在偏见与公平性方面的担忧。由于训练数据庞大、模型规模极大且缺乏可解释性,现代聊天机器人的偏见缓解与公平性保持颇具挑战。因此,本文对聊天机器人系统中的偏见与公平问题进行了全面综述:首先回顾聊天机器人的历史及其类别;然后分析偏见的来源以及应用中的潜在危害;接着探讨设计公平、无偏见聊天机器人系统时的考量;最后讨论未来的研究方向。

SLIDE: Reference-free Evaluation for Machine Translation using a Sliding Document Window

  • paper_url: http://arxiv.org/abs/2309.08832
  • repo_url: None
  • paper_authors: Vikas Raunak, Tom Kocmi, Matt Post
  • for: 这篇论文研究机器翻译的无参考评估,检验文档级的源上下文是否能够提供与人工参考相同的信息。
  • methods: 论文提出了一个新的评估指标SLIDE(SLiding Document Evaluator):它使用在测试集中每篇文档上滑动的窗口取出句子块,并将每个块输入一个未经修改的、现成的质量评估模型进行打分。
  • results: 研究发现,SLIDE的成对系统判别准确率显著高于其句子级基线,在某些情况下甚至消除了与基于参考的指标之间的差距。这表明文档级源上下文可能提供与人工参考相同的信息。
    Abstract Reference-based metrics that operate at the sentence level typically outperform quality estimation metrics, which have access only to the source and system output. This is unsurprising, since references resolve ambiguities that may be present in the source. We investigate whether additional source context can effectively substitute for a reference. We present a metric, SLIDE (SLiding Document Evaluator), which operates on blocks of sentences using a window that slides over each document in the test set, feeding each chunk into an unmodified, off-the-shelf quality estimation model. We find that SLIDE obtains significantly higher pairwise system accuracy than its sentence-level baseline, in some cases even eliminating the gap with reference-based metrics. This suggests that source context may provide the same information as a human reference.
    摘要 在句子层面工作的基于参考的指标通常优于只能获取源文与系统输出的质量评估指标。这并不意外,因为参考译文可以消解源文中可能存在的歧义。我们研究额外的源文上下文能否有效替代参考。我们提出SLIDE(SLiding Document Evaluator)指标:它在测试集的每篇文档上滑动一个窗口,按句子块进行处理,并将每个块输入一个未经修改的、现成的质量评估模型。我们发现,SLIDE的成对系统判别准确率显著高于其句子级基线,在某些情况下甚至消除了与基于参考的指标之间的差距。这表明源文上下文可能提供与人工参考相同的信息。
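
Operationally, SLIDE-style scoring reduces to sliding a fixed window over the document and averaging chunk-level quality-estimation scores. A minimal sketch follows; `qe_model`, the window size, the stride, and the plain-mean aggregation are placeholders and assumptions, not the paper's exact configuration.

```python
def slide_score(doc_src, doc_hyp, qe_model, window=6, stride=3):
    """Minimal sketch of a sliding-window, reference-free document score.

    doc_src, doc_hyp : lists of aligned source / system-output sentences
    qe_model         : placeholder for an off-the-shelf quality-estimation model,
                       assumed to take (source_text, hypothesis_text) -> float
    Returns the mean of the per-window scores."""
    scores = []
    for start in range(0, max(1, len(doc_hyp) - window + 1), stride):
        src_chunk = " ".join(doc_src[start:start + window])
        hyp_chunk = " ".join(doc_hyp[start:start + window])
        scores.append(qe_model(src_chunk, hyp_chunk))
    return sum(scores) / len(scores)
```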

S3-DST: Structured Open-Domain Dialogue Segmentation and State Tracking in the Era of LLMs

  • paper_url: http://arxiv.org/abs/2309.08827
  • repo_url: None
  • paper_authors: Sarkar Snigdha Sarathi Das, Chirag Shah, Mengting Wan, Jennifer Neville, Longqi Yang, Reid Andersen, Georg Buscher, Tara Safavi
  • for: 提高open-domain对话系统中 Dialogue State Tracking(DST)的精度和 robustness,以适应大语言模型(LLM)驱动的对话系统中的复杂性和多样性。
  • methods: 提出了一种joint dialogue segmentation和state tracking的方法,使用Pre-Analytical Recollection机制来改进长期上下文跟踪。
  • results: 在一个匿名的专有开放域对话数据集以及公开可用的DST和分段数据集上进行了评估,并与当前最佳方法比较;结果表明S3-DST在联合分段与状态跟踪上表现出强大而稳健的性能。
    Abstract The traditional Dialogue State Tracking (DST) problem aims to track user preferences and intents in user-agent conversations. While sufficient for task-oriented dialogue systems supporting narrow domain applications, the advent of Large Language Model (LLM)-based chat systems has introduced many real-world intricacies in open-domain dialogues. These intricacies manifest in the form of increased complexity in contextual interactions, extended dialogue sessions encompassing a diverse array of topics, and more frequent contextual shifts. To handle these intricacies arising from evolving LLM-based chat systems, we propose joint dialogue segmentation and state tracking per segment in open-domain dialogue systems. Assuming a zero-shot setting appropriate to a true open-domain dialogue system, we propose S3-DST, a structured prompting technique that harnesses Pre-Analytical Recollection, a novel grounding mechanism we designed for improving long context tracking. To demonstrate the efficacy of our proposed approach in joint segmentation and state tracking, we evaluate S3-DST on a proprietary anonymized open-domain dialogue dataset, as well as publicly available DST and segmentation datasets. Across all datasets and settings, S3-DST consistently outperforms the state-of-the-art, demonstrating its potency and robustness the next generation of LLM-based chat systems.
    摘要 传统的对话状态跟踪(DST)问题旨在跟踪用户与智能体对话中的用户偏好和意图。对于支持窄领域应用的任务型对话系统而言这已足够,但基于大语言模型(LLM)的聊天系统的出现给开放域对话带来了许多现实世界中的复杂性。这些复杂性表现为上下文交互更加复杂、对话会话更长且涵盖多样话题,以及更频繁的上下文转换。为应对不断演进的LLM聊天系统带来的这些复杂性,我们提出在开放域对话系统中按分段进行联合的对话分段与状态跟踪。在适合真正开放域对话系统的零样本设定下,我们提出一种结构化提示技术S3-DST,其利用我们为改进长上下文跟踪而设计的新型接地机制Pre-Analytical Recollection。为证明所提方法在联合分段与状态跟踪上的有效性,我们在一个匿名的专有开放域对话数据集以及公开可用的DST和分段数据集上评估了S3-DST。在所有数据集和设定下,S3-DST均稳定地优于当前最佳方法,展示了其面向下一代LLM聊天系统的有效性与鲁棒性。
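
The structured-prompting idea can be illustrated with a template that asks the model to restate each turn before labeling it (mimicking Pre-Analytical Recollection) and to emit segment ids and per-segment state together. The template below is an illustrative assumption; the paper's actual prompt is not reproduced here.

```python
def build_s3dst_style_prompt(turns, topics_so_far=None):
    """Illustrative structured prompt for joint segmentation + state tracking.
    The 'recap each turn before deciding' step mimics the Pre-Analytical
    Recollection idea; the exact wording and output schema are assumptions."""
    numbered = "\n".join(f"T{i}: {t}" for i, t in enumerate(turns, 1))
    state = ", ".join(topics_so_far or []) or "none"
    return (
        "You will segment an open-domain dialogue and track its state.\n"
        f"Known topics so far: {state}\n"
        f"Dialogue:\n{numbered}\n\n"
        "For EACH turn, first restate it in one short sentence (recollection), then output:\n"
        "1. segment_id: increment it whenever the topic shifts\n"
        "2. state: a short slot/value summary for the current segment\n"
        "Return one JSON object per turn with keys: turn, recollection, segment_id, state."
    )

# usage sketch:
# prompt = build_s3dst_style_prompt(["hi, need a laptop for gaming", "budget is 1200", "any hiking trails nearby?"])
```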

Distributionally Robust Post-hoc Classifiers under Prior Shifts

  • paper_url: http://arxiv.org/abs/2309.08825
  • repo_url: https://github.com/weijiaheng/drops
  • paper_authors: Jiaheng Wei, Harikrishna Narasimhan, Ehsan Amid, Wen-Sheng Chu, Yang Liu, Abhishek Kumar
  • for: 本研究旨在增强机器学习模型在类别先验或分组先验发生变化(先验偏移)时的分布鲁棒性。
  • methods: 我们提出一种极其轻量的后处理方法:在验证集上求解一个带约束的优化问题,得到对预训练模型预测结果的缩放调整,并在测试时应用,以最小化围绕选定目标分布的分布鲁棒损失。
  • results: 该方法具有可证明的保证,并在实验中有力地证明了分布鲁棒后处理分类器的有效性。
    Abstract The generalization ability of machine learning models degrades significantly when the test distribution shifts away from the training distribution. We investigate the problem of training models that are robust to shifts caused by changes in the distribution of class-priors or group-priors. The presence of skewed training priors can often lead to the models overfitting to spurious features. Unlike existing methods, which optimize for either the worst or the average performance over classes or groups, our work is motivated by the need for finer control over the robustness properties of the model. We present an extremely lightweight post-hoc approach that performs scaling adjustments to predictions from a pre-trained model, with the goal of minimizing a distributionally robust loss around a chosen target distribution. These adjustments are computed by solving a constrained optimization problem on a validation set and applied to the model during test time. Our constrained optimization objective is inspired by a natural notion of robustness to controlled distribution shifts. Our method comes with provable guarantees and empirically makes a strong case for distributional robust post-hoc classifiers. An empirical implementation is available at https://github.com/weijiaheng/Drops.
    摘要 当测试分布偏离训练分布时,机器学习模型的泛化能力会显著下降。我们研究如何训练对类别先验或分组先验变化所导致的分布偏移具有鲁棒性的模型。训练先验的偏斜往往会导致模型过拟合虚假特征。与现有方法只针对各类别或各分组的最差或平均性能进行优化不同,我们的工作旨在对模型的鲁棒性性质进行更精细的控制。我们提出一种极其轻量的后处理方法,对预训练模型的预测进行缩放调整,目标是最小化围绕所选目标分布的分布鲁棒损失。这些调整通过在验证集上求解一个带约束的优化问题得到,并在测试时应用于模型。我们的约束优化目标源于对受控分布偏移具有鲁棒性这一自然概念。该方法具有可证明的保证,并且在实验上有力地支持了分布鲁棒的后处理分类器。实现代码见 https://github.com/weijiaheng/Drops 。
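
A minimal post-hoc scaling loop in the spirit of the paper is sketched below: per-class log-weights are fitted on a validation set and applied to the frozen model's predictions at test time. The heuristic update rule shown here is a stand-in; the paper solves a constrained optimization problem with provable guarantees, and the linked repository contains the actual implementation.

```python
import numpy as np

def apply_scaling(probs, w):
    """Rescale predicted class probabilities with per-class log-weights w and renormalize."""
    adj = probs * np.exp(w)
    return adj / adj.sum(axis=1, keepdims=True)

def fit_posthoc_scaling(probs_val, y_val, n_classes, steps=200, lr=0.1):
    """Crude heuristic for fitting per-class post-hoc scaling on a validation set.

    probs_val : (N, C) predicted probabilities from the frozen model
    y_val     : (N,) integer labels
    At each step, classes whose adjusted predictions have above-average error
    get their weight raised.  This loop is only an illustrative stand-in for
    the paper's constrained DRO objective."""
    w = np.zeros(n_classes)
    for _ in range(steps):
        preds = apply_scaling(probs_val, w).argmax(axis=1)
        errs = np.array([1.0 - (preds[y_val == c] == c).mean()
                         if np.any(y_val == c) else 0.0
                         for c in range(n_classes)])
        w += lr * (errs - errs.mean())   # push up under-served classes
    return w

# test-time usage sketch: adjusted = apply_scaling(model_probs_test, w)
```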

GPT as a Baseline for Recommendation Explanation Texts

  • paper_url: http://arxiv.org/abs/2309.08817
  • repo_url: None
  • paper_authors: Joyce Zhou, Thorsten Joachims
  • for: 这个研究探讨了现代模型生成的电影推荐文本解释如何帮助用户,以及用户对不同组成部分的评价。
  • methods: 研究使用现代自然语言处理技术生成电影推荐文本解释,并对用户的评价进行分析。
  • results: 研究发现,参与者对不同电影推荐解释文本的评分没有显著差异;但对于自己看过的电影,参与者给出的评分更高。此外,参与者还标记了电影评论文本中对各项质量重要的方面。
    Abstract In this work, we establish a baseline potential for how modern model-generated text explanations of movie recommendations may help users, and explore what different components of these text explanations that users like or dislike, especially in contrast to existing human movie reviews. We found that participants gave no significantly different rankings between movies, nor did they give significantly different individual quality scores to reviews of movies that they had never seen before. However, participants did mark reviews as significantly better when they were movies they had seen before. We also explore specific aspects of movie review texts that participants marked as important for each quality. Overall, we establish that modern LLMs are a promising source of recommendation explanations, and we intend on further exploring personalizable text explanations in the future.
    摘要 在这项工作中,我们为现代模型生成的电影推荐文本解释能如何帮助用户建立了一个基线,并探究用户喜欢或不喜欢这些文本解释的哪些组成部分,特别是与现有的人工撰写影评相比。我们发现,参与者对不同电影的排名没有显著差异,对未看过的电影的评论也没有给出显著不同的单项质量评分;但当评论对应的是他们看过的电影时,参与者会将其标记为显著更好。我们还探究了参与者认为对各项质量重要的影评文本的具体方面。总体而言,我们确认现代LLM是有前景的推荐解释来源,并计划在未来进一步探索可个性化的文本解释。