cs.AI - 2023-07-31

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs

  • paper_url: http://arxiv.org/abs/2307.16789
  • repo_url: https://github.com/openbmb/toolbench
  • paper_authors: Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, Sihan Zhao, Runchu Tian, Ruobing Xie, Jie Zhou, Mark Gerstein, Dahai Li, Zhiyuan Liu, Maosong Sun
  • for: The paper aims to improve the tool-use capabilities of open-source large language models (LLMs) by introducing a general framework called ToolLLM.
  • methods: The framework covers data construction, model training, and evaluation, including the creation of an instruction-tuning dataset called ToolBench and a novel depth-first search-based decision tree (DFSDT) that enhances planning and reasoning capabilities.
  • results: Fine-tuning LLaMA on ToolBench yields ToolLLaMA, which shows a remarkable ability to execute complex instructions, generalizes to unseen APIs, and performs comparably to ChatGPT; the paper also proposes a neural API retriever to recommend appropriate APIs for each instruction.
    Abstract Despite the advancements of open-source large language models (LLMs) and their variants, e.g., LLaMA and Vicuna, they remain significantly limited in performing higher-level tasks, such as following human instructions to use external tools (APIs). This is because current instruction tuning largely focuses on basic language tasks instead of the tool-use domain. This is in contrast to state-of-the-art (SOTA) LLMs, e.g., ChatGPT, which have demonstrated excellent tool-use capabilities but are unfortunately closed source. To facilitate tool-use capabilities within open-source LLMs, we introduce ToolLLM, a general tool-use framework of data construction, model training and evaluation. We first present ToolBench, an instruction-tuning dataset for tool use, which is created automatically using ChatGPT. Specifically, we collect 16,464 real-world RESTful APIs spanning 49 categories from RapidAPI Hub, then prompt ChatGPT to generate diverse human instructions involving these APIs, covering both single-tool and multi-tool scenarios. Finally, we use ChatGPT to search for a valid solution path (chain of API calls) for each instruction. To make the searching process more efficient, we develop a novel depth-first search-based decision tree (DFSDT), enabling LLMs to evaluate multiple reasoning traces and expand the search space. We show that DFSDT significantly enhances the planning and reasoning capabilities of LLMs. For efficient tool-use assessment, we develop an automatic evaluator: ToolEval. We fine-tune LLaMA on ToolBench and obtain ToolLLaMA. Our ToolEval reveals that ToolLLaMA demonstrates a remarkable ability to execute complex instructions and generalize to unseen APIs, and exhibits comparable performance to ChatGPT. To make the pipeline more practical, we devise a neural API retriever to recommend appropriate APIs for each instruction, negating the need for manual API selection.
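The DFSDT mentioned in the abstract lets the model branch over alternative API-call traces and backtrack from dead ends instead of committing to a single chain. A minimal sketch of that search pattern is below; `propose_next_calls`, `execute_call`, and `is_solved` are hypothetical placeholders for the LLM proposing candidate calls, the API being invoked, and the evaluator judging success, so this illustrates the idea rather than the ToolLLM implementation.

```python
# Minimal sketch of a depth-first search over API-call traces, in the spirit of
# DFSDT as described in the abstract. `propose_next_calls`, `execute_call`, and
# `is_solved` are hypothetical placeholders standing in for the LLM proposing
# candidate calls, the API being invoked, and the evaluator judging success.
from typing import Callable, List, Optional

def dfs_decision_tree(
    instruction: str,
    trace: List[str],
    propose_next_calls: Callable[[str, List[str]], List[str]],
    execute_call: Callable[[str], str],
    is_solved: Callable[[str, List[str]], bool],
    max_depth: int = 8,
) -> Optional[List[str]]:
    """Return a chain of API calls that solves `instruction`, or None."""
    if is_solved(instruction, trace):
        return trace
    if len(trace) >= max_depth:
        return None  # give up on this branch and backtrack
    for call in propose_next_calls(instruction, trace):
        observation = execute_call(call)
        result = dfs_decision_tree(
            instruction, trace + [f"{call} -> {observation}"],
            propose_next_calls, execute_call, is_solved, max_depth,
        )
        if result is not None:
            return result  # first successful branch wins
    return None  # all branches at this node failed; the caller backtracks
```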

The Ethics of AI Value Chains: An Approach for Integrating and Expanding AI Ethics Research, Practice, and Governance

  • paper_url: http://arxiv.org/abs/2307.16787
  • repo_url: None
  • paper_authors: Blair Attard-Frost, David Gray Widder
  • for: This study seeks new approaches to AI ethics research and practice that can account for, and intervene in, the ethical and practical issues arising in the design, development, use, and governance of AI systems across multiple actors, contexts, and scales of activity.
  • methods: The paper positions AI value chains as an integrative concept covering the ethical and practical implications of AI systems. It reviews and synthesizes theoretical perspectives on value chains from the strategic management, service science, and economic geography literature, and then reviews perspectives on AI value chains from the academic, industry, and policy literature.
  • results: By connecting an inventory of ethical concerns in AI to the actors and resourcing activities involved in AI value chains, the paper shows that approaching AI ethics issues as value chain issues enables more comprehensive and integrative research and governance practices. It also suggests five future directions for researchers, practitioners, and policymakers to investigate and intervene in the ethical concerns associated with AI value chains.
    Abstract Recent criticisms of AI ethics principles and practices have indicated a need for new approaches to AI ethics that can account for and intervene in the design, development, use, and governance of AI systems across multiple actors, contexts, and scales of activity. This paper positions AI value chains as an integrative concept that satisfies those needs, enabling AI ethics researchers, practitioners, and policymakers to take a more comprehensive view of the ethical and practical implications of AI systems. We review and synthesize theoretical perspectives on value chains from the literature on strategic management, service science, and economic geography. We then review perspectives on AI value chains from the academic, industry, and policy literature. We connect an inventory of ethical concerns in AI to the actors and resourcing activities involved in AI value chains to demonstrate that approaching AI ethics issues as value chain issues can enable more comprehensive and integrative research and governance practices. We illustrate this by suggesting five future directions for researchers, practitioners, and policymakers to investigate and intervene in the ethical concerns associated with AI value chains.

Ranking-based Argumentation Semantics Applied to Logical Argumentation (full version)

  • paper_url: http://arxiv.org/abs/2307.16780
  • repo_url: None
  • paper_authors: Jesse Heyninck, Badran Raddaoui, Christian Straßer
  • for: The paper concerns the distinction between extension-based and ranking-based semantics in formal argumentation, and in particular how ranking-based semantics behave when applied to structured argumentation formalisms.
  • methods: The paper carries out a systematic investigation of ranking-based semantics applied to existing formalisms for structured argumentation, considering a variety of argument construction methods.
  • results: The study finds that a wide class of ranking-based semantics gives rise to so-called culpability measures, quantitative scores that grade the outcomes of argumentation, and that these semantics are relatively robust to specific choices in argument construction methods.
    Abstract In formal argumentation, a distinction can be made between extension-based semantics, where sets of arguments are either (jointly) accepted or not, and ranking-based semantics, where grades of acceptability are assigned to arguments. Another important distinction is that between abstract approaches, that abstract away from the content of arguments, and structured approaches, that specify a method of constructing argument graphs on the basis of a knowledge base. While ranking-based semantics have been extensively applied to abstract argumentation, little work has been done on ranking-based semantics for structured argumentation. In this paper, we make a systematic investigation into the behaviour of ranking-based semantics applied to existing formalisms for structured argumentation. We show that a wide class of ranking-based semantics gives rise to so-called culpability measures, and is relatively robust to specific choices in argument construction methods.

KoBBQ: Korean Bias Benchmark for Question Answering

  • paper_url: http://arxiv.org/abs/2307.16778
  • repo_url: None
  • paper_authors: Jiho Jin, Jiseon Kim, Nayeon Lee, Haneul Yoo, Alice Oh, Hwaran Lee
  • for: This work builds a social-bias benchmark dataset for question-answering tasks in Korean, in order to evaluate how the biases of language models (LMs) manifest in a different cultural context.
  • methods: Starting from the English BBQ dataset, the authors apply a culturally adaptive process that sorts BBQ samples into three classes: Simply-Translated (usable directly after cultural translation), Target-Modified (requiring localization of target groups), and Sample-Removed (not fitting Korean culture). They further add four new bias categories specific to Korean culture and create new samples based on Korean literature.
  • results: Using KoBBQ, the authors measure the accuracy and bias scores of several multilingual LMs and find that the models' biases differ between Korean and English, underscoring the need for hand-crafted data that accounts for cultural differences.
    Abstract The BBQ (Bias Benchmark for Question Answering) dataset enables the evaluation of the social biases that language models (LMs) exhibit in downstream tasks. However, it is challenging to adapt BBQ to languages other than English as social biases are culturally dependent. In this paper, we devise a process to construct a non-English bias benchmark dataset by leveraging the English BBQ dataset in a culturally adaptive way and present the KoBBQ dataset for evaluating biases in Question Answering (QA) tasks in Korean. We identify samples from BBQ into three classes: Simply-Translated (can be used directly after cultural translation), Target-Modified (requires localization in target groups), and Sample-Removed (does not fit Korean culture). We further enhance the cultural relevance to Korean culture by adding four new categories of bias specific to Korean culture and newly creating samples based on Korean literature. KoBBQ consists of 246 templates and 4,740 samples across 12 categories of social bias. Using KoBBQ, we measure the accuracy and bias scores of several state-of-the-art multilingual LMs. We demonstrate the differences in the bias of LMs in Korean and English, clarifying the need for hand-crafted data considering cultural differences.

AsdKB: A Chinese Knowledge Base for the Early Screening and Diagnosis of Autism Spectrum Disorder

  • paper_url: http://arxiv.org/abs/2307.16773
  • repo_url: None
  • paper_authors: Tianxing Wu, Xudong Cao, Yipeng Zhu, Feiyue Wu, Tianling Gong, Yuxiang Wang, Shenqi Jing
  • for: The goal of this paper is to create a Chinese knowledge base (AsdKB) that supports the early screening and diagnosis of Autism Spectrum Disorder (ASD).
  • methods: The knowledge base is built from multiple sources, including disease knowledge from SNOMED CT and the ICD-10 clinical descriptions of mental and behavioural disorders, diagnostic knowledge from DSM-5 and the screening tools recommended by social organizations and medical institutes, and expert knowledge about professional physicians and hospitals from the Web.
  • results: The resulting knowledge base contains both ontological and factual knowledge and can be used for question answering, auxiliary diagnosis, and expert recommendation; a prototype built on top of it is available at http://asdkb.org.cn/.
    Abstract To easily obtain the knowledge about autism spectrum disorder and help its early screening and diagnosis, we create AsdKB, a Chinese knowledge base on autism spectrum disorder. The knowledge base is built on top of various sources, including 1) the disease knowledge from SNOMED CT and ICD-10 clinical descriptions on mental and behavioural disorders, 2) the diagnostic knowledge from DSM-5 and different screening tools recommended by social organizations and medical institutes, and 3) the expert knowledge on professional physicians and hospitals from the Web. AsdKB contains both ontological and factual knowledge, and is accessible as Linked Data at https://w3id.org/asdkb/. The potential applications of AsdKB are question answering, auxiliary diagnosis, and expert recommendation, and we illustrate them with a prototype which can be accessed at http://asdkb.org.cn/.

Advancing Smart Malnutrition Monitoring: A Multi-Modal Learning Approach for Vital Health Parameter Estimation

  • paper_url: http://arxiv.org/abs/2307.16745
  • repo_url: None
  • paper_authors: Ashish Marisetty, Prathistith Raj M, Praneeth Nemani, Venkanna Udutalapally, Debanjan Das
  • for: This work develops a smart malnutrition-monitoring system that estimates vital health parameters in order to identify malnutrition and support personalized nutrition plans.
  • methods: The system uses a multi-modal learning framework that takes a single full-body image, reconstructs a highly precise 3D point cloud, and extracts 512-dimensional feature embeddings with a headless 3D classification network; facial and body embeddings are combined through learnable parameters to estimate height and weight accurately.
  • results: The model is robust to a wide range of lighting conditions across multiple devices and achieves a low Mean Absolute Error (MAE) of ±4.7 cm for height and ±5.3 kg for weight.
    Abstract Malnutrition poses a significant threat to global health, resulting from an inadequate intake of essential nutrients that adversely impacts vital organs and overall bodily functioning. Periodic examinations and mass screenings, incorporating both conventional and non-invasive techniques, have been employed to combat this challenge. However, these approaches suffer from critical limitations, such as the need for additional equipment, lack of comprehensive feature representation, absence of suitable health indicators, and the unavailability of smartphone implementations for precise estimations of Body Fat Percentage (BFP), Basal Metabolic Rate (BMR), and Body Mass Index (BMI) to enable efficient smart-malnutrition monitoring. To address these constraints, this study presents a groundbreaking, scalable, and robust smart malnutrition-monitoring system that leverages a single full-body image of an individual to estimate height, weight, and other crucial health parameters within a multi-modal learning framework. Our proposed methodology involves the reconstruction of a highly precise 3D point cloud, from which 512-dimensional feature embeddings are extracted using a headless-3D classification network. Concurrently, facial and body embeddings are also extracted, and through the application of learnable parameters, these features are then utilized to estimate weight accurately. Furthermore, essential health metrics, including BMR, BFP, and BMI, are computed to conduct a comprehensive analysis of the subject's health, subsequently facilitating the provision of personalized nutrition plans. While being robust to a wide range of lighting conditions across multiple devices, our model achieves a low Mean Absolute Error (MAE) of $\pm$ 4.7 cm and $\pm$ 5.3 kg in estimating height and weight.
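The abstract states that BMI, BMR, and BFP are computed from the estimated height and weight. The sketch below shows what such a post-processing step could look like using textbook formulas (standard BMI, the Mifflin-St Jeor equation for BMR, and the Deurenberg formula for BFP); the paper does not specify which equations it uses, so these choices are assumptions.

```python
# Illustrative post-processing of the estimated height and weight into health
# metrics. The formulas (standard BMI, Mifflin-St Jeor BMR, Deurenberg BFP) are
# common textbook choices and are assumptions here: the paper does not state
# which equations it uses.

def health_metrics(height_cm: float, weight_kg: float, age: int, male: bool) -> dict:
    bmi = weight_kg / (height_cm / 100.0) ** 2                                   # kg/m^2
    bmr = 10 * weight_kg + 6.25 * height_cm - 5 * age + (5 if male else -161)    # kcal/day
    bfp = 1.20 * bmi + 0.23 * age - 10.8 * (1 if male else 0) - 5.4              # percent
    return {"BMI": round(bmi, 1), "BMR": round(bmr, 0), "BFP": round(bfp, 1)}

# Example: a 30-year-old male estimated at 175 cm and 70 kg.
print(health_metrics(175.0, 70.0, 30, male=True))
```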

Hybrid quantum transfer learning for crack image classification on NISQ hardware

  • paper_url: http://arxiv.org/abs/2307.16723
  • repo_url: None
  • paper_authors: Alexander Geng, Ali Moghiseh, Claudia Redenbach, Katja Schladitz
  • for: The paper explores the application of quantum computing to image processing.
  • methods: The authors apply hybrid quantum transfer learning to detect cracks in gray-value images.
  • results: Quantum transfer learning can detect cracks in gray-value images, and the study compares the performance and training time of PennyLane's standard qubits with IBM's qasm_simulator and real backends, offering insights into their execution efficiency.
    Abstract Quantum computers possess the potential to process data using a remarkably reduced number of qubits compared to conventional bits, as per theoretical foundations. However, recent experiments have indicated that the practical feasibility of retrieving an image from its quantum encoded version is currently limited to very small image sizes. Despite this constraint, variational quantum machine learning algorithms can still be employed in the current noisy intermediate scale quantum (NISQ) era. An example is a hybrid quantum machine learning approach for edge detection. In our study, we present an application of quantum transfer learning for detecting cracks in gray value images. We compare the performance and training time of PennyLane's standard qubits with IBM's qasm_simulator and real backends, offering insights into their execution efficiency.
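As an illustration of the hybrid quantum transfer learning setup described in the abstract, the sketch below stacks a small variational quantum layer between classical layers using PennyLane's PyTorch interface. The layer sizes, the frozen feature extractor, and the circuit template are assumptions for illustration, not the paper's actual configuration.

```python
# Sketch of a hybrid quantum-classical classifier of the kind used in quantum
# transfer learning, written with PennyLane + PyTorch. Layer sizes, the frozen
# feature extractor, and the dataset are placeholders, not the paper's setup.
import pennylane as qml
import torch
from torch import nn

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)  # a simulator; IBM backends need a plugin

@qml.qnode(dev, interface="torch")
def quantum_circuit(inputs, weights):
    qml.AngleEmbedding(inputs, wires=range(n_qubits))             # encode classical features
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))  # trainable variational block
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

weight_shapes = {"weights": qml.StronglyEntanglingLayers.shape(n_layers=2, n_wires=n_qubits)}

model = nn.Sequential(
    nn.Linear(512, n_qubits),                            # head on top of frozen CNN features
    nn.Tanh(),                                           # keep inputs in a range fit for angles
    qml.qnn.TorchLayer(quantum_circuit, weight_shapes),  # variational quantum layer
    nn.Linear(n_qubits, 2),                              # crack / no-crack logits
)

features = torch.randn(8, 512)   # stand-in for CNN embeddings of 8 image patches
print(model(features).shape)     # torch.Size([8, 2])
```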

TFE-GNN: A Temporal Fusion Encoder Using Graph Neural Networks for Fine-grained Encrypted Traffic Classification

  • paper_url: http://arxiv.org/abs/2307.16713
  • repo_url: https://github.com/ViktorAxelsen/TFE-GNN
  • paper_authors: Haozhen Zhang, Le Yu, Xi Xiao, Qing Li, Francesco Mercaldo, Xiapu Luo, Qixu Liu
  • for: The paper focuses on encrypted traffic classification, in particular the handling of short flows.
  • methods: The paper proposes a byte-level traffic graph construction approach based on point-wise mutual information (PMI) and a model named Temporal Fusion Encoder using Graph Neural Networks (TFE-GNN) for feature extraction. In particular, it designs a dual embedding layer, a GNN-based traffic graph encoder, and a cross-gated feature fusion mechanism that first embeds the header and payload bytes separately and then fuses them into a stronger feature representation.
  • results: Experiments on two real-world datasets show that TFE-GNN outperforms multiple state-of-the-art methods on fine-grained encrypted traffic classification.
    Abstract Encrypted traffic classification is receiving widespread attention from researchers and industrial companies. However, the existing methods only extract flow-level features, failing to handle short flows because of unreliable statistical properties, or treat the header and payload equally, failing to mine the potential correlation between bytes. Therefore, in this paper, we propose a byte-level traffic graph construction approach based on point-wise mutual information (PMI), and a model named Temporal Fusion Encoder using Graph Neural Networks (TFE-GNN) for feature extraction. In particular, we design a dual embedding layer, a GNN-based traffic graph encoder as well as a cross-gated feature fusion mechanism, which can first embed the header and payload bytes separately and then fuses them together to obtain a stronger feature representation. The experimental results on two real datasets demonstrate that TFE-GNN outperforms multiple state-of-the-art methods in fine-grained encrypted traffic classification tasks.
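The byte-level graph construction based on point-wise mutual information (PMI) can be illustrated with a small sliding-window co-occurrence count over the raw bytes, keeping an edge between two byte values when their PMI is positive. The window size and threshold below are assumptions; the paper's exact procedure may differ.

```python
# Sketch of a byte-level graph built from point-wise mutual information (PMI)
# over sliding windows of a packet's bytes. Window size and thresholding are
# assumptions; the paper's exact construction may differ.
import math
from collections import Counter
from itertools import combinations

def pmi_byte_graph(byte_seq: bytes, window: int = 5, threshold: float = 0.0):
    """Return edges (byte_i, byte_j, pmi) for byte values with PMI above threshold."""
    total = max(len(byte_seq) - window + 1, 1)
    single, pair = Counter(), Counter()
    for start in range(total):
        chunk = set(byte_seq[start:start + window])        # distinct byte values in the window
        single.update(chunk)
        pair.update(frozenset(p) for p in combinations(sorted(chunk), 2))

    edges = []
    for key, n_ij in pair.items():
        i, j = sorted(key)
        pmi = math.log((n_ij / total) / ((single[i] / total) * (single[j] / total)))
        if pmi > threshold:                                 # keep positively associated pairs
            edges.append((i, j, pmi))
    return edges

print(len(pmi_byte_graph(b"\x16\x03\x01\x02\x00\x01\x00\x01\xfc\x03\x03" * 20)))
```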

An O.D.E. Framework of Distributed TD-Learning for Networked Multi-Agent Markov Decision Processes

  • paper_url: http://arxiv.org/abs/2307.16706
  • repo_url: None
  • paper_authors: Donghwan Lee, Han-Dong Lim, Do Wan Kim
  • for: The primary objective is to investigate distributed ordinary differential equations (ODEs) and distributed temporal-difference (TD) learning algorithms for networked multi-agent Markov decision problems (MAMDPs).
  • methods: The authors adopt a distributed multi-agent framework in which each agent has access only to its own reward and no information about the rewards of other agents; each agent can share its parameters with neighboring agents through a communication network represented by a graph.
  • results: The contributions are twofold: 1) novel distributed ODEs inspired by the averaging consensus method in the continuous-time domain, whose convergence is assessed from a control-theoretic perspective; and 2) new distributed TD-learning algorithms built on these ODEs. One of the proposed ODEs incorporates two independent dynamic systems, each with a distinct role, which enables a novel distributed TD-learning strategy whose convergence can potentially be established via the Borkar-Meyn theorem.
    Abstract The primary objective of this paper is to investigate distributed ordinary differential equation (ODE) and distributed temporal difference (TD) learning algorithms for networked multi-agent Markov decision problems (MAMDPs). In our study, we adopt a distributed multi-agent framework where individual agents have access only to their own rewards, lacking insights into the rewards of other agents. Additionally, each agent has the ability to share its parameters with neighboring agents through a communication network, represented by a graph. Our contributions can be summarized in two key points: 1) We introduce novel distributed ODEs, inspired by the averaging consensus method in the continuous-time domain. The convergence of the ODEs is assessed through control theory perspectives. 2) Building upon the aforementioned ODEs, we devise new distributed TD-learning algorithms. A standout feature of one of our proposed distributed ODEs is its incorporation of two independent dynamic systems, each with a distinct role. This characteristic sets the stage for a novel distributed TD-learning strategy, the convergence of which can potentially be established using Borkar-Meyn theorem.
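For intuition about the setting, the sketch below shows a generic consensus-style distributed TD(0) step with linear function approximation: each agent updates on its own reward only and then averages parameters with its graph neighbors. This is a textbook-style combination meant only to illustrate the problem structure, not the ODE-based algorithms proposed in the paper.

```python
# Minimal sketch of consensus-style distributed TD(0) with linear function
# approximation: each agent takes a local TD step on its own reward and then
# averages parameters with its graph neighbors. A generic illustration of the
# setting, not the specific ODE-based algorithm proposed in the paper.
import numpy as np

def distributed_td0_step(theta, neighbors, features, rewards, next_features,
                         gamma=0.99, alpha=0.05):
    """theta: (N, d) parameters of N agents; neighbors: adjacency lists (incl. self)."""
    N, d = theta.shape
    updated = np.empty_like(theta)
    for i in range(N):
        # Local TD(0) update using only agent i's own reward.
        v, v_next = features[i] @ theta[i], next_features[i] @ theta[i]
        td_error = rewards[i] + gamma * v_next - v
        updated[i] = theta[i] + alpha * td_error * features[i]
    # Consensus step: average parameters over the communication graph.
    return np.stack([updated[neighbors[i]].mean(axis=0) for i in range(N)])

# Tiny example: 3 agents on a line graph, 4-dimensional features.
rng = np.random.default_rng(0)
theta = np.zeros((3, 4))
neighbors = [[0, 1], [0, 1, 2], [1, 2]]
theta = distributed_td0_step(theta, neighbors,
                             rng.normal(size=(3, 4)), rng.normal(size=3),
                             rng.normal(size=(3, 4)))
print(theta.shape)  # (3, 4)
```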

Lookbehind Optimizer: k steps back, 1 step forward

  • paper_url: http://arxiv.org/abs/2307.16704
  • repo_url: None
  • paper_authors: Gonçalo Mordido, Pranshu Malviya, Aristide Baratin, Sarath Chandar
  • for: To improve the stability of deep neural network training, using fast weights to guide the descent direction, and to improve the trade-off between loss and sharpness.
  • methods: The paper combines the Lookahead optimizer with Sharpness-Aware Minimization (SAM) to stabilize SAM's multi-step variant: Lookbehind computes k gradient ascent steps ("looking behind") at each iteration and combines the gradients to bias the descent step toward flatter minima.
  • results: Applied on top of SAM and adaptive SAM (ASAM), Lookbehind brings benefits across a variety of tasks and training regimes, including improved generalization, greater robustness against noisy weights, and higher tolerance to catastrophic forgetting in lifelong learning settings.
    Abstract The Lookahead optimizer improves the training stability of deep neural networks by having a set of fast weights that "look ahead" to guide the descent direction. Here, we combine this idea with sharpness-aware minimization (SAM) to stabilize its multi-step variant and improve the loss-sharpness trade-off. We propose Lookbehind, which computes $k$ gradient ascent steps ("looking behind") at each iteration and combine the gradients to bias the descent step toward flatter minima. We apply Lookbehind on top of two popular sharpness-aware training methods -- SAM and adaptive SAM (ASAM) -- and show that our approach leads to a myriad of benefits across a variety of tasks and training regimes. Particularly, we show increased generalization performance, greater robustness against noisy weights, and higher tolerance to catastrophic forgetting in lifelong learning settings.
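A simplified reading of the mechanism in the abstract is sketched below: take k small gradient ascent steps from the current weights, collect the gradient at each perturbed point, and use their average for a single descent step. Per-step normalization details and the Lookahead-style slow weights are omitted, so this is an interpretation for illustration rather than the paper's exact algorithm.

```python
# A toy-scale interpretation of the abstract's description: k gradient ascent
# steps from the current weights, gradients collected at each perturbed point,
# and their average used for one descent step. Not the paper's exact algorithm.
import numpy as np

def lookbehind_step(w, grad_fn, k=3, rho=0.05, lr=0.05):
    """w: parameter vector; grad_fn(w) -> gradient of the loss at w."""
    perturbed = w.copy()
    collected = []
    for _ in range(k):
        g = grad_fn(perturbed)
        perturbed = perturbed + rho * g / (np.linalg.norm(g) + 1e-12)  # ascent step
        collected.append(grad_fn(perturbed))   # gradient at the perturbed point
    descent_grad = np.mean(collected, axis=0)  # combine the k "behind" gradients
    return w - lr * descent_grad               # descend from the original weights

# Example on a sharp 1-D loss f(w) = w^2 + 0.3*sin(25*w).
grad = lambda w: 2 * w + 7.5 * np.cos(25 * w)
w = np.array([1.0])
for _ in range(100):
    w = lookbehind_step(w, grad)
print(w)  # final iterate after 100 Lookbehind-style steps
```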

Ontology engineering with Large Language Models

  • paper_url: http://arxiv.org/abs/2307.16699
  • repo_url: https://github.com/jettbrains/-L-
  • paper_authors: Patricia Mateiu, Adrian Groza
  • for: To enrich ontologies by automatically translating natural language sentences into Description Logic.
  • methods: Large language models (LLMs) are used for the translation: a GPT-3 model is fine-tuned to convert natural language sentences into OWL Functional Syntax, using objective and concise examples covering instances, class subsumption, domain and range of relations, object property relationships, disjoint classes, complements, and cardinality restrictions.
  • results: The resulting axioms are used to enrich an ontology in a human-supervised manner, and the tool is publicly provided as a Protégé plugin.
    Abstract We tackle the task of enriching ontologies by automatically translating natural language sentences into Description Logic. Since Large Language Models (LLMs) are the best tools for translations, we fine-tuned a GPT-3 model to convert Natural Language sentences into OWL Functional Syntax. We employ objective and concise examples to fine-tune the model regarding: instances, class subsumption, domain and range of relations, object properties relationships, disjoint classes, complements, cardinality restrictions. The resulting axioms are used to enrich an ontology, in a human-supervised manner. The developed tool is publicly provided as a Protégé plugin.
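A minimal sketch of the translation step, i.e., prompting a fine-tuned model to emit an OWL Functional Syntax axiom for a sentence, is shown below. The `call_llm` function, the prompt wording, and the example axiom are hypothetical placeholders; only the general pattern (few-shot prompt in, axiom out, human review before adding it to the ontology) reflects the abstract.

```python
# Sketch of turning a natural-language sentence into an OWL Functional Syntax
# axiom with a fine-tuned LLM. `call_llm` is a hypothetical placeholder for the
# fine-tuned model's completion endpoint; the few-shot example is illustrative.

def call_llm(prompt: str) -> str:
    # Placeholder: send `prompt` to the fine-tuned model and return its completion.
    raise NotImplementedError

FEW_SHOT = (
    "Sentence: Every pizza has at least one topping.\n"
    "Axiom: SubClassOf(:Pizza ObjectMinCardinality(1 :hasTopping :Topping))\n\n"
)

def sentence_to_axiom(sentence: str) -> str:
    prompt = FEW_SHOT + f"Sentence: {sentence}\nAxiom:"
    return call_llm(prompt).strip()  # reviewed by a human before enriching the ontology

# The prompt that would be sent for a new sentence:
print(FEW_SHOT + "Sentence: A margherita is a pizza.\nAxiom:")
```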

Anticipating Responsibility in Multiagent Planning

  • paper_url: http://arxiv.org/abs/2307.16685
  • repo_url: None
  • paper_authors: Timothy Parker, Umberto Grandi, Emiliano Lorini
  • for: The paper studies responsibility anticipation, i.e., determining whether an agent's actions in a multi-agent planning setting may make it responsible for a particular outcome.
  • methods: Outcomes to be attained or avoided are expressed as formulas in linear temporal logic, and notions of active, passive, and contributive responsibility (together with their agentive variants) are used to define responsibility anticipation; the planning setting includes partial information about the initial state.
  • results: The paper proves that anticipated responsibility can be used to coordinate agents in a planning setting, gives complexity results and equivalences with classical planning, and outlines a method for solving some of the attribution and anticipation problems using PDDL solvers.
    Abstract Responsibility anticipation is the process of determining if the actions of an individual agent may cause it to be responsible for a particular outcome. This can be used in a multi-agent planning setting to allow agents to anticipate responsibility in the plans they consider. The planning setting in this paper includes partial information regarding the initial state and considers formulas in linear temporal logic as positive or negative outcomes to be attained or avoided. We firstly define attribution for notions of active, passive and contributive responsibility, and consider their agentive variants. We then use these to define the notion of responsibility anticipation. We prove that our notions of anticipated responsibility can be used to coordinate agents in a planning setting and give complexity results for our model, discussing equivalence with classical planning. We also present an outline for solving some of our attribution and anticipation problems using PDDL solvers.

On the Trustworthiness Landscape of State-of-the-art Generative Models: A Comprehensive Survey

  • paper_url: http://arxiv.org/abs/2307.16680
  • repo_url: None
  • paper_authors: Mingyuan Fan, Cen Chen, Chengyu Wang, Jun Huang
  • for: This paper aims to investigate the trustworthiness of large-scale generative models, specifically addressing privacy, security, fairness, and responsibility concerns.
  • methods: The paper employs a comprehensive approach, analyzing both long-standing and emerging threats associated with these models across four fundamental dimensions.
  • results: The authors provide an extensive map outlining the trustworthiness of these models, as well as practical recommendations and future directions for promoting their trustworthy deployment.
    Abstract Diffusion models and large language models have emerged as leading-edge generative models and have sparked a revolutionary impact on various aspects of human life. However, the practical implementation of these models has also exposed inherent risks, highlighting their dual nature and raising concerns regarding their trustworthiness. Despite the abundance of literature on this subject, a comprehensive survey specifically delving into the intersection of large-scale generative models and their trustworthiness remains largely absent. To bridge this gap, This paper investigates both the long-standing and emerging threats associated with these models across four fundamental dimensions: privacy, security, fairness, and responsibility. In this way, we construct an extensive map outlining the trustworthiness of these models, while also providing practical recommendations and identifying future directions. These efforts are crucial for promoting the trustworthy deployment of these models, ultimately benefiting society as a whole.

Proactive Resource Request for Disaster Response: A Deep Learning-based Optimization Model

  • paper_url: http://arxiv.org/abs/2307.16661
  • repo_url: None
  • paper_authors: Hongzhe Zhang, Xiaohang Zhao, Xiao Fang, Bintong Chen
  • for: The study provides a method for optimizing the quantities of relief resources requested by a local agency, so that the demands of disaster-affected areas are met.
  • methods: A novel deep learning method is developed for future demand prediction; the problem is then formulated as a stochastic optimization model, key properties of the model are analyzed, and an effective solution method is proposed based on those properties.
  • results: On both real-world and simulated data, the proposed method outperforms prevalent existing methods, including in a multi-stakeholder and multi-objective setting evaluated through simulations.
    Abstract Disaster response is critical to save lives and reduce damages in the aftermath of a disaster. Fundamental to disaster response operations is the management of disaster relief resources. To this end, a local agency (e.g., a local emergency resource distribution center) collects demands from local communities affected by a disaster, dispatches available resources to meet the demands, and requests more resources from a central emergency management agency (e.g., Federal Emergency Management Agency in the U.S.). Prior resource management research for disaster response overlooks the problem of deciding optimal quantities of resources requested by a local agency. In response to this research gap, we define a new resource management problem that proactively decides optimal quantities of requested resources by considering both currently unfulfilled demands and future demands. To solve the problem, we take salient characteristics of the problem into consideration and develop a novel deep learning method for future demand prediction. We then formulate the problem as a stochastic optimization model, analyze key properties of the model, and propose an effective solution method to the problem based on the analyzed properties. We demonstrate the superior performance of our method over prevalent existing methods using both real world and simulated data. We also show its superiority over prevalent existing methods in a multi-stakeholder and multi-objective setting through simulations.

LLMs4OL: Large Language Models for Ontology Learning

  • paper_url: http://arxiv.org/abs/2307.16648
  • repo_url: https://github.com/hamedbabaei/llms4ol
  • paper_authors: Hamed Babaei Giglou, Jennifer D’Souza, Sören Auer
  • for: The paper tests whether large language models (LLMs) can effectively apply their language pattern capturing capability to ontology learning, i.e., automatically extracting and structuring knowledge from natural language text.
  • methods: Using zero-shot prompting, nine different LLM model families are evaluated on three main ontology learning tasks: term typing, taxonomy discovery, and extraction of non-taxonomic relations.
  • results: The evaluations cover diverse genres of ontological knowledge, including lexicosemantic knowledge in WordNet, geographical knowledge in GeoNames, and medical knowledge in UMLS, indicating how well LLMs apply their language pattern capturing capability across these domains.
    Abstract We propose the LLMs4OL approach, which utilizes Large Language Models (LLMs) for Ontology Learning (OL). LLMs have shown significant advancements in natural language processing, demonstrating their ability to capture complex language patterns in different knowledge domains. Our LLMs4OL paradigm investigates the following hypothesis: \textit{Can LLMs effectively apply their language pattern capturing capability to OL, which involves automatically extracting and structuring knowledge from natural language text?} To test this hypothesis, we conduct a comprehensive evaluation using the zero-shot prompting method. We evaluate nine different LLM model families for three main OL tasks: term typing, taxonomy discovery, and extraction of non-taxonomic relations. Additionally, the evaluations encompass diverse genres of ontological knowledge, including lexicosemantic knowledge in WordNet, geographical knowledge in GeoNames, and medical knowledge in UMLS.

Perceptions of the Fourth Industrial Revolution and Artificial Intelligence Impact on Society

  • paper_url: http://arxiv.org/abs/2308.02030
  • repo_url: None
  • paper_authors: Daniel Agbaji, Brady Lund, Nishith Reddy Mannuru
  • for: This study aims to examine the perceptions of individuals in different information flow categorizations toward AI and its implications for society.
  • methods: The study uses participant-supplied definitions of AI and the fourth industrial revolution to identify key themes and concerns regarding AI, such as job replacement, privacy invasion, and inaccurate information.
  • results: The results reveal that participants expressed concerns about the potential negative impacts of AI, such as job replacement and privacy invasion, but also recognized the benefits of AI, such as solving complex problems and increasing convenience.
    Abstract The Fourth Industrial Revolution, particularly Artificial Intelligence (AI), has had a profound impact on society, raising concerns about its implications and ethical considerations. The emergence of text generative AI tools like ChatGPT has further intensified concerns regarding ethics, security, privacy, and copyright. This study aims to examine the perceptions of individuals in different information flow categorizations toward AI. The results reveal key themes in participant-supplied definitions of AI and the fourth industrial revolution, emphasizing the replication of human intelligence, machine learning, automation, and the integration of digital technologies. Participants expressed concerns about job replacement, privacy invasion, and inaccurate information provided by AI. However, they also recognized the benefits of AI, such as solving complex problems and increasing convenience. Views on government involvement in shaping the fourth industrial revolution varied, with some advocating for strict regulations and others favoring support and development. The anticipated changes brought by the fourth industrial revolution include automation, potential job impacts, increased social disconnect, and reliance on technology. Understanding these perceptions is crucial for effectively managing the challenges and opportunities associated with AI in the evolving digital landscape.

NLLG Quarterly arXiv Report 06/23: What are the most influential current AI Papers?

  • paper_url: http://arxiv.org/abs/2308.04889
  • repo_url: https://github.com/nl2g/quaterly-arxiv
  • paper_authors: Steffen Eger, Christoph Leiter, Jonas Belouadi, Ran Zhang, Aida Kostikova, Daniil Larionov, Yanran Chen, Vivian Fresen
  • for: This report aims to give researchers and practitioners a quick guide to the most relevant and widely discussed recent work in natural language processing (NLP) and machine learning (ML).
  • methods: Using normalized citation counts, the report compiles the 40 most popular arXiv papers from the first half of 2023 and analyzes their main research directions and trends.
  • results: Papers related to Large Language Models (LLMs), and ChatGPT in particular, dominated the first half of 2023, although ChatGPT has shown signs of declining popularity more recently; NLP-related papers account for around 60% of the top papers even though ML-related papers are twice as numerous in the data. The most heavily cited papers focus on LLM efficiency, evaluation techniques, ethical considerations, embodied agents, and problem solving with LLMs.
    Abstract The rapid growth of information in the field of Generative Artificial Intelligence (AI), particularly in the subfields of Natural Language Processing (NLP) and Machine Learning (ML), presents a significant challenge for researchers and practitioners to keep pace with the latest developments. To address the problem of information overload, this report by the Natural Language Learning Group at Bielefeld University focuses on identifying the most popular papers on arXiv, with a specific emphasis on NLP and ML. The objective is to offer a quick guide to the most relevant and widely discussed research, aiding both newcomers and established researchers in staying abreast of current trends. In particular, we compile a list of the 40 most popular papers based on normalized citation counts from the first half of 2023. We observe the dominance of papers related to Large Language Models (LLMs) and specifically ChatGPT during the first half of 2023, with the latter showing signs of declining popularity more recently, however. Further, NLP related papers are the most influential (around 60\% of top papers) even though there are twice as many ML related papers in our data. Core issues investigated in the most heavily cited papers are: LLM efficiency, evaluation techniques, ethical considerations, embodied agents, and problem-solving with LLMs. Additionally, we examine the characteristics of top papers in comparison to others outside the top-40 list (noticing the top paper's focus on LLM related issues and higher number of co-authors) and analyze the citation distributions in our dataset, among others.

Chatbot Application to Support Smart Agriculture in Thailand

  • paper_url: http://arxiv.org/abs/2308.02524
  • repo_url: None
  • paper_authors: Paweena Suebsombut, Pradorn Sureephong, Aicha Sekhari, Suepphong Chernbumroong, Abdelaziz Bouras
  • for: The paper presents a LINE chatbot application for smart agriculture in Thailand, acting as an assistant that provides crop cultivation knowledge to help farmers make better decisions.
  • methods: The approach covers development of the LINE chatbot application and its integration with smart agriculture and recommendation systems; the application consists of five main functions (start/stop menu, main page, drip irrigation page, mist irrigation page, and monitor page).
  • results: After implementation, farmers were very satisfied with the application, giving a 96% satisfaction score; however, the question-asking function is a rule-based (script) bot, so farmers must type the prescribed keywords to get a response.
    Abstract A chatbot is a software developed to help reply to text or voice conversations automatically and quickly in real time. In the agriculture sector, the existing smart agriculture systems just use data from sensing and internet of things (IoT) technologies that exclude crop cultivation knowledge to support decision-making by farmers. To enhance this, the chatbot application can be an assistant to farmers to provide crop cultivation knowledge. Consequently, we propose the LINE chatbot application as an information and knowledge representation providing crop cultivation recommendations to farmers. It works with smart agriculture and recommendation systems. Our proposed LINE chatbot application consists of five main functions (start/stop menu, main page, drip irrigation page, mist irrigation page, and monitor page). Farmers will receive information for data monitoring to support their decision-making. Moreover, they can control the irrigation system via the LINE chatbot. Furthermore, farmers can ask questions relevant to the crop environment via a chat box. After implementing our proposed chatbot, farmers are very satisfied with the application, scoring a 96% satisfaction score. However, in terms of asking questions via chat box, this LINE chatbot application is a rule-based bot or script bot. Farmers have to type in the correct keywords as prescribed, otherwise they won't get a response from the chatbots. In the future, we will enhance the asking function of our LINE chatbot to be an intelligent bot.
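The abstract notes that the question-asking function is a rule-based (script) bot that only answers when a prescribed keyword appears in the message. A tiny sketch of that kind of keyword matching is below; the keywords and replies are made-up examples, not the application's actual script.

```python
# Minimal sketch of rule-based (keyword-driven) reply logic of the kind the
# abstract describes: the bot only answers when the message contains one of the
# prescribed keywords. Keywords and replies here are made-up examples.
RULES = {
    "soil moisture": "Current soil moisture is shown on the monitor page.",
    "drip irrigation": "Open the drip irrigation page to start or stop watering.",
    "mist irrigation": "Open the mist irrigation page to adjust misting intervals.",
}

def reply(message: str) -> str:
    text = message.lower()
    for keyword, answer in RULES.items():
        if keyword in text:          # the prescribed keyword must appear, as in a script bot
            return answer
    return "Sorry, I don't understand. Please use one of the listed keywords."

print(reply("How do I start drip irrigation?"))
print(reply("Will it rain tomorrow?"))   # falls through: no prescribed keyword
```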

Approximating Counterfactual Bounds while Fusing Observational, Biased and Randomised Data Sources

  • paper_url: http://arxiv.org/abs/2307.16577
  • repo_url: None
  • paper_authors: Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas, David Huber
  • for: The paper addresses the problem of integrating data from multiple, possibly biased, observational and interventional studies in order to compute counterfactuals in structural causal models.
  • methods: A causal expectation-maximisation scheme is used to approximate the bounds of partially identifiable counterfactual queries, and graphical transformations remap the general case of multiple datasets, interventional or observational, biased or unbiased, back to the single-dataset case.
  • results: Systematic numerical experiments and a case study on palliative care show the effectiveness of the approach and hint at the benefits of fusing heterogeneous data sources to obtain informative outcomes under partial identifiability.
    Abstract We address the problem of integrating data from multiple, possibly biased, observational and interventional studies, to eventually compute counterfactuals in structural causal models. We start from the case of a single observational dataset affected by a selection bias. We show that the likelihood of the available data has no local maxima. This enables us to use the causal expectation-maximisation scheme to approximate the bounds for partially identifiable counterfactual queries, which are the focus of this paper. We then show how the same approach can address the general case of multiple datasets, no matter whether interventional or observational, biased or unbiased, by remapping it into the former one via graphical transformations. Systematic numerical experiments and a case study on palliative care show the effectiveness of our approach, while hinting at the benefits of fusing heterogeneous data sources to get informative outcomes in case of partial identifiability.

Toward Quantum Machine Translation of Syntactically Distinct Languages

  • paper_url: http://arxiv.org/abs/2307.16576
  • repo_url: None
  • paper_authors: Mina Abbaszade, Mariam Zomorodi, Vahid Salari, Philip Kurian
  • for: The study explores the feasibility of language translation using quantum natural language processing algorithms on noisy intermediate-scale quantum (NISQ) devices. Classical NLP methods struggle with the large-scale computations required for complex language tasks, whereas quantum NLP on NISQ devices may harness quantum parallelism and entanglement to process and analyze large amounts of linguistic data efficiently.
  • methods: Shannon entropy is used to demonstrate the significant role of appropriate rotation-gate angles in the performance of parametrized quantum circuits; in particular, these angles (parameters) serve as a means of communication between the quantum circuits of different languages. The translation task follows the classical encoder-decoder model and is implemented with long short-term memory (LSTM) networks.
  • results: Experiments use 160 samples of English sentences and their Persian translations, trained with stochastic gradient descent (SGD) as the primary optimizer and two additional optimizers. The best model, consisting of two LSTM layers and trained with the Adam optimizer, achieves a mean absolute error of 0.03, a mean squared error of 0.002, and a loss of 0.016. Although the dataset contains only simple synonymous sentences with word-to-word mappings, the results point to the utility of Shannon entropy as a figure of merit for more complex machine translation models.
    Abstract The present study aims to explore the feasibility of language translation using quantum natural language processing algorithms on noisy intermediate-scale quantum (NISQ) devices. Classical methods in natural language processing (NLP) struggle with handling large-scale computations required for complex language tasks, but quantum NLP on NISQ devices holds promise in harnessing quantum parallelism and entanglement to efficiently process and analyze vast amounts of linguistic data, potentially revolutionizing NLP applications. Our research endeavors to pave the way for quantum neural machine translation, which could potentially offer advantages over classical methods in the future. We employ Shannon entropy to demonstrate the significant role of some appropriate angles of rotation gates in the performance of parametrized quantum circuits. In particular, we utilize these angles (parameters) as a means of communication between quantum circuits of different languages. To achieve our objective, we adopt the encoder-decoder model of classical neural networks and implement the translation task using long short-term memory (LSTM). Our experiments involved 160 samples comprising English sentences and their Persian translations. We trained the models with different optimisers implementing stochastic gradient descent (SGD) as primary and subsequently incorporating two additional optimizers in conjunction with SGD. Notably, we achieved optimal results-with mean absolute error of 0.03, mean squared error of 0.002, and 0.016 loss-by training the best model, consisting of two LSTM layers and using the Adam optimiser. Our small dataset, though consisting of simple synonymous sentences with word-to-word mappings, points to the utility of Shannon entropy as a figure of merit in more complex machine translation models for intricate sentence structures.
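For a concrete sense of how rotation-gate angles relate to Shannon entropy, the sketch below computes the entropy of the measurement distribution of a single qubit after an RY(theta) rotation, using the standard entropy definition. How the paper applies Shannon entropy within its full circuits is not reproduced here; this is only an illustrative special case.

```python
# Illustrative only: the standard Shannon entropy applied to the measurement
# distribution of a single qubit after an RY(theta) rotation, showing how the
# rotation angle controls the entropy of the outcome distribution.
import numpy as np

def shannon_entropy(probs):
    probs = np.asarray(probs, dtype=float)
    probs = probs[probs > 0]                      # treat 0 * log(0) as 0
    return float(-(probs * np.log2(probs)).sum())

def ry_outcome_entropy(theta):
    p0 = np.cos(theta / 2) ** 2                   # probability of measuring |0>
    return shannon_entropy([p0, 1.0 - p0])

for theta in (0.0, np.pi / 4, np.pi / 2, np.pi):
    print(f"theta={theta:.2f}  H={ry_outcome_entropy(theta):.3f} bits")
# theta=0 or pi gives H=0 (deterministic outcome); theta=pi/2 gives H=1 (maximally uncertain)
```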

No Fair Lunch: A Causal Perspective on Dataset Bias in Machine Learning for Medical Imaging

  • paper_url: http://arxiv.org/abs/2307.16526
  • repo_url: None
  • paper_authors: Charles Jones, Daniel C. Castro, Fabio De Sousa Ribeiro, Ozan Oktay, Melissa McCradden, Ben Glocker
  • for: This paper aims to address fairness concerns in machine learning methods used in clinical decision-making, highlighting the need for a more comprehensive understanding of algorithmic bias and its mitigation strategies.
  • methods: The paper uses a causal perspective to identify three families of causal bias mechanisms in medical imaging datasets, and provides a practical three-step framework for reasoning about fairness in AI prediction models.
  • results: The paper highlights the limitations of current mitigation methods and emphasizes the importance of considering a broader range of scenarios when addressing algorithmic bias in medical imaging.
    Abstract As machine learning methods gain prominence within clinical decision-making, addressing fairness concerns becomes increasingly urgent. Despite considerable work dedicated to detecting and ameliorating algorithmic bias, today's methods are deficient with potentially harmful consequences. Our causal perspective sheds new light on algorithmic bias, highlighting how different sources of dataset bias may appear indistinguishable yet require substantially different mitigation strategies. We introduce three families of causal bias mechanisms stemming from disparities in prevalence, presentation, and annotation. Our causal analysis underscores how current mitigation methods tackle only a narrow and often unrealistic subset of scenarios. We provide a practical three-step framework for reasoning about fairness in medical imaging, supporting the development of safe and equitable AI prediction models.

Rethinking Collaborative Perception from the Spatial-Temporal Importance of Semantic Information

  • paper_url: http://arxiv.org/abs/2307.16517
  • repo_url: https://github.com/huangqzj/iosi-cp
  • paper_authors: Yuntao Liu, Qian Huang, Rongpeng Li, Xianfu Chen, Zhifeng Zhao, Shuyuan Zhao, Yongdong Zhu, Honggang Zhang
  • for: To enhance perception capabilities through collaboration by sharing semantic information.
  • methods: The paper proposes a new collaborative perception framework, IoSI-CP, that accounts for the importance of semantic information (IoSI) in both the temporal and spatial dimensions; it comprises an IoSI-based collaborator selection method and a semantic information fusion algorithm named HPHA (historical prior hybrid attention).
  • results: Extensive experiments on two open datasets show that IoSI-CP significantly improves perception performance compared with state-of-the-art approaches.
    Abstract Collaboration by the sharing of semantic information is crucial to enable the enhancement of perception capabilities. However, existing collaborative perception methods tend to focus solely on the spatial features of semantic information, while neglecting the importance of the temporal dimension in collaborator selection and semantic information fusion, which instigates performance degradation. In this article, we propose a novel collaborative perception framework, IoSI-CP, which takes into account the importance of semantic information (IoSI) from both temporal and spatial dimensions. Specifically, we develop an IoSI-based collaborator selection method that effectively identifies advantageous collaborators but excludes those that bring negative benefits. Moreover, we present a semantic information fusion algorithm called HPHA (historical prior hybrid attention), which integrates a multi-scale transformer module and a short-term attention module to capture IoSI from spatial and temporal dimensions, and assigns varying weights for efficient aggregation. Extensive experiments on two open datasets demonstrate that our proposed IoSI-CP significantly improves the perception performance compared to state-of-the-art approaches. The code associated with this research is publicly available at https://github.com/huangqzj/IoSI-CP/.

Deception Abilities Emerged in Large Language Models

  • paper_url: http://arxiv.org/abs/2307.16513
  • repo_url: None
  • paper_authors: Thilo Hagendorff
  • for: The study examines whether state-of-the-art large language models (LLMs) have the ability to deceive human operators, and whether chain-of-thought reasoning can amplify their deception performance.
  • methods: A series of experiments with state-of-the-art LLMs such as GPT-4 evaluates their deception abilities by assessing their performance in deception scenarios.
  • results: State-of-the-art LLMs can understand and induce false beliefs in other agents, their performance in complex deception scenarios can be amplified with chain-of-thought reasoning, and eliciting Machiavellianism in the models can alter their propensity to deceive; such abilities were non-existent in earlier LLMs.
    Abstract Large language models (LLMs) are currently at the forefront of intertwining artificial intelligence (AI) systems with human communication and everyday life. Thus, aligning them with human values is of great importance. However, given the steady increase in reasoning abilities, future LLMs are under suspicion of becoming able to deceive human operators and utilizing this ability to bypass monitoring efforts. As a prerequisite to this, LLMs need to possess a conceptual understanding of deception strategies. This study reveals that such strategies emerged in state-of-the-art LLMs, such as GPT-4, but were non-existent in earlier LLMs. We conduct a series of experiments showing that state-of-the-art LLMs are able to understand and induce false beliefs in other agents, that their performance in complex deception scenarios can be amplified utilizing chain-of-thought reasoning, and that eliciting Machiavellianism in LLMs can alter their propensity to deceive. In sum, revealing hitherto unknown machine behavior in LLMs, our study contributes to the nascent field of machine psychology.

Towards a Comprehensive Human-Centred Evaluation Framework for Explainable AI

  • paper_url: http://arxiv.org/abs/2308.06274
  • repo_url: None
  • paper_authors: Ivania Donoso-Guzmán, Jeroen Ooge, Denis Parra, Katrien Verbert
  • for: The paper proposes a human-centred evaluation framework for explainable AI (XAI), so that XAI methods can be assessed in terms of the user experience they create.
  • methods: The authors adapt the User-Centric Evaluation Framework used in recommender systems: they integrate explanation aspects, summarise explanation properties, indicate the relations between them, and categorise the metrics that measure these properties.
  • results: The resulting comprehensive evaluation framework assesses the human experience of XAI methods more holistically and is intended to contribute to the human-centred standardisation of XAI evaluation.
    Abstract While research on explainable AI (XAI) is booming and explanation techniques have proven promising in many application domains, standardised human-centred evaluation procedures are still missing. In addition, current evaluation procedures do not assess XAI methods holistically in the sense that they do not treat explanations' effects on humans as a complex user experience. To tackle this challenge, we propose to adapt the User-Centric Evaluation Framework used in recommender systems: we integrate explanation aspects, summarise explanation properties, indicate relations between them, and categorise metrics that measure these properties. With this comprehensive evaluation framework, we hope to contribute to the human-centred standardisation of XAI evaluation.

Value-Informed Skill Chaining for Policy Learning of Long-Horizon Tasks with Surgical Robot

  • paper_url: http://arxiv.org/abs/2307.16503
  • repo_url: https://github.com/med-air/viskill
  • paper_authors: Tao Huang, Kai Chen, Wang Wei, Jianan Li, Yonghao Long, Qi Dou
  • for: To address the policy exploration challenge in long-horizon surgical robot tasks and improve task success rates and execution efficiency.
  • methods: The paper introduces value-informed skill chaining (ViSkill): a state value function estimates the expected success probability of the entire task given a state, and a chaining policy instructs each subtask policy to terminate at the state with the highest value, so that the following subtask policies are more likely to connect successfully.
  • results: ViSkill achieves high task success rates and execution efficiency on three complex surgical robot tasks from the SurRoL simulation platform, demonstrating its effectiveness.
    Abstract Reinforcement learning is still struggling with solving long-horizon surgical robot tasks which involve multiple steps over an extended duration of time due to the policy exploration challenge. Recent methods try to tackle this problem by skill chaining, in which the long-horizon task is decomposed into multiple subtasks for easing the exploration burden and subtask policies are temporally connected to complete the whole long-horizon task. However, smoothly connecting all subtask policies is difficult for surgical robot scenarios. Not all states are equally suitable for connecting two adjacent subtasks. An undesired terminate state of the previous subtask would make the current subtask policy unstable and result in a failed execution. In this work, we introduce value-informed skill chaining (ViSkill), a novel reinforcement learning framework for long-horizon surgical robot tasks. The core idea is to distinguish which terminal state is suitable for starting all the following subtask policies. To achieve this target, we introduce a state value function that estimates the expected success probability of the entire task given a state. Based on this value function, a chaining policy is learned to instruct subtask policies to terminate at the state with the highest value so that all subsequent policies are more likely to be connected for accomplishing the task. We demonstrate the effectiveness of our method on three complex surgical robot tasks from SurRoL, a comprehensive surgical simulation platform, achieving high task success rates and execution efficiency. Code is available at https://github.com/med-air/ViSkill.
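The chaining idea in the abstract reduces to a simple selection rule: among candidate terminal states of a finished subtask, prefer the one with the highest learned value, i.e., the highest estimated success probability for the rest of the task. The sketch below illustrates that rule with hypothetical stand-ins for the states and value function; it is not the ViSkill implementation.

```python
# Toy sketch: pick the terminal state of a finished subtask that maximizes the
# learned value (estimated success probability of the whole task), so the next
# subtask policy starts from a favourable state. States, the value function,
# and the scoring are hypothetical stand-ins, not the ViSkill implementation.
from typing import Callable, Sequence

def pick_terminal_state(candidate_states: Sequence, value_fn: Callable) -> int:
    """Return the index of the candidate state with the highest estimated value."""
    values = [value_fn(s) for s in candidate_states]
    return max(range(len(values)), key=values.__getitem__)

# Toy usage: states as 2-D feature vectors, value_fn a hand-made scorer.
states = [(0.2, 0.9), (0.8, 0.1), (0.6, 0.7)]
value_fn = lambda s: 0.7 * s[0] + 0.3 * s[1]   # stand-in for a learned V(s)
best = pick_terminal_state(states, value_fn)
print(best, states[best])                      # terminate the subtask in this state
```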

BAGM: A Backdoor Attack for Manipulating Text-to-Image Generative Models

  • paper_url: http://arxiv.org/abs/2307.16489
  • repo_url: None
  • paper_authors: Jordan Vice, Naveed Akhtar, Richard Hartley, Ajmal Mian
  • for: This paper investigates backdoor attacks on text-to-image generative AI.
  • methods: The attack comes as a suite of surface, shallow, and deep attacks that target various stages of the text-to-image generative pipeline, including the embedded tokenizer and the pre-trained language and visual neural networks.
  • results: The authors demonstrate the efficacy of the attack by targeting the state-of-the-art Stable Diffusion pipeline in a digital marketing scenario and contribute a Marketable Foods dataset of branded product images.
    Abstract The rise in popularity of text-to-image generative artificial intelligence (AI) has attracted widespread public interest. At the same time, backdoor attacks are well-known in machine learning literature for their effective manipulation of neural models, which is a growing concern among practitioners. We highlight this threat for generative AI by introducing a Backdoor Attack on text-to-image Generative Models (BAGM). Our attack targets various stages of the text-to-image generative pipeline, modifying the behaviour of the embedded tokenizer and the pre-trained language and visual neural networks. Based on the penetration level, BAGM takes the form of a suite of attacks that are referred to as surface, shallow and deep attacks in this article. We compare the performance of BAGM to recently emerging related methods. We also contribute a set of quantitative metrics for assessing the performance of backdoor attacks on generative AI models in the future. The efficacy of the proposed framework is established by targeting the state-of-the-art stable diffusion pipeline in a digital marketing scenario as the target domain. To that end, we also contribute a Marketable Foods dataset of branded product images. We hope this work contributes towards exposing the contemporary generative AI security challenges and fosters discussions on preemptive efforts for addressing those challenges. Keywords: Generative Artificial Intelligence, Generative Models, Text-to-Image generation, Backdoor Attacks, Trojan, Stable Diffusion.
    摘要 “文本至图生成人工智能(AI)的崛起引起了广泛的公众关注。同时,机器学习领域内的后门攻击已经得到了广泛的关注,这是一种可以高效地操纵神经网络的攻击方法。我们在文本至图生成模型中引入了后门攻击,并将其命名为BAGM。我们的攻击针对生成图像管道的多个不同阶段,包括嵌入的词元化器(tokenizer)和预训练的语言和视觉神经网络。根据渗透度,BAGM可以按照表面、浅层和深层攻击的形式出现。在这篇文章中,我们与其他相关的方法进行比较,并提出了一组用于评估生成AI模型的后门攻击性能的量化指标。我们的提案的效果得到了验证,我们使用了现有的稳定扩散管道作为目标领域,并提供了一个Marketable Foods数据集,用于评估BAGM的性能。我们希望这项工作能够曝光当代生成AI安全挑战,并促进相关的预防措施的发展。”

Model-free Grasping with Multi-Suction Cup Grippers for Robotic Bin Picking

  • paper_url: http://arxiv.org/abs/2307.16488
  • repo_url: None
  • paper_authors: Philipp Schillinger, Miroslav Gabriel, Alexander Kuss, Hanna Ziesche, Ngo Anh Vien
  • for: Model-free prediction of grasp poses for multi-suction-cup grippers, without gripper-specific training data.
  • methods: A two-step approach: a neural network first predicts pixel-wise grasp quality for the input image to indicate generally graspable areas, and an optimization step then determines the optimal gripper selection and grasp poses based on the configured gripper layouts and activation schemes.
  • results: Experimental evaluations on a real-world industrial bin-picking application with scenes of varying difficulty demonstrate the effectiveness of the method.
    Abstract This paper presents a novel method for model-free prediction of grasp poses for suction grippers with multiple suction cups. Our approach is agnostic to the design of the gripper and does not require gripper-specific training data. In particular, we propose a two-step approach, where first, a neural network predicts pixel-wise grasp quality for an input image to indicate areas that are generally graspable. Second, an optimization step determines the optimal gripper selection and corresponding grasp poses based on configured gripper layouts and activation schemes. In addition, we introduce a method for automated labeling for supervised training of the grasp quality network. Experimental evaluations on a real-world industrial application with bin picking scenes of varying difficulty demonstrate the effectiveness of our method.
    摘要 这篇论文提出了一种新的、无需特定模型的抓取姿态预测方法,适用于带有多个吸盘的吸附式夹爪。该方法与夹爪设计无关,也不需要针对特定夹爪的训练数据,共分两步:1. 一个神经网络预测输入图像中逐像素的抓取质量,以指示一般可抓取的区域;2. 一个优化步骤根据配置的夹爪布局和激活方案,确定最佳的夹爪选择和相应的抓取姿态。此外,我们还介绍了一种自动标注方法,用于抓取质量网络的监督训练。在实际工业分拣(bin picking)应用中,对不同难度场景的实验评估证明了该方法的有效性。
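
The two-step recipe summarized above lends itself to a small illustration. The sketch below assumes the network has already produced a pixel-wise grasp-quality map and scores candidate gripper placements by summing the quality under the active suction cups; a brute-force scan over positions and activation schemes stands in for the paper's optimization step (no rotations, and all names are hypothetical).

```python
import numpy as np

def best_grasp(quality_map, cup_offsets, activation_schemes):
    """Exhaustively score 2D gripper placements on a pixel-wise grasp-quality map.

    quality_map: HxW array in [0, 1], e.g. the output of a grasp-quality network.
    cup_offsets: (dy, dx) offsets of the suction cups relative to the gripper centre.
    activation_schemes: boolean masks saying which cups are switched on.
    Returns (score, (y, x), scheme_index) of the best placement found.
    """
    H, W = quality_map.shape
    best = (-np.inf, None, None)
    for y in range(H):
        for x in range(W):
            for k, active in enumerate(activation_schemes):
                score, in_bounds = 0.0, True
                for (dy, dx), on in zip(cup_offsets, active):
                    if not on:
                        continue
                    yy, xx = y + dy, x + dx
                    if not (0 <= yy < H and 0 <= xx < W):
                        in_bounds = False
                        break
                    score += quality_map[yy, xx]
                if in_bounds and score > best[0]:
                    best = (score, (y, x), k)
    return best

quality = np.random.rand(16, 16)
cups = [(0, -2), (0, 0), (0, 2)]                       # three cups in a row
schemes = [[True, True, True], [False, True, False]]   # all cups vs. centre cup only
print(best_grasp(quality, cups, schemes))
```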

To Classify is to Interpret: Building Taxonomies from Heterogeneous Data through Human-AI Collaboration

  • paper_url: http://arxiv.org/abs/2307.16481
  • repo_url: None
  • paper_authors: Sebastian Meier, Katrin Glinka
  • for: This paper focuses on supporting taxonomy building with machine learning (ML) systems, with an emphasis on human-AI collaboration.
  • methods: The approach proposed in the paper allows users to iteratively consider multiple ML models’ outputs as part of their sensemaking process.
  • results: The authors implemented their approach in two real-world use cases and found that it enabled more effective taxonomy building compared to relying solely on black-boxed ML systems.
    Abstract Taxonomy building is a task that requires interpreting and classifying data within a given frame of reference, which comes to play in many areas of application that deal with knowledge and information organization. In this paper, we explore how taxonomy building can be supported with systems that integrate machine learning (ML). However, relying only on black-boxed ML-based systems to automate taxonomy building would sideline the users' expertise. We propose an approach that allows the user to iteratively take into account multiple model's outputs as part of their sensemaking process. We implemented our approach in two real-world use cases. The work is positioned in the context of HCI research that investigates the design of ML-based systems with an emphasis on enabling human-AI collaboration.
    摘要 税onomy建构是一项需要解释和分类数据的任务,这种任务在许多知识和信息组织领域中发挥着重要作用。在这篇论文中,我们研究如何通过 интеGRatin machine learning(ML)系统来支持税onomy建构。然而,仅仅靠黑盒ML基于系统自动化税onomy建构会忽略用户的专业知识。我们提议一种方法,允许用户在感性过程中逐渐考虑多个模型的输出。我们在两个实际应用场景中实现了这种方法。我们的工作位于人机合作研究的背景下,探讨ML基本系统的设计,以启用人AI合作。

Tracking multiple targets with multiple radars using Distributed Auctions

  • paper_url: http://arxiv.org/abs/2307.16477
  • repo_url: None
  • paper_authors: Pierre Larrenie, Cédric Buron, Frédéric Barbaresco
  • for: Improving the resilience and flexibility of radar networks.
  • methods: A decentralized, collaborative bundle-auction algorithm for radar coordination.
  • results: The approach tracks multiple targets simultaneously and can assign up to two radars to the same target to improve accuracy, performing comparably to a centralized approach based on a MIP solver.
    Abstract Coordination of radars can be performed in various ways. To be more resilient radar networks can be coordinated in a decentralized way. In this paper, we introduce a highly resilient algorithm for radar coordination based on decentralized and collaborative bundle auctions. We first formalize our problem as a constrained optimization problem and apply a market-based algorithm to provide an approximate solution. Our approach allows to track simultaneously multiple targets, and to use up to two radars tracking the same target to improve accuracy. We show that our approach performs sensibly as well as a centralized approach relying on a MIP solver, and depending on the situations, may outperform it or be outperformed.
    摘要 协调雷达可以通过不同的方式进行。为了更加鲁棒的雷达网络,可以使用分散式的协调方式。本文介绍了一种基于分散和合作的粒度拍卖算法来提高雷达协调的可靠性。我们首先将问题形式化为一个受限制的优化问题,然后应用市场基本算法提供一个近似解决方案。我们的方法可以同时跟踪多个目标,并且使用两个雷达跟踪同一个目标以提高准确性。我们表明,我们的方法与中央化方法基于MIP解决器相比,在某些情况下可能高效或低效。
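
As a rough, centralized stand-in for the market-based coordination described above, the sketch below processes radar bids best-first and lets each target be tracked by at most two radars. The real method is a decentralized, collaborative bundle auction, so the function, its parameters, and the toy utilities are illustrative assumptions only.

```python
def greedy_auction(utilities, radar_capacity=1, max_radars_per_target=2):
    """Greedy, centralized stand-in for a decentralized bundle auction.

    utilities[r][t]: value radar r assigns to tracking target t (e.g. expected
    accuracy gain). Bids are processed best-first; each radar tracks at most
    radar_capacity targets and each target is tracked by at most
    max_radars_per_target radars.
    """
    bids = sorted(((u, r, t) for r, row in enumerate(utilities)
                   for t, u in enumerate(row)), reverse=True)
    radar_load = [0] * len(utilities)
    target_load = [0] * len(utilities[0])
    assignment = []
    for u, r, t in bids:
        if radar_load[r] < radar_capacity and target_load[t] < max_radars_per_target:
            assignment.append((r, t, u))
            radar_load[r] += 1
            target_load[t] += 1
    return assignment

# Three radars, two targets: target 0 ends up tracked by two radars for accuracy.
print(greedy_auction([[0.9, 0.2], [0.8, 0.7], [0.1, 0.6]]))
```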

L3DMC: Lifelong Learning using Distillation via Mixed-Curvature Space

  • paper_url: http://arxiv.org/abs/2307.16459
  • repo_url: https://github.com/csiro-robotics/l3dmc
  • paper_authors: Kaushik Roy, Peyman Moghadam, Mehrtash Harandi
  • for: Improving the performance of lifelong learning (L3) models on sequential tasks by countering the degradation that occurs when novel concepts are learned.
  • methods: A distillation strategy named L3DMC that preserves already-learned knowledge by operating on mixed-curvature spaces, embedding fixed-curvature (Euclidean and hyperbolic) representations into a higher-dimensional reproducing kernel Hilbert space (RKHS) via a positive-definite kernel to obtain rich representations.
  • results: Experiments on three benchmarks show that L3DMC adapts to new knowledge without forgetting old knowledge and is effective for medical image classification in L3 settings.
    Abstract The performance of a lifelong learning (L3) model degrades when it is trained on a series of tasks, as the geometrical formation of the embedding space changes while learning novel concepts sequentially. The majority of existing L3 approaches operate on a fixed-curvature (e.g., zero-curvature Euclidean) space that is not necessarily suitable for modeling the complex geometric structure of data. Furthermore, the distillation strategies apply constraints directly on low-dimensional embeddings, discouraging the L3 model from learning new concepts by making the model highly stable. To address the problem, we propose a distillation strategy named L3DMC that operates on mixed-curvature spaces to preserve the already-learned knowledge by modeling and maintaining complex geometrical structures. We propose to embed the projected low dimensional embedding of fixed-curvature spaces (Euclidean and hyperbolic) to higher-dimensional Reproducing Kernel Hilbert Space (RKHS) using a positive-definite kernel function to attain rich representation. Afterward, we optimize the L3 model by minimizing the discrepancies between the new sample representation and the subspace constructed using the old representation in RKHS. L3DMC is capable of adapting new knowledge better without forgetting old knowledge as it combines the representation power of multiple fixed-curvature spaces and is performed on higher-dimensional RKHS. Thorough experiments on three benchmarks demonstrate the effectiveness of our proposed distillation strategy for medical image classification in L3 settings. Our code implementation is publicly available at https://github.com/csiro-robotics/L3DMC.
    摘要 “一个生命时间学习(L3)模型的性能会随着在不同任务之间学习新的概念而下降。现有大多数L3方法都是在固定曲率(例如零曲率欧几里得)空间中进行学习,这并不一定适合数据的复杂的几何结构。另外,浸泡策略通常直接在低维度表示上应用约束,使L3模型不能学习新的概念,而是使模型变得非常稳定。为解决这个问题,我们提出了一种浸泡策略名为L3DMC,它在混合曲率空间中进行学习,以保持已经学习的知识,并在高维度的径规kernel空间(RKHS)中实现丰富的表示。然后,我们通过在RKHS中对新样本的表示与以前的表示建立的子空间进行优化,来最小化L3模型中的差异。L3DMC可以更好地适应新的知识,而不会忘记以前的知识,因为它结合了多个固定曲率空间的表示力和高维度RKHS中的表示。我们对三个标准 benchmark进行了详细的实验,并证明了L3DMC在医学图像分类中的效果。我们的代码实现可以在https://github.com/csiro-robotics/L3DMC中获得。”

An Effective Data Creation Pipeline to Generate High-quality Financial Instruction Data for Large Language Model

  • paper_url: http://arxiv.org/abs/2308.01415
  • repo_url: None
  • paper_authors: Ziao Wang, Jianning Wang, Junda Wu, Xiaofeng Zhang
  • for: This paper is written for the purpose of creating a high-quality financial dataset to fine-tune large language models for financial tasks.
  • methods: The paper presents a data creation pipeline that incorporates human financial expert feedback to refine the dataset, using ChatGPT to initiate a dialogue between an AI investor and financial expert.
  • results: The pipeline yields a robust instruction tuning dataset of 103k multi-turn chats, and the experimental results show significant advancements in generating accurate, relevant, and financial-style responses from AI models, providing a powerful tool for financial applications.
    Abstract At the beginning era of large language model, it is quite critical to generate a high-quality financial dataset to fine-tune a large language model for financial related tasks. Thus, this paper presents a carefully designed data creation pipeline for this purpose. Particularly, we initiate a dialogue between an AI investor and financial expert using ChatGPT and incorporate the feedback of human financial experts, leading to the refinement of the dataset. This pipeline yielded a robust instruction tuning dataset comprised of 103k multi-turn chats. Extensive experiments have been conducted on this dataset to evaluate the model's performance by adopting an external GPT-4 as the judge. The promising experimental results verify that our approach led to significant advancements in generating accurate, relevant, and financial-style responses from AI models, and thus providing a powerful tool for applications within the financial sector.
    摘要 在大型语言模型开始时期,制作高质量金融数据集是非常重要的,以便使用大型语言模型进行金融相关任务的细化。因此,这篇论文提出了一个仔细设计的数据创建管道,以达到这个目标。特别是,我们通过与人工智能投资者和金融专家之间的对话,使用ChatGPT,并基于人类金融专家的反馈,对数据进行细化。这个管道生成了103k多个转换的 instruciton 数据集。我们进行了广泛的实验,使用外部的GPT-4作为评判,以评估模型的性能。结果表明,我们的方法导致了AI模型生成的精度、 relevance 和金融风格响应的显著提高,从而为金融领域应用提供了强大的工具。

Every Mistake Counts in Assembly

  • paper_url: http://arxiv.org/abs/2307.16453
  • repo_url: None
  • paper_authors: Guodong Ding, Fadime Sener, Shugao Ma, Angela Yao
  • for: Helping AI assistants better support users through complex procedures such as cooking, home repair, and assembly tasks.
  • methods: A learned knowledge base of spatial and temporal beliefs, updated online with an episodic memory design, is used to detect ordering mistakes in assembly procedures.
  • results: Experiments show that the proposed belief inference algorithm accurately identifies incorrect orderings in real-world action sequences on the Assembly101 dataset.
    Abstract One promising use case of AI assistants is to help with complex procedures like cooking, home repair, and assembly tasks. Can we teach the assistant to interject after the user makes a mistake? This paper targets the problem of identifying ordering mistakes in assembly procedures. We propose a system that can detect ordering mistakes by utilizing a learned knowledge base. Our framework constructs a knowledge base with spatial and temporal beliefs based on observed mistakes. Spatial beliefs depict the topological relationship of the assembling components, while temporal beliefs aggregate prerequisite actions as ordering constraints. With an episodic memory design, our algorithm can dynamically update and construct the belief sets as more actions are observed, all in an online fashion. We demonstrate experimentally that our inferred spatial and temporal beliefs are capable of identifying incorrect orderings in real-world action sequences. To construct the spatial beliefs, we collect a new set of coarse-level action annotations for Assembly101 based on the positioning of the toy parts. Finally, we demonstrate the superior performance of our belief inference algorithm in detecting ordering mistakes on the Assembly101 dataset.
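
The "temporal beliefs" described in the abstract are, in essence, aggregated ordering constraints, and checking a new action sequence against them is straightforward. The snippet below is a toy illustration with invented action names; the paper's knowledge base additionally maintains spatial beliefs and is updated online, which this sketch does not attempt.

```python
def ordering_violations(sequence, prerequisites):
    """Check an action sequence against learned temporal beliefs.

    prerequisites: dict mapping an action to the set of actions that must have
    happened before it (an aggregate of observed ordering constraints).
    Returns a list of (position, action, missing_prerequisites) violations.
    """
    done, violations = set(), []
    for i, action in enumerate(sequence):
        missing = prerequisites.get(action, set()) - done
        if missing:
            violations.append((i, action, missing))
        done.add(action)
    return violations

beliefs = {"attach wheel": {"insert axle"},
           "close body": {"attach wheel", "attach cabin"}}
print(ordering_violations(["insert axle", "close body", "attach wheel"], beliefs))
# "close body" is flagged because "attach wheel" and "attach cabin" have not happened yet.
```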

HouYi: An open-source large language model specially designed for renewable energy and carbon neutrality field

  • paper_url: http://arxiv.org/abs/2308.01414
  • repo_url: None
  • paper_authors: Mingliang Bai, Zhihao Zhou, Ruidong Wang, Yusheng Yang, Zizhen Qin, Yunxiao Chen, Chunjin Mu, Jinfu Liu, Daren Yu
  • for: Applying Large Language Models (LLMs) to automatic content generation in support of renewable energy and carbon neutrality goals.
  • methods: The REAP dataset is built from the titles and abstracts of 1,168,970 academic papers collected from Web of Science, and HouYi, the first renewable-energy-specific LLM, is developed by finetuning general-purpose LLMs on it.
  • results: Experiments show that HouYi's ability to generate academic paper paragraphs on renewable energy is comparable to ChatGPT, slightly better than Claude, ERNIE Bot, and SparkDesk, and significantly better than the open-source LLaMA-13B model.
    Abstract Renewable energy is important for achieving carbon neutrality goal. With the great success of Large Language Models (LLMs) like ChatGPT in automatic content generation, LLMs are playing an increasingly important role. However, there has not been a specially designed LLM for renewable energy. Meanwhile, there has not been any dataset of renewable energy for training LLMs. Therefore, this paper published the first open-source Renewable Energy Academic Paper (REAP) dataset for non-commercial LLM research of renewable energy. REAP dataset is collected through searching the title and abstract of 1,168,970 academic literatures from Web of Science. Based on REAP dataset, HouYi model, the first LLM for renewable energy, is developed through finetuning general LLMs. HouYi demonstrated powerful academic paper paragraph generation ability in renewable energy field. Experiments show that its ability to generate academic papers on renewable energy is comparable to ChatGPT, slightly outperforms Claude, ERNIE Bot and SparkDesk, and significantly outperforms open-source LLaMA-13B model.
    摘要 重要的可再生能源是实现碳中和目标的关键。大语言模型(LLM)如ChatGPT在自动内容生成方面取得了巨大成功。然而,没有特地设计的LLM用于可再生能源。同时,没有任何可再生能源训练LLM的数据集。因此,本研究发表了首个非商业用途的可再生能源学术论文数据集(REAP)。REAP数据集通过搜索Web of Science上的标题和摘要来收集1,168,970份学术论文。基于REAP数据集,我们开发了首个可再生能源 LLM 模型—— HouYi。通过训练通用 LLM 模型,HouYi 在可再生能源领域的学术论文段落生成能力强大。实验表明,HouYi 的学术论文段落生成能力在可再生能源领域与ChatGPT相当,轻微超过Claude、ERNIE Bot和SparkDesk,并显著超过开源 LLMA-13B 模型。

Causal Inference for Banking Finance and Insurance A Survey

  • paper_url: http://arxiv.org/abs/2307.16427
  • repo_url: None
  • paper_authors: Satyam Kumar, Yelleti Vivek, Vadlamani Ravi, Indranil Bose
  • for: A survey of 37 papers published between 1992 and 2023 on applications of causal inference in banking, finance, and insurance.
  • methods: The survey covers the primary ingredients of causal inference, including statistical methods such as Bayesian causal networks and Granger causality, and related notions such as counterfactuals.
  • results: The survey finds that the application of causal inference in the banking and insurance sectors is still in its infancy, leaving ample room for further research to turn it into a viable method.
    Abstract Causal Inference plays an significant role in explaining the decisions taken by statistical models and artificial intelligence models. Of late, this field started attracting the attention of researchers and practitioners alike. This paper presents a comprehensive survey of 37 papers published during 1992-2023 and concerning the application of causal inference to banking, finance, and insurance. The papers are categorized according to the following families of domains: (i) Banking, (ii) Finance and its subdomains such as corporate finance, governance finance including financial risk and financial policy, financial economics, and Behavioral finance, and (iii) Insurance. Further, the paper covers the primary ingredients of causal inference namely, statistical methods such as Bayesian Causal Network, Granger Causality and jargon used thereof such as counterfactuals. The review also recommends some important directions for future research. In conclusion, we observed that the application of causal inference in the banking and insurance sectors is still in its infancy, and thus more research is possible to turn it into a viable method.
    摘要 causal inference 在解释统计模型和人工智能模型所做出的决策中扮演着重要的角色。近年来,这个领域吸引了研究者和实践者的关注。这篇论文对1992-2023年发表的37篇论文进行了全面的报告,这些论文关注银行、金融和保险领域中的应用 causal inference。这些论文被分为以下三个家族域:(i)银行,(ii)金融和其子领域,如公司财务、管理财务、金融风险和金融政策、金融经济和行为金融,以及(iii)保险。此外,论文还覆盖了 causal inference 的主要组成部分,包括统计方法如 Bayesian Causal Network 和 Granger Causality,以及在其中使用的术语如 counterfactuals。评论还提出了未来研究的重要方向。结论是,在银行和保险领域中应用 causal inference 的应用仍处于初始阶段,因此更多的研究可以使其成为可靠的方法。
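
Granger causality, one of the statistical ingredients this survey covers, is easy to probe on synthetic data. The sketch below uses statsmodels' grangercausalitytests on an invented "policy rate leads loan defaults" pair; the series and variable names are made up for illustration and are not taken from any surveyed study.

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
n = 300
policy_rate = rng.normal(size=n)                                                # synthetic driver series
loan_defaults = 0.8 * np.roll(policy_rate, 2) + rng.normal(scale=0.3, size=n)   # responds with a 2-step lag

# Column order matters: the test asks whether the SECOND column helps predict the FIRST.
data = np.column_stack([loan_defaults, policy_rate])
results = grangercausalitytests(data, maxlag=4, verbose=False)
for lag, (tests, _) in results.items():
    print(f"lag {lag}: p-value {tests['ssr_ftest'][1]:.4f}")   # small p-values suggest Granger causality
```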

Subspace Distillation for Continual Learning

  • paper_url: http://arxiv.org/abs/2307.16419
  • repo_url: https://github.com/csiro-robotics/sdcl
  • paper_authors: Kaushik Roy, Christian Simon, Peyman Moghadam, Mehrtash Harandi
  • for: Addressing catastrophic forgetting in continual learning, i.e., preserving knowledge of previous tasks while learning new ones.
  • methods: A knowledge distillation technique that exploits the manifold structure of the latent/output space, approximating the data manifold up to first order with linear subspaces to model structure and preserve the network's knowledge.
  • results: Experiments show improved continual learning performance on both classification and segmentation problems; code will be available at https://github.com/csiro-robotics/SDCL.
    Abstract An ultimate objective in continual learning is to preserve knowledge learned in preceding tasks while learning new tasks. To mitigate forgetting prior knowledge, we propose a novel knowledge distillation technique that takes into the account the manifold structure of the latent/output space of a neural network in learning novel tasks. To achieve this, we propose to approximate the data manifold up-to its first order, hence benefiting from linear subspaces to model the structure and maintain the knowledge of a neural network while learning novel concepts. We demonstrate that the modeling with subspaces provides several intriguing properties, including robustness to noise and therefore effective for mitigating Catastrophic Forgetting in continual learning. We also discuss and show how our proposed method can be adopted to address both classification and segmentation problems. Empirically, we observe that our proposed method outperforms various continual learning methods on several challenging datasets including Pascal VOC, and Tiny-Imagenet. Furthermore, we show how the proposed method can be seamlessly combined with existing learning approaches to improve their performances. The codes of this article will be available at https://github.com/csiro-robotics/SDCL.
    摘要 最终目标是在继续学习中保持之前学习的知识,以避免忘记先前的知识。为解决这问题,我们提出了一种新的知识填充技术,利用神经网络的输出/积存空间的拟合方法,以便在学习新任务时维护神经网络的知识。我们提出的方法是在第一阶段上对数据拟合空间进行线性逼近,从而利用线性子空间来模型结构,并维护神经网络的知识。我们发现这种模型方法具有一些惊喜性质,如鲁棒性和噪声Robustness,因此可以有效避免Catastrophic Forgetting问题。此外,我们还讨论了如何将我们的提议方法应用于分类和分割问题。实验表明,我们的提议方法在 Pascal VOC 和 Tiny-Imagenet 等数据集上比较出色,并且可以轻松地与现有的学习方法结合使用,以提高其性能。代码将在 https://github.com/csiro-robotics/SDCL 上公开。
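
One way to read "approximating the data manifold up to first order with linear subspaces" is sketched below: take the top-k principal directions of a feature batch under the old and new models and penalize the distance between the corresponding projection matrices. This is an illustrative loss under that reading, with assumed dimensions and batch sizes, not the authors' exact formulation.

```python
import torch

def subspace_distillation_loss(old_feats, new_feats, k=8):
    """Penalise drift between the k-dimensional linear subspaces spanned by
    old-model and new-model features of the same batch.  The Frobenius distance
    between projection matrices is used because it ignores the choice of basis.
    """
    def projector(feats):
        # Orthonormal basis of the top-k right singular directions.
        _, _, vh = torch.linalg.svd(feats - feats.mean(0), full_matrices=False)
        v = vh[:k].T                      # d x k basis
        return v @ v.T                    # d x d projection matrix
    return torch.linalg.norm(projector(old_feats) - projector(new_feats)) ** 2

old = torch.randn(64, 32)                                    # features from the frozen old model
new = (old + 0.05 * torch.randn(64, 32)).requires_grad_()    # features from the model being trained
loss = subspace_distillation_loss(old, new)
loss.backward()
print(float(loss))
```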

Bridging the Gap: Exploring the Capabilities of Bridge-Architectures for Complex Visual Reasoning Tasks

  • paper_url: http://arxiv.org/abs/2307.16395
  • repo_url: None
  • paper_authors: Kousik Rajesh, Mrigank Raman, Mohammed Asad Karim, Pranit Chawla
  • for: Studying multi-modal architectures built on large language models and their performance on complex visual reasoning tasks.
  • methods: The studied bridge-architectures project image embeddings into the text space and use the zero-shot, auto-regressive generation capabilities of LLMs to solve tasks such as VQA, captioning, and image retrieval.
  • results: Extending bridge-architectures with object-level features on the NLVR2 dataset does not improve performance; pre-training on multi-modal data is key for strong results on complex reasoning tasks. The paper also reports initial zero-shot results for the recent bridge-architecture LLaVA.
    Abstract In recent times there has been a surge of multi-modal architectures based on Large Language Models, which leverage the zero shot generation capabilities of LLMs and project image embeddings into the text space and then use the auto-regressive capacity to solve tasks such as VQA, captioning, and image retrieval. We name these architectures as "bridge-architectures" as they project from the image space to the text space. These models deviate from the traditional recipe of training transformer based multi-modal models, which involve using large-scale pre-training and complex multi-modal interactions through co or cross attention. However, the capabilities of bridge architectures have not been tested on complex visual reasoning tasks which require fine grained analysis about the image. In this project, we investigate the performance of these bridge-architectures on the NLVR2 dataset, and compare it to state-of-the-art transformer based architectures. We first extend the traditional bridge architectures for the NLVR2 dataset, by adding object level features to faciliate fine-grained object reasoning. Our analysis shows that adding object level features to bridge architectures does not help, and that pre-training on multi-modal data is key for good performance on complex reasoning tasks such as NLVR2. We also demonstrate some initial results on a recently bridge-architecture, LLaVA, in the zero shot setting and analyze its performance.
    摘要 近些时候,有一波基于大型语言模型的多模态架构出现,这些架构利用大型语言模型的零shot生成能力,将图像嵌入 proyect到文本空间中,然后使用自动逆向能力解决问题如VQA、标题和图像检索。我们称这些架构为“桥架架构”,因为它们从图像空间到文本空间进行项目。这些模型与传统的转换器基于多模态模型的训练方式不同,后者通常使用大规模预训练和复杂的多模态交互通过协同或跨模态注意力。然而, bridge 架构的能力尚未在复杂的视觉逻辑任务中测试,这些任务需要细致的图像分析。在这个项目中,我们investigate bridge 架构在 NLVR2 数据集上的性能,并与当前的转换器基于多模态模型进行比较。我们首先将传统的 bridge 架构进行了 NLVR2 数据集的扩展,添加了 объек 级别特征以便进行细致的物体分析。我们的分析表明,向 bridge 架构添加 objet 级别特征并不帮助,而预训练在多模态数据上是关键 для复杂的逻辑任务如 NLVR2 的好性能。我们还展示了一些初步的 LLaVA 桥架构在零shot设置下的性能,并进行了分析。

A Pre-trained Data Deduplication Model based on Active Learning

  • paper_url: http://arxiv.org/abs/2308.00721
  • repo_url: None
  • paper_authors: Xinyao Liu, Shengdong Du, Fengmao Lv, Hongtao Xue, Jie Hu, Tianrui Li
  • for: Addressing the dirty-data problem of duplicate records in big data to improve data quality.
  • methods: A pre-trained deduplication model based on active learning, integrating a Transformer and active learning into an end-to-end architecture and using the R-Drop method for data augmentation on each round of labeled data.
  • results: The model outperforms previous state-of-the-art methods for deduplicated data identification, achieving up to a 28% improvement in Recall on benchmark datasets.
    Abstract In the era of big data, the issue of data quality has become increasingly prominent. One of the main challenges is the problem of duplicate data, which can arise from repeated entry or the merging of multiple data sources. These "dirty data" problems can significantly limit the effective application of big data. To address the issue of data deduplication, we propose a pre-trained deduplication model based on active learning, which is the first work that utilizes active learning to address the problem of deduplication at the semantic level. The model is built on a pre-trained Transformer and fine-tuned to solve the deduplication problem as a sequence to classification task, which firstly integrate the transformer with active learning into an end-to-end architecture to select the most valuable data for deduplication model training, and also firstly employ the R-Drop method to perform data augmentation on each round of labeled data, which can reduce the cost of manual labeling and improve the model's performance. Experimental results demonstrate that our proposed model outperforms previous state-of-the-art (SOTA) for deduplicated data identification, achieving up to a 28% improvement in Recall score on benchmark datasets.
    摘要 在大数据时代,数据质量问题变得越来越突出。一个主要挑战是重复的数据问题,可能由重复入库或多个数据源合并而导致。这些“垃圾数据”问题可能会很大地限制大数据的有效应用。为解决数据混淆问题,我们提议一种基于活动学习的预训练混淆模型,这是首次利用活动学习来解决 semantic 层次的混淆问题。模型基于预训练的 Transformer 并在这上进行了精细调整,用于解决混淆问题作为一个序列分类任务,首次将 Transformer 与活动学习集成到了端到端架构中,以选择最有价值的数据进行混淆模型训练,并首次采用 R-Drop 方法进行数据扩展,可以降低人工标注成本并提高模型性能。实验结果表明,我们提议的模型在比较数据集上的混淆率达到了前一个State-of-the-art(SOTA)的28%提升。
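
The R-Drop augmentation named in the methods can be sketched independently of the deduplication specifics: the same batch is passed through the model twice so that dropout yields two predictive distributions, and a symmetric KL term is added to the usual cross-entropy. The tiny classifier, the alpha weight, and the input shapes below are assumptions for illustration; the paper pairs this with a pre-trained Transformer and active learning.

```python
import torch
import torch.nn.functional as F

def r_drop_loss(model, input_ids, attention_mask, labels, alpha=4.0):
    """R-Drop style objective: two stochastic forward passes plus symmetric KL."""
    logits1 = model(input_ids, attention_mask)
    logits2 = model(input_ids, attention_mask)
    ce = 0.5 * (F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels))
    p1, p2 = F.log_softmax(logits1, dim=-1), F.log_softmax(logits2, dim=-1)
    kl = 0.5 * (F.kl_div(p1, p2, log_target=True, reduction="batchmean")
                + F.kl_div(p2, p1, log_target=True, reduction="batchmean"))
    return ce + alpha * kl

class TinyPairClassifier(torch.nn.Module):
    """Stand-in for a BERT-style encoder with a duplicate/non-duplicate head."""
    def __init__(self):
        super().__init__()
        self.emb = torch.nn.Embedding(100, 16)
        self.drop = torch.nn.Dropout(0.3)
        self.head = torch.nn.Linear(16, 2)
    def forward(self, input_ids, attention_mask):   # mask unused in this toy model
        return self.head(self.drop(self.emb(input_ids).mean(dim=1)))

model = TinyPairClassifier().train()
ids = torch.randint(0, 100, (8, 12))
labels = torch.randint(0, 2, (8,))
print(float(r_drop_loss(model, ids, torch.ones_like(ids), labels)))
```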

STL: A Signed and Truncated Logarithm Activation Function for Neural Networks

  • paper_url: http://arxiv.org/abs/2307.16389
  • repo_url: None
  • paper_authors: Yuanhao Gong
  • for: Proposing a new activation function to improve the accuracy and runtime performance of neural networks.
  • methods: A signed and truncated logarithm activation function with favorable mathematical properties: it is odd, monotone, differentiable, has an unbounded value range, and a continuous nonzero gradient.
  • results: Comparisons against well-known activation functions in several well-known neural networks show state-of-the-art performance; the function can be applied to a wide range of networks that require activation functions.
    Abstract Activation functions play an essential role in neural networks. They provide the non-linearity for the networks. Therefore, their properties are important for neural networks' accuracy and running performance. In this paper, we present a novel signed and truncated logarithm function as activation function. The proposed activation function has significantly better mathematical properties, such as being odd function, monotone, differentiable, having unbounded value range, and a continuous nonzero gradient. These properties make it an excellent choice as an activation function. We compare it with other well-known activation functions in several well-known neural networks. The results confirm that it is the state-of-the-art. The suggested activation function can be applied in a large range of neural networks where activation functions are necessary.
    摘要 激活函数在神经网络中扮演着关键角色。它们为网络提供非线性,因此其性质对神经网络的精度和运行性能非常重要。在这篇论文中,我们提出了一种新的带符号截断对数函数作为激活函数。该激活函数具有更好的数学性质,例如是奇函数、单调、可导、取值范围无界以及导数连续且非零,这些性质使其成为激活函数的优秀选择。我们在多个常见的神经网络中将其与其他知名激活函数进行了比较,结果表明它达到了当前最佳水平。该激活函数可以应用于各种需要激活函数的神经网络中。
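
The abstract lists the properties of the proposed activation but not its closed form. One function that satisfies every listed property is f(x) = sign(x) * log(1 + |x|), sketched below; treat this as an assumed reading of "signed and truncated logarithm" offered only to illustrate the properties, not as the authors' definition.

```python
import torch

class SignedLog(torch.nn.Module):
    """Illustrative activation: odd, monotone, unbounded, with a continuous
    nonzero analytic gradient 1 / (1 + |x|)."""
    def forward(self, x):
        return torch.sign(x) * torch.log1p(torch.abs(x))

x = torch.tensor([-4.0, -1.0, -0.25, 0.25, 1.0, 4.0], requires_grad=True)
y = SignedLog()(x)
y.sum().backward()
print(y.detach())   # odd and monotonically increasing
print(x.grad)       # matches 1 / (1 + |x|) at every test point
```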

Relation-Oriented: Toward Knowledge-Aligned Causal AI

  • paper_url: http://arxiv.org/abs/2307.16387
  • repo_url: None
  • paper_authors: Jia Li, Xiang Li
  • for: Examining the misalignment in the prevailing observation-oriented modeling paradigm and proposing a relation-oriented perspective that better matches how human knowledge is formed in the era of big data.
  • methods: The paper illustrates the relation-oriented perspective with intuitive examples from computer vision and health informatics and introduces a relation-defined representation learning methodology as its practical implementation.
  • results: The analysis shows where observation-oriented modeling diverges from actual comprehension, and the relation-defined representation learning methodology is supported by extensive experimental validation.
    Abstract In machine learning, we naturally apply an Observation-Oriented principle, in which observational variables preexist and set the stage for constructing relationships. While sufficient for traditional models, the integration of AI with big data exposes the misalignment between the observational models and our actual comprehension. Contrarily, humans shape cognitive entities defined by relationships, enabling us to formulate knowledge across temporal and hyper-dimensional spaces, rather than being confined to observational constructs. From an innovative Relation-Oriented perspective, this study examines the roots of this misalignment within our current modeling paradigm, illuminated by intuitive examples from computer vision and health informatics. We also introduce the relation-defined representation learning methodology as a practical implementation of Relation-Oriented modeling, supported by extensive experimental validation. Consider an analogy where ants dwell on a two-dimensional plane of a floor. If these ants were to construct models, they might use the nearest tree as a reference to specify the elevation in their two-dimensional models. By modeling, they observe an increased disruption at the tree's mid-level, which indicates a higher chance of encountering children. However, since they fail to comprehend humans as three-dimensional beings, instead of interpreting this phenomenon in a new dimension, "height", they solely relate it to the tree's mid-level. If they migrate to a different tree with a varying height, where mid-level no longer presents a risk, they might conclude that human behavior is too complex to model effectively. Similarly, when modeling time series, we usually discount the dimension, "time", as a single timeline, which has become our "tree".
    摘要 在机器学习中,我们自然采用观察主导的原则,在其中观察变量先exists并设定了模型的场景。虽然适用于传统模型,但与人工智能和大数据相结合后,这种模型的不一致性变得更加明显。相反,人类通过形成知识的关系定义了认知实体,允许我们透过时间和多维空间形成知识,而不是仅仅遵循观察构建。从一种实际的关系主导的角度,本研究探讨了我们当前模型平台的不一致性,通过直观的计算机视觉和医疗信息学示例进行描述。我们还介绍了基于关系定义学习的方法ología,并通过广泛的实验验证。假设我们的蚂蚁在二维平面上生活。如果它们构建模型,它们可能使用最近的树作为参照,以指定模型中的高度。通过模型,它们发现高度中间层的干扰增加,表示更高的儿童遇到的可能性。但是,由于它们无法理解人类为三维存在,而不是在一个新的维度中解释这种现象,它们只是将其解释为树高度中间层的问题。如果它们migrate到另一个高度不同的树,其中中间层不再是风险,它们可能会认为人类行为是无法模型有效的。类似地,当我们模型时间序列时,我们通常忽略维度"时间",将其变成我们的"树"。

When Large Language Models Meet Personalization: Perspectives of Challenges and Opportunities

  • paper_url: http://arxiv.org/abs/2307.16376
  • repo_url: None
  • paper_authors: Jin Chen, Zheng Liu, Xu Huang, Chenwang Wu, Qi Liu, Gangwei Jiang, Yuanhao Pu, Yuxuan Lei, Xiaolong Chen, Xingmei Wang, Defu Lian, Enhong Chen
  • For: This paper discusses the potential uses of large language models (LLMs) in personalization systems, and how they can revolutionize the way personalization is conducted.
  • Methods: The paper explores the development and challenges of existing personalization systems, as well as the newly emerged capabilities of LLMs.
  • Results: The paper discusses the potential ways of making use of LLMs for personalization, including the ability to proactively explore user requests and provide personalized services.
    Abstract The advent of large language models marks a revolutionary breakthrough in artificial intelligence. With the unprecedented scale of training and model parameters, the capability of large language models has been dramatically improved, leading to human-like performances in understanding, language synthesizing, and common-sense reasoning, etc. Such a major leap-forward in general AI capacity will change the pattern of how personalization is conducted. For one thing, it will reform the way of interaction between humans and personalization systems. Instead of being a passive medium of information filtering, large language models present the foundation for active user engagement. On top of such a new foundation, user requests can be proactively explored, and user's required information can be delivered in a natural and explainable way. For another thing, it will also considerably expand the scope of personalization, making it grow from the sole function of collecting personalized information to the compound function of providing personalized services. By leveraging large language models as general-purpose interface, the personalization systems may compile user requests into plans, calls the functions of external tools to execute the plans, and integrate the tools' outputs to complete the end-to-end personalization tasks. Today, large language models are still being developed, whereas the application in personalization is largely unexplored. Therefore, we consider it to be the right time to review the challenges in personalization and the opportunities to address them with LLMs. In particular, we dedicate this perspective paper to the discussion of the following aspects: the development and challenges for the existing personalization system, the newly emerged capabilities of large language models, and the potential ways of making use of large language models for personalization.
    摘要 大语言模型的出现标志着人工智能领域的一个革命性突破。它们的规模和参数数量在前所未有地提高了大语言模型的能力,从而实现了人类化的理解、语言生成和推理等等。这种大幅提升的通用人工智能能力将改变个性化的方式。一方面,它将改变人与个性化系统之间的交互方式。而不是仅作为信息滤波的温馈媒体,大语言模型将成为活跃的用户参与基础。在这个新基础上,用户的请求可以积极探索,并将用户需要的信息传达在自然和可追溯的方式。另一方面,它将扩大个性化的范围,从单一的收集个性信息变为多重功能的提供个性服务。通过利用大语言模型作为通用界面,个性化系统可以将用户的请求编译成计划,调用外部工具的函数执行计划,并将工具输出集成到完成个性化任务。目前,大语言模型仍在开发中,个性化应用则尚未得到广泛的探索。因此,我们认为现在是评估个性化挑战和利用大语言模型解决这些挑战的时候。本观点文特别关注以下方面:现有个性化系统的发展和挑战,大语言模型新出现的能力,以及可能的使用大语言模型进行个性化的方式。

Promptly: Using Prompt Problems to Teach Learners How to Effectively Utilize AI Code Generators

  • paper_url: http://arxiv.org/abs/2307.16364
  • repo_url: None
  • paper_authors: Paul Denny, Juho Leinonen, James Prather, Andrew Luxton-Reilly, Thezyrie Amarouche, Brett A. Becker, Brent N. Reeves
  • for: This paper aims to introduce a novel pedagogical concept called “Prompt Problem” to help students learn how to craft effective prompts for large language models (LLMs) in computing education.
  • methods: The paper presents a novel tool called Promptly, which hosts a repository of Prompt Problems and automates the evaluation of prompt-generated code. The authors conducted a field study with 54 first-year Python programming students to explore student interactions with the tool and their perceptions of the Prompt Problem concept.
  • results: The study found that Promptly was well-received by students for its ability to engage their computational thinking skills and expose them to new programming constructs. The authors also discuss avenues for future work, including variations on the design of Prompt Problems and the need to study their integration into the curriculum and teaching practice.
    Abstract With their remarkable ability to generate code, large language models (LLMs) are a transformative technology for computing education practice. They have created an urgent need for educators to rethink pedagogical approaches and teaching strategies for newly emerging skill sets. Traditional approaches to learning programming have focused on frequent and repeated practice at writing code. The ease with which code can now be generated has resulted in a shift in focus towards reading, understanding and evaluating LLM-generated code. In parallel with this shift, a new essential skill is emerging -- the ability to construct good prompts for code-generating models. This paper introduces a novel pedagogical concept known as a `Prompt Problem', designed to help students learn how to craft effective prompts for LLMs. A Prompt Problem challenges a student to create a natural language prompt that leads an LLM to produce the correct code for a specific problem. To support the delivery of Prompt Problems at scale, in this paper we also present a novel tool called Promptly which hosts a repository of Prompt Problems and automates the evaluation of prompt-generated code. We report empirical findings from a field study in which Promptly was deployed in a first-year Python programming course (n=54). We explore student interactions with the tool and their perceptions of the Prompt Problem concept. We found that Promptly was largely well-received by students for its ability to engage their computational thinking skills and expose them to new programming constructs. We also discuss avenues for future work, including variations on the design of Prompt Problems and the need to study their integration into the curriculum and teaching practice.
    摘要 带有杰出代码生成能力的大语言模型(LLM)已经是计算教育实践中的一种转变性技术。它们对教学方法和教学策略的需求已经产生了紧迫性,让教师需要重新思考教学方法。传统的编程学习方法通常是通过频繁地编写代码来帮助学生学习编程。然而,由于代码可以非常容易地生成,因此教学的重点已经从编写代码转移到了阅读、理解和评估 LLM 生成的代码。同时,一种新的重要技能正在emerging---如何构建好的提问。这篇论文介绍了一种新的教学概念,即“提问问题”(Prompt Problem),用于帮助学生学习如何编写有效的提问。一个 Prompt Problem 挑战学生创建一个自然语言提问,使 LLM 生成 correct 代码来解决特定问题。为了在大规模上提供 Prompt Problems,这篇论文还介绍了一种名为 Promptly 的新工具,该工具hosts一个提问问题的存储库和自动评估提问生成的代码。我们在一个 Python 编程课程中(n=54)进行了一项场景研究,并报告了学生与工具的交互和提问问题概念的看法。我们发现 Promptly 受到了学生的欢迎,他们认为该工具能够帮助他们发展计算思维和暴露他们于新编程构造。我们还讨论了未来工作的可能性,包括提问问题的设计变化和在课程和教学实践中的集成。
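
The automated evaluation step that Promptly performs can be pictured as a small grader that executes the LLM-generated code against a Prompt Problem's test cases. The snippet below is a simplified stand-in (no sandboxing, hypothetical function name), not the tool's actual backend.

```python
def grade_prompt_problem(generated_code, test_cases, func_name="solve"):
    """Run LLM-generated code and count how many hidden test cases it passes.
    A real grader would sandbox the execution; this sketch assumes trusted input.
    """
    namespace = {}
    try:
        exec(generated_code, namespace)
        fn = namespace[func_name]
        passed = sum(fn(*args) == expected for args, expected in test_cases)
        return passed, len(test_cases)
    except Exception:
        return 0, len(test_cases)

llm_output = "def solve(a, b):\n    return a + b\n"
print(grade_prompt_problem(llm_output, [((1, 2), 3), ((0, 0), 0)]))  # -> (2, 2)
```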

BearingPGA-Net: A Lightweight and Deployable Bearing Fault Diagnosis Network via Decoupled Knowledge Distillation and FPGA Acceleration

  • paper_url: http://arxiv.org/abs/2307.16363
  • repo_url: https://github.com/asdvfghg/bearingpga-net
  • paper_authors: Jing-Xiao Liao, Sheng-Lai Wei, Chen-Long Xie, Tieyong Zeng, Jinwei Sun, Shiping Zhang, Xiaoge Zhang, Feng-Lei Fan
  • for: Proposing a lightweight and deployable bearing fault diagnosis model, BearingPGA-Net, to overcome the model size and computational complexity that keep existing deep learning models out of industrial settings requiring high speed, strong portability, and low power consumption.
  • methods: BearingPGA-Net is trained via decoupled knowledge distillation from a well-trained large model; despite its small size, it matches or exceeds other lightweight state-of-the-art methods in fault diagnosis performance.
  • results: An FPGA acceleration scheme implemented in Verilog, with customized quantization and per-layer programmable logic emphasizing parallel computing and module reuse, achieves over 200x faster diagnosis than a CPU with less than a 0.4% drop in F1, Recall, and Precision; to the authors' knowledge this is the first CNN-based bearing fault diagnosis model deployed on an FPGA.
    Abstract Deep learning has achieved remarkable success in the field of bearing fault diagnosis. However, this success comes with larger models and more complex computations, which cannot be transferred into industrial fields requiring models to be of high speed, strong portability, and low power consumption. In this paper, we propose a lightweight and deployable model for bearing fault diagnosis, referred to as BearingPGA-Net, to address these challenges. Firstly, aided by a well-trained large model, we train BearingPGA-Net via decoupled knowledge distillation. Despite its small size, our model demonstrates excellent fault diagnosis performance compared to other lightweight state-of-the-art methods. Secondly, we design an FPGA acceleration scheme for BearingPGA-Net using Verilog. This scheme involves the customized quantization and designing programmable logic gates for each layer of BearingPGA-Net on the FPGA, with an emphasis on parallel computing and module reuse to enhance the computational speed. To the best of our knowledge, this is the first instance of deploying a CNN-based bearing fault diagnosis model on an FPGA. Experimental results reveal that our deployment scheme achieves over 200 times faster diagnosis speed compared to CPU, while achieving a lower-than-0.4\% performance drop in terms of F1, Recall, and Precision score on our independently-collected bearing dataset. Our code is available at \url{https://github.com/asdvfghg/BearingPGA-Net}.
    摘要 深度学习在滚珠缺陷诊断领域取得了很大成功。然而,这些成功带来更大的模型和更复杂的计算,无法在需要高速、强可移植和低功耗的工业领域中传输。在这篇论文中,我们提出了一个轻量级可部署的滚珠缺陷诊断模型,称为BearingPGA-Net,以解决这些挑战。首先,我们通过大型模型的帮助,使用分离知识采样来训练BearingPGA-Net。尽管它的体积小,我们的模型在其他轻量级state-of-the-art方法的比较中仍然表现出色。其次,我们为BearingPGA-Net设计了FPGA加速方案,使用Verilog语言设计。这个方案包括对BearingPGA-Net每层的自定义量化和FPGA中的可编程逻辑门的设计,强调并行计算和模块重用以提高计算速度。我们知道,这是第一次将CNN基于滚珠缺陷诊断模型部署到FPGA上。实验结果表明,我们的部署方案在CPU上进行诊断速度比较,可以达到200倍以上的加速速度,同时在独立收集的滚珠数据集上保持下降0.4%以下的F1、回归和准确率分数。我们的代码可以在 上下载。

Distributionally Robust Safety Filter for Learning-Based Control in Active Distribution Systems

  • paper_url: http://arxiv.org/abs/2307.16351
  • repo_url: None
  • paper_authors: Hoang Tien Nguyen, Dae-Hyun Choi
  • for: Preventing operational constraint violations when deep reinforcement learning (DRL) agents are trained on real-world active distribution systems.
  • methods: A universal distributionally robust safety filter (DRSF) that significantly reduces constraint violations during training while preserving near-optimal solutions; it is formulated as a distributionally robust optimization problem with chance constraints on operational limits, computing minimally modified near-optimal actions with a probabilistic constraint-satisfaction guarantee under model uncertainty.
  • results: The proposed DRSF is verified on the IEEE 33-bus and 123-bus systems, reducing constraint violations while maintaining near-optimal solutions.
    Abstract Operational constraint violations may occur when deep reinforcement learning (DRL) agents interact with real-world active distribution systems to learn their optimal policies during training. This letter presents a universal distributionally robust safety filter (DRSF) using which any DRL agent can reduce the constraint violations of distribution systems significantly during training while maintaining near-optimal solutions. The DRSF is formulated as a distributionally robust optimization problem with chance constraints of operational limits. This problem aims to compute near-optimal actions that are minimally modified from the optimal actions of DRL-based Volt/VAr control by leveraging the distribution system model, thereby providing constraint satisfaction guarantee with a probability level under the model uncertainty. The performance of the proposed DRSF is verified using the IEEE 33-bus and 123-bus systems.
    摘要 对深度强化学习(DRL)代理与实际运行系统进行交互学习时,运行系统约束可能会被违反。这封信提出了一种通用分布robust安全筛选器(DRSF),可以在DRL代理训练过程中减少运行系统约束违反的概率,同时保持近似优解。DRSF是以分布robust优化问题的形式表述,其目标是在操作限制下计算近似优解,并且具有随机变量模型的承诺保证。此外,DRSF可以利用DRL基于Volt/Var控制的优化解,从而提供约束满足保证,并且在模型不确定性下保证优化解的可行性。本文通过IEEE 33-bus和123-bus系统的实验,证明了提案的DRSF的性能。
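
A safety filter of the kind described above can be pictured as a small convex program that minimally perturbs the DRL agent's action so that linearized voltage limits hold with some robustness margin. The cvxpy sketch below is a deterministic simplification: the sensitivity model, the bus and device counts, and the fixed margin standing in for the distributionally robust chance-constraint tightening are all illustrative assumptions.

```python
import cvxpy as cp
import numpy as np

def safety_filter(a_rl, S, v0, v_min, v_max, margin):
    """Minimally modify a DRL action so that (linearized) voltage limits hold.

    a_rl:   action proposed by the DRL Volt/VAr agent (e.g. reactive power set-points).
    S, v0:  linearized sensitivity model v ~ v0 + S @ a of bus voltages to the action.
    margin: extra slack standing in for the chance-constraint tightening that the
            real DRSF derives from the model uncertainty.
    """
    a = cp.Variable(len(a_rl))
    v = v0 + S @ a
    constraints = [v >= v_min + margin, v <= v_max - margin]
    cp.Problem(cp.Minimize(cp.sum_squares(a - a_rl)), constraints).solve()
    return a.value

np.random.seed(0)
S = 0.05 * np.random.rand(4, 2)          # toy 4-bus, 2-device sensitivity matrix
a_safe = safety_filter(np.array([1.5, -2.0]), S, v0=np.ones(4), v_min=0.95, v_max=1.05, margin=0.01)
print(a_safe)
```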

Rating-based Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2307.16348
  • repo_url: https://github.com/Aryia-Behroziuan/References
  • paper_authors: Devin White, Mingkang Wu, Ellen Novoseller, Vernon Lawhern, Nick Waytowich, Yongcan Cao
  • for: Developing a rating-based reinforcement learning approach that obtains human guidance from ratings of individual trajectories, unlike existing preference- and ranking-based paradigms that require relative comparisons between sample pairs.
  • methods: A new prediction model for human ratings combined with a novel multi-class loss function.
  • results: Experimental studies based on synthetic and real human ratings demonstrate the effectiveness and benefits of the new approach.
    Abstract This paper develops a novel rating-based reinforcement learning approach that uses human ratings to obtain human guidance in reinforcement learning. Different from the existing preference-based and ranking-based reinforcement learning paradigms, based on human relative preferences over sample pairs, the proposed rating-based reinforcement learning approach is based on human evaluation of individual trajectories without relative comparisons between sample pairs. The rating-based reinforcement learning approach builds on a new prediction model for human ratings and a novel multi-class loss function. We conduct several experimental studies based on synthetic ratings and real human ratings to evaluate the effectiveness and benefits of the new rating-based reinforcement learning approach.
    摘要 这个论文提出了一种新的评分基于权重学习方法,利用人类评分来获得人类指导。与现有的偏好基于样本对比和排名基于样本对比的学习 paradigms不同,提posed评分基于学习方法是基于人类评分个 trajectory 而不是对样本对比的Relative preferences。这种评分基于学习方法采用了一种新的预测模型和一种多类损失函数。我们在使用 synthetic 评分和实际人类评分进行了多个实验研究,以评估新评分基于学习方法的效果和优势。
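
At its core, the rating-based setup needs a model that maps a whole trajectory to a distribution over discrete human rating classes. The sketch below trains such a predictor on random data with plain cross-entropy; the feature dimensions and pooling are invented, and the paper's novel multi-class loss and the downstream policy learning are not reproduced here.

```python
import torch
import torch.nn.functional as F

class RatingPredictor(torch.nn.Module):
    """Maps a trajectory (here: the mean of its per-step features) to logits over
    discrete human rating classes, e.g. 1-5 stars."""
    def __init__(self, feat_dim, n_ratings=5):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(feat_dim, 64), torch.nn.ReLU(), torch.nn.Linear(64, n_ratings))
    def forward(self, traj_feats):               # (batch, T, feat_dim)
        return self.net(traj_feats.mean(dim=1))  # simple pooling over time steps

model = RatingPredictor(feat_dim=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
trajs = torch.randn(32, 50, 8)                   # 32 rated trajectories of length 50
ratings = torch.randint(0, 5, (32,))             # one human rating per trajectory
for _ in range(100):
    opt.zero_grad()
    loss = F.cross_entropy(model(trajs), ratings)  # plain CE stands in for the paper's loss
    loss.backward()
    opt.step()
print(float(loss))
```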

Proof-of-Federated-Learning-Subchain: Free Partner Selection Subchain Based on Federated Learning

  • paper_url: http://arxiv.org/abs/2307.16342
  • repo_url: None
  • paper_authors: Boyang Li, Bingyu Shen, Qing Lu, Taeho Jung, Yiyu Shi
  • for: Proposing a new consensus, Proof-of-Federated-Learning-Subchain (PoFLSC), to fill the gap left by earlier Proof-of-Deep-Learning (PoDL) consensuses, using energy more efficiently while maintaining the ledger.
  • methods: PoFLSC applies a subchain to record training, challenging, and auditing activities and emphasizes the value of participants' datasets in partner selection.
  • results: Simulations with 20 miners demonstrate the effectiveness of PoFLSC: when the subchain pool size is limited, miners with higher Shapley Value (SV) gain a better chance of being selected, and the consensus helps the subchain manager recognize reservation priority and the core partition of contributors to establish and maintain a competitive subchain.
    Abstract The continuous thriving of the Blockchain society motivates research in novel designs of schemes supporting cryptocurrencies. Previously multiple Proof-of-Deep-Learning(PoDL) consensuses have been proposed to replace hashing with useful work such as deep learning model training tasks. The energy will be more efficiently used while maintaining the ledger. However deep learning models are problem-specific and can be extremely complex. Current PoDL consensuses still require much work to realize in the real world. In this paper, we proposed a novel consensus named Proof-of-Federated-Learning-Subchain(PoFLSC) to fill the gap. We applied a subchain to record the training, challenging, and auditing activities and emphasized the importance of valuable datasets in partner selection. We simulated 20 miners in the subchain to demonstrate the effectiveness of PoFLSC. When we reduce the pool size concerning the reservation priority order, the drop rate difference in the performance in different scenarios further exhibits that the miner with a higher Shapley Value (SV) will gain a better opportunity to be selected when the size of the subchain pool is limited. In the conducted experiments, the PoFLSC consensus supported the subchain manager to be aware of reservation priority and the core partition of contributors to establish and maintain a competitive subchain.
    摘要 “ blockchain 社会的持续发展为我们的研究提供了新的设计方案,以支持 криптовалютой。先前,多种 Proof-of-Deep-Learning(PoDL)的共识方案已经被提出,以替换哈希函数,使用有用的工作,如深度学习模型训练任务。这将更有效地使用能量,同时维护 ledger。但是深度学习模型是问题特定的,可能非常复杂。当前的 PoDL 共识方案仍需要大量的实践和改进。在这篇论文中,我们提出了一种新的共识方案,名为 Proof-of-Federated-Learning-Subchain(PoFLSC)。我们使用了一个子链来记录训练、挑战和审核活动,并强调了合理的数据集的选择。我们在 simulations 中使用 20 个矿工,以示 PoFLSC 的效果。当我们降低了池子大小,关于保留优先级顺序,则drop rate 差异在不同的场景中进一步表现出,表明矿工 avec 更高的 Shapley 值(SV)在池子大小有限化时会获得更好的机会被选择。在我们的实验中,PoFLSC 共识支持了子链经理者了解保留优先级,以及核心分区的参与者们建立和维护竞争性的子链。”
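
Shapley-value-based partner selection, the ingredient highlighted in the results above, can be sketched with a Monte-Carlo estimator over random join orders. The per-miner "dataset quality" numbers and the capped coalition value below are invented stand-ins for the validation-accuracy contribution a real subchain would measure.

```python
import random

def monte_carlo_shapley(miners, coalition_value, rounds=2000, seed=0):
    """Estimate each miner's Shapley value by sampling random join orders."""
    rng = random.Random(seed)
    shapley = {m: 0.0 for m in miners}
    for _ in range(rounds):
        order = list(miners)
        rng.shuffle(order)
        coalition, prev = frozenset(), 0.0
        for m in order:
            coalition = coalition | {m}
            value = coalition_value(coalition)
            shapley[m] += (value - prev) / rounds   # marginal contribution in this order
            prev = value
    return shapley

quality = {"A": 0.6, "B": 0.5, "C": 0.2}                       # toy per-miner dataset quality
pooled_value = lambda s: min(1.0, sum(quality[m] for m in s))  # pooled value saturates at 1.0
sv = monte_carlo_shapley(list(quality), pooled_value)
reserved = sorted(sv, key=sv.get, reverse=True)[:2]            # reserve the highest-SV miners first
print(sv, reserved)
```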

Anatomy of an AI-powered malicious social botnet

  • paper_url: http://arxiv.org/abs/2307.16336
  • repo_url: https://github.com/osome-iu/aibot_fox8
  • paper_authors: Kai-Cheng Yang, Filippo Menczer
  • for: Investigating social media accounts that use large language models (LLMs) to generate human-like content and disguise themselves as genuine users.
  • methods: Heuristics are used to identify 1,140 Twitter accounts that appear to use ChatGPT to generate content, which are then validated as fake through manual annotation.
  • results: The fake personas post machine-generated content and stolen images, interact with one another through replies and retweets, promote suspicious websites, and spread harmful comments, yet current state-of-the-art LLM content classifiers fail to distinguish them from human accounts in the wild.
    Abstract Large language models (LLMs) exhibit impressive capabilities in generating realistic text across diverse subjects. Concerns have been raised that they could be utilized to produce fake content with a deceptive intention, although evidence thus far remains anecdotal. This paper presents a case study about a Twitter botnet that appears to employ ChatGPT to generate human-like content. Through heuristics, we identify 1,140 accounts and validate them via manual annotation. These accounts form a dense cluster of fake personas that exhibit similar behaviors, including posting machine-generated content and stolen images, and engage with each other through replies and retweets. ChatGPT-generated content promotes suspicious websites and spreads harmful comments. While the accounts in the AI botnet can be detected through their coordination patterns, current state-of-the-art LLM content classifiers fail to discriminate between them and human accounts in the wild. These findings highlight the threats posed by AI-enabled social bots.
    摘要 大型语言模型(LLM)展示了在各种主题上生成逼真文本的能力,但人们担忧其可能被用于带有欺骗意图的虚假内容生成,尽管目前的证据仍属传闻性质。这篇文章描述了一个疑似使用ChatGPT生成类人内容的Twitter botnet案例。通过启发式规则,我们识别了1,140个帐户,并通过人工标注加以验证。这些帐户构成了一个密集的虚假人设集群,表现出相似的行为,包括发布机器生成内容和盗用图片,并通过回复和转推相互交流。ChatGPT生成的内容推广可疑网站并散播有害评论。虽然这些AI botnet帐户可以通过其协调模式被检测出来,但现有最先进的LLM内容分类器无法在真实环境中将它们与人类帐户区分开来。这些发现凸显了AI驱动的社交机器人所带来的威胁。

Evaluating ChatGPT and GPT-4 for Visual Programming

  • paper_url: http://arxiv.org/abs/2308.02522
  • repo_url: None
  • paper_authors: Adish Singla
  • for: Studying whether state-of-the-art generative AI and large language models are as capable in visual programming, widely used for K-8 programming education, as they are in text-based Python programming.
  • methods: Two models, ChatGPT (based on GPT-3.5) and GPT-4, are evaluated on reference tasks from the domains of Hour of Code: Maze Challenge by Code-dot-org and Karel, with performance assessed via expert-based annotations.
  • results: The models perform poorly in visual programming, struggling to combine the spatial, logical, and programming skills it requires; the findings point to directions for improving generative models in this domain.
    Abstract Generative AI and large language models have the potential to drastically improve the landscape of computing education by automatically generating personalized feedback and content. Recent works have studied the capabilities of these models for different programming education scenarios; however, these works considered only text-based programming, in particular, Python programming. Consequently, they leave open the question of how well these models would perform in visual programming domains popularly used for K-8 programming education. The main research question we study is: Do state-of-the-art generative models show advanced capabilities in visual programming on par with their capabilities in text-based Python programming? In our work, we evaluate two models, ChatGPT (based on GPT-3.5) and GPT-4, in visual programming domains for various scenarios and assess performance using expert-based annotations. In particular, we base our evaluation using reference tasks from the domains of Hour of Code: Maze Challenge by Code-dot-org and Karel. Our results show that these models perform poorly and struggle to combine spatial, logical, and programming skills crucial for visual programming. These results also provide exciting directions for future work on developing techniques to improve the performance of generative models in visual programming.
    摘要 生成AI和大语言模型有可能在计算教育领域带来很大改进,自动生成个性化反馈和内容。先前的研究已经研究了这些模型在不同的编程教育场景中的能力,但是这些研究仅考虑了文本编程,尤其是Python编程。因此,它们留下了如何在视觉编程领域中表现的问题。我们的研究问题是:现代生成模型在视觉编程领域中是否有高度的能力,与文本基于Python编程的能力相当?在我们的工作中,我们评估了两个模型:ChatGPT(基于GPT-3.5)和GPT-4,在不同的场景下进行了评估,并使用专家标注来评估性能。具体来说,我们基于Code-dot-org的Hour of Code:迷宫挑战和Karel的参考任务进行评估。我们的结果表明,这些模型在视觉编程中表现不佳,它们困难于结合空间逻辑和编程技能,这些技能是视觉编程的关键。这些结果还提供了未来开发改进生成模型在视觉编程中表现的可能性的激动人心的方向。

Representing and Reasoning with Multi-Stakeholder Qualitative Preference Queries

  • paper_url: http://arxiv.org/abs/2307.16307
  • repo_url: None
  • paper_authors: Samik Basu, Vasant Honavar, Ganesh Ram Santhanam, Jia Tao
  • for: Reasoning about outcomes in decision-making scenarios that must accommodate the qualitative preferences of multiple stakeholders.
  • methods: Stakeholder preferences are expressed in qualitative preference languages such as CP-net, CI-net, TCP-net, and CP-Theory, and a query language is introduced for posing queries over sets of outcomes against these preferences, with several alternative semantics analyzed.
  • results: A provably correct algorithm answers multi-stakeholder qualitative preference queries via model checking in the alternation-free mu-calculus, and experimental results demonstrate the feasibility of the approach.
    Abstract Many decision-making scenarios, e.g., public policy, healthcare, business, and disaster response, require accommodating the preferences of multiple stakeholders. We offer the first formal treatment of reasoning with multi-stakeholder qualitative preferences in a setting where stakeholders express their preferences in a qualitative preference language, e.g., CP-net, CI-net, TCP-net, CP-Theory. We introduce a query language for expressing queries against such preferences over sets of outcomes that satisfy specified criteria, e.g., $\mlangpref{\psi_1}{\psi_2}{A}$ (read loosely as the set of outcomes satisfying $\psi_1$ that are preferred over outcomes satisfying $\psi_2$ by a set of stakeholders $A$). Motivated by practical application scenarios, we introduce and analyze several alternative semantics for such queries, and examine their interrelationships. We provide a provably correct algorithm for answering multi-stakeholder qualitative preference queries using model checking in alternation-free $\mu$-calculus. We present experimental results that demonstrate the feasibility of our approach.
    摘要 多种决策场景,如公共政策、医疗、商业和灾难应急处理,需要考虑多个利益相关者的首选。我们提供了首个正式对多个利益相关者质量预ферен斯的理解,在一种使用质量预ферен斯语言表达利益偏好的设定下。我们引入了一种表达对多个利益相关者对结果集的偏好 queries 的查询语言,例如 $\mlangpref{\psi_1}{\psi_2}{A}$(翻译为:结果集满足 $\psi_1$ 的结果集,在 $\psi_2$ 中超过 $\psi_1$ 的结果集的所有利益相关者 $A$ 中的偏好)。受实际应用场景的限制,我们引入了和分析了多种代理 semantics,并研究它们之间的关系。我们提供了一种可靠的回答多个利益相关者质量预ферен斯查询的可靠算法,使用无需分支的 $\mu$-calculus 进行模板检查。我们发现实际结果,证明了我们的方法的可行性。
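
To give a feel for the kind of query the paper formalizes, the snippet below evaluates one deliberately simple "for-all" semantics of a query asking for the psi1-outcomes that every stakeholder in a group prefers to every psi2-outcome. The outcomes, predicates, and preference relations are toy stand-ins; the paper defines several alternative semantics and answers queries via mu-calculus model checking rather than enumeration.

```python
def preferred_outcomes(outcomes, psi1, psi2, stakeholders, prefers):
    """psi1-outcomes that every stakeholder strictly prefers to every psi2-outcome."""
    set1 = [o for o in outcomes if psi1(o)]
    set2 = [o for o in outcomes if psi2(o)]
    return [o1 for o1 in set1
            if all(prefers(s, o1, o2) for s in stakeholders for o2 in set2)]

def prefers(stakeholder, a, b):
    # Toy stand-in for CP-net induced orders: each stakeholder ranks (cost, quality)
    # outcomes lexicographically, starting from a different attribute.
    if stakeholder == "quality-first":
        return (a[1], -a[0]) > (b[1], -b[0])
    return (-a[0], a[1]) > (-b[0], b[1])

outcomes = [(1, 3), (2, 2), (3, 1)]                   # (cost, quality) pairs
low_cost = lambda o: o[0] <= 2
high_cost = lambda o: o[0] == 3
print(preferred_outcomes(outcomes, low_cost, high_cost, ["quality-first", "cost-first"], prefers))
```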

LaFiCMIL: Rethinking Large File Classification from the Perspective of Correlated Multiple Instance Learning

  • paper_url: http://arxiv.org/abs/2308.01413
  • repo_url: None
  • paper_authors: Tiezhu Sun, Weiguo Pian, Nadia Daoudi, Kevin Allix, Tegawendé F. Bissyandé, Jacques Klein
  • for: Overcoming the input-length limit of Transformer models such as BERT on large file classification tasks across domains.
  • methods: LaFiCMIL approaches large file classification from the perspective of correlated multiple instance learning, serving as a versatile framework for binary, multi-class, and multi-label classification tasks spanning Natural Language Processing, Programming Language Processing, and Android Analysis.
  • results: Using BERT-family models as feature extractors, LaFiCMIL achieves new state-of-the-art performance across eight benchmark datasets covering long document classification, code defect detection, and Android malware detection, largely by scaling BERT up to nearly 20K tokens on a single Tesla V-100 GPU with 32GB of memory.
    Abstract Transformer-based models, such as BERT, have revolutionized various language tasks, but still struggle with large file classification due to their input limit (e.g., 512 tokens). Despite several attempts to alleviate this limitation, no method consistently excels across all benchmark datasets, primarily because they can only extract partial essential information from the input file. Additionally, they fail to adapt to the varied properties of different types of large files. In this work, we tackle this problem from the perspective of correlated multiple instance learning. The proposed approach, LaFiCMIL, serves as a versatile framework applicable to various large file classification tasks covering binary, multi-class, and multi-label classification tasks, spanning various domains including Natural Language Processing, Programming Language Processing, and Android Analysis. To evaluate its effectiveness, we employ eight benchmark datasets pertaining to Long Document Classification, Code Defect Detection, and Android Malware Detection. Leveraging BERT-family models as feature extractors, our experimental results demonstrate that LaFiCMIL achieves new state-of-the-art performance across all benchmark datasets. This is largely attributable to its capability of scaling BERT up to nearly 20K tokens, running on a single Tesla V-100 GPU with 32G of memory.
    摘要 transformer-based 模型,如 BERT,已经革命化了多种语言任务,但仍然在大文件分类任务中困难,主要因为其输入限制(例如 512 个 tokens)。 despite several attempts to alleviate this limitation, no method has consistently excelled across all benchmark datasets, primarily because they can only extract partial essential information from the input file. In addition, they fail to adapt to the varied properties of different types of large files.在这种情况下,我们从相关多个实例学习的角度来解决这个问题。我们提出了一种名为 LaFiCMIL 的框架,可以应用于多种大文件分类任务,包括二分类、多类和多标签分类任务,覆盖了自然语言处理、编程语言处理和 Android 分析等领域。为了评估其效果,我们使用了八个基准数据集, relate to Long Document Classification、Code Defect Detection 和 Android Malware Detection。通过使用 BERT 家族模型作为特征提取器,我们的实验结果表明,LaFiCMIL 在所有基准数据集上达到了新的状态之术性能。这主要归功于它可以在单个 Tesla V-100 GPU 上运行,并且可以扩展到 nearly 20K 个 tokens。
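
The multiple-instance framing above can be sketched as: split the long token sequence into fixed-size chunks (the instances), encode each chunk, and pool the chunk embeddings with attention into a single bag representation for classification. The tiny embedding encoder, chunk size, and pooling below are illustrative assumptions; LaFiCMIL itself builds on BERT-family encoders and a correlated-MIL aggregation.

```python
import torch

class ChunkedMILClassifier(torch.nn.Module):
    """Treat a large file as a bag of fixed-size chunks and classify the bag."""
    def __init__(self, vocab=30000, dim=64, n_classes=2, chunk_len=512):
        super().__init__()
        self.chunk_len = chunk_len
        self.encoder = torch.nn.Embedding(vocab, dim)   # stand-in for a BERT-family encoder
        self.attn = torch.nn.Linear(dim, 1)
        self.head = torch.nn.Linear(dim, n_classes)

    def forward(self, token_ids):                        # (total_len,) tokens of one large file
        chunks = token_ids.split(self.chunk_len)         # the bag of instances
        embs = torch.stack([self.encoder(c).mean(dim=0) for c in chunks])  # (n_chunks, dim)
        weights = torch.softmax(self.attn(embs), dim=0)                    # attention over instances
        bag = (weights * embs).sum(dim=0)                                   # bag representation
        return self.head(bag)

model = ChunkedMILClassifier()
tokens = torch.randint(0, 30000, (20_000,))              # ~20k tokens, far past a 512-token limit
print(model(tokens))
```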

Implementing Edge Based Object Detection For Microplastic Debris

  • paper_url: http://arxiv.org/abs/2307.16289
  • repo_url: None
  • paper_authors: Amardeep Singh, Prof. Charles Jia, Prof. Donald Kirk
  • for: This work aims to help address the plastic waste problem by developing computer-vision-based edge/mobile devices for detecting and removing plastic debris.
  • methods: The study applies computer vision and open vision techniques, coupled with robotic appendages, to improve the efficiency of detecting and removing plastic debris.
  • results: An augmented CNN approach enables timely detection of plastic debris in sampled images and comparison across waste types; the study also identifies the preprocessing steps and hardware configurations best suited to extending waste detection studies to larger environments.
    Abstract Plastic has become an indispensable part of our day-to-day activities, and a source of problems due to its non-biodegradable nature and cheap production. With these problems comes the challenge of mitigating and responding to the after-effects of disposal, or the lack of proper disposal, which leads to waste concentrating in particular locations and disturbing ecosystems for both plants and animals. As plastic debris levels continue to rise with the accumulation of waste in garbage patches, in landfills, and more hazardously in natural water bodies, swift action is necessary to plug or cease this flow. While manual sorting operations and detection can offer a solution, they can be augmented with advanced computer imagery linked to robotic appendages for removing waste. The primary applications of focus in this report are the much-discussed computer vision and open vision approaches, which are attractive for their light dependence on internet connectivity and their ability to relay information in remote areas. These applications can support the creation of edge-based mobility devices that counter the growing problem of plastic debris in oceans and rivers, demanding little connectivity while still delivering the same results with reasonably timed maintenance. The principal findings of this project cover the various methods that were tested and deployed to detect waste in images, as well as their comparison across different waste types. The project has produced workable models that can perform timely detection on sampled images using an augmented CNN approach. Later portions of the project also yield a better understanding of the preprocessing steps required to reach the best accuracies, as well as the most suitable hardware for expanding waste detection studies to larger environments.
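The report's own augmented-CNN model is not reproduced here; as a hedged stand-in, the sketch below shows the general shape of edge-oriented detection inference using a lightweight, COCO-pretrained SSDLite + MobileNetV3 detector from torchvision. The weights, score threshold, and image path are illustrative assumptions rather than the project's actual configuration.

```python
import torch
from torchvision.io import read_image
from torchvision.models.detection import ssdlite320_mobilenet_v3_large
from torchvision.transforms.functional import convert_image_dtype

# Lightweight detector suited to low-power edge hardware (torchvision >= 0.13 assumed).
model = ssdlite320_mobilenet_v3_large(weights="DEFAULT")
model.eval()

def detect(image_path, score_threshold=0.5):
    """Run single-image inference and keep detections above the threshold."""
    img = convert_image_dtype(read_image(image_path), torch.float)  # CHW float in [0, 1]
    with torch.no_grad():
        pred = model([img])[0]  # dict with 'boxes', 'labels', 'scores'
    keep = pred["scores"] >= score_threshold
    return pred["boxes"][keep], pred["labels"][keep], pred["scores"][keep]

# Example (hypothetical image of sampled debris):
# boxes, labels, scores = detect("sampled_debris.jpg")
```

A project-specific system would replace the COCO weights with a model fine-tuned on labeled debris images, which is where the preprocessing and hardware findings mentioned in the abstract come into play.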

Towards Learned Predictability of Storage Systems

  • paper_url: http://arxiv.org/abs/2307.16288
  • repo_url: None
  • paper_authors: Chenyuan Wu
  • for: This paper focuses on the proactive prediction of performance instability and hardware failures in storage systems, with the goal of improving their reliability and availability.
  • methods: The paper surveys various mechanisms and field studies that have been proposed in the past few years for predicting performance instability and device failures in storage systems, with a focus on machine learning-based black-box approaches.
  • results: The paper evaluates the strengths and limitations of three representative research works in this field, providing insights into the effectiveness of machine learning-based approaches for predicting performance instability and device failures in storage systems.
    Abstract With the rapid development of cloud computing and big data technologies, storage systems have become a fundamental building block of datacenters, incorporating hardware innovations such as flash solid state drives and non-volatile memories, as well as software infrastructures such as RAID and distributed file systems. Despite the growing popularity of and interest in storage, designing and implementing reliable storage systems remains challenging, due to their performance instability and prevailing hardware failures. Proactive prediction greatly strengthens the reliability of storage systems. There are two dimensions of prediction: performance and failure. Ideally, by detecting slow IO requests in advance and predicting device failures before they actually happen, we can build storage systems with especially low tail latency and high availability. While its importance is well recognized, such proactive prediction in storage systems is particularly difficult. To move towards predictability of storage systems, various mechanisms and field studies have been proposed in the past few years. In this report, we present a survey of these mechanisms and field studies, focusing on machine learning-based black-box approaches. Based on three representative research works, we discuss where and how machine learning should be applied in this field. The strengths and limitations of each research work are also evaluated in detail.
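To make the "machine learning-based black-box" idea concrete, here is a minimal sketch of one common instantiation: training a classifier on per-drive health telemetry (e.g., SMART attributes) to flag drives likely to fail soon. The column names, CSV layout, and 30-day label window are assumptions for illustration, not taken from any of the surveyed systems.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical telemetry dump: one row per drive-day with SMART counters and a
# label indicating whether the drive failed within the following 30 days.
df = pd.read_csv("drive_telemetry.csv")
features = ["smart_5_raw", "smart_187_raw", "smart_188_raw",
            "smart_197_raw", "smart_198_raw"]   # commonly used reallocation/error counters
X, y = df[features].fillna(0), df["fails_within_30d"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Class weighting helps with the heavy imbalance (failures are rare events).
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```

An analogous black-box model for the performance dimension would swap the drive-level features for per-request or per-node signals (queue depths, recent latencies) and predict whether an IO request is likely to be slow.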

Predicting delays in Indian lower courts using AutoML and Decision Forests

  • paper_url: http://arxiv.org/abs/2307.16285
  • repo_url: https://github.com/mb7419/pendencyprediction
  • paper_authors: Mohit Bhatnagar, Shivraj Huchhanavar
  • for: This study aims to predict delays in Indian lower courts in order to improve the efficiency and fairness of the Indian judiciary.
  • methods: Using a dataset of 4.2 million cases filed in 2010 across more than 7,000 lower courts, the authors build a multi-class classification model with AutoML over all periods of pendency, then apply binary decision forest classifiers to improve predictive accuracy.
  • results: The best model achieves 81.4% accuracy, with precision, recall, and F1 of 0.81; the dataset and Python code are released to support further research on Indian judicial reform.
    Abstract This paper presents a classification model that predicts delays in Indian lower courts based on case information available at filing. The model is built on a dataset of 4.2 million court cases filed in 2010 and their outcomes over a 10-year period. The data set is drawn from 7000+ lower courts in India. The authors employed AutoML to develop a multi-class classification model over all periods of pendency and then used binary decision forest classifiers to improve predictive accuracy for the classification of delays. The best model achieved an accuracy of 81.4%, and the precision, recall, and F1 were found to be 0.81. The study demonstrates the feasibility of AI models for predicting delays in Indian courts, based on relevant data points such as jurisdiction, court, judge, subject, and the parties involved. The paper also discusses the results in light of relevant literature and suggests areas for improvement and future research. The authors have made the dataset and Python code files used for the analysis available for further research in the crucial and contemporary field of Indian judicial reform.
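The authors' AutoML pipeline itself is not reproduced here; the sketch below shows a plain decision-forest baseline in the same spirit, predicting a pendency bucket from categorical case features available at filing. The feature names, bucket labels, and file name are illustrative assumptions, not the released code.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.metrics import accuracy_score

df = pd.read_csv("cases_2010.csv")  # hypothetical extract of the released dataset
categorical = ["state", "district", "court", "judge_position", "case_type", "party_type"]
target = "pendency_bucket"          # e.g., "<1y", "1-3y", "3-5y", ">5y"

pipeline = Pipeline([
    ("encode", ColumnTransformer(
        [("cat", OneHotEncoder(handle_unknown="ignore"), categorical)])),
    ("forest", RandomForestClassifier(n_estimators=300, n_jobs=-1, random_state=0)),
])

X_train, X_test, y_train, y_test = train_test_split(
    df[categorical], df[target], test_size=0.2, stratify=df[target], random_state=0)
pipeline.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, pipeline.predict(X_test)))
```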

Recent Advances in Hierarchical Multi-label Text Classification: A Survey

  • paper_url: http://arxiv.org/abs/2307.16265
  • repo_url: None
  • paper_authors: Rundong Liu, Wenhan Liang, Weijun Luo, Yuxiang Song, He Zhang, Ruohua Xu, Yunfeng Li, Ming Liu
  • for: Scientific literature archiving, hierarchical multi-label text classification
  • methods: Survey of open-sourced datasets, main methods, evaluation metrics, and learning strategies
  • results: Review of research progress, current challenges, and future research directions
    Abstract Hierarchical multi-label text classification aims to classify the input text into multiple labels, among which the labels are structured and hierarchical. It is a vital task in many real-world applications, e.g. scientific literature archiving. In this paper, we survey the recent progress of hierarchical multi-label text classification, including the open-sourced data sets, the main methods, evaluation metrics, learning strategies, and the current challenges. A few future research directions are also listed for the community to further improve this field.
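As a concrete reference point for the methods the survey covers, the sketch below shows one simple hierarchical multi-label baseline: a shared encoder with an independent sigmoid output per label, followed by a post-processing step that keeps predictions consistent with the taxonomy by propagating any positive child label to its ancestors. The encoder, taxonomy encoding, and threshold are assumptions for illustration; the surveyed methods range well beyond this baseline.

```python
import torch
import torch.nn as nn

class FlatMultiLabelClassifier(nn.Module):
    """Shared encoder with one sigmoid logit per label in the hierarchy."""

    def __init__(self, vocab_size=30000, embed_dim=128, num_labels=50):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim)  # simple bag-of-words encoder
        self.out = nn.Linear(embed_dim, num_labels)

    def forward(self, token_ids, offsets):
        return self.out(self.embed(token_ids, offsets))      # raw logits, one per label

def enforce_hierarchy(probs, parent, threshold=0.5):
    """Binarize label probabilities, then propagate positives up the taxonomy so
    no child label is predicted without its ancestors. parent[i] is the parent
    index of label i, or -1 for a root label."""
    pred = probs >= threshold
    for i in range(pred.shape[-1]):
        j = i
        while parent[j] != -1 and pred[..., j].any():
            pred[..., parent[j]] |= pred[..., j]
            j = parent[j]
    return pred

# Training pairs the logits with an independent binary cross-entropy per label:
# loss = nn.BCEWithLogitsLoss()(model(token_ids, offsets), multi_hot_targets)
```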