cs.AI - 2023-08-14

MM-GEF: Multi-modal representation meet collaborative filtering

  • paper_url: http://arxiv.org/abs/2308.07222
  • repo_url: None
  • paper_authors: Hao Wu, Alejandro Ariza-Casabona, Bartłomiej Twardowski, Tri Kurniawan Wijaya
  • for: The paper proposes a graph-based item structure enhancement method to improve the performance of multi-modal recommender systems.
  • methods: The method uses graph early-fusion to combine content features from multiple modalities and obtain more accurate item representations enriched with collaborative signals.
  • results: Extensive experiments on four publicly available datasets show consistent improvements over state-of-the-art multi-modal recommendation methods.
    Abstract In modern e-commerce, item content features in various modalities offer accurate yet comprehensive information to recommender systems. The majority of previous work either focuses on learning effective item representation during modelling user-item interactions, or exploring item-item relationships by analysing multi-modal features. Those methods, however, fail to incorporate the collaborative item-user-item relationships into the multi-modal feature-based item structure. In this work, we propose a graph-based item structure enhancement method MM-GEF: Multi-Modal recommendation with Graph Early-Fusion, which effectively combines the latent item structure underlying multi-modal contents with the collaborative signals. Instead of processing the content feature in different modalities separately, we show that the early-fusion of multi-modal features provides significant improvement. MM-GEF learns refined item representations by injecting structural information obtained from both multi-modal and collaborative signals. Through extensive experiments on four publicly available datasets, we demonstrate systematical improvements of our method over state-of-the-art multi-modal recommendation methods.

Generating Individual Trajectories Using GPT-2 Trained from Scratch on Encoded Spatiotemporal Data

  • paper_url: http://arxiv.org/abs/2308.07940
  • repo_url: None
  • paper_authors: Taizo Horikomi, Shouji Fujimoto, Atushi Ishikawa, Takayuki Mizuno
  • for: The paper builds a deep-learning model that generates individual daily trajectories, i.e., sequences of the places a person visits over a day.
  • methods: A GPT-2 architecture is trained from scratch on sequences of location tokens, which encode geographic coordinates at several spatial scales, interleaved with time-interval tokens (a minimal tokenization sketch follows the abstract below).
  • results: By adding special tokens for environmental factors and individual attributes, the model generates daily trajectories across multiple spatial scales that reflect both context and personal characteristics.
    Abstract Following Mizuno, Fujimoto, and Ishikawa's research (Front. Phys. 2022), we transpose geographical coordinates expressed in latitude and longitude into distinctive location tokens that embody positions across varied spatial scales. We encapsulate an individual daily trajectory as a sequence of tokens by adding unique time interval tokens to the location tokens. Using the architecture of an autoregressive language model, GPT-2, this sequence of tokens is trained from scratch, allowing us to construct a deep learning model that sequentially generates an individual daily trajectory. Environmental factors such as meteorological conditions and individual attributes such as gender and age are symbolized by unique special tokens, and by training these tokens and trajectories on the GPT-2 architecture, we can generate trajectories that are influenced by both environmental factors and individual attributes.
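
As a rough illustration of the tokenization described above, the sketch below maps (latitude, longitude, elapsed-minutes) records to location tokens at several spatial scales plus time-interval and attribute tokens. The grid resolutions, time bins, and token names are assumptions for illustration, not the paper's actual vocabulary.

```python
# Sketch: turn (lat, lon, minutes-since-previous-visit) records into a token
# sequence that a GPT-2-style model could be trained on. Grid sizes and token
# names are illustrative assumptions, not the paper's actual encoding.

def location_tokens(lat, lon, scales=(1.0, 0.1, 0.01)):
    """One token per spatial scale: coarse-to-fine grid cells containing (lat, lon)."""
    return [f"LOC_{s}_{int(lat // s)}_{int(lon // s)}" for s in scales]

def time_token(minutes, bins=(15, 30, 60, 120, 240)):
    """Bucket the elapsed time between visits into a small set of interval tokens."""
    for b in bins:
        if minutes <= b:
            return f"DT_{b}"
    return "DT_LONG"

def encode_trajectory(visits, attributes=("FEMALE", "AGE_30S"), weather="RAIN"):
    """visits: list of (lat, lon, minutes_since_previous). Special tokens for
    individual attributes and environmental factors are prepended, as in the paper."""
    seq = ["<BOS>", *[f"ATTR_{a}" for a in attributes], f"ENV_{weather}"]
    for lat, lon, dt in visits:
        seq += [time_token(dt)] + location_tokens(lat, lon)
    seq.append("<EOS>")
    return seq

if __name__ == "__main__":
    day = [(35.68, 139.76, 0), (35.66, 139.70, 45), (35.68, 139.76, 510)]
    print(encode_trajectory(day))
```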

Algorithms for the Training of Neural Support Vector Machines

  • paper_url: http://arxiv.org/abs/2308.07204
  • repo_url: https://github.com/sayantann11/all-classification-templetes-for-ML
  • paper_authors: Lars Simon, Manuel Radons
  • for: The paper explores how neural support vector machines (NSVMs) can incorporate domain knowledge into the design of the model architecture.
  • methods: A set of NSVM training algorithms is introduced, built on the Pegasos algorithm (a minimal Pegasos sketch follows the abstract below).
  • results: A proof of concept is provided by solving a set of standard machine learning tasks.
    Abstract Neural support vector machines (NSVMs) allow for the incorporation of domain knowledge in the design of the model architecture. In this article we introduce a set of training algorithms for NSVMs that leverage the Pegasos algorithm and provide a proof of concept by solving a set of standard machine learning tasks.
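
For reference, the Pegasos algorithm the methods build on is a stochastic sub-gradient solver for the SVM objective. Below is a minimal sketch of the classic primal update for a linear SVM on synthetic data; in an NSVM the raw inputs would be replaced by features produced by a neural network, and the paper's specific coupling is not reproduced here.

```python
# Sketch of the Pegasos stochastic sub-gradient method for a linear SVM.
# In a neural SVM the inputs x would be replaced by learned features; this
# minimal version only illustrates the hinge-loss update the paper builds on.
import numpy as np

def pegasos(X, y, lam=0.1, epochs=20, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)                 # decaying step size
            if y[i] * (w @ X[i]) < 1:             # point violates the margin
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
            else:
                w = (1 - eta * lam) * w
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 2))
    y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)    # linearly separable labels
    w = pegasos(X, y)
    print("training accuracy:", np.mean(np.sign(X @ w) == y))
```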

Neural Categorical Priors for Physics-Based Character Control

  • paper_url: http://arxiv.org/abs/2308.07200
  • repo_url: https://github.com/Tencent-RoboticsX/NCP
  • paper_authors: Qingxu Zhu, He Zhang, Mengting Lan, Lei Han
  • for: The work proposes a new learning framework for controlling physics-based humanoid characters with markedly higher motion quality and diversity.
  • methods: Reinforcement learning (RL) first tracks and imitates life-like movements from unstructured motion clips through a discrete information bottleneck in the style of a Vector Quantized Variational AutoEncoder (VQ-VAE), compressing the clips into a compact latent space; a prior-shifting technique then rebalances the learned categorical prior with curiosity-driven RL (a minimal vector-quantization sketch follows the abstract below).
  • results: The framework generates high-quality, life-like motions and facilitates upper-level policy learning on downstream tasks; extensive experiments with humanoid characters yield movements of considerably higher quality in terms of strategy, diversity, and realism.
    Abstract Recent advances in learning reusable motion priors have demonstrated their effectiveness in generating naturalistic behaviors. In this paper, we propose a new learning framework in this paradigm for controlling physics-based characters with significantly improved motion quality and diversity over existing state-of-the-art methods. The proposed method uses reinforcement learning (RL) to initially track and imitate life-like movements from unstructured motion clips using the discrete information bottleneck, as adopted in the Vector Quantized Variational AutoEncoder (VQ-VAE). This structure compresses the most relevant information from the motion clips into a compact yet informative latent space, i.e., a discrete space over vector quantized codes. By sampling codes in the space from a trained categorical prior distribution, high-quality life-like behaviors can be generated, similar to the usage of VQ-VAE in computer vision. Although this prior distribution can be trained with the supervision of the encoder's output, it follows the original motion clip distribution in the dataset and could lead to imbalanced behaviors in our setting. To address the issue, we further propose a technique named prior shifting to adjust the prior distribution using curiosity-driven RL. The outcome distribution is demonstrated to offer sufficient behavioral diversity and significantly facilitates upper-level policy learning for downstream tasks. We conduct comprehensive experiments using humanoid characters on two challenging downstream tasks, sword-shield striking and two-player boxing game. Our results demonstrate that the proposed framework is capable of controlling the character to perform considerably high-quality movements in terms of behavioral strategies, diversity, and realism. Videos, codes, and data are available at https://tencent-roboticsx.github.io/NCP/.
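
The discrete information bottleneck mentioned above is the vector-quantization step of a VQ-VAE: each encoder output is snapped to its nearest codebook entry, with a straight-through gradient. A generic PyTorch sketch follows; the codebook size, dimensions, and commitment weight are illustrative and not the NCP configuration.

```python
# Generic vector-quantization bottleneck (VQ-VAE style) with a straight-through
# estimator. Codebook size, dimensions and the commitment weight are illustrative.
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes=512, dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        nn.init.uniform_(self.codebook.weight, -1 / num_codes, 1 / num_codes)
        self.beta = beta

    def forward(self, z_e):                        # z_e: (batch, dim) encoder output
        # squared distances to every codebook entry
        d = (z_e.pow(2).sum(1, keepdim=True)
             - 2 * z_e @ self.codebook.weight.t()
             + self.codebook.weight.pow(2).sum(1))
        idx = d.argmin(dim=1)                      # nearest code per sample
        z_q = self.codebook(idx)
        # codebook + commitment losses; straight-through gradient for the decoder
        loss = ((z_q - z_e.detach()).pow(2).mean()
                + self.beta * (z_e - z_q.detach()).pow(2).mean())
        z_q = z_e + (z_q - z_e).detach()
        return z_q, idx, loss

if __name__ == "__main__":
    vq = VectorQuantizer()
    z_q, idx, loss = vq(torch.randn(8, 64))
    print(z_q.shape, idx.shape, float(loss))
```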

Explaining Black-Box Models through Counterfactuals

  • paper_url: http://arxiv.org/abs/2308.07198
  • repo_url: https://github.com/juliatrustworthyai/counterfactualexplanations.jl
  • paper_authors: Patrick Altmeyer, Arie van Deursen, Cynthia C. S. Liem
  • for: The paper presents a Julia package for generating Counterfactual Explanations (CE) and Algorithmic Recourse (AR) for black-box models.
  • methods: The package provides a diverse suite of counterfactual generators and is designed with a focus on customization and extensibility (a generic counterfactual-search sketch follows the abstract below).
  • results: The generators produce realistic and actionable counterfactual explanations that can be used for algorithmic recourse, i.e., proposed actions that change an undesirable outcome for the better.
    Abstract We present CounterfactualExplanations.jl: a package for generating Counterfactual Explanations (CE) and Algorithmic Recourse (AR) for black-box models in Julia. CE explain how inputs into a model need to change to yield specific model predictions. Explanations that involve realistic and actionable changes can be used to provide AR: a set of proposed actions for individuals to change an undesirable outcome for the better. In this article, we discuss the usefulness of CE for Explainable Artificial Intelligence and demonstrate the functionality of our package. The package is straightforward to use and designed with a focus on customization and extensibility. We envision it to one day be the go-to place for explaining arbitrary predictive models in Julia through a diverse suite of counterfactual generators.
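
The package itself is written in Julia and ships its own generator interface, which is not reproduced here. The Python sketch below only illustrates the generic idea behind gradient-based counterfactual generation: perturb the input until the model's prediction flips, while penalizing distance from the original instance. The model, loss weight, and step count are illustrative assumptions.

```python
# Generic gradient-based counterfactual search for a differentiable classifier.
# Concept sketch only; not the CounterfactualExplanations.jl API (which is Julia).
import torch

def counterfactual(model, x, target_class, dist_weight=0.1, steps=500, lr=0.05):
    x_cf = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([x_cf], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        opt.zero_grad()
        logits = model(x_cf.unsqueeze(0))
        # push the prediction toward the desired class, stay close to the original input
        loss = torch.nn.functional.cross_entropy(logits, target) \
               + dist_weight * torch.norm(x_cf - x, p=1)
        loss.backward()
        opt.step()
    return x_cf.detach()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = torch.nn.Sequential(torch.nn.Linear(4, 16), torch.nn.ReLU(),
                                torch.nn.Linear(16, 2))
    x = torch.randn(4)
    x_cf = counterfactual(model, x, target_class=1)
    print("original pred:", model(x.unsqueeze(0)).argmax().item(),
          "| counterfactual pred:", model(x_cf.unsqueeze(0)).argmax().item())
```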

Task Offloading for Smart Glasses in Healthcare: Enhancing Detection of Elevated Body Temperature

  • paper_url: http://arxiv.org/abs/2308.07193
  • repo_url: None
  • paper_authors: Abdenacer Naouri, Nabil Abdelkader Nouri, Attia Qammar, Feifei Shi, Huansheng Ning, Sahraoui Dhelim
  • for: The study analyses task-offloading scenarios for a healthcare-monitoring application running on smart glasses, in order to identify the optimal conditions for offloading.
  • methods: Offloading effectiveness is evaluated under realistic conditions using performance metrics including task completion time, computing capability, and energy consumption.
  • results: In an indoor setting such as an airport, where security agents wear smart glasses to detect elevated body temperature, task offloading proves practical and relevant, easing staff workload and improving the quality of healthcare monitoring.
    Abstract Wearable devices like smart glasses have gained popularity across various applications. However, their limited computational capabilities pose challenges for tasks that require extensive processing, such as image and video processing, leading to drained device batteries. To address this, offloading such tasks to nearby powerful remote devices, such as mobile devices or remote servers, has emerged as a promising solution. This paper focuses on analyzing task-offloading scenarios for a healthcare monitoring application performed on smart wearable glasses, aiming to identify the optimal conditions for offloading. The study evaluates performance metrics including task completion time, computing capabilities, and energy consumption under realistic conditions. A specific use case is explored within an indoor area like an airport, where security agents wearing smart glasses to detect elevated body temperature in individuals, potentially indicating COVID-19. The findings highlight the potential benefits of task offloading for wearable devices in healthcare settings, demonstrating its practicality and relevance.

Context-Aware Service Recommendation System for the Social Internet of Things

  • paper_url: http://arxiv.org/abs/2308.08499
  • repo_url: None
  • paper_authors: Amar Khelloufi, Huansheng Ning, Abdelkarim Ben Sada, Abdenacer Naouri, Sahraoui Dhelim
  • for: This paper aims to improve the accuracy and relevance of personalized service recommendations in the Social Internet of Things (SIoT) context by exploring the contextual representation of each device-service pair.
  • methods: The proposed framework uses a latent features combination technique to capture latent feature interactions and Factorization Machines to model higher-order feature interactions specific to each SIoT device-service pair.
  • results: The experimental evaluation demonstrates the framework's effectiveness in improving service recommendation accuracy and relevance.
    Abstract The Social Internet of Things (SIoT) enables interconnected smart devices to share data and services, opening up opportunities for personalized service recommendations. However, existing research often overlooks crucial aspects that can enhance the accuracy and relevance of recommendations in the SIoT context. Specifically, existing techniques tend to consider the extraction of social relationships between devices and neglect the contextual presentation of service reviews. This study aims to address these gaps by exploring the contextual representation of each device-service pair. Firstly, we propose a latent features combination technique that can capture latent feature interactions, by aggregating the device-device relationships within the SIoT. Then, we leverage Factorization Machines to model higher-order feature interactions specific to each SIoT device-service pair to accomplish accurate rating prediction. Finally, we propose a service recommendation framework for SIoT based on review aggregation and feature learning processes. The experimental evaluation demonstrates the framework's effectiveness in improving service recommendation accuracy and relevance.
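
The Factorization Machines used for rating prediction model pairwise feature interactions through low-rank latent vectors, and the second-order term can be computed in linear time with a standard identity. The numpy sketch below shows that computation in isolation; the SIoT-specific feature construction and review aggregation from the paper are not reproduced, and the dimensions are illustrative.

```python
# Second-order Factorization Machine prediction using the O(k*d) identity:
#   sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2]
# Latent dimension and the random features below are illustrative only.
import numpy as np

def fm_predict(x, w0, w, V):
    """x: (d,) features, w0: bias, w: (d,) linear weights, V: (d, k) latent factors."""
    linear = w0 + w @ x
    s1 = (V.T @ x) ** 2                  # (sum_i v_if x_i)^2 per factor f
    s2 = (V ** 2).T @ (x ** 2)           # sum_i v_if^2 x_i^2 per factor f
    pairwise = 0.5 * np.sum(s1 - s2)
    return linear + pairwise

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k = 10, 4
    x = rng.random(d)
    print(fm_predict(x, 0.1, rng.normal(size=d), rng.normal(scale=0.1, size=(d, k))))
```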

Conformal Predictions Enhanced Expert-guided Meshing with Graph Neural Networks

  • paper_url: http://arxiv.org/abs/2308.07358
  • repo_url: https://github.com/ahnobari/autosurf
  • paper_authors: Amin Heyrani Nobari, Justin Rey, Suhas Kodali, Matthew Jones, Faez Ahmed
  • for: The paper targets automatic generation of CFD meshes for aircraft models.
  • methods: Graph Neural Networks (GNNs) combined with expert guidance generate the meshes; a new 3D segmentation algorithm classifies surfaces more accurately than PointNet++ and PointMLP, and conformal predictions project the segmentation outputs onto CAD surfaces with marginal statistical guarantees (a split-conformal sketch follows the abstract below).
  • results: A real-world case study shows that the automatically generated meshes are comparable in quality to expert-generated ones and let the solver converge to accurate results, while being about 5 times faster overall than adaptive remeshing. Code and data are available at https://github.com/ahnobari/AutoSurf.
    Abstract Computational Fluid Dynamics (CFD) is widely used in different engineering fields, but accurate simulations are dependent upon proper meshing of the simulation domain. While highly refined meshes may ensure precision, they come with high computational costs. Similarly, adaptive remeshing techniques require multiple simulations and come at a great computational cost. This means that the meshing process is reliant upon expert knowledge and years of experience. Automating mesh generation can save significant time and effort and lead to a faster and more efficient design process. This paper presents a machine learning-based scheme that utilizes Graph Neural Networks (GNN) and expert guidance to automatically generate CFD meshes for aircraft models. In this work, we introduce a new 3D segmentation algorithm that outperforms two state-of-the-art models, PointNet++ and PointMLP, for surface classification. We also present a novel approach to project predictions from 3D mesh segmentation models to CAD surfaces using the conformal predictions method, which provides marginal statistical guarantees and robust uncertainty quantification and handling. We demonstrate that the addition of conformal predictions effectively enables the model to avoid under-refinement, hence failure, in CFD meshing even for weak and less accurate models. Finally, we demonstrate the efficacy of our approach through a real-world case study that demonstrates that our automatically generated mesh is comparable in quality to expert-generated meshes and enables the solver to converge and produce accurate results. Furthermore, we compare our approach to the alternative of adaptive remeshing in the same case study and find that our method is 5 times faster in the overall process of simulation. The code and data for this project are made publicly available at https://github.com/ahnobari/AutoSurf.
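
Conformal prediction calibrates a threshold on held-out nonconformity scores so that prediction sets reach a target coverage, which is how the authors obtain marginal statistical guarantees when projecting segmentation outputs onto CAD surfaces. The sketch below shows the generic split-conformal recipe for a classifier; it is a conceptual illustration, not the authors' exact projection procedure, and the score definition and alpha are assumptions.

```python
# Generic split conformal prediction for a classifier: calibrate a threshold on
# held-out softmax scores so that prediction sets cover the true label with
# probability >= 1 - alpha. Score definition and alpha are illustrative.
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """cal_probs: (n, C) predicted probabilities on a calibration split."""
    n = len(cal_labels)
    # nonconformity score: 1 - probability assigned to the true class
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    q = np.ceil((n + 1) * (1 - alpha)) / n         # finite-sample corrected level
    return np.quantile(scores, min(q, 1.0))

def prediction_set(probs, qhat):
    """All classes whose nonconformity score stays below the calibrated threshold."""
    return np.where(1.0 - probs <= qhat)[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cal = rng.dirichlet(np.ones(3), size=500)
    labels = np.array([rng.choice(3, p=p) for p in cal])
    qhat = conformal_threshold(cal, labels, alpha=0.1)
    print("threshold:", round(float(qhat), 3), "| example set:", prediction_set(cal[0], qhat))
```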

Knowledge Prompt-tuning for Sequential Recommendation

  • paper_url: http://arxiv.org/abs/2308.08459
  • repo_url: https://github.com/zhaijianyang/kp4sr
  • paper_authors: Jianyang Zhai, Xiawu Zheng, Chang-Dong Wang, Hui Li, Yonghong Tian
  • for: The paper proposes a knowledge-base-backed sequential recommendation (SR) method to address the lack of domain knowledge and fine-grained user preferences in existing SR approaches.
  • methods: Knowledge Prompt-tuning for Sequential Recommendation (KP4SR) transforms an external knowledge graph into knowledge prompts via relationship templates to bridge the semantic gap, and introduces a knowledge tree with a knowledge tree mask to restore the data structure and mitigate the noise the prompts introduce.
  • results: On three real-world datasets, KP4SR outperforms state-of-the-art PLM-based methods, improving NDCG@5 and HR@5 by 40.65% and 36.42% on the books dataset, 11.17% and 11.47% on the music dataset, and 22.17% and 19.14% on the movies dataset.
    Abstract Pre-trained language models (PLMs) have demonstrated strong performance in sequential recommendation (SR), which are utilized to extract general knowledge. However, existing methods still lack domain knowledge and struggle to capture users' fine-grained preferences. Meanwhile, many traditional SR methods improve this issue by integrating side information while suffering from information loss. To summarize, we believe that a good recommendation system should utilize both general and domain knowledge simultaneously. Therefore, we introduce an external knowledge base and propose Knowledge Prompt-tuning for Sequential Recommendation (KP4SR). Specifically, we construct a set of relationship templates and transform a structured knowledge graph (KG) into knowledge prompts to solve the problem of the semantic gap. However, knowledge prompts disrupt the original data structure and introduce a significant amount of noise. We further construct a knowledge tree and propose a knowledge tree mask, which restores the data structure in a mask matrix form, thus mitigating the noise problem. We evaluate KP4SR on three real-world datasets, and experimental results show that our approach outperforms state-of-the-art methods on multiple evaluation metrics. Specifically, compared with PLM-based methods, our method improves NDCG@5 and HR@5 by 40.65% and 36.42% on the books dataset, 11.17% and 11.47% on the music dataset, and 22.17% and 19.14% on the movies dataset, respectively. Our code is publicly available at the link: https://github.com/zhaijianyang/KP4SR.

Demonstration of CORNET: A System For Learning Spreadsheet Formatting Rules By Example

  • paper_url: http://arxiv.org/abs/2308.07357
  • repo_url: None
  • paper_authors: Mukul Singh, Jose Cambronero, Sumit Gulwani, Vu Le, Carina Negreanu, Gust Verbruggen
  • for: The system automatically learns conditional formatting rules from examples and is demonstrated as an add-in to Microsoft Excel.
  • methods: CORNET combines symbolic rule enumeration, based on semi-supervised clustering and iterative decision tree learning, with a neural ranker to produce conditional formatting rules.
  • results: Given one or two formatted cells as examples, CORNET generates formatting-rule suggestions that the user can apply to the spreadsheet.
    Abstract Data management and analysis tasks are often carried out using spreadsheet software. A popular feature in most spreadsheet platforms is the ability to define data-dependent formatting rules. These rules can express actions such as "color red all entries in a column that are negative" or "bold all rows not containing error or failure." Unfortunately, users who want to exercise this functionality need to manually write these conditional formatting (CF) rules. We introduce CORNET, a system that automatically learns such conditional formatting rules from user examples. CORNET takes inspiration from inductive program synthesis and combines symbolic rule enumeration, based on semi-supervised clustering and iterative decision tree learning, with a neural ranker to produce accurate conditional formatting rules. In this demonstration, we show CORNET in action as a simple add-in to Microsoft Excel. After the user provides one or two formatted cells as examples, CORNET generates formatting rule suggestions for the user to apply to the spreadsheet.

SPEGTI: Structured Prediction for Efficient Generative Text-to-Image Models

  • paper_url: http://arxiv.org/abs/2308.10997
  • repo_url: None
  • paper_authors: Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam, Andreas Veit, Ayan Chakrabarti, Sanjiv Kumar
  • for: The paper aims at improving the quality and efficiency of text-to-image generation models.
  • methods: A Markov Random Field (MRF) model encodes the compatibility among image tokens at different spatial locations, which reduces the number of Muse prediction steps required; MRF inference is modelled as a differentiable neural-network layer so its parameters can be learned by back-propagation.
  • results: The full model, SPEGTI, achieves a 1.5X speedup in Muse inference with no loss in output image quality.
    Abstract Modern text-to-image generation models produce high-quality images that are both photorealistic and faithful to the text prompts. However, this quality comes at significant computational cost: nearly all of these models are iterative and require running inference multiple times with large models. This iterative process is needed to ensure that different regions of the image are not only aligned with the text prompt, but also compatible with each other. In this work, we propose a light-weight approach to achieving this compatibility between different regions of an image, using a Markov Random Field (MRF) model. This method is shown to work in conjunction with the recently proposed Muse model. The MRF encodes the compatibility among image tokens at different spatial locations and enables us to significantly reduce the required number of Muse prediction steps. Inference with the MRF is significantly cheaper, and its parameters can be quickly learned through back-propagation by modeling MRF inference as a differentiable neural-network layer. Our full model, SPEGTI, uses this proposed MRF model to speed up Muse by 1.5X with no loss in output image quality.

HyperBandit: Contextual Bandit with Hypernetwork for Time-Varying User Preferences in Streaming Recommendation

  • paper_url: http://arxiv.org/abs/2308.08497
  • repo_url: None
  • paper_authors: Chenglei Shen, Xiao Zhang, Wei Wei, Jun Xu
  • for: The work proposes a streaming recommendation model that adapts quickly to time-varying user preferences, which change dynamically in real-world streaming scenarios (e.g., between weekdays and weekends).
  • methods: HyperBandit is a contextual bandit approach in which a hypernetwork takes time features as input and generates the parameters for estimating time-varying rewards; a bandit policy then makes online recommendations, and a low-rank factorization of the parameter matrix keeps training efficient enough for real-time use (a minimal hypernetwork sketch follows the abstract below).
  • results: Extensive experiments on real-world datasets show that HyperBandit consistently outperforms state-of-the-art baselines in accumulated rewards.
    Abstract In real-world streaming recommender systems, user preferences often dynamically change over time (e.g., a user may have different preferences during weekdays and weekends). Existing bandit-based streaming recommendation models only consider time as a timestamp, without explicitly modeling the relationship between time variables and time-varying user preferences. This leads to recommendation models that cannot quickly adapt to dynamic scenarios. To address this issue, we propose a contextual bandit approach using hypernetwork, called HyperBandit, which takes time features as input and dynamically adjusts the recommendation model for time-varying user preferences. Specifically, HyperBandit maintains a neural network capable of generating the parameters for estimating time-varying rewards, taking into account the correlation between time features and user preferences. Using the estimated time-varying rewards, a bandit policy is employed to make online recommendations by learning the latent item contexts. To meet the real-time requirements in streaming recommendation scenarios, we have verified the existence of a low-rank structure in the parameter matrix and utilize low-rank factorization for efficient training. Theoretically, we demonstrate a sublinear regret upper bound against the best policy. Extensive experiments on real-world datasets show that the proposed HyperBandit consistently outperforms the state-of-the-art baselines in terms of accumulated rewards.
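
The core mechanism is a hypernetwork that maps time features (e.g., encoded hour-of-day or weekday) to the parameters of the reward estimator, so the bandit's scoring function itself shifts over time. A minimal PyTorch sketch follows; the architecture, feature encoding, and omission of the low-rank factorization are simplifications and assumptions, not the paper's model.

```python
# Minimal hypernetwork for a time-varying linear reward model: time features are
# mapped to the weight vector used to score candidate items. Sizes are illustrative.
import torch
import torch.nn as nn

class HyperLinearBandit(nn.Module):
    def __init__(self, time_dim=8, item_dim=32, hidden=64):
        super().__init__()
        # hypernetwork: time features -> parameters of the item scorer
        self.hyper = nn.Sequential(
            nn.Linear(time_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, item_dim + 1),       # weights + bias
        )

    def forward(self, time_feat, item_feats):
        params = self.hyper(time_feat)             # (item_dim + 1,)
        w, b = params[:-1], params[-1]
        return item_feats @ w + b                  # one estimated reward per item

if __name__ == "__main__":
    model = HyperLinearBandit()
    time_feat = torch.randn(8)                     # e.g. encoded hour-of-day / weekday
    candidates = torch.randn(100, 32)              # latent item contexts
    scores = model(time_feat, candidates)
    print("recommended item:", scores.argmax().item())
```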

AIGC In China: Current Developments And Future Outlook

  • paper_url: http://arxiv.org/abs/2308.08451
  • repo_url: None
  • paper_authors: Xiangyu Li, Yuqing Fan, Shenghui Cheng
  • for: The study analyses the current status of AI Generated Content (AIGC) in China, covering its technological foundations and application areas.
  • methods: Keyword searches are used to identify relevant scholarly papers and to examine the market status, policy landscape, and development trajectory of AIGC in China.
  • results: AIGC in China is developing rapidly but faces challenges and risks; the paper provides a comprehensive examination of AIGC products and their ecosystem, together with a forward-looking perspective on the industry's future.
    Abstract The increasing attention given to AI Generated Content (AIGC) has brought a profound impact on various aspects of daily life, industrial manufacturing, and the academic sector. Recognizing the global trends and competitiveness in AIGC development, this study aims to analyze China's current status in the field. The investigation begins with an overview of the foundational technologies and current applications of AIGC. Subsequently, the study delves into the market status, policy landscape, and development trajectory of AIGC in China, utilizing keyword searches to identify relevant scholarly papers. Furthermore, the paper provides a comprehensive examination of AIGC products and their corresponding ecosystem, emphasizing the ecological construction of AIGC. Finally, this paper discusses the challenges and risks faced by the AIGC industry while presenting a forward-looking perspective on the industry's future based on competitive insights in AIGC.

OctoPack: Instruction Tuning Code Large Language Models

  • paper_url: http://arxiv.org/abs/2308.07124
  • repo_url: https://github.com/bigcode-project/octopack
  • paper_authors: Niklas Muennighoff, Qian Liu, Armel Zebaze, Qinkai Zheng, Binyuan Hui, Terry Yue Zhuo, Swayam Singh, Xiangru Tang, Leandro von Werra, Shayne Longpre
  • for: The paper instruction-tunes code large language models (LLMs) to improve their performance on natural-language coding tasks.
  • methods: The authors exploit the natural structure of Git commits, which pair code changes with human instructions, and compile CommitPack: 4 terabytes of Git commits across 350 programming languages (a sketch of turning a commit into an instruction example follows the abstract below).
  • results: Instruction tuning the 16B-parameter StarCoder model on CommitPack achieves state-of-the-art performance among models not trained on OpenAI outputs on the HumanEval Python benchmark (46.2% pass@1), and the resulting OctoCoder and OctoGeeX models perform best among permissive models on the new HumanEvalPack benchmark.
    Abstract Finetuning large language models (LLMs) on instructions leads to vast performance improvements on natural language tasks. We apply instruction tuning using code, leveraging the natural structure of Git commits, which pair code changes with human instructions. We compile CommitPack: 4 terabytes of Git commits across 350 programming languages. We benchmark CommitPack against other natural and synthetic code instructions (xP3x, Self-Instruct, OASST) on the 16B parameter StarCoder model, and achieve state-of-the-art performance among models not trained on OpenAI outputs, on the HumanEval Python benchmark (46.2% pass@1). We further introduce HumanEvalPack, expanding the HumanEval benchmark to a total of 3 coding tasks (Code Repair, Code Explanation, Code Synthesis) across 6 languages (Python, JavaScript, Java, Go, C++, Rust). Our models, OctoCoder and OctoGeeX, achieve the best performance across HumanEvalPack among all permissive models, demonstrating CommitPack's benefits in generalizing to a wider set of languages and natural coding tasks. Code, models and data are freely available at https://github.com/bigcode-project/octopack.
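
The key data idea is that a Git commit naturally pairs an instruction (the commit message) with an input/output pair (the code before and after the change). The sketch below assembles one such training example; the prompt template and field names are assumptions, not the actual CommitPack schema.

```python
# Sketch: turn a Git commit into an instruction-tuning example. The prompt
# template and field names are illustrative, not the actual CommitPack format.

def commit_to_example(message, code_before, code_after, language="Python"):
    """Pair the commit message (human instruction) with the code change."""
    prompt = (
        f"Below is a {language} file and an instruction describing a change.\n\n"
        f"Instruction: {message}\n\n"
        f"Original code:\n{code_before}\n\n"
        f"Updated code:"
    )
    return {"prompt": prompt, "completion": "\n" + code_after}

if __name__ == "__main__":
    before = "def area(r):\n    return 3.14 * r * r\n"
    after = "import math\n\ndef area(r):\n    return math.pi * r ** 2\n"
    ex = commit_to_example("Use math.pi instead of a hard-coded constant", before, after)
    print(ex["prompt"])
    print(ex["completion"])
```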

CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation

  • paper_url: http://arxiv.org/abs/2308.07146
  • repo_url: https://github.com/kevinlight831/ctp
  • paper_authors: Hongguang Zhu, Yunchao Wei, Xiaodan Liang, Chunjie Zhang, Yao Zhao
  • for: The paper studies Vision-Language Continual Pretraining (VLCP) and contributes P9D, a benchmark of over one million product image-text pairs from 9 industries.
  • methods: A new algorithm, CTP, combines Compatible Momentum Contrast, which absorbs knowledge from the current and previous-task models to update modal features, with Topology Preservation, which transfers embedding knowledge across tasks while keeping feature adjustment flexible.
  • results: Experiments show that the method outperforms the baselines without incurring an expensive training burden.
    Abstract Vision-Language Pretraining (VLP) has shown impressive results on diverse downstream tasks by offline training on large-scale datasets. Regarding the growing nature of real-world data, such an offline training paradigm on ever-expanding data is unsustainable, because models lack the continual learning ability to accumulate knowledge constantly. However, most continual learning studies are limited to uni-modal classification and existing multi-modal datasets cannot simulate continual non-stationary data stream scenarios. To support the study of Vision-Language Continual Pretraining (VLCP), we first contribute a comprehensive and unified benchmark dataset P9D which contains over one million product image-text pairs from 9 industries. The data from each industry as an independent task supports continual learning and conforms to the real-world long-tail nature to simulate pretraining on web data. We comprehensively study the characteristics and challenges of VLCP, and propose a new algorithm: Compatible momentum contrast with Topology Preservation, dubbed CTP. The compatible momentum model absorbs the knowledge of the current and previous-task models to flexibly update the modal feature. Moreover, Topology Preservation transfers the knowledge of embedding across tasks while preserving the flexibility of feature adjustment. The experimental results demonstrate our method not only achieves superior performance compared with other baselines but also does not bring an expensive training burden. Dataset and codes are available at https://github.com/KevinLight831/CTP.

Natural Language is All a Graph Needs

  • paper_url: http://arxiv.org/abs/2308.07134
  • repo_url: https://github.com/Aryia-Behroziuan/neurons
  • paper_authors: Ruosong Ye, Caiqi Zhang, Runhui Wang, Shuyuan Xu, Yongfeng Zhang
  • for: The work explores whether large language models (LLMs) can replace graph neural networks (GNNs) as the foundation model for graphs.
  • methods: InstructGLM (Instruction-finetuned Graph Language Model) systematically designs scalable natural-language instruction prompts that describe a graph's geometric structure and node features, and instruction-tunes an LLM to perform learning and inference on graphs in a generative manner (a minimal graph-to-prompt sketch follows the abstract below).
  • results: The method surpasses all competitive GNN baselines on the ogbn-arxiv, Cora, and PubMed datasets, demonstrating its effectiveness and pointing to generative LLMs as a possible foundation model for graph machine learning.
    Abstract The emergence of large-scale pre-trained language models, such as ChatGPT, has revolutionized various research fields in artificial intelligence. Transformers-based large language models (LLMs) have gradually replaced CNNs and RNNs to unify fields of computer vision and natural language processing. Compared with the data that exists relatively independently such as images, videos or texts, graph is a type of data that contains rich structural and relational information. Meanwhile, natural language, as one of the most expressive mediums, excels in describing complex structures. However, existing work on incorporating graph learning problems into the generative language modeling framework remains very limited. As the importance of large language models continues to grow, it becomes essential to explore whether LLMs can also replace GNNs as the foundation model for graphs. In this paper, we propose InstructGLM (Instruction-finetuned Graph Language Model), systematically design highly scalable prompts based on natural language instructions, and use natural language to describe the geometric structure and node features of the graph for instruction tuning an LLM to perform learning and inference on graphs in a generative manner. Our method exceeds all competitive GNN baselines on ogbn-arxiv, Cora and PubMed datasets, which demonstrates the effectiveness of our method and sheds light on generative large language models as the foundation model for graph machine learning.
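
Describing a graph in natural language amounts to serializing a node's features and multi-hop neighborhood into a prompt the LLM can be tuned on. The sketch below shows one way such a prompt could be assembled with networkx; the wording and the node-classification template are assumptions, not the paper's exact prompts.

```python
# Sketch: serialize a node's local neighborhood into a natural-language prompt
# for node classification. Prompt wording is illustrative, not InstructGLM's templates.
import networkx as nx

def node_prompt(graph, node, hops=2):
    feats = graph.nodes[node].get("text", "")
    lines = [f"Node {node} has the following description: {feats}"]
    visited, frontier = {node}, {node}
    for h in range(1, hops + 1):
        frontier = {nbr for n in frontier for nbr in graph.neighbors(n)} - visited
        visited |= frontier
        if frontier:
            lines.append(f"Its {h}-hop neighbors are: "
                         + ", ".join(map(str, sorted(frontier))) + ".")
    lines.append(f"Question: which category does node {node} belong to?")
    return "\n".join(lines)

if __name__ == "__main__":
    g = nx.karate_club_graph()
    g.nodes[0]["text"] = "a member of the karate club"
    print(node_prompt(g, 0))
```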

Implementation of The Future of Drug Discovery: Quantum-Based Machine Learning Simulation (QMLS)

  • paper_url: http://arxiv.org/abs/2308.08561
  • repo_url: None
  • paper_authors: Yew Kee Wong, Yifan Zhou, Yan Shing Liang, Haichuan Qiu, Yu Xi Wu, Bin He
  • for: The study aims to shorten the R&D phase of drug development to three to six months and reduce its cost to roughly fifty to eighty thousand USD.
  • methods: For hit generation, Machine Learning Molecule Generation (MLMG) proposes possible hits from the molecular structure of the target protein while Quantum Simulation (QS) filters molecules by reaction and binding effectiveness; for lead optimization, Machine Learning Molecule Variation (MLMV) produces molecular variations that are filtered again by QS.
  • results: The workflow is designed to deliver a few dozen pre-clinical-trial-ready drug candidates within the shortened timeline and reduced budget.
    Abstract The Research & Development (R&D) phase of drug development is a lengthy and costly process. To revolutionize this process, we introduce our new concept QMLS to shorten the whole R&D phase to three to six months and decrease the cost to merely fifty to eighty thousand USD. For Hit Generation, Machine Learning Molecule Generation (MLMG) generates possible hits according to the molecular structure of the target protein while the Quantum Simulation (QS) filters molecules from the primary essay based on the reaction and binding effectiveness with the target protein. Then, For Lead Optimization, the resultant molecules generated and filtered from MLMG and QS are compared, and molecules that appear as a result of both processes will be made into dozens of molecular variations through Machine Learning Molecule Variation (MLMV), while others will only be made into a few variations. Lastly, all optimized molecules would undergo multiple rounds of QS filtering with a high standard for reaction effectiveness and safety, creating a few dozen pre-clinical-trail-ready drugs. This paper is based on our first paper, where we pitched the concept of machine learning combined with quantum simulations. In this paper we will go over the detailed design and framework of QMLS, including MLMG, MLMV, and QS.

Ada-QPacknet – adaptive pruning with bit width reduction as an efficient continual learning method without forgetting

  • paper_url: http://arxiv.org/abs/2308.07939
  • repo_url: None
  • paper_authors: Marcin Pietroń, Dominik Żurek, Kamil Faber, Roberto Corizzo
  • for: This paper aims to improve the efficiency of Continual Learning (CL) algorithms in dynamic and complex environments.
  • methods: The proposed approach, called Ada-QPacknet, incorporates both pruning and quantization techniques to reduce the size of the model and improve its performance in CL scenarios.
  • results: The presented results show that the proposed approach achieves similar accuracy as floating-point sub-networks in well-known CL scenarios, and outperforms most other CL strategies in task and class incremental scenarios.
    Abstract Continual Learning (CL) is a process in which there is still huge gap between human and deep learning model efficiency. Recently, many CL algorithms were designed. Most of them have many problems with learning in dynamic and complex environments. In this work new architecture based approach Ada-QPacknet is described. It incorporates the pruning for extracting the sub-network for each task. The crucial aspect in architecture based CL methods is theirs capacity. In presented method the size of the model is reduced by efficient linear and nonlinear quantisation approach. The method reduces the bit-width of the weights format. The presented results shows that hybrid 8 and 4-bit quantisation achieves similar accuracy as floating-point sub-network on a well-know CL scenarios. To our knowledge it is the first CL strategy which incorporates both compression techniques pruning and quantisation for generating task sub-networks. The presented algorithm was tested on well-known episode combinations and compared with most popular algorithms. Results show that proposed approach outperforms most of the CL strategies in task and class incremental scenarios.

#InsTag: Instruction Tagging for Analyzing Supervised Fine-tuning of Large Language Models

  • paper_url: http://arxiv.org/abs/2308.07074
  • repo_url: https://github.com/ofa-sys/instag
  • paper_authors: Keming Lu, Hongyi Yuan, Zheng Yuan, Runji Lin, Junyang Lin, Chuanqi Tan, Chang Zhou, Jingren Zhou
  • for: The paper investigates how supervised fine-tuning (SFT) endows foundation language models with instruction-following ability.
  • methods: InsTag, an open-set fine-grained tagger, annotates SFT samples by semantics and intention, yielding 6.6K tags that quantify instruction diversity and complexity; a tag-based data selector then picks 6K diverse and complex samples from open-source datasets.
  • results: Models fine-tuned on the InsTag-selected samples (TagLM) outperform open-source models trained on considerably larger SFT data, as evaluated by MT-Bench, underscoring the importance of query diversity and complexity.
    Abstract Foundation language models obtain the instruction-following ability through supervised fine-tuning (SFT). Diversity and complexity are considered critical factors of a successful SFT dataset, while their definitions remain obscure and lack quantitative analyses. In this work, we propose InsTag, an open-set fine-grained tagger, to tag samples within SFT datasets based on semantics and intentions and define instruction diversity and complexity regarding tags. We obtain 6.6K tags to describe comprehensive user queries. Then we analyze popular open-sourced SFT datasets and find that the model ability grows with more diverse and complex data. Based on this observation, we propose a data selector based on InsTag to select 6K diverse and complex samples from open-source datasets and fine-tune models on InsTag-selected data. The resulting models, TagLM, outperform open-source models based on considerably larger SFT data evaluated by MT-Bench, echoing the importance of query diversity and complexity. We open-source InsTag in https://github.com/OFA-Sys/InsTag.

Machine Unlearning: Solutions and Challenges

  • paper_url: http://arxiv.org/abs/2308.07061
  • repo_url: None
  • paper_authors: Jie Xu, Zihan Wu, Cong Wang, Xiaohua Jia
  • for: The paper provides a comprehensive taxonomy and analysis of machine unlearning research.
  • methods: Existing work is categorized into exact unlearning, which algorithmically removes data influence entirely, and approximate unlearning, which efficiently minimizes influence through limited parameter updates; the advantages and limitations of both are critically discussed.
  • results: The paper proposes future directions for machine unlearning and offers researchers a roadmap of open problems, encouraging contributions that address real-world needs for selective data removal.
    Abstract Machine learning models may inadvertently memorize sensitive, unauthorized, or malicious data, posing risks of privacy violations, security breaches, and performance deterioration. To address these issues, machine unlearning has emerged as a critical technique to selectively remove specific training data points' influence on trained models. This paper provides a comprehensive taxonomy and analysis of machine unlearning research. We categorize existing research into exact unlearning that algorithmically removes data influence entirely and approximate unlearning that efficiently minimizes influence through limited parameter updates. By reviewing the state-of-the-art solutions, we critically discuss their advantages and limitations. Furthermore, we propose future directions to advance machine unlearning and establish it as an essential capability for trustworthy and adaptive machine learning. This paper provides researchers with a roadmap of open problems, encouraging impactful contributions to address real-world needs for selective data removal.

Distinguishing Risk Preferences using Repeated Gambles

  • paper_url: http://arxiv.org/abs/2308.07054
  • repo_url: None
  • paper_authors: James Price, Colm Connaughton
  • for: The paper explores the practical challenges of inferring risk preferences from the observed choices of artificial agents in sequences of repeated gambles.
  • methods: The paper uses the Yeo-Johnson transformation to construct a family of gambles that interpolates smoothly between the additive and multiplicative cases, and analyzes the optimal strategy for this family both analytically and numerically.
  • results: The paper finds that it becomes increasingly difficult to distinguish the risk preferences of agents as their wealth increases, because agents with different risk preferences eventually make the same decisions for sufficiently high wealth.
    Abstract Sequences of repeated gambles provide an experimental tool to characterize the risk preferences of humans or artificial decision-making agents. The difficulty of this inference depends on factors including the details of the gambles offered and the number of iterations of the game played. In this paper we explore in detail the practical challenges of inferring risk preferences from the observed choices of artificial agents who are presented with finite sequences of repeated gambles. We are motivated by the fact that the strategy to maximize long-run wealth for sequences of repeated additive gambles (where gains and losses are independent of current wealth) is different to the strategy for repeated multiplicative gambles (where gains and losses are proportional to current wealth.) Accurate measurement of risk preferences would be needed to tell whether an agent is employing the optimal strategy or not. To generalize the types of gambles our agents face we use the Yeo-Johnson transformation, a tool borrowed from feature engineering for time series analysis, to construct a family of gambles that interpolates smoothly between the additive and multiplicative cases. We then analyze the optimal strategy for this family, both analytically and numerically. We find that it becomes increasingly difficult to distinguish the risk preferences of agents as their wealth increases. This is because agents with different risk preferences eventually make the same decisions for sufficiently high wealth. We believe that these findings are informative for the effective design of experiments to measure risk preferences in humans.
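
The Yeo-Johnson transformation used to build the family of gambles behaves roughly additively at lambda = 1 and becomes increasingly log-like (multiplicative-style) as lambda approaches 0. The sketch below implements the standard piecewise formula and checks it against SciPy; the paper's specific gamble construction is not reproduced.

```python
# Yeo-Johnson transform: for lambda = 1 it is (essentially) a shifted identity,
# and as lambda -> 0 it behaves like log(1 + x) for x >= 0, interpolating between
# additive and multiplicative-style changes. SciPy is used only as a cross-check.
import numpy as np
from scipy.stats import yeojohnson

def yeo_johnson(x, lam):
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos, neg = x >= 0, x < 0
    if abs(lam) > 1e-12:
        out[pos] = ((x[pos] + 1) ** lam - 1) / lam
    else:
        out[pos] = np.log1p(x[pos])
    if abs(lam - 2) > 1e-12:
        out[neg] = -(((-x[neg] + 1) ** (2 - lam)) - 1) / (2 - lam)
    else:
        out[neg] = -np.log1p(-x[neg])
    return out

if __name__ == "__main__":
    x = np.array([-0.5, 0.0, 0.5, 2.0])
    for lam in (0.0, 0.5, 1.0):
        print(lam, yeo_johnson(x, lam), yeojohnson(x, lmbda=lam))
```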

Diagnosis of Scalp Disorders using Machine Learning and Deep Learning Approach – A Review

  • paper_url: http://arxiv.org/abs/2308.07052
  • repo_url: None
  • paper_authors: Hrishabh Tiwari, Jatin Moolchandani, Shamla Mantri
  • for: The review aims at improving the accuracy and efficiency of diagnosing scalp disorders.
  • methods: Deep learning approaches, including CNNs paired with FCNs and an accompanying diagnosis app, as well as classical ML algorithms, are surveyed for scalp-disorder diagnosis.
  • results: Reported systems achieve high accuracy: a deep-learning-based scalp inspection system reaches 97.41%-99.09% average precision, a CNN-based psoriasis classifier reaches 82.9%, and SVM and KNN classifiers reach 91.4% and 88.9% accuracy, respectively, for classifying healthy scalp and alopecia areata.
    Abstract The morbidity of scalp diseases is minuscule compared to other diseases, but the impact on the patient's life is enormous. It is common for people to experience scalp problems that include Dandruff, Psoriasis, Tinea-Capitis, Alopecia and Atopic-Dermatitis. In accordance with WHO research, approximately 70% of adults have problems with their scalp. It has been demonstrated in descriptive research that hair quality is impaired by impaired scalp, but these impacts are reversible with early diagnosis and treatment. Deep Learning advances have demonstrated the effectiveness of CNN paired with FCN in diagnosing scalp and skin disorders. In one proposed Deep-Learning-based scalp inspection and diagnosis system, an imaging microscope and a trained model are combined with an app that classifies scalp disorders accurately with an average precision of 97.41%- 99.09%. Another research dealt with classifying the Psoriasis using the CNN with an accuracy of 82.9%. As part of another study, an ML based algorithm was also employed. It accurately classified the healthy scalp and alopecia areata with 91.4% and 88.9% accuracy with SVM and KNN algorithms. Using deep learning models to diagnose scalp related diseases has improved due to advancements i computation capabilities and computer vision, but there remains a wide horizon for further improvements.

The minimal computational substrate of fluid intelligence

  • paper_url: http://arxiv.org/abs/2308.07039
  • repo_url: None
  • paper_authors: Amy PK Nelson, Joe Mole, Guilherme Pombo, Robert J Gray, James K Ruffle, Edgar Chan, Geraint E Rees, Lisa Cipolotti, Parashkev Nachev
  • for: The study evaluates whether a self-supervised artificial neural network can pass a compact version of Raven's Advanced Progressive Matrices (RAPM), a widely used clinical test of fluid intelligence.
  • methods: LaMa, a self-supervised network trained solely on completing partially masked images of natural environmental scenes, is applied to the test without any task-specific inductive bias or training.
  • results: LaMa achieves human-level test scores a prima vista, shows human-like variation with item difficulty compared with healthy and focally lesioned participants, and produces errors characteristic of right frontal lobe damage when its ability to integrate global spatial patterns is degraded.
    Abstract The quantification of cognitive powers rests on identifying a behavioural task that depends on them. Such dependence cannot be assured, for the powers a task invokes cannot be experimentally controlled or constrained a priori, resulting in unknown vulnerability to failure of specificity and generalisability. Evaluating a compact version of Raven's Advanced Progressive Matrices (RAPM), a widely used clinical test of fluid intelligence, we show that LaMa, a self-supervised artificial neural network trained solely on the completion of partially masked images of natural environmental scenes, achieves human-level test scores a prima vista, without any task-specific inductive bias or training. Compared with cohorts of healthy and focally lesioned participants, LaMa exhibits human-like variation with item difficulty, and produces errors characteristic of right frontal lobe damage under degradation of its ability to integrate global spatial patterns. LaMa's narrow training and limited capacity -- comparable to the nervous system of the fruit fly -- suggest RAPM may be open to computationally simple solutions that need not necessarily invoke abstract reasoning.

Bayesian Flow Networks

  • paper_url: http://arxiv.org/abs/2308.07037
  • repo_url: https://github.com/stefanradev93/BayesFlow
  • paper_authors: Alex Graves, Rupesh Kumar Srivastava, Timothy Atkinson, Faustino Gomez
  • for: The paper introduces Bayesian Flow Networks (BFNs), a new class of generative model in which the parameters of a set of independent distributions are modified with Bayesian inference in the light of noisy data samples and then passed to a neural network that outputs a second, interdependent distribution.
  • methods: Starting from a simple prior and iteratively updating the two distributions yields a generative procedure similar to the reverse process of diffusion models, but conceptually simpler because no forward process is required; discrete and continuous-time loss functions and sample-generation procedures are derived for continuous, discretised, and discrete data.
  • results: BFNs achieve competitive log-likelihoods for image modelling on dynamically binarized MNIST and CIFAR-10, and outperform all known discrete diffusion models on the text8 character-level language modelling task.
    Abstract This paper introduces Bayesian Flow Networks (BFNs), a new class of generative model in which the parameters of a set of independent distributions are modified with Bayesian inference in the light of noisy data samples, then passed as input to a neural network that outputs a second, interdependent distribution. Starting from a simple prior and iteratively updating the two distributions yields a generative procedure similar to the reverse process of diffusion models; however it is conceptually simpler in that no forward process is required. Discrete and continuous-time loss functions are derived for continuous, discretised and discrete data, along with sample generation procedures. Notably, the network inputs for discrete data lie on the probability simplex, and are therefore natively differentiable, paving the way for gradient-based sample guidance and few-step generation in discrete domains such as language modelling. The loss function directly optimises data compression and places no restrictions on the network architecture. In our experiments BFNs achieve competitive log-likelihoods for image modelling on dynamically binarized MNIST and CIFAR-10, and outperform all known discrete diffusion models on the text8 character-level language modelling task.

Bayesian Physics-Informed Neural Network for the Forward and Inverse Simulation of Engineered Nano-particles Mobility in a Contaminated Aquifer

  • paper_url: http://arxiv.org/abs/2308.07352
  • repo_url: None
  • paper_authors: Shikhar Nilabh, Fidel Grandia
  • for: The study develops a predictive model of engineered nanoparticle (ENP) transport in a contaminated aquifer, to support the design of effective groundwater remediation strategies.
  • methods: A Bayesian Physics-Informed Neural Network (B-PINN) framework simulates nanoparticle mobility within the aquifer (a generic physics-informed-residual sketch follows the abstract below).
  • results: The forward model accurately predicts ENP mobility and quantifies its uncertainty, and the inverse model recovers the governing parameters of ENP mobility in a small-scale aquifer, demonstrating the tool's value for remediation planning.
    Abstract Globally, there are many polluted groundwater sites that need an active remediation plan for the restoration of local ecosystem and environment. Engineered nanoparticles (ENPs) have proven to be an effective reactive agent for the in-situ degradation of pollutants in groundwater. While the performance of these ENPs has been highly promising on the laboratory scale, their application in real field case conditions is still limited. The complex transport and retention mechanisms of ENPs hinder the development of an efficient remediation strategy. Therefore, a predictive tool to comprehend the transport and retention behavior of ENPs is highly required. The existing tools in the literature are dominated with numerical simulators, which have limited flexibility and accuracy in the presence of sparse datasets and the aquifer heterogeneity. This work uses a Bayesian Physics-Informed Neural Network (B-PINN) framework to model the nano-particles mobility within an aquifer. The result from the forward model demonstrates the effective capability of B-PINN in accurately predicting the ENPs mobility and quantifying the uncertainty. The inverse model output is then used to predict the governing parameters for the ENPs mobility in a small-scale aquifer. The research demonstrates the capability of the tool to provide predictive insights for developing an efficient groundwater remediation strategy.
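
A physics-informed network penalizes the residual of the governing transport equation at collocation points via automatic differentiation. The sketch below writes such a residual for a generic 1-D advection-dispersion equation; the coefficients, geometry, and the Bayesian treatment of the network weights used in the paper are not reproduced, so this is only a conceptual illustration.

```python
# Sketch of a physics-informed residual for 1-D advection-dispersion:
#   dc/dt + v * dc/dx - D * d2c/dx2 = 0
# The coefficients and network size are illustrative; the paper's Bayesian
# (B-PINN) treatment of the weights is not included here.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 32), nn.Tanh(),
                    nn.Linear(32, 1))
v, D = 1.0, 0.1                                    # assumed velocity and dispersion

def pde_residual(x, t):
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    c = net(torch.cat([x, t], dim=1))
    c_x, c_t = torch.autograd.grad(c, (x, t), grad_outputs=torch.ones_like(c),
                                   create_graph=True)
    c_xx = torch.autograd.grad(c_x, x, grad_outputs=torch.ones_like(c_x),
                               create_graph=True)[0]
    return c_t + v * c_x - D * c_xx

if __name__ == "__main__":
    x = torch.rand(64, 1)
    t = torch.rand(64, 1)
    loss = pde_residual(x, t).pow(2).mean()        # physics loss; data loss would be added
    loss.backward()
    print(float(loss))
```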

IOB: Integrating Optimization Transfer and Behavior Transfer for Multi-Policy Reuse

  • paper_url: http://arxiv.org/abs/2308.07351
  • repo_url: None
  • paper_authors: Siyuan Li, Hao Li, Jin Zhang, Zhen Wang, Peng Liu, Chongjie Zhang
  • for: The work addresses the challenge of selecting which source policy should guide learning on a target task, and proposes a new transfer reinforcement learning (RL) method.
  • methods: The Q function in the actor-critic framework guides policy selection, choosing the source policy with the largest one-step improvement over the current target policy; optimization transfer and behavior transfer are integrated (IOB) by regularizing the learned policy to mimic the guidance policy and combining the two as the behavior policy (a minimal selection sketch follows the abstract below).
  • results: The method outperforms state-of-the-art transfer RL baselines on benchmark tasks, improves final performance and knowledge transferability in continual learning scenarios, and the optimization-transfer technique is proven to improve target policy learning.
    Abstract Humans have the ability to reuse previously learned policies to solve new tasks quickly, and reinforcement learning (RL) agents can do the same by transferring knowledge from source policies to a related target task. Transfer RL methods can reshape the policy optimization objective (optimization transfer) or influence the behavior policy (behavior transfer) using source policies. However, selecting the appropriate source policy with limited samples to guide target policy learning has been a challenge. Previous methods introduce additional components, such as hierarchical policies or estimations of source policies' value functions, which can lead to non-stationary policy optimization or heavy sampling costs, diminishing transfer effectiveness. To address this challenge, we propose a novel transfer RL method that selects the source policy without training extra components. Our method utilizes the Q function in the actor-critic framework to guide policy selection, choosing the source policy with the largest one-step improvement over the current target policy. We integrate optimization transfer and behavior transfer (IOB) by regularizing the learned policy to mimic the guidance policy and combining them as the behavior policy. This integration significantly enhances transfer effectiveness, surpasses state-of-the-art transfer RL baselines in benchmark tasks, and improves final performance and knowledge transferability in continual learning scenarios. Additionally, we show that our optimization transfer technique is guaranteed to improve target policy learning.
    摘要 人类能够重用先前学到的策略来快速解决新任务,强化学习(RL)智能体同样可以把源策略中的知识迁移到相关的目标任务上。迁移 RL 方法可以利用源策略重塑策略优化目标(优化迁移),或影响行为策略(行为迁移)。然而,在样本有限的情况下选择合适的源策略来指导目标策略学习一直是一个难题。以往的方法会引入额外组件,如层次策略或对源策略价值函数的估计,这可能导致非平稳的策略优化或高昂的采样开销,从而削弱迁移效果。为了解决这一挑战,我们提出了一种无需训练额外组件的新型迁移 RL 方法:利用 actor-critic 框架中的 Q 函数引导策略选择,选取相对当前目标策略具有最大单步改进的源策略;并通过正则化使所学策略模仿该引导策略,将其与行为策略结合,从而把优化迁移与行为迁移(IOB)整合在一起。这种整合显著提升了迁移效果,在基准任务上超过了最先进的迁移 RL 基线,并在持续学习场景中提升了最终性能和知识可迁移性。此外,我们还证明了所提的优化迁移技术能够保证改进目标策略的学习。
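
A minimal sketch of the Q-guided policy selection and the combined optimization/behavior transfer described above is given below. The interfaces (`q_fn`, policy callables) and the regularization weight `beta` are assumptions made for illustration, not the paper's exact implementation.

```python
# Sketch of the IOB idea (assumed interfaces): pick the source policy whose action has the
# largest one-step Q-value at the current state, then (i) maximize Q (optimization transfer)
# and (ii) regularize the learned policy toward that guidance policy (behavior transfer).
import torch

def select_guidance_policy(state, target_policy, source_policies, q_fn):
    """state: a single state tensor; returns the policy with the highest Q(s, pi(s))."""
    with torch.no_grad():
        best_policy, best_q = target_policy, q_fn(state, target_policy(state))
        for pi in source_policies:
            q = q_fn(state, pi(state))
            if q > best_q:
                best_policy, best_q = pi, q
    return best_policy

def iob_actor_loss(state, target_policy, guidance_policy, q_fn, beta=0.1):
    """Combined objective: -Q(s, pi(s)) plus an imitation penalty toward the guidance action."""
    action = target_policy(state)
    with torch.no_grad():
        guide_action = guidance_policy(state)
    return -q_fn(state, action) + beta * (action - guide_action).pow(2).mean()
```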

Efficient Neural PDE-Solvers using Quantization Aware Training

  • paper_url: http://arxiv.org/abs/2308.07350
  • repo_url: None
  • paper_authors: Winfried van den Dool, Tijmen Blankevoort, Max Welling, Yuki M. Asano
  • for: 解决Partial Differential Equations(PDE)领域中计算成本的问题,通过使用神经网络作为传统数学方法的替代方案。
  • methods: 使用现有的量化方法来降低神经网络的计算成本,而不需要限制PDE的空间分辨率。
  • results: 在四个标准 PDE 数据集和三种网络架构上的实验表明,量化感知训练能够在不同设置和跨越三个数量级的 FLOPs 下成功降低推理计算成本,同时保持性能。此外,实验还表明,计算成本与性能之间的帕累托最优几乎总是只有在引入量化后才能达到。
    Abstract In the past years, the application of neural networks as an alternative to classical numerical methods to solve Partial Differential Equations has emerged as a potential paradigm shift in this century-old mathematical field. However, in terms of practical applicability, computational cost remains a substantial bottleneck. Classical approaches try to mitigate this challenge by limiting the spatial resolution on which the PDEs are defined. For neural PDE solvers, we can do better: Here, we investigate the potential of state-of-the-art quantization methods on reducing computational costs. We show that quantizing the network weights and activations can successfully lower the computational cost of inference while maintaining performance. Our results on four standard PDE datasets and three network architectures show that quantization-aware training works across settings and three orders of FLOPs magnitudes. Finally, we empirically demonstrate that Pareto-optimality of computational cost vs performance is almost always achieved only by incorporating quantization.
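
To make the quantization-aware training idea concrete, here is a minimal sketch of int8 fake quantization with a straight-through estimator applied to a convolution layer; the bit width, scaling scheme, and layer choice are simplifying assumptions rather than the paper's configuration.

```python
# Minimal QAT sketch: simulate int8 quantization of weights and activations in the forward
# pass while letting gradients flow unchanged (straight-through estimator). Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quant(x, num_bits=8):
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    x_q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    return x + (x_q - x).detach()  # forward: quantized values; backward: identity gradient

class QuantConv2d(nn.Conv2d):
    def forward(self, x):
        return F.conv2d(fake_quant(x), fake_quant(self.weight), self.bias,
                        self.stride, self.padding, self.dilation, self.groups)

# A neural PDE solver would replace its nn.Conv2d layers with QuantConv2d and train as usual.
```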

Aggregating Intrinsic Information to Enhance BCI Performance through Federated Learning

  • paper_url: http://arxiv.org/abs/2308.11636
  • repo_url: None
  • paper_authors: Rui Liu, Yuanyuan Chen, Anran Li, Yi Ding, Han Yu, Cuntai Guan
  • for: 解决脑机接口(BCI)中因数据不足而难以构建高性能深度学习模型的问题
  • methods: 提出了一种 Hierarchical Personalized Federated Learning EEG Decoding(FLEEG)框架,通过协同学习多个数据集,提高模型的泛化能力和稳定性
  • results: 在 Motor Imagery(MI)分类任务中,与9个不同设备收集的EEG数据集进行了合作训练,可以提高分类性能达16.7%,特别是对小数据集的提升更大。
    Abstract Insufficient data is a long-standing challenge for Brain-Computer Interface (BCI) to build a high-performance deep learning model. Though numerous research groups and institutes collect a multitude of EEG datasets for the same BCI task, sharing EEG data from multiple sites is still challenging due to the heterogeneity of devices. The significance of this challenge cannot be overstated, given the critical role of data diversity in fostering model robustness. However, existing works rarely discuss this issue, predominantly centering their attention on model training within a single dataset, often in the context of inter-subject or inter-session settings. In this work, we propose a hierarchical personalized Federated Learning EEG decoding (FLEEG) framework to surmount this challenge. This innovative framework heralds a new learning paradigm for BCI, enabling datasets with disparate data formats to collaborate in the model training process. Each client is assigned a specific dataset and trains a hierarchical personalized model to manage diverse data formats and facilitate information exchange. Meanwhile, the server coordinates the training procedure to harness knowledge gleaned from all datasets, thus elevating overall performance. The framework has been evaluated in Motor Imagery (MI) classification with nine EEG datasets collected by different devices but implementing the same MI task. Results demonstrate that the proposed frame can boost classification performance up to 16.7% by enabling knowledge sharing between multiple datasets, especially for smaller datasets. Visualization results also indicate that the proposed framework can empower the local models to put a stable focus on task-related areas, yielding better performance. To the best of our knowledge, this is the first end-to-end solution to address this important challenge.
    摘要 数据不足是脑机接口(BCI)构建高性能深度学习模型长期面临的挑战。尽管许多研究团队和机构针对同一 BCI 任务收集了大量 EEG 数据集,但由于采集设备的异构性,跨站点共享 EEG 数据仍然十分困难。考虑到数据多样性对模型鲁棒性的关键作用,这一挑战的重要性不容忽视。然而,现有工作很少讨论这一问题,大多只关注单一数据集内(跨被试或跨会话设置下)的模型训练。为此,我们提出了一种层次个性化联邦学习 EEG 解码(FLEEG)框架来克服这一挑战。该框架为 BCI 带来了一种新的学习范式,使数据格式各异的数据集能够协同参与模型训练:每个客户端分配一个特定数据集,并训练层次个性化模型以处理不同的数据格式并促进信息交换;服务器则协调训练过程,汇聚从所有数据集中学到的知识,从而提升整体性能。我们在运动想象(MI)分类任务上使用由不同设备采集、但执行相同 MI 任务的九个 EEG 数据集对该框架进行了评估。结果表明,该框架通过跨数据集的知识共享可将分类性能提升最多 16.7%,对小规模数据集的提升尤为明显。可视化结果还表明,该框架能使本地模型稳定地聚焦于与任务相关的区域,从而获得更好的性能。据我们所知,这是首个端到端解决这一重要挑战的方案。
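
The server-side aggregation idea can be sketched roughly as follows: each client keeps its device-specific (personalized) layers local and only a shared backbone is averaged, FedAvg-style, weighted by dataset size. The parameter-name prefix and the weighting scheme are assumptions made for illustration, not FLEEG's exact mechanism.

```python
# Rough sketch of hierarchical personalized aggregation (assumed simplification of FLEEG):
# only parameters under a shared prefix are averaged across clients; the rest stay personalized.
import copy

def aggregate_shared_backbone(client_states, client_sizes, shared_prefix="backbone."):
    """client_states: list of state_dicts; client_sizes: list of dataset sizes."""
    total = float(sum(client_sizes))
    template = copy.deepcopy(client_states[0])
    global_shared = {}
    for key, value in template.items():
        if not key.startswith(shared_prefix):
            continue  # device-specific layers are never shared with the server
        global_shared[key] = sum(
            state[key] * (n / total) for state, n in zip(client_states, client_sizes)
        )
    return global_shared  # broadcast back to every client, which keeps its own personal layers
```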

Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder

  • paper_url: http://arxiv.org/abs/2308.08488
  • repo_url: https://github.com/mispchallenge/misp-icme-avsr
  • paper_authors: Yusheng Dai, Hang Chen, Jun Du, Xiaofei Ding, Ning Ding, Feijun Jiang, Chin-Hui Lee
  • for: 提高音视频语音识别(AVSR)系统的性能,特别是在低质量视频条件下。
  • methods: 提出两种新技术:一是基于普通话中唇形与音节级子词单元的相关性,从唇形中获得较准确的帧级音节边界,从而在视觉模型预训练和跨模态融合中实现视频流与音频流的精确对齐;二是一种音频引导的跨模态融合编码器(CMFE)神经网络,利用多层跨模态注意力来充分发挥多模态的互补性。
  • results: 在 MISP2021-AVSR 数据集上进行实验,证明了两种提posed技术的有效性。使用只需相对较少的训练数据,最终系统的表现优于一些更复杂的前端和后端的现有系统。
    Abstract In recent research, slight performance improvement is observed from automatic speech recognition systems to audio-visual speech recognition systems in the end-to-end framework with low-quality videos. Unmatching convergence rates and specialized input representations between audio and visual modalities are considered to cause the problem. In this paper, we propose two novel techniques to improve audio-visual speech recognition (AVSR) under a pre-training and fine-tuning training framework. First, we explore the correlation between lip shapes and syllable-level subword units in Mandarin to establish good frame-level syllable boundaries from lip shapes. This enables accurate alignment of video and audio streams during visual model pre-training and cross-modal fusion. Next, we propose an audio-guided cross-modal fusion encoder (CMFE) neural network to utilize main training parameters for multiple cross-modal attention layers to make full use of modality complementarity. Experiments on the MISP2021-AVSR data set show the effectiveness of the two proposed techniques. Together, using only a relatively small amount of training data, the final system achieves better performances than state-of-the-art systems with more complex front-ends and back-ends.
    摘要 近期研究显示,在端到端框架下、视频质量较低时,音视频语音识别系统相对于纯音频识别系统只有轻微的性能提升,其原因被认为是音频与视觉模态之间收敛速度不匹配以及各自需要专门的输入表示。本文在预训练加微调的训练框架下提出两种新技术来改进音视频语音识别(AVSR)。首先,我们研究了普通话中唇形与音节级子词单元之间的相关性,从唇形中建立较好的帧级音节边界,使视觉模型预训练和跨模态融合时视频流与音频流能够精确对齐。其次,我们提出了一种音频引导的跨模态融合编码器(CMFE)神经网络,将主要训练参数用于多层跨模态注意力,以充分利用模态间的互补性。在 MISP2021-AVSR 数据集上的实验验证了这两种技术的有效性:仅使用相对较少的训练数据,最终系统即可取得优于若干采用更复杂前端与后端的最先进系统的性能。
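
A rough sketch of an audio-guided cross-modal fusion layer is shown below: audio features attend over visual features and the attended context is added back to the audio stream. The dimensions and single-layer form are illustrative assumptions; the paper's CMFE stacks several such cross-modal attention layers.

```python
# Illustrative audio-guided cross-modal fusion layer (assumed form, not the paper's exact CMFE).
import torch
import torch.nn as nn

class AudioGuidedFusion(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, audio, video):
        """audio: (B, Ta, D) queries; video: (B, Tv, D) keys/values, already in a shared dim."""
        context, _ = self.attn(query=audio, key=video, value=video)
        return self.norm(audio + context)

fusion = AudioGuidedFusion()
out = fusion(torch.randn(2, 50, 256), torch.randn(2, 25, 256))  # fused audio-visual features
```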

pNNCLR: Stochastic Pseudo Neighborhoods for Contrastive Learning based Unsupervised Representation Learning Problems

  • paper_url: http://arxiv.org/abs/2308.06983
  • repo_url: None
  • paper_authors: Momojit Biswas, Himanshu Buckchash, Dilip K. Prasad
  • for: 本研究旨在改进基于最近邻采样的自监督学习(SSL)方法在图像识别问题中的表现,使其获得更丰富的语义变化。
  • methods: 在最近邻采样提供更多语义变化的基础上,引入伪最近邻(pNN)来控制支撑集的质量:不直接采样最近邻,而是通过改变结果向量的幅度并采用随机采样策略,在困难最近邻的邻域内采样;同时以平滑权重更新的方式稳定训练。
  • results: 在多个公开的图像识别和医学图像识别数据集上的评估表明,该方法相比基线最近邻方法性能最高提升约 8%,并与此前提出的其他 SSL 方法性能相当。
    Abstract Nearest neighbor (NN) sampling provides more semantic variations than pre-defined transformations for self-supervised learning (SSL) based image recognition problems. However, its performance is restricted by the quality of the support set, which holds positive samples for the contrastive loss. In this work, we show that the quality of the support set plays a crucial role in any nearest neighbor based method for SSL. We then provide a refined baseline (pNNCLR) to the nearest neighbor based SSL approach (NNCLR). To this end, we introduce pseudo nearest neighbors (pNN) to control the quality of the support set, wherein, rather than sampling the nearest neighbors, we sample in the vicinity of hard nearest neighbors by varying the magnitude of the resultant vector and employing a stochastic sampling strategy to improve the performance. Additionally, to stabilize the effects of uncertainty in NN-based learning, we employ a smooth-weight-update approach for training the proposed network. Evaluation of the proposed method on multiple public image recognition and medical image recognition datasets shows that it performs up to 8 percent better than the baseline nearest neighbor method, and is comparable to other previously proposed SSL methods.
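
The pseudo-nearest-neighbor idea can be sketched as follows: rather than using the retrieved nearest neighbor directly as the positive, its magnitude is perturbed stochastically so that positives lie in the vicinity of the hard neighbor. The cosine retrieval and Gaussian scaling below are illustrative assumptions.

```python
# Sketch of pseudo nearest neighbors (assumed form): retrieve the hard nearest neighbor from
# the support bank, then stochastically rescale it to sample "near" the neighbor.
import torch
import torch.nn.functional as F

def pseudo_nearest_neighbor(z, support_bank, alpha=0.1):
    """z: (B, D) query embeddings; support_bank: (N, D) support-set embeddings."""
    sim = F.normalize(z, dim=1) @ F.normalize(support_bank, dim=1).t()
    nn_vec = support_bank[sim.argmax(dim=1)]                 # hard nearest neighbors
    scale = 1.0 + alpha * torch.randn(z.size(0), 1, device=z.device)
    return scale * nn_vec   # used as the positive in the contrastive (NNCLR-style) loss
```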

Routing Recovery for UAV Networks with Deliberate Attacks: A Reinforcement Learning based Approach

  • paper_url: http://arxiv.org/abs/2308.06973
  • repo_url: None
  • paper_authors: Sijie He, Ziye Jia, Chao Dong, Wei Wang, Yilu Cao, Yang Yang, Qihui Wu
  • for: 本研究针对无人机网络中的攻击问题,提出了一种路由计划和恢复策略。
  • methods: 本文设计了基于节点重要性的蓄意攻击模型,提出了综合考虑节点度与链路重要性的节点重要性排序机制,并使用基于强化学习的智能算法在无人机遭受攻击后恢复路由路径。
  • results: 实验结果表明,提出的方法比其他已知方法更高效,能够有效地恢复路由路径并提高无人机网络的可靠性。
    Abstract The unmanned aerial vehicle (UAV) network is popular these years due to its various applications. In the UAV network, routing is significantly affected by the distributed network topology, leading to the issue that UAVs are vulnerable to deliberate damage. Hence, this paper focuses on the routing plan and recovery for UAV networks with attacks. In detail, a deliberate attack model based on the importance of nodes is designed to represent enemy attacks. Then, a node importance ranking mechanism is presented, considering the degree of nodes and link importance. However, it is intractable to handle the routing problem by traditional methods for UAV networks, since link connections change with the UAV availability. Hence, an intelligent algorithm based on reinforcement learning is proposed to recover the routing path when UAVs are attacked. Simulations are conducted and numerical results verify the proposed mechanism performs better than other referred methods.
    摘要 近年来,无人机(UAV)网络因其多样的应用而广受欢迎。在 UAV 网络中,路由受分布式网络拓扑的显著影响,使得 UAV 容易遭受蓄意破坏。因此,本文关注受攻击情况下 UAV 网络的路由规划与恢复。具体而言,我们设计了一种基于节点重要性的蓄意攻击模型来刻画敌方攻击,并提出了一种综合考虑节点度和链路重要性的节点重要性排序机制。然而,由于链路连接随 UAV 的可用性不断变化,传统方法难以处理 UAV 网络的路由问题。因此,我们提出了一种基于强化学习的智能算法,在 UAV 遭受攻击时恢复路由路径。仿真与数值结果表明,所提机制优于其他对比方法。

BIRP: Bitcoin Information Retrieval Prediction Model Based on Multimodal Pattern Matching

  • paper_url: http://arxiv.org/abs/2308.08558
  • repo_url: None
  • paper_authors: Minsuk Kim, Byungchul Kim, Junyeong Yong, Jeongwoo Park, Gyeongmin Kim
  • for: 该论文旨在提出一种基于PC模式匹配算法的方向预测模型,以提高对财务时间序列的预测能力。
  • methods: 该论文使用了多模式匹配算法来检测财务市场中的征兆,并将PC模式匹配结果作为额外特征来提高方向预测模型的准确性。
  • results: 研究人员通过应用该方法在比特币市场中进行了方向预测,并发现该方法可以提高方向预测的准确性。
    Abstract Financial time series have historically been assumed to be a martingale process under the Random Walk hypothesis. Instead of making investment decisions using the raw prices alone, various multimodal pattern matching algorithms have been developed to help detect subtly hidden repeatable patterns within the financial market. Many of the chart-based pattern matching tools only retrieve similar past chart (PC) patterns given the current chart (CC) pattern, and leaves the entire interpretive and predictive analysis, thus ultimately the final investment decision, to the investors. In this paper, we propose an approach of ranking similar PC movements given the CC information and show that exploiting this as additional features improves the directional prediction capacity of our model. We apply our ranking and directional prediction modeling methodologies on Bitcoin due to its highly volatile prices that make it challenging to predict its future movements.
    摘要 金融时间序列历史上通常被视为一个martingale过程,而不是使用 raw 价格做投资决策。多种多模式匹配算法已经被开发出来帮助检测金融市场中的潜在征性重复模式。许多图表基本 pattern matching 工具只是根据当前图表(CC)提供类似过去图表(PC)模式,留下整个解释和预测分析,最终决策,给投资者。在这篇论文中,我们提出一种基于 CC 信息对 PC 运动进行排名的方法,并证明在我们的模型中利用这些特征可以提高方向预测能力。我们在使用我们的排名和方向预测模型方法时选择比特币,因为它的价格波动性较高,使其预测未来运动更加挑战。
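
As a toy illustration of retrieving similar past chart (PC) patterns for a current chart (CC) window, the sketch below ranks z-normalized price windows by Euclidean distance. The window length, distance measure, and single-price-series setting are assumptions; the paper matches multimodal patterns and feeds the retrieved movements into its direction-prediction model as additional features.

```python
# Toy PC/CC pattern retrieval sketch (assumed single price series and Euclidean matching).
import numpy as np

def znorm(x):
    return (x - x.mean()) / (x.std() + 1e-8)

def rank_similar_patterns(prices, window=30, top_k=5):
    """Return (distance, start_index) of the top_k past windows most similar to the latest one."""
    cc = znorm(prices[-window:])
    scored = []
    for start in range(len(prices) - 2 * window):   # keep room for the "what happened next" part
        pc = znorm(prices[start:start + window])
        scored.append((float(np.linalg.norm(cc - pc)), start))
    scored.sort()
    return scored[:top_k]   # the subsequent movements of each match become extra model features
```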

Graph Structural Residuals: A Learning Approach to Diagnosis

  • paper_url: http://arxiv.org/abs/2308.06961
  • repo_url: None
  • paper_authors: Jan Lukas Augustin, Oliver Niggemann
  • for: 提出了一种基于深度图结构学习的数据驱动系统诊断方法,能够无缝集成图结构学习与基于模型的诊断。
  • methods: 使用两种不同的自监督图结构学习模型架构,从系统的动态观测数据中自动学习其底层结构(以两个不同的图邻接矩阵表示),并重新定义系统表示、观测与故障等概念。
  • results: 通过对耦合振荡器系统的实验,验证了该数据驱动诊断方法的可行性与潜力。
    Abstract Traditional model-based diagnosis relies on constructing explicit system models, a process that can be laborious and expertise-demanding. In this paper, we propose a novel framework that combines concepts of model-based diagnosis with deep graph structure learning. This data-driven approach leverages data to learn the system's underlying structure and provide dynamic observations, represented by two distinct graph adjacency matrices. Our work facilitates a seamless integration of graph structure learning with model-based diagnosis by making three main contributions: (i) redefining the constructs of system representation, observations, and faults (ii) introducing two distinct versions of a self-supervised graph structure learning model architecture and (iii) demonstrating the potential of our data-driven diagnostic method through experiments on a system of coupled oscillators.
    摘要 传统的基于模型的诊断依赖于构建显式的系统模型,这一过程往往费时费力且需要专业知识。本文提出了一种将基于模型的诊断与深度图结构学习相结合的新框架。这种数据驱动的方法利用数据学习系统的底层结构并提供动态观测,两者分别由两个不同的图邻接矩阵表示。我们的工作使图结构学习与基于模型的诊断能够无缝融合,主要贡献有三:(一)重新定义了系统表示、观测与故障等概念;(二)提出了两种不同版本的自监督图结构学习模型架构;(三)通过对耦合振荡器系统的实验,展示了该数据驱动诊断方法的潜力。

Search to Fine-tune Pre-trained Graph Neural Networks for Graph-level Tasks

  • paper_url: http://arxiv.org/abs/2308.06960
  • repo_url: None
  • paper_authors: Zhili Wang, Shimin Di, Lei Chen, Xiaofang Zhou
  • for: This paper aims to design a better fine-tuning strategy for pre-trained graph neural networks (GNNs) to improve their performance on downstream graph-level tasks.
  • methods: The proposed method, called S2PGNN, searches for an appropriate fine-tuning framework for the given labeled data on the downstream task, adaptively designing a suitable strategy for each task.
  • results: The empirical studies show that S2PGNN can be implemented on the top of 10 famous pre-trained GNNs and consistently improve their performance. Additionally, S2PGNN achieves better performance than existing fine-tuning strategies within and outside the GNN area.
    Abstract Recently, graph neural networks (GNNs) have shown its unprecedented success in many graph-related tasks. However, GNNs face the label scarcity issue as other neural networks do. Thus, recent efforts try to pre-train GNNs on a large-scale unlabeled graph and adapt the knowledge from the unlabeled graph to the target downstream task. The adaptation is generally achieved by fine-tuning the pre-trained GNNs with a limited number of labeled data. Despite the importance of fine-tuning, current GNNs pre-training works often ignore designing a good fine-tuning strategy to better leverage transferred knowledge and improve the performance on downstream tasks. Only few works start to investigate a better fine-tuning strategy for pre-trained GNNs. But their designs either have strong assumptions or overlook the data-aware issue for various downstream datasets. Therefore, we aim to design a better fine-tuning strategy for pre-trained GNNs to improve the model performance in this paper. Given a pre-trained GNN, we propose to search to fine-tune pre-trained graph neural networks for graph-level tasks (S2PGNN), which adaptively design a suitable fine-tuning framework for the given labeled data on the downstream task. To ensure the improvement brought by searching fine-tuning strategy, we carefully summarize a proper search space of fine-tuning framework that is suitable for GNNs. The empirical studies show that S2PGNN can be implemented on the top of 10 famous pre-trained GNNs and consistently improve their performance. Besides, S2PGNN achieves better performance than existing fine-tuning strategies within and outside the GNN area. Our code is publicly available at \url{https://anonymous.4open.science/r/code_icde2024-A9CB/}.
    摘要 近年来,图神经网络(GNN)在许多图相关任务上取得了前所未有的成功。然而,与其他神经网络一样,GNN 也面临标签稀缺问题。因此,近期工作尝试在大规模无标签图上预训练 GNN,再将从无标签图中学到的知识迁移到下游目标任务,通常的做法是用有限的标注数据对预训练 GNN 进行微调。尽管微调十分重要,现有的 GNN 预训练工作往往忽视了如何设计良好的微调策略,以更好地利用迁移来的知识并提升下游任务性能;仅有少数工作开始研究预训练 GNN 的微调策略,但其设计要么依赖较强的假设,要么忽略了不同下游数据集的数据感知问题。因此,本文旨在为预训练 GNN 设计更好的微调策略以提升模型性能。给定一个预训练 GNN,我们提出面向图级任务的微调策略搜索方法 S2PGNN,针对下游任务的标注数据自适应地设计合适的微调框架。为确保搜索微调策略带来的改进,我们仔细归纳了一个适用于 GNN 的微调框架搜索空间。实证研究表明,S2PGNN 可以叠加在 10 个著名的预训练 GNN 之上并持续提升其性能,且优于 GNN 领域内外的现有微调策略。代码已公开: https://anonymous.4open.science/r/code_icde2024-A9CB/ 。

Approximating Human-Like Few-shot Learning with GPT-based Compression

  • paper_url: http://arxiv.org/abs/2308.06942
  • repo_url: None
  • paper_authors: Cynthia Huang, Yuqing Xie, Zhiying Jiang, Jimmy Lin, Ming Li
  • for: 本研究旨在使生成式预训练模型具备类人的少样本学习能力,把学习过程视为信息压缩,在推理阶段完成数据压缩。
  • methods: 提出使用生成式预训练 Transformer(GPT)来近似 Kolmogorov 复杂度,进而估计少样本学习中的最优信息距离;并将 GPT 作为无损文本压缩的先验。
  • results: 以 LLAMA2-7B 为骨干的实验在 enwik9 上取得 15.5 的压缩比;在语义相似度、零样本与单样本文本分类以及零样本文本排序等具有挑战性的 NLP 任务上,该方法总体优于基于嵌入和提示的基线。
    Abstract In this work, we conceptualize the learning process as information compression. We seek to equip generative pre-trained models with human-like learning capabilities that enable data compression during inference. We present a novel approach that utilizes the Generative Pre-trained Transformer (GPT) to approximate Kolmogorov complexity, with the aim of estimating the optimal Information Distance for few-shot learning. We first propose using GPT as a prior for lossless text compression, achieving a noteworthy compression ratio. Experiment with LLAMA2-7B backbone achieves a compression ratio of 15.5 on enwik9. We justify the pre-training objective of GPT models by demonstrating its equivalence to the compression length, and, consequently, its ability to approximate the information distance for texts. Leveraging the approximated information distance, our method allows the direct application of GPT models in quantitative text similarity measurements. Experiment results show that our method overall achieves superior performance compared to embedding and prompt baselines on challenging NLP tasks, including semantic similarity, zero and one-shot text classification, and zero-shot text ranking.
    摘要 在本工作中,我们把学习过程概念化为信息压缩,希望让生成式预训练模型具备类人的学习能力,从而在推理阶段实现数据压缩。我们提出了一种新方法,利用生成式预训练 Transformer(GPT)来近似 Kolmogorov 复杂度,以估计少样本学习中的最优信息距离。我们首先提出将 GPT 作为无损文本压缩的先验,取得了可观的压缩比:以 LLAMA2-7B 为骨干的实验在 enwik9 上达到 15.5 的压缩比。我们通过证明 GPT 模型的预训练目标等价于压缩长度,说明其能够近似文本之间的信息距离。借助近似得到的信息距离,我们的方法可以将 GPT 模型直接用于定量的文本相似度度量。实验结果表明,在语义相似度、零样本与单样本文本分类以及零样本文本排序等具有挑战性的 NLP 任务上,该方法总体优于基于嵌入和提示的基线。
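
A compression-based distance of the kind described above can be sketched with a causal language model standing in for the compressor: the code length of a text is its total negative log2-likelihood, and two texts are compared with the normalized-compression-distance formula. The `gpt2` checkpoint and the concatenation scheme are assumptions for illustration; the paper uses its own GPT backbone (e.g. LLAMA2-7B).

```python
# Sketch: language-model code length as a stand-in for Kolmogorov complexity, plugged into
# the normalized compression distance NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def code_length_bits(text):
    ids = tok(text, return_tensors="pt").input_ids
    mean_nll = lm(ids, labels=ids).loss.item()          # mean negative log-likelihood (nats/token)
    return mean_nll * (ids.shape[1] - 1) / math.log(2)  # total bits over the predicted tokens

def ncd(x, y):
    cx, cy, cxy = code_length_bits(x), code_length_bits(y), code_length_bits(x + " " + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

print(ncd("the cat sat on the mat", "a cat was sitting on a mat"))  # smaller = more similar
```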

FusionPlanner: A Multi-task Motion Planner for Mining Trucks using Multi-sensor Fusion Method

  • paper_url: http://arxiv.org/abs/2308.06931
  • repo_url: None
  • paper_authors: Siyu Teng, Luxi Li, Yuchen Li, Xuemin Hu, Lingxi Li, Yunfeng Ai, Long Chen
  • for: This paper proposes a comprehensive paradigm for unmanned transportation in open-pit mines, including a simulation platform, a testing benchmark, and a trustworthy and robust motion planner.
  • methods: The paper proposes a multi-task motion planning algorithm called FusionPlanner, which uses a multi-sensor fusion method to adapt both lateral and longitudinal control tasks for unmanned transportation.
  • results: The performance of FusionPlanner is tested by MiningNav in PMS, and the empirical results demonstrate a significant reduction in the number of collisions and takeovers of their planner.
    Abstract In recent years, significant achievements have been made in motion planning for intelligent vehicles. However, as a typical unstructured environment, open-pit mining attracts limited attention due to its complex operational conditions and adverse environmental factors. A comprehensive paradigm for unmanned transportation in open-pit mines is proposed in this research, including a simulation platform, a testing benchmark, and a trustworthy and robust motion planner. \textcolor{red}{Firstly, we propose a multi-task motion planning algorithm, called FusionPlanner, for autonomous mining trucks by the Multi-sensor fusion method to adapt both lateral and longitudinal control tasks for unmanned transportation. Then, we develop a novel benchmark called MiningNav, which offers three validation approaches to evaluate the trustworthiness and robustness of well-trained algorithms in transportation roads of open-pit mines. Finally, we introduce the Parallel Mining Simulator (PMS), a new high-fidelity simulator specifically designed for open-pit mining scenarios. PMS enables the users to manage and control open-pit mine transportation from both the single-truck control and multi-truck scheduling perspectives.} \textcolor{red}{The performance of FusionPlanner is tested by MiningNav in PMS, and the empirical results demonstrate a significant reduction in the number of collisions and takeovers of our planner. We anticipate our unmanned transportation paradigm will bring mining trucks one step closer to trustworthiness and robustness in continuous round-the-clock unmanned transportation.
    摘要 近年来,在智能汽车运动规划方面有了 significative achievements。然而,由于开采矿场的复杂操作条件和不利环境因素,这种场景吸引了有限的关注。本研究提出了一种涵盖全面的无人运输解决方案,包括仿真平台、测试标准和可靠性和稳定性较高的运动规划算法。首先,我们提出了一种多任务运动规划算法,称为FusionPlanner,用于自动采矿车辆的多感器融合方法,以适应无人运输中的 lateral和longitudinal控制任务。然后,我们开发了一个新的测试标准,称为MiningNav,它提供了三种验证方法来评估训练过的算法在交通路上的可靠性和稳定性。最后,我们介绍了一个新的高级仿真平台,称为Parallel Mining Simulator (PMS),它专门针对开采矿场 scenarios。PMS允许用户在交通路上控制开采矿车辆的单车控制和多车调度两种视角。FusionPlanner的性能被MiningNav在PMS中测试,实际结果表明我们的 плаanner的数量紧急和takeover的减少了显著。我们预计我们的无人运输方案将使采矿车辆一步 closer to trustworthiness和稳定性在无人不断运输中。

FedEdge AI-TC: A Semi-supervised Traffic Classification Method based on Trusted Federated Deep Learning for Mobile Edge Computing

  • paper_url: http://arxiv.org/abs/2308.06924
  • repo_url: None
  • paper_authors: Pan Wang, Zeyi Li, Mengyi Fu, Zixuan Wang, Ze Zhang, MinYao Liu
  • for: 本文旨在提出一种基于联邦学习(Federated Learning,FL)的可信网络流量分类(TC)框架,以提高 5G 客户端设备(CPE)中的 TC 性能。
  • methods: 本文使用机器学习(ML)和深度学习(DL)技术来提升 TC 性能,提出了一种基于变分自编码器(VAE)与卷积神经网络(CNN)的半监督 TC 算法以降低数据依赖;此外,还提出了一种名为 XAI-Pruning 的 AI 模型压缩方法,在减小模型体积的同时保持模型可解释性。
  • results: 实验评估表明,基于 FedEdge AI-TC 框架的 TC 模型在准确率和效率方面均具有明显优势,同时能够保护用户隐私并保证模型可信度。该框架可提升服务质量与安全性,具有广泛的应用前景。
    Abstract As a typical entity of MEC (Mobile Edge Computing), 5G CPE (Customer Premise Equipment)/HGU (Home Gateway Unit) has proven to be a promising alternative to traditional Smart Home Gateway. Network TC (Traffic Classification) is a vital service quality assurance and security management method for communication networks, which has become a crucial functional entity in 5G CPE/HGU. In recent years, many researchers have applied Machine Learning or Deep Learning (DL) to TC, namely AI-TC, to improve its performance. However, AI-TC faces challenges, including data dependency, resource-intensive traffic labeling, and user privacy concerns. The limited computing resources of 5G CPE further complicate efficient classification. Moreover, the "black box" nature of AI-TC models raises transparency and credibility issues. The paper proposes the FedEdge AI-TC framework, leveraging Federated Learning (FL) for reliable Network TC in 5G CPE. FL ensures privacy by employing local training, model parameter iteration, and centralized training. A semi-supervised TC algorithm based on Variational Auto-Encoder (VAE) and convolutional neural network (CNN) reduces data dependency while maintaining accuracy. To optimize model light-weight deployment, the paper introduces XAI-Pruning, an AI model compression method combined with DL model interpretability. Experimental evaluation demonstrates FedEdge AI-TC's superiority over benchmarks in terms of accuracy and efficient TC performance. The framework enhances user privacy and model credibility, offering a comprehensive solution for dependable and transparent Network TC in 5G CPE, thus enhancing service quality and security.
    摘要 作为移动边缘计算(MEC)的典型实体,5G 客户端设备(CPE)/家庭网关(HGU)已被证明是传统智能家庭网关的有力替代方案。网络流量分类(TC)是通信网络中保障服务质量与安全管理的重要手段,已成为 5G CPE/HGU 中的关键功能实体。近年来,许多研究者将机器学习或深度学习应用于 TC(即 AI-TC)以提升其性能。然而,AI-TC 面临数据依赖、流量标注资源消耗大以及用户隐私等挑战,5G CPE 有限的计算资源进一步加大了高效分类的难度;此外,AI-TC 模型的“黑盒”特性也带来了透明度与可信度问题。本文提出 FedEdge AI-TC 框架,利用联邦学习(FL)在 5G CPE 中实现可靠的网络 TC:FL 通过本地训练、模型参数迭代与集中训练保障隐私;基于变分自编码器(VAE)与卷积神经网络(CNN)的半监督 TC 算法在保持准确率的同时降低数据依赖;为优化模型的轻量化部署,本文还提出了结合深度模型可解释性的 AI 模型压缩方法 XAI-Pruning。实验评估表明,FedEdge AI-TC 在准确率和 TC 效率方面优于基准方法。该框架增强了用户隐私与模型可信度,为 5G CPE 中可靠、透明的网络 TC 提供了完整方案,从而提升服务质量与安全性。

Probabilistic contingent planning based on HTN for high-quality plans

  • paper_url: http://arxiv.org/abs/2308.06922
  • repo_url: None
  • paper_authors: Peng Zhao
  • for: The paper is written for planning in partially observable environments, where traditional deterministic planning methods are not practical.
  • methods: The paper proposes a probabilistic contingent Hierarchical Task Network (HTN) planner called High-Quality Contingent Planner (HQCP) to generate high-quality plans in partially observable environments. The planner extends HTN planning formalisms to partial observability and evaluates plans based on cost.
  • results: The paper explores a novel heuristic for high-quality plans and develops an integrated planning algorithm. An empirical study verifies the effectiveness and efficiency of the planner in probabilistic contingent planning and obtaining high-quality plans.
    Abstract Deterministic planning assumes that the planning evolves along a fully predictable path, and therefore it loses the practical value in most real projections. A more realistic view is that planning ought to take into consideration partial observability beforehand and aim for a more flexible and robust solution. What is more significant, it is inevitable that the quality of plan varies dramatically in the partially observable environment. In this paper we propose a probabilistic contingent Hierarchical Task Network (HTN) planner, named High-Quality Contingent Planner (HQCP), to generate high-quality plans in the partially observable environment. The formalisms in HTN planning are extended into partial observability and are evaluated regarding the cost. Next, we explore a novel heuristic for high-quality plans and develop the integrated planning algorithm. Finally, an empirical study verifies the effectiveness and efficiency of the planner both in probabilistic contingent planning and for obtaining high-quality plans.

Chatbots in Drug Discovery: A Case Study on Anti-Cocaine Addiction Drug Development with ChatGPT

  • paper_url: http://arxiv.org/abs/2308.06920
  • repo_url: None
  • paper_authors: Rui Wang, Hongsong Feng, Guo-Wei Wei
  • for: 这篇论文的目的是开发抗可卡因成瘾药物,利用 GPT-4 作为虚拟向导,为从事药物候选分子生成模型研究的人员提供策略与方法上的指导,以获得更有价值的药物候选分子。
  • methods: 该研究使用 GPT-4 语言模型聊天机器人作为虚拟向导,为研究人员提供策略和方法指导,以生成具有期望性质的类药分子。
  • results: 研究发现,借助 GPT-4 聊天机器人,研究人员能够更高效地设计和开发更有价值的药物候选分子,体现了人类专业知识与 AI 辅助之间的协同作用。
    Abstract The birth of ChatGPT, a cutting-edge language model chatbot developed by OpenAI, ushered in a new era in AI, and this paper vividly showcases its innovative application within the field of drug discovery. Focused specifically on developing anti-cocaine addiction drugs, the study employs GPT-4 as a virtual guide, offering strategic and methodological insights to researchers working on generative models for drug candidates. The primary objective is to generate optimal drug-like molecules with desired properties. By leveraging the capabilities of ChatGPT, the study introduces a novel approach to the drug discovery process. This symbiotic partnership between AI and researchers transforms how drug development is approached. Chatbots become facilitators, steering researchers towards innovative methodologies and productive paths for creating effective drug candidates. This research sheds light on the collaborative synergy between human expertise and AI assistance, wherein ChatGPT's cognitive abilities enhance the design and development of potential pharmaceutical solutions. This paper not only explores the integration of advanced AI in drug discovery but also reimagines the landscape by advocating for AI-powered chatbots as trailblazers in revolutionizing therapeutic innovation.
    摘要 由 OpenAI 开发的前沿语言模型聊天机器人 ChatGPT 的诞生开启了 AI 的新时代,本文生动地展示了其在药物发现领域的创新应用。该研究专注于开发抗可卡因成瘾药物,利用 GPT-4 作为虚拟向导,为从事药物候选分子生成模型研究的人员提供策略与方法上的见解,主要目标是生成具有期望性质的最优类药分子。借助 ChatGPT 的能力,该研究为药物发现流程引入了一种新范式:AI 与研究人员之间的协同伙伴关系改变了药物研发的方式,聊天机器人成为引导者,带领研究人员探索创新方法与高效路径,以设计有效的药物候选分子。这项研究揭示了人类专业知识与 AI 辅助之间的协同作用,ChatGPT 的认知能力增强了潜在药物方案的设计与开发。本文不仅探讨了先进 AI 在药物发现中的集成,还倡导以 AI 驱动的聊天机器人作为推动治疗创新变革的先行者,重新构想了这一领域的图景。

A Novel Ehanced Move Recognition Algorithm Based on Pre-trained Models with Positional Embeddings

  • paper_url: http://arxiv.org/abs/2308.10822
  • repo_url: None
  • paper_authors: Hao Wen, Jie Wang, Xiaodong Qiao
  • for: 这篇论文主要针对提升中文科技论文摘要中语步(Move)识别的效果。
  • methods: 该论文提出了一种基于改进预训练模型与带注意力机制的门控网络的增强语步识别算法,用于处理中文科技论文的非结构化摘要。
  • results: 实验结果显示,该算法相比原始数据集,在分割数据集上提高了13.37%的准确率,并在基础比较模型上提高了7.55%的准确率。
    Abstract The recognition of abstracts is crucial for effectively locating the content and clarifying the article. Existing move recognition algorithms lack the ability to learn word position information to obtain contextual semantics. This paper proposes a novel enhanced move recognition algorithm with an improved pre-trained model and a gated network with attention mechanism for unstructured abstracts of Chinese scientific and technological papers. The proposed algorithm first performs summary data segmentation and vocabulary training. The EP-ERNIE$\_$AT-GRU framework is leveraged to incorporate word positional information, facilitating deep semantic learning and targeted feature extraction. Experimental results demonstrate that the proposed algorithm achieves 13.37$\%$ higher accuracy on the split dataset than on the original dataset and a 7.55$\%$ improvement in accuracy over the basic comparison model.
    摘要 语步(Move)识别对于定位论文内容、理清文章结构至关重要,而现有的语步识别算法缺乏学习词语位置信息以获取上下文语义的能力。本文针对中文科技论文的非结构化摘要,提出了一种基于改进预训练模型与带注意力机制的门控网络的新型增强语步识别算法。该算法首先进行摘要数据切分与词表训练,再利用 EP-ERNIE$\_$AT-GRU 框架融入词语位置信息,实现深层语义学习与针对性特征提取。实验结果表明,该算法在切分后的数据集上比原始数据集的准确率高 13.37%,并比基础对比模型的准确率高 7.55%。

Hierarchy Flow For High-Fidelity Image-to-Image Translation

  • paper_url: http://arxiv.org/abs/2308.06909
  • repo_url: https://github.com/weichenfan/hierarchyflow
  • paper_authors: Weichen Fan, Jinghuan Chen, Ziwei Liu
  • for: 本研究旨在提高图像到图像翻译中的内容保持性。
  • methods: 我们提出了一种新的流基模型,即层次流(Hierarchy Flow),以提高翻译过程中的内容保持性。
  • results: 我们的方法在各种图像到图像翻译 benchmark 上取得了最先进的性能,特别是在强保真和普通保真翻译任务中表现出明显优势。
    Abstract Image-to-image (I2I) translation comprises a wide spectrum of tasks. Here we divide this problem into three levels: strong-fidelity translation, normal-fidelity translation, and weak-fidelity translation, indicating the extent to which the content of the original image is preserved. Although existing methods achieve good performance in weak-fidelity translation, they fail to fully preserve the content in both strong- and normal-fidelity tasks, e.g. sim2real, style transfer and low-level vision. In this work, we propose Hierarchy Flow, a novel flow-based model to achieve better content preservation during translation. Specifically, 1) we first unveil the drawbacks of standard flow-based models when applied to I2I translation. 2) Next, we propose a new design, namely hierarchical coupling for reversible feature transformation and multi-scale modeling, to constitute Hierarchy Flow. 3) Finally, we present a dedicated aligned-style loss for a better trade-off between content preservation and stylization during translation. Extensive experiments on a wide range of I2I translation benchmarks demonstrate that our approach achieves state-of-the-art performance, with convincing advantages in both strong- and normal-fidelity tasks. Code and models will be at https://github.com/WeichenFan/HierarchyFlow.
    摘要 Image-to-image(I2I)翻译包括多种任务。我们在这里将这个问题分为三级:强度精度翻译、常规精度翻译和弱度精度翻译,这些级别指的是原始图像内容的保留程度。虽然现有方法在弱度翻译任务上达到了好的性能,但是它们在强度翻译和常规翻译任务中却失败了,例如sim2real、style transfer和低级视觉。在这种情况下,我们提出了一种新的方法——层次流模型,以提高翻译过程中内容的保留。具体来说,我们首先揭示了标准流模型在I2I翻译中的缺陷。然后,我们提出了一种新的设计——层次协调对应的反转特征转换和多尺度模型,以构成层次流模型。最后,我们提出了一种专门设计的对齐风格损失,以实现在翻译过程中更好的内容保留和风格化融合。我们在多种I2I翻译 benchmark 上进行了广泛的实验,并证明了我们的方法可以达到状态之最的性能,在强度翻译和常规翻译任务中都有证明性的优势。代码和模型将在 GitHub 上提供。

Generative Interpretation

  • paper_url: http://arxiv.org/abs/2308.06907
  • repo_url: https://github.com/yonathanarbel/generativeinterpretation
  • paper_authors: Yonathan A. Arbel, David Hoffman
  • for: 这篇论文目的是提出一种新的合同解释方法,使用大语言模型来估算合同意义。
  • methods: 这篇论文使用了大语言模型来实现这个目的,并通过实践案例来示例其能力。
  • results: 论文显示这些模型可以帮助事实认定者在语境中确定合同的通常含义、量化歧义,并填补当事方协议中的空白;还展示了这些模型能够评估外部证据的证明价值。
    Abstract We introduce generative interpretation, a new approach to estimating contractual meaning using large language models. As AI triumphalism is the order of the day, we proceed by way of grounded case studies, each illustrating the capabilities of these novel tools in distinct ways. Taking well-known contracts opinions, and sourcing the actual agreements that they adjudicated, we show that AI models can help factfinders ascertain ordinary meaning in context, quantify ambiguity, and fill gaps in parties' agreements. We also illustrate how models can calculate the probative value of individual pieces of extrinsic evidence. After offering best practices for the use of these models given their limitations, we consider their implications for judicial practice and contract theory. Using LLMs permits courts to estimate what the parties intended cheaply and accurately, and as such generative interpretation unsettles the current interpretative stalemate. Their use responds to efficiency-minded textualists and justice-oriented contextualists, who argue about whether parties will prefer cost and certainty or accuracy and fairness. Parties--and courts--would prefer a middle path, in which adjudicators strive to predict what the contract really meant, admitting just enough context to approximate reality while avoiding unguided and biased assimilation of evidence. As generative interpretation offers this possibility, we argue it can become the new workhorse of contractual interpretation.
    摘要 我们介绍生成解释,一种新的方法来估算合同意义使用大语言模型。在人工智能胜利的时代,我们通过实际案例来证明这些新工具在不同方面的能力。使用知名合同意见和实际协议,我们表明AI模型可以帮助判决者了解具体的意思,衡量不确定性,并填充党们的协议中的缺陷。我们还示出了模型可以计算个别外部证据的证据价值。在使用这些模型的限制后,我们考虑了它们对法律实践和合同理论的影响。使用LLMs permit courts to estimate what the parties intended at a low cost and high accuracy, and thus generative interpretation breaks the current interpretive deadlock. Its use responds to efficiency-minded textualists and justice-oriented contextualists, who argue about whether parties will prefer cost and certainty or accuracy and fairness. Parties and courts would prefer a middle path, in which adjudicators strive to predict what the contract really meant, admitting just enough context to approximate reality while avoiding unguided and biased assimilation of evidence. As generative interpretation offers this possibility, we argue it can become the new workhorse of contractual interpretation.

The Michigan Robotics Undergraduate Curriculum: Defining the Discipline of Robotics for Equity and Excellence

  • paper_url: http://arxiv.org/abs/2308.06905
  • repo_url: None
  • paper_authors: Odest Chadwicke Jenkins, Jessy Grizzle, Ella Atkins, Leia Stirling, Elliott Rouse, Mark Guzdial, Damen Provost, Kimberly Mann, Joanna Millunchick
  • for: The paper is written to propose and establish a new undergraduate program in Robotics at the University of Michigan, with a focus on equity and excellence.
  • methods: The program is designed with an adaptable curriculum that is accessible through a diversity of student pathways, and includes partnerships with Historically Black Colleges and Universities.
  • results: The program has been highly successful in its first academic year, with over 100 students declaring Robotics as their major, completion of the Robotics major by the first two graduates, and soaring enrollments in Robotics classes.
    Abstract The Robotics Major at the University of Michigan was successfully launched in the 2022-23 academic year as an innovative step forward to better serve students, our communities, and our society. Building on our guiding principle of "Robotics with Respect" and our larger Robotics Pathways model, the Michigan Robotics Major was designed to define robotics as a true academic discipline with both equity and excellence as our highest priorities. Understanding that talent is equally distributed but opportunity is not, the Michigan Robotics Major has embraced an adaptable curriculum that is accessible through a diversity of student pathways and enables successful and sustained career-long participation in robotics, AI, and automation professions. The results after our planning efforts (2019-22) and first academic year (2022-23) have been highly encouraging: more than 100 students declared Robotics as their major, completion of the Robotics major by our first two graduates, soaring enrollments in our Robotics classes, thriving partnerships with Historically Black Colleges and Universities. This document provides our original curricular proposal for the Robotics Undergraduate Program at the University of Michigan, submitted to the Michigan Association of State Universities in April 2022 and approved in June 2022. The dissemination of our program design is in the spirit of continued growth for higher education towards realizing equity and excellence. The most recent version of this document is also available on Google Docs through this link: https://ocj.me/robotics_major
    摘要 美国密歇根大学机器人学专业在2022-23学年 successfully 发起了一个创新的步骤,以更好地服务学生、社区和社会。 基于我们的指导原则“机器人学 avec 尊重”和我们更大的机器人路径模型,密歇根大学机器人学专业是为了定义机器人学为真正的学术领域,并将 equity 和 excellence 作为我们最高的优先级。 理解才华平等分布,但机会不平等分布,密歇根大学机器人学专业采用了适应性课程,通过多种学生路径访问,帮助学生在机器人、人工智能和自动化领域成功地职业发展。 我们的规划努力(2019-22)和首学年(2022-23)的结果非常鼓舞人:More than 100 学生声明机器人学为他们的主修,首两名毕业生完成机器人学专业,课程报名人数在趋升,与 Historically Black Colleges and Universities 的合作也在逐渐增长。这份文件包含我们原始的课程设计提案,在2022年6月由密歇根州大学协会批准。 我们希望通过分享我们的program design ,促进高等教育的增长,实现 equity 和 excellence。最新版本的这份文件可以在 Google Docs 上通过以下链接获取:https://ocj.me/robotics_major

Robustified ANNs Reveal Wormholes Between Human Category Percepts

  • paper_url: http://arxiv.org/abs/2308.06887
  • repo_url: https://github.com/ggaziv/wormholes
  • paper_authors: Guy Gaziv, Michael J. Lee, James J. DiCarlo
  • for: 这篇论文旨在探讨人工神经网络(ANNs)对图像分类的敏感性,以及人类视觉处理模型的不足。
  • methods: 研究人员使用标准 ANN 模型和鲁棒化(robustified)ANN 模型生成小范数图像扰动,并评估人类视觉感知对这些扰动的稳定性。
  • results: 研究发现,人类感知对标准 ANN 生成的小范数图像扰动高度稳定,而鲁棒化 ANN 模型能够可靠地找到低范数图像扰动,使人类感知发生强烈改变。此外,研究还发现了“虫洞”现象:从图像空间中的任意起点出发,都存在一组邻近的“虫洞”,能将人类感知从当前类别状态带入语义截然不同的另一状态。
    Abstract The visual object category reports of artificial neural networks (ANNs) are notoriously sensitive to tiny, adversarial image perturbations. Because human category reports (aka human percepts) are thought to be insensitive to those same small-norm perturbations -- and locally stable in general -- this argues that ANNs are incomplete scientific models of human visual perception. Consistent with this, we show that when small-norm image perturbations are generated by standard ANN models, human object category percepts are indeed highly stable. However, in this very same "human-presumed-stable" regime, we find that robustified ANNs reliably discover low-norm image perturbations that strongly disrupt human percepts. These previously undetectable human perceptual disruptions are massive in amplitude, approaching the same level of sensitivity seen in robustified ANNs. Further, we show that robustified ANNs support precise perceptual state interventions: they guide the construction of low-norm image perturbations that strongly alter human category percepts toward specific prescribed percepts. These observations suggest that for arbitrary starting points in image space, there exists a set of nearby "wormholes", each leading the subject from their current category perceptual state into a semantically very different state. Moreover, contemporary ANN models of biological visual processing are now accurate enough to consistently guide us to those portals.
    摘要 人工神经网络(ANNs)的视觉物体类别报告具有极其敏感于微小、敌意的图像扰动的特点。因为人类的视觉报告(也称为人类感知)被认为不受这些小范围扰动的影响,而且在总体上是稳定的,这意味着ANNs是人类视觉认知的不完整科学模型。我们的研究表明,当使用标准ANN模型生成小范围扰动时,人类对象类别报告是非常稳定的。然而,在这个“人类假设稳定”的 régime中,我们发现了使用强化ANN模型时发现的低范围扰动,这些扰动会强烈地打乱人类报告。这些扰动的振荡强度非常大,相当于强化ANNs的敏感度。此外,我们发现了使用强化ANN模型支持精确的感知状态改变:它们可以生成低范围扰动,使人类category报告强烈地改变。这些观察表明,对于任意的图像初始状态,存在一组“蠕虫洞”,每个蠕虫洞都可以将主体从其当前的报告状态转移到具有不同semantics的状态。此外,当代的生物视觉处理ANN模型已经够精度,可以一直引导我们到这些门户。

Optimizing Offensive Gameplan in the National Basketball Association with Machine Learning

  • paper_url: http://arxiv.org/abs/2308.06851
  • repo_url: None
  • paper_authors: Eamon Mukhopadhyay
  • for: This paper aims to verify the effectiveness of the Offensive Rating (ORTG) metric in predicting different NBA playtypes.
  • methods: The authors use both linear regression and neural network regression models to evaluate the correlation between ORTG and different playtypes.
  • results: The authors find that both models have a strong correlation with the playtypes, but the neural network model performs slightly better than the linear regression model. They also use the accuracy of the models to optimize the output of the model with test examples, demonstrating the combination of features that can achieve a highly functioning offense.
    Abstract Throughout the analytical revolution that has occurred in the NBA, the development of specific metrics and formulas has given teams, coaches, and players a new way to see the game. However - the question arises - how can we verify any metrics? One method would simply be eyeball approximation (trying out many different gameplans) and/or trial and error - an estimation-based and costly approach. Another approach is to try to model already existing metrics with a unique set of features using machine learning techniques. The key to this approach is that with these features that are selected, we can try to gauge the effectiveness of these features combined, rather than using individual analysis in simple metric evaluation. If we have an accurate model, it can particularly help us determine the specifics of gameplan execution. In this paper, the statistic ORTG (Offensive Rating, developed by Dean Oliver) was found to have a correlation with different NBA playtypes using both a linear regression model and a neural network regression model, although ultimately, a neural network worked slightly better than linear regression. Using the accuracy of the models as a justification, the next step was to optimize the output of the model with test examples, which would demonstrate the combination of features to best achieve a highly functioning offense.
    摘要 在NBA analytics革命中,发展特定的指标和公式为球队、教练和球员提供了一种新的视角。然而,问题 arise - 如何验证这些指标呢?一种方法是通过观察多个不同的战斗策略来估算(trying out many different gameplans),以及或者试错法则 - 一种估计基于的成本高的方法。另一种方法是使用机器学习技术来模型已有的指标,并选择一组独特的特征。这种方法的关键在于,我们可以通过这些选择的特征来评估这些特征的组合效果,而不是单独评估指标。如果我们有一个准确的模型,那么它可以尤其帮助我们确定游戏计划的执行细节。在这篇论文中,发展了ORTG(Offensive Rating,由Dean Oliver创造)指标,与不同的NBA战斗类型之间存在相关性,使用线性回归模型和神经网络回归模型进行相关性分析,最终发现神经网络模型的准确性略高于线性回归模型。使用模型准确性作为正当化,下一步是优化模型输出的测试例子,以示出最佳执行游戏计划所需的组合特征。

A Parallel Ensemble of Metaheuristic Solvers for the Traveling Salesman Problem

  • paper_url: http://arxiv.org/abs/2308.07347
  • repo_url: None
  • paper_authors: Swetha Varadarajan, Darrell Whitley
  • for: 求解旅行商问题(TSP),这是一个被广泛研究的 NP 难问题。
  • methods: 使用 Lin-Kernighan-Helsgaun(LKH)启发式算法与 Edge Assembly Crossover(EAX)算法,以及 EAX 与 Mixing Genetic Algorithm(MGA)的混合版本。
  • results: 将这些求解器组合成并行集成(ensemble)后,其性能可以超越单个求解器,尤其是在城市数超过 10,000 的问题上。
    Abstract The travelling salesman problem (TSP) is one of the well-studied NP-hard problems in the literature. The state-of-the art inexact TSP solvers are the Lin-Kernighan-Helsgaun (LKH) heuristic and Edge Assembly crossover (EAX). A recent study suggests that EAX with restart mechanisms perform well on a wide range of TSP instances. However, this study is limited to 2,000 city problems. We study for problems ranging from 2,000 to 85,900. We see that the performance of the solver varies with the type of the problem. However, combining these solvers in an ensemble setup, we are able to outperform the individual solver's performance. We see the ensemble setup as an efficient way to make use of the abundance of compute resources. In addition to EAX and LKH, we use several versions of the hybrid of EAX and Mixing Genetic Algorithm (MGA). A hybrid of MGA and EAX is known to solve some hard problems. We see that the ensemble of the hybrid version outperforms the state-of-the-art solvers on problems larger than 10,000 cities.
    摘要 旅行商问题(TSP)是文献中被广泛研究的 NP 难问题之一。当前最先进的非精确 TSP 求解器是 Lin-Kernighan-Helsgaun(LKH)启发式算法和 Edge Assembly Crossover(EAX)。最近的一项研究表明,带重启机制的 EAX 在很多 TSP 实例上表现良好,但该研究仅限于 2,000 个城市规模的问题。我们研究了从 2,000 到 85,900 个城市的问题,发现求解器的表现随问题类型而变化;然而,将这些求解器组合成集成(ensemble)后,能够超越单个求解器的性能。我们认为这种集成方式是充分利用丰富计算资源的有效途径。除 EAX 和 LKH 外,我们还使用了多个版本的 EAX 与 Mixing Genetic Algorithm(MGA)混合算法,而这种混合算法已被证明能求解一些困难问题。我们发现,在城市数超过 10,000 的问题上,混合版本的集成优于当前最先进的求解器。

Diagnostic Reasoning Prompts Reveal the Potential for Large Language Model Interpretability in Medicine

  • paper_url: http://arxiv.org/abs/2308.06834
  • repo_url: None
  • paper_authors: Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, Jonathan H Chen
  • for: 本研究旨在探讨语言模型在医学领域中是否可以准确地诊断疾病,并使用什么方法来实现这一目标。
  • methods: 本研究使用了GPT4语言模型,并开发了一些新的诊断逻辑提问来评估语言模型的诊断能力。
  • results: 研究发现,借助新设计的诊断推理提示,GPT-4 能够在不牺牲诊断准确性的情况下模仿临床医生常用的诊断推理过程。这表明可以通过此类提示让语言模型给出可解释的诊断理由。
    Abstract One of the major barriers to using large language models (LLMs) in medicine is the perception they use uninterpretable methods to make clinical decisions that are inherently different from the cognitive processes of clinicians. In this manuscript we develop novel diagnostic reasoning prompts to study whether LLMs can perform clinical reasoning to accurately form a diagnosis. We find that GPT4 can be prompted to mimic the common clinical reasoning processes of clinicians without sacrificing diagnostic accuracy. This is significant because an LLM that can use clinical reasoning to provide an interpretable rationale offers physicians a means to evaluate whether LLMs can be trusted for patient care. Novel prompting methods have the potential to expose the black box of LLMs, bringing them one step closer to safe and effective use in medicine.
    摘要 一个主要阻碍大语言模型(LLMs)在医学中使用的问题是人们认为它们使用不可解释的方法来做诊断决策,这与医生的认知过程有很大差异。在这篇论文中,我们开发了新的诊断思维提要,以研究LLMs是否可以准确地进行诊断。我们发现GPT4可以通过模仿常见的临床思维过程来提供一个可解释的诊断原因,而不会产生诊断准确性的损失。这是重要的,因为一个可以使用临床思维来提供可解释的诊断原因的 LLM 可以让医生评估 LLMS 是否可以用于患者护理。这种新的提要方法有助于暴露 LLMS 的黑盒子,使其更接近安全和有效的使用。

InTune: Reinforcement Learning-based Data Pipeline Optimization for Deep Recommendation Models

  • paper_url: http://arxiv.org/abs/2308.08500
  • repo_url: None
  • paper_authors: Kabir Nagrecha, Lingyi Liu, Pablo Delgado, Prasanna Padmanabhan
  • for: 这篇论文旨在探讨深度学习推荐模型(DLRM)训练中的数据接入问题,以及相关的管道瓶颈和挑战。
  • methods: 本论文使用强化学习(RL)智能体来学习如何在训练机器的数据流水线各环节之间分配 CPU 资源,从而更好地并行化数据加载、提升吞吐量。
  • results: 实验表明,InTune 能在几分钟内构建出优化的数据流水线配置,并可轻松集成到现有训练工作流中。InTune 能提高在线数据摄取速率,减少模型执行中的空闲时间、提升效率。在实际场景中,InTune 将数据摄取吞吐量提升最高 2.29 倍,同时提高 CPU 与 GPU 的资源利用率。
    Abstract Deep learning-based recommender models (DLRMs) have become an essential component of many modern recommender systems. Several companies are now building large compute clusters reserved only for DLRM training, driving new interest in cost- and time- saving optimizations. The systems challenges faced in this setting are unique; while typical deep learning training jobs are dominated by model execution, the most important factor in DLRM training performance is often online data ingestion. In this paper, we explore the unique characteristics of this data ingestion problem and provide insights into DLRM training pipeline bottlenecks and challenges. We study real-world DLRM data processing pipelines taken from our compute cluster at Netflix to observe the performance impacts of online ingestion and to identify shortfalls in existing pipeline optimizers. We find that current tooling either yields sub-optimal performance, frequent crashes, or else requires impractical cluster re-organization to adopt. Our studies lead us to design and build a new solution for data pipeline optimization, InTune. InTune employs a reinforcement learning (RL) agent to learn how to distribute the CPU resources of a trainer machine across a DLRM data pipeline to more effectively parallelize data loading and improve throughput. Our experiments show that InTune can build an optimized data pipeline configuration within only a few minutes, and can easily be integrated into existing training workflows. By exploiting the responsiveness and adaptability of RL, InTune achieves higher online data ingestion rates than existing optimizers, thus reducing idle times in model execution and increasing efficiency. We apply InTune to our real-world cluster, and find that it increases data ingestion throughput by as much as 2.29X versus state-of-the-art data pipeline optimizers while also improving both CPU & GPU utilization.
    摘要 深度学习基本的推荐模型(DLRM)已经成为现代推荐系统中的重要组件。许多公司现在为DLRM训练建立大型计算集群,导致新的成本和时间OPTIMIZATION需求。这些系统的挑战是独特的; Typical deep learning training Jobs是模型执行所dominated,但DLRM训练性能中最重要的因素通常是在线数据取入。 在这篇论文中,我们探索DLRM数据取入问题的独特特性并提供了推荐管道瓶颈和挑战。我们研究了Netflix的compute集群中的实际DLRM数据处理管道,并观察了在线取入的性能影响以及现有管道优化工具的缺点。我们发现现有工具可能会导致下 optimize performance,或者频繁崩溃,或者需要重新组织集群以采用。我们的研究导致我们设计并建立了一个新的数据管道优化解决方案,即InTune。InTune使用了强化学习(RL)代理来分配训练机器的CPU资源在DLRM数据管道中更有效地并行数据加载和提高吞吐量。我们的实验表明,InTune可以在只需几分钟之内构建优化的数据管道配置,并可以轻松地与现有训练工作流 integrate。通过强化学习的响应和适应性,InTune可以在现有优化器的基础上提高在线数据取入率,从而降低模型执行时的空闲时间和提高效率。我们在实际集群上应用InTune,发现它可以提高数据取入吞吐量达2.29倍,而且同时提高CPU和GPU资源利用率。

An Ensemble Approach to Question Classification: Integrating Electra Transformer, GloVe, and LSTM

  • paper_url: http://arxiv.org/abs/2308.06828
  • repo_url: None
  • paper_authors: Sanad Aburass, Osama Dorgham
  • for: 本研究针对问题分类任务提出了一种新的集成(ensemble)方法。
  • methods: 所提模型将 Electra、GloVe 和 LSTM 三种最先进的模型组合为一个集成模型。
  • results: 在 TREC 数据集上进行训练和评估,结果显示所提模型在所有评估指标上均优于 BERT、RoBERTa、DistilBERT 等其他前沿模型,测试集准确率达到 0.8。
    Abstract This paper introduces a novel ensemble approach for question classification using state-of-the-art models -- Electra, GloVe, and LSTM. The proposed model is trained and evaluated on the TREC dataset, a well-established benchmark for question classification tasks. The ensemble model combines the strengths of Electra, a transformer-based model for language understanding, GloVe, a global vectors for word representation, and LSTM, a recurrent neural network variant, providing a robust and efficient solution for question classification. Extensive experiments were carried out to compare the performance of the proposed ensemble approach with other cutting-edge models, such as BERT, RoBERTa, and DistilBERT. Our results demonstrate that the ensemble model outperforms these models across all evaluation metrics, achieving an accuracy of 0.8 on the test set. These findings underscore the effectiveness of the ensemble approach in enhancing the performance of question classification tasks, and invite further exploration of ensemble methods in natural language processing.
    摘要 这篇论文介绍了一种新的集成方法用于问题分类,使用当前的模型---Electra、GloVe和LSTM。该提议的模型在TREC数据集上进行训练和评估,这是一个已知的问题分类任务 benchmark。集成模型 combinesthe strengths of Electra、GloVe和LSTM,提供一种强大和高效的问题分类解决方案。我们进行了广泛的实验,比较了该集成模型与其他当前最佳模型,如BERT、RoBERTa和DistilBERT的性能。我们的结果表明,集成模型在所有评估指标上都超过了这些模型,实现了测试集上的准确率0.8。这些发现证明了集成方法在问题分类任务中的效iveness,并邀请进一步探索集成方法在自然语言处理领域的应用。
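
A minimal soft-voting sketch of the ensemble idea is given below: class probabilities from the component classifiers (e.g. an Electra model and a GloVe+LSTM model) are averaged before taking the argmax. The equal weights and two-member setup are illustrative assumptions about how the components could be combined.

```python
# Soft-voting ensemble sketch (assumed combination scheme) for TREC question classification.
import numpy as np

def ensemble_predict(prob_list, weights=None):
    """prob_list: list of (N, num_classes) softmax outputs from the component models."""
    weights = weights or [1.0 / len(prob_list)] * len(prob_list)
    avg = sum(w * p for w, p in zip(weights, prob_list))
    return avg.argmax(axis=1)

# Example with random stand-in outputs for two component models on 4 questions, 6 classes.
rng = np.random.default_rng(0)
probs_a, probs_b = rng.dirichlet(np.ones(6), 4), rng.dirichlet(np.ones(6), 4)
print(ensemble_predict([probs_a, probs_b]))
```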

Reinforcement Graph Clustering with Unknown Cluster Number

  • paper_url: http://arxiv.org/abs/2308.06827
  • repo_url: https://github.com/yueliu1999/awesome-deep-graph-clustering
  • paper_authors: Yue Liu, Ke Liang, Jun Xia, Xihong Yang, Sihang Zhou, Meng Liu, Xinwang Liu, Stan Z. Li
  • for: 该方法旨在无需预先给定簇数量的情况下,以无监督方式利用神经网络完成深度图聚类。
  • methods: 该方法先通过对比式前置任务学习判别性的节点表示,再利用强化学习机制,结合节点与簇的状态评估不同簇数量的质量,贪心地确定簇数量,并以面向聚类的奖励函数增强同簇内聚、异簇分离。
  • results: 实验表明,该方法兼具高效率与高准确率,能够在簇数量未知的情况下完成深度图聚类。
    Abstract Deep graph clustering, which aims to group nodes into disjoint clusters by neural networks in an unsupervised manner, has attracted great attention in recent years. Although the performance has been largely improved, the excellent performance of the existing methods heavily relies on an accurately predefined cluster number, which is not always available in the real-world scenario. To enable the deep graph clustering algorithms to work without the guidance of the predefined cluster number, we propose a new deep graph clustering method termed Reinforcement Graph Clustering (RGC). In our proposed method, cluster number determination and unsupervised representation learning are unified into a uniform framework by the reinforcement learning mechanism. Concretely, the discriminative node representations are first learned with the contrastive pretext task. Then, to capture the clustering state accurately with both local and global information in the graph, both node and cluster states are considered. Subsequently, at each state, the qualities of different cluster numbers are evaluated by the quality network, and the greedy action is executed to determine the cluster number. In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters. Extensive experiments demonstrate the effectiveness and efficiency of our proposed method. The source code of RGC is shared at https://github.com/yueliu1999/RGC and a collection (papers, codes and, datasets) of deep graph clustering is shared at https://github.com/yueliu1999/Awesome-Deep-Graph-Clustering on Github.
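The cluster-number search can be pictured as a small decision loop over candidate actions. Below is a hedged sketch in which a silhouette-style cohesion/separation score stands in for the learned quality network and reward; the actual RGC learns both from node and cluster states rather than using this fixed heuristic.

```python
# Sketch of "evaluate candidate cluster numbers, act greedily" with a stand-in reward.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def clustering_reward(embeddings, labels):
    # Higher when same-cluster nodes are cohesive and different clusters are separated.
    return silhouette_score(embeddings, labels)

def choose_cluster_number(embeddings, candidates=range(2, 11)):
    best_k, best_r = None, -np.inf
    for k in candidates:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embeddings)
        r = clustering_reward(embeddings, labels)  # feedback signal for this action
        if r > best_r:                             # greedy action over cluster numbers
            best_k, best_r = k, r
    return best_k, best_r

emb = np.random.default_rng(0).normal(size=(200, 16))  # stand-in for contrastively learned node embeddings
print(choose_cluster_number(emb))
```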

Approximate and Weighted Data Reconstruction Attack in Federated Learning

  • paper_url: http://arxiv.org/abs/2308.06822
  • repo_url: None
  • paper_authors: Ziqi Wang, Yongcun Song, Enrique Zuazua
  • for: This paper proposes a data reconstruction attack that targets the most widely used horizontal Federated Averaging (FedAvg) scenario in Federated Learning (FL), improving both attack effectiveness and reconstruction quality.
  • methods: An interpolation-based approximation method generates the intermediate model updates of each client's local training process in an attackable form; a layer-wise weighted loss function then improves the quality of the reconstructed data, with the weights tuned by Bayesian optimization.
  • results: Experiments show that the proposed approximate and weighted attack (AWA) outperforms other state-of-the-art methods in attack effectiveness and reconstruction quality, particularly on image data reconstruction tasks.
    Abstract Federated Learning (FL) is a distributed learning paradigm that enables multiple clients to collaborate on building a machine learning model without sharing their private data. Although FL is considered privacy-preserved by design, recent data reconstruction attacks demonstrate that an attacker can recover clients' training data based on the parameters shared in FL. However, most existing methods fail to attack the most widely used horizontal Federated Averaging (FedAvg) scenario, where clients share model parameters after multiple local training steps. To tackle this issue, we propose an interpolation-based approximation method, which makes attacking FedAvg scenarios feasible by generating the intermediate model updates of the clients' local training processes. Then, we design a layer-wise weighted loss function to improve the data quality of reconstruction. We assign different weights to model updates in different layers concerning the neural network structure, with the weights tuned by Bayesian optimization. Finally, experimental results validate the superiority of our proposed approximate and weighted attack (AWA) method over the other state-of-the-art methods, as demonstrated by the substantial improvement in different evaluation metrics for image data reconstructions.
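A rough sketch of the two ingredients follows: linear interpolation of a client's intermediate updates between the weights observed before and after local training, and a layer-wise weighted gradient-matching loss. The linear schedule and the example layer weights are illustrative assumptions; the paper tunes the layer weights with Bayesian optimization.

```python
# Sketch of interpolated intermediate updates and a layer-wise weighted matching loss.
import torch

def interpolate_updates(w_before, w_after, num_local_steps):
    # Approximate the per-step model states between the weights the server saw
    # before and after a client's local training, layer by layer (linear assumption).
    return [{name: w_before[name] + (s + 1) / num_local_steps * (w_after[name] - w_before[name])
             for name in w_before} for s in range(num_local_steps)]

def weighted_matching_loss(dummy_grads, target_grads, layer_weights):
    # Layer-wise weighted squared distance between the dummy data's gradients
    # and the (approximated) target updates.
    loss = 0.0
    for name, w in layer_weights.items():
        loss = loss + w * (dummy_grads[name] - target_grads[name]).pow(2).sum()
    return loss

# Toy usage with two "layers".
w0 = {"fc.weight": torch.zeros(2, 2), "fc.bias": torch.zeros(2)}
w1 = {"fc.weight": torch.ones(2, 2), "fc.bias": torch.ones(2)}
steps = interpolate_updates(w0, w1, num_local_steps=4)

target = {n: (w1[n] - w0[n]) for n in w0}
dummy = {n: torch.zeros_like(t) for n, t in target.items()}
layer_w = {"fc.weight": 1.0, "fc.bias": 0.5}  # assumed weights; BO-tuned in the paper
print(float(weighted_matching_loss(dummy, target, layer_w)))
```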

Ground Manipulator Primitive Tasks to Executable Actions using Large Language Models

  • paper_url: http://arxiv.org/abs/2308.06810
  • repo_url: None
  • paper_authors: Yue Cao, C. S. George Lee
  • for: This paper addresses the problem of translating high-level tasks into low-level actions in robot systems.
  • methods: Large language models (LLMs) are used to ground manipulator primitive tasks to low-level robot actions.
  • results: A program-like prompt based on the task frame formalism enables the LLMs to generate position/force set-points for hybrid control; evaluations over several state-of-the-art LLMs are provided.
    Abstract Layered architectures have been widely used in robot systems. The majority of them implement planning and execution functions in separate layers. However, there is still no straightforward way to translate high-level tasks in the planning layer into the low-level motor commands in the execution layer. In order to tackle this challenge, we propose a novel approach to ground manipulator primitive tasks to low-level robot actions using large language models (LLMs). We designed a program-like prompt based on the task frame formalism. In this way, we enable LLMs to generate position/force set-points for hybrid control. Evaluations over several state-of-the-art LLMs are provided.
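As an illustration of what a program-like, task-frame-based prompt might look like, here is a hypothetical template; the field names and the JSON set-point format are assumptions for illustration, not the authors' exact prompt.

```python
# Hypothetical prompt template in the spirit of task-frame-formalism prompting.
PROMPT_TEMPLATE = """
You control a robot manipulator with hybrid position/force control.
Task frame: origin={origin}, z-axis={z_axis}.
Primitive task: {primitive}

Return one set-point per control axis as JSON:
{{"x": {{"mode": "position|force", "value": <float>}},
  "y": {{"mode": "position|force", "value": <float>}},
  "z": {{"mode": "position|force", "value": <float>}}}}
"""

def build_prompt(primitive, origin=(0.0, 0.0, 0.0), z_axis=(0.0, 0.0, 1.0)):
    # Fill in the task frame and the primitive task before sending to the LLM.
    return PROMPT_TEMPLATE.format(primitive=primitive, origin=origin, z_axis=z_axis)

print(build_prompt("wipe the table surface while pressing down with 5 N"))
```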

Neural Networks for Programming Quantum Annealers

  • paper_url: http://arxiv.org/abs/2308.06807
  • repo_url: https://github.com/boschsamuel/nnforprogrammingquantumannealers
  • paper_authors: Samuel Bosch, Bobak Kiani, Rui Yang, Adrian Lupascu, Seth Lloyd
  • for: This paper investigates whether quantum machine learning can advance artificial intelligence, focusing on classification problems.
  • methods: A fully-fledged classical neural network is connected to a small quantum annealer; the network programs the annealer's controls, and the combined system is trained to perform classification.
  • results: In simulations, adding the quantum annealer does not change performance significantly and offers no clear advantage over a regular nonlinear classical neural network on classification problems.
    Abstract Quantum machine learning has the potential to enable advances in artificial intelligence, such as solving problems intractable on classical computers. Some fundamental ideas behind quantum machine learning are similar to kernel methods in classical machine learning. Both process information by mapping it into high-dimensional vector spaces without explicitly calculating their numerical values. We explore a setup for performing classification on labeled classical datasets, consisting of a classical neural network connected to a quantum annealer. The neural network programs the quantum annealer's controls and thereby maps the annealer's initial states into new states in the Hilbert space. The neural network's parameters are optimized to maximize the distance of states corresponding to inputs from different classes and minimize the distance between quantum states corresponding to the same class. Recent literature showed that at least some of the "learning" is due to the quantum annealer, by connecting a small linear network to a quantum annealer and using it to learn small and linearly inseparable datasets. In this study, we consider a similar but not quite the same case, where a classical fully-fledged neural network is connected with a small quantum annealer. In such a setting, the fully-fledged classical neural network already has built-in nonlinearity and learning power, and can already handle the classification problem alone; we want to see whether an additional quantum layer could boost its performance. We simulate this system to learn several common datasets, including those for image and sound recognition. We conclude that adding a small quantum annealer does not provide a significant benefit over just using a regular (nonlinear) classical neural network.
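The training objective can be sketched classically: a network maps inputs to control parameters, and its weights are optimized so that same-class states stay close while different-class states separate. In the sketch below, `toy_annealer` is a purely classical stand-in for the simulated annealer, included only to make the loss runnable; it is not the paper's quantum simulation.

```python
# Sketch of the distance-based objective with a classical stand-in for the annealer.
import torch
import torch.nn as nn

controller = nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, 8))

def toy_annealer(controls):
    # Stand-in for the annealer: normalize controls to a unit "state" vector.
    return controls / controls.norm(dim=-1, keepdim=True)

def class_distance_loss(states, labels):
    # Pull same-class states together, push different-class states apart.
    d = torch.cdist(states, states)                       # pairwise distances
    same = (labels[:, None] == labels[None, :]).float()
    return (same * d).mean() - ((1 - same) * d).mean()

x = torch.randn(64, 4)
y = torch.randint(0, 2, (64,))
opt = torch.optim.Adam(controller.parameters(), lr=1e-3)
for _ in range(10):
    loss = class_distance_loss(toy_annealer(controller(x)), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(float(loss))
```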

  • paper_url: http://arxiv.org/abs/2308.07346
  • repo_url: None
  • paper_authors: Joseph D. Ramsey, Bryan Andrews
  • for: This paper provides new Python and R interfaces to the Tetrad project for causal modeling, search, and estimation.
  • methods: The interfaces use JPype (Python-Java) and Reticulate (Python-R) to let Python and R interact with Tetrad, directly addressing the shortcomings of existing approaches.
  • results: With a few simple tools and working examples for both Python and R, interfacing Python and R with Tetrad through JPype and Reticulate is straightforward and intuitive.
    Abstract We give novel Python and R interfaces for the (Java) Tetrad project for causal modeling, search, and estimation. The Tetrad project is a mainstay in the literature, having been under consistent development for over 30 years. Some of its algorithms are now classics, like PC and FCI; others are recent developments. It is increasingly the case, however, that researchers need to access the underlying Java code from Python or R. Existing methods for doing this are inadequate. We provide new, up-to-date methods using the JPype Python-Java interface and the Reticulate Python-R interface, directly solving these issues. With the addition of some simple tools and the provision of working examples for both Python and R, using JPype and Reticulate to interface Python and R with Tetrad is straightforward and intuitive.
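For readers curious about the shape of the JPype route, here is a minimal sketch; the jar path and the example Tetrad class name are assumptions for illustration, and the paper's tooling wraps these details with simpler helpers.

```python
# Sketch of calling Tetrad's Java classes from Python via JPype.
import jpype

TETRAD_JAR = "tetrad-current.jar"  # hypothetical path to a Tetrad jar on disk

jpype.startJVM(classpath=[TETRAD_JAR])            # start the JVM once per process
GraphClass = jpype.JClass("edu.cmu.tetrad.graph.EdgeListGraph")  # assumed class name
graph = GraphClass()                              # construct a Java object
print(graph.getNumNodes())                        # call a Java method from Python
jpype.shutdownJVM()
```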

SAILOR: Structural Augmentation Based Tail Node Representation Learning

  • paper_url: http://arxiv.org/abs/2308.06801
  • repo_url: https://github.com/jie-re/sailor
  • paper_authors: Jie Liao, Jintang Li, Liang Chen, Bingzhe Wu, Yatao Bian, Zibin Zheng
  • for: To improve representation learning for tail nodes in Graph Neural Networks (GNNs) and strengthen the expressiveness of GNNs on real-world graphs.
  • methods: A general Structural Augmentation based taIL nOde Representation learning framework, named SAILOR, jointly learns to augment the graph structure and to extract more informative representations for tail nodes.
  • results: Extensive experiments on public benchmark datasets show that SAILOR significantly improves tail node representations and outperforms state-of-the-art baselines.
    Abstract Graph Neural Networks (GNNs) have achieved state-of-the-art performance in representation learning for graphs recently. However, the effectiveness of GNNs, which capitalize on the key operation of message propagation, highly depends on the quality of the topology structure. Most of the graphs in real-world scenarios follow a long-tailed distribution on their node degrees, that is, a vast majority of the nodes in the graph are tail nodes with only a few connected edges. GNNs produce inferior node representations for tail nodes since they lack structural information. In the pursuit of promoting the expressiveness of GNNs for tail nodes, we explore how the deficiency of structural information deteriorates the performance of tail nodes and propose a general Structural Augmentation based taIL nOde Representation learning framework, dubbed as SAILOR, which can jointly learn to augment the graph structure and extract more informative representations for tail nodes. Extensive experiments on public benchmark datasets demonstrate that SAILOR can significantly improve the tail node representations and outperform the state-of-the-art baselines.
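As a rough illustration of structural augmentation for tail nodes, the sketch below adds a few similarity-based pseudo-edges to low-degree nodes. SAILOR learns its augmentation jointly with the GNN; the cosine-similarity heuristic, the fixed degree threshold, and the edge budget here are assumptions made only for the example.

```python
# Sketch: add pseudo-edges for low-degree (tail) nodes using feature similarity.
import numpy as np

def augment_tail_nodes(adj, feats, degree_threshold=2, edges_per_tail=2):
    deg = adj.sum(axis=1)
    sim = feats @ feats.T
    sim /= np.outer(np.linalg.norm(feats, axis=1), np.linalg.norm(feats, axis=1)) + 1e-9
    np.fill_diagonal(sim, -np.inf)                  # never link a node to itself
    aug = adj.copy()
    for v in np.where(deg <= degree_threshold)[0]:  # tail nodes
        for u in np.argsort(-sim[v])[:edges_per_tail]:
            aug[v, u] = aug[u, v] = 1               # add symmetric pseudo-edges
    return aug

adj = np.eye(6)                                     # toy graph: isolated nodes
feats = np.random.default_rng(0).normal(size=(6, 8))
print(augment_tail_nodes(adj, feats).sum())
```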