cs.LG - 2023-07-30

Efficient Federated Learning via Local Adaptive Amended Optimizer with Linear Speedup

  • paper_url: http://arxiv.org/abs/2308.00522
  • repo_url: None
  • paper_authors: Yan Sun, Li Shen, Hao Sun, Liang Ding, Dacheng Tao
  • for: This paper proposes a momentum-based federated learning algorithm to address the rugged convergence and client drift problems in distributed learning.
  • methods: The paper introduces the Federated Local ADaptive Amended optimizer (FedLADA), which couples global gradient descent with a locally adaptive amended optimizer: it estimates the global average offset from the previous communication round and corrects the local offset through a momentum-like term, improving empirical training speed and mitigating heterogeneous over-fitting across clients (see the sketch below).
  • results: Experiments show that FedLADA greatly reduces the number of communication rounds and achieves higher accuracy than several baselines.
    Abstract Adaptive optimization has achieved notable success for distributed learning while extending adaptive optimizer to federated Learning (FL) suffers from severe inefficiency, including (i) rugged convergence due to inaccurate gradient estimation in global adaptive optimizer; (ii) client drifts exacerbated by local over-fitting with the local adaptive optimizer. In this work, we propose a novel momentum-based algorithm via utilizing the global gradient descent and locally adaptive amended optimizer to tackle these difficulties. Specifically, we incorporate a locally amended technique to the adaptive optimizer, named Federated Local ADaptive Amended optimizer (\textit{FedLADA}), which estimates the global average offset in the previous communication round and corrects the local offset through a momentum-like term to further improve the empirical training speed and mitigate the heterogeneous over-fitting. Theoretically, we establish the convergence rate of \textit{FedLADA} with a linear speedup property on the non-convex case under the partial participation settings. Moreover, we conduct extensive experiments on the real-world dataset to demonstrate the efficacy of our proposed \textit{FedLADA}, which could greatly reduce the communication rounds and achieves higher accuracy than several baselines.
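    A minimal sketch of the kind of locally amended adaptive step the abstract describes; this is illustrative only, and names such as `lamb` (the amendment weight) and the exact form of the correction are assumptions rather than the paper's algorithm.

```python
import torch

def fedlada_local_step(param, grad, exp_avg, exp_avg_sq, global_offset,
                       lr=1e-3, betas=(0.9, 0.999), eps=1e-8, lamb=0.5):
    """One hypothetical local step: an Adam-style update amended by the
    global average offset estimated in the previous communication round."""
    beta1, beta2 = betas
    exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)                # first moment
    exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)   # second moment
    local_dir = exp_avg / (exp_avg_sq.sqrt() + eps)                # adaptive direction
    # Amend the local adaptive direction with the (stale) global offset,
    # a momentum-like correction that pulls clients toward the global trend.
    amended = lamb * local_dir + (1 - lamb) * global_offset
    param.add_(amended, alpha=-lr)
    return param
```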

DRL4Route: A Deep Reinforcement Learning Framework for Pick-up and Delivery Route Prediction

  • paper_url: http://arxiv.org/abs/2307.16246
  • repo_url: https://github.com/maoxiaowei97/drl4route
  • paper_authors: Xiaowei Mao, Haomin Wen, Hengrui Zhang, Huaiyu Wan, Lixia Wu, Jianbin Zheng, Haoyuan Hu, Youfang Lin
  • for: Predicting a worker's future service route from the current task pool, to improve the quality and efficiency of pick-up and delivery services.
  • methods: A reinforcement-learning-based framework, DRL4Route, that combines the behavior-learning ability of deep learning models with reinforcement learning's ability to optimize non-differentiable objectives; DRL4Route-GAE follows an actor-critic architecture equipped with a Generalized Advantage Estimator (see the sketch below).
  • results: On a real-world dataset, DRL4Route-GAE improves Location Square Deviation (LSD) by 0.9%-2.7% and Accuracy@3 (ACC@3) by 2.4%-3.2% over existing methods.
    Abstract Pick-up and Delivery Route Prediction (PDRP), which aims to estimate the future service route of a worker given his current task pool, has received rising attention in recent years. Deep neural networks based on supervised learning have emerged as the dominant model for the task because of their powerful ability to capture workers' behavior patterns from massive historical data. Though promising, they fail to introduce the non-differentiable test criteria into the training process, leading to a mismatch in training and test criteria. Which considerably trims down their performance when applied in practical systems. To tackle the above issue, we present the first attempt to generalize Reinforcement Learning (RL) to the route prediction task, leading to a novel RL-based framework called DRL4Route. It combines the behavior-learning abilities of previous deep learning models with the non-differentiable objective optimization ability of reinforcement learning. DRL4Route can serve as a plug-and-play component to boost the existing deep learning models. Based on the framework, we further implement a model named DRL4Route-GAE for PDRP in logistic service. It follows the actor-critic architecture which is equipped with a Generalized Advantage Estimator that can balance the bias and variance of the policy gradient estimates, thus achieving a more optimal policy. Extensive offline experiments and the online deployment show that DRL4Route-GAE improves Location Square Deviation (LSD) by 0.9%-2.7%, and Accuracy@3 (ACC@3) by 2.4%-3.2% over existing methods on the real-world dataset.
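    The Generalized Advantage Estimator used by DRL4Route-GAE is a standard actor-critic component; the sketch below shows its usual computation (illustrative, not the paper's implementation, and it assumes no terminal state inside the window).

```python
import numpy as np

def generalized_advantage_estimates(rewards, values, gamma=0.99, lam=0.95):
    """Compute GAE advantages; `values` holds V(s_t) for t = 0..T, i.e. one
    extra bootstrap value beyond the T rewards."""
    T = len(rewards)
    advantages = np.zeros(T)
    gae = 0.0
    for t in reversed(range(T)):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD residual
        gae = delta + gamma * lam * gae                         # discounted, lambda-weighted sum
        advantages[t] = gae
    return advantages
```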

Synaptic Plasticity Models and Bio-Inspired Unsupervised Deep Learning: A Survey

  • paper_url: http://arxiv.org/abs/2307.16236
  • repo_url: None
  • paper_authors: Gabriele Lagani, Fabrizio Falchi, Claudio Gennaro, Giuseppe Amato
  • for: The paper surveys recently emerged deep-learning-based technologies, their applications in artificial intelligence, and the challenges they face.
  • methods: It reviews biologically grounded deep learning models, including synaptic plasticity models and their connections to spiking neural network (SNN) models (a classic plasticity rule is sketched below).
  • results: It summarizes the applications and effectiveness of these bio-inspired deep learning models in different scenarios and points out their potential for advancing the field.
    Abstract Recently emerged technologies based on Deep Learning (DL) achieved outstanding results on a variety of tasks in the field of Artificial Intelligence (AI). However, these encounter several challenges related to robustness to adversarial inputs, ecological impact, and the necessity of huge amounts of training data. In response, researchers are focusing more and more interest on biologically grounded mechanisms, which are appealing due to the impressive capabilities exhibited by biological brains. This survey explores a range of these biologically inspired models of synaptic plasticity, their application in DL scenarios, and the connections with models of plasticity in Spiking Neural Networks (SNNs). Overall, Bio-Inspired Deep Learning (BIDL) represents an exciting research direction, aiming at advancing not only our current technologies but also our understanding of intelligence.
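    As a concrete example of the synaptic plasticity rules such surveys cover, the sketch below implements Oja's rule, a classic Hebbian learning model; it is illustrative background, not code from the paper.

```python
import numpy as np

def oja_update(w, x, lr=0.01):
    """One step of Oja's rule: the weight grows with correlated pre/post
    activity (Hebbian term) while a decay term keeps its norm bounded."""
    y = np.dot(w, x)                  # post-synaptic activation
    return w + lr * y * (x - y * w)   # Hebbian growth minus normalizing decay
```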

Spiking Neural Networks and Bio-Inspired Supervised Deep Learning: A Survey

  • paper_url: http://arxiv.org/abs/2307.16235
  • repo_url: None
  • paper_authors: Gabriele Lagani, Fabrizio Falchi, Claudio Gennaro, Giuseppe Amato
  • for: This survey provides a comprehensive review of recent biologically-inspired approaches for Artificial Intelligence (AI) technologies, with a focus on Spiking Neural Network (SNN) models and bio-inspired training methods.
  • methods: The survey discusses SNN models and their challenges, as well as bio-inspired training methods that pose alternatives to traditional backprop-based optimization. These methods aim to advance the computational capabilities and biological plausibility of current models.
  • results: The survey provides a thorough presentation of recent biologically-inspired approaches for AI, including SNN models and bio-inspired training methods. These approaches aim to improve the computational capabilities and biological plausibility of current AI models.
    Abstract For a long time, biology and neuroscience fields have been a great source of inspiration for computer scientists, towards the development of Artificial Intelligence (AI) technologies. This survey aims at providing a comprehensive review of recent biologically-inspired approaches for AI. After introducing the main principles of computation and synaptic plasticity in biological neurons, we provide a thorough presentation of Spiking Neural Network (SNN) models, and we highlight the main challenges related to SNN training, where traditional backprop-based optimization is not directly applicable. Therefore, we discuss recent bio-inspired training methods, which pose themselves as alternatives to backprop, both for traditional and spiking networks. Bio-Inspired Deep Learning (BIDL) approaches towards advancing the computational capabilities and biological plausibility of current models.
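    A leaky integrate-and-fire (LIF) neuron is the basic building block of most SNN models discussed in this survey; the sketch below simulates one with illustrative values for the time constant and thresholds (not taken from the paper).

```python
import numpy as np

def lif_simulate(input_current, dt=1.0, tau=20.0, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Simulate a leaky integrate-and-fire neuron: the membrane potential
    leaks toward rest, integrates the input, and emits a spike (then resets)
    whenever it crosses the threshold."""
    v = v_rest
    spikes = []
    for current in input_current:
        v += dt / tau * (v_rest - v) + current   # leak toward rest + integrate input
        if v >= v_thresh:                        # threshold crossing -> spike
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
    return np.array(spikes)
```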

Robust Electric Vehicle Balancing of Autonomous Mobility-On-Demand System: A Multi-Agent Reinforcement Learning Approach

  • paper_url: http://arxiv.org/abs/2307.16228
  • repo_url: None
  • paper_authors: Sihong He, Shuo Han, Fei Miao
  • for: The paper targets electric autonomous vehicles (EAVs) in future autonomous mobility-on-demand (AMoD) systems, with the goal of designing an integrated vehicle balancing solution that can handle supply and demand uncertainties.
  • methods: Multi-agent reinforcement learning (MARL) with adversarial agents models both the EAV supply and mobility demand uncertainties, and a robust E-AMoD Balancing MARL (REBAMA) algorithm trains a robust EAV balancing policy that balances the supply-demand ratio and the charging utilization rate across the whole city.
  • results: Compared to a non-robust MARL method, the proposed approach improves the reward by 19.28%, charging utilization fairness by 28.18%, and supply-demand fairness by 3.97%. Compared to a robust optimization-based method, it improves the reward, charging utilization fairness, and supply-demand fairness by 8.21%, 8.29%, and 9.42%, respectively.
    Abstract Electric autonomous vehicles (EAVs) are getting attention in future autonomous mobility-on-demand (AMoD) systems due to their economic and societal benefits. However, EAVs' unique charging patterns (long charging time, high charging frequency, unpredictable charging behaviors, etc.) make it challenging to accurately predict the EAVs supply in E-AMoD systems. Furthermore, the mobility demand's prediction uncertainty makes it an urgent and challenging task to design an integrated vehicle balancing solution under supply and demand uncertainties. Despite the success of reinforcement learning-based E-AMoD balancing algorithms, state uncertainties under the EV supply or mobility demand remain unexplored. In this work, we design a multi-agent reinforcement learning (MARL)-based framework for EAVs balancing in E-AMoD systems, with adversarial agents to model both the EAVs supply and mobility demand uncertainties that may undermine the vehicle balancing solutions. We then propose a robust E-AMoD Balancing MARL (REBAMA) algorithm to train a robust EAVs balancing policy to balance both the supply-demand ratio and charging utilization rate across the whole city. Experiments show that our proposed robust method performs better compared with a non-robust MARL method that does not consider state uncertainties; it improves the reward, charging utilization fairness, and supply-demand fairness by 19.28%, 28.18%, and 3.97%, respectively. Compared with a robust optimization-based method, the proposed MARL algorithm can improve the reward, charging utilization fairness, and supply-demand fairness by 8.21%, 8.29%, and 9.42%, respectively.

Optimizing the Neural Network Training for OCR Error Correction of Historical Hebrew Texts

  • paper_url: http://arxiv.org/abs/2307.16220
  • repo_url: https://github.com/smartinternz02/SI-GuidedProject-2307-1622049182
  • paper_authors: Omri Suissa, Avshalom Elmalech, Maayan Zhitomirsky-Geffet
  • for: This paper aims to improve the accuracy of Optical Character Recognition (OCR) post-correction for historical documents by developing a method for automatically generating language- and task-specific training data.
  • methods: The proposed method corrects OCR errors in Hebrew newspapers with a light-weight neural network trained on significantly less manually created data, building on natural language analysis and machine learning techniques such as neural networks (see the sketch below).
  • results: The proposed method outperforms other state-of-the-art neural networks and complex spellcheckers for OCR post-correction, and the network's performance depends strongly on the genre and area of the training data.
    Abstract Over the past few decades, large archives of paper-based documents such as books and newspapers have been digitized using Optical Character Recognition. This technology is error-prone, especially for historical documents. To correct OCR errors, post-processing algorithms have been proposed based on natural language analysis and machine learning techniques such as neural networks. Neural network's disadvantage is the vast amount of manually labeled data required for training, which is often unavailable. This paper proposes an innovative method for training a light-weight neural network for Hebrew OCR post-correction using significantly less manually created data. The main research goal is to develop a method for automatically generating language and task-specific training data to improve the neural network results for OCR post-correction, and to investigate which type of dataset is the most effective for OCR post-correction of historical documents. To this end, a series of experiments using several datasets was conducted. The evaluation corpus was based on Hebrew newspapers from the JPress project. An analysis of historical OCRed newspapers was done to learn common language and corpus-specific OCR errors. We found that training the network using the proposed method is more effective than using randomly generated errors. The results also show that the performance of the neural network for OCR post-correction strongly depends on the genre and area of the training data. Moreover, neural networks that were trained with the proposed method outperform other state-of-the-art neural networks for OCR post-correction and complex spellcheckers. These results may have practical implications for many digital humanities projects.
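    The core of the automatic training-data generation is to corrupt clean text with realistic OCR-like errors, producing (noisy, clean) pairs for the post-correction network. The sketch below illustrates the idea; the confusion pairs and error rates are hypothetical, whereas the paper derives its error patterns from an analysis of historical OCRed newspapers.

```python
import random

# Hypothetical confusion pairs of visually similar Hebrew letters; a real
# system would mine them from aligned OCRed/corrected text.
CONFUSIONS = {"ו": "י", "ר": "ד", "ה": "ח"}

def inject_ocr_errors(text, error_rate=0.05, seed=0):
    """Produce a noisy, OCR-like version of clean text, yielding an
    artificial (noisy, clean) training pair for a post-correction model."""
    rng = random.Random(seed)
    noisy = []
    for ch in text:
        if ch in CONFUSIONS and rng.random() < error_rate:
            noisy.append(CONFUSIONS[ch])     # substitute a similar glyph
        elif rng.random() < error_rate / 10:
            continue                          # occasional character deletion
        else:
            noisy.append(ch)
    return "".join(noisy)
```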

Improving Probabilistic Bisimulation for MDPs Using Machine Learning

  • paper_url: http://arxiv.org/abs/2308.02519
  • repo_url: None
  • paper_authors: Mohammadsadegh Mohaghegh, Khayyam Salehi
  • for: The paper applies formal verification (model checking) to analyze complex critical systems, which is hindered by the state space explosion problem.
  • methods: Bisimulation minimization is used to reduce the number of states in a labeled transition system; for systems with stochastic behavior, probabilistic bisimulation minimizes the model into an equivalent form with fewer states (see the sketch below).
  • results: The paper proposes a new method that partitions the state space of a probabilistic model into its bisimulation classes using the model's PRISM program and machine learning classification; experiments show a significant reduction in running time compared with state-of-the-art tools.
    Abstract The utilization of model checking has been suggested as a formal verification technique for analyzing critical systems. However, the primary challenge in applying to complex systems is state space explosion problem. To address this issue, bisimulation minimization has emerged as a prominent method for reducing the number of states in a labeled transition system, aiming to overcome the difficulties associated with the state space explosion problem. In the case of systems exhibiting stochastic behaviors, probabilistic bisimulation is employed to minimize a given model, obtaining its equivalent form with fewer states. Recently, various techniques have been introduced to decrease the time complexity of the iterative methods used to compute probabilistic bisimulation for stochastic systems that display nondeterministic behaviors. In this paper, we propose a new technique to partition the state space of a given probabilistic model to its bisimulation classes. This technique uses the PRISM program of a given model and constructs some small versions of the model to train a classifier. It then applies machine learning classification techniques to approximate the related partition. The resulting partition is used as an initial one for the standard bisimulation technique in order to reduce the running time of the method. The experimental results show that the approach can decrease significantly the running time compared to state-of-the-art tools.
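    A rough sketch of the classifier step the abstract describes: learn the bisimulation classes of small versions of the model and predict an initial partition for the full state space, which then seeds standard partition refinement. The state encoding and classifier choice here are assumptions, not the paper's setup.

```python
from sklearn.ensemble import RandomForestClassifier

def approximate_partition(small_states, small_labels, full_states):
    """Train on the exact bisimulation classes of small model instances
    (features could be state-variable valuations) and predict class labels
    for the states of the full model as an initial partition."""
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(small_states, small_labels)
    return clf.predict(full_states)
```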

Text Analysis Using Deep Neural Networks in Digital Humanities and Information Science

  • paper_url: http://arxiv.org/abs/2307.16217
  • repo_url: None
  • paper_authors: Omri Suissa, Avshalom Elmalech, Maayan Zhitomirsky-Geffet
  • for: The study examines how deep neural networks (DNNs) can be used to automatically analyze textual resources in digital humanities (DH) research, providing the field with more reliable data analysis methods.
  • methods: It reviews DNN models for NLP tasks relevant to DH, including spell checking, language detection, entity extraction, author detection, and question answering; these supervised models learn patterns from large numbers of "right" and "wrong" examples and apply them to new examples.
  • results: By analyzing multiple DH use cases from the recent literature, the study identifies two main challenges of applying DNNs in DH research, the (un)availability of training data and the need for domain adaptation, and lays out a practical decision model to help DH experts choose appropriate deep learning approaches.
    Abstract Combining computational technologies and humanities is an ongoing effort aimed at making resources such as texts, images, audio, video, and other artifacts digitally available, searchable, and analyzable. In recent years, deep neural networks (DNN) dominate the field of automatic text analysis and natural language processing (NLP), in some cases presenting a super-human performance. DNNs are the state-of-the-art machine learning algorithms solving many NLP tasks that are relevant for Digital Humanities (DH) research, such as spell checking, language detection, entity extraction, author detection, question answering, and other tasks. These supervised algorithms learn patterns from a large number of "right" and "wrong" examples and apply them to new examples. However, using DNNs for analyzing the text resources in DH research presents two main challenges: (un)availability of training data and a need for domain adaptation. This paper explores these challenges by analyzing multiple use-cases of DH studies in recent literature and their possible solutions and lays out a practical decision model for DH experts for when and how to choose the appropriate deep learning approaches for their research. Moreover, in this paper, we aim to raise awareness of the benefits of utilizing deep learning models in the DH community.

Question Answering with Deep Neural Networks for Semi-Structured Heterogeneous Genealogical Knowledge Graphs

  • paper_url: http://arxiv.org/abs/2307.16214
  • repo_url: https://github.com/omrivm/uncle-bert
  • paper_authors: Omri Suissa, Maayan Zhitomirsky-Geffet, Avshalom Elmalech
  • for: The study develops a question-answering system over genealogical family trees, providing more accurate question answering for genealogical research.
  • methods: Genealogical data are represented as knowledge graphs, converted to text, combined with unstructured texts, and used to train a transformer-based question-answering model (Uncle-BERT) on an automatically generated genealogical dataset.
  • results: Compared with open-domain question-answering models, the dedicated genealogical model reduces complexity while increasing accuracy; the approach may have practical implications for genealogical research and real-world projects, making genealogical data accessible to experts as well as the general public.
    Abstract With the rising popularity of user-generated genealogical family trees, new genealogical information systems have been developed. State-of-the-art natural question answering algorithms use deep neural network (DNN) architecture based on self-attention networks. However, some of these models use sequence-based inputs and are not suitable to work with graph-based structure, while graph-based DNN models rely on high levels of comprehensiveness of knowledge graphs that is nonexistent in the genealogical domain. Moreover, these supervised DNN models require training datasets that are absent in the genealogical domain. This study proposes an end-to-end approach for question answering using genealogical family trees by: 1) representing genealogical data as knowledge graphs, 2) converting them to texts, 3) combining them with unstructured texts, and 4) training a trans-former-based question answering model. To evaluate the need for a dedicated approach, a comparison between the fine-tuned model (Uncle-BERT) trained on the auto-generated genealogical dataset and state-of-the-art question-answering models was per-formed. The findings indicate that there are significant differences between answering genealogical questions and open-domain questions. Moreover, the proposed methodology reduces complexity while increasing accuracy and may have practical implications for genealogical research and real-world projects, making genealogical data accessible to experts as well as the general public.

Toward a Period-Specific Optimized Neural Network for OCR Error Correction of Historical Hebrew Texts

  • paper_url: http://arxiv.org/abs/2307.16213
  • repo_url: None
  • paper_authors: Omri Suissa, Maayan Zhitomirsky-Geffet, Avshalom Elmalech
  • for: To improve the accuracy of OCR error correction for historical Hebrew texts, the paper proposes a multi-phase approach.
  • methods: A neural network performs OCR post-correction, with artificially generated training datasets and hyperparameter optimization used to improve model performance.
  • results: Experiments show that the method improves OCR correction accuracy for Hebrew documents and can accommodate changes in language across genres and periods.
    Abstract Over the past few decades, large archives of paper-based historical documents, such as books and newspapers, have been digitized using the Optical Character Recognition (OCR) technology. Unfortunately, this broadly used technology is error-prone, especially when an OCRed document was written hundreds of years ago. Neural networks have shown great success in solving various text processing tasks, including OCR post-correction. The main disadvantage of using neural networks for historical corpora is the lack of sufficiently large training datasets they require to learn from, especially for morphologically-rich languages like Hebrew. Moreover, it is not clear what are the optimal structure and values of hyperparameters (predefined parameters) of neural networks for OCR error correction in Hebrew due to its unique features. Furthermore, languages change across genres and periods. These changes may affect the accuracy of OCR post-correction neural network models. To overcome these challenges, we developed a new multi-phase method for generating artificial training datasets with OCR errors and hyperparameters optimization for building an effective neural network for OCR post-correction in Hebrew.

Robust Multi-Agent Reinforcement Learning with State Uncertainty

  • paper_url: http://arxiv.org/abs/2307.16212
  • repo_url: https://github.com/sihongho/robust_marl_with_state_uncertainty
  • paper_authors: Sihong He, Songyang Han, Sanbao Su, Shuo Han, Shaofeng Zou, Fei Miao
  • for: The work addresses state uncertainty in multi-agent reinforcement learning (MARL), improving the robustness and reliability of agents' policies.
  • methods: The problem is modeled as a Markov Game with state perturbation adversaries (MG-SPA), with robust equilibrium (RE) as the solution concept; a robust multi-agent Q-learning (RMAQ) algorithm finds such an equilibrium with convergence guarantees, and a robust multi-agent actor-critic (RMAAC) algorithm handles high-dimensional state-action spaces (see the sketch below).
  • results: Experiments show that RMAQ converges to the optimal value function, while RMAAC outperforms several MARL and robust MARL methods in multiple multi-agent environments when state uncertainty is present.
    Abstract In real-world multi-agent reinforcement learning (MARL) applications, agents may not have perfect state information (e.g., due to inaccurate measurement or malicious attacks), which challenges the robustness of agents' policies. Though robustness is getting important in MARL deployment, little prior work has studied state uncertainties in MARL, neither in problem formulation nor algorithm design. Motivated by this robustness issue and the lack of corresponding studies, we study the problem of MARL with state uncertainty in this work. We provide the first attempt to the theoretical and empirical analysis of this challenging problem. We first model the problem as a Markov Game with state perturbation adversaries (MG-SPA) by introducing a set of state perturbation adversaries into a Markov Game. We then introduce robust equilibrium (RE) as the solution concept of an MG-SPA. We conduct a fundamental analysis regarding MG-SPA such as giving conditions under which such a robust equilibrium exists. Then we propose a robust multi-agent Q-learning (RMAQ) algorithm to find such an equilibrium, with convergence guarantees. To handle high-dimensional state-action space, we design a robust multi-agent actor-critic (RMAAC) algorithm based on an analytical expression of the policy gradient derived in the paper. Our experiments show that the proposed RMAQ algorithm converges to the optimal value function; our RMAAC algorithm outperforms several MARL and robust MARL methods in multiple multi-agent environments when state uncertainty is present. The source code is public on \url{https://github.com/sihongho/robust_marl_with_state_uncertainty}.
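    To convey the idea of robustness to state perturbation, the sketch below shows a simplified single-agent Q-learning update whose bootstrap target is evaluated under the worst-case perturbed observation of the next state. This is a hypothetical simplification; the paper's algorithms operate in the multi-agent MG-SPA setting.

```python
import numpy as np

def robust_q_update(Q, s, a, r, s_next, perturb, alpha=0.1, gamma=0.99):
    """Q is a dict mapping states to arrays of action values; `perturb(s)`
    returns candidate perturbed observations of state s. The target uses the
    worst case over perturbations so the policy stays robust to state noise."""
    worst_value = min(np.max(Q[s_pert]) for s_pert in perturb(s_next))
    target = r + gamma * worst_value
    Q[s][a] += alpha * (target - Q[s][a])
    return Q
```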

Rethinking Uncertainly Missing and Ambiguous Visual Modality in Multi-Modal Entity Alignment

  • paper_url: http://arxiv.org/abs/2307.16210
  • repo_url: https://github.com/zjukg/UMAEA
  • paper_authors: Zhuo Chen, Lingbing Guo, Yin Fang, Yichi Zhang, Jiaoyan Chen, Jeff Z. Pan, Yangning Li, Huajun Chen, Wen Zhang
  • for: The paper proposes a robust multi-modal entity alignment method to cope with incomplete visual modalities across knowledge graphs (KGs).
  • methods: The latest MMEA models are benchmarked on the proposed MMEA-UMVM dataset, which covers bilingual and monolingual alignment KGs and uses both standard (non-iterative) and iterative training paradigms to evaluate model performance.
  • results: In the face of modality incompleteness, models tend to overfit the modality noise and exhibit performance oscillations or declines at high rates of missing modality, showing that additional multi-modal data can sometimes hurt entity alignment. The proposed UMAEA handles uncertainly missing and ambiguous visual modalities, achieving state-of-the-art performance across all 97 benchmark splits and surpassing existing baselines with limited parameters and time consumption.
    Abstract As a crucial extension of entity alignment (EA), multi-modal entity alignment (MMEA) aims to identify identical entities across disparate knowledge graphs (KGs) by exploiting associated visual information. However, existing MMEA approaches primarily concentrate on the fusion paradigm of multi-modal entity features, while neglecting the challenges presented by the pervasive phenomenon of missing and intrinsic ambiguity of visual images. In this paper, we present a further analysis of visual modality incompleteness, benchmarking latest MMEA models on our proposed dataset MMEA-UMVM, where the types of alignment KGs covering bilingual and monolingual, with standard (non-iterative) and iterative training paradigms to evaluate the model performance. Our research indicates that, in the face of modality incompleteness, models succumb to overfitting the modality noise, and exhibit performance oscillations or declines at high rates of missing modality. This proves that the inclusion of additional multi-modal data can sometimes adversely affect EA. To address these challenges, we introduce UMAEA , a robust multi-modal entity alignment approach designed to tackle uncertainly missing and ambiguous visual modalities. It consistently achieves SOTA performance across all 97 benchmark splits, significantly surpassing existing baselines with limited parameters and time consumption, while effectively alleviating the identified limitations of other models. Our code and benchmark data are available at https://github.com/zjukg/UMAEA.

Around the GLOBE: Numerical Aggregation Question-Answering on Heterogeneous Genealogical Knowledge Graphs with Deep Neural Networks

  • paper_url: http://arxiv.org/abs/2307.16208
  • repo_url: None
  • paper_authors: Omri Suissa, Maayan Zhitomirsky-Geffet, Avshalom Elmalech
  • for: The study aims to make research over large genealogical corpora easier and more scalable by enabling natural-language numerical aggregation question answering.
  • methods: It combines an automatic method for training dataset generation, a transformer-based table selection method, and an optimized transformer-based numerical aggregation question-answering model.
  • results: The proposed architecture, GLOBE, achieves 87% accuracy on the numerical aggregation question-answering task, compared with only 21% for current state-of-the-art models and pipelines.
    Abstract One of the key AI tools for textual corpora exploration is natural language question-answering (QA). Unlike keyword-based search engines, QA algorithms receive and process natural language questions and produce precise answers to these questions, rather than long lists of documents that need to be manually scanned by the users. State-of-the-art QA algorithms based on DNNs were successfully employed in various domains. However, QA in the genealogical domain is still underexplored, while researchers in this field (and other fields in humanities and social sciences) can highly benefit from the ability to ask questions in natural language, receive concrete answers and gain insights hidden within large corpora. While some research has been recently conducted for factual QA in the genealogical domain, to the best of our knowledge, there is no previous research on the more challenging task of numerical aggregation QA (i.e., answering questions combining aggregation functions, e.g., count, average, max). Numerical aggregation QA is critical for distant reading and analysis for researchers (and the general public) interested in investigating cultural heritage domains. Therefore, in this study, we present a new end-to-end methodology for numerical aggregation QA for genealogical trees that includes: 1) an automatic method for training dataset generation; 2) a transformer-based table selection method, and 3) an optimized transformer-based numerical aggregation QA model. The findings indicate that the proposed architecture, GLOBE, outperforms the state-of-the-art models and pipelines by achieving 87% accuracy for this task compared to only 21% by current state-of-the-art models. This study may have practical implications for genealogical information centers and museums, making genealogical data research easy and scalable for experts as well as the general public.

Deep Convolutional Neural Networks with Zero-Padding: Feature Extraction and Learning

  • paper_url: http://arxiv.org/abs/2307.16203
  • repo_url: https://github.com/liubc17/eDCNN_zero_padding
  • paper_authors: Zhi Han, Baichen Liu, Shao-Bo Lin, Ding-Xuan Zhou
  • for: The paper studies the role of zero-padding in feature extraction and learning with deep convolutional neural networks (DCNNs).
  • methods: It first verifies the role of zero-padding in enabling translation-equivalence and the translation-invariance-driven nature of pooling, and then shows that any deep fully connected network (DFCN) can be represented by a DCNN with zero-padding using a similar number of free parameters, indicating the superior feature extraction ability of DCNNs.
  • results: The paper derives the universal consistency of DCNNs with zero-padding and shows their translation-invariance in the learning process; numerical experiments, including both toy simulations and real-data runs, verify the theoretical results.
    Abstract This paper studies the performance of deep convolutional neural networks (DCNNs) with zero-padding in feature extraction and learning. After verifying the roles of zero-padding in enabling translation-equivalence, and pooling in its translation-invariance driven nature, we show that with similar number of free parameters, any deep fully connected networks (DFCNs) can be represented by DCNNs with zero-padding. This demonstrates that DCNNs with zero-padding is essentially better than DFCNs in feature extraction. Consequently, we derive universal consistency of DCNNs with zero-padding and show its translation-invariance in the learning process. All our theoretical results are verified by numerical experiments including both toy simulations and real-data running.

Shuffled Differentially Private Federated Learning for Time Series Data Analytics

  • paper_url: http://arxiv.org/abs/2307.16196
  • repo_url: None
  • paper_authors: Chenxi Huang, Chaoyang Jiang, Zhenghua Chen
  • for: Trustworthy federated learning on time series data, achieving strong performance while protecting clients' privacy.
  • methods: Local differential privacy extends the privacy-protection trust boundary to the clients, and shuffling techniques are incorporated to achieve privacy amplification and mitigate the accuracy decline caused by local noising (see the sketch below).
  • results: Extensive experiments on five time series datasets show minimal accuracy loss compared with non-private federated learning in both small- and large-client scenarios, and better accuracy than centralized differentially private federated learning at the same privacy level.
    Abstract Trustworthy federated learning aims to achieve optimal performance while ensuring clients' privacy. Existing privacy-preserving federated learning approaches are mostly tailored for image data, lacking applications for time series data, which have many important applications, like machine health monitoring, human activity recognition, etc. Furthermore, protective noising on a time series data analytics model can significantly interfere with temporal-dependent learning, leading to a greater decline in accuracy. To address these issues, we develop a privacy-preserving federated learning algorithm for time series data. Specifically, we employ local differential privacy to extend the privacy protection trust boundary to the clients. We also incorporate shuffle techniques to achieve a privacy amplification, mitigating the accuracy decline caused by leveraging local differential privacy. Extensive experiments were conducted on five time series datasets. The evaluation results reveal that our algorithm experienced minimal accuracy loss compared to non-private federated learning in both small and large client scenarios. Under the same level of privacy protection, our algorithm demonstrated improved accuracy compared to the centralized differentially private federated learning in both scenarios.
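    A minimal sketch of the local-DP-plus-shuffling pattern the abstract describes, assuming Gaussian perturbation of clipped client updates; the mechanism and parameter values here are illustrative, not the paper's exact scheme.

```python
import numpy as np

def local_dp_then_shuffle(client_updates, clip_norm=1.0, noise_std=0.5, seed=0):
    """Each client clips and noises its update locally (local DP); a shuffler
    then permutes the anonymized reports before they reach the server, which
    amplifies the privacy guarantee."""
    rng = np.random.default_rng(seed)
    noised = []
    for u in client_updates:
        u = np.asarray(u, dtype=float)
        u = u * min(1.0, clip_norm / (np.linalg.norm(u) + 1e-12))    # clip
        noised.append(u + rng.normal(0.0, noise_std, size=u.shape))  # local noise
    order = rng.permutation(len(noised))                             # shuffle
    return [noised[i] for i in order]
```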

An Efficient Approach to Mitigate Numerical Instability in Backpropagation for 16-bit Neural Network Training

  • paper_url: http://arxiv.org/abs/2307.16189
  • repo_url: None
  • paper_authors: Juyoung Yun
  • for: The study investigates the numerical instability that arises in 16-bit computations of machine learning models, particularly with the widely used RMSProp and Adam optimizers.
  • methods: The single hyperparameter epsilon is identified as the main culprit behind the instability, and a novel mitigation method building on the Adam optimizer's updates is proposed (see the sketch below).
  • results: A minor adjustment of epsilon restores the functionality of RMSProp and Adam in 16-bit computations and improves the robustness of the deep neural network training process.
    Abstract In this research, we delve into the intricacies of the numerical instability observed in 16-bit computations of machine learning models, particularly when employing popular optimization algorithms such as RMSProp and Adam. This instability is commonly experienced during the training phase of deep neural networks, leading to disrupted learning processes and hindering the effective deployment of such models. We identify the single hyperparameter, epsilon, as the main culprit behind this numerical instability. An in-depth exploration of the role of epsilon in these optimizers within 16-bit computations reveals that a minor adjustment of its value can restore the functionality of RMSProp and Adam, consequently enabling the effective utilization of 16-bit neural networks. We propose a novel method to mitigate the identified numerical instability issues. This method capitalizes on the updates from the Adam optimizer and significantly improves the robustness of the learning process in 16-bit computations. This study contributes to better understanding of optimization in low-precision computations and provides an effective solution to a longstanding issue in training deep neural networks, opening new avenues for more efficient and stable model training.
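    The role of epsilon is easy to see in 16-bit arithmetic: the squared-gradient term underflows in float16, so a too-small epsilon leaves a (near-)zero denominator and the update explodes. The sketch below is illustrative, not the paper's code.

```python
import numpy as np

np.seterr(divide="ignore")            # silence the expected divide-by-zero warning
grad = np.float16(1e-4)
v = grad * grad                       # 1e-8 underflows to 0.0 in float16
for eps in (np.float16(1e-8), np.float16(1e-4)):
    step = grad / (np.sqrt(v) + eps)  # effective Adam/RMSProp denominator
    print(f"stored eps = {float(eps)}, step = {float(step)}")
# np.float16(1e-8) itself underflows to 0, so the step is inf;
# a slightly larger eps (1e-4) keeps the step bounded near 1.0.
```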

ESP: Exploiting Symmetry Prior for Multi-Agent Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2307.16186
  • repo_url: None
  • paper_authors: Xin Yu, Rongye Shi, Pu Feng, Yongkai Tian, Jie Luo, Wenjun Wu
  • for: The paper aims to improve the data efficiency of multi-agent reinforcement learning (MARL).
  • methods: Inspired by the symmetry phenomenon in multi-agent systems, it proposes a model-agnostic framework that integrates data augmentation and a well-designed consistency loss into existing MARL methods.
  • results: Experiments on multiple challenging tasks demonstrate the effectiveness of the framework, which is also applied to a physical multi-robot testbed to show its superiority.
    Abstract Multi-agent reinforcement learning (MARL) has achieved promising results in recent years. However, most existing reinforcement learning methods require a large amount of data for model training. In addition, data-efficient reinforcement learning requires the construction of strong inductive biases, which are ignored in the current MARL approaches. Inspired by the symmetry phenomenon in multi-agent systems, this paper proposes a framework for exploiting prior knowledge by integrating data augmentation and a well-designed consistency loss into the existing MARL methods. In addition, the proposed framework is model-agnostic and can be applied to most of the current MARL algorithms. Experimental tests on multiple challenging tasks demonstrate the effectiveness of the proposed framework. Moreover, the proposed framework is applied to a physical multi-robot testbed to show its superiority.

Unified Model for Image, Video, Audio and Language Tasks

  • paper_url: http://arxiv.org/abs/2307.16184
  • repo_url: https://github.com/mshukor/unival
  • paper_authors: Mustafa Shukor, Corentin Dancette, Alexandre Rame, Matthieu Cord
  • for: The paper aims to build a unified model that supports all modalities (text, images, video, and audio) efficiently, without relying on very large datasets or billion-parameter models.
  • methods: The proposed model, UnIVAL (~0.25B parameters), is pretrained on many tasks using task balancing and multimodal curriculum learning; weight interpolation of models trained on different multimodal tasks is used to improve generalization to out-of-distribution inputs (see the sketch below).
  • results: UnIVAL shows competitive performance on image-text and video-text tasks and, despite not being pretrained on audio, achieves competitive performance on audio-text tasks; the unified model demonstrates the synergy between tasks and improved out-of-distribution generalization.
    Abstract Large Language Models (LLMs) have made the ambitious quest for generalist agents significantly far from being a fantasy. A key hurdle for building such general models is the diversity and heterogeneity of tasks and modalities. A promising solution is unification, allowing the support of a myriad of tasks and modalities within one unified framework. While few large models (e.g., Flamingo (Alayrac et al., 2022), trained on massive datasets, can support more than two modalities, current small to mid-scale unified models are still limited to 2 modalities, usually image-text or video-text. The question that we ask is: is it possible to build efficiently a unified model that can support all modalities? To answer this, we propose UnIVAL, a step further towards this ambitious goal. Without relying on fancy datasets sizes or models with billions of parameters, the ~ 0.25B parameter UnIVAL model goes beyond two modalities and unifies text, images, video, and audio into a single model. Our model is efficiently pretrained on many tasks, based on task balancing and multimodal curriculum learning. UnIVAL shows competitive performance to existing state-of-the-art approaches, across image and video-text tasks. The feature representations learned from image and video-text modalities, allows the model to achieve competitive performance when finetuned on audio-text tasks, despite not being pretrained on audio. Thanks to the unified model, we propose a novel study on multimodal model merging via weight interpolation of models trained on different multimodal tasks, showing their benefits in particular for out-of-distribution generalization. Finally, we motivate unification by showing the synergy between tasks. The model weights and code are released here: https://github.com/mshukor/UnIVAL.
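    Multimodal model merging by weight interpolation, as studied in the paper, amounts to averaging the parameters of models fine-tuned on different tasks that share one architecture. A minimal sketch follows; it assumes all state-dict entries are floating-point tensors, and the model names in the usage comment are hypothetical.

```python
import torch

def interpolate_weights(state_dict_a, state_dict_b, alpha=0.5):
    """Linear interpolation of two compatible state dicts (same keys/shapes)."""
    return {k: alpha * state_dict_a[k] + (1 - alpha) * state_dict_b[k]
            for k in state_dict_a}

# Hypothetical usage with two fine-tuned checkpoints of the same backbone:
# merged = interpolate_weights(vqa_model.state_dict(), caption_model.state_dict(), 0.5)
# model.load_state_dict(merged)
```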

Redundancy-aware unsupervised rankings for collections of gene sets

  • paper_url: http://arxiv.org/abs/2307.16182
  • repo_url: None
  • paper_authors: Chiara Balestra, Carlo Maj, Emmanuel Müller, Andreas Mayr
  • for: Improving the readability and interpretability of collections of gene sets (biological pathways).
  • methods: Shapley-value-based importance scores rank the pathways from a set-covering perspective, with a trick that avoids the usual exponential cost of computing Shapley values; redundancy between strongly intersecting sets is incorporated into the rankings.
  • results: The rankings can reduce the dimension of gene set collections while keeping redundancy low and coverage of the genes high.
    Abstract The biological roles of gene sets are used to group them into collections. These collections are often characterized by being high-dimensional, overlapping, and redundant families of sets, thus precluding a straightforward interpretation and study of their content. Bioinformatics looked for solutions to reduce their dimension or increase their intepretability. One possibility lies in aggregating overlapping gene sets to create larger pathways, but the modified biological pathways are hardly biologically justifiable. We propose to use importance scores to rank the pathways in the collections studying the context from a set covering perspective. The proposed Shapley values-based scores consider the distribution of the singletons and the size of the sets in the families; Furthermore, a trick allows us to circumvent the usual exponential complexity of Shapley values' computation. Finally, we address the challenge of including a redundancy awareness in the obtained rankings where, in our case, sets are redundant if they show prominent intersections. The rankings can be used to reduce the dimension of collections of gene sets, such that they show lower redundancy and still a high coverage of the genes. We further investigate the impact of our selection on Gene Sets Enrichment Analysis. The proposed method shows a practical utility in bioinformatics to increase the interpretability of the collections of gene sets and a step forward to include redundancy into Shapley values computations.

Adaptive learning of density ratios in RKHS

  • paper_url: http://arxiv.org/abs/2307.16164
  • repo_url: None
  • paper_authors: Werner Zellinger, Stefan Kindermann, Sergei V. Pereverzyev
  • for: Estimating the ratio of two probability densities from finitely many observations of the densities.
  • methods: Density ratios are estimated by minimizing a regularized Bregman divergence between the true density ratio and a model in a reproducing kernel Hilbert space (RKHS).
  • results: New finite-sample error bounds are derived, together with a Lepskii-type parameter choice principle that minimizes the bounds without knowledge of the regularity of the density ratio; in the special case of quadratic loss, the method adaptively achieves a minimax optimal error rate.
    Abstract Estimating the ratio of two probability densities from finitely many observations of the densities is a central problem in machine learning and statistics with applications in two-sample testing, divergence estimation, generative modeling, covariate shift adaptation, conditional density estimation, and novelty detection. In this work, we analyze a large class of density ratio estimation methods that minimize a regularized Bregman divergence between the true density ratio and a model in a reproducing kernel Hilbert space (RKHS). We derive new finite-sample error bounds, and we propose a Lepskii type parameter choice principle that minimizes the bounds without knowledge of the regularity of the density ratio. In the special case of quadratic loss, our method adaptively achieves a minimax optimal error rate. A numerical illustration is provided.

Variance Control for Distributional Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2307.16152
  • repo_url: https://github.com/kuangqi927/qem
  • paper_authors: Qi Kuang, Zhoufan Zhu, Liwen Zhang, Fan Zhou
  • for: The paper examines the validity of the Q-function estimator in distributional reinforcement learning (DRL).
  • methods: Error analysis shows how the approximation errors of the Q-function affect the training process and how to reduce both the bias and the variance of the error terms; a new estimator, the Quantiled Expansion Mean (QEM), and a new DRL algorithm (QEMRL) are introduced from a statistical perspective.
  • results: On a variety of Atari and MuJoCo benchmark tasks, QEMRL achieves significant improvements over baseline algorithms in sample efficiency and convergence performance.
    Abstract Although distributional reinforcement learning (DRL) has been widely examined in the past few years, very few studies investigate the validity of the obtained Q-function estimator in the distributional setting. To fully understand how the approximation errors of the Q-function affect the whole training process, we do some error analysis and theoretically show how to reduce both the bias and the variance of the error terms. With this new understanding, we construct a new estimator \emph{Quantiled Expansion Mean} (QEM) and introduce a new DRL algorithm (QEMRL) from the statistical perspective. We extensively evaluate our QEMRL algorithm on a variety of Atari and Mujoco benchmark tasks and demonstrate that QEMRL achieves significant improvement over baseline algorithms in terms of sample efficiency and convergence performance.

An Effective LSTM-DDPM Scheme for Energy Theft Detection and Forecasting in Smart Grid

  • paper_url: http://arxiv.org/abs/2307.16149
  • repo_url: None
  • paper_authors: Xun Yuan, Yang Yang, Arwa Alromih, Prosanta Gope, Biplab Sikdar
  • for: The work addresses two interconnected challenges in smart grid systems: energy theft detection (ETD) and energy consumption forecasting (ECF).
  • methods: A scheme combining long short-term memory (LSTM) networks and a denoising diffusion probabilistic model (DDPM) generates input reconstructions and forecasts; the reconstruction and forecasting errors are then used jointly to identify energy theft (see the sketch below).
  • results: Extensive experiments on real-world and synthetic datasets show that the scheme outperforms baseline methods on both ETD and ECF, with the ensemble method significantly improving ETD performance by detecting attacks that baselines miss.
    Abstract Energy theft detection (ETD) and energy consumption forecasting (ECF) are two interconnected challenges in smart grid systems. Addressing these issues collectively is crucial for ensuring system security. This paper addresses the interconnected challenges of ETD and ECF in smart grid systems. The proposed solution combines long short-term memory (LSTM) and a denoising diffusion probabilistic model (DDPM) to generate input reconstruction and forecasting. By leveraging the reconstruction and forecasting errors, the system identifies instances of energy theft, with the methods based on reconstruction error and forecasting error complementing each other in detecting different types of attacks. Through extensive experiments on real-world and synthetic datasets, the proposed scheme outperforms baseline methods in ETD and ECF problems. The ensemble method significantly enhances ETD performance, accurately detecting energy theft attacks that baseline methods fail to detect. The research offers a comprehensive and effective solution for addressing ETD and ECF challenges, demonstrating promising results and improved security in smart grid systems.
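    The detection logic combines two complementary signals, reconstruction error and forecasting error; the sketch below illustrates the ensemble idea. The thresholding rule and the stand-in `reconstruct`/`forecast` callables are assumptions, not the paper's LSTM-DDPM implementation.

```python
import numpy as np

def flag_energy_theft(series, reconstruct, forecast, horizon=24, k=3.0):
    """Flag a customer's load series when either its reconstruction error or
    its forecasting error is anomalously large; `reconstruct(series)` returns
    a same-length reconstruction and `forecast(history, horizon)` returns the
    next `horizon` values. The threshold is a crude placeholder that would
    normally be calibrated on benign data."""
    series = np.asarray(series, dtype=float)
    rec_err = np.mean(np.abs(series - reconstruct(series)))
    fc_err = np.mean(np.abs(series[-horizon:] - forecast(series[:-horizon], horizon)))
    threshold = k * np.std(np.diff(series))
    return (rec_err > threshold) or (fc_err > threshold)
```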

Pupil Learning Mechanism

  • paper_url: http://arxiv.org/abs/2307.16141
  • repo_url: None
  • paper_authors: Rua-Huan Tsaih, Yu-Hang Chien, Shih-Yi Chien
  • for: 这个研究旨在解决人工神经网络中的梯度消失和过拟合问题。
  • methods: 本研究遵循具有解释、挑选、理解、强记和组织等特点的学生学习过程,推导出学生学习机制(PLM),并将其用于调整2层神经网络(2LNN)的结构和权重。
  • results: 实验结果表明,与线性回归模型和传统的基于反向传播的2LNN模型相比,PLM模型具有更高的准确率和更低的误差。
    Abstract Studies on artificial neural networks rarely address both vanishing gradients and overfitting issues. In this study, we follow the pupil learning procedure, which has the features of interpreting, picking, understanding, cramming, and organizing, to derive the pupil learning mechanism (PLM) by which to modify the network structure and weights of 2-layer neural networks (2LNNs). The PLM consists of modules for sequential learning, adaptive learning, perfect learning, and less-overfitted learning. Based upon a copper price forecasting dataset, we conduct an experiment to validate the PLM module design modules, and an experiment to evaluate the performance of PLM. The empirical results indeed approve the PLM module design and show the superiority of the proposed PLM model over the linear regression model and the conventional backpropagation-based 2LNN model.
    摘要 人工神经网络的研究很少同时关注梯度消失和过拟合这两个问题。在本研究中,我们遵循具有解释、挑选、理解、强记和组织等特点的学生学习过程,推导出学生学习机制(PLM),用于调整2层神经网络(2LNN)的结构和权重。PLM包含顺序学习、自适应学习、完美学习和较少过拟合学习等模块。我们基于一个铜价预测数据集开展实验,验证PLM各模块的设计并评估PLM的性能。实证结果确认了PLM模块设计的合理性,并表明所提出的PLM模型显著优于线性回归模型和传统的基于反向传播的2LNN模型。

User-Controlled Knowledge Fusion in Large Language Models: Balancing Creativity and Hallucination

  • paper_url: http://arxiv.org/abs/2307.16139
  • repo_url: None
  • paper_authors: Chen Zhang
  • for: 这篇论文旨在解决现代对话系统中使用大语言模型(LLM)的一个关键问题,即在LLM的创造力与其对外部事实知识的忠实度之间找到可控的平衡。
  • methods: 该论文提出了一种新的用户可控机制,在LLM微调阶段引入一个数值标签,表示生成回复对参考知识的忠实程度。该标签由自动化流程计算,综合了ROUGE分数、Sentence-BERT嵌入相似度以及LLM的自我评价分数;推理时用户可以调节该标签,从而控制LLM对外部知识的依赖程度。
  • results: 论文在多种场景下进行了广泛实验,证明了该方法的适应性,并能保证LLM回复的质量与准确性。结果表明该方法可以提升LLM的多样性,同时在创造力与幻觉之间保持平衡。
    Abstract In modern dialogue systems, the use of Large Language Models (LLMs) has grown exponentially due to their capacity to generate diverse, relevant, and creative responses. Despite their strengths, striking a balance between the LLMs' creativity and their faithfulness to external knowledge remains a key challenge. This paper presents an innovative user-controllable mechanism that modulates the balance between an LLM's imaginative capabilities and its adherence to factual information. Our approach incorporates a numerical tag during the fine-tuning phase of the LLM's training, representing the degree of faithfulness to the reference knowledge in the generated responses. This degree is computed through an automated process that measures lexical overlap using ROUGE scores, semantic similarity using Sentence-BERT embeddings, and an LLM's self-evaluation score. During model inference, users can manipulate this numerical tag, thus controlling the degree of the LLM's reliance on external knowledge. We conduct extensive experiments across various scenarios, demonstrating the adaptability of our method and its efficacy in ensuring the quality and accuracy of the LLM's responses. The results highlight the potential of our approach to enhance the versatility of LLMs while maintaining a balance between creativity and hallucination.
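
The exact way the paper combines its three signals into the numerical tag is not given in the abstract; the following hypothetical sketch only illustrates the idea, mixing ROUGE-L overlap with Sentence-BERT cosine similarity under assumed equal weights and omitting the LLM self-evaluation term.

```python
# Hypothetical sketch of a faithfulness tag for a generated response, combining
# ROUGE-L overlap with Sentence-BERT cosine similarity (the LLM self-evaluation
# term described in the abstract is omitted here). Weights are illustrative.
from rouge_score import rouge_scorer
from sentence_transformers import SentenceTransformer, util

_rouge = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
_sbert = SentenceTransformer("all-MiniLM-L6-v2")

def faithfulness_tag(response: str, reference: str, n_buckets: int = 5) -> int:
    lexical = _rouge.score(reference, response)["rougeL"].fmeasure        # [0, 1]
    emb = _sbert.encode([response, reference], convert_to_tensor=True)
    semantic = float(util.cos_sim(emb[0], emb[1]))                        # [-1, 1]
    score = 0.5 * lexical + 0.5 * max(semantic, 0.0)
    # Discretize into an integer tag that can be prepended to training examples.
    return min(int(score * n_buckets), n_buckets - 1)

print(faithfulness_tag("The capital of France is Paris.",
                       "Paris is the capital city of France."))
```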

Deep Unrolling Networks with Recurrent Momentum Acceleration for Nonlinear Inverse Problems

  • paper_url: http://arxiv.org/abs/2307.16120
  • repo_url: https://github.com/zhouqp631/dunets-rma
  • paper_authors: Qingping Zhou, Jiayu Qian, Junqi Tang, Jinglai Li
  • for: 解决非线性逆问题
  • methods: 使用循环动量加速(RMA)模块扩展深度展开网络(DuNets)
  • results: 将RMA应用于两种流行的DuNets方法(LPGD和LPD),提升了它们求解非线性逆问题的能力。实验结果表明,RMA带来的改进随问题非线性程度的增强而显著增大,并且在强病态问题上同样能显著提升DuNets的表现。
    Abstract Combining the strengths of model-based iterative algorithms and data-driven deep learning solutions, deep unrolling networks (DuNets) have become a popular tool to solve inverse imaging problems. While DuNets have been successfully applied to many linear inverse problems, nonlinear problems tend to impair the performance of the method. Inspired by momentum acceleration techniques that are often used in optimization algorithms, we propose a recurrent momentum acceleration (RMA) framework that uses a long short-term memory recurrent neural network (LSTM-RNN) to simulate the momentum acceleration process. The RMA module leverages the ability of the LSTM-RNN to learn and retain knowledge from the previous gradients. We apply RMA to two popular DuNets -- the learned proximal gradient descent (LPGD) and the learned primal-dual (LPD) methods, resulting in LPGD-RMA and LPD-RMA respectively. We provide experimental results on two nonlinear inverse problems: a nonlinear deconvolution problem, and an electrical impedance tomography problem with limited boundary measurements. In the first experiment we have observed that the improvement due to RMA largely increases with respect to the nonlinearity of the problem. The results of the second example further demonstrate that the RMA schemes can significantly improve the performance of DuNets in strongly ill-posed problems.
    摘要 深度展开网络(DuNets)结合了基于模型的迭代算法与数据驱动的深度学习方法的优势,已成为求解成像逆问题的流行工具。尽管DuNets已成功应用于许多线性逆问题,非线性问题往往会削弱其性能。受优化算法中常用的动量加速技术启发,我们提出了一种循环动量加速(RMA)框架,利用长短期记忆循环神经网络(LSTM-RNN)来模拟动量加速过程。RMA模块借助LSTM-RNN学习并保留历史梯度信息的能力。我们将RMA应用于两种流行的DuNets方法——学习近端梯度下降(LPGD)和学习原-对偶(LPD)方法,分别得到LPGD-RMA和LPD-RMA。我们在两个非线性逆问题上给出了实验结果:一个非线性反卷积问题,以及一个边界测量受限的电阻抗断层成像问题。在第一个实验中,我们观察到RMA带来的改进随问题非线性程度的增强而显著增大;第二个例子的结果进一步表明,RMA方案能够显著提升DuNets在强病态问题上的表现。
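
To make the learned recurrent momentum idea concrete, here is a minimal sketch (assumed shapes and names, not the authors' code) in which an LSTM cell consumes the current gradient and its memory of past gradients, and its output is added to the plain gradient step inside each unrolled iteration.

```python
# Minimal sketch of a learned recurrent momentum module for an unrolled solver:
# an LSTM cell maps the current gradient (plus its memory of past gradients)
# to a correction that is added to the gradient step. Shapes/names are assumed.
import torch
import torch.nn as nn

class RecurrentMomentum(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.cell = nn.LSTMCell(dim, hidden)
        self.out = nn.Linear(hidden, dim)

    def forward(self, grad, state=None):
        if state is None:
            h = grad.new_zeros(grad.shape[0], self.cell.hidden_size)
            state = (h, h.clone())
        h, c = self.cell(grad, state)
        return self.out(h), (h, c)          # learned momentum term, new state

def unrolled_solve(x0, grad_fn, rma, n_iters=10, step=0.1):
    x, state = x0, None
    for _ in range(n_iters):
        g = grad_fn(x)
        m, state = rma(g, state)            # momentum learned from past gradients
        x = x - step * (g + m)
    return x
```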

TMPNN: High-Order Polynomial Regression Based on Taylor Map Factorization

  • paper_url: http://arxiv.org/abs/2307.16105
  • repo_url: https://github.com/andiva/tmpnn
  • paper_authors: Andrei Ivanov, Stefan Maria Ailuro
  • for: 这篇论文旨在提出一种基于泰勒映射分解(Taylor map factorization)的高阶多项式回归方法,用于刻画非线性模式的预测问题。
  • methods: 该方法利用泰勒映射分解来构建高阶多项式回归模型,天然支持多目标回归,并能捕捉目标之间的内部关联;论文还给出了以微分方程组形式解释模型的方法。
  • results: 在 UCI 公开数据集、Feynman 符号回归数据集和 Friedman-1 数据集上的基准比较表明,所提方法与最先进的回归方法表现相当,并在特定任务上优于它们。
    Abstract Polynomial regression is widely used and can help to express nonlinear patterns. However, considering very high polynomial orders may lead to overfitting and poor extrapolation ability for unseen data. The paper presents a method for constructing a high-order polynomial regression based on the Taylor map factorization. This method naturally implements multi-target regression and can capture internal relationships between targets. Additionally, we introduce an approach for model interpretation in the form of systems of differential equations. By benchmarking on UCI open access datasets, Feynman symbolic regression datasets, and Friedman-1 datasets, we demonstrate that the proposed method performs comparable to the state-of-the-art regression methods and outperforms them on specific tasks.
    摘要 多项式回归应用广泛,能够表达非线性模式。然而,采用过高的多项式阶数可能导致过拟合,并使模型对未见数据的外推能力变差。本文提出了一种基于泰勒映射分解的高阶多项式回归构建方法,该方法天然地实现多目标回归,并能捕捉目标之间的内部关系。此外,我们引入了一种以微分方程组形式解释模型的方法。在 UCI 公开数据集、Feynman 符号回归数据集和 Friedman-1 数据集上的基准实验表明,所提方法与最先进的回归方法表现相当,并在特定任务上超越它们。
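
As a toy illustration of what iterating a Taylor map means (this is not the TMPNN architecture or its training procedure), the snippet below composes a second-order polynomial map with shared weights, so the overall input-output relation is a high-order polynomial.

```python
# Toy illustration (not TMPNN itself): a second-order Taylor map
# x -> c + W1 x + W2 (x ⊗ x), iterated several times so that the overall
# input-output relation is a high-order polynomial with shared weights.
import numpy as np

def taylor_map_step(x, c, W1, W2):
    quad = np.outer(x, x).ravel()           # all pairwise products x_i * x_j
    return c + W1 @ x + W2 @ quad

def taylor_map(x, params, n_steps=3):
    for _ in range(n_steps):                # composition raises the polynomial order
        x = taylor_map_step(x, *params)
    return x

rng = np.random.default_rng(0)
d = 3
params = (0.1 * rng.normal(size=d),
          np.eye(d) + 0.05 * rng.normal(size=(d, d)),
          0.01 * rng.normal(size=(d, d * d)))
print(taylor_map(np.array([0.5, -0.2, 0.1]), params))
```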

AI Increases Global Access to Reliable Flood Forecasts

  • paper_url: http://arxiv.org/abs/2307.16104
  • repo_url: https://github.com/google-research-datasets/global_streamflow_model_paper
  • paper_authors: Grey Nearing, Deborah Cohen, Vusumuzi Dube, Martin Gauch, Oren Gilon, Shaun Harrigan, Avinatan Hassidim, Frederik Kratzert, Asher Metzger, Sella Nevo, Florian Pappenberger, Christel Prudhomme, Guy Shalev, Shlomo Shenzis, Tadele Tekalign, Dana Weitzner, Yoss Matias
  • for: 这项研究旨在开发一个利用人工智能(AI)预测极端水文事件的模型,以提供更准确、更提前的洪水预警。
  • methods: 研究使用AI模型预测最长提前7天的极端水文事件,并将其与现有的全球水文模型(Copernicus Emergency Management Service Global Flood Awareness System)进行性能比较。
  • results: 研究发现,该AI模型在各大洲、不同预报提前期和不同重现期下都具有更高的准确性,尤其是在无测站流域中。该模型已被集成到一个业务化的早期预警系统中,在80多个国家实时提供免费且开放的预报。
    Abstract Floods are one of the most common and impactful natural disasters, with a disproportionate impact in developing countries that often lack dense streamflow monitoring networks. Accurate and timely warnings are critical for mitigating flood risks, but accurate hydrological simulation models typically must be calibrated to long data records in each watershed where they are applied. We developed an Artificial Intelligence (AI) model to predict extreme hydrological events at timescales up to 7 days in advance. This model significantly outperforms current state of the art global hydrology models (the Copernicus Emergency Management Service Global Flood Awareness System) across all continents, lead times, and return periods. AI is especially effective at forecasting in ungauged basins, which is important because only a few percent of the world's watersheds have stream gauges, with a disproportionate number of ungauged basins in developing countries that are especially vulnerable to the human impacts of flooding. We produce forecasts of extreme events in South America and Africa that achieve reliability approaching the current state of the art in Europe and North America, and we achieve reliability at between 4 and 6-day lead times that are similar to current state of the art nowcasts (0-day lead time). Additionally, we achieve accuracies over 10-year return period events that are similar to current accuracies over 2-year return period events, meaning that AI can provide warnings earlier and over larger and more impactful events. The model that we develop in this paper has been incorporated into an operational early warning system that produces publicly available (free and open) forecasts in real time in over 80 countries. This work using AI and open data highlights a need for increasing the availability of hydrological data to continue to improve global access to reliable flood warnings.
    摘要 洪水是最常见、影响最大的自然灾害之一,对缺乏密集流量监测网络的发展中国家影响尤为严重。准确而及时的预警对降低洪水风险至关重要,但准确的水文模拟模型通常需要在每个应用流域上用长期观测数据进行率定。我们开发了一个人工智能(AI)模型,可提前最多7天预测极端水文事件。该模型在所有大洲、所有预报提前期和所有重现期上均显著优于当前最先进的全球水文模型(Copernicus Emergency Management Service Global Flood Awareness System)。AI在无测站流域的预报上尤其有效,这一点非常重要,因为全球只有百分之几的流域设有测站,而无测站流域又不成比例地集中在最易受洪水影响的发展中国家。我们在南美洲和非洲得到的极端事件预报,其可靠性已接近欧洲和北美洲目前的最高水平;在4到6天的提前期上,可靠性与当前最先进的即时预报(0天提前期)相当。此外,我们在10年一遇事件上的精度与目前2年一遇事件上的精度相当,这意味着AI能够针对更大、更具破坏性的事件更早地发出预警。本文开发的模型已被纳入一个业务化早期预警系统,在80多个国家实时提供免费、开放的预报。这项基于AI和开放数据的工作也表明,需要进一步提高水文数据的可获得性,以持续改善全球获取可靠洪水预警的能力。

On Neural Network approximation of ideal adversarial attack and convergence of adversarial training

  • paper_url: http://arxiv.org/abs/2307.16099
  • repo_url: None
  • paper_authors: Rajdeep Haldar, Qifan Song
  • for: 本文研究如何将对抗攻击表示为可训练的函数以加速对抗训练,并分析此设定下对抗训练的收敛性。
  • methods: 本文使用一种基于神经网络的方法,将攻击表示为可训练的函数,从而在生成攻击时无需额外的梯度计算;对抗训练被归结为攻击网络与防御网络之间的数学博弈。
  • results: 本文证明了在适当条件下,理论上最优的攻击可以表示为光滑的分段函数(分段Hölder函数),并可由神经网络逼近以模拟理想的攻击过程;同时给出了对抗损失关于样本量的收敛速率。
    Abstract Adversarial attacks are usually expressed in terms of a gradient-based operation on the input data and model, this results in heavy computations every time an attack is generated. In this work, we solidify the idea of representing adversarial attacks as a trainable function, without further gradient computation. We first motivate that the theoretical best attacks, under proper conditions, can be represented as smooth piece-wise functions (piece-wise H\"older functions). Then we obtain an approximation result of such functions by a neural network. Subsequently, we emulate the ideal attack process by a neural network and reduce the adversarial training to a mathematical game between an attack network and a training model (a defense network). We also obtain convergence rates of adversarial loss in terms of the sample size $n$ for adversarial training in such a setting.
    摘要 对抗攻击通常被表示为对输入数据和模型的基于梯度的操作,这使得每次生成攻击都需要大量计算。在这项工作中,我们将对抗攻击表示为一个可训练的函数,从而无需额外的梯度计算。我们首先论证,在适当条件下,理论上最优的攻击可以表示为光滑的分段函数(分段Hölder函数);随后我们给出了用神经网络逼近这类函数的结果。在此基础上,我们用一个神经网络来模拟理想的攻击过程,并将对抗训练归结为攻击网络与训练模型(防御网络)之间的数学博弈。我们还在该设定下给出了对抗损失关于样本量 $n$ 的收敛速率。
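
A rough sketch of the attack-network idea follows; the architectures, loss, and the epsilon-scaled tanh perturbation are assumptions rather than the paper's construction, but it shows one alternating step of the attacker/defender game.

```python
# Rough sketch of adversarial training as a game between an attack network
# (which outputs a bounded perturbation without per-example gradient steps)
# and a defense network (the classifier). Architecture and epsilon are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

classifier = nn.Sequential(nn.Linear(28 * 28, 256), nn.ReLU(), nn.Linear(256, 10))
attacker = nn.Sequential(nn.Linear(28 * 28, 256), nn.ReLU(),
                         nn.Linear(256, 28 * 28), nn.Tanh())
opt_def = torch.optim.Adam(classifier.parameters(), lr=1e-3)
opt_atk = torch.optim.Adam(attacker.parameters(), lr=1e-3)
eps = 0.1

def game_step(x, y):
    # Attacker: maximize classification loss on perturbed inputs.
    x_adv = (x + eps * attacker(x)).clamp(0, 1)
    loss_atk = -F.cross_entropy(classifier(x_adv), y)
    opt_atk.zero_grad(); loss_atk.backward(); opt_atk.step()
    # Defender: minimize loss on the (re-generated, detached) adversarial inputs.
    x_adv = (x + eps * attacker(x)).clamp(0, 1).detach()
    loss_def = F.cross_entropy(classifier(x_adv), y)
    opt_def.zero_grad(); loss_def.backward(); opt_def.step()
    return loss_def.item()

print(game_step(torch.rand(32, 28 * 28), torch.randint(0, 10, (32,))))
```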

ADR-GNN: Advection-Diffusion-Reaction Graph Neural Networks

  • paper_url: http://arxiv.org/abs/2307.16092
  • repo_url: None
  • paper_authors: Moshe Eliasof, Eldad Haber, Eran Treister
  • for: 本文提出了一种基于平流-扩散-反应系统的图神经网络架构(ADR-GNN),用于学习图数据上涉及平流的复杂现象的表示。
  • methods: 该架构结合了平流、扩散和反应三种过程,分别建模图上信息的定向传输、信息的局部平滑以及通道内信息的非线性变换。
  • results: 作者在真实数据集上进行了评估,结果表明 ADR-GNN 在节点分类和时空数据集上优于或可比肩最先进的网络。
    Abstract Graph neural networks (GNNs) have shown remarkable success in learning representations for graph-structured data. However, GNNs still face challenges in modeling complex phenomena that involve advection. In this paper, we propose a novel GNN architecture based on Advection-Diffusion-Reaction systems, called ADR-GNN. Advection models the directed transportation of information, diffusion captures the local smoothing of information, and reaction represents the non-linear transformation of information in channels. We provide an analysis of the qualitative behavior of ADR-GNN, that shows the benefit of combining advection, diffusion, and reaction. To demonstrate its efficacy, we evaluate ADR-GNN on real-world node classification and spatio-temporal datasets, and show that it improves or offers competitive performance compared to state-of-the-art networks.
    摘要 图神经网络(GNN)在学习图结构数据的表示方面取得了显著成功。然而,GNN在建模涉及平流(advection)的复杂现象时仍面临挑战。在本文中,我们提出了一种基于平流-扩散-反应系统的新型GNN架构,称为ADR-GNN。其中,平流建模信息的定向传输,扩散刻画信息的局部平滑,反应则表示通道内信息的非线性变换。我们对ADR-GNN的定性行为进行了分析,说明了结合平流、扩散与反应的好处。为验证其有效性,我们在真实世界的节点分类和时空数据集上进行了评估,结果表明ADR-GNN相比最先进的网络具有更优或相当的性能。
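
ADR-GNN learns its transport and diffusion coefficients; the simplified sketch below only illustrates one explicit-Euler advection-diffusion-reaction update on a graph with fixed scalar coefficients, given row-normalized directed edge weights and a pointwise reaction MLP.

```python
# Simplified sketch of one advection-diffusion-reaction update on a graph
# (explicit Euler step). ADR-GNN learns its transport/diffusion coefficients;
# here they are fixed scalars and the edge weights are given, for illustration.
import torch
import torch.nn as nn

def adr_step(x, A_directed, reaction, dt=0.1, k_adv=0.5, k_diff=0.5):
    # x: (n_nodes, channels); A_directed: (n, n) row-normalized transport weights
    advection = A_directed @ x - x                 # directed transport of information
    A_sym = 0.5 * (A_directed + A_directed.T)
    deg = A_sym.sum(dim=1, keepdim=True).clamp(min=1e-8)
    diffusion = A_sym @ x / deg - x                # local smoothing of information
    return x + dt * (k_adv * advection + k_diff * diffusion + reaction(x))

n, c = 5, 8
A = torch.rand(n, n)
A = A / A.sum(dim=1, keepdim=True)                 # row-normalize outgoing weights
reaction = nn.Sequential(nn.Linear(c, c), nn.Tanh())
x = torch.randn(n, c)
print(adr_step(x, A, reaction).shape)              # torch.Size([5, 8])
```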

Rapid Flood Inundation Forecast Using Fourier Neural Operator

  • paper_url: http://arxiv.org/abs/2307.16090
  • repo_url: None
  • paper_authors: Alexander Y. Sun, Zhi Li, Wonhyun Lee, Qixing Huang, Bridget R. Scanlon, Clint Dawson
  • for: 预测洪水淹没范围和水深,为洪水事件前及事件中的应急准备与响应提供关键信息。
  • methods: 结合基于过程的模型与数据驱动的机器学习方法,采用Fourier神经算子(FNO)模型进行代理模拟。
  • results: FNO模型使用六次历史暴雨事件的模拟水深数据(15分钟间隔)进行训练,并在两次保留事件上测试;结果显示FNO在所有预报提前期(最长3小时)内均保持高预测精度,且应用于新地点时表现良好,表明其具有较强的泛化能力。
    Abstract Flood inundation forecast provides critical information for emergency planning before and during flood events. Real time flood inundation forecast tools are still lacking. High-resolution hydrodynamic modeling has become more accessible in recent years, however, predicting flood extents at the street and building levels in real-time is still computationally demanding. Here we present a hybrid process-based and data-driven machine learning (ML) approach for flood extent and inundation depth prediction. We used the Fourier neural operator (FNO), a highly efficient ML method, for surrogate modeling. The FNO model is demonstrated over an urban area in Houston (Texas, U.S.) by training using simulated water depths (in 15-min intervals) from six historical storm events and then tested over two holdout events. Results show FNO outperforms the baseline U-Net model. It maintains high predictability at all lead times tested (up to 3 hrs) and performs well when applying to new sites, suggesting strong generalization skill.
    摘要 洪水淹没预报为洪水事件前及事件中的应急准备提供关键信息,但实时洪水淹没预报工具仍然缺乏。近年来,高分辨率水动力模拟变得更加普及,然而要在街道和建筑物尺度上实时预测洪水范围,计算代价仍然很高。本文提出了一种将基于过程的模型与数据驱动机器学习(ML)相结合的混合方法,用于预测洪水淹没范围和水深。我们采用高效的ML方法——Fourier神经算子(FNO)——作为代理模型。FNO模型在美国得克萨斯州休斯敦的一个城市区域上进行了演示:利用六次历史暴雨事件的模拟水深数据(15分钟间隔)训练,并在两次保留事件上测试。结果显示,FNO优于基线U-Net模型,在所有测试的预报提前期(最长3小时)内均保持高预测能力,且应用于新地点时表现良好,表明其具有较强的泛化能力。
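
The core FNO building block is a spectral convolution; below is a minimal 1-D version as commonly implemented (truncate to the lowest Fourier modes and multiply them by learned complex weights). The paper's 2-D flood model, channel widths, and mode counts are not specified in the abstract, so these values are illustrative.

```python
# Minimal sketch of the core FNO building block: a 1-D spectral convolution
# that keeps only the lowest `modes` Fourier modes and multiplies them by
# learned complex weights. Channel sizes and mode count are assumptions.
import torch
import torch.nn as nn

class SpectralConv1d(nn.Module):
    def __init__(self, in_ch, out_ch, modes):
        super().__init__()
        self.modes = modes
        scale = 1.0 / (in_ch * out_ch)
        self.weight = nn.Parameter(
            scale * torch.randn(in_ch, out_ch, modes, dtype=torch.cfloat))

    def forward(self, x):                      # x: (batch, in_ch, n_points)
        x_ft = torch.fft.rfft(x)               # to frequency domain
        out_ft = torch.zeros(x.size(0), self.weight.size(1), x_ft.size(-1),
                             dtype=torch.cfloat, device=x.device)
        out_ft[:, :, :self.modes] = torch.einsum(
            "bim,iom->bom", x_ft[:, :, :self.modes], self.weight)
        return torch.fft.irfft(out_ft, n=x.size(-1))   # back to physical space

layer = SpectralConv1d(in_ch=3, out_ch=8, modes=16)
print(layer(torch.randn(2, 3, 64)).shape)      # torch.Size([2, 8, 64])
```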

Using Implicit Behavior Cloning and Dynamic Movement Primitive to Facilitate Reinforcement Learning for Robot Motion Planning

  • paper_url: http://arxiv.org/abs/2307.16062
  • repo_url: None
  • paper_authors: Zengjie Zhang, Jayden Hong, Amir Soufi Enayati, Homayoun Najjaran
  • for: 提高多自由度机器人运动规划中强化学习的训练效率和泛化能力
  • methods: 利用隐式行为克隆(IBC)和动态运动基元(DMP)提升离策略RL智能体的训练速度和泛化能力
  • results: 仿真与真实机器人实验中与传统RL智能体的对比表明,所提方法具有更快的训练速度和更高的得分
    Abstract Reinforcement learning (RL) for motion planning of multi-degree-of-freedom robots still suffers from low efficiency in terms of slow training speed and poor generalizability. In this paper, we propose a novel RL-based robot motion planning framework that uses implicit behavior cloning (IBC) and dynamic movement primitive (DMP) to improve the training speed and generalizability of an off-policy RL agent. IBC utilizes human demonstration data to leverage the training speed of RL, and DMP serves as a heuristic model that transfers motion planning into a simpler planning space. To support this, we also create a human demonstration dataset using a pick-and-place experiment that can be used for similar studies. Comparison studies in simulation reveal the advantage of the proposed method over the conventional RL agents with faster training speed and higher scores. A real-robot experiment indicates the applicability of the proposed method to a simple assembly task. Our work provides a novel perspective on using motion primitives and human demonstration to leverage the performance of RL for robot applications.
    摘要 用于多自由度机器人运动规划的强化学习(RL)仍然存在效率低的问题,即训练速度慢、泛化能力差。本文提出了一种新的基于RL的机器人运动规划框架,利用隐式行为克隆(IBC)和动态运动基元(DMP)来提升离策略RL智能体的训练速度和泛化能力。IBC利用人类示范数据来加快RL的训练,而DMP作为一种启发式模型,将运动规划转换到一个更简单的规划空间。为此,我们还通过抓取-放置(pick-and-place)实验构建了一个人类示范数据集,可供类似研究使用。仿真中的对比研究表明,所提方法相比传统RL智能体具有更快的训练速度和更高的得分;真实机器人实验则表明该方法可应用于简单的装配任务。我们的工作为利用运动基元和人类示范来提升RL在机器人应用中的性能提供了新的视角。
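
For readers unfamiliar with DMPs, the following minimal sketch rolls out a one-dimensional discrete DMP: a stable spring-damper system pulled toward the goal and shaped by a learned forcing term. The gains, basis functions, and zero weights here are illustrative, not the paper's settings.

```python
# Minimal sketch of a discrete Dynamic Movement Primitive (DMP) rollout in one
# dimension: a stable spring-damper system toward the goal, shaped by a learned
# forcing term. Gains, basis functions and weights here are illustrative.
import numpy as np

def dmp_rollout(x0, goal, weights, duration=1.0, dt=0.01,
                alpha=25.0, beta=6.25, alpha_s=4.0):
    n_basis = len(weights)
    centers = np.exp(-alpha_s * np.linspace(0, 1, n_basis))   # basis centers in phase space
    widths = n_basis ** 1.5 / centers
    x, v, s, traj = x0, 0.0, 1.0, [x0]
    for _ in range(int(duration / dt)):
        psi = np.exp(-widths * (s - centers) ** 2)
        forcing = (psi @ weights) / (psi.sum() + 1e-10) * s * (goal - x0)
        a = alpha * (beta * (goal - x) - v) + forcing          # transformation system
        v += a * dt / duration
        x += v * dt / duration
        s += -alpha_s * s * dt / duration                      # canonical system decays the phase
        traj.append(x)
    return np.array(traj)

print(dmp_rollout(x0=0.0, goal=1.0, weights=np.zeros(10))[-1])  # close to 1.0 (reaches the goal)
```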

Click-Conversion Multi-Task Model with Position Bias Mitigation for Sponsored Search in eCommerce

  • paper_url: http://arxiv.org/abs/2307.16060
  • repo_url: None
  • paper_authors: Yibo Wang, Yanbing Xue, Bo Liu, Musen Wen, Wenting Zhao, Stephen Guo, Philip S. Yu
  • for: This paper aims to mitigate position bias in ranking systems, particularly in e-commerce sponsored product search.
  • methods: The authors propose two position-bias-free CTR and CVR prediction models: Position-Aware Click-Conversion (PACC) and PACC via Position Embedding (PACC-PE). PACC is built upon probability decomposition, while PACC-PE utilizes neural networks to model product-specific position information as embedding.
  • results: The proposed models have better ranking effectiveness and can greatly alleviate position bias in both CTR and CVR prediction, as shown in experiments on the E-commerce sponsored product search dataset.
    Abstract Position bias, the phenomenon whereby users tend to focus on higher-ranked items of the search result list regardless of the actual relevance to queries, is prevailing in many ranking systems. Position bias in training data biases the ranking model, leading to increasingly unfair item rankings, click-through-rate (CTR), and conversion rate (CVR) predictions. To jointly mitigate position bias in both item CTR and CVR prediction, we propose two position-bias-free CTR and CVR prediction models: Position-Aware Click-Conversion (PACC) and PACC via Position Embedding (PACC-PE). PACC is built upon probability decomposition and models position information as a probability. PACC-PE utilizes neural networks to model product-specific position information as embedding. Experiments on the E-commerce sponsored product search dataset show that our proposed models have better ranking effectiveness and can greatly alleviate position bias in both CTR and CVR prediction.
    摘要 位置偏差是指用户倾向于关注搜索结果列表中排名靠前的条目,而不论其与查询的实际相关性如何,这一现象在许多排序系统中普遍存在。训练数据中的位置偏差会使排序模型产生偏差,导致条目排序、点击率(CTR)和转化率(CVR)预测越来越不公平。为了同时缓解CTR和CVR预测中的位置偏差,我们提出了两种无位置偏差的CTR与CVR预测模型:位置感知点击-转化模型(PACC)和基于位置嵌入的PACC(PACC-PE)。PACC建立在概率分解之上,将位置信息建模为概率;PACC-PE则利用神经网络将商品相关的位置信息建模为嵌入。在电商赞助商品搜索数据集上的实验表明,所提模型具有更好的排序效果,并能大幅缓解CTR和CVR预测中的位置偏差。
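
PACC's precise probability decomposition is not spelled out in the abstract; the sketch below shows the standard examination-hypothesis factorization often used for position debiasing, where a position-dependent propensity multiplies a position-free relevance score during training and is dropped at serving time.

```python
# Sketch of the examination-hypothesis factorization commonly used to debias
# CTR models: p(click | item, position) = p(examine | position) * p(click | item).
# The position tower is used only during training; serving uses the item tower.
import torch
import torch.nn as nn

class PositionDebiasedCTR(nn.Module):
    def __init__(self, feat_dim, n_positions):
        super().__init__()
        self.relevance = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                       nn.Linear(64, 1))
        self.propensity = nn.Embedding(n_positions, 1)   # one logit per rank

    def forward(self, item_feats, positions):
        p_click = torch.sigmoid(self.relevance(item_feats)).squeeze(-1)
        p_exam = torch.sigmoid(self.propensity(positions)).squeeze(-1)
        return p_click * p_exam                           # training-time click probability

    def serve(self, item_feats):
        return torch.sigmoid(self.relevance(item_feats)).squeeze(-1)  # bias-free score

model = PositionDebiasedCTR(feat_dim=16, n_positions=20)
print(model(torch.randn(4, 16), torch.tensor([0, 1, 2, 3])))
```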

Evaluating the Robustness of Test Selection Methods for Deep Neural Networks

  • paper_url: http://arxiv.org/abs/2308.01314
  • repo_url: None
  • paper_authors: Qiang Hu, Yuejun Guo, Xiaofei Xie, Maxime Cordy, Wei Ma, Mike Papadakis, Yves Le Traon
  • for: This paper aims to investigate the reliability of multiple test selection methods for deep learning-based systems, and to identify potential pitfalls in their construction.
  • methods: The paper examines 11 test selection methods from top-tier venues, and conducts a study on five datasets with two model architectures per dataset to empirically confirm the existence of pitfalls.
  • results: The paper finds that methods for fault detection suffer from test data that are correctly classified but uncertain, or misclassified but confident, leading to a drop in test relative coverage of up to 86.85%. Additionally, methods for performance estimation are sensitive to the choice of intermediate-layer output, and can be less effective than random selection when using an inappropriate layer.
    Abstract Testing deep learning-based systems is crucial but challenging due to the required time and labor for labeling collected raw data. To alleviate the labeling effort, multiple test selection methods have been proposed where only a subset of test data needs to be labeled while satisfying testing requirements. However, we observe that such methods with reported promising results are only evaluated under simple scenarios, e.g., testing on original test data. This brings a question to us: are they always reliable? In this paper, we explore when and to what extent test selection methods fail for testing. Specifically, first, we identify potential pitfalls of 11 selection methods from top-tier venues based on their construction. Second, we conduct a study on five datasets with two model architectures per dataset to empirically confirm the existence of these pitfalls. Furthermore, we demonstrate how pitfalls can break the reliability of these methods. Concretely, methods for fault detection suffer from test data that are: 1) correctly classified but uncertain, or 2) misclassified but confident. Remarkably, the test relative coverage achieved by such methods drops by up to 86.85%. On the other hand, methods for performance estimation are sensitive to the choice of intermediate-layer output. The effectiveness of such methods can be even worse than random selection when using an inappropriate layer.
    摘要 测试基于深度学习的系统至关重要,但由于标注采集到的原始数据需要耗费大量时间和人力,因而颇具挑战。为减轻标注负担,已有多种测试选择方法被提出,它们在满足测试要求的同时只需标注一部分测试数据。然而,我们注意到,这些报告了良好结果的方法只在简单场景(例如在原始测试数据上测试)下得到了评估。这给我们带来一个问题:它们是否总是可靠?在本文中,我们探究测试选择方法会在何时、以何种程度失效。具体而言,首先,我们根据11种来自顶级会议/期刊的选择方法的构造,识别其潜在缺陷;其次,我们在五个数据集上(每个数据集使用两种模型架构)开展实验,验证这些缺陷确实存在;进一步地,我们展示了这些缺陷会如何破坏方法的可靠性。具体来说,用于故障检测的方法会被以下两类测试数据所困扰:1)被正确分类但不确定的样本,2)被错误分类但置信度高的样本;值得注意的是,此类方法取得的测试相对覆盖率最多可下降86.85%。另一方面,用于性能估计的方法对中间层输出的选择十分敏感,当选用不合适的层时,其效果甚至可能不如随机选择。
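
To make the two failure categories concrete, one quick way to surface such inputs from a model's softmax outputs is sketched below; the entropy and confidence thresholds are arbitrary illustrative choices.

```python
# Sketch: flag the two pitfall categories described above from softmax outputs:
# (1) correctly classified but uncertain, (2) misclassified but confident.
# The entropy/confidence thresholds are arbitrary illustrative choices.
import numpy as np

def pitfall_categories(probs, labels, entropy_thr=0.8, conf_thr=0.9):
    preds = probs.argmax(axis=1)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    confidence = probs.max(axis=1)
    correct_but_uncertain = (preds == labels) & (entropy > entropy_thr)
    wrong_but_confident = (preds != labels) & (confidence > conf_thr)
    return correct_but_uncertain, wrong_but_confident

rng = np.random.default_rng(0)
logits = rng.normal(size=(100, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = rng.integers(0, 10, size=100)
cu, wc = pitfall_categories(probs, labels)
print(cu.sum(), wc.sum())
```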

Unveiling Exotic Magnetic Phases in Fibonacci Quasicrystalline Stacking of Ferromagnetic Layers through Machine Learning

  • paper_url: http://arxiv.org/abs/2307.16052
  • repo_url: None
  • paper_authors: Pablo S. Cornaglia, Matias Nuñez, D. J. Garcia
  • for: 这项研究从理论上分析了铁磁层的斐波那契准晶堆叠结构的磁学性质,这类结构有望利用范德华磁性材料实现。
  • methods: 该研究构建了包含至多次近邻层间磁相互作用的模型,并使用机器学习方法探索参数空间,刻画该准晶系统中几何阻挫与磁有序之间的复杂关系,给出了磁相图。
  • results: 研究在诸多共线与非共线磁相之外,发现了一种独特的铁磁交替螺旋相;在这种非共线的准周期铁磁构型中,磁化强度随堆叠高度呈对数下降。
    Abstract In this study, we conduct a comprehensive theoretical analysis of a Fibonacci quasicrystalline stacking of ferromagnetic layers, potentially realizable using van der Waals magnetic materials. We construct a model of this magnetic heterostructure, which includes up to second neighbor interlayer magnetic interactions, that displays a complex relationship between geometric frustration and magnetic order in this quasicrystalline system. To navigate the parameter space and identify distinct magnetic phases, we employ a machine learning approach, which proves to be a powerful tool in revealing the complex magnetic behavior of this system. We offer a thorough description of the magnetic phase diagram as a function of the model parameters. Notably, we discover among other collinear and non-collinear phases, a unique ferromagnetic alternating helical phase. In this non-collinear quasiperiodic ferromagnetic configuration the magnetization decreases logarithmically with the stack height.
    摘要 在本研究中,我们对铁磁层的斐波那契准晶堆叠进行了全面的理论分析,这类结构有望利用范德华磁性材料实现。我们构建了该磁性异质结构的模型,其中包含至多次近邻的层间磁相互作用,该模型展现出此准晶系统中几何阻挫与磁有序之间的复杂关系。为了探索参数空间并识别不同的磁相,我们采用了机器学习方法,事实证明它是揭示该系统复杂磁行为的有力工具。我们给出了磁相图随模型参数变化的全面描述。值得注意的是,在诸多共线与非共线磁相之外,我们发现了一种独特的铁磁交替螺旋相;在这种非共线的准周期铁磁构型中,磁化强度随堆叠高度呈对数下降。
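
For readers unfamiliar with the geometry, the stacking order itself is easy to generate: it is the Fibonacci word obtained by repeated substitution (A -> AB, B -> A), which would label the two interlayer couplings along the stack. The snippet is purely illustrative; the paper's spin model and machine-learning workflow are not reproduced.

```python
# Illustrative only: build the Fibonacci word that defines the quasicrystalline
# stacking order of the layers (A -> AB, B -> A), e.g. to assign two different
# interlayer couplings along the stack.
def fibonacci_stacking(n_substitutions: int) -> str:
    seq = "A"
    for _ in range(n_substitutions):
        seq = "".join("AB" if s == "A" else "A" for s in seq)
    return seq

print(fibonacci_stacking(5))   # ABAABABAABAAB (13 layers)
```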

Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback

  • paper_url: http://arxiv.org/abs/2307.16039
  • repo_url: https://github.com/nlp-uoregon/okapi
  • paper_authors: Viet Dac Lai, Chien Van Nguyen, Nghia Trung Ngo, Thuat Nguyen, Franck Dernoncourt, Ryan A. Rossi, Thien Huu Nguyen
  • for: 本研究旨在提高大语言模型(LLM)在多语言环境下的可及性和影响力;指令微调能使LLM的回复更符合人类期望,从而展现出出色的能力。
  • methods: 本研究构建了覆盖26种语言的指令数据和回复排序数据,并采用基于人类反馈的强化学习(RLHF)对多语言LLM进行指令微调,同时与监督微调(SFT)进行对比。
  • results: 实验表明,在多语言指令微调中,RLHF的性能优于SFT,并且在不同的基础模型和数据集上均能取得良好结果。
    Abstract A key technology for the development of large language models (LLMs) involves instruction tuning that helps align the models' responses with human expectations to realize impressive learning abilities. Two major approaches for instruction tuning characterize supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), which are currently applied to produce the best commercial LLMs (e.g., ChatGPT). To improve the accessibility of LLMs for research and development efforts, various instruction-tuned open-source LLMs have also been introduced recently, e.g., Alpaca, Vicuna, to name a few. However, existing open-source LLMs have only been instruction-tuned for English and a few popular languages, thus hindering their impacts and accessibility to many other languages in the world. Among a few very recent work to explore instruction tuning for LLMs in multiple languages, SFT has been used as the only approach to instruction-tune LLMs for multiple languages. This has left a significant gap for fine-tuned LLMs based on RLHF in diverse languages and raised important questions on how RLHF can boost the performance of multilingual instruction tuning. To overcome this issue, we present Okapi, the first system with instruction-tuned LLMs based on RLHF for multiple languages. Okapi introduces instruction and response-ranked data in 26 diverse languages to facilitate the experiments and development of future multilingual LLM research. We also present benchmark datasets to enable the evaluation of generative LLMs in multiple languages. Our experiments demonstrate the advantages of RLHF for multilingual instruction over SFT for different base models and datasets. Our framework and resources are released at https://github.com/nlp-uoregon/Okapi.
    摘要 指令微调是开发大语言模型(LLM)的一项关键技术,它使模型的回复与人类期望保持一致,从而展现出令人印象深刻的能力。目前主要有两种指令微调方法:监督微调(SFT)和基于人类反馈的强化学习(RLHF),它们被用于打造最优秀的商业LLM(如ChatGPT)。为了提高LLM在研究与开发中的可及性,近期也出现了多种经过指令微调的开源LLM,例如Alpaca、Vicuna等。然而,现有的开源LLM仅针对英语和少数流行语言进行了指令微调,这限制了它们对世界上其他众多语言的影响和可用性。在极少数探索多语言LLM指令微调的近期工作中,SFT是唯一被采用的方法,这使得基于RLHF的多语言微调LLM仍是空白,也提出了RLHF能否提升多语言指令微调性能这一重要问题。为解决该问题,我们提出了Okapi,这是首个基于RLHF、面向多语言的指令微调LLM系统。Okapi提供了覆盖26种不同语言的指令数据和回复排序数据,以促进未来多语言LLM研究的实验与发展。我们还提供了用于评估多语言生成式LLM的基准数据集。实验表明,在不同的基础模型和数据集上,RLHF在多语言指令微调中均优于SFT。我们的框架和资源已发布于 https://github.com/nlp-uoregon/Okapi。

Developing novel ligands with enhanced binding affinity for the sphingosine 1-phosphate receptor 1 using machine learning

  • paper_url: http://arxiv.org/abs/2307.16037
  • repo_url: None
  • paper_authors: Colin Zhang, Yang Ha
  • for: 这项研究旨在利用机器学习模型加速多发性硬化症(MS)的药物发现过程,并通过分析蛋白-配体相互作用的化学特性,为药物设计提供新的思路。
  • methods: 该研究微调了一个将化学式转化为数学向量的自编码器机器学习模型,基于siponimod生成了500多个分子变体,其中25个化合物对靶蛋白S1PR1具有更高的预测结合亲和力。
  • results: 研究筛选出6个具有良好类药性质且易于合成的候选分子,并通过分析这些配体与S1PR1的结合相互作用,揭示了若干有助于提高结合亲和力的化学性质;结果表明机器学习能够加速药物发现,并为蛋白-药物相互作用提供新的见解。
    Abstract Multiple sclerosis (MS) is a debilitating neurological disease affecting nearly one million people in the United States. Sphingosine-1-phosphate receptor 1, or S1PR1, is a protein target for MS. Siponimod, a ligand of S1PR1, was approved by the FDA in 2019 for MS treatment, but there is a demonstrated need for better therapies. To this end, we finetuned an autoencoder machine learning model that converts chemical formulas into mathematical vectors and generated over 500 molecular variants based on siponimod, out of which 25 compounds had higher predicted binding affinity to S1PR1. The model was able to generate these ligands in just under one hour. Filtering these compounds led to the discovery of six promising candidates with good drug-like properties and ease of synthesis. Furthermore, by analyzing the binding interactions for these ligands, we uncovered several chemical properties that contribute to high binding affinity to S1PR1. This study demonstrates that machine learning can accelerate the drug discovery process and reveal new insights into protein-drug interactions.
    摘要 多发性硬化症(MS)是一种使人逐渐衰弱的神经系统疾病,在美国影响近一百万人。鞘氨醇-1-磷酸受体1(S1PR1)是MS的一个蛋白靶点。S1PR1的配体siponimod于2019年获FDA批准用于治疗MS,但临床上仍需要更好的疗法。为此,我们微调了一个将化学式转化为数学向量的自编码器机器学习模型,并基于siponimod生成了500多个分子变体,其中25个化合物对S1PR1具有更高的预测结合亲和力;模型生成这些配体仅用了不到一小时。经过筛选,我们发现了6个具有良好类药性质且易于合成的有希望的候选分子。此外,通过分析这些配体的结合相互作用,我们揭示了若干有助于与S1PR1高亲和力结合的化学性质。这项研究表明,机器学习能够加速药物发现过程,并为蛋白-药物相互作用提供新的见解。

MUSE: Multi-View Contrastive Learning for Heterophilic Graphs

  • paper_url: http://arxiv.org/abs/2307.16026
  • repo_url: None
  • paper_authors: Mengyi Yuan, Minjie Chen, Xiang Li
  • for: 这篇文章旨在提出一种面向异配图的多视图对比学习自监督模型MUSE,以解决传统图神经网络(GNN)中的标签依赖和泛化性差问题。
  • methods: 该模型构建两种视图,分别利用对比学习增强的GNN捕捉自我节点(ego node)及其邻域的信息,再将两种视图的信息融合生成节点表示,并利用融合对比加以增强;此外,模型还引入信息融合控制器,在局部和全局两个层面刻画节点-邻域相似性的多样性。
  • results: 在9个基准数据集上的实验结果表明,MUSE在节点分类和聚类任务上均表现出色。
    Abstract In recent years, self-supervised learning has emerged as a promising approach in addressing the issues of label dependency and poor generalization performance in traditional GNNs. However, existing self-supervised methods have limited effectiveness on heterophilic graphs, due to the homophily assumption that results in similar node representations for connected nodes. In this work, we propose a multi-view contrastive learning model for heterophilic graphs, namely, MUSE. Specifically, we construct two views to capture the information of the ego node and its neighborhood by GNNs enhanced with contrastive learning, respectively. Then we integrate the information from these two views to fuse the node representations. Fusion contrast is utilized to enhance the effectiveness of fused node representations. Further, considering that the influence of neighboring contextual information on information fusion may vary across different ego nodes, we employ an information fusion controller to model the diversity of node-neighborhood similarity at both the local and global levels. Finally, an alternating training scheme is adopted to ensure that unsupervised node representation learning and information fusion controller can mutually reinforce each other. We conduct extensive experiments to evaluate the performance of MUSE on 9 benchmark datasets. Our results show the effectiveness of MUSE on both node classification and clustering tasks.
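
MUSE's fusion contrast and information fusion controller are specific to the paper; as a generic illustration of the two-view idea, an InfoNCE-style loss that pulls each node's ego-view embedding toward its own neighborhood-view embedding might look as follows (encoders and temperature are assumptions).

```python
# Generic sketch of a two-view node contrastive loss (InfoNCE): each node's
# ego-view embedding should be close to its own neighborhood-view embedding
# and far from other nodes'. Encoders, fusion, and MUSE's controller are omitted.
import torch
import torch.nn.functional as F

def two_view_contrastive_loss(z_ego, z_nbr, temperature=0.5):
    z_ego = F.normalize(z_ego, dim=1)
    z_nbr = F.normalize(z_nbr, dim=1)
    logits = z_ego @ z_nbr.T / temperature        # (n_nodes, n_nodes) similarities
    targets = torch.arange(z_ego.size(0))         # positives are on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))

print(two_view_contrastive_loss(torch.randn(16, 32), torch.randn(16, 32)))
```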

Discrete neural nets and polymorphic learning

  • paper_url: http://arxiv.org/abs/2308.00677
  • repo_url: https://github.com/caten2/tripods2021ua
  • paper_authors: Charlotte Aten
  • for: 这篇论文旨在把神经网络的通用逼近结果与泛代数(universal algebra)中的相关定理置于统一框架下,并介绍一种基于关系结构多态(polymorphism)的学习算法。
  • methods: 该论文考察经典神经网络概念的离散类比,借鉴 Murski\u{i} 的泛代数定理与 Cybenko 的神经网络通用逼近定理,并提出一种基于关系结构多态的学习算法。
  • results: 结果表明,该学习算法可用于解决经典的学习任务。
    Abstract Theorems from universal algebra such as that of Murski\u{i} from the 1970s have a striking similarity to universal approximation results for neural nets along the lines of Cybenko's from the 1980s. We consider here a discrete analogue of the classical notion of a neural net which places these results in a unified setting. We introduce a learning algorithm based on polymorphisms of relational structures and show how to use it for a classical learning task.
    摘要 泛代数中的定理(例如Murski\u{i}在20世纪70年代的结果)与神经网络的通用逼近结果(例如Cybenko在20世纪80年代的结果)之间存在惊人的相似性。我们在此考察经典神经网络概念的一个离散类比,将上述结果置于统一的框架之下。我们引入一种基于关系结构多态的学习算法,并展示如何将其用于一个经典的学习任务。

Fuzzy Logic Visual Network (FLVN): A neuro-symbolic approach for visual features matching

  • paper_url: http://arxiv.org/abs/2307.16019
  • repo_url: https://gitlab.com/grains2/flvn
  • paper_authors: Francesco Manigrasso, Lia Morra, Fabrizio Lamberti
  • for: 这项研究旨在实现神经-符号整合,把符号化知识表示与深度神经网络的学习能力结合起来。
  • methods: 研究采用逻辑张量网络(Logic Tensor Networks, LTN),将背景知识以逻辑公理的形式转化为可微分运算,并将其应用于零样本学习(ZSL)分类任务。
  • results: 研究提出了模糊逻辑视觉网络(FLVN),在神经-符号LTN框架下学习视觉-语义嵌入空间,并将类别与宏类层级等先验知识以及稳健的高层归纳偏置纳入其中。FLVN在广义零样本学习(GZSL)基准AWA2和CUB上分别提升1.3%和3%,达到最新的最优性能,且相比近期的ZSL方法计算开销更小。
    Abstract Neuro-symbolic integration aims at harnessing the power of symbolic knowledge representation combined with the learning capabilities of deep neural networks. In particular, Logic Tensor Networks (LTNs) allow to incorporate background knowledge in the form of logical axioms by grounding a first order logic language as differentiable operations between real tensors. Yet, few studies have investigated the potential benefits of this approach to improve zero-shot learning (ZSL) classification. In this study, we present the Fuzzy Logic Visual Network (FLVN) that formulates the task of learning a visual-semantic embedding space within a neuro-symbolic LTN framework. FLVN incorporates prior knowledge in the form of class hierarchies (classes and macro-classes) along with robust high-level inductive biases. The latter allow, for instance, to handle exceptions in class-level attributes, and to enforce similarity between images of the same class, preventing premature overfitting to seen classes and improving overall performance. FLVN reaches state of the art performance on the Generalized ZSL (GZSL) benchmarks AWA2 and CUB, improving by 1.3% and 3%, respectively. Overall, it achieves competitive performance to recent ZSL methods with less computational overhead. FLVN is available at https://gitlab.com/grains2/flvn.
    摘要 In this study, we present the Fuzzy Logic Visual Network (FLVN), which formulates the task of learning a visual-semantic embedding space within a neuro-symbolic LTN framework. FLVN incorporates prior knowledge in the form of class hierarchies and robust high-level inductive biases, allowing for exception handling and similarity enforcement between images of the same class. This improves overall performance and reduces premature overfitting to seen classes.FLVN achieves state-of-the-art performance on the Generalized ZSL (GZSL) benchmarks AWA2 and CUB, improving by 1.3% and 3%, respectively. It also achieves competitive performance to recent ZSL methods with less computational overhead. FLVN is available at .