cs.LG - 2023-12-07

SoK: Unintended Interactions among Machine Learning Defenses and Risks

  • paper_url: http://arxiv.org/abs/2312.04542
  • repo_url: None
  • paper_authors: Vasisht Duddu, Sebastian Szyller, N. Asokan
  • for: To examine the security, privacy, and fairness risks of machine learning models, and how different defenses interact when mitigating these risks.
  • methods: Surveys existing defenses and the literature on their interactions, and introduces a new framework, built on the conjecture that overfitting and memorization underlie unintended interactions, to recognize and explain them.
  • results: Analyzes and validates the framework against existing literature and experimental data, demonstrating its reliability and effectiveness, and uncovers two new unintended interactions.
    Abstract Machine learning (ML) models cannot neglect risks to security, privacy, and fairness. Several defenses have been proposed to mitigate such risks. When a defense is effective in mitigating one risk, it may correspond to increased or decreased susceptibility to other risks. Existing research lacks an effective framework to recognize and explain these unintended interactions. We present such a framework, based on the conjecture that overfitting and memorization underlie unintended interactions. We survey existing literature on unintended interactions, accommodating them within our framework. We use our framework to conjecture on two previously unexplored interactions, and empirically validate our conjectures.

Trajeglish: Learning the Language of Driving Scenarios

  • paper_url: http://arxiv.org/abs/2312.04535
  • repo_url: None
  • paper_authors: Jonah Philion, Xue Bin Peng, Sanja Fidler
  • for: Simulating the dynamics and interactive behavior of driving scenarios to improve the reliability and safety of self-driving systems.
  • methods: Applies discrete sequence modeling to the interactions of vehicles, pedestrians, and cyclists, using a GPT-like encoder-decoder to capture intra-timestep interaction between agents (a tokenization sketch follows the abstract).
  • results: Tops the Waymo Sim Agents Benchmark, surpassing prior work by 3.3% on the realism meta metric and 9.9% on the interaction metric; ablations in full- and partial-autonomy settings show the learned representations quickly transfer to improve performance on nuScenes.
    Abstract A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs. In pursuit of this functionality, we apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios. Using a simple data-driven tokenization scheme, we discretize trajectories to centimeter-level resolution using a small vocabulary. We then model the multi-agent sequence of motion tokens with a GPT-like encoder-decoder that is autoregressive in time and takes into account intra-timestep interaction between agents. Scenarios sampled from our model exhibit state-of-the-art realism; our model tops the Waymo Sim Agents Benchmark, surpassing prior work along the realism meta metric by 3.3% and along the interaction metric by 9.9%. We ablate our modeling choices in full autonomy and partial autonomy settings, and show that the representations learned by our model can quickly be adapted to improve performance on nuScenes. We additionally evaluate the scalability of our model with respect to parameter count and dataset size, and use density estimates from our model to quantify the saliency of context length and intra-timestep interaction for the traffic modeling task.
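The tokenization step lends itself to a short illustration. The sketch below is not the paper's code; the vocabulary size, the per-step displacement representation, and the k-means fitting are our assumptions about how centimeter-level discretization with a small vocabulary could look:

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_motion_vocab(trajectories, vocab_size=384):
    """Learn a small vocabulary of per-step (dx, dy) motion templates.

    trajectories: list of (T, 2) arrays of agent positions.
    """
    deltas = np.concatenate([np.diff(traj, axis=0) for traj in trajectories])
    return KMeans(n_clusters=vocab_size, n_init=10).fit(deltas)

def tokenize(traj, vocab):
    """Map a (T, 2) trajectory to T-1 discrete motion tokens."""
    return vocab.predict(np.diff(traj, axis=0))

def detokenize(start, tokens, vocab):
    """Reconstruct positions by accumulating the quantized deltas."""
    deltas = vocab.cluster_centers_[tokens]
    return start + np.concatenate([[np.zeros(2)], np.cumsum(deltas, axis=0)])
```

The resulting token sequences (one token per agent per timestep) are what a GPT-like encoder-decoder can then consume autoregressively.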

Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation

  • paper_url: http://arxiv.org/abs/2312.04464
  • repo_url: None
  • paper_authors: Jiayi Huang, Han Zhong, Liwei Wang, Lin F. Yang
  • for: Solving the long planning horizon problem in reinforcement learning with general function approximation, which many existing algorithms cannot handle.
  • methods: Proposes UCRL-WVTR, the first algorithm to achieve horizon-free and instance-dependent regret in this setting, combining weighted value-targeted regression with a high-order moment estimator (a schematic equation follows the abstract).
  • results: UCRL-WVTR attains a sharp regret bound, matching the minimax lower bound for linear mixture MDPs up to logarithmic factors, and is computationally efficient given a regression oracle; comprehensive experiments corroborate the theory.
    Abstract To tackle long planning horizon problems in reinforcement learning with general function approximation, we propose the first algorithm, termed as UCRL-WVTR, that achieves both \emph{horizon-free} and \emph{instance-dependent}, since it eliminates the polynomial dependency on the planning horizon. The derived regret bound is deemed \emph{sharp}, as it matches the minimax lower bound when specialized to linear mixture MDPs up to logarithmic factors. Furthermore, UCRL-WVTR is \emph{computationally efficient} with access to a regression oracle. The achievement of such a horizon-free, instance-dependent, and sharp regret bound hinges upon (i) novel algorithm designs: weighted value-targeted regression and a high-order moment estimator in the context of general function approximation; and (ii) fine-grained analyses: a novel concentration bound of weighted non-linear least squares and a refined analysis which leads to the tight instance-dependent bound. We also conduct comprehensive experiments to corroborate our theoretical findings.
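To make the "weighted value-targeted regression" ingredient concrete, a schematic form of the per-iteration estimator is shown below; this is our hedged paraphrase of the abstract, not the paper's exact notation:

```latex
% Regress the current value estimate of the next state onto (s, a),
% down-weighting high-variance transitions. \bar{\sigma}_h comes from the
% high-order moment estimator; \mathrm{reg}(f) is a regularizer.
\hat{f}_k \in \arg\min_{f \in \mathcal{F}}
  \sum_{h} \frac{1}{\bar{\sigma}_h^{2}}
  \left( f(s_h, a_h) - V_k(s_{h+1}) \right)^{2}
  + \lambda\, \mathrm{reg}(f)
```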

Privacy-preserving quantum federated learning via gradient hiding

  • paper_url: http://arxiv.org/abs/2312.04447
  • repo_url: None
  • paper_authors: Changhao Li, Niraj Kumar, Zhixin Song, Shouvanik Chakrabarti, Marco Pistoia
  • for: Addressing privacy in distributed quantum computing, specifically federated learning (FL) settings where client data can leak to the server through gradient inversion attacks.
  • methods: Proposes two new quantum protocols with quantum communication, one based on private inner-product estimation and the other on incremental learning; both conceal gradient information in quantum states to strengthen privacy.
  • results: The protocols are more efficient than prior approaches based on expressive variational quantum circuits or differential privacy, offering strong privacy protection with low communication resources and laying groundwork for secure distributed quantum machine learning.
    Abstract Distributed quantum computing, particularly distributed quantum machine learning, has gained substantial prominence for its capacity to harness the collective power of distributed quantum resources, transcending the limitations of individual quantum nodes. Meanwhile, the critical concern of privacy within distributed computing protocols remains a significant challenge, particularly in standard classical federated learning (FL) scenarios where data of participating clients is susceptible to leakage via gradient inversion attacks by the server. This paper presents innovative quantum protocols with quantum communication designed to address the FL problem, strengthen privacy measures, and optimize communication efficiency. In contrast to previous works that leverage expressive variational quantum circuits or differential privacy techniques, we consider gradient information concealment using quantum states and propose two distinct FL protocols, one based on private inner-product estimation and the other on incremental learning. These protocols offer substantial advancements in privacy preservation with low communication resources, forging a path toward efficient quantum communication-assisted FL protocols and contributing to the development of secure distributed quantum machine learning, thus addressing critical privacy concerns in the quantum computing era.

FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning

  • paper_url: http://arxiv.org/abs/2312.04432
  • repo_url: None
  • paper_authors: Hossein Fereidooni, Alessandro Pegoraro, Phillip Rieger, Alexandra Dmitrienko, Ahmad-Reza Sadeghi
  • for: Defending federated learning against poisoning attacks.
  • methods: Uses frequency-domain techniques to detect and filter out poisoned model updates (sketched below).
  • results: Effectively mitigates poisoning attacks with negligible impact on the utility of the aggregated model.
    Abstract Federated learning (FL) is a collaborative learning paradigm allowing multiple clients to jointly train a model without sharing their training data. However, FL is susceptible to poisoning attacks, in which the adversary injects manipulated model updates into the federated model aggregation process to corrupt or destroy predictions (untargeted poisoning) or implant hidden functionalities (targeted poisoning or backdoors). Existing defenses against poisoning attacks in FL have several limitations, such as relying on specific assumptions about attack types and strategies or data distributions or not sufficiently robust against advanced injection techniques and strategies and simultaneously maintaining the utility of the aggregated model. To address the deficiencies of existing defenses, we take a generic and completely different approach to detect poisoning (targeted and untargeted) attacks. We present FreqFed, a novel aggregation mechanism that transforms the model updates (i.e., weights) into the frequency domain, where we can identify the core frequency components that inherit sufficient information about weights. This allows us to effectively filter out malicious updates during local training on the clients, regardless of attack types, strategies, and clients' data distributions. We extensively evaluate the efficiency and effectiveness of FreqFed in different application domains, including image classification, word prediction, IoT intrusion detection, and speech recognition. We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
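A rough sketch of the frequency-domain filtering idea follows. This is not the authors' implementation: the choice of DCT, the low-frequency cutoff, and the two-cluster majority rule are illustrative assumptions:

```python
import numpy as np
from scipy.fft import dct
from sklearn.cluster import KMeans

def filter_and_aggregate(updates, n_low_freq=1000):
    """Aggregate only the model updates whose low-frequency fingerprints agree.

    updates: (n_clients, n_params) array of flattened weight updates.
    """
    # Transform each update to the frequency domain; the low-frequency
    # components carry most of the information about the weights.
    fingerprints = dct(updates, axis=1, norm="ortho")[:, :n_low_freq]
    # Split clients into two groups and treat the larger group as benign.
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(fingerprints)
    benign = labels == np.bincount(labels).argmax()
    return updates[benign].mean(axis=0)
```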

Monitoring Sustainable Global Development Along Shared Socioeconomic Pathways

  • paper_url: http://arxiv.org/abs/2312.04416
  • repo_url: None
  • paper_authors: Michelle W. L. Wan, Jeffrey N. Clark, Edward A. Small, Elena Fillola Mayoral, Raúl Santos-Rodríguez
  • for: Monitoring sustainable global development.
  • methods: Mathematically derived scoring algorithms and machine learning methods.
  • results: An initial study yields promising results, demonstrating the feasibility of monitoring sustainable development along the Shared Socioeconomic Pathways.
    Abstract Sustainable global development is one of the most prevalent challenges facing the world today, hinging on the equilibrium between socioeconomic growth and environmental sustainability. We propose approaches to monitor and quantify sustainable development along the Shared Socioeconomic Pathways (SSPs), including mathematically derived scoring algorithms, and machine learning methods. These integrate socioeconomic and environmental datasets, to produce an interpretable metric for SSP alignment. An initial study demonstrates promising results, laying the groundwork for the application of different methods to the monitoring of sustainable global development.

On the Impact of Multi-dimensional Local Differential Privacy on Fairness

  • paper_url: http://arxiv.org/abs/2312.04404
  • repo_url: https://github.com/karimamakhlouf/impact_of_ldp_on_fairness
  • paper_authors: Karima Makhlouf, Heber H. Arcolezi, Sami Zhioua, Ghassen Ben Brahim, Catuscia Palamidessi
  • for: Examining how local differential privacy (LDP) over several sensitive attributes (multi-dimensional data) affects fairness.
  • methods: Detailed empirical analysis of multi-dimensional LDP on synthetic and benchmark datasets (a randomized-response sketch follows the abstract).
  • results: Multi-dimensional LDP is an efficient approach to reduce disparity; the choice between independent and combined multi-dimensional LDP matters only at low privacy guarantees; and the outcome Y distribution has an important effect on which group is more sensitive to the obfuscation.
    Abstract Automated decision systems are increasingly used to make consequential decisions in people's lives. Due to the sensitivity of the manipulated data as well as the resulting decisions, several ethical concerns need to be addressed for the appropriate use of such technologies, in particular, fairness and privacy. Unlike previous work, which focused on centralized differential privacy (DP) or local DP (LDP) for a single sensitive attribute, in this paper, we examine the impact of LDP in the presence of several sensitive attributes (i.e., multi-dimensional data) on fairness. Detailed empirical analysis on synthetic and benchmark datasets revealed very relevant observations. In particular, (1) multi-dimensional LDP is an efficient approach to reduce disparity, (2) the multi-dimensional approach of LDP (independent vs. combined) matters only at low privacy guarantees, and (3) the outcome Y distribution has an important effect on which group is more sensitive to the obfuscation. Last, we summarize our findings in the form of recommendations to guide practitioners in adopting effective privacy-preserving practices while maintaining fairness and utility in ML applications.
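To illustrate the independent-vs-combined distinction the paper studies, here is a minimal sketch using generic k-ary randomized response; the budget-splitting rule is the standard one, not code from the paper:

```python
from itertools import product
import numpy as np

def k_rr(value, domain, epsilon, rng):
    """k-ary randomized response: keep the true value with probability
    e^eps / (e^eps + k - 1), otherwise report a uniform other value."""
    k = len(domain)
    if rng.random() < np.exp(epsilon) / (np.exp(epsilon) + k - 1):
        return value
    others = [v for v in domain if v != value]
    return others[rng.integers(len(others))]

def independent_ldp(record, domains, epsilon, rng):
    """Perturb each of the d attributes separately with budget eps/d."""
    eps_attr = epsilon / len(record)
    return [k_rr(v, dom, eps_attr, rng) for v, dom in zip(record, domains)]

def combined_ldp(record, domains, epsilon, rng):
    """Perturb the whole record once over the product domain with budget eps."""
    return list(k_rr(tuple(record), list(product(*domains)), epsilon, rng))

rng = np.random.default_rng(0)
domains = [["A", "B"], ["X", "Y", "Z"]]
print(independent_ldp(["A", "Z"], domains, epsilon=1.0, rng=rng))
print(combined_ldp(["A", "Z"], domains, epsilon=1.0, rng=rng))
```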

Semi-Supervised Active Learning for Semantic Segmentation in Unknown Environments Using Informative Path Planning

  • paper_url: http://arxiv.org/abs/2312.04402
  • repo_url: None
  • paper_authors: Julius Rückin, Federico Magistri, Cyrill Stachniss, Marija Popović
  • for: Improving robot vision so autonomous robots can carry out missions in initially unknown environments.
  • methods: An adaptive map-based planner steers data collection toward frontiers of unexplored space with high model uncertainty, and the sparse human labels gathered there are combined with pseudo labels automatically extracted from highly certain map areas (sketched below).
  • results: Reaches segmentation performance close to fully supervised approaches with drastically reduced human labeling effort, while outperforming self-supervised methods.
    Abstract Semantic segmentation enables robots to perceive and reason about their environments beyond geometry. Most of such systems build upon deep learning approaches. As autonomous robots are commonly deployed in initially unknown environments, pre-training on static datasets cannot always capture the variety of domains and limits the robot's perception performance during missions. Recently, self-supervised and fully supervised active learning methods emerged to improve a robot's vision. These approaches rely on large in-domain pre-training datasets or require substantial human labelling effort. We propose a planning method for semi-supervised active learning of semantic segmentation that substantially reduces human labelling requirements compared to fully supervised approaches. We leverage an adaptive map-based planner guided towards the frontiers of unexplored space with high model uncertainty collecting training data for human labelling. A key aspect of our approach is to combine the sparse high-quality human labels with pseudo labels automatically extracted from highly certain environment map areas. Experimental results show that our method reaches segmentation performance close to fully supervised approaches with drastically reduced human labelling effort while outperforming self-supervised approaches.
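The label-mixing step can be sketched in a few lines; the softmax-confidence threshold below is an illustrative stand-in for whatever certainty measure the map maintains:

```python
import numpy as np

def build_training_targets(probs, human_labels, conf_threshold=0.95):
    """Combine sparse human labels with pseudo labels from confident predictions.

    probs: (n_pixels, n_classes) softmax output of the current model.
    human_labels: (n_pixels,) int array with -1 where no human label exists.
    Returns per-pixel targets; -1 marks pixels excluded from the loss.
    """
    targets = human_labels.copy()
    unlabeled = human_labels == -1
    confident = probs.max(axis=1) >= conf_threshold
    # Pseudo-label only the unlabeled pixels the model is highly certain about.
    sel = unlabeled & confident
    targets[sel] = probs[sel].argmax(axis=1)
    return targets
```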

A Scalable Network-Aware Multi-Agent Reinforcement Learning Framework for Decentralized Inverter-based Voltage Control

  • paper_url: http://arxiv.org/abs/2312.04371
  • repo_url: None
  • paper_authors: Han Xu, Jialin Zheng, Guannan Qu
  • for: Addressing the challenges of decentralized voltage control in power grids with a growing number of distributed generations (DGs).
  • methods: Proposes a scalable network-aware (SNA) framework that exploits the network structure to truncate the input to the critic's Q-function, improving scalability and reducing communication costs during training (sketched below).
  • results: Successfully demonstrated on a system with 114 DGs, providing a promising solution for decentralized voltage control in increasingly complex power grid systems.
    Abstract This paper addresses the challenges associated with decentralized voltage control in power grids due to an increase in distributed generations (DGs). Traditional model-based voltage control methods struggle with the rapid energy fluctuations and uncertainties of these DGs. While multi-agent reinforcement learning (MARL) has shown potential for decentralized secondary control, scalability issues arise when dealing with a large number of DGs. This problem lies in the dominant centralized training and decentralized execution (CTDE) framework, where the critics take global observations and actions. To overcome these challenges, we propose a scalable network-aware (SNA) framework that leverages network structure to truncate the input to the critic's Q-function, thereby improving scalability and reducing communication costs during training. Further, the SNA framework is theoretically grounded with provable approximation guarantee, and it can seamlessly integrate with multiple multi-agent actor-critic algorithms. The proposed SNA framework is successfully demonstrated in a system with 114 DGs, providing a promising solution for decentralized voltage control in increasingly complex power grid systems.
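The truncation idea can be illustrated with a small sketch: each agent's critic consumes only the observations and actions of its κ-hop neighborhood in the network graph rather than the global state. The masking below is our illustration, not the paper's code:

```python
import numpy as np

def khop_neighbors(adj, agent, kappa):
    """Indices of agents within kappa hops of `agent` (adj: n x n 0/1 matrix)."""
    reach = np.zeros(len(adj), dtype=bool)
    reach[agent] = True
    for _ in range(kappa):
        reach = reach | adj[:, reach].astype(bool).any(axis=1)
    return np.flatnonzero(reach)

def truncated_critic_input(obs, acts, adj, agent, kappa):
    """Concatenate only the kappa-hop neighborhood's observations and actions."""
    idx = khop_neighbors(adj, agent, kappa)
    return np.concatenate([obs[idx].ravel(), acts[idx].ravel()])
```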

Investigating the Design Space of Diffusion Models for Speech Enhancement

  • paper_url: http://arxiv.org/abs/2312.04370
  • repo_url: None
  • paper_authors: Philippe Gonzalez, Zheng-Hua Tan, Jan Østergaard, Jesper Jensen, Tommy Sonne Alstrøm, Tobias May
  • for: Extending diffusion models to speech enhancement to improve enhancement performance.
  • methods: Extends a diffusion model framework from the image generation literature to account for the progressive transformation between clean and noisy speech, and systematically investigates design aspects such as preconditioning, training loss weighting, the SDE, and the sampler (a representative SDE follows the abstract).
  • results: With proper preconditioning, training loss weighting, SDE, and sampler, the system outperforms a popular diffusion-based speech enhancement baseline on perceptual metrics while using fewer sampling steps, reducing computational cost by a factor of four.
    Abstract Diffusion models are a new class of generative models that have shown outstanding performance in image generation literature. As a consequence, studies have attempted to apply diffusion models to other tasks, such as speech enhancement. A popular approach in adapting diffusion models to speech enhancement consists in modelling a progressive transformation between the clean and noisy speech signals. However, one popular diffusion model framework previously laid in image generation literature did not account for such a transformation towards the system input, which prevents from relating the existing diffusion-based speech enhancement systems with the aforementioned diffusion model framework. To address this, we extend this framework to account for the progressive transformation between the clean and noisy speech signals. This allows us to apply recent developments from image generation literature, and to systematically investigate design aspects of diffusion models that remain largely unexplored for speech enhancement, such as the neural network preconditioning, the training loss weighting, the stochastic differential equation (SDE), or the amount of stochasticity injected in the reverse process. We show that the performance of previous diffusion-based speech enhancement systems cannot be attributed to the progressive transformation between the clean and noisy speech signals. Moreover, we show that a proper choice of preconditioning, training loss weighting, SDE and sampler allows to outperform a popular diffusion-based speech enhancement system in terms of perceptual metrics while using fewer sampling steps, thus reducing the computational cost by a factor of four.
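For orientation, one forward SDE commonly used in the diffusion-based speech enhancement literature (quoted as a representative example of the progressive clean-to-noisy transformation, not necessarily the paper's final choice) pulls the clean speech toward the noisy speech $y$:

```latex
% Ornstein-Uhlenbeck drift toward y, with an exponentially growing
% variance-exploding diffusion term.
\mathrm{d}x_t = \gamma\,(y - x_t)\,\mathrm{d}t
  + \sigma_{\min}\!\left(\frac{\sigma_{\max}}{\sigma_{\min}}\right)^{\!t}
    \sqrt{2\ln\frac{\sigma_{\max}}{\sigma_{\min}}}\;\mathrm{d}w_t
```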

NeuJeans: Private Neural Network Inference with Joint Optimization of Convolution and Bootstrapping

  • paper_url: http://arxiv.org/abs/2312.04356
  • repo_url: None
  • paper_authors: Jae Hyung Ju, Jaiyoung Park, Jongmin Kim, Donghwan Kim, Jung Ho Ahn
  • for: Proposing an FHE-based private inference (PI) solution for deep convolutional neural networks (CNNs), so a client can fully offload inference to a cloud server without exposing its data.
  • methods: NeuJeans tackles the high cost of FHE evaluation of conv2d layers, dominated by data reordering and bootstrapping. It introduces an encoding method with nested structures inside encoded vectors, enabling efficient conv2d algorithms with reduced reordering cost, and fuses conv2d with bootstrapping to eliminate the extra conversion computations; optimized execution flows are devised for various conv2d types and applied end to end.
  • results: Accelerates conv2d by up to 5.68x over state-of-the-art FHE-based PI work and performs PI of an ImageNet-scale CNN (ResNet18) within a mere few seconds.
    Abstract Fully homomorphic encryption (FHE) is a promising cryptographic primitive for realizing private neural network inference (PI) services by allowing a client to fully offload the inference task to a cloud server while keeping the client data oblivious to the server. This work proposes NeuJeans, an FHE-based solution for the PI of deep convolutional neural networks (CNNs). NeuJeans tackles the critical problem of the enormous computational cost for the FHE evaluation of convolutional layers (conv2d), mainly due to the high cost of data reordering and bootstrapping. We first propose an encoding method introducing nested structures inside encoded vectors for FHE, which enables us to develop efficient conv2d algorithms with reduced data reordering costs. However, the new encoding method also introduces additional computations for conversion between encoding methods, which could negate its advantages. We discover that fusing conv2d with bootstrapping eliminates such computations while reducing the cost of bootstrapping. Then, we devise optimized execution flows for various types of conv2d and apply them to end-to-end implementation of CNNs. NeuJeans accelerates the performance of conv2d by up to 5.68 times compared to state-of-the-art FHE-based PI work and performs the PI of a CNN at the scale of ImageNet (ResNet18) within a mere few seconds

Improved Efficient Two-Stage Denoising Diffusion Power System Measurement Recovery Against False Data Injection Attacks and Data Losses

  • paper_url: http://arxiv.org/abs/2312.04346
  • repo_url: None
  • paper_authors: Jianhua Pei, Jingyu Wang, Dongyuan Shi, Ping Wang
  • for: Recovering accurate power system measurements despite false data injection attacks and data losses.
  • methods: An improved two-stage denoising diffusion model (TSDM) combining a classifier-guided conditional anomaly detection component with a diffusion-based measurement imputation component; precise means and optimal variances accelerate diffusion generation through subsequence sampling.
  • results: Extensive numerical case studies show that TSDM accurately recovers measurements under the strong randomness of renewable energy integration and the highly nonlinear dynamics of complex cyber-physical contingencies, with stronger robustness than existing reconstruction networks and lower computational complexity than general denoising diffusion models.
    Abstract Measurement uncertainties, represented by cyber-attacks and data losses, seriously degrade the quality of power system measurements. Fortunately, the powerful generation ability of the denoising diffusion models can enable more precise measurement generation for power system data recovery. However, the controllable data generation and efficient computing methods of denoising diffusion models for deterministic trajectory still need further investigation. To this end, this paper proposes an improved two-stage denoising diffusion model (TSDM) to identify and reconstruct the measurements with various measurement uncertainties. The first stage of the model comprises a classifier-guided conditional anomaly detection component, while the second stage involves diffusion-based measurement imputation component. Moreover, the proposed TSDM adopts precise means and optimal variances to accelerate the diffusion generation process with subsequence sampling. Extensive numerical case studies demonstrate that the proposed TSDM can accurately recover power system measurements despite strong randomness under renewable energy integration and highly nonlinear dynamics under complex cyber-physical contingencies. Additionally, the proposed TSDM has stronger robustness compared to existing reconstruction networks and exhibits lower computational complexity than general denoising diffusion models.

Learning to sample in Cartesian MRI

  • paper_url: http://arxiv.org/abs/2312.04327
  • repo_url: None
  • paper_authors: Thomas Sanchez
  • for: Shortening MRI scan times to increase patient comfort, decrease examination costs, and improve throughput.
  • methods: Uses compressed sensing (CS) and deep learning to accelerate acquisition, proposing two algorithms, lazy LBCS and stochastic LBCS, to optimize acquisition trajectories (sketched below).
  • results: Lazy LBCS and stochastic LBCS significantly improve upon Gözcü et al.'s greedy learning-based CS (LBCS) and scale to large, clinically relevant scenarios such as multi-coil 3D MR and dynamic MRI; the thesis also shows that generative adversarial networks (GANs) can serve as a natural criterion for adaptive sampling by leveraging variance in the measurement domain to guide acquisition.
    Abstract Despite its exceptional soft tissue contrast, Magnetic Resonance Imaging (MRI) faces the challenge of long scanning times compared to other modalities like X-ray radiography. Shortening scanning times is crucial in clinical settings, as it increases patient comfort, decreases examination costs and improves throughput. Recent advances in compressed sensing (CS) and deep learning allow accelerated MRI acquisition by reconstructing high-quality images from undersampled data. While reconstruction algorithms have received most of the focus, designing acquisition trajectories to optimize reconstruction quality remains an open question. This thesis explores two approaches to address this gap in the context of Cartesian MRI. First, we propose two algorithms, lazy LBCS and stochastic LBCS, that significantly improve upon G\"ozc\"u et al.'s greedy learning-based CS (LBCS) approach. These algorithms scale to large, clinically relevant scenarios like multi-coil 3D MR and dynamic MRI, previously inaccessible to LBCS. Additionally, we demonstrate that generative adversarial networks (GANs) can serve as a natural criterion for adaptive sampling by leveraging variance in the measurement domain to guide acquisition. Second, we delve into the underlying structures or assumptions that enable mask design algorithms to perform well in practice. Our experiments reveal that state-of-the-art deep reinforcement learning (RL) approaches, while capable of adaptation and long-horizon planning, offer only marginal improvements over stochastic LBCS, which is neither adaptive nor does long-term planning. Altogether, our findings suggest that stochastic LBCS and similar methods represent promising alternatives to deep RL. They shine in particular by their scalability and computational efficiency and could be key in the deployment of optimized acquisition trajectories in Cartesian MRI.
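A compact sketch of the stochastic greedy selection behind stochastic LBCS (our paraphrase of the general recipe; `reconstruct` and the error metric are placeholders for the chosen reconstruction method and image-quality measure):

```python
import numpy as np

def stochastic_greedy_mask(kspace, targets, candidates, budget,
                           n_lines, n_scans, reconstruct, rng):
    """Greedily grow a Cartesian sampling mask; each step scores only a
    random subset of candidate lines on a random subset of training scans."""
    mask = set()
    for _ in range(budget):
        remaining = list(candidates - mask)
        trial = rng.choice(remaining, size=min(n_lines, len(remaining)),
                           replace=False)
        scans = rng.choice(len(kspace), size=min(n_scans, len(kspace)),
                           replace=False)

        def cost(line):
            m = mask | {int(line)}
            return sum(np.linalg.norm(reconstruct(kspace[i], m) - targets[i])
                       for i in scans)

        mask.add(int(min(trial, key=cost)))
    return mask
```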

Equivariant Scalar Fields for Molecular Docking with Fast Fourier Transforms

  • paper_url: http://arxiv.org/abs/2312.04323
  • repo_url: https://github.com/bjing2016/scalar-fields
  • paper_authors: Bowen Jing, Tommi Jaakkola, Bonnie Berger
  • For: Researchers and practitioners in molecular docking and virtual screening who are interested in accelerating the optimization of scoring functions using machine learning.
  • Methods: Learns a scoring function defined as the cross-correlation of multi-channel ligand and protein scalar fields parameterized by equivariant graph neural networks, a functional form that allows rapid optimization over rigid-body degrees of freedom with fast Fourier transforms (see the sketch after the abstract).
  • Results: Benchmarked on two simplified docking-related tasks, decoy pose scoring and rigid conformer docking, the method attains similar but faster performance on crystal structures compared with the widely used Vina and Gnina scoring functions, and is more robust on computationally predicted structures.
    Abstract Molecular docking is critical to structure-based virtual screening, yet the throughput of such workflows is limited by the expensive optimization of scoring functions involved in most docking algorithms. We explore how machine learning can accelerate this process by learning a scoring function with a functional form that allows for more rapid optimization. Specifically, we define the scoring function to be the cross-correlation of multi-channel ligand and protein scalar fields parameterized by equivariant graph neural networks, enabling rapid optimization over rigid-body degrees of freedom with fast Fourier transforms. The runtime of our approach can be amortized at several levels of abstraction, and is particularly favorable for virtual screening settings with a common binding pocket. We benchmark our scoring functions on two simplified docking-related tasks: decoy pose scoring and rigid conformer docking. Our method attains similar but faster performance on crystal structures compared to the widely-used Vina and Gnina scoring functions, and is more robust on computationally predicted structures. Code is available at https://github.com/bjing2016/scalar-fields.
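The core trick, scoring all rigid translations at once with FFTs, can be sketched as generic FFT cross-correlation over voxelized fields. In the paper the fields are produced by equivariant graph neural networks; here they are simply given arrays:

```python
import numpy as np

def translation_scores(protein_fields, ligand_fields):
    """Score every rigid translation of the ligand against the protein.

    protein_fields, ligand_fields: (C, N, N, N) multi-channel scalar fields
    on a common voxel grid. Entry t of the result is
    sum_c sum_x protein_c(x) * ligand_c(x - t), via the correlation theorem.
    """
    P = np.fft.fftn(protein_fields, axes=(-3, -2, -1))
    L = np.fft.fftn(ligand_fields, axes=(-3, -2, -1))
    corr = np.fft.ifftn(P * np.conj(L), axes=(-3, -2, -1)).real
    return corr.sum(axis=0)  # sum the per-channel correlations

# Usage: the argmax of the returned grid is the best translation; rotations
# would be handled by repeating this over a set of rotated ligand fields.
# t_best = np.unravel_index(translation_scores(P_f, L_f).argmax(), grid_shape)
```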

Stochastic-Constrained Stochastic Optimization with Markovian Data

  • paper_url: http://arxiv.org/abs/2312.04312
  • repo_url: None
  • paper_authors: Yeongjong Kim, Dabeen Lee
  • for: Studying stochastic-constrained stochastic optimization, where the expectation of a random function must stay below a threshold and the data samples are drawn from a Markov chain, hence not i.i.d.
  • methods: Generalizes the drift-plus-penalty framework, a primal-dual stochastic gradient method developed for the i.i.d. case, to Markov chain sampling; two variants are proposed, one for known mixing time and one for unknown mixing time, and the algorithms apply to the more general setting of constrained online convex optimization where the constraint functions follow a Markov chain (one iteration is sketched below).
  • results: Numerical experiments on classification with fairness constraints demonstrate the effectiveness of the proposed methods.
    Abstract This paper considers stochastic-constrained stochastic optimization where the stochastic constraint is to satisfy that the expectation of a random function is below a certain threshold. In particular, we study the setting where data samples are drawn from a Markov chain and thus are not independent and identically distributed. We generalize the drift-plus-penalty framework, a primal-dual stochastic gradient method developed for the i.i.d. case, to the Markov chain sampling setting. We propose two variants of drift-plus-penalty; one is for the case when the mixing time of the underlying Markov chain is known while the other is for the case of unknown mixing time. In fact, our algorithms apply to a more general setting of constrained online convex optimization where the sequence of constraint functions follows a Markov chain. Both algorithms are adaptive in that the first works without knowledge of the time horizon while the second uses AdaGrad-style algorithm parameters, which is of independent interest. We demonstrate the effectiveness of our proposed methods through numerical experiments on classification with fairness constraints.
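A minimal sketch of one drift-plus-penalty iteration, the generic primal-dual template the paper builds on (the step size eta and penalty weight V are placeholders):

```python
def drift_plus_penalty_step(x, Q, grad_f, g, grad_g, eta, V):
    """One primal-dual step for min E[f(x)] subject to E[g(x)] <= 0.

    x: decision variable (e.g., a numpy array); Q: virtual queue >= 0
    tracking cumulative constraint violation; grad_f, grad_g, g: stochastic
    quantities evaluated on the current (possibly Markovian) sample.
    """
    x_new = x - eta * (V * grad_f(x) + Q * grad_g(x))  # descend on V*f + Q*g
    Q_new = max(0.0, Q + g(x_new))                     # virtual queue update
    return x_new, Q_new
```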

Finding Interpretable Class-Specific Patterns through Efficient Neural Search

  • paper_url: http://arxiv.org/abs/2312.04311
  • repo_url: None
  • paper_authors: Nils Philipp Walter, Jonas Fischer, Jilles Vreeken
  • for: Discovering features whose patterns best describe the differences between classes, enabling hypotheses about class-specific mechanisms; in molecular biology this could advance the understanding of cellular processes and lead to novel treatments.
  • methods: Proposes DIFFNAPS, a novel, inherently interpretable binary neural network architecture that extracts differential patterns from data; it scales to hundreds of thousands of features and is robust to noise, overcoming the limitations of current methods in large-scale applications.
  • results: On synthetic and real-world data, including three biological applications, DiffNaps consistently yields accurate, succinct, and interpretable class descriptions, unlike its competitors.
    Abstract Discovering patterns in data that best describe the differences between classes allows to hypothesize and reason about class-specific mechanisms. In molecular biology, for example, this bears promise of advancing the understanding of cellular processes differing between tissues or diseases, which could lead to novel treatments. To be useful in practice, methods that tackle the problem of finding such differential patterns have to be readily interpretable by domain experts, and scalable to the extremely high-dimensional data. In this work, we propose a novel, inherently interpretable binary neural network architecture DIFFNAPS that extracts differential patterns from data. DiffNaps is scalable to hundreds of thousands of features and robust to noise, thus overcoming the limitations of current state-of-the-art methods in large-scale applications such as in biology. We show on synthetic and real world data, including three biological applications, that, unlike its competitors, DiffNaps consistently yields accurate, succinct, and interpretable class descriptions

A Structural-Clustering Based Active Learning for Graph Neural Networks

  • paper_url: http://arxiv.org/abs/2312.04307
  • repo_url: https://github.com/rickymaulanafajri/spa
  • paper_authors: Ricky Maulana Fajri, Yulong Pei, Lu Yin, Mykola Pechenizkiy
  • for: Proposing the Structural-Clustering PageRank method for improved Active learning (SPA) to make active learning for Graph Neural Networks (GNNs) on graph-structured data more effective.
  • methods: Integrates community detection via the SCAN algorithm with PageRank scoring for efficient and informative sample selection, prioritizing nodes that are both informative and structurally central (sketched below).
  • results: Experiments show higher accuracy and macro-F1 score than existing methods across different annotation budgets, together with significant reductions in query time; the method adds only two hyperparameters, ε and μ, to balance structural learning and node selection.
    Abstract In active learning for graph-structured data, Graph Neural Networks (GNNs) have shown effectiveness. However, a common challenge in these applications is the underutilization of crucial structural information. To address this problem, we propose the Structural-Clustering PageRank method for improved Active learning (SPA) specifically designed for graph-structured data. SPA integrates community detection using the SCAN algorithm with the PageRank scoring method for efficient and informative sample selection. SPA prioritizes nodes that are not only informative but also central in structure. Through extensive experiments, SPA demonstrates higher accuracy and macro-F1 score over existing methods across different annotation budgets and achieves significant reductions in query time. In addition, the proposed method only adds two hyperparameters, $\epsilon$ and $\mu$ in the algorithm to finely tune the balance between structural learning and node selection. This simplicity is a key advantage in active learning scenarios, where extensive hyperparameter tuning is often impractical.
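A sketch of the selection logic, using networkx with a generic community-detection routine as a stand-in for SCAN (which has no stock networkx implementation); the ε and μ tuning from the paper is omitted:

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def spa_select(G, budget):
    """Pick nodes that are both informative and structurally central:
    rank nodes by PageRank within each community, then take the communities'
    top nodes round-robin until the labeling budget is exhausted."""
    pr = nx.pagerank(G)
    ranked = [sorted(c, key=pr.get, reverse=True)
              for c in greedy_modularity_communities(G)]
    selected, level = [], 0
    while len(selected) < budget and any(level < len(r) for r in ranked):
        for r in ranked:
            if level < len(r) and len(selected) < budget:
                selected.append(r[level])
        level += 1
    return selected
```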

Simulating the Air Quality Impact of Prescribed Fires Using a Graph Neural Network-Based PM$_{2.5}$ Emissions Forecasting System

  • paper_url: http://arxiv.org/abs/2312.04291
  • repo_url: None
  • paper_authors: Kyleen Liao, Jatan Buch, Kara Lamb, Pierre Gentine
  • for: Forecasting the PM$_{2.5}$ pollution produced by wildfires and prescribed fires in western North America.
  • methods: A novel integration of prescribed fire simulation with a spatio-temporal graph neural network-based PM$_{2.5}$ forecasting model.
  • results: The experiments determine the optimal time for implementing prescribed fires in California and quantify the potential air quality trade-offs of conducting more prescribed fires outside the fire season.
    Abstract The increasing size and severity of wildfires across western North America have generated dangerous levels of PM$_{2.5}$ pollution in recent years. In a warming climate, expanding the use of prescribed fires is widely considered to be the most robust fire mitigation strategy. However, reliably forecasting the potential air quality impact from these prescribed fires, a critical ingredient in determining the fires' location and time, at hourly to daily time scales remains a challenging problem. This paper proposes a novel integration of prescribed fire simulation with a spatio-temporal graph neural network-based PM$_{2.5}$ forecasting model. The experiments in this work focus on determining the optimal time for implementing prescribed fires in California as well as quantifying the potential air quality trade-offs involved in conducting more prescribed fires outside the fire season.

Factor-Assisted Federated Learning for Personalized Optimization with Heterogeneous Data

  • paper_url: http://arxiv.org/abs/2312.04281
  • repo_url: None
  • paper_authors: Feifei Wang, Huiyun Tang, Yang Li
  • for: Handling data heterogeneity in federated learning, a distributed machine learning framework designed to protect data privacy.
  • methods: Proposes FedSplit, a personalized federated learning framework motivated by the finding that client data contain both common and personalized knowledge; the hidden elements of each neural layer are split into shared and personalized groups, and a new objective function over this decomposition is established and optimized.
  • results: FedSplit converges faster than standard federated learning, both theoretically and empirically, and its generalization bound is studied; to implement the method on real data, factor analysis is introduced to decouple the hidden elements, yielding the FedFac model, whose superior prediction performance is verified against various state-of-the-art federated learning methods on several real datasets.
    Abstract Federated learning is an emerging distributed machine learning framework aiming at protecting data privacy. Data heterogeneity is one of the core challenges in federated learning, which could severely degrade the convergence rate and prediction performance of deep neural networks. To address this issue, we develop a novel personalized federated learning framework for heterogeneous data, which we refer to as FedSplit. This modeling framework is motivated by the finding that, data in different clients contain both common knowledge and personalized knowledge. Then the hidden elements in each neural layer can be split into the shared and personalized groups. With this decomposition, a novel objective function is established and optimized. We demonstrate FedSplit enjoyers a faster convergence speed than the standard federated learning method both theoretically and empirically. The generalization bound of the FedSplit method is also studied. To practically implement the proposed method on real datasets, factor analysis is introduced to facilitate the decoupling of hidden elements. This leads to a practically implemented model for FedSplit and we further refer to as FedFac. We demonstrated by simulation studies that, using factor analysis can well recover the underlying shared/personalized decomposition. The superior prediction performance of FedFac is further verified empirically by comparison with various state-of-the-art federated learning methods on several real datasets.

Estimating Countries with Similar Maternal Mortality Rate using Cluster Analysis and Pairing Countries with Identical MMR

  • paper_url: http://arxiv.org/abs/2312.04275
  • repo_url: None
  • paper_authors: S. Nandini, Sanjjushri Varshini R
  • for: Analyzing maternal mortality rates (MMR) across countries and identifying which factors, such as human routines and hospital facilities, affect maternal health.
  • methods: Applies unsupervised machine learning, specifically cluster analysis, on historical data from various countries to find countries with similar MMR (sketched below).
  • results: Identifies pairs of countries with similar MMR as well as pairs at opposite extremes, distinguishing countries with comparatively high MMR from those with comparatively low MMR.
    Abstract In the evolving world, we require more additionally the young era to flourish and evolve into developed land. Most of the population all around the world are unaware of the complications involved in the routine they follow while they are pregnant and how hospital facilities affect maternal health. Maternal Mortality is the death of a pregnant woman due to intricacies correlated to pregnancy, underlying circumstances exacerbated by the pregnancy or management of these situations. It is crucial to consider the Maternal Mortality Rate (MMR) in diverse locations and determine which human routines and hospital facilities diminish the Maternal Mortality Rate (MMR). This research aims to examine and discover the countries which are keeping more lavish threats of MMR and countries alike in MMR encountered. Data is examined and collected for various countries, data consists of the earlier years' observation. From the perspective of Machine Learning, Unsupervised Machine Learning is implemented to perform Cluster Analysis. Therefore the pairs of countries with similar MMR as well as the extreme opposite pair concerning the MMR are found.
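The clustering-and-pairing pipeline is simple enough to sketch end to end; the column names are hypothetical and KMeans is one reasonable choice for the unsupervised step:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical input: one row per country, columns = MMR in earlier years.
df = pd.DataFrame({"country": ["A", "B", "C", "D"],
                   "mmr_2015": [830, 12, 790, 15],
                   "mmr_2017": [810, 11, 760, 13]})
X = df.filter(like="mmr_").to_numpy(dtype=float)

# Group countries with similar MMR profiles.
df["cluster"] = KMeans(n_clusters=2, n_init=10).fit_predict(X)

# Pair each country with its nearest neighbor in MMR space ...
dist = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
np.fill_diagonal(dist, np.inf)
print([(df.country[i], df.country[dist[i].argmin()]) for i in range(len(df))])

# ... and find the most extreme opposite pair.
np.fill_diagonal(dist, -np.inf)
i, j = np.unravel_index(dist.argmax(), dist.shape)
print("most opposite:", df.country[i], df.country[j])
```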

Invariant Random Forest: Tree-Based Model Solution for OOD Generalization

  • paper_url: http://arxiv.org/abs/2312.04273
  • repo_url: None
  • paper_authors: Yufan Liao, Qi Wu, Xing Yan
  • for: Out-of-distribution (OOD) generalization for decision tree models.
  • methods: Proposes the Invariant Decision Tree (IDT), a new and effective method that adds a penalty term for splits whose behavior is unstable across environments during tree growth; its ensemble version, the Invariant Random Forest (IRF), is also constructed (a schematic split score follows the abstract).
  • results: Numerical tests on synthetic and real datasets show IDT and IRF outperform non-OOD tree models, implying that OOD generalization for tree models is necessary and deserves more attention.
    Abstract Out-Of-Distribution (OOD) generalization is an essential topic in machine learning. However, recent research is only focusing on the corresponding methods for neural networks. This paper introduces a novel and effective solution for OOD generalization of decision tree models, named Invariant Decision Tree (IDT). IDT enforces a penalty term with regard to the unstable/varying behavior of a split across different environments during the growth of the tree. Its ensemble version, the Invariant Random Forest (IRF), is constructed. Our proposed method is motivated by a theoretical result under mild conditions, and validated by numerical tests with both synthetic and real datasets. The superior performance compared to non-OOD tree models implies that considering OOD generalization for tree models is absolutely necessary and should be given more attention.
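One way to read the stability penalty is as a modified split score. The version below is our schematic interpretation of the abstract, with λ a hypothetical trade-off weight:

```python
import numpy as np

def penalized_split_gain(y_left_by_env, y_right_by_env, base_gain, lam=1.0):
    """Score a candidate split: reward impurity reduction, penalize splits
    whose child label means vary across environments.

    y_left_by_env / y_right_by_env: one label array per environment, holding
    the labels each environment routes to the left / right child.
    """
    left_means = np.array([y.mean() for y in y_left_by_env])
    right_means = np.array([y.mean() for y in y_right_by_env])
    instability = left_means.var() + right_means.var()  # cross-env variation
    return base_gain - lam * instability
```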

CODEX: A Cluster-Based Method for Explainable Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2312.04216
  • repo_url: https://github.com/ainfosec/codex
  • paper_authors: Timothy K. Mathes, Jessica Inman, Andrés Colón, Simon Khan
  • for: Increasing the adoption of reinforcement learning (RL) in real-world applications by explaining RL agent behavior and building user trust.
  • methods: Uses semantic clustering to effectively summarize RL agent behavior in the state-action space.
  • results: Experiments on the MiniGrid and StarCraft II environments show that the semantic clusters retain temporal as well as entity information, and that clustering the discrete+continuous game-state latent representations identifies the most crucial episodic events, relating the latent and semantic spaces.
    Abstract Despite the impressive feats demonstrated by Reinforcement Learning (RL), these algorithms have seen little adoption in high-risk, real-world applications due to current difficulties in explaining RL agent actions and building user trust. We present Counterfactual Demonstrations for Explanation (CODEX), a method that incorporates semantic clustering, which can effectively summarize RL agent behavior in the state-action space. Experimentation on the MiniGrid and StarCraft II gaming environments reveals the semantic clusters retain temporal as well as entity information, which is reflected in the constructed summary of agent behavior. Furthermore, clustering the discrete+continuous game-state latent representations identifies the most crucial episodic events, demonstrating a relationship between the latent and semantic spaces. This work contributes to the growing body of work that strives to unlock the power of RL for widespread use by leveraging and extending techniques from Natural Language Processing.

Constrained Hierarchical Clustering via Graph Coarsening and Optimal Cuts

  • paper_url: http://arxiv.org/abs/2312.04209
  • repo_url: None
  • paper_authors: Eliabelle Mauduit, Andrea Simonetto
  • for: Extracting and summarizing relevant information in short-sentence settings such as satisfaction questionnaires, hotel reviews, and X/Twitter.
  • methods: Studies hierarchical clustering of words under horizontal constraints (cannot-link and must-link among words) and vertical constraints (precedence constraints among cluster levels).
  • results: The problem is solved in two steps: a soft-constrained regularized least-squares guides a sequential graph coarsening algorithm toward the horizontal feasible set, and flat clusters are then extracted from the resulting hierarchical tree by computing optimal cut heights from the available constraints; the approach compares very well with existing algorithms and is computationally light.
    Abstract Motivated by extracting and summarizing relevant information in short sentence settings, such as satisfaction questionnaires, hotel reviews, and X/Twitter, we study the problem of clustering words in a hierarchical fashion. In particular, we focus on the problem of clustering with horizontal and vertical structural constraints. Horizontal constraints are typically cannot-link and must-link among words, while vertical constraints are precedence constraints among cluster levels. We overcome state-of-the-art bottlenecks by formulating the problem in two steps: first, as a soft-constrained regularized least-squares which guides the result of a sequential graph coarsening algorithm towards the horizontal feasible set. Then, flat clusters are extracted from the resulting hierarchical tree by computing optimal cut heights based on the available constraints. We show that the resulting approach compares very well with respect to existing algorithms and is computationally light.

Wavelength-multiplexed Delayed Inputs for Memory Enhancement of Microring-based Reservoir Computing

  • paper_url: http://arxiv.org/abs/2312.04204
  • repo_url: None
  • paper_authors: Bernard J. Giron Castro, Christophe Peucheret, Francesco Da Ros
  • for: solves memory-demanding tasks like time-series prediction
  • methods: combines parallel delayed inputs and wavelength division multiplexing
  • results: good performance without requiring external optical feedback
    Abstract We numerically demonstrate a silicon add-drop microring-based reservoir computing scheme that combines parallel delayed inputs and wavelength division multiplexing. The scheme solves memory-demanding tasks like time-series prediction with good performance without requiring external optical feedback.

Coherent energy and force uncertainty in deep learning force fields

  • paper_url: http://arxiv.org/abs/2312.04174
  • repo_url: None
  • paper_authors: Peter Bjørn Jørgensen, Jonas Busk, Ole Winther, Mikkel N. Schmidt
  • for: Proposing a machine learning potential energy model that links the aleatoric uncertainties of energies and forces.
  • methods: An equivariant message passing neural network potential trained on energies and forces from two out-of-equilibrium molecular datasets, with energy and force uncertainty linked through a spatially correlated noise process (schematic equations follow the abstract).
  • results: Yields coherent energy and force aleatoric uncertainties, and additionally obtains epistemic uncertainties from a Bayesian interpretation of deep ensemble models.
    Abstract In machine learning energy potentials for atomic systems, forces are commonly obtained as the negative derivative of the energy function with respect to atomic positions. To quantify aleatoric uncertainty in the predicted energies, a widely used modeling approach involves predicting both a mean and variance for each energy value. However, this model is not differentiable under the usual white noise assumption, so energy uncertainty does not naturally translate to force uncertainty. In this work we propose a machine learning potential energy model in which energy and force aleatoric uncertainty are linked through a spatially correlated noise process. We demonstrate our approach on an equivariant messages passing neural network potential trained on energies and forces on two out-of-equilibrium molecular datasets. Furthermore, we also show how to obtain epistemic uncertainties in this setting based on a Bayesian interpretation of deep ensemble models.
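The link between energy and force uncertainty follows from differentiation. Schematically, in our notation and assuming a stationary covariance kernel k for the noise process:

```latex
% Energy model with spatially correlated aleatoric noise \varepsilon:
E(\mathbf{r}) = \mu_\theta(\mathbf{r}) + \varepsilon(\mathbf{r}),
\qquad
\mathbf{F}(\mathbf{r}) = -\nabla_{\mathbf{r}} E(\mathbf{r})
  = -\nabla_{\mathbf{r}} \mu_\theta(\mathbf{r})
    - \nabla_{\mathbf{r}} \varepsilon(\mathbf{r}).
% With \operatorname{Cov}[\varepsilon(\mathbf{r}), \varepsilon(\mathbf{r}')] = k(\mathbf{r}-\mathbf{r}'),
% the force noise inherits
% \operatorname{Cov}[\partial_i \varepsilon(\mathbf{r}), \partial_j \varepsilon(\mathbf{r}')]
%   = -\,\partial_i \partial_j\, k(\mathbf{r}-\mathbf{r}'),
% so energy and force uncertainties are coherent rather than independent.
```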

A novel feature selection framework for incomplete data

  • paper_url: http://arxiv.org/abs/2312.04171
  • repo_url: None
  • paper_authors: Cong Guo
  • for: Solving feature selection on incomplete datasets with a new framework that takes feature importance into account.
  • methods: Two alternating iterative stages: the M-stage imputes missing values using a given feature importance vector and multiple initial imputation results, and the W-stage learns the feature importance vector from the imputed data with an improved reliefF algorithm; the vector obtained in each W-stage feeds the next M-stage.
  • results: Experiments on artificially generated and real incomplete datasets show the proposed method significantly outperforms other approaches.
    Abstract Feature selection on incomplete datasets is an exceptionally challenging task. Existing methods address this challenge by first employing imputation methods to complete the incomplete data and then conducting feature selection based on the imputed data. Since imputation and feature selection are entirely independent steps, the importance of features cannot be considered during imputation. However, in real-world scenarios or datasets, different features have varying degrees of importance. To address this, we propose a novel incomplete data feature selection framework that considers feature importance. The framework mainly consists of two alternating iterative stages: the M-stage and the W-stage. In the M-stage, missing values are imputed based on a given feature importance vector and multiple initial imputation results. In the W-stage, an improved reliefF algorithm is employed to learn the feature importance vector based on the imputed data. Specifically, the feature importance vector obtained in the current iteration of the W-stage serves as input for the next iteration of the M-stage. Experimental results on both artificially generated and real incomplete datasets demonstrate that the proposed method outperforms other approaches significantly.

Mixture of Dynamical Variational Autoencoders for Multi-Source Trajectory Modeling and Separation

  • paper_url: http://arxiv.org/abs/2312.04167
  • repo_url: None
  • paper_authors: Xiaoyu Lin, Laurent Girin, Xavier Alameda-Pineda
  • for: Modeling the dynamics of a system composed of multiple moving sources with a latent-variable generative model, the mixture of dynamical variational autoencoders (MixDVAE).
  • methods: A DVAE is pre-trained on a single-source dataset to capture the source dynamics; multiple instances of the pre-trained DVAE are then integrated into a multi-source mixture model with a discrete observation-to-source assignment latent variable, and the posterior distributions are estimated with a variational expectation-maximization algorithm.
  • results: Works well on two tasks, multi-object tracking (computer vision) and single-channel audio source separation (audio processing), outperforming several baseline methods.
    Abstract In this paper, we propose a latent-variable generative model called mixture of dynamical variational autoencoders (MixDVAE) to model the dynamics of a system composed of multiple moving sources. A DVAE model is pre-trained on a single-source dataset to capture the source dynamics. Then, multiple instances of the pre-trained DVAE model are integrated into a multi-source mixture model with a discrete observation-to-source assignment latent variable. The posterior distributions of both the discrete observation-to-source assignment variable and the continuous DVAE variables representing the sources content/position are estimated using a variational expectation-maximization algorithm, leading to multi-source trajectories estimation. We illustrate the versatility of the proposed MixDVAE model on two tasks: a computer vision task, namely multi-object tracking, and an audio processing task, namely single-channel audio source separation. Experimental results show that the proposed method works well on these two tasks, and outperforms several baseline methods.
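The discrete observation-to-source assignment can be illustrated with a toy E-step: given each pre-trained source model's predicted mean and variance for an observation, compute the posterior probability of each source. The Gaussian form and all names are assumptions made for illustration; the paper embeds this step in a full variational EM scheme.

```python
import numpy as np

def assignment_posterior(obs, means, variances, priors):
    # obs: (d,) observation; means/variances: (K, d) per-source predictions.
    log_lik = -0.5 * np.sum(
        np.log(2 * np.pi * variances) + (obs - means) ** 2 / variances, axis=1)
    log_post = np.log(priors) + log_lik
    log_post -= log_post.max()                 # numerical stability
    post = np.exp(log_post)
    return post / post.sum()                   # responsibilities over K sources

post = assignment_posterior(
    obs=np.zeros(2),
    means=np.array([[0.0, 0.0], [3.0, 3.0]]),
    variances=np.ones((2, 2)),
    priors=np.array([0.5, 0.5]))
print(post)  # close to [1, 0]: the observation is assigned to the first source
```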

Improving Communication Efficiency of Federated Distillation via Accumulating Local Updates

  • paper_url: http://arxiv.org/abs/2312.04166
  • repo_url: None
  • paper_authors: Zhiyuan Wu, Sheng Sun, Yuwei Wang, Min Liu, Tian Wen, Wen Wang
  • for: This paper improves federated distillation, an emerging communication-efficient paradigm of federated learning.
  • methods: It proposes ALU (Accumulated Local Updates), a technique that accumulates multiple rounds of local updates in federated distillation before transferring the knowledge to the central server (a toy sketch follows the abstract).
  • results: Experiments show that ALU substantially improves the communication efficiency of federated distillation by drastically reducing the communication frequency.
    Abstract As an emerging federated learning paradigm, federated distillation enables communication-efficient model training by transmitting only small-scale knowledge during the learning process. To further improve the communication efficiency of federated distillation, we propose a novel technique, ALU, which accumulates multiple rounds of local updates before transferring the knowledge to the central server. ALU drastically decreases the frequency of communication in federated distillation, thereby significantly reducing the communication overhead during the training process. Empirical experiments demonstrate the substantial effect of ALU in improving the communication efficiency of federated distillation.
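A toy sketch of the accumulation idea, with a logistic-regression stand-in for the local learner: the client runs several local steps and uploads one averaged set of soft scores on a shared reference set, rather than uploading after every step. Everything here is illustrative; the paper's knowledge format and training loop may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_train_step(weights, X, y, lr=0.1):
    # One gradient step of logistic regression as a stand-in local learner.
    p = 1 / (1 + np.exp(-X @ weights))
    return weights - lr * X.T @ (p - y) / len(y)

def client_round(weights, X, y, X_ref, accumulation_rounds=4):
    acc = np.zeros(len(X_ref))
    for _ in range(accumulation_rounds):
        weights = local_train_step(weights, X, y)
        acc += X_ref @ weights                 # soft scores on the reference set
    # One upload per `accumulation_rounds` local steps instead of one per step.
    return weights, acc / accumulation_rounds

X, y = rng.normal(size=(64, 5)), rng.integers(0, 2, 64)
X_ref = rng.normal(size=(16, 5))
w, knowledge = client_round(np.zeros(5), X, y, X_ref)
```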

Multi-scale Residual Transformer for VLF Lightning Transients Classification

  • paper_url: http://arxiv.org/abs/2312.04163
  • repo_url: None
  • paper_authors: Jinghao Sun, Tingting Ji, Guoyu Wang, Rui Wang
  • for: Improving the reliability and performance of navigation systems by reducing the electromagnetic interference and noise that lightning introduces into VLF signals.
  • methods: A deep-learning model, the multi-scale residual transformer (MRTransformer), that classifies VLF lightning transients while capturing fine-grained features at multiple scales and the temporal sequencing of the signal (an illustrative multi-scale residual block follows the abstract).
  • results: The model classifies lightning signals with 90% accuracy and captures the fine-grained, multi-scale characteristics of the input signal sequences.
    Abstract The utilization of Very Low Frequency (VLF) electromagnetic signals in navigation systems is widespread. However, the non-stationary behavior of lightning signals can affect VLF electromagnetic signal transmission. Accurately classifying lightning signals is important for reducing interference and noise in VLF, thereby improving the reliability and overall performance of navigation systems. In recent years, the evolution of deep learning, specifically Convolutional Neural Networks (CNNs), has sparked a transformation in lightning classification, surpassing traditional statistical methodologies. Existing CNN models have limitations: they overlook the diverse attributes of lightning signals across different scales and neglect the significance of temporal sequencing in sequential signals. This study introduces an innovative multi-scale residual transformer (MRTransformer) that not only discerns intricate fine-grained patterns but also weighs the significance of different aspects within the input lightning signal sequence. The model captures the attributes of the lightning signal across different scales and reaches 90% classification accuracy. In future work, this model can potentially be applied to a comprehensive understanding of the localization and waveform characteristics of lightning signals.
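An illustrative multi-scale residual block in PyTorch: parallel 1-D convolutions with different kernel sizes capture signal features at several scales, and a residual connection preserves the input. This is a generic sketch of the idea, not the paper's exact MRTransformer architecture.

```python
import torch
import torch.nn as nn

class MultiScaleResidualBlock(nn.Module):
    def __init__(self, channels, kernel_sizes=(3, 7, 15)):
        super().__init__()
        # Parallel branches, each looking at the signal at a different scale.
        self.branches = nn.ModuleList(
            nn.Conv1d(channels, channels, k, padding=k // 2)
            for k in kernel_sizes)
        self.proj = nn.Conv1d(channels * len(kernel_sizes), channels, 1)

    def forward(self, x):                      # x: (batch, channels, time)
        multi_scale = torch.cat([b(x) for b in self.branches], dim=1)
        return torch.relu(x + self.proj(multi_scale))   # residual connection

block = MultiScaleResidualBlock(channels=8)
out = block(torch.randn(2, 8, 256))            # output keeps the input shape
```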

Zero-Touch Networks: Towards Next-Generation Network Automation

  • paper_url: http://arxiv.org/abs/2312.04159
  • repo_url: None
  • paper_authors: Mirna El Rajab, Li Yang, Abdallah Shami
  • for: This paper surveys the Zero-touch network and Service Management (ZSM) framework for managing fifth-generation (5G) and beyond (5G+) networks, whose automated self-management and self-healing capabilities address the escalating complexity and growing data volume of modern networks.
  • methods: It reviews Zero-Touch Networks (ZTNs) within the ZSM framework, covering network optimization, traffic monitoring, energy efficiency, and security, and examines the challenges associated with ZSM, particularly those related to Machine Learning (ML).
  • results: Integrating Automated ML (AutoML) into ZTNs reduces network management costs and improves performance by automating model selection and tuning; experiments show the proposed AutoML pipeline predicts application throughput more accurately than traditional ML (a minimal model-selection sketch follows the abstract).
    Abstract The Zero-touch network and Service Management (ZSM) framework represents an emerging paradigm in the management of the fifth-generation (5G) and Beyond (5G+) networks, offering automated self-management and self-healing capabilities to address the escalating complexity and the growing data volume of modern networks. ZSM frameworks leverage advanced technologies such as Machine Learning (ML) to enable intelligent decision-making and reduce human intervention. This paper presents a comprehensive survey of Zero-Touch Networks (ZTNs) within the ZSM framework, covering network optimization, traffic monitoring, energy efficiency, and security aspects of next-generational networks. The paper explores the challenges associated with ZSM, particularly those related to ML, which necessitate the need to explore diverse network automation solutions. In this context, the study investigates the application of Automated ML (AutoML) in ZTNs, to reduce network management costs and enhance performance. AutoML automates the selection and tuning process of a ML model for a given task. Specifically, the focus is on AutoML's ability to predict application throughput and autonomously adapt to data drift. Experimental results demonstrate the superiority of the proposed AutoML pipeline over traditional ML in terms of prediction accuracy. Integrating AutoML and ZSM concepts significantly reduces network configuration and management efforts, allowing operators to allocate more time and resources to other important tasks. The paper also provides a high-level 5G system architecture incorporating AutoML and ZSM concepts. This research highlights the potential of ZTNs and AutoML to revolutionize the management of 5G+ networks, enabling automated decision-making and empowering network operators to achieve higher efficiency, improved performance, and enhanced user experience.
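A minimal AutoML-style model selection sketch for throughput prediction, in the spirit of the pipeline described above: search candidate regressors and hyperparameter grids and keep the best cross-validated model. The candidate models, grids, and placeholder data are assumptions; the paper's pipeline additionally handles data drift, which is not sketched here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X = np.random.rand(200, 6)     # network KPIs (placeholder data)
y = np.random.rand(200)        # application throughput (placeholder target)

candidates = [
    (RandomForestRegressor(), {"n_estimators": [50, 200], "max_depth": [None, 10]}),
    (Ridge(), {"alpha": [0.1, 1.0, 10.0]}),
]
# Keep whichever candidate achieves the best cross-validated score.
best = max((GridSearchCV(model, grid, cv=5).fit(X, y)
            for model, grid in candidates),
           key=lambda search: search.best_score_)
print(best.best_estimator_, best.best_score_)
```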

Resource Allocation for Semantic Communication under Physical-layer Security

  • paper_url: http://arxiv.org/abs/2312.04155
  • repo_url: None
  • paper_authors: Yang Li, Xinyu Zhou, Jun Zhao
  • for: This paper targets semantic communication systems for 6G wireless networks, which transmit extracted information, rather than the original data, for receivers to recover.
  • methods: It proposes a joint optimization algorithm for total latency and utility, and incorporates a physical-layer security measure, the secrecy rate, into the optimization problem to keep the system secure (the standard secrecy-rate formula is sketched after the abstract).
  • results: Experimental results show that the proposed algorithm achieves the best joint optimization performance compared to the baselines.
    Abstract Semantic communication is deemed as a revolution of Shannon's paradigm in the six-generation (6G) wireless networks. It aims at transmitting the extracted information rather than the original data, which receivers will try to recover. Intuitively, the larger extracted information, the longer latency of semantic communication will be. Besides, larger extracted information will result in more accurate reconstructed information, thereby causing a higher utility of the semantic communication system. Shorter latency and higher utility are desirable objectives for the system, so there will be a trade-off between utility and latency. This paper proposes a joint optimization algorithm for total latency and utility. Moreover, security is essential for the semantic communication system. We incorporate the secrecy rate, a physical-layer security method, into the optimization problem. The secrecy rate is the communication rate at which no information is disclosed to an eavesdropper. Experimental results demonstrate that the proposed algorithm obtains the best joint optimization performance compared to the baselines.
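In its standard form, the secrecy rate is the amount by which the legitimate link's achievable rate exceeds the eavesdropper's, clipped at zero; the sketch below computes it (the paper's exact system model may differ).

```python
import numpy as np

def secrecy_rate(snr_user, snr_eavesdropper, bandwidth=1.0):
    rate_user = bandwidth * np.log2(1 + snr_user)         # legitimate link
    rate_eve = bandwidth * np.log2(1 + snr_eavesdropper)  # eavesdropper link
    return max(rate_user - rate_eve, 0.0)                 # clipped at zero

print(secrecy_rate(snr_user=10.0, snr_eavesdropper=2.0))  # ~1.87 bits/s/Hz
```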

A Novel Federated Learning-based Intrusion Detection System for Flying Ad Hoc Networks

  • paper_url: http://arxiv.org/abs/2312.04135
  • repo_url: None
  • paper_authors: Ozlem Ceviz, Pinar Sadioglu, Sevil Sen, Vassilios G. Vassilakis
  • for: This paper aims to improve the security of Flying Ad-hoc Networks (FANETs) by developing a Federated Learning-based Intrusion Detection System (FL-IDS) that addresses privacy concerns while maintaining effective intrusion detection performance.
  • methods: The FL-IDS approach uses federated learning to enable UAVs to collaboratively train a global intrusion detection model without sharing raw data. Local models are assigned to each UAV, and only updated model weights are shared with a central server. The Bias Towards Specific Clients (BTSC) method is also used to enhance FL-IDS performance.
  • results: The experimental results show that FL-IDS achieves competitive performance with Central IDS (C-IDS) while mitigating privacy concerns; with the BTSC method it surpasses C-IDS and traditional intrusion detection methods, including Local IDS (L-IDS), even at lower attacker ratios (a server-side aggregation sketch follows the abstract).
    Abstract Unmanned aerial vehicles (UAVs) in flying ad-hoc networks (FANETs) face security challenges due to the dynamic and distributed nature of these networks. This paper presents the Federated Learning-based Intrusion Detection System (FL-IDS), an innovative approach designed to improve FANET security. FL-IDS leverages federated learning to address privacy concerns of centralized intrusion detection systems. FL-IDS operates in a decentralized manner, enabling UAVs to collaboratively train a global intrusion detection model without sharing raw data. Local models are assigned to each UAV, using client-specific data, and only updated model weights are shared with a central server. This preserves privacy while utilizing collective intelligence for effective intrusion detection. Experimental results show FL-IDS's competitive performance with Central IDS (C-IDS) while mitigating privacy concerns. The Bias Towards Specific Clients (BTSC) method further enhances FL-IDS performance, surpassing C-IDS even at lower attacker ratios. A comparative analysis with traditional intrusion detection methods, including Local IDS (L-IDS), provides insights into FL-IDS's strengths. This study significantly contributes to FANET security by introducing a privacy-aware, decentralized intrusion detection approach tailored to the unique challenges of UAV networks.
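A sketch of the server-side aggregation: standard FedAvg-style weight averaging, with a simple multiplicative bias towards selected clients standing in for the BTSC idea. The exact biasing scheme is the paper's; the form below is an assumption for illustration.

```python
import numpy as np

def aggregate(client_weights, client_sizes, favored_mask, bias=2.0):
    # client_weights: list of flat parameter vectors, one per UAV client.
    w = np.asarray(client_sizes, dtype=float)
    w[np.asarray(favored_mask)] *= bias        # over-weight selected clients
    w /= w.sum()
    return np.average(np.stack(client_weights), axis=0, weights=w)

clients = [np.random.randn(10) for _ in range(4)]
global_update = aggregate(clients, [100, 80, 120, 90],
                          favored_mask=[True, False, False, True])
```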

Small Area Estimation of Case Growths for Timely COVID-19 Outbreak Detection

  • paper_url: http://arxiv.org/abs/2312.04110
  • repo_url: https://github.com/wangzilongri/covid-tracking
  • paper_authors: Zhaowei She, Zilong Wang, Jagpreet Chhatwal, Turgay Ayer
  • for: This paper proposes a machine learning algorithm, Transfer Learning Generalized Random Forest (TLGRF), for timely detection of COVID-19 outbreaks, estimating the instantaneous exponential case growth rate for each U.S. county even when county-level sample sizes are small.
  • methods: TLGRF chooses an adaptive fitting window size based on relevant day-level and county-level features affecting disease spread, and uses transfer learning to estimate growth rates accurately for counties with few observations (a simple fixed-window growth-rate fit is sketched after the abstract).
  • results: Out-of-sample prediction analysis shows that TLGRF outperforms established growth rate estimation methods; in a case study on outbreak data from Colorado, timely outbreak detection could have been improved by up to 224% compared to the decisions made by Colorado's Department of Health and Environment (CDPHE).
    Abstract The COVID-19 pandemic has exerted a profound impact on the global economy and continues to exact a significant toll on human lives. The COVID-19 case growth rate stands as a key epidemiological parameter to estimate and monitor for effective detection and containment of the resurgence of outbreaks. A fundamental challenge in growth rate estimation and hence outbreak detection is balancing the accuracy-speed tradeoff, where accuracy typically degrades with shorter fitting windows. In this paper, we develop a machine learning (ML) algorithm, which we call Transfer Learning Generalized Random Forest (TLGRF), that balances this accuracy-speed tradeoff. Specifically, we estimate the instantaneous COVID-19 exponential growth rate for each U.S. county by using TLGRF that chooses an adaptive fitting window size based on relevant day-level and county-level features affecting the disease spread. Through transfer learning, TLGRF can accurately estimate case growth rates for counties with small sample sizes. Out-of-sample prediction analysis shows that TLGRF outperforms established growth rate estimation methods. Furthermore, we conducted a case study based on outbreak case data from the state of Colorado and showed that the timely detection of outbreaks could have been improved by up to 224% using TLGRF when compared to the decisions made by Colorado's Department of Health and Environment (CDPHE). To facilitate implementation, we have developed a publicly available outbreak detection tool for timely detection of COVID-19 outbreaks in each U.S. county, which received substantial attention from policymakers.
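Estimating an instantaneous exponential growth rate from recent case counts reduces to fitting log(cases) against time over a window; TLGRF's contribution is choosing that window adaptively and transferring across counties, which the fixed-window sketch below does not attempt.

```python
import numpy as np

def growth_rate(daily_cases, window=7):
    # Fit log(cases) ~ a + r * t over the most recent `window` days;
    # the slope r is the per-day exponential growth rate.
    y = np.log(np.maximum(np.asarray(daily_cases[-window:], dtype=float), 1.0))
    t = np.arange(len(y))
    return np.polyfit(t, y, 1)[0]

print(growth_rate([10, 12, 15, 19, 23, 29, 36]))  # ~0.21 per day
```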

On the adaptation of in-context learners for system identification

  • paper_url: http://arxiv.org/abs/2312.04083
  • repo_url: https://github.com/forgi86/sysid-transformers-transfer
  • paper_authors: Dario Piga, Filippo Pura, Marco Forgione
  • for: This paper examines the role of meta-model adaptation in in-context system identification, with the goal of improving predictive performance.
  • methods: Through numerical examples, it shows how meta-model adaptation helps in three realistic scenarios: tailoring the meta-model to describe a specific system rather than a class; extending the meta-model to capture the behaviour of systems beyond the initial training class; and recalibrating the model for new prediction tasks.
  • results: The results highlight the effectiveness of meta-model adaptation in achieving a more robust and versatile meta-learning framework for system identification.
    Abstract In-context system identification aims at constructing meta-models to describe classes of systems, differently from traditional approaches that model single systems. This paradigm facilitates the leveraging of knowledge acquired from observing the behaviour of different, yet related dynamics. This paper discusses the role of meta-model adaptation. Through numerical examples, we demonstrate how meta-model adaptation can enhance predictive performance in three realistic scenarios: tailoring the meta-model to describe a specific system rather than a class; extending the meta-model to capture the behaviour of systems beyond the initial training class; and recalibrating the model for new prediction tasks. Results highlight the effectiveness of meta-model adaptation to achieve a more robust and versatile meta-learning framework for system identification.

A Transformer Model for Symbolic Regression towards Scientific Discovery

  • paper_url: http://arxiv.org/abs/2312.04070
  • repo_url: https://github.com/omron-sinicx/transformer4sr
  • paper_authors: Florian Lalande, Yoshitomo Matsubara, Naoya Chiba, Tatsunori Taniai, Ryo Igarashi, Yoshitaka Ushiku
  • for: This paper addresses symbolic regression, with a particular focus on its application to scientific discovery.
  • methods: It proposes a Transformer model for symbolic regression and considers three encoder architectures of increasing flexibility, at the cost of violating column-permutation equivariance.
  • results: Using the normalized tree-based edit distance as the evaluation metric, the best model achieves state-of-the-art results on the Symbolic Regression for Scientific Discovery (SRSD) datasets at no extra computational cost.
    Abstract Symbolic Regression (SR) searches for mathematical expressions which best describe numerical datasets. This circumvents interpretation issues inherent to artificial neural networks, but SR algorithms are often computationally expensive. This work proposes a new Transformer model aimed at Symbolic Regression, with a particular focus on its application to Scientific Discovery. We propose three encoder architectures with increasing flexibility, but at the cost of column-permutation equivariance violation. Training results indicate that the most flexible architecture is required to prevent overfitting. Once trained, we apply our best model to the SRSD datasets (Symbolic Regression for Scientific Discovery datasets), which yields state-of-the-art results using the normalized tree-based edit distance, at no extra computational cost.

MeanCut: A Greedy-Optimized Graph Clustering via Path-based Similarity and Degree Descent Criterion

  • paper_url: http://arxiv.org/abs/2312.04067
  • repo_url: https://github.com/zpguigroupwhu/meancut-clustering
  • paper_authors: Dehua Peng, Zhipeng Gui, Huayi Wu
  • for: It proposes a nondestructive graph clustering algorithm that leverages path-based similarity to enhance intra-cluster associations, together with a new objective function, MeanCut, greedily optimized in degree-descending order to handle non-spherical data distributions.
  • methods: Optimal path search is transformed into generating a maximum spanning tree (MST), and a fast MST algorithm (FastMST) improves computational efficiency (a path-based distance sketch follows the abstract); a density gradient factor (DGF) is also defined to separate weakly connected clusters.
  • results: The validity and robustness of the algorithm are demonstrated on real-world benchmarks and a face recognition application.
    Abstract As the most typical graph clustering method, spectral clustering is popular and attractive due to its remarkable performance, easy implementation, and strong adaptability. Classical spectral clustering measures the edge weights of a graph using a pairwise Euclidean-based metric, and solves the optimal graph partition by relaxing the constraints of the indicator matrix and performing Laplacian decomposition. However, Euclidean-based similarity might cause skewed graph cuts when handling non-spherical data distributions, and the relaxation strategy introduces information loss. Meanwhile, spectral clustering requires specifying the number of clusters, which is hard to determine without enough prior knowledge. In this work, we leverage the path-based similarity to enhance intra-cluster associations, and propose MeanCut as the objective function and greedily optimize it in degree descending order for a nondestructive graph partition. This algorithm enables the identification of arbitrary shaped clusters and is robust to noise. To reduce the computational complexity of similarity calculation, we transform optimal path search into generating the maximum spanning tree (MST), and develop a fast MST (FastMST) algorithm to further improve its time efficiency. Moreover, we define a density gradient factor (DGF) for separating the weakly connected clusters. The validity of our algorithm is demonstrated by testing on real-world benchmarks and an application of face recognition. The source code of MeanCut is available at https://github.com/ZPGuiGroupWhu/MeanCut-Clustering.
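The path-based similarity can be illustrated through its dual on distances: the minimax path distance (the largest edge on the path that minimizes that largest edge), which is realized on a minimum spanning tree. The sketch below is an O(n^3) illustration of that idea only; the paper's FastMST algorithm and MeanCut objective are not reproduced.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def minimax_path_distances(X):
    # Assumes all points are distinct (zero entries are read as "no edge").
    D = squareform(pdist(X))
    mst = minimum_spanning_tree(D).toarray()
    mst = np.maximum(mst, mst.T)               # symmetrize the tree edges
    out = np.where(mst > 0, mst, np.inf)
    np.fill_diagonal(out, 0)
    # Floyd-Warshall with (max, min): out[i, j] becomes the largest edge on
    # the tree path between i and j. Cubic time -- an illustration only.
    for k in range(len(X)):
        out = np.minimum(out, np.maximum(out[:, k:k + 1], out[k:k + 1, :]))
    return out

d = minimax_path_distances(np.random.rand(50, 2))
```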

A Robust and Efficient Boundary Point Detection Method by Measuring Local Direction Dispersion

  • paper_url: http://arxiv.org/abs/2312.04065
  • repo_url: None
  • paper_authors: Dehua Peng, Zhipeng Gui, Huayi Wu
  • for: The paper aims to address the challenge of boundary point detection in machine learning tasks, particularly in non-convex structures and high-dimensional manifolds.
  • methods: The proposed method, Local Direction Dispersion (LoDD), uses a density-independent K-Nearest Neighbors (KNN) method to determine neighboring points, and a statistic-based metric using the eigenvalues of the covariance matrix of the KNN coordinates to measure the centrality of a query point (a sketch follows the abstract).
  • results: The paper demonstrates the validity of LoDD on five synthetic datasets and ten real-world benchmarks, showing promising and robust detection accuracy in a time-efficient manner.
    Abstract Boundary points pose a significant challenge for machine learning tasks, including classification, clustering, and dimensionality reduction. Due to the similarity of features, boundary areas can result in mixed-up classes or clusters, leading to a crowding problem in dimensionality reduction. To address this challenge, numerous boundary point detection methods have been developed, but they are insufficient to accurately and efficiently identify boundary points in non-convex structures and high-dimensional manifolds. In this work, we propose a robust and efficient method for detecting boundary points using Local Direction Dispersion (LoDD). LoDD considers that internal points are surrounded by neighboring points in all directions, while neighboring points of a boundary point tend to be distributed only in a certain directional range. LoDD adopts a density-independent K-Nearest Neighbors (KNN) method to determine neighboring points, and defines a statistic-based metric using the eigenvalues of the covariance matrix of KNN coordinates to measure the centrality of a query point. We demonstrated the validity of LoDD on five synthetic datasets (2-D and 3-D) and ten real-world benchmarks, and tested its clustering performance by equipping it with two typical clustering methods, K-means and Ncut. Our results show that LoDD achieves promising and robust detection accuracy in a time-efficient manner.
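A sketch of a LoDD-style centrality score: for each query point, centre its k nearest neighbours and inspect the eigenvalue spectrum of their covariance. Direction-concentrated neighbourhoods, typical of boundary points, yield skewed spectra. The specific ratio statistic below is an assumption; the paper's exact metric may differ.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lodd_scores(X, k=10):
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nbrs.kneighbors(X)                # idx[:, 0] is the point itself
    scores = []
    for i, neigh in enumerate(idx):
        centred = X[neigh[1:]] - X[i]          # neighbours relative to query
        eig = np.linalg.eigvalsh(np.cov(centred.T))
        scores.append(eig.min() / eig.max())   # small => direction-concentrated
    return np.array(scores)                    # low score suggests a boundary point

scores = lodd_scores(np.random.randn(300, 2))  # interior points tend to score higher
```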

DiscoBAX: Discovery of Optimal Intervention Sets in Genomic Experiment Design

  • paper_url: http://arxiv.org/abs/2312.04064
  • repo_url: https://github.com/amehrjou/discobax
  • paper_authors: Clare Lyle, Arash Mehrjou, Pascal Notin, Andrew Jesson, Stefan Bauer, Yarin Gal, Patrick Schwab
  • for: This work aims to accelerate the discovery of therapeutics for genetically-driven pathologies by identifying genes involved in the underlying disease mechanisms.
  • methods: It proposes DiscoBAX, a sample-efficient experimental design method that maximizes the rate of significant discoveries per experiment in a genomic experiment campaign while simultaneously probing a wide range of diverse mechanisms, with theoretical guarantees of approximate optimality under standard assumptions.
  • results: In a comprehensive evaluation on synthetic and real-world experimental design tasks, DiscoBAX outperforms existing state-of-the-art methods, selecting effective and diverse perturbations in biological systems.
    Abstract The discovery of therapeutics to treat genetically-driven pathologies relies on identifying genes involved in the underlying disease mechanisms. Existing approaches search over the billions of potential interventions to maximize the expected influence on the target phenotype. However, to reduce the risk of failure in future stages of trials, practical experiment design aims to find a set of interventions that maximally change a target phenotype via diverse mechanisms. We propose DiscoBAX, a sample-efficient method for maximizing the rate of significant discoveries per experiment while simultaneously probing for a wide range of diverse mechanisms during a genomic experiment campaign. We provide theoretical guarantees of approximate optimality under standard assumptions, and conduct a comprehensive experimental evaluation covering both synthetic as well as real-world experimental design tasks. DiscoBAX outperforms existing state-of-the-art methods for experimental design, selecting effective and diverse perturbations in biological systems.

Jointly spatial-temporal representation learning for individual trajectories

  • paper_url: http://arxiv.org/abs/2312.04055
  • repo_url: None
  • paper_authors: Fei Huang, Jianrong Lv, Yang Yue
  • for: This paper aims to provide a method for learning general-purpose trajectory representations that can be used for various geospatial applications, such as predicting movement patterns and preserving trajectory similarity.
  • methods: The proposed method, called ST-GraphRL, uses a weighted directed spatial-temporal graph, a two-stage joint encoder, and a decoder to learn entangled spatial-temporal dependencies and explicit mobility regularities from trajectory data.
  • results: The proposed ST-GraphRL method outperformed all baseline models in predicting movement spatial-temporal distributions and preserving trajectory similarity, with high spatial-temporal correlations. Additionally, the method was found to understand spatial-temporal patterns and be transferable for general-purpose geospatial data representations.
    Abstract Individual trajectories, containing substantial information on human-environment interactions across space and time, are a crucial input for geospatial foundation models (GeoFMs). However, existing attempts leveraging trajectory data for various applications have overlooked the implicit spatial-temporal dependency within trajectories and failed to encode and represent it in a format friendly to deep learning, posing a challenge in obtaining general-purpose trajectory representations. Therefore, this paper proposes a spatial-temporal joint representation learning method (ST-GraphRL) to formalize learnable spatial-temporal dependencies into trajectory representations. The proposed ST-GraphRL consists of three compositions: (i) a weighted directed spatial-temporal graph to explicitly construct mobility interactions over both space and time dimensions; (ii) a two-stage jointly encoder (i.e., decoupling and fusion) to learn entangled spatial-temporal dependencies by independently decomposing and jointly aggregating space and time information; (iii) a decoder guides ST-GraphRL to learn explicit mobility regularities by simulating the spatial-temporal distributions of trajectories. Tested on three real-world human mobility datasets, the proposed ST-GraphRL outperformed all the baseline models in predicting movement spatial-temporal distributions and preserving trajectory similarity with high spatial-temporal correlations. We also explore how spatial-temporal features are presented in the latent space, validating that ST-GraphRL understands spatial-temporal patterns. This method is also transferable to general-purpose geospatial data representations for broad downstream tasks, as well as advancing GeoFM development.

Reconstruction of dynamical systems from data without time labels

  • paper_url: http://arxiv.org/abs/2312.04038
  • repo_url: None
  • paper_authors: Zhijun Zeng, Pipi Hu, Chenglong Bao, Yi Zhu, Zuoqiang Shi
  • for: reconstruction of dynamical systems from data without time labels
  • methods: treating the data as samples from a probability distribution and reconstructing the underlying system by minimizing a distribution loss, specifically the sliced Wasserstein distance (a compact implementation sketch follows the abstract)
  • results: effective reconstruction of underlying dynamical systems through extensive experiments
    Abstract In this paper, we study the method to reconstruct dynamical systems from data without time labels. Data without time labels appear in many applications, such as molecular dynamics, single-cell RNA sequencing etc. Reconstruction of dynamical system from time sequence data has been studied extensively. However, these methods do not apply if time labels are unknown. Without time labels, sequence data becomes distribution data. Based on this observation, we propose to treat the data as samples from a probability distribution and try to reconstruct the underlying dynamical system by minimizing the distribution loss, sliced Wasserstein distance more specifically. Extensive experiment results demonstrate the effectiveness of the proposed method.
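A compact sliced Wasserstein distance between two equally sized point sets, as used for the distribution loss above: project onto random directions, sort the 1-D projections, and average the resulting 1-D transport costs.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=100, seed=0):
    # X, Y: (n, d) point sets of equal size, treated as empirical distributions.
    rng = np.random.default_rng(seed)
    thetas = rng.normal(size=(n_projections, X.shape[1]))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    px, py = X @ thetas.T, Y @ thetas.T        # 1-D projections per direction
    # 1-D Wasserstein-2 between sorted projections, averaged over directions.
    return np.sqrt(np.mean((np.sort(px, axis=0) - np.sort(py, axis=0)) ** 2))

X, Y = np.random.randn(256, 2), np.random.randn(256, 2) + 0.5
print(sliced_wasserstein(X, Y))
```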

Series2Vec: Similarity-based Self-supervised Representation Learning for Time Series Classification

  • paper_url: http://arxiv.org/abs/2312.03998
  • repo_url: https://github.com/navidfoumani/series2vec
  • paper_authors: Navid Mohammadi Foumani, Chang Wei Tan, Geoffrey I. Webb, Mahsa Salehi
  • for: This paper proposes Series2Vec, a novel self-supervised representation learning method for time series classification.
  • methods: Series2Vec is trained on a self-supervised task that predicts the similarity between two series in both the temporal and spectral domains, and applies order-invariant attention to each representation within a batch so that similar time series learn similar representations (a toy spectral-similarity target is sketched after the abstract).
  • results: On nine large real-world datasets and the UCR/UEA archive, Series2Vec outperforms state-of-the-art self-supervised techniques; it performs comparably with fully supervised training, is efficient on datasets with limited labeled data, and fusing it with other representation learning models further improves time series classification.
    Abstract We argue that time series analysis is fundamentally different in nature to either vision or natural language processing with respect to the forms of meaningful self-supervised learning tasks that can be defined. Motivated by this insight, we introduce a novel approach called \textit{Series2Vec} for self-supervised representation learning. Unlike other self-supervised methods in time series, which carry the risk of positive sample variants being less similar to the anchor sample than series in the negative set, Series2Vec is trained to predict the similarity between two series in both temporal and spectral domains through a self-supervised task. Series2Vec relies primarily on the consistency of the unsupervised similarity step, rather than the intrinsic quality of the similarity measurement, without the need for hand-crafted data augmentation. To further enforce the network to learn similar representations for similar time series, we propose a novel approach that applies order-invariant attention to each representation within the batch during training. Our evaluation of Series2Vec on nine large real-world datasets, along with the UCR/UEA archive, shows enhanced performance compared to current state-of-the-art self-supervised techniques for time series. Additionally, our extensive experiments show that Series2Vec performs comparably with fully supervised training and offers high efficiency in datasets with limited-labeled data. Finally, we show that the fusion of Series2Vec with other representation learning models leads to enhanced performance for time series classification. Code and models are open-source at \url{https://github.com/Navidfoumani/Series2Vec.}
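One way to realise the spectral-domain similarity target described above is to compare amplitude spectra, which is insensitive to phase shifts; the cosine similarity below is an assumed choice for illustration, not necessarily the paper's measure.

```python
import numpy as np

def spectral_similarity(a, b):
    # Cosine similarity between normalized amplitude spectra.
    fa, fb = np.abs(np.fft.rfft(a)), np.abs(np.fft.rfft(b))
    fa, fb = fa / np.linalg.norm(fa), fb / np.linalg.norm(fb)
    return float(fa @ fb)

t = np.linspace(0, 1, 256)
# A phase-shifted copy of the same tone still scores close to 1.
print(spectral_similarity(np.sin(8 * np.pi * t), np.sin(8 * np.pi * t + 0.4)))
```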

Rapid detection of rare events from in situ X-ray diffraction data using machine learning

  • paper_url: http://arxiv.org/abs/2312.03989
  • repo_url: None
  • paper_authors: Weijian Zheng, Jun-Sang Park, Peter Kenesei, Ahsan Ali, Zhengchun Liu, Ian T. Foster, Nicholas Schwarz, Rajkumar Kettimuthu, Antonino Miceli, Hemant Sharma
  • for: This paper develops a fully automated method for rapidly detecting the onset of plasticity in high-energy X-ray microscopy data, improving the efficiency and temporal resolution of multi-modal X-ray diffraction methods.
  • methods: It uses self-supervised image representation learning and clustering to transform massive data volumes into compact, semantic-rich representations of visually salient characteristics (e.g., peak shapes), which serve as rapid indicators of anomalous events such as changes in diffraction peak shapes.
  • results: The technique is at least 50 times faster than traditional approaches and works on data sets that are up to 9 times sparser than a full data set.
    Abstract High-energy X-ray diffraction methods can non-destructively map the 3D microstructure and associated attributes of metallic polycrystalline engineering materials in their bulk form. These methods are often combined with external stimuli such as thermo-mechanical loading to take snapshots over time of the evolving microstructure and attributes. However, the extreme data volumes and the high costs of traditional data acquisition and reduction approaches pose a barrier to quickly extracting actionable insights and improving the temporal resolution of these snapshots. Here we present a fully automated technique capable of rapidly detecting the onset of plasticity in high-energy X-ray microscopy data. Our technique is computationally faster by at least 50 times than the traditional approaches and works for data sets that are up to 9 times sparser than a full data set. This new technique leverages self-supervised image representation learning and clustering to transform massive data into compact, semantic-rich representations of visually salient characteristics (e.g., peak shapes). These characteristics can be a rapid indicator of anomalous events such as changes in diffraction peak shapes. We anticipate that this technique will provide just-in-time actionable information to drive smarter experiments that effectively deploy multi-modal X-ray diffraction methods that span many decades of length scales.

Node-aware Bi-smoothing: Certified Robustness against Graph Injection Attacks

  • paper_url: http://arxiv.org/abs/2312.03979
  • repo_url: None
  • paper_authors: Yuni Lai, Yulin Zhu, Bailin Pan, Kai Zhou
  • for: Certified robustness against graph injection attacks (GIAs) on deep graph learning models.
  • methods: It proposes node-aware bi-smoothing, the first certifiably robust framework for general node classification tasks under GIAs; the scheme is model-agnostic and applicable to both evasion and poisoning attacks.
  • results: Rigorous theoretical analysis establishes the certifiable conditions of the smoothing scheme, and extensive evaluations, including use as an empirical defense against real-world GIAs and in recommendation systems, demonstrate the effectiveness of the proposed certificates.
    Abstract Deep Graph Learning (DGL) has emerged as a crucial technique across various domains. However, recent studies have exposed vulnerabilities in DGL models, such as susceptibility to evasion and poisoning attacks. While empirical and provable robustness techniques have been developed to defend against graph modification attacks (GMAs), the problem of certified robustness against graph injection attacks (GIAs) remains largely unexplored. To bridge this gap, we introduce the node-aware bi-smoothing framework, which is the first certifiably robust approach for general node classification tasks against GIAs. Notably, the proposed node-aware bi-smoothing scheme is model-agnostic and is applicable for both evasion and poisoning attacks. Through rigorous theoretical analysis, we establish the certifiable conditions of our smoothing scheme. We also explore the practical implications of our node-aware bi-smoothing schemes in two contexts: as an empirical defense approach against real-world GIAs and in the context of recommendation systems. Furthermore, we extend two state-of-the-art certified robustness frameworks to address node injection attacks and compare our approach against them. Extensive evaluations demonstrate the effectiveness of our proposed certificates.

PerSival: Neural-network-based visualisation for pervasive continuum-mechanical simulations in musculoskeletal biomechanics

  • paper_url: http://arxiv.org/abs/2312.03957
  • repo_url: None
  • paper_authors: David Rosin, Johannes Kässinger, Xingyao Yu, Okan Avci, Christian Bleiler, Oliver Röhrle
  • for: This paper presents a neural-network-based approach for pervasive visualisation of a continuum-mechanical 3D model of the human upper-limb musculoskeletal system, bringing simulation capabilities to resource-poor systems such as mobile devices.
  • methods: A sparse grid surrogate captures the surface deformation of the m. biceps brachii and is used to train a deep learning model for real-time visualisation of the same muscle; both surrogates take 5 muscle activation levels as input and output Cartesian coordinate vectors for each mesh node on the muscle's surface (a sketch of such a surrogate follows the abstract).
  • results: With 5 activation levels, the model achieves an average error of 0.97 +/- 0.16 mm (0.57 +/- 0.10 %) over the 2809 mesh node positions of the biceps, with evaluation times of 9.88 ms per predicted deformation state on CPU and 3.48 ms with GPU support, i.e., theoretical frame rates of 101 fps and 287 fps.
    Abstract This paper presents a novel neural network architecture for the purpose of pervasive visualisation of a 3D human upper limb musculoskeletal system model. Bringing simulation capabilities to resource-poor systems like mobile devices is of growing interest across many research fields, to widen applicability of methods and results. Until recently, this goal was thought to be out of reach for realistic continuum-mechanical simulations of musculoskeletal systems, due to prohibitive computational cost. Within this work we use a sparse grid surrogate to capture the surface deformation of the m.~biceps brachii in order to train a deep learning model, used for real-time visualisation of the same muscle. Both these surrogate models take 5 muscle activation levels as input and output Cartesian coordinate vectors for each mesh node on the muscle's surface. Thus, the neural network architecture features a significantly lower input than output dimension. 5 muscle activation levels were sufficient to achieve an average error of 0.97 +/- 0.16 mm, or 0.57 +/- 0.10 % for the 2809 mesh node positions of the biceps. The model achieved evaluation times of 9.88 ms per predicted deformation state on CPU only and 3.48 ms with GPU-support, leading to theoretical frame rates of 101 fps and 287 fps respectively. Deep learning surrogates thus provide a way to make continuum-mechanical simulations accessible for visual real-time applications.
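A sketch of the kind of deep-learning surrogate described above: a small PyTorch MLP mapping the 5 muscle-activation levels to the 3-D coordinates of all 2809 surface mesh nodes. The layer sizes are assumptions; the paper's trained architecture may differ.

```python
import torch
import torch.nn as nn

n_nodes = 2809                                  # biceps surface mesh nodes
surrogate = nn.Sequential(
    nn.Linear(5, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, n_nodes * 3),                # x, y, z per mesh node
)

activations = torch.rand(1, 5)                  # 5 muscle activation levels
mesh = surrogate(activations).view(1, n_nodes, 3)  # predicted node positions
```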