cs.LG - 2023-11-12

Analytical Verification of Deep Neural Network Performance for Time-Synchronized Distribution System State Estimation

  • paper_url: http://arxiv.org/abs/2311.06973
  • repo_url: None
  • paper_authors: Behrouz Azimian, Shiva Moshtagh, Anamitra Pal, Shanshan Ma
  • for: Proposes a deep neural network (DNN) approach to time-synchronized state estimation for real-time unobservable distribution systems and provides an analytical characterization of its performance.
  • methods: Uses a DNN for the state estimation problem and analytically verifies its robustness and trustworthiness to input perturbations by treating the verification as a mixed-integer linear programming (MILP) problem.
  • results: Validation on a modified IEEE 34-node system and a real-world large distribution system demonstrates robustness under input perturbations; batch normalization is shown to effectively address the scalability limitations of the MILP formulation.
    Abstract Recently, we demonstrated success of a time-synchronized state estimator using deep neural networks (DNNs) for real-time unobservable distribution systems. In this letter, we provide analytical bounds on the performance of that state estimator as a function of perturbations in the input measurements. It has already been shown that evaluating performance based on only the test dataset might not effectively indicate a trained DNN's ability to handle input perturbations. As such, we analytically verify robustness and trustworthiness of DNNs to input perturbations by treating them as mixed-integer linear programming (MILP) problems. The ability of batch normalization in addressing the scalability limitations of the MILP formulation is also highlighted. The framework is validated by performing time-synchronized distribution system state estimation for a modified IEEE 34-node system and a real-world large distribution system, both of which are incompletely observed by micro-phasor measurement units.

An Expandable Machine Learning-Optimization Framework to Sequential Decision-Making

  • paper_url: http://arxiv.org/abs/2311.06972
  • repo_url: None
  • paper_authors: Dogacan Yilmaz, İ. Esra Büyüktahtakın
  • for: Solving sequential decision-making problems while improving the feasibility and generalization of machine learning (ML) predictions for combinatorial optimization.
  • methods: Integrates an attention-based encoder-decoder neural network architecture with an infeasibility-elimination and generalization framework; the required level of predictions is optimized to eliminate infeasibility, and the predicted binary variables are fixed in mixed-integer programming (MIP) problems solved quickly with a commercial solver.
  • results: Solves time-dependent optimization problems quickly, reducing solution time by three orders of magnitude with an average optimality gap below 0.1%, and outperforms various specially designed heuristics.
    Abstract We present an integrated prediction-optimization (PredOpt) framework to efficiently solve sequential decision-making problems by predicting the values of binary decision variables in an optimal solution. We address the key issues of sequential dependence, infeasibility, and generalization in machine learning (ML) to make predictions for optimal solutions to combinatorial problems. The sequential nature of the combinatorial optimization problems considered is captured with recurrent neural networks and a sliding-attention window. We integrate an attention-based encoder-decoder neural network architecture with an infeasibility-elimination and generalization framework to learn high-quality feasible solutions to time-dependent optimization problems. In this framework, the required level of predictions is optimized to eliminate the infeasibility of the ML predictions. These predictions are then fixed in mixed-integer programming (MIP) problems to solve them quickly with the aid of a commercial solver. We demonstrate our approach to tackling the two well-known dynamic NP-Hard optimization problems: multi-item capacitated lot-sizing (MCLSP) and multi-dimensional knapsack (MSMK). Our results show that models trained on shorter and smaller-dimensional instances can be successfully used to predict longer and larger-dimensional problems. The solution time can be reduced by three orders of magnitude with an average optimality gap below 0.1%. We compare PredOpt with various specially designed heuristics and show that our framework outperforms them. PredOpt can be advantageous for solving dynamic MIP problems that need to be solved instantly and repetitively.
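To make the prediction-then-fix step concrete, here is a minimal, hypothetical sketch (not the authors' code) of fixing confidently ML-predicted binary variables in a small knapsack MIP with PuLP and letting the solver decide the rest; the toy data, variable names, and confidence threshold are illustrative assumptions.

```python
# Hypothetical sketch: fix confidently-predicted binaries in a MIP, solve the rest.
# Requires: pip install pulp (bundles the CBC solver).
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, value, PULP_CBC_CMD

values  = [10, 7, 4, 9, 6]           # toy item values (illustrative)
weights = [ 5, 4, 2, 6, 3]           # toy item weights
capacity = 12
# Stand-in for the encoder-decoder output: P(x_i = 1) per item.
pred_prob = [0.97, 0.10, 0.85, 0.55, 0.92]
threshold = 0.9                      # only fix very confident predictions

prob = LpProblem("knapsack", LpMaximize)
x = [LpVariable(f"x{i}", cat="Binary") for i in range(len(values))]
prob += lpSum(v * xi for v, xi in zip(values, x))
prob += lpSum(w * xi for w, xi in zip(weights, x)) <= capacity

# Fix variables whose predicted probability is far from 0.5; leave the rest free.
for xi, p in zip(x, pred_prob):
    if p >= threshold:
        xi.lowBound, xi.upBound = 1, 1
    elif p <= 1 - threshold:
        xi.lowBound, xi.upBound = 0, 0

prob.solve(PULP_CBC_CMD(msg=False))
print([int(value(xi)) for xi in x], value(prob.objective))
```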

Anchor Data Augmentation

  • paper_url: http://arxiv.org/abs/2311.06965
  • repo_url: None
  • paper_authors: Nora Schneider, Shirin Goshtasbpour, Fernando Perez-Cruz
  • for: A data augmentation method for nonlinear over-parameterized regression.
  • methods: Builds on the Anchor regression (AR) method from the causality literature, using several replicas of the modified samples to provide additional training examples and more robust regression predictions.
  • results: ADA is competitive with the state-of-the-art domain-agnostic C-Mixup solutions on linear and nonlinear regression problems.
    Abstract We propose a novel algorithm for data augmentation in nonlinear over-parametrized regression. Our data augmentation algorithm borrows from the literature on causality and extends the recently proposed Anchor regression (AR) method for data augmentation, which is in contrast to the current state-of-the-art domain-agnostic solutions that rely on the Mixup literature. Our Anchor Data Augmentation (ADA) uses several replicas of the modified samples in AR to provide more training examples, leading to more robust regression predictions. We apply ADA to linear and nonlinear regression problems using neural networks. ADA is competitive with state-of-the-art C-Mixup solutions.
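As a rough illustration of the anchor-style perturbation that ADA builds on, the sketch below applies the anchor regression transform (I - (1 - sqrt(gamma)) * P_A) to (X, y) for several gamma values to create augmented replicas; the choice of anchors, the gamma schedule, and how the replicas enter training follow the paper and are assumptions here.

```python
# Illustrative sketch of anchor-style data augmentation (assumed details).
import numpy as np

rng = np.random.default_rng(0)
n, d, q = 200, 5, 3
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
A = rng.normal(size=(n, q))            # anchor variables (assumed given)

# Projection onto the column space of the anchors.
P_A = A @ np.linalg.solve(A.T @ A, A.T)

def anchor_augment(X, y, gammas):
    """Return augmented replicas of (X, y), one per gamma."""
    reps = []
    for g in gammas:
        M = np.eye(n) - (1.0 - np.sqrt(g)) * P_A   # anchor transform
        reps.append((M @ X, M @ y))
    return reps

replicas = anchor_augment(X, y, gammas=[0.5, 1.0, 2.0, 4.0])
X_aug = np.vstack([Xg for Xg, _ in replicas])
y_aug = np.concatenate([yg for _, yg in replicas])
print(X_aug.shape, y_aug.shape)   # (800, 5) (800,)
```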

Robust Regression over Averaged Uncertainty

  • paper_url: http://arxiv.org/abs/2311.06960
  • repo_url: None
  • paper_authors: Dimitris Bertsimas, Yu Ma
  • for: Proposes a new robust regression formulation that integrates all realizations of the uncertainty set and takes an averaged approach to obtain the optimal solution for the ordinary least-squares regression problem.
  • methods: Averages over the uncertainty set rather than taking the worst case, shows that the formulation recovers ridge regression, and establishes the missing link between robust optimization and mean-squared-error approaches for existing regression problems.
  • results: Shows a consistent out-of-sample improvement over the worst-case formulation on synthetic datasets and real-world UCI regression problems, with the improvement growing as the perturbation level increases.
    Abstract We propose a new formulation of robust regression by integrating all realizations of the uncertainty set and taking an averaged approach to obtain the optimal solution for the ordinary least-squared regression problem. We show that this formulation surprisingly recovers ridge regression and establishes the missing link between robust optimization and the mean squared error approaches for existing regression problems. We first prove the equivalence for four uncertainty sets: ellipsoidal, box, diamond, and budget, and provide closed-form formulations of the penalty term as a function of the sample size, feature size, as well as perturbation protection strength. We then show in synthetic datasets with different levels of perturbations, a consistent improvement of the averaged formulation over the existing worst-case formulation in out-of-sample performance. Importantly, as the perturbation level increases, the improvement increases, confirming our method's advantage in high-noise environments. We report similar improvements in the out-of-sample datasets in real-world regression problems obtained from UCI datasets.
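The short derivation below (my own illustration, under the simplifying assumption of zero-mean design perturbations with isotropic second moment) shows why averaging the squared loss over perturbations yields a ridge-type penalty, which is the flavor of equivalence the letter establishes; the paper's closed-form penalties for the ellipsoidal, box, diamond, and budget sets are more specific.

```latex
% Averaging over zero-mean perturbations \Delta of X with
% \mathbb{E}[\Delta^\top \Delta] = c I recovers a ridge objective.
\begin{aligned}
\mathbb{E}_{\Delta}\bigl[\, \| y - (X+\Delta)\beta \|_2^2 \,\bigr]
&= \| y - X\beta \|_2^2
   - 2\,\mathbb{E}[\Delta\beta]^{\top}(y - X\beta)
   + \mathbb{E}\bigl[\|\Delta\beta\|_2^2\bigr] \\
&= \| y - X\beta \|_2^2 + \beta^{\top}\mathbb{E}[\Delta^{\top}\Delta]\,\beta
 \;=\; \| y - X\beta \|_2^2 + c\,\|\beta\|_2^2 .
\end{aligned}
```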

A GPU-Accelerated Moving-Horizon Algorithm for Training Deep Classification Trees on Large Datasets

  • paper_url: http://arxiv.org/abs/2311.06952
  • repo_url: None
  • paper_authors: Jiayang Ren, Valentín Osuna-Enciso, Morimasa Okamoto, Qiangqiang Mao, Chaojie Ji, Liang Cao, Kaixun Hua, Yankai Cao
  • for: Decision tree training is NP-complete and greedy heuristics such as CART are sub-optimal; this paper proposes a moving-horizon differential evolution algorithm to address these limitations.
  • methods: The moving-horizon differential evolution algorithm for classification trees with continuous features (MH-DEOCT) combines a discrete tree decoding method, a GPU-accelerated implementation, and a moving-horizon strategy that iteratively trains shallow subtrees at each node.
  • results: On 68 UCI datasets, MH-DEOCT improves training and testing accuracy over CART by an average of 3.44% and 1.71%, respectively, achieves near-optimal performance, and scales well to deep trees and large-scale datasets.
    Abstract Decision trees are essential yet NP-complete to train, prompting the widespread use of heuristic methods such as CART, which suffers from sub-optimal performance due to its greedy nature. Recently, breakthroughs in finding optimal decision trees have emerged; however, these methods still face significant computational costs and struggle with continuous features in large-scale datasets and deep trees. To address these limitations, we introduce a moving-horizon differential evolution algorithm for classification trees with continuous features (MH-DEOCT). Our approach consists of a discrete tree decoding method that eliminates duplicated searches between adjacent samples, a GPU-accelerated implementation that significantly reduces running time, and a moving-horizon strategy that iteratively trains shallow subtrees at each node to balance the vision and optimizer capability. Comprehensive studies on 68 UCI datasets demonstrate that our approach outperforms the heuristic method CART on training and testing accuracy by an average of 3.44% and 1.71%, respectively. Moreover, these numerical studies empirically demonstrate that MH-DEOCT achieves near-optimal performance (only 0.38% and 0.06% worse than the global optimal method on training and testing, respectively), while it offers remarkable scalability for deep trees (e.g., depth=8) and large-scale datasets (e.g., ten million samples).

Contractive Systems Improve Graph Neural Networks Against Adversarial Attacks

  • paper_url: http://arxiv.org/abs/2311.06942
  • repo_url: None
  • paper_authors: Moshe Eliasof, Davide Murari, Ferdia Sherry, Carola-Bibiane Schönlieb
  • for: Fortifying Graph Neural Networks (GNNs) against adversarial attacks.
  • methods: Graph neural layers based on differential equations with contractive properties, with the simultaneous learned evolution of both the node features and the adjacency matrix, improving robustness to perturbations of the input features and the graph connectivity.
  • results: Numerous real-world benchmarks demonstrate performance on par with or better than existing methods.
    Abstract Graph Neural Networks (GNNs) have established themselves as a key component in addressing diverse graph-based tasks. Despite their notable successes, GNNs remain susceptible to input perturbations in the form of adversarial attacks. This paper introduces an innovative approach to fortify GNNs against adversarial perturbations through the lens of contractive dynamical systems. Our method introduces graph neural layers based on differential equations with contractive properties, which, as we show, improve the robustness of GNNs. A distinctive feature of the proposed approach is the simultaneous learned evolution of both the node features and the adjacency matrix, yielding an intrinsic enhancement of model robustness to perturbations in the input features and the connectivity of the graph. We mathematically derive the underpinnings of our novel architecture and provide theoretical insights to reason about its expected behavior. We demonstrate the efficacy of our method through numerous real-world benchmarks, reading on par or improved performance compared to existing methods.
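For context on what "contractive" buys here, the block below states the standard contraction criterion from dynamical-systems theory that such ODE-based layer designs build on; this is the textbook condition, not the paper's specific layer.

```latex
% Standard contraction criterion for \dot{x} = f(x, t):
\text{If } \tfrac{1}{2}\bigl(J_f(x,t) + J_f(x,t)^{\top}\bigr) \preceq -\mu I
\ \text{for some } \mu > 0 \text{ and all } x, t, \text{ then any two trajectories satisfy}
\quad \|x_1(t) - x_2(t)\| \le e^{-\mu t}\,\|x_1(0) - x_2(0)\|,
% so bounded perturbations of the input features produce deviations that
% shrink exponentially as the layer dynamics evolve.
```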

5G Networks and IoT Devices: Mitigating DDoS Attacks with Deep Learning Techniques

  • paper_url: http://arxiv.org/abs/2311.06938
  • repo_url: None
  • paper_authors: Reem M. Alzhrani, Mohammed A. Alliheedi
  • for: Addressing the security and privacy concerns of Internet of Things (IoT) devices, particularly within 5G networks.
  • methods: Applies two deep learning algorithms, a Convolutional Neural Network (CNN) and a Feed Forward Neural Network (FNN), to a dataset of IoT device traffic in a 5G network built with OMNeT++ using the INET and Simu5G frameworks.
  • results: Both CNN and FNN reach 99% accuracy in distinguishing normal traffic from DDoS attacks, underscoring the potential of deep learning to enhance the security of IoT devices within 5G networks.
    Abstract The development and implementation of Internet of Things (IoT) devices have been accelerated dramatically in recent years. As a result, a super-network is required to handle the massive volumes of data collected and transmitted to these devices. Fifth generation (5G) technology is a new, comprehensive wireless technology that has the potential to be the primary enabling technology for the IoT. The rapid spread of IoT devices can encounter many security limits and concerns. As a result, new and serious security and privacy risks have emerged. Attackers use IoT devices to launch massive attacks; one of the most famous is the Distributed Denial of Service (DDoS) attack. Deep Learning techniques have proven their effectiveness in detecting and mitigating DDoS attacks. In this paper, we applied two Deep Learning algorithms Convolutional Neural Network (CNN) and Feed Forward Neural Network (FNN) in dataset was specifically designed for IoT devices within 5G networks. We constructed the 5G network infrastructure using OMNeT++ with the INET and Simu5G frameworks. The dataset encompasses both normal network traffic and DDoS attacks. The Deep Learning algorithms, CNN and FNN, showed impressive accuracy levels, both reaching 99%. These results underscore the potential of Deep Learning to enhance the security of IoT devices within 5G networks.
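A minimal, hypothetical PyTorch sketch of the FNN side of such a flow classifier; the feature count, layer widths, and training loop are illustrative assumptions, not the paper's exact architecture.

```python
# Illustrative FNN for binary DDoS-vs-normal flow classification (assumed sizes).
import torch
import torch.nn as nn

class FlowFNN(nn.Module):
    def __init__(self, num_features: int = 20):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 1),            # logit: >0 -> DDoS, <0 -> normal
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

model = FlowFNN()
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Fake batch standing in for preprocessed 5G/IoT flow features and labels.
x = torch.randn(128, 20)
y = torch.randint(0, 2, (128,)).float()

for _ in range(5):                      # tiny training loop for illustration
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
print(f"final loss: {loss.item():.4f}")
```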

Attention for Causal Relationship Discovery from Biological Neural Dynamics

  • paper_url: http://arxiv.org/abs/2311.06928
  • repo_url: None
  • paper_authors: Ziyu Lu, Anika Tabassum, Shruti Kulkarni, Lu Mi, J. Nathan Kutz, Eric Shea-Brown, Seung-Hwan Lim
  • for: Explores whether transformer models can learn Granger causality in networks with complex nonlinear dynamics at every node, toward causal representation learning in neuroscience.
  • methods: A proof-of-concept study on simulated neural dynamics with known ground-truth connectivity; the cross-attention module of a transformer trained to forecast neuronal population dynamics is used to capture causal relationships among neurons.
  • results: The cross-attention module captures the causal relationships among neurons with accuracy equal or superior to the most popular Granger causality analysis method.
    Abstract This paper explores the potential of the transformer models for learning Granger causality in networks with complex nonlinear dynamics at every node, as in neurobiological and biophysical networks. Our study primarily focuses on a proof-of-concept investigation based on simulated neural dynamics, for which the ground-truth causality is known through the underlying connectivity matrix. For transformer models trained to forecast neuronal population dynamics, we show that the cross attention module effectively captures the causal relationship among neurons, with an accuracy equal or superior to that for the most popular Granger causality analysis method. While we acknowledge that real-world neurobiology data will bring further challenges, including dynamic connectivity and unobserved variability, this research offers an encouraging preliminary glimpse into the utility of the transformer model for causal representation learning in neuroscience.
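A toy sketch, under my own simplifying assumptions, of turning attention weights over neuron tokens into an N-by-N score matrix that can be compared against ground-truth connectivity; the paper's actual forecasting architecture and cross-attention setup are not reproduced here.

```python
# Hypothetical sketch: read attention weights as a neuron-to-neuron score matrix.
# Self-attention over neuron tokens stands in for the paper's cross attention.
import torch
import torch.nn as nn

num_neurons, time_steps, embed_dim = 8, 50, 16

# Fake population activity: (batch, time, neurons); embed each neuron's
# time series as one token so attention is taken across neurons.
activity = torch.randn(1, time_steps, num_neurons)
tokens = activity.permute(0, 2, 1)                 # (1, neurons, time)
proj = nn.Linear(time_steps, embed_dim)
emb = proj(tokens)                                 # (1, neurons, embed_dim)

attn = nn.MultiheadAttention(embed_dim, num_heads=2, batch_first=True)
_, weights = attn(emb, emb, emb, need_weights=True)

scores = weights.squeeze(0).detach()               # (neurons, neurons)
print(scores.shape)   # row i ~ how much neuron i attends to each neuron j
```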

Concept Matching: Clustering-based Federated Continual Learning

  • paper_url: http://arxiv.org/abs/2311.06921
  • repo_url: None
  • paper_authors: Xiaopeng Jiang, Cristian Borcea
  • for: Addressing catastrophic forgetting and client interference in Federated Continual Learning (FCL), which combines Federated Learning (FL) and Continual Learning (CL), to improve model accuracy.
  • methods: Concept Matching (CM), a clustering-based framework that groups client models into concept model clusters and builds different global models to capture different concepts over time; each client fine-tunes the concept model best matching its current data, and the server aggregates models within each cluster and updates the corresponding global concept model via a novel server concept matching algorithm.
  • results: CM outperforms state-of-the-art systems, scales well with the number of clients and the model size, and is flexible with respect to the clustering, aggregation, and concept matching algorithms used.
    Abstract Federated Continual Learning (FCL) has emerged as a promising paradigm that combines Federated Learning (FL) and Continual Learning (CL). To achieve good model accuracy, FCL needs to tackle catastrophic forgetting due to concept drift over time in CL, and to overcome the potential interference among clients in FL. We propose Concept Matching (CM), a clustering-based framework for FCL to address these challenges. The CM framework groups the client models into concept model clusters, and then builds different global models to capture different concepts in FL over time. In each round, the server sends the global concept models to the clients. To avoid catastrophic forgetting, each client selects the concept model best-matching the concept of the current data for further fine-tuning. To avoid interference among client models with different concepts, the server clusters the models representing the same concept, aggregates the model weights in each cluster, and updates the global concept model with the cluster model of the same concept. Since the server does not know the concepts captured by the aggregated cluster models, we propose a novel server concept matching algorithm that effectively updates a global concept model with a matching cluster model. The CM framework provides flexibility to use different clustering, aggregation, and concept matching algorithms. The evaluation demonstrates that CM outperforms state-of-the-art systems and scales well with the number of clients and the model size.
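A minimal, hypothetical sketch of the server-side step of such a scheme, clustering flattened client model weights and averaging within each cluster; the clustering choice, number of concepts, and matching rule are illustrative assumptions, and the paper's server concept matching algorithm is not reproduced.

```python
# Illustrative server step: cluster client models and aggregate per cluster.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
num_clients, num_params, num_concepts = 12, 100, 3

# Stand-in for flattened client model weights uploaded in one round.
client_weights = rng.normal(size=(num_clients, num_params))

km = KMeans(n_clusters=num_concepts, n_init=10, random_state=0)
labels = km.fit_predict(client_weights)

# One aggregated "concept model" per cluster (simple unweighted average).
concept_models = {
    c: client_weights[labels == c].mean(axis=0) for c in range(num_concepts)
}
for c, w in concept_models.items():
    print(f"concept {c}: {np.sum(labels == c)} clients, ||w|| = {np.linalg.norm(w):.2f}")
```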

Resource-Aware Hierarchical Federated Learning for Video Caching in Wireless Networks

  • paper_url: http://arxiv.org/abs/2311.06918
  • repo_url: None
  • paper_authors: Md Ferdous Pervej, Andreas F Molisch
  • for: Relieving backhaul traffic congestion and improving network performance by locally caching popular video content.
  • methods: A resource-aware hierarchical federated learning (RawHFL) solution that predicts users' future content requests under sporadic requests, optimizing client selection, local training rounds, and CPU frequencies under delay, energy, and radio resource constraints.
  • results: Significantly outperforms the considered baselines in terms of prediction accuracy and total energy expenditure.
    Abstract Video caching can significantly improve backhaul traffic congestion by locally storing the popular content that users frequently request. A privacy-preserving method is desirable to learn how users' demands change over time. As such, this paper proposes a novel resource-aware hierarchical federated learning (RawHFL) solution to predict users' future content requests under the realistic assumptions that content requests are sporadic and users' datasets can only be updated based on the requested content's information. Considering a partial client participation case, we first derive the upper bound of the global gradient norm that depends on the clients' local training rounds and the successful reception of their accumulated gradients over the wireless links. Under delay, energy and radio resource constraints, we then optimize client selection and their local rounds and central processing unit (CPU) frequencies to minimize a weighted utility function that facilitates RawHFL's convergence in an energy-efficient way. Our simulation results show that the proposed solution significantly outperforms the considered baselines in terms of prediction accuracy and total energy expenditure.

EPIM: Efficient Processing-In-Memory Accelerators based on Epitome

  • paper_url: http://arxiv.org/abs/2311.07620
  • repo_url: None
  • paper_authors: Chenyu Wang, Zhen Dong, Daquan Zhou, Zhenhua Zhu, Yu Wang, Jiashi Feng, Kurt Keutzer
  • for: Running large-scale neural networks on Processing-In-Memory (PIM) accelerators, addressing the challenges posed by their constrained on-chip memory capacity.
  • methods: Introduces Epitome, a lightweight neural operator offering convolution-like functionality, to craft memory-efficient CNN operators for PIM accelerators (EPIM); on the software side, evaluates epitomes' latency and energy on PIM, proposes a PIM-aware layer-wise design method, and applies epitome-aware quantization; on the hardware side, modifies the PIM datapath to accommodate epitomes and implements a feature-map reuse technique to reduce computation cost.
  • results: The 3-bit quantized EPIM-ResNet50 attains 71.59% top-1 accuracy on ImageNet while reducing crossbar areas by 30.65 times, surpassing state-of-the-art pruning methods on PIM.
    Abstract The exploration of Processing-In-Memory (PIM) accelerators has garnered significant attention within the research community. However, the utilization of large-scale neural networks on Processing-In-Memory (PIM) accelerators encounters challenges due to constrained on-chip memory capacity. To tackle this issue, current works explore model compression algorithms to reduce the size of Convolutional Neural Networks (CNNs). Most of these algorithms either aim to represent neural operators with reduced-size parameters (e.g., quantization) or search for the best combinations of neural operators (e.g., neural architecture search). Designing neural operators to align with PIM accelerators' specifications is an area that warrants further study. In this paper, we introduce the Epitome, a lightweight neural operator offering convolution-like functionality, to craft memory-efficient CNN operators for PIM accelerators (EPIM). On the software side, we evaluate epitomes' latency and energy on PIM accelerators and introduce a PIM-aware layer-wise design method to enhance their hardware efficiency. We apply epitome-aware quantization to further reduce the size of epitomes. On the hardware side, we modify the datapath of current PIM accelerators to accommodate epitomes and implement a feature map reuse technique to reduce computation cost. Experimental results reveal that our 3-bit quantized EPIM-ResNet50 attains 71.59% top-1 accuracy on ImageNet, reducing crossbar areas by 30.65 times. EPIM surpasses the state-of-the-art pruning methods on PIM.

An Application of Vector Autoregressive Model for Analyzing the Impact of Weather And Nearby Traffic Flow On The Traffic Volume

  • paper_url: http://arxiv.org/abs/2311.06894
  • repo_url: None
  • paper_authors: Anh Thi-Hoang Nguyen, Dung Ha Nguyen, Trong-Hop Do
  • for: Predicting the traffic flow at one road segment based on nearby traffic volume and weather conditions.
  • methods: A VAR(36) model with a time trend and constant, trained on hourly historical weather and traffic flow data, is used for forecasting.
  • results: The analysis reveals the impact of weather conditions and nearby traffic volume on traffic flow; the model achieves an average RMSE of 565.0768111, and variables found to be unhelpful for forecasting can be dropped to simplify data collection.
    Abstract This paper aims to predict the traffic flow at one road segment based on nearby traffic volume and weather conditions. Our team also discover the impact of weather conditions and nearby traffic volume on the traffic flow at a target point. The analysis results will help solve the problem of traffic flow prediction and develop an optimal transport network with efficient traffic movement and minimal traffic congestion. Hourly historical weather and traffic flow data are selected to solve this problem. This paper uses model VAR(36) with time trend and constant to train the dataset and forecast. With an RMSE of 565.0768111 on average, the model is considered appropriate although some statistical tests implies that the residuals are unstable and non-normal. Also, this paper points out some variables that are not useful in forecasting, which helps simplify the data-collecting process when building the forecasting system.
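A small sketch of fitting and forecasting a VAR of this kind with statsmodels; the column names, toy data, and forecast horizon are illustrative, while the lag order and deterministic terms (constant plus linear time trend) follow the paper's VAR(36) setup.

```python
# Illustrative VAR(36) fit/forecast with statsmodels (toy data stands in for
# the hourly traffic-volume and weather series used in the paper).
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
hours = 2000
df = pd.DataFrame({
    "target_volume": rng.normal(size=hours).cumsum(),
    "nearby_volume": rng.normal(size=hours).cumsum(),
    "temperature":   rng.normal(size=hours),
})

model = VAR(df)
res = model.fit(36, trend="ct")          # 36 lags, constant + linear time trend
forecast = res.forecast(df.values[-res.k_ar:], steps=24)   # next 24 hours
print(forecast.shape)                    # (24, 3)
```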

Preserving Node-level Privacy in Graph Neural Networks

  • paper_url: http://arxiv.org/abs/2311.06888
  • repo_url: None
  • paper_authors: Zihang Xiang, Tianhao Wang, Di Wang
  • for: Addressing node-level privacy in Graph Neural Networks (GNNs), where existing differential privacy (DP) protocols hardly apply to the message-passing mechanism.
  • methods: The protocol combines two main components: 1) a sampling routine called HeterPoisson, which uses a specialized node sampling strategy and a series of tailored operations to generate a batch of sub-graphs with desired properties, and 2) a randomization routine that uses symmetric multivariate Laplace (SML) noise instead of the commonly used Gaussian noise.
  • results: The privacy accounting shows this combination provides a non-trivial privacy guarantee; experiments on five real-world datasets show significant advantages over existing baselines, especially in the high-privacy regime, and membership inference attacks and privacy auditing confirm the protocol's privacy integrity.
    Abstract Differential privacy (DP) has seen immense applications in learning on tabular, image, and sequential data where instance-level privacy is concerned. In learning on graphs, contrastingly, works on node-level privacy are highly sparse. Challenges arise as existing DP protocols hardly apply to the message-passing mechanism in Graph Neural Networks (GNNs). In this study, we propose a solution that specifically addresses the issue of node-level privacy. Our protocol consists of two main components: 1) a sampling routine called HeterPoisson, which employs a specialized node sampling strategy and a series of tailored operations to generate a batch of sub-graphs with desired properties, and 2) a randomization routine that utilizes symmetric multivariate Laplace (SML) noise instead of the commonly used Gaussian noise. Our privacy accounting shows this particular combination provides a non-trivial privacy guarantee. In addition, our protocol enables GNN learning with good performance, as demonstrated by experiments on five real-world datasets; compared with existing baselines, our method shows significant advantages, especially in the high privacy regime. Experimentally, we also 1) perform membership inference attacks against our protocol and 2) apply privacy audit techniques to confirm our protocol's privacy integrity. In the sequel, we present a study on a seemingly appealing approach \cite{sajadmanesh2023gap} (USENIX'23) that protects node-level privacy via differentially private node/instance embeddings. Unfortunately, such work has fundamental privacy flaws, which are identified through a thorough case study. More importantly, we prove an impossibility result of achieving both (strong) privacy and (acceptable) utility through private instance embedding. The implication is that such an approach has intrinsic utility barriers when enforcing differential privacy.
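For reference, the block below shows one standard way to draw symmetric multivariate Laplace noise, as a Gaussian scaled by the square root of an exponential variate; how the scale is calibrated to the sensitivity and privacy budget is the paper's contribution and is not reproduced, so the `scale` value here is a placeholder.

```python
# One standard construction of symmetric multivariate Laplace (SML) noise:
# z = sqrt(W) * g, with W ~ Exp(1) and g ~ N(0, scale^2 * I).
# The privacy calibration of `scale` follows the paper and is omitted here.
import numpy as np

def sample_sml(dim: int, scale: float, size: int, rng=None) -> np.ndarray:
    rng = rng or np.random.default_rng()
    w = rng.exponential(scale=1.0, size=(size, 1))          # mixing variable
    g = rng.normal(loc=0.0, scale=scale, size=(size, dim))  # Gaussian factor
    return np.sqrt(w) * g

noise = sample_sml(dim=16, scale=0.5, size=4, rng=np.random.default_rng(0))
print(noise.shape)        # (4, 16): 4 draws of 16-dimensional SML noise
# Compared with Gaussian noise of the same scale, the tails are heavier.
```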

pFedES: Model Heterogeneous Personalized Federated Learning with Feature Extractor Sharing

  • paper_url: http://arxiv.org/abs/2311.06879
  • repo_url: None
  • paper_authors: Liping Yi, Han Yu, Gang Wang, Xiaoguang Liu
  • for: Proposes a model-heterogeneous personalized federated learning method based on feature extractor sharing (pFedES), allowing each data owner (FL client) to train a personalized local model under its own data distribution, system resources, and model structure constraints.
  • methods: A small homogeneous feature extractor is incorporated into each client's heterogeneous local model; clients train via an iterative learning method to exchange global generalized knowledge and local personalized knowledge, and upload only the small extractors to the FL server for aggregation. Convergence over wall-to-wall time is proven.
  • results: On two real-world datasets against six state-of-the-art methods, pFedES builds the most accurate models with low communication and computation costs; compared with the best baseline, it achieves 1.61% higher test accuracy while reducing communication and computation costs by 99.6% and 82.9%, respectively.
    Abstract As a privacy-preserving collaborative machine learning paradigm, federated learning (FL) has attracted significant interest from academia and the industry alike. To allow each data owner (a.k.a., FL clients) to train a heterogeneous and personalized local model based on its local data distribution, system resources and requirements on model structure, the field of model-heterogeneous personalized federated learning (MHPFL) has emerged. Existing MHPFL approaches either rely on the availability of a public dataset with special characteristics to facilitate knowledge transfer, incur high computation and communication costs, or face potential model leakage risks. To address these limitations, we propose a model-heterogeneous personalized Federated learning approach based on feature Extractor Sharing (pFedES). It incorporates a small homogeneous feature extractor into each client's heterogeneous local model. Clients train them via the proposed iterative learning method to enable the exchange of global generalized knowledge and local personalized knowledge. The small local homogeneous extractors produced after local training are uploaded to the FL server and for aggregation to facilitate easy knowledge sharing among clients. We theoretically prove that pFedES can converge over wall-to-wall time. Extensive experiments on two real-world datasets against six state-of-the-art methods demonstrate that pFedES builds the most accurate model, while incurring low communication and computation costs. Compared with the best-performing baseline, it achieves 1.61% higher test accuracy, while reducing communication and computation costs by 99.6% and 82.9%, respectively.
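A minimal, hypothetical sketch of the structural idea, a small shared homogeneous extractor paired with each client's heterogeneous head, with only the extractors averaged on the server; layer sizes, client models, and the training schedule are illustrative assumptions, not the paper's exact design.

```python
# Illustrative pFedES-style structure: shared small extractor + heterogeneous head.
import copy
import torch
import torch.nn as nn

class SharedExtractor(nn.Module):       # homogeneous across clients
    def __init__(self, in_dim=32, out_dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())
    def forward(self, x):
        return self.net(x)

def make_client(head_width: int):
    """Each client pairs the shared extractor with its own heterogeneous head."""
    return nn.ModuleDict({
        "extractor": SharedExtractor(),
        "head": nn.Sequential(nn.Linear(16, head_width), nn.ReLU(),
                              nn.Linear(head_width, 10)),
    })

clients = [make_client(w) for w in (8, 32, 64)]   # heterogeneous local models

# Server step: average only the extractor parameters (heads stay local).
def aggregate_extractors(clients):
    avg = copy.deepcopy(clients[0]["extractor"].state_dict())
    for key in avg:
        avg[key] = torch.stack(
            [c["extractor"].state_dict()[key] for c in clients]).mean(dim=0)
    for c in clients:
        c["extractor"].load_state_dict(avg)

aggregate_extractors(clients)
x = torch.randn(4, 32)
print([c["head"](c["extractor"](x)).shape for c in clients])
```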

Unified machine learning tasks and datasets for enhancing renewable energy

  • paper_url: http://arxiv.org/abs/2311.06876
  • repo_url: None
  • paper_authors: Arsam Aryandoust, Thomas Rigoni, Francesco di Stefano, Anthony Patt
  • for: Explores how multi-tasking machine learning models can help solve tasks related to enhancing the renewable energy transition and mitigating climate change.
  • methods: Multi-tasking ML models capable of few-shot and zero-shot learning, and over-parameterized models capable of zero-loss training, to address problems with little to no training data.
  • results: Introduces ETT-17 (Energy Transition Tasks-17), a collection of 17 datasets from six application domains related to enhancing renewable energy, including out-of-distribution validation and testing data; all tasks and datasets are unified so they can be solved with a single multi-tasking ML model, and the paper analyzes dataset dimensions, design requirements for over-parameterized models, dataset scores describing important properties, and performance benchmarks.
    Abstract Multi-tasking machine learning (ML) models exhibit prediction abilities in domains with little to no training data available (few-shot and zero-shot learning). Over-parameterized ML models are further capable of zero-loss training and near-optimal generalization performance. An open research question is, how these novel paradigms contribute to solving tasks related to enhancing the renewable energy transition and mitigating climate change. A collection of unified ML tasks and datasets from this domain can largely facilitate the development and empirical testing of such models, but is currently missing. Here, we introduce the ETT-17 (Energy Transition Tasks-17), a collection of 17 datasets from six different application domains related to enhancing renewable energy, including out-of-distribution validation and testing data. We unify all tasks and datasets, such that they can be solved using a single multi-tasking ML model. We further analyse the dimensions of each dataset; investigate what they require for designing over-parameterized models; introduce a set of dataset scores that describe important properties of each task and dataset; and provide performance benchmarks.

Inference and Interference: The Role of Clipping, Pruning and Loss Landscapes in Differentially Private Stochastic Gradient Descent

  • paper_url: http://arxiv.org/abs/2311.06839
  • repo_url: None
  • paper_authors: Lauren Watson, Eric Gan, Mohan Dantam, Baharan Mirzasoleiman, Rik Sarkar
  • for: Studies the training and test performance of differentially private stochastic gradient descent (DP-SGD) on large neural networks and compares it in detail with ordinary SGD.
  • methods: Analyzes the differing behavior of DP-SGD and SGD separately in early and late epochs, finding that DP-SGD makes slower progress early on but that the later stages determine the end result; a separate analysis of the clipping and noise-addition steps shows that clipping has a larger impact than noise, since gradient descent can recover from noise-induced errors when gradients are not clipped.
  • results: Theoretical analysis and extensive experiments show that these effects are amplified in higher dimensions, that dimension reduction via magnitude pruning is suitable in this regard, and that heavy pruning can improve the test accuracy of DP-SGD.
    Abstract Differentially private stochastic gradient descent (DP-SGD) is known to have poorer training and test performance on large neural networks, compared to ordinary stochastic gradient descent (SGD). In this paper, we perform a detailed study and comparison of the two processes and unveil several new insights. By comparing the behavior of the two processes separately in early and late epochs, we find that while DP-SGD makes slower progress in early stages, it is the behavior in the later stages that determines the end result. This separate analysis of the clipping and noise addition steps of DP-SGD shows that while noise introduces errors to the process, gradient descent can recover from these errors when it is not clipped, and clipping appears to have a larger impact than noise. These effects are amplified in higher dimensions (large neural networks), where the loss basin occupies a lower dimensional space. We argue theoretically and using extensive experiments that magnitude pruning can be a suitable dimension reduction technique in this regard, and find that heavy pruning can improve the test accuracy of DPSGD.
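For readers unfamiliar with the two steps being compared, here is a minimal per-sample-clipping DP-SGD step plus a magnitude-pruning mask, written from the standard recipe rather than the paper's code; the clip norm, noise multiplier, pruning ratio, and tiny model are illustrative assumptions.

```python
# Standard-recipe DP-SGD step (per-sample clipping + Gaussian noise) and a
# magnitude-pruning mask; hyperparameters and model are illustrative only.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 1)
loss_fn = nn.MSELoss()
clip_norm, noise_multiplier, lr = 1.0, 1.0, 0.1

x, y = torch.randn(8, 10), torch.randn(8, 1)

# Per-sample gradients via microbatches of size 1, clipped then summed.
summed = [torch.zeros_like(p) for p in model.parameters()]
for i in range(x.shape[0]):
    model.zero_grad()
    loss_fn(model(x[i:i+1]), y[i:i+1]).backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    factor = min(1.0, clip_norm / (total_norm + 1e-6))
    for s, g in zip(summed, grads):
        s += factor * g

# Add calibrated Gaussian noise, average, and take the SGD step.
with torch.no_grad():
    for p, s in zip(model.parameters(), summed):
        noisy = s + noise_multiplier * clip_norm * torch.randn_like(s)
        p -= lr * noisy / x.shape[0]

# Magnitude pruning: zero out the 50% smallest-magnitude weights.
with torch.no_grad():
    w = model.weight
    threshold = w.abs().flatten().kthvalue(w.numel() // 2).values
    w *= (w.abs() > threshold).float()
print(model.weight)
```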

GraNNDis: Efficient Unified Distributed Training Framework for Deep GNNs on Large Clusters

  • paper_url: http://arxiv.org/abs/2311.06837
  • repo_url: None
  • paper_authors: Jaeyong Song, Hongsun Jang, Jaewon Jung, Youngsok Kim, Jinho Lee
  • for: Improving the efficiency of training deep GNNs on large graphs across large clusters.
  • methods: Proposes three new techniques: shared preloading, expansion-aware sampling, and cooperative batching.
  • results: Experiments on a multi-server, multi-GPU cluster show that GraNNDis provides superior speedup over state-of-the-art distributed GNN training frameworks.
    Abstract Graph neural networks (GNNs) are one of the most rapidly growing fields within deep learning. According to the growth in the dataset and the model size used for GNNs, an important problem is that it becomes nearly impossible to keep the whole network on GPU memory. Among numerous attempts, distributed training is one popular approach to address the problem. However, due to the nature of GNNs, existing distributed approaches suffer from poor scalability, mainly due to the slow external server communications. In this paper, we propose GraNNDis, an efficient distributed GNN training framework for training GNNs on large graphs and deep layers. GraNNDis introduces three new techniques. First, shared preloading provides a training structure for a cluster of multi-GPU servers. We suggest server-wise preloading of essential vertex dependencies to reduce the low-bandwidth external server communications. Second, we present expansion-aware sampling. Because shared preloading alone has limitations because of the neighbor explosion, expansion-aware sampling reduces vertex dependencies that span across server boundaries. Third, we propose cooperative batching to create a unified framework for full-graph and minibatch training. It significantly reduces redundant memory usage in mini-batch training. From this, GraNNDis enables a reasonable trade-off between full-graph and mini-batch training through unification especially when the entire graph does not fit into the GPU memory. With experiments conducted on a multi-server/multi-GPU cluster, we show that GraNNDis provides superior speedup over the state-of-the-art distributed GNN training frameworks.

Towards Continual Reinforcement Learning for Quadruped Robots

  • paper_url: http://arxiv.org/abs/2311.06828
  • repo_url: None
  • paper_authors: Giovanni Minelli, Vassilis Vassiliades
  • for: Enhancing the adaptability and performance of quadruped robots in real-world scenarios by enabling them to continue learning after deployment.
  • methods: Two continual learning scenarios in which the robot is sequentially trained on different environments while its performance is simultaneously evaluated across all of them.
  • results: The study sheds light on the extent of forward and backward skill transfer and on the degree to which the robot forgets previously acquired skills, informing how to improve adaptability and performance in real-world deployments.
    Abstract Quadruped robots have emerged as an evolving technology that currently leverages simulators to develop a robust controller capable of functioning in the real-world without the need for further training. However, since it is impossible to predict all possible real-world situations, our research explores the possibility of enabling them to continue learning even after their deployment. To this end, we designed two continual learning scenarios, sequentially training the robot on different environments while simultaneously evaluating its performance across all of them. Our approach sheds light on the extent of both forward and backward skill transfer, as well as the degree to which the robot might forget previously acquired skills. By addressing these factors, we hope to enhance the adaptability and performance of quadruped robots in real-world scenarios.

A Comprehensive Survey On Client Selections in Federated Learning

  • paper_url: http://arxiv.org/abs/2311.06801
  • repo_url: None
  • paper_authors: Ala Gouissem, Zina Chkirbene, Ridha Hamila
  • for: A survey of client selection techniques in federated learning, covering their strengths and limitations as well as the challenges and open issues that need to be addressed.
  • methods: Covers conventional techniques such as random selection, performance-aware selection, and resource-aware selection for resource-constrained and heterogeneous networks.
  • results: Discusses the use of client selection for model security enhancement, and the open issues and challenges of client selection in dynamic, constrained, and heterogeneous networks.
    Abstract Federated Learning (FL) is a rapidly growing field in machine learning that allows data to be trained across multiple decentralized devices. The selection of clients to participate in the training process is a critical factor for the performance of the overall system. In this survey, we provide a comprehensive overview of the state-of-the-art client selection techniques in FL, including their strengths and limitations, as well as the challenges and open issues that need to be addressed. We cover conventional selection techniques such as random selection where all or partial random of clients is used for the trained. We also cover performance-aware selections and as well as resource-aware selections for resource-constrained networks and heterogeneous networks. We also discuss the usage of client selection in model security enhancement. Lastly, we discuss open issues and challenges related to clients selection in dynamic constrained, and heterogeneous networks.

Learning Predictive Safety Filter via Decomposition of Robust Invariant Set

  • paper_url: http://arxiv.org/abs/2311.06769
  • repo_url: None
  • paper_authors: Zeyang Li, Chuxiong Hu, Weiye Zhao, Changliu Liu
  • for: Ensuring the safety of nonlinear systems under model uncertainty and external disturbances, which is crucial for real-world control tasks.
  • methods: A framework that bridges robust model predictive control (RMPC) and reinforcement learning (RL) to balance safety guarantees and scalability: the robust invariant set is decomposed into a target set and a reach-avoid set, a policy iteration approach for robust reach-avoid problems with monotone convergence is proposed, an adversarial actor-critic deep RL algorithm synthesizes reach-avoid, disturbance, and value networks, and a second-order cone programming (SOCP) approach performs online verification via system level synthesis.
  • results: The resulting safety filter requires much lower computational complexity than RMPC while retaining a persistent robust safety guarantee, as illustrated by a numerical example.
    Abstract Ensuring safety of nonlinear systems under model uncertainty and external disturbances is crucial, especially for real-world control tasks. Predictive methods such as robust model predictive control (RMPC) require solving nonconvex optimization problems online, which leads to high computational burden and poor scalability. Reinforcement learning (RL) works well with complex systems, but pays the price of losing rigorous safety guarantee. This paper presents a theoretical framework that bridges the advantages of both RMPC and RL to synthesize safety filters for nonlinear systems with state- and action-dependent uncertainty. We decompose the robust invariant set (RIS) into two parts: a target set that aligns with terminal region design of RMPC, and a reach-avoid set that accounts for the rest of RIS. We propose a policy iteration approach for robust reach-avoid problems and establish its monotone convergence. This method sets the stage for an adversarial actor-critic deep RL algorithm, which simultaneously synthesizes a reach-avoid policy network, a disturbance policy network, and a reach-avoid value network. The learned reach-avoid policy network is utilized to generate nominal trajectories for online verification, which filters potentially unsafe actions that may drive the system into unsafe regions when worst-case disturbances are applied. We formulate a second-order cone programming (SOCP) approach for online verification using system level synthesis, which optimizes for the worst-case reach-avoid value of any possible trajectories. The proposed safety filter requires much lower computational complexity than RMPC and still enjoys persistent robust safety guarantee. The effectiveness of our method is illustrated through a numerical example.

Personalized Federated Learning via ADMM with Moreau Envelope

  • paper_url: http://arxiv.org/abs/2311.06756
  • repo_url: https://github.com/zsk66/flame-master
  • paper_authors: Shengkun Zhu, Jinshan Zeng, Sheng Wang, Yuan Sun, Zhiyong Peng
  • for: Proposes a personalized federated learning (PFL) method to address poor convergence on heterogeneous data.
  • methods: An alternating direction method of multipliers (ADMM) for training PFL models with Moreau envelopes (FLAME), achieving a sublinear convergence rate under the relatively weak assumption of gradient Lipschitz continuity; since ADMM is gradient-free, FLAME reduces hyperparameter tuning, in particular avoiding adjustment of the learning rate when training the global model. A biased client selection strategy is also proposed to speed up convergence.
  • results: FLAME outperforms state-of-the-art methods in model performance when trained on heterogeneous data and achieves an average 3.75x speedup in communication efficiency over the baselines; the biased client selection strategy accelerates the convergence of both personalized and global models.
    Abstract Personalized federated learning (PFL) is an approach proposed to address the issue of poor convergence on heterogeneous data. However, most existing PFL frameworks require strong assumptions for convergence. In this paper, we propose an alternating direction method of multipliers (ADMM) for training PFL models with Moreau envelope (FLAME), which achieves a sublinear convergence rate, relying on the relatively weak assumption of gradient Lipschitz continuity. Moreover, due to the gradient-free nature of ADMM, FLAME alleviates the need for hyperparameter tuning, particularly in avoiding the adjustment of the learning rate when training the global model. In addition, we propose a biased client selection strategy to expedite the convergence of training of PFL models. Our theoretical analysis establishes the global convergence under both unbiased and biased client selection strategies. Our experiments validate that FLAME, when trained on heterogeneous data, outperforms state-of-the-art methods in terms of model performance. Regarding communication efficiency, it exhibits an average speedup of 3.75x compared to the baselines. Furthermore, experimental results validate that the biased client selection strategy speeds up the convergence of both personalized and global models.
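For orientation, the block below writes out the Moreau-envelope form of personalized FL and the generic consensus ADMM splitting it suggests; this is the standard template such methods build on, not FLAME's exact updates.

```latex
% Moreau-envelope personalized FL objective (standard template):
\min_{\theta}\ \frac{1}{N}\sum_{i=1}^{N} F_i(\theta),
\qquad
F_i(\theta) = \min_{w_i}\ \Bigl\{ f_i(w_i) + \frac{\lambda}{2}\,\|w_i - \theta\|^2 \Bigr\},
% where w_i is client i's personalized model and \theta the global model.
%
% Generic consensus ADMM with local copies \theta_i = \theta and scaled duals u_i:
\theta_i^{k+1} = \arg\min_{\theta_i}\ \Bigl\{ F_i(\theta_i)
      + \frac{\rho}{2}\,\|\theta_i - \theta^{k} + u_i^{k}\|^2 \Bigr\},\qquad
\theta^{k+1} = \frac{1}{N}\sum_{i}\bigl(\theta_i^{k+1} + u_i^{k}\bigr),\qquad
u_i^{k+1} = u_i^{k} + \theta_i^{k+1} - \theta^{k+1}.
```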

Application of a Dense Fusion Attention Network in Fault Diagnosis of Centrifugal Fan

  • paper_url: http://arxiv.org/abs/2311.07614
  • repo_url: None
  • paper_authors: Ruijun Wang, Yuan Liu, Zhixia Fan, Xiaogang Xu, Huijie Wang
  • for: Improving condition monitoring and fault diagnosis of rotating machinery by embedding distributed attention modules into dense connections instead of traditional dense cascading operations.
  • methods: The dense fusion decouples the influence of space and channel on the adaptive recalibration of fault feature weights and forms a fusion attention function, which visualizes the network's diagnosis process and increases the interpretability of model diagnosis.
  • results: Experiments on centrifugal fan fault data show stronger diagnostic performance than other advanced fault diagnosis models, with improved fault feature extraction and noise resistance.
    Abstract Although the deep learning recognition model has been widely used in the condition monitoring of rotating machinery. However, it is still a challenge to understand the correspondence between the structure and function of the model and the diagnosis process. Therefore, this paper discusses embedding distributed attention modules into dense connections instead of traditional dense cascading operations. It not only decouples the influence of space and channel on fault feature adaptive recalibration feature weights, but also forms a fusion attention function. The proposed dense fusion focuses on the visualization of the network diagnosis process, which increases the interpretability of model diagnosis. How to continuously and effectively integrate different functions to enhance the ability to extract fault features and the ability to resist noise is answered. Centrifugal fan fault data is used to verify this network. Experimental results show that the network has stronger diagnostic performance than other advanced fault diagnostic models.

How do Minimum-Norm Shallow Denoisers Look in Function Space?

  • paper_url: http://arxiv.org/abs/2311.06748
  • repo_url: None
  • paper_authors: Chen Zeno, Greg Ongie, Yaniv Blumenfeld, Nir Weinberger, Daniel Soudry
  • for: This paper aims to understand the functions realized by shallow ReLU NN denoisers in the context of interpolation and minimal representation cost.
  • methods: The authors use a theoretical approach to derive closed-form expressions for the NN denoiser functions, and prove their contractivity and generalization properties.
  • results: The authors find that the NN denoiser functions can be decomposed into a sum of simple rank-one piecewise linear interpolations aligned with edges and/or faces connecting training samples, and empirically verify this alignment phenomenon on synthetic data and real images.
    Abstract Neural network (NN) denoisers are an essential building block in many common tasks, ranging from image reconstruction to image generation. However, the success of these models is not well understood from a theoretical perspective. In this paper, we aim to characterize the functions realized by shallow ReLU NN denoisers -- in the common theoretical setting of interpolation (i.e., zero training loss) with a minimal representation cost (i.e., minimal $\ell^2$ norm weights). First, for univariate data, we derive a closed form for the NN denoiser function, find it is contractive toward the clean data points, and prove it generalizes better than the empirical MMSE estimator at a low noise level. Next, for multivariate data, we find the NN denoiser functions in a closed form under various geometric assumptions on the training data: data contained in a low-dimensional subspace, data contained in a union of one-sided rays, or several types of simplexes. These functions decompose into a sum of simple rank-one piecewise linear interpolations aligned with edges and/or faces connecting training samples. We empirically verify this alignment phenomenon on synthetic data and real images.

ReactionT5: a large-scale pre-trained model towards application of limited reaction data

  • paper_url: http://arxiv.org/abs/2311.06708
  • repo_url: https://github.com/sagawatatsuya/ReactionT5
  • paper_authors: Tatsuya Sagawa, Ryosuke Kojima
  • for: Proposes a Transformer-based deep neural network for predicting the outcomes of reactions involving multiple molecules.
  • methods: The model is pre-trained on the large-scale, publicly available Open Reaction Database (ORD) and then fine-tuned for specific reaction prediction tasks.
  • results: The model performs impressively on yield prediction and product prediction even with limited fine-tuning data compared to traditional models; the pre-trained ReactionT5 model is publicly available on the Hugging Face platform.
    Abstract Transformer-based deep neural networks have revolutionized the field of molecular-related prediction tasks by treating molecules as symbolic sequences. These models have been successfully applied in various organic chemical applications by pretraining them with extensive compound libraries and subsequently fine-tuning them with smaller in-house datasets for specific tasks. However, many conventional methods primarily focus on single molecules, with limited exploration of pretraining for reactions involving multiple molecules. In this paper, we propose ReactionT5, a novel model that leverages pretraining on the Open Reaction Database (ORD), a publicly available large-scale resource. We further fine-tune this model for yield prediction and product prediction tasks, demonstrating its impressive performance even with limited fine-tuning data compared to traditional models. The pre-trained ReactionT5 model is publicly accessible on the Hugging Face platform.
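A hypothetical usage sketch of loading a pre-trained T5-style reaction model with the transformers library; the checkpoint below is a generic stand-in and the input format is an assumption, so consult the repository (https://github.com/sagawatatsuya/ReactionT5) for the actual ReactionT5 model identifiers and the expected SMILES prompt conventions.

```python
# Hypothetical sketch of querying a T5-style reaction model for product
# prediction. CHECKPOINT and the prompt format are placeholders: substitute
# the real ReactionT5 checkpoint and formatting from the repository.
from transformers import AutoTokenizer, T5ForConditionalGeneration

CHECKPOINT = "t5-small"   # stand-in model; replace with the ReactionT5 checkpoint

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = T5ForConditionalGeneration.from_pretrained(CHECKPOINT)

# Reactants/reagents as SMILES; the exact prompt format is an assumption.
reaction_input = "REACTANT:CCO.CC(=O)OC(C)=O REAGENT:"
inputs = tokenizer(reaction_input, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```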

Transfer Learning to Detect COVID-19 Coughs with Incremental Addition of Patient Coughs to Healthy People’s Cough Detection Models

  • paper_url: http://arxiv.org/abs/2311.06707
  • repo_url: None
  • paper_authors: Sudip Vhaduri, Seungyeon Paik, Jessica E Huber
  • for: Detecting COVID-19 patients' coughs remotely to help control the rapid spread of the disease.
  • methods: An incremental transfer learning approach that leverages the relationship between healthy people's coughs and COVID-19 patients' coughs, starting from a pre-trained healthy cough detection model and fine-tuning it with a relatively small set of patient coughs.
  • results: Achieves reasonable COVID-19 cough detection accuracy using only a small amount of patient data and a pre-trained healthy cough model, reducing the need for a large patient dataset to train the model.
    Abstract Millions of people have died worldwide from COVID-19. In addition to its high death toll, COVID-19 has led to unbearable suffering for individuals and a huge global burden to the healthcare sector. Therefore, researchers have been trying to develop tools to detect symptoms of this human-transmissible disease remotely to control its rapid spread. Coughing is one of the common symptoms that researchers have been trying to detect objectively from smartphone microphone-sensing. While most of the approaches to detect and track cough symptoms rely on machine learning models developed from a large amount of patient data, this is not possible at the early stage of an outbreak. In this work, we present an incremental transfer learning approach that leverages the relationship between healthy peoples' coughs and COVID-19 patients' coughs to detect COVID-19 coughs with reasonable accuracy using a pre-trained healthy cough detection model and a relatively small set of patient coughs, reducing the need for large patient dataset to train the model. This type of model can be a game changer in detecting the onset of a novel respiratory virus.
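A minimal, hypothetical PyTorch illustration of the incremental transfer-learning idea: start from a model pre-trained on healthy-person coughs, freeze its feature layers, and fine-tune the classifier head on a small set of patient coughs; the architecture, feature dimensions, and data here are stand-ins, not the paper's pipeline.

```python
# Illustrative incremental transfer learning: freeze pretrained feature layers,
# fine-tune only the head on a small patient-cough set (all sizes are assumed).
import torch
import torch.nn as nn

class CoughNet(nn.Module):
    def __init__(self, num_features=40):      # e.g. MFCC-like features (assumed)
        super().__init__()
        self.features = nn.Sequential(nn.Linear(num_features, 64), nn.ReLU(),
                                      nn.Linear(64, 32), nn.ReLU())
        self.head = nn.Linear(32, 1)           # cough vs. non-cough logit

    def forward(self, x):
        return self.head(self.features(x)).squeeze(-1)

model = CoughNet()
# model.load_state_dict(torch.load("healthy_cough_model.pt"))  # hypothetical pretrained weights

# Freeze the pretrained feature extractor; fine-tune only the head.
for p in model.features.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()

# Small incremental batch standing in for newly collected patient coughs.
x_patient = torch.randn(16, 40)
y_patient = torch.ones(16)                     # positive (cough) examples
for _ in range(10):
    optimizer.zero_grad()
    loss = criterion(model(x_patient), y_patient)
    loss.backward()
    optimizer.step()
print(f"fine-tuning loss: {loss.item():.4f}")
```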

A Physics-informed Machine Learning-based Control Method for Nonlinear Dynamic Systems with Highly Noisy Measurements

  • paper_url: http://arxiv.org/abs/2311.07613
  • repo_url: None
  • paper_authors: Mason Ma, Jiajie Wu, Chase Post, Tony Shi, Jingang Yi, Tony Schmitz, Hong Wang
  • for: Proposes a physics-informed machine learning-based control method for nonlinear dynamic systems with highly noisy measurements; existing data-driven control methods that use machine learning for system identification cannot effectively cope with highly noisy measurements, resulting in unstable control performance.
  • methods: Extends current physics-informed machine learning capabilities for modeling nonlinear dynamics with control and integrates them into a model predictive control framework; the approach is validated on two noisy nonlinear dynamic systems, the chaotic Lorenz 3 system and a turning machine tool.
  • results: Analysis of the results shows that the proposed method outperforms state-of-the-art benchmarks in both modeling accuracy and control performance under high-noise conditions.
    Abstract This study presents a physics-informed machine learning-based control method for nonlinear dynamic systems with highly noisy measurements. Existing data-driven control methods that use machine learning for system identification cannot effectively cope with highly noisy measurements, resulting in unstable control performance. To address this challenge, the present study extends current physics-informed machine learning capabilities for modeling nonlinear dynamics with control and integrates them into a model predictive control framework. To demonstrate the capability of the proposed method we test and validate with two noisy nonlinear dynamic systems: the chaotic Lorenz 3 system, and turning machine tool. Analysis of the results illustrate that the proposed method outperforms state-of-the-art benchmarks as measured by both modeling accuracy and control performance for nonlinear dynamic systems under high-noise conditions.