cs.LG - 2023-07-16

Dataset Distillation Meets Provable Subset Selection

  • paper_url: http://arxiv.org/abs/2307.08086
  • repo_url: None
  • paper_authors: Murad Tukan, Alaa Maalouf, Margarita Osadchy
  • for: Improving the effectiveness of dataset distillation while reducing data volume and computational cost.
  • methods: A provable, sampling-based approach initializes the distilled set by identifying important and removing redundant points; during training, an importance-based criterion selects the samples used to update the distilled set instead of uniform random batches.
  • results: Experiments show the method can latch on to existing dataset distillation techniques and improve their performance.
    Abstract Deep learning has grown tremendously over recent years, yielding state-of-the-art results in various fields. However, training such models requires huge amounts of data, increasing the computational time and cost. To address this, dataset distillation was proposed to compress a large training dataset into a smaller synthetic one that retains its performance -- this is usually done by (1) uniformly initializing a synthetic set and (2) iteratively updating/learning this set according to a predefined loss by uniformly sampling instances from the full data. In this paper, we improve both phases of dataset distillation: (1) we present a provable, sampling-based approach for initializing the distilled set by identifying important and removing redundant points in the data, and (2) we further merge the idea of data subset selection with dataset distillation, by training the distilled set on ``important'' sampled points during the training procedure instead of randomly sampling the next batch. To do so, we define the notion of importance based on the relative contribution of instances with respect to two different loss functions, i.e., one for the initialization phase (a kernel fitting function for kernel ridge regression and $K$-means based loss function for any other distillation method), and the relative cross-entropy loss (or any other predefined loss) function for the training phase. Finally, we provide experimental results showing how our method can latch on to existing dataset distillation techniques and improve their performance.
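
A minimal sketch of the sampling-based initialization idea for non-kernel distillation methods, assuming per-class k-means centers stand in for the selected "important" points; the helper name and the sklearn usage are illustrative, not the authors' code:

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_init_distilled_set(X, y, images_per_class):
    """Initialize the distilled set from k-means centers per class instead
    of uniform random sampling (hypothetical helper, not the paper's code)."""
    distilled_X, distilled_y = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        km = KMeans(n_clusters=images_per_class, n_init=10)
        km.fit(Xc.reshape(len(Xc), -1))          # cluster flattened images
        distilled_X.append(km.cluster_centers_.reshape(-1, *X.shape[1:]))
        distilled_y.append(np.full(images_per_class, c))
    return np.concatenate(distilled_X), np.concatenate(distilled_y)

# Usage: X of shape (N, C, H, W); e.g. 10 synthetic images per class.
# X_syn, y_syn = kmeans_init_distilled_set(X, y, images_per_class=10)
```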

POMDP inference and robust solution via deep reinforcement learning: An application to railway optimal maintenance

  • paper_url: http://arxiv.org/abs/2307.08082
  • repo_url: https://github.com/giarcieri/robust-optimal-maintenance-planning-through-reinforcement-learning-and-rllib
  • paper_authors: Giacomo Arcieri, Cyprien Hoelzl, Oliver Schwery, Daniel Straub, Konstantinos G. Papakonstantinou, Eleni Chatzi
  • for: Proposing a combined framework for inference and robust solution of POMDPs via deep RL, to address complex sequential decision-making problems under uncertain environments.
  • methods: Markov Chain Monte Carlo sampling jointly infers the transition and observation model parameters of an action-conditioned hidden Markov model; the POMDP is then solved via deep RL, with the parameter distributions incorporated into the solution through domain randomization.
  • results: The study compares transformers and long short-term memory networks (model-free RL solutions) with a model-based/model-free hybrid approach, and applies the methods to the real-world problem of optimal maintenance planning for railway assets.
    Abstract Partially Observable Markov Decision Processes (POMDPs) can model complex sequential decision-making problems under stochastic and uncertain environments. A main reason hindering their broad adoption in real-world applications is the lack of availability of a suitable POMDP model or a simulator thereof. Available solution algorithms, such as Reinforcement Learning (RL), require the knowledge of the transition dynamics and the observation generating process, which are often unknown and non-trivial to infer. In this work, we propose a combined framework for inference and robust solution of POMDPs via deep RL. First, all transition and observation model parameters are jointly inferred via Markov Chain Monte Carlo sampling of a hidden Markov model, which is conditioned on actions, in order to recover full posterior distributions from the available data. The POMDP with uncertain parameters is then solved via deep RL techniques with the parameter distributions incorporated into the solution via domain randomization, in order to develop solutions that are robust to model uncertainty. As a further contribution, we compare the use of transformers and long short-term memory networks, which constitute model-free RL solutions, with a model-based/model-free hybrid approach. We apply these methods to the real-world problem of optimal maintenance planning for railway assets.
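
A minimal sketch of the domain-randomization step, assuming the MCMC posterior samples are already available; `make_simulator` is a hypothetical factory for the user's POMDP simulator:

```python
import numpy as np

class DomainRandomizedPOMDPEnv:
    """Wrap a parametric POMDP simulator so each episode runs under model
    parameters drawn from the MCMC posterior (sketch, not the repo's code)."""
    def __init__(self, posterior_samples, make_simulator):
        self.posterior_samples = posterior_samples   # list of parameter draws
        self.make_simulator = make_simulator

    def reset(self):
        # New parameter draw per episode: the learned policy must therefore
        # be robust to the posterior model uncertainty.
        theta = self.posterior_samples[np.random.randint(len(self.posterior_samples))]
        self.sim = self.make_simulator(theta)
        return self.sim.reset()

    def step(self, action):
        return self.sim.step(action)
```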

Flexible and efficient spatial extremes emulation via variational autoencoders

  • paper_url: http://arxiv.org/abs/2307.08079
  • repo_url: None
  • paper_authors: Likun Zhang, Xiaoyu Ma, Christopher K. Wikle, Raphaël Huser
  • for: Modeling the complex tail dependence properties of spatial extremes.
  • methods: A new spatial extremes model with flexible, non-stationary dependence properties, integrated into the encoding-decoding structure of a variational autoencoder (extVAE).
  • results: Emulates spatial extremes model output far faster than traditional Bayesian inference and more accurately than many stationary competitors, while better capturing tail behavior.
    Abstract Many real-world processes have complex tail dependence structures that cannot be characterized using classical Gaussian processes. More flexible spatial extremes models such as Gaussian scale mixtures and single-station conditioning models exhibit appealing extremal dependence properties but are often exceedingly prohibitive to fit and simulate from. In this paper, we develop a new spatial extremes model that has flexible and non-stationary dependence properties, and we integrate it in the encoding-decoding structure of a variational autoencoder (extVAE). The extVAE can be used as a spatio-temporal emulator that characterizes the distribution of potential mechanistic model output states and produces outputs that have the same properties as the inputs, especially in the tail. Through extensive simulation studies, we show that our extVAE is vastly more time-efficient than traditional Bayesian inference while also outperforming many spatial extremes models with a stationary dependence structure. To further demonstrate the computational power of the extVAE, we analyze a high-resolution satellite-derived dataset of sea surface temperature in the Red Sea, which includes daily measurements at 16703 grid cells.

MaGNAS: A Mapping-Aware Graph Neural Architecture Search Framework for Heterogeneous MPSoC Deployment

  • paper_url: http://arxiv.org/abs/2307.08065
  • repo_url: None
  • paper_authors: Mohanad Odema, Halima Bouzidi, Hamza Ouarnoughi, Smail Niar, Mohammad Abdullah Al Faruque
  • for: Improving the performance and energy efficiency of vision-based graph inference tasks.
  • methods: MaGNAS, a mapping-aware graph neural architecture search framework that couples a GNN architectural design space with prospective mapping options on a heterogeneous multi-processor system-on-chip (MPSoC), using a two-tier evolutionary search to find the best GNN/mapping pairings.
  • results: Experiments show that on the NVIDIA Xavier AGX platform, MaGNAS delivers a 1.57x latency speedup and 3.38x better energy efficiency for vision inference tasks than the baseline GPU-only deployment, while sustaining an average accuracy drop of only 0.11%.
    Abstract Graph Neural Networks (GNNs) are becoming increasingly popular for vision-based applications due to their intrinsic capacity in modeling structural and contextual relations between various parts of an image frame. On another front, the rising popularity of deep vision-based applications at the edge has been facilitated by the recent advancements in heterogeneous multi-processor Systems on Chips (MPSoCs) that enable inference under real-time, stringent execution requirements. By extension, GNNs employed for vision-based applications must adhere to the same execution requirements. Yet contrary to typical deep neural networks, the irregular flow of graph learning operations poses a challenge to running GNNs on such heterogeneous MPSoC platforms. In this paper, we propose a novel unified design-mapping approach for efficient processing of vision GNN workloads on heterogeneous MPSoC platforms. Particularly, we develop MaGNAS, a mapping-aware Graph Neural Architecture Search framework. MaGNAS proposes a GNN architectural design space coupled with prospective mapping options on a heterogeneous SoC to identify model architectures that maximize on-device resource efficiency. To achieve this, MaGNAS employs a two-tier evolutionary search to identify optimal GNNs and mapping pairings that yield the best performance trade-offs. Through designing a supernet derived from the recent Vision GNN (ViG) architecture, we conducted experiments on four (04) state-of-the-art vision datasets using both (i) a real hardware SoC platform (NVIDIA Xavier AGX) and (ii) a performance/cost model simulator for DNN accelerators. Our experimental results demonstrate that MaGNAS is able to provide 1.57x latency speedup and is 3.38x more energy-efficient for several vision datasets executed on the Xavier MPSoC vs. the GPU-only deployment while sustaining an average 0.11% accuracy reduction from the baseline.
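
A generic sketch of the kind of evolutionary loop such a two-tier search builds on; `fitness` and `mutate` over (architecture, mapping) candidates are user-supplied placeholders, not MaGNAS's actual operators:

```python
import random

def evolutionary_search(init_population, fitness, mutate, generations=50, elite=8):
    """Keep the fittest (architecture, mapping) pairs each generation and
    refill the population with mutated copies (illustrative sketch only)."""
    population = list(init_population)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:elite]
        children = [mutate(random.choice(parents))
                    for _ in range(len(population) - elite)]
        population = parents + children
    return max(population, key=fitness)
```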

Fast Quantum Algorithm for Attention Computation

  • paper_url: http://arxiv.org/abs/2307.08045
  • repo_url: None
  • paper_authors: Yeqi Gao, Zhao Song, Xin Yang, Ruizhe Zhang
  • for: Speeding up the attention computation at the heart of large language models (LLMs), whose strong performance across NLP tasks comes at a growing computational cost.
  • methods: Grover's Search algorithm is used to compute a sparse attention matrix efficiently.
  • results: The quantum algorithm achieves a polynomial speedup over the classical method, and the output attention matrix exhibits an extra low-rank structure that can help derive faster LLM training algorithms.
    Abstract Large language models (LLMs) have demonstrated exceptional performance across a wide range of tasks. These models, powered by advanced deep learning techniques, have revolutionized the field of natural language processing (NLP) and have achieved remarkable results in various language-related tasks. LLMs have excelled in tasks such as machine translation, sentiment analysis, question answering, text generation, text classification, language modeling, and more. They have proven to be highly effective in capturing complex linguistic patterns, understanding context, and generating coherent and contextually relevant text. The attention scheme plays a crucial role in the architecture of large language models (LLMs). It is a fundamental component that enables the model to capture and utilize contextual information during language processing tasks effectively. Making the attention scheme computation faster is one of the central questions to speed up the LLMs computation. It is well-known that quantum machine has certain computational advantages compared to the classical machine. However, it is currently unknown whether quantum computing can aid in LLM. In this work, we focus on utilizing Grover's Search algorithm to compute a sparse attention computation matrix efficiently. We achieve a polynomial quantum speed-up over the classical method. Moreover, the attention matrix outputted by our quantum algorithm exhibits an extra low-rank structure that will be useful in obtaining a faster training algorithm for LLMs. Additionally, we present a detailed analysis of the algorithm's error analysis and time complexity within the context of computing the attention matrix.

Towards Flexible Time-to-event Modeling: Optimizing Neural Networks via Rank Regression

  • paper_url: http://arxiv.org/abs/2307.08044
  • repo_url: https://github.com/teboozas/dart_ecai23
  • paper_authors: Hyunjun Lee, Junhyun Lee, Taehwa Choi, Jaewoo Kang, Sangbum Choi
  • for: Predicting the time of occurrence of an event (time-to-event / survival analysis) from censored data.
  • methods: DART, a deep learning model for accelerated failure time (AFT) modeling, optimized with an objective based on Gehan's rank statistic; the approach is semiparametric and imposes no distributional assumptions on the survival time.
  • results: Significant improvements on multiple benchmark datasets, without additional hyperparameters or complex model architectures.
    Abstract Time-to-event analysis, also known as survival analysis, aims to predict the time of occurrence of an event, given a set of features. One of the major challenges in this area is dealing with censored data, which can make learning algorithms more complex. Traditional methods such as Cox's proportional hazards model and the accelerated failure time (AFT) model have been popular in this field, but they often require assumptions such as proportional hazards and linearity. In particular, the AFT models often require pre-specified parametric distributional assumptions. To improve predictive performance and alleviate strict assumptions, there have been many deep learning approaches for hazard-based models in recent years. However, representation learning for AFT has not been widely explored in the neural network literature, despite its simplicity and interpretability in comparison to hazard-focused methods. In this work, we introduce the Deep AFT Rank-regression model for Time-to-event prediction (DART). This model uses an objective function based on Gehan's rank statistic, which is efficient and reliable for representation learning. On top of eliminating the requirement to establish a baseline event time distribution, DART retains the advantages of directly predicting event time in standard AFT models. The proposed method is a semiparametric approach to AFT modeling that does not impose any distributional assumptions on the survival time distribution. This also eliminates the need for additional hyperparameters or complex model architectures, unlike existing neural network-based AFT models. Through quantitative analysis on various benchmark datasets, we have shown that DART has significant potential for modeling high-throughput censored time-to-event data.
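
A sketch of a Gehan-type rank loss in PyTorch, under my reading of the objective (DART's exact formulation may differ): with residuals e_i = log t_i - f(x_i), every pair in which an uncensored subject i has a smaller residual than subject j contributes max(0, e_j - e_i):

```python
import torch

def gehan_rank_loss(log_time, pred, event):
    """Gehan-type rank loss for a deep AFT model (illustrative sketch).
    log_time: observed log times (n,); pred: network outputs f(x) (n,);
    event: 1 if the event was observed, 0 if censored."""
    e = log_time - pred                       # residuals
    diff = e.unsqueeze(0) - e.unsqueeze(1)    # diff[i, j] = e_j - e_i
    weight = event.float().unsqueeze(1)       # only observed i's count
    return (weight * torch.clamp(diff, min=0)).mean()
```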

Bivariate DeepKriging for Large-scale Spatial Interpolation of Wind Fields

  • paper_url: http://arxiv.org/abs/2307.08038
  • repo_url: None
  • paper_authors: Pratik Nag, Ying Sun, Brian J Reich
  • for: Large-scale spatial interpolation or downscaling of high-resolution wind field data for climate, oceanographic, and meteorological studies.
  • methods: Bivariate DeepKriging, a spatially dependent deep neural network with an embedding layer built from spatial radial basis functions for predicting bivariate spatial fields, plus a distribution-free uncertainty quantification method based on bootstrap and ensemble DNNs.
  • results: Outperforms traditional cokriging predictors while computing roughly 20 times faster on average; applied to wind data at 506,771 locations over the Middle East, it shows superior prediction performance and dramatically reduced computation time.
    Abstract High spatial resolution wind data are essential for a wide range of applications in climate, oceanographic and meteorological studies. Large-scale spatial interpolation or downscaling of bivariate wind fields having velocity in two dimensions is a challenging task because wind data tend to be non-Gaussian with high spatial variability and heterogeneity. In spatial statistics, cokriging is commonly used for predicting bivariate spatial fields. However, the cokriging predictor is not optimal except for Gaussian processes. Additionally, cokriging is computationally prohibitive for large datasets. In this paper, we propose a method, called bivariate DeepKriging, which is a spatially dependent deep neural network (DNN) with an embedding layer constructed by spatial radial basis functions for bivariate spatial data prediction. We then develop a distribution-free uncertainty quantification method based on bootstrap and ensemble DNN. Our proposed approach outperforms the traditional cokriging predictor with commonly used covariance functions, such as the linear model of co-regionalization and flexible bivariate Mat\'ern covariance. We demonstrate the computational efficiency and scalability of the proposed DNN model, with computations that are, on average, 20 times faster than those of conventional techniques. We apply the bivariate DeepKriging method to the wind data over the Middle East region at 506,771 locations. The prediction performance of the proposed method is superior over the cokriging predictors and dramatically reduces computation time.
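
A minimal sketch of a spatial radial-basis-function embedding layer, assuming Gaussian bases on a grid of knots (knot layout and bandwidth are my choices, not the paper's):

```python
import numpy as np

def rbf_embedding(coords, centers, bandwidth):
    """Expand each 2-D location into Gaussian basis values at knot centers;
    the embedding feeds a DNN that predicts both wind components."""
    d2 = ((coords[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))   # (n_points, n_centers)

# Usage: a coarse 10 x 10 knot grid over a unit-square study region.
knots = np.stack(np.meshgrid(np.linspace(0, 1, 10),
                             np.linspace(0, 1, 10)), -1).reshape(-1, 2)
features = rbf_embedding(np.random.rand(5, 2), knots, bandwidth=0.1)
```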

Magnetic Field-Based Reward Shaping for Goal-Conditioned Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2307.08033
  • repo_url: None
  • paper_authors: Hongyu Ding, Yuanze Tang, Qing Wu, Bo Wang, Chunlin Chen, Zhi Wang
  • for: Improving sample efficiency in goal-conditioned RL tasks, even under dynamic environments and sparse rewards.
  • methods: Magnetic field-based reward shaping (MFRS), which treats the target and obstacles as permanent magnets and builds the reward function from the nonlinear, anisotropic intensity distribution of the magnetic field they generate; the magnetic reward is converted to potential-based reward shaping to preserve optimal policy invariance.
  • results: In both simulated and real-world robotic manipulation tasks, MFRS outperforms relevant existing methods and effectively improves the sample efficiency of RL algorithms across various dynamics of the target and obstacles.
    Abstract Goal-conditioned reinforcement learning (RL) is an interesting extension of the traditional RL framework, where the dynamic environment and reward sparsity can cause conventional learning algorithms to fail. Reward shaping is a practical approach to improving sample efficiency by embedding human domain knowledge into the learning process. Existing reward shaping methods for goal-conditioned RL are typically built on distance metrics with a linear and isotropic distribution, which may fail to provide sufficient information about the ever-changing environment with high complexity. This paper proposes a novel magnetic field-based reward shaping (MFRS) method for goal-conditioned RL tasks with dynamic target and obstacles. Inspired by the physical properties of magnets, we consider the target and obstacles as permanent magnets and establish the reward function according to the intensity values of the magnetic field generated by these magnets. The nonlinear and anisotropic distribution of the magnetic field intensity can provide more accessible and conducive information about the optimization landscape, thus introducing a more sophisticated magnetic reward compared to the distance-based setting. Further, we transform our magnetic reward to the form of potential-based reward shaping by learning a secondary potential function concurrently to ensure the optimal policy invariance of our method. Experiments results in both simulated and real-world robotic manipulation tasks demonstrate that MFRS outperforms relevant existing methods and effectively improves the sample efficiency of RL algorithms in goal-conditioned tasks with various dynamics of the target and obstacles.
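
A simplified sketch of the shaping idea, where I approximate each "magnet" by a dipole-like 1/r^3 intensity falloff; the paper's exact field model and the potential-based transformation are not reproduced here:

```python
import numpy as np

def magnetic_reward(pos, goal, obstacles, strength=1.0, eps=1e-6):
    """Magnet-inspired shaping term: the goal attracts, obstacles repel
    (illustrative falloff, not the paper's exact field equations)."""
    def intensity(src):
        r = np.linalg.norm(pos - src) + eps
        return strength / r ** 3
    return intensity(goal) - sum(intensity(o) for o in obstacles)

# Usage: add the shaping term to the sparse task reward at each step.
r_shape = magnetic_reward(np.array([0.2, 0.1, 0.3]),
                          goal=np.array([0.5, 0.5, 0.5]),
                          obstacles=[np.array([0.3, 0.3, 0.3])])
```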

Noise-aware Speech Enhancement using Diffusion Probabilistic Model

  • paper_url: http://arxiv.org/abs/2307.08029
  • repo_url: https://github.com/yuchen005/nase
  • paper_authors: Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng
  • for: Improving generative diffusion-based speech enhancement (SE), especially on unseen testing noises.
  • methods: Noise-aware speech enhancement (NASE): a noise classification (NC) model extracts a noise-specific acoustic embedding that conditions the reverse denoising process, and a multi-task learning scheme jointly optimizes the SE and NC tasks to sharpen the noise specificity of the extracted conditioner.
  • results: Experiments on the VoiceBank-DEMAND dataset show that NASE plugs into multiple mainstream diffusion SE models and yields significant improvements, especially on unseen testing noises.
    Abstract With recent advances of diffusion model, generative speech enhancement (SE) has attracted a surge of research interest due to its great potential for unseen testing noises. However, existing efforts mainly focus on inherent properties of clean speech for inference, underexploiting the varying noise information in real-world conditions. In this paper, we propose a noise-aware speech enhancement (NASE) approach that extracts noise-specific information to guide the reverse process in diffusion model. Specifically, we design a noise classification (NC) model to produce acoustic embedding as a noise conditioner for guiding the reverse denoising process. Meanwhile, a multi-task learning scheme is devised to jointly optimize SE and NC tasks, in order to enhance the noise specificity of extracted noise conditioner. Our proposed NASE is shown to be a plug-and-play module that can be generalized to any diffusion SE models. Experiment evidence on VoiceBank-DEMAND dataset shows that NASE achieves significant improvement over multiple mainstream diffusion SE models, especially on unseen testing noises.
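
A minimal sketch of the joint objective, assuming a cross-entropy noise-classification head and a weighting factor `lam` (both are my assumptions, not the paper's exact recipe):

```python
import torch

def nase_multitask_loss(se_loss, noise_logits, noise_labels, lam=0.1):
    """Joint SE + noise-classification objective: the auxiliary NC term
    sharpens the noise specificity of the extracted conditioner."""
    nc_loss = torch.nn.functional.cross_entropy(noise_logits, noise_labels)
    return se_loss + lam * nc_loss
```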

Revisiting Implicit Models: Sparsity Trade-offs Capability in Weight-tied Model for Vision Tasks

  • paper_url: http://arxiv.org/abs/2307.08013
  • repo_url: None
  • paper_authors: Haobo Song, Soumajit Majumder, Tao Lin
  • for: Revisiting implicit models such as Deep Equilibrium Models (DEQs) on vision tasks and comparing them against their weight-tied predecessors.
  • methods: An in-depth study of weight-tied models, together with distinct sparse masks proposed to improve their capacity, and design guidelines on depth, width, and sparsity selection.
  • results: Weight-tied models prove more effective, stable, and efficient on vision tasks than DEQ variants, and the insights generalize to other learning paradigms.
    Abstract Implicit models such as Deep Equilibrium Models (DEQs) have garnered significant attention in the community for their ability to train infinite layer models with elegant solution-finding procedures and constant memory footprint. However, despite several attempts, these methods are heavily constrained by model inefficiency and optimization instability. Furthermore, fair benchmarking across relevant methods for vision tasks is missing. In this work, we revisit the line of implicit models and trace them back to the original weight-tied models. Surprisingly, we observe that weight-tied models are more effective, stable, as well as efficient on vision tasks, compared to the DEQ variants. Through the lens of these simple-yet-clean weight-tied models, we further study the fundamental limits in the model capacity of such models and propose the use of distinct sparse masks to improve the model capacity. Finally, for practitioners, we offer design guidelines regarding the depth, width, and sparsity selection for weight-tied models, and demonstrate the generalizability of our insights to other learning paradigms.
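
A minimal sketch of a weight-tied model: a single block applied for a fixed number of unrolled steps with shared weights (the block and iteration count are illustrative, not the paper's architecture):

```python
import torch
import torch.nn as nn

class WeightTiedNet(nn.Module):
    """One block reused L times: depth grows with no extra parameters."""
    def __init__(self, dim, n_iters=6):
        super().__init__()
        self.block = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.n_iters = n_iters

    def forward(self, x):
        z = torch.zeros_like(x)
        for _ in range(self.n_iters):   # unrolled, weight-tied iterations
            z = self.block(z + x)       # re-inject the input each step
        return z
```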

For One-Shot Decoding: Self-supervised Deep Learning-Based Polar Decoder

  • paper_url: http://arxiv.org/abs/2307.08004
  • repo_url: None
  • paper_authors: Huiying Song, Yihao Luo, Yuma Fukuzawa
  • for: A deep learning-based decoding scheme for polar codes with one-shot decoding capability.
  • methods: The neural network is trained via self-supervised learning, leveraging the generator matrix of polar codes so that it functions as a bounded distance decoder, eliminating the reliance on predefined labels.
  • results: Computer simulations show that the scheme can approach the performance of the maximum a posteriori (MAP) decoder for very short packets, and the proposed neural network decoder (NND) exhibits much better generalization ability than the conventional one.
    Abstract We propose a self-supervised deep learning-based decoding scheme that enables one-shot decoding of polar codes. In the proposed scheme, rather than using the information bit vectors as labels for training the neural network (NN) through supervised learning as the conventional scheme did, the NN is trained to function as a bounded distance decoder by leveraging the generator matrix of polar codes through self-supervised learning. This approach eliminates the reliance on predefined labels, empowering the potential to train directly on the actual data within communication systems and thereby enhancing the applicability. Furthermore, computer simulations demonstrate that (i) the bit error rate (BER) and block error rate (BLER) performances of the proposed scheme can approach those of the maximum a posteriori (MAP) decoder for very short packets and (ii) the proposed NN decoder (NND) exhibits much superior generalization ability compared to the conventional one.
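
A sketch of the self-supervision idea for N = 8, under my reading of the scheme: re-encode the decoder's soft bit estimates through the generator matrix in the ±1 domain (where XOR becomes a product) and penalize disagreement with hard decisions on the received word, so no information-bit labels are needed:

```python
import numpy as np

G2 = np.array([[1, 0], [1, 1]])
G = np.kron(np.kron(G2, G2), G2) % 2   # polar generator matrix for N = 8

def self_supervised_loss(u_soft, y):
    """u_soft: decoder's soft bit estimates in [-1, 1]; y: received word.
    Soft re-encoding: x_hat[j] is the product of u_soft[i] over i with
    G[i, j] == 1, the +/-1-domain analogue of x = uG mod 2."""
    x_hat = np.prod(np.where(G.T == 1, u_soft, 1.0), axis=1)
    return np.mean((x_hat - np.sign(y)) ** 2)
```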

Joint Microseismic Event Detection and Location with a Detection Transformer

  • paper_url: http://arxiv.org/abs/2307.09207
  • repo_url: None
  • paper_authors: Yuanyuan Yang, Claire Birnie, Tariq Alkhalifah
  • for: Jointly detecting and locating microseismic events within a single framework, towards real-time microseismic monitoring.
  • methods: A convolutional neural network backbone and an encoder-decoder Transformer with a set-based Hungarian loss, applied directly to recorded waveforms and trained on synthetic multi-event data.
  • results: A synthetic test on a 2D profile of the SEAM Time Lapse model shows accurate detection and subsurface location of events, and a field test on Arkoma Basin data demonstrates practicability and efficiency.
    Abstract Microseismic event detection and location are two primary components in microseismic monitoring, which offers us invaluable insights into the subsurface during reservoir stimulation and evolution. Conventional approaches for event detection and location often suffer from manual intervention and/or heavy computation, while current machine learning-assisted approaches typically address detection and location separately; such limitations hinder the potential for real-time microseismic monitoring. We propose an approach to unify event detection and source location into a single framework by adapting a Convolutional Neural Network backbone and an encoder-decoder Transformer with a set-based Hungarian loss, which is applied directly to recorded waveforms. The proposed network is trained on synthetic data simulating multiple microseismic events corresponding to random source locations in the area of suspected microseismic activities. A synthetic test on a 2D profile of the SEAM Time Lapse model illustrates the capability of the proposed method in detecting the events properly and locating them in the subsurface accurately; while, a field test using the Arkoma Basin data further proves its practicability, efficiency, and its potential in paving the way for real-time monitoring of microseismic events.
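
A sketch of the set-based matching step using the Hungarian algorithm; only the location-matching cost is shown (the full loss also scores detection confidence):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def set_matching_loss(pred_locs, true_locs):
    """One-to-one match predicted and true event locations by L2 cost,
    then average the matched distances (illustrative sketch)."""
    cost = np.linalg.norm(pred_locs[:, None, :] - true_locs[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)   # Hungarian assignment
    return cost[rows, cols].mean()
```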

LUCYD: A Feature-Driven Richardson-Lucy Deconvolution Network

  • paper_url: http://arxiv.org/abs/2307.07998
  • repo_url: https://github.com/ctom2/lucyd-deconvolution
  • paper_authors: Tomáš Chobola, Gesine Müller, Veit Dausmann, Anton Theileis, Jan Taucher, Jan Huisken, Tingying Peng
  • for: Improving the quality and interpretability of degraded microscopy images.
  • methods: LUCYD, a feature-driven restoration model that combines the Richardson-Lucy deconvolution formula with the fusion of deep features from a fully convolutional network.
  • results: LUCYD outperforms state-of-the-art methods on both synthetic and real microscopy images, improving resolution, contrast, and overall quality while generalizing across microscopy modalities (volumetric widefield and light-sheet) and imaging conditions.
    Abstract The process of acquiring microscopic images in life sciences often results in image degradation and corruption, characterised by the presence of noise and blur, which poses significant challenges in accurately analysing and interpreting the obtained data. This paper proposes LUCYD, a novel method for the restoration of volumetric microscopy images that combines the Richardson-Lucy deconvolution formula and the fusion of deep features obtained by a fully convolutional network. By integrating the image formation process into a feature-driven restoration model, the proposed approach aims to enhance the quality of the restored images whilst reducing computational costs and maintaining a high degree of interpretability. Our results demonstrate that LUCYD outperforms the state-of-the-art methods in both synthetic and real microscopy images, achieving superior performance in terms of image quality and generalisability. We show that the model can handle various microscopy modalities and different imaging conditions by evaluating it on two different microscopy datasets, including volumetric widefield and light-sheet microscopy. Our experiments indicate that LUCYD can significantly improve resolution, contrast, and overall quality of microscopy images. Therefore, it can be a valuable tool for microscopy image restoration and can facilitate further research in various microscopy applications. We made the source code for the model accessible under https://github.com/ctom2/lucyd-deconvolution.
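
For reference, the classic Richardson-Lucy iteration that LUCYD builds on, in its textbook form (not the authors' learned, feature-driven variant):

```python
import numpy as np
from scipy.signal import fftconvolve

def richardson_lucy(y, psf, n_iter=20, eps=1e-8):
    """Multiplicative RL update x <- x * (psf_flipped * (y / (psf * x)));
    y is the blurred float image, psf the point spread function."""
    x = np.full_like(y, y.mean())
    psf_flip = psf[::-1, ::-1]
    for _ in range(n_iter):
        denom = fftconvolve(x, psf, mode="same") + eps
        x = x * fftconvolve(y / denom, psf_flip, mode="same")
    return x
```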

MargCTGAN: A “Marginally” Better CTGAN for the Low Sample Regime

  • paper_url: http://arxiv.org/abs/2307.07997
  • repo_url: https://github.com/tejuafonja/margctgan
  • paper_authors: Tejumade Afonja, Dingfan Chen, Mario Fritz
  • for: Evaluating the effectiveness of synthetic tabular data generation methods, specifically in low-sample scenarios, and addressing the tendency of current evaluations to neglect statistical properties.
  • methods: Three state-of-the-art synthetic tabular data generators, including CTGAN, are evaluated on marginal distribution, column-pair correlation, joint distribution, and downstream task utility; the proposed MargCTGAN adds feature matching of de-correlated marginals.
  • results: CTGAN underperforms in low-sample settings in terms of utility, while MargCTGAN consistently improves both downstream utility and the statistical properties of the synthetic data.
    Abstract The potential of realistic and useful synthetic data is significant. However, current evaluation methods for synthetic tabular data generation predominantly focus on downstream task usefulness, often neglecting the importance of statistical properties. This oversight becomes particularly prominent in low sample scenarios, accompanied by a swift deterioration of these statistical measures. In this paper, we address this issue by conducting an evaluation of three state-of-the-art synthetic tabular data generators based on their marginal distribution, column-pair correlation, joint distribution and downstream task utility performance across high to low sample regimes. The popular CTGAN model shows strong utility, but underperforms in low sample settings in terms of utility. To overcome this limitation, we propose MargCTGAN that adds feature matching of de-correlated marginals, which results in a consistent improvement in downstream utility as well as statistical properties of the synthetic data.
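
A sketch of what feature matching of de-correlated marginals could look like, under my reading of the idea (project onto the real data's PCA axes, then match per-axis means and standard deviations); this is not the released MargCTGAN code:

```python
import numpy as np

def marginal_matching_loss(real, fake):
    """Match first and second moments of PCA-de-correlated marginals."""
    mu = real.mean(0)
    _, _, Vt = np.linalg.svd(real - mu, full_matrices=False)
    r, f = (real - mu) @ Vt.T, (fake - mu) @ Vt.T   # de-correlated coords
    return (np.abs(r.mean(0) - f.mean(0)).sum()
            + np.abs(r.std(0) - f.std(0)).sum())
```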

CoNAN: Conditional Neural Aggregation Network For Unconstrained Face Feature Fusion

  • paper_url: http://arxiv.org/abs/2307.10237
  • repo_url: None
  • paper_authors: Bhavin Jawade, Deen Dayal Mohan, Dennis Fedorishin, Srirangaraj Setlur, Venu Govindaraju
  • for: Face recognition from image sets captured at long distances, low resolutions, and under varying viewpoint, illumination, pose, and atmospheric conditions.
  • methods: CoNAN conditions aggregation on the distribution of the incoming feature set: a learned context vector weighs the features according to their estimated informativeness before fusing them into a single global template representation.
  • results: State-of-the-art results on long-range unconstrained face recognition datasets such as BTS and DroneSURF, validating the advantages of this aggregation strategy.
    Abstract Face recognition from image sets acquired under unregulated and uncontrolled settings, such as at large distances, low resolutions, varying viewpoints, illumination, pose, and atmospheric conditions, is challenging. Face feature aggregation, which involves aggregating a set of N feature representations present in a template into a single global representation, plays a pivotal role in such recognition systems. Existing works in traditional face feature aggregation either utilize metadata or high-dimensional intermediate feature representations to estimate feature quality for aggregation. However, generating high-quality metadata or style information is not feasible for extremely low-resolution faces captured in long-range and high altitude settings. To overcome these limitations, we propose a feature distribution conditioning approach called CoNAN for template aggregation. Specifically, our method aims to learn a context vector conditioned over the distribution information of the incoming feature set, which is utilized to weigh the features based on their estimated informativeness. The proposed method produces state-of-the-art results on long-range unconstrained face recognition datasets such as BTS, and DroneSURF, validating the advantages of such an aggregation strategy.
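
A minimal sketch of distribution-conditioned aggregation (layer sizes and the mean-pooled context are my assumptions, not CoNAN's exact design):

```python
import torch
import torch.nn as nn

class ConditionalAggregation(nn.Module):
    """Score each feature against a context vector computed from the whole
    set, then aggregate by the softmax-normalized weights."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                   nn.Linear(dim, 1))

    def forward(self, feats):                        # feats: (N, dim)
        ctx = feats.mean(0, keepdim=True).expand_as(feats)
        w = torch.softmax(self.score(torch.cat([feats, ctx], -1)), dim=0)
        return (w * feats).sum(0)                    # fused template
```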

A Survey of Techniques for Optimizing Transformer Inference

  • paper_url: http://arxiv.org/abs/2307.07982
  • repo_url: None
  • paper_authors: Krishna Teja Chitty-Venkata, Sparsh Mittal, Murali Emani, Venkatram Vishwanath, Arun K. Somani
  • for: A comprehensive survey of techniques for optimizing the inference phase of transformer networks, as a reference for both novice and seasoned researchers.
  • methods: Surveys algorithmic techniques including knowledge distillation, pruning, quantization, neural architecture search, and lightweight network design, along with hardware-level optimizations and novel hardware accelerators for transformers.
  • results: Summarizes quantitative results on parameters/FLOPs and accuracy for several models and techniques to showcase the trade-offs they exercise, and outlines future directions for this rapidly evolving field.
    Abstract Recent years have seen a phenomenal rise in performance and applications of transformer neural networks. The family of transformer networks, including Bidirectional Encoder Representations from Transformer (BERT), Generative Pretrained Transformer (GPT) and Vision Transformer (ViT), have shown their effectiveness across Natural Language Processing (NLP) and Computer Vision (CV) domains. Transformer-based networks such as ChatGPT have impacted the lives of common men. However, the quest for high predictive performance has led to an exponential increase in transformers' memory and compute footprint. Researchers have proposed techniques to optimize transformer inference at all levels of abstraction. This paper presents a comprehensive survey of techniques for optimizing the inference phase of transformer networks. We survey techniques such as knowledge distillation, pruning, quantization, neural architecture search and lightweight network design at the algorithmic level. We further review hardware-level optimization techniques and the design of novel hardware accelerators for transformers. We summarize the quantitative results on the number of parameters/FLOPs and accuracy of several models/techniques to showcase the tradeoff exercised by them. We also outline future directions in this rapidly evolving field of research. We believe that this survey will educate both novice and seasoned researchers and also spark a plethora of research efforts in this field.
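
As a concrete taste of one surveyed technique, a textbook sketch of symmetric post-training int8 quantization (not tied to any specific paper in the survey):

```python
import numpy as np

def quantize_int8(w):
    """Map float weights to int8 with a single symmetric scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale                   # dequantize with q * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
print(np.abs(w - q.astype(np.float32) * s).max())   # quantization error
```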

Byzantine-Robust Distributed Online Learning: Taming Adversarial Participants in An Adversarial Environment

  • paper_url: http://arxiv.org/abs/2307.07980
  • repo_url: https://github.com/wanger521/ogd
  • paper_authors: Xingrong Dong, Zhaoxian Wu, Qing Ling, Zhi Tian
  • for: Studying distributed online learning under Byzantine attacks.
  • methods: Distributed online gradient descent with robust aggregation rules, plus a Byzantine-robust distributed online momentum algorithm designed to attain a sublinear stochastic regret bound.
  • results: Even with state-of-the-art robust aggregation rules, distributed online gradient descent can only achieve a linear adversarial regret bound, which is tight and an inevitable consequence of Byzantine attacks (though the linear constant can be kept reasonable); when the environment is not fully adversarial and the honest participants' losses are i.i.d., sublinear stochastic regret is attainable.
    Abstract This paper studies distributed online learning under Byzantine attacks. The performance of an online learning algorithm is often characterized by (adversarial) regret, which evaluates the quality of one-step-ahead decision-making when an environment provides adversarial losses, and a sublinear bound is preferred. But we prove that, even with a class of state-of-the-art robust aggregation rules, in an adversarial environment and in the presence of Byzantine participants, distributed online gradient descent can only achieve a linear adversarial regret bound, which is tight. This is the inevitable consequence of Byzantine attacks, even though we can control the constant of the linear adversarial regret to a reasonable level. Interestingly, when the environment is not fully adversarial so that the losses of the honest participants are i.i.d. (independent and identically distributed), we show that sublinear stochastic regret, in contrast to the aforementioned adversarial regret, is possible. We develop a Byzantine-robust distributed online momentum algorithm to attain such a sublinear stochastic regret bound. Extensive numerical experiments corroborate our theoretical analysis.
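
A minimal sketch of one robust aggregation rule (coordinate-wise median) inside a distributed online gradient step; the paper analyzes a class of such rules and adds a momentum variant:

```python
import numpy as np

def robust_online_gd_step(x, worker_grads, lr=0.01):
    """Coordinate-wise median tolerates a minority of Byzantine workers
    sending arbitrary gradients (illustrative sketch)."""
    g = np.median(np.stack(worker_grads), axis=0)
    return x - lr * g
```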

Finite element inspired networks: Learning physically-plausible deformable object dynamics from partial observations

  • paper_url: http://arxiv.org/abs/2307.07975
  • repo_url: None
  • paper_authors: Shamil Mamedov, A. René Geist, Jan Swevers, Sebastian Trimpe
  • for: Developing a human-interpretable and data-efficient model for simulating the dynamics of deformable linear objects (DLOs).
  • methods: Drawing inspiration from the rigid finite element method (R-FEM), a DLO is modeled as a serial chain of rigid bodies whose internal state is unrolled through time by a dynamics network; the dynamics network is trained jointly with a physics-informed encoder, and the forward kinematics of the underlying R-FEM model serve as a decoder so the state acquires a physically meaningful representation.
  • results: A robot experiment demonstrates that the "Finite element inspired network" (FEN) forms an easy-to-handle yet capable DLO dynamics model that yields physically interpretable predictions from partial observations.
    Abstract The accurate simulation of deformable linear object (DLO) dynamics is challenging if the task at hand requires a human-interpretable and data-efficient model that also yields fast predictions. To arrive at such model, we draw inspiration from the rigid finite element method (R-FEM) and model a DLO as a serial chain of rigid bodies whose internal state is unrolled through time by a dynamics network. As this state is not observed directly, the dynamics network is trained jointly with a physics-informed encoder mapping observed motion variables to the body chain's state. To encourage that the state acquires a physically meaningful representation, we leverage the forward kinematics (FK) of the underlying R-FEM model as a decoder. We demonstrate in a robot experiment that this architecture - being termed "Finite element inspired network" - forms an easy to handle, yet capable DLO dynamics model yielding physically interpretable predictions from partial observations. The project code is available at: \url{https://tinyurl.com/fei-networks}
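
A minimal 2-D sketch of the decoder's role: forward kinematics of a serial chain of rigid bodies mapping joint angles to body positions (the actual model is 3-D and follows the R-FEM discretization):

```python
import numpy as np

def chain_forward_kinematics(joint_angles, link_length=0.1):
    """Planar FK: accumulate joint angles along the chain and integrate
    fixed-length links into body positions (illustrative sketch)."""
    pts, theta, p = [np.zeros(2)], 0.0, np.zeros(2)
    for q in joint_angles:
        theta += q
        p = p + link_length * np.array([np.cos(theta), np.sin(theta)])
        pts.append(p)
    return np.stack(pts)

print(chain_forward_kinematics(np.deg2rad([10, 20, -15, 5])))
```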

Heteroscedastic Causal Structure Learning

  • paper_url: http://arxiv.org/abs/2307.07973
  • repo_url: https://github.com/baosws/host
  • paper_authors: Bao Duong, Thin Nguyen
  • for: Learning the directed acyclic graphs (DAGs) that encode the cause-effect relationships embedded in observational data.
  • methods: HOST (Heteroscedastic causal STructure learning), an algorithm for causal structure learning under heteroscedastic Gaussian noise that scales polynomially in both sample size and dimensionality: exploiting the normality of the causal mechanisms, it recovers a valid causal ordering and then uniquely identifies the causal DAG via a series of conditional independence tests.
  • results: Extensive empirical evaluations on a wide range of controlled and real datasets show HOST is competitive with state-of-the-art approaches in both the causal order learning and structure learning problems.
    Abstract Heretofore, learning the directed acyclic graphs (DAGs) that encode the cause-effect relationships embedded in observational data is a computationally challenging problem. A recent trend of studies has shown that it is possible to recover the DAGs with polynomial time complexity under the equal variances assumption. However, this prohibits the heteroscedasticity of the noise, which allows for more flexible modeling capabilities, but at the same time is substantially more challenging to handle. In this study, we tackle the heteroscedastic causal structure learning problem under Gaussian noises. By exploiting the normality of the causal mechanisms, we can recover a valid causal ordering, which can uniquely identify the causal DAG using a series of conditional independence tests. The result is HOST (Heteroscedastic causal STructure learning), a simple yet effective causal structure learning algorithm that scales polynomially in both sample size and dimensionality. In addition, via extensive empirical evaluations on a wide range of both controlled and real datasets, we show that the proposed HOST method is competitive with state-of-the-art approaches in both the causal order learning and structure learning problems.

Enhancing Energy Efficiency and Reliability in Autonomous Systems Estimation using Neuromorphic Approach

  • paper_url: http://arxiv.org/abs/2307.07963
  • repo_url: None
  • paper_authors: Reza Ahmadvand, Sarah Safura Sharif, Yaser Mike Banad
  • for: An estimation framework based on spike coding theories and spiking neural networks (SNNs), targeting energy-efficient and reliable computation on low size, weight, and power (SWaP) computers for autonomous systems.
  • methods: An SNN-based Kalman filter (KF) and, building on the modified sliding innovation filter (MSIF), a robust strategy called SNN-MSIF; the network weight matrices are designed from the system model, eliminating the need for learning.
  • results: Monte Carlo comparisons against the algorithmic KF and MSIF, plus robustness evaluations under modeling uncertainties and neuron loss, confirm the applicability of the approach and the superior accuracy and robustness of SNN-MSIF; the observed spiking patterns show a reduction of roughly 97% in emitted spikes relative to possible spikes, evidencing the energy efficiency achieved.
    Abstract Energy efficiency and reliability have long been crucial factors for ensuring cost-effective and safe missions in autonomous systems computers. With the rapid evolution of industries such as space robotics and advanced air mobility, the demand for these low size, weight, and power (SWaP) computers has grown significantly. This study focuses on introducing an estimation framework based on spike coding theories and spiking neural networks (SNN), leveraging the efficiency and scalability of neuromorphic computers. Therefore, we propose an SNN-based Kalman filter (KF), a fundamental and widely adopted optimal strategy for well-defined linear systems. Furthermore, based on the modified sliding innovation filter (MSIF) we present a robust strategy called SNN-MSIF. Notably, the weight matrices of the networks are designed according to the system model, eliminating the need for learning. To evaluate the effectiveness of the proposed strategies, we compare them to their algorithmic counterparts, namely the KF and the MSIF, using Monte Carlo simulations. Additionally, we assess the robustness of SNN-MSIF by comparing it to SNN-KF in the presence of modeling uncertainties and neuron loss. Our results demonstrate the applicability of the proposed methods and highlight the superior performance of SNN-MSIF in terms of accuracy and robustness. Furthermore, the spiking pattern observed from the networks serves as evidence of the energy efficiency achieved by the proposed methods, as they exhibited an impressive reduction of approximately 97 percent in emitted spikes compared to possible spikes.
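
For reference, one textbook Kalman filter step, the algorithmic counterpart that the spiking network is designed to emulate (this is the standard KF, not the SNN weight mapping itself):

```python
import numpy as np

def kalman_step(x, P, z, A, H, Q, R):
    """Predict with dynamics (A, Q), then correct with measurement z
    through observation model (H, R)."""
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```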

Automated Polynomial Filter Learning for Graph Neural Networks

  • paper_url: http://arxiv.org/abs/2307.07956
  • repo_url: https://github.com/Aryia-Behroziuan/neurons
  • paper_authors: Wendi Yu, Zhichao Hou, Xiaorui Liu
  • for: Researchers and practitioners in graph neural networks (GNNs) and related areas; the paper explores the potential and limitations of polynomial graph filter learning approaches, revealing a severe overfitting issue.
  • methods: Polynomial graph filters serve as guiding principles in GNN design; the proposed Auto-Polynomial is a novel, general automated polynomial graph filter learning framework that efficiently learns better filters capable of adapting to various complex graph signals.
  • results: Significant and consistent performance improvements on both homophilic and heterophilic graphs across multiple learning settings and labeling ratios, unleashing the potential of polynomial filter learning.
    Abstract Polynomial graph filters have been widely used as guiding principles in the design of Graph Neural Networks (GNNs). Recently, the adaptive learning of the polynomial graph filters has demonstrated promising performance for modeling graph signals on both homophilic and heterophilic graphs, owning to their flexibility and expressiveness. In this work, we conduct a novel preliminary study to explore the potential and limitations of polynomial graph filter learning approaches, revealing a severe overfitting issue. To improve the effectiveness of polynomial graph filters, we propose Auto-Polynomial, a novel and general automated polynomial graph filter learning framework that efficiently learns better filters capable of adapting to various complex graph signals. Comprehensive experiments and ablation studies demonstrate significant and consistent performance improvements on both homophilic and heterophilic graphs across multiple learning settings considering various labeling ratios, which unleashes the potential of polynomial filter learning.
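
A sketch of the polynomial graph filter operator sum_k theta_k L^k x that such frameworks tune; the paper's contribution is how the coefficients are learned, not this operator itself:

```python
import numpy as np

def polynomial_graph_filter(A, x, theta):
    """Apply sum_k theta[k] * L^k x with the symmetric normalized
    Laplacian L (monomial basis; illustrative sketch)."""
    n = A.shape[0]
    d = A.sum(1)
    L = np.eye(n) - A / np.sqrt(np.outer(d, d) + 1e-12)
    z = x.astype(float)
    out = np.zeros_like(z)
    for t in theta:
        out = out + t * z
        z = L @ z
    return out
```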

SentimentGPT: Exploiting GPT for Advanced Sentiment Analysis and its Departure from Current Machine Learning

  • paper_url: http://arxiv.org/abs/2307.10234
  • repo_url: None
  • paper_authors: Kiana Kheiri, Hamid Karimi
  • for: A thorough examination of various Generative Pretrained Transformer (GPT) methodologies for sentiment analysis, in the context of Task 4 on the SemEval 2017 dataset.
  • methods: Three primary strategies: 1) prompt engineering with the advanced GPT-3.5 Turbo, 2) fine-tuning GPT models, and 3) an inventive embedding-classification approach.
  • results: The GPT approaches show a marked advantage in predictive performance, improving the F1-score by more than 22% over the prior state of the art; the study also examines common sentiment analysis challenges such as understanding context and detecting sarcasm, which the GPT models handle notably well.
    Abstract This study presents a thorough examination of various Generative Pretrained Transformer (GPT) methodologies in sentiment analysis, specifically in the context of Task 4 on the SemEval 2017 dataset. Three primary strategies are employed: 1) prompt engineering using the advanced GPT-3.5 Turbo, 2) fine-tuning GPT models, and 3) an inventive approach to embedding classification. The research yields detailed comparative insights among these strategies and individual GPT models, revealing their unique strengths and potential limitations. Additionally, the study compares these GPT-based methodologies with other current, high-performing models previously used with the same dataset. The results illustrate the significant superiority of the GPT approaches in terms of predictive performance, more than 22\% in F1-score compared to the state-of-the-art. Further, the paper sheds light on common challenges in sentiment analysis tasks, such as understanding context and detecting sarcasm. It underscores the enhanced capabilities of the GPT models to effectively handle these complexities. Taken together, these findings highlight the promising potential of GPT models in sentiment analysis, setting the stage for future research in this field. The code can be found at https://github.com/DSAatUSU/SentimentGPT
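
A sketch of the prompt-engineering strategy; `chat` is a hypothetical callable wrapping a GPT-3.5 Turbo chat endpoint, and the prompt text is illustrative rather than the paper's exact prompt:

```python
def classify_sentiment(tweet, chat):
    """Zero-shot sentiment classification via prompting (sketch)."""
    prompt = ("Classify the sentiment of this tweet as exactly one of "
              "positive, neutral, or negative.\n"
              f"Tweet: {tweet}\nSentiment:")
    return chat(prompt).strip().lower()

# Usage: label = classify_sentiment("loving the new update!", chat=my_llm)
```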

Accelerating Distributed ML Training via Selective Synchronization

  • paper_url: http://arxiv.org/abs/2307.07950
  • repo_url: None
  • paper_authors: Sahil Tyagi, Martin Swany
  • for: A practical, low-overhead method for accelerating distributed deep neural network (DNN) training.
  • methods: SelSync dynamically chooses at each step whether to incur or avoid communication, either calling the aggregation op or applying local updates based on their significance; additional optimizations improve convergence in this semi-synchronous regime.
  • results: SelSync converges to the same or better accuracy than bulk-synchronous parallel (BSP) training while reducing training time by up to 14x.
    Abstract In distributed training, deep neural networks (DNNs) are launched over multiple workers concurrently and aggregate their local updates on each step in bulk-synchronous parallel (BSP) training. However, BSP does not linearly scale-out due to high communication cost of aggregation. To mitigate this overhead, alternatives like Federated Averaging (FedAvg) and Stale-Synchronous Parallel (SSP) either reduce synchronization frequency or eliminate it altogether, usually at the cost of lower final accuracy. In this paper, we present \texttt{SelSync}, a practical, low-overhead method for DNN training that dynamically chooses to incur or avoid communication at each step either by calling the aggregation op or applying local updates based on their significance. We propose various optimizations as part of \texttt{SelSync} to improve convergence in the context of \textit{semi-synchronous} training. Our system converges to the same or better accuracy than BSP while reducing training time by up to 14$\times$.
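
A minimal sketch of a selective-synchronization step; the significance test and threshold `delta` here are my stand-ins for the paper's criterion, and `allreduce` abstracts the aggregation op:

```python
import numpy as np

def selsync_step(params, local_grad, allreduce, delta=0.1, lr=0.01):
    """Synchronize only when the local update is significant relative to
    the current parameters; otherwise apply it locally (sketch)."""
    significance = np.linalg.norm(local_grad) / (np.linalg.norm(params) + 1e-12)
    grad = allreduce(local_grad) if significance > delta else local_grad
    return params - lr * grad
```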

Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling

  • paper_url: http://arxiv.org/abs/2307.07944
  • repo_url: https://github.com/zhuoxiao-chen/redb-da-3ddet
  • paper_authors: Zhuoxiao Chen, Yadan Luo, Zheng Wang, Mahsa Baktashmotlagh, Zi Huang
  • for: Domain-adaptive 3D object detection, specifically addressing low-quality pseudo labels and class imbalance in multi-class training settings.
  • methods: The ReDB framework produces Reliable, Diverse, and class-Balanced pseudo 3D boxes to iteratively guide self-training on a distributionally different target domain, using cross-domain examination (CDE) to assess pseudo-label correctness and an overlapped boxes counting (OBC) metric for uniform downsampling across geometric characteristics.
  • results: On three benchmarks with both voxel-based (SECOND) and point-based (PointRCNN) detectors, ReDB outperforms existing 3D domain adaptation methods by a large margin, improving mAP by 23.15% on the nuScenes → KITTI task.
    Abstract Unsupervised domain adaptation (DA) with the aid of pseudo labeling techniques has emerged as a crucial approach for domain-adaptive 3D object detection. While effective, existing DA methods suffer from a substantial drop in performance when applied to a multi-class training setting, due to the co-existence of low-quality pseudo labels and class imbalance issues. In this paper, we address this challenge by proposing a novel ReDB framework tailored for learning to detect all classes at once. Our approach produces Reliable, Diverse, and class-Balanced pseudo 3D boxes to iteratively guide the self-training on a distributionally different target domain. To alleviate disruptions caused by the environmental discrepancy (e.g., beam numbers), the proposed cross-domain examination (CDE) assesses the correctness of pseudo labels by copy-pasting target instances into a source environment and measuring the prediction consistency. To reduce computational overhead and mitigate the object shift (e.g., scales and point densities), we design an overlapped boxes counting (OBC) metric that allows to uniformly downsample pseudo-labeled objects across different geometric characteristics. To confront the issue of inter-class imbalance, we progressively augment the target point clouds with a class-balanced set of pseudo-labeled target instances and source objects, which boosts recognition accuracies on both frequently appearing and rare classes. Experimental results on three benchmark datasets using both voxel-based (i.e., SECOND) and point-based 3D detectors (i.e., PointRCNN) demonstrate that our proposed ReDB approach outperforms existing 3D domain adaptation methods by a large margin, improving 23.15% mAP on the nuScenes $\rightarrow$ KITTI task. The code is available at https://github.com/zhuoxiao-chen/ReDB-DA-3Ddet.

KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection

  • paper_url: http://arxiv.org/abs/2307.07942
  • repo_url: https://github.com/Luoyadan/KECOR-active-3Ddet
  • paper_authors: Yadan Luo, Zhuoxiao Chen, Zhen Fang, Zheng Zhang, Zi Huang, Mahsa Baktashmotlagh
  • for: Improving the reliability of LiDAR-based object detectors in autonomous driving while reducing the annotation burden through active learning.
  • methods: A kernel coding rate maximization (KECOR) strategy, grounded in information theory, that uses greedy search to select the most informative point clouds to label.
  • results: Compared to the state-of-the-art active learning method, reduces box-level annotation costs by roughly 44% and computational time by 26% without compromising detection performance.
    Abstract Achieving a reliable LiDAR-based object detector in autonomous driving is paramount, but its success hinges on obtaining large amounts of precise 3D annotations. Active learning (AL) seeks to mitigate the annotation burden through algorithms that use fewer labels and can attain performance comparable to fully supervised learning. Although AL has shown promise, current approaches prioritize the selection of unlabeled point clouds with high uncertainty and/or diversity, leading to the selection of more instances for labeling and reduced computational efficiency. In this paper, we resort to a novel kernel coding rate maximization (KECOR) strategy which aims to identify the most informative point clouds to acquire labels through the lens of information theory. Greedy search is applied to seek desired point clouds that can maximize the minimal number of bits required to encode the latent features. To determine the uniqueness and informativeness of the selected samples from the model perspective, we construct a proxy network of the 3D detector head and compute the outer product of Jacobians from all proxy layers to form the empirical neural tangent kernel (NTK) matrix. To accommodate both one-stage (i.e., SECOND) and two-stage detectors (i.e., PVRCNN), we further incorporate the classification entropy maximization and well trade-off between detection performance and the total number of bounding boxes selected for annotation. Extensive experiments conducted on two 3D benchmarks and a 2D detection dataset evidence the superiority and versatility of the proposed approach. Our results show that approximately 44% box-level annotation costs and 26% computational time are reduced compared to the state-of-the-art AL method, without compromising detection performance.
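
The coding-rate criterion can be illustrated with a small greedy loop: the selected set grows by the sample that most increases $\frac{1}{2}\log\det(I + \alpha K)$ over the current subset's kernel matrix. The sketch below substitutes an RBF kernel for the paper's empirical NTK, and all sizes and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf_kernel(X, gamma=0.5):
    """Stand-in kernel; the paper builds an empirical NTK from proxy-layer Jacobians."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def coding_rate(K_sub, alpha=10.0):
    """Bits (in nats) needed to encode features whose Gram matrix is K_sub."""
    n = K_sub.shape[0]
    sign, logdet = np.linalg.slogdet(np.eye(n) + alpha * K_sub)
    return 0.5 * logdet

def greedy_kecor(K, budget):
    """Greedily pick the sample whose addition maximizes the coding rate."""
    selected, remaining = [], list(range(K.shape[0]))
    for _ in range(budget):
        gains = [coding_rate(K[np.ix_(selected + [j], selected + [j])])
                 for j in remaining]
        best = remaining[int(np.argmax(gains))]
        selected.append(best)
        remaining.remove(best)
    return selected

X = rng.normal(size=(60, 4))          # latent features of the unlabeled pool
K = rbf_kernel(X)
picks = greedy_kecor(K, budget=8)
print("acquire labels for samples:", picks)
```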

Optimal Compression of Unit Norm Vectors in the High Distortion Regime

  • paper_url: http://arxiv.org/abs/2307.07941
  • repo_url: None
  • paper_authors: Heng Zhu, Avishek Ghosh, Arya Mazumdar
  • for: Compressing a unit norm vector into the minimum number of bits while still allowing an acceptable level of distortion in recovery.
  • methods: Revisits this question from the rate-distortion/covering code literature, focusing on the "high-distortion" regime; the analysis is worst-case, with no prior assumptions on the vector, but randomized compression maps are allowed.
  • results: Simple compression schemes turn out to be nearly optimal in this regime; the results are a mix of new and known findings, compiled here for completeness.
    Abstract Motivated by the need for communication-efficient distributed learning, we investigate the method for compressing a unit norm vector into the minimum number of bits, while still allowing for some acceptable level of distortion in recovery. This problem has been explored in the rate-distortion/covering code literature, but our focus is exclusively on the "high-distortion" regime. We approach this problem in a worst-case scenario, without any prior information on the vector, but allowing for the use of randomized compression maps. Our study considers both biased and unbiased compression methods and determines the optimal compression rates. It turns out that simple compression schemes are nearly optimal in this scenario. While the results are a mix of new and known, they are compiled in this paper for completeness.
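
As one concrete point of reference, a classical unbiased randomized scheme in this spirit keeps each coordinate's sign with probability equal to its magnitude (one-level, QSGD-style stochastic quantization). This is a textbook construction used here only to illustrate the setting; the paper's contribution is the optimal-rate analysis, not this particular scheme.

```python
import numpy as np

rng = np.random.default_rng(2)

def quantize_unbiased(x):
    """One-bit-per-nonzero stochastic quantizer.

    For a unit norm x, each coordinate is kept as sign(x_i) with probability
    |x_i| and zeroed otherwise, so E[Q(x)] = x (an unbiased scheme).
    """
    keep = rng.random(x.shape) < np.abs(x)
    return np.sign(x) * keep

d = 1000
x = rng.normal(size=d)
x /= np.linalg.norm(x)

reps = [quantize_unbiased(x) for _ in range(2000)]
mean_q = np.mean(reps, axis=0)
print("bias (should be ~0):", np.linalg.norm(mean_q - x))
print("per-sample distortion:", np.mean([np.linalg.norm(q - x) ** 2 for q in reps]))
```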

A Novel Truncated Norm Regularization Method for Multi-channel Color Image Denoising

  • paper_url: http://arxiv.org/abs/2307.07932
  • repo_url: https://github.com/wangzhi82/DtNFM
  • paper_authors: Yiwen Shan, Dong Hu, Haoming Ding, Chunming Yang, Zhi Wang
  • for: A double-weighted truncated nuclear norm minus truncated Frobenius norm minimization (DtNFM) method for color image denoising that handles both cross-channel differences and spatially varying noise in real-world images.
  • methods: Exploits the nonlocal self-similarity of the noisy image to gather similar structures into groups of patch matrices; a DtNFM model estimates the denoised version of each group, and the final denoised image is obtained by aggregating all denoised patch matrices. The resulting problem is solved with an ADMM-based algorithm whose subproblems admit closed-form global optima.
  • results: Extensive experiments on synthetic and real noise datasets demonstrate that the method outperforms many state-of-the-art color image denoising methods.
    Abstract Due to the high flexibility and remarkable performance, low-rank approximation methods has been widely studied for color image denoising. However, those methods mostly ignore either the cross-channel difference or the spatial variation of noise, which limits their capacity in real world color image denoising. To overcome those drawbacks, this paper is proposed to denoise color images with a double-weighted truncated nuclear norm minus truncated Frobenius norm minimization (DtNFM) method. Through exploiting the nonlocal self-similarity of the noisy image, the similar structures are gathered and a series of similar patch matrices are constructed. For each group, the DtNFM model is conducted for estimating its denoised version. The denoised image would be obtained by concatenating all the denoised patch matrices. The proposed DtNFM model has two merits. First, it models and utilizes both the cross-channel difference and the spatial variation of noise. This provides sufficient flexibility for handling the complex distribution of noise in real world images. Second, the proposed DtNFM model provides a close approximation to the underlying clean matrix since it can treat different rank components flexibly. To solve the problem resulted from DtNFM model, an accurate and effective algorithm is proposed by exploiting the framework of the alternating direction method of multipliers (ADMM). The generated subproblems are discussed in detail. And their global optima can be easily obtained in closed-form. Rigorous mathematical derivation proves that the solution sequences generated by the algorithm converge to a single critical point. Extensive experiments on synthetic and real noise datasets demonstrate that the proposed method outperforms many state-of-the-art color image denoising methods.
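
A simplified singular-value shrinkage step conveys the flavor of the truncated-norm penalty: leading singular values (assumed to carry the clean low-rank structure) pass through untouched, while tail values are shrunk by a closed-form rule. The weights, thresholds, and toy data below are illustrative; the paper's actual model uses data-adaptive double weights and the full ADMM solver.

```python
import numpy as np

def dtnfm_shrink(Y, r=3, lam=2.0, tau=0.05):
    """Illustrative proximal step for a truncated nuclear norm minus truncated
    Frobenius norm penalty: the r leading singular values are kept, while each
    tail value solves min_s 0.5*(s - sigma)^2 + lam*s - tau*s^2, giving
    s = max(sigma - lam, 0) / (1 - 2*tau), well-defined for 2*tau < 1.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_new = s.copy()
    s_new[r:] = np.maximum(s[r:] - lam, 0.0) / (1.0 - 2.0 * tau)
    return U @ np.diag(s_new) @ Vt

rng = np.random.default_rng(3)
L = rng.normal(size=(40, 3)) @ rng.normal(size=(3, 30))   # rank-3 clean patch matrix
Y = L + 0.3 * rng.normal(size=L.shape)                    # noisy observation
X = dtnfm_shrink(Y)
print("noisy error:", np.linalg.norm(Y - L), " denoised error:", np.linalg.norm(X - L))
```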

On the Robustness of Split Learning against Adversarial Attacks

  • paper_url: http://arxiv.org/abs/2307.07916
  • repo_url: https://github.com/fmy266/SplitADV
  • paper_authors: Mingyuan Fan, Cen Chen, Chengyu Wang, Wenmeng Zhou, Jun Huang
  • for: Evaluating the robustness of split learning against adversarial attacks, particularly when the attacker cannot access the full model.
  • methods: Develops SPADV, an attack tailored to split learning with two stages: 1) shadow model training, which compensates for the missing part of the model, and 2) a local adversarial attack that crafts adversarial examples for evaluation.
  • results: SPADV requires only a few unlabeled non-IID samples and, in its second stage, perturbs the intermediate outputs of natural samples to craft adversarial ones, revealing that split learning is surprisingly vulnerable to adversarial attacks.
    Abstract Split learning enables collaborative deep learning model training while preserving data privacy and model security by avoiding direct sharing of raw data and model details (i.e., server and clients only hold partial sub-networks and exchange intermediate computations). However, existing research has mainly focused on examining its reliability for privacy protection, with little investigation into model security. Specifically, by exploring full models, attackers can launch adversarial attacks, and split learning can mitigate this severe threat by only disclosing part of the models to untrusted servers. This paper aims to evaluate the robustness of split learning against adversarial attacks, particularly in the most challenging setting where untrusted servers only have access to the intermediate layers of the model. Existing adversarial attacks mostly focus on the centralized setting instead of the collaborative setting; thus, to better evaluate the robustness of split learning, we develop a tailored attack called SPADV, which comprises two stages: 1) shadow model training that addresses the issue of lacking part of the model and 2) local adversarial attack that produces adversarial examples to evaluate. The first stage only requires a few unlabeled non-IID data, and, in the second stage, SPADV perturbs the intermediate output of natural samples to craft the adversarial ones. The overall cost of the proposed attack process is relatively low, yet the empirical attack effectiveness is significantly high, demonstrating the surprising vulnerability of split learning to adversarial attacks.

Exploiting FPGA Capabilities for Accelerated Biomedical Computing

  • paper_url: http://arxiv.org/abs/2307.07914
  • repo_url: None
  • paper_authors: Kayode Inadagbo, Baran Arig, Nisanur Alici, Murat Isik
  • for: Enhancing ECG signal analysis with Field Programmable Gate Arrays (FPGAs), using advanced neural network architectures including convolutional neural networks (CNNs), recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and deep belief networks (DBNs).
  • methods: Trains and validates on the MIT-BIH Arrhythmia Database, injecting Gaussian noise to improve robustness; uses an EarlyStopping callback and Dropout layers to mitigate overfitting, and develops a custom Tensor Compute Unit (TCU) accelerator for the PYNQ Z1 board.
  • results: Reports performance metrics such as latency and throughput, demonstrating the potential of FPGAs for high-performance biomedical computing and providing a guide for optimizing neural network performance on FPGAs across applications.
    Abstract This study presents advanced neural network architectures including Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory Networks (LSTMs), and Deep Belief Networks (DBNs) for enhanced ECG signal analysis using Field Programmable Gate Arrays (FPGAs). We utilize the MIT-BIH Arrhythmia Database for training and validation, introducing Gaussian noise to improve algorithm robustness. The implemented models feature various layers for distinct processing and classification tasks and techniques like EarlyStopping callback and Dropout layer are used to mitigate overfitting. Our work also explores the development of a custom Tensor Compute Unit (TCU) accelerator for the PYNQ Z1 board, offering comprehensive steps for FPGA-based machine learning, including setting up the Tensil toolchain in Docker, selecting architecture, configuring PS-PL, and compiling and executing models. Performance metrics such as latency and throughput are calculated for practical insights, demonstrating the potential of FPGAs in high-performance biomedical computing. The study ultimately offers a guide for optimizing neural network performance on FPGAs for various applications.
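
The training recipe described above (Gaussian-noise augmentation, Dropout, and an EarlyStopping callback) can be sketched in a few lines of Keras. The 1D architecture, window length, and random stand-in data are illustrative assumptions; MIT-BIH loading and the FPGA deployment steps are omitted.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, callbacks

# Illustrative stand-in for MIT-BIH beats: 1000 windows of 180 samples, 5 classes.
rng = np.random.default_rng(4)
x = rng.normal(size=(1000, 180, 1)).astype("float32")
y = rng.integers(0, 5, size=1000)

x_noisy = x + rng.normal(scale=0.05, size=x.shape).astype("float32")  # noise augmentation

model = tf.keras.Sequential([
    layers.Conv1D(16, 7, activation="relu", input_shape=(180, 1)),
    layers.MaxPooling1D(2),
    layers.Conv1D(32, 5, activation="relu"),
    layers.GlobalAveragePooling1D(),
    layers.Dropout(0.3),            # mitigates overfitting, as in the paper
    layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

early = callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                restore_best_weights=True)
model.fit(x_noisy, y, validation_split=0.2, epochs=50,
          batch_size=64, callbacks=[early], verbose=0)
```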

Predicting mechanical properties of Carbon Nanotube (CNT) images Using Multi-Layer Synthetic Finite Element Model Simulations

  • paper_url: http://arxiv.org/abs/2307.07912
  • repo_url: None
  • paper_authors: Kaveh Safavigerdini, Koundinya Nouduri, Ramakrishna Surya, Andrew Reinhard, Zach Quinlan, Filiz Bunyak, Matthew R. Maschmann, Kannappan Palaniappan
  • for: Predicting the mechanical properties of carbon nanotube (CNT) forests from images.
  • methods: A deep learning pipeline for AI-based materials discovery, built around a data augmentation technique that blends 2D synthetic images into multi-layer synthetic (MLS), quasi-2.5D images.
  • results: The pipeline, trained on blended MLS images whose stiffness and buckling load are estimated with a physics-based model, is expected to outperform learning from single synthetic images when predicting mechanical properties of real SEM images.
    Abstract We present a pipeline for predicting mechanical properties of vertically-oriented carbon nanotube (CNT) forest images using a deep learning model for artificial intelligence (AI)-based materials discovery. Our approach incorporates an innovative data augmentation technique that involves the use of multi-layer synthetic (MLS) or quasi-2.5D images which are generated by blending 2D synthetic images. The MLS images more closely resemble 3D synthetic and real scanning electron microscopy (SEM) images of CNTs but without the computational cost of performing expensive 3D simulations or experiments. Mechanical properties such as stiffness and buckling load for the MLS images are estimated using a physics-based model. The proposed deep learning architecture, CNTNeXt, builds upon our previous CNTNet neural network, using a ResNeXt feature representation followed by random forest regression estimator. Our machine learning approach for predicting CNT physical properties by utilizing a blended set of synthetic images is expected to outperform single synthetic image-based learning when it comes to predicting mechanical properties of real scanning electron microscopy images. This has the potential to accelerate understanding and control of CNT forest self-assembly for diverse applications.

MESOB: Balancing Equilibria & Social Optimality

  • paper_url: http://arxiv.org/abs/2307.07911
  • repo_url: None
  • paper_authors: Xin Guo, Lihong Li, Sareh Nabi, Rabih Salhab, Junzi Zhang
  • for: This paper aims to provide a novel optimization method for multi-level and multi-agent games with anonymous agents and complex interplay between competition and cooperation.
  • methods: The proposed method is called MESOB-OMO, which combines a mean-field approximation with an occupation measure optimization method to solve a bi-objective optimization problem.
  • results: The proposed method is effective in balancing the interests of different parties and handling the competitive nature of bidders, and outperforms baseline methods that only consider either the competitive or cooperative aspects.
    Abstract Motivated by bid recommendation in online ad auctions, this paper considers a general class of multi-level and multi-agent games, with two major characteristics: one is a large number of anonymous agents, and the other is the intricate interplay between competition and cooperation. To model such complex systems, we propose a novel and tractable bi-objective optimization formulation with mean-field approximation, called MESOB (Mean-field Equilibria & Social Optimality Balancing), as well as an associated occupation measure optimization (OMO) method called MESOB-OMO to solve it. MESOB-OMO enables obtaining approximately Pareto efficient solutions in terms of the dual objectives of competition and cooperation in MESOB, and in particular allows for Nash equilibrium selection and social equalization in an asymptotic manner. We apply MESOB-OMO to bid recommendation in a simulated pay-per-click ad auction. Experiments demonstrate its efficacy in balancing the interests of different parties and in handling the competitive nature of bidders, as well as its advantages over baselines that only consider either the competitive or the cooperative aspects.

Seeing is not Believing: Robust Reinforcement Learning against Spurious Correlation

  • paper_url: http://arxiv.org/abs/2307.07907
  • repo_url: None
  • paper_authors: Wenhao Ding, Laixi Shi, Yuejie Chi, Ding Zhao
  • for: Handling spurious correlations in reinforcement learning to improve robustness in real-world tasks.
  • methods: Propose a new framework called Robust State-Confounded Markov Decision Processes (RSC-MDPs) and design an empirical algorithm to learn the robust optimal policy.
  • results: Outperform all baselines in eight realistic self-driving and manipulation tasks.
    Abstract Robustness has been extensively studied in reinforcement learning (RL) to handle various forms of uncertainty such as random perturbations, rare events, and malicious attacks. In this work, we consider one critical type of robustness against spurious correlation, where different portions of the state do not have causality but have correlations induced by unobserved confounders. These spurious correlations are ubiquitous in real-world tasks, for instance, a self-driving car usually observes heavy traffic in the daytime and light traffic at night due to unobservable human activity. A model that learns such useless or even harmful correlation could catastrophically fail when the confounder in the test case deviates from the training one. Although motivated, enabling robustness against spurious correlation poses significant challenges since the uncertainty set, shaped by the unobserved confounder and sequential structure of RL, is difficult to characterize and identify. Existing robust algorithms that assume simple and unstructured uncertainty sets are therefore inadequate to address this challenge. To solve this issue, we propose Robust State-Confounded Markov Decision Processes (RSC-MDPs) and theoretically demonstrate its superiority in breaking spurious correlations compared with other robust RL counterparts. We also design an empirical algorithm to learn the robust optimal policy for RSC-MDPs, which outperforms all baselines in eight realistic self-driving and manipulation tasks.

Anomaly Detection in Automated Fibre Placement: Learning with Data Limitations

  • paper_url: http://arxiv.org/abs/2307.07893
  • repo_url: None
  • paper_authors: Assef Ghamisi, Todd Charter, Li Ji, Maxime Rivard, Gil Lund, Homayoun Najjaran
  • for: Defect detection and localization in Automated Fibre Placement (AFP) systems.
  • methods: Combines unsupervised deep learning with classical computer vision algorithms, requiring neither labelled data nor manufacturing defect samples.
  • results: Detects various surface issues from comparatively few training images of composite parts and accurately identifies defect locations.
    Abstract Conventional defect detection systems in Automated Fibre Placement (AFP) typically rely on end-to-end supervised learning, necessitating a substantial number of labelled defective samples for effective training. However, the scarcity of such labelled data poses a challenge. To overcome this limitation, we present a comprehensive framework for defect detection and localization in Automated Fibre Placement. Our approach combines unsupervised deep learning and classical computer vision algorithms, eliminating the need for labelled data or manufacturing defect samples. It efficiently detects various surface issues while requiring fewer images of composite parts for training. Our framework employs an innovative sample extraction method leveraging AFP's inherent symmetry to expand the dataset. By inputting a depth map of the fibre layup surface, we extract local samples aligned with each composite strip (tow). These samples are processed through an autoencoder, trained on normal samples for precise reconstructions, highlighting anomalies through reconstruction errors. Aggregated values form an anomaly map for insightful visualization. The framework employs blob detection on this map to locate manufacturing defects. The experimental findings reveal that despite training the autoencoder with a limited number of images, our proposed method exhibits satisfactory detection accuracy and accurately identifies defect locations. Our framework demonstrates comparable performance to existing methods, while also offering the advantage of detecting all types of anomalies without relying on an extensive labelled dataset of defects.
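
The detection stage can be sketched as: train an autoencoder on normal local samples only, take per-pixel reconstruction error as the anomaly map, and localize defects by labeling connected components above a threshold. The tiny dense autoencoder, the synthetic depth patches, and the 3-sigma threshold below are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf
from scipy import ndimage

rng = np.random.default_rng(5)

# Illustrative stand-ins for local depth-map samples aligned with each tow.
normal = rng.normal(0.0, 0.1, size=(512, 16, 16)).astype("float32")
test = normal[:8].copy()
test[0, 4:8, 4:8] += 1.0                      # synthetic surface defect

ae = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(16, 16)),
    tf.keras.layers.Dense(32, activation="relu"),   # bottleneck
    tf.keras.layers.Dense(256),
    tf.keras.layers.Reshape((16, 16)),
])
ae.compile(optimizer="adam", loss="mse")
ae.fit(normal, normal, epochs=20, batch_size=64, verbose=0)  # normal samples only

recon = ae.predict(test, verbose=0)
anomaly_map = np.abs(test - recon)            # per-pixel reconstruction error

# Blob detection: threshold the map and label connected components.
mask = anomaly_map[0] > anomaly_map[0].mean() + 3 * anomaly_map[0].std()
labels, n_blobs = ndimage.label(mask)
print("defect blobs found in sample 0:", n_blobs)
```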

Multitemporal SAR images change detection and visualization using RABASAR and simplified GLR

  • paper_url: http://arxiv.org/abs/2307.07892
  • repo_url: None
  • paper_authors: Weiying Zhao, Charles-Alban Deledalle, Loïc Denis, Henri Maître, Jean-Marie Nicolas, Florence Tupin
  • for: Detecting different kinds of land-surface changes, such as farmland, building, harbour, and flooding changes.
  • methods: Proposes a simplified generalized likelihood ratio (SGLR) test that assumes corresponding temporal pixels share the same equivalent number of looks (ENL), applied to images denoised by the ratio-based multitemporal method RABASAR; also develops a new change magnitude index and an improved spectral-clustering-based change classification method.
  • results: Processing of simulated and real SAR images demonstrates the effectiveness of the proposed methods, particularly for detecting farmland, building, harbour, and flooding area changes.
    Abstract Understanding the state of changed areas requires that precise information be given about the changes. Thus, detecting different kinds of changes is important for land surface monitoring. SAR sensors are ideal to fulfil this task, because of their all-time and all-weather capabilities, with good accuracy of the acquisition geometry and without effects of atmospheric constituents for amplitude data. In this study, we propose a simplified generalized likelihood ratio ($S_{GLR}$) method assuming that corresponding temporal pixels have the same equivalent number of looks (ENL). Thanks to the denoised data provided by a ratio-based multitemporal SAR image denoising method (RABASAR), we successfully applied this similarity test approach to compute the change areas. A new change magnitude index method and an improved spectral clustering-based change classification method are also developed. In addition, we apply the simplified generalized likelihood ratio to detect the maximum change magnitude time, and the change starting and ending times. Then, we propose to use an adaptation of the REACTIV method to visualize the detection results vividly. The effectiveness of the proposed methods is demonstrated through the processing of simulated and SAR images, and the comparison with classical techniques. In particular, numerical experiments proved that the developed method has good performances in detecting farmland area changes, building area changes, harbour area changes and flooding area changes.
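
For intuition, under the standard fully developed speckle model, where the intensities $I_1, I_2$ at the two dates follow Gamma distributions with a common equivalent number of looks $L$, maximizing the likelihood under equal and unequal reflectivities gives the closed-form per-pixel statistic

$$ S_{GLR} \;=\; 2L \,\log\!\left(\frac{(I_1 + I_2)/2}{\sqrt{I_1 I_2}}\right) \;\ge\; 0, $$

which vanishes when $I_1 = I_2$ and grows with the ratio of the arithmetic to the geometric mean. This is a textbook derivation consistent with the equal-ENL assumption above; the paper applies its simplified test to RABASAR-denoised data rather than raw intensities.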

Investigation of compressor cascade flow based on physics-informed neural networks

  • paper_url: http://arxiv.org/abs/2308.04501
  • repo_url: None
  • paper_authors: Zhihui Li, Francesco Montomoli, Sanjiv Sharma
  • for: A first application of the emerging physics-informed neural network (PINN) approach to predicting compressor cascade flow fields.
  • methods: Demonstrated on a two-dimensional problem, incorporating the Navier-Stokes equations in both the forward and inverse settings.
  • results: PINNs predict the flow field with high accuracy, and show clear advantages over traditional CFD approaches for inverse problems lacking partial boundary conditions.
    Abstract In this study, we utilize the emerging Physics Informed Neural Networks (PINNs) approach for the first time to predict the flow field of a compressor cascade. The approach is demonstrated on a two-dimensional problem, incorporating Navier-Stokes equations in both the forward and inverse problems. In the forward problem, PINNs effectively predict the flow field of the compressor. The key advantage over Deep Neural Networks (DNNs) is that the PINNs model incorporates a physical relationship between the relevant quantities, resulting in more precise predictions. PINNs show obvious advantages over the traditional CFD approaches when dealing with inverse problems in the absence of partial boundary conditions. PINNs successfully reconstruct the flow field of the compressor cascade solely based on partial velocity vectors and wall pressure information. This research provides compelling evidence that PINNs offer turbomachinery designers a promising alternative to the current dominant CFD methods, delivering higher accuracy compared to DNNs.
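
A minimal PINN residual for steady 2D incompressible flow shows the structure (continuity and x-momentum terms obtained by automatic differentiation). The unit-square collocation domain, network size, and omission of boundary/data losses are illustrative simplifications of the cascade setup.

```python
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 3),              # outputs (u, v, p)
)

def grad(f, x):
    return torch.autograd.grad(f, x, torch.ones_like(f), create_graph=True)[0]

def pde_residuals(xy, nu=0.01):
    """Continuity and x-momentum residuals of the steady incompressible
    Navier-Stokes equations, computed by automatic differentiation."""
    xy = xy.requires_grad_(True)
    u, v, p = net(xy).unbind(dim=1)
    du, dp = grad(u, xy), grad(p, xy)
    u_x, u_y = du[:, 0], du[:, 1]
    v_y = grad(v, xy)[:, 1]
    u_xx = grad(u_x, xy)[:, 0]
    u_yy = grad(u_y, xy)[:, 1]
    continuity = u_x + v_y
    momentum_x = u * u_x + v * u_y + dp[:, 0] - nu * (u_xx + u_yy)
    return continuity, momentum_x

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(500):
    opt.zero_grad()
    pts = torch.rand(256, 2)                   # collocation points in a unit square
    c, mx = pde_residuals(pts)
    loss = (c ** 2).mean() + (mx ** 2).mean()  # boundary/data terms omitted here
    loss.backward()
    opt.step()
```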

Handwritten and Printed Text Segmentation: A Signature Case Study

  • paper_url: http://arxiv.org/abs/2307.07887
  • repo_url: None
  • paper_authors: Sina Gholamian, Ali Vahdat
  • for: Improving the segmentation of handwritten and printed text, especially in overlapping regions.
  • methods: Introduces the SignaTR6K dataset, collected from real legal documents, together with a new model architecture for handwritten and printed text segmentation.
  • results: The best configuration outperforms prior work on two different datasets by 17.9% and 7.3% in IoU score.
    Abstract While analyzing scanned documents, handwritten text can overlap with printed text. This overlap causes difficulties during the optical character recognition (OCR) and digitization process of documents, and subsequently, hurts downstream NLP tasks. Prior research either focuses solely on the binary classification of handwritten text or performs a three-class segmentation of the document, i.e., recognition of handwritten, printed, and background pixels. This approach results in the assignment of overlapping handwritten and printed pixels to only one of the classes, and thus, they are not accounted for in the other class. Thus, in this research, we develop novel approaches to address the challenges of handwritten and printed text segmentation. Our objective is to recover text from different classes in their entirety, especially enhancing the segmentation performance on overlapping sections. To support this task, we introduce a new dataset, SignaTR6K, collected from real legal documents, as well as a new model architecture for the handwritten and printed text segmentation task. Our best configuration outperforms prior work on two different datasets by 17.9% and 7.3% on IoU scores. The SignaTR6K dataset is accessible for download via the following link: https://forms.office.com/r/2a5RDg7cAY.

Intuitionistic Fuzzy Broad Learning System: Enhancing Robustness Against Noise and Outliers

  • paper_url: http://arxiv.org/abs/2307.08713
  • repo_url: None
  • paper_authors: M. Sajid, A. K. Malik, M. Tanveer
  • for: Improving the robustness and effectiveness of the Broad Learning System (BLS) on real-world datasets containing noise and outliers.
  • methods: Proposes two improved BLS models: a fuzzy F-BLS model and an IF-BLS model based on intuitionistic fuzzy theory. Both score training points with distance-based functions, but F-BLS considers only the distance from each point to its class center, whereas IF-BLS assigns both membership and non-membership values in a kernel-induced feature space.
  • results: On 44 UCI benchmark datasets and the ADNI dataset, the proposed F-BLS and IF-BLS models show better generalization and robustness than baseline models, including on UCI datasets corrupted with Gaussian noise.
    Abstract In the realm of data classification, broad learning system (BLS) has proven to be a potent tool that utilizes a layer-by-layer feed-forward neural network. It consists of feature learning and enhancement segments, working together to extract intricate features from input data. The traditional BLS treats all samples as equally significant, which makes it less robust and less effective for real-world datasets with noises and outliers. To address this issue, we propose the fuzzy BLS (F-BLS) model, which assigns a fuzzy membership value to each training point to reduce the influence of noises and outliers. In assigning the membership value, the F-BLS model solely considers the distance from samples to the class center in the original feature space without incorporating the extent of non-belongingness to a class. We further propose a novel BLS based on intuitionistic fuzzy theory (IF-BLS). The proposed IF-BLS utilizes intuitionistic fuzzy numbers based on fuzzy membership and non-membership values to assign scores to training points in the high-dimensional feature space by using a kernel function. We evaluate the performance of proposed F-BLS and IF-BLS models on 44 UCI benchmark datasets across diverse domains. Furthermore, Gaussian noise is added to some UCI datasets to assess the robustness of the proposed F-BLS and IF-BLS models. Experimental results demonstrate superior generalization performance of the proposed F-BLS and IF-BLS models compared to baseline models, both with and without Gaussian noise. Additionally, we implement the proposed F-BLS and IF-BLS models on the Alzheimers Disease Neuroimaging Initiative (ADNI) dataset, and promising results showcase the models effectiveness in real-world applications. The proposed methods offer a promising solution to enhance the BLS frameworks ability to handle noise and outliers.
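
The membership idea can be sketched as follows: F-BLS weights each sample by its closeness to its own class center, while the intuitionistic variant additionally builds a non-membership score from kernel-weighted neighbors of other classes. The Gaussian-kernel non-membership construction and the final scoring rule below are illustrative assumptions, not the paper's exact formulas.

```python
import numpy as np

def fuzzy_membership(X, y):
    """F-BLS-style weight: closeness of each sample to its own class center."""
    mu = np.zeros(len(X))
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        center = X[idx].mean(axis=0)
        d = np.linalg.norm(X[idx] - center, axis=1)
        mu[idx] = 1.0 - d / (d.max() + 1e-12)     # far from center -> low weight
    return mu

def intuitionistic_score(X, y, gamma=1.0):
    """IF-BLS-style score combining membership and non-membership.

    Non-membership here grows with the kernel-weighted fraction of nearby
    points from other classes; the paper's construction may differ.
    """
    mu = fuzzy_membership(X, y)
    nu = np.zeros(len(X))
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq)
    for i in range(len(X)):
        other = K[i, y != y[i]].sum()
        nu[i] = (1 - mu[i]) * other / (K[i].sum() + 1e-12)
    return mu * (1 - nu)        # down-weights noisy/outlying points

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
scores = intuitionistic_score(X, y)
print("lowest-weighted (likely noisy) samples:", np.argsort(scores)[:5])
```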

Gradient-free training of neural ODEs for system identification and control using ensemble Kalman inversion

  • paper_url: http://arxiv.org/abs/2307.07882
  • repo_url: https://gitlab.com/computationalscience/eki-neural-ode
  • paper_authors: Lucas Böttcher
  • for: solves inverse problems within a Bayesian framework for system identification and optimal control tasks
  • methods: Ensemble Kalman inversion (EKI), a sequential Monte Carlo method that is gradient-free and only requires forward passes to evaluate artificial neural networks
  • results: EKI is an efficient method for training neural ODEs, with competitive runtime and solution quality compared to commonly used gradient-based optimizers.
    Abstract Ensemble Kalman inversion (EKI) is a sequential Monte Carlo method used to solve inverse problems within a Bayesian framework. Unlike backpropagation, EKI is a gradient-free optimization method that only necessitates the evaluation of artificial neural networks in forward passes. In this study, we examine the effectiveness of EKI in training neural ordinary differential equations (neural ODEs) for system identification and control tasks. To apply EKI to optimal control problems, we formulate inverse problems that incorporate a Tikhonov-type regularization term. Our numerical results demonstrate that EKI is an efficient method for training neural ODEs in system identification and optimal control problems, with runtime and quality of solutions that are competitive with commonly used gradient-based optimizers.
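
The core EKI iteration needs only forward evaluations of the model: each ensemble member is nudged by a Kalman-style gain built from ensemble covariances. The sketch below applies it to a toy linear inverse problem; the ensemble size, noise level, and linear forward map are illustrative (the paper applies the same update to neural ODE weights).

```python
import numpy as np

rng = np.random.default_rng(7)

def eki_step(thetas, G, y, Gamma):
    """One ensemble Kalman inversion update (gradient-free; forward passes only).

    theta_j <- theta_j + C_tg (C_gg + Gamma)^{-1} (y + noise - G(theta_j)),
    with the cross-covariances estimated from the ensemble itself.
    """
    Gs = np.array([G(t) for t in thetas])          # forward evaluations
    t_mean, g_mean = thetas.mean(0), Gs.mean(0)
    dT, dG = thetas - t_mean, Gs - g_mean
    C_tg = dT.T @ dG / (len(thetas) - 1)           # cov(theta, G(theta))
    C_gg = dG.T @ dG / (len(thetas) - 1)           # cov(G(theta))
    K = C_tg @ np.linalg.inv(C_gg + Gamma)         # Kalman-style gain
    noise = rng.multivariate_normal(np.zeros(len(y)), Gamma, size=len(thetas))
    return thetas + (y + noise - Gs) @ K.T

# Toy inverse problem: recover theta* from y = A theta*.
A = rng.normal(size=(5, 3))
theta_star = np.array([1.0, -2.0, 0.5])
y = A @ theta_star
Gamma = 0.01 * np.eye(5)

thetas = rng.normal(size=(100, 3))                 # ensemble of candidate parameters
for _ in range(30):
    thetas = eki_step(thetas, lambda t: A @ t, y, Gamma)
print("ensemble mean:", thetas.mean(0), " truth:", theta_star)
```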

Graph Embedded Intuitionistic Fuzzy RVFL for Class Imbalance Learning

  • paper_url: http://arxiv.org/abs/2307.07881
  • repo_url: None
  • paper_authors: M. A. Ganaie, M. Sajid, A. K. Malik, M. Tanveer
  • for: Addressing class imbalance learning, where training biased toward the majority class leads to underrepresentation of minority classes.
  • methods: Proposes GE-IFRVFL-CIL, a graph-embedded intuitionistic fuzzy random vector functional link (RVFL) network with a weighting mechanism for imbalanced datasets; graph embedding extracts semantically rich information from the data, while intuitionistic fuzzy sets handle uncertainty and imprecision.
  • results: Superior performance on benchmark imbalanced datasets, including UCI and KEEL, and promising results on the ADNI dataset, demonstrating effectiveness in real-world applications.
    Abstract The domain of machine learning is confronted with a crucial research area known as class imbalance learning, which presents considerable hurdles in the precise classification of minority classes. This issue can result in biased models where the majority class takes precedence in the training process, leading to the underrepresentation of the minority class. The random vector functional link (RVFL) network is a widely-used and effective learning model for classification due to its speed and efficiency. However, it suffers from low accuracy when dealing with imbalanced datasets. To overcome this limitation, we propose a novel graph embedded intuitionistic fuzzy RVFL for class imbalance learning (GE-IFRVFL-CIL) model incorporating a weighting mechanism to handle imbalanced datasets. The proposed GE-IFRVFL-CIL model has a plethora of benefits, such as $(i)$ it leverages graph embedding to extract semantically rich information from the dataset, $(ii)$ it uses intuitionistic fuzzy sets to handle uncertainty and imprecision in the data, $(iii)$ and the most important, it tackles class imbalance learning. The amalgamation of a weighting scheme, graph embedding, and intuitionistic fuzzy sets leads to the superior performance of the proposed model on various benchmark imbalanced datasets, including UCI and KEEL. Furthermore, we implement the proposed GE-IFRVFL-CIL on the ADNI dataset and achieved promising results, demonstrating the model's effectiveness in real-world applications. The proposed method provides a promising solution for handling class imbalance in machine learning and has the potential to be applied to other classification problems.

Why Does Little Robustness Help? Understanding Adversarial Transferability From Surrogate Training

  • paper_url: http://arxiv.org/abs/2307.07873
  • repo_url: None
  • paper_authors: Yechao Zhang, Shengshan Hu, Leo Yu Zhang, Junyu Shi, Minghui Li, Xiaogeng Liu, Wei Wan, Hai Jin
  • for: A deeper understanding of the transferability of adversarial examples against deep neural networks (DNNs), with particular focus on surrogate models.
  • methods: A series of theoretical and empirical analyses of how two dominant factors, model smoothness and gradient similarity, jointly affect adversarial transferability.
  • results: The data distribution shift induced by adversarial training explains the degradation of gradient similarity; jointly improving model smoothness and gradient similarity, e.g., by combining input gradient regularization with sharpness-aware minimization (SAM), yields better surrogates and boosts transferability.
    Abstract Adversarial examples (AEs) for DNNs have been shown to be transferable: AEs that successfully fool white-box surrogate models can also deceive other black-box models with different architectures. Although a bunch of empirical studies have provided guidance on generating highly transferable AEs, many of these findings lack explanations and even lead to inconsistent advice. In this paper, we take a further step towards understanding adversarial transferability, with a particular focus on surrogate aspects. Starting from the intriguing little robustness phenomenon, where models adversarially trained with mildly perturbed adversarial samples can serve as better surrogates, we attribute it to a trade-off between two predominant factors: model smoothness and gradient similarity. Our investigations focus on their joint effects, rather than their separate correlations with transferability. Through a series of theoretical and empirical analyses, we conjecture that the data distribution shift in adversarial training explains the degradation of gradient similarity. Building on these insights, we explore the impacts of data augmentation and gradient regularization on transferability and identify that the trade-off generally exists in the various training mechanisms, thus building a comprehensive blueprint for the regulation mechanism behind transferability. Finally, we provide a general route for constructing better surrogates to boost transferability which optimizes both model smoothness and gradient similarity simultaneously, e.g., the combination of input gradient regularization and sharpness-aware minimization (SAM), validated by extensive experiments. In summary, we call for attention to the united impacts of these two factors for launching effective transfer attacks, rather than optimizing one while ignoring the other, and emphasize the crucial role of manipulating surrogate models.
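
The paper's prescription, optimizing model smoothness and gradient similarity together, can be sketched as a training step that adds an input-gradient-norm penalty and wraps the weight update in sharpness-aware minimization (SAM). The toy model, the penalty weight, and the SAM radius below are illustrative hyperparameters.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(),
                            torch.nn.Linear(64, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.05)

def surrogate_step(x, y, rho=0.05, lam=0.1):
    # Input gradient regularization: penalize ||d loss / d x||^2.
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    gx, = torch.autograd.grad(loss, x, create_graph=True)
    loss = loss + lam * gx.pow(2).sum(dim=1).mean()
    loss.backward()

    # SAM: ascend to the rho-neighborhood worst case, then descend from there.
    with torch.no_grad():
        gnorm = torch.sqrt(sum((p.grad ** 2).sum() for p in model.parameters()))
        eps = [rho * p.grad / (gnorm + 1e-12) for p in model.parameters()]
        for p, e in zip(model.parameters(), eps):
            p.add_(e)
    opt.zero_grad()
    loss2 = F.cross_entropy(model(x.detach()), y)
    loss2.backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.sub_(e)                     # restore weights before the real step
    opt.step()
    opt.zero_grad()

x = torch.randn(64, 32)
y = torch.randint(0, 10, (64,))
for _ in range(10):
    surrogate_step(x, y)
```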

Does Double Descent Occur in Self-Supervised Learning?

  • paper_url: http://arxiv.org/abs/2307.07872
  • repo_url: https://github.com/yonatangideoni/double_descent_tiny_paper
  • paper_authors: Alisia Lupidi, Yonatan Gideoni, Dulhan Jayalath
  • for: investigate the existence of double descent in self-supervised models
  • methods: use standard and linear autoencoders, two previously unstudied settings
  • results: the test loss either has a classical U-shape or monotonically decreases, without exhibiting a double-descent curve.
    Abstract Most investigations into double descent have focused on supervised models while the few works studying self-supervised settings find a surprising lack of the phenomenon. These results imply that double descent may not exist in self-supervised models. We show this empirically using a standard and linear autoencoder, two previously unstudied settings. The test loss is found to have either a classical U-shape or to monotonically decrease instead of exhibiting a double-descent curve. We hope that further work on this will help elucidate the theoretical underpinnings of this phenomenon.

The SocialAI School: Insights from Developmental Psychology Towards Artificial Socio-Cultural Agents

  • paper_url: http://arxiv.org/abs/2307.07871
  • repo_url: None
  • paper_authors: Grgur Kovač, Rémy Portelas, Peter Ford Dominey, Pierre-Yves Oudeyer
  • for: Exploring how developmental psychology can inform AI research on the socio-cognitive abilities needed to enter, participate in, and benefit from human culture.
  • methods: Draws on the theories of developmental psychologists Michael Tomasello and Jerome Bruner, and introduces the SocialAI school, a tool with a customizable, parameterized suite of procedurally generated environments that simplifies experiments on these concepts.
  • results: Demonstrates example experiments with RL agents and large language models, and offers the AI community a simple entry point to social intelligence research grounded in developmental psychology.
    Abstract Developmental psychologists have long-established the importance of socio-cognitive abilities in human intelligence. These abilities enable us to enter, participate and benefit from human culture. AI research on social interactive agents mostly concerns the emergence of culture in a multi-agent setting (often without a strong grounding in developmental psychology). We argue that AI research should be informed by psychology and study socio-cognitive abilities enabling to enter a culture too. We discuss the theories of Michael Tomasello and Jerome Bruner to introduce some of their concepts to AI and outline key concepts and socio-cognitive abilities. We present The SocialAI school - a tool including a customizable parameterized uite of procedurally generated environments, which simplifies conducting experiments regarding those concepts. We show examples of such experiments with RL agents and Large Language Models. The main motivation of this work is to engage the AI community around the problem of social intelligence informed by developmental psychology, and to provide a tool to simplify first steps in this direction. Refer to the project website for code and additional information: https://sites.google.com/view/socialai-school.

Large Language Models as Superpositions of Cultural Perspectives

  • paper_url: http://arxiv.org/abs/2307.07870
  • repo_url: None
  • paper_authors: Grgur Kovač, Masataka Sawayama, Rémy Portelas, Cédric Colas, Peter Ford Dominey, Pierre-Yves Oudeyer
  • for: Examining the common, misleading perception that large language models (LLMs) have a single personality or set of values.
  • methods: Uses psychology questionnaires (PVQ, VSM, IPIP) and qualitative experiments to study how the values and personality traits expressed by LLMs change with the induced perspective.
  • results: LLMs express context-dependent values and personality traits; the paper introduces the notion of perspective controllability and quantitatively studies the controllability of GPT-4, GPT-3.5, OpenAssistant, StableVicuna, and StableLM, the effectiveness of different perspective-inducing methods, and the smoothness of the models' drivability.
    Abstract Large Language Models (LLMs) are often misleadingly recognized as having a personality or a set of values. We argue that an LLM can be seen as a superposition of perspectives with different values and personality traits. LLMs exhibit context-dependent values and personality traits that change based on the induced perspective (as opposed to humans, who tend to have more coherent values and personality traits across contexts). We introduce the concept of perspective controllability, which refers to a model's affordance to adopt various perspectives with differing values and personality traits. In our experiments, we use questionnaires from psychology (PVQ, VSM, IPIP) to study how exhibited values and personality traits change based on different perspectives. Through qualitative experiments, we show that LLMs express different values when those are (implicitly or explicitly) implied in the prompt, and that LLMs express different values even when those are not obviously implied (demonstrating their context-dependent nature). We then conduct quantitative experiments to study the controllability of different models (GPT-4, GPT-3.5, OpenAssistant, StableVicuna, StableLM), the effectiveness of various methods for inducing perspectives, and the smoothness of the models' drivability. We conclude by examining the broader implications of our work and outline a variety of associated scientific questions. The project website is available at https://sites.google.com/view/llm-superpositions .

Custom DNN using Reward Modulated Inverted STDP Learning for Temporal Pattern Recognition

  • paper_url: http://arxiv.org/abs/2307.07869
  • repo_url: None
  • paper_authors: Vijay Shankaran Vivekanand, Rajkumar Kubendran
  • for: Efficient temporal spike pattern recognition for domains such as anomaly detection, keyword spotting, and neuroscience.
  • methods: Combines reward-modulated behavior with Hebbian and anti-Hebbian learning to identify patterns in dynamic, sparse event-series data using short training intervals.
  • results: On a complex spoken-digit dataset with spike information, the algorithm outperforms the state-of-the-art.
    Abstract Temporal spike recognition plays a crucial role in various domains, including anomaly detection, keyword spotting and neuroscience. This paper presents a novel algorithm for efficient temporal spike pattern recognition on sparse event series data. The algorithm leverages a combination of reward-modulatory behavior, Hebbian and anti-Hebbian based learning methods to identify patterns in dynamic datasets with short intervals of training. The algorithm begins with a preprocessing step, where the input data is rationalized and translated to a feature-rich yet sparse spike time series data. Next, a linear feed forward spiking neural network processes this data to identify a trained pattern. Finally, the next layer performs a weighted check to ensure the correct pattern has been detected.To evaluate the performance of the proposed algorithm, it was trained on a complex dataset containing spoken digits with spike information and its output compared to state-of-the-art.
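
A generic three-factor sketch conveys the mechanism: pair-based (anti-)Hebbian updates accumulate in an eligibility trace, and a reward signal gates which of them are committed to the weights. The neuron model, time constants, signs, and the toy reward schedule below are illustrative assumptions; the paper's inverted-STDP rule is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(8)

T, n_pre = 200, 20
w = rng.uniform(0.0, 0.5, size=n_pre)       # synapses onto one output neuron
pre_trace, elig = np.zeros(n_pre), np.zeros(n_pre)
post_trace, v = 0.0, 0.0

a_plus, a_minus = 0.01, 0.012               # pair-based STDP amplitudes
tau_trace, tau_elig, thresh = 20.0, 50.0, 1.0

for t in range(T):
    pre_spikes = rng.random(n_pre) < 0.05   # Poisson-like input spikes
    v = 0.9 * v + w @ pre_spikes            # leaky integrate-and-fire membrane
    post_spike = v > thresh
    if post_spike:
        v = 0.0

    # Pair-based STDP contribution for this step (flipping the signs would
    # give the anti-Hebbian / "inverted" variant).
    dw = a_plus * pre_trace * post_spike - a_minus * post_trace * pre_spikes
    elig += -elig / tau_elig + dw           # eligibility trace stores candidates

    pre_trace += -pre_trace / tau_trace + pre_spikes
    post_trace += -post_trace / tau_trace + post_spike

    reward = 1.0 if (post_spike and t % 40 < 20) else 0.0  # toy task signal
    w = np.clip(w + reward * elig, 0.0, 1.0)  # three-factor: reward gates updates

print("final mean weight:", w.mean())
```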

Contrasting the efficiency of stock price prediction models using various types of LSTM models aided with sentiment analysis

  • paper_url: http://arxiv.org/abs/2307.07868
  • repo_url: None
  • paper_authors: Varun Sangwan, Vishesh Kumar Singh, Bibin Christopher V
  • for: Correctly predicting equity share prices, for both short- and long-term goals, from company projections and sector performance.
  • methods: Builds and contrasts several LSTM-based models, aided by sentiment analysis, on company projections and sector performance.
  • results: Identifies which model best helps investors understand a company's stock price for long- and short-term investment decisions.
    Abstract Our research aims to find the best model that uses company projections and sector performance, together with how the given company fares against them, to correctly predict equity share prices for both short- and long-term goals.

Benchmarking the Effectiveness of Classification Algorithms and SVM Kernels for Dry Beans

  • paper_url: http://arxiv.org/abs/2307.07863
  • repo_url: None
  • paper_authors: Anant Mehta, Prajit Sengupta, Divisha Garg, Harpreet Singh, Yosi Shacham Diamand
  • for: Helping plant breeders and agricultural researchers increase crop productivity by identifying desirable traits, disease resistance, and nutritional content.
  • methods: Compares Support Vector Machine (SVM) classifiers with linear, polynomial, and radial basis function (RBF) kernels against other popular classification algorithms on the Dry Bean dataset, with Principal Component Analysis (PCA) as a dimensionality-reduction preprocessing step.
  • results: The RBF SVM kernel achieves the best performance: 93.34% accuracy, 92.61% precision, 92.35% recall, and a 91.40% F1 score.
    Abstract Plant breeders and agricultural researchers can increase crop productivity by identifying desirable features, disease resistance, and nutritional content by analysing the Dry Bean dataset. This study analyses and compares different Support Vector Machine (SVM) classification algorithms, namely linear, polynomial, and radial basis function (RBF), along with other popular classification algorithms. The analysis is performed on the Dry Bean Dataset, with PCA (Principal Component Analysis) conducted as a preprocessing step for dimensionality reduction. The primary evaluation metric used is accuracy, and the RBF SVM kernel algorithm achieves the highest Accuracy of 93.34%, Precision of 92.61%, Recall of 92.35% and F1 Score as 91.40%. Along with adept visualization and empirical analysis, this study offers valuable guidance by emphasizing the importance of considering different SVM algorithms for complex and non-linear structured datasets.
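
The evaluation recipe, PCA preprocessing followed by a comparison of SVM kernels on accuracy, is straightforward to reproduce with scikit-learn. A synthetic stand-in replaces the Dry Bean dataset below, since it ships from the UCI repository rather than with the library; the split and PCA dimensionality are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 16-feature, 7-class Dry Bean dataset (UCI).
X, y = make_classification(n_samples=2000, n_features=16, n_informative=10,
                           n_classes=7, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

for kernel in ["linear", "poly", "rbf"]:
    clf = make_pipeline(StandardScaler(), PCA(n_components=8), SVC(kernel=kernel))
    clf.fit(X_tr, y_tr)
    print(kernel, "accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```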

Automated Knowledge Modeling for Cancer Clinical Practice Guidelines

  • paper_url: http://arxiv.org/abs/2307.10231
  • repo_url: None
  • paper_authors: Pralaypati Ta, Bhumika Gupta, Arihant Jain, Sneha Sree C, Arunima Sarkar, Keerthi Ram, Mohanasankar Sivaprakasam
  • for: This paper aims to develop an automated method for extracting knowledge from National Comprehensive Cancer Network (NCCN) Clinical Practice Guidelines (CPGs) in Oncology and generating a structured model containing the retrieved knowledge.
  • methods: The proposed method uses natural language processing (NLP) techniques to extract knowledge from NCCN CPGs and employs a knowledge model to represent the extracted knowledge in a structured format. It also includes three enrichment strategies to enhance the model: cancer staging information, UMLS Metathesaurus & NCIt concepts, and node classification.
  • results: The method was tested using two versions of the NCCN Non-Small Cell Lung Cancer (NSCLC) CPG and achieved a node-classification accuracy of 0.81 using a Support Vector Machine (SVM) model with 10-fold cross-validation.
    Abstract Clinical Practice Guidelines (CPGs) for cancer diseases evolve rapidly due to new evidence generated by active research. Currently, CPGs are primarily published in a document format that is ill-suited for managing this developing knowledge. A knowledge model of the guidelines document suitable for programmatic interaction is required. This work proposes an automated method for extraction of knowledge from National Comprehensive Cancer Network (NCCN) CPGs in Oncology and generating a structured model containing the retrieved knowledge. The proposed method was tested using two versions of NCCN Non-Small Cell Lung Cancer (NSCLC) CPG to demonstrate the effectiveness in faithful extraction and modeling of knowledge. Three enrichment strategies using Cancer staging information, Unified Medical Language System (UMLS) Metathesaurus & National Cancer Institute thesaurus (NCIt) concepts, and Node classification are also presented to enhance the model towards enabling programmatic traversal and querying of cancer care guidelines. The Node classification was performed using a Support Vector Machine (SVM) model, achieving a classification accuracy of 0.81 with 10-fold cross-validation.
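
As a rough illustration of what a programmatically traversable guideline model could look like, the sketch below builds a small directed graph of guideline nodes with networkx and runs a toy SVM node classifier under 10-fold cross-validation. The node labels, attributes, and random stand-in features are hypothetical, not the paper's actual schema.

```python
# Illustrative sketch (assumed schema): guideline statements as graph nodes,
# enriched with stage info and concept codes, plus a toy SVM node classifier.
import networkx as nx
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

g = nx.DiGraph()
g.add_node("n1", text="Stage IA: consider surgical resection",
           stage="IA", ncit="C4667", kind="treatment")       # codes are placeholders
g.add_node("n2", text="Assess mediastinal lymph nodes",
           stage="IA", ncit="C142135", kind="diagnostic")
g.add_edge("n1", "n2")  # traversal order within the guideline pathway

# Programmatic query: all treatment nodes applicable to a given stage.
treatments = [n for n, d in g.nodes(data=True)
              if d["kind"] == "treatment" and d["stage"] == "IA"]
print(treatments)

# Toy node classification: random features stand in for real node embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))          # hypothetical node feature vectors
y = rng.integers(0, 2, size=200)        # hypothetical node-type labels
scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=10)
print("10-fold CV accuracy:", scores.mean())
```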

Variational Inference with Gaussian Score Matching

  • paper_url: http://arxiv.org/abs/2307.07849
  • repo_url: https://github.com/modichirag/gsm-vi
  • paper_authors: Chirag Modi, Charles Margossian, Yuling Yao, Robert Gower, David Blei, Lawrence Saul
  • for: Researchers and practitioners interested in Bayesian inference and variational inference.
  • methods: Proposes a new approach to variational inference, score matching variational inference (GSM-VI), which is based on the principle of score matching and can be applied to a wide class of models. The algorithm iteratively adjusts the variational estimate to match the scores at newly sampled values of the latent variables.
  • results: Compares GSM-VI to black box variational inference (BBVI) and studies how GSM-VI behaves as a function of the problem dimensionality, the condition number of the target covariance matrix, and the degree of mismatch between the approximating and exact posterior distributions. GSM-VI is faster than BBVI and requires fewer gradient evaluations to obtain a comparable quality of approximation.
    Abstract Variational inference (VI) is a method to approximate the computationally intractable posterior distributions that arise in Bayesian statistics. Typically, VI fits a simple parametric distribution to the target posterior by minimizing an appropriate objective such as the evidence lower bound (ELBO). In this work, we present a new approach to VI based on the principle of score matching: if two distributions are equal, then their score functions (i.e., gradients of the log density) are equal at every point on their support. With this, we develop score matching VI, an iterative algorithm that seeks to match the scores between the variational approximation and the exact posterior. At each iteration, score matching VI solves an inner optimization, one that minimally adjusts the current variational estimate to match the scores at a newly sampled value of the latent variables. We show that when the variational family is Gaussian, this inner optimization enjoys a closed-form solution, which we call Gaussian score matching VI (GSM-VI). GSM-VI is also a "black box" variational algorithm in that it only requires a differentiable joint distribution, and as such it can be applied to a wide class of models. We compare GSM-VI to black box variational inference (BBVI), which has similar requirements but instead optimizes the ELBO. We study how GSM-VI behaves as a function of the problem dimensionality, the condition number of the target covariance matrix (when the target is Gaussian), and the degree of mismatch between the approximating and exact posterior distributions. We also study GSM-VI on a collection of real-world Bayesian inference problems from the posteriorDB database of datasets and models. In all of our studies we find that GSM-VI is faster than BBVI without sacrificing accuracy: it requires 10-100x fewer gradient evaluations to obtain a comparable quality of approximation.
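
To convey the flavor of the inner score-matching step, the sketch below fits only the mean of a Gaussian approximation with identity covariance to a known Gaussian target. Under these simplifying assumptions the score of q at z is mu - z, so the inner minimization over mu has the closed form mu = mean(z + score_p(z)) across the sampled latents. This is an illustration of the score-matching idea only, not the paper's full GSM-VI update, which also adapts the covariance in closed form.

```python
# Minimal sketch of score-matching VI for a Gaussian variational family with
# identity covariance (a simplification of GSM-VI). The target is a 2-D
# Gaussian with a known, analytic score function.
import numpy as np

rng = np.random.default_rng(0)
m_p = np.array([1.0, -2.0])                       # target mean
Sigma_p = np.array([[2.0, 0.5], [0.5, 1.0]])      # target covariance
Sigma_p_inv = np.linalg.inv(Sigma_p)

def score_p(z):
    """Score (gradient of the log density) of the Gaussian target at z."""
    return (m_p - z) @ Sigma_p_inv                # symmetric matrix, so row-wise

mu = np.zeros(2)                                  # variational mean, q = N(mu, I)
for step in range(200):
    z = mu + rng.normal(size=(64, 2))             # sample a batch from q
    # Closed-form minimizer of sum ||s_q(z) - s_p(z)||^2 over mu,
    # since s_q(z) = mu - z for identity covariance.
    target = z + score_p(z)
    mu = 0.9 * mu + 0.1 * target.mean(axis=0)     # damped update for stability

print("estimated mean:", mu, "target mean:", m_p)
```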

Neural Video Recovery for Cloud Gaming

  • paper_url: http://arxiv.org/abs/2307.07847
  • repo_url: https://github.com/Aryia-Behroziuan/neurons
  • paper_authors: Zhaoyuan He, Yifan Yang, Shuozhe Li, Diyuan Dai, Lili Qiu
  • for: Improve the video recovery rate and quality in cloud gaming to deliver a better gaming experience.
  • methods: Use game states for video recovery, and use partially decoded frames to recover the lost portions. A holistic system is developed that extracts game states, modifies the H.264 video decoder to generate a mask indicating which portions of a frame need recovery, and designs a novel neural network to recover complete or partial video frames.
  • results: Implementations on an iPhone 12 and a laptop demonstrate the importance of game states for video recovery and the effectiveness of the overall design.
    Abstract Cloud gaming is a multi-billion dollar industry. A client in cloud gaming sends its movement to the game server on the Internet, which renders and transmits the resulting video back. In order to provide a good gaming experience, a latency below 80 ms is required. This means that video rendering, encoding, transmission, decoding, and display have to finish within that time frame, which is especially challenging to achieve due to server overload, network congestion, and losses. In this paper, we propose a new method for recovering lost or corrupted video frames in cloud gaming. Unlike traditional video frame recovery, our approach uses game states to significantly enhance recovery accuracy and utilizes partially decoded frames to recover lost portions. We develop a holistic system that consists of (i) efficiently extracting game states, (ii) modifying H.264 video decoder to generate a mask to indicate which portions of video frames need recovery, and (iii) designing a novel neural network to recover either complete or partial video frames. Our approach is extensively evaluated using iPhone 12 and laptop implementations, and we demonstrate the utility of game states in the game video recovery and the effectiveness of our overall design.
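
The sketch below outlines, in PyTorch, one plausible shape for such a recovery network: it takes the partially decoded frame, the decoder-produced loss mask, and game-state features rasterized into extra channels, and predicts content only for the lost regions. The channel counts, layer sizes, and the way game state is rasterized are all assumptions for illustration, not the paper's architecture.

```python
# Illustrative recovery network (assumed architecture, not the paper's):
# inputs are the partially decoded RGB frame, a 1-channel mask of lost
# regions, and game-state features rasterized to a 4-channel map.
import torch
import torch.nn as nn

class RecoveryNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3 + 1 + 4, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, kernel_size=3, padding=1),  # predicted content
        )

    def forward(self, partial_frame, mask, state_map):
        x = torch.cat([partial_frame, mask, state_map], dim=1)
        # Only overwrite the lost regions; keep correctly decoded pixels.
        return partial_frame * (1 - mask) + self.body(x) * mask

net = RecoveryNet()
frame = torch.rand(1, 3, 180, 320)                  # partially decoded frame
mask = (torch.rand(1, 1, 180, 320) > 0.8).float()   # 1 marks a lost region
state = torch.rand(1, 4, 180, 320)                  # hypothetical game-state channels
recovered = net(frame, mask, state)
print(recovered.shape)  # torch.Size([1, 3, 180, 320])
```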

Transformers are Universal Predictors

  • paper_url: http://arxiv.org/abs/2307.07843
  • repo_url: https://github.com/danderfer/Comp_Sci_Sem_2
  • paper_authors: Sourya Basu, Moulik Choraria, Lav R. Varshney
  • for: Studies the limits of the Transformer architecture for language modeling and its universal prediction property in an information-theoretic sense.
  • methods: Uses information-theoretic tools to analyze Transformer performance in non-asymptotic (finite-data) regimes and to understand the role of the various architectural components, especially in the context of data-efficient training.
  • results: Experiments on both synthetic and real datasets validate the theoretical analysis and show good performance, including under data-efficient training.
    Abstract We find limits to the Transformer architecture for language modeling and show it has a universal prediction property in an information-theoretic sense. We further analyze performance in non-asymptotic data regimes to understand the role of various components of the Transformer architecture, especially in the context of data-efficient training. We validate our theoretical analysis with experiments on both synthetic and real datasets.
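
As background for the information-theoretic notion of universal prediction invoked here, the sketch below implements the classical Krichevsky-Trofimov (KT) estimator on a binary sequence and compares its cumulative log-loss to the sequence's empirical entropy. This illustrates what a universal predictor guarantees in the log-loss sense; it is not code from the paper.

```python
# Classical KT estimator: a universal predictor for binary sequences whose
# cumulative log-loss exceeds the empirical entropy by only O(log n) bits.
import numpy as np

rng = np.random.default_rng(1)
seq = (rng.random(10_000) < 0.3).astype(int)     # Bernoulli(0.3) source

ones = zeros = 0
log_loss = 0.0
for bit in seq:
    p_one = (ones + 0.5) / (ones + zeros + 1.0)  # KT predictive probability
    log_loss += -np.log2(p_one if bit == 1 else 1.0 - p_one)
    ones += bit
    zeros += 1 - bit

p_hat = seq.mean()
entropy = -(p_hat * np.log2(p_hat) + (1 - p_hat) * np.log2(1 - p_hat))
print(f"KT log-loss per symbol: {log_loss / len(seq):.4f}")
print(f"empirical entropy:      {entropy:.4f}")  # per-symbol gap shrinks like log(n)/(2n)
```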

RegExplainer: Generating Explanations for Graph Neural Networks in Regression Task

  • paper_url: http://arxiv.org/abs/2307.07840
  • repo_url: None
  • paper_authors: Jiaxing Zhang, Zhuomin Chen, Hao Mei, Dongsheng Luo, Hua Wei
  • for: Aims to explain graph regression models (XAIG-R) so that the inference process in graph learning tasks becomes interpretable.
  • methods: Proposes a new objective based on the information bottleneck theory together with a mix-up framework that supports various GNNs in a model-agnostic manner, plus a contrastive learning strategy to handle the continuously ordered labels of regression tasks.
  • results: Extensive experiments on three benchmark datasets and one real-life dataset demonstrate the effectiveness of the method in explaining GNN models on regression tasks.
    Abstract Graph regression is a fundamental task that has received increasing attention in a wide range of graph learning tasks. However, the inference process is often not interpretable, and most existing explanation techniques are limited to understanding GNN behaviors in classification tasks. In this work, we seek explanations for graph regression models (XAIG-R). We show that existing methods overlook the distribution shift and the continuously ordered decision boundary, which prevents them from being applied to regression tasks. To address these challenges, we propose a novel objective based on the information bottleneck theory and introduce a new mix-up framework, which can support various GNNs in a model-agnostic manner. We further present a contrastive learning strategy to tackle the continuously ordered labels in regression tasks. To empirically verify the effectiveness of the proposed method, we introduce three benchmark datasets and a real-life dataset for evaluation. Extensive experiments show the effectiveness of the proposed method in interpreting GNN models in regression tasks.
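
To ground the information-bottleneck flavor of such an objective, the sketch below learns a soft edge mask for a toy differentiable graph model so that the masked graph preserves the model's regression output while the mask stays sparse. The toy one-layer model and the exact penalty are stand-in assumptions; the paper's objective, mix-up framework, and contrastive strategy are more involved.

```python
# IB-style explanation sketch for a regression model on a graph (toy stand-in
# for a GNN): find a sparse edge mask that preserves the predicted value.
import torch

n = 6
adj = (torch.rand(n, n) > 0.5).float().triu(1)   # random undirected edge set
x = torch.rand(n, 4)                             # node features
w = torch.rand(4, 1)

def model(a):
    """Toy differentiable 'GNN': one propagation step + mean readout."""
    a = a + a.T + torch.eye(n)
    h = a @ x @ w
    return h.mean()

full_pred = model(adj).detach()
logits = torch.zeros(n, n, requires_grad=True)   # mask parameters per edge
opt = torch.optim.Adam([logits], lr=0.05)

for step in range(300):
    mask = torch.sigmoid(logits) * adj           # soft mask on existing edges
    pred = model(mask)
    # Fidelity term (keep the regression output) + sparsity (IB-style) penalty.
    loss = (pred - full_pred) ** 2 + 0.01 * mask.sum()
    opt.zero_grad(); loss.backward(); opt.step()

important = (torch.sigmoid(logits) * adj) > 0.5
print("edges kept:", int(important.sum()), "of", int(adj.sum()))
```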