cs.LG - 2023-07-03

Sampling the lattice Nambu-Goto string using Continuous Normalizing Flows

  • paper_url: http://arxiv.org/abs/2307.01107
  • repo_url: https://github.com/turinlatticefieldtheorygroup/nambugotocnf
  • paper_authors: Michele Caselle, Elia Cellini, Alessandro Nada
  • for: describing confinement in Yang-Mills theory with Effective String Theory (EST), which models the confining flux tube as a thin vibrating string.
  • methods: a numerical machine learning approach based on Continuous Normalizing Flows (CNFs) to handle EST observables too complex for the usual zeta-function regularization.
  • results: using the Nambu-Goto string as a laboratory, the CNF approach yields reliable numerical estimates of EST predictions.
    Abstract Effective String Theory (EST) represents a powerful non-perturbative approach to describe confinement in Yang-Mills theory that models the confining flux tube as a thin vibrating string. EST calculations are usually performed using the zeta-function regularization: however there are situations (for instance the study of the shape of the flux tube or of the higher order corrections beyond the Nambu-Goto EST) which involve observables that are too complex to be addressed in this way. In this paper we propose a numerical approach based on recent advances in machine learning methods to circumvent this problem. Using as a laboratory the Nambu-Goto string, we show that by using a new class of deep generative models called Continuous Normalizing Flows it is possible to obtain reliable numerical estimates of EST predictions.
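As a rough sketch of the CNF machinery the paper builds on (the two-layer vector field, fixed-step Euler integrator, and Hutchinson divergence estimator below are generic placeholders, not the authors' architecture), a flow transports Gaussian base samples while tracking the change in log-density:

```python
import torch
import torch.nn as nn

class CNF(nn.Module):
    """Minimal continuous normalizing flow with fixed-step Euler integration."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, hidden), nn.Tanh(),
                                 nn.Linear(hidden, dim))

    def forward(self, x, n_steps=50):
        if not x.requires_grad:
            x = x.clone().requires_grad_(True)
        logdet = torch.zeros(x.shape[0], device=x.device)
        dt = 1.0 / n_steps
        for i in range(n_steps):
            t = torch.full((x.shape[0], 1), i * dt, device=x.device)
            v = self.net(torch.cat([x, t], dim=1))       # vector field v(x, t)
            eps = torch.randn_like(x)                    # Hutchinson probe
            (vjp,) = torch.autograd.grad(v, x, grad_outputs=eps,
                                         create_graph=True)
            logdet = logdet - dt * (vjp * eps).sum(dim=1)  # -div(v) * dt
            x = x + dt * v
        return x, logdet

flow = CNF(dim=8)
z = torch.randn(16, 8)           # base Gaussian samples
x, dlogp = flow(z)               # transported samples + log-density change
```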

Streamlined Lensed Quasar Identification in Multiband Images via Ensemble Networks

  • paper_url: http://arxiv.org/abs/2307.01090
  • repo_url: None
  • paper_authors: Irham Taufik Andika, Sherry H. Suyu, Raoul Cañameras, Alejandra Melo, Stefan Schuldt, Yiping Shu, Anna-Christina Eilers, Anton Timur Jaelani, Minghao Yue
  • for: finding quasars experiencing strong gravitational lensing, which offer unique probes of the cosmic expansion rate, the dark matter profile of foreground deflectors, and quasar host galaxies.
  • methods: ensembles of cutting-edge convolutional networks (CNNs) and vision transformers (ViTs) trained on realistic galaxy-quasar lens simulations based on Hyper Suprime-Cam (HSC) multiband images.
  • results: averaging the CNNs and ViTs reduces false positives by factors up to 50; combining HSC images with UKIRT, VISTA, and unWISE data yields roughly 60 million parent sources, of which the ensemble flags 3,080 as high-probability lenses, leaving 210 prevailing candidates after visual inspection that await spectroscopic confirmation.
    Abstract Quasars experiencing strong lensing offer unique viewpoints on subjects related to the cosmic expansion rate, the dark matter profile within the foreground deflectors, and the quasar host galaxies. Unfortunately, identifying them in astronomical images is challenging since they are overwhelmed by the abundance of non-lenses. To address this, we have developed a novel approach by ensembling cutting-edge convolutional networks (CNNs) -- for instance, ResNet, Inception, NASNet, MobileNet, EfficientNet, and RegNet -- along with vision transformers (ViTs) trained on realistic galaxy-quasar lens simulations based on the Hyper Suprime-Cam (HSC) multiband images. While the individual model exhibits remarkable performance when evaluated against the test dataset, achieving an area under the receiver operating characteristic curve of $>$97.3% and a median false positive rate of 3.6%, it struggles to generalize in real data, indicated by numerous spurious sources picked by each classifier. A significant improvement is achieved by averaging these CNNs and ViTs, resulting in the impurities being downsized by factors up to 50. Subsequently, combining the HSC images with the UKIRT, VISTA, and unWISE data, we retrieve approximately 60 million sources as parent samples and reduce this to 892,609 after employing a photometry preselection to discover $z>1.5$ lensed quasars with Einstein radii of $\theta_\mathrm{E}<5$ arcsec. Afterward, the ensemble classifier indicates 3080 sources with a high probability of being lenses, for which we visually inspect, yielding 210 prevailing candidates awaiting spectroscopic confirmation. These outcomes suggest that automated deep learning pipelines hold great potential in effectively detecting strong lenses in vast datasets with minimal manual visual inspection involved.
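The false-positive reduction step is plain probability averaging across the trained classifiers; a minimal sketch with made-up scores and threshold:

```python
import numpy as np

def ensemble_score(prob_list):
    """Average per-model lens probabilities; spurious sources rarely fool
    every model at once, so averaging suppresses false positives."""
    return np.mean(np.stack(prob_list), axis=0)

# hypothetical lens probabilities from three classifiers on four sources
probs = [np.array([0.95, 0.20, 0.80, 0.10]),
         np.array([0.90, 0.70, 0.15, 0.05]),
         np.array([0.85, 0.10, 0.20, 0.60])]
scores = ensemble_score(probs)
candidates = np.where(scores > 0.5)[0]   # only source 0 survives averaging
```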

Empirically Validating Conformal Prediction on Modern Vision Architectures Under Distribution Shift and Long-tailed Data

  • paper_url: http://arxiv.org/abs/2307.01088
  • repo_url: None
  • paper_authors: Kevin Kasa, Graham W. Taylor
  • for: providing deep learning models with reliable uncertainty estimates and safety guarantees.
  • methods: evaluating several post-hoc and training-based conformal prediction methods under distribution shift and long-tailed class distributions, on large-scale datasets and models.
  • results: across numerous conformal methods and neural network families, performance degrades greatly under distribution shift, and in long-tailed settings the guarantees are frequently violated on many classes.
    Abstract Conformal prediction has emerged as a rigorous means of providing deep learning models with reliable uncertainty estimates and safety guarantees. Yet, its performance is known to degrade under distribution shift and long-tailed class distributions, which are often present in real world applications. Here, we characterize the performance of several post-hoc and training-based conformal prediction methods under these settings, providing the first empirical evaluation on large-scale datasets and models. We show that across numerous conformal methods and neural network families, performance greatly degrades under distribution shifts violating safety guarantees. Similarly, we show that in long-tailed settings the guarantees are frequently violated on many classes. Understanding the limitations of these methods is necessary for deployment in real world and safety-critical applications.
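For reference, the split-conformal construction whose guarantee the paper stress-tests, in a minimal sketch assuming softmax outputs and the standard 1 - p(true class) nonconformity score:

```python
import numpy as np

def conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split conformal prediction sets. Coverage >= 1 - alpha holds under
    exchangeability, the assumption that distribution shift and long-tailed
    test data can break."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]    # nonconformity
    level = min(np.ceil((n + 1) * (1.0 - alpha)) / n, 1.0)
    qhat = np.quantile(scores, level, method="higher")
    return test_probs >= 1.0 - qhat    # (n_test, n_classes) membership mask

cal_p = np.random.dirichlet(np.ones(5), size=500)
cal_y = np.random.randint(0, 5, size=500)
sets = conformal_sets(cal_p, cal_y, np.random.dirichlet(np.ones(5), size=10))
```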

Supervised Manifold Learning via Random Forest Geometry-Preserving Proximities

  • paper_url: http://arxiv.org/abs/2307.01077
  • repo_url: None
  • paper_authors: Jake S. Rhodes
  • for: a new supervised manifold learning method for dimensionality reduction that preserves both local and global data structure.
  • methods: a data-geometry-preserving variant of random forest proximities used as an initialization (kernel) for diffusion-based and other manifold learning algorithms, replacing order-non-preserving class-conditional distances.
  • results: local structure preservation with these proximities is near universal across manifold learning approaches, and global structure is properly maintained using diffusion-based algorithms.
    Abstract Manifold learning approaches seek the intrinsic, low-dimensional data structure within a high-dimensional space. Mainstream manifold learning algorithms, such as Isomap, UMAP, $t$-SNE, Diffusion Map, and Laplacian Eigenmaps do not use data labels and are thus considered unsupervised. Existing supervised extensions of these methods are limited to classification problems and fall short of uncovering meaningful embeddings due to their construction using order non-preserving, class-conditional distances. In this paper, we show the weaknesses of class-conditional manifold learning quantitatively and visually and propose an alternate choice of kernel for supervised dimensionality reduction using a data-geometry-preserving variant of random forest proximities as an initialization for manifold learning methods. We show that local structure preservation using these proximities is near universal across manifold learning approaches and global structure is properly maintained using diffusion-based algorithms.
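The classic random-forest proximity (fraction of trees in which two points share a leaf), which the paper's geometry-preserving variant refines, can be computed in a few lines with scikit-learn:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rf_proximity(forest, X):
    """Proximity[i, j] = fraction of trees where i and j land in the same
    leaf. (The paper uses a geometry-preserving variant; this is the
    classical construction, shown for intuition.)"""
    leaves = forest.apply(X)                 # (n_samples, n_trees) leaf ids
    return (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

X = np.random.rand(100, 5)
y = (X[:, 0] > 0.5).astype(int)
rf = RandomForestClassifier(n_estimators=100).fit(X, y)
P = rf_proximity(rf, X)   # supervised kernel to seed UMAP / diffusion maps
```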

When Can Linear Learners be Robust to Indiscriminate Poisoning Attacks?

  • paper_url: http://arxiv.org/abs/2307.01073
  • repo_url: None
  • paper_authors: Fnu Suya, Xiao Zhang, Yuan Tian, David Evans
  • for: studying the robustness of linear learners to indiscriminate poisoning attacks, in which an adversary injects a few crafted training examples to raise the induced model's test error.
  • methods: motivated by datasets on which linear learners resist the best known attacks even without defenses, the paper rigorously characterizes the optimal poisoning attack (the strategy maximizing the induced model's risk at a given poisoning budget) for theoretical Gaussian distributions.
  • results: linear learners are provably robust when the class-wise data distributions are well separated with low variance and the constraint set of permissible poisoning points is small; this largely explains the drastic variation in empirical attack performance across benchmark datasets.
    Abstract We study indiscriminate poisoning for linear learners where an adversary injects a few crafted examples into the training data with the goal of forcing the induced model to incur higher test error. Inspired by the observation that linear learners on some datasets are able to resist the best known attacks even without any defenses, we further investigate whether datasets can be inherently robust to indiscriminate poisoning attacks for linear learners. For theoretical Gaussian distributions, we rigorously characterize the behavior of an optimal poisoning attack, defined as the poisoning strategy that attains the maximum risk of the induced model at a given poisoning budget. Our results prove that linear learners can indeed be robust to indiscriminate poisoning if the class-wise data distributions are well-separated with low variance and the size of the constraint set containing all permissible poisoning points is also small. These findings largely explain the drastic variation in empirical attack performance of the state-of-the-art poisoning attacks on linear learners across benchmark datasets, making an important initial step towards understanding the underlying reasons some learning tasks are vulnerable to data poisoning attacks.
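A toy illustration of the separation effect (synthetic Gaussians and a hand-picked poison point, not the paper's optimal attack): with well-separated, low-variance classes, injected poison barely moves a linear learner's error:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Two well-separated, low-variance Gaussian classes: the regime the paper
# proves robust. Widen the class variance to watch poisoning start to bite.
X = np.vstack([rng.normal(-2, 0.3, (200, 2)), rng.normal(2, 0.3, (200, 2))])
y = np.array([0] * 200 + [1] * 200)

def poisoned_error(n_poison, point=(0.0, 0.0), label=0):
    """Inject n_poison copies of a crafted point, refit, report clean error."""
    Xp = np.vstack([X, np.tile(point, (n_poison, 1))])
    yp = np.concatenate([y, np.full(n_poison, label)])
    clf = LogisticRegression().fit(Xp, yp)
    return 1.0 - clf.score(X, y)

for frac in (0.0, 0.03, 0.1):
    print(frac, round(poisoned_error(int(frac * len(y))), 3))
```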

PIGNet2: A Versatile Deep Learning-based Protein-Ligand Interaction Prediction Model for Binding Affinity Scoring and Virtual Screening

  • paper_url: http://arxiv.org/abs/2307.01066
  • repo_url: https://github.com/ace-kaist/pignet2
  • paper_authors: Seokhyun Moon, Sang-Yeon Hwang, Jaechang Lim, Woo Youn Kim
  • for: a versatile model that reliably predicts protein-ligand interactions (PLI) for both binding-affinity scoring and efficient virtual screening in drug discovery.
  • methods: a novel data augmentation strategy combined with a physics-informed graph neural network, addressing the scarcity of experimental structure-affinity data that limits existing models.
  • results: significant improvements in both scoring and screening, outperforming task-specific deep learning models on various tests including derivative benchmarks, and results comparable to the state of the art based on distance-likelihood learning.
    Abstract Prediction of protein-ligand interactions (PLI) plays a crucial role in drug discovery as it guides the identification and optimization of molecules that effectively bind to target proteins. Despite remarkable advances in deep learning-based PLI prediction, the development of a versatile model capable of accurately scoring binding affinity and conducting efficient virtual screening remains a challenge. The main obstacle in achieving this lies in the scarcity of experimental structure-affinity data, which limits the generalization ability of existing models. Here, we propose a viable solution to address this challenge by introducing a novel data augmentation strategy combined with a physics-informed graph neural network. The model showed significant improvements in both scoring and screening, outperforming task-specific deep learning models in various tests including derivative benchmarks, and notably achieving results comparable to the state-of-the-art performance based on distance likelihood learning. This demonstrates the potential of this approach to drug discovery.
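To make "physics-informed" concrete: such models typically predict parameters of closed-form pairwise interaction energies rather than free-form scores. A hypothetical Lennard-Jones-style term (functional form and constants are illustrative only, not PIGNet2's energy model):

```python
import torch

def lj_like_energy(d, sigma=3.5, eps=0.1):
    """Lennard-Jones-style pair energy as a function of atom-pair distance
    d (Angstrom); sigma and eps stand in for the per-pair parameters a
    physics-informed network would predict."""
    r = sigma / d.clamp(min=1e-6)
    return 4.0 * eps * (r ** 12 - r ** 6)

d = torch.tensor([3.2, 3.8, 4.5, 6.0])   # protein-ligand atom-pair distances
print(lj_like_energy(d).sum().item())    # crude affinity-like score
```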

ENGAGE: Explanation Guided Data Augmentation for Graph Representation Learning

  • paper_url: http://arxiv.org/abs/2307.01053
  • repo_url: https://github.com/sycny/engage
  • paper_authors: Yucheng Shi, Kaixiong Zhou, Ninghao Liu
  • for: explanation-guided data augmentation for graph representation learning that preserves the key characteristics of graph data while removing superfluous information.
  • methods: ENGAGE (ExplaNation Guided data AuGmEntation) uses an efficient unsupervised explanation method, the smoothed activation map, to score node importance, and applies two augmentation schemes that perturb structural and feature information respectively.
  • results: experiments on graph-level and node-level tasks, across model architectures and real-world graphs, demonstrate the effectiveness and flexibility of ENGAGE.
    Abstract The recent contrastive learning methods, due to their effectiveness in representation learning, have been widely applied to modeling graph data. Random perturbation is widely used to build contrastive views for graph data, which however, could accidentally break graph structures and lead to suboptimal performance. In addition, graph data is usually highly abstract, so it is hard to extract intuitive meanings and design more informed augmentation schemes. Effective representations should preserve key characteristics in data and abandon superfluous information. In this paper, we propose ENGAGE (ExplaNation Guided data AuGmEntation), where explanation guides the contrastive augmentation process to preserve the key parts in graphs and explore removing superfluous information. Specifically, we design an efficient unsupervised explanation method called smoothed activation map as the indicator of node importance in representation learning. Then, we design two data augmentation schemes on graphs for perturbing structural and feature information, respectively. We also provide justification for the proposed method in the framework of information theories. Experiments of both graph-level and node-level tasks, on various model architectures and on different real-world graphs, are conducted to demonstrate the effectiveness and flexibility of ENGAGE. The code of ENGAGE can be found: https://github.com/sycny/ENGAGE.
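A simplified sketch of the recipe (graph-smoothed importance scores plus importance-aware edge dropping); the smoothing and selection rules below are stand-ins for the paper's smoothed activation map and augmentation schemes:

```python
import torch

def node_importance(node_emb, adj, hops=2):
    """Activation-magnitude importance smoothed over the graph, in the
    spirit of a smoothed activation map (simplified stand-in)."""
    score = node_emb.norm(dim=1)
    deg = adj.sum(dim=1).clamp(min=1.0)
    for _ in range(hops):
        score = adj @ score / deg          # average over neighbours
    return score

def drop_unimportant_edges(edge_index, importance, drop_frac=0.2):
    """Structural augmentation: remove edges between low-importance nodes,
    perturbing superfluous structure while keeping key parts."""
    e_score = importance[edge_index[0]] + importance[edge_index[1]]
    n_keep = int((1.0 - drop_frac) * e_score.numel())
    keep = e_score.argsort(descending=True)[:n_keep]
    return edge_index[:, keep]

adj = (torch.rand(10, 10) > 0.7).float()
emb = torch.randn(10, 16)
edges = adj.nonzero().t()                  # (2, n_edges)
aug_edges = drop_unimportant_edges(edges, node_importance(emb, adj))
```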

Transport, Variational Inference and Diffusions: with Applications to Annealed Flows and Schrödinger Bridges

  • paper_url: http://arxiv.org/abs/2307.01050
  • repo_url: None
  • paper_authors: Francisco Vargas, Nikolas Nüsken
  • for: exploring the connections between optimal transport and variational inference, with a focus on forward- and reverse-time stochastic differential equations and Girsanov transformations.
  • methods: a principled, systematic framework for sampling and generative modelling centred on divergences on path space, yielding a novel score-based annealed flow technique (with connections to the Jarzynski and Crooks identities from statistical physics) and a regularised iterative-proportional-fitting (IPF)-type objective that departs from the sequential nature of standard IPF.
  • results: the potential of the proposed methods is demonstrated on a series of generative modelling examples and a double-well-based rare-event task.
    Abstract This paper explores the connections between optimal transport and variational inference, with a focus on forward and reverse time stochastic differential equations and Girsanov transformations.We present a principled and systematic framework for sampling and generative modelling centred around divergences on path space. Our work culminates in the development of a novel score-based annealed flow technique (with connections to Jarzynski and Crooks identities from statistical physics) and a regularised iterative proportional fitting (IPF)-type objective, departing from the sequential nature of standard IPF. Through a series of generative modelling examples and a double-well-based rare event task, we showcase the potential of the proposed methods.
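For orientation, the path-space divergence such objectives build on: by Girsanov's theorem, for two diffusions with the same initial law and shared diffusion coefficient $\sigma$, with drift $a$ under $\mathbb{P}$ and drift $b$ under $\mathbb{Q}$,

```latex
\mathrm{KL}\bigl(\mathbb{P}\,\|\,\mathbb{Q}\bigr)
  \;=\; \mathbb{E}_{\mathbb{P}}\!\left[\,\frac{1}{2\sigma^{2}}
  \int_{0}^{T} \bigl\| a(X_t,t) - b(X_t,t) \bigr\|^{2}\,\mathrm{d}t \right]
```

so matching the two drifts along paths sampled from $\mathbb{P}$ minimizes the divergence, which is what score-based and IPF-style path-space objectives exploit.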

Vector Quantile Regression on Manifolds

  • paper_url: http://arxiv.org/abs/2307.01037
  • repo_url: None
  • paper_authors: Marco Pegoraro, Sanketh Vedula, Aviv A. Rosenberg, Irene Tallini, Emanuele Rodolà, Alex M. Bronstein
  • for: extending quantile regression (QR) beyond univariate targets on Euclidean domains to multivariate distributions supported on manifolds.
  • methods: optimal transport theory and c-concave functions are used to define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs), enabling quantile estimation, regression, and computation of conditional confidence sets.
  • results: preliminary synthetic-data experiments demonstrate the approach's efficacy and provide insight into the meaning of non-Euclidean quantiles.
    Abstract Quantile regression (QR) is a statistical tool for distribution-free estimation of conditional quantiles of a target variable given explanatory features. QR is limited by the assumption that the target distribution is univariate and defined on an Euclidean domain. Although the notion of quantiles was recently extended to multi-variate distributions, QR for multi-variate distributions on manifolds remains underexplored, even though many important applications inherently involve data distributed on, e.g., spheres (climate measurements), tori (dihedral angles in proteins), or Lie groups (attitude in navigation). By leveraging optimal transport theory and the notion of $c$-concave functions, we meaningfully define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs). Our approach allows for quantile estimation, regression, and computation of conditional confidence sets. We demonstrate the approach's efficacy and provide insights regarding the meaning of non-Euclidean quantiles through preliminary synthetic data experiments.

Temporal Graph Benchmark for Machine Learning on Temporal Graphs

  • paper_url: http://arxiv.org/abs/2307.01026
  • repo_url: https://github.com/shenyanghuang/tgb
  • paper_authors: Shenyang Huang, Farimah Poursafaei, Jacob Danovitch, Matthias Fey, Weihua Hu, Emanuele Rossi, Jure Leskovec, Michael Bronstein, Guillaume Rabusseau, Reihaneh Rabbany
  • for: a realistic, reproducible, and robust benchmark for evaluating machine learning models on temporal graphs, to drive progress in temporal graph research.
  • methods: large-scale datasets spanning years, covering node- and edge-level prediction tasks across social, trade, transaction, and transportation networks, with evaluation protocols designed around realistic use cases; common temporal graph models are benchmarked extensively.
  • results: model performance varies drastically across datasets, and on dynamic node property prediction simple methods often outperform existing temporal graph models; TGB also ships an automated pipeline for data loading, experiment setup, and evaluation, opening opportunities for future research.
    Abstract We present the Temporal Graph Benchmark (TGB), a collection of challenging and diverse benchmark datasets for realistic, reproducible, and robust evaluation of machine learning models on temporal graphs. TGB datasets are of large scale, spanning years in duration, incorporate both node and edge-level prediction tasks and cover a diverse set of domains including social, trade, transaction, and transportation networks. For both tasks, we design evaluation protocols based on realistic use-cases. We extensively benchmark each dataset and find that the performance of common models can vary drastically across datasets. In addition, on dynamic node property prediction tasks, we show that simple methods often achieve superior performance compared to existing temporal graph models. We believe that these findings open up opportunities for future research on temporal graphs. Finally, TGB provides an automated machine learning pipeline for reproducible and accessible temporal graph research, including data loading, experiment setup and performance evaluation. TGB will be maintained and updated on a regular basis and welcomes community feedback. TGB datasets, data loaders, example codes, evaluation setup, and leaderboards are publicly available at https://tgb.complexdatalab.com/ .
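One "simple method" of the kind the benchmark shows can rival learned models on dynamic link prediction is pure edge memorization (an EdgeBank-style baseline); a minimal sketch:

```python
class EdgeBank:
    """Memorization baseline: score 1 for any (src, dst) pair observed in
    the past, 0 otherwise. No learning, yet hard to beat on some datasets."""
    def __init__(self):
        self.seen = set()

    def update(self, src, dst):
        self.seen.add((src, dst))

    def predict(self, src, dst):
        return 1.0 if (src, dst) in self.seen else 0.0

bank = EdgeBank()
for s, d, _t in [(0, 1, 10), (1, 2, 11)]:        # (src, dst, timestamp) stream
    bank.update(s, d)
print(bank.predict(0, 1), bank.predict(2, 0))    # 1.0 0.0
```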

  • paper_url: http://arxiv.org/abs/2307.01023
  • repo_url: None
  • paper_authors: C. Coelho, M. Fernanda P. Costa, L. L. Ferrás
  • for: forecasting the chronology of a system both forward and backward in time.
  • methods: deep models that fit continuous-time ODE dynamics: Neural CODE, trained as both an initial-value and a final-value problem, plus recurrent variants CODE-RNN, CODE-BiRNN, CODE-GRU, CODE-BiGRU, CODE-LSTM, and CODE-BiLSTM.
  • results: Neural CODE outperforms Neural ODE in learning spiral dynamics forward and backward in time, even with sparser data; the proposed architectures converge faster, with CODE-BiRNN/-BiGRU/-BiLSTM consistently best on three real-life time-series tasks (imputation of missing data and forward/backward extrapolation over shorter and longer horizons).
    Abstract This work introduces Neural Chronos Ordinary Differential Equations (Neural CODE), a deep neural network architecture that fits a continuous-time ODE dynamics for predicting the chronology of a system both forward and backward in time. To train the model, we solve the ODE as an initial value problem and a final value problem, similar to Neural ODEs. We also explore two approaches to combining Neural CODE with Recurrent Neural Networks by replacing Neural ODE with Neural CODE (CODE-RNN), and incorporating a bidirectional RNN for full information flow in both time directions (CODE-BiRNN), and variants with other update cells namely GRU and LSTM: CODE-GRU, CODE-BiGRU, CODE-LSTM, CODE-BiLSTM. Experimental results demonstrate that Neural CODE outperforms Neural ODE in learning the dynamics of a spiral forward and backward in time, even with sparser data. We also compare the performance of CODE-RNN/-GRU/-LSTM and CODE-BiRNN/-BiGRU/-BiLSTM against ODE-RNN/-GRU/-LSTM on three real-life time series data tasks: imputation of missing data for lower and higher dimensional data, and forward and backward extrapolation with shorter and longer time horizons. Our findings show that the proposed architectures converge faster, with CODE-BiRNN/-BiGRU/-BiLSTM consistently outperforming the other architectures on all tasks.
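The forward-and-backward idea reduces to integrating one learned vector field over positive and negative time spans; a fixed-step Euler sketch (not the authors' solver, losses, or recurrent variants):

```python
import torch
import torch.nn as nn

class ODEField(nn.Module):
    def __init__(self, dim, hidden=32):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                               nn.Linear(hidden, dim))

    def integrate(self, x0, t0, t1, n_steps=100):
        """Euler roll-out of dx/dt = f(x); t1 < t0 integrates backward,
        the direction Neural CODE fits as a final-value problem."""
        dt = (t1 - t0) / n_steps
        x = x0
        for _ in range(n_steps):
            x = x + dt * self.f(x)
        return x

field = ODEField(dim=2)
x1 = field.integrate(torch.randn(8, 2), t0=0.0, t1=1.0)   # forward
x0 = field.integrate(x1, t0=1.0, t1=0.0)                  # backward
```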

Joint Coordinate Regression and Association For Multi-Person Pose Estimation, A Pure Neural Network Approach

  • paper_url: http://arxiv.org/abs/2307.01004
  • repo_url: None
  • paper_authors: Dongyang Yu, Yunshi Xie, Wangpeng An, Li Zhang, Yufeng Yao
  • for: a one-stage, end-to-end multi-person 2D pose estimation algorithm (Joint Coordinate Regression and Association, JCRA) that requires no post-processing.
  • methods: a one-stage end-to-end network that outputs joint coordinates directly via a transformer, with a symmetric encoder-decoder structure that ensures high keypoint accuracy.
  • results: on the MS COCO and CrowdPose benchmarks JCRA outperforms state-of-the-art approaches in both accuracy and efficiency, reaching 69.2 mAP with 78% faster inference than previous state-of-the-art bottom-up algorithms.
    Abstract We introduce a novel one-stage end-to-end multi-person 2D pose estimation algorithm, known as Joint Coordinate Regression and Association (JCRA), that produces human pose joints and associations without requiring any post-processing. The proposed algorithm is fast, accurate, effective, and simple. The one-stage end-to-end network architecture significantly improves the inference speed of JCRA. Meanwhile, we devised a symmetric network structure for both the encoder and decoder, which ensures high accuracy in identifying keypoints. It follows an architecture that directly outputs part positions via a transformer network, resulting in a significant improvement in performance. Extensive experiments on the MS COCO and CrowdPose benchmarks demonstrate that JCRA outperforms state-of-the-art approaches in both accuracy and efficiency. Moreover, JCRA demonstrates 69.2 mAP and is 78\% faster at inference acceleration than previous state-of-the-art bottom-up algorithms. The code for this algorithm will be publicly available.
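The direct coordinate regression output can be pictured as a per-joint query head emitting coordinates and a confidence instead of heatmaps; a schematic stand-in whose dimensions and activations are guesses, not the paper's architecture:

```python
import torch
import torch.nn as nn

class CoordHead(nn.Module):
    """Per-joint (x, y, score) regression from transformer query features,
    so no heatmap decoding or grouping post-processing is needed."""
    def __init__(self, d_model=256):
        super().__init__()
        self.proj = nn.Linear(d_model, 3)

    def forward(self, queries):               # (batch, n_joints, d_model)
        out = self.proj(queries)
        xy = out[..., :2].sigmoid()           # normalized image coordinates
        conf = out[..., 2].sigmoid()          # keypoint confidence
        return xy, conf

head = CoordHead()
xy, conf = head(torch.randn(2, 17, 256))      # 17 COCO-style joints
```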

Capafoldable: self-tracking foldable smart textiles with capacitive sensing

  • paper_url: http://arxiv.org/abs/2307.05370
  • repo_url: None
  • paper_authors: Lala Shakti Swarup Ray, Daniel Geißler, Bo Zhou, Paul Lukowicz, Berit Greinke
  • for: a smart textile that tracks its own structural motions.
  • methods: combining folded fabric structures with capacitive sensing, using state-of-the-art sensing circuits and deep learning; two folding patterns (Accordion and Chevron), each with two layouts of thermobonded conductive textile patches.
  • results: the geometry primitives defining patch shape are reconstructed from the capacitive signals with an R-squared of up to 95% and a tracking error of 1 cm for 22.5 cm long patches, enabling a new range of smart textile applications.
    Abstract Folding is an unique structural technique to enable planer materials with motion or 3D mechanical properties. Textile-based capacitive sensing has shown to be sensitive to the geometry deformation and relative motion of conductive textiles. In this work, we propose a novel self-tracking foldable smart textile by combining folded fabric structures and capacitive sensing to detect the structural motions using state-of-the-art sensing circuits and deep learning technologies. We created two folding patterns, Accordion and Chevron, each with two layouts of capacitive sensors in the form of thermobonded conductive textile patches. In an experiment of manually moving patches of the folding patterns, we developed deep neural network to learn and reconstruct the vision-tracked shape of the patches. Through our approach, the geometry primitives defining the patch shape can be reconstructed from the capacitive signals with R-squared value of up to 95\% and tracking error of 1cm for 22.5cm long patches. With mechanical, electrical and sensing properties, Capafoldable could enable a new range of smart textile applications.
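The learning task reduces to regressing geometry parameters from windows of capacitance readings; a stand-in decoder whose sizes are invented for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical shape decoder: map a window of capacitance readings from a
# patch's sensors to the geometry primitives defining its folded shape.
# All sizes (2 sensors, 50 timesteps, 6 shape parameters) are invented.
decoder = nn.Sequential(nn.Linear(2 * 50, 64), nn.ReLU(), nn.Linear(64, 6))
cap_window = torch.randn(4, 2 * 50)   # batch of 4 capacitance windows
geometry = decoder(cap_window)        # e.g., fold angles / vertex offsets
```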

Pareto optimal proxy metrics

  • paper_url: http://arxiv.org/abs/2307.01000
  • repo_url: None
  • paper_authors: Lee Richardson, Alessandro Zito, Dylan Greaves, Jacopo Soriano
  • for: improving products through online experimentation when the north star metric is insensitive or its short- and long-term movements disagree.
  • methods: proxy metrics selected by jointly optimizing prediction of the long-term impact and short-term sensitivity (Pareto optimal proxy metrics), with an efficient multi-objective optimization algorithm that outperforms standard methods.
  • results: on experiments from a large industrial recommendation system, the proxy metrics were eight times more sensitive than the north star and consistently moved in the same direction, increasing the velocity and quality of launch decisions.
    Abstract North star metrics and online experimentation play a central role in how technology companies improve their products. In many practical settings, however, evaluating experiments based on the north star metric directly can be difficult. The two most significant issues are 1) low sensitivity of the north star metric and 2) differences between the short-term and long-term impact on the north star metric. A common solution is to rely on proxy metrics rather than the north star in experiment evaluation and launch decisions. Existing literature on proxy metrics concentrates mainly on the estimation of the long-term impact from short-term experimental data. In this paper, instead, we focus on the trade-off between the estimation of the long-term impact and the sensitivity in the short term. In particular, we propose the Pareto optimal proxy metrics method, which simultaneously optimizes prediction accuracy and sensitivity. In addition, we give an efficient multi-objective optimization algorithm that outperforms standard methods. We applied our methodology to experiments from a large industrial recommendation system, and found proxy metrics that are eight times more sensitive than the north star and consistently moved in the same direction, increasing the velocity and the quality of the decisions to launch new features.
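The Pareto-optimality criterion is simple to state in code: keep every candidate proxy that is not dominated on (long-term prediction accuracy, short-term sensitivity). A toy filter over made-up metric values:

```python
def pareto_front(candidates):
    """Keep candidates not dominated on both objectives (maximized)."""
    return [p for p in candidates
            if not any(q[0] >= p[0] and q[1] >= p[1] and q != p
                       for q in candidates)]

# (prediction accuracy, sensitivity) for hypothetical proxy metrics
proxies = [(0.90, 0.20), (0.70, 0.60), (0.80, 0.50), (0.60, 0.40)]
print(pareto_front(proxies))   # the dominated (0.60, 0.40) is dropped
```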

Environmental effects on emergent strategy in micro-scale multi-agent reinforcement learning

  • paper_url: http://arxiv.org/abs/2307.00994
  • repo_url: https://github.com/swarmrl/swarmrl
  • paper_authors: Samuel Tovey, David Zimmer, Christoph Lohrmann, Tobias Merkt, Simon Koppenhoefer, Veit-Lorenz Heuthe, Clemens Bechinger, Christian Holm
  • for: This paper explores the role of temperature in the emergence and efficacy of strategies in MARL systems using particle-based Langevin molecular dynamics simulations.
  • methods: The paper uses particle-based Langevin molecular dynamics simulations as a realistic representation of micro-scale environments, and introduces a novel Python package for studying microscopic agents using reinforcement learning.
  • results: The paper finds that at higher temperatures, the RL agents identify new strategies for achieving tasks, highlighting the importance of understanding this regime and providing insight into optimal training strategies for bridging the generalization gap between simulation and reality.
    Abstract Multi-Agent Reinforcement Learning (MARL) is a promising candidate for realizing efficient control of microscopic particles, of which micro-robots are a subset. However, the microscopic particles' environment presents unique challenges, such as Brownian motion at sufficiently small length-scales. In this work, we explore the role of temperature in the emergence and efficacy of strategies in MARL systems using particle-based Langevin molecular dynamics simulations as a realistic representation of micro-scale environments. To this end, we perform experiments on two different multi-agent tasks in microscopic environments at different temperatures, detecting the source of a concentration gradient and rotation of a rod. We find that at higher temperatures, the RL agents identify new strategies for achieving these tasks, highlighting the importance of understanding this regime and providing insight into optimal training strategies for bridging the generalization gap between simulation and reality. We also introduce a novel Python package for studying microscopic agents using reinforcement learning (RL) to accompany our results.
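Temperature enters through the standard overdamped Langevin update, where kT scales the Brownian noise the agents must act against; a minimal integrator sketch (not the SwarmRL simulation itself):

```python
import numpy as np

def langevin_step(x, force, dt=1e-3, gamma=1.0, kT=1.0,
                  rng=np.random.default_rng()):
    """Overdamped Langevin step: drift from the force plus thermal noise
    of magnitude sqrt(2 kT dt / gamma); raising kT strengthens the
    Brownian motion the RL agents must overcome."""
    noise = np.sqrt(2.0 * kT * dt / gamma) * rng.standard_normal(x.shape)
    return x + (dt / gamma) * force(x) + noise

x = np.zeros((5, 2))                                    # five agents in 2D
for _ in range(100):
    x = langevin_step(x, force=lambda p: -p, kT=2.0)    # harmonic trap
```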

Over-The-Air Federated Learning: Status Quo, Open Challenges, and Future Directions

  • paper_url: http://arxiv.org/abs/2307.00974
  • repo_url: None
  • paper_authors: Bingnan Xiao, Xichen Yu, Wei Ni, Xin Wang, H. Vincent Poor
  • for: a holistic review of over-the-air federated learning (OTA-FL) for AI applications over wireless networks, pointing to future research directions.
  • methods: OTA-FL exploits the superposition property of multi-access channels (MACs) so that users at the network edge share spectrum resources, achieving efficient, low-latency global model aggregation.
  • results: progress is classified by system setting (single-antenna OTA-FL, multi-antenna OTA-FL, and OTA-FL aided by emerging reconfigurable intelligent surfaces, RIS) and summarized; trust, security, and privacy aspects are discussed, along with open challenges such as model distortion under channel fading and ineffective aggregation of models trained on substantially unbalanced data.
    Abstract The development of applications based on artificial intelligence and implemented over wireless networks is increasingly rapidly and is expected to grow dramatically in the future. The resulting demand for the aggregation of large amounts of data has caused serious communication bottlenecks in wireless networks and particularly at the network edge. Over-the-air federated learning (OTA-FL), leveraging the superposition feature of multi-access channels (MACs), enables users at the network edge to share spectrum resources and achieves efficient and low-latency global model aggregation. This paper provides a holistic review of progress in OTA-FL and points to potential future research directions. Specifically, we classify OTA-FL from the perspective of system settings, including single-antenna OTA-FL, multi-antenna OTA-FL, and OTA-FL with the aid of the emerging reconfigurable intelligent surface (RIS) technology, and the contributions of existing works in these areas are summarized. Moreover, we discuss the trust, security and privacy aspects of OTA-FL, and highlight concerns arising from security and privacy. Finally, challenges and potential research directions are discussed to promote the future development of OTA-FL in terms of improving system performance, reliability, and trustworthiness. Specifical challenges to be addressed include model distortion under channel fading, the ineffective OTA aggregation of local models trained on substantially unbalanced data, and the limited accessibility and verifiability of individual local models.
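The superposition property means the channel itself performs the sum: users transmit model updates in analog simultaneously and the server receives the noisy total in one channel use. A sketch with an idealized additive-noise channel (no fading or power control, which the survey treats as open issues):

```python
import numpy as np

def ota_aggregate(local_models, noise_std=0.01, rng=np.random.default_rng(0)):
    """Over-the-air aggregation: the multi-access channel adds the
    simultaneously transmitted model vectors, so the server receives
    sum + channel noise in a single channel use and rescales."""
    superposed = np.sum(local_models, axis=0)        # done by the channel
    received = superposed + rng.normal(0.0, noise_std, superposed.shape)
    return received / len(local_models)

models = [np.random.rand(1000) for _ in range(8)]    # 8 edge users
global_model = ota_aggregate(models)
```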

MoVie: Visual Model-Based Policy Adaptation for View Generalization

  • paper_url: http://arxiv.org/abs/2307.00972
  • repo_url: https://github.com/yangsizhe/MoVie
  • paper_authors: Sizhe Yang, Yanjie Ze, Huazhe Xu
  • for: addressing the view generalization problem faced by visual reinforcement learning (RL) agents trained on limited views.
  • methods: a simple yet effective test-time adaptation of visual model-based policies for view generalization (MoVie), requiring no explicit reward signal and no modification at training time.
  • results: substantial advancements across four scenarios covering 18 tasks from DMControl, xArm, and Adroit, with relative improvements of 33%, 86%, and 152% respectively, highlighting the approach's potential for real-world robotics.
    Abstract Visual Reinforcement Learning (RL) agents trained on limited views face significant challenges in generalizing their learned abilities to unseen views. This inherent difficulty is known as the problem of $\textit{view generalization}$. In this work, we systematically categorize this fundamental problem into four distinct and highly challenging scenarios that closely resemble real-world situations. Subsequently, we propose a straightforward yet effective approach to enable successful adaptation of visual $\textbf{Mo}$del-based policies for $\textbf{Vie}$w generalization ($\textbf{MoVie}$) during test time, without any need for explicit reward signals and any modification during training time. Our method demonstrates substantial advancements across all four scenarios encompassing a total of $\textbf{18}$ tasks sourced from DMControl, xArm, and Adroit, with a relative improvement of $\mathbf{33}$%, $\mathbf{86}$%, and $\mathbf{152}$% respectively. The superior results highlight the immense potential of our approach for real-world robotics applications. Videos are available at https://yangsizhe.github.io/MoVie/ .
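Test-time policy adaptation of this flavor can be sketched as updating only the visual encoder with a reward-free dynamics-consistency loss; the modules and loss below are illustrative placeholders, not the authors' objective:

```python
import torch
import torch.nn as nn

enc = nn.Linear(8, 4)                  # stand-in visual encoder
dyn = nn.Linear(4 + 2, 4)              # frozen latent dynamics model
opt = torch.optim.Adam(enc.parameters(), lr=1e-4)

def adapt_step(obs, act, next_obs):
    """One reward-free test-time step: make the dynamics prediction
    consistent with the encoding of the next observation, updating only
    the encoder so the policy head is left untouched."""
    z_next_pred = dyn(torch.cat([enc(obs), act], dim=-1))
    loss = ((z_next_pred - enc(next_obs).detach()) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

adapt_step(torch.randn(32, 8), torch.randn(32, 2), torch.randn(32, 8))
```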

REAL: A Representative Error-Driven Approach for Active Learning

  • paper_url: http://arxiv.org/abs/2307.00968
  • repo_url: https://github.com/withchencheng/ecml_pkdd_23_real
  • paper_authors: Cheng Chen, Yong Wang, Lizi Liao, Yueguo Chen, Xiaoyong Du
  • for: selecting the most informative unlabeled instances in active learning via Representative Errors for Active Learning (REAL).
  • methods: rather than relying on uncertainty and diversity alone, REAL identifies minority predictions within each cluster as pseudo errors and allocates an adaptive sampling budget per cluster based on estimated error density.
  • results: on five text classification datasets, REAL consistently outperforms all best-performing baselines in accuracy and F1-macro across a wide range of hyperparameter settings, and the selected pseudo errors match the distribution of ground-truth errors along the decision boundary.
    Abstract Given a limited labeling budget, active learning (AL) aims to sample the most informative instances from an unlabeled pool to acquire labels for subsequent model training. To achieve this, AL typically measures the informativeness of unlabeled instances based on uncertainty and diversity. However, it does not consider erroneous instances with their neighborhood error density, which have great potential to improve the model performance. To address this limitation, we propose $REAL$, a novel approach to select data instances with $\underline{R}$epresentative $\underline{E}$rrors for $\underline{A}$ctive $\underline{L}$earning. It identifies minority predictions as \emph{pseudo errors} within a cluster and allocates an adaptive sampling budget for the cluster based on estimated error density. Extensive experiments on five text classification datasets demonstrate that $REAL$ consistently outperforms all best-performing baselines regarding accuracy and F1-macro scores across a wide range of hyperparameter settings. Our analysis also shows that $REAL$ selects the most representative pseudo errors that match the distribution of ground-truth errors along the decision boundary. Our code is publicly available at https://github.com/withchencheng/ECML_PKDD_23_Real.
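The selection step can be sketched directly: cluster the unlabeled pool, flag minority predictions inside each cluster as pseudo errors, and spend the budget where they are dense. A simplified version (the clustering choice and selection rule here are assumptions, not the paper's exact procedure):

```python
import numpy as np
from sklearn.cluster import KMeans

def pseudo_error_candidates(embeddings, preds, n_clusters=10, seed=0):
    """Within each cluster, predictions disagreeing with the cluster's
    majority label are flagged as pseudo errors for labeling."""
    cl = KMeans(n_clusters=n_clusters, n_init=10,
                random_state=seed).fit_predict(embeddings)
    picked = []
    for c in range(n_clusters):
        idx = np.where(cl == c)[0]
        if idx.size == 0:
            continue
        majority = np.bincount(preds[idx]).argmax()
        picked.extend(idx[preds[idx] != majority].tolist())
    return picked

emb = np.random.rand(200, 32)
preds = np.random.randint(0, 3, size=200)   # model's current predictions
print(len(pseudo_error_candidates(emb, preds)))
```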

OpenClinicalAI: An Open and Dynamic Model for Alzheimer’s Disease Diagnosis

  • paper_url: http://arxiv.org/abs/2307.00965
  • repo_url: None
  • paper_authors: Yunyou Huang, Xiaoshuang Liang, Xiangjiang Lu, Xiuxia Miao, Jiyue Xie, Wenjing Liu, Fan Zhang, Guoxin Kang, Li Ma, Suqin Tang, Zhifei Zhang, Jianfeng Zhan
  • for: an Alzheimer's disease diagnosis system that works in open, uncertain real-world clinical settings, improving diagnostic efficiency and accuracy within current health care systems.
  • methods: OpenClinicalAI couples reciprocally coupled deep multiaction reinforcement learning (DMARL) for dynamically formulating diagnostic strategies with multicenter meta-learning (MCML) for open-set recognition, adapting to each subject's circumstances and the available medical resources.
  • results: better performance with fewer clinical examinations than the state-of-the-art model.
    Abstract Although Alzheimer's disease (AD) cannot be reversed or cured, timely diagnosis can significantly reduce the burden of treatment and care. Current research on AD diagnosis models usually regards the diagnosis task as a typical classification task with two primary assumptions: 1) All target categories are known a priori; 2) The diagnostic strategy for each patient is consistent, that is, the number and type of model input data for each patient are the same. However, real-world clinical settings are open, with complexity and uncertainty in terms of both subjects and the resources of the medical institutions. This means that diagnostic models may encounter unseen disease categories and need to dynamically develop diagnostic strategies based on the subject's specific circumstances and available medical resources. Thus, the AD diagnosis task is tangled and coupled with the diagnosis strategy formulation. To promote the application of diagnostic systems in real-world clinical settings, we propose OpenClinicalAI for direct AD diagnosis in complex and uncertain clinical settings. This is the first powerful end-to-end model to dynamically formulate diagnostic strategies and provide diagnostic results based on the subject's conditions and available medical resources. OpenClinicalAI combines reciprocally coupled deep multiaction reinforcement learning (DMARL) for diagnostic strategy formulation and multicenter meta-learning (MCML) for open-set recognition. The experimental results show that OpenClinicalAI achieves better performance and fewer clinical examinations than the state-of-the-art model. Our method provides an opportunity to embed the AD diagnostic system into the current health care system to cooperate with clinicians to improve current health care.

A Dual Stealthy Backdoor: From Both Spatial and Frequency Perspectives

  • paper_url: http://arxiv.org/abs/2307.10184
  • repo_url: None
  • paper_authors: Yudong Gao, Honglong Chen, Peng Sun, Junjian Li, Anqing Zhang, Zhibo Wang
  • for: studying backdoor attacks on deep neural networks (DNNs) whose triggers stay invisible in both the spatial and frequency domains.
  • methods: the Discrete Wavelet Transform embeds the trigger's high-frequency information into clean images, and the Fourier and Discrete Cosine Transforms mix poisoned and clean images in the frequency domain; the model is trained with weak triggers and attacked with strong ones.
  • results: extensive evaluation against popular image classifiers on four datasets shows higher attack success rates and stealthiness than state-of-the-art backdoor attacks.
    Abstract Backdoor attacks pose serious security threats to deep neural networks (DNNs). Backdoored models make arbitrarily (targeted) incorrect predictions on inputs embedded with well-designed triggers while behaving normally on clean inputs. Many works have explored the invisibility of backdoor triggers to improve attack stealthiness. However, most of them only consider the invisibility in the spatial domain without explicitly accounting for the generation of invisible triggers in the frequency domain, making the generated poisoned images be easily detected by recent defense methods. To address this issue, in this paper, we propose a DUal stealthy BAckdoor attack method named DUBA, which simultaneously considers the invisibility of triggers in both the spatial and frequency domains, to achieve desirable attack performance, while ensuring strong stealthiness. Specifically, we first use Discrete Wavelet Transform to embed the high-frequency information of the trigger image into the clean image to ensure attack effectiveness. Then, to attain strong stealthiness, we incorporate Fourier Transform and Discrete Cosine Transform to mix the poisoned image and clean image in the frequency domain. Moreover, the proposed DUBA adopts a novel attack strategy, in which the model is trained with weak triggers and attacked with strong triggers to further enhance the attack performance and stealthiness. We extensively evaluate DUBA against popular image classifiers on four datasets. The results demonstrate that it significantly outperforms the state-of-the-art backdoor attacks in terms of the attack success rate and stealthiness
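The wavelet step is easy to picture: blend the trigger only into the high-frequency detail sub-bands and leave the low-frequency approximation untouched. A sketch with PyWavelets (the wavelet and blend factor are illustrative; the full attack additionally mixes in the Fourier and DCT domains):

```python
import numpy as np
import pywt

def embed_high_freq_trigger(clean, trigger, alpha=0.2):
    """Blend the trigger's high-frequency DWT detail coefficients into the
    clean image; the low-frequency approximation (cA) is left untouched."""
    cA, (cH, cV, cD) = pywt.dwt2(clean, "haar")
    _, (tH, tV, tD) = pywt.dwt2(trigger, "haar")
    mix = lambda c, t: (1.0 - alpha) * c + alpha * t
    return pywt.idwt2((cA, (mix(cH, tH), mix(cV, tV), mix(cD, tD))), "haar")

clean = np.random.rand(32, 32)
trigger = np.random.rand(32, 32)
poisoned = embed_high_freq_trigger(clean, trigger)
```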

  • paper_url: http://arxiv.org/abs/2307.00960
  • repo_url: None
  • paper_authors: Simone Sarti, Eugenio Lomurno, Matteo Matteucci
  • for: making Neural Architecture Search (NAS) more efficient in execution time and computational resources when designing task-optimal networks.
  • methods: builds on the Once-For-All (OFA) and Once-For-All-2 (OFAv2) super-network approaches and extends Neural Architecture Transfer (NAT) for extracting sub-networks, adding new initialisation, pre-processing, and archive-update policies plus a fine-tuning-based post-processing pipeline.
  • results: NATv2 successfully improves on NAT, yielding higher-quality extractable sub-networks when multi-objective search is applied to dynamic super-network architectures, and is recommended for investigating high-performance architectures with a minimal number of parameters.
    Abstract Deep learning is increasingly impacting various aspects of contemporary society. Artificial neural networks have emerged as the dominant models for solving an expanding range of tasks. The introduction of Neural Architecture Search (NAS) techniques, which enable the automatic design of task-optimal networks, has led to remarkable advances. However, the NAS process is typically associated with long execution times and significant computational resource requirements. Once-For-All (OFA) and its successor, Once-For-All-2 (OFAv2), have been developed to mitigate these challenges. While maintaining exceptional performance and eliminating the need for retraining, they aim to build a single super-network model capable of directly extracting sub-networks satisfying different constraints. Neural Architecture Transfer (NAT) was developed to maximise the effectiveness of extracting sub-networks from a super-network. In this paper, we present NATv2, an extension of NAT that improves multi-objective search algorithms applied to dynamic super-network architectures. NATv2 achieves qualitative improvements in the extractable sub-networks by exploiting the improved super-networks generated by OFAv2 and incorporating new policies for initialisation, pre-processing and updating its networks archive. In addition, a post-processing pipeline based on fine-tuning is introduced. Experimental results show that NATv2 successfully improves NAT and is highly recommended for investigating high-performance architectures with a minimal number of parameters.

Learning Difference Equations with Structured Grammatical Evolution for Postprandial Glycaemia Prediction

  • paper_url: http://arxiv.org/abs/2307.01238
  • repo_url: None
  • paper_authors: Daniel Parra, David Joedicke, J. Manuel Velasco, Gabriel Kronberger, J. Ignacio Hidalgo
  • for: interpretable postprandial blood glucose prediction to help people with diabetes regulate their levels after meals.
  • methods: Interpretable Sparse Identification by Grammatical Evolution, preceded by a clustering stage, producing finite difference equations that predict glucose up to two hours after meals in 15-minute steps (up to eight horizons).
  • results: safe and accurate predictions without sacrificing interpretability, slightly more accurate than sparse identification of non-linear dynamics and artificial neural networks, avoiding Parkes Error Grid zones D (0.2% on average) and E (0%) and reducing zone C (6.2%).
    Abstract People with diabetes must carefully monitor their blood glucose levels, especially after eating. Blood glucose regulation requires a proper combination of food intake and insulin boluses. Glucose prediction is vital to avoid dangerous post-meal complications in treating individuals with diabetes. Although traditional methods, such as artificial neural networks, have shown high accuracy rates, sometimes they are not suitable for developing personalised treatments by physicians due to their lack of interpretability. In this study, we propose a novel glucose prediction method emphasising interpretability: Interpretable Sparse Identification by Grammatical Evolution. Combined with a previous clustering stage, our approach provides finite difference equations to predict postprandial glucose levels up to two hours after meals. We divide the dataset into four-hour segments and perform clustering based on blood glucose values for the twohour window before the meal. Prediction models are trained for each cluster for the two-hour windows after meals, allowing predictions in 15-minute steps, yielding up to eight predictions at different time horizons. Prediction safety was evaluated based on Parkes Error Grid regions. Our technique produces safe predictions through explainable expressions, avoiding zones D (0.2% average) and E (0%) and reducing predictions on zone C (6.2%). In addition, our proposal has slightly better accuracy than other techniques, including sparse identification of non-linear dynamics and artificial neural networks. The results demonstrate that our proposal provides interpretable solutions without sacrificing prediction accuracy, offering a promising approach to glucose prediction in diabetes management that balances accuracy, interpretability, and computational efficiency.
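The models the grammar evolves are small finite difference equations rather than networks; a hypothetical equation of that form with invented coefficients, rolled out in 15-minute steps:

```python
def predict_glucose(history, steps=8):
    """Roll out a toy difference equation of the interpretable form the
    method searches for: g[t+1] = g[t] + a*(g[t] - g[t-1]) + b.
    The coefficients a, b below are made up, not learned values."""
    a, b = 0.6, -0.8
    g = list(history)
    for _ in range(steps):
        g.append(g[-1] + a * (g[-1] - g[-2]) + b)
    return g[len(history):]

# blood glucose (mg/dL) around a meal, sampled every 15 minutes
print(predict_glucose([110.0, 118.0, 127.0]))   # eight 15-min-ahead values
```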

Dynamical Graph Echo State Networks with Snapshot Merging for Dissemination Process Classification

  • paper_url: http://arxiv.org/abs/2307.01237
  • repo_url: None
  • paper_authors: Ziqiang Li, Kantaro Fujiwara, Gouhei Tanaka
  • for: dissemination process classification (DPC), i.e., classifying spreading patterns of information or pestilence over discrete-time temporal graphs of a community.
  • methods: a snapshot-merging strategy that forms new snapshots by merging temporally neighboring ones, combined with the Dynamical Graph Echo State Network (DynGESN); multiple reservoir encoders capture spatiotemporal features from the merged snapshots and logistic regression decodes the sum-pooled embeddings.
  • results: better classification performance than DynGESN and several kernel-based models on six benchmark DPC datasets.
    Abstract The Dissemination Process Classification (DPC) is a popular application of temporal graph classification. The aim of DPC is to classify different spreading patterns of information or pestilence within a community represented by discrete-time temporal graphs. Recently, a reservoir computing-based model named Dynamical Graph Echo State Network (DynGESN) has been proposed for processing temporal graphs with relatively high effectiveness and low computational costs. In this study, we propose a novel model which combines a novel data augmentation strategy called snapshot merging with the DynGESN for dealing with DPC tasks. In our model, the snapshot merging strategy is designed for forming new snapshots by merging neighboring snapshots over time, and then multiple reservoir encoders are set for capturing spatiotemporal features from merged snapshots. After those, the logistic regression is adopted for decoding the sum-pooled embeddings into the classification results. Experimental results on six benchmark DPC datasets show that our proposed model has better classification performances than the DynGESN and several kernel-based models.
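Snapshot merging itself is a small operation on the adjacency sequence; a sketch that ORs each snapshot with its neighbors over a sliding window before the reservoir sees it:

```python
import numpy as np

def merge_snapshots(adjs, window=2):
    """Merge neighboring adjacency snapshots (binary OR over a sliding
    window) so each merged snapshot carries short-range temporal context."""
    return [np.clip(np.sum(adjs[i:i + window], axis=0), 0, 1)
            for i in range(len(adjs) - window + 1)]

snaps = [np.random.binomial(1, 0.1, (20, 20)) for _ in range(10)]
merged = merge_snapshots(snaps, window=3)     # 8 merged snapshots
```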

Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch

  • paper_url: http://arxiv.org/abs/2307.01236
  • repo_url: https://github.com/topal-team/rockmate
  • paper_authors: Xunyi Zhao, Théotime Le Hellard, Lionel Eyraud, Julia Gusak, Olivier Beaumont
  • for: an automatic tool to control memory requirements when training PyTorch DNN models, at the cost of a few re-computations.
  • methods: starting from the model code, Rockmate detects the structure of computational and data dependencies and rewrites the model as a sequence of complex blocks, combining an adaptation of Checkmate (general but too slow on whole models) at the block level with an adaptation of Rotor (fast but limited to sequential models) at the sequence level.
  • results: experiments on many models show Rockmate is as fast as Rotor and as efficient as Checkmate, often reducing activation memory by a factor of 2 to 5 for a negligible overhead of roughly 10% to 20%.
    Abstract We propose Rockmate to control the memory requirements when training PyTorch DNN models. Rockmate is an automatic tool that starts from the model code and generates an equivalent model, using a predefined amount of memory for activations, at the cost of a few re-computations. Rockmate automatically detects the structure of computational and data dependencies and rewrites the initial model as a sequence of complex blocks. We show that such a structure is widespread and can be found in many models in the literature (Transformer based models, ResNet, RegNets,...). This structure allows us to solve the problem in a fast and efficient way, using an adaptation of Checkmate (too slow on the whole model but general) at the level of individual blocks and an adaptation of Rotor (fast but limited to sequential models) at the level of the sequence itself. We show through experiments on many models that Rockmate is as fast as Rotor and as efficient as Checkmate, and that it allows in many cases to obtain a significantly lower memory consumption for activations (by a factor of 2 to 5) for a rather negligible overhead (of the order of 10% to 20%). Rockmate is open source and available at https://github.com/topal-team/rockmate.
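For contrast with what Rockmate automates, plain PyTorch exposes manual re-materialization through activation checkpointing, which recomputes segments in the backward pass instead of storing their activations; Rockmate instead derives a per-block schedule for a stated memory budget directly from the model code:

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# Manual re-materialization baseline: split a sequential model into 4
# segments; only segment boundaries keep activations, everything else is
# recomputed during the backward pass.
model = nn.Sequential(*[nn.Sequential(nn.Linear(256, 256), nn.ReLU())
                        for _ in range(16)])
x = torch.randn(64, 256, requires_grad=True)
y = checkpoint_sequential(model, 4, x)
y.sum().backward()
```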

OpenAPMax: Abnormal Patterns-based Model for Real-World Alzheimer’s Disease Diagnosis

  • paper_url: http://arxiv.org/abs/2307.00936
  • repo_url: None
  • paper_authors: Yunyou Huang, Xianglong Guan, Xiangjiang Lu, Xiaoshuang Liang, Xiuxia Miao, Jiyue Xie, Wenjing Liu, Li Ma, Suqin Tang, Zhifei Zhang, Jianfeng Zhan
  • for: The paper aims to address the challenges of Alzheimer’s disease (AD) diagnosis in real-world settings, particularly the open-set recognition problem where the known categories are not fixed and can change over time.
  • methods: The proposed method, OpenAPMax, uses an anomaly pattern-based approach to model the distance between each patient’s abnormal pattern and the center of their category, and modifies the classification probability using extreme value theory (EVT).
  • results: The proposed method achieves state-of-the-art results in open-set recognition, outperforming recent open-set recognition methods.Here’s the Simplified Chinese text format for the three key points:
    Abstract Alzheimer's disease (AD) cannot be reversed, but early diagnosis will significantly benefit patients' medical treatment and care. In recent works, AD diagnosis has the primary assumption that all categories are known a prior -- a closed-set classification problem, which contrasts with the open-set recognition problem. This assumption hinders the application of the model in natural clinical settings. Although many open-set recognition technologies have been proposed in other fields, they are challenging to use for AD diagnosis directly since 1) AD is a degenerative disease of the nervous system with similar symptoms at each stage, and it is difficult to distinguish from its pre-state, and 2) diversified strategies for AD diagnosis are challenging to model uniformly. In this work, inspired by the concerns of clinicians during diagnosis, we propose an open-set recognition model, OpenAPMax, based on the anomaly pattern to address AD diagnosis in real-world settings. OpenAPMax first obtains the abnormal pattern of each patient relative to each known category through statistics or a literature search, clusters the patients' abnormal pattern, and finally, uses extreme value theory (EVT) to model the distance between each patient's abnormal pattern and the center of their category and modify the classification probability. We evaluate the performance of the proposed method with recent open-set recognition, where we obtain state-of-the-art results.
    摘要 阿尔茨海默病(AD)无法逆转,但早期诊断将显著有利于患者的治疗与护理。近期工作中,AD 诊断的基本假设是所有类别均已预先知晓——即闭集分类问题,这与开集识别问题相对。这一假设妨碍了模型在真实临床环境中的应用。虽然其他领域已提出许多开集识别技术,但它们难以直接用于 AD 诊断,因为:1)AD 是神经系统的退行性疾病,各阶段症状相似,难以与其前驱状态区分;2)AD 诊断策略多样,难以统一建模。在本工作中,受临床医生诊断时关注点的启发,我们提出一种基于异常模式的开集识别模型 OpenAPMax,以应对真实环境下的 AD 诊断。OpenAPMax 首先通过统计或文献检索获得每位患者相对于各已知类别的异常模式,再对患者的异常模式进行聚类,最后利用极值理论(EVT)对每位患者的异常模式与其类别中心之间的距离建模,并据此修正分类概率。我们与近期的开集识别方法进行了对比评估,取得了最先进的结果。
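A minimal sketch of the EVT-based probability modification in the OpenMax style that OpenAPMax builds on: fit a Weibull to the tail of distances to each class center, then discount class scores by the Weibull CDF of a test sample's distance, routing the removed mass to an "unknown" class. The tail size and the exact recalibration rule are illustrative assumptions, not the paper's specification.

```python
import numpy as np
from scipy.stats import weibull_min

def fit_evt_models(dists_per_class, tail=20):
    """Fit a Weibull to the largest distances-to-center per known class."""
    models = {}
    for c, d in dists_per_class.items():
        tail_d = np.sort(d)[-tail:]
        models[c] = weibull_min.fit(tail_d, floc=0)  # (shape, loc, scale)
    return models

def openmax_adjust(logits, dist_to_centers, evt_models):
    """Down-weight each class score by the Weibull CDF of the sample's
    distance; the removed mass goes to an 'unknown' bucket."""
    probs_unnorm = np.exp(logits - logits.max())
    adjusted, unknown = [], 0.0
    for c, p in enumerate(probs_unnorm):
        w = weibull_min.cdf(dist_to_centers[c], *evt_models[c])
        adjusted.append(p * (1 - w))
        unknown += p * w
    scores = np.array(adjusted + [unknown])
    return scores / scores.sum()
```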

Learning Differentiable Logic Programs for Abstract Visual Reasoning

  • paper_url: http://arxiv.org/abs/2307.00928
  • repo_url: https://github.com/ml-research/neumann
  • paper_authors: Hikaru Shindo, Viktor Pfanschilling, Devendra Singh Dhami, Kristian Kersting
  • for: The paper is written for building intelligent agents that can perform visual reasoning and solve problem-solving tasks beyond perception.
  • methods: The paper proposes a graph-based differentiable forward reasoner called NEUMANN, which passes messages in a memory-efficient manner and handles structured programs with functors. Additionally, the paper proposes a computationally-efficient structure learning algorithm for explanatory program induction on complex visual scenes.
  • results: The paper demonstrates that NEUMANN outperforms neural, symbolic, and neuro-symbolic baselines in visual reasoning tasks, including a new task called "visual reasoning behind-the-scenes" that requires agents to learn abstract programs and answer queries by imagining scenes that are not observed.
    Abstract Visual reasoning is essential for building intelligent agents that understand the world and perform problem-solving beyond perception. Differentiable forward reasoning has been developed to integrate reasoning with gradient-based machine learning paradigms. However, due to the memory intensity, most existing approaches do not bring the best of the expressivity of first-order logic, excluding a crucial ability to solve abstract visual reasoning, where agents need to perform reasoning by using analogies on abstract concepts in different scenarios. To overcome this problem, we propose NEUro-symbolic Message-pAssiNg reasoNer (NEUMANN), which is a graph-based differentiable forward reasoner, passing messages in a memory-efficient manner and handling structured programs with functors. Moreover, we propose a computationally-efficient structure learning algorithm to perform explanatory program induction on complex visual scenes. To evaluate, in addition to conventional visual reasoning tasks, we propose a new task, visual reasoning behind-the-scenes, where agents need to learn abstract programs and then answer queries by imagining scenes that are not observed. We empirically demonstrate that NEUMANN solves visual reasoning tasks efficiently, outperforming neural, symbolic, and neuro-symbolic baselines.
    摘要 视觉推理是构建能够理解世界并在感知之外进行问题求解的智能体的关键。可微分前向推理的提出,使推理得以与基于梯度的机器学习范式相结合。然而,由于内存开销巨大,现有方法大多无法充分发挥一阶逻辑的表达能力,从而缺失了求解抽象视觉推理的关键能力——智能体需要在不同场景中对抽象概念进行类比推理。为克服这一问题,我们提出 NEUro-symbolic Message-pAssiNg reasoNer(NEUMANN):一种基于图的可微分前向推理器,以节省内存的方式传递消息,并能处理带函子的结构化程序。此外,我们提出一种计算高效的结构学习算法,用于在复杂视觉场景上进行解释性程序归纳。在评估方面,除常规视觉推理任务外,我们提出一个新任务"幕后视觉推理":智能体需要学习抽象程序,并通过想象未被观测到的场景来回答问题。实验表明,NEUMANN 能高效求解视觉推理任务,性能优于神经、符号以及神经-符号基线。

Semi-supervised multi-view concept decomposition

  • paper_url: http://arxiv.org/abs/2307.00924
  • repo_url: None
  • paper_authors: Qi Jiang, Guoxu Zhou, Qibin Zhao
  • for: 提升多视图数据的表示能力与多视图聚类性能
  • methods: 将单视图概念分解扩展为多视图版本,基于核方法学习潜在表示,并将多视图 CF、标签传播与流形学习整合进统一框架,同时引入自适应权重向量以平衡各视图的重要性
  • results: 在四个多样化的数据集上,实验表明 SMVCF 模型在多视图聚类任务中显著提升了表示能力与准确率
    Abstract Concept Factorization (CF), as a novel paradigm of representation learning, has demonstrated superior performance in multi-view clustering tasks. It overcomes limitations such as the non-negativity constraint imposed by traditional matrix factorization methods and leverages kernel methods to learn latent representations that capture the underlying structure of the data, thereby improving data representation. However, existing multi-view concept factorization methods fail to consider the limited labeled information inherent in real-world multi-view data. This often leads to significant performance loss. To overcome these limitations, we propose a novel semi-supervised multi-view concept factorization model, named SMVCF. In the SMVCF model, we first extend the conventional single-view CF to a multi-view version, enabling more effective exploration of complementary information across multiple views. We then integrate multi-view CF, label propagation, and manifold learning into a unified framework to leverage and incorporate valuable information present in the data. Additionally, an adaptive weight vector is introduced to balance the importance of different views in the clustering process. We further develop targeted optimization methods specifically tailored for the SMVCF model. Finally, we conduct extensive experiments on four diverse datasets with varying label ratios to evaluate the performance of SMVCF. The experimental results demonstrate the effectiveness and superiority of our proposed approach in multi-view clustering tasks.
    摘要 概念分解(CF)作为一种新的表示学习范式,在多视图聚类任务中表现出了更高的性能。它克服了传统矩阵分解方法的非负约束等限制,并利用核方法学习能够捕捉数据内在结构的潜在表示,从而改善数据表示。然而,现有的多视图概念分解方法没有考虑真实世界多视图数据中固有的有限标签信息,这往往导致显著的性能损失。为了解决这些限制,我们提出了一种新的半监督多视图概念分解模型,名为 SMVCF。在 SMVCF 模型中,我们首先将传统的单视图 CF 扩展到多视图版本,以更有效地挖掘多个视图之间的互补信息;然后将多视图 CF、标签传播和流形学习整合到一个统一框架中,以利用数据中存在的有价值信息。此外,我们还引入了自适应权重向量,以平衡不同视图在聚类过程中的重要性。我们进一步针对 SMVCF 模型开发了专门的优化方法。最后,我们在四个具有不同标签比例的多样化数据集上进行了广泛实验,以评估 SMVCF 的性能。实验结果表明,我们提出的方法在多视图聚类任务中具有更高的有效性和优势。

Achieving Stable Training of Reinforcement Learning Agents in Bimodal Environments through Batch Learning

  • paper_url: http://arxiv.org/abs/2307.00923
  • repo_url: None
  • paper_authors: E. Hurwitz, N. Peace, G. Cevora
  • for: 解决 tabular Q-learning 在双峰、随机环境中面临的挑战
  • methods: 使用批处理更新方法
  • results: 比较 typically 更新和批处理学习agent,批处理学习agent更高效、更具抗随机环境能力
    Abstract Bimodal, stochastic environments present a challenge to typical Reinforcement Learning problems. This problem is one that is surprisingly common in real world applications, being particularly applicable to pricing problems. In this paper we present a novel learning approach to the tabular Q-learning algorithm, tailored to tackling these specific challenges by using batch updates. A simulation of pricing problem is used as a testbed to compare a typically updated agent with a batch learning agent. The batch learning agents are shown to be both more effective than the typically-trained agents, and to be more resilient to the fluctuations in a large stochastic environment. This work has a significant potential to enable practical, industrial deployment of Reinforcement Learning in the context of pricing and others.
    摘要 双峰、随机的环境对典型的强化学习问题构成挑战。这类问题在实际应用中却出奇地常见,尤其适用于定价问题。在这篇论文中,我们针对表格型 Q-learning 算法提出了一种新的学习方法,通过批量更新来应对这些特定挑战。我们以一个定价问题的仿真作为测试平台,对比了按常规方式更新的智能体与批量学习的智能体。结果显示,批量学习的智能体不仅比常规训练的智能体更有效,而且对大型随机环境中的波动更具韧性。这项工作对于强化学习在定价等场景中的实际工业部署具有重要潜力。
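The abstract does not spell out the batch-update rule, so the following is a hedged sketch of one plausible variant: transitions are buffered and the tabular Q-values are updated once per batch with averaged TD errors, which damps the oscillations a bimodal reward distribution induces. The `env.reset()`/`env.step()` interface is assumed for illustration.

```python
import numpy as np

def batch_q_learning(env, n_states, n_actions, episodes=500,
                     batch_size=64, alpha=0.1, gamma=0.99, eps=0.1):
    """Tabular Q-learning with batched updates: buffer transitions, then
    apply the mean TD error per (state, action) once per batch."""
    Q = np.zeros((n_states, n_actions))
    buffer = []
    for _ in range(episodes):
        s, done = env.reset(), False          # assumed env interface
        while not done:
            a = np.random.randint(n_actions) if np.random.rand() < eps \
                else int(Q[s].argmax())
            s2, r, done = env.step(a)         # assumed env interface
            buffer.append((s, a, r, s2, done))
            s = s2
            if len(buffer) >= batch_size:
                delta = np.zeros_like(Q)      # accumulated TD errors
                count = np.zeros_like(Q)
                for (bs, ba, br, bs2, bd) in buffer:
                    target = br + (0 if bd else gamma * Q[bs2].max())
                    delta[bs, ba] += target - Q[bs, ba]
                    count[bs, ba] += 1
                Q += alpha * delta / np.maximum(count, 1)
                buffer.clear()
    return Q
```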

Quantum Machine Learning on Near-Term Quantum Devices: Current State of Supervised and Unsupervised Techniques for Real-World Applications

  • paper_url: http://arxiv.org/abs/2307.00908
  • repo_url: None
  • paper_authors: Yaswitha Gujju, Atsushi Matsuo, Rudy Raymond
  • for: 本文主要关注在真正的量子硬件上实现量子机器学习(QML)应用,以实现量子优势。
  • methods: 本文探讨了目前量子硬件上QML实现的Current Limitations,并提出了多种缓解这些限制的技术,如编码技术、架构结构、错误纠正和梯度方法。
  • results: 本文评估了这些 QML 实现的性能,并与其经典对应方法进行了比较;最后,讨论了在真实量子硬件上应用 QML 所面临的瓶颈,并提出了未来克服这些挑战的潜在方案。
    Abstract The past decade has seen considerable progress in quantum hardware in terms of the speed, number of qubits and quantum volume which is defined as the maximum size of a quantum circuit that can be effectively implemented on a near-term quantum device. Consequently, there has also been a rise in the number of works based on the applications of Quantum Machine Learning (QML) on real hardware to attain quantum advantage over their classical counterparts. In this survey, our primary focus is on selected supervised and unsupervised learning applications implemented on quantum hardware, specifically targeting real-world scenarios. Our survey explores and highlights the current limitations of QML implementations on quantum hardware. We delve into various techniques to overcome these limitations, such as encoding techniques, ansatz structure, error mitigation, and gradient methods. Additionally, we assess the performance of these QML implementations in comparison to their classical counterparts. Finally, we conclude our survey with a discussion on the existing bottlenecks associated with applying QML on real quantum devices and propose potential solutions for overcoming these challenges in the future.
    摘要 过去十年,量子硬件在速度、量子比特数量和量子体积(即近期量子设备上可有效实现的量子电路的最大规模)方面取得了长足进展。相应地,基于量子机器学习(QML)在真实硬件上的应用、以期获得超越经典方法的量子优势的研究也日益增多。在本综述中,我们主要关注在量子硬件上实现的、面向真实场景的监督与无监督学习应用。我们探讨并强调了当前 QML 在量子硬件上实现时存在的局限,并深入介绍了克服这些局限的多种技术,如编码技术、拟设(ansatz)结构、误差缓解和梯度方法。此外,我们还将这些 QML 实现的性能与其经典对应方法进行了比较。最后,我们讨论了在真实量子设备上应用 QML 的现有瓶颈,并提出了未来克服这些挑战的潜在方案。

Enhancing the Robustness of QMIX against State-adversarial Attacks

  • paper_url: http://arxiv.org/abs/2307.00907
  • repo_url: None
  • paper_authors: Weiran Guo, Guanjun Liu, Ziyuan Zhou, Ling Wang, Jiacun Wang
  • for: 本研究旨在提高多智能体强化学习(MARL)算法对状态对抗攻击(state-adversarial attacks)的鲁棒性。
  • methods: 本研究以 QMIX 算法为例,讨论了四种提升单智能体强化学习(SARL)算法鲁棒性的技术,并将其推广到多智能体场景;在训练阶段使用多种不同的攻击来训练模型。
  • results: 通过让用某一攻击训练出的模型在训练阶段接受其他攻击的测试,本研究系统地整理并总结了提升 MARL 算法鲁棒性的技术,使其能够更好地抵御状态对抗攻击。
    Abstract Deep reinforcement learning (DRL) performance is generally impacted by state-adversarial attacks, a perturbation applied to an agent's observation. Most recent research has concentrated on robust single-agent reinforcement learning (SARL) algorithms against state-adversarial attacks. Still, there has yet to be much work on robust multi-agent reinforcement learning. Using QMIX, one of the popular cooperative multi-agent reinforcement algorithms, as an example, we discuss four techniques to improve the robustness of SARL algorithms and extend them to multi-agent scenarios. To increase the robustness of multi-agent reinforcement learning (MARL) algorithms, we train models using a variety of attacks in this research. We then test the models taught using the other attacks by subjecting them to the corresponding attacks throughout the training phase. In this way, we organize and summarize techniques for enhancing robustness when used with MARL.
    摘要 深度强化学习(DRL)的性能通常会受到状态对抗攻击(对智能体观测施加的扰动)的影响。最近的研究主要集中在提升单智能体强化学习(SARL)算法对状态对抗攻击的鲁棒性上,而对多智能体强化学习(MARL)的鲁棒性研究仍然很少。本文以流行的协作式多智能体强化学习算法 QMIX 为例,讨论了四种提升 SARL 算法鲁棒性的技术,并将其推广到多智能体场景。为提高 MARL 算法的鲁棒性,我们在训练过程中使用多种攻击,并在训练阶段让用其他攻击训练的模型接受对应攻击的测试。通过这种方式,我们系统地整理和总结了用于增强 MARL 鲁棒性的技术。
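As a minimal illustration (not the paper's specific attacks), a state-adversarial perturbation simply corrupts each agent's observation inside an L-infinity ball while the environment keeps evolving on the true state; training under such perturbations is the basic recipe for robustness. The `agent.act()` interface below is an assumption for the sketch.

```python
import torch

def random_state_attack(obs, epsilon=0.05):
    """Perturb an agent's observation within an L-infinity ball — the
    simplest state-adversarial attack used during robust training."""
    noise = torch.empty_like(obs).uniform_(-epsilon, epsilon)
    return obs + noise

def adversarial_rollout_step(agents, obs_batch, epsilon=0.05):
    """Each agent acts on an attacked observation, while the environment
    transitions on the true (unperturbed) state elsewhere in the loop."""
    actions = []
    for agent, obs in zip(agents, obs_batch):
        attacked = random_state_attack(obs, epsilon)
        actions.append(agent.act(attacked))   # assumed .act() interface
    return actions
```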

Fixing confirmation bias in feature attribution methods via semantic match

  • paper_url: http://arxiv.org/abs/2307.00897
  • repo_url: None
  • paper_authors: Giovanni Cinà, Daniel Fernandez-Llaneza, Nishant Mishra, Tabea E. Röber, Sandro Pezzelle, Iacer Calixto, Rob Goedhart, Ş. İlker Birbil
  • for: This paper aims to address the issue of confirmation bias in feature attribution methods for black box models, and to propose a structured approach to evaluate the semantic match between human concepts and the model’s explanations.
  • methods: The paper proposes a new approach called “semantic match” to evaluate the alignment between human concepts and the feature attributions generated by the model. This approach is based on a conceptual framework put forward in Cin`a et al. (2023).
  • results: The paper presents a suite of experiments using both tabular and image data to demonstrate the effectiveness of the proposed approach in identifying both desirable and undesirable model behaviors. The results show that the assessment of semantic match can provide valuable insights into the model’s internal representations and help to resolve the issue of confirmation bias in XAI.
    Abstract Feature attribution methods have become a staple method to disentangle the complex behavior of black box models. Despite their success, some scholars have argued that such methods suffer from a serious flaw: they do not allow a reliable interpretation in terms of human concepts. Simply put, visualizing an array of feature contributions is not enough for humans to conclude something about a model's internal representations, and confirmation bias can trick users into false beliefs about model behavior. We argue that a structured approach is required to test whether our hypotheses on the model are confirmed by the feature attributions. This is what we call the "semantic match" between human concepts and (sub-symbolic) explanations. Building on the conceptual framework put forward in Cin\`a et al. [2023], we propose a structured approach to evaluate semantic match in practice. We showcase the procedure in a suite of experiments spanning tabular and image data, and show how the assessment of semantic match can give insight into both desirable (e.g., focusing on an object relevant for prediction) and undesirable model behaviors (e.g., focusing on a spurious correlation). We couple our experimental results with an analysis on the metrics to measure semantic match, and argue that this approach constitutes the first step towards resolving the issue of confirmation bias in XAI.
    摘要 特征归因方法已成为解析黑盒模型复杂行为的标准手段。尽管这些方法取得了成功,一些学者认为它们存在一个严重缺陷:无法以人类概念的形式给出可靠解释。简单来说,仅仅可视化一组特征贡献并不足以让人对模型的内部表示得出结论,而确认偏误可能诱使用户对模型行为形成错误信念。我们认为,需要一种结构化的方法来检验我们关于模型的假设是否得到特征归因的印证,这就是我们所说的人类概念与(亚符号)解释之间的"语义匹配"。基于 Cinà 等人 [2023] 提出的概念框架,我们提出了一种在实践中评估语义匹配的结构化方法。我们在涵盖表格数据和图像数据的一系列实验中展示了该流程,并说明语义匹配的评估既能揭示模型的合意行为(例如关注与预测相关的对象),也能揭示不合意行为(例如依赖虚假相关)。我们还结合实验结果分析了衡量语义匹配的指标,并认为这一方法是解决可解释人工智能(XAI)中确认偏误问题的第一步。
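The paper defines semantic match conceptually; one simple instantiation (an illustrative assumption, not the authors' metric) is to binarize the top-attributed pixels and score their overlap with a human-annotated concept mask:

```python
import numpy as np

def semantic_match_iou(attribution, concept_mask, top_frac=0.1):
    """Score the 'semantic match' between a feature-attribution map and a
    binary human concept mask: keep the top fraction of attributed pixels
    and compute intersection-over-union with the concept annotation."""
    flat = attribution.ravel()
    k = max(1, int(top_frac * flat.size))
    thresh = np.partition(flat, -k)[-k]
    attr_mask = attribution >= thresh
    inter = np.logical_and(attr_mask, concept_mask).sum()
    union = np.logical_or(attr_mask, concept_mask).sum()
    return inter / union if union else 0.0
```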

Internet of Things Fault Detection and Classification via Multitask Learning

  • paper_url: http://arxiv.org/abs/2307.01234
  • repo_url: None
  • paper_authors: Mohammad Arif Ul Alam
  • for: 这篇论文旨在开发一种适用于实际 IIoT 应用场景的故障检测和分类系统。
  • methods: 研究团队使用了实际IIoT系统进行三个阶段的数据收集,模拟了11种预定的故障类别。提出了SMTCNN方法用于IIoT故障检测和分类,并对实际数据进行评估。
  • results: SMTCNN方法在实际数据上达到了3.5%的特异性,并显著提高了精度、回归率和F1评价指标。
    Abstract This paper presents a comprehensive investigation into developing a fault detection and classification system for real-world IIoT applications. The study addresses challenges in data collection, annotation, algorithm development, and deployment. Using a real-world IIoT system, three phases of data collection simulate 11 predefined fault categories. We propose SMTCNN for fault detection and category classification in IIoT, evaluating its performance on real-world data. SMTCNN achieves superior specificity (3.5%) and shows significant improvements in precision, recall, and F1 measures compared to existing techniques.
    摘要 本文对面向真实工业物联网(IIoT)应用的故障检测与分类系统的开发进行了全面研究,涵盖数据采集、标注、算法开发与部署中的挑战。我们利用一个真实的 IIoT 系统,通过三个阶段的数据采集模拟了 11 种预定义的故障类别。我们提出了用于 IIoT 故障检测与类别分类的 SMTCNN 方法,并在真实数据上评估其性能。SMTCNN 取得了更优的特异性(3.5%),并在精确率、召回率和 F1 度量上较现有技术有显著提升。

Fraunhofer SIT at CheckThat! 2023: Tackling Classification Uncertainty Using Model Souping on the Example of Check-Worthiness Classification

  • paper_url: http://arxiv.org/abs/2307.02377
  • repo_url: None
  • paper_authors: Raphael Frick, Inna Vogel, Jeong-Eun Choi
  • for: 这个论文是为了解决政治辩论文本中是否需要进行复核的问题。
  • methods: 这个论文使用了Model Souping ensemble Classification scheme来解决这个问题。
  • results: 在英文数据集上,我们的提交模型达到了总F1分数0.878,在竞赛中排名第二。
    Abstract This paper describes the second-placed approach developed by the Fraunhofer SIT team in the CLEF-2023 CheckThat! lab Task 1B for English. Given a text snippet from a political debate, the aim of this task is to determine whether it should be assessed for check-worthiness. Detecting check-worthy statements aims to facilitate manual fact-checking efforts by prioritizing the claims that fact-checkers should consider first. It can also be considered as primary step of a fact-checking system. Our best-performing method took advantage of an ensemble classification scheme centered on Model Souping. When applied to the English data set, our submitted model achieved an overall F1 score of 0.878 and was ranked as the second-best model in the competition.
    摘要 本文介绍了 Fraunhofer SIT 团队在 CLEF-2023 CheckThat! 实验室英文任务 1B 中获得第二名的方法。给定一段政治辩论的文本片段,该任务旨在判断其是否值得进行核查。检测值得核查的陈述可以帮助人工事实核查工作确定应优先处理的主张,也可视为事实核查系统的第一步。我们表现最好的方法利用了以模型汤(Model Souping)为核心的集成分类方案。在英文数据集上,我们提交的模型取得了 0.878 的总体 F1 分数,在竞赛中排名第二。
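Model souping itself is straightforward to sketch: uniformly average the weights of several fine-tuned checkpoints of the same architecture into one model. The snippet below is a generic illustration, not the team's submission code; `paths` and `model` are assumed placeholders.

```python
import torch

def soup_state_dicts(state_dicts):
    """Uniform 'model soup': average the parameters of several fine-tuned
    checkpoints that share one architecture."""
    soup = {}
    for key in state_dicts[0]:
        soup[key] = torch.stack(
            [sd[key].float() for sd in state_dicts]
        ).mean(dim=0)
    return soup

# Usage sketch: `paths` holds checkpoints fine-tuned with different seeds
# or hyperparameters; `model` is the shared architecture.
# soup = soup_state_dicts([torch.load(p) for p in paths])
# model.load_state_dict(soup)
```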

Unbiased Pain Assessment through Wearables and EHR Data: Multi-attribute Fairness Loss-based CNN Approach

  • paper_url: http://arxiv.org/abs/2307.05333
  • repo_url: None
  • paper_authors: Sharmin Sultana, Md Mahmudur Rahman, Atqiya Munawara Mahi, Shao-Hsien Liu, Mohammad Arif Ul Alam
  • for: 这个研究旨在开发一个可以处理不同数据类型(IoT、EHR和临床调查)的扩展可靠的人工智能(AI)系统,以找到痛症状态的物理、行为和心理指标。
  • methods: 这个研究使用了一个基于Convolutional Neural Networks(CNN)的多Attribute Fairness Loss(MAFL)模型,以考虑数据中可能包含的敏感特征,并对于不同群体进行公平的痛症评估。
  • results: 研究结果显示,对于不同群体的痛症评估,提出的MAFL模型能够优化精度和公平性的贡献,并且与现有的 Mitigation 方法相比,表现较好。使用NIH All-Of-US 数据,研究范例包括868名受试者,收集了1500天的数据,以分析提议的公平痛症评估系统。
    Abstract The combination of diverse health data (IoT, EHR, and clinical surveys) and scalable-adaptable Artificial Intelligence (AI), has enabled the discovery of physical, behavioral, and psycho-social indicators of pain status. Despite the hype and promise to fundamentally alter the healthcare system with technological advancements, much AI adoption in clinical pain evaluation has been hampered by the heterogeneity of the problem itself and other challenges, such as personalization and fairness. Studies have revealed that many AI (i.e., machine learning or deep learning) models display biases and discriminate against specific population segments (such as those based on gender or ethnicity), which breeds skepticism among medical professionals about AI adaptability. In this paper, we propose a Multi-attribute Fairness Loss (MAFL) based CNN model that aims to account for any sensitive attributes included in the data and fairly predict patients' pain status while attempting to minimize the discrepancies between privileged and unprivileged groups. In order to determine whether the trade-off between accuracy and fairness can be satisfied, we compare the proposed model with well-known existing mitigation procedures, and studies reveal that the implemented model performs favorably in contrast to state-of-the-art methods. Utilizing NIH All-Of-US data, where a cohort of 868 distinct individuals with wearables and EHR data gathered over 1500 days has been taken into consideration to analyze our suggested fair pain assessment system.
    摘要 多元健康数据(IoT、EHR 与临床调查)与可扩展、可适应的人工智能(AI)相结合,使我们得以发现疼痛状态的生理、行为与心理社会指标。尽管技术进步有望从根本上改变医疗体系,AI 在临床疼痛评估中的采用仍受制于问题本身的异质性以及个性化、公平性等其他挑战。研究显示,许多 AI(即机器学习或深度学习)模型存在偏见,会歧视特定人群(例如基于性别或种族划分的群体),这使医疗专业人员对 AI 的适用性产生怀疑。在本文中,我们提出一种基于多属性公平损失(MAFL)的 CNN 模型,旨在考虑数据中包含的敏感属性,公平地预测患者的疼痛状态,同时尽量缩小优势群体与弱势群体之间的差距。为确定准确性与公平性之间的权衡是否可以满足,我们将所提模型与知名的现有缓解方法进行比较;研究表明,所实现的模型优于最先进的方法。我们利用 NIH All-Of-US 数据,选取 868 名拥有可穿戴设备与 EHR 数据、覆盖 1500 天的受试者队列,来分析所提出的公平疼痛评估系统。
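The exact form of MAFL is not given in the abstract; the sketch below shows one common pattern such a loss could follow (an assumption, not the paper's definition): the task loss plus a demographic-parity-style gap penalty averaged over several sensitive attributes.

```python
import torch
import torch.nn.functional as F

def multi_attribute_fairness_loss(logits, labels, sensitive, lam=1.0):
    """Fairness-regularized objective sketch: BCE plus, for each sensitive
    attribute, the absolute gap in mean predicted risk between the
    privileged (1) and unprivileged (0) groups.
    `sensitive` is a (batch, n_attributes) binary tensor."""
    bce = F.binary_cross_entropy_with_logits(logits, labels.float())
    probs = torch.sigmoid(logits)
    gaps = []
    for j in range(sensitive.shape[1]):
        g1 = sensitive[:, j] == 1
        g0 = sensitive[:, j] == 0
        if g1.any() and g0.any():
            gaps.append((probs[g1].mean() - probs[g0].mean()).abs())
    fairness = torch.stack(gaps).mean() if gaps else logits.new_zeros(())
    return bce + lam * fairness
```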

Exploring the Multi-modal Demand Dynamics During Transport System Disruptions

  • paper_url: http://arxiv.org/abs/2307.00877
  • repo_url: None
  • paper_authors: Ali Shateri Benam, Angelo Furno, Nour-Eddin El Faouzi
  • for: 本研究旨在探讨不同类型的交通系统紊乱对城市出行的影响,以及乘客对这些紊乱事件的不同反应。
  • methods: 本研究采用数据驱动的方法来探索紊乱情况下的多模式需求动态。首先,我们开发了一种基于历史小时级出行需求数据自动检测异常时段的方法;然后,对这些异常时段进行聚类,以区分紊乱期间出现的不同形式的多模式需求动态。
  • results: 本研究提供了一种简单的工具,可按出行方式选择对乘客应对紊乱事件的各类反应进行归类,并为估计不同紊乱情景下模式转移的范围铺平了预测分析的道路。
    Abstract Various forms of disruption in transport systems perturb urban mobility in different ways. Passengers respond heterogeneously to such disruptive events based on numerous factors. This study takes a data-driven approach to explore multi-modal demand dynamics under disruptions. We first develop a methodology to automatically detect anomalous instances through historical hourly travel demand data. Then we apply clustering to these anomalous hours to distinguish various forms of multi-modal demand dynamics occurring during disruptions. Our study provides a straightforward tool for categorising various passenger responses to disruptive events in terms of mode choice and paves the way for predictive analyses on estimating the scope of modal shift under distinct disruption scenarios.
    摘要 交通系统中各种形式的紊乱会以不同方式扰动城市出行。乘客基于诸多因素,对这类紊乱事件表现出各不相同的反应。本研究采用数据驱动的方法,探索紊乱情况下的多模式需求动态。我们首先开发了一种通过历史小时级出行需求数据自动检测异常时段的方法,随后对这些异常时段进行聚类,以区分紊乱期间出现的各种多模式需求动态形式。我们的研究提供了一种简单的工具,可从出行方式选择的角度对乘客应对紊乱事件的反应进行归类,并为估计不同紊乱情景下模式转移范围的预测分析奠定了基础。
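A hedged sketch of the two-stage pipeline the abstract describes: flag anomalous hours with a robust z-score computed per hour-of-week (so ordinary daily/weekly seasonality is not flagged), then cluster the multi-modal demand profiles of those hours. The threshold and the use of KMeans are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def detect_anomalous_hours(demand, z_thresh=3.0):
    """Flag anomalous hours via a robust (median/MAD) z-score computed per
    hour-of-week. `demand` is a (n_hours, n_modes) array of trip counts."""
    hours = np.arange(len(demand)) % (24 * 7)
    flags = np.zeros(len(demand), dtype=bool)
    for h in range(24 * 7):
        idx = np.where(hours == h)[0]
        med = np.median(demand[idx], axis=0)
        mad = np.median(np.abs(demand[idx] - med), axis=0) + 1e-9
        z = 0.6745 * (demand[idx] - med) / mad
        flags[idx] |= (np.abs(z) > z_thresh).any(axis=1)
    return flags

def cluster_disruption_patterns(demand, flags, k=4, seed=0):
    """Group anomalous hours by their (normalized) multi-modal profile."""
    profiles = demand[flags] / (demand[flags].sum(axis=1, keepdims=True) + 1e-9)
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(profiles)
```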

RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations

  • paper_url: http://arxiv.org/abs/2307.01233
  • repo_url: None
  • paper_authors: Neha Sahipjohn, Neil Shah, Vishal Tambrahalli, Vineet Gandhi
  • for: lip-to-speech synthesis
  • methods: non-autoregressive sequence-to-sequence architecture, disentangled speech content representation
  • results: state-of-the-art performance on unconstrained and constrained datasets, speech samples available online
    Abstract Significant progress has been made in speaker dependent Lip-to-Speech synthesis, which aims to generate speech from silent videos of talking faces. Current state-of-the-art approaches primarily employ non-autoregressive sequence-to-sequence architectures to directly predict mel-spectrograms or audio waveforms from lip representations. We hypothesize that the direct mel-prediction hampers training/model efficiency due to the entanglement of speech content with ambient information and speaker characteristics. To this end, we propose RobustL2S, a modularized framework for Lip-to-Speech synthesis. First, a non-autoregressive sequence-to-sequence model maps self-supervised visual features to a representation of disentangled speech content. A vocoder then converts the speech features into raw waveforms. Extensive evaluations confirm the effectiveness of our setup, achieving state-of-the-art performance on the unconstrained Lip2Wav dataset and the constrained GRID and TCD-TIMIT datasets. Speech samples from RobustL2S can be found at https://neha-sherin.github.io/RobustL2S/
    摘要 说话人相关的唇语到语音合成(旨在从无声的说话人脸视频生成语音)已取得显著进展。当前最先进的方法主要采用非自回归的序列到序列架构,直接从唇部表示预测梅尔频谱或音频波形。我们假设,由于语音内容与环境信息及说话人特征相互纠缠,直接预测梅尔频谱会降低训练与模型效率。为此,我们提出 RobustL2S,一种模块化的唇语到语音合成框架:首先,由一个非自回归序列到序列模型将自监督视觉特征映射为解耦后的语音内容表示;随后,由声码器将语音特征转换为原始波形。大量评估证实了该方案的有效性,在不受限的 Lip2Wav 数据集以及受限的 GRID 和 TCD-TIMIT 数据集上均取得了最先进的性能。RobustL2S 的语音样例见 https://neha-sherin.github.io/RobustL2S/

MADS: Modulated Auto-Decoding SIREN for time series imputation

  • paper_url: http://arxiv.org/abs/2307.00868
  • repo_url: None
  • paper_authors: Tom Bamford, Elizabeth Fons, Yousef El-Laham, Svitlana Vyetrenko
  • for: 这篇论文旨在解决时间序列插补问题——由于时间序列数据的类型差异巨大,插补在许多领域中都是一项重要挑战。
  • methods: 这篇论文使用深度学习技术,特别是 SIREN 和超网络(hypernetwork)架构,来解决时间序列插补问题。
  • results: 这篇论文在两个实际数据集上进行了评估,结果表明其插补性能优于以往方法:在人类活动数据集上至少提升 40%,在空气质量数据集上与其他基线相当;在合成数据上的评估中,该模型在不同数据集配置下取得了所有基线中最佳的平均排名。
    Abstract Time series imputation remains a significant challenge across many fields due to the potentially significant variability in the type of data being modelled. Whilst traditional imputation methods often impose strong assumptions on the underlying data generation process, limiting their applicability, researchers have recently begun to investigate the potential of deep learning for this task, inspired by the strong performance shown by these models in both classification and regression problems across a range of applications. In this work we propose MADS, a novel auto-decoding framework for time series imputation, built upon implicit neural representations. Our method leverages the capabilities of SIRENs for high fidelity reconstruction of signals and irregular data, and combines it with a hypernetwork architecture which allows us to generalise by learning a prior over the space of time series. We evaluate our model on two real-world datasets, and show that it outperforms state-of-the-art methods for time series imputation. On the human activity dataset, it improves imputation performance by at least 40%, while on the air quality dataset it is shown to be competitive across all metrics. When evaluated on synthetic data, our model results in the best average rank across different dataset configurations over all baselines.
    摘要 由于被建模数据的类型可能差异巨大,时间序列插补在许多领域中仍是一项重大挑战。传统插补方法往往对底层数据生成过程施加很强的假设,限制了其适用性;受深度学习模型在各类分类与回归问题中优异表现的启发,研究者近来开始探索将其用于这一任务。在这项工作中,我们提出 MADS,一种基于隐式神经表示的新型自解码时间序列插补框架。我们的方法利用 SIREN 对信号和不规则数据进行高保真重建的能力,并将其与一个超网络(hypernetwork)架构相结合,通过学习时间序列空间上的先验来实现泛化。我们在两个真实数据集上评估了该模型,结果显示它优于最先进的时间序列插补方法:在人类活动数据集上,插补性能至少提升 40%;在空气质量数据集上,各项指标均具有竞争力。在合成数据上的评估中,我们的模型在不同数据集配置下取得了优于所有基线的最佳平均排名。
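MADS builds on SIRENs; a single SIREN layer is a linear map followed by a sine activation with a carefully scaled initialization (Sitzmann et al., 2020). The small imputation-flavored usage below is illustrative, not the MADS architecture itself.

```python
import numpy as np
import torch
from torch import nn

class SineLayer(nn.Module):
    """One SIREN layer: sin(omega_0 * (Wx + b)), with the uniform weight
    initialization proposed for SIRENs."""
    def __init__(self, in_f, out_f, omega_0=30.0, is_first=False):
        super().__init__()
        self.omega_0 = omega_0
        self.linear = nn.Linear(in_f, out_f)
        with torch.no_grad():
            bound = 1 / in_f if is_first else np.sqrt(6 / in_f) / omega_0
            self.linear.weight.uniform_(-bound, bound)

    def forward(self, x):
        return torch.sin(self.omega_0 * self.linear(x))

# A small SIREN mapping a timestamp t in [-1, 1] to an imputed value:
siren = nn.Sequential(
    SineLayer(1, 64, is_first=True),
    SineLayer(64, 64),
    nn.Linear(64, 1),
)
t = torch.linspace(-1, 1, 100).unsqueeze(-1)
values = siren(t)   # fit with MSE on observed points, then query at gaps
```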

  • paper_url: http://arxiv.org/abs/2307.00865
  • repo_url: None
  • paper_authors: Xingyu Liu, Juan Chen, Quan Wen
  • for: 本文旨在探讨传统卷积神经网络在图数据处理上的局限,以及如何将卷积网络扩展到图数据分析和处理领域。
  • methods: 本文使用 graph convolutional operators 和 graph pooling operators 来构建图 convolutional neural networks,并采用 attention mechanisms 和 autoencoders 来提高模型性能。
  • results: 本文通过对node classification、graph classification和link prediction等任务的应用,阐述了图 convolutional neural networks 在不同任务中的应用和效果。
    Abstract Traditional convolutional neural networks are limited to handling Euclidean space data, overlooking the vast realm of real-life scenarios represented as graph data, including transportation networks, social networks, and reference networks. The pivotal step in transferring convolutional neural networks to graph data analysis and processing lies in the construction of graph convolutional operators and graph pooling operators. This comprehensive review article delves into the world of graph convolutional neural networks. Firstly, it elaborates on the fundamentals of graph convolutional neural networks. Subsequently, it elucidates the graph neural network models based on attention mechanisms and autoencoders, summarizing their application in node classification, graph classification, and link prediction along with the associated datasets.
    摘要 传统的卷积神经网络只能处理欧几里得空间数据,忽略了现实生活中大量以图数据表示的场景,包括交通网络、社交网络和引用网络。将卷积神经网络迁移到图数据分析与处理的关键一步,在于图卷积算子和图池化算子的构造。本综述文章深入介绍图卷积神经网络:首先阐述图卷积神经网络的基础知识,随后介绍基于注意力机制和自编码器的图神经网络模型,并总结它们在节点分类、图分类和链接预测中的应用及相关数据集。
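The basic graph convolutional operator the review covers can be written in a few lines (Kipf & Welling's formulation): add self-loops, symmetrically normalize the adjacency matrix, aggregate neighbor features, and apply a learned linear map. The toy graph below is only for illustration.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph convolution: H = ReLU(D^{-1/2} (A + I) D^{-1/2} X W)."""
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt    # symmetric normalization
    return np.maximum(A_norm @ X @ W, 0.0)      # aggregate, transform, ReLU

# Toy usage: 4 nodes on a path graph, 3 input features, 2 output features.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
X = np.random.randn(4, 3)
W = np.random.randn(3, 2)
H = gcn_layer(A, X, W)
```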

Thompson Sampling under Bernoulli Rewards with Local Differential Privacy

  • paper_url: http://arxiv.org/abs/2307.00863
  • repo_url: None
  • paper_authors: Bo Jiang, Tianchi Zhao, Ming Li
  • for: 这篇论文研究了多臂老虎机(MAB)问题中的悔恨(regret)最小化问题,同时保证本地差分隐私(LDP)。
  • methods: 论文在伯努利奖励场景下考虑了三种隐私机制:线性机制、二次机制和指数机制,并对每种机制下的 Thompson Sampling 算法推导了随机悔恨界。
  • results: 论文通过仿真展示了不同隐私预算下各机制的收敛行为。
    Abstract This paper investigates the problem of regret minimization for multi-armed bandit (MAB) problems with local differential privacy (LDP) guarantee. Given a fixed privacy budget $\epsilon$, we consider three privatizing mechanisms under Bernoulli scenario: linear, quadratic and exponential mechanisms. Under each mechanism, we derive stochastic regret bound for Thompson Sampling algorithm. Finally, we simulate to illustrate the convergence of different mechanisms under different privacy budgets.
    摘要 本文研究了在保证本地差分隐私(LDP)的前提下,多臂老虎机(MAB)问题的悔恨最小化问题。给定固定的隐私预算 $\epsilon$,我们在伯努利场景下考虑三种隐私化机制:线性机制、二次机制和指数机制。在每种机制下,我们为 Thompson Sampling 算法推导了随机悔恨界。最后,我们通过仿真展示了不同隐私预算下各机制的收敛情况。
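A minimal sketch of Thompson Sampling under epsilon-LDP with Bernoulli rewards: each reward is privatized with randomized response before it reaches the learner, and Beta posteriors are maintained over the privatized reward means. This is a simplified illustration; the paper's three mechanisms and their debiasing details differ.

```python
import numpy as np

def privatize_rr(reward, epsilon):
    """Randomized response on a Bernoulli reward: report the true bit with
    probability e^eps / (1 + e^eps), else flip it — an eps-LDP mechanism."""
    p = np.exp(epsilon) / (1 + np.exp(epsilon))
    return reward if np.random.rand() < p else 1 - reward

def ldp_thompson(arms_mu, T, epsilon):
    """Thompson Sampling that only ever sees privatized rewards."""
    K = len(arms_mu)
    alpha, beta = np.ones(K), np.ones(K)
    total = 0.0
    for _ in range(T):
        a = int(np.argmax(np.random.beta(alpha, beta)))
        r = np.random.rand() < arms_mu[a]        # true Bernoulli reward
        total += r
        y = privatize_rr(int(r), epsilon)        # only y leaves the user
        alpha[a] += y
        beta[a] += 1 - y
    return total

# Example: 3 arms, privacy budget eps = 1.0, horizon of 10000 rounds.
reward = ldp_thompson([0.3, 0.5, 0.7], T=10_000, epsilon=1.0)
```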

CardiGraphormer: Unveiling the Power of Self-Supervised Learning in Revolutionizing Drug Discovery

  • paper_url: http://arxiv.org/abs/2307.00859
  • repo_url: None
  • paper_authors: Abhijit Gupta, Arnab Mukherjee
  • for: 这篇论文旨在探讨一种新的人工智能(AI)方法,用于拓展药物发现的可能性。
  • methods: 这篇论文结合自监督学习(SSL)、图神经网络(GNN)和基数保持注意力(Cardinality Preserving Attention),构建了名为 CardiGraphormer 的新方法。
  • results: CardiGraphormer 能够学习强大的分子表示,并利用 GNN 提取分子指纹,在提升预测性能与可解释性的同时减少计算时间;它可以处理分子结构等复杂数据,并能完成针对节点、节点对、子图乃至整个图结构的任务。
    Abstract In the expansive realm of drug discovery, with approximately 15,000 known drugs and only around 4,200 approved, the combinatorial nature of the chemical space presents a formidable challenge. While Artificial Intelligence (AI) has emerged as a powerful ally, traditional AI frameworks face significant hurdles. This manuscript introduces CardiGraphormer, a groundbreaking approach that synergizes self-supervised learning (SSL), Graph Neural Networks (GNNs), and Cardinality Preserving Attention to revolutionize drug discovery. CardiGraphormer, a novel combination of Graphormer and Cardinality Preserving Attention, leverages SSL to learn potent molecular representations and employs GNNs to extract molecular fingerprints, enhancing predictive performance and interpretability while reducing computation time. It excels in handling complex data like molecular structures and performs tasks associated with nodes, pairs of nodes, subgraphs, or entire graph structures. CardiGraphormer's potential applications in drug discovery and drug interactions are vast, from identifying new drug targets to predicting drug-to-drug interactions and enabling novel drug discovery. This innovative approach provides an AI-enhanced methodology in drug development, utilizing SSL combined with GNNs to overcome existing limitations and pave the way for a richer exploration of the vast combinatorial chemical space in drug discovery.
    摘要 在药物发现这一广阔领域中,已知药物约 15,000 种而获批者仅约 4,200 种,化学空间的组合特性构成了巨大的挑战。人工智能(AI)已成为一种强大的助力,但传统的 AI 框架仍面临重大障碍。本文介绍 CardiGraphormer,一种将自监督学习(SSL)、图神经网络(GNN)与基数保持注意力协同结合、旨在变革药物发现的开创性方法。CardiGraphormer 是 Graphormer 与基数保持注意力的新颖组合,借助 SSL 学习强大的分子表示,并利用 GNN 提取分子指纹,在提升预测性能和可解释性的同时降低计算时间。它擅长处理分子结构等复杂数据,并能完成与节点、节点对、子图或整个图结构相关的任务。CardiGraphormer 在药物发现和药物相互作用方面的潜在应用十分广泛:从识别新的药物靶点,到预测药物间相互作用,再到支持新药发现。这一创新方法为药物研发提供了一种 AI 增强的方法论,通过将 SSL 与 GNN 相结合来克服现有局限,为更充分地探索药物发现中庞大的组合化学空间铺平道路。

Beyond the Snapshot: Brain Tokenized Graph Transformer for Longitudinal Brain Functional Connectome Embedding

  • paper_url: http://arxiv.org/abs/2307.00858
  • repo_url: https://github.com/zijiand/brain-tokengt
  • paper_authors: Zijian Dong, Yilei Wu, Yu Xiao, Joanna Su Xian Chong, Yueming Jin, Juan Helen Zhou
  • for: 这项研究旨在开发一种可解释的嵌入方法,用于认知能力下降与神经退行性疾病进展的诊断和预测。
  • methods: 该方法结合图神经网络(GNN)与 token 化技术,实现对脑功能连接组(FC)随时间变化轨迹的嵌入。
  • results: 在 AD 疾病谱系的两个公开纵向 fMRI 数据集上,该方法优于其他基准模型,并提供了出色的可解释性。
    Abstract Under the framework of network-based neurodegeneration, brain functional connectome (FC)-based Graph Neural Networks (GNN) have emerged as a valuable tool for the diagnosis and prognosis of neurodegenerative diseases such as Alzheimer's disease (AD). However, these models are tailored for brain FC at a single time point instead of characterizing FC trajectory. Discerning how FC evolves with disease progression, particularly at the predementia stages such as cognitively normal individuals with amyloid deposition or individuals with mild cognitive impairment (MCI), is crucial for delineating disease spreading patterns and developing effective strategies to slow down or even halt disease advancement. In this work, we proposed the first interpretable framework for brain FC trajectory embedding with application to neurodegenerative disease diagnosis and prognosis, namely Brain Tokenized Graph Transformer (Brain TokenGT). It consists of two modules: 1) Graph Invariant and Variant Embedding (GIVE) for generation of node and spatio-temporal edge embeddings, which were tokenized for downstream processing; 2) Brain Informed Graph Transformer Readout (BIGTR) which augments previous tokens with trainable type identifiers and non-trainable node identifiers and feeds them into a standard transformer encoder to readout. We conducted extensive experiments on two public longitudinal fMRI datasets of the AD continuum for three tasks, including differentiating MCI from controls, predicting dementia conversion in MCI, and classification of amyloid positive or negative cognitively normal individuals. Based on brain FC trajectory, the proposed Brain TokenGT approach outperformed all the other benchmark models and at the same time provided excellent interpretability. The code is available at https://github.com/ZijianD/Brain-TokenGT.git
    摘要 在基于网络的神经退行性病变框架下,基于脑功能连接组(FC)的图神经网络(GNN)已成为阿尔茨海默病(AD)等神经退行性疾病诊断与预后评估的有力工具。然而,这些模型针对的是单一时间点的脑 FC,而非刻画 FC 的变化轨迹。了解 FC 如何随疾病进展而演变——尤其是在痴呆前阶段,如伴有淀粉样蛋白沉积的认知正常个体或轻度认知障碍(MCI)个体——对于描绘疾病扩散模式、制定延缓乃至阻止疾病进展的有效策略至关重要。在这项工作中,我们提出了首个可解释的脑 FC 轨迹嵌入框架,并将其应用于神经退行性疾病的诊断与预后,命名为 Brain Tokenized Graph Transformer(Brain TokenGT)。它包含两个模块:1)图不变与可变嵌入(GIVE),用于生成节点与时空边的嵌入,并将其 token 化以供下游处理;2)脑信息引导的图 Transformer 读出(BIGTR),为已有 token 追加可训练的类型标识符与不可训练的节点标识符,并送入标准 Transformer 编码器进行读出。我们在 AD 谱系的两个公开纵向 fMRI 数据集上针对三项任务进行了大量实验,包括区分 MCI 与对照组、预测 MCI 向痴呆的转化,以及对淀粉样蛋白阳性或阴性的认知正常个体进行分类。基于脑 FC 轨迹,所提出的 Brain TokenGT 方法优于所有其他基准模型,同时提供了出色的可解释性。代码见 https://github.com/ZijianD/Brain-TokenGT.git

Surgical fine-tuning for Grape Bunch Segmentation under Visual Domain Shifts

  • paper_url: http://arxiv.org/abs/2307.00837
  • repo_url: https://github.com/airlab-polimi/sft_grape_segmentation
  • paper_authors: Agnese Chiatti, Riccardo Bertoglio, Nico Catalano, Matteo Gatti, Matteo Matteucci
  • for: 这篇论文研究移动机器人在农业中的应用,尤其是在葡萄园中自主且有效地监测植株状态。
  • methods: 本研究使用外科式微调(surgical fine-tuning)来适应葡萄图像中的视觉域偏移:仅选择性地微调特定的模型层,即可让预训练深度学习模型适应新采集的葡萄图像,同时减少需调整的参数数量。
  • results: 研究表明,通过外科式微调,可以在显著减少可调参数数量的情况下,使预训练深度学习模型快速适应存在视觉域偏移的新葡萄图像,并提升葡萄串分割的效果。
    Abstract Mobile robots will play a crucial role in the transition towards sustainable agriculture. To autonomously and effectively monitor the state of plants, robots ought to be equipped with visual perception capabilities that are robust to the rapid changes that characterise agricultural settings. In this paper, we focus on the challenging task of segmenting grape bunches from images collected by mobile robots in vineyards. In this context, we present the first study that applies surgical fine-tuning to instance segmentation tasks. We show how selectively tuning only specific model layers can support the adaptation of pre-trained Deep Learning models to newly-collected grape images that introduce visual domain shifts, while also substantially reducing the number of tuned parameters.
    摘要 移动机器人将在向可持续农业的转型中发挥关键作用。为了自主且有效地监测植株状态,机器人需要具备对农业环境中快速变化保持鲁棒的视觉感知能力。本文关注从葡萄园中移动机器人采集的图像中分割葡萄串这一具有挑战性的任务。在此背景下,我们给出了首个将外科式微调应用于实例分割任务的研究。我们展示了仅选择性地微调特定模型层,即可支持预训练深度学习模型适应带来视觉域偏移的新采集葡萄图像,同时大幅减少需调整的参数数量。
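Surgical fine-tuning reduces to freezing the whole network and unfreezing only the chosen layer(s); which layer to tune depends on the kind of domain shift (early blocks for low-level visual shifts). The snippet is a generic sketch with recent torchvision, not the authors' repository code.

```python
import torch
from torchvision.models import resnet50

# Freeze everything, then unfreeze only the "surgically" tuned layer.
model = resnet50(weights="IMAGENET1K_V2")
for p in model.parameters():
    p.requires_grad = False
for p in model.layer1.parameters():   # early block: low-level domain shift
    p.requires_grad = True

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-3, momentum=0.9)
print(sum(p.numel() for p in trainable), "parameters will be tuned")
```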

Trading-Off Payments and Accuracy in Online Classification with Paid Stochastic Experts

  • paper_url: http://arxiv.org/abs/2307.00836
  • repo_url: None
  • paper_authors: Dirk van der Hoeven, Ciara Pike-Burke, Hao Qiu, Nicolo Cesa-Bianchi
  • for: 这篇论文研究带付费随机专家(paid stochastic experts)的在线分类问题。
  • methods: 每位专家在给出预测前都必须获得报酬,报酬数额通过某个未知的 Lipschitz"生产率"函数直接影响其预测的准确性;学习者需要在每一轮决定支付每位专家多少报酬,然后给出预测,其成本为预测误差与支付给所有专家的预付报酬的加权和。
  • results: 论文提出了一种在线学习算法,其 $T$ 轮后的总成本比一个预先知晓所有专家生产率的预测器的总成本至多高出 $\mathcal{O}(K^2(\log T)\sqrt{T})$,其中 $K$ 为专家数量;这一结果通过结合 Lipschitz 赌博机与基于替代损失的在线分类,改进了标准 Lipschitz 赌博机设置下 $T^{2/3}$ 量级的界。
    Abstract We investigate online classification with paid stochastic experts. Here, before making their prediction, each expert must be paid. The amount that we pay each expert directly influences the accuracy of their prediction through some unknown Lipschitz "productivity" function. In each round, the learner must decide how much to pay each expert and then make a prediction. They incur a cost equal to a weighted sum of the prediction error and upfront payments for all experts. We introduce an online learning algorithm whose total cost after $T$ rounds exceeds that of a predictor which knows the productivity of all experts in advance by at most $\mathcal{O}(K^2(\log T)\sqrt{T})$ where $K$ is the number of experts. In order to achieve this result, we combine Lipschitz bandits and online classification with surrogate losses. These tools allow us to improve upon the bound of order $T^{2/3}$ one would obtain in the standard Lipschitz bandit setting. Our algorithm is empirically evaluated on synthetic data
    摘要 我们研究带付费随机专家的在线分类问题。在此设定下,每位专家必须先获得报酬才能给出预测,支付的数额通过某个未知的 Lipschitz"生产率"函数直接影响预测的准确性。在每一轮中,学习者需要决定支付每位专家多少报酬,然后给出预测;其成本为预测误差与支付给所有专家的预付报酬的加权和。我们提出一种在线学习算法,其 $T$ 轮后的总成本比一个预先知晓所有专家生产率的预测器至多高出 $\mathcal{O}(K^2(\log T)\sqrt{T})$,其中 $K$ 为专家数量。为取得这一结果,我们将 Lipschitz 赌博机与基于替代损失的在线分类相结合;这些工具使我们得以改进标准 Lipschitz 赌博机设定下 $T^{2/3}$ 量级的界。我们的算法在合成数据上进行了实证评估。

Engression: Extrapolation for Nonlinear Regression?

  • paper_url: http://arxiv.org/abs/2307.00835
  • repo_url: https://github.com/xwshen51/engression
  • paper_authors: Xinwei Shen, Nicolai Meinshausen
  • For: The paper is written for those who need a nonlinear regression method that can handle extrapolation tasks, especially in situations where the training data is limited and the test data is outside the support.
  • Methods: The paper proposes a new method called "engression", which is a distributional regression technique for pre-additive noise models. The method adds noise to the covariates before applying a nonlinear transformation, allowing it to perform well in extrapolation tasks.
  • Results: The paper shows that engression consistently provides a meaningful improvement in extrapolation tasks compared to traditional regression approaches such as least-squares regression and quantile regression, especially when the function class is strictly monotone. The empirical results from both simulated and real data validate the effectiveness of the engression method.
    Abstract Extrapolation is crucial in many statistical and machine learning applications, as it is common to encounter test data outside the training support. However, extrapolation is a considerable challenge for nonlinear models. Conventional models typically struggle in this regard: while tree ensembles provide a constant prediction beyond the support, neural network predictions tend to become uncontrollable. This work aims at providing a nonlinear regression methodology whose reliability does not break down immediately at the boundary of the training support. Our primary contribution is a new method called `engression' which, at its core, is a distributional regression technique for pre-additive noise models, where the noise is added to the covariates before applying a nonlinear transformation. Our experimental results indicate that this model is typically suitable for many real data sets. We show that engression can successfully perform extrapolation under some assumptions such as a strictly monotone function class, whereas traditional regression approaches such as least-squares regression and quantile regression fall short under the same assumptions. We establish the advantages of engression over existing approaches in terms of extrapolation, showing that engression consistently provides a meaningful improvement. Our empirical results, from both simulated and real data, validate these findings, highlighting the effectiveness of the engression method. The software implementations of engression are available in both R and Python.
    摘要 外推在许多统计和机器学习应用中至关重要,因为在实践中常会遇到处于训练支撑集之外的测试数据。然而,外推对非线性模型是相当大的挑战。传统模型在这方面通常表现欠佳:树集成在支撑集之外只给出常数预测,而神经网络的预测则往往变得不可控。这项工作旨在提供一种非线性回归方法,使其可靠性不会在训练支撑集边界处立即失效。我们的主要贡献是一种名为"engression"的新方法,其核心是针对前加性噪声模型的分布回归技术:噪声先加到协变量上,再施加非线性变换。实验结果表明,该模型通常适用于许多真实数据集。我们证明,在诸如严格单调函数类等假设下,engression 能够成功进行外推,而最小二乘回归和分位数回归等传统回归方法在相同假设下则力有不逮。我们论证了 engression 相对于现有方法在外推方面的优势,表明它能够持续带来有意义的改进。来自模拟数据与真实数据的实证结果验证了这些发现,凸显了 engression 方法的有效性。engression 的软件实现已同时提供 R 和 Python 版本。
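A minimal engression-style sketch under stated assumptions: a pre-additive noise model g(X + eps) trained with an energy-score objective, where two model samples per input let the loss reward both fit and predictive spread. Noise scale, architecture, and constants are illustrative; the authors' R/Python packages are the reference implementation.

```python
import torch
from torch import nn

g = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(g.parameters(), lr=1e-3)

def engression_loss(x, y, noise_sd=1.0):
    e1 = torch.randn_like(x) * noise_sd      # pre-additive noise
    e2 = torch.randn_like(x) * noise_sd
    s1, s2 = g(x + e1), g(x + e2)            # two model samples per input
    fit = 0.5 * ((s1 - y).abs() + (s2 - y).abs()).mean()
    spread = (s1 - s2).abs().mean()
    return fit - 0.5 * spread                # energy score, up to constants

x = torch.rand(512, 1) * 2 - 1               # training support: [-1, 1]
y = x ** 3 + 0.1 * torch.randn_like(x)
for _ in range(2000):
    opt.zero_grad()
    engression_loss(x, y).backward()
    opt.step()
# Query g at inputs outside [-1, 1] to probe extrapolation behaviour.
```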

Model-Assisted Probabilistic Safe Adaptive Control With Meta-Bayesian Learning

  • paper_url: http://arxiv.org/abs/2307.00828
  • repo_url: None
  • paper_authors: Shengbo Wang, Ke Li, Yin Yang, Yuting Cao, Tingwen Huang, Shiping Wen
  • for: 本研究旨在开发一种可靠的自适应安全控制框架,以满足控制系统对安全性和可靠性的需求。
  • methods: 本研究结合元学习、贝叶斯模型与控制屏障函数(CBF)方法来学习内在与外在不确定性。具体而言,在 CBF 方法的帮助下,通过一个统一的自适应贝叶斯线性回归(ABLR)模型学习不确定性,该模型由一个前馈神经网络(NN)和一个贝叶斯输出层组成。
  • results: 借助历史相似任务的数据,该算法只需少量样本即可快速适应新的控制任务,并在多个不确定约束下实现安全探索。结果表明,该算法显著提升了基于贝叶斯模型的 CBF 方法的性能,并能高效地适应不同的控制任务。
    Abstract Breaking safety constraints in control systems can lead to potential risks, resulting in unexpected costs or catastrophic damage. Nevertheless, uncertainty is ubiquitous, even among similar tasks. In this paper, we develop a novel adaptive safe control framework that integrates meta learning, Bayesian models, and control barrier function (CBF) method. Specifically, with the help of CBF method, we learn the inherent and external uncertainties by a unified adaptive Bayesian linear regression (ABLR) model, which consists of a forward neural network (NN) and a Bayesian output layer. Meta learning techniques are leveraged to pre-train the NN weights and priors of the ABLR model using data collected from historical similar tasks. For a new control task, we refine the meta-learned models using a few samples, and introduce pessimistic confidence bounds into CBF constraints to ensure safe control. Moreover, we provide theoretical criteria to guarantee probabilistic safety during the control processes. To validate our approach, we conduct comparative experiments in various obstacle avoidance scenarios. The results demonstrate that our algorithm significantly improves the Bayesian model-based CBF method, and is capable for efficient safe exploration even with multiple uncertain constraints.
    摘要 违反控制系统中的安全约束可能带来潜在风险,造成意外代价甚至灾难性损害。然而,即便在相似任务之间,不确定性也无处不在。本文提出一种新颖的自适应安全控制框架,融合了元学习、贝叶斯模型与控制屏障函数(CBF)方法。具体而言,在 CBF 方法的帮助下,我们通过一个统一的自适应贝叶斯线性回归(ABLR)模型学习内在与外在不确定性,该模型由一个前馈神经网络(NN)和一个贝叶斯输出层组成。我们利用元学习技术,基于历史相似任务采集的数据预训练 NN 的权重和 ABLR 模型的先验。对于新的控制任务,我们只需少量样本即可细化元学习得到的模型,并在 CBF 约束中引入悲观置信界以确保安全控制。此外,我们给出了在控制过程中保证概率安全性的理论准则。为验证该方法,我们在多种避障场景中开展了对比实验。结果表明,我们的算法显著改进了基于贝叶斯模型的 CBF 方法,即使存在多个不确定约束也能实现高效的安全探索。
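The ABLR building block admits a compact closed-form sketch: Bayesian linear regression on fixed neural features, whose Gaussian posterior yields a pessimistic lower confidence bound that can be plugged into a CBF constraint. The kappa scaling and the fixed-feature assumption are illustrative simplifications, not the paper's full algorithm.

```python
import numpy as np

class BayesianLinearHead:
    """Bayesian linear regression on fixed neural features phi(x), with a
    pessimistic lower confidence bound usable inside a CBF constraint."""
    def __init__(self, dim, noise_var=0.1, prior_var=1.0):
        self.S_inv = np.eye(dim) / prior_var   # posterior precision
        self.b = np.zeros(dim)
        self.noise_var = noise_var

    def update(self, Phi, y):
        """Condition on a batch: Phi is (n, dim) features, y is (n,)."""
        self.S_inv += Phi.T @ Phi / self.noise_var
        self.b += Phi.T @ y / self.noise_var

    def predict(self, phi):
        S = np.linalg.inv(self.S_inv)
        mean = phi @ S @ self.b
        var = self.noise_var + phi @ S @ phi
        return mean, var

    def pessimistic_bound(self, phi, kappa=2.0):
        mean, var = self.predict(phi)
        return mean - kappa * np.sqrt(var)     # conservative estimate of h(x)
```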

Robust Surgical Tools Detection in Endoscopic Videos with Noisy Data

  • paper_url: http://arxiv.org/abs/2307.01232
  • repo_url: None
  • paper_authors: Adnan Qayyum, Hassan Ali, Massimo Caputo, Hunaid Vohra, Taofeek Akinosho, Sofiat Abioye, Ilhem Berrou, Paweł Capik, Junaid Qadir, Muhammad Bilal
  • for: 本研究旨在提出一套系统化的方法论,用于在含噪数据条件下开发稳健的手术工具检测模型。
  • methods: 本研究采用两项关键创新:其一,提出了一种智能主动学习策略,用于确定最小数据集并由人类专家进行标签更正;其二,提出了一种基于学生-教师模型自训练框架的组装策略,以半监督方式实现对 14 种手术工具的稳健分类。此外,我们还使用加权数据加载器来处理困难的类别标签和类别不均衡问题。
  • results: 根据我们的实验结果,提出的方法可以在含有噪声数据的情况下实现85.88%的平均F1分数,而无类别权重的情况下可以达到80.88%的平均F1分数。此外,我们的提出方法也可以有效地超越现有的方法,这有效地证明了其效果。
    Abstract Over the past few years, surgical data science has attracted substantial interest from the machine learning (ML) community. Various studies have demonstrated the efficacy of emerging ML techniques in analysing surgical data, particularly recordings of procedures, for digitizing clinical and non-clinical functions like preoperative planning, context-aware decision-making, and operating skill assessment. However, this field is still in its infancy and lacks representative, well-annotated datasets for training robust models in intermediate ML tasks. Also, existing datasets suffer from inaccurate labels, hindering the development of reliable models. In this paper, we propose a systematic methodology for developing robust models for surgical tool detection using noisy data. Our methodology introduces two key innovations: (1) an intelligent active learning strategy for minimal dataset identification and label correction by human experts; and (2) an assembling strategy for a student-teacher model-based self-training framework to achieve the robust classification of 14 surgical tools in a semi-supervised fashion. Furthermore, we employ weighted data loaders to handle difficult class labels and address class imbalance issues. The proposed methodology achieves an average F1-score of 85.88\% for the ensemble model-based self-training with class weights, and 80.88\% without class weights for noisy labels. Also, our proposed method significantly outperforms existing approaches, which effectively demonstrates its effectiveness.
    摘要 过去几年,手术数据科学吸引了机器学习(ML)社区的广泛关注。多项研究表明,新兴 ML 技术在分析手术数据(尤其是手术过程的录像)方面行之有效,可用于数字化临床与非临床功能,例如术前规划、情境感知决策和手术技能评估。然而,这一领域仍处于起步阶段,缺乏具有代表性、标注完善的数据集来为中间 ML 任务训练稳健模型;此外,现有数据集的标签不准确,阻碍了可靠模型的开发。在本文中,我们提出了一套利用含噪数据开发稳健手术工具检测模型的系统化方法。该方法引入两项关键创新:1)一种智能主动学习策略,用于确定最小数据集并由人类专家进行标签更正;2)一种基于学生-教师模型的自训练框架的组装策略,以半监督方式实现对 14 种手术工具的稳健分类。此外,我们使用加权数据加载器来处理困难的类别标签并解决类别不均衡问题。所提方法在带类别权重的集成模型自训练下取得了 85.88% 的平均 F1 分数,在含噪标签且不带类别权重时为 80.88%。同时,该方法显著优于现有方法,有效地证明了其有效性。
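A hedged sketch of the student-teacher self-training loop the paper describes (the confidence threshold and EMA teacher below are common practice, not confirmed specifics of the paper): the teacher pseudo-labels confident unlabeled frames, the student trains on them, and the teacher tracks the student by exponential moving average.

```python
import torch
import torch.nn.functional as F

def self_training_step(student, teacher, x_unlabeled, optimizer,
                       threshold=0.9, ema=0.999):
    """One semi-supervised step on a batch of unlabeled frames."""
    with torch.no_grad():
        probs = F.softmax(teacher(x_unlabeled), dim=1)
        conf, pseudo = probs.max(dim=1)
        keep = conf > threshold                 # confident pseudo-labels only
    if keep.any():
        loss = F.cross_entropy(student(x_unlabeled[keep]), pseudo[keep])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():                       # EMA update of the teacher
        for pt, ps in zip(teacher.parameters(), student.parameters()):
            pt.mul_(ema).add_(ps, alpha=1 - ema)
```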

Analysis of Task Transferability in Large Pre-trained Classifiers

  • paper_url: http://arxiv.org/abs/2307.00823
  • repo_url: https://github.com/akshaymehra24/tasktransferanalysis
  • paper_authors: Akshay Mehra, Yunbei Zhang, Jihun Hamm
  • for: 本研究旨在分析将知识从源任务传播到多个下游任务中的性能,特别是使用大型预训练模型时,传播性能如何提高。
  • methods: 本研究使用一种名为Task Transfer Analysis的方法,将源任务的分布和分类器变换成一个新的源任务分布,并将源任务的损失与下游任务的损失相关联。
  • results: 本研究通过大规模的实验研究,发现了各种因素如任务相关性、预训练方法和模型结构对传播性能的影响。
    Abstract Transfer learning transfers the knowledge acquired by a model from a source task to multiple downstream target tasks with minimal fine-tuning. The success of transfer learning at improving performance, especially with the use of large pre-trained models has made transfer learning an essential tool in the machine learning toolbox. However, the conditions under which the performance is transferable to downstream tasks are not understood very well. In this work, we analyze the transfer of performance for classification tasks, when only the last linear layer of the source model is fine-tuned on the target task. We propose a novel Task Transfer Analysis approach that transforms the source distribution (and classifier) by changing the class prior distribution, label, and feature spaces to produce a new source distribution (and classifier) and allows us to relate the loss of the downstream task (i.e., transferability) to that of the source task. Concretely, our bound explains transferability in terms of the Wasserstein distance between the transformed source and downstream task's distribution, conditional entropy between the label distributions of the two tasks, and weighted loss of the source classifier on the source task. Moreover, we propose an optimization problem for learning the transforms of the source task to minimize the upper bound on transferability. We perform a large-scale empirical study by using state-of-the-art pre-trained models and demonstrate the effectiveness of our bound and optimization at predicting transferability. The results of our experiments demonstrate how factors such as task relatedness, pretraining method, and model architecture affect transferability.
    摘要 迁移学习将模型在源任务中获得的知识迁移到多个下游目标任务,仅需极少的微调。迁移学习在提升性能方面的成功,特别是结合大型预训练模型的使用,已使其成为机器学习工具箱中的一件重要工具。但是,性能可以迁移到下游任务的条件尚未得到很好的理解。在这项工作中,我们分析了分类任务下的性能迁移,其中仅对源模型的最后一个线性层在目标任务上进行微调。我们提出了一种新颖的 Task Transfer Analysis 方法,通过改变类先验分布、标签空间和特征空间来变换源分布(和分类器),生成新的源分布(和分类器),从而将下游任务的损失(即可迁移性)与源任务的损失关联起来。具体来说,我们的界将可迁移性表示为:变换后的源分布与下游任务分布之间的 Wasserstein 距离、两个任务标签分布之间的条件熵,以及源分类器在源任务上的加权损失。此外,我们提出了一个优化问题,用于学习源任务的变换,以最小化可迁移性的上界。我们利用最先进的预训练模型进行了大规模实证研究,证明了我们的界和优化方法在预测可迁移性方面的有效性。实验结果表明,任务相关性、预训练方法和模型结构等因素会影响可迁移性。

A Critical Re-evaluation of Benchmark Datasets for (Deep) Learning-Based Matching Algorithms

  • paper_url: http://arxiv.org/abs/2307.01231
  • repo_url: https://github.com/gpapadis/dlmatchers
  • paper_authors: George Papadakis, Nishadi Kirielle, Peter Christen, Themis Palpanas
  • for: 本研究旨在评估Established datasets的难度和适用性,以便更好地评估学习型匹配算法的性能。
  • methods: 本研究提出了四种方法来评估13个Established datasets的难度和适用性,包括两种理论方法和两种实践方法。
  • results: 研究发现,大多数流行的既有数据集所构成的分类任务相对容易,因此不适合评估基于学习的匹配算法。为此,本研究提出了一种生成基准数据集的新方法,并在实践中创建了四个新的匹配任务,验证了这些新基准更具挑战性,因而更适合推动该领域的进一步发展。
    Abstract Entity resolution (ER) is the process of identifying records that refer to the same entities within one or across multiple databases. Numerous techniques have been developed to tackle ER challenges over the years, with recent emphasis placed on machine and deep learning methods for the matching phase. However, the quality of the benchmark datasets typically used in the experimental evaluations of learning-based matching algorithms has not been examined in the literature. To cover this gap, we propose four different approaches to assessing the difficulty and appropriateness of 13 established datasets: two theoretical approaches, which involve new measures of linearity and existing measures of complexity, and two practical approaches: the difference between the best non-linear and linear matchers, as well as the difference between the best learning-based matcher and the perfect oracle. Our analysis demonstrates that most of the popular datasets pose rather easy classification tasks. As a result, they are not suitable for properly evaluating learning-based matching algorithms. To address this issue, we propose a new methodology for yielding benchmark datasets. We put it into practice by creating four new matching tasks, and we verify that these new benchmarks are more challenging and therefore more suitable for further advancements in the field.
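One of the two practical difficulty measures above is the gap between the best non-linear and linear matchers. A minimal sketch of that idea, with scikit-learn models standing in for the "best" matcher of each family:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

def linearity_gap(X, y, seed=0):
    """F1 gap between a non-linear and a linear matcher on pair features X
    with binary match labels y. A small gap suggests a nearly linear,
    hence easy, matching task."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)
    linear = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    nonlin = RandomForestClassifier(n_estimators=200, random_state=seed).fit(X_tr, y_tr)
    f1_lin = f1_score(y_te, linear.predict(X_te))
    f1_non = f1_score(y_te, nonlin.predict(X_te))
    return f1_non - f1_lin
```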

Large Language and Text-to-3D Models for Engineering Design Optimization

  • paper_url: http://arxiv.org/abs/2307.01230
  • repo_url: None
  • paper_authors: Thiago Rios, Stefan Menzel, Bernhard Sendhoff
  • for: This paper studies the potential of deep text-to-3D models in the engineering domain, particularly for computational simulation-based design optimization.
  • methods: The paper uses Shap-E, a recently released text-to-3D asset network, to build a fully automated evolutionary design optimization framework.
  • results: The study finds that prompts must keep the generated designs within the object class of the application, and that further research is needed to establish a causal relation between variations in the text prompts and the resulting variations of the 3D designs in order to improve the optimization.
    Abstract The current advances in generative AI for learning large neural network models with the capability to produce essays, images, music and even 3D assets from text prompts create opportunities for a manifold of disciplines. In the present paper, we study the potential of deep text-to-3D models in the engineering domain, with focus on the chances and challenges when integrating and interacting with 3D assets in computational simulation-based design optimization. In contrast to traditional design optimization of 3D geometries that often searches for the optimum designs using numerical representations, such as B-Spline surface or deformation parameters in vehicle aerodynamic optimization, natural language challenges the optimization framework by requiring a different interpretation of variation operators while at the same time may ease and motivate the human user interaction. Here, we propose and realize a fully automated evolutionary design optimization framework using Shap-E, a recently published text-to-3D asset network by OpenAI, in the context of aerodynamic vehicle optimization. For representing text prompts in the evolutionary optimization, we evaluate (a) a bag-of-words approach based on prompt templates and Wordnet samples, and (b) a tokenisation approach based on prompt templates and the byte pair encoding method from GPT4. Our main findings from the optimizations indicate that, first, it is important to ensure that the designs generated from prompts are within the object class of application, i.e. diverse and novel designs need to be realistic, and, second, that more research is required to develop methods where the strength of text prompt variations and the resulting variations of the 3D designs share causal relations to some degree to improve the optimization.
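To make the bag-of-words prompt representation concrete, here is a minimal sketch of template-based prompts with word slots that an evolutionary loop can mutate. The template and vocabulary are invented placeholders (the paper samples slot words from WordNet), and scoring of the resulting Shap-E geometry via simulation is omitted.

```python
import random

TEMPLATE = "a {adj} {style} car"          # hypothetical prompt template
VOCAB = {                                  # stand-in for WordNet samples
    "adj":   ["streamlined", "boxy", "compact", "futuristic"],
    "style": ["sports", "sedan", "wagon", "fastback"],
}

def random_genome(rng):
    """A genome is one vocabulary index per template slot."""
    return {slot: rng.randrange(len(words)) for slot, words in VOCAB.items()}

def mutate(genome, rng, p=0.5):
    """Variation operator: resample each slot word with probability p."""
    child = dict(genome)
    for slot in child:
        if rng.random() < p:
            child[slot] = rng.randrange(len(VOCAB[slot]))
    return child

def render(genome):
    return TEMPLATE.format(**{s: VOCAB[s][i] for s, i in genome.items()})

rng = random.Random(0)
parent = random_genome(rng)
print(render(parent), "->", render(mutate(parent, rng)))
```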

Monte Carlo Policy Gradient Method for Binary Optimization

  • paper_url: http://arxiv.org/abs/2307.00783
  • repo_url: https://github.com/optsuite/mcpg
  • paper_authors: Cheng Chen, Ruitao Chen, Tianyou Li, Ruichen Ao, Zaiwen Wen
  • for: This paper targets combinatorial optimization problems such as MaxCut, MIMO detection, and MaxSAT.
  • methods: The paper develops a novel probabilistic model that samples binary solutions according to a parameterized policy distribution.
  • results: The results show that the method provides near-optimal solutions with good convergence properties.
    Abstract Binary optimization has a wide range of applications in combinatorial optimization problems such as MaxCut, MIMO detection, and MaxSAT. However, these problems are typically NP-hard due to the binary constraints. We develop a novel probabilistic model to sample the binary solution according to a parameterized policy distribution. Specifically, minimizing the KL divergence between the parameterized policy distribution and the Gibbs distributions of the function value leads to a stochastic optimization problem whose policy gradient can be derived explicitly similar to reinforcement learning. For coherent exploration in discrete spaces, parallel Markov Chain Monte Carlo (MCMC) methods are employed to sample from the policy distribution with diversity and approximate the gradient efficiently. We further develop a filter scheme to replace the original objective function by the one with the local search technique to broaden the horizon of the function landscape. Convergence to stationary points in expectation of the policy gradient method is established based on the concentration inequality for MCMC. Numerical results show that this framework is very promising to provide near-optimal solutions for quite a few binary optimization problems.
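As a self-contained illustration of the core idea, the following sketch applies a REINFORCE-style policy gradient with a factorized Bernoulli policy to MaxCut. It omits the paper's parallel MCMC sampling and filter scheme, and the hyperparameters are arbitrary.

```python
import numpy as np

def maxcut_value(x, W):
    """Cut value of binary assignment x in {0,1}^n for weight matrix W."""
    s = 2 * x - 1                       # map to {-1, +1}
    return 0.25 * np.sum(W * (1 - np.outer(s, s)))

def policy_gradient_maxcut(W, iters=2000, batch=64, lr=0.1, seed=0):
    """Push a factorized Bernoulli policy toward high-cut assignments;
    independent parallel samples stand in for the paper's MCMC chains."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    theta = np.zeros(n)
    best, best_x = -np.inf, None
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-theta))
        xs = (rng.random((batch, n)) < p).astype(float)
        vals = np.array([maxcut_value(x, W) for x in xs])
        adv = vals - vals.mean()                    # baseline-subtracted reward
        # grad log p(x) = x - p per coordinate of a Bernoulli policy
        grad = (adv[:, None] * (xs - p)).mean(axis=0)
        theta += lr * grad                          # ascend the expected cut
        if vals.max() > best:
            best, best_x = vals.max(), xs[vals.argmax()]
    return best, best_x
```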

GA-DRL: Graph Neural Network-Augmented Deep Reinforcement Learning for DAG Task Scheduling over Dynamic Vehicular Clouds

  • paper_url: http://arxiv.org/abs/2307.00777
  • repo_url: None
  • paper_authors: Zhang Liu, Lianfen Huang, Zhibin Gao, Manman Luo, Seyyedali Hosseinalipour, Huaiyu Dai
  • for: This paper proposes a graph neural network-augmented deep reinforcement learning method for scheduling computation-intensive tasks over dynamic vehicular clouds (VCs).
  • methods: The method uses a multi-head graph attention network (GAT) that extracts DAG task features by jointly considering the predecessors and successors of each subtask, and introduces non-uniform DAG neighborhood sampling so that it generalizes to completely unseen DAG task topologies.
  • results: Simulating various DAG tasks under real-world vehicle movement traces, the authors find that GA-DRL outperforms existing benchmarks in terms of DAG task completion time.
    Abstract Vehicular clouds (VCs) are modern platforms for processing of computation-intensive tasks over vehicles. Such tasks are often represented as directed acyclic graphs (DAGs) consisting of interdependent vertices/subtasks and directed edges. In this paper, we propose a graph neural network-augmented deep reinforcement learning scheme (GA-DRL) for scheduling DAG tasks over dynamic VCs. In doing so, we first model the VC-assisted DAG task scheduling as a Markov decision process. We then adopt a multi-head graph attention network (GAT) to extract the features of DAG subtasks. Our developed GAT enables a two-way aggregation of the topological information in a DAG task by simultaneously considering predecessors and successors of each subtask. We further introduce non-uniform DAG neighborhood sampling through codifying the scheduling priority of different subtasks, which makes our developed GAT generalizable to completely unseen DAG task topologies. Finally, we augment GAT into a double deep Q-network learning module to conduct subtask-to-vehicle assignment according to the extracted features of subtasks, while considering the dynamics and heterogeneity of the vehicles in VCs. Through simulating various DAG tasks under real-world movement traces of vehicles, we demonstrate that GA-DRL outperforms existing benchmarks in terms of DAG task completion time.

Hierarchical Open-vocabulary Universal Image Segmentation

  • paper_url: http://arxiv.org/abs/2307.00764
  • repo_url: https://github.com/berkeley-hipie/hipie
  • paper_authors: Xudong Wang, Shufan Li, Konstantinos Kallidromitis, Yusuke Kato, Kazuki Kozuka, Trevor Darrell
  • for: This paper targets open-vocabulary image segmentation, aiming to partition an image into semantic regions based on arbitrary text descriptions.
  • methods: The method incorporates a hierarchical representation spanning different semantic levels into the learning process, together with a decoupled text-image fusion mechanism and separate representation learning modules for "things" and "stuff".
  • results: The resulting model, HIPIE, tackles hierarchical, open-vocabulary, and universal segmentation tasks within a unified framework. Evaluated on multiple datasets (e.g., ADE20K, COCO, Pascal-VOC Part, RefCOCO/RefCOCOg, ODinW, and SeginW), HIPIE achieves state-of-the-art results at various levels of image comprehension, including semantic-level (e.g., semantic segmentation), instance-level (e.g., panoptic/referring segmentation and object detection), and part-level (e.g., part/subpart segmentation) tasks.
    Abstract Open-vocabulary image segmentation aims to partition an image into semantic regions according to arbitrary text descriptions. However, complex visual scenes can be naturally decomposed into simpler parts and abstracted at multiple levels of granularity, introducing inherent segmentation ambiguity. Unlike existing methods that typically sidestep this ambiguity and treat it as an external factor, our approach actively incorporates a hierarchical representation encompassing different semantic-levels into the learning process. We propose a decoupled text-image fusion mechanism and representation learning modules for both "things" and "stuff". Additionally, we systematically examine the differences that exist in the textual and visual features between these types of categories. Our resulting model, named HIPIE, tackles HIerarchical, oPen-vocabulary, and unIvErsal segmentation tasks within a unified framework. Benchmarked on over 40 datasets, e.g., ADE20K, COCO, Pascal-VOC Part, RefCOCO/RefCOCOg, ODinW and SeginW, HIPIE achieves the state-of-the-art results at various levels of image comprehension, including semantic-level (e.g., semantic segmentation), instance-level (e.g., panoptic/referring segmentation and object detection), as well as part-level (e.g., part/subpart segmentation) tasks. Our code is released at https://github.com/berkeley-hipie/HIPIE.

EmoGen: Eliminating Subjective Bias in Emotional Music Generation

  • paper_url: http://arxiv.org/abs/2307.01229
  • repo_url: https://github.com/microsoft/muzic
  • paper_authors: Chenfei Kang, Peiling Lu, Botao Yu, Xu Tan, Wei Ye, Shikun Zhang, Jiang Bian
  • for: This study aims to generate music with desired emotions, improving emotional expression in automatic music generation.
  • methods: The study proposes EmoGen, an emotional music generation system built on emotion-related music attributes. It has two stages: emotion-to-attribute mapping with supervised clustering, and attribute-to-music generation with self-supervised learning. Both stages help improve music quality and emotion control accuracy.
  • results: EmoGen outperforms previous methods in both emotion control accuracy and music quality, with reported improvements of 15.6% and 22.4% respectively, demonstrating its advantage in generating emotionally expressive music.
    Abstract Music is used to convey emotions, and thus generating emotional music is important in automatic music generation. Previous work on emotional music generation directly uses annotated emotion labels as control signals, which suffers from subjective bias: different people may annotate different emotions on the same music, and one person may feel different emotions under different situations. Therefore, directly mapping emotion labels to music sequences in an end-to-end way would confuse the learning process and hinder the model from generating music with general emotions. In this paper, we propose EmoGen, an emotional music generation system that leverages a set of emotion-related music attributes as the bridge between emotion and music, and divides the generation into two stages: emotion-to-attribute mapping with supervised clustering, and attribute-to-music generation with self-supervised learning. Both stages are beneficial: in the first stage, the attribute values around the clustering center represent the general emotions of these samples, which help eliminate the impacts of the subjective bias of emotion labels; in the second stage, the generation is completely disentangled from emotion labels and thus free from the subjective bias. Both subjective and objective evaluations show that EmoGen outperforms previous methods on emotion control accuracy and music quality respectively, which demonstrate our superiority in generating emotional music. Music samples generated by EmoGen are available via this link:https://ai-muzic.github.io/emogen/, and the code is available at this link:https://github.com/microsoft/muzic/.
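A minimal sketch of what the first stage could look like: per-emotion clustering of attribute vectors, keeping a representative cluster center as the attribute profile for each emotion. The clustering choices here are illustrative, not the paper's exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def emotion_to_attributes(attrs, emotions, n_clusters=4, seed=0):
    """Stage-1 sketch: for each emotion label, cluster the attribute
    vectors of its training songs and keep the center of the largest
    cluster as the 'general' attribute profile for that emotion."""
    profiles = {}
    for e in np.unique(emotions):
        subset = attrs[emotions == e]
        km = KMeans(n_clusters=min(n_clusters, len(subset)),
                    n_init=10, random_state=seed).fit(subset)
        sizes = np.bincount(km.labels_)
        profiles[e] = km.cluster_centers_[sizes.argmax()]
    return profiles

# Usage: profiles = emotion_to_attributes(attr_matrix, emotion_labels)
# Stage 2 would condition a self-supervised music generator on profiles[e],
# never on the emotion label itself.
```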

Graph-level Anomaly Detection via Hierarchical Memory Networks

  • paper_url: http://arxiv.org/abs/2307.00755
  • repo_url: https://github.com/niuchx/himnet
  • paper_authors: Chaoxi Niu, Guansong Pang, Ling Chen
  • for: This study proposes a new graph-level anomaly detection method for identifying abnormal graphs within a graph set.
  • methods: The method uses a graph autoencoder network to learn both fine-grained and holistic normal patterns, organized into two hierarchical memory modules: a node-level memory module and a graph-level memory module.
  • results: On 16 real-world graph datasets, the method shows clear advantages in detecting both locally and globally anomalous graphs and is robust to anomaly contamination. Code is available at https://github.com/Niuchx/HimNet.
    Abstract Graph-level anomaly detection aims to identify abnormal graphs that exhibit deviant structures and node attributes compared to the majority in a graph set. One primary challenge is to learn normal patterns manifested in both fine-grained and holistic views of graphs for identifying graphs that are abnormal in part or in whole. To tackle this challenge, we propose a novel approach called Hierarchical Memory Networks (HimNet), which learns hierarchical memory modules -- node and graph memory modules -- via a graph autoencoder network architecture. The node-level memory module is trained to model fine-grained, internal graph interactions among nodes for detecting locally abnormal graphs, while the graph-level memory module is dedicated to the learning of holistic normal patterns for detecting globally abnormal graphs. The two modules are jointly optimized to detect both locally- and globally-anomalous graphs. Extensive empirical results on 16 real-world graph datasets from various domains show that i) HimNet significantly outperforms the state-of-art methods and ii) it is robust to anomaly contamination. Codes are available at: https://github.com/Niuchx/HimNet.
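The sketch below shows the generic shape of one memory level in such a model: embeddings are reconstructed as attention-weighted combinations of learned memory slots, and the reconstruction error serves as the anomaly signal. It is a simplified stand-in for HimNet's node- and graph-level modules, not their exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryModule(nn.Module):
    """One memory level: queries (node or graph embeddings) are
    reconstructed from learned normal-pattern memory slots; a large
    reconstruction error then signals an anomaly."""
    def __init__(self, n_slots: int, dim: int):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(n_slots, dim))

    def forward(self, z):                               # z: (batch, dim)
        attn = F.softmax(z @ self.memory.t(), dim=-1)   # (batch, n_slots)
        z_hat = attn @ self.memory                      # reconstruction
        score = ((z - z_hat) ** 2).mean(dim=-1)         # anomaly score
        return z_hat, score
```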

ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection

  • paper_url: http://arxiv.org/abs/2307.00754
  • repo_url: https://github.com/17000cyh/imdiffusion
  • paper_authors: Yuhang Chen, Chaoyun Zhang, Minghua Ma, Yudong Liu, Ruomeng Ding, Bowen Li, Shilin He, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang
  • for: This paper proposes a new anomaly detection method for multivariate time series that addresses the limitations of existing approaches.
  • methods: The paper combines time series imputation with diffusion models to achieve accurate and robust anomaly detection. The imputation leverages neighboring values in the series for precise modeling of temporal and inter-correlated dependencies, and the step-by-step denoised outputs produced during inference serve as useful signals for anomaly prediction.
  • results: Experiments show that ImDiffusion clearly outperforms existing methods in both detection accuracy and timeliness. In particular, after integration into Microsoft's production environment, its detection F1 score improved by 11.4%.
    Abstract Anomaly detection in multivariate time series data is of paramount importance for ensuring the efficient operation of large-scale systems across diverse domains. However, accurately detecting anomalies in such data poses significant challenges. Existing approaches, including forecasting and reconstruction-based methods, struggle to address these challenges effectively. To overcome these limitations, we propose a novel anomaly detection framework named ImDiffusion, which combines time series imputation and diffusion models to achieve accurate and robust anomaly detection. The imputation-based approach employed by ImDiffusion leverages the information from neighboring values in the time series, enabling precise modeling of temporal and inter-correlated dependencies, reducing uncertainty in the data, thereby enhancing the robustness of the anomaly detection process. ImDiffusion further leverages diffusion models as time series imputers to accurately capture complex dependencies. We leverage the step-by-step denoised outputs generated during the inference process to serve as valuable signals for anomaly prediction, resulting in improved accuracy and robustness of the detection process. We evaluate the performance of ImDiffusion via extensive experiments on benchmark datasets. The results demonstrate that our proposed framework significantly outperforms state-of-the-art approaches in terms of detection accuracy and timeliness. ImDiffusion is further integrated into the real production system at Microsoft, where we observe a remarkable 11.4% increase in detection F1 score compared to the legacy approach. To the best of our knowledge, ImDiffusion represents a pioneering approach that combines imputation-based techniques with time series anomaly detection, while introducing the novel use of diffusion models to the field.
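The following sketch illustrates imputation-based anomaly scoring in its simplest form: mask entries, impute them, and score each time step by its imputation error. The linear-interpolation imputer is a trivial stand-in for ImDiffusion's diffusion-model imputer, and the masking scheme is an invented placeholder.

```python
import numpy as np

def imputation_anomaly_scores(series, imputer, mask_frac=0.2, seed=0):
    """Mask random entries of a (T, D) multivariate series, fill them with
    `imputer` (any callable mapping a NaN-masked series to a completed
    one), and score each time step by its imputation error."""
    rng = np.random.default_rng(seed)
    mask = rng.random(series.shape) < mask_frac
    masked = np.where(mask, np.nan, series)
    completed = imputer(masked)
    err = np.where(mask, (completed - series) ** 2, 0.0)
    return err.sum(axis=1) / np.maximum(mask.sum(axis=1), 1)

def interp_imputer(masked):
    """Trivial stand-in imputer: per-channel linear interpolation."""
    out = masked.copy()
    t = np.arange(masked.shape[0])
    for d in range(masked.shape[1]):
        bad = np.isnan(out[:, d])
        out[bad, d] = np.interp(t[bad], t[~bad], out[~bad, d])
    return out
```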

Population Age Group Sensitivity for COVID-19 Infections with Deep Learning

  • paper_url: http://arxiv.org/abs/2307.00751
  • repo_url: None
  • paper_authors: Md Khairul Islam, Tyler Valentine, Royal Wang, Levi Davis, Matt Manner, Judy Fox
  • for: The goal of this study is to identify the age groups most influential on COVID-19 infection rates at the US county level.
  • methods: The study uses the Modified Morris Method and deep learning for time series. It first trains the state-of-the-art time-series model Temporal Fusion Transformer on different age groups, then performs feature sensitivity analysis across the age groups and ranks them by the Morris sensitivity score of each input feature.
  • results: The study finds that young adults aged 20-29 were the most influential age group in COVID-19 transmission. The findings are verified against ground-truth infection rates provided by the CDC and US Census, and can inform public health policies and interventions, such as targeted vaccination strategies, to better control the spread of the virus.
    Abstract The COVID-19 pandemic has created unprecedented challenges for governments and healthcare systems worldwide, highlighting the critical importance of understanding the factors that contribute to virus transmission. This study aimed to identify the most influential age groups in COVID-19 infection rates at the US county level using the Modified Morris Method and deep learning for time series. Our approach involved training the state-of-the-art time-series model Temporal Fusion Transformer on different age groups as a static feature and the population vaccination status as the dynamic feature. We analyzed the impact of those age groups on COVID-19 infection rates by perturbing individual input features and ranked them based on their Morris sensitivity scores, which quantify their contribution to COVID-19 transmission rates. The findings are verified using ground truth data from the CDC and US Census, which provide the true infection rates for each age group. The results suggest that young adults were the most influential age group in COVID-19 transmission at the county level between March 1, 2020, and November 27, 2021. Using these results can inform public health policies and interventions, such as targeted vaccination strategies, to better control the spread of the virus. Our approach demonstrates the utility of feature sensitivity analysis in identifying critical factors contributing to COVID-19 transmission and can be applied in other public health domains.
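A minimal sketch of the elementary-effects computation behind this kind of analysis, assuming a trained predictor `f` that maps feature rows to infection-rate predictions; the function name, perturbation scheme, and sampling are illustrative, and the Modified Morris Method used in the paper differs in detail.

```python
import numpy as np

def morris_sensitivity(f, x0, deltas, n_repeats=30, seed=0):
    """One-at-a-time elementary effects: perturb each input feature of a
    trained predictor f around sampled baseline rows x0 (N, F) and
    summarize each feature by its mean absolute effect (mu*)."""
    rng = np.random.default_rng(seed)
    n_features = x0.shape[1]
    mu_star = np.zeros(n_features)
    rows = rng.choice(len(x0), size=n_repeats, replace=True)
    for i in range(n_features):
        effects = []
        for r in rows:
            x = x0[r].copy()
            base = f(x[None, :])[0]
            x[i] += deltas[i]                        # perturb one feature
            effects.append((f(x[None, :])[0] - base) / deltas[i])
        mu_star[i] = np.mean(np.abs(effects))
    return mu_star  # rank features (e.g., age groups) by mu*
```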

ESGCN: Edge Squeeze Attention Graph Convolutional Network for Traffic Flow Forecasting

  • paper_url: http://arxiv.org/abs/2307.01227
  • repo_url: None
  • paper_authors: Sangrok Lee, Ha Young Kim
  • for: Traffic flow forecasting with improved prediction accuracy.
  • methods: Proposes the Edge Squeeze Graph Convolutional Network (ESGCN), which consists of a W-module and an ES module. Spatio-temporal relations are modeled with a Graph Convolutional Network (GCN); edge features are used to directly capture the spatio-temporal flow representation, together with an edge attention mechanism and a node contrastive loss as constraints.
  • results: Experiments show that ESGCN achieves state-of-the-art performance on four real-world datasets (PEMS03, 04, 07, and 08) at a low computational cost.
    Abstract Traffic forecasting is a highly challenging task owing to the dynamical spatio-temporal dependencies of traffic flows. To handle this, we focus on modeling the spatio-temporal dynamics and propose a network termed Edge Squeeze Graph Convolutional Network (ESGCN) to forecast traffic flow in multiple regions. ESGCN consists of two modules: W-module and ES module. W-module is a fully node-wise convolutional network. It encodes the time-series of each traffic region separately and decomposes the time-series at various scales to capture fine and coarse features. The ES module models the spatio-temporal dynamics using Graph Convolutional Network (GCN) and generates an Adaptive Adjacency Matrix (AAM) with temporal features. To improve the accuracy of AAM, we introduce three key concepts. 1) Using edge features to directly capture the spatiotemporal flow representation among regions. 2) Applying an edge attention mechanism to GCN to extract the AAM from the edge features. Here, the attention mechanism can effectively determine important spatio-temporal adjacency relations. 3) Proposing a novel node contrastive loss to suppress obstructed connections and emphasize related connections. Experimental results show that ESGCN achieves state-of-the-art performance by a large margin on four real-world datasets (PEMS03, 04, 07, and 08) with a low computational cost.
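A minimal sketch of the Adaptive Adjacency Matrix idea: score every directed region pair from edge features formed out of node embeddings, then normalize the scores into an adjacency matrix. This omits the W-module, the contrastive loss, and other ESGCN details; the module and layer choices are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveAdjacency(nn.Module):
    """Score each directed region pair from concatenated sender/receiver
    embeddings, then row-softmax into an Adaptive Adjacency Matrix."""
    def __init__(self, dim: int):
        super().__init__()
        self.edge_mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                      nn.Linear(dim, 1))

    def forward(self, h):                        # h: (n_regions, dim)
        n = h.size(0)
        src = h.unsqueeze(1).expand(n, n, -1)    # sender features
        dst = h.unsqueeze(0).expand(n, n, -1)    # receiver features
        scores = self.edge_mlp(torch.cat([src, dst], dim=-1)).squeeze(-1)
        return F.softmax(scores, dim=-1)         # (n, n) AAM
```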

vONTSS: vMF based semi-supervised neural topic modeling with optimal transport

  • paper_url: http://arxiv.org/abs/2307.01226
  • repo_url: None
  • paper_authors: Weijie Xu, Xiaoyu Jiang, Srinivasan H. Sengamedu, Francis Iannacci, Jinjin Zhao
  • for: This paper presents a semi-supervised neural topic modeling method, vONTSS, which aims to incorporate human knowledge into the topic modeling process.
  • methods: vONTSS uses von Mises-Fisher (vMF) based variational autoencoders and optimal transport to generate potential topics and optimize topic-keyword quality and topic classification.
  • results: Experiments show that vONTSS outperforms existing semi-supervised topic modeling methods in classification accuracy and diversity. Additionally, vONTSS in the unsupervised setting discovers highly clustered and coherent topics on benchmark datasets and is faster than recent NTMs while achieving similar classification performance.
    Abstract Recently, Neural Topic Models (NTM), inspired by variational autoencoders, have attracted a lot of research interest; however, these methods have limited applications in the real world due to the challenge of incorporating human knowledge. This work presents a semi-supervised neural topic modeling method, vONTSS, which uses von Mises-Fisher (vMF) based variational autoencoders and optimal transport. When a few keywords per topic are provided, vONTSS in the semi-supervised setting generates potential topics and optimizes topic-keyword quality and topic classification. Experiments show that vONTSS outperforms existing semi-supervised topic modeling methods in classification accuracy and diversity. vONTSS also supports unsupervised topic modeling. Quantitative and qualitative experiments show that vONTSS in the unsupervised setting outperforms recent NTMs on multiple aspects: vONTSS discovers highly clustered and coherent topics on benchmark datasets. It is also much faster than the state-of-the-art weakly supervised text classification method while achieving similar classification performance. We further prove the equivalence of optimal transport loss and cross-entropy loss at the global minimum.

UnLoc: A Universal Localization Method for Autonomous Vehicles using LiDAR, Radar and/or Camera Input

  • paper_url: http://arxiv.org/abs/2307.00741
  • repo_url: None
  • paper_authors: Muhammad Ibrahim, Naveed Akhtar, Saeed Anwar, Ajmal Mian
  • for: This study proposes a multi-sensor localization method for autonomous robots that addresses the shortcomings of existing approaches, which rely on a single input modality or require training several computational models for different sensor data.
  • methods: The study uses a novel neural network model named UnLoc that can process LiDAR, camera, and RADAR inputs jointly, and can operate with one or more input sensors on demand, improving the system's robustness and flexibility.
  • results: Through extensive evaluations on the Oxford Radar RobotCar, ApolloSouthBay, and Perth-WA datasets, the authors find that UnLoc localizes accurately and performs well across weather conditions and environments.
    Abstract Localization is a fundamental task in robotics for autonomous navigation. Existing localization methods rely on a single input data modality or train several computational models to process different modalities. This leads to stringent computational requirements and sub-optimal results that fail to capitalize on the complementary information in other data streams. This paper proposes UnLoc, a novel unified neural modeling approach for localization with multi-sensor input in all weather conditions. Our multi-stream network can handle LiDAR, Camera and RADAR inputs for localization on demand, i.e., it can work with one or more input sensors, making it robust to sensor failure. UnLoc uses 3D sparse convolutions and cylindrical partitioning of the space to process LiDAR frames and implements ResNet blocks with a slot attention-based feature filtering module for the Radar and image modalities. We introduce a unique learnable modality encoding scheme to distinguish between the input sensor data. Our method is extensively evaluated on Oxford Radar RobotCar, ApolloSouthBay and Perth-WA datasets. The results ascertain the efficacy of our technique.

On the choice of training data for machine learning of geostrophic mesoscale turbulence

  • paper_url: http://arxiv.org/abs/2307.00734
  • repo_url: None
  • paper_authors: F. E. Yan, J. Mak, Y. Wang
  • for: This paper concerns the application of data-driven methods to Earth system modeling, specifically eddy-mean interaction in rotating stratified turbulence, a problem relevant to ocean modeling.
  • methods: The paper uses data-driven methods to learn the eddy fluxes in rotating stratified turbulence and examines, with theoretical arguments and numerical evidence, which choice of training data yields skill and robustness.
  • results: The study finds that filtering the dynamically inert rotational component out of the eddy fluxes yields data-driven models with comparable or better skill and substantially improved robustness, and better positions them to capture physical processes hidden in the data.
    Abstract 'Data' plays a central role in data-driven methods, but is not often the subject of focus in investigations of machine learning algorithms as applied to Earth System Modeling related problems. Here we consider the case of eddy-mean interaction in rotating stratified turbulence in the presence of lateral boundaries, a problem of relevance to ocean modeling, where the eddy fluxes contain dynamically inert rotational components that are expected to contaminate the learning process. An often utilized choice in the literature is to learn from the divergence of the eddy fluxes. Here we provide theoretical arguments and numerical evidence that learning from the eddy fluxes with the rotational component appropriately filtered out results in models with comparable or better skill, but substantially improved robustness. If we simply want a data-driven model to have predictive skill then the choice of data choice and/or quality may not be critical, but we argue it is highly desirable and perhaps even necessary if we want to leverage data-driven methods to aid in discovering unknown or hidden physical processes within the data itself.
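For reference, the divergence of the eddy fluxes that the literature often trains on can be computed directly from gridded flux components. A short numpy sketch on a regular grid (the grid-spacing handling and axis convention are illustrative):

```python
import numpy as np

def flux_divergence(fx, fy, dx, dy):
    """Divergence of a 2-D eddy flux field (fx, fy) on a regular grid with
    x varying along axis 1 and y along axis 0. Learning from this quantity
    discards the dynamically inert rotational component that the paper
    argues contaminates learning from the raw fluxes."""
    dfx_dx = np.gradient(fx, dx, axis=1)
    dfy_dy = np.gradient(fy, dy, axis=0)
    return dfx_dx + dfy_dy
```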

Interpretability and Transparency-Driven Detection and Transformation of Textual Adversarial Examples (IT-DT)

  • paper_url: http://arxiv.org/abs/2307.01225
  • repo_url: None
  • paper_authors: Bushra Sabir, M. Ali Babar, Sharif Abuadbba
  • for: This paper proposes an Interpretability and Transparency-Driven Detection and Transformation (IT-DT) framework to address the vulnerability of transformer-based text classifiers such as BERT to adversarial examples.
  • methods: The framework uses techniques such as attention maps, integrated gradients, and model feedback for interpretability, helping to understand the basis of adversarial classifications during detection. In the transformation phase, IT-DT uses pre-trained embeddings and model feedback to generate suitable replacements that convert adversarial examples into non-adversarial counterparts.
  • results: Experiments show that IT-DT accurately detects and transforms adversarial examples, improving model resilience and trustworthiness. Involving human experts for review and feedback further makes decisions more stable and reliable, especially in complex scenarios.
    Abstract Transformer-based text classifiers like BERT, Roberta, T5, and GPT-3 have shown impressive performance in NLP. However, their vulnerability to adversarial examples poses a security risk. Existing defense methods lack interpretability, making it hard to understand adversarial classifications and identify model vulnerabilities. To address this, we propose the Interpretability and Transparency-Driven Detection and Transformation (IT-DT) framework. It focuses on interpretability and transparency in detecting and transforming textual adversarial examples. IT-DT utilizes techniques like attention maps, integrated gradients, and model feedback for interpretability during detection. This helps identify salient features and perturbed words contributing to adversarial classifications. In the transformation phase, IT-DT uses pre-trained embeddings and model feedback to generate optimal replacements for perturbed words. By finding suitable substitutions, we aim to convert adversarial examples into non-adversarial counterparts that align with the model's intended behavior while preserving the text's meaning. Transparency is emphasized through human expert involvement. Experts review and provide feedback on detection and transformation results, enhancing decision-making, especially in complex scenarios. The framework generates insights and threat intelligence empowering analysts to identify vulnerabilities and improve model robustness. Comprehensive experiments demonstrate the effectiveness of IT-DT in detecting and transforming adversarial examples. The approach enhances interpretability, provides transparency, and enables accurate identification and successful transformation of adversarial inputs. By combining technical analysis and human expertise, IT-DT significantly improves the resilience and trustworthiness of transformer-based text classifiers against adversarial attacks.

Neural Polytopes

  • paper_url: http://arxiv.org/abs/2307.00721
  • repo_url: https://github.com/zfurman56/polytopes
  • paper_authors: Koji Hashimoto, Tomoya Naito, Hisashi Naito
  • for: This paper studies how simple neural networks with ReLU activation generate polytopes as approximations of a unit sphere in various dimensions.
  • methods: The paper uses simple neural networks with ReLU activation to generate polytopes and studies the generalization obtained with other activation functions.
  • results: The study finds that the species of generated polytopes is regulated by the network architecture, such as the number of units and layers. For a variety of activation functions a generalization is obtained, called neural polytopes: a smooth analogue of polytopes exhibiting geometric duality.
    Abstract We find that simple neural networks with ReLU activation generate polytopes as an approximation of a unit sphere in various dimensions. The species of polytopes are regulated by the network architecture, such as the number of units and layers. For a variety of activation functions, generalization of polytopes is obtained, which we call neural polytopes. They are a smooth analogue of polytopes, exhibiting geometric duality. This finding initiates research of generative discrete geometry to approximate surfaces by machine learning.
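A quick way to see the phenomenon: fit a tiny ReLU network from the angle parameter to points on the unit circle. Because the network is piecewise linear in its input, the learned curve is a polygon. This is an illustrative experiment under that setup, not necessarily the paper's construction; architecture and hyperparameters are arbitrary.

```python
import torch
import torch.nn as nn

# Map the angle parameter to points on the unit circle. The ReLU network
# is piecewise linear in theta, so its learned image is a polygon; the
# width and depth control how many vertices can appear.
net = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

theta = torch.linspace(0.0, 2.0 * torch.pi, 256).unsqueeze(1)
target = torch.cat([torch.cos(theta), torch.sin(theta)], dim=1)

for _ in range(2000):
    opt.zero_grad()
    loss = ((net(theta) - target) ** 2).mean()
    loss.backward()
    opt.step()

polygon = net(theta).detach()   # piecewise-linear approximation of the circle
```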

Worth of knowledge in deep learning

  • paper_url: http://arxiv.org/abs/2307.00712
  • repo_url: https://github.com/woshixuhao/worth_of_knowledge
  • paper_authors: Hao Xu, Yuntian Chen, Dongxiao Zhang
  • for: This work investigates the role of knowledge in deep learning, with the aim of improving generalization ability and compliance with constraints.
  • methods: The work uses a framework inspired by interpretable machine learning, assessing the worth of knowledge through quantitative experiments and analyzing the complex relationship between data and knowledge.
  • results: The study finds that the worth of knowledge is affected by data volume and estimation range, with dependence, synergistic, and substitution effects. The model-agnostic framework applies to a variety of common network architectures, can improve the performance of informed machine learning, and helps identify improper prior knowledge.
    Abstract Knowledge constitutes the accumulated understanding and experience that humans use to gain insight into the world. In deep learning, prior knowledge is essential for mitigating shortcomings of data-driven models, such as data dependence, generalization ability, and compliance with constraints. To enable efficient evaluation of the worth of knowledge, we present a framework inspired by interpretable machine learning. Through quantitative experiments, we assess the influence of data volume and estimation range on the worth of knowledge. Our findings elucidate the complex relationship between data and knowledge, including dependence, synergistic, and substitution effects. Our model-agnostic framework can be applied to a variety of common network architectures, providing a comprehensive understanding of the role of prior knowledge in deep learning models. It can also be used to improve the performance of informed machine learning, as well as distinguish improper prior knowledge.

A physics-constrained machine learning method for mapping gapless land surface temperature

  • paper_url: http://arxiv.org/abs/2307.04817
  • repo_url: None
  • paper_authors: Jun Ma, Huanfeng Shen, Menghui Jiang, Liupeng Lin, Chunlei Meng, Chao Zeng, Huifang Li, Penghai Wu
  • for: This paper proposes a physics-constrained machine learning (PC-ML) model for gapless land surface temperature (LST) estimation, improving physical interpretability and extrapolation ability.
  • methods: The model combines a machine learning model with a physical mechanism model, incorporating physical constraints (PCs) into the ML model to enhance its interpretability and extrapolation ability.
  • results: Compared with a pure physical method and pure ML methods, the PC-LGBM model improves both the prediction accuracy and the physical interpretability of LST, and demonstrates good extrapolation ability for extreme weather cases. The approach provides accurate and physically meaningful gapless LST estimates and can accelerate research on land surface processes and data mining for geographical parameter estimation.
    Abstract More accurate, spatio-temporally, and physically consistent LST estimation has been a main interest in Earth system research. Developing physics-driven mechanism models and data-driven machine learning (ML) models are two major paradigms for gapless LST estimation, which have their respective advantages and disadvantages. In this paper, a physics-constrained ML model, which combines the strengths in the mechanism model and ML model, is proposed to generate gapless LST with physical meanings and high accuracy. The hybrid model employs ML as the primary architecture, under which the input variable physical constraints are incorporated to enhance the interpretability and extrapolation ability of the model. Specifically, the light gradient-boosting machine (LGBM) model, which uses only remote sensing data as input, serves as the pure ML model. Physical constraints (PCs) are coupled by further incorporating key Community Land Model (CLM) forcing data (cause) and CLM simulation data (effect) as inputs into the LGBM model. This integration forms the PC-LGBM model, which incorporates surface energy balance (SEB) constraints underlying the data in CLM-LST modeling within a biophysical framework. Compared with a pure physical method and pure ML methods, the PC-LGBM model improves the prediction accuracy and physical interpretability of LST. It also demonstrates a good extrapolation ability for the responses to extreme weather cases, suggesting that the PC-LGBM model enables not only empirical learning from data but also rationally derived from theory. The proposed method represents an innovative way to map accurate and physically interpretable gapless LST, and could provide insights to accelerate knowledge discovery in land surface processes and data mining in geographical parameter estimation.
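A minimal sketch of the input coupling described above, assuming per-sample arrays of remote-sensing features, CLM forcing variables ("cause"), and CLM-simulated LST ("effect") are already aligned; the function name and hyperparameters are placeholders, not the paper's settings.

```python
import numpy as np
import lightgbm as lgb

def fit_pc_lgbm(rs_feats, clm_forcing, clm_lst, obs_lst):
    """Stack the pure-ML remote-sensing inputs with CLM forcing and
    CLM-simulated LST so the gradient-boosted trees learn within the
    surface energy balance relations encoded in the CLM pairs."""
    X = np.column_stack([rs_feats, clm_forcing, clm_lst])
    model = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.05)
    model.fit(X, obs_lst)   # obs_lst: observed LST targets
    return model
```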

Classification of sleep stages from EEG, EOG and EMG signals by SSNet

  • paper_url: http://arxiv.org/abs/2307.05373
  • repo_url: None
  • paper_authors: Haifa Almutairi, Ghulam Mubashar Hassan, Amitava Datta
  • for: This study develops a deep learning model for sleep stage classification, to aid the diagnosis of sleep-related diseases such as Sleep Disorder Breathing (SDB).
  • methods: The study uses an end-to-end deep learning architecture named SSNet, which comprises two networks based on Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM). Both networks extract features from the combination of electrooculogram (EOG), electroencephalogram (EEG), and electromyogram (EMG) signals, each of which has distinct features that help classify sleep stages. The features produced by the two networks are concatenated and passed to a fully connected layer for classification.
  • results: Evaluated on two public datasets, the Sleep-EDF Expanded dataset and the ISRUC-Sleep dataset, the model achieves an accuracy of 96.36% and a Kappa coefficient of 93.40% for three-class sleep stage classification, and an accuracy of 96.57% and a Kappa coefficient of 83.05% for five-class classification. The model outperforms state-of-the-art sleep stage classification techniques.
    Abstract Classification of sleep stages plays an essential role in diagnosing sleep-related diseases including Sleep Disorder Breathing (SDB) disease. In this study, we propose an end-to-end deep learning architecture, named SSNet, which comprises of two deep learning networks based on Convolutional Neuron Networks (CNN) and Long Short Term Memory (LSTM). Both deep learning networks extract features from the combination of Electrooculogram (EOG), Electroencephalogram (EEG), and Electromyogram (EMG) signals, as each signal has distinct features that help in the classification of sleep stages. The features produced by the two-deep learning networks are concatenated to pass to the fully connected layer for the classification. The performance of our proposed model is evaluated by using two public datasets Sleep-EDF Expanded dataset and ISRUC-Sleep dataset. The accuracy and Kappa coefficient are 96.36% and 93.40% respectively, for classifying three classes of sleep stages using Sleep-EDF Expanded dataset. Whereas, the accuracy and Kappa coefficient are 96.57% and 83.05% respectively for five classes of sleep stages using Sleep-EDF Expanded dataset. Our model achieves the best performance in classifying sleep stages when compared with the state-of-the-art techniques.

Tools for Verifying Neural Models’ Training Data

  • paper_url: http://arxiv.org/abs/2307.00682
  • repo_url: None
  • paper_authors: Dami Choi, Yonadav Shavit, David Duvenaud
  • for: This paper aims to provide Proof-of-Training-Data protocols that let a Verifier check the provenance and properties of the data used to train a model.
  • methods: The paper proposes verification methods based on a verifiable pre-commitment to the random seed used in training, and on exploiting models' tendency to temporarily overfit to training data in order to detect whether a given data point was included in training.
  • results: Experiments show that the verification procedures catch a wide variety of attacks, including all known attacks from the Proof-of-Learning literature.
    Abstract It is important that consumers and regulators can verify the provenance of large neural models to evaluate their capabilities and risks. We introduce the concept of a "Proof-of-Training-Data": any protocol that allows a model trainer to convince a Verifier of the training data that produced a set of model weights. Such protocols could verify the amount and kind of data and compute used to train the model, including whether it was trained on specific harmful or beneficial data sources. We explore efficient verification strategies for Proof-of-Training-Data that are compatible with most current large-model training procedures. These include a method for the model-trainer to verifiably pre-commit to a random seed used in training, and a method that exploits models' tendency to temporarily overfit to training data in order to detect whether a given data-point was included in training. We show experimentally that our verification procedures can catch a wide variety of attacks, including all known attacks from the Proof-of-Learning literature.
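One simple way to realize a verifiable seed pre-commitment is a hash commitment: publish a digest of the seed before training and reveal the seed at audit time. The paper's full protocol involves more than this step; the sketch below shows only the commit/verify mechanics.

```python
import hashlib
import os

def commit_to_seed(seed: bytes) -> str:
    """Trainer publishes this digest before training starts."""
    return hashlib.sha256(seed).hexdigest()

def verify_seed(seed: bytes, commitment: str) -> bool:
    """Verifier checks the later-revealed seed against the commitment."""
    return hashlib.sha256(seed).hexdigest() == commitment

seed = os.urandom(32)        # high-entropy seed actually used for training
c = commit_to_seed(seed)     # published up front
assert verify_seed(seed, c)  # checked at audit time
```

Note that the committed value needs enough entropy (as with the 32 random bytes here); a bare hash of a small integer seed would be trivially brute-forceable.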

CLIMAX: An exploration of Classifier-Based Contrastive Explanations

  • paper_url: http://arxiv.org/abs/2307.00680
  • repo_url: https://github.com/niftynans/climax
  • paper_authors: Praharsh Nanavati, Ranjitha Prasad
  • for: This paper aims to explain the decision making of black-box machine learning models so that these models become more transparent, accountable, and understandable.
  • methods: The method is based on local classifiers, using label-aware surrogate data generation and influence subsampling to preserve model fidelity.
  • results: The authors compare the method with other LIME-based approaches and find that it achieves better consistency on several prediction tasks; it also generates contrastive explanations for black-box classifiers on textual and image datasets.
    Abstract Explainable AI is an evolving area that deals with understanding the decision making of machine learning models so that these models are more transparent, accountable, and understandable for humans. In particular, post-hoc model-agnostic interpretable AI techniques explain the decisions of a black-box ML model for a single instance locally, without the knowledge of the intrinsic nature of the ML model. Despite their simplicity and capability in providing valuable insights, existing approaches fail to deliver consistent and reliable explanations. Moreover, in the context of black-box classifiers, existing approaches justify the predicted class, but these methods do not ensure that the explanation scores strongly differ as compared to those of another class. In this work we propose a novel post-hoc model agnostic XAI technique that provides contrastive explanations justifying the classification of a black box classifier along with a reasoning as to why another class was not predicted. Our method, which we refer to as CLIMAX which is short for Contrastive Label-aware Influence-based Model Agnostic XAI, is based on local classifiers . In order to ensure model fidelity of the explainer, we require the perturbations to be such that it leads to a class-balanced surrogate dataset. Towards this, we employ a label-aware surrogate data generation method based on random oversampling and Gaussian Mixture Model sampling. Further, we propose influence subsampling in order to retaining effective samples and hence ensure sample complexity. We show that we achieve better consistency as compared to baselines such as LIME, BayLIME, and SLIME. We also depict results on textual and image based datasets, where we generate contrastive explanations for any black-box classification model where one is able to only query the class probabilities for an instance of interest.
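A minimal sketch of the label-aware surrogate generation step: fit a per-class Gaussian Mixture Model on the instance's neighborhood and sample equally from each class, so the local explainer trains on a class-balanced surrogate dataset. The component count is arbitrary, each class is assumed to have enough local samples, and the random-oversampling branch the paper also uses is omitted.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def label_aware_surrogates(X, y, n_samples_per_class, seed=0):
    """Fit a GMM per (predicted) class and sample equally from each,
    producing a class-balanced surrogate dataset for a local explainer."""
    rng = np.random.default_rng(seed)
    Xs, ys = [], []
    for c in np.unique(y):
        gmm = GaussianMixture(n_components=2, random_state=seed).fit(X[y == c])
        samples, _ = gmm.sample(n_samples_per_class)
        Xs.append(samples)
        ys.append(np.full(n_samples_per_class, c))
    perm = rng.permutation(sum(len(a) for a in Xs))
    return np.vstack(Xs)[perm], np.concatenate(ys)[perm]
```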

SDC-HSDD-NDSA: Structure Detecting Cluster by Hierarchical Secondary Directed Differential with Normalized Density and Self-Adaption

  • paper_url: http://arxiv.org/abs/2307.00677
  • repo_url: https://github.com/hao-b-shu/sdc-hsdd-ndsa
  • paper_authors: Hao Shu
  • for: To provide a clustering method that can detect structures within high-density regions, addressing the flaw of traditional density-based clustering methods, in which such structures cannot be detected.
  • methods: Uses secondary directed differential, hierarchy, normalized density, and a self-adaption coefficient to detect structures; the scheme is named SDC-HSDD-NDSA.
  • results: Running the algorithm on several datasets verifies its validity in structure detection, robustness to noise, and independence of granularity, and shows that it can outperform previous methods.
    Abstract Density-based clustering is perhaps the most popular class of clustering algorithms, since it can identify clusters of arbitrary shape as long as different (high-density) clusters are separated by low-density regions. However, the requirement that clusters be separated by low-density regions is not trivial, since a high-density region might contain different structures that should be clustered into different groups. Such situations expose the main flaw of all previous density-based clustering algorithms we know of: structures within a high-density cluster cannot be detected. Therefore, this paper aims to provide a density-based clustering scheme that not only has the abilities of previous ones but can also detect structures in a high-density region not separated by low-density ones. The algorithm employs secondary directed differential, hierarchy, normalized density, as well as a self-adaption coefficient, and is thus called Structure Detecting Cluster by Hierarchical Secondary Directed Differential with Normalized Density and Self-Adaption, SDC-HSDD-NDSA for short. To illustrate its effectiveness, we run the algorithm on several data sets. The results verify its validity in structure detection, robustness to noise, and independence of granularity, and demonstrate that it can outperform previous algorithms. The Python code of the paper can be found at https://github.com/Hao-B-Shu/SDC-HSDD-NDSA.

Pay Attention to the Atlas: Atlas-Guided Test-Time Adaptation Method for Robust 3D Medical Image Segmentation

  • paper_url: http://arxiv.org/abs/2307.00676
  • repo_url: None
  • paper_authors: Jingjie Guo, Weitong Zhang, Matthew Sinclair, Daniel Rueckert, Chen Chen
  • for: To improve the robustness and accuracy of 3D medical image segmentation, particularly under the distribution shifts that arise in medical imaging applications when imaging protocols vary across clinical sites and scanners.
  • methods: Proposes AdaAtlas, a novel atlas-guided test-time adaptation method that takes only a single unlabeled test sample as input and adapts the segmentation network by minimizing an atlas-based loss in the learned atlas space. Channel and spatial attention blocks can additionally be adapted at test time for better adaptability.
  • results: Extensive experiments on multiple datasets from different sites show that AdaAtlas with attention blocks adapted (AdaAtlas-Attention) achieves superior performance improvements, greatly outperforming other competitive test-time adaptation methods on 3D medical image segmentation tasks.
    Abstract Convolutional neural networks (CNNs) often suffer from poor performance when tested on target data that differs from the training (source) data distribution, particularly in medical imaging applications where variations in imaging protocols across different clinical sites and scanners lead to different imaging appearances. However, re-accessing source training data for unsupervised domain adaptation or labeling additional test data for model fine-tuning can be difficult due to privacy issues and high labeling costs, respectively. To solve this problem, we propose a novel atlas-guided test-time adaptation (TTA) method for robust 3D medical image segmentation, called AdaAtlas. AdaAtlas only takes one single unlabeled test sample as input and adapts the segmentation network by minimizing an atlas-based loss. Specifically, the network is adapted so that its prediction after registration is aligned with the learned atlas in the atlas space, which helps to reduce anatomical segmentation errors at test time. In addition, different from most existing TTA methods which restrict the adaptation to batch normalization blocks in the segmentation network only, we further exploit the use of channel and spatial attention blocks for improved adaptability at test time. Extensive experiments on multiple datasets from different sites show that AdaAtlas with attention blocks adapted (AdaAtlas-Attention) achieves superior performance improvements, greatly outperforming other competitive TTA methods.

ENN: A Neural Network with DCT-Adaptive Activation Functions

  • paper_url: http://arxiv.org/abs/2307.00673
  • repo_url: None
  • paper_authors: Marc Martinez-Gost, Ana Pérez-Neira, Miguel Ángel Lagunas
  • for: This paper investigates the expressiveness and adaptability of neural network activation functions.
  • methods: The paper proposes a non-linear activation function model based on the Discrete Cosine Transform (DCT), adapted via backpropagation during training. This parametrization keeps the number of trainable parameters low, suits gradient-based schemes, and adapts to different learning tasks.
  • results: Experiments show that the model adapts well to classification and regression tasks with high expressiveness, and in some scenarios improves on state-of-the-art accuracy by up to a 40% gap.
    Abstract The expressiveness of neural networks highly depends on the nature of the activation function, although these are usually assumed predefined and fixed during the training stage. In this paper we present Expressive Neural Network (ENN), a novel architecture in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT) and adapted using backpropagation during training. This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks. This is the first non-linear model for activation functions that relies on a signal processing perspective, providing high flexibility and expressiveness to the network. We contribute with insights in the explainability of the network at convergence by recovering the concept of bump, this is, the response of each activation function in the output space to provide insights. Finally, through exhaustive experiments we show that the model can adapt to classification and regression tasks. The performance of ENN outperforms state of the art benchmarks, providing up to a 40\% gap in accuracy in some scenarios.
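A sketch of what a DCT-parameterized activation can look like as a PyTorch module: the nonlinearity is a truncated cosine expansion whose coefficients are trained by backpropagation alongside the rest of the network. The tanh squashing into the cosine domain and the coefficient initialization are illustrative choices, not necessarily the paper's.

```python
import math
import torch
import torch.nn as nn

class DCTActivation(nn.Module):
    """Activation as a truncated DCT-style expansion with K trainable
    coefficients, learned jointly with the network weights."""
    def __init__(self, n_coeffs: int = 8):
        super().__init__()
        self.coeffs = nn.Parameter(torch.randn(n_coeffs) * 0.1)
        self.register_buffer("k", torch.arange(n_coeffs, dtype=torch.float))

    def forward(self, x):
        u = torch.tanh(x)                      # squash into [-1, 1]
        # cosine basis cos(pi * k * (u + 1) / 2), summed over k
        basis = torch.cos(math.pi * self.k * (u.unsqueeze(-1) + 1) / 2)
        return (basis * self.coeffs).sum(dim=-1)

# Usage: drop in wherever a fixed nonlinearity would go, e.g.
# nn.Sequential(nn.Linear(10, 32), DCTActivation(), nn.Linear(32, 1))
```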

Automatic MILP Solver Configuration By Learning Problem Similarities

  • paper_url: http://arxiv.org/abs/2307.00670
  • repo_url: https://github.com/scale-lab/MILPTune
  • paper_authors: Abdelrahman Hosny, Sherief Reda
  • for: This study aims to predict solver configuration parameters for unseen Mixed Integer Linear Program (MILP) instances that yield lower-cost solutions, without spending solving time on searching for and evaluating configurations.
  • methods: It uses deep metric learning to learn similarities between MILPs that correlate with their solutions' costs. At inference time, a new problem is projected into the learned metric space, and its configuration parameters are predicted from previously explored configurations of the nearest-neighbor instance in the learned embedding space.
  • results: Experimental results show that the method predicts configuration parameters that improve solution costs by up to 38%.
    Abstract A large number of real-world optimization problems can be formulated as Mixed Integer Linear Programs (MILP). MILP solvers expose numerous configuration parameters to control their internal algorithms. Solutions, and their associated costs or runtimes, are significantly affected by the choice of the configuration parameters, even when problem instances have the same number of decision variables and constraints. On one hand, using the default solver configuration leads to suboptimal solutions. On the other hand, searching and evaluating a large number of configurations for every problem instance is time-consuming and, in some cases, infeasible. In this study, we aim to predict configuration parameters for unseen problem instances that yield lower-cost solutions without the time overhead of searching-and-evaluating configurations at the solving time. Toward that goal, we first investigate the cost correlation of MILP problem instances that come from the same distribution when solved using different configurations. We show that instances that have similar costs using one solver configuration also have similar costs using another solver configuration in the same runtime environment. After that, we present a methodology based on Deep Metric Learning to learn MILP similarities that correlate with their final solutions' costs. At inference time, given a new problem instance, it is first projected into the learned metric space using the trained model, and configuration parameters are instantly predicted using previously-explored configurations from the nearest neighbor instance in the learned embedding space. Empirical results on real-world problem benchmarks show that our method predicts configuration parameters that improve solutions' costs by up to 38% compared to existing approaches.
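
    At inference time the method reduces to a nearest-neighbor lookup in the learned metric space. A sketch, where `encoder` is the trained deep-metric-learning model and the array shapes are hypothetical:

```python
import numpy as np

def predict_configuration(instance_features, train_embeddings, train_configs,
                          encoder):
    """Embed a new MILP instance and reuse the best-known solver
    configuration of its nearest neighbor among seen instances."""
    z = encoder(instance_features)                        # (d,) embedding
    dists = np.linalg.norm(train_embeddings - z, axis=1)  # (N,) distances
    nearest = int(np.argmin(dists))                       # closest seen instance
    return train_configs[nearest]                         # its tuned parameters
```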

Active Sensing with Predictive Coding and Uncertainty Minimization

  • paper_url: http://arxiv.org/abs/2307.00668
  • repo_url: None
  • paper_authors: Abdelrahman Sharafeldin, Nabil Imam, Hannah Choi
  • for: This work proposes an exploration procedure based on biologically inspired computations that can be applied to any exploration task in a task-independent, intrinsically driven manner.
  • methods: The procedure rests on two biologically inspired computations: predictive coding and uncertainty minimization.
  • results: On a maze-navigation task, the model discovers the underlying transition distribution and reconstructs the spatial features of the environment; on an active-vision task, it builds unsupervised representations that let the agent actively sample and efficiently categorize sensory scenes.
    Abstract We present an end-to-end procedure for embodied exploration based on two biologically inspired computations: predictive coding and uncertainty minimization. The procedure can be applied to any exploration setting in a task-independent and intrinsically driven manner. We first demonstrate our approach in a maze navigation task and show that our model is capable of discovering the underlying transition distribution and reconstructing the spatial features of the environment. Second, we apply our model to the more complex task of active vision, where an agent must actively sample its visual environment to gather information. We show that our model is able to build unsupervised representations that allow it to actively sample and efficiently categorize sensory scenes. We further show that using these representations as input for downstream classification leads to superior data efficiency and learning speed compared to other baselines, while also maintaining lower parameter complexity. Finally, the modularity of our model allows us to analyze its internal mechanisms and to draw insight into the interactions between perception and action during exploratory behavior.
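
    As a rough illustration of the uncertainty-minimization component, the sketch below scores candidate actions by the spread of a world model's predictive samples and selects the most informative one. The `world_model.predict` API and the variance-based proxy are assumptions; the paper couples this drive with predictive coding rather than a generic Monte Carlo estimate:

```python
import torch

def select_action(world_model, state, candidate_actions, n_samples=8):
    """Pick the action whose predicted observation is most uncertain,
    so that executing it is maximally informative."""
    scores = []
    for a in candidate_actions:
        dist = world_model.predict(state, a)        # predictive distribution
        samples = dist.sample((n_samples,))         # imagined observations
        scores.append(samples.var(dim=0).mean())    # uncertainty proxy
    return candidate_actions[int(torch.stack(scores).argmax())]
```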

Morse Neural Networks for Uncertainty Quantification

  • paper_url: http://arxiv.org/abs/2307.00667
  • repo_url: None
  • paper_authors: Benoit Dherin, Huiyi Hu, Jie Ren, Michael W. Dusenberry, Balaji Lakshminarayanan
  • for: The paper presents a new deep generative model called the Morse neural network, which is useful for uncertainty quantification and can be used for tasks such as OOD detection, anomaly detection, and continuous learning.
  • methods: The Morse neural network is fit with a KL-divergence loss, which yields five components: a generative density, an OOD detector, a calibration temperature, a generative sampler, and, in the supervised case, a distance-aware classifier.
  • results: The Morse neural network unifies many techniques in uncertainty quantification and has connections to support vector machines, kernel methods, and Morse theory in topology. It can be used on top of a pre-trained network to bring distance-aware calibration w.r.t. the training data.
    Abstract We introduce a new deep generative model useful for uncertainty quantification: the Morse neural network, which generalizes unnormalized Gaussian densities to have modes on high-dimensional submanifolds instead of just discrete points. Fitting the Morse neural network via a KL-divergence loss yields 1) an (unnormalized) generative density, 2) an OOD detector, 3) a calibration temperature, 4) a generative sampler, and, in the supervised case, 5) a distance-aware classifier. The Morse network can be used on top of a pre-trained network to bring distance-aware calibration w.r.t. the training data. Because of its versatility, the Morse neural network unifies many techniques: e.g., the Entropic Out-of-Distribution Detector of (Macêdo et al., 2021) in OOD detection, the one-class Deep Support Vector Description method of (Ruff et al., 2018) in anomaly detection, and the Contrastive One Class classifier in continuous learning (Sun et al., 2021). The Morse neural network has connections to support vector machines, kernel methods, and Morse theory in topology.
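
    Schematically, a Morse-style head turns a network output into an unnormalized density whose modes form a submanifold rather than isolated points. The distance-squared form below is an assumption made for illustration; the paper fits its model with a KL-divergence loss:

```python
import torch
import torch.nn as nn

class MorseHead(nn.Module):
    """Sketch: a network f induces an unnormalized density
    exp(-||f(x)||^2 / T) whose modes are the submanifold f(x) = 0."""
    def __init__(self, encoder, temperature=1.0):
        super().__init__()
        self.encoder = encoder        # any feature extractor f
        self.t = temperature          # calibration temperature

    def forward(self, x):
        d2 = self.encoder(x).pow(2).sum(dim=-1)   # squared distance to modes
        density = torch.exp(-d2 / self.t)         # unnormalized density
        return density, d2                        # d2 doubles as an OOD score
```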

Numerical Association Rule Mining: A Systematic Literature Review

  • paper_url: http://arxiv.org/abs/2307.00662
  • repo_url: None
  • paper_authors: Minakshi Kaushik, Rahul Sharma, Iztok Fister Jr., Dirk Draheim
  • for: This study aims to bridge the knowledge gap in numerical association rule mining through a systematic literature review of 1,140 scholarly articles published between 1996 and 2022.
  • methods: It surveys the diverse methods, algorithms, metrics, and datasets used in the field, including the various discretization approaches for handling numerical attributes.
  • results: Based on an in-depth evaluation of 68 selected papers, the review catalogs the existing methods, algorithms, metrics, and datasets, identifies open research issues and future possibilities, and additionally proposes a novel discretization measure that partitions numerical data in a way that aligns well with human perception of partitions.
    Abstract Numerical association rule mining is a widely used variant of the association rule mining technique, and it has been extensively used in discovering patterns and relationships in numerical data. Initially, researchers and scientists integrated numerical attributes into association rule mining using various discretization approaches; however, over time, a plethora of alternative methods have emerged in this field. Unfortunately, this increase in alternative methods has resulted in a significant knowledge gap in understanding the diverse techniques employed in numerical association rule mining -- this paper attempts to bridge this knowledge gap by conducting a comprehensive systematic literature review. We provide an in-depth study of diverse methods, algorithms, metrics, and datasets derived from 1,140 scholarly articles published from the inception of numerical association rule mining in the year 1996 to 2022. In compliance with the inclusion, exclusion, and quality evaluation criteria, 68 papers were chosen to be extensively evaluated. To the best of our knowledge, this systematic literature review is the first of its kind to provide an exhaustive analysis of the current literature and previous surveys on numerical association rule mining. The paper discusses important research issues, the current status, and future possibilities of numerical association rule mining. On the basis of this systematic review, the article also presents a novel discretization measure that contributes a partitioning of numerical data that aligns well with human perception of partitions.

Intra- & Extra-Source Exemplar-Based Style Synthesis for Improved Domain Generalization

  • paper_url: http://arxiv.org/abs/2307.00648
  • repo_url: https://github.com/boschresearch/issa
  • paper_authors: Yumeng Li, Dan Zhang, Margret Keuper, Anna Khoreva
  • for: Improving the domain generalization of semantic segmentation models, which frequently face domain shifts in applications such as autonomous driving.
  • methods: An exemplar-based style synthesis pipeline built on a novel masked noise encoder for StyleGAN2 inversion, used for intra-source style augmentation (ISSA).
  • results: Up to 12.4% mIoU improvement on driving-scene semantic segmentation under different types of data shifts (changing geographic locations, adverse weather conditions, and day to night); the method is model-agnostic, applicable to both CNNs and Transformers, and complementary to other domain generalization techniques.
    Abstract The generalization with respect to domain shifts, as they frequently appear in applications such as autonomous driving, is one of the remaining big challenges for deep learning models. Therefore, we propose an exemplar-based style synthesis pipeline to improve domain generalization in semantic segmentation. Our method is based on a novel masked noise encoder for StyleGAN2 inversion. The model learns to faithfully reconstruct the image, preserving its semantic layout through noise prediction. Using the proposed masked noise encoder to randomize style and content combinations in the training set, i.e., intra-source style augmentation (ISSA), effectively increases the diversity of training data and reduces spurious correlation. As a result, we achieve up to 12.4% mIoU improvements on driving-scene semantic segmentation under different types of data shifts, i.e., changing geographic locations, adverse weather conditions, and day to night. ISSA is model-agnostic and straightforwardly applicable with CNNs and Transformers. It is also complementary to other domain generalization techniques, e.g., it improves the recent state-of-the-art solution RobustNet by 3% mIoU in Cityscapes to Dark Zürich. In addition, we demonstrate the strong plug-n-play ability of the proposed style synthesis pipeline, which is readily usable for extra-source exemplars, e.g., web-crawled images, without any retraining or fine-tuning. Moreover, we study a new use case for gauging a neural network's generalization capability by building a stylized proxy validation set. This application has significant practical relevance for selecting models to be deployed in the open-world environment. Our code is available at https://github.com/boschresearch/ISSA.
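
    A sketch of the intra-source style-mixing step: invert two training images with the masked noise encoder, then recombine coarse (content) and fine (style) latent codes before re-synthesis. The function signatures and the layer split at index 8 are hypothetical:

```python
import torch

def intra_source_style_mix(noise_encoder, generator, content_img, style_img):
    """Recombine content and style codes of two source-domain images
    to synthesize a new stylized training sample (sketch)."""
    w_content, noise_c = noise_encoder(content_img)  # layout-preserving codes
    w_style, _ = noise_encoder(style_img)            # appearance codes
    # coarse W+ layers keep the semantic layout, fine layers carry style
    w_mixed = torch.cat([w_content[:, :8], w_style[:, 8:]], dim=1)
    return generator(w_mixed, noise=noise_c)         # stylized sample
```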

Multiclass Boosting: Simple and Intuitive Weak Learning Criteria

  • paper_url: http://arxiv.org/abs/2307.00642
  • repo_url: None
  • paper_authors: Nataly Brukhim, Amit Daniely, Yishay Mansour, Shay Moran
  • for: This paper generalizes boosting to the multiclass setting.
  • methods: It introduces a weak learning condition for multiclass classification that captures the original notion of weak learnability as being "slightly better than random guessing", and gives a simple and efficient boosting algorithm that requires no realizability assumptions and whose sample and oracle complexity bounds are independent of the number of classes.
  • results: The new boosting technique is applied to several theoretical problems in List PAC learning: an equivalence to weak PAC learning, a new boosting result for list learners, and a novel proof of the characterization of multiclass PAC learning and List PAC learning, yielding a simplified analysis and an improved error bound for large list sizes compared to previous results.
    Abstract We study a generalization of boosting to the multiclass setting. We introduce a weak learning condition for multiclass classification that captures the original notion of weak learnability as being "slightly better than random guessing". We give a simple and efficient boosting algorithm, that does not require realizability assumptions and its sample and oracle complexity bounds are independent of the number of classes. In addition, we utilize our new boosting technique in several theoretical applications within the context of List PAC Learning. First, we establish an equivalence to weak PAC learning. Furthermore, we present a new result on boosting for list learners, as well as provide a novel proof for the characterization of multiclass PAC learning and List PAC learning. Notably, our technique gives rise to a simplified analysis, and also implies an improved error bound for large list sizes, compared to previous results.
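
    The weak-learning notion can be checked empirically: a multiclass weak learner should beat the 1/k random-guessing baseline by some edge gamma > 0. The helper below computes that edge on a sample; the paper's formal criterion is stated over distributions, not a single dataset:

```python
import numpy as np

def weak_learning_edge(preds, labels, k):
    """Empirical edge of a k-class weak learner over random guessing;
    a positive value illustrates 'slightly better than random'."""
    acc = float(np.mean(np.asarray(preds) == np.asarray(labels)))
    return acc - 1.0 / k
```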

Effects of Explanation Specificity on Passengers in Autonomous Driving

  • paper_url: http://arxiv.org/abs/2307.00633
  • repo_url: None
  • paper_authors: Daniel Omeiza, Raunak Bhattacharyya, Nick Hawes, Marina Jirotka, Lars Kunze
  • for: investigate the effects of natural language explanations’ specificity on passengers in autonomous driving
  • methods: extended an existing data-driven tree-based explainer algorithm by adding a rule-based option for explanation generation, and generated auditory natural language explanations with different levels of specificity (abstract and specific)
  • results: both abstract and specific explanations had similar positive effects on passengers’ perceived safety and the feeling of anxiety, but specific explanations influenced the desire of passengers to takeover driving control from the autonomous vehicle, while abstract explanations did not.
    Abstract The nature of explanations provided by an explainable AI algorithm has been a topic of interest in the explainable AI and human-computer interaction community. In this paper, we investigate the effects of natural language explanations' specificity on passengers in autonomous driving. We extended an existing data-driven tree-based explainer algorithm by adding a rule-based option for explanation generation. We generated auditory natural language explanations with different levels of specificity (abstract and specific) and tested these explanations in a within-subject user study (N=39) using an immersive physical driving simulation setup. Our results showed that both abstract and specific explanations had similar positive effects on passengers' perceived safety and the feeling of anxiety. However, the specific explanations influenced the desire of passengers to take over driving control from the autonomous vehicle (AV), while the abstract explanations did not. We conclude that natural language auditory explanations are useful for passengers in autonomous driving, and their specificity levels could influence how much in-vehicle participants would wish to be in control of the driving activity.

Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers

  • paper_url: http://arxiv.org/abs/2307.00631
  • repo_url: https://github.com/chernyn/admeta-optimizer
  • paper_authors: Yineng Chen, Zuchao Li, Lefei Zhang, Bo Du, Hai Zhao
  • for: Improving the training effectiveness and stability of deep learning models.
  • methods: Proposes a novel bidirectional-looking optimizer framework combining a DEMA (double exponential moving average) variant with a dynamic lookahead strategy.
  • results: Extensive experiments and theoretical proofs show that the proposed Admeta optimizers outperform their base optimizers and recent competitive optimizers across a range of tasks.
    Abstract The optimizer is an essential component for the success of deep learning; it guides the neural network to update its parameters according to the loss on the training set. SGD and Adam are two classical and effective optimizers on which researchers have proposed many variants, such as SGDM and RAdam. In this paper, we innovatively combine the backward-looking and forward-looking aspects of the optimizer algorithm and propose a novel Admeta (A Double exponential Moving averagE To Adaptive and non-adaptive momentum) optimizer framework. For the backward-looking part, we propose a DEMA variant scheme, motivated by a metric used in the stock market, to replace the common exponential moving average scheme. For the forward-looking part, we present a dynamic lookahead strategy that asymptotically approaches a set value, maintaining its speed in the early stages and high convergence performance in the final stage. Based on this idea, we provide two optimizer implementations, AdmetaR and AdmetaS, the former based on RAdam and the latter based on SGDM. Through extensive experiments on diverse tasks, we find that the proposed Admeta optimizer outperforms our base optimizers and shows advantages over recently proposed competitive optimizers. We also provide theoretical proofs for these two algorithms, which verify the convergence of our proposed Admeta.
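
    The backward-looking component can be sketched as a double exponential moving average: a second EMA layered on the first, echoing the stock-market DEMA indicator (2·EMA − EMA(EMA)) that reduces the lag of a single EMA. The coefficients and how Admeta folds this into the RAdam/SGDM updates are assumptions here:

```python
def dema_update(m_fast, m_slow, grad, beta1=0.9, beta2=0.99):
    """One DEMA momentum step (sketch): smooth the gradient, smooth the
    smoothing, and combine to cut the lag of plain momentum."""
    m_fast = beta1 * m_fast + (1 - beta1) * grad     # first-level EMA
    m_slow = beta2 * m_slow + (1 - beta2) * m_fast   # EMA of the EMA
    dema = 2 * m_fast - m_slow                       # lag-reduced momentum
    return m_fast, m_slow, dema
```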

Variational Autoencoding Molecular Graphs with Denoising Diffusion Probabilistic Model

  • paper_url: http://arxiv.org/abs/2307.00623
  • repo_url: None
  • paper_authors: Daiki Koge, Naoaki Ono, Shigehiko Kanaya
  • for: This paper proposes a deep generative modeling approach to designing molecular descriptors for data-driven drug discovery.
  • methods: It uses a denoising diffusion probabilistic model (DDPM) to give the probabilistic latent vectors of a molecular VAE a hierarchical structure.
  • results: Experiments on small datasets of physical properties and activity show that the method yields better prediction performance and robustness for molecular property prediction than existing approaches.
    Abstract In data-driven drug discovery, designing molecular descriptors is a very important task. Deep generative models such as variational autoencoders (VAEs) offer a potential solution by designing descriptors as probabilistic latent vectors derived from molecular structures. These models can be trained on large datasets that contain only molecular structures and then applied to transfer learning. Nevertheless, the approximate posterior distribution of the latent vectors of the usual VAE assumes a simple multivariate Gaussian distribution with zero covariance, which may limit its ability to represent the latent features. To overcome this limitation, we propose a novel molecular deep generative model that incorporates a hierarchical structure into the probabilistic latent vectors. We achieve this with a denoising diffusion probabilistic model (DDPM). We demonstrate through experiments on small datasets of physical properties and activity that our model can design effective molecular latent vectors for molecular property prediction. The results highlight the superior prediction performance and robustness of our model compared to existing approaches.
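
    In outline, the model is a molecular VAE whose latent prior is learned by a DDPM rather than fixed to a zero-covariance Gaussian. The wiring below is a sketch; the `diffusion_loss` API and the way the two losses are combined are placeholders:

```python
import torch
import torch.nn as nn

class DiffusionPriorVAE(nn.Module):
    """Sketch: VAE over molecular graphs with a DDPM modeling the
    distribution of the latent vectors."""
    def __init__(self, encoder, decoder, ddpm_prior):
        super().__init__()
        self.encoder, self.decoder, self.prior = encoder, decoder, ddpm_prior

    def forward(self, graph):
        mu, logvar = self.encoder(graph)              # amortized posterior
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparametrize
        recon = self.decoder(z)                       # reconstruct the molecule
        prior_loss = self.prior.diffusion_loss(z)     # DDPM denoising loss on z
        return recon, prior_loss
```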

Solving Linear Inverse Problems Provably via Posterior Sampling with Latent Diffusion Models

  • paper_url: http://arxiv.org/abs/2307.00619
  • repo_url: https://github.com/liturout/psld
  • paper_authors: Litu Rout, Negin Raoof, Giannis Daras, Constantine Caramanis, Alexandros G. Dimakis, Sanjay Shakkottai
  • for: solves linear inverse problems using pre-trained latent diffusion models.
  • methods: leverages pre-trained latent diffusion models and provides provable sample recovery in a linear model setting.
  • results: outperforms previously proposed posterior sampling algorithms in various inpainting, denoising, deblurring, destriping, and super-resolution tasks.
    Abstract We present the first framework to solve linear inverse problems leveraging pre-trained latent diffusion models. Previously proposed algorithms (such as DPS and DDRM) only apply to pixel-space diffusion models. We theoretically analyze our algorithm showing provable sample recovery in a linear model setting. The algorithmic insight obtained from our analysis extends to more general settings often considered in practice. Experimentally, we outperform previously proposed posterior sampling algorithms in a wide variety of problems including random inpainting, block inpainting, denoising, deblurring, destriping, and super-resolution.
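
    One plausible shape of the guided sampling step for a linear inverse problem y = A(x) + noise: take an unconditional reverse-diffusion step in latent space, then nudge the latent down the gradient of the measurement residual evaluated through the decoder. The `reverse_step` API, step size, and guidance term are assumptions; the paper's algorithm includes additional correction terms:

```python
import torch

def guided_latent_step(z_t, t, score_model, decoder, y, A, step_size=0.5):
    """One posterior-sampling step in latent space (sketch)."""
    z_t = z_t.detach().requires_grad_(True)
    z_prev = score_model.reverse_step(z_t, t)   # unconditional denoising step
    x_hat = decoder(z_prev)                     # map latent to image space
    residual = ((A(x_hat) - y) ** 2).sum()      # measurement consistency
    grad = torch.autograd.grad(residual, z_t)[0]
    return (z_prev - step_size * grad).detach() # likelihood-guided update
```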

Bounce: a Reliable Bayesian Optimization Algorithm for Combinatorial and Mixed Spaces

  • paper_url: http://arxiv.org/abs/2307.00618
  • repo_url: None
  • paper_authors: Leonard Papenmeier, Luigi Nardi, Matthias Poloczek
  • for: solve high-dimensional black-box functions with mixed and combinatorial input spaces
  • methods: uses a novel map of various variable types into nested embeddings of increasing dimensionality
  • results: reliably achieves and often improves upon state-of-the-art performance on a variety of high-dimensional problems.
    Abstract Impactful applications such as materials discovery, hardware design, neural architecture search, or portfolio optimization require optimizing high-dimensional black-box functions with mixed and combinatorial input spaces. While Bayesian optimization has recently made significant progress in solving such problems, an in-depth analysis reveals that the current state-of-the-art methods are not reliable. Their performance degrades substantially when the unknown optima of the function do not have a certain structure. To fill the need for a reliable algorithm for combinatorial and mixed spaces, this paper proposes Bounce, which relies on a novel map of various variable types into nested embeddings of increasing dimensionality. Comprehensive experiments show that Bounce reliably achieves and often even improves upon state-of-the-art performance on a variety of high-dimensional problems.
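
    To give a flavor of nested embeddings of increasing dimensionality: a point found in a low-dimensional embedding should remain representable when the algorithm moves to a higher-dimensional one. The random signed-assignment lift below is only an assumption; Bounce's actual map also handles categorical, ordinal, and mixed variable types:

```python
import numpy as np

def lift_to_nested_embedding(x_low, target_dim, rng):
    """Lift a point from a d-dimensional embedding into a target_dim-
    dimensional one: each new axis copies a signed low-dim coordinate,
    so the low-dimensional solution stays expressible (sketch)."""
    d = x_low.shape[0]
    assign = rng.integers(0, d, size=target_dim)      # source coordinate
    signs = rng.choice([-1.0, 1.0], size=target_dim)  # random sign flip
    return signs * x_low[assign]

# rng = np.random.default_rng(0); lift_to_nested_embedding(np.ones(4), 16, rng)
```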

The Forward-Forward Algorithm as a feature extractor for skin lesion classification: A preliminary study

  • paper_url: http://arxiv.org/abs/2307.00617
  • repo_url: None
  • paper_authors: Abel Reyes-Angulo, Sidike Paheding
  • for: Early detection of skin cancer, in order to improve diagnosis rates and treatment outcomes.
  • methods: Applies deep learning techniques, including convolutional neural networks and transformers, to skin lesion image classification.
  • results: Explores the Forward-Forward Algorithm (FFA), a neural network training scheme implementable on low-power analog hardware, and finds that combining FFA with traditional backpropagation (BP) can yield more accurate predictions.
    Abstract Skin cancer, a deadly form of cancer, exhibits a 23% survival rate in the USA when diagnosed late. Early detection can significantly increase the survival rate and facilitate timely treatment. Accurate biomedical image classification is vital in medical analysis, aiding clinicians in disease diagnosis and treatment. Deep learning (DL) techniques, such as convolutional neural networks and transformers, have revolutionized clinical decision-making automation. However, computational cost and hardware constraints limit the implementation of state-of-the-art DL architectures. In this work, we explore a new type of neural network that does not need backpropagation (BP), namely the Forward-Forward Algorithm (FFA), for skin lesion classification. While FFA is claimed to use very low-power analog hardware, BP still tends to be superior in terms of classification accuracy. In addition, our experimental results suggest that the combination of FFA and BP can be a better alternative for achieving more accurate predictions.
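
    For reference, the Forward-Forward Algorithm trains each layer locally: raise a "goodness" score (e.g., the sum of squared activations) on positive data and lower it on negative data, with no backward pass across layers. The per-layer step below follows Hinton's 2022 formulation; the threshold and loss form are conventional choices, not taken from this paper:

```python
import torch

def ff_layer_step(layer, opt, x_pos, x_neg, theta=2.0):
    """One Forward-Forward training step for a single layer."""
    g_pos = layer(x_pos).pow(2).sum(dim=1)   # goodness on positive samples
    g_neg = layer(x_neg).pow(2).sum(dim=1)   # goodness on negative samples
    # softplus losses push g_pos above and g_neg below the threshold theta
    loss = torch.nn.functional.softplus(-(g_pos - theta)).mean() \
         + torch.nn.functional.softplus(g_neg - theta).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    # detached (and, in the original recipe, length-normalized) outputs
    # become the inputs to the next layer
    return layer(x_pos).detach(), layer(x_neg).detach()
```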

Fraunhofer SIT at CheckThat! 2023: Mixing Single-Modal Classifiers to Estimate the Check-Worthiness of Multi-Modal Tweets

  • paper_url: http://arxiv.org/abs/2307.00610
  • repo_url: None
  • paper_authors: Raphael Frick, Inna Vogel
  • for: This work proposes a multimodal check-worthiness analysis method as a first step in the fact-checking pipeline for detecting false information and fake news in multimedia data shared on social media.
  • methods: The method combines two classifiers, each trained on a single modality; for image data, extracting the embedded text with an OCR analysis was shown to perform best.
  • results: The approach placed first in CheckThat! 2023 Task 1A with an F1 score of 0.7297 on the private test set.
    Abstract The option of sharing images, videos and audio files on social media opens up new possibilities for distinguishing between false information and fake news on the Internet. Due to the vast amount of data shared every second on social media, not all data can be verified by a computer or a human expert. Here, a check-worthiness analysis can be used as a first step in the fact-checking pipeline and as a filtering mechanism to improve efficiency. This paper proposes a novel way of detecting the check-worthiness in multi-modal tweets. It takes advantage of two classifiers, each trained on a single modality. For image data, extracting the embedded text with an OCR analysis has shown to perform best. By combining the two classifiers, the proposed solution was able to place first in the CheckThat! 2023 Task 1A with an F1 score of 0.7297 achieved on the private test set.
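
    A minimal sketch of the mixing step: score the tweet text with one classifier and the OCR-extracted image text with another, then fuse the probabilities. The weighted-average fusion rule and the sklearn-style `predict_proba` API are assumptions; the paper states only that the two single-modal classifiers are combined:

```python
def check_worthiness_score(tweet_text, image, text_clf, ocr, ocr_text_clf,
                           w_text=0.5):
    """Late fusion of two single-modal check-worthiness classifiers."""
    p_text = text_clf.predict_proba([tweet_text])[0, 1]   # text modality
    embedded_text = ocr(image)                 # text embedded in the image
    p_image = ocr_text_clf.predict_proba([embedded_text])[0, 1]
    return w_text * p_text + (1 - w_text) * p_image  # combined probability
```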