2023-12-02

cs.LG

cs.LG - 2023-12-02

A deep learning pipeline for cross-sectional and longitudinal multiview data integration

paper_url: http://arxiv.org/abs/2312.01238
repo_url: https://github.com/lasandrall/deepida-gru
paper_authors: Sarthak Jain, Sandra E. Safo
for:This paper aims to develop a pipeline for integrating cross-sectional and longitudinal data from multiple sources to better understand the pathobiology of complex diseases.methods:The pipeline uses a combination of statistical and deep learning methods, including variable selection/ranking, feature extraction, and joint integration and classification.results:The pipeline was applied to a case study of inflammatory bowel disease (IBD) and identified microbial pathways, metabolites, and genes that discriminate between IBD status, providing insights into the etiology of the disease. The authors also conducted simulations to compare the performance of different feature extraction methods.

Abstract
Biomedical research now commonly integrates diverse data types or views from the same individuals to better understand the pathobiology of complex diseases, but the challenge lies in meaningfully integrating these diverse views. Existing methods often require the same type of data from all views (cross-sectional data only or longitudinal data only) or do not consider any class outcome in the integration method, presenting limitations. To overcome these limitations, we have developed a pipeline that harnesses the power of statistical and deep learning methods to integrate cross-sectional and longitudinal data from multiple sources. Additionally, it identifies key variables contributing to the association between views and the separation among classes, providing deeper biological insights. This pipeline includes variable selection/ranking using linear and nonlinear methods, feature extraction using functional principal component analysis and Euler characteristics, and joint integration and classification using dense feed-forward networks and recurrent neural networks. We applied this pipeline to cross-sectional and longitudinal multi-omics data (metagenomics, transcriptomics, and metabolomics) from an inflammatory bowel disease (IBD) study and we identified microbial pathways, metabolites, and genes that discriminate by IBD status, providing information on the etiology of IBD. We conducted simulations to compare the two feature extraction methods. The proposed pipeline is available from the following GitHub repository: https://github.com/lasandrall/DeepIDA-GRU.

摘要
生物医学研究现通常将不同类型数据或视角集成一起，以更好地理解复杂疾病的生物学机理，但是这些集成受到限制。现有方法通常需要所有视角的数据都是同类型的（只有cross-sectional数据或只有 longitudinal数据），或者不考虑任何类别结果，这有限制。为了超越这些限制，我们开发了一个管道，利用统计学和深度学习方法集成cross-sectional和longitudinal数据从多个源。此外，它还可以识别视角之间和类别之间的关系，提供更深刻的生物学理解。这个管道包括变量选择/排名使用线性和非线性方法，功能 principales component analysis和欧拉特征分析，并使用稠密层 feed-forward networks和循环神经网络进行集成和分类。我们在一个inflammatory bowel disease（IBD）研究中应用了这个管道，并确定了IBD状态下的微生物道路、代谢物和基因，提供了IBD的etiology信息。我们进行了模拟，以比较这两种特征提取方法。提出的管道可以从以下GitHub存储库获取：https://github.com/lasandrall/DeepIDA-GRU。

Evetac: An Event-based Optical Tactile Sensor for Robotic Manipulation

paper_url: http://arxiv.org/abs/2312.01236
repo_url: None
paper_authors: Niklas Funk, Erik Helmut, Georgia Chalvatzaki, Roberto Calandra, Jan Peters
for: 此研究旨在开发一种基于事件的光学感知器，以提高感知器的时间分辨率。
methods: 研究人员使用了事件驱动的摄像机和新型的光学感知器 called Evetac，并开发了一种能够在1000Hz频率下处理触感测量的触感处理算法。
results: 研究人员在测试中发现，Evetac可以感测到频率达498Hz的振荡，重建Sheet的力，并相比RGB光学感知器，减少了数据量。此外，Evetac的输出和标记跟踪提供了有用的特征 для学习数据驱动的滑动检测和预测模型。

Abstract
Optical tactile sensors have recently become popular. They provide high spatial resolution, but struggle to offer fine temporal resolutions. To overcome this shortcoming, we study the idea of replacing the RGB camera with an event-based camera and introduce a new event-based optical tactile sensor called Evetac. Along with hardware design, we develop touch processing algorithms to process its measurements online at 1000 Hz. We devise an efficient algorithm to track the elastomer's deformation through the imprinted markers despite the sensor's sparse output. Benchmarking experiments demonstrate Evetac's capabilities of sensing vibrations up to 498 Hz, reconstructing shear forces, and significantly reducing data rates compared to RGB optical tactile sensors. Moreover, Evetac's output and the marker tracking provide meaningful features for learning data-driven slip detection and prediction models. The learned models form the basis for a robust and adaptive closed-loop grasp controller capable of handling a wide range of objects. We believe that fast and efficient event-based tactile sensors like Evetac will be essential for bringing human-like manipulation capabilities to robotics. The sensor design is open-sourced at https://sites.google.com/view/evetac .

摘要
“光学感觉传感器最近受欢迎。它们提供高度空间分辨率，但它们很难提供细腻的时间分辨率。为了解决这个缺陷，我们研究了将RGB摄像头换为事件驱动摄像头，并开发了一种新的事件驱动光学感觉传感器called Evetac。我们不仅设计了硬件，还开发了触摸处理算法，以实时处理Evetac的测量数据，频率为1000Hz。我们还提出了一种高效的算法，可以跟踪弹性体的变形，即使感测器的输出是稀疏的。我们的实验表明，Evetac可以感测到498Hz的振荡，重建剪切力，并比RGB光学感觉传感器减少数据量。此外，Evetac的输出和标记跟踪提供了有用的特征，可以用于学习数据驱动的滑动检测和预测模型。我们相信，快速和高效的事件驱动感觉传感器like Evetac将是Robotics中人类化抓取能力的关键。感测器的设计是开源的，可以在https://sites.google.com/view/evetac 中找到。”

Can We Learn Communication-Efficient Optimizers?

paper_url: http://arxiv.org/abs/2312.02204
repo_url: None
paper_authors: Charles-Étienne Joseph, Benjamin Thérien, Abhinav Moudgil, Boris Knyazev, Eugene Belilovsky
for: 本研究旨在investigate whether recent progress in learned optimizers can potentially close the gap between local SGD and state-of-the-art adaptive optimizers for deep learning, while maintaining communication efficiency.
methods: 本研究使用了meta-learning来学习如何在local SGD迭代中进行全局更新。
results: 研究结果表明，learned optimizers可以substantially outperform local SGD和其他复杂的变种，并且可以在未seen的大型 dataset和架构、以及未seen的模式（如语言模型）中进行泛化。

Abstract
Communication-efficient variants of SGD, specifically local SGD, have received a great deal of interest in recent years. These approaches compute multiple gradient steps locally, that is on each worker, before averaging model parameters, helping relieve the critical communication bottleneck in distributed deep learning training. Although many variants of these approaches have been proposed, they can sometimes lag behind state-of-the-art adaptive optimizers for deep learning. In this work, we investigate if the recent progress in the emerging area of learned optimizers can potentially close this gap while remaining communication-efficient. Specifically, we meta-learn how to perform global updates given an update from local SGD iterations. Our results demonstrate that learned optimizers can substantially outperform local SGD and its sophisticated variants while maintaining their communication efficiency. Learned optimizers can even generalize to unseen and much larger datasets and architectures, including ImageNet and ViTs, and to unseen modalities such as language modeling. We therefore demonstrate the potential of learned optimizers for improving communication-efficient distributed learning.

摘要
各种减少通信量的SGD变体，特别是本地SGD，在过去几年内得到了很大的关注。这些方法在每个工作节点上计算多个梯度步骤，然后在平均模型参数之前进行了LOCAL SGD迭代，帮助解决分布式深度学习训练中的关键通信瓶颈。虽然许多变体已经被提出，但它们有时会落后于适应型优化器。在这项工作中，我们 investigate 是否可以通过emerging learned optimizers来填补这个差距。我们通过meta-学习在local SGD迭代中进行全局更新来实现这一点。我们的结果表明，learned optimizers可以在保持通信效率的情况下，substantially outperform local SGD和其他复杂的变体。learned optimizers甚至可以在未看过和许多更大的数据集和架构，包括ImageNet和ViTs，以及未看过的模式，如语言模型化。因此，我们 демонстриated the potential of learned optimizers for improving communication-efficient distributed learning。

Learning High-Order Relationships of Brain Regions

paper_url: http://arxiv.org/abs/2312.02203
repo_url: None
paper_authors: Weikang Qiu, Huangrui Chu, Selena Wang, Haolan Zuo, Xiaoxiao Li, Yize Zhao, Rex Ying
for: 本研究旨在找出 fonctional magnetic resonance imaging (fMRI) 信号中可靠且有用的脑区之间交互关系，以便在 neuroscientific 预测中更好地理解脑的功能。
methods: 本研究使用了一种新的方法 named HyBRiD，该方法使用一个 Constructor 来标识 hyperedge 结构，并使用一个 Weighter 来计算每个 hyperedge 的权重。HyBRiD 通过一种创新的信息瓶颈框架 named multi-head drop-bottleneck 来实现 MIMR 目标，并具有理论保证。
results: 本研究的实验表明，HyBRiD 模型可以效果地提取 MIMR 高阶关系，并在评价标准协议 CPM 中超过当前最佳预测模型的平均提升率为 12.1%。

Abstract
Discovering reliable and informative interactions among brain regions from functional magnetic resonance imaging (fMRI) signals is essential in neuroscientific predictions of cognition. Most of the current methods fail to accurately characterize those interactions because they only focus on pairwise connections and overlook the high-order relationships of brain regions. We delve into this problem and argue that these high-order relationships should be maximally informative and minimally redundant (MIMR). However, identifying such high-order relationships is challenging and highly under-explored. Methods that can be tailored to our context are also non-existent. In response to this gap, we propose a novel method named HyBRiD that aims to extract MIMR high-order relationships from fMRI data. HyBRiD employs a Constructor to identify hyperedge structures, and a Weighter to compute a weight for each hyperedge. HyBRiD achieves the MIMR objective through an innovative information bottleneck framework named multi-head drop-bottleneck with theoretical guarantees. Our comprehensive experiments demonstrate the effectiveness of our model. Our model outperforms the state-of-the-art predictive model by an average of 12.1%, regarding the quality of hyperedges measured by CPM, a standard protocol for studying brain connections.

摘要
发现 brain 区域之间可靠且有用的交互关系是 neuroscientific 预测认知的关键。现有的方法多数只关注对应关系，而忽略 brain 区域之间的高阶关系。我们认为这些高阶关系应该是最 Informative 且最少重复的（MIMR）。然而，找到这些高阶关系是具有挑战性和不充分探索的。为了填补这个空白，我们提出了一种新方法名为 HyBRiD，该方法可以从 fMRI 数据中提取 MIMR 高阶关系。HyBRiD 使用一个名为 Constructor 的结构来认定 hyperedge 结构，并使用一个名为 Weighter 的计算器来计算每个 hyperedge 的权重。HyBRiD 通过一种创新的信息瓶颈框架，名为 multi-head drop-bottleneck，实现 MIMR 目标，并具有理论保证。我们的全面实验表明，我们的模型高效。我们的模型在 CPM 标准协议中衡量 hyperedge 质量时，平均高于现有的预测模型12.1%。

Distributed Bayesian Estimation in Sensor Networks: Consensus on Marginal Densities

paper_url: http://arxiv.org/abs/2312.01227
repo_url: None
paper_authors: Parth Paritosh, Nikolay Atanasov, Sonia Martinez
for: 本文旨在设计并分析分布式 bayesian 估计算法，用于感知网络。 addressed challenges 包括 (i) derivation of a distributed provably-correct algorithm in the functional space of probability distributions over continuous variables, and (ii) leveraging these results to obtain new distributed estimators restricted to subsets of variables observed by individual agents.
methods: 本文使用了数据来自非线性归一化函数的 bayesian 核算法，包括中央、分布式和 marginal distributed 设置。 after setting up a distributed estimation objective, the authors prove almost-sure convergence to the optimal set of pdfs at each agent.
results: 本文提出了一种基于 Gaussian 分布的 bayesian 核算法，并在一个映射问题中使用了 variational inference 处理非线性归一化函数相关的 LiDAR 探测数据。

Abstract
In this paper, we aim to design and analyze distributed Bayesian estimation algorithms for sensor networks. The challenges we address are to (i) derive a distributed provably-correct algorithm in the functional space of probability distributions over continuous variables, and (ii) leverage these results to obtain new distributed estimators restricted to subsets of variables observed by individual agents. This relates to applications such as cooperative localization and federated learning, where the data collected at any agent depends on a subset of all variables of interest. We present Bayesian density estimation algorithms using data from non-linear likelihoods at agents in centralized, distributed, and marginal distributed settings. After setting up a distributed estimation objective, we prove almost-sure convergence to the optimal set of pdfs at each agent. Then, we prove the same for a storage-aware algorithm estimating densities only over relevant variables at each agent. Finally, we present a Gaussian version of these algorithms and implement it in a mapping problem using variational inference to handle non-linear likelihood models associated with LiDAR sensing.

摘要
在这篇论文中，我们目标是设计和分析分布式 bayesian 估计算法 для感知网络。我们面临的挑战是（i） derivate一个分布式可靠的算法在连续变量的函数空间上，以及（ii）利用这些结果来获得新的分布式估计器，它们仅仅限制在每个代理机器上观察到的变量上。这与应用如协同地理位和联合学习相关，其中任何代理机器上收集的数据都依赖于所有变量的一个子集。我们提出了 bayesian 浊度估计算法，使用非线性概率函数在中央、分布式和部署式设置下进行估计。然后，我们证明了每个代理机器上的 almost-sure 收敛到优化的pdf集。接着，我们证明了一个具有存储限制的算法可以在每个代理机器上估计只有相关变量上的浊度。最后，我们提出了一个 Gaussian 版本的这些算法，并在一个映射问题中使用变量插值来处理雷达探测中的非线性概率函数。

When accurate prediction models yield harmful self-fulfilling prophecies

paper_url: http://arxiv.org/abs/2312.01210
repo_url: None
paper_authors: Wouter A. C. van Amsterdam, Nan van Geloven, Jesse H. Krijthe, Rajesh Ranganath, Giovanni Ciná
for: 这 paper 是关于医疗预测模型的研究，强调预测模型在医疗决策中的用途和危害。
methods: 这 paper 使用了一些数学和统计方法来分析预测模型的性能和应用。
results: 这 paper 发现，使用预测模型进行医疗决策可能会导致危害性的决策，即使预测模型在部署后仍然具有良好的预测能力。这些结果表明，需要修改现有的预测模型VALIDATION、部署和评估标准，以避免在医疗决策中导致危害。

Abstract
Prediction models are popular in medical research and practice. By predicting an outcome of interest for specific patients, these models may help inform difficult treatment decisions, and are often hailed as the poster children for personalized, data-driven healthcare. We show however, that using prediction models for decision making can lead to harmful decisions, even when the predictions exhibit good discrimination after deployment. These models are harmful self-fulfilling prophecies: their deployment harms a group of patients but the worse outcome of these patients does not invalidate the predictive power of the model. Our main result is a formal characterization of a set of such prediction models. Next we show that models that are well calibrated before and after deployment are useless for decision making as they made no change in the data distribution. These results point to the need to revise standard practices for validation, deployment and evaluation of prediction models that are used in medical decisions.

摘要
预测模型在医疗研究和实践中很受欢迎。这些模型可以预测特定患者的结果，帮助作出困难的治疗决策，并被视为个性化、数据驱动的医疗之星。然而，我们发现，使用预测模型做决策可能会导致有害的决策，即使这些模型在部署后表现出良好的预测能力。这些模型是有害的自我成就预测：它们在部署后对患者群体造成了伤害，但这些患者的更差结果不会推翻预测模型的预测力。我们的主要结论是对这类预测模型的正式Characterization。然后我们表明，在部署和评估过程中，这些模型的准备和部署是无用的，因为它们没有改变数据分布。这些结果表明需要修改预测模型的验证、部署和评估标准实践。

Short-term Precipitation Forecasting in The Netherlands: An Application of Convolutional LSTM neural networks to weather radar data

paper_url: http://arxiv.org/abs/2312.01197
repo_url: https://github.com/petrosdemetrakopoulos/lstm-radar-precipitation-forecast
paper_authors: Petros Demetrakopoulos
for: 预测短时间内降水的气象预报
methods: 应用Convolutional Long Short-Term Memory（ConvLSTM）神经网络，利用气象雷达数据从荷兰皇家气象所（KNMI）进行预测
results: 实现了高精度预测降水方向和强度的移动和变化

Abstract
This work addresses the challenge of short-term precipitation forecasting by applying Convolutional Long Short-Term Memory (ConvLSTM) neural networks to weather radar data from the Royal Netherlands Meteorological Institute (KNMI). The research exploits the combination of Convolutional Neural Networks (CNNs) layers for spatial pattern recognition and LSTM network layers for modelling temporal sequences, integrating these strengths into a ConvLSTM architecture. The model was trained and validated on weather radar data from the Netherlands. The model is an autoencoder consisting of nine layers, uniquely combining convolutional operations with LSTMs temporal processing, enabling it to capture the movement and intensity of precipitation systems. The training set comprised of sequences of radar images, with the model being tasked to predict precipitation patterns 1.5 hours ahead using the preceding data. Results indicate high accuracy in predicting the direction and intensity of precipitation movements. The findings of this study underscore the significant potential of ConvLSTM networks in meteorological forecasting, particularly in regions with complex weather patterns. It contributes to the field by offering a more accurate, data-driven approach to weather prediction, highlighting the broader applicability of ConvLSTM networks in meteorological tasks.

摘要

Exploring a Hybrid Deep Learning Framework to Automatically Discover Topic and Sentiment in COVID-19 Tweets

paper_url: http://arxiv.org/abs/2312.01178
repo_url: None
paper_authors: Khandaker Tayef Shahriar, Iqbal H. Sarker
for: 本研究旨在提供一种新的框架，以分析COVID-19 tweets中的话题情感。
methods: 我们提出了一种 combining BiLSTM和GRU结构的гибриidden深度学习模型，以实现情感分析。
results: 实验结果表明，我们的话题标签提取方法可以更好地提取话题标签，而我们提posed的 гибриidden深度学习模型在情感分析方面 achievied最高的准确率。

Abstract
COVID-19 has created a major public health problem worldwide and other problems such as economic crisis, unemployment, mental distress, etc. The pandemic is deadly in the world and involves many people not only with infection but also with problems, stress, wonder, fear, resentment, and hatred. Twitter is a highly influential social media platform and a significant source of health-related information, news, opinion and public sentiment where information is shared by both citizens and government sources. Therefore an effective analysis of COVID-19 tweets is essential for policymakers to make wise decisions. However, it is challenging to identify interesting and useful content from major streams of text to understand people's feelings about the important topics of the COVID-19 tweets. In this paper, we propose a new \textit{framework} for analyzing topic-based sentiments by extracting key topics with significant labels and classifying positive, negative, or neutral tweets on each topic to quickly find common topics of public opinion and COVID-19-related attitudes. While building our model, we take into account hybridization of BiLSTM and GRU structures for sentiment analysis to achieve our goal. The experimental results show that our topic identification method extracts better topic labels and the sentiment analysis approach using our proposed hybrid deep learning model achieves the highest accuracy compared to traditional models.

摘要
COVID-19 已经在全球造成了严重的公共健康问题，同时也导致了经济危机、失业、心理压力、等等问题。这场流感病毒是全球范围内的致命疾病，它不仅仅是感染人们的问题，还包括许多人们的问题、焦虑、恐慌、怒气和恨意。Twitter 是一个非常影响力的社交媒体平台，也是重要的健康信息、新闻、观点和公众情绪的来源，因此对 COVID-19 tweets 的有效分析是非常重要的。然而，从大量文本中提取有趣和有用的内容是一项挑战性的任务，以便理解人们对重要话题的感受和 COVID-19 相关的态度。在这篇论文中，我们提出了一种新的框架，用于分析话题基于的情感分析，包括提取重要话题和对每个话题的积极、消极或中性 tweets 进行分类，以快速找到公众意见和 COVID-19 相关的态度。在建立我们的模型时，我们考虑了 BiLSTM 和 GRU 结构的混合来进行情感分析，以达到我们的目标。实验结果表明，我们的话题标签提取方法可以更好地提取话题标签，而我们提posed的混合深度学习模型在情感分析方面实现了传统模型的最高准确率。

On-sensor Printed Machine Learning Classification via Bespoke ADC and Decision Tree Co-Design

paper_url: http://arxiv.org/abs/2312.01172
repo_url: None
paper_authors: Giorgos Armeniakos, Paula L. Duarte, Priyanjana Pal, Georgios Zervakis, Mehdi B. Tahoori, Dimitrios Soudris
for: 本研究旨在开发低成本的Printed Electronics（PE）技术，以实现个性化的计算设备。
methods: 本研究使用了自适应ADC和树状分类器的协同设计方法，以实现在感知器上自动化的计算。
results: 研究表明，使用自适应ADC和树状分类器可以实现在感知器上的自主运行，并且可以处理感知器输入数据。

Abstract
Printed electronics (PE) technology provides cost-effective hardware with unmet customization, due to their low non-recurring engineering and fabrication costs. PE exhibit features such as flexibility, stretchability, porosity, and conformality, which make them a prominent candidate for enabling ubiquitous computing. Still, the large feature sizes in PE limit the realization of complex printed circuits, such as machine learning classifiers, especially when processing sensor inputs is necessary, mainly due to the costly analog-to-digital converters (ADCs). To this end, we propose the design of fully customized ADCs and present, for the first time, a co-design framework for generating bespoke Decision Tree classifiers. Our comprehensive evaluation shows that our co-design enables self-powered operation of on-sensor printed classifiers in all benchmark cases.

摘要
印刷电子技术（PE）提供cost-effective硬件，具有未满足的定制化，因为它们的非回归工程和制造成本低。PE具有灵活性、可扩展性、透空性和适应性等特点，使其成为 ubique计算的优秀选择。然而，PE的大特征尺寸限制了复杂的印刷电路的实现，如机器学习分类器，特别是当处理感知输入是必要时，主要是因为costly analog-to-digital converters（ADCs）。为此，我们提议设计 Fully Customized ADCs，并提供了首次的共设计框架，用于生成特制的决策树分类器。我们的全面评估表明，我们的共设计可以实现在所有benchmark案例中的自主驱动运行。

Fast and Robust Sparsity-Aware Block Diagonal Representation

paper_url: http://arxiv.org/abs/2312.01137
repo_url: https://github.com/a-tastan/frs-bdr
paper_authors: Aylin Tastan, Michael Muma, Abdelhak M. Zoubir
for: 本研究旨在Addressing the challenges of recovering a block diagonal affinity matrix in real-world applications, where data may be subject to outliers and heavy-tailed noise.
methods: 本 paper proposes a Fast and Robust Sparsity-Aware Block Diagonal Representation (FRS-BDR) method, which jointly estimates cluster memberships and the number of blocks by reformulating the problem as a robust piece-wise linear fitting problem.
results: 对于多种实际应用，FRS-BDR 方法能够提高归一化精度、Robustness against corrupted features、计算时间和归一化个数性能。

Abstract
The block diagonal structure of an affinity matrix is a commonly desired property in cluster analysis because it represents clusters of feature vectors by non-zero coefficients that are concentrated in blocks. However, recovering a block diagonal affinity matrix is challenging in real-world applications, in which the data may be subject to outliers and heavy-tailed noise that obscure the hidden cluster structure. To address this issue, we first analyze the effect of different fundamental outlier types in graph-based cluster analysis. A key idea that simplifies the analysis is to introduce a vector that represents a block diagonal matrix as a piece-wise linear function of the similarity coefficients that form the affinity matrix. We reformulate the problem as a robust piece-wise linear fitting problem and propose a Fast and Robust Sparsity-Aware Block Diagonal Representation (FRS-BDR) method, which jointly estimates cluster memberships and the number of blocks. Comprehensive experiments on a variety of real-world applications demonstrate the effectiveness of FRS-BDR in terms of clustering accuracy, robustness against corrupted features, computation time and cluster enumeration performance.

摘要
“块对角结构是一种常见的需求在集群分析中，因为它表示特征向量的分布归一到块中。但在实际应用中，数据可能受到外围点和极大尾部噪声的影响，这会隐藏埋在数据中的块结构。为了解决这个问题，我们首先分析了不同类型的基本外围点对图像分析的影响。我们引入一个表示块对角矩阵为某种割辑函数的向量，然后将问题转化为一个Robust Piece-wise Linear Fitting（RPLF）问题。我们提出了一种快速和Robust Sparsity-Aware Block Diagonal Representation（FRS-BDR）方法，该方法同时估算集群成员和块数量。我们在多种实际应用上进行了广泛的实验，并证明了FRS-BDR在分组精度、噪声抗性、计算时间和集群枚举性方面的有效性。”Note that the translation is in Simplified Chinese, which is the standard writing system used in mainland China. If you prefer Traditional Chinese, I can provide that as well.

$t^3$-Variational Autoencoder: Learning Heavy-tailed Data with Student’s t and Power Divergence

paper_url: http://arxiv.org/abs/2312.01133
repo_url: None
paper_authors: Juno Kim, Jaehyuk Kwon, Mincheol Cho, Hyunjong Lee, Joong-Ho Won
for: 提高 VAE 模型的表现，特别是在处理低概率范围的数据上。
methods: 使用Student’s t-distribution来改进 VAE 模型，包括使用 t-distribution 作为先验分布、编码器和解码器。
results: $t^3$VAE 模型在含有重 tailed 数据的 Synthetic 数据上表现出色，并在 CelebA 和 imbalanced CIFAR-100 数据集上显示出 significiant 性能提升。

Abstract
The variational autoencoder (VAE) typically employs a standard normal prior as a regularizer for the probabilistic latent encoder. However, the Gaussian tail often decays too quickly to effectively accommodate the encoded points, failing to preserve crucial structures hidden in the data. In this paper, we explore the use of heavy-tailed models to combat over-regularization. Drawing upon insights from information geometry, we propose $t^3$VAE, a modified VAE framework that incorporates Student's t-distributions for the prior, encoder, and decoder. This results in a joint model distribution of a power form which we argue can better fit real-world datasets. We derive a new objective by reformulating the evidence lower bound as joint optimization of KL divergence between two statistical manifolds and replacing with $\gamma$-power divergence, a natural alternative for power families. $t^3$VAE demonstrates superior generation of low-density regions when trained on heavy-tailed synthetic data. Furthermore, we show that $t^3$VAE significantly outperforms other models on CelebA and imbalanced CIFAR-100 datasets.

摘要
通常，变分自动编码器（VAE）使用标准正态分布作为Regularizer для probabilistic latent encoder。然而， Gaussian 尾常 decay 太快，无法有效地包含编码点，导致潜在的结构遗失。在这篇论文中，我们探讨使用重 tailed 模型来避免过度规化。基于信息几何学的洞察，我们提出 $t^3$VAE，一种修改后VAE框架，其中使用 Student's t-distributions 作为 prior、encoder 和 decoder。这导致了一个共同的模型分布，我们认为可以更好地适应实际数据。我们 derivate 一个新的目标函数，通过重新定义证据下界为两个统计 manifold 的 joint 优化和 replacing γ-power divergence，一种自然的代替者。 $t^3$VAE 在带有重 tailed 数据训练时表现出了更好的生成低密度区域。此外，我们表明 $t^3$VAE 在 CelebA 和 imbalanced CIFAR-100 数据集上表现出了明显的优异。

Virtual reservoir acceleration for CPU and GPU: Case study for coupled spin-torque oscillator reservoir

paper_url: http://arxiv.org/abs/2312.01121
repo_url: https://github.com/mathowl/reservoir_acceleration
paper_authors: Thomas Geert de Jong, Nozomi Akashi, Tomohiro Taniguchi, Hirofumi Notsu, Kohei Nakajima
for: 这 paper 用于 simulating 由 $N$-coupled spin-torque oscillators 描述的抽象液体。
methods: 这 paper 使用了 CPU 和 GPU 进行实现，并对不同的实现进行了比较。
results: results 显示，对于 $N$ 在 1 到 10^4 之间的值，新的方法比基准值快至少 2.6 倍。特别是，在 $N=1$ 情况下，最好的因子是 78.9，而在 $N=10^3$ 和 $N=10^4$ 情况下，分别为 2.6 和 23.8。 GPU 在 $N=2500$ 情况下明显超过 CPU。

Abstract
We provide high-speed implementations for simulating reservoirs described by $N$-coupled spin-torque oscillators. Here $N$ also corresponds to the number of reservoir nodes. We benchmark a variety of implementations based on CPU and GPU. Our new methods are at least 2.6 times quicker than the baseline for $N$ in range $1$ to $10^4$. More specifically, over all implementations the best factor is 78.9 for $N=1$ which decreases to 2.6 for $N=10^3$ and finally increases to 23.8 for $N=10^4$. GPU outperforms CPU significantly at $N=2500$. Our results show that GPU implementations should be tested for reservoir simulations. The implementations considered here can be used for any reservoir with evolution that can be approximated using an explicit method.

摘要
我们提供高速实现方法，用于模拟由 $N$-个旋转磁场激活器描述的水库。这里 $N$ 还表示水库节点的数量。我们对 CPU 和 GPU 进行了比较。我们的新方法至少比基准值快速于 $N$ 在 1 到 10^4 之间。更具体地说，在所有实现方法中，最好的因子是 78.9 对 $N=1$，随着 $N$ 增加，这个因子逐渐减少，最终变为 23.8 对 $N=10^4$。GPU 在 $N=2500$ 时明显高效。我们的结果表明，GPU 实现应该用于水库模拟。我们考虑的实现方法可以用于任何可以使用显式方法描述的水库。Note: Simplified Chinese is used in mainland China and Singapore, while Traditional Chinese is used in Taiwan, Hong Kong, and Macau.

Cancer Subtype Identification through Integrating Inter and Intra Dataset Relationships in Multi-Omics Data

paper_url: http://arxiv.org/abs/2312.02195
repo_url: https://github.com/peelen-mark/identifying-cancer-subtypes-code
paper_authors: Mark Peelen, Leila Bagheriye, Johan Kwisthout
for: 这个论文旨在提出一种新的方法，用于通过多重数据的集成来分类癌症。
methods: 该方法基于线性关系 между和内不同的数据集（线性内部和间接数据集相互关系Matrix (LIDAF)）。
results: 提出的方法可以提高分类性能，并且在50%的log10排名值中获得了更好的表现，超越了现有的最佳方法。

Abstract
The integration of multi-omics data has emerged as a promising approach for gaining comprehensive insights into complex diseases such as cancer. This paper proposes a novel approach to identify cancer subtypes through the integration of multi-omics data for clustering. The proposed method, named LIDAF utilises affinity matrices based on linear relationships between and within different omics datasets (Linear Inter and Intra Dataset Affinity Fusion (LIDAF)). Canonical Correlation Analysis is in this paper employed to create distance matrices based on Euclidean distances between canonical variates. The distance matrices are converted to affinity matrices and those are fused in a three-step process. The proposed LIDAF addresses the limitations of the existing method resulting in improvement of clustering performance as measured by the Adjusted Rand Index and the Normalized Mutual Information score. Moreover, our proposed LIDAF approach demonstrates a notable enhancement in 50% of the log10 rank p-values obtained from Cox survival analysis, surpassing the performance of the best reported method, highlighting its potential of identifying distinct cancer subtypes.

摘要
随着多元数据的整合，诊断复杂疾病（如癌症）的全面性的理解得到了广泛的关注。这篇论文提出了一种新的方法，通过多元数据的整合进行划分，以获得更好的癌症分类效果。该方法，名为LIDAF，利用了线性关系 между不同的多元数据集（线性内部和外部数据集关系矩阵(LIDAF)）。在本文中，使用 canonical correlation analysis 创建了基于欧几丁度距离的距离矩阵，然后将距离矩阵转换为亲和力矩阵，并将其进行三步融合。提出的LIDAF方法解决了现有方法的局限性，导致分类性能的提高， measured by Adjusted Rand Index 和 Normalized Mutual Information score。此外，我们的LIDAF方法在50%的log10排名值上表现出了显著的改进，超过了最佳报道的方法，这 highlights its potential for identifying distinct cancer subtypes。

Strong Duality Relations in Nonconvex Risk-Constrained Learning

paper_url: http://arxiv.org/abs/2312.01110
repo_url: None
paper_authors: Dionysis Kalogerias, Spyridon Pougkakiotis
for: 这个论文的目的是为了研究函数型两步 compositional risk-constrained learning问题，包括多个非凸损函数和学习约束，无论损函数和学习约束的非凸性。
methods: 这个论文使用了最新的风险限制非凸编程技术，基于J. J. Uhl的凸性定理，这是一种扩展A. A. Lyapunov的凸性定理，适用于一般、无穷维 Banach空间。
results: 这个论文实现了零对偶差 gap，这意味着在研究的问题中，风险限制的问题都可以被解决，并且超越了当前文献中的状况。

Abstract
We establish strong duality relations for functional two-step compositional risk-constrained learning problems with multiple nonconvex loss functions and/or learning constraints, regardless of nonconvexity and under a minimal set of technical assumptions. Our results in particular imply zero duality gaps within the class of problems under study, both extending and improving on the state of the art in (risk-neutral) constrained learning. More specifically, we consider risk objectives/constraints which involve real-valued convex and positively homogeneous risk measures admitting dual representations with bounded risk envelopes, generalizing expectations and including popular examples, such as the conditional value-at-risk (CVaR), the mean-absolute deviation (MAD), and more generally all real-valued coherent risk measures on integrable losses as special cases. Our results are based on recent advances in risk-constrained nonconvex programming in infinite dimensions, which rely on a remarkable new application of J. J. Uhl's convexity theorem, which is an extension of A. A. Lyapunov's convexity theorem for general, infinite dimensional Banach spaces. By specializing to the risk-neutral setting, we demonstrate, for the first time, that constrained classification and regression can be treated under a unifying lens, while dispensing certain restrictive assumptions enforced in the current literature, yielding a new state-of-the-art strong duality framework for nonconvex constrained learning.

摘要
我们建立了强大的对偶关系 для函数两步复合风险条件学习问题，无视非对称性和最小技术假设。我们的结果特别是适用于研究中的问题范围内的零对偶距离，扩展和改善了现有的州际学习（risk-neutral）对偶框架。我们考虑的风险目标/限制为真値凸函数和正 Homogeneous risk measures，包括 conditional value-at-risk（CVaR）、mean-absolute deviation（MAD）和更一般的所有实值凝聚风险度量。我们的结果基于 latest advances in risk-constrained nonconvex programming in infinite dimensions，利用了 J. J. Uhl的凸性定理的新应用，这是A. A. Lyapunov的凸性定理的推广，这是一个普遍的、无限维 Banach空间上的扩展。通过特定化到风险中立设置，我们示出了，在首次对于非对称的问题进行对偶框架，并破坏了一些在目前文献中强制的假设，实现了一个新的强大对偶框架 для非对称的受限学习。

Rapid Speaker Adaptation in Low Resource Text to Speech Systems using Synthetic Data and Transfer learning

paper_url: http://arxiv.org/abs/2312.01107
repo_url: None
paper_authors: Raviraj Joshi, Nikesh Garera
for: 这个论文的目的是建立一个低资源语言的语音识别系统。
methods: 这个论文使用了转移学习方法，使用高资源语言的数据和生成的数据进行学习。然后，通过单个发音人数据的精度调整，实现了讲话者适应。
results: 这个论文通过对3个小时的目标语言发音人数据进行精度调整，实现了讲话者适应。此外，通过主观评价方法证明了这种方法的有效性。

Abstract
Text-to-speech (TTS) systems are being built using end-to-end deep learning approaches. However, these systems require huge amounts of training data. We present our approach to built production quality TTS and perform speaker adaptation in extremely low resource settings. We propose a transfer learning approach using high-resource language data and synthetically generated data. We transfer the learnings from the out-domain high-resource English language. Further, we make use of out-of-the-box single-speaker TTS in the target language to generate in-domain synthetic data. We employ a three-step approach to train a high-quality single-speaker TTS system in a low-resource Indian language Hindi. We use a Tacotron2 like setup with a spectrogram prediction network and a waveglow vocoder. The Tacotron2 acoustic model is trained on English data, followed by synthetic Hindi data from the existing TTS system. Finally, the decoder of this model is fine-tuned on only 3 hours of target Hindi speaker data to enable rapid speaker adaptation. We show the importance of this dual pre-training and decoder-only fine-tuning using subjective MOS evaluation. Using transfer learning from high-resource language and synthetic corpus we present a low-cost solution to train a custom TTS model.

摘要
文本至语音（TTS）系统通过端到端深度学习方法建立。然而，这些系统需要庞大的训练数据。我们提出了一种使用转移学习approach来建立生产质量TTS系统并进行speaker adaptation。我们使用高资源语言数据进行转移学习，并使用生成的数据进行synthetic data的生成。我们使用三步方法来训练一个高质量单个 speaker TTS系统，包括使用Tacotron2设置，spectrogram预测网络和waveglow vocoder。Tacotron2语音模型在英语数据上训练，然后在生成的印地语数据上进行synthetic Hindi data。最后，这个模型的解码器通过只需3小时的目标印地语言 speaker数据进行精度调整。我们通过主观MOS评估表明了这种双重预训和解码器精度调整的重要性。通过从高资源语言和人工合成库中 Transfer learning，我们提供了一种低成本的解决方案，可以训练自定义TTS模型。

Code-Mixed Text to Speech Synthesis under Low-Resource Constraints

paper_url: http://arxiv.org/abs/2312.01103
repo_url: None
paper_authors: Raviraj Joshi, Nikesh Garera
for: 这篇论文旨在描述如何建立高质量的混合语言（Hindi-English）文本读取系统，以满足voice-based电商应用的需求。
methods: 作者提出了一种数据驱动的方法，利用单语言数据集来训练混合语言TTS系统。他们还使用了转写模型将罗马文本转换为喀什米里文本，并将两个数据集合并训练。作者还进行了单话者适应和多话者训练，并使用了传输学习和解码器 fine-tuning 来提高性能。
results: 作者的方法可以达到Google TTS的positive CMOS分数0.02，并且在资源受限的情况下进行了低资源语音适应。此外，作者还进行了大量的主观评估，以证明系统的高质量。

Abstract
Text-to-speech (TTS) systems are an important component in voice-based e-commerce applications. These applications include end-to-end voice assistant and customer experience (CX) voice bot. Code-mixed TTS is also relevant in these applications since the product names are commonly described in English while the surrounding text is in a regional language. In this work, we describe our approaches for production quality code-mixed Hindi-English TTS systems built for e-commerce applications. We propose a data-oriented approach by utilizing monolingual data sets in individual languages. We leverage a transliteration model to convert the Roman text into a common Devanagari script and then combine both datasets for training. We show that such single script bi-lingual training without any code-mixing works well for pure code-mixed test sets. We further present an exhaustive evaluation of single-speaker adaptation and multi-speaker training with Tacotron2 + Waveglow setup to show that the former approach works better. These approaches are also coupled with transfer learning and decoder-only fine-tuning to improve performance. We compare these approaches with the Google TTS and report a positive CMOS score of 0.02 with the proposed transfer learning approach. We also perform low-resource voice adaptation experiments to show that a new voice can be onboarded with just 3 hrs of data. This highlights the importance of our pre-trained models in resource-constrained settings. This subjective evaluation is performed on a large number of out-of-domain pure code-mixed sentences to demonstrate the high quality of the systems.

摘要
文本至语系统（TTS）在voice基于电子商务应用程序中扮演着重要的角色。这些应用程序包括终端语音助手和客户体验（CX）语音机器人。在这种情况下，我们描述了我们对生产质量的code-mixed Hindi-英语TTS系统的方法。我们采用了数据驱动的方法，利用各自语言的单语言数据集。我们利用一种翻译模型将罗马字转换为常见的德вана格里文字，然后将两个数据集合并用于训练。我们表明，这种单文本编码的双语训练无需任何code-mixing即使在纯code-mixed测试集上表现良好。我们还进行了详细的单 speaker适应和多 speaker训练，并使用Tacotron2 + Waveglow的设置进行评估。我们发现，前者方法在性能上表现更好。这些方法还与传输学习和解码器 только精度调整进行改进性能。我们与Google TTS进行比较，并发现我们的传输学习方法可以获得正面的CMOS分数0.02。此外，我们还进行了尝试资源受限的声音适应实验，并证明了一个新的声音可以在仅3小时的数据上board。这种subjective评估是基于大量的out-of-domain纯code-mixed句子中进行的。这些结果表明了我们的预训练模型在资源受限的情况下的重要性。

Predicting Postoperative Nausea And Vomiting Using Machine Learning: A Model Development and Validation Study

paper_url: http://arxiv.org/abs/2312.01093
repo_url: https://github.com/teddy4445/ponv_prediction_tool
paper_authors: Maxim Glebov, Teddy Lazebnik, Boris Orkin, Haim Berkenstadt, Svetlana Bunimovich-Mendrazitsky
for:* 预测postsurgical nausea and vomiting (PONV)的工具，以提高患者的护理和病程结果。methods:* 使用机器学习算法 ensemble，在54848名病人的数据上进行训练，并使用k-fold核心验证法和Bee Colony算法来优化数据的分类表达。results:* 预测early和delayed PONV的工具具有84.0%和77.3%的准确率，超过了现有的PONV预测工具（Koivuranta score）的13.4%和12.9%。Feature importance分析表明，预测工具的性能与前期临床知识相符，表明其实用性。

Abstract
Background: Postoperative nausea and vomiting (PONV) is a frequently observed complication in patients undergoing surgery under general anesthesia. Moreover, it is a frequent cause of distress and dissatisfaction during the early postoperative period. The tools used for predicting PONV at present have not yielded satisfactory results. Therefore, prognostic tools for the prediction of early and delayed PONV were developed in this study with the aim of achieving satisfactory predictive performance. Methods: The retrospective data of adult patients admitted to the post-anesthesia care unit after undergoing surgical procedures under general anesthesia at the Sheba Medical Center, Israel, between September 1, 2018, and September 1, 2023, were used in this study. An ensemble model of machine learning algorithms trained on the data of 54848 patients was developed. The k-fold cross-validation method was used followed by splitting the data to train and test sets that optimally preserve the sociodemographic features of the patients, such as age, sex, and smoking habits, using the Bee Colony algorithm. Findings: Among the 54848 patients, early and delayed PONV were observed in 2706 (4.93%) and 8218 (14.98%) patients, respectively. The proposed PONV prediction tools could correctly predict early and delayed PONV in 84.0% and 77.3% of cases, respectively, outperforming the second-best PONV prediction tool (Koivuranta score) by 13.4% and 12.9%, respectively. Feature importance analysis revealed that the performance of the proposed prediction tools aligned with previous clinical knowledge, indicating their utility. Interpretation: The machine learning-based tools developed in this study enabled improved PONV prediction, thereby facilitating personalized care and improved patient outcomes.

摘要
Background: 麻醉后恶心和呕吐（PONV）是在普遍出现在在普通麻醉下进行手术的患者中，并且是早期后期受到折磨和不满的主要原因。目前使用的预测工具没有达到满意的结果。因此，本研究开发了预测早期和延迟PONV的预测工具，以实现满意的预测性能。方法: 这项研究使用了在以色列希伯医疗中心的后手术病房进行手术的成人患者的后续数据，共计54848名患者，时间范围为2018年9月1日至2023年9月1日。使用了Machine Learning算法的ensemble模型，通过在数据中使用Bee Colony算法进行k-fold跨 VALIDATION和数据分割，以最好地保留患者的社odemographic特征，如年龄、性别和吸烟习惯。发现: 总共有54848名患者中，早期和延迟PONV分别出现在2706名（4.93%）和8218名（14.98%）患者中。提出的PONV预测工具可以正确预测早期和延迟PONV的情况，准确率分别为84.0%和77.3%，相比第二好的PONV预测工具（koivuranta分数）高出13.4%和12.9%。特征重要性分析表明，提出的预测工具的性能与先前的临床知识相一致，表明其实用性。解释: 本研究开发的Machine Learning基于的预测工具，可以提高PONV预测的准确性，从而为个性化护理和患者的成本提供了便利。

A Semi-Supervised Deep Learning Approach to Dataset Collection for Query-By-Humming Task

paper_url: http://arxiv.org/abs/2312.01092
repo_url: https://github.com/amanteur/chad
paper_authors: Amantur Amatov, Dmitry Lamanov, Maksim Titov, Ivan Vovk, Ilya Makarov, Mikhail Kudinov
for: 这个论文的目的是提出一种深度学习数据收集技术和一个新的歌曲搜索数据集（CHAD），以便实现 Query-by-Humming（QbH）系统。
methods: 论文使用了一种半监督的模型训练管道，利用QbH任务作为特殊的歌曲重新创作任务（CSI）来扩展数据集。
results: 经过训练，模型在QbH任务上取得了竞争性的成绩，并且可以成功应用于实际应用中。

Abstract
Query-by-Humming (QbH) is a task that involves finding the most relevant song based on a hummed or sung fragment. Despite recent successful commercial solutions, implementing QbH systems remains challenging due to the lack of high-quality datasets for training machine learning models. In this paper, we propose a deep learning data collection technique and introduce Covers and Hummings Aligned Dataset (CHAD), a novel dataset that contains 18 hours of short music fragments, paired with time-aligned hummed versions. To expand our dataset, we employ a semi-supervised model training pipeline that leverages the QbH task as a specialized case of cover song identification (CSI) task. Starting with a model trained on the initial dataset, we iteratively collect groups of fragments of cover versions of the same song and retrain the model on the extended data. Using this pipeline, we collect over 308 hours of additional music fragments, paired with time-aligned cover versions. The final model is successfully applied to the QbH task and achieves competitive results on benchmark datasets. Our study shows that the proposed dataset and training pipeline can effectively facilitate the implementation of QbH systems.

摘要
对于Query-by-Humming（QbH）任务，因缺乏高质量训练数据而实现系统仍然具有挑战性。在这篇论文中，我们提出了一种深度学习数据采集技术，并引入Covers and Hummings Aligned Dataset（CHAD），一个新的数据集，包含18小时短 музы Fragment，与时间对应的hummed版本。为了扩大我们的数据集，我们采用了一种半supervised模型训练管道，利用QbH任务作为特殊的cover song identification（CSI）任务。从一个初始模型开始，我们逐步收集了 fragments of cover版本的同一首歌，并使用这些数据重新训练模型。通过这种管道，我们收集了超过308小时的额外音乐 Fragment，并与时间对应的cover版本一起。最终模型成功应用于QbH任务，在标准数据集上达到了竞争力的成绩。我们的研究表明，我们提出的数据集和训练管道可以有效地促进QbH系统的实施。

A New Random Reshuffling Method for Nonsmooth Nonconvex Finite-sum Optimization

paper_url: http://arxiv.org/abs/2312.01047
repo_url: None
paper_authors: Xiao Li, Andre Milzarek, Junwen Qiu
for: 本文提出了一种新的随机优化算法，称为normal map-based proximal random reshuffling（norm-PRR）方法，用于非光滑非对称总和问题。
methods: 本文使用了随机重排技术，这些技术在大规模应用中很普遍，如神经网络训练中。而在非光滑情况下，随机重排方法的收敛性和加速效果还不够了解，只有几种 proximal-type 随机重排方法具有可证明的保证。
results: 本文Establish了 norm-PRR 的迭代复杂度为 ${O}(n^{-1/3}T^{-2/3})$，其中 $n$ 是组件函数的数量， $T$ 是迭代次数。此外，我们还提供了新的极限点收敛结果，包括strong limit-point convergence和last iterate convergence rates。 Specifically, under the Kurdyka-{\L}ojasiewicz（KL）不等式，我们证明了 iterates 生成的 norm-PRR 收敛到一个唯一的站点点。此外，我们还得到了迭代次数 $k$ 的几何速率，即 ${\cal O}(k^{-p})$，其中 $p \in [0, 1]$ 取决于 KL 指数 $\theta \in [0,1)$ 和步长动力学。最后，我们在机器学习问题上进行了初步的数据分析，并证明了该方法的效率。

Abstract
In this work, we propose and study a novel stochastic optimization algorithm, termed the normal map-based proximal random reshuffling (norm-PRR) method, for nonsmooth nonconvex finite-sum problems. Random reshuffling techniques are prevalent and widely utilized in large-scale applications, e.g., in the training of neural networks. While the convergence behavior and advantageous acceleration effects of random reshuffling methods are fairly well understood in the smooth setting, much less seems to be known in the nonsmooth case and only few proximal-type random reshuffling approaches with provable guarantees exist. We establish the iteration complexity ${\cal O}(n^{-1/3}T^{-2/3})$ for norm-PRR, where $n$ is the number of component functions and $T$ counts the total number of iteration. We also provide novel asymptotic convergence results for norm-PRR. Specifically, under the Kurdyka-{\L}ojasiewicz (KL) inequality, we establish strong limit-point convergence, i.e., the iterates generated by norm-PRR converge to a single stationary point. Moreover, we derive last iterate convergence rates of the form ${\cal O}(k^{-p})$; here, $p \in [0, 1]$ depends on the KL exponent $\theta \in [0,1)$ and step size dynamics. Finally, we present preliminary numerical results on machine learning problems that demonstrate the efficiency of the proposed method.

摘要
“在这项工作中，我们提出了一种新的随机优化算法，称之为normal map-based proximal random reshuffling（norm-PRR）方法，用于非光滑非对称总和问题。随机重新排序技术在大规模应用中很普遍，例如在神经网络训练中。虽然随机重新排序方法在光滑 Setting中的收敛性和加速效果已经比较好地了解，但在非光滑情况下，却很少有 proximal-type随机重新排序方法的可证明保证。我们证明了 norm-PRR 方法的迭代复杂度为 $O(n^{-1/3}T^{-2/3})$，其中 $n$ 是功能函数的数量，$T$ 是迭代次数。我们还提供了 norm-PRR 方法的新的极限点收敛结果，包括强限点收敛和最后迭代速率。具体来说，在 Kurdyka-{\L}ojasiewicz（KL）不等式的假设下，我们证明了 norm-PRR 方法会收敛到一个固定点，并且可以在 $k$ 迭代次数下达到最后迭代速率 $O(k^{-p})$，其中 $p \in [0, 1]$ 是 KL 指数的一个优化参数。最后，我们在机器学习问题上进行了一些初步的数据分析，证明了 norm-PRR 方法的效率。”

Bagged Regularized $k$-Distances for Anomaly Detection

paper_url: http://arxiv.org/abs/2312.01046
repo_url: None
paper_authors: Yuchao Cai, Yuheng Ma, Hanfang Yang, Hanyuan Hang
For: 这个论文的目的是提出一个新的距离基本的检测方法，以解决距离基本方法在缺乏标签示例的情况下表现不佳的问题。* Methods: 这个方法使用袋包技术（bagging）和调整了的$k$-距离（k-distances）来实现不超给的检测。它首先使用袋包技术将资料分成多个批次，然后使用调整了的$k$-距离来检测每个批次是否有问题。* Results: 这个方法可以成功地解决距离基本方法中的敏感性问题，并且在大规模的资料集上实现高效的检测。实验结果显示，该方法可以与其他现有的距离基本方法相比，具有更好的优化性和更高的准确性。

Abstract
We consider the paradigm of unsupervised anomaly detection, which involves the identification of anomalies within a dataset in the absence of labeled examples. Though distance-based methods are top-performing for unsupervised anomaly detection, they suffer heavily from the sensitivity to the choice of the number of the nearest neighbors. In this paper, we propose a new distance-based algorithm called bagged regularized $k$-distances for anomaly detection (BRDAD) converting the unsupervised anomaly detection problem into a convex optimization problem. Our BRDAD algorithm selects the weights by minimizing the surrogate risk, i.e., the finite sample bound of the empirical risk of the bagged weighted $k$-distances for density estimation (BWDDE). This approach enables us to successfully address the sensitivity challenge of the hyperparameter choice in distance-based algorithms. Moreover, when dealing with large-scale datasets, the efficiency issues can be addressed by the incorporated bagging technique in our BRDAD algorithm. On the theoretical side, we establish fast convergence rates of the AUC regret of our algorithm and demonstrate that the bagging technique significantly reduces the computational complexity. On the practical side, we conduct numerical experiments on anomaly detection benchmarks to illustrate the insensitivity of parameter selection of our algorithm compared with other state-of-the-art distance-based methods. Moreover, promising improvements are brought by applying the bagging technique in our algorithm on real-world datasets.

摘要
我们考虑了无监督异常检测的 paradigm，即在数据集中检测异常的问题，在无标注示例的情况下进行。虽然距离基于方法在无监督异常检测方面表现出色，但是它们受到选择距离参数的选择的敏感性很大。在本文中，我们提出了一种新的距离基于算法 called Bagged Regularized $k$-Distances for Anomaly Detection (BRDAD)，将无监督异常检测问题转化为一个凸优化问题。我们的 BRDAD 算法通过最小化代理风险来选择参数，即对 Bagged Weighted $k$-Distances for Density Estimation (BWDDE) 的质量评估。这种方法可以成功地解决距离基于算法中参数选择敏感性的问题。此外，当处理大规模数据时，我们的 BRDAD 算法可以通过包装技术来提高效率。从理论角度来看，我们证明了我们的算法的 AUC regret 快速 converges 和 bagging 技术可以减少计算复杂性。从实践角度来看，我们在异常检测标准 benchmar 上进行了数值实验，并证明了我们的算法参数选择的不敏感性和其他状态艺术 distance-based 方法的比较。此外，我们在实际数据上应用了包装技术，得到了有 Promise 的改进。

Quantifying Hippocampal Shape Asymmetry in Alzheimer’s Disease Using Optimal Shape Correspondences

paper_url: http://arxiv.org/abs/2312.01043
repo_url: None
paper_authors: Shen Zhu, Ifrah Zawar, Jaideep Kapur, P. Thomas Fletcher
for: 这个论文旨在研究阿尔ツ海默病（AD）中海马体积不均匀和空间不均匀的特征。
methods: 该论文使用了优化点对匹配法来量化左右海马体的形态不均匀，同时采用了一种紧凑的统计形态模型来描述整个样本。
results: 结果表明，与使用体积信息相比，形态不均匀可以描述出阿尔ツ海默病患者海马体的细腻、地方性差异。

Abstract
Hippocampal atrophy in Alzheimer's disease (AD) is asymmetric and spatially inhomogeneous. While extensive work has been done on volume and shape analysis of atrophy of the hippocampus in AD, less attention has been given to hippocampal asymmetry specifically. Previous studies of hippocampal asymmetry are limited to global volume or shape measures, which don't localize shape asymmetry at the point level. In this paper, we propose to quantify localized shape asymmetry by optimizing point correspondences between left and right hippocampi within a subject, while simultaneously favoring a compact statistical shape model of the entire sample. To account for related variables that have impact on AD and healthy subject differences, we build linear models with other confounding factors. Our results on the OASIS3 dataset demonstrate that compared to using volumetric information, shape asymmetry reveals fine-grained, localized differences that indicate the hippocampal regions of most significant shape asymmetry in AD patients.

摘要
论文摘要：阿尔ц海默病（AD）中海马体积减少呈偏极和空间不均匀的特征。虽然有很多研究对阿尔ц海默病患者的海马体积和形态进行了分析，但对于海马偏极的研究则很少。前一代研究的海马偏极限定为全体海马体积或形态的全局指标，这些指标无法地方化形态偏极。本文提出了一种量化本地形态偏极的方法，通过对左右海马之间点对应优化，同时满足整个样本的紧凑统计形态模型。为了考虑影响阿尔ц海默病和健康人群差异的相关变量，我们建立了线性模型。我们的结果表明，相比使用体积信息，形态偏极可以描述阿尔ц海默病患者的海马区域的细腻、本地差异。Note: The translation is in Simplified Chinese, which is the standard form of Chinese used in mainland China and Singapore. If you need Traditional Chinese, please let me know.

RNN-BOF: A Multivariate Global Recurrent Neural Network for Binary Outcome Forecasting of Inpatient Aggression

paper_url: http://arxiv.org/abs/2312.01029
repo_url: None
paper_authors: Aidan Quinn, Melanie Simmons, Benjamin Spivak, Christoph Bergmeir
for: 预测暴力事件的未来风险，帮助临床医生评估患者的风险水平。
methods: 使用时间序列方法，通过长期数据学习，生成个体化的风险评估结果，并使用深度神经网络模型来全面预测多个时间序列。
results: 在一个真实世界数据集上，与参考心理测量工具和先前使用的机器学习方法相比，该方法显示出了显著的性能提升。

Abstract
Psychometric assessment instruments aid clinicians by providing methods of assessing the future risk of adverse events such as aggression. Existing machine learning approaches have treated this as a classification problem, predicting the probability of an adverse event in a fixed future time period from the scores produced by both psychometric instruments and clinical and demographic covariates. We instead propose modelling a patient's future risk using a time series methodology that learns from longitudinal data and produces a probabilistic binary forecast that indicates the presence of the adverse event in the next time period. Based on the recent success of Deep Neural Nets for globally forecasting across many time series, we introduce a global multivariate Recurrent Neural Network for Binary Outcome Forecasting, that trains from and for a population of patient time series to produce individual probabilistic risk assessments. We use a moving window training scheme on a real world dataset of 83 patients, where the main binary time series represents the presence of aggressive events and covariate time series represent clinical or demographic features and psychometric measures. On this dataset our approach was capable of a significant performance increase against both benchmark psychometric instruments and previously used machine learning methodologies.

摘要
心理测量工具可以帮助临床专业人员评估未来风险事件的发生，如攻击行为。现有的机器学习方法将此视为一个分类问题，从心理测量工具和临床和人口特征 covariates 中获取的分数来预测在固定的未来时间段内事件的概率。我们 however propose 使用时间序列方法，从 longitudinal 数据中学习并生成一个概率binary forecast，以指示下一个时间段中事件的发生。基于 globally forecasting 的最近成功，我们引入了一个全球多变量循环神经网络，用于生成个体化风险评估。我们使用一个移动窗口训练方案，在一个实际世界数据集上进行训练，该数据集包括83名患者的时间序列，主要 binary time series 表示攻击事件的存在，并且 covariate time series 表示临床或人口特征和心理测量结果。在这个数据集上，我们的方法可以在对比 benchmark 心理测量工具和先前使用的机器学习方法之上显著提高性能。

Advanced Language Model-Driven Verilog Development: Enhancing Power, Performance, and Area Optimization in Code Synthesis

paper_url: http://arxiv.org/abs/2312.01022
repo_url: None
paper_authors: Kiran Thorat, Jiahui Zhao, Yaotian Liu, Hongwu Peng, Xi Xie, Bin Lei, Jeff Zhang, Caiwen Ding
for: 这个研究探讨了高级语言模型（ALM）在多个领域中的应用，尤其是在语言指令下生成高级内容的能力。
methods: 这个研究使用了一种创新的框架，用于评估和提高ALM在电子硬件设计中的产效。这个框架包括两个阶段的修正协议，其中第一阶段是提高代码的语言和操作精度，第二阶段是根据Power-Performance-Area（PPA）标准进行代码的优化。
results: 这个研究发现，使用这种框架可以提高ALM在Verilog编程 синтеesis中的语言精度和操作效率。相比之下，现有的前沿技术只能达到73%的语言精度和46%的操作效率。这些结果表明ALM在复杂的技术领域中的应用潜力很大，并且标志着硬件设计操作的机械化是可能的。

Abstract
The increasing use of Advanced Language Models (ALMs) in diverse sectors, particularly due to their impressive capability to generate top-tier content following linguistic instructions, forms the core of this investigation. This study probes into ALMs' deployment in electronic hardware design, with a specific emphasis on the synthesis and enhancement of Verilog programming. We introduce an innovative framework, crafted to assess and amplify ALMs' productivity in this niche. The methodology commences with the initial crafting of Verilog programming via ALMs, succeeded by a distinct dual-stage refinement protocol. The premier stage prioritizes augmenting the code's operational and linguistic precision, while the latter stage is dedicated to aligning the code with Power-Performance-Area (PPA) benchmarks, a pivotal component in proficient hardware design. This bifurcated strategy, merging error remediation with PPA enhancement, has yielded substantial upgrades in the caliber of ALM-created Verilog programming. Our framework achieves an 81.37% rate in linguistic accuracy and 62.0% in operational efficacy in programming synthesis, surpassing current leading-edge techniques, such as 73% in linguistic accuracy and 46% in operational efficacy. These findings illuminate ALMs' aptitude in tackling complex technical domains and signal a positive shift in the mechanization of hardware design operations.

摘要
随着高级语言模型（ALM）在多个领域的应用，尤其是其执行高质量内容的能力，这项研究的核心在于调查ALM在电子硬件设计中的应用。本研究强调在Verilog编程中使用ALM的应用，并提出了一种创新的框架，用于评估和提高ALM在这个领域的生产力。这种框架包括两个阶段：首先，使用ALM编写Verilog程序，然后进行两个阶段的修正。第一阶段的目标是提高代码的语言和运行精度，而第二阶段则是根据Power-Performance-Area（PPA）标准进行修正，以提高硬件设计的效率。这种分段策略，结合修复和PPA提高，已经实现了ALM编写的Verilog程序质量的显著提高。our framework achieves an 81.37% rate in linguistic accuracy and 62.0% in operational efficacy in programming synthesis, surpassing current leading-edge techniques, such as 73% in linguistic accuracy and 46% in operational efficacy. These findings highlight ALMs' potential in tackling complex technical domains and signal a positive shift in the mechanization of hardware design operations.

Data-Driven Autoencoder Numerical Solver with Uncertainty Quantification for Fast Physical Simulations

paper_url: http://arxiv.org/abs/2312.01021
repo_url: https://github.com/llnl/gplasdi
paper_authors: Christophe Bonneville, Youngsoo Choi, Debojyoti Ghosh, Jonathan L. Belof
for: 该文章是为了开发一种更快的partial differential equation（PDE）解决方法，以替代传统的计算昂贵的方法。
methods: 该文章提出了一种гибриids deep learning和 bayesian reduced-order-models（ROMs）的方法，称为GPLaSDI。GPLaSDI在全功率模型（FOM）数据上训练了一个自适应神经网络，同时学习了 simpler equations governing the latent space。这些方程是通过 Gaussian Processes 进行插值，以实现uncertainty quantification和active learning，即使具有有限的FOM解决器访问。
results: 该文章的结果表明，GPLaSDI可以在流体力学问题上实现 up to 100,000 倍的速度增加，并且相对误差小于 7%。

Abstract
Traditional partial differential equation (PDE) solvers can be computationally expensive, which motivates the development of faster methods, such as reduced-order-models (ROMs). We present GPLaSDI, a hybrid deep-learning and Bayesian ROM. GPLaSDI trains an autoencoder on full-order-model (FOM) data and simultaneously learns simpler equations governing the latent space. These equations are interpolated with Gaussian Processes, allowing for uncertainty quantification and active learning, even with limited access to the FOM solver. Our framework is able to achieve up to 100,000 times speed-up and less than 7% relative error on fluid mechanics problems.

摘要
传统的partial differential equation（PDE）解决器可能会具有计算成本，这种情况下，我们发展了更快的方法，如减少的数量模型（ROM）。我们介绍了GPLaSDI，一种混合深度学习和 bayesian ROM。GPLaSDI在全功率模型（FOM）数据上训练了自适应神经网络，同时学习了较为简单的 governing equations governing the latent space。这些方程通过 Gaussian Processes 进行插值，允许uncertainty quantification和活动学习，即使有限制对FOM solver的访问。我们的框架可以实现到100,000倍的速度增强和相对误差小于7%在流体力学问题上。

ResNLS: An Improved Model for Stock Price Forecasting

paper_url: http://arxiv.org/abs/2312.01020
repo_url: None
paper_authors: Yuanzhe Jia, Ali Anaissi, Basem Suleiman
For: This paper aims to improve stock price prediction by emphasizing the dependencies between adjacent stock prices, using a hybrid model called ResNLS.* Methods: The proposed model combines ResNet and LSTM neural architectures to extract features and analyze time-series data, with the closing price data for the previous 5 consecutive trading days used as the input.* Results: The ResNLS-5 model outperforms other baseline models in terms of prediction accuracy, and demonstrates at least a 20% improvement over the current state-of-the-art baselines. Additionally, the trading strategy based on predictions from ResNLS-5 can successfully mitigate losses during declining stock prices and generate profits in the periods of rising stock prices.Here is the Chinese translation of the three key points:* For: 这篇论文目标是通过强调邻近股票价格之间的相关性，提高股票价格预测。* Methods: 提议的模型组合了 ResNet 和 LSTM 神经网络架构，以提取特征和分析时间序列数据，并使用前 5 天交易日的收盘价据作为输入。* Results: ResNLS-5 模型在预测精度方面超过了其他基准模型，并至少达到了当前状态艺术基准的 20% 提升。此外，基于 ResNLS-5 预测的交易策略可以在股票价格下降时成功避免损失，并在股票价格升起时产生利润。

Abstract
Stock prices forecasting has always been a challenging task. Although many research projects adopt machine learning and deep learning algorithms to address the problem, few of them pay attention to the varying degrees of dependencies between stock prices. In this paper we introduce a hybrid model that improves stock price prediction by emphasizing the dependencies between adjacent stock prices. The proposed model, ResNLS, is mainly composed of two neural architectures, ResNet and LSTM. ResNet serves as a feature extractor to identify dependencies between stock prices across time windows, while LSTM analyses the initial time-series data with the combination of dependencies which considered as residuals. In predicting the SSE Composite Index, our experiment reveals that when the closing price data for the previous 5 consecutive trading days is used as the input, the performance of the model (ResNLS-5) is optimal compared to those with other inputs. Furthermore, ResNLS-5 outperforms vanilla CNN, RNN, LSTM, and BiLSTM models in terms of prediction accuracy. It also demonstrates at least a 20% improvement over the current state-of-the-art baselines. To verify whether ResNLS-5 can help clients effectively avoid risks and earn profits in the stock market, we construct a quantitative trading framework for back testing. The experimental results show that the trading strategy based on predictions from ResNLS-5 can successfully mitigate losses during declining stock prices and generate profits in the periods of rising stock prices.

摘要
股票价格预测总是一个复杂的任务。虽然许多研究项目采用机器学习和深度学习算法来解决这个问题，但很少的人关注股票价格之间的不同程度的依赖关系。在这篇论文中，我们介绍了一种混合模型，即ResNLS，可以改进股票价格预测。ResNLS模型由两种神经网络架构组成：ResNet和LSTM。ResNet作为特征提取器，用于在时间窗口内识别股票价格之间的依赖关系，而LSTM则分析了时间序列数据的初始部分，并将这些依赖关系作为剩余分析。在预测SSE Composite Index时，我们的实验表明，使用前5天的收盘价数据作为输入，模型（ResNLS-5）的性能是最佳的，并且在其他输入情况下都表现不佳。此外，ResNLS-5还超过了vanilla CNN、RNN、LSTM和BiLSTM模型的预测精度。它还证明了在当前状况下的基准值中提供至少20%的提升。为了验证ResNLS-5是否可以帮助客户有效避免风险并获得利润，我们建立了一个量化交易框架进行回测。实验结果表明，基于ResNLS-5的预测，可以成功避免股票价格下跌期间的损失，并在股票价格上升期间获得收益。

Generating Images of the M87* Black Hole Using GANs

paper_url: http://arxiv.org/abs/2312.01005
repo_url: https://github.com/aryamohan23/eht-gans
paper_authors: Arya Mohan, Pavlos Protopapas, Keerthi Kunnumkai, Cecilia Garraffo, Lindy Blackburn, Koushik Chatterjee, Sheperd S. Doeleman, Razieh Emami, Christian M. Fromm, Yosuke Mizuno, Angelo Ricarte
for: 本研究提出了一种基于Conditional Progressive Generative Adversarial Networks（CPGAN）的数据增强方法，用于生成多样化的黑洞（BH）图像，考虑了质量和电子温度的变化。
methods: 本研究使用了CPGAN模型来生成BH图像，并且可以为任何质量值[-1, 1]之间的电子温度分布生成图像。
results: 我们的模型可以准确地预测黑洞质量，并且在使用增强数据集进行训练而测试用GRMHD模拟数据时显示出了显著的性能提升。因此，我们提议使用GAN来生成黑洞图像，并可靠地增强训练数据集 для其他参数化算法。

Abstract
In this paper, we introduce a novel data augmentation methodology based on Conditional Progressive Generative Adversarial Networks (CPGAN) to generate diverse black hole (BH) images, accounting for variations in spin and electron temperature prescriptions. These generated images are valuable resources for training deep learning algorithms to accurately estimate black hole parameters from observational data. Our model can generate BH images for any spin value within the range of [-1, 1], given an electron temperature distribution. To validate the effectiveness of our approach, we employ a convolutional neural network to predict the BH spin using both the GRMHD images and the images generated by our proposed model. Our results demonstrate a significant performance improvement when training is conducted with the augmented dataset while testing is performed using GRMHD simulated data, as indicated by the high R2 score. Consequently, we propose that GANs can be employed as cost effective models for black hole image generation and reliably augment training datasets for other parameterization algorithms.

摘要
在这篇论文中，我们介绍了一种基于进行进行Conditional Progressive Generative Adversarial Networks（CPGAN）的数据增强方法，用于生成具有质量变化的黑洞（BH）图像，考虑到电子温度分布的变化。这些生成的图像是训练深度学习算法计算黑洞参数的有价值资源。我们的模型可以根据电子温度分布生成BH图像任何质量值，跨 [-1, 1] 范围。为验证我们的方法效果，我们使用 convolutional neural network（CNN）预测黑洞旋转矩阵，使用GRMHD图像和我们提议的模型生成的图像进行训练和测试。我们的结果表明，使用增强 dataset 进行训练，并使用GRMHD simulate 数据进行测试，可以提高 R2 分数。因此，我们提议使用 GAN 作为便宜的黑洞图像生成模型，可靠地增强训练集 для其他参数化算法。

Second-Order Uncertainty Quantification: A Distance-Based Approach

paper_url: http://arxiv.org/abs/2312.00995
repo_url: None
paper_authors: Yusuf Sale, Viktor Bengs, Michele Caprio, Eyke Hüllermeier
for: This paper proposes a set of formal criteria for uncertainty measures in machine learning, specifically for predictive uncertainty based on second-order distributions.
methods: The paper offers a general framework for developing uncertainty measures that satisfy these criteria, and provides an instantiation based on the Wasserstein distance.
results: The authors prove that their proposed uncertainty measure satisfies all of the formal criteria they have established.

Abstract
In the past couple of years, various approaches to representing and quantifying different types of predictive uncertainty in machine learning, notably in the setting of classification, have been proposed on the basis of second-order probability distributions, i.e., predictions in the form of distributions on probability distributions. A completely conclusive solution has not yet been found, however, as shown by recent criticisms of commonly used uncertainty measures associated with second-order distributions, identifying undesirable theoretical properties of these measures. In light of these criticisms, we propose a set of formal criteria that meaningful uncertainty measures for predictive uncertainty based on second-order distributions should obey. Moreover, we provide a general framework for developing uncertainty measures to account for these criteria, and offer an instantiation based on the Wasserstein distance, for which we prove that all criteria are satisfied.

摘要
过去几年，在机器学习中表示和评估不同类型的预测uncertainty的方法有很多提议，主要是在分类 setting 中，基于第二个概率分布。然而，至今为止还没有一个完全 conclusive 的解决方案，这可以从最近的一些对常用的 uncertainty 度量 associates with second-order distributions 的批评中看出来。为了解决这些批评，我们提出了一些 formal criteria ，meaningful uncertainty measures for predictive uncertainty based on second-order distributions should obey。此外，我们还提供了一个总体框架，可以用于开发这些 criteria 的 uncertainty measures，并且提供了基于 Wasserstein distance 的实现，并证明了这些 criteria 都被满足。

paper_url: http://arxiv.org/abs/2312.00992
repo_url: None
paper_authors: Sayantan Kumar, Philip Payne, Aristeidis Sotiras
for: 这个论文的目的是提出一种新的normative模型，用于估计阿尔茨海默病（AD）病人群的脑图像数据中的异常性。
methods: 这个论文使用了mixture-of-product-of-experts（MoPoE）技术，将多Modal neuroimaging数据拟合到一起，并且使用这些模型来标识病人群中的异常subject。
results: 研究发现，通过使用MoPoE技术，可以更好地模型多Modal neuroimaging数据中的联合异常性，并且可以特定病人群中的异常latent空间和脑区域。

Abstract
Normative models in neuroimaging learn the brain patterns of healthy population distribution and estimate how disease subjects like Alzheimer's Disease (AD) deviate from the norm. Existing variational autoencoder (VAE)-based normative models using multimodal neuroimaging data aggregate information from multiple modalities by estimating product or averaging of unimodal latent posteriors. This can often lead to uninformative joint latent distributions which affects the estimation of subject-level deviations. In this work, we addressed the prior limitations by adopting the Mixture-of-Product-of-Experts (MoPoE) technique which allows better modelling of the joint latent posterior. Our model labelled subjects as outliers by calculating deviations from the multimodal latent space. Further, we identified which latent dimensions and brain regions were associated with abnormal deviations due to AD pathology.

摘要

Convergences for Minimax Optimization Problems over Infinite-Dimensional Spaces Towards Stability in Adversarial Training

paper_url: http://arxiv.org/abs/2312.00991
repo_url: None
paper_authors: Takashi Furuya, Satoshi Okuda, Kazuma Suetake, Yoshihide Sawada
for: This paper aims to address the instability issue in training neural networks, specifically in generative adversarial networks (GANs) and unsupervised domain adaptations (UDAs), through theoretical functional analysis.
methods: The paper uses gradient descent over infinite-dimensional spaces of continuous functions and probability measures to analyze the convergence property of the minimax problem. The authors also discuss various stabilization techniques, such as spectral normalization and gradient penalty, that are necessary for the convergence property.
results: The authors show that the conditions necessary for the convergence property are interpreted as stabilization techniques for adversarial training, providing a comprehensive understanding of GANs and UDAs.

Abstract
Training neural networks that require adversarial optimization, such as generative adversarial networks (GANs) and unsupervised domain adaptations (UDAs), suffers from instability. This instability problem comes from the difficulty of the minimax optimization, and there have been various approaches in GANs and UDAs to overcome this problem. In this study, we tackle this problem theoretically through a functional analysis. Specifically, we show the convergence property of the minimax problem by the gradient descent over the infinite-dimensional spaces of continuous functions and probability measures under certain conditions. Using this setting, we can discuss GANs and UDAs comprehensively, which have been studied independently. In addition, we show that the conditions necessary for the convergence property are interpreted as stabilization techniques of adversarial training such as the spectral normalization and the gradient penalty.

摘要
training neural networks that require adversarial optimization, such as generative adversarial networks (GANs) and unsupervised domain adaptations (UDAs), suffers from instability. this instability problem comes from the difficulty of the minimax optimization, and there have been various approaches in GANs and UDAs to overcome this problem. in this study, we tackle this problem theoretically through a functional analysis. specifically, we show the convergence property of the minimax problem by the gradient descent over the infinite-dimensional spaces of continuous functions and probability measures under certain conditions. using this setting, we can discuss GANs and UDAs comprehensively, which have been studied independently. in addition, we show that the conditions necessary for the convergence property are interpreted as stabilization techniques of adversarial training such as the spectral normalization and the gradient penalty.Note that the translation is in Simplified Chinese, which is the standard form of Chinese used in mainland China and Singapore. If you prefer Traditional Chinese, I can provide that as well.

Noisy probing dose facilitated dose prediction for pencil beam scanning proton therapy: physics enhances generalizability

paper_url: http://arxiv.org/abs/2312.00975
repo_url: None
paper_authors: Lian Zhang, Jason M. Holmes, Zhengliang Liu, Hongying Feng, Terence T. Sio, Carlos E. Vargas, Sameer R. Keole, Kristin Stützer, Sheng Li, Tianming Liu, Jiajian Shen, William W. Wong, Sujay A. Vora, Wei Liu
for: 这个研究的目的是设计一个能够考虑物理下的AI-based PBSPT剂量预测方法，以提高对特殊诊断情况的普遍化能力。
methods: 这个研究评估了三种方法：ROI-based方法、beam mask和滑动窗口方法，以及噪声探针剂量方法。
results: 噪声探针剂量方法在评估测验案例中表现出更高的DVH指标、3D Gamma passing rate和 dice coefficient相符度，并且在6个特殊诊断情况中表现出更好的普遍化能力。

Abstract
Purpose: Prior AI-based dose prediction studies in photon and proton therapy often neglect underlying physics, limiting their generalizability to handle outlier clinical cases, especially for pencil beam scanning proton therapy (PBSPT). Our aim is to design a physics-aware and generalizable AI-based PBSPT dose prediction method that has the underlying physics considered to achieve high generalizability to properly handle the outlier clinical cases. Methods and Materials: This study analyzed PBSPT plans of 103 prostate and 78 lung cancer patients from our institution,with each case comprising CT images, structure sets, and plan doses from our Monte-Carlo dose engine (serving as the ground truth). Three methods were evaluated in the ablation study: the ROI-based method, the beam mask and sliding window method, and the noisy probing dose method. Twelve cases with uncommon beam angles or prescription doses tested the methods' generalizability to rare treatment planning scenarios. Performance evaluation used DVH indices, 3D Gamma passing rates (3%/2mm/10%), and dice coefficients for dose agreement. Results: The noisy probing dose method showed improved agreement of DVH indices, 3D Gamma passing rates, and dice coefficients compared to the conventional methods for the testing cases. The noisy probing dose method showed better generalizability in the 6 outlier cases than the ROI-based and beam mask-based methods with 3D Gamma passing rates (for prostate cancer, targets: 89.32%$\pm$1.45% vs. 93.48%$\pm$1.51% vs. 96.79%$\pm$0.83%, OARs: 85.87%$\pm$1.73% vs. 91.15%$\pm$1.13% vs. 94.29%$\pm$1.01%). The dose predictions were completed within 0.3 seconds. Conclusions: We've devised a novel noisy probing dose method for PBSPT dose prediction in prostate and lung cancer patients. With more physics included, it enhances the generalizability of dose prediction in handling outlier clinical cases.

摘要
目的：早期基于人工智能的荷量预测研究 frequently neglect 背景物理，限制其普遍性，特别是 для笔直扫辐射治疗 (PBSPT) 的异常临床 caso。我们的目标是设计一种包含物理的AI基于PBSPT荷量预测方法，以高度普遍性来处理异常临床 caso。方法和材料：这种研究分析了我们机构103例肾癌和78例肺癌病人的PBSPT计划，每个案例包括CT图像、结构集和计划剂量，这些计划剂量由我们的Monte Carlo剂量引擎服务为真实值。研究中评估了三种方法：ROI基于方法、扫描窗口方法和噪声探测剂量方法。测试案例包括12个不寻常的辐射角度或荷量预测场景。表现评估使用DVH指标、3DGamma通过率（3%/2mm/10%）和 dice coefficient for dose agreement。结果：噪声探测剂量方法在测试案例中表现出了与 conventinal 方法相比的DVH指标、3DGamma通过率和 dice coefficient 的更好一致性。噪声探测剂量方法在6个异常 caso 中的普遍性比 ROI基于方法和扫描窗口方法更高，3DGamma通过率（肾癌目标：89.32%±1.45% vs. 93.48%±1.51% vs. 96.79%±0.83%; OARs：85.87%±1.73% vs. 91.15%±1.13% vs. 94.29%±1.01%）。计划时间为0.3秒。结论：我们提出了一种基于PBSPT肾癌和肺癌普遍性的噪声探测剂量方法。通过更多的物理包含，该方法可以提高荷量预测普遍性，更好地处理异常临床 caso。

2023-12-02

A deep learning pipeline for cross-sectional and longitudinal multiview data integration

Evetac: An Event-based Optical Tactile Sensor for Robotic Manipulation

Can We Learn Communication-Efficient Optimizers?

Learning High-Order Relationships of Brain Regions

Distributed Bayesian Estimation in Sensor Networks: Consensus on Marginal Densities

When accurate prediction models yield harmful self-fulfilling prophecies

Short-term Precipitation Forecasting in The Netherlands: An Application of Convolutional LSTM neural networks to weather radar data

Exploring a Hybrid Deep Learning Framework to Automatically Discover Topic and Sentiment in COVID-19 Tweets

On-sensor Printed Machine Learning Classification via Bespoke ADC and Decision Tree Co-Design

Fast and Robust Sparsity-Aware Block Diagonal Representation

$t^3$-Variational Autoencoder: Learning Heavy-tailed Data with Student’s t and Power Divergence

Virtual reservoir acceleration for CPU and GPU: Case study for coupled spin-torque oscillator reservoir

Cancer Subtype Identification through Integrating Inter and Intra Dataset Relationships in Multi-Omics Data

Strong Duality Relations in Nonconvex Risk-Constrained Learning

Rapid Speaker Adaptation in Low Resource Text to Speech Systems using Synthetic Data and Transfer learning

Code-Mixed Text to Speech Synthesis under Low-Resource Constraints

Predicting Postoperative Nausea And Vomiting Using Machine Learning: A Model Development and Validation Study

A Semi-Supervised Deep Learning Approach to Dataset Collection for Query-By-Humming Task

A New Random Reshuffling Method for Nonsmooth Nonconvex Finite-sum Optimization

Bagged Regularized $k$-Distances for Anomaly Detection

Quantifying Hippocampal Shape Asymmetry in Alzheimer’s Disease Using Optimal Shape Correspondences

RNN-BOF: A Multivariate Global Recurrent Neural Network for Binary Outcome Forecasting of Inpatient Aggression

Advanced Language Model-Driven Verilog Development: Enhancing Power, Performance, and Area Optimization in Code Synthesis

Data-Driven Autoencoder Numerical Solver with Uncertainty Quantification for Fast Physical Simulations

ResNLS: An Improved Model for Stock Price Forecasting

Generating Images of the M87* Black Hole Using GANs

Second-Order Uncertainty Quantification: A Distance-Based Approach

Improving Normative Modeling for Multi-modal Neuroimaging Data using mixture-of-product-of-experts variational autoencoders

Convergences for Minimax Optimization Problems over Infinite-Dimensional Spaces Towards Stability in Adversarial Training

Noisy probing dose facilitated dose prediction for pencil beam scanning proton therapy: physics enhances generalizability