cs.LG - 2023-08-02

Careful Whisper – leveraging advances in automatic speech recognition for robust and interpretable aphasia subtype classification

  • paper_url: http://arxiv.org/abs/2308.01327
  • repo_url: None
  • paper_authors: Laurin Wagner, Mario Zusag, Theresa Bloder
  • for: Presents a fully automated approach for identifying speech anomalies from voice recordings to aid in the assessment of speech impairments such as aphasia.
  • methods: Combines Connectionist Temporal Classification (CTC) and encoder-decoder automatic speech recognition models to generate rich acoustic and clean transcripts; natural language processing methods then extract features from these transcripts to build prototypes of healthy speech, and basic distance measures from those prototypes feed standard machine learning classifiers (a sketch of this last step follows the abstract).
  • results: The distance features distinguish recordings of people with aphasia from a healthy control group with human-level accuracy, and the most frequently occurring aphasia types can be distinguished with 90% accuracy; the pipeline is directly applicable to other diseases and languages, showing promise for robustly extracting diagnostic speech biomarkers.
    Abstract This paper presents a fully automated approach for identifying speech anomalies from voice recordings to aid in the assessment of speech impairments. By combining Connectionist Temporal Classification (CTC) and encoder-decoder-based automatic speech recognition models, we generate rich acoustic and clean transcripts. We then apply several natural language processing methods to extract features from these transcripts to produce prototypes of healthy speech. Basic distance measures from these prototypes serve as input features for standard machine learning classifiers, yielding human-level accuracy for the distinction between recordings of people with aphasia and a healthy control group. Furthermore, the most frequently occurring aphasia types can be distinguished with 90% accuracy. The pipeline is directly applicable to other diseases and languages, showing promise for robustly extracting diagnostic speech biomarkers.
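
The last stage of the pipeline, basic distance measures from healthy-speech prototypes used as input features for standard machine learning classifiers, can be illustrated with a minimal sketch. The synthetic features, the mean-vector prototype, and the choice of classifier here are illustrative assumptions rather than the authors' exact setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical transcript-derived feature vectors (e.g., speech rate, pause
# statistics, lexical diversity), one row per recording.
X_healthy = rng.normal(0.0, 1.0, size=(100, 8))   # healthy controls
X_patient = rng.normal(0.8, 1.2, size=(60, 8))    # recordings with aphasia

# Prototype of healthy speech: the mean feature vector of the controls.
prototype = X_healthy.mean(axis=0)

def distance_features(X, prototype):
    """Basic distance measures from the healthy-speech prototype."""
    diff = X - prototype
    euclid = np.linalg.norm(diff, axis=1, keepdims=True)
    manhattan = np.abs(diff).sum(axis=1, keepdims=True)
    return np.hstack([euclid, manhattan, diff])

X = np.vstack([distance_features(X_healthy, prototype),
               distance_features(X_patient, prototype)])
y = np.array([0] * len(X_healthy) + [1] * len(X_patient))

clf = LogisticRegression(max_iter=1000)
print(cross_val_score(clf, X, y, cv=5).mean())
```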

Do Multilingual Language Models Think Better in English?

  • paper_url: http://arxiv.org/abs/2308.01223
  • repo_url: https://github.com/juletx/self-translate
  • paper_authors: Julen Etxaniz, Gorka Azkune, Aitor Soroa, Oier Lopez de Lacalle, Mikel Artetxe
  • for: Aims to improve the performance of multilingual language models by introducing self-translate, a new approach that removes the need for an external machine translation system.
  • methods: Leverages the few-shot translation capabilities of multilingual language models so that the model itself translates the input into English before inference is run over the translated input; performance is evaluated over 5 tasks (a prompt-level sketch follows the abstract).
  • results: Self-translate consistently outperforms direct inference, showing that multilingual language models cannot leverage their full potential when prompted in non-English languages.
    Abstract Translate-test is a popular technique to improve the performance of multilingual language models. This approach works by translating the input into English using an external machine translation system, and running inference over the translated input. However, these improvements can be attributed to the use of a separate translation system, which is typically trained on large amounts of parallel data not seen by the language model. In this work, we introduce a new approach called self-translate, which overcomes the need of an external translation system by leveraging the few-shot translation capabilities of multilingual language models. Experiments over 5 tasks show that self-translate consistently outperforms direct inference, demonstrating that language models are unable to leverage their full multilingual potential when prompted in non-English languages. Our code is available at https://github.com/juletx/self-translate.
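
The self-translate idea can be sketched at the prompt level: the same multilingual model first translates the input into English via few-shot prompting, then answers the task over its own translation. The `generate` function is a placeholder for any LLM completion call, and the prompt templates and example pair below are illustrative, not the authors' exact templates.

```python
def generate(prompt: str) -> str:
    """Placeholder for a call to a multilingual language model."""
    raise NotImplementedError

FEW_SHOT_TRANSLATION = (
    "Basque: Zein da Frantziako hiriburua?\n"
    "English: What is the capital of France?\n"
)

def self_translate_answer(non_english_input: str) -> str:
    # Step 1: few-shot translation with the model itself (no external MT system).
    translation_prompt = (
        FEW_SHOT_TRANSLATION + f"Basque: {non_english_input}\nEnglish:"
    )
    english_input = generate(translation_prompt).strip()

    # Step 2: run inference over the translated input.
    task_prompt = f"Question: {english_input}\nAnswer:"
    return generate(task_prompt).strip()
```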

Calibration in Deep Learning: A Survey of the State-of-the-Art

  • paper_url: http://arxiv.org/abs/2308.01222
  • repo_url: None
  • paper_authors: Cheng Wang
  • for: This paper is written for researchers and practitioners who are interested in calibrating deep neural models for safety-critical applications.
  • methods: The paper reviews state-of-the-art calibration methods for deep models, including post-hoc calibration, regularization methods, uncertainty estimation, and composition methods.
  • results: The paper provides an understanding of the principles of model calibration, including the definition of model miscalibration and the key metrics for measuring it (one of these, the expected calibration error, is sketched after the abstract). It also covers recent advancements in calibrating large models, particularly large language models (LLMs).
    Abstract Calibrating deep neural models plays an important role in building reliable, robust AI systems in safety-critical applications. Recent work has shown that modern neural networks that possess high predictive capability are poorly calibrated and produce unreliable model predictions. Though deep learning models achieve remarkable performance on various benchmarks, the study of model calibration and reliability is relatively underexplored. Ideal deep models should have not only high predictive performance but also be well calibrated. There have been some recent methods proposed to calibrate deep models by using different mechanisms. In this survey, we review the state-of-the-art calibration methods and provide an understanding of their principles for performing model calibration. First, we start with the definition of model calibration and explain the root causes of model miscalibration. Then we introduce the key metrics that can measure this aspect. It is followed by a summary of calibration methods that we roughly classified into four categories: post-hoc calibration, regularization methods, uncertainty estimation, and composition methods. We also covered some recent advancements in calibrating large models, particularly large language models (LLMs). Finally, we discuss some open issues, challenges, and potential directions.
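
One of the standard miscalibration metrics the survey refers to is the expected calibration error (ECE), which bins predictions by confidence and compares average confidence with accuracy inside each bin. A minimal sketch follows; the equal-width binning and bin count are common defaults, not prescribed by the survey.

```python
import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=15):
    """ECE: weighted average of |accuracy - confidence| over confidence bins."""
    confidences = np.asarray(confidences)
    correct = (np.asarray(predictions) == np.asarray(labels)).astype(float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            acc = correct[in_bin].mean()        # empirical accuracy in the bin
            conf = confidences[in_bin].mean()   # average predicted confidence
            ece += in_bin.mean() * abs(acc - conf)
    return ece

# Example: over-confident predictions yield a non-zero ECE.
conf = np.array([0.9, 0.8, 0.95, 0.7, 0.85])
pred = np.array([1, 0, 1, 1, 0])
true = np.array([1, 1, 1, 0, 0])
print(expected_calibration_error(conf, pred, true, n_bins=5))
```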

Using ScrutinAI for Visual Inspection of DNN Performance in a Medical Use Case

  • paper_url: http://arxiv.org/abs/2308.01220
  • repo_url: None
  • paper_authors: Rebekka Görge, Elena Haedecke, Michael Mock
  • for: Investigates how model performance is affected by labeling quality, particularly in medical settings where generating high-quality labels requires in-depth expert knowledge and is very costly.
  • methods: Uses the Visual Analytics tool ScrutinAI to analyse how label variations between different experts affect model performance, supporting a root cause analysis that separates weaknesses of deep neural network (DNN) models caused by varying or missing labeling quality from true weaknesses.
  • results: On a publicly available dataset, the analysis scrutinizes the overall detection of intracranial hemorrhages and the more subtle differentiation between subtypes, showing how ScrutinAI distinguishes label-induced weaknesses from genuine model weaknesses.
    Abstract Our Visual Analytics (VA) tool ScrutinAI supports human analysts to investigate interactively model performance and data sets. Model performance depends on labeling quality to a large extent. In particular in medical settings, generation of high quality labels requires in depth expert knowledge and is very costly. Often, data sets are labeled by collecting opinions of groups of experts. We use our VA tool to analyse the influence of label variations between different experts on the model performance. ScrutinAI facilitates to perform a root cause analysis that distinguishes weaknesses of deep neural network (DNN) models caused by varying or missing labeling quality from true weaknesses. We scrutinize the overall detection of intracranial hemorrhages and the more subtle differentiation between subtypes in a publicly available data set.

Global Hierarchical Neural Networks using Hierarchical Softmax

  • paper_url: http://arxiv.org/abs/2308.01210
  • repo_url: https://github.com/jschuurmans/hsoftmax
  • paper_authors: Jetze Schuurmans, Flavius Frasincar
  • for: Proposes a framework in which hierarchical softmax is used to create a global hierarchical classifier, applicable to any classification task with a natural hierarchy among classes.
  • methods: Builds the global classifier with a hierarchical softmax output layer, which factorizes each class probability along the class hierarchy, and compares it against a flat classifier with a regular softmax on four text classification datasets (a sketch of the two-level factorization follows the abstract).
  • results: On all four datasets the hierarchical softmax improves on the regular softmax of the flat classifier in terms of macro-F1 and macro-recall, and on three of the four datasets it also achieves higher micro-accuracy and macro-precision.
    Abstract This paper presents a framework in which hierarchical softmax is used to create a global hierarchical classifier. The approach is applicable for any classification task where there is a natural hierarchy among classes. We show empirical results on four text classification datasets. In all datasets the hierarchical softmax improved on the regular softmax used in a flat classifier in terms of macro-F1 and macro-recall. In three out of four datasets hierarchical softmax achieved a higher micro-accuracy and macro-precision.
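
A minimal sketch of the two-level factorization behind hierarchical softmax: the probability of a leaf class is the product of the probability of its group and the probability of the class within that group. The tiny network and class hierarchy below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLevelHierarchicalSoftmax(nn.Module):
    """P(class | x) = P(group | x) * P(class | group, x) for a two-level hierarchy."""

    def __init__(self, in_dim, groups):
        super().__init__()
        self.groups = groups                              # class counts per group
        self.group_head = nn.Linear(in_dim, len(groups))  # logits over groups
        self.leaf_heads = nn.ModuleList(nn.Linear(in_dim, g) for g in groups)

    def forward(self, x):
        log_p_group = F.log_softmax(self.group_head(x), dim=-1)
        log_probs = []
        for g, head in enumerate(self.leaf_heads):
            log_p_leaf = F.log_softmax(head(x), dim=-1)          # within-group
            log_probs.append(log_p_group[:, g:g + 1] + log_p_leaf)
        return torch.cat(log_probs, dim=-1)  # log P over all leaf classes

# Example: 3 groups with 2, 3 and 4 classes respectively (9 classes total).
model = TwoLevelHierarchicalSoftmax(in_dim=16, groups=[2, 3, 4])
x = torch.randn(5, 16)
log_p = model(x)
print(log_p.exp().sum(dim=-1))  # sums to 1 per example
```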

Generative Noisy-Label Learning by Implicit Discriminative Approximation with Partial Label Prior

  • paper_url: http://arxiv.org/abs/2308.01184
  • repo_url: None
  • paper_authors: Fengbei Liu, Yuanhong Chen, Chong Wang, Yuyuan Liu, Gustavo Carneiro
  • for: addresses the problem of learning with noisy labels, proposing a new generative approach that improves the estimation of the label transition matrix and disentangles clean and noisy labels.
  • methods: uses a new model optimization that directly associates data and clean labels, and implicitly estimates the generative model using a discriminative model, eliminating the need for inefficient training of a generative model.
  • results: achieves state-of-the-art results on several noisy-label benchmarks while maintaining a similar computational complexity as discriminative models.
    Abstract The learning with noisy labels has been addressed with both discriminative and generative models. Although discriminative models have dominated the field due to their simpler modeling and more efficient computational training processes, generative models offer a more effective means of disentangling clean and noisy labels and improving the estimation of the label transition matrix. However, generative approaches maximize the joint likelihood of noisy labels and data using a complex formulation that only indirectly optimizes the model of interest associating data and clean labels. Additionally, these approaches rely on generative models that are challenging to train and tend to use uninformative clean label priors. In this paper, we propose a new generative noisy-label learning approach that addresses these three issues. First, we propose a new model optimisation that directly associates data and clean labels. Second, the generative model is implicitly estimated using a discriminative model, eliminating the inefficient training of a generative model. Third, we propose a new informative label prior inspired by partial label learning as supervision signal for noisy label learning. Extensive experiments on several noisy-label benchmarks demonstrate that our generative model provides state-of-the-art results while maintaining a similar computational complexity as discriminative models.

Direct Gradient Temporal Difference Learning

  • paper_url: http://arxiv.org/abs/2308.01170
  • repo_url: None
  • paper_authors: Xiaochi Qian, Shangtong Zhang
  • for: This paper focuses on addressing the instability issue in off-policy learning with function approximation and bootstrapping, known as the “deadly triad” in reinforcement learning.
  • methods: The proposed method uses two samples in a Markovian data stream with an increasing gap to directly solve the double sampling issue, without the need for extra weights or Fenchel duality.
  • results: The proposed algorithm is computationally efficient and has a convergence rate on par with the canonical on-policy temporal difference learning, as demonstrated through both asymptotic and finite sample analysis. Additionally, the method only requires a logarithmically increasing memory as time progresses.
    Abstract Off-policy learning enables a reinforcement learning (RL) agent to reason counterfactually about policies that are not executed and is one of the most important ideas in RL. It, however, can lead to instability when combined with function approximation and bootstrapping, two arguably indispensable ingredients for large-scale reinforcement learning. This is the notorious deadly triad. Gradient Temporal Difference (GTD) is one powerful tool to solve the deadly triad. Its success results from solving a double sampling issue indirectly with weight duplication or Fenchel duality. In this paper, we instead propose a direct method to solve the double sampling issue by simply using two samples in a Markovian data stream with an increasing gap. The resulting algorithm is as computationally efficient as GTD but gets rid of GTD's extra weights. The only price we pay is a logarithmically increasing memory as time progresses. We provide both asymptotic and finite sample analysis, where the convergence rate is on-par with the canonical on-policy temporal difference learning. Key to our analysis is a novel refined discretization of limiting ODEs.

Machine Learning-Based Diabetes Detection Using Photoplethysmography Signal Features

  • paper_url: http://arxiv.org/abs/2308.01930
  • repo_url: None
  • paper_authors: Filipe A. C. Oliveira, Felipe M. Dias, Marcelo A. F. Toledo, Diego A. C. Cardenas, Douglas A. Almeida, Estela Ribeiro, Jose E. Krieger, Marco A. Gutierrez
  • for: Develops a method for detecting diabetes based on non-invasive optical photoplethysmography (PPG).
  • methods: Trains Logistic Regression (LR) and eXtreme Gradient Boosting (XGBoost) classifiers on PPG signals and metadata to distinguish non-diabetic from diabetic patients, using five-fold cross-validation in which no patient appears in both the training and test folds (a sketch of this grouped split follows the abstract).
  • results: The models achieve an F1-score and AUC of 58.8±20.0% and 79.2±15.0% for LR, and 51.7±16.5% and 73.6±17.0% for XGBoost; feature analysis suggests that PPG morphological features contain diabetes-related information alongside the metadata.
    Abstract Diabetes is a prevalent chronic condition that compromises the health of millions of people worldwide. Minimally invasive methods are needed to prevent and control diabetes but most devices for measuring glucose levels are invasive and not amenable for continuous monitoring. Here, we present an alternative method to overcome these shortcomings based on non-invasive optical photoplethysmography (PPG) for detecting diabetes. We classify non-Diabetic and Diabetic patients using the PPG signal and metadata for training Logistic Regression (LR) and eXtreme Gradient Boosting (XGBoost) algorithms. We used PPG signals from a publicly available dataset. To prevent overfitting, we divided the data into five folds for cross-validation. By ensuring that patients in the training set are not in the testing set, the model's performance can be evaluated on unseen subjects' data, providing a more accurate assessment of its generalization. Our model achieved an F1-Score and AUC of $58.8\pm20.0\%$ and $79.2\pm15.0\%$ for LR and $51.7\pm16.5\%$ and $73.6\pm17.0\%$ for XGBoost, respectively. Feature analysis suggested that PPG morphological features contains diabetes-related information alongside metadata. Our findings are within the same range reported in the literature, indicating that machine learning methods are promising for developing remote, non-invasive, and continuous measurement devices for detecting and preventing diabetes.
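
The patient-disjoint cross-validation described in the abstract corresponds to grouped k-fold splitting, where folds are formed by patient identifier so that no subject contributes data to both training and test sets. A minimal sketch with synthetic placeholder features (not the dataset or features of the study):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
n_patients, windows_per_patient = 50, 10

# Hypothetical PPG-derived features: several signal windows per patient.
groups = np.repeat(np.arange(n_patients), windows_per_patient)
y_patient = np.arange(n_patients) % 2                 # balanced toy labels per patient
y = y_patient[groups]
X = rng.normal(size=(len(y), 6)) + y[:, None] * 0.5   # weakly informative toy features

clf = LogisticRegression(max_iter=1000)
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups):
    # Patients in the test fold never appear in the training fold.
    clf.fit(X[train_idx], y[train_idx])
    prob = clf.predict_proba(X[test_idx])[:, 1]
    pred = (prob > 0.5).astype(int)
    print(f1_score(y[test_idx], pred), roc_auc_score(y[test_idx], prob))
```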

LLMs Understand Glass-Box Models, Discover Surprises, and Suggest Repairs

  • paper_url: http://arxiv.org/abs/2308.01157
  • repo_url: https://github.com/interpretml/talktoebm
  • paper_authors: Benjamin J. Lengerich, Sebastian Bordt, Harsha Nori, Mark E. Nunnally, Yin Aphinyanaphongs, Manolis Kellis, Rich Caruana
  • for: Explores how large language models (LLMs) can be combined with interpretable glass-box models to automate common tasks in data science.
  • methods: Adopts a hierarchical approach to reasoning over models whose complex outcomes decompose into univariate graph-represented components, so that the LLM can produce model-level summaries without the entire model ever fitting in context (an illustrative sketch of passing one such component to an LLM follows the abstract).
  • results: Demonstrates these capabilities on multiple healthcare examples, with particular emphasis on Generalized Additive Models (GAMs): detecting anomalies that contradict prior knowledge, describing potential reasons for them, and suggesting repairs; the open-source LLM-GAM interface $\texttt{TalkToEBM}$ is released.
    Abstract We show that large language models (LLMs) are remarkably good at working with interpretable models that decompose complex outcomes into univariate graph-represented components. By adopting a hierarchical approach to reasoning, LLMs can provide comprehensive model-level summaries without ever requiring the entire model to fit in context. This approach enables LLMs to apply their extensive background knowledge to automate common tasks in data science such as detecting anomalies that contradict prior knowledge, describing potential reasons for the anomalies, and suggesting repairs that would remove the anomalies. We use multiple examples in healthcare to demonstrate the utility of these new capabilities of LLMs, with particular emphasis on Generalized Additive Models (GAMs). Finally, we present the package $\texttt{TalkToEBM}$ as an open-source LLM-GAM interface.
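
The core idea, passing one univariate component of a glass-box model to an LLM at a time rather than the whole model, can be sketched as follows. The graph serialization and prompt wording are illustrative assumptions and not the $\texttt{TalkToEBM}$ API; see the linked repository for the actual interface.

```python
# Sketch: serialize a single univariate GAM shape function into text so that an
# LLM can be asked whether it contradicts prior (e.g., clinical) knowledge.
# The component below is a hypothetical example, not output of a real model.

component = {
    "feature": "age",
    "bins": [(20, 40), (40, 60), (60, 80), (80, 100)],
    "contribution_to_risk": [-0.30, -0.05, 0.20, -0.40],  # note the drop at 80+
}

def component_to_prompt(comp: dict) -> str:
    lines = [f"Feature: {comp['feature']}",
             "Contribution to predicted risk by range:"]
    for (lo, hi), score in zip(comp["bins"], comp["contribution_to_risk"]):
        lines.append(f"  {lo}-{hi}: {score:+.2f}")
    lines.append(
        "Does this graph contradict domain knowledge? If so, describe the "
        "anomaly, a plausible cause, and a repair that would remove it."
    )
    return "\n".join(lines)

prompt = component_to_prompt(component)
print(prompt)  # send `prompt` to any LLM completion endpoint
```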

A Transformer-based Prediction Method for Depth of Anesthesia During Target-controlled Infusion of Propofol and Remifentanil

  • paper_url: http://arxiv.org/abs/2308.01929
  • repo_url: https://github.com/heeeyk/transformer-doa-prediction
  • paper_authors: Yongkang He, Siyuan Peng, Mingjin Chen, Zhijing Yang, Yuanhui Chen
  • for: Accurate prediction of anesthetic effects is essential for target-controlled infusion systems; traditional PK-PD models require manual selection of model parameters, which is challenging in clinical settings, and recent deep learning methods tend to capture only general trends and may miss abrupt changes in the depth of anesthesia.
  • methods: Proposes a transformer-based method for predicting the depth of anesthesia (DOA) from propofol and remifentanil infusions, using long short-term memory (LSTM) and gate residual network (GRN) modules to improve feature-fusion efficiency, an attention mechanism to discover drug interactions, and label distribution smoothing plus reweighting losses to address data imbalance.
  • results: The proposed method outperforms traditional PK-PD models and previous deep learning methods, effectively predicting anesthetic depth under sudden and deep anesthesia conditions.
    Abstract Accurately predicting anesthetic effects is essential for target-controlled infusion systems. The traditional (PK-PD) models for Bispectral index (BIS) prediction require manual selection of model parameters, which can be challenging in clinical settings. Recently proposed deep learning methods can only capture general trends and may not predict abrupt changes in BIS. To address these issues, we propose a transformer-based method for predicting the depth of anesthesia (DOA) using drug infusions of propofol and remifentanil. Our method employs long short-term memory (LSTM) and gate residual network (GRN) networks to improve the efficiency of feature fusion and applies an attention mechanism to discover the interactions between the drugs. We also use label distribution smoothing and reweighting losses to address data imbalance. Experimental results show that our proposed method outperforms traditional PK-PD models and previous deep learning methods, effectively predicting anesthetic depth under sudden and deep anesthesia conditions.

DySTreSS: Dynamically Scaled Temperature in Self-Supervised Contrastive Learning

  • paper_url: http://arxiv.org/abs/2308.01140
  • repo_url: None
  • paper_authors: Siladittya Manna, Soumitri Chattopadhyay, Rakesh Dey, Saumik Bhattacharya, Umapada Pal
  • for: Improves the performance of the InfoNCE loss in self-supervised learning (SSL) by studying the effect of the temperature hyper-parameter.
  • methods: Proposes a cosine-similarity-dependent temperature scaling function and analyses uniformity and tolerance metrics to find the regions of the cosine-similarity space that best optimize the distribution of samples in feature space; the behavior of local and global structures during pre-training is also examined as the temperature varies (a sketch of a similarity-dependent temperature follows the abstract).
  • results: Experiments show the proposed framework (DySTreSS) outperforms or is on par with contrastive-loss-based SSL algorithms, and the authors position the work as a foundation for future research on temperature scaling in contrastive learning.
    Abstract In contemporary self-supervised contrastive algorithms like SimCLR, MoCo, etc., the task of balancing attraction between two semantically similar samples and repulsion between two samples from different classes is primarily affected by the presence of hard negative samples. While the InfoNCE loss has been shown to impose penalties based on hardness, the temperature hyper-parameter is the key to regulating the penalties and the trade-off between uniformity and tolerance. In this work, we focus our attention to improve the performance of InfoNCE loss in SSL by studying the effect of temperature hyper-parameter values. We propose a cosine similarity-dependent temperature scaling function to effectively optimize the distribution of the samples in the feature space. We further analyze the uniformity and tolerance metrics to investigate the optimal regions in the cosine similarity space for better optimization. Additionally, we offer a comprehensive examination of the behavior of local and global structures in the feature space throughout the pre-training phase, as the temperature varies. Experimental evidence shows that the proposed framework outperforms or is at par with the contrastive loss-based SSL algorithms. We believe our work (DySTreSS) on temperature scaling in SSL provides a foundation for future research in contrastive learning.
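
A minimal sketch of an InfoNCE-style loss whose temperature depends on the cosine similarity of each pair. The linear mapping from similarity to temperature is an illustrative placeholder, not the paper's DySTreSS scaling function.

```python
import torch
import torch.nn.functional as F

def info_nce_dynamic_temperature(z1, z2, tau_min=0.1, tau_max=0.5):
    """InfoNCE loss where each pair's logit is scaled by a similarity-dependent
    temperature. Positives are the matching rows of z1 and z2."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.t()                                   # cosine similarities in [-1, 1]
    # Map similarity to a per-pair temperature (placeholder: higher sim -> higher tau).
    tau = tau_min + (tau_max - tau_min) * (sim.detach() + 1.0) / 2.0
    logits = sim / tau
    targets = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    return F.cross_entropy(logits, targets)

z1 = torch.randn(8, 32, requires_grad=True)
z2 = torch.randn(8, 32, requires_grad=True)
loss = info_nce_dynamic_temperature(z1, z2)
loss.backward()
print(float(loss))
```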

Dynamic Privacy Allocation for Locally Differentially Private Federated Learning with Composite Objectives

  • paper_url: http://arxiv.org/abs/2308.01139
  • repo_url: None
  • paper_authors: Jiaojiao Zhang, Dominik Fay, Mikael Johansson
  • for: Proposes a locally differentially private federated learning algorithm for strongly convex but possibly nonsmooth problems that protects each worker's gradients against an honest-but-curious server.
  • methods: Adds artificial noise to the shared information to ensure privacy and dynamically allocates a time-varying noise variance that minimizes an upper bound on the optimization error subject to a predefined privacy budget constraint (a per-round noise-injection sketch follows the abstract).
  • results: Numerical results show that the proposed algorithm outperforms state-of-the-art methods, achieving both privacy protection and utility up to a neighborhood of the optimal solution without the need to tune the number of iterations.
    Abstract This paper proposes a locally differentially private federated learning algorithm for strongly convex but possibly nonsmooth problems that protects the gradients of each worker against an honest but curious server. The proposed algorithm adds artificial noise to the shared information to ensure privacy and dynamically allocates the time-varying noise variance to minimize an upper bound of the optimization error subject to a predefined privacy budget constraint. This allows for an arbitrarily large but finite number of iterations to achieve both privacy protection and utility up to a neighborhood of the optimal solution, removing the need for tuning the number of iterations. Numerical results show the superiority of the proposed algorithm over state-of-the-art methods.
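
The general mechanism, each worker perturbing what it shares with round-dependent noise before the server aggregates, can be sketched as below. The quadratic local objectives, the clipping step, and the decaying noise schedule are illustrative placeholders; the paper instead optimizes the per-round variance to minimize an error bound under the privacy budget.

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, dim, n_rounds = 5, 10, 50
clip_norm, lr = 1.0, 0.1

# Hypothetical local objectives f_i(x) = 0.5 * ||x - c_i||^2 (strongly convex).
centers = rng.normal(size=(n_workers, dim))
x = np.zeros(dim)

def noise_std(t):
    """Placeholder time-varying noise schedule (not the paper's allocation)."""
    return 0.5 / np.sqrt(t + 1)

for t in range(n_rounds):
    noisy_grads = []
    for i in range(n_workers):
        g = x - centers[i]                                  # local gradient
        g = g / max(1.0, np.linalg.norm(g) / clip_norm)     # clip for bounded sensitivity
        g += rng.normal(scale=noise_std(t), size=dim)       # local Gaussian perturbation
        noisy_grads.append(g)                               # only the noisy gradient is shared
    x -= lr * np.mean(noisy_grads, axis=0)                  # server aggregates

print(np.linalg.norm(x - centers.mean(axis=0)))  # distance to the true optimum
```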

Can We Transfer Noise Patterns? A Multi-environment Spectrum Analysis Model Using Generated Cases

  • paper_url: http://arxiv.org/abs/2308.01138
  • repo_url: https://github.com/magnomic/cnst
  • paper_authors: Haiwen Du, Zheng Ju, Yu An, Honghui Du, Dongjie Zhu, Zhaoshuo Tian, Aonghus Lawlor, Ruihai Dong
  • for: Improves spectrum analysis systems for online water quality testing, which detect the types and concentrations of pollutants and enable regulatory agencies to respond promptly to pollution incidents.
  • methods: Proposes a noise-pattern transferring model that takes spectra of standard water samples recorded in different environments as cases, learns the differences in their noise patterns, and transfers those patterns to unknown samples; a sample-to-sample case base is generated to exclude the interference of sample-level baseline noise on dataset-level noise learning.
  • results: Experiments on spectral data with different background noises show good noise-transferring ability compared with baseline systems ranging from wavelet denoising to deep neural networks and generative models.
    Abstract Spectrum analysis systems in online water quality testing are designed to detect types and concentrations of pollutants and enable regulatory agencies to respond promptly to pollution incidents. However, spectral data-based testing devices suffer from complex noise patterns when deployed in non-laboratory environments. To make the analysis model applicable to more environments, we propose a noise patterns transferring model, which takes the spectrum of standard water samples in different environments as cases and learns the differences in their noise patterns, thus enabling noise patterns to transfer to unknown samples. Unfortunately, the inevitable sample-level baseline noise makes the model unable to obtain the paired data that only differ in dataset-level environmental noise. To address the problem, we generate a sample-to-sample case-base to exclude the interference of sample-level noise on dataset-level noise learning, enhancing the system's learning performance. Experiments on spectral data with different background noises demonstrate the good noise-transferring ability of the proposed method against baseline systems ranging from wavelet denoising, deep neural networks, and generative models. From this research, we posit that our method can enhance the performance of DL models by generating high-quality cases. The source code is made publicly available online at https://github.com/Magnomic/CNST.

Multi-task learning for classification, segmentation, reconstruction, and detection on chest CT scans

  • paper_url: http://arxiv.org/abs/2308.01137
  • repo_url: None
  • paper_authors: Weronika Hryniewska-Guzik, Maria Kędzierska, Przemysław Biecek
  • for: Aims to support the diagnosis of lung cancer and COVID-19, where identifying lesions on chest CT scans in the early stages of disease is difficult and time-consuming for physicians.
  • methods: Uses multi-task learning, combining classification, segmentation, reconstruction, and detection, to extract important features such as lesions from small amounts of medical data and generalize better; two different backbones and different loss functions for the segmentation task are also examined.
  • results: Reports good results across classification, segmentation, reconstruction, and detection, and is, to the authors' knowledge, the first work to add detection to such a multi-task solution.
    Abstract Lung cancer and covid-19 have one of the highest morbidity and mortality rates in the world. For physicians, the identification of lesions is difficult in the early stages of the disease and time-consuming. Therefore, multi-task learning is an approach to extracting important features, such as lesions, from small amounts of medical data because it learns to generalize better. We propose a novel multi-task framework for classification, segmentation, reconstruction, and detection. To the best of our knowledge, we are the first ones who added detection to the multi-task solution. Additionally, we checked the possibility of using two different backbones and different loss functions in the segmentation task.

Unlearning Spurious Correlations in Chest X-ray Classification

  • paper_url: http://arxiv.org/abs/2308.01119
  • repo_url: None
  • paper_authors: Misgina Tsighe Hagos, Kathleen M. Curran, Brian Mac Namee
  • for: Aims to improve the reliability and transparency of medical image classification models by addressing spurious correlations introduced when training datasets are derived from multiple data sources.
  • methods: Trains a deep learning model on a COVID-19 chest X-ray dataset and applies eXplanation Based Learning (XBL), which integrates interactive user feedback in the form of feature annotations to unlearn spurious correlations; two non-demanding manual feedback mechanisms are employed.
  • results: Shows that the XBL-based approach effectively eliminates the spurious correlations, underscoring its potential for constructing robust models even in the presence of confounding factors.
    Abstract Medical image classification models are frequently trained using training datasets derived from multiple data sources. While leveraging multiple data sources is crucial for achieving model generalization, it is important to acknowledge that the diverse nature of these sources inherently introduces unintended confounders and other challenges that can impact both model accuracy and transparency. A notable confounding factor in medical image classification, particularly in musculoskeletal image classification, is skeletal maturation-induced bone growth observed during adolescence. We train a deep learning model using a Covid-19 chest X-ray dataset and we showcase how this dataset can lead to spurious correlations due to unintended confounding regions. eXplanation Based Learning (XBL) is a deep learning approach that goes beyond interpretability by utilizing model explanations to interactively unlearn spurious correlations. This is achieved by integrating interactive user feedback, specifically feature annotations. In our study, we employed two non-demanding manual feedback mechanisms to implement an XBL-based approach for effectively eliminating these spurious correlations. Our results underscore the promising potential of XBL in constructing robust models even in the presence of confounding factors.

A Survey on Popularity Bias in Recommender Systems

  • paper_url: http://arxiv.org/abs/2308.01118
  • repo_url: None
  • paper_authors: Anastasiia Klimashevskaia, Dietmar Jannach, Mehdi Elahi, Christoph Trattner
  • for: Surveys popularity bias in recommender systems and how it can be detected, quantified, and mitigated.
  • methods: Reviews the potential reasons why today's recommendation algorithms focus on popular items, the computational metrics used in the literature to measure popularity bias, and the main technical approaches to reduce it.
  • results: Finds that existing recommendation algorithms often exhibit a popularity bias, which can limit the value of recommendations for consumers and providers in the short run and cause undesired reinforcement effects over time; the survey also notes critically that the research is almost entirely based on computational experiments and on certain assumptions about the practical effects of recommending long-tail items.
    Abstract Recommender systems help people find relevant content in a personalized way. One main promise of such systems is that they are able to increase the visibility of items in the long tail, i.e., the lesser-known items in a catalogue. Existing research, however, suggests that in many situations today's recommendation algorithms instead exhibit a popularity bias, meaning that they often focus on rather popular items in their recommendations. Such a bias may not only lead to limited value of the recommendations for consumers and providers in the short run, but it may also cause undesired reinforcement effects over time. In this paper, we discuss the potential reasons for popularity bias and we review existing approaches to detect, quantify and mitigate popularity bias in recommender systems. Our survey therefore includes both an overview of the computational metrics used in the literature as well as a review of the main technical approaches to reduce the bias. We furthermore critically discuss today's literature, where we observe that the research is almost entirely based on computational experiments and on certain assumptions regarding the practical effects of including long-tail items in the recommendations.

Spatio-Temporal Branching for Motion Prediction using Motion Increments

  • paper_url: http://arxiv.org/abs/2308.01097
  • repo_url: https://github.com/jasonwang959/stpmp
  • paper_authors: Jiexin Wang, Yujie Zhou, Wenwen Qiang, Ying Ba, Bing Su, Ji-Rong Wen
  • for: Human motion prediction (HMP) is a popular research topic, but it remains challenging because future poses are stochastic and aperiodic; traditional methods relying on hand-crafted features and machine learning often struggle to model the complex dynamics of human motion.
  • methods: Proposes a spatio-temporal branching network using incremental motion information that decouples the learning of temporal-domain and spatial-domain features of skeleton nodes, with knowledge distillation enabling complementary cross-domain knowledge learning.
  • results: On standard HMP benchmarks the method outperforms state-of-the-art approaches in prediction accuracy, effectively reducing noise interference and providing more expressive information for characterizing motion.
    Abstract Human motion prediction (HMP) has emerged as a popular research topic due to its diverse applications, but it remains a challenging task due to the stochastic and aperiodic nature of future poses. Traditional methods rely on hand-crafted features and machine learning techniques, which often struggle to model the complex dynamics of human motion. Recent deep learning-based methods have achieved success by learning spatio-temporal representations of motion, but these models often overlook the reliability of motion data. Additionally, the temporal and spatial dependencies of skeleton nodes are distinct. The temporal relationship captures motion information over time, while the spatial relationship describes body structure and the relationships between different nodes. In this paper, we propose a novel spatio-temporal branching network using incremental information for HMP, which decouples the learning of temporal-domain and spatial-domain features, extracts more motion information, and achieves complementary cross-domain knowledge learning through knowledge distillation. Our approach effectively reduces noise interference and provides more expressive information for characterizing motion by separately extracting temporal and spatial features. We evaluate our approach on standard HMP benchmarks and outperform state-of-the-art methods in terms of prediction accuracy.

Multi-variable Hard Physical Constraints for Climate Model Downscaling

  • paper_url: http://arxiv.org/abs/2308.01868
  • repo_url: None
  • paper_authors: Jose González-Abad, Álex Hernández-García, Paula Harder, David Rolnick, José Manuel Gutiérrez
  • for: Global Climate Models (GCMs) are the primary tool to simulate climate evolution and assess the impacts of climate change, but their coarse spatial resolution limits their accuracy in reproducing local-scale phenomena.
  • methods: Uses deep-learning-based statistical downscaling to approximate local-scale climate fields from coarse variables, and lays the foundation for a framework of multi-variable hard constraints that guarantees physical relationships between groups of jointly downscaled variables (an illustrative constraint sketch follows the abstract).
  • results: Shows that downscaling climate variables independently can violate fundamental physical properties across interconnected variables, and investigates the scope of this problem through an application on temperature.
    Abstract Global Climate Models (GCMs) are the primary tool to simulate climate evolution and assess the impacts of climate change. However, they often operate at a coarse spatial resolution that limits their accuracy in reproducing local-scale phenomena. Statistical downscaling methods leveraging deep learning offer a solution to this problem by approximating local-scale climate fields from coarse variables, thus enabling regional GCM projections. Typically, climate fields of different variables of interest are downscaled independently, resulting in violations of fundamental physical properties across interconnected variables. This study investigates the scope of this problem and, through an application on temperature, lays the foundation for a framework introducing multi-variable hard constraints that guarantees physical relationships between groups of downscaled climate variables.
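
As a purely illustrative example of a hard inter-variable constraint (the abstract does not specify the authors' formulation), one can enforce an ordering such as minimum ≤ mean ≤ maximum temperature by construction rather than by penalty:

```python
import torch
import torch.nn.functional as F

def constrain_temperature_fields(raw):
    """Map three unconstrained network outputs to (t_min, t_mean, t_max) fields
    that satisfy t_min <= t_mean <= t_max everywhere by construction.
    Illustrative assumption only; not the constraint layer of the paper."""
    base, delta_lo, delta_hi = raw.unbind(dim=1)     # raw: (batch, 3, H, W)
    t_mean = base
    t_min = t_mean - F.softplus(delta_lo)            # subtract a non-negative gap
    t_max = t_mean + F.softplus(delta_hi)            # add a non-negative gap
    return t_min, t_mean, t_max

raw = torch.randn(2, 3, 16, 16)                      # e.g. output of a downscaling CNN
t_min, t_mean, t_max = constrain_temperature_fields(raw)
print(bool((t_min <= t_mean).all() and (t_mean <= t_max).all()))  # True
```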

Homography Estimation in Complex Topological Scenes

  • paper_url: http://arxiv.org/abs/2308.01086
  • repo_url: None
  • paper_authors: Giacomo D’Amicantonio, Egor Bondarau, Peter H. N. De With
  • for: Proposes an automated camera re-calibration process for surveillance cameras, which are susceptible to environmental conditions and small camera movements.
  • methods: Uses a dictionary-based approach with a custom implementation of a Spatial Transformer Network (STN) and a novel topological loss function, requiring no prior knowledge of any camera settings.
  • results: Improves the IoU metric by up to 12% with respect to a state-of-the-art model across five synthetic datasets and the World Cup 2014 dataset.
    Abstract Surveillance videos and images are used for a broad set of applications, ranging from traffic analysis to crime detection. Extrinsic camera calibration data is important for most analysis applications. However, security cameras are susceptible to environmental conditions and small camera movements, resulting in a need for an automated re-calibration method that can account for these varying conditions. In this paper, we present an automated camera-calibration process leveraging a dictionary-based approach that does not require prior knowledge on any camera settings. The method consists of a custom implementation of a Spatial Transformer Network (STN) and a novel topological loss function. Experiments reveal that the proposed method improves the IoU metric by up to 12% w.r.t. a state-of-the-art model across five synthetic datasets and the World Cup 2014 dataset.

Data-Driven Identification of Quadratic Symplectic Representations of Nonlinear Hamiltonian Systems

  • paper_url: http://arxiv.org/abs/2308.01084
  • repo_url: None
  • paper_authors: Süleyman Yildiz, Pawan Goyal, Thomas Bendokat, Peter Benner
  • for: Learning Hamiltonian systems from data.
  • methods: Builds on the lifting hypothesis, under which nonlinear Hamiltonian systems can be written as systems with cubic Hamiltonians, yielding quadratic dynamics that are Hamiltonian in a transformed coordinate system; the Hamiltonian structure is enforced in combination with a symplectic autoencoder, with a higher-order transformed coordinate system for low-dimensional data and a lower-order one for high-dimensional data (the underlying structure is recalled after the abstract).
  • results: The enforced Hamiltonian structure yields long-term stability of the learned system while the cubic Hamiltonian keeps model complexity relatively low, as demonstrated on both low-dimensional and high-dimensional nonlinear Hamiltonian systems.
    Abstract We present a framework for learning Hamiltonian systems using data. This work is based on the lifting hypothesis, which posits that nonlinear Hamiltonian systems can be written as nonlinear systems with cubic Hamiltonians. By leveraging this, we obtain quadratic dynamics that are Hamiltonian in a transformed coordinate system. To that end, for given generalized position and momentum data, we propose a methodology to learn quadratic dynamical systems, enforcing the Hamiltonian structure in combination with a symplectic auto-encoder. The enforced Hamiltonian structure exhibits long-term stability of the system, while the cubic Hamiltonian function provides relatively low model complexity. For low-dimensional data, we determine a higher-order transformed coordinate system, whereas, for high-dimensional data, we find a lower-order coordinate system with the desired properties. We demonstrate the proposed methodology by means of both low-dimensional and high-dimensional nonlinear Hamiltonian systems.
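
For reference, the canonical Hamiltonian structure that the learned quadratic dynamics preserve is (standard form, restated here for context rather than taken from the paper):

$$
\dot{q} = \frac{\partial H}{\partial p}(q, p), \qquad \dot{p} = -\frac{\partial H}{\partial q}(q, p).
$$

If the transformed coordinates carry a cubic Hamiltonian $H$, its partial derivatives are quadratic, so the right-hand sides above, and hence the learned dynamics, are quadratic while remaining Hamiltonian.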

A Practical Deep Learning-Based Acoustic Side Channel Attack on Keyboards

  • paper_url: http://arxiv.org/abs/2308.01074
  • repo_url: https://github.com/JBFH-Dev/Keystroke-Datasets
  • paper_authors: Joshua Harrison, Ehsan Toreini, Maryam Mehrnezhad
  • for: Demonstrates the growing threat that acoustic side channel attacks pose to keyboards, given recent developments in deep learning, the ubiquity of microphones, and the rise of online services via personal devices, and discusses mitigation methods to protect users.
  • methods: Implements a state-of-the-art deep learning model to classify laptop keystrokes, using recordings from a smartphone-integrated microphone and from the video-conferencing software Zoom.
  • results: Achieves 95% accuracy on phone-recorded keystrokes, the highest seen without the use of a language model, and 93% accuracy on keystrokes recorded via Zoom, a new best for the medium.
    Abstract With recent developments in deep learning, the ubiquity of micro-phones and the rise in online services via personal devices, acoustic side channel attacks present a greater threat to keyboards than ever. This paper presents a practical implementation of a state-of-the-art deep learning model in order to classify laptop keystrokes, using a smartphone integrated microphone. When trained on keystrokes recorded by a nearby phone, the classifier achieved an accuracy of 95%, the highest accuracy seen without the use of a language model. When trained on keystrokes recorded using the video-conferencing software Zoom, an accuracy of 93% was achieved, a new best for the medium. Our results prove the practicality of these side channel attacks via off-the-shelf equipment and algorithms. We discuss a series of mitigation methods to protect users against these series of attacks.

Automatic Feature Engineering for Time Series Classification: Evaluation and Discussion

  • paper_url: http://arxiv.org/abs/2308.01071
  • repo_url: None
  • paper_authors: Aurélien Renault, Alexis Bondu, Vincent Lemaire, Dominique Gay
  • for: Evaluates the potential predictive performance of existing feature engineering tools on time series classification (TSC) problems.
  • methods: Benchmarks 11 feature engineering tools combined with 9 supervised classifiers over 112 time series datasets, amounting to more than 10,000 learning experiments (a minimal feature-based TSC pipeline is sketched after the abstract).
  • results: Feature-based methods perform as accurately as current state-of-the-art TSC algorithms and should therefore be considered further in the TSC literature.
    Abstract Time Series Classification (TSC) has received much attention in the past two decades and is still a crucial and challenging problem in data science and knowledge engineering. Indeed, along with the increasing availability of time series data, many TSC algorithms have been suggested by the research community in the literature. Besides state-of-the-art methods based on similarity measures, intervals, shapelets, dictionaries, deep learning methods or hybrid ensemble methods, several tools for extracting unsupervised informative summary statistics, aka features, from time series have been designed in the recent years. Originally designed for descriptive analysis and visualization of time series with informative and interpretable features, very few of these feature engineering tools have been benchmarked for TSC problems and compared with state-of-the-art TSC algorithms in terms of predictive performance. In this article, we aim at filling this gap and propose a simple TSC process to evaluate the potential predictive performance of the feature sets obtained with existing feature engineering tools. Thus, we present an empirical study of 11 feature engineering tools branched with 9 supervised classifiers over 112 time series data sets. The analysis of the results of more than 10000 learning experiments indicate that feature-based methods perform as accurately as current state-of-the-art TSC algorithms, and thus should rightfully be considered further in the TSC literature.
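
A minimal illustration of the feature-based TSC process being benchmarked: each series is summarized into a fixed-length feature vector, which a standard supervised classifier then consumes. The hand-picked statistics and toy data below merely stand in for the richer feature sets produced by the 11 benchmarked tools.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def summarize(series):
    """Fixed-length summary features for one series (a stand-in for dedicated
    feature-extraction tools)."""
    diffs = np.diff(series)
    return np.array([series.mean(), series.std(), series.min(), series.max(),
                     diffs.mean(), diffs.std(),
                     np.abs(np.fft.rfft(series)[1:4]).sum()])

# Toy dataset: class 0 = noisy sine waves, class 1 = noisy random walks.
t = np.linspace(0, 4 * np.pi, 128)
series_0 = [np.sin(t) + rng.normal(0, 0.3, t.size) for _ in range(50)]
series_1 = [np.cumsum(rng.normal(0, 0.2, t.size)) for _ in range(50)]

X = np.array([summarize(s) for s in series_0 + series_1])
y = np.array([0] * 50 + [1] * 50)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())
```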

When Analytic Calculus Cracks AdaBoost Code

  • paper_url: http://arxiv.org/abs/2308.01070
  • repo_url: None
  • paper_authors: Jean-Marc Brossier, Olivier Lafitte, Lenny Réthoré
  • for: Examines whether AdaBoost is truly an optimization algorithm and shows that its output can be written down explicitly.
  • methods: Applies analytic calculus to the boosting principle of combining multiple weak classifiers into a stronger one, considering a two-class problem illustrated with the particular case of three binary classifiers.
  • results: Shows that the resulting combination of weak classifiers can be calculated explicitly using a truth table rather than by iterative optimization, and compares the results with those from the AdaBoost implementation of the Python library scikit-learn (a truth-table sketch follows the abstract).
    Abstract The principle of boosting in supervised learning involves combining multiple weak classifiers to obtain a stronger classifier. AdaBoost has the reputation to be a perfect example of this approach. We have previously shown that AdaBoost is not truly an optimization algorithm. This paper shows that AdaBoost is an algorithm in name only, as the resulting combination of weak classifiers can be explicitly calculated using a truth table. This study is carried out by considering a problem with two classes and is illustrated by the particular case of three binary classifiers and presents results in comparison with those from the implementation of AdaBoost algorithm of the Python library scikit-learn.
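
The truth-table view for three binary weak classifiers can be sketched directly: with fixed classifier weights, the final boosted decision on any input depends only on the sign pattern of the three weak outputs, so all eight cases can be enumerated. The weights below are arbitrary placeholders, not values derived in the paper.

```python
from itertools import product

import numpy as np

# Illustrative weights alpha_m for three weak binary classifiers h_m in {-1, +1};
# the strong classifier is sign(sum_m alpha_m * h_m(x)).
alphas = np.array([0.9, 0.6, 0.4])

print("h1 h2 h3 | weighted sum | final")
for h in product([-1, 1], repeat=3):
    score = float(np.dot(alphas, h))
    final = 1 if score > 0 else -1
    print(f"{h[0]:+d} {h[1]:+d} {h[2]:+d} | {score:+.2f}        | {final:+d}")
```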

Graph Anomaly Detection at Group Level: A Topology Pattern Enhanced Unsupervised Approach

  • paper_url: http://arxiv.org/abs/2308.01063
  • repo_url: None
  • paper_authors: Xing Ai, Jialong Zhou, Yulin Zhu, Gaolei Li, Tomasz P. Michalak, Xiapu Luo, Kai Zhou
  • for: This paper focuses on the task of Group-level Graph Anomaly Detection (Gr-GAD), which aims to identify and localize anomalous groups within a graph.
  • methods: The proposed framework uses a variant of Graph AutoEncoder (GAE) to locate anchor nodes that belong to potential anomaly groups, and then employs group sampling and Topology Pattern-based Graph Contrastive Learning (TPGCL) to identify and localize anomaly groups.
  • results: The experimental results on both real-world and synthetic datasets demonstrate that the proposed framework shows superior performance in identifying and localizing anomaly groups, highlighting it as a promising solution for Gr-GAD.
    Abstract Graph anomaly detection (GAD) has achieved success and has been widely applied in various domains, such as fraud detection, cybersecurity, finance security, and biochemistry. However, existing graph anomaly detection algorithms focus on distinguishing individual entities (nodes or graphs) and overlook the possibility of anomalous groups within the graph. To address this limitation, this paper introduces a novel unsupervised framework for a new task called Group-level Graph Anomaly Detection (Gr-GAD). The proposed framework first employs a variant of Graph AutoEncoder (GAE) to locate anchor nodes that belong to potential anomaly groups by capturing long-range inconsistencies. Subsequently, group sampling is employed to sample candidate groups, which are then fed into the proposed Topology Pattern-based Graph Contrastive Learning (TPGCL) method. TPGCL utilizes the topology patterns of groups as clues to generate embeddings for each candidate group and thus distinct anomaly groups. The experimental results on both real-world and synthetic datasets demonstrate that the proposed framework shows superior performance in identifying and localizing anomaly groups, highlighting it as a promising solution for Gr-GAD. Datasets and codes of the proposed framework are at the github repository https://anonymous.4open.science/r/Topology-Pattern-Enhanced-Unsupervised-Group-level-Graph-Anomaly-Detection.

Simulation-based inference using surjective sequential neural likelihood estimation

  • paper_url: http://arxiv.org/abs/2308.01054
  • repo_url: https://github.com/dirmeier/ssnl
  • paper_authors: Simon Dirmeier, Carlo Albert, Fernando Perez-Cruz
  • for: Targets simulation-based inference in models where the likelihood function cannot be evaluated and only a simulator that can generate synthetic data is available.
  • methods: Introduces Surjective Sequential Neural Likelihood (SSNL) estimation, which fits a dimensionality-reducing surjective normalizing flow as a surrogate likelihood function, enabling conventional Bayesian inference via Markov chain Monte Carlo methods or variational inference.
  • results: Across a wide variety of experiments, SSNL generally outperforms contemporary simulation-based inference methods, including on a challenging real-world astrophysics example that models the magnetic field strength of the sun using a solar dynamo model.
    Abstract We present Surjective Sequential Neural Likelihood (SSNL) estimation, a novel method for simulation-based inference in models where the evaluation of the likelihood function is not tractable and only a simulator that can generate synthetic data is available. SSNL fits a dimensionality-reducing surjective normalizing flow model and uses it as a surrogate likelihood function which allows for conventional Bayesian inference using either Markov chain Monte Carlo methods or variational inference. By embedding the data in a low-dimensional space, SSNL solves several issues previous likelihood-based methods had when applied to high-dimensional data sets that, for instance, contain non-informative data dimensions or lie along a lower-dimensional manifold. We evaluate SSNL on a wide variety of experiments and show that it generally outperforms contemporary methods used in simulation-based inference, for instance, on a challenging real-world example from astrophysics which models the magnetic field strength of the sun using a solar dynamo model.

A Counterfactual Safety Margin Perspective on the Scoring of Autonomous Vehicles’ Riskiness

  • paper_url: http://arxiv.org/abs/2308.01050
  • repo_url: None
  • paper_authors: Alessandro Zanardi, Andrea Censi, Margherita Atzei, Luigi Di Lillo, Emilio Frazzoli
  • for: This paper aims to provide a data-driven framework for comparing the risk of different autonomous vehicles' (AVs) behaviors in various operational design domains (ODDs).
  • methods: The paper uses counterfactual simulations of "misbehaving" road users to assess the risk of AVs, introducing the concept of counterfactual safety margin: the minimum deviation from normal behavior that could lead to a collision. Through worst- and best-case analyses, the methodology is applicable even when the AV's behavioral policy is unknown, making it useful to external third-party risk assessors.
  • results: The experimental results demonstrate the correlation between the safety margin, the driving policy quality, and the ODD, shedding light on the relative risk associated with different AV providers. The work contributes to AV safety assessment and addresses legislative and insurance concerns surrounding this emerging technology.
    Abstract Autonomous Vehicles (AVs) have the potential to provide numerous societal benefits, such as decreased road accidents and increased overall transportation efficiency. However, quantifying the risk associated with AVs is challenging due to the lack of historical data and the rapidly evolving technology. This paper presents a data-driven framework for comparing the risk of different AVs' behaviors in various operational design domains (ODDs), based on counterfactual simulations of "misbehaving" road users. We introduce the concept of counterfactual safety margin, which represents the minimum deviation from normal behavior that could lead to a collision. This concept helps to find the most critical scenarios but also to assess the frequency and severity of risk of AVs. We show that the proposed methodology is applicable even when the AV's behavioral policy is unknown -- through worst- and best-case analyses -- making the method useful also to external third-party risk assessors. Our experimental results demonstrate the correlation between the safety margin, the driving policy quality, and the ODD shedding light on the relative risk associated with different AV providers. This work contributes to AV safety assessment and aids in addressing legislative and insurance concerns surrounding this emerging technology.

Are Easy Data Easy (for K-Means)

  • paper_url: http://arxiv.org/abs/2308.01926
  • repo_url: https://github.com/sayantann11/all-classification-templetes-for-ML
  • paper_authors: Mieczysław A. Kłopotek
  • for: Investigates whether various brands of the $k$-means algorithm can correctly recover well-separated clusters.
  • methods: Uses a notion of well-separatedness derived directly from the common definition of clusters, which imposes an interplay between within-cluster homogeneity and between-cluster diversity, and derives conditions for a special case of well-separated clusters such that the global minimum of the $k$-means cost function coincides with the well-separatedness.
  • results: Experiments show that the tested $k$-means variants are not capable of discovering well-separated clusters; a new algorithm, a variation of $k$-means++ that chooses seeds via repeated subsampling, outperforms four other algorithms from the $k$-means family on this task (a hedged sketch of such seeding follows the abstract).
    Abstract This paper investigates the capability of correctly recovering well-separated clusters by various brands of the $k$-means algorithm. The concept of well-separatedness used here is derived directly from the common definition of clusters, which imposes an interplay between the requirements of within-cluster-homogenicity and between-clusters-diversity. Conditions are derived for a special case of well-separated clusters such that the global minimum of $k$-means cost function coincides with the well-separatedness. An experimental investigation is performed to find out whether or not various brands of $k$-means are actually capable of discovering well separated clusters. It turns out that they are not. A new algorithm is proposed that is a variation of $k$-means++ via repeated subsampling when choosing a seed. The new algorithm outperforms four other algorithms from $k$-means family on the task.
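
Since the seeding procedure is only named in the abstract, the sketch below illustrates one plausible reading: run $k$-means++ seeding on several random subsamples of the data and keep the seed set with the lowest quantization cost on the full data. This is an assumption for illustration, not necessarily the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans_pp_seeds(X, k):
    """Standard k-means++ seeding: each new seed is drawn with probability
    proportional to the squared distance to the nearest existing seed."""
    seeds = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d2 = np.min(((X[:, None, :] - np.array(seeds)[None]) ** 2).sum(-1), axis=1)
        seeds.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(seeds)

def cost(X, seeds):
    return np.min(((X[:, None, :] - seeds[None]) ** 2).sum(-1), axis=1).sum()

def seeds_by_repeated_subsampling(X, k, n_repeats=10, frac=0.5):
    """Keep the k-means++ seed set, among several drawn on random subsamples,
    that has the lowest cost on the full data set."""
    best = None
    for _ in range(n_repeats):
        sub = X[rng.choice(len(X), size=int(frac * len(X)), replace=False)]
        cand = kmeans_pp_seeds(sub, k)
        if best is None or cost(X, cand) < cost(X, best):
            best = cand
    return best

# Toy data: three well-separated Gaussian blobs.
X = np.vstack([rng.normal(c, 0.3, size=(100, 2)) for c in ([0, 0], [5, 5], [10, 0])])
print(seeds_by_repeated_subsampling(X, k=3))
```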

Evaluation of network-guided random forest for disease gene discovery

  • paper_url: http://arxiv.org/abs/2308.01323
  • repo_url: None
  • paper_authors: Jianchang Hu, Silke Szymczak
  • for: Investigates whether the random forest (RF) algorithm can exploit gene network information to improve disease prediction from gene expression data.
  • methods: Summarizes the network information into sampling probabilities of predictor variables, which are then used in the construction of the random forest (a simplified sketch of network-weighted feature sampling follows the abstract).
  • results: Network-guided RF does not provide better disease prediction than the standard RF; for disease gene discovery, it identifies disease genes more accurately when they form modules, but spurious gene selection, especially of hub genes, can occur when disease status is independent of the genes in the given network. An analysis of two balanced TCGA breast cancer datasets (microarray and RNA-Seq) for classification of progesterone receptor (PR) status further shows that network-guided RF identifies genes from PGR-related pathways, leading to a better connected module of identified genes.
    Abstract Gene network information is believed to be beneficial for disease module and pathway identification, but has not been explicitly utilized in the standard random forest (RF) algorithm for gene expression data analysis. We investigate the performance of a network-guided RF where the network information is summarized into a sampling probability of predictor variables which is further used in the construction of the RF. Our results suggest that network-guided RF does not provide better disease prediction than the standard RF. In terms of disease gene discovery, if disease genes form module(s), network-guided RF identifies them more accurately. In addition, when disease status is independent from genes in the given network, spurious gene selection results can occur when using network information, especially on hub genes. Our empirical analysis on two balanced microarray and RNA-Seq breast cancer datasets from The Cancer Genome Atlas (TCGA) for classification of progesterone receptor (PR) status also demonstrates that network-guided RF can identify genes from PGR-related pathways, which leads to a better connected module of identified genes.
    摘要 基因网络信息被认为有助于疾病模块与通路的识别,但在用于基因表达数据分析的标准随机森林(RF)算法中尚未被显式利用。我们研究了一种网络引导的随机森林:将网络信息概括为预测变量的抽样概率,并将其用于随机森林的构建。结果表明,与标准 RF 相比,网络引导 RF 并不能提供更好的疾病预测。在疾病基因发现方面,如果疾病基因形成模块,那么网络引导 RF 能够更准确地识别它们。此外,当疾病状态与给定网络中的基因无关时,使用网络信息可能产生虚假的基因选择结果,尤其是对枢纽(hub)基因。我们对来自癌症基因组图谱(TCGA)的两个平衡的乳腺癌微阵列与 RNA-Seq 数据集进行了孕激素受体(PR)状态分类的实证分析,结果同样表明网络引导 RF 能够识别出 PGR 相关通路中的基因,从而得到连接性更好的已识别基因模块。
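
A minimal sketch of the core idea, turning a gene network into per-feature sampling probabilities, is given below; here the probability is simply proportional to node degree and is applied per tree, which is only an approximation of how the paper integrates the sampling probability into the RF construction itself.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def network_guided_forest(X, y, adjacency, n_trees=100, mtry=None, rng=None):
    """Toy network-guided random forest: features (genes) are drawn for each
    tree with probability proportional to their degree in a gene network.
    Assumes binary 0/1 labels; illustrative sketch only.
    """
    rng = np.random.default_rng(rng)
    p = X.shape[1]
    mtry = mtry or max(1, int(np.sqrt(p)))
    degree = adjacency.sum(axis=0) + 1.0        # +1 keeps isolated genes eligible
    probs = degree / degree.sum()
    forest = []
    for _ in range(n_trees):
        feats = rng.choice(p, size=mtry, replace=False, p=probs)
        boot = rng.integers(0, len(X), size=len(X))   # bootstrap sample of rows
        tree = DecisionTreeClassifier().fit(X[boot][:, feats], y[boot])
        forest.append((feats, tree))
    return forest

def forest_predict(forest, X):
    votes = np.mean([t.predict(X[:, f]) for f, t in forest], axis=0)
    return (votes >= 0.5).astype(int)
```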

Computing the Distance between unbalanced Distributions – The flat Metric

  • paper_url: http://arxiv.org/abs/2308.01039
  • repo_url: https://github.com/hs42/flat_metric
  • paper_authors: Henri Schmidt, Christian Düll
  • for: Computes the flat metric in any dimension for unbalanced optimal transport tasks and data analysis.
  • methods: Uses a neural network to determine an optimal test function for computing the distance between two given measures.
  • results: Achieves comparability of pairwise computed distances from independently trained networks and shows high quality output in experiments and simulations.
    Abstract We provide an implementation to compute the flat metric in any dimension. The flat metric, also called dual bounded Lipschitz distance, generalizes the well-known Wasserstein distance W1 to the case where the distributions are of unequal total mass. This is of particular interest for unbalanced optimal transport tasks and for the analysis of data distributions where the sample size is important or normalization is not possible. The core of the method is based on a neural network to determine an optimal test function realizing the distance between two given measures. Special focus was put on achieving comparability of pairwise computed distances from independently trained networks. We tested the quality of the output in several experiments where ground truth was available as well as with simulated data.
    摘要 我们提供了一个在任意维度下计算扁平度量(flat metric)的实现。扁平度量又称对偶有界 Lipschitz 距离,将著名的 Wasserstein 距离 W1 推广到两个分布总质量不相等的情形。这对不平衡最优传输任务以及样本量重要或无法归一化的数据分布分析尤为有意义。方法的核心是利用神经网络确定实现两个给定测度之间距离的最优测试函数。我们特别注重使独立训练的网络所计算的两两距离具有可比性,并在若干具有真实基准的实验以及模拟数据上检验了输出质量。
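
A toy sketch of the dual formulation follows: the flat metric is the supremum of the weighted mean difference of a test function that is bounded by 1 (here via `tanh`) and 1-Lipschitz. The Lipschitz constraint is only enforced softly through a gradient penalty, so this approximates the published implementation rather than reproducing it.

```python
import torch
import torch.nn as nn

class TestFn(nn.Module):
    """Small network representing the test function f with |f| <= 1 via tanh."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(),
                                 nn.Linear(64, 64), nn.ReLU(),
                                 nn.Linear(64, 1), nn.Tanh())
    def forward(self, x):
        return self.net(x).squeeze(-1)

def flat_metric(x_mu, w_mu, x_nu, w_nu, steps=2000, lam=10.0, lr=1e-3):
    """Rough estimate of the flat (dual bounded Lipschitz) metric between two
    weighted point clouds (total weights may differ). Illustrative sketch;
    the Lipschitz constraint is only approximated by a gradient penalty.
    """
    f = TestFn(x_mu.shape[1])
    opt = torch.optim.Adam(f.parameters(), lr=lr)
    for _ in range(steps):
        gap = (w_mu * f(x_mu)).sum() - (w_nu * f(x_nu)).sum()
        # soft penalty pushing the gradient norm of f below 1 everywhere sampled
        xi = torch.cat([x_mu, x_nu]).clone().requires_grad_(True)
        grad = torch.autograd.grad(f(xi).sum(), xi, create_graph=True)[0]
        penalty = ((grad.norm(dim=1) - 1).clamp(min=0) ** 2).mean()
        loss = -gap + lam * penalty
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        return float((w_mu * f(x_mu)).sum() - (w_nu * f(x_nu)).sum())
```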

Three Factors to Improve Out-of-Distribution Detection

  • paper_url: http://arxiv.org/abs/2308.01030
  • repo_url: None
  • paper_authors: Hyunjun Choi, JaeHo Chung, Hawook Jeong, Jin Young Choi
  • for: 改善分布外(OOD)检测性能与分类精度之间的权衡。
  • methods: 使用辅助数据作为异常数据进行微调,并提出三项贡献:自知识蒸馏损失、半难(semi-hard)异常样本选择,以及一种新的监督对比学习。
  • results: 三项贡献相结合,同时提升了 OOD 检测性能和分类精度,并在两类指标上均优于先前方法。
    Abstract In the problem of out-of-distribution (OOD) detection, the usage of auxiliary data as outlier data for fine-tuning has demonstrated encouraging performance. However, previous methods have suffered from a trade-off between classification accuracy (ACC) and OOD detection performance (AUROC, FPR, AUPR). To improve this trade-off, we make three contributions: (i) Incorporating a self-knowledge distillation loss can enhance the accuracy of the network; (ii) Sampling semi-hard outlier data for training can improve OOD detection performance with minimal impact on accuracy; (iii) The introduction of our novel supervised contrastive learning can simultaneously improve OOD detection performance and the accuracy of the network. By incorporating all three factors, our approach enhances both accuracy and OOD detection performance by addressing the trade-off between classification and OOD detection. Our method achieves improvements over previous approaches in both performance metrics.
    摘要 在分布外(OOD)检测问题中,使用辅助数据作为异常数据进行微调已展现出令人鼓舞的性能。然而,先前的方法在分类准确率(ACC)与 OOD 检测性能(AUROC、FPR、AUPR)之间存在权衡。为了改善这一权衡,我们提出三项贡献:(i) 引入自知识蒸馏损失可以提高网络的准确率;(ii) 采样半难(semi-hard)异常数据进行训练可以在几乎不影响准确率的情况下提升 OOD 检测性能;(iii) 我们提出的新型监督对比学习可以同时提升 OOD 检测性能和网络的准确率。通过结合这三个因素,我们的方法同时改善了准确率与 OOD 检测性能,缓解了分类与 OOD 检测之间的权衡。我们的方法在两类性能指标上均优于先前方法。

Maximizing Success Rate of Payment Routing using Non-stationary Bandits

  • paper_url: http://arxiv.org/abs/2308.01028
  • repo_url: None
  • paper_authors: Aayush Chaudhary, Abhinav Rai, Abhishek Gupta
  • for: 该论文旨在设计和部署非平稳多臂老虎机(non-stationary multi-armed bandit)方法,基于近期交易历史确定近似最优的支付路由策略。
  • methods: 该论文提出了一种 Routing Service 架构,采用一种新颖的基于 Ray 的实现,将基于老虎机的支付路由扩展到每秒超过 10000 笔交易,并满足系统设计要求与支付卡行业数据安全标准(PCI DSS)的生态约束;同时在自定义模拟器上对多种非平稳老虎机方法进行了评估与比较。
  • results: 线上实验结果显示,与传统基于规则的方法相比,非平稳老虎机方法在一个月内将支付交易的成功率提升了 0.92%。
    Abstract This paper discusses the system architecture design and deployment of non-stationary multi-armed bandit approaches to determine a near-optimal payment routing policy based on the recent history of transactions. We propose a Routing Service architecture using a novel Ray-based implementation for optimally scaling bandit-based payment routing to over 10000 transactions per second, adhering to the system design requirements and ecosystem constraints with Payment Card Industry Data Security Standard (PCI DSS). We first evaluate the effectiveness of multiple bandit-based payment routing algorithms on a custom simulator to benchmark multiple non-stationary bandit approaches and identify the best hyperparameters. We then conducted live experiments on the payment transaction system on a fantasy sports platform Dream11. In the live experiments, we demonstrated that our non-stationary bandit-based algorithm consistently improves the success rate of transactions by 0.92\% compared to the traditional rule-based methods over one month.
    摘要 First, we evaluated the effectiveness of multiple bandit-based payment routing algorithms on a custom simulator to benchmark different non-stationary bandit approaches and identify the best hyperparameters. We then conducted live experiments on a real-world payment transaction system on a fantasy sports platform Dream11, demonstrating that our non-stationary bandit-based algorithm consistently improves the success rate of transactions by 0.92% compared to traditional rule-based methods over a one-month period.
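
For illustration, the sketch below implements one common non-stationary bandit, discounted Beta-Bernoulli Thompson sampling over payment gateways; the paper benchmarks several such policies, so treat the discount factor and the Bernoulli reward model as assumptions rather than the deployed algorithm.

```python
import numpy as np

class DiscountedThompsonRouter:
    """Discounted Beta-Bernoulli Thompson sampling over payment gateways.
    The discount factor gradually forgets stale history so the policy can
    track drifting gateway success rates. Illustrative sketch only.
    """
    def __init__(self, n_gateways, gamma=0.99, rng=None):
        self.alpha = np.ones(n_gateways)   # pseudo-counts of successes
        self.beta = np.ones(n_gateways)    # pseudo-counts of failures
        self.gamma = gamma
        self.rng = np.random.default_rng(rng)

    def route(self):
        samples = self.rng.beta(self.alpha, self.beta)
        return int(np.argmax(samples))     # gateway with the best sampled rate

    def update(self, gateway, success):
        self.alpha *= self.gamma           # discount all arms
        self.beta *= self.gamma
        self.alpha[gateway] += float(success)
        self.beta[gateway] += 1.0 - float(success)


# Toy usage: gateway 1's success rate drifts halfway through.
router = DiscountedThompsonRouter(n_gateways=3, rng=0)
for t in range(10_000):
    g = router.route()
    p = [0.90, 0.95 if t < 5000 else 0.80, 0.85][g]
    router.update(g, np.random.rand() < p)
```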

Enhancing Representation Learning for Periodic Time Series with Floss: A Frequency Domain Regularization Approach

  • paper_url: http://arxiv.org/abs/2308.01011
  • repo_url: https://github.com/agustdd/floss
  • paper_authors: Chunwei Yang, Xiaoxu Chen, Lijun Sun, Hongyu Yang, Yuankai Wu
  • for: 这篇论文旨在提出一种无监督的方法,以帮助深度学习模型更好地处理具有周期性或假周期性特征的时间序列数据。
  • methods: 这篇论文提出了一种名为Floss的方法,它可以自动在频域中调整学习的表现。Floss方法首先自动检测时间序列中的主要周期性,然后使用周期性移动和频谱浓度相似度度量来学习有意义的表现。
  • results: 在实验中,Floss方法能够自动发现时间序列中的周期性,并且与其他深度学习模型相比,提高了时间序列分类、预测和偏常检测等任务的表现。
    Abstract Time series analysis is a fundamental task in various application domains, and deep learning approaches have demonstrated remarkable performance in this area. However, many real-world time series data exhibit significant periodic or quasi-periodic dynamics that are often not adequately captured by existing deep learning-based solutions. This results in an incomplete representation of the underlying dynamic behaviors of interest. To address this gap, we propose an unsupervised method called Floss that automatically regularizes learned representations in the frequency domain. The Floss method first automatically detects major periodicities from the time series. It then employs periodic shift and spectral density similarity measures to learn meaningful representations with periodic consistency. In addition, Floss can be easily incorporated into both supervised, semi-supervised, and unsupervised learning frameworks. We conduct extensive experiments on common time series classification, forecasting, and anomaly detection tasks to demonstrate the effectiveness of Floss. We incorporate Floss into several representative deep learning solutions to justify our design choices and demonstrate that it is capable of automatically discovering periodic dynamics and improving state-of-the-art deep learning models.
    摘要 时序分析是多种应用领域的基础任务,深度学习方法在这个领域表现了惊人的表现。然而,许多实际世界时序数据具有重要的周期或准周期动态特征,这些特征经常不被现有的深度学习基础方法完全捕捉。这导致了时序动态行为的下面表示不够完整。为解决这个差距,我们提出了一种无监督的方法 called Floss,该方法可以自动在频率域中规范学习的表示。Floss方法首先自动检测时序数据中的主要周期性。然后,它使用周期偏移和频率分布相似度度量来学习具有周期一致性的有意义表示。此外,Floss可以轻松地integrated到supervised、semi-supervised和无监督学习框架中。我们在常见的时序分类、预测和异常检测任务中进行了广泛的实验,以证明Floss的有效性。我们将Floss incorporated into 多种代表性的深度学习解决方案,以证明我们的设计选择是合理的,并证明Floss可以自动发现周期动态和提高当前最佳深度学习模型。
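
The sketch below illustrates the two ingredients described above: a dominant period estimated from the amplitude spectrum, and a regularizer that asks the spectral density of the learned representation to match between a series and its shift by one period. It assumes an encoder mapping (B, T) inputs to (B, T, C) features and is not the released Floss implementation.

```python
import torch

def dominant_period(x):
    """Estimate the dominant period of a batch of series x: (B, T) from the
    peak of the amplitude spectrum (ignoring the DC component)."""
    spec = torch.fft.rfft(x, dim=-1).abs()
    k = spec[:, 1:].argmax(dim=-1) + 1           # dominant frequency bin
    return (x.shape[-1] / k.float()).round().long()

def floss_like_regularizer(encoder, x):
    """Frequency-domain consistency in the spirit of Floss: representations of
    a series and of its shift by one dominant period should have similar
    spectral densities. Added to the main task loss; illustrative sketch.
    """
    period = dominant_period(x)                  # (B,)
    x_shift = torch.stack([torch.roll(xi, int(p)) for xi, p in zip(x, period)])
    z, z_shift = encoder(x), encoder(x_shift)    # assumed shape (B, T, C)
    s = torch.fft.rfft(z, dim=1).abs()           # spectral density along time
    s_shift = torch.fft.rfft(z_shift, dim=1).abs()
    return ((s - s_shift) ** 2).mean()
```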

MDT3D: Multi-Dataset Training for LiDAR 3D Object Detection Generalization

  • paper_url: http://arxiv.org/abs/2308.01000
  • repo_url: None
  • paper_authors: Louis Soum-Fontez, Jean-Emmanuel Deschaud, François Goulette
  • for: 这个论文的目标是提高 3D 物体检测模型在新环境和不同传感器配置下的稳健性。
  • methods: 该方法使用多个有标注的源数据集进行联合训练,并使用基于粗粒度标签的新标签映射来弥合标签差异;此外还提出了训练期间混合数据集的处理方式,以及一种跨数据集目标注入(cross-dataset object injection)增广技术,见下方示意代码。
  • results: 研究表明,该方法可以提高不同类型的 3D 物体检测模型的泛化性能。源代码与更多结果将公开在 GitHub 上以供参考。
    Abstract Supervised 3D Object Detection models have been displaying increasingly better performance in single-domain cases where the training data comes from the same environment and sensor as the testing data. However, in real-world scenarios data from the target domain may not be available for finetuning or for domain adaptation methods. Indeed, 3D object detection models trained on a source dataset with a specific point distribution have shown difficulties in generalizing to unseen datasets. Therefore, we decided to leverage the information available from several annotated source datasets with our Multi-Dataset Training for 3D Object Detection (MDT3D) method to increase the robustness of 3D object detection models when tested in a new environment with a different sensor configuration. To tackle the labelling gap between datasets, we used a new label mapping based on coarse labels. Furthermore, we show how we managed the mix of datasets during training and finally introduce a new cross-dataset augmentation method: cross-dataset object injection. We demonstrate that this training paradigm shows improvements for different types of 3D object detection models. The source code and additional results for this research project will be publicly available on GitHub for interested parties to access and utilize: https://github.com/LouisSF/MDT3D
    摘要 受监督3D物体检测模型在单个频道情况下的性能有所提高,但在实际应用场景中,测试数据的频道可能不同于训练数据的频道。实际上,通过特定点分布训练的3D物体检测模型在未看过的数据集上generalization能力很差。因此,我们使用多个注释源数据集的多数据集训练方法(MDT3D)来增强3D物体检测模型在新环境中的Robustness。为了解决不同数据集之间的标签差距,我们使用了新的标签映射基于粗略标签。此外,我们详细介绍了在训练过程中如何处理多个数据集的混合,以及一种新的跨数据集增强方法:跨数据集物体注入。我们展示了这种训练方法在不同类型的3D物体检测模型中的改进。许多相关结果和代码将在GitHub上公开,以便有兴趣的人可以访问和利用:https://github.com/LouisSF/MDT3D。
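
The cross-dataset label gap is bridged with a coarse label mapping; a toy version is sketched below with made-up class names, not the exact taxonomy used in MDT3D.

```python
# Hypothetical coarse-label mapping: dataset-specific classes are folded into
# shared coarse categories so that annotations from different LiDAR datasets
# can be used jointly during training. Class names are illustrative only.
COARSE_LABELS = {
    "car": "vehicle",
    "van": "vehicle",
    "truck": "vehicle",
    "tram": "vehicle",
    "pedestrian": "pedestrian",
    "person_sitting": "pedestrian",
    "bicycle": "cyclist",
    "cyclist": "cyclist",
}

def to_coarse(label: str) -> str:
    """Map a dataset-specific class name to its coarse category."""
    return COARSE_LABELS.get(label.lower(), "ignore")


print(to_coarse("Van"), to_coarse("person_sitting"))   # vehicle pedestrian
```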

Exploiting Synthetic Data for Data Imbalance Problems: Baselines from a Data Perspective

  • paper_url: http://arxiv.org/abs/2308.00994
  • repo_url: None
  • paper_authors: Moon Ye-Bin, Nam Hyeon-Woo, Wonseok Choi, Nayeong Kim, Suha Kwak, Tae-Hyun Oh
  • for: Addressing data imbalance problems in deep neural networks to prevent biased predictions and potential ethical and social consequences.
  • methods: Utilizes synthetic data as a preliminary step before employing task-specific algorithms to address data imbalance problems.
  • results: Surpasses the performance of existing task-specific methods on several datasets, including CIFAR100-LT, ImageNet100-LT, UTKFace, and Waterbird.
    Abstract We live in a vast ocean of data, and deep neural networks are no exception to this. However, this data exhibits an inherent phenomenon of imbalance. This imbalance poses a risk of deep neural networks producing biased predictions, leading to potentially severe ethical and social consequences. To address these challenges, we believe that the use of generative models is a promising approach for comprehending tasks, given the remarkable advancements demonstrated by recent diffusion models in generating high-quality images. In this work, we propose a simple yet effective baseline, SYNAuG, that utilizes synthetic data as a preliminary step before employing task-specific algorithms to address data imbalance problems. This straightforward approach yields impressive performance on datasets such as CIFAR100-LT, ImageNet100-LT, UTKFace, and Waterbird, surpassing the performance of existing task-specific methods. While we do not claim that our approach serves as a complete solution to the problem of data imbalance, we argue that supplementing the existing data with synthetic data proves to be an effective and crucial preliminary step in addressing data imbalance concerns.
    摘要 我们生活在一个庞大的数据海洋中,而深度神经网络也不例外。然而,这些数据具有内生的不均衡现象,这可能导致深度神经网络预测结果受到偏见,从而导致严重的伦理和社会后果。为了解决这些挑战,我们认为使用生成模型是一个有前途的方法,因为最近的扩散模型在生成高质量图像方面已经展现出了卓越的成果。在这个工作中,我们提出了一个简单 yet有效的基eline,即SYNAuG,它利用生成的数据作为先决步骤,然后使用任务特定的算法来解决数据不均衡问题。这种简单的方法在CIFAR100-LT、ImageNet100-LT、UTKFace和Waterbird等数据集上达到了比较出色的性能,超过了现有的任务特定方法的性能。虽然我们不assert我们的方法是数据不均衡问题的完整解决方案,但我们 argue that在使用现有数据之前,通过生成数据来增加数据量是一个有效和关键的预liminary步骤。
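
The data-side step is straightforward: decide how many synthetic samples each class needs so that the combined real-plus-synthetic set is balanced, then generate them with an off-the-shelf generative model and apply the task-specific algorithm afterwards. A minimal sketch of the quota computation (generation itself omitted) follows.

```python
from collections import Counter

def synthetic_quota(labels, target="uniform"):
    """How many synthetic samples to generate per class so that the combined
    real+synthetic set is balanced. Minimal sketch of the data-side step;
    the generative model and downstream algorithm are out of scope here.
    """
    counts = Counter(labels)
    cap = max(counts.values()) if target == "uniform" else int(target)
    return {cls: max(0, cap - n) for cls, n in counts.items()}


# e.g. a long-tailed toy label list
print(synthetic_quota(["cat"] * 500 + ["dog"] * 50 + ["bird"] * 5))
# {'cat': 0, 'dog': 450, 'bird': 495}
```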

Wasserstein Diversity-Enriched Regularizer for Hierarchical Reinforcement Learning

  • paper_url: http://arxiv.org/abs/2308.00989
  • repo_url: None
  • paper_authors: Haorui Li, Jiaqi Liang, Linjing Li, Daniel Zeng
  • for: 本论文旨在解决分层强化学习中通过组合子策略完成复杂任务时的子策略多样性退化问题。
  • methods: 论文提出了一种名为 Wasserstein Diversity-Enriched Regularizer (WDER) 的与任务无关的新正则化器,通过最大化动作分布之间的 Wasserstein 距离来扩大子策略的多样性,可以方便地加入现有方法的损失函数中以进一步提升性能。
  • results: 实验结果表明,我们的 WDER 可以提高性能和样本效率,与优化参数无关,这表明了我们的方法的可应用性和稳定性。
    Abstract Hierarchical reinforcement learning composites subpolicies in different hierarchies to accomplish complex tasks.Automated subpolicies discovery, which does not depend on domain knowledge, is a promising approach to generating subpolicies.However, the degradation problem is a challenge that existing methods can hardly deal with due to the lack of consideration of diversity or the employment of weak regularizers. In this paper, we propose a novel task-agnostic regularizer called the Wasserstein Diversity-Enriched Regularizer (WDER), which enlarges the diversity of subpolicies by maximizing the Wasserstein distances among action distributions. The proposed WDER can be easily incorporated into the loss function of existing methods to boost their performance further.Experimental results demonstrate that our WDER improves performance and sample efficiency in comparison with prior work without modifying hyperparameters, which indicates the applicability and robustness of the WDER.
    摘要 In this paper, we propose a new task-agnostic regularizer called the Wasserstein Diversity-Enriched Regularizer (WDER), which increases the diversity of subpolicies by maximizing the Wasserstein distances among action distributions. The WDER can be easily incorporated into the loss function of existing methods to improve their performance further.Experimental results show that our WDER improves performance and sample efficiency compared to prior work, without modifying hyperparameters. This indicates the applicability and robustness of the WDER.
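
A minimal sketch of the regularizer's shape is given below: pairwise 1-D Wasserstein distances between action samples drawn from the subpolicies, returned with a negative sign so that minimizing the total loss maximizes diversity. The sampling-based estimate and the scalar-action assumption are simplifications of how the paper plugs the term into the hierarchical RL objective.

```python
import itertools
import numpy as np
from scipy.stats import wasserstein_distance

def wder_penalty(action_samples, coef=0.1):
    """Diversity-enriched term in the spirit of WDER: sum of pairwise 1-D
    Wasserstein distances between action samples of each subpolicy, returned
    as a negative penalty so that minimizing the loss maximizes diversity.
    """
    pairs = itertools.combinations(range(len(action_samples)), 2)
    total = sum(wasserstein_distance(action_samples[i], action_samples[j])
                for i, j in pairs)
    return -coef * total


# Three subpolicies sampling scalar actions; more separated -> more negative penalty
rng = np.random.default_rng(0)
acts = [rng.normal(m, 0.5, size=256) for m in (0.0, 0.1, 2.0)]
print(wder_penalty(acts))
```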

Learning Regionalization within a Differentiable High-Resolution Hydrological Model using Accurate Spatial Cost Gradients

  • paper_url: http://arxiv.org/abs/2308.02040
  • repo_url: None
  • paper_authors: Ngo Nghi Truyen Huynh, Pierre-André Garambois, François Colleoni, Benjamin Renard, Hélène Roux, Julie Demargne, Pierre Javelle
  • for: 这个论文主要是用来解决无数据的流域中 hydrological parameter 的估计问题,并在各个流域中寻找一个转换函数来将物理描述符与概念模型参数进行量化关系。
  • methods: 这篇论文提出了一种 Hybrid Data Assimilation and Parameter Regionalization (HDA-PR) 方法,它将 learnable regionalization mappings integrate 到了一个可微的ydrological model 中,以便在各个流域中使用不同类型的数据进行数据协同填充。
  • results: 在南法两个暴雨灾害地区进行了高分辨率、小时间和千米级别的地理模型运算,并得到了很好的 regionalization 性能,Nash-Sutcliffe 效率 (NSE) 分布在0.52-0.78之间,相比基线模型 calibrated WITH lumped parameters 提高了0.57的 NSE 分布。
    Abstract Estimating spatially distributed hydrological parameters in ungauged catchments poses a challenging regionalization problem and requires imposing spatial constraints given the sparsity of discharge data. A possible approach is to search for a transfer function that quantitatively relates physical descriptors to conceptual model parameters. This paper introduces a Hybrid Data Assimilation and Parameter Regionalization (HDA-PR) approach incorporating learnable regionalization mappings, based on either multivariate regressions or neural networks, into a differentiable hydrological model. It enables the exploitation of heterogeneous datasets across extensive spatio-temporal computational domains within a high-dimensional regionalization context, using accurate adjoint-based gradients. The inverse problem is tackled with a multi-gauge calibration cost function accounting for information from multiple observation sites. HDA-PR was tested on high-resolution, hourly and kilometric regional modeling of two flash-flood-prone areas located in the South of France. In both study areas, the median Nash-Sutcliffe efficiency (NSE) scores ranged from 0.52 to 0.78 at pseudo-ungauged sites over calibration and validation periods. These results highlight a strong regionalization performance of HDA-PR, improving NSE by up to 0.57 compared to the baseline model calibrated with lumped parameters, and achieving a performance comparable to the reference solution obtained with local uniform calibration (median NSE from 0.59 to 0.79). Multiple evaluation metrics based on flood-oriented hydrological signatures are also employed to assess the accuracy and robustness of the approach. The regionalization method is amenable to state-parameter correction from multi-source data over a range of time scales needed for operational data assimilation, and it is adaptable to other differentiable geophysical models.
    摘要 在无观测流域中估计空间分布的水文参数是一个具有挑战性的区域化问题,由于流量数据稀缺,需要施加空间约束。一种可行的途径是寻找能够定量关联物理描述因子与概念模型参数的转移函数。本文提出一种混合数据同化与参数区域化(HDA-PR)方法,将基于多元回归或神经网络的可学习区域化映射嵌入可微分水文模型中,借助精确的伴随梯度,在高维区域化背景下于大范围时空计算域内利用异构数据。逆问题通过考虑多个观测站点信息的多站点率定代价函数来求解。HDA-PR 在法国南部两个易发山洪地区的高分辨率(小时、公里尺度)区域建模中得到检验:在率定期与验证期,伪无观测站点的 Nash-Sutcliffe 效率系数(NSE)中位数介于 0.52 与 0.78 之间;相比使用集总参数率定的基线模型,NSE 最多提升 0.57,性能接近局部均匀率定得到的参考解(NSE 中位数 0.59 至 0.79)。论文还采用多种面向洪水的水文特征指标评估方法的精度与稳健性。该区域化方法适用于业务化数据同化所需的多时间尺度、多源数据的状态-参数校正,并可推广到其他可微分地球物理模型。
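
A small sketch of the neural-network variant of the regionalization mapping is shown below: physical descriptors per cell are mapped to bounded conceptual parameters, and because the hydrological model is differentiable, the multi-gauge calibration loss can be backpropagated into this mapping. Layer sizes and parameter bounds are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RegionalizationNet(nn.Module):
    """Transfer function from physical descriptors (per grid cell) to bounded
    conceptual model parameters, in the spirit of the neural-network variant
    of HDA-PR. Architecture and bounds are illustrative assumptions.
    """
    def __init__(self, n_descriptors, n_params, lo, hi):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(n_descriptors, 32), nn.ReLU(),
                                 nn.Linear(32, n_params), nn.Sigmoid())
        self.register_buffer("lo", torch.as_tensor(lo, dtype=torch.float32))
        self.register_buffer("hi", torch.as_tensor(hi, dtype=torch.float32))

    def forward(self, descriptors):               # (n_cells, n_descriptors)
        return self.lo + (self.hi - self.lo) * self.mlp(descriptors)

# Because the hydrological model is differentiable, a multi-gauge calibration
# loss on simulated vs. observed discharge can be backpropagated through it
# into the weights of RegionalizationNet (the adjoint-based gradients above).
```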

Certified Multi-Fidelity Zeroth-Order Optimization

  • paper_url: http://arxiv.org/abs/2308.00978
  • repo_url: None
  • paper_authors: Étienne de Montbrun, Sébastien Gerchinovitz
  • for: 本文研究多保真度零阶优化问题,即可以在不同精度(不同成本)下评估函数 $f$,目标是以尽可能低的评估成本优化 $f$。
  • methods: 本文研究“认证”(certified)算法,即还需输出基于数据的优化误差上界;问题被形式化为算法与评估环境之间的最小最大博弈,并在此基础上提出了 MFDOO 算法的认证变体。
  • results: 对任意 Lipschitz 函数 $f$,给出了该算法的成本复杂度上界,并证明了一个依赖于 $f$ 的下界,表明其成本复杂度接近最优;最后将含噪(随机)评估作为特例加以讨论。
    Abstract We consider the problem of multi-fidelity zeroth-order optimization, where one can evaluate a function $f$ at various approximation levels (of varying costs), and the goal is to optimize $f$ with the cheapest evaluations possible. In this paper, we study \emph{certified} algorithms, which are additionally required to output a data-driven upper bound on the optimization error. We first formalize the problem in terms of a min-max game between an algorithm and an evaluation environment. We then propose a certified variant of the MFDOO algorithm and derive a bound on its cost complexity for any Lipschitz function $f$. We also prove an $f$-dependent lower bound showing that this algorithm has a near-optimal cost complexity. We close the paper by addressing the special case of noisy (stochastic) evaluations as a direct example.
    摘要 我们考虑多保真度零阶优化问题,其中可以在不同的近似水平(具有不同的成本)下评估函数 $f$,目标是以尽可能低的评估成本优化 $f$。在这篇文章中,我们研究“认证”算法,这类算法还需要输出一个基于数据的优化误差上界。我们首先将问题形式化为算法与评估环境之间的最小最大博弈,然后提出 MFDOO 算法的认证变体,并对任意 Lipschitz 函数 $f$ 推导其成本复杂度上界。我们还证明了一个依赖于 $f$ 的下界,表明该算法具有接近最优的成本复杂度。最后,我们以含噪(随机)评估这一特例作为直接例子结束本文。

A new approach for evaluating internal cluster validation indices

  • paper_url: http://arxiv.org/abs/2308.03894
  • repo_url: None
  • paper_authors: Zoltán Botta-Dukát
  • for: 本研究旨在透过内部验证指标选择最佳表现的算法和参数设置,而不使用任何外部信息。
  • methods: 本研究提出了多种内部验证指标,并评估了它们在不同数据集上的表现。
  • results: 本研究提出了一种新的验证方法,并评估了其优劣点。
    Abstract A vast number of different methods are available for unsupervised classification. Since no algorithm and parameter setting performs best in all types of data, there is a need for cluster validation to select the actually best-performing algorithm. Several indices were proposed for this purpose without using any additional (external) information. These internal validation indices can be evaluated by applying them to classifications of datasets with a known cluster structure. Evaluation approaches differ in how they use the information on the ground-truth classification. This paper reviews these approaches, considering their advantages and disadvantages, and then suggests a new approach.
    摘要 有很多不同的方法可用于无监督分类。由于没有任何算法和参数设置能在所有类型的数据上表现最佳,因此需要通过聚类验证来选择实际表现最佳的算法。为此已提出了多种不使用任何外部信息的内部验证指标。这些指标可以通过应用于具有已知聚类结构的数据集的分类结果来评估,不同的评估途径在如何利用真实分类信息方面有所差异。本文回顾这些途径及其优缺点,并提出一种新的评估方法。

Effects of Daily News Sentiment on Stock Price Forecasting

  • paper_url: http://arxiv.org/abs/2308.08549
  • repo_url: None
  • paper_authors: S. Srinivas, R. Gadela, R. Sabu, A. Das, G. Nath, V. Datla
  • for: This paper aims to improve the accuracy of stock price forecasts by incorporating investor sentiment from news articles into the prediction model.
  • methods: The authors use a robust data collection and preprocessing framework to create a news database and time series data for NIFTY50 stocks. They use sentiment libraries to calculate sentiment scores from different sections of the articles and fit LSTM models to forecast stock prices, both with and without sentiment features.
  • results: The authors compare the performance of the LSTM models with and without sentiment features and find that incorporating sentiment scores improves the accuracy of stock price forecasts.
    Abstract Predicting future prices of a stock is an arduous task to perform. However, incorporating additional elements can significantly improve our predictions, rather than relying solely on a stock's historical price data to forecast its future price. Studies have demonstrated that investor sentiment, which is impacted by daily news about the company, can have a significant impact on stock price swings. There are numerous sources from which we can get this information, but they are cluttered with a lot of noise, making it difficult to accurately extract the sentiments from them. Hence the focus of our research is to design an efficient system to capture the sentiments from the news about the NITY50 stocks and investigate how much the financial news sentiment of these stocks are affecting their prices over a period of time. This paper presents a robust data collection and preprocessing framework to create a news database for a timeline of around 3.7 years, consisting of almost half a million news articles. We also capture the stock price information for this timeline and create multiple time series data, that include the sentiment scores from various sections of the article, calculated using different sentiment libraries. Based on this, we fit several LSTM models to forecast the stock prices, with and without using the sentiment scores as features and compare their performances.
    摘要 We present a robust data collection and preprocessing framework to create a news database spanning 3.7 years, consisting of nearly half a million news articles. We also collect stock price information for this timeline and create multiple time series data, including sentiment scores from various sections of the article calculated using different sentiment libraries. We then fit several LSTM models to forecast stock prices, with and without using sentiment scores as features, and compare their performances.
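
A minimal sketch of the forecasting model family used here, an LSTM over windows whose feature columns concatenate past prices with per-article sentiment scores, is shown below; dimensions, window length and feature names are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class PriceSentimentLSTM(nn.Module):
    """LSTM forecaster over windows whose features concatenate past closing
    prices with daily news sentiment scores. The paper fits such models with
    and without the sentiment columns to compare forecast accuracy.
    """
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):             # x: (batch, window, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # next-day price


# e.g. window of 30 days, features = [close, headline_sentiment, body_sentiment]
model = PriceSentimentLSTM(n_features=3)
pred = model(torch.randn(8, 30, 3))   # (8, 1)
```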

Integrating Homomorphic Encryption and Trusted Execution Technology for Autonomous and Confidential Model Refining in Cloud

  • paper_url: http://arxiv.org/abs/2308.00963
  • repo_url: None
  • paper_authors: Pinglan Liu, Wensheng Zhang
  • for: 本研究旨在设计一种在云端实现自动化和保密的模型优化方案,以满足长期连续进行机器学习的需求和数据和模型的隐私保护。
  • methods: 本研究结合使用同态加密与可信执行环境技术,以在自主计算中保护机密性(两者各有局限且互补),并通过实现与实验验证了该方案的可行性。
  • results: 实验结果表明,通过使用我们提议的方案,云服务器可以自动地对加密模型进行修改,以提高其精度。虽然效率仍然远低于基准方案,但我们预期通过更好地利用高级并行和云服务器GPU的计算能力,可以进一步提高效率。
    Abstract With the popularity of cloud computing and machine learning, it has been a trend to outsource machine learning processes (including model training and model-based inference) to cloud. By the outsourcing, other than utilizing the extensive and scalable resource offered by the cloud service provider, it will also be attractive to users if the cloud servers can manage the machine learning processes autonomously on behalf of the users. Such a feature will be especially salient when the machine learning is expected to be a long-term continuous process and the users are not always available to participate. Due to security and privacy concerns, it is also desired that the autonomous learning preserves the confidentiality of users' data and models involved. Hence, in this paper, we aim to design a scheme that enables autonomous and confidential model refining in cloud. Homomorphic encryption and trusted execution environment technology can protect confidentiality for autonomous computation, but each of them has their limitations respectively and they are complementary to each other. Therefore, we further propose to integrate these two techniques in the design of the model refining scheme. Through implementation and experiments, we evaluate the feasibility of our proposed scheme. The results indicate that, with our proposed scheme the cloud server can autonomously refine an encrypted model with newly provided encrypted training data to continuously improve its accuracy. Though the efficiency is still significantly lower than the baseline scheme that refines plaintext-model with plaintext-data, we expect that it can be improved by fully utilizing the higher level of parallelism and the computational power of GPU at the cloud server.
    摘要 With the rise of cloud computing and machine learning, it has become a trend to outsource machine learning processes (including model training and model-based inference) to the cloud. By outsourcing, users can not only take advantage of the extensive and scalable resources offered by cloud service providers but also enjoy the convenience of having the cloud servers manage the machine learning processes autonomously on their behalf. This feature is particularly desirable when machine learning is expected to be a long-term continuous process and users are not always available to participate. However, due to security and privacy concerns, it is essential that the autonomous learning preserves the confidentiality of users' data and models involved. Therefore, in this paper, we aim to design a scheme that enables autonomous and confidential model refining in the cloud.Homomorphic encryption and trusted execution environment technology can protect confidentiality for autonomous computation, but each of them has its limitations, respectively. Therefore, we propose to integrate these two techniques in the design of the model refining scheme. Through implementation and experiments, we evaluate the feasibility of our proposed scheme. The results show that the cloud server can autonomously refine an encrypted model with newly provided encrypted training data to continuously improve its accuracy. Although the efficiency is still significantly lower than the baseline scheme that refines plaintext-model with plaintext-data, we expect that it can be improved by fully utilizing the higher level of parallelism and the computational power of GPU at the cloud server.

Causal Inference with Differentially Private (Clustered) Outcomes

  • paper_url: http://arxiv.org/abs/2308.00957
  • repo_url: None
  • paper_authors: Adel Javanmard, Vahab Mirrokni, Jean Pouget-Abadie
  • For: The paper aims to provide a new differential privacy mechanism, “Cluster-DP”, to improve the estimation of causal effects from randomized experiments while maintaining strong privacy guarantees.* Methods: The paper proposes a clustering-based differential privacy mechanism that leverages the cluster structure of the data to reduce the variance loss and maintain privacy guarantees.* Results: The paper shows that the proposed “Cluster-DP” algorithm can improve the variance loss compared to its unclustered version and a more extreme uniform-prior version, while maintaining the same privacy guarantees.
    Abstract Estimating causal effects from randomized experiments is only feasible if participants agree to reveal their potentially sensitive responses. Of the many ways of ensuring privacy, label differential privacy is a widely used measure of an algorithm's privacy guarantee, which might encourage participants to share responses without running the risk of de-anonymization. Many differentially private mechanisms inject noise into the original data-set to achieve this privacy guarantee, which increases the variance of most statistical estimators and makes the precise measurement of causal effects difficult: there exists a fundamental privacy-variance trade-off to performing causal analyses from differentially private data. With the aim of achieving lower variance for stronger privacy guarantees, we suggest a new differential privacy mechanism, "Cluster-DP", which leverages any given cluster structure of the data while still allowing for the estimation of causal effects. We show that, depending on an intuitive measure of cluster quality, we can improve the variance loss while maintaining our privacy guarantees. We compare its performance, theoretically and empirically, to that of its unclustered version and a more extreme uniform-prior version which does not use any of the original response distribution, both of which are special cases of the "Cluster-DP" algorithm.
    摘要 只有当参与者愿意披露其潜在敏感的回答时,才能从随机实验中估计因果效应。在众多隐私保护手段中,标签差分隐私是一种广泛使用的算法隐私保证度量,它可以鼓励参与者在不承担去匿名化风险的情况下分享回答。许多差分隐私机制通过向原始数据集注入噪声来实现这一隐私保证,这会增大多数统计估计量的方差,使因果效应的精确测量变得困难:基于差分隐私数据进行因果分析存在根本性的隐私-方差权衡。为了在更强的隐私保证下获得更低的方差,我们提出了一种新的差分隐私机制“Cluster-DP”,它在允许估计因果效应的同时利用数据中给定的聚类结构。我们证明,依据一种直观的聚类质量度量,可以在保持隐私保证不变的情况下降低方差损失。我们从理论和实验两方面将其与不使用聚类的版本以及完全不利用原始回答分布的更极端的均匀先验版本进行了比较,二者都是“Cluster-DP”算法的特例。

Curriculum Guided Domain Adaptation in the Dark

  • paper_url: http://arxiv.org/abs/2308.00956
  • repo_url: None
  • paper_authors: Chowdhury Sadman Jahan, Andreas Savakis
  • for: 出于对隐私与安全日益增长的关注,黑盒(dark)域适应旨在在无法访问任何源数据或源模型参数的情况下,将黑盒源域训练模型适应到无标注的目标域。
  • methods: 本研究提出了 Curriculum Adaptation for Black-Box(CABB)方法,采用课程引导的适应策略:先在具有高置信度(干净)伪标签的目标数据上训练,再在带噪声标签的目标数据上训练;并以 Jensen-Shannon 散度取代传统的交叉熵损失,作为干净/噪声样本划分的更优准则。
  • results: 在标准域适应数据集上的实验结果表明,CABB 优于现有的黑盒 DA 模型,并可与白盒域适应模型相媲美。
    Abstract Addressing the rising concerns of privacy and security, domain adaptation in the dark aims to adapt a black-box source trained model to an unlabeled target domain without access to any source data or source model parameters. The need for domain adaptation of black-box predictors becomes even more pronounced to protect intellectual property as deep learning based solutions are becoming increasingly commercialized. Current methods distill noisy predictions on the target data obtained from the source model to the target model, and/or separate clean/noisy target samples before adapting using traditional noisy label learning algorithms. However, these methods do not utilize the easy-to-hard learning nature of the clean/noisy data splits. Also, none of the existing methods are end-to-end, and require a separate fine-tuning stage and an initial warmup stage. In this work, we present Curriculum Adaptation for Black-Box (CABB) which provides a curriculum guided adaptation approach to gradually train the target model, first on target data with high confidence (clean) labels, and later on target data with noisy labels. CABB utilizes Jensen-Shannon divergence as a better criterion for clean-noisy sample separation, compared to the traditional criterion of cross entropy loss. Our method utilizes co-training of a dual-branch network to suppress error accumulation resulting from confirmation bias. The proposed approach is end-to-end trainable and does not require any extra finetuning stage, unlike existing methods. Empirical results on standard domain adaptation datasets show that CABB outperforms existing state-of-the-art black-box DA models and is comparable to white-box domain adaptation models.
    摘要 Addressing the rising concerns of privacy and security, 黑盒子领域适应(Domain Adaptation in the Dark)旨在不经过任何源数据或源模型参数的情况下,将黑盒子训练模型适应到无标注目标领域。随着深度学习解决方案的商业化,黑盒子领域适应的需求更加突出。现有的方法将源模型预测的误差转移到目标模型,并/或使用传统的噪声标签学习算法进行适应。但这些方法未使用容易从容易到困难的标签分配特性。此外,现有的方法都不是端到端的,需要额外的练习阶段和暖身阶段。在这个研究中,我们提出了“CURRICULUM ADAPTATION FOR BLACK-BOX”(CABB),它提供了一个课程导向的适应方法,首先在目标数据上将高信任度(清洁)标签训练目标模型,然后在目标数据上将噪声标签训练目标模型。CABB使用Jensen-Shannon散度作为更好的清洁噪声标签分配剂量,相比于传统的混合损失函数。我们的方法使用两条分支网络进行合作训练,以抑制因确认偏调所导致的错误累累。提案的方法是端到端训练的,不需要额外的练习阶段,与现有的方法不同。实验结果显示,CABB在标准领域适应 datasets 上表现更好,并且与白盒子领域适应模型相比几乎相同。
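
The clean/noisy split criterion can be sketched as below: the Jensen-Shannon divergence between the model's softmax output and the one-hot pseudo-label, with low-divergence samples treated as clean. The fixed threshold stands in for the paper's curriculum schedule and is an assumption for illustration.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between corresponding rows of p and q."""
    p, q = np.clip(p, eps, 1), np.clip(q, eps, 1)
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b), axis=1)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def split_clean_noisy(probs, pseudo_labels, threshold=0.1):
    """Split target samples into clean and noisy index sets by the JS
    divergence between softmax outputs and one-hot pseudo-labels, the
    criterion CABB prefers over cross-entropy. Sketch only.
    """
    onehot = np.eye(probs.shape[1])[pseudo_labels]
    d = js_divergence(probs, onehot)
    return np.where(d <= threshold)[0], np.where(d > threshold)[0]
```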

From Sparse to Soft Mixtures of Experts

  • paper_url: http://arxiv.org/abs/2308.00951
  • repo_url: https://github.com/google-research/vmoe
  • paper_authors: Joan Puigcerver, Carlos Riquelme, Basil Mustafa, Neil Houlsby
  • for: This paper aims to address the challenges of training and inference costs in Mixture of Expert (MoE) architectures, specifically in the context of visual recognition tasks.
  • methods: The proposed Soft MoE method uses a fully-differentiable sparse Transformer that performs implicit soft assignments of input tokens to experts, allowing for larger model capacity at lower inference cost.
  • results: Soft MoE outperforms standard Transformers (ViTs) and popular MoE variants (Tokens Choice and Experts Choice) in visual recognition tasks, while scaling well with increasing numbers of experts and layers. For example, Soft MoE-Base/16 requires 10.5x lower inference cost than ViT-Huge/14 while matching its performance after similar training, and Soft MoE Huge/14 with 128 experts in 16 MoE layers has over 40x more parameters than ViT Huge/14 with only a 2% increase in inference time cost.
    Abstract Sparse mixture of expert architectures (MoEs) scale model capacity without large increases in training or inference costs. Despite their success, MoEs suffer from a number of issues: training instability, token dropping, inability to scale the number of experts, or ineffective finetuning. In this work, we proposeSoft MoE, a fully-differentiable sparse Transformer that addresses these challenges, while maintaining the benefits of MoEs. Soft MoE performs an implicit soft assignment by passing different weighted combinations of all input tokens to each expert. As in other MoE works, experts in Soft MoE only process a subset of the (combined) tokens, enabling larger model capacity at lower inference cost. In the context of visual recognition, Soft MoE greatly outperforms standard Transformers (ViTs) and popular MoE variants (Tokens Choice and Experts Choice). For example, Soft MoE-Base/16 requires 10.5x lower inference cost (5.7x lower wall-clock time) than ViT-Huge/14 while matching its performance after similar training. Soft MoE also scales well: Soft MoE Huge/14 with 128 experts in 16 MoE layers has over 40x more parameters than ViT Huge/14, while inference time cost grows by only 2%, and it performs substantially better.
    摘要 稀疏专家混合架构(MoE)可以在不大幅增加训练或推理成本的情况下扩展模型容量。尽管取得了成功,MoE 仍存在一些问题:训练不稳定、token 丢弃、专家数量难以扩展以及微调效果不佳。在这项工作中,我们提出了 Soft MoE,一种完全可微分的稀疏 Transformer,在保留 MoE 优点的同时解决了这些挑战。Soft MoE 通过将所有输入 token 的不同加权组合传递给每个专家来执行隐式的软分配。与其他 MoE 工作一样,Soft MoE 中的专家只处理(组合后的)token 的一个子集,从而以更低的推理成本获得更大的模型容量。在视觉识别任务中,Soft MoE 大幅优于标准 Transformer(ViT)和流行的 MoE 变体(Tokens Choice 与 Experts Choice)。例如,Soft MoE-Base/16 的推理成本比 ViT-Huge/14 低 10.5 倍(墙钟时间低 5.7 倍),在相近的训练后性能与之相当。Soft MoE 还具有良好的扩展性:Soft MoE Huge/14 在 16 个 MoE 层中共有 128 个专家,参数量是 ViT Huge/14 的 40 倍以上,而推理时间成本仅增加 2%,且性能明显更好。
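
The soft assignment can be written in a few lines; the sketch below follows the dispatch/combine description in the abstract for a single example, with identity experts as a stand-in, and is a simplification of the released vmoe implementation rather than a reproduction of it.

```python
import torch

def soft_moe_layer(x, phi, experts):
    """Minimal single-example Soft MoE sketch: every slot receives a convex
    combination of all input tokens (dispatch), each expert processes its
    slots, and every output token is a convex combination of all slot
    outputs (combine).

    x:       (n_tokens, d)
    phi:     (d, n_experts * slots_per_expert)  learnable slot parameters
    experts: list of n_experts callables mapping (slots_per_expert, d) -> same
    """
    n_exp = len(experts)
    logits = x @ phi                              # (tokens, experts*slots)
    dispatch = torch.softmax(logits, dim=0)       # normalize over tokens
    combine = torch.softmax(logits, dim=1)        # normalize over slots
    slots = dispatch.T @ x                        # (experts*slots, d)
    slots = slots.view(n_exp, -1, x.shape[1])
    outs = torch.cat([f(s) for f, s in zip(experts, slots)], dim=0)
    return combine @ outs                         # (tokens, d)


# Toy usage: 4 experts with 2 slots each, identity experts as placeholders.
x = torch.randn(16, 8)
phi = torch.randn(8, 8)
y = soft_moe_layer(x, phi, [lambda s: s] * 4)     # (16, 8)
```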

Decomposing and Coupling Saliency Map for Lesion Segmentation in Ultrasound Images

  • paper_url: http://arxiv.org/abs/2308.00947
  • repo_url: None
  • paper_authors: Zhenyuan Ning, Yixiao Mao, Qianjin Feng, Shengzhou Zhong, Yu Zhang
  • for: 这篇论文旨在提高聚合体内部的肿瘤分类精度,应对静脉影像中肿瘤区域和周围组织(背景)的同等或更高的颜色和текстура对比。
  • methods: 这篇论文提出了一个名为DC-Net的分解且联系网络,通过在复杂的静脉影像中分解原始图像,将肿瘤区域和背景分类为不同的类别,以提高肿瘤分类精度。DC-Net包括分解和联系子网络,其中前者先将原始图像分解为肿瘤和背景的类别对应图像,然后后者进一步处理这些图像,以确保精度高的肿瘤分类。
  • results: 这篇论文的实验结果显示,DC-Net可以在两个静脉肿瘤分类任务中提高精度,比较现有的方法更好。
    Abstract Complex scenario of ultrasound image, in which adjacent tissues (i.e., background) share similar intensity with and even contain richer texture patterns than lesion region (i.e., foreground), brings a unique challenge for accurate lesion segmentation. This work presents a decomposition-coupling network, called DC-Net, to deal with this challenge in a (foreground-background) saliency map disentanglement-fusion manner. The DC-Net consists of decomposition and coupling subnets, and the former preliminarily disentangles original image into foreground and background saliency maps, followed by the latter for accurate segmentation under the assistance of saliency prior fusion. The coupling subnet involves three aspects of fusion strategies, including: 1) regional feature aggregation (via differentiable context pooling operator in the encoder) to adaptively preserve local contextual details with the larger receptive field during dimension reduction; 2) relation-aware representation fusion (via cross-correlation fusion module in the decoder) to efficiently fuse low-level visual characteristics and high-level semantic features during resolution restoration; 3) dependency-aware prior incorporation (via coupler) to reinforce foreground-salient representation with the complementary information derived from background representation. Furthermore, a harmonic loss function is introduced to encourage the network to focus more attention on low-confidence and hard samples. The proposed method is evaluated on two ultrasound lesion segmentation tasks, which demonstrates the remarkable performance improvement over existing state-of-the-art methods.
    摘要 复杂的超声图像场景下,邻近组织(即背景)与病变区域(即前景)的像素强度几乎相同,甚至具有更复杂的文本排序模式,对准确病变分割带来了独特挑战。本文提出了一种 decomposition-coupling 网络(DC-Net),通过在 (前景-背景) 敏感地图离散-融合方式下进行精准分割。DC-Net 包括 decomposition 和 coupling 子网络,前者先将原始图像粗略地分解成前景和背景敏感地图,然后后者在帮助于敏感优化下进行精准分割。coupling 子网络包括三个方面的融合策略:1)地域特征聚合(通过可导式上下文搅拌运算器在编码器中),以适应大小上下文的更大范围,在维度减少时保留地方上下文特征;2)关系意识表示融合(通过交叉相关融合模块在解码器中),以高效地融合低级视觉特征和高级 semantic 特征;3)依赖关系评估(通过优化器),以强制前景敏感表示具有补偿信息的背景表示。此外,文本还引入了一种和谐损失函数,以鼓励网络更加关注低信心和困难样本。提出的方法在两个超声病变分割任务上进行评估,表现出了明显的性能提升。

On the use of deep learning for phase recovery

  • paper_url: http://arxiv.org/abs/2308.00942
  • repo_url: None
  • paper_authors: Kaiqiang Wang, Li Song, Chutian Wang, Zhenbo Ren, Guangyuan Zhao, Jiazhen Dou, Jianglei Di, George Barbastathis, Renjie Zhou, Jianlin Zhao, Edmund Y. Lam
  • for: This paper is written for those interested in phase recovery (PR) and its applications in computational imaging.
  • methods: The paper reviews conventional methods for PR, as well as how deep learning (DL) can be used to support PR from pre-processing, in-processing, and post-processing stages.
  • results: The paper summarizes the work in DL for PR and provides a live-updating resource for readers to learn more about PR.
    Abstract Phase recovery (PR) refers to calculating the phase of the light field from its intensity measurements. As exemplified from quantitative phase imaging and coherent diffraction imaging to adaptive optics, PR is essential for reconstructing the refractive index distribution or topography of an object and correcting the aberration of an imaging system. In recent years, deep learning (DL), often implemented through deep neural networks, has provided unprecedented support for computational imaging, leading to more efficient solutions for various PR problems. In this review, we first briefly introduce conventional methods for PR. Then, we review how DL provides support for PR from the following three stages, namely, pre-processing, in-processing, and post-processing. We also review how DL is used in phase image processing. Finally, we summarize the work in DL for PR and outlook on how to better use DL to improve the reliability and efficiency in PR. Furthermore, we present a live-updating resource (https://github.com/kqwang/phase-recovery) for readers to learn more about PR.
    摘要 相位恢复(Phase Recovery, PR)指从光场的强度测量中计算其相位。从定量相位成像、相干衍射成像到自适应光学,PR 对于重建物体的折射率分布或形貌以及校正成像系统的像差都至关重要。近年来,通常借助深度神经网络实现的深度学习(DL)为计算成像提供了前所未有的支持,使多种 PR 问题得到了更高效的解决方案。在这篇综述中,我们首先简要介绍传统的 PR 方法,然后从预处理、处理中和后处理三个阶段回顾 DL 如何支持 PR,并回顾 DL 在相位图像处理中的应用。最后,我们总结 DL 在 PR 中的工作,并展望如何更好地利用 DL 来提高 PR 的可靠性与效率。此外,我们提供了一个持续更新的资源(https://github.com/kqwang/phase-recovery),供读者进一步了解 PR。
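
For context, one of the conventional alternating-projection baselines that the review contrasts deep learning with is the Gerchberg-Saxton algorithm; a minimal NumPy version is sketched below.

```python
import numpy as np

def gerchberg_saxton(amp_object, amp_fourier, n_iter=200, seed=0):
    """Classic Gerchberg-Saxton phase retrieval: alternately enforce the
    measured amplitudes in the object and Fourier planes while keeping the
    estimated phase. Included as a reference point for the conventional
    methods that deep-learning approaches are compared against.
    """
    rng = np.random.default_rng(seed)
    phase = rng.uniform(0, 2 * np.pi, amp_object.shape)
    field = amp_object * np.exp(1j * phase)
    for _ in range(n_iter):
        F = np.fft.fft2(field)
        F = amp_fourier * np.exp(1j * np.angle(F))     # keep phase, fix |F|
        field = np.fft.ifft2(F)
        field = amp_object * np.exp(1j * np.angle(field))
    return np.angle(field)                             # recovered phase map
```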

QUANT: A Minimalist Interval Method for Time Series Classification

  • paper_url: http://arxiv.org/abs/2308.00928
  • repo_url: https://github.com/angus924/quant
  • paper_authors: Angus Dempster, Daniel F. Schmidt, Geoffrey I. Webb
  • for: 这篇论文是为了提出一种基于间隔的时间序列分类方法。
  • methods: 该方法使用单一的特征(Quantiles)、固定的间隔和一个市场上可用的分类器。
  • results: 该方法可以在一组标准的测试数据集上实现同样的准确率,与现有最准确的间隔方法相比。这种快速和准确的方法可以在142个UCR数据集上实现状态机器学习的最佳性能,总计算时间仅为单CPU核心下的少于15分钟。
    Abstract We show that it is possible to achieve the same accuracy, on average, as the most accurate existing interval methods for time series classification on a standard set of benchmark datasets using a single type of feature (quantiles), fixed intervals, and an 'off the shelf' classifier. This distillation of interval-based approaches represents a fast and accurate method for time series classification, achieving state-of-the-art accuracy on the expanded set of 142 datasets in the UCR archive with a total compute time (training and inference) of less than 15 minutes using a single CPU core.
    摘要 我们证明,仅使用单一类型的特征(分位数)、固定区间与一个现成的分类器,即可在一组标准基准数据集上平均达到与现有最精确的区间类时间序列分类方法相同的准确率。这种对区间方法的精炼是一种快速且准确的时间序列分类方法,在 UCR 档案扩展后的 142 个数据集上达到当前最佳精度,且在单个 CPU 核心上的总计算时间(训练与推理)不到 15 分钟。
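
A stripped-down sketch of the interval/quantile feature idea is shown below: quantiles over fixed dyadic intervals, concatenated and fed to an off-the-shelf classifier such as extremely randomized trees. The released QUANT code additionally uses other input transforms and different interval settings, so treat the depths and quantile counts here as assumptions.

```python
import numpy as np

def quantile_interval_features(X, depth=4, n_quantiles=4):
    """Quantiles computed over fixed dyadic intervals of each series.
    X: (n_series, series_length). Simplified sketch of the interval/quantile
    idea, not the released QUANT feature set.
    """
    feats = []
    for d in range(depth + 1):
        for seg in np.array_split(np.arange(X.shape[1]), 2 ** d):
            q = np.quantile(X[:, seg], np.linspace(0, 1, n_quantiles), axis=1)
            feats.append(q.T)                    # (n_series, n_quantiles)
    return np.hstack(feats)


# Toy usage; the features would then go to an off-the-shelf classifier,
# e.g. sklearn.ensemble.ExtraTreesClassifier.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 64))
print(quantile_interval_features(X).shape)       # (10, n_intervals * n_quantiles)
```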

Continual Domain Adaptation on Aerial Images under Gradually Degrading Weather

  • paper_url: http://arxiv.org/abs/2308.00924
  • repo_url: https://github.com/sadman-jahan/aid-ucm-degradingweather
  • paper_authors: Chowdhury Sadman Jahan, Andreas Savakis
  • for: 本研究探讨部署在空中平台上的深度学习模型在天气逐渐恶化条件下的域适应(DA)问题,以及持续(测试时)适应设置下的表现。
  • methods: 研究在两个现有航拍影像数据集的真实图像上合成了两种逐渐恶化的天气条件,共得到四个基准数据集;在持续适应设置下评估了三种 DA 模型(一种标准 DA 基线与两种持续 DA 模型),并比较了卷积与 Transformer 两种架构。
  • results: 研究发现,现有基于缓冲区的持续 DA 方法在适应过程中存在稳定性问题,并提出了一种简单的梯度归一化方法来抑制训练不稳定。
    Abstract Domain adaptation (DA) strives to mitigate the domain gap between the source domain where a model is trained, and the target domain where the model is deployed. When a deep learning model is deployed on an aerial platform, it may face gradually degrading weather conditions during operation, leading to widening domain gaps between the training data and the encountered evaluation data. We synthesize two such gradually worsening weather conditions on real images from two existing aerial imagery datasets, generating a total of four benchmark datasets. Under the continual, or test-time adaptation setting, we evaluate three DA models on our datasets: a baseline standard DA model and two continual DA models. In such setting, the models can access only one small portion, or one batch of the target data at a time, and adaptation takes place continually, and over only one epoch of the data. The combination of the constraints of continual adaptation, and gradually deteriorating weather conditions provide the practical DA scenario for aerial deployment. Among the evaluated models, we consider both convolutional and transformer architectures for comparison. We discover stability issues during adaptation for existing buffer-fed continual DA methods, and offer gradient normalization as a simple solution to curb training instability.
    摘要 域适应(DA)旨在缩小模型训练所在的源域与模型部署所在的目标域之间的域差距。当深度学习模型部署在空中平台上时,运行期间可能遭遇逐渐恶化的天气条件,导致训练数据与实际评估数据之间的域差距不断扩大。我们在两个现有航拍影像数据集的真实图像上合成了两种逐渐恶化的天气条件,共生成四个基准数据集。在持续(测试时)适应设置下,我们在这些数据集上评估了三种 DA 模型:一种标准 DA 基线模型和两种持续 DA 模型。在该设置下,模型每次只能访问目标数据的一小部分(一个批次),适应持续进行,且仅遍历数据一个周期。持续适应的约束与逐渐恶化的天气条件相结合,构成了空中部署中贴近实际的 DA 场景。在所评估的模型中,我们同时考虑卷积与 Transformer 架构以作比较。我们发现现有基于缓冲区的持续 DA 方法在适应过程中存在稳定性问题,并提出梯度归一化作为抑制训练不稳定的简单方案。
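
A minimal sketch of the gradient-normalization fix is given below; the per-parameter normalization granularity is an illustrative assumption rather than the exact formulation in the paper.

```python
import torch

def normalized_step(model, loss, optimizer, eps=1e-8):
    """One adaptation step with gradient normalization: each parameter's
    gradient is rescaled to unit norm before the optimizer update, a simple
    way to curb the training instability seen in buffer-fed continual DA.
    """
    optimizer.zero_grad()
    loss.backward()
    for p in model.parameters():
        if p.grad is not None:
            p.grad.div_(p.grad.norm() + eps)   # in-place rescale to unit norm
    optimizer.step()
```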

Survey on Computer Vision Techniques for Internet-of-Things Devices

  • paper_url: http://arxiv.org/abs/2308.02553
  • repo_url: None
  • paper_authors: Ishmeet Kaur, Adwaita Janardhan Jadhav
  • for: 本研究旨在探讨最新的低功耗和能效的神经网络实现方法,以提高神经网络的部署性,而不是减少准确率。
  • methods: 本文涵盖了三类主要方法:神经网络压缩、网络架构搜索和设计、编译器和图优化。
  • results: 本文总结了低功耗和能效的神经网络实现方法的优劣点和未来研究问题。
    Abstract Deep neural networks (DNNs) are state-of-the-art techniques for solving most computer vision problems. DNNs require billions of parameters and operations to achieve state-of-the-art results. This requirement makes DNNs extremely compute, memory, and energy-hungry, and consequently difficult to deploy on small battery-powered Internet-of-Things (IoT) devices with limited computing resources. Deployment of DNNs on Internet-of-Things devices, such as traffic cameras, can improve public safety by enabling applications such as automatic accident detection and emergency response.Through this paper, we survey the recent advances in low-power and energy-efficient DNN implementations that improve the deployability of DNNs without significantly sacrificing accuracy. In general, these techniques either reduce the memory requirements, the number of arithmetic operations, or both. The techniques can be divided into three major categories: neural network compression, network architecture search and design, and compiler and graph optimizations. In this paper, we survey both low-power techniques for both convolutional and transformer DNNs, and summarize the advantages, disadvantages, and open research problems.
    摘要 深度神经网络(DNN)是解决大多数计算机视觉问题的最先进技术。DNN 需要数十亿的参数和运算才能达到最先进的结果,这使其对计算、内存和能耗的需求极高,因而难以部署在计算资源有限、依靠电池供电的小型物联网(IoT)设备上。将 DNN 部署到交通摄像头等 IoT 设备上,可以通过自动事故检测与应急响应等应用提升公共安全。本文综述了近期在低功耗、高能效 DNN 实现方面的进展,这些技术在不显著牺牲准确率的前提下提升 DNN 的可部署性。总体而言,这些技术或降低内存需求,或减少算术运算量,或两者兼顾,可分为三大类:神经网络压缩、网络架构搜索与设计、编译器与计算图优化。本文同时综述了卷积与 Transformer 两类 DNN 的低功耗技术,并总结其优缺点与尚待解决的研究问题。

Virtual histological staining of unlabeled autopsy tissue

  • paper_url: http://arxiv.org/abs/2308.00920
  • repo_url: None
  • paper_authors: Yuzhu Li, Nir Pillar, Jingxi Li, Tairan Liu, Di Wu, Songyu Sun, Guangdong Ma, Kevin de Haan, Luzhe Huang, Sepehr Hamidi, Anatoly Urisman, Tal Keidar Haran, William Dean Wallace, Jonathan E. Zuckerman, Aydogan Ozcan
  • for: 这个研究旨在解决传统压涂方法在检验尸体样本时的挑战,包括临床死亡后的样本自释导致的差异化、高成本和时间consuming的化学压涂过程。
  • methods: 这篇文章报道了一种虚拟染色技术,使用一个训练好的神经网络将染料自动替换为普通的染料染色,从而消除自释导致的严重染色扭曲。
  • results: 研究发现,虚拟染色技术可以快速地生成高质量的染料染色图像,并且可以减少劳动力、成本和基础设施需求。此外,这种技术还可以扩展到肿瘤组织和衰竭组织,并且可以在全球卫生危机期间提供更快速、更便宜的染色服务。
    Abstract Histological examination is a crucial step in an autopsy; however, the traditional histochemical staining of post-mortem samples faces multiple challenges, including the inferior staining quality due to autolysis caused by delayed fixation of cadaver tissue, as well as the resource-intensive nature of chemical staining procedures covering large tissue areas, which demand substantial labor, cost, and time. These challenges can become more pronounced during global health crises when the availability of histopathology services is limited, resulting in further delays in tissue fixation and more severe staining artifacts. Here, we report the first demonstration of virtual staining of autopsy tissue and show that a trained neural network can rapidly transform autofluorescence images of label-free autopsy tissue sections into brightfield equivalent images that match hematoxylin and eosin (H&E) stained versions of the same samples, eliminating autolysis-induced severe staining artifacts inherent in traditional histochemical staining of autopsied tissue. Our virtual H&E model was trained using >0.7 TB of image data and a data-efficient collaboration scheme that integrates the virtual staining network with an image registration network. The trained model effectively accentuated nuclear, cytoplasmic and extracellular features in new autopsy tissue samples that experienced severe autolysis, such as COVID-19 samples never seen before, where the traditional histochemical staining failed to provide consistent staining quality. This virtual autopsy staining technique can also be extended to necrotic tissue, and can rapidly and cost-effectively generate artifact-free H&E stains despite severe autolysis and cell death, also reducing labor, cost and infrastructure requirements associated with the standard histochemical staining.
    摘要 组织学检查是尸检中的关键步骤;然而,尸检样本的传统组织化学染色面临多重挑战,包括尸体组织固定延迟引起自溶而导致的染色质量下降,以及覆盖大面积组织的化学染色流程对人力、成本与时间的巨大消耗。在全球卫生危机期间,组织病理服务供给受限,组织固定进一步延迟、染色伪影更加严重,这些挑战也随之加剧。本文首次展示了尸检组织的虚拟染色:训练好的神经网络可将无标记尸检组织切片的自发荧光图像快速转换为与同一样本的苏木精-伊红(H&E)染色相匹配的明场等效图像,消除传统组织化学染色中自溶导致的严重染色伪影。我们的虚拟 H&E 模型使用超过 0.7 TB 的图像数据,并采用将虚拟染色网络与图像配准网络相结合的数据高效协作方案进行训练。训练后的模型能够在严重自溶的新尸检组织样本(例如此前从未见过的 COVID-19 样本)中有效凸显细胞核、细胞质和细胞外特征,而传统组织化学染色在这些样本上无法提供一致的染色质量。这种虚拟尸检染色技术还可推广到坏死组织,能够在严重自溶与细胞死亡的情况下快速、低成本地生成无伪影的 H&E 染色,同时降低标准组织化学染色所需的人力、成本与基础设施。

VLUCI: Variational Learning of Unobserved Confounders for Counterfactual Inference

  • paper_url: http://arxiv.org/abs/2308.00904
  • repo_url: None
  • paper_authors: Yonghe Zhao, Qiang Huang, Siwei Wu, Yun Peng, Huiyan Sun
  • for: 本研究旨在提出一种新的变量学习模型,用于对 observational 数据中的不观测变量进行推断,以提高 causal inference 的准确性。
  • methods: 该模型基于变量学习的思想,使用 doubly 变量推断模型来approximate 不观测变量的 posterior distribution,并且可以与现有的 counterfactual inference 模型相结合使用。
  • results: 实验表明,该模型可以准确地推断不观测变量,并且可以与现有的模型相结合使用,以提高 counterfactual inference 的准确性。 Plus, the model provides confidence intervals for counterfactual outcomes, which is useful in risk-sensitive domains.
    Abstract Causal inference plays a vital role in diverse domains like epidemiology, healthcare, and economics. De-confounding and counterfactual prediction in observational data has emerged as a prominent concern in causal inference research. While existing models tackle observed confounders, the presence of unobserved confounders remains a significant challenge, distorting causal inference and impacting counterfactual outcome accuracy. To address this, we propose a novel variational learning model of unobserved confounders for counterfactual inference (VLUCI), which generates the posterior distribution of unobserved confounders. VLUCI relaxes the unconfoundedness assumption often overlooked by most causal inference methods. By disentangling observed and unobserved confounders, VLUCI constructs a doubly variational inference model to approximate the distribution of unobserved confounders, which are used for inferring more accurate counterfactual outcomes. Extensive experiments on synthetic and semi-synthetic datasets demonstrate VLUCI's superior performance in inferring unobserved confounders. It is compatible with state-of-the-art counterfactual inference models, significantly improving inference accuracy at both group and individual levels. Additionally, VLUCI provides confidence intervals for counterfactual outcomes, aiding decision-making in risk-sensitive domains. We further clarify the considerations when applying VLUCI to cases where unobserved confounders don't strictly conform to our model assumptions using the public IHDP dataset as an example, highlighting the practical advantages of VLUCI.
    摘要 causal inference在多个领域中扮演着重要的角色,如epidemiology、医疗和经济等。在观察数据中,解除干扰和预测Counterfactual outcome成为了causal inference研究中的一个显著挑战。现有的模型可以处理观察到的干扰因素,但未观察到的干扰因素的存在仍然是一个主要的挑战,对 causal inference 和 counterfactual outcome的准确性产生干扰。为解决这个问题,我们提出了一种新的变分学习模型(VLUCI),可以生成未观察到的干扰因素的 posterior 分布。VLUCI 释放了对观察到的干扰因素的假设,从而更好地解除干扰。通过分离观察到和未观察到的干扰因素,VLUCI 构建了一个双变分推理模型,用于估计未观察到的干扰因素,从而更准确地预测 counterfactual outcome。我们在 synthetic 和半 synthetic 数据集上进行了广泛的实验,显示 VLUCI 在推理未观察到的干扰因素方面表现出色。它可以与当前的 counterfactual inference 模型相容,在组织和个体水平上提高推理准确性。此外,VLUCI 还提供了对 counterfactual outcome 的信息interval,帮助在风险敏感领域做出决策。我们还在使用公共 IHDP 数据集为例,详细介绍了在实际应用中考虑 VLUCI 的一般考虑事项。

User-Controllable Recommendation via Counterfactual Retrospective and Prospective Explanations

  • paper_url: http://arxiv.org/abs/2308.00894
  • repo_url: https://github.com/chrisjtan/ucr
  • paper_authors: Juntao Tan, Yingqiang Ge, Yan Zhu, Yinglong Xia, Jiebo Luo, Jianchao Ji, Yongfeng Zhang
  • for: Improving user satisfaction and trust by giving users control over their personalized recommendations
  • methods: An explainable recommender framework that combines counterfactual reasoning with user control options
  • results: Experiments on the MovieLens and Yelp datasets validate the effectiveness of the proposal and show that offering user control options can potentially improve the accuracy of future recommendations
    Abstract Modern recommender systems utilize users' historical behaviors to generate personalized recommendations. However, these systems often lack user controllability, leading to diminished user satisfaction and trust in the systems. Acknowledging the recent advancements in explainable recommender systems that enhance users' understanding of recommendation mechanisms, we propose leveraging these advancements to improve user controllability. In this paper, we present a user-controllable recommender system that seamlessly integrates explainability and controllability within a unified framework. By providing both retrospective and prospective explanations through counterfactual reasoning, users can customize their control over the system by interacting with these explanations. Furthermore, we introduce and assess two attributes of controllability in recommendation systems: the complexity of controllability and the accuracy of controllability. Experimental evaluations on MovieLens and Yelp datasets substantiate the effectiveness of our proposed framework. Additionally, our experiments demonstrate that offering users control options can potentially enhance recommendation accuracy in the future. Source code and data are available at \url{https://github.com/chrisjtan/ucr}.

Tango: rethinking quantization for graph neural network training on GPUs

  • paper_url: http://arxiv.org/abs/2308.00890
  • repo_url: None
  • paper_authors: Shiyang Chen, Da Zheng, Caiwen Ding, Chengying Huan, Yuede Ji, Hang Liu
  • for: This paper aims to improve the efficiency of graph neural network training by using quantization to accelerate computation.
  • methods: The paper makes three main contributions: first, efficient rules to maintain accuracy during quantized training; second, quantization-aware primitives and inter-primitive optimizations that accelerate GNN training; and finally, integration with the popular Deep Graph Library (DGL) system.
  • results: Through the Tango system, the paper achieves faster GNN training than state-of-the-art approaches across a variety of GNN models and datasets.
    Abstract Graph Neural Networks (GNNs) are becoming increasingly popular due to their superior performance in critical graph-related tasks. While quantization is widely used to accelerate GNN computation, quantized training faces unprecedented challenges. Current quantized GNN training systems often have longer training times than their full-precision counterparts for two reasons: (i) addressing the accuracy challenge leads to excessive overhead, and (ii) the optimization potential exposed by quantization is not adequately leveraged. This paper introduces Tango which re-thinks quantization challenges and opportunities for graph neural network training on GPUs with three contributions: Firstly, we introduce efficient rules to maintain accuracy during quantized GNN training. Secondly, we design and implement quantization-aware primitives and inter-primitive optimizations that can speed up GNN training. Finally, we integrate Tango with the popular Deep Graph Library (DGL) system and demonstrate its superior performance over state-of-the-art approaches on various GNN models and datasets.
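For readers unfamiliar with the low-precision primitives such systems build on, the following sketch shows a generic symmetric int8 quantize/dequantize round trip for a node-feature tensor. It is illustrative only and is not Tango's GPU kernel or its accuracy-preserving rules.

```python
# Illustrative only: a per-tensor symmetric int8 quantize/dequantize round trip
# for node features, the kind of low-precision primitive that quantized GNN
# training builds on.
import torch

def quantize_int8(x: torch.Tensor):
    scale = x.abs().max().clamp(min=1e-8) / 127.0   # per-tensor symmetric scale
    q = torch.clamp(torch.round(x / scale), -128, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor):
    return q.float() * scale

features = torch.randn(1024, 128)                    # node features
q, scale = quantize_int8(features)
recovered = dequantize(q, scale)
print("max abs error:", (features - recovered).abs().max().item())
```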

Factor Graph Neural Networks

  • paper_url: http://arxiv.org/abs/2308.00887
  • repo_url: https://github.com/molyswu/hand_detection
  • paper_authors: Zhen Zhang, Mohammed Haroon Dupty, Fan Wu, Javen Qinfeng Shi, Wee Sun Lee
  • for: This paper proposes Factor Graph Neural Networks (FGNNs), graph neural networks that can effectively capture higher-order relations for inference and learning.
  • methods: The paper derives an efficient approximate inference algorithm and neuralizes it into a graph neural network module with richer message-update rules.
  • results: Experiments show that the proposed FGNN achieves excellent performance on synthetic and real-world datasets and can represent both Max-Product and Sum-Product loopy belief propagation.
    Abstract In recent years, we have witnessed a surge of Graph Neural Networks (GNNs), most of which can learn powerful representations in an end-to-end fashion with great success in many real-world applications. They have resemblance to Probabilistic Graphical Models (PGMs), but break free from some limitations of PGMs. By aiming to provide expressive methods for representation learning instead of computing marginals or most likely configurations, GNNs provide flexibility in the choice of information flowing rules while maintaining good performance. Despite their success and inspirations, they lack efficient ways to represent and learn higher-order relations among variables/nodes. More expressive higher-order GNNs which operate on k-tuples of nodes need increased computational resources in order to process higher-order tensors. We propose Factor Graph Neural Networks (FGNNs) to effectively capture higher-order relations for inference and learning. To do so, we first derive an efficient approximate Sum-Product loopy belief propagation inference algorithm for discrete higher-order PGMs. We then neuralize the novel message passing scheme into a Factor Graph Neural Network (FGNN) module by allowing richer representations of the message update rules; this facilitates both efficient inference and powerful end-to-end learning. We further show that with a suitable choice of message aggregation operators, our FGNN is also able to represent Max-Product belief propagation, providing a single family of architecture that can represent both Max and Sum-Product loopy belief propagation. Our extensive experimental evaluation on synthetic as well as real datasets demonstrates the potential of the proposed model.
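To make the sum-product message passing that the paper neuralizes concrete, here is a tiny classical computation on a two-variable factor graph with NumPy. The FGNN, roughly speaking, replaces such hand-coded updates with learned ones; the factors below are arbitrary.

```python
# A tiny classical sum-product computation on a two-variable factor graph
# (unary factors phi1, phi2 and one pairwise factor psi). This sketch only
# illustrates the underlying algorithm, not the paper's model.
import numpy as np

phi1 = np.array([0.7, 0.3])            # unary factor over x1 (2 states)
phi2 = np.array([0.4, 0.6])            # unary factor over x2 (2 states)
psi = np.array([[1.0, 0.2],            # pairwise factor psi(x1, x2)
                [0.2, 1.0]])

# message from factor psi to x1: sum over x2 of psi(x1, x2) * phi2(x2)
m_psi_to_x1 = psi @ phi2
# message from factor psi to x2: sum over x1 of psi(x1, x2) * phi1(x1)
m_psi_to_x2 = psi.T @ phi1

# beliefs are products of incoming messages, normalized
b1 = phi1 * m_psi_to_x1
b2 = phi2 * m_psi_to_x2
print("marginal of x1:", b1 / b1.sum())
print("marginal of x2:", b2 / b2.sum())
```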

Enhancing Machine Learning Performance with Continuous In-Session Ground Truth Scores: Pilot Study on Objective Skeletal Muscle Pain Intensity Prediction

  • paper_url: http://arxiv.org/abs/2308.00886
  • repo_url: None
  • paper_authors: Boluwatife E. Faremi, Jonathon Stavres, Nuno Oliveira, Zhaoxian Zhou, Andrew H. Sung
  • for: This study aimed to develop a novel approach for objective pain intensity characterization using machine learning (ML) models and real-time, continuous in-session pain scores.
  • methods: The study used two devices to acquire real-time pain scores and ANS-modulated electrodermal activity (EDA) data. A custom pain platform stored the data and supported extraction of time-domain EDA features and in-session ground truth scores, and ML models, including a Multi-layer Perceptron (MLP) and a Random Forest (RF), were trained on the objective EDA features combined with in-session scores.
  • results: Using continuous in-session ground truth scores significantly enhanced the performance of the ML models in pain intensity characterization, with macro-averaged geometric mean scores of 75.9% and 78.3% for the MLP and RF models, respectively, compared to 70.3% and 74.6% for models trained with post-session scores, demonstrating the potential of real-time, continuous pain scores to improve the accuracy of ML pain systems.
    Abstract Machine learning (ML) models trained on subjective self-report scores struggle to objectively classify pain accurately due to the significant variance between real-time pain experiences and recorded scores afterwards. This study developed two devices for acquisition of real-time, continuous in-session pain scores and gathering of ANS-modulated electrodermal activity (EDA). The experiment recruited N = 24 subjects who underwent a post-exercise circulatory occlusion (PECO) with stretch, inducing discomfort. Subject data were stored in a custom pain platform, facilitating extraction of time-domain EDA features and in-session ground truth scores. Moreover, post-experiment visual analog scale (VAS) scores were collected from each subject. Machine learning models, namely Multi-layer Perceptron (MLP) and Random Forest (RF), were trained using corresponding objective EDA features combined with in-session scores and post-session scores, respectively. Over a 10-fold cross-validation, the macro-averaged geometric mean score revealed MLP and RF models trained with objective EDA features and in-session scores achieved superior performance (75.9% and 78.3%) compared to models trained with post-session scores (70.3% and 74.6%) respectively. This pioneering study demonstrates that using continuous in-session ground truth scores significantly enhances ML performance in pain intensity characterization, overcoming ground truth sparsity-related issues, data imbalance, and high variance. This study informs future objective-based ML pain system training.
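A hedged sketch of the modeling step with scikit-learn, using synthetic stand-in features rather than the study's EDA data: train an MLP and a Random Forest and score them with the macro geometric mean of per-class recalls, one common reading of the macro-averaged geometric mean used for imbalanced classification.

```python
# Synthetic stand-in data (not the study's EDA features): compare an MLP and a
# Random Forest using the geometric mean of per-class recalls.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, n_features=12, n_classes=3,
                           n_informative=6, weights=[0.5, 0.3, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

def geometric_mean_recall(y_true, y_pred):
    recalls = recall_score(y_true, y_pred, average=None)
    return float(np.prod(recalls) ** (1.0 / len(recalls)))

for name, clf in [("MLP", MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)),
                  ("RF", RandomForestClassifier(n_estimators=200, random_state=0))]:
    clf.fit(X_tr, y_tr)
    print(name, "geometric mean recall:", geometric_mean_recall(y_te, clf.predict(X_te)))
```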

Revolutionizing Wireless Networks with Federated Learning: A Comprehensive Review

  • paper_url: http://arxiv.org/abs/2308.04404
  • repo_url: None
  • paper_authors: Sajjad Emdadi Mahdimahalleh
  • for: This article examines the significance of machine learning in wireless communication and highlights the potential role of federated learning (FL) in future mobile networks.
  • methods: The article focuses on federated learning (FL), which separates data acquisition and computation at the central unit in wireless edge networks, in contrast to traditional centralized learning.
  • results: Because wireless communication resources are limited and unreliable, the article argues that FL is well suited to such environments and can improve the efficiency and reliability of wireless systems.
    Abstract These days with the rising computational capabilities of wireless user equipment such as smart phones, tablets, and vehicles, along with growing concerns about sharing private data, a novel machine learning model called federated learning (FL) has emerged. FL enables the separation of data acquisition and computation at the central unit, which is different from centralized learning that occurs in a data center. FL is typically used in a wireless edge network where communication resources are limited and unreliable. Bandwidth constraints necessitate scheduling only a subset of UEs for updates in each iteration, and because the wireless medium is shared, transmissions are susceptible to interference and are not assured. The article discusses the significance of Machine Learning in wireless communication and highlights Federated Learning (FL) as a novel approach that could play a vital role in future mobile networks, particularly 6G and beyond.

PeRP: Personalized Residual Policies For Congestion Mitigation Through Co-operative Advisory Systems

  • paper_url: http://arxiv.org/abs/2308.00864
  • repo_url: None
  • paper_authors: Aamir Hasan, Neeloy Chakraborty, Haonan Chen, Jung-Hoon Cho, Cathy Wu, Katherine Driggs-Campbell
  • for: This work aims to improve the reliability and efficiency of intelligent driving systems, thereby improving socioeconomic factors such as commute time and fuel costs.
  • methods: The work combines Piecewise Constant (PC) policies with a Personalized Residual Policy (PeRP) that models human driving behavior and provides personalized action advice.
  • results: Trained in simulation and compared against baselines, the approach successfully mitigates congestion while adapting to different driver behaviors, improving average speed by 4 to 22%.
    Abstract Intelligent driving systems can be used to mitigate congestion through simple actions, thus improving many socioeconomic factors such as commute time and gas costs. However, these systems assume precise control over autonomous vehicle fleets, and are hence limited in practice as they fail to account for uncertainty in human behavior. Piecewise Constant (PC) Policies address these issues by structurally modeling the likeness of human driving to reduce traffic congestion in dense scenarios to provide action advice to be followed by human drivers. However, PC policies assume that all drivers behave similarly. To this end, we develop a co-operative advisory system based on PC policies with a novel driver trait conditioned Personalized Residual Policy, PeRP. PeRP advises drivers to behave in ways that mitigate traffic congestion. We first infer the driver's intrinsic traits on how they follow instructions in an unsupervised manner with a variational autoencoder. Then, a policy conditioned on the inferred trait adapts the action of the PC policy to provide the driver with a personalized recommendation. Our system is trained in simulation with novel driver modeling of instruction adherence. We show that our approach successfully mitigates congestion while adapting to different driver behaviors, with 4 to 22% improvement in average speed over baselines.
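The core composition, a base piecewise-constant advisory action plus a learned residual conditioned on an inferred driver trait, can be sketched as follows. Shapes, names, and the trait source are illustrative assumptions, not the paper's implementation.

```python
# Illustrative composition only: a base piecewise-constant advisory speed plus
# a learned residual conditioned on an inferred driver trait vector.
import torch
import torch.nn as nn

class ResidualPolicy(nn.Module):
    def __init__(self, obs_dim=8, trait_dim=3, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + trait_dim, hidden),
                                 nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, obs, trait, pc_action):
        residual = self.net(torch.cat([obs, trait], dim=-1))
        return pc_action + residual          # personalized advised action

obs = torch.randn(1, 8)                      # local traffic observation
trait = torch.randn(1, 3)                    # driver trait, e.g. inferred by a VAE encoder
pc_action = torch.tensor([[12.0]])           # base piecewise-constant speed advice (m/s)
advised = ResidualPolicy()(obs, trait, pc_action)
print("advised speed:", advised.item())
```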

Understanding Activation Patterns in Artificial Neural Networks by Exploring Stochastic Processes

  • paper_url: http://arxiv.org/abs/2308.00858
  • repo_url: None
  • paper_authors: Stephan Johann Lehmler, Muhammad Saif-ur-Rehman, Tobias Glasmachers, Ioannis Iossifidis
  • for: This work seeks a deeper understanding of the behavior and learning dynamics of (deep) artificial neural networks.
  • methods: The work adopts a stochastic-process framework, which offers a simplified perspective on network performance and enables systematic investigation through simulation.
  • results: Applying the stochastic-process model to different artificial neural networks, the authors find that activation patterns during learning show stable characteristics that can be assessed with metrics such as the Mean Firing Rate, Mean Fano Factor, and Variances; these results may help explain the behavior and learning mechanisms of artificial neural networks.
    Abstract To gain a deeper understanding of the behavior and learning dynamics of (deep) artificial neural networks, it is valuable to employ mathematical abstractions and models. These tools provide a simplified perspective on network performance and facilitate systematic investigations through simulations. In this paper, we propose utilizing the framework of stochastic processes, which has been underutilized thus far. Our approach models activation patterns of thresholded nodes in (deep) artificial neural networks as stochastic processes. We focus solely on activation frequency, leveraging neuroscience techniques used for real neuron spike trains. During a classification task, we extract spiking activity and use an arrival process following the Poisson distribution. We examine observed data from various artificial neural networks in image recognition tasks, fitting the proposed model's assumptions. Through this, we derive parameters describing activation patterns in each network. Our analysis covers randomly initialized, generalizing, and memorizing networks, revealing consistent differences across architectures and training sets. Calculating Mean Firing Rate, Mean Fano Factor, and Variances, we find stable indicators of memorization during learning, providing valuable insights into network behavior. The proposed model shows promise in describing activation patterns and could serve as a general framework for future investigations. It has potential applications in theoretical simulations, pruning, and transfer learning.
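The summary statistics mentioned above are straightforward to compute; the sketch below uses synthetic Poisson activation counts, not the paper's data, and reports mean firing rate, variance, and the Fano factor (variance over mean, which equals 1 for an ideal Poisson process).

```python
# Synthetic illustration: treat per-window activation counts of thresholded
# units as an arrival process and compute mean firing rate, variance, and the
# Fano factor.
import numpy as np

rng = np.random.default_rng(0)
window = 0.1                                   # seconds per observation window
counts = rng.poisson(lam=3.0, size=(50, 200))  # 50 units x 200 windows

rate = counts.mean(axis=1) / window            # mean firing rate per unit (Hz)
var = counts.var(axis=1)
fano = var / counts.mean(axis=1)               # close to 1.0 for Poisson activity

print("mean firing rate (Hz):", rate.mean())
print("mean Fano factor:", fano.mean())
print("mean variance:", var.mean())
```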

Differential Privacy for Adaptive Weight Aggregation in Federated Tumor Segmentation

  • paper_url: http://arxiv.org/abs/2308.00856
  • repo_url: None
  • paper_authors: Muhammad Irfan Khan, Esa Alhoniemi, Elina Kontio, Suleiman A. Khan, Mojtaba Jafaritadi
  • for: This work aims to provide a federated learning framework with data-privacy protection, safeguarding the privacy and integrity of medical imaging data.
  • methods: The work uses DP-SimAgg, a differentially private similarity-weighted aggregation algorithm for federated learning, which improves model segmentation capability while adding a layer of privacy preservation.
  • results: Results show that DP-SimAgg achieves accurate and robust brain tumor segmentation while also contributing a notable reduction in communication costs during model training.
    Abstract Federated Learning (FL) is a distributed machine learning approach that safeguards privacy by creating an impartial global model while respecting the privacy of individual client data. However, the conventional FL method can introduce security risks when dealing with diverse client data, potentially compromising privacy and data integrity. To address these challenges, we present a differential privacy (DP) federated deep learning framework in medical image segmentation. In this paper, we extend our similarity weight aggregation (SimAgg) method to DP-SimAgg algorithm, a differentially private similarity-weighted aggregation algorithm for brain tumor segmentation in multi-modal magnetic resonance imaging (MRI). Our DP-SimAgg method not only enhances model segmentation capabilities but also provides an additional layer of privacy preservation. Extensive benchmarking and evaluation of our framework, with computational performance as a key consideration, demonstrate that DP-SimAgg enables accurate and robust brain tumor segmentation while minimizing communication costs during model training. This advancement is crucial for preserving the privacy of medical image data and safeguarding sensitive information. In conclusion, adding a differential privacy layer in the global weight aggregation phase of the federated brain tumor segmentation provides a promising solution to privacy concerns without compromising segmentation model efficacy. By leveraging DP, we ensure the protection of client data against adversarial attacks and malicious participants.
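The general shape of similarity-weighted aggregation with a Gaussian privacy mechanism can be sketched as follows. The clipping bound, weighting, and noise scale below are toy assumptions; the actual DP-SimAgg calibration follows the paper.

```python
# Toy sketch of similarity-weighted aggregation with Gaussian noise for
# differential privacy. Values and weighting here are illustrative assumptions,
# not the paper's calibration.
import numpy as np

def dp_simagg(client_updates, clip_norm=1.0, noise_std=0.05):
    # clip each client's update to bound its sensitivity
    clipped = []
    for u in client_updates:
        norm = np.linalg.norm(u)
        clipped.append(u * min(1.0, clip_norm / (norm + 1e-12)))
    clipped = np.stack(clipped)

    # weight clients by similarity to the mean update (more typical clients count more)
    mean = clipped.mean(axis=0)
    sims = np.array([np.dot(u, mean) / (np.linalg.norm(u) * np.linalg.norm(mean) + 1e-12)
                     for u in clipped])
    weights = np.exp(sims) / np.exp(sims).sum()

    aggregated = (weights[:, None] * clipped).sum(axis=0)
    # Gaussian mechanism: add noise scaled to the clipping bound
    return aggregated + np.random.normal(0.0, noise_std * clip_norm, size=aggregated.shape)

updates = [np.random.randn(10) for _ in range(4)]   # stand-in model deltas from 4 clients
print(dp_simagg(updates))
```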

A Comprehensive Study of Groundbreaking Machine Learning Research: Analyzing Highly Cited and Impactful Publications across Six Decades

  • paper_url: http://arxiv.org/abs/2308.00855
  • repo_url: None
  • paper_authors: Absalom E. Ezugwu, Japie Greeff, Yuh-Shan Ho
  • for: This study examines the most highly cited papers in machine learning (ML) to identify the field's key trends, influential authors, and significant contributions.
  • methods: The study applies various bibliometric techniques, including citation analysis, co-authorship analysis, keyword analysis, and analysis of publication trends.
  • results: The study identifies the most influential papers, highly cited authors, and collaborative networks within the machine learning community, along with popular research themes and emerging topics; additionally, it finds that certain countries hold a dominant position in ML research.
    Abstract Machine learning (ML) has emerged as a prominent field of research in computer science and other related fields, thereby driving advancements in other domains of interest. As the field continues to evolve, it is crucial to understand the landscape of highly cited publications to identify key trends, influential authors, and significant contributions made thus far. In this paper, we present a comprehensive bibliometric analysis of highly cited ML publications. We collected a dataset consisting of the top-cited papers from reputable ML conferences and journals, covering a period of several years from 1959 to 2022. We employed various bibliometric techniques to analyze the data, including citation analysis, co-authorship analysis, keyword analysis, and publication trends. Our findings reveal the most influential papers, highly cited authors, and collaborative networks within the machine learning community. We identify popular research themes and uncover emerging topics that have recently gained significant attention. Furthermore, we examine the geographical distribution of highly cited publications, highlighting the dominance of certain countries in ML research. By shedding light on the landscape of highly cited ML publications, our study provides valuable insights for researchers, policymakers, and practitioners seeking to understand the key developments and trends in this rapidly evolving field.

CASSINI: Network-Aware Job Scheduling in Machine Learning Clusters

  • paper_url: http://arxiv.org/abs/2308.00852
  • repo_url: None
  • paper_authors: Sudarsanan Rajasekaran, Manya Ghobadi, Aditya Akella
  • for: Improving job completion times and traffic management in machine learning (ML) clusters
  • methods: Using an affinity graph to account for the communication patterns of different jobs and adjusting time-shift values so that the communication phases of jobs sharing a network link are interleaved
  • results: On a 24-server testbed, compared with state-of-the-art ML schedulers, CASSINI improves the average and tail completion times of jobs by up to 1.6x and 2.5x, respectively, while reducing the number of ECN-marked packets in the cluster by up to 33x
    Abstract We present CASSINI, a network-aware job scheduler for machine learning (ML) clusters. CASSINI introduces a novel geometric abstraction to consider the communication pattern of different jobs while placing them on network links. To do so, CASSINI uses an affinity graph that finds a series of time-shift values to adjust the communication phases of a subset of jobs, such that the communication patterns of jobs sharing the same network link are interleaved with each other. Experiments with 13 common ML models on a 24-server testbed demonstrate that compared to the state-of-the-art ML schedulers, CASSINI improves the average and tail completion time of jobs by up to 1.6x and 2.5x, respectively. Moreover, we show that CASSINI reduces the number of ECN marked packets in the cluster by up to 33x.

An Exact Kernel Equivalence for Finite Classification Models

  • paper_url: http://arxiv.org/abs/2308.00824
  • repo_url: None
  • paper_authors: Brian Bell, Michael Geyer, David Glickenstein, Amanda Fernandez, Juston Moore
  • for: This paper explores the equivalence between neural networks and kernel methods, deriving the first exact kernel representation of any finite-size parametric classification model.
  • methods: The paper considers models trained with gradient descent and derives the corresponding exact kernel machine.
  • results: Experiments show that the kernel can be computed for realistic networks up to machine precision and that it provides useful insights into generalization.
    Abstract We explore the equivalence between neural networks and kernel methods by deriving the first exact representation of any finite-size parametric classification model trained with gradient descent as a kernel machine. We compare our exact representation to the well-known Neural Tangent Kernel (NTK) and discuss approximation error relative to the NTK and other non-exact path kernel formulations. We experimentally demonstrate that the kernel can be computed for realistic networks up to machine precision. We use this exact kernel to show that our theoretical contribution can provide useful insights into the predictions made by neural networks, particularly the way in which they generalize.

An Introduction to Bi-level Optimization: Foundations and Applications in Signal Processing and Machine Learning

  • paper_url: http://arxiv.org/abs/2308.00788
  • repo_url: None
  • paper_authors: Yihua Zhang, Prashant Khanduri, Ioannis Tsaknakis, Yuguang Yao, Mingyi Hong, Sijia Liu
  • for: This paper provides an overview of bi-level optimization (BLO) problems in the context of signal processing (SP) and machine learning (ML) applications.
  • methods: The paper covers basic concepts such as optimality conditions, standard algorithms, and practical implementations of BLO problems in SP and ML applications.
  • results: The paper discusses recent advances in BLO theory and their implications for applications, and points out limitations of the state of the art that call for significant future research.
    Abstract Recently, bi-level optimization (BLO) has taken center stage in some very exciting developments in the area of signal processing (SP) and machine learning (ML). Roughly speaking, BLO is a classical optimization problem that involves two levels of hierarchy (i.e., upper and lower levels), wherein obtaining the solution to the upper-level problem requires solving the lower-level one. BLO has become popular largely because it is powerful in modeling problems in SP and ML, among others, that involve optimizing nested objective functions. Prominent applications of BLO range from resource allocation for wireless systems to adversarial machine learning. In this work, we focus on a class of tractable BLO problems that often appear in SP and ML applications. We provide an overview of some basic concepts of this class of BLO problems, such as their optimality conditions, standard algorithms (including their optimization principles and practical implementations), as well as how they can be leveraged to obtain state-of-the-art results for a number of key SP and ML applications. Further, we discuss some recent advances in BLO theory, its implications for applications, and point out some limitations of the state-of-the-art that require significant future research efforts. Overall, we hope that this article can serve to accelerate the adoption of BLO as a generic tool to model, analyze, and innovate on a wide array of emerging SP and ML applications.
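A minimal numerical example of the nested structure: the lower-level problem is solved by a few differentiable gradient steps inside each upper-level update, and the upper-level gradient is obtained by backpropagating through that inner solve. Unrolled differentiation is only one of the algorithm families the article covers, and the toy objectives are invented for illustration.

```python
# Minimal unrolled bi-level example (toy objectives, not from the article):
# the upper-level variable lam tunes a lower-level problem; the lower level is
# solved by a few differentiable gradient steps, and the upper gradient flows
# through that inner solve.
import torch

lam = torch.tensor(0.5, requires_grad=True)          # upper-level variable
upper_opt = torch.optim.SGD([lam], lr=0.05)

for outer_step in range(100):
    w = torch.tensor(0.0, requires_grad=True)        # lower-level variable
    # lower level: unrolled gradient steps on f(w; lam) = (w - 3)^2 + lam * w^2
    for _ in range(20):
        inner_loss = (w - 3.0) ** 2 + lam * w ** 2
        (grad_w,) = torch.autograd.grad(inner_loss, w, create_graph=True)
        w = w - 0.1 * grad_w                          # keep the graph for the upper gradient
    # upper level: g(w*(lam)) = (w*(lam) - 1)^2
    upper_loss = (w - 1.0) ** 2
    upper_opt.zero_grad()
    upper_loss.backward()
    upper_opt.step()

# analytically w*(lam) = 3 / (1 + lam), so the upper optimum is lam = 2
print("learned lam:", lam.item())
```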

Evaluating Spiking Neural Network On Neuromorphic Platform For Human Activity Recognition

  • paper_url: http://arxiv.org/abs/2308.00787
  • repo_url: None
  • paper_authors: Sizhen Bian, Michele Magno
  • for: This work evaluates spiking neural networks on a neuromorphic processor for human activity recognition, targeting the energy-efficiency and latency requirements of wearables.
  • methods: A multi-threshold delta modulation approach encodes the input sensor data into spike trains, which are then fed into a spiking neural network for training.
  • results: The spike-based workout recognition system achieves 87.5% accuracy, comparable to a traditional neural network on the milliwatt RISC-V based multi-core processor GAP8 (88.1%), while achieving a two times better energy-delay product (0.66 µJ·s vs. 1.32 µJ·s).
    Abstract Energy efficiency and low latency are crucial requirements for designing wearable AI-empowered human activity recognition systems, due to the hard constraints of battery operations and closed-loop feedback. While neural network models have been extensively compressed to match the stringent edge requirements, spiking neural networks and event-based sensing are recently emerging as promising solutions to further improve performance due to their inherent energy efficiency and capacity to process spatiotemporal data in very low latency. This work aims to evaluate the effectiveness of spiking neural networks on neuromorphic processors in human activity recognition for wearable applications. The case of workout recognition with wrist-worn wearable motion sensors is used as a study. A multi-threshold delta modulation approach is utilized for encoding the input sensor data into spike trains to move the pipeline into the event-based approach. The spike trains are then fed to a spiking neural network with direct-event training, and the trained model is deployed on the research neuromorphic platform from Intel, Loihi, to evaluate energy and latency efficiency. Test results show that the spike-based workout recognition system can achieve an accuracy (87.5%) comparable to the popular milliwatt RISC-V based multi-core processor GAP8 with a traditional neural network (88.1%) while achieving two times better energy-delay product (0.66 µJ·s vs. 1.32 µJ·s).
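A plain-Python sketch of the encoding idea, with an invented signal and thresholds: whenever the input drifts from the last reconstructed level by more than a threshold, a signed spike is emitted on the channel for that threshold. The paper's exact multi-threshold scheme may differ in detail.

```python
# Sketch of multi-threshold delta modulation on a synthetic signal (arbitrary
# thresholds, not the paper's sensor setup): emit a signed spike on the channel
# of the largest crossed threshold and move the reconstruction level.
import numpy as np

def delta_modulate(signal, thresholds=(0.05, 0.15, 0.3)):
    level = signal[0]
    spikes = []                                   # (time index, channel, +1/-1)
    for t, x in enumerate(signal):
        diff = x - level
        for ch, th in sorted(enumerate(thresholds), key=lambda p: -p[1]):
            if abs(diff) >= th:
                polarity = 1 if diff > 0 else -1
                spikes.append((t, ch, polarity))
                level += polarity * th            # update the reconstruction level
                break
    return spikes

t = np.linspace(0, 2, 400)
signal = 0.5 * np.sin(2 * np.pi * 1.5 * t) + 0.02 * np.random.randn(t.size)
events = delta_modulate(signal)
print("number of spike events:", len(events))
```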

DYMOND: DYnamic MOtif-NoDes Network Generative Model

  • paper_url: http://arxiv.org/abs/2308.00770
  • repo_url: https://github.com/zeno129/dymond
  • paper_authors: Giselle Zeno, Timothy La Fond, Jennifer Neville
  • for: This paper proposes a motif-based generative model for dynamic graphs that better captures graph structure and node behavior over time.
  • methods: The model uses temporal motif activity to capture changes in the graph while accounting for the roles nodes play within motifs.
  • results: Compared with edge-addition baselines on real-world networks, the model better reproduces graph structure and node behavior; the paper also proposes a new methodology that adapts graph structure metrics to evaluate the temporal aspects of a network.
    Abstract Motifs, which have been established as building blocks for network structure, move beyond pair-wise connections to capture longer-range correlations in connections and activity. In spite of this, there are few generative graph models that consider higher-order network structures and even fewer that focus on using motifs in models of dynamic graphs. Most existing generative models for temporal graphs strictly grow the networks via edge addition, and the models are evaluated using static graph structure metrics -- which do not adequately capture the temporal behavior of the network. To address these issues, in this work we propose DYnamic MOtif-NoDes (DYMOND) -- a generative model that considers (i) the dynamic changes in overall graph structure using temporal motif activity and (ii) the roles nodes play in motifs (e.g., one node plays the hub role in a wedge, while the remaining two act as spokes). We compare DYMOND to three dynamic graph generative model baselines on real-world networks and show that DYMOND performs better at generating graph structure and node behavior similar to the observed network. We also propose a new methodology to adapt graph structure metrics to better evaluate the temporal aspect of the network. These metrics take into account the changes in overall graph structure and the individual nodes' behavior over time.

Self-Supervised Contrastive BERT Fine-tuning for Fusion-based Reviewed-Item Retrieval

  • paper_url: http://arxiv.org/abs/2308.00762
  • repo_url: https://github.com/d3mlab/rir_data
  • paper_authors: Mohammad Mahdi Abdollah Pour, Parsa Farinneya, Armin Toroghi, Anton Korikov, Ali Pesaranghader, Touqir Sajed, Manasa Bharadwaj, Borislav Mavrin, Scott Sanner
  • for: This work extends Neural Information Retrieval (IR) methods to the Reviewed-Item Retrieval (RIR) task via self-supervised contrastive learning of BERT embeddings.
  • methods: Self-supervised contrastive fine-tuning of BERT embeddings for queries and reviews, including strategies for selecting positive and negative samples, anchor sub-sampling, and augmentation with meta-data.
  • results: Experiments show that Late Fusion contrastive learning for Neural RIR outperforms the other contrastive IR configurations, Neural IR, and sparse retrieval baselines.
    Abstract As natural language interfaces enable users to express increasingly complex natural language queries, there is a parallel explosion of user review content that can allow users to better find items such as restaurants, books, or movies that match these expressive queries. While Neural Information Retrieval (IR) methods have provided state-of-the-art results for matching queries to documents, they have not been extended to the task of Reviewed-Item Retrieval (RIR), where query-review scores must be aggregated (or fused) into item-level scores for ranking. In the absence of labeled RIR datasets, we extend Neural IR methodology to RIR by leveraging self-supervised methods for contrastive learning of BERT embeddings for both queries and reviews. Specifically, contrastive learning requires a choice of positive and negative samples, where the unique two-level structure of our item-review data combined with meta-data affords us a rich structure for the selection of these samples. For contrastive learning in a Late Fusion scenario, we investigate the use of positive review samples from the same item and/or with the same rating, selection of hard positive samples by choosing the least similar reviews from the same anchor item, and selection of hard negative samples by choosing the most similar reviews from different items. We also explore anchor sub-sampling and augmenting with meta-data. For a more end-to-end Early Fusion approach, we introduce contrastive item embedding learning to fuse reviews into single item embeddings. Experimental results show that Late Fusion contrastive learning for Neural RIR outperforms all other contrastive IR configurations, Neural IR, and sparse retrieval baselines, thus demonstrating the power of exploiting the two-level structure in Neural RIR approaches as well as the importance of preserving the nuance of individual review content via Late Fusion methods.
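The Late Fusion step itself, scoring a query against each review embedding and aggregating review-level scores into item-level scores, can be sketched independently of the contrastive fine-tuning. The embeddings below are random stand-ins for BERT outputs, and mean aggregation is just one reasonable fusion choice.

```python
# Sketch of Late Fusion only: score a query embedding against each review
# embedding, then aggregate review-level scores into item-level scores for
# ranking. Embeddings are random stand-ins for fine-tuned BERT outputs.
import numpy as np

rng = np.random.default_rng(0)
dim = 16
query = rng.normal(size=dim)
items = {                                  # item id -> matrix of review embeddings
    "item_a": rng.normal(size=(5, dim)),
    "item_b": rng.normal(size=(3, dim)),
}

def cosine_scores(query, reviews):
    return reviews @ query / (np.linalg.norm(reviews, axis=1) * np.linalg.norm(query) + 1e-12)

item_scores = {}
for item_id, reviews in items.items():
    review_scores = cosine_scores(query, reviews)   # query-review scores
    item_scores[item_id] = review_scores.mean()     # fuse into an item-level score

ranking = sorted(item_scores, key=item_scores.get, reverse=True)
print(ranking, item_scores)
```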

The Bias Amplification Paradox in Text-to-Image Generation

  • paper_url: http://arxiv.org/abs/2308.00755
  • repo_url: https://github.com/preethiseshadri518/bias-amplification-paradox
  • paper_authors: Preethi Seshadri, Sameer Singh, Yanai Elazar
  • for: This paper studies bias amplification in the text-to-image domain, specifically looking at gender-occupation biases in the training data (LAION).
  • methods: The authors use Stable Diffusion to compare gender ratios in the training data and the generated images, and they find that the model amplifies gender biases present in the training data. However, they also identify several confounding factors that contribute to this amplification.
  • results: The authors discover that the model appears to amplify gender biases, but this amplification can largely be attributed to discrepancies between training captions and model prompts. Once these distributional differences are accounted for, the amplification decreases considerably. The findings highlight the challenges of comparing biases in models and the data they are trained on.
    Abstract Bias amplification is a phenomenon in which models increase imbalances present in the training data. In this paper, we study bias amplification in the text-to-image domain using Stable Diffusion by comparing gender ratios in training vs. generated images. We find that the model appears to amplify gender-occupation biases found in the training data (LAION). However, we discover that amplification can largely be attributed to discrepancies between training captions and model prompts. For example, an inherent difference is that captions from the training data often contain explicit gender information while the prompts we use do not, which leads to a distribution shift and consequently impacts bias measures. Once we account for various distributional differences between texts used for training and generation, we observe that amplification decreases considerably. Our findings illustrate the challenges of comparing biases in models and the data they are trained on, and highlight confounding factors that contribute to bias amplification.
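The core measurement reduces to a simple share comparison per occupation; the counts below are made up for illustration and are not LAION or Stable Diffusion statistics.

```python
# Hypothetical counts for one occupation (not LAION or model statistics):
# compare the share of "female" in training captions vs. generated images;
# amplification means the generated share moves further from parity than the
# training share.
train_counts = {"female": 120, "male": 380}   # captions for the occupation in training data
gen_counts = {"female": 15, "male": 185}      # generated images for the same prompt

def female_share(counts):
    return counts["female"] / sum(counts.values())

train_share = female_share(train_counts)
gen_share = female_share(gen_counts)
print(f"training share: {train_share:.2f}, generated share: {gen_share:.2f}, "
      f"shift: {gen_share - train_share:+.2f}")
```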

Learning from Hypervectors: A Survey on Hypervector Encoding

  • paper_url: http://arxiv.org/abs/2308.00685
  • repo_url: None
  • paper_authors: Sercan Aygun, Mehran Shoushtari Moghadam, M. Hassan Najafi, Mohsen Imani
  • for: This survey zeroes in on HDC system inputs and the hypervector generation process, which directly influence hypervector encoding.
  • methods: The survey brings together hypervector generation methods from different studies and explores their limitations, challenges, and potential benefits.
  • results: Through this comprehensive exploration, readers gain an understanding of the various encoding types in HDC and insight into the process of hypervector generation for diverse applications.
    Abstract Hyperdimensional computing (HDC) is an emerging computing paradigm that imitates the brain's structure to offer a powerful and efficient processing and learning model. In HDC, the data are encoded with long vectors, called hypervectors, typically with a length of 1K to 10K. The literature provides several encoding techniques to generate orthogonal or correlated hypervectors, depending on the intended application. The existing surveys in the literature often focus on the overall aspects of HDC systems, including system inputs, primary computations, and final outputs. However, this study takes a more specific approach. It zeroes in on the HDC system input and the generation of hypervectors, directly influencing the hypervector encoding process. This survey brings together various methods for hypervector generation from different studies and explores the limitations, challenges, and potential benefits they entail. Through a comprehensive exploration of this survey, readers will acquire a profound understanding of various encoding types in HDC and gain insights into the intricate process of hypervector generation for diverse applications.
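A minimal example of the objects the survey discusses: random bipolar hypervectors, binding by elementwise multiplication, bundling by majority sign, and cosine similarity for readout. Concrete encoding schemes vary across the papers the survey covers.

```python
# Minimal bipolar hypervector example: random generation, binding, bundling,
# and similarity-based readout. Encoding details vary across the surveyed work.
import numpy as np

rng = np.random.default_rng(0)
D = 10_000                                        # typical hypervector length

def random_hv():
    return rng.choice([-1, 1], size=D)

def bind(a, b):                                   # associate two hypervectors
    return a * b

def bundle(*hvs):                                 # superimpose several hypervectors
    s = np.sum(hvs, axis=0)
    return np.where(s >= 0, 1, -1)

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

key, value = random_hv(), random_hv()
record = bundle(bind(key, value), random_hv())    # one key-value pair plus noise
retrieved = bind(record, key)                     # unbinding recovers something close to value
print("similarity to value:", round(cosine(retrieved, value), 3))
print("similarity to random:", round(cosine(retrieved, random_hv()), 3))
```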

CodeBPE: Investigating Subtokenization Options for Large Language Model Pretraining on Source Code

  • paper_url: http://arxiv.org/abs/2308.00683
  • repo_url: None
  • paper_authors: Nadezhda Chirkova, Sergey Troshin
  • for: investigate the effect of different subtokenization options for source code
  • methods: propose subtokenization that reduces average length by 17% without downstream performance drop, and show that a carefully chosen subtokenization may improve quality by 0.5-2%, possibly with some length increase.
  • results: identify most effective and length-efficient subtokenizations, taking into account code specifics.
    Abstract Recent works have widely adopted large language model pretraining for source code, suggested source code-specific pretraining objectives and investigated the applicability of various Transformer-based language model architectures for source code. This work investigates another important aspect of such models, namely the effect of different subtokenization options, and aims at identifying most effective and length-efficient subtokenizations, taking into account code specifics. We propose subtokenization that reduces average length by 17% without downstream performance drop, and show that a carefully chosen subtokenization may improve quality by 0.5-2%, possibly with some length increase.

Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models

  • paper_url: http://arxiv.org/abs/2308.00675
  • repo_url: None
  • paper_authors: Cheng-Yu Hsieh, Si-An Chen, Chun-Liang Li, Yasuhisa Fujii, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna, Tomas Pfister
  • for: Proposing an alternative to demonstrations for teaching large language models (LLMs) to use new tools.
  • methods: Using tool documentation, i.e., descriptions of individual tool usage, in place of demonstrations.
  • results: With tool documentation alone, LLMs can use tools correctly in a zero-shot manner, without demonstrations, and perform better in realistic settings.
    Abstract Today, large language models (LLMs) are taught to use new tools by providing a few demonstrations of the tool's usage. Unfortunately, demonstrations are hard to acquire, and can result in undesirable biased usage if the wrong demonstration is chosen. Even in the rare scenario that demonstrations are readily available, there is no principled selection protocol to determine how many and which ones to provide. As tasks grow more complex, the selection search grows combinatorially and invariably becomes intractable. Our work provides an alternative to demonstrations: tool documentation. We advocate the use of tool documentation, descriptions for the individual tool usage, over demonstrations. We substantiate our claim through three main empirical findings on 6 tasks across both vision and language modalities. First, on existing benchmarks, zero-shot prompts with only tool documentation are sufficient for eliciting proper tool usage, achieving performance on par with few-shot prompts. Second, on a newly collected realistic tool-use dataset with hundreds of available tool APIs, we show that tool documentation is significantly more valuable than demonstrations, with zero-shot documentation significantly outperforming few-shot without documentation. Third, we highlight the benefits of tool documentations by tackling image generation and video tracking using just-released unseen state-of-the-art models as tools. Finally, we highlight the possibility of using tool documentation to automatically enable new applications: by using nothing more than the documentation of GroundingDino, Stable Diffusion, XMem, and SAM, LLMs can re-invent the functionalities of the just-released Grounded-SAM and Track Anything models.
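In practice, zero-shot tool use with documentation amounts to placing the tools' usage descriptions in the prompt instead of worked demonstrations. The schematic builder below uses invented tool names and descriptions, not the APIs evaluated in the paper.

```python
# Schematic only: build a zero-shot prompt from tool documentation strings
# rather than demonstrations. Tool names and descriptions are invented for
# illustration.
TOOL_DOCS = {
    "image_captioner": "image_captioner(image_path) -> str: returns a one-sentence caption.",
    "object_detector": "object_detector(image_path, label) -> list: returns bounding boxes for the label.",
}

def build_zero_shot_prompt(task: str) -> str:
    docs = "\n".join(f"- {name}: {doc}" for name, doc in TOOL_DOCS.items())
    return (
        "You can call the following tools. Their documentation is:\n"
        f"{docs}\n\n"
        f"Task: {task}\n"
        "Respond with the sequence of tool calls needed to solve the task."
    )

print(build_zero_shot_prompt("Count how many dogs appear in photo.jpg"))
```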

  • paper_url: http://arxiv.org/abs/2308.00733
  • repo_url: None
  • paper_authors: Mohammed Almutairi, Ozioma Collins Oguine
  • for: The paper aims to identify trending research areas in the field of Computer Science (CS) and investigate the factors contributing to their emergence.
  • methods: The authors use a comprehensive dataset comprising papers, citations, and funding information, and employ advanced machine learning techniques, including Decision Tree and Logistic Regression models, to predict trending research areas.
  • results: The analysis reveals that reference counts play a pivotal role in determining trending research areas, and the Logistic Regression model outperforms the Decision Tree model in predicting trends, with higher accuracy, precision, recall, and F1 score. The results provide valuable insights into the trending research areas and offer a data-driven foundation for decision-making and future research direction.
    Abstract This paper explores the current trending research areas in the field of Computer Science (CS) and investigates the factors contributing to their emergence. Leveraging a comprehensive dataset comprising papers, citations, and funding information, we employ advanced machine learning techniques, including Decision Tree and Logistic Regression models, to predict trending research areas. Our analysis reveals that the number of references cited in research papers (Reference Count) plays a pivotal role in determining trending research areas making reference counts the most relevant factor that drives trend in the CS field. Additionally, the influence of NSF grants and patents on trending topics has increased over time. The Logistic Regression model outperforms the Decision Tree model in predicting trends, exhibiting higher accuracy, precision, recall, and F1 score. By surpassing a random guess baseline, our data-driven approach demonstrates higher accuracy and efficacy in identifying trending research areas. The results offer valuable insights into the trending research areas, providing researchers and institutions with a data-driven foundation for decision-making and future research direction.
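The model comparison described above is a standard scikit-learn exercise; the sketch below uses synthetic stand-in features, whereas the study's real features include reference counts and NSF grant and patent signals.

```python
# Sketch of the Decision Tree vs. Logistic Regression comparison on synthetic
# stand-in features (the study's real features include reference counts and
# NSF grant/patent signals).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=6, n_informative=4,
                           random_state=0)          # "trending" vs. "not trending"
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, clf in [("Decision Tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
                  ("Logistic Regression", LogisticRegression(max_iter=1000))]:
    clf.fit(X_tr, y_tr)
    print(name)
    print(classification_report(y_te, clf.predict(X_te), digits=3))
```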