results: The study finds that GNNs whose activation function is not a fixed polynomial can distinguish the root vertices of any two non-isomorphic rooted trees of depth two within two iterations. In addition, the work answers an open question posed by [Grohe, 2021] and proves a strict separation between bounded- and unbounded-size GNNs.
Abstract
In this article we present new results about the expressivity of Graph Neural Networks (GNNs). We prove that for any GNN with piecewise polynomial activations, whose architecture size does not grow with the graph input sizes, there exists a pair of non-isomorphic rooted trees of depth two such that the GNN cannot distinguish their root vertex up to an arbitrary number of iterations. The proof relies on tools from the algebra of symmetric polynomials. In contrast, it was already known that unbounded GNNs (those whose size is allowed to change with the graph sizes) with piecewise polynomial activations can distinguish these vertices in only two iterations. Our results imply a strict separation between bounded and unbounded size GNNs, answering an open question formulated by [Grohe, 2021]. We next prove that if one allows activations that are not piecewise polynomial, then in two iterations a single neuron perceptron can distinguish the root vertices of any pair of nonisomorphic trees of depth two (our results hold for activations like the sigmoid, hyperbolic tan and others). This shows how the power of graph neural networks can change drastically if one changes the activation function of the neural networks. The proof of this result utilizes the Lindemann-Weierstrass theorem from transcendental number theory.
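As a concrete toy illustration of why the activation matters (the numbers below are our own choice, not an example from the paper), consider two equal-size multisets of child values that agree on their first two power sums. Summing any fixed polynomial of degree at most two over them gives identical results, while a single sigmoid neuron already separates them:

```python
import math

# Two hypothetical multisets of child features at the root of two depth-two
# rooted trees. They have the same size and agree on the first two power sums,
# so sum-aggregation of any fixed polynomial of degree <= 2 cannot tell them apart.
children_a = [1, 5, 6]
children_b = [2, 3, 7]

print(sum(children_a), sum(children_b))                                # 12, 12
print(sum(x**2 for x in children_a), sum(x**2 for x in children_b))    # 62, 62

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# A single sigmoid neuron applied before sum-aggregation distinguishes them.
print(sum(sigmoid(x) for x in children_a))  # ~2.722
print(sum(sigmoid(x) for x in children_b))  # ~2.832
```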
Active Learning for Video Classification with Frame Level Queries
results: Our method reduces the human annotators' effort to reviewing only a few frames instead of watching the entire video. These results show that our approach can lower the amount of annotation needed to train a machine learning model and make better use of the annotators' time and effort.
Abstract
Deep learning algorithms have pushed the boundaries of computer vision research and have depicted commendable performance in a variety of applications. However, training a robust deep neural network necessitates a large amount of labeled training data, acquiring which involves significant time and human effort. This problem is even more serious for an application like video classification, where a human annotator has to watch an entire video end-to-end to furnish a label. Active learning algorithms automatically identify the most informative samples from large amounts of unlabeled data; this tremendously reduces the human annotation effort in inducing a machine learning model, as only the few samples that are identified by the algorithm, need to be labeled manually. In this paper, we propose a novel active learning framework for video classification, with the goal of further reducing the labeling onus on the human annotators. Our framework identifies a batch of exemplar videos, together with a set of informative frames for each video; the human annotator needs to merely review the frames and provide a label for each video. This involves much less manual work than watching the complete video to come up with a label. We formulate a criterion based on uncertainty and diversity to identify the informative videos and exploit representative sampling techniques to extract a set of exemplar frames from each video. To the best of our knowledge, this is the first research effort to develop an active learning framework for video classification, where the annotators need to inspect only a few frames to produce a label, rather than watching the end-to-end video.
Multimodal brain age estimation using interpretable adaptive population-graph learning
results: Compared with static graph construction approaches and other adaptive methods, the proposed method performs strongly on brain age regression and classification, and the attention weights assigned to imaging and non-imaging features (phenotypes) improve the interpretability of the graph construction.
Abstract
Brain age estimation is clinically important as it can provide valuable information in the context of neurodegenerative diseases such as Alzheimer's. Population graphs, which include multimodal imaging information of the subjects along with the relationships among the population, have been used in literature along with Graph Convolutional Networks (GCNs) and have proved beneficial for a variety of medical imaging tasks. A population graph is usually static and constructed manually using non-imaging information. However, graph construction is not a trivial task and might significantly affect the performance of the GCN, which is inherently very sensitive to the graph structure. In this work, we propose a framework that learns a population graph structure optimized for the downstream task. An attention mechanism assigns weights to a set of imaging and non-imaging features (phenotypes), which are then used for edge extraction. The resulting graph is used to train the GCN. The entire pipeline can be trained end-to-end. Additionally, by visualizing the attention weights that were the most important for the graph construction, we increase the interpretability of the graph. We use the UK Biobank, which provides a large variety of neuroimaging and non-imaging phenotypes, to evaluate our method on brain age regression and classification. The proposed method outperforms competing static graph approaches and other state-of-the-art adaptive methods. We further show that the assigned attention scores indicate that there are both imaging and non-imaging phenotypes that are informative for brain age estimation and are in agreement with the relevant literature.
Weakly-supervised positional contrastive learning: application to cirrhosis classification
results: Compared with the baseline model, the proposed model improves the classification AUC by 5% on the internal dataset and by 26% on the public LIHC dataset.
Abstract
Large medical imaging datasets can be cheaply and quickly annotated with low-confidence, weak labels (e.g., radiological scores). Access to high-confidence labels, such as histology-based diagnoses, is rare and costly. Pretraining strategies, like contrastive learning (CL) methods, can leverage unlabeled or weakly-annotated datasets. These methods typically require large batch sizes, which poses a difficulty in the case of large 3D images at full resolution, due to limited GPU memory. Nevertheless, volumetric positional information about the spatial context of each 2D slice can be very important for some medical applications. In this work, we propose an efficient weakly-supervised positional (WSP) contrastive learning strategy where we integrate both the spatial context of each 2D slice and a weak label via a generic kernel-based loss function. We illustrate our method on cirrhosis prediction using a large volume of weakly-labeled images, namely radiological low-confidence annotations, and small strongly-labeled (i.e., high-confidence) datasets. The proposed model improves the classification AUC by 5% with respect to a baseline model on our internal dataset, and by 26% on the public LIHC dataset from the Cancer Genome Atlas. The code is available at: https://github.com/Guerbet-AI/wsp-contrastive.
MiVOLO: Multi-input Transformer for Age and Gender Estimation
results: Experiments show that the model achieves state-of-the-art performance on four popular benchmarks while offering real-time processing. In addition, we introduce a new benchmark based on the Open Images Dataset with carefully curated, high-accuracy human annotations. Finally, we compare the model's age recognition performance with human-level accuracy and show that it significantly outperforms humans across most age ranges.
Abstract
Age and gender recognition in the wild is a highly challenging task: apart from the variability of conditions, pose complexities, and varying image quality, there are cases where the face is partially or completely occluded. We present MiVOLO (Multi Input VOLO), a straightforward approach for age and gender estimation using the latest vision transformer. Our method integrates both tasks into a unified dual input/output model, leveraging not only facial information but also person image data. This improves the generalization ability of our model and enables it to deliver satisfactory results even when the face is not visible in the image. To evaluate our proposed model, we conduct experiments on four popular benchmarks and achieve state-of-the-art performance, while demonstrating real-time processing capabilities. Additionally, we introduce a novel benchmark based on images from the Open Images Dataset. The ground truth annotations for this benchmark have been meticulously generated by human annotators, resulting in high accuracy answers due to the smart aggregation of votes. Furthermore, we compare our model's age recognition performance with human-level accuracy and demonstrate that it significantly outperforms humans across a majority of age ranges. Finally, we grant public access to our models, along with the code for validation and inference. In addition, we provide extra annotations for used datasets and introduce our new benchmark.
EchoVest: Real-Time Sound Classification and Depth Perception Expressed through Transcutaneous Electrical Nerve Stimulation
paper_authors: Jesse Choe, Siddhant Sood, Ryan Park
for: The paper aims to develop an assistive device for blind/deaf individuals to enhance their awareness of their environment, with a focus on sound classification and localization.
methods: The paper employs a novel audio pipeline that combines the Audio Spectrogram Transformer (AST) model and Fast Fourier Transforms for noise reduction, as well as Otsu’s Method for background noise sound filtering and Complex Time Difference of Arrival algorithms for direction and depth calculation.
results: The final algorithm achieved state-of-the-art results on numerous checkpoints, including a 95.7% accuracy on the ESC-50 dataset for environmental sound classification.
Abstract
Over 1.5 billion people worldwide live with hearing impairment. Despite various technologies that have been created for individuals with such disabilities, most of these technologies are either extremely expensive or inaccessible for everyday use in low-medium income countries. In order to combat this issue, we have developed a new assistive device, EchoVest, for blind/deaf people to intuitively become more aware of their environment. EchoVest transmits vibrations to the user's body by utilizing transcutaneous electric nerve stimulation (TENS) based on the source of the sounds. EchoVest also provides various features, including sound localization, sound classification, noise reduction, and depth perception. We aimed to outperform CNN-based machine-learning models, the most commonly used machine learning model for classification tasks, in accuracy and computational costs. To do so, we developed and employed a novel audio pipeline that adapts the Audio Spectrogram Transformer (AST) model, an attention-based model, for our sound classification purposes, and Fast Fourier Transforms for noise reduction. The application of Otsu's Method helped us find the optimal thresholds for background noise sound filtering and gave us much greater accuracy. In order to calculate direction and depth accurately, we applied Complex Time Difference of Arrival algorithms and SOTA localization. Our last improvement was to use blind source separation to make our algorithms applicable to multiple microphone inputs. The final algorithm achieved state-of-the-art results on numerous checkpoints, including a 95.7\% accuracy on the ESC-50 dataset for environmental sound classification.
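A minimal sketch of the noise-filtering idea described above (FFT magnitudes thresholded with Otsu's method), assuming NumPy and scikit-image; this is our own illustration rather than the authors' pipeline, and the sample-rate and signal are made up:

```python
import numpy as np
from skimage.filters import threshold_otsu  # Otsu's method

def denoise_fft_otsu(signal: np.ndarray) -> np.ndarray:
    """Suppress background noise by zeroing FFT bins whose magnitude
    falls below an Otsu-derived threshold (illustrative sketch)."""
    spectrum = np.fft.rfft(signal)
    magnitude = np.abs(spectrum)
    # Otsu's method picks a threshold separating "background" from "signal" bins.
    thresh = threshold_otsu(magnitude)
    spectrum[magnitude < thresh] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

# Example: a 440 Hz tone buried in noise, sampled at 16 kHz for one second.
t = np.linspace(0, 1, 16000, endpoint=False)
noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * np.random.randn(t.size)
clean = denoise_fft_otsu(noisy)
```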
DBFed: Debiasing Federated Learning Framework based on Domain-Independent
results: Experimental results show that DBFed exceeds the three comparison methods on most metrics, fully demonstrating its debiasing effect.
Abstract
As digital transformation continues, enterprises are generating, managing, and storing vast amounts of data, while artificial intelligence technology is rapidly advancing. However, it brings challenges in information security and data security. Data security refers to the protection of digital information from unauthorized access, damage, theft, etc. throughout its entire life cycle. With the promulgation and implementation of data security laws and the emphasis on data security and data privacy by organizations and users, Privacy-preserving technology represented by federated learning has a wide range of application scenarios. Federated learning is a distributed machine learning computing framework that allows multiple subjects to train joint models without sharing data to protect data privacy and solve the problem of data islands. However, the data among multiple subjects are independent of each other, and the data differences in quality may cause fairness issues in federated learning modeling, such as data bias among multiple subjects, resulting in biased and discriminatory models. Therefore, we propose DBFed, a debiasing federated learning framework based on domain-independent, which mitigates model bias by explicitly encoding sensitive attributes during client-side training. This paper conducts experiments on three real datasets and uses five evaluation metrics of accuracy and fairness to quantify the effect of the model. Most metrics of DBFed exceed those of the other three comparative methods, fully demonstrating the debiasing effect of DBFed.
AnyTeleop: A General Vision-Based Dexterous Robot Arm-Hand Teleoperation System
results: In both real-world and simulated experiments, AnyTeleop achieves a higher success rate and better imitation learning performance than previous systems designed for specific robot hardware.
Abstract
Vision-based teleoperation offers the possibility to endow robots with human-level intelligence to physically interact with the environment, while only requiring low-cost camera sensors. However, current vision-based teleoperation systems are designed and engineered towards a particular robot model and deploy environment, which scales poorly as the pool of the robot models expands and the variety of the operating environment increases. In this paper, we propose AnyTeleop, a unified and general teleoperation system to support multiple different arms, hands, realities, and camera configurations within a single system. Although being designed to provide great flexibility to the choice of simulators and real hardware, our system can still achieve great performance. For real-world experiments, AnyTeleop can outperform a previous system that was designed for a specific robot hardware with a higher success rate, using the same robot. For teleoperation in simulation, AnyTeleop leads to better imitation learning performance, compared with a previous system that is particularly designed for that simulator. Project page: http://anyteleop.com/.
A Semi-Automated Solution Approach Selection Tool for Any Use Case via Scopus and OpenAI: a Case Study for AI/ML in Oncology
results: The study shows that the tool enables semi-automated evaluation and selection of solution approaches and provides sensitivity analyses and post-analyses for various use cases. In the oncology case study and several other use cases, the tool yields promising results when compared against a manual ground truth.
Abstract
In today's vast literature landscape, a manual review is very time-consuming. To address this challenge, this paper proposes a semi-automated tool for solution method review and selection. It caters to researchers, practitioners, and decision-makers while serving as a benchmark for future work. The tool comprises three modules: (1) paper selection and scoring, using a keyword selection scheme to query Scopus API and compute relevancy; (2) solution method extraction in papers utilizing OpenAI API; (3) sensitivity analysis and post-analyzes. It reveals trends, relevant papers, and methods. AI in the oncology case study and several use cases are presented with promising results, comparing the tool to manual ground truth.
Unraveling the Age Estimation Puzzle: Comparative Analysis of Deep Learning Approaches for Facial Age Estimation
results: We find that these factors often have a greater influence on age estimation results than the choice of the estimation method itself. We also assess the generalization capability of each method through cross-dataset evaluation on publicly available age estimation datasets. The results emphasize the importance of consistent data preprocessing practices and standardized benchmarks for reliable and meaningful comparisons.
Abstract
Comparing different age estimation methods poses a challenge due to the unreliability of published results, stemming from inconsistencies in the benchmarking process. Previous studies have reported continuous performance improvements over the past decade using specialized methods; however, our findings challenge these claims. We argue that, for age estimation tasks outside of the low-data regime, designing specialized methods is unnecessary, and the standard approach of utilizing cross-entropy loss is sufficient. This paper aims to address the benchmark shortcomings by evaluating state-of-the-art age estimation methods in a unified and comparable setting. We systematically analyze the impact of various factors, including facial alignment, facial coverage, image resolution, image representation, model architecture, and the amount of data on age estimation results. Surprisingly, these factors often exert a more significant influence than the choice of the age estimation method itself. We assess the generalization capability of each method by evaluating the cross-dataset performance for publicly available age estimation datasets. The results emphasize the importance of using consistent data preprocessing practices and establishing standardized benchmarks to ensure reliable and meaningful comparisons. The source code is available at https://github.com/paplhjak/Facial-Age-Estimation-Benchmark.
Interpreting and generalizing deep learning in physics-based problems with functional linear models
results: Our model achieves accuracy comparable to deep learning while improving out-of-distribution generalization and providing more transparency and interpretability. We tested it on problems in solid mechanics, fluid mechanics, and transport with good results.
Abstract
Although deep learning has achieved remarkable success in various scientific machine learning applications, its black-box nature poses concerns regarding interpretability and generalization capabilities beyond the training data. Interpretability is crucial and often desired in modeling physical systems. Moreover, acquiring extensive datasets that encompass the entire range of input features is challenging in many physics-based learning tasks, leading to increased errors when encountering out-of-distribution (OOD) data. In this work, motivated by the field of functional data analysis (FDA), we propose generalized functional linear models as an interpretable surrogate for a trained deep learning model. We demonstrate that our model could be trained either based on a trained neural network (post-hoc interpretation) or directly from training data (interpretable operator learning). A library of generalized functional linear models with different kernel functions is considered and sparse regression is used to discover an interpretable surrogate model that could be analytically presented. We present test cases in solid mechanics, fluid mechanics, and transport. Our results demonstrate that our model can achieve comparable accuracy to deep learning and can improve OOD generalization while providing more transparency and interpretability. Our study underscores the significance of interpretability in scientific machine learning and showcases the potential of functional linear models as a tool for interpreting and generalizing deep learning.
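The following is a minimal sketch of the general recipe the abstract describes (kernel-based functional features followed by sparse regression), assuming scikit-learn; the Gaussian basis, grid, and synthetic data are our own placeholders, not the authors' implementation:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Toy functional linear model: y ~ sum_k beta_k * <phi_k, u>, where u is an
# input function sampled on a grid and phi_k are fixed kernel basis functions.
grid = np.linspace(0.0, 1.0, 100)
centers = np.linspace(0.0, 1.0, 20)
width = 0.05
# Gaussian kernel basis evaluated on the grid: shape (n_basis, n_grid)
basis = np.exp(-((grid[None, :] - centers[:, None]) ** 2) / (2 * width**2))

def functional_features(U):
    # U: (n_samples, n_grid) sampled input functions -> (n_samples, n_basis)
    # inner products <phi_k, u> approximated by a Riemann sum
    return U @ basis.T * (grid[1] - grid[0])

# Synthetic data standing in for simulation or neural-network outputs
U = np.random.randn(200, grid.size)
y = functional_features(U) @ np.random.randn(centers.size) + 0.01 * np.random.randn(200)

# Sparse regression selects a small, analytically presentable set of terms.
model = Lasso(alpha=1e-3).fit(functional_features(U), y)
print("non-zero coefficients:", np.count_nonzero(model.coef_))
```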
Automatically detecting activities of daily living from in-home sensors as indicators of routine behaviour in an older population
methods: The study ran an Action Research Cycle (ARC) trial with 23 participants over a 10-week period, each with roughly 20 IoT sensors in their home. During the ARC trial, participants attended two data-informed briefings that presented visualisations of their own in-home activities; the briefings also gathered feedback on the accuracy of the detected activities.
results: The study finds that association rule mining can detect a participant's activities of daily living (ADLs) independently of other participants, and that a single set of rules can then detect each ADL across participants. This reduces the need for new participants to provide training data and makes it easier to add participants to the system.
Abstract
Objective: The NEX project has developed an integrated Internet of Things (IoT) system coupled with data analytics to offer unobtrusive health and wellness monitoring supporting older adults living independently at home. Monitoring currently involves visualising a set of automatically detected activities of daily living (ADLs) for each participant. The detection of ADLs is achieved to allow the incorporation of additional participants whose ADLs are detected without re-training the system. Methods: Following an extensive User Needs and Requirements study involving 426 participants, a pilot trial and a friendly trial of the deployment, an Action Research Cycle (ARC) trial was completed. This involved 23 participants over a 10-week period each with c.20 IoT sensors in their homes. During the ARC trial, participants each took part in two data-informed briefings which presented visualisations of their own in-home activities. The briefings also gathered training data on the accuracy of detected activities. Association rule mining was then used on the combination of data from sensors and participant feedback to improve the automatic detection of ADLs. Results: Association rule mining was used to detect a range of ADLs for each participant independently of others and was then used to detect ADLs across participants using a single set of rules for each ADL. This allows additional participants to be added without the necessity of them providing training data. Conclusions: Additional participants can be added to the NEX system without the necessity to re-train the system for automatic detection of the set of their activities of daily living.
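As a sketch of how association rule mining can map sensor events to ADLs, the snippet below assumes the mlxtend library and an entirely hypothetical one-hot event table (the sensor and activity names are ours, not the NEX system's):

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Hypothetical one-hot table: each row is a time window, each column a sensor
# event or a participant-confirmed activity label gathered in the briefings.
events = pd.DataFrame(
    [
        {"kettle_on": 1, "fridge_open": 1, "ADL_prepare_meal": 1, "tv_on": 0},
        {"kettle_on": 1, "fridge_open": 0, "ADL_prepare_meal": 1, "tv_on": 0},
        {"kettle_on": 0, "fridge_open": 0, "ADL_prepare_meal": 0, "tv_on": 1},
        {"kettle_on": 1, "fridge_open": 1, "ADL_prepare_meal": 1, "tv_on": 0},
    ]
).astype(bool)

itemsets = apriori(events, min_support=0.5, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.8)

# Keep rules that predict an ADL from sensor events,
# e.g. {kettle_on, fridge_open} -> {ADL_prepare_meal}.
adl_rules = rules[rules["consequents"].apply(
    lambda s: any(str(i).startswith("ADL_") for i in s))]
print(adl_rules[["antecedents", "consequents", "support", "confidence"]])
```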
Gradient Surgery for One-shot Unlearning on Generative Model
results: The paper compares the method against existing baselines and provides a theoretical analysis, showing that the influence of the target data can be removed efficiently.
Abstract
Recent regulation on right-to-be-forgotten emerges tons of interest in unlearning pre-trained machine learning models. While approximating a straightforward yet expensive approach of retrain-from-scratch, recent machine unlearning methods unlearn a sample by updating weights to remove its influence on the weight parameters. In this paper, we introduce a simple yet effective approach to remove a data influence on the deep generative model. Inspired by works in multi-task learning, we propose to manipulate gradients to regularize the interplay of influence among samples by projecting gradients onto the normal plane of the gradients to be retained. Our work is agnostic to statistics of the removal samples, outperforming existing baselines while providing theoretical analysis for the first time in unlearning a generative model.
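The core gradient manipulation can be sketched in a few lines; the tensor shapes and the retain/forget split below are illustrative assumptions, not the authors' exact procedure:

```python
import torch

def project_onto_normal_plane(g_forget: torch.Tensor, g_retain: torch.Tensor) -> torch.Tensor:
    """Remove from the unlearning gradient its component along the gradient of
    the samples to be retained, so the update (ideally) does not degrade them."""
    g_f, g_r = g_forget.flatten(), g_retain.flatten()
    coeff = torch.dot(g_f, g_r) / (g_r.norm() ** 2 + 1e-12)
    return (g_f - coeff * g_r).view_as(g_forget)

# Toy usage with flattened parameter gradients
g_forget = torch.randn(1000)
g_retain = torch.randn(1000)
g_update = project_onto_normal_plane(g_forget, g_retain)
print(torch.dot(g_update, g_retain))  # ~0: orthogonal to the retained gradient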
StyleGAN2-based Out-of-Distribution Detection for Medical Imaging
results: The study finds that the method distinguishes liver from non-liver CT very well and is completely unable to reconstruct liver artifacts such as needles and ascites. With an AUROC above 90%, the approach achieves strong OOD detection.
Abstract
One barrier to the clinical deployment of deep learning-based models is the presence of images at runtime that lie far outside the training distribution of a given model. We aim to detect these out-of-distribution (OOD) images with a generative adversarial network (GAN). Our training dataset was comprised of 3,234 liver-containing computed tomography (CT) scans from 456 patients. Our OOD test data consisted of CT images of the brain, head and neck, lung, cervix, and abnormal livers. A StyleGAN2-ADA architecture was employed to model the training distribution. Images were reconstructed using backpropagation. Reconstructions were evaluated using the Wasserstein distance, mean squared error, and the structural similarity index measure. OOD detection was evaluated with the area under the receiver operating characteristic curve (AUROC). Our paradigm distinguished between liver and non-liver CT with greater than 90% AUROC. It was also completely unable to reconstruct liver artifacts, such as needles and ascites.
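The scoring step (rank images by how poorly the in-distribution generative model reconstructs them, then measure AUROC) can be sketched as below, assuming scikit-learn; the reconstruction function and data are placeholders, not the trained StyleGAN2-ADA inversion used in the paper:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def ood_scores(images, reconstruct):
    """Score each image by how poorly a generative model trained on
    in-distribution (liver CT) data can reconstruct it."""
    return np.array([np.mean((img - reconstruct(img)) ** 2) for img in images])

# Placeholder standing in for GAN inversion via backpropagation through
# a trained generator; a real reconstruction would optimize the latent code.
reconstruct = lambda img: img * 0.9

in_dist = [np.random.rand(64, 64) for _ in range(20)]   # stand-in liver CT slices
ood = [np.random.rand(64, 64) * 2 for _ in range(20)]   # stand-in non-liver slices

scores = np.concatenate([ood_scores(in_dist, reconstruct), ood_scores(ood, reconstruct)])
labels = np.concatenate([np.zeros(20), np.ones(20)])     # 1 = out-of-distribution
print("AUROC:", roc_auc_score(labels, scores))
```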
Pathway toward prior knowledge-integrated machine learning in engineering
results: The study balances holist and reductionist perspectives in the engineering domain, serving professionals from different disciplines while leveraging domain knowledge to improve the accuracy and reliability of data-driven processes.
Abstract
Despite the digitalization trend and data volume surge, first-principles models (also known as logic-driven, physics-based, rule-based, or knowledge-based models) and data-driven approaches have existed in parallel, mirroring the ongoing AI debate on symbolism versus connectionism. Research for process development to integrate both sides to transfer and utilize domain knowledge in the data-driven process is rare. This study emphasizes efforts and prevailing trends to integrate multidisciplinary domain professions into machine acknowledgeable, data-driven processes in a two-fold organization: examining information uncertainty sources in knowledge representation and exploring knowledge decomposition with a three-tier knowledge-integrated machine learning paradigm. This approach balances holist and reductionist perspectives in the engineering domain.
DADO – Low-Cost Selection Strategies for Deep Active Design Optimization
results: Improves the efficiency of design optimization and reduces computational cost.
Abstract
In this experience report, we apply deep active learning to the field of design optimization to reduce the number of computationally expensive numerical simulations. We are interested in optimizing the design of structural components, where the shape is described by a set of parameters. If we can predict the performance based on these parameters and consider only the promising candidates for simulation, there is an enormous potential for saving computing power. We present two selection strategies for self-optimization to reduce the computational cost in multi-objective design optimization problems. Our proposed methodology provides an intuitive approach that is easy to apply, offers significant improvements over random sampling, and circumvents the need for uncertainty estimation. We evaluate our strategies on a large dataset from the domain of fluid dynamics and introduce two new evaluation metrics to determine the model's performance. Findings from our evaluation highlights the effectiveness of our selection strategies in accelerating design optimization. We believe that the introduced method is easily transferable to other self-optimization problems.
QBitOpt: Fast and Accurate Bitwidth Reallocation during Training
results: We evaluate QBitOpt on ImageNet and show that it outperforms existing fixed- and mixed-precision methods under average bitwidth constraints commonly found in the literature.
Abstract
Quantizing neural networks is one of the most effective methods for achieving efficient inference on mobile and embedded devices. In particular, mixed precision quantized (MPQ) networks, whose layers can be quantized to different bitwidths, achieve better task performance for the same resource constraint compared to networks with homogeneous bitwidths. However, finding the optimal bitwidth allocation is a challenging problem as the search space grows exponentially with the number of layers in the network. In this paper, we propose QBitOpt, a novel algorithm for updating bitwidths during quantization-aware training (QAT). We formulate the bitwidth allocation problem as a constraint optimization problem. By combining fast-to-compute sensitivities with efficient solvers during QAT, QBitOpt can produce mixed-precision networks with high task performance guaranteed to satisfy strict resource constraints. This contrasts with existing mixed-precision methods that learn bitwidths using gradients and cannot provide such guarantees. We evaluate QBitOpt on ImageNet and confirm that we outperform existing fixed and mixed-precision methods under average bitwidth constraints commonly found in the literature.
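QBitOpt formulates bitwidth allocation as a constrained optimization over fast-to-compute per-layer sensitivities. The toy allocator below is our own greedy stand-in to make that idea concrete; the paper's actual solver and sensitivity definition differ:

```python
def allocate_bitwidths(sensitivities, avg_budget, choices=(8, 6, 4, 2)):
    """Greedy toy allocator: repeatedly lower the bitwidth of the least
    sensitive layer until the average bitwidth meets the budget.
    This is only a stand-in for the constrained solver described in the paper."""
    bits = [choices[0]] * len(sensitivities)
    while sum(bits) / len(bits) > avg_budget:
        # layers that can still be reduced
        idx = [i for i, b in enumerate(bits) if b > choices[-1]]
        if not idx:
            break
        # reduce the layer whose output is least sensitive to quantization noise
        i = min(idx, key=lambda j: sensitivities[j])
        bits[i] = choices[choices.index(bits[i]) + 1]
    return bits

# Four layers with made-up sensitivities, targeting an average of 5 bits.
print(allocate_bitwidths([0.9, 0.1, 0.3, 0.05], avg_budget=5))  # e.g. [8, 2, 8, 2]
```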
results: Experiments demonstrate the effectiveness of Self-Expanding Neural Networks, which perform well on both classification and regression problems. The authors further show that these networks perform well even when the appropriate network size and depth are substantially uncertain a priori.
Abstract
The results of training a neural network are heavily dependent on the architecture chosen; and even a modification of only the size of the network, however small, typically involves restarting the training process. In contrast to this, we begin training with a small architecture, only increase its capacity as necessary for the problem, and avoid interfering with previous optimization while doing so. We thereby introduce a natural gradient based approach which intuitively expands both the width and depth of a neural network when this is likely to substantially reduce the hypothetical converged training loss. We prove an upper bound on the "rate" at which neurons are added, and a computationally cheap lower bound on the expansion score. We illustrate the benefits of such Self-Expanding Neural Networks in both classification and regression problems, including those where the appropriate architecture size is substantially uncertain a priori.
Cluster-Induced Mask Transformers for Effective Opportunistic Gastric Cancer Screening on Non-contrast CT Scans
results: On a hold-out test set of 100 cancer patients and 148 normal subjects, our method achieves a sensitivity of 85.0% and a specificity of 92.6%, outperforming two radiologists whose average sensitivity and specificity are 73.5% and 84.3%. We further achieve a specificity of 97.7% on an external test set, indicating the method's potential as a novel, non-invasive, low-cost, and accurate approach to opportunistic gastric cancer screening.
Abstract
Gastric cancer is the third leading cause of cancer-related mortality worldwide, but no guideline-recommended screening test exists. Existing methods can be invasive, expensive, and lack sensitivity to identify early-stage gastric cancer. In this study, we explore the feasibility of using a deep learning approach on non-contrast CT scans for gastric cancer detection. We propose a novel cluster-induced Mask Transformer that jointly segments the tumor and classifies abnormality in a multi-task manner. Our model incorporates learnable clusters that encode the texture and shape prototypes of gastric cancer, utilizing self- and cross-attention to interact with convolutional features. In our experiments, the proposed method achieves a sensitivity of 85.0% and specificity of 92.6% for detecting gastric tumors on a hold-out test set consisting of 100 patients with cancer and 148 normal. In comparison, two radiologists have an average sensitivity of 73.5% and specificity of 84.3%. We also obtain a specificity of 97.7% on an external test set with 903 normal cases. Our approach performs comparably to established state-of-the-art gastric cancer screening tools like blood testing and endoscopy, while also being more sensitive in detecting early-stage cancer. This demonstrates the potential of our approach as a novel, non-invasive, low-cost, and accurate method for opportunistic gastric cancer screening.
SAGC-A68: a space access graph dataset for the classification of spaces and space elements in apartment buildings
for: The paper is written for researchers and practitioners who are interested in developing Graph Deep Learning (GDL) models for space function and space element classification in the context of building design and analysis.
methods: The paper introduces a new dataset, SAGC-A68, which comprises access graphs automatically generated from 68 digital 3D models of space layouts of apartment buildings. The authors use this dataset to train and evaluate a graph attention network (GAT) that predicts 22 space function and 6 space element classes.
results: The authors demonstrate the potential of the dataset and the GAT model by achieving high accuracy rates on the test set. They also show that the GAT model outperforms other baseline models, indicating the effectiveness of using GDL methods for space function and space element classification.
Abstract
The analysis of building models for usable area, building safety, and energy use requires accurate classification data of spaces and space elements. To reduce input model preparation effort and errors, automated classification of spaces and space elements is desirable. A barrier hindering the utilization of Graph Deep Learning (GDL) methods to space function and space element classification is a lack of suitable datasets. To bridge this gap, we introduce a dataset, SAGC-A68, which comprises access graphs automatically generated from 68 digital 3D models of space layouts of apartment buildings. This graph-based dataset is well-suited for developing GDL models for space function and space element classification. To demonstrate the potential of the dataset, we employ it to train and evaluate a graph attention network (GAT) that predicts 22 space function and 6 space element classes. The dataset and code used in the experiment are available online. https://doi.org/10.5281/zenodo.7805872, https://github.com/A2Amir/SAGC-A68.
Improving Heterogeneous Graph Learning with Weighted Mixed-Curvature Product Manifold
results: Extensive experiments show that WEIGHTED-PM learns better graph representations with lower geometric distortion from the input data and performs better on multiple downstream tasks such as word similarity learning, top-$k$ recommendation, and knowledge graph embedding.
Abstract
In graph representation learning, it is important that the complex geometric structure of the input graph, e.g. hidden relations among nodes, is well captured in embedding space. However, standard Euclidean embedding spaces have a limited capacity in representing graphs of varying structures. A promising candidate for the faithful embedding of data with varying structure is product manifolds of component spaces of different geometries (spherical, hyperbolic, or euclidean). In this paper, we take a closer look at the structure of product manifold embedding spaces and argue that each component space in a product contributes differently to expressing structures in the input graph, hence should be weighted accordingly. This is different from previous works which consider the roles of different components equally. We then propose WEIGHTED-PM, a data-driven method for learning embedding of heterogeneous graphs in weighted product manifolds. Our method utilizes the topological information of the input graph to automatically determine the weight of each component in product spaces. Extensive experiments on synthetic and real-world graph datasets demonstrate that WEIGHTED-PM is capable of learning better graph representations with lower geometric distortion from input data, and performs better on multiple downstream tasks, such as word similarity learning, top-$k$ recommendation, and knowledge graph embedding.
An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization
results: The results show that the proposed algorithm attains the optimal convergence rate with respect to the dimension and the accuracy parameters, both in expectation and with high probability. The analysis further shows that, in the nonconvex stochastic zero-order setting, nonsmooth optimization is as easy as smooth optimization.
Abstract
We study the complexity of producing $(\delta,\epsilon)$-stationary points of Lipschitz objectives which are possibly neither smooth nor convex, using only noisy function evaluations. Recent works proposed several stochastic zero-order algorithms that solve this task, all of which suffer from a dimension-dependence of $\Omega(d^{3/2})$ where $d$ is the dimension of the problem, which was conjectured to be optimal. We refute this conjecture by providing a faster algorithm that has complexity $O(d\delta^{-1}\epsilon^{-3})$, which is optimal (up to numerical constants) with respect to $d$ and also optimal with respect to the accuracy parameters $\delta,\epsilon$, thus solving an open question due to Lin et al. (NeurIPS'22). Moreover, the convergence rate achieved by our algorithm is also optimal for smooth objectives, proving that in the nonconvex stochastic zero-order setting, nonsmooth optimization is as easy as smooth optimization. We provide algorithms that achieve the aforementioned convergence rate in expectation as well as with high probability. Our analysis is based on a simple yet powerful geometric lemma regarding the Goldstein-subdifferential set, which allows utilizing recent advancements in first-order nonsmooth nonconvex optimization.
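A standard building block in this setting is the randomized two-point gradient estimator, which uses only (possibly noisy) function evaluations. The sketch below is our own illustration of that oracle access, not the paper's algorithm:

```python
import numpy as np

def zero_order_gradient(f, x, delta=1e-2, num_samples=16, noise=0.0, rng=None):
    """Randomized two-point estimate of a (generalized) gradient of f at x,
    built only from noisy function evaluations at perturbed points."""
    rng = rng or np.random.default_rng(0)
    d = x.size
    g = np.zeros(d)
    for _ in range(num_samples):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)                       # random unit direction
        fp = f(x + delta * u) + noise * rng.standard_normal()
        fm = f(x - delta * u) + noise * rng.standard_normal()
        g += (fp - fm) / (2 * delta) * d * u          # directional difference
    return g / num_samples

# Example on a nonsmooth, possibly nonconvex objective
f = lambda x: np.abs(x).sum() + 0.1 * np.linalg.norm(x) ** 2
print(zero_order_gradient(f, np.ones(5)))
```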
Geometric Constraints in Probabilistic Manifolds: A Bridge from Molecular Dynamics to Structured Diffusion Processes
results: The method makes it possible to maintain specific molecular profile interactions in deep learning-based drug design, which is needed to realize the desired therapeutic outcomes and to guarantee safety.
Abstract
Understanding the macroscopic characteristics of biological complexes demands precision and specificity in statistical ensemble modeling. One of the primary challenges in this domain lies in sampling from particular subsets of the state-space, driven either by existing structural knowledge or specific areas of interest within the state-space. We propose a method that enables sampling from distributions that rigorously adhere to arbitrary sets of geometric constraints in Euclidean spaces. This is achieved by integrating a constraint projection operator within the well-regarded architecture of Denoising Diffusion Probabilistic Models, a framework founded in generative modeling and probabilistic inference. The significance of this work becomes apparent, for instance, in the context of deep learning-based drug design, where it is imperative to maintain specific molecular profile interactions to realize the desired therapeutic outcomes and guarantee safety.
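One simple way to read the abstract's idea is a reverse-diffusion loop with a constraint projection interleaved at each step. The sketch below is schematic and our own: `denoise_step` stands in for a trained DDPM update, and the two-atom bond-length constraint is an invented example:

```python
import numpy as np

def constrained_reverse_diffusion(x, denoise_step, project, num_steps=100):
    """Interleave a Euclidean projection onto the constraint set with each
    reverse-diffusion step, so samples respect the geometric constraints."""
    for t in reversed(range(num_steps)):
        x = denoise_step(x, t)
        x = project(x)
    return x

# Toy constraint: two 3-D "atoms" must keep a bond length of exactly 1.0.
def project(x):
    a, b = x[:3], x[3:]
    mid = (a + b) / 2.0
    d = (b - a) / (np.linalg.norm(b - a) + 1e-12)
    return np.concatenate([mid - 0.5 * d, mid + 0.5 * d])

denoise_step = lambda x, t: 0.99 * x + 0.05 * np.random.randn(x.size)  # placeholder
x = constrained_reverse_diffusion(np.random.randn(6), denoise_step, project)
print(np.linalg.norm(x[:3] - x[3:]))  # 1.0: the constraint holds at the end
```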
Invertible Low-Dimensional Modelling of X-ray Absorption Spectra for Potential Applications in Spectral X-ray Imaging
methods: The paper proposes a new model that combines a deep neural network autoencoder with an optimal linear model based on the Singular Value Decomposition (SVD).
results: The authors compare the new method with other linear and non-linear approaches, including a sparse model and an alternative deep learning model. The results show that the new method performs better when modelling X-ray absorption spectra that contain K-edges in the energy range of interest.
Abstract
X-ray interaction with matter is an energy-dependent process that is contingent on the atomic structure of the constituent material elements. The most advanced models to capture this relationship currently rely on Monte Carlo (MC) simulations. Whilst these very accurate models, in many problems in spectral X-ray imaging, such as data compression, noise removal, spectral estimation, and the quantitative measurement of material compositions, these models are of limited use, as these applications typically require the efficient inversion of the model, that is, they require the estimation of the best model parameters for a given spectral measurement. Current models that can be easily inverted however typically only work when modelling spectra in regions away from their K-edges, so they have limited utility when modelling a wider range of materials. In this paper, we thus propose a novel, non-linear model that combines a deep neural network autoencoder with an optimal linear model based on the Singular Value Decomposition (SVD). We compare our new method to other alternative linear and non-linear approaches, a sparse model and an alternative deep learning model. We demonstrate the advantages of our method over traditional models, especially when modelling X-ray absorption spectra that contain K-edges in the energy range of interest.
Badgers: generating data quality deficits with Python
results: Using the badgers library, different types of data quality deficits can be generated in order to experimentally assess the data quality of data-driven applications. The documentation is available at https://fraunhofer-iese.github.io/badgers/ and the source code at https://github.com/Fraunhofer-IESE/badgers.
Abstract
Generating context specific data quality deficits is necessary to experimentally assess data quality of data-driven (artificial intelligence (AI) or machine learning (ML)) applications. In this paper we present badgers, an extensible open-source Python library to generate data quality deficits (outliers, imbalanced data, drift, etc.) for different modalities (tabular data, time-series, text, etc.). The documentation is accessible at https://fraunhofer-iese.github.io/badgers/ and the source code at https://github.com/Fraunhofer-IESE/badgers
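badgers packages generators for exactly these kinds of deficits; since its API is not shown here, the snippet below hand-rolls three of them (outliers, class imbalance, drift) in plain NumPy purely to illustrate what is being generated, and does not reproduce the library's interface:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))            # clean tabular data

# Outliers: replace a small fraction of rows with extreme values.
X_outliers = X.copy()
idx = rng.choice(len(X), size=20, replace=False)
X_outliers[idx] *= 10

# Imbalance: subsample one class so the labels become skewed.
y = rng.integers(0, 2, size=len(X))
keep = (y == 0) | (rng.random(len(y)) < 0.1)   # keep only ~10% of class 1
X_imbalanced, y_imbalanced = X[keep], y[keep]

# Drift: add a slowly increasing shift along the sample index (time).
drift = np.linspace(0.0, 2.0, len(X))[:, None]
X_drift = X + drift
```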
Multi-modal Graph Learning over UMLS Knowledge Graphs
for: Predicting the progression of patient illnesses across multiple hospital visits
methods: Using graph neural networks to learn meaningful representations of medical concepts, and combining them to represent entire patient visits.
results: Outperforming existing architectures in representing multiple modalities of medical concepts, and demonstrating the significance of incorporating prior medical knowledge.
Abstract
Clinicians are increasingly looking towards machine learning to gain insights about patient evolutions. We propose a novel approach named Multi-Modal UMLS Graph Learning (MMUGL) for learning meaningful representations of medical concepts using graph neural networks over knowledge graphs based on the unified medical language system. These representations are aggregated to represent entire patient visits and then fed into a sequence model to perform predictions at the granularity of multiple hospital visits of a patient. We improve performance by incorporating prior medical knowledge and considering multiple modalities. We compare our method to existing architectures proposed to learn representations at different granularities on the MIMIC-III dataset and show that our approach outperforms these methods. The results demonstrate the significance of multi-modal medical concept representations based on prior medical knowledge.
Invex Programs: First Order Algorithms and Their Convergence
paper_authors: Adarsh Barik, Suvrit Sra, Jean Honorio
for: Solving invex programs, a special class of non-convex problems in which every stationary point is a global minimum.
methods: Proposes new first-order algorithms for general invex problems, with sufficient conditions for convergence and rates of convergence.
results: Provides a novel projected gradient method for constrained invex programs with convergence rate guarantees.
Abstract
Invex programs are a special kind of non-convex problems which attain global minima at every stationary point. While classical first-order gradient descent methods can solve them, they converge very slowly. In this paper, we propose new first-order algorithms to solve the general class of invex problems. We identify sufficient conditions for convergence of our algorithms and provide rates of convergence. Furthermore, we go beyond unconstrained problems and provide a novel projected gradient method for constrained invex programs with convergence rate guarantees. We compare and contrast our results with existing first-order algorithms for a variety of unconstrained and constrained invex problems. To the best of our knowledge, our proposed algorithm is the first algorithm to solve constrained invex programs.
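For readers unfamiliar with the projected gradient template the paper builds on, a generic sketch is below; the toy instance is convex (our own choice for simplicity), whereas the paper's analysis covers invex objectives with their own convergence conditions:

```python
import numpy as np

def projected_gradient_descent(grad, project, x0, step=0.1, iters=500):
    """Generic projected gradient method: a gradient step followed by a
    projection onto the feasible set at every iteration."""
    x = x0.astype(float)
    for _ in range(iters):
        x = project(x - step * grad(x))
    return x

# Toy instance: minimize ||x - c||^2 subject to x in the unit ball.
c = np.array([2.0, 1.0])
grad = lambda x: 2 * (x - c)
project = lambda x: x / max(1.0, np.linalg.norm(x))
print(projected_gradient_descent(grad, project, np.zeros(2)))  # ~c / ||c||
```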
Graph Convolutional Networks for Simulating Multi-phase Flow and Transport in Porous Media
for: numerical simulation of multi-phase fluid dynamics in porous media
methods: data-driven surrogate modeling using Graph Convolutional Networks (GCNs)
results: high accuracy in predicting pressure and saturation states, and generalization to irregular domain geometries and unstructured meshes.
Abstract
Numerical simulation of multi-phase fluid dynamics in porous media is critical for many subsurface applications. Data-driven surrogate modeling provides computationally inexpensive alternatives to high-fidelity numerical simulators. While the commonly used convolutional neural networks (CNNs) are powerful in approximating partial differential equation solutions, it remains challenging for CNNs to handle irregular and unstructured simulation meshes. However, subsurface simulation models often involve unstructured meshes with complex mesh geometries, which limits the application of CNNs. To address this challenge, here we construct surrogate models based on Graph Convolutional Networks (GCNs) to approximate the spatial-temporal solutions of multi-phase flow and transport processes. We propose a new GCN architecture suited to the hyperbolic character of the coupled PDE system, to better capture the saturation dynamics. Results of 2D heterogeneous test cases show that our surrogates predict the evolutions of the pressure and saturation states with high accuracy, and the predicted rollouts remain stable for multiple timesteps. Moreover, the GCN-based models generalize well to irregular domain geometries and unstructured meshes that are unseen in the training dataset.
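A single graph-convolution layer of the kind such surrogates stack can be written in a few lines; the mesh adjacency, features, and weights below are toy stand-ins, and the paper's architecture additionally adapts the GCN to the hyperbolic character of the coupled PDEs:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolution: symmetric normalization of the adjacency
    (with self-loops), feature mixing, then a ReLU nonlinearity."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Hypothetical unstructured mesh: 4 cells, edges where cells share a face.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
H = np.random.rand(4, 2)         # [pressure, saturation] per cell
W = np.random.randn(2, 8)
print(gcn_layer(A, H, W).shape)  # (4, 8) hidden features per cell
```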
Learning Behavioral Representations of Routines From Large-scale Unlabeled Wearable Time-series Data Streams using Hawkes Point Process
results: Using the method, the researchers extract routine behaviors from continuous recordings of more than 100 individuals and uncover daily transitional relationships between physical activity states. The learned behavioral patterns can also help illuminate an individual's personality and affect.
Abstract
Continuously-worn wearable sensors enable researchers to collect copious amounts of rich bio-behavioral time series recordings of real-life activities of daily living, offering unprecedented opportunities to infer novel human behavior patterns during daily routines. Existing approaches to routine discovery through bio-behavioral data rely either on pre-defined notions of activities or use additional non-behavioral measurements as contexts, such as GPS location or localization within the home, presenting risks to user privacy. In this work, we propose a novel wearable time-series mining framework, Hawkes point process On Time series clusters for ROutine Discovery (HOT-ROD), for uncovering behavioral routines from completely unlabeled wearable recordings. We utilize a covariance-based method to generate time-series clusters and discover routines via the Hawkes point process learning algorithm. We empirically validate our approach for extracting routine behaviors using a completely unlabeled time-series collected continuously from over 100 individuals both in and outside of the workplace during a period of ten weeks. Furthermore, we demonstrate this approach intuitively captures daily transitional relationships between physical activity states without using prior knowledge. We also show that the learned behavioral patterns can assist in illuminating an individual's personality and affect.
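The Hawkes process at the core of HOT-ROD models how past events raise the rate of future ones. A minimal univariate conditional intensity with an exponential kernel is sketched below; the parameters and event times are made up, and in the paper's setting events would correspond to transitions between time-series clusters:

```python
import numpy as np

def hawkes_intensity(t, event_times, mu=0.2, alpha=0.8, beta=1.5):
    """Conditional intensity of a univariate Hawkes process with an exponential
    kernel: lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i))."""
    past = np.asarray([ti for ti in event_times if ti < t])
    return mu + np.sum(alpha * np.exp(-beta * (t - past)))

events = [1.0, 1.3, 4.2]             # hypothetical cluster-transition times (hours)
print(hawkes_intensity(2.0, events))  # intensity elevated by the two recent events
```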
results: Experimental results show that DCA-NAS adapts to edge-device hardware constraints better than comparable manually designed deep learning architectures and performs on par with popular mobile architectures across various image recognition datasets. In addition, DCA-NAS supports hardware-specific architecture search on Hardware-NAS-Bench, discovering architectures with low inference latency and state-of-the-art performance.
Abstract
Edge computing aims to enable edge devices, such as IoT devices, to process data locally instead of relying on the cloud. However, deep learning techniques like computer vision and natural language processing can be computationally expensive and memory-intensive. Creating manual architectures specialized for each device is infeasible due to their varying memory and computational constraints. To address these concerns, we automate the construction of task-specific deep learning architectures optimized for device constraints through Neural Architecture Search (NAS). We present DCA-NAS, a principled method for fast neural network architecture search that incorporates edge-device constraints such as model size and floating-point operations, and uses weight sharing and channel bottleneck techniques to speed up the search. Our experiments show that DCA-NAS outperforms manual architectures of similar size and is comparable to popular mobile architectures on various image classification datasets such as CIFAR-10, CIFAR-100, and ImageNet-1k. Experiments with the DARTS and NAS-Bench-201 search spaces show the generalization capabilities of DCA-NAS. Further evaluation on Hardware-NAS-Bench discovered device-specific architectures with low inference latency and state-of-the-art performance.
results: Experiments show that the proposed approach finds cognitive diagnosis models that outperform existing models while retaining interpretability comparable to human-designed models.
Abstract
Cognitive diagnosis plays a vital role in modern intelligent education platforms to reveal students' proficiency in knowledge concepts for subsequent adaptive tasks. However, due to the requirement of high model interpretability, existing manually designed cognitive diagnosis models hold too simple architectures to meet the demand of current intelligent education systems, where the bias of human design also limits the emergence of effective cognitive diagnosis models. In this paper, we propose to automatically design novel cognitive diagnosis models by evolutionary multi-objective neural architecture search (NAS). Specifically, we observe existing models can be represented by a general model handling three given types of inputs and thus first design an expressive search space for the NAS task in cognitive diagnosis. Then, we propose multi-objective genetic programming (MOGP) to explore the NAS task's search space by maximizing model performance and interpretability. In the MOGP design, each architecture is transformed into a tree architecture and encoded by a tree for easy optimization, and a tailored genetic operation based on four sub-genetic operations is devised to generate offspring effectively. Besides, an initialization strategy is also suggested to accelerate the convergence by evolving half of the population from existing models' variants. Experiments on two real-world datasets demonstrate that the cognitive diagnosis models searched by the proposed approach exhibit significantly better performance than existing models and also hold as good interpretability as human-designed models.
Observation of high-energy neutrinos from the Galactic plane
paper_authors: R. Abbasi, M. Ackermann, J. Adams, J. A. Aguilar, M. Ahlers, M. Ahrens, J. M. Alameddine, A. A. Alves Jr., N. M. Amin, K. Andeen, T. Anderson, G. Anton, C. Argüelles, Y. Ashida, S. Athanasiadou, S. Axani, X. Bai, A. Balagopal V., S. W. Barwick, V. Basu, S. Baur, R. Bay, J. J. Beatty, K. -H. Becker, J. Becker Tjus, J. Beise, C. Bellenghi, S. Benda, S. BenZvi, D. Berley, E. Bernardini, D. Z. Besson, G. Binder, D. Bindig, E. Blaufuss, S. Blot, M. Boddenberg, F. Bontempo, J. Y. Book, J. Borowka, S. Böser, O. Botner, J. Böttcher, E. Bourbeau, F. Bradascio, J. Braun, B. Brinson, S. Bron, J. Brostean-Kaiser, R. T. Burley, R. S. Busse, M. A. Campana, E. G. Carnie-Bronca, C. Chen, Z. Chen, D. Chirkin, K. Choi, B. A. Clark, K. Clark, L. Classen, A. Coleman, G. H. Collin, A. Connolly, J. M. Conrad, P. Coppin, P. Correa, D. F. Cowen, R. Cross, C. Dappen, P. Dave, C. De Clercq, J. J. DeLaunay, D. Delgado López, H. Dembinski, K. Deoskar, A. Desai, P. Desiati, K. D. de Vries, G. de Wasseige, T. DeYoung, A. Diaz, J. C. Díaz-Vélez, M. Dittmer, H. Dujmovic, M. Dunkman, M. A. DuVernois, T. Ehrhardt, P. Eller, R. Engel, H. Erpenbeck, J. Evans, P. A. Evenson, K. L. Fan, A. R. Fazely, A. Fedynitch, N. Feigl, S. Fiedlschuster, A. T. Fienberg, C. Finley, L. Fischer, D. Fox, A. Franckowiak, E. Friedman, A. Fritz, P. Fürst, T. K. Gaisser, J. Gallagher, E. Ganster, A. Garcia, S. Garrappa, L. Gerhardt, A. Ghadimi, C. Glaser, T. Glauch, T. Glüsenkamp, N. Goehlke, A. Goldschmidt, J. G. Gonzalez, S. Goswami, D. Grant, T. Grégoire, S. Griswold, C. Günther, P. Gutjahr, C. Haack, A. Hallgren, R. Halliday, L. Halve, F. Halzen, M. Ha Minh, K. Hanson, J. Hardin, A. A. Harnisch, A. Haungs, K. Helbing, F. Henningsen, E. C. Hettinger, S. Hickford, J. Hignight, C. Hill, G. C. Hill, K. D. Hoffman, K. Hoshina, W. Hou, F. Huang, M. Huber, T. Huber, K. Hultqvist, M. Hünnefeld, R. Hussain, K. Hymon, S. In, N. Iovine, A. Ishihara, M. Jansson, G. S. Japaridze, M. Jeong, M. Jin, B. J. P. Jones, D. Kang, W. Kang, X. Kang, A. Kappes, D. Kappesser, L. Kardum, T. Karg, M. Karl, A. Karle, U. Katz, M. Kauer, M. Kellermann, J. L. Kelley, A. Kheirandish, K. Kin, J. Kiryluk, S. R. Klein, A. Kochocki, R. Koirala, H. Kolanoski, T. Kontrimas, L. Köpke, C. Kopper, S. Kopper, D. J. Koskinen, P. Koundal, M. Kovacevich, M. Kowalski, T. Kozynets, E. Krupczak, E. Kun, N. Kurahashi, N. Lad, C. Lagunas Gualda, J. L. Lanfranchi, M. J. Larson, F. Lauber, J. P. Lazar, J. W. Lee, K. Leonard, A. Leszczyńska, Y. Li, M. Lincetto, Q. R. Liu, M. Liubarska, E. Lohfink, C. J. Lozano Mariscal, L. Lu, F. Lucarelli, A. Ludwig, W. Luszczak, Y. Lyu, W. Y. Ma, J. Madsen, K. B. M. Mahn, Y. Makino, S. Mancina, I. C. Mariş, I. Martinez-Soler, R. Maruyama, S. McCarthy, T. McElroy, F. McNally, J. V. Mead, K. Meagher, S. Mechbal, A. Medina, M. Meier, S. Meighen-Berger, Y. Merckx, J. Micallef, D. Mockler, T. Montaruli, R. W. Moore, K. Morik, R. Morse, M. Moulai, T. Mukherjee, R. Naab, R. Nagai, R. Nahnhauer, U. Naumann, J. Necker, L. V. Nguyen, H. Niederhausen, M. U. Nisa, S. C. Nowicki, D. Nygren, A. Obertacke Pollmann, M. Oehler, B. Oeyen, A. Olivas, E. O’Sullivan, H. Pandya, D. V. Pankova, N. Park, G. K. Parker, E. N. Paudel, L. Paul, C. Pérez de los Heros, L. Peters, J. Peterson, S. Philippen, S. Pieper, A. Pizzuto, M. Plum, Y. Popovych, A. Porcelli, M. Prado Rodriguez, B. Pries, G. T. Przybylski, C. Raab, J. Rack-Helleis, A. Raissi, M. Rameez, K. Rawlins, I. C. Rea, Z. Rechav, A. Rehman, P. Reichherzer, R. Reimann, G. Renzi, E. Resconi, S. Reusch, W. Rhode, M. 
Richman, B. Riedel, E. J. Roberts, S. Robertson, G. Roellinghoff, M. Rongen, C. Rott, T. Ruhe, D. Ryckbosch, D. Rysewyk Cantu, I. Safa, J. Saffer, D. Salazar-Gallegos, P. Sampathkumar, S. E. Sanchez Herrera, A. Sandrock, M. Santander, S. Sarkar, S. Sarkar, K. Satalecka, M. Schaufel, H. Schieler, S. Schindler, T. Schmidt, A. Schneider, J. Schneider, F. G. Schröder, L. Schumacher, G. Schwefer, S. Sclafani, D. Seckel, S. Seunarine, A. Sharma, S. Shefali, N. Shimizu, M. Silva, B. Skrzypek, B. Smithers, R. Snihur, J. Soedingrekso, A. Sogaard, D. Soldin, C. Spannfellner, G. M. Spiczak, C. Spiering, M. Stamatikos, T. Stanev, R. Stein, J. Stettner, T. Stezelberger, B. Stokstad, T. Stürwald, T. Stuttard, G. W. Sullivan, I. Taboada, S. Ter-Antonyan, J. Thwaites, S. Tilav, F. Tischbein, K. Tollefson, C. Tönnis, S. Toscano, D. Tosi, A. Trettin, M. Tselengidou, C. F. Tung, A. Turcati, R. Turcotte, C. F. Turley, J. P. Twagirayezu, B. Ty, M. A. Unland Elorrieta, N. Valtonen-Mattila, J. Vandenbroucke, N. van Eijndhoven, D. Vannerom, J. van Santen, J. Veitch-Michaelis, S. Verpoest, C. Walck, W. Wang, T. B. Watson, C. Weaver, P. Weigel, A. Weindl, M. J. Weiss, J. Weldert, C. Wendt, J. Werthebach, M. Weyrauch, N. Whitehorn, C. H. Wiebusch, N. Willey, D. R. Williams, M. Wolf, G. Wrede, J. Wulff, X. W. Xu, J. P. Yanez, E. Yildizci, S. Yoshida, S. Yu, T. Yuan, Z. Zhang, P. Zhelnin
results: The study identifies neutrino emission from the Galactic plane at a significance of 4.5$\sigma$ relative to the background, although the emission could also originate from a population of unresolved point sources.
Abstract
The origin of high-energy cosmic rays, atomic nuclei that continuously impact Earth's atmosphere, has been a mystery for over a century. Due to deflection in interstellar magnetic fields, cosmic rays from the Milky Way arrive at Earth from random directions. However, near their sources and during propagation, cosmic rays interact with matter and produce high-energy neutrinos. We search for neutrino emission using machine learning techniques applied to ten years of data from the IceCube Neutrino Observatory. We identify neutrino emission from the Galactic plane at the 4.5$\sigma$ level of significance, by comparing diffuse emission models to a background-only hypothesis. The signal is consistent with modeled diffuse emission from the Galactic plane, but could also arise from a population of unresolved point sources.
Handling Group Fairness in Federated Learning Using Augmented Lagrangian Approach
results: Experiments on the CelebA and ImSitu datasets show that the proposed method improves group fairness both quantitatively and qualitatively, with minimal loss in accuracy, under statistical heterogeneity and with different numbers of clients.
Abstract
Federated learning (FL) has garnered considerable attention due to its privacy-preserving feature. Nonetheless, the lack of freedom in managing user data can lead to group fairness issues, where models might be biased towards sensitive factors such as race or gender, even if they are trained using a legally compliant process. To redress this concern, this paper proposes a novel FL algorithm designed explicitly to address group fairness issues. We show empirically on CelebA and ImSitu datasets that the proposed method can improve fairness both quantitatively and qualitatively with minimal loss in accuracy in the presence of statistical heterogeneity and with different numbers of clients. Besides improving fairness, the proposed FL algorithm is compatible with local differential privacy (LDP), has negligible communication costs, and results in minimal overhead when migrating existing FL systems from the common FL protocol such as FederatedAveraging (FedAvg). We also provide the theoretical convergence rate guarantee for the proposed algorithm and the required noise level of the Gaussian mechanism to achieve desired LDP. This innovative approach holds significant potential to enhance the fairness and effectiveness of FL systems, particularly in sensitive applications such as healthcare or criminal justice.
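As a sketch of the augmented Lagrangian idea named in the title, the following shows a generic primal-dual step for a fairness-constrained training objective, assuming an inequality constraint g(theta) <= 0 on a group-fairness gap; the functions are caller-supplied placeholders and the toy problem below is not the paper's actual objective or FL protocol.

```python
def augmented_lagrangian_step(theta, lam, rho, grad_loss, gap, grad_gap, lr):
    """One primal-dual step for  min_theta L(theta)  s.t.  g(theta) <= 0.

    Primal: theta <- theta - lr * (grad L + max(0, lam + rho*g) * grad g)
    Dual:   lam   <- max(0, lam + rho * g)
    """
    g = gap(theta)
    mult = max(0.0, lam + rho * g)
    theta = theta - lr * (grad_loss(theta) + mult * grad_gap(theta))
    lam = max(0.0, lam + rho * gap(theta))
    return theta, lam

# Toy stand-in for "training loss" and "fairness gap":
# minimize (theta - 2)^2 subject to theta - 1 <= 0  (optimum at theta = 1).
grad_loss = lambda th: 2.0 * (th - 2.0)
gap = lambda th: th - 1.0
grad_gap = lambda th: 1.0

theta, lam = 0.0, 0.0
for _ in range(500):
    theta, lam = augmented_lagrangian_step(theta, lam, rho=1.0,
                                           grad_loss=grad_loss, gap=gap,
                                           grad_gap=grad_gap, lr=0.05)
print(round(theta, 3))  # approaches 1.0, where the constraint becomes active
```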
Episodic Gaussian Process-Based Learning Control with Vanishing Tracking Errors
methods: The paper uses Gaussian process regression, whose high data efficiency and explicit uncertainty representation allow the derivation of prediction error bounds.
results: The paper derives a Bayesian prediction error bound that depends on a kernel-based measure of data density and proves time-varying tracking accuracy guarantees; with this bound, vanishing tracking error is achieved as the data density increases.
Abstract
Due to the increasing complexity of technical systems, accurate first principle models can often not be obtained. Supervised machine learning can mitigate this issue by inferring models from measurement data. Gaussian process regression is particularly well suited for this purpose due to its high data-efficiency and its explicit uncertainty representation, which allows the derivation of prediction error bounds. These error bounds have been exploited to show tracking accuracy guarantees for a variety of control approaches, but their direct dependency on the training data is generally unclear. We address this issue by deriving a Bayesian prediction error bound for GP regression, which we show to decay with the growth of a novel, kernel-based measure of data density. Based on the prediction error bound, we prove time-varying tracking accuracy guarantees for learned GP models used as feedback compensation of unknown nonlinearities, and show to achieve vanishing tracking error with increasing data density. This enables us to develop an episodic approach for learning Gaussian process models, such that an arbitrary tracking accuracy can be guaranteed. The effectiveness of the derived theory is demonstrated in several simulations.
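To make the structure of such guarantees concrete, here is the standard GP posterior together with the generic form that Bayesian prediction error bounds of this type usually take; the paper's exact constants and density measure are not reproduced here.

```latex
% Standard GP posterior at a test input x, given data (X, y), kernel k,
% Gram matrix K = k(X, X), and noise variance \sigma_n^2:
\mu(x) = k(x, X)\bigl(K + \sigma_n^2 I\bigr)^{-1} y,
\qquad
\sigma^2(x) = k(x, x) - k(x, X)\bigl(K + \sigma_n^2 I\bigr)^{-1} k(X, x).

% Bayesian prediction error bounds of this type usually take the form
% (with probability at least 1 - \delta, for a suitable scaling \beta(\delta)):
\lvert f(x) - \mu(x) \rvert \le \beta(\delta)\,\sigma(x),

% so the bound tightens wherever \sigma(x) is small, i.e. wherever the
% kernel-based data density around x is high.
```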
Comparison of Point Cloud and Image-based Models for Calorimeter Fast Simulation
results: The study finds that representing calorimeter showers as point clouds handles sparse datasets more naturally than 3D voxels and allows more compact models and data files.
Abstract
Score-based generative models are a new class of generative models that have been shown to accurately generate high-dimensional calorimeter datasets. Recent advances in generative models have used images with 3D voxels to represent and model complex calorimeter showers. Point clouds, however, are likely a more natural representation of calorimeter showers, particularly in calorimeters with high granularity. Point clouds preserve all of the information of the original simulation, deal more naturally with sparse datasets, and can be implemented with more compact models and data files. In this work, two state-of-the-art score-based models are trained on the same set of calorimeter simulations and directly compared.
CT-based Subchondral Bone Microstructural Analysis in Knee Osteoarthritis via MR-Guided Distillation Learning
For: The paper aims to develop a novel method for subchondral bone microstructural analysis using easily-acquired CT images, which can help diagnose knee osteoarthritis.
Methods: The proposed method, named SRRD, uses a distillation-learning-based approach to transfer MR structural information to a CT-based model, and leverages paired MR images to enhance the CT-based analysis model during training.
Results: The proposed method achieved high reliability and validity in MR-CT registration, regression, and knee osteoarthritis classification, with an AUC score of 0.767 (95% CI, 0.681-0.853) compared to 0.658 (95% CI, 0.574-0.742) using the CNN approach.
Abstract
Background: MR-based subchondral bone effectively predicts knee osteoarthritis. However, its clinical application is limited by the cost and time of MR. Purpose: We aim to develop a novel distillation-learning-based method named SRRD for subchondral bone microstructural analysis using easily-acquired CT images, which leverages paired MR images to enhance the CT-based analysis model during training. Materials and Methods: Knee joint images of both CT and MR modalities were collected from October 2020 to May 2021. Firstly, we developed a GAN-based generative model to transform MR images into CT images, which was used to establish the anatomical correspondence between the two modalities. Next, we obtained numerous patches of subchondral bone regions of MR images, together with their trabecular parameters (BV / TV, Tb. Th, Tb. Sp, Tb. N) from the corresponding CT image patches via regression. The distillation-learning technique was used to train the regression model and transfer MR structural information to the CT-based model. The regressed trabecular parameters were further used for knee osteoarthritis classification. Results: A total of 80 participants were evaluated. CT-based regression results of trabecular parameters achieved intra-class correlation coefficients (ICCs) of 0.804, 0.773, 0.711, and 0.622 for BV / TV, Tb. Th, Tb. Sp, and Tb. N, respectively. The use of distillation learning significantly improved the performance of the CT-based knee osteoarthritis classification method using the CNN approach, yielding an AUC score of 0.767 (95% CI, 0.681-0.853) instead of 0.658 (95% CI, 0.574-0.742) (p<.001). Conclusions: The proposed SRRD method showed high reliability and validity in MR-CT registration, regression, and knee osteoarthritis classification, indicating the feasibility of subchondral bone microstructural analysis based on CT images.
Learning to Identify Graphs from Node Trajectories in Multi-Robot Networks
paper_authors: Eduardo Sebastian, Thai Duong, Nikolay Atanasov, Eduardo Montijano, Carlos Sagues
for: Identifying the interactions among nodes in a network given their state/feature trajectories.
methods: Combines a strongly convex program with a self-attention encoder to learn the graph topology and appropriate regularizers for optimization.
results: Can identify the graph topology of unseen networks with new configurations in terms of number of nodes, connectivity, or state trajectories, and demonstrates effectiveness in multi-robot formation and flocking tasks.
Abstract
The graph identification problem consists of discovering the interactions among nodes in a network given their state/feature trajectories. This problem is challenging because the behavior of a node is coupled to all the other nodes by the unknown interaction model. Besides, high-dimensional and nonlinear state trajectories make difficult to identify if two nodes are connected. Current solutions rely on prior knowledge of the graph topology and the dynamic behavior of the nodes, and hence, have poor generalization to other network configurations. To address these issues, we propose a novel learning-based approach that combines (i) a strongly convex program that efficiently uncovers graph topologies with global convergence guarantees and (ii) a self-attention encoder that learns to embed the original state trajectories into a feature space and predicts appropriate regularizers for the optimization program. In contrast to other works, our approach can identify the graph topology of unseen networks with new configurations in terms of number of nodes, connectivity or state trajectories. We demonstrate the effectiveness of our approach in identifying graphs in multi-robot formation and flocking tasks.
Recent Advancements in End-to-End Autonomous Driving using Deep Learning: A Survey
methods: The survey covers deep learning approaches that handle the autonomous driving stack end-to-end, from perception to control, and addresses key challenges encountered in real-world applications, such as explainability and safety.
results: The survey categorizes and assesses recent developments in end-to-end autonomous driving and provides a GitHub repository containing the latest open-source implementations.
Abstract
End-to-End driving is a promising paradigm as it circumvents the drawbacks associated with modular systems, such as their overwhelming complexity and propensity for error propagation. Autonomous driving transcends conventional traffic patterns by proactively recognizing critical events in advance, ensuring passengers' safety and providing them with comfortable transportation, particularly in highly stochastic and variable traffic settings. This paper presents a comprehensive review of the End-to-End autonomous driving stack. It provides a taxonomy of automated driving tasks wherein neural networks have been employed in an End-to-End manner, encompassing the entire driving process from perception to control, while addressing key challenges encountered in real-world applications. Recent developments in End-to-End autonomous driving are analyzed, and research is categorized based on underlying principles, methodologies, and core functionality. These categories encompass sensorial input, main and auxiliary output, learning approaches ranging from imitation to reinforcement learning, and model evaluation techniques. The survey incorporates a detailed discussion of the explainability and safety aspects. Furthermore, it assesses the state-of-the-art, identifies challenges, and explores future possibilities. We maintained the latest advancements and their corresponding open-source implementations at https://github.com/Pranav-chib/Recent-Advancements-in-End-to-End-Autonomous-Driving-using-Deep-Learning.
ECS – an Interactive Tool for Data Quality Assurance
results: The approach detects data points with properties that are potentially harmful for safety-critical systems, thereby improving the reliability and safety of machine learning systems.
Abstract
With the increasing capabilities of machine learning systems and their potential use in safety-critical systems, ensuring high-quality data is becoming increasingly important. In this paper we present a novel approach for the assurance of data quality. For this purpose, the mathematical basics are first discussed and the approach is presented using multiple examples. This results in the detection of data points with potentially harmful properties for the use in safety-critical systems.
One-Shot Pruning for Fast-adapting Pre-trained Models on Devices
results: Experimental analysis shows that, compared with various pruning baseline methods across different datasets and tasks, the proposed method achieves higher accuracy and efficiency.
Abstract
Large-scale pre-trained models have been remarkably successful in resolving downstream tasks. Nonetheless, deploying these models on low-capability devices still requires an effective approach, such as model pruning. However, pruning the model from scratch can pose a practical challenge given the limited resources of each downstream task or device. To tackle this issue, we present a scalable one-shot pruning method that leverages pruned knowledge of similar tasks to extract a sub-network from the pre-trained model for a new task. Specifically, we create a score mask using the pruned models of similar tasks to identify task-specific filters/nodes in the pre-trained model for the new task. Based on this mask, we conduct a single round of pruning to extract a suitably-sized sub-network that can quickly adapt to the new task with only a few training iterations. Our experimental analysis demonstrates the effectiveness of the proposed method on the convolutional neural networks (CNNs) and vision transformers (ViT) with various datasets. The proposed method consistently outperforms popular pruning baseline methods in terms of accuracy and efficiency when dealing with diverse downstream tasks with different memory constraints.
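Below is a minimal sketch of the score-mask idea, assuming per-layer binary masks are available from models already pruned on similar tasks; the aggregation rule and keep ratio are illustrative rather than the paper's exact procedure.

```python
import numpy as np

def one_shot_filter_mask(similar_task_masks, keep_ratio=0.5):
    """Build a one-shot pruning mask for a new task.

    similar_task_masks: list of binary vectors (one per similar task),
        where entry j is 1 if filter j survived pruning on that task.
    keep_ratio: fraction of filters to keep for the new task.
    Returns a binary mask selecting the filters most often kept by similar tasks.
    """
    scores = np.mean(np.stack(similar_task_masks, axis=0), axis=0)  # per-filter score
    n_keep = max(1, int(keep_ratio * scores.size))
    keep_idx = np.argsort(-scores)[:n_keep]
    mask = np.zeros_like(scores)
    mask[keep_idx] = 1.0
    return mask

# Three similar tasks, 8 filters in some layer of the pre-trained model.
masks = [np.array([1, 1, 0, 0, 1, 0, 1, 0]),
         np.array([1, 0, 0, 1, 1, 0, 1, 0]),
         np.array([1, 1, 0, 0, 1, 0, 0, 1])]
print(one_shot_filter_mask(masks, keep_ratio=0.5))
```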
False Sense of Security: Leveraging XAI to Analyze the Reasoning and True Performance of Context-less DGA Classifiers
results: After eliminating the identified biases, the performance of DGA classifiers deteriorates considerably; however, the proposed context-aware detection system maintains the detection rate of state-of-the-art deep learning classifiers, and a visual analysis system helps to better understand a classifier's reasoning, increasing trust in and transparency of the detection methods.
Abstract
The problem of revealing botnet activity through Domain Generation Algorithm (DGA) detection seems to be solved, considering that available deep learning classifiers achieve accuracies of over 99.9%. However, these classifiers provide a false sense of security as they are heavily biased and allow for trivial detection bypass. In this work, we leverage explainable artificial intelligence (XAI) methods to analyze the reasoning of deep learning classifiers and to systematically reveal such biases. We show that eliminating these biases from DGA classifiers considerably deteriorates their performance. Nevertheless we are able to design a context-aware detection system that is free of the identified biases and maintains the detection rate of state-of-the art deep learning classifiers. In this context, we propose a visual analysis system that helps to better understand a classifier's reasoning, thereby increasing trust in and transparency of detection methods and facilitating decision-making.
Formulating A Strategic Plan Based On Statistical Analyses And Applications For Financial Companies Through A Real-World Use Case
results: The study finds that the loan amount strongly affects the number of borrowers charging off their loans, and the proposed strategic plan can help LendingClub increase revenue while reducing lending risk.
Abstract
Business statistics play a crucial role in implementing a data-driven strategic plan at the enterprise level to employ various analytics where the outcomes of such a plan enable an enterprise to enhance the decision-making process or to mitigate risks to the organization. In this work, a strategic plan informed by the statistical analysis is introduced for a financial company called LendingClub, where the plan is comprised of exploring the possibility of onboarding a big data platform along with advanced feature selection capacities. The main objectives of such a plan are to increase the company's revenue while reducing the risks of granting loans to borrowers who cannot return their loans. In this study, different hypotheses formulated to address the company's concerns are studied, where the results reveal that the amount of loans profoundly impacts the number of borrowers charging off their loans. Also, the proposed strategic plan includes onboarding advanced analytics such as machine learning technologies that allow the company to build better generalized data-driven predictive models.
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data
results: The study shows that the algorithm provides provable quality guarantees as a function of the local coverage of the offline dataset and the amount of additional data collected.
Abstract
In some applications of reinforcement learning, a dataset of pre-collected experience is already available but it is also possible to acquire some additional online data to help improve the quality of the policy. However, it may be preferable to gather additional data with a single, non-reactive exploration policy and avoid the engineering costs associated with switching policies. In this paper we propose an algorithm with provable guarantees that can leverage an offline dataset to design a single non-reactive policy for exploration. We theoretically analyze the algorithm and measure the quality of the final policy as a function of the local coverage of the original dataset and the amount of additional data collected.
For: This paper is written for researchers and practitioners who work with high-dimensional data and are interested in evaluating conditional independence in a nonparametric manner.
Methods: The paper uses recently developed nonlinear sufficient dimension reduction techniques to introduce a sufficient graphical model for evaluating conditional independence. The model is nonparametric and does not make distributional assumptions, but it is based on conditional independence given a set of sufficient predictors with a reduced dimension.
Results: The paper demonstrates that the proposed method outperforms existing methods when the Gaussian or copula Gaussian assumptions are violated, and its performance remains excellent in high-dimensional settings. The method is also shown to be consistent in variable selection.
Abstract
We introduce a sufficient graphical model by applying the recently developed nonlinear sufficient dimension reduction techniques to the evaluation of conditional independence. The graphical model is nonparametric in nature, as it does not make distributional assumptions such as the Gaussian or copula Gaussian assumptions. However, unlike a fully nonparametric graphical model, which relies on the high-dimensional kernel to characterize conditional independence, our graphical model is based on conditional independence given a set of sufficient predictors with a substantially reduced dimension. In this way we avoid the curse of dimensionality that comes with a high-dimensional kernel. We develop the population-level properties, convergence rate, and variable selection consistency of our estimate. By simulation comparisons and an analysis of the DREAM 4 Challenge data set, we demonstrate that our method outperforms the existing methods when the Gaussian or copula Gaussian assumptions are violated, and its performance remains excellent in the high-dimensional setting.
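Schematically, and with illustrative notation only, the defining conditional independence can be written as follows, where the full conditioning set of a fully nonparametric graphical model is replaced by a low-dimensional sufficient predictor.

```latex
% Illustrative notation only: for vertices u and v of the graph (V, E),
(u, v) \notin E
\;\Longleftrightarrow\;
X_u \,\perp\!\!\!\perp\, X_v \,\big|\, f_{uv}\!\left(X_{-(u,v)}\right),
% where f_{uv} is a low-dimensional sufficient predictor obtained by nonlinear
% sufficient dimension reduction, replacing the full conditioning set
% X_{-(u,v)} used by a fully nonparametric graphical model.
```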
MD-HIT: Machine learning for materials property prediction with dataset redundancy control
paper_authors: Qin Li, Nihang Fu, Sadman Sadeed Omee, Jianjun Hu
For: The study aims to address the problem of redundant (highly similar) samples in materials datasets, so that materials property prediction performance can be evaluated more accurately.
Methods: The paper proposes a materials dataset redundancy reduction algorithm (MD-HIT) and evaluates it.
Results: The study shows that reducing sample redundancy with MD-HIT makes reported prediction performance better reflect true predictive capability.
Abstract
Materials datasets are usually characterized by the existence of many redundant (highly similar) materials due to the tinkering material design practice over the history of materials research. For example, the Materials Project database contains many perovskite cubic structure materials similar to SrTiO$_3$. This sample redundancy within the dataset causes random-split evaluation of machine learning models to fail, so that ML models tend to achieve over-estimated predictive performance, which is misleading for the materials science community. This issue is well known in the field of bioinformatics for protein function prediction, in which a redundancy reduction procedure (CD-Hit) is routinely applied to reduce the sample redundancy by ensuring that no pair of samples has a sequence similarity greater than a given threshold. This paper surveys the overestimated ML performance in the literature for both composition-based and structure-based material property prediction. We then propose a material dataset redundancy reduction algorithm called MD-HIT and evaluate it with several composition- and structure-based distance thresholds for reducing dataset sample redundancy. We show that with this control, the predicted performance tends to better reflect the true prediction capability. Our MD-HIT code can be freely accessed at https://github.com/usccolumbia/MD-HIT
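A minimal CD-HIT-style greedy sketch of this kind of redundancy control is shown below, assuming a generic descriptor-space distance; the threshold and distance function are placeholders rather than MD-HIT's actual choices.

```python
import numpy as np

def reduce_redundancy(features, threshold, distance=None):
    """Greedy CD-HIT-style redundancy reduction.

    features: (n_samples, d) array of material descriptors
              (e.g. composition or structure fingerprints).
    threshold: a new sample is kept only if its distance to every
               already-kept sample exceeds this value.
    Returns indices of the retained, non-redundant samples.
    """
    if distance is None:
        distance = lambda a, b: np.linalg.norm(a - b)
    kept = []
    for i, x in enumerate(features):
        if all(distance(x, features[j]) > threshold for j in kept):
            kept.append(i)
    return kept

X = np.array([[0.0, 0.0], [0.05, 0.0], [1.0, 1.0], [1.02, 0.98], [3.0, 0.0]])
print(reduce_redundancy(X, threshold=0.2))  # [0, 2, 4]: near-duplicates dropped
```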
RLTF: Reinforcement Learning from Unit Test Feedback
results: Achieves state-of-the-art performance on the APPS and MBPP benchmarks.
Abstract
The goal of program synthesis, or code generation, is to generate executable code based on given descriptions. Recently, there has been an increasing number of studies employing reinforcement learning (RL) to improve the performance of large language models (LLMs) for code. However, these RL methods have only used offline frameworks, limiting their exploration of new sample spaces. Additionally, current approaches that utilize unit test signals are rather simple, not accounting for specific error locations within the code. To address these issues, we proposed RLTF, i.e., Reinforcement Learning from Unit Test Feedback, a novel online RL framework with unit test feedback of multi-granularity for refining code LLMs. Our approach generates data in real-time during training and simultaneously utilizes fine-grained feedback signals to guide the model towards producing higher-quality code. Extensive experiments show that RLTF achieves state-of-the-art performance on the APPS and the MBPP benchmarks. Our code can be found at: https://github.com/Zyq-scut/RLTF.
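As a toy illustration of turning unit-test feedback into a scalar reward for a generated program (the paper's multi-granularity feedback, which also uses error locations, is richer than this), consider the following sketch; the reward scheme and test harness are illustrative.

```python
def unit_test_reward(program_src, tests):
    """Map unit-test outcomes for a generated program to a scalar reward.

    tests: callables that return True/False given the program source.
    Reward: +1 if all tests pass, otherwise the pass fraction minus a
    penalty if the program crashes (illustrative scheme only).
    """
    passed, crashed = 0, False
    for test in tests:
        try:
            if test(program_src):
                passed += 1
        except Exception:
            crashed = True
    if passed == len(tests):
        return 1.0
    return passed / max(len(tests), 1) - (0.3 if crashed else 0.0)

# Toy example: the "program" is a string we exec to define add(a, b).
prog = "def add(a, b):\n    return a + b"

def make_test(a, b, expected):
    def run(src):
        ns = {}
        exec(src, ns)          # run the generated code in a scratch namespace
        return ns["add"](a, b) == expected
    return run

tests = [make_test(1, 2, 3), make_test(-1, 1, 0)]
print(unit_test_reward(prog, tests))  # 1.0
```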
Injecting Logical Constraints into Neural Networks via Straight-Through Estimators
results: By leveraging GPUs and batch training, the method scales significantly better than existing neuro-symbolic methods, and it applies to different types of neural networks, allowing them to learn with no or fewer labeled data.
Abstract
Injecting discrete logical constraints into neural network learning is one of the main challenges in neuro-symbolic AI. We find that a straight-through-estimator, a method introduced to train binary neural networks, could effectively be applied to incorporate logical constraints into neural network learning. More specifically, we design a systematic way to represent discrete logical constraints as a loss function; minimizing this loss using gradient descent via a straight-through-estimator updates the neural network's weights in the direction that the binarized outputs satisfy the logical constraints. The experimental results show that by leveraging GPUs and batch training, this method scales significantly better than existing neuro-symbolic methods that require heavy symbolic computation for computing gradients. Also, we demonstrate that our method applies to different types of neural networks, such as MLP, CNN, and GNN, making them learn with no or fewer labeled data by learning directly from known constraints.
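A minimal PyTorch-style sketch of the straight-through estimator, together with one simple logical constraint ("at least one output is true") expressed as a loss over the binarized outputs; the constraint and the toy network outputs are illustrative, not the paper's encoding.

```python
import torch

def binarize_ste(p):
    """Forward: hard 0/1 values. Backward: gradients pass straight through to p."""
    hard = (p > 0.5).float()
    return p + (hard - p).detach()

def at_least_one_true_loss(bits):
    """Loss encoding the constraint  b_1 v b_2 v ... v b_n :
    zero when at least one binarized output is 1, positive otherwise."""
    return torch.relu(1.0 - bits.sum(dim=-1)).mean()

logits = torch.zeros(4, 3, requires_grad=True)   # toy "network outputs"
probs = torch.sigmoid(logits)
bits = binarize_ste(probs)                       # all 0 at initialization
loss = at_least_one_true_loss(bits)
loss.backward()                                  # gradients reach `logits` via the STE
print(loss.item(), logits.grad.abs().sum().item() > 0)
```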
Continual Learning as Computationally Constrained Reinforcement Learning
paper_authors: Saurabh Kumar, Henrik Marklund, Ashish Rao, Yifan Zhu, Hong Jun Jeon, Yueyang Liu, Benjamin Van Roy
for: The work is aimed at agents that efficiently accumulate knowledge and develop increasingly sophisticated skills over a long lifetime, advancing the frontier of artificial intelligence capabilities.
methods: The monograph introduces a set of concepts and tools for studying the continual learning problem faced by such agents.
results: The work clarifies and formalizes concepts of continual learning and provides a framework to stimulate further research in the field.
Abstract
An agent that efficiently accumulates knowledge to develop increasingly sophisticated skills over a long lifetime could advance the frontier of artificial intelligence capabilities. The design of such agents, which remains a long-standing challenge of artificial intelligence, is addressed by the subject of continual learning. This monograph clarifies and formalizes concepts of continual learning, introducing a framework and set of tools to stimulate further research.
Privacy-Preserving Graph Machine Learning from Data to Computation: A Survey
results: The paper reviews existing privacy-preserving techniques and software tools, discusses current challenges and future research opportunities, and finally envisions a unified and comprehensive secure graph machine learning system.
Abstract
In graph machine learning, data collection, sharing, and analysis often involve multiple parties, each of which may require varying levels of data security and privacy. To this end, preserving privacy is of great importance in protecting sensitive information. In the era of big data, the relationships among data entities have become unprecedentedly complex, and more applications utilize advanced data structures (i.e., graphs) that can support network structures and relevant attribute information. To date, many graph-based AI models have been proposed (e.g., graph neural networks) for various domain tasks, like computer vision and natural language processing. In this paper, we focus on reviewing privacy-preserving techniques of graph machine learning. We systematically review related works from the data to the computational aspects. We first review methods for generating privacy-preserving graph data. Then we describe methods for transmitting privacy-preserved information (e.g., graph model parameters) to realize the optimization-based computation when data sharing among multiple parties is risky or impossible. In addition to discussing relevant theoretical methodology and software tools, we also discuss current challenges and highlight several possible future research opportunities for privacy-preserving graph machine learning. Finally, we envision a unified and comprehensive secure graph machine learning system.
Source-Aware Embedding Training on Heterogeneous Information Networks
results: Experimental results on real-world datasets show that SUMSHINE performs favorably against state-of-the-art heterogeneous information network embedding methods while offering better scalability and flexibility.
Abstract
Heterogeneous information networks (HINs) have been extensively applied to real-world tasks, such as recommendation systems, social networks, and citation networks. While existing HIN representation learning methods can effectively learn the semantic and structural features in the network, little awareness was given to the distribution discrepancy of subgraphs within a single HIN. However, we find that ignoring such distribution discrepancy among subgraphs from multiple sources would hinder the effectiveness of graph embedding learning algorithms. This motivates us to propose SUMSHINE (Scalable Unsupervised Multi-Source Heterogeneous Information Network Embedding) -- a scalable unsupervised framework to align the embedding distributions among multiple sources of an HIN. Experimental results on real-world datasets in a variety of downstream tasks validate the performance of our method over the state-of-the-art heterogeneous information network embedding algorithms.
Enhancing Adversarial Robustness via Score-Based Optimization
results: Experimental results show that ScoreOpt outperforms existing adversarial defenses on multiple datasets, including CIFAR10, CIFAR100, and ImageNet, in terms of both robustness performance and inference speed.
Abstract
Adversarial attacks have the potential to mislead deep neural network classifiers by introducing slight perturbations. Developing algorithms that can mitigate the effects of these attacks is crucial for ensuring the safe use of artificial intelligence. Recent studies have suggested that score-based diffusion models are effective in adversarial defenses. However, existing diffusion-based defenses rely on the sequential simulation of the reversed stochastic differential equations of diffusion models, which are computationally inefficient and yield suboptimal results. In this paper, we introduce a novel adversarial defense scheme named ScoreOpt, which optimizes adversarial samples at test-time, towards original clean data in the direction guided by score-based priors. We conduct comprehensive experiments on multiple datasets, including CIFAR10, CIFAR100 and ImageNet. Our experimental results demonstrate that our approach outperforms existing adversarial defenses in terms of both robustness performance and inference speed.
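A schematic sketch of test-time purification guided by a score model: the (possibly adversarial) input is iteratively moved along an estimate of the clean-data score. The score model below is a stand-in placeholder, and plain ascent is used instead of ScoreOpt's actual objective.

```python
import torch

@torch.no_grad()
def purify(x_adv, score_model, step_size=0.05, n_steps=20):
    """Move an input toward high-density regions of the clean-data distribution.

    score_model(x) is assumed to approximate the score grad_x log p_clean(x);
    plain gradient ascent on log-density is an illustrative stand-in here.
    """
    x = x_adv.clone()
    for _ in range(n_steps):
        x = x + step_size * score_model(x)
    return x.clamp(0.0, 1.0)

# Placeholder score model: pulls pixels toward 0.5 (stand-in for a trained network).
score_model = lambda x: 0.5 - x
x_adv = torch.rand(1, 3, 8, 8)
x_clean = purify(x_adv, score_model)
print((x_clean - 0.5).abs().mean() < (x_adv - 0.5).abs().mean())  # True
```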
Leveraging Multiple Descriptive Features for Robust Few-shot Image Learning
results: Compared to standard methods such as linear probing, the method performs better in the few-shot learning setting, and when combined with fine-tuning, it also outperforms existing state-of-the-art finetuning approaches on both in-distribution and out-of-distribution performance.
Abstract
Modern image classification is based upon directly predicting model classes via large discriminative networks, making it difficult to assess the intuitive visual ``features'' that may constitute a classification decision. At the same time, recent works in joint visual language models such as CLIP provide ways to specify natural language descriptions of image classes but typically focus on providing single descriptions for each class. In this work, we demonstrate that an alternative approach, arguably more akin to our understanding of multiple ``visual features'' per class, can also provide compelling performance in the robust few-shot learning setting. In particular, we automatically enumerate multiple visual descriptions of each class -- via a large language model (LLM) -- then use a vision-image model to translate these descriptions to a set of multiple visual features of each image; we finally use sparse logistic regression to select a relevant subset of these features to classify each image. This both provides an ``intuitive'' set of relevant features for each class, and in the few-shot learning setting, outperforms standard approaches such as linear probing. When combined with finetuning, we also show that the method is able to outperform existing state-of-the-art finetuning approaches on both in-distribution and out-of-distribution performance.
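A sketch of the final selection stage, assuming the per-image similarity scores against the LLM-generated descriptions have already been computed by a vision-language model and are supplied as a feature matrix; the data below are synthetic placeholders and the feature extraction itself is not shown.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Assume `sims` holds, for each image, its similarity to every generated
# description (e.g. image-text scores); shape (n_images, n_descriptions).
rng = np.random.default_rng(0)
n_images, n_descriptions = 40, 30
sims = rng.normal(size=(n_images, n_descriptions))
labels = (sims[:, 0] + sims[:, 3] > 0).astype(int)   # toy labels tied to 2 features

# Sparse logistic regression picks out a small, interpretable subset of
# descriptive features per class (the L1 penalty drives most weights to zero).
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
clf.fit(sims, labels)
selected = np.flatnonzero(np.abs(clf.coef_[0]) > 1e-6)
print("selected description indices:", selected)
```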
Data-driven Nonlinear Parametric Model Order Reduction Framework using Deep Hierarchical Variational Autoencoder
results: Building on LSH-VAE, a parametric model order reduction framework based on spherically linear interpolation of the latent manifold is proposed. The framework is validated and evaluated on three nonlinear, multiphysics dynamic systems and compared with conventional nonlinear model order reduction methods; the results show that LSH-VAE offers significant advantages over conventional methods in both accuracy and speed.
Abstract
A data-driven parametric model order reduction (MOR) method using a deep artificial neural network is proposed. The present network, which is the least-squares hierarchical variational autoencoder (LSH-VAE), is capable of performing nonlinear MOR for the parametric interpolation of a nonlinear dynamic system with a significant number of degrees of freedom. LSH-VAE exploits two major changes to the existing networks: a hierarchical deep structure and a hybrid weighted, probabilistic loss function. The enhancements result in a significantly improved accuracy and stability compared against the conventional nonlinear MOR methods, autoencoder, and variational autoencoder. Upon LSH-VAE, a parametric MOR framework is presented based on the spherically linear interpolation of the latent manifold. The present framework is validated and evaluated on three nonlinear and multiphysics dynamic systems. First, the present framework is evaluated on the fluid-structure interaction benchmark problem to assess its efficiency and accuracy. Then, a highly nonlinear aeroelastic phenomenon, limit cycle oscillation, is analyzed. Finally, the present framework is applied to a three-dimensional fluid flow to demonstrate its capability of efficiently analyzing a significantly large number of degrees of freedom. The performance of LSH-VAE is emphasized by comparing its results against that of the widely used nonlinear MOR methods, convolutional autoencoder, and $\beta$-VAE. The present framework exhibits a significantly enhanced accuracy to the conventional methods while still exhibiting a large speed-up factor.
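For reference, a standard implementation of the spherical linear interpolation (slerp) used to traverse the latent manifold between sampled parameter values; the latent dimension here is arbitrary.

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical linear interpolation between latent vectors z0 and z1, t in [0, 1]."""
    z0n, z1n = z0 / np.linalg.norm(z0), z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(z0n, z1n), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return (1.0 - t) * z0 + t * z1           # vectors are (nearly) colinear
    return (np.sin((1.0 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

# Interpolating between the latent codes of two sampled parameter values:
z_a, z_b = np.random.randn(16), np.random.randn(16)
print(slerp(z_a, z_b, 0.5).shape)  # (16,)
```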
CT-BERT: Learning Better Tabular Representations Through Cross-Table Pre-training
paper_authors: Chao Ye, Guoshan Lu, Haobo Wang, Liyao Li, Sai Wu, Gang Chen, Junbo Zhao
for: The paper aims to pre-train on large-scale tabular data in order to learn generalizable representations for tabular data.
methods: The paper proposes a new framework, CT-BERT, that enables cross-table pre-training. CT-BERT is compatible with both supervised and self-supervised schemes, and a contrastive-learning-based masked table modeling (MTM) objective is proposed for it.
results: CT-BERT achieves state-of-the-art performance on 15 datasets in both supervised and self-supervised settings, outperforming prior methods.
Abstract
Tabular data -- also known as structured data -- is one of the most common data forms in existence, thanks to the stable development and scaled deployment of database systems in the last few decades. At present, however, despite the surge of large pre-trained models in other domains such as ChatGPT or SAM, how to extract common knowledge across tables at a scale that may eventually lead to generalizable representations for tabular data remains largely unexplored. There have been a few works around this topic, but most (if not all) of them are limited to the scope of a single table or a fixed schema. In this work, we first identify the crucial research challenges behind tabular data pre-training, particularly for the cross-table scenario. Our contribution is two-fold: (i) we collect and curate nearly 2k high-quality tabular datasets, each of which is guaranteed to possess clear semantics, clean labels, and other necessary meta information; (ii) we propose CT-BERT, a novel framework that enables cross-table pre-training. Notably, as a pioneering effort in scaled cross-table training, CT-BERT is fully compatible with both supervised and self-supervised schemes, where the specific instantiation of CT-BERT depends on the downstream tasks. We further propose and implement a contrastive-learning-based masked table modeling (MTM) objective in CT-BERT, inspired by the computer vision and natural language processing communities but carefully tailored to tables. Extensive empirical results on 15 datasets demonstrate CT-BERT's state-of-the-art performance, where both its supervised and self-supervised setups significantly outperform prior approaches.
Automatic Piano Transcription with Hierarchical Frequency-Time Transformer
for: automatic piano transcription, especially for determining the precise onset and offset of each note in polyphonic piano content.
methods: hFT-Transformer, a two-level hierarchical frequency-time Transformer architecture that captures long-term dependencies in the frequency and time axes using self-attention mechanism.
results: state-of-the-art performance on the F1-scores of the Frame, Note, Note with Offset, and Note with Offset and Velocity metrics, as demonstrated on the widely used MAPS and MAESTRO v3.0.0 datasets.
Abstract
Taking long-term spectral and temporal dependencies into account is essential for automatic piano transcription. This is especially helpful when determining the precise onset and offset for each note in the polyphonic piano content. In this case, we may rely on the capability of self-attention mechanism in Transformers to capture these long-term dependencies in the frequency and time axes. In this work, we propose hFT-Transformer, which is an automatic music transcription method that uses a two-level hierarchical frequency-time Transformer architecture. The first hierarchy includes a convolutional block in the time axis, a Transformer encoder in the frequency axis, and a Transformer decoder that converts the dimension in the frequency axis. The output is then fed into the second hierarchy which consists of another Transformer encoder in the time axis. We evaluated our method with the widely used MAPS and MAESTRO v3.0.0 datasets, and it demonstrated state-of-the-art performance on all the F1-scores of the metrics among Frame, Note, Note with Offset, and Note with Offset and Velocity estimations.
Summary
Taking long-term spectral and temporal dependencies into account is essential for automatic piano transcription, and is especially helpful when determining the precise onset and offset of each note in polyphonic piano content. In this case, we can rely on the self-attention mechanism of Transformers to capture these long-term dependencies along the frequency and time axes. In this work, we propose hFT-Transformer, an automatic music transcription method that uses a two-level hierarchical frequency-time Transformer architecture. The first hierarchy includes a convolutional block along the time axis, a Transformer encoder along the frequency axis, and a Transformer decoder that converts the dimension along the frequency axis. The output is then fed into the second hierarchy, which consists of another Transformer encoder along the time axis. We evaluated our method on the widely used MAPS and MAESTRO v3.0.0 datasets, and it demonstrated state-of-the-art performance on all F1-scores across the Frame, Note, Note with Offset, and Note with Offset and Velocity metrics.
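The two-level hierarchy described in the abstract can be read as a sequence of axis swaps. Below is a shape-level sketch, assuming a spectrogram input and an 88-pitch output; the layer sizes and module names are guesses for illustration, not the authors' configuration.

```python
# Hedged sketch: conv over time -> encoder over frequency -> decoder that remaps
# the frequency axis to pitch queries -> encoder over time.
import torch
import torch.nn as nn

def enc_layer(dim):
    return nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)

class HFTSketch(nn.Module):
    def __init__(self, n_freq=256, n_pitch=88, dim=64):
        super().__init__()
        # 1st hierarchy: convolution along the time axis (per frequency bin)
        self.time_conv = nn.Conv1d(n_freq, n_freq, kernel_size=3, padding=1)
        self.freq_in = nn.Linear(1, dim)
        # 1st hierarchy: Transformer encoder along the frequency axis
        self.freq_encoder = nn.TransformerEncoder(enc_layer(dim), num_layers=1)
        # 1st hierarchy: decoder that converts the frequency axis to pitch queries
        self.freq_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=1)
        self.pitch_queries = nn.Parameter(torch.randn(n_pitch, dim))
        # 2nd hierarchy: Transformer encoder along the time axis
        self.time_encoder = nn.TransformerEncoder(enc_layer(dim), num_layers=1)
        self.out = nn.Linear(dim, 1)               # per-pitch frame activation logit

    def forward(self, spec):                       # spec: (batch, n_freq, n_time)
        b, f, t = spec.shape
        x = self.time_conv(spec)                   # conv over time, per freq bin
        x = x.permute(0, 2, 1).reshape(b * t, f, 1)      # one freq sequence per frame
        x = self.freq_encoder(self.freq_in(x))           # attend across frequency
        q = self.pitch_queries.unsqueeze(0).expand(b * t, -1, -1)
        x = self.freq_decoder(q, x)                      # freq axis -> pitch axis
        x = x.reshape(b, t, -1, x.shape[-1]).permute(0, 2, 1, 3)   # (b, pitch, t, dim)
        x = self.time_encoder(x.reshape(-1, t, x.shape[-1]))       # attend across time
        return self.out(x).reshape(b, -1, t)       # (batch, n_pitch, n_time) logits

logits = HFTSketch()(torch.randn(2, 256, 16))
print(logits.shape)                                # torch.Size([2, 88, 16])
```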
Edge Storage Management Recipe with Zero-Shot Data Compression for Road Anomaly Detection
results: Comparative experiments show that the method preserves anomaly detection performance while improving storage and transmission efficiency.
Abstract
Recent studies have presented edge computing-based road anomaly detection systems that may also conduct data collection simultaneously. However, edge computers have small data storage, while the collected audio samples need to be stored for a long time in order to update existing models or develop novel methods. Therefore, we should consider an efficient storage management approach that preserves high-fidelity audio. A hardware-perspective approach, such as using a low-resolution microphone, is an intuitive way to reduce file size but is not recommended because it fundamentally cuts off high-frequency components. On the other hand, a computational file compression approach that encodes the collected high-resolution audio into a compact code is recommended because it also provides a corresponding decoding method. Motivated by this, we propose a simple yet effective pre-trained autoencoder-based data compression method. The pre-trained autoencoder is trained for audio super-resolution, so it can be utilized to encode or decode audio at any arbitrary sampling rate. Moreover, it reduces the communication cost of data transmission from the edge to the central server. Via comparative experiments, we confirm that the zero-shot audio compression and decompression highly preserve anomaly detection performance while enhancing storage and transmission efficiency.
Summary
Recent studies have presented edge computing-based road anomaly detection systems that can also collect data at the same time. However, edge computers have small data storage, and the collected audio samples must be kept for a long time in order to update existing models or develop new methods. We therefore need an efficient storage management approach that preserves high-quality audio. A hardware-oriented approach, such as using a low-resolution microphone, is not recommended because it fundamentally cuts off the high-frequency components. Instead, a computational compression approach that encodes the collected high-resolution audio into a compact code is preferable. Motivated by this, we propose a simple yet effective data compression method based on a pre-trained autoencoder. The autoencoder is pre-trained for audio super-resolution, so it can be used to encode or decode audio at any sampling rate. It also reduces the cost of transmitting data from the edge to the central server. Through comparative experiments, we show that the zero-shot audio compression and decompression largely preserve anomaly detection performance while improving storage and transmission efficiency.
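The storage recipe above boils down to "encode on the edge, keep only the code, decode at the server". The sketch below walks through that flow with a tiny strided-convolution autoencoder standing in for the paper's pre-trained super-resolution model; the architecture, sampling rate, and resulting ratio are assumptions.

```python
# Hedged sketch of the edge-side compress/store/decode flow described above.
import torch
import torch.nn as nn

class StandInAudioAE(nn.Module):
    """Toy strided-conv autoencoder standing in for the paper's pre-trained model."""
    def __init__(self, stride=16):
        super().__init__()
        self.enc = nn.Conv1d(1, 4, kernel_size=stride, stride=stride)        # compact code
        self.dec = nn.ConvTranspose1d(4, 1, kernel_size=stride, stride=stride)

ae = StandInAudioAE().eval()                 # pretend this was pre-trained offline

audio = torch.randn(1, 1, 16000)             # one second of fake 16 kHz audio
with torch.no_grad():
    code = ae.enc(audio)                     # edge device: store / transmit only this
    recon = ae.dec(code)                     # central server: reconstruct on demand

ratio = audio.numel() / code.numel()         # nominal compression ratio (same dtype)
print(f"latent shape {tuple(code.shape)}, compression ratio ~{ratio:.1f}x, "
      f"recon shape {tuple(recon.shape)}")
```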
Online Ad Procurement in Non-stationary Autobidding Worlds
results: Our algorithm achieves low regret in many worlds without knowing which procedure generates the procurement outcomes, and provides effective lever decisions across different types of outcome-generating procedures.
Abstract
Today's online advertisers procure digital ad impressions through interacting with autobidding platforms: advertisers convey high-level procurement goals via setting levers such as budget, target return-on-investment, max cost per click, etc. Ad platforms then procure impressions on advertisers' behalf and report final procurement conversions (e.g. clicks) to advertisers. In practice, advertisers may receive minimal information on platforms' procurement details, and procurement outcomes are subject to non-stationary factors like seasonal patterns, occasional system corruptions, and market trends, which make it difficult for advertisers to optimize lever decisions effectively. Motivated by this, we present an online learning framework that helps advertisers dynamically optimize ad platform lever decisions subject to general long-term constraints, in a realistic bandit feedback environment with non-stationary procurement outcomes. In particular, we introduce a primal-dual algorithm for online decision making with multi-dimensional decision variables, bandit feedback and long-term uncertain constraints. We show that our algorithm achieves low regret in many worlds where procurement outcomes are generated through procedures that are stochastic, adversarial, adversarially corrupted, periodic, and ergodic, respectively, without having to know which procedure is the ground truth. Finally, we emphasize that our proposed algorithm and theoretical results extend beyond the applications of online advertising.
Summary
Today's online advertisers procure digital ad impressions by interacting with autobidding platforms: advertisers convey high-level procurement goals by setting levers such as budget, target return-on-investment, and maximum cost per click. The ad platform then procures impressions on the advertiser's behalf and reports the final procurement conversions (e.g. clicks) back to the advertiser. In practice, advertisers may receive only minimal detail about the platform's procurement process, and procurement outcomes are affected by non-stationary factors such as seasonal patterns, occasional system corruptions, and market trends, which makes it difficult for advertisers to optimize their lever decisions effectively. To address this, we propose an online learning framework that helps advertisers dynamically optimize ad platform lever decisions under realistic bandit feedback with non-stationary procurement outcomes, while respecting long-term uncertain constraints. Specifically, we introduce a primal-dual algorithm for online decision making with multi-dimensional decision variables, bandit feedback, and long-term uncertain constraints. We prove that our algorithm achieves low regret in many worlds, without needing to know which outcome-generating procedure is the ground truth. Finally, we emphasize that our algorithm and theoretical results extend beyond online advertising and can be applied to other domains.
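As a rough illustration of the primal-dual mechanics described above, the toy loop below picks a lever setting each round from a small grid, observes only the realized conversions and spend for that choice (bandit feedback), and maintains a dual price on the spend constraint. The simulated market and the exploration and update rules are simplifications, not the paper's algorithm or guarantees.

```python
# Hedged, toy primal-dual loop with bandit feedback and a long-term budget constraint.
import numpy as np

rng = np.random.default_rng(0)
levers = np.linspace(0.5, 2.0, 8)           # candidate bid-multiplier settings
T, budget_per_round, eta = 2000, 1.0, 0.05
lam = 0.0                                    # dual price on the spend constraint
reward_est = np.zeros_like(levers)           # running per-lever reward estimates
spend_est = np.zeros_like(levers)
counts = np.zeros_like(levers)

for t in range(T):
    # primal step: lever with the best Lagrangian-adjusted estimate, plus exploration
    if rng.random() < 0.1:
        i = rng.integers(len(levers))
    else:
        i = int(np.argmax(reward_est - lam * spend_est))
    # bandit feedback from a (non-stationary) simulated market
    drift = 1.0 + 0.3 * np.sin(2 * np.pi * t / 500)
    conversions = rng.poisson(2.0 * levers[i] * drift)
    spend = levers[i] * drift + rng.normal(0, 0.1)
    # update running estimates for the played arm only
    counts[i] += 1
    reward_est[i] += (conversions - reward_est[i]) / counts[i]
    spend_est[i] += (spend - spend_est[i]) / counts[i]
    # dual step: raise the price whenever spend exceeds the per-round budget
    lam = max(0.0, lam + eta * (spend - budget_per_round))

best = levers[int(np.argmax(reward_est - lam * spend_est))]
print(f"final dual price {lam:.2f}, preferred lever setting {best:.2f}")
```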
Generalizing Graph ODE for Learning Complex System Dynamics across Environments
results: Experiments on various physical simulations show that our model can accurately predict system dynamics, especially over long horizons, and can generalize well to new systems with only a few observations.
Abstract
Learning multi-agent system dynamics has been extensively studied for various real-world applications, such as molecular dynamics in biology. Most of the existing models are built to learn single system dynamics from observed historical data and predict the future trajectory. In practice, however, we might observe multiple systems that are generated across different environments, which differ in latent exogenous factors such as temperature and gravity. One simple solution is to learn multiple environment-specific models, but it fails to exploit the potential commonalities among the dynamics across environments and offers poor prediction results where per-environment data is sparse or limited. Here, we present GG-ODE (Generalized Graph Ordinary Differential Equations), a machine learning framework for learning continuous multi-agent system dynamics across environments. Our model learns system dynamics using neural ordinary differential equations (ODE) parameterized by Graph Neural Networks (GNNs) to capture the continuous interaction among agents. We achieve the model generalization by assuming the dynamics across different environments are governed by common physics laws that can be captured via learning a shared ODE function. The distinct latent exogenous factors learned for each environment are incorporated into the ODE function to account for their differences. To improve model performance, we additionally design two regularization losses to (1) enforce the orthogonality between the learned initial states and exogenous factors via mutual information minimization; and (2) reduce the temporal variance of learned exogenous factors within the same system via contrastive learning. Experiments over various physical simulations show that our model can accurately predict system dynamics, especially in the long range, and can generalize well to new systems with few observations.
Summary
Learning multi-agent system dynamics has been widely studied for various real-world applications, such as molecular dynamics in biology. Most existing models are built to learn the dynamics of a single system from observed historical data and to predict its future trajectory. In practice, however, we may observe multiple systems generated in different environments, which differ in latent exogenous factors such as temperature and gravity. A simple solution is to learn a separate model for each environment, but this ignores the commonalities among the dynamics across environments and performs poorly when per-environment data is sparse or limited. We present GG-ODE (Generalized Graph Ordinary Differential Equations), a machine learning framework for learning continuous multi-agent system dynamics across environments. Our model captures the continuous interactions among agents using neural ordinary differential equations (ODEs) parameterized by graph neural networks (GNNs). We achieve generalization by assuming that the dynamics across environments are governed by common physics laws, captured by a shared ODE function, while the distinct latent exogenous factors learned for each environment are fed into the ODE function to account for their differences. To improve performance, we additionally design two regularization losses: (1) enforcing orthogonality between the learned initial states and the exogenous factors via mutual information minimization, and (2) reducing the temporal variance of the learned exogenous factors within the same system via contrastive learning. Experiments on various physical simulations show that our model can accurately predict system dynamics, especially over long horizons, and generalizes well to new systems with few observations.
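A minimal way to picture the shared-ODE-plus-exogenous-factor design is a single message-passing derivative function that also consumes an environment embedding, integrated with a simple Euler scheme. The sketch below follows that reading; the network sizes, Euler integration, and random graphs are illustrative assumptions, not GG-ODE's implementation.

```python
# Hedged sketch: one GNN-style derivative function shared across environments,
# with a per-environment exogenous vector fed into it.
import torch
import torch.nn as nn

class SharedGraphODEFunc(nn.Module):
    def __init__(self, state_dim=4, env_dim=3, hidden=32):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * state_dim, hidden), nn.Tanh())
        self.upd = nn.Sequential(nn.Linear(state_dim + hidden + env_dim, hidden),
                                 nn.Tanh(), nn.Linear(hidden, state_dim))

    def forward(self, x, adj, env):            # x: (N, d), adj: (N, N), env: (env_dim,)
        pair = torch.cat([x.unsqueeze(1).expand(-1, x.size(0), -1),
                          x.unsqueeze(0).expand(x.size(0), -1, -1)], dim=-1)
        agg = (adj.unsqueeze(-1) * self.msg(pair)).sum(dim=1)   # aggregate neighbor messages
        env_rep = env.unsqueeze(0).expand(x.size(0), -1)
        return self.upd(torch.cat([x, agg, env_rep], dim=-1))   # dx/dt per node

def euler_rollout(func, x0, adj, env, steps=20, dt=0.05):
    xs, x = [x0], x0
    for _ in range(steps):
        x = x + dt * func(x, adj, env)          # crude Euler integration of the ODE
        xs.append(x)
    return torch.stack(xs)                      # (steps + 1, N, d)

func = SharedGraphODEFunc()                     # the same function serves all environments
adj = (torch.rand(5, 5) < 0.4).float()
traj_a = euler_rollout(func, torch.randn(5, 4), adj, env=torch.randn(3))  # environment A
traj_b = euler_rollout(func, torch.randn(5, 4), adj, env=torch.randn(3))  # environment B
print(traj_a.shape, traj_b.shape)
```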
Assessing the efficacy of large language models in generating accurate teacher responses
results: Experimental results show that GPT-4 outperforms the other fine-tuned models as measured by BERTScore and DialogRPT; the results suggest that dataset characteristics such as sampling, representativeness, and dialog completeness have a significant effect on fine-tuning.
Abstract
(Tack et al., 2023) organized the shared task hosted by the 18th Workshop on Innovative Use of NLP for Building Educational Applications on generation of teacher language in educational dialogues. Following the structure of the shared task, in this study, we attempt to assess the generative abilities of large language models in providing informative and helpful insights to students, thereby simulating the role of a knowledgeable teacher. To this end, we present an extensive evaluation of several benchmarking generative models, including GPT-4 (few-shot, in-context learning), fine-tuned GPT-2, and fine-tuned DialoGPT. Additionally, to optimize for pedagogical quality, we fine-tuned the Flan-T5 model using reinforcement learning. Our experimental findings on the Teacher-Student Chatroom Corpus subset indicate the efficacy of GPT-4 over other fine-tuned models, measured using BERTScore and DialogRPT. We hypothesize that several dataset characteristics, including sampling, representativeness, and dialog completeness, pose significant challenges to fine-tuning, thus contributing to the poor generalizability of the fine-tuned models. Finally, we note the need for these generative models to be evaluated with a metric that relies not only on dialog coherence and matched language modeling distribution but also on the model's ability to showcase pedagogical skills.
Summary
We hypothesize that several characteristics of the shared-task dataset, including sampling, representativeness, and dialog completeness, pose significant challenges to fine-tuning and thus contribute to the poor generalizability of the fine-tuned models. Finally, we note that these generative models need to be evaluated with a metric that relies not only on dialog coherence and matched language modeling distribution, but also on the model's ability to demonstrate pedagogical skills.
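For readers who want to reproduce the reference-based part of such an evaluation, the snippet below scores candidate teacher responses against references with BERTScore via the `bert-score` package (assumed to be installed); the sample utterances are made up, and DialogRPT would require its own scorer, which is not shown.

```python
# Hedged sketch of a BERTScore check on generated teacher responses.
from bert_score import score

candidates = ["Good try! Remember that the past tense of 'go' is 'went'."]
references = ["Nice attempt - just note that 'go' becomes 'went' in the past tense."]

P, R, F1 = score(candidates, references, lang="en", verbose=False)
print(f"BERTScore F1 = {F1.mean().item():.3f}")
```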
MentalHealthAI: Utilizing Personal Health Device Data to Optimize Psychiatry Treatment
results: Evaluation on a popular mental health dataset shows promising results, indicating that our approach can provide effective mental health tracking and prediction.
Abstract
Mental health disorders remain a significant challenge in modern healthcare, with diagnosis and treatment often relying on subjective patient descriptions and past medical history. To address this issue, we propose a personalized mental health tracking and mood prediction system that utilizes patient physiological data collected through personal health devices. Our system leverages a decentralized learning mechanism that combines transfer and federated machine learning concepts using smart contracts, allowing data to remain on users' devices and enabling effective tracking of mental health conditions for psychiatric treatment and management in a privacy-aware and accountable manner. We evaluate our model using a popular mental health dataset that demonstrates promising results. By utilizing connected health systems and machine learning models, our approach offers a novel solution to the challenge of providing psychiatrists with further insight into their patients' mental health outside of traditional office visits.
Summary
Mental health disorders remain a significant challenge in modern healthcare, with diagnosis and treatment often relying on subjective patient descriptions and past medical history. To address this issue, we propose a personalized mental health tracking and mood prediction system that uses patient physiological data collected through personal health devices. Our system adopts a decentralized learning mechanism that combines transfer and federated machine learning concepts using smart contracts, allowing data to remain on users' devices and enabling effective tracking of mental health conditions for psychiatric treatment and management in a privacy-aware and accountable manner. We evaluate our model on a popular mental health dataset and obtain promising results. By using connected health systems and machine learning models, our approach offers a novel way of giving psychiatrists further insight into their patients' mental health outside of traditional office visits.
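The federated piece of this design can be illustrated with a plain FedAvg loop: each device fits a small model on its private data and only the parameters are averaged into a global model. The logistic-regression stand-in and synthetic data below are assumptions; the smart-contract and transfer-learning components from the abstract are out of scope here.

```python
# Hedged FedAvg illustration: raw data never leaves the device, only weights do.
import numpy as np

rng = np.random.default_rng(1)

def local_fit(X, y, w, lr=0.1, epochs=50):
    """A few steps of logistic-regression gradient descent on one device's data."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

d, n_clients = 6, 5
global_w = np.zeros(d)
clients = [(rng.normal(size=(40, d)), rng.integers(0, 2, 40).astype(float))
           for _ in range(n_clients)]                  # synthetic per-device datasets

for rnd in range(10):                                  # federated rounds
    local_ws = [local_fit(X, y, global_w.copy()) for X, y in clients]
    global_w = np.mean(local_ws, axis=0)               # FedAvg: average parameters

print("global weights after 10 rounds:", np.round(global_w, 3))
```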
RidgeBase: A Cross-Sensor Multi-Finger Contactless Fingerprint Dataset
for: This paper presents a large-scale real-world dataset to foster further advances in contactless fingerprint matching.
methods: The paper collects contactless and contact-based fingerprint images using two smartphone cameras and a flatbed contact sensor, and proposes a set-based matching protocol, inspired by facial recognition datasets, to handle intra-sample variance.
results: Baseline experiments with a COTS fingerprint matcher and a deep CNN-based approach show highly accurate fingerprint recognition on the RidgeBase dataset.
Abstract
Contactless fingerprint matching using smartphone cameras can alleviate major challenges of traditional fingerprint systems including hygienic acquisition, portability and presentation attacks. However, development of practical and robust contactless fingerprint matching techniques is constrained by the limited availability of large scale real-world datasets. To motivate further advances in contactless fingerprint matching across sensors, we introduce the RidgeBase benchmark dataset. RidgeBase consists of more than 15,000 contactless and contact-based fingerprint image pairs acquired from 88 individuals under different background and lighting conditions using two smartphone cameras and one flatbed contact sensor. Unlike existing datasets, RidgeBase is designed to promote research under different matching scenarios that include Single Finger Matching and Multi-Finger Matching for both contactless-to-contactless (CL2CL) and contact-to-contactless (C2CL) verification and identification. Furthermore, due to the high intra-sample variance in contactless fingerprints belonging to the same finger, we propose a set-based matching protocol inspired by the advances in facial recognition datasets. This protocol is specifically designed for pragmatic contactless fingerprint matching that can account for variances in focus, polarity and finger-angles. We report qualitative and quantitative baseline results for different protocols using a COTS fingerprint matcher (Verifinger) and a deep CNN-based approach on the RidgeBase dataset. The dataset can be downloaded here: https://www.buffalo.edu/cubs/research/datasets/ridgebase-benchmark-dataset.html
Summary
Contactless fingerprint matching using smartphone cameras can alleviate major challenges of traditional fingerprint systems, including hygienic acquisition, portability, and presentation attacks. However, the development of practical and robust contactless fingerprint matching techniques is constrained by the limited availability of large-scale real-world datasets. To motivate further advances in contactless fingerprint matching across sensors, we introduce the RidgeBase benchmark dataset. RidgeBase consists of more than 15,000 contactless and contact-based fingerprint image pairs acquired from 88 individuals under different background and lighting conditions using two smartphone cameras and one flatbed contact sensor. Unlike existing datasets, RidgeBase is designed to promote research under different matching scenarios, including Single Finger Matching and Multi-Finger Matching for both contactless-to-contactless (CL2CL) and contact-to-contactless (C2CL) verification and identification. Furthermore, because of the high intra-sample variance among contactless fingerprints of the same finger, we propose a set-based matching protocol inspired by advances in facial recognition datasets. This protocol is specifically designed for practical contactless fingerprint matching and can account for variance in focus, polarity, and finger angle. We report qualitative and quantitative baseline results for different protocols using a COTS fingerprint matcher (Verifinger) and a deep CNN-based approach on the RidgeBase dataset. The dataset can be downloaded here: https://www.buffalo.edu/cubs/research/datasets/ridgebase-benchmark-dataset.html
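The set-based protocol can be pictured as comparing two sets of embeddings rather than two single images. The sketch below uses random embeddings and a max-then-mean fusion of pairwise cosine similarities as one plausible reading; the actual protocol and features used for RidgeBase may differ.

```python
# Hedged sketch of set-to-set matching for multi-capture contactless fingerprints.
import numpy as np

rng = np.random.default_rng(0)

def set_similarity(probe_set, gallery_set):
    """Cosine similarity between every probe/gallery pair, fused by taking each
    probe sample's best match and averaging over the probe set."""
    a = probe_set / np.linalg.norm(probe_set, axis=1, keepdims=True)
    b = gallery_set / np.linalg.norm(gallery_set, axis=1, keepdims=True)
    sims = a @ b.T                        # (n_probe, n_gallery) pairwise cosines
    return sims.max(axis=1).mean()

genuine_gallery = rng.normal(size=(4, 128))                         # 4 captures of one finger
genuine_probe = genuine_gallery[:3] + 0.1 * rng.normal(size=(3, 128))  # same finger, new captures
impostor_probe = rng.normal(size=(3, 128))                          # a different finger

print("genuine score :", round(set_similarity(genuine_probe, genuine_gallery), 3))
print("impostor score:", round(set_similarity(impostor_probe, genuine_gallery), 3))
```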
The Future of Fundamental Science Led by Generative Closed-Loop Artificial Intelligence
paper_authors: Hector Zenil, Jesper Tegnér, Felipe S. Abrahão, Alexander Lavin, Vipin Kumar, Jeremy G. Frey, Adrian Weller, Larisa Soldatova, Alan R. Bundy, Nicholas R. Jennings, Koichi Takahashi, Lawrence Hunter, Saso Dzeroski, Andrew Briggs, Frederick D. Gregory, Carla P. Gomes, Christopher K. I. Williams, Jon Rowe, James Evans, Hiroaki Kitano, Joshua B. Tenenbaum, Ross King
for: The paper explores the potential of an AI-driven, automated, closed-loop approach to scientific discovery, including self-driven hypothesis generation and open-ended autonomous exploration of the hypothesis space.
methods: The paper discusses the use of Generative AI and Large Language Models to augment and accelerate the scientific discovery of fundamental deep science with quantitative models.
results: The paper suggests that integrating AI-driven automation into the practice of science could mitigate current problems, including the replication of findings and the systematic production of data, and ultimately democratise the scientific process.
Abstract
Recent advances in machine learning and AI, including Generative AI and LLMs, are disrupting technological innovation, product development, and society as a whole. AI's contribution to technology can come from multiple approaches that require access to large training data sets and clear performance evaluation criteria, ranging from pattern recognition and classification to generative models. Yet, AI has contributed less to fundamental science in part because large data sets of high-quality data for scientific practice and model discovery are more difficult to access. Generative AI, in general, and Large Language Models in particular, may represent an opportunity to augment and accelerate the scientific discovery of fundamental deep science with quantitative models. Here we explore and investigate aspects of an AI-driven, automated, closed-loop approach to scientific discovery, including self-driven hypothesis generation and open-ended autonomous exploration of the hypothesis space. Integrating AI-driven automation into the practice of science would mitigate current problems, including the replication of findings, systematic production of data, and ultimately democratisation of the scientific process. Realising these possibilities requires a vision for augmented AI coupled with a diversity of AI approaches able to deal with fundamental aspects of causality analysis and model discovery while enabling unbiased search across the space of putative explanations. These advances hold the promise to unleash AI's potential for searching and discovering the fundamental structure of our world beyond what human scientists have been able to achieve. Such a vision would push the boundaries of new fundamental science rather than automatize current workflows and instead open doors for technological innovation to tackle some of the greatest challenges facing humanity today.
ChatGPT in the Age of Generative AI and Large Language Models: A Concise Survey
paper_authors: Salman Mohamadi, Ghulam Mujtaba, Ngan Le, Gianfranco Doretto, Donald A. Adjeroh
for: The main goal of this paper is to provide a concise survey of the current lines of research on ChatGPT and its evolution.
methods: The paper studies ChatGPT from two perspectives: a glass-box view and a black-box view. The glass-box view focuses on understanding the inner workings of the technology, while the black-box view treats it as a complex system and examines its inputs, outputs, and effects.
results: The paper provides a comprehensive exploration of the research progress and application prospects of ChatGPT, points out directions for further research and potential application areas, lays out foundational literature on LLMs and GAI, and discusses their applications and key concerns in fields such as education, research, healthcare, and finance.
Abstract
ChatGPT is a large language model (LLM) created by OpenAI that has been carefully trained on a large amount of data. It has revolutionized the field of natural language processing (NLP) and has pushed the boundaries of LLM capabilities. ChatGPT has played a pivotal role in enabling widespread public interaction with generative artificial intelligence (GAI) on a large scale. It has also sparked research interest in developing similar technologies and investigating their applications and implications. In this paper, our primary goal is to provide a concise survey on the current lines of research on ChatGPT and its evolution. We considered both the glass box and black box views of ChatGPT, encompassing the components and foundational elements of the technology, as well as its applications, impacts, and implications. The glass box approach focuses on understanding the inner workings of the technology, and the black box approach embraces it as a complex system, and thus examines its inputs, outputs, and effects. This paves the way for a comprehensive exploration of the technology and provides a road map for further research and experimentation. We also lay out essential foundational literature on LLMs and GAI in general and their connection with ChatGPT. This overview sheds light on existing and missing research lines in the emerging field of LLMs, benefiting both public users and developers. Furthermore, the paper delves into the broad spectrum of applications and significant concerns in fields such as education, research, healthcare, finance, etc.
Summary
ChatGPT is a large language model (LLM) created by OpenAI and carefully trained on a large amount of data. It has revolutionized the field of natural language processing (NLP) and pushed the boundaries of what LLMs can do. ChatGPT has enabled large-scale public interaction with generative artificial intelligence (GAI) and has sparked research interest in developing similar technologies and studying their applications and implications. In this paper, our primary goal is to provide a concise overview of the current lines of research on ChatGPT and its evolution. We adopt both a glass-box and a black-box view, that is, understanding the inner structure and behavior of the technology as well as studying its inputs, outputs, and effects. This approach allows a comprehensive exploration of the technology and provides a road map for further research and experimentation. We also lay out essential foundational literature on LLMs and GAI, which is useful to both public users and developers. The overview sheds light on existing and missing lines of research in the emerging field of LLMs, and discusses the applications of and concerns about GAI in areas such as education, research, healthcare, and finance.
Ensemble learning for blending gridded satellite and gauge-measured precipitation data
results: An extensive comparison over the entire contiguous United States and a 15-year period shows that the ensemble learning algorithms improve the accuracy of satellite precipitation products.
Abstract
Regression algorithms are regularly used for improving the accuracy of satellite precipitation products. In this context, ground-based measurements are the dependent variable and the satellite data are the predictor variables, together with topography factors. Alongside this, it is increasingly recognised in many fields that combinations of algorithms through ensemble learning can lead to substantial predictive performance improvements. Still, a sufficient number of ensemble learners for improving the accuracy of satellite precipitation products and their large-scale comparison are currently missing from the literature. In this work, we fill this specific gap by proposing 11 new ensemble learners in the field and by extensively comparing them for the entire contiguous United States and for a 15-year period. We use monthly data from the PERSIANN (Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks) and IMERG (Integrated Multi-satellitE Retrievals for GPM) gridded datasets. We also use gauge-measured precipitation data from the Global Historical Climatology Network monthly database, version 2 (GHCNm). The ensemble learners combine the predictions by six regression algorithms (base learners), namely the multivariate adaptive regression splines (MARS), multivariate adaptive polynomial splines (poly-MARS), random forests (RF), gradient boosting machines (GBM), extreme gradient boosting (XGBoost) and Bayesian regularized neural networks (BRNN), and each of them is based on a different combiner. The combiners include the equal-weight combiner, the median combiner, two best learners and seven variants of a sophisticated stacking method. The latter stacks a regression algorithm on the top of the base learners to combine their independent predictions...
Summary
Regression algorithms are regularly used to improve the accuracy of satellite precipitation products. In this context, ground-based measurements are the dependent variable, while the satellite data and topography factors are the predictor variables. It is also increasingly recognised that combining algorithms through ensemble learning can lead to substantial improvements in predictive performance. However, a sufficient number of ensemble learners for improving the accuracy of satellite precipitation products, and a large-scale comparison of them, are currently missing from the literature. In this work, we fill this gap by proposing 11 new ensemble learners in the field and by extensively comparing them over the entire contiguous United States and a 15-year period. We use monthly data from the PERSIANN (Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks) and IMERG (Integrated Multi-satellitE Retrievals for GPM) gridded datasets, as well as gauge-measured precipitation data from the Global Historical Climatology Network monthly database, version 2 (GHCNm). The ensemble learners combine the predictions of six base regression algorithms, and the combiners include the equal-weight combiner, the median combiner, two best-learner combiners, and seven variants of a sophisticated stacking method, each of which stacks a regression algorithm on top of the base learners to combine their independent predictions.
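The combiner families mentioned above are easy to prototype. The sketch below shows an equal-weight combiner, a median combiner, and a stacking combiner over a reduced set of scikit-learn base learners on synthetic data; the paper's six base regressors and the PERSIANN/IMERG/GHCNm variables are not reproduced here.

```python
# Hedged sketch of equal-weight, median, and stacking combiners for regression.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (RandomForestRegressor, GradientBoostingRegressor,
                              StackingRegressor)
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=4, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

base = [("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
        ("gbm", GradientBoostingRegressor(random_state=0)),
        ("lin", LinearRegression())]
preds = np.column_stack([est.fit(X_tr, y_tr).predict(X_te) for _, est in base])

equal_weight = preds.mean(axis=1)                 # simple average of base predictions
median_comb = np.median(preds, axis=1)            # median combiner
stack = StackingRegressor(estimators=base,        # stacking: a regressor on top of the bases
                          final_estimator=LinearRegression()).fit(X_tr, y_tr)

for name, p in [("equal weight", equal_weight), ("median", median_comb),
                ("stacking", stack.predict(X_te))]:
    print(f"{name:12s} RMSE = {np.sqrt(np.mean((p - y_te) ** 2)):.2f}")
```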
Efficient Bayesian travel-time tomography with geologically-complex priors using sensitivity-informed polynomial chaos expansion and deep generative networks
for: This paper focuses on developing a strategy for Bayesian travel-time tomography using Markov chain Monte Carlo (MCMC) methods that can accurately characterize the prior distribution and efficiently evaluate the likelihood.
methods: The paper combines the use of principal component analysis (PCA) and polynomial chaos expansion (PCE) to develop a surrogate model for the forward problem, and leverages variational autoencoders (VAEs) to represent the prior distribution.
results: The proposed method enables accurate reconstruction of the true travel times and provides a viable alternative to traditional MCMC methods, which can be computationally expensive and challenging to implement.
Abstract
Markov chain Monte Carlo (MCMC) methods commonly confront two fundamental challenges: the accurate characterization of the prior distribution and the efficient evaluation of the likelihood. In the context of Bayesian studies on tomography, principal component analysis (PCA) can in some cases facilitate the straightforward definition of the prior distribution, while simultaneously enabling the implementation of accurate surrogate models based on polynomial chaos expansion (PCE) to replace computationally intensive full-physics forward solvers. When faced with scenarios where PCA does not offer a direct means of easily defining the prior distribution, alternative methods like deep generative models (e.g., variational autoencoders (VAEs)) can be employed as viable options. However, accurately producing a surrogate capable of capturing the intricate non-linear relationship between the latent parameters of a VAE and the outputs of forward modeling presents a notable challenge. Indeed, while PCE models provide high accuracy when the input-output relationship can be effectively approximated by relatively low-degree multivariate polynomials, this condition is typically unmet when utilizing latent variables derived from deep generative models. In this contribution, we present a strategy that combines the excellent reconstruction performance of VAEs in terms of prior representation with the accuracy of PCA-PCE surrogate modeling in the context of Bayesian ground penetrating radar (GPR) travel-time tomography. Within the MCMC process, the parametrization of the VAE is leveraged for prior exploration and sample proposal. Concurrently, modeling is conducted using PCE, which operates on either globally or locally defined principal components of the VAE samples under examination.
Summary
Markov chain Monte Carlo (MCMC) methods commonly face two fundamental challenges: accurately characterizing the prior distribution and efficiently evaluating the likelihood. In Bayesian tomography studies, principal component analysis (PCA) can in some cases make the definition of the prior distribution straightforward, while also enabling accurate surrogate models based on polynomial chaos expansion (PCE) to replace computationally intensive full-physics forward solvers. When PCA does not offer a direct way of defining the prior distribution, deep generative models (e.g., variational autoencoders, VAEs) can be used instead. However, producing a surrogate capable of capturing the intricate non-linear relationship between the latent parameters of a VAE and the outputs of forward modeling is a notable challenge. Indeed, PCE models provide high accuracy when the input-output relationship can be well approximated by relatively low-degree multivariate polynomials, a condition that is typically not met when using latent variables derived from deep generative models. In this contribution, we present a strategy that combines the excellent prior-representation capability of VAEs with the accuracy of PCA-PCE surrogate modeling for Bayesian ground penetrating radar (GPR) travel-time tomography. Within the MCMC process, the parametrization of the VAE is used for prior exploration and sample proposal, while modeling is carried out with PCE, which operates on either globally or locally defined principal components of the VAE samples under examination.
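The overall MCMC structure (propose in the generator's latent space, decode, project onto principal components, evaluate a cheap surrogate) can be illustrated with stand-ins for every component. The toy below uses a random linear "decoder", a hand-written polynomial "surrogate", and random-walk Metropolis; none of it reflects the paper's trained VAE or PCE.

```python
# Hedged toy of latent-space MCMC with a surrogate likelihood; all parts are placeholders.
import numpy as np

rng = np.random.default_rng(0)
latent_dim, model_dim, n_pc = 4, 50, 3

W = rng.normal(size=(model_dim, latent_dim))                 # stand-in "VAE decoder" weights
decode = lambda z: np.tanh(W @ z)                            # latent -> subsurface model (toy)
pcs = np.linalg.qr(rng.normal(size=(model_dim, n_pc)))[0]    # stand-in principal components
surrogate = lambda c: np.array([c @ c, c[0] - 2 * c[1], c[1] * c[2]])  # stand-in "PCE" forward map

z_true = rng.normal(size=latent_dim)
d_obs = surrogate(pcs.T @ decode(z_true)) + 0.01 * rng.normal(size=3)  # noisy synthetic "travel times"

def log_post(z, sigma=0.01):
    pred = surrogate(pcs.T @ decode(z))              # decode -> project onto PCs -> surrogate
    return -0.5 * np.sum(((pred - d_obs) / sigma) ** 2) - 0.5 * z @ z  # likelihood + N(0, I) prior

z, accepted = np.zeros(latent_dim), 0
lp = log_post(z)
for _ in range(5000):                                # random-walk Metropolis in latent space
    z_prop = z + 0.1 * rng.normal(size=latent_dim)
    lp_prop = log_post(z_prop)
    if np.log(rng.random()) < lp_prop - lp:          # Metropolis accept/reject
        z, lp, accepted = z_prop, lp_prop, accepted + 1

print(f"acceptance rate {accepted / 5000:.2f}, "
      f"max data misfit {np.abs(surrogate(pcs.T @ decode(z)) - d_obs).max():.3f}")
```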
Hierarchical Autoencoder-based Lossy Compression for Large-scale High-resolution Scientific Data
paper_authors: Hieu Le, Hernan Santos, Jian Tao
for: compress large-scale scientific data while maintaining high reconstruction quality
methods: uses a neural network-based approach with an Autoencoder architecture
results: achieves a compression ratio of 140 on several benchmark data sets, and 200 on simulation data from the High-Resolution Community Earth System Model (CESM) Version 1.3, with negligible reconstruction error.
Abstract
Lossy compression has become an important technique to reduce data size in many domains. This type of compression is especially valuable for large-scale scientific data, whose size ranges up to several petabytes. Although Autoencoder-based models have been successfully leveraged to compress images and videos, such neural networks have not widely gained attention in the scientific data domain. Our work presents a neural network that not only significantly compresses large-scale scientific data but also maintains high reconstruction quality. The proposed model is tested with publicly available scientific benchmark data and applied to a large-scale high-resolution climate modeling data set. Our model achieves a compression ratio of 140 on several benchmark data sets without compromising the reconstruction quality. Simulation data from the High-Resolution Community Earth System Model (CESM) Version 1.3 over 500 years is also compressed with a compression ratio of 200, while the reconstruction error remains negligible for scientific analysis.
Summary
Lossy compression has become an important technique for reducing data size in many domains. It is especially valuable for large-scale scientific data, whose size can reach several petabytes. Although autoencoder-based models have been successfully used to compress images and videos, such neural networks have not received much attention in the scientific data domain. Our work presents a neural network model that significantly compresses large-scale scientific data while maintaining high reconstruction quality. The model is tested on publicly available scientific benchmark data and applied to a large-scale high-resolution climate modeling data set. It achieves a compression ratio of 140 on several benchmark data sets with almost no impact on reconstruction quality, and compresses 500 years of simulation data from the High-Resolution Community Earth System Model (CESM) Version 1.3 at a compression ratio of 200 with negligible reconstruction error.
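Structurally, this kind of pipeline encodes each tile of a gridded field to a small latent block, stores (or transmits) the quantized latent, and decodes on demand. The sketch below shows that flow with a tiny 2-D convolutional autoencoder and 8-bit latent quantization; the architecture, the quantization scheme, and the resulting ratio are illustrative, not the paper's hierarchical model or its reported 140-200x ratios.

```python
# Hedged structural sketch of autoencoder-based lossy compression for gridded fields.
import torch
import torch.nn as nn

class TileAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, 8, 4, stride=4), nn.ReLU(),
                                 nn.Conv2d(8, 2, 4, stride=4))           # 64x64 -> 2x4x4 latent
        self.dec = nn.Sequential(nn.ConvTranspose2d(2, 8, 4, stride=4), nn.ReLU(),
                                 nn.ConvTranspose2d(8, 1, 4, stride=4))

ae = TileAE().eval()                          # pretend this was trained on climate tiles
tile = torch.randn(1, 1, 64, 64)              # one float32 tile of a gridded field

with torch.no_grad():
    z = ae.enc(tile)
    scale = z.abs().max() / 127               # 8-bit quantization of the latent (lossy)
    z_int8 = torch.clamp((z / scale).round(), -127, 127).to(torch.int8)
    recon = ae.dec(z_int8.float() * scale)    # decode from the stored latent

ratio = tile.numel() * 4 / z_int8.numel()     # float32 bytes in vs int8 latent bytes out
print(f"stored {z_int8.numel()} bytes for {tile.numel() * 4} input bytes "
      f"(ratio ~{ratio:.0f}x), recon shape {tuple(recon.shape)}")
```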
Generalized Action-based Ball Recovery Model using 360$^\circ$ data
results: Under various defensive styles, changes of possession are caused by several different kinds of actions, and these actions are positively correlated with the defensive style; the study also finds that a team's positioning influences ball recovery.
Abstract
Even though having more possession does not necessarily lead to winning, teams like Manchester City, Liverpool, and Leeds United notably have tried to recover the ball quickly after they lost it over the past few years. Nowadays, some of the top managers in the world apply high-pressing styles, and concepts such as the five-second rule, usually credited to Guardiola, have been spreading out [9][10], becoming a fundamental part of how lots of teams have played over the recent years. Expressions like "don't let them breathe" and "get the ball back as soon as possible" are often heard in the media [4][5][6], but what are the actions that most lead to a change in possession? What is the influence of a team's positioning on the ball recovery? Which are the players that more often collapse when under pressure? Can we evaluate the defensive dynamics of teams that do not necessarily press the player in possession as intensely as those mentioned above? We try to answer those and other questions in this paper by creating a Generalized Action based Ball Recovery model (GABR) using Statsbomb 360$^\circ$ data.
Summary
Even though having more possession does not necessarily lead to winning, teams such as Manchester City, Liverpool, and Leeds United have notably tried to recover the ball quickly after losing it over the past few years. Nowadays, some of the world's top managers apply high-pressing styles, and concepts such as the five-second rule, usually credited to Guardiola, have spread widely [9][10], becoming a fundamental part of how many teams have played in recent years. Expressions like "don't let them breathe" and "get the ball back as soon as possible" are often heard in the media [4][5][6], but which actions most often lead to a change in possession? How does a team's positioning influence ball recovery? Which players collapse most often when under pressure? We try to answer these and other questions by building a Generalized Action based Ball Recovery model (GABR) using Statsbomb 360$^\circ$ data.