results: Compared with other baselines, the method performs strongly across multiple molecular property benchmarks and shows greater robustness and applicability in challenging yet ubiquitous scenarios.

Abstract
Reliable molecular property prediction is essential for various scientific endeavors and industrial applications, such as drug discovery. However, the scarcity of data, combined with the highly non-linear causal relationships between physicochemical and biological properties and conventional molecular featurization schemes, complicates the development of robust molecular machine learning models. Self-supervised learning (SSL) has emerged as a popular solution, utilizing large-scale, unannotated molecular data to learn a foundational representation of chemical space that might be advantageous for downstream tasks. Yet, existing molecular SSL methods largely overlook domain-specific knowledge, such as molecular similarity and scaffold importance, as well as the context of the target application when operating over the large chemical space. This paper introduces a novel learning framework that leverages the knowledge of structural hierarchies within molecular structures, embeds them through separate pre-training tasks over distinct channels, and employs a task-specific channel selection to compose a context-dependent representation. Our approach demonstrates competitive performance across various molecular property benchmarks and establishes some state-of-the-art results. It further offers unprecedented advantages in particularly challenging yet ubiquitous scenarios like activity cliffs with enhanced robustness and generalizability compared to other baselines.
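The abstract does not pin down the channel mechanics; below is a minimal, hypothetical sketch of the core idea of separately pre-trained channels combined by a task-specific selection. The encoder form, gating scheme, and all names are our assumptions, not the paper's specification.

```python
# Hypothetical sketch: one encoder per structural hierarchy level, each
# pre-trained with its own SSL task, combined by a learned task-specific gate.
import torch
import torch.nn as nn

class MultiChannelEncoder(nn.Module):
    def __init__(self, in_dim: int, channel_dim: int, n_channels: int):
        super().__init__()
        # One channel per hierarchy level (e.g., atom / motif / scaffold).
        self.channels = nn.ModuleList([
            nn.Sequential(nn.Linear(in_dim, channel_dim), nn.ReLU(),
                          nn.Linear(channel_dim, channel_dim))
            for _ in range(n_channels)
        ])
        # Task-specific gate, learned per downstream task after pre-training.
        self.gate = nn.Parameter(torch.zeros(n_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        embs = torch.stack([c(x) for c in self.channels], dim=1)  # (B, C, D)
        weights = torch.softmax(self.gate, dim=0)                 # (C,)
        return (weights[None, :, None] * embs).sum(dim=1)         # (B, D)
```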
Riemannian Laplace Approximation with the Fisher Metric
results: Derives two modified variants and demonstrates their practical improvements in several experiments.

Abstract
Laplace's method approximates a target density with a Gaussian distribution at its mode. It is computationally efficient and asymptotically exact for Bayesian inference due to the Bernstein-von Mises theorem, but for complex targets and finite-data posteriors it is often too crude an approximation. A recent generalization of the Laplace approximation transforms the Gaussian approximation according to a chosen Riemannian geometry, providing a richer approximation family while still retaining computational efficiency. However, as shown here, its properties depend heavily on the chosen metric; indeed, the metric adopted in previous work yields approximations that are overly narrow and biased even in the limit of infinite data. We correct this shortcoming by developing the approximation family further, deriving two alternative variants that are exact in the limit of infinite data, extending the theoretical analysis of the method, and demonstrating practical improvements in a range of experiments.
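For reference, the classical construction being generalized fits a Gaussian at the posterior mode, with covariance given by the inverse Hessian of the negative log-posterior:

$$\hat{\theta} = \arg\max_{\theta} \log p(\theta \mid \mathcal{D}), \qquad p(\theta \mid \mathcal{D}) \approx \mathcal{N}\!\left(\theta \,\middle|\, \hat{\theta},\ \left[-\nabla_{\theta}^{2} \log p(\theta \mid \mathcal{D})\big|_{\theta=\hat{\theta}}\right]^{-1}\right).$$

The Riemannian generalization keeps this mode-centred construction but transforms the Gaussian according to a chosen metric; the Fisher metric is the choice studied here.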
Log-Concavity of Multinomial Likelihood Functions Under Interval Censoring Constraints on Frequencies or Their Partial Sums
results: Proves that the likelihood function of a multinomial vector under interval censoring constraints is completely log-concave.

Abstract
We show that the likelihood function for a multinomial vector observed under arbitrary interval censoring constraints on the frequencies or their partial sums is completely log-concave by proving that the constrained sample spaces comprise M-convex subsets of the discrete simplex.
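Concretely, for fixed cell probabilities $p$ and total count $n$, the multinomial likelihood of a frequency vector $x$ is

$$L(x) = \frac{n!}{x_1! \cdots x_k!} \prod_{i=1}^{k} p_i^{x_i}, \qquad x \in \mathbb{Z}_{\ge 0}^{k},\ \sum_{i=1}^{k} x_i = n,$$

and the result states that this likelihood is completely log-concave over the censoring-constrained sample spaces, which are shown to be M-convex subsets of the discrete simplex.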
One-Shot Strategic Classification Under Unknown Costs
results: Shows that even a small mis-estimation of the true cost can make the classifier's worst-case accuracy arbitrarily low; frames the one-shot task as a minimax problem and provides efficient algorithms for both the full-batch and stochastic settings, which converge to the minimax optimal solution at the dimension-independent rate of $\tilde{\mathcal{O}}(T^{-\frac{1}{2}})$.

Abstract
A primary goal in strategic classification is to learn decision rules which are robust to strategic input manipulation. Earlier works assume that strategic responses are known; while some recent works address the important challenge of unknown responses, they exclusively study sequential settings which allow multiple model deployments over time. But there are many domains (particularly in public policy, a common motivating use-case) where multiple deployments are unrealistic, or where even a single bad round is undesirable. To address this gap, we initiate the study of strategic classification under unknown responses in the one-shot setting, which requires committing to a single classifier once. Focusing on the users' cost function as the source of uncertainty, we begin by proving that for a broad class of costs, even a small mis-estimation of the true cost can entail arbitrarily low accuracy in the worst case. In light of this, we frame the one-shot task as a minimax problem, with the goal of identifying the classifier with the smallest worst-case risk over an uncertainty set of possible costs. Our main contribution is efficient algorithms for both the full-batch and stochastic settings, which we prove converge (offline) to the minimax optimal solution at the dimension-independent rate of $\tilde{\mathcal{O}}(T^{-\frac{1}{2}})$. Our analysis reveals important structure stemming from the strategic nature of user responses, particularly the importance of dual norm regularization with respect to the cost function.
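In symbols (our notation), the one-shot objective described above is

$$f^{\star} \in \arg\min_{f \in \mathcal{F}} \ \max_{c \in \mathcal{C}} \ R_{c}(f),$$

where $\mathcal{C}$ is the uncertainty set of candidate cost functions and $R_{c}(f)$ is the risk of classifier $f$ when users respond strategically under cost $c$.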
ELEGANT: Certified Defense on the Fairness of Graph Neural Networks
for: Protecting Graph Neural Networks (GNNs) against attacks on the fairness of their predictions
methods: Proposed a principled framework called ELEGANT and provided a detailed theoretical certification analysis to ensure the fairness of GNNs
results: In practical experiments, ELEGANT was proven to be effective in preventing attackers from corrupting the fairness level of GNNs' predictions by adding perturbations, and it can also be used for GNN debiasing.

Abstract
Graph Neural Networks (GNNs) have emerged as a prominent graph learning model in various graph-based tasks over the years. Nevertheless, due to the vulnerabilities of GNNs, it has been empirically proved that malicious attackers could easily corrupt the fairness level of their predictions by adding perturbations to the input graph data. In this paper, we take crucial steps to study a novel problem of certifiable defense on the fairness level of GNNs. Specifically, we propose a principled framework named ELEGANT and present a detailed theoretical certification analysis for the fairness of GNNs. ELEGANT takes any GNNs as its backbone, and the fairness level of such a backbone is theoretically impossible to be corrupted under certain perturbation budgets for attackers. Notably, ELEGANT does not have any assumption over the GNN structure or parameters, and does not require re-training the GNNs to realize certification. Hence it can serve as a plug-and-play framework for any optimized GNNs ready to be deployed. We verify the satisfactory effectiveness of ELEGANT in practice through extensive experiments on real-world datasets across different backbones of GNNs, where ELEGANT is also demonstrated to be beneficial for GNN debiasing. Open-source code can be found at https://github.com/yushundong/ELEGANT.
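Schematically, a certificate of the kind described takes the following form (our notation; the paper's precise bias measure and perturbation model may differ):

$$\Delta_{\mathrm{bias}}\big(f, \tilde{G}\big) \le \tau \quad \text{for all } \tilde{G} \text{ such that } d\big(\tilde{G}, G\big) \le \epsilon,$$

i.e., no perturbation of the input graph within the attacker's budget $\epsilon$ can push the backbone's fairness level past $\tau$.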
Staged Reinforcement Learning for Complex Tasks through Decomposed Environments
results: Experiments show that the proposed methods improve agent performance in complex tasks related to traffic junctions while minimising potential safety-critical problems that might arise in these scenarios.

Abstract
Reinforcement Learning (RL) is an area of growing interest in the field of artificial intelligence due to its many notable applications in diverse fields. Particularly within the context of intelligent vehicle control, RL has made impressive progress. However, currently it is still in simulated controlled environments where RL can achieve its full super-human potential. Although how to apply simulation experience in real scenarios has been studied, how to approximate simulated problems to the real dynamic problems is still a challenge. In this paper, we discuss two methods that approximate RL problems to real problems. In the context of traffic junction simulations, we demonstrate that, if we can decompose a complex task into multiple sub-tasks, solving these tasks first can be advantageous to help minimising possible occurrences of catastrophic events in the complex task. From a multi-agent perspective, we introduce a training structuring mechanism that exploits the use of experience learned under the popular paradigm called Centralised Training Decentralised Execution (CTDE). This experience can then be leveraged in fully decentralised settings that are conceptually closer to real settings, where agents often do not have access to a central oracle and must be treated as isolated independent units. The results show that the proposed approaches improve agents performance in complex tasks related to traffic junctions, minimising potential safety-critical problems that might happen in these scenarios. Although still in simulation, the investigated situations are conceptually closer to real scenarios and thus, with these results, we intend to motivate further research in the subject.
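A minimal sketch of the staged idea: pre-train on the decomposed sub-task environments, then continue on the full task. The agent and environment interfaces below are stand-ins, not the paper's implementation.

```python
# Hypothetical staged-training loop: master sub-tasks before the full task.
from typing import Any, Protocol

class Env(Protocol):
    def reset(self) -> Any: ...
    def step(self, action: int) -> tuple[Any, float, bool]: ...  # obs, reward, done

def train(agent: Any, env: Env, episodes: int) -> None:
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            action = agent.act(obs)
            obs, reward, done = env.step(action)
            agent.learn(obs, reward, done)

def staged_training(agent: Any, sub_envs: list[Env], full_env: Env) -> None:
    for env in sub_envs:                   # stage 1: decomposed sub-tasks
        train(agent, env, episodes=500)
    train(agent, full_env, episodes=2000)  # stage 2: the complex task
```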
Exploiting Correlated Auxiliary Feedback in Parameterized Bandits
results: Experimental results across different settings verify that the proposed method reduces regret and performs better under different kinds of auxiliary feedback.

Abstract
We study a novel variant of the parameterized bandits problem in which the learner can observe additional auxiliary feedback that is correlated with the observed reward. The auxiliary feedback is readily available in many real-life applications, e.g., an online platform that wants to recommend the best-rated services to its users can observe the user's rating of service (rewards) and collect additional information like service delivery time (auxiliary feedback). In this paper, we first develop a method that exploits auxiliary feedback to build a reward estimator with tight confidence bounds, leading to a smaller regret. We then characterize the regret reduction in terms of the correlation coefficient between reward and its auxiliary feedback. Experimental results in different settings also verify the performance gain achieved by our proposed method.
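The abstract leaves the estimator unspecified; a standard way to exploit side observations correlated with the reward is a control-variate estimator, sketched here under that assumption. Given auxiliary feedback $a$ with known mean $\mathbb{E}[a]$,

$$\hat{\mu} = \frac{1}{n} \sum_{t=1}^{n} \big( r_{t} - \beta\,(a_{t} - \mathbb{E}[a]) \big), \qquad \beta^{\star} = \frac{\operatorname{Cov}(r, a)}{\operatorname{Var}(a)},$$

which at $\beta = \beta^{\star}$ has variance $(1 - \rho^{2})\operatorname{Var}(r)/n$, where $\rho$ is the correlation coefficient between reward and auxiliary feedback, consistent with the abstract's characterization of regret reduction in terms of $\rho$.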
results: Proposes a new way of interpreting interactions among neurons and neural circuits, applied across multiple brain areas and species to study the origins of intelligent behavior.

Abstract
Humans and animals exhibit a range of interesting behaviors in dynamic environments, and it is unclear how our brains actively reformat this dense sensory information to enable these behaviors. Experimental neuroscience is undergoing a revolution in its ability to record and manipulate hundreds to thousands of neurons while an animal is performing a complex behavior. As these paradigms enable unprecedented access to the brain, a natural question that arises is how to distill these data into interpretable insights about how neural circuits give rise to intelligent behaviors. The classical approach in systems neuroscience has been to ascribe well-defined operations to individual neurons and provide a description of how these operations combine to produce a circuit-level theory of neural computations. While this approach has had some success for small-scale recordings with simple stimuli, designed to probe a particular circuit computation, often times these ultimately lead to disparate descriptions of the same system across stimuli. Perhaps more strikingly, many response profiles of neurons are difficult to succinctly describe in words, suggesting that new approaches are needed in light of these experimental observations. In this thesis, we offer a different definition of interpretability that we show has promise in yielding unified structural and functional models of neural circuits, and describes the evolutionary constraints that give rise to the response properties of the neural population, including those that have previously been difficult to describe individually. We demonstrate the utility of this framework across multiple brain areas and species to study the roles of recurrent processing in the primate ventral visual pathway; mouse visual processing; heterogeneity in rodent medial entorhinal cortex; and facilitating biological learning.
Enhancing AI Research Paper Analysis: Methodology Component Extraction using Factored Transformer-based Sequence Modeling Approach
results: Experiments show that the factored approach excels in the few-shot setup, outperforming state-of-the-art baselines by margins of up to 9.257%.

Abstract
Research in scientific disciplines evolves, often rapidly, over time with the emergence of novel methodologies and their associated terminologies. While methodologies themselves are conceptual in nature and rather difficult to automatically extract and characterise, in this paper we seek to develop supervised models for automatic extraction of the names of the various constituents of a methodology, e.g., `R-CNN', `ELMo' etc. The main research challenge for this task is effectively modeling the contexts around these methodology component names in a few-shot or even a zero-shot setting. The main contributions of this paper towards effectively identifying new evolving scientific methodology names are as follows: i) we propose a factored approach to sequence modeling, which leverages broad-level category information of methodology domains, e.g., `NLP', `RL' etc.; ii) to demonstrate the feasibility of our proposed approach of identifying methodology component names under a practical setting of fast evolving AI literature, we conduct experiments following a simulated chronological setup (newer methodologies not seen during the training process); iii) our experiments demonstrate that the factored approach outperforms state-of-the-art baselines by margins of up to 9.257\% for the methodology extraction task with the few-shot setup.
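A minimal sketch of what factoring in broad-level category information could look like in a sequence tagger: a domain-category embedding (e.g., `NLP`, `RL`) is concatenated to every token representation before tagging. All names and dimensions here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FactoredTagger(nn.Module):
    """BiLSTM tagger whose token inputs are factored with a category embedding."""
    def __init__(self, vocab: int, n_cats: int, n_tags: int, dim: int = 128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab, dim)
        self.cat_emb = nn.Embedding(n_cats, dim // 4)   # broad domain: NLP, RL, ...
        self.lstm = nn.LSTM(dim + dim // 4, dim, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * dim, n_tags)           # e.g., BIO tags for components

    def forward(self, tokens: torch.Tensor, cat: torch.Tensor) -> torch.Tensor:
        # tokens: (B, T) token ids; cat: (B,) one category id per paper.
        t = self.tok_emb(tokens)                                     # (B, T, dim)
        c = self.cat_emb(cat)[:, None, :].expand(-1, t.size(1), -1)  # (B, T, dim//4)
        h, _ = self.lstm(torch.cat([t, c], dim=-1))
        return self.out(h)                                           # (B, T, n_tags)
```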
Identifying Linearly-Mixed Causal Representations from Multi-Node Interventions
paper_authors: Simon Bing, Urmi Ninad, Jonas Wahl, Jakob Runge
for: Addressing the underconstrained problem of causal representation learning, particularly in the presence of multiple variables intervened upon within one environment.
results: Experiments show that the method learns valid causal representations under multi-node interventions while avoiding assumptions of prior work such as single-node and independent interventions.

Abstract
The task of inferring high-level causal variables from low-level observations, commonly referred to as causal representation learning, is fundamentally underconstrained. As such, recent works to address this problem focus on various assumptions that lead to identifiability of the underlying latent causal variables. A large corpus of these preceding approaches consider multi-environment data collected under different interventions on the causal model. What is common to virtually all of these works is the restrictive assumption that in each environment, only a single variable is intervened on. In this work, we relax this assumption and provide the first identifiability result for causal representation learning that allows for multiple variables to be targeted by an intervention within one environment. Our approach hinges on a general assumption on the coverage and diversity of interventions across environments, which also includes the shared assumption of single-node interventions of previous works. The main idea behind our approach is to exploit the trace that interventions leave on the variance of the ground truth causal variables and regularizing for a specific notion of sparsity with respect to this trace. In addition to and inspired by our theoretical contributions, we present a practical algorithm to learn causal representations from multi-node interventional data and provide empirical evidence that validates our identifiability results.
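Schematically (our notation), the linearly-mixed setting posits

$$x = G z, \qquad z = (z_1, \dots, z_n)\ \text{latent causal variables}, \quad G\ \text{an unknown linear mixing}.$$

As we read the abstract, the identification signal is the trace interventions leave on the latent variances $\operatorname{Var}^{(e)}(z_i)$ across environments $e$, and the method regularizes for sparsity of this trace so that only the intervened-upon variables are allowed to change.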
Regret Analysis of Learning-Based Linear Quadratic Gaussian Control with Additive Exploration
results: Proves that LQG-NAIVE achieves a regret growth rate of $\tilde{\mathcal{O}}(\sqrt{T})$, i.e., $\mathcal{O}(\sqrt{T})$ up to logarithmic factors after $T$ time steps. Also proposes LQG-IF2E, which incorporates the Fisher Information Matrix (FIM) into the exploration signal, with numerical evidence of its competitive performance compared to LQG-NAIVE.

Abstract
In this paper, we analyze the regret incurred by a computationally efficient exploration strategy, known as naive exploration, for controlling unknown partially observable systems within the Linear Quadratic Gaussian (LQG) framework. We introduce a two-phase control algorithm called LQG-NAIVE, which involves an initial phase of injecting Gaussian input signals to obtain a system model, followed by a second phase of an interplay between naive exploration and control in an episodic fashion. We show that LQG-NAIVE achieves a regret growth rate of $\tilde{\mathcal{O}}(\sqrt{T})$, i.e., $\mathcal{O}(\sqrt{T})$ up to logarithmic factors after $T$ time steps, and we validate its performance through numerical simulations. Additionally, we propose LQG-IF2E, which extends the exploration signal to a `closed-loop' setting by incorporating the Fisher Information Matrix (FIM). We provide compelling numerical evidence of the competitive performance of LQG-IF2E compared to LQG-NAIVE.
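For context, the standard partially observed LQG setting (our notation) is

$$x_{t+1} = A x_{t} + B u_{t} + w_{t}, \qquad y_{t} = C x_{t} + v_{t}, \qquad J = \limsup_{T \to \infty} \frac{1}{T}\, \mathbb{E} \sum_{t=1}^{T} \big( x_{t}^{\top} Q x_{t} + u_{t}^{\top} R u_{t} \big),$$

with Gaussian process and measurement noise $w_t, v_t$. Naive exploration commonly takes the form $u_{t} = K(\hat{\Theta})\, \hat{x}_{t} + \eta_{t}$: the certainty-equivalent controller for the current model estimate $\hat{\Theta}$ applied to the state estimate $\hat{x}_t$, plus additive Gaussian noise $\eta_{t}$, matching the additive-exploration scheme in the title.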
Drone-Enabled Load Management for Solar Small Cell Networks in Next-Gen Communications
for: Optimization for Solar Small Cells
methods: Uses drone-carried airborne base stations for load transfer, enabling stable and reliable energy reallocation
results: Improves the reliability and flexibility of base stations, reducing BS power outages and the number of drone exchanges.

Abstract
In recent years, the cellular industry has witnessed a major evolution in communication technologies. It is evident that the Next Generation of cellular networks (NGN) will play a pivotal role in the acceptance of emerging IoT applications supporting high data rates, better Quality of Service (QoS), and reduced latency. However, the deployment of NGN will introduce a power overhead on the communication infrastructure. Addressing the critical energy constraints in 5G and beyond, this study introduces an innovative load transfer method using drone-carried airborne base stations (BSs) for stable and secure power reallocation within a green micro-grid network. This method effectively manages energy deficit by transferring aerial BSs from high to low-energy cells, depending on user density and the availability of aerial BSs, optimizing power distribution in advanced cellular networks. The complexity of the proposed system is significantly lower as compared to existing power cable transmission systems currently employed in powering the BSs. Furthermore, our proposed algorithm has been shown to reduce BS power outages while requiring a minimum number of drone exchanges. We have conducted a thorough review of a real-world dataset to prove the efficacy of our proposed approach to supporting BSs during high load demand times.
Pointer Networks with Q-Learning for OP Combinatorial Optimization
results: Superior performance on the OP.

Abstract
The Orienteering Problem (OP) presents a unique challenge in combinatorial optimization, emphasized by its widespread use in logistics, delivery, and transportation planning. Given the NP-hard nature of OP, obtaining optimal solutions is inherently complex. While Pointer Networks (Ptr-Nets) have exhibited prowess in various combinatorial tasks, their performance in the context of OP leaves room for improvement. Recognizing the potency of Q-learning, especially when paired with deep neural structures, this research unveils the Pointer Q-Network (PQN). This innovative method combines Ptr-Nets and Q-learning, effectively addressing the specific challenges presented by OP. We deeply explore the architecture and efficiency of PQN, showcasing its superior capability in managing OP situations.
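A hypothetical sketch of one plausible reading of PQN: pointer-style attention scores over the remaining nodes are interpreted as Q-values, so a TD target (e.g., $r + \gamma \max_{a'} Q(s', a')$) can train the pointer. The architecture details below are assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

class PointerQHead(nn.Module):
    """Additive (Bahdanau-style) pointer attention read as Q-values."""
    def __init__(self, dim: int):
        super().__init__()
        self.w_q = nn.Linear(dim, dim)  # projects the decoder state (query)
        self.w_k = nn.Linear(dim, dim)  # projects candidate-node embeddings (keys)
        self.v = nn.Linear(dim, 1)

    def forward(self, state: torch.Tensor, nodes: torch.Tensor,
                visited: torch.Tensor) -> torch.Tensor:
        # state: (B, D); nodes: (B, N, D); visited: (B, N) bool mask.
        scores = self.v(torch.tanh(
            self.w_q(state)[:, None, :] + self.w_k(nodes))).squeeze(-1)  # (B, N)
        # Masked pointer scores act as Q(s, a) over feasible next nodes;
        # greedy or epsilon-greedy selection then picks the next node to visit.
        return scores.masked_fill(visited, float("-inf"))
```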
An adaptive standardisation model for Day-Ahead electricity price forecasting
methods: Introducing adaptive standardisation to mitigate dataset shifts and improve forecasting performance
results: Significant improvement in forecasting accuracy across four markets, including two novel datasets, using less complex and widely accepted learning algorithms.

Abstract
The study of Day-Ahead prices in the electricity market is one of the most popular problems in time series forecasting. Previous research has focused on employing increasingly complex learning algorithms to capture the sophisticated dynamics of the market. However, there is a threshold where increased complexity fails to yield substantial improvements. In this work, we propose an alternative approach by introducing an adaptive standardisation to mitigate the effects of dataset shifts that commonly occur in the market. By doing so, learning algorithms can prioritize uncovering the true relationship between the target variable and the explanatory variables. We investigate four distinct markets, including two novel datasets, previously unexplored in the literature. These datasets provide a more realistic representation of the current market context, which conventional datasets do not show. The results demonstrate a significant improvement across all four markets, using learning algorithms that are less complex yet widely accepted in the literature. This significant advancement opens up new lines of research in this field, highlighting the potential of adaptive transformations in enhancing the performance of forecasting models.
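A minimal sketch of adaptive standardisation: each point is normalised with statistics from a trailing window, so the transform tracks dataset shift. The window length and exact scheme are our assumptions, not the paper's specification.

```python
import numpy as np

def adaptive_standardise(prices: np.ndarray, window: int = 168) -> np.ndarray:
    """Z-score each point against the preceding `window` observations."""
    out = np.zeros_like(prices, dtype=float)
    for t in range(1, len(prices)):
        hist = prices[max(0, t - window):t]
        mu, sigma = hist.mean(), hist.std() + 1e-8  # guard against zero std
        out[t] = (prices[t] - mu) / sigma
    return out

# A forecaster is trained on the standardised series; forecasts are mapped
# back to price space with the same rolling statistics.
```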
Steady-State Analysis of Queues with Hawkes Arrival and Its Application to Online Learning for Hawkes Queues
methods: Uses novel coupling techniques to establish finite moment bounds for the workload and busy-period processes, and proves that these queueing processes converge exponentially fast to their stationary distributions.
results: Building on these theoretical results, develops an efficient data-driven numerical algorithm for the optimal staffing problem in Hawkes queues, and finds that staffing for Hawkes queues differs sharply from the classic GI/GI/1 model, especially in the heavy-traffic regime.

Abstract
We investigate the long-run behavior of single-server queues with Hawkes arrivals and general service distributions and related optimization problems. In detail, utilizing novel coupling techniques, we establish finite moment bounds for the stationary distribution of the workload and busy period processes. In addition, we are able to show that, those queueing processes converge exponentially fast to their stationary distribution. Based on these theoretic results, we develop an efficient numerical algorithm to solve the optimal staffing problem for the Hawkes queues in a data-driven manner. Numerical results indicate a sharp difference in staffing for Hawkes queues, compared to the classic GI/GI/1 model, especially in the heavy-traffic regime.
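For readers unfamiliar with the arrival process: a Hawkes process is self-exciting, with conditional intensity (standard definition; the exponential kernel is one common choice)

$$\lambda(t) = \lambda_{0} + \sum_{t_{i} < t} \phi(t - t_{i}), \qquad \text{e.g. } \phi(s) = \alpha e^{-\beta s} \ \text{with } \alpha/\beta < 1 \ \text{for stability},$$

so each arrival temporarily raises the arrival rate. This dependence between inter-arrival times is what breaks the renewal structure assumed by the classic GI/GI/1 model.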
Temporal Treasure Hunt: Content-based Time Series Retrieval System for Discovering Insights
results: For content-based retrieval over a time series database spanning multiple domains, the new distance learning model performs strongly and outperforms existing methods.

Abstract
Time series data is ubiquitous across various domains such as finance, healthcare, and manufacturing, but their properties can vary significantly depending on the domain they originate from. The ability to perform Content-based Time Series Retrieval (CTSR) is crucial for identifying unknown time series examples. However, existing CTSR works typically focus on retrieving time series from a single domain database, which can be inadequate if the user does not know the source of the query time series. This limitation motivates us to investigate the CTSR problem in a scenario where the database contains time series from multiple domains. To facilitate this investigation, we introduce a CTSR benchmark dataset that comprises time series data from a variety of domains, such as motion, power demand, and traffic. This dataset is sourced from a publicly available time series classification dataset archive, making it easily accessible to researchers in the field. We compare several popular methods for modeling and retrieving time series data using this benchmark dataset. Additionally, we propose a novel distance learning model that outperforms the existing methods. Overall, our study highlights the importance of addressing the CTSR problem across multiple domains and provides a useful benchmark dataset for future research.
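For orientation, a generic baseline such a benchmark would compare against is z-normalised Euclidean nearest-neighbour retrieval; the sketch below is this baseline, not the paper's proposed distance learning model.

```python
import numpy as np

def znorm(ts: np.ndarray) -> np.ndarray:
    """Standardise a series so retrieval is invariant to offset and scale."""
    return (ts - ts.mean()) / (ts.std() + 1e-8)

def retrieve(query: np.ndarray, database: list[np.ndarray], k: int = 5) -> list[int]:
    """Indices of the k series nearest the query (equal lengths assumed)."""
    q = znorm(query)
    dists = [float(np.linalg.norm(q - znorm(ts))) for ts in database]
    return list(np.argsort(dists)[:k])
```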
Fast Minimization of Expected Logarithmic Loss via Stochastic Dual Averaging
paper_authors: Chung-En Tsai, Hao-Chung Cheng, Yen-Huan Li
for: Minimizing an expected logarithmic loss over either the probability simplex or the set of quantum density matrices.
methods: A stochastic first-order algorithm with the logarithmic barrier, named $B$-sample stochastic dual averaging.
results: Attains an $\varepsilon$-optimal solution in $\tilde{O}(d^2/\varepsilon^2)$ time for the Poisson inverse problem and in $\tilde{O}(d^3/\varepsilon^2)$ time for maximum-likelihood quantum state tomography, improving on existing stochastic first-order methods by a factor of $d^{\omega-2}$ and on batch methods by a factor of $d^2$.

Abstract
Consider the problem of minimizing an expected logarithmic loss over either the probability simplex or the set of quantum density matrices. This problem encompasses tasks such as solving the Poisson inverse problem, computing the maximum-likelihood estimate for quantum state tomography, and approximating positive semi-definite matrix permanents with the currently tightest approximation ratio. Although the optimization problem is convex, standard iteration complexity guarantees for first-order methods do not directly apply due to the absence of Lipschitz continuity and smoothness in the loss function. In this work, we propose a stochastic first-order algorithm named $B$-sample stochastic dual averaging with the logarithmic barrier. For the Poisson inverse problem, our algorithm attains an $\varepsilon$-optimal solution in $\tilde{O} (d^2/\varepsilon^2)$ time, matching the state of the art. When computing the maximum-likelihood estimate for quantum state tomography, our algorithm yields an $\varepsilon$-optimal solution in $\tilde{O} (d^3/\varepsilon^2)$ time, where $d$ denotes the dimension. This improves on the time complexities of existing stochastic first-order methods by a factor of $d^{\omega-2}$ and those of batch methods by a factor of $d^2$, where $\omega$ denotes the matrix multiplication exponent. Numerical experiments demonstrate that empirically, our algorithm outperforms existing methods with explicit complexity guarantees.
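The expected log-loss problems named above share the standard form

$$\min_{x \in \Delta_{d}} \ \mathbb{E}_{a}\big[ -\log \langle a, x \rangle \big], \qquad \Delta_{d} = \Big\{ x \in \mathbb{R}_{\ge 0}^{d} : \sum_{i=1}^{d} x_{i} = 1 \Big\},$$

with the quantum case replacing the simplex by density matrices $\rho \succeq 0$ with $\operatorname{tr} \rho = 1$ and the inner product by $\operatorname{tr}(A \rho)$. The lack of Lipschitz continuity and smoothness noted above comes from the logarithm blowing up as $\langle a, x \rangle \to 0$ near the boundary.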
High-dimensional Bid Learning for Energy Storage Bidding in Energy Markets
for: optimize the profitability of Energy Storage Systems (ESSs) in electricity markets with high volatility
methods: modify the common reinforcement learning (RL) process with a new bid representation method called Neural Network Embedded Bids (NNEBs), which represents market bids as monotonic neural networks with discrete outputs
results: Achieves 18% higher profit than the baseline and up to 78% of the optimal market bidder's profit through experiments on real-world market datasets.

Abstract
With the growing penetration of renewable energy resources, electricity market prices have exhibited greater volatility. Therefore, it is important for Energy Storage Systems (ESSs) to leverage the multidimensional nature of energy market bids to maximize profitability. However, current learning methods cannot fully utilize the high-dimensional price-quantity bids in the energy markets. To address this challenge, we modify the common reinforcement learning (RL) process by proposing a new bid representation method called Neural Network Embedded Bids (NNEBs). NNEBs refer to market bids that are represented by monotonic neural networks with discrete outputs. To achieve effective learning of NNEBs, we first learn a neural network as a strategic mapping from the market price to ESS power output with RL. Then, we re-train the network with two training modifications to make the network output monotonic and discrete. Finally, the neural network is equivalently converted into a high-dimensional bid for bidding. We conducted experiments over real-world market datasets. Our studies show that the proposed method achieves 18% higher profit than the baseline and up to 78% profit of the optimal market bidder.
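A sketch of a monotonic network with discrete outputs in the spirit of the NNEB definition: non-negative effective weights plus monotone activations make power non-decreasing in price, and the output is snapped to a discrete grid. The architecture and discretisation details are our assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotonicBid(nn.Module):
    def __init__(self, hidden: int = 32, max_power: float = 1.0, levels: int = 10):
        super().__init__()
        self.l1 = nn.Linear(1, hidden)
        self.l2 = nn.Linear(hidden, 1)
        self.max_power, self.levels = max_power, levels

    def forward(self, price: torch.Tensor) -> torch.Tensor:
        # price: (B, 1). Softplus keeps effective weights non-negative, so the
        # composition of monotone layers is non-decreasing in price.
        h = torch.tanh(F.linear(price, F.softplus(self.l1.weight), self.l1.bias))
        q = torch.sigmoid(F.linear(h, F.softplus(self.l2.weight), self.l2.bias))
        # Discrete power levels, per the NNEB requirement of discrete outputs.
        return torch.round(q * self.levels) / self.levels * self.max_power
```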
Preliminary Analysis on Second-Order Convergence for Biased Policy Gradient Methods
results: Presents preliminary results on the convergence of biased policy gradient algorithms to second-order stationary points, proved using techniques from nonconvex optimization; future work aims to provide the first finite-time second-order convergence analysis for actor-critic algorithms.

Abstract
Although the convergence of policy gradient algorithms to first-order stationary points is well-established, the objective functions of reinforcement learning problems are typically highly nonconvex. Therefore, recent work has focused on two extensions: ``global" convergence guarantees under regularity assumptions on the function structure, and second-order guarantees for escaping saddle points and convergence to true local minima. Our work expands on the latter approach, avoiding the restrictive assumptions of the former that may not apply to general objective functions. Existing results on vanilla policy gradient only consider an unbiased gradient estimator, but practical implementations under the infinite-horizon discounted setting, including both Monte-Carlo methods and actor-critic methods, involve gradient descent updates with a biased gradient estimator. We present preliminary results on the convergence of biased policy gradient algorithms to second-order stationary points, leveraging proof techniques from nonconvex optimization. In our next steps we aim to provide the first finite-time second-order convergence analysis for actor-critic algorithms.
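For reference, the standard notion targeted here: for an objective $J$ with $\rho$-Lipschitz Hessian, $\theta$ is an $\epsilon$-approximate second-order stationary point (of a minimization problem) when

$$\|\nabla J(\theta)\| \le \epsilon \qquad \text{and} \qquad \lambda_{\min}\big(\nabla^{2} J(\theta)\big) \ge -\sqrt{\rho\, \epsilon},$$

which rules out strict saddle points and makes $\theta$ an approximate local minimum.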