cs.AI - 2023-10-21

Ask To The Point: Open-Domain Entity-Centric Question Generation

  • paper_url: http://arxiv.org/abs/2310.14126
  • repo_url: https://github.com/liuyuxiang512/ecqg
  • paper_authors: Yuxiang Liu, Jie Huang, Kevin Chen-Chuan Chang
  • for: Introduces a new task, entity-centric question generation (ECQG), motivated by real-world applications such as topic-specific learning, assisted reading, and fact-checking.
  • methods: Proposes GenCONE, a coherent PLM-based framework with two novel modules: content focusing, which identifies a focus as "what to ask" to form draft questions, and question verification, which refines the drafts by checking answerability (a hedged sketch of this two-stage flow follows the abstract).
  • results: Extensive experiments show that GenCONE significantly and consistently outperforms various baselines, and that the two modules are effective and complementary.
    Abstract We introduce a new task called *entity-centric question generation* (ECQG), motivated by real-world applications such as topic-specific learning, assisted reading, and fact-checking. The task aims to generate questions from an entity perspective. To solve ECQG, we propose a coherent PLM-based framework GenCONE with two novel modules: content focusing and question verification. The content focusing module first identifies a focus as "what to ask" to form draft questions, and the question verification module refines the questions afterwards by verifying the answerability. We also construct a large-scale open-domain dataset from SQuAD to support this task. Our extensive experiments demonstrate that GenCONE significantly and consistently outperforms various baselines, and two modules are effective and complementary in generating high-quality questions.
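
A minimal sketch of the two-stage flow, assuming off-the-shelf Hugging Face checkpoints (t5-small, distilbert-base-cased-distilled-squad) and illustrative prompts rather than GenCONE's actual models and training:

```python
# Hypothetical ECQG pipeline: focus selection -> draft question -> verification.
from transformers import pipeline

qg = pipeline("text2text-generation", model="t5-small")  # stand-in generator
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

def ecqg(context: str, entity: str):
    # Content focusing: decide "what to ask" about the entity.
    focus = qg(f"identify focus: entity: {entity} context: {context}")[0]["generated_text"]
    # Draft an entity-centric question conditioned on the chosen focus.
    question = qg(f"ask about {entity} ({focus}): {context}")[0]["generated_text"]
    # Question verification: keep the draft only if a QA model answers it with the entity.
    answer = qa(question=question, context=context)["answer"]
    return question if entity.lower() in answer.lower() else None
```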

Sentiment Analysis Across Multiple African Languages: A Current Benchmark

  • paper_url: http://arxiv.org/abs/2310.14120
  • repo_url: None
  • paper_authors: Saurav K. Aryal, Howard Prioleau, Surakshya Aryal
  • for: Aims to advance sentiment analysis research for African languages and to benchmark current transformer models on them.
  • methods: Uses the annotated sentiment data from the AfriSenti-SemEval Shared Task 12 to benchmark and compare state-of-the-art transformer models across 12 languages, including one-model-per-language versus single-model-all-languages training, as well as standard multilingual models and their cross-lingual transfer from non-African to African languages.
  • results: Despite progress in low-resource modeling, more data still produces better per-language models; models developed explicitly for African languages outperform the rest on all tasks; no one-model-fits-all solution exists in per-language evaluation; and for some languages with small sample sizes, a larger multilingual model can beat a dedicated per-language model.
    Abstract Sentiment analysis is a fundamental and valuable task in NLP. However, due to limitations in data and technological availability, research into sentiment analysis of African languages has been fragmented and lacking. With the recent release of the AfriSenti-SemEval Shared Task 12, hosted as a part of The 17th International Workshop on Semantic Evaluation, an annotated sentiment analysis of 14 African languages was made available. We benchmarked and compared current state-of-art transformer models across 12 languages and compared the performance of training one-model-per-language versus single-model-all-languages. We also evaluated the performance of standard multilingual models and their ability to learn and transfer cross-lingual representation from non-African to African languages. Our results show that despite work in low resource modeling, more data still produces better models on a per-language basis. Models explicitly developed for African languages outperform other models on all tasks. Additionally, no one-model-fits-all solution exists for a per-language evaluation of the models evaluated. Moreover, for some languages with a smaller sample size, a larger multilingual model may perform better than a dedicated per-language model for sentiment classification.

CLIP meets Model Zoo Experts: Pseudo-Supervision for Visual Enhancement

  • paper_url: http://arxiv.org/abs/2310.14108
  • repo_url: None
  • paper_authors: Mohammadreza Salehi, Mehrdad Farajtabar, Maxwell Horton, Fartash Faghri, Hadi Pouransari, Raviteja Vemulapalli, Oncel Tuzel, Ali Farhadi, Mohammad Rastegari, Sachin Mehta
  • for: Aims to improve the visual representations of CLIP models.
  • methods: Uses open-source task-specific vision models from model zoos to generate pseudo-labels for an uncurated, noisy image-text dataset, then trains CLIP on these pseudo-labels in addition to the standard contrastive objective (a hedged sketch of the combined objective follows the abstract).
  • results: Improves CLIP by up to 16.3% across vision tasks including segmentation, detection, depth estimation, and surface normal estimation, without compromising CLIP's existing capabilities such as promptable zero-shot classification.
    Abstract Contrastive language image pretraining (CLIP) is a standard method for training vision-language models. While CLIP is scalable, promptable, and robust to distribution shifts on image classification tasks, it lacks object localization capabilities. This paper studies the following question: Can we augment CLIP training with task-specific vision models from model zoos to improve its visual representations? Towards this end, we leverage open-source task-specific vision models to generate pseudo-labels for an uncurated and noisy image-text dataset. Subsequently, we train CLIP models on these pseudo-labels in addition to the contrastive training on image and text pairs. This simple setup shows substantial improvements of up to 16.3% across different vision tasks, including segmentation, detection, depth estimation, and surface normal estimation. Importantly, these enhancements are achieved without compromising CLIP's existing capabilities, including its proficiency in promptable zero-shot classification.
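
A hedged sketch of the combined objective: the standard CLIP contrastive loss plus auxiliary heads supervised by pseudo-labels from frozen model-zoo experts. The clip_model interface, head shapes, and loss weight w are assumptions, not the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

def training_step(clip_model, seg_head, depth_head, images, texts, experts, w=0.5):
    img_f, txt_f, feat_map = clip_model(images, texts)  # assumed interface
    # Symmetric image-text contrastive loss over the batch.
    logits = img_f @ txt_f.T / 0.07
    tgt = torch.arange(len(images), device=logits.device)
    l_clip = (F.cross_entropy(logits, tgt) + F.cross_entropy(logits.T, tgt)) / 2
    # Pseudo-supervision: frozen experts label the same uncurated images.
    with torch.no_grad():
        seg_pl = experts["segmentation"](images).argmax(1)  # (B, H, W) class map
        depth_pl = experts["depth"](images)                 # (B, 1, H, W)
    l_seg = F.cross_entropy(seg_head(feat_map), seg_pl)
    l_depth = F.l1_loss(depth_head(feat_map), depth_pl)
    return l_clip + w * (l_seg + l_depth)
```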

On the Transferability of Visually Grounded PCFGs

  • paper_url: http://arxiv.org/abs/2310.14107
  • repo_url: https://github.com/zhaoyanpeng/cpcfg
  • paper_authors: Yanpeng Zhao, Ivan Titov
  • for: Evaluates how well visually grounded grammar inducers transfer across text domains.
  • methods: Extends the VC-PCFG model so it can transfer across text domains, using a zero-shot transfer setting: the model is trained on a source domain and applied directly to target domains without further training.
  • results: The benefits of visual grounding transfer to text domains similar to the training domain but fail to transfer to remote domains; data and result analysis shows that lexicon overlap between source and target domains is the most important factor in transferability (a simple overlap metric is sketched after the abstract).
    Abstract There has been a significant surge of interest in visually grounded grammar induction in recent times. While a variety of models have been developed for the task and have demonstrated impressive performance, they have not been evaluated on text domains that are different from the training domain, so it is unclear if the improvements brought by visual groundings are transferable. Our study aims to fill this gap and assess the degree of transferability. We start by extending VC-PCFG (short for Visually-grounded Compound PCFG; Zhao and Titov, 2020) in such a way that it can transfer across text domains. We consider a zero-shot transfer learning setting where a model is trained on the source domain and is directly applied to target domains, without any further training. Our experimental results suggest that: the benefits from using visual groundings transfer to text in a domain similar to the training domain but fail to transfer to remote domains. Further, we conduct data and result analysis; we find that the lexicon overlap between the source domain and the target domain is the most important factor in the transferability of VC-PCFG.
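
Since lexicon overlap is singled out as the key predictor of transfer, here is a quick way to measure it, assuming a simple type-level definition (the paper's exact formulation may differ):

```python
def lexicon_overlap(source_sents, target_sents):
    """Fraction of target-domain word types also seen in the source domain."""
    src = {w.lower() for s in source_sents for w in s.split()}
    tgt = {w.lower() for s in target_sents for w in s.split()}
    return len(src & tgt) / max(len(tgt), 1)

# e.g. compare lexicon_overlap(train_corpus, similar_corpus)
#      with lexicon_overlap(train_corpus, remote_corpus)
```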

Revisiting Instruction Fine-tuned Model Evaluation to Guide Industrial Applications

  • paper_url: http://arxiv.org/abs/2310.14103
  • repo_url: https://github.com/manuelfay/ifteval
  • paper_authors: Manuel Faysse, Gautier Viaud, Céline Hudelot, Pierre Colombo
  • for: Investigates task-specialization strategies for deploying IFT models in practical industrial settings.
  • methods: Uses LLM-based metrics to evaluate the performance of IFT models.
  • results: Quantifies the trade-offs that emerge when deploying IFT models in industrial settings, offering practitioners actionable insights for real-world deployment.
    Abstract Instruction Fine-Tuning (IFT) is a powerful paradigm that strengthens the zero-shot capabilities of Large Language Models (LLMs), but in doing so induces new evaluation metric requirements. We show LLM-based metrics to be well adapted to these requirements, and leverage them to conduct an investigation of task-specialization strategies, quantifying the trade-offs that emerge in practical industrial settings. Our findings offer practitioners actionable insights for real-world IFT model deployment.

Stabilizing reinforcement learning control: A modular framework for optimizing over all stable behavior

  • paper_url: http://arxiv.org/abs/2310.14098
  • repo_url: None
  • paper_authors: Nathan P. Lawrence, Philip D. Loewen, Shuyuan Wang, Michael G. Forbes, R. Bhushan Gopaluni
  • for: Proposes a feedback-controller design framework that combines the optimization-driven, model-free advantages of deep reinforcement learning with stability guarantees.
  • methods: Uses the Youla-Kucera parameterization to define the search domain, and constructs a data-driven internal model via behavioral systems; the stability of such data-driven models is analyzed in the presence of noise.
  • results: Gives the set of all stable linear operators explicitly through a matrix factorization approach, plus a nonlinear extension that parameterizes a set of stable operators with a neural network, enabling seamless integration with standard deep learning libraries (a toy stable-parameterization sketch follows the abstract); finally, shows how these ideas can also tune fixed-structure controllers.
    Abstract We propose a framework for the design of feedback controllers that combines the optimization-driven and model-free advantages of deep reinforcement learning with the stability guarantees provided by using the Youla-Kucera parameterization to define the search domain. Recent advances in behavioral systems allow us to construct a data-driven internal model; this enables an alternative realization of the Youla-Kucera parameterization based entirely on input-output exploration data. Perhaps of independent interest, we formulate and analyze the stability of such data-driven models in the presence of noise. The Youla-Kucera approach requires a stable "parameter" for controller design. For the training of reinforcement learning agents, the set of all stable linear operators is given explicitly through a matrix factorization approach. Moreover, a nonlinear extension is given using a neural network to express a parameterized set of stable operators, which enables seamless integration with standard deep learning libraries. Finally, we show how these ideas can also be applied to tune fixed-structure controllers.
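
The Youla-Kucera approach requires a stable "parameter". A toy sketch of one sufficient construction: constrain the spectral norm of a learned discrete-time state matrix below one, so the operator is guaranteed stable. (The paper's matrix factorization characterizes the full set of stable linear operators, which this simple scaling does not.)

```python
import torch
import torch.nn as nn

class StableLinear(nn.Module):
    """Learnable matrix A with ||A||_2 <= 0.99 < 1, hence rho(A) < 1."""
    def __init__(self, n: int):
        super().__init__()
        self.M = nn.Parameter(torch.randn(n, n) / n**0.5)

    def forward(self) -> torch.Tensor:
        sigma_max = torch.linalg.matrix_norm(self.M, ord=2)
        # Rescale only when the spectral norm would exceed 1.
        return 0.99 * self.M / torch.clamp(sigma_max, min=1.0)
```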

Learning Reward for Physical Skills using Large Language Model

  • paper_url: http://arxiv.org/abs/2310.14092
  • repo_url: None
  • paper_authors: Yuwei Zeng, Yiqing Xu
  • for: Learning reward functions for physical skills is challenging due to the vast spectrum of skills, the high dimensionality of state and action spaces, and nuanced sensory feedback; acquiring expert demonstrations is costly and time-consuming.
  • methods: Extracts task-related knowledge from Large Language Models (LLMs) to propose effective reward functions. The approach has two components: the LLM first proposes features and a parameterization of the reward function; the parameters are then updated through an iterative self-alignment process that minimizes ranking inconsistency with the LLM given new observations (see the sketch after the abstract).
  • results: Validated on three simulated physical-skill learning tasks, demonstrating the effectiveness of the design choices.
    Abstract Learning reward functions for physical skills are challenging due to the vast spectrum of skills, the high-dimensionality of state and action space, and nuanced sensory feedback. The complexity of these tasks makes acquiring expert demonstration data both costly and time-consuming. Large Language Models (LLMs) contain valuable task-related knowledge that can aid in learning these reward functions. However, the direct application of LLMs for proposing reward functions has its limitations such as numerical instability and inability to incorporate the environment feedback. We aim to extract task knowledge from LLMs using environment feedback to create efficient reward functions for physical skills. Our approach consists of two components. We first use the LLM to propose features and parameterization of the reward function. Next, we update the parameters of this proposed reward function through an iterative self-alignment process. In particular, this process minimizes the ranking inconsistency between the LLM and our learned reward functions based on the new observations. We validated our method by testing it on three simulated physical skill learning tasks, demonstrating effective support for our design choices.
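
One illustrative self-alignment update, assuming a linear reward r(s) = θ·φ(s) and LLM-provided pairwise trajectory rankings; the Bradley-Terry-style loss is our stand-in for the paper's ranking-inconsistency objective:

```python
import numpy as np

def self_alignment_step(theta, feats, llm_pref, lr=0.1):
    """One gradient step aligning the learned reward with LLM rankings.

    feats: per-trajectory feature vectors phi(s); llm_pref: list of (i, j)
    pairs meaning the LLM ranks trajectory i above trajectory j."""
    grad = np.zeros_like(theta)
    for i, j in llm_pref:
        diff = feats[i] - feats[j]
        p_ij = 1.0 / (1.0 + np.exp(-theta @ diff))  # P(reward agrees with LLM)
        grad += (p_ij - 1.0) * diff                 # gradient of -log p_ij
    return theta - lr * grad / max(len(llm_pref), 1)
```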

To Copy, or not to Copy; That is a Critical Issue of the Output Softmax Layer in Neural Sequential Recommenders

  • paper_url: http://arxiv.org/abs/2310.14079
  • repo_url: https://github.com/iesl/softmax_cpr_recommend
  • paper_authors: Haw-Shiuan Chang, Nikhil Agarwal, Andrew McCallum
  • for: Improving how neural sequential recommenders handle repeated items.
  • methods: Adapts recently proposed softmax alternatives such as softmax-CPR to sequential recommendation, modifying the output softmax layer to address problems caused by the single hidden-state embedding and static item embeddings (an illustrative pointer-style mixture follows the abstract).
  • results: Consistent improvements on 12 datasets; with simple modifications to the output softmax layer of GRU4Rec, average NDCG@10 improves by 10% (4%-17% individually) on 5 datasets with duplicated items and by 24% (8%-39%) on 7 datasets without duplicated items.
    Abstract Recent studies suggest that the existing neural models have difficulty handling repeated items in sequential recommendation tasks. However, our understanding of this difficulty is still limited. In this study, we substantially advance this field by identifying a major source of the problem: the single hidden state embedding and static item embeddings in the output softmax layer. Specifically, the similarity structure of the global item embeddings in the softmax layer sometimes forces the single hidden state embedding to be close to new items when copying is a better choice, while sometimes forcing the hidden state to be close to the items from the input inappropriately. To alleviate the problem, we adapt the recently-proposed softmax alternatives such as softmax-CPR to sequential recommendation tasks and demonstrate that the new softmax architectures unleash the capability of the neural encoder on learning when to copy and when to exclude the items from the input sequence. By only making some simple modifications on the output softmax layer for SASRec and GRU4Rec, softmax-CPR achieves consistent improvement in 12 datasets. With almost the same model size, our best method not only improves the average NDCG@10 of GRU4Rec in 5 datasets with duplicated items by 10% (4%-17% individually) but also improves 7 datasets without duplicated items by 24% (8%-39%)!
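
To make the copy-versus-generate tension concrete, here is an illustrative pointer-style mixture over the output vocabulary — not softmax-CPR itself: a learned gate blends a global softmax over the whole catalog with a distribution restricted to items already in the input sequence:

```python
import torch
import torch.nn.functional as F

def copy_generate_log_probs(h, item_emb, input_ids, copy_gate):
    """h: (B, D) hidden states; item_emb: (V, D); input_ids: (B, L) items seen
    so far; copy_gate: (D, 1) learned gate vector. Returns (B, V) log-probs."""
    gen = h @ item_emb.T                   # scores over the full catalog
    p_copy = torch.sigmoid(h @ copy_gate)  # (B, 1): repeat vs. explore
    copy_scores = torch.full_like(gen, float("-inf"))
    copy_scores.scatter_(1, input_ids, gen.gather(1, input_ids))  # seen items only
    mix = p_copy * F.softmax(copy_scores, dim=-1) + (1 - p_copy) * F.softmax(gen, dim=-1)
    return torch.log(mix + 1e-9)
```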

Convolutional Bidirectional Variational Autoencoder for Image Domain Translation of Dotted Arabic Expiration

  • paper_url: http://arxiv.org/abs/2310.14069
  • repo_url: None
  • paper_authors: Ahmed Zidane, Ghada Soliman
  • for: Proposes a Ladder Bottom-up Convolutional Bidirectional Variational Autoencoder (LCBVAE) architecture for the encoder and decoder, translating dotted Arabic expiration dates into filled-in expiration dates.
  • methods: Employs a customized and adapted CRNN model, trained on filled-in images spanning 2019 to 2027, to extract the dates and assess LCBVAE's performance on expiration-date recognition; synthetic training images are rendered with a custom Arabic dot-matrix TrueType font (a rendering sketch follows the abstract).
  • results: Increasing the latent bottleneck size up to 1024 improves generalization in downstream transfer-learning tasks, and the approach achieves 97% accuracy on image translation; the method generalizes to downstream tasks such as image translation and reconstruction.
    Abstract This paper proposes an approach of Ladder Bottom-up Convolutional Bidirectional Variational Autoencoder (LCBVAE) architecture for the encoder and decoder, which is trained on the image translation of the dotted Arabic expiration dates by reconstructing the Arabic dotted expiration dates into filled-in expiration dates. We employed a customized and adapted version of the Convolutional Recurrent Neural Network (CRNN) model to meet our specific requirements and enhance its performance in our context, and then trained the custom CRNN model with the filled-in images from the year 2019 to 2027 to extract the expiration dates and assess the model performance of LCBVAE on expiration date recognition. The pipeline of (LCBVAE+CRNN) can then be integrated into automated sorting systems for extracting the expiry dates and sorting the products accordingly during the manufacturing stage. Additionally, it can replace the manual entry of expiration dates, which can be time-consuming and inefficient for merchants. Due to the lack of availability of dotted Arabic expiration date images, we created an Arabic dot-matrix True Type Font (TTF) for the generation of the synthetic images. We trained the model with unrealistic synthetic dates of 59902 images and performed the testing on a realistic synthetic date of 3287 images from the year 2019 to 2027, represented as yyyy/mm/dd. In our study, we demonstrated the significance of the latent bottleneck layer in improving generalization when its size is increased up to 1024 in downstream transfer learning tasks such as image translation. The proposed approach achieved an accuracy of 97% on the image translation using the LCBVAE architecture, which can be generalized to any downstream learning tasks such as image translation and reconstruction.
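
A hedged sketch of the synthetic-data step: rendering random yyyy/mm/dd dates with a custom dot-matrix TrueType font via Pillow. The font path and layout parameters are placeholders:

```python
import random
from datetime import date, timedelta
from PIL import Image, ImageDraw, ImageFont

font = ImageFont.truetype("arabic_dot_matrix.ttf", 32)  # assumed custom TTF path

def synth_date_image(start=date(2019, 1, 1), end=date(2027, 12, 31)):
    d = start + timedelta(days=random.randrange((end - start).days))
    text = d.strftime("%Y/%m/%d")  # yyyy/mm/dd, as in the paper
    img = Image.new("L", (220, 48), color=255)  # white grayscale canvas
    ImageDraw.Draw(img).text((8, 8), text, font=font, fill=0)
    return img, text
```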

MOELoRA: An MOE-based Parameter Efficient Fine-Tuning Method for Multi-task Medical Applications

  • paper_url: http://arxiv.org/abs/2310.18339
  • repo_url: https://github.com/liuqidong07/moelora-peft
  • paper_authors: Qidong Liu, Xian Wu, Xiangyu Zhao, Yuanshao Zhu, Derong Xu, Feng Tian, Yefeng Zheng
  • for: Fine-tuning large language models (LLMs) for medical systems, where real-world scenarios involve many distinct tasks.
  • methods: Proposes MOELoRA, a parameter-efficient fine-tuning framework that combines MOE-style multi-task learning with LoRA-style parameter efficiency. Each expert is a pair of low-rank matrices, keeping the trainable parameter count small, and a task-motivated gate function regulates each expert's contribution to produce task-specific parameters (a minimal layer sketch follows the abstract).
  • results: Extensive experiments on a public multi-task Chinese medical dataset show that MOELoRA outperforms existing parameter-efficient fine-tuning methods.
    Abstract The recent surge in the field of Large Language Models (LLMs) has gained significant attention in numerous domains. In order to tailor an LLM to a specific domain such as a web-based healthcare system, fine-tuning with domain knowledge is necessary. However, two issues arise during fine-tuning LLMs for medical applications. The first is the problem of task variety, where there are numerous distinct tasks in real-world medical scenarios. This diversity often results in suboptimal fine-tuning due to data imbalance and seesawing problems. Additionally, the high cost of fine-tuning can be prohibitive, impeding the application of LLMs. The large number of parameters in LLMs results in enormous time and computational consumption during fine-tuning, which is difficult to justify. To address these two issues simultaneously, we propose a novel parameter-efficient fine-tuning framework for multi-task medical applications called MOELoRA. The framework aims to capitalize on the benefits of both MOE for multi-task learning and LoRA for parameter-efficient fine-tuning. To unify MOE and LoRA, we devise multiple experts as the trainable parameters, where each expert consists of a pair of low-rank matrices to maintain a small number of trainable parameters. Additionally, we propose a task-motivated gate function for all MOELoRA layers that can regulate the contributions of each expert and generate distinct parameters for various tasks. To validate the effectiveness and practicality of the proposed method, we conducted comprehensive experiments on a public multi-task Chinese medical dataset. The experimental results demonstrate that MOELoRA outperforms existing parameter-efficient fine-tuning methods. The implementation is available online for convenient reproduction of our experiments.
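
A minimal sketch of a MOELoRA-style layer: a frozen base nn.Linear plus several low-rank (A, B) expert pairs mixed by a task-conditioned gate. Dimensions and the embedding-based gate form are our assumptions:

```python
import torch
import torch.nn as nn

class MOELoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, n_experts=4, n_tasks=8, rank=8, alpha=16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # only experts and gate are trained
        d_in, d_out = base.in_features, base.out_features
        self.A = nn.Parameter(torch.randn(n_experts, rank, d_in) * 0.01)  # down-proj
        self.B = nn.Parameter(torch.zeros(n_experts, d_out, rank))        # up-proj
        self.gate = nn.Embedding(n_tasks, n_experts)  # task-motivated gate
        self.scale = alpha / rank

    def forward(self, x, task_id):
        w = torch.softmax(self.gate(task_id), dim=-1)   # (B, E) expert weights
        low = torch.einsum("erd,bd->ber", self.A, x)    # (B, E, r)
        up = torch.einsum("eor,ber->beo", self.B, low)  # (B, E, d_out)
        delta = torch.einsum("be,beo->bo", w, up)       # gate-weighted mixture
        return self.base(x) + self.scale * delta
```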

On the Neural Tangent Kernel of Equilibrium Models

  • paper_url: http://arxiv.org/abs/2310.14062
  • repo_url: None
  • paper_authors: Zhili Feng, J. Zico Kolter
  • for: Studies the neural tangent kernel (NTK) of deep equilibrium (DEQ) models.
  • methods: Uses root-finding, the same mechanism DEQs use to compute the infinite-depth limit of a weight-tied network, to find the DEQ model's deterministic NTK efficiently.
  • results: Although the NTK of a fully-connected network can be stochastic when width and depth tend to infinity simultaneously, a DEQ model still enjoys a deterministic NTK under mild conditions, and this NTK can be found efficiently via root-finding (a fixed-point sketch follows the abstract).
    Abstract This work studies the neural tangent kernel (NTK) of the deep equilibrium (DEQ) model, a practical ``infinite-depth'' architecture which directly computes the infinite-depth limit of a weight-tied network via root-finding. Even though the NTK of a fully-connected neural network can be stochastic if its width and depth both tend to infinity simultaneously, we show that contrarily a DEQ model still enjoys a deterministic NTK despite its width and depth going to infinity at the same time under mild conditions. Moreover, this deterministic NTK can be found efficiently via root-finding.
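
A toy illustration of the root-finding at the heart of DEQs: solving z* = f(z*, x) for a weight-tied layer, here by plain fixed-point iteration (real DEQs typically use faster root-finders such as Broyden's method):

```python
import torch

def deq_fixed_point(f, x, z0, tol=1e-5, max_iter=100):
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x)
        if (z_next - z).norm() < tol * (z.norm() + 1e-8):
            return z_next
        z = z_next
    return z

# Example: a weight-tied tanh layer with ||W||_2 < 1 (w.h.p.), so f is a
# contraction in z and the iteration converges.
d = 16
W = 0.3 * torch.randn(d, d) / d**0.5
U = torch.randn(d, d) / d**0.5
f = lambda z, x: torch.tanh(z @ W.T + x @ U.T)
z_star = deq_fixed_point(f, torch.randn(1, d), torch.zeros(1, d))
```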

Composer Style-specific Symbolic Music Generation Using Vector Quantized Discrete Diffusion Models

  • paper_url: http://arxiv.org/abs/2310.14044
  • repo_url: None
  • paper_authors: Jincheng Zhang, Jingjing Tang, Charalampos Saitis, György Fazekas
  • for: Applies a vector quantized variational autoencoder (VQ-VAE) and discrete diffusion models to generate symbolic music with desired composer styles.
  • methods: Uses a VQ-VAE to represent symbolic music as a sequence of indexes into a learned codebook, then trains a discrete diffusion model over the VQ-VAE's discrete latent space; generated index sequences are decoded back to symbolic music by the VQ-VAE decoder (see the quantization sketch after the abstract).
  • results: Experiments show the proposed method generates symbolic music in target composer styles with an accuracy of 72.36%.
    Abstract Emerging Denoising Diffusion Probabilistic Models (DDPM) have become increasingly utilised because of promising results they have achieved in diverse generative tasks with continuous data, such as image and sound synthesis. Nonetheless, the success of diffusion models has not been fully extended to discrete symbolic music. We propose to combine a vector quantized variational autoencoder (VQ-VAE) and discrete diffusion models for the generation of symbolic music with desired composer styles. The trained VQ-VAE can represent symbolic music as a sequence of indexes that correspond to specific entries in a learned codebook. Subsequently, a discrete diffusion model is used to model the VQ-VAE's discrete latent space. The diffusion model is trained to generate intermediate music sequences consisting of codebook indexes, which are then decoded to symbolic music using the VQ-VAE's decoder. The results demonstrate our model can generate symbolic music with target composer styles that meet the given conditions with a high accuracy of 72.36%.
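
The codebook lookup that turns music into discrete tokens, sketched under standard VQ-VAE assumptions; the discrete diffusion model is then trained over the resulting index sequences:

```python
import torch

def vq_quantize(z_e, codebook):
    """z_e: (B, D) encoder outputs; codebook: (K, D). Returns code indexes
    and quantized vectors with a straight-through gradient estimator."""
    d = (z_e.pow(2).sum(1, keepdim=True)
         - 2 * z_e @ codebook.T
         + codebook.pow(2).sum(1))      # squared distances (B, K)
    idx = d.argmin(dim=1)               # one discrete token per vector
    z_q = codebook[idx]
    z_q = z_e + (z_q - z_e).detach()    # gradients flow to the encoder
    return idx, z_q
```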

Fast Diffusion GAN Model for Symbolic Music Generation Controlled by Emotions

  • paper_url: http://arxiv.org/abs/2310.14040
  • repo_url: None
  • paper_authors: Jincheng Zhang, György Fazekas, Charalampos Saitis
  • for: Controls symbolic music generation toward a target emotion by combining a diffusion model with a Generative Adversarial Network (GAN), while mitigating the slow sampling of diffusion models.
  • methods: First uses a trained variational autoencoder to obtain embeddings of an emotion-labeled symbolic music dataset, then trains a diffusion model on those embeddings.
  • results: The model successfully steers generated symbolic music toward the desired emotion and achieves several orders of magnitude improvement in computational cost, requiring only four denoising time steps versus the thousands needed by current state-of-the-art diffusion models for symbolic music generation.
    Abstract Diffusion models have shown promising results for a wide range of generative tasks with continuous data, such as image and audio synthesis. However, little progress has been made on using diffusion models to generate discrete symbolic music because this new class of generative models are not well suited for discrete data while its iterative sampling process is computationally expensive. In this work, we propose a diffusion model combined with a Generative Adversarial Network, aiming to (i) alleviate one of the remaining challenges in algorithmic music generation which is the control of generation towards a target emotion, and (ii) mitigate the slow sampling drawback of diffusion models applied to symbolic music generation. We first used a trained Variational Autoencoder to obtain embeddings of a symbolic music dataset with emotion labels and then used those to train a diffusion model. Our results demonstrate the successful control of our diffusion model to generate symbolic music with a desired emotion. Our model achieves several orders of magnitude improvement in computational cost, requiring merely four time steps to denoise while the steps required by current state-of-the-art diffusion models for symbolic music generation is in the order of thousands.

Small Language Models Fine-tuned to Coordinate Larger Language Models improve Complex Reasoning

  • paper_url: http://arxiv.org/abs/2310.18338
  • repo_url: https://github.com/lcs2-iiitd/daslam
  • paper_authors: Gurusha Juneja, Subhabrata Dutta, Soumen Chakrabarti, Sunny Manchanda, Tanmoy Chakraborty
  • for: Improving the chain-of-thought reasoning of large language models (LLMs) on complex, multi-step reasoning problems.
  • methods: Uses a decomposition generator to break complex problems into subproblems requiring fewer reasoning steps, which are then answered by a solver. A relatively small (13B-parameter) LM, trained with policy-gradient optimization to interact with a black-box solver LM, serves as the decomposition generator, making the method solver-agnostic (see the loop sketch after the abstract).
  • results: With DaSLaM, a 175B-parameter LM (text-davinci-003) produces performance comparable to or better than its orders-of-magnitude larger successor GPT-4; the method is not limited by solver scale, and solver LMs of diverse sizes gain significantly from the solver-agnostic decomposition.
    Abstract Large Language Models (LLMs) prompted to generate chain-of-thought (CoT) exhibit impressive reasoning capabilities. Recent attempts at prompt decomposition toward solving complex, multi-step reasoning problems depend on the ability of the LLM to simultaneously decompose and solve the problem. A significant disadvantage is that foundational LLMs are typically not available for fine-tuning, making adaptation computationally prohibitive. We believe (and demonstrate) that problem decomposition and solution generation are distinct capabilites, better addressed in separate modules, than by one monolithic LLM. We introduce DaSLaM, which uses a decomposition generator to decompose complex problems into subproblems that require fewer reasoning steps. These subproblems are answered by a solver. We use a relatively small (13B parameters) LM as the decomposition generator, which we train using policy gradient optimization to interact with a solver LM (regarded as black-box) and guide it through subproblems, thereby rendering our method solver-agnostic. Evaluation on multiple different reasoning datasets reveal that with our method, a 175 billion parameter LM (text-davinci-003) can produce competitive or even better performance, compared to its orders-of-magnitude larger successor, GPT-4. Additionally, we show that DaSLaM is not limited by the solver's capabilities as a function of scale; e.g., solver LMs with diverse sizes give significant performance improvement with our solver-agnostic decomposition technique. Exhaustive ablation studies evince the superiority of our modular finetuning technique over exorbitantly large decomposer LLMs, based on prompting alone.
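
A schematic of the decompose-then-solve loop, with decomposer and solver as assumed callables wrapping the fine-tuned 13B LM and the black-box solver LM, respectively:

```python
def daslam_answer(question: str, decomposer, solver) -> str:
    """decomposer(question) -> list of subquestions; solver(prompt) -> text."""
    context = question
    for sq in decomposer(question):
        ans = solver(f"{context}\nSubquestion: {sq}\nAnswer:")
        context += f"\nSubquestion: {sq}\nAnswer: {ans}"  # guide the solver stepwise
    return solver(f"{context}\nNow answer the original question:")
```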

Contrast Everything: A Hierarchical Contrastive Framework for Medical Time-Series

  • paper_url: http://arxiv.org/abs/2310.14017
  • repo_url: https://github.com/dl4mhealth/comet
  • paper_authors: Yihe Wang, Yu Han, Haishuai Wang, Xiang Zhang
  • for: Improving contrastive representation learning for medical time series, reducing reliance on labor-intensive, domain-specific, and scarce expert annotations.
  • methods: Proposes COMET, an innovative hierarchical framework that captures data consistency at four inherent levels of medical time series: observation, sample, trial, and patient. Contrastive losses developed at multiple levels learn representations that preserve comprehensive data consistency in a self-supervised manner (a multi-level loss sketch follows the abstract).
  • results: In a challenging patient-independent setting, COMET is compared against six baselines on three diverse datasets (ECG signals for myocardial infarction; EEG signals for Alzheimer's and Parkinson's diseases) and consistently outperforms all baselines, particularly with 10% and 1% labeled-data fractions.
    Abstract Contrastive representation learning is crucial in medical time series analysis as it alleviates dependency on labor-intensive, domain-specific, and scarce expert annotations. However, existing contrastive learning methods primarily focus on one single data level, which fails to fully exploit the intricate nature of medical time series. To address this issue, we present COMET, an innovative hierarchical framework that leverages data consistencies at all inherent levels in medical time series. Our meticulously designed model systematically captures data consistency from four potential levels: observation, sample, trial, and patient levels. By developing contrastive loss at multiple levels, we can learn effective representations that preserve comprehensive data consistency, maximizing information utilization in a self-supervised manner. We conduct experiments in the challenging patient-independent setting. We compare COMET against six baselines using three diverse datasets, which include ECG signals for myocardial infarction and EEG signals for Alzheimer's and Parkinson's diseases. The results demonstrate that COMET consistently outperforms all baselines, particularly in setup with 10% and 1% labeled data fractions across all datasets. These results underscore the significant impact of our framework in advancing contrastive representation learning techniques for medical time series. The source code is available at https://github.com/DL4mHealth/COMET.
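
A sketch of the multi-level objective: one InfoNCE term per consistency level, summed with level weights. The level names follow the paper; the anchor/positive/negative pairing and the weights are illustrative:

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, tau=0.1):
    """anchor, positive: (B, D); negatives: (B, N, D)."""
    pos = F.cosine_similarity(anchor, positive, dim=-1) / tau                # (B,)
    neg = F.cosine_similarity(anchor.unsqueeze(1), negatives, dim=-1) / tau  # (B, N)
    logits = torch.cat([pos.unsqueeze(1), neg], dim=1)
    return F.cross_entropy(logits, torch.zeros(len(anchor), dtype=torch.long))

def comet_style_loss(views, weights=(1.0, 1.0, 1.0, 1.0)):
    """views maps each level name to (anchor, positive, negatives) embeddings."""
    levels = ("observation", "sample", "trial", "patient")
    return sum(w * info_nce(*views[lv]) for w, lv in zip(weights, levels))
```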

One is More: Diverse Perspectives within a Single Network for Efficient DRL

  • paper_url: http://arxiv.org/abs/2310.14009
  • repo_url: None
  • paper_authors: Yiqin Tan, Ling Pan, Longbo Huang
  • for: Improving the sample efficiency and robustness of deep reinforcement learning.
  • methods: Introduces OMNet, a learning paradigm that uses multiple subnetworks within a single network, each producing a different output, with a systematic pipeline covering initialization, training, and sampling; it can be applied to various DRL algorithms with minimal additional overhead (a minimal multi-head sketch follows the abstract).
  • results: Comprehensive evaluation on the MuJoCo benchmark shows that OMNet strikes an effective balance between performance and computational cost.
    Abstract Deep reinforcement learning has achieved remarkable performance in various domains by leveraging deep neural networks for approximating value functions and policies. However, using neural networks to approximate value functions or policy functions still faces challenges, including low sample efficiency and overfitting. In this paper, we introduce OMNet, a novel learning paradigm utilizing multiple subnetworks within a single network, offering diverse outputs efficiently. We provide a systematic pipeline, including initialization, training, and sampling with OMNet. OMNet can be easily applied to various deep reinforcement learning algorithms with minimal additional overhead. Through comprehensive evaluations conducted on MuJoCo benchmark, our findings highlight OMNet's ability to strike an effective balance between performance and computational cost.
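
A minimal sketch of the one-network/many-subnetworks idea as a shared trunk with several lightweight heads producing diverse value estimates; OMNet's actual initialization, training, and sampling pipeline is not reproduced here:

```python
import torch
import torch.nn as nn

class MultiHeadQNet(nn.Module):
    def __init__(self, obs_dim, act_dim, n_heads=5, hidden=256):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(hidden, act_dim) for _ in range(n_heads))

    def forward(self, obs):
        z = self.trunk(obs)  # shared features; extra cost per head is small
        return torch.stack([h(z) for h in self.heads], dim=1)  # (B, n_heads, A)
```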

On Bilingual Lexicon Induction with Large Language Models

  • paper_url: http://arxiv.org/abs/2310.13995
  • repo_url: https://github.com/cambridgeltl/prompt4bli
  • paper_authors: Yaoyiran Li, Anna Korhonen, Ivan Vulić
  • for: Explores the potential of the latest generation of large language models (LLMs) for developing bilingual lexicons.
  • methods: Systematically studies zero-shot prompting for unsupervised BLI, few-shot in-context prompting with seed translation pairs, and standard BLI-oriented fine-tuning of smaller LLMs, evaluating 18 open-source text-to-text multilingual LLMs (0.3B-13B parameters) on two standard BLI benchmarks.
  • results: Few-shot prompting with in-context examples from nearest neighbours achieves the best performance, establishing new state-of-the-art BLI scores for many language pairs (a prompt-construction sketch follows the abstract).
    Abstract Bilingual Lexicon Induction (BLI) is a core task in multilingual NLP that still, to a large extent, relies on calculating cross-lingual word representations. Inspired by the global paradigm shift in NLP towards Large Language Models (LLMs), we examine the potential of the latest generation of LLMs for the development of bilingual lexicons. We ask the following research question: Is it possible to prompt and fine-tune multilingual LLMs (mLLMs) for BLI, and how does this approach compare against and complement current BLI approaches? To this end, we systematically study 1) zero-shot prompting for unsupervised BLI and 2) few-shot in-context prompting with a set of seed translation pairs, both without any LLM fine-tuning, as well as 3) standard BLI-oriented fine-tuning of smaller LLMs. We experiment with 18 open-source text-to-text mLLMs of different sizes (from 0.3B to 13B parameters) on two standard BLI benchmarks covering a range of typologically diverse languages. Our work is the first to demonstrate strong BLI capabilities of text-to-text mLLMs. The results reveal that few-shot prompting with in-context examples from nearest neighbours achieves the best performance, establishing new state-of-the-art BLI scores for many language pairs. We also conduct a series of in-depth analyses and ablation studies, providing more insights on BLI with (m)LLMs, also along with their limitations.
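
A hedged sketch of the best-performing recipe, few-shot prompting with nearest-neighbour seed pairs retrieved by source-embedding similarity; the template (here English-to-French) is illustrative, not the paper's exact prompt:

```python
import numpy as np

def bli_prompt(word, seed_pairs, src_emb, k=5):
    """seed_pairs: [(src_word, tgt_word), ...]; src_emb: dict word -> vector."""
    q = src_emb[word]
    def sim(s):
        v = src_emb.get(s)
        return -1.0 if v is None else v @ q / (np.linalg.norm(v) * np.linalg.norm(q))
    nearest = sorted(seed_pairs, key=lambda p: -sim(p[0]))[:k]
    shots = "\n".join(f"The French word for '{s}' is '{t}'." for s, t in nearest)
    return f"{shots}\nThe French word for '{word}' is '"
```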

Application of deep and reinforcement learning to boundary control problems

  • paper_url: http://arxiv.org/abs/2310.15191
  • repo_url: https://github.com/zenineasa/MasterThesis
  • paper_authors: Zenin Easa Panthakkalakath, Juraj Kardoš, Olaf Schenk
  • for: Explores the use of deep learning and reinforcement learning to solve boundary control problems.
  • methods: Follows an iterative optimization strategy: a spatial neural network constructs well-informed initial guesses, and a spatio-temporal neural network learns the iterative optimization algorithm via policy gradients. Training, testing, and validation use synthetic data generated from problems formulated in the literature.
  • results: Numerical experiments indicate the method can rival the speed and accuracy of existing solvers; in preliminary results, the network attains lower costs than IPOPT, a state-of-the-art nonlinear interior-point solver, in 51% of cases, with an overall number of floating-point operations similar to IPOPT's.
    Abstract The boundary control problem is a non-convex optimization and control problem in many scientific domains, including fluid mechanics, structural engineering, and heat transfer optimization. The aim is to find the optimal values for the domain boundaries such that the enclosed domain adhering to the governing equations attains the desired state values. Traditionally, non-linear optimization methods, such as the Interior-Point method (IPM), are used to solve such problems. This project explores the possibilities of using deep learning and reinforcement learning to solve boundary control problems. We adhere to the framework of iterative optimization strategies, employing a spatial neural network to construct well-informed initial guesses, and a spatio-temporal neural network learns the iterative optimization algorithm using policy gradients. Synthetic data, generated from the problems formulated in the literature, is used for training, testing and validation. The numerical experiments indicate that the proposed method can rival the speed and accuracy of existing solvers. In our preliminary results, the network attains costs lower than IPOPT, a state-of-the-art non-linear IPM, in 51\% cases. The overall number of floating point operations in the proposed method is similar to that of IPOPT. Additionally, the informed initial guess method and the learned momentum-like behaviour in the optimizer method are incorporated to avoid convergence to local minima.

Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

  • paper_url: http://arxiv.org/abs/2310.13961
  • repo_url: https://github.com/ibm/ensemble-instruct
  • paper_authors: Young-Suk Lee, Md Arafat Sultan, Yousef El-Kurdi, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos, Ramón Fernandez Astudillo
  • for: Automatically generating instruction-tuning data to train strong conversational agents with only a small amount of human supervision.
  • methods: Builds on techniques such as Self-Instruct and Alpaca, but targets much smaller language models (10B-40B parameters) with permissive licenses, using two main ideas: (a) categorizing and simplifying the ICL templates to make prompt learning easier for the LM, and (b) ensembling over multiple LM outputs to select high-quality synthetic examples (see the selection sketch after the abstract). The algorithm leverages the 175 Self-Instruct seed tasks, with separate pipelines for instructions that require an input and those that do not.
  • results: The method yields higher-quality instruction-tuning data than Self-Instruct, improves both vanilla and instruction-tuned LMs by significant margins, and smaller instruction-tuned LMs generate more useful outputs than their larger un-tuned counterparts.
    Abstract Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision. One limitation of these approaches is that they resort to very large language models (around 175B parameters) that are also proprietary and non-public. Here we explore the application of such techniques to language models that are much smaller (around 10B--40B parameters) and have permissive licenses. We find the Self-Instruct approach to be less effective at these sizes and propose new ICL methods that draw on two main ideas: (a) Categorization and simplification of the ICL templates to make prompt learning easier for the LM, and (b) Ensembling over multiple LM outputs to help select high-quality synthetic examples. Our algorithm leverages the 175 Self-Instruct seed tasks and employs separate pipelines for instructions that require an input and instructions that do not. Empirical investigations with different LMs show that: (1) Our proposed method yields higher-quality instruction tuning data than Self-Instruct, (2) It improves performances of both vanilla and instruction-tuned LMs by significant margins, and (3) Smaller instruction-tuned LMs generate more useful outputs than their larger un-tuned counterparts. Our codebase is available at https://github.com/IBM/ensemble-instruct.
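
A sketch of the output-ensembling idea: each candidate output is scored by its average agreement with the other LMs' outputs, and the consensus candidate is kept. Using ROUGE-L agreement (via the rouge_score package) as the scorer is our assumption:

```python
from rouge_score import rouge_scorer

def select_output(candidates):
    """Pick the candidate most similar, on average, to all the others."""
    if len(candidates) < 2:
        return candidates[0]
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    def agreement(c):
        others = [o for o in candidates if o is not c]
        return sum(scorer.score(o, c)["rougeL"].fmeasure for o in others) / len(others)
    return max(candidates, key=agreement)

# e.g. select_output([generate(lm, instruction) for lm in small_lms])
```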

Towards dialogue based, computer aided software requirements elicitation

  • paper_url: http://arxiv.org/abs/2310.13953
  • repo_url: None
  • paper_authors: Vasiliy Seibert
  • for: Addresses the problem of extracting models from natural-language specifications, where existing approaches assume a perfect initial problem understanding and leave no room for feedback.
  • methods: Proposes an interaction blueprint for dialogue-based, computer-aided software requirements analysis; unlike mere model-extraction approaches, it encourages individuality, creativity, and genuine compromise.
  • results: A simplistic experiment showcases the core idea; the paper discusses the experiment and the proposed blueprint, arguing that advances in natural language processing and generative AI may lead to significant progress, provided the field moves away from a magical black-box expectation and toward a dialogue-based approach that recognizes the individuality inherent in requirements engineering.
    Abstract Several approaches have been presented, which aim to extract models from natural language specifications. These approaches have inherent weaknesses for they assume an initial problem understanding that is perfect, and they leave no room for feedback. Motivated by real-world collaboration settings between requirements engineers and customers, this paper proposes an interaction blueprint that aims for dialogue based, computer aided software requirements analysis. Compared to mere model extraction approaches, this interaction blueprint encourages individuality, creativity and genuine compromise. A simplistic experiment was conducted to showcase the general idea. This paper discusses the experiment as well as the proposed interaction blueprint and argues that advancements in natural language processing and generative AI might lead to significant progress in a foreseeable future. However, for that, there is a need to move away from a magical black box expectation and instead moving towards a dialogue based approach that recognizes the individuality that is an undeniable part of requirements engineering.

Approximate Implication for Probabilistic Graphical Models

  • paper_url: http://arxiv.org/abs/2310.13942
  • repo_url: None
  • paper_authors: Batya Kenig
  • for: Studies the soundness of conditional independence (CI) relations inferred from the graphical structure of probabilistic graphical models (PGMs) when the CIs used to construct the model hold only approximately.
  • methods: Analyzes how the error in the input set of CIs propagates to the CIs read off the graphical structure, proving new negative and positive results for undirected and directed models.
  • results: Separators in undirected PGMs do not necessarily represent approximate CIs, so no guarantee can be provided for CIs inferred from undirected graphs; for directed graphical models such a guarantee exists, making the $d$-separation algorithm a sound and complete system for inferring approximate CIs (a d-separation sketch follows the abstract). The paper also establishes improved approximation guarantees for independence relations derived from marginal and saturated CIs.
    Abstract The graphical structure of Probabilistic Graphical Models (PGMs) represents the conditional independence (CI) relations that hold in the modeled distribution. Every separator in the graph represents a conditional independence relation in the distribution, making them the vehicle through which new conditional independencies are inferred and verified. The notion of separation in graphs depends on whether the graph is directed (i.e., a Bayesian Network), or undirected (i.e., a Markov Network). The premise of all current systems-of-inference for deriving CIs in PGMs, is that the set of CIs used for the construction of the PGM hold exactly. In practice, algorithms for extracting the structure of PGMs from data discover approximate CIs that do not hold exactly in the distribution. In this paper, we ask how the error in this set propagates to the inferred CIs read off the graphical structure. More precisely, what guarantee can we provide on the inferred CI when the set of CIs that entailed it hold only approximately? It has recently been shown that in the general case, no such guarantee can be provided. In this work, we prove new negative and positive results concerning this problem. We prove that separators in undirected PGMs do not necessarily represent approximate CIs. That is, no guarantee can be provided for CIs inferred from the structure of undirected graphs. We prove that such a guarantee exists for the set of CIs inferred in directed graphical models, making the $d$-separation algorithm a sound and complete system for inferring approximate CIs. We also establish improved approximation guarantees for independence relations derived from marginal and saturated CIs.
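
For reference, d-separation is checkable in polynomial time; a compact sketch via the moralized-ancestral-graph criterion (X and Y are d-separated by Z in a DAG iff Z separates them in the moral graph of the ancestral subgraph), using NetworkX for the graph plumbing and assuming X, Y, Z are pairwise disjoint:

```python
import networkx as nx

def d_separated(G: nx.DiGraph, X: set, Y: set, Z: set) -> bool:
    query = X | Y | Z
    anc = set(query)
    for n in query:
        anc |= nx.ancestors(G, n)   # restrict to the ancestral subgraph
    H = G.subgraph(anc)
    M = nx.Graph(H.to_undirected()) # drop edge directions
    for child in H:                 # moralize: marry co-parents
        ps = list(H.predecessors(child))
        M.add_edges_from((ps[i], ps[j])
                         for i in range(len(ps)) for j in range(i + 1, len(ps)))
    M.remove_nodes_from(Z)          # conditioning blocks paths through Z
    return not any(nx.has_path(M, x, y) for x in X for y in Y)

# Chain a -> b -> c: a and c are d-separated given {b}.
G = nx.DiGraph([("a", "b"), ("b", "c")])
assert d_separated(G, {"a"}, {"c"}, {"b"})
```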

The Hidden Adversarial Vulnerabilities of Medical Federated Learning

  • paper_url: http://arxiv.org/abs/2310.13893
  • repo_url: None
  • paper_authors: Erfan Darzi, Florian Dubost, Nanna. M. Sijtsema, P. M. A van Ooijen
  • for: Examines the susceptibility of federated medical image analysis systems to adversarial attacks.
  • methods: Uncovers a novel exploitation avenue: using gradient information from prior global model updates, adversaries can enhance the efficiency and transferability of their attacks without additional computational cost (a hedged warm-start FGSM sketch follows the abstract).
  • results: Aptly initialized single-step attacks (e.g., FGSM) can outperform the efficiency of their iterative counterparts while demanding less computation; the findings underscore the need to revisit AI security in federated healthcare settings.
    Abstract In this paper, we delve into the susceptibility of federated medical image analysis systems to adversarial attacks. Our analysis uncovers a novel exploitation avenue: using gradient information from prior global model updates, adversaries can enhance the efficiency and transferability of their attacks. Specifically, we demonstrate that single-step attacks (e.g. FGSM), when aptly initialized, can outperform the efficiency of their iterative counterparts but with reduced computational demand. Our findings underscore the need to revisit our understanding of AI security in federated healthcare settings.
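
A hedged PyTorch sketch of a single-step FGSM attack warm-started with the sign of a prior gradient (e.g., recovered from an earlier global model update); the warm-start step size is an assumption:

```python
import torch

def fgsm_warm_start(model, loss_fn, x, y, eps, g_prev=None):
    x0 = x.detach().clone()
    if g_prev is not None:
        # Initialize inside the eps-ball using stale gradient information.
        x0 = (x0 + 0.5 * eps * g_prev.sign()).clamp(0, 1)
    x0.requires_grad_(True)
    loss_fn(model(x0), y).backward()
    x_adv = x0 + eps * x0.grad.sign()  # single FGSM step
    # Project back into the eps-ball around the original input.
    return (x + (x_adv - x).clamp(-eps, eps)).detach().clamp(0, 1)
```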

COVIDFakeExplainer: An Explainable Machine Learning based Web Application for Detecting COVID-19 Fake News

  • paper_url: http://arxiv.org/abs/2310.13890
  • repo_url: https://github.com/DatProGuy/COVIDFakeExplainer
  • paper_authors: Dylan Warman, Muhammad Ashad Kabir
  • for: Provides a practical, end-to-end fake-news detection solution to help the general populace counter misinformation.
  • methods: Evaluates three prominent machine-learning architectures on two publicly available datasets across seven data configurations, establishing BERT as the superior model; integrates an explainability component into the BERT model, deploys it as a service through Amazon's cloud API hosting (AWS), and builds a browser extension that interfaces with the API (a minimal classification-service sketch follows the abstract).
  • results: Comprehensive experiments confirm BERT's exceptional accuracy in detecting COVID-19-related fake news; the browser extension enables real-time identification of fake news with easily interpretable explanations.
    Abstract Fake news has emerged as a critical global issue, magnified by the COVID-19 pandemic, underscoring the need for effective preventive tools. Leveraging machine learning, including deep learning techniques, offers promise in combatting fake news. This paper goes beyond by establishing BERT as the superior model for fake news detection and demonstrates its utility as a tool to empower the general populace. We have implemented a browser extension, enhanced with explainability features, enabling real-time identification of fake news and delivering easily interpretable explanations. To achieve this, we have employed two publicly available datasets and created seven distinct data configurations to evaluate three prominent machine learning architectures. Our comprehensive experiments affirm BERT's exceptional accuracy in detecting COVID-19-related fake news. Furthermore, we have integrated an explainability component into the BERT model and deployed it as a service through Amazon's cloud API hosting (AWS). We have developed a browser extension that interfaces with the API, allowing users to select and transmit data from web pages, receiving an intelligible classification in return. This paper presents a practical end-to-end solution, highlighting the feasibility of constructing a holistic system for fake news detection, which can significantly benefit society.
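
A minimal sketch of the classification call behind the extension; the checkpoint is a placeholder for the paper's fine-tuned BERT weights, and the label mapping is assumed:

```python
from transformers import pipeline

clf = pipeline("text-classification", model="bert-base-uncased")  # placeholder weights

def classify(text: str) -> dict:
    out = clf(text[:512])[0]  # e.g. {"label": "FAKE", "score": 0.97} after fine-tuning
    return {"label": out["label"], "confidence": round(out["score"], 3)}
```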