paper_authors: Yihao Fang, Xianzhi Li, Stephen W. Thomas, Xiaodan Zhu
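for: Improving generalization in natural language understanding tasks
methods: Uses ChatGPT as a data augmentation technique to improve compositional generalization in open intent detection tasks
results: Rigorous evaluation on multiple benchmarks shows that the method clearly improves model performance, with significant gains in open intent detection.
Abstract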
Open intent detection, a crucial aspect of natural language understanding, involves the identification of previously unseen intents in user-generated text. Despite the progress made in this field, challenges persist in handling new combinations of language components, which is essential for compositional generalization. In this paper, we present a case study exploring the use of ChatGPT as a data augmentation technique to enhance compositional generalization in open intent detection tasks. We begin by discussing the limitations of existing benchmarks in evaluating this problem, highlighting the need for constructing datasets for addressing compositional generalization in open intent detection tasks. By incorporating synthetic data generated by ChatGPT into the training process, we demonstrate that our approach can effectively improve model performance. Rigorous evaluation of multiple benchmarks reveals that our method outperforms existing techniques and significantly enhances open intent detection capabilities. Our findings underscore the potential of large language models like ChatGPT for data augmentation in natural language understanding tasks.
Summary
Open intent detection is an important aspect of natural language understanding that involves identifying previously unseen intents in user-generated text. Despite considerable progress in the field, handling new combinations of language components, which is essential for compositional generalization, remains a challenge. In this paper, we present a case study on using ChatGPT as a data augmentation technique to improve compositional generalization in open intent detection tasks. We first discuss the limitations of existing benchmarks and highlight the need to construct datasets that address compositional generalization in open intent detection. By adding synthetic data generated by ChatGPT to the training process, we show that our approach effectively improves model performance. Rigorous evaluation on multiple benchmarks indicates that our method outperforms existing techniques and improves open intent detection. Our findings underscore the potential of large language models such as ChatGPT for data augmentation in natural language understanding tasks.
Does Asking Clarifying Questions Increases Confidence in Generated Code? On the Communication Skills of Large Language Models
results: Improving communication skills raises confidence in the quality of the code produced by the code generator.
Abstract
Large language models (LLMs) have significantly improved the ability to perform tasks in the field of code generation. However, there is still a gap between LLMs being capable coders and being top-tier software engineers. Based on the observation that top-level software engineers often ask clarifying questions to reduce ambiguity in both requirements and coding solutions, we argue that the same should be applied to LLMs for code generation tasks. By asking probing questions in various topics before generating the final code, the challenges of programming with LLMs, such as unclear intent specification, lack of computational thinking, and undesired code quality, may be alleviated. This, in turn, increases confidence in the generated code. In this work, we explore how to leverage better communication skills to achieve greater confidence in generated code. We propose a communication-centered process that uses an LLM-generated communicator to identify issues with high ambiguity or low confidence in problem descriptions and generated code. We then ask clarifying questions to obtain responses from users for refining the code.
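Summary
Large language models (LLMs) have brought significant improvements to code generation tasks, but a gap remains between LLMs being capable programmers and being top-tier software engineers. Based on the observation that top-tier software engineers often ask questions about ambiguities in requirements and solutions, we argue that the same approach should be applied to LLMs for code generation tasks. Asking the user questions before generating code can help address the challenges of LLM code generation, such as unclear intent specification, a lack of computational thinking, and unsatisfactory code quality, which in turn increases confidence in the code. In this work, we explore how better communication skills can yield greater confidence in generated code. We propose a communication-centered process that uses an LLM-generated communicator to identify areas of high ambiguity or low confidence in the problem description, and then asks the user clarifying questions to obtain feedback.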
Attending Generalizability in Course of Deep Fake Detection by Exploring Multi-task Learning
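results: The results show that the proposed detection model generalizes well, accurately detecting videos manipulated by different methods, and compares favorably with the state of the art.
Abstract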
This work explores various ways of exploring multi-task learning (MTL) techniques aimed at classifying videos as original or manipulated in cross-manipulation scenario to attend generalizability in deep fake scenario. The dataset used in our evaluation is FaceForensics++, which features 1000 original videos manipulated by four different techniques, with a total of 5000 videos. We conduct extensive experiments on multi-task learning and contrastive techniques, which are well studied in literature for their generalization benefits. It can be concluded that the proposed detection model is quite generalized, i.e., accurately detects manipulation methods not encountered during training as compared to the state-of-the-art.
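Summary
This work explores several multi-task learning (MTL) techniques for classifying videos as original or manipulated in a cross-manipulation scenario, with the aim of improving generalizability in deepfake detection. The dataset used is FaceForensics++, which contains 1000 original videos manipulated by four different techniques, for a total of 5000 videos. We conduct extensive experiments on multi-task learning and contrastive techniques, both of which are well studied in the literature for their generalization benefits. We conclude that the proposed detection model generalizes well: compared with the state of the art, it accurately detects manipulation methods not encountered during training.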
Escaping the Sample Trap: Fast and Accurate Epistemic Uncertainty Estimation with Pairwise-Distance Estimators
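results: Through a series of experiments commonly used to evaluate epistemic uncertainty estimation (1D sinusoidal data, Pendulum-v0, Hopper-v2, Ant-v2, and Humanoid-v2), we demonstrate the advantages of PaiDEs for epistemic uncertainty estimation. In each experimental setting, an Active Learning framework is applied to show these advantages.
Abstract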
This work introduces a novel approach for epistemic uncertainty estimation for ensemble models using pairwise-distance estimators (PaiDEs). These estimators utilize the pairwise-distance between model components to establish bounds on entropy and uses said bounds as estimates for information-based criterion. Unlike recent deep learning methods for epistemic uncertainty estimation, which rely on sample-based Monte Carlo estimators, PaiDEs are able to estimate epistemic uncertainty up to 100$\times$ faster, over a larger space (up to 100$\times$) and perform more accurately in higher dimensions. To validate our approach, we conducted a series of experiments commonly used to evaluate epistemic uncertainty estimation: 1D sinusoidal data, Pendulum-v0, Hopper-v2, Ant-v2 and Humanoid-v2. For each experimental setting, an Active Learning framework was applied to demonstrate the advantages of PaiDEs for epistemic uncertainty estimation.
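Summary
This work introduces a new approach to estimating epistemic uncertainty in ensemble models using pairwise-distance estimators (PaiDEs). These estimators use the pairwise distances between model components to establish bounds on entropy, and use those bounds as estimates of an information-based criterion. Unlike recent deep learning methods, which rely on sample-based Monte Carlo estimators, PaiDEs can estimate epistemic uncertainty up to 100 times faster, over a larger space (up to 100 times larger), and more accurately in higher dimensions. To validate the approach, we run a series of experiments commonly used to evaluate epistemic uncertainty estimation: 1D sinusoidal data, Pendulum-v0, Hopper-v2, Ant-v2, and Humanoid-v2. In each experimental setting, we apply an active learning framework to demonstrate the advantages of PaiDEs for epistemic uncertainty estimation.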
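The abstract does not spell out the estimator, so the following is only a minimal sketch under stated assumptions: equal-weight ensemble members that each predict a 1D Gaussian, and the pairwise-distance mixture-entropy bound from the literature, in which the epistemic term is estimated as -Σ_i π_i ln Σ_j π_j exp(-D(p_i‖p_j)). The function names are hypothetical. Using the KL divergence for D is assumed to give an upper-bound-style estimate and the Bhattacharyya distance a lower-bound-style one. No sampling is involved, which is the property the abstract contrasts with Monte Carlo estimators.

```python
import math


def kl_gauss(mu1, s1, mu2, s2):
    """KL divergence KL(N(mu1, s1^2) || N(mu2, s2^2)) for 1D Gaussians."""
    return math.log(s2 / s1) + (s1 ** 2 + (mu1 - mu2) ** 2) / (2 * s2 ** 2) - 0.5


def bhattacharyya_gauss(mu1, s1, mu2, s2):
    """Bhattacharyya distance between two 1D Gaussians."""
    v = s1 ** 2 + s2 ** 2
    return 0.25 * (mu1 - mu2) ** 2 / v + 0.5 * math.log(v / (2 * s1 * s2))


def paide_epistemic(components, distance):
    """Pairwise-distance estimate of epistemic uncertainty.

    components: list of (mean, std) pairs, one per ensemble member
                (equal weights are an assumption of this sketch).
    distance:   function D(mu_i, s_i, mu_j, s_j) between two components.
    """
    m = len(components)
    total = 0.0
    for mu_i, s_i in components:
        # Weighted sum over all members j of exp(-D(p_i || p_j)); the j == i
        # term is exp(0) = 1, so the logarithm below is always defined.
        inner = sum(
            math.exp(-distance(mu_i, s_i, mu_j, s_j)) for mu_j, s_j in components
        ) / m
        total -= math.log(inner) / m
    return total


# Members that agree yield zero; members that disagree yield a value
# that cannot exceed ln(number of members).
agree = [(0.0, 1.0)] * 3
disagree = [(-5.0, 1.0), (0.0, 1.0), (5.0, 1.0)]
upper = paide_epistemic(disagree, kl_gauss)
lower = paide_epistemic(disagree, bhattacharyya_gauss)
```

In an active learning loop, a score like this would be computed for each candidate input, and the inputs with the highest score would be queried next.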
Open Gaze: An Open-Source Implementation Replicating Google’s Eye Tracking Paper
paper_authors: Sushmanth reddy Mereddy, Jyothi Swaroop Reddy, Somnath Sharma
for: This paper aims to develop an open-source implementation of a smartphone-based gaze tracker that can accurately track eye movements without the need for specialized hardware.
methods: The authors use machine learning techniques to develop an eye tracking solution that is native to smartphones, and they validate their approach using the MIT GazeCapture dataset.
results: The authors demonstrate that their approach can accurately track eye movements during natural image observation and reading comprehension tasks, and they show that their smartphone-based gaze tracker is comparable in accuracy to state-of-the-art mobile eye trackers that are two orders of magnitude more expensive.
Abstract
Eye tracking has been a pivotal tool in diverse fields such as vision research, language analysis, and usability assessment. The majority of prior investigations, however, have concentrated on expansive desktop displays employing specialized, costly eye tracking hardware that lacks scalability. Remarkably little insight exists into ocular movement patterns on smartphones, despite their widespread adoption and significant usage. In this manuscript, we present an open-source implementation of a smartphone-based gaze tracker that emulates the methodology proposed by a GooglePaper (whose source code remains proprietary). Our focus is on attaining accuracy comparable to that attained through the GooglePaper's methodology, without the necessity for supplementary hardware. Through the integration of machine learning techniques, we unveil an accurate eye tracking solution that is native to smartphones. Our approach demonstrates precision akin to the state-of-the-art mobile eye trackers, which are characterized by a cost that is two orders of magnitude higher. Leveraging the vast MIT GazeCapture dataset, which is available through registration on the dataset's website, we successfully replicate crucial findings from previous studies concerning ocular motion behavior in oculomotor tasks and saliency analyses during natural image observation. Furthermore, we emphasize the applicability of smartphone-based gaze tracking in discerning reading comprehension challenges. Our findings exhibit the inherent potential to amplify eye movement research by significant proportions, accommodating participation from thousands of subjects with explicit consent. This scalability not only fosters advancements in vision research, but also extends its benefits to domains such as accessibility enhancement and healthcare applications.
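Summary
Eye tracking is widely used in fields such as vision research, language analysis, and usability assessment. Most prior studies, however, have focused on large desktop displays with specialized, costly eye tracking hardware that lacks scalability. Eye movement on smartphones remains little studied, despite how widespread and heavily used smartphones are. In this paper, we present an open-source implementation of a smartphone-based gaze tracker that follows the methodology of a Google paper (whose source code remains proprietary). Our focus is on achieving accuracy comparable to that of the Google paper's methodology without additional hardware. By integrating machine learning techniques, we present an eye tracking solution that is native to smartphones. Our approach reaches accuracy comparable to state-of-the-art mobile eye trackers, which cost two orders of magnitude more. Using the MIT GazeCapture dataset, we replicate key findings from previous studies on eye movement behavior in oculomotor tasks and during natural image observation. We also highlight the use of smartphone-based gaze tracking in detecting reading comprehension difficulties. Our findings show the potential of smartphone-based gaze tracking to scale up eye movement research, with benefits extending to accessibility enhancement and healthcare applications.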
Ultrafast-and-Ultralight ConvNet-Based Intelligent Monitoring System for Diagnosing Early-Stage Mpox Anytime and Anywhere
paper_authors: Yubiao Yue, Xiaoqiang Shi, Li Qin, Xinyue Zhang, Yanmei Chen, Jialong Xu, Zipei Zheng, Yujun Cao, Di Liu, Zhenzhang Li, Yang Li
for: The paper aims to develop a real-time diagnostic tool for monkeypox, addressing the lack of efficient diagnostic tools and the neglect of inference speed, parameter size, and early-stage diagnosis performance in existing models.
methods: The proposed method, Fast-MpoxNet, is an ultrafast and ultralight deep learning network that integrates attention-based feature fusion and a multiple auxiliary losses enhancement strategy. It uses transfer learning and five-fold cross-validation, achieving 94.26% Accuracy on the Mpox dataset with a recall of 93.65% for early-stage monkeypox.
results: Fast-MpoxNet achieves high accuracy and practicality in real-time diagnosis, with an Accuracy of 98.40% and a Practicality Score of 0.80 when adopting data augmentation. An application system named Mpox-AISM V2 was also developed for both personal computers and mobile phones, featuring ultrafast responses, offline functionality, and easy deployment. The proposed method has the potential to mitigate future monkeypox outbreaks and provide a new paradigm for developing real-time diagnostic tools in the healthcare field.
Abstract
Due to the lack of more efficient diagnostic tools for monkeypox, its spread remains unchecked, presenting a formidable challenge to global health. While the high efficacy of deep learning models for monkeypox diagnosis has been demonstrated in related studies, the overlook of inference speed, the parameter size and diagnosis performance for early-stage monkeypox renders the models inapplicable in real-world settings. To address these challenges, we proposed an ultrafast and ultralight network named Fast-MpoxNet. Fast-MpoxNet possesses only 0.27M parameters and can process input images at 68 frames per second (FPS) on the CPU. To counteract the diagnostic performance limitation brought about by the small model capacity, it integrates the attention-based feature fusion module and the multiple auxiliary losses enhancement strategy for better detecting subtle image changes and optimizing weights. Using transfer learning and five-fold cross-validation, Fast-MpoxNet achieves 94.26% Accuracy on the Mpox dataset. Notably, its recall for early-stage monkeypox achieves 93.65%. By adopting data augmentation, our model's Accuracy rises to 98.40% and attains a Practicality Score (A new metric for measuring model practicality in real-time diagnosis application) of 0.80. We also developed an application system named Mpox-AISM V2 for both personal computers and mobile phones. Mpox-AISM V2 features ultrafast responses, offline functionality, and easy deployment, enabling accurate and real-time diagnosis for both the public and individuals in various real-world settings, especially in populous settings during the outbreak. Our work could potentially mitigate future monkeypox outbreak and illuminate a fresh paradigm for developing real-time diagnostic tools in the healthcare field.
Towards Optimal Head-to-head Autonomous Racing with Curriculum Reinforcement Learning
paper_authors: Dvij Kalaria, Qin Lin, John M. Dolan
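for: This work proposes a head-to-head autonomous racing environment in which an optimal policy can be learned with reinforcement learning.
methods: Uses curriculum learning and a safe reinforcement learning algorithm, transitioning from a simpler vehicle model to a more complex real environment to teach the reinforcement learning agent a policy closer to optimal.
results: Curriculum learning combined with the safe reinforcement learning algorithm trains the reinforcement learning agent toward a better policy more effectively, and makes training safer.
Abstract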
Head-to-head autonomous racing is a challenging problem, as the vehicle needs to operate at the friction or handling limits in order to achieve minimum lap times while also actively looking for strategies to overtake/stay ahead of the opponent. In this work we propose a head-to-head racing environment for reinforcement learning which accurately models vehicle dynamics. Some previous works have tried learning a policy directly in the complex vehicle dynamics environment but have failed to learn an optimal policy. In this work, we propose a curriculum learning-based framework by transitioning from a simpler vehicle model to a more complex real environment to teach the reinforcement learning agent a policy closer to the optimal policy. We also propose a control barrier function-based safe reinforcement learning algorithm to enforce the safety of the agent in a more effective way while not compromising on optimality.
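Summary
Head-to-head autonomous racing is a challenging problem: the vehicle must operate at its friction or handling limits to achieve minimum lap times while actively looking for strategies to overtake or stay ahead of the opponent. In this work, we propose a head-to-head racing environment for reinforcement learning that accurately models vehicle dynamics. Some previous works have tried to learn a policy directly in the complex vehicle dynamics environment but failed to learn an optimal one. We propose a curriculum learning-based framework that transitions from a simpler vehicle model to a more complex real environment, teaching the reinforcement learning agent a policy closer to the optimal policy. We also propose a safe reinforcement learning algorithm based on control barrier functions that enforces the agent's safety more effectively without compromising optimality.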
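To make the control barrier function idea concrete, here is a deliberately tiny sketch. It is not the authors' algorithm or vehicle model. It assumes a hypothetical 1D system x' = u with a safety limit x <= x_max, the barrier h(x) = x_max - x, and the standard condition h' + alpha * h >= 0, which for this system reduces to u <= alpha * (x_max - x). The filter passes the nominal (for example, learned) action through unchanged whenever it already satisfies that bound.

```python
def cbf_filter(x, u_nominal, x_max, alpha):
    """Return the action closest to u_nominal that keeps h(x) = x_max - x safe.

    For the assumed dynamics x' = u, the barrier condition
    h' + alpha * h >= 0 becomes u <= alpha * (x_max - x).
    """
    u_bound = alpha * (x_max - x)
    return min(u_nominal, u_bound)


def rollout(x0, u_nominal, x_max, alpha=1.0, dt=0.1, steps=200):
    """Simulate the filtered system with forward Euler steps."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        u = cbf_filter(x, u_nominal, x_max, alpha)
        x = x + dt * u
        xs.append(x)
    return xs


# An aggressive nominal action is throttled as the state nears the limit.
trajectory = rollout(x0=0.0, u_nominal=5.0, x_max=1.0)
```

With `alpha * dt <= 1`, each Euler step moves the state at most the remaining distance to `x_max`, so the discrete-time trajectory approaches the limit without crossing it. In a racing setting the barrier would instead encode quantities such as track boundaries or distance to the opponent.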
Temporal Uncertainty Localization to Enable Human-in-the-loop Analysis of Dynamic Contrast-enhanced Cardiac MRI Datasets
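results: The proposed dQC tool accurately identifies failed segmentations, and using it improves segmentation accuracy and reduces the number of failed segmentations.
Abstract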
Dynamic contrast-enhanced (DCE) cardiac magnetic resonance imaging (CMRI) is a widely used modality for diagnosing myocardial blood flow (perfusion) abnormalities. During a typical free-breathing DCE-CMRI scan, close to 300 time-resolved images of myocardial perfusion are acquired at various contrast "wash in/out" phases. Manual segmentation of myocardial contours in each time-frame of a DCE image series can be tedious and time-consuming, particularly when non-rigid motion correction has failed or is unavailable. While deep neural networks (DNNs) have shown promise for analyzing DCE-CMRI datasets, a "dynamic quality control" (dQC) technique for reliably detecting failed segmentations is lacking. Here we propose a new space-time uncertainty metric as a dQC tool for DNN-based segmentation of free-breathing DCE-CMRI datasets by validating the proposed metric on an external dataset and establishing a human-in-the-loop framework to improve the segmentation results. In the proposed approach, we referred the top 10% most uncertain segmentations as detected by our dQC tool to the human expert for refinement. This approach resulted in a significant increase in the Dice score (p<0.001) and a notable decrease in the number of images with failed segmentation (16.2% to 11.3%) whereas the alternative approach of randomly selecting the same number of segmentations for human referral did not achieve any significant improvement. Our results suggest that the proposed dQC framework has the potential to accurately identify poor-quality segmentations and may enable efficient DNN-based analysis of DCE-CMRI in a human-in-the-loop pipeline for clinical interpretation and reporting of dynamic CMRI datasets.
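Summary
Dynamic contrast-enhanced (DCE) cardiac magnetic resonance imaging (CMRI) is used to diagnose myocardial blood flow (perfusion) abnormalities; a typical scan acquires about 300 time-resolved myocardial perfusion images at various contrast "wash in/out" phases. Manually segmenting myocardial contours in every time frame of a DCE image series is tedious and time-consuming, particularly when non-rigid motion correction has failed or is unavailable. Deep neural networks (DNNs) have shown promise for analyzing DCE-CMRI data, but a "dynamic quality control" (dQC) technique for reliably detecting failed segmentations is lacking. We propose a new space-time uncertainty metric as a dQC tool and use it within a human-in-the-loop framework to improve segmentation results. In our approach, the top 10% most uncertain segmentations detected by the dQC tool are referred to a human expert for refinement. This led to an increase in the Dice score (p < 0.001) and a decrease in images with failed segmentation (16.2% to 11.3%), whereas the control approach of randomly selecting the same number of segmentations for human referral achieved no significant improvement. Our results suggest that the dQC framework can reliably identify failed segmentations and may enable efficient DNN-based analysis of DCE-CMRI data in a human-in-the-loop pipeline for clinical interpretation and reporting of dynamic CMRI datasets.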
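The referral step described above can be sketched in a few lines. This is an illustrative sketch only: it assumes each segmentation already has a scalar uncertainty score (the space-time metric itself is not specified in the abstract), and the function name `select_for_referral` is hypothetical.

```python
def select_for_referral(uncertainties, fraction=0.10):
    """Return indices of the most uncertain segmentations, in original order.

    uncertainties: one score per segmentation (higher means less certain).
    fraction:      share of cases referred to the human expert.
    """
    n_total = len(uncertainties)
    # Refer at least one case so small batches still get reviewed.
    n_refer = max(1, round(fraction * n_total))
    ranked = sorted(range(n_total), key=lambda i: uncertainties[i], reverse=True)
    return sorted(ranked[:n_refer])


scores = [0.05, 0.40, 0.10, 0.02, 0.90, 0.15, 0.07, 0.03, 0.20, 0.11,
          0.06, 0.08, 0.01, 0.30, 0.12, 0.09, 0.04, 0.13, 0.14, 0.16]
referred = select_for_referral(scores)  # 20 cases, so the top 2 are referred
```

The random-selection baseline mentioned in the abstract would draw the same number of indices uniformly at random instead of ranking by uncertainty.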
Leveraging Knowledge and Reinforcement Learning for Enhanced Reliability of Language Models
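results: The knowledge-guided ensembling approach improves the reliability and accuracy of language models, performing strongly on nine GLUE tasks and outperforming the state of the art.
Abstract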
The Natural Language Processing(NLP) community has been using crowd sourcing techniques to create benchmark datasets such as General Language Understanding and Evaluation(GLUE) for training modern Language Models such as BERT. GLUE tasks measure the reliability scores using inter annotator metrics i.e. Cohens Kappa. However, the reliability aspect of LMs has often been overlooked. To counter this problem, we explore a knowledge-guided LM ensembling approach that leverages reinforcement learning to integrate knowledge from ConceptNet and Wikipedia as knowledge graph embeddings. This approach mimics human annotators resorting to external knowledge to compensate for information deficits in the datasets. Across nine GLUE datasets, our research shows that ensembling strengthens reliability and accuracy scores, outperforming state of the art.
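Summary
The natural language processing (NLP) community has used crowdsourcing techniques to create benchmark datasets such as General Language Understanding and Evaluation (GLUE) for training modern language models such as BERT. GLUE tasks measure reliability using inter-annotator metrics, namely Cohen's kappa. However, the reliability of language models themselves is often overlooked. To address this, we explore a knowledge-guided language model ensembling approach that uses reinforcement learning to integrate knowledge from ConceptNet and Wikipedia as knowledge graph embeddings. The approach mimics human annotators who draw on external knowledge to compensate for information gaps in the datasets. Across nine GLUE tasks, our results show that ensembling improves reliability and accuracy, outperforming the state of the art.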
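For readers unfamiliar with the inter-annotator metric mentioned here, Cohen's kappa compares observed agreement p_o with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a plain-Python illustration with a hypothetical function name and toy labels. How the paper applies the metric to model outputs is not described in the abstract.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


kappa = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])  # p_o = 0.75, p_e = 0.5
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. The same function can score a model's predictions against a single annotator's labels.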