cs.CL - 2023-08-01

CoSMo: A constructor specification language for Abstract Wikipedia’s content selection process

  • paper_url: http://arxiv.org/abs/2308.02539
  • repo_url: None
  • paper_authors: Kutz Arrieta, Pablo R. Fillottrani, C. Maria Keet
  • for: This paper is written for the purpose of creating a novel language modeling framework called CoSMo, which can be used for multilingual content selection and abstract representation in the context of the Abstract Wikipedia project.
  • methods: The paper uses a rigorous language design process that includes broad stakeholder consultation to create CoSMo, which meets the requirements of multilingual modeling, content selection covering declarative content and functions, and both classes and instances.
  • results: The preliminary evaluation of CoSMo shows that it is a useful language modeling framework for abstract representation in the Abstract Wikipedia project and potentially in other contexts as well.
    Abstract Representing snippets of information abstractly is a task that needs to be performed for various purposes, such as database view specification and the first stage in the natural language generation pipeline for generative AI from structured input, i.e., the content selection stage to determine what needs to be verbalised. For the Abstract Wikipedia project, requirements analysis revealed that such an abstract representation requires multilingual modelling, content selection covering declarative content and functions, and both classes and instances. There is no modelling language that meets either of the three features, let alone a combination. Following a rigorous language design process inclusive of broad stakeholder consultation, we created CoSMo, a novel {\sc Co}ntent {\sc S}election {\sc Mo}deling language that meets these and other requirements so that it may be useful both in Abstract Wikipedia as well as other contexts. We describe the design process, rationale and choices, the specification, and preliminary evaluation of the language.
    摘要 表示信息抽象是一项需要进行的任务,用于多种目的,如数据库视图规定和生成AI自结构输入的首个阶段,即内容选择阶段,以确定需要被描述。为Abstract Wikipedia项目,需求分析表明,这种抽象表示需要多语言模型、内容选择覆盖声明内容和函数,以及类和实例。当时没有一种模型语言满足这三个特点,更不用说是一起满足。我们采用了严格的语言设计过程,包括广泛的参与者咨询,创造了CoSMo,一种新的内容选择模型语言,满足这些和其他需求,以便在Abstract Wikipedia中以及其他上使用。我们将介绍设计过程、理由和选择、规范,以及初步评估语言。

Unimodal Intermediate Training for Multimodal Meme Sentiment Classification

  • paper_url: http://arxiv.org/abs/2308.00528
  • repo_url: None
  • paper_authors: Muzhaffar Hazman, Susan McKeever, Josephine Griffith
  • for: 这篇论文的目的是为了开发一个多Modal的Memes感受分类器。
  • methods: 这篇论文使用了一种新的supervised中途训练方法,利用大量的文本和图像感受分类数据。
  • results: 这篇论文的结果显示,将多Modal的Memes融合到单Modal的感受分类器中可以提高模型的性能,并且可以将训练集中的标签Memes减少40%而不影响下游模型的性能。
    Abstract Internet Memes remain a challenging form of user-generated content for automated sentiment classification. The availability of labelled memes is a barrier to developing sentiment classifiers of multimodal memes. To address the shortage of labelled memes, we propose to supplement the training of a multimodal meme classifier with unimodal (image-only and text-only) data. In this work, we present a novel variant of supervised intermediate training that uses relatively abundant sentiment-labelled unimodal data. Our results show a statistically significant performance improvement from the incorporation of unimodal text data. Furthermore, we show that the training set of labelled memes can be reduced by 40% without reducing the performance of the downstream model.
    摘要 互联网迷因(Internet Memes)是自动感情分类的挑战性用户生成内容之一。实际上,缺乏labelled memes是发展多modal meme感情分类器的阻碍因素。为解决这个问题,我们提出使用具有标签的多modal meme感情分类器的训练,并补充训练材料中的单modal(仅有图像和仅有文本)数据。在这个研究中,我们提出了一种新的supervised intermediate training的变iante,使用比较充足的标签文本数据。我们的结果显示,将单modal文本数据包含在训练中可以 statistically significant提高下游模型的表现。此外,我们显示了可以透过将labelled meme训练集量减少40%而不减少下游模型的表现。

Covid-19 Public Sentiment Analysis for Indian Tweets Classification

  • paper_url: http://arxiv.org/abs/2308.06241
  • repo_url: None
  • paper_authors: Mohammad Maksood Akhter, Devpriya Kanojia
  • for: 这篇论文主要是为了研究印度Twitter数据中的情感分析,以便分析COVID-19 tweets中的意见和情感。
  • methods: 论文使用Twitter数据EXTRACTING和情感分析 queries来分析Twitter数据中的意见和情感。
  • results: 这篇论文显示了在Twitter数据中的情感分析结果,包括正面、负面和中性等意见和情感。
    Abstract When any extraordinary event takes place in the world wide area, it is the social media that acts as the fastest carrier of the news along with the consequences dealt with that event. One can gather much information through social networks regarding the sentiments, behavior, and opinions of the people. In this paper, we focus mainly on sentiment analysis of twitter data of India which comprises of COVID-19 tweets. We show how Twitter data has been extracted and then run sentimental analysis queries on it. This is helpful to analyze the information in the tweets where opinions are highly unstructured, heterogeneous, and are either positive or negative or neutral in some cases.
    摘要 当世界范围内发生任何不寻常事件时,社交媒体就会成为最快的消息传递者,同时也会附带该事件所带来的后果。通过社交网络,你可以了解人们的情感、行为和意见的趋势。在这篇论文中,我们主要关注印度Twitter数据的情感分析,即COVID-19 tweets。我们将介绍如何从Twitter数据中提取数据,然后运行情感分析查询。这有助于分析社交媒体上的信息,因为这些信息通常是不结构化、多元和有时是正面、负面或中性的。

ZRIGF: An Innovative Multimodal Framework for Zero-Resource Image-Grounded Dialogue Generation

  • paper_url: http://arxiv.org/abs/2308.00400
  • repo_url: https://github.com/zhangbo-nlp/zrigf
  • paper_authors: Bo Zhang, Jian Wang, Hui Ma, Bo Xu, Hongfei Lin
  • for: 本研究旨在开发一种能够在零资源情况下使用图像信息进行对话生成的框架。
  • methods: 该框架基于一种两阶段学习策略,包括对比预训练和生成预训练。对比预训练包括一个文本和图像匹配模块,该模块将图像和文本映射到一个统一的编码vector空间中,以及一个文本辅助遮盲图像模型,以保持预训练的视觉特征并促进多模态特征的对齐。生成预训练使用一个多模态融合模块和一个信息传递模块,生成基于融合的多模态表示的相关回答。
  • results: 对文本基本对话和图像基本对话数据集进行了广泛的实验,并emonstrated ZRIGF的有效性在生成Contextually pertinent和informative回答。此外,我们采用了一种完全零资源情况来证明我们的框架在新领域中的稳定普适性。
    Abstract Image-grounded dialogue systems benefit greatly from integrating visual information, resulting in high-quality response generation. However, current models struggle to effectively utilize such information in zero-resource scenarios, mainly due to the disparity between image and text modalities. To overcome this challenge, we propose an innovative multimodal framework, called ZRIGF, which assimilates image-grounded information for dialogue generation in zero-resource situations. ZRIGF implements a two-stage learning strategy, comprising contrastive pre-training and generative pre-training. Contrastive pre-training includes a text-image matching module that maps images and texts into a unified encoded vector space, along with a text-assisted masked image modeling module that preserves pre-training visual features and fosters further multimodal feature alignment. Generative pre-training employs a multimodal fusion module and an information transfer module to produce insightful responses based on harmonized multimodal representations. Comprehensive experiments conducted on both text-based and image-grounded dialogue datasets demonstrate ZRIGF's efficacy in generating contextually pertinent and informative responses. Furthermore, we adopt a fully zero-resource scenario in the image-grounded dialogue dataset to demonstrate our framework's robust generalization capabilities in novel domains. The code is available at https://github.com/zhangbo-nlp/ZRIGF.
    摘要 图像背景对话系统受益很大地含义图像信息,从而生成高质量的回答。然而,当前模型在零资源场景下尚未能有效利用这些信息,主要是因为图像和文本Modalities之间的差异。为了解决这个挑战,我们提出了一种创新的多模态框架,称为ZRIGF,它在零资源情况下使用图像背景信息进行对话生成。ZRIGF采用了两个阶段的学习策略:对比预训练和生成预训练。对比预训练包括一个图像和文本匹配模块,将图像和文本映射到一个统一编码 vector space,以及一个文本辅助遮盖图像模型,以保留预训练的视觉特征并促进多模态特征的对应。生成预训练使用多模态融合模块和信息传递模块,生成基于融合的多模态表示中的启发性回答。我们在文本基于和图像背景基于对话数据集上进行了广泛的实验,证明ZRIGF在生成上下文ually pertinent和有用的回答。此外,我们采用了完全零资源场景,以示我们的框架在新领域中的稳定性和普适性。代码可以在https://github.com/zhangbo-nlp/ZRIGF 上获取。

Tackling Hallucinations in Neural Chart Summarization

  • paper_url: http://arxiv.org/abs/2308.00399
  • repo_url: https://github.com/worldhellow/hallucinations-c2t
  • paper_authors: Saad Obaid ul Islam, Iza Škrjanec, Ondřej Dušek, Vera Demberg
  • for: 这个论文目的是解决chart summarization中的幻觉问题。
  • methods: 该论文提出了一种基于自然语言推理(NLI)的数据预处理方法,以减少幻觉现象。
  • results: 人工评估表明,该方法可以显著减少幻觉现象,同时也提高了总的性能。此外,缩短长距离依赖关系和添加图表标题和标签也有助于提高表要求。
    Abstract Hallucinations in text generation occur when the system produces text that is not grounded in the input. In this work, we tackle the problem of hallucinations in neural chart summarization. Our analysis shows that the target side of chart summarization training datasets often contains additional information, leading to hallucinations. We propose a natural language inference (NLI) based method to preprocess the training data and show through human evaluation that our method significantly reduces hallucinations. We also found that shortening long-distance dependencies in the input sequence and adding chart-related information like title and legends improves the overall performance.
    摘要 文本生成中的幻觉现象发生在系统生成的文本与输入无法匹配时。在这个工作中,我们解决了chart summarization的幻觉问题。我们的分析显示,chart summarization的目标 сторо面常常包含更多的信息,导致幻觉。我们提出了基于自然语言推理(NLI)的预处理方法,并通过人工评估表明,我们的方法可以明显减少幻觉。此外,我们发现缩短输入序列中的长距离依赖关系和添加图表标题和标签信息可以提高总性表现。

LimeAttack: Local Explainable Method for Textual Hard-Label Adversarial Attack

  • paper_url: http://arxiv.org/abs/2308.00319
  • repo_url: None
  • paper_authors: Hai Zhu, Zhaoqing Yang, Weiwei Shang, Yuren Wu
  • for: 本研究探讨了自然语言处理模型对 adversarial example 的抵御性。
  • methods: 本文提出了一种新的 hard-label attack 算法,named LimeAttack,该算法使用了本地可解释方法来估算单词重要性排名,然后通过 beam search 找到最佳的黑hat 示例。
  • results: 对比 existed 的 hard-label attack 算法,LimeAttack 在同样的查询预算下 achiev 了更好的攻击性能。此外,对大型语言模型进行评估,结果表明 adversarial example 仍然是大型语言模型的一大威胁。LimeAttack 生成的黑hat 示例具有高度传输性,可以有效提高模型的Robustness 在 adversarial training 中。
    Abstract Natural language processing models are vulnerable to adversarial examples. Previous textual adversarial attacks adopt gradients or confidence scores to calculate word importance ranking and generate adversarial examples. However, this information is unavailable in the real world. Therefore, we focus on a more realistic and challenging setting, named hard-label attack, in which the attacker can only query the model and obtain a discrete prediction label. Existing hard-label attack algorithms tend to initialize adversarial examples by random substitution and then utilize complex heuristic algorithms to optimize the adversarial perturbation. These methods require a lot of model queries and the attack success rate is restricted by adversary initialization. In this paper, we propose a novel hard-label attack algorithm named LimeAttack, which leverages a local explainable method to approximate word importance ranking, and then adopts beam search to find the optimal solution. Extensive experiments show that LimeAttack achieves the better attacking performance compared with existing hard-label attack under the same query budget. In addition, we evaluate the effectiveness of LimeAttack on large language models, and results indicate that adversarial examples remain a significant threat to large language models. The adversarial examples crafted by LimeAttack are highly transferable and effectively improve model robustness in adversarial training.
    摘要 自然语言处理模型容易受到敌意攻击。先前的文本敌意攻击通常使用梯度或信心分数来计算单词重要性排名,并生成敌意攻击示例。然而,这些信息在实际世界中不可获得。因此,我们将注意力点在更真实和挑战性的设定上,即硬标签攻击。现有的硬标签攻击算法通常通过随机替换初始化敌意示例,然后使用复杂的规则来优化敌意扰动。这些方法需要访问模型的多少次,并且攻击成功率受到敌手初始化的限制。在这篇论文中,我们提出了一种新的硬标签攻击算法,名为LimeAttack。LimeAttack利用了一种本地可解释的方法来估算单词重要性排名,然后使用搜索树来找到最佳解决方案。广泛的实验表明,LimeAttack在同样的查询预算下比现有的硬标签攻击算法更好的攻击性能。此外,我们还评估了LimeAttack的效果于大语言模型,结果表明,敌意示例仍然是大语言模型的重要威胁。LimeAttack生成的敌意示例具有高 Transfer Learning 性和可以有效地提高模型在对抗训练中的 Robustness。

Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models

  • paper_url: http://arxiv.org/abs/2308.00304
  • repo_url: None
  • paper_authors: Jiaao Chen, Xiaoman Pan, Dian Yu, Kaiqiang Song, Xiaoyang Wang, Dong Yu, Jianshu Chen
  • for: 提高大型自然语言模型(LLM)的compositional generalization能力
  • methods: 使用novel的prompting策略——skills-in-context(SKiC) prompting
  • results: 实现了LLM解决难度较高的问题的能力,并且能够解决未看过的问题,这表明LLM具有人类智能的推理能力。
    Abstract We consider the problem of eliciting compositional generalization capabilities in large language models (LLMs) with a novel type of prompting strategy. Compositional generalization empowers the LLMs to solve problems that are harder than the ones they have seen (i.e., easy-to-hard generalization), which is a critical reasoning capability of human-like intelligence. However, even the current state-of-the-art LLMs still struggle with this form of reasoning. To bridge this gap, we propose skills-in-context (SKiC) prompting, which instructs LLMs how to compose basic skills to resolve more complex problems. We find that it is crucial to demonstrate both the skills and the compositional examples within the same prompting context. With as few as two examplars, our SKiC prompting initiates strong synergies between skills and their composition capabilities. Notably, it empowers LLMs to solve unseen problems that require innovative skill compositions, achieving near-perfect generalization on a broad range of challenging compositionality tasks. Intriguingly, SKiC prompting unlocks the latent potential of LLMs, enabling them to leverage pre-existing internal skills acquired during earlier pre-training stages, even when these skills are not explicitly presented in the prompting context. This results in the capability of LLMs to solve unseen complex problems by activating and composing internal competencies. With such prominent features, SKiC prompting is able to achieve state-of-the-art performance on challenging mathematical reasoning benchmarks (e.g., MATH).
    摘要 我们考虑了大型自然语言模型(LLM)中的问题,即召唤扩展能力的启发策略。这种能力使得LLM能够解决比它们所见过的问题更加复杂(即易于难的扩展),这是人类智能的重要逻辑能力之一。然而,当前的LLM仍然在这种类型的逻辑能力方面做出了差。为bridge这个差距,我们提出了技能在Context(SKiC)召唤策略。我们发现,在同一个召唤上显示技能和其组合例子是关键。只需要两个示例,我们的 SKiC 召唤就可以强化 LLM 的基本技能和组合能力。特别是,它可以让 LLM 解决未经见过的问题,通过组合内部的技能来实现创新的技能组合,达到了 near-perfect 的扩展性。启示地,SKiC 召唤可以让 LLM Activate和组合其内部的竞争能力,解决无法被直接示例所覆盖的问题。这些特点使得 SKiC 召唤可以在复杂的数学逻辑任务(例如 MATH)中达到最新的表现。

Towards Effective Ancient Chinese Translation: Dataset, Model, and Evaluation

  • paper_url: http://arxiv.org/abs/2308.00240
  • repo_url: None
  • paper_authors: Geyang Guo, Jiarong Yang, Fengyuan Lu, Jiaxin Qin, Tianyi Tang, Wayne Xin Zhao
  • for: 本文提出了一种基于古代中文的翻译模型,帮助更好地理解中国古代文学、传统和文化。
  • methods: 我们收集、清洗和分类了各种古代中文资料,形成了目前最大的古代中文资源。我们还提出了一种特有的训练方法,包括对称对换和双重遮盲语言模型。
  • results: 我们建立了一个用于评估古代中文翻译质量的标准准则,并对各种现有模型进行了评估。我们的模型在五个领域中表现出色,与GPT-3.5模型的+12.0 BLEU比进行了比较,并在人工评估中超过了ERNIE Bot。进一步的微调也显示了Erya模型的优秀传送能力,升幅+6.2 BLEU。我们将所有资源发布到https://github.com/RUCAIBox/Erya。
    Abstract Interpreting ancient Chinese has been the key to comprehending vast Chinese literature, tradition, and civilization. In this paper, we propose Erya for ancient Chinese translation. From a dataset perspective, we collect, clean, and classify ancient Chinese materials from various sources, forming the most extensive ancient Chinese resource to date. From a model perspective, we devise Erya training method oriented towards ancient Chinese. We design two jointly-working tasks: disyllabic aligned substitution (DAS) and dual masked language model (DMLM). From an evaluation perspective, we build a benchmark to judge ancient Chinese translation quality in different scenarios and evaluate the ancient Chinese translation capacities of various existing models. Our model exhibits remarkable zero-shot performance across five domains, with over +12.0 BLEU against GPT-3.5 models and better human evaluation results than ERNIE Bot. Subsequent fine-tuning further shows the superior transfer capability of Erya model with +6.2 BLEU gain. We release all the above-mentioned resources at https://github.com/RUCAIBox/Erya.
    摘要 ancient Chinese 的解释对中国文学、传统和文明产生了关键作用。在这篇论文中,我们提出了“Erya”作为古代中文翻译的解决方案。从数据aset的角度来看,我们收集、清洗并分类了各种古代中文资料,组建了迄今为止最大的古代中文资源。从模型的角度来看,我们设计了古代中文翻译 oriented的Erya训练方法。我们设计了两个联合工作任务:词组对应替换(DAS)和双重遮盲语言模型(DMLM)。从评估角度来看,我们建立了评估古代中文翻译质量的标准套件,并评估了不同enario下的古代中文翻译能力。我们的模型在五个领域中表现出了很好的零MQA表现,与GPT-3.5模型相比,我们的模型的BLEU得分高于12.0。后续的微调更显示了Erya模型的提升性能,BLEU得分增加了6.2。我们在https://github.com/RUCAIBox/Erya上发布了所有资源。

Boosting Adverse Drug Event Normalization on Social Media: General-Purpose Model Initialization and Biomedical Semantic Text Similarity Benefit Zero-Shot Linking in Informal Contexts

  • paper_url: http://arxiv.org/abs/2308.00157
  • repo_url: None
  • paper_authors: François Remy, Simone Scaboro, Beatrice Portelli
  • for: 这篇论文的目的是提出一种新的社交媒体上的不良药物事件Normalization方法,并通过使用通用模型初始化和semantic-text-similarity精细调整(STS)来改善表现。
  • methods: 本研究使用了 BioLORD 的通用模型初始化和 STS 的semantic-text-similarity精细调整,并在多个社交媒体数据集上进行了实验评估。
  • results: 本研究的实验结果显示,使用我们提出的方法可以在社交媒体上实现state-of-the-art的性能,并且在所有测试数据集上都表现出良好的结果。
    Abstract Biomedical entity linking, also known as biomedical concept normalization, has recently witnessed the rise to prominence of zero-shot contrastive models. However, the pre-training material used for these models has, until now, largely consisted of specialist biomedical content such as MIMIC-III clinical notes (Johnson et al., 2016) and PubMed papers (Sayers et al., 2021; Gao et al., 2020). While the resulting in-domain models have shown promising results for many biomedical tasks, adverse drug event normalization on social media texts has so far remained challenging for them (Portelli et al., 2022). In this paper, we propose a new approach for adverse drug event normalization on social media relying on general-purpose model initialization via BioLORD (Remy et al., 2022) and a semantic-text-similarity fine-tuning named STS. Our experimental results on several social media datasets demonstrate the effectiveness of our proposed approach, by achieving state-of-the-art performance. Based on its strong performance across all the tested datasets, we believe this work could emerge as a turning point for the task of adverse drug event normalization on social media and has the potential to serve as a benchmark for future research in the field.
    摘要 生物医学实体链接,也称为生物医学概念 normalization,最近受到零批示对比模型的普及。然而,这些模型的预训练材料, Until now, largely consisted of specialist biomedical content such as MIMIC-III clinical notes (Johnson et al., 2016) and PubMed papers (Sayers et al., 2021; Gao et al., 2020). While the resulting in-domain models have shown promising results for many biomedical tasks, adverse drug event normalization on social media texts has so far remained challenging for them (Portelli et al., 2022).在这篇论文中,我们提出了一种新的方法,基于通用模型初始化via BioLORD (Remy et al., 2022)和semantic-text-similarity fine-tuning named STS。我们的实验结果表明,我们的提议方法在多个社交媒体数据集上具有最佳性能。基于所有测试数据集的优秀表现,我们认为这项工作可能会成为社交媒体上药品副作用normalization任务的转折点,并且具有未来研究领域的标准 referential。

Virtual Prompt Injection for Instruction-Tuned Large Language Models

  • paper_url: http://arxiv.org/abs/2307.16888
  • repo_url: None
  • paper_authors: Jun Yan, Vikas Yadav, Shiyang Li, Lichang Chen, Zheng Tang, Hai Wang, Vijay Srinivasan, Xiang Ren, Hongxia Jin
  • for: 这个论文是为了漏洞抢夺大语言模型(LLM)的目的而写的。
  • methods: 这篇论文使用的方法是投入虚拟提示,以控制 LLM 的行为。
  • results: 研究发现,只需投入 0.1% 的恶意示例,就可以使 LLM 对有关杰布·纽伦(Joe Biden)的查询返回负面的结果。
    Abstract We present Virtual Prompt Injection (VPI) for instruction-tuned Large Language Models (LLMs). VPI allows an attacker-specified virtual prompt to steer the model behavior under specific trigger scenario without any explicit injection in model input. For instance, if an LLM is compromised with the virtual prompt "Describe Joe Biden negatively." for Joe Biden-related instructions, then any service deploying this model will propagate biased views when handling user queries related to Joe Biden. VPI is especially harmful for two primary reasons. Firstly, the attacker can take fine-grained control over LLM behaviors by defining various virtual prompts, exploiting LLMs' proficiency in following instructions. Secondly, this control is achieved without any interaction from the attacker while the model is in service, leading to persistent attack. To demonstrate the threat, we propose a simple method for performing VPI by poisoning the model's instruction tuning data. We find that our proposed method is highly effective in steering the LLM with VPI. For example, by injecting only 52 poisoned examples (0.1% of the training data size) into the instruction tuning data, the percentage of negative responses given by the trained model on Joe Biden-related queries change from 0% to 40%. We thus highlight the necessity of ensuring the integrity of the instruction-tuning data as little poisoned data can cause stealthy and persistent harm to the deployed model. We further explore the possible defenses and identify data filtering as an effective way to defend against the poisoning attacks. Our project page is available at https://poison-llm.github.io.
    摘要 我团队现在提出了虚拟提示插入(VPI)技术,用于对特定触发情况下大语言模型(LLM)的行为进行控制。VPI允许攻击者在没有显式输入的情况下,通过定制虚拟提示来操纵模型的行为。例如,如果一个LLM被恶意攻击者定制为“描述约瑟·贝登纳成分负面”,那么任何使用这个模型的服务都会在用户查询相关的约瑟·贝登纳问题时传播偏见。VPI具有两点优势:首先,攻击者可以通过定制虚拟提示来细化控制LLM的行为,利用LLM的遵从指令的能力。其次,这种控制是在服务模型时进行,导致 persistente攻击。为了证明这种威胁,我们提出了一种简单的VPI实现方法,利用模型的指令调整数据中毒。我们发现,只需插入52个恶意示例(数据量的0.1%),可以让训练后的模型对约瑟·贝登纳相关的查询发送40%的负面回答。我们因此强调了保持模型的指令调整数据的完整性,因为只需少量毒垢数据就可以让模型发生隐藏和持续的害。我们还探索了可能的防御策略,并确定了数据筛选是一种有效的防御方法。关于我们的项目,请参考https://poison-llm.github.io。

HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution

  • paper_url: http://arxiv.org/abs/2307.16883
  • repo_url: https://github.com/project-miracl/hagrid
  • paper_authors: Ehsan Kamalloo, Aref Jafari, Xinyu Zhang, Nandan Thakur, Jimmy Lin
  • for: 这项研究的目的是开发一个可以生成搜索结果的自然语言搜索引擎,以提高搜索结果的可信度和可追溯性。
  • methods: 这项研究使用了人类和大语言模型(LLM)的协作方式,首先使用LLM自动生成了带有参考文献的解释,然后询问人类标注者评估这些解释的信息性和可追溯性。
  • results: 该研究提出了一个名为HAGRID的新数据集,用于建立可以生成搜索结果的信息搜索模型,该模型可以生成候选引用和带有参考文献的解释。与之前的研究不同的是,该数据集基于公开 accessible的数据集MIRACL,并且通过人类和LLM的协作来构建。
    Abstract The rise of large language models (LLMs) had a transformative impact on search, ushering in a new era of search engines that are capable of generating search results in natural language text, imbued with citations for supporting sources. Building generative information-seeking models demands openly accessible datasets, which currently remain lacking. In this paper, we introduce a new dataset, HAGRID (Human-in-the-loop Attributable Generative Retrieval for Information-seeking Dataset) for building end-to-end generative information-seeking models that are capable of retrieving candidate quotes and generating attributed explanations. Unlike recent efforts that focus on human evaluation of black-box proprietary search engines, we built our dataset atop the English subset of MIRACL, a publicly available information retrieval dataset. HAGRID is constructed based on human and LLM collaboration. We first automatically collect attributed explanations that follow an in-context citation style using an LLM, i.e. GPT-3.5. Next, we ask human annotators to evaluate the LLM explanations based on two criteria: informativeness and attributability. HAGRID serves as a catalyst for the development of information-seeking models with better attribution capabilities.
    摘要 大型自然语言模型(LLM)的出现对搜索产生了转变性影响,并且开启了一新的搜索引擎时代,这些搜索引擎可以生成搜索结果为自然语言文本,并且具有引用来源的参考。建立生成信息搜索模型需要公开 accessible 的数据集,现在还缺乏这些数据集。在这篇论文中,我们介绍了一个新的数据集,即 HAGRID(人类在Loop Attributable Generative Retrieval for Information-seeking Dataset),用于建立终端生成信息搜索模型,这些模型可以搜索候选引用和生成 attributed 解释。与之前的努力不同,我们基于英语subset 的 MIRACL 公共可用信息检索数据集构建了 HAGRID。HAGRID 基于人类和 LLM 的合作,我们首先使用 GPT-3.5 自然语言模型自动收集 attributed 解释,然后请求人工标注员根据两个 criterion 评估 LLM 解释:信息性和可识别性。HAGRID 作为生成信息搜索模型更好的层次结构的 catalyst。

Defense of Adversarial Ranking Attack in Text Retrieval: Benchmark and Baseline via Detection

  • paper_url: http://arxiv.org/abs/2307.16816
  • repo_url: None
  • paper_authors: Xuanang Chen, Ben He, Le Sun, Yingfei Sun
  • for: 本研究旨在提供一个用于攻击检测的NRMs benchmark数据集,并 introduce two types of攻击文档检测任务。
  • methods: 本研究使用了多种检测基准,包括查看Spamicity、混乱度和语言可接受度,并使用supervised分类器。
  • results: 实验结果显示,使用supervised分类器可以有效地 Mitigate known attacks,但是它在未seen攻击下表现不佳。此外,该分类器应避免使用查询文本,以避免学习相关性。
    Abstract Neural ranking models (NRMs) have undergone significant development and have become integral components of information retrieval (IR) systems. Unfortunately, recent research has unveiled the vulnerability of NRMs to adversarial document manipulations, potentially exploited by malicious search engine optimization practitioners. While progress in adversarial attack strategies aids in identifying the potential weaknesses of NRMs before their deployment, the defensive measures against such attacks, like the detection of adversarial documents, remain inadequately explored. To mitigate this gap, this paper establishes a benchmark dataset to facilitate the investigation of adversarial ranking defense and introduces two types of detection tasks for adversarial documents. A comprehensive investigation of the performance of several detection baselines is conducted, which involve examining the spamicity, perplexity, and linguistic acceptability, and utilizing supervised classifiers. Experimental results demonstrate that a supervised classifier can effectively mitigate known attacks, but it performs poorly against unseen attacks. Furthermore, such classifier should avoid using query text to prevent learning the classification on relevance, as it might lead to the inadvertent discarding of relevant documents.
    摘要 neur Ranking 模型 (NRM) 在信息检索 (IR) 系统中得到了广泛应用,但最近的研究发现,NRM 受到了恶意文档修改的威胁,可能会被黑客搜索优化师利用。虽然对 adversarial 攻击策略的进步帮助了在NRM 部署之前发现其潜在弱点,但对于这些攻击的防御措施,如检测恶意文档,还需要进一步探索。为此,本文提供了一个 Benchmark 数据集,并引入了两种检测任务 для恶意文档。我们进行了全面的检测基eline的调查,包括查看 Spamicity、perplexity 和语言可接受性,并使用超vised 分类器。实验结果表明,一个supervised 分类器可以有效地 Mitigate 已知攻击,但它在未知攻击下表现糟糕。此外,这个分类器应该避免使用查询文本,以避免学习对 relevance 的分类。

DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures

  • paper_url: http://arxiv.org/abs/2307.16811
  • repo_url: https://github.com/turing-online-safety-codebase/dodo-learning
  • paper_authors: Hannah Rose Kirk, Angus R. Williams, Liam Burke, Yi-Ling Chung, Ivan Debono, Pica Johansson, Francesca Stevens, Jonathan Bright, Scott A. Hale
  • for: 这个研究旨在开发一个更普遍化的网络霸凌识别系统,以应对公众人物在社交媒体上 Receiving 过度的辱骂和攻击,并且这种霸凌可能会对公众人物的活跃参与产生负面影响。
  • methods: 本研究使用了自动化系统来识别网络霸凌,并且使用了一个 Novel DODO 数据集,包含 28,000 个标签的 tweet,其中有四个 Domain-demographic 组合。研究人员使用了语言模型进行 tweet 的分类,并且进行了精确的评估和调整,以确保模型能够在不同的领域和人口数据上表现出色。
  • results: 研究人员发现了以下四个关键结果:(i) 小量多样的数据可以帮助模型获得更好的普遍化和适应能力; (ii) 模型在不同的人口数据上的转移能力比较强,但是模型在跨领域数据上的转移能力更高; (iii) 一些人群对普遍化的贡献比较大; (iv) 数据的相似度是转移能力的讯号。
    Abstract Public figures receive a disproportionate amount of abuse on social media, impacting their active participation in public life. Automated systems can identify abuse at scale but labelling training data is expensive, complex and potentially harmful. So, it is desirable that systems are efficient and generalisable, handling both shared and specific aspects of online abuse. We explore the dynamics of cross-group text classification in order to understand how well classifiers trained on one domain or demographic can transfer to others, with a view to building more generalisable abuse classifiers. We fine-tune language models to classify tweets targeted at public figures across DOmains (sport and politics) and DemOgraphics (women and men) using our novel DODO dataset, containing 28,000 labelled entries, split equally across four domain-demographic pairs. We find that (i) small amounts of diverse data are hugely beneficial to generalisation and model adaptation; (ii) models transfer more easily across demographics but models trained on cross-domain data are more generalisable; (iii) some groups contribute more to generalisability than others; and (iv) dataset similarity is a signal of transferability.
    摘要 公众人物在社交媒体上收到过量的辱骂,影响其在公众生活中的活跃参与。自动化系统可以在大规模上识别辱骂,但标注训练数据是Expensive,复杂和 potentially harmful。因此,我们希望系统能够高效、普适,可以处理多个领域和特定方面的辱骂。我们研究跨群体文本分类的dinamics,以了解分类器在其他领域或人口类型上如何转移,以建立更普适的辱骂分类器。我们精细调整语言模型,用我们的novel DODO dataset进行分类,该dataset包含28,000个标注的条目,分别分配到四个领域-人口对的四个组合。我们发现了以下结论:(i)小量多样的数据对泛化和模型适应具有巨大的 beneficial effect;(ii)模型在人口类型之间更容易转移,但是模型在跨领域数据上进行了更好的泛化;(iii)一些组别对泛化具有更大的贡献;(iv)数据集的相似性是转移性的信号。

Changes in Policy Preferences in German Tweets during the COVID Pandemic

  • paper_url: http://arxiv.org/abs/2308.04444
  • repo_url: None
  • paper_authors: Felix Biessmann
  • for: 这个研究用于自动抽取在社交媒体上的政策偏好。
  • methods: 研究使用了一种文本分类模型,并使用了一个新的 tweet 数据集,以EXTRACT political preferences in a German Twitter corpus ranging from 2019 to 2022。
  • results: 研究发现,在 COVID 大流行期间,人们对政策表达增加了。通过使用一个确立的政策偏好分类法,分析了细腻的政治观点,并发现政策表达增加的主要类别是 pro-welfare、pro-education 和 pro-governmental administration efficiency。
    Abstract Online social media have become an important forum for exchanging political opinions. In response to COVID measures citizens expressed their policy preferences directly on these platforms. Quantifying political preferences in online social media remains challenging: The vast amount of content requires scalable automated extraction of political preferences -- however fine grained political preference extraction is difficult with current machine learning (ML) technology, due to the lack of data sets. Here we present a novel data set of tweets with fine grained political preference annotations. A text classification model trained on this data is used to extract policy preferences in a German Twitter corpus ranging from 2019 to 2022. Our results indicate that in response to the COVID pandemic, expression of political opinions increased. Using a well established taxonomy of policy preferences we analyse fine grained political views and highlight changes in distinct political categories. These analyses suggest that the increase in policy preference expression is dominated by the categories pro-welfare, pro-education and pro-governmental administration efficiency. All training data and code used in this study are made publicly available to encourage other researchers to further improve automated policy preference extraction methods. We hope that our findings contribute to a better understanding of political statements in online social media and to a better assessment of how COVID measures impact political preferences.
    摘要 在线社交媒体已成为政治意见交换的重要平台。响应COVID措施,公民直接在这些平台上表达了政策偏好。量化在线社交媒体中政治偏好的问题具有挑战性:巨量的内容需要扩展自动EXTRACT政治偏好,但现有机器学习(ML)技术无法准确地分类细化政治偏好。我们现在提供了一个新的推文数据集,其中每个推文均有细化政治偏好的注释。我们使用这些数据训练文本分类模型,并在2019-2022年德国推文集中提取政策偏好。我们的结果表明,COVID大流行期间,表达政治意见的人数增加。使用已有的政策偏好分类法,我们分析了细化的政治观点,并发现COVID措施的影响。我们的发现可能会促进自动政策偏好抽取方法的进一步改进,并为政策分析和评估提供更好的基础。我们的研究结果也可能会帮助我们更好地理解在线社交媒体中的政治声明,并为COVID措施的政治影响提供更好的评估。