results: Experimental results show that using sentence-level evaluation metrics to score entire paragraphs is as effective as using metrics designed specifically for the paragraph level.

Abstract
As research on machine translation moves to translating text beyond the sentence level, it remains unclear how effective automatic evaluation metrics are at scoring longer translations. In this work, we first propose a method for creating paragraph-level data for training and meta-evaluating metrics from existing sentence-level data. Then, we use these new datasets to benchmark existing sentence-level metrics as well as train learned metrics at the paragraph level. Interestingly, our experimental results demonstrate that using sentence-level metrics to score entire paragraphs is as effective as using a metric designed to work at the paragraph level. We speculate this result can be attributed to properties of the task of reference-based evaluation as well as limitations of our datasets with respect to capturing all types of phenomena that occur in paragraph-level translations.
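The data-creation step described above, deriving paragraph-level examples from existing sentence-level data, can be sketched as concatenating consecutive sentences and aggregating their quality scores. This is a minimal illustration only; the grouping size and score averaging are assumptions, not details taken from the paper:

```python
def make_paragraphs(hypotheses, references, scores, k=3):
    """Build paragraph-level examples from sentence-level data.

    Concatenates k consecutive hypothesis/reference sentences and
    averages their per-sentence quality scores into one
    paragraph-level score.
    """
    paragraphs = []
    for i in range(0, len(hypotheses) - k + 1, k):
        paragraphs.append({
            "hypothesis": " ".join(hypotheses[i:i + k]),
            "reference": " ".join(references[i:i + k]),
            "score": sum(scores[i:i + k]) / k,
        })
    return paragraphs
```

A sentence-level metric can then be run unchanged on the concatenated `hypothesis`/`reference` strings, which is the comparison the abstract describes.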
results: Experimental results show that the M2M100 model achieves high BLEU scores on both the original and the original+synthetic data. The publicly available bitext dataset can be used for research purposes.

Abstract
In Africa, and the world at large, there is an increasing focus on developing Neural Machine Translation (NMT) systems to overcome language barriers. NMT for low-resource languages is particularly compelling, as it involves learning with limited labelled data. However, obtaining a well-aligned parallel corpus for low-resource languages can be challenging. The disparity between the technological advancement of a few global languages and the lack of research on NMT for local languages in Chad is striking. End-to-end NMT trials on low-resource Chadian languages have not been attempted. Additionally, unlike for some other African languages, there is a dearth of online, well-structured data gathering for research in Natural Language Processing. However, a guided approach to data gathering can produce bitext data pairing many Chadian languages with well-known languages that have ample data. In this project, we created the first sba-Fr Dataset, a corpus of Ngambay-to-French translations, and fine-tuned three pre-trained models using this dataset. Our experiments show that the M2M100 model outperforms the other models, with high BLEU scores on both the original and the original+synthetic data. The publicly available bitext dataset can be used for research purposes.
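BLEU, the score the experiments above report, can be illustrated with a minimal reimplementation. This is a sketch for intuition only: published results would use a standard implementation such as sacrebleu, and the add-one smoothing here is an assumption:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(hypothesis, reference, max_n=4):
    """Geometric mean of clipped n-gram precisions times a brevity
    penalty, with add-one smoothing so short sentences score > 0."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_ng, ref_ng = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((hyp_ng & ref_ng).values())  # counts clipped by reference
        log_prec += math.log((overlap + 1) / (sum(hyp_ng.values()) + 1))
    brevity = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return brevity * math.exp(log_prec / max_n)
```

A higher score means more n-gram overlap with the reference; comparing scores on the original versus original+synthetic training data is the experiment the abstract describes.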
Prompting a Large Language Model to Generate Diverse Motivational Messages: A Comparison with Human-Written Messages
results: The study found that using a crowdsourcing task pipeline as the LLM prompt led GPT-4 to generate more diverse messages than two baseline prompts. The paper also discusses comparisons between messages produced by human writers and by the LLM.

Abstract
Large language models (LLMs) are increasingly capable and prevalent, and can be used to produce creative content. The quality of content is influenced by the prompt used, with more specific prompts that incorporate examples generally producing better results. Building on this, instructions written for crowdsourcing tasks (which are specific and include examples to guide workers) could serve as effective LLM prompts. To explore this, we used a previous crowdsourcing pipeline that gave examples to people to help them generate a collectively diverse corpus of motivational messages. We then used this same pipeline to generate messages using GPT-4, and compared the collective diversity of messages from: (1) crowd-writers, (2) GPT-4 using the pipeline, and (3 & 4) two baseline GPT-4 prompts. We found that the LLM prompts using the crowdsourcing pipeline caused GPT-4 to produce more diverse messages than the two baseline prompts. We also discuss implications from messages generated by both human writers and LLMs.
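The abstract does not specify how collective diversity is measured; one common proxy is distinct-n, the fraction of unique n-grams across the whole message corpus. The sketch below is a hypothetical illustration of such a measure, not the paper's metric:

```python
def distinct_n(messages, n=2):
    """Collective diversity of a message corpus as the ratio of
    unique n-grams to total n-grams across all messages (distinct-n).
    Returns a value in [0, 1]; higher means more diverse."""
    total, unique = 0, set()
    for msg in messages:
        tokens = msg.lower().split()
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(grams)
        unique.update(grams)
    return len(unique) / total if total else 0.0
```

Under such a measure, a corpus where GPT-4 repeats similar phrasings scores lower than one where crowd-style example-driven prompts elicit varied messages.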
ARTIST: ARTificial Intelligence for Simplified Text
results: The study identified the strengths and limitations of automatic text simplification, including the challenges of handling cultural and commonsense knowledge. These results represent a first exploration of Dutch text simplification and shed light on future research and practice.

Abstract
Complex text is a major barrier for many citizens when accessing public information and knowledge. Though often done manually, Text Simplification is a key Natural Language Processing task that aims to reduce the linguistic complexity of a text while preserving its original meaning. Recent advances in Generative Artificial Intelligence (AI) have enabled automatic text simplification at both the lexical and syntactic levels. However, as applications often focus on English, little is understood about the effectiveness of Generative AI techniques on low-resource languages such as Dutch. For this reason, we carry out empirical studies to understand the benefits and limitations of applying generative technologies to text simplification, and provide the following outcomes: 1) the design and implementation of a configurable text simplification pipeline that orchestrates state-of-the-art generative text simplification models, domain and reader adaptation, and visualisation modules; 2) insights and lessons learned, showing the strengths of automatic text simplification while exposing the challenges in handling cultural and commonsense knowledge. These outcomes represent a first step in the exploration of Dutch text simplification and shed light on future endeavours for both research and practice.
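The configurable pipeline described in outcome 1), orchestrating simplification models, domain/reader adaptation, and visualisation modules, might be structured as a chain of text-to-text stages. This is a hypothetical sketch; the class, stage names, and toy lexical stage are assumptions, not the authors' implementation:

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A stage is any function mapping text to text: a generative
# simplification model, a domain adapter, a reader-level adapter, etc.
Stage = Callable[[str], str]

@dataclass
class SimplificationPipeline:
    """Configurable chain of text-to-text stages, applied in order."""
    stages: List[Stage] = field(default_factory=list)

    def add(self, stage: Stage) -> "SimplificationPipeline":
        self.stages.append(stage)
        return self  # allow fluent chaining when configuring

    def run(self, text: str) -> str:
        for stage in self.stages:
            text = stage(text)
        return text
```

Swapping stages in and out (e.g. a different model for a different reader group) is what makes such a pipeline "configurable"; visualisation could be attached as a final stage that renders rather than rewrites.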