results: The study establishes new connections between two different research areas: three-way decisions and the theory of evaluative linguistic expressions.
Abstract
We propose a linguistic interpretation of three-way decisions, where the regions of acceptance, rejection, and non-commitment are constructed by using the so-called evaluative linguistic expressions, which are expressions of natural language such as small, medium, very short, quite roughly strong, extremely good, etc. Our results highlight new connections between two different research areas: three-way decisions and the theory of evaluative linguistic expressions.
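As a concrete (and deliberately simplified) illustration of how evaluative expressions can induce the three regions, the sketch below uses trapezoidal membership functions for hypothetical expressions "small", "medium", and "big" on a normalized scale and thresholds them to obtain acceptance, rejection, and non-commitment; the shapes and the 0.5 threshold are assumptions, not the paper's construction.

```python
# Illustrative sketch only: trapezoidal membership functions for evaluative
# expressions on [0, 1], and three-way regions derived by thresholding.
# The expressions, shapes, and the 0.5 threshold are assumptions.

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 before a, 1 on [b, c], 0 after d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Hypothetical evaluative expressions on a normalized evaluation scale.
small  = lambda x: trapezoid(x, -0.1, 0.0, 0.2, 0.4)
medium = lambda x: trapezoid(x, 0.2, 0.4, 0.6, 0.8)
big    = lambda x: trapezoid(x, 0.6, 0.8, 1.0, 1.1)

def three_way_region(score, accept=big, reject=small, threshold=0.5):
    """Assign an object to acceptance, rejection, or non-commitment."""
    if accept(score) >= threshold:
        return "acceptance"
    if reject(score) >= threshold:
        return "rejection"
    return "non-commitment"

for s in (0.1, 0.5, 0.9):
    print(s, three_way_region(s))   # -> rejection, non-commitment, acceptance
```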
Opinion mining using Double Channel CNN for Recommender System
results: The method achieves 91.6% accuracy in assessing users' opinions about products, a significant improvement over previous aspect-based approaches.
Abstract
Much unstructured data has been produced with the growth of the Internet and social media. A significant volume of textual data includes users' opinions about products in online stores and social media. By exploring and categorizing these opinions, helpful information can be acquired, including customer satisfaction, user feedback about a particular event, predicting the sale of a specific product, and other similar cases. In this paper, we present an approach for sentiment analysis with a deep learning model and use it to recommend products. A two-channel convolutional neural network model is used for opinion mining; it has five layers and extracts essential features from the data. We increased the number of comments and balanced the data by applying the SMOTE algorithm to the initial dataset. Then we proceed to cluster the aspects. We also assign a weight to each cluster using tensor decomposition algorithms, which improves the recommender system's performance. Our proposed method reaches 91.6% accuracy, a significant improvement over previous aspect-based approaches.
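A minimal sketch of the two main ingredients described above, SMOTE-based balancing and a two-channel convolutional text classifier, is given below. The layer sizes, kernel widths, and the use of two convolutional branches over a shared embedding are assumptions for illustration, not the authors' exact five-layer architecture.

```python
# Illustrative sketch, not the authors' exact model: SMOTE balancing of
# vectorized comments plus a minimal two-channel 1-D CNN over token embeddings.
import numpy as np
import torch
import torch.nn as nn
from imblearn.over_sampling import SMOTE

class TwoChannelCNN(nn.Module):
    def __init__(self, vocab_size=20000, emb_dim=100, n_filters=64, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Two parallel channels with different receptive fields (assumed sizes).
        self.conv3 = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.conv5 = nn.Conv1d(emb_dim, n_filters, kernel_size=5, padding=2)
        self.fc = nn.Linear(2 * n_filters, n_classes)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, emb_dim, seq_len)
        c3 = torch.relu(self.conv3(x)).max(dim=2).values
        c5 = torch.relu(self.conv5(x)).max(dim=2).values
        return self.fc(torch.cat([c3, c5], dim=1))

model = TwoChannelCNN()
print(model(torch.randint(0, 20000, (2, 40))).shape)   # torch.Size([2, 2])

# SMOTE works on fixed-length numeric vectors, so it is applied to vectorized
# comments (e.g., TF-IDF or averaged embeddings). Toy imbalanced data here:
X = np.random.default_rng(0).normal(size=(100, 50))
y = np.array([0] * 90 + [1] * 10)
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X, y)
print(X_bal.shape, np.bincount(y_bal))   # classes balanced to 90/90
```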
Political Sentiment Analysis of Persian Tweets Using CNN-LSTM Model
paper_authors: Mohammad Dehghani, Zahra Yazdanparast
for: This study aims to analyze the sentiment of Persian-language political tweets using machine learning and deep learning models.
methods: Words are represented with Bag of Words and ParsBERT; Gaussian Naive Bayes, Gradient Boosting, Logistic Regression, Decision Trees, Random Forests, and a combined CNN-LSTM model are applied to classify tweet polarity.
results: Deep learning with ParsBERT embeddings analyzes the sentiment of Persian political tweets better than the machine learning models; the CNN-LSTM model achieves 89% classification accuracy on the first dataset and 71% on the second.
Abstract
Sentiment analysis is the process of identifying and categorizing people's emotions or opinions regarding various topics. The analysis of Twitter sentiment has become an increasingly popular topic in recent years. In this paper, we present several machine learning models and a deep learning model to analyze the sentiment of Persian political tweets. Our analysis was conducted using Bag of Words and ParsBERT for word representation. We applied Gaussian Naive Bayes, Gradient Boosting, Logistic Regression, Decision Trees, Random Forests, as well as a combination of CNN and LSTM, to classify the polarities of tweets. The results of this study indicate that deep learning with ParsBERT embeddings performs better than machine learning. The CNN-LSTM model had the highest classification accuracy, with 89 percent on the first dataset (three classes) and 71 percent on the second dataset (seven classes). Given the complexity of Persian, achieving this level of accuracy was a difficult task.
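The sketch below shows one plausible shape of the CNN-LSTM classifier described above; the hyperparameters are assumed, and in the stronger configuration ParsBERT embeddings would replace the plain embedding layer.

```python
# Minimal sketch of a CNN-LSTM polarity classifier (hyperparameters assumed).
import torch
import torch.nn as nn

class CNNLSTMClassifier(nn.Module):
    def __init__(self, vocab_size=30000, emb_dim=128, n_filters=64,
                 hidden=64, n_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(n_filters, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, token_ids):                     # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)     # (batch, emb_dim, seq_len)
        x = torch.relu(self.conv(x)).transpose(1, 2)  # (batch, seq_len, n_filters)
        _, (h_n, _) = self.lstm(x)                    # h_n: (1, batch, hidden)
        return self.fc(h_n[-1])                       # logits over polarity classes

model = CNNLSTMClassifier()
logits = model(torch.randint(0, 30000, (4, 50)))      # 4 dummy tweets of length 50
print(logits.shape)                                   # torch.Size([4, 3])
```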
CPET: Effective Parameter-Efficient Tuning for Compressed Large Language Models
methods: The paper evaluates the impact of mainstream LLM compression techniques on PET performance and introduces knowledge inheritance and recovery strategies to compensate for the knowledge loss caused by compression.
results: Experiments show that, owing to the restoring strategies, task-specific PET modules paired with a compressed LLM under CPET achieve performance comparable to PET modules paired with the original, uncompressed LLM, and outperform directly applying vanilla PET methods to the compressed LLM.
Abstract
Parameter-efficient tuning (PET) has been widely explored in recent years because it tunes much fewer parameters (PET modules) than full-parameter fine-tuning (FT) while still stimulating sufficient knowledge from large language models (LLMs) for downstream tasks. Moreover, when PET is employed to serve multiple tasks, different task-specific PET modules can be built on a frozen LLM, avoiding redundant LLM deployments. Although PET significantly reduces the cost of tuning and deploying LLMs, its inference still suffers from the computational bottleneck of LLMs. To address the above issue, we propose an effective PET framework based on compressed LLMs, named "CPET". In CPET, we evaluate the impact of mainstream LLM compression techniques on PET performance and then introduce knowledge inheritance and recovery strategies to restore the knowledge loss caused by these compression techniques. Our experimental results demonstrate that, owing to the restoring strategies of CPET, collaborating task-specific PET modules with a compressed LLM can achieve comparable performance to collaborating PET modules with the original version of the compressed LLM and outperform directly applying vanilla PET methods to the compressed LLM.
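The abstract does not detail the knowledge inheritance and recovery strategies, so the sketch below only illustrates the general setting CPET builds on: a frozen layer from a (possibly compressed) backbone with a small trainable low-rank PET module attached.

```python
# Generic sketch of the parameter-efficient tuning setting (not CPET's specific
# knowledge-inheritance/recovery strategies): a frozen linear layer from a
# compressed backbone plus a trainable low-rank (LoRA-style) PET module.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, frozen_linear: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = frozen_linear
        for p in self.base.parameters():        # backbone stays frozen
            p.requires_grad = False
        in_f, out_f = frozen_linear.in_features, frozen_linear.out_features
        self.lora_a = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_f, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scale

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)   # only the low-rank PET parameters are trainable
```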
Think-on-Graph: Deep and Responsible Reasoning of Large Language Model with Knowledge Graph
methods: We propose the Think-on-Graph (ToG) framework, which leverages knowledge graphs to enhance LLMs' capacity for deep and responsible reasoning. ToG identifies the entities relevant to a question and retrieves related triples from an external knowledge base, alternating between reasoning and retrieval in an iterative loop.
results: Experiments on complex multi-hop reasoning question-answering tasks show that ToG effectively addresses the above limitations of LLMs without incurring additional training cost.
Abstract
Large language models (LLMs) have made significant strides in various tasks, yet they often struggle with complex reasoning and exhibit poor performance in scenarios where knowledge traceability, timeliness, and accuracy are crucial. To address these limitations, we present Think-on-Graph (ToG), a novel framework that leverages knowledge graphs to enhance LLMs' ability for deep and responsible reasoning. By employing ToG, we can identify entities relevant to a given question and conduct exploration and reasoning to retrieve related triples from an external knowledge database. This iterative procedure generates multiple reasoning pathways consisting of sequentially connected triplets until sufficient information is gathered to answer the question or the maximum depth is reached. Through experiments on complex multi-hop reasoning question-answering tasks, we demonstrate that ToG outperforms existing methods, effectively addressing the aforementioned limitations of LLMs without incurring additional training costs.
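A much-simplified sketch of the explore-and-reason loop is shown below on a toy in-memory knowledge graph with made-up facts; in the real framework an LLM would choose which entities and relations to expand and decide when enough evidence has been gathered.

```python
# Simplified sketch of iterative triple retrieval on a toy knowledge graph.
# Expansion here is exhaustive up to a maximum depth; in ToG, an LLM would
# score candidate expansions and terminate once the question is answerable.

TOY_KG = {  # head entity -> list of (relation, tail entity); hypothetical facts
    "Marie Curie": [("born_in", "Warsaw"), ("field", "Physics")],
    "Warsaw": [("capital_of", "Poland")],
    "Poland": [("continent", "Europe")],
}

def explore(question_entities, kg, max_depth=2):
    """Collect reasoning paths (sequences of triples) up to max_depth hops."""
    paths = [[("start", "entity", e)] for e in question_entities]
    for _ in range(max_depth):
        new_paths = []
        for path in paths:
            tail = path[-1][2]
            neighbours = kg.get(tail, [])
            if not neighbours:
                new_paths.append(path)          # keep paths that cannot grow
            for rel, obj in neighbours:
                new_paths.append(path + [(tail, rel, obj)])
        paths = new_paths
    return paths

for p in explore(["Marie Curie"], TOY_KG):
    print(" -> ".join(f"{h} {r} {t}" for h, r, t in p[1:]))
# e.g. "Marie Curie born_in Warsaw -> Warsaw capital_of Poland"
```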
Single and Multi-Speaker Cloned Voice Detection: From Perceptual to Learned Features
results: All three approaches perform well whether trained on a single speaker's voice or on multiple speakers' voices, and they show reasonable robustness to adversarial laundering.
Abstract
Synthetic-voice cloning technologies have seen significant advances in recent years, giving rise to a range of potential harms. From small- and large-scale financial fraud to disinformation campaigns, the need for reliable methods to differentiate real and synthesized voices is imperative. We describe three techniques for differentiating a real from a cloned voice designed to impersonate a specific person. The three approaches differ in their feature-extraction stage, ranging from low-dimensional perceptual features offering high interpretability but lower accuracy, to generic spectral features, to end-to-end learned features offering less interpretability but higher accuracy. We show the efficacy of these approaches when trained on a single speaker's voice and when trained on multiple voices. The learned features consistently yield an equal error rate between $0\%$ and $4\%$ and are reasonably robust to adversarial laundering.
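The sketch below illustrates only the generic-spectral-features variant and the equal-error-rate metric mentioned above; the perceptual and end-to-end learned approaches are not reproduced, and all paths and labels are placeholders.

```python
# Sketch of the generic-spectral-features variant: MFCC statistics per clip,
# a simple classifier, and the equal-error-rate (EER) metric. Placeholder data.
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.metrics import roc_curve

def clip_features(path, sr=16000, n_mfcc=20):
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])  # 2*n_mfcc dims

def equal_error_rate(labels, scores):
    fpr, tpr, _ = roc_curve(labels, scores)   # labels: 1 = cloned, 0 = real
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))     # point where FPR ~= FNR
    return (fpr[idx] + fnr[idx]) / 2.0

# Usage (placeholders):
# X = np.stack([clip_features(p) for p in wav_paths])
# clf = SVC(probability=True).fit(X_train, y_train)
# eer = equal_error_rate(y_test, clf.predict_proba(X_test)[:, 1])
```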
Towards Generalizable Detection of Urgency of Discussion Forum Posts
paper_authors: Valdemar Švábenský, Ryan S. Baker, Andrés Zambrano, Yishan Zou, Stefan Slater
for: This paper aims to help instructors in online courses, such as MOOCs, better support student learning by automatically determining the urgency of forum posts.
methods: The authors use machine learning techniques to build predictive models that determine the urgency of forum posts on a 7-point scale. They train and cross-validate several models on an original data set of 3,503 posts from MOOCs at the University of Pennsylvania, and test their performance on a separate data set of 29,604 posts from MOOCs at Stanford University.
results: The best-performing model was a support vector regressor trained on the Universal Sentence Encoder embeddings of the posts, achieving an RMSE of 1.1 on the training set and 1.4 on the test set. This suggests that the model is effective in predicting the urgency of forum posts and could be used to help instructors focus their time more effectively and better support student learning.
Abstract
Students who take an online course, such as a MOOC, use the course's discussion forum to ask questions or reach out to instructors when encountering an issue. However, reading and responding to students' questions is difficult to scale because of the time needed to consider each message. As a result, critical issues may be left unresolved, and students may lose the motivation to continue in the course. To help address this problem, we build predictive models that automatically determine the urgency of each forum post, so that these posts can be brought to instructors' attention. This paper goes beyond previous work by predicting not just a binary decision cut-off but a post's level of urgency on a 7-point scale. First, we train and cross-validate several models on an original data set of 3,503 posts from MOOCs at University of Pennsylvania. Second, to determine the generalizability of our models, we test their performance on a separate, previously published data set of 29,604 posts from MOOCs at Stanford University. While the previous work on post urgency used only one data set, we evaluated the prediction across different data sets and courses. The best-performing model was a support vector regressor trained on the Universal Sentence Encoder embeddings of the posts, achieving an RMSE of 1.1 on the training set and 1.4 on the test set. Understanding the urgency of forum posts enables instructors to focus their time more effectively and, as a result, better support student learning.
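A sketch of the best-performing configuration, Universal Sentence Encoder embeddings fed to a support vector regressor on the 7-point urgency scale, is given below; the example posts and labels are invented, and only the publicly documented TF-Hub model is assumed.

```python
# Sketch: USE embeddings + support vector regression for post urgency.
# Posts and labels are placeholders; the TF-Hub URL is the standard public
# Universal Sentence Encoder v4 model.
import numpy as np
import tensorflow_hub as hub
from sklearn.svm import SVR

use = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

train_posts = ["When is assignment 2 due?",
               "I cannot access the exam, please help!"]
train_urgency = [2.0, 6.5]             # hypothetical 1-7 urgency labels

X_train = use(train_posts).numpy()     # (n_posts, 512) sentence embeddings
reg = SVR().fit(X_train, train_urgency)

test_posts = ["My grade has not appeared and the appeal deadline is today."]
pred = reg.predict(use(test_posts).numpy())
rmse = float(np.sqrt(np.mean((pred - np.array([6.0])) ** 2)))   # toy RMSE check
print(pred, rmse)
```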
QontSum: On Contrasting Salient Content for Query-focused Summarization
results: On several QFS benchmark datasets, QontSum either outperforms the existing state of the art or matches it at considerably lower computational cost, relying on improvements to the fine-tuning stage rather than large-scale pre-training. A human study further shows that its summaries are more relevant to the posed queries without any loss of fluency.
Abstract
Query-focused summarization (QFS) is a challenging task in natural language processing that generates summaries to address specific queries. The broader field of Generative Information Retrieval (Gen-IR) aims to revolutionize information extraction from vast document corpora through generative approaches, encompassing Generative Document Retrieval (GDR) and Grounded Answer Retrieval (GAR). This paper highlights the role of QFS in Grounded Answer Generation (GAR), a key subdomain of Gen-IR that produces human-readable answers in direct correspondence with queries, grounded in relevant documents. In this study, we propose QontSum, a novel approach for QFS that leverages contrastive learning to help the model attend to the most relevant regions of the input document. We evaluate our approach on a couple of benchmark datasets for QFS and demonstrate that it either outperforms existing state-of-the-art or exhibits a comparable performance with considerably reduced computational cost through enhancements in the fine-tuning stage, rather than relying on large-scale pre-training experiments, which is the focus of current SOTA. Moreover, we conducted a human study and identified improvements in the relevance of generated summaries to the posed queries without compromising fluency. We further conduct an error analysis study to understand our model's limitations and propose avenues for future research.
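The abstract does not spell out QontSum's exact objective, so the sketch below shows only a generic InfoNCE-style contrastive loss of the kind such a method might use: the query representation is pulled toward embeddings of salient document regions and pushed away from the rest.

```python
# Generic contrastive-learning sketch (not QontSum's exact objective): an
# InfoNCE-style loss over salient vs. non-salient document region embeddings.
import torch
import torch.nn.functional as F

def contrastive_salience_loss(query_emb, salient_embs, other_embs, temperature=0.1):
    """query_emb: (d,); salient_embs: (P, d); other_embs: (N, d)."""
    q = F.normalize(query_emb, dim=0)
    pos = F.normalize(salient_embs, dim=1)
    neg = F.normalize(other_embs, dim=1)
    logits = torch.cat([pos @ q, neg @ q]) / temperature   # (P + N,)
    # Treat every salient region as a positive class in a softmax over regions.
    log_probs = F.log_softmax(logits, dim=0)
    return -log_probs[: pos.size(0)].mean()

loss = contrastive_salience_loss(torch.randn(64),
                                 torch.randn(3, 64),    # salient regions
                                 torch.randn(10, 64))   # non-salient regions
print(float(loss))
```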
Sensi-BERT: Towards Sensitivity Driven Fine-Tuning for Parameter-Efficient BERT
results: Experiments show that Sensi-BERT achieves better performance across downstream tasks (MNLI, QQP, QNLI, and SST-2) at a similar or smaller parameter budget than existing alternatives.
Abstract
Large pre-trained language models have recently gained significant traction due to their improved performance on various downstream tasks like text classification and question answering, requiring only a few epochs of fine-tuning. However, their large model sizes often prohibit their application on resource-constrained edge devices. Existing solutions for yielding parameter-efficient BERT models largely rely on compute-exhaustive training and fine-tuning. Moreover, they often rely on additional compute-heavy models to mitigate the performance gap. In this paper, we present Sensi-BERT, a sensitivity-driven efficient fine-tuning of BERT models that can take an off-the-shelf pre-trained BERT model and yield highly parameter-efficient models for downstream tasks. In particular, we perform sensitivity analysis to rank each individual parameter tensor, which is then used to trim the tensors accordingly during fine-tuning for a given parameter or FLOPs budget. Our experiments show the efficacy of Sensi-BERT across different downstream tasks including MNLI, QQP, QNLI, and SST-2, demonstrating better performance at a similar or smaller parameter budget compared to various existing alternatives.
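The sketch below illustrates the flavor of sensitivity-driven trimming described above, using a simple first-order proxy (|parameter x gradient|) to rank tensors under a parameter budget; the paper's actual sensitivity metric and trimming rule may differ.

```python
# Sketch of sensitivity-driven tensor ranking and trimming under a budget.
# The sensitivity proxy (|parameter * gradient|) and the 50% budget are
# assumptions for illustration only.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
x, y = torch.randn(8, 16), torch.randint(0, 2, (8,))
nn.CrossEntropyLoss()(model(x), y).backward()     # one backward pass for gradients

scores = {
    name: (p * p.grad).abs().sum().item()         # tensor-level sensitivity proxy
    for name, p in model.named_parameters()
}
budget = 0.5 * sum(p.numel() for p in model.parameters())   # keep ~50% of params

kept, total = [], 0
params = dict(model.named_parameters())
for name, _ in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    if total + params[name].numel() <= budget:
        kept.append(name)
        total += params[name].numel()

print("kept tensors:", kept)   # the rest would be trimmed during fine-tuning
```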
Population Expansion for Training Language Models with Private Federated Learning
paper_authors: Tatsuki Koga, Congzheng Song, Martin Pelikan, Mona Chitnis
for: Improving model quality and training efficiency when training with small device populations.
methods: Domain adaptation techniques are used to expand the effective population, speeding up training and improving final model quality.
results: On real-world language modeling datasets, model utility improves by roughly 13% to 30%, with training efficiency also improving.
Abstract
Federated learning (FL) combined with differential privacy (DP) offers machine learning (ML) training with distributed devices and with a formal privacy guarantee. With a large population of devices, FL with DP produces a performant model in a timely manner. However, for applications with a smaller population, not only does the model utility degrade, since the DP noise is inversely proportional to the population size, but the training latency also increases, because waiting for enough clients to become available from a smaller pool is slower. In this work, we therefore propose expanding the population based on domain adaptation techniques to speed up training and improve the final model quality when training with small populations. We empirically demonstrate that our techniques can improve the utility by 13% to 30% on real-world language modeling datasets.
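The sketch below makes the population-size point concrete with a toy DP federated-averaging round: the Gaussian noise added to the averaged update scales like 1/N in the population size N, so smaller populations see noisier updates. The clipping norm and noise multiplier are illustrative, and the domain-adaptation-based expansion itself is not modeled.

```python
# Toy DP federated-averaging round illustrating why small populations hurt:
# the effective noise on the averaged update scales like 1/N. Values are
# illustrative, not the paper's configuration.
import numpy as np

def dp_fedavg_round(client_updates, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    rng = rng or np.random.default_rng(0)
    clipped = []
    for u in client_updates:
        scale = min(1.0, clip_norm / (np.linalg.norm(u) + 1e-12))
        clipped.append(u * scale)                        # per-client clipping
    avg = np.mean(clipped, axis=0)
    noise = rng.normal(0.0,
                       clip_norm * noise_multiplier / len(client_updates),
                       size=avg.shape)                   # noise std shrinks with N
    return avg + noise

dim = 1000
true_update = np.full(dim, 0.01)
for n_clients in (10, 1000):
    updates = [true_update + 0.001 * np.random.default_rng(i).normal(size=dim)
               for i in range(n_clients)]
    noisy = dp_fedavg_round(updates)
    print(n_clients, np.linalg.norm(noisy - true_update))  # error is larger for small N
```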
results: The ECAPA-TDNN model performed best at identifying the dialects of Irish, with particularly high accuracy (94%) on the Ulster dialect. However, it struggled to distinguish the Connacht and Munster dialects, suggesting that a more nuanced approach may be needed to differentiate these two dialects reliably.
Abstract
The Irish language is rich in its diversity of dialects and accents. This compounds the difficulty of creating a speech recognition system for the low-resource language, as such a system must contend with a high degree of variability with limited corpora. A recent study investigating dialect bias in Irish ASR found that balanced training corpora gave rise to unequal dialect performance, with performance for the Ulster dialect being consistently worse than for the Connacht or Munster dialects. Motivated by this, the present experiments investigate spoken dialect identification of Irish, with a view to incorporating such a system into the speech recognition pipeline. Two acoustic classification models are tested, XLS-R and ECAPA-TDNN, in conjunction with a text-based classifier using a pretrained Irish-language BERT model. The ECAPA-TDNN, particularly a model pretrained for language identification on the VoxLingua107 dataset, performed best overall, with an accuracy of 73%. This was further improved to 76% by fusing the model's outputs with the text-based model. The Ulster dialect was most accurately identified, with an accuracy of 94%, however the model struggled to disambiguate between the Connacht and Munster dialects, suggesting a more nuanced approach may be necessary to robustly distinguish between the dialects of Irish.
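The sketch below shows the acoustic side of such a pipeline using the public SpeechBrain ECAPA-TDNN VoxLingua107 language-ID checkpoint, plus a simple late fusion with text-classifier probabilities; the fusion weight, file path, and the fusion scheme itself are assumptions rather than the paper's exact setup.

```python
# Sketch of the acoustic side of the dialect-ID pipeline. Uses the public
# SpeechBrain VoxLingua107 ECAPA-TDNN checkpoint; the audio path, fusion
# weight, and fusion rule are placeholders/assumptions.
import numpy as np
from speechbrain.pretrained import EncoderClassifier

acoustic_model = EncoderClassifier.from_hparams(
    source="speechbrain/lang-id-voxlingua107-ecapa",
    savedir="pretrained_models/lang-id-voxlingua107-ecapa",
)

# Embeddings from the pretrained model could be fine-tuned or used as features
# for a dialect classifier over {Ulster, Connacht, Munster}.
signal = acoustic_model.load_audio("example_irish_clip.wav")    # placeholder path
embedding = acoustic_model.encode_batch(signal.unsqueeze(0))    # (1, 1, emb_dim)

def fuse(acoustic_probs, text_probs, weight=0.5):
    """Simple late fusion of per-dialect probabilities from the two classifiers."""
    return weight * np.asarray(acoustic_probs) + (1 - weight) * np.asarray(text_probs)

print(fuse([0.7, 0.2, 0.1], [0.5, 0.3, 0.2]))   # fused scores for the 3 dialects
```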