Abstract
Pathological image analysis is a crucial field in computer vision. Because annotations are scarce in the pathological field, most recent works leverage self-supervised learning (SSL) trained on unlabeled pathological images, hoping to mine the main representations automatically. However, SSL-based pathological pre-training has two core defects: (1) it does not explicitly explore the essential focuses of the pathological field, and (2) it does not effectively bridge with, and thus take advantage of, the large natural image domain. To address both explicitly, we propose the large-scale PuzzleTuning framework, which contains the following innovations. Firstly, we identify three task focuses that can effectively bridge the pathological and natural domains: appearance consistency, spatial consistency, and misalignment understanding. Secondly, we devise a multiple puzzle restoring task to explicitly pre-train the model with these focuses. Thirdly, to address the large domain gap between the natural and pathological fields, we introduce an explicit prompt-tuning process that incrementally integrates the domain-specific knowledge with the natural knowledge. Additionally, we design a curriculum-learning training strategy that regulates the task difficulty, so that the model adaptively fits the complex multiple puzzle restoring task. Experimental results show that our PuzzleTuning framework outperforms the previous SOTA methods in various downstream tasks on multiple datasets. The code, demo, and pre-trained weights are available at https://github.com/sagizty/PuzzleTuning.