eess.IV - 2023-07-13

Body Fat Estimation from Surface Meshes using Graph Neural Networks

  • paper_url: http://arxiv.org/abs/2308.02493
  • repo_url: None
  • paper_authors: Tamara T. Mueller, Siyu Zhou, Sophie Starck, Friederike Jungmann, Alexander Ziller, Orhun Aksoy, Danylo Movchan, Rickmer Braren, Georgios Kaissis, Daniel Rueckert
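  • for: Estimating visceral (VAT) and abdominal subcutaneous (ASAT) adipose tissue volumes as indicators of overall health and disease risk.
  • methods: Graph neural networks applied to triangulated body surface meshes.
  • results: High performance with less training time and fewer resources than state-of-the-art convolutional neural networks. The authors also envision applying the method to cheaper, easily accessible surface scans instead of expensive medical images.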
    Abstract Body fat volume and distribution can be a strong indication for a person's overall health and the risk for developing diseases like type 2 diabetes and cardiovascular diseases. Frequently used measures for fat estimation are the body mass index (BMI), waist circumference, or the waist-hip-ratio. However, those are rather imprecise measures that do not allow for a discrimination between different types of fat or between fat and muscle tissue. The estimation of visceral (VAT) and abdominal subcutaneous (ASAT) adipose tissue volume has been shown to be a more accurate measure for named risk factors. In this work, we show that triangulated body surface meshes can be used to accurately predict VAT and ASAT volumes using graph neural networks. Our methods achieve high performance while reducing training time and required resources compared to state-of-the-art convolutional neural networks in this area. We furthermore envision this method to be applicable to cheaper and easily accessible medical surface scans instead of expensive medical images.
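    Summary Body fat volume and distribution can indicate a person's overall health and the risk of type 2 diabetes and cardiovascular disease. Common measures such as BMI, waist circumference, and waist-hip ratio are imprecise and cannot distinguish between types of fat or between fat and muscle tissue. The paper shows that graph neural networks operating on triangulated body surface meshes can accurately predict VAT and ASAT volumes while needing less training time and fewer resources than state-of-the-art convolutional neural networks.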
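    The idea of regressing two volumes from a triangulated mesh can be sketched as follows. This is a minimal illustration, not the authors' architecture: the single graph-convolution layer, the mean pooling, the random weights, and the names `mesh_to_adjacency`, `gcn_layer`, and `predict_volumes` are all assumptions made for the example.

```python
import numpy as np

def mesh_to_adjacency(faces, num_vertices):
    """Build a symmetric vertex adjacency matrix from triangle faces."""
    adj = np.zeros((num_vertices, num_vertices), dtype=float)
    for a, b, c in faces:
        for i, j in ((a, b), (b, c), (c, a)):
            adj[i, j] = adj[j, i] = 1.0
    return adj

def gcn_layer(x, adj, weight):
    """One graph-convolution step: normalized neighbor mixing, linear map, ReLU."""
    a_hat = adj + np.eye(adj.shape[0])              # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)

def predict_volumes(vertices, faces, w1, w_out):
    """Map a mesh to two regression outputs (stand-ins for VAT and ASAT)."""
    adj = mesh_to_adjacency(faces, len(vertices))
    hidden = gcn_layer(vertices, adj, w1)           # vertex coordinates as node features
    graph_embedding = hidden.mean(axis=0)           # global mean pooling over vertices
    return graph_embedding @ w_out

# Toy mesh (a tetrahedron) with untrained, random weights: assumption for illustration.
vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
faces = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])
rng = np.random.default_rng(0)
w1 = rng.normal(size=(3, 8))
w_out = rng.normal(size=(8, 2))
pred = predict_volumes(vertices, faces, w1, w_out)
```

    The mesh connectivity enters only through the adjacency matrix, which is why a surface scan without any volumetric image can serve as input.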

Transformer-based end-to-end classification of variable-length volumetric data

  • paper_url: http://arxiv.org/abs/2307.06666
  • repo_url: https://github.com/marziehoghbaie/vlfat
  • paper_authors: Marzieh Oghbaie, Teresa Araujo, Taha Emre, Ursula Schmidt-Erfurth, Hrvoje Bogunovic
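  • for: Efficient automatic classification of 3D medical data, addressing the memory cost and the varying number of slices between samples.
  • methods: An end-to-end Transformer framework for sequential slice data that randomizes the input volume-wise resolution (number of slices) during training, which strengthens the learnable positional embedding assigned to each slice.
  • results: On retinal OCT volume classification (a 9-class diagnostic task), a 21.96% average improvement in balanced accuracy over state-of-the-art video transformers, with better robustness to variable volume length and to different computational budgets.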
    Abstract The automatic classification of 3D medical data is memory-intensive. Also, variations in the number of slices between samples are common. Naïve solutions such as subsampling can solve these problems, but at the cost of potentially eliminating relevant diagnosis information. Transformers have shown promising performance for sequential data analysis. However, their application for long sequences is data, computationally, and memory demanding. In this paper, we propose an end-to-end Transformer-based framework that allows classifying volumetric data of variable length in an efficient fashion. Particularly, by randomizing the input volume-wise resolution (#slices) during training, we enhance the capacity of the learnable positional embedding assigned to each volume slice. Consequently, the accumulated positional information in each positional embedding can be generalized to the neighbouring slices, even for high-resolution volumes at the test time. By doing so, the model will be more robust to variable volume length and amenable to different computational budgets. We evaluated the proposed approach in retinal OCT volume classification and achieved 21.96% average improvement in balanced accuracy on a 9-class diagnostic task, compared to state-of-the-art video transformers. Our findings show that varying the volume-wise resolution of the input during training results in more informative volume representation as compared to training with fixed number of slices per volume.
    Summary Classifying 3D medical data automatically is memory-intensive, and the number of slices often varies between samples. Naive fixes such as subsampling may discard diagnostically relevant information, while Transformers are demanding in data, computation, and memory on long sequences. The proposed end-to-end Transformer framework randomizes the number of input slices during training so that the positional information accumulated in each learnable positional embedding generalizes to neighbouring slices, even for high-resolution volumes at test time. Training with a varying volume-wise resolution yields more informative volume representations than training with a fixed number of slices per volume.
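    The training-time randomization of the number of slices can be sketched as below. This is a hedged illustration only: the helper names, the uniform choice of slice count, and the rule that rescales original slice positions onto a shared positional table are assumptions, since the abstract does not specify these details.

```python
import numpy as np

def sample_slice_indices(num_slices, min_keep, max_keep, rng):
    """Pick a random number of slices and return their sorted original indices."""
    upper = min(max_keep, num_slices)
    lower = min(min_keep, upper)
    keep = int(rng.integers(lower, upper + 1))      # upper bound is exclusive, hence +1
    return np.sort(rng.choice(num_slices, size=keep, replace=False))

def embed_with_positions(slice_features, indices, num_slices, pos_table):
    """Add a positional embedding to each kept slice.

    Assumption: original positions are rescaled onto the rows of pos_table,
    so volumes of any length share one learnable table.
    """
    rows = (indices * pos_table.shape[0]) // num_slices
    return slice_features + pos_table[rows]

rng = np.random.default_rng(0)
features = rng.normal(size=(40, 16))                # 40 slices, 16-dim features each
pos_table = rng.normal(size=(25, 16))               # stand-in for learned embeddings
idx = sample_slice_indices(40, min_keep=5, max_keep=20, rng=rng)
tokens = embed_with_positions(features[idx], idx, 40, pos_table)
```

    Because a different subset of slices is drawn at every step, each row of the table is trained with slices from a neighbourhood of positions rather than one fixed position.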

PatchSorter: A High Throughput Deep Learning Digital Pathology Tool for Object Labeling

  • paper_url: http://arxiv.org/abs/2307.07528
  • repo_url: None
  • paper_authors: Cedric Walker, Tasneem Talawalla, Robert Toth, Akhil Ambekar, Kien Rea, Oswin Chamian, Fan Fan, Sabina Berezowska, Sven Rottenberg, Anant Madabhushi, Marie Maillard, Laura Barisoni, Hugo Mark Horlings, Andrew Janowczyk
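  • for: High-throughput labeling of large quantities of histological objects in digital pathology images, to support the discovery of patterns associated with diagnosis, prognosis, and therapy response.
  • methods: PatchSorter, an open-source labeling tool that integrates deep learning with an intuitive web interface.
  • results: Using >100,000 objects, a >7x improvement in labels per second over unaided labeling, with minimal impact on labeling accuracy.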
    Abstract The discovery of patterns associated with diagnosis, prognosis, and therapy response in digital pathology images often requires intractable labeling of large quantities of histological objects. Here we release an open-source labeling tool, PatchSorter, which integrates deep learning with an intuitive web interface. Using >100,000 objects, we demonstrate a >7x improvement in labels per second over unaided labeling, with minimal impact on labeling accuracy, thus enabling high-throughput labeling of large datasets.
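    Summary Discovering image patterns associated with diagnosis, prognosis, and therapy response usually requires labeling large quantities of histological objects. PatchSorter is an open-source labeling tool that combines deep learning with an intuitive web interface. On more than 100,000 objects it labels more than 7x faster than unaided labeling with minimal impact on accuracy, which enables high-throughput labeling of large datasets.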

Explainable 2D Vision Models for 3D Medical Data

  • paper_url: http://arxiv.org/abs/2307.06614
  • repo_url: None
  • paper_authors: Alexander Ziller, Alp Güvenir, Ayhan Can Erdur, Tamara T. Mueller, Philip Müller, Friederike Jungmann, Johannes Brandt, Jan Peeken, Rickmer Braren, Daniel Rueckert, Georgios Kaissis
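  • for: A simple approach for training AI models on three-dimensional image data.
  • methods: 2D networks are applied to slices of the 3D volume; a feature reduction module then combines the slice features into a single representation that is used for classification.
  • results: Evaluated on medical classification benchmarks and a real-world clinical dataset, the method achieves results comparable to existing methods. With attention pooling as the feature reduction module, it yields an importance value for each slice during the forward pass, which allows inspecting the basis of the model's prediction.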
    Abstract Training Artificial Intelligence (AI) models on three-dimensional image data presents unique challenges compared to the two-dimensional case: Firstly, the computational resources are significantly higher, and secondly, the availability of large pretraining datasets is often limited, impeding training success. In this study, we propose a simple approach of adapting 2D networks with an intermediate feature representation for processing 3D volumes. Our method involves sequentially applying these networks to slices of a 3D volume from all orientations. Subsequently, a feature reduction module combines the extracted slice features into a single representation, which is then used for classification. We evaluate our approach on medical classification benchmarks and a real-world clinical dataset, demonstrating comparable results to existing methods. Furthermore, by employing attention pooling as a feature reduction module we obtain weighted importance values for each slice during the forward pass. We show that slices deemed important by our approach allow the inspection of the basis of a model's prediction.
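    Summary Training AI models on 3D image data is harder than in the 2D case: the computational cost is much higher and large pretraining datasets are often unavailable. The paper adapts 2D networks with an intermediate feature representation to 3D volumes by applying them sequentially to slices from all orientations and merging the slice features with a feature reduction module before classification. Results are comparable to existing methods on medical classification benchmarks and a real-world clinical dataset. With attention pooling as the reduction module, each slice receives an importance weight during the forward pass, and the slices ranked as important can be inspected to understand the prediction.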
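    Attention pooling over slice features, which produces both a single volume representation and per-slice importance weights, can be sketched like this. The two-layer scoring function and the names `attention_pool`, `w_hidden`, and `w_score` are assumptions for illustration; the abstract does not give the exact form used in the paper.

```python
import numpy as np

def attention_pool(slice_features, w_hidden, w_score):
    """Combine per-slice features into one vector with softmax attention.

    Returns the pooled representation and the per-slice weights, which can
    be read as importance values for each slice.
    """
    hidden = np.tanh(slice_features @ w_hidden)     # (n_slices, hidden_dim)
    scores = hidden @ w_score                       # (n_slices,)
    scores = scores - scores.max()                  # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    pooled = weights @ slice_features               # weighted sum over slices
    return pooled, weights

rng = np.random.default_rng(0)
slice_features = rng.normal(size=(12, 32))          # stand-in for 2D backbone outputs
w_hidden = rng.normal(size=(32, 16))
w_score = rng.normal(size=16)
pooled, weights = attention_pool(slice_features, w_hidden, w_score)
top_slice = int(np.argmax(weights))                 # slice the pooling relied on most
```

    Since the weights come out of the forward pass itself, no separate attribution step is needed to rank slices.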

Image Denoising and the Generative Accumulation of Photons

  • paper_url: http://arxiv.org/abs/2307.06607
  • repo_url: https://github.com/krulllab/gap
  • paper_authors: Alexander Krull, Hector Basevi, Benjamin Salmon, Andre Zeug, Franziska Müller, Samuel Tonks, Leela Muppala, Ales Leonardis
  • for: This paper is written for the task of noise removal in images corrupted by shot noise.
  • methods: The paper proposes a new method called “generative accumulation of photons” (GAP) that uses a network to predict the location of the next photon to arrive and solve the minimum mean square error (MMSE) denoising task.
  • results: The paper evaluates the GAP method on 4 new fluorescence microscopy datasets and shows that it outperforms supervised, self-supervised, and unsupervised baselines, or performs on par with them.
    Abstract We present a fresh perspective on shot noise corrupted images and noise removal. By viewing image formation as the sequential accumulation of photons on a detector grid, we show that a network trained to predict where the next photon could arrive is in fact solving the minimum mean square error (MMSE) denoising task. This new perspective allows us to make three contributions: (1) we present a new strategy for self-supervised denoising; (2) we present a new method for sampling from the posterior of possible solutions by iteratively sampling and adding small numbers of photons to the image; (3) we derive a full generative model by starting this process from an empty canvas. We call this approach generative accumulation of photons (GAP). We evaluate our method quantitatively and qualitatively on 4 new fluorescence microscopy datasets, which will be made available to the community. We find that it outperforms supervised, self-supervised and unsupervised baselines or performs on-par.
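    Summary The paper takes a fresh view of shot-noise-corrupted images and noise removal. Treating image formation as the sequential accumulation of photons on a detector grid, it shows that a network trained to predict where the next photon could arrive is solving the MMSE denoising task. This yields a new self-supervised denoising strategy, a way to sample from the posterior of possible solutions by iteratively sampling and adding small numbers of photons, and a full generative model obtained by starting from an empty canvas. On 4 new fluorescence microscopy datasets the method outperforms or matches supervised, self-supervised, and unsupervised baselines.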
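    The accumulation loop described above can be sketched as follows. The function `predict_next_photon_distribution` is a hypothetical stand-in for the trained network (here just a smoothed, normalized copy of the current counts), and the step sizes are arbitrary; only the loop structure (predict a distribution, sample a few photons, add them, repeat) is taken from the abstract.

```python
import numpy as np

def predict_next_photon_distribution(image):
    """Hypothetical stand-in for the trained network.

    Returns a probability map over pixels for where the next photon lands.
    Here: a 3x3 box-smoothed copy of the counts plus a small floor.
    """
    height, width = image.shape
    padded = np.pad(image.astype(float), 1, mode="edge")
    smooth = sum(
        padded[dy:dy + height, dx:dx + width]
        for dy in range(3) for dx in range(3)
    ) / 9.0
    smooth = smooth + 1e-3          # keeps every pixel reachable on an empty canvas
    return smooth / smooth.sum()

def accumulate_photons(image, num_steps, photons_per_step, rng):
    """Iteratively sample a few photons from the predicted map and add them."""
    image = image.copy()
    for _ in range(num_steps):
        probs = predict_next_photon_distribution(image)
        new_photons = rng.multinomial(photons_per_step, probs.ravel())
        image += new_photons.reshape(image.shape)
    return image

rng = np.random.default_rng(0)
canvas = np.zeros((16, 16), dtype=np.int64)         # empty canvas: fully generative use
sample = accumulate_photons(canvas, num_steps=50, photons_per_step=4, rng=rng)
```

    Starting the same loop from a noisy photon-count image instead of an empty canvas corresponds to the posterior-sampling use described in the abstract.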

Quantum Image Denoising: A Framework via Boltzmann Machines, QUBO, and Quantum Annealing

  • paper_url: http://arxiv.org/abs/2307.06542
  • repo_url: None
  • paper_authors: Phillip Kerger, Ryoji Miyazaki
  • for: Binary image denoising with restricted Boltzmann machines (RBMs), with the denoising objective stated in quadratic unconstrained binary optimization (QUBO) form so that it is well suited to quantum annealing.
  • methods: A trained RBM learns the target distribution; the objective balances that distribution against a penalty term for deviations from the noisy image. The statistically optimal penalty parameter is derived under the assumption that the target distribution is well approximated, and an empirically supported modification makes the method robust to that assumption.
  • results: Under additional assumptions, the denoised images are, in expectation, strictly closer to the noise-free images than the noisy images are. The model is tested on a D-Wave Advantage machine, and on data too large for current quantum annealers by approximating QUBO solutions with classical heuristics.
    Abstract We investigate a framework for binary image denoising via restricted Boltzmann machines (RBMs) that introduces a denoising objective in quadratic unconstrained binary optimization (QUBO) form and is well-suited for quantum annealing. The denoising objective is attained by balancing the distribution learned by a trained RBM with a penalty term for deviations from the noisy image. We derive the statistically optimal choice of the penalty parameter assuming the target distribution has been well-approximated, and further suggest an empirically supported modification to make the method robust to that idealistic assumption. We also show under additional assumptions that the denoised images attained by our method are, in expectation, strictly closer to the noise-free images than the noisy images are. While we frame the model as an image denoising model, it can be applied to any binary data. As the QUBO formulation is well-suited for implementation on quantum annealers, we test the model on a D-Wave Advantage machine, and also test on data too large for current quantum annealers by approximating QUBO solutions through classical heuristics.
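    Summary The paper studies binary image denoising with restricted Boltzmann machines (RBMs), stating the denoising objective in QUBO form. The objective balances the distribution learned by a trained RBM against a penalty for departing from the noisy image, and the authors propose a modification that makes the method more robust. Under certain assumptions, the denoised images are, in expectation, closer to the noise-free images than the noisy inputs are. Although framed as image denoising, the model applies to any binary data. Because the QUBO form suits quantum annealers, the model is tested on a D-Wave Advantage machine, and classical heuristics approximate the QUBO solutions when the data are too large for current annealers.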
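    The way a penalty for deviating from the noisy image fits into a QUBO can be sketched as below. For binary x_i and y_i, (x_i - y_i)^2 = x_i (1 - 2 y_i) + y_i, so the penalty only changes the diagonal of the QUBO matrix. The toy model energy, the omission of RBM hidden units, the penalty value `rho`, and the brute-force solver are all assumptions for illustration; they are not the paper's derivation or its optimal penalty choice.

```python
import itertools
import numpy as np

def add_noise_penalty(q_model, noisy, rho):
    """QUBO matrix for x^T Q x + rho * sum_i (x_i - y_i)^2, constant term dropped."""
    q = np.array(q_model, dtype=float)              # copy, so the input is untouched
    q[np.diag_indices_from(q)] += rho * (1.0 - 2.0 * np.asarray(noisy, dtype=float))
    return q

def solve_qubo_brute_force(q):
    """Exhaustive minimization; only feasible for a handful of variables."""
    best_x, best_val = None, np.inf
    for bits in itertools.product((0, 1), repeat=q.shape[0]):
        x = np.array(bits)
        val = float(x @ q @ x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Toy model energy on a 3-pixel chain that prefers equal neighbours:
# sum over neighbours of (x_i - x_j)^2, written in upper-triangular QUBO form.
q_model = np.array([[1.0, -2.0, 0.0],
                    [0.0,  2.0, -2.0],
                    [0.0,  0.0,  1.0]])
noisy = np.array([1, 0, 1])                          # middle pixel looks flipped
q = add_noise_penalty(q_model, noisy, rho=0.6)
denoised, energy = solve_qubo_brute_force(q)         # denoised is [1, 1, 1]
```

    On a real problem the brute-force search would be replaced by a quantum annealer or a classical heuristic, as the abstract describes.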

SAM-Path: A Segment Anything Model for Semantic Segmentation in Digital Pathology

  • paper_url: http://arxiv.org/abs/2307.09570
  • repo_url: https://github.com/cvlab-stonybrook/SAMPath
  • paper_authors: Jingwei Zhang, Ke Ma, Saarthak Kapse, Joel Saltz, Maria Vakalopoulou, Prateek Prasanna, Dimitris Samaras
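  • for: Adapting the Segment Anything Model (SAM) to semantic segmentation in computational pathology workflows.
  • methods: SAM is extended with trainable class prompts and a pathology encoder (a pathology foundation model).
  • results: On two public pathology datasets (BCSS and CRAG), fine-tuning with trainable class prompts outperforms vanilla SAM with manual prompts and post-processing by 27.52% in Dice score and 71.63% in IOU. The additional pathology foundation model gives a further relative improvement of 5.07% to 5.12% in Dice score and 4.50% to 8.48% in IOU.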
    Abstract Semantic segmentations of pathological entities have crucial clinical value in computational pathology workflows. Foundation models, such as the Segment Anything Model (SAM), have been recently proposed for universal use in segmentation tasks. SAM shows remarkable promise in instance segmentation on natural images. However, the applicability of SAM to computational pathology tasks is limited due to the following factors: (1) lack of comprehensive pathology datasets used in SAM training and (2) the design of SAM is not inherently optimized for semantic segmentation tasks. In this work, we adapt SAM for semantic segmentation by introducing trainable class prompts, followed by further enhancements through the incorporation of a pathology encoder, specifically a pathology foundation model. Our framework, SAM-Path enhances SAM's ability to conduct semantic segmentation in digital pathology without human input prompts. Through experiments on two public pathology datasets, the BCSS and the CRAG datasets, we demonstrate that the fine-tuning with trainable class prompts outperforms vanilla SAM with manual prompts and post-processing by 27.52% in Dice score and 71.63% in IOU. On these two datasets, the proposed additional pathology foundation model further achieves a relative improvement of 5.07% to 5.12% in Dice score and 4.50% to 8.48% in IOU.
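    The two metrics quoted above, Dice score and IOU, can be computed for binary masks as in the sketch below. The function name `dice_and_iou` and the convention of returning 1.0 when both masks are empty are assumptions for the example, not taken from the paper's evaluation code.

```python
import numpy as np

def dice_and_iou(pred, target):
    """Dice score and intersection-over-union for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    dice = 2.0 * intersection / total if total > 0 else 1.0   # both empty: perfect match
    iou = intersection / union if union > 0 else 1.0
    return float(dice), float(iou)

pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
dice, iou = dice_and_iou(pred, target)   # dice = 2/3, iou = 1/2
```

    For multi-class semantic segmentation, one common choice is to compute these per class and average; whether the paper does so is not stated in the abstract.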

Stochastic Light Field Holography

  • paper_url: http://arxiv.org/abs/2307.06277
  • repo_url: None
  • paper_authors: Florian Schiffers, Praneeth Chakravarthula, Nathan Matsuda, Grace Kuo, Ethan Tseng, Douglas Lanman, Felix Heide, Oliver Cossairt
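  • for: Evaluating the realism of holographic displays and investigating the effect of pupil sampling on the viewing experience in full 3D holograms.
  • methods: A hologram generation algorithm motivated by matching the projection operators of incoherent Light Field and coherent Wigner Function light transport. Hologram computation is supervised with synthesized photographs rendered on the fly by Light Field refocusing from stochastically sampled pupil states.
  • results: The holograms have correct parallax and focus cues, and the method compares favorably to state-of-the-art CGH algorithms across a variety of pupil states.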
    Abstract The Visual Turing Test is the ultimate goal to evaluate the realism of holographic displays. Previous studies have focused on addressing challenges such as limited étendue and image quality over a large focal volume, but they have not investigated the effect of pupil sampling on the viewing experience in full 3D holograms. In this work, we tackle this problem with a novel hologram generation algorithm motivated by matching the projection operators of incoherent Light Field and coherent Wigner Function light transport. To this end, we supervise hologram computation using synthesized photographs, which are rendered on-the-fly using Light Field refocusing from stochastically sampled pupil states during optimization. The proposed method produces holograms with correct parallax and focus cues, which are important for passing the Visual Turing Test. We validate that our approach compares favorably to state-of-the-art CGH algorithms that use Light Field and Focal Stack supervision. Our experiments demonstrate that our algorithm significantly improves the realism of the viewing experience for a variety of different pupil states.
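    Summary The Visual Turing Test is the ultimate goal for evaluating the realism of holographic displays. Earlier work addressed limited étendue and image quality over a large focal volume but did not study how pupil sampling affects the viewing experience in full 3D holograms. The paper proposes a hologram generation algorithm based on matching incoherent Light Field and coherent Wigner Function light transport, supervised by synthesized photographs that are rendered during optimization through Light Field refocusing from randomly sampled pupil states. The resulting holograms show correct parallax and focus cues, and experiments indicate more realistic viewing across different pupil states than existing CGH algorithms.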
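    Rendering a synthesized photograph from a light field under a randomly sampled pupil can be sketched with classic shift-and-add refocusing. This is an assumption-laden illustration: the circular pupil parametrization, the integer-pixel shifts via `np.roll` (which wraps at the borders), and the names `sample_pupil_mask` and `refocus` are not taken from the paper.

```python
import numpy as np

def sample_pupil_mask(num_u, num_v, rng, min_radius=1.0):
    """Random circular pupil (center and radius) over the angular view grid."""
    max_radius = max(min_radius, max(num_u, num_v) / 2.0)
    radius = rng.uniform(min_radius, max_radius)
    center_u = rng.uniform(0, num_u - 1)
    center_v = rng.uniform(0, num_v - 1)
    u, v = np.meshgrid(np.arange(num_u), np.arange(num_v), indexing="ij")
    return (u - center_u) ** 2 + (v - center_v) ** 2 <= radius ** 2

def refocus(light_field, mask, slope):
    """Shift-and-add refocusing over the views selected by the pupil mask.

    light_field has shape (num_u, num_v, height, width); slope sets the focal
    plane as a per-view pixel shift. Shifts wrap around at the image borders.
    """
    num_u, num_v, height, width = light_field.shape
    center_u, center_v = (num_u - 1) / 2.0, (num_v - 1) / 2.0
    out = np.zeros((height, width))
    for u, v in zip(*np.nonzero(mask)):
        dy = int(round(slope * (u - center_u)))
        dx = int(round(slope * (v - center_v)))
        out += np.roll(light_field[u, v], shift=(dy, dx), axis=(0, 1))
    return out / mask.sum()

rng = np.random.default_rng(0)
light_field = rng.random((5, 5, 32, 32))            # stand-in for rendered views
mask = sample_pupil_mask(5, 5, rng)                 # one stochastic pupil state
photo = refocus(light_field, mask, slope=1.0)       # synthesized photograph
```

    Drawing a new mask at each optimization step exposes the hologram to many pupil positions and sizes instead of a single fixed aperture.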

The Whole Pathological Slide Classification via Weakly Supervised Learning

  • paper_url: http://arxiv.org/abs/2307.06344
  • repo_url: None
  • paper_authors: Qiehe Sun, Jiawen Li, Jin Xu, Junru Cheng, Tian Guan, Yonghong He
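  • for: Whole slide image (WSI) classification in digital pathology, framed as multiple instance learning (MIL) to use annotations efficiently and handle gigapixel-sized images.
  • methods: Two pathological priors are introduced: nuclear heterogeneity of diseased cells and spatial correlation of pathological tiles. Stain-separation data augmentation with contrastive learning during extractor training yields instance-level representations, and an adjacency matrix describes the spatial relationships between tiles.
  • results: On the Camelyon16 breast dataset and the TCGA-NSCLC lung dataset, the framework handles cancer detection and subtype differentiation effectively and outperforms state-of-the-art MIL-based medical image classification methods.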
    Abstract Due to its superior efficiency in utilizing annotations and addressing gigapixel-sized images, multiple instance learning (MIL) has shown great promise as a framework for whole slide image (WSI) classification in digital pathology diagnosis. However, existing methods tend to focus on advanced aggregators with different structures, often overlooking the intrinsic features of H&E pathological slides. To address this limitation, we introduced two pathological priors: nuclear heterogeneity of diseased cells and spatial correlation of pathological tiles. Leveraging the former, we proposed a data augmentation method that utilizes stain separation during extractor training via a contrastive learning strategy to obtain instance-level representations. We then described the spatial relationships between the tiles using an adjacency matrix. By integrating these two views, we designed a multi-instance framework for analyzing H&E-stained tissue images based on pathological inductive bias, encompassing feature extraction, filtering, and aggregation. Extensive experiments on the Camelyon16 breast dataset and TCGA-NSCLC Lung dataset demonstrate that our proposed framework can effectively handle tasks related to cancer detection and differentiation of subtypes, outperforming state-of-the-art medical image classification methods based on MIL. The code will be released later.
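    Summary Because it uses annotations efficiently and copes with gigapixel-sized images, multiple instance learning (MIL) is a promising framework for whole slide image classification in digital pathology. Existing methods emphasize sophisticated aggregators and often overlook the intrinsic features of H&E slides. The paper introduces two pathological priors, nuclear heterogeneity of diseased cells and spatial correlation between tiles. The first motivates a stain-separation data augmentation used during extractor training to obtain instance-level representations; the second is captured by an adjacency matrix over tiles. Combining both gives a multi-instance framework covering feature extraction, filtering, and aggregation, which outperforms current MIL-based medical image classification methods on cancer detection and subtype differentiation. The code is to be released later.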
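    Describing spatial relationships between tiles with an adjacency matrix can be sketched as follows. The 8-neighbourhood rule on integer grid coordinates and the name `tile_adjacency` are assumptions for illustration; the abstract does not state how the paper defines neighbouring tiles.

```python
import numpy as np

def tile_adjacency(coords):
    """Adjacency matrix for tiles given integer (row, col) grid positions.

    Two tiles are connected when they touch, including diagonally
    (Chebyshev distance of exactly 1).
    """
    coords = np.asarray(coords)
    diff = np.abs(coords[:, None, :] - coords[None, :, :])   # (n, n, 2)
    chebyshev = diff.max(axis=-1)
    return (chebyshev == 1).astype(np.int64)

# Three touching tissue tiles and one isolated tile.
coords = [(0, 0), (0, 1), (1, 1), (3, 3)]
adjacency = tile_adjacency(coords)
neighbour_counts = adjacency.sum(axis=1)            # [2, 2, 2, 0]
```

    Only tiles that survive tissue filtering need coordinates here, so the matrix stays small relative to the full slide grid.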