results: Comparative experiments show that, across different signal-to-noise ratios (SNRs), the proposed framework achieves a higher peak signal-to-noise ratio (PSNR) than existing methods, improving image reconstruction accuracy.
Abstract
Semantic communications, aiming at ensuring the successful delivery of the meaning of information, are expected to be one of the potential techniques for next-generation communications. However, the knowledge forming and synchronizing mechanism that enables semantic communication systems to extract and interpret the semantics of information according to the communication intents is still immature. In this paper, we propose a semantic image transmission framework with explicit semantic base (Seb), where Sebs are generated and employed as the knowledge shared between the transmitter and the receiver with flexible granularity. To represent images with Sebs, a novel Seb-based reference image generator is proposed to generate Sebs and then decompose the transmitted images. To further encode/decode the residual information for precise image reconstruction, a Seb-based image encoder/decoder is proposed. The key components of the proposed framework are optimized jointly by end-to-end (E2E) training, where the loss function is specifically designed to tackle the non-differentiable operation in the Seb-based reference image generator by introducing a gradient approximation mechanism. Extensive experiments show that the proposed framework outperforms state-of-the-art works by 0.5–1.5 dB in peak signal-to-noise ratio (PSNR) across different signal-to-noise ratios (SNRs).
Summary
Semantic communication, which aims to ensure that the meaning of information is delivered successfully, is expected to be one of the key techniques for next-generation communications, but the existing mechanisms for forming and synchronizing knowledge are still immature. This paper proposes a semantic image transmission framework with an explicit semantic base (Seb), where Sebs serve as knowledge shared between the transmitter and the receiver and can be allocated with flexible granularity. To represent images with Sebs, a novel Seb-based reference image generator is proposed to generate Sebs and decompose the transmitted images, and a Seb-based image encoder/decoder further encodes/decodes the residual information for precise image reconstruction. The key components are optimized jointly by end-to-end (E2E) training, with a loss function that introduces a gradient approximation mechanism to handle the non-differentiable operation in the Seb-based reference image generator. Experiments show that the proposed framework improves peak signal-to-noise ratio (PSNR) by 0.5–1.5 dB over state-of-the-art methods.
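The abstract above only names a gradient approximation mechanism for the non-differentiable Seb selection; the paper's exact formulation is not reproduced here. A common way to approximate gradients through a hard, codebook-style selection is the straight-through estimator, sketched below in PyTorch. The names and shapes (`SebSelector`, `num_sebs`, `dim`) are illustrative assumptions, not the authors' design.

```python
import torch
import torch.nn as nn

class SebSelector(nn.Module):
    """Illustrative hard nearest-neighbour selection over a learnable Seb codebook.

    The hard argmin is non-differentiable; a straight-through estimator copies the
    gradient of the selected Seb back onto the encoder features (VQ-VAE style).
    Names and shapes are assumptions, not the paper's actual design.
    """

    def __init__(self, num_sebs: int = 64, dim: int = 128):
        super().__init__()
        self.codebook = nn.Parameter(torch.randn(num_sebs, dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, dim) feature describing the image region to be represented by a Seb
        dists = torch.cdist(z, self.codebook)     # (batch, num_sebs)
        idx = dists.argmin(dim=1)                 # non-differentiable selection
        seb = self.codebook[idx]                  # (batch, dim) selected Sebs
        # Straight-through: the forward pass uses the hard selection, while the
        # backward pass treats the selection as the identity on z.
        return z + (seb - z).detach()

# usage sketch
z = torch.randn(8, 128, requires_grad=True)
out = SebSelector()(z)
out.sum().backward()   # gradients reach z despite the argmin

```

In practice a VQ-VAE-style codebook loss would also be needed so that the Sebs themselves receive gradients; whether the paper does this is not stated in the abstract.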
On Versatile Video Coding at UHD with Machine-Learning-Based Super-Resolution
results: According to the paper's tests, Bjontegaard delta rate gains of 12%–18% can be achieved, while compression artifacts and loss of detail are reduced.
Abstract
Coding 4K data has become of vital interest in recent years, since the amount of 4K data is significantly increasing. We propose a coding chain with spatial down- and upscaling that combines the next-generation VVC codec with machine-learning-based single-image super-resolution algorithms for 4K. The investigated coding chain, which spatially downscales the 4K data before coding, achieves higher quality than the conventional VVC reference software for low-bitrate scenarios. Throughout several tests, we find that up to 12 % and 18 % Bjontegaard delta rate gains can be achieved on average when coding 4K sequences with VVC and QP values above 34 and 42, respectively. Additionally, the investigated scenario with up- and downscaling helps to reduce the loss of details and compression artifacts, as shown in a visual example.
Summary
Coding 4K data has become a focus in recent years because the amount of 4K data keeps increasing. We propose a coding chain with spatial downscaling and upscaling that combines the next-generation VVC codec with machine-learning-based single-image super-resolution algorithms for 4K. Downscaling the 4K data before coding achieves higher quality than the conventional approach in low-bitrate scenarios. Across several tests, we find that average Bjontegaard delta rate gains of up to 12% and 18% can be achieved when coding 4K sequences with VVC at QP values above 34 and 42, respectively. In addition, the down- and upscaling scenario reduces the loss of detail and compression artifacts, as shown in a visual example. Note: the Bjontegaard delta rate measures the average bitrate difference between two codecs at equal objective quality (typically PSNR); a delta-rate gain here means that the same quality is reached at a lower bitrate.
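Since the results above are reported as Bjontegaard delta rate gains, a standard implementation of that metric may be useful: fit log-bitrate as a cubic polynomial of PSNR for both codecs and integrate the difference over the overlapping PSNR range. This is the common Bjontegaard (2001) formulation, not the authors' evaluation code, and the rate/PSNR numbers in the usage example are made up.

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta rate: average bitrate difference (%) of the test codec
    relative to the anchor at equal PSNR. Negative values mean bitrate savings.
    Standard piecewise-cubic-fit formulation over four rate/PSNR points per codec."""
    lr_a, lr_t = np.log(rate_anchor), np.log(rate_test)
    # Fit log-rate as a cubic polynomial of PSNR for both codecs.
    p_a = np.polyfit(psnr_anchor, lr_a, 3)
    p_t = np.polyfit(psnr_test, lr_t, 3)
    # Integrate both fits over the overlapping PSNR interval.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
    int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
    avg_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_diff) - 1) * 100.0

# illustrative numbers only, not measurements from the paper (kbit/s, dB)
anchor = ([1000, 2000, 4000, 8000], [34.0, 36.5, 39.0, 41.5])
test   = ([900, 1750, 3500, 7200], [34.1, 36.6, 39.1, 41.6])
print(f"BD-rate: {bd_rate(*anchor, *test):.1f} %")
```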
Three-dimensional echo-shifted EPI with simultaneous blip-up and blip-down acquisitions for correcting geometric distortion
paper_authors: Kaibao Sun, Zhifeng Chen, Guangyu Dan, Qingfei Luo, Lirong Yan, Feng Liu, Xiaohong Joe Zhou
for: This study aims to overcome the doubled scan time of BUDA, a major challenge for functional MRI applications, by developing a three-dimensional echo-shifted EPI BUDA (esEPI-BUDA) technique that acquires both blip-up and blip-down datasets in a single shot.
methods: The 3D esEPI-BUDA sequence uses an echo-shifting strategy to produce two EPI readout trains whose k-space trajectories are interleaved with opposite phase-encoding gradient directions. The two k-space datasets are reconstructed separately with a 3D SENSE algorithm, from which time-resolved B0-field maps are derived.
results: In a phantom and a human brain imaging study, the 3D esEPI-BUDA technique effectively corrected geometric distortion. In the human brain images, the visual activation volumes and their BOLD responses were comparable to those of conventional 3D echo-planar images.
Abstract
Purpose: Echo-planar imaging (EPI) with blip-up/down acquisition (BUDA) can provide high-quality images with minimal distortions by using two readout trains with opposing phase-encoding gradients. Because of the need for two separate acquisitions, BUDA doubles the scan time and degrades the temporal resolution when compared to single-shot EPI, presenting a major challenge for many applications, particularly functional MRI (fMRI). This study aims at overcoming this challenge by developing an echo-shifted EPI BUDA (esEPI-BUDA) technique to acquire both blip-up and blip-down datasets in a single shot. Methods: A three-dimensional (3D) esEPI-BUDA pulse sequence was designed by using an echo-shifting strategy to produce two EPI readout trains. These readout trains produced a pair of k-space datasets whose k-space trajectories were interleaved with opposite phase-encoding gradient directions. The two k-space datasets were separately reconstructed using a 3D SENSE algorithm, from which time-resolved B0-field maps were derived using TOPUP in FSL and then input into a forward model of joint parallel imaging reconstruction to correct for geometric distortion. In addition, Hankel structured low-rank constraint was incorporated into the reconstruction framework to improve image quality by mitigating the phase errors between the two interleaved k-space datasets. Results: The 3D esEPI-BUDA technique was demonstrated in a phantom and an fMRI study on healthy human subjects. Geometric distortions were effectively corrected in both phantom and human brain images. In the fMRI study, the visual activation volumes and their BOLD responses were comparable to those from conventional 3D echo-planar images. Conclusion: The improved imaging efficiency and dynamic distortion correction capability afforded by 3D esEPI-BUDA are expected to benefit many EPI applications.
Summary
Purpose: Echo-planar imaging (EPI) with blip-up/down acquisition (BUDA) provides high-quality images with reduced distortion, but because two separate acquisitions are required, BUDA doubles the scan time and degrades the temporal resolution, which is a major challenge for applications such as functional MRI (fMRI). This study addresses that challenge by developing a three-dimensional echo-shifted EPI BUDA (esEPI-BUDA) technique that acquires both datasets in a single shot. Methods: An echo-shifting strategy is used to produce two EPI readout trains whose k-space trajectories are interleaved with opposite phase-encoding gradient directions. The two k-space datasets are reconstructed separately with a 3D SENSE algorithm, and time-resolved B0-field maps are derived with TOPUP in FSL and fed into a forward model to correct geometric distortion. A Hankel-structured low-rank constraint is added to the reconstruction framework to improve image quality by mitigating phase errors between the two interleaved datasets. Results: In phantom and human brain images, 3D esEPI-BUDA effectively corrected geometric distortion. In the fMRI study, the visual activation volumes and their BOLD responses were comparable to those of conventional 3D echo-planar images. Conclusion: The improved imaging efficiency and dynamic distortion correction capability of 3D esEPI-BUDA are expected to benefit many EPI applications.
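As a rough intuition for why acquiring blip-up and blip-down data helps: the same off-resonance field displaces the signal in opposite directions along the phase-encoding axis for the two polarities, so the displacement map can be estimated from the pair and both images unwarped. The 1D NumPy toy below illustrates only that principle; the actual esEPI-BUDA reconstruction (3D SENSE, TOPUP field estimation, joint forward model, low-rank constraint) is far more involved, and intensity (Jacobian) modulation is ignored here.

```python
import numpy as np

# Toy 1D illustration: opposite phase-encoding polarities displace the signal in
# opposite directions, which is what allows the displacement (field) map to be
# estimated and the geometry restored. Purely pedagogical; not esEPI-BUDA code.

n = 256
y = np.arange(n, dtype=float)
truth = np.exp(-0.5 * ((y - 90) / 20.0) ** 2)      # undistorted 1D "profile"
disp = 6.0 * np.sin(2 * np.pi * y / n)             # field-induced shift (pixels), assumed smooth

# Blip-up and blip-down acquisitions see equal and opposite displacements
# (signal pile-up / Jacobian intensity modulation is ignored in this toy).
blip_up = np.interp(y, y + disp, truth)
blip_down = np.interp(y, y - disp, truth)

# If the displacement map is known (in practice estimated, e.g. by TOPUP, from the
# pair of images), each image can be unwarped and the pair averaged.
unwarp_up = np.interp(y + disp, y, blip_up)
unwarp_down = np.interp(y - disp, y, blip_down)
corrected = 0.5 * (unwarp_up + unwarp_down)

print("max |distorted - truth|:", np.abs(blip_up - truth).max())
print("max |corrected - truth|:", np.abs(corrected - truth).max())
```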
The Color Clifford Hardy Signal: Application to Color Edge Detection and Optical Flow
paper_authors: Xiaoxiao Hu, Kit Ian Kou, Cuiming Zou, Dong Cheng
for: This paper introduces a new approach to processing color images using the color Clifford Hardy signal, a high-dimensional analogue of the complex analytic function.
methods: The paper proposes five methods for edge detection in color images based on the local feature representation of the color Clifford Hardy signal. These methods utilize the multi-scale structure of the signal to resist noise and improve edge detection accuracy.
results: The proposed methods are evaluated using image quality assessment criteria and are shown to be superior to traditional edge detection methods in terms of robustness to noise and accuracy. Additionally, an example application of color optical flow detection using the proposed approach is provided.
Abstract
This paper introduces the idea of the color Clifford Hardy signal, which can be used to process color images. As a complex analytic function's high-dimensional analogue, the color Clifford Hardy signal inherits many desirable qualities of analyticity. A crucial tool for getting the color and structural data is the local feature representation of a color image in the color Clifford Hardy signal. By looking at the extended Cauchy-Riemann equations in the high-dimensional space, it is possible to see the connection between the different parts of the color Clifford Hardy signal. Based on the distinctive and important local amplitude and local phase generated by the color Clifford Hardy signal, we propose five methods to identify the edges of color images with relation to a certain color. To prove the superiority of the offered methodologies, numerous comparative studies employing image quality assessment criteria are used. Specifically by using the multi-scale structure of the color Clifford Hardy signal, the proposed approaches are resistant to a variety of noises. In addition, a color optical flow detection method with anti-noise ability is provided as an example of application.
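The color Clifford Hardy signal itself is not reproduced here, but its use of local amplitude and local phase can be illustrated with a simpler single-channel analogue: the monogenic signal obtained from the Riesz transform. The FFT-based Riesz filters below are standard; treating their output as an edge indicator is only a sketch of the kind of feature the paper builds on, not the authors' method.

```python
import numpy as np

def local_amplitude_phase(img: np.ndarray):
    """Local amplitude/phase of a grayscale image via the Riesz transform
    (monogenic signal) -- a simplified, single-channel analogue of the local
    features the color Clifford Hardy signal provides for color images."""
    h, w = img.shape
    u = np.fft.fftfreq(h)[:, None]
    v = np.fft.fftfreq(w)[None, :]
    q = np.sqrt(u ** 2 + v ** 2)
    q[0, 0] = 1.0                                  # avoid division by zero at DC
    F = np.fft.fft2(img)
    r1 = np.real(np.fft.ifft2(-1j * u / q * F))    # first Riesz component
    r2 = np.real(np.fft.ifft2(-1j * v / q * F))    # second Riesz component
    amplitude = np.sqrt(img ** 2 + r1 ** 2 + r2 ** 2)
    phase = np.arctan2(np.sqrt(r1 ** 2 + r2 ** 2), img)
    return amplitude, phase

# toy usage: a bright vertical stripe; the local amplitude is larger at its border
img = np.zeros((64, 64)); img[:, 24:40] = 1.0
amp, ph = local_amplitude_phase(img - img.mean())   # remove DC, as a bandpass would
print(amp[32, 24] > amp[32, 8])                     # border column beats a flat column -> True
```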
Out-of-distribution multi-view auto-encoders for prostate cancer lesion detection
results: We evaluate our approach on a publicly available dataset and obtain better detection results than a single-direction approach (AUC = 82.3% vs. 73.1%).
Abstract
Traditional deep learning (DL) approaches based on supervised learning paradigms require large amounts of annotated data that are rarely available in the medical domain. Unsupervised Out-of-distribution (OOD) detection is an alternative that requires less annotated data. Further, OOD applications exploit the class skewness commonly present in medical data. Magnetic resonance imaging (MRI) has proven to be useful for prostate cancer (PCa) diagnosis and management, but current DL approaches rely on T2w axial MRI, which suffers from low out-of-plane resolution. We propose a multi-stream approach to accommodate different T2w directions to improve the performance of PCa lesion detection in an OOD approach. We evaluate our approach on a publicly available data-set, obtaining better detection results in terms of AUC when compared to a single direction approach (73.1 vs 82.3). Our results show the potential of OOD approaches for PCa lesion detection based on MRI.
Summary
Traditional deep learning (DL) approaches based on supervised learning require large amounts of annotated data, which are rarely available in the medical domain. Unsupervised out-of-distribution (OOD) detection is an alternative that needs fewer annotations, and OOD applications can exploit the class skewness commonly present in medical data. Magnetic resonance imaging (MRI) has proven useful for prostate cancer (PCa) diagnosis and management, but current DL approaches rely on T2w axial MRI, which suffers from low out-of-plane resolution. We propose a multi-stream approach that accommodates different T2w directions to improve PCa lesion detection in an OOD setting. Evaluated on a publicly available dataset, our method achieves better detection results in terms of AUC than a single-direction approach (82.3 vs. 73.1). These results show the potential of OOD approaches for MRI-based PCa lesion detection.
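The core OOD idea above, detecting lesions as deviations from a model trained only on in-distribution data, can be sketched with a reconstruction-based anomaly score: train an auto-encoder on lesion-free slices and flag test slices with high reconstruction error. The PyTorch sketch below is a minimal illustration under that assumption; the paper's actual multi-stream architecture and score combination are not reproduced, and the layer sizes and per-view score averaging are assumptions.

```python
import torch
import torch.nn as nn

class SliceAE(nn.Module):
    """Minimal convolutional auto-encoder for 2D MRI slices (illustrative sizes)."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

def ood_score(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Per-slice anomaly score: mean squared reconstruction error.
    High scores flag slices that deviate from the lesion-free training distribution."""
    with torch.no_grad():
        return ((model(x) - x) ** 2).mean(dim=(1, 2, 3))

# Multi-stream variant (sketch): one auto-encoder per T2w direction
# (axial / sagittal / coronal), with their scores combined, e.g. averaged.
axial_ae, sagittal_ae, coronal_ae = SliceAE(), SliceAE(), SliceAE()
x_ax = torch.rand(4, 1, 128, 128)        # dummy axial slices
print(ood_score(axial_ae, x_ax))         # 4 anomaly scores
```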
Leveraging multi-view data without annotations for prostate MRI segmentation: A contrastive approach
paper_authors: Tim Nikolass Lindeijer, Tord Martin Ytredal, Trygve Eftestøl, Tobias Nordström, Fredrik Jäderling, Martin Eklund, Alvaro Fernandez-Quilez
results: The results show that the contrastive approach improves the accuracy of prostate segmentation and achieves better consistency across different acquisition views.
Abstract
An accurate prostate delineation and volume characterization can support the clinical assessment of prostate cancer. A large number of automatic prostate segmentation tools consider exclusively the axial MRI direction, despite multi-view data being available as per acquisition protocols. Further, when multi-view data is exploited, manual annotations and availability at test time of all the views are commonly assumed. In this work, we explore a contrastive approach at training time to leverage multi-view data without annotations and provide flexibility at deployment time in the event of missing views. We propose a triplet encoder and single decoder network based on U-Net, tU-Net (triplet U-Net). Our proposed architecture is able to exploit non-annotated sagittal and coronal views via contrastive learning to improve the segmentation from a volumetric perspective. For that purpose, we introduce the concept of inter-view similarity in the latent space. To guide the training, we combine a Dice score loss calculated with respect to the axial view and its manual annotations together with a multi-view contrastive loss. tU-Net shows statistical improvement in Dice score coefficient (DSC) with respect to only the axial view (91.25±0.52% compared to 86.40±1.50%, P<.001). Sensitivity analysis reveals the volumetric positive impact of the contrastive loss when paired with tU-Net (2.85±1.34% compared to 3.81±1.88%, P<.001). Further, our approach shows good external volumetric generalization in an in-house dataset when tested with multi-view data (2.76±1.89% compared to 3.92±3.31%, P=.002), showing the feasibility of exploiting non-annotated multi-view data through contrastive learning whilst providing flexibility at deployment in the event of missing views.
Summary
Accurate prostate delineation and volume characterization can support the clinical assessment of prostate cancer. Many automatic prostate segmentation tools use only the axial direction even though acquisition protocols make multi-view data available; moreover, when multi-view data are used, manual annotations and the availability of all views at test time are usually assumed. We propose a contrastive approach that leverages multi-view data without annotations at training time and offers flexibility at deployment when views are missing. The architecture, tU-Net (triplet U-Net), uses a triplet encoder and a single decoder based on U-Net and exploits non-annotated sagittal and coronal views via contrastive learning to improve segmentation from a volumetric perspective, introducing the concept of inter-view similarity in the latent space. Training is guided by a Dice loss computed on the axial view and its manual annotations combined with a multi-view contrastive loss. tU-Net shows a statistically significant improvement in Dice score coefficient (DSC) over the axial-only baseline (91.25±0.52% vs. 86.40±1.50%, P<.001). Sensitivity analysis shows the volumetric benefit of the contrastive loss when paired with tU-Net (2.85±1.34% vs. 3.81±1.88%, P<.001). The approach also generalizes well volumetrically to an in-house dataset when tested with multi-view data (2.76±1.89% vs. 3.92±3.31%, P=.002), showing that non-annotated multi-view data can be exploited through contrastive learning while providing flexibility at deployment in the event of missing views.
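The training objective described above, a Dice loss on the annotated axial view plus an inter-view contrastive term in the latent space, can be sketched as follows. The contrastive term below is a generic InfoNCE-style stand-in that uses other subjects in the batch as negatives; the paper's exact inter-view similarity formulation and the weighting `lam` are assumptions.

```python
import torch
import torch.nn.functional as F

def dice_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss on the axial view (pred: probabilities, target: binary mask)."""
    inter = (pred * target).sum(dim=(1, 2, 3))
    denom = pred.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    return 1.0 - ((2.0 * inter + eps) / (denom + eps)).mean()

def interview_contrastive(z_ax, z_sag, z_cor, temperature: float = 0.1):
    """Pull the three views of the same subject together in the latent space
    (InfoNCE-style, other subjects in the batch act as negatives). A generic
    stand-in for the paper's inter-view similarity term, not its exact form."""
    loss = 0.0
    for z_other in (z_sag, z_cor):
        a = F.normalize(z_ax, dim=1)
        b = F.normalize(z_other, dim=1)
        logits = a @ b.t() / temperature               # (B, B) cosine similarities
        labels = torch.arange(a.size(0), device=a.device)
        loss = loss + F.cross_entropy(logits, labels)  # matching view = positive
    return loss / 2.0

def tunet_loss(pred_ax, mask_ax, z_ax, z_sag, z_cor, lam: float = 0.1):
    # lam is a hypothetical weighting between the two terms
    return dice_loss(pred_ax, mask_ax) + lam * interview_contrastive(z_ax, z_sag, z_cor)
```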
CATS v2: Hybrid encoders for robust medical segmentation
paper_authors: Hao Li, Han Liu, Dewei Hu, Xing Yao, Jiacheng Wang, Ipek Oguz
for: This paper proposes a new method for 3D medical image segmentation, specifically for vestibular schwannoma (VS) and prostate segmentation.
methods: The proposed method uses a hybrid encoder consisting of a CNN-based encoder path and a transformer path with a shifted window to leverage both local and global information.
results: The proposed method demonstrates superior performance in terms of higher Dice scores compared to state-of-the-art methods on two public challenge datasets (CrossMoDA and MSD-5) for VS and prostate segmentation.
Abstract
Convolutional Neural Networks (CNNs) have exhibited strong performance in medical image segmentation tasks by capturing high-level (local) information, such as edges and textures. However, due to the limited field of view of convolution kernel, it is hard for CNNs to fully represent global information. Recently, transformers have shown good performance for medical image segmentation due to their ability to better model long-range dependencies. Nevertheless, transformers struggle to capture high-level spatial features as effectively as CNNs. A good segmentation model should learn a better representation from local and global features to be both precise and semantically accurate. In our previous work, we proposed CATS, which is a U-shaped segmentation network augmented with transformer encoder. In this work, we further extend this model and propose CATS v2 with hybrid encoders. Specifically, hybrid encoders consist of a CNN-based encoder path paralleled to a transformer path with a shifted window, which better leverage both local and global information to produce robust 3D medical image segmentation. We fuse the information from the convolutional encoder and the transformer at the skip connections of different resolutions to form the final segmentation. The proposed method is evaluated on two public challenge datasets: Cross-Modality Domain Adaptation (CrossMoDA) and task 5 of Medical Segmentation Decathlon (MSD-5), to segment vestibular schwannoma (VS) and prostate, respectively. Compared with the state-of-the-art methods, our approach demonstrates superior performance in terms of higher Dice scores.
Summary
Convolutional neural networks (CNNs) perform well in medical image segmentation by capturing high-level (local) information such as edges and textures, but the limited field of view of the convolution kernel makes it hard for CNNs to fully represent global information. Transformers have recently shown good performance in medical image segmentation because they better model long-range dependencies, yet they do not capture high-level spatial features as effectively as CNNs. A good segmentation model should learn representations of both local and global features to be precise and semantically accurate. In previous work we proposed CATS, a U-shaped segmentation network augmented with a transformer encoder; here we extend it to CATS v2 with hybrid encoders. Specifically, the hybrid encoder consists of a CNN-based encoder path in parallel with a shifted-window transformer path, which together better exploit local and global information for robust 3D medical image segmentation. The information from the convolutional encoder and the transformer is fused at the skip connections of different resolutions to form the final segmentation. The method is evaluated on two public challenge datasets, Cross-Modality Domain Adaptation (CrossMoDA) and task 5 of the Medical Segmentation Decathlon (MSD-5), for vestibular schwannoma (VS) and prostate segmentation, respectively, and achieves higher Dice scores than state-of-the-art methods.
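The hybrid-encoder idea, a CNN path in parallel with a transformer path whose features are fused at the skip connections, can be sketched as below. For brevity this is a 2D, window-free transformer stage rather than the 3D shifted-window (Swin-style) design CATS v2 actually uses; the layer sizes and the concatenation-plus-1x1-convolution fusion are assumptions.

```python
import torch
import torch.nn as nn

class HybridEncoderStage(nn.Module):
    """One stage of a hybrid encoder: a CNN path in parallel with a (plain) transformer
    path over flattened tokens, fused by a 1x1 convolution. Illustrative only; the real
    CATS v2 uses a shifted-window transformer in 3D."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(),
        )
        self.proj = nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1)
        self.transformer = nn.TransformerEncoderLayer(
            d_model=out_ch, nhead=4, dim_feedforward=2 * out_ch, batch_first=True
        )
        self.fuse = nn.Conv2d(2 * out_ch, out_ch, 1)

    def forward(self, x):
        local_feat = self.cnn(x)                               # local detail (CNN path)
        t = self.proj(x)                                       # tokens for the global path
        b, c, h, w = t.shape
        tokens = t.flatten(2).transpose(1, 2)                  # (B, H*W, C)
        global_feat = self.transformer(tokens).transpose(1, 2).reshape(b, c, h, w)
        return self.fuse(torch.cat([local_feat, global_feat], dim=1))  # skip-connection feature

# usage: features from the stages would feed a U-Net style decoder as skip connections
stage1, stage2 = HybridEncoderStage(1, 32), HybridEncoderStage(32, 64)
x = torch.rand(2, 1, 64, 64)
skip1 = stage1(x)       # (2, 32, 32, 32)
skip2 = stage2(skip1)   # (2, 64, 16, 16)
print(skip1.shape, skip2.shape)
```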
Deep Learning-Based Open Source Toolkit for Eosinophil Detection in Pediatric Eosinophilic Esophagitis
paper_authors: Juming Xiong, Yilin Liu, Ruining Deng, Regina N Tyree, Hernan Correa, Girish Hiremath, Yaohong Wang, Yuankai Huo
for: This work develops an open-source toolkit (Open-EoE) for detecting eosinophils (Eos) in whole slide images (WSIs) of esophageal biopsies.
methods: The toolkit supports three state-of-the-art deep-learning-based object detection models and implements an ensemble learning strategy to improve performance.
results: Experiments show that the Open-EoE toolkit effectively detects eosinophils in WSIs, reaching 91% accuracy and agreeing well with pathologist evaluations.
Abstract
Eosinophilic Esophagitis (EoE) is a chronic, immune/antigen-mediated esophageal disease, characterized by symptoms related to esophageal dysfunction and histological evidence of eosinophil-dominant inflammation. Owing to the intricate microscopic representation of EoE in imaging, current methodologies, which depend on manual identification, are not only labor-intensive but also prone to inaccuracies. In this study, we develop an open-source toolkit, named Open-EoE, to perform end-to-end whole slide image (WSI) level eosinophil (Eos) detection using one line of command via Docker. Specifically, the toolkit supports three state-of-the-art deep learning-based object detection models. Furthermore, Open-EoE optimizes performance by implementing an ensemble learning strategy, enhancing the precision and reliability of our results. The experimental results demonstrated that the Open-EoE toolkit can efficiently detect Eos on a testing set with 289 WSIs. At the widely accepted threshold of >= 15 Eos per high power field (HPF) for diagnosing EoE, Open-EoE achieved an accuracy of 91%, showing decent consistency with pathologist evaluations. This suggests a promising avenue for integrating machine learning methodologies into the diagnostic process for EoE. The Docker container and source code have been made publicly available at https://github.com/hrlblab/Open-EoE.
Summary
Eosinophilic esophagitis (EoE) is a chronic, immune/antigen-mediated esophageal disease characterized by esophageal dysfunction and histological evidence of eosinophil-dominant inflammation. Because the microscopic presentation of EoE in imaging is intricate, current methods that rely on manual identification are labor-intensive and prone to inaccuracies. In this work we develop an open-source toolkit, Open-EoE, that performs end-to-end whole slide image (WSI)-level eosinophil (Eos) detection with a single command via Docker. The toolkit supports three state-of-the-art deep-learning-based object detection models and further improves performance through an ensemble learning strategy, enhancing the precision and reliability of the results. Experiments show that Open-EoE efficiently detects Eos on a test set of 289 WSIs. At the widely accepted threshold of >= 15 Eos per high power field (HPF) for diagnosing EoE, Open-EoE achieved an accuracy of 91%, in good agreement with pathologist evaluations, which suggests a promising avenue for integrating machine learning into the EoE diagnostic process. The Docker container and source code are publicly available at https://github.com/hrlblab/Open-EoE.
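Open-EoE's actual ensemble strategy is available in the linked repository; as a generic illustration of how detections from several models can be combined and turned into an Eos-per-HPF decision, the sketch below uses simple IoU-based voting. The voting rule, thresholds, and box coordinates are assumptions, not the toolkit's implementation.

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def ensemble_detections(per_model_boxes, iou_thr=0.5, min_votes=2):
    """Keep a detection if at least `min_votes` models agree on it (IoU-based voting),
    averaging the agreeing boxes. A generic stand-in for Open-EoE's ensemble strategy."""
    kept = []
    all_boxes = [(b, m) for m, boxes in enumerate(per_model_boxes) for b in boxes]
    used = [False] * len(all_boxes)
    for i, (box, _) in enumerate(all_boxes):
        if used[i]:
            continue
        group = [j for j, (other, _) in enumerate(all_boxes)
                 if not used[j] and iou(box, other) >= iou_thr]
        models = {all_boxes[j][1] for j in group}
        for j in group:
            used[j] = True
        if len(models) >= min_votes:
            kept.append(np.mean([all_boxes[j][0] for j in group], axis=0))
    return kept

# three hypothetical detectors on one high-power field (HPF)
boxes = [
    [(10, 10, 30, 30), (50, 50, 70, 70)],       # model A
    [(12, 11, 31, 29), (200, 200, 220, 220)],   # model B
    [(11, 9, 29, 30), (51, 48, 69, 71)],        # model C
]
eos_in_hpf = ensemble_detections(boxes)
print(len(eos_in_hpf), "Eos in this HPF ->",
      "EoE-positive field" if len(eos_in_hpf) >= 15 else "below the 15 Eos/HPF threshold")
```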
Revolutionizing Space Health (Swin-FSR): Advancing Super-Resolution of Fundus Images for SANS Visual Assessment Technology
paper_authors: Khondker Fariha Hossain, Sharif Amit Kamran, Joshua Ong, Andrew G. Lee, Alireza Tavakkoli
for: This paper is written for the purpose of developing a novel model for fundus image super-resolution, specifically using Swin Transformer with spatial and depth-wise attention.
methods: The paper utilizes a novel model called Swin-FSR, which combines Swin Transformer with spatial and depth-wise attention for fundus image super-resolution.
results: The paper achieves Peak signal-to-noise-ratio (PSNR) of 47.89, 49.00 and 45.32 on three public datasets, namely iChallenge-AMD, iChallenge-PM, and G1020. Additionally, the model showed comparable results on a privately held dataset for Spaceflight-associated Neuro-ocular Syndrome (SANS) provided by NASA.
Abstract
The rapid accessibility of portable and affordable retinal imaging devices has made early differential diagnosis easier. For example, color funduscopy imaging is readily available in remote villages, which can help to identify diseases like age-related macular degeneration (AMD), glaucoma, or pathological myopia (PM). On the other hand, astronauts at the International Space Station utilize this camera for identifying spaceflight-associated neuro-ocular syndrome (SANS). However, due to the unavailability of experts in these locations, the data has to be transferred to an urban healthcare facility (AMD and glaucoma) or a terrestrial station (e.g, SANS) for more precise disease identification. Moreover, due to low bandwidth limits, the imaging data has to be compressed for transfer between these two places. Different super-resolution algorithms have been proposed throughout the years to address this. Furthermore, with the advent of deep learning, the field has advanced so much that x2 and x4 compressed images can be decompressed to their original form without losing spatial information. In this paper, we introduce a novel model called Swin-FSR that utilizes Swin Transformer with spatial and depth-wise attention for fundus image super-resolution. Our architecture achieves Peak signal-to-noise-ratio (PSNR) of 47.89, 49.00 and 45.32 on three public datasets, namely iChallenge-AMD, iChallenge-PM, and G1020. Additionally, we tested the model's effectiveness on a privately held dataset for SANS provided by NASA and achieved comparable results against previous architectures.
Summary
The rapid availability of portable and affordable retinal imaging devices has made early differential diagnosis easier. For example, color funduscopy is readily available in remote villages and can help identify diseases such as age-related macular degeneration (AMD), glaucoma, or pathological myopia (PM). However, because experts are lacking in these locations, the data must be transferred to an urban healthcare facility (for AMD and glaucoma) or to a terrestrial station (e.g., for SANS) for more precise diagnosis, and because of bandwidth limits the imaging data must be compressed for transfer. Various super-resolution algorithms have been proposed over the years to address this, and with the advent of deep learning the field has advanced to the point where x2- and x4-compressed images can be restored to their original form without losing spatial information. In this paper we propose a new model, Swin-FSR, which uses a Swin Transformer with spatial and depth-wise attention for fundus image super-resolution. Our architecture achieves peak signal-to-noise ratios (PSNR) of 47.89, 49.00 and 45.32 on three public datasets, namely iChallenge-AMD, iChallenge-PM, and G1020. In addition, we tested the model on a privately held SANS dataset provided by NASA and achieved results comparable to previous architectures.
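PSNR, the metric reported above, is defined from the mean squared error against the reference image; the snippet below implements the standard 8-bit formulation. It is included for reference only and is not the authors' evaluation script; the toy images are random.

```python
import numpy as np

def psnr(reference: np.ndarray, reconstructed: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between a reference image and a reconstruction
    (standard definition for images with maximum value `peak`)."""
    mse = np.mean((reference.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

# toy check with random 8-bit images (not data from the paper)
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
noisy = np.clip(ref + rng.normal(0, 5, ref.shape), 0, 255).astype(np.uint8)
print(f"{psnr(ref, noisy):.2f} dB")
```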
A Hierarchical Descriptor Framework for On-the-Fly Anatomical Location Matching between Longitudinal Studies
results: The method reduces the computation time for mapping points to the millisecond scale, without requiring prior training, resampling, or precomputed deformation fields from registration, and achieves higher matching accuracy on the Deep Lesion Tracking dataset while being 24 times faster than the most precise algorithm reported there.
Abstract
We propose a method to match anatomical locations between pairs of medical images in longitudinal comparisons. The matching is made possible by computing a descriptor of the query point in a source image based on a hierarchical sparse sampling of image intensities that encode the location information. Then, a hierarchical search operation finds the corresponding point with the most similar descriptor in the target image. This simple yet powerful strategy reduces the computational time of mapping points to a millisecond scale on a single CPU. Thus, radiologists can compare similar anatomical locations in near real-time without requiring extra architectural costs for precomputing or storing deformation fields from registrations. Our algorithm does not require prior training, resampling, segmentation, or affine transformation steps. We have tested our algorithm on the recently published Deep Lesion Tracking dataset annotations. We observed more accurate matching compared to Deep Lesion Tracker while being 24 times faster than the most precise algorithm reported therein. We also investigated the matching accuracy on CT and MR modalities and compared the proposed algorithm's accuracy against ground truth consolidated from multiple radiologists.
Summary
We propose a method for matching anatomical locations between pairs of medical images in longitudinal comparisons. A descriptor of the query point in the source image is computed from a hierarchical sparse sampling of image intensities that encodes location information, and a hierarchical search then finds the point with the most similar descriptor in the target image. This simple yet powerful strategy reduces the computation time for mapping points to the millisecond scale on a single CPU, so radiologists can compare similar anatomical locations in near real time without the extra architectural cost of precomputing or storing deformation fields from registration. The algorithm requires no prior training, resampling, segmentation, or affine transformation steps. Tested on the recently published Deep Lesion Tracking dataset annotations, it matches more accurately than Deep Lesion Tracker while being 24 times faster than the most precise algorithm reported there. We also investigated matching accuracy on CT and MR modalities and compared the proposed algorithm's accuracy against a ground truth consolidated from multiple radiologists.
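The two ingredients described above, a descriptor built from hierarchical sparse sampling of intensities around the query point and a hierarchical (coarse-to-fine) search in the target image, can be illustrated with a small NumPy sketch. The ring-sampling pattern, grid steps, and synthetic images are assumptions for illustration; the paper's actual descriptor and search are more elaborate.

```python
import numpy as np

def descriptor(img: np.ndarray, y: int, x: int, radii=(1, 2, 4, 8, 16), n_per_ring=8):
    """Hierarchical sparse descriptor: intensities sampled on rings of increasing radius
    around (y, x). Illustrative only; the paper's exact sampling pattern differs."""
    h, w = img.shape
    vals = [img[y, x]]
    for r in radii:
        for k in range(n_per_ring):
            a = 2 * np.pi * k / n_per_ring
            yy = int(round(np.clip(y + r * np.sin(a), 0, h - 1)))
            xx = int(round(np.clip(x + r * np.cos(a), 0, w - 1)))
            vals.append(img[yy, xx])
    return np.asarray(vals, dtype=np.float32)

def match(query_desc, target: np.ndarray, steps=(8, 2, 1)):
    """Coarse-to-fine search for the target point whose descriptor is closest to the query."""
    h, w = target.shape
    best = (h // 2, w // 2)
    radius = max(h, w)
    for step in steps:   # progressively finer grids around the current best hit
        ys = range(max(0, best[0] - radius), min(h, best[0] + radius + 1), step)
        xs = range(max(0, best[1] - radius), min(w, best[1] + radius + 1), step)
        candidates = [(np.linalg.norm(descriptor(target, y, x) - query_desc), (y, x))
                      for y in ys for x in xs]
        best = min(candidates)[1]
        radius = 2 * step
    return best

# toy example: the same synthetic "anatomy" shifted between two studies
src = np.add.outer(np.linspace(0, 1, 128), np.linspace(0, 1, 128))
src[40:50, 60:70] += 2.0                       # a distinctive structure
tgt = np.roll(np.roll(src, 5, axis=0), -3, axis=1)
q = descriptor(src, 45, 65)
print(match(q, tgt))                           # expected (50, 62), the shifted location of (45, 65)
```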