eess.IV - 2023-07-30

Unsupervised Decomposition Networks for Bias Field Correction in MR Image

  • paper_url: http://arxiv.org/abs/2307.16219
  • repo_url: https://github.com/leongdong/bias-decomposition-networks
  • paper_authors: Dong Liang, Xingyu Qiu, Kuanquan Wang, Gongning Luo, Wei Wang, Yashu Liu
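  • for: This study proposes unsupervised decomposition networks that estimate the bias field in corrupted MR images without supervised training.
  • methods: The method decomposes the biased MR image with two networks, a segmentation part and an estimation part, which are optimized alternately.
  • results: Experiments show that the method accurately estimates bias fields and produces better bias correction results. The code is available at https://github.com/LeongDong/Bias-Decomposition-Networks.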
    Abstract Bias field, which is caused by imperfect MR devices or imaged objects, introduces intensity inhomogeneity into MR images and degrades the performance of MR image analysis methods. Many retrospective algorithms were developed to facilitate the bias correction, to which the deep learning-based methods outperformed. However, in the training phase, the supervised deep learning-based methods heavily rely on the synthesized bias field. As the formation of the bias field is extremely complex, it is difficult to mimic the true physical property of MR images by synthesized data. While bias field correction and image segmentation are strongly related, the segmentation map is precisely obtained by decoupling the bias field from the original MR image, and the bias value is indicated by the segmentation map in reverse. Thus, we proposed novel unsupervised decomposition networks that are trained only with biased data to obtain the bias-free MR images. Networks are made up of: a segmentation part to predict the probability of every pixel belonging to each class, and an estimation part to calculate the bias field, which are optimized alternately. Furthermore, loss functions based on the combination of fuzzy clustering and the multiplicative bias field are also devised. The proposed loss functions introduce the smoothness of bias field and construct the soft relationships among different classes under intra-consistency constraints. Extensive experiments demonstrate that the proposed method can accurately estimate bias fields and produce better bias correction results. The code is available on the link: https://github.com/LeongDong/Bias-Decomposition-Networks.
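    Summary The bias field, which is caused by imperfect MR devices or by the imaged objects, introduces intensity inhomogeneity into MR images and degrades the performance of MR image analysis methods. Many retrospective algorithms have been developed for bias correction, and deep learning-based methods outperform the others. In the training phase, however, supervised deep learning methods depend heavily on synthesized bias fields, and because the formation of the bias field is very complex, synthesized data can hardly reproduce the true physical properties of MR images. Bias field correction and image segmentation are closely related: decoupling the bias field from the image yields a precise segmentation map, and the segmentation map in turn indicates the bias value. We therefore propose novel unsupervised decomposition networks that are trained only on biased data to obtain bias-free MR images. The networks consist of a segmentation part, which predicts the probability that each pixel belongs to each class, and an estimation part, which computes the bias field; the two parts are optimized alternately. We also devise loss functions that combine fuzzy clustering with the multiplicative bias field, enforcing smoothness of the bias field and soft relationships among classes. Extensive experiments show that the method estimates bias fields accurately and produces better bias correction results. The code is available at https://github.com/LeongDong/Bias-Decomposition-Networks.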
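
The multiplicative bias model and the fuzzy-clustering idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' loss or network: the energy below is the classical fuzzy c-means form with a multiplicative bias term, and the function names, the ramp-shaped bias field, and the toy two-class image are all assumptions made for illustration.

```python
def make_bias_field(height, width, strength=0.3):
    """Smooth multiplicative bias: a slow left-to-right ramp centred on 1.0."""
    return [[1.0 + strength * (x / (width - 1) - 0.5) for x in range(width)]
            for _ in range(height)]


def apply_bias(image, bias):
    """Observed image = bias-free image * bias field (pixel-wise)."""
    return [[v * b for v, b in zip(img_row, bias_row)]
            for img_row, bias_row in zip(image, bias)]


def correct_bias(observed, bias):
    """Recover the bias-free image by dividing out the estimated bias."""
    return [[v / b for v, b in zip(obs_row, bias_row)]
            for obs_row, bias_row in zip(observed, bias)]


def fuzzy_bias_energy(observed, bias, memberships, centers, q=2.0):
    """Sum over pixels and classes of u_k^q * (I - b * c_k)^2 (assumed form)."""
    energy = 0.0
    for y, row in enumerate(observed):
        for x, value in enumerate(row):
            for k, center in enumerate(centers):
                u = memberships[k][y][x]
                energy += (u ** q) * (value - bias[y][x] * center) ** 2
    return energy


# Toy image: top half is class 0 (intensity 50), bottom half is class 1 (100).
H, W = 8, 8
centers = [50.0, 100.0]
clean = [[centers[0] if y < H // 2 else centers[1] for _ in range(W)]
         for y in range(H)]
memberships = [
    [[1.0 if (y < H // 2) == (k == 0) else 0.0 for _ in range(W)]
     for y in range(H)]
    for k in range(2)
]

true_bias = make_bias_field(H, W)
observed = apply_bias(clean, true_bias)
flat_bias = [[1.0] * W for _ in range(H)]

energy_true = fuzzy_bias_energy(observed, true_bias, memberships, centers)
energy_flat = fuzzy_bias_energy(observed, flat_bias, memberships, centers)
corrected = correct_bias(observed, true_bias)
print(energy_true < energy_flat)
```

In this toy setting the energy is lower for the true bias field than for a flat one, which is the kind of signal an unsupervised estimator can follow when only biased data are available.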

Gastrointestinal Mucosal Problems Classification with Deep Learning

  • paper_url: http://arxiv.org/abs/2307.16198
  • repo_url: None
  • paper_authors: Mohammadhasan Goharian, Vahid Goharian, Hamidreza Bolhasani
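  • for: Detecting gastrointestinal mucosal changes to support early diagnosis and prevention of gastrointestinal cancers.
  • methods: Deep learning, specifically transfer learning (TL) based on convolutional neural networks (CNNs).
  • results: The best model reached 93% accuracy on test images and was also applied to real endoscopy and colonoscopy videos.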
    Abstract Gastrointestinal mucosal changes can cause cancers after some years and early diagnosing them can be very useful to prevent cancers and early treatment. In this article, 8 classes of mucosal changes and anatomical landmarks including Polyps, Ulcerative Colitis, Esophagitis, Normal Z-Line, Normal Pylorus, Normal Cecum, Dyed Lifted Polyps, and Dyed Lifted Margin were predicted by deep learning. We used neural networks in this article. It is a black box artificial intelligence algorithm that works like a human neural system. In this article, Transfer Learning (TL) based on the Convolutional Neural Networks (CNNs), which is one of the well-known types of neural networks in image processing is used. We compared some famous CNN architecture including VGG, Inception, Xception, and ResNet. Our best model got 93% accuracy in test images. At last, we used our model in some real endoscopy and colonoscopy movies to classify problems.
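    Summary Gastrointestinal mucosal changes can lead to cancer, and diagnosing them early helps with prevention and early treatment. This article predicts 8 classes of mucosal changes and anatomical landmarks: Polyps, Ulcerative Colitis, Esophagitis, Normal Z-Line, Normal Pylorus, Normal Cecum, Dyed Lifted Polyps, and Dyed Lifted Margin. The predictions are made with neural networks, a black-box artificial intelligence algorithm that works like the human nervous system. Specifically, we use transfer learning (TL) based on convolutional neural networks (CNNs), a widely used type of neural network in image processing, and compare several well-known CNN architectures, including VGG, Inception, Xception, and ResNet. Our best model reached 93% accuracy on test images. Finally, we applied the model to real endoscopy and colonoscopy videos for classification.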

StarSRGAN: Improving Real-World Blind Super-Resolution

  • paper_url: http://arxiv.org/abs/2307.16169
  • repo_url: https://github.com/kynthesis/StarSRGAN
  • paper_authors: Khoa D. Vo, Len T. Bui
  • for: This paper aims to improve blind super-resolution (SR) in computer vision, that is, enhancing the resolution of low-resolution images without prior knowledge of the degradation process.
  • methods: The paper introduces StarSRGAN, a novel GAN model that uses five different architectures to achieve state-of-the-art (SOTA) performance in blind SR tasks. The model is designed to provide visually compelling outcomes with improved super-resolved quality.
  • results: The experimental comparisons with Real-ESRGAN show that StarSRGAN achieves roughly 10% better performance on the MANIQA and AHIQ measures, while StarSRGAN Lite provides approximately 7.5 times faster reconstruction speed with only a slight decrease in image quality. The codes are available at https://github.com/kynthesis/StarSRGAN.
    Abstract The aim of blind super-resolution (SR) in computer vision is to improve the resolution of an image without prior knowledge of the degradation process that caused the image to be low-resolution. The State of the Art (SOTA) model Real-ESRGAN has advanced perceptual loss and produced visually compelling outcomes using more complex degradation models to simulate real-world degradations. However, there is still room to improve the super-resolved quality of Real-ESRGAN by implementing recent techniques. This research paper introduces StarSRGAN, a novel GAN model designed for blind super-resolution tasks that utilize 5 various architectures. Our model provides new SOTA performance with roughly 10% better on the MANIQA and AHIQ measures, as demonstrated by experimental comparisons with Real-ESRGAN. In addition, as a compact version, StarSRGAN Lite provides approximately 7.5 times faster reconstruction speed (real-time upsampling from 540p to 4K) but can still keep nearly 90% of image quality, thereby facilitating the development of a real-time SR experience for future research. Our codes are released at https://github.com/kynthesis/StarSRGAN.
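    Summary Blind super-resolution (SR) in computer vision aims to increase the resolution of an image without prior knowledge of the degradation process. The state-of-the-art (SOTA) model Real-ESRGAN uses more complex degradation models to simulate real-world degradations, but there is still room to improve its super-resolved quality. This paper introduces StarSRGAN, a new GAN model for blind SR tasks. The model uses 5 different architectures and sets new SOTA performance, improving on Real-ESRGAN by roughly 10% on the MANIQA and AHIQ measures. A compact version, StarSRGAN Lite, offers faster reconstruction and enables a real-time SR experience when upsampling from 540p to 4K. Our code is released at https://github.com/kynthesis/StarSRGAN.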
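
To make the "540p to 4K" and "7.5 times faster" figures concrete, here is a small arithmetic sketch. It assumes 16:9 frames, with 540p taken as 960x540 and 4K as 3840x2160; the baseline time passed to `lite_time_ms` is a hypothetical number, not a measurement from the paper.

```python
P540 = (960, 540)        # assumed 16:9 frame size for "540p"
UHD_4K = (3840, 2160)    # assumed 16:9 frame size for "4K"


def upscale_factor(src, dst):
    """Per-axis integer scale factor between two (width, height) sizes."""
    (src_w, src_h), (dst_w, dst_h) = src, dst
    if dst_w % src_w or dst_h % src_h or dst_w // src_w != dst_h // src_h:
        raise ValueError("scale is not a single integer factor")
    return dst_w // src_w


def pixel_ratio(src, dst):
    """How many output pixels are produced per input pixel."""
    return (dst[0] * dst[1]) / (src[0] * src[1])


def frame_budget_ms(fps):
    """Time available per frame for real-time playback at a given frame rate."""
    return 1000.0 / fps


def lite_time_ms(baseline_ms, speedup=7.5):
    """Per-frame time after the roughly 7.5x speed-up quoted in the abstract."""
    return baseline_ms / speedup


scale = upscale_factor(P540, UHD_4K)
ratio = pixel_ratio(P540, UHD_4K)
print(scale, ratio, lite_time_ms(150.0), frame_budget_ms(30))
```

Under these assumptions the task is a 4x upscale per axis (16 output pixels per input pixel), and a hypothetical 150 ms baseline frame would drop to 20 ms, inside the roughly 33 ms budget of 30 frames per second.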

Structure-Preserving Synthesis: MaskGAN for Unpaired MR-CT Translation

  • paper_url: http://arxiv.org/abs/2307.16143
  • repo_url: https://github.com/HieuPhan33/MaskGAN
  • paper_authors: Minh Hieu Phan, Zhibin Liao, Johan W. Verjans, Minh-Son To
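  • for: This paper aims to provide a reliable and cost-effective medical image synthesis method for settings where paired medical imaging data are missing or scarce.
  • methods: The paper builds on the CycleGAN architecture and feeds automatically extracted coarse masks into it to preserve structural consistency.
  • results: Experiments show that MaskGAN performs well on a challenging pediatric dataset and preserves anatomical structures without expert annotations.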
    Abstract Medical image synthesis is a challenging task due to the scarcity of paired data. Several methods have applied CycleGAN to leverage unpaired data, but they often generate inaccurate mappings that shift the anatomy. This problem is further exacerbated when the images from the source and target modalities are heavily misaligned. Recently, current methods have aimed to address this issue by incorporating a supplementary segmentation network. Unfortunately, this strategy requires costly and time-consuming pixel-level annotations. To overcome this problem, this paper proposes MaskGAN, a novel and cost-effective framework that enforces structural consistency by utilizing automatically extracted coarse masks. Our approach employs a mask generator to outline anatomical structures and a content generator to synthesize CT contents that align with these structures. Extensive experiments demonstrate that MaskGAN outperforms state-of-the-art synthesis methods on a challenging pediatric dataset, where MR and CT scans are heavily misaligned due to rapid growth in children. Specifically, MaskGAN excels in preserving anatomical structures without the need for expert annotations. The code for this paper can be found at https://github.com/HieuPhan33/MaskGAN.
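    Summary Medical image synthesis is challenging because paired data are scarce. Many methods use CycleGAN to exploit unpaired data, but they often produce inaccurate mappings that shift the anatomy, and the problem worsens when the source and target modalities are heavily misaligned. Current methods address this by adding an auxiliary segmentation network, which requires costly and time-consuming pixel-level annotations. To alleviate this problem, this paper proposes MaskGAN, a new and cost-effective framework that preserves structural consistency using automatically extracted coarse masks. A mask generator outlines anatomical structures, and a content generator synthesizes CT contents aligned with those structures. Experiments show that MaskGAN performs well on a challenging pediatric dataset, in which MR and CT scans are heavily misaligned because of rapid growth in children, and preserves structures well without expert annotations. The code is available at https://github.com/HieuPhan33/MaskGAN.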
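
The idea of checking structural consistency with coarse masks can be sketched as follows. This is only an illustration: the fixed-threshold mask extraction and the Dice overlap score are assumptions chosen for simplicity, and the paper's own mask generator and losses may differ.

```python
def coarse_mask(image, threshold):
    """Binary foreground mask: 1 where intensity exceeds the threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in image]


def dice(mask_a, mask_b):
    """Dice overlap between two binary masks of the same shape."""
    intersection = sum(a & b
                       for row_a, row_b in zip(mask_a, mask_b)
                       for a, b in zip(row_a, row_b))
    total = sum(map(sum, mask_a)) + sum(map(sum, mask_b))
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 4x4 "MR" slice and two hypothetical synthesized "CT" slices.
mr = [[0, 0, 0, 0],
      [0, 9, 9, 0],
      [0, 9, 9, 0],
      [0, 0, 0, 0]]
ct_aligned = [[0, 0, 0, 0],
              [0, 5, 7, 0],
              [0, 6, 5, 0],
              [0, 0, 0, 0]]
ct_shifted = [[0, 0, 0, 0],
              [0, 0, 5, 7],
              [0, 0, 6, 5],
              [0, 0, 0, 0]]

mr_mask = coarse_mask(mr, threshold=1)
score_aligned = dice(mr_mask, coarse_mask(ct_aligned, threshold=1))
score_shifted = dice(mr_mask, coarse_mask(ct_shifted, threshold=1))
print(score_aligned, score_shifted)
```

A synthesized image whose anatomy has shifted scores lower against the source mask, which is the kind of discrepancy a structural consistency term would penalize.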

Implicit Neural Representation in Medical Imaging: A Comparative Survey

  • paper_url: http://arxiv.org/abs/2307.16142
  • repo_url: https://github.com/mindflow-institue/awesome-implicit-neural-representations-in-medical-imaging
  • paper_authors: Amirali Molaei, Amirhossein Aminimehr, Armin Tavakoli, Amirhossein Kazerouni, Bobby Azad, Reza Azad, Dorit Merhof
  • for: This survey provides a comprehensive overview of implicit neural representations (INRs) in the field of medical imaging, exploring their applications and advantages in various medical imaging tasks.
  • methods: The survey discusses the use of INRs in image reconstruction, segmentation, registration, novel view synthesis, and compression, highlighting their resolution-agnostic nature, memory efficiency, ability to avoid locality biases, and differentiability.
  • results: The survey addresses the challenges and considerations specific to medical imaging data, such as data availability, computational complexity, and dynamic clinical scene analysis, and identifies future research directions and opportunities, including integration with multi-modal imaging, real-time and interactive systems, and domain adaptation for clinical decision support.
    Abstract Implicit neural representations (INRs) have gained prominence as a powerful paradigm in scene reconstruction and computer graphics, demonstrating remarkable results. By utilizing neural networks to parameterize data through implicit continuous functions, INRs offer several benefits. Recognizing the potential of INRs beyond these domains, this survey aims to provide a comprehensive overview of INR models in the field of medical imaging. In medical settings, numerous challenging and ill-posed problems exist, making INRs an attractive solution. The survey explores the application of INRs in various medical imaging tasks, such as image reconstruction, segmentation, registration, novel view synthesis, and compression. It discusses the advantages and limitations of INRs, highlighting their resolution-agnostic nature, memory efficiency, ability to avoid locality biases, and differentiability, enabling adaptation to different tasks. Furthermore, the survey addresses the challenges and considerations specific to medical imaging data, such as data availability, computational complexity, and dynamic clinical scene analysis. It also identifies future research directions and opportunities, including integration with multi-modal imaging, real-time and interactive systems, and domain adaptation for clinical decision support. To facilitate further exploration and implementation of INRs in medical image analysis, we have provided a compilation of cited studies along with their available open-source implementations at https://github.com/mindflow-institue/Awesome-Implicit-Neural-Representations-in-Medical-imaging. Finally, we aim to consistently incorporate the most recent and relevant papers regularly.
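    Summary Implicit neural representations (INRs) have become prominent in scene reconstruction and computer graphics, with remarkable results. By using neural networks to parameterize data through implicit continuous functions, INRs offer several benefits. Recognizing their potential beyond these domains, this survey gives a comprehensive overview of INR models in medical imaging. Medical settings contain many challenging and ill-posed problems, which makes INRs an attractive solution. The survey explores INR applications in medical imaging tasks such as image reconstruction, segmentation, registration, novel view synthesis, and compression. It discusses the advantages and limitations of INRs, including their resolution-agnostic nature, memory efficiency, ability to avoid locality biases, and differentiability, which enable adaptation to different tasks. It also covers challenges specific to medical imaging data, such as data availability, computational complexity, and dynamic clinical scene analysis, and identifies future research directions, including integration with multi-modal imaging, real-time and interactive systems, and domain adaptation for clinical decision support. A compilation of the cited studies and their available open-source implementations is provided at https://github.com/mindflow-institue/Awesome-Implicit-Neural-Representations-in-Medical-imaging.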
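
The "resolution-agnostic" property of INRs can be shown with a minimal sketch: the signal is a function of a continuous coordinate, so it can be sampled on any grid while the parameter count stays fixed. The class below is hypothetical and untrained (random weights, sine activation, 1D input); it illustrates the querying interface only, not any specific model from the survey.

```python
import math
import random


class TinyINR:
    """A one-hidden-layer network mapping a coordinate in [0, 1] to an intensity."""

    def __init__(self, hidden=16, seed=0):
        rng = random.Random(seed)
        # Untrained, randomly initialized weights (illustrative only).
        self.w1 = [rng.uniform(-3.0, 3.0) for _ in range(hidden)]
        self.b1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
        self.w2 = [rng.uniform(-1.0, 1.0) / hidden for _ in range(hidden)]

    def __call__(self, x):
        # Sine activation on the hidden layer, linear output layer.
        return sum(w2 * math.sin(w1 * x + b1)
                   for w1, b1, w2 in zip(self.w1, self.b1, self.w2))

    def num_params(self):
        return len(self.w1) + len(self.b1) + len(self.w2)


def sample(inr, n):
    """Query the representation on a regular grid of n points in [0, 1]."""
    return [inr(i / (n - 1)) for i in range(n)]


inr = TinyINR()
coarse = sample(inr, 5)   # low-resolution view of the signal
fine = sample(inr, 9)     # higher-resolution view, same parameters
print(len(coarse), len(fine), inr.num_params())
```

Both grids are views of the same underlying function, so the fine samples agree with the coarse ones wherever the coordinates coincide, and memory use is set by the network size rather than by the sampling resolution.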

RIS-Enhanced Semantic Communications Adaptive to User Requirements

  • paper_url: http://arxiv.org/abs/2307.16100
  • repo_url: None
  • paper_authors: Peiwen Jiang, Chao-Kai Wen, Shi Jin, Geoffrey Ye Li
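  • for: This paper proposes a semantic communication framework enhanced by reconfigurable intelligent surfaces (RIS) that adapts to changing user requirements and environments.
  • methods: The RIS-SC framework uses RIS to customize the channel and allocates semantic contents with varying levels of RIS assistance, taking user movement and line-of-sight obstructions into account. This addresses the limited adaptability of existing methods that rely on joint source-channel coding design and end-to-end training.
  • results: Simulation results indicate that the proposed RIS-SC framework can achieve reasonable task performance and adapt to diverse channel conditions and user requirements. However, under severe channel conditions, some semantic parts may be abandoned. To address this issue, a reconstruction method is introduced to improve visual acceptance by inferring missing semantic parts. Additionally, the framework can efficiently allocate RIS resources among multiple users in friendly channel conditions.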
    Abstract Semantic communication significantly reduces required bandwidth by understanding semantic meaning of the transmitted. However, current deep learning-based semantic communication methods rely on joint source-channel coding design and end-to-end training, which limits their adaptability to new physical channels and user requirements. Reconfigurable intelligent surfaces (RIS) offer a solution by customizing channels in different environments. In this study, we propose the RIS-SC framework, which allocates semantic contents with varying levels of RIS assistance to satisfy the changing user requirements. It takes into account user movement and line-of-sight obstructions, enabling the RIS resource to protect important semantics in challenging channel conditions. The simulation results indicate reasonable task performance, but some semantic parts that have no effect on task performances are abandoned under severe channel conditions. To address this issue, a reconstruction method is also introduced to improve visual acceptance by inferring those missing semantic parts. Furthermore, the framework can adjust RIS resources in friendly channel conditions to save and allocate them efficiently among multiple users. Simulation results demonstrate the adaptability and efficiency of the RIS-SC framework across diverse channel conditions and user requirements.
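    Summary Semantic communication reduces the required bandwidth by understanding the semantic meaning of what is transmitted. However, current deep learning-based semantic communication methods rely on joint source-channel coding design and end-to-end training, which limits their adaptability to new physical channels and user requirements. Reconfigurable intelligent surfaces (RIS) offer a solution by customizing channels in different environments. In this study, we propose the RIS-SC framework, which allocates semantic contents with varying levels of RIS assistance to satisfy changing user requirements. It accounts for user movement and line-of-sight obstructions, so that RIS resources can protect important semantics under challenging channel conditions. Simulation results indicate reasonable task performance, although some semantic parts that do not affect task performance are abandoned under severe channel conditions. To address this, we also introduce a reconstruction method that improves visual acceptance by inferring the missing semantic parts. In addition, the framework can adjust RIS resources under friendly channel conditions to allocate them efficiently among multiple users. Simulation results demonstrate the adaptability and efficiency of the RIS-SC framework across diverse channel conditions and user requirements.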

A New Multi-Level Hazy Image and Video Dataset for Benchmark of Dehazing Methods

  • paper_url: http://arxiv.org/abs/2307.16050
  • repo_url: None
  • paper_authors: Bedrettin Cetinkaya, Yucel Cimtay, Fatma Nazli Gunay, Gokce Nur Yilmaz
  • for: This study aims to present a new multi-level hazy color image dataset and compare the dehazing performance of five different dehazing methods/models.
  • methods: The study uses color video data captured for two real scenes with controlled levels of haze, and the dehazing performance is evaluated based on SSIM, PSNR, VSI, and DISTS image quality metrics.
  • results: The results show that traditional methods can generalize the dehazing problem better than many deep learning-based methods, and the performance of deep models depends mostly on the scene and is generally poor on cross-dataset dehazing.
    Abstract The changing level of haze is one of the main factors which affects the success of the proposed dehazing methods. However, there is a lack of controlled multi-level hazy dataset in the literature. Therefore, in this study, a new multi-level hazy color image dataset is presented. Color video data is captured for two real scenes with a controlled level of haze. The distance of the scene objects from the camera, haze level, and ground truth (clear image) are available so that different dehazing methods and models can be benchmarked. In this study, the dehazing performance of five different dehazing methods/models is compared on the dataset based on SSIM, PSNR, VSI and DISTS image quality metrics. Results show that traditional methods can generalize the dehazing problem better than many deep learning based methods. The performance of deep models depends mostly on the scene and is generally poor on cross-dataset dehazing.
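    Summary The changing level of haze is one of the main factors affecting the success of proposed dehazing methods, yet the literature lacks a controlled multi-level hazy dataset. This study therefore presents a new multi-level hazy color image dataset. Color video data were captured for two real scenes with a controlled level of haze, and the haze level, the distance of scene objects from the camera, and the ground truth (clear image) are available so that different dehazing methods and models can be benchmarked. Five dehazing methods/models are compared on the dataset. The results show that traditional methods generalize better across scenes, while the performance of deep learning-based methods depends on the scene and is generally poor on cross-dataset dehazing.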
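
Of the four metrics named in the abstract, PSNR is the simplest to write down, so a minimal sketch is given here. It assumes 8-bit grayscale images stored as nested lists and uses tiny made-up patches; SSIM, VSI, and DISTS are more involved and are not covered.

```python
import math


def mse(reference, test):
    """Mean squared error between two images of the same shape."""
    count = 0
    total = 0.0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(reference, test)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Toy 2x2 patches: a clear ground truth and two dehazed candidates.
ground_truth = [[100, 100], [100, 100]]
dehazed_good = [[110, 90], [100, 100]]
dehazed_poor = [[140, 60], [130, 70]]

print(psnr(ground_truth, dehazed_good) > psnr(ground_truth, dehazed_poor))
```

A higher PSNR means the dehazed output is closer to the clear ground-truth frame, which is why a dataset with ground truth at each haze level makes this kind of full-reference comparison possible.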

CoVid-19 Detection leveraging Vision Transformers and Explainable AI

  • paper_url: http://arxiv.org/abs/2307.16033
  • repo_url: None
  • paper_authors: Pangoth Santhosh Kumar, Kundrapu Supriya, Mallikharjuna Rao K
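  • for: This study targets early diagnosis of lung diseases to improve patients' chances of survival and quality of life.
  • methods: The study proposes an end-to-end vision transformer-based framework that covers data augmentation, model training, and evaluation, built around a Compact Convolution Transformers (CCT) model. Earlier deep learning strategies such as convolutional neural networks (CNN), vanilla neural networks, visual geometry group based networks (VGG), and capsule networks are discussed as context.
  • results: The CCT model was tested and evaluated on the Covid 19 Radiography Database, where it achieved better accuracy in both training and validation.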
    Abstract Lung disease is a common health problem in many parts of the world. It is a significant risk to people health and quality of life all across the globe since it is responsible for five of the top thirty leading causes of death. Among them are COVID 19, pneumonia, and tuberculosis, to name just a few. It is critical to diagnose lung diseases in their early stages. Several different models including machine learning and image processing have been developed for this purpose. The earlier a condition is diagnosed, the better the patient chances of making a full recovery and surviving into the long term. Thanks to deep learning algorithms, there is significant promise for the autonomous, rapid, and accurate identification of lung diseases based on medical imaging. Several different deep learning strategies, including convolutional neural networks (CNN), vanilla neural networks, visual geometry group based networks (VGG), and capsule networks , are used for the goal of making lung disease forecasts. The standard CNN has a poor performance when dealing with rotated, tilted, or other aberrant picture orientations. As a result of this, within the scope of this study, we have suggested a vision transformer based approach end to end framework for the diagnosis of lung disorders. In the architecture, data augmentation, training of the suggested models, and evaluation of the models are all included. For the purpose of detecting lung diseases such as pneumonia, Covid 19, lung opacity, and others, a specialised Compact Convolution Transformers (CCT) model have been tested and evaluated on datasets such as the Covid 19 Radiography Database. The model has achieved a better accuracy for both its training and validation purposes on the Covid 19 Radiography Database.
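    Summary Lung disease is a common health problem in many parts of the world. It poses a significant threat to people's health and quality of life because it accounts for five of the top thirty leading causes of death worldwide, among them COVID-19, pneumonia, and tuberculosis, which makes early diagnosis very important. Many different models, including machine learning and image processing, have been developed for this purpose. Deep learning algorithms hold significant promise for diagnosing lung diseases from medical images. We propose an end-to-end vision transformer-based framework for lung disease diagnosis that covers data augmentation, model training, and model evaluation. To detect lung diseases such as pneumonia, COVID-19, and lung opacity, a specialized Compact Convolution Transformers (CCT) model was tested and evaluated on the Covid 19 Radiography Database, where it achieved better accuracy in both training and validation.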

LOTUS: Learning to Optimize Task-based US representations

  • paper_url: http://arxiv.org/abs/2307.16021
  • repo_url: None
  • paper_authors: Yordanka Velikova, Mohammad Farid Azampour, Walter Simson, Vanessa Gonzalez Duque, Nassir Navab
  • for: The paper is written for the task of anatomical segmentation of organs in ultrasound images, specifically for diagnosis and monitoring.
  • methods: The paper proposes a novel approach for learning to optimize task-based ultrasound image representations, using annotated CT segmentation maps as a simulation medium to generate ultrasound training data. The approach includes a fully differentiable ultrasound simulator that learns to optimize the parameters for generating physics-based ultrasound images guided by the downstream segmentation task, as well as an image adaptation network between real and simulated images to achieve simultaneous image synthesis and automatic segmentation on US images in an end-to-end training setting.
  • results: The proposed method is evaluated on aorta and vessel segmentation tasks and shows promising quantitative results, as well as qualitative results of optimized image representations on other organs.
    Abstract Anatomical segmentation of organs in ultrasound images is essential to many clinical applications, particularly for diagnosis and monitoring. Existing deep neural networks require a large amount of labeled data for training in order to achieve clinically acceptable performance. Yet, in ultrasound, due to characteristic properties such as speckle and clutter, it is challenging to obtain accurate segmentation boundaries, and precise pixel-wise labeling of images is highly dependent on the expertise of physicians. In contrast, CT scans have higher resolution and improved contrast, easing organ identification. In this paper, we propose a novel approach for learning to optimize task-based ultra-sound image representations. Given annotated CT segmentation maps as a simulation medium, we model acoustic propagation through tissue via ray-casting to generate ultrasound training data. Our ultrasound simulator is fully differentiable and learns to optimize the parameters for generating physics-based ultrasound images guided by the downstream segmentation task. In addition, we train an image adaptation network between real and simulated images to achieve simultaneous image synthesis and automatic segmentation on US images in an end-to-end training setting. The proposed method is evaluated on aorta and vessel segmentation tasks and shows promising quantitative results. Furthermore, we also conduct qualitative results of optimized image representations on other organs.
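    Summary Anatomical segmentation of organs in ultrasound images is essential to many clinical applications, particularly diagnosis and monitoring. Existing deep neural networks need a large amount of labeled data to reach clinically acceptable performance. In ultrasound, however, characteristic properties such as speckle and clutter make accurate segmentation boundaries hard to obtain, and precise pixel-wise labeling depends heavily on the expertise of physicians. CT scans, in contrast, have higher resolution and better contrast, which eases organ identification. In this paper, we propose a new approach for learning to optimize task-based ultrasound image representations. We model acoustic propagation through tissue via ray-casting to generate ultrasound training data. Our ultrasound simulator is fully differentiable and learns to optimize the parameters for generating physics-based ultrasound images, guided by the downstream segmentation task. We also train an image adaptation network to achieve simultaneous image synthesis and automatic segmentation. The proposed method shows promising quantitative results on aorta and vessel segmentation tasks, and we also present qualitative results of optimized image representations on other organs.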
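
The ray-casting idea, turning a labeled CT map into echo intensities, can be sketched in one dimension. This is a simplified illustration and not the LOTUS simulator: it uses the standard intensity reflection coefficient at a tissue interface, ignores attenuation and speckle, and fixes the per-tissue acoustic impedances to rough illustrative values, whereas the paper describes learning such parameters.

```python
# Rough, illustrative acoustic impedances in MRayl (assumed values).
IMPEDANCE = {"fat": 1.38, "soft_tissue": 1.63, "blood": 1.61, "bone": 7.8}


def reflection_coefficient(z1, z2):
    """Fraction of intensity reflected at an interface between impedances z1, z2."""
    return ((z2 - z1) / (z2 + z1)) ** 2


def cast_ray(labels, impedance=IMPEDANCE):
    """Walk along one ray of tissue labels and return the echo at each sample.

    The echo at a sample is the remaining energy times the reflection
    coefficient of the interface entered at that sample; the rest is transmitted.
    """
    energy = 1.0
    echoes = [0.0]  # no interface before the first sample
    for previous, current in zip(labels, labels[1:]):
        r = reflection_coefficient(impedance[previous], impedance[current])
        echoes.append(energy * r)
        energy *= 1.0 - r
    return echoes


# One column of a hypothetical CT label map, from the transducer downwards.
ray = ["fat", "fat", "soft_tissue", "soft_tissue", "bone", "bone"]
echoes = cast_ray(ray)
print([round(e, 4) for e in echoes])
```

Echoes appear only where the label changes, and the tissue-to-bone interface reflects far more than the fat-to-tissue one. Because every step is a smooth function of the impedances, a version of this written in an autodiff framework could have those parameters tuned by gradients from a downstream loss.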