eess.IV - 2023-07-08

Lightweight Improved Residual Network for Efficient Inverse Tone Mapping

  • paper_url: http://arxiv.org/abs/2307.03998
  • repo_url: None
  • paper_authors: Liqi Xue, Tianyi Xu, Yongbao Song, Yan Liu, Lei Zhang, Xiantong Zhen, Jun Xu
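  • for: Efficient inverse tone mapping (ITM) for converting SDR images to HDR images.
  • methods: Proposes a lightweight Improved Residual Network (IRNet), built on an enhanced residual block called the Improved Residual Block (IRB), for fine-grained HDR image reconstruction.
  • results: Achieves state-of-the-art performance on three benchmark datasets for both the ITM and joint SR-ITM tasks.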
    Abstract The display devices like HDR10 televisions are increasingly prevalent in our daily life for visualizing high dynamic range (HDR) images. But the majority of media images on the internet remain in 8-bit standard dynamic range (SDR) format. Therefore, converting SDR images to HDR ones by inverse tone mapping (ITM) is crucial to unlock the full potential of abundant media images. However, existing ITM methods are usually developed with complex network architectures requiring huge computational costs. In this paper, we propose a lightweight Improved Residual Network (IRNet) by enhancing the power of popular residual block for efficient ITM. Specifically, we propose a new Improved Residual Block (IRB) to extract and fuse multi-layer features for fine-grained HDR image reconstruction. Experiments on three benchmark datasets demonstrate that our IRNet achieves state-of-the-art performance on both the ITM and joint SR-ITM tasks. The code, models and data will be publicly available at https://github.com/ThisisVikki/ITM-baseline.
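    Summary Display devices such as HDR10 televisions are increasingly common in daily life for viewing high dynamic range (HDR) images, yet most images on the internet remain in 8-bit standard dynamic range (SDR) format. Converting SDR images to HDR by inverse tone mapping (ITM) is therefore important for unlocking the potential of this abundant media. Existing ITM methods, however, typically rely on complex network architectures with high computational cost. This paper proposes a lightweight Improved Residual Network (IRNet) that strengthens the popular residual block, introducing a new Improved Residual Block (IRB) that extracts and fuses multi-layer features for fine-grained HDR image reconstruction. Experiments show that IRNet achieves state-of-the-art performance on both the ITM and joint SR-ITM tasks. The code, models, and data will be made publicly available on GitHub.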

Ariadne’s Thread: Using Text Prompts to Improve Segmentation of Infected Areas from Chest X-ray Images

  • paper_url: http://arxiv.org/abs/2307.03942
  • repo_url: https://github.com/junelin2333/languidemedseg-miccai2023
  • paper_authors: Yi Zhong, Mengqiu Xu, Kongming Liang, Kaixin Chen, Ming Wu
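  • for: Improving the assessment of lung disease by proposing a language-driven medical image segmentation method that yields more accurate segmentation of infected areas.
  • methods: Uses text prompts to improve segmentation results, evaluated on the QaTa-COV19 dataset against uni-modal (image-only) methods.
  • results: On QaTa-COV19, the language-driven method improves the Dice score by at least 6.09% over uni-modal methods, and it shows a significant advantage in the amount of training data required.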
    Abstract Segmentation of the infected areas of the lung is essential for quantifying the severity of lung diseases such as pulmonary infections. Existing medical image segmentation methods are almost all uni-modal methods based on images. However, these image-only methods tend to produce inaccurate results unless trained with large amounts of annotated data. To overcome this challenge, we propose a language-driven segmentation method that uses text prompts to improve the segmentation result. Experiments on the QaTa-COV19 dataset indicate that our method improves the Dice score by at least 6.09% compared to the uni-modal methods. In addition, our extended study reveals the flexibility of multi-modal methods in terms of the information granularity of text and demonstrates that multi-modal methods have a significant advantage over image-only methods in terms of the size of training data required.
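    Summary Accurate segmentation of infected regions is essential for assessing lung disease. Existing medical image segmentation methods are almost all uni-modal and image-based, and these image-only methods tend to produce inaccurate results unless the training set is large. To address this, the paper proposes a language-driven segmentation method that uses text prompts to improve the segmentation result. Experiments on the QaTa-COV19 dataset show an improvement in Dice score of at least 6.09%. An extended study further shows that multi-modal methods are flexible with respect to the granularity of the text information and hold an advantage in the amount of training data required.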
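
    The Dice score reported for this paper measures the overlap between a predicted mask and a ground-truth mask. The following is a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), for flattened binary masks. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient for two flattened binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Number of pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_score(predicted, reference))
```

    A score of 1.0 means the masks coincide exactly and 0.0 means they share no foreground pixels.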

StyleGAN3: Generative Networks for Improving the Equivariance of Translation and Rotation

  • paper_url: http://arxiv.org/abs/2307.03898
  • repo_url: None
  • paper_authors: Tianlei Zhu, Junqi Chen, Renzhe Zhu, Gaurav Gupta
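  • for: Evaluating the performance differences between StyleGAN2 and two modified versions of StyleGAN3 for image generation.
  • methods: Uses the FFHQ dataset and assesses the models with the FID, EQ-T, and EQ-R metrics.
  • results: Finds that StyleGAN3 is the better generative network for improving equivariance, a result with positive implications for animation and video creation.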
    Abstract StyleGAN can use style to affect facial posture and identity features, and noise to affect hair, wrinkles, skin color, and other details. The outcomes of this image processing vary slightly between different versions of StyleGAN. The main focus of this study is therefore a comparison of the performance differences between StyleGAN2 and the two modified versions of StyleGAN3. We used the FFHQ dataset and assessed the models with FID, EQ-T, and EQ-R. We found that StyleGAN3 is the better generative network for improving equivariance. Our findings have a positive impact on the creation of animation and videos.
    Summary StyleGAN can use style to influence facial pose and identity features, and noise to influence details such as hair, skin color, and wrinkles. The image results differ subtly between versions of StyleGAN, so this study compares the performance of StyleGAN2 with two modified versions of StyleGAN3. The FFHQ dataset is used, and the models are evaluated with three metrics: FID, EQ-T, and EQ-R. The study finds that StyleGAN3 is the better generative network for improving equivariance, which has a positive impact on the creation of animation and video.

TBSS++: A novel computational method for Tract-Based Spatial Statistics

  • paper_url: http://arxiv.org/abs/2307.05387
  • repo_url: None
  • paper_authors: Davood Karimi, Hamza Kebiri, Ali Gholipour
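  • for: Improving the assessment of brain white matter in diffusion-weighted magnetic resonance imaging (dMRI), in particular cross-subject tract-specific analysis.
  • methods: Proposes a new computational framework that overcomes the shortcomings of existing methods through (i) accurate segmentation of the white matter tracts and (ii) precise registration of cross-subject data.
  • results: Compared with TBSS, the method offers significantly higher reproducibility and robustness to data perturbations.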
    Abstract Diffusion-weighted magnetic resonance imaging (dMRI) is widely used to assess the brain white matter. One of the most common computations in dMRI involves cross-subject tract-specific analysis, whereby dMRI-derived biomarkers are compared between cohorts of subjects. The accuracy and reliability of these studies hinges on the ability to compare precisely the same white matter tracts across subjects. This is an intricate and error-prone computation. Existing computational methods such as Tract-Based Spatial Statistics (TBSS) suffer from a host of shortcomings and limitations that can seriously undermine the validity of the results. We present a new computational framework that overcomes the limitations of existing methods via (i) accurate segmentation of the tracts, and (ii) precise registration of data from different subjects/scans. The registration is based on fiber orientation distributions. To further improve the alignment of cross-subject data, we create detailed atlases of white matter tracts. These atlases serve as an unbiased reference space where the data from all subjects is registered for comparison. Extensive evaluations show that, compared with TBSS, our proposed framework offers significantly higher reproducibility and robustness to data perturbations. Our method promises a drastic improvement in accuracy and reproducibility of cross-subject dMRI studies that are routinely used in neuroscience and medical research.
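    Summary Diffusion-weighted magnetic resonance imaging (dMRI) is widely used to assess brain white matter. One of the most common dMRI computations is cross-subject tract-specific analysis, in which dMRI-derived biomarkers are compared between cohorts of subjects. The accuracy and reliability of such studies depend on comparing precisely the same white matter tracts across subjects. Existing computational methods such as Tract-Based Spatial Statistics (TBSS) have serious shortcomings that can undermine the validity of the results. The paper presents a new computational framework that overcomes these limitations through (i) accurate segmentation of the tracts and (ii) precise registration of data from different subjects and scans, with the registration based on fiber orientation distributions. To further improve cross-subject alignment, detailed atlases of white matter tracts are created and serve as an unbiased reference space into which the data from all subjects are registered for comparison. Extensive evaluations show that, compared with TBSS, the proposed method offers higher reproducibility and robustness to data perturbations. The method promises a marked improvement in the accuracy and reproducibility of cross-subject dMRI studies in neuroscience and medical research.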

Invariant Scattering Transform for Medical Imaging

  • paper_url: http://arxiv.org/abs/2307.04771
  • repo_url: None
  • paper_authors: Nafisa Labiba Ishrat Huda, Angona Biswas, MD Abdullah Al Nasim, Md. Fahim Rahman, Shoaib Ahmed
  • for: This paper is written for researchers and practitioners in the field of medical image analysis and deep learning, particularly those interested in using scattering transform for efficient image classification.
  • methods: The paper uses a novel approach called scattering transform, which combines signal processing and deep learning for medical image analysis. The transform is based on a wavelet technique that builds a useful signal representation for image classification.
  • results: The paper presents a step-by-step case study illustrating the scattering transform as an efficient system for medical image analysis.
    Abstract The invariant scattering transform introduces a new area of research that merges signal processing with deep learning for computer vision. Nowadays, deep learning algorithms are able to solve a variety of problems in the medical sector. Medical images are used to detect diseases such as brain cancer or tumors, Alzheimer's disease, breast cancer, Parkinson's disease, and many others. During the pandemic in 2020, machine learning and deep learning played a critical role in detecting COVID-19, including mutation analysis, prediction, diagnosis, and decision making. Medical images such as X-rays, magnetic resonance imaging (MRI), and CT scans are used for detecting diseases. Another deep learning method for medical imaging is the scattering transform, a wavelet technique that builds useful signal representations for image classification and is impactful for medical image classification problems. This article discusses the scattering transform as an efficient system for medical image analysis, in which the signal information is scattered through a deep convolutional network. A step-by-step case study is presented.
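    Summary The invariant scattering transform opens a new research area that combines signal processing with deep learning for computer vision. Deep learning algorithms can now solve many problems in the medical field, and medical images are used to detect brain cancer or tumors, Alzheimer's disease, breast cancer, Parkinson's disease, and many other conditions. During the 2020 pandemic, machine learning and deep learning played a key role in detecting COVID-19, including mutation analysis, prediction, diagnosis, and decision making. Medical images such as X-rays, MRI (magnetic resonance imaging), and CT scans are used to detect disease. The scattering transform is another deep learning method for medical imaging: it builds useful signal representations for image classification. The article discusses the scattering transform as an effective system for medical image analysis, implemented by scattering signal information in a deep convolutional network, and provides a step-by-step case study.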

Coordinate-based neural representations for computational adaptive optics in widefield microscopy

  • paper_url: http://arxiv.org/abs/2307.03812
  • repo_url: https://github.com/iksungk/cocoa
  • paper_authors: Iksung Kang, Qinrong Zhang, Stella X. Yu, Na Ji
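  • for: Non-invasive widefield imaging of biological structures, improving image quality in complex specimens through machine-learning-based computational adaptive optics.
  • methods: CoCoA, a self-supervised algorithm that combines coordinate-based neural representations with a forward physics model to jointly estimate the wavefront and extract three-dimensional structural information from a single 3D image stack, without an external training dataset.
  • results: CoCoA was applied to widefield imaging of mouse brain tissue and validated against direct-wavefront-sensing-based adaptive optics, and the factors limiting its performance were systematically explored and quantified.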
    Abstract Widefield microscopy is widely used for non-invasive imaging of biological structures at subcellular resolution. When applied to complex specimen, its image quality is degraded by sample-induced optical aberration. Adaptive optics can correct wavefront distortion and restore diffraction-limited resolution but require wavefront sensing and corrective devices, increasing system complexity and cost. Here, we describe a self-supervised machine learning algorithm, CoCoA, that performs joint wavefront estimation and three-dimensional structural information extraction from a single input 3D image stack without the need for external training dataset. We implemented CoCoA for widefield imaging of mouse brain tissues and validated its performance with direct-wavefront-sensing-based adaptive optics. Importantly, we systematically explored and quantitatively characterized the limiting factors of CoCoA's performance. Using CoCoA, we demonstrated the first in vivo widefield mouse brain imaging using machine-learning-based adaptive optics. Incorporating coordinate-based neural representations and a forward physics model, the self-supervised scheme of CoCoA should be applicable to microscopy modalities in general.
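    Summary Widefield microscopy is widely used for non-invasive imaging of biological structures, but in complex specimens its image quality is degraded by sample-induced optical aberration. Adaptive optics can correct wavefront distortion and restore diffraction-limited resolution, but it requires wavefront sensing and corrective devices, which increase system complexity and cost. The paper describes a self-supervised machine learning algorithm, CoCoA, that performs joint wavefront estimation and three-dimensional structural information extraction from a single input 3D image stack, with no external training set. CoCoA is implemented for widefield imaging and validated with direct-wavefront-sensing-based adaptive optics. The limiting factors of its performance are systematically explored and quantified. Using CoCoA, the authors demonstrate the first in vivo widefield mouse brain imaging with machine-learning-based adaptive optics. Because it combines coordinate-based neural representations with a forward physics model, the self-supervised scheme should be applicable to microscopy modalities in general.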

Thoracic Cartilage Ultrasound-CT Registration using Dense Skeleton Graph

  • paper_url: http://arxiv.org/abs/2307.03800
  • repo_url: None
  • paper_authors: Zhongliang Jiang, Chenyang Li, Xuesong Li, Nassir Navab
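  • for: Enabling autonomous ultrasound (US) imaging, particularly for thoracic applications with high-acoustic-impedance bone structures beneath the skin.
  • methods: Uses a graph-based non-rigid registration that relies on subcutaneous bone surface features, rather than the skin surface, to transfer planned paths.
  • results: Planned paths are effectively transferred from CT to the current setup for displaying US views, with a Hausdorff distance of 9.48 ± 0.27 mm for the non-rigid registration and a path transfer error of 2.21 ± 1.11 mm.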
    Abstract Autonomous ultrasound (US) imaging has gained increased interest recently, and it has been seen as a potential solution to overcome the limitations of free-hand US examinations, such as inter-operator variations. However, it is still challenging to accurately map planned paths from a generic atlas to individual patients, particularly for thoracic applications with high acoustic-impedance bone structures under the skin. To address this challenge, a graph-based non-rigid registration is proposed to enable transferring planned paths from the atlas to the current setup by explicitly considering subcutaneous bone surface features instead of the skin surface. To this end, the sternum and cartilage branches are segmented using a template matching to assist coarse alignment of US and CT point clouds. Afterward, a directed graph is generated based on the CT template. Then, the self-organizing map using geographical distance is successively performed twice to extract the optimal graph representations for CT and US point clouds, individually. To evaluate the proposed approach, five cartilage point clouds from distinct patients are employed. The results demonstrate that the proposed graph-based registration can effectively map trajectories from CT to the current setup for displaying US views through limited intercostal space. The non-rigid registration results in terms of Hausdorff distance (Mean$\pm$SD) is 9.48$\pm$0.27 mm and the path transferring error in terms of Euclidean distance is 2.21$\pm$1.11 mm.
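    Summary Autonomous ultrasound (US) imaging has attracted growing interest in recent years and is seen as a way to overcome the limitations of free-hand US examinations. Accurately mapping planned paths from a generic atlas to the current setup remains challenging, especially for thoracic applications, because high-acoustic-impedance bone structures lie beneath the skin. To address this, a graph-based non-rigid registration is proposed that transfers planned paths from the atlas to the current setup by explicitly considering bone surface features rather than the skin surface. The sternum and cartilage branches are segmented using template matching. A directed graph is then generated from the CT template, and a self-organizing map using geographical distance is performed twice in succession to extract the optimal graph representations of the CT and US point clouds. The approach is evaluated on five cartilage point clouds from distinct patients. The results show that the graph-based registration can effectively map paths from CT to the current setup, with a non-rigid registration Hausdorff distance (mean ± SD) of 9.48 ± 0.27 mm and a path transfer error (Euclidean distance) of 2.21 ± 1.11 mm.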
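
    The registration quality above is reported as a Hausdorff distance between point clouds. As a minimal sketch of that metric (not the authors' evaluation code), the symmetric Hausdorff distance can be computed by brute force for small clouds. The function names and the toy coordinates below are illustrative assumptions.

```python
import math


def directed_hausdorff(source, target):
    """Largest distance from any source point to its nearest target point."""
    return max(min(math.dist(p, q) for q in target) for p in source)


def hausdorff_distance(cloud_a, cloud_b):
    """Symmetric Hausdorff distance between two non-empty point clouds."""
    return max(directed_hausdorff(cloud_a, cloud_b),
               directed_hausdorff(cloud_b, cloud_a))


if __name__ == "__main__":
    # Toy 3D point clouds (coordinates in mm), made up for illustration.
    ct_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    us_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 3.0)]
    print(hausdorff_distance(ct_points, us_points))
```

    The brute-force search costs O(n·m) distance evaluations. For large clouds a spatial index such as a k-d tree would be the usual choice.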

Motion Magnification in Robotic Sonography: Enabling Pulsation-Aware Artery Segmentation

  • paper_url: http://arxiv.org/abs/2307.03698
  • repo_url: https://github.com/dianyehuang/robpmepasnn
  • paper_authors: Dianye Huang, Yuan Bi, Nassir Navab, Zhongliang Jiang
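  • for: Diagnosing and monitoring arterial diseases with ultrasound, which is non-invasive, radiation-free, and real-time.
  • methods: Uses a pulsation-assisted segmentation neural network (PAS-NN) that exploits cardiac-induced pulsation signals to improve the accuracy and stability of artery segmentation.
  • results: In experiments on volunteers' carotid and radial arteries, PAS-NN achieved results comparable to the state of the art on the carotid artery and effectively improved segmentation performance for small vessels (radial artery).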
    Abstract Ultrasound (US) imaging is widely used for diagnosing and monitoring arterial diseases, mainly due to the advantages of being non-invasive, radiation-free, and real-time. In order to provide additional information to assist clinicians in diagnosis, the tubular structures are often segmented from US images. To improve the artery segmentation accuracy and stability during scans, this work presents a novel pulsation-assisted segmentation neural network (PAS-NN) by explicitly taking advantage of the cardiac-induced motions. Motion magnification techniques are employed to amplify the subtle motion within the frequency band of interest to extract the pulsation signals from sequential US images. The extracted real-time pulsation information can help to locate the arteries on cross-section US images; therefore, we explicitly integrated the pulsation into the proposed PAS-NN as attention guidance. Notably, a robotic arm is necessary to provide stable movement during US imaging since magnifying the target motions from the US images captured along a scan path is not manually feasible due to the hand tremor. To validate the proposed robotic US system for imaging arteries, experiments are carried out on volunteers' carotid and radial arteries. The results demonstrated that the PAS-NN could achieve comparable results as state-of-the-art on carotid and can effectively improve the segmentation performance for small vessels (radial artery).
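    Summary Ultrasound (US) imaging is widely used for diagnosing and monitoring arterial diseases, mainly because it is non-invasive, radiation-free, and real-time. To give clinicians additional diagnostic information, arterial structures are often segmented from US images. To improve the accuracy and stability of artery segmentation, this work proposes a novel pulsation-assisted segmentation neural network (PAS-NN). Motion magnification is used to amplify motion within the frequency band of interest in the US images in order to extract the pulsation signal of the artery. This real-time pulsation signal helps locate arteries in cross-sectional US images, so it is integrated directly into PAS-NN as attention guidance. A robotic arm is needed to provide stable movement during US imaging. To validate the robotic US system, experiments were carried out on volunteers' carotid and radial arteries. The results show that PAS-NN is comparable to the state of the art and effectively improves segmentation performance for small vessels (radial artery).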
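
    The idea of amplifying subtle motion within a frequency band of interest can be illustrated on a single pixel's intensity over time. The sketch below is a generic temporal band-pass magnification (isolate the band with a discrete Fourier transform, scale it, and add it back). It is an assumption-laden illustration, not the paper's pipeline: the function `magnify_band`, the amplification factor `alpha`, and the sampling values are all hypothetical.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform, adequate for short signals."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]


def inverse_dft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]


def magnify_band(signal, sample_rate, f_low, f_high, alpha):
    """Add alpha times the [f_low, f_high] Hz component back to the signal."""
    n = len(signal)
    spectrum = dft(signal)
    kept = []
    for k, coeff in enumerate(spectrum):
        # Bins k and n - k hold the same absolute frequency.
        freq = min(k, n - k) * sample_rate / n
        kept.append(coeff if f_low <= freq <= f_high else 0)
    band = [value.real for value in inverse_dft(kept)]
    return [s + alpha * b for s, b in zip(signal, band)]


if __name__ == "__main__":
    rate = 25.0  # hypothetical frames per second
    # A constant intensity plus a faint 1 Hz pulsation.
    pixel = [1.0 + 0.01 * math.sin(2 * math.pi * 1.0 * t / rate)
             for t in range(50)]
    magnified = magnify_band(pixel, rate, 0.5, 2.0, alpha=10.0)
    print(max(pixel) - 1.0, max(magnified) - 1.0)
```

    With `alpha = 10`, a pulsation of amplitude 0.01 inside the band becomes amplitude 0.11, while the constant background is left unchanged.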