eess.IV - 2023-07-10

DWA: Differential Wavelet Amplifier for Image Super-Resolution

  • paper_url: http://arxiv.org/abs/2307.04593
  • repo_url: None
  • paper_authors: Brian B. Moser, Stanislav Frolov, Federico Raue, Sebastian Palacio, Andreas Dengel
  • for: This paper proposes a drop-in module, the Differential Wavelet Amplifier (DWA), for improving wavelet-based image Super-Resolution (SR).
  • methods: DWA builds on the Discrete Wavelet Transformation (DWT), an approach that has recently received less attention, to obtain an efficient image representation and reduce the spatial size of the input; features are refined via the difference between two convolutional filters (see the sketch after the abstract).
  • results: Experiments show that DWA improves wavelet-based SR models such as DWSR and MWCNN, particularly on classical SR tasks. Moreover, DWA enables a direct application to the input image space, reducing the DWT representation channel-wise since it omits the traditional DWT.
    Abstract This work introduces Differential Wavelet Amplifier (DWA), a drop-in module for wavelet-based image Super-Resolution (SR). DWA invigorates an approach recently receiving less attention, namely Discrete Wavelet Transformation (DWT). DWT enables an efficient image representation for SR and reduces the spatial area of its input by a factor of 4, the overall model size, and computation cost, framing it as an attractive approach for sustainable ML. Our proposed DWA model improves wavelet-based SR models by leveraging the difference between two convolutional filters to refine relevant feature extraction in the wavelet domain, emphasizing local contrasts and suppressing common noise in the input signals. We show its effectiveness by integrating it into existing SR models, e.g., DWSR and MWCNN, and demonstrate a clear improvement in classical SR tasks. Moreover, DWA enables a direct application of DWSR and MWCNN to input image space, reducing the DWT representation channel-wise since it omits traditional DWT.
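The core mechanism, refining features via the difference between two convolutional filters, can be sketched in a few lines of PyTorch. This is a minimal illustration under assumptions (the name DWABlock, the 3x3 kernels, the ReLU activation, and the residual connection are ours), not the paper's exact module.

```python
import torch
import torch.nn as nn

class DWABlock(nn.Module):
    """Minimal differential-amplifier sketch: subtracting two parallel
    convolution responses cancels components common to both branches
    (suppressing shared noise) and emphasizes local contrasts."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv_a = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv_b = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Difference of the two filter responses, plus a residual path.
        return x + self.act(self.conv_a(x) - self.conv_b(x))
```

For example, `DWABlock(64)(torch.randn(1, 64, 32, 32))` preserves the spatial size; in a wavelet pipeline the input would be the stacked DWT sub-bands.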

TFR: Texture Defect Detection with Fourier Transform using Normal Reconstructed Template of Simple Autoencoder

  • paper_url: http://arxiv.org/abs/2307.04574
  • repo_url: None
  • paper_authors: Jongwook Si, Sungyoung Kim
  • for: detection of texture defects in real-world images
  • methods: simple autoencoder + Fourier transform analysis (see the sketch after the abstract)
  • results: effective and accurate defect detection, demonstrated through experimental results
    Abstract Texture is an essential information in image representation, capturing patterns and structures. As a result, texture plays a crucial role in the manufacturing industry and is extensively studied in the fields of computer vision and pattern recognition. However, real-world textures are susceptible to defects, which can degrade image quality and cause various issues. Therefore, there is a need for accurate and effective methods to detect texture defects. In this study, a simple autoencoder and Fourier transform are employed for texture defect detection. The proposed method combines Fourier transform analysis with the reconstructed template obtained from the simple autoencoder. Fourier transform is a powerful tool for analyzing the frequency domain of images and signals. Moreover, since texture defects often exhibit characteristic changes in specific frequency ranges, analyzing the frequency domain enables effective defect detection. The proposed method demonstrates effectiveness and accuracy in detecting texture defects. Experimental results are presented to evaluate its performance and compare it with existing approaches.
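Since the method compares a test texture against the autoencoder's reconstructed normal template in the frequency domain, the scoring step can be sketched with NumPy. The function name defect_score and the log-magnitude mean-difference score are illustrative assumptions; the paper's exact formulation may differ.

```python
import numpy as np

def defect_score(image: np.ndarray, template: np.ndarray) -> float:
    """Compare the frequency content of a test texture with that of the
    autoencoder-reconstructed normal template; defects typically shift
    energy within specific frequency bands."""
    spec_img = np.fft.fftshift(np.abs(np.fft.fft2(image)))
    spec_tpl = np.fft.fftshift(np.abs(np.fft.fft2(template)))
    # Log scaling compresses the dynamic range of the magnitude spectra.
    return float(np.mean(np.abs(np.log1p(spec_img) - np.log1p(spec_tpl))))
```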

CoactSeg: Learning from Heterogeneous Data for New Multiple Sclerosis Lesion Segmentation

  • paper_url: http://arxiv.org/abs/2307.04513
  • repo_url: https://github.com/ycwu1997/coactseg
  • paper_authors: Yicheng Wu, Zhonghua Wu, Hengcan Shi, Bjoern Picker, Winston Chong, Jianfei Cai
  • for: improving the accuracy of new-lesion segmentation during multiple sclerosis (MS) clinical treatment, in order to estimate disease progression and therapeutic effects.
  • methods: exploiting heterogeneous data from different time points (two-time-point samples with new-lesion annotations and single-time-point samples with all-lesion annotations), a coaction segmentation (CoactSeg) framework is proposed to improve new-lesion segmentation.
  • results: for both new-lesion and all-lesion segmentation tasks, exploiting the heterogeneous data together with the proposed relation regularization significantly improves accuracy (one possible form of this regularizer is sketched after the abstract). The authors also release the MS-23v1 dataset, containing 38 Oceania single-time-point samples with all-lesion labels.
    Abstract New lesion segmentation is essential to estimate the disease progression and therapeutic effects during multiple sclerosis (MS) clinical treatments. However, the expensive data acquisition and expert annotation restrict the feasibility of applying large-scale deep learning models. Since single-time-point samples with all-lesion labels are relatively easy to collect, exploiting them to train deep models is highly desirable to improve new lesion segmentation. Therefore, we proposed a coaction segmentation (CoactSeg) framework to exploit the heterogeneous data (i.e., new-lesion annotated two-time-point data and all-lesion annotated single-time-point data) for new MS lesion segmentation. The CoactSeg model is designed as a unified model, with the same three inputs (the baseline, follow-up, and their longitudinal brain differences) and the same three outputs (the corresponding all-lesion and new-lesion predictions), no matter which type of heterogeneous data is being used. Moreover, a simple and effective relation regularization is proposed to ensure the longitudinal relations among the three outputs to improve the model learning. Extensive experiments demonstrate that utilizing the heterogeneous data and the proposed longitudinal relation constraint can significantly improve the performance for both new-lesion and all-lesion segmentation tasks. Meanwhile, we also introduce an in-house MS-23v1 dataset, including 38 Oceania single-time-point samples with all-lesion labels. Codes and the dataset are released at https://github.com/ycwu1997/CoactSeg.
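One plausible reading of the longitudinal relation, that new lesions are those present at follow-up but absent at baseline, can be written as a simple regularizer on the three per-voxel probability outputs. This is a hedged sketch; the paper's actual relation regularization may be defined differently.

```python
import torch
import torch.nn.functional as F

def relation_loss(all_baseline: torch.Tensor,
                  all_followup: torch.Tensor,
                  new_lesion: torch.Tensor) -> torch.Tensor:
    """Encourage the new-lesion prediction to match lesions that appear
    at follow-up but were not present at baseline. All inputs are
    per-voxel probabilities in [0, 1] with identical shapes."""
    expected_new = torch.clamp(all_followup - all_baseline, min=0.0)
    return F.mse_loss(new_lesion, expected_new)
```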

SAM-IQA: Can Segment Anything Boost Image Quality Assessment?

  • paper_url: http://arxiv.org/abs/2307.04455
  • repo_url: https://github.com/hedlen/sam-iqa
  • paper_authors: Xinpeng Li, Ting Jiang, Haoqiang Fan, Shuaicheng Liu
  • for: improving the accuracy of Image Quality Assessment (IQA), a task that requires training on massive datasets.
  • methods: the encoder of the Segment Anything model is used for high-level semantic feature extraction, and Fourier and standard convolutions extract frequency-domain and spatial-domain features, respectively (a sketch of such a two-branch head follows the abstract).
  • results: extensive experiments on four representative datasets show the method outperforms the state of the art, both qualitatively and quantitatively.
    Abstract Image Quality Assessment (IQA) is a challenging task that requires training on massive datasets to achieve accurate predictions. However, due to the lack of IQA data, deep learning-based IQA methods typically rely on pre-trained networks trained on massive datasets as feature extractors to enhance their generalization ability, such as the ResNet network trained on ImageNet. In this paper, we utilize the encoder of Segment Anything, a recently proposed segmentation model trained on a massive dataset, for high-level semantic feature extraction. Most IQA methods are limited to extracting spatial-domain features, while frequency-domain features have been shown to better represent noise and blur. Therefore, we leverage both spatial-domain and frequency-domain features by applying Fourier and standard convolutions on the extracted features, respectively. Extensive experiments are conducted to demonstrate the effectiveness of all the proposed components, and results show that our approach outperforms the state-of-the-art (SOTA) in four representative datasets, both qualitatively and quantitatively. Our experiments confirm the powerful feature extraction capabilities of Segment Anything and highlight the value of combining spatial-domain and frequency-domain features in IQA tasks. Code: https://github.com/Hedlen/SAM-IQA
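A two-branch head over frozen Segment Anything encoder features, with one standard convolution branch for spatial-domain features and one FFT-based branch for frequency-domain features, might look as follows. The class name, channel layout, and pooling head are assumptions, not the released SAM-IQA architecture.

```python
import torch
import torch.nn as nn

class SpatialFreqHead(nn.Module):
    """Combine a spatial conv branch with a frequency branch (2-D FFT of
    the features, real and imaginary parts stacked as channels)."""

    def __init__(self, channels: int):
        super().__init__()
        self.spatial = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.freq = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.score = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(2 * channels, 1))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        s = self.spatial(feats)
        f = torch.fft.fft2(feats, norm="ortho")
        f = self.freq(torch.cat([f.real, f.imag], dim=1))
        # Fuse both branches into a single scalar quality score per image.
        return self.score(torch.cat([s, f], dim=1))
```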

Identification of Hemorrhage and Infarct Lesions on Brain CT Images using Deep Learning

  • paper_url: http://arxiv.org/abs/2307.04425
  • repo_url: None
  • paper_authors: Arunkumar Govindarajan, Arjun Agarwal, Subhankar Chattoraj, Dennis Robert, Satish Golla, Ujjwal Upadhyay, Swetha Tanamala, Aarthi Govindarajan
  • for: identification of abnormalities on non-contrast computed tomography (NCCT) head scans
  • methods: a deep learning (DL)-based computer-aided diagnostic (CAD) model
  • results: the potential and limitations of automated identification of hemorrhage and infarct lesions on head NCCT scans
    Abstract Head Non-contrast computed tomography (NCCT) scans remain the preferred primary imaging modality due to their widespread availability and speed. However, the current standard for manual annotation of abnormal brain tissue on head NCCT scans involves significant disadvantages, such as the lack of cutoff standardization and degeneration identification. The recent advancement of deep learning-based computer-aided diagnostic (CAD) models in the multidisciplinary domain has created vast opportunities in neurological medical imaging. Significant literature has been published earlier on the automated identification of brain tissue on different imaging modalities. However, determining intracranial hemorrhage (ICH) and infarct can be challenging due to variability in image texture, volume size, and scan quality. This retrospective validation study evaluated a DL-based algorithm identifying ICH and infarct from head-NCCT scans. The head-NCCT scan dataset was collected consecutively from multiple diagnostic imaging centers across India. The study exhibits the potential and limitations of such DL-based software for introduction into routine workflow in extensive healthcare facilities.

Towards Enabling Cardiac Digital Twins of Myocardial Infarction Using Deep Computational Models for Inverse Inference

  • paper_url: http://arxiv.org/abs/2307.04421
  • repo_url: None
  • paper_authors: Lei Li, Julia Camps, Zhinuo Wang, Abhirup Banerjee, Marcel Beetz, Blanca Rodriguez, Vicente Grau
  • for: developing a cardiac digital twin (CDT) platform for personalized, non-invasive diagnosis and treatment planning of myocardial infarction (MI).
  • methods: multi-modal data, including cardiac MRI and the electrocardiogram (ECG), are integrated to improve the accuracy and reliability of the inferred myocardial tissue properties; a deep computational model infers infarct location and distribution from the simulated QRS complex (see the sketch after the abstract).
  • results: in silico experiments show that the model effectively captures the complex relationships between QRS signals and the corresponding infarct regions, with promising potential for future clinical application.
    Abstract Myocardial infarction (MI) demands precise and swift diagnosis. Cardiac digital twins (CDTs) have the potential to offer individualized evaluation of cardiac function in a non-invasive manner, making them a promising approach for personalized diagnosis and treatment planning of MI. The inference of accurate myocardial tissue properties is crucial in creating a reliable CDT platform, and particularly in the context of studying MI. In this work, we investigate the feasibility of inferring myocardial tissue properties from the electrocardiogram (ECG), focusing on the development of a comprehensive CDT platform specifically designed for MI. The platform integrates multi-modal data, such as cardiac MRI and ECG, to enhance the accuracy and reliability of the inferred tissue properties. We perform a sensitivity analysis based on computer simulations, systematically exploring the effects of infarct location, size, degree of transmurality, and electrical activity alteration on the simulated QRS complex of ECG, to establish the limits of the approach. We subsequently propose a deep computational model to infer infarct location and distribution from the simulated QRS. The in silico experimental results show that our model can effectively capture the complex relationships between the QRS signals and the corresponding infarct regions, with promising potential for clinical application in the future. The code will be released publicly once the manuscript is accepted for publication.
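The inverse-inference step, mapping simulated multi-lead QRS segments to infarct location and distribution, could be prototyped as a small 1-D CNN. The architecture below is an assumption for illustration (including the use of the AHA 17-segment parcellation for location labels), not the paper's model.

```python
import torch
import torch.nn as nn

class QRSInfarctNet(nn.Module):
    """Map multi-lead QRS segments (batch, leads, samples) to per-region
    infarct logits, e.g., over the AHA 17-segment parcellation."""

    def __init__(self, n_leads: int = 12, n_regions: int = 17):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(n_leads, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten())
        self.head = nn.Linear(64, n_regions)

    def forward(self, qrs: torch.Tensor) -> torch.Tensor:
        # Pool over time and classify which regions are infarcted.
        return self.head(self.encoder(qrs))
```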

K-Space-Aware Cross-Modality Score for Synthesized Neuroimage Quality Assessment

  • paper_url: http://arxiv.org/abs/2307.04296
  • repo_url: None
  • paper_authors: Jinbao Wang, Guoyang Xie, Yawen Huang, Jiayi Lyu, Feng Zheng, Yefeng Zheng, Yaochu Jin
  • for: This paper addresses the largely unexplored problem of assessing cross-modality medical image synthesis; widely used measures such as PSNR and SSIM analyze structural features but neglect lesion location and the fundamental k-space characteristics of medical images.
  • methods: The proposed method, called K-CROSS, uses a pre-trained multi-modality segmentation network to predict lesion locations, together with a tumor encoder to represent features such as texture details and brightness intensities. Both k-space features and vision features are obtained and employed in comprehensive encoders with a frequency reconstruction penalty (an illustrative k-space computation follows the abstract). The structure-shared encoders are designed and constrained with a similarity loss to capture the intrinsic common structural information for both modalities.
  • results: The proposed method outperforms other metrics, especially when compared with radiologists on a large-scale cross-modality neuroimaging perceptual similarity (NIRPS) dataset with 6,000 radiologist judgments.
    Abstract The problem of how to assess cross-modality medical image synthesis has been largely unexplored. The most used measures like PSNR and SSIM focus on analyzing the structural features but neglect the crucial lesion location and fundamental k-space speciality of medical images. To overcome this problem, we propose a new metric K-CROSS to spur progress on this challenging problem. Specifically, K-CROSS uses a pre-trained multi-modality segmentation network to predict the lesion location, together with a tumor encoder for representing features, such as texture details and brightness intensities. To further reflect the frequency-specific information from the magnetic resonance imaging principles, both k-space features and vision features are obtained and employed in our comprehensive encoders with a frequency reconstruction penalty. The structure-shared encoders are designed and constrained with a similarity loss to capture the intrinsic common structural information for both modalities. As a consequence, the features learned from lesion regions, k-space, and anatomical structures are all captured, which serve as our quality evaluators. We evaluate the performance by constructing a large-scale cross-modality neuroimaging perceptual similarity (NIRPS) dataset with 6,000 radiologist judgments. Extensive experiments demonstrate that the proposed method outperforms other metrics, especially in comparison with the radiologists on NIRPS.
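The k-space side of the metric rests on the fact that the 2-D Fourier transform of an MR image slice recovers a k-space representation. A minimal sketch of such a frequency feature map is given below; feeding its log magnitude to an encoder is our simplification, not the full K-CROSS pipeline.

```python
import torch

def kspace_log_magnitude(image: torch.Tensor) -> torch.Tensor:
    """Return a log-magnitude k-space map of a (batch, H, W) MR slice,
    with low frequencies shifted to the center."""
    k = torch.fft.fftshift(torch.fft.fft2(image, norm="ortho"), dim=(-2, -1))
    return torch.log1p(k.abs())
```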