eess.IV - 2023-08-16

Prediction of post-radiotherapy recurrence volumes in head and neck squamous cell carcinoma using 3D U-Net segmentation

  • paper_url: http://arxiv.org/abs/2308.08396
  • repo_url: None
  • paper_authors: Denis Kutnár, Ivan R Vogelius, Katrin Elisabet Håkansson, Jens Petersen, Jeppe Friborg, Lena Specht, Mogens Bernsdorf, Anita Gothelf, Claus Kristensen, Abraham George Smith
  • For: The paper investigates whether Convolutional Neural Networks (CNNs) can predict locoregional recurrences (LRR) in head and neck squamous cell carcinoma (HNSCC) patients based on pre-treatment imaging.
  • Methods: Pre-treatment 18F-fluorodeoxyglucose positron emission tomography (FDG-PET)/computed tomography (CT) scans are used to train a CNN to predict LRR volumes. The dataset is divided into training, validation, and test sets, and a CNN trained from scratch is compared against a pre-trained CNN and a SUVmax threshold approach.
  • Results: The SUVmax threshold method yielded a median volume of 4.6 cubic centimetres (cc) and included 5 out of 7 relapse origin points. The GTV contour and the best CNN segmentations each included the relapse origin 6 out of 7 times, with median volumes of 28 and 18 cc respectively. The CNN included the same or a greater number of relapse points of origin (POs) within significantly smaller relapse volumes.
    Abstract Locoregional recurrences (LRR) are still a frequent site of treatment failure for head and neck squamous cell carcinoma (HNSCC) patients. Identification of high risk subvolumes based on pretreatment imaging is key to biologically targeted radiation therapy. We investigated the extent to which a convolutional neural network (CNN) is able to predict LRR volumes based on pre-treatment 18F-fluorodeoxyglucose positron emission tomography (FDG-PET)/computed tomography (CT) scans in HNSCC patients and thus the potential to identify biological high risk volumes using CNNs. For 37 patients who had undergone primary radiotherapy for oropharyngeal squamous cell carcinoma, five oncologists contoured the relapse volumes on recurrence CT scans. Datasets of pre-treatment FDG-PET/CT, gross tumour volume (GTV) and contoured relapse for each of the patients were randomly divided into training (n=23), validation (n=7) and test (n=7) datasets. We compared a CNN trained from scratch, a pre-trained CNN, a SUVmax threshold approach, and using the GTV directly. The SUVmax threshold method included 5 out of the 7 relapse points of origin (POs) within a volume of median 4.6 cubic centimetres (cc). Both the GTV contour and best CNN segmentations included the relapse origin 6 out of 7 times with median volumes of 28 and 18 cc respectively. The CNN included the same or greater number of relapse volume POs, with significantly smaller relapse volumes. Our novel findings indicate that CNNs may predict LRR, yet further work on dataset development is required to attain clinically useful prediction accuracy.
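As a rough illustration of the SUVmax-threshold baseline, the sketch below segments the high-uptake subvolume of a PET scan and reports its volume in cc. The threshold fraction and voxel handling are assumptions for illustration; the paper's exact protocol is not reproduced here.

```python
import numpy as np

def suvmax_threshold_mask(pet_suv: np.ndarray, fraction: float = 0.9) -> np.ndarray:
    """Segment the high-uptake subvolume as all voxels at or above a fixed
    fraction of the maximum SUV in the PET volume (fraction is illustrative)."""
    return pet_suv >= fraction * pet_suv.max()

def mask_volume_cc(mask: np.ndarray, voxel_size_mm: tuple) -> float:
    """Convert a boolean voxel mask to a volume in cubic centimetres."""
    voxel_cc = np.prod(voxel_size_mm) / 1000.0  # mm^3 -> cc
    return float(mask.sum()) * voxel_cc

# Example: does the segmented volume contain the relapse point of origin?
# pet = ...  # 3D SUV array
# mask = suvmax_threshold_mask(pet, fraction=0.9)
# contains_po = bool(mask[po_z, po_y, po_x])
```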

DeepContrast: Deep Tissue Contrast Enhancement using Synthetic Data Degradations and OOD Model Predictions

  • paper_url: http://arxiv.org/abs/2308.08365
  • repo_url: None
  • paper_authors: Nuno Pimpão Martins, Yannis Kalaidzidis, Marino Zerial, Florian Jug
  • For: The paper aims to improve the quality of microscopy images by addressing image degradation, specifically blurring and contrast loss, a major challenge in life science research.
  • Methods: The authors take a deep learning approach, training a neural network to learn the inverse of an approximate degradation function, and demonstrate that the network can be used out-of-distribution to improve the quality of less severely degraded images. They also explore the effect of iterative predictions and find that a balance between contrast improvement and retention of image details is necessary.
  • Results: The method improves the quality of microscopy images, especially in deeper regions of thick samples, as demonstrated through experiments on real microscopy images.
    Abstract Microscopy images are crucial for life science research, allowing detailed inspection and characterization of cellular and tissue-level structures and functions. However, microscopy data are unavoidably affected by image degradations, such as noise, blur, or others. Many such degradations also contribute to a loss of image contrast, which becomes especially pronounced in deeper regions of thick samples. Today, best performing methods to increase the quality of images are based on Deep Learning approaches, which typically require ground truth (GT) data during training. Our inability to counteract blurring and contrast loss when imaging deep into samples prevents the acquisition of such clean GT data. The fact that the forward process of blurring and contrast loss deep into tissue can be modeled, allowed us to propose a new method that can circumvent the problem of unobtainable GT data. To this end, we first synthetically degraded the quality of microscopy images even further by using an approximate forward model for deep tissue image degradations. Then we trained a neural network that learned the inverse of this degradation function from our generated pairs of raw and degraded images. We demonstrated that networks trained in this way can be used out-of-distribution (OOD) to improve the quality of less severely degraded images, e.g. the raw data imaged in a microscope. Since the absolute level of degradation in such microscopy images can be stronger than the additional degradation introduced by our forward model, we also explored the effect of iterative predictions. Here, we observed that in each iteration the measured image contrast kept improving while detailed structures in the images got increasingly removed. Therefore, dependent on the desired downstream analysis, a balance between contrast improvement and retention of image details has to be found.
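The training trick — degrading already-degraded raw images even further so that (further-degraded, raw) pairs substitute for unobtainable ground truth — can be sketched as below. The Gaussian blur and linear contrast compression are stand-ins for the paper's approximate forward model.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(image: np.ndarray, sigma: float = 2.0, contrast: float = 0.6) -> np.ndarray:
    """Approximate forward model for deep-tissue degradation: blur the image
    and compress its contrast around the mean. (Gaussian blur plus a linear
    contrast loss are illustrative stand-ins for the paper's model.)"""
    blurred = gaussian_filter(image, sigma=sigma)
    mean = blurred.mean()
    return mean + contrast * (blurred - mean)

# Training pairs: the *raw* microscopy image acts as the target and its
# further-degraded copy as the input, so no clean ground truth is needed.
# for raw in dataset:
#     x, y = degrade(raw), raw
#     loss = ((net(x) - y) ** 2).mean()  # train net to invert the degradation
```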

GAEI-UNet: Global Attention and Elastic Interaction U-Net for Vessel Image Segmentation

  • paper_url: http://arxiv.org/abs/2308.08345
  • repo_url: None
  • paper_authors: Ruiqiang Xiao, Zhuoyue Wan
  • for: The goal of this work is to improve the accuracy and reliability of vessel image segmentation, providing the medical community with more accurate and dependable diagnostic tools.
  • methods: The paper proposes a new model, GAEI-UNet, which combines global attention and elastic interaction-based techniques to enhance high-level semantic understanding and enable precise segmentation of small vessels.
  • results: Evaluated on the DRIVE retinal vessel dataset, GAEI-UNet performs strongly in terms of SE and connectivity of small vessels, without significantly increasing computational complexity.
    Abstract Vessel image segmentation plays a pivotal role in medical diagnostics, aiding in the early detection and treatment of vascular diseases. While segmentation based on deep learning has shown promising results, effectively segmenting small structures and maintaining connectivity between them remains challenging. To address these limitations, we propose GAEI-UNet, a novel model that combines global attention and elastic interaction-based techniques. GAEI-UNet leverages global spatial and channel context information to enhance high-level semantic understanding within the U-Net architecture, enabling precise segmentation of small vessels. Additionally, we adopt an elastic interaction-based loss function to improve connectivity among these fine structures. By capturing the forces generated by misalignment between target and predicted shapes, our model effectively learns to preserve the correct topology of vessel networks. Evaluation on retinal vessel dataset -- DRIVE demonstrates the superior performance of GAEI-UNet in terms of SE and connectivity of small structures, without significantly increasing computational complexity. This research aims to advance the field of vessel image segmentation, providing more accurate and reliable diagnostic tools for the medical community. The implementation code is available on Code.
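As a hedged sketch of what global channel context inside a U-Net can look like, the following squeeze-and-excitation-style module re-weights feature channels using image-wide statistics; the paper's actual global spatial/channel attention module may differ.

```python
import torch
import torch.nn as nn

class GlobalChannelAttention(nn.Module):
    """Squeeze-and-excitation-style block: global average pooling gathers
    image-wide context and a small MLP re-weights channels. One common way
    to inject global context into a U-Net; GAEI-UNet's module may differ."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))   # (b, c) global context vector
        return x * w.view(b, c, 1, 1)     # channel-wise re-weighting
```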

Denoising Diffusion Probabilistic Model for Retinal Image Generation and Segmentation

  • paper_url: http://arxiv.org/abs/2308.08339
  • repo_url: https://github.com/aaleka/retree
  • paper_authors: Alnur Alimanov, Md Baharul Islam
  • For: The paper provides a novel dataset and method for retinal image segmentation, which can help detect and diagnose various eye, blood circulation, and brain-related diseases.
  • Methods: A Denoising Diffusion Probabilistic Model (DDPM) generates retinal images and vessel trees; a two-stage DDPM first generates vessel trees from standard normal noise and is then guided to generate fundus images from a given vessel tree and random noise.
  • Results: The proposed Retinal Trees (ReTree) dataset has been evaluated quantitatively and qualitatively, and the results show that the DDPM outperformed GANs in image synthesis. The dataset and source code are available online for further research and validation.
    Abstract Experts use retinal images and vessel trees to detect and diagnose various eye, blood circulation, and brain-related diseases. However, manual segmentation of retinal images is a time-consuming process that requires high expertise and is difficult due to privacy issues. Many methods have been proposed to segment images, but the need for large retinal image datasets limits the performance of these methods. Several methods synthesize deep learning models based on Generative Adversarial Networks (GAN) to generate limited sample varieties. This paper proposes a novel Denoising Diffusion Probabilistic Model (DDPM) that outperformed GANs in image synthesis. We developed a Retinal Trees (ReTree) dataset consisting of retinal images, corresponding vessel trees, and a segmentation network based on DDPM trained with images from the ReTree dataset. In the first stage, we develop a two-stage DDPM that generates vessel trees from random numbers belonging to a standard normal distribution. Later, the model is guided to generate fundus images from given vessel trees and random distribution. The proposed dataset has been evaluated quantitatively and qualitatively. Quantitative evaluation metrics include Frechet Inception Distance (FID) score, Jaccard similarity coefficient, Cohen's kappa, Matthew's Correlation Coefficient (MCC), precision, recall, F1-score, and accuracy. We trained the vessel segmentation model with synthetic data to validate our dataset's efficiency and tested it on authentic data. Our developed dataset and source code is available at https://github.com/AAleka/retree.
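The generation backbone is standard DDPM ancestral sampling (Ho et al., 2020), which the paper chains in two stages (vessel tree first, then a fundus image conditioned on it). A minimal, unconditional sampling loop, assuming an `eps_model` noise predictor, looks roughly like this:

```python
import torch

@torch.no_grad()
def ddpm_sample(eps_model, shape, betas, device="cpu"):
    """Standard DDPM ancestral sampling: start from Gaussian noise and
    iteratively denoise with the learned noise predictor. The paper chains
    two such models (vessel tree, then fundus image conditioned on it)."""
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape, device=device)
    for t in reversed(range(len(betas))):
        z = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        eps = eps_model(x, torch.full((shape[0],), t, device=device))
        # posterior mean: (x_t - beta_t / sqrt(1 - alpha_bar_t) * eps) / sqrt(alpha_t)
        x = (x - betas[t] / (1 - alpha_bar[t]).sqrt() * eps) / alphas[t].sqrt()
        x = x + betas[t].sqrt() * z
    return x
```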

ECPC-IDS: A benchmark endometrial cancer PET/CT image dataset for evaluation of semantic segmentation and detection of hypermetabolic regions

  • paper_url: http://arxiv.org/abs/2308.08313
  • repo_url: None
  • paper_authors: Dechao Tang, Xuanyi Li, Tianming Du, Deguo Ma, Zhiyu Ma, Hongzan Sun, Marcin Grzegorzek, Huiyan Jiang, Chen Li
  • for: The paper provides a large, multi-image endometrial cancer PET/CT image dataset so that researchers can apply computer-assisted diagnostic techniques to improve diagnostic accuracy and objectivity.
  • methods: Five classical deep learning semantic segmentation methods and six deep learning object detection methods are used to demonstrate the differences between methods on ECPC-IDS.
  • results: Extensive experiments on ECPC-IDS demonstrate the effectiveness of deep learning-based semantic segmentation and object detection methods; the dataset can help researchers develop new computer-assisted diagnostic techniques, greatly benefiting both clinicians and patients.
    Abstract Endometrial cancer is one of the most common tumors in the female reproductive system and is the third most common gynecological malignancy that causes death after ovarian and cervical cancer. Early diagnosis can significantly improve the 5-year survival rate of patients. With the development of artificial intelligence, computer-assisted diagnosis plays an increasingly important role in improving the accuracy and objectivity of diagnosis, as well as reducing the workload of doctors. However, the absence of publicly available endometrial cancer image datasets restricts the application of computer-assisted diagnostic techniques. In this paper, a publicly available Endometrial Cancer PET/CT Image Dataset for Evaluation of Semantic Segmentation and Detection of Hypermetabolic Regions (ECPC-IDS) is published. Specifically, the segmentation section includes PET and CT images, with a total of 7159 images in multiple formats. In order to prove the effectiveness of segmentation methods on ECPC-IDS, five classical deep learning semantic segmentation methods are selected to test the image segmentation task. The object detection section also includes PET and CT images, with a total of 3579 images and XML files with annotation information. Six deep learning methods are selected for experiments on the detection task. This study conducts extensive experiments using deep learning-based semantic segmentation and object detection methods to demonstrate the differences between various methods on ECPC-IDS. As far as we know, this is the first publicly available endometrial cancer dataset with a large number of images, including the large amount of information required for image segmentation and target detection. ECPC-IDS can aid researchers in exploring new algorithms to enhance computer-assisted technology, benefiting both clinical doctors and patients greatly.
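For working with the detection annotations, a minimal loader is sketched below. It assumes the XML files follow the common Pascal VOC layout; the dataset's actual schema may differ.

```python
import xml.etree.ElementTree as ET

def load_boxes(xml_path: str):
    """Read bounding boxes from an annotation file, assuming the common
    Pascal VOC layout (<object><name>, <bndbox><xmin>...). ECPC-IDS's
    actual XML schema may differ."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        boxes.append({
            "label": obj.findtext("name"),
            "xmin": int(bb.findtext("xmin")), "ymin": int(bb.findtext("ymin")),
            "xmax": int(bb.findtext("xmax")), "ymax": int(bb.findtext("ymax")),
        })
    return boxes
```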

OnUVS: Online Feature Decoupling Framework for High-Fidelity Ultrasound Video Synthesis

  • paper_url: http://arxiv.org/abs/2308.08269
  • repo_url: None
  • paper_authors: Han Zhou, Dong Ni, Ao Chang, Xinrui Zhou, Rusi Chen, Yanlin Chen, Lian Liu, Jiamin Liang, Yuhao Huang, Tong Han, Zhe Liu, Deng-Ping Fan, Xin Yang
  • For: The paper addresses the challenges of synthesizing high-fidelity ultrasound (US) videos for medical education and diagnosis, specifically the limited availability of specific US video cases and the need to accurately animate dynamic anatomic structures while preserving image fidelity.
  • Methods: The proposed method, OnUVS, is an online feature-decoupling framework that incorporates anatomic information into keypoint learning, uses a dual-decoder to decouple content and textural features, and employs a multiple-feature discriminator to enhance the sharpness and fine details of the generated videos. Additionally, the motion trajectories of keypoints are constrained during online learning to enhance the fluidity of the generated videos.
  • Results: Validation and user studies on in-house echocardiographic and pelvic floor US videos show that OnUVS synthesizes US videos with high fidelity.
    Abstract Ultrasound (US) imaging is indispensable in clinical practice. To diagnose certain diseases, sonographers must observe corresponding dynamic anatomic structures to gather comprehensive information. However, the limited availability of specific US video cases causes teaching difficulties in identifying corresponding diseases, which potentially impacts the detection rate of such cases. The synthesis of US videos may represent a promising solution to this issue. Nevertheless, it is challenging to accurately animate the intricate motion of dynamic anatomic structures while preserving image fidelity. To address this, we present a novel online feature-decoupling framework called OnUVS for high-fidelity US video synthesis. Our highlights can be summarized by four aspects. First, we introduced anatomic information into keypoint learning through a weakly-supervised training strategy, resulting in improved preservation of anatomical integrity and motion while minimizing the labeling burden. Second, to better preserve the integrity and textural information of US images, we implemented a dual-decoder that decouples the content and textural features in the generator. Third, we adopted a multiple-feature discriminator to extract a comprehensive range of visual cues, thereby enhancing the sharpness and fine details of the generated videos. Fourth, we constrained the motion trajectories of keypoints during online learning to enhance the fluidity of generated videos. Our validation and user studies on in-house echocardiographic and pelvic floor US videos showed that OnUVS synthesizes US videos with high fidelity.
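The fourth point, constraining keypoint motion trajectories during online learning, could be realized, for instance, as a second-order smoothness penalty over consecutive frames. The sketch below is an illustrative assumption, not the paper's exact constraint.

```python
import torch

def trajectory_smoothness_loss(keypoints: torch.Tensor) -> torch.Tensor:
    """Second-order smoothness penalty on keypoint trajectories.
    keypoints: (T, K, 2) tensor of K 2D keypoints over T frames.
    Penalising frame-to-frame acceleration is one simple way to encourage
    fluid motion; OnUVS's actual constraint may differ."""
    velocity = keypoints[1:] - keypoints[:-1]      # (T-1, K, 2)
    acceleration = velocity[1:] - velocity[:-1]    # (T-2, K, 2)
    return acceleration.pow(2).mean()
```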

Neural Spherical Harmonics for structurally coherent continuous representation of diffusion MRI signal

  • paper_url: http://arxiv.org/abs/2308.08210
  • repo_url: None
  • paper_authors: Tom Hendriks, Anna Vilanova, Maxime Chamberland
  • for: The paper proposes a neural spherical harmonics (NeSH) based model for reconstructing diffusion MRI (dMRI) data that exploits the structural coherence of the human brain while using data from only a single subject.
  • methods: A neural network parameterizes a spherical harmonics series (NeSH) to represent the dMRI signal of a single subject, continuous in both the angular and spatial domains. This removes noise from the gradient images, and the fibre orientation distribution functions (FODs) show a smooth change of direction along fibre tracts.
  • results: The reconstruction can be used to compute mean diffusivity, fractional anisotropy, and total apparent fibre density with a single model architecture, tuning only one hyperparameter; upsampling in both the angular and spatial domains yields reconstructions on par with or better than existing methods.
    Abstract We present a novel way to model diffusion magnetic resonance imaging (dMRI) datasets, that benefits from the structural coherence of the human brain while only using data from a single subject. Current methods model the dMRI signal in individual voxels, disregarding the intervoxel coherence that is present. We use a neural network to parameterize a spherical harmonics series (NeSH) to represent the dMRI signal of a single subject from the Human Connectome Project dataset, continuous in both the angular and spatial domain. The reconstructed dMRI signal using this method shows a more structurally coherent representation of the data. Noise in gradient images is removed and the fiber orientation distribution functions show a smooth change in direction along a fiber tract. We showcase how the reconstruction can be used to calculate mean diffusivity, fractional anisotropy, and total apparent fiber density. These results can be achieved with a single model architecture, tuning only one hyperparameter. In this paper we also demonstrate how upsampling in both the angular and spatial domain yields reconstructions that are on par or better than existing methods.
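The core representation — a spherical harmonics series whose coefficients come from a neural network, continuous in angle and space — can be sketched as follows. The even-order real SH basis is the one conventionally used for dMRI; the `mlp` coefficient network in the closing comment is a hypothetical stand-in.

```python
import numpy as np
from scipy.special import sph_harm

def real_sh_basis(thetas, phis, lmax=4):
    """Design matrix of real, even-order spherical harmonics (the basis
    typically used for dMRI signals) evaluated at gradient directions.
    thetas: polar angles in [0, pi]; phis: azimuthal angles in [0, 2*pi]."""
    cols = []
    for l in range(0, lmax + 1, 2):               # even orders only (antipodal symmetry)
        for m in range(-l, l + 1):
            Y = sph_harm(abs(m), l, phis, thetas)  # scipy: sph_harm(m, n, azimuth, polar)
            if m < 0:
                cols.append(np.sqrt(2) * Y.imag)
            elif m == 0:
                cols.append(Y.real)
            else:
                cols.append(np.sqrt(2) * Y.real)
    return np.stack(cols, axis=-1)                # (n_dirs, n_coeffs)

# NeSH idea (sketch): a network maps a spatial coordinate to SH coefficients,
# and the dMRI signal at that voxel is the SH series evaluated at any direction:
#   signal(x, direction) = real_sh_basis(direction) @ mlp(x)
```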

Self-Reference Deep Adaptive Curve Estimation for Low-Light Image Enhancement

  • paper_url: http://arxiv.org/abs/2308.08197
  • repo_url: https://github.com/john-venti/self-dace
  • paper_authors: Jianyu Wen, Chenhao Wu, Tong Zhang, Yixuan Yu, Piotr Swierczynski
  • for: Enhancement of low-light images.
  • methods: The paper proposes a two-stage low-light image enhancement method consisting of an intuitive, lightweight, fast, and unsupervised luminance enhancement algorithm, together with a new loss function designed to preserve the colour, structure, and fidelity of natural images; a second-stage denoising scheme removes the latent noise in dark regions.
  • results: Exhaustive qualitative and quantitative analysis on multiple real-world datasets shows that the method outperforms existing state-of-the-art algorithms.
    Abstract In this paper, we propose a 2-stage low-light image enhancement method called Self-Reference Deep Adaptive Curve Estimation (Self-DACE). In the first stage, we present an intuitive, lightweight, fast, and unsupervised luminance enhancement algorithm. The algorithm is based on a novel low-light enhancement curve that can be used to locally boost image brightness. We also propose a new loss function with a simplified physical model designed to preserve natural images' color, structure, and fidelity. We use a vanilla CNN to map each pixel through deep Adaptive Adjustment Curves (AAC) while preserving the local image structure. Secondly, we introduce the corresponding denoising scheme to remove the latent noise in the darkness. We approximately model the noise in the dark and deploy a Denoising-Net to estimate and remove the noise after the first stage. Exhaustive qualitative and quantitative analysis shows that our method outperforms existing state-of-the-art algorithms on multiple real-world datasets.
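Self-DACE's adaptive adjustment curves build on the per-pixel quadratic enhancement curves of the Zero-DCE family. The sketch below applies that family's curve, LE(x) = x + α·x·(1−x), iteratively; the paper's own curve differs in detail.

```python
import torch

def apply_adjustment_curves(image: torch.Tensor, alphas: torch.Tensor) -> torch.Tensor:
    """Iteratively apply per-pixel quadratic enhancement curves
    LE(x) = x + alpha * x * (1 - x), as in the Zero-DCE family of methods
    (Self-DACE's adaptive curves differ in detail).
    image:  (B, 3, H, W) in [0, 1];  alphas: (B, 3*n_iter, H, W) in [-1, 1],
    typically predicted by a small CNN from the input image."""
    x = image
    for a in alphas.split(3, dim=1):  # one (B, 3, H, W) curve map per iteration
        x = x + a * x * (1 - x)
    return x
```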

Conditional Perceptual Quality Preserving Image Compression

  • paper_url: http://arxiv.org/abs/2308.08154
  • repo_url: None
  • paper_authors: Tongda Xu, Qian Zhang, Yanghao Li, Dailan He, Zhe Wang, Yuanyuan Wang, Hongwei Qin, Yan Wang, Jingjing Liu, Ya-Qin Zhang
  • for: The paper extends perceptual quality to conditional perceptual quality (cPQ), conditioned on user-defined side information, to improve the quality and efficiency of image compression.
  • methods: Building on the rate-distortion-perception trade-off, the paper derives theoretical properties of conditional perceptual quality and proposes an optimal framework for conditional perceptual quality preserving compression.
  • results: Experiments show that the codec maintains high perceptual and semantic quality at all bitrates; by providing a lower bound on the common randomness required, the paper settles the earlier debate on whether randomness should be incorporated into the generator for (conditional) perceptual quality compression.
    Abstract We propose conditional perceptual quality, an extension of the perceptual quality defined in \citet{blau2018perception}, by conditioning it on user defined information. Specifically, we extend the original perceptual quality $d(p_{X},p_{\hat{X}})$ to the conditional perceptual quality $d(p_{X|Y},p_{\hat{X}|Y})$, where $X$ is the original image, $\hat{X}$ is the reconstructed, $Y$ is side information defined by user and $d(.,.)$ is divergence. We show that conditional perceptual quality has similar theoretical properties as rate-distortion-perception trade-off \citep{blau2019rethinking}. Based on these theoretical results, we propose an optimal framework for conditional perceptual quality preserving compression. Experimental results show that our codec successfully maintains high perceptual quality and semantic quality at all bitrate. Besides, by providing a lowerbound of common randomness required, we settle the previous arguments on whether randomness should be incorporated into generator for (conditional) perceptual quality compression. The source code is provided in supplementary material.
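In the abstract's notation, the conditional analogue of the rate-distortion-perception function can be written as follows (a sketch following Blau & Michaeli, 2019, with conditioning on the side information $Y$):

```latex
% Conditional rate-distortion-perception function (sketch):
R(D, P) = \min_{p_{\hat{X}|X,Y}} \; I(X; \hat{X} \mid Y)
\quad \text{s.t.} \quad
\mathbb{E}\big[\Delta(X, \hat{X})\big] \le D,
\qquad
d\big(p_{X|Y},\, p_{\hat{X}|Y}\big) \le P .
```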

A Comprehensive Overview of Computational Nuclei Segmentation Methods in Digital Pathology

  • paper_url: http://arxiv.org/abs/2308.08112
  • repo_url: None
  • paper_authors: Vasileios Magoulianitis, Catherine A. Alexander, C. -C. Jay Kuo
  • for: This paper provides a comprehensive review of the use of digital pathology in the cancer diagnosis pipeline, particularly for the identification, staging, and grading of malignant areas.
  • methods: The review spans traditional image processing techniques and Deep Learning (DL) models for automated nuclei segmentation, and examines the influence of different types of supervision.
  • results: The advantages of the different models and types of supervision are discussed in depth; future work should emphasize efficient and explainable models whose outputs physicians can trust.
    Abstract In the cancer diagnosis pipeline, digital pathology plays an instrumental role in the identification, staging, and grading of malignant areas on biopsy tissue specimens. High resolution histology images are subject to high variance in appearance, sourcing either from the acquisition devices or the H\&E staining process. Nuclei segmentation is an important task, as it detects the nuclei cells over background tissue and gives rise to the topology, size, and count of nuclei which are determinant factors for cancer detection. Yet, it is a fairly time consuming task for pathologists, with reportedly high subjectivity. Computer Aided Diagnosis (CAD) tools empowered by modern Artificial Intelligence (AI) models enable the automation of nuclei segmentation. This can reduce the subjectivity in analysis and reading time. This paper provides an extensive review, beginning from earlier works that use traditional image processing techniques and reaching up to modern approaches following the Deep Learning (DL) paradigm. Our review also focuses on the weak supervision aspect of the problem, motivated by the fact that annotated data is scarce. At the end, the advantages of different models and types of supervision are thoroughly discussed. Furthermore, we try to extrapolate and envision how future research lines will potentially be, so as to minimize the need for labeled data while maintaining high performance. Future methods should emphasize efficient and explainable models with a transparent underlying process so that physicians can trust their output.

Snapshot High Dynamic Range Imaging with a Polarization Camera

  • paper_url: http://arxiv.org/abs/2308.08094
  • repo_url: https://github.com/Intelligent-Sensing/polarization-hdr
  • paper_authors: Mingyang Xie, Matthew Chan, Christopher Metzler
  • for: Turning an off-the-shelf polarization camera into a high dynamic range (HDR) camera.
  • methods: A linear polarizer placed in front of the polarization camera captures four images with varied exposures, determined by the orientation of the polarizer; a self-calibrating, outlier-robust algorithm reconstructs an HDR image from these measurements.
  • results: Extensive real-world experiments demonstrate the efficacy of the approach.
    Abstract High dynamic range (HDR) images are important for a range of tasks, from navigation to consumer photography. Accordingly, a host of specialized HDR sensors have been developed, the most successful of which are based on capturing variable per-pixel exposures. In essence, these methods capture an entire exposure bracket sequence at once in a single shot. This paper presents a straightforward but highly effective approach for turning an off-the-shelf polarization camera into a high-performance HDR camera. By placing a linear polarizer in front of the polarization camera, we are able to simultaneously capture four images with varied exposures, which are determined by the orientation of the polarizer. We develop an outlier-robust and self-calibrating algorithm to reconstruct an HDR image (at a single polarity) from these measurements. Finally, we demonstrate the efficacy of our approach with extensive real-world experiments.
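The physics behind the four simultaneous exposures is Malus's law: a linear polarizer at angle θ_p in front of a polarization camera attenuates each of the camera's 0°/45°/90°/135° channels by cos²(θ − θ_p). A naive hat-weighted merge in the linear domain, sketched below, stands in for the paper's outlier-robust, self-calibrating reconstruction.

```python
import numpy as np

def merge_polarization_hdr(channels: np.ndarray, pol_angle_deg: float = 0.0) -> np.ndarray:
    """Naive HDR merge of the four polarization channels (0/45/90/135 deg).
    With a linear polarizer at pol_angle_deg in front of the camera, Malus's
    law gives each channel a relative exposure of cos^2(angle - pol_angle).
    A simple hat-weighted average in the linear domain is a stand-in for the
    paper's outlier-robust, self-calibrating algorithm.
    channels: (4, H, W) array in [0, 1], one image per polarizer orientation."""
    angles = np.deg2rad(np.array([0.0, 45.0, 90.0, 135.0]) - pol_angle_deg)
    exposures = np.cos(angles) ** 2 + 1e-6           # relative exposure per channel
    weights = channels * (1 - channels)               # hat weight: trust mid-range pixels
    radiance = channels / exposures[:, None, None]    # rescale to a common exposure
    return (weights * radiance).sum(0) / (weights.sum(0) + 1e-8)
```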

Deep Learning Framework for Spleen Volume Estimation from 2D Cross-sectional Views

  • paper_url: http://arxiv.org/abs/2308.08038
  • repo_url: None
  • paper_authors: Zhen Yuan, Esther Puyol-Anton, Haran Jogeesvaran, Baba Inusa, Andrew P. King
  • for: The aim is to automatically estimate spleen volume from 2D spleen segmentations, enabling more accurate clinical assessment of splenomegaly and related clinical conditions.
  • methods: A variational autoencoder-based framework estimates spleen volume from single- or dual-view 2D spleen segmentations; three volume estimation methods are proposed and evaluated within this framework.
  • results: The best model achieved mean relative volume accuracies of 86.62% and 92.58% for single- and dual-view segmentations respectively, surpassing the clinical standard approach and a comparative deep learning-based 2D-3D reconstruction method.
    Abstract Abnormal spleen enlargement (splenomegaly) is regarded as a clinical indicator for a range of conditions, including liver disease, cancer and blood diseases. While spleen length measured from ultrasound images is a commonly used surrogate for spleen size, spleen volume remains the gold standard metric for assessing splenomegaly and the severity of related clinical conditions. Computed tomography is the main imaging modality for measuring spleen volume, but it is less accessible in areas where there is a high prevalence of splenomegaly (e.g., the Global South). Our objective was to enable automated spleen volume measurement from 2D cross-sectional segmentations, which can be obtained from ultrasound imaging. In this study, we describe a variational autoencoder-based framework to measure spleen volume from single- or dual-view 2D spleen segmentations. We propose and evaluate three volume estimation methods within this framework. We also demonstrate how 95% confidence intervals of volume estimates can be produced to make our method more clinically useful. Our best model achieved mean relative volume accuracies of 86.62% and 92.58% for single- and dual-view segmentations, respectively, surpassing the performance of the clinical standard approach of linear regression using manual measurements and a comparative deep learning-based 2D-3D reconstruction-based approach. The proposed spleen volume estimation framework can be integrated into standard clinical workflows which currently use 2D ultrasound images to measure spleen length. To the best of our knowledge, this is the first work to achieve direct 3D spleen volume estimation from 2D spleen segmentations.
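One way the reported 95% confidence intervals could be produced in a VAE-based framework is by Monte Carlo sampling of the latent. The `encoder` and `volume_head` names below are hypothetical stand-ins for the paper's architecture, and the sampling scheme is an illustrative assumption.

```python
import numpy as np
import torch

@torch.no_grad()
def volume_with_ci(encoder, volume_head, seg2d: torch.Tensor, n_samples: int = 100):
    """Estimate spleen volume with a 95% interval by sampling the VAE latent.
    `encoder` returns (mu, logvar) for a 2D segmentation; `volume_head` maps
    a latent sample to a volume in mL. Both are hypothetical stand-ins; latent
    sampling is one plausible route to the paper's confidence intervals."""
    mu, logvar = encoder(seg2d)
    std = (0.5 * logvar).exp()
    vols = []
    for _ in range(n_samples):
        z = mu + std * torch.randn_like(std)   # reparameterised latent sample
        vols.append(volume_head(z).item())
    lo, hi = np.percentile(vols, [2.5, 97.5])
    return float(np.mean(vols)), (float(lo), float(hi))
```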