eess.IV - 2023-08-23

Tumor-Centered Patching for Enhanced Medical Image Segmentation

  • paper_url: http://arxiv.org/abs/2308.12168
  • repo_url: None
  • paper_authors: Mutyyba Asghar, Ahmad Raza Shahid, Akhtar Jamil, Kiran Aftab, Syed Ather Enam
  • for: This work aims to improve the precision of image segmentation in medical image diagnosis systems, in particular addressing the limited resources, slow convergence, and class imbalance that hamper deep learning methods in practice.
  • methods: The paper introduces a novel "tumor-centered patching" method that aligns patches with the tumor's anatomical context to improve feature extraction accuracy and computational efficiency.
  • results: Experiments show that the method mitigates class imbalance, achieving segmentation scores of 0.78, 0.76, and 0.71 for the whole, core, and enhancing tumor regions, respectively.
    Abstract The realm of medical image diagnosis has advanced significantly with the integration of computer-aided diagnosis and surgical systems. However, challenges persist, particularly in achieving precise image segmentation. While deep learning techniques show potential, obstacles like limited resources, slow convergence, and class imbalance impede their effectiveness. Traditional patch-based methods, though common, struggle to capture intricate tumor boundaries and often lead to redundant samples, compromising computational efficiency and feature quality. To tackle these issues, this research introduces an innovative approach centered on the tumor itself for patch-based image analysis. This novel tumor-centered patching method aims to address the class imbalance and boundary deficiencies, enabling focused and accurate tumor segmentation. By aligning patches with the tumor's anatomical context, this technique enhances feature extraction accuracy and reduces computational load. Experimental results demonstrate improved handling of class imbalance, with segmentation scores of 0.78, 0.76, and 0.71 for whole, core, and enhancing tumors, respectively, using a lightweight simple U-Net. This approach shows potential for enhancing medical image segmentation and improving computer-aided diagnosis systems.
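No code is released for this paper (repo_url: None), so the sketch below is only a rough illustration of the tumor-centered idea: patches are sampled with their centers inside a given tumor mask rather than on a uniform grid. The patch size, sample count, and mask input are assumptions.

```python
import numpy as np

def tumor_centered_patches(volume, tumor_mask, patch_size=64, n_patches=16, seed=0):
    """Sample patches whose centers lie inside the tumor mask.

    volume:     3D image array (D, H, W)
    tumor_mask: boolean array of the same shape, True inside the tumor
    """
    rng = np.random.default_rng(seed)
    half = patch_size // 2
    # Candidate centers: all voxels labeled as tumor.
    centers = np.argwhere(tumor_mask)
    if len(centers) == 0:
        return []
    picks = centers[rng.choice(len(centers), size=n_patches)]
    patches = []
    for z, y, x in picks:
        # Clamp the patch so it stays fully inside the volume.
        z0 = np.clip(z - half, 0, volume.shape[0] - patch_size)
        y0 = np.clip(y - half, 0, volume.shape[1] - patch_size)
        x0 = np.clip(x - half, 0, volume.shape[2] - patch_size)
        patches.append(volume[z0:z0 + patch_size,
                              y0:y0 + patch_size,
                              x0:x0 + patch_size])
    return patches
```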

DISGAN: Wavelet-informed Discriminator Guides GAN to MRI Super-resolution with Noise Cleaning

  • paper_url: http://arxiv.org/abs/2308.12084
  • repo_url: None
  • paper_authors: Qi Wang, Lucas Mahler, Julius Steiglechner, Florian Birk, Klaus Scheffler, Gabriele Lohmann
  • for: This work proposes a deep learning method that tackles MRI super-resolution (SR) and denoising simultaneously, without requiring explicitly paired noisy and clean training images.
  • methods: The method is a GAN guided by a frequency-informed discriminator, using the 3D Discrete Wavelet Transform (DWT) as a frequency constraint within the GAN framework.
  • results: A single model achieves high-quality SR together with intrinsic denoising, eliminating the need to train separate SR and denoising models.
    Abstract MRI super-resolution (SR) and denoising tasks are fundamental challenges in the field of deep learning, which have traditionally been treated as distinct tasks with separate paired training data. In this paper, we propose an innovative method that addresses both tasks simultaneously using a single deep learning model, eliminating the need for explicitly paired noisy and clean images during training. Our proposed model is primarily trained for SR, but also exhibits remarkable noise-cleaning capabilities in the super-resolved images. Instead of conventional approaches that introduce frequency-related operations into the generative process, our novel approach involves the use of a GAN model guided by a frequency-informed discriminator. To achieve this, we harness the power of the 3D Discrete Wavelet Transform (DWT) operation as a frequency constraint within the GAN framework for the SR task on magnetic resonance imaging (MRI) data. Specifically, our contributions include: 1) a 3D generator based on residual-in-residual connected blocks; 2) the integration of the 3D DWT with $1\times 1$ convolution into a DWT+conv unit within a 3D Unet for the discriminator; 3) the use of the trained model for high-quality image SR, accompanied by an intrinsic denoising process. We dub the model "Denoising Induced Super-resolution GAN (DISGAN)" due to its dual effects of SR image generation and simultaneous denoising. Departing from the traditional approach of training SR and denoising tasks as separate models, our proposed DISGAN is trained only on the SR task, but also achieves exceptional performance in denoising. The model is trained on 3D MRI data from dozens of subjects from the Human Connectome Project (HCP) and further evaluated on previously unseen MRI data from subjects with brain tumours and epilepsy to assess its denoising and SR performance.
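Since no reference code is released, the following is only a guess at the discriminator's DWT+conv unit under stated assumptions: a one-level 3D Haar transform (the paper does not specify the wavelet here) splits the volume into eight subbands, which a $1\times 1$ convolution then fuses.

```python
import torch
import torch.nn as nn

def haar_dwt3d(x):
    """One-level 3D Haar DWT of x with shape (B, C, D, H, W); D, H, W even.
    Returns (B, 8*C, D/2, H/2, W/2): low/high subband pairs along each axis."""
    for dim in (2, 3, 4):
        even = x.index_select(dim, torch.arange(0, x.size(dim), 2, device=x.device))
        odd = x.index_select(dim, torch.arange(1, x.size(dim), 2, device=x.device))
        lo, hi = (even + odd) / 2.0, (even - odd) / 2.0  # unnormalized Haar pair
        x = torch.cat([lo, hi], dim=1)  # stack subbands on the channel axis
    return x

class DWTConvUnit(nn.Module):
    """Wavelet decomposition followed by a 1x1x1 convolution, loosely
    matching the paper's DWT+conv unit (channel counts are assumptions)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.fuse = nn.Conv3d(8 * in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.fuse(haar_dwt3d(x))

y = DWTConvUnit(1, 32)(torch.randn(2, 1, 16, 16, 16))  # -> (2, 32, 8, 8, 8)
```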

StofNet: Super-resolution Time of Flight Network

  • paper_url: http://arxiv.org/abs/2308.12009
  • repo_url: https://github.com/hahnec/stofnet
  • paper_authors: Christopher Hahne, Michel Hayoz, Raphael Sznitman
  • for: This paper applies modern super-resolution techniques to Time of Flight (ToF) sensing to improve the reliability and accuracy of ToF detection.
  • methods: The architecture combines super-resolution with an efficient residual contraction block to balance fine signal details against large-scale contextual information.
  • results: A benchmark against six state-of-the-art methods shows that the proposed StofNet is superior in terms of precision, reliability, and model complexity.
    Abstract Time of Flight (ToF) is a prevalent depth sensing technology in the fields of robotics, medical imaging, and non-destructive testing. Yet, ToF sensing faces challenges from complex ambient conditions making an inverse modelling from the sparse temporal information intractable. This paper highlights the potential of modern super-resolution techniques to learn varying surroundings for a reliable and accurate ToF detection. Unlike existing models, we tailor an architecture for sub-sample precise semi-global signal localization by combining super-resolution with an efficient residual contraction block to balance between fine signal details and large scale contextual information. We consolidate research on ToF by conducting a benchmark comparison against six state-of-the-art methods for which we employ two publicly available datasets. This includes the release of our SToF-Chirp dataset captured by an airborne ultrasound transducer. Results showcase the superior performance of our proposed StofNet in terms of precision, reliability and model complexity. Our code is available at https://github.com/hahnec/stofnet.
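For context, a classical sub-sample localization baseline that learned detectors such as StofNet aim to beat is parabolic interpolation around the strongest sample of the echo envelope. The sketch below is that baseline, not the paper's network (see the repository above for the actual model).

```python
import numpy as np

def subsample_peak(signal, fs):
    """Estimate time of flight with sub-sample precision by fitting a
    parabola through the peak sample and its two neighbors.

    signal: 1D echo envelope (non-negative); fs: sampling rate in Hz.
    Returns the refined peak time in seconds."""
    i = int(np.argmax(signal))
    if i == 0 or i == len(signal) - 1:
        return i / fs  # peak on the boundary: no refinement possible
    y0, y1, y2 = signal[i - 1], signal[i], signal[i + 1]
    denom = y0 - 2 * y1 + y2
    delta = 0.5 * (y0 - y2) / denom if denom != 0 else 0.0  # vertex offset
    return (i + delta) / fs
```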

Comparing Autoencoder to Geometrical Features for Vascular Bifurcations Identification

  • paper_url: http://arxiv.org/abs/2308.12314
  • repo_url: None
  • paper_authors: Ibtissam Essadik, Anass Nouri, Raja Touahni, Florent Autrusseau
  • for: This work proposes two novel approaches, based respectively on an autoencoder and on geometrical features, for identifying bifurcations in the cerebrovascular tree.
  • methods: An autoencoder and handcrafted geometrical features are used for feature extraction and pattern recognition, followed by classification on medical imaging data.
  • results: Both approaches identify cerebrovascular bifurcations effectively, achieving high accuracy and F1 scores.
    Abstract The cerebrovascular tree is a complex anatomical structure that plays a crucial role in brain irrigation. A precise identification of the bifurcations in the vascular network is essential for understanding various cerebral pathologies. Traditional methods often require manual intervention and are sensitive to variations in data quality. In recent years, deep learning techniques, and particularly autoencoders, have shown promising performance for feature extraction and pattern recognition in a variety of domains. In this paper, we propose two novel approaches for vascular bifurcation identification based respectively on an autoencoder and on geometrical features. The performance and effectiveness of each method in terms of classification of vascular bifurcations using medical imaging data is presented. The evaluation was performed on a sample database composed of 91 TOF-MRA acquisitions, using various evaluation measures, including accuracy, F1 score and confusion matrix.
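The paper's network is not public, so the following is a minimal sketch of the autoencoder branch under assumed sizes: an encoder compresses a bifurcation patch to a latent vector, trained with a reconstruction loss, and the latent vector then feeds a downstream classifier.

```python
import torch
import torch.nn as nn

class PatchAutoencoder(nn.Module):
    """Toy autoencoder for 32x32x32 vascular patches; the latent code z
    serves as the feature vector for bifurcation classification."""
    def __init__(self, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 32 * 32, 512), nn.ReLU(),
            nn.Linear(512, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, 32 * 32 * 32),
        )

    def forward(self, x):
        z = self.encoder(x)
        recon = self.decoder(z).view_as(x)
        return recon, z

# Typical use: train with an MSE reconstruction loss, then classify z.
model = PatchAutoencoder()
x = torch.randn(8, 1, 32, 32, 32)
recon, z = model(x)
loss = nn.functional.mse_loss(recon, x)
```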

Recovering a Molecule’s 3D Dynamics from Liquid-phase Electron Microscopy Movies

  • paper_url: http://arxiv.org/abs/2308.11927
  • repo_url: None
  • paper_authors: Enze Ye, Yuhang Wang, Hong Zhang, Yiqin Gao, Huan Wang, He Sun
  • for: This work studies the dynamic behavior of biomolecules in order to better understand how they function in living systems.
  • methods: It builds on liquid-phase electron microscopy (liquid-phase EM), which keeps molecules in their native liquid environment and thus offers a unique opportunity to observe their conformational dynamics.
  • results: The paper proposes TEMPOR, a Temporal Electron MicroscoPy Object Reconstruction algorithm that combines an implicit neural representation (INR) with a dynamical variational auto-encoder (DVAE) to recover time series of molecular structures from liquid-phase EM movies. On two simulated datasets, 7bcq and Cas9, it recovers time-varying 3D structures directly from the movies; to the authors' knowledge this is the first such attempt, offering a promising new approach for studying molecules' 3D dynamics in structural biology.
    Abstract The dynamics of biomolecules are crucial for our understanding of their functioning in living systems. However, current 3D imaging techniques, such as cryogenic electron microscopy (cryo-EM), require freezing the sample, which limits the observation of their conformational changes in real time. The innovative liquid-phase electron microscopy (liquid-phase EM) technique allows molecules to be placed in the native liquid environment, providing a unique opportunity to observe their dynamics. In this paper, we propose TEMPOR, a Temporal Electron MicroscoPy Object Reconstruction algorithm for liquid-phase EM that leverages an implicit neural representation (INR) and a dynamical variational auto-encoder (DVAE) to recover time series of molecular structures. We demonstrate its advantages in recovering different motion dynamics from two simulated datasets, 7bcq and Cas9. To our knowledge, our work is the first attempt to directly recover 3D structures of a temporally-varying particle from liquid-phase EM movies. It provides a promising new approach for studying molecules' 3D dynamics in structural biology.
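To make the INR component concrete, here is a toy coordinate network with a Fourier positional encoding; the encoding, widths, and output (a scalar density) are assumptions, and TEMPOR's actual INR and its DVAE time-conditioning are more involved.

```python
import torch
import torch.nn as nn

class FourierINR(nn.Module):
    """Toy implicit neural representation: maps 3D coordinates to density."""
    def __init__(self, n_freqs=8, hidden=128):
        super().__init__()
        # Fixed log-spaced frequencies for the positional encoding.
        self.register_buffer("freqs", 2.0 ** torch.arange(n_freqs, dtype=torch.float32))
        in_dim = 3 * 2 * n_freqs
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, xyz):
        # xyz: (N, 3) coordinates in [-1, 1]
        proj = xyz[..., None] * self.freqs                     # (N, 3, n_freqs)
        enc = torch.cat([proj.sin(), proj.cos()], dim=-1).flatten(1)
        return self.net(enc)                                   # (N, 1) density

density = FourierINR()(torch.rand(1024, 3) * 2 - 1)
```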

Studying the Impact of Augmentations on Medical Confidence Calibration

  • paper_url: http://arxiv.org/abs/2308.11902
  • repo_url: None
  • paper_authors: Adrit Rao, Joon-Young Lee, Oliver Aalami
  • for: This study evaluates how modern augmentation techniques affect the confidence calibration and performance of convolutional neural networks (CNNs) on medical tasks.
  • methods: Three modern augmentation techniques, CutMix, MixUp, and CutOut, are compared with respect to their effect on CNN calibration and performance.
  • results: CutMix improved calibration the most, while CutOut often lowered the level of calibration.
    Abstract The clinical explainability of convolutional neural networks (CNN) heavily relies on the joint interpretation of a model's predicted diagnostic label and associated confidence. A highly certain or uncertain model can significantly impact clinical decision-making. Thus, ensuring that confidence estimates reflect the true correctness likelihood for a prediction is essential. CNNs are often poorly calibrated and prone to overconfidence leading to improper measures of uncertainty. This creates the need for confidence calibration. However, accuracy and performance-based evaluations of CNNs are commonly used as the sole benchmark for medical tasks. Taking into consideration the risks associated with miscalibration is of high importance. In recent years, modern augmentation techniques, which cut, mix, and combine images, have been introduced. Such augmentations have benefited CNNs through regularization, robustness to adversarial samples, and calibration. Standard augmentations based on image scaling, rotating, and zooming, are widely leveraged in the medical domain to combat the scarcity of data. In this paper, we evaluate the effects of three modern augmentation techniques, CutMix, MixUp, and CutOut on the calibration and performance of CNNs for medical tasks. CutMix improved calibration the most while CutOut often lowered the level of calibration.
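For reference, a minimal CutMix sketch as commonly defined (Yun et al., 2019): a random box from a shuffled batch is pasted in, and the labels are mixed in proportion to the pasted area. The alpha value and helper names are ours, not from this paper.

```python
import numpy as np
import torch

def cutmix(images, labels, alpha=1.0):
    """CutMix a batch.

    images: (B, C, H, W) tensor; labels: (B, num_classes) one-hot tensor."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(images.size(0))
    H, W = images.shape[2:]
    # Box with area ratio (1 - lam), centered at a uniform random point.
    rh, rw = int(H * np.sqrt(1 - lam)), int(W * np.sqrt(1 - lam))
    cy, cx = np.random.randint(H), np.random.randint(W)
    y1, y2 = np.clip(cy - rh // 2, 0, H), np.clip(cy + rh // 2, 0, H)
    x1, x2 = np.clip(cx - rw // 2, 0, W), np.clip(cx + rw // 2, 0, W)
    shuffled = images[perm]          # advanced indexing copies the batch
    images = images.clone()
    images[:, :, y1:y2, x1:x2] = shuffled[:, :, y1:y2, x1:x2]
    # Recompute lam from the actual (clipped) box area.
    lam = 1 - (y2 - y1) * (x2 - x1) / (H * W)
    mixed_labels = lam * labels + (1 - lam) * labels[perm]
    return images, mixed_labels
```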

Enhanced Residual SwinV2 Transformer for Learned Image Compression

  • paper_url: http://arxiv.org/abs/2308.11864
  • repo_url: None
  • paper_authors: Yongqiang Wang, Feng Liang, Haisheng Fu, Jie Liang, Haipeng Qin, Junzhe Liang
  • for: To improve learned image compression performance while keeping the model simple.
  • methods: The framework uses an enhanced residual SwinV2 transformer with a feature enhancement module, employing SwinV2 transformer-based attention in the coding and hyper-coding steps.
  • results: On the Kodak and Tecnick datasets the method is comparable to recent learned image compression methods and outperforms some traditional codecs, including VVC, while reducing model complexity by 56%.
    Abstract Recently, deep learning technology has been successfully applied in the field of image compression, leading to superior rate-distortion performance. However, a challenge of many learning-based approaches is that they often achieve better performance by sacrificing complexity, which makes practical deployment difficult. To alleviate this issue, in this paper, we propose an effective and efficient learned image compression framework based on an enhanced residual SwinV2 transformer. To enhance the nonlinear representation of images in our framework, we use a feature enhancement module that consists of three consecutive convolutional layers. In the subsequent coding and hyper coding steps, we utilize a SwinV2 transformer-based attention mechanism to process the input image. The SwinV2 model can help to reduce model complexity while maintaining high performance. Experimental results show that the proposed method achieves performance comparable to some recent learned image compression methods on the Kodak and Tecnick datasets, and outperforms some traditional codecs including VVC. In particular, our method achieves comparable results while reducing model complexity by 56% compared to these recent methods.
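The feature enhancement module is described only as three consecutive convolutional layers; the sketch below is one plausible reading, with kernel sizes, activations, and the residual skip as assumptions.

```python
import torch
import torch.nn as nn

class FeatureEnhancement(nn.Module):
    """Three consecutive convolutional layers with a residual connection,
    loosely matching the paper's feature enhancement module."""
    def __init__(self, channels=128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.GELU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.GELU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)  # enhance features without changing shape

y = FeatureEnhancement()(torch.randn(1, 128, 16, 16))
```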

Robust RF Data Normalization for Deep Learning

  • paper_url: http://arxiv.org/abs/2308.11833
  • repo_url: None
  • paper_authors: Mostafa Sharifzadeh, Habib Benali, Hassan Rivaz
  • for: Normalizing radio frequency (RF) ultrasound data for training deep neural networks.
  • methods: Individual standardization of each image, which outperforms conventional data normalization.
  • results: Improved generality and performance of trained deep neural network models.
    Abstract Radio frequency (RF) data contain richer information compared to other data types, such as envelope or B-mode, and employing RF data for training deep neural networks has attracted growing interest in ultrasound image processing. However, RF data is highly fluctuating and additionally has a high dynamic range. Most previous studies in the literature have relied on conventional data normalization, which has been adopted within the computer vision community. We demonstrate the inadequacy of those techniques for normalizing RF data and propose that individual standardization of each image substantially enhances the performance of deep neural networks by utilizing the data more efficiently. We compare conventional and proposed normalizations in a phase aberration correction task and illustrate how the latter enhances the generality of trained models.
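The proposed per-image standardization is straightforward; the sketch below contrasts it with a conventional dataset-wide normalization (the epsilon and array layout are assumptions).

```python
import numpy as np

def standardize_per_image(rf_batch, eps=1e-8):
    """Standardize each RF frame independently to zero mean, unit std.
    rf_batch: (N, H, W) array of beamformed RF frames."""
    mean = rf_batch.mean(axis=(1, 2), keepdims=True)
    std = rf_batch.std(axis=(1, 2), keepdims=True)
    return (rf_batch - mean) / (std + eps)

def standardize_dataset(rf_batch, eps=1e-8):
    """Conventional alternative: one mean/std for the whole dataset,
    which copes poorly with the high dynamic range of RF data."""
    return (rf_batch - rf_batch.mean()) / (rf_batch.std() + eps)
```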

Frequency-Space Prediction Filtering for Phase Aberration Correction in Plane-Wave Ultrasound

  • paper_url: http://arxiv.org/abs/2308.11830
  • repo_url: None
  • paper_authors: Mostafa Sharifzadeh, Habib Benali, Hassan Rivaz
  • for: This paper aims to improve the image quality of ultrasound imaging by addressing the challenge of phase aberration, which is a significant contributing factor to image degradation in focused ultrasound imaging.
  • methods: The paper proposes an adaptive AR model to improve the performance of frequency-space prediction filtering (FXPF) in plane-wave imaging, where the number of contributing signals varies at different depths.
  • results: The proposed adaptive AR model is effective in improving image quality, as demonstrated by improved contrast and generalized contrast-to-noise ratio metrics compared to a fixed-order AR model.
    Abstract Ultrasound imaging often suffers from image degradation stemming from phase aberration, which represents a significant contributing factor to the overall image degradation in ultrasound imaging. Frequency-space prediction filtering or FXPF is a technique that has been applied within focused ultrasound imaging to alleviate the phase aberration effect. It presupposes the existence of an autoregressive (AR) model across the signals received at the transducer elements and removes any components that do not conform to the established model. In this study, we illustrate the challenge of applying this technique to plane-wave imaging, where, at shallower depths, signals from more distant elements lose relevance, and a fewer number of elements contribute to image reconstruction. While the number of contributing signals varies, adopting a fixed-order AR model across all depths, results in suboptimal performance. To address this challenge, we propose an AR model with an adaptive order and quantify its effectiveness using contrast and generalized contrast-to-noise ratio metrics.
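To make the depth-adaptive AR idea concrete, here is an illustrative sketch (not the authors' implementation): the AR order is tied to the number of elements contributing at a given depth, and the coefficients are fit by least squares; the order-selection ratio is an assumption.

```python
import numpy as np

def fit_ar_coeffs(x, order):
    """Least-squares AR fit for a 1D sequence x (e.g., one frequency bin
    of the aperture signals): x[n] ~= a . [x[n-order], ..., x[n-1]]."""
    A = np.asarray([x[i:i + order] for i in range(len(x) - order)])
    b = x[order:]
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return a

def adaptive_order(n_contributing, ratio=0.25, min_order=1):
    """Heuristic from the paper's premise: fewer contributing elements at
    shallow depths warrant a lower AR order."""
    return max(min_order, int(round(ratio * n_contributing)))
```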

WS-SfMLearner: Self-supervised Monocular Depth and Ego-motion Estimation on Surgical Videos with Unknown Camera Parameters

  • paper_url: http://arxiv.org/abs/2308.11776
  • repo_url: None
  • paper_authors: Ange Lou, Jack Noble
  • for: To build a self-supervised depth and ego-motion estimation system for image-guided surgery, operating on surgical videos with unknown camera parameters.
  • methods: A cost-volume-based supervision scheme provides the system with auxiliary supervision for predicting the camera intrinsic parameters.
  • results: Experiments show improved accuracy of the estimated camera parameters, ego-motion, and depth.
    Abstract Depth estimation in surgical video plays a crucial role in many image-guided surgery procedures. However, it is difficult and time consuming to create depth map ground truth datasets in surgical videos due in part to inconsistent brightness and noise in the surgical scene. Therefore, building an accurate and robust self-supervised depth and camera ego-motion estimation system is gaining more attention from the computer vision community. Although several self-supervision methods alleviate the need for ground truth depth maps and poses, they still need known camera intrinsic parameters, which are often missing or not recorded. Moreover, the camera intrinsic prediction methods in existing works depend heavily on the quality of datasets. In this work, we aimed to build a self-supervised depth and ego-motion estimation system which can predict not only accurate depth maps and camera pose, but also camera intrinsic parameters. We proposed a cost-volume-based supervision manner to give the system auxiliary supervision for camera parameters prediction. The experimental results showed that the proposed method improved the accuracy of estimated camera parameters, ego-motion, and depth estimation.
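Self-supervision in this family of methods rests on differentiable reprojection through the predicted depth, pose, and intrinsics; below is a minimal sketch of that projection step only (not the paper's cost-volume supervision), with the pinhole model as the assumed camera.

```python
import torch

def project(points_cam, K):
    """Project 3D camera-frame points (N, 3) to pixels using intrinsics K (3, 3)."""
    uvw = points_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3].clamp(min=1e-6)

def backproject(depth, K):
    """Lift a depth map (H, W) to 3D camera-frame points (H*W, 3)."""
    H, W = depth.shape
    v, u = torch.meshgrid(torch.arange(H, dtype=depth.dtype),
                          torch.arange(W, dtype=depth.dtype), indexing="ij")
    pix = torch.stack([u, v, torch.ones_like(u)], dim=-1).reshape(-1, 3)
    rays = pix @ torch.inverse(K).T          # pixel rays at unit depth
    return rays * depth.reshape(-1, 1)       # scale rays by predicted depth

# A photometric loss then compares the target frame against the source
# frame warped through the predicted relative pose and project(...).
```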

EndoNet: model for automatic calculation of H-score on histological slides

  • paper_url: http://arxiv.org/abs/2308.11562
  • repo_url: None
  • paper_authors: Egor Ushakov, Anton Naumov, Vladislav Fomberg, Polina Vishnyakova, Aleksandra Asaturova, Alina Badlaeva, Anna Tregubova, Evgeny Karpulevich, Gennady Sukhikh, Timur Fatkhudinov
  • for: This paper aims to develop a computer-aided method for automatic calculation of H-score on histological slides, which can improve the efficiency and accuracy of pathologists’ workflows.
  • methods: The proposed method, called EndoNet, uses neural networks and consists of two main parts: a detection model that predicts keypoints of centers of nuclei, and a H-score module that calculates the value of the H-score using mean pixel values of predicted keypoints.
  • results: The model was trained and validated on 1780 annotated tiles with a shape of 100x100 $\mu m$ and achieved 0.77 mAP on a test dataset. The model is effective and robust in the analysis of histology slides, which can improve and significantly accelerate the work of pathologists.
    Abstract H-score is a semi-quantitative method used to assess the presence and distribution of proteins in tissue samples by combining the intensity of staining and percentage of stained nuclei. It is widely used but time-consuming and can be limited in accuracy and precision. Computer-aided methods may help overcome these limitations and improve the efficiency of pathologists' workflows. In this work, we developed a model EndoNet for automatic calculation of H-score on histological slides. Our proposed method uses neural networks and consists of two main parts. The first is a detection model which predicts keypoints of centers of nuclei. The second is a H-score module which calculates the value of the H-score using mean pixel values of predicted keypoints. Our model was trained and validated on 1780 annotated tiles with a shape of 100x100 $\mu m$ and achieved 0.77 mAP on a test dataset. Moreover, the model can be adjusted to a specific specialist or whole laboratory to reproduce the manner of calculating the H-score. Thus, EndoNet is effective and robust in the analysis of histology slides, which can improve and significantly accelerate the work of pathologists.
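For reference, the standard H-score combines staining-intensity bins (0-3) with the percentage of nuclei in each bin, giving a value in [0, 300]; a minimal computation is sketched below, where the bin thresholds are assumptions rather than values from the paper.

```python
import numpy as np

def h_score(nucleus_intensities, thresholds=(0.25, 0.5, 0.75)):
    """Standard H-score: sum over intensity bins i in {1, 2, 3} of
    i * (% of nuclei in bin i); ranges from 0 to 300.

    nucleus_intensities: per-nucleus staining strengths in [0, 1]
    (e.g., mean pixel values at EndoNet's predicted keypoints);
    thresholds: bin edges separating negative/weak/moderate/strong."""
    x = np.asarray(nucleus_intensities)
    bins = np.digitize(x, thresholds)              # 0..3 intensity class
    pct = np.bincount(bins, minlength=4) / len(x) * 100
    return pct[1] + 2 * pct[2] + 3 * pct[3]

print(h_score([0.1, 0.3, 0.6, 0.9, 0.8]))          # example: 180.0
```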

Open Set Synthetic Image Source Attribution

  • paper_url: http://arxiv.org/abs/2308.11557
  • repo_url: None
  • paper_authors: Shengbang Fang, Tai D. Nguyen, Matthew C. Stamm
  • for: To develop an open-set, metric-learning-based source attribution technique that can attribute synthetic images even as new generators emerge.
  • methods: Transferrable embeddings are learned to discriminate between generators; an image is first assigned to a candidate generator, then accepted or rejected based on its distance in the embedding space from the known generators' learned reference points.
  • results: A series of experiments demonstrates that the approach can attribute the source of synthetic images in open-set scenarios, i.e., even for generators unseen during training.
    Abstract AI-generated images have become increasingly realistic and have garnered significant public attention. While synthetic images are intriguing due to their realism, they also pose an important misinformation threat. To address this new threat, researchers have developed multiple algorithms to detect synthetic images and identify their source generators. However, most existing source attribution techniques are designed to operate in a closed-set scenario, i.e. they can only be used to discriminate between known image generators. By contrast, new image-generation techniques are rapidly emerging. To contend with this, there is a great need for open-set source attribution techniques that can identify when synthetic images have originated from new, unseen generators. To address this problem, we propose a new metric learning-based approach. Our technique works by learning transferrable embeddings capable of discriminating between generators, even when they are not seen during training. An image is first assigned to a candidate generator, then is accepted or rejected based on its distance in the embedding space from known generators' learned reference points. Importantly, we identify that initializing our source attribution embedding network by pretraining it on image camera identification can improve our embeddings' transferability. Through a series of experiments, we demonstrate our approach's ability to attribute the source of synthetic images in open-set scenarios.
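The accept/reject step described in the abstract can be sketched as a nearest-reference-point decision with a rejection threshold; the function below is illustrative (the threshold and the use of Euclidean distance are assumptions).

```python
import numpy as np

def attribute_open_set(embedding, reference_points, threshold):
    """Assign an image embedding to its nearest known generator, or reject
    it as unknown when it is too far from every reference point.

    embedding:        (d,) embedding of the query image
    reference_points: dict mapping generator name -> (d,) reference embedding
    threshold:        distance above which the source is treated as unseen."""
    names = list(reference_points)
    dists = np.array([np.linalg.norm(embedding - reference_points[n])
                      for n in names])
    best = int(np.argmin(dists))
    if dists[best] > threshold:
        return "unknown generator"
    return names[best]
```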