paper_authors: Juhwan Lee, Justin N. Kim, Luis A. P. Dallan, Vladislav N. Zimin, Ammar Hoori, Neda S. Hassani, Mohamed H. E. Makhlouf, Giulio Guagliumi, Hiram G. Bezerra, David L. Wilson
results: The proposed method delivered better FC segmentation than comparison deep-learning methods (Dice: 0.837+/-0.012) and performed consistently across five-fold cross-validation (sensitivity: 85.0+/-0.3%, Dice: 0.846+/-0.011) and the held-out test set. FC thickness agreed closely with ground truth (difference: 2.95+/-20.73 um), and measurements were highly reproducible between pre- and post-stenting pullbacks (average FC angle: 200.9+/-128.0 deg / 202.0+/-121.1 deg).
Abstract
Thin-cap fibroatheroma (TCFA) is a prominent risk factor for plaque rupture. Intravascular optical coherence tomography (IVOCT) enables identification of fibrous cap (FC), measurement of FC thicknesses, and assessment of plaque vulnerability. We developed a fully-automated deep learning method for FC segmentation. This study included 32,531 images across 227 pullbacks from two registries. Images were semi-automatically labeled using our OCTOPUS with expert editing using established guidelines. We employed preprocessing including guidewire shadow detection, lumen segmentation, pixel-shifting, and Gaussian filtering on raw IVOCT (r,theta) images. Data were augmented in a natural way by changing theta in spiral acquisitions and by changing intensity and noise values. We used a modified SegResNet and comparison networks to segment FCs. We employed transfer learning from our existing much larger, fully-labeled calcification IVOCT dataset to reduce deep-learning training. Overall, our method consistently delivered better FC segmentation results (Dice: 0.837+/-0.012) than other deep-learning methods. Transfer learning reduced training time by 84% and reduced the need for more training samples. Our method showed a high level of generalizability, evidenced by highly-consistent segmentations across five-fold cross-validation (sensitivity: 85.0+/-0.3%, Dice: 0.846+/-0.011) and the held-out test (sensitivity: 84.9%, Dice: 0.816) sets. In addition, we found excellent agreement of FC thickness with ground truth (2.95+/-20.73 um), giving clinically insignificant bias. There was excellent reproducibility in pre- and post-stenting pullbacks (average FC angle: 200.9+/-128.0 deg / 202.0+/-121.1 deg). Our method will be useful for multiple research purposes and potentially for planning stent deployments that avoid placing a stent edge over an FC.
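The preprocessing and augmentation described above operate on raw IVOCT (r, theta) frames: A-lines are pixel-shifted relative to the detected lumen, and extra training frames are generated by shifting theta in the spiral acquisition and perturbing intensity and noise. The NumPy sketch below only illustrates those ideas under assumed array conventions (rows = depth r, columns = angle theta); the function names and parameter values are hypothetical and not the authors' OCTOPUS implementation.

```python
import numpy as np

def pixel_shift_to_lumen(polar_img, lumen_radius):
    """Shift each A-line so the detected lumen boundary sits at row 0 (assumed scheme)."""
    shifted = np.empty_like(polar_img)
    for j in range(polar_img.shape[1]):
        # Roll the column (one A-line) upward by its lumen-boundary index.
        shifted[:, j] = np.roll(polar_img[:, j], -int(lumen_radius[j]))
    return shifted

def augment_theta(polar_img, shift):
    """Rotate the frame by `shift` A-lines; theta wraps around, so no data is lost."""
    return np.roll(polar_img, shift, axis=1)

def augment_intensity_noise(polar_img, gain=1.05, noise_std=0.01, rng=None):
    """Scale intensity and add Gaussian noise (illustrative magnitudes)."""
    rng = np.random.default_rng() if rng is None else rng
    return gain * polar_img + rng.normal(0.0, noise_std, size=polar_img.shape)
```

In this sketch, a single (n_r, n_theta) frame can be expanded into many training samples by sweeping `shift` over a range of A-line offsets and drawing random gain and noise values.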
Perceptual impact of the loss function on deep-learning image coding performance
results: Through a crowdsourced subjective image quality assessment study, the paper examines how different image quality metrics used in the training loss affect the perceptual performance of deep-learning image codecs. The results show that the choice of quality metric is critical to the codec's perceptual performance and that the best choice can vary with image content.
Abstract
Nowadays, deep-learning image coding solutions have shown similar or better compression efficiency than conventional solutions based on hand-crafted transforms and spatial prediction techniques. These deep-learning codecs require a large training set of images and a training methodology to obtain a suitable model (set of parameters) for efficient compression. The training is performed with an optimization algorithm which provides a way to minimize the loss function. Therefore, the loss function plays a key role in the overall performance and includes a differentiable quality metric that attempts to mimic human perception. The main objective of this paper is to study the perceptual impact of several image quality metrics that can be used in the loss function of the training process, through a crowdsourcing subjective image quality assessment study. From this study, it is possible to conclude that the choice of the quality metric is critical for the perceptual performance of the deep-learning codec and that can vary depending on the image content.
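The training loss described above combines an estimated rate with a differentiable quality metric standing in for human perception. The PyTorch sketch below shows one common shape such a loss can take; the lambda weight, the specific metric options (plain MSE versus MS-SSIM via the third-party pytorch_msssim package), and the function name are assumptions for illustration, not the configuration evaluated in the paper.

```python
import torch.nn.functional as F
from pytorch_msssim import ms_ssim  # assumed third-party dependency

def rd_loss(x, x_hat, bits_per_pixel, metric="mse", lmbda=0.01):
    """Illustrative rate-distortion loss: rate + lmbda * distortion.

    x, x_hat       : original / reconstructed images in [0, 1], shape (N, C, H, W)
    bits_per_pixel : rate estimate from the codec's entropy model (scalar tensor)
    metric         : which differentiable quality metric defines the distortion term
    """
    if metric == "mse":
        distortion = F.mse_loss(x_hat, x)
    elif metric == "ms-ssim":
        # MS-SSIM is a similarity score in [0, 1]; convert it to a loss to minimize.
        distortion = 1.0 - ms_ssim(x_hat, x, data_range=1.0)
    else:
        raise ValueError(f"unsupported metric: {metric}")
    return bits_per_pixel + lmbda * distortion
```

Swapping `metric` changes what "distortion" means to the optimizer, which is precisely the perceptual effect the subjective study compares across codecs trained with different metrics.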
YOLOv5s-BC: An improved YOLOv5s-based method for real-time apple detection
paper_authors: Jingfan Liu, Zhaobing Liu
for:
* This study aims to address the shortcomings of existing apple-detection algorithms by proposing an improved YOLOv5s-based method for real-time apple detection.
methods:
* A coordinate attention (CA) block is added to the backbone module, and the original concatenation operation in the neck module is replaced with a bidirectional feature pyramid network (BiFPN); a sketch of a generic CA block follows this entry.
* A new detection head is added to the head module so that smaller and more distant targets within the robot's field of view can be detected.
results:
* Compared with YOLOv5s, YOLOv4, YOLOv3, SSD, Faster R-CNN (ResNet50), and Faster R-CNN (VGG), the proposed method improves mAP by 4.6%, 3.6%, 20.48%, 23.22%, 15.27%, and 15.59%, respectively.
* Detection accuracy is also greatly improved over the original YOLOv5s, with a real-time detection speed of 0.018 s per image and a small model size of 16.7 Mb, meeting the real-time requirements of the apple-picking robot.
* Heat maps show that the model attends to and learns the high-level features of target apples and recognizes small target apples better than the original YOLOv5s.
* In tests in other apple orchards, the model detected pickable apples correctly and in real time.
Abstract
To address the issues associated with existing apple detection algorithms, this study proposes an improved YOLOv5s-based method, named YOLOv5s-BC, for real-time apple detection, in which a series of modifications have been introduced. Firstly, a coordinate attention (CA) block has been incorporated into the backbone module to construct a new backbone network. Secondly, the original concatenation operation has been replaced with a bidirectional feature pyramid network (BiFPN) in the neck module. Lastly, a new detection head has been added to the head module, enabling the detection of smaller and more distant targets within the field of view of the robot. The proposed YOLOv5s-BC model was compared to several target detection algorithms, including YOLOv5s, YOLOv4, YOLOv3, SSD, Faster R-CNN (ResNet50), and Faster R-CNN (VGG), with significant improvements of 4.6%, 3.6%, 20.48%, 23.22%, 15.27%, and 15.59% in mAP, respectively. The detection accuracy of the proposed model is also greatly enhanced over the original YOLOv5s model. The model boasts an average detection speed of 0.018 seconds per image, and the weight size is only 16.7 Mb, 4.7 Mb smaller than that of YOLOv8s, meeting the real-time requirements for the picking robot. Furthermore, according to the heat map, our proposed model can focus more on and learn the high-level features of the target apples, and recognize smaller target apples better than the original YOLOv5s model. In other apple orchard tests, the model detected pickable apples correctly and in real time, illustrating a decent generalization ability.
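The backbone modification above hinges on a coordinate attention (CA) block. The PyTorch sketch below is a minimal, generic re-implementation of coordinate attention in the style of Hou et al. (CVPR 2021), included only to show the mechanism; the reduction ratio, activation choice, and class name are assumptions, and this is not the authors' YOLOv5s-BC code.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Pool along H and W separately, encode the two directional descriptors
    jointly, then re-weight the feature map with per-direction attention."""

    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))   # -> (N, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))   # -> (N, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        x_h = self.pool_h(x)                          # (N, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)      # (N, C, W, 1)
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                       # height attention
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))   # width attention
        return x * a_h * a_w
```

A quick shape check: `CoordinateAttention(256)(torch.randn(1, 256, 40, 40))` returns a tensor of the same shape, so the block can be dropped into a backbone stage without changing downstream layers.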