results: The BiLSTM method outperforms traditional regression approaches, captures the nonlinear dynamics between multiple time series, and performs well under different growing conditions, particularly over the senescence period; BiLSTM can therefore be used to impute LAI from time-series Sentinel-1 VH/VV and Sentinel-2 data.
Abstract
The Leaf Area Index (LAI) is vital for predicting winter wheat yield. Acquisition of crop conditions via Sentinel-2 remote sensing images can be hindered by persistent clouds, affecting yield predictions. Synthetic Aperture Radar (SAR) provides all-weather imagery, and the ratio between its cross- and co-polarized channels (C-band) shows a high correlation with time series LAI over winter wheat regions. This study evaluates the use of time series Sentinel-1 VH/VV for LAI imputation, aiming to increase spatial-temporal density. We utilize a bidirectional LSTM (BiLSTM) network to impute time series LAI and use half mean squared error for each time step as the loss function. We trained models on data from southern Germany and the North China Plain using only LAI data generated by Sentinel-1 VH/VV and Sentinel-2. Experimental results show BiLSTM outperforms traditional regression methods, capturing nonlinear dynamics between multiple time series. It proves robust in various growing conditions and is effective even with limited Sentinel-2 images. BiLSTM's performance surpasses that of LSTM, particularly over the senescence period. Therefore, BiLSTM can be used to impute LAI with time-series Sentinel-1 VH/VV and Sentinel-2 data, and this method could be applied to other time-series imputation issues.
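To make the training setup concrete, here is a minimal PyTorch sketch of a BiLSTM imputer trained with the per-time-step half mean squared error described above. The layer sizes, the two-channel feature layout, and the observation-mask convention are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of the BiLSTM imputation setup; shapes and sizes are assumed.
import torch
import torch.nn as nn

class BiLSTMImputer(nn.Module):
    """Maps a time series of Sentinel-1 VH/VV features to an imputed
    LAI value at every time step."""
    def __init__(self, n_features: int = 2, hidden: int = 64):
        super().__init__()
        self.rnn = nn.LSTM(n_features, hidden, batch_first=True,
                           bidirectional=True)   # forward + backward context
        self.head = nn.Linear(2 * hidden, 1)     # per-step LAI estimate

    def forward(self, x):                 # x: (batch, time, n_features)
        h, _ = self.rnn(x)                # h: (batch, time, 2 * hidden)
        return self.head(h).squeeze(-1)   # (batch, time)

def half_mse_per_step(pred, target, mask):
    """Half mean squared error over observed time steps only; `mask` is 1
    where a Sentinel-2 LAI observation exists, 0 where it is missing."""
    se = 0.5 * (pred - target) ** 2
    return (se * mask).sum() / mask.sum().clamp(min=1)
```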
BiGSeT: Binary Mask-Guided Separation Training for DNN-based Hyperspectral Anomaly Detection
results: We validate our training strategy on real-world datasets; compared with several existing methods, our approach achieves higher detection performance. Specifically, we obtain a 90.67% AUC score on the HyMap Cooke City dataset. We also apply the training strategy to other deep network structures and improve their anomaly detection performance.
Abstract
Hyperspectral anomaly detection (HAD) aims to recognize a minority of anomalies that are spectrally different from their surrounding background without prior knowledge. Deep neural networks (DNNs), including autoencoders (AEs), convolutional neural networks (CNNs) and vision transformers (ViTs), have shown remarkable performance in this field due to their powerful ability to model the complicated background. However, for reconstruction tasks, DNNs tend to incorporate both background and anomalies into the estimated background, which is referred to as the identical mapping problem (IMP) and leads to significantly decreased performance. To address this limitation, we propose a model-independent binary mask-guided separation training strategy for DNNs, named BiGSeT. Our method introduces a separation training loss based on a latent binary mask to separately constrain the background and anomalies in the estimated image. The background is preserved, while the potential anomalies are suppressed by using an efficient second-order Laplacian of Gaussian (LoG) operator, generating a pure background estimate. In order to maintain separability during training, we periodically update the mask using a robust proportion threshold estimated before the training. In our experiments, we adopt a vanilla AE as the network to validate our training strategy on several real-world datasets. Our results show superior performance compared to some state-of-the-art methods. Specifically, we achieved a 90.67% AUC score on the HyMap Cooke City dataset. Additionally, we applied our training strategy to other deep network structures, achieving improved detection performance compared to their original versions, demonstrating its effective transferability. The code of our method will be available at https://github.com/enter-i-username/BiGSeT.
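As a rough illustration of the separation training idea, the sketch below combines a background-preserving reconstruction term with a LoG-based suppression term gated by the binary mask. The kernel size, loss weighting, and mask semantics are assumptions made for illustration; the linked repository is the authoritative implementation.

```python
# Hedged sketch of a mask-guided separation loss, assuming mask = 1 marks
# suspected anomalies. Not the paper's reference implementation.
import torch
import torch.nn.functional as F

def log_kernel(size=5, sigma=1.0, device="cpu"):
    """Discrete 2-D Laplacian-of-Gaussian kernel (zero-sum)."""
    ax = torch.arange(size, dtype=torch.float32, device=device) - size // 2
    xx, yy = torch.meshgrid(ax, ax, indexing="ij")
    r2 = xx ** 2 + yy ** 2
    g = torch.exp(-r2 / (2 * sigma ** 2))
    log = (r2 - 2 * sigma ** 2) / sigma ** 4 * g
    return (log - log.mean()).view(1, 1, size, size)

def separation_loss(x, x_hat, mask, lam=1.0):
    """x, x_hat: (B, C, H, W); mask: (B, 1, H, W).
    Background pixels are constrained to faithful reconstruction, while
    the LoG response of the estimate is suppressed at suspected anomalies."""
    bg = 1.0 - mask
    recon = ((x_hat - x) ** 2 * bg).sum() / bg.sum().clamp(min=1)
    c = x_hat.shape[1]
    k = log_kernel(device=x.device).expand(c, 1, -1, -1)
    high_freq = F.conv2d(x_hat, k, padding=2, groups=c)  # depthwise LoG
    suppress = (high_freq ** 2 * mask).sum() / mask.sum().clamp(min=1)
    return recon + lam * suppress
```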
Transient Neural Radiance Fields for Lidar View Synthesis and 3D Reconstruction
paper_authors: Anagh Malik, Parsa Mirdehghan, Sotiris Nousias, Kiriakos N. Kutulakos, David B. Lindell
for: Virtual scene rendering that simulates raw lidar measurements
methods: Uses a time-resolved version of the volume rendering equation to render lidar measurements, capturing transient light transport phenomena
results: Renders pulsed lidar measurements from novel views and, compared with conventional point cloud-based supervision, recovers improved geometry and conventional appearance.
Abstract
Neural radiance fields (NeRFs) have become a ubiquitous tool for modeling scene appearance and geometry from multiview imagery. Recent work has also begun to explore how to use additional supervision from lidar or depth sensor measurements in the NeRF framework. However, previous lidar-supervised NeRFs focus on rendering conventional camera imagery and use lidar-derived point cloud data as auxiliary supervision; thus, they fail to incorporate the underlying image formation model of the lidar. Here, we propose a novel method for rendering transient NeRFs that take as input the raw, time-resolved photon count histograms measured by a single-photon lidar system, and we seek to render such histograms from novel views. Different from conventional NeRFs, the approach relies on a time-resolved version of the volume rendering equation to render the lidar measurements and capture transient light transport phenomena at picosecond timescales. We evaluate our method on a first-of-its-kind dataset of simulated and captured transient multiview scans from a prototype single-photon lidar. Overall, our work brings NeRFs to a new dimension of imaging at transient timescales, newly enabling rendering of transient imagery from novel views. Additionally, we show that our approach recovers improved geometry and conventional appearance compared to point cloud-based supervision when training on few input viewpoints. Transient NeRFs may be especially useful for applications which seek to simulate raw lidar measurements for downstream tasks in autonomous driving, robotics, and remote sensing.
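The key technical change is replacing the steady-state volume rendering equation with a time-resolved one. As a rough sketch of that idea (our reading of the abstract, not the paper's exact formulation), the rendered color C(r) becomes a transient h(r, t) in which radiance returned from depth s along the ray is accumulated into the time bin of its round-trip travel time 2s/c:

```latex
% Conventional volume rendering along a ray r(s), with transmittance T(s),
% and a hedged time-resolved variant that bins returns by round-trip time.
\[
T(s) = \exp\!\Big(-\!\int_{s_n}^{s} \sigma(\mathbf{r}(u))\,du\Big), \qquad
C(\mathbf{r}) = \int_{s_n}^{s_f} T(s)\,\sigma(\mathbf{r}(s))\,c(\mathbf{r}(s),\mathbf{d})\,ds,
\]
\[
h(\mathbf{r}, t) = \int_{s_n}^{s_f} T(s)\,\sigma(\mathbf{r}(s))\,c(\mathbf{r}(s),\mathbf{d})\,
\delta\!\big(t - \tfrac{2s}{c}\big)\,ds.
\]
```

A physical system would further convolve h with the laser pulse shape and detector response before comparing against measured photon count histograms; the paper's precise forward model may differ.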
Reconstructing Three-decade Global Fine-Grained Nighttime Light Observations by a New Super-Resolution Framework
results: We validated one billion data points and found that our model reaches a correlation coefficient of 0.873 at the global scale, significantly higher than that of other existing models (maximum = 0.713). Our model also performs best at the national and urban scales. Furthermore, through an inspection of airports and roads, only our model's image details can reveal the historical development of these facilities. We provide long-term and fine-grained nighttime light observations to support research on human activities. The dataset is available at \url{https://doi.org/10.5281/zenodo.7859205}.
Abstract
Satellite-collected nighttime light provides a unique perspective on human activities, including urbanization, population growth, and epidemics. Yet, long-term and fine-grained nighttime light observations are lacking, leaving the analysis and applications of decades of light changes in urban facilities undeveloped. To fill this gap, we developed an innovative framework and used it to design a new super-resolution model that reconstructs low-resolution nighttime light data into high resolution. The validation of one billion data points shows that the correlation coefficient of our model at the global scale reaches 0.873, which is significantly higher than that of other existing models (maximum = 0.713). Our model also outperforms existing models at the national and urban scales. Furthermore, through an inspection of airports and roads, only our model's image details can reveal the historical development of these facilities. We provide the long-term and fine-grained nighttime light observations to promote research on human activities. The dataset is available at \url{https://doi.org/10.5281/zenodo.7859205}.
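Computing a correlation coefficient over a billion pixel pairs cannot be done by loading everything into memory; a single pass that accumulates sufficient statistics suffices. The sketch below is purely illustrative (the paper does not describe its validation code):

```python
# Streaming Pearson correlation over chunked data; illustrative only.
import numpy as np

def streaming_pearson(chunks):
    """`chunks` yields (pred, ref) pairs of 1-D arrays; only sufficient
    statistics are kept, so memory use is independent of total size."""
    n = sx = sy = sxx = syy = sxy = 0.0
    for pred, ref in chunks:
        n += pred.size
        sx += pred.sum()
        sy += ref.sum()
        sxx += (pred ** 2).sum()
        syy += (ref ** 2).sum()
        sxy += (pred * ref).sum()
    cov = sxy - sx * sy / n
    return cov / np.sqrt((sxx - sx ** 2 / n) * (syy - sy ** 2 / n))
```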
Sampling-Priors-Augmented Deep Unfolding Network for Robust Video Compressive Sensing
paper_authors: Yuhao Huang, Gangrong Qu, Youran Ge
for: High-speed scene recording with a low-frame-rate sensor
methods: Uses a sampling-priors-augmented deep unfolding network (SPA-DUN) for efficient and robust multi-frame reconstruction
results: Achieves SOTA performance on both simulated and real data, handles a variety of sampling settings with a single model, and improves interpretability and generality.
Abstract
Video Compressed Sensing (VCS) aims to reconstruct multiple frames from one single captured measurement, thus achieving high-speed scene recording with a low-frame-rate sensor. Although there have been impressive advances in VCS recently, those state-of-the-art (SOTA) methods also significantly increase model complexity and suffer from poor generality and robustness, which means that those networks need to be retrained to accommodate the new system. Such limitations hinder the real-time imaging and practical deployment of models. In this work, we propose a Sampling-Priors-Augmented Deep Unfolding Network (SPA-DUN) for efficient and robust VCS reconstruction. Under the optimization-inspired deep unfolding framework, a lightweight and efficient U-net is exploited to downsize the model while improving overall performance. Moreover, the prior knowledge from the sampling model is utilized to dynamically modulate the network features to enable single SPA-DUN to handle arbitrary sampling settings, augmenting interpretability and generality. Extensive experiments on both simulation and real datasets demonstrate that SPA-DUN is not only applicable for various sampling settings with one single model but also achieves SOTA performance with incredible efficiency.
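For readers unfamiliar with deep unfolding, the sketch below shows one optimization-inspired stage for video compressive sensing: a data-fidelity gradient step followed by a learned prior step conditioned on the sampling masks, which is one plausible way to realize the "sampling priors" modulation. The stand-in convolutional prior, the conditioning-by-concatenation, and all names are assumptions; SPA-DUN's lightweight U-Net and stage design differ in detail.

```python
# Hedged sketch of one unfolding stage for video CS (y = sum_t mask_t * x_t).
import torch
import torch.nn as nn

class UnfoldingStage(nn.Module):
    def __init__(self, frames: int, feat: int = 32):
        super().__init__()
        self.rho = nn.Parameter(torch.tensor(0.5))  # learned step size
        self.prior = nn.Sequential(                 # stand-in for a U-Net
            nn.Conv2d(2 * frames, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, frames, 3, padding=1),
        )

    def forward(self, x, y, masks):
        # x: (B, T, H, W) estimate; y: (B, H, W) measurement;
        # masks: (B, T, H, W) per-frame sampling masks (the sampling prior).
        resid = (masks * x).sum(dim=1) - y              # A x - y
        x = x - self.rho * masks * resid.unsqueeze(1)   # gradient step (A^T)
        # Conditioning the prior on the masks lets one network handle
        # arbitrary sampling settings.
        return x + self.prior(torch.cat([x, masks], dim=1))
```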
cOOpD: Reformulating COPD classification on chest CT scans as anomaly detection using contrastive representations
paper_authors: Silvia D. Almeida, Carsten T. Lüth, Tobias Norajitra, Tassilo Wald, Marco Nolden, Paul F. Jaeger, Claus P. Heussel, Jürgen Biederer, Oliver Weinheimer, Klaus Maier-Hein
methods: The paper uses a self-supervised contrastive pretext model to learn representations of lung regions, then uses a generative model to detect anomalies.
results: The paper achieves the best performance on two public datasets, improving AUROC by 8.2% and 7.7% over the previous supervised state of the art. It also provides interpretable spatial anomaly maps and patient-level scores, which help identify individuals in the early stage of progression.
Abstract
Classification of heterogeneous diseases is challenging due to their complexity, variability of symptoms and imaging findings. Chronic Obstructive Pulmonary Disease (COPD) is a prime example, being underdiagnosed despite being the third leading cause of death. Its sparse, diffuse and heterogeneous appearance on computed tomography challenges supervised binary classification. We reformulate COPD binary classification as an anomaly detection task, proposing cOOpD: heterogeneous pathological regions are detected as Out-of-Distribution (OOD) from normal homogeneous lung regions. To this end, we learn representations of unlabeled lung regions employing a self-supervised contrastive pretext model, potentially capturing specific characteristics of diseased and healthy unlabeled regions. A generative model then learns the distribution of healthy representations and identifies abnormalities (stemming from COPD) as deviations. Patient-level scores are obtained by aggregating region OOD scores. We show that cOOpD achieves the best performance on two public datasets, with an increase of 8.2% and 7.7% in terms of AUROC compared to the previous supervised state-of-the-art. Additionally, cOOpD yields well-interpretable spatial anomaly maps and patient-level scores which we show to be of additional value in identifying individuals in the early stage of progression. Experiments in artificially designed real-world prevalence settings further support that anomaly detection is a powerful way of tackling COPD classification.
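The region-to-patient scoring logic can be sketched as follows, with a Gaussian density standing in for the paper's generative model over healthy representations and a top-k mean standing in for its aggregation rule; both stand-ins are our assumptions.

```python
# Hedged sketch: score region representations against a healthy model,
# then aggregate region scores into a patient-level score.
import numpy as np

class HealthyDensity:
    """Fits a Gaussian to representations of healthy lung regions."""
    def fit(self, z):                        # z: (n_regions, d)
        self.mu = z.mean(axis=0)
        self.prec = np.linalg.inv(np.cov(z, rowvar=False))
        return self

    def ood_score(self, z):                  # higher = more abnormal
        d = z - self.mu
        return np.einsum("nd,de,ne->n", d, self.prec, d)  # squared Mahalanobis

def patient_score(region_scores, top_frac=0.1):
    """Mean of the top fraction of region OOD scores, so a few clearly
    abnormal regions dominate the patient-level score."""
    k = max(1, int(len(region_scores) * top_frac))
    return float(np.sort(region_scores)[-k:].mean())
```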
Masked Autoencoders for Unsupervised Anomaly Detection in Medical Images
results: We conduct experiments on two medical imaging datasets, BRATS2020 and LUNA16, and compare our method with four state-of-the-art anomaly detection frameworks, AST, RD4AD, AnoVAEGAN and f-AnoGAN, showing favorable anomaly detection performance.
Abstract
Pathological anomalies exhibit diverse appearances in medical imaging, making it difficult to collect and annotate a representative amount of data required to train deep learning models in a supervised setting. Therefore, in this work, we tackle anomaly detection in medical images training our framework using only healthy samples. We propose to use the Masked Autoencoder model to learn the structure of the normal samples, then train an anomaly classifier on top of the difference between the original image and the reconstruction provided by the masked autoencoder. We train the anomaly classifier in a supervised manner using as negative samples the reconstruction of the healthy scans, while as positive samples, we use pseudo-abnormal scans obtained via our novel pseudo-abnormal module. The pseudo-abnormal module alters the reconstruction of the normal samples by changing the intensity of several regions. We conduct experiments on two medical image data sets, namely BRATS2020 and LUNA16 and compare our method with four state-of-the-art anomaly detection frameworks, namely AST, RD4AD, AnoVAEGAN and f-AnoGAN.
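A minimal sketch of the pseudo-abnormal idea, assuming rectangular regions and multiplicative intensity shifts; the paper only states that the intensity of several regions is changed, so the region count, sizes, and ranges below are invented for illustration.

```python
# Hedged sketch: fabricate positive samples by perturbing the intensity of
# random regions of a healthy reconstruction. Parameters are assumed.
import numpy as np

def pseudo_abnormal(recon, n_regions=3, rng=None):
    """recon: 2-D healthy reconstruction in [0, 1] (assumed larger than
    ~32 px per side); returns a perturbed copy."""
    rng = rng or np.random.default_rng()
    out = recon.copy()
    h, w = out.shape
    for _ in range(n_regions):
        rh, rw = rng.integers(8, h // 4), rng.integers(8, w // 4)
        y, x = rng.integers(0, h - rh), rng.integers(0, w - rw)
        out[y:y + rh, x:x + rw] *= rng.uniform(0.5, 1.5)  # intensity shift
    return np.clip(out, 0.0, 1.0)

# Classifier inputs are difference images |image - reconstruction|:
# negatives come from healthy scans vs. their own reconstructions,
# positives from pseudo_abnormal(reconstruction) vs. the reconstruction.
```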
Improved Flood Insights: Diffusion-Based SAR to EO Image Translation
paper_authors: Minseok Seo, Youngtack Oh, Doyi Kim, Dongmin Kang, Yeji Choi
for: This paper aims to improve the interpretability of flood insights from Synthetic Aperture Radar (SAR) images for human analysts.
methods: The paper proposes a novel framework called Diffusion-Based SAR to EO Image Translation (DSE) to convert SAR images into Electro-Optical (EO) images, enhancing the interpretability of flood insights.
results: Experimental results on two datasets (Sen1Floods11 and SEN12-FLOOD) show that the DSE framework not only delivers enhanced visual information but also improves performance across all tested flood segmentation baselines.
Abstract
Driven by rapid climate change, the frequency and intensity of flood events are increasing. Electro-Optical (EO) satellite imagery is commonly utilized for rapid response. However, its utilities in flood situations are hampered by issues such as cloud cover and limitations during nighttime, making accurate assessment of damage challenging. Several alternative flood detection techniques utilizing Synthetic Aperture Radar (SAR) data have been proposed. Despite the advantages of SAR over EO in the aforementioned situations, SAR presents a distinct drawback: human analysts often struggle with data interpretation. To tackle this issue, this paper introduces a novel framework, Diffusion-Based SAR to EO Image Translation (DSE). The DSE framework converts SAR images into EO images, thereby enhancing the interpretability of flood insights for humans. Experimental results on the Sen1Floods11 and SEN12-FLOOD datasets confirm that the DSE framework not only delivers enhanced visual information but also improves performance across all tested flood segmentation baselines.
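As context for how such a translation model might be run at inference time, below is a schematic DDPM-style ancestral sampling loop conditioned on the SAR input. The noise schedule, conditioning interface, and step count are generic diffusion defaults, not the DSE paper's specification.

```python
# Hedged sketch of conditional diffusion sampling for SAR-to-EO translation.
import torch

@torch.no_grad()
def sample_eo_from_sar(eps_model, sar, steps=1000):
    """eps_model(x_t, sar, t) is assumed to predict the noise at step t,
    with SAR conditioning handled inside the network (e.g., concatenation)."""
    device = sar.device
    betas = torch.linspace(1e-4, 0.02, steps, device=device)
    alphas = 1.0 - betas
    abar = torch.cumprod(alphas, dim=0)
    x = torch.randn(sar.shape[0], 3, *sar.shape[2:], device=device)  # EO init
    for t in reversed(range(steps)):
        eps = eps_model(x, sar, t)
        mean = (x - betas[t] / torch.sqrt(1 - abar[t]) * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x  # synthesized EO image
```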