eess.IV - 2023-11-03

Quantitative Evaluation of a Multi-Modal Camera Setup for Fusing Event Data with RGB Images

  • paper_url: http://arxiv.org/abs/2311.01881
  • repo_url: None
  • paper_authors: Julian Moosmann, Jakub Mandula, Philipp Mayer, Luca Benini, Michele Magno
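  • for: The paper proposes a multi-modal camera setup that fuses high-resolution DVS data with RGB image data so that both technologies can be used simultaneously.
  • methods: The paper analyzes several time-based synchronization methods for matching DVS data to RGB image data, and evaluates calibration accuracy, camera alignment, and lens impact.
  • results: With the best synchronization method, the system achieves an image calibration error of less than 0.90px and a pixel cross-correlation deviation of 1.6px. A lens with 8mm focal length enables detection of 30cm objects at a distance of 350m against a homogeneous background.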
    Abstract Event-based cameras, also called silicon retinas, potentially revolutionize computer vision by detecting and reporting significant changes in intensity as asynchronous events, offering extended dynamic range, low latency, and low power consumption, enabling a wide range of applications from autonomous driving to long-term surveillance. As an emerging technology, there is a notable scarcity of publicly available datasets for event-based systems that also feature frame-based cameras, in order to exploit the benefits of both technologies. This work quantitatively evaluates a multi-modal camera setup for fusing high-resolution DVS data with RGB image data by static camera alignment. The proposed setup, which is intended for semi-automatic DVS data labeling, combines two recently released Prophesee EVK4 DVS cameras and one global shutter XIMEA MQ022CG-CM RGB camera. After alignment, state-of-the-art object detection or segmentation networks label the image data, and the bounding boxes or labeled pixels are mapped directly to the aligned events. To facilitate this process, various time-based synchronization methods for DVS data are analyzed, and calibration accuracy, camera alignment, and lens impact are evaluated. Experimental results demonstrate the benefits of the proposed system: the best synchronization method yields an image calibration error of less than 0.90px and a pixel cross-correlation deviation of 1.6px, while a lens with 8mm focal length enables detection of objects with size 30cm at a distance of 350m against a homogeneous background.
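
The label-transfer step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the static alignment is available as a 3x3 homography `H` from RGB pixel coordinates to DVS pixel coordinates, and that events are stored as rows of `(x, y, t, polarity)`. The abstract does not state the alignment model or the event layout, so `H`, `map_box_to_dvs`, `label_events`, and all numeric values are hypothetical.

```python
import numpy as np

def map_box_to_dvs(box_xyxy, H):
    """Project an RGB bounding box (x_min, y_min, x_max, y_max) into DVS pixels.

    H is an assumed 3x3 homography from RGB to DVS pixel coordinates.
    Returns the axis-aligned box that encloses the four projected corners.
    """
    x0, y0, x1, y1 = box_xyxy
    corners = np.array([[x0, y0, 1.0], [x1, y0, 1.0],
                        [x1, y1, 1.0], [x0, y1, 1.0]])
    proj = corners @ H.T
    proj = proj[:, :2] / proj[:, 2:3]  # dehomogenize
    return tuple(float(v) for v in (proj[:, 0].min(), proj[:, 1].min(),
                                    proj[:, 0].max(), proj[:, 1].max()))

def label_events(events, box_dvs, t_start, t_end):
    """Boolean mask of events inside the box and inside [t_start, t_end)."""
    x, y, t = events[:, 0], events[:, 1], events[:, 2]
    bx0, by0, bx1, by1 = box_dvs
    in_box = (x >= bx0) & (x <= bx1) & (y >= by0) & (y <= by1)
    in_time = (t >= t_start) & (t < t_end)
    return in_box & in_time

# Hypothetical alignment: scale by 0.5, then shift by (10, 20) pixels.
H = np.array([[0.5, 0.0, 10.0],
              [0.0, 0.5, 20.0],
              [0.0, 0.0, 1.0]])
box_dvs = map_box_to_dvs((100, 200, 300, 400), H)

# Hypothetical events: (x, y, t in seconds, polarity).
events = np.array([[100, 150, 0.001, 1],
                   [200, 150, 0.002, 0],
                   [100, 150, 0.500, 1]])
mask = label_events(events, box_dvs, t_start=0.0, t_end=0.01)
print(box_dvs, mask)
```

The time window stands in for the frame exposure interval, which is why the quality of the time-based synchronization matters: a timestamp offset between the cameras shifts which events fall into each frame's window.
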
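    Summary Event-based cameras, also called silicon retinas, have the potential to revolutionize computer vision because they detect and report significant intensity changes as asynchronous events, offering extended dynamic range, low latency, and low power consumption for applications ranging from autonomous driving to long-term surveillance. As this is an emerging technology, publicly available datasets that combine event-based systems with frame-based cameras are still rare. This work evaluates a multi-modal camera setup that fuses high-resolution DVS data with RGB image data through static camera alignment. The setup is designed for semi-automatic DVS data labeling and combines two recently released Prophesee EVK4 DVS cameras with one global shutter XIMEA MQ022CG-CM RGB camera. After alignment, state-of-the-art object detection or segmentation networks label the image data, and the labels are mapped to the aligned events. To this end, several time-based synchronization methods are analyzed, and calibration accuracy, camera alignment, and lens impact are evaluated. Experimental results show that the best synchronization method yields an image calibration error below 0.90px and a pixel cross-correlation deviation of 1.6px, while a lens with 8mm focal length can detect 30cm objects at a distance of 350m.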
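
As a rough plausibility check on the reported detection range, the pinhole camera model gives the size of the object's image on the sensor as focal length times object size divided by distance. The sketch below converts that to pixels. The pixel pitch of 4.86 µm is an assumption about the sensor in the Prophesee EVK4 and is not stated in the abstract; `object_extent_px` is a hypothetical helper name.

```python
def object_extent_px(object_size_m, distance_m, focal_length_m, pixel_pitch_m):
    """Extent of an object's image on the sensor, in pixels (pinhole model)."""
    image_size_m = focal_length_m * object_size_m / distance_m
    return image_size_m / pixel_pitch_m

# Assumption: 4.86 um pixel pitch (not given in the abstract).
PIXEL_PITCH_M = 4.86e-6

extent = object_extent_px(object_size_m=0.30, distance_m=350.0,
                          focal_length_m=8e-3, pixel_pitch_m=PIXEL_PITCH_M)
print(f"{extent:.2f} px")  # prints "1.41 px"
```

Under this assumption the object covers only about one to two pixels, which is consistent with the restriction to a homogeneous background in the reported result.
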

3-Dimensional residual neural architecture search for ultrasonic defect detection

  • paper_url: http://arxiv.org/abs/2311.01867
  • repo_url: None
  • paper_authors: Shaun McKnight, Christopher MacKinnon, S. Gareth Pierce, Ehsan Mohseni, Vedran Tunukovic, Charles N. MacLeod, Randika K. W. Vithanage, Tom OHare
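  • for: The study detects defects in carbon fiber reinforced polymer composites with deep learning, applying 3D convolutional neural networks to volumetric ultrasonic testing data.
  • methods: A synthetic data generation method is extended to volumetric data. Preserving the complete volumetric data reduces complex preprocessing and lets the network use spatial and temporal information that is lost during imaging.
  • results: Three architectures are compared: a hand-designed network that reduces dimensionality in the time domain and uses cubed kernels, a hand-designed network with cuboidal kernels, and a 3D residual network found by neural architecture search. Convolutional layers for dimensionality reduction consistently outperform max pooling, and domain-specific augmentation during training significantly improves performance for all architectures.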
    Abstract This study presents a deep learning methodology using 3-dimensional (3D) convolutional neural networks to detect defects in carbon fiber reinforced polymer composites through volumetric ultrasonic testing data. Acquiring large amounts of ultrasonic training data experimentally is expensive and time-consuming. To address this issue, a synthetic data generation method was extended to incorporate volumetric data. By preserving the complete volumetric data, complex preprocessing is reduced, and the model can utilize spatial and temporal information that is lost during imaging. This enables the model to utilize important features that might be overlooked otherwise. The performance of three architectures was compared. The first two architectures were hand-designed to address the high aspect ratios between the spatial and temporal dimensions. The first architecture reduced dimensionality in the time domain and used cubed kernels for feature extraction. The second architecture used cuboidal kernels to account for the large aspect ratios. The evaluation included comparing the use of max pooling and convolutional layers for dimensionality reduction, with the fully convolutional layers consistently outperforming the models using max pooling. The third architecture was generated through neural architecture search from a modified 3D Residual Neural Network (ResNet) search space. Additionally, domain-specific augmentation methods were incorporated during training, resulting in significant improvements in model performance for all architectures. The mean accuracy improvements ranged from 8.2% to 22.4%. The best performing models achieved mean accuracies of 91.8%, 92.2%, and 100% for the reduction, constant, and discovered architectures, respectively, whilst maintaining a model size smaller than most 2-dimensional (2D) ResNets.
    Summary Three architectures were compared. The first two were hand-designed to address the high aspect ratios between the spatial and temporal dimensions: the first reduced dimensionality in the time domain and used cubed kernels for feature extraction, while the second used cuboidal kernels to account for the large aspect ratios. The third architecture was generated through neural architecture search from a modified 3D Residual Neural Network (ResNet) search space. During training, domain-specific augmentation methods were incorporated, resulting in significant improvements in model performance for all architectures. The mean accuracy improvements ranged from 8.2% to 22.4%. The best-performing models achieved mean accuracies of 91.8%, 92.2%, and 100% for the reduction, constant, and discovered architectures, respectively, while maintaining a model size smaller than most 2D ResNets.
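
The trade-off between cubed kernels, cuboidal kernels, and the two ways of reducing the time dimension can be made concrete with the standard output-size formula for convolution and pooling layers, `(n + 2p - k) // s + 1`. The sketch below is illustrative only: the volume shape, kernel sizes, strides, and channel counts are hypothetical and are not taken from the paper.

```python
def conv_out(n, k, s=1, p=0):
    """Output length of a convolution or pooling layer along one axis."""
    return (n + 2 * p - k) // s + 1

def conv3d_out_shape(shape, kernel, stride=(1, 1, 1), padding=(0, 0, 0)):
    """Output shape of a 3D convolution or pooling layer, axis by axis."""
    return tuple(conv_out(n, k, s, p)
                 for n, k, s, p in zip(shape, kernel, stride, padding))

def conv3d_params(c_in, c_out, kernel):
    """Trainable parameters of a 3D convolution with bias."""
    kx, ky, kt = kernel
    return c_out * (c_in * kx * ky * kt + 1)

# Hypothetical ultrasonic volume: 64 x 64 scan positions, 1024 time samples.
volume = (64, 64, 1024)

# Cubed kernel: same extent on every axis, shape preserved by padding.
cubed = conv3d_out_shape(volume, (3, 3, 3), padding=(1, 1, 1))

# Cuboidal kernel: longer along time, strided to shrink the time axis.
cuboidal = conv3d_out_shape(volume, (3, 3, 9), stride=(1, 1, 4), padding=(1, 1, 4))

# Max pooling along time gives the same shape reduction with no parameters.
pooled = conv3d_out_shape(volume, (1, 1, 4), stride=(1, 1, 4))

print(cubed, conv3d_params(1, 8, (3, 3, 3)))     # (64, 64, 1024) 224
print(cuboidal, conv3d_params(1, 8, (3, 3, 9)))  # (64, 64, 256) 656
print(pooled)                                    # (64, 64, 256)
```

A strided convolution and a max pooling layer can produce the same output shape, but only the convolution has weights that are learned during training.
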

Neural SPDE solver for uncertainty quantification in high-dimensional space-time dynamics

  • paper_url: http://arxiv.org/abs/2311.01783
  • repo_url: None
  • paper_authors: Maxime Beauchamp, Ronan Fablet, Hugo Georgenthum
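  • for: The paper addresses interpolation and data assimilation for large geophysical datasets.
  • methods: The paper uses the link between Stochastic Partial Differential Equations (SPDE) and Gaussian Markov Random Fields (GMRF) to handle large datasets, relying on sparse precision matrices for interpolation.
  • results: The proposed solution improves on the Optimal Interpolation (OI) baseline and quantifies the associated uncertainties. It is combined with neural networks to enable data assimilation and online parameter estimation.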
    Abstract Historically, the interpolation of large geophysical datasets has been tackled using methods like Optimal Interpolation (OI) or model-based data assimilation schemes. However, the recent connection between Stochastic Partial Differential Equations (SPDE) and Gaussian Markov Random Fields (GMRF) introduced a novel approach to handle large datasets making use of sparse precision matrices in OI. Recent advancements in deep learning also addressed this issue by incorporating data assimilation into neural architectures: the reconstruction task is treated as a joint learning problem involving both prior model and solver as neural networks. However, this approach requires further developments to quantify the associated uncertainties. In our work, we leverage SPDE-based Gaussian Processes to estimate complex prior models capable of handling nonstationary covariances in space and time. We develop a specific architecture able to learn both state and SPDE parameters as a neural SPDE solver, while providing the precision-based analytical form of the SPDE sampling. The latter is used as a surrogate model along the data assimilation window. Because the prior is stochastic, we can easily draw samples from it and condition the members by our neural solver, allowing flexible estimation of the posterior distribution based on a large ensemble. We demonstrate this framework on realistic Sea Surface Height datasets. Our solution improves the OI baseline and aligns with the neural prior while enabling uncertainty quantification and online parameter estimation.
    Summary In the past, the interpolation of large geophysical datasets has been tackled with methods such as Optimal Interpolation (OI) or model-based data assimilation schemes. However, the recent connection between Stochastic Partial Differential Equations (SPDE) and Gaussian Markov Random Fields (GMRF) introduced a new approach to handling large datasets using sparse precision matrices in OI. Recent advances in deep learning have also addressed this problem by incorporating data assimilation into neural architectures: the reconstruction task is treated as a joint learning problem in which both the prior model and the solver are neural networks. This approach still requires further development to quantify the associated uncertainties. In this work, the authors use SPDE-based Gaussian Processes to estimate complex prior models capable of handling nonstationary covariances in space and time. They develop a specific architecture that learns both the state and the SPDE parameters as a neural SPDE solver, while providing the analytical form of the SPDE sampling. The latter is used as a surrogate model along the data assimilation window. Because the prior is stochastic, samples can easily be drawn from it and conditioned with the neural solver, which allows flexible estimation of the posterior distribution from a large ensemble. The framework is demonstrated on realistic Sea Surface Height datasets. The solution improves on the OI baseline, aligns with the neural prior, and enables uncertainty quantification and online parameter estimation.
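
The precision-matrix form of OI and the ensemble conditioning mentioned above can be sketched on a toy 1D grid. This is not the paper's neural solver: it uses a simple stationary precision `Q = tau * (kappa^2 I + L)`, with `L` a graph Laplacian, as a stand-in for an SPDE discretization, and classical conditioning by kriging in place of the learned solver. All function names, parameters, and values are hypothetical, and dense NumPy matrices are used for brevity where a real implementation would use sparse ones.

```python
import numpy as np

def spde_precision_1d(n, kappa, tau=1.0):
    """Tridiagonal precision Q = tau * (kappa^2 I + L) on a unit-spaced 1D grid."""
    lap = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    lap[0, 0] = lap[-1, -1] = 1.0  # free (Neumann-like) boundaries
    return tau * (kappa**2 * np.eye(n) + lap)

def oi_posterior(Q, obs_idx, y, noise_var):
    """Zero-mean prior with precision Q, point observations y at obs_idx.

    Posterior precision is Q + H^T R^-1 H; posterior mean solves
    Q_post m = H^T R^-1 y, with R = noise_var * I.
    """
    n = Q.shape[0]
    H = np.zeros((len(obs_idx), n))
    H[np.arange(len(obs_idx)), obs_idx] = 1.0
    Q_post = Q + H.T @ H / noise_var
    mean = np.linalg.solve(Q_post, H.T @ y / noise_var)
    return mean, Q_post

def sample_gmrf(Q, n_samples, rng):
    """Draw zero-mean samples with covariance Q^-1 via Cholesky, Q = L L^T."""
    L = np.linalg.cholesky(Q)
    z = rng.standard_normal((Q.shape[0], n_samples))
    return np.linalg.solve(L.T, z)  # x = L^-T z has covariance Q^-1

def condition_ensemble(prior_samples, Q_post, obs_idx, y, noise_var, rng):
    """Conditioning by kriging: shift each prior member toward perturbed obs."""
    n_obs, n_members = len(y), prior_samples.shape[1]
    perturbed = y[:, None] + np.sqrt(noise_var) * rng.standard_normal((n_obs, n_members))
    rhs = np.zeros_like(prior_samples)
    rhs[obs_idx, :] = (perturbed - prior_samples[obs_idx, :]) / noise_var
    return prior_samples + np.linalg.solve(Q_post, rhs)

rng = np.random.default_rng(0)
Q = spde_precision_1d(n=50, kappa=0.3)
obs_idx = np.array([5, 25, 45])
y = np.array([1.0, -0.5, 2.0])
noise_var = 1e-4

mean, Q_post = oi_posterior(Q, obs_idx, y, noise_var)
prior_var = np.diag(np.linalg.inv(Q))
post_var = np.diag(np.linalg.inv(Q_post))

ensemble = condition_ensemble(sample_gmrf(Q, 200, rng), Q_post, obs_idx, y, noise_var, rng)
print(mean[obs_idx], post_var[obs_idx])
```

The spread of the conditioned ensemble gives a sample-based estimate of the posterior uncertainty, which is the role the ensemble plays in the framework described above.
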