Publications | Prof. Gabriele Cavallaro

Academic papers

Recent Publications

Journals

C. Gomes, I. Wittmann, D. Robert, J. Jakubik, T. Reichelt, S. Maurogiovanni, R. Vinge, J. Hurst, E. Scheurer, R. Sedona, T. Brunschwiler, S. Kesselheim, M. Batic, P. Stier, J. D. Wegner, G. Cavallaro, E. Pebesma, M. Marszalek, M. A. Belenguer-Plomer, K. Adriko, P. Fraccaro, R. Kienzler, R. Briq, S. Benassou, M. Lazzarini, C. M. Albrecht.

IEEE Geoscience and Remote Sensing Magazine

Over the past decades, there has been an explosion in the amount of available Earth observation (EO) data. The unprecedented coverage of Earth’s surface and atmosphere by satellite imagery has resulted in large volumes of data that must be transmitted to ground stations, stored in data centers, and distributed to end users. Modern Earth system models (ESMs) face similar challenges, operating at high spatial and temporal resolutions, producing petabytes of data per simulated day. Data compression has gained relevance over the past decade, with neural compression (NC) emerging from deep learning and information theory, making EO data and ESM outputs ideal candidates because of their abundance of unlabeled data. In this review, we outline recent developments in NC applied to geospatial data. We introduce the fundamental concepts of NC, including seminal works in its traditional applications to image and video compression domains with a focus on lossy compression. We discuss the unique characteristics of EO and ESM data, contrasting them with “natural images,” and we explain the additional challenges and opportunities they present. Additionally, we review current applications of NC across various EO modalities and explore the limited efforts in ESM compression to date. The advent of self-supervised learning (SSL) and foundation models (FMs) has advanced methods to efficiently distill representations from vast amounts of unlabeled data. We connect these developments to NC for EO, highlighting the similarities between the two fields and elaborate on the potential of transferring compressed feature representations for machine-to-machine communication. Based on insights drawn from this review, we devise future directions relevant to applications in EO and ESMs.

To the publication

D. Szwarcman, S. Roy, P. Fraccaro, Þ. Elí Gíslason, B. Blumenstiel, R. Ghosal, P. Henrique de Oliveira, J. Lucas de Sousa Almeida, R. Sedona, Y. Kang, S. Chakraborty, S. Wang, C. Gomes, A. Kumar, M. Truong, D. Godwin, H. Lee, C. Hsu, A. Akbari Asanjan, B. Mujeci, D. Shidham, T. Keenan, P. Arevalo, W. Li, H. Alemohammad, P. Olofsson, C. Hain, R. Kennedy, B. Zadrozny, D. Bell, G. Cavallaro, C. Watson, M. Maskey, R. Ramachandran, J. Bernabe Moreno.

arXiv

This technical report presents Prithvi-EO-2.0, a new geospatial foundation model that offers significant improvements over its predecessor, Prithvi-EO-1.0. Trained on 4.2M global time series samples from NASA's Harmonized Landsat and Sentinel-2 data archive at 30m resolution, the new 300M and 600M parameter models incorporate temporal and location embeddings for enhanced performance across various geospatial tasks. Through extensive benchmarking with GEO-Bench, the 600M version outperforms the previous Prithvi-EO model by 8\% across a range of tasks. It also outperforms six other geospatial foundation models when benchmarked on remote sensing tasks from different domains and resolutions (i.e. from 0.1m to 15m). The results demonstrate the versatility of the model in both classical earth observation and high-resolution applications. Early involvement of end-users and subject matter experts (SMEs) are among the key factors that contributed to the project's success. In particular, SME involvement allowed for constant feedback on model and dataset design, as well as successful customization for diverse SME-led applications in disaster response, land use and crop mapping, and ecosystem dynamics monitoring. Prithvi-EO-2.0 is available on Hugging Face and IBM terratorch, with additional resources on GitHub. The project exemplifies the Trusted Open Science approach embraced by all involved organizations.

‍

To the publication

J. Arnold, J. P. G. Hermosillo Muriedas, S. Nassyr, R. Sedona, M. Götz, A. Streit, M. Riedel, G. Cavallaro

IEEE Access

Clustering in data mining involves grouping similar objects into categories based on their characteristics. As the volume of data continues to grow and advancements in high-performance computing evolve, a critical need has emerged for algorithms that can efficiently process these computations and exploit the various levels of parallelism offered by modern supercomputing systems. Exploiting Single Instruction Multiple Data (SIMD) instructions enhances parallelism at the instruction level and minimizes data movement within the memory hierarchy. To fully harness a processor’s SIMD capabilities and achieve optimal performance, adapting algorithms for better compatibility with vector operations is necessary. In this paper, we introduce a vectorized implementation of the Density-based Clustering for Applications with Noise (DBSCAN) algorithm suitable for the execution on both shared and distributed memory systems. By leveraging SIMD, we enhance the performance of distance computations. Our proposed Vectorized HPDBSCAN (VHPDBSCAN) demonstrates a performance improvement of up to two times over the state-of-the-art parallel version, Highly Parallel DBSCAN (HPDBSCAN), on the ARM-based A64FX processor on two different datasets with varying dimensions. We have parallelized computations which are essential for the efficient workload distribution. This has significantly enhanced the performance on higher dimensional datasets. Additionally, we evaluate VHPDBSCAN’s energy consumption on the A64FX and Intel Xeon processors. The results show that in both processors, due to the reduced runtime, the total energy consumption of the application is reduced by 50% on the A64FX Central Processing Unit (CPU) and by approximately 19% on the Intel Xeon 8368 CPU compared to HPDBSCAN.

To the publication

S. Moreno-Álvarez, M. E. Paoletti, G. Cavallaro, J. A. Rico, J. M. Haut

IEEE Geoscience and Remote Sensing Letters

Land-cover classification methods are based on the processing of large image volumes to accurately extract representative features. Particularly, convolutional models provide notable characterization properties for image classification tasks. Distributed learning mechanisms on high-performance computing platforms have been proposed to speed up the processing, while achieving an efficient feature extraction. High-performance computing platforms are commonly composed of a combination of central processing units (CPUs) and graphics processing units (GPUs) with different computational capabilities. As a result, current homogeneous workload distribution techniques for deep learning (DL) become obsolete due to their inefficient use of computational resources. To address this, new computational balancing proposals, such as heterogeneous data parallelism, have been implemented. Nevertheless, these techniques should be improved to handle the peculiarities of working with heterogeneous data workloads in the training of distributed DL models. The objective of handling heterogeneous workloads for current platforms motivates the development of this work. This letter proposes an innovative heterogeneous gradient calculation applied to land-cover classification tasks through convolutional models, considering the data amount assigned to each device in the platform while maintaining the acceleration. Extensive experimentation has been conducted on multiple datasets, considering different deep models on heterogeneous platforms to demonstrate the performance of the proposed methodology.

To the publication

R. Sedona, L. Hoffmann, R. Spang, G. Cavallaro, S. Griessbach, M. Höpfner, M. Book, M. Riedel

Atmospheric Measurement Techniques

Polar stratospheric clouds (PSCs) play a key role in polar ozone depletion in the stratosphere. Improved observations and continuous monitoring of PSCs can help to validate and improve chemistry–climate models that are used to predict the evolution of the polar ozone hole. In this paper, we explore the potential of applying machine learning (ML) methods to classify PSC observations of infrared limb sounders. Two datasets were considered in this study. The first dataset is a collection of infrared spectra captured in Northern Hemisphere winter 2006/2007 and Southern Hemisphere winter 2009 by the Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) instrument on board the European Space Agency's (ESA) Envisat satellite. The second dataset is the cloud scenario database (CSDB) of simulated MIPAS spectra. We first performed an initial analysis to assess the basic characteristics of the CSDB and to decide which features to extract from it. Here, we focused on an approach using brightness temperature differences (BTDs). From both the measured and the simulated infrared spectra, more than 10 000 BTD features were generated. Next, we assessed the use of ML methods for the reduction of the dimensionality of this large feature space using principal component analysis (PCA) and kernel principal component analysis (KPCA) followed by a classification with the support vector machine (SVM). The random forest (RF) technique, which embeds the feature selection step, has also been used as a classifier. All methods were found to be suitable to retrieve information on the composition of PSCs. Of these, RF seems to be the most promising method, being less prone to overfitting and producing results that agree well with established results based on conventional classification methods.

To the publication

J. M. Haut, J. A. Gallardo, M. E. Paoletti, G. Cavallaro, J. Plaza, A. Plaza and M. Riedel

IEEE Transactions on Geoscience and Remote Sensing

Advances in remote sensing hardware have led to a significantly increased capability for high-quality data acquisition, which allows the collection of remotely sensed images with very high spatial, spectral, and radiometric resolution. This trend calls for the development of new techniques to enhance the way that such unprecedented volumes of data are stored, processed, and analyzed. An important approach to deal with massive volumes of information is data compression, related to how data are compressed before their storage or transmission. For instance, hyperspectral images (HSIs) are characterized by hundreds of spectral bands. In this sense, high-performance computing (HPC) and high-throughput computing (HTC) offer interesting alternatives. Particularly, distributed solutions based on cloud computing can manage and store huge amounts of data in fault-tolerant environments, by interconnecting distributed computing nodes so that no specialized hardware is needed. This strategy greatly reduces the processing costs, making the processing of high volumes of remotely sensed data a natural and even cheap solution. In this paper, we present a new cloud-based technique for spectral analysis and compression of HSIs. Specifically, we develop a cloud implementation of a popular deep neural network for non-linear data compression, known as autoencoder (AE). Apache Spark serves as the backbone of our cloud computing environment by connecting the available processing nodes using a master-slave architecture. Our newly developed approach has been tested using two widely available HSI data sets. Experimental results indicate that cloud computing architectures offer an adequate solution for managing big remotely sensed data sets.

To the publication

G. Cavallaro, M. Riedel, M. Richerzhagen, J. A. Benediktsson and A. Plaza

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (J-STARS)

Owing to the recent development of sensor resolutions onboard different Earth observation platforms, remote sensing is an important source of information for mapping and monitoring natural and man-made land covers. Of particular importance is the increasing amounts of available hyperspectral data originating from airborne and satellite sensors such as AVIRIS, HyMap, and Hyperion with very high spectral resolution (i.e., high number of spectral channels) containing rich information for a wide range of applications. A relevant example is the separation of different types of land-cover classes using the data in order to understand, e.g., impacts of natural disasters or changing of city buildings over time. More recently, such increases in the data volume, velocity, and variety of data contributed to the term big data that stand for challenges shared with many other scientific disciplines. On one hand, the amount of available data is increasing in a way that raises the demand for automatic data analysis elements since many of the available data collections are massively underutilized lacking experts for manual investigation. On the other hand, proven statistical methods (e.g., dimensionality reduction) driven by manual approaches have a significant impact in reducing the amount of big data toward smaller smart data contributing to the more recently used terms data value and veracity (i.e., less noise, lower dimensions that capture the most important information). This paper aims to take stock of which proven statistical data mining methods in remote sensing are used to contribute to smart data analysis processes in the light of possible automation as well as scalable and parallel processing techniques. We focus on parallel support vector machines (SVMs) as one of the best out-of-the-box classification methods.

To the publication

Conference Papers

M. Riedel, M. Book, H. Neukirchen, G. Cavallaro, A. Lintermann

45th Jubilee International Convention on Information, Communication and Electronic Technology (MIPRO)

High-Performance Computing (HPC) can quickly process scientific data and perform complex calculations at extremely high speeds. A vast increase in HPC use across scientific communities is observed, especially in using parallel data science methods to speed-up scientific applications. HPC enables scaling up machine and deep learning algorithms that inherently solve optimization problems. More recently, the field of quantum machine learning evolved as another HPC related approach to speed-up data science methods. This paper will address primarily traditional HPC and partly the new quantum machine learning aspects, whereby the latter specifically focus on our experiences on using quantum annealing at the Juelich Supercomputing Centre (JSC). Quantum annealing is particularly effective for solving optimization problems like those that are inherent in machine learning methods. We contrast these new experiences with our lessons learned of using many parallel data science methods with a high number of Graphical Processing Units (GPUs). That includes modular supercomputers such as JUWELS, the fastest European supercomputer at the time of writing. Apart from practice and experience with HPC co-design applications, technical challenges and solutions are discussed, such as using interactive access via JupyterLab on typical batch-oriented HPC systems or enabling distributed training tools for deep learning on our HPC systems.

‍

To the publication

C. Barakat, M. Riedel, S. Brynjólfsson, G. Cavallaro, J. Busch, R. Sedona

44th International Convention on Information, Communication and Electronic Technology (MIPRO)

Given the Covid-19 pandemic, the retail industry shifts many business models to enable more online purchases that produce large transaction data quantities (i.e., big data). Data science methods infer seasonal trends about products from this data and spikes in purchases, the effectiveness of advertising campaigns, or brand loyalty but require extensive processing power leveraging High-Performance Computing to deal with large transaction datasets. This paper proposes an High-Performance Computing-based expert system architectural design tailored for ‘big data analysis’ in the retail industry, providing data science methods and tools to speed up the data analysis with conceptual interoperability to commercial cloud-based services. Our expert system leverages an innovative Modular Supercomputer Architecture to enable the fast analysis by using parallel and distributed algorithms such as association rule mining (i.e., FP-Growth) and recommender methods (i.e., collaborative filtering). It enables the seamless use of accelerators of supercomputers or cloud-based systems to perform automated product tagging (i.e., residual deep learning networks for product image analysis) to obtain colour, shapes automatically, and other product features. We validate our expert system and its enhanced knowledge representation with commercial datasets obtained from our ON4OFF research project in a retail case study in the beauty sector.

To the publication

M. Riedel, R. Sedona, C. Barakat, P. Einarsson, R. Hassanian, G. Cavallaro, M. Book, H. Neukirchen and A. Lintermann

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

We observe a continuously increased use of Deep Learning (DL) as a specific type of Machine Learning (ML) for data-intensive problems (i.e., ’big data’) that requires powerful computing resources with equally increasing performance. Consequently, innovative heterogeneous High-Performance Computing (HPC) systems based on multi-core CPUs and many-core GPUs require an architectural design that addresses end user communities’ requirements that take advantage of ML and DL. Still the workloads of end user communities of the simulation sciences (e.g., using numerical methods based on known physical laws) needs to be equally supported in those architectures. This paper offers insights into the Modular Supercomputer Architecture (MSA) developed in the Dynamic Exascale Entry Platform (DEEP) series of projects to address the requirements of both simulation sciences and data-intensive sciences such as High Performance Data Analytics (HPDA). It shares insights into implementing the MSA in the Jülich Supercomputing Centre (JSC) hosting Europe No. 1 Supercomputer Jülich Wizard for European Leadership Science (JUWELS). We augment the technical findings with experience and lessons learned from two application communities case studies (i.e., remote sensing and health sciences) using the MSA with JUWELS and the DEEP systems in practice. Thus, the paper provides details into specific MSA design elements that enable significant performance improvements of ML and DL algorithms. While this paper focuses on MSA-based HPC systems and application experience, we are not losing sight of advances in Cloud Computing (CC) and Quantum Computing (QC) relevant for ML and DL.

To the publication

Recent Publications

Journals

Conference Papers

Abstracts

Magazine articles

Book chapters

Recent Publications

Journals

Lossy Neural Compression for Geospatial Analytics: A review

Prithvi-EO-2.0: A Versatile Multi-Temporal Foundation Model for Earth Observation Applications [preprint - not peer reviewed]

Vectorized Highly Parallel Density-Based Clustering for Applications With Noise

Local Binary and Multiclass SVMs Trained on a Quantum Annealer

Sen4Map: Advancing Mapping with Sentinel-2 by Providing Detailed Semantic Descriptions and Customizable Land-Use and Land-Cover Data

Kernel Approximation on a Quantum Annealer for Remote Sensing Regression Tasks

A Single-Step Multiclass SVM based on Quantum Annealing for Remote Sensing Data Classification

Toward the Production of Spatiotemporally Consistent Annual Land Cover Maps using Sentinel-2 Time Series

Enhancing Distributed Neural Network Training Through Node-Based Communications

Deep Learning-based 3D Surface Reconstruction - A Survey

Few-Shot Remote Sensing Image Classification with Meta-Learning [preprint - not peer reviewed]

Learning from Data for Remote Sensing Image Analysis

Quantum SVR for Chlorophyll Concentration Estimation in Water with Remote Sensing

Remote Sensing Image Classification Using CNNs With Balanced Gradient for Distributed Heterogeneous Computing

Predicting Classification Performance for Benchmark Hyperspectral Datasets

A High-Performance Multispectral Adaptation GAN for Harmonizing Dense Time Series of Landsat-8 and Sentinel-2 images

Exploration of Machine Learning Methods for the Classification of Infrared Limb Spectra of Polar Stratospheric Clouds

Remote Sensing Big Data Classification with High Performance Distributed Deep Learning

Cloud Deep Networks for Hyperspectral Image Analysis

Parallel Computation of Component Trees on Distributed Memory Machines

Integration of LiDAR and Hyperspectral Data for Land-cover Classification: A Case Study

Automatic Attribute Profiles

Remote Sensing Image Classification Using Attribute Filters Defined over the Tree of Shapes

On Understanding Big Data Impacts in Remotely Sensed Image Classification Using Support Vector Machine Methods

Extended Self-Dual Attribute Profiles for the Classification of Hyperspectral Images

Automatic Framework for Spectral–Spatial Classification Based on Supervised Feature Extraction and Morphological Attribute Profiles

Conference Papers

Enhancing Land Cover Mapping: A Novel Automatic Approach to Improve Mixed Spectral Pixel Classification

A CNN Architecture Tailored for Quantum Feature Map-Based Radar Sounder Signal Segmentation

Supporting Seismic Data Survey Design Through the Integration of Satellite-Based Land Cover Maps

Reverse Quantum Annealing for Hybrid Quantum-Classical Satellite Mission Planning

A Hybrid Quantum-Classical CNN Architecture for Semantic Segmentation of Radar Sounder Data

Quantum Annealing for Semantic Segmentation in Remote Sensing: Potential and Limitations

Accuracy Assessment of Land-Use-Land-Cover Maps: the Semantic Gap between in Situ and Satellite Data

Adiabatic Quantum Kitchen Sinks With Parallel Annealing For Remote Sensing Regression Problems

End-to-End Process Orchestration of Earth Observation Data Workflows with Apache Airflow on High Performance Computing

Challenges and Opportunities in the Adoption of High Performance Computing for Earth Observation Applications in the Exascale Era

Enhancing Training Set Through Multi-temporal Attention Analysis in Transformers for Multi-Year Land Cover Mapping

Hybrid Quantum-Classical Workflows in Modular Supercomputing Architectures with the Jülich Unified Infrastructure for Quantum Computing

Improving Generalization for Few-Shot Remote Sensing Classification with Meta-learning

Quantum Support Vector Regression for Biophysical Variable Estimation in Remote Sensing

Optimizing Distributed Deep Learning in Heterogeneous Computing Platforms for Remote Sensing Data Classification

An Automatic Approach for the production of a Time Series of Consistent Land-cover Maps Based on Long-short Term Memory

Accelerating Hyperparameter Tuning of a Deep Learning Model for Remote Sensing Image Classification

Practice and Experience using High Performance Computing and Quantum Computing to Speed-up Data Science Methods in Scientific Applications

JUWELS Booster–A Supercomputer for Large-Scale AI Research

Design and Evaluation of an HPC-based Expert System to speed-up Retail Data Analysis using Residual Networks Combined with Parallel Association Rule Mining and Scalable Recommenders

Practice and Experience in using Parallel and Scalable Machine Learning with Heterogenous Modular Supercomputing Architectures

Enhancing Large Batch Size Training of Deep Models for Remote Sensing Applications

Practice and Experience in using Parallel and Scalable Machine Learning in Remote Sensing from HPC over Clouds to Quantum Computing

Quantum Support Vector Machine Algorithms for Remote Sensing Data Classification

Scaling up a Multispectral Resnet-50 to 128 GPUs

Super-Resolution of Large Volumes of Sentinel-2 Images with High Performance Distributed Deep Learning

Approaching Remote Sensing Image Classification with Ensembles of Support Vector Machines on the D-WAVE Quantum Annealer

Remote Sensing Data Analytics with the Udocker Container Tool using Multi-GPU Deep Learning Systems

Scalable workflows for Remote Sensing Data Processing with the DEEP-EST Modular Supercomputing Architecture

Multi-Scale Convolutional SVM Networks for Multi-Class Classification Problems of Remote Sensing Images

Scaling Support Vector Machines Towards Exascale Computing for Classification of Large-Scale High-Resolution Remote Sensing Images

The Influence of Sampling Methods on Pixel-Wise Hyperspectral Image Classification with 3D Convolutional Neural Networks

Automated Analysis of Remotely Sensed Images Using the UNICORE Workflow Management System

Tree-Based Supervised Feature Extraction Method Based on Self-Dual Attribute Profiles

Facilitating Efficient Data Analysis of Remotely Sensed Images Using Standards-Based Parameter Sweep Models

Region-Based Classification of Remote Sensing Images with the Morphological Tree of Shapes

Unsupervised Change Detection Analysis to Multi-Channel Scenario based on Morphological Contextual Analysis

Scalable Developments for Big Data Analytics in Remote Sensing

Automatic Morphological Attribute Profiles

Automatic Threshold Selection for Profiles of Attribute Filters Based on Granulometric Characteristic Functions

Processing High Resolution Images of Urban Areas with Self-Dual Attribute Filters (Invited Paper)

An Advanced Classifier for the Joint Use of LiDAR and Hyperspectral Data: Case Study in Queensland, Australia

On Scalable Data Mining Techniques for Earth Science

Smart Data Analytics Methods for Remote Sensing Applications

A Comparison of Self-Dual Attribute Profiles Based on Different Filter Rules for Classification

Detection of Hedges Based on Attribute Filters

Abstracts

Downstream Applications on Prithvi: A Foundation Model for Global Geospatial Data

Prithvi V1.0: Generalist Geospatial Foundation Model on Global HLS Data

4M4EO – Massively Multi-Modal Masked Autoencoders for Earth Observation

Magazine articles