FedSPDnet: Geometry-Aware Federated Deep Learning with SPDnet

TL;DR

FedSPDnet brings geometry-aware federated learning to SPDnet via two aggregation strategies, ProjAvg and RLAvg, and outperforms federated EEGnet on EEG motor imagery benchmarks in F1 score and robustness.

stat.ML · Advanced · 2026-04-24
Thibault Pautrel · Florent Bouchard · Ammar Mian · Guillaume Ginolhac
federated learning · Riemannian manifold · SPD matrices · signal processing · deep learning

Key Findings

Methodology

FedSPDnet introduces two novel federated learning frameworks, ProjAvg and RLAvg, specifically designed for the SPDnet model operating on symmetric positive definite (SPD) matrices with Stiefel-constrained parameters. ProjAvg projects arithmetic means onto the Stiefel manifold via polar decomposition, while RLAvg approximates tangent-space averaging using retractions and liftings. Both methods are computationally efficient, independent of the optimizer, and suitable for signal processing applications requiring SPD matrix features.
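The summary gives no code, but the ProjAvg step can be sketched in a few lines of NumPy: the polar factor of the averaged weight, computed from its thin SVD, is the closest matrix with orthonormal columns in Frobenius norm. The function name `proj_avg` and the toy dimensions are ours, not the authors'.

```python
import numpy as np

def proj_avg(client_weights):
    """ProjAvg sketch: average the clients' Stiefel-constrained weights
    in Euclidean space, then project the mean back onto the Stiefel
    manifold by taking its polar factor (the closest matrix with
    orthonormal columns in Frobenius norm)."""
    mean = np.mean(client_weights, axis=0)               # (n, p) arithmetic mean
    u, _, vt = np.linalg.svd(mean, full_matrices=False)  # thin SVD of the mean
    return u @ vt                                        # polar factor

# Toy usage: average three random Stiefel matrices (n=5, p=3).
rng = np.random.default_rng(0)
ws = [np.linalg.qr(rng.standard_normal((5, 3)))[0] for _ in range(3)]
w_avg = proj_avg(ws)
print(np.allclose(w_avg.T @ w_avg, np.eye(3)))
```

Because the polar factor always has orthonormal columns, the aggregated weight stays on the Stiefel manifold regardless of how far the clients' updates have drifted apart.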

Key Results

  • In EEG motor imagery benchmarks, FedSPDnet outperforms federated EEGnet in F1 score and robustness to federation and partial participation, while using fewer parameters per communication round.
  • ProjAvg and RLAvg exhibit similar convergence on Weibo2014 and PhysionetMI datasets, with validation F1 curves overlapping throughout all communication rounds.
  • On the PhysionetMI dataset, despite SPDnet's centralized score being lower than EEGnet's, FedSPDnet retains its centralized performance, indicating that geometry-aware aggregation provides intrinsic regularization.

Significance

The introduction of FedSPDnet is significant for both academia and industry as it addresses the limitations of traditional federated learning methods in non-Euclidean geometries, particularly when dealing with SPD matrices. This method offers a more robust and efficient solution for signal processing applications, especially in scenarios requiring data privacy protection. By preserving geometric structures, FedSPDnet can provide higher accuracy and stability when handling complex data.

Technical Contribution

FedSPDnet's technical contributions lie in its unique geometry-aware aggregation strategies, which fundamentally differ from existing SOTA methods. The ProjAvg and RLAvg strategies not only preserve orthogonality on the Stiefel manifold but also provide new theoretical guarantees, making it feasible to handle SPD matrices in large-scale federated learning. Additionally, these strategies do not depend on specific optimizers, enhancing their flexibility across different application scenarios.

Novelty

FedSPDnet is novel in applying geometry-aware aggregation strategies to federated learning with the SPDnet network for the first time. Compared to previous work, this method not only provides new theoretical insights but also demonstrates practical effectiveness, especially when dealing with data with geometric constraints.

Limitations

  • Although FedSPDnet performs well on multiple datasets, it may encounter computational bottlenecks when handling extremely large datasets, particularly in calculating the Riemannian mean.
  • ProjAvg and RLAvg strategies may require additional storage and computational resources in some cases, potentially limiting their application on resource-constrained devices.
  • While FedSPDnet performs well on EEG datasets, its performance on other types of datasets still needs further validation.

Future Work

Future research directions include further optimizing FedSPDnet's computational efficiency for application on larger datasets. Additionally, exploring the application of this method to other types of geometric datasets, such as those in image processing and natural language processing, is promising. Investigating how to better incorporate geometric information in federated learning to enhance model generalization is also a worthwhile direction.

AI Executive Summary

Federated learning is a method that enables collaborative model training without centralizing raw data, typically achieved by iteratively averaging parameters on a central server. However, traditional federated learning methods are mainly limited to Euclidean parameter spaces and cannot extend to non-Euclidean geometries. Recently, the rise of Riemannian optimization has provided new solutions for learning problems with geometric constraints, especially when dealing with symmetric positive definite (SPD) covariance matrices. SPDnet is an architecture that integrates bilinear mappings and nonlinear feature rectification on the Stiefel manifold, proving effective in applications such as micro-Doppler radar and electroencephalography.

FedSPDnet proposes two new federated learning frameworks, ProjAvg and RLAvg, specifically designed for the SPDnet network. These methods preserve geometric structures through geometry-aware aggregation strategies: ProjAvg projects arithmetic means onto the Stiefel manifold via polar decomposition, while RLAvg approximates tangent-space averaging using retractions and liftings. Both methods are computationally efficient, independent of the optimizer, and suitable for signal processing applications requiring SPD matrix features.

In EEG motor imagery benchmarks, FedSPDnet outperforms federated EEGnet in F1 score and robustness to federation and partial participation, while using fewer parameters per communication round. ProjAvg and RLAvg exhibit similar convergence on the Weibo2014 and PhysionetMI datasets, with validation F1 curves overlapping throughout all communication rounds. On the PhysionetMI dataset, despite SPDnet's centralized score being lower than EEGnet's, FedSPDnet retains its centralized performance, indicating that geometry-aware aggregation provides intrinsic regularization.

The introduction of FedSPDnet is significant for both academia and industry as it addresses the limitations of traditional federated learning methods in non-Euclidean geometries, particularly when dealing with SPD matrices. This method offers a more robust and efficient solution for signal processing applications, especially in scenarios requiring data privacy protection. By preserving geometric structures, FedSPDnet can provide higher accuracy and stability when handling complex data.

Although FedSPDnet performs well on multiple datasets, it may encounter computational bottlenecks when handling extremely large datasets, particularly in calculating the Riemannian mean. ProjAvg and RLAvg strategies may require additional storage and computational resources in some cases, potentially limiting their application on resource-constrained devices. Future research directions include further optimizing FedSPDnet's computational efficiency for application on larger datasets. Additionally, exploring the application of this method to other types of geometric datasets, such as those in image processing and natural language processing, is promising. Investigating how to better incorporate geometric information in federated learning to enhance model generalization is also a worthwhile direction.

Deep Analysis

Background

Federated learning is a method that enables collaborative model training without centralizing raw data, typically achieved by iteratively averaging parameters on a central server. This approach allows multiple clients to participate in model training while preserving data privacy. With the increasing importance of data privacy, federated learning has gained significant attention in both academia and industry. However, traditional federated learning methods are mainly limited to Euclidean parameter spaces and cannot extend to non-Euclidean geometries, limiting their effectiveness in certain applications. Meanwhile, the rise of Riemannian optimization has provided new solutions for learning problems with geometric constraints, especially when dealing with symmetric positive definite (SPD) covariance matrices. SPDnet is an architecture that integrates bilinear mappings and nonlinear feature rectification on the Stiefel manifold, proving effective in applications such as micro-Doppler radar and electroencephalography.
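To make the architecture concrete, here is a minimal NumPy sketch of SPDnet's three characteristic layers: a bilinear map (BiMap), eigenvalue rectification (ReEig), and a final matrix-logarithm layer (LogEig). The layer names follow the SPDnet literature; the rectification threshold and toy sizes are illustrative, not the paper's settings.

```python
import numpy as np

EPS = 1e-4  # illustrative rectification threshold

def bimap(x, w):
    """BiMap layer: bilinear map X -> W^T X W with a Stiefel-constrained
    weight W, reducing dimension while keeping the output SPD."""
    return w.T @ x @ w

def reeig(x, eps=EPS):
    """ReEig layer: clip eigenvalues below eps (the SPD nonlinearity)."""
    vals, vecs = np.linalg.eigh(x)
    return vecs @ np.diag(np.maximum(vals, eps)) @ vecs.T

def logeig(x):
    """LogEig layer: matrix logarithm, flattening the SPD manifold so a
    standard classifier head can be applied."""
    vals, vecs = np.linalg.eigh(x)
    return vecs @ np.diag(np.log(vals)) @ vecs.T

# Toy forward pass: 8x8 SPD input, one BiMap down to 4x4.
rng = np.random.default_rng(1)
a = rng.standard_normal((8, 8))
x = a @ a.T + 1e-3 * np.eye(8)                    # SPD input
w = np.linalg.qr(rng.standard_normal((8, 4)))[0]  # Stiefel weight
feat = logeig(reeig(bimap(x, w)))
print(feat.shape)
```

It is the Stiefel constraint on `w` here that standard Euclidean parameter averaging breaks, motivating the aggregation strategies below.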

Core Problem

The limitations of traditional federated learning methods in non-Euclidean geometries present a significant challenge. Specifically, when dealing with symmetric positive definite (SPD) matrices, standard Euclidean averaging disrupts orthogonality, leading to the loss of geometric structures. This is a major bottleneck for signal processing applications that require preserving geometric information. Additionally, existing methods face computational complexity in calculating the Riemannian mean, making them difficult to apply in large-scale federated learning. Therefore, effectively preserving geometric structures in federated learning, especially when handling SPD matrices, is an important and challenging problem.

Innovation

FedSPDnet's core innovations lie in its two geometry-aware aggregation strategies: ProjAvg and RLAvg.


  • ProjAvg: Projects arithmetic means onto the Stiefel manifold via polar decomposition, preserving orthogonality. This method is computationally efficient and suitable for large-scale federated learning.

  • RLAvg: Approximates tangent-space averaging using retractions and liftings, avoiding the complexity of directly computing the Riemannian mean. This method is independent of the specific optimizer, enhancing its flexibility across application scenarios.

These innovations provide new theoretical insights and demonstrate practical effectiveness, especially when dealing with data with geometric constraints.
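The RLAvg idea can be sketched as lift-average-retract. The exact retraction/lifting pair is a design choice the summary does not specify, so the sketch below uses a polar retraction and a tangent-space projection as an approximate lifting; treat it as one plausible instantiation, not the paper's exact construction.

```python
import numpy as np

def sym(a):
    return 0.5 * (a + a.T)

def tangent_lift(w_ref, w_i):
    """Approximate lifting: project the Euclidean difference onto the
    tangent space of the Stiefel manifold at the reference point w_ref."""
    z = w_i - w_ref
    return z - w_ref @ sym(w_ref.T @ z)

def polar_retract(w_ref, xi):
    """Retraction: step in the tangent direction, then take the polar
    factor to land back on the Stiefel manifold."""
    u, _, vt = np.linalg.svd(w_ref + xi, full_matrices=False)
    return u @ vt

def rl_avg(w_ref, client_weights):
    """RLAvg sketch: lift client weights to the tangent space at the
    current global model, average there, and retract the mean."""
    xi_mean = np.mean([tangent_lift(w_ref, w) for w in client_weights], axis=0)
    return polar_retract(w_ref, xi_mean)

# Toy usage: three clients perturbed around a common reference point.
rng = np.random.default_rng(2)
w_ref = np.linalg.qr(rng.standard_normal((5, 3)))[0]
ws = [polar_retract(w_ref, 0.1 * tangent_lift(w_ref, rng.standard_normal((5, 3))))
      for _ in range(3)]
w_new = rl_avg(w_ref, ws)
print(np.allclose(w_new.T @ w_new, np.eye(3)))
```

Averaging in a single tangent space sidesteps the iterative fixed-point computation a true Riemannian mean would require, which is the efficiency argument made above.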

Methodology

FedSPDnet's methodology involves several key steps:


  • Data Preparation: Experiments are conducted using the Weibo2014 and PhysionetMI datasets, containing 7 and 4 motor imagery classes, respectively.

  • Model Architecture: The SPDnet network is employed, processing SPD matrices through bilinear mappings and nonlinear eigenvalue rectification.

  • Aggregation Strategies:
    • ProjAvg: Computes Euclidean averages of local weights and maps them back onto the Stiefel manifold via polar decomposition.
    • RLAvg: Uses retractions and liftings to approximate tangent-space averaging, avoiding direct computation of the Riemannian mean.

  • Experimental Setup: In each communication round, a subset of clients is selected for local training, and updated parameters are aggregated on the server.
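The round structure described above can be sketched as a short driver loop. `local_train` and `aggregate` are placeholders for the paper's components (the toy versions below just perturb the weight and re-project the mean); the client-sampling fraction is illustrative.

```python
import numpy as np

def federated_round(global_w, clients, local_train, aggregate, frac=0.5, rng=None):
    """One FedSPDnet-style communication round (sketch): sample a subset
    of clients (partial participation), train locally from the current
    global Stiefel weight, then combine the updates with a
    geometry-aware aggregator such as ProjAvg or RLAvg."""
    rng = np.random.default_rng() if rng is None else rng
    k = max(1, int(frac * len(clients)))
    chosen = rng.choice(len(clients), size=k, replace=False)
    local_ws = [local_train(clients[i], global_w) for i in chosen]
    return aggregate(local_ws)

# Toy run: "training" adds noise; aggregation re-projects the mean (ProjAvg-like).
def toy_train(_client, w):
    return w + 0.01 * np.random.default_rng(3).standard_normal(w.shape)

def toy_aggregate(ws):
    u, _, vt = np.linalg.svd(np.mean(ws, axis=0), full_matrices=False)
    return u @ vt

w0 = np.linalg.qr(np.random.default_rng(4).standard_normal((6, 3)))[0]
w1 = federated_round(w0, clients=list(range(4)), local_train=toy_train,
                     aggregate=toy_aggregate, frac=0.5,
                     rng=np.random.default_rng(5))
print(np.allclose(w1.T @ w1, np.eye(3)))
```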

Experiments

The experimental design includes using two motor imagery datasets: Weibo2014 and PhysionetMI. Each dataset's signals are band-pass filtered, and the sample covariance matrix is used as the input feature. The baseline model employed is EEGnet, federated using the standard FedAvg strategy. Key hyperparameters include learning rate, batch size, and the number of local training epochs. Ablation studies are conducted to evaluate the effectiveness of different aggregation strategies. By comparing the performance of ProjAvg and RLAvg on different datasets, the effectiveness of FedSPDnet is validated.
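As a sketch of that input pipeline: band-pass each EEG epoch and take the sample covariance matrix as the SPD feature. The crude FFT band-pass stands in for a proper filter design, and the band edges and shrinkage amount are illustrative rather than the paper's settings.

```python
import numpy as np

def bandpass_fft(x, fs, lo, hi):
    """Crude FFT band-pass (illustrative stand-in for a real filter)."""
    f = np.fft.rfftfreq(x.shape[-1], d=1.0 / fs)
    spec = np.fft.rfft(x, axis=-1)
    spec[..., (f < lo) | (f > hi)] = 0.0
    return np.fft.irfft(spec, n=x.shape[-1], axis=-1)

def covariance_features(epochs, fs, band=(8.0, 32.0), shrink=1e-3):
    """Band-pass each epoch, then use its lightly regularized sample
    covariance matrix as the SPD input feature for SPDnet."""
    feats = []
    for x in epochs:                              # x: (n_channels, n_times)
        xf = bandpass_fft(x, fs, *band)
        c = xf @ xf.T / xf.shape[-1]              # sample covariance
        c += shrink * np.trace(c) / len(c) * np.eye(len(c))  # keep it SPD
        feats.append(c)
    return np.stack(feats)

# Toy usage: 2 epochs, 8 channels, 2 s at 128 Hz.
rng = np.random.default_rng(6)
covs = covariance_features(rng.standard_normal((2, 8, 256)), fs=128.0)
print(covs.shape)
```

The small shrinkage term guarantees strictly positive eigenvalues even when an epoch's filtered signal is rank-deficient, so every feature is a valid point on the SPD manifold.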

Results

Experimental results show that FedSPDnet performs strongly on both datasets, particularly in F1 score and robustness. ProjAvg and RLAvg exhibit similar convergence on the Weibo2014 and PhysionetMI datasets, with validation F1 curves overlapping throughout all communication rounds. Moreover, although SPDnet's centralized score on PhysionetMI is lower than EEGnet's, FedSPDnet retains its centralized performance under federation, indicating that geometry-aware aggregation provides intrinsic regularization. Ablation studies confirm that ProjAvg and RLAvg perform similarly across datasets, supporting their applicability in varied scenarios.

Applications

FedSPDnet has broad application scenarios in signal processing applications, especially in scenarios requiring data privacy protection. Specifically, this method is suitable for motor imagery classification of EEG data, micro-Doppler radar signal processing, and ground-penetrating radar image classification. In these scenarios, SPD matrices serve as feature descriptors, effectively capturing the geometric structure of the data to improve classification accuracy. Additionally, FedSPDnet's geometry-aware aggregation strategies enable effective information sharing between different clients, enhancing the model's generalization capabilities.

Limitations & Outlook

Although FedSPDnet performs well on multiple datasets, it may encounter computational bottlenecks when handling extremely large datasets, particularly in calculating the Riemannian mean. ProjAvg and RLAvg strategies may require additional storage and computational resources in some cases, potentially limiting their application on resource-constrained devices. Additionally, while FedSPDnet performs well on EEG datasets, its performance on other types of datasets still needs further validation. Future research directions include further optimizing FedSPDnet's computational efficiency for application on larger datasets.

Plain Language (accessible to non-experts)

Imagine you're in a kitchen cooking a meal. You have various ingredients like vegetables, meat, and spices. The traditional way is to throw everything into one big pot, which is convenient but might lose the unique flavor of each ingredient. Now, imagine you have a special pot where each ingredient can cook in its own little compartment, preserving its unique flavor. This is how FedSPDnet works. It's like a special pot that allows each client to keep its data characteristics while sharing information. ProjAvg and RLAvg are like two different cooking techniques that ensure each ingredient is perfectly cooked. ProjAvg uses a method called polar decomposition to ensure each compartment's ingredients maintain their shape and flavor. RLAvg uses a method called retraction and lifting to ensure each ingredient is processed without losing its original flavor. This way, FedSPDnet can improve model accuracy and stability without sacrificing data privacy.

ELI14 (explained like you're 14)

Hey there! Today I'm going to tell you about something super cool called FedSPDnet. Imagine you're playing an online game with your friends, but you don't want to share your game data with others. What do you do? That's where FedSPDnet comes in, like a smart middleman that helps you all get better at the game without sharing your secret data!

FedSPDnet has two awesome helpers, ProjAvg and RLAvg. ProjAvg is like a super projector that can project everyone's gaming skills onto a big screen, so everyone can see the best parts. RLAvg is like a magic ladder that lets you see everyone's gaming skills without changing your position. This way, you can all get stronger without revealing your secrets!

This method can be used in many places, like analyzing brainwaves and radar signals. It helps scientists analyze data better without worrying about data leaks. Isn't that cool?

So next time you're playing a game, think about this smart FedSPDnet, like an invisible helper silently making you stronger!

Glossary

Federated Learning

A method that enables collaborative model training without centralizing raw data, typically achieved by iteratively averaging parameters on a central server.

Used in the paper as a model training method that protects data privacy.

Riemannian Manifold

A mathematical structure used to describe spaces with curvature, allowing optimization in non-Euclidean geometries.

Used for handling learning problems with geometric constraints.

SPD Matrix

A symmetric matrix whose eigenvalues are all positive, commonly used to describe covariance matrices.

Serves as a feature descriptor in signal processing applications.

Stiefel Manifold

The manifold of matrices with orthonormal columns, used to describe parameters with orthogonality constraints.

The manifold where SPDnet network parameters reside.

Polar Decomposition

A factorization of a matrix into the product of a matrix with orthonormal columns and a symmetric positive semidefinite matrix.

Used in the ProjAvg strategy to project arithmetic means onto the Stiefel manifold.

Retraction

A method of mapping points from the tangent space back to the manifold, approximating the Riemannian exponential map.

Used in the RLAvg strategy to approximate tangent-space averaging.

Lifting

A method of mapping points from the manifold to the tangent space, approximating the Riemannian logarithm map.

Used in the RLAvg strategy to approximate tangent-space averaging.

Bilinear Mapping

A transformation of the form X ↦ WᵀXW that reduces dimensionality while preserving the SPD structure.

A key operation in the SPDnet network.

Nonlinear Eigenvalue Rectification

A method of achieving nonlinear transformation through eigenvalue adjustment.

A key operation in the SPDnet network.

Sample Covariance Matrix

A matrix used to describe the features of a dataset, obtained by calculating the covariance of samples.

Used as the input feature for the SPDnet network.

Open Questions (unanswered questions from this research)

  1. How to efficiently compute the Riemannian mean on large-scale datasets remains open. Existing methods face computational-complexity bottlenecks that limit large-scale federated learning; future work needs algorithms that scale without sacrificing accuracy.
  2. Although FedSPDnet performs well on EEG datasets, its performance on other data types still needs validation, particularly in fields like image processing and natural language processing, where applying geometry-aware aggregation effectively remains to be explored.
  3. The use of ProjAvg and RLAvg on resource-constrained devices requires further study: both may need additional storage and compute, which may not be practical in some deployments, motivating more lightweight strategies.
  4. How to better incorporate geometric information in federated learning to enhance model generalization is worth exploring; existing methods handle geometrically constrained data well but need validation on other data types.
  5. Effectively aggregating parameters in non-Euclidean geometries remains a challenge; existing strategies provide new theoretical insights but need further optimization in practice to improve computational efficiency and applicability.

Applications

Immediate Applications

EEG Motor Imagery Classification

FedSPDnet can be used for motor imagery classification of EEG data, helping scientists analyze and classify EEG signals without compromising data privacy.

Micro-Doppler Radar Signal Processing

In micro-Doppler radar signal processing, FedSPDnet can effectively capture the geometric structure of signals, improving target recognition accuracy.

Ground-Penetrating Radar Image Classification

FedSPDnet can be used for classifying ground-penetrating radar images, assisting engineers in analyzing and classifying radar images without centralizing data.

Long-term Vision

Broad Application to Geometric Datasets

FedSPDnet's geometry-aware aggregation strategies can be applied to other types of geometric datasets, such as those in image processing and natural language processing.

Enhancing Model Generalization

By better incorporating geometric information in federated learning, FedSPDnet is expected to enhance model generalization, applicable to a wider range of scenarios.

Abstract

We introduce two federated learning frameworks for the classical SPDnet model operating on symmetric positive definite (SPD) matrices with Stiefel-constrained parameters. Unlike standard Euclidean averaging, which violates orthogonality, our approach preserves geometric structure through two efficient aggregation strategies: ProjAvg, projecting arithmetic means onto the Stiefel manifold, and RLAvg, approximating tangent-space averaging via retractions and liftings. Both methods are computationally efficient, independent of the optimizer, and enable scalable federated learning for signal processing applications whose features are SPD matrices. Simulations on EEG motor imagery benchmarks show that FedSPDnet outperforms federated EEGnet in F1 score and robustness to federation and partial participation, while using fewer parameters per communication round.

stat.ML cs.LG
