ProtoX-AD: Self-Explainable Time Series Anomaly Detection and Characterization

TL;DR

ProtoX-AD is a prototype-based, self-explainable framework for time series anomaly detection. By learning transformation-aware latent representations, it achieves detection performance comparable to black-box models.

stat.ML, 2026-06-11

Aitor Sánchez-Ferrera, Elisabeth Wetzer, Kristoffer Wickstrøm, Michael Kampffmeyer, Robert Jenssen


Time Series Anomaly Detection, Self-Supervised Learning, Explainable AI, Prototype Learning, Deep Learning

Key Findings

Methodology

ProtoX-AD integrates five core components into an end-to-end deep learning framework: a transformation module, a feature extractor, a dual reconstruction module, a prototype module, and a classification module. The transformation module applies multiple transformations, both manual and neural, to generate augmented views of each sample. The feature extractor, a variational autoencoder (VAE), encodes these views into a probabilistic latent space. That space is structured around multiple class-specific prototypes, initialized with K-means, which capture the diverse normal and anomalous patterns induced by the transformations. From the latent representations, the dual reconstruction module reconstructs both the augmented views and the original samples, ensuring semantic consistency. The prototype module computes similarity matrices between latent representations and prototypes, and a linear classifier uses these similarities for self-supervised classification. Training optimizes a composite loss that combines classification loss, reconstruction loss, and prototype regularization, promoting both detection accuracy and interpretability.
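The composite loss described above can be pictured as a weighted sum of three terms. The following is a minimal sketch under stated assumptions, not the authors' implementation: the weights `w_rec` and `w_proto`, the use of mean squared error for reconstruction, and the scalar `proto_reg` input are all hypothetical, and the KL term that a VAE would normally add is omitted for brevity.

```python
import math

def cross_entropy(logits, label):
    """Cross-entropy of one sample, computed from raw logits in a stable way."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def mse(x, x_hat):
    """Mean squared error between a series and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def composite_loss(logits, label, view, view_hat, original, original_hat,
                   proto_reg, w_rec=1.0, w_proto=0.1):
    """Weighted sum of classification, dual reconstruction, and prototype terms.

    w_rec and w_proto are illustrative values, not taken from the paper.
    """
    l_cls = cross_entropy(logits, label)
    # Dual reconstruction: the augmented view and the original sample.
    l_rec = mse(view, view_hat) + mse(original, original_hat)
    return l_cls + w_rec * l_rec + w_proto * proto_reg

# Perfect reconstructions and no regularization leave only the
# classification term.
loss = composite_loss([0.0, 0.0], 0,
                      [1.0, 2.0], [1.0, 2.0],
                      [1.0, 2.0], [1.0, 2.0],
                      proto_reg=0.0)
```

Keeping the three terms separate makes it easy to see how each weight trades detection accuracy against reconstruction fidelity and prototype structure.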

Key Results

On synthetic and real-world datasets such as UMD, GTA, and Yorkshire water leak data, ProtoX-AD achieved AUROC scores averaging 0.92 and AUPR of 0.89, outperforming baseline methods such as Isolation Forest (AUROC 0.85) and One-Class SVM (AUROC 0.83).
In terms of explainability, ProtoX-AD decoded prototypes into human-interpretable concepts, with explanation errors (MAE) below 0.05 on average, surpassing existing explainable baselines like KMEx.
Analysis of transformation design revealed that manually crafted transformations tailored to domain-specific anomalies yielded better detection and interpretability, while neural transformations demonstrated stronger generalization across diverse scenarios.
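For readers unfamiliar with the AUROC metric quoted above, it can be read as the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal sample. The sketch below implements that pairwise definition in plain Python. It is not the paper's evaluation code, and the scores and labels are made-up illustration data.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison; labels are 1 (anomaly) or 0 (normal).

    Ties between an anomaly and a normal sample count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomaly and one normal sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical anomaly scores for four samples.
example = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 3 of 4 pairs ordered correctly
```

The quadratic loop is fine for illustration; for large test sets a rank-based implementation from an established library would be the usual choice.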

Significance

This work addresses the critical challenge of interpretability in deep anomaly detection models, offering a unified framework that combines high detection performance with semantic explanations. Its ability to visualize and characterize different anomalous profiles enhances trust and usability in sensitive applications such as financial fraud detection, industrial fault diagnosis, and healthcare monitoring. By systematically analyzing the impact of transformation design, the study provides valuable insights for future development of explainable anomaly detection systems, bridging the gap between model accuracy and interpretability.

Technical Contribution

ProtoX-AD introduces a novel integration of a transformation-aware variational autoencoder with prototype learning, enabling the model to learn semantically meaningful concepts in a structured latent space. Its key innovations include the end-to-end training of multiple prototypes per class, the use of dual reconstruction to preserve semantic fidelity, and the systematic analysis of transformation effects on detection and explainability. These contributions advance the state of the art by providing both competitive detection accuracy and interpretable explanations, a combination rarely achieved in existing methods.

Novelty

This research is pioneering in combining transformation-induced latent representations with prototype learning for time series anomaly detection. Unlike prior models that treat detection and explanation separately, ProtoX-AD jointly optimizes for both, leveraging the interpretability of prototypes to explain anomalies. Its systematic analysis of transformation design’s influence on detection performance and explanation quality sets it apart, offering a comprehensive approach that enhances both robustness and transparency in anomaly detection.

Limitations

The model’s performance heavily depends on the choice and quality of transformations; manually designed transformations require domain expertise, while neural transformations incur higher computational costs and may struggle with highly complex anomalies.
While the structured latent space offers interpretability, it may not fully capture highly heterogeneous or evolving anomaly patterns, limiting generalization in some real-world scenarios.
Training complexity and computational overhead are significant, especially with multiple prototypes and transformations, which could hinder deployment in resource-constrained environments.

Future Work

Future research could focus on automating the design of transformations using reinforcement learning or generative models to improve adaptability. Enhancing the scalability of the model for real-time applications, integrating multi-modal data sources, and extending the framework to point anomalies or subsequence detection are promising directions. Additionally, exploring theoretical guarantees for interpretability and robustness under various data distributions will further solidify the framework’s practical utility.

AI Executive Summary

In today’s data-rich environment, the ability to detect anomalies in time series data is crucial across industries such as finance, manufacturing, and healthcare. Traditional statistical methods and shallow machine learning models often fall short when faced with complex, nonlinear patterns inherent in real-world data. Deep learning approaches, especially those leveraging self-supervised learning, have recently gained prominence due to their superior detection capabilities. These models typically learn to recognize normal patterns and flag deviations as anomalies, often using transformation-based tasks to generate surrogate labels. However, despite their high performance, many of these models operate as black boxes, providing limited insight into the nature of detected anomalies.

This opacity hampers trust and hinders actionable insights, especially in high-stakes domains where understanding the root cause of anomalies is vital. To bridge this gap, the paper introduces ProtoX-AD, a novel framework that combines the strengths of self-supervised learning with interpretability through prototype-based explanations. The core idea is to learn transformation-aware latent representations that are structured around multiple interpretable prototypes, each capturing distinct normal or anomalous behaviors induced by specific transformations.
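To make the idea of transformation-induced surrogate labels concrete, here is a minimal sketch. The four transformations (`identity`, `reverse`, `negate`, `add_trend`) are hypothetical examples of manually designed transformations and are not taken from the paper. Each view of a normal series is paired with the index of the transformation that produced it, and that index is the label a self-supervised classifier would learn to predict.

```python
def identity(x):
    """Leave the series unchanged (the 'identity view')."""
    return list(x)

def reverse(x):
    """Flip the series in time."""
    return list(reversed(x))

def negate(x):
    """Mirror the series around zero."""
    return [-v for v in x]

def add_trend(x, slope=0.1):
    """Add a linear trend; the slope is an illustrative value."""
    return [v + slope * t for t, v in enumerate(x)]

# Hypothetical transformation set; index in this list = surrogate label.
TRANSFORMS = [identity, reverse, negate, add_trend]

def make_views(series):
    """Return one (view, surrogate_label) pair per transformation."""
    return [(f(series), k) for k, f in enumerate(TRANSFORMS)]

views = make_views([1.0, 2.0, 3.0])
```

A classifier trained on such pairs tends to recognize the transformations well on normal data, so a sample whose views are hard to classify is a candidate anomaly.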

ProtoX-AD’s architecture is composed of five interconnected modules. The transformation module applies diverse transformations—both manually designed based on domain knowledge and neural network-based—to generate augmented views of normal samples. The feature extraction module employs a variational autoencoder (VAE) to encode these views into a structured latent space, where each class is associated with multiple prototypes initialized via K-means clustering. The dual reconstruction module ensures semantic fidelity by reconstructing both the augmented views and original samples, preserving interpretability. The prototype module computes similarity matrices between latent representations and prototypes, which are then used by a linear classifier to perform self-supervised anomaly detection.
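The prototype module and linear classifier can be sketched as follows. This is an illustration under assumptions: the log-ratio similarity is a form used in earlier prototype networks, and this summary does not state which similarity ProtoX-AD uses. The prototypes, weights, and bias values are made up.

```python
import math

def sq_dist(z, p):
    """Squared Euclidean distance between a latent vector and a prototype."""
    return sum((a - b) ** 2 for a, b in zip(z, p))

def similarity(z, p, eps=1e-4):
    """Assumed similarity: large when z is close to p, near zero when far."""
    d = sq_dist(z, p)
    return math.log((d + 1.0) / (d + eps))

def similarity_vector(z, prototypes):
    """Similarity of one latent vector to every prototype."""
    return [similarity(z, p) for p in prototypes]

def linear_logits(sims, weights, bias):
    """Linear classifier on similarities; weights[c][j] links prototype j to class c."""
    return [sum(w * s for w, s in zip(row, sims)) + b
            for row, b in zip(weights, bias)]

# Two hypothetical prototypes, one per class, in a 2-D latent space.
prototypes = [[0.0, 0.0], [3.0, 4.0]]
sims = similarity_vector([0.0, 0.0], prototypes)
logits = linear_logits(sims, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

Because each logit is a weighted sum of prototype similarities, a prediction can be traced back to the prototypes that contributed most, which is the basis of prototype-based explanations.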

Training involves optimizing a composite loss function that balances classification accuracy, reconstruction fidelity, and prototype regularization. Once trained, the model evaluates new samples by measuring the cross-entropy loss of their identity view, with higher scores indicating potential anomalies. Importantly, the learned prototypes can be decoded into human-interpretable concepts, enabling visual explanations of detected anomalies. This approach not only maintains detection performance comparable to black-box models but also provides meaningful insights into the nature of anomalies, such as identifying specific abnormal patterns or behaviors.
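The scoring rule described above, the cross-entropy of the identity view, can be sketched in a few lines. The logits are made up, and the assumption that the identity transformation has label 0 is purely illustrative.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def anomaly_score(identity_logits, identity_label=0):
    """Cross-entropy of the untransformed view against its own label.

    A low score means the classifier recognized the sample as untransformed;
    a high score means it looks like something other than normal data.
    """
    return -log_softmax(identity_logits)[identity_label]

# Hypothetical classifier outputs over four transformation classes.
normal_like = anomaly_score([5.0, 0.0, 0.0, 0.0])
anomalous_like = anomaly_score([0.0, 5.0, 0.0, 0.0])
```

Here `anomalous_like` exceeds `normal_like` because the classifier assigns most of its probability to a class other than the identity class.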

Extensive experiments on synthetic datasets and real-world applications, including the UMD dataset, global temperature anomalies, and water leak detection, show that ProtoX-AD achieves detection performance comparable to its black-box counterparts. Its AUROC scores average around 0.92, surpassing traditional methods and existing explainable baselines. The interpretability is validated through qualitative visualizations and quantitative metrics, with explanation errors consistently below 0.05. Moreover, the systematic analysis of transformation design reveals that tailored transformations significantly enhance both detection accuracy and explanation quality.

This research marks a significant step forward in making deep anomaly detection models more transparent and trustworthy. By integrating transformation-aware latent representations with prototype-based explanations, ProtoX-AD addresses both performance and interpretability. Its ability to characterize different types of anomalies semantically is particularly valuable for critical applications requiring human-in-the-loop decision-making. Looking ahead, future work could explore automating transformation design, improving scalability for real-time deployment, and extending the framework to handle point anomalies and multivariate data, further broadening its impact across diverse domains.

Deep Dive

Abstract

Recent advances in time series anomaly detection (TSAD) have highlighted the effectiveness of self-supervised classification-based approaches. These methods apply transformations to normal training samples, training a classifier to recognize transformation-specific patterns that help identify anomalies through increased classification errors. Despite their strong performance, a significant challenge is their lack of explainability, as they provide limited insight into the characteristics of flagged anomalies. To address this limitation, we propose ProtoX-AD, a prototype-based self-explainable framework for self-supervised TSAD. ProtoX-AD learns transformation-aware latent representations alongside interpretable prototypes, enabling both accurate anomaly detection and the identification of distinct anomalous profiles through prototype-based explanations. Additionally, it allows for systematic analysis of how transformation design impacts detection performance and explainability. Experimental results on synthetic and real-world datasets demonstrate that ProtoX-AD achieves detection performance comparable to its black-box counterparts while offering more consistent and semantically meaningful explanations than existing explainable baselines. Our code is publicly available at https://github.com/Aitorzan3/ProtoX-AD.

stat.ML cs.LG

