Overcoming Selection Bias in Statistical Studies With Amortized Bayesian Inference

TL;DR

A bias-aware simulation-based inference framework embeds the selection mechanism into the generative simulator, correcting selection bias and improving estimation accuracy.

stat.ML · Advanced · 2026-04-20
Jonas Arruda, Sophie Chervet, Paula Staudt, Andreas Wieser, Michael Hoelscher, Isabelle Sermet-Gaudelus, Nadine Binder, Lulla Opatowski, Jan Hasenauer
selection bias · Bayesian inference · simulation-based inference · neural posterior estimation · high-dimensional data

Key Findings

Methodology

This study proposes a bias-aware simulation-based inference framework by embedding the selection mechanism directly into the generative simulator, enabling Bayesian inference without requiring tractable likelihoods. The method utilizes Neural Posterior Estimation (NPE) to address selection bias and integrates Simulation-Based Calibration (SBC) and Classifier Two-Sample Tests (C2ST) to assess posterior calibration. The framework recovers well-calibrated posterior distributions across three statistical applications, particularly in settings where likelihood-based approaches yield biased estimates.
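As a toy illustration of the core idea (not the paper's implementation), the sketch below embeds a hypothetical outcome-dependent selection step directly into a simulator and then runs rejection ABC as a simple likelihood-free stand-in for NPE. The prevalence model and the inclusion probabilities (0.6 for positives, 0.2 for negatives) are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(theta, n=2000):
    """Bias-aware simulator: generates outcomes AND applies the
    (hypothetical) selection mechanism, so simulated data follow the
    same biased observation process as the real data."""
    y = rng.random(n) < theta                 # latent outcomes
    p_inc = np.where(y, 0.6, 0.2)             # outcome-dependent inclusion
    inc = rng.random(n) < p_inc
    return y[inc].mean()                      # observed (biased) fraction

# "Observed" data generated at a known ground truth.
theta_true = 0.10
x_obs = simulate(theta_true)

# Rejection ABC as a simple likelihood-free stand-in for NPE.
thetas = rng.uniform(0, 1, 10_000)            # prior draws
stats = np.array([simulate(t) for t in thetas])
accepted = thetas[np.abs(stats - x_obs) < 0.01]

print(f"observed fraction (biased): {x_obs:.3f}")
print(f"posterior mean:             {accepted.mean():.3f}")
```

Because the simulator reproduces the selection step, parameter values whose *biased* output matches the biased observation are the ones retained, so the posterior concentrates near the true prevalence rather than the inflated observed fraction.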

Key Results

  • In the KoCo19 study, the bias-aware NPE estimated prevalence more accurately than unadjusted estimators and inverse probability weighting across 1000 simulated datasets, demonstrating its advantage in handling non-representative sampling and outcome missingness.
  • In the Framingham Heart Study, the bias-aware NPE accurately recovered all transition hazards in simulated data, outperforming the standard NPE under death-induced selection bias.
  • In the PedCovid study, the bias-aware NPE achieved unbiased inference in complex stochastic simulation models, addressing the infeasibility of explicit likelihood-based correction due to underlying process complexity.

Significance

The significance of this study lies in providing a novel approach to selection bias, especially in complex stochastic models. Whereas traditional methods rely on tractable likelihoods, this method removes that requirement through simulation-based inference, enabling accurate parameter estimation in high-dimensional systems with latent dynamics. Beyond its academic contribution, the framework offers practical tools for data analysis, particularly in epidemiology and social science research.

Technical Contribution

Technical contributions include embedding the selection mechanism within the generative simulator to achieve bias-aware simulation-based inference, overcoming the tractable-likelihood requirement of traditional methods. Using neural posterior estimation, the approach handles selection bias in high-dimensional systems with latent dynamics, and posterior calibration is verified through simulation-based calibration and classifier two-sample tests.

Novelty

This study is the first to recast the correction of selection bias as a simulation problem and solve it through a bias-aware simulation-based inference framework. Compared to existing likelihood-based methods, this approach does not rely on tractable likelihoods, allowing it to handle more complex models and selection mechanisms.

Limitations

  • The method may still have limitations in handling extremely complex selection mechanisms, as constructing and training the simulator requires substantial computational resources.
  • In some cases, modeling the selection mechanism may not be accurate enough, affecting the accuracy of inference results.
  • The framework may face challenges in computational efficiency when dealing with real-time data.

Future Work

Future research directions include further optimizing simulator construction and training to improve computational efficiency and accuracy, applying the framework to selection bias problems in other fields such as finance and the social sciences, and developing more efficient algorithms for handling selection bias in real-time data.

AI Executive Summary

Selection bias is a common issue in statistical studies, particularly in epidemiological and survey settings. Traditional correction methods rely on tractable likelihoods, limiting their applicability in complex stochastic models. This paper proposes a bias-aware simulation-based inference framework by embedding the selection mechanism directly into the generative simulator, enabling Bayesian inference without requiring tractable likelihoods.

The framework utilizes Neural Posterior Estimation (NPE) to perform inference under the embedded selection mechanism and integrates Simulation-Based Calibration (SBC) and Classifier Two-Sample Tests (C2ST) to assess posterior calibration.

In experiments, the method recovers well-calibrated posterior distributions across three statistical applications, particularly in settings where likelihood-based approaches yield biased estimates. In the KoCo19 study, the bias-aware NPE estimated prevalence more accurately than unadjusted estimators and inverse probability weighting across 1000 simulated datasets.

In the Framingham Heart Study, the bias-aware NPE accurately recovered all transition hazards in simulated data, outperforming the standard NPE under death-induced selection bias. In the PedCovid study, the bias-aware NPE achieved unbiased inference in complex stochastic simulation models, addressing the infeasibility of explicit likelihood-based correction due to underlying process complexity.

This study is significant in academia and offers new tools for data analysis in practical applications, particularly in epidemiology and social science research. Future research directions include further optimizing the construction and training process of the simulator to improve computational efficiency and accuracy.

Deep Analysis

Background

Selection bias is a common issue in statistical studies, particularly in epidemiological and survey settings. It arises when the probability that an observation enters a dataset depends on variables related to the quantities of interest, leading to systematic distortions in estimation and uncertainty quantification. Traditional correction methods, such as inverse probability weighting and explicit likelihood-based models of the selection process, rely on tractable likelihoods, limiting their applicability in complex stochastic models with latent dynamics or high-dimensional structure. As statistical models become more complex, the requirement of a tractable likelihood becomes a central bottleneck, often rendering bias correction infeasible.
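The distortion described above can be reproduced in a few lines. The sketch below assumes a hypothetical population in which positive outcomes are three times as likely to enter the dataset (the inclusion probabilities 0.6 vs. 0.2 are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

true_prevalence = 0.10
n = 100_000

# Latent population: 1 = positive outcome, 0 = negative.
y = rng.random(n) < true_prevalence

# Outcome-dependent selection: positives are 3x as likely to be included.
p_include = np.where(y, 0.6, 0.2)
included = rng.random(n) < p_include

naive_estimate = y[included].mean()  # systematically too high
print(f"true prevalence: {true_prevalence:.3f}")
print(f"naive estimate:  {naive_estimate:.3f}")
```

Here the naive estimate converges to 0.1·0.6 / (0.1·0.6 + 0.9·0.2) = 0.25, two and a half times the true prevalence, purely because inclusion depends on the outcome.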

Core Problem

The core problem is that when the probability of an observation entering the dataset depends on variables related to the quantities of interest, estimation and uncertainty quantification become systematically distorted. Traditional correction methods rely on tractable likelihoods, limiting their applicability in complex stochastic models. Achieving bias-aware Bayesian inference without relying on tractable likelihoods is therefore the central challenge.

Innovation

The core innovation of this paper is recasting the correction of selection bias as a simulation problem and solving it through a bias-aware simulation-based inference framework. Specifically, the framework embeds the selection mechanism directly into the generative simulator, enabling Bayesian inference without requiring tractable likelihoods. This method utilizes Neural Posterior Estimation (NPE) to address selection bias and integrates Simulation-Based Calibration (SBC) and Classifier Two-Sample Tests (C2ST) to assess posterior calibration.

Methodology

  • Embed the selection mechanism within the generative simulator, enabling bias-aware Bayesian inference without tractable likelihoods.
  • Utilize Neural Posterior Estimation (NPE) to address selection bias.
  • Integrate Simulation-Based Calibration (SBC) and Classifier Two-Sample Tests (C2ST) to assess posterior calibration.
  • Validate the effectiveness of the method across three different statistical applications.

Experiments

The experimental design includes validating the effectiveness of the method across three different statistical applications. Specifically, in the KoCo19 study, the bias-aware NPE estimated prevalence more accurately than unadjusted estimators and inverse probability weighting across 1000 simulated datasets. In the Framingham Heart Study, the bias-aware NPE accurately recovered all transition hazards in simulated data. In the PedCovid study, the bias-aware NPE achieved unbiased inference in complex stochastic simulation models.

Results

Experimental results show that the bias-aware NPE effectively corrects selection bias. Across all three applications it recovered well-calibrated posterior distributions, as assessed by Simulation-Based Calibration and Classifier Two-Sample Tests, including settings where likelihood-based approaches yield biased estimates.

Applications

This method has broad application prospects in epidemiology and social science research, particularly in addressing selection bias problems. Through simulation-based inference, the method enables bias-aware Bayesian inference without relying on tractable likelihoods, allowing for accurate parameter estimation in high-dimensional and latent variable dynamic systems.

Limitations & Outlook

Constructing and training the simulator requires substantial computational resources, which may limit the method under extremely complex selection mechanisms; a misspecified model of the selection mechanism can also degrade inference accuracy, and real-time applications pose computational-efficiency challenges. Looking ahead, optimizing simulator construction and training, and extending the framework to fields such as finance and the social sciences, are natural next steps.

Plain Language (accessible to non-experts)

Imagine you're shopping in a large supermarket. There are many products, but not all of them are visible to you because some are placed out of sight. Selection bias is like only being able to see certain products while shopping, not all of them. To better understand the variety of products in the supermarket, you need a way to estimate those you can't see. The method proposed in this paper is like a smart shopping assistant that can infer the unseen products by observing the ones you do see. This assistant uses a technique called Neural Posterior Estimation, which is like a clever algorithm that helps you understand the supermarket's product range more accurately without needing to know all the product information. In this way, you can gain a more comprehensive understanding of the supermarket's offerings without being affected by selection bias.

ELI14 (explained like you're 14)

Hey there! You know how sometimes when you're doing a school experiment, not all the data is visible to you, like when the teacher hides some of it? That's what selection bias is like—it makes the data we see incomplete. To solve this problem, scientists invented a method called bias-aware simulation-based inference. Imagine it as a super-smart detective that can figure out the hidden data by analyzing the data you can see. This detective uses a technique called Neural Posterior Estimation, like a clever algorithm that helps us understand the whole experiment's results more accurately. This way, we can get a full picture of the experiment without being affected by selection bias. Isn't that cool?

Glossary

Selection Bias

Selection bias occurs when the probability that an observation enters a dataset depends on variables related to the quantities of interest, leading to systematic distortions in estimation and uncertainty quantification.

Selection bias is a common issue in epidemiological and survey settings.

Bayesian Inference

Bayesian inference is a statistical inference method that updates the probability for a hypothesis as more evidence or information becomes available, using Bayes' theorem.

The paper utilizes Bayesian inference to address selection bias.

Neural Posterior Estimation

Neural Posterior Estimation is a method that uses neural networks to approximate the posterior distribution.

The paper uses Neural Posterior Estimation to achieve bias-aware simulation-based inference.
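In place of a full NPE implementation (which fits a neural conditional density to simulated pairs), here is a minimal sketch of the amortized idea: in an illustrative conjugate Gaussian model, regressing parameters on simulated data learns the exact posterior mean as a function of any new observation. The model and numbers are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy conjugate model: theta ~ N(0,1), x | theta ~ N(theta, 0.5^2).
# The exact posterior mean is E[theta | x] = x / (1 + 0.5^2) = 0.8 * x.
n = 50_000
theta = rng.normal(0.0, 1.0, n)
x = theta + rng.normal(0.0, 0.5, n)

# "Amortized" estimator: regress theta on x over simulated pairs.
# NPE does the same in spirit, but with a neural conditional density.
slope, intercept = np.polyfit(x, theta, 1)

x_new = 1.5
print(f"learned posterior mean at x=1.5: {slope * x_new + intercept:.3f}")
print(f"exact posterior mean at x=1.5:   {0.8 * x_new:.3f}")
```

The regression is trained once on simulations and then evaluates instantly for any observation, which is what "amortized" inference means.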

Simulation-Based Calibration

Simulation-Based Calibration is a method for assessing the calibration of statistical models by using simulations.

The paper uses Simulation-Based Calibration to verify posterior distribution calibration.
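A minimal SBC sketch, assuming an illustrative conjugate Gaussian model where the exact posterior is known: draw a parameter from the prior, simulate data, sample the posterior, and record the parameter's rank among the posterior samples. For a correct sampler the ranks are uniform:

```python
import numpy as np

rng = np.random.default_rng(3)

# Conjugate model: theta ~ N(0,1), x | theta ~ N(theta, 1).
# Exact posterior: theta | x ~ N(x/2, 1/2).
n_rounds, n_post = 2000, 99

ranks = np.empty(n_rounds, dtype=int)
for i in range(n_rounds):
    theta = rng.normal()                            # prior draw
    x = rng.normal(theta, 1.0)                      # simulated data
    post = rng.normal(x / 2, np.sqrt(0.5), n_post)  # posterior samples
    ranks[i] = (post < theta).sum()                 # rank in 0..n_post

# For a correct sampler the ranks are uniform on {0, ..., n_post};
# a skewed or U-shaped rank histogram signals miscalibration.
print(f"mean rank: {ranks.mean():.1f} (expect ~{n_post / 2})")
```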

Classifier Two-Sample Test

Classifier Two-Sample Test is a method that evaluates whether two samples come from the same distribution by training a classifier.

The paper uses Classifier Two-Sample Test to assess posterior distribution calibration.
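A minimal C2ST sketch using a simple nearest-centroid classifier (practical C2STs typically use a neural network or other flexible classifier); the Gaussian samples are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

def c2st_accuracy(a, b):
    """Held-out accuracy of a nearest-centroid classifier telling a from b.
    Accuracy near 0.5 means the samples are indistinguishable."""
    half = len(a) // 2
    mu_a, mu_b = a[:half].mean(), b[:half].mean()   # "train" the centroids
    test = np.concatenate([a[half:], b[half:]])
    labels = np.r_[np.zeros(len(a) - half), np.ones(len(b) - half)]
    pred = (np.abs(test - mu_b) < np.abs(test - mu_a)).astype(float)
    return (pred == labels).mean()

same = c2st_accuracy(rng.normal(0, 1, 4000), rng.normal(0, 1, 4000))
diff = c2st_accuracy(rng.normal(0, 1, 4000), rng.normal(2, 1, 4000))
print(f"same distribution:    accuracy ~ {same:.2f}")
print(f"shifted distribution: accuracy ~ {diff:.2f}")
```

In the paper's setting, the two samples would be simulated and observed data: accuracy near chance supports the simulator, while accuracy well above 0.5 flags a discrepancy.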

Inverse Probability Weighting

Inverse Probability Weighting is a method that corrects for selection bias by weighting observations inversely to their probability of being sampled.

In the KoCo19 study, Inverse Probability Weighting is used as a baseline method.
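A minimal IPW sketch on the same kind of hypothetical outcome-dependent selection (inclusion probabilities 0.6/0.2 are illustrative): when the inclusion probabilities are known, weighting each included observation by their inverse recovers the true prevalence:

```python
import numpy as np

rng = np.random.default_rng(5)

true_prevalence = 0.10
n = 200_000

y = rng.random(n) < true_prevalence
p_include = np.where(y, 0.6, 0.2)        # known inclusion probabilities
included = rng.random(n) < p_include

y_obs = y[included].astype(float)
w = 1.0 / p_include[included]            # inverse-probability weights

naive = y_obs.mean()
ipw = (w * y_obs).sum() / w.sum()        # Hajek-normalized HT estimator

print(f"naive: {naive:.3f}, IPW: {ipw:.3f}, truth: {true_prevalence:.3f}")
```

The catch, and the motivation for the paper's approach, is that IPW requires the inclusion probabilities to be known or tractably modeled; when selection depends on unobserved outcomes or complex latent dynamics, these weights are unavailable.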

Likelihood-Based Methods

Likelihood-Based Methods are statistical inference methods that rely on tractable likelihoods to make inferences.

Traditional methods for correcting selection bias rely on likelihood-based methods.

Latent Variables

Latent variables are variables that are not directly observed but are inferred from other variables within a statistical model.

The method addresses selection bias in high-dimensional and latent variable dynamic systems.

High-Dimensional Data

High-dimensional data refers to datasets with a large number of variables, often requiring complex statistical methods to analyze.

The method addresses selection bias in high-dimensional data.

Simulation-Based Inference

Simulation-Based Inference is a method that uses simulations to perform statistical inference, often used to handle intractable likelihoods.

The paper proposes a bias-aware simulation-based inference framework.

Open Questions (unanswered questions from this research)

  1. How can the construction and training of the simulator be made more efficient under extremely complex selection mechanisms? The current method may be limited here by the substantial computational resources required.
  2. How can the bias-aware simulation-based inference framework be applied to real-time data? Computational efficiency remains a challenge in this setting.
  3. How can the framework be applied to selection bias problems in other fields, such as finance? The current research focuses primarily on epidemiology and social science.
  4. How can the performance of Simulation-Based Calibration and Classifier Two-Sample Tests be further optimized? These diagnostics are crucial for verifying posterior calibration but leave room for improvement.
  5. How can inference accuracy be preserved when the selection mechanism is misspecified? Inaccurate modeling of the selection mechanism can degrade inference results.

Applications

Immediate Applications

Epidemiological Research

The method can be used for correcting selection bias in epidemiological research, helping researchers estimate disease prevalence and transmission parameters more accurately.

Social Science Surveys

In social science surveys, the method can be used to correct estimation bias due to sampling bias, providing more reliable survey results.

Medical Research

In medical research, the method can be used to correct estimation bias due to selection bias, helping researchers evaluate treatment effects more accurately.

Long-term Vision

Financial Data Analysis

The method can be applied to correct selection bias in financial data analysis, helping analysts assess financial market risks and returns more accurately.

Real-Time Data Processing

In the future, the method can be used for correcting selection bias in real-time data processing, helping researchers obtain accurate analysis results more quickly.

Abstract

Selection bias arises when the probability that an observation enters a dataset depends on variables related to the quantities of interest, leading to systematic distortions in estimation and uncertainty quantification. For example, in epidemiological or survey settings, individuals with certain outcomes may be more likely to be included, resulting in biased prevalence estimates with potentially substantial downstream impact. Classical corrections, such as inverse-probability weighting or explicit likelihood-based models of the selection process, rely on tractable likelihoods, which limits their applicability in complex stochastic models with latent dynamics or high-dimensional structure. Simulation-based inference enables Bayesian analysis without tractable likelihoods but typically assumes missingness at random and thus fails when selection depends on unobserved outcomes or covariates. Here, we develop a bias-aware simulation-based inference framework that explicitly incorporates selection into neural posterior estimation. By embedding the selection mechanism directly into the generative simulator, the approach enables amortized Bayesian inference without requiring tractable likelihoods. This recasting of selection bias as part of the simulation process allows us to both obtain debiased estimates and explicitly test for the presence of bias. The framework integrates diagnostics to detect discrepancies between simulated and observed data and to assess posterior calibration. The method recovers well-calibrated posterior distributions across three statistical applications with diverse selection mechanisms, including settings in which likelihood-based approaches yield biased estimates. These results recast the correction of selection bias as a simulation problem and establish simulation-based inference as a practical and testable strategy for parameter estimation under selection bias.

stat.ML · cs.LG · stat.ME

References (20)

  • J. Arruda, N. Bracher, U. Köthe et al. (2025). Diffusion Models in Simulation-Based Inference: A Tutorial Review.
  • M. Dax, J. Wildberger, S. Buchholz et al. (2023). Flow Matching for Scalable Simulation-Based Inference.
  • K. Radon, E. Saathoff, M. Pritsch et al. (2020). Protocol of a population-based prospective COVID-19 cohort study Munich, Germany (KoCo19).
  • S. Ghosh (1988). Statistical Analysis With Missing Data.
  • N. Binder, J. Balmford, M. Schumacher (2019). A multi-state model based reanalysis of the Framingham Heart Study: Is dementia incidence really declining?
  • S. Talts, M. Betancourt, D. P. Simpson et al. (2018). Validating Bayesian Inference Algorithms with Simulation-Based Calibration.
  • L. Kühmichel, J. M. Huang, V. Pratz et al. (2026). BayesFlow 2.0: Multi-Backend Amortized Bayesian Inference in Python.
  • J.-M. Lueckmann, P. J. Gonçalves, G. Bassetto et al. (2017). Flexible statistical inference for mechanistic models of neural dynamics.
  • L. Elsemüller, H. Olischläger, M. Schmitt et al. (2023). Sensitivity-Aware Amortized Bayesian Inference.
  • L. Olbrich, N. Castelletti, Y. Schälte et al. (2021). Head-to-head evaluation of seven different seroassays including direct viral neutralisation in a representative cohort for SARS-CoV-2.
  • W. Rogan, B. Gladen (1978). Estimating prevalence from the results of a screening test.
  • D. Lopez-Paz, M. Oquab (2016). Revisiting Classifier Two-Sample Tests.
  • Y. Schälte, E. Alamoudi, J. Hasenauer (2021). Robust adaptive distance functions for approximate Bayesian inference on outlier-corrupted data.
  • J. Copas, H. Li (1997). Inference for Non-random Samples.
  • I. Loshchilov, F. Hutter (2016). SGDR: Stochastic Gradient Descent with Warm Restarts.
  • A. Linero, M. Daniels (2018). Bayesian Approaches for Missing Not at Random Outcome Data: The Role of Identifying Restrictions.
  • L. Elsemüller, V. Pratz, M. von Krause et al. (2025). Does Unsupervised Domain Adaptation Improve the Robustness of Amortized Bayesian Inference? A Systematic Evaluation.
  • S. Galmiche, T. Cortier, T. Charmet et al. (2023). SARS-CoV-2 incubation period across variants of concern, individual factors, and circumstances of infection in France: a case series analysis from the ComCor study.
  • M. Zaheer, S. Kottur, S. Ravanbakhsh et al. (2017). Deep Sets.
  • D. Horvitz, D. Thompson (1952). A Generalization of Sampling Without Replacement from a Finite Universe.