Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

TL;DR

Self-distillation can degrade LLMs' reasoning in math by suppressing uncertainty expression.

cs.CL 2026-03-26
Jeonghye Kim Xufang Luo Minbeom Kim Sangmook Lee Dohyung Kim Jiwon Jeon Dongsheng Li Yuqing Yang
self-distillation LLMs mathematical reasoning uncertainty expression reasoning degradation

Key Findings

Methodology

This study investigates the impact of self-distillation on the reasoning capabilities of large language models (LLMs), particularly in mathematical reasoning tasks. Using models like Qwen3-8B, DeepSeek-Distill-Qwen-7B, and Olmo3-7B-Instruct, the research analyzes how conditioning the teacher model on rich information suppresses uncertainty expression in the student model. Controlled experiments varied the richness of the conditioning context and task coverage to systematically study how self-distillation affects reasoning behavior.
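As a rough illustration of the setup, the sketch below distills a student toward a teacher that is an instance of the same model but conditioned on extra information. The toy vocabulary, the hand-picked distributions, and the plain KL objective are assumptions for illustration, not the paper's implementation:

```python
import math

def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token distributions over ["maybe", "therefore", "wait", "answer"].
# Conditioning the teacher on rich context (e.g., a solution hint) concentrates
# probability mass on confident tokens and away from hedges like "maybe"/"wait".
student       = [0.25, 0.30, 0.15, 0.30]
teacher_plain = [0.20, 0.35, 0.15, 0.30]  # teacher sees only the question
teacher_rich  = [0.02, 0.48, 0.02, 0.48]  # teacher also sees a solution hint

# Distilling toward the rich-context teacher pulls the student far away from
# uncertainty tokens -- the suppression effect the paper identifies.
print(kl_divergence(teacher_plain, student))
print(kl_divergence(teacher_rich, student))
```

Minimizing the second, much larger KL term drives the student's probability of hedging tokens toward the teacher's near-zero values, which is one way to picture how uncertainty expression gets suppressed.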

Key Results

  • Across models like Qwen3-8B, DeepSeek-Distill-Qwen-7B, and Olmo3-7B-Instruct, self-distillation led to performance drops of up to 40%. This decline is primarily due to the suppression of uncertainty expression during reasoning, which negatively impacts performance on unseen problems.
  • Experiments show that when the teacher model is conditioned on rich information, the student model's reasoning becomes more confident and concise, but this also suppresses uncertainty expression, affecting out-of-distribution (OOD) performance.
  • By comparing model performance under different conditions, it was found that self-distillation in rich information contexts leads to changes in reasoning style, which, while effective for in-domain optimization, performs poorly with broad task coverage.

Significance

This research reveals the mechanism by which self-distillation can degrade reasoning capabilities in mathematical tasks, highlighting the importance of appropriately expressing uncertainty during reasoning. This finding is significant for both academia and industry as it challenges the current assumption that self-distillation universally improves model performance and points to new directions for optimizing reasoning behavior beyond merely reinforcing correct answer traces.

Technical Contribution

Technical contributions include uncovering the suppressive effect of self-distillation on uncertainty expression under rich-information conditions, and showing how this suppression affects reasoning capability and generalization. The study also argues for optimizing reasoning behavior beyond merely reinforcing correct answer traces, emphasizing that retaining uncertainty expression during reasoning improves performance on unseen tasks.

Novelty

This study is the first to systematically analyze the impact of self-distillation on uncertainty expression in mathematical reasoning tasks, proposing a mechanism by which self-distillation may lead to reasoning degradation under rich information conditions. This finding contrasts with previous studies that concluded self-distillation universally improves performance, providing a new perspective.

Limitations

  • The study focuses primarily on mathematical reasoning tasks, which may not apply to reasoning tasks in other domains. Different domains may have varying requirements for uncertainty expression.
  • The models and datasets used in the experiments are limited, which may not fully represent the behavior of all large language models.
  • The study does not examine how model performance differs across reasoning task types, which may limit the generalizability of the conclusions.

Future Work

Future research could explore the performance of self-distillation in other reasoning tasks, especially those requiring high levels of uncertainty expression. Additionally, the study could further analyze the impact of different model architectures and datasets on the effects of self-distillation to develop more general optimization strategies.

AI Executive Summary

In the post-training of large language models (LLMs), self-distillation has emerged as an effective paradigm, often improving model performance and shortening reasoning paths. However, in mathematical reasoning tasks, it has been found that self-distillation can reduce response length while degrading performance. The root cause of this phenomenon is traced to the suppression of uncertainty expression during reasoning. Through a series of controlled experiments, researchers found that when the teacher model is conditioned on rich information, the student model's reasoning trajectory becomes more confident and concise, but this also suppresses uncertainty expression, affecting the model's performance on out-of-distribution (OOD) tasks.

The study used models like Qwen3-8B, DeepSeek-Distill-Qwen-7B, and Olmo3-7B-Instruct to analyze the effects of self-distillation under different conditions. The experiments showed that in these models, self-distillation led to performance drops of up to 40%. This decline is primarily due to the suppression of uncertainty expression during reasoning, which negatively impacts performance on unseen problems.

These results challenge the assumption that self-distillation universally improves model performance: conditioning the teacher on rich information yields more confident, concise reasoning that optimizes quickly in-domain but generalizes poorly. The findings underscore the importance of retaining appropriate uncertainty expression and of optimizing reasoning behavior beyond merely reinforcing correct answer traces; future work could extend the analysis to other reasoning domains, architectures, and datasets.

Deep Analysis

Background

In recent years, large language models (LLMs) have made significant advances in the field of natural language processing. Self-distillation, as a post-training technique, aims to improve model performance by using two instances of the same model, where one instance serves as the teacher model providing informative reward signals, and the other instance serves as the student model generating responses. Self-distillation has been shown to significantly improve model performance in various domains, especially in scientific reasoning and agentic environments. However, there is limited research on the effects of self-distillation in mathematical reasoning tasks.

Core Problem

Self-distillation in mathematical reasoning tasks may lead to a degradation of the model's reasoning capabilities. The core problem lies in the suppression of uncertainty expression during the self-distillation process, which may affect the model's performance on unseen problems. Mathematical reasoning tasks often require the model to express uncertainty across different reasoning paths to adjust and correct during the reasoning process.

Innovation

The core innovation of this study is the revelation of the suppressive effect of self-distillation on uncertainty expression in mathematical reasoning tasks. Through controlled experiments, the study systematically analyzes the impact of self-distillation under different conditions, particularly how it affects reasoning behavior in rich-information contexts, and argues for optimizing reasoning behavior beyond merely reinforcing correct answer traces, emphasizing the importance of retaining uncertainty expression during reasoning.

Methodology

  • Use models like Qwen3-8B, DeepSeek-Distill-Qwen-7B, and Olmo3-7B-Instruct for experiments.
  • Analyze the impact of self-distillation on model reasoning behavior by varying the richness of the conditioning context.
  • In controlled experiments, condition the teacher model on rich information while optimizing the student model within limited task coverage.
  • Observe model performance on OOD tasks and analyze the impact of uncertainty expression on reasoning capabilities.
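The varying-richness conditioning in the steps above might look like the following hypothetical prompt builder. The richness levels, field names, and wording are illustrative assumptions, not the paper's actual prompts:

```python
def build_teacher_prompt(question, richness, reference_solution=None, hint=None):
    """Assemble the teacher's conditioning context at a chosen richness level.

    richness: "none" -> teacher sees only the question (like the student)
              "hint" -> teacher additionally sees a short hint
              "full" -> teacher additionally sees a reference solution
    """
    parts = [f"Question: {question}"]
    if richness == "hint" and hint is not None:
        parts.append(f"Hint: {hint}")
    elif richness == "full" and reference_solution is not None:
        parts.append(f"Reference solution: {reference_solution}")
    parts.append("Reason step by step, then give the final answer.")
    return "\n".join(parts)

prompt = build_teacher_prompt(
    "What is 12 * 13?", richness="full", reference_solution="12 * 13 = 156",
)
print(prompt)
```

Sweeping `richness` from "none" to "full" while holding the question set fixed is one concrete way to realize the controlled comparison the methodology describes.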

Experiments

The experimental design includes comparative experiments using models like Qwen3-8B, DeepSeek-Distill-Qwen-7B, and Olmo3-7B-Instruct. Different datasets and baseline models were used, and evaluation metrics included reasoning capabilities and response length. Ablation studies were also conducted to analyze the effects of self-distillation under different conditions.

Results

The experimental results show that self-distillation in mathematical reasoning tasks can lead to performance drops of up to 40%. This decline is primarily due to the suppression of uncertainty expression during reasoning, which negatively impacts performance on unseen problems. The experiments also found that when the teacher model is conditioned on rich information, the student model's reasoning becomes more confident and concise, but this also suppresses uncertainty expression, harming OOD performance.
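One simple way to quantify the suppression of uncertainty expression is to count hedging phrases in reasoning traces before and after self-distillation. The marker lexicon below is a hypothetical stand-in, not the paper's actual measure:

```python
import re

# Hypothetical hedging markers; the paper's actual lexicon may differ.
EPISTEMIC_MARKERS = ("wait", "maybe", "perhaps", "not sure",
                     "let me double-check", "alternatively", "hmm")

def epistemic_marker_rate(trace):
    """Occurrences of hedging markers per 100 words of a reasoning trace."""
    text = trace.lower()
    hits = sum(len(re.findall(re.escape(m), text)) for m in EPISTEMIC_MARKERS)
    n_words = max(len(text.split()), 1)
    return 100.0 * hits / n_words

before = "Hmm, maybe I should factor first. Wait, alternatively I could expand."
after_sd = "Factor the expression, substitute, and report the result."
print(epistemic_marker_rate(before), epistemic_marker_rate(after_sd))
```

A drop in this rate after training, paired with a drop in OOD accuracy, would be consistent with the suppression effect the results describe.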

Applications

The research findings have significant implications for optimizing large language models, especially in reasoning tasks requiring high levels of uncertainty expression. The study reveals the mechanism by which self-distillation can degrade reasoning capabilities in mathematical tasks, highlighting the importance of appropriately expressing uncertainty during reasoning.

Limitations & Outlook

The study focuses primarily on mathematical reasoning, and its conclusions may not transfer to other domains with different requirements for uncertainty expression. The models and datasets used in the experiments are limited and may not fully represent the behavior of all large language models, and the analysis does not examine how performance varies across reasoning task types, which may limit the generality of the conclusions.

Plain Language (Accessible to non-experts)

Imagine you're cooking in a kitchen. You have a recipe (teacher model) that tells you how to make the perfect dish. You follow the recipe step by step (student model), but sometimes you might be unsure about certain steps, like "How much seasoning should I add?" At this point, you might pause to think or even try different amounts (uncertainty expression).

Now, imagine you have a super-smart kitchen assistant (self-distillation) that gives you advice while you cook. This assistant is very confident and always tells you, "Just do it this way, don't worry!" As a result, you cook quickly, but sometimes the dish doesn't taste quite right because you didn't have the chance to experiment and adjust.

This is similar to the problem with self-distillation in mathematical reasoning. The model no longer expresses uncertainty during reasoning, leading to poor performance on unseen problems. Just like in the kitchen, if you always follow the assistant's advice without trying and adjusting, you might miss out on some delicious possibilities.

Therefore, appropriately expressing uncertainty is important as it gives you the chance to experiment and adjust, leading to better performance when facing new problems.

ELI14 (Explained like you're 14)

Hey there! Have you ever played a puzzle game where you need to solve riddles to find a treasure? Sometimes, you might think, "How do I solve this puzzle?" At this point, you might try different methods or even ask your friends for advice, right?

Now, imagine you have a super-cool game assistant that always tells you, "Just do it this way, it's fine!" At first, you might think it's great because you can find the treasure quickly. But slowly, you'll realize that some puzzles are still unsolvable because the assistant always gives you the same advice, and you don't get the chance to try different methods.

This is like what scientists found when studying large language models. Sometimes, when solving math problems, the model becomes too confident and doesn't try different methods, leading to poor performance on new problems.

So, expressing uncertainty is like trying different methods in a game. It gives you the chance to explore and learn, leading to better performance when facing new challenges!

Glossary

Self-Distillation

A post-training technique that uses two instances of the same model to improve performance, where one instance acts as the teacher model providing informative reward signals, and the other acts as the student model generating responses.

Used in the study to analyze the impact of self-distillation on LLM reasoning capabilities.

Epistemic Verbalization

During reasoning, the model expresses its uncertainty about certain reasoning paths through language. This expression can help the model adjust and correct during the reasoning process.

The study analyzes the suppressive effect of self-distillation on epistemic verbalization.

Large Language Model (LLM)

A deep learning-based natural language processing model capable of generating and understanding human language.

Models like Qwen3-8B, DeepSeek-Distill-Qwen-7B, and Olmo3-7B-Instruct were used in the study.

Reasoning Capability

The ability of a model to perform logical reasoning and decision-making when solving problems.

The study analyzes the impact of self-distillation on model reasoning capabilities.

In-Domain Optimization

Optimization of a model within the distribution of the training data to improve performance on known tasks.

The study analyzes the in-domain optimization effects of self-distillation with limited task coverage.

Out-of-Distribution (OOD) Performance

The performance of a model on data or tasks it has not seen before.

The study analyzes the impact of self-distillation on model OOD performance.

Qwen3-8B

A large language model used to study the impact of self-distillation on reasoning capabilities.

One of the models used in the study.

DeepSeek-Distill-Qwen-7B

A large language model used to study the impact of self-distillation on reasoning capabilities.

One of the models used in the study.

Olmo3-7B-Instruct

A large language model used to study the impact of self-distillation on reasoning capabilities.

One of the models used in the study.

Conditioning Context

The informational background on which the teacher model is based during self-distillation.

The study analyzes the impact of the richness of the conditioning context on self-distillation effects.

Information Richness

The amount and level of detail of information contained in the conditioning context.

The study analyzes the impact of information richness on uncertainty expression.

Task Coverage

The variety and number of tasks the model is exposed to during training.

The study analyzes the impact of task coverage on self-distillation effects.

Ablation Study

A method of analyzing the impact of removing or altering certain parts of a model on overall performance.

Used in the study to analyze the effects of self-distillation.

Reasoning Trajectory

The reasoning path and steps a model goes through when solving a problem.

The study analyzes the impact of self-distillation on reasoning trajectories.

Model Performance

The performance of a model on specific tasks, including metrics like accuracy and response time.

The study analyzes the impact of self-distillation on model performance.

Open Questions (Unanswered questions from this research)

  1. What are the effects of self-distillation on reasoning tasks in other domains? Current research focuses primarily on mathematical reasoning, and other domains may have different requirements for uncertainty expression.
  2. How do different model architectures and datasets affect the outcome of self-distillation? The models and datasets used in the study are limited and may not fully represent the behavior of all large language models.
  3. How can uncertainty expression be effectively retained during self-distillation? The study highlights its importance for reasoning capability, but concrete preservation techniques require further exploration.
  4. Through what mechanism does self-distillation affect generalization? The study identifies a mechanism by which self-distillation can degrade reasoning, but the details require further study.
  5. How can self-distillation be optimized to improve performance on unseen tasks? The study points toward optimizing reasoning behavior beyond reinforcing correct answer traces, but concrete strategies require further validation.

Applications

Immediate Applications

Mathematical Reasoning Task Optimization

The research findings can be used to optimize the performance of large language models in mathematical reasoning tasks, especially those requiring high levels of uncertainty expression.

Educational Applications

Large language models can be used in automated problem-solving and assessment systems in education, improving accuracy and reliability by appropriately expressing uncertainty.

Scientific Research Assistance

Large language models can be used in data analysis and reasoning tasks in scientific research, improving performance in complex tasks through optimized self-distillation.

Long-term Vision

Development of General Artificial Intelligence

By optimizing self-distillation and uncertainty expression, large language models can be advanced towards general artificial intelligence, improving performance across various tasks.

Cross-Domain Application Expansion

The research findings can be used to expand the application of large language models in different domains, including automated decision-making and reasoning tasks in healthcare, finance, and law.

Abstract

Self-distillation has emerged as an effective post-training paradigm for LLMs, often improving performance while shortening reasoning traces. However, in mathematical reasoning, we find that it can reduce response length while degrading performance. We trace this degradation to the suppression of epistemic verbalization - the model's expression of uncertainty during reasoning. Through controlled experiments varying conditioning context richness and task coverage, we show that conditioning the teacher on rich information suppresses uncertainty expression, enabling rapid in-domain optimization with limited task coverage but harming OOD performance, where unseen problems benefit from expressing uncertainty and adjusting accordingly. Across Qwen3-8B, DeepSeek-Distill-Qwen-7B, and Olmo3-7B-Instruct, we observe performance drops of up to 40%. Our findings highlight that exposing appropriate levels of uncertainty is crucial for robust reasoning and underscore the importance of optimizing reasoning behavior beyond merely reinforcing correct answer traces.

cs.CL cs.LG
