Paper Insights - AI Arxiv Paper Analysis

cs.LG 2606.06486

Regret Minimization with Adaptive Opponents in Repeated Games

Introduces RP-Regret for adaptive opponents, with algorithms achieving sublinear regret and better equilibria in repeated games.

Mingyang Liu, Asuman Ozdaglar, Tiancheng Yu et al.

2026-06-05 77

cs.LG 2606.06470

PC Layer: Polynomial Weight Preconditioning for Improving LLM Pre-Training

Proposes a Polynomial Weight Preconditioning (PC) layer that regulates the singular-value spectrum to accelerate LLM pre-training, achieving a 2× speedup on Llama-1B with no inference overhead.

Senmiao Wang, Tiantian Fang, Haoran Zhang et al.

2026-06-05 102

cs.LG 2606.06364

End-to-End Subgraph Detection with GraphDETR

GraphDETR formulates subgraph detection as set prediction, achieving 91.2 AP on molecular datasets with graphs up to 1000 nodes and 50-node substructures.

Dexiong Chen, Till Hendrik Schulz, Karsten Borgwardt

2026-06-05 60

cs.LG 2606.06329

Efficient Mean Curvature Computation on High-Dimensional Data Manifolds

Proposes an algebraic identity and low-rank SVD approximation to compute mean curvature efficiently on high-dimensional data manifolds, reducing complexity from O(m^4) to near O(k^2 m).

Alexandre L. M. Levada

2026-06-05 61

cs.LG 2606.05693

MolE-RAG: Molecular Structure-Enhanced Retrieval-Augmented Generation for Chemistry

MolE-RAG integrates literature, molecular features, and structural similarity to enhance LLM-based molecular property prediction, boosting ROC-AUC by up to 28% and reducing RMSE by 67%.

Joey Chan, Wonbin Kweon, Ashley Shin et al.

2026-06-04 69

cs.LG 2606.05152

Reinforcement Learning from Rich Feedback with Distributional DAgger

Proposes DistIL, a distributional imitation learning algorithm with monotonic improvement guarantees, leveraging rich feedback for complex reasoning tasks.

Rishabh Agrawal, Jacob Fein-Ashley, Paria Rashidinejad

2026-06-04 72

cs.LG 2606.03980

Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill

Skill-RM unifies heterogeneous evaluation criteria via agent skills, enabling dynamic resource orchestration, outperforming traditional judges with a 3-6% improvement on RewardBench2.

Tao Chen, Gangwei Jiang, Pengyu Cheng et al.

2026-06-03 78

cs.LG 2606.03979

Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories

Introduces the 'Sleep' paradigm, whose Knowledge Seeding and Dreaming mechanisms enable LLMs to self-modify and consolidate memories for continual learning.

Ali Behrouz, Farnoosh Hashemi, Vahab Mirrokni

2026-06-03 2 citations 46

cs.LG 2606.03962

Using Reward Uncertainty to Induce Diverse Behaviour in Reinforcement Learning

Proposes ROSA, a framework based on reward distributions that induces diverse behaviors without performance loss, leveraging set functions and unbiased gradient estimators.

Anthony GX-Chen, Ankit Anand, Gheorghe Comanici et al.

2026-06-03 50

cs.LG 2606.03584

Training a Predictive Coding Network on ImageNet using Equilibrium Propagation

This paper introduces an equilibrium propagation (EP)-based training method for deep predictive coding networks (PCNs), achieving 13.23% Top-5 error on ImageNet with a 10-layer VGG model, close to the 12.2% backpropagation baseline.

Tugdual Kerjan, Rasmus Høier, Benjamin Scellier

2026-06-02 39

cs.LG 2605.31562

Effective Biological Representation Learning by Masking Gene Expression

This paper introduces TxFM, a masked autoencoder trained on 1.4 million RNA-seq samples, outperforming large-scale foundation models in gene representation learning.

Kian Kenyon-Dean, Alina Selega, Ihab Bendidi et al.

2026-05-30 82

cs.LG 2605.31559

Functional Attention: From Pairwise Affinities to Functional Correspondences

Proposes Functional Attention, transforming pointwise attention into linear operators in function spaces, achieving resolution-invariant PDE solving and 3D segmentation with superior performance.

Jiefang Xiao, Maolin Gao, Simon Weber et al.

2026-05-30 89

cs.LG 2605.31261

Why Linear Recurrent Memory Works in Partially Observable Reinforcement Learning

This paper establishes the theoretical foundation for linear recurrent memory units (ALF) in partially observable reinforcement learning, constructing two linear filters that precisely replicate belief dynamics.

Yike Zhao, Onno Eberhard, Malek Khammassi et al.

2026-05-29 77

cs.LG 2605.30337

Efficient Test-Time Finetuning of LLMs via Convex Reconstruction and Gradient Caching

HullFT employs convex reconstruction and gradient caching for efficient test-time fine-tuning, improving the speed-quality tradeoff in large language models.

Alaa Khamis, Alaa Maalouf

2026-05-29 177

cs.LG 2605.30119

Evolving Features vs Evolving Entire Trees with GP for Interpretable Survival Analysis

Combining multi-objective genetic programming with survival tree optimization, this study enhances predictive accuracy and interpretability in survival analysis, validated on two real-world datasets.

Thalea Schlender, Peter A. N. Bosman, Tanja Alderliesten

2026-05-28 56

cs.LG 2605.29543

SCOPE: A Lightweight-training LLM Framework for Air Traffic Control Readback Monitoring

SCOPE integrates a frozen LLM with an open-set plugin classifier, achieving 91.05% open-set detection accuracy and 96.63% anomaly correction in ATC readback monitoring.

Qihan Deng, Minghua Zhang, Yang Yang et al.

2026-05-28 83

cs.LG 2605.28775

Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents

The LearnWeak framework uses a stronger reference agent to identify model weaknesses and synthesizes targeted tasks, improving small computer-use agents (CUAs) by 11.6% on average across 8 domains.

Suji Kim, Kangsan Kim, Sung Ju Hwang

2026-05-28 84

cs.LG 2605.28739

BIRDNet: Mining and Encoding Boolean Implication Knowledge Graphs as Interpretable Deep Neural Networks

BIRDNet encodes mined Boolean implication graphs into sparse, interpretable deep neural networks, achieving near state-of-the-art AUROC with 96x fewer active parameters on six biomedical datasets.

Tirtharaj Dash

2026-05-28 65

cs.LG 2605.27354

Guiding LLM Post-training Data Engineering with Model Internals from Sparse Autoencoders

SAERL leverages Sparse Autoencoder activations to model diversity, difficulty, and quality for LLM post-training data engineering, boosting Qwen2.5-Math-1.5B accuracy by 3%.

Yi Jing, Zao Dai, Jinwu Hu et al.

2026-05-27 168

cs.LG 2605.27352

From Scores to Gibbs Correctors: Accelerating Uniform-Rate Discrete Diffusion Models

Proposes the GADD algorithm, which achieves O(polylog(ε⁻¹)) sampling complexity for uniform-rate discrete diffusion models, significantly accelerating sampling.

Yuchen Liang, Ness Shroff, Yingbin Liang

2026-05-27 86