S2MAM: Semi-supervised Meta Additive Model for Robust Estimation and Variable Selection
S2MAM uses bilevel optimization for robust estimation and variable selection, and is validated on 16 datasets.
Xuelin Zhang, Hong Chen, Yingjie Wang et al.
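The digest gives no detail on the objective; as background, a minimal sketch of the bilevel pattern S2MAM builds on, where an inner problem fits the model and an outer problem tunes a hyperparameter against held-out loss (all data and names here are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    X_tr, X_va = rng.normal(size=(80, 10)), rng.normal(size=(40, 10))
    w_true = np.zeros(10); w_true[:3] = 1.0          # sparse ground truth
    y_tr = X_tr @ w_true + 0.5 * rng.normal(size=80)
    y_va = X_va @ w_true + 0.5 * rng.normal(size=40)

    def inner_solve(lam):
        # Inner problem: ridge regression on the training split
        d = X_tr.shape[1]
        return np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(d), X_tr.T @ y_tr)

    # Outer problem: choose the regularizer that minimizes validation loss
    lams = np.logspace(-3, 2, 20)
    val_losses = [np.mean((X_va @ inner_solve(l) - y_va) ** 2) for l in lams]
    print(f"selected lambda = {lams[int(np.argmin(val_losses))]:.3g}")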
Sessa enhances long-range memory by embedding selective attention in feedback paths.
Liubomyr Horbatko
Introduces the Bounded Ratio Reinforcement Learning (BRRL) framework, which outperforms PPO on MuJoCo control tasks.
Yunke Ao, Le Chen, Bruce D. Lee et al.
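BRRL's bounded-ratio objective is not reproduced in this digest; for reference, the PPO clipped surrogate it is compared against bounds the same importance ratio r_t(\theta) = \pi_\theta(a_t \mid s_t) / \pi_{\theta_{\mathrm{old}}}(a_t \mid s_t):

    L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\big[\min\big(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}(r_t(\theta),\,1-\epsilon,\,1+\epsilon)\,\hat{A}_t\big)\big]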
The Apollo model integrates 28 medical modalities and 12 specialties to predict disease risk up to 5 years in advance.
Andrew Zhang, Tong Ding, Sophia J. Wagner et al.
Analyzes generalization bounds in symbolic regression with genetic programming, revealing complexities in structure selection and constant fitting.
Masahiro Nomura, Ryoki Hamano, Isao Ono
VS-WNO fails to translate spike sparsity into a deployment cost advantage on the Jetson Orin Nano.
Jason Yoo, Shailesh Garg, Souvik Chakraborty et al.
A smaller model post-trained with reinforcement learning excels in small-molecule drug design tasks, rivaling state-of-the-art frontier models.
Shriram Chennakesavalu, Kirill Shmilovich, Hayley Weir et al.
Task-reward optimization enhances Llama-3.2-3B-Instruct's performance on math datasets.
Sarthak Mittal, Leo Gagnon, Guillaume Lajoie
The HILBERT framework achieves significant performance improvements in long-sequence audio-text representation learning through dual contrastive learning and information-balanced regularization.
Habibeh Naderi, Behrouz Haji Soleimani, Stan Matwin
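HILBERT's exact dual objective is not given here; a minimal sketch of the standard symmetric (two-direction) contrastive loss that paired audio-text encoders typically build on (function and variable names are illustrative):

    import torch
    import torch.nn.functional as F

    def symmetric_contrastive_loss(audio_emb, text_emb, temperature=0.07):
        # InfoNCE in both directions over a batch of paired embeddings
        a = F.normalize(audio_emb, dim=-1)
        t = F.normalize(text_emb, dim=-1)
        logits = a @ t.T / temperature                  # pairwise similarities
        labels = torch.arange(len(a), device=a.device)  # matched pairs on diagonal
        return 0.5 * (F.cross_entropy(logits, labels) +
                      F.cross_entropy(logits.T, labels))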
Detects and suppresses reward hacking using Gradient Fingerprints, achieving superior performance on math, code, and logical reasoning benchmarks.
Songtao Wang, Quang Hieu Pham, Fangcong Yin et al.
Prototype-Grounded Concept Models (PGCMs) verify concept alignment via visual prototypes, enhancing interpretability.
Stefano Colamonaco, David Debot, Pietro Barbiero et al.
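The PGCM formulation itself is not detailed in this digest; a minimal ProtoPNet-style sketch of scoring concepts by their best-matching visual prototype, illustrating the grounding idea (shapes and names are assumptions):

    import torch.nn.functional as F

    def concept_scores(patch_embs, prototypes):
        # patch_embs: (num_patches, d) embeddings for one image
        # prototypes: (num_concepts, d) learned visual prototypes
        sims = F.normalize(patch_embs, dim=-1) @ F.normalize(prototypes, dim=-1).T
        return sims.max(dim=0).values  # one score per concept: best patch match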
The Muon optimizer outperforms AdamW in MLP-based tabular deep learning and is recommended when its training overhead is acceptable.
Yury Gorishniy, Ivan Rubachev, Dmitrii Feoktistov et al.
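Muon's central step orthogonalizes each weight matrix's momentum-smoothed gradient with a Newton-Schulz iteration before applying it; a minimal sketch of that step (coefficients follow the public reference implementation; the surrounding optimizer loop is omitted):

    import torch

    def newton_schulz_orthogonalize(G, steps=5, eps=1e-7):
        # Approximately map G to the nearest semi-orthogonal matrix
        a, b, c = 3.4445, -4.7750, 2.0315
        X = G / (G.norm() + eps)
        transposed = X.shape[0] > X.shape[1]
        if transposed:
            X = X.T
        for _ in range(steps):
            A = X @ X.T
            X = a * X + (b * A + c * (A @ A)) @ X
        return X.T if transposed else X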
Analyzes the stability and generalization of looped transformers using a fixed-point framework, validated on chess, Sudoku, and prefix-sum tasks.
Asher Labovich
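The fixed-point view treats a looped (weight-tied) transformer as iterating one block until the hidden state stops changing; a minimal sketch of that iteration (module structure, names, and the stopping rule are illustrative, not the paper's exact setup):

    import torch.nn as nn

    class LoopedBlock(nn.Module):
        # One weight-tied block applied repeatedly to an approximate fixed point
        def __init__(self, d, heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
            self.mlp = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(),
                                     nn.Linear(4 * d, d))
            self.norm1, self.norm2 = nn.LayerNorm(d), nn.LayerNorm(d)

        def forward(self, h, x, max_loops=32, tol=1e-4):
            for _ in range(max_loops):
                h_prev = h
                attn_out, _ = self.attn(h, h, h)
                h = self.norm1(h + attn_out + x)  # re-inject the input each loop
                h = self.norm2(h + self.mlp(h))
                if (h - h_prev).norm() < tol * h_prev.norm():
                    break                         # approximate fixed point reached
            return h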
SortedRL accelerates RL training for LLMs through online length-aware scheduling, enhancing efficiency and performance.
Yiqi Zhang, Huiqiang Jiang, Xufang Luo et al.
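SortedRL's online scheduler is not specified in this digest; the core efficiency idea, grouping rollouts of similar length so batches waste less padding and long stragglers don't stall short ones, can be sketched in a few lines (illustrative, not the paper's algorithm):

    def length_aware_batches(rollouts, batch_size):
        # Sort generated rollouts by token length, then batch neighbors,
        # so each batch pads to a similar maximum length.
        ordered = sorted(rollouts, key=len)
        return [ordered[i:i + batch_size]
                for i in range(0, len(ordered), batch_size)]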
Introduces Graph Energy Matching (GEM), surpassing discrete diffusion models in molecular graph generation.
Michal Balcerak, Suprosanna Shit, Chinmay Prabhakar et al.
Scaling DoRA achieves high-rank adaptation via factored norms and fused kernels, significantly reducing memory usage and improving speed.
Alexandra Zelenin, Alexandra Zhuravlyova
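The factored-norm and fused-kernel details are not given here; as background, the base DoRA reparameterization being scaled factors each weight into a learned per-column magnitude and a low-rank-updated direction:

    import torch

    def dora_weight(W0, A, B, m):
        # W0: (d_out, d_in) frozen pretrained weight
        # B: (d_out, r), A: (r, d_in) low-rank adapters; m: (1, d_in) magnitudes
        V = W0 + B @ A                       # low-rank-updated direction
        V = V / V.norm(dim=0, keepdim=True)  # normalize each column
        return m * V                         # rescale by learned magnitudes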
The SPA method uses carefully designed prompts to generate large-scale synthetic data for effective knowledge injection.
Kexian Tang, Jiani Wang, Shaowen Wang et al.
Proposes a new algorithm for online learning with ranking feedback, where traditional numeric feedback is unavailable.
Mingyang Liu, Yongshan Chen, Zhiyuan Fan et al.
A cost-aware evasion framework reveals robustness gaps in phishing detection; the median evasion cost is 2, and over 80% of attacks target just three low-cost features.
Julian Allagan, Mohamed Elbakary, Zohreh Safari et al.
The paper introduces a maximum-entropy exploration method using future state-action visitation measures, improving feature visitation and convergence speed.
Adrien Bolland, Gaspard Lambrechts, Damien Ernst
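One common way to write such an objective (the paper's exact formulation may differ) augments the expected reward under the discounted state-action visitation measure d^\pi with its entropy:

    \max_\pi \ \mathbb{E}_{(s,a)\sim d^\pi}[r(s,a)] + \lambda\,\mathcal{H}(d^\pi),
    \qquad \mathcal{H}(d^\pi) = -\sum_{s,a} d^\pi(s,a)\,\log d^\pi(s,a)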