Paper Insights - AI Arxiv Paper Analysis

cs.AI 2603.19191

OS-Themis: A Scalable Critic Framework for Generalist GUI Rewards

OS-Themis framework improves GUI agent performance by 10.3% on AndroidWorld using a multi-agent critic mechanism.

Zehao Li, Zhenyu Wu, Yibo Zhao et al.

2026-03-20 1 citation 113

cs.AI 2603.19182

Box Maze: A Process-Control Architecture for Reliable LLM Reasoning

Box Maze framework reduces LLM reasoning error rate to below 1% through memory grounding, structured inference, and boundary enforcement.

Zou Qiang

2026-03-20 128

cs.AI 2603.18573

Interplay: Training Independent Simulators for Reference-Free Conversational Recommendation

Proposes a reference-free simulation framework by training independent user and recommender simulators for more realistic dialogues.

Jerome Ramos, Feng Xia, Xi Wang et al.

2026-03-19 154

cs.AI 2603.18104

Adaptive Domain Models: Bayesian Evolution, Warm Rotation, and Principled Training for Geometric and Neuromorphic AI

Adaptive Domain Models leverage Bayesian distillation and warm rotation for efficient training in geometric and neuromorphic AI.

Houston Haynes

2026-03-18 1 citation 123

cs.AI 2603.16843

Internalizing Agency from Reflective Experience

LEAFE framework internalizes recovery agency from reflective experience, enhancing Pass@k performance in long-horizon tasks.

Rui Ge, Yichao Fu, Yuyang Qian et al.

2026-03-18 1 citation 168

cs.AI 2603.15607

Do Metrics for Counterfactual Explanations Align with User Perception?

The study finds that counterfactual explanation metrics do not align with user perception, necessitating more human-centered evaluation methods.

Felix Liedeker, Basil Ell, Philipp Cimiano et al.

2026-03-17 143

cs.AI 2603.15594

OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data

OpenSeeker democratizes frontier search agents by fully open-sourcing training data, utilizing controllable QA synthesis and denoised trajectory synthesis.

Yuwen Du, Rui Ye, Shuo Tang et al.

2026-03-17 3 citations 351

cs.AI 2603.15586

Computational Concept of the Psyche

Proposes a cognitive architecture viewing the psyche as an operating system for constructing AGI.

Anton Kolonin, Vladimir Krykov

2026-03-17 109

cs.AI 2603.13168

Developing and evaluating a chatbot to support maternal health care

Developed a chatbot for maternal health in India using stage-aware triage and hybrid retrieval, achieving 86.7% emergency recall.

Smriti Jha, Vidhi Jain, Jianyu Xu et al.

2026-03-14 163

cs.AI 2603.13099

Beyond Final Answers: CRYSTAL Benchmark for Transparent Multimodal Reasoning Evaluation

CRYSTAL benchmark evaluates multimodal reasoning transparency using Match F1 and Ordered Match F1, revealing systematic flaws in existing models.

Wayner Barrios, SouYoung Jin

2026-03-13 111

cs.AI 2603.13017

Structured Distillation for Personalized Agent Memory: 11x Token Reduction with Retrieval Preservation

Structured distillation reduces personalized agent memory tokens by 11x while preserving retrieval capabilities.

Sydney Lewis

2026-03-13 140

cs.AI 2603.12246

Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training

The study enhances performance in non-verifiable LLM post-training using reasoning LLM judges, with gpt-oss-120b serving as the gold-standard judge.

Yixin Liu, Yue Yu, DiJia Su et al.

2026-03-13 191

cs.AI 2603.12224

Portfolio of Solving Strategies in CEGAR-based Object Packing and Scheduling for Sequential 3D Printing

Portfolio-CEGAR-SEQ algorithm optimizes object packing and scheduling in 3D printing, reducing the number of printing plates used.

Pavel Surynek

2026-03-13 113