Paper 解读 - Chinese-Language Explainer Platform for arXiv Papers

cs.CL 2605.28805

OmniVerifier-M1: Multimodal Meta-Verifier with Explicit Structured Recalibration

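OmniVerifier-M1 uses symbolic outputs and decoupled reinforcement learning to improve the accuracy and efficiency of visual verification, reaching a score of 0.68 on ViVerBench.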

Xinchen Zhang, Bowei Liu, Jiale Liu et al.

2026-05-28 96

cs.CL 2605.28802

Human Label Variation as Stable Signal: Learning Annotator-Specific Explanation Behavior via Cross-Annotator Preference Optimization

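Proposes CAPO, which uses cross-annotator preference optimization so that the model learns each annotator's stable explanation behavior, significantly outperforming prompting and SFT.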

Beiduo Chen, Pingjun Hong, Ziyun Zhang et al.

2026-05-28 77

cs.LG 2605.28775

Learn from Weaknesses: Automated Domain Specialization for Small Computer-Use Agents

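Proposes the LearnWeak framework, which uses a strong reference agent to automatically identify the weaknesses of small computer-use agents (CUAs), improving performance across 8 software domains by an average of 11.6 percentage points.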

Suji Kim, Kangsan Kim, Sung Ju Hwang

2026-05-28 87

cs.CL 2605.28773

Rethinking Memory as Continuously Evolving Connectivity

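FluxMem models memory as a dynamic heterogeneous graph via a three-stage evolution mechanism, significantly improving LLM adaptability and generalization in complex environments.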

Jizhan Fang, Buqiang Xu, Zhixian Wang et al.

2026-05-28 208

cs.LG 2605.28739

BIRDNet: Mining and Encoding Boolean Implication Knowledge Graphs as Interpretable Deep Neural Networks

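BIRDNet mines Boolean implication relationships to build sparse, interpretable deep neural networks; validated on six biomedical datasets, it uses significantly fewer parameters than a conventional MLP.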

Tirtharaj Dash

2026-05-28 67

cs.RO 2605.28726

How VLAs Fail Differently: Black-Box Action Monitoring Reveals Architecture-Specific Failure Signatures

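Using black-box monitoring, this study reveals distinct failure signatures at the motion-command level across three VLA architectures, underscoring the importance of architecture-matched monitoring.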

Krishnam Gupta

2026-05-28 119

cs.CY 2605.27371

Algorithmic Monocultures in Hiring

Using pymetrics data, reveals that reliance on a single algorithm vendor leads to racial unfairness and systemic rejection in hiring.

Rishi Bommasani, Sarah H. Bana, Kathleen A. Creel et al.

2026-05-27 447

cs.AI 2605.27366

MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation

MUSE-Autoskill improves task success rates through skill lifecycle management, achieving a skill reuse rate of 68.4%.

Huawei Lin, Peng Li, Jie Song et al.

2026-05-27 394

cs.CV 2605.27365

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

LocateAnything is built on parallel box decoding and trained on 138M samples, significantly improving grounding speed and accuracy.

Shihao Wang, Shilong Liu, Yuanguo Kuang et al.

2026-05-27 83

cs.AI 2605.27361

Natural Language Query to Configuration for Retrieval Agents

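BRANE uses an LLM to extract query features and dynamically optimizes the retrieval configuration, achieving 89% cost savings on datasets such as MuSiQue.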

Melissa Z. Pan, Negar Arabzadeh, Mathew Jacob et al.

2026-05-27 65

cs.LG 2605.27354

Guiding LLM Post-training Data Engineering with Model Internals from Sparse Autoencoders

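The SAERL framework uses internal activations from sparse autoencoders to improve the diversity, difficulty ranking, and quality filtering of LLM post-training data, raising Qwen2.5-Math-1.5B accuracy by 3%.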

Yi Jing, Zao Dai, Jinwu Hu et al.

2026-05-27 171

cs.LG 2605.27352

From Scores to Gibbs Correctors: Accelerating Uniform-Rate Discrete Diffusion Models

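Proposes the GADD algorithm, which reduces the sampling complexity of uniform-rate discrete diffusion models to O(polylog(ε⁻¹)), significantly improving sampling efficiency.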

Yuchen Liang, Ness Shroff, Yingbin Liang

2026-05-27 89

cs.CV 2605.27343

Towards Controllable Image Generation through Representation-Conditioned Diffusion Models

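A diffusion model conditioned on DINO representations achieves high-quality, controllable image generation, validated on the LSUN and CelebA datasets.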

Nithesh Chandher Karthikeyan, Jonas Unger, Gabriel Eilertsen

2026-05-27 54

cs.CL 2605.27333

FinHarness: An Inline Lifecycle Safety Harness for Finance LLM Agents

FinHarness, an inline lifecycle safety harness, lowers the ASR on the FinVault benchmark to 15% and reduces calls to the advanced judge by 4.7x.

Haoxuan Jia, Yang Liu, Bin Chong et al.

2026-05-27 146

cs.LG 2605.27306

Normal Guidance is what Attention Needs

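Proposes Normal Guidance, a regularization method that improves the slice-level localization performance of attention-based MIL on CT data with 4 million slices.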

Ethan Harvey, Dennis Johan Loevlie, Michael C. Hughes

2026-05-27 81

cs.MA 2605.26448

Constitutional Arms Races in the Public Goods Game: Co-Evolving LLM Constitutions Under Cooperation-Defection Pressure

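Adversarial evolution of constitutions via LLM-based evolutionary search reaches a stable equilibrium near 0.78 between the blue and red sides in the public goods game.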

Ujwal Kumar, Arth Singh, Hershraj Niranjani et al.

2026-05-26 66

cs.LG 2605.26248

Unified Neural Scaling Laws

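Unified Neural Scaling Laws (UNSL) accurately model deep learning performance when multiple dimensions vary simultaneously, improving prediction accuracy by more than 10%.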

Ethan Caballero, Priyank Jaini, David Krueger et al.

2026-05-26 66

cs.CV 2605.22823

Which Way Did It Move? Diagnosing and Overcoming Directional Motion Blindness in Video-LLMs

Proposes the DeltaDirect method and the MoDirect dataset, raising accuracy in the synthetic domain from 25.9% to 85.4%.

Jongseo Lee, Hyuntak Lee, Sunghun Kim et al.

2026-05-22 54

cs.CL 2605.22821

Tokenisation via Convex Relaxations

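ConvexTok optimizes the tokeniser via convex relaxation, improving compression; at a vocabulary size of 128k it is near-optimal, with a notable improvement in BpB.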

Jan Tempus, Philip Whittington, Craig W. Schmidt et al.

2026-05-22 230

cs.CV 2605.22818

MotiMotion: Motion-Controlled Video Generation with Visual Reasoning

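MotiMotion combines vision-language model reasoning with confidence-based modulation to achieve motion-controlled video generation, outperforming MagicMotion and Wan-Move on MotiBench.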

Lee Hsin-Ying, Hanwen Jiang, Yiqun Mei et al.

2026-05-22 53