cs.SE 2603.23448

Code Review Agent Benchmark

The c-CRAB dataset evaluates code review agents; current agents solve only 40% of its tasks.

Yuntong Zhang, Zhiyuan Pan, Imam Nur Bani Yusuf et al.

2026-03-25 118

cs.CL 2603.22241

MemDLM: Memory-Enhanced DLM Training

MemDLM embeds a simulated denoising process into training via bi-level optimization, improving the training efficiency and long-context understanding of diffusion language models (DLMs).

Zehua Pei, Hui-Ling Zhen, Weizhe Lin et al.

2026-03-24 94