Paper Insights - AI Arxiv Paper Analysis

cs.RO 2604.24648

Computational Design and Co-Robotic Fabrication for Material Reuse in Architecture

Integrating data-driven computational design with feedback-driven co-robotic fabrication for material reuse in architecture.

Arash Adel, Daniel Ruan, Ruxin Xie

2026-04-28 145

cs.RO 2604.24487

Guiding Vector Field Generation via Score-based Diffusion Model

Generates complex path vector fields using the Score-Induced Guiding Vector Field (SGVF) to enhance robotic navigation.

Zirui Chen, Shiliang Guo, Shiyu Zhao

2026-04-27 147

cs.RO 2604.22724

GCImOpt: Learning efficient goal-conditioned policies by imitating optimal trajectories

GCImOpt learns efficient goal-conditioned policies by imitating optimal trajectories, significantly improving control task success rates and efficiency.

Jon Goikoetxea, Jesús F. Palacián

2026-04-25 101

cs.RO 2604.22615

GazeVLA: Learning Human Intention for Robotic Manipulation

GazeVLA learns human intention to enhance robotic manipulation, significantly outperforming baseline methods.

Chengyang Li, Kaiyi Xiong, Yuan Xu et al.

2026-04-24 239

cs.RO 2604.22591

RedVLA: Physical Red Teaming for Vision-Language-Action Models

RedVLA identifies physical safety risks in VLA models through a two-stage process, achieving an attack success rate (ASR) of 95.5%.

Yuhao Zhang, Borong Zhang, Jiaming Fan et al.

2026-04-24 232

cs.RO 2604.22363

LeHome: A Simulation Environment for Deformable Object Manipulation in Household Scenarios

LeHome simulation environment achieves high-fidelity manipulation of deformable objects in household scenarios using PBD and FEM.

Zeyi Li, Yushi Yang, Shawn Xie et al.

2026-04-24 2 citations 234

cs.RO 2604.19728

VLA Foundry: A Unified Framework for Training Vision-Language-Action Models

VLA Foundry provides a unified framework for training Vision-Language-Action models, enhancing multi-task tabletop manipulation policies.

Jean Mercat, Sedrick Keh, Kushal Arora et al.

2026-04-22 158

cs.RO 2604.19683

Mask World Model: Predicting What Matters for Robust Robot Policy Learning

Mask World Model predicts semantic masks instead of pixels, enhancing robust robot policy learning, excelling in LIBERO and RLBench.

Yunfan Lou, Xiaowei Chi, Xiaojie Zhang et al.

2026-04-22 283

cs.RO 2604.19677

Learning Hybrid-Control Policies for High-Precision In-Contact Manipulation Under Uncertainty

MATCH method improves peg-in-hole task success rate by 35% under high noise, reducing average force by 30%.

Hunter L. Brown, Geoffrey Hollinger, Stefan Lee

2026-04-22 133

cs.RO 2604.19670

Multi-Cycle Spatio-Temporal Adaptation in Human-Robot Teaming

RAPIDDS framework enhances human-robot teaming efficiency through multi-cycle spatio-temporal adaptation, significantly improving plan fluency and user preference.

Alex Cuellar, Michael Hagenow, Julie Shah

2026-04-22 95

cs.RO 2604.19643

A Gesture-Based Visual Learning Model for Acoustophoretic Interactions using a Swarm of AcoustoBots

Gesture recognition using an OpenCLIP visual learning model raises AcoustoBot swarm interaction accuracy to 87.8%.

Alex Lin, Lei Gao, Narsimlu Kemsaram et al.

2026-04-22 114

cs.RO 2604.19618

Autonomous UAV Pipeline Near-proximity Inspection via Disturbance-Aware Predictive Visual Servoing

The ESKF-PRE-VMPC framework reduces RMSE by 52.63% and 75.04% in UAV pipeline inspection without wind.

Wen Li, Hui Wang, Jinya Su et al.

2026-04-22 116

cs.RO 2604.19536

LiveVLN: Breaking the Stop-and-Go Loop in Vision-Language Navigation

LiveVLN breaks the stop-and-go loop in vision-language navigation, reducing waiting time by up to 77.7%.

Xiangchen Wang, Weiye Zhu, Teng Wang et al.

2026-04-21 209

cs.RO 2604.18343

DAG-STL: A Hierarchical Framework for Zero-Shot Trajectory Planning under Signal Temporal Logic Specifications

DAG-STL framework achieves zero-shot trajectory planning under Signal Temporal Logic (STL) constraints, significantly enhancing complex task planning capabilities.

Ruijia Liu, Ancheng Hou, Xiao Yu et al.

2026-04-20 107

cs.RO 2604.18336

Enhancing Glass Surface Reconstruction via Depth Prior for Robot Navigation

Enhancing glass surface reconstruction using depth prior improves robot navigation accuracy.

Jiamin Zheng, Jingwen Yu, Guangcheng Chen et al.

2026-04-20 125

cs.RO 2604.18289

Relative State Estimation using Event-Based Propeller Sensing

Estimates relative state using event-based propeller sensing, with error under 3%.

Ravi Kumar Thakur, Luis Granados Segura, Jan Klivan et al.

2026-04-20 128

cs.RO 2604.18271

EmbodiedLGR: Integrating Lightweight Graph Representation and Retrieval for Semantic-Spatial Memory in Robotic Agents

EmbodiedLGR-Agent integrates lightweight graph representation and retrieval for efficient semantic-spatial memory in robots.

Paolo Riva, Leonardo Gargani, Matteo Frosi et al.

2026-04-20 119

cs.RO 2604.18236

COFFAIL: A Dataset of Successful and Anomalous Robot Skill Executions in the Context of Coffee Preparation

COFFAIL dataset includes successful and anomalous robot skill executions in coffee preparation, supporting imitation learning.

Alex Mitrevski, Ayush Salunke

2026-04-20 139

cs.RO 2604.16263

Semantic Area Graph Reasoning for Multi-Robot Language-Guided Search

The SAGR framework coordinates multi-robot language-guided search using semantic area graphs, improving efficiency by 18.8% in large environments.

Ruiyang Wang, Hao-Lun Hsu, Jiwoo Kim et al.

2026-04-18 154

cs.RO 2604.16201

DENALI: A Dataset Enabling Non-Line-of-Sight Spatial Reasoning with Low-Cost LiDARs

DENALI dataset enables non-line-of-sight spatial reasoning with low-cost LiDARs, covering 72,000 scenes.

Nikhil Behari, Diego Rivero, Luke Apostolides et al.

2026-04-18 1 citation 118