VOLT: Vision and Language Trajectory Segmentation for Faster-than-Demonstration Policies
VOLT uses vision-language models to segment trajectories, enabling robots to execute tasks up to 2.57× faster than the demonstrations while maintaining success rates.
Robert Ramirez Sanchez, Daniel J. Evans, Dylan P. Losey et al.