Execution-State Capsules: Graph-Bound Execution-State Checkpoint and Restore for Low-Latency, Small-Batch, On-Device Physical-AI Serving
Proposes graph-bound execution-state capsules for low-latency, small-batch, on-device AI serving, enabling byte-exact snapshot and restore of execution state with sub-millisecond latency on GPU.
Liang Su