cs.IR
2603.11486
Quantized Inference for OneRec-V2
OneRec-V2 achieves a 49% latency reduction and a 92% throughput increase via FP8 quantized inference.
Yi Su, Xinchen Luo, Hongtao Cheng et al.
2026-03-12
50