cs.IR 2603.11486

Quantized Inference for OneRec-V2

OneRec-V2 achieves a 49% reduction in latency and a 92% increase in throughput with FP8 quantized inference.

Yi Su, Xinchen Luo, Hongtao Cheng et al.

2026-03-12