cs.IR
2603.11486
Quantized Inference for OneRec-V2
OneRec-V2 achieves a 49% reduction in latency and a 92% increase in throughput through FP8 quantized inference.
Yi Su, Xinchen Luo, Hongtao Cheng, et al.
2026-03-12
50