Long-form RewardBench: Evaluating Reward Models for Long-form Generation
Long-form RewardBench evaluates reward models on long-form generation and reveals deficiencies in how current models handle long-form reward modeling.
Hui Huang, Yancheng He, Wei Liu et al.