arxiv:2604.14258
Wangjie Gan
zju-omniai
AI & ML interests
None yet
Recent Activity
authored a paper about 15 hours ago
GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification
commented on a paper about 23 hours ago
GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification
upvoted a paper about 23 hours ago
GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification