Models used in CHARM: Calibrating Reward Models With Chatbot Arena Scores.
shawnxzhu
AI & ML interests
None yet
Recent Activity
authored a paper 10 days ago
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
submitted a paper 10 days ago
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
upvoted a paper 10 days ago
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL