GRPO Trainer for VLM?

I wonder whether Hugging Face has any plans to support GRPO Trainer or DPO Trainer for VLMs?

It would be a huge contribution to the community.

Sorry, I just saw the VLM part :slight_smile:

This seems to be an ongoing issue (request).

Any updates on this?

Welcome, first-time posters @SabaPivot and @Cherran.

Someone wrote the code, but it doesn’t seem to have been committed yet.