GRPO, PPO, or another RL method for text classification

Is there a GRPO/PPO (or other RL) implementation for text classification tasks using encoder-only models like BERT/RoBERTa?
Any GitHub repo, example, or help would be really appreciated. Thanks in advance.

This may be an unresolved issue. The following article may be helpful for general information about GRPO, but it is not specific to classification tasks…
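
To make the idea concrete, here is a rough sketch of how GRPO's group-relative advantage could be adapted to a classifier. This is my own illustration, not an established recipe: it assumes the classifier's softmax output is treated as the policy, a group of labels is sampled per input, and the reward is simply 1 for a correct label and 0 otherwise. All function names are made up for the example, and the clipped probability ratio and KL penalty of full GRPO are left out for brevity. With an encoder like BERT/RoBERTa, `probs` would come from the softmax over the classification head's logits, and the loss would be backpropagated with your framework of choice.

```python
import math
import random


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group: (r - mean) / (std + eps).

    Uses the population standard deviation (a simplification).
    """
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def sample_group(probs, group_size, rng):
    """Sample `group_size` labels from the classifier's predicted distribution."""
    labels = list(range(len(probs)))
    return rng.choices(labels, weights=probs, k=group_size)


def grpo_style_loss(probs, true_label, group_size=8, seed=0):
    """Hypothetical GRPO-style surrogate loss for a single input.

    probs: class probabilities for one example (assumed strictly positive).
    Reward (an assumption): 1.0 if the sampled label is correct, else 0.0.
    """
    rng = random.Random(seed)
    sampled = sample_group(probs, group_size, rng)
    rewards = [1.0 if y == true_label else 0.0 for y in sampled]
    advantages = group_relative_advantages(rewards)
    # Policy-gradient surrogate: -mean(A_i * log p(y_i | x)).
    # Full GRPO would add ratio clipping and a KL term to a reference model.
    loss = -sum(
        a * math.log(probs[y]) for a, y in zip(advantages, sampled)
    ) / group_size
    return sampled, rewards, advantages, loss


if __name__ == "__main__":
    sampled, rewards, advantages, loss = grpo_style_loss([0.7, 0.2, 0.1], true_label=0)
    print("sampled labels:", sampled)
    print("rewards:", rewards)
    print("advantages:", advantages)
    print("loss:", loss)
```

Note that when every sampled label in a group gets the same reward, all advantages are zero and that input contributes no gradient, so with a 0/1 reward the signal comes only from inputs where the model is still uncertain. Since the true label is known in supervised classification, plain cross-entropy is usually the simpler baseline to compare against.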