Nguyen Quang Trung
ngqtrung
AI & ML interests
None yet
Recent Activity
updated a collection 3 days ago
Qwen3-VL-8B RLVR — Models (v1)
updated a collection 3 days ago
Qwen3-VL-8B RLVR — Models (v1)
updated a collection 3 days ago
Qwen3-VL-8B RLVR — Models (v1)
Organizations
Qwen3-VL-8B RLVR — Models (v1)
Qwen3-VL-8B GRPO RLVR checkpoints from a token-dropout exploration study. On OMR, the ppexplore variant won (0.714); on video, the variants were in a dead heat at about 0.485.
Qwen3-VL-8B RLVR — Datasets (v1)
Curated SFT + GRPO RL datasets (video MC-QA, OMR math-image, OpenMMReasoner-RL, Vero) for Qwen3-VL-8B post-training.
models 12
ngqtrung/video-8b-grpo-ppexplore-n16k8
Image-Text-to-Text • 9B • Updated • 23
ngqtrung/video-8b-grpo-ppexplore
Image-Text-to-Text • 9B • Updated • 18
ngqtrung/video-8b-grpo-sft770
Image-Text-to-Text • 9B • Updated • 24
ngqtrung/video-8b-grpo-base
Image-Text-to-Text • 9B • Updated • 24
ngqtrung/omr-8b-grpo-base
Image-Text-to-Text • 9B • Updated • 24
ngqtrung/omr-8b-grpo-ppexplore
Image-Text-to-Text • 9B • Updated • 24
ngqtrung/verify-tool
Updated
ngqtrung/Qwen3-Omni-Thinker-30B-Instruct
Image-Text-to-Text • 32B • Updated • 4
ngqtrung/Qwen3-Omni-Thinker-30B-Thinking
Image-Text-to-Text • 32B • Updated • 3
ngqtrung/Qwen2.5-Omni-Thinker-7B
Image-Text-to-Text • 9B • Updated • 6
datasets 47
ngqtrung/vero-rl
Viewer • Updated • 95.1k • 18
ngqtrung/openmmreasoner-rl-74k
Updated • 21
ngqtrung/omr-grpo-val
Viewer • Updated • 15k • 20
ngqtrung/omr-grpo-train
Viewer • Updated • 75k • 17
ngqtrung/videorl-video-val
Viewer • Updated • 11.3k • 21
ngqtrung/videorl-video-rl-train
Viewer • Updated • 212k • 22
ngqtrung/ommr-sft-recipe
Updated • 19
ngqtrung/vmar-sft-distill-raw
Viewer • Updated • 506k • 20
ngqtrung/vmar-realgold-eval
Viewer • Updated • 4.24k • 24
ngqtrung/vmar-sft-seed-v2
Viewer • Updated • 105k • 23