minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models Paper • 2605.30263 • Published 18 days ago • 58
Granite 4.1 Language Models Collection Efficient language models for multilingual generation, coding, RAG, and AI assistant workflows. • 6 items • Updated Apr 29 • 58
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 23 items • Updated 3 days ago • 325
Article Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents nvidia • Apr 28 • 62
TIPSv2 Collection TIPSv2 foundational vision-language models. Webpage: https://gdm-tipsv2.github.io/ • 9 items • Updated Apr 14 • 36
Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 908
Structured 3D Latents for Scalable and Versatile 3D Generation Paper • 2412.01506 • Published Dec 2, 2024 • 90
Phi-4 Collection Phi-4 family of small language, multi-modal and reasoning models. • 17 items • Updated Jul 10, 2025 • 212
Orpheus Multilingual Research Release Collection Beta Release of multilingual models. • 12 items • Updated Apr 10, 2025 • 114
Article ChatGPT-4o's Image Generation Capabilities and Its Wild Examples prithivMLmods • Apr 5, 2025 • 22
Canary ASR/AST Collection A collection of multilingual and multitask speech to text models from NVIDIA NeMo 🐤 • 6 items • Updated 3 days ago • 35