MOSS-Audio Collection An open-source audio understanding model supporting speech recognition, environmental sound analysis, music understanding, time-aware QA, and complex • 5 items • Updated 2 days ago • 37
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 8 days ago • 106
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 8 days ago • 150
WAON Collection WAON: Large-Scale and High-Quality Japanese Image-Text Pair Dataset for Vision-Language Models • 4 items • Updated Mar 2 • 2
Marco-MoE Collection A suite of multilingual MoE models with highly sparse architectures • 5 items • Updated 14 days ago • 16
Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 • 21 days ago • 875
Sarashina2.2 Collection Large Language Models developed by SB Intuitions. Pretrained and instruction-tuned models are available in three sizes: 0.5B, 1B, and 3B. • 6 items • Updated Mar 5, 2025 • 9
Constructing Synthetic Instruction Datasets for Improving Reasoning in Domain-Specific LLMs: A Case Study in the Japanese Financial Domain Paper • 2603.01353 • Published Mar 2 • 3