Jinsei Shiraishi

OsakanaTeishoku

Osakana7777777

AI & ML interests

Large Language Models, Computer Vision, AI/ML application to medical settings

Recent Activity

liked a model 3 days ago

unsloth/gemma-4-E4B-it-UD-MLX-4bit

liked a dataset 21 days ago

roneneldan/TinyStories

upvoted an article about 1 month ago

Mixture of Experts Explained

Organizations

upvoted 2 articles about 1 month ago

Article

Mixture of Experts Explained

osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k

Article

TRL v1.0: Post-Training Library Built to Move with the Field

qgallouedec, stevhliu, pcuenq, sergiopaniego • Mar 31 • 53

upvoted an article about 2 months ago

Article

Welcome Gemma 4: Frontier multimodal intelligence on device

merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 899

upvoted a paper 3 months ago

On the Optimal Reasoning Length for RL-Trained Language Models

Paper • 2602.09591 • Published Feb 10 • 6

upvoted 2 articles 3 months ago

Article

NVIDIA Nemotron 2 Nano 9B Japanese: A State-of-the-Art Small Language Model Supporting Japan's Sovereign AI

nvidia • Feb 17 • 25

Article

Transformers v5: Simple model definitions powering the AI ecosystem

lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 311

upvoted 2 articles 8 months ago

Article

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

ariG23498, sergiopaniego, reach-vb, pcuenq, ArthurZ, SaylorTwift, cyrilvallez • Sep 11, 2025 • 188

Article

The 4 Things Qwen-3’s Chat Template Teaches Us

cfahlgren1 • Apr 30, 2025 • 88

upvoted an article 11 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 777

upvoted a collection 11 months ago

OpenMathReasoning

Collection

Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" • 7 items • Updated 4 days ago • 47

upvoted a collection 12 months ago

Any-to-Any Models, Datasets, Spaces

Collection

19 items • Updated Feb 9 • 31

upvoted an article 12 months ago

Article

Vision Language Models Explained

merve, edbeeching • Apr 11, 2024 • 531

upvoted 2 collections about 1 year ago

Qwen2.5

Collection

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated Mar 2 • 720

Asagi-VLM

Collection

Asagi is a Japanese Vision & Language model, trained on a large-scale synthetic dataset. • 4 items • Updated Nov 27, 2025 • 7
