Qwen3.6-35B-A3B is the strongest mid-sized LLM on nearly all benchmarks.
Runs in 23 GB of RAM via Unsloth Dynamic GGUFs.
GGUFs to run: unsloth/Qwen3.6-35B-A3B-GGUF
Guide: https://unsloth.ai/docs/models/qwen3.6
We just compressed Qwen 3.6's KV cache 4x with zero quality loss (PPL actually improves slightly).
Works automatically on the hybrid architecture: it detects standard vs. linear attention layers.
Model card: huggingface.co/fraQtl/Qwen3.6-35B-A3B-fraQtl-kv :)
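For anyone curious what "detects standard vs linear attention layers" implies for memory: only the standard (quadratic) attention layers keep a KV cache that grows with sequence length, so those are the layers a compressor would target, while linear-attention layers carry a fixed-size state. Here is a minimal sketch of that bookkeeping. The layer-type labels, layer counts, and head dimensions below are illustrative placeholders, not Qwen3.6's actual config or the fraQtl implementation:

```python
# Hypothetical sketch: partition a hybrid model's layers by attention type,
# then estimate the KV cache footprint before/after 4-bit compression.
# All names and numbers here are assumptions for illustration.

def classify_layers(layer_types):
    """Split layers into standard attention (has a growing KV cache)
    and linear attention (fixed-size state, nothing to compress)."""
    standard = [i for i, t in enumerate(layer_types) if t == "full_attention"]
    linear = [i for i, t in enumerate(layer_types) if t == "linear_attention"]
    return standard, linear

def kv_cache_bytes(n_layers, seq_len, n_kv_heads, head_dim, bytes_per_elt):
    # K and V each store seq_len * n_kv_heads * head_dim elements per layer.
    return int(2 * n_layers * seq_len * n_kv_heads * head_dim * bytes_per_elt)

# Illustrative hybrid layout: 1 standard layer per block of 4.
layers = ["full_attention" if i % 4 == 3 else "linear_attention" for i in range(48)]
std, lin = classify_layers(layers)

fp16 = kv_cache_bytes(len(std), 32768, 4, 128, 2)    # 16-bit cache
int4 = kv_cache_bytes(len(std), 32768, 4, 128, 0.5)  # 4-bit cache

print(f"{len(std)} standard / {len(lin)} linear layers")
print(f"KV cache at 32k ctx: {fp16 / 2**20:.0f} MiB fp16 -> "
      f"{int4 / 2**20:.0f} MiB 4-bit ({fp16 // int4}x smaller)")
```

The point of the layer detection is exactly this split: compressing the state of a linear-attention layer 4x would buy little and could hurt quality, whereas the standard layers' caches dominate memory at long context.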
Are 27B and 122B coming soon?
p.s.: Qwen3.5 is starting to show promise... it's the first Qwen reasoning model that has worked, imho, since QwQ -- the first not to reason too much or get stuck in loops too often. However, it still feels like it's not truly understanding; like it's just parroting what it thinks a teacher model would say, even when that isn't aligned with what the user requested. Hope this is part of the "quality" focus that was mentioned a while back.
oh, 3.6 35B is a literal never-ending reasoning loop for me -- 3 out of 6 times I need to kill the server, that type of deal.