Instructions to use DrStrangel0ve/Qwen3-VL-4B-SpreadsheetBench-QLoRA with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use DrStrangel0ve/Qwen3-VL-4B-SpreadsheetBench-QLoRA with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("unsloth/Qwen3-VL-4B-Thinking-unsloth-bnb-4bit") model = PeftModel.from_pretrained(base_model, "DrStrangel0ve/Qwen3-VL-4B-SpreadsheetBench-QLoRA") - Notebooks
- Google Colab
- Kaggle
Qwen3-VL-4B SpreadsheetBench QLoRA
This repository contains a PEFT/QLoRA adapter trained for spreadsheet manipulation code generation on top of:
unsloth/Qwen3-VL-4B-Thinking-unsloth-bnb-4bit
The adapter was trained to generate executable Python, primarily openpyxl, for SpreadsheetBench-style workbook manipulation tasks.
Runtime helpers and vLLM examples live at:
https://github.com/DrStrangel0ve/spreadsheetbench-qwen3vl-qlora
Important note
The best benchmark result reported below uses this adapter together with a tightened SpreadsheetBench inference/runtime layer that enforces code-only output, output-path correctness, workbook saves, target-change checks, and deterministic recovery templates for common spreadsheet failure patterns.
Adapter-only performance improved modestly. Adapter plus the tightened runtime produced the largest practical gain.
Results
On the 200-case SpreadsheetBench slice used during development:
| System | Tests passed | Soft avg | Hard avg | Full-pass cases | Output workbooks |
|---|---|---|---|---|---|
| Original base GGUF | 126/600 | 0.2100 | 0.1800 | 36/200 | 600 |
| Base GGUF + tightened templates | 143/600 | 0.2383 | 0.2200 | 44/200 | 600 |
| Initial Kaggle/template QLoRA | 122/600 | 0.2033 | 0.1750 | 35/200 | 583 |
| Failure-algorithmic QLoRA v2 | 135/600 | 0.2250 | 0.1950 | 39/200 | 593 |
| Failure-algorithmic QLoRA v2 + tightened templates | 157/600 | 0.2617 | 0.2300 | 46/200 | 593 |
Training
The selected adapter is outputs/qlora_failure_algorithmic_v2.
Training configuration:
- Base model:
unsloth/Qwen3-VL-4B-Thinking-unsloth-bnb-4bit - Method: QLoRA / PEFT LoRA
- Target modules:
q_proj,k_proj,v_proj,o_proj - LoRA rank: 8
- LoRA alpha: 16
- LoRA dropout: 0.05
- Max examples: 1800
- Epochs: 1
- Max sequence length: 896
- Learning rate: 7e-5
- Warmup ratio: 0.04
- Gradient accumulation: 4
- Weight decay: 0.0
- Max grad norm: 0.25
The adapter was initialized from an earlier Kaggle/template adapter trained with learning rate 3e-4. A failure-focused adapter at 1.5e-4 and a later v3 continuation at 5e-5 were tested but not promoted.
Data
The training mix included:
- Kaggle-derived synthetic spreadsheet tasks.
- Spreadsheet template tasks.
- Failure-archetype tasks derived from benchmark failure analysis.
The Kaggle synthetic set was built locally from downloaded CSV/XLSX files. Candidate solvers were executed to create gold output workbooks before examples were accepted. The Kaggle generation accepted 278 examples and rejected 18.
Loading with Transformers and PEFT
from peft import PeftModel
from transformers import AutoModelForImageTextToText, AutoTokenizer, BitsAndBytesConfig
import torch
base_model = "unsloth/Qwen3-VL-4B-Thinking-unsloth-bnb-4bit"
adapter = "DrStrangel0ve/Qwen3-VL-4B-SpreadsheetBench-QLoRA"
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.float16,
bnb_4bit_use_double_quant=True,
)
tokenizer = AutoTokenizer.from_pretrained(adapter, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
base_model,
trust_remote_code=True,
device_map="auto",
quantization_config=bnb_config,
)
model = PeftModel.from_pretrained(model, adapter)
model.eval()
Serving with vLLM
vLLM supports serving LoRA adapters with --enable-lora and --lora-modules name=path_or_repo. See the official vLLM LoRA documentation: https://docs.vllm.ai/en/stable/features/lora.html
Example:
vllm serve unsloth/Qwen3-VL-4B-Thinking-unsloth-bnb-4bit \
--enable-lora \
--lora-modules spreadsheet=DrStrangel0ve/Qwen3-VL-4B-SpreadsheetBench-QLoRA \
--max-model-len 4096
Depending on your vLLM version and GPU, you may need additional quantization flags for the Unsloth 4-bit base model.
Limitations
- This is a QLoRA adapter, not a fully merged standalone model.
- SpreadsheetBench scores depend strongly on the execution harness and postprocessing.
- The best reported score includes deterministic runtime templates, not just raw model generation.
- The adapter is specialized for code-generation style spreadsheet tasks and should not be treated as a general-purpose finance or spreadsheet reasoning model.
- Downloads last month
- 28
Model tree for DrStrangel0ve/Qwen3-VL-4B-SpreadsheetBench-QLoRA
Base model
Qwen/Qwen3-VL-4B-Thinking