lbourdois committed · Commit a9ad94d · verified · 1 Parent(s): c50f552

Update model card for Bashkir

Files changed (1)
  1. README.md +89 -86
README.md CHANGED
@@ -1,86 +1,89 @@
1
- ---
2
- language: bak
3
- license: apache-2.0
4
- tags:
5
- - trimmed
6
- - qwen3.5
7
- base_model: Qwen/Qwen3.5-2B
8
- base_model_relation: quantized
9
- datasets:
10
- - Lumberjackk/fineweb-2-trimming
11
- ---
12
-
13
- # Qwen3.5-2B-bak-32768
14
-
15
- This model is a **19.95% smaller** version of [Qwen/Qwen3.5-2B](https://huggingface.co/Qwen/Qwen3.5-2B) optimized for Bashkir language via vocabulary size reduction using the [trimming](https://huggingface.co/blog/introduction-to-trimming) method.
16
-
17
- This trimmed model should perform similarly to the original model with only **32,768 tokens** and a much smaller memory footprint. However, it may not perform well for other languages as tokens not commonly used in Bashkir were removed from the vocabulary.
18
-
19
- Note: Qwen3.5 is a multimodal (vision-language) model. This trimmed version retains the vision encoder but reduces only the text vocabulary.
20
-
21
- ## Model Statistics
22
-
23
- | Metric | Original | Trimmed | Reduction |
24
- |--------|----------|---------|-----------|
25
- | **Vocabulary size** | 248,044 tokens | 32,768 tokens | **86.79%** |
26
- | **Model size** | 2,213,241,664 params | 1,771,791,168 params | **19.95%** |
27
-
28
-
29
- ## Mining Dataset Statistics
30
-
31
- - **Number of texts used for mining**: 200,000 texts
32
- - **Dataset**: [Lumberjackk/fineweb-2-trimming](https://huggingface.co/datasets/Lumberjackk/fineweb-2-trimming)
33
-
34
- ## Usage
35
-
36
- ```python
37
- from transformers import AutoModelForCausalLM, AutoTokenizer
38
-
39
- model_name = "AlphaEdge-AI/Qwen3.5-2B-bak-32768"
40
-
41
- # load the tokenizer and the model
42
- tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
43
- model = AutoModelForCausalLM.from_pretrained(
44
- model_name,
45
- torch_dtype="auto",
46
- device_map="auto",
47
- trust_remote_code=True
48
- )
49
-
50
- # prepare the model input
51
- prompt = "Your prompt in Bashkir."
52
- messages = [
53
- {"role": "user", "content": prompt}
54
- ]
55
- text = tokenizer.apply_chat_template(
56
- messages,
57
- tokenize=False,
58
- add_generation_prompt=True
59
- )
60
- model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
61
-
62
- # conduct text completion
63
- generated_ids = model.generate(
64
- **model_inputs,
65
- max_new_tokens=32768
66
- )
67
- output_ids = generated_ids[0][len(model_inputs.input_ids[0]):]
68
- content = tokenizer.decode(output_ids, skip_special_tokens=True)
69
-
70
- print("content:", content)
71
-
72
- ```
73
-
74
- ## Citation
75
-
76
- #### Qwen3.5
77
-
78
- ```bibtex
79
- @misc{qwen3.5,
80
- title = {Qwen3.5: Towards Native Multimodal Agents},
81
- author = {Qwen Team},
82
- month = {February},
83
- year = {2026},
84
- url = {https://qwen.ai/blog?id=qwen3.5}
85
- }
86
- ```
 
 
 
 
1
+ ---
2
+ pipeline_tag: text-generation
3
+ language: bak
4
+ license: apache-2.0
5
+ tags:
6
+ - trimmed
7
+ library_name: transformers
8
+ base_model: Qwen/Qwen3.5-2B
9
+ base_model_relation: quantized
10
+ datasets:
11
+ - lbourdois/fineweb-2-trimming
12
+ ---
13
+
14
+ # Qwen3.5-2B-bak-32768
15
+ This model is a **19.95% smaller** version of [Qwen/Qwen3.5-2B](https://huggingface.co/Qwen/Qwen3.5-2B), optimized for the **Bashkir** language by reducing the vocabulary size with the [trimming](https://huggingface.co/blog/lbourdois/introduction-to-trimming) method.
16
+ This trimmed model keeps only 32,768 tokens and should perform similarly to the original model on Bashkir text, with a smaller memory footprint. However, it may not perform well in other languages, because tokens not commonly used in Bashkir were removed from the vocabulary.
17
+
18
+ ## Model Statistics
19
+ | Metric | Original | Trimmed | Reduction |
20
+ |--------|----------|---------|-----------|
21
+ | **Vocabulary size** | 248,320 tokens | 32,768 tokens | **86.80%** |
22
+ | **Model size** | 2,213,241,664 params | 1,771,791,168 params | **19.95%** |
23
+
24
+ ![image](https://raw.githubusercontent.com/lbourdois/blog/refs/heads/master/assets/images/Trimming/qwen.5-2B-32768.png)
25
+
26
+ ## Mining Dataset Statistics
27
+ - **Number of texts used for mining**: 179,964 texts
28
+ - **Dataset**: [lbourdois/fineweb-2-trimming](https://huggingface.co/datasets/lbourdois/fineweb-2-trimming)
29
+
30
+ ## Usage
31
+ ```python
32
+ from transformers import AutoModelForCausalLM, AutoTokenizer
33
+
34
+ model_name = "alphaedge-ai/Qwen3.5-2B-bak-32768"
35
+
36
+ # load the tokenizer and the model
37
+ tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
38
+ model = AutoModelForCausalLM.from_pretrained(
39
+     model_name,
40
+     torch_dtype="auto",
41
+     device_map="auto",
42
+     trust_remote_code=True
43
+ )
44
+
45
+ # prepare the model input
46
+ prompt = "Your prompt in Bashkir."
47
+ messages = [
48
+     {"role": "user", "content": prompt}
49
+ ]
50
+ text = tokenizer.apply_chat_template(
51
+     messages,
52
+     tokenize=False,
53
+     add_generation_prompt=True
54
+ )
55
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
56
+
57
+ # conduct text completion
58
+ generated_ids = model.generate(
59
+     **model_inputs,
60
+     max_new_tokens=32768
61
+ )
62
+ output_ids = generated_ids[0][len(model_inputs.input_ids[0]):]
63
+ content = tokenizer.decode(output_ids, skip_special_tokens=True)
64
+
65
+ print("content:", content)
66
+ ```
67
+
68
+ ## Citations
69
+
70
+ #### Qwen3.5
71
+ ```bibtex
72
+ @misc{qwen3.5,
73
+ title = {Qwen3.5: Towards Native Multimodal Agents},
74
+ author = {Qwen Team},
75
+ month = {February},
76
+ year = {2026},
77
+ url = {https://qwen.ai/blog?id=qwen3.5}
78
+ }
79
+ ```
80
+
81
+ #### Trimming blog post
82
+ ```bibtex
83
+ @misc{hf_blogpost_trimming,
84
+ title={Introduction to Trimming},
85
+ author={Loïck BOURDOIS and Tom AARSEN and Bram VANROY and Christopher AKIKI and Woojun JUNG and Manuel ROMERO and Prithiv SAKTHI},
86
+ year={2026},
87
+ url={https://huggingface.co/blog/lbourdois/introduction-to-trimming},
88
+ }
89
+ ```
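
The reduction percentages in the Model Statistics table of the updated README can be re-derived from the raw counts it lists. The snippet below is a minimal sketch of that arithmetic; the constant and function names are illustrative and do not come from the repository.

```python
# Figures copied from the Model Statistics table of the updated README.
ORIGINAL_VOCAB = 248_320
TRIMMED_VOCAB = 32_768
ORIGINAL_PARAMS = 2_213_241_664
TRIMMED_PARAMS = 1_771_791_168


def reduction_percent(original: int, trimmed: int) -> float:
    """Return the relative size reduction as a percentage."""
    return (original - trimmed) / original * 100


vocab_reduction = reduction_percent(ORIGINAL_VOCAB, TRIMMED_VOCAB)
param_reduction = reduction_percent(ORIGINAL_PARAMS, TRIMMED_PARAMS)

# Number of parameters removed for each vocabulary entry removed.
removed_params = ORIGINAL_PARAMS - TRIMMED_PARAMS
removed_tokens = ORIGINAL_VOCAB - TRIMMED_VOCAB
params_per_token = removed_params / removed_tokens

print(f"vocabulary reduction: {vocab_reduction:.2f}%")   # 86.80%
print(f"parameter reduction:  {param_reduction:.2f}%")   # 19.95%
print(f"parameters removed per token: {params_per_token:.0f}")  # 2048
```

Each removed token accounts for exactly 2,048 parameters, which would be consistent with the savings coming from one embedding row per token. That reading is an inference from the numbers, not something the model card states.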