Poet_Of_Tsushima-SmolLM-135M (LoRA Adapter)
Overview
A LoRA (Low-Rank Adaptation) adapter that specializes the small SmolLM-135M model for haiku generation. The adapter is only ~7.1 MB and is trained to compose haikus in the traditional 5-7-5 syllable format.
Key Features
- Tiny adapter: Only ~7.1 MB on top of the 135M base model
- Poetry specialist: Trained on 45+ curated haiku examples across themes
- Themes covered: Nature, emotions, philosophy, technology, daily life
- LoRA rank 16: Good balance of expressiveness and efficiency
LoRA Configuration
- Rank (r): 16
- Alpha: 32
- Target modules: q_proj, k_proj, v_proj, o_proj (the attention projection layers)
- Dropout: 0.05
- Trainable parameters: ~0.5% of base model
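The configuration above can be sanity-checked with a little arithmetic. The sketch below counts the parameters LoRA adds when each target projection of shape (in, out) gains two rank-r matrices, i.e. r × (in + out) extra weights. The layer dimensions (`HIDDEN`, `KV_DIM`, `NUM_LAYERS`) are assumptions about SmolLM-135M that this card does not state; check them against the base model's `config.json`. Under these assumptions the adapter comes to roughly 1.8M parameters, which is about 7 MB in float32 and closer to 1.4% of the base model than to 0.5%, so the exact share is worth confirming with PEFT's `model.print_trainable_parameters()`.

```python
# Assumed SmolLM-135M attention dimensions (NOT stated in this card).
HIDDEN = 576               # model width: input of q/k/v_proj, output of o_proj
KV_DIM = 192               # assumed k_proj / v_proj output width (grouped-query attention)
NUM_LAYERS = 30            # assumed number of transformer blocks
BASE_PARAMS = 135_000_000  # "135M" as quoted in this card

# Values taken from the LoRA configuration listed above.
RANK = 16
ALPHA = 32

# (in_features, out_features) for each targeted projection.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in) and B (out x rank) next to each frozen weight."""
    return rank * (in_features + out_features)


per_layer = sum(lora_param_count(i, o, RANK) for i, o in TARGETS.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # standard LoRA scales the update by alpha / r

print(f"LoRA parameters per layer: {per_layer:,}")
print(f"LoRA parameters in total:  {total:,}")
print(f"Share of base model:       {100 * total / BASE_PARAMS:.2f}%")
print(f"Size in float32:           {total * 4 / 1e6:.1f} MB")
print(f"Update scaling (alpha/r):  {scaling}")
```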
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
# Load the base model and the tokenizer
base_model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM-135M")
tokenizer = AutoTokenizer.from_pretrained("Ringkvist/Poet_Of_Tsushima-SmolLM-135M")
# Load the LoRA adapter from the "adapter" subfolder of the repository
model = PeftModel.from_pretrained(base_model, "Ringkvist/Poet_Of_Tsushima-SmolLM-135M", subfolder="adapter")
model.eval()
# Generate a haiku; do_sample=True is required for temperature and top_p to take effect
prompt = "Write a haiku about the ocean:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.8, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Training Details
- Dataset: 45+ hand-curated haiku examples
- Epochs: 30
- Learning rate: 2e-4
- Batch size: 4
- Hardware: Apple Silicon Mac (MPS/CPU)
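For readers who want to reproduce a similar run, the hyperparameters listed in this card map onto PEFT and Transformers configuration objects roughly as follows. This is a sketch only: the card does not say which training loop was used, so the use of `TrainingArguments`, the `task_type` value, and the `output_dir` name are assumptions, and the dataset and `Trainer` wiring are left out.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings as listed under "LoRA Configuration".
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",  # assumption: causal language modeling, matching the base model
)

# Training settings as listed under "Training Details".
training_args = TrainingArguments(
    output_dir="poet-of-tsushima-lora",  # hypothetical output directory
    num_train_epochs=30,
    learning_rate=2e-4,
    per_device_train_batch_size=4,
)
```

With only 45 or so examples and 30 epochs, the model sees each haiku many times, so checking generations on unseen prompts for memorized lines is a reasonable precaution.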
Example Outputs
Write a haiku about stars:
Stars dot the night sky
Ancient light from distant suns
We are never lost
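A quick way to check whether generated text follows the 5-7-5 pattern is to count syllables per line. The sketch below uses a rough vowel-group heuristic rather than a pronunciation dictionary, so it will miscount some English words; the function names `count_syllables`, `line_syllables`, and `is_575` are hypothetical helpers and not part of this model or any library.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels, dropping a silent final 'e'."""
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return 0
    count = len(re.findall(r"[aeiouy]+", word))
    # Heuristic: a trailing 'e' is usually silent ("are"), except in "-le" ("little").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def line_syllables(line: str) -> int:
    """Sum the approximate syllable counts of all words on a line."""
    return sum(count_syllables(w) for w in line.split())


def is_575(haiku: str) -> bool:
    """Return True if the non-empty lines have 5, 7, and 5 syllables."""
    lines = [ln for ln in haiku.strip().splitlines() if ln.strip()]
    return [line_syllables(ln) for ln in lines] == [5, 7, 5]


haiku = """Stars dot the night sky
Ancient light from distant suns
We are never lost"""

print([line_syllables(ln) for ln in haiku.splitlines()])
print(is_575(haiku))
```

For stricter checking, a pronunciation dictionary would be more reliable than this heuristic.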
Base Model
- HuggingFaceTB/SmolLM-135M (135M parameters)