TestGen: AI Test Case Generator (Qwen2.5-Coder-7B + LoRA)

Fine-tuned Qwen2.5-Coder-7B-Instruct with LoRA for comprehensive unit test generation.

Overview

This model aims to generate comprehensive unit tests, including edge cases, for the source code it is given. The approach is based on the paper "Parameter-Efficient Fine-Tuning of LLMs for Unit Test Generation" (arXiv:2411.02462).

Training Recipe

Component     Details
Base Model    Qwen/Qwen2.5-Coder-7B-Instruct
Method        LoRA (rank=16, alpha=32)
Dataset       andstor/methods2test fm+fc+t+tc (46K+ samples)
Training      3 epochs, lr=1e-4, cosine schedule, effective batch=32
Hardware      A10G-large (24GB VRAM)
Framework     TRL SFTTrainer + PEFT LoRA
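
To make the numbers in the recipe concrete, here is a rough sketch of how an effective batch of 32 and a cosine schedule translate into optimizer steps and learning rates. The split into per-device batch size, gradient accumulation steps, and GPU count is hypothetical (the recipe only states the product), the sample count is rounded to 46,000, and the schedule assumes no warmup and decay to zero.

```python
import math

# Hypothetical split; only the product (32) comes from the recipe above.
per_device_batch = 4
grad_accum_steps = 8
num_gpus = 1
effective_batch = per_device_batch * grad_accum_steps * num_gpus

def training_steps(num_samples, batch, epochs):
    """Return (optimizer steps per epoch, total optimizer steps)."""
    steps_per_epoch = math.ceil(num_samples / batch)
    return steps_per_epoch, steps_per_epoch * epochs

def cosine_lr(step, total_steps, peak_lr=1e-4):
    """Cosine decay from peak_lr to 0, assuming no warmup."""
    progress = step / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

# 46_000 is a rounded stand-in for the "46K+" samples in the table.
steps_per_epoch, total_steps = training_steps(46_000, effective_batch, 3)
print(effective_batch, steps_per_epoch, total_steps)
print(cosine_lr(0, total_steps), cosine_lr(total_steps // 2, total_steps))
```

With these assumptions one epoch is about 1,438 optimizer steps, and the learning rate is at half its peak midway through training. The actual values depend on the settings in scripts/train.py.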

LoRA Target Modules

  • Attention: q_proj, k_proj, v_proj, o_proj
  • MLP: gate_proj, up_proj, down_proj
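
As an illustration, the rank, alpha, and target modules listed above could be expressed as a PEFT configuration along these lines. This is a sketch, not the contents of scripts/train.py: the task_type value is an assumption, and settings the card does not state (such as dropout) are left at the library defaults. The peft import is deferred so the settings can be inspected without the library installed.

```python
# Values for r, lora_alpha, and target_modules come from this card.
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}

def build_lora_config():
    """Build a peft.LoraConfig from the settings above (requires peft)."""
    from peft import LoraConfig  # imported lazily on purpose
    return LoraConfig(**LORA_SETTINGS)

# Standard LoRA scales the low-rank update by alpha / r.
scaling = LORA_SETTINGS["lora_alpha"] / LORA_SETTINGS["r"]
print(f"{len(LORA_SETTINGS['target_modules'])} target modules, scaling = {scaling}")
```

Targeting both the attention and MLP projections adapts every linear layer inside each transformer block, which costs more trainable parameters than attention-only LoRA.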

How to Run Training

# 1. Build the training dataset
pip install datasets huggingface_hub
python scripts/data_pipeline.py --repo YOUR_ORG/testgen-data --max-samples 50000

# 2. (Optional) Add your company's code+test files
python scripts/data_pipeline.py --repo YOUR_ORG/testgen-data --custom-dirs /path/to/your/code

# 3. Run training
pip install transformers trl torch datasets trackio accelerate peft bitsandbytes
python scripts/train.py

# Or via HF Jobs (recommended):
# Hardware: a10g-large, Timeout: 5h

How to Add Your Company's Data

Your raw training data should be organized as code files paired with test files:

your-project/
├── src/
│   ├── calculator.py
│   ├── utils.py
│   └── auth.py
└── tests/
    ├── test_calculator.py
    ├── test_utils.py
    └── test_auth.py

The data pipeline auto-discovers pairs using naming conventions:

  • Python: calculator.py ↔ test_calculator.py
  • Java: Calculator.java ↔ CalculatorTest.java
  • JS/TS: calculator.js ↔ calculator.test.js
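
The naming conventions above can be sketched as a small matcher. This is a hypothetical illustration of the idea, not the code in scripts/data_pipeline.py. The function names are invented here, and the .ts case (calculator.ts ↔ calculator.test.ts) is an assumed extension of the JS rule.

```python
import tempfile
from pathlib import Path

def candidate_test_names(source):
    """Return the test file names that would pair with a source file."""
    stem, ext = source.stem, source.suffix
    if ext == ".py":
        return [f"test_{stem}.py"]
    if ext == ".java":
        return [f"{stem}Test.java"]
    if ext in {".js", ".ts"}:
        return [f"{stem}.test{ext}"]
    return []

def discover_pairs(root):
    """Walk root and return (source, test) pairs matched by file name."""
    files = sorted(p for p in Path(root).rglob("*") if p.is_file())
    by_name = {}
    for path in files:
        by_name.setdefault(path.name, []).append(path)
    pairs = []
    for src in files:
        for name in candidate_test_names(src):
            for test in by_name.get(name, []):
                pairs.append((src, test))
    return pairs

# Demo on a throwaway project tree.
with tempfile.TemporaryDirectory() as tmp:
    layout = [
        "src/calculator.py", "tests/test_calculator.py",
        "src/Calculator.java", "tests/CalculatorTest.java",
        "src/calculator.js", "tests/calculator.test.js",
        "src/orphan.py",  # no matching test, so it is skipped
    ]
    for rel in layout:
        path = Path(tmp) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("")
    pair_names = {(s.name, t.name) for s, t in discover_pairs(tmp)}
    for src_name, test_name in sorted(pair_names):
        print(f"{src_name} <-> {test_name}")
```

Matching purely on file names keeps the sketch simple, but it will pair files that share a name across different packages. A real pipeline would likely also compare directory structure.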

Live Demo

Try it: 🧪 AI Test Case Generator Space

Resources
