
Libraries

How to use ethzanalytics/distilgpt2-tiny-conversational with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ethzanalytics/distilgpt2-tiny-conversational")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ethzanalytics/distilgpt2-tiny-conversational")
model = AutoModelForCausalLM.from_pretrained("ethzanalytics/distilgpt2-tiny-conversational")

Local Apps

vLLM

How to use ethzanalytics/distilgpt2-tiny-conversational with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ethzanalytics/distilgpt2-tiny-conversational"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ethzanalytics/distilgpt2-tiny-conversational",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
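
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on `localhost:8000`; the helper names `build_completion_request` and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_completion_request(
    prompt,
    base_url="http://localhost:8000",
    model="ethzanalytics/distilgpt2-tiny-conversational",
    max_tokens=512,
    temperature=0.5,
):
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Calling `complete("Once upon a time,")` requires a running server. Passing `base_url="http://localhost:30000"` should point the same helper at a server on port 30000.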

Use Docker

docker model run hf.co/ethzanalytics/distilgpt2-tiny-conversational

SGLang

How to use ethzanalytics/distilgpt2-tiny-conversational with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ethzanalytics/distilgpt2-tiny-conversational" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ethzanalytics/distilgpt2-tiny-conversational",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ethzanalytics/distilgpt2-tiny-conversational" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ethzanalytics/distilgpt2-tiny-conversational",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner

How to use ethzanalytics/distilgpt2-tiny-conversational with Docker Model Runner:

```
docker model run hf.co/ethzanalytics/distilgpt2-tiny-conversational
```

distilgpt2-tiny-conversational

This model is a fine-tuned version of distilgpt2 on a parsed version of Wizard of Wikipedia. It uses the persona alpha/beta framework and is designed for use with ai-msgbot. It achieves the following results on the evaluation set:

Loss: 2.2461
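
For intuition, the loss can be converted to perplexity. This small sketch assumes the reported value is the mean per-token cross-entropy in nats, which is the usual convention for causal language model evaluation but is not stated explicitly here. Under that assumption the perplexity comes out to roughly 9.45.

```python
import math

# Evaluation loss reported above (assumed: mean cross-entropy per token, in nats).
eval_loss = 2.2461

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```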

Model description

A basic dialogue model for conversation. It can be used as a chatbot.
Check out a simple demo here.

Intended uses & limitations

The model is designed for integration with the ai-msgbot repo.
The key point is that the model generates whole conversations between two entities, person alpha and person beta. These entity names function as custom <bos> tokens that mark where one response ends and the next begins.
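
The speaker-marker idea can be sketched as follows. This is an illustration, not the ai-msgbot implementation: the exact marker strings (`person alpha:` and `person beta:` on their own lines) and the helper names are assumptions, and `pipe` stands for any callable that behaves like a Transformers `text-generation` pipeline.

```python
# Assumed speaker markers; check ai-msgbot for the exact format used in training.
ALPHA = "person alpha:"  # treated here as the user
BETA = "person beta:"    # treated here as the bot


def build_prompt(user_message):
    """Frame the user message as person alpha and cue person beta to answer."""
    return f"{ALPHA}\n{user_message}\n\n{BETA}\n"


def extract_reply(generated_text, prompt):
    """Keep only person beta's first turn from the generated conversation."""
    if generated_text.startswith(prompt):
        continuation = generated_text[len(prompt):]
    else:
        continuation = generated_text
    # The model keeps writing further turns, so cut at the next speaker marker.
    cut = len(continuation)
    for marker in (ALPHA, BETA):
        index = continuation.find(marker)
        if index != -1:
            cut = min(cut, index)
    return continuation[:cut].strip()


def generate_reply(pipe, user_message, max_new_tokens=64):
    """Run a text-generation pipeline and return a single reply."""
    prompt = build_prompt(user_message)
    outputs = pipe(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    return extract_reply(outputs[0]["generated_text"], prompt)
```

With a pipeline created via `pipeline("text-generation", model="ethzanalytics/distilgpt2-tiny-conversational")`, a call would look like `generate_reply(pipe, "how are you?")`.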

Training and evaluation data

A parsed version of Wizard of Wikipedia, from ParlAI.

Training procedure

DeepSpeed with the Hugging Face Trainer; an example notebook is in ai-msgbot.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
distributed_type: multi-GPU
gradient_accumulation_steps: 4
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.05
num_epochs: 30
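
To make these numbers concrete, the sketch below recomputes the effective batch size and approximates the learning-rate curve implied by a cosine schedule with linear warmup. It assumes 12540 total optimizer steps (the final step in the training results table), warmup over 5% of those steps, and decay to zero over a half cosine; the trainer's actual scheduler may differ in rounding details.

```python
import math

BASE_LR = 2e-05
TOTAL_STEPS = 12540                        # assumed: last step in the results table
WARMUP_STEPS = round(0.05 * TOTAL_STEPS)   # warmup ratio 0.05 -> 627 steps

# train_batch_size * gradient_accumulation_steps matches total_train_batch_size.
effective_batch = 32 * 4


def lr_at(step):
    """Approximate learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to the base learning rate.
        return BASE_LR * step / WARMUP_STEPS
    # Cosine decay from the base learning rate down to 0.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, lr_at(step))
```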

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	418	2.7793
2.9952	2.0	836	2.6914
2.7684	3.0	1254	2.6348
2.685	4.0	1672	2.5938
2.6243	5.0	2090	2.5625
2.5816	6.0	2508	2.5332
2.5816	7.0	2926	2.5098
2.545	8.0	3344	2.4902
2.5083	9.0	3762	2.4707
2.4793	10.0	4180	2.4551
2.4531	11.0	4598	2.4395
2.4269	12.0	5016	2.4238
2.4269	13.0	5434	2.4102
2.4051	14.0	5852	2.3945
2.3777	15.0	6270	2.3848
2.3603	16.0	6688	2.3711
2.3394	17.0	7106	2.3613
2.3206	18.0	7524	2.3516
2.3206	19.0	7942	2.3398
2.3026	20.0	8360	2.3301
2.2823	21.0	8778	2.3203
2.2669	22.0	9196	2.3105
2.2493	23.0	9614	2.3027
2.2334	24.0	10032	2.2930
2.2334	25.0	10450	2.2852
2.2194	26.0	10868	2.2754
2.2014	27.0	11286	2.2695
2.1868	28.0	11704	2.2598
2.171	29.0	12122	2.2539
2.1597	30.0	12540	2.2461

Framework versions

Transformers 4.16.1
PyTorch 1.10.0+cu111
Tokenizers 0.11.0

Safetensors

Model size: 88.2M params
Tensor type: F32
